# This notebook commits staged image_calc entries to the DSWx calval database
Use this notebook after uploading a classified image to transfer it from the staging bucket to the database bucket. **To be used only by the database manager or as delegated.** Use with caution, as all uploads are final. To correct an error in a previous entry, use notebook #3 to upload a new version of the entry.

In [1]:
import geopandas as gpd
import boto3

In [2]:
bucket_name = 'opera-calval-database-dswx'
bucket_name_staging = 'opera-calval-database-dswx-staging'

In [3]:
session = boto3.session.Session(profile_name='saml-pub')
s3 = session.resource('s3')
s3_client = session.client('s3')

### Search for pending geojsons in the staging bucket
The cell below lists each pending staged entry for the image_calc table. 
Each geojson file represents a single row to be added to the table. 
Geojson filenames are generated from the date and time they were staged.

In [12]:
# This cell lists each pending staged entry to the image_calc table. 
# Each geojson file represents a single row to be added to the table
# geojson filenames are generated using the date and time they were staged
bucket = s3.Bucket(bucket_name_staging)

for obj in bucket.objects.filter(Delimiter='/', Prefix='pending/'):
    print(obj.key)

pending/20230111_170230_imagecalc.geojson


Manually copy and paste a key from the list above to select a single entry. It is usually best to commit entries in chronological order; each key name includes the timestamp at which the entry was staged.
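When several entries are pending, the timestamp in each key can be used to put them in chronological order. The following is a minimal sketch, assuming filenames always begin with a `YYYYMMDD_HHMMSS` stamp (inferred from the single key listed above); `staged_timestamp` is a hypothetical helper, and the second key in the list is made up for illustration.

```python
from datetime import datetime

def staged_timestamp(key):
    """Parse the assumed YYYYMMDD_HHMMSS prefix of a staged geojson filename."""
    filename = key.split('/')[-1]
    stamp = '_'.join(filename.split('_')[:2])
    return datetime.strptime(stamp, '%Y%m%d_%H%M%S')

# Hypothetical listing: only the first key appears in the output above.
pending_keys = [
    'pending/20230111_170230_imagecalc.geojson',
    'pending/20221221_110743_imagecalc.geojson',
]

# Oldest staged entry first
ordered_keys = sorted(pending_keys, key=staged_timestamp)
print(ordered_keys[0])
```

Because the stamp is zero-padded, a plain string sort would likely give the same order; parsing it has the added benefit of failing loudly on a key that does not match the expected pattern.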

In [13]:
# I just manually copy/paste the key from the above list to select one entry at a time.
pending_key = 'pending/20230111_170230_imagecalc.geojson'

### Inspect the staged geojson row
The staged table row should always be generated by notebook #3, which uses the addImagecalc function found in /tools/.

In [14]:
obj = s3.Object(bucket_name_staging,pending_key)
pending_data = obj.get()['Body']
pending_gdf = gpd.read_file(pending_data)
pending_gdf.head()

Unnamed: 0,image_name,image_calc_name,calc_type,processing_level,oversight_level,calculated_by,reviewed_by,notes,public,water_stratum,bucket,s3_keys,upload_date,geometry
0,20210912_034049_22_2421,20210912_034049_22_2421_classification,Review,Final,,Matthew Bonnema,Alexander Handwerger,Previous(Previous(Classified using NDWI and ma...,True,3,opera-calval-database-dswx-staging,pending/files/20230111_170230_imagecalc/classi...,20230111_170230,"POLYGON ((97.58389 49.35489, 97.58389 49.40992..."


### Read image and image_calc geojson tables from database

In [16]:
imagecalc_gdf = gpd.read_file(s3.Object(bucket_name,'image_calc.geojson').get()['Body'])
image_gdf = gpd.read_file(s3.Object(bucket_name,'image.geojson').get()['Body'])

### Build some metadata fields and identify staged file keys 

In [17]:
source_image_name = pending_gdf.image_name.iloc[0]
imagecalc_name = pending_gdf.image_calc_name.iloc[0]
site = image_gdf[image_gdf.image_name == source_image_name].site_name.iloc[0]
src_bucket = pending_gdf.bucket.iloc[0]
src_keys = pending_gdf.s3_keys.iloc[0].split(',')
src_keys

['pending/files/20230111_170230_imagecalc/classification_20210912_034049_22_2421_formatted.tif']

This cell assigns a version number to the classification. If this is the first classification of a given planet image, the assigned version is 0. Otherwise, the version is one more than the latest version found in the database for that image.

In [18]:
search = imagecalc_gdf[imagecalc_gdf.image_name == source_image_name]
prev_version = -1
if len(search) == 0:
    version = 0
    previous_name = None
    print('first entry into table for ID:'+source_image_name+' assigning version = 0')
else:
    try:
        prev_version = search['version'].max() 
        version = int(prev_version + 1)
        previous_name = search[search.version==search['version'].max()].image_calc_name.iloc[0]
        print('assigning version based on maximum version in table. version = '+str(version))
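    except Exception:
        # Fall back to the number of matching rows if the version column is missing or unreadable
        version = int(len(search))
        previous_name = None
        print('could not read version from table. assigned based on number of matching table entries. version = '+str(version))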

pending_gdf['image_calc_name'] = imagecalc_name+'_v'+str(version)
pending_gdf['version'] = version
pending_gdf['previous_name'] = previous_name
pending_gdf.head()

assigning version based on maximum version in table. version = 2


Unnamed: 0,image_name,image_calc_name,calc_type,processing_level,oversight_level,calculated_by,reviewed_by,notes,public,water_stratum,bucket,s3_keys,upload_date,geometry,version,previous_name
0,20210912_034049_22_2421,20210912_034049_22_2421_classification_v2,Review,Final,,Matthew Bonnema,Alexander Handwerger,Previous(Previous(Classified using NDWI and ma...,True,3,opera-calval-database-dswx-staging,pending/files/20230111_170230_imagecalc/classi...,20230111_170230,"POLYGON ((97.58389 49.35489, 97.58389 49.40992...",2,20210912_034049_22_2421_classification_v1
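The versioning rule can be tried on a small in-memory table without touching S3. This is a minimal sketch rather than the notebook's own code: `next_version` is a hypothetical helper, the toy table uses invented image names, and it omits the row-count fallback used in the cell above.

```python
import pandas as pd

def next_version(table, image_name):
    """Return (version, previous_name) for a new classification of image_name."""
    matches = table[table['image_name'] == image_name]
    if matches.empty:
        # First classification of this image
        return 0, None
    # Row holding the highest version recorded so far
    latest = matches.loc[matches['version'].idxmax()]
    return int(latest['version']) + 1, latest['image_calc_name']

# Toy stand-in for image_calc.geojson (names are invented)
toy_table = pd.DataFrame({
    'image_name': ['img_a', 'img_a', 'img_b'],
    'image_calc_name': ['img_a_classification_v0',
                        'img_a_classification_v1',
                        'img_b_classification_v0'],
    'version': [0.0, 1.0, 0.0],
})

print(next_version(toy_table, 'img_a'))  # existing image: increments the latest version
print(next_version(toy_table, 'img_c'))  # unseen image: starts at version 0
```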


### Commit staged image_calc to database
This code block copies the staged files to the database bucket and to the completed folder in the staging bucket, then removes them from the pending folder. If the cell ends in an error, do not re-run it; contact the database manager instead.

In [19]:
s3_folder_path = 'data/site/'+site+'/image/'+source_image_name+'/image_calc/'+imagecalc_name+'/'
s3_keys = []
for key in src_keys:
    new_key = s3_folder_path+key.split('/')[-1]
    complete_key = 'complete/'+'/'.join(key.split('/')[1:])
    s3_keys.append(new_key)
    response = s3.meta.client.copy({'Bucket':src_bucket,'Key':key}, bucket_name, new_key)
    response = s3.meta.client.copy({'Bucket':src_bucket,'Key':key}, src_bucket, complete_key)
    response = s3_client.delete_object(Bucket=src_bucket, Key=key)

pending_gdf['s3_keys'] = ','.join(s3_keys)
pending_gdf['bucket'] = bucket_name

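import pandas as pd  # needed for pd.concat; DataFrame.append was removed in pandas 2.0

# If a row with this image_calc_name already exists, drop it so the new row replaces it
new_name = pending_gdf.image_calc_name.iloc[0]
if (imagecalc_gdf.image_calc_name == new_name).any():
    print('image_calc_name: '+new_name+' is already in image_calc table, replacing it')
    imagecalc_base = imagecalc_gdf[imagecalc_gdf.image_calc_name != new_name]
else:
    print('Adding new row to table')
    imagecalc_base = imagecalc_gdf

imagecalc_upd = pd.concat([imagecalc_base, pending_gdf], ignore_index=True)
# Discard rows without a name (an elementwise "!= None" comparison does not filter these out)
imagecalc_upd = imagecalc_upd[imagecalc_upd.image_calc_name.notna()]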
imagecalc_upd_bytes = bytes(imagecalc_upd.to_json(drop_id=True).encode('UTF-8'))
s3object = s3.Object(bucket_name,'image_calc.geojson')
s3object.put(Body=imagecalc_upd_bytes)

response = s3.meta.client.copy({'Bucket':src_bucket,'Key':pending_key}, src_bucket, 'complete/'+'/'.join(pending_key.split('/')[1:]))
response = s3_client.delete_object(Bucket=src_bucket, Key=pending_key)



Adding new row to table
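To see where a staged file ends up, the key arithmetic from the commit cell can be run on its own without any S3 calls. The sketch below wraps it in a hypothetical helper, `destination_keys`; the site value `4_8` is taken from the `s3_keys` path shown in the final table output of this notebook, and the folder uses the unversioned image_calc name, as the cell above does.

```python
def destination_keys(src_key, site, source_image_name, imagecalc_name):
    """Map a staged file key to its database key and its 'complete/' archive key."""
    folder = ('data/site/' + site + '/image/' + source_image_name
              + '/image_calc/' + imagecalc_name + '/')
    # Database copy keeps only the filename
    database_key = folder + src_key.split('/')[-1]
    # Archive copy swaps the leading 'pending' folder for 'complete'
    complete_key = 'complete/' + '/'.join(src_key.split('/')[1:])
    return database_key, complete_key

database_key, complete_key = destination_keys(
    'pending/files/20230111_170230_imagecalc/classification_20210912_034049_22_2421_formatted.tif',
    site='4_8',
    source_image_name='20210912_034049_22_2421',
    imagecalc_name='20210912_034049_22_2421_classification',
)
print(database_key)
print(complete_key)
```

Printing the two keys before committing is a cheap way to confirm the destination paths, since the copy and delete steps cannot be undone from this notebook.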


### Check updated image_calc table
This cell shows the last 10 rows of the database table so you can verify that the entry was uploaded successfully.

In [20]:
new_imagecalc_gdf = gpd.read_file(s3.Object(bucket_name,'image_calc.geojson').get()['Body'])
new_imagecalc_gdf.tail(10)

Unnamed: 0,bucket,calc_type,calculated_by,image_calc_name,image_name,notes,oversight_level,previous_name,processing_level,public,reviewed_by,s3_keys,upload_date,version,water_strata,water_stratum,geometry
131,opera-calval-database-dswx,Manual Classification,Matthew Bonnema,20210911_001230_44_2262_classification_v0,20210911_001230_44_2262,No water found in image. Stream channels appea...,,,Intermediate,True,,data/site/1_34/image/20210911_001230_44_2262/i...,20221028_142550,0.0,,0.0,"POLYGON ((138.62735 -29.78828, 138.62735 -29.7..."
132,opera-calval-database-dswx,Review,Simran Sangha,20210912_094213_84_240f_classification_v1,20210912_094213_84_240f,Previous(Dry chip in the middle of the desert....,,20210912_094213_84_240f_classification_v0,Final,True,Alexander Handwerger,data/site/1_43/image/20210912_094213_84_240f/i...,20221028_150116,1.0,,0.0,"POLYGON ((11.20751 13.95104, 11.20751 14.00578..."
133,opera-calval-database-dswx,Manual Classification,Simran Sangha,20210915_173832_80_2307_classification_v1,20210915_173832_80_2307,Previous(Dry chip with a few small ponds to th...,,20210915_173832_80_2307_classification_v0,Intermediate,True,,data/site/1_41/image/20210915_173832_80_2307/i...,20221028_150919,1.0,,1.0,"POLYGON ((-104.50906 35.55102, -104.50906 35.6..."
134,opera-calval-database-dswx,Review,Simran Sangha,20210915_173832_80_2307_classification_v2,20210915_173832_80_2307,Re-uploading with correct notes and processing...,,20210915_173832_80_2307_classification_v1,Final,True,Matthew Bonnema,data/site/1_41/image/20210915_173832_80_2307/i...,20221028_151801,2.0,,1.0,"POLYGON ((-104.50906 35.55102, -104.50906 35.6..."
135,opera-calval-database-dswx,Review,Matthew Bonnema,20210904_093422_44_1065_classification_v1,20210904_093422_44_1065,Previous(Only water is a small pond near cente...,,20210904_093422_44_1065_classification_v0,Final,True,Simran Sangha,data/site/1_31/image/20210904_093422_44_1065/i...,20221028_164450,1.0,,1.0,"POLYGON ((17.28244 -29.97141, 17.28244 -29.916..."
136,opera-calval-database-dswx,Review,Matthew Bonnema,20210911_001230_44_2262_classification_v1,20210911_001230_44_2262,Previous(No water found in image. Stream chann...,,20210911_001230_44_2262_classification_v0,Final,True,Simran Sangha,data/site/1_34/image/20210911_001230_44_2262/i...,20221101_224732,1.0,,0.0,"POLYGON ((138.62735 -29.78828, 138.62735 -29.7..."
137,opera-calval-database-dswx,Review,Simran Sangha,20210930_021156_60_2434_classification_v1,20210930_021156_60_2434,Previous(Unsupervised classification with kmea...,,20210930_021156_60_2434_classification_v0,Final,True,Karthik Venkataramani,data/site/2_7/image/20210930_021156_60_2434/im...,20221109_164041,1.0,,2.0,"POLYGON ((116.91498 42.33511, 116.91498 42.389..."
138,opera-calval-database-dswx,Review,Very hard Alaskan area with dark and bright wa...,20210928_211311_91_2457_classification_v3,20210928_211311_91_2457,Previous(Previous(This is water with lots of b...,,20210928_211311_91_2457_classification_v2,Final,True,Alexander Handwerger,data/site/3_43/image/20210928_211311_91_2457/i...,20221221_110743,3.0,,3.0,"POLYGON ((-161.16899 60.15636, -161.16899 60.2..."
139,opera-calval-database-dswx,Review,Alexander Handwerger,20210924_000522_94_2421_classification_v3,20210924_000522_94_2421,Previous(Previous(None) (CM): A scene predomin...,,20210924_000522_94_2421_classification_v2,Final,True,Alexander Handwerger,data/site/4_42/image/20210924_000522_94_2421/i...,20230111_164853,3.0,,3.0,"POLYGON ((139.64244 -17.06008, 139.64244 -16.9..."
140,opera-calval-database-dswx,Review,Matthew Bonnema,20210912_034049_22_2421_classification_v2,20210912_034049_22_2421,Previous(Previous(Classified using NDWI and ma...,,20210912_034049_22_2421_classification_v1,Final,True,Alexander Handwerger,data/site/4_8/image/20210912_034049_22_2421/im...,20230111_170230,2.0,,3.0,"POLYGON ((97.58389 49.35489, 97.58389 49.40992..."
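Instead of reading the tail by eye, the new row can also be looked up by name. This is a minimal sketch on a toy table that mirrors a few columns of the last row above; `find_entry` is a hypothetical helper, and in the notebook the table passed in would be `new_imagecalc_gdf`.

```python
import pandas as pd

def find_entry(table, image_calc_name):
    """Return the rows of the table whose image_calc_name matches exactly."""
    return table[table['image_calc_name'] == image_calc_name]

# Toy stand-in for the re-read image_calc table (values copied from the output above)
toy_table = pd.DataFrame({
    'image_calc_name': ['20210924_000522_94_2421_classification_v3',
                        '20210912_034049_22_2421_classification_v2'],
    'bucket': ['opera-calval-database-dswx', 'opera-calval-database-dswx'],
    'version': [3.0, 2.0],
})

match = find_entry(toy_table, '20210912_034049_22_2421_classification_v2')
# Exactly one row is expected, and it should point at the database bucket
print(len(match), match['bucket'].iloc[0], match['version'].iloc[0])
```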
