## Generate depth to surface area tables by merging satellite data with depth gauge data
This notebook merges gauge data (in the 00_Library directory) with satellite data for multiple reservoirs at a time. It outputs a depth to surface area table for the reservoirs. The inputs are 1) a shapefile of reservoirs with their gauge ID attached and 2) the folder containing the csv files of gauge data.

At the top of the screen is a tab called Kernel. If you want to start the notebook again at any time and clear everything, select Restart Kernel and Clear All Outputs. The available memory is shown at the bottom of the screen. You have 15GB of memory, and if that is exceeded the kernel will crash (but that's ok, you can just start it again). Read the instructions carefully so you can run the code blocks and understand what's going on at each step. Press Shift + Enter to run a code block. Good luck! - Katey

## Load modules
Here are all the standard Python modules you're going to need, plus some special DEA modules to help handle the satellite data. You might get some deprecation warnings, but you can just ignore them. The square brackets at the top left of the code cell show a star while the cell is running and a number when it has finished. Loading the modules should only take a second.

In [1]:
import os
import xarray as xr
import numpy as np
import pandas as pd
import csv
import rasterio.crs
from tqdm.auto import tqdm #this one is a loading bar, it's cool to add loading bars to loops
from pandas import DataFrame
import geopandas as gpd
import matplotlib.gridspec as gs
import matplotlib.pyplot as plt
from matplotlib import pyplot
import datacube

import sys
sys.path.append('../../Scripts')
from dea_spatialtools import xr_rasterize
from dea_datahandling import wofs_fuser #this joins wofs data across tiles correctly
from datacube.utils import geometry 
from datacube.utils.geometry import CRS
from datacube.utils import masking
from datacube.helpers import ga_pq_fuser, write_geotiff
#from digitalearthau.utils import wofs_fuser
#import DEAPlotting, DEADataHandling
import warnings
warnings.filterwarnings('ignore', module='datacube')
%load_ext autoreload
%autoreload 2



## User inputs
The inputs are the reservoir shapefile with the gauge IDs attached and the directory that contains the csv files of gauge data. You can also define how deep a reservoir should be (how many 1m depth intervals its gauge record covers) before we start taking depth slices every 2m instead of every 1m. In my opinion 2m slices are more accurate for larger reservoirs because they improve the image quality of each depth slice. I recommend 25 or 30 as the limit. Also put today's date in 'dd-mm-yyyy' format to get the latest satellite data.

The image_cap variable is a little hard to explain, but basically it is how many images you are willing to load per depth interval. The more images you load, the more accurate the results will be, but the longer the script will take to run. If you just want to quickly see how the script works, set the image cap to 10 (approx. 20 minutes to calculate the surface areas of 150 reservoirs). If you want the most accurate results you don't want to limit the images at all, so set the image cap to 10000 to effectively remove the limit (approx. 3 hours, and the kernel could crash).

In [14]:
#User inputs
reservoirs_shape_file = '00_Library_reservois/00_Library_reservois.shp'
directory = '00_Library' # this is where your csv files of gauge data are
depth_interval_limit = 25
todays_date = '14-04-2021'
image_cap = 10000
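
If you want to catch a mistyped input before the long-running cells start, a small check like the one below can help. This is only a sketch: `check_inputs` is a hypothetical helper that is not part of the notebook, and the limits it enforces are assumptions.

```python
from datetime import datetime

def check_inputs(todays_date, depth_interval_limit, image_cap):
    """Hypothetical helper: raise ValueError if a user input is malformed."""
    # strptime raises ValueError if the string is not in dd-mm-yyyy form
    parsed = datetime.strptime(todays_date, '%d-%m-%Y')
    if depth_interval_limit < 1:
        raise ValueError('depth_interval_limit must be at least 1')
    if image_cap < 1:
        raise ValueError('image_cap must be at least 1')
    return parsed

# Same values as the user input cell above
end_date = check_inputs('14-04-2021', 25, 10000)
print(end_date.date())
```

A date typed as '2021-04-14' would fail here straight away instead of part-way through the satellite query.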

## Query and Dask load the satellite data for all the reservoirs
'Dask load' (the dask_chunks = { } argument in the dc.load line) means you only load the satellite data parameters, not the actual images. That lets us link the satellite data to the gauge data without it taking literally hours. In this cell we read the shapefile with the reservoir polygons and query the satellite data with each polygon. Most of the query is done in a loop because we have to make hundreds of queries, one for each reservoir. To handle all these queries, we put the results in a dictionary with the gauge ID as the key, so we can call up the satellite data later by gauge ID. This is great, because the csv files of depth gauge data all have the gauge ID in them, so it's easy to match the satellite data of each reservoir with its gauge data just by using the gauge ID as the handle.

In [8]:
gdf = gpd.read_file(reservoirs_shape_file)

query = {'time': ('01-01-1988', todays_date)} 
         #'crs': 'EPSG:3577'}
dc = datacube.Datacube(app='dc-WOfS') #this is how you access the open data cube where the satellite data is

results = {} 

#tqdm makes the progress bar. The name comes from the Arabic word taqaddum, which means 'progress'
for index, row in tqdm(gdf.iterrows(), total=len(gdf)):
    geom = geometry.Geometry(geom=row.geometry, crs=gdf.crs)
    query.update({'geopolygon': geom})
    
    wofs_albers= dc.load(product = 'wofs_albers', dask_chunks = {}, 
                         group_by='solar_day', fuse_func = wofs_fuser, **query) #wofs_fuser is important, it fixes things on the edge of tiles
    
    poly_mask = xr_rasterize(gdf.iloc[[index]], wofs_albers)
    wofs_albers = wofs_albers.where(poly_mask, other=wofs_albers.water.nodata) #put other argument or all the data turns into 0
    
    results.update({str(row['gauge_ID']): wofs_albers}) #The handle for dictionary objects is the gauge ID


## Loop over and read all the csv files in 00_Library
In the previous code block we made a dictionary of the wofs data with the gauge ID as the key. We now need a dictionary of the depth data with, again, the gauge ID as the key. Then we can match them up later. 
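
The matching idea can be sketched with two plain dictionaries. The values below are hypothetical placeholders (the real ones are xarray datasets and pandas DataFrames), and `split_matches` is not a function from this notebook.

```python
# Hypothetical stand-ins for the two dictionaries built in this notebook
satellite_by_gauge = {'604.1': 'wofs data', '136115A': 'wofs data'}
depths_by_gauge = {'604.1': 'gauge data', '226230A': 'gauge data'}

def split_matches(gauge_ids, satellite):
    """Return (matched, unmatched) gauge IDs, keeping the input order."""
    matched = [g for g in gauge_ids if g in satellite]
    unmatched = [g for g in gauge_ids if g not in satellite]
    return matched, unmatched

matched, unmatched = split_matches(list(depths_by_gauge), satellite_by_gauge)
print('matched:', matched)
print('no reservoir polygon for:', unmatched)

# Keys are stored as strings so that an ID read as a number still matches
assert str(604.1) == '604.1'
```

Wrapping every ID in `str()` on both sides matters because some gauge IDs look like numbers (for example 604.1 or 222537) and could otherwise end up as floats or integers in one dictionary and strings in the other.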

In [11]:
#make a list of the file names so we can call them with pandas
file_list = []

for filename in os.listdir(directory):
    if filename.endswith(".csv"):
        file_list.append(os.path.join(directory, filename))

#Read the gauge files twice, once to get ID and second to get the data. Append them together in a dictionary
#May as well make a list of IDs here because we will use it later
data_dict = {}        
ID_list = []
#let's use tqdm again to make a progress bar. The bar is so cool I love this module
for i in tqdm(file_list, total=len(file_list)):
    df = pd.read_csv(i, nrows=1, escapechar='#')
    ID = df.iloc[0, 1] #The gauge ID is in the second column of the first row
    ID_list.append(str(ID))
    #now read again, this time to get the actual data
    #(on pandas 1.3 or newer, error_bad_lines=False is replaced by on_bad_lines='skip')
    data = pd.read_csv(i, error_bad_lines = False, skiprows=9, escapechar='#',
                         parse_dates=['Timestamp'], 
                         index_col=('Timestamp'),
                        date_parser=lambda x: pd.to_datetime(x.rsplit('+', 1)[0]))
    data = data.drop(columns=['Quality Code', 'Interpolation Type'])
    data_dict.update({str(ID): data}) #Now we have the gauge data, again with the gauge ID as the handle
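
To see what the `date_parser` lambda does to one timestamp, here is a minimal sketch. The example string is an assumed format with a '+hh:mm' offset; check it against your own csv files.

```python
import pandas as pd

def strip_utc_offset(raw):
    """Drop a trailing '+hh:mm' offset and parse the rest as a naive timestamp."""
    # rsplit('+', 1) cuts at the last '+', so only the offset is removed
    return pd.to_datetime(raw.rsplit('+', 1)[0])

# Hypothetical timestamp in the style of a gauge csv export
stamp = strip_utc_offset('2021-04-14T09:00:00.000+10:00')
print(stamp)
```

Two things to keep in mind: the parsed value is the local clock time with the offset thrown away (it is not converted to UTC), and a negative offset such as '-05:00' would not be stripped by this approach.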


## Make a function that generates the depth slices of a reservoir and gets the depth to surface area relationship
This is my first time writing a function; Matthew and Bex from DEA helped me write it. You apply it to one reservoir and it takes all the satellite passes over that reservoir since 1988, organises them into depth intervals using the gauge data, loads and cloud masks the images for each interval, stacks them on top of each other to make one average image per depth, and then counts the pixels that have water in them to get the surface area at each depth. After we define the function we can loop it over all of the reservoirs.
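
The "organise into depth intervals" step can be sketched with plain NumPy. The function in the next cell uses xarray's `.interp()`, which is linear by default; `np.interp` is swapped in here so the example runs without satellite data. All numbers are made up and `passes_in_slice` is a hypothetical helper.

```python
import numpy as np

# Hypothetical gauge record: time in days and water level in metres
gauge_days = np.array([0.0, 10.0, 20.0, 30.0])
gauge_level = np.array([100.0, 102.0, 101.0, 104.0])

# Hypothetical satellite pass times, in the same units
pass_days = np.array([5.0, 15.0, 25.0])

# Linearly interpolate the gauge level to each satellite pass
level_at_pass = np.interp(pass_days, gauge_days, gauge_level)

def passes_in_slice(levels, lower, width=1):
    """Indices of passes whose level is strictly inside (lower, lower + width)."""
    return np.flatnonzero((levels > lower) & (levels < lower + width))

print(level_at_pass)                                # levels: 101.0, 101.5, 102.5
print(passes_in_slice(level_at_pass, 101))          # 1m slice starting at 101
print(passes_in_slice(level_at_pass, 101, width=2)) # 2m slice starting at 101
```

Because both comparisons are strict, a pass whose interpolated level is exactly 101.0 does not fall in the slice that starts at 101. The function below uses the same strict comparisons.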

In [15]:
def image_prod(ID_caller, gauge_data, wofs_albers, make_plots = False) -> list: 
    """
    Take the gauge data and the lazily loaded wofs data for one reservoir,
    cloud mask the images and count the wet pixels in each depth slice.

    Returns a list with one row per depth interval:
    [gauge ID, depth, surface area in m2, number of images before cloud masking,
    number of images after cloud masking].
    """
    #Get the depth range and intervals
    gauge_data = gauge_data.dropna()
    depth_integers = gauge_data.astype(np.int64)
    max_depth = depth_integers.Value.max()
    min_depth = depth_integers.Value.min()
    integer_array = depth_integers.Value.unique()
    integer_list = integer_array.tolist()
    
    gauge_data_xr = gauge_data.to_xarray() #convert gauge data to xarray
    merged_data = gauge_data_xr.interp(Timestamp=wofs_albers.time) #use xarrays .interp() function to merge

    surface_area_list = []

    for i in tqdm(integer_list, leave = False):
        if len(integer_list) > depth_interval_limit: #here's where the depth interval limit you set is taken into account
            specified_level = merged_data.where((merged_data.Value > i) & 
                                (merged_data.Value < i+2), drop=True)
        else:
            specified_level = merged_data.where((merged_data.Value > i) & 
                                (merged_data.Value < i+1), drop=True)


        date_list = specified_level.time.values[:image_cap] #caps images at x per slice 
        n_images_used = int(len(date_list))
        specified_passes = wofs_albers.sel(time=date_list).compute() #This .compute() Xarray function loads actual images
        #cloudmask (Claire Krause wrote this for me)
        cc = masking.make_mask(specified_passes.water, cloud=True)
        ncloud_pixels = cc.sum(dim=['x', 'y'])
        # Calculate the total number of pixels per timestep
        npixels_per_slice = (specified_passes.water.shape[1] * 
                             specified_passes.water.shape[2])
        cloud_pixels_fraction = (ncloud_pixels / npixels_per_slice)
        clear_specified_passes = specified_passes.water.isel(
            time=cloud_pixels_fraction < 0.2) #has to be under 20% cloudy to pass
        wet = masking.make_mask(clear_specified_passes, wet=True).sum(dim='time')
        dry = masking.make_mask(clear_specified_passes, dry=True).sum(dim='time')
        clear = wet + dry
        frequency = wet / clear
        frequency = frequency.fillna(0)  

        #Get area from the satellite data
        #get the frequency array
        frequency_array = frequency.values
        n_images_after_masking = len(clear_specified_passes.time)
        #Turn any pixel in the frequency array with a value greater than 0.2 into a pixel of value 1
        #if the pixel value is 0.2 or lower it gets value 0
        is_water = np.where((frequency_array > 0.2),1,0) #has to be water in more than 20% of images to count
        #give the 'frequency' xarray back its new values of zero and one
        frequency.values = is_water
        #sum up the pixels
        number_water_pixels = frequency.sum(dim=['x', 'y'])
        #get the number
        number_water_pixels = number_water_pixels.values.tolist()
        #multiply by pixel size to get area in m2
        area_m2 = number_water_pixels*(25*25)

        surface_area_list.append([ID_caller, i, area_m2, n_images_used, n_images_after_masking])
        #print('This is the area as calculated from wet pixels at', i, 'meters', area_m2)

        #Plotting the image
        if make_plots:
            frequency.plot(figsize = (7,5))
    #The images are local variables, so they are freed when the function returns and there is no
    #need to del them one by one (del would also raise an error for a reservoir with no depth intervals)
    return surface_area_list
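
The pixel counting at the heart of the function can be tried on a tiny synthetic stack. This is a sketch under assumptions: `surface_area_m2` is a hypothetical helper, the boolean arrays stand in for the wet and clear masks, and the 25 m pixel size and 0.2 threshold are copied from the function above.

```python
import numpy as np

PIXEL_AREA_M2 = 25 * 25  # 25 m x 25 m pixels, as in the function above

def surface_area_m2(wet_stack, clear_stack, threshold=0.2):
    """Area of pixels that are wet in more than `threshold` of their clear images.

    Both inputs are boolean arrays with shape (time, y, x).
    """
    wet = wet_stack.sum(axis=0)
    clear = clear_stack.sum(axis=0)
    # Pixels that were never clear give 0/0, so they are set to frequency 0
    with np.errstate(divide='ignore', invalid='ignore'):
        frequency = np.where(clear > 0, wet / clear, 0.0)
    is_water = frequency > threshold
    return int(is_water.sum()) * PIXEL_AREA_M2

# Five hypothetical images of a 2 x 2 pixel reservoir
wet_stack = np.zeros((5, 2, 2), dtype=bool)
wet_stack[:, 0, 0] = True    # wet in 5 of 5 images
wet_stack[0, 0, 1] = True    # wet in 1 of 5 images (frequency 0.2, not above the threshold)
wet_stack[:2, 1, 0] = True   # wet in 2 of 5 images
clear_stack = np.ones((5, 2, 2), dtype=bool)
clear_stack[:, 1, 1] = False # never observed clearly

print(surface_area_m2(wet_stack, clear_stack))
```

Two of the four pixels pass the threshold, so the result is 2 x 625 = 1250 m2.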

## Run the function for all of the reservoirs
OK, this is the part that's going to take a while. How long it takes depends on whether you capped the images or not. We now loop over all of the gauge data and apply the function we just made. If you're doing this with the image cap at 10000, make sure you shut down any other notebooks to free up memory (use the icon on the left with the circle and square to shut down other notebooks). The output will tell you it didn't find some of the gauges. That's because not all of the gauges are matched to a reservoir in the reservoirs shapefile (I'm having issues with the spatial join of the gauge points to the reservoir polygons, which I need to fix later).

In [16]:
array_list = []


def listsplit(N, K=1):
    #split list N into K roughly equal chunks (// keeps the slice indices as integers in Python 3)
    length = len(N)
    return [N[i*length//K: (i+1)*length//K] for i in range(K)]


for ID in tqdm(ID_list, total=len(ID_list)):
    print("Working on gauge ", ID)
    if (ID in data_dict.keys()) and (ID in results.keys()):
        data = image_prod(ID, data_dict[ID], results[ID], make_plots = False)
        array_list.append(data)
        
        del data
    else:
        print('we didnt find', ID)        

Working on gauge  ODSS_18335_WSLAHD.1
Working on gauge  604.1
Working on gauge  136115A
Working on gauge  BARKERSCREEK
Working on gauge  sp-o10350
Working on gauge  222537
Working on gauge  226230A
we didnt find 226230A
Working on gauge  579.1
we didnt find 579.1
Working on gauge  229406A
Working on gauge  RE690
Working on gauge  sp-o10298
Working on gauge  410545
Working on gauge  sp-o11534
Working on gauge  656.1
Working on gauge  171.1
Working on gauge  410717
Working on gauge  627.1
Working on gauge  UPPERCOLIBAN
Working on gauge  233272A
Working on gauge  sp-o11590
Working on gauge  233232A
Working on gauge  136023A
Working on gauge  sp-o10930
Working on gauge  231222A
we didnt find 231222A
Working on gauge  203042
Working on gauge  130354A
Working on gauge  135009A
Working on gauge  229607A
we didnt find 229607A
Working on gauge  ODSS_21168_WSLAHD.1
Working on gauge  229130A
Working on gauge  136113A
Working on gauge  145035A
we didnt find 145035A
Working on gauge  ODSS_21167_WSLAHD.1
Working on gauge  228264A
Working on gauge  ODSS_21139_WSLAHD.1
we didnt find ODSS_21139_WSLAHD.1
Working on gauge  130338A
Working on gauge  130104A
Working on gauge  143048A
Working on gauge  146034A
we didnt find 146034A
Working on gauge  286.1
Working on gauge  130304B
Working on gauge  410742
Working on gauge  143036A
Working on gauge  136020A
Working on gauge  580.1
we didnt find 580.1
Working on gauge  176.1
we didnt find 176.1
Working on gauge  401027
Working on gauge  110014A
Working on gauge  143234A
Working on gauge  412106
Working on gauge  419080
Working on gauge  422214A
Working on gauge  410791
Working on gauge  sp-o10109
Working on gauge  146033A
we didnt find 146033A
Working on gauge  130106A
Working on gauge  228263A
we didnt find 228263A
Working on gauge  410572A
Working on gauge  167.1
we didnt find 167.1
Working on gauge  120016A
Working on gauge  ODSS_47683_WSLAHD.1
Working on gauge  419041
Working on gauge  412010
Working on gauge  210102
Working on gauge  LAKE_BELLFIELD
Working on gauge  657.1
Working on gauge  125008A
Working on gauge  401571
Working on gauge  421148
Working on gauge  573.1
we didnt find 573.1
Working on gauge  190.1
we didnt find 190.1
Working on gauge  295.1
Working on gauge  sp-o10926
Working on gauge  sp-o10814
Working on gauge  410542
Working on gauge  139.2
Working on gauge  138112A
Working on gauge  410765
Working on gauge  613.1
Working on gauge  128.1
we didnt find 128.1
Working on gauge  232217A
Working on gauge  229405A
Working on gauge  658.1
Working on gauge  298.1
Working on gauge  145021A
Working on gauge  142801A
Working on gauge  225225A
we didnt find 225225A
Working on gauge  210097
Working on gauge  ODSS_57389_WSLAHD.1
Working on gauge  RE851
Working on gauge  WARTOOK
Working on gauge  625.1
Working on gauge  136003C
Working on gauge  222538
Working on gauge  135008A
Working on gauge  ODSS_21166_WSLAHD.1
Working on gauge  629.1
Working on gauge  ROCKLANDS
Working on gauge  421078
Working on gauge  138012A
Working on gauge  143228A
Working on gauge  425022
Working on gauge  RE750
Working on gauge  422315B
Working on gauge  410102
Working on gauge  401569
Working on gauge  416409A
Working on gauge  M309
Working on gauge  594.1
Working on gauge  143305A
Working on gauge  179.1
we didnt find 179.1
Working on gauge  296.1
Working on gauge  416030
Working on gauge  613.1
Working on gauge  294.1
Working on gauge  TAYLORS
Working on gauge  210117
Working on gauge  143111A
Working on gauge  120012A
we didnt find 120012A
Working on gauge  sp-o10606
Working on gauge  179.1
we didnt find 179.1
Working on gauge  222539
Working on gauge  136316A
Working on gauge  229102A
Working on gauge  211.1
we didnt find 211.1
Working on gauge  155.1
Working on gauge  229111A
Working on gauge  565.1
we didnt find 565.1
Working on gauge  138122A
we didnt find 138122A
Working on gauge  A4260553
we didnt find A4260553
Working on gauge  sp-o11454
Working on gauge  RE704
Working on gauge  ODSS_21137_WSLAHD.1
we didnt find ODSS_21137_WSLAHD.1
Working on gauge  LAURISTON
Working on gauge  120211A
Working on gauge  410573
Working on gauge  229407A
Working on gauge  230216A
we didnt find 230216A
Working on gauge  418.1
Working on gauge  419069
Working on gauge  225256A
Working on gauge  646.1
we didnt find 646.1
Working on gauge  418035
Working on gauge  SANDHURST
Working on gauge  219027
Working on gauge  742.1
Working on gauge  MALMSBURY
Working on gauge  401565
Working on gauge  DAM_MARDI.1
Working on gauge  592.1
we didnt find 592.1
Working on gauge  ODSS_18313_WSLAHD.1
we didnt find ODSS_18313_WSLAHD.1
Working on gauge  RE670
Working on gauge  ODSS_55089_WSLAHD.1
we didnt find ODSS_55089_WSLAHD.1
Working on gauge  RE604
Working on gauge  138121A
Working on gauge  ODSS_21140_WSLAHD.1
Working on gauge  158.1
Working on gauge  130314B
Working on gauge  401570
Working on gauge  228224A
Working on gauge  648.1
we didnt find 648.1
Working on gauge  231223A
we didnt find 231223A
Working on gauge  MCCAY
Working on gauge  410748
Working on gauge  136210A
Working on gauge  LKALEX
we didnt find LKALEX
Working on gauge  142108A
we didnt find 142108A
Working on gauge  562.1
Working on gauge  410131
Working on gauge  130360A
Working on gauge  sp-o10334
Working on gauge  222540
Working on gauge  410784
Working on gauge  M350
we didnt find M350
Working on gauge  410853
we didnt find 410853
Working on gauge  sp-o10438
Working on gauge  M316
Working on gauge  410543

## Make a look up table of depth to surface area for each gauge ID
Nice, now we can make the output, which is a table. The columns will be the gauge ID, the depth interval (AHD), the corresponding surface area (m2) and the number of satellite images each surface area was calculated from. The number of images column is useful because it can indicate how accurate the surface area calculation was. If you only get 3 images to calculate the surface area with, it won't be as accurate as a surface area calculated from 100 images. 

In [17]:
look_up_table = []
for i in array_list:
    df = DataFrame(i, columns = ['ID', 'Depth', 'Surface Area', 'number of images before masking', 'number of images after masking'])
    look_up_table.append(df)

df = pd.concat(look_up_table)
df = df.set_index("ID")  
df = df.drop(columns=['number of images before masking'])
df

ID,Depth,Surface Area,number of images after masking
ODSS_18335_WSLAHD.1,220,223750,15
ODSS_18335_WSLAHD.1,219,210625,52
ODSS_18335_WSLAHD.1,218,202500,62
ODSS_18335_WSLAHD.1,217,182500,82
ODSS_18335_WSLAHD.1,216,165625,60
...,...,...,...
410543,1220,10160000,30
410543,1221,10820000,26
410543,1222,11677500,13
410543,1223,5947500,6


## Save the look up table as a csv file
Well done! This is the initial depth to surface area table. When you run the next cell, the table above will appear as a csv file in your sandbox files. You can then right click it, download it to your local computer and open it in Excel. This table is also the input for the next step, which corrects for bad quality images and fills in every other metre for the big reservoirs we sliced every 2m.

In [18]:
df.to_csv('depth_to_surface_polygon_drill.csv')
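
Once the csv exists you can read it back and look up a surface area by gauge ID and depth. The sketch below uses an in-memory copy of the first three rows of the table above instead of the real file, and `area_at` is a hypothetical helper.

```python
import io
import pandas as pd

# First three rows of the table above, held in memory instead of read from disk
csv_text = """ID,Depth,Surface Area,number of images after masking
ODSS_18335_WSLAHD.1,220,223750,15
ODSS_18335_WSLAHD.1,219,210625,52
ODSS_18335_WSLAHD.1,218,202500,62
"""

# Read ID as text so that numeric-looking gauge IDs (e.g. 410543) stay strings
table = pd.read_csv(io.StringIO(csv_text), dtype={'ID': str})

def area_at(table, gauge_id, depth):
    """Surface area in m2 for one gauge and depth, or None if there is no such row."""
    rows = table[(table['ID'] == gauge_id) & (table['Depth'] == depth)]
    if rows.empty:
        return None
    return int(rows['Surface Area'].iloc[0])

area = area_at(table, 'ODSS_18335_WSLAHD.1', 219)
print(area, 'm2, which is', area // 625, 'pixels of 25 m x 25 m')
```

Dividing an area by 625 is a quick sanity check: every surface area in the table should be a whole number of pixels.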