## Call WOfS using bounding boxes
This notebook retrieves the exact extent of each reservoir, which is needed for the depth-to-surface-area relationship. Using bounding boxes is an automated way to call the right satellite data accurately for every reservoir.

In [1]:
import os
import xarray as xr
import numpy as np
import pandas as pd
import csv
import glob    #This one lets you read all the csv files in a directory
import rasterio.crs
from pandas import DataFrame
import geopandas as gpd
import matplotlib.gridspec as gs
import matplotlib.pyplot as plt
from matplotlib import pyplot
import datacube

import sys
sys.path.append('../Scripts')
from dea_spatialtools import xr_rasterize
from datacube.utils import geometry 
from datacube.utils.geometry import CRS
from datacube.utils import masking
from datacube.helpers import ga_pq_fuser, write_geotiff
#from digitalearthau.utils import wofs_fuser
#import DEAPlotting, DEADataHandling
import warnings
warnings.filterwarnings('ignore', module='datacube')
%load_ext autoreload
%autoreload 2



## Load the bounding box shapefile
I made a shapefile in ArcMap that contains the bounding boxes of the reservoirs identified in 00_Library_reservoirs.

In [2]:
gdf = gpd.read_file('00_Lib_bound/00_Lib_bound.shp')
gdf

|  | gauge_ID | NAME | staion_nam | ORIG_FID | geometry |
|---|---|---|---|---|---|
| 0 | TAYLORS | LAKE TAYLOR | Taylors Lake | 0 | POLYGON ((142.36410 -36.82037, 142.37857 -36.7... |
| 1 | RE604 | UPPER STONY CREEK RESERVOIR | Upper Stony | 1 | POLYGON ((144.19442 -37.81257, 144.21163 -37.8... |
| 2 | sp-o10334 | LAKE EILDON | EILDON | 2 | POLYGON ((145.86701 -36.93337, 146.21666 -37.1... |
| 3 | 425022 | LAKE MENINDEE | LAKE MENINDEE | 3 | POLYGON ((142.29594 -32.24831, 142.42359 -32.3... |
| 4 | sp-o11534 | WARANGA BASIN | WARANGA BASIN | 4 | POLYGON ((145.02963 -36.55203, 145.11966 -36.4... |
| ... | ... | ... | ... | ... | ... |
| 148 | 136023A | NED CHURCHWARD WEIR | Ned Churchward HW | 148 | POLYGON ((151.94635 -25.14140, 152.05044 -25.0... |
| 149 | 136020A | BEN ANDERSON BARRAGE | Ben Anderson Barrage | 149 | POLYGON ((152.14677 -24.97170, 152.26926 -24.8... |
| 150 | 136003C | CLAUDE WHARTON WEIR | Claude Wharton HW | 150 | POLYGON ((151.52403 -25.61861, 151.59165 -25.6... |
| 151 | 125008A | MARIAN WEIR | Mirani Weir HW | 151 | POLYGON ((148.82252 -21.15658, 148.92972 -21.1... |


## Dask load the satellite data for all the reservoirs
The following code blocks were copied from a DEA notebook called 'Open and run analysis on multiple polygons'. In this case the multiple polygons are my bounding boxes from the geodataframe above. First, make a query that contains only the time range, with no x/y extent and no CRS. Then loop over the bounding boxes, turning each one into a location for the query with the datacube module called geometry. The dc.load() line goes inside the loop. I'm going to Dask load here rather than load the actual images; I'll load those later, after I've merged with the gauges.

In [3]:
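# The query holds only the time range; the location is added per reservoir in the loop
query = {'time': ('01-01-1988', '09-12-2020')} 
         #'crs': 'EPSG:3577'}
dc = datacube.Datacube(app='dc-WOfS')

results = {}  # gauge ID -> Dask-loaded WOfS dataset

for index, row in gdf.iterrows():
    print(f'Feature: {index + 1}/{len(gdf)}')

    # Use this reservoir's bounding box as the location for the query
    geom = geometry.Geometry(geom=row.geometry, crs=gdf.crs)
    query.update({'geopolygon': geom})

    # dask_chunks={} makes this a Dask (lazy) load rather than loading the images
    wofs_albers = dc.load(product='wofs_albers', dask_chunks={},
                          group_by='solar_day', **query)

    # Mask out the pixels that fall outside the bounding box polygon
    poly_mask = xr_rasterize(gdf.iloc[[index]], wofs_albers)
    wofs_albers = wofs_albers.where(poly_mask)

    # The key for each dictionary entry is the gauge ID
    results[str(row['gauge_ID'])] = wofs_albers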

Feature: 1/153
Feature: 2/153
Feature: 3/153
Feature: 4/153
Feature: 5/153
Feature: 6/153
Feature: 7/153
Feature: 8/153
Feature: 9/153
Feature: 10/153
Feature: 11/153
Feature: 12/153
Feature: 13/153
Feature: 14/153
Feature: 15/153
Feature: 16/153
Feature: 17/153
Feature: 18/153
Feature: 19/153
Feature: 20/153
Feature: 21/153
Feature: 22/153
Feature: 23/153
Feature: 24/153
Feature: 25/153
Feature: 26/153
Feature: 27/153
Feature: 28/153
Feature: 29/153
Feature: 30/153
Feature: 31/153
Feature: 32/153
Feature: 33/153
Feature: 34/153
Feature: 35/153
Feature: 36/153
Feature: 37/153
Feature: 38/153
Feature: 39/153
Feature: 40/153
Feature: 41/153
Feature: 42/153
Feature: 43/153
Feature: 44/153
Feature: 45/153
Feature: 46/153
Feature: 47/153
Feature: 48/153
Feature: 49/153
Feature: 50/153
Feature: 51/153
Feature: 52/153
Feature: 53/153
Feature: 54/153
Feature: 55/153
Feature: 56/153
Feature: 57/153
Feature: 58/153
Feature: 59/153
Feature: 60/153
Feature: 61/153
Feature: 62/153
Feature: 63/153
...
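
Because the gauge ID is used as the dictionary key in the loop above, two features sharing an ID would silently overwrite each other in `results`. A quick check along the following lines could flag that before the loop runs. This is only a sketch: `find_duplicate_ids` is a hypothetical helper, and the sample list is made up for illustration (it is not the real shapefile content). In the notebook, the input could be something like `gdf['gauge_ID'].astype(str).tolist()`.

```python
from collections import Counter


def find_duplicate_ids(ids):
    """Return the IDs that occur more than once, sorted alphabetically."""
    counts = Counter(str(i) for i in ids)
    return sorted(key for key, n in counts.items() if n > 1)


# Made-up sample: 'RE604' is repeated on purpose to show the check working
sample_ids = ['TAYLORS', 'RE604', '425022', 'RE604']
print(find_duplicate_ids(sample_ids))
```

An empty list means every ID is unique, so each reservoir gets its own dictionary entry.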

## Loop through and read all the CSV files in 00_Library
Now we have a library (a dictionary) of the WOfS data with the gauge ID as the key. We now need a matching library of the depth data that also uses the gauge ID as the key, so we can match the two up later.

In [4]:
# Make a list of the file names so we can read them with pandas
file_list = []

directory = '00_Library'
for filename in os.listdir(directory):
    if filename.endswith('.csv'):
        file_list.append(os.path.join(directory, filename))

# Read each gauge file twice: once to get the ID and once to get the data,
# then store the data in a dictionary with the ID as the key.
# May as well make a list of IDs here because we will probably use it later.
data_dict = {}
ID_list = []
for path in file_list:
    # The ID is the value in the second column of the first row
    header = pd.read_csv(path, nrows=1, escapechar='#')
    ID = header.iloc[0, 1]
    ID_list.append(ID)

    # Skip the first 9 lines to reach the data, and skip malformed rows
    # (error_bad_lines was deprecated in pandas 1.3 in favour of on_bad_lines='skip').
    # The date parser drops everything after the last '+' before parsing.
    data = pd.read_csv(path, error_bad_lines=False, skiprows=9, escapechar='#',
                       parse_dates=['Timestamp'],
                       index_col='Timestamp',
                       date_parser=lambda x: pd.to_datetime(x.rsplit('+', 1)[0]))
    data_dict[str(ID)] = data
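
The `date_parser` in the cell above relies on `rsplit('+', 1)[0]` to drop everything after the last `+` in each timestamp. The sketch below shows that string step on its own. `strip_offset` is a hypothetical name, and the sample timestamp format is an assumption, since the raw CSV contents are not shown here.

```python
def strip_offset(timestamp):
    """Drop everything after the last '+' in a timestamp string."""
    return timestamp.rsplit('+', 1)[0]


# Assumed timestamp format, for illustration only
print(strip_offset('2000-01-01T00:00:00.000+10:00'))

# A string without a '+' comes back unchanged, so a negative offset is not removed
print(strip_offset('2000-01-01T00:00:00.000-03:00'))
```

This works when every offset is written with a `+`. Files with negative offsets would need a different approach.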

In [7]:
import random

random.choice(list(results.items()))

('130304B',
 <xarray.Dataset>
 Dimensions:      (time: 789, x: 56, y: 152)
 Coordinates:
   * time         (time) datetime64[ns] 1988-01-06T23:28:45.500000 ... 2019-07...
   * y            (y) float64 -2.734e+06 -2.734e+06 ... -2.738e+06 -2.738e+06
   * x            (x) float64 1.784e+06 1.784e+06 ... 1.785e+06 1.785e+06
     spatial_ref  int32 3577
 Data variables:
     water        (time, y, x) float64 dask.array<chunksize=(1, 152, 56), meta=np.ndarray>
 Attributes:
     crs:           EPSG:3577
     grid_mapping:  spatial_ref)

In [8]:
random.choice(list(data_dict.items()))

('410131',
               Value  Quality Code  Interpolation Type
 Timestamp                                            
 2000-01-01  350.772            90                 603
 2000-01-02  350.891            90                 603
 2000-01-03  350.999            90                 603
 2000-01-04  351.072            90                 603
 2000-01-05  351.060            90                 603
 ...             ...           ...                 ...
 2020-11-11  360.649           140                 603
 2020-11-12  360.574           140                 603
 2020-11-13  360.511           140                 603
 2020-11-14  360.437           140                 603
 2020-11-15  360.377           140                 603
 
 [7625 rows x 3 columns])

## Merge the WOfS arrays with the gauge data for each reservoir
We have a library of WOfS data and a library of gauge data. In both cases, the key is the gauge ID. I've merged depth dataframes with WOfS xarrays before, but I don't know how to loop over two dictionaries to merge them...
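
One possible way to pair the two libraries is to loop over the gauge IDs they have in common. The sketch below is a minimal illustration, not a tested solution for this dataset. `pair_by_key` is a hypothetical helper, and the small stand-in dictionaries only mimic `results` and `data_dict` (the ID '999999' is made up). The merge itself is left as a placeholder, because it depends on how the depth dataframe and the WOfS xarray are combined for a single reservoir.

```python
def pair_by_key(left, right):
    """Pair the values of two dictionaries on the keys they share.

    Returns (paired, only_left, only_right), where paired maps each shared
    key to a (left_value, right_value) tuple and the other two are sorted
    lists of the keys found in only one dictionary.
    """
    shared = left.keys() & right.keys()
    paired = {key: (left[key], right[key]) for key in sorted(shared)}
    only_left = sorted(left.keys() - right.keys())
    only_right = sorted(right.keys() - left.keys())
    return paired, only_left, only_right


# Stand-in values; in the notebook these would be results and data_dict
wofs_demo = {'410131': 'wofs dataset A', '130304B': 'wofs dataset B'}
gauge_demo = {'410131': 'gauge dataframe A', '999999': 'gauge dataframe C'}

paired, wofs_only, gauge_only = pair_by_key(wofs_demo, gauge_demo)

for gauge_id, (wofs, depth) in paired.items():
    # The existing single-reservoir merge step would go here
    print(gauge_id, wofs, depth)

print('WOfS only:', wofs_only)
print('Gauge only:', gauge_only)
```

Keys that appear in only one dictionary are returned separately, so reservoirs without a match can be checked rather than silently dropped. Both dictionaries must use string keys for the comparison to work, which is why both loops above store `str(...)` of the gauge ID.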