# **Extract DayMET Data**

By Bridget Bittmann

Date created: March 28, 2022

Date editted: May 4, 2022

Purpose: This script extracts DayMET data from Google Earth Engine using the Earth Engine API in Python. This script also copied over a SSEBop ET data dataset and calculated zonal stats based on polygons. 

In [1]:
# Installs geemap package
import subprocess

try:
    import geemap
except ImportError:
    print('geemap package not installed. Installing ...')
    subprocess.check_call(["python", '-m', 'pip', 'install', 'geemap'])

# Checks whether this notebook is running on Google Colab
try:
    import google.colab
    import geemap.eefolium as emap
except:
    import geemap as emap

# Authenticates and initializes Earth Engine
import ee

try:
    ee.Initialize()
except Exception as e:
    ee.Authenticate()
    ee.Initialize()  

geemap package not installed. Installing ...
To authorize access needed by Earth Engine, open the following URL in a web browser and follow the instructions. If the web browser does not start automatically, please manually browse the URL below.

    https://code.earthengine.google.com/client-auth?scopes=https%3A//www.googleapis.com/auth/earthengine%20https%3A//www.googleapis.com/auth/devstorage.full_control&request_id=aJ7fS5YrCkXu8a0rCNhujC93TNu1srG4Hx5lfcAxXV8&tc=LoMrXnXhtJHCcHnhHHQQKcwriFyX5ADViKuM6TmM4Is&cc=w_ZIZm8uWtBkNhBY9RXzkedpYtHMO_BxNqY07tKMf4w

The authorization workflow will generate a code, which you should paste in the box below. 
Enter verification code: 4/1AX4XfWhMLTf2zfm54YzUW2nY9VAzuCF_HSml7Aa0kAziPxuCNs7LcaQp0MU

Successfully saved authorization token.


In [2]:
!pip install geopandas
import geopandas as gpd #import independent shapefile
import json #for metadata of shapefile
import os #for file paths
import numpy as np #for stats and arrays
import pandas as pd #for dataframes

Collecting geopandas
  Downloading geopandas-0.10.2-py2.py3-none-any.whl (1.0 MB)
[K     |████████████████████████████████| 1.0 MB 11.0 MB/s 
[?25hCollecting pyproj>=2.2.0
  Downloading pyproj-3.2.1-cp37-cp37m-manylinux2010_x86_64.whl (6.3 MB)
[K     |████████████████████████████████| 6.3 MB 11.4 MB/s 
Collecting fiona>=1.8
  Downloading Fiona-1.8.21-cp37-cp37m-manylinux2014_x86_64.whl (16.7 MB)
[K     |████████████████████████████████| 16.7 MB 341 kB/s 
Collecting click-plugins>=1.0
  Downloading click_plugins-1.1.1-py2.py3-none-any.whl (7.5 kB)
Collecting cligj>=0.5
  Downloading cligj-0.7.2-py3-none-any.whl (7.1 kB)
Collecting munch
  Downloading munch-2.5.0-py2.py3-none-any.whl (10 kB)
Installing collected packages: munch, cligj, click-plugins, pyproj, fiona, geopandas
Successfully installed click-plugins-1.1.1 cligj-0.7.2 fiona-1.8.21 geopandas-0.10.2 munch-2.5.0 pyproj-3.2.1


In [3]:
#Connect to Google Drive if you want to export images
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [4]:
#Go to correct folder in Google Drive
%cd drive/MyDrive/spatial_colab/datasets/
%ls

/content/drive/MyDrive/spatial_colab/datasets
[0m[01;34mclimate_stats[0m/         [01;34mirrig_lbrb[0m/  [01;34mlcmap_files[0m/   [01;34mPOUs[0m/
[01;34mdiversion_timeseries[0m/  [01;34mIrrMapper[0m/   [01;34mmasked[0m/        [01;34msubset_test_shp[0m/
[01;34mirrigation_companies[0m/  [01;34mLBRB_shp[0m/    [01;34moutput_files[0m/


In [5]:
## ----------------------------------- ## 
## 1. Import shapefile to clip dataset ##
## ----------------------------------- ## 

shp_file = 'POUs/POUs_EDIT_051822_Merge.shp'
subset = emap.shp_to_ee(shp_file)

map=emap.Map(center=(43.6150, -116.2023),zoom=8)
map.addLayer(ee.Image().paint(subset, 0, 2), {}, 'POU')
map.addLayerControl()
map

In [6]:
## --------------------------------------- ##
## 2. IMPORT THE DAYMET DATA FOR MAX TEMPS ##
## --------------------------------------- ##

years = np.arange(1987,2021)
mean_max = []
for i in range(len(years)):
  daymet = ee.ImageCollection("NASA/ORNL/DAYMET_V4").filterDate((str(years[i])+'-06-01'), (str(years[i])+'-8-31')) #get image collection
  mxtmp = daymet.select('tmax').map(lambda image: image.clip(subset)).mean().set({'system:index':str(years[i])}) #select the bands to analyze
  mean_max.append(mxtmp) #calculate the mean across all pixels

means_max_temp = ee.ImageCollection(mean_max) #convert list of image to image collection for zonal stats command

maximumTemperatureVis = {
  'min': -40.0,
  'max': 30.0,
  'palette': ['1621A2', 'white', 'cyan', 'green', 'yellow', 'orange', 'red'],
}

Map = emap.Map(center=(43.6150, -116.2023),zoom=8)
Map.addLayer(means_max_temp, maximumTemperatureVis, 'tmax')
Map

In [105]:
## Ask kendra if I should use this or an automated call for zonal stats, either works
## and both csvs need to be editted to an extent

## could define a spatial reduce function

def spatial_reduction(image_collection, shape_file, reducer, scale, file_name, format, output_folder, name):

  '''
  This function is used to produce zonal statistics out of an image collection in GEE.

  Variables:
  image_collection: A time series of ee.ImageCollection
  shapefile: The spatial area to reduce the image to. 
  reducer: The ee.Reducer to calculate. It can be a mixture of reducers or just one.
  scale: Pixel size
  file_name: The output table file name.
  format: The output table format, can include 'CSV', "GeoJSON", "KML", "KMZ", or "SHP", or "TFRecord".
  output_folder: The file path for the output file. 
  name: The climate statistic you are calculating (i.e., 'precip_mean')
  '''
  reduce_step = ee.FeatureCollection(image_collection.map(lambda image: image.reduceRegions(collection=shape_file,
                                                                                  scale=scale,
                                                                                  reducer=reducer)))
  flat_reduction = reduce_step.flatten()
  task = ee.batch.Export.table.toDrive(collection = flat_reduction,
                                     description=file_name,
                                     fileNamePrefix=file_name,
                                     fileFormat=format,
                                     folder=output_folder
  )
  task.start()
  while(True):
    if 'COMPLETED' not in task.status().values():
      print(task.status())
      time.sleep(10)
    else:
      file_path = os.path.join(output_folder+'/'+file_name+'.csv')
      df = pd.read_csv(file_path)
      df = df.drop(columns=['.geo', 'Shape_Area', 'Shape_Leng'], axis=1)
      df['Year'] = df['system:index'].str.slice(start=0, stop=4)
      df = df.drop('system:index', axis=1).rename(columns={'mean':name})
      break 

  return df

In [106]:
new_df = spatial_reduction(image_collection=means_max_temp, 
                  shape_file = subset,
                  reducer = ee.Reducer.mean(),
                  scale=1000,
                  file_name = 'function_try',
                  format = 'CSV',
                  output_folder = 'climate_stats',
                  name = 'max_temp_mean')

{'state': 'READY', 'description': 'function_try', 'creation_timestamp_ms': 1653077709471, 'update_timestamp_ms': 1653077709471, 'start_timestamp_ms': 0, 'task_type': 'EXPORT_FEATURES', 'id': 'ICAP2LDAVQ3GJGL3ZZQXFQL6', 'name': 'projects/earthengine-legacy/operations/ICAP2LDAVQ3GJGL3ZZQXFQL6'}
{'state': 'RUNNING', 'description': 'function_try', 'creation_timestamp_ms': 1653077709471, 'update_timestamp_ms': 1653077717786, 'start_timestamp_ms': 1653077717750, 'task_type': 'EXPORT_FEATURES', 'attempt': 1, 'id': 'ICAP2LDAVQ3GJGL3ZZQXFQL6', 'name': 'projects/earthengine-legacy/operations/ICAP2LDAVQ3GJGL3ZZQXFQL6'}


In [64]:
## ------------------------------------------- ##
## 3. IMPORT THE DAYMET DATA FOR PRECIPITATION ##
## ------------------------------------------- ##

years = np.arange(1986,2022)
sum_pr = []
for i in years:
  daymet = ee.ImageCollection("NASA/ORNL/DAYMET_V4").filterDate((str(i)+'-10-01'), (str((i+1))+'-9-30')) #get image collection
  prcp = daymet.select('prcp').map(lambda image: image.clip(subset)).sum().set({'system:index':str(i)}) #select the bands to analyze
  sum_pr.append(prcp) #calculate the mean across all pixels

sum_precip = ee.ImageCollection(sum_pr) #convert list of image to image collection for zonal stats command

precip_vis = {
  'min': 0,
  'max': 544,
  'palette': ['1621A2', 'white', 'cyan', 'green', 'yellow', 'orange', 'red'],
}

Map = emap.Map(center=(43.6150, -116.2023),zoom=8)
Map.addLayer(sum_precip, precip_vis, 'prcp')
Map

In [67]:
new_df = spatial_reduction(image_collection=sum_precip, 
                  shape_file = subset,
                  reducer = ee.Reducer.mean(),
                  scale=1000,
                  file_name = 'precip_sums_try',
                  format = 'CSV',
                  output_folder = 'climate_stats',
                  name = 'precip_sum_mean')

{'state': 'READY', 'description': 'precip_sums_try', 'creation_timestamp_ms': 1653069937204, 'update_timestamp_ms': 1653069937204, 'start_timestamp_ms': 0, 'task_type': 'EXPORT_FEATURES', 'id': '2JSBXBHBHK4V24J5NFP7U3HZ', 'name': 'projects/earthengine-legacy/operations/2JSBXBHBHK4V24J5NFP7U3HZ'}


In [68]:
new_df

Unnamed: 0,WaterRight,precip_sum_mean,Year
0,Andrews,183.372697,1986
1,Atwell,180.050000,1986
2,Ballentyne,192.502740,1986
3,Barber,193.929075,1986
4,Bates,183.398532,1986
...,...,...,...
2299,Thomas Aiken,131.388089,2021
2300,Thurman Mill,130.748151,2021
2301,Upper Centerpoint,83.465471,2021
2302,Wagner,93.770000,2021


In [None]:
## ----------------------------- ##
## 4. IMPORT MONTHLY PRECIP DATA ##
## ----------------------------- ##

years = np.arange(1987,2021)
months = np.arange(3,11)
sum_month_pr = []

for i in years:
  for m in months:
    if (m == 3) or (m == 5) or (m == 7) or (m == 8) or (m==10):
      daymet = ee.ImageCollection("NASA/ORNL/DAYMET_V4").filterDate((str(i)+'-'+str(m)+'-01'), (str(i)+'-'+str(m)+'-31')) #get image collection
      prcp = daymet.select('prcp').map(lambda image: image.clip(subset)).sum().set('system:index', (str(i)+'-'+str(m))) #select the bands to analyze
      sum_month_pr.append(prcp) #calculate the mean across all pixels
    else: 
      daymet = ee.ImageCollection("NASA/ORNL/DAYMET_V4").filterDate((str(i)+'-'+str(m)+'-01'), (str(i)+'-'+str(m)+'-30')) #get image collection
      prcp = daymet.select('prcp').map(lambda image: image.clip(subset)).sum().set('system:index', (str(i)+'-'+str(m))) #select the bands to analyze
      sum_month_pr.append(prcp) #calculate the mean across all pixels

sum_month_precip = ee.ImageCollection(sum_month_pr) #convert list of image to image collection for zonal stats command


In [None]:
## IMPORT MONTHLY ET DATA ##

et_irrig = []

for y in years:
  et_data = ee.ImageCollection('projects/earthengine-legacy/assets/users/bridgetbittmann/ssebop/boise').filterDate((str(y)+'-03-01'), str(y)+'-10-31')
  et = et_data.map(lambda image: image.clip(subset)).sum().multiply(0.00001).set({'system:index': str(y)}) # sum and convert to meters 
  et_irrig.append(et)

et_irrig = ee.ImageCollection(et_irrig)

et_cut = []

for y in years:
  et_data = ee.ImageCollection('projects/earthengine-legacy/assets/users/bridgetbittmann/ssebop/boise').filterDate((str(y)+'-04-01'), str(y)+'-9-30')
  et_march = ee.ImageCollection('projects/earthengine-legacy/assets/users/bridgetbittmann/ssebop/boise').filterDate((str(y)+'-03-01'), str(y)+'-03-31')
  et_march = et_march.map(lambda image: image.clip(subset)).sum().multiply(0.00001).set({'system:index': (str(y)+'-03')})
  et_oct = ee.ImageCollection('projects/earthengine-legacy/assets/users/bridgetbittmann/ssebop/boise').filterDate((str(y)+'-10-01'), str(y)+'-10-31')
  et_oct = et_oct.map(lambda image: image.clip(subset)).sum().multiply(0.00001).set({'system:index': (str(y)+'-10')})
  et = et_data.map(lambda image: image.clip(subset)).sum().multiply(0.00001).set({'system:index': str(y)}) # sum and convert to meters 
  et_cut.append(et)
  et_cut.append(et_march)
  et_cut.append(et_oct)

et_cut = ee.ImageCollection(et_cut)
  

In [None]:

## ------------------------ ##
## 4. CALCULATE ZONAL STATS ##
## ------------------------ ##

# Allowed output formats: csv, shp, json, kml, kmz
# Allowed statistics type: MEAN, MAXIMUM, MINIMUM, MEDIAN, STD, MIN_MAX, VARIANCE, SUM

out_stats = os.path.join('climate_stats/maxtemp_stats.csv')
emap.zonal_statistics(means_max_temp, subset, out_stats, statistics_type='MEAN', scale=1000)

out_stats = os.path.join('climate_stats/precip_stats.csv')
emap.zonal_statistics(sum_precip, subset, out_stats, statistics_type='MEAN', scale=1000)

out_stats = os.path.join('climate_stats/month_precip_stats.csv')
emap.zonal_statistics(sum_month_precip, subset, out_stats, statistics_type='MEAN', scale=1000)

out_stats = os.path.join('climate_stats/et.csv')
emap.zonal_statistics(et_irrig, subset, out_stats, statistics_type='MEAN', scale=30)

out_stats = os.path.join('climate_stats/et_cut.csv')
emap.zonal_statistics(et_cut, subset, out_stats, statistics_type='MEAN', scale=30)

Computing statistics ...
Generating URL ...
Downloading data from https://earthengine.googleapis.com/v1alpha/projects/earthengine-legacy/tables/eed2f6e88eec27791c56e8caa75591d7-91e84a056c8a18cdc2493dee5f82e8e3:getFeatures
Please wait ...
Data downloaded to /content/drive/MyDrive/spatial_colab/datasets/climate_stats/maxtemp_stats.csv
Computing statistics ...
Generating URL ...
Downloading data from https://earthengine.googleapis.com/v1alpha/projects/earthengine-legacy/tables/de0f1e4d5175ae46db60c81f3ec77234-91ad0ebce45a79d99a4e326436aba511:getFeatures
Please wait ...
Data downloaded to /content/drive/MyDrive/spatial_colab/datasets/climate_stats/precip_stats.csv
Computing statistics ...
Generating URL ...
Downloading data from https://earthengine.googleapis.com/v1alpha/projects/earthengine-legacy/tables/f5862b0616a38934c3ec2b39aa251b74-8df5216e07319824d6932c0fc35cf69a:getFeatures
Please wait ...
Data downloaded to /content/drive/MyDrive/spatial_colab/datasets/climate_stats/month_precip_s

In [None]:
## ---------------------------------------------- ##
## 5. CREATE CLIMATE STAT FOR EACH POU AND EXPORT ##
## ---------------------------------------------- ##

precip = pd.read_csv('climate_stats/precip_stats.csv')
max_temp = pd.read_csv('climate_stats/maxtemp_stats.csv')

names = precip['WaterRight']

for i in range(len(names)):
  df = pd.DataFrame(years, columns=['Year'])
  df['DIV_NAME'] = names[i]
  df['Precip_mm'] = precip.iloc[i,0:34].values
  df['Max_temp'] = max_temp.iloc[i,2:36].values
  out_path = os.path.join('climate_stats/final/'+names[i]+'_climate.csv')
  df.to_csv(out_path)


In [None]:
## --------------- ## 
## COMPARE ET DATA ##
## --------------- ## 

years = np.arange(1987,2021)
et_whole = pd.read_csv('climate_stats/et.csv').drop(columns = ['Shape_Area', 'system:index', 'Shape_Leng'], axis=1)
et_cut = pd.read_csv('climate_stats/et_cut.csv')


names = et_whole['WaterRight']
et = []

march = pd.DataFrame(et_cut['WaterRight'])
october = pd.DataFrame(et_cut['WaterRight'])
cut = pd.DataFrame(et_cut['WaterRight'])

for i in years:
  fill = str(i)
  march[fill] = et_cut[et_cut.columns[et_cut.columns == (str(i)+'-03_et')]]
  october[fill] = et_cut[et_cut.columns[et_cut.columns == (str(i)+'-10_et')]]
  cut[fill] = et_cut[et_cut.columns[et_cut.columns == (str(i)+'_et')]]

for i in range(len(names)):
  df = pd.DataFrame(years, columns=['Year'])
  df['DIV_NAME'] = names[i]
  df['et_whole'] = et_whole.iloc[i,0:34].values
  df['et_cut'] = cut.iloc[i,1:35].values
  df['march'] = march.iloc[i,1:35].values
  df['october'] = october.iloc[i,1:35].values
  et.append(df)

et = pd.concat(et)

et['diff'] = et['et_whole'] - et['et_cut']

avg = et.groupby('DIV_NAME')['et_whole', 'et_cut', 'diff', 'march', 'october'].mean()
avg.to_csv('climate_stats/et_avgs.csv')




In [None]:
## ------------------------- ##
## COPY ET DATA to Cloud API ##
## ------------------------- ##

# folder from where to copy
src_folder = "projects/earthengine-legacy/assets/users/dgketchum/ssebop/boise"
# folder where to copy
dest_folder = "projects/ee-bridgetbittmann/assets/ssebop"

# get all assets in the folder
assets = ee.data.listAssets({'parent': src_folder})

# loop through assets and copy them one by one to the new destination
for asset in assets['assets']:
    # construct destination path
    new_asset = dest_folder + '/' + asset['id'].split('/')[-1]
    # copy to destination
    ee.data.copyAsset(asset['id'], new_asset, True)

In [None]:
## ------------------------------------------- ##
## COPY ET DATA from Cloud API to Earth Engine ##
## ------------------------------------------- ##

# folder from where to copy
src_folder = "projects/ee-bridgetbittmann/assets/ssebop"
# folder where to copy
dest_folder = "projects/earthengine-legacy/assets/users/bridgetbittmann/ssebop/boise"

# get all assets in the folder
assets = ee.data.listAssets({'parent': src_folder})

# loop through assets and copy them one by one to the new destination
for asset in assets['assets']:
    # construct destination path
    new_asset = dest_folder + '/' + asset['id'].split('/')[-1]
    # copy to destination
    ee.data.copyAsset(asset['id'], new_asset, True)