# Ingest Additional Rasters on Earth Engine

* Purpose of script: This notebook ingests some of the missing rasters into Earth Engine
* Author: Rutger Hofste
* Kernel used: python27
* Date created: 20170803

## Preparation

1. gcloud authorization (`gcloud init`)
1. earthengine authorization (`earthengine authorize`)
1. aws authorization (`aws configure`)
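
Before running the cells below, it can help to confirm that the three command-line tools are on the `PATH`. The following is a minimal sketch, assuming Python 3 (`shutil.which` is not available in the python27 kernel this notebook used); the helper name `find_tools` is hypothetical and not part of the original notebook.

```python
import shutil

# Command-line tools that the preparation steps above rely on.
REQUIRED_TOOLS = ["gcloud", "earthengine", "aws"]

def find_tools(tools):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}

tool_paths = find_tools(REQUIRED_TOOLS)
for tool, path in sorted(tool_paths.items()):
    print("%-12s %s" % (tool, path if path else "NOT FOUND"))
```

This only checks that the tools are installed; it does not verify that the authorization steps have been completed.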


In [16]:
!rm -r /volumes/data/PCRGlobWB20V01/additional

In [17]:
!mkdir /volumes/data/PCRGlobWB20V01/additional

In [18]:
!aws s3 cp s3://wri-projects/Aqueduct30/rawData/WRI/samplegeotiff /volumes/data/PCRGlobWB20V01/additional --recursive

download: s3://wri-projects/Aqueduct30/rawData/WRI/samplegeotiff/readme.txt to ../../../../data/PCRGlobWB20V01/additional/readme.txt
download: s3://wri-projects/Aqueduct30/rawData/WRI/samplegeotiff/sampleGeotiff.tiff to ../../../../data/PCRGlobWB20V01/additional/sampleGeotiff.tiff


Check that the files were actually copied

In [19]:
!ls /volumes/data/PCRGlobWB20V01/additional/

readme.txt  sampleGeotiff.tiff


Copy the indicator files to the EC2 instance

In [20]:
!mkdir /volumes/data/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01

mkdir: cannot create directory '/volumes/data/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01': File exists


In [21]:
!aws s3 cp \
s3://wri-projects/Aqueduct30/processData/03PCRGlobWBIndicatorsV01 \
/volumes/data/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01 --recursive

download: s3://wri-projects/Aqueduct30/processData/03PCRGlobWBIndicatorsV01/global_droughtseveritystandardisedsoilmoisture_5min_1960-2014.asc to ../../../../data/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01/global_droughtseveritystandardisedsoilmoisture_5min_1960-2014.asc
download: s3://wri-projects/Aqueduct30/processData/03PCRGlobWBIndicatorsV01/global_environmentalflows_5min_1960-2014.asc to ../../../../data/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01/global_environmentalflows_5min_1960-2014.asc
download: s3://wri-projects/Aqueduct30/processData/03PCRGlobWBIndicatorsV01/global_q3seasonalvariabilitywatersupply_5min_1960-2014.asc to ../../../../data/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01/global_q3seasonalvariabilitywatersupply_5min_1960-2014.asc
download: s3://wri-projects/Aqueduct30/processData/03PCRGlobWBIndicatorsV01/global_q1seasonalvariabilitywatersupply_5min_1960-2014.asc to ../../../../data/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01/global_q1seasonalvariabil

Create an output folder to store the GeoTIFFs

In [30]:
!mkdir /volumes/data/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01/output

## Script

Set up the working environment: define the input, output, GCS, and Earth Engine locations

In [63]:
INPUTLOCATION_SAMPLE_GEOTIFF = "/volumes/data/PCRGlobWB20V01/additional/sampleGeotiff.tiff"
INPUTLOCATION_INDICATORS = "/volumes/data/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01/"
OUTPUTLOCATION_INDICATORS = "/volumes/data/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01/output"
GCS_BASE = "gs://aqueduct30_v01/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01/"
EE_BASE = "projects/WRI-Aquaduct/PCRGlobWB20V05"

In [90]:
import sys  # needed for sys.exit below

try:
    from osgeo import ogr, osr, gdal
except ImportError:
    sys.exit('ERROR: cannot find GDAL/OGR modules')
    
from netCDF4 import Dataset
import os
import datetime
import subprocess
import pandas as pd
import re

In [92]:
def readFile(filename):
    # open the raster and read band 1 together with its georeferencing
    filehandle = gdal.Open(filename)
    band1 = filehandle.GetRasterBand(1)
    geotransform = filehandle.GetGeoTransform()
    geoproj = filehandle.GetProjection()
    Z = band1.ReadAsArray()  # array of shape (rows, columns)
    xsize = filehandle.RasterXSize
    ysize = filehandle.RasterYSize
    filehandle = None  # dereference to close the dataset
    return xsize,ysize,geotransform,geoproj,Z

def writeFile(filename,geotransform,geoprojection,data):
    # data.shape is (rows, columns); driver.Create expects (columns, rows)
    (rows,cols) = data.shape
    driver = gdal.GetDriverByName("GTiff")
    # you can change the data type, but it must be able to store negative values, including -9999
    dst_datatype = gdal.GDT_Float32
    dst_ds = driver.Create(filename,cols,rows,1,dst_datatype, [ 'COMPRESS=LZW' ])
    dst_ds.GetRasterBand(1).SetNoDataValue(-9999)
    dst_ds.GetRasterBand(1).WriteArray(data)
    dst_ds.SetGeoTransform(geotransform)
    dst_ds.SetProjection(geoprojection)
    dst_ds = None  # dereference to flush and close the file
    return 1

def splitKey(key):
    # split on the last dot only, so dots elsewhere in the key do not break the unpacking
    prefix, extension = key.rsplit(".", 1)
    fileName = prefix.split("/")[-1]
    outDict = {"fileName":fileName,"extension":extension}
    return outDict

def splitFileName(fileName):
    # file names follow <geographic_range>_<indicator>_<spatial_resolution>_<min>-<max>
    values = re.split("_|-", fileName)
    keys = ["geographic_range","indicator","spatial_resolution","temporal_range_min","temporal_range_max"]
    outDict = dict(zip(keys, values))
    outDict["fileName"] = fileName
    return outDict
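
To illustrate what the two parsing helpers return, here is a small self-contained sketch that mirrors their logic and applies it to a single key. The key is assembled from the `GCS_BASE` value and a file name that appears elsewhere in this notebook; it serves only as an example input.

```python
import re

def splitKey(key):
    # Separate the extension from the rest, then keep only the last path component.
    prefix, extension = key.rsplit(".", 1)
    fileName = prefix.split("/")[-1]
    return {"fileName": fileName, "extension": extension}

def splitFileName(fileName):
    # Underscores and hyphens both act as separators in the file name.
    values = re.split("_|-", fileName)
    keys = ["geographic_range", "indicator", "spatial_resolution",
            "temporal_range_min", "temporal_range_max"]
    outDict = dict(zip(keys, values))
    outDict["fileName"] = fileName
    return outDict

exampleKey = ("gs://aqueduct30_v01/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01/"
              "global_environmentalflows_5min_1960-2014.tif")
keyParts = splitKey(exampleKey)
nameParts = splitFileName(keyParts["fileName"])
print(keyParts)
print(nameParts)
```

Note that `splitFileName` relies on the indicator name itself containing no underscore or hyphen; a name that did would shift the remaining fields.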

In [49]:
[xsizeSample,ysizeSample,geotransformSample,geoprojSample,ZSample] = readFile(INPUTLOCATION_SAMPLE_GEOTIFF)
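
`readFile` returns the GDAL geotransform as a six-element tuple. The sketch below shows how such a tuple maps pixel indices to coordinates. The tuple values are an assumption for a global 5 arc-minute grid (4320 x 2160 cells); they are not read from `sampleGeotiff.tiff`, and the function name `pixelToCoordinate` is hypothetical.

```python
# Assumed geotransform for a global 5 arc-minute grid (illustrative values,
# not read from sampleGeotiff.tiff). Element order used by GDAL:
# (x of upper-left corner, pixel width, row rotation,
#  y of upper-left corner, column rotation, pixel height; negative means north-up)
CELL = 5.0 / 60.0
assumedGeotransform = (-180.0, CELL, 0.0, 90.0, 0.0, -CELL)

def pixelToCoordinate(geotransform, col, row):
    """Return the (x, y) coordinate of the upper-left corner of a pixel."""
    x = geotransform[0] + col * geotransform[1] + row * geotransform[2]
    y = geotransform[3] + col * geotransform[4] + row * geotransform[5]
    return x, y

# Upper-left corner of the grid, and the corner just past the last pixel.
print(pixelToCoordinate(assumedGeotransform, 0, 0))
print(pixelToCoordinate(assumedGeotransform, 4320, 2160))
```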

In [50]:
files = os.listdir(INPUTLOCATION_INDICATORS)

In [53]:
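newExtension =".tif"
for oneFile in files:
    # only the ASCII grids are converted; other directory entries are skipped
    if oneFile.endswith(".asc"):
        base, extension = os.path.splitext(oneFile)
        xsize,ysize,geotransform,geoproj,Z = readFile(os.path.join(INPUTLOCATION_INDICATORS,oneFile))
        # map both fill-value conventions (below -9990 and above 1e19) to -9999
        Z[Z<-9990]= -9999
        Z[Z>1e19] = -9999
        outputFileName = base + newExtension
        # georeferencing is taken from the sample GeoTIFF, not from the .asc file
        writeFile(os.path.join(OUTPUTLOCATION_INDICATORS,outputFileName),geotransformSample,geoprojSample,Z)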
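
The file-name handling in the loop above can be sketched without GDAL. This is a minimal illustration, assuming a hypothetical helper `outputPathFor` and a made-up output directory; it only shows which directory entries would be converted and where the result would be written.

```python
import os

def outputPathFor(fileName, outputDir, newExtension=".tif"):
    """Return the GeoTIFF path for an .asc file, or None for any other entry."""
    base, extension = os.path.splitext(fileName)
    if extension != ".asc":
        return None
    return os.path.join(outputDir, base + newExtension)

# Directory entries similar to what os.listdir could return; "output" stands
# for the output subfolder, which lives inside the input folder.
entries = ["global_environmentalflows_5min_1960-2014.asc", "output"]
for entry in entries:
    print(entry, "->", outputPathFor(entry, "/tmp/example_output"))
```

Because the output folder sits inside the input folder, `os.listdir` also returns it; the extension check is what keeps it out of the conversion.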
    

Upload to GCS

In [66]:
!gsutil -m cp \
{OUTPUTLOCATION_INDICATORS}/*.tif \
{GCS_BASE}

Copying file:///volumes/data/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01/output/global_interannualvariabilitywatersupply_5min_1960-2014.tif [Content-Type=image/tiff]...
Copying file:///volumes/data/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01/output/global_droughtseveritystandardisedsoilmoisture_5min_1960-2014.tif [Content-Type=image/tiff]...
Copying file:///volumes/data/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01/output/global_q1seasonalvariabilitywatersupply_5min_1960-2014.tif [Content-Type=image/tiff]...
Copying file:///volumes/data/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01/output/global_droughtseveritystandardisedstreamflow_5min_1960-2014.tif [Content-Type=image/tiff]...
Copying file:///volumes/data/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01/output/global_environmentalflows_5min_1960-2014.tif [Content-Type=image/tiff]...
Copying file:///volumes/data/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01/output/global_q2seasonalvariabilitywatersupply_5min_1960-2014.ti

Ingest into Earth Engine: list the uploaded files on GCS and parse their names into metadata

In [64]:
command = ("/opt/google-cloud-sdk/bin/gsutil ls %s") %(GCS_BASE)

In [94]:
keys = subprocess.check_output(command,shell=True)
keys2 = keys.decode('UTF-8').splitlines()
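
If the listing ever contained entries other than GeoTIFFs (for example a folder placeholder or a text file), the key parsing that follows would fail or produce unwanted rows. A minimal sketch of filtering the decoded lines first is shown here; the bucket name and the listing are hypothetical, and `tifKeys` is not part of the original notebook.

```python
# Hypothetical listing, shaped like the bytes returned by subprocess.check_output.
sampleListing = (
    b"gs://example-bucket/folder/\n"
    b"gs://example-bucket/folder/global_environmentalflows_5min_1960-2014.tif\n"
    b"gs://example-bucket/folder/readme.txt\n"
)

def tifKeys(listing):
    """Decode a byte listing and keep only the lines that end in .tif."""
    lines = listing.decode("UTF-8").splitlines()
    return [line for line in lines if line.endswith(".tif")]

print(tifKeys(sampleListing))
```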

In [95]:
# collect one record per key, then build the DataFrame once (index starts at 1);
# DataFrame.append was removed in pandas 2.0
records = [splitKey(key) for key in keys2]
df = pd.DataFrame(records, index=range(1, len(records)+1))

In [96]:
df.head()

| | extension | fileName |
| --- | --- | --- |
| 1 | tif | global_droughtseveritystandardisedsoilmoisture... |
| 2 | tif | global_droughtseveritystandardisedstreamflow_5... |
| 3 | tif | global_environmentalflows_5min_1960-2014 |
| 4 | tif | global_interannualvariabilitywatersupply_5min_... |
| 5 | tif | global_q1seasonalvariabilitywatersupply_5min_1... |




In [107]:
# parse every file name into its components, keeping the index of df
records = [splitFileName(fileName) for fileName in df.fileName]
df_fileName = pd.DataFrame(records, index=df.index)

In [108]:
df_fileName.head()

| | fileName | geographic_range | indicator | spatial_resolution | temporal_range_max | temporal_range_min |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | global_droughtseveritystandardisedsoilmoisture... | global | droughtseveritystandardisedsoilmoisture | 5min | 2014 | 1960 |
| 2 | global_droughtseveritystandardisedstreamflow_5... | global | droughtseveritystandardisedstreamflow | 5min | 2014 | 1960 |
| 3 | global_environmentalflows_5min_1960-2014 | global | environmentalflows | 5min | 2014 | 1960 |
| 4 | global_interannualvariabilitywatersupply_5min_... | global | interannualvariabilitywatersupply | 5min | 2014 | 1960 |
| 5 | global_q1seasonalvariabilitywatersupply_5min_1... | global | q1seasonalvariabilitywatersupply | 5min | 2014 | 1960 |


In [109]:
df_complete = df.merge(df_fileName,how='left',left_on='fileName',right_on='fileName')

In [110]:
df_complete.head()

| | extension | fileName | geographic_range | indicator | spatial_resolution | temporal_range_max | temporal_range_min |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | tif | global_droughtseveritystandardisedsoilmoisture... | global | droughtseveritystandardisedsoilmoisture | 5min | 2014 | 1960 |
| 1 | tif | global_droughtseveritystandardisedstreamflow_5... | global | droughtseveritystandardisedstreamflow | 5min | 2014 | 1960 |
| 2 | tif | global_environmentalflows_5min_1960-2014 | global | environmentalflows | 5min | 2014 | 1960 |
| 3 | tif | global_interannualvariabilitywatersupply_5min_... | global | interannualvariabilitywatersupply | 5min | 2014 | 1960 |
| 4 | tif | global_q1seasonalvariabilitywatersupply_5min_1... | global | q1seasonalvariabilitywatersupply | 5min | 2014 | 1960 |
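
The notebook stops once the metadata table is assembled. As a hedged sketch of a possible next step, the code below builds one upload command per row. It assumes the `earthengine upload image` command with its `--asset_id` and `--nodata_value` flags, and it assumes that each asset is named after its file and placed directly under `EE_BASE`; neither the naming scheme nor the helper `uploadCommand` comes from the original notebook. The commands are only printed, not executed.

```python
GCS_BASE = "gs://aqueduct30_v01/Y2017M08D02_RH_Ingest_Additional_Rasters_EE_V01/"
EE_BASE = "projects/WRI-Aquaduct/PCRGlobWB20V05"

def uploadCommand(fileName, extension, gcsBase=GCS_BASE, eeBase=EE_BASE, noData=-9999):
    """Build (but do not run) an upload command for one GeoTIFF on GCS."""
    source = "%s%s.%s" % (gcsBase, fileName, extension)
    # Assumption: the asset ID is EE_BASE plus the file name without extension.
    assetId = "%s/%s" % (eeBase, fileName)
    return "earthengine upload image --asset_id=%s --nodata_value=%s %s" % (
        assetId, noData, source)

# One row shaped like a record of df_complete (only the two fields needed here).
rows = [{"fileName": "global_environmentalflows_5min_1960-2014", "extension": "tif"}]
for row in rows:
    print(uploadCommand(row["fileName"], row["extension"]))
```

Printing the commands first makes it possible to review the asset IDs before anything is submitted; running them would then be a matter of passing each string to `subprocess`, as is done for `gsutil ls` above.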
