# 2 Big Earth Data Access
### from different data repositories, e.g. Climate Data Store, Google Earth Engine or Earth on AWS

<hr>
<a href="./01_Introduction_Jupyter_Notebooks.ipynb">&lt;&lt; 1 - Introduction to Jupyter Notebooks</a>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <a href="./03_Introduction_Jupyter_widgets.ipynb">3 - Introducing Jupyter widgets &gt;&gt;</a>

## Overview of Big Earth Data repositories

A lot of open environmental data is available, but people often do not know where to find it or how to access it.
We will go through three Big Earth Data repositories today.

* [2.1 Copernicus Climate Data Store (CDS)](#cds)
* [2.2 Google Earth Engine (GEE)](#gee)
* [2.3 Earth on Amazon Web Services (AWS)](#earth_aws)



<hr>

## <a id="cds"></a> 2.1 Copernicus Climate Data Store (CDS)

The [Copernicus Climate Data Store (CDS)](https://cds.climate.copernicus.eu) is a one-stop shop for information about the climate: past, present and future. It is operated by the [European Centre for Medium-Range Weather Forecasts (ECMWF)](https://ecmwf.int).

It consists of two parts:
* Access climate datasets via a [web interface](https://cds.climate.copernicus.eu/cdsapp#!/search?type=dataset) or programmatically via the [Climate Data Store API](https://cds.climate.copernicus.eu/api-how-to)
* Analyse and visualise climate data with the [Climate Data Store toolbox (Python interface)](https://cds.climate.copernicus.eu/user/login?destination=/toolbox-user)

Data are natively available in GRIB and NetCDF.

### Data available on the CDS (a selection)

The Climate Data Store provides a wide variety of climate data, e.g.:
* **ERA5 climate reanalysis**
* **Seasonal forecasts**
* **Climate projections**
* **Sectoral climate indices**

Have a look and browse through [all the publicly available datasets on the CDS](https://cds.climate.copernicus.eu/cdsapp#!/search?type=dataset). 

ECMWF has many more publicly available datasets, e.g. data on flood, fire risk and air quality (provided by the Copernicus Atmosphere Monitoring Service). Have a look at [ECMWF's public datasets](https://apps.ecmwf.int/datasets/).


### Example: downloading ERA5 data in GRIB format

Required libraries:
* [cdsapi](https://pypi.org/project/cdsapi/) (third-party package)
* [urllib](https://docs.python.org/3/library/urllib.html) (part of the Python standard library)
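
Before running the download, it can help to check that the API credentials are in place. `cdsapi.Client()` reads a `url` and a `key` entry from a configuration file, by default `~/.cdsapirc`. The following is a minimal sketch, not part of the original notebook: the helper name `read_cdsapirc` is hypothetical and the key shown is a placeholder, not a real credential.

```python
import tempfile
from pathlib import Path


def read_cdsapirc(path=None):
    """Parse 'name: value' lines of a CDS API config file into a dict.

    Hypothetical helper for illustration; defaults to ~/.cdsapirc.
    """
    path = Path(path) if path is not None else Path.home() / ".cdsapirc"
    config = {}
    for line in path.read_text().splitlines():
        if ":" in line:
            # Split on the first colon only, since URLs and keys contain colons too
            name, value = line.split(":", 1)
            config[name.strip()] = value.strip()
    return config


# Demo on a throwaway file with placeholder values (not real credentials)
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / ".cdsapirc"
    demo_path.write_text(
        "url: https://cds.climate.copernicus.eu/api/v2\n"
        "key: <UID>:<API-KEY>\n"
    )
    demo_config = read_cdsapirc(demo_path)

print(demo_config["url"])
```

Calling `read_cdsapirc()` without an argument would read your own `~/.cdsapirc`, if it exists.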

In [15]:
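import cdsapi
import urllib.request

# cdsapi.Client() needs valid CDS API credentials (see the API how-to linked above)
c = cdsapi.Client()

def retrieve_func():
    """Request ERA5 2m temperature for 2000-02-16, 00:00 UTC, on the full global grid."""
    # Without a target file name, retrieve() only returns the result object;
    # the file itself is downloaded once, further below
    data = c.retrieve(
        'reanalysis-era5-single-levels',
        {
            'product_type': 'reanalysis',
            'format': 'grib',
            'variable': ['2m_temperature'],
            'year': ['2000'],
            'month': ['02'],
            'day': ['16'],
            'time': ['00:00'],
            'area': '90/-180/-90/179.75'  # North/West/South/East
        })
    return data


filename = "era5_t2m_test.grib"
data = retrieve_func()
# Download the prepared file from the URL returned by the CDS
urllib.request.urlretrieve(data.location, filename)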

2019-04-04 22:16:33,533 INFO Sending request to https://cds.climate.copernicus.eu/api/v2/resources/reanalysis-era5-single-levels
2019-04-04 22:16:33,905 INFO Request is queued


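
The request dictionary in the example above can also be assembled programmatically, which is handy when looping over dates or regions. The sketch below is an illustration only: the helper names `format_area` and `build_era5_request` are hypothetical, and it assumes the `area` string lists the bounding box in North/West/South/East order, as in the example.

```python
def format_area(north, west, south, east):
    """Join bounding-box edges into a 'North/West/South/East' string."""
    if south > north:
        raise ValueError("south must not exceed north")
    return "/".join(f"{edge:g}" for edge in (north, west, south, east))


def build_era5_request(variable, date, time, area):
    """Build a request dict shaped like the one in the example above.

    `date` is expected as 'YYYY-MM-DD' and `time` as 'HH:MM'.
    """
    year, month, day = date.split("-")
    return {
        "product_type": "reanalysis",
        "format": "grib",
        "variable": [variable],
        "year": [year],
        "month": [month],
        "day": [day],
        "time": [time],
        "area": area,
    }


request = build_era5_request(
    "2m_temperature", "2000-02-16", "00:00", format_area(90, -180, -90, 179.75)
)
print(request["area"])
```

The resulting dictionary could then be passed as the second argument of `c.retrieve(...)`.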

### Example: opening a GRIB file with xarray and cfgrib

Useful Python libraries to open GRIB files:
* [xarray](http://xarray.pydata.org/en/stable/)
* [cfgrib](https://github.com/ecmwf/cfgrib) - a Python interface that provides a GRIB engine for the xarray library

In [12]:
import xarray as xr

In [None]:
# engine='cfgrib' tells xarray to decode the GRIB file with cfgrib
test = xr.open_dataset('era5_t2m_test.grib', engine='cfgrib')

### Good news for R users

Koen Hufkens and Reto Stauffer just released the CRAN package [ecmwfr](https://cran.r-project.org/web/packages/ecmwfr/index.html), which provides a programmatic interface to public data at ECMWF and on the CDS.

<hr>

## <a id='gee'></a> 2.2 Google Earth Engine

[Google Earth Engine (GEE)](https://earthengine.google.com/) is a planetary-scale platform for Earth science data & analysis. 

There are several ways to work with the Google Earth Engine:
* [Code Editor](https://code.earthengine.google.com), a web-based IDE for JavaScript
* [Client libraries](https://github.com/google/earthengine-api), which provide JavaScript and Python wrapper functions for the Earth Engine API

You have to sign up for GEE before you can use it.

### Data available on GEE

Earth Engine's data archive includes:
* **Weather and Climate Data**
    * A selection of ERA5 reanalysis **[SOON PUBLICLY AVAILABLE]**
    * TRMM precipitation
* **Imagery**
    * Landsat
    * Sentinel
    * MODIS
    
... and many more. Have a look yourself at the [Earth Engine Data Catalog](https://developers.google.com/earth-engine/datasets/catalog/).

### Example: loading an image from GEE and visualizing it interactively with ipyleaflet

Required libraries:
* [Earth Engine Python API](https://developers.google.com/earth-engine/python_install)
* [ipyleaflet](https://ipyleaflet.readthedocs.io/en/latest/)

In [1]:
import ee

In [2]:
%matplotlib inline

In [3]:
from ipyleaflet import Map, basemaps, basemap_to_tiles, FullScreenControl, Marker
import ipyleaflet
import ipywidgets
import ipywidgets as widgets
from IPython.display import display, clear_output

In [4]:
ee.Initialize()

The function below is taken from Tyler Erickson's [notebooks](https://github.com/tylere/EEUS2018-JupyterSession/blob/master/02%20-%20Interactive%20Maps.ipynb) for his interactive Jupyter session at EEUS18.

In [5]:
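def GetTileLayerUrl(ee_image_object):
    """Return a tile URL template for an Earth Engine image, for use in a map tile layer."""
    map_id = ee.Image(ee_image_object).getMapId()
    # {{z}}, {{x}} and {{y}} are escaped so that they survive format() as {z}, {x} and {y}
    tile_url_template = "https://earthengine.googleapis.com/map/{mapid}/{{z}}/{{x}}/{{y}}?token={token}"
    return tile_url_template.format(**map_id)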
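
To see what the URL template in the function above produces, the formatting step can be tried without an Earth Engine connection. This is only a sketch: `fake_map_id` is a made-up stand-in for the dictionary returned by `getMapId()`, and the final `format` call merely imitates the per-tile substitution that the map widget performs.

```python
TILE_URL_TEMPLATE = "https://earthengine.googleapis.com/map/{mapid}/{{z}}/{{x}}/{{y}}?token={token}"

# Hypothetical stand-in for the result of getMapId(); the values are invented
fake_map_id = {"mapid": "abc123", "token": "xyz789"}

# First pass: fill in mapid and token; the doubled braces collapse to {z}/{x}/{y}
tile_url = TILE_URL_TEMPLATE.format(**fake_map_id)
print(tile_url)

# Second pass: imitate how a tile layer requests one tile (zoom 2, column 1, row 3)
example_tile = tile_url.format(z=2, x=1, y=3)
print(example_tile)
```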

### Load an ERA5 image and get image information

In [6]:
img_test = ee.Image('projects/ecmwf/era5_monthly/200001')
img_test.getInfo()

{'type': 'Image',
 'bands': [{'id': 't2m',
   'data_type': {'type': 'PixelType', 'precision': 'float'},
   'dimensions': [1440, 721],
   'crs': 'EPSG:4326',
   'crs_transform': [0.25, 0.0, -180.125, 0.0, -0.25, 90.125]},
  {'id': 'tp',
   'data_type': {'type': 'PixelType', 'precision': 'float'},
   'dimensions': [1440, 721],
   'crs': 'EPSG:4326',
   'crs_transform': [0.25, 0.0, -180.125, 0.0, -0.25, 90.125]}],
 'version': 1554236476943184,
 'id': 'projects/ecmwf/era5_monthly/200001',
 'properties': {'system:time_start': 946684800000.0,
  'month': 1.0,
  'year': 2000.0,
  'system:footprint': {'type': 'LinearRing',
   'coordinates': [[-180.0, -90.0],
    [180.0, -90.0],
    [180.0, 90.0],
    [-180.0, 90.0],
    [-180.0, -90.0]]},
  'system:time_end': 949363200000.0,
  'system:asset_size': 8744264,
  'system:index': '200001'}}
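
Two parts of this metadata are easier to read after a small conversion: the `system:time_start` / `system:time_end` values, which are milliseconds since 1970-01-01 UTC, and the `crs_transform`, which maps pixel indices to coordinates. The sketch below uses only the numbers printed above. It assumes the six transform values follow the usual affine ordering (x scale, x shear, x offset, y shear, y scale, y offset); the helper names are hypothetical.

```python
from datetime import datetime, timezone


def ms_to_utc(milliseconds):
    """Convert milliseconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(milliseconds / 1000, tz=timezone.utc)


def pixel_center(col, row, crs_transform):
    """Return (lon, lat) of the centre of pixel (col, row) under an affine transform."""
    x_scale, x_shear, x_origin, y_shear, y_scale, y_origin = crs_transform
    lon = x_scale * (col + 0.5) + x_shear * (row + 0.5) + x_origin
    lat = y_shear * (col + 0.5) + y_scale * (row + 0.5) + y_origin
    return lon, lat


time_start = ms_to_utc(946684800000.0)
time_end = ms_to_utc(949363200000.0)
print(time_start.isoformat(), time_end.isoformat())

transform = [0.25, 0.0, -180.125, 0.0, -0.25, 90.125]
first_pixel = pixel_center(0, 0, transform)       # top-left pixel of the 1440 x 721 grid
last_pixel = pixel_center(1439, 720, transform)   # bottom-right pixel
print(first_pixel, last_pixel)
```

Under these assumptions, the image covers January 2000 and its pixel centres run from 180°W, 90°N to 179.75°E, 90°S.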

### Select specific parameters (bands)

In [7]:
t2m = img_test.select('t2m')
tp = img_test.select('tp')

### Get the image URLs for visualization

In [8]:
t2m_url = GetTileLayerUrl(t2m)
tp_url = GetTileLayerUrl(tp)

In [9]:
map1 = ipyleaflet.Map(
    zoom=2,
    layout={'height':'500px'},
)

map1.add_layer(ipyleaflet.TileLayer(url=t2m_url))
map1.add_layer(ipyleaflet.TileLayer(url=tp_url))

# Add some fancy controls to the map, e.g. LayersControl and FullScreenControl
map1.add_control(ipyleaflet.LayersControl())
control = FullScreenControl()
map1.add_control(control)

map1

Map(basemap={'url': 'https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png', 'max_zoom': 19, 'attribution': 'Map …

### Get the image URLs and apply visualization parameters

In [10]:
t2m_palette = ['#000080', '#0000D9', '#4000FF', '#8000FF', '#0080FF', '#00FFFF', '#00FF80', '#80FF00', '#DAFF00',
               '#FFFF00', '#FFF500', '#FFDA00', '#FFB000', '#FFA400', '#FF4F00', '#FF2500', '#FF0A00', '#FF00FF']
tp_palette = ['#FFFFFF', '#00FFFF', '#0080FF', '#DA00FF', '#FFA400', '#FF0000']

t2m_url = GetTileLayerUrl(t2m.visualize(min=250, max=330, palette=t2m_palette))
tp_url = GetTileLayerUrl(tp.visualize(min=0, max=1, palette=tp_palette))
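
The `min` and `max` arguments passed to `visualize` define a value range that is stretched over the palette. The following is a rough sketch of that idea in plain Python, not Earth Engine's actual implementation: it clamps and rescales a value linearly and then picks the nearest palette entry, whereas Earth Engine's own rendering may interpolate between colours. The function names are hypothetical.

```python
def stretch(value, vmin, vmax):
    """Linearly rescale value to [0, 1], clamping anything outside [vmin, vmax]."""
    fraction = (value - vmin) / (vmax - vmin)
    return min(1.0, max(0.0, fraction))


def nearest_palette_color(value, vmin, vmax, palette):
    """Pick the palette entry closest to the stretched value (no interpolation)."""
    index = round(stretch(value, vmin, vmax) * (len(palette) - 1))
    return palette[index]


precip_palette = ['#FFFFFF', '#00FFFF', '#0080FF', '#DA00FF', '#FFA400', '#FF0000']

# A 2m temperature of 290 K sits halfway along the 250-330 K range
print(stretch(290, 250, 330))

# Values at or below min map to the first colour, values above max to the last
print(nearest_palette_color(0.0, 0, 1, precip_palette))
print(nearest_palette_color(2.5, 0, 1, precip_palette))
```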

In [11]:
map2 = ipyleaflet.Map(
    zoom=2,
    layout={'height':'500px'},
)

map2.add_layer(ipyleaflet.TileLayer(url=t2m_url))
map2.add_layer(ipyleaflet.TileLayer(url=tp_url))

# Add the layers control and the full-screen control to the map
map2.add_control(ipyleaflet.LayersControl())
control = FullScreenControl()
map2.add_control(control)

map2

Map(basemap={'url': 'https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png', 'max_zoom': 19, 'attribution': 'Map …

<hr>

## <a id='earth_aws'></a> 2.3 Earth on AWS

[Earth on AWS](https://aws.amazon.com/earth/) is a registry of open geospatial datasets on Amazon Web Services.

Required library:
* [boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) - the Amazon Web Services (AWS) SDK for Python, used here to access data on AWS S3 storage

The following example, which accesses ERA5 data in an S3 cloud storage bucket, is taken from the [example notebooks](https://github.com/planet-os/notebooks/blob/master/aws/era5-s3-via-boto.ipynb) provided by Intertrust Technologies Corporation.

In [283]:
import boto3
import botocore

In [284]:
era5_bucket = 'era5-pds'

# AWS access / secret keys required
# s3 = boto3.resource('s3')
# bucket = s3.Bucket(era5_bucket)

# No AWS keys required
client = boto3.client('s3', config=botocore.client.Config(signature_version=botocore.UNSIGNED))

In [None]:
paginator = client.get_paginator('list_objects')
result = paginator.paginate(Bucket=era5_bucket, Delimiter='/')
for prefix in result.search('CommonPrefixes'):
    print(prefix.get('Prefix'))
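
The `Delimiter='/'` argument makes S3 group object keys by their first path segment and report each group once under `CommonPrefixes`, much like listing the top-level folders of a directory tree. The sketch below imitates that grouping locally in plain Python. The keys are invented for illustration and do not claim to reflect the real layout of the bucket, and `common_prefixes` is a hypothetical helper.

```python
def common_prefixes(keys, delimiter="/"):
    """Return the distinct leading segments (including the delimiter) of the given keys.

    Keys that do not contain the delimiter are skipped; S3 would list those
    as ordinary objects rather than as common prefixes.
    """
    prefixes = []
    for key in keys:
        head, sep, _rest = key.partition(delimiter)
        if sep:
            prefix = head + sep
            if prefix not in prefixes:
                prefixes.append(prefix)
    return prefixes


# Hypothetical keys, invented for illustration only
example_keys = [
    "2008/01/data/a.nc",
    "2008/02/data/a.nc",
    "2009/01/data/a.nc",
    "readme.txt",
]
print(common_prefixes(example_keys))
```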

<hr>
&copy; 2019 | Julia Wagemann
<a rel="license" href="http://creativecommons.org/licenses/by/4.0/"><img style="float: right; border-width: 0" alt="Creative Commons License" src="https://i.creativecommons.org/l/by/4.0/88x31.png" /></a>