# Tutorial MUR SST on AWS  

(Parts of this tutorial were adapted from Pangeo's ocean_ssh_example.ipynb by Dr. Ryan Abernathey.)
  
Credits:

tutorial development

* [Dr. Chelle Gentemann](mailto:gentemann@faralloninstitute.org)    - Farallon Institute, USA
* [Dr. Marisol Garcia-Reyes](mailto:marisolgr@faralloninstitute.org)  - Farallon Institute, USA 
* [Dr. Rich Signell](mailto:rsignell@usgs.gov) - USGS

creation of the Zarr MUR SST dataset

* [Aimee Barciauskas](mailto:aimee@developmentseed.org) - Development Seed
* [Dr. Rich Signell](mailto:rsignell@usgs.gov) - USGS
* [Dr. Chelle Gentemann](mailto:gentemann@faralloninstitute.org)    - Farallon Institute, USA
-------------


# Structure of this tutorial

1. Opening data
2. Data plotting and exploration

# 1. Opening data

-------------------

## Import python packages

Turn off warning messages and set the xarray display options.

In [None]:
import warnings
import numpy as np
import pandas as pd
import xarray as xr
import fsspec

warnings.simplefilter('ignore') # filter some warning messages
xr.set_options(display_style="html")  #display dataset nicely 

### Start a cluster, the key to reading efficiently on the cloud

- This will set up a cluster for you and give you a path that you can paste into the top of the Dask dashboard to visualize parts of your cluster.  You don't need to paste it into the Dask dashboard for this to work.

In [None]:
from dask_kubernetes import KubeCluster
from dask.distributed import Client

In [None]:
cluster = KubeCluster(n_workers=20)
client = Client(cluster)
cluster

### Initialize Dataset

Here we load the dataset from the zarr store. Note that this very large dataset initializes nearly instantly, and we can see the full list of variables and coordinates.

In [None]:
file_location = 's3://nasa-eodc/eodc/mursst_zarr/5x1799x3600'
ds_sst = xr.open_zarr(fsspec.get_mapper(file_location, anon=True),consolidated=True)
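
The open is fast because only metadata is read at this point; the array data is fetched chunk by chunk when a computation needs it. As a rough back-of-envelope sketch, the size of one chunk in memory can be estimated as below. Two assumptions are made here that the tutorial does not confirm: that the store name `5x1799x3600` reflects the chunk shape (time, lat, lon), and that values decode to 4-byte floats.

```python
# Assumption: chunk shape is read off the store name "5x1799x3600" (time, lat, lon)
chunk_shape = (5, 1799, 3600)
# Assumption: decoded values are 4-byte floats (float32)
bytes_per_value = 4

# Multiply the chunk dimensions to get the number of values in one chunk
n_values = 1
for n in chunk_shape:
    n_values *= n

chunk_bytes = n_values * bytes_per_value
chunk_mib = chunk_bytes / 2**20
print(f"{n_values:,} values per chunk, about {chunk_mib:.1f} MiB")
```

An estimate like this helps when deciding how many chunks a worker can hold in memory at once.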

### Examine Metadata

For those unfamiliar with this dataset, the variable metadata is very helpful for understanding what the variables actually represent.
Printing the dataset shows the dimensions, coordinates, and data variables, with clickable icons at the end of each row that reveal more metadata and the size.

In [None]:
ds_sst
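
The same metadata can also be read programmatically. The sketch below uses a tiny synthetic dataset rather than the real store, so the attribute values (`units`, `long_name`) and coordinate values are placeholders chosen for illustration; only the access pattern carries over.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the real dataset; attribute values are placeholders
demo = xr.Dataset(
    {
        "analysed_sst": (
            ("lat", "lon"),
            np.full((2, 3), 290.0, dtype="float32"),
            {"units": "kelvin", "long_name": "analysed sea surface temperature"},
        )
    },
    coords={"lat": [10.0, 10.01], "lon": [-140.0, -139.99, -139.98]},
)

# Variable attributes are stored in a plain dictionary
sst_attrs = demo["analysed_sst"].attrs
print(sst_attrs["units"], "|", sst_attrs["long_name"])

# Dimension names and lengths
sizes = dict(demo.sizes)
print(sizes)
```

With the real data, the equivalent calls would be `ds_sst['analysed_sst'].attrs` and `ds_sst.sizes`.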

# 2. Data plotting and exploration

``xarray`` plotting functions rely on matplotlib internally, but they make use of all available metadata to make the plotting operations more intuitive and interpretable.  
More plotting examples are given in the [xarray plotting documentation](http://xarray.pydata.org/en/stable/plotting.html).

### Here we use ``holoviews`` and ``hvplot`` for interactive graphics

In [None]:
import hvplot.xarray
import holoviews as hv
from holoviews.operation.datashader import regrid
hv.extension('bokeh')

#### Let's explore the data

In [None]:
sst = ds_sst['analysed_sst']

### Plot a timeseries in the 2015 Blob Region

In [None]:
sst.sel(lon=-140, lat=53).hvplot(grid=True)
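
Selecting by exact coordinate value only works when the requested point lies on the grid. A minimal sketch on a synthetic 0.5-degree grid (not the MUR grid) shows how `method="nearest"` picks the closest grid point when the requested location falls between points:

```python
import numpy as np
import xarray as xr

# Synthetic 0.5-degree grid standing in for the real SST field
lat = np.arange(50.0, 56.0, 0.5)
lon = np.arange(-142.0, -138.0, 0.5)
demo = xr.DataArray(
    np.zeros((lat.size, lon.size)),
    coords={"lat": lat, "lon": lon},
    dims=("lat", "lon"),
)

# 53.2N, 140.3W is not a grid point; method="nearest" returns the closest one
point = demo.sel(lon=-140.3, lat=53.2, method="nearest")
nearest_lat = float(point["lat"])
nearest_lon = float(point["lon"])
print(nearest_lat, nearest_lon)
```

Without `method="nearest"`, the same call raises a `KeyError` because 53.2 and -140.3 are not coordinate values of this grid.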

### Plot a global image of SST on 1/1/2005

In [None]:
%%time
sst.sel(time='2005-01-01').hvplot.quadmesh(x='lon', y='lat', geo=True, 
                rasterize=True, cmap='rainbow', tiles='EsriImagery')

### Subset the El Niño/La Niña Region:

In [None]:
sst_elnino = sst.sel(lon=slice(-180,-70), lat=slice(-25,25))
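
A slice such as `slice(-25, 25)` follows the order in which the coordinate is stored. The sketch below uses a synthetic latitude axis to show the difference; it assumes nothing about the MUR store itself, although the selection above implies that its `lat` coordinate is ascending.

```python
import numpy as np
import xarray as xr

lat = np.arange(-30.0, 31.0, 5.0)  # ascending: -30, -25, ..., 30
ascending = xr.DataArray(np.zeros(lat.size), coords={"lat": lat}, dims="lat")
# Same data with the latitude axis reversed (north to south)
descending = ascending.isel(lat=slice(None, None, -1))

# On an ascending axis, slice(low, high) selects the band
n_asc = ascending.sel(lat=slice(-25, 25)).sizes["lat"]
# On a descending axis, the same slice selects nothing
n_desc_wrong = descending.sel(lat=slice(-25, 25)).sizes["lat"]
# The bounds must be given in the axis order instead
n_desc_right = descending.sel(lat=slice(25, -25)).sizes["lat"]

print(n_asc, n_desc_wrong, n_desc_right)
```

If a regional subset comes back empty, checking the coordinate order is a good first step.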

### Difference the monthly mean temperature fields for Jan 2016 (El Niño) and Jan 2014 (normal)

In [None]:
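# label-based slices include both endpoints, so end on 31 January to keep 1 February out of the mean
sst_jan2016 = sst_elnino.sel(time=slice('2016-01-01','2016-01-31')).mean(dim='time')
sst_jan2014 = sst_elnino.sel(time=slice('2014-01-01','2014-01-31')).mean(dim='time')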
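
It is worth checking how many days a time slice actually selects. A small sketch on synthetic daily data (not the MUR dataset) shows that label-based slices in xarray include both endpoints, while a partial date string such as `"2016-01"` selects the whole calendar month:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily series covering the dates of interest
time = pd.date_range("2015-12-01", "2016-03-01", freq="D")
demo = xr.DataArray(
    np.arange(time.size, dtype="float64"), coords={"time": time}, dims="time"
)

# Both endpoints are included, so this also picks up 1 February
n_inclusive = demo.sel(time=slice("2016-01-01", "2016-02-01")).sizes["time"]

# A partial date string selects the full calendar month
n_january = demo.sel(time="2016-01").sizes["time"]

print(n_inclusive, n_january)
```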

In [None]:
%%time
(sst_jan2016 - sst_jan2014).hvplot.quadmesh(x='lon', y='lat', geo=True, 
                rasterize=True, cmap='rainbow', tiles='EsriImagery')

### Plotting on maps

For plotting on maps, we rely on the excellent [cartopy](http://scitools.org.uk/cartopy/docs/latest/index.html) library.

In [None]:
import cartopy.crs as ccrs

### In cartopy you need to define the map projection you want to plot.  

- Common ones are Orthographic and PlateCarree.
- You can add coastlines and gridlines to the axes as well.

### Create monthly average SST fields for 2015, then plot them

In [None]:
# slices include both endpoints; end on 31 December so no 2016 data enters the 2015 means
sst_2016_monthly = sst.sel(time=slice('2015-01-01','2015-12-31')).groupby('time.month').mean(dim='time')
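
Grouping by `time.month` replaces the `time` dimension with a `month` dimension of length 12. A minimal sketch on a synthetic daily series, where each value is set to its calendar month number so the result is easy to check:

```python
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2015-01-01", "2015-12-31", freq="D")
# Each value equals its calendar month number, so every monthly mean is known in advance
demo = xr.DataArray(
    time.month.values.astype("float64"), coords={"time": time}, dims="time"
)

# One mean per calendar month; the result is indexed by "month" (1..12)
monthly = demo.groupby("time.month").mean(dim="time")
july_mean = float(monthly.sel(month=7))

print(monthly.dims, monthly.sizes["month"], july_mean)
```

When the input spans several years, the same call pools every January together, every February together, and so on, giving a monthly climatology.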

In [None]:
sst_2016_monthly.hvplot.quadmesh(x='lon', y='lat', geo=True, 
                rasterize=True, cmap='rainbow', projection=ccrs.Orthographic(-80, 35))

### Add coastline

In [None]:
sst_2016_monthly.hvplot.quadmesh(x='lon', y='lat', geo=True, 
                rasterize=True, cmap='rainbow', projection=ccrs.Orthographic(-80, 35),
                                           coastline='110m')

## A nice cartopy tutorial is [here](http://earthpy.org/tag/visualization.html)

# xarray can do more!

* concatenation
* open network-located files with OPeNDAP
* import and export Pandas DataFrames
* .nc dump to 
* groupby_bins
* resampling and reduction
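
A few of the items above can be sketched on synthetic data. The helper `make_series` and the variable name `sst` are hypothetical and exist only for this illustration:

```python
import numpy as np
import pandas as pd
import xarray as xr


def make_series(start, periods):
    """Hypothetical helper: a daily series of ones starting at `start`."""
    time = pd.date_range(start, periods=periods, freq="D")
    return xr.DataArray(
        np.ones(periods), coords={"time": time}, dims="time", name="sst"
    )


# Concatenation along the time dimension
jan = make_series("2015-01-01", 31)
feb = make_series("2015-02-01", 28)
combined = xr.concat([jan, feb], dim="time")

# Export to a pandas DataFrame and import it back
df = combined.to_dataframe()
roundtrip = df.to_xarray()["sst"]

# Resampling and reduction: sum over month-start ("MS") bins
monthly_sum = combined.resample(time="MS").sum()
print(monthly_sum.values)
```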

For more details, read this blog post: http://continuum.io/blog/xray-dask


## Where can I find more info?

### For more information about xarray

- Read the [online documentation](http://xarray.pydata.org/)
- Ask questions on [StackOverflow](http://stackoverflow.com/questions/tagged/python-xarray)
- View the source code and file bug reports on [GitHub](http://github.com/pydata/xarray/)

### For more on doing data analysis with Python:

- Thomas Wiecki, [A modern guide to getting started with Data Science and Python](http://twiecki.github.io/blog/2014/11/18/python-for-data-science/)
- Wes McKinney, [Python for Data Analysis](http://shop.oreilly.com/product/0636920023784.do) (book)

### Packages building on xarray for the geophysical sciences

For analyzing GCM output:

- [xgcm](https://github.com/xgcm/xgcm) by Ryan Abernathey
- [oocgcm](https://github.com/lesommer/oocgcm) by Julien Le Sommer
- [MPAS xarray](https://github.com/pwolfram/mpas_xarray) by Phil Wolfram
- [marc_analysis](https://github.com/darothen/marc_analysis) by Daniel Rothenberg

Other tools:

- [windspharm](https://github.com/ajdawson/windspharm): wind spherical harmonics by Andrew Dawson
- [eofs](https://github.com/ajdawson/eofs): empirical orthogonal functions by Andrew Dawson
- [infinite-diff](https://github.com/spencerahill/infinite-diff) by Spencer Hill 
- [aospy](https://github.com/spencerahill/aospy) by Spencer Hill and Spencer Clark
- [regionmask](https://github.com/mathause/regionmask) by Mathias Hauser
- [salem](https://github.com/fmaussion/salem) by Fabien Maussion

Resources for teaching and learning xarray in geosciences:

- [Fabien's teaching repo](https://github.com/fmaussion/teaching): courses that combine teaching climatology and xarray
