# Case Study: 2017 Northern Plains Flash Drought

In [None]:
import earthaccess
import numpy as np
import xarray as xr
from matplotlib import pyplot

auth = earthaccess.login()

## Before We Get Started

For this case study, we're going to download some data from [the North American Land Data Assimilation System (NLDAS)](https://disc.gsfc.nasa.gov/datasets/NLDAS_NOAH0125_M_2.0/summary?keywords=NLDAS).

Consequently, we'll need a place to store these raw data. It's important that we have a folder in our file system reserved for these raw data so we can keep them separate from any new datasets we might create. 

**Let's create a folder called `data_raw` in our Jupyter Notebook's file system.**

We should never modify the raw data that we're about to download. Doing so would make it hard to repeat our analysis, because we would lose the original data values. This doesn't mean we have to keep the `data_raw` folder around forever: if the data are publicly available, we can always download them again.
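
If you'd rather create the folder from code than through the file browser, the following is a minimal sketch using Python's standard `pathlib` module. It assumes the notebook's working directory is where `data_raw` should live; the variable name `raw_dir` is ours, chosen for illustration.

```python
from pathlib import Path

# Folder reserved for raw downloads, relative to the current working directory
raw_dir = Path('data_raw')

# exist_ok=True makes this safe to re-run if the folder already exists
raw_dir.mkdir(parents=True, exist_ok=True)
print(raw_dir.resolve())
```

Because the call does nothing when the folder is already there, the cell can be re-run without affecting any files that were downloaded earlier.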

---

## Downloading the Data

In [None]:
results = []

for year in range(2008, 2018):
    search = earthaccess.search_data(
        short_name = 'NLDAS_NOAH0125_M',
        version = '2.0',
        temporal = (f'{year}-09', f'{year}-09'))
    results.extend(search)

In [None]:
len(results)

Previously, we've used `earthaccess.open()` to get access to these data. This time, we'll use `earthaccess.download()`. What's the difference?

- `earthaccess.open()` provides a file-like object that is available to be downloaded and read *only when we need it.*
- `earthaccess.download()` actually downloads the file to our file system.

**Note that, below, we're telling `earthaccess.download()` to put the downloaded files into our new `data_raw` folder.**

In [None]:
earthaccess.download(results, 'data_raw')

In [None]:
import glob

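# glob.glob() makes no ordering guarantee; sort the filenames so that
# the last file in the list is the most recent one
file_list = sorted(glob.glob('data_raw/*.nc'))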
file_list

In [None]:
import netCDF4

# Open just the first file
nc = netCDF4.Dataset(file_list[0])

In [None]:
# TODO Discuss file-level metadata

nc

In [None]:
# TODO Discuss variable-level metadata
# TODO Discuss "scale_factor" and "add_offset" and "missing_value"

et = nc.variables['Evap']
et
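
As a rough illustration of what the `scale_factor`, `add_offset`, and `missing_value` attributes mean under the CF convention, the sketch below unpacks some made-up values by hand. The numbers and attribute values are hypothetical, not taken from the NLDAS file, and a given variable may not define all of these attributes. By default, `netCDF4` applies this masking and scaling automatically when a variable is read.

```python
import numpy as np

# Hypothetical packed values and attributes, for illustration only
packed = np.array([0, 100, 250, -9999], dtype=np.int16)
scale_factor = 0.01
add_offset = 5.0
missing_value = -9999

# Mask the missing value first so it is never treated as real data
masked = np.ma.masked_equal(packed, missing_value)

# CF convention: unpacked = packed * scale_factor + add_offset
unpacked = masked * scale_factor + add_offset
print(unpacked)
```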

In [None]:
# TODO Note the shape
# TODO Note the orientation
# TODO Discuss CF convention

pyplot.imshow(et[0])

In [None]:
pyplot.imshow(np.flipud(et[0]))
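
The first image presumably appears upside-down because the rows of the array run from south to north, while `pyplot.imshow()` draws the first row at the top by default. The toy example below, with made-up values, shows what `np.flipud()` does to the row order.

```python
import numpy as np

# A toy 3x2 grid whose rows are ordered south-to-north (hypothetical values)
grid = np.array([
    [1, 2],   # southernmost row
    [3, 4],
    [5, 6],   # northernmost row
])

# np.flipud() reverses the row order, so the northernmost row comes first
flipped = np.flipud(grid)
print(flipped)
```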

In [None]:
# TODO Note data type, why we're changing it to an array

type(et)

In [None]:
et_series = []

for filename in file_list:
    nc = netCDF4.Dataset(filename)
    et = np.array(nc.variables['Evap'])
    # Don't forget to flip the image upside-down!
    et_series.append(np.flipud(et[0]))

et_series = np.stack(et_series, axis = 0)
et_series.shape

---

## Computing a Climatology

In [None]:
# TODO define a climatology

et_clim = et_series.mean(axis = 0)
et_clim.shape
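
Here, a climatology is taken to mean the long-term average for a given time of year: in this case, the mean of all the September images, computed separately for each pixel. The small sketch below uses made-up numbers to show how averaging over `axis = 0` collapses the time axis and leaves one value per pixel.

```python
import numpy as np

# Hypothetical stack of 3 "years" of a 2x2 grid, shaped (time, rows, cols)
series = np.array([
    [[1.0, 2.0], [3.0, 4.0]],
    [[2.0, 3.0], [4.0, 5.0]],
    [[3.0, 4.0], [5.0, 6.0]],
])

# Averaging over axis 0 (time) yields one mean value per pixel
clim = series.mean(axis=0)
print(clim)
```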

In [None]:
pyplot.imshow(et_clim)
pyplot.colorbar()

In [None]:
# TODO NoData

et_clim.min()

In [None]:
et_clim[et_clim < 0] = np.nan

pyplot.imshow(et_clim)
cbar = pyplot.colorbar()
cbar.set_label('Evapotranspiration [kg m-2]')
pyplot.title('Mean September ET')
pyplot.show()
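
Evapotranspiration totals this negative are most likely a NoData fill value rather than real measurements. An alternative to masking the climatology afterwards is to replace the fill value with `np.nan` before averaging and then use `np.nanmean()`, so a pixel that is missing in only some years still gets a mean from the remaining years. This is a sketch of that alternative, not what the cells above do; the fill value of `-9999` is assumed for illustration.

```python
import numpy as np

fill_value = -9999.0  # assumed fill value, for illustration only

# Hypothetical stack of 2 "years" of a 1x2 grid; one pixel is missing in year 1
series = np.array([
    [[10.0, fill_value]],
    [[20.0, 30.0]],
])

# Replace the fill value with NaN, then average while ignoring NaNs
masked = np.where(series == fill_value, np.nan, series)
clim = np.nanmean(masked, axis=0)
print(clim)
```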

### How Does September 2017 Compare?

In [None]:
file_list[-1]

In [None]:
et_2017_anomaly = et_series[-1] - et_clim

pyplot.imshow(et_2017_anomaly, cmap = 'RdYlBu')
cbar = pyplot.colorbar()
cbar.set_label('Evapotranspiration Anomaly [kg m-2]')
pyplot.show()
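
The anomaly is simply the observed value minus the climatology, so negative values indicate lower-than-average evapotranspiration. When plotting with a diverging colormap such as `RdYlBu`, it can help to make the color limits symmetric about zero. The sketch below, using made-up values, shows one way to compute such a limit.

```python
import numpy as np

# Hypothetical climatology and observation on a 2x2 grid (NaN marks NoData)
clim = np.array([[50.0, 40.0], [np.nan, 30.0]])
observed = np.array([[35.0, 45.0], [np.nan, 30.0]])

# Anomaly: observed minus long-term mean; NaN pixels stay NaN
anomaly = observed - clim

# Largest absolute anomaly, ignoring NaNs, for a symmetric color scale
limit = np.nanmax(np.abs(anomaly))
print(anomaly, limit)
```

Passing `vmin = -limit` and `vmax = limit` to `pyplot.imshow()` would then place zero at the midpoint of the colormap.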

In [None]:
extent = [
    nc.variables['lon'][:].min(),
    nc.variables['lon'][:].max(),
    nc.variables['lat'][:].min(),
    nc.variables['lat'][:].max()
]
extent

In [None]:
import cartopy.crs as ccrs
import cartopy.io.shapereader as shpreader

shapename = 'admin_1_states_provinces_lakes'
states_shp = shpreader.natural_earth(resolution = '110m', category = 'cultural', name = shapename)

fig = pyplot.figure()
ax = fig.add_subplot(1, 1, 1, projection = ccrs.PlateCarree())
ax.imshow(et_2017_anomaly, extent = extent, cmap = 'RdYlBu')
ax.add_geometries(shpreader.Reader(states_shp).geometries(), ccrs.PlateCarree(), facecolor = 'none')
pyplot.show()

---

## More Resources

- Curious about how to use `earthaccess.open()` along with `xarray` so that you don't have to keep any downloaded files around? Be aware that `xarray.open_dataset()` can be slow when you have a lot of files to open, as in this time-series example. [This article describes how you can speed up `xarray.open_dataset()`](https://climate-cms.org/posts/2018-09-14-dask-era-interim.html) when working with multiple cloud-hosted files.