# NLDAS_VIC0125_M.002:
  NLDAS VIC Land Surface Model L4 Monthly 0.125 x 0.125 degree V002

This notebook accomplishes the following:

- Downloads data file(s) from NASA
- Shows attribute statistics and visualizations
- Performs visualization-related data cleaning
- Shows the (corrected) attribute statistics and visualizations

Required libraries:
- `pynio`, for opening GRIB files (requires Python 2)
- `pydap`, for accessing files hosted on gesdisc.eosdis.nasa.gov server
- `holoviews`, for data viz
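
Before running the rest of the notebook, it can help to confirm that the libraries listed above are importable in the active environment. The following is a minimal sketch, not part of the original notebook: `missing_modules` is a hypothetical helper, and the import name `Nio` for PyNIO is an assumption about how that package is exposed.

```python
from __future__ import print_function

import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except Exception:
            # Usually ImportError, but a broken install can raise other
            # errors at import time; treat all of them as "not usable".
            missing.append(name)
    return missing


# Assumed import names: PyNIO is normally imported as `Nio`.
required = ['Nio', 'pydap', 'holoviews']
print('Missing:', missing_modules(required) or 'none')
```

If anything is reported as missing, the commented-out `conda create` command in the next cell is the place to start.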

In [None]:
#!conda create -n nldas_py27 -c conda-forge -c  elm -c elm/label/dev -c ioam -c ncar pynio elm earthio pydap

In [None]:
from __future__ import absolute_import, division, print_function, unicode_literals

import gc
import os
import getpass

import six
import holoviews as hv
import numpy as np
import pandas as pd
import xarray as xr
from pydap.cas.urs import setup_session
from six.moves.urllib.parse import urljoin, urlparse

hv.notebook_extension('bokeh')
%matplotlib inline

url = 'https://hydro1.gesdisc.eosdis.nasa.gov/data/NLDAS/NLDAS_VIC0125_M.002/2017/NLDAS_VIC0125_M.A201706.002.grb'

The next cell saves the file to disk (unless it is already there), then opens it as an xarray Dataset object.

In [None]:
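# URL paths always use '/', so strip that rather than os.sep.
data_fpath = urlparse(url).path.lstrip('/')
data_dpath = os.path.dirname(data_fpath)
if not os.path.exists(data_fpath):
    # Prompt only when the environment variable is missing; a default passed
    # to os.environ.get() is evaluated (and therefore prompts) every time.
    username = os.environ.get('NLDAS_USERNAME') or raw_input('NLDAS Username: ')
    password = os.environ.get('NLDAS_PASSWORD') or getpass.getpass('Password: ')
    session = setup_session(username, password)
    resp = session.get(url)
    resp.raise_for_status()  # fail early instead of saving an error page
    if not os.path.isdir(data_dpath):
        os.makedirs(data_dpath)
    # GRIB is a binary format, so write in binary mode.
    with open(data_fpath, 'wb') as outfp:
        outfp.write(resp.content)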
gc.collect()
ds = xr.open_dataset(data_fpath, engine='pynio')
ds
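
The download cell above does two things that are easy to get subtly wrong: mapping the URL to a local path, and falling back to an interactive prompt only when a credential is not already in the environment. A minimal sketch of both steps follows. The helper names `local_path_for_url` and `credential` are hypothetical (they are not part of pydap or this notebook), and the example URL is a placeholder.

```python
from __future__ import print_function

import os

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2


def local_path_for_url(url):
    """Map a URL to a relative local path that mirrors the server layout."""
    # URL paths are always '/'-separated, whatever the local OS uses.
    parts = [p for p in urlparse(url).path.split('/') if p]
    return os.path.join(*parts)


def credential(env_var, prompt, environ=None):
    """Return environ[env_var] if set and non-empty, else call prompt()."""
    environ = os.environ if environ is None else environ
    value = environ.get(env_var)
    # prompt is a callable, so it only runs when the variable is missing.
    return value if value else prompt()


print(local_path_for_url('https://example.com/data/2017/file.grb'))
```

Passing the prompt as a callable is the design choice that matters here: `os.environ.get(name, raw_input(...))` evaluates the prompt before `get` is even called, so the user is asked for input regardless of the environment.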

## Attributes alongside their descriptions

In [None]:
info = []
for k in ds.data_vars:
    raster = ds[k]
    about = (k, raster.long_name, raster.units, raster.initial_time)
    about_raster = '{:<20} {} ({}) - {}'.format(*about)
    info.append(about_raster)
print('Rasters in {}\n'.format(os.path.basename(data_fpath)), '\n  '.join(info), sep='\n  ')

In [None]:
raster

## Statistics and visualizations

Below we show the data as-is.

In [None]:
ds.to_dataframe().describe(percentiles=(0.025, 0.05, 0.25, 0.5, 0.75, 0.95, 0.975))

In [None]:
%opts Image RGB [width=300 height=200]
hvds = hv.Dataset(ds)
imgs = [hvds.to(hv.Image, ['lon_110', 'lat_110'], var).relabel(var) for var in ds.data_vars]
hv.Layout(imgs)

## Viz-related data cleaning

The value -9999 (apparently a missing-data fill value) dominates the color scale and skews the statistics, so we replace every -9999 with 0.

In [None]:
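def set_fill_to_zero(da):
    # Overwrite the -9999 fill value with 0 in place, then return the
    # DataArray so that Dataset.apply can collect the results.
    da.values[np.isclose(da.values, -9999.)] = 0
    return da
ds = ds.apply(set_fill_to_zero, keep_attrs=True)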
ds.to_dataframe().describe(percentiles=(0.025, 0.05, 0.25, 0.5, 0.75, 0.95, 0.975))
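
Replacing the fill value with 0 fixes the color scale, but the zeros still count as real observations in the summary statistics. An alternative, shown here only as a sketch on a small synthetic array rather than the NLDAS rasters, is to mask the fill value with NaN so that NaN-aware reductions skip those cells. `mask_fill` and `FILL_VALUE` are hypothetical names introduced for this illustration.

```python
from __future__ import print_function

import numpy as np

FILL_VALUE = -9999.0  # sentinel observed in this notebook's rasters


def mask_fill(values, fill=FILL_VALUE):
    """Return a float copy of `values` with fill cells replaced by NaN."""
    out = np.array(values, dtype=float)  # copy, so the input is untouched
    out[np.isclose(out, fill)] = np.nan
    return out


raw = np.array([1.0, 3.0, -9999.0, 5.0])
as_zero = np.where(np.isclose(raw, FILL_VALUE), 0.0, raw)
as_nan = mask_fill(raw)

print(as_zero.mean())      # 2.25: the zero still pulls the mean down
print(np.nanmean(as_nan))  # 3.0: the masked cell is ignored
```

Whether 0 or NaN is the better replacement depends on the goal: 0 keeps every cell drawable, while NaN keeps the statistics honest about which cells hold data.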

## Corrected visualizations

In [None]:
hvds = hv.Dataset(ds)
imgs = [hvds.to(hv.Image, ['lon_110', 'lat_110'], var, group='('+ds[var].long_name+')').relabel(var) for var in ds.data_vars]
hv.Layout(imgs)