# Concept portfolio
-------------------------------
This Jupyter notebook contains code and explanatory text for three key visualization products:
* An ash deposition map
* An ash thickness time profile plotting tool
* A GIS / Google Earth data export utility

To contextualize the code presented, I provide a brief overview of each of the Python packages used in this notebook (and by the repository as a whole) in section **1**. The netCDF filename and other plotting parameters specified in section **2** are shared by all three products. The code cells for the three products (sections **3**, **4**, and **5**) are independent of one another, however, so once you have specified the global parameters you can run any or all of the cells that follow.

## Table of contents
---------------------------
1. [Overview of tools](#tools)
2. [Global visualization specifications](#global)
3. [Interactive ash deposition map](#bokeh_geoviews)
4. [Ash thickness time profiles](#time_profiles)

<br>
<div style="text-align: right"> [[Back to top]](#Concept-portfolio) </div>

<a id="tools"></a>
## 1. Overview of tools
-------------------------------
The list of Python packages here corresponds to those specified in [`environment.yml`](environment.yml), which is the file used to build the [Binder](https://mybinder.org/v2/gh/liamtoney/ashfall_visual/master) for this repository. I've spent quite a bit of time searching through various package options &mdash; for example, there are approximately one gazillion KML writing packages out there &mdash; and this list is the distilled result of my efforts.

#### "Standard" tools
- [**matplotlib**](https://matplotlib.org/)
- [**numpy**](http://www.numpy.org/)

#### Gridded data I/O
- [**netcdf4**](http://unidata.github.io/netcdf4-python/) &mdash; Required by xarray to read netCDF files.


- [**xarray**](https://xarray.pydata.org/en/latest/) &mdash; Excellent package for working with N-D arrays. Reading a netCDF file with xarray creates an `xarray.Dataset` object, which is essentially an in-memory representation of the netCDF file.

#### Georeferenced data I/O
- [**fiona**](http://toblerity.org/fiona/) &mdash; Python wrapper for the **vector** portion of the GDAL library. Can be used to create georeferenced point, line, and polygon information (shapefiles) for ingestion into GIS programs such as ArcMap.


- [**gdal**](https://pypi.org/project/GDAL/) &mdash; Required by both Rasterio and Fiona. It's possible to use this library directly with (for example)
```python
from osgeo import gdal
from osgeo import ogr
```
However, Rasterio and Fiona provide a much friendlier interface.


- [**rasterio**](https://rasterio.readthedocs.io/en/latest/) &mdash; Python wrapper for the **raster** portion of the GDAL library. Can be used to create georeferenced GeoTIFF files for ingestion into GIS programs such as [ArcMap](http://desktop.arcgis.com/en/arcmap/).


- [**simplekml**](https://simplekml.readthedocs.io/en/latest/) &mdash; Provides a simple Python interface for constructing KML and KMZ files for ingestion into programs such as [Google Earth Pro](https://www.google.com/earth/download/gep/agree.html).

#### Plotting
- [**bokeh**](https://bokeh.pydata.org/en/latest/) &mdash; A plotting backend alternative to Matplotlib. Offers easily configurable interactivity with some pretty serious features available if you're willing to code your own JavaScript callbacks. Limited support for geographic data, but GeoViews mostly solves this issue.


- [**cartopy**](http://scitools.org.uk/cartopy/) &mdash; I like to think of cartopy (in combination with Matplotlib) as a fairly solid Python substitute for GMT. Cartopy supports a large number of projections and allows for easy translation of data between projections by defining source and destination [`cartopy.crs`](http://scitools.org.uk/cartopy/docs/v0.15/crs/index.html) instances.


- [**geoviews**](http://geo.holoviews.org/) &mdash; GeoViews is the geospatially-oriented version of HoloViews. Like HoloViews, it is compatible with both the Bokeh and Matplotlib backends. The notebooks in [`experiments/`](experiments) explore these various backends. GeoViews has great integration with xarray and cartopy, and at the time of writing is rapidly accumulating powerful new features including reprojection and regridding routines. Check out the [gallery](http://geo.holoviews.org/gallery/index.html) for inspiration.


- [**holoviews**](http://holoviews.org/) &mdash; HoloViews is designed to make visualizing data as trivial as possible. For our purposes we mostly interact with HoloViews `Elements` that are wrapped into geospatially-friendly `GeoElements` by GeoViews. HoloViews works with either Bokeh or Matplotlib as the plotting backend.

#### Miscellaneous
- [**colorcet**](https://bokeh.github.io/colorcet/) &mdash; Perceptually uniform colormaps compatible with both Matplotlib and Bokeh.


- [**esmpy**](http://www.earthsystemmodeling.org/esmf_releases/public/ESMF_7_1_0r/esmpy_doc/html/index.html) &mdash; Python interface to the Earth System Modeling Framework (ESMF) regridding utility. Required by xESMF.


- [**geopy**](https://geopy.readthedocs.io/en/latest/) &mdash; Provides "geocoding" functionality. Enter a location string, such as `'Taupo, NZ'` or `'Te Papa'` or even something as vague as `'steepest street in the world'`, and retrieve the coordinates and address information.


- [**pyproj**](https://jswhit.github.io/pyproj/) &mdash; I mostly use this for transforming coordinates between different coordinate systems using [`pyproj.transform`](https://jswhit.github.io/pyproj/pyproj-module.html#transform).


- [**xesmf**](http://xesmf.readthedocs.io/en/latest/) &mdash; Very slick package for regridding datasets (e.g. reducing the cell size of an ash deposition model).
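
To make the coordinate transformation mentioned under pyproj more concrete, here is a minimal sketch of the spherical Web Mercator (EPSG:3857) forward formula written with the standard library only. It illustrates the math behind a WGS84-to-Web-Mercator conversion and is not pyproj's implementation; the function name `lonlat_to_web_mercator` is hypothetical, and the test point is the Taupo volcano location that appears later in this notebook.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used as the sphere radius in EPSG:3857


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert WGS84 longitude/latitude (degrees) to Web Mercator x/y (metres)."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Taupo volcano (longitude, latitude)
x, y = lonlat_to_web_mercator(175.9782, -38.8072)
print(round(x), round(y))
```

The resulting coordinates are in metres, which is why the map axes in the Bokeh plots further down span ranges in the millions.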

<br>
<div style="text-align: right"> [[Back to top]](#Concept-portfolio) </div>

<a id="global"></a>
## 2. Global visualization specifications
------------------------------------------------------
Specify a `FILENAME` and colorbar limits expressed as `ASH_MIN` and `ASH_MAX`. For plots with colorbars, ash thickness values below `ASH_MIN` and above `ASH_MAX` will be mapped to the colorbar's lowest and highest colors, respectively. This is useful for ignoring trace ash amounts. `CMAP` must be a string corresponding to one of the many excellent perceptually uniform palettes included with colorcet. They are listed [here](https://bokeh.github.io/colorcet/) under **Full list of available colormaps**.
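
As a rough illustration of how the limits behave, the sketch below maps an ash thickness to a palette index on the logarithmic scale used in section 3, clamping anything outside the limits to the ends of the palette. The helper `palette_index` and the 256-color palette length are assumptions made for this example; in the notebook itself the mapping is handled by the plotting libraries.

```python
import math

ASH_MIN = 10**-1  # mm
ASH_MAX = 10**2   # mm


def palette_index(thickness_mm, n_colors=256):
    """Return the palette index for an ash thickness, clamped to the colorbar limits."""
    clamped = min(max(thickness_mm, ASH_MIN), ASH_MAX)
    # fractional position between the limits on a log10 scale
    log_min, log_max = math.log10(ASH_MIN), math.log10(ASH_MAX)
    frac = (math.log10(clamped) - log_min) / (log_max - log_min)
    return min(int(frac * n_colors), n_colors - 1)


for value in (0.001, 0.1, 3.0, 100.0, 5000.0):
    print(value, palette_index(value))
```

Values of 0.001 mm and 0.1 mm land on the same (lowest) color, which is how trace ash amounts end up visually ignored.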

```python
# filename (including path) of HYSPLIT model
FILENAME = '18042918_taupo_15.0_0.01.nc'

# colorbar cutoff values
ASH_MIN = 10**-1
ASH_MAX = 10**2

# colormap string from colorcet
CMAP = 'linear_kry_5_98_c75'
```

Now that we've specified a model, it's a good time to introduce a key function: `read_hysplit_netcdf()` is used by essentially every program in this repository. It reads the raw netCDF (extension `*.nc`) file and makes a few tweaks (delve into [the code itself](vis_tools.py) if you're curious), the most important being to "crop" the gridded dataset. The HYSPLIT model grid includes all of New Zealand, but the final modelled ash extent is usually only a very small portion of that entire model space. We trim the grid to the smallest rectangular bounding box encompassing the non-zero ash distribution. `read_hysplit_netcdf()` returns an `xarray.Dataset` object with all of the netCDF information. The idea behind this data type is that the attributes and dimensions are carried through each processing step, so the dataset remains self-describing.
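
The cropping idea can be sketched with plain NumPy on a single 2-D grid. This is only an illustration under simplifying assumptions (one time step, a hypothetical helper named `crop_to_nonzero`); it is not necessarily how `read_hysplit_netcdf()` is implemented in `vis_tools.py`.

```python
import numpy as np


def crop_to_nonzero(grid):
    """Return the smallest rectangular sub-grid containing every non-zero cell.

    `grid` is a 2-D (lat, lon) array; NaN cells are treated as empty.
    """
    has_ash = np.nan_to_num(grid) > 0
    rows = np.flatnonzero(has_ash.any(axis=1))  # latitude indices with ash
    cols = np.flatnonzero(has_ash.any(axis=0))  # longitude indices with ash
    if rows.size == 0:
        return grid[:0, :0]  # no ash anywhere, so nothing to keep
    return grid[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


# toy 6 x 8 grid with ash confined to a 2 x 3 patch
grid = np.zeros((6, 8))
grid[2:4, 3:6] = [[0.5, 1.2, 0.1], [0.0, 2.0, 0.3]]
print(crop_to_nonzero(grid).shape)  # (2, 3)
```

Because only all-zero rows and columns are removed, the total amount of ash in the cropped grid is unchanged.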

```python
from vis_tools import read_hysplit_netcdf
read_hysplit_netcdf(FILENAME)
```

```
<xarray.Dataset>
Dimensions:           (lat: 417, lon: 656, time: 9)
Coordinates:
  * lat               (lat) float32 -41.58 -41.57 -41.56 -41.55 -41.54 ...
  * lon               (lon) float32 171.16 171.17 171.18 171.19 171.2 171.21 ...
  * time              (time) datetime64[ns] 2018-04-29T18:00:00 ...
Data variables:
    total_deposition  (lat, lon, time) float64 nan nan nan nan nan nan nan ...
Attributes:
    eruption_time:           2018-04-29T18:00:00
    accumulation_period_h:   24
    volcano:                 Taupo
    plume_height_amsl:       15.0
    volume_km3:              0.01
    eruption_duration_hhmm:  0100
    volcano_location:        [-38.8072, 175.9782]
```

Comparing the `Dimensions` of the above dataset with those of the original netCDF file:

```python
import xarray
xarray.open_dataset(FILENAME)
```

```
<xarray.Dataset>
Dimensions:           (lat: 1401, lon: 1401, time: 8)
Coordinates:
  * time              (time) datetime64[ns] 2018-04-29T21:00:00 2018-04-30 ...
  * lat               (lat) float32 -48.0 -47.99 -47.98 -47.97 -47.96 -47.95 ...
  * lon               (lon) float32 166.0 166.01 166.02 166.03 166.04 166.05 ...
Data variables:
    total_deposition  (time, lat, lon) float64 ...
Attributes:
    eruption_time:           2018-04-29T18:00:00
    accumulation_period_h:   24
    volcano:                 Taupo
    plume_height_amsl:       15.0
    volume_km3:              0.01
    eruption_duration_hhmm:  0100
```

. . . we see why cropping the model is so important. We're vastly reducing the memory and number of computations required to process the data, and we're not throwing out any information.
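
To put rough numbers on the saving, the following back-of-the-envelope sketch multiplies out the dimension sizes printed in the two dataset summaries. It assumes the only significant storage is the `total_deposition` array held as `float64` (8 bytes per value) and ignores coordinates and attributes.

```python
import math

# dimension sizes taken from the two dataset summaries
original_shape = (1401, 1401, 8)
cropped_shape = (417, 656, 9)

BYTES_PER_VALUE = 8  # total_deposition is float64

original_cells = math.prod(original_shape)
cropped_cells = math.prod(cropped_shape)

print(original_cells * BYTES_PER_VALUE / 1e6)  # about 125.6 MB
print(cropped_cells * BYTES_PER_VALUE / 1e6)   # about 19.7 MB
print(original_cells / cropped_cells)          # about 6.4 times fewer values
```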

<br>
<div style="text-align: right"> [[Back to top]](#Concept-portfolio) </div>

<a id="bokeh_geoviews"></a>
## 3. Interactive ash deposition map
-------------------------------------------------
This code creates an interactive pseudocolor plot of ash deposition that has zoom, pan, and reset capabilities. The user can drag a slider to select which "snapshot" of ash thickness to view. Ash thicknesses are scaled logarithmically, reflecting the quasi-exponential reduction in ash deposit thickness with distance from the vent. This code performs no resampling; each model cell is simply colored by its value. Contours are logarithmically spaced and colored using the same colormap as the model cells.
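
As a small illustration of the logarithmic spacing, the sketch below builds one contour level per decade between the colorbar limits, using the same expression as the `contour_levels` line in the code cell for this section, and then converts the levels back to millimetres. The `ASH_MIN` and `ASH_MAX` values repeat those from section 2 so that the snippet runs on its own.

```python
import numpy as np

ASH_MIN = 10**-1  # mm
ASH_MAX = 10**2   # mm

# one level per decade in log10 space, skipping the ASH_MIN decade itself
log_levels = np.arange(np.log10(ASH_MIN) + 1, np.log10(ASH_MAX) + 1)

# convert back to millimetres, e.g. for labelling
levels_mm = 10.0 ** log_levels
print(levels_mm)
```

With the limits above this gives contours at 1, 10, and 100 mm. The colorbar tick formatter in the cell performs the same `10 ** tick` conversion on the JavaScript side.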

The map is created using Bokeh and GeoViews (for more information, return to the [tools section](#tools) above). This allows us to export the resulting plot to a standalone HTML file which contains all of the model data as well as the JavaScript required to visualize it. To save the cell output to HTML, uncomment the final line of the cell and run.

This map leverages WMTS map tiles to provide geographic context. It's easy to swap these out to fit the desired audience or purpose of the map. See [here](http://geo.holoviews.org/user_guide/Working_with_Bokeh.html) for the relevant GeoViews documentation.

> **Note:**
>
> The colored contours are confusing. An ideal implementation would be to plot the contours all as one (unobtrusive) color, and then label them to more clearly and quickly convey their value.

```python
import numpy as np
import cartopy.crs as ccrs
import colorcet as cc

import holoviews as hv
import geoviews as gv

from bokeh.models.glyphs import Image
from bokeh.models.renderers import GlyphRenderer
from bokeh.models.formatters import FuncTickFormatter
from bokeh.models.annotations import ColorBar, Legend
from bokeh.tile_providers import STAMEN_TERRAIN_RETINA

from vis_tools import read_hysplit_netcdf

# suppress a bunch of benign Matplotlib warnings related to NaNs
import warnings
warnings.filterwarnings('ignore', message='Warning: converting a masked element to nan.')
warnings.filterwarnings('ignore', message='invalid value encountered in greater')
warnings.filterwarnings('ignore', message='invalid value encountered in less')
warnings.filterwarnings('ignore', message='No contour levels were found within the data range.')

hv.extension('bokeh')
renderer = hv.renderer('bokeh')

model = read_hysplit_netcdf(FILENAME, ASH_MIN)
model['total_deposition'].values = np.log10(model['total_deposition'].values)  # manually take the log

volc_loc = (model.attrs['volcano_location'][1], model.attrs['volcano_location'][0])

gv_ds = gv.Dataset(model, crs=ccrs.PlateCarree())
gv_ds = gv_ds.redim.range(total_deposition=(np.log10(ASH_MIN), np.log10(ASH_MAX)))  # clip colorbar
gv_ds = gv_ds.redim.label(time='Time')

contour_levels = np.arange(np.log10(ASH_MIN)+1, np.log10(ASH_MAX)+1)  # log-spaced contours (skip ASH_MIN contour)

background_map = gv.WMTS(STAMEN_TERRAIN_RETINA)
model_image = gv_ds.to(gv.Image, ['lon', 'lat'])
model_contours = gv.operation.contours(model_image, levels=contour_levels)
source_location = gv.Points(volc_loc, crs=ccrs.Geodetic(), label='Source')

fig = background_map * model_image * model_contours * source_location

# hook function for Bokeh plot adjustments
def adjust_plot(plot, element):
    p = plot.handles['plot']

    # modify tools
    tools = dict(zip(map(lambda tool: str(tool).split('(')[0], p.tools), p.tools))
    wz = tools['WheelZoomTool']
    wz.zoom_on_axis = False
    p.tools = [tools['PanTool'], tools['BoxZoomTool'], wz, tools['ResetTool']]
    p.toolbar.active_scroll = wz

    # modify plot elements
    for obj in p.renderers:
        if isinstance(obj, GlyphRenderer):
            if isinstance(obj.glyph, Image):
                obj.glyph.global_alpha = 0.5  # set the global alpha for Bokeh Images
        elif isinstance(obj, ColorBar):
            obj.ticker.desired_num_ticks = len(contour_levels)+1  # set the number of colorbar ticks
        elif isinstance(obj, Legend):
            obj.click_policy = 'none'
            i = 0
            while list(obj.items[i].label.keys())[0] != 'value':
                i = i + 1
            obj.items = [obj.items[i]]  # remove contours from the legend

cmap = list(reversed(cc.palette[CMAP]))
plot_opts = {'Image': {'style': dict(cmap=cmap),
                        'plot': dict(colorbar=True, title_format=FILENAME.split('/')[-1], height=600, width=700,
                                     colorbar_opts=dict(location=(0, 0), orientation='horizontal',
                                                        title='Ash thickness (mm)',
                                                        formatter=FuncTickFormatter(code='''return(Math.pow(10, tick));''')),
                                     colorbar_position='bottom', finalize_hooks=[adjust_plot])},
          'Contours': {'style': dict(cmap=cmap, line_width=1.5),
                        'plot': dict()},
            'Points': {'style': dict(size=20, marker='triangle', line_color='black', fill_color='cyan'),
                        'plot': dict()}
            }
fig = fig.opts(plot_opts)

fig
#renderer.save(fig, 'geoviews_bokeh_map'); print('HTML file saved')
```

<br>
<div style="text-align: right"> [[Back to top]](#Concept-portfolio) </div>

<a id="time_profiles"></a>
## 4. Ash thickness time profiles
--------------------------------------------
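This code geocodes a list of location strings and plots two linked panels. The map panel shows each location, the volcanic source, and the outline of the final ashfall extent. The profile panel shows ash thickness through time at each location. Hovering over a location or its profile displays the address, the time of first ash arrival, and the total ash accumulation. Locations that fall outside the cropped model grid are assigned zero ash.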

> **Note:**
>
> Wish list items from down below go here.

```python
import os
import numpy as np
import matplotlib.pyplot as plt
import colorcet as cc
import pyproj
import datetime
from geopy.geocoders import Nominatim, GoogleV3
from vis_tools import read_hysplit_netcdf

from bokeh.plotting import figure
from bokeh.io import show, output_notebook
from bokeh.models.sources import ColumnDataSource
from bokeh.models.tiles import WMTSTileSource
from bokeh.models.tools import PanTool, WheelZoomTool, HoverTool, TapTool, CrosshairTool, ResetTool
from bokeh.palettes import linear_palette
from bokeh.layouts import gridplot, column
from bokeh.models.widgets.markups import Div

# ignore benign warning related to NaNs
import warnings
warnings.filterwarnings('ignore', message='invalid value encountered in greater')

output_notebook()

#########################################################
# SPECIFY: a list of location strings to geocode and plot
LOCATIONS = ['Whanganui, NZ',
             'Palmerston North, NZ',
             'New Plymouth, NZ',
             'Hastings, NZ',
             'GNS Science',
             'steepest street in the world'
            ]
#########################################################

#GEOCODER = Nominatim(country_bias='New Zealand')  # does not require an API key!
# the API key is read from an environment variable rather than hard-coded in the notebook;
# the variable name GOOGLE_API_KEY is an assumption, so substitute whatever name you use
GEOCODER = GoogleV3(api_key=os.environ['GOOGLE_API_KEY'], domain='maps.google.co.nz')

WGS84 = pyproj.Proj(init='EPSG:4326')
WM = pyproj.Proj(init='EPSG:3857')

# WMTS tile url
URL = 'https://maps.wikimedia.org/osm-intl/{Z}/{X}/{Y}@2x.png'

model = read_hysplit_netcdf(FILENAME)

# create ColumnDataSource
loc_data = {'address':[], 'x':[], 'y':[], 'ash_values':[], 'time_values':[], 'ash_onset':[], 'max_ash':[]}
for location in LOCATIONS:

    loc_info = GEOCODER.geocode(location)

    try:
        lon = loc_info.longitude
        lat = loc_info.latitude
    except AttributeError:
        print('\'{}\' could not be located!'.format(location))
    else:
        loc_data['address'].append(loc_info.address)
        x, y = pyproj.transform(WGS84, WM, lon, lat)
        loc_data['x'].append(x)
        loc_data['y'].append(y)

        try:  # try to extract a time profile for this location
            ash = model.sel(lon=np.float32(round(lon, 2)), lat=np.float32(round(lat, 2)))['total_deposition'].values
            np.place(ash, np.isnan(ash), 0)
        except KeyError:  # this location is outside the (cropped) model space
            ash = np.zeros(len(model['time']))

        loc_data['ash_values'].append(ash)
        loc_data['time_values'].append(model['time'].values)
        loc_data['max_ash'].append(np.max(ash))

        if np.max(ash) != 0:
            onset_index = np.nonzero(ash)[0][0]-1  # find ind corresponding to one step BEFORE the first nonzero-ashfall step
            loc_data['ash_onset'].append(model['time'].values[onset_index])
        else:
            loc_data['ash_onset'].append(np.datetime64(datetime.datetime.max))  # set date to INF

source = ColumnDataSource(loc_data)
source.add(linear_palette(cc.rainbow, len(loc_data['address'])), name='color')

HV = HoverTool(tooltips=[('Location', '@address'),
                         ('Ash first arrival', '@ash_onset{%Y-%m-%d %R}'),
                         ('Total ash accumulation', '@max_ash{%.2g} mm')],
               formatters={'ash_onset':'datetime', 'max_ash':'printf'}, names=['circles', 'lines'],
               show_arrow=False, line_policy='interp')

WZ = WheelZoomTool(zoom_on_axis=True)

TOOLS = [PanTool(), WZ, HV, TapTool(), ResetTool()]

# MAP plot
p1 = figure(x_range=(19300000, 19900000), y_range=(-5100000, -4400000), x_axis_type='mercator', y_axis_type='mercator',
            x_axis_label='Longitude', y_axis_label='Latitude', tools=TOOLS, active_scroll=WZ)
p1.add_tile(WMTSTileSource(url=URL))

# generate a contour
X, Y = np.meshgrid(model['lon'].values, model['lat'].values)
Z = np.where(model.isel(time=-1)['total_deposition'].values > 0, 1, 0)  # binary ash-or-no-ash matrix

contour_info = plt.contour(X, Y, Z, 1); plt.close()  # draw one contour, tracing the ash extent

xs = []; ys = []
for path in contour_info.collections[0].get_paths():
    vs = path.vertices
    x, y = pyproj.transform(WGS84, WM, vs[:,0].tolist(), vs[:,1].tolist())
    xs.append(x)
    ys.append(y)

p1.multi_line(xs, ys, color='black', line_width=2, line_alpha=0.5, legend='Ashfall extent')
p1.circle(x='x', y='y', size=20, source=source, name='circles',
          color='color', selection_color='color', nonselection_color='black',
          fill_alpha=1, selection_fill_alpha=1, nonselection_fill_alpha=0.3,
          line_color='black', selection_line_color='black', nonselection_line_color=None
          )
src_E, src_N = pyproj.transform(WGS84, WM, model.attrs['volcano_location'][1], model.attrs['volcano_location'][0])
p1.scatter(src_E, src_N, size=20, marker='triangle', fill_color='black', line_color='white', legend='Source')

p1.legend.click_policy = 'none'

# PROFILE plot
p2 = figure(x_axis_type='datetime', x_axis_label='Time', y_axis_label='Ash thickness (mm)',
            tools=TOOLS, active_scroll=WZ, active_inspect=HV)
p2.multi_line(xs='time_values', ys='ash_values', line_width=3, source=source, name='lines',
              color='color', selection_color='color', nonselection_color='black'
             )

p2.x_range.range_padding = 0
p2.y_range.range_padding = 0.05
p2.x_range.bounds = 'auto'
p2.y_range.bounds = 'auto'
p2.xaxis[0].formatter.hours = ['%R']

plots = gridplot([[p1, p2]], plot_width=450, plot_height=500)

title = Div(text='<h1>'+FILENAME.split('/')[-1]+'</h1>')

show(column(title, plots))
```
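
The "ash first arrival" logic in the cell above can be isolated into a small, testable sketch. The thickness values and the 3-hourly time axis below are made up for illustration, and the helper `ash_onset` is hypothetical. Unlike the cell, it returns `None` when no ash falls and guards against an index of `-1` if ash were present at the very first step, both of which are choices made for this example only.

```python
import numpy as np

# made-up cumulative ash thickness (mm) at one location, one value per model time step
ash = np.array([0.0, 0.0, 0.0, 0.4, 1.1, 1.5])

# made-up 3-hourly time axis starting at the eruption time
times = np.datetime64('2018-04-29T18:00') + np.arange(6) * np.timedelta64(3, 'h')


def ash_onset(ash, times):
    """Return the time one step before the first non-zero ash value, or None if no ash falls."""
    nonzero = np.flatnonzero(ash)
    if nonzero.size == 0:
        return None
    onset_index = max(nonzero[0] - 1, 0)  # never wrap around to the last step
    return times[onset_index]


print(ash_onset(ash, times))
```

Taking the step before the first non-zero value reflects that ash began to arrive at some point during that model interval rather than exactly at its end.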

# misc notes (don't delete!)

### wish list
time_profiles
* bar plot instead of line plot
* force time axis tick interval to match model time step
* plot total ashfall extent as a patch instead of a contour
* tweak vocabulary of descriptions
* implement time slider for contours

### notes for Liam
* for Matplotlib plots, use:
```python
cc.cm[CMAP+'_r']
```
* add HTML export to time_plots