# Step 1: Notebook Setup

The cell below contains a number of helper functions used throughout this walkthrough. They are mainly wrappers around existing `matplotlib` functionality and are provided for the sake of simplicity in the steps to come.

Take a moment to read the description of each function so you understand what it can be used for. You will use these helper functions as you work through the rest of this notebook.

If you are familiar with `matplotlib`, feel free to alter the functions as you please.

## TODOs

1. Click in the cell below and run the cell.

In [1]:
# TODO: Make sure you run this cell before continuing!

%matplotlib inline
import matplotlib.pyplot as plt

def show_plot(x_datas, y_datas, x_label, y_label, legend=None, title=None):
    """
    Display a simple line plot.
    
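    :param x_datas: Sequence of Numpy arrays, one per series, containing data for the X axis
    :param y_datas: Sequence of Numpy arrays, one per series, containing data for the Y axis
    :param x_label: Label applied to X axis
    :param y_label: Label applied to Y axis
    :param legend: Optional list of series names to show in the legend
    :param title: Optional title for the plot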
    """
    fig = plt.figure(figsize=(16,8), dpi=100)
    for (x_data, y_data) in zip(x_datas, y_datas):
        plt.plot(x_data, y_data, '-', marker='|', markersize=2.0, mfc='b')
    # Pass the flag positionally: the `b` keyword was removed in newer matplotlib releases
    plt.grid(True, which='major', color='k', linestyle='-')
    plt.xlabel(x_label)
    fig.autofmt_xdate()
    plt.ylabel(y_label)
    if legend:
        plt.legend(legend, loc='upper left')
    if title:
        plt.title(title)
    plt.show()
    return plt
    
def plot_box(bbox):
    """
    Display a green bounding box on an image of the Blue Marble.
    
    :param bbox: Shapely Polygon that defines the bounding box to display
    """
    min_lon, min_lat, max_lon, max_lat = bbox.bounds
    import matplotlib.pyplot as plt1
    from matplotlib.patches import Polygon
    from mpl_toolkits.basemap import Basemap

    map = Basemap()
    map.bluemarble(scale=0.5)
    poly = Polygon([(min_lon,min_lat),(min_lon,max_lat),(max_lon,max_lat),
                    (max_lon,min_lat)],facecolor=(0,0,0,0.0),edgecolor='green',linewidth=2)
    plt1.gca().add_patch(poly)
    plt1.gcf().set_size_inches(15,25)
    
    plt1.show()
    
def show_plot_two_series(x_data_a, x_data_b, y_data_a, y_data_b, x_label, y_label_a, 
                         y_label_b, series_a_label, series_b_label, align_axis=True):
    """
    Display a line plot of two series
    
    :param x_data_a: Numpy array containing data for the Series A X axis
    :param x_data_b: Numpy array containing data for the Series B X axis
    :param y_data_a: Numpy array containing data for the Series A Y axis
    :param y_data_b: Numpy array containing data for the Series B Y axis
    :param x_label: Label applied to X axis
    :param y_label_a: Label applied to Y axis for Series A
    :param y_label_b: Label applied to Y axis for Series B
    :param series_a_label: Name of Series A
    :param series_b_label: Name of Series B
    :param align_axis: Use the same range for both y axis
    """
    
    fig, ax1 = plt.subplots(figsize=(10,5), dpi=100)
    series_a, = ax1.plot(x_data_a, y_data_a, 'b-', marker='|', markersize=2.0, mfc='b', label=series_a_label)
    ax1.set_ylabel(y_label_a, color='b')
    ax1.tick_params('y', colors='b')
    ax1.set_ylim(min(0, *y_data_a), max(y_data_a)+.1*max(y_data_a))
    ax1.set_xlabel(x_label)
    
    ax2 = ax1.twinx()
    series_b, = ax2.plot(x_data_b, y_data_b, 'r-', marker='|', markersize=2.0, mfc='r', label=series_b_label)
    ax2.set_ylabel(y_label_b, color='r')
    ax2.set_ylim(min(0, *y_data_b), max(y_data_b)+.1*max(y_data_b))
    ax2.tick_params('y', colors='r')
    
    if align_axis:
        axis_min = min(0, *y_data_a, *y_data_b)
        axis_max = max(*y_data_a, *y_data_b)
        axis_max += .1*axis_max
        
        ax1.set_ylim(axis_min, axis_max)
        ax2.set_ylim(axis_min, axis_max)
    
    # Pass the flag positionally: the `b` keyword was removed in newer matplotlib releases
    plt.grid(True, which='major', color='k', linestyle='-')
    plt.legend(handles=(series_a, series_b), bbox_to_anchor=(1.1, 1), loc=2, borderaxespad=0.)
    plt.show()


# Step 2: Run a Daily Difference Average (Anomaly) calculation

El Niño is a commonly studied oceanographic phenomenon. The `nexuscli` module has a function that can be used to generate a Daily Difference Average (also known as an anomaly plot). We'd like to run this anomaly calculation on the [El Niño 3.4 region](https://www.ncdc.noaa.gov/teleconnections/enso/indicators/sst.php), which has the bounding box `-170, -5, -120, 5`.

The Daily Difference Average algorithm compares a dataset against a climatological mean and produces a time series of the difference from that mean.
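To make the idea concrete, here is a minimal sketch of the "difference from a climatological mean" calculation on made-up numbers. It is not the NEXUS implementation: the helper names `climatology_by_day` and `daily_difference` are hypothetical, the temperatures are synthetic, and the real algorithm also averages over the bounding box on the server before the comparison is made.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def climatology_by_day(history):
    """Average the historical values recorded for each (month, day) pair."""
    buckets = defaultdict(list)
    for day, value in history:
        buckets[(day.month, day.day)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


def daily_difference(series, climatology):
    """Subtract the climatological mean for each calendar day from the observed value."""
    return [(day, value - climatology[(day.month, day.day)]) for day, value in series]


# Synthetic (made-up) area-averaged sea surface temperatures in degrees Celsius
history = [
    (date(2014, 1, 1), 26.0), (date(2015, 1, 1), 27.0),
    (date(2014, 1, 2), 26.5), (date(2015, 1, 2), 27.5),
]
observed = [(date(2016, 1, 1), 28.0), (date(2016, 1, 2), 27.0)]

climatology = climatology_by_day(history)
anomalies = daily_difference(observed, climatology)

for day, anomaly in anomalies:
    print(day.isoformat(), anomaly)
```

A positive value means the observed day was warmer than the long-term mean for that calendar day, and a negative value means it was cooler.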

This time, using the `nexuscli` module, call the `daily_difference_average` method. The signature for that method is reprinted below:

>Generate an anomaly Time series for a given dataset, bounding box, and timeframe.  
>  
>__dataset__ Name of the dataset as a String  
>__bounding_box__ Bounding box for area of interest as a `shapely.geometry.polygon.Polygon`  
>__start_datetime__ Start time as a `datetime.datetime`  
>__end_datetime__ End time as a `datetime.datetime`  
>  
>__return__ List of `nexuscli.nexuscli.TimeSeries` namedtuples

Generate an anomaly time series using the `AVHRR_OI_L4_GHRSST_NCEI` SST dataset for the time period 2016-01-01 through 2016-12-31 and a bounding box `-170, -5, -120, 5` (west, south, east, north).

## TODOs

1. Target your EC2 instance
2. Generate the Anomaly Time Series by calling the `daily_difference_average` method in the `nexuscli` module
3. Plot the result using the `show_plot` helper method


In [None]:
import time
import nexuscli
from datetime import datetime

from shapely.geometry import box

# TODO: Target your AWS NEXUS server using your public DNS name and port 8083
nexuscli.set_target("http://<public dns>:8083", use_session=False)

# Do not modify this line ##
start = time.perf_counter()#
############################


# TODO Generate the Anomaly Time Series by calling the daily_difference_average method in the nexuscli module


print("Daily Difference Average took {} seconds to generate".format(time.perf_counter() - start))


In [None]:
# TODO plot the result


# Step 3: Clean up the Output

These graphs can get quite noisy, and sometimes it is desirable to make them easier to read by sampling the returned data rather than plotting every single data point. You may also have noticed that the [TimeSeries](https://htmlpreview.github.io/?https://raw.githubusercontent.com/apache/incubator-sdap-nexus/107438af45b479348ffb75a667b276ee3c81f9da/client/docs/nexuscli/nexuscli.m.html#nexuscli.nexuscli.TimeSeries) object returned from the `daily_difference_average` method includes the standard deviation.

Let's try sampling the data once per week (every 7 days) and plotting the results including the standard deviation as error bars on the graph.

## TODOs

1. Sample every 7th data point to reduce plot noise
  - __Hint__: Python's [extended slices notation](https://docs.python.org/2.3/whatsnew/section-slices.html) is very useful here
2. Plot the extracted means with standard deviation shown as error bars
  - __Hint__: `matplotlib` has an [errorbar](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.errorbar.html) function
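As a rough illustration of the slicing hint, the sketch below samples every 7th element from a small stand-in object. `FakeSeries` and its field names (`time`, `mean`, `standard_deviation`) are assumptions made for this example, so check the linked `TimeSeries` documentation for the actual attributes before adapting it.

```python
from collections import namedtuple
from datetime import date, timedelta

# Hypothetical stand-in for a returned time series; field names are assumed
FakeSeries = namedtuple("FakeSeries", ["time", "mean", "standard_deviation"])

# Three weeks of made-up daily values
days = [date(2016, 1, 1) + timedelta(days=i) for i in range(21)]
series = FakeSeries(
    time=days,
    mean=list(range(21)),
    standard_deviation=[0.5] * 21,
)

# sequence[start:stop:step] with only a step keeps every 7th element,
# starting from the first one
step = 7
weekly_time = series.time[::step]
weekly_mean = series.mean[::step]
weekly_std = series.standard_deviation[::step]

print(len(weekly_time))  # number of sampled points
print(weekly_mean)       # sampled means
```

The same `[::7]` slice has to be applied to every array you plot so that the times, means, and standard deviations stay aligned. The sampled standard deviations can then be supplied as the `yerr` argument of `plt.errorbar`.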

In [None]:
# TODO Sample every 7th data point to reduce plot noise


# TODO Plot the extracted means with standard deviation shown as error bars
