# Step 1: Notebook Setup

The cell below contains a number of helper functions used throughout this walkthrough. They are mainly wrappers around existing `matplotlib` functionality and are provided for the sake of simplicity in the steps to come.

Take a moment to read the description of each function so you understand what it does. You will use these helper functions throughout the rest of this notebook.

If you are familiar with `matplotlib`, feel free to alter the functions as you please.

## TODOs

1. Click in the cell below and run the cell.

In [None]:
# TODO: Make sure you run this cell before continuing!

%matplotlib inline
import matplotlib.pyplot as plt

def show_plot(x_datas, y_datas, x_label, y_label, legend=None, title=None):
    """
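    Display a simple line plot of one or more series.
    
    :param x_datas: Sequence of Numpy arrays, one per series, containing data for the X axis
    :param y_datas: Sequence of Numpy arrays, one per series, containing data for the Y axis
    :param x_label: Label applied to X axis
    :param y_label: Label applied to Y axis
    :param legend: Optional sequence of series names shown in the legend
    :param title: Optional plot title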
    """
    fig = plt.figure(figsize=(16,8), dpi=100)
    for (x_data, y_data) in zip(x_datas, y_datas):
        plt.plot(x_data, y_data, '-', marker='|', markersize=2.0, mfc='b')
    # Pass the flag positionally: the `b` keyword was removed in newer matplotlib releases
    plt.grid(True, which='major', color='k', linestyle='-')
    plt.xlabel(x_label)
    fig.autofmt_xdate()
    plt.ylabel(y_label)
    if legend:
        plt.legend(legend, loc='upper left')
    if title:
        plt.title(title)
    plt.show()
    return plt
    
def plot_box(bbox):
    """
    Display a green bounding box on the Blue Marble image of the Earth.
    
    :param bbox: Shapely Polygon that defines the bounding box to display
    """
    # Imported here so the other helpers work even if Basemap is not installed
    from matplotlib.patches import Polygon
    from mpl_toolkits.basemap import Basemap

    min_lon, min_lat, max_lon, max_lat = bbox.bounds

    basemap = Basemap()  # named to avoid shadowing the built-in `map`
    basemap.bluemarble(scale=0.5)
    # Transparent fill so only the green outline is drawn over the image
    poly = Polygon([(min_lon, min_lat), (min_lon, max_lat), (max_lon, max_lat),
                    (max_lon, min_lat)], facecolor=(0, 0, 0, 0.0), edgecolor='green', linewidth=2)
    plt.gca().add_patch(poly)
    plt.gcf().set_size_inches(15, 25)
    
    plt.show()
    
def show_plot_two_series(x_data_a, x_data_b, y_data_a, y_data_b, x_label, y_label_a, 
                         y_label_b, series_a_label, series_b_label, align_axis=True):
    """
    Display a line plot of two series
    
    :param x_data_a: Numpy array containing data for the Series A X axis
    :param x_data_b: Numpy array containing data for the Series B X axis
    :param y_data_a: Numpy array containing data for the Series A Y axis
    :param y_data_b: Numpy array containing data for the Series B Y axis
    :param x_label: Label applied to X axis
    :param y_label_a: Label applied to Y axis for Series A
    :param y_label_b: Label applied to Y axis for Series B
    :param series_a_label: Name of Series A
    :param series_b_label: Name of Series B
    :param align_axis: Use the same range for both Y axes
    """
    
    fig, ax1 = plt.subplots(figsize=(10,5), dpi=100)
    series_a, = ax1.plot(x_data_a, y_data_a, 'b-', marker='|', markersize=2.0, mfc='b', label=series_a_label)
    ax1.set_ylabel(y_label_a, color='b')
    ax1.tick_params('y', colors='b')
    ax1.set_ylim(min(0, *y_data_a), max(y_data_a)+.1*max(y_data_a))
    ax1.set_xlabel(x_label)
    
    ax2 = ax1.twinx()
    series_b, = ax2.plot(x_data_b, y_data_b, 'r-', marker='|', markersize=2.0, mfc='r', label=series_b_label)
    ax2.set_ylabel(y_label_b, color='r')
    ax2.set_ylim(min(0, *y_data_b), max(y_data_b)+.1*max(y_data_b))
    ax2.tick_params('y', colors='r')
    
    if align_axis:
        axis_min = min(0, *y_data_a, *y_data_b)
        axis_max = max(*y_data_a, *y_data_b)
        axis_max += .1*axis_max
        
        ax1.set_ylim(axis_min, axis_max)
        ax2.set_ylim(axis_min, axis_max)
    
    # Pass the flag positionally: the `b` keyword was removed in newer matplotlib releases
    plt.grid(True, which='major', color='k', linestyle='-')
    plt.legend(handles=(series_a, series_b), bbox_to_anchor=(1.1, 1), loc=2, borderaxespad=0.)
    plt.show()


# Step 2: List available Datasets

Now we can interact with NEXUS using the `nexuscli` Python module, which provides a number of methods for working with the NEXUS webservice API. One of those methods is `nexuscli.dataset_list`, which returns a list of Datasets in the system along with their start and end times.

Before you can use the client, you must tell it where the NEXUS webservice is running by calling `nexuscli.set_target(url)`. An instance of NEXUS is already running for you and is available at `http://<public dns>:8083`, where `<public dns>` is the public DNS of the EC2 instance you signed up for.
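
As a minimal sketch of how the target URL is put together, the snippet below uses a hypothetical helper, `build_nexus_url`, which is not part of `nexuscli`, and a placeholder hostname. The `nexuscli` calls are left as comments because they need a live NEXUS instance.

```python
def build_nexus_url(public_dns, port=8083):
    """Hypothetical helper: format the NEXUS webservice URL for a host."""
    return "http://{}:{}".format(public_dns, port)


# Placeholder hostname; substitute the public DNS of your own EC2 instance.
target_url = build_nexus_url("ec2-host.example.com")
print(target_url)

# With a running NEXUS instance you would then point the client at it:
# import nexuscli
# nexuscli.set_target(target_url)
# print(nexuscli.dataset_list())
```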

## TODOs

1. Import the `nexuscli` python module.
2. Target your EC2 instance.
3. Call `nexuscli.dataset_list()` and print the results

In [None]:
# TODO: Import the nexuscli python module.

# TODO: Target your AWS NEXUS server using your public DNS name and port 8083
nexuscli.set_target("http://<public dns>:8083", use_session=False)

# TODO: Call nexuscli.dataset_list() and print the results

# Step 3: Run a Time Series

Now that we can interact with NEXUS using the `nexuscli` python module, we would like to run a time series. To do this, we will use the `nexuscli.time_series` method. The signature for this method is described below:


>`nexuscli.time_series(datasets, bounding_box, start_datetime, end_datetime, spark=False)`  
>  
>Send a request to NEXUS to calculate a time series.  
>  
>__datasets__ Sequence (max length 2) of the name of the dataset(s)  
>__bounding_box__ Bounding box for area of interest as a `shapely.geometry.polygon.Polygon`  
>__start_datetime__ Start time as a `datetime.datetime`  
>__end_datetime__ End time as a `datetime.datetime`  
>__spark__ Optionally use spark. Default: `False`  
>  
>__return__ List of `nexuscli.nexuscli.TimeSeries` namedtuples

As you can see, there are a number of options available. Let's take a look at the precipitation rate in LA County. We approximate the county with a rectangular bounding box, which keeps the analysis simple.

Generate a time series for the `TRMM_3B42_daily_scrubbed` precipitation dataset for the time period 1997-12-31 through 1998-12-31 and a bounding box `-118.9517, 32.7969, -117.6462, 34.8233` (west, south, east, north).
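
The request parameters above can be sketched with the standard library alone. The variable names here are illustrative. The tuple follows the (west, south, east, north) order given in the text, which is also the `(minx, miny, maxx, maxy)` order that shapely's `box` expects.

```python
from datetime import datetime

# Bounding box in (west, south, east, north) order, in degrees.
west, south, east, north = -118.9517, 32.7969, -117.6462, 34.8233

# Sanity checks: swapping a pair of coordinates is an easy mistake to make.
assert -180 <= west < east <= 180
assert -90 <= south < north <= 90

# Time period for the request.
start_datetime = datetime(1997, 12, 31)
end_datetime = datetime(1998, 12, 31)
days_covered = (end_datetime - start_datetime).days
print(days_covered)  # 365
```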

## TODOs

1. Create the bounding box using shapely's `box` method
2. Plot the bounding box using the `plot_box` helper method
3. Generate the Time Series by calling the `time_series` method in the `nexuscli` module
  - __Hint__: `datetime` is already imported for you. You can create a `datetime` by calling `datetime(year, month, day)` with integer arguments
  - __Hint__: pass `spark=True` to the `time_series` function to speed up the computation
4. Plot the result using the `show_plot` helper method
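
Since `time_series` returns a list of namedtuples, the plotting step amounts to pulling parallel sequences out of a result. The sketch below uses a stand-in namedtuple with assumed field names `time` and `mean` and made-up values. Check the fields on the object `nexuscli` actually returns before relying on them.

```python
from collections import namedtuple
from datetime import datetime

# Stand-in for nexuscli's TimeSeries; the field names are an assumption.
FakeTimeSeries = namedtuple("FakeTimeSeries", ["dataset_name", "time", "mean"])

# Made-up values for illustration only, not real TRMM output.
results = [FakeTimeSeries(
    dataset_name="TRMM_3B42_daily_scrubbed",
    time=[datetime(1998, 1, 1), datetime(1998, 1, 2), datetime(1998, 1, 3)],
    mean=[0.0, 1.5, 0.25],
)]

series = results[0]      # assumed: one entry per requested dataset
x_datas = [series.time]  # show_plot expects a sequence of X series
y_datas = [series.mean]  # and a matching sequence of Y series

# In the notebook, these would be passed to the helper defined in Step 1:
# show_plot(x_datas, y_datas, "Time", "Precipitation", title=series.dataset_name)
print(len(x_datas[0]), len(y_datas[0]))
```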

In [None]:
import time
import nexuscli
from datetime import datetime

from shapely.geometry import box

# TODO: Create a bounding box using the box method imported above

# TODO: Plot the bounding box using the helper method plot_box


In [None]:
# Do not modify this line ##
start = time.perf_counter()#
############################


# TODO: Call the time_series method for the TRMM_3B42_daily_scrubbed dataset using 
# your bounding box and time period 1997-12-31 through 1998-12-31


# Enter your code above this line
print("Time Series took {} seconds to generate".format(time.perf_counter() - start))


In [None]:
# TODO: Plot the result using the `show_plot` helper method



# Step 4: Comparing Data

In [2_Subsetting](2_Subsetting.ipynb) you computed the average flow rate for rivers in LA County. Now that we have precipitation data for the same time period, it would be nice to plot the data on the same graph to look for patterns.

Using the results from your Time Series call and the average discharge rate for rivers in LA County provided below, plot the two time series on the same graph using the `show_plot_two_series` helper method.
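
To make the `align_axis` option concrete, here is a sketch of the shared Y range that `show_plot_two_series` computes when `align_axis=True`. The logic is extracted into a hypothetical standalone function, `shared_ylim`, and the inputs are made up.

```python
def shared_ylim(y_data_a, y_data_b, padding=0.1):
    """Return one (min, max) Y range covering both series.

    Mirrors the helper's logic: the lower bound is 0 unless a series goes
    negative, and the upper bound gets 10% headroom by default.
    """
    axis_min = min(0, *y_data_a, *y_data_b)
    axis_max = max(*y_data_a, *y_data_b)
    return axis_min, axis_max + padding * axis_max


# Made-up values on very different scales.
precipitation = [0.0, 1.5, 0.25]
discharge = [10.0, 20.0, 12.0]
print(shared_ylim(precipitation, discharge))
```

When the two series differ in scale as much as in this example, a shared range flattens the smaller one, so `align_axis=False` may be easier to read.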

## TODO
 
1. Plot the TRMM time series and the RAPID time series on the same graph using the `show_plot_two_series` helper method.

In [None]:
import nexuscli
import numpy
from datetime import datetime
from shapely.geometry import box

la_county_river_ids = [17575859, 17574289, 17575711, 17574677, 17574823,
                       948070361, 22560728, 22560730, 22560738]
la_county_river_data = [nexuscli.subset("RAPID_WSWM", None, datetime(1997, 1, 1), 
                                        datetime(1998, 12, 31, 23, 59, 59), None, "rivid_i:{}".format(river_id))
                        for river_id in la_county_river_ids]
avg_discharge_rates = numpy.mean(numpy.array([[point.variable['variable'] for point in river] 
                               for river in la_county_river_data]), axis=0)
single_river_time_steps = numpy.array([point.time for point in next(iter(la_county_river_data))])


# TODO: Plot the two time series on the same graph using the show_plot_two_series helper method
show_plot_two_series()