## A Quick NDBC Data Example
**Oceanography House Spring 2021**

*Written by Sage Lichtenwalner, Rutgers University, February 15, 2021*

This example demonstrates how to quickly load and plot NDBC Standard Meteorological Buoy Data from the [NOAA CoastWatch ERDDAP server](https://coastwatch.pfeg.noaa.gov/erddapinfo/index.html).

One of the big advantages of ERDDAP is that it provides data in CSV format (among many others).  Using the Python [pandas](https://pandas.pydata.org) library, we can easily load any CSV file available on the internet or on our machine.  And with [matplotlib](https://matplotlib.org/stable/index.html) we can plot the results.  Thus, this example provides the basics for creating simple data plots in Python from any CSV file you might have.

In [None]:
# Notebook setup - Let's load some libraries
import pandas as pd
import matplotlib.pyplot as plt

First, we need to specify the data URL that will give us a CSV file.

You can customize the data URL on this page:
https://coastwatch.pfeg.noaa.gov/erddap/tabledap/cwwcNDBCMet.html

In [None]:
# Let's specify our Dataset URL
url = 'https://coastwatch.pfeg.noaa.gov/erddap/tabledap/cwwcNDBCMet.csv?station%2Clongitude%2Clatitude%2Ctime%2Cwd%2Cwspd%2Cgst%2Cwvht%2Cdpd%2Capd%2Cmwd%2Cbar%2Catmp%2Cwtmp%2Cdewp%2Cvis%2Cptdy%2Ctide%2Cwspu%2Cwspv&station=%2244025%22&time%3E=2019-01-01&time%3C=2020-01-01'

For this example, I've specified a single station ([Station 44025](https://www.ndbc.noaa.gov/station_page.php?station=44025)) and the date range 2019-01-01 to 2020-01-01.
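
If you'd rather not edit that long string by hand, here is a minimal sketch of how the same URL could be assembled from its parts.  The helper name `build_ndbc_url` and the variable `example_url` are my own inventions for illustration, not part of ERDDAP or pandas.  It assumes you want the same variable list as above, with a single station and a start and end date as the only filters.

```python
from urllib.parse import quote

# Base address of the NDBC dataset, taken from the URL above
BASE = 'https://coastwatch.pfeg.noaa.gov/erddap/tabledap/cwwcNDBCMet.csv'

# Same variable list as in the URL above
VARIABLES = ['station', 'longitude', 'latitude', 'time', 'wd', 'wspd', 'gst',
             'wvht', 'dpd', 'apd', 'mwd', 'bar', 'atmp', 'wtmp', 'dewp',
             'vis', 'ptdy', 'tide', 'wspu', 'wspv']

def build_ndbc_url(station, start, end):
    """Hypothetical helper: assemble a CSV request for one station and date range."""
    columns = quote(','.join(VARIABLES), safe='')              # commas become %2C
    station_filter = 'station=' + quote('"' + station + '"')   # quotes become %22
    start_filter = 'time' + quote('>') + '=' + start           # > becomes %3E
    end_filter = 'time' + quote('<') + '=' + end               # < becomes %3C
    return BASE + '?' + '&'.join([columns, station_filter, start_filter, end_filter])

example_url = build_ndbc_url('44025', '2019-01-01', '2020-01-01')
print(example_url)
```

Changing the station or the dates is then a matter of changing the three arguments.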

Note that ERDDAP offers three kinds of CSV downloads (`.csv`, `.csvp`, and `.csv0`).  I've chosen `.csv`, which includes the variable names in line 1 and the units in line 2.  I will skip line 2 when loading, so that pandas reads the columns as numbers rather than text.
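
To see why skipping the units line matters, here is a small sketch that runs without an internet connection.  The `sample_csv` text is made up by me to mimic the layout (names on line 1, units on line 2); it is not real buoy data, and the units shown are only placeholders.

```python
import io
import pandas as pd

# Made-up sample in the same layout as the .csv download:
# names on line 1, units on line 2, values from line 3 onward
sample_csv = """station,time,atmp,wtmp
,UTC,degree_C,degree_C
44025,2019-01-01T00:00:00Z,5.0,8.0
44025,2019-01-01T01:00:00Z,5.5,7.9
"""

# Without skiprows, the units row is read as data, so atmp is not numeric
raw = pd.read_csv(io.StringIO(sample_csv))
print(raw['atmp'].dtype)

# skiprows=[1] drops the second line of the file (row numbers start at 0)
clean = pd.read_csv(io.StringIO(sample_csv), skiprows=[1],
                    parse_dates=['time'], index_col='time')
print(clean.dtypes)
```

With the units row removed, the temperature columns load as numbers and the `time` column becomes a datetime index, which is what the plots later on rely on.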

In [None]:
# Load the Data
data = pd.read_csv(url,skiprows=[1], parse_dates=['time'], index_col='time')

By default, if the last line of a code cell is a variable name, the notebook displays a representation of that variable, though what you see depends greatly on the *type* of object it is.

Pandas provides a few commands to help you understand what your dataset looks like.
* `data.head()` - You can also specify the number of rows as `data.head(2)`
* `data.tail()`
* `data.size`
* `data.shape`
* `data.sample()`
* `data.keys()`
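
Here is a quick sketch of what those commands do, using a tiny made-up table (`demo`, my own stand-in) so you can try them without downloading anything.

```python
import pandas as pd

# Small made-up table standing in for the buoy data
demo = pd.DataFrame({'atmp': [5.0, 5.5, 6.1], 'wtmp': [8.0, 7.9, 7.9]})

print(demo.head(2))        # first 2 rows
print(demo.tail(1))        # last row
print(demo.shape)          # (number of rows, number of columns)
print(demo.size)           # total number of values (rows * columns)
print(demo.sample())       # one randomly chosen row
print(list(demo.keys()))   # column names
```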



In [None]:
# data.head()

Pandas also provides several commands to quickly calculate common statistics.

You can use `data.describe()` to get several calculations at once in a nice table.  

Or you can use `mean()`, `std()`, `count()`, `min()` and `max()` to get specific statistics for the entire table or specific variables.
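
One thing worth knowing is that these statistics skip missing values by default, which matters because buoy records often have gaps.  The sketch below uses a made-up table (`stats_demo`, my own name) with one missing reading to show the effect.

```python
import numpy as np
import pandas as pd

# Made-up values; the NaN stands in for a missing reading
stats_demo = pd.DataFrame({'atmp': [4.0, 6.0, 8.0],
                           'wtmp': [7.0, np.nan, 9.0]})

print(stats_demo.describe())        # count, mean, std, min, quartiles, max
print(stats_demo.mean())            # mean of each column
print(stats_demo['wtmp'].count())   # number of non-missing values
print(stats_demo['atmp'].std())     # sample standard deviation
```

Note that `count()` reports only the non-missing values, so it is a handy check on how complete each variable is.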

In [None]:
# data.describe() # Full table
# data.mean(numeric_only=True) # Means for all numeric columns
# data['wtmp'].mean() # Mean for just water temp
# data.wtmp.mean()  # Another format for mean water temp

## A Quick Plot
Now that we have some data loaded, let's make some plots.

We can easily use the matplotlib library to create a basic X/Y plot with `plt.plot(x,y)`.

In [None]:
plt.plot(data.index,data.atmp);

That's nice.

But we've forgotten the first rule in Data Visualization...

## Label your Graphs!!!
Thankfully, Matplotlib provides quite a few ways to customize your plot.  Here are a few quick things we can add.
* Axis Title: `plt.title('Title')`
* Axes Labels: `plt.xlabel('Time')` or `plt.ylabel('Temperature')`
* Axes Limits: `plt.ylim([-5,5])`
* Add a Legend: `plt.legend()`.  This is helpful when you plot more than one graph on the same axis.  For this to work, you will need to add `label='NAME'` to your plot commands.

This just scratches the surface.  There are several more examples and references in [this tutorial](https://github.com/ooi-data-lab/data-lab-workshops/blob/master/Other_Examples/OH2020_Python_Basics.ipynb).

In [None]:
# Example Plot
plt.figure(figsize=(8,6)) # Let's make the figure bigger

plt.plot(data.index, data['atmp'], label='Air Temperature', c='#e41a1c');
plt.plot(data.index, data['wtmp'], label='Water Temperature', c='#377eb8'); # Add a 2nd plot to the same graph

plt.legend() # Add a legend
plt.ylabel('Temperature (°C)')
plt.title('Temperature at NDBC Station 44025', fontweight='bold', fontsize=12);

## Creating Subplots
Finally, let's do a quick example that shows how to create 3 plots (subplots) in the same figure.

For this example, we will load the first 2 weeks of February 2021 for [Station 44077](https://www.ndbc.noaa.gov/station_page.php?station=44077).

In [None]:
url = 'https://coastwatch.pfeg.noaa.gov/erddap/tabledap/cwwcNDBCMet.csv?station%2Clongitude%2Clatitude%2Ctime%2Cwd%2Cwspd%2Cgst%2Cwvht%2Cdpd%2Capd%2Cmwd%2Cbar%2Catmp%2Cwtmp%2Cdewp%2Cvis%2Cptdy%2Ctide%2Cwspu%2Cwspv&station=%2244077%22&time%3E=2021-02-01&time%3C=2021-02-15'

In [None]:
# Load the Data
data = pd.read_csv(url,skiprows=[1], parse_dates=['time'], index_col='time')

In [None]:
# Plot
fig,(ax1,ax2,ax3) = plt.subplots(3,1, sharex=True, figsize=(8,6) ) # Let's create 3 subplots, and also make the figure bigger

ax1.plot(data.index, data['atmp'], label='Air Temperature');
ax1.plot(data.index, data['wtmp'], label='Water Temperature');
ax1.legend()
ax1.set_ylabel('Temperature (°C)')
ax1.set_title('NDBC Station 44077', fontweight='bold', fontsize=12);

ax2.plot(data.index, data['bar'], label='Barometric Pressure');
ax2.set_ylabel('Pressure (hPa)')

ax3.plot(data.index, data['wspd'], label='Wind Speed');
ax3.set_ylabel('Wind Speed (m/s)');

plt.savefig('ndbc_test.png') # Save the figure to a file

## Your Turn
Using the few lines of code above, try to create your own plots.

Here are some things you can try:
* Try a different [NDBC station](https://www.ndbc.noaa.gov).  You can use the map on their site to find one in a region that interests you.  (Not all sites may have data.)
* Select a different time range to plot.  Maybe a month, a season, or a full year.
* Plot different variables against each other, like winds vs. waves, or air vs. water temperatures.
* Create a figure that compares time series of different variables (e.g. using subplots).
* Load 2 or 3 different stations and plot the data together.
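
If you want a starting point for that last idea, here is one possible sketch, not the only way to do it.  The two tables below are made-up stand-ins (labeled "A" and "B") so the code runs on its own; in practice, each one would come from `pd.read_csv()` with that station's URL, as in the cells above.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up stand-ins for two stations; replace each with a table loaded
# by pd.read_csv() from that station's ERDDAP URL
times = pd.date_range('2021-02-01', periods=4, freq='D')
stations = {
    'A': pd.DataFrame({'wtmp': [6.1, 6.0, 5.8, 5.9]}, index=times),
    'B': pd.DataFrame({'wtmp': [5.2, 5.1, 5.3, 5.0]}, index=times),
}

fig, ax = plt.subplots(figsize=(8, 4))

# One line per station, each with its own legend label
for name, frame in stations.items():
    ax.plot(frame.index, frame['wtmp'], label='Station ' + name)

ax.legend()
ax.set_ylabel('Water Temperature (°C)')
ax.set_title('Water Temperature at Two Stations (made-up values)');
```

Keeping the tables in a dictionary means adding a third station only takes one more entry.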