<a href="https://colab.research.google.com/github/BCODMO/Data-Use-Examples/blob/master/notebooks/erddap_cariaco.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## CARIACO Niskin bottle time series data from BCO-DMO
This notebook walks through building an ERDDAP URL for a specific dataset and plotting time series of some of its variables.

Author: Mathew Biddle (mbiddle@whoi.edu)

In [5]:
!pip install erddapy



In [6]:
from erddapy import ERDDAP
import pandas as pd
%matplotlib inline

### Start building the ERDDAP request

In [7]:
server = 'http://erddap.bco-dmo.org/erddap'

# Configure a client for tabular (tabledap) data returned as CSV.
e = ERDDAP(
    server=server,
    protocol='tabledap',
    response='csv',
)

### Go out and get the data for a specific dataset

In this example, we already know the dataset ID is **bcodmo_dataset_3093** ([www.bco-dmo.org/dataset/3093](https://www.bco-dmo.org/dataset/3093)). We set that ID on the ERDDAP object, print the resulting download URL, and then load the data into a pandas DataFrame.

In [8]:
e.dataset_id = 'bcodmo_dataset_3093'
print(e.get_download_url())
df = e.to_pandas(header=[0,1])
df.describe()

http://erddap.bco-dmo.org/erddap/tabledap/bcodmo_dataset_3093.csv?


Unnamed: 0_level_0,Cruise_number (integer (nnn)),Leg (integer (n)),Day (unitless),Month (unitless),Year (unitless),latitude (degrees_north),longitude (degrees_east),Hydro_cast_no (integer (n)),Depth_target (meters (m)),depth (m),...,q_DOC (dimensionless),TOC (micromolar (μM)),q_TOC (dimensionless),PrimaryProductivity (milligrams Carbon/meter^3/hour (mgC/m^3/hr)),q_PrimaryProductivity (dimensionless),Chlorophyll (milligrams/meter^3 (mg/m^3)),q_Chlorophyll (dimensionless),Phaeopigments (milligrams/meter^3 (mg/m^3)),q_Phaeopigments (dimensionless),Bio_cast_no (integer (n))
Unnamed: 0_level_1,1,2,8,11,1995,10.5,-64.667,1,1,1.5,...,0,NaN,0,NaN,0,0.0940762,0,0.09,0,2
count,4393.0,4393.0,4393.0,4393.0,4393.0,4393.0,4393.0,4364.0,4392.0,4391.0,...,4393.0,452.0,4393.0,1658.0,4393.0,1828.0,4393.0,1827.0,4393.0,1838.0
mean,115.239017,1.152288,9.778056,6.496016,2005.296836,10.501043,-64.666233,1.58593,264.242942,264.688388,...,0.161165,125.672434,0.025267,1.922717,0.001366,0.498997,0.000228,0.427587,0.000228,3.966268
std,67.342454,0.471192,4.333086,3.480501,5.967446,0.012768,0.021966,0.64565,316.692561,316.536402,...,0.367725,406.30316,0.156954,4.034208,0.036936,1.031987,0.015088,0.532791,0.015088,0.338034
min,1.0,1.0,1.0,1.0,1995.0,10.492,-64.735,0.0,1.0,0.349,...,0.0,24.8917,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0
25%,56.0,1.0,7.0,3.0,2000.0,10.499,-64.667,1.0,35.0,35.5305,...,0.0,56.8142,0.0,0.095373,0.0,0.12,0.0,0.127131,0.0,4.0
50%,114.0,1.0,10.0,6.0,2005.0,10.5,-64.667,2.0,160.0,161.331,...,0.0,63.97495,0.0,0.833984,0.0,0.2,0.0,0.247098,0.0,4.0
75%,174.0,1.0,12.0,10.0,2010.0,10.5,-64.666,2.0,350.0,350.2085,...,0.0,81.595825,0.0,1.86523,0.0,0.407875,0.0,0.529713,0.0,4.0
max,232.0,4.0,29.0,12.0,2017.0,10.683,-64.367,5.0,1320.0,1351.0,...,1.0,6538.33,1.0,65.445,1.0,24.7812,1.0,7.53675,1.0,6.0
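
The download URL printed above consists of the server, the protocol, the dataset ID, and the response type, followed by a `?` and an optional query. As a rough sketch of that shape, the snippet below assembles such a URL by hand using only the standard library. The helper `build_tabledap_url` is hypothetical (it is not part of erddapy), and the depth cutoff is an arbitrary illustrative value.

```python
from urllib.parse import quote


def build_tabledap_url(server, dataset_id, response="csv", variables=(), constraints=()):
    """Assemble a tabledap request URL from its parts (illustrative helper)."""
    base = f"{server.rstrip('/')}/tabledap/{dataset_id}.{response}"
    # Requested variables form a comma-separated list right after the "?".
    query = ",".join(variables)
    # Each constraint is appended with "&"; characters such as "<" and ">"
    # are percent-encoded so the URL stays valid.
    for constraint in constraints:
        query += "&" + quote(constraint, safe="=")
    return f"{base}?{query}"


url = build_tabledap_url(
    "http://erddap.bco-dmo.org/erddap",
    "bcodmo_dataset_3093",
    variables=["time", "depth", "Chlorophyll"],
    constraints=["depth<=100"],  # hypothetical depth cutoff
)
print(url)
```

With no variables or constraints, the helper returns the same URL that `e.get_download_url()` printed, ending in a bare `?`.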


### Make some plots
Here we set the 'time' variable as the DataFrame index and plot time series of a few of the variables.
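
The column labels in the summary table combine a variable name with its units, for example `Chlorophyll (milligrams/meter^3 (mg/m^3))`. The sketch below shows one way to separate the two parts. `split_label` is a hypothetical helper, and the labels are copied from the table above.

```python
def split_label(label):
    """Split 'name (units)' into (name, units); units is None if absent."""
    name, sep, rest = label.partition(" (")
    if not sep or not rest.endswith(")"):
        return label, None
    # Drop only the final closing parenthesis so nested units stay intact.
    return name, rest[:-1]


labels = [
    "depth (m)",
    "latitude (degrees_north)",
    "Chlorophyll (milligrams/meter^3 (mg/m^3))",
]

# Map each variable name to its units string.
units = dict(split_label(label) for label in labels)
for name, unit in units.items():
    print(f"{name}: {unit}")
```

Keeping the units in a separate mapping leaves them available for axis labels while the columns themselves can be selected by plain variable name.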

In [9]:
# Keep only the variable name from labels such as 'time (UTC)'.
df.columns = [(c[0] if isinstance(c, tuple) else c).split(' (')[0] for c in df.columns]
df.index = pd.to_datetime(df['time'])

df[['Chlorophyll','Alkalinity_mol_kg','NO3_UDO','DOC','Salinity_CTD']].plot(subplots=True, figsize=(15,16));
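
Selecting several columns at once raises a `KeyError` if any requested label is absent from the DataFrame, so it can help to compare the requested labels with the available ones first. The snippet below is a minimal sketch of that check. `missing_labels` is a hypothetical helper, and the `available` list stands in for `list(df.columns)`.

```python
def missing_labels(wanted, available):
    """Return the requested labels that are not present, in request order."""
    present = set(available)
    return [label for label in wanted if label not in present]


# Stand-in for list(df.columns); these labels still carry their units.
available = ["time (UTC)", "depth (m)", "Chlorophyll (milligrams/meter^3 (mg/m^3))"]
wanted = ["Chlorophyll", "depth (m)"]

print(missing_labels(wanted, available))
```

Here `Chlorophyll` is reported as missing because the available label includes the units, which is the kind of mismatch that produces a `KeyError` on selection.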