<a class="anchor" id="top"></a>

## Outline:
* [Getting Started](#getting-started)
* [Catalog](#catalog)
<br/>
* [**Data Visualization**](#data-visualization)
    * [Regional Map](#map)
    * [Sparse Regional Map](#s_map)
    * [Time Series](#timeseries)
    * [Section Map](#section)
    * [Depth Profile](#depth-profile)
    * [Cruise Sampling](#cruise)
    * [Colocalize Custom External Dataset](#external)
    
* [**Data Retrieval**](#retrieval)
    * [Calling Pre-defined Functions](#retrieval) 
        * [Space-Time Subset](#space-time)
        * [Time Series Subset](#time-series-subset)



<a class="anchor" id="catalog"></a>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>


<center>
<h1> Catalog </h1>
</center>


<br/>
<br/>
<center>
<h1> cmap.readthedocs.io </h1>
</center>



<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>





In [None]:
from opedia import getCatalog
import pandas as pd


df = pd.read_csv('./data/catalog.csv')    # load the catalog of tables and variables

print(df[['Variable', 'Table_Name']].to_string())

<a class="anchor" id="data-visualization"></a>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>


<center>
<h1> Data Visualization </h1>
</center>


<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>





<a class="anchor" id="map"></a>

# Plot Regional Maps (Satellite, Model)

Create a regional map using satellite and model data.
<br/> <br/>
**Notes:**<br/> 
* Darwin_Nutrient_3day is a 3-day interval version of the Darwin model with spatial resolution $\frac{1}{2}^\circ \times \frac{1}{2}^\circ$.<br/>

* Satellite SST data set is a daily-global product with spatial resolution $\frac{1}{4}^\circ \times \frac{1}{4}^\circ$.<br/>

In [None]:
from opedia import plotRegional as REG

tables = ['tblSST_AVHRR_OI_NRT','tblDarwin_Nutrient_3day','tblDarwin_Nutrient_3day','tblDarwin_Nutrient_3day']    # see catalog.csv  for the complete list of tables and variable names
variables = ['sst','PO4_darwin_3day','DIN_darwin_3day','O2_darwin_3day']                            # see catalog.csv  for the complete list of tables and variable names   
startDate = '2014-04-27'
endDate = '2014-04-27'
lat1, lat2 = -90, 90
lon1, lon2 = -180, 180
depth1, depth2 = 0, 5
fname = 'regional'
exportDataFlag = False       # True if you want to download data

REG.regionalMap(tables, variables, startDate, endDate, lat1, lat2, lon1, lon2, depth1, depth2, fname, exportDataFlag)

<a class="anchor" id="s_map"></a>

# Plot Sparse Regional Maps (Cruise)

Create a regional map using sparse cruise data.
<br/> <br/>
**Notes:**<br/> 
* Global Cyanobacteria (Flombaum) is a dataset that contains global observations of Prochlorococcus and Synechococcus abundance from 1987-09-17 through 2008-11-10.


In [None]:
from opedia import plotRegional as REG


tables = ['tblFlombaum']
variables = ['prochlorococcus_abundance_flombaum']  
startDate = '1987-09-17'
endDate = '2008-11-10'
lat1, lat2 = -90, 90
lon1, lon2 = -180, 180
depth1, depth2 = 0, 5
fname = 'regional'
exportDataFlag = False       # True if you want to download data

REG.regionalMap(tables, variables, startDate, endDate, lat1, lat2, lon1, lon2, depth1, depth2, fname, exportDataFlag)

<a class="anchor" id="timeseries"></a>

# Plot Time Series (Model, Satellite)

Create time series plots using satellite and model data.
<br/> <br/>
**Notes:**<br/> 
* Darwin_Nutrient_3day is a 3-day interval version of the Darwin model with spatial resolution $\frac{1}{2}^\circ \times \frac{1}{2}^\circ$.<br/>

* Satellite SST data set is a daily-global product with spatial resolution $\frac{1}{4}^\circ \times \frac{1}{4}^\circ$.<br/>

* Satellite Altimetry data set is a daily-global product with spatial resolution $\frac{1}{4}^\circ \times \frac{1}{4}^\circ$.<br/>

In [None]:
from opedia import plotTS as TS


tables = ['tblSST_AVHRR_OI_NRT', 'tblAltimetry_REP','tblDarwin_Nutrient_3day']    # see catalog.csv  for the complete list of tables and variable names
variables = ['sst', 'sla', 'O2_darwin_3day']   
startDate = '2014-01-01'
endDate = '2014-09-29'
lat1, lat2 = 25, 30
lon1, lon2 = -160, -155
depth1, depth2 = 0, 5
fname = 'TS'
exportDataFlag = False                                                   # True if you want to download data

TS.plotTS(tables, variables, startDate, endDate, lat1, lat2, lon1, lon2, depth1, depth2, fname, exportDataFlag)

<a class="anchor" id="section"></a>

# Plot Section Map (Model outputs)

Create section maps using Darwin model outputs.
<br/> <br/>
**Notes:**
* Darwin_Nutrient_3day is a 3-day interval version of the Darwin model with spatial resolution $\frac{1}{2}^\circ \times \frac{1}{2}^\circ$.<br/>




In [None]:
from opedia import plotSection as SEC

tables = ['tblDarwin_Nutrient_3day', 'tblDarwin_Nutrient_3day']     # see catalog.csv  for the complete list of tables and variable names      
variables = ['O2_darwin_3day', 'SiO2_darwin_3day']                           # see catalog.csv  for the complete list of tables and variable names
startDate = '2014-04-30'                                         
endDate = '2014-04-30'
lat1, lat2 = 14, 15
lon1, lon2 = -156, -20
depth1, depth2 = 0, 6000
fname = 'SEC'
exportDataFlag = False                                           # True if you want to download data

SEC.sectionMap(tables, variables, startDate, endDate, lat1, lat2, lon1, lon2, depth1, depth2, fname, exportDataFlag)

<a class="anchor" id="depth-profile"></a>

# Plot Depth Profile (BGC-Argo Floats, Model outputs)

Create depth profile plots using model and BGC-Argo float profiles.
<br/> <br/>
**Notes:**
* Darwin_Climatology is a monthly climatology version of the Darwin model with spatial resolution $\frac{1}{2}^\circ \times \frac{1}{2}^\circ$.<br/>

* Argo float data set has irregular temporal and spatial resolution. <br/>

In [None]:
from opedia import plotDepthProfile as DEP

tables = ['tblArgoMerge_REP', 'tblDarwin_Chl_Climatology']     # see catalog.csv  for the complete list of tables and variable names      
variables = ['argo_merge_chl_adj', 'chl01_darwin_clim']        # see catalog.csv  for the complete list of tables and variable names
startDate = '2016-04-30'   
endDate = '2016-04-30'
lat1, lat2 = 20, 24
lon1, lon2 = -170, -160
depth1, depth2 = 0, 1500
fname = 'DEP'
exportDataFlag = False                                         # True if you want to download data

DEP.plotDepthProfile(tables, variables, startDate, endDate, lat1, lat2, lon1, lon2, depth1, depth2, fname, exportDataFlag)

<br/> <br/>
<a class="anchor" id="cruise"></a>

# Colocalize Darwin model and satellite data with cruise

Compare the underway (in-situ) picoeukaryote abundance measurements made during the KM1502 cruise with satellite chlorophyll data and the picoeukaryote climatological estimates provided by the Darwin model.

<br/> 
**Notes:**<br/> 

* In-situ picoeukaryote abundance measurements come from the SeaFlow data set, which has 3-minute temporal resolution and irregular spatial resolution.

* Satellite Chlorophyll data used in this example is a daily-global reprocessed and optimally interpolated data set with $4~{\rm km}\times4~{\rm km}$ spatial resolution. 

* Darwin_Climatology is a monthly climatology version of the Darwin model with spatial resolution $\frac{1}{2}^\circ \times \frac{1}{2}^\circ$.<br/>



<br/>


In [None]:
from opedia import plotCruise as CRS

DB_Cruise = True                 # < True > if cruise trajectory already exists in DB. < False > if arbitrary cruise file (e.g. virtual) 
source = 'tblSeaFlow'            # cruise table name or path to csv trajectory file    
cruise = 'KM1502'              # cruise name, or file name of the csv trajectory file     
resampTau = '6H'                 # resample the cruise trajectory making trajectory time-space resolution coarser: e.g. '6H' (6 hourly), '3T' (3 minutes), ... '0' (ignore)  
fname = 'alongTrack'             # figure filename
tables = ['tblSeaFlow', 'tblDarwin_Plankton_Climatology', 'tblCHL_REP']    # list of variable table names               
variables = ['picoeuk_abundance', 'picoeukaryote_c03_darwin_clim', 'chl']               # list of variable names           
spatialTolerance = 0.3           # colocalizer spatial tolerance (+/- degrees) 
exportDataFlag = False           # export the cruise trajectory and colocalized data on disk
depth1 = 0                      # depth range start (m) 
depth2 = 5                       # depth range end (m)  


df = CRS.getCruiseTrack(DB_Cruise, source, cruise)
df = CRS.resample(df, resampTau) 
loadedTrack = CRS.plotAlongTrack(tables, variables, cruise, resampTau, df, spatialTolerance, depth1, depth2, fname, exportDataFlag, marker='-', msize=30, clr='darkturquoise')
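
The `resampTau` parameter above makes the cruise trajectory coarser in time and space. The block below is only a minimal sketch of that idea, not the code used by `CRS.resample`: it assumes resampling means averaging positions inside fixed-length time bins, and the function name `resample_track` and the four track points are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def resample_track(points, hours):
    """Average (time, lat, lon) points into fixed bins that are `hours` long.

    Hypothetical helper: bins are aligned to 1970-01-01 00:00, so 6-hour bins
    start at 00:00, 06:00, 12:00, and 18:00.
    Note: plain averaging of longitudes is not valid across the date line.
    """
    bin_seconds = hours * 3600
    epoch = datetime(1970, 1, 1)
    bins = defaultdict(list)
    for t, lat, lon in points:
        seconds = int((t - epoch).total_seconds())
        bin_start = seconds - seconds % bin_seconds
        bins[bin_start].append((lat, lon))

    resampled = []
    for bin_start in sorted(bins):
        lats, lons = zip(*bins[bin_start])
        resampled.append((
            epoch + timedelta(seconds=bin_start),
            sum(lats) / len(lats),
            sum(lons) / len(lons),
        ))
    return resampled


# Made-up trajectory sampled every 3 hours (not real KM1502 positions).
track = [
    (datetime(2015, 3, 20, 0, 0), 21.0, -158.0),
    (datetime(2015, 3, 20, 3, 0), 21.5, -158.5),
    (datetime(2015, 3, 20, 6, 0), 22.0, -159.0),
    (datetime(2015, 3, 20, 9, 0), 22.5, -159.5),
]

coarse_track = resample_track(track, hours=6)
for t, lat, lon in coarse_track:
    print(t.isoformat(), lat, lon)
```

A coarser trajectory means fewer colocalization lookups along the track, at the cost of smoothing out small-scale features.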

<a class='anchor' id='external'></a>


# Colocalize a custom dataset with Darwin model, and satellite data

Colocalize a custom dataset (Particulate Cobalamins observed on the KM1314 cruise) with climatological POC and prokaryote estimates provided by the Darwin model, and with dissolved iron concentration. The dataset should be in either '.xlsx' or '.csv' format with 'time', 'lat', 'lon', and 'depth' columns. 


| time        | lat           | lon  | depth | [var1] | [...] | [varn] |
| -----------   | -----------   | ----- | ----- | ----- | ----- | ----- |
| <%Y-%m-%dT%H:%M:%S>  | [-90, 90] | [-180, 180] | positive number | number | number | number |
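
Before colocalizing a custom file, it can help to check that it follows the layout in the table above. The block below is a minimal sketch using only the Python standard library; the helper `validate_rows` is hypothetical (it is not part of opedia), the sample values are made up, and treating negative depths as invalid is an assumption based on the "positive number" entry in the table.

```python
import csv
import io
from datetime import datetime

REQUIRED_COLUMNS = ("time", "lat", "lon", "depth")
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def validate_rows(rows):
    """Return a list of problems found in the rows (empty list if none)."""
    problems = []
    for i, row in enumerate(rows, start=1):
        missing = [c for c in REQUIRED_COLUMNS if c not in row]
        if missing:
            problems.append(f"row {i}: missing columns {missing}")
            continue
        try:
            datetime.strptime(row["time"], TIME_FORMAT)
        except (TypeError, ValueError):
            problems.append(f"row {i}: time {row['time']!r} does not match {TIME_FORMAT}")
        try:
            lat = float(row["lat"])
            lon = float(row["lon"])
            depth = float(row["depth"])
        except (TypeError, ValueError):
            problems.append(f"row {i}: lat, lon, and depth must be numeric")
            continue
        if not -90 <= lat <= 90:
            problems.append(f"row {i}: lat {lat} is outside [-90, 90]")
        if not -180 <= lon <= 180:
            problems.append(f"row {i}: lon {lon} is outside [-180, 180]")
        if depth < 0:  # assumption: depth is given as a non-negative number
            problems.append(f"row {i}: depth {depth} is negative")
    return problems


# Made-up sample: the second row has a malformed time and an invalid latitude.
sample_csv = """time,lat,lon,depth,var1
2013-08-11T06:00:00,22.75,-158.0,25,1.2
2013-08-11 06:00:00,95.0,-158.0,25,1.3
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
problems = validate_rows(rows)
for problem in problems:
    print(problem)
```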


<br/> 
**Notes:**<br/> 

* Darwin_Nutrient_3day is a 3-day interval version of the Darwin model with spatial resolution $\frac{1}{2}^\circ \times \frac{1}{2}^\circ$.<br/>

* Darwin_Climatology is a monthly climatology version of the Darwin model with spatial resolution $\frac{1}{2}^\circ \times \frac{1}{2}^\circ$.<br/>

* Satellite SST data set is a daily-global product with spatial resolution $\frac{1}{4}^\circ \times \frac{1}{4}^\circ$.<br/>

<br/>


**Thanks to Anitra Ingalls, Katherine Heal *et al.* (Ingalls Lab, UW) for the beautiful dataset!**  <br/> <br/> 


In [None]:
from opedia import colocalize as COL

DB = False                            # < True > if source data exists in the database. < False > if the source data set is a spreadsheet file on disk. 
source = './data/KM1314_ParticulateCobalamins_2018_06_12_vPublished.xlsx'            # the source table name (or full filename)    
temporalTolerance = 1                # colocalizer temporal tolerance (+/- days)
latTolerance = 0.3                   # colocalizer meridional tolerance (+/- degrees)
lonTolerance = 0.3                   # colocalizer zonal tolerance (+/- degrees) 
depthTolerance = 5                   # colocalizer depth tolerance (+/- meters)
tables = ['tblsst_AVHRR_OI_NRT','tblDarwin_Nutrient_3day', 'tblDarwin_Plankton_Climatology']    # list of variable table names               
variables = ['sst','FeT_darwin_3day', 'prokaryote_c01_darwin_clim']                            # list of variable names           
exportPath = './data/loaded.csv'         # path to save the colocalized data set 
    
COL.matchSource(DB, source, temporalTolerance, latTolerance, lonTolerance, depthTolerance, tables, variables, exportPath)    

from opedia import getCatalog
import pandas as pd


df = pd.read_csv('./data/loaded.csv')

print(df.head(10))
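
The tolerance parameters passed to `COL.matchSource` define a box around each source observation. The block below is a rough sketch of that matching idea, not the library's implementation: it assumes that every gridded record inside the box is averaged, that the temporal tolerance is in days, and that the helper name `colocalize` and all sample values are made up.

```python
from datetime import datetime, timedelta


def colocalize(obs, grid, time_tol_days, lat_tol, lon_tol, depth_tol):
    """Return the mean grid value within the tolerances of `obs`, or None if nothing matches."""
    matches = [
        g["value"]
        for g in grid
        if abs(g["time"] - obs["time"]) <= timedelta(days=time_tol_days)
        and abs(g["lat"] - obs["lat"]) <= lat_tol
        and abs(g["lon"] - obs["lon"]) <= lon_tol
        and abs(g["depth"] - obs["depth"]) <= depth_tol
    ]
    return sum(matches) / len(matches) if matches else None


# One made-up observation and three made-up gridded records.
observation = {"time": datetime(2013, 8, 11, 12), "lat": 22.75, "lon": -158.0, "depth": 5}
grid_records = [
    {"time": datetime(2013, 8, 11), "lat": 22.625, "lon": -158.125, "depth": 0, "value": 26.0},
    {"time": datetime(2013, 8, 11), "lat": 22.875, "lon": -157.875, "depth": 0, "value": 27.0},
    {"time": datetime(2013, 8, 11), "lat": 23.375, "lon": -158.125, "depth": 0, "value": 30.0},  # too far north
]

matched_value = colocalize(observation, grid_records,
                           time_tol_days=1, lat_tol=0.3, lon_tol=0.3, depth_tol=5)
print(matched_value)
```

Wider tolerances increase the chance of finding a match for sparse data but average over a larger region, so they should be chosen with the resolution of the target data set in mind.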


<a class="anchor" id="retrieval"></a>

<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>

<center>
<h1> Data Retrieval </h1>
<h3> Extract customized subsets of data:  calling pre-defined functions</h3> 
</center>



<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>





<a class="anchor" id="space-time"></a>

# Space-Time subset
This tutorial shows how to retrieve a generic distribution of a variable within a predefined space-time domain. You need to know the variable and table names, both of which can be found in the catalog. Data is retrieved in the form of a dataframe with time, space, and variable columns. <br/> <br/> 

In [None]:
from opedia import subset

############## set parameters ################
table = 'tblsst_AVHRR_OI_NRT'
variable = 'sst'       
dt1 = '2016-06-01'
dt2 = '2016-06-05'
lat1, lat2, lon1, lon2 = 23, 24, -160, -158  
depth1, depth2 = 0, 0
##############################################

df = subset.spaceTime(table, variable, dt1, dt2, lat1, lat2, lon1, lon2, depth1, depth2)    # retrieves a DataFrame
# df.to_csv('data.csv', index=False)      # save the retrieved data into a csv file
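
To make the meaning of the space-time domain concrete, the block below sketches the same kind of filter on a handful of in-memory records. It is only an illustration under stated assumptions: the helper `space_time_subset` is hypothetical, all bounds are assumed to be inclusive, and the records are made up rather than retrieved from the database.

```python
from datetime import date


def space_time_subset(records, dt1, dt2, lat1, lat2, lon1, lon2, depth1, depth2):
    """Keep the records that fall inside the space-time box (bounds assumed inclusive)."""
    return [
        r for r in records
        if dt1 <= r["time"] <= dt2
        and lat1 <= r["lat"] <= lat2
        and lon1 <= r["lon"] <= lon2
        and depth1 <= r["depth"] <= depth2
    ]


# Made-up records with time, space, and variable fields.
records = [
    {"time": date(2016, 6, 1), "lat": 23.125, "lon": -159.125, "depth": 0, "sst": 26.1},
    {"time": date(2016, 6, 3), "lat": 23.875, "lon": -158.125, "depth": 0, "sst": 26.4},
    {"time": date(2016, 6, 6), "lat": 23.125, "lon": -159.125, "depth": 0, "sst": 26.0},  # after dt2
    {"time": date(2016, 6, 2), "lat": 24.125, "lon": -159.125, "depth": 0, "sst": 25.9},  # north of lat2
]

selected = space_time_subset(records, date(2016, 6, 1), date(2016, 6, 5),
                             23, 24, -160, -158, 0, 0)
for r in selected:
    print(r)
```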

<a class='anchor' id='time-series-subset'> </a>

# Time series subset
This tutorial shows how to retrieve a time series of a variable within a predefined space-time domain. You need to know the variable and table names, both of which can be found in the catalog. The *timeSeries* function computes the mean and standard deviation of the variable per time period. Data is retrieved in the form of a dataframe with time, space, and variable columns. <br/> <br/> 

In [None]:
from opedia import subset

############## set parameters ################
table = 'tblsst_AVHRR_OI_NRT'
variable = 'sst'       
dt1 = '2016-06-01'
dt2 = '2016-07-01'
lat1, lat2, lon1, lon2 = 23, 24, -160, -158  
depth1, depth2 = 0, 0
##############################################

df = subset.timeSeries(table, variable, dt1, dt2, lat1, lat2, lon1, lon2, depth1, depth2)    # retrieves a DataFrame
#df.to_csv('data.csv', index=False)      # save the retrieved data into a csv file
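
The per-period mean and standard deviation described above can be sketched on a few in-memory values. This is an illustration only, not the code behind `subset.timeSeries`: the helper `time_series` is hypothetical, the values are made up, and the use of the population standard deviation (rather than the sample one) is an assumption.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev


def time_series(records):
    """Group values by time step and return (time, mean, std) tuples in time order."""
    by_time = defaultdict(list)
    for r in records:
        by_time[r["time"]].append(r["value"])
    # pstdev is the population standard deviation (an assumption of this sketch).
    return [(t, mean(values), pstdev(values)) for t, values in sorted(by_time.items())]


# Made-up values: two grid cells on the first day, one on the second.
records = [
    {"time": date(2016, 6, 1), "value": 26.0},
    {"time": date(2016, 6, 1), "value": 27.0},
    {"time": date(2016, 6, 2), "value": 27.0},
]

series = time_series(records)
for t, avg, std in series:
    print(t.isoformat(), avg, std)
```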


# Regional Map Videos (Model, Satellite)

Create videos using model and satellite data.
<br/> <br/>
**Notes:**<br/> 
* Darwin_Nutrient_3day is a 3-day interval version of the Darwin model with spatial resolution $\frac{1}{2}^\circ \times \frac{1}{2}^\circ$.<br/>

* The MODIS Aerosol Optical Depth dataset is a monthly gridded data product with spatial resolution ${1}^\circ \times {1}^\circ$.<br/>

In [None]:
from opedia import vidRegional as vREG

tables = ['tblDarwin_Nutrient_3day']    # see catalog.csv  for the complete list of tables and variable names
variables = ['FeT_darwin_3day']                            # see catalog.csv  for the complete list of tables and variable names   
startDate = '2015-01-01'
endDate = '2015-12-01'
lat1, lat2 = -30, 30
lon1, lon2 = -130, 20
depth1, depth2 = 0, 5
frameRate = 10
cmap = 'viridis'                         # https://matplotlib.org/3.1.0/tutorials/colors/colormaps.html
bounds = [None, None]                 # if bounds are None, they'll be set automatically
levels = 21

vREG.regionalVideo(tables, variables, startDate, endDate, lat1, lat2, lon1, lon2, depth1, depth2, frameRate, cmap, bounds, levels)




In [None]:
from opedia import vidRegional as vREG

tables = ['tblModis_AOD_REP']    # see catalog.csv  for the complete list of tables and variable names
variables = ['AOD']                            # see catalog.csv  for the complete list of tables and variable names   
startDate = '2015-01-01'
endDate = '2015-12-01'
lat1, lat2 = -30, 30
lon1, lon2 = -120, 30
depth1, depth2 = 0, 5
frameRate = 5
cmap = 'viridis'                         # https://matplotlib.org/3.1.0/tutorials/colors/colormaps.html
bounds = [None, None]                 # if bounds are None, they'll be set automatically
levels = 21

vREG.regionalVideo(tables, variables, startDate, endDate, lat1, lat2, lon1, lon2, depth1, depth2, frameRate, cmap, bounds, levels)
