# Plotting Exact Amplicon Sequence Variants (16S) Along an Atlantic Latitudinal Transect
### Query and aggregate by taxonomy level, clustering threshold, and size fraction

The example below retrieves the `topN` most abundant sequenced organisms along the cruise track. Their relative abundances can be aggregated and visualized by taxonomic level, clustering level, and size fraction. The cruise, 'ANT28-5', is an Atlantic latitudinal transect. <br/> <br/>

**Thanks to Irene Wagner-Döbler and Meinhard Simon's research groups for making this beautiful dataset publicly available!**  <br/> <br/> 

In [None]:
from opedia import esv

############## set parameters ################
# plot only the topN most abundant organisms
topN = 10
# aggregate organisms by taxonomic level (index 5 selects 'genus')
tax = ['domain', 'phylum', 'class', 'order', 'family', 'genus', 'species'][5]
# depth range to query; depth2 = depth1 selects a single depth
depth1 = 20
depth2 = depth1
cruise_name = 'ANT28-5'
# minimum similarity (%) for sequences to be clustered together; 100 = ASV level
cluster_level = [89, 92, 96, 97, 98, 99, 100][0]

iSizeFrac = 0   # Free-living fraction, 0.2 - 3 µm
#iSizeFrac = 1  # Small particle-associated fraction, 3 - 8 µm
#iSizeFrac = 2  # Large particle-associated fraction, > 8 µm

size_frac_lower = [0.2, 3, 8][iSizeFrac]
size_frac_upper = [3, 8, None][iSizeFrac]
##############################################

esv.plotESVs(topN, tax, depth1, depth2, cruise_name, cluster_level, size_frac_lower, size_frac_upper)
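
The aggregation described above (sum by taxonomic level, normalize per sample, keep the most abundant taxa) can be sketched with pandas on a toy table. This is only an illustration of the idea, not the `opedia` implementation: the column names (`lat`, `genus`, `reads`), the values, and the helper `top_n_relative_abundance` are all hypothetical, and ranking taxa by their mean relative abundance is an assumption.

```python
import pandas as pd

# Toy table: one row per (station, genus) read count.
# All column names and values are hypothetical, for illustration only.
counts = pd.DataFrame({
    "lat":   [-40, -40, -40, 0, 0, 0, 40, 40, 40],
    "genus": ["A", "B", "C"] * 3,
    "reads": [60, 30, 10, 20, 50, 30, 15, 5, 80],
})

def top_n_relative_abundance(df, tax_col, n):
    """Per-station relative abundance of the n most abundant taxa overall."""
    # Sum reads per station and taxon (collapses any finer taxonomic levels).
    agg = df.groupby(["lat", tax_col], as_index=False)["reads"].sum()
    # Normalize by the station total so values are fractions of that sample.
    agg["rel_abundance"] = agg["reads"] / agg.groupby("lat")["reads"].transform("sum")
    # Rank taxa by mean relative abundance across stations (an assumed criterion).
    top = agg.groupby(tax_col)["rel_abundance"].mean().nlargest(n).index
    return agg[agg[tax_col].isin(top)].reset_index(drop=True)

result = top_n_relative_abundance(counts, "genus", n=2)
print(result)
```

Changing `tax_col` to a coarser level (for example a `phylum` column) would merge more organisms into each group, which mirrors the role of the `tax` parameter above.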

<br/><br/>
# Colocalize with Model and Satellite

Here, the retrieved relative-abundance trends are colocalized with other datasets, in this case the Darwin model. The results are saved to a .csv file in the ./data directory. 

In [None]:
from opedia import colocalize as COL

DB = False                           # < True > if the source data exists in the database. < False > if the source data set is a spreadsheet file on disk. 
source = './data/esv.csv'            # the source table name (or full filename)    
temporalTolerance = 3                # colocalizer temporal tolerance (+/- days)
latTolerance = 0.3                   # colocalizer meridional tolerance (+/- degrees)
lonTolerance = 0.3                   # colocalizer zonal tolerance (+/- degrees) 
depthTolerance = 5                   # colocalizer depth tolerance (+/- meters)
tables = ['tblDarwin_Plankton_Climatology', 'tblDarwin_Plankton_Climatology', 'tblDarwin_Plankton_Climatology']    # list of variable table names               
variables = ['prokaryote_c01_darwin_clim', 'prokaryote_c02_darwin_clim', 'cocco_c05_darwin_clim']                  # list of variable names           
exportPath = './data/loaded.csv'     # path to save the colocalized data set 
    
COL.matchSource(DB, source, temporalTolerance, latTolerance, lonTolerance, depthTolerance, tables, variables, exportPath)    
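
To make the role of the tolerance parameters concrete, the sketch below matches each source sample against a target table by averaging all target values that fall within a +/- window in time, latitude, longitude, and depth. It is a minimal illustration under stated assumptions, not how `COL.matchSource` is implemented: the tables, their column names, and the helper `match_within_tolerance` are hypothetical, and averaging the matches is an assumed choice.

```python
import pandas as pd

# Hypothetical source samples and target field; all values are made up.
source_df = pd.DataFrame({
    "time": pd.to_datetime(["2012-04-10", "2012-04-20"]),
    "lat": [-30.0, 10.0],
    "lon": [-20.0, -25.0],
    "depth": [20.0, 20.0],
})
target_df = pd.DataFrame({
    "time": pd.to_datetime(["2012-04-11", "2012-04-12", "2012-04-30", "2012-04-20"]),
    "lat": [-30.1, -29.9, -30.0, 10.2],
    "lon": [-20.2, -19.9, -20.0, -25.1],
    "depth": [18.0, 22.0, 20.0, 40.0],
    "value": [1.0, 3.0, 100.0, 7.0],
})

def match_within_tolerance(src, tgt, var, time_tol_days, lat_tol, lon_tol, depth_tol):
    """For each source row, average `var` over target rows inside the tolerance window."""
    means = []
    for row in src.itertuples(index=False):
        # A target row matches only if it is inside the window in all four dimensions.
        mask = (
            ((tgt["time"] - row.time).abs() <= pd.Timedelta(days=time_tol_days))
            & ((tgt["lat"] - row.lat).abs() <= lat_tol)
            & ((tgt["lon"] - row.lon).abs() <= lon_tol)
            & ((tgt["depth"] - row.depth).abs() <= depth_tol)
        )
        # The mean of an empty selection is NaN, which marks "no match found".
        means.append(tgt.loc[mask, var].mean())
    out = src.copy()
    out[var] = means
    return out

matched = match_within_tolerance(source_df, target_df, "value",
                                 time_tol_days=3, lat_tol=0.3, lon_tol=0.3, depth_tol=5)
print(matched)
```

In this toy case the second sample gets no match because the only nearby target point lies 20 m deeper than the sample, outside the 5 m depth tolerance. Widening a tolerance trades spatial or temporal precision for more matches.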