![Simons CMAP logo](https://cmap.readthedocs.io/en/latest/_static/CMAP_logos/CMAP_logo_High_Res.png)
# In this notebook we will download SeaFlow and environmental data using [Simons CMAP](https://simonscmap.com).

Below are the datasets that will be used.
The end goal is to create a dataset that has the variables listed below.



#### SeaFlow (tblSeaFlow_v1_5):
- time
- lat
- lon
- biomass

In this notebook we will also use <u>depth and cruise</u> to match with other available CMAP dataframes.


#### Mercator-Pisces Biogeochemistry Daily Forecast (cl1) and Mercator-Pisces Biogeochemistry and Weekly Forecast:<br>(we use two models since they together match the temporal bounds of available CMAP data)
- NO3
- PO4
- Fe
- Si
- chl

#### From SeaFlow's matching cruises we will grab:
1. Temperature
2. Salinity

#### NASA's MODIS (Moderate Resolution Imaging Spectroradiometer):
1. Daily PAR (Photosynthetically Available Radiation)

Importing necessary libraries:

In [3]:
import pandas as pd
import numpy as np
import os
from datetime import datetime
import pycmap

### Set your working directory to the cloned [GitHub repository](https://github.com/CristianSwift/Seaflow-Machine-Learning).

In [4]:
# Setting a working Directory
os.chdir('/Users/cristianswift/Desktop/armbrust-lab/Seaflow-Machine-Learning')


## First we need to set our API. You can create an API key [here](https://simonscmap.com/apikeymanagement) if you don't have one.
Make sure to **save it somewhere**!

In [5]:
# Replace the placeholder with your own API key; avoid leaving a real key in a shared notebook.
api = pycmap.API(token='<YOUR_API_KEY>')
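
A key typed directly into a notebook is easy to leak when the notebook is shared. One alternative is to read it from an environment variable. The sketch below assumes the key has been stored that way; the variable name `CMAP_API_KEY` and the helper `load_cmap_token` are hypothetical and not part of pycmap.

```python
import os


def load_cmap_token(env_var="CMAP_API_KEY"):
    """Return the API key stored in the given environment variable.

    The helper and the default variable name are illustrative
    assumptions; pycmap does not define either of them.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your CMAP API key.")
    return token


# Usage (uncomment once the variable is set and pycmap is imported):
# api = pycmap.API(token=load_cmap_token())
```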

### Using our API we can get the complete SeaFlow dataset available on Simons CMAP

Click [here](https://cmap.readthedocs.io/en/latest/user_guide/API_ref/pycmap_api/data_retrieval/pycmap_retrieve_dataset.html?highlight=get_dataset) for additional support on how to retrieve datasets:
```python
api.get_dataset("table_name")
```
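
Since downloading a full dataset can take a while, it may help to cache the result on disk and reuse it on later runs. This is a minimal sketch, assuming the dataset fits comfortably in a CSV file. `load_or_download` and its arguments are hypothetical; `fetch` stands in for a call such as `lambda: api.get_dataset('tblSeaFlow_v1_5')`.

```python
from pathlib import Path

import pandas as pd


def load_or_download(csv_path, fetch):
    """Read csv_path if it exists; otherwise call fetch() and cache the result.

    fetch is any zero-argument callable that returns a pandas DataFrame.
    """
    csv_path = Path(csv_path)
    if csv_path.exists():
        return pd.read_csv(csv_path)
    df = fetch()
    # Create the target folder if needed, then store the download for next time.
    csv_path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(csv_path, index=False)
    return df
```

Deleting the cached file forces a fresh download on the next call.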

Retrieving SeaFlow data

Now we will drop unnecessary variables and keep only the ones we need, to optimize our download times.

In [17]:
# Download the full SeaFlow table from CMAP
seaflow_cmap = api.get_dataset('tblSeaFlow_v1_5')

# Keep only the columns needed later
seaflow = seaflow_cmap[['time', 'lat', 'lon', 'depth', 'cruise', 'biomass_prochloro',
                        'biomass_synecho', 'biomass_picoeuk', 'biomass_croco']]

# Save the full download (all columns) to disk
seaflow_cmap.to_csv('data/01-original/Seaflow_all_CMAP.csv', index=False)

seaflow_cmap

Unnamed: 0,time,lat,lon,depth,cruise,abundance_prochloro,abundance_synecho,abundance_picoeuk,abundance_croco,diam_prochloro,...,diam_picoeuk,diam_croco,Qc_prochloro,Qc_synecho,Qc_picoeuk,Qc_croco,biomass_prochloro,biomass_synecho,biomass_picoeuk,biomass_croco
0,2010-05-04T23:13:08,49.942400,-136.309100,5,TN248,,9.450561,42.470594,0.000000,,...,1.156123,,,0.139359,0.217537,,,1.317022,9.238921,0.000000
1,2010-05-04T23:16:08,49.942400,-136.309100,5,TN248,,9.719664,44.614850,0.045525,,...,1.150068,2.736205,,0.135899,0.214609,2.379907,,1.320895,9.574742,0.108346
2,2010-05-04T23:19:08,49.942400,-136.309100,5,TN248,,9.955824,44.106351,0.000000,,...,1.147056,,,0.139359,0.213163,,,1.387435,9.401822,0.000000
3,2010-05-04T23:22:08,49.942400,-136.309100,5,TN248,,11.054114,43.395945,0.022792,,...,1.142564,4.366571,,0.134768,0.211015,6.707313,,1.489737,9.157209,0.152873
4,2010-05-04T23:28:08,49.942400,-136.309100,5,TN248,,9.504259,40.182275,0.045584,,...,1.166330,3.305096,,0.139359,0.222525,3.287736,,1.324505,8.941571,0.149868
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
190720,2021-12-30T01:27:09,33.008092,-118.472350,5,TN398,9.051235,31.529153,11.533800,0.000000,0.844581,...,1.434694,,0.096765,0.283582,0.379686,,0.875840,8.941116,4.379228,0.000000
190721,2021-12-30T01:33:10,33.019267,-118.481058,5,TN398,8.719288,32.165627,11.499984,0.000000,0.842596,...,1.505780,,0.096179,0.282935,0.430138,,0.838613,9.100776,4.946578,0.000000
190722,2021-12-30T01:36:10,33.024867,-118.485458,5,TN398,8.815814,31.440368,11.155153,0.017162,0.848569,...,1.489154,2.218431,0.097948,0.283582,0.417991,1.168907,0.863490,8.915938,4.662748,0.020061
190723,2021-12-30T01:39:10,33.030467,-118.489850,5,TN398,8.465514,29.952781,10.625084,0.000000,0.843921,...,1.374284,,0.096569,0.280361,0.339796,,0.817509,8.397587,3.610357,0.000000


## For now we will use <u>one tenth</u> of the available data to make matching with a climatological model through CMAP easier.


In [23]:
seaflow_cmap_10 = (seaflow_cmap
                   .sample(frac=0.1)
                   .reset_index(drop=True)
                  )
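
`DataFrame.sample` draws a different random subset each time the cell runs. If a repeatable subset is wanted, pandas accepts a `random_state` seed. Here is a small sketch on a toy frame; the seed value and the toy data are arbitrary assumptions.

```python
import pandas as pd

# Toy stand-in for the SeaFlow frame.
toy = pd.DataFrame({"lat": range(100), "lon": range(100, 200)})

# The same seed gives the same 10% subset on every run.
first = toy.sample(frac=0.1, random_state=42).reset_index(drop=True)
second = toy.sample(frac=0.1, random_state=42).reset_index(drop=True)

print(len(first))            # number of sampled rows
print(first.equals(second))  # whether the two draws are identical
```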

In [24]:
seaflow_cmap_10

Unnamed: 0,time,lat,lon,depth,cruise,abundance_prochloro,abundance_synecho,abundance_picoeuk,abundance_croco,diam_prochloro,...,CMAP_PO4_tblPisces_Forecast,CMAP_Fe_tblPisces_Forecast,CMAP_Si_tblPisces_Forecast,CMAP_chl_tblPisces_Forecast,CMAP_NO3_tblPisces_Forecast_cl1,CMAP_PO4_tblPisces_Forecast_cl1,CMAP_Fe_tblPisces_Forecast_cl1,CMAP_Si_tblPisces_Forecast_cl1,CMAP_chl_tblPisces_Forecast_cl1,CMAP_PAR_tblModis_PAR
0,2016-05-02T21:57:21,26.32350,-158.140100,5,KOK1606,215.098704,10.391242,5.957194,,0.573971,...,,,,,,,,,,
1,2013-04-05T11:59:25,-23.31220,-33.985600,5,KN210-04,143.446382,0.777799,1.623232,0.163450,0.500019,...,,,,,,,,,,
2,2011-09-21T11:05:49,28.18310,127.671000,5,Tokyo_3,162.219310,10.452729,10.221730,0.000000,0.545370,...,,,,,,,,,,
3,2013-04-01T04:52:44,-32.17910,-41.429700,5,KN210-04,129.800702,2.794369,1.532396,0.011268,0.498374,...,,,,,,,,,,
4,2020-09-28T11:26:59,22.73400,-157.930600,5,KM2011,162.073777,0.434709,0.583146,0.034459,0.511737,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
19067,2021-12-03T11:42:52,1.40045,-139.999483,5,TN397,90.391578,9.047072,20.203861,0.000000,0.620876,...,,,,,,,,,,
19068,2011-09-26T06:07:45,42.61430,170.606100,5,Tokyo_3,23.818529,275.723400,32.323506,0.000000,0.571049,...,,,,,,,,,,
19069,2017-07-08T03:09:16,24.94170,-158.507600,5,KM1709,148.589788,0.792314,0.598842,0.073704,0.557848,...,,,,,,,,,,
19070,2017-09-15T11:52:13,46.05450,-157.923700,5,KM1713,0.000000,3.950412,27.580063,0.072819,,...,,,,,,,,,,


## First we will match data from our SeaFlow data frame to the Pisces climatological models listed below

- <u>Mercator-Pisces Biogeochemistry Daily Forecast (cl1)</u> (dataset: **tblPisces_Forecast_cl1**)
    - temporal coverage: **2020-11-01 to 2023-06-16**

- <u>Mercator-Pisces Biogeochemistry Daily Forecast</u> (dataset: **tblPisces_Forecast**)
    - temporal coverage: **2019-01-01 to 2021-05-07**

- <u>Mercator-Pisces Biogeochemistry and Weekly Forecast</u> (dataset: **tblPisces_NRT**)
    - temporal coverage: **2011-12-31 to 2019-04-27**
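
The coverage windows above decide which table can supply values for a given SeaFlow observation. As a rough sketch, the lookup below returns the tables whose window includes a date. The dictionary `PISCES_COVERAGE` and the helper `tables_covering` are hypothetical names, and the bounds are copied from the list above and treated as inclusive.

```python
from datetime import date

# Temporal coverage as listed above (assumed inclusive on both ends).
PISCES_COVERAGE = {
    "tblPisces_Forecast_cl1": (date(2020, 11, 1), date(2023, 6, 16)),
    "tblPisces_Forecast": (date(2019, 1, 1), date(2021, 5, 7)),
    "tblPisces_NRT": (date(2011, 12, 31), date(2019, 4, 27)),
}


def tables_covering(day):
    """Return the names of the Pisces tables whose coverage includes day."""
    return [name for name, (start, end) in PISCES_COVERAGE.items() if start <= day <= end]


print(tables_covering(date(2016, 5, 2)))  # falls inside the tblPisces_NRT window only
print(tables_covering(date(2010, 5, 4)))  # earlier than every listed window
```

SeaFlow rows dated before 2011-12-31, such as the 2010 rows from cruise TN248, fall outside all three windows.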


#### We will use [pycmap.Sample](https://colab.research.google.com/github/simonscmap/pycmap/blob/master/docs/Sampling.ipynb) in order to match our CMAP SeaFlow data with CMAP Pisces Climatological Data

Tolerances are set as:

- Temporal tolerance: 4 days
- Meridional (latitude) tolerance: 0.25°
- Zonal (longitude) tolerance: 0.25°
- Depth tolerance: 5 m
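
To make the tolerances concrete, the sketch below checks whether a candidate model grid point falls inside the window around one SeaFlow observation. It only illustrates the idea of a tolerance window and is not pycmap's internal matching code. `within_tolerance` is a hypothetical helper, and the candidate points are made up.

```python
from datetime import datetime, timedelta


def within_tolerance(obs, cand, tolerances):
    """Check whether cand lies inside the tolerance window around obs.

    obs and cand are dicts with keys time, lat, lon, depth.
    tolerances is [days, degrees latitude, degrees longitude, metres].
    Longitude wrap-around at +/-180 degrees is ignored in this sketch.
    """
    t_tol, lat_tol, lon_tol, depth_tol = tolerances
    return (
        abs(cand["time"] - obs["time"]) <= timedelta(days=t_tol)
        and abs(cand["lat"] - obs["lat"]) <= lat_tol
        and abs(cand["lon"] - obs["lon"]) <= lon_tol
        and abs(cand["depth"] - obs["depth"]) <= depth_tol
    )


obs = {"time": datetime(2020, 9, 28, 11, 26, 59), "lat": 22.734, "lon": -157.9306, "depth": 5}
near = {"time": datetime(2020, 9, 30), "lat": 22.75, "lon": -158.0, "depth": 0.5}
far = {"time": datetime(2020, 10, 10), "lat": 22.75, "lon": -158.0, "depth": 0.5}

print(within_tolerance(obs, near, [4, 0.25, 0.25, 5]))  # within 4 days and 0.25 degrees
print(within_tolerance(obs, far, [4, 0.25, 0.25, 5]))   # more than 4 days away
```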

In [None]:
targets = {
    # Tolerances are [temporal (days), latitude (deg), longitude (deg), depth (m)].

    # Biogeochemical numerical near-real-time model (disabled for this run)
    # "tblPisces_NRT": {
    #     "variables": ["NO3", "PO4", "Fe", "Si", "chl"],
    #     "tolerances": [4, 0.25, 0.25, 5],
    # },

    # Mercator-Pisces Biogeochemistry Daily Forecast
    "tblPisces_Forecast": {
        "variables": ["NO3", "PO4", "Fe", "Si", "chl"],
        "tolerances": [3, 0.25, 0.25, 5],
    },

    # Remaining targets (disabled for this run)
    # "tblPisces_Forecast_cl1": {
    #     "variables": ["NO3", "PO4", "Fe", "Si", "chl"],
    #     "tolerances": [0, 0.25, 0.25, 5],
    # },
    # "tblModis_PAR": {
    #     "variables": ["PAR"],
    #     "tolerances": [0, 0.25, 0.25, 5],
    # },
}


seaflow_pisces_p_forecast = pycmap.Sample(
              source=seaflow_cmap_10, 
              targets=targets, 
              replaceWithMonthlyClimatolog=False
             )


Gathering metadata .... 
Sampling starts
Sampling tblPisces_Forecast ... 4977 / 19072                                                  

In [None]:
seaflow_pisces_p_forecast.to_csv('/Users/cristianswift/Desktop/Spring-Quarter-2022-2023/SeniorThesis/data/modified/seaflow_pisces_p_forecast.csv')
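
The `CMAP_` columns shown in the sampled output earlier contain missing values, so it can be useful to check how many rows actually received a match before moving on. This is a minimal sketch on a toy frame; `match_coverage` is a hypothetical helper and the values are made up.

```python
import numpy as np
import pandas as pd


def match_coverage(df, prefix="CMAP_"):
    """Return the fraction of non-missing values in each column starting with prefix."""
    matched_cols = [col for col in df.columns if col.startswith(prefix)]
    return df[matched_cols].notna().mean()


# Toy frame mimicking the column naming seen in the sampled output.
toy = pd.DataFrame({
    "lat": [26.3, -23.3, 28.2, -32.2],
    "CMAP_PO4_tblPisces_Forecast": [0.1, np.nan, np.nan, 0.3],
    "CMAP_PAR_tblModis_PAR": [np.nan, np.nan, np.nan, np.nan],
})

print(match_coverage(toy))
```

On the real result this could be called as `match_coverage(seaflow_pisces_p_forecast)`.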