# Comparison of Argo and GO-SHIP data

In this example notebook, profile data from a selected GO-SHIP cruise line are obtained, along with nearby Argo profiles that fall within user-provided time and space constraints around the GO-SHIP profiles. The profile data are then converted to a more user-friendly xarray format, plotted as profiles and sections, interpolated onto a regular grid for comparison, and plotted as differences on that regular grid.

## Task 0: Import necessary packages and set constants for upcoming functions.

In [None]:
# data processing
import numpy as np
import pandas as pd
import xarray as xr
from time import sleep

# data visualization
%matplotlib inline
import matplotlib.pyplot as plt

# API convenience functions
from utilities_NSF_EC2022 import get_data_for_timeRange

import warnings
warnings.filterwarnings('ignore')

# set constants
URL_PREFIX = 'https://argovis-api.colorado.edu'
API_KEY = ''

## Task 1: Download data from GO-SHIP line
With the function *get_goship_line*, users can provide the name of a GO-SHIP line to download all historical profiles from that line.

In [None]:
# define the function
def get_goship_line(line_name, startDate='1900-01-01T00:00:00Z', endDate='2022-05-01T00:00:00Z', dt_tag='365d', url=URL_PREFIX, api_key=API_KEY):
    df = get_data_for_timeRange(startDate, endDate, url_prefix=url+'/profiles?', 
                                source='cchdo_go-ship', woceline=line_name, 
                                myAPIkey=api_key, dt_tag=dt_tag)
    return df

# get GO-SHIP data from line A22
a22 = get_goship_line('A22')
coords = [c['coordinates'] for c in a22.geolocation]
time = a22.timestamp.values

## Task 2: Download data from surrounding Argo profiles
With the function *get_argo_along_line*, users can download data from Argo profiles that are within given time and space constraints from a given set of profiles (in this case the GO-SHIP data we downloaded in Task 1).

In [3]:
# define the function
# TODO: consider adding a time-independent version of this function;
# a very large timedelta (e.g. 1e5) would approximate one with the current code
def get_argo_along_line(time, coords, radius=50, timedelta=30, dt_tag='365d', url=URL_PREFIX, api_key=API_KEY):
    df_all = pd.DataFrame()
    for t, c in zip(time, coords):
        sleep(.2)
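        # timedelta is assumed to be the full window width in days; without an explicit
        # unit, pd.Timedelta would interpret the number as nanoseconds
        startDate = (pd.Timestamp(t) - pd.Timedelta(days=timedelta/2)).strftime('%Y-%m-%dT%H:%M:%SZ')
        endDate   = (pd.Timestamp(t) + pd.Timedelta(days=timedelta/2)).strftime('%Y-%m-%dT%H:%M:%SZ')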
        center    = f'{c[0]},{c[1]}'
        df = get_data_for_timeRange(startDate, endDate, url_prefix=url+'/profiles?',
            center=center, radius_km=f'{radius}', source='argo_core', data='pres,temp,psal',
            myAPIkey=api_key, dt_tag=dt_tag, writeFlag=False)
        # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
        df_all = pd.concat([df_all, df])
    
    return df_all

# get argo data along line A22
argo_a22 = get_argo_along_line(time, coords)
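
The time window built around each GO-SHIP profile can be sketched in isolation as follows. This is a minimal illustration, assuming the window width is given in days; the helper name `time_window` and its `window_days` parameter are hypothetical and not part of the notebook's utilities. Passing `days=` explicitly matters because `pd.Timedelta` treats a bare number as nanoseconds.

```python
import pandas as pd

def time_window(t, window_days=30):
    """Return ISO-8601 start/end strings for a window centered on t.

    Hypothetical helper for illustration; window_days is assumed to be
    the full width of the window in days.
    """
    half = pd.Timedelta(days=window_days / 2)
    fmt = '%Y-%m-%dT%H:%M:%SZ'
    center = pd.Timestamp(t)
    return (center - half).strftime(fmt), (center + half).strftime(fmt)

# a 30-day window centered on 16 January 2020
start, end = time_window('2020-01-16T00:00:00Z', window_days=30)
print(start, end)
```

With a 30-day width, the window extends 15 days on either side of the GO-SHIP profile time.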

## Task 3: Convert data to xarray
The function *json_dataframe_to_dataframe* takes the data downloaded in the prior tasks, which are stored as JSON, and flattens them into a DataFrame with one row per measurement level. The pandas method *to_xarray* then converts that DataFrame to an xarray Dataset.

In [4]:
# convert a DataFrame full of JSON data points to a more usable form
def json_dataframe_to_dataframe(df):
    out = pd.DataFrame()
    for i in range(df.shape[0]):
        # get the argo data
        data_dict = dict()
        data = df.data.iloc[i]
        # repeat location and time data so they have the same length as the data arrays
        N_levels = len(data)
        data_dict['wmo'] = N_levels*[int(df._id.iloc[i].split('_')[0])]
        data_dict['cycle_number'] = N_levels*[df.cycle_number.iloc[i]]
        data_dict['time'] = N_levels*[df.timestamp.iloc[i]]
        data_dict['longitude'] = N_levels*[df.geolocation.iloc[i]['coordinates'][0]]
        data_dict['latitude'] = N_levels*[df.geolocation.iloc[i]['coordinates'][1]]
        # extract data from JSON dict
        for k in df.data_keys.iloc[i]:
            data_dict[k] = [d[k] for d in data]

        # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
        out = pd.concat([out, pd.DataFrame(data_dict)])
    
    return out

df = json_dataframe_to_dataframe(argo_a22)
ds = df.to_xarray()
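
To make the flattening step concrete, here is a small sketch on a single toy profile. The dictionary below only mimics the fields used above (`data_keys`, `data`, `geolocation`); its values are invented for illustration and do not come from the API.

```python
import pandas as pd

# Toy stand-in for one downloaded profile; all values are invented.
profile = {
    'data_keys': ['pres', 'temp', 'psal'],
    'data': [
        {'pres': 5.0, 'temp': 25.1, 'psal': 36.2},
        {'pres': 10.0, 'temp': 24.9, 'psal': 36.3},
    ],
    'geolocation': {'coordinates': [-66.0, 20.5]},  # [longitude, latitude]
}

n_levels = len(profile['data'])
lon, lat = profile['geolocation']['coordinates']

# repeat the per-profile metadata once per measurement level
columns = {
    'longitude': n_levels * [lon],
    'latitude': n_levels * [lat],
}
# pull each measured variable out of the list of per-level dicts
for key in profile['data_keys']:
    columns[key] = [level[key] for level in profile['data']]

flat = pd.DataFrame(columns)
print(flat)
```

Each row of `flat` is one measurement level, with the profile's position repeated alongside the measured variables.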

## Task 4: Plot GO-SHIP and Argo data
First, we’ll plot data from each source as individual profiles(?), then as sections across latitude.

## Task 5: Interpolate data to regular depth levels
To compare GO-SHIP data with nearby Argo profiles, we’ll need the data to be on consistent depth levels. With the function *function name*, we can interpolate profiles onto a regular 2-dbar(?) by 0.1 degree(?) grid.
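
As a rough sketch of the vertical part of this step, the example below linearly interpolates a single profile onto a regular pressure grid with `np.interp`. The function name `interpolate_to_regular_pressure`, the default 2-dbar spacing, and the sample values are assumptions for illustration; the horizontal (latitude) gridding is not covered here, and the notebook's eventual implementation may differ.

```python
import numpy as np

def interpolate_to_regular_pressure(pres, values, dp=2.0):
    """Linearly interpolate one profile onto a regular pressure grid.

    Hypothetical helper: dp is the assumed grid spacing in dbar.
    """
    pres = np.asarray(pres, dtype=float)
    values = np.asarray(values, dtype=float)
    # np.interp requires increasing x values, so sort by pressure first
    order = np.argsort(pres)
    pres, values = pres[order], values[order]
    # build a grid of multiples of dp that stays inside the observed range
    start = np.ceil(pres[0] / dp) * dp
    stop = np.floor(pres[-1] / dp) * dp
    grid = np.arange(start, stop + dp / 2, dp)
    # return NaN rather than extrapolating outside the observed range
    return grid, np.interp(grid, pres, values, left=np.nan, right=np.nan)

# invented sample profile: pressure in dbar, temperature in degrees C
pres = [3.0, 7.0, 11.0]
temp = [20.0, 18.0, 16.0]
grid, temp_i = interpolate_to_regular_pressure(pres, temp)
print(grid)
print(temp_i)
```

Restricting the grid to the observed pressure range avoids extrapolating beyond the shallowest and deepest measurements, which matters when Argo and GO-SHIP profiles reach different depths.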