# Introduction
oneargopy is a library designed to simplify access to Argo float data. This notebook briefly explains some of its functionality and gives examples.

# Setup


In [None]:
pip install oneargopy

In [None]:
from oneargopy.OneArgo import Argo

# Constructor

The Argo constructor downloads the index files from one of the Argo Global Data Assembly Centers (GDACs) and stores them in the directories defined in the DownloadSettings class. It then builds dataframes from the argo_synthetic-profile_index.txt and ar_index_global_prof.txt files for use in class function calls. Two of the dataframes mirror the index files; the third is a two-column frame that lists the float IDs and whether each float is a BGC float.

There are two ways to call the Argo class constructor: with or without an argument (the path to the configuration JSON file).

Argo uses a few settings classes internally to determine where files are stored and which hosts to pull data from. Calling the constructor without an argument uses the default settings; passing a user configuration file overrides them.

## User Configuration: Load settings from a json file

The standard way to initialize OneArgoPy is to pass the path to a JSON configuration file to the constructor. An example configuration file can be found in the [oneargopy GitHub repository](https://github.com/NOAA-PMEL/OneArgoPy/blob/main/argo_config.json).

The settings can be adjusted to the user's preference in this configuration file. This must be done before calling the Argo constructor.
The file name used here refers to the sample JSON file that is part of the GitHub repository.

For use in Google Colab, base_dir should be something like "/content/folder", because 'content' is the name of the base folder in the files section of Google Colab.

*NOTE*: When you call the constructor without arguments, OneArgoPy uses default values for these settings and the folders needed to store the Argo data files will be created in the same directory that the Argo class is in. If you used the standard installation with pip, this will be in the Python repository path, which is probably not ideal.

In [None]:
# Call constructor and initialize the library. 
# This includes downloading the latest versions of the index files if necessary.
argo = Argo('argo_config.json')

# Functions

oneargopy currently has four public functions to help scientists access and analyze Argo float data.

## select_profiles()

select_profiles is a public function of the Argo class that returns a dictionary of float IDs (keys) and profile lists (values) that match the passed criteria.

The profiles can be selected based on geographic limits, date limits, specific float IDs, ocean basin, and float type ('bgc' for biogeochemical floats, 'phys' for floats without biogeochemical sensors, or 'all' for all floats, which is the default).

The selection can be further modified with the 'outside' parameter, which defaults to None and can be set to 'time', 'space', or 'both'. By default (None), only profiles that satisfy both the temporal and the spatial constraints are returned. Specify 'time' to also keep profiles outside the temporal constraints, 'space' to also keep profiles outside the spatial constraints, or 'both' to keep all profiles of every float that has at least one profile matching both constraints simultaneously.
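
The effect of the 'outside' parameter can be illustrated with a toy example that does not use oneargopy at all. The sketch below is one reading of the semantics described above, not the library's implementation; the `Profile` tuple, the `select` helper, and the sample values are all hypothetical.

```python
from collections import namedtuple

# Hypothetical stand-in for one profile: which float it belongs to, and whether
# it falls inside the requested time window and the requested region.
Profile = namedtuple("Profile", "float_id cycle in_time in_space")

def select(profiles, outside=None):
    """Toy restatement of the 'outside' semantics (not the library's code)."""
    # A float qualifies only if at least one profile matches both constraints.
    qualifying = {p.float_id for p in profiles if p.in_time and p.in_space}
    keep_time = outside in ("time", "both")    # relax the temporal constraint
    keep_space = outside in ("space", "both")  # relax the spatial constraint
    result = {}
    for p in profiles:
        if p.float_id not in qualifying:
            continue
        if (p.in_time or keep_time) and (p.in_space or keep_space):
            result.setdefault(p.float_id, []).append(p.cycle)
    return result

toy = [
    Profile(1, 1, True, True),    # matches both constraints
    Profile(1, 2, False, True),   # in the region, outside the time window
    Profile(1, 3, True, False),   # in the time window, outside the region
    Profile(1, 4, False, False),  # matches neither constraint
    Profile(2, 1, False, True),   # float 2 never matches both, so it is dropped
]

for mode in (None, "time", "space", "both"):
    print(mode, select(toy, outside=mode))
```

In this toy data set, only the setting 'both' returns all four profiles of float 1, and float 2 is never returned because none of its profiles matches both constraints.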

The longitude and latitude limits can be entered as two-element lists, in which case they are interpreted as minimum and maximum limits that form a rectangle, or as longer lists (of matching lengths), in which case each pair of longitude and latitude values corresponds to a vertex of a polygon. The longitude limits can be given in any 360-degree range that encloses all the desired longitude values, e.g., lon_lim=[20, 370] will include all profiles between 20E and 360E as well as 0 to 10E.
In the two-element format, it is possible to enter only longitudes or only latitudes. For instance, specify lat_lim=[-90, -60] to restrict your search to the Southern Ocean.
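
One way to picture the 360-degree rule for longitude limits is to shift every longitude into the window that starts at the western limit. The helper below is a hypothetical illustration in plain Python, not oneargopy's implementation; the name `lon_in_range` and its error handling are assumptions.

```python
def lon_in_range(lon, lon_lim):
    """Return True if longitude `lon` (degrees east) lies within lon_lim.

    lon_lim is a two-element [west, east] list that spans at most
    360 degrees, e.g. [20, 370] or [-127, -115].
    """
    west, east = lon_lim
    if east - west > 360:
        raise ValueError("longitude range must not span more than 360 degrees")
    # Shift the longitude into the 360-degree window that starts at `west`.
    shifted = (lon - west) % 360 + west
    return shifted <= east

print(lon_in_range(5, [20, 370]))       # 5E is the same meridian as 365E
print(lon_in_range(15, [20, 370]))      # 15E lies in the gap between 10E and 20E
print(lon_in_range(240, [-127, -115]))  # 240E is the same meridian as 120W
```

With this convention, a longitude given as 240 and one given as -120 are treated identically, which is why the limits can be stated in whichever 360-degree range is most convenient.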

## Example 1: Biogeochemical floats along the US West Coast
This example selects biogeochemical floats along the US West Coast with profiles from 2021 until now.

In [None]:
profiles_uswc = argo.select_profiles(lon_lim=[-127,-115], lat_lim=[32.5,48.5], 
                                     start_date='2021-01-01', type='bgc')

## Example 2: All profiles of a specified float
If no geographic or date range is given, OneArgoPy will select all floats and profiles that match the other criteria. If no criteria are specified at all ("select_profiles()"), all floats and profiles will be returned.

You can also specify one float by its WMO ID as shown here, or multiple floats with a list, e.g., floats=[5906441, 5906446, 5906507]. If no further criteria are specified, all profiles of the specified float(s) will be returned.

Here we select the profiles for a float off the coast of Hawaii.

In [None]:
profiles_hawaii = argo.select_profiles(floats=5903611)
profiles_hawaii
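
Because select_profiles returns a plain dictionary of float IDs and profile lists, ordinary dictionary operations are enough to summarize a selection. The sketch below uses a hand-written dictionary as a stand-in for a real result; the float IDs are the ones mentioned above, while the profile lists are made up for illustration.

```python
# Hypothetical stand-in for a select_profiles() result: float IDs map to
# lists of profiles. The profile lists here are invented, not real data.
selected = {
    5906441: [1, 2, 3],
    5906446: [4, 5],
    5906507: [10],
}

n_floats = len(selected)
n_profiles = sum(len(profiles) for profiles in selected.values())
print(f"{n_floats} floats, {n_profiles} profiles")

# List the floats by number of selected profiles, largest first.
by_count = sorted(selected.items(), key=lambda item: len(item[1]), reverse=True)
for float_id, profiles in by_count:
    print(float_id, len(profiles))
```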

## trajectories()

This function plots the trajectories of one or more specified float(s).

Floats can be passed as a single ID (int), a list of IDs, or a dictionary returned from the select_profiles function, in which case only the passed profiles will be plotted.

In this example we use the trajectories function to plot the profiles we selected along the US West coast.

In [None]:
argo.trajectories(profiles_uswc)

In this example we plot the full trajectory of the float we selected along the coast of Hawaii.

In [None]:
argo.trajectories(profiles_hawaii)

## load_float_data()

load_float_data() is a function to load float profile data into memory from the netCDF files stored on the GDAC. These netCDF files will be downloaded unless the current version of them exists locally already.

To specify which float data to load, the user must pass floats (as a single ID, a list of IDs, or a dictionary as returned from the select_profiles function, which can limit the matching profiles) and can optionally pass a list of variables to include in the dataframe. For each variable, its associated variables will be loaded as well, e.g., for TEMP: TEMP_QC, TEMP_ADJUSTED, TEMP_ADJUSTED_QC, and TEMP_ADJUSTED_ERROR.
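
The naming pattern of the associated variables can be written down as a small helper, which is handy when checking which columns to expect in the dataframe. This is a hypothetical convenience function, not part of oneargopy, and it assumes that other variables follow the same suffix pattern as the TEMP example.

```python
def associated_variables(name):
    """List a variable together with the companion columns named in the text."""
    suffixes = ["", "_QC", "_ADJUSTED", "_ADJUSTED_QC", "_ADJUSTED_ERROR"]
    return [name + suffix for suffix in suffixes]

temp_columns = associated_variables("TEMP")
print(temp_columns)
# ['TEMP', 'TEMP_QC', 'TEMP_ADJUSTED', 'TEMP_ADJUSTED_QC', 'TEMP_ADJUSTED_ERROR']
```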

By default, i.e., without specifying any variables, only depth-independent variables (one value per profile) are included: WMOID, CYCLE_NUMBER, DIRECTION, DATE, DATE_QC, LATITUDE, LONGITUDE, and POSITION_QC.

In this example we pass the profiles we selected along the west coast and load temperature data for these floats.

In [None]:
uswc_float_data = argo.load_float_data(profiles_uswc, 'TEMP')

The data are loaded into a standard pandas dataframe:

In [None]:
uswc_float_data

Since the data are now in a pandas dataframe, it is easy to filter them further. For all scientific uses, the "_ADJUSTED" values should be used instead of the raw data.

Here we select adjusted temperature data at a pressure level of 200 dbar:

In [None]:
# Parenthesize each comparison: & binds more tightly than == and <.
uswc_temp_200db = uswc_float_data[(abs(uswc_float_data['PRES_ADJUSTED'] - 200) < 0.5) & (uswc_float_data['TEMP_ADJUSTED_QC'] == 1)]
uswc_temp_200db['TEMP_ADJUSTED']
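
The same kind of filter can also be written with named boolean masks, which keeps each condition readable and avoids operator-precedence pitfalls (in Python, & binds more tightly than comparison operators such as == and <). The sketch below runs on a small made-up dataframe rather than real float data; only the column names are taken from this notebook.

```python
import pandas as pd

# Made-up values, only to illustrate the filtering pattern.
toy = pd.DataFrame({
    "PRES_ADJUSTED": [199.8, 200.3, 200.1, 350.0],
    "TEMP_ADJUSTED": [8.1, 8.0, 7.9, 6.5],
    "TEMP_ADJUSTED_QC": [1, 1, 4, 1],
})

near_200 = (toy["PRES_ADJUSTED"] - 200).abs() < 0.5  # within 0.5 dbar of 200
good_qc = toy["TEMP_ADJUSTED_QC"] == 1                # QC flag 1 marks good data
subset = toy[near_200 & good_qc]
print(subset)
```

Only the first two rows survive: the third is close to 200 dbar but has a QC flag other than 1, and the fourth has a good flag but is far from 200 dbar.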

## Map plots at specified depths

It is now easy to create a map of temperature at this depth level:

In [None]:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import cartopy.feature as cfeature
import matplotlib.cm as cm
import matplotlib.colors as mcolors

fig = plt.figure(figsize=(6, 10))
ax = plt.axes(projection=ccrs.PlateCarree())

# Adding features
ax.add_feature(cfeature.COASTLINE)
ax.add_feature(cfeature.BORDERS, linestyle=':')
ax.add_feature(cfeature.LAND, edgecolor='black')
ax.add_feature(cfeature.OCEAN)

sc = ax.scatter(uswc_temp_200db['LONGITUDE'], uswc_temp_200db['LATITUDE'],
           c=uswc_temp_200db['TEMP_ADJUSTED'], cmap='viridis', s=100)

# Colorbar
cbar = plt.colorbar(sc, ax=ax, orientation='vertical', pad=0.05)
cbar.set_label('Adjusted temperature (deg C)')

plt.title('Map of adjusted temperature at 200 dbar along the US West Coast');

In the next example we load the float data for the Hawaii biogeochemical float we specified earlier.

In [None]:
hawaii_float_data = argo.load_float_data(profiles_hawaii, ['TEMP','DOXY'])
hawaii_float_data

Now we select all rows where TEMP_ADJUSTED and DOXY_ADJUSTED have good data, indicated by a value of 1 in their respective _QC columns:

In [None]:
hawaii_good_T_O2 = hawaii_float_data[(hawaii_float_data['TEMP_ADJUSTED_QC'] == 1) & (hawaii_float_data['DOXY_ADJUSTED_QC'] == 1)]
hawaii_good_T_O2.head()

## Correlation plot between two variables

Now we create a T-O2 plot from these data:

In [None]:
fig, ax = plt.subplots()

ax.scatter(hawaii_good_T_O2['TEMP_ADJUSTED'], hawaii_good_T_O2['DOXY_ADJUSTED'], s=3, c='k')

#ax.set(xlim=(0, 8), xticks=np.arange(1, 8),
#       ylim=(0, 8), yticks=np.arange(1, 8))
plt.xlabel('Temperature (deg C)')
plt.ylabel('Dissolved Oxygen (umol/kg)')
plt.show()

## sections()

sections() is a function to create section plots along the float trajectory for the passed variables using data from the passed float_data dataframe.

It uses the return value from the load_float_data() function as first argument.

This example produces temperature section plots for the data from the West Coast floats that we selected and loaded into memory previously. A separate plot is created for each of the floats.

In [None]:
argo.sections(uswc_float_data, 'TEMP_ADJUSTED')

The next example creates section plots of adjusted dissolved oxygen and adjusted temperature for the data from the Hawaii float:

In [None]:
argo.sections(hawaii_float_data, ['DOXY_ADJUSTED', 'TEMP_ADJUSTED'])