<img src="https://raw.githubusercontent.com/euroargodev/argopy/master/docs/_static/argopy_logo_long.png" alt="argopy logo" width="200"/>

# Training Camp - Sept 22<sup>nd</sup> 2025

***

## Notebook Title : Vertical interpolation & Binning

**Author contact : [K. Balem](https://annuaire.ifremer.fr/cv/22144/)**

**Description:**

This notebook describes:
- how to transform your data from a set of points to a collection of profiles,
- how to interpolate your data onto standard depth levels,
- how to bin your data vertically.

This notebook is based on the [Argopy documentation](https://argopy.readthedocs.io/en/v1.3.0/user-guide/working-with-argo-data/data_manipulation.html#) where you can find more details on each function.

*This notebook was developed with Argopy version: 1.3*

***

Let's start with the usual import:

In [None]:
from argopy import DataFetcher

## Points vs Profiles
By default, fetched data are returned as a 1D collection of measurements, i.e. a set of points.

In [None]:
# Box: [lon_min, lon_max, lat_min, lat_max, pres_min, pres_max, date_min, date_max]
f = DataFetcher().region([-75, -55, 30., 40., 0, 100., '2011-01-01', '2011-01-15'])
f

The returned xarray dataset has a single dimension: **N_POINTS**

In [None]:
ds_points = f.data
ds_points

If you prefer to work with a 2D collection of vertical profiles, transform the dataset with [Dataset.argo.point2profile()](https://argopy.readthedocs.io/en/v1.3.0/generated/xarray.Dataset.argo.point2profile.html#xarray.Dataset.argo.point2profile).  
This returns a new dataset with two dimensions: **N_PROF** (one entry per profile) and **N_LEVELS** (the vertical levels within a profile).

In [None]:
ds_profiles = ds_points.argo.point2profile()
ds_profiles
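
To make the points-to-profiles idea concrete, here is a minimal NumPy sketch of the regrouping. It is not argopy's implementation: the `profile_id` and `pres` arrays are hypothetical toy values, and padding shorter profiles with NaN is an assumption of the sketch.

```python
import numpy as np

# Toy "points" data: one entry per measurement (hypothetical values).
profile_id = np.array([0, 0, 0, 1, 1])           # profile each point belongs to
pres = np.array([5.0, 10.0, 20.0, 4.0, 12.0])    # pressure of each point

n_prof = profile_id.max() + 1
n_levels = np.bincount(profile_id).max()         # the longest profile sets N_LEVELS

# 2D (N_PROF, N_LEVELS) array, padded with NaN where a profile is shorter.
pres_2d = np.full((n_prof, n_levels), np.nan)
for p in range(n_prof):
    values = pres[profile_id == p]
    pres_2d[p, : values.size] = values

print(pres_2d.shape)
```

The second profile has only two measurements, so its last level stays empty: this is why a profile-shaped dataset usually contains missing values that the point-shaped one does not.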

You can reverse this transformation with [Dataset.argo.profile2point()](https://argopy.readthedocs.io/en/v1.3.0/generated/xarray.Dataset.argo.profile2point.html#xarray.Dataset.argo.profile2point):

In [None]:
ds = ds_profiles.argo.profile2point()
ds

#### ✏️ EXERCISE
Fetch the data from the float 6990680 and transform the returned dataset into profiles.

## Vertical interpolation
Once your dataset is a collection of vertical profiles, you can interpolate variables on standard pressure levels using [Dataset.argo.interp_std_levels()](https://argopy.readthedocs.io/en/v1.3.0/generated/xarray.Dataset.argo.interp_std_levels.html#xarray.Dataset.argo.interp_std_levels) with your levels as input.  
This interpolated dataset has a new vertical dimension: **PRES_INTERPOLATED**

Let's define our standard pressure levels:

In [None]:
import numpy as np
z_levels = np.arange(0,100,10)
z_levels
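
One detail worth checking is that `np.arange` excludes its stop value, so the deepest standard level defined here is 90, not 100. A quick check:

```python
import numpy as np

z_levels = np.arange(0, 100, 10)
print(z_levels.min(), z_levels.max(), z_levels.size)
```

If a level at 100 is wanted, `np.arange(0, 101, 10)` is one way to include it.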

We can now interpolate all of our profiles onto those levels:

In [None]:
ds_interp = ds_profiles.argo.interp_std_levels(z_levels)
ds_interp

Notes on the linear interpolation process:
- Only profiles that have a maximum pressure higher than the highest standard level are selected for interpolation.
- Remaining profiles must have at least five data points to allow interpolation.
- For each profile, the shallowest data point is repeated to the surface to allow a 0 standard level while avoiding extrapolation.
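
The three rules above can be sketched with `np.interp` on a single toy profile. This is a minimal illustration, not argopy's implementation: the `interp_profile` helper, its `min_points` argument, and the sample values are all hypothetical.

```python
import numpy as np

def interp_profile(pres, temp, std_levels, min_points=5):
    """Interpolate one profile onto std_levels, or return None if it is rejected."""
    pres = np.asarray(pres, dtype=float)
    temp = np.asarray(temp, dtype=float)
    std_levels = np.asarray(std_levels, dtype=float)
    valid = ~np.isnan(pres) & ~np.isnan(temp)
    pres, temp = pres[valid], temp[valid]
    # Rule 1: the profile must go deeper than the deepest standard level.
    if pres.size == 0 or pres.max() <= std_levels.max():
        return None
    # Rule 2: at least `min_points` data points are required.
    if pres.size < min_points:
        return None
    # Rule 3: np.interp holds the first value constant for levels shallower
    # than the first data point, i.e. it repeats that point up to the surface.
    order = np.argsort(pres)
    return np.interp(std_levels, pres[order], temp[order])

std_levels = np.arange(0, 100, 10)
deep = interp_profile([5, 25, 45, 65, 85, 105], [20, 18, 16, 14, 12, 10], std_levels)
shallow = interp_profile([5, 25, 45, 65, 85], [20, 18, 16, 14, 12], std_levels)
print(deep)
print(shallow)
```

The first profile reaches 105 and is interpolated; the second stops at 85, above the deepest standard level of 90, so it is dropped rather than extrapolated.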

In our case, one profile is missing from the interpolated dataset because its maximum pressure does not reach one or more standard levels:

In [None]:
ds_profiles['PRES'].max('N_LEVELS').values

#### ✏️ EXERCISE
Fetch the data from the float 6901987, transform the dataset into profiles, and interpolate onto standard pressure levels.

## Pressure levels: Group-by bins
If you prefer to avoid interpolation, you can opt for a pressure bins grouping reduction using [Dataset.argo.groupby_pressure_bins()](https://argopy.readthedocs.io/en/v1.3.0/generated/xarray.Dataset.argo.groupby_pressure_bins.html#xarray.Dataset.argo.groupby_pressure_bins).  
This method can be used to subsample and align an irregular dataset (pressure not being similar in all profiles) on a set of pressure bins. The output dataset can then be used to compute statistics along the N_PROF dimension, because N_LEVELS will correspond to similar pressure bins.

To illustrate this method, let’s start by fetching some data from a low vertical resolution float:

In [None]:
f = DataFetcher(src='erddap', mode='expert').float(2901623)  # Low res float
ds = f.data
ds

Let’s now sub-sample these measurements along 250 db bins, selecting values from the deepest pressure level in each bin:

In [None]:
bins = np.arange(0.0, np.max(ds["PRES"]), 250.0)
bins

In [None]:
ds_binned = ds.argo.groupby_pressure_bins(bins=bins, select='deep')
ds_binned

Note the new STD_PRES_BINS variable, which holds the pressure bin definitions.
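
As a rough picture of what a "deepest value per bin" reduction does, here is a NumPy sketch on one toy profile. It is not argopy's implementation: the sample values are hypothetical, and the sketch assumes that each bin is identified by its lower edge and that the last bin is open-ended.

```python
import numpy as np

# Toy profile (hypothetical values): pressure and salinity.
pres = np.array([10.0, 120.0, 240.0, 260.0, 480.0, 510.0, 740.0])
psal = np.array([35.1, 35.0, 34.9, 34.8, 34.7, 34.6, 34.5])
bins = np.arange(0.0, pres.max(), 250.0)   # lower edges: 0, 250, 500

# Index of the bin each measurement falls into (0-based, by lower edge).
bin_index = np.digitize(pres, bins) - 1

# For each bin, keep the measurement with the highest pressure.
deepest = {}
for b in np.unique(bin_index):
    in_bin = np.flatnonzero(bin_index == b)
    keep = in_bin[np.argmax(pres[in_bin])]
    deepest[float(bins[b])] = (float(pres[keep]), float(psal[keep]))

print(deepest)
```

Each bin ends up with a single measurement, so profiles sampled at different pressures become comparable level by level.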

The figure below shows the sub-sampling effect:

In [None]:
import matplotlib as mpl
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(16,5))
# Full dataset, colored by salinity
ds.plot.scatter(x='CYCLE_NUMBER', y='PRES', hue='PSAL', ax=ax)
# Binned dataset points (red crosses)
plt.plot(ds_binned['CYCLE_NUMBER'], ds_binned['PRES'], 'r+')
# Input bin edges (black) and STD_PRES_BINS values (red)
plt.hlines(bins, ds['CYCLE_NUMBER'].min(), ds['CYCLE_NUMBER'].max(), color='k')
plt.hlines(ds_binned['STD_PRES_BINS'], ds_binned['CYCLE_NUMBER'].min(), ds_binned['CYCLE_NUMBER'].max(), color='r')
plt.title(ds.attrs['Fetched_constraints'])
plt.gca().invert_yaxis()

The bin limits are shown with horizontal red lines, the original data appear as the colored scatter in the background, and the values selected in each pressure bin are highlighted with red markers.

The `select` option accepts several values, each corresponding to a different way of processing the data within a bin; see the full documentation of [Dataset.argo.groupby_pressure_bins()](https://argopy.readthedocs.io/en/v1.3.0/generated/xarray.Dataset.argo.groupby_pressure_bins.html#xarray.Dataset.argo.groupby_pressure_bins) for all the details.
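
To see how the choice of reduction changes the result, the sketch below applies three strategies to the measurements of a single bin. The `reducers` dictionary and the sample values are hypothetical, and the keys only imitate the kind of strategy a `select` value stands for; the linked documentation remains the reference for the accepted values.

```python
import numpy as np

# Measurements falling inside a single (hypothetical) 0-250 db bin.
pres = np.array([10.0, 120.0, 240.0])
psal = np.array([35.1, 35.0, 34.9])

# Hypothetical reducers, keyed by the selection strategy they imitate.
reducers = {
    "shallow": lambda p, v: v[np.argmin(p)],  # value at the shallowest level
    "deep": lambda p, v: v[np.argmax(p)],     # value at the deepest level
    "mean": lambda p, v: v.mean(),            # average over the bin
}

result = {name: float(fn(pres, psal)) for name, fn in reducers.items()}
print(result)
```

Selecting a level keeps a real measurement, while an average produces a value that was never observed; which one is preferable depends on the statistics you plan to compute afterwards.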

#### ✏️ EXERCISE
Fetch the data from the float 5906993 and bin the data using another `select` option.

***
Useful argopy commands:
```python
import argopy

argopy.reset_options()
argopy.show_options()
argopy.status()
argopy.clear_cache()
argopy.show_versions()
```
***
![logo](https://raw.githubusercontent.com/euroargodev/argopy-training/refs/heads/main/for_nb_producers/template_argopy_training_EAONE.png)