<img src="https://raw.githubusercontent.com/euroargodev/argopy/master/docs/_static/argopy_logo_long.png" alt="argopy logo" width="200"/>

# Training Camp - Sept 22<sup>nd</sup> 2025

***

## Notebook Title : Select and fetch Argo data

**Author contact : [G. Maze](https://annuaire.ifremer.fr/cv/17182)**

**Description:**

This notebook explains:
- how to select (region, float, profile) Argo data to fetch,
- how to trigger data fetching,
- the format of the returned data.

It is all based on the [DataFetcher](https://argopy.readthedocs.io/en/v1.3.0/generated/argopy.fetchers.ArgoDataFetcher.html#argopy.fetchers.ArgoDataFetcher). 

This is basically a notebook to explore this [section of the Argopy documentation](https://argopy.readthedocs.io/en/v1.3.0/user-guide/fetching-argo-data/data_selection.html).

🏷️ This notebook was developed with [Argopy version *1.3.0*](https://argopy.readthedocs.io/en/v1.3.0)

©  [European Union Public Licence (EUPL) v1.2](https://github.com/euroargodev/argopy-training/blob/main/LICENSE), see at the bottom of this notebook for more.

**Table of Contents**
- [Selecting data to fetch](#selecting-data-to-fetch)
  - [🗺 Select data for a space/time domain](#🗺-select-data-for-a-space/time-domain)
    - [✏️ EXERCICE](#✏️-exercice)
  - [🤖 For one or more floats](#🤖-for-one-or-more-floats)
    - [✏️ EXERCICE](#✏️-exercice)
  - [⚓ For one or more profiles](#⚓-for-one-or-more-profiles)
    - [✏️ EXERCICE](#✏️-exercice)
- [Trigger data fetching](#trigger-data-fetching)
  - [Default and recommended data structures](#default-and-recommended-data-structures)
  - [Alternative data structures](#alternative-data-structures)
- [🏁 End of the notebook](#🏁-end-of-the-notebook)
    - [👀 Useful argopy commands](#👀-useful-argopy-commands)
    - [⚖️ License Information](#⚖️-license-information)
    - [🤝 Sponsor](#🤝-sponsor)
***

Let's start with the usual import:

In [None]:
from argopy import DataFetcher

To keep cell outputs from getting too large, we won't display xarray object attributes:

In [None]:
import xarray as xr
xr.set_options(display_expand_attrs=False)

Before selecting any data, let’s first create a [DataFetcher](https://argopy.readthedocs.io/en/v1.3.0/generated/argopy.fetchers.ArgoDataFetcher.html#argopy.fetchers.ArgoDataFetcher) instance:

In [None]:
f = DataFetcher()
f

<br>

The printed representation of ``f`` indicates that `erddap` is the data source for this fetcher (it's the default choice) and that "No access point initialised". An access point is a data selection method.

The second line of the print lists all the access points available for this data source.

There are 3 data selection methods that can be used on this [DataFetcher](https://argopy.readthedocs.io/en/v1.3.0/generated/argopy.fetchers.ArgoDataFetcher.html#argopy.fetchers.ArgoDataFetcher) instance:
    
- 🗺 ``region`` for a space/time domain,
- 🤖 ``float`` for one or more floats,
- ⚓ ``profile`` for one or more profiles.

We will now review each of these.

## Selecting data to fetch

### 🗺 Select data for a space/time domain

The ``region`` access point takes a rectangular box definition of the space/time bounds to include. Argopy expects one of the following 2 formats to define a box:

``box = [lon_min, lon_max, lat_min, lat_max, pres_min, pres_max]``

or

``box = [lon_min, lon_max, lat_min, lat_max, pres_min, pres_max, datim_min, datim_max]``


Longitude, latitude and pressure limits are float values. The starting and ending datetimes must be objects convertible to a [Pandas datetime](https://pandas.pydata.org/docs/reference/api/pandas.to_datetime.html).
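
As a minimal sketch of the two layouts, the snippet below labels each entry of a box with the bound names used above. `describe_box` and `BOX_NAMES` are hypothetical helpers written for this illustration; they are not part of Argopy.

```python
# Hypothetical helper, not part of argopy: label the entries of a box definition.
BOX_NAMES = ["lon_min", "lon_max", "lat_min", "lat_max",
             "pres_min", "pres_max", "datim_min", "datim_max"]


def describe_box(box):
    """Return a dict mapping each bound name to its value in `box`."""
    if len(box) not in (6, 8):
        raise ValueError(f"a box needs 6 or 8 elements, got {len(box)}")
    # zip stops at the shortest input, so a 6-element box gets no time bounds.
    return dict(zip(BOX_NAMES, box))


space_only = describe_box([-75, -45, 20, 30, 0, 10])
space_time = describe_box([-75, -45, 20, 30, 0, 10, "2011-01", "2011-06"])
print(space_only)
print(space_time["datim_min"], space_time["datim_max"])
```

The 6-element form carries no time bounds, while the 8-element form adds the two datetime entries at the end.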

Let's try the most exhaustive definition first, and select data from 75W to 45W, 20N to 30N, 0db to 10db and from January to May 2011:

In [None]:
box = [-75, -45, 20, 30, 0, 10, '2011-01', '2011-06']
f = f.region(box)
f

<br>

Now that the [DataFetcher](https://argopy.readthedocs.io/en/v1.3.0/generated/argopy.fetchers.ArgoDataFetcher.html#argopy.fetchers.ArgoDataFetcher) instance has been initialised with an access point, the print provides a little bit more information.

Note that the last time bound is exclusive: that’s why here we specify June to retrieve data collected in May.
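
Because the end bound is exclusive, covering whole calendar months means ending on the first day of the following month. The sketch below assumes a hypothetical helper, `month_bounds`, that is not part of Argopy; it only builds the two date strings with the standard library.

```python
import datetime


def month_bounds(year, month):
    """Return (start, end) strings covering one calendar month, end exclusive."""
    start = datetime.date(year, month, 1)
    # First day of the following month; December rolls over to January.
    end = datetime.date(year + month // 12, month % 12 + 1, 1)
    return start.strftime("%Y-%m"), end.strftime("%Y-%m")


# January to May 2011: start of January, end bound set to the month after May.
start, _ = month_bounds(2011, 1)
_, end = month_bounds(2011, 5)
box = [-75, -45, 20, 30, 0, 10, start, end]
print(box)
```

The resulting list is the same `box` as the one passed to ``f.region`` above.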

#### ✏️ EXERCICE
Make the [DataFetcher](https://argopy.readthedocs.io/en/v1.3.0/generated/argopy.fetchers.ArgoDataFetcher.html#argopy.fetchers.ArgoDataFetcher) instance select data for a single date, say Feb. 12th, 2009:

💡 Code hint:
```python
box = [-75, -45, 20, 30, 0, 10, ...
f = f.region(box)
f.data
```

In [None]:
# your code here

### 🤖 For one or more floats

If you know the Argo float unique identifier number, called a WMO number, you can use the access point ``float`` to specify one or more float WMO platform numbers to select.

For instance, to select data for float WMO 6902746:

In [None]:
f = f.float(6902746)
f

<br>

To fetch data for a collection of floats, pass them as a list.

#### ✏️ EXERCICE
Make the [DataFetcher](https://argopy.readthedocs.io/en/v1.3.0/generated/argopy.fetchers.ArgoDataFetcher.html#argopy.fetchers.ArgoDataFetcher) instance select data from two floats, let's say 6902746 and 6902755:

💡 Code hint:
```python
f = f.float(...
f
```

In [None]:
# your code here

### ⚓ For one or more profiles

Use the fetcher access point ``profile`` to specify one or more float WMO platform numbers and the profile cycle number(s) to retrieve.

For instance, to select data from the 12th profile of float WMO 6902755:

In [None]:
f = f.profile(6902755, 12)
f

<br>

Note that the profile number corresponds to the cycle number.

#### ✏️ EXERCICE
Make the [DataFetcher](https://argopy.readthedocs.io/en/v1.3.0/generated/argopy.fetchers.ArgoDataFetcher.html#argopy.fetchers.ArgoDataFetcher) instance select data from the first 2 cycles of two floats, let's say again 6902746 and 6902755:

💡 Code hint:
```python
f = f.profile(...
f
```

In [None]:
# your code here

## Trigger data fetching

### Default and recommended data structures

Once the access point is created, we can trigger data fetching by accessing the ``data`` property of the [DataFetcher](https://argopy.readthedocs.io/en/v1.3.0/generated/argopy.fetchers.ArgoDataFetcher.html#argopy.fetchers.ArgoDataFetcher) instance:

In [None]:
%%time
ds = f.data
ds

<br>

The ``%%time`` command is a Jupyter magic to monitor how much time is needed to execute the cell.
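
Outside a notebook, where cell magics are not available, a similar measurement can be sketched with the standard library. The `sum` call below is only a placeholder for the code to time, such as a data fetch.

```python
import time

start = time.perf_counter()
total = sum(range(1_000_000))  # placeholder for the code to time
elapsed = time.perf_counter() - start
print(f"elapsed: {elapsed:.3f} s")
```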

Argopy works primarily with xarray [Dataset](https://docs.xarray.dev/en/stable/api/dataset.html) objects, and this is what is returned here when fetching Argo data.

The ``data`` property keeps track of the downloaded data, so if you re-execute the same command, nothing is downloaded again and the cached data are returned.

In a notebook, to make sure that you trigger a fresh data download, you can use the explicit ``to_xarray`` method:

In [None]:
%%time
ds = f.to_xarray()
ds

<br>
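
The difference between the cached ``data`` property and an explicit method call can be pictured with a toy class. This is only an illustration of the caching pattern described above, not Argopy's actual implementation: `ToyFetcher`, its `downloads` counter and the simulated download are all assumptions made for the example.

```python
class ToyFetcher:
    """Toy stand-in for a fetcher; it is NOT argopy's implementation."""

    def __init__(self):
        self.downloads = 0   # counts simulated downloads
        self._cached = None  # last result, if any

    def to_xarray(self):
        """Always perform a (simulated) fresh download."""
        self.downloads += 1
        # Assumption of this toy: the fresh result replaces the stored one.
        self._cached = {"download_id": self.downloads}
        return self._cached

    @property
    def data(self):
        """Download on first access only, then reuse the stored result."""
        if self._cached is None:
            self.to_xarray()
        return self._cached


toy = ToyFetcher()
first = toy.data          # triggers one simulated download
second = toy.data         # served from memory, no new download
fresh = toy.to_xarray()   # forces a new simulated download
print(toy.downloads)
```

Accessing the property twice costs a single download, while each explicit method call adds one.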

Note also that a [DataFetcher](https://argopy.readthedocs.io/en/v1.3.0/generated/argopy.fetchers.ArgoDataFetcher.html#argopy.fetchers.ArgoDataFetcher) returns data as a collection of points, not profiles. This is the default choice, primarily driven by performance considerations.

But no worries, Argopy makes it very easy to go from a collection of points to profiles and vice versa. See this notebook for an illustration.
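
To picture the two layouts, here is a conceptual sketch in plain Python with made-up values. It is not how Argopy performs the conversion: Argopy works on xarray datasets, and its documentation describes the ``ds.argo.point2profile()`` and ``ds.argo.profile2point()`` accessor methods for this. The record keys `wmo`, `cycle`, `pres` and `temp` are hypothetical names chosen for the illustration.

```python
from collections import defaultdict

# Hypothetical measurements, one record per point (values are made up).
points = [
    {"wmo": 6902746, "cycle": 1, "pres": 5.0, "temp": 24.1},
    {"wmo": 6902746, "cycle": 1, "pres": 10.0, "temp": 24.0},
    {"wmo": 6902746, "cycle": 2, "pres": 5.0, "temp": 23.8},
    {"wmo": 6902755, "cycle": 1, "pres": 5.0, "temp": 22.5},
]

# Points -> profiles: group the levels sharing the same float and cycle.
profiles = defaultdict(list)
for p in points:
    profiles[(p["wmo"], p["cycle"])].append((p["pres"], p["temp"]))

# Profiles -> points: flatten the groups back into one record per level.
flattened = [
    {"wmo": wmo, "cycle": cycle, "pres": pres, "temp": temp}
    for (wmo, cycle), levels in profiles.items()
    for pres, temp in levels
]

print(len(points), "points in", len(profiles), "profiles")
```

Both layouts hold the same measurements; only the grouping changes.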

### Alternative data structures

Argopy also makes it easy to fetch data in alternative data structures.

For instance, as a [Pandas DataFrame](https://pandas.pydata.org/docs/reference/frame.html):

In [None]:
%%time
df = f.to_dataframe()
df.head()

<br>

A [Pandas DataFrame](https://pandas.pydata.org/docs/reference/frame.html) may be useful, but it is not very efficient, since all the per-profile properties have to be replicated on every row of a profile.
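
The replication can be seen on a small, made-up example in plain Python. The profile content and the key names below are hypothetical; the point is only that the float and cycle identifiers end up copied on every row.

```python
# Hypothetical profile: per-profile properties plus per-level measurements.
profile = {
    "wmo": 6902755,
    "cycle": 12,
    "levels": [(5.0, 22.5), (10.0, 22.4), (15.0, 22.1)],  # (pres, temp), made up
}

# One row per level: the per-profile properties are copied onto every row.
rows = [
    {"wmo": profile["wmo"], "cycle": profile["cycle"], "pres": pres, "temp": temp}
    for pres, temp in profile["levels"]
]

for row in rows:
    print(row)

# Each of the 2 per-profile properties is stored once per level.
copies = 2 * len(rows)
print(copies, "stored values for 2 per-profile properties")
```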

<br>

For compatibility with legacy software and procedures, a [legacy netCDF library dataset object](https://unidata.github.io/netcdf4-python/#netCDF4.Dataset) is also available as an output format.

However, we strongly recommend that users migrate their procedures to the [xarray Dataset](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.html) structure, since it is easier to manipulate and allows more Argopy features to be used.

In [None]:
%%time
ds = f.to_dataset()
ds

## 🏁 End of the notebook

***
#### 👀 Useful argopy commands
```python
import argopy

argopy.reset_options()
argopy.show_options()
argopy.status()
argopy.clear_cache()
argopy.show_versions()
```
#### ⚖️ License Information
This Jupyter Notebook is licensed under the **European Union Public Licence (EUPL) v1.2**.

| Permissions      | Limitations     | Conditions                     |
|------------------|-----------------|--------------------------------|
| ✔ Commercial use | ❌ Liability     | ⓘ License and copyright notice |
| ✔ Modification   | ❌ Trademark use | ⓘ Disclose source              |
| ✔ Distribution   | ❌ Warranty      | ⓘ State changes                |
| ✔ Patent use     |                  | ⓘ Network use is distribution  |
| ✔ Private use    |                  | ⓘ Same license                 |

For more details, visit: [EUPL v1.2 Full Text](https://github.com/euroargodev/argopy-training/blob/main/LICENSE).

#### 🤝 Sponsor
![logo](https://raw.githubusercontent.com/euroargodev/argopy-training/refs/heads/main/for_nb_producers/template_argopy_training_EAONE.png)
***
