In [None]:
# HIDDEN CELL
import sys, os

# Importing argopy in dev mode:
on_rtd = os.environ.get('READTHEDOCS', None) == 'True'
if not on_rtd:
    sys.path.insert(0, "/Users/gmaze/git/github/euroargodev/argopy")
    import git
    import argopy
    from argopy.options import OPTIONS
    print("argopy:", argopy.__version__, 
          "\nsrc:", argopy.__file__, 
          "\nbranch:", git.Repo(search_parent_directories=True).active_branch.name, 
          "\noptions:", OPTIONS)
else:
    sys.path.insert(0, os.path.abspath('..'))

import xarray as xr
# xr.set_options(display_style="html");
xr.set_options(display_style="text");

In [None]:
import argopy
from argopy import DataFetcher as ArgoDataFetcher

# Data sources

## Selecting a source

**argopy** can access Argo data from several sources:

1. the [Ifremer erddap server](http://www.ifremer.fr/erddap).  

    The erddap server database is updated daily and doesn't require you to download any more data than you need.  
    You can select this data source with the keyword ``erddap`` and methods described below. The Ifremer erddap dataset is based on mono-profile files of the GDAC.
    
1. your local collection of Argo files, organised as in the [GDAC ftp](http://www.argodatamgt.org/Access-to-data/Access-via-FTP-on-GDAC).

    This is how you would use **argopy** with your own data, as long as they are formatted and organised the Argo way.  
    You can select this data source with the keyword ``localftp`` and methods described below.

1. the [Argovis server](https://argovis.colorado.edu/).

    The Argovis server database is updated daily and provides access to curated Argo data (QC=1 only).
    You can select this data source with the keyword ``argovis`` and methods described below.

You have several ways to specify which data source you want to use:

* **using argopy global options**:

In [None]:
argopy.set_options(src='erddap')

* **in a temporary context**:

In [None]:
with argopy.set_options(src='erddap'):
    loader = ArgoDataFetcher().profile(6902746, 34)

* **with an argument in the data fetcher**:

In [None]:
loader = ArgoDataFetcher(src='erddap').profile(6902746, 34)

## Setting a local copy of the GDAC ftp

Fetching data with the ``localftp`` data source requires you to specify the path to your local copy of the GDAC ftp server with the ``local_ftp`` option.
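
Before handing a path to the ``local_ftp`` option, a quick sanity check can catch a wrong directory early. The sketch below is not part of **argopy**: the helper name `looks_like_gdac_mirror` is hypothetical, and it only assumes that the root of a GDAC copy contains a `dac` sub-directory, as on the GDAC ftp servers.

```python
import os
import tempfile


def looks_like_gdac_mirror(path):
    """Return True if `path` is a directory holding a 'dac' sub-directory.

    Hypothetical helper (not an argopy function). Assumption: a local copy
    organised like the GDAC ftp has a 'dac' folder at its root.
    """
    return os.path.isdir(os.path.join(path, "dac"))


# Demonstration on a throw-away directory tree
with tempfile.TemporaryDirectory() as root:
    empty_check = looks_like_gdac_mirror(root)   # no 'dac' folder yet
    os.makedirs(os.path.join(root, "dac", "coriolis"))
    mirror_check = looks_like_gdac_mirror(root)  # 'dac' folder now exists

print(empty_check, mirror_check)
```

A path that passes this check can then be given to ``argopy.set_options(src='localftp', local_ftp=...)``, as in the examples further down this page.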

Expert users will usually have such a copy in place already, but standard users may wonder how to set one up.
The primary distribution point for Argo data, the only one with full support from data centers and with nearly 100% availability, is the GDAC ftp. Two mirror servers are available:

- France Coriolis: ftp://ftp.ifremer.fr/ifremer/argo
- US GODAE: ftp://usgodae.org/pub/outgoing/argo

If you want to get your own copy of the ftp server content, Ifremer provides a nice rsync service. The rsync server "vdmzrs.ifremer.fr" provides a synchronization service between the "dac" directory of the GDAC and a user mirror. The "dac" index files are also available from "argo-index".

From the user side, the rsync service:

- Downloads the new files
- Downloads the updated files
- Removes the files that have been removed from the GDAC
- Compresses/uncompresses the files during the transfer
- Preserves the files' creation/update dates
- Lists all the files that have been transferred (easy to use for a user side post-processing)

To synchronize the whole dac directory of the Argo GDAC:
```bash
rsync -avzh --delete vdmzrs.ifremer.fr::argo/ /home/mydirectory/...
```

To synchronize the index:
```bash
rsync -avzh --delete vdmzrs.ifremer.fr::argo-index/ /home/mydirectory/...
```
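
If you prefer to drive the synchronization from Python, the two commands above can be assembled as argument lists. This is only a sketch: `build_rsync_command` is a hypothetical helper, the destination paths are placeholders to replace with your own, and the commands are printed rather than executed.

```python
import shlex


def build_rsync_command(module, destination, host="vdmzrs.ifremer.fr"):
    """Build the rsync argument list for one module of the Ifremer server.

    Hypothetical helper: `module` is 'argo' (the dac directory) or
    'argo-index' (the index files), as in the shell commands above.
    """
    source = f"{host}::{module}/"
    return ["rsync", "-avzh", "--delete", source, destination]


# Placeholder destinations (assumptions): use one distinct directory per
# module, because --delete removes destination files absent from the source.
destinations = {
    "argo": "/path/to/your/mirror/of/argo",
    "argo-index": "/path/to/your/mirror/of/argo-index",
}
commands = [build_rsync_command(m, d) for m, d in destinations.items()]

for cmd in commands:
    # Printed only; to run it, pass `cmd` to subprocess.run(cmd, check=True)
    # on a machine where rsync is installed.
    print(shlex.join(cmd))
```

Keeping the command as a list avoids shell quoting problems if your destination path contains spaces.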

## Comparing data sources

### Features

Each of the available data sources has its own features and capabilities. Here is a summary:

| Data source:            | erddap | localftp | argovis |
|-------------------------|:------:|:--------:|:-------:|
| **Access Points**       |        |          |         |
| region                  |    X   |     X    |    X    |
| float                   |    X   |     X    |    X    |
| profile                 |    X   |     X    |    X    |
| **User mode**           |        |          |         |
| standard                |    X   |     X    |    X    |
| expert                  |    X   |     X    |         |
| **Dataset**             |        |          |         |
| core (T/S)              |    X   |     X    |    X    |
| BGC                     |        |          |         |
| Reference data for DMQC |    X   |          |         |
| **Parallel method**     |        |          |         |
| multi-threading         |    X   |     X    |    X    |
| multi-processes         |        |     X    |         |
| Dask client             |        |          |         |
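
To pick a data source from a list of requirements, the table can be transcribed into a small lookup. This is a hand-written sketch, not an **argopy** feature: the `FEATURES` dictionary and the `sources_supporting` helper are hypothetical, and the feature labels are shortened versions of the table rows.

```python
# Transcription of the table above: each source maps to the rows marked "X".
COMMON = {"region", "float", "profile", "standard", "core", "multi-threading"}
FEATURES = {
    "erddap": COMMON | {"expert", "reference data"},
    "localftp": COMMON | {"expert", "multi-processes"},
    "argovis": set(COMMON),
}


def sources_supporting(*features):
    """Return the sorted names of the sources offering all given features."""
    wanted = set(features)
    return sorted(src for src, feats in FEATURES.items() if wanted <= feats)


expert_sources = sources_supporting("expert")
parallel_expert = sources_supporting("expert", "multi-processes")
bgc_sources = sources_supporting("BGC")

print(expert_sources)
print(parallel_expert)
print(bgc_sources)
```

An empty result, as for `"BGC"` here, means that no data source in the table has this feature marked.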

### Fetched data and variables

You may wonder whether the fetched data differ from one data source to another.  
This depends on the last update of each data source and of your local data.

In [None]:
# Download ftp sample and get the ftp local path:
ftproot = argopy.tutorial.open_dataset('localftp')[0]

# then fetch data:
with argopy.set_options(src='localftp', local_ftp=ftproot):
    ds = ArgoDataFetcher().float(1900857).to_xarray()
    print(ds)

Let's now retrieve the latest data for this float from the ``erddap`` data source:

In [None]:
with argopy.set_options(src='erddap'):
    ds = ArgoDataFetcher().float(1900857).to_xarray()
    print(ds)

In [None]:
with argopy.set_options(src='argovis'):
    ds = ArgoDataFetcher().float(1900857).to_xarray()
    print(ds)

We can see some minor differences between the ``localftp``/``erddap`` responses and the ``argovis`` response: the latter data source does not include the descending part of the first profile, which explains why ``argovis`` returns slightly less data.