# Download Datasets from 4TU.ResearchData

**fairly** can download public datasets from 4TU.ResearchData.
The *4TU.ResearchData* repository uses Figshare as a platform for managing research datasets. For this example, we will use the dataset [EDoM measurement campaign](https://data.4tu.nl/articles/dataset/EDoM_measurement_campaign_full_data_from_the_lower_Ems_River/20308263). This dataset contains 28 files of different types (`.txt`, `.pdf`) and is about `278 MB` in size.

The dataset has the ID `20308263`. In 4TU.ResearchData, the dataset ID is the last part of the URL shown in the web browser. We can fetch a dataset using either its ID or its URL.
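
Because the ID is the final path segment of the URL, it can be extracted with the standard library. The following is a minimal sketch; the helper name `dataset_id_from_url` is hypothetical and is not part of **fairly**.

```python
from urllib.parse import urlparse


def dataset_id_from_url(url: str) -> str:
    """Return the last non-empty path segment of a dataset URL."""
    path = urlparse(url).path
    # Drop empty segments so a trailing slash does not yield an empty ID
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError(f"no path segments in URL: {url!r}")
    return segments[-1]


url = (
    "https://data.4tu.nl/articles/dataset/"
    "EDoM_measurement_campaign_full_data_from_the_lower_Ems_River/20308263"
)
print(dataset_id_from_url(url))  # prints 20308263
```

The helper only splits the URL path; it does not check that the result is a valid dataset ID in the repository.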


## 1. Connect to 4TU.ResearchData
To connect to a data repository, we use a client. A client manages the connection to a specific data repository. We can create a client for 4TU.ResearchData as follows:

In [4]:
import fairly 

fourtu = fairly.client("4tu") 

## 2. Connect to a dataset

Now we can connect to a *public* dataset by calling the `get_dataset()` method with the dataset ID, its URL, or its DOI.

In [5]:
# USING ID
# dataset = fourtu.get_dataset("20308263") 

# USING URL
dataset = fourtu.get_dataset("https://data.4tu.nl/articles/dataset/EDoM_measurement_campaign_full_data_from_the_lower_Ems_River/20308263") 

# CONVENIENCE FUNCTION, USING DOI
# dataset = fairly.dataset("https://doi.org/10.4121/20308263.v1")  # client is inferred from the DOI

## 3. Explore dataset's metadata

Once we have made a connection to a dataset, we can access its metadata (as stored in the data repository) by using the `metadata` property. 

In [3]:
# Retrieves metadata from data repository
dataset.metadata

Metadata({'authors': [Person({'fullname': 'Bas van Maren', 'orcid_id': '0000-0001-5820-3212', 'figshare_id': 11844539}), Person({'fullname': 'Andreas Engels', 'figshare_id': 12901508})], 'keywords': ['Hydrodynamics', 'Sediment dynamics', 'Collection: The Ems-Dollard Measurement (EDoM) campaign'], 'description': '<p>A large amount of long term monitoring data collected during the Edom measurement campaign has been published in Net CDF as part of the collection \'Edom measurements campaign: data from long-term monitoring\' ( <a href="https://doi.org/10.4121/19519618.v1" target="_blank">https://doi.org/10.4121/19519618.v1</a>). This dataset provides the full subset of the long term mooring data (including oxygen and flow velocities) in ASCII text format, and only for the lower Ems River</p>', 'license': 'CC BY-NC-SA 4.0', 'title': 'EDoM measurement campaign: full data from the lower Ems River', 'doi': '10.4121/20308263.v1', 'type': 'dataset', 'access_type': 'open', 'custom_fields': {'Publ
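
The printed output suggests that the metadata behaves like a nested mapping. As an illustration of how such a structure can be queried, the sketch below uses a plain Python dictionary that copies a few fields from the output above. Whether the `Metadata` and `Person` objects support the same dictionary-style access is an assumption to check against the **fairly** documentation.

```python
# Plain-dict stand-in for part of the metadata printed above.
# Assumption: this only mirrors the printed fields; it is not the fairly Metadata class.
metadata = {
    "authors": [
        {"fullname": "Bas van Maren", "orcid_id": "0000-0001-5820-3212"},
        {"fullname": "Andreas Engels"},
    ],
    "license": "CC BY-NC-SA 4.0",
    "title": "EDoM measurement campaign: full data from the lower Ems River",
    "doi": "10.4121/20308263.v1",
}

# Collect the author names in the order they are listed
author_names = [author["fullname"] for author in metadata["authors"]]

# Not every author has an ORCID, so use .get() to avoid a KeyError
orcids = {author["fullname"]: author.get("orcid_id") for author in metadata["authors"]}

print(author_names)
print(metadata["license"])
print(orcids)
```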

## 4. List dataset's files

We can list the files of a dataset using the `files` property. The result is a Python dictionary in which each key is a file name.

In [4]:
# Lists files (data) associated to the dataset
files = dataset.files

print("There are", len(files), "files in this dataset")

print(files)

There are 28 files in this dataset
{'CsEmspier_01052017-01052019_from_NLWKN.txt': 'CsEmspier_01052017-01052019_from_NLWKN.txt', 'CsGandesum_01052017-01052019_from_NLWKN.txt': 'CsGandesum_01052017-01052019_from_NLWKN.txt', 'CsKnock_01052017-01052019_from_NLWKN.txt': 'CsKnock_01052017-01052019_from_NLWKN.txt', 'CsMP1_01052017-01052019_from_WSV.txt': 'CsMP1_01052017-01052019_from_WSV.txt', 'CsPogum_01052017-01052019_from_NLWKN.txt': 'CsPogum_01052017-01052019_from_NLWKN.txt', 'CsTerborg_01052017-01052019_from_NLWKN.txt': 'CsTerborg_01052017-01052019_from_NLWKN.txt', 'Messung_Gewaesserguete_EMS_NLWKN.pdf': 'Messung_Gewaesserguete_EMS_NLWKN.pdf', 'O2Emspier_01052017-01052019_from_NLWKN.txt': 'O2Emspier_01052017-01052019_from_NLWKN.txt', 'O2Gandersum_01052017-01052019_from_NLWKN.txt': 'O2Gandersum_01052017-01052019_from_NLWKN.txt', 'O2Knock_01052017-01052019_from_NLWKN.txt': 'O2Knock_01052017-01052019_from_NLWKN.txt', 'O2MP1_01052017-01052019_from_WSV.txt': 'O2MP1_01052017-01052019_from_WSV.
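
Since the keys are file names, ordinary dictionary and string operations are enough to select files of a given type. The sketch below uses a small stand-in dictionary built from four of the names printed above, so it runs without a repository connection. The string values are placeholders for whatever file objects `dataset.files` actually holds.

```python
from collections import Counter
from pathlib import PurePosixPath

# Stand-in for dataset.files, using four of the names printed above.
# Assumption: keys are file names; the values here are placeholders.
files = {
    "CsEmspier_01052017-01052019_from_NLWKN.txt": "CsEmspier_01052017-01052019_from_NLWKN.txt",
    "CsKnock_01052017-01052019_from_NLWKN.txt": "CsKnock_01052017-01052019_from_NLWKN.txt",
    "Messung_Gewaesserguete_EMS_NLWKN.pdf": "Messung_Gewaesserguete_EMS_NLWKN.pdf",
    "O2Emspier_01052017-01052019_from_NLWKN.txt": "O2Emspier_01052017-01052019_from_NLWKN.txt",
}

# Count files per extension
by_extension = Counter(PurePosixPath(name).suffix for name in files)

# Keep only the text files, sorted by name
txt_files = sorted(name for name in files if name.endswith(".txt"))

print(dict(by_extension))
print(txt_files)
```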

## 5. Download a file

We can download a single file from a dataset by using its name. For example, this dataset contains a file named `'CsEmspier_01052017-01052019_from_NLWKN.txt'`.

> The `path` parameter can be used to define where to store the file; otherwise the file is stored in the working directory.


In [5]:
# Select a file from the dataset
single_file = dataset.files['CsEmspier_01052017-01052019_from_NLWKN.txt']

# download the file
fourtu.download_file(single_file)

'CsEmspier_01052017-01052019_from_NLWKN.txt'

## 6. Download a dataset

We can download all files and metadata of a dataset using the `store()` method. We need to provide a `path` to a directory in which to store the dataset. If the directory does not exist, it is created.

In [6]:
# This will download about 278 MB
dataset.store("./edom")

<fairly.dataset.local.LocalDataset at 0x7f143af8a6b0>
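
After the download finishes, it can be useful to check what was written to disk. The helper below is a sketch that uses only the standard library; `summarize_directory` is a hypothetical name, and nothing is assumed about the exact layout that `store()` produces. The demonstration runs on a temporary directory so that it works without downloading anything.

```python
import tempfile
from pathlib import Path


def summarize_directory(directory):
    """Return (file_count, total_bytes) for all files under `directory`."""
    root = Path(directory)
    # rglob("*") walks subdirectories too; keep regular files only
    sizes = [p.stat().st_size for p in root.rglob("*") if p.is_file()]
    return len(sizes), sum(sizes)


# Demonstration on a throwaway directory with two small files
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.txt").write_text("hello")
    (Path(tmp) / "b.txt").write_text("world!")
    count, total = summarize_directory(tmp)
    print(count, "files,", total, "bytes")
```

To inspect the real download, call `summarize_directory("./edom")` after `dataset.store("./edom")` has completed.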