# Download Datasets from Zenodo

*fairly* can also download public datasets from Zenodo.
The *Zenodo* repository has its own platform for managing research datasets. For this example, we will use the dataset [Quality and timing of crowd-based water level class observations](https://zenodo.org/records/3929547). This dataset is a single compressed `.zip` file, which contains several other files and directories, and it is about `27 MB` in size.

In Zenodo, the ID of a dataset can be found by looking at its DOI: the ID is the last part of the DOI (a number). For example, the DOI for the second version of the dataset is `10.5281/zenodo.3929547`, therefore its ID is `3929547`. We can fetch a dataset using either its ID or its URL.
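The rule for deriving the ID from a DOI can be sketched in plain Python. The helper name `zenodo_id` is hypothetical and not part of *fairly*; it simply assumes that the ID is the trailing run of digits in a DOI or record URL.

```python
import re


def zenodo_id(reference: str) -> str:
    """Return the trailing numeric ID of a Zenodo DOI or record URL.

    Hypothetical helper for illustration only; it is not part of fairly.
    """
    # Match the last run of digits, allowing an optional trailing slash.
    match = re.search(r"(\d+)/?$", reference.strip())
    if match is None:
        raise ValueError(f"No numeric ID found in {reference!r}")
    return match.group(1)


id_from_doi = zenodo_id("10.5281/zenodo.3929547")
id_from_url = zenodo_id("https://zenodo.org/records/3929547")
print(id_from_doi, id_from_url)
```

Both calls yield the same string, which matches the fact that a dataset can be fetched by either its ID or its URL.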



## 1. Connect to Zenodo
To connect to data repositories we use clients. A client manages the connection to a specific data repository. We can create a client to connect to Zenodo as follows:

In [7]:
import fairly

zenodo = fairly.client(id="zenodo")

## 2. Connect to a dataset
Now, we can connect to a *public* dataset by calling the `get_dataset()` method and using either the dataset ID or its URL.

In [8]:
# USING ID
dataset = zenodo.get_dataset("3929547") 

# USING URL
dataset = zenodo.get_dataset("https://zenodo.org/records/3929547") 

## 3. Explore dataset's metadata

Once we have made a connection to a dataset, we can access its metadata (as stored in the data repository) through the `metadata` property of the dataset.

In [9]:
# Retrieves metadata from data repository
dataset.metadata

Metadata({'type': 'dataset', 'publication_date': '2020-02-20', 'title': 'Data and R-Scripts for "Quality and timing of crowd-based water level class observations"', 'authors': [Person({'fullname': 'Etter, Simon', 'institution': 'University of Zurich, Department of Geography', 'orcid_id': '0000-0002-7553-9102', 'name': 'Simon', 'surname': 'Etter'}), Person({'fullname': 'Strobl, Barbara', 'institution': 'University of Zurich, Department of Geography', 'orcid_id': '0000-0001-5530-4632', 'name': 'Barbara', 'surname': 'Strobl'}), Person({'fullname': 'Seibert, Jan', 'institution': 'University of Zurich, Department of Geography', 'orcid_id': '0000-0002-6314-2124', 'name': 'Jan', 'surname': 'Seibert'}), Person({'fullname': 'van Meerveld, Ilja (H.J.)', 'institution': 'University of Zurich, Department of Geography', 'orcid_id': '0000-0002-7547-3270', 'name': 'Ilja (H.J.)', 'surname': 'van Meerveld'})], 'description': '<p>This are the data and the R-scripts used for the manuscript &quot;Quality a

## 4. List dataset's files

We can list the files of a dataset using the `files` property. The result is a Python dictionary in which each file name is a key. In this case, the dataset contains only one file.

In [10]:
# Lists files (data) associated to the dataset
files = dataset.files

print("There are", len(files), "files in this dataset")

print(files)

There are 1 files in this dataset
{'DataForUploadToZenodo.zip': 'DataForUploadToZenodo.zip'}
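Because the result behaves like a dictionary keyed by file name, ordinary dictionary operations can be used to select files. The sketch below uses a plain `dict` as a stand-in for `dataset.files`, copied from the output above, so it runs without a network connection; the helper `names_with_suffix` is hypothetical and not part of *fairly*.

```python
# Stand-in for `dataset.files`, mirroring the output printed above.
# With a live connection, `dataset.files` would be used instead (assumption:
# it supports iteration over file names like a regular dictionary).
files = {"DataForUploadToZenodo.zip": "DataForUploadToZenodo.zip"}


def names_with_suffix(files, suffix):
    """Return the sorted file names that end with the given suffix (case-insensitive)."""
    return sorted(name for name in files if name.lower().endswith(suffix.lower()))


zip_names = names_with_suffix(files, ".zip")
csv_names = names_with_suffix(files, ".csv")
print(zip_names)
print(csv_names)
```

A filter like this becomes more useful for datasets with many files, where only some of them need to be downloaded.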


## 5. Download a file

We can download a single file from the dataset by selecting it by name, for example `'DataForUploadToZenodo.zip'`.

> The `path` parameter can be used to define where to store the file; otherwise, the file will be stored in the working directory.


In [5]:
# Select a file to download from the dataset
single_file =  dataset.files['DataForUploadToZenodo.zip'] # missing updating the manifest

# download a file
zenodo.download_file(single_file, path="./from-zenodo")

'DataForUploadToZenodo.zip'

## 6. Download a dataset

We can also download all files and metadata of a dataset using the `store()` method. We need to provide a path to a directory in which to store the dataset. If the directory does not exist, it will be created.

In [11]:
# This will download about 278 MBs
dataset.store("./quality") # use extract=True for unzipping


<fairly.dataset.local.LocalDataset at 0x7f5250515ba0>
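If the dataset was stored without `extract=True`, the downloaded archive can still be unpacked afterwards with Python's standard `zipfile` module. The following is a minimal sketch, not part of *fairly*: the helper `extract_archives` is hypothetical, and the demonstration builds a tiny archive in a temporary directory so that it runs without downloading anything. For the dataset above, the call would be `extract_archives("./quality")`.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_archives(directory):
    """Extract every .zip file found directly in `directory`.

    Each archive is unpacked into a sibling folder named after the archive
    (for example, `sample.zip` goes to `sample/`). Hypothetical helper.
    """
    extracted = []
    for archive in sorted(Path(directory).glob("*.zip")):
        target = archive.with_suffix("")  # drop the .zip extension
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(target)
    return extracted


# Demonstration with a small archive created on the fly.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.zip"
    with zipfile.ZipFile(sample, "w") as zf:
        zf.writestr("readme.txt", "hello")

    folders = extract_archives(tmp)
    folder_names = [folder.name for folder in folders]
    extracted_text = (folders[0] / "readme.txt").read_text()

print(folder_names, extracted_text)
```

Only extract archives from sources you trust, since `extractall()` writes whatever paths the archive contains.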