# Archiving a Dataset

With **fairly**, we can remotely archive and edit datasets in a user account. Users can prepare a dataset for archiving by editing its metadata, defining which files belong to it, and uploading them to a data repository. One of the purposes of **fairly** is to *remove the need to prepare metadata and data separately for every repository to which a dataset will be archived*, which saves time and effort and lowers the barriers to practicing open science.
This tutorial shows what is possible using the 4TU.ResearchData repository. The procedure is the same for Zenodo.

**Requirements:**

* A 4TU.ResearchData account
* A personal access token. See [configuring access token](https://jupyterfair.readthedocs.io/en/latest/package/account-token.html) if you don't have one.
* Files to be archived. We will use a hypothetical case in this tutorial.
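
Because a personal access token grants write access to an account, it is usually better not to type it directly into a notebook. The following is a minimal sketch of one way to read it from an environment variable instead. The helper `read_token` and the variable name `FOURTU_TOKEN` are hypothetical and not part of **fairly**.

```python
import os


def read_token(variable="FOURTU_TOKEN", environ=None):
    """Return the access token stored in an environment variable.

    The default name FOURTU_TOKEN is a hypothetical choice; any name works.
    """
    # Allow a custom mapping to be passed in, which makes the helper easy to test.
    environ = os.environ if environ is None else environ
    token = environ.get(variable, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {variable} environment variable to your access token"
        )
    return token


# Usage (assumes the variable was set in the shell that started Python):
# token = read_token()
```

The returned string can then be passed as the `token` argument when creating a client, in place of a hard-coded value.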

> For this tutorial, we assume that our goal is to archive a dataset that we previously archived in Zenodo to an account in 4TU.ResearchData. We will use the dataset [Quality and timing of crowd-based water level class observations](https://zenodo.org/record/3929547#.YwdoitJBy3c) as an example.
   

## 1. Download the Zenodo dataset

First, we will download the dataset [Quality and timing of crowd-based water level class observations](https://zenodo.org/record/3929547#.YwdoitJBy3c) using its URL.

In [21]:
import fairly

# create a zenodo client
zenodo = fairly.client(id="zenodo")

# connect to a dataset
source_dataset = zenodo.get_dataset("https://zenodo.org/record/3929547#.YxAeJNJBxhF") 

# download the dataset to a local directory
source_dataset.store("./from-zenodo")
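
The record URLs used in this tutorial end in a `#...` fragment copied from the browser address bar. A fragment is handled by the browser and does not identify the record, so it can be removed before the URL is stored or shared. The sketch below uses only the Python standard library. The helper name `clean_record_url` is hypothetical, and whether **fairly** needs this normalization is not assumed here.

```python
from urllib.parse import urldefrag


def clean_record_url(url):
    """Strip surrounding whitespace and the '#...' fragment from a record URL."""
    return urldefrag(url.strip()).url


# Both variants of the URL used in this tutorial reduce to the same address.
print(clean_record_url("https://zenodo.org/record/3929547#.YwdoitJBy3c"))
print(clean_record_url("https://zenodo.org/record/3929547#.YxAeJNJBxhF "))
# Both lines print: https://zenodo.org/record/3929547
```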

## 2. Editing Metadata

If we wish, we can edit the metadata of a dataset as follows. For example, we can add some keywords.

In [22]:
# zenodo dataset metadata
print(source_dataset.metadata)


In [25]:
# edit keywords
source_dataset.metadata["keywords"] = ["CrowdWater", "Hydrology", "fairly"]

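
Assigning a new list to `"keywords"` replaces whatever keywords the dataset already had. If the goal is to keep the existing keywords and only add missing ones, a small helper along these lines may be useful. This is a sketch that uses a plain `dict` as a stand-in for the metadata object. The helper `add_keywords` and the sample values are hypothetical.

```python
def add_keywords(metadata, new_keywords):
    """Append keywords that are not yet present (case-insensitive), keeping order."""
    existing = list(metadata.get("keywords") or [])
    seen = {keyword.lower() for keyword in existing}
    for keyword in new_keywords:
        if keyword.lower() not in seen:
            existing.append(keyword)
            seen.add(keyword.lower())
    metadata["keywords"] = existing
    return metadata


# A plain dict stands in for the dataset metadata (an assumption for illustration).
metadata = {"title": "Example", "keywords": ["hydrology"]}
add_keywords(metadata, ["CrowdWater", "Hydrology", "fairly"])
print(metadata["keywords"])  # ['hydrology', 'CrowdWater', 'fairly']
```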

## 3. Archive to 4TU.ResearchData

Now we can create a new dataset in a 4TU.ResearchData account as follows.

In [27]:
import fairly

# create a client using a personal access token
# ("4tu" is the repository id for 4TU.ResearchData)
fourtu = fairly.client(id="4tu", token="my-4tu-token")

# serialize original metadata 
metadata = source_dataset.metadata.serialize()

# create a dataset and upload the metadata to 4TU.ResearchData
dataset_4tu = fourtu.create_dataset(metadata)



If you log in to your 4TU.ResearchData account, you should see a new dataset entry with the metadata we collected from Zenodo. Now we can upload files to the dataset.

In [20]:
# upload files
fourtu.upload_file(dataset_4tu, "./from-zenodo/DataForUploadToZenodo.zip")

'./from-zenodo/DataForUploadToZenodo.zip'

> We could continue uploading files or editing the metadata in a similar way.
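
When a dataset contains many files, uploading them one by one quickly becomes tedious. The sketch below walks a directory and calls an upload function for every file it finds. The helper `upload_all` is hypothetical. It takes the upload step as a callable so that it does not depend on any particular client.

```python
from pathlib import Path


def upload_all(directory, upload):
    """Call upload(path) for every regular file under directory, in sorted order.

    Returns the list of paths that were passed to upload.
    """
    uploaded = []
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            upload(str(path))
            uploaded.append(str(path))
    return uploaded


# Usage with the objects from this tutorial:
# upload_all("./from-zenodo", lambda p: fourtu.upload_file(dataset_4tu, p))
```

Passing the upload step as a function also makes it easy to do a dry run first, for example with `upload_all("./from-zenodo", print)`, to check which files would be sent.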