# Zenodo

There are two types of Zenodo interfaces. One accesses the public repository (`ZenodoRecord`); the other is intended for testing and accesses the sandbox server (`ZenodoSandboxDeposit`).

The class diagram below shows how they are constructed. First, an abstract Zenodo interface class (`AbstractZenodoInterface`) is defined. The concrete interface classes are then derived from it.

<img src="../../_static/repo_class_diagram.svg"
     alt="Class diagram of the Zenodo repository interfaces"
     style="margin-right: 10px; height: 500px;" />
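
The inheritance pattern described above can be sketched with Python's `abc` module. This is only an illustration of the structure: the class names `AbstractInterfaceSketch`, `PublicSketch` and `SandboxSketch`, the `base_url` property and the `describe` method are hypothetical and are not part of `h5rdmtoolbox`.

```python
from abc import ABC, abstractmethod


class AbstractInterfaceSketch(ABC):
    """Hypothetical stand-in for an abstract repository interface."""

    @property
    @abstractmethod
    def base_url(self) -> str:
        """Server that the concrete interface talks to."""

    def describe(self) -> str:
        # Behaviour shared by all concrete interfaces lives in the abstract class.
        return f"{type(self).__name__} -> {self.base_url}"


class PublicSketch(AbstractInterfaceSketch):
    """Hypothetical stand-in for the interface to the public server."""

    @property
    def base_url(self) -> str:
        return "https://zenodo.org"


class SandboxSketch(AbstractInterfaceSketch):
    """Hypothetical stand-in for the interface to the sandbox server."""

    @property
    def base_url(self) -> str:
        return "https://sandbox.zenodo.org"


for interface in (PublicSketch(), SandboxSketch()):
    print(interface.describe())
```

The abstract class cannot be instantiated on its own; only the concrete subclasses, which supply the server address, can.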
     

## Example usage

The example below will upload an HDF file to the sandbox server:

In [1]:
from h5rdmtoolbox.repository import zenodo
import h5rdmtoolbox as h5tbx

### 1. Initialize a repository

For testing purposes, let's use the sandbox environment of Zenodo (`ZenodoSandboxDeposit`):

In [2]:
repo = zenodo.ZenodoSandboxDeposit(None)

We create a test HDF5 file, which we will later publish in the repository:

In [3]:
with h5tbx.File() as h5:
    h5.create_dataset('velocity', shape=(10, 30), attrs={'units': 'm/s'})
filename = h5.hdf_filename

### 2. Add repository metadata

The repository needs **metadata**. The Zenodo module provides a dedicated class, `Metadata`, for this purpose. It validates the data expected by the Zenodo API. For required and optional fields, refer to the [API](https://developers.zenodo.org/#representation) or read the `Metadata` docstring carefully. Because a `pydantic` class is used as the parent class, invalid or missing parameters lead to errors:

In [4]:
from h5rdmtoolbox.repository.zenodo import metadata
from datetime import datetime

meta = metadata.Metadata(
    version="0.1.0-rc.1+build.1",
    title='[deleteme]h5tbxZenodoInterface',
    description='A toolbox for managing HDF5-based research data management',
    creators=[metadata.Creator(name="Probst, Matthias",
                      affiliation="KIT - ITS",
                      orcid="0000-0001-8729-0482")],
    contributors=[metadata.Contributor(name="Probst, Matthias",
                              affiliation="KIT - ITS",
                              orcid="0000-0001-8729-0482",
                              type="ContactPerson")],
    upload_type='image',
    image_type='photo',
    access_right='open',
    keywords=['hdf5', 'research data management', 'rdm'],
    publication_date=datetime.now(),
    embargo_date='2020'
)

Finally, make the changes effective by setting the metadata:

In [5]:
repo.set_metadata(meta)
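
To illustrate why validating metadata up front is useful, here is a minimal standard-library sketch of the kind of check a validating metadata class performs. It is not the `Metadata` class of the toolbox: `MetadataSketch` and `ALLOWED_ACCESS_RIGHTS` are hypothetical names, and the set of allowed `access_right` values is an assumption taken from the Zenodo deposit API documentation.

```python
from dataclasses import dataclass

# Assumption: access levels as listed in the Zenodo deposit API documentation.
ALLOWED_ACCESS_RIGHTS = {"open", "embargoed", "restricted", "closed"}


@dataclass
class MetadataSketch:
    """Hypothetical, heavily reduced stand-in for a validating metadata class."""
    title: str
    access_right: str = "open"

    def __post_init__(self):
        # Reject bad input when the object is created, not when it is uploaded.
        if not self.title.strip():
            raise ValueError("title must not be empty")
        if self.access_right not in ALLOWED_ACCESS_RIGHTS:
            raise ValueError(f"invalid access_right: {self.access_right!r}")


try:
    MetadataSketch(title="[deleteme]h5tbxZenodoInterface", access_right="public")
except ValueError as err:
    print(f"rejected: {err}")
```

The point of the design is that a typo in a field value surfaces locally, before any request is sent to the server.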

### 3. Upload files

*Any* file can be added (uploaded) by calling `upload_file(...)`, whether it is a plain text, CSV or binary file. It is often advisable to describe the content in an additional file and thereby provide more (machine-interpretable) information. JSON-LD files are best suited for this, because the format allows describing file content and context in a standardized way.

One of the parameters of `upload_file(...)` is `metamapper`. It expects a function that extracts meta information from the input file. If the parameter `auto_map_hdf` is `True` and an HDF5 file is passed (detected by the file suffixes `.hdf`, `.hdf5` and `.h5`), the built-in converter function is called, which writes a JSON-LD file.

When a `metamapper` function is provided, the target file and the metadata file created by that function are uploaded together.

Adding a metadata file is especially beneficial for large binary files: the user can download and explore the metadata file quickly.
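
The choice between a user-supplied mapper and the built-in HDF converter can be pictured with the following sketch. It is not the library's implementation: `select_metamapper` and `builtin_mapper` are hypothetical names, and giving an explicitly passed `metamapper` precedence over `auto_map_hdf` is an assumption.

```python
import pathlib

# Suffixes named in the text above.
HDF_SUFFIXES = {".hdf", ".hdf5", ".h5"}


def select_metamapper(filename, metamapper=None, auto_map_hdf=True,
                      builtin_mapper=None):
    """Return the function that should create the metadata file, or None."""
    if metamapper is not None:
        # Assumption: an explicitly passed mapper takes precedence.
        return metamapper
    if auto_map_hdf and pathlib.Path(filename).suffix in HDF_SUFFIXES:
        return builtin_mapper
    return None


def fake_builtin(filename):
    """Placeholder for the built-in HDF-to-JSON-LD converter."""
    return pathlib.Path(filename).with_suffix(".json")


print(select_metamapper("data.hdf", builtin_mapper=fake_builtin) is fake_builtin)
print(select_metamapper("table.csv", builtin_mapper=fake_builtin) is None)
```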

In [7]:
repo.upload_file(filename)

List the files just uploaded by requesting the current filenames in the repository:

In [8]:
repo.get_filenames()

{'tmp0.json': {'id': '11220027-8174-4d87-9c9e-3a718536af21',
  'filename': 'tmp0.json',
  'filesize': 1074,
  'checksum': 'ef88b0f5be47cb450ab5fd6d91ccffeb',
  'links': {'self': 'https://sandbox.zenodo.org/api/deposit/depositions/70629/files/11220027-8174-4d87-9c9e-3a718536af21',
   'download': 'https://sandbox.zenodo.org/api/records/70629/draft/files/tmp0.json/content'}},
 'tmp0.hdf': {'id': '126ebe06-441d-47ab-9cd6-307db0e4150c',
  'filename': 'tmp0.hdf',
  'filesize': 6944,
  'checksum': '1169b5c09635913ac8e6606d86c40692',
  'links': {'self': 'https://sandbox.zenodo.org/api/deposit/depositions/70629/files/126ebe06-441d-47ab-9cd6-307db0e4150c',
   'download': 'https://sandbox.zenodo.org/api/records/70629/draft/files/tmp0.hdf/content'}}}
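
Each entry in the listing above carries a `checksum`. Assuming it is the hexadecimal MD5 digest of the file content (the 32-character values suggest this, but the listing does not state the algorithm), a local file could be compared against its entry as follows. The helper names `md5_of_file` and `matches_listing` are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so that large binary files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, entry):
    """Compare a local file with one value of the dictionary shown above.

    Assumption: entry["checksum"] is a hex MD5 digest of the file content.
    """
    return md5_of_file(path) == entry["checksum"]
```

With the objects from this example, the check would read `matches_listing(filename, repo.get_filenames()['tmp0.hdf'])`.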

### 3b. Custom metamapper

We could, of course, write and use our own metadata extraction function, like so:

In [12]:
import pathlib

def my_meta_mapper(filename):
    """Very primitive and not a JSON-LD file, but
    it serves the purpose of demonstration."""
    txt_filename = pathlib.Path(filename).with_suffix('.txt')
    with open(txt_filename, 'w') as f:
        f.write(f'filename: {filename}')
    return txt_filename
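
A mapper that follows the JSON-LD recommendation more closely might look like the sketch below. It keeps the same contract as `my_meta_mapper` (take a filename, write a metadata file, return its path). The function name `jsonld_meta_mapper`, the `.jsonld` suffix and the choice of schema.org terms are assumptions made for illustration; the converter built into the toolbox produces different, richer content.

```python
import json
import pathlib


def jsonld_meta_mapper(filename):
    """Write a small JSON-LD description next to `filename` and return its path."""
    source = pathlib.Path(filename)
    # Assumption: a distinct suffix is used so other metadata files are not overwritten.
    jsonld_filename = source.with_suffix(".jsonld")
    description = {
        "@context": "https://schema.org",
        "@type": "DataDownload",
        "name": source.name,
        "contentSize": f"{source.stat().st_size} B",
    }
    with open(jsonld_filename, "w") as f:
        json.dump(description, f, indent=2)
    return jsonld_filename
```

It would be passed to `upload_file` through the `metamapper` parameter in the same way as `my_meta_mapper`.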

In [13]:
repo.upload_file(filename, metamapper=my_meta_mapper, overwrite=True)

Check that it worked:

In [15]:
repo.get_filenames()

{'tmp0.json': {'id': '11220027-8174-4d87-9c9e-3a718536af21',
  'filename': 'tmp0.json',
  'filesize': 1074,
  'checksum': 'ef88b0f5be47cb450ab5fd6d91ccffeb',
  'links': {'self': 'https://sandbox.zenodo.org/api/deposit/depositions/70629/files/11220027-8174-4d87-9c9e-3a718536af21',
   'download': 'https://sandbox.zenodo.org/api/records/70629/draft/files/tmp0.json/content'}},
 'tmp0.hdf': {'id': '5cdc2613-352c-426f-b673-bf9c2e6206cc',
  'filename': 'tmp0.hdf',
  'filesize': 6944,
  'checksum': '1169b5c09635913ac8e6606d86c40692',
  'links': {'self': 'https://sandbox.zenodo.org/api/deposit/depositions/70629/files/5cdc2613-352c-426f-b673-bf9c2e6206cc',
   'download': 'https://sandbox.zenodo.org/api/records/70629/draft/files/tmp0.hdf/content'}},
 'tmp0.txt': {'id': 'bc9558f1-a502-484f-b660-967cbd31697b',
  'filename': 'tmp0.txt',
  'filesize': 86,
  'checksum': 'fafb067c01a902a14330e5aedd4b1b0a',
  'links': {'self': 'https://sandbox.zenodo.org/api/deposit/depositions/70629/files/bc9558f1-a502