A package to create, publish and clone research datasets.
fairly requires Python 3.8 or later and ruamel.yaml version 0.17.26 or later. It can be installed directly from PyPI or conda-forge.
# Using pip
pip install fairly
# Using Anaconda or Miniconda
conda install conda-forge::fairly
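The version requirements stated above can be checked before installing. The following is a minimal sketch, not part of fairly: the helper names `parse_version` and `meets_minimum` are hypothetical, and the comparison only looks at the first three numeric components rather than implementing full PEP 440 ordering.

```python
import re
import sys

# Minimums taken from the requirements stated above.
MIN_PYTHON = (3, 8)
MIN_RUAMEL = "0.17.26"


def parse_version(text):
    """Return the first three numeric components, e.g. '0.17.26' -> (0, 17, 26)."""
    return tuple(int(part) for part in re.findall(r"\d+", text)[:3])


def meets_minimum(installed, minimum):
    """Compare versions numerically; plain string comparison would rank '0.17.4' above '0.17.26'."""
    return parse_version(installed) >= parse_version(minimum)


python_ok = sys.version_info[:2] >= MIN_PYTHON
print("Python version OK:", python_ok)

try:
    # importlib.metadata is only available on Python 3.8 or later.
    from importlib.metadata import PackageNotFoundError, version

    try:
        ruamel_installed = version("ruamel.yaml")
        print("ruamel.yaml OK:", meets_minimum(ruamel_installed, MIN_RUAMEL))
    except PackageNotFoundError:
        print("ruamel.yaml is not installed yet; pip or conda will pull it in")
except ImportError:
    print("Python is too old to query installed package versions")
```

Installers normally resolve these requirements on their own, so a check like this is mainly useful when diagnosing an existing environment.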
Clone or download the source code:
git clone https://github.com/ITC-CRIB/fairly.git
Go to the root directory:
cd fairly/
Build and install the package using pip:
pip install .
Basic example to create a local research dataset and deposit it to a repository:
import fairly
# Initialize a local dataset
dataset = fairly.init_dataset('/path/dataset')
# Set metadata
dataset.metadata['license'] = 'MIT'
dataset.set_metadata(
title='My dataset',
keywords=['FAIR', 'research', 'data'],
authors=[
'0000-0002-0156-185X',
{'name': 'John', 'surname': 'Doe'}
]
)
# Add data files
dataset.includes.extend([
'README.txt',
'*.csv',
'train/*.jpg'
])
# Save dataset
dataset.save()
# Upload to a data repository
remote_dataset = dataset.upload('zenodo')
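The include list in the example above mixes a literal file name with wildcard patterns. To preview which files such patterns would select, the sketch below expands them with the standard library. It assumes the patterns behave like ordinary glob patterns relative to the dataset directory, which may differ from how fairly resolves them; `preview_includes` is a hypothetical helper, not a fairly function.

```python
import tempfile
from pathlib import Path


def preview_includes(root, patterns):
    """List files under root matching any glob pattern (assumed glob semantics)."""
    root = Path(root)
    matched = set()
    for pattern in patterns:
        for path in root.glob(pattern):
            if path.is_file():
                matched.add(path.relative_to(root).as_posix())
    return sorted(matched)


# Build a throwaway directory that resembles a small dataset.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "train").mkdir()
    for name in ["README.txt", "results.csv", "notes.md",
                 "train/cat.jpg", "train/labels.json"]:
        (root / name).write_text("")

    selected = preview_includes(root, ["README.txt", "*.csv", "train/*.jpg"])
    print(selected)
    # ['README.txt', 'results.csv', 'train/cat.jpg']
```

Files that match no pattern (here `notes.md` and `train/labels.json`) are left out of the preview.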
Basic example to access a remote dataset and store it locally:
import fairly
# Open a remote dataset
dataset = fairly.dataset('doi:10.4121/21588096.v1')
# Get dataset information
dataset.id
>>> {'id': '21588096', 'version': '1'}
dataset.url
>>> 'https://data.4tu.nl/articles/dataset/.../21588096/1'
dataset.size
>>> 33339
len(dataset.files)
>>> 6
dataset.metadata
>>> Metadata({'keywords': ['Earthquakes', 'precursor', ...], ...})
# Update metadata
dataset.metadata['keywords'] = ['Landslides', 'precursor']
dataset.save_metadata()
# Store dataset to a local directory (i.e. clone dataset)
local_dataset = dataset.store('/path/dataset')
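In the example above, the `id` and `version` values mirror the suffix of the DOI string. The sketch below shows one way such an identifier could be split into its parts. It is an illustration only, not fairly's implementation: `split_doi` is a hypothetical helper, and the `.vN` version suffix is a convention of this particular repository rather than a general property of DOIs.

```python
import re

# Optional "doi:" scheme, a registrant prefix such as 10.4121, then the suffix.
DOI_PATTERN = re.compile(
    r"^(?:doi:)?(?P<prefix>10\.\d{4,9})/(?P<suffix>\S+)$",
    re.IGNORECASE,
)


def split_doi(identifier):
    """Split a DOI into prefix, record id and optional version (assumed '.vN' suffix)."""
    match = DOI_PATTERN.match(identifier.strip())
    if match is None:
        raise ValueError(f"Not a DOI: {identifier!r}")

    suffix = match.group("suffix")
    version_match = re.search(r"\.v(\d+)$", suffix)
    if version_match:
        record_id = suffix[:version_match.start()]
        version = version_match.group(1)
    else:
        record_id, version = suffix, None

    return {"prefix": match.group("prefix"), "id": record_id, "version": version}


parts = split_doi("doi:10.4121/21588096.v1")
print(parts)
# {'prefix': '10.4121', 'id': '21588096', 'version': '1'}
```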
Currently, the package supports the following research data management platforms:
All research data repositories based on the listed platforms are supported.
For more details and examples, consult the package documentation.
Unit tests can be run with the pytest command in the root directory.
Read the guidelines to learn how you can contribute to this open-source project.
An extension for JupyterLab is being developed in a separate repository.
Please cite this software as follows:
Girgin, S., Garcia Alvarez, M., & Urra Llanusa, J., fairly: a package to create, publish and clone research datasets [Computer software]
This research is funded by the Dutch Research Council (NWO) Open Science Fund, File No. 203.001.114.
Project members: