Keep scientific data under control with git and git-annex

 ____            _             _                   _ 
|  _ \    __ _  | |_    __ _  | |       __ _    __| |
| | | |  / _` | | __|  / _` | | |      / _` |  / _` |
| |_| | | (_| | | |_  | (_| | | |___  | (_| | | (_| |
|____/   \__,_|  \__|  \__,_| |_____|  \__,_|  \__,_|
                                              Read me

License: MIT

10000ft overview

DataLad makes data management and data distribution more accessible. To do that, it stands on the shoulders of Git and Git-annex to deliver a decentralized system for data exchange. This includes automated ingestion of data from online portals and exposing it in readily usable form as Git(-annex) repositories, so-called datasets. The actual data storage and permission management, however, remains with the original data providers.

The full documentation is available at:


A number of extensions provide additional functionality for DataLad. Extensions are separate packages that are installed in addition to DataLad. To install DataLad customized for a particular domain, you can simply install an extension directly, and DataLad itself will be installed automatically with it. Here is a list of known extensions:


The documentation of this project is found here:

All bugs, concerns and enhancement requests for this software can be submitted here:

If you have a problem or would like to ask a question about how to use DataLad, please submit a question to NeuroStars.org with a datalad tag. NeuroStars.org is a platform similar to StackOverflow but dedicated to neuroinformatics.

All previous DataLad questions are available here:


Debian-based systems

On Debian-based systems, we recommend enabling NeuroDebian, from which we provide recent releases of DataLad. Once it is enabled, just run:

apt-get install datalad

Other Linux distributions, OSX via pip

Before you install this package, please make sure that a recent version of git-annex is installed. Afterwards, install the latest version of datalad from PyPI. Using a dedicated virtualenv is recommended:

# create and enter a new virtual environment (optional)
virtualenv --python=python3 ~/env/datalad
. ~/env/datalad/bin/activate

# install from PyPI
pip install datalad
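
Because the steps above assume that git-annex is already installed, a quick check beforehand can confirm that the prerequisite is met. The following is a minimal sketch rather than anything shipped with DataLad: `have_cmd` is a hypothetical helper name introduced here for illustration, and the only git-annex call used is `git-annex version`.

```shell
# have_cmd is a hypothetical helper (not provided by DataLad or git-annex):
# it succeeds if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd git-annex; then
    # Print the first line of the version report.
    git-annex version | head -n 1
else
    echo "git-annex not found on PATH; install it before 'pip install datalad'" >&2
fi
```

The check only confirms that git-annex is present. Whether the reported version is recent enough still has to be judged by the reader.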

By default, installing via pip provides the core functionality of datalad, which covers managing datasets and similar tasks. Additional installation schemes are available through pip install datalad[SCHEME], where SCHEME can be

  • tests to also install the dependencies used by datalad's unit-test battery
  • full to install all dependencies.

There is also a Singularity container available. The latest release version can be obtained by running:

singularity pull shub://datalad/datalad




See CONTRIBUTING.md if you are interested in internals or contributing to the project.


DataLad development is supported by a US-German collaboration in computational neuroscience (CRCNS) project "DataGit: converging catalogues, warehouses, and deployment logistics into a federated 'data distribution'" (Halchenko/Hanke), co-funded by the US National Science Foundation (NSF 1429999) and the German Federal Ministry of Education and Research (BMBF 01GQ1411). Additional support is provided by the German federal state of Saxony-Anhalt and the European Regional Development Fund (ERDF), Project: Center for Behavioral Brain Sciences, Imaging Platform. This work is further facilitated by the ReproNim project (NIH 1P41EB019936-01A1).