The Open Data Lab is an initiative to provide infrastructure for reproducible workflows around open data. It is spearheaded by the Data Science Institute at the University of Virginia. Partner organizations include the UVA Library, the Center for Open Science, and the Public Library of Science.

Getting Started


  • Reference Data - Provision of a large corpus of open research and other data (CC0) and associated analytics (OSI-approved licence) across disciplines to be an exemplar for the representation of open structured and unstructured reference data.
  • Training - Use the reference data and analytics to train the next generation of data scientists and raise literacy around data at UVA and elsewhere.
  • Reproducibility - Facilitate scientific reproducibility through execution of workflows associated with the data and analytics.
  • Productivity - Science is currently a highly inefficient process. Provision of public executable workflows can accelerate the process.
  • Public engagement - Exploration pipelines built on open data are a powerful means to engage the broader public, both around open data and data science in general and around the issues they can help address.
  • Broad Recognition - Establish the Open Data Lab as a valuable community resource.

We are currently in closed beta testing. If you are interested in the project, please watch us on GitHub and email us.

Goals for closed beta:

  • Establish storage platform
  • Establish computation platform
  • Establish discovery platform
  • Accumulate a corpus of reference data