A repository as an entry point for the DAta Tag Suite

This repository provides an entry point for the DAta Tag Suite (DATS) repositories.

  • schema: This repository contains the JSON-schemas that constitute the DATS model.
  • context: This repository contains the JSON-LD context files that can be used in combination with the JSON-schemas to produce a JSON-LD representation of dataset descriptions. We provide three sets of context files, based on the following vocabularies: schema.org, the OBO Foundry ontologies, and the DCAT vocabulary.
  • examples: This repository contains example instance data files (JSON or JSON-LD files) representing different datasets using the DATS model.
  • WG3-MetadataSpecifications: This repository is a fork of the original repository where DATS development started, as part of the bioCADDIE project.
  • dats-tools: This repository contains Python code to validate DATS metadata.
  • ETL-Public: This repository is a fork of the initial repository by the NIH Data Commons Team Phosphorous, which held the ETL pipeline to convert GTEx data into DATS. This pipeline has since moved to the crosscut-metadata repository mentioned below.
  • crosscut-metadata: This repository is a fork of the NIH Data Commons crosscut-metadata repository, containing the ETL pipelines to convert AGR, GTEx, and TOPMed data into DATS.
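
As a rough illustration of how a context file from the context repository might be combined with a plain JSON instance to give a JSON-LD representation, here is a minimal sketch. The context shown is a hypothetical, heavily simplified stand-in (the real DATS context files are larger and are not reproduced here), and the helper `to_json_ld` and the sample record are assumptions made for this example only.

```python
import json

# Hypothetical, simplified context that maps illustrative keys to schema.org IRIs.
# It is NOT one of the actual DATS context files.
EXAMPLE_CONTEXT = {
    "sdo": "https://schema.org/",
    "Dataset": "sdo:Dataset",
    "title": "sdo:name",
    "description": "sdo:description",
}


def to_json_ld(instance: dict, context: dict, type_name: str) -> dict:
    """Return a new dict: the plain JSON instance plus JSON-LD keywords."""
    document = {"@context": context, "@type": type_name}
    document.update(instance)  # the original instance is left unmodified
    return document


# Illustrative record only; not taken from the examples repository.
dataset = {"title": "Example dataset", "description": "Illustrative record only."}
json_ld = to_json_ld(dataset, EXAMPLE_CONTEXT, "Dataset")
print(json.dumps(json_ld, indent=2))
```

In this sketch the instance data stays unchanged and the semantic mapping lives entirely in the context, so the same JSON could in principle be paired with a context based on a different vocabulary.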

DATS Publications

  • Xiaoling Chen, Anupama E Gururaj, Burak Ozyurt, Ruiling Liu, Ergin Soysal, Trevor Cohen, Firat Tiryaki, Yueling Li, Nansu Zong, Min Jiang, Deevakar Rogith, Mandana Salimi, Hyeon-eui Kim, Philippe Rocca-Serra, Alejandra Gonzalez-Beltran, Claudiu Farcas, Todd Johnson, Ron Margolis, George Alter, Susanna-Assunta Sansone, Ian M Fore, Lucila Ohno-Machado, Jeffrey S Grethe, Hua Xu; DataMed – an open source discovery index for finding biomedical datasets, Journal of the American Medical Informatics Association, Volume 25, Issue 3, 1 March 2018, Pages 300–308, https://doi.org/10.1093/jamia/ocx121
  • Alejandra N Gonzalez-Beltran, John Campbell, Patrick Dunn, Diana Guijarro, Sanda Ionescu, Hyeoneui Kim, Jared Lyle, Jeffrey Wiser, Susanna-Assunta Sansone, Philippe Rocca-Serra; Data discovery with DATS: exemplar adoptions and lessons learned, Journal of the American Medical Informatics Association, Volume 25, Issue 1, 1 January 2018, Pages 13–16, https://doi.org/10.1093/jamia/ocx119
  • Lucila Ohno-Machado, Susanna-Assunta Sansone, George Alter, Ian Fore, Jeffrey Grethe, Hua Xu, Alejandra Gonzalez-Beltran, Philippe Rocca-Serra, Anupama E Gururaj, Elizabeth Bell, Ergin Soysal, Nansu Zong & Hyeon-eui Kim; Finding useful data across multiple biomedical data repositories using DataMed; Nature Genetics 49, 816–819 (2017) https://doi.org/10.1038/ng.3864
  • Susanna-Assunta Sansone#, Alejandra Gonzalez-Beltran#, Philippe Rocca-Serra#, George Alter, Jeffrey S. Grethe, Hua Xu, Ian M. Fore, Jared Lyle, Anupama E. Gururaj, Xiaoling Chen, Hyeon-eui Kim, Nansu Zong, Yueling Li, Ruiling Liu, I. Burak Ozyurt & Lucila Ohno-Machado; DATS, the data tag suite to enable discoverability of datasets; Scientific Data volume 4, Article number: 170059 (2017) DOI: 10.1038/sdata.2017.59 (# = equal contribution). Pre-print: bioRxiv DOI: 10.1101/103143