Omics Dataset Curation Toolkit (OMD Curation Toolkit) is a suite of programs for downloading and curating the metadata and FASTQ files of public omics datasets. The workflow provides a standardized framework intended to ease the arduous task of curating public omics projects. Although it is centered on the European Nucleotide Archive (ENA), most of the provided tools are generic and can be used to curate datasets from other sources.
For further details, see the following:
- Documentation (https://github.com/tbcgit/omdctk/wiki)
- Installation (https://github.com/tbcgit/omdctk/wiki/Installation)
- Tutorial Full Example (https://github.com/tbcgit/omdctk/wiki/Tutorial-Full-Example)
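To give a sense of the kind of ENA metadata retrieval the toolkit is built around, here is a minimal Python sketch that does not use the toolkit's own programs. It assumes the public ENA Portal API `filereport` endpoint and the `run_accession`, `fastq_ftp` and `fastq_md5` fields; the function names `build_filereport_url` and `parse_filereport` and the accession `PRJEB00000` are hypothetical placeholders chosen for illustration.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed endpoint: the public ENA Portal API file report service.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def build_filereport_url(project_accession,
                         fields=("run_accession", "fastq_ftp", "fastq_md5")):
    """Build a file report URL listing the runs of one project (hypothetical helper)."""
    query = urlencode({
        "accession": project_accession,
        "result": "read_run",
        "fields": ",".join(fields),
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def parse_filereport(tsv_text):
    """Parse the TSV report into dicts, one per run (hypothetical helper).

    Paired-end runs list several files in one cell, separated by ';',
    so the FASTQ locations and checksums are split into lists.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    runs = []
    for row in reader:
        for key in ("fastq_ftp", "fastq_md5"):
            row[key] = [v for v in (row.get(key) or "").split(";") if v]
        runs.append(row)
    return runs
```

The report itself could be fetched with `urllib.request.urlopen(build_filereport_url("PRJEB00000"))` and decoded as text before parsing. For actual curation work, the toolkit's documented programs should be preferred over this sketch.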
If you use OMD Curation Toolkit, please cite: Piquer-Esteban, S., Arnau, V., Diaz, W. et al. OMD Curation Toolkit: a workflow for in-house curation of public omics datasets. BMC Bioinformatics 25, 184 (2024). https://doi.org/10.1186/s12859-024-05803-9
This package has been jointly developed by the Theory, Bioinformatics and Computation (https://www.uv.es/tbc) and Evolutionary Genetics (https://www.uv.es/symbiosis) research groups at the Institute for Integrative Systems Biology (I2SysBio), University of Valencia and Consejo Superior de Investigaciones Científicas (CSIC), Valencia, Spain. SPE is supported by an FPU grant from the Spanish Ministry of Universities (Reference: FPU20/05756).
This package is licensed under the MIT License. See the LICENSE file for details.