mirdata is an open-source Python library that provides tools for working with common Music Information Retrieval (MIR) datasets, including tools for:
- downloading datasets to a common location and format
- validating that the files for a dataset are all present
- loading annotation files to a common format, consistent with mir_eval
- parsing track-level metadata for detailed evaluations.
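To make the validation step above concrete, here is a minimal, standard-library sketch of what checking a dataset against a checksum index can look like. It is an illustration only, not mirdata's implementation: the function names `md5_of` and `validate`, the index layout (relative path mapped to an MD5 digest), and the file names are all assumptions made for this example.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate(root: Path, index: dict) -> tuple:
    """Compare files under `root` with a hypothetical {relative_path: md5} index.

    Returns two lists: paths that are missing, and paths whose checksum
    differs from the one recorded in the index.
    """
    missing, invalid = [], []
    for rel_path, expected in index.items():
        path = root / rel_path
        if not path.is_file():
            missing.append(rel_path)
        elif md5_of(path) != expected:
            invalid.append(rel_path)
    return missing, invalid


# Demonstration on a throwaway directory with one good file,
# one modified file, and one file that was never written.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a.wav").write_bytes(b"audio")
    (root / "b.csv").write_bytes(b"changed")
    index = {
        "a.wav": hashlib.md5(b"audio").hexdigest(),
        "b.csv": hashlib.md5(b"original").hexdigest(),
        "c.txt": hashlib.md5(b"notes").hexdigest(),
    }
    missing, invalid = validate(root, index)
    print("missing:", missing)
    print("invalid:", invalid)
```

Reporting missing files and checksum mismatches separately lets a user tell an incomplete download apart from a corrupted or modified one.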
pip install mirdata
For more details on how to use the library, see the tutorial.
If you are using the library for your work, please cite the version you used as indexed at Zenodo:
If you refer to mirdata's design principles, motivation, etc., please cite the following paper [1]:
When working with datasets, please cite the version of mirdata that you are using (given by the DOI above) AND include the reference of the dataset, which can be found in the respective dataset loader using the cite() method.
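As a rough sketch of how a dataset reference can be looked up, the helper below wraps the `cite()` method of a dataset loader. It assumes a recent mirdata release in which loaders are created with `mirdata.initialize`; the helper name `print_citation` and the default dataset name `"orchset"` are choices made for this example, not part of the library.

```python
def print_citation(dataset_name: str = "orchset") -> None:
    """Print the reference for one dataset loader (hypothetical helper)."""
    # mirdata is imported inside the function so that defining the helper
    # does not require the package; calling it needs `pip install mirdata`.
    import mirdata

    # Assumption: `initialize` creates the loader for the named dataset.
    dataset = mirdata.initialize(dataset_name)
    # `cite()` is the method mentioned above for finding the dataset reference.
    dataset.cite()
```

Calling `print_citation("orchset")` should print that dataset's reference, which can then be included alongside the mirdata citation.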
We welcome contributions to this library, especially new datasets. Please see contributing for guidelines.
[1] Rachel M. Bittner, Magdalena Fuentes, David Rubinstein, Andreas Jansson, Keunwoo Choi, and Thor Kell. "mirdata: Software for Reproducible Usage of Datasets." In Proceedings of the 20th International Society for Music Information Retrieval (ISMIR) Conference, 2019.