neomirdata

Common loaders for Music Information Retrieval (MIR) datasets. See the project documentation for the API reference.


This library is a fork of mirdata and provides tools for working with common MIR datasets, including tools for:

  • downloading datasets to a common location and format
  • validating that the files for a dataset are all present
  • loading annotation files to a common format, consistent with the format required by mir_eval
  • parsing track-level metadata for detailed evaluations
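To make the validation step more concrete, here is a minimal sketch of checking a dataset folder against an index of expected files. It is not the library's implementation: the `file_checksum` and `validate_files` helpers, the index layout (relative path mapped to checksum), and the choice of SHA-256 are all assumptions made for illustration, using only the standard library.

```python
import hashlib
import tempfile
from pathlib import Path


def file_checksum(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_files(data_home: Path, index: dict) -> tuple:
    """Compare files under data_home against a {relative_path: checksum} index.

    Returns (missing, invalid): paths that do not exist, and paths whose
    checksum differs from the one recorded in the index.
    """
    missing, invalid = [], []
    for relative_path, expected in index.items():
        path = data_home / relative_path
        if not path.is_file():
            missing.append(relative_path)
        elif file_checksum(path) != expected:
            invalid.append(relative_path)
    return missing, invalid


# Demo on a throwaway directory with a hypothetical two-file index:
# one file is present and intact, the other was never downloaded.
with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp)
    (home / "audio.wav").write_bytes(b"fake audio")
    index = {
        "audio.wav": hashlib.sha256(b"fake audio").hexdigest(),
        "melody.csv": hashlib.sha256(b"fake annotation").hexdigest(),
    }
    missing, invalid = validate_files(home, index)
    print("missing:", missing)
    print("invalid:", invalid)
```

Reporting missing and corrupted files separately makes it clear whether a download was incomplete or a file was altered after it arrived.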

Maintainer

Igor Bogicevic (igor.bogicevic@gmail.com) - @probablyrobot

Installation

To install, simply run:

pip install neomirdata

Quick example

import mirdata

orchset = mirdata.initialize('orchset')
orchset.download()  # download the dataset
orchset.validate()  # validate that all the expected files are there

example_track = orchset.choice_track()  # choose a random example track
print(example_track)  # see the available data

See the documentation for more examples and the API reference.

Currently supported datasets

Supported datasets include AcousticBrainz, DALI, GuitarSet, MAESTRO, and TinySOL, among many others.

For the complete list of supported datasets, see the documentation.

Citing

This project is a fork of mirdata. When using this library, please cite both the original mirdata paper and this fork:

Original mirdata paper:

"mirdata: Software for Reproducible Usage of Datasets"
Rachel M. Bittner, Magdalena Fuentes, David Rubinstein, Andreas Jansson, Keunwoo Choi, and Thor Kell
in International Society for Music Information Retrieval (ISMIR) Conference, 2019
@inproceedings{
  bittner_fuentes_2019,
  title={mirdata: Software for Reproducible Usage of Datasets},
  author={Bittner, Rachel M and Fuentes, Magdalena and Rubinstein, David and Jansson, Andreas and Choi, Keunwoo and Kell, Thor},
  booktitle={International Society for Music Information Retrieval (ISMIR) Conference},
  year={2019}
}

When working with datasets, please cite both the original mirdata paper and the dataset itself. Each dataset's reference is available from its loader via the cite() method.
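As a rough illustration of the cite() method, the sketch below wraps the lookup in a small helper. `print_dataset_citation` is a hypothetical name, not part of the library. It assumes the package is importable as `mirdata` and that a loader is created with `mirdata.initialize`, as in the quick example, and that cite() prints the reference rather than returning it, as it does in upstream mirdata.

```python
def print_dataset_citation(dataset_name: str) -> None:
    """Print the reference for one dataset via its loader's cite() method.

    Hypothetical helper for illustration; the import is done lazily so the
    function can be defined even where the package is not installed.
    """
    import mirdata  # assumed import name, as in the quick example

    dataset = mirdata.initialize(dataset_name)
    dataset.cite()  # assumed to print the dataset's reference


# Example usage (requires the package to be installed):
# print_dataset_citation("orchset")
```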

Contributing

We welcome contributions to this library, especially new datasets. Please see the contributing guidelines for details.
