Assets 2

OTMM Makam Recognition Dataset

Erratum: Several annotations have been corrected since this release. Please refer to the latest release or the master branch for the corrections.

This repository hosts the dataset designed to test makam recognition methodologies on Ottoman-Turkish makam music. It is composed of 50 recording from each of the 20 most common makams in CompMusic Project's Dunya Ottoman-Turkish Makam Music collection. Currently the dataset is the largest makam recognition dataset.

Please cite the publication below, if you use this dataset in your work:

Karakurt, A., Şentürk S., & Serra X. (2016). MORTY: A Toolbox for Mode Recognition and Tonic Identification. 3rd International Digital Libraries for Musicology Workshop. New York, NY

The recordings are selected from commercial recordings carefully such that they cover diverse musical forms, vocal/instrumentation settings and recording qualities (e.g. historical recordings vs. contemporary recordings). Each recording in the dataset is identified by an 16-character long unique identifier called MBID, hosted in MusicBrainz. The makam and the tonic of each recording is annotated in the file annotations.json.

The audio related data in the test dataset is organized by each makam in the folder data. Due to copyright reasons, we are unable to distribute the audio. Instead we provide the predominant melody of each recording, computed by a state-of-the-art predominant melody extraction algorithm optimized for OTMM culture. These features are saved as text files (with the paths data/[makam]/[mbid].pitch) of single column that contains the frequency values. The timestamps are removed to reduce the filesizes. The step size of the pitch track is 0.0029 seconds (an analysis window of 128 sample hop size of an mp3 with 44100 Hz sample rate), with which one can recompute the timestamps of samples.

Moreover the metadata of each recording is available in the repository, crawled from MusicBrainz using an open source tool developed by us. The metadata files are saved as data/[makam]/[mbid].json.

For reproducability purposes we note the version of all tools we have used to generate this dataset in the file algorithms.json.

A complementary toolbox for this dataset is MORTY, which is a mode recogition and tonic identification toolbox. It can be used and optimized for any modal music culture. Further details are explained in the publication above.

For more information, please contact the authors.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.