The Harmonix Set
Beats, downbeats, and functional structural annotations for 912 Pop tracks.
This repository contains human annotated labels for 912 Western Pop music tracks, gathered by Harmonix.
The full dataset can be found in the dataset directory, which contains the following:
- beats_and_downbeats: Directory with a tab-separated file for each track in the dataset, with the following three fields per line containing beats and downbeats:
beat_time_stamp: The placement of the beat in seconds (and downbeat, if
beat_position_in_bar: The number of beat within a bar (when 1, the beat also represents a downbeat).
bar_number: The number of the bar.
- segments: Directory with a tab-separated file for each track in the dataset, with the following two fields per line containing segmentation data:
boundary_time_stamp: The placement of a functional segmentation boundary in seconds.
label: The label of the segment that starts on the current boundary.
- metadata.tsv: Metadata of the Harmonix Set in a comma-separated file containing the following fields:
- File: File name, used to identify each of the tracks in the dataset.
- Title: Title of the track.
- Artist: Name of the artist of the given track.
- Release: Name of the release (e.g., album, compilation, EP) where the track is found.
- Duration: Duration of the track in seconds.
- BPM: Beats per minute.
- Ratio Bars in 4: Percentage of bars that have 4 beats.
- Time Signature: The time signature of the track.
- Genre: The music genre of the track.
- MusicBrainz Id: The MusicBrainz identifier of the track.
- Acoustid Id: The AcoustID identifier of the current track (when available).
- jams: Directory containing JAMS files, one per track, with beats, downbeats, segmentation, and metadata (using JAMS version 0.3.3).
- YouTube URLs: URLs with the associated YouTube video.
- YouTube Alignment Scores: Alignment scores based on Dynamic Time Warping when aligning audio from YouTube videos to the original audio used to annotate the dataset.
You may find the raw results in the results folder.
These results include song-level segmentation metrics for the entire dataset, using three different types of beat-synchronized Constant-Q Transform features:
- annot_beats.csv: Annotated beats from the Harmonix set.
- korz_beats.csv: Korzeniowski beats from madmom.
- librosa_beats.csv: Beats computed using the default librosa beat tracker.
These results were computed using the following libraries with their default parameters:
- librosa 0.6.3 (on a macOS 10.13.6 with its default CoreAudio MP3 decoder)
- madmom 0.16.1
- mir_eval 0.5
- msaf 0.1.8-dev
A couple of Jupyter notebooks are also included:
- Dataset Analysis: The plots of the original publication  were produced using this notebook, which employs the results discussed above.
- JAMS Creation: This notebook was used to generate the JAMS files of the Harmonix Set.
- Audio Alignment: Notebook containing the code to align the audio from YouTube to the original audio used for annotating the dataset. It uses DTW to generate the final audio files and get an alignment score to get a sense of how close the audio from YouTube is from the original one.
Please, cite the following paper if you're planning to publish results using this dataset:
 Nieto, O., McCallum, M., Davies., M., Robertson, A., Stark, A., Egozy, E., The Harmonix Set: Beats, Downbeats, and Functional Segment Annotations of Western Popular Music, Proc. of the 20th International Society for Music Information Retrieval Conference (ISMIR), Delft, The Netherlands, 2019 (PDF).