mir-datasets.md

| status | dataset | metadata | contents | with audio |
| --- | --- | --- | --- | --- |
|  | 200DrumMachines | audio samples | 7371 one-shots | yes |
|  | AAM | onsets, pitches, instruments, melody instrument, keys, chords, tempos, beats, segments | 3,000 music tracks (with single instrument multitracks) | yes |
|  | AccoMontage2 | song harmonization and accompaniment arrangement based on a lead melody | None | no |
|  | ACM_MIRUM | tempo | 1410 excerpts (60s) | yes |
|  | AcousticBrainz-Genre | 15-31 genres with 265-745 subgenres | audio features for about 2000000 songs | no |
|  | ADC2004 | predominant pitch | 20 excerpts | yes |
|  | Acoustic Event Dataset | 28 event classes | 5223 audio snippets | yes |
|  | AIST Dance Video Database | street dance videos | 13,940 videos for 60 pieces | yes |
|  | Amg1608 | valence & arousal | 1608 excerpts (30s) | no |
|  | AMT-pilot | structure by multiple annotators | 8 songs | yes |
|  | Automatic Practice Logging | piano practice | 620 segments | yes |
|  | artist20 | 20 artists | 1413 songs | no |
|  | ASAP | aligned MIDI/audio performances and MIDI/XML scores, beats, downbeats, time signatures, key signatures | 1068 MIDI performances, 520 audio performances, 222 scores | yes (see MAESTRO) |
|  | ATEPP | symbolic music MIDI, MusicXML, classification tasks, expressive piano performances | 1742 performances (~1000 hours) by 49 pianists and covers 1580 movements by 25 composers |  |
|  | AudioSet | 632 event classes | 2084320 clips (10s) | no |
|  | bach10 | aligned multitrack MIDI | 10 chorales | yes |
|  | BAF | audio fingerprinting, music monitoring in broadcast | 2,000 tracks from Epidemic Sound and 3,425 TV audio recordings (60s) | yes |
|  | ballroom | 8 genres, tempo, beats, bars / downbeats | 698 excerpts (30s) | yes |
|  | beatboxset1 | percussion annotation | 14 clips | yes |
|  | BPS-FH | Beethoven Piano Sonata with Function Harmony functional annotation | 32 sonatas | no |
|  | C224a | 14 genres | 224 artists | no |
|  | C3ka | 18 genres | 3000 artists | no |
|  | C49ka-C111ka | genres | 48800/110588 artists | no |
|  | CAL10k | tags | 10870 songs | no |
|  | CAL500 | tags | 502 songs | yes |
|  | CarnaticRhythm | sama, beats | 176 pieces | on request |
|  | Chordify Annotator Subjectivity Dataset | chords by 4 annotators | 50 songs | no |
|  | CBFdataset | 4 playing techniques (Chinese Bamboo Flute) | 10 performers | yes |
|  | CCMixter | vocal track, background track | 50 mixes | yes |
|  | ChoCo, the Chord Corpus | chords, keys, knowledge graph | 20K+ songs/pieces | no |
|  | Chopin22 | aligned MIDI | 44 recordings | yes |
|  | ChoralMusicSeparation | JSB chorales, separation | 8.2-hour-long choral music dataset from the JSB Chorales Dataset | yes |
|  | Clotho | 5 descriptive captions | 4981 snippets | yes |
|  | CMMSD | note/rest/transition, onsets, vibrato | 36 excerpts | no |
|  | Coidach | 55 genres | 26420 songs | no |
|  | corpusCOFLA | editorial, predominant melody | 1800 flamenco recordings | no |
|  | covers80 | cover songs | 80 song pairs | yes |
|  | Cross-Composer | 11 composers, piece, key, era, instrumentation | 1100 chromagrams and chord labels | no |
|  | Cross-Era | composer, piece, key, era, instrumentation | 2000 chromagrams and chord labels | no |
|  | Choral Singing Dataset | f0, MIDI | 48 recordings | yes |
|  | Da-TACOS | cover songs | 25000 songs | no |
|  | dadaGP dataset | guitarPro tablatures, encoder and decoder Python tool to and from text and token format, symbolic music generation | a total of 26,181 songs in guitarPro/token format for symbolic music generation | no |
|  | Dataset of synchronised Audio, LyrIcs and vocal notes | aligned notes and lyrics | 5358 songs | no |
|  | DAMP | karaoke performances, aligned lyrics, pronunciation assessment | 34000 monophonic recordings | yes |
|  | Dagstuhl ChoirSet | beats, time-aligned scores, F0 | 81 takes | yes |
|  | DEAM - The MediaEval Database for Emotional Analysis of Music | valence & arousal | 1802 excerpts | yes |
|  | DEAPDataset | valence & arousal, dominance, physiological data | 120 music video excerpts | no |
|  | DESED | 10 audio event classes | approx. 20k 10s clips (unlabeled, weakly/strongly labeled) | yes |
|  | DREANSS | onset times, percussion instruments | 18 excerpts | yes |
|  | DrumPt | 4 playing techniques | approx. 2000 annotations | yes (see ENST) |
|  | DSD100 | multitrack recordings, stems for vocals, drums, bass and accompaniment | 100 songs | yes |
|  | EMO-Soundscapes | arousal & valence | 1213 soundscape recordings | yes |
|  | emoMusic | arousal & valence | 744 excerpts (45s) | yes |
|  | Emotify | induced emotion | 400 excerpts | yes |
|  | EMusic | arousal & valence | 100 excerpts (experimental music) | yes |
|  | EnsembleSet | source separation, synthesized with Spitfire BBC Symphony Orchestra Professional Library, 20 different mix/microphone configurations | 80 tracks (6+ hours) with a range of string, wind, and brass instruments arranged as chamber ensembles | yes |
|  | ENST-Drums | onset times, perc. instruments, playing technique | 318 segments | yes |
|  | Erkomaishvili Dataset | sheet music, structure, F0, note onsets | 118 tracks | yes |
|  | Expanded Groove MIDI Dataset | drummer/session id, drum timing, kit name | 45537 MIDI/audio pairs | rendered |
|  | Extendedballroom | 9 genres, tempo | 4000 excerpts (30s) | downloadable |
|  | ExtraSensory | 51 context labels | 300000 sensor recordings from 60 users | yes |
|  | ffuhrmann | 11 predom. instr. | 6951 excerpts from 220 songs | yes/no |
|  | FifteenSongs | Grateful Dead | 15 Grateful Dead songs with leadsheets | yes |
|  | Flamenco database | editorial, biographical, musicological information on flamenco, 1102 artists, 74 palos, 2860 albums | 13311 tracks | no |
|  | FMA-full | 161 genres | 106574 songs | yes |
|  | FMA-large | 161 genres | 106574 excerpts (30s) | yes |
|  | FMA-medium | 16 genres | 25000 excerpts (30s) | yes |
|  | FMA-small | 8 genres | 8000 excerpts (30s) | yes |
|  | Freesound-Loop-Dataset | tempo, key, instrumentation, genre | 3000 annotated loops, 9455 loops total | yes |
|  | FSD-Kaggle2019 | 80 tags | 29000 clips | yes |
|  | Fugue Analyses | fugue structure, patterns, cadences | 36 fugues (Bach & Shostakovich) | no |
|  | GiantStepsKey | key | 604 files | no |
|  | GiantStepsTempo | tempo | 664 files | no |
|  | GiantStepsTempo:alternate | tempo | 664 files | no |
|  | Greek Music Dataset | genre, valence, arousal | 1400 songs | downloadable |
|  | Gracenote Music Identification 2014 | timestamp, country | 110M music ID matches | no |
|  | GoodSounds | 12 instruments, pitch, sound quality | 8750 notes | yes |
|  | GPT | 7 guitar playing techniques | 6580 clips | yes |
|  | Groove MIDI Dataset | drummer/session id, drum timing | 1150 MIDI recordings | rendered |
|  | Guitar Solo Dataset | start/stop of guitar solos | 60 songs | no |
|  | GTZAN | 10 genres, tempo labels, key labels (lerch), key labels (li), beat/downbeat, metrical levels | 1000 excerpts (30s) | yes |
|  | GuitarSet | MIDI, pitch, beat, chords | 360 guitar excerpts (30s) with hexaphonic audio | yes |
|  | GZ_IsoTech | Guzheng | 2824 | yes |
|  | Hainsworth | tempo | 245 excerpts (60s) | yes |
|  | HarmonixSet | beats, downbeats, structure | 912 pop songs | no |
|  | HED | emotion annotations, harmonisation and tempo arrangements | 4000 tracks with emotion annotations | yes |
|  | HHDS | multitrack, style, tempo | 18 songs | yes |
|  | holzapfel:onset | onset times | 78 excerpts | yes |
|  | homburg | 9 genres | 1889 excerpts (10s) | yes |
|  | HookTheory | aligned melody and harmony annotations | 50 hours of aligned melody and harmony annotations | yes |
|  | IADS | valence & arousal, dominance | 111 sound snippets | yes |
|  | IDMT Multitrack | multitrack, style | 12 songs | yes |
|  | IDMT-PIANO-MM | classical and jazz piano recordings | 432 piano recordings (around four hours) | yes |
|  | IDMT-SMT-Audio-Effects | effects on bass and guitar notes | 55044 recordings | yes |
|  | IDMT-SMT-Bass | bass performance styles | 4300 excerpts | yes |
|  | IDMT-SMT-Bass-SINGLE-TRACK | style annotated bass lines | 17 bass lines (?) | yes |
|  | IDMT-SMT-Drum | onset times, perc. instruments | 518 files | yes |
|  | IDMT-SMT-Guitar | 9 guitar playing techniques | 4700+400 note events | yes |
|  | iKala | singing voice tracks, background tracks | 252 excerpts (30s) | yes |
|  | INRIA:EuroVision | structure | 124 songs | no |
|  | INRIA:Quaero | structure | 159 songs | no |
|  | IRMAS | 11 instruments | 2874 excerpts | yes |
|  | ISMIR2004Genre | 6 genres | 729 excerpts (30s) | yes |
|  | ISMIR2004Tempo | tempo | 465 excerpts (20s) | yes |
|  | Jazz Audio-Aligned Harmony Dataset | structure, key, chords, beats | 113 songs | no |
|  | Jamendo-VAD | voice activity | 61+16+16 songs | yes |
|  | JGDB | multitrack, MIDI | randomly generated excerpts | yes |
|  | JKU-ScoFo | audio, MIDI | 16 recordings | yes |
|  | Josquin La Rue Secure Duo Dataset | symbolic scores | 77 duos (Josquin & La Rue) | no |
|  | Jordan:Classical | structure | 15 pieces | yes |
|  | Jordan:Jazz | structure | 15 pieces | yes |
|  | KUGDastgahi | dastgahi music | 213 solo recordings by four professional musicians | audio |
|  | LabROSA:APT | MIDI | 29 piano excerpts | yes |
|  | LabROSA:MIDI | audio, MIDI | 4 songs | yes |
|  | last.fm-1K and last.fm-360K | user listening habits from last.fm | 992 users | no |
|  | LFM-1b | listening habits | 120000 users | no |
|  | Lyrical Influence Networks Dataset | lyrics-based artist and genre graphs | 42802 artists/214 genres | no |
|  | Lakh MIDI Dataset | MIDI, tempo, key | 176581 MIDI files | no |
|  | LMD - Latin | 10 genres | 3160 songs | no |
|  | LocalifyMusicEvents-USA-2019 | music events, socioeconomic indicators | 308051 music events that took place in 2019 and from 1139 US cities | no |
|  | Lyra | a dataset for Greek Traditional and Folk music that includes 1570 pieces | 1570 songs | yes |
|  | M-DJCUE | cue points | 134 tracks | no |
|  | MAESTRO | audio aligned MIDI, velocity, sustain | 172 hours of piano | yes |
|  | magnatagatune | similarity, tags | 25863 excerpts (30s) | yes |
|  | MAPS | piano notes/chords/pieces, tempo/key | 238 pieces | yes |
|  | MARD | album reviews | 66566 songs | no |
|  | MARG-AMT | MIDI pitch, onset/offset times | 30 melodies | yes |
|  | MAST | vocal performance assessment | 1018 performances | no |
|  | MAST-Rhythm | rhythm performance assessment | 3721 performances | yes |
|  | McGill Billboard | chords | 740 songs | no |
|  | MDBDrums | onset times, perc. instrument, playing technique | 23 excerpts | yes |
|  | Medley-solos-DB: a cross-collection dataset for musical instrument recognition | 8 instruments | 21572 excerpts | yes |
|  | MedleyDB | multitrack, genre, melody f0, instrument activation | 122 songs | yes |
|  | Melon Playlist Dataset | 148826 playlists, 30 genres, 219 subgenres, 30652 playlist tags | mel-spectrograms for 649091 songs (20-50s segments) | no |
|  | MeloSol | melody, monophonic, symbolic, kern, key | 783 melodies | no |
|  | MER500 | emotion | 500 clips | yes |
|  | MIR-1K | vocal tracks, background tracks | 1000 excerpts | yes |
|  | mirex05Train | predominant pitch | 13 excerpts | yes |
|  | mirex06Train | tempo, beats | 20 excerpts (30s) | yes |
|  | Mid Level Perceptual Music Features | 7 perceptual features | 5000 audio files | yes |
|  | Million Musical Tweets | listening behavior | 1086808 tweets | no |
|  | Modal | onset times | 71 snippets | yes |
|  | MOODetector:Bi-Modal | lyrics, valence & arousal | 133 excerpts | yes |
|  | MOODetector:Multi-Modal | lyrics, MIDI, mood | 903 excerpts (30s) | yes |
|  | moodswings | arousal & valence | 240 excerpts (30s) | no |
|  | Mozart's String Quartets | sonata form structure, cadences | 32 movements | no |
|  | Million Song Dataset | metadata, proprietary features | 1000000 songs | no |
|  | Multimodal Sheet Music Dataset | piano notes/chords/pieces, synthetic audio, aligned MIDI, aligned sheet music images, OMR | 497 pieces | no |
|  | The Meertens Tune Collections | phrases, key, meter | 18000 melodies | partially |
|  | A Multimodal Dataset of Musical Themes for MIR Research | sheet music, symbolic encodings, audio snippets, symbolic-audio alignments, composer, work, recording, and theme characteristics | 2067 Themes | yes |
|  | MTG-Jamendo | tags (genre, instruments, mood) | 55000 tracks | yes |
|  | MTG-Query by Humming | title, artist | 118 queries/481 songs | yes/no |
|  | MusAV | arousal & valence (relative annotations) | 2092 excerpts (30s) | yes |
|  | musdb-XL | source separation | musdb-XL is an eXtremely Loud version of the musdb-hq evaluation dataset | yes |
|  | MUSDB18 | multitrack recordings, stems for vocals, drums, bass and accompaniment | 150 songs | yes |
|  | MUSIC4ALL | tags, lyrics | 109,269 excerpts (30s) | on request |
|  | musiclef2012 | tags | 1355 songs | no |
|  | MusicMicro | music listening patterns | 136866 users | no |
|  | MusicNet | pitch, onsets | 330 recordings | implicitly |
|  | Multi-modal Dataset of Music Video | chords / keys (music feature), note density (music feature), loudness (music feature), semantic (video feature), motion (video feature), emotion (video feature), scene offset (video feature) | 748 music videos | on request |
|  | NES-MDB | multi-track MIDI, aligned audio | 5000 songs | on request |
|  | Nine Inch Nails Multitracks | multitrack | 66 songs | yes |
|  | NMED-H - Naturalistic Music EEG Dataset Hindi | EEG | 24 trials x 16 excerpts (4.5min) | no |
|  | Naturalistic Music EEG Dataset – Rhythm Pilot | EEG | 20 trials x 10 excerpts (4.5min) | no |
|  | Naturalistic Music EEG Dataset - Tempo | EEG | 30 trials x 16 excerpts (30sec) | no |
|  | NSynth | instrument, pitch | 305979 single notes | yes |
|  | NUS-48E | aligned phonemes | 48 pairs of sung and spoken | yes |
|  | ODB | onset times | 19 excerpts | yes |
|  | Onset_Leveau | onset times | 21 excerpts | yes |
|  | Open Broadcast Media Audio from TV | 6 classes for music presence | 1647 excerpts (60s) | yes |
|  | OpenMIC-2018 | 20 instruments | 20000 excerpts (10s) | yes |
|  | Orchset | predominant pitch | 64 excerpts | yes |
|  | Piano Gestures Dataset | video, intentions, audio | 210 clips | yes |
|  | Phenicx-Anechoic | multi-track audio (orchestral music), aligned MIDI | 4 pieces | yes |
|  | Phonation | pitch, vowel, phonation mode | 900 monophonic snippets | yes |
|  | PlaylistDataset | playlists | 75262 songs/2840553 transitions | no |
|  | QBT-Extended | taps | 3365 queries/51 songs | MIDI |
|  | QMUL:Beatles | structure, key, chords, beats | 181 songs | no |
|  | QMUL:King | structure, key, chords | 14 songs | no |
|  | QMUL:MichaelJackson | structure | 38 songs | no |
|  | QMUL:MixEvaluation | multitrack, mixes | 18 songs/180 mixes | yes |
|  | QMUL:Queen | structure, key, chords | 51/31 songs | no |
|  | QMUL:RSS | structure | 60 songs | no |
|  | QMUL:Zweieck | structure, key, chords, beats | 18 songs | no |
|  | QUASI | multitrack | 11 songs | yes |
|  | RobbieWilliamsAnnotations | chords, keys, beats | 65 songs | no |
|  | RockCorpus | chords, melody, bars | 200 songs | no |
|  | RWC | lyrics, 10 genres, 50 instruments, chords, structure, aligned MIDI | 115 songs/50 classical/100 songs | yes |
|  | SALAMI | structure | 1447 songs | no |
|  | SAMBASET | recording date, escolas, beats | 392 | no |
|  | Sargon | structure | 4 songs | yes |
|  | Semantic Artist Similarity | artist biographies, similarity | 268+2336 artists | no |
|  | Schenker Analyses | MusicXML, Schenker analysis | 41 pieces | no |
|  | SCP - EEG-Recorded Responses to Short Chord Progressions | EEG | 108/648 trials x 12 stimuli (5s) | yes |
|  | Sample detection dataset | start of samples | 80 songs, 80 samples | no |
|  | SEILS | scores in different symbolic formats | 30 madrigals | no |
|  | Seyerlehner:1517-Artists | 19 genres | 3180 songs | yes |
|  | Seyerlehner:Annotated | 19 genres | 190 songs | yes |
|  | Seyerlehner:Pop | tempo | 1105 songs | yes |
|  | Seyerlehner:Unique | 14 genres | 3115 excerpts (30s) | yes |
|  | SHS100K | cover songs | ca. 10,000 songs with 100,000 tracks | no |
|  | SISEC2013 | multitrack, mix | 5 excerpts | yes |
|  | SLAKH | MIDI, synthesized audio (tracks + mix) | 2100 mixes | yes |
|  | SMC:MIREX | tempo, beats | 217 excerpts | yes |
|  | SMD | audio, aligned MIDI | 50 recordings | yes |
|  | SongInterpretationDataset | lyrics | 27,834 songs (30 seconds each, recorded at 44.1 kHz) | yes |
|  | SoundTracks | valence, energy, tension, mood | 360+110 excerpts | yes |
|  | SPAM | structure | 50 songs | no |
|  | Shazam Research Dataset Offsets | in-song query times | 188M queries over 20 songs | no |
|  | Su-AMT | onset times, pitch | 10 excerpts | yes |
|  | SUPRA-RW | piano roll performances | 478 performances | yes |
|  | Schubert Winterreise Dataset (SWD) | lyrics, scores (image, symbolic, MIDI), audio, measures, chords, local keys, global keys, structure | 24 songs, 9 performances | yes |
|  | SymbolicTextureMozartSonatas | symbolic music | 9 movements of Mozart Piano Sonatas totaling a set of 1164 annotated measures | no |
|  | SymphonyMIDI | MIDI, symphonic | 46187 MIDI scores | no |
|  | Texture in String Quartets | texture | 11 movements | no |
|  | Traditional Flute Dataset | audio, aligned MIDI | 30 excerpts | yes |
|  | ThisIsMyJam | favorite songs, artists | 131k users | no |
|  | TinySOL, an audio dataset of isolated musical notes | instrument, pitch, dynamics, string number (if applicable) | 2913 isolated notes | yes |
|  | TONAS | pitch | 72 single-voiced excerpts | yes |
|  | Track Popularity | popularity rating | 23385 songs | no |
|  | Tunebot | title, artist | 10000 queries/? songs | yes/no |
|  | UIOWA:MIS | single instrument notes | many | yes |
|  | UMA-Piano | piano chords | 275040 recordings | yes |
|  | UnmixDB | DJ mix parameters | 37 playlists | yes |
|  | URBAN-SED | 9 event classes | 10000 recordings | yes |
|  | UrbanSound8k | 10 event classes | 8732 slices | yes |
|  | Multi-modal Music Performance | score-aligned video and audio | 44 recordings | yes |
|  | uspop2002 | tags, genre, chords | 8752 songs | no |
|  | Violin Gestures Dataset | EMG, playing techniques, audio | 960 recordings | yes |
|  | ViolinEtudesf0Estimation | f0 estimation for Violin Etudes | 27.8 hours of violin performance | yes |
|  | VocalSet | 17 vocal techniques | 3560 recordings | yes |
|  | YM2413-MDB | retro video game symbolic music dataset with emotion annotations, ISMIR 2022 | 669 songs | no |
|  | YousicianUkulele | evaluated notes and chords | 500000 exercises by 1000 users | no |