IEEE ISM 2017

Reproducible research code for the article published at the IEEE ISM 2017 conference:

@inproceedings{Bayle2017,
  author = {Bayle, Yann and Maršík, Ladislav and Rusek, Martin and Robine, Matthias and Hanna, Pierre and Slaninová, Kateřina and Martinovič, Jan and Pokorný, Jaroslav},
  booktitle = {Proceedings of the 19th IEEE International Symposium on Multimedia},
  link = {http://ism2017.asia.edu.tw/december-12/},
  month = {Dec.},
  title = {Kara1k: a karaoke dataset for cover song identification and singing voice analysis},
  year = {2017},
  address={Taichung, Taiwan},
  pages = {1--8}
}

Aim of the paper

  • Propose a novel industrial musical database
  • Cover Song Identification task on the aforementioned database
  • Singer's Gender Classification task on the aforementioned database

Tree structure (Description of available files)

  • The folder src/ contains the Python files necessary to reproduce our algorithm
  • The folder data/ contains a file named filelist.csv that lists for each audio file:
    • the unique identifier
    • the artist name
    • the track name
    • the gender tag (female, male, females, males, mixed)
    • the language tag (en, fr, es, it, de, pt, nl)
    • a boolean indicating if features have been extracted for this audio file by:
      1. YAAFE
      2. Marsyas
      3. Essentia
      4. Vamp
      5. harmony-analyser
  • The folder features/ contains features extracted by
    • bextract from Marsyas with the following command: bextract -mfcc -zcrs -ctd -rlf -flx -ws 1024 -as 898 -sv -fe.
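As an illustration of how the fields listed for filelist.csv could be used, here is a minimal Python sketch that reads such a file and selects tracks by tag and by available features. The column names, the delimiter, the text encoding of the boolean flags, and the sample rows are all assumptions made for this example; check the actual header of data/filelist.csv before reusing it.

```python
import csv
import io

# Hypothetical column names: the real header of data/filelist.csv may differ.
EXTRACTORS = ("yaafe", "marsyas", "essentia", "vamp", "harmony-analyser")


def load_filelist(handle):
    """Read filelist rows and convert the per-extractor flags to booleans."""
    rows = []
    for row in csv.DictReader(handle):
        for name in EXTRACTORS:
            # Assumes flags are stored as text such as "True"/"False" or "1"/"0".
            row[name] = row[name].strip().lower() in ("true", "1")
        rows.append(row)
    return rows


def select(rows, gender=None, language=None, extractor=None):
    """Keep rows matching the given tags and having features from `extractor`."""
    return [
        r for r in rows
        if (gender is None or r["gender"] == gender)
        and (language is None or r["language"] == language)
        and (extractor is None or r[extractor])
    ]


# Made-up sample rows, for illustration only (not taken from the dataset).
SAMPLE = """id,artist,track,gender,language,yaafe,marsyas,essentia,vamp,harmony-analyser
1,Artist A,Track A,female,en,True,True,True,True,True
2,Artist B,Track B,male,fr,False,True,False,True,True
3,Artist C,Track C,mixed,en,True,True,False,False,False
"""

rows = load_filelist(io.StringIO(SAMPLE))
english_with_yaafe = select(rows, language="en", extractor="yaafe")
print([r["id"] for r in english_with_yaafe])
```

To use it on the real file, replace `io.StringIO(SAMPLE)` with `open("data/filelist.csv", newline="", encoding="utf-8")` and adjust the column names to match the file.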

Features extracted by YAAFE, Essentia, Vamp and harmony-analyser are too large to store in this GitHub repository; they are available upon request for direct download. The commands used to extract them are:

  • YAAFE: yaafe -r 22050 -f "mfcc: MFCC blockSize=2048 stepSize=1024" --resample -b output_dir_features input_filename
  • Essentia: essentia-extractors-v2.1_beta2/streaming_extractor_music input_filename output_filename
  • Vamp features, extracted via harmony-analyser using the JNI wrapper:
    • java -jar ha-script.jar -a nnls-chroma:nnls-chroma -s .wav -t 0.07
    • java -jar ha-script.jar -a nnls-chroma:chordino-tones -s .wav -t 0.07
    • java -jar ha-script.jar -a nnls-chroma:chordino-labels -s .wav -t 0.07
    • java -jar ha-script.jar -a qm-vamp-plugins:qm-keydetector -s _wav -t 0.07
  • harmony-analyser with the following commands (note that Vamp plugin analysis was first performed to extract low-level features):
    • java -jar ha-script.jar -a chord_analyser:chord_complexity_distance -s .wav -t 0.07
    • java -jar ha-script.jar -a chroma_analyser:complexity_difference -s .wav -t 0.07
    • java -jar ha-script.jar -a chord_analyser:average_chord_complexity_distance -s .wav -t 0.07
    • java -jar ha-script.jar -a chord_analyser:tps_distance -s .wav -t 0.07
    • java -jar ha-script.jar -a filters:chord_vectors -s .wav -t 0.07
    • java -jar ha-script.jar -a filters:key_vectors -s .wav -t 0.07
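The ha-script.jar invocations above differ only in the analysis name, so they can be generated from a list. The following Python sketch builds the command lines in the order given (Vamp analyses first, then harmony-analyser). It only constructs and prints the commands. The helper name `ha_command`, the jar location, and running the jar from the folder that holds the audio files are assumptions, not part of the original scripts.

```python
import shlex

# Analysis names copied from the commands listed above.
VAMP_ANALYSES = [
    "nnls-chroma:nnls-chroma",
    "nnls-chroma:chordino-tones",
    "nnls-chroma:chordino-labels",
    "qm-vamp-plugins:qm-keydetector",
]
HARMONY_ANALYSES = [
    "chord_analyser:chord_complexity_distance",
    "chroma_analyser:complexity_difference",
    "chord_analyser:average_chord_complexity_distance",
    "chord_analyser:tps_distance",
    "filters:chord_vectors",
    "filters:key_vectors",
]


def ha_command(analysis, suffix=".wav", t_value="0.07", jar="ha-script.jar"):
    """Return one ha-script.jar invocation as an argument list.

    `jar` is a hypothetical path; point it at your copy of ha-script.jar.
    The key-detector command above is written with `-s _wav`; pass
    suffix="_wav" to reproduce that line verbatim.
    """
    return ["java", "-jar", jar, "-a", analysis, "-s", suffix, "-t", t_value]


# Vamp analyses come first because the harmony-analyser ones depend on them.
commands = [ha_command(a) for a in VAMP_ANALYSES + HARMONY_ANALYSES]

for cmd in commands:
    print(shlex.join(cmd))
```

To execute the commands rather than print them, each list can be passed to `subprocess.run(cmd, check=True)`, which stops at the first failing analysis.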
