This repository contains code for an end-to-end model for raga and tonic identification on audio samples.
A more user-friendly repository containing only the inference code is available at https://github.com/VishwaasHegde/SPD_KNN
Requires python==3.6.9
Download and install Anaconda for easier package management
Install the requirements by running pip install -r requirements.txt
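Since the requirements are tied to one specific interpreter version, a quick check before installing can save a failed install. The snippet below is only a minimal sketch and is not part of this repository: the helper name `matches_required_version` is hypothetical, and the only fact it relies on is the required version 3.6.9 stated above.

```python
import sys

REQUIRED = (3, 6, 9)  # from "Requires python==3.6.9" above


def matches_required_version(version_info=None, required=REQUIRED):
    """Return True if the given (or current) interpreter version equals `required`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only major, minor and micro; ignore release level and serial.
    return tuple(version_info[:3]) == tuple(required)


if __name__ == "__main__":
    if not matches_required_version():
        found = ".".join(str(part) for part in sys.version_info[:3])
        print("Warning: expected Python 3.6.9, found " + found)
```

Run it inside the environment you created for E2ERaga; it prints a warning only when the versions differ.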
- Create an empty folder called model and place it in the E2ERaga folder
- Download the models (model-full.h5 and hindustani_raga_model.hdf5) from here and place them in that folder
- I don't have permission to upload the datasets; they have to be obtained by request from https://compmusic.upf.edu/node/328
- Or contact me directly: vishwaas (dot) universe (at) gmail (dot) com
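To confirm that the model files from the setup steps above are in place before running anything, something like the following could be used. This is a hedged sketch, not code from this repository: `missing_model_files` is a hypothetical helper, the two file names are the ones listed above, and the folder name passed in at the bottom is an assumption that should be changed to whichever folder you actually created.

```python
from pathlib import Path

# File names taken from the setup step above.
MODEL_FILES = ("model-full.h5", "hindustani_raga_model.hdf5")


def missing_model_files(model_dir):
    """Return the expected model files that are not present in `model_dir`."""
    model_dir = Path(model_dir)
    return [name for name in MODEL_FILES if not (model_dir / name).is_file()]


if __name__ == "__main__":
    # "model" is an assumption: point this at the folder you created.
    missing = missing_model_files("model")
    if missing:
        print("Missing model files: " + ", ".join(missing))
```

An empty result means both files were found; the check only looks at file names and does not validate the contents of the downloads.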
E2ERaga supports audio samples which can be recorded at runtime
Steps to run:
- Run the command
python test_sample.py --runtime=True --tradition=hindustani --duration=60
- You can set the tradition to hindustani or carnatic, and the duration to the number of seconds to record
- Once you run this command, there will be a prompt -
Press 1 to start recording or press 0 to exit:
- Enter your choice and record for duration seconds
- After this, the raga label and the tonic frequency are output
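If you want to start the recording mode from another Python script, the command from the steps above can be assembled programmatically. The following is a minimal sketch under stated assumptions: `build_record_command` is a hypothetical helper that is not part of this repository, it only reuses the three flags shown above (`--runtime`, `--tradition`, `--duration`), and the validation rules are my own guesses at sensible inputs rather than what `test_sample.py` enforces.

```python
import sys

TRADITIONS = ("hindustani", "carnatic")  # the two values named above


def build_record_command(tradition="hindustani", duration=60):
    """Build the argument list for a runtime-recording run of test_sample.py."""
    if tradition not in TRADITIONS:
        raise ValueError("tradition must be one of: " + ", ".join(TRADITIONS))
    seconds = int(duration)
    if seconds <= 0:
        raise ValueError("duration must be a positive number of seconds")
    return [
        sys.executable,  # the interpreter currently running this script
        "test_sample.py",
        "--runtime=True",
        "--tradition=" + tradition,
        "--duration=" + str(seconds),
    ]
```

The resulting list can be passed to `subprocess.run(..., check=True)` from inside the E2ERaga folder. The "Press 1 to start recording" prompt still has to be answered in the terminal.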
E2ERaga supports recorded audio samples which can be provided at runtime
Steps to run:
- Run the command
python test_sample.py --runtime_file=<audio_file_path> --tradition=<hindustani/carnatic> --file_type=wav
Example:
python test_sample.py --runtime_file=data/sample_data/Ahira_bhairav_27.wav --tradition=hindustani --file_type=wav
- The model supports wav and mp3 files (--file_type defaults to wav if not provided); with mp3 there is a delay while the file is converted to wav internally
- After this, the raga label and the tonic frequency are output
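When running many recorded files, it can be convenient to derive `--file_type` from the file extension instead of typing it each time. The sketch below is an illustration only: `build_file_command` is a hypothetical helper that is not part of this repository, and it assumes nothing beyond the flags (`--runtime_file`, `--tradition`, `--file_type`) and the two supported formats described above.

```python
import sys
from pathlib import Path

SUPPORTED_TYPES = ("wav", "mp3")  # the formats listed above


def build_file_command(audio_path, tradition):
    """Build the argument list for running test_sample.py on a recorded file."""
    # Infer the file type from the extension, e.g. "clip.MP3" -> "mp3".
    file_type = Path(audio_path).suffix.lower().lstrip(".")
    if file_type not in SUPPORTED_TYPES:
        raise ValueError("unsupported file type: " + repr(file_type))
    return [
        sys.executable,  # the interpreter currently running this script
        "test_sample.py",
        "--runtime_file=" + str(audio_path),
        "--tradition=" + tradition,
        "--file_type=" + file_type,
    ]
```

For the example file above, `build_file_command("data/sample_data/Ahira_bhairav_27.wav", "hindustani")` produces the same arguments as the example command, and the list can be handed to `subprocess.run` from inside the E2ERaga folder.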
Demo videos:
(Clicking redirects to YouTube)
Acknowledgments:
- The model uses CREPE to find the pitches in the audio. I would like to thank Jong Wook for clarifying my questions
- I also thank CompMusic and Sankalp Gulati for providing the datasets