This project is a work in progress. The Makefiles are a bit of a mess, and the build process is not straightforward unless you're the one who wrote it. I'm planning on fixing this, but wanted to create a minimum viable product first. The full dataset takes about 2 hours to build on my laptop (Ryzen 9 6900HS) and comes to around 40 GB of audio files once all processing is complete. I might try to optimize it a bit, but there's no instant way to process several million audio files.
This project aims to create a machine learning model that can determine which solo instrument is playing in an audio file. It uses scikit-learn's decision tree algorithm to create the model. Weka was used for the project's exploratory data analysis, which is why ARFF was chosen as the file format for the dataset.
The dataset is built by downloading audio files from YouTube, converting them to WAV, splitting them into 0.1-second slices, normalizing each slice to -20 dB, and then running an FFT on every slice to generate an ARFF file.
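As a rough illustration of the slice, normalize, and FFT steps, here is a minimal sketch using NumPy. It is not the code in dataset_gen: the function names are hypothetical, the input is a synthetic tone rather than a real WAV file, and it assumes that "-20 dB" means an RMS level relative to a full scale of 1.0, which the project may define differently.

```python
import numpy as np


def slice_signal(samples, sample_rate, slice_seconds=0.1):
    """Split a mono signal into equal-length slices, dropping any remainder."""
    slice_len = int(round(sample_rate * slice_seconds))
    n_slices = len(samples) // slice_len
    return samples[: n_slices * slice_len].reshape(n_slices, slice_len)


def normalize_rms(audio_slice, target_db=-20.0):
    """Scale a slice so its RMS level is target_db relative to full scale (1.0).

    Assumption: RMS-based normalization; the real pipeline may use peak level.
    """
    rms = np.sqrt(np.mean(audio_slice ** 2))
    if rms == 0:
        return audio_slice  # silent slice, nothing to scale
    target_rms = 10 ** (target_db / 20)
    return audio_slice * (target_rms / rms)


def fft_features(audio_slice):
    """Magnitude spectrum of a real-valued slice."""
    return np.abs(np.fft.rfft(audio_slice))


# Synthetic one-second 440 Hz tone standing in for a decoded WAV file.
sample_rate = 44100
t = np.arange(sample_rate) / sample_rate
tone = 0.5 * np.sin(2 * np.pi * 440 * t)

slices = slice_signal(tone, sample_rate)
features = np.array([fft_features(normalize_rms(s)) for s in slices])
print(slices.shape, features.shape)
```

Each row of features would correspond to one row of the ARFF file, with the instrument label added as the final attribute.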
To determine the instrument playing in an audio file, the target file goes through the same process used to create the dataset. A prediction is made for each slice of the file, and the statistical mode of those predictions is taken as the result. This allows for some error in the individual predictions while still producing a correct overall output.
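The voting step can be sketched in a few lines of Python. The helper name and the instrument labels below are made up for illustration, and the actual cli_tool may compute the mode differently.

```python
from collections import Counter


def majority_label(predictions):
    """Return the most common label (the statistical mode) among the predictions."""
    label, _count = Counter(predictions).most_common(1)[0]
    return label


# Hypothetical per-slice predictions: two slices are misclassified,
# but the mode still recovers the right instrument.
slice_predictions = ["piano", "piano", "violin", "piano", "flute", "piano"]
print(majority_label(slice_predictions))  # piano
```

With 0.1-second slices, even a short clip produces many votes, so a handful of wrong slice predictions rarely changes the final answer.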
dataset_gen Contains the code that generates the dataset. Run make here to create it.
model_gen Creates the scikit-learn model. Move the ARFF file generated by dataset_gen into a folder called arff in this folder.
cli_tool The script here takes an audio file as input and tells you which instrument is playing in it.
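For a sense of what the model_gen step involves, here is a minimal sketch of loading ARFF data and fitting a scikit-learn decision tree. It is an assumption-heavy example, not the project's script: the attribute names (bin0, bin1, instrument), the tiny inline dataset, and the use of scipy.io.arff to read the file are all stand-ins.

```python
import io

import numpy as np
from scipy.io import arff
from sklearn.tree import DecisionTreeClassifier

# Tiny made-up ARFF document; the real file has one attribute per FFT bin.
ARFF_TEXT = """@relation instruments
@attribute bin0 numeric
@attribute bin1 numeric
@attribute instrument {piano,violin}
@data
0.9,0.1,piano
0.8,0.2,piano
0.1,0.9,violin
0.2,0.7,violin
"""

data, meta = arff.loadarff(io.StringIO(ARFF_TEXT))

# Every attribute except the class label is treated as a feature.
feature_names = [name for name in meta.names() if name != "instrument"]
X = np.column_stack([data[name] for name in feature_names])
# loadarff returns nominal values as bytes, so decode them to str.
y = np.array([label.decode() for label in data["instrument"]])

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)
print(model.predict([[0.85, 0.15]]))
```

To read a file from disk instead, pass its path to arff.loadarff in place of the StringIO object.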