abisliouk/IE500-data-mining

Song genre predictor based on Spotify data

The goal of this project is to build a classifier and measure how accurately it can predict song genres. Spotify already uses machine learning algorithms for this purpose, so working with a Spotify dataset [Pandya, 2022] helps assess whether the resulting model is apt for a large-scale business or is more appropriate for a smaller audio streaming market player.

Machine Learning Methods Used

  • SVM
  • Decision Tree
  • Gaussian Naive Bayes
  • k-NN
  • MLP
  • Multinomial Naive Bayes
  • Nearest Centroid
  • Random Forest
  • XGBoost

Structure

The contents of the repository are the following:

Folders

Notebooks

  • baseline → implement the majority and rule-based baselines
  • clustering → reduce the number of genres in the dataset to only 18 via a combination of agglomerative clustering and manual input
  • data-cleaning → choose only one genre for every song in the dataset that appeared with multiple genres
  • data-exploration → visualize the features of the dataset and propose preprocessing steps
  • hyperparemter-optimization → tune hyperparameters using GridSearchCV
  • plots → generate plots for the report and presentation
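To make the majority baseline from the baseline notebook concrete, here is a minimal sketch in plain Python. It is an illustration only, not the notebook's actual code: the function names `fit_majority_baseline` and `accuracy` and the toy genre labels are assumptions invented for this example.

```python
from collections import Counter


def fit_majority_baseline(train_labels):
    """Return the most frequent genre in the training labels."""
    if not train_labels:
        raise ValueError("train_labels must not be empty")
    # most_common(1) returns a list holding one (label, count) pair.
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(true_labels, predicted_labels):
    """Fraction of positions where the prediction matches the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy on empty input")
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


# Toy genre labels, invented for illustration (not taken from the dataset).
train_genres = ["rock", "pop", "rock", "jazz", "rock", "pop"]
test_genres = ["rock", "pop", "rock", "jazz"]

# The baseline ignores all song features and always predicts one genre.
majority_genre = fit_majority_baseline(train_genres)
predictions = [majority_genre] * len(test_genres)
baseline_accuracy = accuracy(test_genres, predictions)
print(majority_genre, baseline_accuracy)
```

A trained classifier is only useful if it clearly beats this accuracy, which is why the baseline is computed before comparing the other methods.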

Setup

  1. Activate your virtual environment
  2. Run the following command to install all the dependencies needed for this project:
     pip install -r requirements.txt
  3. Inspect the code for the different algorithms that were explored (stored under ml_methods/)

Submission details

Team 1

  • Elizaveta Nosova (1983805)
  • Miguel Samaniego (1980439)
  • Nico Sharei (1986818)
  • Julian Ament (1981511)
  • Artem Bisliouk (1978986)
  • Jannik Kranz (1981766)
