Music Genre Classification: A Comparative Study of Classical and Deep Learning Methods

This repository contains the code and experiments for the paper "Music Genre Classification Using Classical and Deep Learning Methods: A Comparative Study" by Rauan Arstangaliyev, Kamila Spanova, Moldir Azhimukhanbet, Dilyara Arynova, and Amina Aimuratova from Nazarbayev University.

Abstract

We explore different machine learning approaches for music genre classification. The study begins by combining Naive Bayes (NB) with a Multilayer Perceptron (MLP) and further enhances this by incorporating kernel methods (SVM) to capture complex patterns. Additionally, deep convolutional recurrent neural networks (CRNNs) are applied to two types of features: raw Mel-spectrograms and engineered statistical summaries of audio properties. Using the GTZAN dataset, models are compared based on accuracy, F1 score, and ROC-AUC. Our experimental results show that the best performance was achieved by combining MLP and SVM, as well as CRNNs trained on engineered features. Considering model complexity, we conclude that the MLP and SVM combination offers a practical and effective solution for this task.

Index Terms: Music Genre Classification, Naive Bayes, Multilayer Perceptron, Kernel Methods, SVM, CNN, CRNN, Classifier Combination, GTZAN.

Key Findings

- MLP + SVM: Achieved approximately 90% accuracy and F1-score.
- CRNN on Engineered Features: Also achieved approximately 90% accuracy and F1-score.
- CRNN on Raw Mel-spectrograms: Reached around 82-85% accuracy.
- Gaussian Naive Bayes (GNB) as a Feature Enhancer: Gave modest improvements, especially in configurations that did not yet include the SVM.
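
As a rough illustration of what an MLP + SVM combination can look like, the sketch below averages the class probabilities of the two models with scikit-learn's `VotingClassifier`. This README does not state how the paper combines the models, so soft voting, the synthetic data from `make_classification`, and every hyperparameter here are assumptions, not the authors' setup.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for engineered audio features
# (assumption: 10 classes, matching the number of GTZAN genres).
X, y = make_classification(
    n_samples=1000, n_features=40, n_informative=20,
    n_classes=10, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical hyperparameters, chosen only to keep the example fast.
mlp = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=500, random_state=0)
svm = SVC(kernel="rbf", probability=True, random_state=0)

# Soft voting averages the predicted class probabilities of both models.
combo = make_pipeline(
    StandardScaler(),
    VotingClassifier([("mlp", mlp), ("svm", svm)], voting="soft"),
)
combo.fit(X_train, y_train)

y_pred = combo.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
macro_f1 = f1_score(y_test, y_pred, average="macro")
print(f"accuracy={accuracy:.3f}  macro-F1={macro_f1:.3f}")
```

The numbers printed on synthetic data say nothing about GTZAN performance; the point is only the structure of the combined estimator.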

Repository Structure

This repository contains the following key scripts and files:

- extract_features.py: Script for extracting audio features (e.g., MFCCs, Chroma, Spectral Centroid, etc.) from the GTZAN dataset.
- dataset.py: General dataset loading and preprocessing utilities.
- dataset_for_mlp.py: Specific dataset preparation tailored for MLP models.
- dataset_for_nb.py: Specific dataset preparation tailored for Naive Bayes models.
- gnb_simple_52_final.py: Implementation and evaluation of a simpler Gaussian Naive Bayes model.
- gnb_complex_64_final.py: Implementation and evaluation of an enhanced/complex Gaussian Naive Bayes model (likely involving feature transformations like Box-Cox/Yeo-Johnson and PCA as described in the paper).
- mlp.py: Implementation and evaluation of Multilayer Perceptron models, potentially including the MLP+SVM combination.
- .gitignore: Specifies intentionally untracked files that Git should ignore.
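
The description of gnb_complex_64_final.py mentions power transforms and PCA in front of Gaussian Naive Bayes. A minimal sketch of such a pipeline is shown here, assuming a Yeo-Johnson transform and 10 principal components on synthetic skewed data; it is not taken from the script, and the component count and data are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(0)
n_per_class, n_features, n_classes = 100, 20, 4

# Skewed synthetic features: log-normal draws with a class-dependent shift.
X = np.vstack([
    rng.lognormal(mean=0.3 * c, sigma=0.7, size=(n_per_class, n_features))
    for c in range(n_classes)
])
y = np.repeat(np.arange(n_classes), n_per_class)

# Baseline: Gaussian Naive Bayes on the raw features.
plain_gnb = GaussianNB()

# Variant: make features closer to Gaussian, then decorrelate and reduce them.
enhanced_gnb = make_pipeline(
    PowerTransformer(method="yeo-johnson"),
    PCA(n_components=10, random_state=0),
    GaussianNB(),
)

plain_scores = cross_val_score(plain_gnb, X, y, cv=5)
enhanced_scores = cross_val_score(enhanced_gnb, X, y, cv=5)
print(f"plain GNB:    {plain_scores.mean():.3f}")
print(f"enhanced GNB: {enhanced_scores.mean():.3f}")
```

Wrapping the transforms in a pipeline keeps them fitted on the training folds only, which avoids leaking test-fold statistics into the preprocessing.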

Setup & Installation

Clone the repository:

```
git clone https://github.com/Moldier/genreclassification.git
cd genreclassification
```

Create a Python virtual environment (recommended):

```
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
```

Install dependencies: The repository does not include a requirements.txt file, so we recommend creating one. Based on the paper, it would likely include:

```
numpy
scipy
scikit-learn
librosa
matplotlib
# Add pandas if used for data handling
# Add TensorFlow or PyTorch if used for CRNNs
```

Install them using:

```
pip install -r requirements.txt
```

Dataset:
- Download the GTZAN dataset. Available at: https://www.kaggle.com/datasets/andradaolteanu/gtzan-dataset-music-genre-classification
- Ensure the dataset is placed in a location accessible by the scripts (e.g., a data/gtzan/ directory) or update the paths within the scripts accordingly.
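
Before running any script, it can help to confirm that the audio files are where you expect them. The helper below is a hypothetical sketch, not part of the repository: it assumes one sub-folder per genre containing `.wav` files, and the `data/gtzan/genres_original` path is an assumption about how the Kaggle archive was unpacked.

```python
from pathlib import Path


def list_gtzan_tracks(root):
    """Return sorted (wav_path, genre) pairs; the genre is the parent folder name."""
    root = Path(root)
    tracks = []
    for wav_path in sorted(root.glob("*/*.wav")):
        tracks.append((wav_path, wav_path.parent.name))
    return tracks


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever the dataset was unpacked.
    tracks = list_gtzan_tracks("data/gtzan/genres_original")
    genres = sorted({genre for _, genre in tracks})
    print(f"{len(tracks)} tracks across {len(genres)} genres")
```

If the count printed is zero, the path is probably wrong or the archive has a different folder layout than assumed here.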

Usage

Feature Extraction: Run extract_features.py to process the GTZAN audio files and save the engineered features.
```
python extract_features.py
```
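To make the idea of an engineered feature concrete, the sketch below computes one frame-level descriptor (spectral centroid) and collapses it into summary statistics. It is written in plain NumPy on a synthetic tone and is not the code in extract_features.py, which presumably relies on librosa; the frame length, hop length, and choice of statistics are assumptions.

```python
import numpy as np


def spectral_centroid_frames(signal, sample_rate, frame_length=2048, hop_length=512):
    """Spectral centroid in Hz for each Hann-windowed frame of a mono signal."""
    window = np.hanning(frame_length)
    freqs = np.fft.rfftfreq(frame_length, d=1.0 / sample_rate)
    n_frames = 1 + (len(signal) - frame_length) // hop_length
    centroids = np.empty(n_frames)
    for i in range(n_frames):
        start = i * hop_length
        frame = signal[start:start + frame_length] * window
        magnitude = np.abs(np.fft.rfft(frame))
        # Magnitude-weighted mean frequency; the epsilon guards silent frames.
        centroids[i] = np.sum(freqs * magnitude) / (np.sum(magnitude) + 1e-12)
    return centroids


def summarize(frame_values):
    """Collapse a per-frame feature track into fixed-length statistics."""
    return {"mean": float(np.mean(frame_values)), "std": float(np.std(frame_values))}


# Synthetic 3-second 440 Hz tone standing in for an audio clip.
sample_rate = 22050
t = np.arange(3 * sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

centroids = spectral_centroid_frames(tone, sample_rate)
summary = summarize(centroids)
print(summary)
```

For a pure tone the mean centroid sits close to the tone's frequency; for real music the mean and spread of such tracks are the kind of statistical summaries a classifier can consume as a fixed-length vector.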
Running Models: Execute the individual model scripts to train and evaluate them:
```
python gnb_simple_52_final.py
python gnb_complex_64_final.py
python mlp.py
# Add commands for other models/experiments as needed
```
The scripts will load pre-extracted features (or extract them if designed that way) and output performance metrics (accuracy, F1-score, and confusion matrices).
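
The metrics named here, plus the ROC-AUC mentioned in the abstract, can all be computed with scikit-learn. The sketch below uses randomly generated scores in place of real model output, so the labels, probabilities, and resulting values are purely illustrative, and macro averaging is an assumption since this README does not say which averaging the paper uses.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score, roc_auc_score

rng = np.random.default_rng(0)
n_samples, n_classes = 300, 10

# Balanced synthetic labels so every class appears in y_true.
y_true = np.arange(n_samples) % n_classes

# Hypothetical model scores: random logits with a boost on the true class.
logits = rng.normal(size=(n_samples, n_classes))
logits[np.arange(n_samples), y_true] += 2.0
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
y_pred = probs.argmax(axis=1)

accuracy = accuracy_score(y_true, y_pred)
macro_f1 = f1_score(y_true, y_pred, average="macro")
# Multiclass ROC-AUC needs per-class probabilities; one-vs-rest is used here.
roc_auc = roc_auc_score(y_true, probs, multi_class="ovr", average="macro")
# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=np.arange(n_classes))

print(f"accuracy={accuracy:.3f}  macro-F1={macro_f1:.3f}  ROC-AUC={roc_auc:.3f}")
print(cm)
```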

Authors

Rauan Arstangaliyev (rauan.arstangaliyev@nu.edu.kz)
Kamila Spanova (kamila.spanova@nu.edu.kz)
Moldir Azhimukhanbet (moldir.azhimukhanbet@nu.edu.kz)
Dilyara Arynova (dilyara.arynova@nu.edu.kz)
Amina Aimuratova (amina.aimuratova@nu.edu.kz)

School of Engineering and Digital Sciences, Nazarbayev University, Astana, Kazakhstan.
