# 🎵 Music Genre Classification: A Machine Learning Journey

Welcome to this detailed exploration of music genre classification using the GTZAN dataset! In this project, we aim to classify music clips into genres such as jazz, classical, rock, and more, based on audio features extracted from 30-second audio samples. This notebook serves as both a technical implementation and a narrative blog post, guiding you through the process of data exploration, visualization, preprocessing, model training, and evaluation.


## Motivation - Why Music Genre Classification?

🎯 I'm deeply interested in music and often find myself wanting to organize my collection by genre. However, manually checking and classifying each song is time-consuming and tedious. This project was born out of a desire to automate that process, enabling genre classification in a way that saves time and effort. While there is an upfront cost in terms of model training, once trained, the system can quickly classify new tracks with minimal delay. This is just an initial attempt to solve a personal pain point: not a final solution, but a promising step toward making music organization more efficient.


## Connection to Multimodal Learning: A Historical Perspective

Multimodal learning involves integrating multiple data types (e.g., audio, text, images) to improve model performance and understanding. Music genre classification, as implemented in the GTZAN project, is primarily unimodal, focusing on audio features. However, it connects to multimodal learning through its potential to incorporate additional modalities, such as lyrics or visual album covers, to enhance classification accuracy and robustness.

### Historical Context

- **Early MIR (2000s)**: The GTZAN dataset, introduced by Tzanetakis and Cook in 2002, marked a milestone in MIR (Music Information Retrieval) by providing a standardized dataset for genre classification. Early work relied on handcrafted audio features like MFCCs and statistical models (e.g., SVMs, GMMs).
- **Deep Learning Era (2010s)**: Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) began processing raw audio spectrograms, improving performance over traditional features.
- **Multimodal Advances (2020s)**: Recent work integrates audio with text (e.g., lyrics) or metadata (e.g., artist info). For instance, a paper in 2021 proposed multimodal frameworks combining audio, text, and visual features for music recommendation. Models like CLAP (Contrastive Language-Audio Pretraining, 2023) leverage audio-text pairs to learn joint representations, enabling tasks like zero-shot genre classification.

The project aligns with traditional MIR but hints at multimodal potential. For example, the notebook suggests future work incorporating lyrics, which could be processed with NLP models and fused with audio features using multimodal architectures like transformers.

### Current Relevance

Today, multimodal learning is critical for applications like Spotify’s recommendation engine, which combines audio, lyrics, and user behavior. This project serves as a foundation to explore such integrations, making it a stepping stone toward cutting-edge multimodal MIR systems.


## Project Overview

The GTZAN dataset provides a rich collection of 1000 audio clips, each 30 seconds long, across 10 genres (100 clips per genre). Each clip is accompanied by a set of precomputed audio features, such as Mel-frequency cepstral coefficients (MFCCs), spectral centroid, chroma features, and tempo, stored in `features_30_sec.csv`. Our goal is to:

- Explore and visualize the dataset to understand feature distributions and genre separability.
- Preprocess the data, including feature scaling and dimensionality reduction using PCA.
- Train a machine learning model (specifically, a voting classifier) to predict genres.
- Evaluate model performance with metrics and visualizations.
- Demonstrate the pipeline by classifying a sample audio file.

Let's dive in!


## Step 1: Setting Up the Environment

First, we import the necessary libraries for data manipulation, visualization, audio processing, and machine learning. These tools will power our analysis and modeling.


In [None]:
import numpy as np 
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import StandardScaler, MinMaxScaler, LabelEncoder
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

%matplotlib inline

## Step 2: Loading and Exploring the Dataset

The GTZAN dataset's `features_30_sec.csv` contains 60 columns: 58 audio features, a filename, and a genre label. Let's load the data and inspect its structure.


In [None]:
df = pd.read_csv(
    '/kaggle/input/gtzan-dataset-music-genre-classification/Data/features_30_sec.csv')

# Display basic information
print("Dataset Info:")
df.info()

# Display the first few rows
print("\nFirst 5 Rows:")
df.head()

The dataset has 1000 entries with no missing values, which is great! The features include:

- **Numerical features**: `length`, `chroma_stft_mean`, `rms_mean`, `spectral_centroid_mean`, `tempo`, and 20 MFCC means and variances.
- **Categorical features**: `filename` and `label` (the genre).

Next, let's explore the distribution of genres.


In [None]:
# Plot genre distribution
plt.figure(figsize=(10, 6))
sns.countplot(y='label', data=df, order=df['label'].value_counts().index)
plt.title('Distribution of Music Genres in GTZAN Dataset')
plt.xlabel('Count')
plt.ylabel('Genre')
plt.show()

The dataset is balanced, with 100 clips per genre, ensuring no class imbalance issues during modeling.


## Step 3: Visualizing Feature Distributions

To understand how genres differ, let's visualize key features across genres. We'll focus on `tempo`, `rms_mean`, `spectral_centroid_mean`, and `chroma_stft_mean`.


In [None]:
# Box plots for selected features
features_to_plot = ['tempo', 'rms_mean',
                    'spectral_centroid_mean', 'chroma_stft_mean']
plt.figure(figsize=(15, 10))
for i, feature in enumerate(features_to_plot, 1):
    plt.subplot(2, 2, i)
    sns.boxplot(x='label', y=feature, data=df)
    plt.title(f'{feature} by Genre')
    plt.xticks(rotation=45)
plt.tight_layout()
plt.show()

**Observations**:

- **Tempo**: Varies significantly, with reggae and classical having higher median tempos.
- **RMS Mean**: Pop and hip-hop have higher energy (RMS), while classical has lower energy.
- **Spectral Centroid**: Pop and disco have higher centroids, indicating brighter timbres.
- **Chroma STFT**: Metal and hip-hop show higher chroma values, suggesting richer harmonic content.

These differences suggest that features can help distinguish genres.


In [None]:
# group and aggregate
tmp = df.groupby('label').agg({
    'length': ['mean'],
    'tempo': ['mean'],
    'chroma_stft_mean': ['mean'],
    'rms_mean': ['mean'],
    'spectral_bandwidth_mean': ['mean'],
    'rolloff_mean': ['mean'],
    'zero_crossing_rate_mean': ['mean'],
    'harmony_mean': ['mean'],
    'perceptr_mean': ['mean'],
})
tmp

## Step 4: Correlation Analysis

To understand feature relationships, let's compute and visualize a correlation matrix for numerical features.


In [None]:
mean_cols = [col for col in df.columns if 'mean' in col]
tmp = mean_cols + ['length']
corr = df[tmp].corr()

# visualize correlation heatmap
mask = np.triu(np.ones_like(corr, dtype=bool))
f, ax = plt.subplots(figsize=(16, 12))
cmap = sns.diverging_palette(0, 25, as_cmap=True, s=90, l=45, n=5)
sns.heatmap(corr, mask=mask, cmap=cmap, vmax=.3, center=0,
            square=True, linewidths=.5, cbar_kws={"shrink": .5})

plt.title('Correlation Heatmap of Mean Features', fontsize=25)
plt.xticks(fontsize=10)
plt.yticks(fontsize=10)
plt.show()

In [None]:
var_cols = [col for col in df.columns if 'var' in col]
tmp = var_cols + ['length']
corr = df[tmp].corr()

# visualize correlation heatmap
mask = np.triu(np.ones_like(corr, dtype=bool))
f, ax = plt.subplots(figsize=(16, 12))
cmap = sns.diverging_palette(0, 25, as_cmap=True, s=90, l=45, n=5)
sns.heatmap(corr, mask=mask, cmap=cmap, vmax=.3, center=0,
            square=True, linewidths=.5, cbar_kws={"shrink": .5})

plt.title('Correlation Heatmap of Variance Features', fontsize=25)
plt.xticks(fontsize=10)
plt.yticks(fontsize=10)
plt.show()

In [None]:
# Select numerical features
numerical_cols = df.select_dtypes(include=['float64', 'int64']).columns

# Compute correlation matrix
corr_matrix = df[numerical_cols].corr()

# Plot heatmap
plt.figure(figsize=(12, 10))
sns.heatmap(corr_matrix, cmap='coolwarm', vmin=-1, vmax=1, center=0)
plt.title('Correlation Matrix of Audio Features')
plt.show()

**Insights**:

- High correlations exist between MFCC means and variances, suggesting redundancy.
- Features like `spectral_centroid_mean` and `rolloff_mean` are strongly correlated, indicating they capture similar spectral properties.
- This redundancy motivates dimensionality reduction, which we'll address with PCA.
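
The redundancy noted above can also be read off numerically instead of from the heatmap. The following is a minimal sketch, assuming a hypothetical helper `top_correlated_pairs` and a small synthetic table in place of the real `df`; the column names are illustrative only.

```python
import numpy as np
import pandas as pd


def top_correlated_pairs(frame, threshold=0.9):
    """Return feature pairs whose absolute Pearson correlation exceeds threshold."""
    corr = frame.corr().abs()
    # Keep only the strict upper triangle so each pair appears once
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    pairs = upper.stack()
    return pairs[pairs > threshold].sort_values(ascending=False)


# Synthetic stand-in for the feature table (not GTZAN values)
rng = np.random.default_rng(0)
base = rng.normal(size=200)
demo = pd.DataFrame({
    'spectral_centroid_mean': base,
    'rolloff_mean': 2.0 * base + rng.normal(scale=0.1, size=200),
    'tempo': rng.normal(size=200),
})
print(top_correlated_pairs(demo))
```

On the real data, calling `top_correlated_pairs(df[numerical_cols])` should list the same strongly related pairs that stand out in the heatmap.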


## Step 6: Data Preprocessing

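To prepare the data for modeling, we:

1. Separate the features from the genre labels and encode the labels.
2. Split the features into groups (other features, MFCC means, MFCC variances).
3. Standardize each group to build a scaled, non-PCA feature matrix (`X_scaled`) for reference.
4. Apply PCA to the MFCC means and variances to reduce dimensionality.
5. Combine the remaining features with the PCA components and split the result into training and test sets.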
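
For comparison, the same steps can be chained in a single scikit-learn object. This is a sketch of one possible variant, not the pipeline used in the cells below: it standardizes each MFCC group before PCA, and the synthetic columns and the helper `scale_then_pca` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 3 "other" features plus 5 MFCC means and 5 MFCC variances
rng = np.random.default_rng(42)
cols_other = ['tempo', 'rms_mean', 'rolloff_mean']
cols_mean = [f'mfcc{i}_mean' for i in range(1, 6)]
cols_var = [f'mfcc{i}_var' for i in range(1, 6)]
demo = pd.DataFrame(rng.normal(size=(100, 13)),
                    columns=cols_other + cols_mean + cols_var)


def scale_then_pca(n_components=2):
    # Standardize first so no single coefficient dominates the components
    return Pipeline([('scale', StandardScaler()),
                     ('pca', PCA(n_components=n_components))])


preprocess = ColumnTransformer([
    ('other', StandardScaler(), cols_other),
    ('mfcc_mean', scale_then_pca(), cols_mean),
    ('mfcc_var', scale_then_pca(), cols_var),
])

demo_processed = preprocess.fit_transform(demo)
print(demo_processed.shape)  # 3 scaled columns + 2 + 2 PCA components
```

A design note: fitting such a transformer on the training split only, then calling `transform` on the test split, keeps test-set statistics out of the scaler and the PCA.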


In [None]:
X = df.drop(['filename', 'label'], axis=1)
y = df['label']
X.shape

In [None]:
# before applying PCA, we need to separate the MFCC mean and variance columns from the others

mean_cols = [col for col in X.columns if 'mfcc' in col and 'mean' in col]
var_cols = [col for col in X.columns if 'mfcc' in col and 'var' in col]
other_cols = [col for col in X.columns if col not in mean_cols + var_cols]

mean_data = X.loc[:, mean_cols]
var_data = X.loc[:, var_cols]
others_data = X.loc[:, other_cols]

print('# of column in others_data:', len(other_cols))
print('# of column in mean_data:', len(mean_cols))
print('# of column in var_data:', len(var_cols))

# standardize data
scaler = StandardScaler()
mean_scaled = scaler.fit_transform(mean_data)
var_scaled = scaler.fit_transform(var_data)
others_scaled = scaler.fit_transform(others_data)

# for non-PCA input data
X_scaled = np.concatenate([others_scaled, mean_scaled, var_scaled], axis=1)
X_scaled.shape


pca1 = PCA(n_components=2)
tmp1 = pca1.fit_transform(mean_data)
print(f'{round(np.sum(pca1.explained_variance_ratio_), 4)} variance explained')
print('shape PCA mean:', tmp1.shape)

pca2 = PCA(n_components=2)
tmp2 = pca2.fit_transform(var_data)
print(f'{round(np.sum(pca2.explained_variance_ratio_), 4)} variance explained')
print('shape PCA var:', tmp2.shape)

# for PCA input data
X_pca_columns = other_cols + ['mfcc_mean_pca1',
                              'mfcc_mean_pca2', 'mfcc_var_pca1', 'mfcc_var_pca2']
X_pca = np.concatenate([others_data, tmp1, tmp2], axis=1)
X_pca.shape


le = LabelEncoder()
y_enc = le.fit_transform(y)
le.classes_


X_train, X_test, y_train, y_test = train_test_split(
    X_pca, y_enc, test_size=0.2, stratify=y_enc, random_state=1)
print(X_train.shape)
print(X_test.shape)

By reducing MFCC features to 2 components each, we retain significant variance while reducing the feature count, which helps prevent overfitting.
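
How much variance a given number of components retains can be checked from the cumulative explained-variance ratio. The sketch below uses synthetic correlated columns rather than the MFCC data, and the 95% target is an arbitrary assumption chosen for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 20 correlated columns driven by 3 latent factors
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 3))
mixing = rng.normal(size=(3, 20))
demo = latent @ mixing + 0.1 * rng.normal(size=(300, 20))

# Fit PCA with all components and accumulate the explained-variance ratios
pca = PCA().fit(StandardScaler().fit_transform(demo))
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of components whose cumulative ratio reaches the target
target = 0.95
n_needed = int(np.searchsorted(cumulative, target) + 1)
print(n_needed, cumulative[:5].round(3))
```

Running the same check on `mean_data` and `var_data` would show how far two components are from any chosen target.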



## Step 7: Training the Model


In [None]:
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import SGDClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

from sklearn.metrics import classification_report

### Optimum Model Selection


In [None]:
def model_evaluation(model, X_train, X_test, desc):
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    print(desc)
    print(classification_report(y_test, y_pred,
          target_names=le.classes_, zero_division=0.0))

    return y_pred

In [None]:
# evaluate base classifier model

model = RandomForestClassifier()
model_evaluation(model, X_train, X_test, 'RFC Evaluation')

model = SVC()
model_evaluation(model, X_train, X_test, 'SVC Evaluation')

model = DecisionTreeClassifier()
model_evaluation(model, X_train, X_test, 'DCT Evaluation')

model = XGBClassifier()
model_evaluation(model, X_train, X_test, 'XGB Evaluation')

model = SGDClassifier()
model_evaluation(model, X_train, X_test, 'SGD Evaluation')

model = GaussianNB()
model_evaluation(model, X_train, X_test, 'NB Evaluation')

model = KNeighborsClassifier()
_ = model_evaluation(model, X_train, X_test, 'KNN Evaluation')

As we can see, XGBClassifier achieves the highest accuracy of all the classifiers tested, so we choose it as our base model.
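
A single 80/20 split leaves only 200 test clips, so the ranking between close models can shift with the random seed. One way to make the comparison steadier is stratified k-fold cross-validation. The sketch below assumes synthetic 10-class data and only three scikit-learn classifiers as stand-ins; the real `X_pca` and `y_enc` could be passed in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic 10-class data standing in for the 22-column feature matrix
X_demo, y_demo = make_classification(
    n_samples=500, n_features=22, n_informative=12, n_classes=10,
    n_clusters_per_class=1, random_state=0)

candidates = {
    'RFC': RandomForestClassifier(n_estimators=100, random_state=0),
    'DCT': DecisionTreeClassifier(random_state=0),
    'KNN': KNeighborsClassifier(),
}

# Five stratified folds keep the class proportions equal in every fold
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
cv_scores = {}
for name, clf in candidates.items():
    scores = cross_val_score(clf, X_demo, y_demo, cv=cv, scoring='f1_macro')
    cv_scores[name] = scores
    print(f'{name}: {scores.mean():.3f} +/- {scores.std():.3f}')
```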


### Fine-Tuning the Hyperparameters of the Selected Model


In [None]:
from sklearn.model_selection import GridSearchCV

In [None]:
param_grid = {
    'learning_rate': [0.2, 0.3, 0.4],
    'max_depth': [3, 4, 5],
    'n_estimators': [50, 100, 150]
}

In [None]:
grid = GridSearchCV(XGBClassifier(eval_metric='mlogloss'), param_grid, cv=3)
grid.fit(X_train, y_train)
grid.best_params_

In [None]:
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
y_pred = grid.predict(X_test)
print(classification_report(y_test, y_pred,
      target_names=le.classes_, zero_division=0.0))

# generate confusion matrix

cm = confusion_matrix(y_test, y_pred)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=le.classes_)
disp.plot(cmap=plt.cm.Blues, xticks_rotation=45)
plt.tight_layout()
plt.show()

The tuned XGBClassifier performs reasonably well, reaching an F1-score of 0.75. It classifies classical, metal, and hip-hop clips very well, but struggles with country, reggae, and rock.


### Checking Feature Importance

Let's see which features matter most and how they affect model performance.


In [None]:
from xgboost import plot_importance

# plot importance features
plot_importance(grid.best_estimator_, max_num_features=15,
                importance_type='gain', show_values=False)  # 'gain' is often better
plt.show()

In [None]:
import numpy as np

# get n feature importance
print('# of X_train features:', X_train.shape[1])
n_feature = 15
feature_importance = pd.Series(
    grid.best_estimator_.feature_importances_, index=np.arange(X_train.shape[1]))
top_feature = feature_importance.sort_values(
    ascending=False).head(n_feature).index.tolist()
top_feature

In [None]:
len(X_pca_columns)

In [None]:
important_feature = [X_pca_columns[i] for i in top_feature]
print(f'{n_feature} important feature: {", ".join(important_feature)}')

## Step 8: Model Performance Evaluation

Let's evaluate the model using accuracy, a classification report, and a confusion matrix, starting with a version trained on only the top 15 features.


In [None]:
# define new train test data
X_train_topf = X_train[:, top_feature]
X_test_topf = X_test[:, top_feature]

# fit XGBC model with top n features
xgbc = XGBClassifier(learning_rate=0.3, max_depth=3, n_estimators=100)
_ = model_evaluation(xgbc, X_train_topf, X_test_topf,
                     f'XGBC with Top {n_feature} Features Evaluation')

The accuracy of XGB decreases when we keep only the 15 most important features, which means the remaining features also carry information about the genre. We will therefore use the full 22-feature input for the next experiment.
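
Rather than testing a single cut-off, the number of retained features can be swept to see where performance starts to drop. The sketch below is illustrative only: it uses synthetic data and a random forest as a stand-in for XGB, and the values of `k` are assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic data with 22 columns, mirroring the width of the PCA feature matrix
X_demo, y_demo = make_classification(
    n_samples=600, n_features=22, n_informative=10, n_classes=5,
    n_clusters_per_class=1, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, stratify=y_demo, random_state=1)

# Rank features once using a model fitted on all columns
ranker = RandomForestClassifier(n_estimators=200, random_state=1).fit(X_tr, y_tr)
ranking = np.argsort(ranker.feature_importances_)[::-1]

# Refit on the top-k columns for several values of k and record macro F1
sweep = {}
for k in (5, 10, 15, 22):
    cols = ranking[:k]
    clf = RandomForestClassifier(n_estimators=200, random_state=1)
    clf.fit(X_tr[:, cols], y_tr)
    sweep[k] = f1_score(y_te, clf.predict(X_te[:, cols]), average='macro')
    print(f'top {k:2d} features: macro F1 = {sweep[k]:.3f}')
```

If the sweep is used to pick a final `k`, scoring on a separate validation split rather than the test split avoids tuning on the test data.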


### Using Ensemble Model Classifier

Another option is an ensemble that combines the predictions of the best model (XGB) with those of the second-best model (RFC).


In [None]:
from sklearn.ensemble import VotingClassifier

In [None]:
voting_clf = VotingClassifier(
    estimators=[('xgb', XGBClassifier(learning_rate=0.3, max_depth=3,
                 n_estimators=100)), ('rfc', RandomForestClassifier())],
    voting='soft'
)
# model_evaluation below fits the ensemble, so a separate fit call is not needed

y_pred = model_evaluation(voting_clf, X_train, X_test,
                          'Voting Classifier (XGB + RFC)')

# generate confusion matrix
cm = confusion_matrix(y_test, y_pred)
disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=le.classes_)
disp.plot(cmap=plt.cm.Blues, xticks_rotation=45)
plt.tight_layout()
plt.show()

The XGB + RFC ensemble outperforms the single XGB classifier, reaching an F1-score of 0.78 compared with 0.75 in the previous evaluation.
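
With `voting='soft'`, the ensemble averages the class probabilities of its members and picks the class with the highest mean. The sketch below reproduces that average by hand on synthetic data; logistic regression stands in for XGB here purely to keep the example dependent on scikit-learn alone.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression

X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=6, n_classes=3,
    n_clusters_per_class=1, random_state=0)

soft_vote = VotingClassifier(
    estimators=[('lr', LogisticRegression(max_iter=1000)),
                ('rfc', RandomForestClassifier(n_estimators=50, random_state=0))],
    voting='soft',
).fit(X_demo, y_demo)

# Average the class probabilities of the fitted members by hand
member_probas = [est.predict_proba(X_demo) for est in soft_vote.estimators_]
manual_proba = np.mean(member_probas, axis=0)
manual_pred = soft_vote.classes_[np.argmax(manual_proba, axis=1)]

# Both checks compare the manual average with the ensemble's own output
print(np.allclose(manual_proba, soft_vote.predict_proba(X_demo)))
print(np.array_equal(manual_pred, soft_vote.predict(X_demo)))
```

Soft voting requires every member to implement `predict_proba`, which both XGBoost and random forests do.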


**Analysis**:

- The accuracy indicates the model's overall performance.
- The classification report provides precision, recall, and F1-score per genre, highlighting which genres are harder to classify.
- The confusion matrix shows misclassifications, helping identify genre pairs that are often confused.
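
The genre pairs that are confused most often can also be pulled out of the confusion matrix programmatically. This is a minimal sketch with a hypothetical helper `most_confused_pairs` and toy labels; the counts are not results from this notebook.

```python
import numpy as np
from sklearn.metrics import confusion_matrix


def most_confused_pairs(y_true, y_pred, class_names, top_n=3):
    """List (true, predicted, count) for the largest off-diagonal cells."""
    cm = confusion_matrix(y_true, y_pred, labels=np.arange(len(class_names)))
    off_diag = cm.copy()
    np.fill_diagonal(off_diag, 0)  # ignore correct predictions
    order = np.argsort(off_diag, axis=None)[::-1][:top_n]
    rows, cols = np.unravel_index(order, off_diag.shape)
    return [(class_names[r], class_names[c], int(off_diag[r, c]))
            for r, c in zip(rows, cols) if off_diag[r, c] > 0]


# Toy labels: 0 = blues, 1 = country, 2 = rock
names = ['blues', 'country', 'rock']
true_demo = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])
pred_demo = np.array([0, 0, 2, 1, 2, 2, 2, 2, 0, 2])
confused = most_confused_pairs(true_demo, pred_demo, names)
print(confused)
```

With the notebook's variables, the call would be `most_confused_pairs(y_test, y_pred, le.classes_)`.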


### Saving the model


In [None]:
# Save the trained model to disk
import joblib

joblib.dump(voting_clf, 'music_genre_classifier.pkl')

print("Model saved as 'music_genre_classifier.pkl'")

## Step 9: Testing with a Sample Audio File

Let's test our model on a sample audio file (`jazz.00075.wav`) by extracting its features and predicting its genre.


In [None]:
import librosa
import IPython.display as ipd

In [None]:
# Load Audio File
audio_dir = '/kaggle/input/gtzan-dataset-music-genre-classification/Data/genres_original/'
selected_audio = audio_dir+'jazz/jazz.00075.wav'
audio1, sr = librosa.load(selected_audio)

# Play audio
print("Playing sample audio:")
ipd.Audio(selected_audio)

### Extract Audio Features

Extract the same features that appear in the CSV file.


In [None]:
# Define feature extraction function
def get_audio_features(y, sr):
    # Frame-level features, each computed once and then summarized below
    frame_features = {
        'chroma_stft': librosa.feature.chroma_stft(y=y, sr=sr),
        'rms': librosa.feature.rms(y=y),
        'spectral_centroid': librosa.feature.spectral_centroid(y=y, sr=sr),
        'spectral_bandwidth': librosa.feature.spectral_bandwidth(y=y, sr=sr),
        'rolloff': librosa.feature.spectral_rolloff(y=y, sr=sr),
        'zero_crossing_rate': librosa.feature.zero_crossing_rate(y=y),
        'harmony': librosa.effects.harmonic(y),
        'perceptr': librosa.feature.spectral_contrast(y=y, sr=sr),
    }

    # Summarize every feature by its mean and variance, in CSV column order
    features = {'length': len(y)}
    for name, values in frame_features.items():
        features[f'{name}_mean'] = np.mean(values)
        features[f'{name}_var'] = np.var(values)

    # beat_track returns the tempo as a float or a 1-element array,
    # depending on the librosa version
    tempo = librosa.beat.beat_track(y=y, sr=sr)[0]
    features['tempo'] = float(np.atleast_1d(tempo)[0])

    # MFCC features: mean and variance of each of the 20 coefficients
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    for i in range(20):
        features[f'mfcc{i+1}_mean'] = np.mean(mfcc[i])
        features[f'mfcc{i+1}_var'] = np.var(mfcc[i])

    return features

In [None]:
# Extract features
audio_features = get_audio_features(audio1, sr)
print('total features:', len(audio_features))
audio_features

### Preprocessing Input Data

Apply the same preprocessing steps to the extracted features as we did above.


In [None]:
audio_df = pd.DataFrame([audio_features])
audio_df

# split mfcc column from the other columns
audio_other = audio_df.loc[:, other_cols]
audio_mean = audio_df.loc[:, mean_cols]
audio_var = audio_df.loc[:, var_cols]

print(audio_other.shape, audio_mean.shape, audio_var.shape)

In [None]:
# reduce dimensionality using PCA
audio_mean_pca = pca1.transform(audio_mean)
print(f'{round(np.sum(pca1.explained_variance_ratio_), 4)} variance explained')
print('shape PCA mean:', audio_mean_pca.shape)

audio_var_pca = pca2.transform(audio_var)
print(f'{round(np.sum(pca2.explained_variance_ratio_), 4)} variance explained')
print('shape PCA var:', audio_var_pca.shape)

# concatenate with rest of the columns
audio_processed = np.concatenate(
    [audio_other, audio_mean_pca, audio_var_pca], axis=1)
audio_processed.shape

In [None]:
# predict genre using voting classifier
pred_label = voting_clf.predict(audio_processed)
predicted_genre = le.classes_[pred_label][0]
print(f'Predicted Genre: {predicted_genre}')

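The model predicts the correct genre for this clip. Note that `jazz.00075.wav` comes from GTZAN itself and may have been part of the training split, so this is a check that the end-to-end pipeline works rather than evidence of generalization to unseen audio.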
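
The inference cells above rely on `pca1`, `pca2`, and `le` still being in memory, whereas only the classifier was written to disk earlier. One option is to persist all fitted objects together. The sketch below uses tiny synthetic stand-ins; the bundle keys and the file name `genre_bundle_demo.pkl` are assumptions, not part of the original notebook.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder

# Tiny synthetic stand-ins for the fitted objects from the training cells
rng = np.random.default_rng(0)
mfcc_demo = rng.normal(size=(60, 20))
labels_demo = np.array(['jazz', 'rock', 'pop'] * 20)

encoder_demo = LabelEncoder().fit(labels_demo)
pca_demo = PCA(n_components=2).fit(mfcc_demo)
clf_demo = RandomForestClassifier(n_estimators=10, random_state=0).fit(
    pca_demo.transform(mfcc_demo), encoder_demo.transform(labels_demo))

# Persist everything needed at inference time in one file
bundle = {'model': clf_demo, 'pca': pca_demo, 'label_encoder': encoder_demo}
bundle_path = os.path.join(tempfile.mkdtemp(), 'genre_bundle_demo.pkl')
joblib.dump(bundle, bundle_path)

# Later session: load the bundle, then transform -> predict -> decode
loaded = joblib.load(bundle_path)
new_clip = mfcc_demo[:1]
pred = loaded['model'].predict(loaded['pca'].transform(new_clip))
decoded = loaded['label_encoder'].inverse_transform(pred)[0]
print(decoded)
```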


## Step 12: Conclusion and Future Work

In this project, we successfully built a music genre classification system using the GTZAN dataset. Key takeaways:

- **Data Insights**: Visualizations revealed distinct feature patterns across genres, with some overlap (e.g., rock and blues).
- **Preprocessing**: PCA effectively reduced dimensionality while retaining significant variance.
- **Modeling**: The voting classifier achieved strong performance, outperforming individual models.
- **Application**: The model correctly classified the sample audio file.

**Future Work**:

- Extend the project to a multimodal setting by adding lyrics as a second modality.
- Experiment with deep learning models (e.g., CNNs on spectrograms) for potentially better performance.
- Test the model on a larger, more diverse dataset to better assess and improve generalization.

This project showcases the power of machine learning in audio analysis, opening doors to applications in music recommendation, audio tagging, and more.


## 📚 Learnings from This Work

This project taught me several key lessons about audio-based machine learning and its broader implications:

1. **Feature Engineering is Powerful**: The GTZAN dataset’s precomputed features (MFCCs, tempo, etc.) capture essential audio characteristics. Visualizations like box plots and PCA scatter plots revealed how features like spectral centroid and chroma STFT differentiate genres (e.g., classical vs. metal).
2. **Dimensionality Reduction Matters**: Applying PCA to MFCC means and variances reduced the feature space while retaining significant variance (~74-78%), preventing overfitting and speeding up training.
3. **Ensemble Models Shine**: The voting classifier (XGBoost + Random Forest) achieved a strong F1-score of 0.78, outperforming individual models. This highlights the value of combining complementary algorithms.
4. **Challenges in Generalization**: The model struggled with genres like country and reggae, likely due to feature overlap (e.g., rock and blues in PCA space). This underscores the need for richer features or multimodal inputs.
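5. **Real-World Application**: Running the full pipeline on a sample audio file, from feature extraction to prediction, showed its practical utility and produced the correct genre. Because the clip came from GTZAN itself, robustness to genuinely new recordings still needs to be tested.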

The project also deepened my understanding of MIR pipelines, from feature extraction to model evaluation, and piqued my curiosity about multimodal extensions.


## 💭 Reflections

**What surprised me?**

- **Genre Overlap**: I was surprised by how much genres like rock and blues overlapped in PCA space, reflecting their musical similarity (e.g., shared guitar-driven structures). This explains the model’s lower performance on these classes and highlights the complexity of genre boundaries.
- **Ensemble Power**: The voting classifier’s improvement over XGBoost alone (F1: 0.78 vs. 0.75) was notable. Combining models with different strengths (gradient boosting and bagging) yielded a better solution.
- **Feature Importance**: The feature importance analysis showed that MFCC PCA components and spectral features were critical, but reducing to 15 features hurt performance. This suggests that even less important features contribute to the model’s discriminative power.

**Scope for Improvement**

- **Multimodal Integration**: Incorporating lyrics (via NLP embeddings) or album art (via CNNs) could improve classification, especially for ambiguous genres. For example, lyrics could distinguish reggae’s thematic content from rock’s.
- **Deep Learning**: Using CNNs or transformers on raw spectrograms might capture temporal and spectral patterns better than handcrafted features, as shown in recent MIR research.
- **Larger Datasets**: GTZAN’s 1000 clips are limited. Training and testing on larger datasets like FMA or AudioSet could improve accuracy and generalizability.
- **Hyperparameter Tuning**: The GridSearchCV was limited to a few parameters. More extensive tuning (e.g., regularization, learning rate schedules) could boost performance.
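
As one illustration of more extensive tuning, a randomized search can sample from continuous ranges instead of a small fixed grid. The sketch below is an assumption-laden example: it uses synthetic data, scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost, and parameter ranges chosen only for demonstration.

```python
from scipy.stats import randint, uniform
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV

X_demo, y_demo = make_classification(
    n_samples=200, n_features=10, n_informative=6, n_classes=3,
    n_clusters_per_class=1, random_state=0)

param_distributions = {
    'learning_rate': uniform(0.05, 0.35),  # samples from [0.05, 0.40]
    'max_depth': randint(2, 6),            # integers 2..5
    'n_estimators': randint(20, 60),       # integers 20..59
    'subsample': uniform(0.6, 0.4),        # samples from [0.6, 1.0]
}

# Draw 5 random configurations and score each with 3-fold cross-validation
search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0), param_distributions,
    n_iter=5, cv=3, scoring='f1_macro', random_state=0)
search.fit(X_demo, y_demo)
print(search.best_params_, round(search.best_score_, 3))
```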


## 🔗 References

- [GTZAN Dataset](https://www.kaggle.com/datasets/andradaolteanu/gtzan-dataset-music-genre-classification/data)
- [Music Genre Classification with Machine Learning (Medium)](https://medium.com/@yamind/MusicGenreML)
- [MusicLM: Generating Music From Text](https://arxiv.org/abs/2301.11325)
- [Librosa Library](https://librosa.org/)
- [Scikit-learn Documentation](https://scikit-learn.org/)
- [XGBoost Documentation](https://xgboost.readthedocs.io/)

This project was built using Python, scikit-learn, XGBoost, and Librosa, with the original notebook running on Kaggle.


# THANK YOU

***Author: M Sai Srinivas***
