Spotify Music Preference Classification 🎶

Yay 👍🏻! or Nay 👎🏻!

A binary classification project that predicts whether a user will like a song, using Extreme Gradient Boosting (XGB), a Stacking Classifier (KNN, Decision Tree, Logistic Regression, SVM, Bernoulli Naive Bayes), Gaussian Naive Bayes, and Random Forest.
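
As a rough illustration of how the five base learners listed above could be combined, here is a minimal scikit-learn sketch. It is not the project's actual code: the synthetic data stands in for the Spotify features, and the logistic-regression meta-learner and the 5-fold internal split are assumptions, since the README does not say which final estimator or settings were used.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the song features (assumption: not the real dataset).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The five base learners named in the project description.
base_learners = [
    ("knn", KNeighborsClassifier()),
    ("tree", DecisionTreeClassifier(random_state=0)),
    ("logreg", LogisticRegression(max_iter=1000)),
    ("svm", SVC()),
    ("bnb", BernoulliNB()),
]

# Assumption: logistic regression as the meta-learner (scikit-learn's default).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,  # out-of-fold predictions from the base learners feed the meta-learner
)
stack.fit(X_train, y_train)

test_accuracy = stack.score(X_test, y_test)
print(f"Stacking test accuracy: {test_accuracy:.2%}")
```

The meta-learner is trained on out-of-fold predictions of the base learners, which keeps it from simply memorizing their training-set outputs.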

Key Techniques

Hyperparameter Tuning: Grid search with scikit-learn's GridSearchCV.
Cross Validation: 10-fold cross-validation for robust model evaluation.
Data Preprocessing: Removes duplicates, handles outliers by deletion or clipping, standardizes the data, and label-encodes string values before model training.
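
The three techniques above can be sketched together as follows. This is an illustrative outline under stated assumptions, not the notebook's code: the column names (`artist`, `tempo`, `loudness`, `target`), the synthetic data, the IQR clipping rule, the Random Forest estimator, and the parameter grid are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical toy data; the real project reads SpotifyDataset.csv instead.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "artist": rng.choice(["A", "B", "C"], size=n),
    "tempo": rng.normal(120, 30, size=n),
    "loudness": rng.normal(-8, 4, size=n),
})
df["target"] = (df["tempo"] + rng.normal(0, 20, size=n) > 120).astype(int)
df = pd.concat([df, df.iloc[:10]], ignore_index=True)  # inject 10 duplicate rows

# 1. Handle duplicates.
df = df.drop_duplicates().reset_index(drop=True)

# 2. Manage outliers by clipping to the 1.5 * IQR fences (assumed rule).
for col in ["tempo", "loudness"]:
    q1, q3 = df[col].quantile([0.25, 0.75])
    iqr = q3 - q1
    df[col] = df[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

# 3. Label-encode string values.
df["artist"] = LabelEncoder().fit_transform(df["artist"])

X = df.drop(columns="target")
y = df["target"]

# 4. Standardize inside a pipeline so the scaler is refit on each training fold.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", RandomForestClassifier(random_state=0)),
])

# Hypothetical grid; 2 x 2 = 4 candidates, each scored with 10-fold CV.
param_grid = {"clf__n_estimators": [25, 50], "clf__max_depth": [3, None]}
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best mean CV accuracy: {search.best_score_:.2%}")
```

Putting the scaler inside the pipeline means each fold is standardized using only its own training split, so the cross-validated score is not inflated by information from the held-out fold.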

The project pairs these algorithms with careful data preparation to build a model that predicts song preferences, with the aim of improving the Spotify user experience.

Comparison Results between different ML Models:

Overall Conclusion:

| Algorithm | Accuracy (%) | Train-Test Diff (%) | Precision (%) | Recall (%) | F1 Score (%) | AUC (%) |
|---|---|---|---|---|---|---|
| XGB | 72.80 | 3.86 | 72.22 | 72.96 | 72.59 | 72.80 |
| Gaussian Naive Bayes | 62.22 | 4.41 | 64.38 | 52.55 | 57.87 | 62.10 |
| Random Forest | 71.54 | 6.89 | 70.65 | 72.45 | 71.54 | 71.55 |
| Stacking Classifier | 71.03 | 7.33 | 69.95 | 72.45 | 71.18 | 75.24 |

XGB is the best-performing model on accuracy (72.80%), precision (72.22%), recall (72.96%), and F1 score (72.59%). It also has the smallest gap between train and test accuracy (3.86%), which indicates good generalization.
Random Forest is a close second on accuracy (71.54%), precision, recall, and F1 score. However, its larger train-test gap (6.89%) suggests it may be overfitting.
Stacking Classifier is competitive but leads only on AUC (75.24%), and it has the largest train-test gap (7.33%).
Gaussian Naive Bayes scores lowest on every performance metric. Its train-test gap (4.41%) is the second smallest, which indicates good generalization, but the model may be too simple to capture the patterns in the data.
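
For reference, the columns in the comparison table could be computed along these lines. This is a sketch under assumptions: `evaluate` is a hypothetical helper, the data is synthetic, the train-test difference is taken as train accuracy minus test accuracy, and AUC is computed from predicted probabilities (the README does not state how its AUC values were obtained).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split


def evaluate(model, X_train, y_train, X_test, y_test):
    """Return the table's metrics, in percent, for an already fitted model."""
    y_pred = model.predict(X_test)
    y_score = model.predict_proba(X_test)[:, 1]  # probability of the positive class
    train_acc = accuracy_score(y_train, model.predict(X_train))
    test_acc = accuracy_score(y_test, y_pred)
    return {
        "Accuracy (%)": round(100 * test_acc, 2),
        # Assumed definition: train accuracy minus test accuracy.
        "Train-Test Diff (%)": round(100 * (train_acc - test_acc), 2),
        "Precision (%)": round(100 * precision_score(y_test, y_pred), 2),
        "Recall (%)": round(100 * recall_score(y_test, y_pred), 2),
        "F1 Score (%)": round(100 * f1_score(y_test, y_pred), 2),
        "AUC (%)": round(100 * roc_auc_score(y_test, y_score), 2),
    }


# Synthetic stand-in data and an example model (both assumptions).
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

report = evaluate(model, X_train, y_train, X_test, y_test)
for name, value in report.items():
    print(f"{name}: {value}")
```

A large positive train-test difference means the model scores much better on data it has seen than on held-out data, which is the overfitting signal the conclusions above refer to.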

Per-metric comparison charts: Train-Test Accuracies (to measure overfitting), Accuracy, Precision, Recall, F1-Score, AUC.

Bernardbyy/SpotifyMusicClassifier

Folders and files

Latest commit

History

Repository files navigation

Spotify Music Preference Classification 🎶

Key Techniques

Overall Conclusion:

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages