❤️ This project was truly my favorite as it not only allowed me to explore my passion for music, but also made me a fan of data science and opened up a whole new world of discovery in this field. ❤️
✍️ The report for this project can be downloaded from the following links:
- English version: https://github.com/louiselize/SpotifyDataAnalysis/blob/main/PDF/ENGLISH_REPORT_LIZE_AYOUB.pdf
- French version: https://github.com/louiselize/SpotifyDataAnalysis/blob/main/PDF/FRENCH_REPORT_LIZE_AYOUB.pdf
As a music lover and data enthusiast, I was excited to dive into the vast amount of data collected by Spotify, a popular music streaming service with millions of users worldwide. In this project, I will use both supervised and unsupervised learning methods to classify songs into their respective genres, based on their audio features.
➡️ Supervised learning involves training a model to predict an outcome from labeled data. In the context of Spotify, this means training a model to classify songs into their respective genres based on their audio features. I will use a supervised learning algorithm, such as logistic regression or decision trees, to build a predictive model from features such as danceability, tempo, and popularity.
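To make the supervised setup concrete, here is a minimal, dependency-free sketch of logistic regression trained by gradient descent. It is not the code used in the report: the data is synthetic, the two features stand in for danceability and tempo (assumed to be rescaled to the 0-1 range), and the two genre labels ("dance" = 1, "acoustic" = 0) are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(X, y, lr=0.5, epochs=1000):
    # Batch gradient descent on the mean log-loss.
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    # Predict class 1 when the estimated probability is at least 0.5.
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Synthetic songs (assumption): [danceability, scaled tempo].
rng = random.Random(0)


def make_song(label):
    if label == 1:  # hypothetical "dance" genre
        return [rng.gauss(0.8, 0.05), rng.gauss(0.7, 0.05)]
    return [rng.gauss(0.3, 0.05), rng.gauss(0.4, 0.05)]  # hypothetical "acoustic" genre


labels = [i % 2 for i in range(200)]
songs = [make_song(label) for label in labels]

# Hold out the last 50 songs to estimate accuracy on unseen data.
X_train, y_train = songs[:150], labels[:150]
X_test, y_test = songs[150:], labels[150:]

w, b = train_logistic_regression(X_train, y_train)
accuracy = sum(predict(w, b, x) == t for x, t in zip(X_test, y_test)) / len(y_test)
print(f"test accuracy: {accuracy:.2f}")
```

On real Spotify data the genres overlap far more than in this toy example, so a library implementation with proper feature scaling and cross-validation would be the more sensible choice.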
➡️ Unsupervised learning, on the other hand, involves identifying patterns and relationships in unlabeled data. In the context of Spotify, this means clustering songs based on their audio features and identifying common characteristics within each cluster. I will use dimensionality reduction techniques, such as principal component analysis (PCA), to uncover structure in the data that may not be apparent through manual analysis. I will also use K-means or hierarchical clustering to group similar songs together and explore the underlying structure of the data.
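The clustering idea can be sketched in a few lines of plain Python. This is an illustrative K-means implementation, not the one used in the report: the songs are synthetic, the two features stand in for danceability and tempo (assumed rescaled to 0-1), and the centroids are seeded with a simple farthest-first rule chosen here for determinism rather than the usual random or k-means++ initialization.

```python
import math
import random


def farthest_first_init(points, k):
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, max_iterations=50):
    centroids = farthest_first_init(points, k)
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments


# Synthetic songs (assumption): [danceability, scaled tempo], two loose groups.
rng = random.Random(1)
calm = [[rng.gauss(0.2, 0.05), rng.gauss(0.2, 0.05)] for _ in range(30)]
lively = [[rng.gauss(0.8, 0.05), rng.gauss(0.8, 0.05)] for _ in range(30)]
points = calm + lively

centroids, assignments = kmeans(points, k=2)
for index, centroid in enumerate(centroids):
    size = assignments.count(index)
    print(f"cluster {index}: {size} songs, centroid {[round(v, 2) for v in centroid]}")
```

No genre labels are used anywhere in this sketch. Comparing the resulting clusters with the known genres afterwards is one way to check whether the audio features alone carry genre information.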
Through this project, I hope to gain insights into the relationships between different audio features and the genres to which they belong. By using both supervised and unsupervised learning methods, I aim to develop a more comprehensive understanding of the underlying patterns in the data and ultimately build a more accurate genre classification model.
I played a significant role in this project, and I received invaluable assistance from Yoann Ayoub, especially for the random forest and decision tree parts.
I am delighted to share that our hard work and collaboration paid off, and we were thrilled to receive a high grade for this project 🥇
While we had a lot of exciting ideas for this project, one shortcoming was that we did not dedicate enough time to cleaning and organizing our repository before testing new features. In retrospect, we recognize the importance of keeping our work clean and well structured. This experience taught me a valuable lesson that I can apply to future data analysis projects.
Spring 2022 semester @UTC