Analyzing liked songs features from Spotify, using Bayesian networks

Modern streaming platforms are known for their ability to predict their users' preferences. The music industry in particular poses a complex challenge because of the sheer number of genres and songs available.
This project builds a Bayesian network from a dataset of personal data fetched through Spotify's API, then experiments with different queries and methods to find interesting relationships. Finally, a use-case scenario with the final model is presented.

Relevant files

  • Dataset.ipynb: notebook that generates the dataset, explaining how the data was processed and the reasons behind each choice.
  • Bayesify.ipynb: main notebook, where the models are built and all the experiments are performed.
  • spotifyData.csv: the preprocessed dataset used for estimating the CPDs.
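
To illustrate what estimating a CPD from the dataset means, here is a minimal sketch in plain Python. It is not the code from Bayesify.ipynb (which relies on pgmpy). The column names `energy` and `danceability`, their discretized values, and the inline sample rows are assumptions made up for the example; they stand in for spotifyData.csv.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical stand-in for spotifyData.csv: column names and values are assumptions.
SAMPLE_CSV = """energy,danceability
high,high
high,high
high,low
low,low
low,low
low,high
low,low
"""


def estimate_cpd(rows, child, parent):
    """Maximum-likelihood estimate of P(child | parent) from relative frequencies."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[parent]][row[child]] += 1
    # Normalize the counts for each parent value so they sum to 1.
    return {
        parent_value: {
            child_value: n / sum(child_counts.values())
            for child_value, n in child_counts.items()
        }
        for parent_value, child_counts in counts.items()
    }


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cpd = estimate_cpd(rows, child="danceability", parent="energy")
for parent_value, distribution in cpd.items():
    print(parent_value, distribution)
```

Each row of the resulting table is the distribution of the child variable for one value of its parent. With several parents, the same counting would be done per combination of parent values.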

Libraries

  • Spotipy to retrieve data about my liked songs, which is then converted to CSV and imported with Pandas.
  • pgmpy to build Bayesian networks and run inference.
  • NumPy, Seaborn, Matplotlib and Pandas for data manipulation and visualization.
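
As a rough illustration of approximate inference by rejection sampling (one of the sampling methods listed in the references), here is a sketch in plain Python using only the standard library. It does not use pgmpy's API. The two-node network `energy -> danceability` and all its probabilities are hypothetical values chosen for the example, not results from this project.

```python
import random

# Hypothetical two-node network: energy -> danceability.
# All probabilities below are made up for illustration.
P_ENERGY = {"high": 0.4, "low": 0.6}
P_DANCE_GIVEN_ENERGY = {
    "high": {"high": 0.7, "low": 0.3},
    "low": {"high": 0.2, "low": 0.8},
}


def draw(distribution, rng):
    """Draw one value from a {value: probability} dictionary."""
    values = list(distribution)
    weights = [distribution[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


def forward_sample(rng):
    """Sample the parent first, then the child given the sampled parent."""
    energy = draw(P_ENERGY, rng)
    danceability = draw(P_DANCE_GIVEN_ENERGY[energy], rng)
    return {"energy": energy, "danceability": danceability}


def rejection_query(target, evidence, n_samples, seed=0):
    """Estimate P(target | evidence) by discarding samples that contradict the evidence."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_samples):
        sample = forward_sample(rng)
        if all(sample[name] == value for name, value in evidence.items()):
            accepted.append(sample[target])
    if not accepted:
        raise ValueError("no sample matched the evidence")
    return {value: accepted.count(value) / len(accepted) for value in set(accepted)}


estimate = rejection_query("energy", {"danceability": "high"}, n_samples=20000)
print(estimate)
```

For these made-up numbers, Bayes' rule gives P(energy = high | danceability = high) = 0.28 / 0.40 = 0.7, so the estimate should land close to that value. Rejection sampling wastes every sample that disagrees with the evidence, which is why it becomes slow when the evidence is unlikely.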

References

  1. Spotify's API
  2. PGMpy Sampling source code
  3. PGMpy rejection sampling source code