The parameters of our dataset:
- 943 users
- 1682 songs
- 100000 rows
- 5 keywords per song: title, duration, loudness, tempo, key
- 9 types of event, event strength and listening time per user per song

We generated this dataset ourselves:
- The 5 keywords per song come from the songs dataset downloaded here: https://github.com/thomasSve/Million-Song-Dataset-Analysis/tree/master/datasets
- The "943 users, 1682 items, 100000 rows" structure comes from the 'u.data' file used in the notebook "3a_movieLens_dataset_solutions", downloaded here: https://github.com/thibaultallart/IA316-2020/tree/master/notebooks
- The 9 types of event, the event strength and the listening time per user per song were generated by ourselves.

Listening operation | Strength | Subsequent operation | Strength |
---|---|---|---|
Listened to it completely | +1 | Like | +1 |
Listen to it frequently | +3 | View song information | +1 |
Skip it | -1 | Add it to your music | +2 |
Skip it frequently | -3 | Download | +3 |
N/A | N/A | Don’t recommend it anymore | -2 |
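
The strength table above can be read as a simple lookup from event type to score. The following is a minimal sketch of that idea, not the code from our notebooks: the snake_case event names are hypothetical labels, and summing the strengths per (user, song) pair is an assumption made only for illustration.

```python
from collections import defaultdict

# Strengths copied from the table above. The snake_case event names are
# hypothetical labels, not necessarily the ones stored in song_data.csv.
EVENT_STRENGTH = {
    "listened_completely": 1,
    "listened_frequently": 3,
    "skipped": -1,
    "skipped_frequently": -3,
    "liked": 1,
    "viewed_song_information": 1,
    "added_to_my_music": 2,
    "downloaded": 3,
    "do_not_recommend": -2,
}


def total_strength(events):
    """Sum event strengths per (user_id, song_id) pair.

    Summation is an assumption for illustration; the notebooks may
    combine events differently.
    """
    totals = defaultdict(int)
    for user_id, song_id, event in events:
        totals[(user_id, song_id)] += EVENT_STRENGTH[event]
    return dict(totals)


# Made-up example events: (user_id, song_id, event type)
events = [
    (1, 10, "listened_completely"),
    (1, 10, "liked"),
    (1, 11, "skipped"),
    (2, 10, "downloaded"),
    (2, 10, "do_not_recommend"),
]
scores = total_strength(events)
```

Under this assumption, a song that a user both downloaded and marked "Don’t recommend it anymore" ends up with a small positive score, so a real agent might prefer to treat the negative signal as overriding.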

You can find here:
- this dataset, named "song_data.csv"
- a smaller dataset, named "smaller_songs_dataset.csv"
- Code for the main environment and the "Random", "UCB" and "Linear UCB" agents:
  - "Spotify Recommendation (Large dataset).ipynb": the implementation of the environment and the agents, and the experiments on the large dataset of 100000 rows
  - "Spotify_Recommendation (Smaller dataset).ipynb": the experiments on a smaller dataset extracted with the condition "user_id<=200"
- Code for the same main environment and the "ALS" agent:
  - "ALS_vs_random.ipynb": the implementation of the "ALS" agent and its regrets
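
The smaller dataset mentioned above is obtained by keeping only the rows with "user_id<=200". A minimal sketch of that filter, using only the Python standard library, could look like the following. The `user_id` column name comes from the condition itself; the other column names and the sample rows are made up for the example, and the actual notebook may perform this step differently.

```python
import csv
import io


def filter_users(rows, max_user_id=200):
    """Keep only the rows whose user_id is at most max_user_id."""
    return [row for row in rows if int(row["user_id"]) <= max_user_id]


# A tiny in-memory stand-in for song_data.csv. Only the user_id column is
# taken from the README; song_id and event_strength are assumed names.
sample_csv = io.StringIO(
    "user_id,song_id,event_strength\n"
    "1,10,3\n"
    "200,11,-1\n"
    "201,12,2\n"
)
rows = list(csv.DictReader(sample_csv))
subset = filter_users(rows)
```

To apply the same filter to the real file, the `io.StringIO` object can be replaced by `open("song_data.csv", newline="")`, assuming the file has a header row containing `user_id`.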

You can also have a look at our presentation slides here, in "Spotify.pdf".