Automatic Playlist Continuation using Random Walks
This repository contains the code for my submission to the RecSys Challenge 2018 (team name 'Team Radboud'). Recommendations are generated by running random walks over a bipartite graph of playlists and tracks.
The graph is built from the Million Playlist Dataset (MPD), available from the official challenge website at https://recsys-challenge.spotify.com/
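To illustrate the general idea of recommending tracks by random walks over a bipartite playlist-track graph, here is a minimal sketch on toy data. It is not the code in this repository: the function name `random_walk_recommend`, the dictionary-based graph, and the `restart_prob` parameter are all assumptions made for illustration, and they do not necessarily correspond to the options accepted by the scripts here.

```python
import random
from collections import Counter


def random_walk_recommend(playlist_tracks, seed_tracks, n_steps=10000,
                          restart_prob=0.05, k=3, seed=1):
    """Rank tracks by visit counts of a random walk started from seed_tracks.

    playlist_tracks: dict mapping a playlist id to a list of track ids
    (hypothetical toy representation of the bipartite graph).
    """
    rng = random.Random(seed)

    # Build the reverse side of the bipartite graph: track -> playlists.
    track_playlists = {}
    for pid, tracks in playlist_tracks.items():
        for track in tracks:
            track_playlists.setdefault(track, []).append(pid)

    # Only seeds that occur in the graph can start a walk.
    seeds = [t for t in seed_tracks if t in track_playlists]
    if not seeds:
        return []

    visits = Counter()
    current = rng.choice(seeds)
    for _ in range(n_steps):
        # Occasionally jump back to a seed track (illustrative assumption).
        if rng.random() < restart_prob:
            current = rng.choice(seeds)
        # One step = track -> random playlist containing it -> random track in it.
        playlist = rng.choice(track_playlists[current])
        current = rng.choice(playlist_tracks[playlist])
        visits[current] += 1

    # Recommend the most visited tracks that are not already seeds.
    seed_set = set(seed_tracks)
    ranked = [t for t, _ in visits.most_common() if t not in seed_set]
    return ranked[:k]


toy_graph = {
    "p1": ["a", "b", "c"],
    "p2": ["a", "b", "d"],
    "p3": ["e", "f"],
}
recommendations = random_walk_recommend(toy_graph, ["a"])
print(recommendations)
```

In this toy graph, tracks that share playlists with the seed track are reachable by the walk, while tracks in the unconnected playlist `p3` are never visited.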
Requirements:
- python >= 3.6
- spotipy (and Spotify API credentials)
First, download the metadata using utils/get_metadata.py. Note that this script requires the Spotify API credentials to be set in the environment:
- 'SPOTIPY_CLIENT_ID': your client id
- 'SPOTIPY_CLIENT_SECRET': your client secret
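Before starting the long-running metadata download, it can help to confirm that both variables are actually set. The following is a small sketch of such a check; the helper `missing_credentials` is hypothetical and not part of this repository.

```python
import os

# Environment variables required for the Spotify API credentials.
REQUIRED_VARS = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET")


def missing_credentials(environ=None):
    """Return the names of required variables that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Spotify API credentials found in the environment.")
```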
Then, build the graph using utils/build_graph.py. Both operations take a long time. The -quick flag can be passed to either script to generate a very small graph from the first MPD files, as a quick check that everything is working. The graph is needed as input to the following scripts:
- run_mpd.py runs the random walk methods on a validation set taken from the MPD. This file is used for experimentation.
- run_challenge.py runs the methods on the challenge set and generates a CSV file in the format described on the challenge page.
The final score on the leaderboard was generated by running the command:
python run_challenge.py <challenge_set.json> <graph.npz> <name_index> -alpha 0.96 -N 100000 -n_p 100000 -seed 1 -switch_d_prune