My submission to the Recsys Challenge 2018: Automatic Playlist Continuation using Random Walks
recsysrw
utils
.gitignore
LICENSE
README.md
Thesis.pdf
run_challenge.py
run_mpd.py


Automatic Playlist Continuation using Random Walks

This repository contains the code for my submission to the Recsys Challenge 2018 (team name 'Team Radboud'). Recommendations are generated from a bipartite graph representation (playlists and tracks) by running random walks over the graph.
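
As a rough illustration of the idea, the sketch below runs a random walk with restarts over a toy playlist/track graph and ranks tracks by visit count. It is not the code in recsysrw: the function names, the `restart_prob` and `n_steps` parameters, and the toy data are all assumptions made for the example, and they are not meant to correspond to the command-line flags of run_challenge.py.

```python
import random
from collections import Counter, defaultdict


def build_adjacency(playlists):
    """Build both sides of the bipartite graph from {playlist_id: [track_id, ...]}."""
    playlist_to_tracks = {p: list(tracks) for p, tracks in playlists.items()}
    track_to_playlists = defaultdict(list)
    for p, tracks in playlist_to_tracks.items():
        for t in tracks:
            track_to_playlists[t].append(p)
    return playlist_to_tracks, track_to_playlists


def recommend(seed_tracks, playlist_to_tracks, track_to_playlists,
              n_steps=10000, restart_prob=0.1, top_k=5, seed=1):
    """Rank tracks by how often a walk started at the seed tracks visits them."""
    rng = random.Random(seed)
    seeds = [t for t in seed_tracks if t in track_to_playlists]
    if not seeds:
        return []
    visits = Counter()
    current = rng.choice(seeds)
    for _ in range(n_steps):
        # Occasionally jump back to a seed track (hypothetical restart rule).
        if rng.random() < restart_prob:
            current = rng.choice(seeds)
        # One step: track -> random playlist containing it -> random track in it.
        playlist = rng.choice(track_to_playlists[current])
        current = rng.choice(playlist_to_tracks[playlist])
        visits[current] += 1
    seed_set = set(seed_tracks)
    ranked = [t for t, _ in visits.most_common() if t not in seed_set]
    return ranked[:top_k]


# Toy data: two overlapping playlists and one unrelated playlist.
toy = {"p1": ["a", "b", "c"], "p2": ["a", "b", "d"], "p3": ["x", "y"]}
p2t, t2p = build_adjacency(toy)
result = recommend(["a"], p2t, t2p)
```

In this toy graph the walk can never reach "x" or "y", because playlist p3 shares no track with the seed's playlists, so only tracks that co-occur with the seed are recommended.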

The graph is built from the Million Playlist Dataset (MPD), available from the official website at https://recsys-challenge.spotify.com/

Requirements:

  • python >= 3.6
  • numpy
  • scipy
  • tqdm
  • spotipy (and Spotify API credentials)
  • whoosh

Usage

First, download the metadata using utils/get_metadata.py. Note that this script requires the Spotify API credentials to be set in the environment (os.environ):

'SPOTIPY_CLIENT_ID': your client id
'SPOTIPY_CLIENT_SECRET': your client secret
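
Before starting a long metadata download, it can help to confirm that both variables are actually set. The helper below is a small sketch for that purpose; `missing_credentials` is a hypothetical name and is not part of this repository or of spotipy.

```python
import os

REQUIRED_VARS = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_credentials()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("Spotify credentials found.")
```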

Then, build the graph using utils/build_graph.py. Both operations take a long time. The -quick flag can be passed to either script to generate a very small graph from the first MPD files, which is useful for checking that everything works. The graph is required as input to run_mpd.py and run_challenge.py.
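
To give an idea of what a playlist/track graph can look like in memory, the sketch below builds a binary playlist-by-track sparse matrix with scipy. It is not a copy of utils/build_graph.py: the function name is hypothetical, the input is assumed to follow the MPD slice layout (a list of playlists, each with a "tracks" list whose entries carry a "track_uri"), and the layout of the graph.npz file produced by the real script may differ.

```python
import numpy as np
from scipy import sparse


def build_bipartite_matrix(playlists):
    """Return (matrix, track_index) for MPD-style playlist dicts.

    matrix[i, j] is 1 if playlist i contains track j, else 0.
    track_index maps each track URI to its column number.
    """
    track_index = {}
    rows, cols = [], []
    n_playlists = 0
    for row, playlist in enumerate(playlists):
        n_playlists = row + 1
        for track in playlist["tracks"]:
            # Assign the next free column the first time a URI is seen.
            col = track_index.setdefault(track["track_uri"], len(track_index))
            rows.append(row)
            cols.append(col)
    data = np.ones(len(rows), dtype=np.float32)
    matrix = sparse.csr_matrix(
        (data, (rows, cols)), shape=(n_playlists, len(track_index))
    )
    # Repeated tracks in one playlist are summed on construction; make it binary.
    matrix.data[:] = 1
    return matrix, track_index


# Toy input with a repeated track in the first playlist.
toy_playlists = [
    {"pid": 0, "tracks": [{"track_uri": "t:a"}, {"track_uri": "t:b"}, {"track_uri": "t:a"}]},
    {"pid": 1, "tracks": [{"track_uri": "t:b"}, {"track_uri": "t:c"}]},
]
matrix, track_index = build_bipartite_matrix(toy_playlists)
```

A matrix like this can be written to disk with `scipy.sparse.save_npz` and read back with `scipy.sparse.load_npz`.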

run_mpd.py runs the random walk methods on a validation set taken from the MPD; this script is used for experimentation. run_challenge.py runs the methods on the challenge set and generates a CSV file in the format described on the challenge page.
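
Validation on the MPD generally means hiding part of a playlist and checking how many hidden tracks are recovered. The sketch below shows one simple way to do that with a track-level R-precision. Both helper names are hypothetical, the split rule is an assumption rather than what run_mpd.py does, and the official challenge metrics are defined on the challenge page and may differ from this simplified version.

```python
def split_playlist(tracks, n_seed):
    """Keep the first n_seed tracks as seeds and hide the rest."""
    return tracks[:n_seed], tracks[n_seed:]


def r_precision(recommended, held_out):
    """Fraction of held-out tracks found in the top len(held_out) recommendations."""
    relevant = set(held_out)
    if not relevant:
        return 0.0
    top = recommended[:len(relevant)]
    return len(relevant.intersection(top)) / len(relevant)


seeds, hidden = split_playlist(["a", "b", "c", "d"], n_seed=2)
score = r_precision(["c", "x", "d"], hidden)
```

Here two tracks are hidden, so only the first two recommendations count. One of them ("c") is a hidden track, which gives a score of 0.5.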

The final score on the leaderboard was generated by running the command:

python run_challenge.py <challenge_set.json> <graph.npz> <name_index> -alpha 0.96 -N 100000 -n_p 100000 -seed 1 -switch_d_prune