Publication of the code we used in the RecSys Challenge 2018.





  • The code was tested with miniconda3 (Python 3.6.5)
  • All necessary libraries can be installed via requirements.txt:
pip install -r requirements.txt


  • Place the playlist JSON files in the data/original directory and the challenge set in the data/online directory
  • Run the Python scripts in the following order:
    • (Combines and converts the files into CSV with int ids for tracks, artists, and albums)
    • (Converts the challenge set file into CSV while mapping the URIs to our int ids)
    • (Optional, uses the library spotipy to collect additional metadata for all tracks in the dataset)
      • USER, CLIENT_ID, and CLIENT_SECRET have to be adjusted in the file.
    • (Creates a sample of 50k random playlists with a test set of 500 playlists)
    • (Contains the code to individually compute the predictions for all employed methods for the 50k sample)
    • (Combines all the individual solutions in our hybrid approach for the 50k sample)
      • The creation of our "creative" solution is commented out because the crawling is time-consuming and optional.
    • (Converts our solution format to the official submission format)
  • Most of the scripts define FOLDER_TRAIN and FOLDER_TEST, along with other important parameters, at the top of the file
    • To reproduce our final submissions, change those folders to 'data/data_formatted/' and 'data/online/', respectively.
    • As mentioned above, the creative solution is commented out; it can be re-enabled in line 53 of that script (the hybrid step above) once the metadata is fully crawled.
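
The preprocessing steps above (integer ids for URIs, mapping the challenge set onto those ids, and drawing a random sample with a held-out test set) can be sketched roughly as follows. This is only an illustration and not the code from this repository: the function names, the -1 placeholder for unseen URIs, the fixed seed, and the choice to take the test playlists out of the sample are all assumptions.

```python
import random


def build_id_map(uris):
    """Assign consecutive int ids to URIs in first-seen order (hypothetical helper)."""
    id_map = {}
    for uri in uris:
        if uri not in id_map:
            id_map[uri] = len(id_map)
    return id_map


def map_challenge_uris(uris, id_map, unknown=-1):
    """Translate challenge-set URIs with an existing map.

    URIs that never occurred in the training data get `unknown`
    (an assumed convention, not necessarily the one used here).
    """
    return [id_map.get(uri, unknown) for uri in uris]


def sample_playlists(playlist_ids, n_sample, n_test, seed=0):
    """Draw n_sample playlists at random and hold out n_test of them for testing."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = rng.sample(list(playlist_ids), n_sample)
    return sample[n_test:], sample[:n_test]  # (train part, test part)


if __name__ == "__main__":
    track_map = build_id_map(
        ["spotify:track:a", "spotify:track:b", "spotify:track:a"]
    )
    print(track_map)
    print(map_challenge_uris(["spotify:track:b", "spotify:track:z"], track_map))

    # Small numbers for illustration; the README uses 50k playlists and 500 for testing.
    train, test = sample_playlists(range(1000), n_sample=50, n_test=5)
    print(len(train), len(test))
```

Building the map once on the training data and reusing it for the challenge set keeps the ids consistent between both files, which is what the second step in the list relies on.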