Skip to content
Solution for the 2018 Spotify RecSys Challenge
Branch: master
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
data
output
utils
.gitignore
1_mpd_tocsv.py
2_cs_tocsv.py
3_tasks.py
4_0t.py
5_1t.py
6_100f.py
7_rest.py
8_merge.py
9_format_and_fix.py
LICENSE
README.md
requirements.txt
run_all.sh

README.md

recsys-challenge-2018

Solution for the 2018 Spotify RecSys Challenge by the team Definitive Turtles

How to run

  • place the million playlist dataset json files into the data_raw/million_playlist_dataset and the challenge set into the data_raw/challenge_set directories
  • The scripts 5-6-7 contain a variable named threads; set this as desired
  • run the python files 1-6 without parameters
  • run the script 7_rest.py with parameters 0-4 (e.g. for i in $(seq 0 4); do python 7_rest.py $i; done)
  • run python 8_merge.py
  • run python 9_format_and_fix.py output/merged.csv output/submission.csv

Requirements

  • Python 3.5 with standard scientific packages (pandas, numpy, scipy, etc.)
  • 16gb of RAM
  • About 40gb free space

Reference environment

We ran the models using the following python version and packages:

Python 3.5.2 (we used the conda environment)
pandas 0.22.0
numpy 1.14.0
matplotlib 2.0.2
scipy 1.0.0
You can’t perform that action at this time.