Track2Seq

Track2Seq is a Deep Long Short Term Memory network implementation which can be used to generate diverse playlist continuations by predicting one track at a time. The method leverages top-k next-item probabilities to construct a list of recommendations through a semi guided prediction process. In addition the setup shows how title information can be used for playlists when no seed tracks are available.

Dependencies

All dependencies are in requirements.txt and can be installed i.e. through pip install -f requirements.txt

Data

The network was designed to work with the Million Playlist Dataset (MPD), official website hosted at https://recsys-challenge.spotify.com.

Preprocessing

After setting local variables in src/config.json pre-processing and training scripts can be executed. Make sure to download the pre-computed word2vec embeddings for step 6) by Mikolov et al. Run the scripts in following order:

src/a_generate_sequences.py
src/b_generate_levenshtein_seeds.py
src/c_generate_w2v_seeds.py

This will perform the following steps:

Generating statistics for all playlists to stratify on
Binning and stratification of playlists
Splitting of playlists in train, development and test sets
Bucketing of development and test sets to match challenge data
Turning training playlists in int-sequence and filtering less frequent songs (< 5)
Computing seed tracks for 0-seed playlists

The whole process will take a while to compute and requires sufficient memory as well as disk space. Pre-computed files as well as weights will be made available as well.

Model

Afterwards the model can be trained by running src/rnn.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Track2Seq

Dependencies

Data

Preprocessing

Model

Files

README.md

Latest commit

History

README.md

File metadata and controls

Track2Seq

Dependencies

Data

Preprocessing

Model