MelonRec-W2V

Team ThunderCat (Public Leaderboard: 58th)

1. Competition Overview

Goal: Predict the held-out songs and tags of a playlist when half or all of its songs and tags are hidden

Data

Playlist metadata (title, song, tag, number of likes, update time)
Song metadata (song title, album title, genre, issue date)
Mel-spectrogram of the song

Number of participating teams: 786 teams

2. Issue

Challenge: all of the songs and tags in the playlist are unseen

Approach - Multi-Modal Retrieval - Query by Song, Tag, Title

Main Issue

Build a co-embedding space that contains song, tag, and title vectors
Evaluate whether the embedding space learns semantic relationships
Cover all retrieval scenarios
- Song to Song retrieval (Playlist Continuation)
- Tag to Tag retrieval (Playlist Auto-tagging)
- Tag to Song retrieval (Unseen Item retrieval)
- Title to Tag and Song retrieval (Sentence to Item retrieval)

3. Idea

Approach

Build the co-embedding space with Word2Vec
Single-Modal Retrieval
- Vote across the results retrieved for each modality
Multi-Modal Retrieval
- Average the query vectors of each modality
Cluster Based Retrieval
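The voting and mean strategies above can be sketched as follows. This is one possible reading of the two ideas, not the repository's code: the toy vectors, the `song:` / `tag:` key prefixes, and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy co-embedding space; keys and vectors are made up.
EMB = {
    "song:1": [1.0, 0.0],
    "song:2": [0.9, 0.1],
    "song:3": [0.0, 1.0],
    "tag:rock": [1.0, 0.1],
    "tag:calm": [0.1, 1.0],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def nearest(vec, prefix, exclude=(), k=2):
    """Top-k keys with the given prefix, ranked by cosine similarity to vec."""
    cands = [key for key in EMB if key.startswith(prefix) and key not in exclude]
    return sorted(cands, key=lambda key: cosine(vec, EMB[key]), reverse=True)[:k]

def vote_retrieval(query_keys, prefix, k=2):
    """Each query item votes for its own top-k neighbours; rank by vote count."""
    votes = Counter()
    for q in query_keys:
        votes.update(nearest(EMB[q], prefix, exclude=query_keys, k=k))
    return [key for key, _ in votes.most_common(k)]

def mean_retrieval(query_keys, prefix, k=2):
    """Average all query vectors into one query, then run a single search."""
    dim = len(EMB[query_keys[0]])
    mean = [sum(EMB[q][d] for q in query_keys) / len(query_keys) for d in range(dim)]
    return nearest(mean, prefix, exclude=query_keys, k=k)

print(vote_retrieval(["song:1", "song:2"], "song:"))         # song-to-song
print(mean_retrieval(["tag:rock", "song:1"], "song:", k=1))  # tag + song to song
```

Because songs, tags, and title tokens live in the same space, the same two functions cover every scenario by changing only the query keys and the target prefix.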

4. Approach

Train Co-embedding Space (Word2Vec Embedding)
- Input: Sentence (Title token, Tag, Genre, Song, Playlist id)
- Output: Word, Item, Song, Playlist Vector
Multi-Modality Retrieval (Inference)
Evaluation with nDCG
- tag-wise, song-wise
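For reference, a minimal sketch of nDCG with binary relevance, computed separately for songs and tags. It assumes that an item counts as relevant only if it appears in the held-out set; the function name and the toy inputs are illustrative and are not taken from the repository.

```python
import math

def ndcg(ground_truth, recommended):
    """Binary-relevance nDCG of a ranked list against a set of held-out items."""
    relevant = set(ground_truth)
    if not relevant or not recommended:
        return 0.0
    # DCG: a hit at 0-based rank i contributes 1 / log2(i + 2).
    dcg = sum(1.0 / math.log2(i + 2)
              for i, item in enumerate(recommended) if item in relevant)
    # Ideal DCG: every relevant item ranked first, capped by the list length.
    ideal_hits = min(len(relevant), len(recommended))
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg

# Song-wise and tag-wise scores are computed independently.
song_ndcg = ndcg([10, 20, 30], [10, 99, 20, 98])
tag_ndcg = ndcg(["rock"], ["jazz", "rock"])
print(round(song_ndcg, 4), round(tag_ndcg, 4))
```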

Dependencies

numpy 1.16.2
pandas 0.24.2
matplotlib 3.0.3
tqdm 4.31.1
gensim 3.8.3
sentencepiece 0.1.91
sklearn 0.20.3
khaiii
pytorch 1.5.1

4. Folder and Files

data download (link)
- train.json, val.json, test.json, genre_gn_all.json, song_meta.json

Learning code (from scratch)

Best model hyperparameters: window 100, vector size 300, min count 10, 50 iterations, Skip-gram

$ python train_embedding.py
$ python embedding_most_similar.py
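A minimal sketch of how the training input and the hyperparameters listed above might be assembled. The JSON field names (`plylst_title`, `tags`, `songs`, `id`), the whitespace tokenizer, the `song:` / `playlist:` prefixes, and both function names are assumptions for illustration; the actual logic lives in train_embedding.py and may differ.

```python
def playlist_to_sentence(playlist, song_genres):
    """Flatten one playlist into a token list usable as a Word2Vec 'sentence'."""
    # Whitespace split is a stand-in for a real tokenizer (assumption).
    title_tokens = playlist.get("plylst_title", "").split()
    tags = list(playlist.get("tags", []))
    songs = playlist.get("songs", [])
    genres = [g for s in songs for g in song_genres.get(s, [])]
    # Prefixes keep song ids and playlist ids from colliding (assumption).
    song_tokens = [f"song:{s}" for s in songs]
    return title_tokens + tags + genres + song_tokens + [f"playlist:{playlist['id']}"]

def w2v_kwargs(gensim_major=3):
    """Best-model settings as keyword arguments for gensim's Word2Vec.

    gensim 3.x names the arguments `size` and `iter`;
    gensim 4.x renamed them to `vector_size` and `epochs`.
    """
    size_key, epochs_key = ("size", "iter") if gensim_major < 4 else ("vector_size", "epochs")
    return {"window": 100, size_key: 300, "min_count": 10, epochs_key: 50, "sg": 1}

sentence = playlist_to_sentence(
    {"id": 7, "plylst_title": "rainy day jazz", "tags": ["calm"], "songs": [1, 2]},
    {1: ["GN1700"], 2: ["GN1700"]},
)
print(sentence)
print(w2v_kwargs(3))
```

With gensim installed, the result could then be passed along as `Word2Vec(sentences, **w2v_kwargs(3))`, where `sentences` is a list of such token lists. The large window of 100 makes most tokens of a playlist fall within each other's context.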

5. To-do

Measure the similarity between the mean of a playlist's song vectors and the playlist's own vector
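One way this measurement could be set up, assuming cosine similarity is the metric of choice (the to-do item does not specify one). The function names and the toy vectors are hypothetical.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of equally sized vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def playlist_consistency(playlist_vec, song_vecs):
    """Cosine similarity between a playlist vector and the mean of its song vectors."""
    return cosine(playlist_vec, mean_vector(song_vecs))

print(playlist_consistency([1.0, 0.0], [[2.0, 0.0], [4.0, 0.0]]))
```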

Mid-Evaluation of Embedding Space: KL Divergence
Training Method: Self-Supervised Approach

6. Reference

Musical Word Embedding: Bridging the Gap between Listening Contexts and Music
- Seungheon Doh, Jongpil Lee, Tae Hong Park, and Juhan Nam. Machine Learning for Media Discovery Workshop, International Conference on Machine Learning (ICML), 2020
Automatic music playlist continuation via neighbor-based collaborative filtering and discriminative reweighting/reranking.
- Zhu, L., He, B., Ji, M., Ju, C., & Chen, Y. (2018). In Proceedings of the ACM Recommender Systems Challenge 2018 (pp. 1-6).
