# **Research for Automatic Playlist Generation using ML Algorithms**
#### A. Podcast-like Playlists:

    - Categorical Tags (Genre, Era/Year, Label, Producers)
    - Qualitative Tags (Danceability, BPM, Key, Vocal/Instrumental)

#### B. Mixtape-like Playlists:
    
    - Audio Features
    - Feature Similarity Measures 
    
        - Harmony
        - Rhythm
        - Sound
        - Instrumentation
        - Mood/Sentiment
        - Dynamics
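
One simple way to turn the feature groups above into a similarity measure is cosine similarity between per-track feature vectors. The sketch below assumes each track has already been reduced to a fixed-length numeric vector with one value per feature group; the track names and numbers are made up for illustration and do not come from any dataset.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical, already-normalised features per track:
# [harmony, rhythm, sound, instrumentation, mood, dynamics]
features = {
    "track_a": [0.8, 0.6, 0.3, 0.5, 0.7, 0.4],
    "track_b": [0.7, 0.7, 0.2, 0.5, 0.6, 0.5],
    "track_c": [0.1, 0.2, 0.9, 0.1, 0.2, 0.9],
}


def most_similar(seed, features):
    """Return all other tracks, sorted from most to least similar to `seed`."""
    scores = {
        name: cosine_similarity(features[seed], vector)
        for name, vector in features.items()
        if name != seed
    }
    return sorted(scores, key=scores.get, reverse=True)


ranking = most_similar("track_a", features)
print(ranking)
```

In practice the vectors would come from an audio analysis package, and each feature group would probably need its own scaling or weighting before the values are comparable.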

#### C. DJ-Mixes:

    - Start/Intro & End/Outro Features of Songs
    - Beat Matching Features
    - Song to Song Transition Features
    - Storytelling Features over the whole Song Sequence
    - Coherence Measures for Transitions & Sequences
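
As a rough illustration of a beat matching feature and a sequence-level coherence measure, the sketch below scores each transition by the relative tempo change needed to align two tracks, also allowing half-time and double-time matches. The 8 % tolerance and the BPM values are assumptions chosen for the example, not values taken from any dataset or paper.

```python
def tempo_shift_percent(bpm_from, bpm_to):
    """Smallest relative tempo change (in %) that aligns bpm_from with bpm_to.

    Half-time and double-time versions of the target tempo count as matches.
    """
    if bpm_from <= 0 or bpm_to <= 0:
        raise ValueError("BPM values must be positive")
    candidates = (bpm_to, bpm_to * 2, bpm_to / 2)
    return min(abs(c - bpm_from) / bpm_from * 100 for c in candidates)


def transition_shifts(bpms):
    """Tempo shift for every consecutive pair in an ordered list of BPMs."""
    return [tempo_shift_percent(a, b) for a, b in zip(bpms, bpms[1:])]


def sequence_coherence(bpms, max_shift=8.0):
    """Fraction of transitions whose tempo shift stays within max_shift %.

    max_shift is a hypothetical tolerance, not an established threshold.
    """
    shifts = transition_shifts(bpms)
    if not shifts:
        return 1.0  # zero or one track: nothing to mismatch
    return sum(s <= max_shift for s in shifts) / len(shifts)


# Hypothetical BPM sequence of a five-track mix
mix_bpms = [124, 126, 128, 64, 140]
print(transition_shifts(mix_bpms))
print(sequence_coherence(mix_bpms))
```

A fuller transition feature would also cover key compatibility and the intro/outro structure listed above; tempo is used here only because it is the easiest one to compute.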

## Overview: State-of-the-Art Approaches by Model Type
#### Recommender System Techniques:

**What are the Recommendations / the generated Playlists based on?**

    - emotion / mood
    - input song
    - playlist coherence
    - genre
    - user taste
    - popularity

**2 opposing principles:**

    - Collaborative filtering: learns from user-item interactions, e.g. matrix factorization fitted with alternating least squares (ALS)
    - Content-based approaches: the input is music information extracted by analysing the audio data itself (music information retrieval, MIR)
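
To make the collaborative side concrete, here is a minimal sketch of neighbourhood-based collaborative filtering using playlist co-occurrence. It is a simpler relative of matrix factorization and is not ALS: songs count as similar when they appear in the same playlists, and no audio content is used at all. The playlists and song names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations

# Hypothetical playlists; each one plays the role of a "user" in classic CF
playlists = [
    ["song_a", "song_b", "song_c"],
    ["song_a", "song_b"],
    ["song_b", "song_c", "song_d"],
    ["song_d", "song_e"],
]


def item_similarities(playlists):
    """Cosine similarity between songs based on shared playlist membership."""
    count = defaultdict(int)     # number of playlists containing each song
    together = defaultdict(int)  # number of playlists containing both songs
    for playlist in playlists:
        songs = sorted(set(playlist))
        for song in songs:
            count[song] += 1
        for a, b in combinations(songs, 2):
            together[(a, b)] += 1

    sims = defaultdict(dict)
    for (a, b), n in together.items():
        score = n / math.sqrt(count[a] * count[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims


def recommend(song, sims, k=3):
    """Top-k co-occurring songs; ties are broken alphabetically."""
    neighbours = sims.get(song, {})
    return sorted(neighbours, key=lambda s: (-neighbours[s], s))[:k]


sims = item_similarities(playlists)
recs = recommend("song_a", sims)
print(recs)
```

A content-based recommender would replace the co-occurrence counts with distances between audio features, which is what lets it handle songs that do not appear in any playlist yet.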


**More recent approaches:**

    - Sequence-aware music recommendation:
        - Next track recommendations
        - Automatic playlist continuation (APC)
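
A very small baseline for next track recommendation and playlist continuation is a first-order Markov model: count how often one track follows another in existing tracklists, then extend a seed with the most frequent unused successor. The sketch below assumes tracklists are plain lists of track ids; the ids and helper names are hypothetical, and published APC systems are considerably more elaborate.

```python
from collections import Counter, defaultdict

# Hypothetical ordered tracklists, e.g. taken from DJ mixes
tracklists = [
    ["t1", "t2", "t3", "t4"],
    ["t2", "t3", "t5"],
    ["t1", "t2", "t4"],
]


def fit_transitions(tracklists):
    """Count how often each track is directly followed by another track."""
    transitions = defaultdict(Counter)
    for tracklist in tracklists:
        for current, following in zip(tracklist, tracklist[1:]):
            transitions[current][following] += 1
    return transitions


def next_track(current, transitions, exclude=()):
    """Most frequent successor of `current` that is not in `exclude`."""
    candidates = [
        (count, track)
        for track, count in transitions.get(current, {}).items()
        if track not in exclude
    ]
    if not candidates:
        return None
    # Highest count first; ties are broken by track id for determinism
    return min(candidates, key=lambda c: (-c[0], c[1]))[1]


def continue_playlist(seed, transitions, length):
    """Extend `seed` up to `length` tracks without repeating a track."""
    if not seed:
        raise ValueError("seed must contain at least one track")
    playlist = list(seed)
    while len(playlist) < length:
        following = next_track(playlist[-1], transitions, exclude=playlist)
        if following is None:
            break  # no known successor left
        playlist.append(following)
    return playlist


transitions = fit_transitions(tracklists)
print(continue_playlist(["t1"], transitions, 4))
```

The model only knows transitions it has seen, so it stops early for tracks that never appear as a predecessor; falling back to content-based similarity would be one way around that.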

## Possible Datasets & Feature Selection
[**Melon Music Dataset**](https://github.com/MTG/melon-music-dataset)

[last.fm Dataset](https://zenodo.org/record/6090214)

[MTG Barcelona Datasets & Software](https://www.upf.edu/web/mtg/software-datasets)

[Kaggle: Spotify Tracks Dataset](https://www.kaggle.com/datasets/maharshipandya/-spotify-tracks-dataset?datasetId=2570056&sortBy=voteCount)

[Kaggle: Spotify Playlists Dataset](https://www.kaggle.com/datasets/andrewmvd/spotify-playlists?datasetId=1720572&sortBy=voteCount)
## Possible Models and Approaches with respect to 'business' goal
#### YOUTUBE VIDEOS:

[Spotify Playlist Generation](https://www.youtube.com/watch?v=3vvvjdmBoyc&list=PL-wATfeyAMNrTEgZyfF66TwejA0MRX7x1&index=2)




#### PYTHON AUDIO ANALYSIS PACKAGES:
[**Essentia (ML Application ready)**](https://essentia.upf.edu/)

[Essentia citing papers](https://essentia.upf.edu/research_papers.html)

[Librosa (lightweight analysis)](https://librosa.org/doc/main/feature.html)

## 1. Project Goal

    Automatically generate playlists from input songs (your library of liked songs or local music audio files) using a content-based approach that analyses existing DJ mixes and their tracklists.

**Input:** *Array of songs, for example: all likes on Spotify, your music library*

**Output:** *Ordered Sequence(s) of Songs / Playlists*
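
To illustrate the input/output contract, the sketch below takes an unordered set of songs with feature vectors and returns one ordered sequence by always appending the nearest unused song. This greedy nearest-neighbour ordering is only one possible baseline, not the approach this project settles on; the song names and the two-dimensional features are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def order_greedy(features, start):
    """Order songs by repeatedly appending the nearest unused song.

    `features` maps song name -> feature vector; `start` is the first song.
    """
    if start not in features:
        raise KeyError(start)
    remaining = set(features) - {start}
    sequence = [start]
    while remaining:
        last = features[sequence[-1]]
        # Ties are broken by name so the result is deterministic
        nearest = min(
            remaining,
            key=lambda name: (euclidean(last, features[name]), name),
        )
        sequence.append(nearest)
        remaining.remove(nearest)
    return sequence


# Hypothetical library: [normalised tempo, energy]
library = {
    "warmup": [0.2, 0.1],
    "groove": [0.4, 0.4],
    "peak": [0.9, 0.9],
    "build": [0.6, 0.7],
}

playlist = order_greedy(library, "warmup")
print(playlist)
```

Greedy ordering optimises each transition locally and ignores the arc of the whole sequence, which is where transition patterns learned from real DJ mixes could add information.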

## 2. Approach Brainstorming

## DJMix Python Package and Dataset

[YouTube to Spotify mapping](https://www.geeksforgeeks.org/automate-youtube-music-to-spotify-with-python/)

#### *Source:*

[The DJ Mix Dataset](https://github.com/mir-aidj/djmix-dataset)

[A Computational Analysis of Real-World DJ Mixes using Mix-To-Track Subsequence Alignment](https://github.com/mir-aidj/djmix-analysis)

In [None]:
import djmix as dj
import pandas as pd

In [None]:
mixes = pd.DataFrame(dj.mixes)

In [None]:
mixes_columns = [
    'mix_id',
    'mix_title',
    'url',
    'audio_source',
    'audio_url',
    'identified_tracks',
    'tracks',
    'transitions',
    'timestamps',
    'tracklist',
    'tags'
]

In [None]:
tracks_columns = [
    'track_id',
    'track_title'
]

In [None]:
mixes.columns = mixes_columns

In [None]:
mixes.loc[0]

In [None]:
mixes.info()

In [None]:
# Each cell appears to hold a (field name, value) pair; keep only the value.
mixes = mixes.applymap(lambda x: x[1])

In [None]:
mixes[['mix_id', 'identified_tracks', 'tracks', 'transitions', 'timestamps', 'tracklist', 'tags' ]]

In [None]:
mixes.loc[4958]

In [None]:
dj.mixes[4958]

In [None]:
bollek = dj.mixes[4958]

In [None]:
bollek.tracklist

In [None]:
df_bollek = pd.DataFrame(bollek.tracklist)

In [None]:
df_bollek.columns = tracks_columns

In [None]:
df_bollek = df_bollek.applymap(lambda x: x[1])

In [None]:
df_bollek

In [None]:
bollek.tags

In [None]:
tracks = pd.DataFrame(dj.tracks.values())

In [None]:
tracks.columns = tracks_columns

In [None]:
tracks = tracks.applymap(lambda x: x[1])

In [None]:
tracks

In [None]:
import json

with open('./djmix-dataset.json', 'r') as f:
    mixdatabase = json.load(f)

In [None]:
json_mixes_db = pd.DataFrame(mixdatabase)

In [None]:
json_mixes_db.columns = mixes_columns

In [None]:
json_mixes_db[['mix_id', 'identified_tracks', 'tracks', 'transitions', 'timestamps', 'tracklist', 'tags' ]]

In [None]:
# Collect the tracklist of every mix (one list of track objects per mix)
track_list_all = [mix.tracklist for mix in dj.mixes]

In [None]:
track_list_all[0][0].id

In [None]:
mixes_tracks = pd.DataFrame(track_list_all)

In [None]:
# Preview: replace every track object with its id; padding cells stay None
mixes_tracks.applymap(lambda x: x.id if x is not None else None)

In [None]:
# Note: 'mix_id' here is the positional row index, not the dataset's own mix id
mixes_tracks = mixes_tracks.applymap(lambda x: x.id if x is not None else None).reset_index(names='mix_id')

In [None]:
mixes_tracks