# Automatic Playlist Generation:
# A Content-Based Music Sequence Recommender System

## 1. Concept

#### A. Podcast-like Playlists:
- Categorical Tags (Genre, Era/Year, Label, Producers)
- Qualitative Tags (Danceability, BPM, Key, Vocal/Instrumental)

#### B. Mixtape-like Playlists:  
- Audio Features
- Feature Similarity Measures   
    - Harmony
    - Rhythm
    - Sound
    - Instrumentation
    - Mood/Sentiment
    - Dynamic

#### C. DJ-Mixes:
- Start/Intro & End/Outro Features of Songs
- Beat Matching Features
- Song to Song Transition Features
- Story Telling Features over whole Song Sequence    
- Coherence Measures for Transitions & Sequences
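
One way to make the transition and coherence ideas above concrete is a small scoring function. The sketch below is only an illustration under stated assumptions: it assumes each track comes with a BPM value and a key in Camelot notation (e.g. `8A`), and the function names, the 8 % tempo tolerance, and the equal weighting are hypothetical choices, not part of the project.

```python
def camelot_compatible(key_a: str, key_b: str) -> bool:
    """Return True if two Camelot codes (e.g. '8A') are the same or adjacent on the wheel."""
    num_a, letter_a = int(key_a[:-1]), key_a[-1]
    num_b, letter_b = int(key_b[:-1]), key_b[-1]
    if num_a == num_b:
        return True  # same key, or relative major/minor (A <-> B)
    # Neighbours on the wheel share the letter and differ by one step (12 wraps to 1).
    return letter_a == letter_b and (num_a - num_b) % 12 in (1, 11)


def transition_score(bpm_a, key_a, bpm_b, key_b, max_bpm_shift=0.08):
    """Score a transition in [0, 1]: half tempo closeness, half key compatibility.

    max_bpm_shift is an assumed tolerance (relative tempo change) beyond which
    the tempo part of the score drops to zero.
    """
    shift = abs(bpm_b - bpm_a) / bpm_a
    tempo = max(0.0, 1.0 - shift / max_bpm_shift)
    harmony = 1.0 if camelot_compatible(key_a, key_b) else 0.0
    return 0.5 * tempo + 0.5 * harmony


def sequence_coherence(tracks):
    """Average transition score over consecutive (bpm, key) pairs of a sequence."""
    pairs = list(zip(tracks, tracks[1:]))
    if not pairs:
        return 1.0  # a single track has no transitions to penalise
    return sum(transition_score(*a, *b) for a, b in pairs) / len(pairs)


mix = [(124, '8A'), (125, '9A'), (126, '9B')]
print(sequence_coherence(mix))
```

A real system would likely replace the hand-set weights with something learned from existing mixes; the point here is only that "coherence" can be expressed as an aggregate over pairwise transition scores.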

## 2. Recommender Systems Overview: State of the Art Approaches

#### A. Two General Approaches:

- **Collaborative filtering:** Matrix Factorization, alternating least squares
- **Content-based approaches:** Input is music information (from songs and/or 
    existing playlists) extracted through Music Information Retrieval (MIR) processes
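
To illustrate the content-based side, the following minimal sketch ranks tracks by cosine similarity of their feature vectors. The feature matrix is made up for the example, and `cosine_similarity_matrix` and `most_similar` are hypothetical helper names; in practice the rows would come from MIR features.

```python
import numpy as np


def cosine_similarity_matrix(features: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of a (tracks x features) matrix."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)  # guard against all-zero rows
    return unit @ unit.T


def most_similar(features: np.ndarray, seed_index: int, top_n: int = 3) -> list:
    """Indices of the top_n tracks most similar to the seed, excluding the seed itself."""
    sims = cosine_similarity_matrix(features)[seed_index].copy()
    sims[seed_index] = -np.inf
    return np.argsort(sims)[::-1][:top_n].tolist()


# Toy feature vectors (illustrative values only).
features = np.array([
    [1.0, 0.0, 0.2],  # track 0
    [0.9, 0.1, 0.3],  # track 1: close to track 0
    [0.0, 1.0, 0.0],  # track 2
    [0.1, 0.9, 0.1],  # track 3: close to track 2
])
print(most_similar(features, seed_index=0, top_n=2))  # → [1, 3]
```

Unlike collaborative filtering, nothing here depends on user interaction data, which is why this style of approach can still rank tracks that nobody has listened to yet.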


#### B. What are the Recommendations / the generated Playlists based on?

- emotion / mood
- genre
- user taste
- user similarity
- popularity


#### C. More recent Approaches / Deep Learning Approaches

- Sequence-aware music recommendation:
    - Next track recommendations
    - Automatic playlist continuation (APC)
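
As a rough illustration of next-track recommendation and playlist continuation, the sketch below picks the unused track closest to the average of the last few tracks in the playlist. This is a simple nearest-neighbour heuristic, not one of the deep learning approaches referred to above; the function names, the `context` parameter, and the toy features are assumptions made for the example.

```python
import numpy as np


def recommend_next(features: np.ndarray, playlist: list, context: int = 2) -> int:
    """Return the unused track most similar (cosine) to the mean of the last `context` tracks."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    query = unit[playlist[-context:]].mean(axis=0)
    scores = unit @ query
    scores[playlist] = -np.inf  # never repeat a track already in the playlist
    return int(np.argmax(scores))


def continue_playlist(features: np.ndarray, seed: list, length: int, context: int = 2) -> list:
    """Greedily extend a seed playlist until it reaches `length` or tracks run out."""
    playlist = list(seed)
    while len(playlist) < min(length, len(features)):
        playlist.append(recommend_next(features, playlist, context))
    return playlist


# Toy 2-D features; tracks are deliberately listed out of order.
features = np.array([
    [1.0, 0.0],  # track 0
    [0.0, 1.0],  # track 1
    [0.6, 0.6],  # track 2
    [0.9, 0.3],  # track 3
    [0.3, 0.9],  # track 4
])
print(continue_playlist(features, seed=[0], length=5))  # → [0, 3, 2, 4, 1]
```

Because each step only looks at the recent context, the resulting sequence drifts gradually through feature space instead of jumping between distant tracks.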

## 3. Possible Datasets, Models & Feature Selection

#### A. Datasets:

[**Melon Music Dataset**](https://github.com/MTG/melon-music-dataset)  
[last.fm Dataset](https://zenodo.org/record/6090214)  
[MTG Barcelona Datasets & Software](https://www.upf.edu/web/mtg/software-datasets)  
[Kaggle: Spotify Tracks Dataset](https://www.kaggle.com/datasets/maharshipandya/-spotify-tracks-dataset?datasetId=2570056&sortBy=voteCount)  
[Kaggle: Spotify Playlists Dataset](https://www.kaggle.com/datasets/andrewmvd/spotify-playlists?datasetId=1720572&sortBy=voteCount)

#### B. Python Audio Analysis (MIR) Packages: 

[**Essentia (ML Application ready)**](https://essentia.upf.edu/)  
[Essentia citing papers](https://essentia.upf.edu/research_papers.html)  
[**Librosa (lightweight analysis)**](https://librosa.org/doc/main/feature.html)


#### C. Youtube Tutorials:

[Spotify Playlist Generation](https://www.youtube.com/watch?v=3vvvjdmBoyc&list=PL-wATfeyAMNrTEgZyfF66TwejA0MRX7x1&index=2)  
[Librosa Music Analysis](https://www.youtube.com/watch?v=MhOdbtPhbLU)

## 4. Content-Based Recommendation

**Reasoning:** *Cold-start problem for metadata-based recommendation systems using only user-generated metadata*  
**Solution:** *Find underlying features of audio/music by MIR*  
**High-level Features:** *genre, mood, instrument(s), vocals, gender of singing voice, lyrics, ...*  
**Low-level Features:** *MFCC, ZCR, Spectral Coefficients, mixability*
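
To show what two of the low-level features measure, here is a NumPy-only sketch of the zero-crossing rate (ZCR) and the spectral centroid, computed on synthetic sine tones rather than real audio. It is meant to illustrate the definitions; in the notebook itself, `librosa.feature.zero_crossing_rate` and `librosa.feature.spectral_centroid` would be the natural choice. The helper names and test tones are assumptions for this example.

```python
import numpy as np


def zero_crossing_rate(frame: np.ndarray) -> float:
    """Fraction of consecutive sample pairs whose sign differs."""
    signs = np.signbit(frame)
    return float(np.mean(signs[1:] != signs[:-1]))


def spectral_centroid(frame: np.ndarray, sr: int) -> float:
    """Magnitude-weighted mean frequency (Hz) of a Hann-windowed frame."""
    magnitude = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    return float(np.sum(freqs * magnitude) / np.sum(magnitude))


sr = 22050                            # librosa.load's default sample rate
t = np.arange(2048) / sr              # one analysis frame of 2048 samples
low = np.sin(2 * np.pi * 220 * t)     # 220 Hz tone
high = np.sin(2 * np.pi * 3520 * t)   # 3520 Hz tone

for name, tone in [('low', low), ('high', high)]:
    print(name, zero_crossing_rate(tone), spectral_centroid(tone, sr))
```

Both features rise with the frequency content of the signal, which is why they are often used as cheap proxies for "brightness" or noisiness.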



## 5. Strategy

1. Obtain mix data for the model from MixesDB.
2. Get the songs for each mix:
    1. Get songs from MixesDB.
    2. Get 30-second song excerpts from the Spotify API.
3. Analyze song and sequence data for the content-based recommendation system.
4. Produce playlists.
5. Compare to a baseline model.
6. (Optional) Produce a DJ mix with transitions.

- Map the songs of each mix to Spotify tracks.
- Extract the item-feature matrix for the mixes.
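
The item-feature matrix for one mix could be assembled roughly as follows. This is a sketch under assumptions: tracks that could not be matched are represented as `None` in the tracklist, per-track features are already available in a dictionary, and the feature columns (BPM, energy) and the function name are hypothetical.

```python
import numpy as np


def build_item_feature_matrix(track_ids, feature_lookup):
    """Stack feature vectors of matched tracks and z-score each column.

    Returns the kept IDs, their original positions in the mix, and the matrix.
    Unmatched entries (None or missing from the lookup) are skipped, so the
    positions list records where the gaps in the sequence are.
    """
    rows, positions, kept_ids = [], [], []
    for position, track_id in enumerate(track_ids):
        if track_id is None or track_id not in feature_lookup:
            continue
        rows.append(feature_lookup[track_id])
        positions.append(position)
        kept_ids.append(track_id)
    if not rows:
        return kept_ids, positions, np.empty((0, 0))
    matrix = np.asarray(rows, dtype=float)
    std = matrix.std(axis=0)
    std[std == 0] = 1.0  # constant columns stay at zero instead of dividing by zero
    return kept_ids, positions, (matrix - matrix.mean(axis=0)) / std


# Hypothetical features per track: [bpm, energy]
feature_lookup = {'a': [120, 0.5], 'b': [124, 0.7], 'c': [128, 0.9]}
ids, positions, matrix = build_item_feature_matrix(['a', None, 'b', 'c'], feature_lookup)
print(ids, positions, matrix.shape)
```

Standardising the columns keeps features with large numeric ranges, such as BPM, from dominating distance or similarity computations later on.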

## CODE

In [None]:
import numpy as np
import pandas as pd

import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline

import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
import time
import math


import librosa
import librosa.display

from IPython.display import Audio
import ipywidgets as widgets

#### Librosa: MIR Library

In [None]:
file = '../audio/Marie Davidson - Work It.mp3'

In [None]:
song_file = file.split('/')[-1]

In [None]:
song_name = song_file.rsplit('.', 1)[0]  # split on the last dot only, so dots in the title survive

In [None]:
song_ext = song_file.rsplit('.', 1)[1]

In [None]:
y, sr = librosa.load(file)

#### The Mel-Spectrogram

In [None]:
S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)

In [None]:
log_S = librosa.power_to_db(S, ref=np.max)

In [None]:
plt.figure(figsize=(16,4))
librosa.display.specshow(log_S, sr=sr, x_axis='time', y_axis='mel')
plt.title('Mel power spectrogram for: {}'.format(song_name))
plt.colorbar(format='%+02.0f dB')
plt.show()

#### The Chromagram

In [None]:
y_harmonic, y_percussive = librosa.effects.hpss(y)

In [None]:
C = librosa.feature.chroma_cqt(y=y_harmonic, sr=sr)

In [None]:
plt.figure(figsize=(16,4))
librosa.display.specshow(C, sr=sr, x_axis='time', y_axis='chroma', vmin=0, vmax=1)
plt.title('Chromagram for: {}'.format(song_name))
plt.colorbar()
plt.show()

#### Spotipy: Spotify Web API

In [None]:
import os

# Read the credentials from environment variables rather than hardcoding secrets in the notebook.
cid = os.environ['SPOTIPY_CLIENT_ID']
secret = os.environ['SPOTIPY_CLIENT_SECRET']
username = '1182143609'
client_credentials_manager = SpotifyClientCredentials(client_id=cid, client_secret=secret)
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)

#### MixesDB Extractions

In [None]:
tracklist = [
    'uNYS-FibMn8',
    '_5E3ggezzww',
    None,
    '7ythk60VILM',
    'u005Fx-YBSI',
    None,
    'bxPiEolWDZY',
    '8WG0rjNApFk',
    'zhhHG2AWu80',
    'y-N9iNr3wcU',
    'Z-I8HLaIOIY',
    'LKKgi6rDWJA',
    '0Z_319O2GhI',
    'f8cHxydDb7o',
    'zXSzJB_0yjk',
    'HTJxoLy25Xo',
    'YFLt1lhbtKM',
    None,
    's3dNTgA1eyo',
    'ffePfl-Ew_c',
    'H9vP9GPUkGI'
]

In [None]:
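import os

path = '../data/tracks'
files = os.listdir(path)

# Append the .mp3 extension to downloaded files that lack it. Skipping files
# that already end in .mp3 keeps this cell safe to re-run.
for file_name in files:
    if not file_name.endswith('.mp3'):
        os.rename(os.path.join(path, file_name), os.path.join(path, file_name + '.mp3'))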

In [None]:
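tracks_mp3 = []

# Skip the None placeholders in the tracklist; `track_id` avoids shadowing the built-in `id`.
for track_id in tracklist:
    if track_id:
        tracks_mp3.append(f'../data/tracks/{track_id}.mp3')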

In [None]:
tracks_mp3

In [None]:
tracks_mp3[0]

In [None]:
def mel_spec_tracks(file):

    y, sr = librosa.load(file)

    spec = librosa.feature.melspectrogram(y=y, sr=sr)

    return spec

In [None]:
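def plot_mel_spec(S, sr=22050):
    """Plot a mel power spectrogram in dB.

    sr is passed explicitly instead of being read from a global variable;
    22050 Hz is the default sample rate of librosa.load.
    """
    fig, ax = plt.subplots()

    S_dB = librosa.power_to_db(S, ref=np.max)

    img = librosa.display.specshow(
        S_dB,
        x_axis='time',
        y_axis='mel',
        sr=sr,
        fmax=8000,
        ax=ax
        )

    fig.colorbar(img, ax=ax, format='%+2.0f dB')

    ax.set(title='Mel-frequency spectrogram')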


In [None]:
# specs = []
# for file in tracks_mp3:
#     spec = mel_spec_tracks(file)
#     plot_mel_spec(spec)

In [None]:
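def chromogram_plot(file):
    """Plot the chromagram of the harmonic component of an audio file."""
    y, sr = librosa.load(file)
    # Separate harmonic from percussive content so the chroma reflects pitch only.
    y_harmonic, y_percussive = librosa.effects.hpss(y)
    C = librosa.feature.chroma_cqt(y=y_harmonic, sr=sr)
    # Derive the title from this file rather than the global song_name.
    track_name = file.split('/')[-1].rsplit('.', 1)[0]
    plt.figure(figsize=(12,4))
    librosa.display.specshow(C, sr=sr, x_axis='time', y_axis='chroma', vmin=0, vmax=1)
    plt.title('Chromagram for: {}'.format(track_name))
    plt.colorbar()
    plt.show()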

In [None]:
# for file in tracks_mp3:
#     chromogram_plot(file)