# Daltonify: An Audio Feature Based Recommender System

## *Additional Examples using the Recommender System & Conclusions*

#### Table of Contents

* [Example 1: Beast of Burden](#example-1)
* [Example 2: WAP](#example-2)
* [Conclusions & Future Work](#conclusions)

### Import Libraries

In [8]:
## STANDARD IMPORTS
import pandas as pd 
import numpy as np
import re
## VISUALIZATIONS
import matplotlib.pyplot as plt
import seaborn as sns
## MODELING
from sklearn.metrics.pairwise import cosine_similarity
## SPOTIFY
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

In [9]:
### Spotify Credentials - must be set in local environment to run
auth_manager = SpotifyClientCredentials()
sp = spotipy.Spotify(auth_manager=auth_manager)
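
Because `SpotifyClientCredentials()` is called with no arguments, spotipy reads the client ID and secret from the `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` environment variables. A small pre-flight check along the following lines could give a clearer message when they are missing. This is only a sketch: the helper name `missing_spotify_credentials` is hypothetical and not part of spotipy.

```python
import os

# Environment variables that spotipy's SpotifyClientCredentials reads
# when no client_id / client_secret arguments are passed.
REQUIRED_VARS = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET")

def missing_spotify_credentials(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Report anything missing before trying to authenticate.
missing = missing_spotify_credentials()
if missing:
    print("Set these before running the notebook:", ", ".join(missing))
```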

### Recommender System Functions

In [19]:
def add_track_data(df, track):
    '''combines track sample set and given track into single dataframe'''
    ### append the given track to the sample set
    data = pd.concat([df, track], ignore_index=True)
    ### desired features for model (may change later)
    features = ['danceability','energy','valence','instrumentalness','acousticness','speechiness']
    X = data[features]
    return X, data

def pop_track_recommender(df, track):
    '''uses cosine similarity to recommend tracks'''
    
    ### build feature matrix from the sample set plus the given track
    X, data = add_track_data(df, track)
    
    ### calculate similarity matrix
    similarity_matrix = cosine_similarity(X, X)
    
    ### the given track is appended after the sample set, so its row index is len(df)
    ### (a track_id lookup would return several rows if the id is already in df)
    track_index = len(df)
    
    ### select the row of similarity scores for the given track
    similarity_scores = pd.Series(similarity_matrix[track_index])
    similarity_scores.sort_values(ascending=False, inplace=True)

    ### attach a score to every track (assignment aligns on the index)
    rec_tracks_df = data.copy()
    rec_tracks_df['score'] = similarity_scores
    rec_tracks_df.sort_values(by=['score', 'popularity'], ascending=False, inplace=True)

    return rec_tracks_df

def top_recommended_tracks(results, num_tracks):
    '''selects tracks at or above the median similarity score,
    then sorts them by popularity'''
    
    ### KEEP TRACKS AT OR ABOVE THE MEDIAN SIMILARITY SCORE
    top_half = results[results['score'] >= results['score'].median()].copy()
    ### SORT VALUES BY POPULARITY
    top_half.sort_values(by='popularity', ascending=False, inplace=True)
    ### SELECT DESIRED NUMBER OF TRACKS
    top_tracks = top_half[:num_tracks]
    
    return top_tracks

def recommender(df, track, num_tracks):
    '''combines functions above into single function call for simplicity'''
    results = pop_track_recommender(df, track)
    top_tracks = top_recommended_tracks(results, num_tracks)
    ### ADD TRACK TO TOP OF DATAFRAME SO INCLUDED IN PLAYLIST LIST
    playlist = pd.concat([track, top_tracks], ignore_index=True)
    return playlist


### FUNCTIONS FOR CREATING PLAYLIST FILE
def display_playlist(playlist):
    '''displays playlist track name, artist, album'''
    ### copy so that renaming columns does not touch the original playlist
    playlist_df = playlist[['track_name', 'artist', 'album']].copy()
    playlist_df.columns = ['Title', 'Artist', 'Album']
    ### start index at 1
    playlist_df.index = np.arange(1,len(playlist_df)+1)
    return playlist_df

def make_track_URIs(track_ids):
    '''reformats track ids as track URIs'''
    ### need text spotify:track: in front of each ID to use in Spotify
    track_URIs = []
    for track_id in track_ids:
        uri = 'spotify:track:'+ track_id
        track_URIs.append(uri)
    return track_URIs

def create_playlist_file(track_ids):
    '''creates text file of Spotify URIs'''
    track_list = track_ids.values.tolist()
    track_URIs = make_track_URIs(track_list)
    ### write one URI per line; the with block closes the file even on error
    with open('../playlists/playlist.txt', 'w') as playlist:
        playlist.writelines('%s\n' % uri for uri in track_URIs)
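
To make the scoring step concrete, here is a minimal sketch of the cosine similarity that `pop_track_recommender` relies on, written with only the standard library rather than scikit-learn. The feature vectors are made-up values, not real Spotify data, and `cosine` and `rank_by_similarity` are hypothetical helper names.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)

def rank_by_similarity(seed, candidates):
    """Return (name, score) pairs sorted from most to least similar to seed."""
    scored = [(name, cosine(seed, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Made-up vectors in the order:
# danceability, energy, valence, instrumentalness, acousticness, speechiness
seed = [0.8, 0.9, 0.6, 0.0, 0.1, 0.05]
candidates = {
    "similar_track":  [0.7, 0.8, 0.5, 0.0, 0.2, 0.05],
    "acoustic_track": [0.3, 0.2, 0.4, 0.1, 0.9, 0.03],
}

for name, score in rank_by_similarity(seed, candidates):
    print(f"{name}: {score:.3f}")
```

Cosine similarity depends only on the angle between two feature vectors, not on their magnitudes, so a track whose features are all proportionally lower than the seed's still scores as highly similar.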

## Example 1: Beast of Burden <a class="anchor" id="example-1"></a>
<hr/>
I'll generate a playlist using "Beast of Burden" by The Rolling Stones and the genre "Rock".

### Read in Sample Data

The sample data is stored in pre-extracted CSV files so that we don't need to pull a new sample set from Spotify each time we test the recommender system.

In [13]:
### read in data
df = pd.read_csv('../data/rock.csv')
track = pd.read_csv('../data/beastofburden.csv')

drop_cols = ['key', 'mode', 'time_signature', 'duration_ms']
df.drop(columns=drop_cols, inplace=True)
track.drop(columns=drop_cols, inplace=True, errors='ignore')  ### skips any of these columns missing from the track file

In [14]:
### construct and display playlist
playlist = recommender(df, track, 15)
create_playlist_file(playlist['track_id'])
display_playlist(playlist)

|    | Title | Artist | Album |
|---:|:------|:-------|:------|
| 1 | Beast Of Burden | The Rolling Stones | Honk (Deluxe) |
| 2 | Dreams - 2004 Remaster | Fleetwood Mac | Rumours (Super Deluxe) |
| 3 | Believer | Imagine Dragons | Evolve |
| 4 | Are You Bored Yet? (feat. Clairo) | Wallows | Nothing Happens |
| 5 | Take on Me | a-ha | Hunting High and Low |
| 6 | Every Breath You Take | The Police | Synchronicity (Remastered 2003) |
| 7 | Do I Wanna Know? | Arctic Monkeys | AM |
| 8 | High Hopes | Panic! At The Disco | Pray for the Wicked |
| 9 | Pumped Up Kicks | Foster The People | Torches |
| 10 | Stressed Out | Twenty One Pilots | Blurryface |


### Access the Playlist

I have made this playlist public on my Spotify account. You can listen to it [here](https://open.spotify.com/playlist/4Mv2Blxc2IbAdnqpD1Hp9O?si=mZqwxx6HSP64dS1i5Kmxgw).

## Example 2: WAP <a class="anchor" id="example-2"></a>
<hr/>
I'll generate a playlist using "WAP" by Cardi B and the genre "Hip Hop".

### Read in Sample Data

The sample data is stored in pre-extracted CSV files so that we don't need to pull a new sample set from Spotify each time we test the recommender system.

In [20]:
### read in data
df = pd.read_csv('../data/hiphop.csv')
track = pd.read_csv('../data/WAP.csv')

drop_cols = ['key', 'mode', 'time_signature', 'duration_ms']
df.drop(columns=drop_cols, inplace=True)
track.drop(columns=drop_cols, inplace=True, errors='ignore')  ### skips any of these columns missing from the track file

In [21]:
### construct and display playlist
playlist = recommender(df, track, 15)
create_playlist_file(playlist['track_id'])
display_playlist(playlist)

|    | Title | Artist | Album |
|---:|:------|:-------|:------|
| 1 | WAP (feat. Megan Thee Stallion) | Cardi B | WAP (feat. Megan Thee Stallion) |
| 2 | WAP (feat. Megan Thee Stallion) | Cardi B | WAP (feat. Megan Thee Stallion) |
| 3 | Laugh Now Cry Later (feat. Lil Durk) | Drake | Laugh Now Cry Later (feat. Lil Durk) |
| 4 | ROCKSTAR (feat. Roddy Ricch) | DaBaby | BLAME IT ON BABY |
| 5 | POPSTAR (feat. Drake) | DJ Khaled | POPSTAR (feat. Drake) |
| 6 | WHATS POPPIN (feat. DaBaby, Tory Lanez & Lil W... | Jack Harlow | WHATS POPPIN (feat. DaBaby, Tory Lanez & Lil W... |
| 7 | my ex's best friend (with blackbear) | Machine Gun Kelly | Tickets To My Downfall |
| 8 | Agua (with J Balvin) - Music From "Sponge On T... | Tainy | Agua (with J Balvin) [Music From "Sponge On Th... |
| 9 | Runnin | 21 Savage | SAVAGE MODE II |
| 10 | GREECE (feat. Drake) | DJ Khaled | GREECE (feat. Drake) |


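An interesting issue shows up here: WAP is listed twice. The track already appears in the sample set pulled for the genre, so it is returned as a recommendation in addition to being placed at the top of the playlist as the given track. This shouldn't normally happen, but a check should be added later to remove duplicate tracks.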
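
One way such a check might look is sketched below. It assumes duplicates can be identified by a matching `track_name` and `artist` (two releases of the same song may carry different `track_id` values), which is an assumption rather than something verified against the sample data. `drop_duplicate_tracks` is a hypothetical helper and the rows are toy data.

```python
import pandas as pd

def drop_duplicate_tracks(playlist, keys=("track_name", "artist")):
    """Keep the first occurrence of each track, judged by the `keys` columns."""
    deduped = playlist.drop_duplicates(subset=list(keys), keep="first")
    return deduped.reset_index(drop=True)

# Toy playlist: the first two rows are the same song under different ids.
toy = pd.DataFrame({
    "track_id": ["id_seed", "id_other_release", "id_3"],
    "track_name": ["Song A", "Song A", "Song B"],
    "artist": ["Artist X", "Artist X", "Artist Y"],
})

print(drop_duplicate_tracks(toy))
```

Since `recommender` places the given track in the first row, `keep="first"` retains that row and drops the later repeat.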

### Access the Playlist

I have made this playlist public on my Spotify account. You can listen to it [here](https://open.spotify.com/playlist/7HPjDhTdcR53MPtD6kmBSE?si=Tf9xBvnTQSObj_DAjw8S7w).

## Conclusions & Future Work <a class="anchor" id="conclusions"></a>
<hr/>

This project has been successfully implemented via Streamlit in its current form.

[![Streamlit App](https://static.streamlit.io/badges/streamlit_badge_black_white.svg)](https://daltonify-playlist-creator.herokuapp.com/)

The recommender system works end to end, but further testing is needed to measure how effective the generated playlists are.

While I met my basic project goals, there is still a lot of work to be done. I plan to continue this project and work towards a fuller app implementation using React, incorporating Spotify user authentication so that playlists can be added directly to a user's Spotify account.

### Future Work

* Add error messages to certain functions which require user input.
* Find ways to better randomize track selection during search process.
* Add a check to prevent duplicate tracks from appearing in the playlist.
* Full app implementation using React.
* Exploration of other scoring methods and incorporation of more audio features.
* Exploration of using audio analysis directly.
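
As a sketch of the first item, input checks could raise descriptive errors before any similarity is computed. The function name `check_recommender_inputs` and the error messages are hypothetical; the feature list is the one used in `add_track_data`.

```python
FEATURES = ['danceability', 'energy', 'valence',
            'instrumentalness', 'acousticness', 'speechiness']

def check_recommender_inputs(columns, num_tracks):
    """Raise ValueError with a readable message if the inputs look wrong.

    `columns` is any iterable of column names, e.g. `df.columns`.
    """
    present = set(columns)
    missing = [name for name in FEATURES if name not in present]
    if missing:
        raise ValueError("missing audio feature columns: " + ", ".join(missing))
    # bool is a subclass of int, so reject it explicitly
    if isinstance(num_tracks, bool) or not isinstance(num_tracks, int) or num_tracks < 1:
        raise ValueError("num_tracks must be a positive integer")
```

It could be called at the top of `recommender`, for example as `check_recommender_inputs(df.columns, num_tracks)`.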



