**What we did in this notebook:**

Implemented content filtering recommendation system using Cosine distances


*Implementing our model:*

We load data that we saved as csv from our 'final_project_data_cleaning.ipynb'

In order to recommend songs we use Cosine distances of songs' audio features that we get from Spotify API with other songs' features.

The theory is that if songs share similar audio features (a short cosine distance), they are similar songs.

Cosine similarity measures the orientation of two n-dimensional sample vectors irrespective to their magnitude. It is calculated by the dot product of two numeric vectors, and it is normalized by the product of the vector lengths. The output is in the range 0 to 1, with 1 being the highest similarity.

We will be using cosine_similarity function from sklearn.pairwise to implement our model.
We created 2 generators that are used in Spotify:
1. Radio generator (Generates a playlist based on 1 song)
2. Playlist generator (Generates n playlists based on 1 playlist)


In [66]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import json

from sklearn.model_selection import cross_val_score
from sklearn.utils import resample
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.metrics.pairwise import cosine_similarity 
from sklearn.preprocessing import MinMaxScaler

%matplotlib inline

import seaborn as sns
sns.set(style='whitegrid')
pd.set_option('display.width', 1500)
pd.set_option('display.max_columns', 100)

In [112]:
# Reading the dataframe to start working
df = pd.read_csv("data/Sample_from_Million_Playlist.csv")

In [113]:
# Reading the data from saved file
# This is the dataframe we will implement our mode
df_spotify = pd.read_csv("data/100_Sample_MilPlay_Spotify.csv")
df_spotify.head(10)

Unnamed: 0,pid,pos,artist_name,track_uri,artist_uri,track_name,album_uri,duration_ms,album_name,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms.1,time_signature
0,4,0,Alessia Cara,spotify:track:1wYZZtamWTQAoj8B812uKQ,spotify:artist:2wUjUUtkb5lvLKcGKsKqsR,Here,spotify:album:3rDbA12I5duZnlwakqDdZa,199453,Know-It-All,0.376,0.821,0,-3.974,1,0.104,0.0785,0.0,0.0823,0.331,120.462,audio_features,1wYZZtamWTQAoj8B812uKQ,spotify:track:1wYZZtamWTQAoj8B812uKQ,https://api.spotify.com/v1/tracks/1wYZZtamWTQA...,https://api.spotify.com/v1/audio-analysis/1wYZ...,199453,4
1,4,1,The Weeknd,spotify:track:0Gi17qCJh9e9RJxLaYkm9l,spotify:artist:1Xyo4u8uXC1ZmMpatF05PJ,Dark Times,spotify:album:28ZKQMoNBB0etKXZ97G2SN,260640,Beauty Behind The Madness,0.585,0.421,7,-9.593,1,0.0707,0.106,1e-05,0.14,0.24,132.986,audio_features,0Gi17qCJh9e9RJxLaYkm9l,spotify:track:0Gi17qCJh9e9RJxLaYkm9l,https://api.spotify.com/v1/tracks/0Gi17qCJh9e9...,https://api.spotify.com/v1/audio-analysis/0Gi1...,260640,3
2,4,2,J. Cole,spotify:track:6Ius4TC0L3cN74HT7ENE6e,spotify:artist:6l3HvQ5sa6mXTsMTB19rO5,Wet Dreamz,spotify:album:7viNUmZZ8ztn2UB4XB3jIL,239320,2014 Forest Hills Drive,0.504,0.705,6,-8.205,0,0.364,0.0752,0.0,0.128,0.584,175.483,audio_features,6Ius4TC0L3cN74HT7ENE6e,spotify:track:6Ius4TC0L3cN74HT7ENE6e,https://api.spotify.com/v1/tracks/6Ius4TC0L3cN...,https://api.spotify.com/v1/audio-analysis/6Ius...,239320,4
3,4,3,Chance The Rapper,spotify:track:0jx8zY5JQsS4YEQcfkoc5C,spotify:artist:1anyVhU62p31KFi8MEzkbf,Angels (feat. Saba),spotify:album:71QyofYesSsRMwFOTafnhB,206240,Coloring Book,0.771,0.647,5,-5.127,0,0.376,0.294,0.0,0.37,0.678,155.914,audio_features,0jx8zY5JQsS4YEQcfkoc5C,spotify:track:0jx8zY5JQsS4YEQcfkoc5C,https://api.spotify.com/v1/tracks/0jx8zY5JQsS4...,https://api.spotify.com/v1/audio-analysis/0jx8...,206240,4
4,4,4,The Weeknd,spotify:track:7fPHfBCyKE3aVCBjE4DAvl,spotify:artist:1Xyo4u8uXC1ZmMpatF05PJ,In The Night,spotify:album:28ZKQMoNBB0etKXZ97G2SN,235653,Beauty Behind The Madness,0.48,0.682,7,-4.94,1,0.13,0.0696,0.0,0.0463,0.506,167.939,audio_features,7fPHfBCyKE3aVCBjE4DAvl,spotify:track:7fPHfBCyKE3aVCBjE4DAvl,https://api.spotify.com/v1/tracks/7fPHfBCyKE3a...,https://api.spotify.com/v1/audio-analysis/7fPH...,235653,3
5,4,5,Donnie Trumpet & The Social Experiment,spotify:track:6fTdcGsjxlAD9PSkoPaLMX,spotify:artist:0ojcq9LJQWMawQdFDw3M0L,Sunday Candy,spotify:album:3eM1KTKmpqrQOvuvYY42cr,226013,Surf,0.511,0.596,0,-6.56,1,0.224,0.53,0.0,0.0798,0.554,158.063,audio_features,6fTdcGsjxlAD9PSkoPaLMX,spotify:track:6fTdcGsjxlAD9PSkoPaLMX,https://api.spotify.com/v1/tracks/6fTdcGsjxlAD...,https://api.spotify.com/v1/audio-analysis/6fTd...,226014,4
6,4,6,Beyoncé,spotify:track:2CvOqDpQIMw69cCzWqr5yr,spotify:artist:6vWDO969PvNqNYHIOW5v0m,Halo,spotify:album:3ROfBX6lJLnCmaw1NrP5K9,261160,I AM...SASHA FIERCE - Platinum Edition,0.422,0.712,11,-5.907,0,0.1,0.273,0.0,0.051,0.471,78.454,audio_features,2CvOqDpQIMw69cCzWqr5yr,spotify:track:2CvOqDpQIMw69cCzWqr5yr,https://api.spotify.com/v1/tracks/2CvOqDpQIMw6...,https://api.spotify.com/v1/audio-analysis/2CvO...,261160,4
7,4,7,Hozier,spotify:track:1ivHxaGL5ld9VS1zsYc4YN,spotify:artist:2FXC3k01G6Gw61bmprjgqS,Cherry Wine - Live,spotify:album:36k5aXpxffjVGcNce12GLZ,240147,Hozier,0.418,0.111,1,-14.848,1,0.0389,0.953,0.00342,0.0982,0.228,82.508,audio_features,1ivHxaGL5ld9VS1zsYc4YN,spotify:track:1ivHxaGL5ld9VS1zsYc4YN,https://api.spotify.com/v1/tracks/1ivHxaGL5ld9...,https://api.spotify.com/v1/audio-analysis/1ivH...,240147,4
8,4,8,Hozier,spotify:track:1TGimSbipZ3XZ7q3eszBRV,spotify:artist:2FXC3k01G6Gw61bmprjgqS,Angel Of Small Death & The Codeine Scene,spotify:album:36k5aXpxffjVGcNce12GLZ,219214,Hozier,0.377,0.638,4,-5.754,1,0.0545,0.213,8e-05,0.12,0.369,92.644,audio_features,1TGimSbipZ3XZ7q3eszBRV,spotify:track:1TGimSbipZ3XZ7q3eszBRV,https://api.spotify.com/v1/tracks/1TGimSbipZ3X...,https://api.spotify.com/v1/audio-analysis/1TGi...,219214,4
9,4,9,Hozier,spotify:track:2Tjlq3aGhg3dIZFvSfsnyc,spotify:artist:2FXC3k01G6Gw61bmprjgqS,From Eden,spotify:album:36k5aXpxffjVGcNce12GLZ,283466,Hozier,0.395,0.676,0,-5.46,1,0.0498,0.608,4.9e-05,0.117,0.315,142.929,audio_features,2Tjlq3aGhg3dIZFvSfsnyc,spotify:track:2Tjlq3aGhg3dIZFvSfsnyc,https://api.spotify.com/v1/tracks/2Tjlq3aGhg3d...,https://api.spotify.com/v1/audio-analysis/2Tjl...,283467,5


In [114]:
# Cleaning the data frame and removing features we don't need
df_spotify = df_spotify.drop_duplicates(['track_uri']).reset_index()
df_spotify_tracks = df_spotify['track_uri']
df_spotify_track_names = df_spotify['track_name']

# We will be using the clean df to implement our model
df_spotify_clean = df_spotify.drop(columns=['index', 'pid','pos','artist_name','track_uri','artist_uri','track_name','album_uri','album_name','type','id','uri','track_href','analysis_url','duration_ms'])
df_spotify_clean.head(10)

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,duration_ms.1,time_signature
0,0.376,0.821,0,-3.974,1,0.104,0.0785,0.0,0.0823,0.331,120.462,199453,4
1,0.585,0.421,7,-9.593,1,0.0707,0.106,1e-05,0.14,0.24,132.986,260640,3
2,0.504,0.705,6,-8.205,0,0.364,0.0752,0.0,0.128,0.584,175.483,239320,4
3,0.771,0.647,5,-5.127,0,0.376,0.294,0.0,0.37,0.678,155.914,206240,4
4,0.48,0.682,7,-4.94,1,0.13,0.0696,0.0,0.0463,0.506,167.939,235653,3
5,0.511,0.596,0,-6.56,1,0.224,0.53,0.0,0.0798,0.554,158.063,226014,4
6,0.422,0.712,11,-5.907,0,0.1,0.273,0.0,0.051,0.471,78.454,261160,4
7,0.418,0.111,1,-14.848,1,0.0389,0.953,0.00342,0.0982,0.228,82.508,240147,4
8,0.377,0.638,4,-5.754,1,0.0545,0.213,8e-05,0.12,0.369,92.644,219214,4
9,0.395,0.676,0,-5.46,1,0.0498,0.608,4.9e-05,0.117,0.315,142.929,283467,5


In [109]:
# Standardizing the data
scaler = MinMaxScaler()
scaler.fit(df_spotify_clean)
df_spotify_clean_scaled = scaler.transform(df_spotify_clean)

# We create the cosine similarity matrix of the small scaled dataframe we have 
df_spotify_cosine = cosine_similarity(df_spotify_clean_scaled)


In [78]:
# We save the cosine similarity arrays as dataframes such that each column gives the similarity one song has with every other song
df_spotify_cosine = pd.DataFrame(df_spotify_cosine)
df_spotify_cosine.head(10)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,...,9341,9342,9343,9344,9345,9346,9347,9348,9349,9350,9351,9352,9353,9354,9355,9356,9357,9358,9359,9360,9361,9362,9363,9364,9365,9366,9367,9368,9369,9370,9371,9372,9373,9374,9375,9376,9377,9378,9379,9380,9381,9382,9383,9384,9385,9386,9387,9388,9389,9390
0,1.0,0.901929,0.782104,0.768605,0.929427,0.952791,0.704712,0.798435,0.971115,0.956404,0.804617,0.656117,0.924023,0.712121,0.895556,0.709808,0.816971,0.838186,0.759852,0.771055,0.848531,0.93346,0.89467,0.851069,0.910214,0.751266,0.7983,0.916668,0.776012,0.854353,0.981741,0.878649,0.852297,0.967445,0.807738,0.830235,0.813581,0.893107,0.856532,0.751023,0.780303,0.826329,0.672799,0.881161,0.758602,0.776846,0.97599,0.822804,0.865163,0.839222,...,0.713278,0.869824,0.896826,0.709012,0.82549,0.745285,0.988602,0.93294,0.944161,0.80215,0.939805,0.910416,0.985648,0.806747,0.748022,0.767741,0.986608,0.981791,0.990721,0.935329,0.991845,0.924825,0.99059,0.883254,0.974323,0.954247,0.77241,0.929324,0.991472,0.989693,0.924388,0.735684,0.928009,0.730468,0.794967,0.72242,0.882501,0.928154,0.925783,0.806396,0.934373,0.981311,0.805707,0.954403,0.983465,0.72727,0.800813,0.853179,0.726925,0.822481
1,0.901929,1.0,0.793016,0.78466,0.977052,0.895163,0.787653,0.810602,0.9582,0.883065,0.934328,0.808315,0.987625,0.771784,0.899791,0.869206,0.876955,0.856533,0.732228,0.700819,0.953584,0.983008,0.97968,0.883688,0.99067,0.822082,0.800197,0.968612,0.782559,0.981413,0.902625,0.958088,0.968871,0.922734,0.775614,0.905028,0.808688,0.952769,0.828141,0.816685,0.68945,0.784084,0.76369,0.958329,0.705311,0.721822,0.966398,0.768118,0.970312,0.83796,...,0.737182,0.85765,0.811276,0.739321,0.764768,0.727575,0.925147,0.934282,0.942909,0.734376,0.957503,0.959847,0.888393,0.769362,0.792096,0.768625,0.891964,0.894465,0.908489,0.92303,0.920633,0.939517,0.894091,0.947318,0.867078,0.937729,0.687527,0.933926,0.908407,0.904241,0.959847,0.782222,0.96322,0.78239,0.731275,0.792868,0.956223,0.961318,0.928315,0.69246,0.945796,0.899743,0.788914,0.882595,0.895465,0.785291,0.77855,0.912718,0.634707,0.767966
2,0.782104,0.793016,1.0,0.973037,0.851803,0.798792,0.918544,0.586112,0.797468,0.763012,0.759194,0.890711,0.827118,0.861079,0.742305,0.643266,0.722593,0.687334,0.860089,0.842453,0.795338,0.812014,0.83319,0.764368,0.828459,0.935405,0.952947,0.846,0.957549,0.834544,0.792923,0.745959,0.820177,0.745273,0.95416,0.693827,0.594258,0.715039,0.65624,0.935515,0.893171,0.956374,0.877879,0.822125,0.840717,0.897295,0.822096,0.937097,0.774265,0.636774,...,0.884961,0.767591,0.679021,0.876631,0.942432,0.889531,0.823967,0.813327,0.817252,0.907036,0.832438,0.857655,0.790348,0.969155,0.934746,0.931267,0.792509,0.807401,0.784257,0.813844,0.824072,0.837256,0.792045,0.81076,0.742562,0.797463,0.851686,0.810332,0.812263,0.785111,0.823543,0.920936,0.833663,0.905589,0.916684,0.914852,0.834711,0.834362,0.803617,0.875673,0.819295,0.75928,0.958868,0.762531,0.796715,0.917604,0.922885,0.764699,0.774391,0.929519
3,0.768605,0.78466,0.973037,1.0,0.822618,0.819227,0.894,0.6475,0.791634,0.777361,0.762265,0.891221,0.811168,0.887975,0.773242,0.68046,0.767974,0.75668,0.899,0.901862,0.811657,0.816071,0.82109,0.818471,0.815553,0.935979,0.946546,0.812633,0.954119,0.831972,0.811696,0.735179,0.811987,0.765635,0.952866,0.681707,0.639078,0.71504,0.71802,0.94882,0.928096,0.95837,0.902304,0.831195,0.898883,0.927927,0.800233,0.952067,0.76426,0.705855,...,0.833104,0.730343,0.705829,0.818188,0.905117,0.891443,0.803364,0.797466,0.780061,0.885374,0.814528,0.813439,0.778076,0.953829,0.911675,0.930184,0.792822,0.80908,0.772799,0.813936,0.798709,0.822443,0.776007,0.764325,0.720907,0.775025,0.841921,0.795193,0.791291,0.770289,0.799148,0.882436,0.795576,0.878228,0.887471,0.890335,0.799061,0.811843,0.786815,0.876713,0.797486,0.731633,0.933108,0.764892,0.796239,0.88249,0.899056,0.710437,0.775097,0.889249
4,0.929427,0.977052,0.851803,0.822618,1.0,0.916521,0.819212,0.765575,0.963947,0.889003,0.914695,0.789438,0.97502,0.756557,0.888978,0.823373,0.85284,0.83014,0.733486,0.722604,0.932269,0.968567,0.972715,0.868885,0.97411,0.823289,0.840391,0.984634,0.834775,0.960297,0.918734,0.961071,0.95801,0.904419,0.831439,0.871818,0.770172,0.906836,0.800295,0.814622,0.742138,0.824007,0.763629,0.956913,0.720916,0.76107,0.973844,0.803463,0.935053,0.791747,...,0.774968,0.875988,0.81445,0.781253,0.819162,0.764098,0.952347,0.947344,0.961165,0.778548,0.960997,0.980405,0.918694,0.83943,0.828457,0.808648,0.913314,0.931102,0.926738,0.93032,0.952231,0.959808,0.928897,0.948272,0.884223,0.93817,0.756266,0.944427,0.936224,0.927778,0.957366,0.813866,0.971044,0.801727,0.78989,0.815232,0.957725,0.964988,0.938985,0.738308,0.947744,0.915248,0.8386,0.909068,0.917351,0.81724,0.82137,0.91586,0.669654,0.815024
5,0.952791,0.895163,0.798792,0.819227,0.916521,1.0,0.706899,0.903442,0.948583,0.979138,0.873687,0.679838,0.924253,0.784322,0.968065,0.811352,0.895709,0.94043,0.83207,0.83793,0.909819,0.929272,0.883874,0.945481,0.890206,0.741137,0.795584,0.878619,0.774595,0.855242,0.971309,0.842153,0.845285,0.944854,0.791273,0.843678,0.911701,0.887889,0.919769,0.755882,0.761816,0.805605,0.65112,0.907064,0.843083,0.808941,0.941176,0.819305,0.819312,0.923324,...,0.642261,0.812405,0.827241,0.650882,0.767474,0.699991,0.934523,0.865742,0.874692,0.730783,0.878449,0.862613,0.918289,0.785186,0.702362,0.727959,0.925914,0.933063,0.922418,0.879037,0.934448,0.871402,0.927257,0.815272,0.892563,0.878408,0.719409,0.87627,0.934788,0.924312,0.861498,0.679099,0.872595,0.67,0.756561,0.673524,0.829255,0.870541,0.857065,0.75362,0.866432,0.907988,0.761925,0.885505,0.929828,0.673472,0.738057,0.778793,0.667428,0.759814
6,0.704712,0.787653,0.918544,0.894,0.819212,0.706899,1.0,0.593035,0.791211,0.710834,0.797875,0.948154,0.782769,0.874781,0.695746,0.748584,0.700603,0.684358,0.80838,0.808348,0.830289,0.782913,0.874723,0.736715,0.833085,0.947332,0.96597,0.859524,0.944502,0.823973,0.720371,0.756492,0.864237,0.674502,0.907509,0.779892,0.560952,0.681178,0.613554,0.932272,0.797209,0.905172,0.898848,0.855902,0.788329,0.787816,0.781902,0.831857,0.825766,0.620716,...,0.93731,0.727115,0.594416,0.912245,0.887459,0.850991,0.760411,0.820092,0.82915,0.870214,0.825962,0.861715,0.721351,0.903613,0.969138,0.906676,0.704595,0.744429,0.7285,0.772216,0.753003,0.850297,0.734009,0.854312,0.665343,0.780912,0.759712,0.79404,0.724591,0.726004,0.829815,0.96072,0.832265,0.94217,0.817524,0.952845,0.850751,0.837795,0.813945,0.753818,0.814348,0.689174,0.918393,0.702887,0.696183,0.964581,0.900425,0.828181,0.722386,0.884464
7,0.798435,0.810602,0.586112,0.6475,0.765575,0.903442,0.593035,1.0,0.853437,0.923244,0.876081,0.607414,0.825206,0.766266,0.955485,0.901963,0.904897,0.975624,0.780068,0.765079,0.898631,0.830439,0.799418,0.936004,0.789072,0.607697,0.646406,0.742878,0.583807,0.752818,0.842961,0.706184,0.754328,0.846728,0.578209,0.874818,0.986602,0.85676,0.942197,0.638997,0.54865,0.6153,0.510678,0.847983,0.783378,0.628068,0.813194,0.618927,0.75221,0.978802,...,0.476317,0.665251,0.683003,0.481813,0.549976,0.51281,0.759472,0.72209,0.728731,0.528681,0.736071,0.710376,0.746388,0.550732,0.522275,0.531126,0.752456,0.753418,0.761312,0.729499,0.755714,0.720781,0.751045,0.693182,0.736975,0.73929,0.456611,0.728946,0.756048,0.757385,0.728226,0.508192,0.730428,0.506833,0.531469,0.507517,0.697105,0.731041,0.721601,0.534024,0.726206,0.760177,0.549959,0.713085,0.754114,0.503457,0.52629,0.673208,0.488939,0.552545
8,0.971115,0.9582,0.797468,0.791634,0.963947,0.948583,0.791211,0.853437,1.0,0.955226,0.895986,0.750479,0.963837,0.767396,0.926667,0.841322,0.882677,0.891459,0.773156,0.775597,0.933303,0.971346,0.963724,0.907258,0.962155,0.80213,0.834498,0.968573,0.810331,0.923423,0.96341,0.919437,0.937233,0.955621,0.813161,0.915837,0.845697,0.936088,0.880531,0.800196,0.762183,0.826749,0.73727,0.961618,0.762697,0.761032,0.986048,0.796749,0.938459,0.870506,...,0.759176,0.876237,0.868707,0.747809,0.807726,0.758801,0.968786,0.963225,0.970534,0.79031,0.964935,0.952696,0.960327,0.807392,0.797362,0.790354,0.951865,0.964343,0.965477,0.947252,0.96934,0.965093,0.964818,0.934782,0.934127,0.960502,0.716524,0.948155,0.960066,0.963366,0.957493,0.78455,0.958071,0.776149,0.765676,0.77623,0.933946,0.963235,0.959113,0.752928,0.9577,0.953362,0.811308,0.938729,0.947816,0.782015,0.796542,0.913565,0.694589,0.80496
9,0.956404,0.883065,0.763012,0.777361,0.889003,0.979138,0.710834,0.923244,0.955226,1.0,0.870887,0.675287,0.911135,0.809796,0.972496,0.810207,0.901879,0.933807,0.848579,0.844427,0.904234,0.914609,0.88386,0.920484,0.883653,0.743469,0.801195,0.872599,0.747544,0.831382,0.967099,0.81618,0.823995,0.944811,0.772493,0.886436,0.935761,0.907247,0.94563,0.76563,0.743175,0.80384,0.649787,0.899192,0.849964,0.774563,0.947214,0.801866,0.836693,0.939366,...,0.665585,0.820266,0.838906,0.669947,0.783464,0.713756,0.931191,0.875204,0.884438,0.757142,0.889283,0.860186,0.92498,0.766052,0.705975,0.729931,0.928864,0.920169,0.931722,0.886192,0.933798,0.865936,0.924821,0.831157,0.920137,0.898527,0.696455,0.874823,0.934083,0.929532,0.873073,0.696698,0.878133,0.692579,0.761701,0.685447,0.836108,0.876613,0.872311,0.769235,0.877998,0.933695,0.766884,0.886777,0.930843,0.687867,0.750195,0.80221,0.68908,0.783472


**Recommendation Approach:**

Let's create two recommendation function.
We will be use a content filtering method to recommend songs.
We will look at features of songs found from Spotify API and then use Cosine similarity to provide the recommendations.

In [79]:

# This is the radio generator, where you give a song and Spotify suggests you X number of songs based on one song

# This functions assume that the song you provide is already in the list of 9,300 songs we have for
# randomly selected 100 playlists

def generate_radio(uri, cosine_df, info_df, num_tracks = 10):
    '''
    Input:
    uri = Track to provide recommendations on
    cosine_df = Cosine similarity df
    info_df = The df with all the songs and track info and audio features
    num_tracks = Number of tracks to recommend
    
    Output:
    rec_songs = df of track names and uri of recommended songs, length = num_tracks 
    '''
    index = info_df.index[info_df['track_uri'] == uri][0]
    similarities = cosine_df.iloc[:, index].sort_values(ascending=False)
    final_indices = list(similarities[1:num_tracks].index)
    rec_songs = info_df[['track_name','artist_name','track_uri']].iloc[final_indices]
    return rec_songs

In [80]:
# Testing our radio generating function

i = 34
test_uri = df_spotify_tracks[i]
test_pid = df_spotify['pid'].iloc[i]
print("Song provided:")
display(df_spotify.iloc[34][['track_name','artist_name','track_uri']])
x = generate_radio(test_uri, df_spotify_cosine, df_spotify, 10)
print()
print("Radio recommended")
display(x)


Song provided:


track_name                        Island In The Sun
artist_name                                  Weezer
track_uri      spotify:track:2MLHyLy5z5l5YRp7momlgw
Name: 34, dtype: object


Radio recommended


Unnamed: 0,track_name,artist_name,track_uri
1531,Sugar (feat. Francesco Yates),Robin Schulz,spotify:track:5tf1VVWniHgryyumXyJM7w
6525,Summer,Calvin Harris,spotify:track:6YUTL4dYpB9xZO5qExPf05
8596,Ni Una Sola Palabra,Paulina Rubio,spotify:track:52IreIJblsuK0SAJDIIh6v
7951,Yayo,Snootie Wild,spotify:track:1BfPC2cHY3m9RIstE4cjr5
4021,Closer,Ne-Yo,spotify:track:2nbClS09zsIAqNkshg6jnp
4546,Music Sounds Better,Big Time Rush,spotify:track:21svHgL8NlWthrlW9Gy0BZ
4171,Fading,William Bolton,spotify:track:2lnQRBzMUjSStGQRsGE8qU
8492,Take You There,Sean Kingston,spotify:track:2YUv5bLhf4ena4JWVY0Lql
8990,Valerie - '68 Version,Amy Winehouse,spotify:track:7Lxt392wmXBuWahE55fFAU


In [81]:
# This is the playlist generator, where you give a playlist and Spotify suggests you playlists based
# on that playlists

# This functions assume that the playlist you provide is already in the list of 10,000 songs we have for
# randomly selected 100 playlists

def generate_playlist(playlist, cosine_df, info_df, num_playlist = 2):
    '''
    Input:
    playlist = Playlist to provide recommendations
    cosine_df = Cosine similarity df
    info_df = The df with all the songs and track info and audio features
    num_playlist = Number of playlist to generate
    
    Output:
    rec_playlists = list of dfs of recommended playlist track names, artists and uri of recommended songs, length = num_playlists
    '''
    indices = list(info_df[info_df['track_uri'].isin(playlist)].index)
    #print("Playlist provided:")
    #display(info_df.iloc[indices][['track_name','artist_name','track_uri']])

    x = cosine_df.iloc[:,indices]

    my_list = []
    for i in range(len(indices)):
        my_list.append(list(x.iloc[:, i].sort_values(ascending = False)[1:num_playlist + 1].index))
    my_list = np.array(my_list).transpose()
    
    rec_playlist = []
    for i in my_list:
        rec_playlist.append(info_df.iloc[i][['track_name', 'artist_name', 'track_uri']])
    
    return rec_playlist

In [82]:
np.random.seed(seed = 10)

numbers3 = np.arange(0,len(df_spotify),1)
sample3 = np.random.choice(numbers3, size=10, replace=False)
test_playlist = df_spotify['track_uri']

rec_playlist = generate_playlist(test_playlist, df_spotify_cosine, df_spotify , 1)

counter = 0
for i in rec_playlist:
    counter += 1
    print("Recommended playlist #" + str(counter))
    display(i)
    print()

Recommended playlist #1


Unnamed: 0,track_name,artist_name,track_uri
6797,Saturday Night,Misfits,spotify:track:04EqqBVDCU2LOtxOMZ223T
7159,"Oh, Sister",Bob Dylan,spotify:track:4JtK4KieKw8mlPAIX4ODht
5017,Diamonds & Gold,Mac Miller,spotify:track:75bpA2kj6hAFeunmJ9IQie
7475,Poetic Justice,Kendrick Lamar,spotify:track:1zCi4cVFqe6ja16MeGZKRN
8330,Chelsea Dagger,The Fratellis,spotify:track:1bCmvezFg5MRcENzCGG1Cy
...,...,...,...
9372,Youngblood,Wage War,spotify:track:4t3lsnPp3ywXpvRxMWyJe1
114,Take Ü There (feat. Kiesza),Jack Ü,spotify:track:2RpKh7kXSdO8NLrW9VQ46p
9314,Revenant,Chelsea Grin,spotify:track:2jC2c2Ou7enoXPo1hhEzOV
1435,Aurora,Approaching Nirvana,spotify:track:7L3QsXEFNrhdxAPh7hFCkU





In [83]:
# This is one of the metrics that can be used to understand the precision of the recommendation
def r_precision(prediction, validation):
    score = np.sum(validation.isin(prediction))/validation.shape[0]
    return score

### NDCG Code Source: https://gist.github.com/bwhite/3726239
def dcg_at_k(r, k, method=0):
    r = np.asfarray(r)[:k]
    if r.size:
        if method == 0:
            return r[0] + np.sum(r[1:] / np.log2(np.arange(2, r.size + 1)))
        elif method == 1:
            return np.sum(r / np.log2(np.arange(2, r.size + 2)))
        else:
            raise ValueError('method must be 0 or 1.')
    return 0.


def ndcg_at_k(r, k, method=0):
    dcg_max = dcg_at_k(sorted(r, reverse=True), k, method)
    if not dcg_max:
        return 0.
    return dcg_at_k(r, k, method) / dcg_max

In [104]:
# Testing model

# Choosing pids with at least 100 songs such that our model gets at least 50 songs to work with
# The average playlist in our data set contains 52 songs
# So, if we take 100 songs at least and do a .5 train-test split
# The model will produce 50 recommendations and get tested on a 50 song playlist on average which makes sense

pids = list(df_spotify['pid'].value_counts()[df_spotify['pid'].value_counts() > 100].index)
r_precisions = []
ndcgs = []

for pid in pids:
    train, test = train_test_split(df_spotify[df_spotify['pid'] == pid], test_size=0.50, random_state = 24)
    prediction = rec_playlist = generate_playlist(train['track_uri'], df_spotify_cosine, df_spotify , 1)
    r = r_precision(prediction[0]['track_uri'], test['track_uri'])
    r_precisions.append(r)
    
    rx = np.zeros(len((prediction[0])))
    for i, p in enumerate(prediction[0]['track_uri']):
        if np.any(test['track_uri'].isin([p])):              
            rx[i] = 1
    ndcgs.append(ndcg_at_k(rx, len(rx)))

print("Average R precision: " + str(np.mean(r_precisions)))
print("Average NDCG: " + str(np.mean(ndcgs)))

Average R precision: 0.043582461509301736
Average NDCG: 0.2691182070784967
