# Song Recommender

### Based on a dataset of songs listened to by 10000 users, use recommender models to pick the best songs for each user and identify the most recommended song

<hr>

In [1]:
import turicreate

In [2]:
song_data = turicreate.SFrame('../song_data.sframe/song_data.sframe/')

# Data Exploration

In [3]:
song_data.head()

user_id,song_id,listen_count,title,artist,song
b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOAKIMP12A8C130995,1,The Cove,Jack Johnson,The Cove - Jack Johnson
b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOBBMDR12A8C13253B,2,Entre Dos Aguas,Paco De Lucia,Entre Dos Aguas - Paco De Lucia ...
b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOBXHDL12A81C204C0,1,Stronger,Kanye West,Stronger - Kanye West
b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOBYHAJ12A6701BF1D,1,Constellations,Jack Johnson,Constellations - Jack Johnson ...
b80344d063b5ccb3212f76538f3d9e43d87dca9e,SODACBL12A8C13C273,1,Learn To Fly,Foo Fighters,Learn To Fly - Foo Fighters ...
b80344d063b5ccb3212f76538f3d9e43d87dca9e,SODDNQT12A6D4F5F7E,5,Apuesta Por El Rock 'N' Roll ...,Héroes del Silencio,Apuesta Por El Rock 'N' Roll - Héroes del ...
b80344d063b5ccb3212f76538f3d9e43d87dca9e,SODXRTY12AB0180F3B,1,Paper Gangsta,Lady GaGa,Paper Gangsta - Lady GaGa
b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOFGUAY12AB017B0A8,1,Stacked Actors,Foo Fighters,Stacked Actors - Foo Fighters ...
b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOFRQTD12A81C233C0,1,Sehr kosmisch,Harmonia,Sehr kosmisch - Harmonia
b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOHQWYZ12A6D4FA701,1,Heaven's gonna burn your eyes ...,Thievery Corporation feat. Emiliana Torrini ...,Heaven's gonna burn your eyes - Thievery ...


In [4]:
len(song_data)

1116609

In [5]:
song_data.show()

## Exploring: counting the unique users who listened to selected artists

Kanye West

In [10]:
len(song_data[song_data['artist'] == 'Kanye West']['user_id'].unique())

2522

Foo Fighters

In [11]:
len(song_data[song_data['artist'] == 'Foo Fighters']['user_id'].unique())

2055

Taylor Swift

In [12]:
len(song_data[song_data['artist'] == 'Taylor Swift']['user_id'].unique())

3246

Lady GaGa

In [13]:
len(song_data[song_data['artist'] == 'Lady GaGa']['user_id'].unique())

2928
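
The four cells above repeat the same filter-and-count expression with a different artist name each time. As a minimal sketch of that logic in plain Python (not turicreate), the hypothetical helper `unique_listeners` below counts distinct users per artist in one pass. The rows are made-up toy data, not values from `song_data`.

```python
from collections import defaultdict

# Toy (user_id, artist) rows standing in for song_data; all values are invented.
rows = [
    ("u1", "Kanye West"),
    ("u1", "Kanye West"),    # a repeat listen by the same user counts once
    ("u2", "Kanye West"),
    ("u2", "Foo Fighters"),
    ("u3", "Taylor Swift"),
]


def unique_listeners(rows, artists):
    """Return {artist: number of distinct users} for the requested artists."""
    users_by_artist = defaultdict(set)
    for user_id, artist in rows:
        if artist in artists:
            users_by_artist[artist].add(user_id)
    # Artists with no matching rows get an empty set, hence a count of 0.
    return {artist: len(users_by_artist[artist]) for artist in artists}


counts = unique_listeners(
    rows, ["Kanye West", "Foo Fighters", "Taylor Swift", "Lady GaGa"]
)
print(counts)
```

Using a set per artist is what makes the count "unique": adding the same `user_id` twice has no effect, which mirrors calling `.unique()` before `len()` in the cells above.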

## Exploring: finding the most and least popular artists by total listen count

In [16]:
artist_stats = song_data.groupby(key_column_names='artist', operations={'total_count': turicreate.aggregate.SUM('listen_count')})

In [21]:
artist_stats = artist_stats.sort('total_count', ascending=False)

In [22]:
artist_stats.head()

artist,total_count
Kings Of Leon,43218
Dwight Yoakam,40619
Björk,38889
Coldplay,35362
Florence + The Machine,33387
Justin Bieber,29715
Alliance Ethnik,26689
OneRepublic,25754
Train,25402
The Black Keys,22184


In [23]:
artist_stats.tail()

artist,total_count
Aneta Langerova,38
Jody Bernal,38
Kanye West / Talib Kweli / Q-Tip / Common / ...,38
Nâdiya,36
harvey summers,31
Boggle Karaoke,30
Diplo,30
Beyoncé feat. Bun B and Slim Thug ...,26
Reel Feelings,24
William Tabbert,14
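
The `groupby` with `SUM('listen_count')` followed by a descending sort can be pictured with a few lines of plain Python. This is only a sketch on invented rows; the helper name `total_listens_by_artist` and the tie-breaking by artist name are assumptions made for the example, not turicreate behaviour.

```python
from collections import Counter

# Toy (artist, listen_count) rows; the numbers are invented for illustration.
rows = [
    ("Jack Johnson", 1),
    ("Paco De Lucia", 2),
    ("Jack Johnson", 1),
    ("Kanye West", 1),
    ("Foo Fighters", 5),
]


def total_listens_by_artist(rows):
    """Sum listen_count per artist and return (artist, total) pairs, largest first."""
    totals = Counter()
    for artist, listen_count in rows:
        totals[artist] += listen_count
    # Sort by total descending; break ties alphabetically so the order is stable.
    return sorted(totals.items(), key=lambda pair: (-pair[1], pair[0]))


ranked = total_listens_by_artist(rows)
most_popular = ranked[0]     # plays the role of the first row of artist_stats.head()
least_popular = ranked[-1]   # plays the role of the last row of artist_stats.tail()
print(most_popular, least_popular)
```

Note that this measure sums listen counts, so an artist with a few very heavy listeners can outrank one with many casual listeners.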


# Creating a simple popularity recommender

In [25]:
# Hold out roughly 20% of the rows for testing; the fixed seed makes the split reproducible
train_data, test_data = song_data.random_split(.8, seed=0)
popularity_model = turicreate.popularity_recommender.create(train_data,
                                                           user_id = 'user_id',
                                                           item_id = 'song')
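
To make the train/test split concrete, here is a rough plain-Python sketch of a seeded row-wise split. It assumes each row is assigned independently with probability `fraction`, so the first part holds approximately (not exactly) that share of the rows. The function below is a hypothetical stand-in and is not how `SFrame.random_split` is implemented internally.

```python
import random


def random_split(rows, fraction, seed):
    """Split rows into two lists; each row lands in the first with probability `fraction`."""
    rng = random.Random(seed)  # a private generator keeps the split reproducible
    first, second = [], []
    for row in rows:
        if rng.random() < fraction:
            first.append(row)
        else:
            second.append(row)
    return first, second


rows = list(range(1000))  # toy stand-in for the rows of song_data
train, test = random_split(rows, 0.8, seed=0)
print(len(train), len(test))
```

Because the split works on rows rather than users, the same user can appear in both parts, which is what later allows recommendations for test users to be made by a model trained on the training rows.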

In [28]:
song_data[0]['user_id']

'b80344d063b5ccb3212f76538f3d9e43d87dca9e'

### What are the most popular songs? (the ranking is the same for every user, apart from songs that user has already listened to, which are excluded by default)

In [29]:
popularity_model.recommend(users=[song_data[0]['user_id']])

user_id,song,score,rank
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Undo - Björk,4227.0,1
b80344d063b5ccb3212f76538f3d9e43d87dca9e,You're The One - Dwight Yoakam ...,3781.0,2
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Dog Days Are Over (Radio Edit) - Florence + The ...,3633.0,3
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Revelry - Kings Of Leon,3527.0,4
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Horn Concerto No. 4 in E flat K495: II. Romance ...,3161.0,5
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Secrets - OneRepublic,3148.0,6
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Hey_ Soul Sister - Train,2538.0,7
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Fireflies - Charttraxx Karaoke ...,2532.0,8
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Tive Sim - Cartola,2521.0,9
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Drop The World - Lil Wayne / Eminem ...,2053.0,10
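
The scores above look like per-song counts taken from the training data. As a sketch of the idea behind a popularity recommender, the plain-Python code below assumes a song's score is the number of distinct users who listened to it, and that songs a user already knows are skipped. Both assumptions, and the helper names `fit_popularity` and `recommend_popular`, are illustrative rather than a description of turicreate internals.

```python
from collections import defaultdict

# Toy (user_id, song) training rows; all values are invented.
train_rows = [
    ("u1", "A"), ("u2", "A"), ("u3", "A"),
    ("u1", "B"), ("u2", "B"),
    ("u3", "C"),
]


def fit_popularity(train_rows):
    """Return ({song: distinct listener count}, {user: set of songs already heard})."""
    users_by_song = defaultdict(set)
    seen_by_user = defaultdict(set)
    for user_id, song in train_rows:
        users_by_song[song].add(user_id)
        seen_by_user[user_id].add(song)
    popularity = {song: len(users) for song, users in users_by_song.items()}
    return popularity, dict(seen_by_user)


def recommend_popular(popularity, seen_by_user, user_id, k=10):
    """Return the k most popular songs the user has not heard yet."""
    seen = seen_by_user.get(user_id, set())
    candidates = [(song, score) for song, score in popularity.items() if song not in seen]
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))  # highest score first
    return candidates[:k]


popularity, seen_by_user = fit_popularity(train_rows)
print(recommend_popular(popularity, seen_by_user, "u1", k=2))  # known user
print(recommend_popular(popularity, seen_by_user, "u9", k=2))  # unseen user
```

The model itself holds a single global ranking. The only per-user variation comes from filtering out songs that user has already heard, so a brand-new user simply receives the top of the global list.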


# Creating a similarity recommender

In [30]:
similarity_model = turicreate.item_similarity_recommender.create(train_data,
                                                                  user_id = 'user_id',
                                                                  item_id = 'song')

### Checking the recommendation for the same user as the popularity model

In [31]:
similarity_model.recommend(users=[song_data[0]['user_id']])

user_id,song,score,rank
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Meadowlarks - Fleet Foxes,0.0248072429707175,1
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Quiet Houses - Fleet Foxes ...,0.0240329645181957,2
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Heard Them Stirring - Fleet Foxes ...,0.0203885561541507,3
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Tiger Mountain Peasant Song - Fleet Foxes ...,0.0199806752957795,4
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Your Protector - Fleet Foxes ...,0.0193978893129449,5
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Oliver James - Fleet Foxes ...,0.019061129344137,6
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Great Indoors - John Mayer ...,0.0149489750987605,7
b80344d063b5ccb3212f76538f3d9e43d87dca9e,Innocent Son - Fleet Foxes ...,0.0148925859677164,8
b80344d063b5ccb3212f76538f3d9e43d87dca9e,White Winter Hymnal - Fleet Foxes ...,0.0148194040122785,9
b80344d063b5ccb3212f76538f3d9e43d87dca9e,City Love - John Mayer,0.0138473055864635,10
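
Unlike the popularity model, these recommendations depend on what the user has already listened to. One common way to build such a model is sketched below: songs are compared by the Jaccard similarity of their listener sets, and a candidate's score is its average similarity to the songs the user already knows. This is an assumption about the general technique, written in plain Python on invented data. The notebook does not show which similarity measure or scoring rule turicreate applied here.

```python
from collections import defaultdict

# Toy (user_id, song) training rows; all values are invented.
train_rows = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "B"), ("u3", "C"),
    ("u4", "D"),
]


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def recommend_similar(train_rows, user_id, k=10):
    """Rank unheard songs by mean Jaccard similarity to the user's known songs."""
    users_by_song = defaultdict(set)
    songs_by_user = defaultdict(set)
    for user, song in train_rows:
        users_by_song[song].add(user)
        songs_by_user[user].add(song)

    seen = songs_by_user.get(user_id, set())
    if not seen:
        return []  # no history, so there is nothing to compare against

    scores = {}
    for song, listeners in users_by_song.items():
        if song in seen:
            continue  # do not recommend songs the user already knows
        similarities = [jaccard(listeners, users_by_song[known]) for known in seen]
        scores[song] = sum(similarities) / len(similarities)

    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:k]


recs = recommend_similar(train_rows, "u1", k=2)
print(recs)
```

In the toy data, song C shares listeners with both songs that `u1` knows, so it ranks above D, which shares none. The same effect explains why the output above is dominated by a single artist: songs heard by the same group of listeners score highly against each other.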


# What are the most recommended songs by the similarity model?

## Using a subset of at most 10,000 test users to keep the run time short

In [32]:
limited_test_users = test_data['user_id'].unique()[0:10000]

## Getting the top recommendation (k=1) for each of these users

In [35]:
limited_test_recommendations = similarity_model.recommend(limited_test_users,k=1)

In [37]:
limited_test_top_songs = limited_test_recommendations.groupby(key_column_names='song', operations={'count': turicreate.aggregate.COUNT()})

In [39]:
limited_test_top_songs.sort('count', ascending=False)

song,count
Undo - Björk,432
Secrets - OneRepublic,384
Revelry - Kings Of Leon,227
You're The One - Dwight Yoakam ...,157
Fireflies - Charttraxx Karaoke ...,111
Hey_ Soul Sister - Train,104
Sehr kosmisch - Harmonia,104
Horn Concerto No. 4 in E flat K495: II. Romance ...,87
OMG - Usher featuring will.i.am ...,60
Bigger - Justin Bieber,43
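
The counting step above (group the top-1 recommendations by song, then sort by count) amounts to a frequency tally. A minimal plain-Python sketch follows, using invented user and song names in place of `limited_test_recommendations`.

```python
from collections import Counter

# Toy mapping of user -> top recommended song (k=1); all names are invented.
top_picks = {
    "u1": "Song A",
    "u2": "Song B",
    "u3": "Song A",
    "u4": "Song C",
    "u5": "Song A",
}

# Tally how many users received each song as their top recommendation.
counts = Counter(top_picks.values())

# most_common() returns (song, count) pairs sorted by count, largest first.
ranked = counts.most_common()
most_recommended, times_recommended = ranked[0]
print(most_recommended, times_recommended)
```

Since each user contributes exactly one recommendation, the counts add up to the number of users who received a recommendation.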


Most recommended song by the similarity model: Undo - Björk (the top pick for 432 of the sampled test users)