# Recommender with TuriCreate

In [37]:
import turicreate
song_data = turicreate.SFrame('songs.frame_idx')

### Our data consists of a list of songs and how many times each user has played them

In [2]:
song_data

user_id,song_id,listen_count,title,artist,song
b80344d063b5ccb3212f76538 f3d9e43d87dca9e ...,SOAKIMP12A8C130995,1,The Cove,Jack Johnson,The Cove - Jack Johnson
b80344d063b5ccb3212f76538 f3d9e43d87dca9e ...,SOBBMDR12A8C13253B,2,Entre Dos Aguas,Paco De Lucia,Entre Dos Aguas - Paco De Lucia ...
b80344d063b5ccb3212f76538 f3d9e43d87dca9e ...,SOBXHDL12A81C204C0,1,Stronger,Kanye West,Stronger - Kanye West
b80344d063b5ccb3212f76538 f3d9e43d87dca9e ...,SOBYHAJ12A6701BF1D,1,Constellations,Jack Johnson,Constellations - Jack Johnson ...
b80344d063b5ccb3212f76538 f3d9e43d87dca9e ...,SODACBL12A8C13C273,1,Learn To Fly,Foo Fighters,Learn To Fly - Foo Fighters ...
b80344d063b5ccb3212f76538 f3d9e43d87dca9e ...,SODDNQT12A6D4F5F7E,5,Apuesta Por El Rock 'N' Roll ...,Héroes del Silencio,Apuesta Por El Rock 'N' Roll - Héroes del ...
b80344d063b5ccb3212f76538 f3d9e43d87dca9e ...,SODXRTY12AB0180F3B,1,Paper Gangsta,Lady GaGa,Paper Gangsta - Lady GaGa
b80344d063b5ccb3212f76538 f3d9e43d87dca9e ...,SOFGUAY12AB017B0A8,1,Stacked Actors,Foo Fighters,Stacked Actors - Foo Fighters ...
b80344d063b5ccb3212f76538 f3d9e43d87dca9e ...,SOFRQTD12A81C233C0,1,Sehr kosmisch,Harmonia,Sehr kosmisch - Harmonia
b80344d063b5ccb3212f76538 f3d9e43d87dca9e ...,SOHQWYZ12A6D4FA701,1,Heaven's gonna burn your eyes ...,Thievery Corporation feat. Emiliana Torrini ...,Heaven's gonna burn your eyes - Thievery ...


### The first Recommender we are going to build is a simple popularity recommender

In [38]:
train_data,test_data = song_data.random_split(.3,seed=0)
popularity_model = turicreate.popularity_recommender.create(train_data,
                                                           user_id = 'user_id',
                                                           item_id = 'song')
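The first argument of `random_split` is the fraction of rows sent to the first returned SFrame, so with `.3` roughly 30% of the rows end up in `train_data` and the remaining 70% in `test_data`. The sketch below illustrates that behaviour in plain Python; the helper `random_split` and the integer rows are hypothetical stand-ins, and this is not TuriCreate's actual sampling code.

```python
import random

def random_split(rows, fraction, seed=None):
    """Send each row to the first list with probability `fraction`, else to the second."""
    rng = random.Random(seed)
    first, second = [], []
    for row in rows:
        (first if rng.random() < fraction else second).append(row)
    return first, second

rows = list(range(10_000))  # hypothetical stand-in for the rows of song_data
train_rows, test_rows = random_split(rows, 0.3, seed=0)
print(len(train_rows), len(test_rows))  # roughly 3000 and 7000
```

Passing a fixed seed makes the split reproducible between runs.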

With this model we can make some predictions, for example for `users[0]`:

In [39]:
users = song_data['user_id'].unique()
popularity_model.recommend(users=[users[0]]).head(2)


user_id,song,score,rank
c66c10a9567f0d82ff31441a9 fd5063e5cd9dfe8 ...,Sehr kosmisch - Harmonia,1858.0,1
c66c10a9567f0d82ff31441a9 fd5063e5cd9dfe8 ...,Undo - Björk,1516.0,2


In [40]:
popularity_model.recommend(users=[users[1]]).head(2)

user_id,song,score,rank
279292bb36dbfc7f505e36ebf 038c81eb1d1d63e ...,Sehr kosmisch - Harmonia,1858.0,1
279292bb36dbfc7f505e36ebf 038c81eb1d1d63e ...,Undo - Björk,1516.0,2


From these results, we can see that in both cases the top recommendations from the popularity-based model are the same.
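A popularity model ranks songs by a global statistic, so the ranking can only differ between users through the songs each user has already heard. A minimal sketch of that idea follows, assuming the score is simply the number of observations per song; the data and the function `popularity_recommend` are hypothetical and are not TuriCreate's implementation.

```python
from collections import Counter

# Hypothetical (user, song) observations standing in for train_data.
observations = [
    ("u1", "A"), ("u2", "A"), ("u3", "A"),
    ("u1", "B"), ("u2", "B"),
    ("u3", "C"),
]

def popularity_recommend(observations, user, k=2):
    """Rank songs by number of observations, skipping songs the user already has."""
    counts = Counter(song for _, song in observations)
    seen = {song for u, song in observations if u == user}
    ranked = [(song, n) for song, n in counts.most_common() if song not in seen]
    return ranked[:k]

# Two users with no history receive exactly the same ranking.
print(popularity_recommend(observations, "new_user_1"))
print(popularity_recommend(observations, "new_user_2"))
# A user with history only sees the songs they have not heard yet.
print(popularity_recommend(observations, "u1"))
```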

### Our next model will be a Recommender based on personalization

In [41]:
personalized_model = turicreate.item_similarity_recommender.create(train_data,
                                                                  user_id = 'user_id',
                                                                  item_id = 'song')
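No `similarity_type` is passed above, and to my knowledge `item_similarity_recommender` then defaults to Jaccard similarity: the number of users who listened to both songs divided by the number who listened to either. The sketch below computes that quantity on hypothetical data; the names `listeners` and `jaccard` are illustrative only.

```python
from collections import defaultdict

# Hypothetical (user, song) pairs standing in for train_data.
observations = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "A"),
    ("u4", "C"),
]

# Map each song to the set of users who listened to it.
listeners = defaultdict(set)
for user, song in observations:
    listeners[song].add(user)

def jaccard(song_a, song_b):
    """Shared listeners divided by listeners of either song."""
    a, b = listeners[song_a], listeners[song_b]
    union = a | b
    return len(a & b) / len(union) if union else 0.0

print(jaccard("A", "B"))  # 2 shared listeners out of 3 in total
print(jaccard("A", "C"))  # 1 shared listener out of 4 in total
```

Listen counts play no role in this measure; only whether a user listened to a song at all.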

### Now, let's check the recommendations from the personalized model

In [42]:
personalized_model.recommend(users=[users[0]]).head(2)

user_id,song,score,rank
c66c10a9567f0d82ff31441a9 fd5063e5cd9dfe8 ...,En El Septimo Dia - Soda Stereo ...,0.0260416666666666,1
c66c10a9567f0d82ff31441a9 fd5063e5cd9dfe8 ...,Sobredosis De T.V. - Soda Stereo ...,0.0173160235087076,2


In [52]:
personalized_model.recommend(users=[users[1]]).head(2)

user_id,song,score,rank
279292bb36dbfc7f505e36ebf 038c81eb1d1d63e ...,Sei Lá Mangueira - Elizeth Cardoso ...,0.0256410241127014,1
279292bb36dbfc7f505e36ebf 038c81eb1d1d63e ...,West One (Shine On Me) - The Ruts ...,0.0153349041938781,2


### As we can see, this model has a better outcome, recommending different songs to each user.
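The recommendations differ because each candidate song is scored against the songs that particular user has listened to. One common scheme, assumed here for illustration rather than taken from TuriCreate's source, is to average the candidate's similarity to every song in the user's history. The similarity table and the functions `sim` and `recommend` are hypothetical.

```python
# Hypothetical item-item similarity scores (for example Jaccard values).
similarity = {
    ("A", "B"): 0.5, ("A", "C"): 0.1, ("A", "D"): 0.0,
    ("B", "C"): 0.4, ("B", "D"): 0.2,
    ("C", "D"): 0.3,
}

def sim(x, y):
    """Look up a symmetric similarity score, defaulting to 0.0."""
    return similarity.get((x, y), similarity.get((y, x), 0.0))

def recommend(user_songs, catalog, k=2):
    """Score each unheard song by its mean similarity to the user's songs."""
    scores = {
        cand: sum(sim(cand, s) for s in user_songs) / len(user_songs)
        for cand in catalog
        if cand not in user_songs
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]

catalog = ["A", "B", "C", "D"]
print(recommend({"A"}, catalog))       # ranking for a user who heard only A
print(recommend({"C", "D"}, catalog))  # a different history gives a different ranking
```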

### We can also look for songs that are similar to a given song

In [67]:
personalized_model.get_similar_items(['Add It Up - Violent Femmes'])

song,similar,score,rank
Add It Up - Violent Femmes ...,Kiss Off - Violent Femmes,0.0526315569877624,1
Add It Up - Violent Femmes ...,Blister In The Sun - Violent Femmes ...,0.0512820482254028,2
Add It Up - Violent Femmes ...,Holiday - Weezer,0.0327869057655334,3
Add It Up - Violent Femmes ...,Charlotte Sometimes - The Cure ...,0.0322580933570861,4
Add It Up - Violent Femmes ...,Psycho Killer (Acoustic) - Talking Heads ...,0.0317460298538208,5
Add It Up - Violent Femmes ...,Debaser - Pixies,0.0259740352630615,6
Add It Up - Violent Femmes ...,Transmission - Joy Division ...,0.0232558250427246,7
Add It Up - Violent Femmes ...,Susie Q - Creedence Clearwater Revival ...,0.0232558250427246,8
Add It Up - Violent Femmes ...,Rx Queen (LP Version) - Deftones ...,0.0227272510528564,9
Add It Up - Violent Femmes ...,Thrill Me - Simply Red,0.0222222208976745,10


In [69]:
personalized_model.get_similar_items(['Someday - The Strokes'])

song,similar,score,rank
Someday - The Strokes,The Modern Age - The Strokes ...,0.0714285969734191,1
Someday - The Strokes,Mass Appeal (Explicit) - Gang Starr ...,0.0512820482254028,2
Someday - The Strokes,Reptilia - The Strokes,0.0444444417953491,3
Someday - The Strokes,Love It All - The Kooks,0.0400000214576721,4
Someday - The Strokes,I Want You - The Kooks,0.0384615659713745,5
Someday - The Strokes,In The Aeroplane Over The Sea - Neutral Milk Hotel ...,0.0370370149612426,6
Someday - The Strokes,Resolve - Foo Fighters,0.0370370149612426,7
Someday - The Strokes,Poor Places - Wilco,0.0370370149612426,8
Someday - The Strokes,The Pain - Murs,0.0370370149612426,9
Someday - The Strokes,2nd Self - Umphrey's McGee ...,0.0357142686843872,10


#### These are all plausible recommendations, and they are not based on popularity

### We can compare the two models quantitatively

In [70]:
model_performance = turicreate.recommender.util.compare_models(test_data, [popularity_model, personalized_model], user_sample=.05)

compare_models: using 3296 users to estimate model performance
PROGRESS: Evaluate model M0





Precision and recall summary statistics by cutoff
+--------+---------------------+----------------------+
| cutoff |    mean_precision   |     mean_recall      |
+--------+---------------------+----------------------+
|   1    | 0.06371359223300976 | 0.005702881637244093 |
|   2    | 0.06174150485436893 | 0.010703190811554674 |
|   3    | 0.05865695792880267 | 0.015160416794306259 |
|   4    | 0.05605279126213592 | 0.018885597670174187 |
|   5    | 0.05424757281553395 | 0.02276902169351821  |
|   6    | 0.05132483818770222 | 0.025864677530317515 |
|   7    | 0.05014736477115119 | 0.029620715557916347 |
|   8    | 0.04824029126213591 | 0.03240240761318508  |
|   9    | 0.04645361380798284 | 0.03522456484515842  |
|   10   | 0.04569174757281554 | 0.03813974254524545  |
+--------+---------------------+----------------------+
[10 rows x 3 columns]

PROGRESS: Evaluate model M1





Precision and recall summary statistics by cutoff
+--------+----------------------+----------------------+
| cutoff |    mean_precision    |     mean_recall      |
+--------+----------------------+----------------------+
|   1    | 0.06401699029126212  | 0.006047765463733435 |
|   2    | 0.05961771844660187  | 0.010846710881625006 |
|   3    | 0.05420711974110034  | 0.014793066969642386 |
|   4    | 0.04990898058252427  | 0.018062002291635834 |
|   5    | 0.047087378640776716 | 0.021166859156793342 |
|   6    | 0.044498381877022646 | 0.024063296936069293 |
|   7    | 0.042475728155339745 | 0.026434800781896244 |
|   8    | 0.04095873786407765  | 0.028653912498233253 |
|   9    | 0.03913834951456323  | 0.030727811598540933 |
|   10   | 0.03771237864077663  | 0.03278657605933505  |
+--------+----------------------+----------------------+
[10 rows x 3 columns]



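#### Assuming M0 is the popularity model and M1 the personalized model (the order of the list passed to `compare_models`), the personalized model is slightly ahead only at cutoff 1; from cutoff 3 onward the popularity model has higher mean precision and recall in this run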
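The tables above report precision and recall at each cutoff, averaged over the sampled users. For a single user, precision at cutoff k is the share of the top k recommendations that appear in that user's held-out songs, and recall is the share of the held-out songs that appear in the top k. A minimal sketch with hypothetical data follows; the function `precision_recall_at_k` is illustrative and not part of TuriCreate.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision and recall of the top-k recommendations against held-out items."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical ranked recommendations and held-out songs for one user.
recommended = ["A", "B", "C", "D", "E"]
relevant = {"B", "E", "Z"}

for k in (1, 2, 5):
    print(k, precision_recall_at_k(recommended, relevant, k))
```

Recall can only grow as the cutoff increases, while precision usually falls, which matches the trend in both tables.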

### We can look for the most recommended songs for a subset of users

In [71]:
subset_test_users = test_data['user_id'].unique()[0:1000]

In [72]:
recommendations = personalized_model.recommend(subset_test_users,k=1)

In [73]:
recom_groups = recommendations.groupby('song', operations={'count': turicreate.aggregate.COUNT()})
recom_groups.sort('count', ascending=False).head()

song,count
Sehr kosmisch - Harmonia,54
Undo - Björk,16
You're The One - Dwight Yoakam ...,15
Revelry - Kings Of Leon,10
Hey_ Soul Sister - Train,8
Secrets - OneRepublic,7
I'm Not Calling You A Liar - Florence + The ...,6
Overboard - Justin Bieber / Jessica Jarrell ...,5
U Smile - Justin Bieber,5
Clocks - Coldplay,4
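The group-and-count step can be reproduced in plain Python with `collections.Counter`, which may help when checking the aggregation on a small sample. The `top1` list below is hypothetical data, one top-ranked song per user.

```python
from collections import Counter

# Hypothetical top-1 recommendations, one (user, song) pair per user.
top1 = [
    ("u1", "A"), ("u2", "B"), ("u3", "A"),
    ("u4", "C"), ("u5", "A"), ("u6", "B"),
]

# Count how often each song was the top recommendation, most frequent first.
counts = Counter(song for _, song in top1)
for song, count in counts.most_common():
    print(song, count)
```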
