# Build Up My Own Recommended Playlist from Scratch

Collaborative filtering is usually the first method people try when building a recommender system. It comes in two flavors. The user-user model uses the similarity between users to recommend new items; the item-item model uses the similarity between items instead.
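To make the two views concrete, here is a minimal sketch of cosine similarity on a toy utility matrix. The matrix values and the helper name `cosine_similarity_matrix` are made up for illustration; they are not taken from the dataset or from the notebook code further down.

```python
import numpy as np

# Toy utility matrix: 3 users (rows) x 4 songs (columns).
# The listen counts are invented for illustration only.
toy_ratings = np.array([
    [5.0, 3.0, 0.0, 0.0],
    [4.0, 2.0, 0.0, 1.0],
    [0.0, 0.0, 4.0, 5.0],
])

def cosine_similarity_matrix(matrix):
    """Cosine similarity between every pair of rows of `matrix` (hypothetical helper)."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    normalized = matrix / norms
    return normalized.dot(normalized.T)

# User-user: compare rows of the utility matrix.
user_sim = cosine_similarity_matrix(toy_ratings)    # shape (3, 3)
# Item-item: compare columns, i.e. rows of the transpose.
item_sim = cosine_similarity_matrix(toy_ratings.T)  # shape (4, 4)
```

In this toy data, the first two users listen to the same songs, so their similarity is high, while the first and third users share no songs and their similarity is zero.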

Here, we go beyond collaborative filtering and introduce the latent factor model as applied to recommender systems. In this article, we use the [Million Song Dataset](https://labrosa.ee.columbia.edu/millionsong/), which contains user and song data. The main motivation is that when I use music streaming services like Spotify, KKBOX, and YouTube, the recommended songs often catch my eye. Take Spotify for example: its *Discover Weekly* feature automatically generates a recommended playlist every week, and very often I enjoy listening to the songs it picks. So I thought it would be a great idea to build a recommended playlist using different methods and see what the results look like.

Here are the steps that I take for this experiment:
* Take [Million Song Dataset](https://labrosa.ee.columbia.edu/millionsong/)
* Use user-user based collaborative filtering to build up a recommended playlist
* Use item-item based collaborative filtering to build up a recommended playlist
* Use Latent Factor Model to build up a recommended playlist
* Measure the performance using Mean Squared Error (MSE)
* Compare the results of the different approaches

For collaborative filtering, we follow the same steps as the [previous notebook for recommending movies](https://github.com/johnnychiuchiu/Machine-Learning/blob/master/RecommenderSystem/collaborative_filtering.ipynb).
First, in order to calculate similarity, we need to build a utility matrix from the song dataframe. For illustration purposes, I also manually append three rows to the utility matrix. Each row represents a person with a specific taste in music.
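As a small illustration of what a utility matrix looks like, the sketch below pivots a hypothetical listening log into a user-by-song table. The IDs and counts in `toy_df` are invented; the real notebook does the same reshaping on the song dataframe in `utilityMatrix`.

```python
import pandas as pd

# Hypothetical listening log; the IDs and counts are invented for illustration.
toy_df = pd.DataFrame({
    'user_id': ['u1', 'u1', 'u2', 'u3'],
    'song_id': ['s1', 's2', 's2', 's3'],
    'listen_count': [3, 1, 5, 2],
})

# One row per user, one column per song; user/song pairs with no record become 0.
toy_utility = toy_df.pivot(index='user_id', columns='song_id', values='listen_count').fillna(0)

# Plain NumPy array version, convenient for the matrix operations used later.
toy_matrix = toy_utility.to_numpy()
```

Most cells of a real utility matrix are zero, because each user has listened to only a tiny fraction of all songs.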

We compare the different approaches in two ways. The first is to calculate the Mean Squared Error (MSE); the second, covered in the last section, is to look at the playlists each approach recommends. Since both collaborative filtering and the latent factor model need the whole user-by-song matrix to compute predictions, we cannot build the train and test sets the usual way, that is, by randomly selecting entire rows to hold out.

In the song data, we randomly take 3 listen_count values from each user and move them to the test set. We then use only the training set to predict scores for the recommended playlist. Once we have the predicted scores for all songs, we compare the nonzero values in the test set with the corresponding predicted values and calculate the MSE. I also made sure that each user has listened to more than 5 different songs in the song data.

---

## Implementing Collaborative Filtering to Build Up a Recommended Playlist

In [1]:
%matplotlib inline

import pandas as pd
# sklearn.cross_validation was removed from scikit-learn; model_selection replaces it
from sklearn.model_selection import train_test_split
import numpy as np
import os
from sklearn.metrics import mean_squared_error



In [2]:
def compute_mse(y_true, y_pred):
    """ignore zero terms prior to comparing the mse"""
    mask = np.nonzero(y_true)
    mse = mean_squared_error(y_true[mask], y_pred[mask])
    return mse

def create_train_test(ratings):
    """
    split into training and test sets,
    remove 3 ratings from each user
    and assign them to the test set
    """
    test = np.zeros(ratings.shape)
    train = ratings.copy()
    for user in range(ratings.shape[0]):
        test_index = np.random.choice(
            np.flatnonzero(ratings[user]), size=3, replace=False)

        train[user, test_index] = 0.0
        test[user, test_index] = ratings[user, test_index]

    # assert that training and testing set are truly disjoint
    assert np.all(train * test == 0)
    return (train, test)

In [3]:
class collaborativeFiltering():
    def __init__(self):
        pass

    def readSongData(self, top):
        """
        Read song data from targeted url
        """
        if 'song.pkl' in os.listdir('_data/'):
            song_df = pd.read_pickle('_data/song.pkl')
        else:
            # Read userid-songid-listen_count triplets
            # This step might take time to download data from external sources
            triplets_file = 'https://static.turi.com/datasets/millionsong/10000.txt'
            songs_metadata_file = 'https://static.turi.com/datasets/millionsong/song_data.csv'

            song_df_1 = pd.read_table(triplets_file, header=None)
            song_df_1.columns = ['user_id', 'song_id', 'listen_count']

            # Read song  metadata
            song_df_2 = pd.read_csv(songs_metadata_file)

            # Merge the two dataframes above to create input dataframe for recommender systems
            song_df = pd.merge(song_df_1, song_df_2.drop_duplicates(['song_id']), on="song_id", how="left")



            # Merge song title and artist_name columns to make a merged column
            song_df['song'] = song_df['title'].map(str) + " - " + song_df['artist_name']

            n_users = song_df.user_id.unique().shape[0]
            n_items = song_df.song_id.unique().shape[0]
            print(str(n_users) + ' users')
            print(str(n_items) + ' items')

            song_df.to_pickle('_data/song.pkl')

        # keep top_n rows of the data
        song_df = song_df.head(top)

        song_df = self.drop_freq_low(song_df)

        return(song_df)

    def drop_freq_low(self, song_df):
        """
        Drop users who have 5 or fewer rows (songs) in song_df
        """
        freq_df = song_df.groupby(['user_id']).agg({'song_id': 'count'}).reset_index(level=['user_id'])
        below_userid = freq_df[freq_df.song_id <= 5]['user_id']
        new_song_df = song_df[~song_df.user_id.isin(below_userid)]

        return(new_song_df)

    def utilityMatrix(self, song_df):
        """
        Transform dataframe into utility matrix, return both dataframe and matrix format
        :param song_df: a dataframe that contains user_id, song_id, and listen_count
        :return: dataframe, matrix
        """
        song_reshape = song_df.pivot(index='user_id', columns='song_id', values='listen_count')
        song_reshape = song_reshape.fillna(0)
        # DataFrame.as_matrix() was removed in pandas 1.0; to_numpy() is the replacement
        ratings = song_reshape.to_numpy()
        return(song_reshape, ratings)

    def fast_similarity(self, ratings, kind='user', epsilon=1e-9):
        """
        Calculate the cosine similarity of the rating matrix
        :param ratings: utility matrix
        :param kind: user-user sim or item-item sim
        :param epsilon: small number for handling divide-by-zero errors
        :return: similarity matrix
        """

        if kind == 'user':
            sim = ratings.dot(ratings.T) + epsilon
        elif kind == 'item':
            sim = ratings.T.dot(ratings) + epsilon
        norms = np.array([np.sqrt(np.diagonal(sim))])
        return (sim / norms / norms.T)

    def predict_fast_simple(self, ratings, kind='user'):
        """
        Calculate the predicted score of every song for every user.
        :param ratings: utility matrix
        :param kind: user-user sim or item-item sim
        :return: matrix contains the predicted scores
        """

        similarity = self.fast_similarity(ratings, kind)

        if kind == 'user':
            return similarity.dot(ratings) / np.array([np.abs(similarity).sum(axis=1)]).T
        elif kind == 'item':
            return ratings.dot(similarity) / np.array([np.abs(similarity).sum(axis=1)])

    def get_overall_recommend(self, ratings, song_reshape, user_prediction, top_n=10):
        """
        Get the top_n predicted results for every user. Note that recommended items should be songs that the user
        hasn't listened to before.
        :param ratings: utility matrix
        :param song_reshape: utility matrix in dataframe format
        :param user_prediction: matrix with predicted score
        :param top_n: the number of recommended song
        :return: a dict contains recommended songs for every user_id
        """
        result = dict({})
        for i, row in enumerate(ratings):
            user_id = song_reshape.index[i]
            result[user_id] = {}
            zero_item_list = np.where(row == 0)[0]
            prob_list = user_prediction[i][zero_item_list]
            song_id_list = np.array(song_reshape.columns)[zero_item_list]
            result[user_id]['recommend'] = sorted(zip(song_id_list, prob_list), key=lambda item: item[1], reverse=True)[
                                           0:top_n]

        return (result)

    def get_user_recommend(self, user_id, overall_recommend, song_df):
        """
        Get the recommended songs for a particular user using the song information from the song_df
        :param user_id:
        :param overall_recommend:
        :return:
        """
        user_score = pd.DataFrame(overall_recommend[user_id]['recommend']).rename(columns={0: 'song_id', 1: 'score'})
        user_recommend = pd.merge(user_score,
                                  song_df[['song_id', 'title', 'release', 'artist_name', 'song']].drop_duplicates(),
                                  on='song_id', how='left')
        return (user_recommend)

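    def createNewObs(self, artistName, song_reshape, index_name):
        """
        Append a new row, indexed by index_name, for a user who is interested in some specific artists
        :param artistName: a list of artist names
        :param song_reshape: utility matrix in dataframe format
        :param index_name: index label (user id) for the new row
        :return: dataframe, matrix
        """
        # NOTE: this relies on the notebook-level song_df variable created in a later cell
        liked_songs = set(song_df[song_df.artist_name.isin(artistName)]['song_id'].unique())
        interest = []
        for i in song_reshape.columns:
            # give every song by the chosen artists a listen count of 10
            if i in liked_songs:
                interest.append(10)
            else:
                interest.append(0)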

        print(pd.Series(interest).value_counts())

        newobs = pd.DataFrame([interest],
                              columns=song_reshape.columns)
        newobs.index = [index_name]

        new_song_reshape = pd.concat([song_reshape, newobs])
        # DataFrame.as_matrix() was removed in pandas 1.0; to_numpy() is the replacement
        new_ratings = new_song_reshape.to_numpy()
        return (new_song_reshape, new_ratings)

## Take Million Song Dataset

We only keep the first 100,000 rows for this notebook; otherwise it would take too long to run. After dropping users with 5 or fewer songs, about **3.5k** users and **9.9k** different songs remain, as shown below.

In [4]:
cf = collaborativeFiltering()
song_df = cf.readSongData(100000)

In [5]:
song_df.head()

Unnamed: 0,user_id,song_id,listen_count,title,release,artist_name,year,song
0,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOAKIMP12A8C130995,1,The Cove,Thicker Than Water,Jack Johnson,0,The Cove - Jack Johnson
1,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOBBMDR12A8C13253B,2,Entre Dos Aguas,Flamenco Para Niños,Paco De Lucia,1976,Entre Dos Aguas - Paco De Lucia
2,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOBXHDL12A81C204C0,1,Stronger,Graduation,Kanye West,2007,Stronger - Kanye West
3,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOBYHAJ12A6701BF1D,1,Constellations,In Between Dreams,Jack Johnson,2005,Constellations - Jack Johnson
4,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SODACBL12A8C13C273,1,Learn To Fly,There Is Nothing Left To Lose,Foo Fighters,1999,Learn To Fly - Foo Fighters


In [6]:
artist_df= song_df.groupby(['artist_name']).agg({'song_id':'count'}).reset_index(level=['artist_name']).sort_values(by='song_id',ascending=False).head(100)

In [7]:
n_users = song_df.user_id.unique().shape[0]
n_items = song_df.song_id.unique().shape[0]
print(str(n_users) + ' users')
print(str(n_items) + ' songs')

3464 users
9930 songs


• **Get the utility matrix**

In [8]:
song_reshape, ratings = cf.utilityMatrix(song_df)

• **Append new rows to simulate three users with specific music tastes**

In [9]:
song_reshape, ratings = cf.createNewObs(['Beyoncé', 'Katy Perry', 'Alicia Keys'], song_reshape, 'GirlFan')
song_reshape, ratings = cf.createNewObs(['Metallica', 'Guns N\' Roses', 'Linkin Park', 'Red Hot Chili Peppers'],
                                        song_reshape, 'HeavyFan')
song_reshape, ratings = cf.createNewObs(['Daft Punk','John Mayer','Hot Chip','Coldplay'],
                                        song_reshape, 'Johnny')

0     9876
10      54
dtype: int64
0     9801
10     129
dtype: int64
0     9747
10     183
dtype: int64


• **Create train test dataset**

In [10]:
train, test = create_train_test(ratings)

In [11]:
song_reshape.shape

(3467, 9930)

## Calculate user-user collaborative filtering

In [12]:
user_prediction = cf.predict_fast_simple(train, kind='user')
user_overall_recommend = cf.get_overall_recommend(train, song_reshape, user_prediction, top_n=10)
user_recommend_girl = cf.get_user_recommend('GirlFan', user_overall_recommend, song_df)
user_recommend_heavy = cf.get_user_recommend('HeavyFan', user_overall_recommend, song_df)
user_recommend_johnny = cf.get_user_recommend('Johnny', user_overall_recommend, song_df)

## Calculate item-item collaborative filtering

In [13]:
item_prediction = cf.predict_fast_simple(train, kind='item')
item_overall_recommend = cf.get_overall_recommend(train, song_reshape, item_prediction, top_n=10)
item_recommend_girl = cf.get_user_recommend('GirlFan', item_overall_recommend, song_df)
item_recommend_heavy = cf.get_user_recommend('HeavyFan', item_overall_recommend, song_df)
item_recommend_johnny = cf.get_user_recommend('Johnny', item_overall_recommend, song_df)

---

The main idea behind the latent factor model is that we can approximate our utility matrix as the product of two lower-rank matrices. For example, if we have 5 users and 10 songs, our utility matrix is 5 x 10. We can factor it into two matrices, say a 5 x 3 matrix Q and a 3 x 10 matrix P. Each user is then represented by a 3-dimensional vector, and each song is also represented by a 3-dimensional vector. A dimension of Q might capture, for example, whether the user likes jazz-related music; the matching dimension of P might capture whether the song is a jazz song. The picture below, copied from a Google search result, visualizes this more clearly:

![](_pic/latentfactor.png)
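The shapes in the example above can be checked with a few lines of NumPy. This is only a sketch with random factors: `Q` and `P` are filled with arbitrary numbers rather than learned values, and the variable names are chosen here to match the text.

```python
import numpy as np

rng = np.random.default_rng(0)

n_users, n_songs, n_factors = 5, 10, 3

# Random stand-ins for the learned factors (illustration only).
Q = rng.random((n_users, n_factors))   # one 3-dimensional vector per user
P = rng.random((n_factors, n_songs))   # one 3-dimensional vector per song

# Multiplying the two small matrices gives a full 5 x 10 score matrix.
approx_utility = Q.dot(P)

# The score for one (user, song) pair is the dot product of their vectors.
score = Q[2].dot(P[:, 7])
```

Storing `Q` and `P` takes 5 x 3 + 3 x 10 = 45 numbers instead of the 50 in the full matrix; with thousands of users and songs the saving is far larger.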

To find the values in Q and P, we need an optimization method. One method that came out of the Netflix Prize competition is **Alternating Least Squares with Weighted Regularization (ALS-WR)**.

Our cost function is as follows:

$$ \begin{align} L &= \sum\limits_{u,i \in S}( r_{ui} - \textbf{x}_{u} \textbf{y}_{i}^{T} )^{2} + \lambda \big( \sum\limits_{u} \left\Vert \textbf{x}_{u} \right\Vert^{2} + \sum\limits_{i} \left\Vert \textbf{y}_{i} \right\Vert^{2} \big) \end{align} $$

We minimize this loss function to get the optimal $x_u$ and $y_i$ vectors, which are the rows of Q and the columns of P respectively. The main idea behind ALS-WR is to hold one set of vectors fixed at a time: fix the item vectors and solve for the user vectors, then fix the user vectors and solve for the item vectors, alternating back and forth until Q and P converge. We don't optimize both at once because the joint problem is not convex and is hard to solve directly, whereas with one side fixed it reduces to a regularized least-squares problem with a closed-form solution.
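As a sketch of what one alternating step looks like, assume the sum runs over every entry of a user's row (this is what the `_als_step` code in the next section does, treating zeros as observed values). Stack the item vectors $\textbf{y}_{i}$ as the rows of a matrix $Y$ and let $\textbf{r}_{u}$ be user $u$'s row of the utility matrix; $Y$ and $\textbf{r}_{u}$ are notation introduced here for illustration. With $Y$ fixed, the part of the loss that depends on $\textbf{x}_{u}$ and its minimizer are:

$$ \begin{align} L_{u} &= \left\Vert \textbf{r}_{u} - \textbf{x}_{u} Y^{T} \right\Vert^{2} + \lambda \left\Vert \textbf{x}_{u} \right\Vert^{2} \\ \frac{\partial L_{u}}{\partial \textbf{x}_{u}} &= -2 ( \textbf{r}_{u} - \textbf{x}_{u} Y^{T} ) Y + 2 \lambda \textbf{x}_{u} = 0 \\ \textbf{x}_{u} &= \textbf{r}_{u} Y ( Y^{T} Y + \lambda I )^{-1} \end{align} $$

The update for $\textbf{y}_{i}$ has the same form with the roles of users and items swapped: the user vectors are held fixed and the song's column of the utility matrix takes the place of $\textbf{r}_{u}$.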

For a detailed explanation of how ALS-WR works, please check [Ethen's Alternating Least Squares with Weighted Regularization (ALS-WR) from scratch](http://nbviewer.jupyter.org/github/ethen8181/machine-learning/blob/master/recsys/1_ALSWR.ipynb).

## Recommend using Latent Factor Model

In [14]:
class ExplicitMF:
    """
    This class is taken directly from Ethen's GitHub (http://nbviewer.jupyter.org/github/ethen8181/machine-learning/blob/master/recsys/1_ALSWR.ipynb)
    Train a matrix factorization model using Alternating Least Squares
    to predict empty entries in a matrix

    Parameters
    ----------
    n_iters : int
        number of iterations to train the algorithm

    n_factors : int
        number of latent factors to use in matrix
        factorization model, some machine-learning libraries
        denote this as rank

    reg : float
        regularization term for item/user latent factors,
        since lambda is a keyword in python we use reg instead
    """

    def __init__(self, n_iters, n_factors, reg):
        self.reg = reg
        self.n_iters = n_iters
        self.n_factors = n_factors

    def fit(self, train):
        """
        fit the model on the training set, assuming it is
        in the form of a User x Item matrix with cells as ratings
        (the original version also took a test set to record
        convergence; that tracking is disabled here)
        """
        self.n_user, self.n_item = train.shape
        self.user_factors = np.random.random((self.n_user, self.n_factors))
        self.item_factors = np.random.random((self.n_item, self.n_factors))

        # alternate between solving for the user factors (items fixed)
        # and the item factors (users fixed)
        for _ in range(self.n_iters):
            self.user_factors = self._als_step(train, self.user_factors, self.item_factors)
            self.item_factors = self._als_step(train.T, self.item_factors, self.user_factors)

        return self

    def _als_step(self, ratings, solve_vecs, fixed_vecs):
        """
        when updating the user matrix,
        the item matrix is the fixed vector and vice versa
        """
        A = fixed_vecs.T.dot(fixed_vecs) + np.eye(self.n_factors) * self.reg
        b = ratings.dot(fixed_vecs)
        A_inv = np.linalg.inv(A)
        solve_vecs = b.dot(A_inv)
        return solve_vecs

    def predict(self):
        """predict ratings for every user and item"""
        pred = self.user_factors.dot(self.item_factors.T)
        return pred

• **Fit using the Alternating Least Squares method**

In [15]:
als = ExplicitMF(n_iters=200, n_factors=10, reg=0.01)
als.fit(train)

<__main__.ExplicitMF at 0x118355d30>

In [16]:
latent_prediction = als.predict()
latent_overall_recommend = cf.get_overall_recommend(train, song_reshape, latent_prediction, top_n=10)
latent_recommend_girl = cf.get_user_recommend('GirlFan', latent_overall_recommend, song_df)
latent_recommend_heavy = cf.get_user_recommend('HeavyFan', latent_overall_recommend, song_df)
latent_recommend_johnny = cf.get_user_recommend('Johnny', latent_overall_recommend, song_df)

## Measure the performance using Mean Squared Error (MSE)

In [17]:
user_mse = compute_mse(test, user_prediction)
item_mse = compute_mse(test, item_prediction)
latent_mse = compute_mse(test, latent_prediction)

print("MSE for user-user approach: "+str(user_mse))
print("MSE for item-item approach: "+str(item_mse))
print("MSE for latent factor model: "+str(latent_mse))

MSE for user-user approach: 130.285737775
MSE for item-item approach: 131.051978749
MSE for latent factor model: 131.328796261


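We can see that even though the latent factor model is a somewhat more advanced model, its MSE is not the lowest; in fact, the three values are nearly identical. One possible reason is that `ExplicitMF` as written fits every cell of the training matrix, including the zeros for songs a user never played, so its predictions are pulled toward zero. It is something that I should keep in mind.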
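If an error in the same units as `listen_count` is preferred, the square root of the MSE (the RMSE) can be reported instead. The helper below is a hypothetical sketch that mirrors `compute_mse`; `compute_rmse` and the toy arrays are not part of the original notebook.

```python
import numpy as np

def compute_rmse(y_true, y_pred):
    """RMSE over the nonzero entries of y_true (hypothetical helper)."""
    mask = np.nonzero(y_true)
    return np.sqrt(np.mean((y_true[mask] - y_pred[mask]) ** 2))

# Toy example with invented values: only the two nonzero test cells are scored.
toy_test = np.array([[0.0, 2.0],
                     [4.0, 0.0]])
toy_pred = np.array([[1.0, 1.0],
                     [1.0, 3.0]])

# Errors on the scored cells are 1 and 3, so the MSE is 5 and the RMSE is sqrt(5).
toy_rmse = compute_rmse(toy_test, toy_pred)
```

Because the square root is monotonic, ranking the three approaches by RMSE gives the same order as ranking them by MSE.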

## Compare the result of different approachs

### > Recommended playlist for someone who is a big fan of *Beyoncé*, *Katy Perry* and *Alicia Keys*

• **User-user approach**

In [18]:
user_recommend_girl

Unnamed: 0,song_id,score,title,release,artist_name,song
0,SOFRQTD12A81C233C0,1.219399,Sehr kosmisch,Musik von Harmonia,Harmonia,Sehr kosmisch - Harmonia
1,SOAWJSH12A8C13AE09,0.53002,Diary,The Diary Of Alicia Keys,Alicia Keys featuring Tony! Toni! Toné! and Je...,Diary - Alicia Keys featuring Tony! Toni! Toné...
2,SONYKOW12AB01849C9,0.512538,Secrets,Waking Up,OneRepublic,Secrets - OneRepublic
3,SOPTLQL12AB018D56F,0.488574,Billionaire [feat. Bruno Mars] (Explicit Albu...,Billionaire [feat. Bruno Mars],Travie McCoy,Billionaire [feat. Bruno Mars] (Explicit Albu...
4,SOFKABN12A8AE476C6,0.452344,Just Dance,Just Dance,Lady GaGa / Colby O'Donis,Just Dance - Lady GaGa / Colby O'Donis
5,SOAUWYT12A81C206F1,0.445955,Undo,Vespertine Live,Björk,Undo - Björk
6,SOXFPND12AB017C9D1,0.435067,I Gotta Feeling,Todo Éxitos 2009,Black Eyed Peas,I Gotta Feeling - Black Eyed Peas
7,SODJWHY12A8C142CCE,0.41338,Hey_ Soul Sister,Save Me_ San Francisco,Train,Hey_ Soul Sister - Train
8,SOHTKMO12AB01843B0,0.400156,Catch You Baby (Steve Pitron & Max Sanna Radio...,Catch You Baby,Lonnie Gordon,Catch You Baby (Steve Pitron & Max Sanna Radio...
9,SOUWBLM12A8C1353D7,0.396952,EMOTIONS,The Remixes,Mariah Carey,EMOTIONS - Mariah Carey


• **Item-item approach**

In [19]:
item_recommend_girl

Unnamed: 0,song_id,score,title,release,artist_name,song
0,SONLMAQ12A58A7C17E,1.979151,It Kills Me,The Bridge,Melanie Fiona,It Kills Me - Melanie Fiona
1,SOEVGOY12AB0181053,1.848646,Be With You,Freedom,Akon,Be With You - Akon
2,SOUWBLM12A8C1353D7,1.83634,EMOTIONS,The Remixes,Mariah Carey,EMOTIONS - Mariah Carey
3,SOUVXMC12A8C13FBD3,1.775607,The Storm Is Over Now,Kirk Franklin Presents: Songs For The Storm_ V...,Gods Property,The Storm Is Over Now - Gods Property
4,SOYINGU12AAF3B3314,1.649004,Now Behold The Lamb,Christmas,Kirk Franklin & The Family,Now Behold The Lamb - Kirk Franklin & The Family
5,SOXAGPE12A6D4F9496,1.581293,Picture U & Me,Back Seat Beats,Mo B. Dick,Picture U & Me - Mo B. Dick
6,SOTJTUX12AB018247F,1.490161,Keep You Much Longer,Freedom,Akon,Keep You Much Longer - Akon
7,SOPZFDD12A8C13AE12,1.373254,A Prayer,Red Balloon,East Blues Experience,A Prayer - East Blues Experience
8,SOBVFLL12AF72A4EE8,1.32658,Intuition,In A Perfect World...,Keri Hilson,Intuition - Keri Hilson
9,SOOXYLM12AB018472B,1.288548,Mr. Man,Songs In A Minor,Alicia Keys;Jimmy Cozier,Mr. Man - Alicia Keys;Jimmy Cozier


• **Latent Factor Model**

In [20]:
latent_recommend_girl

Unnamed: 0,song_id,score,title,release,artist_name,song
0,SOAUWYT12A81C206F1,0.713664,Undo,Vespertine Live,Björk,Undo - Björk
1,SOEGIYH12A6D4FC0E3,0.212513,Horn Concerto No. 4 in E flat K495: II. Romanc...,Mozart - Eine kleine Nachtmusik,Barry Tuckwell/Academy of St Martin-in-the-Fie...,Horn Concerto No. 4 in E flat K495: II. Romanc...
2,SOQXBZN12AB018610A,0.135733,Waiting For A Dream,Want,Rufus Wainwright,Waiting For A Dream - Rufus Wainwright
3,SOPUCYA12A8C13A694,0.095225,Canada,The End Is Here,Five Iron Frenzy,Canada - Five Iron Frenzy
4,SOHTKMO12AB01843B0,0.094304,Catch You Baby (Steve Pitron & Max Sanna Radio...,Catch You Baby,Lonnie Gordon,Catch You Baby (Steve Pitron & Max Sanna Radio...
5,SOWEHOM12A6BD4E09E,0.085335,16 Candles,16 Candles,The Crests,16 Candles - The Crests
6,SOBCMUG12AB017D50A,0.073343,Move Shake Drop Remix,Category 6,DJ Laz,Move Shake Drop Remix - DJ Laz
7,SOFRQTD12A81C233C0,0.069015,Sehr kosmisch,Musik von Harmonia,Harmonia,Sehr kosmisch - Harmonia
8,SOUFTBI12AB0183F65,0.067117,Invalid,Fermi Paradox,Tub Ring,Invalid - Tub Ring
9,SOQQQMM12A6310DFCC,0.063846,(Nice Dream),The Bends (Collectors Edition),Radiohead,(Nice Dream) - Radiohead


### > Recommended playlist for someone who is a big fan of *Metallica*, *Guns N' Roses*, *Linkin Park* and *Red Hot Chili Peppers*

In [21]:
user_recommend_heavy

Unnamed: 0,song_id,score,title,release,artist_name,song
0,SOHNRJO12A8AE44A2B,0.792604,Duality (Album Version),Vol. 3 The Subliminal Verses,Slipknot,Duality (Album Version) - Slipknot
1,SOBONKR12A58A7A7E0,0.445356,You're The One,If There Was A Way,Dwight Yoakam,You're The One - Dwight Yoakam
2,SOFRQTD12A81C233C0,0.431592,Sehr kosmisch,Musik von Harmonia,Harmonia,Sehr kosmisch - Harmonia
3,SOAUWYT12A81C206F1,0.409792,Undo,Vespertine Live,Björk,Undo - Björk
4,SOEGIYH12A6D4FC0E3,0.374046,Horn Concerto No. 4 in E flat K495: II. Romanc...,Mozart - Eine kleine Nachtmusik,Barry Tuckwell/Academy of St Martin-in-the-Fie...,Horn Concerto No. 4 in E flat K495: II. Romanc...
5,SOLRGNF12AB0187CF4,0.364115,Sample Track 2,Dance & Hip Hop Breaks,Simon Harris,Sample Track 2 - Simon Harris
6,SONYKOW12AB01849C9,0.304168,Secrets,Waking Up,OneRepublic,Secrets - OneRepublic
7,SOSXLTC12AF72A7F54,0.282448,Revelry,Only By The Night,Kings Of Leon,Revelry - Kings Of Leon
8,SODJWHY12A8C142CCE,0.265619,Hey_ Soul Sister,Save Me_ San Francisco,Train,Hey_ Soul Sister - Train
9,SONIQRE12AF72A2B02,0.247181,Bring Me To Life,Fallen,Evanescence,Bring Me To Life - Evanescence


In [22]:
item_recommend_heavy

Unnamed: 0,song_id,score,title,release,artist_name,song
0,SOKNDKQ12A58A795CC,5.535981,Boom (2006 Remastered Album Version),Greatest Hits [The Atlantic Years],P.O.D.,Boom (2006 Remastered Album Version) - P.O.D.
1,SOQGYQP12A8C1322D9,2.922603,Down Rodeo,Evil Empire,Rage Against The Machine,Down Rodeo - Rage Against The Machine
2,SOHNRJO12A8AE44A2B,2.231694,Duality (Album Version),Vol. 3 The Subliminal Verses,Slipknot,Duality (Album Version) - Slipknot
3,SOZYIQR12A58A7DB25,2.131937,Overburdened (Album Version),Ten Thousand Fists,Disturbed,Overburdened (Album Version) - Disturbed
4,SOXBRKW12A8C142084,2.115472,'Til We Die (Album Version),All Hope Is Gone,Slipknot,'Til We Die (Album Version) - Slipknot
5,SOXQBCW12AB018704A,2.031804,Own Little World,ROUNDERS,Celldweller,Own Little World - Celldweller
6,SOCRIYW12A8C143467,1.828914,Ella Elle L A,Free,Kate Ryan,Ella Elle L A - Kate Ryan
7,SOVDMUW12A8AE45BC1,1.824179,Skin Ticket (Album Version),Iowa,Slipknot,Skin Ticket (Album Version) - Slipknot
8,SOMRAUN12A6D4F5224,1.807502,Departure (Album Version),Ascendancy,Trivium,Departure (Album Version) - Trivium
9,SOBNLQK12A8C131F2E,1.760694,I'm Yours (Album Version),We Sing. We Dance. We Steal Things.,Jason Mraz,I'm Yours (Album Version) - Jason Mraz


In [23]:
latent_recommend_heavy

Unnamed: 0,song_id,score,title,release,artist_name,song
0,SOAUWYT12A81C206F1,1.195373,Undo,Vespertine Live,Björk,Undo - Björk
1,SOBONKR12A58A7A7E0,0.523214,You're The One,If There Was A Way,Dwight Yoakam,You're The One - Dwight Yoakam
2,SOANOQW12A58A793D2,0.391354,Cold Blooded (Acid Cleanse),Machine Punk Music,The fFormula,Cold Blooded (Acid Cleanse) - The fFormula
3,SOEGIYH12A6D4FC0E3,0.348488,Horn Concerto No. 4 in E flat K495: II. Romanc...,Mozart - Eine kleine Nachtmusik,Barry Tuckwell/Academy of St Martin-in-the-Fie...,Horn Concerto No. 4 in E flat K495: II. Romanc...
4,SOHTKMO12AB01843B0,0.205647,Catch You Baby (Steve Pitron & Max Sanna Radio...,Catch You Baby,Lonnie Gordon,Catch You Baby (Steve Pitron & Max Sanna Radio...
5,SOUFTBI12AB0183F65,0.17542,Invalid,Fermi Paradox,Tub Ring,Invalid - Tub Ring
6,SOPUCYA12A8C13A694,0.16829,Canada,The End Is Here,Five Iron Frenzy,Canada - Five Iron Frenzy
7,SOQXBZN12AB018610A,0.163835,Waiting For A Dream,Want,Rufus Wainwright,Waiting For A Dream - Rufus Wainwright
8,SOWEHOM12A6BD4E09E,0.143406,16 Candles,16 Candles,The Crests,16 Candles - The Crests
9,SOBCMUG12AB017D50A,0.12105,Move Shake Drop Remix,Category 6,DJ Laz,Move Shake Drop Remix - DJ Laz


### > Recommended playlist for myself: I like *Daft Punk*, *John Mayer*, *Hot Chip* and *Coldplay*

In [24]:
user_recommend_johnny

Unnamed: 0,song_id,score,title,release,artist_name,song
0,SOFRQTD12A81C233C0,0.625776,Sehr kosmisch,Musik von Harmonia,Harmonia,Sehr kosmisch - Harmonia
1,SOAXGDH12A8C13F8A1,0.535344,Dog Days Are Over (Radio Edit),Now That's What I Call Music! 75,Florence + The Machine,Dog Days Are Over (Radio Edit) - Florence + Th...
2,SONYKOW12AB01849C9,0.521639,Secrets,Waking Up,OneRepublic,Secrets - OneRepublic
3,SOBONKR12A58A7A7E0,0.487021,You're The One,If There Was A Way,Dwight Yoakam,You're The One - Dwight Yoakam
4,SOTWNDJ12A8C143984,0.422357,Marry Me,Save Me_ San Francisco,Train,Marry Me - Train
5,SODJWHY12A8C142CCE,0.390983,Hey_ Soul Sister,Save Me_ San Francisco,Train,Hey_ Soul Sister - Train
6,SOAUWYT12A81C206F1,0.372784,Undo,Vespertine Live,Björk,Undo - Björk
7,SOSXLTC12AF72A7F54,0.371595,Revelry,Only By The Night,Kings Of Leon,Revelry - Kings Of Leon
8,SOWCKVR12A8C142411,0.345242,Use Somebody,Use Somebody,Kings Of Leon,Use Somebody - Kings Of Leon
9,SOLFXKT12AB017E3E0,0.322854,Fireflies,Karaoke Monthly Vol. 2 (January 2010),Charttraxx Karaoke,Fireflies - Charttraxx Karaoke


In [25]:
item_recommend_johnny

Unnamed: 0,song_id,score,title,release,artist_name,song
0,SOJLYEB12A6D4F9750,3.74306,Angel,Beneath These Fireworks,Matt Nathanson,Angel - Matt Nathanson
1,SOACMJJ12A6D4FC66A,3.397123,Legacy Of Kings,Steel Meets Steel - 10 Years Of Glory,HAMMERFALL,Legacy Of Kings - HAMMERFALL
2,SOJZVDN12A8C133C0B,3.013383,With Oden On Our Side,With Oden On Our Side,Amon Amarth,With Oden On Our Side - Amon Amarth
3,SOAWXHP12AB01840AC,2.330975,Lemonade,Grey Oceans,Cocorosie,Lemonade - Cocorosie
4,SOMFPCO12AF72ABFC2,2.240123,Falling Or Flying,Grey's Anatomy Volume 3 Original Soundtrack,Grace Potter and the Nocturnals,Falling Or Flying - Grace Potter and the Noctu...
5,SOXEJIX12A8C13B47E,2.039762,I've Got The World On A String,Evening Soundtrack,Michael Bublé,I've Got The World On A String - Michael Bublé
6,SOOIFDD12A8C13C468,2.012411,Monsters (Album Version),Voices,Matchbook Romance,Monsters (Album Version) - Matchbook Romance
7,SOLUKGT12A67ADF6B0,1.964689,Just Like Honey,21 Singles,The Jesus And Mary Chain,Just Like Honey - The Jesus And Mary Chain
8,SOFESLM12AB017ED43,1.926129,Play On,Play On,Carrie Underwood,Play On - Carrie Underwood
9,SOZVZSP12A6D4F6A99,1.886486,Rosemary,Scott 3,Scott Walker,Rosemary - Scott Walker


In [26]:
latent_recommend_johnny

Unnamed: 0,song_id,score,title,release,artist_name,song
0,SOAUWYT12A81C206F1,1.293457,Undo,Vespertine Live,Björk,Undo - Björk
1,SOBONKR12A58A7A7E0,0.906564,You're The One,If There Was A Way,Dwight Yoakam,You're The One - Dwight Yoakam
2,SOUFTBI12AB0183F65,0.888223,Invalid,Fermi Paradox,Tub Ring,Invalid - Tub Ring
3,SOHKZSM12A8C13E5D5,0.756217,(They Long To Be) Close To You,The Essential Collection (1965-1997),Carpenters,(They Long To Be) Close To You - Carpenters
4,SOQXBZN12AB018610A,0.53703,Waiting For A Dream,Want,Rufus Wainwright,Waiting For A Dream - Rufus Wainwright
5,SOEGIYH12A6D4FC0E3,0.418033,Horn Concerto No. 4 in E flat K495: II. Romanc...,Mozart - Eine kleine Nachtmusik,Barry Tuckwell/Academy of St Martin-in-the-Fie...,Horn Concerto No. 4 in E flat K495: II. Romanc...
6,SOHTKMO12AB01843B0,0.261271,Catch You Baby (Steve Pitron & Max Sanna Radio...,Catch You Baby,Lonnie Gordon,Catch You Baby (Steve Pitron & Max Sanna Radio...
7,SOSXLTC12AF72A7F54,0.2315,Revelry,Only By The Night,Kings Of Leon,Revelry - Kings Of Leon
8,SOPUCYA12A8C13A694,0.189099,Canada,The End Is Here,Five Iron Frenzy,Canada - Five Iron Frenzy
9,SOFRQTD12A81C233C0,0.159831,Sehr kosmisch,Musik von Harmonia,Harmonia,Sehr kosmisch - Harmonia


We see that the recommended playlists mostly make sense. In the following notebook, I will continue to try some other methods to build up custom recommended playlists.

### Reference

* http://nbviewer.jupyter.org/github/ethen8181/machine-learning/blob/master/recsys/1_ALSWR.ipynb
* https://github.com/dvysardana/RecommenderSystems_PyData_2016/blob/master/Song%20Recommender_Python.ipynb