# Spotify Collaborative Filtering
The goal of this recommender is to look at a playlist and recommend similar new tracks, by only comparing what songs other users have added to their own playlists.
Using just which songs exist in the playlists means the user doesn't have to give a score/like for every track in order to get recommendations, and neither does the recommender need it to produce valuable insights. Instead it is implied that a user adds a specific track to their playlist because they find the song good/compelling, and fitting for the playlist they are building.

For data we use the Spotify million playlist dataset[0].

Following the recommender systems chapter from *Data Science from Scratch, 2nd Edition by Joel Grus*[1], we have implemented a recommender which can do both user-based collaborative filtering (playlists with songs), and also item-based collaborative filtering (songs with playlists).

[0]https://www.aicrowd.com/challenges/spotify-million-playlist-dataset-challenge

[1]https://www.oreilly.com/library/view/data-science-from/9781492041122/

## Dataset
A json or two of the Spotify million playlist dataset are used to supply data to the recommender.
The MPD does not contain any auditory properties of the tracks, so the goal is to find correlations only based on what items the playlists contain.
As seen in the exploratory notebook, the playlists come with a bunch of meta data such as number of tracks/albums/artists, the Spotify URI etc. 

A sample playlist looks like the following:
```
{
            "name": "TRVP", 
            "collaborative": "false", 
            "pid": 26439, 
            "modified_at": 1505260800, 
            "num_tracks": 10, 
            "num_albums": 9, 
            "num_followers": 1, 
            "tracks": [
                {
                    "pos": 0, 
                    "artist_name": "Sfera Ebbasta", 
                    "track_uri": "spotify:track:6Xq4v660hdyAwFiu9nXRlZ", 
                    "artist_uri": "spotify:artist:23TFHmajVfBtlRx5MXqgoz", 
                    "track_name": "Ciny", 
                    "album_uri": "spotify:album:6GIzEUMuSVY9TgdfZvzmBD", 
                    "duration_ms": 166346, 
                    "album_name": "XDVR Reloaded"
                }, 
                {
                    "pos": 1, 
                    "artist_name": "Sfera Ebbasta", 
                    "track_uri": "spotify:track:0QuQVxpCB1zd7xIBAcgjuR", 
                    "artist_uri": "spotify:artist:23TFHmajVfBtlRx5MXqgoz", 
                    "track_name": "XDVRMX", 
                    "album_uri": "spotify:album:6GIzEUMuSVY9TgdfZvzmBD", 
                    "duration_ms": 202445, 
                    "album_name": "XDVR Reloaded"
                }, 
                {
                    "pos": 2, 
                    "artist_name": "Sfera Ebbasta", 
                    "track_uri": "spotify:track:1S2VLr8j9KVbroLn76RF4i", 
                    "artist_uri": "spotify:artist:23TFHmajVfBtlRx5MXqgoz", 
                    "track_name": "Tran Tran", 
                    "album_uri": "spotify:album:6no7XEwjK8pAELt9aClXFx", 
                    "duration_ms": 220968, 
                    "album_name": "Tran Tran"
                }, 
                {
                ...
                },
}
```

Only the playlist, artist and song name will be used. Artist+song name will be combined to be used for vectorization and the playlist name will be used to produce funny non-sensical name recommendations.

One json contains 1000 playlists, and more than a couple files is too much for this library-less solution(except using numpys dot product for a bit more performance), since Python is a really slow data when it comes to processing large data unless it is calling C/native code functions in the background. The item-based recommender takes especially long with 2 jsons, 1 hour for a single run! 1 json takes ~3 minutes.


## Assessment
There's no assessment function, but from what we've interpreted of these implementations, it would be hard to assess because mostly the popular songs are recommended, so if we removed a few songs to see if they'd get re-recommended, they probably wouldn't if they were more niche tracks. So a "qualitative" method will be used instead.

## Collaborative filters
The recommender is based on the explanations and code in Joel Grus' book `Data Science from Scratch, 2nd Edition`. While the lecture material went through the user-based approach, we also delved deeper into the actual book and implemented an item-based approach the book is talking about.

### User-based
The user-based approach looks at what other users have added to their playlist and compares the supplied playlist/songs with those.
Each playlist is transformed into a sparse feature vector, where the vector is `num_unique_songs` long and the respective positions represent a unique song. When *Jane Doe* adds her favourite country songs to her playlist, the playlist's vector is compared in similarity to all other existing vectors using cosine similarity. A lower value means the playlists are less similar. Then we go through each playlist’s tracks, assigning the track names to a big dictionary and += the songs with their own playlist's cosine similarity value.

Similar playlists will give much more weight to their songs, and likewise very popular songs will gain a high score, even though these songs might be in very similar playlists. The end result is a list of songs with decent chance of being similar to Jane Doe's favourites.


### Item-based
The item-based approach is quite similar to user-based, with the big difference being the playlist vectors being turned around into a list of songs, its feature vector is a long list indicating which playlist the song exists in. The matrix is `num_unique_songsrows` x `num_playlists`. Jane Doe now has a sparse feature vector for each song in her playlist but is it still the same type of logic as a user-based, where the vectors get a cosine between each other, and these similarities are added up for the tracks to produce a ranking of most similar songs.

The advantage of an item-based approach could perhaps be that if Jane adds a track of very different genre compared to what her usual listening, we can now compare the similarity of this one track to all others and get similar tracks (without having to compare her playlist as a whole).

The advantage of an item-based approach could perhaps be that if John adds a track of very different genre compared to what his usual listening, we can now compare the similarity of this one track to all others and get similar tracks. (without having to compare his playlist as a whole)

## Examples


In [1]:
import json
import os
import numpy as np
import math
from typing import List, Tuple, Union
from collections import defaultdict
import random
import time
import pickle


class Collaborative_Recommender:
    Vector = List[int]

    def __init__(self, path: str, rewrite_data: bool = False, playlist_limit: int = 0) -> None:
        """
        path: file path to the spotify jsons. Uses all jsons found.
        rewrite_data: True = Always create new vectors based off the jsons, False = Try to use pickled
            vectors, if not found generate new vectors and pickle them.
        playlist_limit: Limit the number of playlists to use
        """
        self.playlists = []
        self.path = path
        self.rewrite_data = rewrite_data
        self.load_data(path, playlist_limit)
        # Shuffle playlists as to reduce possible bias
        random.shuffle(self.playlists)

    def load_data(self, data_path: str, limit: int = 0):
        try:
            print("Loading the found jsons:", os.listdir(data_path))
            for filename in os.listdir(data_path):
                with open(os.path.join(data_path, filename), "rt", encoding="utf-8") as f:
                    self.playlists.extend(json.load(f)["playlists"])
            print(f"Loaded {len(os.listdir(data_path))} jsons")

            if limit > 0:
                self.playlists = self.playlists[:limit]
        except FileNotFoundError as e:
            print("File not found:", e)

    def extract_unique_songs(self):
        """Puts all unique songs from the playlist into an array"""
        self.unique_songs = sorted(
            {
                f"{song['artist_name']} - {song['track_name']}"
                for playlist in self.playlists
                for i, song in enumerate(playlist["tracks"])
            }
        )
        print(f"Found {len(self.unique_songs)} unique songs")

    def extract_playlist_names(self):
        """Puts all playlists' names into an array with the same order as the playlists"""
        self.playlist_names = [playlist["name"] for playlist in self.playlists]

    def sample_tracks(self, pos: int, num: int):
        for s in self.playlists[pos]["tracks"][:num]:
            print(f"{s['artist_name']} - {s['track_name']}")

    def to_track_names(self, playlists) -> List[List[str]]:
        """
        Compiles every track in every playlist an '<artist> - <song>' name, into an array of the
        same shape as the playlists
        """
        return [
            [f"{s['artist_name']} - {s['track_name']}" for i, s in enumerate(playlist["tracks"])]
            for playlist in playlists
        ]

    def make_playlist_vector(self, playlist: Union[dict, List[str]]) -> List[int]:
        """
        Given a list of playlists, produce a vector whose ith element is 1
        if unique_songs[i] is in the list, 0 otherwise
        """
        playlist_ = []
        # Checks if the functiun gets called from multiprocessing
        if isinstance(playlist, tuple):
            _, playlist_ = playlist
        else:
            playlist_ = playlist
        # Checking whether it's a dict or list, since one can supply a list of song names too
        if isinstance(playlist_, dict):
            return [
                1
                if song in [f"{s['artist_name']} - {s['track_name']}" for s in playlist_["tracks"]]
                else 0
                for song in self.unique_songs
            ]
        elif isinstance(playlist_, List):
            return [1 if song in playlist_ else 0 for song in self.unique_songs]

    def make_user_based_vectors(self):
        """
        Generates sparse feature vectors of the given playlists
        Assigns 1 to n'th position if the playlist includes song n, else 0
        Uses multiprocessing to speed up the generation
        Pickles the vectors so they dont have to be recalculated on a rerun
        """
        import multiprocessing as mp

        # Using multiprocessing to speed up the vector creation
        # Spawns a new python process to run the make_playlist_vector method
        p = mp.Pool(14)  # 14 available processes, dunno if this works badly on <16 cpu threads
        t0 = time.time()

        try:
            if self.rewrite_data:
                print("Rewriting data: Generating vectors and dumping them")
                self.extract_unique_songs()
                self.playlist_vectors = p.map(self.make_playlist_vector, enumerate(self.playlists))
                vector_file = open("playlist_vectors.pickle", "wb")
                playlist_file = open("playlists.pickle", "wb")
                print("Pickling...")
                pickle.dump(self.playlist_vectors, vector_file)
                pickle.dump(self.playlists, playlist_file)
            else:
                vector_file = open("playlist_vectors.pickle", "rb")
                playlist_file = open("playlists.pickle", "rb")

                print("Pickled data found.")
                self.playlist_vectors = pickle.load(vector_file)
                self.playlists = pickle.load(playlist_file)
                self.extract_unique_songs()
        except (FileNotFoundError):
            print("Pickled data not found, generating vectors...")
            self.extract_unique_songs()
            self.playlist_vectors = p.map(self.make_playlist_vector, enumerate(self.playlists))
            vector_file = open("playlist_vectors.pickle", "wb")
            playlist_file = open("playlists.pickle", "wb")
            print("Pickling...")
            pickle.dump(self.playlist_vectors, vector_file)
            pickle.dump(self.playlists, playlist_file)
        finally:
            vector_file.close()
            playlist_file.close()
            self.named_playlists = self.to_track_names(self.playlists)

        t1 = time.time()
        print(f"Time taken: {round(t1-t0, 2)} s")
        print("playlist_vectors length:", len(self.playlist_vectors))

    def make_item_based_vectors(self):
        """
        Flips the playlist-songs matrix into songs-playlists
        So that each row is a song, and the sparse vector tells which playlist this song is in
        """
        self.song_playlist_matrix: List[List[int]] = [
            [playlist_vector[i] for playlist_vector in self.playlist_vectors]
            for i, _ in enumerate(self.unique_songs)
        ]

    def cosine_similarity(self, v1: Vector, v2: Vector) -> float:
        """Calculates the cosine between two feature vectors"""
        return np.dot(v1, v2) / math.sqrt(np.dot(v1, v1) * np.dot(v2, v2))

    def compute_similarities(self, pv) -> List[float]:
        """Runs cosine_similarity on each playlist_vector"""
        return [self.cosine_similarity(pv, pv_i) for pv_i in self.playlist_vectors]

    def sort_similar_playlists(
        self, similarities: List[float], user_id: int
    ) -> List[Tuple[int, float]]:
        """
        Create a sorted list of the similar playlists.
        Each element is a tuple of the playlist's id and its cosine similarity
        """
        # Puts the similarity into a tuple with its playlist id
        id_similarity = [
            (id, similarity)
            for id, similarity in enumerate(similarities)
            if id != user_id and similarity > 0  # user_id is actually the user's playlist's id
        ]
        return sorted(id_similarity, key=lambda pair: pair[-1], reverse=True)

    def suggest_name(self, similarities: List[Tuple[int, float]], length: int = 4) -> str:
        """Slaps together the top playlists' names first word and calls it a name"""
        suggested_name = ""
        for z in range(length):
            # print(f"Most similar playlist{[z]} name:", self.playlist_names[similarities[z][0]])
            suggested_name = " ".join(
                [suggested_name, self.playlist_names[similarities[z][0]].split(" ")[0]]
            )
        return suggested_name.strip()

    def user_based_suggestions(
        self, similarities, max_suggestions: int = 10
    ) -> Tuple[List[Tuple[str, float]], str]:
        """
        Sorts the given similarities(playlist similarity scores),
        gives each song a similarity score, and returns
        the most similar songs & a suggested name for the playlist.
        """
        suggestions: Dict[str, float] = defaultdict(float)
        # Sort playlists into (playlist_id, similarity)
        sorted_similarities = self.sort_similar_playlists(similarities, self.user_id)

        suggested_name = self.suggest_name(sorted_similarities)

        # Sum up the song similarities
        for other_user_id, similarity in sorted_similarities:
            for song in self.named_playlists[other_user_id]:
                suggestions[song] += similarity

        # Convert them to a sorted list
        suggestions = sorted(suggestions.items(), key=lambda pair: pair[-1], reverse=True)

        # Exclude the user_id's supplied songs
        return [
            (suggestion, weight)  # weight = summed up score from the playlists
            for suggestion, weight in suggestions
            if suggestion not in self.named_playlists[self.user_id]
        ][:max_suggestions], suggested_name

    def user_based_recommendation(
        self, playlist: Union[dict, List[str]], limit: int = 10, user_id: int = -1
    ) -> Tuple[List[Tuple[str, float]], str]:
        """Return a recommendation of tracks based off the given playlist(can be
        mpd-formatted or list of strings)"""
        self.user_id = user_id
        similarities = self.compute_similarities(self.make_playlist_vector(playlist))
        suggestions, name = self.user_based_suggestions(
            similarities=similarities, max_suggestions=limit
        )
        return suggestions, name

    def compute_song_similarities(self, song_id: int) -> List[float]:
        """Runs cosine_similarity on each song-playlists feature vector against the given song"""
        return [
            self.cosine_similarity(self.song_playlist_matrix[song_id], pl_vector_j)
            for pl_vector_j in self.song_playlist_matrix
        ]

    def most_similar_songs_to(self, song_id: int) -> List[Tuple[str, float]]:
        """
        Create a sorted list of the similar songs.
        Each element in the returned list is a tuple of the songs's name and its cosine similarity
        """
        similarities = self.compute_song_similarities(song_id)
        song_similarity_pairs = [
            (self.unique_songs[other_song_id], similarity)
            for other_song_id, similarity in enumerate(similarities)
            if song_id != other_song_id and similarity > 0
        ]
        return sorted(song_similarity_pairs, key=lambda pair: pair[-1], reverse=True)

    def item_based_suggestions(
        self, playlist_vector: List[int], max_suggestions: int = 10
    ) -> List[Tuple[str, float]]:
        """
        Goes through each song in the playlist vector and computes the most similar songs to that
        song, then gives each song a similarity score, and returns the most similar songs.
        The more frequently a computed similar song appears while going through each
        song in the playlist, the higher its ranking will be.
        """
        suggestions = defaultdict(float)

        for song_id, in_playlist in enumerate(playlist_vector):
            if in_playlist == 1:  # If song is in this playlist
                similar_songs = self.most_similar_songs_to(song_id)  # Get most similar songs to it
                # Add up the similarity score on each similar song
                for song, similarity in similar_songs:
                    suggestions[song] += similarity

        suggestions = sorted(suggestions.items(), key=lambda pair: pair[-1], reverse=True)

        return [
            (suggestion, weight)
            for suggestion, weight in suggestions
            if suggestion not in self.named_playlists[self.user_id]  # Don't include existing songs
        ][:max_suggestions]

    def item_based_recommendation(
        self, playlist: Union[dict, List[str]], limit: int = 10, user_id: int = -1
    ) -> List[Tuple[str, float]]:
        """Return a recommendation of tracks based off the given playlist(can be
        mpd-formatted or list of strings)"""
        t0 = time.time()
        self.user_id = user_id
        playlist_vector = self.make_playlist_vector(playlist)
        suggestions = self.item_based_suggestions(
            playlist_vector=playlist_vector, max_suggestions=limit
        )
        t1 = time.time()
        print(f"Time taken: {round(t1-t0, 2)} s")
        return suggestions

    def summary(self):
        total = 0
        for playlist in self.playlists:
            total += len(playlist)
        print("- - - - Summary of data - - - -")
        print("Nr. of playlists:", len(self.playlists))
        print("Unique songs:", len(self.unique_songs))
        print("Average playlist length:", total / len(self.playlists))
        print("- - - - End of summary - - - - -")

### Setup the recommender
Load in one json and vectorize the playlists. If there's no pickle, it takes about 30 seconds on a ryzen 3800x.

In [2]:
# Init recommender with path to the mpd json/s and if to use pickled vector data
recommender = Collaborative_Recommender("mpd/data_samples", rewrite_data=False)
recommender.make_user_based_vectors()
recommender.extract_playlist_names()

Loading the found jsons: ['mpd.slice.26000-26999.json']
Loaded 1 jsons
Pickled data found.
Found 34827 unique songs
Time taken: 0.59 s
playlist_vectors length: 1000


In [3]:
print("5 Sample tracks from playlist[69]:")
recommender.sample_tracks(69, 5)
recommender.make_item_based_vectors()
recommender.summary()

5 Sample tracks from playlist[69]:
5 Seconds of Summer - Amnesia
Christina Perri - A Thousand Years
Miley Cyrus - Drive
Ed Sheeran - Thinking Out Loud
Justin Bieber - Down To Earth
- - - - Summary of data - - - -
Nr. of playlists: 1000
Unique songs: 34827
Average playlist length: 11.015
- - - - End of summary - - - - -


### Recommend songs to Jane

Jane's playlist happens to be in the database already:

In [4]:
jane_id = 69
print("Jane's own playlist:")
display(recommender.named_playlists[jane_id])

Jane's own playlist:


['5 Seconds of Summer - Amnesia',
 'Christina Perri - A Thousand Years',
 'Miley Cyrus - Drive',
 'Ed Sheeran - Thinking Out Loud',
 'Justin Bieber - Down To Earth',
 'Adele - Hello',
 'Uncle Jed - Latch',
 'Bon Iver - Skinny Love',
 'Corey Gray - If I Lose Myself',
 'Justin Bieber - Life Is Worth Living',
 'X Ambassadors - Unsteady',
 'Madilyn Bailey - Pompeii',
 'Nicki Minaj - Bed Of Lies',
 'Birdy - All You Never Say',
 'Madilyn Bailey - Maps',
 'Daya - Back to Me',
 'Daya - Hide Away',
 'Daya - Back to Me']

- Looks like a bunch of pop songs?
- They seem to be relatively recent songs.

In [6]:
def print_recommendation(suggestions, name=None):
    if(name is not None):
        print("Suggested name:", name)
    
    for i, s in enumerate(suggestions):
        print("{0:<5}{1:<40} {2}".format(f"{i+1}.", s[0], s[1]))

### User based recommendation
The playlist can be in MPD format or just a list of strings. If the playlist is in the dataset then one can supply the id as to not be recommended your own songs.

In [7]:
# Get jane's playlist
janes_playlist = recommender.playlists[jane_id]
# Run it through the recommender. Give it playlist, how many tracks, and own playlist id
suggestions, sug_name = recommender.user_based_recommendation(janes_playlist, 12, jane_id)
print_recommendation(suggestions, sug_name)

Suggested name: car you ❤️❤️ Syd
1.   Ed Sheeran - Photograph                  0.8473668279740618
2.   James Bay - Let It Go                    0.6311617785519165
3.   Sam Smith - Lay Me Down                  0.6194777047540253
4.   John Legend - All of Me                  0.6049336982536837
5.   Passenger - Let Her Go                   0.5787191194277588
6.   Jason Mraz - I Won't Give Up             0.5596744710984812
7.   James Arthur - Say You Won't Let Go      0.5584253648693458
8.   Sam Smith - I'm Not The Only One         0.5409875690857151
9.   Sam Smith - Stay With Me                 0.5404209506202018
10.  Lukas Graham - 7 Years                   0.5383299958530888
11.  Meghan Trainor - Like I'm Gonna Lose You 0.5024478955016453
12.  Justin Bieber - Love Yourself            0.4968716903397613


- "car you ❤️❤️ Syd" is a great playlist name!
- We do seem to get a bunch of pop songs back.
- Ed Sheeran & Justin Bieber are in Jane's playlist as well.

### Item based recommender
Let's run Jane's playlist through the item based collaborative filter. Let us also test that a string list of his songs also works.

In [8]:
janes_playlist = recommender.named_playlists[jane_id]
suggestions = recommender.item_based_recommendation(janes_playlist, 12, jane_id)
print("Suggestions:")
print_recommendation(suggestions)

Time taken: 215.35 s
Suggestions:
1.   Ed Sheeran - Photograph                  1.5255518251563038
2.   Passenger - Let Her Go                   1.4329548991977399
3.   Sam Smith - I'm Not The Only One         1.4282577109497308
4.   Joel Adams - Please Don't Go             1.3431411183876572
5.   Sam Smith - Stay With Me                 1.3297328816157459
6.   Shawn Mendes - Mercy                     1.3111933489642604
7.   Jason Mraz - I Won't Give Up             1.3055642896400699
8.   One Direction - History                  1.2983940880728837
9.   Coldplay - Fix You                       1.2773365744229945
10.  One Direction - If I Could Fly           1.2402517216602607
11.  Birdy - Skinny Love                      1.2327594778960225
12.  Adele - Someone Like You                 1.2165642728450732


Once again I can't tell if these are super similar because of my unfamiliarity. I'm sure I'd be happy!
I know Adele's songs, let's see what is similar to hers.

In [9]:
print_recommendation(recommender.item_based_recommendation(['Adele - Someone Like You'], 5))

Time taken: 12.61 s
1.   Adele - Set Fire to the Rain             0.5
2.   Adele - When We Were Young               0.49746833816309105
3.   Adele - Rolling in the Deep              0.4216370213557839
4.   Adele - Take It All                      0.408248290463863
5.   Adele - One And Only                     0.3849001794597505


Adele definitely is similar to Adele. I bet it's because people who like Adele also have loads of other of hers on their playlists.

## More samples with different playlists

The first array printed is the user's 10 songs in the playlist, the other one are the recommendations and its last element is the suggested playlist name.

In [10]:
id = 514
display(recommender.named_playlists[id][:10])
s, n = recommender.user_based_recommendation(playlist=recommender.playlists[id], user_id=id)
print_recommendation(s, n)

['Rascal Flatts - What Hurts The Most',
 'Hollywood Undead - Black Dahlia',
 'Limp Bizkit - Behind Blue Eyes',
 'Saliva - Always',
 'Hollywood Undead - Pour Me',
 'Hollywood Undead - S.C.A.V.A.',
 'Hollywood Undead - Bullet',
 'Hollywood Undead - Coming Back Down',
 'Hollywood Undead - Mother Murder',
 'Linkin Park - Waiting For The End']

Suggested name: Country Three Outlaw Sleep
1.   Drowning Pool - Bodies                   0.18036210542307424
2.   Papa Roach - Last Resort                 0.1540163831377841
3.   Rascal Flatts - Bless The Broken Road    0.1471111628412463
4.   Brooks & Dunn - Red Dirt Road            0.14040176987488098
5.   Three Days Grace - I Hate Everything About You 0.1386860985245757
6.   Led Zeppelin - Stairway To Heaven        0.1296541708541989
7.   Tim McGraw - Live Like You Were Dying    0.1286040998539846
8.   Lonestar - My Front Porch Looking In     0.125112751243962
9.   Kenny Chesney - There Goes My Life       0.12394252288475384
10.  The Band Perry - If I Die Young          0.12025772997872064


The playlist is numetal and the recommendations are quite similar.

In [11]:
id = 516
display(recommender.named_playlists[id][:10])
s, n = recommender.user_based_recommendation(playlist=recommender.playlists[id], user_id=id)
print_recommendation(s, n)

['Duran Duran - Hungry Like The Wolf - 2009 Remastered Version',
 'The Cars - Just What I Needed',
 'a-ha - Take On Me',
 'Madonna - Hung Up',
 'Michael Jackson - Billie Jean',
 'Depeche Mode - Enjoy The Silence - Single Mix',
 'Loverboy - Working for the Weekend',
 'The Apples - Hey Jude',
 'Queen - Bohemian Rhapsody - Remastered 2011',
 'The Police - Roxanne - Remastered 2003']

Suggested name: the 80'S Blast Friday
1.   Journey - Don't Stop Believin'           1.1616207136131862
2.   Queen - Under Pressure - Remastered 2011 0.8802600169010343
3.   Guns N' Roses - Sweet Child O' Mine      0.8185612962547691
4.   Rick Springfield - Jessie's Girl         0.8090761593650391
5.   The Police - Message In A Bottle - Remastered 2003 0.7900942818093478
6.   The Outfield - Your Love                 0.7627315108430431
7.   Eurythmics - Sweet Dreams (Are Made of This) - Remastered 0.727007894050679
8.   Pat Benatar - Hit Me With Your Best Shot 0.6645785793798065
9.   Simple Minds - Don't You (Forget About Me) 0.6622089780815504
10.  The Police - Every Breath You Take - Remastered 2003 0.6478365997709965


Looks good! An 80's nostalgia playlist gets recommended more hits. The suggested name "the 80'S Blast Friday" is really good!

## Conclusion

The recommendations seem totally serviceable.

But there's definitely a weakness to such a simple system. It will prefer to recommend the most popular songs out of all the other's playlists. Say you have a hiphop playlist, but most tracks are niche or older, then you will not be recommended such tracks because the recommender does addition on each playlist's tracks, say playlist[n] has similarity score of 0.05, then every song in that playlist gets incremented with that value. So the most popular and mainstream tracks will accrue more points. More recent and mainstream tracks will bubble up to the top, while the user's own niche will be ignored.