# Music Recommendations
This dataset is for making music recommendations. We will build a machine learning model to recommend new songs to users.

**Your goal in this assignment is to think about ways to explain recommendations to different stakeholders.**


## Part 1: Data Exploration

Upload the CSVs ('songs.csv' and 'likes.csv') included in the homework assignment.

In [None]:
from google.colab import files
uploaded = files.upload()

Now load the data into pandas dataframes.


In [None]:
import pandas as pd
songs_df = pd.read_csv('songs.csv', low_memory=False)
likes_df = pd.read_csv('likes.csv', low_memory=False)

The songs dataframe contains information about 30 songs. The likes dataframe contains information about which people have "liked" which songs. These data sets are small enough that you can look at them directly. 

In [None]:
songs_df

In [None]:
likes_df


## Part 2: Content-Based Recommendations
One way that we could make recommendations is to find songs that are similar to the songs that a user already likes, using the information we have about each song--artist, genre, danceability, valence, and energy. (Danceability, valence, and energy were extracted from [Spotify's API](https://developer.spotify.com/documentation/web-api/reference/#object-audiofeaturesobject).) 

First, we need to transform our data so that artist and genre can be represented numerically.
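
As a small illustration of what this transformation does, the sketch below one-hot encodes a made-up table. The rows and the name `toy_songs` are hypothetical and are not taken from songs.csv; only the column names mirror the real data.

```python
import pandas as pd

# Hypothetical toy data; the values are made up for illustration.
toy_songs = pd.DataFrame({
    'Title': ['Song A', 'Song B', 'Song C'],
    'Genre': ['pop', 'rock', 'pop'],
    'Energy': [0.8, 0.9, 0.4],
})

# Each distinct genre becomes its own 0/1 indicator column.
toy_encoded = pd.get_dummies(toy_songs, columns=['Genre'], dtype=int)
print(toy_encoded)
```

Each song gets a 1 in the column for its own genre and a 0 in the others, so categories can take part in a numeric distance calculation.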



In [None]:
songs_df = pd.get_dummies(songs_df, columns=['Artist', 'Genre'])
songs_df.head()

Now we can find pairs of songs that are similar (SSID stands for "similar song ID").

In [None]:
from sklearn.neighbors import NearestNeighbors
song_data = songs_df.drop(columns=['SongID', 'Title']).values
neighbors = NearestNeighbors(n_neighbors=2, algorithm='brute').fit(song_data)
_, indices = neighbors.kneighbors(song_data)
song_pairs = {'SongID': songs_df['SongID'], 'Song': songs_df['Title'], 'Similar Song': songs_df['Title'][indices[:,1]].values, 'SSID': songs_df['SongID'][indices[:,1]].values}
similarity_df = pd.DataFrame(song_pairs)
similarity_df.head()
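
To see why the cell above reads the second column of `indices`, here is a minimal sketch on made-up feature vectors (the array `toy_features` is hypothetical). When the model is queried with the same rows it was fitted on, each row is normally its own nearest neighbor, so the closest other song sits in column 1.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical feature vectors for four songs (made-up values).
toy_features = np.array([
    [0.0, 0.0],
    [0.1, 0.0],
    [1.0, 1.0],
    [1.0, 0.8],
])

toy_nn = NearestNeighbors(n_neighbors=2, algorithm='brute').fit(toy_features)
toy_dist, toy_idx = toy_nn.kneighbors(toy_features)

# Column 0 is each song itself (distance 0); column 1 is the closest other song.
print(toy_idx)
```

This pattern relies on every row being distinct. If two songs had identical feature vectors, the order of the tied neighbors would not be guaranteed, and column 1 could point back at the song itself.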

Now that we've calculated pairs of similar songs, we can start making recommendations. 

In [None]:
grouped = pd.merge(likes_df, similarity_df).groupby('Person')
recommended = []
for person, info in grouped:
  print(person,info)
  recs = set([song for song in info['Similar Song'].values if song not in info['Song'].values])
  recommended += [(person, rec) for rec in recs]
recommendations_df = pd.DataFrame(recommended, columns = ['Name', 'Recommendations'])
recommendations_df
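
The merge-and-filter logic in the cell above can be sketched on a tiny made-up example. The tables `demo_likes` and `demo_similarity` are hypothetical and only reuse the column names from the real dataframes.

```python
import pandas as pd

# Hypothetical inputs (made-up rows) with the same column names as above.
demo_likes = pd.DataFrame({'Person': ['Ana', 'Ana', 'Ben'], 'SongID': [1, 2, 3]})
demo_similarity = pd.DataFrame({
    'SongID': [1, 2, 3],
    'Song': ['Song A', 'Song B', 'Song C'],
    'Similar Song': ['Song B', 'Song C', 'Song A'],
    'SSID': [2, 3, 1],
})

# The merge joins on the shared SongID column: one row per (person, liked song).
demo_merged = pd.merge(demo_likes, demo_similarity)

demo_recs = []
for person, info in demo_merged.groupby('Person'):
    liked = set(info['Song'])
    # Keep similar songs the person has not already liked.
    demo_recs += [(person, s) for s in sorted(set(info['Similar Song']) - liked)]
print(demo_recs)
```

Every row of the merged table pairs a song the person liked with that song's nearest neighbor, and the filter drops neighbors the person has already liked.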

### Question 1: How would you explain to Adrian why the song "I Like It" is recommended to him?

_Double click to write your answer here. Show your work in code below if applicable._

## Part 3: Collaborative Filtering
A different way to make recommendations is to base them on the listening behavior of other people rather than on the attributes of the song. If someone likes a lot of the same things as you, there's a good chance you'll like other things they like (that you haven't tried yet).

To calculate this, we will start by rearranging our data.

In [None]:
matrix_df = likes_df.copy()
matrix_df.loc[:,'present'] = 1
matrix_df = matrix_df.pivot(index='SongID', columns='Person', values='present').fillna(0)
matrix_df

In the above matrix, a 1 in row i and column j indicates that user j has liked song i. For example, we know that Adrian likes "Basket Case", because there is a 1 in the first column of the sixth row ("Basket Case" is song #6 in the dataset). 
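
The same rearrangement can be sketched on a made-up likes table. The rows and the names `toy_likes` and `toy_matrix` are hypothetical; only the column names mirror likes.csv.

```python
import pandas as pd

# Hypothetical likes table; the rows are made up for illustration.
toy_likes = pd.DataFrame({
    'Person': ['Ana', 'Ana', 'Ben', 'Cy'],
    'SongID': [1, 2, 2, 3],
})

# One row per song, one column per person; 1 means "liked", 0 means "no like recorded".
toy_matrix = toy_likes.assign(present=1).pivot(
    index='SongID', columns='Person', values='present'
).fillna(0)
print(toy_matrix)
```

Here song 2 has a 1 in both the Ana and Ben columns, because both of them liked it. Note that a song nobody liked would not get a row at all in a matrix built this way.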

Now we have a different way to measure song similarity. Two songs are similar if they are liked by many of the same users.
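
To make this notion of similarity concrete: `NearestNeighbors` uses Euclidean distance by default, and for 0/1 like vectors the squared distance equals the number of users who like exactly one of the two songs. The sketch below checks this on two made-up like vectors (`song_x` and `song_y` are hypothetical).

```python
import numpy as np

# Hypothetical like vectors over five users (1 = liked, 0 = not liked).
song_x = np.array([1, 1, 1, 0, 0])
song_y = np.array([1, 1, 0, 1, 0])

# Users who like exactly one of the two songs.
disagreements = int(np.sum(song_x != song_y))
# Euclidean distance, the default metric of NearestNeighbors.
distance = float(np.linalg.norm(song_x - song_y))

print(disagreements, distance)
```

Under this metric, shared fans do not pull two songs closer directly. What counts is how few users like one song but not the other, so two widely liked songs can end up far apart even when they share several fans.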


In [None]:
song_data = matrix_df.values
neighbors = NearestNeighbors(n_neighbors=2, algorithm='brute').fit(song_data)
_, indices = neighbors.kneighbors(song_data)
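# Look up titles by SongID so rows stay aligned with matrix_df, even if a song has no likes.
titles = songs_df.set_index('SongID')['Title']
song_ids = matrix_df.index.values
similar_ids = song_ids[indices[:,1]]
song_pairs = {'SongID': song_ids, 'Song': titles.loc[song_ids].values, 'Similar Song': titles.loc[similar_ids].values, 'SSID': similar_ids}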
similarity_df = pd.DataFrame(song_pairs)
similarity_df.head(10)

Now that we've again calculated pairs of similar songs, we can start making new recommendations. 

In [None]:
grouped = pd.merge(likes_df, similarity_df).groupby('Person')
recommended = []
for person, info in grouped:
  recs = set([song for song in info['Similar Song'].values if song not in info['Song'].values])
  recommended += [(person, rec) for rec in recs]
recommendations_df = pd.DataFrame(recommended, columns = ['Name', 'Recommendations'])
recommendations_df

### Question 2: How would you explain to Alex why the song "drivers license" is recommended to him?

_Double click to write your answer here. Show your work in code below if applicable._

### Question 3: Sketch four interactions that explain the collaborative filtering recommendation system, one for each of the following target users:

#### 3A. Someone who uses the "Daily Mix" playlist on their smartphone while going for a daily walk.

#### 3B. A user who plays the recommended tracks beginning from a starting song on their smart speaker (Google Home, Amazon Echo, Nest, etc.).
<!--A user who hasn't been using Spotify much recently, because they're bored with the music they hear. -->

#### 3C. A band member whose music is featured on Spotify.

#### 3D. An executive at a video game company that would like to start a sponsorship for musicians.

As you begin, reflect on user goals and the criteria for a good explanation that we talked about last week. 

_Please attach your annotated sketches as a PDF._
