# Building a Song Recommender  
by Braden Weber (blw22) and Kelsey Yen (kny4)

## Goal  

The goal of our project was to build a song recommender. Using the fundamentals from Units 6 and 12, we built a model that suggests songs for a playlist and validated it by checking how well its recommendations fit genre-specific playlists.

## The Dataset  

The dataset we used is from the [Spotify Million Playlist Dataset Challenge](https://research.atspotify.com/the-million-playlist-dataset-remastered/). The data comes from sampling random existing Spotify playlists made by real users for Spotify's AI research (much of which focuses on recommendation systems).  
Each playlist is stored as a dictionary that includes a list of its tracks. The data given about each track is as follows:
* **track_name** - the name of the track
* **track_uri** - the Spotify URI of the track
* **album_name** - the name of the track's album
* **album_uri** - the Spotify URI of the album
* **artist_name** - the name of the track's primary artist
* **artist_uri** - the Spotify URI of the track's primary artist
* **duration_ms** - the duration of the track in milliseconds
* **pos** - the position of the track in the playlist (zero-based)  

Unlike other datasets we explored, this dataset does not include song attributes that can be compared to one another, such as tempo, genre, or danceability (although these attributes are available through the Spotify Web API). Since our model is trained on the occurrence of songs in a certain type of playlist, the data is only used to identify unique songs in the dataset.  

To use the dataset, we unzipped it and loaded it into a dataframe using this `for` loop:

In [1]:
# for loop code
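
As a rough illustration of the loading step, here is a minimal sketch that flattens slice files into one row per track. It assumes each slice is a JSON file with a top-level `"playlists"` list whose entries carry a `"pid"` and a `"tracks"` list; the function names `playlists_to_rows` and `load_slices` are hypothetical and not taken from our notebook.

```python
import json
from pathlib import Path


def playlists_to_rows(slice_data):
    """Flatten one decoded slice into one row (dict) per track."""
    rows = []
    for playlist in slice_data["playlists"]:
        for track in playlist["tracks"]:
            row = dict(track)             # track_name, track_uri, pos, ...
            row["pid"] = playlist["pid"]  # remember the source playlist (assumed key)
            rows.append(row)
    return rows


def load_slices(data_dir, max_files=None):
    """Read up to max_files JSON slice files from data_dir and flatten them."""
    paths = sorted(Path(data_dir).glob("*.json"))[:max_files]
    rows = []
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            rows.extend(playlists_to_rows(json.load(handle)))
    return rows
```

The resulting list of dictionaries can be handed to `pandas.DataFrame(rows)` to get a dataframe with one column per track field.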

To make the data easier to handle, we shortened playlists to 30 songs and deleted playlists with fewer than 30 songs. This ensures that every playlist has the same size, so the playlists can be stacked into tensors.  

Next, we create the 10 training sets by pulling out the last ten songs in each playlist and assigning each one as the target song for one training set (train_1 uses the 21st song, train_2 uses the 22nd song, etc.).

In [2]:
# code for sorting/cutting playlists
# code for moving the last 10 songs into the training sets
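
The cut-and-split step described above could look roughly like this. It is a sketch under the assumption that each playlist is a plain list of track URIs and that positions are zero-based, so the first target is the song at position 20 (the 21st song); `split_playlists` is a hypothetical name.

```python
PLAYLIST_LEN = 30   # playlists are cut to this many songs
NUM_TARGETS = 10    # the last ten songs become training targets


def split_playlists(playlists):
    """Return (contexts, training_sets) from a list of track lists."""
    context_len = PLAYLIST_LEN - NUM_TARGETS
    contexts = []
    training_sets = [[] for _ in range(NUM_TARGETS)]
    for tracks in playlists:
        if len(tracks) < PLAYLIST_LEN:
            continue                          # drop playlists that are too short
        tracks = tracks[:PLAYLIST_LEN]        # shorten long playlists
        contexts.append(tracks[:context_len])
        for k in range(NUM_TARGETS):
            # training_sets[0] plays the role of train_1, and so on
            training_sets[k].append(tracks[context_len + k])
    return contexts, training_sets
```

Every surviving playlist contributes one 20-song context and one target to each of the ten training sets, so all of the lists stay aligned by playlist.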

The model needs every unique track in the dataset to be labelled with its own index.

In [3]:
# for loops for song indices
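
One simple way to assign the indices is to number tracks in the order they are first seen. This is only a sketch: `build_track_index` is a hypothetical helper, and it assumes tracks are identified by their `track_uri`.

```python
def build_track_index(playlists):
    """Map every distinct track URI to a unique integer id (0, 1, 2, ...)."""
    track_to_id = {}
    for tracks in playlists:
        for uri in tracks:
            if uri not in track_to_id:
                track_to_id[uri] = len(track_to_id)  # next unused id
    return track_to_id
```

The size of the returned dictionary is the number of unique songs, which is the value the embedding needs later on.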

Now each song can be identified with a unique id. We then go back through the playlists and replace the songs with ids.

In [4]:
# code for playlists, all playlists are just id numbers
# make training playlists and playlist_list into tensors
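
Replacing songs with ids might be sketched as below, assuming a `track_to_id` dictionary like the one described above. The names are hypothetical, and the check for equal row lengths is included because a tensor needs rectangular data.

```python
def encode_playlists(playlists, track_to_id):
    """Replace every track URI with its integer id, keeping playlist order."""
    encoded = [[track_to_id[uri] for uri in tracks] for tracks in playlists]
    lengths = {len(row) for row in encoded}
    if len(lengths) > 1:
        raise ValueError(f"playlists have different lengths: {sorted(lengths)}")
    return encoded


def encode_targets(targets, track_to_id):
    """Replace each target URI (one per playlist) with its integer id."""
    return [track_to_id[uri] for uri in targets]
```

If PyTorch is the tensor library in use, `torch.tensor(encoded)` turns the nested list into a tensor with one row per playlist.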

Let's look at the shapes of our tensors:

Each training tensor has one entry per playlist: the target song for that playlist.    
The playlists tensor has one row per playlist and 20 songs in each row, giving a shape of `[4..., 20]`.


In [5]:
# tensor shape code here

## Model Setup

On to the model. To set up the embedding of each song, we used the number of unique songs and the embedding dimension.

In [6]:
# unique songs number
# embedding function

In [7]:
# num parameters
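
To get a feel for where the parameters go, the counts can be worked out by hand. The sketch below uses a made-up vocabulary of 100,000 unique songs purely for illustration (the real count comes from the dataset) together with the embedding dimension of 50.

```python
def embedding_param_count(num_songs, embed_dim):
    """An embedding table stores one vector of length embed_dim per song."""
    return num_songs * embed_dim


def linear_param_count(in_features, out_features):
    """A fully connected layer has a weight matrix plus one bias per output."""
    return in_features * out_features + out_features


NUM_SONGS = 100_000   # hypothetical vocabulary size, not our real number
EMBED_DIM = 50

table = embedding_param_count(NUM_SONGS, EMBED_DIM)        # 5,000,000
output_layer = linear_param_count(EMBED_DIM, NUM_SONGS)    # 5,100,000
hidden_layer = linear_param_count(EMBED_DIM, EMBED_DIM)    # 2,550
```

Under these assumptions the embedding table and the final output layer dominate, while a 50-to-50 layer is negligible in comparison.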

Next, we passed the tensor of playlists through `embedder` to create a tensor that holds the embedding of each song in a given playlist.

In [8]:
# code for playlist embedding

Our model is as follows:  
<div>
    <img src="model.png" width="500">
</div>


Let's talk about the MLPs. Each one corresponds to one of the quantities fed through the model:
* Mean: the input size is the embedding dimension, and the output size is also the embedding dimension. The mean is a tensor of length 50 obtained by averaging the embeddings of all the songs in the playlist.
* Variance: measures how much the songs in the playlist differ from one another. The last thing the model takes in is the actual song to be recommended, so the model has two inputs.  
* Head: has an extra layer whose output size is the number of unique songs. We kept its input size at 50 because the kernel kept dying.

In [9]:
# code for mean, variance, head 

The code below sets up the model for a target with a single song embedding. It returns the output of the last MLP: a large array with one score per unique song, which is softmaxed to give probabilities.

In [10]:
# code for get_model

## Training

Training the model: we enumerate every playlist in the list "playlist" and run each one through the model, which outputs a tensor with the probability of recommending each unique song (a very large array).

Cross-entropy takes two arguments: the large array of probabilities for each unique song and the index of the target song.
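
To make the loss concrete, the sketch below computes cross-entropy for a single playlist in plain Python. One assumption to flag: it takes the unnormalized scores and folds the softmax into the loss. If the model output has already been softmaxed, the loss is simply the negative log of the target song's probability.

```python
import math


def cross_entropy(scores, target_index):
    """Negative log-probability of the target song under softmax(scores)."""
    peak = max(scores)  # subtracting the max keeps exp() from overflowing
    log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_sum_exp - scores[target_index]
```

The loss is small when the target song has a high score relative to the others and grows as the model puts its probability elsewhere.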

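Backpropagation then computes the gradient of the loss with respect to every parameter in the model, and each parameter is updated from its gradient.

IMPORTANT: Zero the gradients every time the loss is backpropagated. Gradients otherwise accumulate across backward passes, so each update would mix in the gradients from earlier playlists. In our runs, this also saved memory and kept the kernel alive.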

In [11]:
# code for gradient
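
A single training step with the gradient reset might look like the following in PyTorch. This is a sketch with assumptions: `TinyRecommender` is a deliberately small stand-in rather than our model, plain SGD is an arbitrary optimizer choice, the data is random, and a batch of playlists is processed at once instead of one playlist at a time.

```python
import torch
from torch import nn


class TinyRecommender(nn.Module):
    """Stand-in model: average the song embeddings, then score every song."""

    def __init__(self, num_songs, embed_dim=50):
        super().__init__()
        self.embedder = nn.Embedding(num_songs, embed_dim)
        self.head = nn.Linear(embed_dim, num_songs)

    def forward(self, playlists):
        return self.head(self.embedder(playlists).mean(dim=1))


def train_step(model, optimizer, playlists, targets):
    optimizer.zero_grad()                                # clear old gradients first
    scores = model(playlists)                            # (batch, num_songs)
    loss = nn.functional.cross_entropy(scores, targets)  # expects raw scores
    loss.backward()                                      # fill in fresh gradients
    optimizer.step()                                     # update the parameters
    return loss.item()                                   # a float, not a tensor


torch.manual_seed(0)
model = TinyRecommender(num_songs=100)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
playlists = torch.randint(0, 100, (8, 20))   # random stand-in data
targets = torch.randint(0, 100, (8,))
losses = [train_step(model, optimizer, playlists, targets) for _ in range(20)]
```

Returning `loss.item()` rather than the loss tensor means no computation graph is kept alive between steps, which helps keep memory use flat.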

In [12]:
# code for setting up training set

## Optimizing

The optimization of the model comes from adjusting the following 4 parameters:
* **learning rate**: how large a step the weights take along the gradient at each update  
* **files**: how many playlist files are loaded in  
* **training set**: how many training sets the model runs through  
* **rounds**: how many times the whole process is repeated  

In [13]:
# code for changing parameters maybe here?

How we do epochs/rounds: a `for` loop runs every playlist through the model and backpropagates the loss. Within each epoch, a second loop runs through however many training sets we want to use. The call to train is nested inside both loops.

In [14]:
# for loop for epochs/rounds 
# outputs training set number based on epoch round
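
The nesting described above can be sketched as follows. The `train` argument stands in for our training function; its signature here, taking the playlists and one set of targets and returning a loss, is an assumption, and `run_training` is a hypothetical name.

```python
def run_training(train, playlists, training_sets, rounds, num_sets):
    """Repeat the whole process `rounds` times over the first `num_sets` sets."""
    history = []
    for round_idx in range(rounds):
        for set_idx in range(num_sets):
            targets = training_sets[set_idx]      # one target per playlist
            loss = train(playlists, targets)      # nested call to train
            history.append((round_idx, set_idx, loss))
    return history
```

With `rounds=3` and `num_sets=10`, `train` is called 30 times, so the running time grows with the product of the two settings.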

## Validation

show_recommendations: shows the top 15 songs that the model recommends. It takes in the index of the playlist (index 25 is a country playlist). It shows what each training set had as its training target and then the song the model actually recommended, along with its confidence.

In [15]:
# code for show_recommendations with top 15 songs
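
Picking the top recommendations from the model output comes down to a softmax followed by a sort. The sketch below is a plain-Python version with hypothetical names; it assumes `scores` holds one value per unique song and that `id_to_name` maps song ids back to track names.

```python
import math


def top_recommendations(scores, id_to_name, k=15):
    """Return the k most probable songs as (name, probability) pairs."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]   # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id_to_name[i], probs[i]) for i in ranked[:k]]
```

The probability attached to each song is what we refer to as the confidence of the recommendation.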

To validate the model, we counted how many country songs were recommended for a country playlist (country is generally the most distinguishable genre).

In [16]:
# code for get_playlist(25) 
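
Counting genre hits can be written as a small helper. Since the dataset carries no genre labels, the set of country track ids below would have to be labelled by hand; both that set and the function name are hypothetical.

```python
def genre_hit_rate(recommended_ids, genre_ids):
    """Return (hits, fraction) of recommended songs that belong to the genre."""
    if not recommended_ids:
        return 0, 0.0
    hits = sum(1 for track_id in recommended_ids if track_id in genre_ids)
    return hits, hits / len(recommended_ids)
```

For example, if 9 of the top 15 recommendations for a country playlist are in the hand-labelled country set, the helper returns `(9, 0.6)`.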

The validation is to show that the model can at least recognize genres of songs and recommend a song within that genre.

In [17]:
# table for validation

* playlist(7): rap/hip-hop
* playlist(25) and (31): country
* playlist(83): Christmas

## Improvements

Increasing the number of training sets, modularizing the model so that only the parameters need to change, and going from concatenating embeddings to using a sequential model.

Removing one of the MLPs, since the target is not added into the training portion. The results will be the same.

## Error Analysis

* Cross-entropy loss function
* `Loss.backward()` carries the loss through the whole model
* Counting country songs in output? Does this count?

## Alternative Methods

Many different ways to set up this sequential model  

Earlier idea of concatenating mean, variance, and target

## Conclusion

Our model does not perform the way we intended it to. As with any model, the best possible song recommender needs a lot of data and a correspondingly long time to train (some recommenders we saw online took 3-4 days to train).