# **Instructions**

This document is a template, and you are not required to follow it exactly. However, the kinds of questions we ask here are the kinds of questions we want you to focus on. While you might have answered similar questions to these in your project presentations, we want you to go into a lot more detail in this write-up; you can refer to the Lab homeworks for ideas on how to present your data or results. 

You don't have to answer every question in this template, but you should answer roughly this many questions. Your answers to such questions should be paragraph-length, not just a bullet point. You likely still have questions of your own -- that's okay! We want you to convey what you've learned, how you've learned it, and demonstrate that the content from the course has influenced how you've thought about this project.

# **Introduction**

Our project targets music playlist prediction. Listeners often want song recommendations based on the song they just heard. Such recommendations can improve the user experience by providing a selection tailored to each listener's music taste.

# Song Prediction Project
Project mentor: Sophia Sklaviadis

Benjamin Fry <bfry2@jh.edu>, Cassie Parent <cparent5@jh.edu>, Alexandra Szewc <aszewc1@jh.edu>

Link to Project Repo: https://github.com/benf549/CS475-Machine-Learning-Final-Project

# Outline and Deliverables

*List the deliverables from your project proposal. For each uncompleted deliverable, please include a sentence or two on why you weren't able to complete it (e.g. "decided to use an existing implementation instead" or "ran out of time"). For each completed deliverable, indicate which section of this notebook covers what you did.*

*If you spent substantial time on any aspects that weren't deliverables in your proposal, please list those under "Additional Work" and indicate where in the notebook you discuss them.*

### Uncompleted Deliverables
1. "Expect to complete #2": Determine feature importance in the neural network. We decided to spend more time on improving the performance of our models beyond the baselines.
2. "Would like to accomplish #1": Predict a song based off of a playlist instead of one song. We decided to attempt making predictions off of a raw audio file instead.
3. "Would like to accomplish #2": Add additional features apart from the Spotify Audio Analysis API endpoint to incorporate information about the previous song’s musical elements. Instead, we worked on extracting features from a raw audio file.


### Completed Deliverables
1. "Must complete #1": Build a music track dataframe with relevant features [in 'Dataset' below](#scrollTo=zFq-_D0khnhh&line=10&uniqifier=1).
2. "Must complete #2": Select a metric to compare predicted track to “true” value. [in 'Models and Evaluation' below](#scrollTo=oMyqHUa0jUw7&line=5&uniqifier=1).
3. "Must complete #3": Develop a base neural network to predict tracks based on input. [in 'Methods: Neural Net' below]().
4. "Expect to complete #1": Optimize model by experimenting with a number of layers, activation functions, dropout, etc. [in 'Methods: Combined Approach' below]().
5. "Expect to complete #3": Analyze trends in where our network is failing and attempt to explain these difficulties. [in 'Discussion' below]().
6. "Would like to accomplish #3": Compare the supervised neural network to the unsupervised hierarchical clustering approach. [in 'Results' below]().


### Additional Deliverables
1. We implemented prediction of the song that should follow an input raw audio file. We discuss this [in 'Methods: Combined Approach' below]().

# Preliminaries

## What problem were you trying to solve or understand?

What are the real-world implications of this data and task?

How is this problem similar to others we’ve seen in lectures, breakouts, and homeworks?

What makes this problem unique?

What ethical implications does this problem have?

## Dataset(s)

Describe the dataset(s) you used.

How were they collected?

Why did you choose them?

How many examples in each?


In [None]:
# Imports
import json, torch, os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import torch.nn.functional as F

In [None]:
# Read in the JSON data
with open("challenge_set.json", "r") as f:
    data_set = json.load(f)


# Grab unique (track name, artist name, track URI) triples
track_uris = set()
playlist_idcs = []

# Only keep playlists that contain exactly 100 tracks
for idx, playlist in enumerate(data_set["playlists"]):
    if len(playlist["tracks"]) == 100:
        playlist_idcs.append(idx)
        for track in playlist["tracks"]:
            track_uris.add((track["track_name"], track["artist_name"], track["track_uri"]))
unique_tracks = list(track_uris)

# Define mapping between URIs and tracks
uri_to_title_artist_map = {uri: (title, artist) for title, artist, uri in unique_tracks}
# for i,(j,k) in enumerate(uri_to_title_artist_map.items()):
#     if i > 5: break
#     print(j,k)

# Rearranging data table
df = pd.read_csv("all_downloaded_data.csv")
df = df.drop("Unnamed: 0", axis=1)
df = df.set_index('uri')
df = df.drop_duplicates()

print(df.head())
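
Since the task is to recommend a song based on the previous one, the playlist JSON could also be turned into (previous track, next track) pairs. The cell below is only a minimal sketch of that idea, not necessarily the pipeline used in this project. The helper name `consecutive_track_pairs` and the toy `example_playlists` are hypothetical. Only the `tracks` and `track_uri` keys come from the challenge-set format read above, and the sketch assumes tracks are listed in play order.

```python
def consecutive_track_pairs(playlists):
    """Return (previous_uri, next_uri) pairs for adjacent tracks in each playlist."""
    pairs = []
    for playlist in playlists:
        uris = [track["track_uri"] for track in playlist["tracks"]]
        # Pair each track with its successor by zipping the list with itself shifted by one
        pairs.extend(zip(uris, uris[1:]))
    return pairs


# Hypothetical toy data using the same keys as the challenge set
example_playlists = [
    {"tracks": [{"track_uri": "uri:a"}, {"track_uri": "uri:b"}, {"track_uri": "uri:c"}]},
    {"tracks": [{"track_uri": "uri:d"}]},  # a single track yields no pairs
]
example_pairs = consecutive_track_pairs(example_playlists)
print(example_pairs)
```

On the real data, the same helper could be applied to the playlists selected by `playlist_idcs`.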

## Pre-processing

What features did you use or choose not to use? Why?

If you have categorical labels, were your datasets class-balanced?

How did you deal with missing data? What about outliers?

What approach(es) did you use to pre-process your data? Why?

Are your features continuous or categorical? How do you treat these features differently?

In [None]:
# For those same examples above, what do they look like after being pre-processed?

In [None]:
# Visualize the distribution of your data before and after pre-processing.
#   You may borrow from how we visualized data in the Lab homeworks.

# Models and Evaluation

## Experimental Setup

How did you evaluate your methods? Why is that a reasonable evaluation metric for the task?

What did you use for your loss function to train your models? Did you try multiple loss functions? Why or why not?

How did you split your data into train and test sets? Why?


In [None]:
# Code for loss functions, evaluation metrics or link to Git repo

## Baselines 

What baselines did you compare against? Why are these reasonable?

Did you look at related work to contextualize how others methods or baselines have performed on this dataset/task? If so, how did those methods do?

## Methods

What methods did you choose? Why did you choose them?

How did you train these methods, and how did you evaluate them? Why?

Which methods were easy/difficult to implement and train? Why?

For each method, what hyperparameters did you evaluate? How sensitive was your model's performance to different hyperparameter settings?

In [None]:
# Code for training models, or link to your Git repository

In [None]:
# Show plots of how these models performed during training.
#  For example, plot train loss and train accuracy (or other evaluation metric) on the y-axis,
#  with number of iterations or number of examples on the x-axis.

## Results

Show tables comparing your methods to the baselines.

What about these results surprised you? Why?

Did your models over- or under-fit? How can you tell? What did you do to address these issues?

What does the evaluation of your trained models tell you about your data? How do you expect these models might behave differently on different data?  

In [None]:
# Show plots or visualizations of your evaluation metric(s) on the train and test sets.
#   What do these plots show about over- or under-fitting?
#   You may borrow from how we visualized results in the Lab homeworks.
#   Are there aspects of your results that are difficult to visualize? Why?

# Discussion

## What you've learned

*What concepts from lecture/breakout were most relevant to your project? How so?*

The concepts from lecture and breakout rooms that were most relevant to our project were the introductions to PyTorch and K-Means clustering. We also found the lectures on clustering, neural networks, and FATE helpful in informing the direction of our implementation. Finally, the feedback from our presentation helped us find new directions after we ran into difficulties, which is how we arrived at the results presented here.

*What aspects of your project did you find most surprising? What lessons did you take from this project that you want to remember for the next ML project you work on? Do you think those lessons would transfer to other datasets and/or models? Why or why not?*

We were collectively surprised by how difficult it was to structure the training of a neural network: performance was poor when we tried to classify among so many songs, although we soon understood why. That this became relatively obvious in hindsight taught us the importance of planning extensively before undertaking any ML project. This lesson would transfer to other datasets, models, and projects, because understanding the data, the problem being solved, and how the solution is to be reached are all important steps in successfully developing a machine learning framework to answer the question at hand. Without good project design from the start, it becomes difficult to identify problems with the design and to develop the project further. Our group ran into these problems because we had less time to work on the project after changing our proposal in response to TA recommendations. In the future, we will all be careful to define our problem well before starting a project.

*What was the most helpful feedback you received during your presentation? Why?*

The most helpful feedback we received during our presentation was to combine the two approaches we had identified as deliverables in our proposal and presentation: clustering and neural networks. To do this, we experimented with several neural networks that predict the desired labels and then fed their outputs into a clustering model. Implementing this approach was very informative and helped us understand how different models can interact to improve performance.
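
One way to picture a combined approach like this is to treat a network's output as a point in feature space and look up the nearest cluster. The cell below is a minimal sketch under that assumption, not our actual implementation. The function names, the centroids, the cluster labels, and the `predicted` vector are all hypothetical placeholders.

```python
import numpy as np


def nearest_cluster(predicted_features, centroids):
    """Index of the centroid closest (Euclidean distance) to the predicted vector."""
    distances = np.linalg.norm(centroids - predicted_features, axis=1)
    return int(np.argmin(distances))


def recommend_from_cluster(predicted_features, centroids, cluster_labels, song_ids):
    """Return the songs assigned to the cluster nearest to the prediction."""
    cluster = nearest_cluster(predicted_features, centroids)
    return [song for song, label in zip(song_ids, cluster_labels) if label == cluster]


# Hypothetical values: two centroids in a 2-D feature space and four songs
centroids = np.array([[0.0, 0.0], [1.0, 1.0]])
cluster_labels = [0, 0, 1, 1]
song_ids = ["song_a", "song_b", "song_c", "song_d"]
predicted = np.array([0.9, 0.8])  # stands in for a neural network's output

recommended = recommend_from_cluster(predicted, centroids, cluster_labels, song_ids)
print(recommended)
```

In practice the centroids and labels would come from a fitted clustering model rather than being written by hand.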

*If you had two more weeks to work on this project, what would you do next? Why?*

With two more weeks to work on the project, we would implement further suggestions given during our presentation, such as adding different embedding layers to our neural network model, since the input dataset was sparse. Additionally, we would look into extracting additional metadata about the songs (such as reviews, genre, and tags), since these features capture information beyond the musical properties of the songs themselves, which is all that our current features describe.
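
To illustrate why an embedding layer suits sparse input, the cell below shows in NumPy that multiplying a one-hot vector by a weight matrix is the same as looking up one row of that matrix. This is a sketch of the forward pass only, assuming the sparse input is a one-hot song index, which may not match our actual input encoding. The sizes and the random table are hypothetical. In PyTorch, the lookup corresponds to `torch.nn.Embedding`, whose weights would be learned during training.

```python
import numpy as np

rng = np.random.default_rng(0)
num_songs, embedding_dim = 5, 3  # hypothetical sizes
embedding_table = rng.normal(size=(num_songs, embedding_dim))

song_index = 2
one_hot = np.zeros(num_songs)
one_hot[song_index] = 1.0

# Sparse route: one-hot vector times the weight matrix
dense_via_matmul = one_hot @ embedding_table
# Embedding route: index directly into the table, skipping the multiplications by zero
dense_via_lookup = embedding_table[song_index]

print(np.allclose(dense_via_matmul, dense_via_lookup))
```

The lookup avoids storing or multiplying the mostly-zero input, which matters when the number of songs is large.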