## Project Description

For this project, I will explore three of my personal Spotify playlists, one for each of my three favorite genres of music. In addition, I have created two playlists, one of music I like and one of music I don't, in order to fit a logistic regression and predict whether or not I will like a song. These playlists will be explored using Spotify's API and the Spotipy library in Python. This analysis attempts to answer two main questions:

1. Can I predict the popularity for a song in a given genre based on the audio features of the song?
2. Can I predict whether or not I will like a song in a given genre based on the audio features of the song?

## Data Selection

For this case study, the three genres I will explore are alternative, hard rock, and metalcore. For each genre, I have created a Spotify playlist of between 400 and 650 songs. Using the Spotify API, I can pull audio features for every song in a playlist. These features are quantitative descriptions of a song, such as acousticness, energy, danceability, and instrumentalness. There are 14 audio features in total for each song, and they will be used as the features in a linear regression model. The target variable for the linear regression is popularity, a quantitative measure of how popular a song is.

In addition, for my favorite genre, hard rock, I have created a second playlist of songs from the genre that I do not like. Combined with the hard rock playlist of songs I do like, this data sets up a binary classification problem that can be solved with logistic regression. Songs I like are labeled 1 and songs I do not like are labeled 0, and the model is fit to predict whether or not I will like a hard rock song based on its audio features.
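The labeling scheme can be sketched in a few lines. This is only an illustration, not the project's actual code: `build_like_dataset` and `FEATURE_COLUMNS` are hypothetical names, the column list is an assumption taken from the numeric columns shown in the `head()` output in the EDA section, and each song is assumed to be a plain dictionary keyed by column name.

```python
# Assumed feature columns (numeric columns from the head() output).
FEATURE_COLUMNS = [
    "Duration(m)", "acousticness", "danceability", "energy",
    "instrumentalness", "key", "liveness", "loudness", "mode",
    "speechiness", "tempo", "time_signature", "valence",
]


def build_like_dataset(liked_rows, disliked_rows, feature_columns=FEATURE_COLUMNS):
    """Combine two playlists into a feature matrix X and a 0/1 label vector y.

    liked_rows / disliked_rows: lists of dicts, one dict per song.
    Returns (X, y) where y is 1 for liked songs and 0 for disliked songs.
    """
    X, y = [], []
    for label, rows in ((1, liked_rows), (0, disliked_rows)):
        for row in rows:
            # Keep only the audio features, in a fixed column order.
            X.append([float(row[column]) for column in feature_columns])
            y.append(label)
    return X, y
```

The resulting `X` and `y` are in the shape that scikit-learn estimators expect, so something like `LogisticRegression().fit(X, y)` could be applied to them directly.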

## Data Acquisition

Using the Spotify API through the Python library Spotipy, I can extract the songs from each playlist. I can then request the audio features for each song (Spotipy's `audio_features` call) and store them in a pandas dataframe. Each track also has a popularity metric, which is extracted into the same dataframe and used as the target for linear regression.
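A minimal sketch of that acquisition step is shown here. It is not the contents of `Spotify_audio_features.py`: `playlist_rows`, `chunked`, and `AUDIO_FEATURES` are hypothetical names, and the column names are assumptions copied from the `head()` output in the EDA section. The function expects an authenticated Spotipy client `sp` and relies only on its `playlist_items`, `next`, and `audio_features` methods. Deriving `Duration(m)` from `duration_ms` is also an assumption.

```python
# Audio-feature keys to copy into each row (assumed from the EDA columns).
AUDIO_FEATURES = [
    "acousticness", "danceability", "energy", "instrumentalness", "key",
    "liveness", "loudness", "mode", "speechiness", "tempo",
    "time_signature", "track_href", "valence",
]


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def playlist_rows(sp, playlist_id):
    """Return one dict per track with artist, title, popularity and audio features."""
    # 1) Page through the playlist, 100 items per request.
    tracks = []
    page = sp.playlist_items(playlist_id, limit=100)
    while page:
        for item in page["items"]:
            track = item.get("track")
            if track and track.get("id"):  # skip local or unavailable tracks
                tracks.append(track)
        page = sp.next(page) if page.get("next") else None

    # 2) Request audio features in batches of up to 100 track IDs.
    rows = []
    for batch in chunked(tracks, 100):
        features = sp.audio_features([track["id"] for track in batch])
        for track, feats in zip(batch, features):
            if feats is None:  # no analysis available for this track
                continue
            row = {
                "Artist": track["artists"][0]["name"],
                "Title": track["name"],
                "Duration(m)": track["duration_ms"] / 60000,
                "Popularity": track["popularity"],
            }
            for name in AUDIO_FEATURES:
                row[name] = feats[name]
            rows.append(row)
    return rows
```

With credentials set in the `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` environment variables, a client can be created with `spotipy.Spotify(auth_manager=SpotifyClientCredentials())`, and the returned rows can be turned into a dataframe with `pd.DataFrame(rows)`.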

## Data Cleansing

In [1]:
run Spotify_audio_features.py

This runs a Python script that builds the dataframes for the alternative, hard rock, and metalcore playlists with the desired audio features. Note: try to improve the speed of the script when I get the chance. Luckily, this data contains no NaN values, since the audio features are fully populated for every song in the playlists.
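A quick way to confirm the absence of missing values is sketched here on plain row dictionaries so that it runs without the playlist data. `count_missing` is a hypothetical helper, and the two sample rows reuse a few values from the first two alternative tracks shown in the EDA section. With the dataframes themselves, `alternative.isna().sum()` gives the same per-column counts.

```python
import math


def count_missing(rows, columns):
    """Return {column: number of rows where the value is absent, None or NaN}."""
    counts = {column: 0 for column in columns}
    for row in rows:
        for column in columns:
            value = row.get(column)
            if value is None or (isinstance(value, float) and math.isnan(value)):
                counts[column] += 1
    return counts


rows = [
    {"energy": 0.944, "valence": 0.713, "Popularity": 65},
    {"energy": 0.945, "valence": 0.526, "Popularity": 66},
]
print(count_missing(rows, ["energy", "valence", "Popularity"]))
# {'energy': 0, 'valence': 0, 'Popularity': 0}
```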

## EDA

In [6]:
alternative.head()

| Unnamed: 0 | Artist | Title | Duration(m) | acousticness | danceability | energy | instrumentalness | key | liveness | loudness | mode | speechiness | tempo | time_signature | track_href | valence | Popularity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | The 1975 | Chocolate | 3.744 | 0.0041 | 0.591 | 0.944 | 0.0 | 11 | 0.385 | -4.325 | 1 | 0.0544 | 100.124 | 4 | https://api.spotify.com/v1/tracks/44Ljlpy44mHv... | 0.713 | 65 |
| 1 | The 1975 | The Sound | 4.148 | 0.096 | 0.643 | 0.945 | 8e-06 | 0 | 0.495 | -4.66 | 1 | 0.0779 | 120.723 | 4 | https://api.spotify.com/v1/tracks/316r1KLN0bcm... | 0.526 | 66 |
| 2 | 3 Doors Down | When I'm Gone | 4.341767 | 0.00457 | 0.496 | 0.765 | 0.0 | 7 | 0.104 | -5.66 | 1 | 0.033 | 74.072 | 4 | https://api.spotify.com/v1/tracks/3WbphvawbMZ8... | 0.337 | 67 |
| 3 | 3 Doors Down | Here Without You | 3.976 | 0.0537 | 0.536 | 0.55 | 0.0 | 10 | 0.134 | -6.733 | 0 | 0.0248 | 144.018 | 4 | https://api.spotify.com/v1/tracks/3NLrRZoMF0Lx... | 0.234 | 69 |
| 4 | 3 Doors Down | Still Alive | 2.6991 | 0.000296 | 0.426 | 0.959 | 0.509 | 11 | 0.201 | -4.248 | 0 | 0.056 | 169.948 | 4 | https://api.spotify.com/v1/tracks/3Q7BHIhobJEH... | 0.59 | 41 |


In [3]:
hard_rock.head()

| Unnamed: 0 | Artist | Title | Duration(m) | acousticness | danceability | energy | instrumentalness | key | liveness | loudness | mode | speechiness | tempo | time_signature | track_href | valence | Popularity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 10 Years | Wasteland | 3.8311 | 0.000426 | 0.391 | 0.801 | 0.000208 | 6 | 0.0662 | -5.102 | 0 | 0.0813 | 146.729 | 4 | https://api.spotify.com/v1/tracks/3pO37BXsjMC2... | 0.341 | 63 |
| 1 | 10 Years | Beautiful | 3.264883 | 0.00227 | 0.497 | 0.748 | 0.0 | 6 | 0.136 | -4.5 | 1 | 0.0332 | 131.944 | 4 | https://api.spotify.com/v1/tracks/6AgtIN7FyBd4... | 0.274 | 50 |
| 2 | 10 Years | Shoot It Out | 3.328 | 0.00442 | 0.47 | 0.889 | 2e-06 | 6 | 0.379 | -3.879 | 1 | 0.0747 | 164.111 | 4 | https://api.spotify.com/v1/tracks/6TghWaPh1WHJ... | 0.295 | 48 |
| 3 | 10 Years | Fix Me | 3.597333 | 0.00148 | 0.444 | 0.832 | 0.0 | 1 | 0.0891 | -3.857 | 0 | 0.0343 | 158.007 | 4 | https://api.spotify.com/v1/tracks/60OKW0mZiPFH... | 0.246 | 56 |
| 4 | 12 Stones | Anthem For The Underdog | 3.073767 | 0.000651 | 0.2 | 0.863 | 0.0 | 5 | 0.339 | -3.424 | 1 | 0.0626 | 93.977 | 3 | https://api.spotify.com/v1/tracks/6FFwt1ea9hJ4... | 0.468 | 50 |


In [4]:
metal_core.head()

| Unnamed: 0 | Artist | Title | Duration(m) | acousticness | danceability | energy | instrumentalness | key | liveness | loudness | mode | speechiness | tempo | time_signature | track_href | valence | Popularity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Abandon All Ships | Take One Last Breath | 3.659767 | 0.00115 | 0.313 | 0.961 | 0.0 | 5 | 0.12 | -4.51 | 1 | 0.193 | 138.721 | 4 | https://api.spotify.com/v1/tracks/2tpQV2QzHhvi... | 0.21 | 45 |
| 1 | Adept | Death Dealers | 3.384883 | 0.00205 | 0.46 | 0.871 | 0.144 | 5 | 0.566 | -6.261 | 1 | 0.118 | 91.046 | 4 | https://api.spotify.com/v1/tracks/1oGNZv18BpWp... | 0.211 | 25 |
| 2 | Adept | From the Depths of Hell | 3.942433 | 3.4e-05 | 0.177 | 0.959 | 1.9e-05 | 8 | 0.403 | -4.964 | 1 | 0.0721 | 87.847 | 4 | https://api.spotify.com/v1/tracks/6AAj9C5weV6Z... | 0.308 | 27 |
| 3 | Adept | Hope | 2.830433 | 7e-06 | 0.387 | 0.947 | 0.172 | 8 | 0.141 | -5.014 | 1 | 0.0692 | 97.544 | 4 | https://api.spotify.com/v1/tracks/6jpABYYFMdYq... | 0.213 | 26 |
| 4 | Adept | Shark! Shark! Shark! | 4.323317 | 0.000606 | 0.389 | 0.886 | 0.000526 | 7 | 0.338 | -2.815 | 0 | 0.0816 | 97.798 | 4 | https://api.spotify.com/v1/tracks/5vqLZXjnxaPL... | 0.207 | 39 |
