# Kendrick Lamar Spotify API Report
### 09/29/2024
### Bianca Amoako

### Hypothesis:

### The valence of Kendrick Lamar's album *Mr. Morale & The Big Steppers* is independent of the other track audio features that may affect a person's inclination to listen to the album and, in turn, its popularity.
#### Other audio features like high energy, fast tempo, and higher danceability could be reasons a listener gravitates towards a track or album that brings them a sense of "happiness", affecting the popularity of the track.

#### Lamar released his fifth album *Mr. Morale & The Big Steppers* in 2022. The album follows the story of Kendrick Lamar's therapy journey with many collaborators.
#### Some topics on the album include:
- #### Generational Trauma
- #### Infidelity
- #### Father Issues
#### The album went on to win Best Rap Album at the 2023 Grammy Awards.

<center><img src= "images/kendricks.png" width = "600"></center>


---
### The **Spotify API endpoints** I will be using are the Get Album Tracks, Get Several Tracks' Audio Features, and Get Several Tracks.

#### The [Get Album Tracks](https://developer.spotify.com/documentation/web-api/reference/get-an-albums-tracks "Spotify API Reference") endpoint will allow me to gather all of the Spotify track ids to pass to other endpoints. It also returns the track number which will help reveal the trends of the album from start to end.

#### The [Get Several Tracks' Audio Features](https://developer.spotify.com/documentation/web-api/reference/get-several-audio-features "Spotify API Reference") endpoint will return the audio features, including the valence, of each track on the *Mr. Morale & The Big Steppers* album.
#### Spotify defines valence as: 
> #### *A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).*

#### The [Get Several Tracks](https://developer.spotify.com/documentation/web-api/reference/get-several-tracks "Spotify API Reference") endpoint will return the popularity of each track on the album.
#### Spotify defines popularity as:
> #### The popularity of a track is a value between 0 and 100, with 100 being the most popular. The popularity is calculated by algorithm and is based, in the most part, on the total number of plays the track has had and how recent those plays are.

#### When it comes to the track features, there could be confounding factors that contribute to a track's score. Does a higher pitch increase valence? Do certain instruments affect the features? Are the lyrics taken into account or just the sounds? Lastly, popularity is weighted toward recent plays, and this album came out years ago, so the scores may not reflect how popular the tracks were at release.

---

In [159]:
import json
import requests
import pandas as pd
import base64
import urllib

#### I will be using a function that takes my Spotify developer credentials and returns an access token. The access token authorizes me to make API requests. This process is outlined in [Client Credentials Flow](https://developer.spotify.com/documentation/web-api/tutorials/client-credentials-flow) on Spotify's Web API page.

In [160]:
def get_session_token(ClientID, ClientKey):
    """Exchange Spotify client credentials for a bearer access token."""
    url = "https://accounts.spotify.com/api/token"
    data_ = {"grant_type" : "client_credentials"}
    # HTTP Basic auth: base64-encode "client_id:client_secret"
    encoded_key = base64.b64encode((ClientID + ":" + ClientKey).encode("ascii"))
    headers_ = {"Authorization": "Basic {}".format(encoded_key.decode("ascii"))}

    response = requests.post(url, data = data_, headers = headers_)
    print(response.status_code)  # 200 means the token request succeeded

    return response.json()["access_token"]

In [161]:
client_info = pd.read_csv("/Users/biancaamoako/Data_EMAT/Spotify_API/spotify_client_info.txt")
#client_info

In [162]:
client_id = client_info["Client_ID"].iloc[0]
client_key = client_info["Client_Secret"].iloc[0]
access_token = get_session_token(client_id, client_key)

200


#### With an access token, I can request the album and track data for *Mr. Morale & The Big Steppers* from the three endpoints.
#### I'm using a function that takes an endpoint URL and a header containing the access token I received earlier, and returns the API data.

In [163]:
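def api_request(endpoint_url, token_header):
    """Send a GET request with the bearer-token header and return the parsed JSON."""
    response = requests.get(endpoint_url, headers = token_header)
    print(response.status_code)  # 200 means the request succeeded
    
    return response.json()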

In [164]:
album_tracks_ep = "https://api.spotify.com/v1/albums/79ONNoS4M9tfIA1mYLBYVX/tracks"
track_feats_ep = "https://api.spotify.com/v1/audio-features?ids={}"
several_tracks_ep = "https://api.spotify.com/v1/tracks?market=US&ids={}"
#morale_album_id = "79ONNoS4M9tfIA1mYLBYVX"
session_header = {"Authorization" : "Bearer {}".format(access_token)}

In [165]:
#Get album tracks ep
morale_tracks_response = api_request(album_tracks_ep, session_header)
morale_tracks_response.keys()

200


dict_keys(['href', 'items', 'limit', 'next', 'offset', 'previous', 'total'])
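#### The `next`, `limit`, and `total` keys above suggest that this endpoint returns results in pages. This album fits in a single response, but a longer album might not. Below is a minimal sketch of following the `next` links. The `collect_items` helper and the stand-in pages are hypothetical and not part of the notebook; in the notebook, `fetch_page` could be something like `lambda url: api_request(url, session_header)`.

```python
def collect_items(first_page, fetch_page):
    """Gather every entry under "items" by following "next" links.

    first_page: the parsed JSON of the first response.
    fetch_page: a callable that takes a URL and returns parsed JSON
                (hypothetical; injected so the logic can run offline).
    """
    items = list(first_page["items"])
    next_url = first_page.get("next")
    while next_url:
        page = fetch_page(next_url)
        items.extend(page["items"])
        next_url = page.get("next")
    return items


# Stand-in pages that only mimic the shape of the response (not real data)
fake_pages = {
    "page-2": {"items": [{"id": "c"}], "next": None},
}
first = {"items": [{"id": "a"}, {"id": "b"}], "next": "page-2"}

all_items = collect_items(first, fake_pages.__getitem__)
print([item["id"] for item in all_items])
```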

In [166]:
#morale_tracks_response["items"][0]

In [167]:
morale_tracks_df = pd.DataFrame(morale_tracks_response["items"])
morale_tracks_df.head(3)

Unnamed: 0,artists,available_markets,disc_number,duration_ms,explicit,external_urls,href,id,name,track_number,type,uri,is_local
0,[{'external_urls': {'spotify': 'https://open.s...,"[AR, AU, AT, BE, BO, BR, BG, CA, CL, CO, CR, C...",1,255377,True,{'spotify': 'https://open.spotify.com/track/5G...,https://api.spotify.com/v1/tracks/5Gt9bxniM1Sx...,5Gt9bxniM1SxN61yRzRhXL,United In Grief,1,track,spotify:track:5Gt9bxniM1SxN61yRzRhXL,False
1,[{'external_urls': {'spotify': 'https://open.s...,"[AR, AU, AT, BE, BO, BR, BG, CA, CL, CO, CR, C...",1,195950,True,{'spotify': 'https://open.spotify.com/track/0f...,https://api.spotify.com/v1/tracks/0fX4oNGBWO3d...,0fX4oNGBWO3dSGUZcVdVV2,N95,2,track,spotify:track:0fX4oNGBWO3dSGUZcVdVV2,False
2,[{'external_urls': {'spotify': 'https://open.s...,"[AR, AU, AT, BE, BO, BR, BG, CA, CL, CO, CR, C...",1,203367,True,{'spotify': 'https://open.spotify.com/track/66...,https://api.spotify.com/v1/tracks/66mVPWmFvXPF...,66mVPWmFvXPFf8pjK5ttOW,Worldwide Steppers,3,track,spotify:track:66mVPWmFvXPFf8pjK5ttOW,False


#### The track ids from the Get Album Tracks endpoint will be joined into a single comma-separated string and passed to the Get Several Tracks' Audio Features and Get Several Tracks endpoint URLs.

In [168]:
morale_track_ids = ",".join(morale_tracks_df["id"].to_list())
morale_track_ids

'5Gt9bxniM1SxN61yRzRhXL,0fX4oNGBWO3dSGUZcVdVV2,66mVPWmFvXPFf8pjK5ttOW,2g6tReTlM2Akp41g0HaeXN,28qA8y1sz0FTuSapsCxNOG,15pyFQHCTVp6T7vaWctSgO,1QPreu0BNOrUfEb8HTd2qG,67XC51nlZncNpHmZ8rOU9a,1REVvAphiSTJyKQ1fDpHa4,6BU1RZexmvJcBjgagVVt3M,4zMxWSP2qZUy5CBCH5PdzZ,3lzUeaCbcCDB5IXYfqWRlF,5G4uLkFKdEZLcuNyeomQmE,3drdWsJKiVCSQ2gKhd9BDT,1uY1X8YeBixs1FdQ3fQ7d4,6CmpZfKUQ2KerzBFZ3QKFr,346SJSEbB6pNZMpwovxDiu,5xoYormSTltk6F9SlQV6mm,5qbhVL3vB7HwWvb0042B7y'
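#### One comma-separated string works here because the album only has 19 tracks. My understanding is that the multi-track endpoints cap the number of ids per request (I believe 50 for Get Several Tracks and 100 for audio features, but the API reference should be checked). A minimal sketch of batching ids under such a cap follows; `chunk_ids` and the batch size of 2 are hypothetical and only for illustration.

```python
def chunk_ids(track_ids, max_per_request):
    """Yield comma-separated id strings with at most max_per_request ids each."""
    for start in range(0, len(track_ids), max_per_request):
        yield ",".join(track_ids[start:start + max_per_request])


# The first three track ids from the album, batched two at a time
example_ids = [
    "5Gt9bxniM1SxN61yRzRhXL",
    "0fX4oNGBWO3dSGUZcVdVV2",
    "66mVPWmFvXPFf8pjK5ttOW",
]
batches = list(chunk_ids(example_ids, 2))
for batch in batches:
    print(batch)
```

#### Each batch could then be passed to `track_feats_ep.format(...)` or `several_tracks_ep.format(...)` in turn, and the results combined.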

In [169]:
#Get several tracks' audio features ep
morale_track_feats = api_request(track_feats_ep.format(morale_track_ids), session_header)
morale_track_feats.keys()

200


dict_keys(['audio_features'])

In [170]:
#morale_track_feats["audio_features"]

In [171]:
#audio features data frame
morale_feats_df = pd.DataFrame(morale_track_feats["audio_features"])
morale_feats_df.head(3)

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature
0,0.529,0.845,8,-8.142,1,0.404,0.244,0.0,0.143,0.331,85.63,audio_features,5Gt9bxniM1SxN61yRzRhXL,spotify:track:5Gt9bxniM1SxN61yRzRhXL,https://api.spotify.com/v1/tracks/5Gt9bxniM1Sx...,https://api.spotify.com/v1/audio-analysis/5Gt9...,255378,3
1,0.79,0.67,1,-5.527,1,0.105,0.377,2e-06,0.119,0.408,139.956,audio_features,0fX4oNGBWO3dSGUZcVdVV2,spotify:track:0fX4oNGBWO3dSGUZcVdVV2,https://api.spotify.com/v1/tracks/0fX4oNGBWO3d...,https://api.spotify.com/v1/audio-analysis/0fX4...,195950,4
2,0.514,0.472,8,-11.106,0,0.368,0.753,0.00012,0.0746,0.557,77.215,audio_features,66mVPWmFvXPFf8pjK5ttOW,spotify:track:66mVPWmFvXPFf8pjK5ttOW,https://api.spotify.com/v1/tracks/66mVPWmFvXPF...,https://api.spotify.com/v1/audio-analysis/66mV...,203367,3


In [172]:
#Get several tracks ep (popularity)
morale_tracks = api_request(several_tracks_ep.format(morale_track_ids), session_header)

200


In [173]:
#track popularity data frame
track_pop_df = pd.DataFrame(morale_tracks["tracks"])
track_pop_df.head(3)

Unnamed: 0,album,artists,disc_number,duration_ms,explicit,external_ids,external_urls,href,id,is_local,is_playable,name,popularity,track_number,type,uri
0,"{'album_type': 'album', 'artists': [{'external...",[{'external_urls': {'spotify': 'https://open.s...,1,255377,True,{'isrc': 'USUM72208961'},{'spotify': 'https://open.spotify.com/track/5G...,https://api.spotify.com/v1/tracks/5Gt9bxniM1Sx...,5Gt9bxniM1SxN61yRzRhXL,False,True,United In Grief,74,1,track,spotify:track:5Gt9bxniM1SxN61yRzRhXL
1,"{'album_type': 'album', 'artists': [{'external...",[{'external_urls': {'spotify': 'https://open.s...,1,195950,True,{'isrc': 'USUM72208963'},{'spotify': 'https://open.spotify.com/track/0f...,https://api.spotify.com/v1/tracks/0fX4oNGBWO3d...,0fX4oNGBWO3dSGUZcVdVV2,False,True,N95,76,2,track,spotify:track:0fX4oNGBWO3dSGUZcVdVV2
2,"{'album_type': 'album', 'artists': [{'external...",[{'external_urls': {'spotify': 'https://open.s...,1,203367,True,{'isrc': 'USUM72208960'},{'spotify': 'https://open.spotify.com/track/66...,https://api.spotify.com/v1/tracks/66mVPWmFvXPF...,66mVPWmFvXPFf8pjK5ttOW,False,True,Worldwide Steppers,60,3,track,spotify:track:66mVPWmFvXPFf8pjK5ttOW


#### Now, I am going to combine the data frames through their ids so the titles, features, and popularity are in one data frame. I will also drop excess variables from the joined data frame.

In [174]:
#final data frame
morale_df = pd.merge(morale_tracks_df, morale_feats_df, how = "inner", on = "id")

In [175]:
morale_df = pd.merge(morale_df, track_pop_df[["popularity", "id"]], how = "inner", on = "id")
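#### The inner joins above assume each track id appears exactly once in every data frame. As a sketch of how that assumption could be checked, `pd.merge` accepts a `validate` argument that raises an error when the keys are not unique. The small frames below are made-up stand-ins, not the album data.

```python
import pandas as pd

# Hypothetical stand-in frames with the same "id" key as the notebook's frames
titles = pd.DataFrame({"id": ["a1", "b2"], "name": ["Track A", "Track B"]})
features = pd.DataFrame({"id": ["a1", "b2"], "valence": [0.3, 0.7]})

# validate="one_to_one" checks that "id" is unique on both sides
merged = pd.merge(titles, features, how="inner", on="id", validate="one_to_one")
print(merged)

# A repeated id on the right side makes the same merge raise MergeError
duplicated = pd.concat([features, features.iloc[[0]]], ignore_index=True)
try:
    pd.merge(titles, duplicated, how="inner", on="id", validate="one_to_one")
    duplicate_caught = False
except pd.errors.MergeError:
    duplicate_caught = True
print(duplicate_caught)
```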

In [176]:
morale_df.columns

Index(['artists', 'available_markets', 'disc_number', 'duration_ms_x',
       'explicit', 'external_urls', 'href', 'id', 'name', 'track_number',
       'type_x', 'uri_x', 'is_local', 'danceability', 'energy', 'key',
       'loudness', 'mode', 'speechiness', 'acousticness', 'instrumentalness',
       'liveness', 'valence', 'tempo', 'type_y', 'uri_y', 'track_href',
       'analysis_url', 'duration_ms_y', 'time_signature', 'popularity'],
      dtype='object')

In [177]:
morale_df = morale_df.drop(columns = ["artists", "available_markets", "duration_ms_x", "external_urls", "href", "key",\
                                       "instrumentalness", "type_x", "is_local", "type_y", "uri_y", "uri_x", "track_href",\
                                       "analysis_url", "disc_number", "time_signature", "explicit", "mode"])


In [178]:
morale_df

| | id | name | track_number | danceability | energy | loudness | speechiness | acousticness | liveness | valence | tempo | duration_ms_y | popularity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5Gt9bxniM1SxN61yRzRhXL | United In Grief | 1 | 0.529 | 0.845 | -8.142 | 0.404 | 0.244 | 0.143 | 0.331 | 85.63 | 255378 | 74 |
| 1 | 0fX4oNGBWO3dSGUZcVdVV2 | N95 | 2 | 0.79 | 0.67 | -5.527 | 0.105 | 0.377 | 0.119 | 0.408 | 139.956 | 195950 | 76 |
| 2 | 66mVPWmFvXPFf8pjK5ttOW | Worldwide Steppers | 3 | 0.514 | 0.472 | -11.106 | 0.368 | 0.753 | 0.0746 | 0.557 | 77.215 | 203367 | 60 |
| 3 | 2g6tReTlM2Akp41g0HaeXN | Die Hard | 4 | 0.775 | 0.736 | -8.072 | 0.247 | 0.319 | 0.127 | 0.362 | 100.988 | 239027 | 73 |
| 4 | 28qA8y1sz0FTuSapsCxNOG | Father Time (feat. Sampha) | 5 | 0.514 | 0.779 | -4.365 | 0.344 | 0.181 | 0.099 | 0.517 | 152.869 | 222497 | 69 |
| 5 | 15pyFQHCTVp6T7vaWctSgO | Rich - Interlude | 6 | 0.481 | 0.406 | -11.52 | 0.0775 | 0.864 | 0.0869 | 0.741 | 91.589 | 103319 | 56 |
| 6 | 1QPreu0BNOrUfEb8HTd2qG | Rich Spirit | 7 | 0.852 | 0.421 | -9.153 | 0.208 | 0.428 | 0.106 | 0.457 | 95.977 | 202285 | 74 |
| 7 | 67XC51nlZncNpHmZ8rOU9a | We Cry Together | 8 | 0.648 | 0.68 | -7.276 | 0.345 | 0.292 | 0.0808 | 0.504 | 106.89 | 341307 | 62 |
| 8 | 1REVvAphiSTJyKQ1fDpHa4 | Purple Hearts | 9 | 0.567 | 0.824 | -6.973 | 0.296 | 0.167 | 0.149 | 0.737 | 138.202 | 329295 | 63 |
| 9 | 6BU1RZexmvJcBjgagVVt3M | Count Me Out | 1 | 0.776 | 0.431 | -7.544 | 0.091 | 0.671 | 0.153 | 0.495 | 133.999 | 283642 | 75 |


----

### **Conclusions from tidy data**
#### Using the data frame above, I can see that the popularity scores of all the tracks fall between roughly 56 and 76. Since no track sits far outside that range, each track should be taken into account when comparing the audio features.
#### The track with the lowest valence, *Crown*, has a fairly low popularity score of 58, but the track with the highest valence, *Savior - Interlude*, has an even lower score of 57. So, valence alone does not determine popularity.
#### The other features vary from track to track. Tempo starts at about 77 BPM and goes up from there. The data frame also makes it possible to spot outliers among the tracks.
#### Graphs can be made from this data frame to show simple relationships between different audio features and popularity. From those graphs, I can look further into certain features in a more holistic way. I could also look more into the lyrics and their sentiment.
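
#### Before graphing, one rough way to put a number on the relationship between valence and popularity is a Pearson correlation. The sketch below uses only the ten rows displayed in the data frame output above, not the full album, so it is an illustration and not a finding. The `pearson` helper is hypothetical; inside the notebook, something like `morale_df[["valence", "popularity"]].corr()` should give the same statistic for every track.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length number lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Valence and popularity for the ten tracks shown in the data frame output
valence = [0.331, 0.408, 0.557, 0.362, 0.517, 0.741, 0.457, 0.504, 0.737, 0.495]
popularity = [74, 76, 60, 73, 69, 56, 74, 62, 63, 75]

r = pearson(valence, popularity)
print(round(r, 2))
```

#### A value near 0 would be consistent with the hypothesis, while a value near -1 or 1 would point to a linear relationship. With so few tracks, the result should be read with caution.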