## __Leah Gerke__
### _API Data Report_
### _10/11/2024_
__State a hypothesis that could be tested using the data available from the Spotify API.__
- The danceability of a song correlates directly with its energy and loudness.

__Explain the theoretical and the statistical applications of your hypothesis.__
- Theoretical: The louder a song is and the more energy it has, the more danceable it is.
- Statistical: A song scores from level 0 (lowest) to level 1 (highest). When a song scores .5 or higher on both energy _and_ loudness, its danceability score should also be .5 or higher, indicating the song is suitable to dance to.


__Identify and describe the Spotify API endpoints you will use to collect data. Explain why these endpoints (and which of the response objects) are suitable to test your hypothesis.__
- I will need the "Get Track's Audio Features" endpoint that is under the "Tracks" reference to get the danceability, loudness, and energy data.
- My hypothesis relies solely on the data returned in these three fields.
- I will be using Billie Eilish's latest album, "Hit Me Hard and Soft". The album has a variety of song types, beats, and overall vibes, so I hope it will give me a good set of data for determining whether my hypothesis is correct.

__Discuss the ways in which this data might be reliable and unreliable.__
- The danceability level is determined by "a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity", and the data would be reliable in that those elements are calculated rather than judged. On the other hand, this data could be deemed unreliable because it cannot take human opinion into account. Even if a song is deemed danceable based on the calculated elements, a person may disagree and feel that the song cannot be danced to.

__Are there any limitations or caveats to the response objects that might alter your ability to test your hypothesis?__
- Yes, the data I find may be limited or nonexistent depending on which song, genre, or artist I choose to test.

In [1]:
import urllib.parse  # quote_plus lives in the urllib.parse submodule
import requests
import pandas as pd
import json
import base64

In [2]:
text = 'billie :%&= eilish'
urllib.parse.quote_plus(text)

'billie+%3A%25%26%3D+eilish'

In [3]:
query = 'artist:Eilish genre:pop tag:new'
url_query = urllib.parse.quote_plus(query)

In [4]:
def get_sesh_token(SeshID, SeshKey):
    # Client-credentials flow: exchange the app's ID and secret for a bearer token.
    url = 'https://accounts.spotify.com/api/token'
    data = {'grant_type':'client_credentials'}
    # The Basic auth header carries base64("client_id:client_secret").
    encoded_key = base64.b64encode((SeshID + ":" + SeshKey).encode("ascii"))
    header = {'Authorization': 'Basic {}'.format(encoded_key.decode("ascii"))}
    response = requests.post(url,
                            data = data,
                            headers = header)
    print(response.status_code)
    return response.json()['access_token']

In [5]:
spot_keys = pd.read_csv(r"C:\Users\lgerk\data-emat_fa24\Spotify_Keys_9-12-23.txt")

In [6]:
spot_keys

Unnamed: 0,Client_ID,Client_Secret
0,dc5c42b67fc246c6a3b12d5744f72fa6,98da17e3bb654706909793a442643edf


In [7]:
access_token = get_sesh_token(spot_keys['Client_ID'].iloc[0], 
spot_keys['Client_Secret'].iloc[0])

200


In [8]:
access_token

'BQBYfQ4yQsm1Kh2-nYeQgHhup3nG-zee4JT3Y1SZGSqhUOKoSZ9ujhVOZH70iTYhzkd4DSSyqFtlQuiN4Ls_9_2TaLGoBWxL0UJ02WJYbOop9W1_ELA'

In [9]:
aud_feats_ep = 'https://api.spotify.com/v1/audio-features'
trcks_ep = 'https://api.spotify.com/v1/tracks'
ab_trcks_ep = 'https://api.spotify.com/v1/albums/{}/tracks'
#11dFghVXANMlKmJXsNCbNl
#t_features_ep = ''
#tracks_ep = 'https://api.spotify.com/v1/tracks'
#ab_tracks_ep = 'https://api.spotify.com/v1/albums/{}/tracks'
#https://api.spotify.com/v1/audio-features?ids={}
#https://api.spotify.com/v1/audio-features

___I originally defined "aud_feats_ep" with this URL: https://api.spotify.com/v1/audio-features/{}. Everything worked until I tried to actually isolate the audio features. After much trial and error adjusting names and definitions, I could not figure out why it was giving me a 400 status code. So I asked ChatGPT what was wrong, and it suggested I double-check the "aud_feats_ep" URL and add "?ids={}" to the end of it. Sure enough, after I did that, the status code changed to 200. However, when I looked at the DataFrame using ".head()", nothing I could use showed up. I knew we had built this exact DataFrame in class, so I went back to the notes from 10-1-24 and saw that the URL we used for audio features didn't have ANYTHING on the end of it: https://api.spotify.com/v1/audio-features. So THAT is what "fixed" my DataFrame. That's why there are commented-out bits of code underneath my URL definitions.___
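For reference, the Spotify Web API documentation lists two shapes for the audio-features URL: a single track ID in the path, or several comma-separated track IDs in an `ids` query parameter. A minimal sketch of building both follows. The helper names `single_track_url` and `several_tracks_url` are hypothetical and were not part of the class notes.

```python
from urllib.parse import urlencode

# Base URL with no placeholder, matching the definition used in the notebook.
AUDIO_FEATURES_BASE = 'https://api.spotify.com/v1/audio-features'

def single_track_url(track_id):
    """One track: the ID goes in the path."""
    return '{}/{}'.format(AUDIO_FEATURES_BASE, track_id)

def several_tracks_url(track_ids):
    """Several tracks: comma-separated IDs go in the ids query parameter."""
    # safe=',' keeps the commas readable instead of encoding them as %2C.
    query = urlencode({'ids': ','.join(track_ids)}, safe=',')
    return '{}?{}'.format(AUDIO_FEATURES_BASE, query)

print(single_track_url('1CsMKhwEmNnmvHUuO5nryA'))
print(several_tracks_url(['1CsMKhwEmNnmvHUuO5nryA', '629DixmZGHc7ILtEntuiWE']))
```

Keeping the base URL free of placeholders means either form can be appended at call time.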

In [10]:
def api_call(endpoint_url, api_header):
    response = requests.get(endpoint_url, headers = api_header)
    print(response.status_code)
    return response.json()

In [11]:
sesh_header = {'Authorization': 'Bearer {}'.format(access_token)}

In [12]:
be_alb_id = "7aJuG4TFXa2hmE4z1yxc3n"

In [13]:
be_trck_feats = api_call(aud_feats_ep + '?ids={}'.format(be_alb_id),
                             sesh_header)

200


In [14]:
be_feats_df = pd.DataFrame(be_trck_feats['audio_features'])
be_feats_df.head()

Unnamed: 0,0
0,


___I attempted to gather data on _only_ the audio features first, without creating the album DataFrame. But I believe I needed to obtain the album's contents before I could request the individual audio features. My hypothesis looks at the different songs on the same album, and I passed the album ID where the endpoint expects track IDs, so there were no audio features to look at: the code didn't know which songs I wanted audio features for in the first place (at least, doing it the way I did it).___
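The one-column, one-row DataFrame above is what pandas produces from a list holding a single `None`. That suggests the API returned an empty entry for an ID it could not match to a track (an album ID in this case). Here is a small sketch that uses a hand-written payload rather than a live response, with a hypothetical helper `usable_features`:

```python
import pandas as pd

# Hand-written stand-in for a response; the real payload was not saved in this notebook.
sample_payload = {'audio_features': [None]}

# A list containing only None becomes a 1x1 frame with a single column named 0.
empty_looking_df = pd.DataFrame(sample_payload['audio_features'])
print(empty_looking_df.shape)

def usable_features(payload):
    """Return only the entries that actually contain audio features."""
    return [entry for entry in payload.get('audio_features', []) if entry is not None]

# A mixed payload: one unmatched ID and one real track ID from the album.
mixed_payload = {
    'audio_features': [None, {'id': '1CsMKhwEmNnmvHUuO5nryA', 'danceability': 0.251}]
}
print(len(usable_features(mixed_payload)))
```

Filtering like this before building the DataFrame would make an unmatched ID show up as a missing row instead of a puzzling blank table.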

In [15]:
aud_feats_ep.format(be_alb_id)

'https://api.spotify.com/v1/audio-features'

In [16]:
be_alb_response = api_call(ab_trcks_ep.format(be_alb_id), sesh_header)

200


In [17]:
be_alb_df = pd.DataFrame(be_alb_response['items'])
be_alb_df.head()

Unnamed: 0,artists,available_markets,disc_number,duration_ms,explicit,external_urls,href,id,name,track_number,type,uri,is_local
0,[{'external_urls': {'spotify': 'https://open.s...,"[AR, AU, AT, BE, BO, BR, BG, CA, CL, CO, CR, C...",1,219733,False,{'spotify': 'https://open.spotify.com/track/1C...,https://api.spotify.com/v1/tracks/1CsMKhwEmNnm...,1CsMKhwEmNnmvHUuO5nryA,SKINNY,1,track,spotify:track:1CsMKhwEmNnmvHUuO5nryA,False
1,[{'external_urls': {'spotify': 'https://open.s...,"[AR, AU, AT, BE, BO, BR, BG, CA, CL, CO, CR, C...",1,179586,False,{'spotify': 'https://open.spotify.com/track/62...,https://api.spotify.com/v1/tracks/629DixmZGHc7...,629DixmZGHc7ILtEntuiWE,LUNCH,2,track,spotify:track:629DixmZGHc7ILtEntuiWE,False
2,[{'external_urls': {'spotify': 'https://open.s...,"[AR, AU, AT, BE, BO, BR, BG, CA, CL, CO, CR, C...",1,303440,False,{'spotify': 'https://open.spotify.com/track/7B...,https://api.spotify.com/v1/tracks/7BRD7x5pt8Lq...,7BRD7x5pt8Lqa1eGYC4dzj,CHIHIRO,3,track,spotify:track:7BRD7x5pt8Lqa1eGYC4dzj,False
3,[{'external_urls': {'spotify': 'https://open.s...,"[AR, AU, AT, BE, BO, BR, BG, CA, CL, CO, CR, C...",1,210373,False,{'spotify': 'https://open.spotify.com/track/6d...,https://api.spotify.com/v1/tracks/6dOtVTDdiauQ...,6dOtVTDdiauQNBQEDOtlAB,BIRDS OF A FEATHER,4,track,spotify:track:6dOtVTDdiauQNBQEDOtlAB,False
4,[{'external_urls': {'spotify': 'https://open.s...,"[AR, AU, AT, BE, BO, BR, BG, CA, CL, CO, CR, C...",1,261466,False,{'spotify': 'https://open.spotify.com/track/3Q...,https://api.spotify.com/v1/tracks/3QaPy1KgI7nu...,3QaPy1KgI7nu9FJEQUgn6h,WILDFLOWER,5,track,spotify:track:3QaPy1KgI7nu9FJEQUgn6h,False


In [18]:
be_trck_ids = ','.join(be_alb_df['id'].to_list())

In [19]:
aud_feats_ep + '?ids={}'.format(be_trck_ids)

'https://api.spotify.com/v1/audio-features?ids=1CsMKhwEmNnmvHUuO5nryA,629DixmZGHc7ILtEntuiWE,7BRD7x5pt8Lqa1eGYC4dzj,6dOtVTDdiauQNBQEDOtlAB,3QaPy1KgI7nu9FJEQUgn6h,6TGd66r0nlPaYm3KIoI7ET,6fPan2saHdFaIHuTSatORv,1LLUoftvmTjVNBHZoQyveF,7DpUoxGSdlDHfqCYj0otzU,2prqm9sPLj10B4Wg0wE5x9'

In [20]:
be_trck_feats = api_call(aud_feats_ep + '?ids={}'.format(be_trck_ids),
                            sesh_header)

200


In [21]:
be_trck_info = api_call(trcks_ep + '?market=US&ids={}'.format(be_trck_ids),
                            sesh_header)

200


In [22]:
be_feats_df = pd.DataFrame(be_trck_feats['audio_features'])
be_feats_df.head()

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature
0,0.251,0.252,9,-14.478,1,0.0375,0.693,0.00706,0.0968,0.0395,69.988,audio_features,1CsMKhwEmNnmvHUuO5nryA,spotify:track:1CsMKhwEmNnmvHUuO5nryA,https://api.spotify.com/v1/tracks/1CsMKhwEmNnm...,https://api.spotify.com/v1/audio-analysis/1CsM...,219733,4
1,0.893,0.4,11,-7.981,0,0.0643,0.0452,0.0823,0.0632,0.945,124.987,audio_features,629DixmZGHc7ILtEntuiWE,spotify:track:629DixmZGHc7ILtEntuiWE,https://api.spotify.com/v1/tracks/629DixmZGHc7...,https://api.spotify.com/v1/audio-analysis/629D...,179587,4
2,0.7,0.425,7,-12.531,1,0.0529,0.144,0.879,0.083,0.521,110.015,audio_features,7BRD7x5pt8Lqa1eGYC4dzj,spotify:track:7BRD7x5pt8Lqa1eGYC4dzj,https://api.spotify.com/v1/tracks/7BRD7x5pt8Lq...,https://api.spotify.com/v1/audio-analysis/7BRD...,303440,4
3,0.747,0.507,2,-10.171,1,0.0358,0.2,0.0608,0.117,0.438,104.978,audio_features,6dOtVTDdiauQNBQEDOtlAB,spotify:track:6dOtVTDdiauQNBQEDOtlAB,https://api.spotify.com/v1/tracks/6dOtVTDdiauQ...,https://api.spotify.com/v1/audio-analysis/6dOt...,210373,4
4,0.467,0.247,6,-12.002,0,0.0431,0.612,0.000271,0.17,0.126,148.101,audio_features,3QaPy1KgI7nu9FJEQUgn6h,spotify:track:3QaPy1KgI7nu9FJEQUgn6h,https://api.spotify.com/v1/tracks/3QaPy1KgI7nu...,https://api.spotify.com/v1/audio-analysis/3QaP...,261467,4


___This is my audio features DataFrame, showing the danceability, energy, and loudness. I really want to crunch it down to just those three features later on, when I have more time to play around with this.___
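One possible way to narrow the frame to the three features is plain column selection. The sketch below rebuilds the first three rows of the output above by hand so it runs on its own. The names `sample_feats_df`, `wanted_cols`, and `slim_feats_df` are illustrative, and the `id` column is kept on the assumption that it will still be needed for merging.

```python
import pandas as pd

# First three rows copied from the output above; only a few columns are reproduced.
sample_feats_df = pd.DataFrame({
    'id': ['1CsMKhwEmNnmvHUuO5nryA', '629DixmZGHc7ILtEntuiWE', '7BRD7x5pt8Lqa1eGYC4dzj'],
    'danceability': [0.251, 0.893, 0.700],
    'energy': [0.252, 0.400, 0.425],
    'loudness': [-14.478, -7.981, -12.531],
    'tempo': [69.988, 124.987, 110.015],
})

# Keep the track ID plus the three features the hypothesis is about.
wanted_cols = ['id', 'danceability', 'energy', 'loudness']
slim_feats_df = sample_feats_df[wanted_cols].copy()
print(slim_feats_df)
```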

In [23]:
be_trcks_df = pd.DataFrame(be_trck_info['tracks'])
be_trcks_df.head()

Unnamed: 0,album,artists,disc_number,duration_ms,explicit,external_ids,external_urls,href,id,is_local,is_playable,name,popularity,track_number,type,uri
0,"{'album_type': 'album', 'artists': [{'external...",[{'external_urls': {'spotify': 'https://open.s...,1,219733,False,{'isrc': 'USUM72401995'},{'spotify': 'https://open.spotify.com/track/1C...,https://api.spotify.com/v1/tracks/1CsMKhwEmNnm...,1CsMKhwEmNnmvHUuO5nryA,False,True,SKINNY,80,1,track,spotify:track:1CsMKhwEmNnmvHUuO5nryA
1,"{'album_type': 'album', 'artists': [{'external...",[{'external_urls': {'spotify': 'https://open.s...,1,179586,False,{'isrc': 'USUM72401991'},{'spotify': 'https://open.spotify.com/track/62...,https://api.spotify.com/v1/tracks/629DixmZGHc7...,629DixmZGHc7ILtEntuiWE,False,True,LUNCH,88,2,track,spotify:track:629DixmZGHc7ILtEntuiWE
2,"{'album_type': 'album', 'artists': [{'external...",[{'external_urls': {'spotify': 'https://open.s...,1,303440,False,{'isrc': 'USUM72401988'},{'spotify': 'https://open.spotify.com/track/7B...,https://api.spotify.com/v1/tracks/7BRD7x5pt8Lq...,7BRD7x5pt8Lqa1eGYC4dzj,False,True,CHIHIRO,88,3,track,spotify:track:7BRD7x5pt8Lqa1eGYC4dzj
3,"{'album_type': 'album', 'artists': [{'external...",[{'external_urls': {'spotify': 'https://open.s...,1,210373,False,{'isrc': 'USUM72401994'},{'spotify': 'https://open.spotify.com/track/6d...,https://api.spotify.com/v1/tracks/6dOtVTDdiauQ...,6dOtVTDdiauQNBQEDOtlAB,False,True,BIRDS OF A FEATHER,98,4,track,spotify:track:6dOtVTDdiauQNBQEDOtlAB
4,"{'album_type': 'album', 'artists': [{'external...",[{'external_urls': {'spotify': 'https://open.s...,1,261466,False,{'isrc': 'USUM72401993'},{'spotify': 'https://open.spotify.com/track/3Q...,https://api.spotify.com/v1/tracks/3QaPy1KgI7nu...,3QaPy1KgI7nu9FJEQUgn6h,False,True,WILDFLOWER,91,5,track,spotify:track:3QaPy1KgI7nu9FJEQUgn6h


___This is the DataFrame that actually shows the names of the songs in the album, which is one of the main things I need for my hypothesis so I can identify which songs are which.___

In [24]:
be_merged = pd.merge(be_trcks_df, be_feats_df,how = 'inner', on = 'id')
be_merged.head()

Unnamed: 0,album,artists,disc_number,duration_ms_x,explicit,external_ids,external_urls,href,id,is_local,...,instrumentalness,liveness,valence,tempo,type_y,uri_y,track_href,analysis_url,duration_ms_y,time_signature
0,"{'album_type': 'album', 'artists': [{'external...",[{'external_urls': {'spotify': 'https://open.s...,1,219733,False,{'isrc': 'USUM72401995'},{'spotify': 'https://open.spotify.com/track/1C...,https://api.spotify.com/v1/tracks/1CsMKhwEmNnm...,1CsMKhwEmNnmvHUuO5nryA,False,...,0.00706,0.0968,0.0395,69.988,audio_features,spotify:track:1CsMKhwEmNnmvHUuO5nryA,https://api.spotify.com/v1/tracks/1CsMKhwEmNnm...,https://api.spotify.com/v1/audio-analysis/1CsM...,219733,4
1,"{'album_type': 'album', 'artists': [{'external...",[{'external_urls': {'spotify': 'https://open.s...,1,179586,False,{'isrc': 'USUM72401991'},{'spotify': 'https://open.spotify.com/track/62...,https://api.spotify.com/v1/tracks/629DixmZGHc7...,629DixmZGHc7ILtEntuiWE,False,...,0.0823,0.0632,0.945,124.987,audio_features,spotify:track:629DixmZGHc7ILtEntuiWE,https://api.spotify.com/v1/tracks/629DixmZGHc7...,https://api.spotify.com/v1/audio-analysis/629D...,179587,4
2,"{'album_type': 'album', 'artists': [{'external...",[{'external_urls': {'spotify': 'https://open.s...,1,303440,False,{'isrc': 'USUM72401988'},{'spotify': 'https://open.spotify.com/track/7B...,https://api.spotify.com/v1/tracks/7BRD7x5pt8Lq...,7BRD7x5pt8Lqa1eGYC4dzj,False,...,0.879,0.083,0.521,110.015,audio_features,spotify:track:7BRD7x5pt8Lqa1eGYC4dzj,https://api.spotify.com/v1/tracks/7BRD7x5pt8Lq...,https://api.spotify.com/v1/audio-analysis/7BRD...,303440,4
3,"{'album_type': 'album', 'artists': [{'external...",[{'external_urls': {'spotify': 'https://open.s...,1,210373,False,{'isrc': 'USUM72401994'},{'spotify': 'https://open.spotify.com/track/6d...,https://api.spotify.com/v1/tracks/6dOtVTDdiauQ...,6dOtVTDdiauQNBQEDOtlAB,False,...,0.0608,0.117,0.438,104.978,audio_features,spotify:track:6dOtVTDdiauQNBQEDOtlAB,https://api.spotify.com/v1/tracks/6dOtVTDdiauQ...,https://api.spotify.com/v1/audio-analysis/6dOt...,210373,4
4,"{'album_type': 'album', 'artists': [{'external...",[{'external_urls': {'spotify': 'https://open.s...,1,261466,False,{'isrc': 'USUM72401993'},{'spotify': 'https://open.spotify.com/track/3Q...,https://api.spotify.com/v1/tracks/3QaPy1KgI7nu...,3QaPy1KgI7nu9FJEQUgn6h,False,...,0.000271,0.17,0.126,148.101,audio_features,spotify:track:3QaPy1KgI7nu9FJEQUgn6h,https://api.spotify.com/v1/tracks/3QaPy1KgI7nu...,https://api.spotify.com/v1/audio-analysis/3QaP...,261467,4


___I then attempted to merge the tracks and audio features data sets, but I had them in the wrong order in the merge call and they weren't displaying how I wanted. I changed the order to the one in the cell below and was then able to see the danceability, energy, and loudness levels while still seeing all 10 songs on the album, so I can easily match each song with its audio features.___

In [25]:
be_merged = pd.merge(be_feats_df, be_trcks_df,how = 'inner', on = 'id')
be_merged.head(10)

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,...,external_ids,external_urls,href,is_local,is_playable,name,popularity,track_number,type_y,uri_y
0,0.251,0.252,9,-14.478,1,0.0375,0.693,0.00706,0.0968,0.0395,...,{'isrc': 'USUM72401995'},{'spotify': 'https://open.spotify.com/track/1C...,https://api.spotify.com/v1/tracks/1CsMKhwEmNnm...,False,True,SKINNY,80,1,track,spotify:track:1CsMKhwEmNnmvHUuO5nryA
1,0.893,0.4,11,-7.981,0,0.0643,0.0452,0.0823,0.0632,0.945,...,{'isrc': 'USUM72401991'},{'spotify': 'https://open.spotify.com/track/62...,https://api.spotify.com/v1/tracks/629DixmZGHc7...,False,True,LUNCH,88,2,track,spotify:track:629DixmZGHc7ILtEntuiWE
2,0.7,0.425,7,-12.531,1,0.0529,0.144,0.879,0.083,0.521,...,{'isrc': 'USUM72401988'},{'spotify': 'https://open.spotify.com/track/7B...,https://api.spotify.com/v1/tracks/7BRD7x5pt8Lq...,False,True,CHIHIRO,88,3,track,spotify:track:7BRD7x5pt8Lqa1eGYC4dzj
3,0.747,0.507,2,-10.171,1,0.0358,0.2,0.0608,0.117,0.438,...,{'isrc': 'USUM72401994'},{'spotify': 'https://open.spotify.com/track/6d...,https://api.spotify.com/v1/tracks/6dOtVTDdiauQ...,False,True,BIRDS OF A FEATHER,98,4,track,spotify:track:6dOtVTDdiauQNBQEDOtlAB
4,0.467,0.247,6,-12.002,0,0.0431,0.612,0.000271,0.17,0.126,...,{'isrc': 'USUM72401993'},{'spotify': 'https://open.spotify.com/track/3Q...,https://api.spotify.com/v1/tracks/3QaPy1KgI7nu...,False,True,WILDFLOWER,91,5,track,spotify:track:3QaPy1KgI7nu9FJEQUgn6h
5,0.407,0.192,7,-10.99,1,0.0368,0.637,3e-06,0.21,0.159,...,{'isrc': 'USUM72401992'},{'spotify': 'https://open.spotify.com/track/6T...,https://api.spotify.com/v1/tracks/6TGd66r0nlPa...,False,True,THE GREATEST,82,6,track,spotify:track:6TGd66r0nlPaYm3KIoI7ET
6,0.467,0.392,9,-9.355,1,0.0908,0.2,0.0174,0.106,0.313,...,{'isrc': 'USUM72401990'},{'spotify': 'https://open.spotify.com/track/6f...,https://api.spotify.com/v1/tracks/6fPan2saHdFa...,False,True,L’AMOUR DE MA VIE,84,7,track,spotify:track:6fPan2saHdFaIHuTSatORv
7,0.857,0.386,1,-9.761,1,0.168,0.243,0.0931,0.111,0.661,...,{'isrc': 'USUM72401989'},{'spotify': 'https://open.spotify.com/track/1L...,https://api.spotify.com/v1/tracks/1LLUoftvmTjV...,False,True,THE DINER,80,8,track,spotify:track:1LLUoftvmTjVNBHZoQyveF
8,0.521,0.254,9,-14.409,0,0.0399,0.815,0.884,0.114,0.153,...,{'isrc': 'USUM72401987'},{'spotify': 'https://open.spotify.com/track/7D...,https://api.spotify.com/v1/tracks/7DpUoxGSdlDH...,False,True,BITTERSUITE,79,9,track,spotify:track:7DpUoxGSdlDHfqCYj0otzU
9,0.349,0.337,7,-10.671,1,0.0407,0.29,0.172,0.139,0.0365,...,{'isrc': 'USUM72401996'},{'spotify': 'https://open.spotify.com/track/2p...,https://api.spotify.com/v1/tracks/2prqm9sPLj10...,False,True,BLUE,84,10,track,spotify:track:2prqm9sPLj10B4Wg0wE5x9
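The effect of merge order can be seen on a two-row sample: `pd.merge` lists the left frame's columns first, and column names shared by both frames (other than the key) get `_x` and `_y` suffixes. The sketch below uses hand-copied values and illustrative frame names. It also shows that selecting columns explicitly is one way to make the display independent of the order.

```python
import pandas as pd

# Two rows copied by hand from the outputs above, with a reduced set of columns.
sample_feats_df = pd.DataFrame({
    'id': ['1CsMKhwEmNnmvHUuO5nryA', '629DixmZGHc7ILtEntuiWE'],
    'danceability': [0.251, 0.893],
    'energy': [0.252, 0.400],
    'loudness': [-14.478, -7.981],
    'type': ['audio_features', 'audio_features'],
})
sample_trcks_df = pd.DataFrame({
    'id': ['1CsMKhwEmNnmvHUuO5nryA', '629DixmZGHc7ILtEntuiWE'],
    'name': ['SKINNY', 'LUNCH'],
    'type': ['track', 'track'],
})

# Left frame's columns come first; the shared 'type' column is suffixed _x (left) and _y (right).
feats_first = pd.merge(sample_feats_df, sample_trcks_df, how='inner', on='id')
trcks_first = pd.merge(sample_trcks_df, sample_feats_df, how='inner', on='id')
print(list(feats_first.columns))
print(list(trcks_first.columns))

# Choosing columns by name gives the same view whichever frame was on the left.
readable = feats_first[['name', 'danceability', 'energy', 'loudness']]
print(readable)
```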


___Overall, I felt like I was relying very heavily on the in-class notes to do this. But really, I only resorted to that after attempting to do it my own way multiple times. Through that process I got a better grasp of how this type of data works. I really wanted to forgo creating the album DataFrame, and that was my main struggle in this assignment, but it was also the thing that taught me the most, because I had to understand that I needed to create it if I wanted my data to be analyzable for my hypothesis. Another idea I want to explore is converting this DataFrame into some kind of chart so I can clearly see the correlation between danceability, energy, and loudness. Also, I did not realize loudness levels were negative, so I might need to redefine "loudness" in my hypothesis and figure out its correlation with energy and danceability (e.g. which values mean louder and which mean quieter, and whether I intend to claim that a louder song is more danceable...).___
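Spotify's documentation describes loudness in decibels (dB), typically ranging from -60 to 0, so values closer to 0 are louder. One possible next step is sketched below, using the ten rows from the merged output above. It computes Pearson correlations between the three features and rescales loudness to a 0-to-1 range. The min-max rescaling within this one album is an assumption made for illustration, not a definition taken from Spotify, and ten tracks is a very small sample for judging a correlation.

```python
import pandas as pd

# Values copied by hand from the merged DataFrame output above.
album_df = pd.DataFrame({
    'name': ['SKINNY', 'LUNCH', 'CHIHIRO', 'BIRDS OF A FEATHER', 'WILDFLOWER',
             'THE GREATEST', 'L’AMOUR DE MA VIE', 'THE DINER', 'BITTERSUITE', 'BLUE'],
    'danceability': [0.251, 0.893, 0.700, 0.747, 0.467, 0.407, 0.467, 0.857, 0.521, 0.349],
    'energy': [0.252, 0.400, 0.425, 0.507, 0.247, 0.192, 0.392, 0.386, 0.254, 0.337],
    'loudness': [-14.478, -7.981, -12.531, -10.171, -12.002,
                 -10.990, -9.355, -9.761, -14.409, -10.671],
})

# Pearson correlation between each pair of features (each value lies between -1 and 1).
corr = album_df[['danceability', 'energy', 'loudness']].corr()
print(corr.round(2))

# Assumption: min-max rescaling, so the quietest track on the album is 0 and the loudest is 1.
quietest = album_df['loudness'].min()
loudest = album_df['loudness'].max()
album_df['loudness_scaled'] = (album_df['loudness'] - quietest) / (loudest - quietest)

# Tracks that meet the .5 thresholds on energy and (rescaled) loudness from the hypothesis.
meets_rule = album_df[(album_df['energy'] >= 0.5) & (album_df['loudness_scaled'] >= 0.5)]
print(meets_rule[['name', 'danceability']])
```

The `album_df` frame could also be handed to a plotting library to draw the chart mentioned above.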