# Using public REST APIs to gather data

## Recap / introduction to REST APIs



* API - Application programming interface, or a way to interact with a specific program

* REST - stands for representational state transfer architectural style, this is a set of contraints rather than a standard

With REST APIs there are four main ways of interacting with them

* GET - gets a record

* POST - creates a record

* PUT - updates a record

* DELETE - deletes a record

This follows the CRUD style of database management and is performed via HTTP request

In [1]:
import requests
import pandas as pd

In [2]:
reqresURL = 'https://reqres.in/api/users/'
query = {'page' : '1'}
response = requests.get(reqresURL, query)


In [3]:
df = pd.json_normalize(response.json())
df.head()

Unnamed: 0,page,per_page,total,total_pages,data,support.url,support.text
0,1,6,12,2,"[{'id': 1, 'email': 'george.bluth@reqres.in', ...",https://reqres.in/#support-heading,"To keep ReqRes free, contributions towards ser..."


In [4]:
dfa = pd.json_normalize(df['data'][0])

In [5]:
dfa.head()

Unnamed: 0,id,email,first_name,last_name,avatar
0,1,george.bluth@reqres.in,George,Bluth,https://reqres.in/img/faces/1-image.jpg
1,2,janet.weaver@reqres.in,Janet,Weaver,https://reqres.in/img/faces/2-image.jpg
2,3,emma.wong@reqres.in,Emma,Wong,https://reqres.in/img/faces/3-image.jpg
3,4,eve.holt@reqres.in,Eve,Holt,https://reqres.in/img/faces/4-image.jpg
4,5,charles.morris@reqres.in,Charles,Morris,https://reqres.in/img/faces/5-image.jpg


In [6]:
createQuery = {'name' : 'John',
               'job' : 'Leader'}
response = requests.post(reqresURL, data = createQuery)

In [7]:
dfb = pd.json_normalize(response.json())
dfb.head()

Unnamed: 0,name,job,id,createdAt
0,John,Leader,583,2022-12-06T05:59:47.117Z


In [8]:
updateQuery = {'name' : 'John',
               'job' : 'Follower'}
response = requests.patch(reqresURL, updateQuery)

In [9]:
dfc = pd.json_normalize(response.json())
dfc.head()

Unnamed: 0,name,job,updatedAt
0,John,Follower,2022-12-06T05:59:48.751Z


# Getting data from Spotify


## Why Spotify?

Spotify's API is easy to use and has a ton of documentation on it

## Getting authentication token
1. Go to https://developer.spotify.com/dashboard/ and create a new app
2. Copy the Client ID and Secret
3. Create a file named .env
4. Paste the ID and Secret in the following format in the .env with qutation
~~~
CLIENT_ID = "ID"

CLIENT_SECRET = "Secret"
~~~

## Why a .env file?

Generally considered bad practice and insecure to have client secret hard coded in

In [10]:
from dotenv import load_dotenv
import os

In [11]:
load_dotenv()
CLIENT_ID = os.environ.get('CLIENT_ID')
CLIENT_SECRET = os.environ.get('CLIENT_SECRET')

After loading in the app credentials, create a header file which will be used to get the access token

In [12]:
spotifyAuth = 'https://accounts.spotify.com/api/token'
authQuery = {'grant_type' : 'client_credentials',
             'client_id' : CLIENT_ID,
             'client_secret' : CLIENT_SECRET}

authResponse = requests.post(spotifyAuth, authQuery)
             

Access token is inside of the json response

In [13]:
authResponseData = authResponse.json()
accessToken = authResponseData['access_token']

In [14]:
headers = { 'Authorization': 'Bearer {token}'.format(token=accessToken)}

In [15]:
spotifyAPI = 'https://api.spotify.com/v1/'

# Using Spotify to get top tracks from an artist

Artist ids are ripped from the Spotify artist page url

In [16]:
porterRobinsonID = '3dz0NnIZhtKKeXZxLOxCam'
kendrickLamarID = '2YZyLoL8N0Wb9xBt1NhZWg'
frankOceanID = '2h93pZq0e7k5yf4dywlkpM'

Query for top tracks is as follow

`artists/*ID*/top-tracks?market=*2 Letter Country Code*`

In [17]:
porterResponse = requests.get(spotifyAPI + 'artists/' + porterRobinsonID + '/top-tracks?market=US', headers=headers)
kendrickResponse = requests.get(spotifyAPI + 'artists/' + kendrickLamarID + '/top-tracks?market=US', headers=headers)
frankResponse = requests.get(spotifyAPI + 'artists/' + frankOceanID + '/top-tracks?market=US', headers=headers)

In [18]:
porterData = porterResponse.json()
kendrickData = kendrickResponse.json()
frankData = frankResponse.json()

In [19]:
def printTopTracks(tracks):
    df = pd.json_normalize(tracks)
    return df.loc[:,['name', 'album.name', 'album.release_date']].head()

In [20]:
print('Porter Robinson\'s top tracks')
printTopTracks(porterData['tracks'])

Porter Robinson's top tracks


Unnamed: 0,name,album.name,album.release_date
0,Everything Goes On,Everything Goes On,2022-07-14
1,Shelter,Shelter,2016-08-12
2,Goodbye To A World,Worlds,2014-08-12
3,Sad Machine,Worlds,2014-08-12
4,Look at the Sky,Nurture,2021-04-23


In [21]:
print('Kendrick Lamar\'s top tracks')
printTopTracks(kendrickData['tracks'])

Kendrick Lamar's top tracks


Unnamed: 0,name,album.name,album.release_date
0,All The Stars (with SZA),Black Panther The Album Music From And Inspire...,2018-02-09
1,HUMBLE.,DAMN.,2017-04-14
2,LOVE. FEAT. ZACARI.,DAMN.,2017-04-14
3,N95,Mr. Morale & The Big Steppers,2022-05-13
4,PRIDE.,DAMN.,2017-04-14


In [22]:
print('Frank Ocean\'s top tracks')
printTopTracks(frankData['tracks'])

Frank Ocean's top tracks


Unnamed: 0,name,album.name,album.release_date
0,Pink + White,Blonde,2016-08-20
1,Lost,channel ORANGE,2012-07-10
2,Novacane,Novacane,2011-01-01
3,Ivy,Blonde,2016-08-20
4,Nights,Blonde,2016-08-20


Query for information about the track

`audio-features/*ID*`

In [23]:
shelter = porterData['tracks'][0]['id']
humble = kendrickData['tracks'][1]['id']
nights = frankData['tracks'][4]['id']

In [24]:
shelterReq = requests.get(spotifyAPI + 'audio-features/' + shelter, headers=headers)
humbleReq = requests.get(spotifyAPI + 'audio-features/' + humble, headers=headers)
nightsReq = requests.get(spotifyAPI + 'audio-features/' + nights, headers=headers)

In [25]:
shelterData = pd.json_normalize(shelterReq.json())
humbleData = pd.json_normalize(humbleReq.json())
nightsData = pd.json_normalize(nightsReq.json())

In [26]:
shelterData.head()

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature
0,0.55,0.648,7,-6.542,1,0.0309,0.0191,0,0.0999,0.323,156.957,audio_features,3WBRfkOozHEsG0hbrBzwlm,spotify:track:3WBRfkOozHEsG0hbrBzwlm,https://api.spotify.com/v1/tracks/3WBRfkOozHEs...,https://api.spotify.com/v1/audio-analysis/3WBR...,202821,4


In [27]:
humbleData.head()

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature
0,0.908,0.621,1,-6.638,0,0.102,0.000282,5.4e-05,0.0958,0.421,150.011,audio_features,7KXjTSCq5nL1LoYtL7XAwS,spotify:track:7KXjTSCq5nL1LoYtL7XAwS,https://api.spotify.com/v1/tracks/7KXjTSCq5nL1...,https://api.spotify.com/v1/audio-analysis/7KXj...,177000,4


In [28]:
nightsData.head()

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature
0,0.457,0.551,5,-9.36,0,0.167,0.427,1e-06,0.113,0.428,89.87,audio_features,7eqoqGkKwgOaWNNHx90uEZ,spotify:track:7eqoqGkKwgOaWNNHx90uEZ,https://api.spotify.com/v1/tracks/7eqoqGkKwgOa...,https://api.spotify.com/v1/audio-analysis/7eqo...,307151,4


# Liveness?

Detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides strong likelihood that the track is live. - Spotify API Refrence
# How accurate is it?

To test this, I compared tracks from J. Cole, who released a studio and live version of their album 2014 Forest Hills Drive

In [29]:
taleOf2CitiezLive = requests.get(spotifyAPI + 'audio-features/5EPp5oygWvFf08qCFBAZcV', headers=headers)
gomdLive = requests.get(spotifyAPI + 'audio-features/1O02UzJKK7njDp3lQqLhUm', headers=headers)
noRoleModelzLive = requests.get(spotifyAPI + 'audio-features/0SFZAQnxM6KBBNi8Y7YflZ', headers=headers)

taleOf2Citiez = requests.get(spotifyAPI + 'audio-features/52A8OAP8lTQKZCj4Rce92B', headers=headers)
gomd = requests.get(spotifyAPI + 'audio-features/0Thqjtu54vKMP06pwZkAWp', headers=headers)
noRoleModelz = requests.get(spotifyAPI + 'audio-features/68Dni7IE4VyPkTOH9mRWHr', headers=headers)


In [30]:
taleLiveData = taleOf2CitiezLive.json()
taleData = taleOf2Citiez.json()
print('Tale of 2 Citiez Liveness')
print('Live:', taleLiveData['liveness'], 'Studio:', taleData['liveness'])

Tale of 2 Citiez Liveness
Live: 0.754 Studio: 0.294


In [31]:
gomdLiveData = gomdLive.json()
gomdData = gomd.json()
print('G.0.M.D Liveness')
print('Live:', gomdLiveData['liveness'], 'Studio:', gomdData['liveness'])

G.0.M.D Liveness
Live: 0.893 Studio: 0.331


In [32]:
noLiveData = noRoleModelzLive.json()
noData = noRoleModelz.json()
print('No Role Modelz Liveness')
print('Live:', noLiveData['liveness'], 'Studio:', noData['liveness'])

No Role Modelz Liveness
Live: 0.77 Studio: 0.0534
