In [None]:
# https://spotipy.readthedocs.io/en/2.19.0/

# Install spotipy
# As Google Colab starts each session like a new computer, we must install this each time.
# If you are working on your local machine, you only need to install it once.
!pip install spotipy --upgrade
!pip install urllib3 --upgrade 

In [None]:
import pandas as pd

## Requests

The requests library is the de facto standard for making HTTP requests in Python. It abstracts the complexities of making requests behind a beautiful, simple API so that you can focus on interacting with services and consuming data in your application.

https://requests.readthedocs.io/en/latest/

When we make a request, we get a `requests.Response` object in return, which contains the server's response to the HTTP request. Part of this response is the status code, a number that tells us whether we received the information we wanted. If you get a code that you don't understand, these cats will help you: https://http.cat/

More often than not you'll receive:

200: Success!

401: Unauthorized: the request lacks valid authentication credentials

403: The server understood the request but refuses to authorize it
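
If you'd rather look up a status code in Python than on a website, the standard library's `http.HTTPStatus` knows the official phrase for each code. This is a minimal sketch; `describe_status` is a hypothetical helper name, not part of `requests`.

```python
from http import HTTPStatus


def describe_status(code):
    """Return a short label for an HTTP status code, e.g. the code plus its official phrase."""
    try:
        return f"{code} {HTTPStatus(code).phrase}"
    except ValueError:
        # HTTPStatus raises ValueError for codes it does not know.
        return f"{code} (unrecognised status code)"


# The three codes discussed above.
for code in (200, 401, 403):
    print(describe_status(code))
```

The same helper can be called with the `status_code` attribute of any response object.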

In [None]:
import requests

google = requests.get("https://developers.google.com")
print("Google:", google.status_code)

NBA = requests.get("https://api.sportsdata.io/api/nba/fantasy/json/CurrentSeason")
print("NBA:", NBA.status_code) 

wbscs = requests.get("https://www.wbscodingschool.com/")
print("WBS CS: ", wbscs.status_code)

## JSON

### Intro - making a request and viewing the JSON

https://docs.python.org/3/library/json.html

Since its inception, JSON has quickly become the de facto standard for information exchange. JSON supports primitive types, like strings and numbers, as well as nested lists and objects. It looks like nested Python dictionaries:

`{"firstname": "Harry",
"lastname": "Noah",
"city": "Berlin",
"dogs": [{"name": "rover", "breed": "labrador"}, {"name": "pip", "breed": "spaniel"}],
"cars": "none"}`

In [None]:
import json

In [None]:
# Make the request.
response = requests.get("https://jsonplaceholder.typicode.com/users")

In [None]:
# Check the HTTP code.
response

In [None]:
# Example of what a JSON looks like.

# View API response as a JSON.
response.json()
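
`response.json()` turns the JSON text in the response body into Python lists and dictionaries. To see that conversion without making a request, here is a small sketch that parses the example JSON from the introduction with `json.loads` and reads nested values from it. The variable names `raw`, `person` and `dog_breeds` are our own.

```python
import json

# The example JSON from the introduction, stored as plain text.
raw = """
{"firstname": "Harry",
 "lastname": "Noah",
 "city": "Berlin",
 "dogs": [{"name": "rover", "breed": "labrador"}, {"name": "pip", "breed": "spaniel"}],
 "cars": "none"}
"""

# json.loads converts JSON text into Python objects (here, a dictionary).
person = json.loads(raw)

print(type(person))
print(person["city"])

# "dogs" is a list of dictionaries, so we index the list first, then the key.
print(person["dogs"][1]["name"])

# Collect one value from every dictionary in the nested list.
dog_breeds = [dog["breed"] for dog in person["dogs"]]
print(dog_breeds)
```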

### GitHub API - Accessing the data in the JSON

Now that we know
- what an API is,
- how to request information from one (requests),
- how the information will be delivered to us (JSON):

Let's look at how we can use this information. We will first look at how we can access particular values within the JSON. Then we will look at a couple of methods to make a DataFrame from the JSON.

GitHub has many APIs. Here we'll look at two of them.

In [None]:
# Very basic API (a string returned).

# GitHub's Zen API produces a new inspirational phrase every 30 seconds.
# Run this cell again in 30 seconds to see a different output.

resp = requests.get("https://api.github.com/zen")
resp.text

In [None]:
# More complex API (a large json returned).

# GitHub's Events API shows the events that power the various activity streams on the site.
# In other words: what's happening on GitHub, and who's updating what?

response = requests.get('https://api.github.com/events')
github_response = response.json()
github_response

In [None]:
# How many events are we looking at?
len(github_response)

In [None]:
# What are the keys in the first event?
github_response[0].keys()

In [None]:
# We can see that 'repo' is another subdictionary.
# What are the keys in the 'repo' subdictionary?
github_response[0]['repo'].keys()

In [None]:
# What's the value for the key 'name' in 'repo'?
github_response[0]['repo']['name']

#### Transforming a JSON into a DataFrame

##### Option 1: pd.DataFrame()

In [None]:
# Turn it into a pandas DataFrame.
pd.DataFrame(github_response)

##### Option 2: pd.json_normalize()
You may notice above that many columns still contain dictionaries as values. We can correct this using json_normalize().

https://pandas.pydata.org/docs/reference/api/pandas.json_normalize.html

In [None]:
pd.json_normalize(github_response)
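
To make the difference between the two options easier to see, here is a small sketch on made-up data. `sample_events` is a hypothetical, trimmed-down stand-in for the GitHub response; the real response has many more keys.

```python
import pandas as pd

# Hypothetical, trimmed-down events; the real API returns many more keys.
sample_events = [
    {"id": "1", "type": "PushEvent", "actor": {"login": "alice"}, "repo": {"name": "alice/demo"}},
    {"id": "2", "type": "WatchEvent", "actor": {"login": "bob"}, "repo": {"name": "bob/tools"}},
]

# Option 1: nested dictionaries stay packed inside single cells.
nested = pd.DataFrame(sample_events)
print(list(nested.columns))

# Option 2: nested keys become their own columns, joined with a dot.
flat = pd.json_normalize(sample_events)
print(list(flat.columns))
```

With `json_normalize`, a nested value such as the actor's login gets its own column named `actor.login`, which we can then select like any other column.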

#### Selecting only certain values by iterating with a for loop

If we only want to select certain parts of the JSON:
- Option 1: make a DataFrame and drop the rest.
- Option 2: Use a for loop to extract only the required information.

https://www.w3schools.com/python/python_for_loops.asp

https://www.w3schools.com/python/ref_func_range.asp

In [None]:
# login - first value
github_response[0]['actor']['login']

In [None]:
# repo - first value
github_response[0]['repo']['name']

In [None]:
# event_type - first value
github_response[0]['type']

In [None]:
# Empty lists that the loop will fill with values.
login = []
repo = []
event_type = []

for i in range(len(github_response)):
    # add the ith login value to the login list
    login.append(github_response[i]['actor']['login'])
    # add the ith repo name to the repo list
    repo.append(github_response[i]['repo']['name'])
    # add the ith event type to the event_type list
    event_type.append(github_response[i]['type'])

In [None]:
# Let's have a look at the login list.
login

In [None]:
# Let's have a look at the repo list.
repo

In [None]:
# Let's have a look at the event_type list.
event_type
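
Once the loop has picked out the values we care about, they can go straight into a DataFrame. The sketch below builds one dictionary per event instead of three separate lists, which is one possible variation on the loop above. `sample_events` is hypothetical stand-in data in the same shape as the GitHub response.

```python
import pandas as pd

# Hypothetical stand-in for github_response, trimmed to the keys we use.
sample_events = [
    {"type": "PushEvent", "actor": {"login": "alice"}, "repo": {"name": "alice/demo"}},
    {"type": "WatchEvent", "actor": {"login": "bob"}, "repo": {"name": "bob/tools"}},
]

rows = []
for event in sample_events:
    # One dictionary per event keeps the three values of a row together.
    rows.append({
        "login": event["actor"]["login"],
        "repo": event["repo"]["name"],
        "event_type": event["type"],
    })

# A list of dictionaries becomes a DataFrame with one row per dictionary.
events_df = pd.DataFrame(rows)
print(events_df)
```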

### International Space Station API - just another cool API

Send a simple `get` request to find out where the ISS is right now.

Docs here: http://open-notify.org/Open-Notify-API/ISS-Location-Now/

In [None]:
url = "http://api.open-notify.org/iss-now.json"

In [None]:
response = requests.get(url)

In [None]:
response.json()
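
The values in this response need a little cleaning before we can use them. The sketch below works on an illustrative payload in the shape described in the Open Notify docs, assuming the coordinates arrive as strings and the timestamp as Unix seconds. The numbers in `sample` are made up and are not a real ISS position.

```python
from datetime import datetime, timezone

# Illustrative payload in the documented shape; the values are made up.
sample = {
    "message": "success",
    "timestamp": 1700000000,
    "iss_position": {"latitude": "12.3456", "longitude": "-65.4321"},
}

# Convert the coordinate strings into numbers.
latitude = float(sample["iss_position"]["latitude"])
longitude = float(sample["iss_position"]["longitude"])

# Convert the Unix timestamp (seconds) into a timezone-aware datetime in UTC.
observed_at = datetime.fromtimestamp(sample["timestamp"], tz=timezone.utc)

print(latitude, longitude, observed_at.isoformat())
```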

## Spotipy

Now that you know all about APIs, let's use that knowledge on something even more fun.

Spotify has an API that allows users to gather information about songs and even interact with other users and playlists. To make its use in Python easier, someone created `spotipy`, a library with some convenient functions to send requests and collect data.

Create / log into a Spotify account (https://developer.spotify.com/dashboard/login) and follow these steps (only the "Register your App" section): https://developer.spotify.com/documentation/general/guides/authorization/app-settings/

#### Authentication

With most APIs we need to authenticate ourselves. This is often done with a username and a password or, as with Spotify, with a client ID and a client secret. You will likely use different credentials for each API, so make sure you're using a password manager, or keep everything written down somewhere safe.

If ever you'd like more information about spotipy: [here are the docs](https://spotipy.readthedocs.io/en/2.19.0/).
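
Credentials are easy to leak when they are typed directly into a notebook. One common alternative is to keep them in environment variables and read them at runtime. The sketch below assumes the variable names `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET`, which are the names spotipy itself looks for; `load_spotify_credentials` is a hypothetical helper, and the values in `fake_env` are placeholders.

```python
import os


def load_spotify_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables."""
    client_id = env.get("SPOTIPY_CLIENT_ID")
    client_secret = env.get("SPOTIPY_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET before running this notebook."
        )
    return client_id, client_secret


# Demonstration with a stand-in dictionary instead of the real environment.
fake_env = {
    "SPOTIPY_CLIENT_ID": "your-client-id",
    "SPOTIPY_CLIENT_SECRET": "your-client-secret",
}
client_id, client_secret = load_spotify_credentials(fake_env)
print(client_id)
```

The two returned values can then be passed as `client_id` and `client_secret` when creating `SpotifyClientCredentials`.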

In [None]:
# import libraries
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

In [None]:
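# Initialize spotipy with your app's client credentials.
# Replace the placeholders with the Client ID and Client Secret from your Spotify dashboard,
# and avoid sharing notebooks that contain real secrets.
sp = spotipy.Spotify(client_credentials_manager=SpotifyClientCredentials(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET"))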

#### Searching songs with 'queries' with `sp.search`

This method allows you to find songs using Spotify's search engine. That's convenient when you don't have the exact "id" of a song.

In [None]:
# Search for 'Lady Gaga', restricted to the first 10 results.

results = sp.search(q="Lady Gaga", limit = 10)

Explore the object returned by the request. As it's a dictionary (with nested dictionaries inside), using `.keys()` is a great way to see what's in there:

In [None]:
results.keys()

In [None]:
# With only one top-level key, the data must be nested further down. Let's delve deeper.
results["tracks"].keys()

This is the URL of the API request that produced these results:

In [None]:
# This is a Web API endpoint, not a link that plays a track.

# We can explore further by adding keys one after the other.

results["tracks"]["href"]

This is the name of the first song returned by the API:

In [None]:
results["tracks"]["items"][0]["name"]

As one song can have many artists, the artists are returned as a list: note the square brackets.

In [None]:
results["tracks"]["items"][0]["artists"]
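
Because the artists come back as a list of dictionaries, we need one more step to get their names. A minimal sketch, using a made-up `sample_track` in the same shape as one item of the search results:

```python
# Made-up track in the same shape as one item of the search results.
sample_track = {
    "name": "Example Song",
    "artists": [{"name": "Artist One"}, {"name": "Artist Two"}],
}

# Pull the name out of every artist dictionary in the list.
artist_names = [artist["name"] for artist in sample_track["artists"]]

# Join the names into a single string, handy for a DataFrame column.
credit = ", ".join(artist_names)
print(credit)
```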

There are some other interesting features contained in the search results:

In [None]:
# https://developer.spotify.com/documentation/web-api/reference/#/operations/get-several-tracks
# The popularity of the track. The value will be between 0 and 100, with 100 being the most popular.

results["tracks"]["items"][0]["popularity"]

This is how Spotify identifies individual songs: with a Uniform Resource Identifier, or `uri`. (The `id` and the `url` are also ways to identify each song uniquely.)

In [None]:
results["tracks"]["items"][0]["uri"]

Here we look for 10 songs by the Red Hot Chili Peppers and store the `uri` of the songs and their names.

In [None]:
# Send request and store the response.
red_hot = sp.search(q="Red Hot Chili Peppers", limit=10)

# Initialize empty lists that we will fill with information from our loop.
list_of_uri = []
list_of_song_names = []

# Iterate through the "items" (the songs),
# and append the "uri" and the "name" to the lists we created.
for item in red_hot["tracks"]["items"]:
    list_of_uri.append(item["uri"])
    list_of_song_names.append(item["name"])

# Print results.
print(list_of_uri)
print("\n")
print(list_of_song_names)

#### Searching multiple artists

Here we first create a list of artists we want to gather songs from. Then we iterate through them and append the results to a big list called `results`.

In [None]:
artists = ["Red Hot Chili Peppers", "SCARR", "Whitney Houston"]

In [None]:
results = []

for artist in artists:
    results.append(sp.search(q=artist, limit=10)) 

In [None]:
# Let's look at the second element in the results list.
results[1]

We can iterate through the `results` list and get just the names of all the songs:

In [None]:
song_names = []

for result in results:
    for item in result["tracks"]["items"]:
        song_names.append(item["name"])

In [None]:
song_names
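
A flat list of song names loses track of which search each song came from. Here is a sketch of one way to keep that link, pairing each search term with its results using `zip`. The data in `sample_artists` and `sample_results` is made up, in the same shape as the responses from `sp.search`.

```python
import pandas as pd

# Made-up search terms and responses, shaped like the output of sp.search.
sample_artists = ["Artist A", "Artist B"]
sample_results = [
    {"tracks": {"items": [{"name": "Song 1", "uri": "spotify:track:aaa"},
                          {"name": "Song 2", "uri": "spotify:track:bbb"}]}},
    {"tracks": {"items": [{"name": "Song 3", "uri": "spotify:track:ccc"}]}},
]

rows = []
# zip pairs the first artist with the first result, the second with the second, and so on.
for artist, result in zip(sample_artists, sample_results):
    for item in result["tracks"]["items"]:
        rows.append({"query": artist, "song": item["name"], "uri": item["uri"]})

songs_df = pd.DataFrame(rows)
print(songs_df)
```

Note that the `query` column holds the search term, not necessarily the performing artist, because a search can return songs by other artists too.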

### Playlists

Using spotipy, we can both build and read Spotify playlists. Today, we will only show you how to read information from a playlist. However, if you wish to build one, we strongly encourage you to read the [documentation](https://spotipy.readthedocs.io/en/2.19.0/) and explore further.

In [None]:
# Read the tracks of a playlist, identified by its URI.
my_playlist = sp.playlist_items("spotify:playlist:0ce6Rmxf7QXroqa1wzjWY8")

Extract song IDs from a playlist

In [None]:
my_playlist

In [None]:
my_playlist["items"][0]["track"]["uri"]
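
Spotify returns long playlists one page at a time: each page has an `items` list and a `next` field that is empty on the last page. The sketch below shows the general pattern for walking through all pages. `collect_all_items`, `fetch_next` and the fake pages are hypothetical; with spotipy, we would expect `sp.next(page)` to play the role of `fetch_next`, but check the docs for your version.

```python
def collect_all_items(first_page, fetch_next):
    """Gather the items from every page, following 'next' until it is empty."""
    items = []
    page = first_page
    while page is not None:
        items.extend(page["items"])
        # Only ask for another page when the current one points to it.
        page = fetch_next(page) if page.get("next") else None
    return items


# Stand-in pages so the sketch runs without calling the API.
fake_pages = {"page-2": {"items": ["track 3"], "next": None}}
first_page = {"items": ["track 1", "track 2"], "next": "page-2"}


def fake_fetch_next(page):
    # Look up the page that 'next' points to in our fake store.
    return fake_pages[page["next"]]


all_items = collect_all_items(first_page, fake_fetch_next)
print(all_items)
```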

### Audio features

You can check an explanation of the audio features [here](https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/).

In [None]:
sp.audio_features("spotify:track:6Sy9BUbgFse0n0LPA5lwy5")
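
The result is a list with one dictionary of features per track. The sketch below picks a few features out of such a list, using made-up values in `sample_features`. We assume here that an entry can be `None` when Spotify has no features for a track, so the sketch skips those entries.

```python
# Made-up values in the shape of an audio features result (a list of dictionaries).
sample_features = [
    {"id": "track-1", "danceability": 0.55, "energy": 0.63, "tempo": 84.6},
    None,  # assumed placeholder for a track without features
]

wanted = ["danceability", "energy", "tempo"]

# Keep only the wanted keys, skipping tracks that have no features.
selected = [
    {key: features[key] for key in wanted}
    for features in sample_features
    if features is not None
]
print(selected)
```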

### Creating a function that takes a song name and returns its audio features 

In [None]:
def song_features(human_song_title):
    # Search for the song title you enter into the function, limited to the first 10 results.
    results = sp.search(q=human_song_title, limit=10)['tracks']['items']

    # Start from an empty list on every call, so repeated calls don't accumulate duplicates.
    songs = []

    # Loop over the tracks, so we only select the parts of the JSON we need.
    for track in results:
        # Dictionary holding the information we want for this track.
        track_dict = {
            'Artist': track['artists'][0]['name'],
            'Title': track['name'],
            'Album': track['album']['name'],
            # sp.audio_features returns a list, even for a single track id.
            'Audio Description': sp.audio_features(track['id']),
        }

        # Add the dictionary to the list of songs.
        songs.append(track_dict)

    # Output the list of songs.
    return songs

# Call the function with a song to test, and keep the result for the next cell.
list_of_songs = song_features("Under the Bridge")
list_of_songs

In [None]:
# Make a dataframe from the list of songs created in the function above.
df = pd.DataFrame(list_of_songs)

df

As you can see, this DataFrame looks a bit off as the audio descriptions aren't expanded - all of the data is clumped together in one cell. Let's correct this, so we can see each audio feature as an individual column.

In [None]:
# Quick function we can use to select only the first item.
# This can also be done simply with [0], but we wanted to show you how you can incorporate a custom function into your work.

def first_value(x):
    return x[0]

# Making a DataFrame from the audio features of the songs in list_of_songs.
df_audio_features = pd.json_normalize(df['Audio Description'].apply(first_value))

df_audio_features

In [None]:
# Merge the expanded audio features with the original DataFrame.
new_df = pd.merge(df, df_audio_features, left_index=True, right_index=True)

# Drop the old ugly column where all the audio features are clumped together.
new_df.drop('Audio Description', axis=1, inplace=True)

new_df
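
The same expansion can also be written with `pd.concat` in place of `merge` followed by `drop`. This is a sketch of that alternative on made-up data. `sample_df` is hypothetical and mimics the shape of the DataFrame above, where each audio description is a one-element list.

```python
import pandas as pd

# Made-up data shaped like the DataFrame above.
sample_df = pd.DataFrame([
    {"Artist": "Artist A", "Title": "Song 1",
     "Audio Description": [{"danceability": 0.5, "energy": 0.7}]},
    {"Artist": "Artist B", "Title": "Song 2",
     "Audio Description": [{"danceability": 0.8, "energy": 0.4}]},
])

# Take the first (and only) dictionary out of each one-element list.
feature_dicts = [cell[0] for cell in sample_df["Audio Description"]]

# One column per audio feature.
features = pd.json_normalize(feature_dicts)

# Place the feature columns next to the original columns, minus the clumped one.
expanded = pd.concat([sample_df.drop(columns="Audio Description"), features], axis=1)
print(expanded)
```

Both DataFrames share the same default index here, which is why placing them side by side lines the rows up correctly.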

If you like a challenge and want to test what you've just learned, follow the steps below and see if you can make a DataFrame similar to the one above with expanded audio features. This time though, do it for a playlist of your choice on Spotify. Then see if you can expand it to include the songs from multiple playlists.

Try not to look at the solution, but we've included it below in case you get stuck.

### Collect a big dataframe of songs with their audio features

- Start by looking for a playlist on Spotify (it does not have to be your playlist), and copy its URL.

- Extract the audio features for each song on your playlist.

- Collect the links of several playlists and do the same for all of them.

- Structure the information as a dataframe where each row is a song and the columns are audio features.
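
The playlist examples in this notebook use URIs of the form `spotify:playlist:<id>`, while the link you copy from Spotify is a web URL. Here is a small sketch for converting one into the other, assuming the URL path contains `playlist/<id>`. `playlist_url_to_uri` is a hypothetical helper, and the `si=abc` part of the example link is made up.

```python
from urllib.parse import urlparse


def playlist_url_to_uri(url):
    """Turn a Spotify playlist web link into a 'spotify:playlist:<id>' URI."""
    # urlparse separates the path from the query string (the part after '?').
    path_parts = urlparse(url).path.strip("/").split("/")
    if "playlist" not in path_parts or path_parts[-1] == "playlist":
        raise ValueError(f"Not a playlist link: {url}")
    # The id is the path segment right after 'playlist'.
    playlist_id = path_parts[path_parts.index("playlist") + 1]
    return f"spotify:playlist:{playlist_id}"


example_url = "https://open.spotify.com/playlist/0ce6Rmxf7QXroqa1wzjWY8?si=abc"
print(playlist_url_to_uri(example_url))
```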

In [None]:
# your code here

#### Solution

In [None]:
list_of_playlists = ["spotify:playlist:2zjepkjZxLpeIBlvPCWIHl",
                     "spotify:playlist:0ce6Rmxf7QXroqa1wzjWY8"]

track_list = []
for playlist_uri in list_of_playlists:
    # Fetch the playlist's items (only the first page of results).
    playlist_items = sp.playlist_items(playlist_uri)['items']
    for item in playlist_items:
        # Each item wraps the actual track under the key 'track'.
        track = item['track']
        track_dict = {}
        track_dict['Artist'] = track['artists'][0]['name']
        track_dict['Title'] = track['name']
        track_dict['Album'] = track['album']['name']
        track_dict['Audio Description'] = sp.audio_features(track['id'])
        track_list.append(track_dict)

print(track_list)

In [None]:
playlist_df = pd.DataFrame(track_list)

In [None]:
df_a_f = pd.json_normalize(playlist_df['Audio Description'].apply(first_value))

In [None]:
new_playlist_df = pd.merge(playlist_df, df_a_f, left_index=True, right_index=True)

In [None]:
new_playlist_df.drop('Audio Description', axis=1, inplace=True)

In [None]:
new_playlist_df