# The Spotify API

In this notebook we will fetch and process music data from Spotify, using the [Spotify API](https://developer.spotify.com/documentation/web-api).

Specifically, we will look up and display the top tracks for a given artist.




## Setup



Sometimes when working with an API, we fetch data directly using a web request. In other cases, a Python package can make our lives easier, especially when complicated authorization steps are involved.
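To give a sense of what the manual route involves, here is a minimal sketch of the first step of Spotify's Client Credentials flow, using only the standard library. The helper name `build_token_request` and the placeholder credentials are illustrative assumptions, and the request is built but never sent:

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

# Spotify's token endpoint for the Client Credentials flow.
TOKEN_URL = "https://accounts.spotify.com/api/token"


def build_token_request(client_id: str, client_secret: str) -> Request:
    """Build (but do not send) a request for an access token.

    Hypothetical helper for illustration only.
    """
    # The credentials travel as an HTTP Basic auth header: base64("id:secret").
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urlencode({"grant_type": "client_credentials"}).encode()
    return Request(
        TOKEN_URL,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Placeholder credentials, not real ones.
request = build_token_request("my-client-id", "my-client-secret")
print(request.get_method(), request.full_url)
```

Sending this request, parsing the token out of the response, attaching it to every later call, and refreshing it when it expires are all additional steps that a client package can handle for us.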


For this demo, we will use the [`spotipy` package](https://spotipy.readthedocs.io/en/latest/), which provides a high-level interface to the Spotify API.




Installing the `spotipy` package into the notebook environment:

In [1]:
%%capture
!pip install spotipy

Ensuring the package has been installed:

In [2]:
!pip list | grep spotipy

spotipy                               2.25.1


To interface with the Spotify API, you need to create a [Spotify API Client application](https://developer.spotify.com/dashboard/applications/), which provides a pair of credentials (the "client identifier" and the "client secret").


Before proceeding, set these credentials as notebook secrets called `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET`, respectively.

Accessing the credentials from notebook secrets:

In [3]:
from google.colab import userdata

SPOTIPY_CLIENT_ID = userdata.get("SPOTIPY_CLIENT_ID")
SPOTIPY_CLIENT_SECRET = userdata.get("SPOTIPY_CLIENT_SECRET")
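The `google.colab` module is only importable inside Colab. If you run this notebook elsewhere, one option is to fall back to environment variables. The sketch below assumes the same secret names are exported in the environment; `get_secret` is a hypothetical helper, not part of `spotipy` or Colab:

```python
import os


def get_secret(name: str) -> str:
    """Read a secret from Colab if available, else from the environment.

    Hypothetical helper for illustration only.
    """
    try:
        # Only importable when running inside Google Colab.
        from google.colab import userdata
    except ImportError:
        userdata = None

    if userdata is not None:
        return userdata.get(name)

    # Outside Colab: assume the secret was exported as an environment variable.
    value = os.environ.get(name)
    if value is None:
        raise KeyError(f"Secret {name!r} not found in the environment")
    return value
```

With this in place, `get_secret("SPOTIPY_CLIENT_ID")` could replace the direct `userdata.get(...)` call. As far as I know, `spotipy` also looks for `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` in the environment when no credentials are passed explicitly.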

## Example: Artist Top Tracks

Entering a search term, which in this case is an artist we are interested in:

In [147]:
search_term = "Chainsmokers"
search_term

'Chainsmokers'

Initializing a new Spotify API client, using the provided credentials:

In [148]:

from spotipy import Spotify
from spotipy.oauth2 import SpotifyClientCredentials

creds = SpotifyClientCredentials(
    client_id=SPOTIPY_CLIENT_ID,
    client_secret=SPOTIPY_CLIENT_SECRET
)

client = Spotify(client_credentials_manager=creds)
print("CLIENT:", type(client))

CLIENT: <class 'spotipy.client.Spotify'>


Searching for tracks that match the artist name (the `search` method looks for tracks by default), requesting up to 50 results:

In [149]:
results = client.search(q=search_term, limit=50)
print(type(results))
print(results.keys())

<class 'dict'>
dict_keys(['tracks'])
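The single `tracks` key holds one page of results. If more results are needed than fit in one request, the `offset` parameter of `search` can be used to walk through further pages. The following is a minimal sketch under the assumption that each page reports a `next` link (empty on the last page); the helper name `search_tracks` and its parameters are hypothetical:

```python
def search_tracks(client, query: str, total: int = 100, page_size: int = 50) -> list:
    """Collect up to `total` track items by paging through search results.

    `client` is expected to behave like a spotipy.Spotify instance.
    Hypothetical helper for illustration only.
    """
    items = []
    offset = 0
    while len(items) < total:
        # Never ask for more than we still need.
        limit = min(page_size, total - len(items))
        page = client.search(q=query, type="track", limit=limit, offset=offset)["tracks"]
        items.extend(page["items"])
        offset += len(page["items"])
        # Stop on an empty page, or when the API reports no further page.
        if not page["items"] or page.get("next") is None:
            break
    return items
```

For example, `search_tracks(client, search_term, total=100)` would issue two requests of 50 items each, provided that many results exist.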


Investigating and processing the nested structure of the response data:

In [150]:
tracks = results["tracks"]["items"]
print(type(tracks))
len(tracks)

<class 'list'>


50

The result contains 50 tracks because we requested `limit=50` (the `spotipy` default is 10). Here is a simplified view of the first track:

In [152]:
track = tracks[0]

info = {
    "name": track["name"],
    "artist": track["artists"][0]["name"],
    #"artists": [artist["name"] for artist in track["artists"]],
    "duration_ms": track["duration_ms"],
    "explicit": track["explicit"],
    "markets": len(track["available_markets"]),
    "popularity": track["popularity"],
    "album_art_url": track["album"]["images"][0]["url"]
}
info

{'name': 'Closer',
 'artist': 'The Chainsmokers',
 'duration_ms': 244960,
 'explicit': False,
 'markets': 185,
 'popularity': 87,
 'album_art_url': 'https://i.scdn.co/image/ab67616d0000b273495ce6da9aeb159e94eaa453'}
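The lookup `track["album"]["images"][0]` assumes every album has at least one image. A slightly more defensive version of the same idea is sketched below. The function name `simplify_track` and the `duration_min` field are illustrative choices, and the sample dictionary is a hand-written stand-in containing only the fields the function reads:

```python
def simplify_track(track: dict) -> dict:
    """Flatten a track dictionary, tolerating albums without images.

    Hypothetical helper for illustration only.
    """
    images = track.get("album", {}).get("images", [])
    return {
        "name": track["name"],
        "artists": [artist["name"] for artist in track["artists"]],
        # Convert milliseconds to minutes for readability.
        "duration_min": round(track["duration_ms"] / 60_000, 2),
        "explicit": track["explicit"],
        # Fall back to None instead of raising IndexError.
        "album_art_url": images[0]["url"] if images else None,
    }


# Hand-written sample with an empty image list, to exercise the fallback.
sample_track = {
    "name": "Closer",
    "artists": [{"name": "The Chainsmokers"}, {"name": "Halsey"}],
    "duration_ms": 244960,
    "explicit": False,
    "album": {"images": []},
}
print(simplify_track(sample_track))
```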

Helper function that ranks the search results by popularity and displays the top tracks in a table format:

In [174]:
#| code-fold: "show"

from pandas import DataFrame
from IPython.display import HTML, display

def img_html(url, width=50):
    return f"<img src='{url}' width='{width}' >"

def display_top_tracks(artist_name: str, top_n: int = 10, client=client):
    results = client.search(q=artist_name, limit=50)

    tracks = results["tracks"]["items"]

    records = []
    for i, track in enumerate(tracks):
        artist_names = [artist["name"] for artist in track["artists"]]
        artist_names = ", ".join(artist_names)
        record = {
            #"index": i,
            #"artist": track["artists"][0]["name"],
            "track": track['name'],
            "artists": artist_names,
            "album": track["album"]["name"],
            "release_date": track["album"]["release_date"],
            #"duration_ms": track["duration_ms"],
            "explicit":  track['explicit'],
            "markets": len(track["available_markets"]),
            "popularity": track["popularity"],
            "album_art": img_html(track["album"]["images"][0]["url"])
        }
        records.append(record)

    tracks_df = DataFrame(records)
    # top n:
    tracks_df.sort_values(by="popularity", ascending=False, inplace=True)
    tracks_df = tracks_df.head(top_n)

    tracks_df.reset_index(inplace=True, drop=True)
    tracks_df["rank"] = tracks_df.index + 1
    # reorder cols:
    cols = ["rank"] + tracks_df.drop(columns=["rank"]).columns.tolist()
    tracks_df = tracks_df[cols]

    # displaying the dataframe as HTML
    # (escape=False keeps the <img> tags in the album_art column intact):
    tracks_table = HTML(tracks_df.to_html(escape=False, index=False))
    display(tracks_table)


Displaying top tracks for the given artist:

In [175]:
display_top_tracks(artist_name=search_term, top_n=10)

rank,track,artists,album,release_date,explicit,markets,popularity,album_art
1,Something Just Like This,"The Chainsmokers, Coldplay",Memories...Do Not Open,2017-04-07,False,185,88,
2,Closer,"The Chainsmokers, Halsey",Closer,2016-07-29,False,185,87,
3,Don't Let Me Down,"The Chainsmokers, Daya",Don't Let Me Down,2016-02-05,False,185,83,
4,Addicted,"Zerb, The Chainsmokers, Ink",Addicted,2024-03-29,True,185,80,
5,Roses,"The Chainsmokers, ROZES",Roses,2015-06-16,False,181,80,
6,Paris,The Chainsmokers,Memories...Do Not Open,2017-04-07,False,185,80,
7,Something Just Like This,"The Chainsmokers, Coldplay",Something Just Like This,2017-02-22,False,185,72,
8,White Wine & Adderall,"The Chainsmokers, Beau Nox",White Wine & Adderall,2025-07-11,False,185,72,
9,This Feeling,"The Chainsmokers, Kelsea Ballerini",Sick Boy,2018-12-14,False,185,71,
10,Jungle,"Alok, The Chainsmokers, Mae Stephens",Jungle,2023-09-22,False,185,71,
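Note that "Something Just Like This" appears twice in the table (ranks 1 and 7), once per release. If that is undesirable, one possible approach is to keep only the most popular record for each combination of track name and artists. The sketch below works on plain dictionaries shaped like the records built above; `dedupe_tracks` is a hypothetical helper and the sample rows are copied from the table:

```python
def dedupe_tracks(records: list[dict]) -> list[dict]:
    """Keep the most popular record for each (track, artists) pair.

    Hypothetical helper for illustration only.
    """
    best = {}
    for record in records:
        key = (record["track"], record["artists"])
        if key not in best or record["popularity"] > best[key]["popularity"]:
            best[key] = record
    # Return the survivors, most popular first.
    return sorted(best.values(), key=lambda r: r["popularity"], reverse=True)


records = [
    {"track": "Something Just Like This", "artists": "The Chainsmokers, Coldplay",
     "album": "Memories...Do Not Open", "popularity": 88},
    {"track": "Closer", "artists": "The Chainsmokers, Halsey",
     "album": "Closer", "popularity": 87},
    {"track": "Something Just Like This", "artists": "The Chainsmokers, Coldplay",
     "album": "Something Just Like This", "popularity": 72},
]

for record in dedupe_tracks(records):
    print(record["popularity"], record["track"], "/", record["album"])
```

Applying such a step to `records` before building the DataFrame would leave room for other tracks in the top ten.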


This is just one small example of the capabilities of the Spotify API. 
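As a further illustration, the ranking above is derived from a free-text track search, so it may include tracks that merely match the search term. The Spotify API also has a dedicated endpoint for an artist's top tracks, which `spotipy` exposes as `artist_top_tracks`. The sketch below assumes that endpoint is available to your application and that the first artist search hit is the intended artist; `fetch_artist_top_tracks` is a hypothetical helper:

```python
def fetch_artist_top_tracks(client, artist_name: str, country: str = "US") -> list:
    """Resolve an artist by name, then fetch that artist's top tracks.

    `client` is expected to behave like a spotipy.Spotify instance.
    Hypothetical helper for illustration only.
    """
    # Step 1: search for the artist (not tracks) to obtain an artist ID.
    matches = client.search(q=artist_name, type="artist", limit=1)["artists"]["items"]
    if not matches:
        return []
    # Assumption: the first hit is the artist we meant.
    artist_id = matches[0]["id"]

    # Step 2: ask the dedicated endpoint for the top tracks in one market.
    return client.artist_top_tracks(artist_id, country=country)["tracks"]
```

Calling `fetch_artist_top_tracks(client, search_term)` would return a list of track dictionaries with the same shape as the search results processed earlier.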