In [8]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

CC: Cameron Watts
[Medium](https://towardsdatascience.com/extracting-song-data-from-the-spotify-api-using-python-b1e79388d50)

In [9]:
!pip install spotipy
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials


* The Spotify API is powerful and gives us access to a lot of information about any song or artist on Spotify. This ranges from features describing the “feel” of the audio, such as `liveness`, `acousticness`, and `energy`, to features describing the popularity of the artist and song. For more advanced analysis, the API also provides data such as the predicted position of each beat in a song. It also exposes Spotify's recommendation engine.

# Authorized Requests

## Client Credentials Flow - Authentication without the user
* This method makes it possible to authenticate your requests to the Spotify Web API and to obtain a higher rate limit than you would get without authentication.
* It gives access to public data such as tracks, artists, and public playlists. User-specific data, such as who a user follows or what they have listened to, is not available with this flow.

## Authorization Code Flow - Authentication with the user
* This method is suitable for long-running applications which the user logs into once. It provides an access token that can be refreshed.

In [10]:

# Credentials are read from the environment rather than hardcoded in the notebook
client_credentials_manager = SpotifyClientCredentials(client_id=os.environ["SPOTIPY_CLIENT_ID"], client_secret=os.environ["SPOTIPY_CLIENT_SECRET"])
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)

# Using the Spotify Object
* Both types of authentication create the same Spotify object, just with different methods of creation. This means that the same class methods are usable for either method of authentication, with the exception of those relating to the current user. Now, using this object, we can interact with the Spotify API, to get the information that we want.
* We need a URI (Uniform Resource Identifier) to refer to any object in Spotify through the API.
* The identifier of any Spotify object is contained in its shareable link.
* For example, the link to the Global top songs playlist, when copied from the Spotify desktop application, is:
“https://open.spotify.com/playlist/37i9dQZEVXbNG2KDcFcKOF?si=77d8f5cd51cd478d”
The identifier contained in this link is “37i9dQZEVXbNG2KDcFcKOF”; if we use this with the API, we will be referencing the Global top songs playlist. Strictly speaking, this string is the Spotify ID, and the full URI has the format “spotify:object_type:id” (here, “spotify:playlist:37i9dQZEVXbNG2KDcFcKOF”), which also works and is the more explicit way of referring to the object. Using these identifiers, we will extract the tracks in a playlist, and in turn extract a series of features from these tracks, so that we can create a dataset to analyse.
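As a rough illustration of the link and URI formats described above, the sketch below pulls the identifier out of either form. The helper name `spotify_id` is hypothetical and not part of spotipy; it only assumes the two formats shown in the text.

```python
from urllib.parse import urlparse


def spotify_id(reference: str) -> str:
    """Return the bare Spotify ID from a share link, a spotify: URI, or a bare ID.

    Hypothetical helper for illustration; not part of spotipy.
    """
    if reference.startswith("spotify:"):
        # URI form: spotify:<object_type>:<id>
        return reference.split(":")[-1]
    if reference.startswith("http"):
        # Link form: the ID is the last path segment; the ?si=... query is ignored
        return urlparse(reference).path.rstrip("/").split("/")[-1]
    # Anything else is assumed to already be a bare ID
    return reference


link = "https://open.spotify.com/playlist/37i9dQZEVXbNG2KDcFcKOF?si=77d8f5cd51cd478d"
playlist_id = spotify_id(link)
playlist_uri = f"spotify:playlist:{playlist_id}"
print(playlist_id)
print(playlist_uri)
```

Using `urlparse` instead of plain string splitting keeps the query string out of the result even if the link has a trailing slash.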

# Extracting Tracks from a Playlist
* The first method that we will use to extract features from tracks in a playlist is the `playlist_tracks` method.
* This method takes the URI of a playlist and returns JSON data describing the tracks in that playlist.

In [11]:
playlist_link = "https://open.spotify.com/playlist/37i9dQZF1EQncLwOalG3K7?si=c952fbae9049421f"
playlist_URI = playlist_link.split("/")[-1].split("?")[0]
track_uris = [x["track"]["uri"] for x in sp.playlist_tracks(playlist_URI)["items"]]
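The call above returns a single page of results, and the Spotify Web API returns at most 100 playlist items per request. For longer playlists, one possible approach is to follow the paging links with spotipy's `next` method. The sketch below assumes that approach; `all_playlist_items` and `FakeClient` are hypothetical names, and `FakeClient` is an offline stand-in with invented data so the example runs without credentials.

```python
def all_playlist_items(client, playlist_id):
    """Collect every item of a playlist by following the paging links."""
    page = client.playlist_tracks(playlist_id)
    items = list(page["items"])
    # Each page carries a "next" link until the last page is reached
    while page.get("next"):
        page = client.next(page)
        items.extend(page["items"])
    return items


class FakeClient:
    """Hypothetical offline stand-in that mimics two pages of results."""

    def __init__(self):
        self._pages = {
            "page1": {"items": [{"track": {"uri": "spotify:track:a"}}], "next": "page2"},
            "page2": {"items": [{"track": {"uri": "spotify:track:b"}}], "next": None},
        }

    def playlist_tracks(self, playlist_id):
        return self._pages["page1"]

    def next(self, page):
        return self._pages[page["next"]]


items = all_playlist_items(FakeClient(), "any-playlist-id")
track_uris_all = [item["track"]["uri"] for item in items]
print(track_uris_all)
```

With a real client, the call would be `all_playlist_items(sp, playlist_URI)`.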

* We can also extract the name of each track, the name of the album that it belongs to, and the popularity of the track. From the artist, we can find a genre (though not airtight — artists can make songs in multiple genres), and an artist popularity score.

In [12]:
for track in sp.playlist_tracks(playlist_URI)["items"]:
    #URI
    track_uri = track["track"]["uri"]
    
    #Track name
    track_name = track["track"]["name"]
    
    #Main Artist
    artist_uri = track["track"]["artists"][0]["uri"]
    artist_info = sp.artist(artist_uri)
    
    #Name, popularity, genre
    artist_name = track["track"]["artists"][0]["name"]
    artist_pop = artist_info["popularity"]
    artist_genres = artist_info["genres"]
    
    #Album
    album = track["track"]["album"]["name"]
    
    #Popularity of the track
    track_pop = track["track"]["popularity"]
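The loop above overwrites its variables on every iteration, so only the values for the last track remain once it finishes. A minimal sketch of one way to keep the values for every track is to build one dictionary per track and collect them in a list. The function name `track_row` and the sample data are invented for illustration; the field names follow the loop above.

```python
def track_row(item, artist_info):
    """Flatten one playlist item and its main artist's info into a single record."""
    track = item["track"]
    main_artist = track["artists"][0]
    return {
        "track_uri": track["uri"],
        "track_name": track["name"],
        "artist_name": main_artist["name"],
        "artist_pop": artist_info["popularity"],
        "artist_genres": artist_info["genres"],
        "album": track["album"]["name"],
        "track_pop": track["popularity"],
    }


# Invented sample data shaped like the API responses used in the loop above
sample_item = {
    "track": {
        "uri": "spotify:track:example",
        "name": "Example Song",
        "artists": [{"uri": "spotify:artist:example", "name": "Example Artist"}],
        "album": {"name": "Example Album"},
        "popularity": 55,
    }
}
sample_artist_info = {"popularity": 70, "genres": ["pop"]}

rows = [track_row(sample_item, sample_artist_info)]
print(rows[0]["track_name"], rows[0]["artist_pop"])
```

With the real client, each row could be built inside the loop as `track_row(track, sp.artist(artist_uri))`, and the resulting list passed to `pd.DataFrame(rows)` to get a table.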

# Extracting Features from Songs

In [13]:
sp.audio_features(track_uri)[0]

{'danceability': 0.422,
 'energy': 0.712,
 'key': 11,
 'loudness': -5.907,
 'mode': 0,
 'speechiness': 0.1,
 'acousticness': 0.273,
 'instrumentalness': 0,
 'liveness': 0.051,
 'valence': 0.471,
 'tempo': 78.454,
 'type': 'audio_features',
 'id': '2CvOqDpQIMw69cCzWqr5yr',
 'uri': 'spotify:track:2CvOqDpQIMw69cCzWqr5yr',
 'track_href': 'https://api.spotify.com/v1/tracks/2CvOqDpQIMw69cCzWqr5yr',
 'analysis_url': 'https://api.spotify.com/v1/audio-analysis/2CvOqDpQIMw69cCzWqr5yr',
 'duration_ms': 261160,
 'time_signature': 4}
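The call above fetches features for a single track. `audio_features` also accepts a list of tracks, and the underlying endpoint accepts at most 100 IDs per request, so a longer list can be sent in batches. The sketch below assumes that batching approach; `chunked`, `fetch_audio_features`, and `RecordingClient` are hypothetical names, and `RecordingClient` is an offline stand-in with made-up values so the example runs without credentials.

```python
def chunked(seq, size):
    """Yield consecutive slices of seq with at most size elements each."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def fetch_audio_features(client, uris, batch_size=100):
    """Fetch audio features in batches, skipping tracks that have none."""
    features = []
    for batch in chunked(uris, batch_size):
        # The API returns None for a track it has no features for
        features.extend(f for f in client.audio_features(batch) if f is not None)
    return features


class RecordingClient:
    """Hypothetical offline stand-in that records the size of each request."""

    def __init__(self):
        self.batch_sizes = []

    def audio_features(self, tracks):
        self.batch_sizes.append(len(tracks))
        # Return None for one made-up URI to mimic a track without features
        return [
            None if uri.endswith(":missing") else {"uri": uri, "tempo": 120.0}
            for uri in tracks
        ]


fake = RecordingClient()
uris = [f"spotify:track:{n}" for n in range(249)] + ["spotify:track:missing"]
features = fetch_audio_features(fake, uris)
print(fake.batch_sizes, len(features))
```

With the real client, this would be called as `fetch_audio_features(sp, track_uris)`, and the list of dictionaries could then be loaded into a DataFrame.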