<a id='top'></a>

# Spotify Audio Data Insights

This notebook reviews the audio attributes available from Spotify and how to retrieve them. I also explain why I chose the audio attribute **timbre** for this project.

----

## Prerequisites

[1) Setting Up the API Connection](https://nbviewer.org/github/JonYarber/music_modeling/blob/main/python/01SettingUptheAPIConnection.ipynb)<br>
[2) Using the Spotify API](https://nbviewer.org/github/JonYarber/music_modeling/blob/main/python/02UsingtheSpotifyAPI.ipynb)

---

## Table of Contents


1. [Audio Features](#AudioFeatures)
1. [Audio Analysis](#AudioAnalysis)
   1. [Pitch](#Pitch)
   1. [Timbre](#Timbre)

---

<a id='AudioFeatures'></a>

## Audio Features
Audio features are attributes of a song such as tempo, loudness, and danceability. Each of the 13 audio features is defined on [Spotify Developer's Site](https://developer.spotify.com/documentation/web-api/reference/get-audio-features).<br>
This project does not use any of the audio features, but I have used them on another project and found them a fun, easy way to get started with analyzing musical attributes.<br>
Even though we are not using them here, we will still see how to retrieve them with the Spotipy library.

### Connect to Spotify
Let's create a function that connects us to the Spotify API using the method outlined in [Setting Up the API Connection](https://nbviewer.org/github/JonYarber/music_modeling/blob/main/01SettingUptheAPIConnection.ipynb).

In [4]:
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

def connect_to_spotify():
    sp_client_id = input("Enter your Spotify Client ID: ")
    sp_client_secret = input("Enter your Spotify Secret Token: ")
    
    credentials = SpotifyClientCredentials(client_id = sp_client_id,
                                           client_secret = sp_client_secret)
    
    sp = spotipy.Spotify(client_credentials_manager = credentials)
    
    print("Connected to Spotify!")

    return sp

In [5]:
sp = connect_to_spotify()

Enter your Spotify Client ID:  [redacted]
Enter your Spotify Secret Token:  [redacted]


Connected to Spotify!


<br>
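Typing credentials into `input` echoes them into the notebook output. One alternative, sketched below, is to read them from environment variables. The variable names `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` are the ones Spotipy documents for its credential managers; the helper `load_spotify_credentials` is a hypothetical name and not part of Spotipy.

```python
import os

def load_spotify_credentials(env=None):
    """Return (client_id, client_secret) read from environment variables.

    Hypothetical helper for illustration; raises if either value is missing.
    """
    env = os.environ if env is None else env
    client_id = env.get("SPOTIPY_CLIENT_ID")
    client_secret = env.get("SPOTIPY_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Set SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET first.")
    return client_id, client_secret

# Example with a stand-in mapping instead of the real environment
demo_env = {"SPOTIPY_CLIENT_ID": "my-id", "SPOTIPY_CLIENT_SECRET": "my-secret"}
demo_id, demo_secret = load_spotify_credentials(demo_env)
```

The returned pair can be passed to `SpotifyClientCredentials(client_id=..., client_secret=...)` in the same way as the typed values in `connect_to_spotify`.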

### Find Track URI
Using the method outlined in [Using the Spotify API](https://nbviewer.org/github/JonYarber/music_modeling/blob/main/02UsingtheSpotifyAPI.ipynb), let's create a function that will take an artist and song name and return the track URI.

In [7]:
def get_uri():

    artist = input("Input artist name: ")
    track = input("Input song name: ")

    search_term = f'artist:{artist} track:{track}'

    result = sp.search(search_term, type = 'track', limit = 1)['tracks']['items'][0]

    track_uri = result['uri']
    artist_name = result['artists'][0]['name']
    track_name = result['name']

    print(f'\nTrack name: {track_name}')
    print(f'Artist name: {artist_name}')
    print(f'Track URI: {track_uri}')

    return track_uri

<br>

For this example, we will use the song *Karma Police* by Radiohead.

In [8]:
track_uri = get_uri()

Input artist name:  Radiohead
Input song name:  Karma Police



Track name: Karma Police
Artist name: Radiohead
Track URI: spotify:track:63OQupATfueTdZMWTxW03A


<br>
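`get_uri` indexes the first search hit directly, so a search with no matches raises an `IndexError`. A minimal sketch of a more forgiving lookup follows. It assumes the response has the same `['tracks']['items']` shape used above, and the name `first_track_uri` is hypothetical.

```python
def first_track_uri(search_response):
    """Return the URI of the first track in a search response, or None."""
    items = search_response.get("tracks", {}).get("items", [])
    if not items:
        return None
    return items[0]["uri"]

# Stand-in responses shaped like the result of sp.search(..., type='track')
found = {"tracks": {"items": [{"uri": "spotify:track:63OQupATfueTdZMWTxW03A"}]}}
empty = {"tracks": {"items": []}}

print(first_track_uri(found))
print(first_track_uri(empty))
```

Returning `None` lets the caller decide whether to prompt again or skip the track.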

### Retrieve Audio Features
Next, we use the track URI to retrieve the audio features. The spotipy object, <code>sp</code>, has a built-in method for this, <code>audio_features</code>.<br>
The only required parameter for <code>audio_features</code> is the track URI.

In [9]:
# Use the track URI to retrieve audio features
audio_features = sp.audio_features(track_uri)

audio_features

[{'danceability': 0.36,
  'energy': 0.501,
  'key': 7,
  'loudness': -9.129,
  'mode': 1,
  'speechiness': 0.0258,
  'acousticness': 0.0638,
  'instrumentalness': 9.32e-05,
  'liveness': 0.172,
  'valence': 0.324,
  'tempo': 74.807,
  'type': 'audio_features',
  'id': '63OQupATfueTdZMWTxW03A',
  'uri': 'spotify:track:63OQupATfueTdZMWTxW03A',
  'track_href': 'https://api.spotify.com/v1/tracks/63OQupATfueTdZMWTxW03A',
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/63OQupATfueTdZMWTxW03A',
  'duration_ms': 264067,
  'time_signature': 4}]

<br>
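Some of these fields are encoded as integers. In Spotify's documentation, `key` uses pitch class notation (0 = C, 1 = C#, and so on, with -1 meaning no key was detected), `mode` is 1 for major and 0 for minor, and `duration_ms` is in milliseconds. The sketch below turns them into readable labels; the helper names are hypothetical.

```python
PITCH_CLASSES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

def describe_key(key, mode):
    """Translate Spotify's integer key and mode into a label such as 'G major'."""
    if key < 0 or key > 11:
        return "unknown"
    quality = "major" if mode == 1 else "minor"
    return f"{PITCH_CLASSES[key]} {quality}"

def format_duration(duration_ms):
    """Convert a duration in milliseconds to a minutes:seconds string."""
    minutes, seconds = divmod(duration_ms // 1000, 60)
    return f"{minutes}:{seconds:02d}"

# Values taken from the Karma Police output above
print(describe_key(7, 1))        # G major
print(format_duration(264067))   # 4:24
```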

### Building an Audio Feature Data Frame 
Finally, if desired, we can turn the result above into a data frame for analysis and visuals.

In [4]:
import pandas as pd

# Dropping a few unnecessary columns for readability
audio_features_df = pd.DataFrame(audio_features).drop(['id', 'uri', 'type', 'track_href', 'analysis_url'], axis = 1)

audio_features_df

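| | danceability | energy | key | loudness | mode | speechiness | acousticness | instrumentalness | liveness | valence | tempo | duration_ms | time_signature |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.36 | 0.501 | 7 | -9.129 | 1 | 0.0258 | 0.0638 | 9.3e-05 | 0.172 | 0.324 | 74.807 | 264067 | 4 |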


<br>

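From here we can add the artist and song name to the data frame. Because <code>get_uri</code> only returns the URI, we first look the names up again from the track itself with <code>sp.track</code>.

In [5]:
# Look up the artist and track name for the stored URI
track_info = sp.track(track_uri)
track_artist = track_info['artists'][0]['name']
track_name = track_info['name']

# Add the artist name as the first column
audio_features_df.insert(0, 'artist', track_artist)

# Add the track name as the second column
audio_features_df.insert(1, 'song_title', track_name)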

# Final result
audio_features_df

| | artist | song_title | danceability | energy | key | loudness | mode | speechiness | acousticness | instrumentalness | liveness | valence | tempo | duration_ms | time_signature |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Radiohead | Karma Police | 0.36 | 0.501 | 7 | -9.129 | 1 | 0.0258 | 0.0638 | 9.3e-05 | 0.172 | 0.324 | 74.807 | 264067 | 4 |


<br>

With that, we have a template for building out one or more data frames. Again, we are not using audio features in this project, but they are fun and easy to share.
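To extend the template to several tracks, one option is a small helper that takes the list returned by `audio_features` and builds a single data frame. This is a sketch under assumptions: `features_to_frame` is a hypothetical name, and the `None` handling assumes the API can return an empty entry for a track it has no features for.

```python
import pandas as pd

# Columns dropped for readability, as in the single-track example
METADATA_COLUMNS = ['id', 'uri', 'type', 'track_href', 'analysis_url']

def features_to_frame(feature_dicts):
    """Build one data frame from a list of audio feature dictionaries."""
    # Skip entries that came back empty (assumed to be None)
    rows = [row for row in feature_dicts if row is not None]
    frame = pd.DataFrame(rows)
    # errors='ignore' keeps this working if a metadata column is absent
    return frame.drop(columns=METADATA_COLUMNS, errors='ignore').reset_index(drop=True)

# Abbreviated stand-in for the list returned by sp.audio_features([...])
sample = [
    {'danceability': 0.36, 'energy': 0.501, 'key': 7, 'tempo': 74.807,
     'type': 'audio_features', 'id': '63OQupATfueTdZMWTxW03A',
     'uri': 'spotify:track:63OQupATfueTdZMWTxW03A'},
    None,
]
features_df = features_to_frame(sample)
print(features_df)
```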

---

<a id='AudioAnalysis'></a>

## Audio Analysis
By comparison, the audio analysis describes a track's musical structure in much finer detail. According to [Spotify Developer's Site](https://developer.spotify.com/documentation/web-api/reference/get-audio-analysis), it covers rhythm, pitch, and ***timbre***, broken down into short segments that each begin at a given time in seconds (**'start'**). This is the data we will use for this project. Let's look closer at pitch and timbre:<br>

In [39]:
track_analysis_df = pd.DataFrame(sp.audio_analysis('spotify:track:63OQupATfueTdZMWTxW03A')['segments'])

track_analysis_df.head()

| | start | duration | confidence | loudness_start | loudness_max_time | loudness_max | loudness_end | pitches | timbre |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 0.11134 | 0.0 | -60.0 | 0.0 | -60.0 | 0.0 | [1.0, 0.725, 0.337, 0.371, 0.431, 0.408, 0.569... | [0.05, 169.954, 8.395, -29.988, 56.787, -50.22... |
| 1 | 0.11134 | 0.84522 | 1.0 | -60.0 | 0.05445 | -19.323 | 0.0 | [0.038, 0.03, 0.006, 0.01, 0.101, 0.013, 0.034... | [33.644, -23.384, -74.375, 34.627, 18.404, 63.... |
| 2 | 0.95655 | 0.57519 | 1.0 | -34.527 | 0.04297 | -15.601 | 0.0 | [0.099, 0.042, 0.018, 0.017, 0.124, 0.028, 0.1... | [40.11, 5.728, -22.954, 30.254, 10.283, -4.349... |
| 3 | 1.53175 | 0.23787 | 0.696 | -29.43 | 0.10532 | -22.314 | 0.0 | [0.135, 0.064, 0.053, 0.038, 0.17, 0.044, 0.11... | [34.347, -6.78, 5.479, -62.715, 3.666, -8.412,... |
| 4 | 1.76961 | 0.44463 | 0.447 | -26.881 | 0.03024 | -20.798 | 0.0 | [0.032, 0.084, 0.02, 0.024, 0.056, 0.214, 1.0,... | [33.805, -53.703, -75.956, 43.627, -25.619, -4... |
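
Each row is one segment, and a segment's `start` plus its `duration` gives the point where the next one begins. The sketch below checks this on the first five rows shown above, using plain dictionaries in place of an API response. The helper name is hypothetical, and the tolerance allows for rounding in the displayed values.

```python
# start and duration (in seconds) of the first five segments shown above
segments = [
    {'start': 0.0, 'duration': 0.11134},
    {'start': 0.11134, 'duration': 0.84522},
    {'start': 0.95655, 'duration': 0.57519},
    {'start': 1.53175, 'duration': 0.23787},
    {'start': 1.76961, 'duration': 0.44463},
]

def segments_are_contiguous(segments, tolerance=1e-4):
    """Return True if every segment ends where the next one starts."""
    for current, following in zip(segments, segments[1:]):
        end = current['start'] + current['duration']
        if abs(end - following['start']) > tolerance:
            return False
    return True

print(segments_are_contiguous(segments))
```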


<a id='Pitch'></a>

### Pitch
Imagine hitting a key on a piano. When you strike the key, the piano produces a sound wave that you perceive as the note played, such as C. If you play a different C key on the piano, it will sound the same note but at a different pitch, depending on whether it's a higher or lower octave. Pitch is how high or low a note sounds, with higher pitches corresponding to sound waves with higher frequencies and shorter wavelengths, while lower pitches are associated with lower frequencies and longer wavelengths.<br>
In Western music, there are 12 pitch classes, each a semitone apart: the natural notes C, D, E, F, G, A, and B, plus the five sharps or flats between them (C#, D#, F#, G#, A#). For each audio segment, the analysis rates the prominence of every pitch class on a scale of 0 to 1. For example, a value of 0.8 for G in a particular segment indicates that G is a prominent part of that section of the music.
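To make the octave idea concrete, the sketch below converts a frequency to its pitch class using the standard equal-temperament relation with A4 tuned to 440 Hz. The tuning is an assumption made for illustration, and the function name is hypothetical; Spotify's analysis reports pitch class strengths, not frequencies. Two C notes an octave apart have different frequencies but land on the same pitch class.

```python
import math

SEMITONES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

def pitch_class(frequency_hz, a4_hz=440.0):
    """Map a frequency to the nearest of the 12 pitch classes."""
    # MIDI note numbers place A4 at 69, with 12 semitones per octave
    midi_note = round(69 + 12 * math.log2(frequency_hz / a4_hz))
    return SEMITONES[midi_note % 12]

print(pitch_class(261.63))  # input: middle C
print(pitch_class(523.25))  # input: the C one octave higher
print(pitch_class(440.0))   # input: A4
```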


#### Examine the pitches of the first 5 segments
We will need to use this again later on, so we will create a function for this method.

In [63]:
def create_pitch_df(uri):

    # Create the audio analysis dataframe
    audio_analysis_df = pd.DataFrame(sp.audio_analysis(uri)['segments'])
    
    # Create a list of the 12 pitch classes
    semitones = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

    # Create a pitch dataframe 
    pitch_df = audio_analysis_df[['start', 'pitches']].copy()

    # Expand the pitches and label by class
    for i in range(12):
        pitch_df[semitones[i]] = pitch_df['pitches'].apply(lambda x: x[i])

    # Drop the pitches column
    pitch_df.drop(['pitches'], axis = 1, inplace = True)

    return pitch_df


# Radiohead Karma Police Pitch DF
rh_kp_pitch_df = create_pitch_df(track_uri)

rh_kp_pitch_df.head(5)

| | start | C | C# | D | D# | E | F | F# | G | G# | A | A# | B |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 1.0 | 0.725 | 0.337 | 0.371 | 0.431 | 0.408 | 0.569 | 0.603 | 0.69 | 0.674 | 0.686 | 0.12 |
| 1 | 0.11134 | 0.038 | 0.03 | 0.006 | 0.01 | 0.101 | 0.013 | 0.034 | 0.048 | 0.031 | 1.0 | 0.042 | 0.016 |
| 2 | 0.95655 | 0.099 | 0.042 | 0.018 | 0.017 | 0.124 | 0.028 | 0.121 | 0.221 | 0.07 | 1.0 | 0.063 | 0.053 |
| 3 | 1.53175 | 0.135 | 0.064 | 0.053 | 0.038 | 0.17 | 0.044 | 0.116 | 0.127 | 0.149 | 1.0 | 0.248 | 0.035 |
| 4 | 1.76961 | 0.032 | 0.084 | 0.02 | 0.024 | 0.056 | 0.214 | 1.0 | 0.139 | 0.029 | 0.032 | 0.022 | 0.013 |


<br>

According to the documentation, each segment's pitch vector is normalized so that its strongest pitch class has a value of 1. In the data frame above, for example, the dominant notes of the first five segments are C, A, A, A, and F#.<br>
Considerable work has been done with these pitch values, but they still may not meet our specific needs.<br>
***Why is that?***<br>
Different instruments can produce the same pitch, such as a C played on a piano versus a guitar, yet the sound we hear is distinct because of a characteristic known as ***timbre***.
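The dominant note of each segment mentioned above can also be read off programmatically. This sketch rebuilds the five pitch rows from the output in this notebook and uses pandas' `idxmax` to pick the strongest pitch class per row; the variable names are illustrative.

```python
import pandas as pd

semitones = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

# Pitch vectors of the first five Karma Police segments
pitch_rows = [
    [1.0, 0.725, 0.337, 0.371, 0.431, 0.408, 0.569, 0.603, 0.69, 0.674, 0.686, 0.12],
    [0.038, 0.03, 0.006, 0.01, 0.101, 0.013, 0.034, 0.048, 0.031, 1.0, 0.042, 0.016],
    [0.099, 0.042, 0.018, 0.017, 0.124, 0.028, 0.121, 0.221, 0.07, 1.0, 0.063, 0.053],
    [0.135, 0.064, 0.053, 0.038, 0.17, 0.044, 0.116, 0.127, 0.149, 1.0, 0.248, 0.035],
    [0.032, 0.084, 0.02, 0.024, 0.056, 0.214, 1.0, 0.139, 0.029, 0.032, 0.022, 0.013],
]
sample_pitch_df = pd.DataFrame(pitch_rows, columns=semitones)

# Column label of the largest value in each row
dominant = sample_pitch_df.idxmax(axis=1)
print(dominant.tolist())  # ['C', 'A', 'A', 'A', 'F#']
```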

<a id='Timbre'></a>

### Timbre 
Timbre (pronounced tam-ber) is a fundamental aspect of music that describes the unique tone color and quality of a sound. It's often described as the "color" of music, but it can also be thought of as the way a sound feels. Imagine two people singing the same notes at the same pitch: the notes match, but the two voices will have distinct timbres due to differences in vocal qualities such as breathiness, vibrato, or resonance.<br>
I chose to focus on timbre for this project because it's a crucial factor in why people prefer certain songs over others. While rhythm may play a role in this preference, there are still distinctions between songs in the same key that can be attributed to timbre. For instance, a song with a bright, piercing timbre might evoke a different emotional response than one with a warm, mellow timbre. Looking at a 3D model of timbre, then, means we are looking at how a song <i>feels</i>. That's exactly what we are after.<br>
Furthermore, timbre is what will help us find songs that sound alike when we build our deep learning model. The [audio analysis documentation on Spotify's Developer Site](https://developer.spotify.com/documentation/web-api/reference/get-audio-analysis) supports this in its description of timbre: *"Timbre vectors are best used in comparison with each other."*
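One simple way to compare timbre vectors is the Euclidean distance between them: the smaller the distance, the more alike two segments are by this measure. This is only an illustrative choice of metric, not necessarily the comparison used later in this project, and `timbre_distance` is a hypothetical name. The vectors are three of the *Karma Police* segments from this notebook's output.

```python
import math

# Timbre vectors for segments 1, 2, and 3 of Karma Police
segment_1 = [33.644, -23.384, -74.375, 34.627, 18.404, 63.062,
             7.419, -30.435, -15.794, 48.255, 36.766, -8.482]
segment_2 = [40.11, 5.728, -22.954, 30.254, 10.283, -4.349,
             -4.717, -0.904, -1.862, -9.79, -17.182, 5.178]
segment_3 = [34.347, -6.78, 5.479, -62.715, 3.666, -8.412,
             -9.065, 4.048, 4.949, -0.829, -11.058, 0.388]

def timbre_distance(a, b):
    """Euclidean distance between two equal-length timbre vectors."""
    return math.dist(a, b)

print(round(timbre_distance(segment_1, segment_2), 3))
print(round(timbre_distance(segment_2, segment_3), 3))
```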


#### A quick look at timbre values
We will see a lot more of this in the next notebook, [Creating a 3D Audio Model](https://nbviewer.org/github/JonYarber/music_modeling/blob/main/4.%20Creating%20a%203D%20Music%20Model.ipynb).

In [62]:
# Create the audio analysis dataframe
audio_analysis_df = pd.DataFrame(sp.audio_analysis(track_uri)['segments'])
    
# Create a timbre dataframe
timbre_df = audio_analysis_df[['start', 'timbre']].copy()

# Expand the timbres and label
for i in range(12):
    timbre_df[f'timbre_{i + 1}'] = timbre_df['timbre'].apply(lambda x: x[i])

# Drop the original timbre column
timbre_df.drop(['timbre'], axis = 1, inplace = True)

# Look at first 5 timbre vectors
timbre_df.head()

| | start | timbre_1 | timbre_2 | timbre_3 | timbre_4 | timbre_5 | timbre_6 | timbre_7 | timbre_8 | timbre_9 | timbre_10 | timbre_11 | timbre_12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 0.05 | 169.954 | 8.395 | -29.988 | 56.787 | -50.228 | 14.894 | 3.851 | -27.467 | 0.92 | -10.538 | -6.641 |
| 1 | 0.11134 | 33.644 | -23.384 | -74.375 | 34.627 | 18.404 | 63.062 | 7.419 | -30.435 | -15.794 | 48.255 | 36.766 | -8.482 |
| 2 | 0.95655 | 40.11 | 5.728 | -22.954 | 30.254 | 10.283 | -4.349 | -4.717 | -0.904 | -1.862 | -9.79 | -17.182 | 5.178 |
| 3 | 1.53175 | 34.347 | -6.78 | 5.479 | -62.715 | 3.666 | -8.412 | -9.065 | 4.048 | 4.949 | -0.829 | -11.058 | 0.388 |
| 4 | 1.76961 | 33.805 | -53.703 | -75.956 | 43.627 | -25.619 | -46.175 | 7.318 | -32.674 | -14.825 | 25.307 | -9.279 | 18.329 |
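
The loop above calls `apply` once per dimension. An alternative sketch expands the whole list column in one step and works for both `pitches` and `timbre`. `expand_vector_column` is a hypothetical helper, shown here on a tiny made-up frame with three dimensions rather than twelve.

```python
import pandas as pd

def expand_vector_column(df, column, labels):
    """Replace a column of equal-length lists with one column per element."""
    expanded = pd.DataFrame(df[column].tolist(), columns=labels, index=df.index)
    return pd.concat([df.drop(columns=[column]), expanded], axis=1)

# Made-up data for illustration only
demo = pd.DataFrame({
    'start': [0.0, 0.5],
    'timbre': [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
})
wide = expand_vector_column(demo, 'timbre', ['timbre_1', 'timbre_2', 'timbre_3'])
print(wide)
```

Passing `index=df.index` keeps the expanded columns aligned with the original rows even when the frame has been filtered or re-indexed.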


Now that we know what we are looking for and how to obtain it, we will move on to the next step where I will show you how to create the 3D timbre model.

[Back to top](#top)