<h1>Creating the Music Model</h1>

In this notebook, I will show you how to create an awesome 3D model using the timbre data from Spotify's audio analysis.

<h2>Prerequisites</h2>

<a href = 'http://localhost:8888/notebooks/Documents%2FProjects%2Fspotify%2F3.%20Understanding%20the%20Data.ipynb'>1. Setting Up Your API Connection</a><br>
<a href = 'http://localhost:8888/notebooks/Documents%2FProjects%2Fspotify%2F3.%20Understanding%20the%20Data.ipynb'>2. Using the API</a><br>
<a href = 'http://localhost:8888/notebooks/Documents%2FProjects%2Fspotify%2F3.%20Understanding%20the%20Data.ipynb'>3. Understanding the Data</a><br>

<h2>Putting it All Together</h2>
Now that we've seen how to connect to Spotify's API to obtain the audio features and audio analysis, we're going to put it all together to create a beautiful 3D model of a song's timbre. <br>
Since the ultimate goal is to create a 3D print of this model, I chose to create a cylindrical model. This comes with its own unique set of challenges and code requirements. 

<h3>Connect to Spotify API Using Spotipy</h3>

In [1]:
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

In [2]:
# Replace the placeholders with your own credentials (never publish real ones)
credentials = SpotifyClientCredentials(client_id = 'YOUR_CLIENT_ID',
                                       client_secret = 'YOUR_CLIENT_SECRET')

sp = spotipy.Spotify(client_credentials_manager = credentials)

<br>Let's start by creating a function that searches for a song by artist and track name and returns the track URI.

In [3]:
def get_uri():

    artist = input("Input artist name: ")
    track = input("Input track name: ")

    search_term = f'artist:{artist} track:{track}'

    # Fetch the list of matches first; indexing [0] on an empty list would raise an IndexError
    items = sp.search(search_term, type = 'track')['tracks']['items']

    # If no result, tell the user and return None
    if not items:
        print("Search returned no result")
        return None

    # Otherwise return the URI of the first match
    return items[0]['uri']
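
For reuse outside an interactive session, the same lookup can be written to take its inputs as arguments. The following is a minimal sketch, not part of the original notebook: `find_track_uri` and `FakeClient` are hypothetical names, and the fake client only imitates the shape of the search response used above, so the example runs without credentials or a network call.

```python
def find_track_uri(client, artist, track):
    """Return the URI of the first matching track, or None if nothing matches."""
    query = f'artist:{artist} track:{track}'
    items = client.search(query, type='track')['tracks']['items']
    return items[0]['uri'] if items else None


class FakeClient:
    """Hypothetical stand-in that imitates the response shape used above."""

    def __init__(self, items):
        self._items = items

    def search(self, q, type='track'):
        return {'tracks': {'items': self._items}}


hit = FakeClient([{'uri': 'spotify:track:70LcF31zb1H0PyJoS1Sx1r'}])
miss = FakeClient([])

print(find_track_uri(hit, 'Radiohead', 'Creep'))
print(find_track_uri(miss, 'Nobody', 'Nothing'))
```

Passing the client in as an argument keeps the function easy to test; in the notebook, `sp` would take the place of the fake client.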

In [4]:
uri = get_uri()

print(f'\nTrack URI:\n{uri}')

Input artist name:  Radiohead
Input track name:  Creep



Track URI:
spotify:track:70LcF31zb1H0PyJoS1Sx1r


<br>
Perfect! Let's now use this to create a dataframe of timbre and loudness.

<h3>Create the Timbre Dataframe</h3>
Next we'll create the dataframe that will be used to build our model.<br>
We start by using the function we just created to search for a song and return the portion of audio analysis we need.<br>

In [47]:
import pandas as pd

# Use our function to get a track URI
uri = get_uri()

# Use the track URI to get the audio analysis. We only need the 'segments' portion
segments = sp.audio_analysis(uri)['segments']

# Convert it to a dataframe
segment_df = pd.DataFrame(segments)

# Create the timbre dataframe by selecting only the columns we need: start, loudness_start, timbre
timbre_df = segment_df[['start', 'loudness_start', 'timbre']].copy() # .copy() gives us an independent dataframe that is safe to modify

# Rename loudness and start columns
timbre_df = timbre_df.rename(columns = {'loudness_start': 'loudness', 'start': 'start_time'})

# How'd we do?
timbre_df.head()

Input artist name:  Beatles
Input track name:  Yesterday


Unnamed: 0,start_time,loudness,timbre
0,0.0,-60.0,"[0.0, 171.13, 9.469, -28.48, 57.491, -50.067, ..."
1,0.37134,-60.0,"[22.232, 10.977, -53.433, -18.372, 60.242, 227..."
2,0.71224,-43.205,"[28.063, -67.593, -98.307, -45.673, 31.978, -3..."
3,1.01687,-34.759,"[29.44, -69.623, -116.014, -36.885, 23.061, -4..."
4,1.33859,-35.627,"[32.995, -93.258, -53.919, -15.588, 30.646, -3..."
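
To experiment with these steps without calling the API, a hand-made list can stand in for the `segments` portion of the response. This is only a sketch: the two segments below are made-up values (shortened to three timbre coefficients each), and `toy_segments`, `toy_df`, and `toy_timbre_df` are illustrative names.

```python
import pandas as pd

# Made-up segments; a real response has 12 timbre values per segment
toy_segments = [
    {'start': 0.0, 'duration': 0.4, 'loudness_start': -60.0, 'timbre': [0.0, 171.1, 9.5]},
    {'start': 0.4, 'duration': 0.3, 'loudness_start': -43.2, 'timbre': [22.2, 11.0, -53.4]},
]

toy_df = pd.DataFrame(toy_segments)

# Keep only the needed columns, then rename them
toy_timbre_df = (toy_df[['start', 'loudness_start', 'timbre']]
                 .rename(columns={'loudness_start': 'loudness', 'start': 'start_time'}))

print(toy_timbre_df.columns.tolist())
```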


<br>
Next, we need to expand the timbre lists so that each timbre gets a column of its own.

In [48]:
# How this algorithm works:
# There are 12 timbres. Loop over the list positions 0 to 11
# Create a new column timbre_x, where x is the position plus 1 (timbre_1 to timbre_12)
# Set that column equal to the value at that position in the timbre list

for i in range(12):
    timbre_df[f'timbre_{i + 1}'] = timbre_df['timbre'].apply(lambda x: x[i])

# Drop the original timbre column
timbre_df.drop(['timbre'], axis = 1, inplace = True)

# Result
timbre_df.head()

Unnamed: 0,start_time,loudness,timbre_1,timbre_2,timbre_3,timbre_4,timbre_5,timbre_6,timbre_7,timbre_8,timbre_9,timbre_10,timbre_11,timbre_12
0,0.0,-60.0,0.0,171.13,9.469,-28.48,57.491,-50.067,14.833,5.359,-27.228,0.973,-10.64,-7.228
1,0.37134,-60.0,22.232,10.977,-53.433,-18.372,60.242,227.1,73.194,-37.545,-22.616,47.437,-61.104,-14.667
2,0.71224,-43.205,28.063,-67.593,-98.307,-45.673,31.978,-31.683,9.15,-4.495,21.215,21.263,14.256,19.109
3,1.01687,-34.759,29.44,-69.623,-116.014,-36.885,23.061,-44.102,-4.948,-11.709,19.167,-2.326,1.337,26.828
4,1.33859,-35.627,32.995,-93.258,-53.919,-15.588,30.646,-36.02,26.725,-6.089,1.971,7.112,-2.744,19.641
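
An alternative to the loop, sketched here on made-up data, is to build all the timbre columns at once from the list column. `toy` and `expanded` are illustrative names, and the lists are shortened to three values.

```python
import pandas as pd

toy = pd.DataFrame({
    'start_time': [0.0, 0.4],
    'timbre': [[0.0, 171.1, 9.5], [22.2, 11.0, -53.4]],
})

# Turn each list into a row of a new frame, one column per list position
expanded = pd.DataFrame(toy['timbre'].tolist(), index=toy.index)
expanded.columns = [f'timbre_{i + 1}' for i in range(expanded.shape[1])]

# Attach the new columns and drop the original list column
toy = toy.drop(columns='timbre').join(expanded)
print(toy)
```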


<br>
Finally, we will unpivot these 12 timbres using the .melt() method. This will replace the 12 timbre columns with two columns: one that indicates which timbre we are looking at, 'timbre_num', and one that holds the value of that timbre, 'timbre'.

In [49]:
# Create a list of the timbre columns
timbre_cols = [col for col in timbre_df.columns if col.startswith('timbre_')]

# Melt the timbres
timbre_df = timbre_df.melt(id_vars = ['start_time', 'loudness'],
                           value_vars = timbre_cols,
                           var_name = 'timbre_which',
                           value_name = 'timbre')

# Create the 'timbre_num' column by keeping only the number from the 'timbre_which' column
timbre_df['timbre_num'] = [int(timbre_which.split('_')[1]) for timbre_which in timbre_df['timbre_which']]

# Drop the timbre_which column
timbre_df.drop(['timbre_which'], axis = 1, inplace = True)

# Sort the dataframe by start_time and timbre_num
timbre_df = timbre_df.sort_values(by = ['start_time', 'timbre_num']).reset_index(drop = True)

# How'd we do?:
timbre_df.head(24) # First two segments

Unnamed: 0,start_time,loudness,timbre,timbre_num
0,0.0,-60.0,0.0,1
1,0.0,-60.0,171.13,2
2,0.0,-60.0,9.469,3
3,0.0,-60.0,-28.48,4
4,0.0,-60.0,57.491,5
5,0.0,-60.0,-50.067,6
6,0.0,-60.0,14.833,7
7,0.0,-60.0,5.359,8
8,0.0,-60.0,-27.228,9
9,0.0,-60.0,0.973,10
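
The wide-to-long step is easier to follow on a tiny frame. The sketch below uses made-up values and only two timbre columns; `wide` and `long` are illustrative names.

```python
import pandas as pd

wide = pd.DataFrame({
    'start_time': [0.0, 0.4],
    'loudness': [-60.0, -43.2],
    'timbre_1': [0.0, 22.2],
    'timbre_2': [171.1, 11.0],
})

# Every column not listed in id_vars is stacked into the value column
long = wide.melt(id_vars=['start_time', 'loudness'],
                 var_name='timbre_which',
                 value_name='timbre')

# 'timbre_2' -> 2
long['timbre_num'] = long['timbre_which'].str.split('_').str[1].astype(int)

long = (long.drop(columns='timbre_which')
            .sort_values(['start_time', 'timbre_num'])
            .reset_index(drop=True))
print(long)
```

Each input row becomes one output row per timbre column, so 2 segments with 2 timbres give 4 rows.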


<h4>The 13th Timbre</h4>
We are going to create a polygon of 12 sides (dodecagon) using 12 points. Point 1 will connect to point 2, point 2 to 3, and so on. When we get to point 12, intuitively we would simply connect it back to point 1. However, the plotting tool will not do this and will leave a gap in our polygon. To fix this, we'll simply create a 13th point (timbre), which will just be a copy of the 1st point.<br>
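
The gap is easy to see on a toy polygon. This sketch (not from the original notebook) uses a square for brevity: tracing the four corners in order draws only three edges, and repeating the first corner at the end adds the fourth.

```python
import numpy as np

# Four corners of a square, in drawing order
corners = np.array([[1, 1], [-1, 1], [-1, -1], [1, -1]])

# Append a copy of the first corner so the outline closes
closed = np.vstack([corners, corners[:1]])

# Each consecutive pair of points is one drawn edge
print(len(corners) - 1)  # edges drawn without the extra point
print(len(closed) - 1)   # edges drawn with the extra point
```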

In [50]:
import numpy as np

# Copy the first timbre of every segment (the rows where timbre_num is 1)
first_points = timbre_df[timbre_df['timbre_num'] == 1].copy()

# Relabel the copies as timbre 13
first_points['timbre_num'] = 13

# Append the copies to the dataframe
timbre_df = pd.concat([timbre_df, first_points], ignore_index = True)

# Sort the dataframe
timbre_df = timbre_df.sort_values(by = ['start_time', 'timbre_num'], axis = 0).reset_index(drop = True)

# Did it work? (timbre 13 should be the same as timbre 1)
timbre_df.head(26)




Unnamed: 0,start_time,loudness,timbre,timbre_num
0,0.0,-60.0,0.0,1.0
1,0.0,-60.0,171.13,2.0
2,0.0,-60.0,9.469,3.0
3,0.0,-60.0,-28.48,4.0
4,0.0,-60.0,57.491,5.0
5,0.0,-60.0,-50.067,6.0
6,0.0,-60.0,14.833,7.0
7,0.0,-60.0,5.359,8.0
8,0.0,-60.0,-27.228,9.0
9,0.0,-60.0,0.973,10.0


<br>
Looks good! Next we'll apply the needed transforms.<br>

<h3>Applying Transforms</h3>

There are 4 transforms I chose to apply to timbre to create this model:
<li><b>Add 400</b></li>
The log transform in the next step cannot take negative values, but taking the absolute value of the timbres would distort how they relate to one another. Instead, I add 400 to each timbre; in my exploration I never found a timbre valued below -400.<br>
<br>
<li><b>Log</b></li>
The values of the timbres in their raw form are too widespread for this model. The two options to scale them for this project were the log transform and the min-max scaler. This was a little trial-and-error, but the log transform provided the best, most appealing shape of the model.<br>
<br>
<li><b>Min-Max</b></li>
I still applied and utilized the min-max scaler on timbre as it created a more robust color scale than the log transform. We will see why when we get to it in the code.<br>
<br>
<li><b>Loudness</b></li>
I wanted to add another dimension to the model, and loudness makes the most sense. When you're listening to a song, loudness helps dictate the feel of the song in a moment. To apply this transform, we simply multiply the scaled loudness by the log-scaled timbre. Loudness is first run through the min-max scaler to keep the diameter of the model small.
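
Taken together, the four transforms can be sketched on a small made-up array. This is an illustration under assumptions rather than the notebook's code: `min_max` is a hypothetical helper that mirrors the default 0-to-1 range of scikit-learn's `MinMaxScaler`, and all input values are invented.

```python
import numpy as np

def min_max(values):
    """Rescale values linearly so the smallest is 0 and the largest is 1."""
    values = np.asarray(values, dtype=float)
    return (values - values.min()) / (values.max() - values.min())

timbre = np.array([-150.0, 0.0, 50.0, 200.0])     # invented timbre values
loudness = np.array([-60.0, -30.0, -20.0, -5.0])  # invented loudness values

timbre_log = np.log10(timbre + 400)       # shift by 400, then take the log
timbre_scaled = min_max(timbre) + 1       # min-max version, shifted to [1, 2]
loudness_scaled = min_max(loudness) + 1   # quietest -> 1, loudest -> 2

radius = loudness_scaled * timbre_log     # drives the shape of the model
color = loudness_scaled * timbre_scaled   # drives the color scale

print(radius.round(3))
print(color.round(3))
```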

<h4>Apply Log Transform</h4>

We'll first look at the minimum timbre value to ensure that adding 400 will make every value positive.

In [51]:
min(timbre_df.timbre)

-169.614

In [52]:
# Apply the log transform
timbre_df['timbre_log'] = np.log10(timbre_df['timbre'] + 400)

# Observe
timbre_df.timbre_log.describe()

count    4433.000000
mean        2.603270
std         0.039156
min         2.362456
25%         2.581553
50%         2.603534
75%         2.631181
max         2.797814
Name: timbre_log, dtype: float64

<br>
I also like that the log transform makes all my values greater than 1. This is important in future transforms.
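
Since the shift of 400 is an empirical choice, a small guard can make the assumption explicit. In this sketch, `shifted_log10` is a hypothetical helper that raises an error instead of silently producing NaN or -inf when a value is at or below the negative of the shift.

```python
import numpy as np

def shifted_log10(values, shift=400):
    """Return log10(values + shift), refusing inputs that would not be positive."""
    values = np.asarray(values, dtype=float)
    if values.min() <= -shift:
        raise ValueError(f'minimum value {values.min()} is not greater than {-shift}')
    return np.log10(values + shift)

# The minimum observed above (-169.614) passes the check
print(shifted_log10([-169.614, 0.0, 227.1]).round(6))
```

Note that the result is greater than 1 only when the shifted input is greater than 10, that is, when the raw timbre is above -390.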

<h4>Apply Min-Max Scaler</h4>
The min-max scaler needs to be applied to both the timbre and loudness. <br>
We'll start with timbre.

In [53]:
from sklearn.preprocessing import MinMaxScaler

# Create a MinMaxScaler object
timbre_scaler = MinMaxScaler()

# Apply the min-max scaler to the timbre column
timbre_df['timbre_scaled'] = timbre_scaler.fit_transform(timbre_df['timbre'].values.reshape(-1, 1))

# Observe results
timbre_df.timbre_scaled.describe()

count    4433.000000
mean        0.433679
std         0.090297
min         0.000000
25%         0.380381
50%         0.430227
75%         0.496611
max         1.000000
Name: timbre_scaled, dtype: float64

<br>
Looks good, but this value is going to be multiplied by another, and the product should not shrink. The values must therefore be at least 1; otherwise, we're just taking a fraction. This is simple though: we'll just add 1 to the column.

In [54]:
timbre_df['timbre_scaled'] = timbre_df['timbre_scaled'] + 1

timbre_df.timbre_scaled.describe()

count    4433.000000
mean        1.433679
std         0.090297
min         1.000000
25%         1.380381
50%         1.430227
75%         1.496611
max         2.000000
Name: timbre_scaled, dtype: float64

<br>
Now let's apply the min-max scaler to the loudness.

In [55]:
# Create MinMaxScaler object
loudness_scaler = MinMaxScaler()

# Apply min-max scaler to the loudness column. 
timbre_df['loudness_scaled'] = loudness_scaler.fit_transform(timbre_df['loudness'].values.reshape(-1, 1))

# Observe results
timbre_df.loudness_scaled.describe()

count    4433.000000
mean        0.759911
std         0.142335
min         0.000000
25%         0.705702
50%         0.785248
75%         0.853282
max         1.000000
Name: loudness_scaled, dtype: float64

<br>
Same as with the timbre scaling, we need to add 1 to this column so that all values are at least 1.

In [56]:
timbre_df['loudness_scaled'] = timbre_df['loudness_scaled'] + 1

# Observe
timbre_df.loudness_scaled.describe()

count    4433.000000
mean        1.759911
std         0.142335
min         1.000000
25%         1.705702
50%         1.785248
75%         1.853282
max         2.000000
Name: loudness_scaled, dtype: float64

<h4>Loudness</h4>
The final transform is to scale the timbres by the loudness. Another simple one: we'll take our transformed timbres and multiply them by the scaled loudness. We will do this for both the 'timbre_log' and 'timbre_scaled' columns.

In [57]:
#  Scale timbre_log with loudness
timbre_df['timbre_log_w_loud'] = timbre_df['loudness_scaled'] * timbre_df['timbre_log']

# Scale timbre_scaled with loudness
timbre_df['timbre_scaled_w_loud'] = timbre_df['loudness_scaled'] * timbre_df['timbre_scaled']

# Observe
timbre_df.head(26)

Unnamed: 0,start_time,loudness,timbre,timbre_num,timbre_log,timbre_scaled,loudness_scaled,timbre_log_w_loud,timbre_scaled_w_loud
0,0.0,-60.0,0.0,1.0,2.60206,1.426805,1.0,2.60206,1.426805
1,0.0,-60.0,171.13,2.0,2.756735,1.857425,1.0,2.756735,1.857425
2,0.0,-60.0,9.469,3.0,2.612221,1.450632,1.0,2.612221,1.450632
3,0.0,-60.0,-28.48,4.0,2.569982,1.35514,1.0,2.569982,1.35514
4,0.0,-60.0,57.491,5.0,2.660383,1.571471,1.0,2.660383,1.571471
5,0.0,-60.0,-50.067,6.0,2.543985,1.30082,1.0,2.543985,1.30082
6,0.0,-60.0,14.833,7.0,2.617873,1.46413,1.0,2.617873,1.46413
7,0.0,-60.0,5.359,8.0,2.60784,1.44029,1.0,2.60784,1.44029
8,0.0,-60.0,-27.228,9.0,2.571443,1.35829,1.0,2.571443,1.35829
9,0.0,-60.0,0.973,10.0,2.603115,1.429253,1.0,2.603115,1.429253


<br>
That's what we want to see. On to the next part: math!

<h3>Creating Coordinates</h3>
Since our final shape is going to be a cylinder, we need X, Y, and Z values. The Z value is easy: start_time. X and Y, however, are a little trickier.<br>
We need to use the timbres to make a circular shape. It's actually a polygon with 12 sides, or a <b>dodecagon</b>. Knowing that I need a circular shape with 12 sides, I divide the degrees in a circle (360) by 12 to arrive at the angle between neighboring timbres (points): 30. So we'll have points at 30, 60, 90, etc. degrees.<br>
<i>"Okay, that's great that I know the angle on which the points fall, but how do I actually find those X & Y values?"<br></i>
For that, we're going to need to use a little trigonometry and a unit circle:<br>
<br>
<img src = 'https://upload.wikimedia.org/wikipedia/commons/4/4c/Unit_circle_angles_color.svg' alt = 'UnitCircle'>
<br>
The unit circle shows us points at radius 1 at various angles around the circle. Lucky for us, 30 degrees is on the unit circle! Unlucky for us, we don't need a radius of 1. We need a radius that varies depending upon how big or small the timbre is. How do we fix that? Easy, just multiply the points on the unit circle by the associated timbre. So the first timbre will be located at 30 degrees: its X will be the timbre value multiplied by the X value on the unit circle, and its Y will be the timbre value multiplied by the Y value on the unit circle. The second timbre will be located at 60 degrees. The third, 90. And so on.<br>
Boom. Got it.<br>
There are two ways to go about this calculation:<br>
<li>Multiply each X and Y value on the unit circle at every 30 degrees by the associated timbre to create timbre_X and timbre_Y.</li>
<li>Use a little trigonometry to figure out how those values are derived, then apply that to our dataframe to create timbre_X and timbre_Y.</li><br>
Is the first option even <i>really</i> an option?

<h4>Trigonometry Review!</h4>
<i>How are the values on a unit circle calculated?</i><br>
This is actually not as bad as it may seem. Let's look at a right triangle:<br>
<br>
<img src = 'https://study.com/cimages/multimages/16/screenshot_2021-05-26_214629_13996646558148649374.jpg' alt = 'RightTriangle'>
In this image, 'Opposite' is our Y value and 'Adjacent' is our X value. The 'Hypotenuse' is the radius of our unit circle, which we know to be 1. From our trigonometry lessons we know:<br>
cos &#x0398; = Adjacent (X) / Hypotenuse (1) <br>
sin &#x0398; = Opposite (Y) / Hypotenuse (1) <br>
-or-<br>
X = cos &#x0398;<br>
Y = sin &#x0398;<br>
That makes it very easy. To find timbre_X, all we need is to find the cosine of the angle and multiply it by the timbre; to find timbre_Y, we find the sine of the angle and multiply it by the timbre.<br>
<br>
That wasn't so hard now was it?
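
As a quick check of these formulas, the sketch below computes a single point with the standard `math` module. `polar_to_xy` is an illustrative name, and the radius of 2.0 is arbitrary.

```python
import math

def polar_to_xy(radius, angle_degrees):
    """Convert a radius and an angle in degrees to X and Y coordinates."""
    theta = math.radians(angle_degrees)
    return radius * math.cos(theta), radius * math.sin(theta)

# On the unit circle, 30 degrees sits at (sqrt(3)/2, 1/2)
x, y = polar_to_xy(1.0, 30)
print(round(x, 3), round(y, 3))    # 0.866 0.5

# A larger radius scales both coordinates by the same factor
x2, y2 = polar_to_xy(2.0, 30)
print(round(x2, 3), round(y2, 3))  # 1.732 1.0
```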

In [67]:
import math

# Create a column of angles based on timbre number (timbre_num)
timbre_df['timbre_angle'] = 30 * timbre_df['timbre_num']

# The sin and cos functions in the math package will only take the angles as radians. 
# Convert the column just created to radians
timbre_df['timbre_angle'] = (math.pi * timbre_df['timbre_angle']) / 180

# Create timbre_X
# For this, we want to use the timbre_log_w_loud variable
timbre_df['timbre_X'] = round(timbre_df['timbre_log_w_loud'] * timbre_df['timbre_angle'].apply(lambda z: math.cos(z)), 3)

# Create timbre_Y using timbre_log_w_loud
timbre_df['timbre_Y'] = round(timbre_df['timbre_log_w_loud'] * timbre_df['timbre_angle'].apply(lambda z: math.sin(z)), 3)

# We don't need the angle column anymore
timbre_df.drop(['timbre_angle'], axis = 1, inplace = True)

# Let's look at the state of things:
timbre_df.head(26)

Unnamed: 0,start_time,loudness,timbre,timbre_num,timbre_log,timbre_scaled,loudness_scaled,timbre_log_w_loud,timbre_scaled_w_loud,timbre_X,timbre_Y
0,0.0,-60.0,0.0,1.0,2.60206,1.426805,1.0,2.60206,1.426805,2.253,1.301
1,0.0,-60.0,171.13,2.0,2.756735,1.857425,1.0,2.756735,1.857425,1.378,2.387
2,0.0,-60.0,9.469,3.0,2.612221,1.450632,1.0,2.612221,1.450632,0.0,2.612
3,0.0,-60.0,-28.48,4.0,2.569982,1.35514,1.0,2.569982,1.35514,-1.285,2.226
4,0.0,-60.0,57.491,5.0,2.660383,1.571471,1.0,2.660383,1.571471,-2.304,1.33
5,0.0,-60.0,-50.067,6.0,2.543985,1.30082,1.0,2.543985,1.30082,-2.544,0.0
6,0.0,-60.0,14.833,7.0,2.617873,1.46413,1.0,2.617873,1.46413,-2.267,-1.309
7,0.0,-60.0,5.359,8.0,2.60784,1.44029,1.0,2.60784,1.44029,-1.304,-2.258
8,0.0,-60.0,-27.228,9.0,2.571443,1.35829,1.0,2.571443,1.35829,-0.0,-2.571
9,0.0,-60.0,0.973,10.0,2.603115,1.429253,1.0,2.603115,1.429253,1.302,-2.254
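
The same coordinates can also be computed without `apply`, since NumPy's trigonometric functions work on whole columns. This sketch uses a made-up frame with a constant radius of 2 at the quarter turns so the result is easy to verify by eye; the column names follow the ones above, and `toy` and `angle` are illustrative names.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    'timbre_num': [3, 6, 9, 12],
    'timbre_log_w_loud': [2.0, 2.0, 2.0, 2.0],
})

# 30 degrees per timbre, converted to radians for NumPy
angle = np.deg2rad(30 * toy['timbre_num'])

toy['timbre_X'] = (toy['timbre_log_w_loud'] * np.cos(angle)).round(3)
toy['timbre_Y'] = (toy['timbre_log_w_loud'] * np.sin(angle)).round(3)
print(toy)
```

The points land at 90, 180, 270, and 360 degrees, so X and Y alternate between 0 and plus or minus 2 (up to the sign of zero after rounding).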


<br>
That's all we need in the dataframe. Let's create the model.

<h3>Creating the Model</h3>


In [69]:
# Store the number of segments in the song for our reshapes
num_segments = len(timbre_df.start_time.unique())

# Also store the number of timbres
num_timbres = 13

print(f'Number of segments: {num_segments}')

Number of segments: 341


In [70]:
# Next, we create a color scale based on the timbre_scaled_w_loud values
color_scale = timbre_df.timbre_scaled_w_loud.values.reshape(num_segments, num_timbres).T

In [72]:
# Create X and Y values for the plot
X = timbre_df.timbre_X.values.reshape(num_segments, num_timbres).T
Y = timbre_df.timbre_Y.values.reshape(num_segments, num_timbres).T

# We'll call Z what it is - start_time
start_time = timbre_df.start_time.values.reshape(num_segments, num_timbres).T

In [73]:
print(f'\nShape of color scale: {color_scale.shape}')
print(f'Shape of X: {X.shape}')
print(f'Shape of Y: {Y.shape}')
print(f'Shape of Z: {start_time.shape}')


Shape of color scale: (13, 341)
Shape of X: (13, 341)
Shape of Y: (13, 341)
Shape of Z: (13, 341)
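
The reshape relies on the sort order: after sorting by `start_time` and `timbre_num`, the rows come in blocks of 13, one block per segment. The sketch below shows the same pattern on a made-up frame with 2 segments of 3 points each; all names and values are illustrative.

```python
import numpy as np
import pandas as pd

n_segments, n_points = 2, 3  # toy sizes

toy = pd.DataFrame({
    'start_time': np.repeat([0.0, 0.4], n_points),
    'point': np.tile([1, 2, 3], n_segments),
    'value': [10, 11, 12, 20, 21, 22],
})

# Rows are grouped by segment, so each reshaped row is one segment;
# transposing puts points on the rows and segments on the columns
grid = toy['value'].values.reshape(n_segments, n_points).T
print(grid)
```

If the frame were not sorted this way, the reshape would still succeed but would mix values from different segments in the same row of the grid.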


<br>
We have a perfectly matched 3D grid. Let's plot it using Plotly!

In [74]:
import plotly.graph_objects as go

fig = go.Figure()

fig.add_trace(go.Surface(x = X,
                         y = Y,
                         z = start_time,
                         surfacecolor = color_scale,
                         colorscale = 'ylgnbu'))

fig.update_layout(height = 1000, 
                  width = 1100,
                  scene=dict(aspectmode='manual',
                             aspectratio=dict(x=1, y=1, z=15)),
                  scene_camera = dict(
                                      eye = dict(x=4, y=-3, z=-9.5),
                                      center = dict(x=-2, y=2, z=0),
                                      up = dict(x=0, y=0, z=0)
                                     ))

fig.update_traces(showscale = False)

fig.show()