In [17]:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# 1. Genre-based Recommendation Engine
1. Determine the features of a song, and which columns are irrelevant for decision-making.
2. Cluster on those features to determine `genre`.
3. Analyze the data based on the new clusters and label them as genres. List all the unique genres.
4. Apply PCA to project the clusters into two-dimensional space.
5. Take as input one of the unique genres from step 3 and return 10 random songs of that genre.
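Steps 4 and 5 can be sketched on toy data as follows. This is a minimal illustration under stated assumptions, not the notebook's final implementation: the helper names `project_to_2d` and `recommend`, the `genre` column name, and the toy values are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA


def project_to_2d(features: pd.DataFrame) -> np.ndarray:
    """Project numeric feature columns onto their first two principal components."""
    return PCA(n_components=2).fit_transform(features.to_numpy())


def recommend(songs: pd.DataFrame, genre, n: int = 10, seed=None) -> pd.DataFrame:
    """Return up to `n` randomly chosen rows whose `genre` column equals `genre`."""
    matches = songs[songs["genre"] == genre]
    if matches.empty:
        raise ValueError(f"Unknown genre: {genre!r}")
    # Never ask for more rows than the genre contains.
    return matches.sample(n=min(n, len(matches)), random_state=seed)


# Toy data standing in for the real song table (values are made up).
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "name": [f"song {i}" for i in range(30)],
    "energy": rng.random(30),
    "valence": rng.random(30),
    "tempo": rng.uniform(60.0, 180.0, 30),
    "genre": ["calm"] * 15 + ["upbeat"] * 15,
})

coords = project_to_2d(toy[["energy", "valence", "tempo"]])
picks = recommend(toy, "upbeat", n=10, seed=0)
print(coords.shape)
print(picks[["name", "genre"]])
```

Passing a fixed `seed` makes the sample reproducible; leaving it as `None` gives a different selection on each call.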

## Analyzing the sourced raw data
We import the data and look at the columns in the .csv file along with their data types (strings or numerics). The official Spotify developer website documents what each column represents and how the columns relate to each other, so we take note of those definitions.

In [18]:
raw_data = pd.read_csv("data/data.csv")
raw_data.head()

Unnamed: 0,acousticness,artists,danceability,duration_ms,energy,explicit,id,instrumentalness,key,liveness,loudness,mode,name,popularity,release_date,speechiness,tempo,valence,year
0,0.995,['Carl Woitschach'],0.708,158648,0.195,0,6KbQ3uYMLKb5jDxLF7wYDD,0.563,10,0.151,-12.428,1,Singende Bataillone 1. Teil,0,1928,0.0506,118.469,0.779,1928
1,0.994,"['Robert Schumann', 'Vladimir Horowitz']",0.379,282133,0.0135,0,6KuQTIu1KoTTkLXKrwlLPV,0.901,8,0.0763,-28.454,1,"Fantasiestücke, Op. 111: Più tosto lento",0,1928,0.0462,83.972,0.0767,1928
2,0.604,['Seweryn Goszczyński'],0.749,104300,0.22,0,6L63VW0PibdM1HDSBoqnoM,0.0,5,0.119,-19.924,0,Chapter 1.18 - Zamek kaniowski,0,1928,0.929,107.177,0.88,1928
3,0.995,['Francisco Canaro'],0.781,180760,0.13,0,6M94FkXd15sOAOQYRnWPN8,0.887,1,0.111,-14.734,0,Bebamos Juntos - Instrumental (Remasterizado),0,1928-09-25,0.0926,108.003,0.72,1928
4,0.99,"['Frédéric Chopin', 'Vladimir Horowitz']",0.21,687733,0.204,0,6N6tiFZ9vLTSOIxkj8qKrd,0.908,11,0.098,-16.829,1,"Polonaise-Fantaisie in A-Flat Major, Op. 61",1,1928,0.0424,62.149,0.0693,1928


In [19]:
raw_data.info() # To see whether the numerical data are already represented as numerics and not strings

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 169909 entries, 0 to 169908
Data columns (total 19 columns):
 #   Column            Non-Null Count   Dtype  
---  ------            --------------   -----  
 0   acousticness      169909 non-null  float64
 1   artists           169909 non-null  object 
 2   danceability      169909 non-null  float64
 3   duration_ms       169909 non-null  int64  
 4   energy            169909 non-null  float64
 5   explicit          169909 non-null  int64  
 6   id                169909 non-null  object 
 7   instrumentalness  169909 non-null  float64
 8   key               169909 non-null  int64  
 9   liveness          169909 non-null  float64
 10  loudness          169909 non-null  float64
 11  mode              169909 non-null  int64  
 12  name              169909 non-null  object 
 13  popularity        169909 non-null  int64  
 14  release_date      169909 non-null  object 
 15  speechiness       169909 non-null  float64
 16  tempo             16

From [this page](https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/), we can derive the meaning for each column above. I'll write them down in this cell so we won't need to look back and forth on that page.
1. `Acousticness` represents a confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic.
2. `Danceability` represents how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. 1.0 is most danceable. 
3. `Energy` represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. 1.0 is most energetic.
4. `Instrumentalness` predicts whether a track contains no vocals. The closer the value is to 1.0, the greater the likelihood that the track has no vocal content.
5. `Liveness` detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides strong likelihood that the track is live.
6. `Loudness` represents the loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing the relative loudness of tracks. Values typically range between -60 and 0 dB.
7. `Speechiness` represents the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. Values above 0.66 are tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks.
8. `Valence` represents the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric).
9. `Tempo` represents the overall estimated tempo of a track in beats per minute (BPM).
10. `Mode` indicates the modality of the track: 0 = minor, 1 = major.
11. `Explicit` indicates explicit content: 0 = no explicit content, 1 = explicit content.
12. `Key` represents the estimated overall key of the track. Integers map to pitches using standard Pitch Class notation, e.g. 0 = C, 1 = C♯/D♭, 2 = D, and so on. If no key was detected, the value is -1.
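To make the encoded columns easier to read, the `Key`, `Mode`, and `Speechiness` descriptions above can be sketched as small helpers. The function names `key_name` and `speechiness_band` and the band labels are hypothetical, chosen for illustration. Treating exactly 0.33 and 0.66 as part of the middle band is an assumption, since the description does not say which side the boundaries fall on. The pitch names beyond the three listed above follow standard pitch class notation. The sample values passed in at the end are taken from rows 0 and 2 of the preview table.

```python
# Pitch classes 0..11 in standard pitch class notation.
PITCH_CLASSES = ["C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
                 "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B"]


def key_name(key: int, mode: int) -> str:
    """Turn the integer `key` and `mode` columns into a label such as 'D minor'."""
    if key == -1:
        return "no key detected"
    if not 0 <= key <= 11:
        raise ValueError(f"key must be -1 or in 0..11, got {key}")
    return f"{PITCH_CLASSES[key]} {'major' if mode == 1 else 'minor'}"


def speechiness_band(value: float) -> str:
    """Map a speechiness score to one of the three bands described above."""
    if value > 0.66:
        return "mostly spoken word"
    if value >= 0.33:  # assumption: both boundaries belong to the middle band
        return "music and speech"
    return "mostly music"


print(key_name(10, 1))          # key and mode of row 0 in the preview
print(speechiness_band(0.929))  # speechiness of row 2 in the preview
```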
---
Although the data has already been cleaned, we can double-check whether any column has missing (`NaN`) values. The cell below prints each column name along with its number of missing values.

In [20]:
for colname in raw_data.columns:
    print(f"Column {colname} has {raw_data[colname].isnull().sum()} missing values.")

Column acousticness has 0 missing values.
Column artists has 0 missing values.
Column danceability has 0 missing values.
Column duration_ms has 0 missing values.
Column energy has 0 missing values.
Column explicit has 0 missing values.
Column id has 0 missing values.
Column instrumentalness has 0 missing values.
Column key has 0 missing values.
Column liveness has 0 missing values.
Column loudness has 0 missing values.
Column mode has 0 missing values.
Column name has 0 missing values.
Column popularity has 0 missing values.
Column release_date has 0 missing values.
Column speechiness has 0 missing values.
Column tempo has 0 missing values.
Column valence has 0 missing values.
Column year has 0 missing values.


In [21]:
df = raw_data[['id', 'name', 'artists', 'acousticness', 'danceability', 'energy', 'instrumentalness', 'explicit', 'key', 'liveness', 'loudness', 'mode', 'popularity','speechiness', 'tempo', 'valence']]

## Begin to cluster the data
Before we fit the clustering on our feature set, we need to determine an appropriate value of k using the elbow method.

The elbow lies somewhere around `k = 20`, so we choose 20 as the number of unique `genres`.
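The elbow method itself can be sketched as below: fit k-means for a range of candidate k values, record the inertia (within-cluster sum of squared distances) of each fit, and look for the k beyond which the curve flattens. This is a minimal sketch on synthetic data, not the run that produced the value above. The helper name `elbow_inertias`, the candidate range, and the three-blob data are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans


def elbow_inertias(X, k_values, seed=0):
    """Fit KMeans once per candidate k and return the inertia of each fit."""
    inertias = []
    for k in k_values:
        model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=seed)
        model.fit(X)
        inertias.append(model.inertia_)
    return inertias


# Synthetic stand-in for the feature matrix: three well-separated blobs.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

k_values = list(range(1, 7))
inertias = elbow_inertias(X, k_values)
for k, inertia in zip(k_values, inertias):
    print(f"k={k}: inertia={inertia:.1f}")
```

On the real data, `X` would be the feature matrix (for example `features.values`), and plotting the result with `plt.plot(k_values, inertias, marker="o")` makes the elbow visible. With roughly 170,000 rows, each fit takes a while, so a coarse grid of k values is a reasonable starting point.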

In [22]:
from sklearn.cluster import KMeans # Clustering library
%matplotlib inline

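# Every column after id, name, and artists is a numeric audio feature
features = df.iloc[:, 3:]
distortions = []  # placeholder for elbow-method inertias; not filled in this cell

plt.clf()
k = 20
k_means = KMeans(init="k-means++", n_clusters=k, n_init=12)  # Run 12 times with different centroid seeds and keep the best run
k_means.fit(features.values)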

KMeans(n_clusters=20, n_init=12)

<Figure size 432x288 with 0 Axes>

In [23]:
temp = pd.DataFrame(k_means.labels_)
df = pd.concat([df, temp], axis=1)
df.rename(columns={0: "cluster"}, inplace=True)
df.head()

Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
0,6KbQ3uYMLKb5jDxLF7wYDD,Singende Bataillone 1. Teil,['Carl Woitschach'],0.995,0.708,0.195,0.563,0,10,0.151,-12.428,1,0,0.0506,118.469,0.779,12
1,6KuQTIu1KoTTkLXKrwlLPV,"Fantasiestücke, Op. 111: Più tosto lento","['Robert Schumann', 'Vladimir Horowitz']",0.994,0.379,0.0135,0.901,0,8,0.0763,-28.454,1,0,0.0462,83.972,0.0767,9
2,6L63VW0PibdM1HDSBoqnoM,Chapter 1.18 - Zamek kaniowski,['Seweryn Goszczyński'],0.604,0.749,0.22,0.0,0,5,0.119,-19.924,0,0,0.929,107.177,0.88,12
3,6M94FkXd15sOAOQYRnWPN8,Bebamos Juntos - Instrumental (Remasterizado),['Francisco Canaro'],0.995,0.781,0.13,0.887,0,1,0.111,-14.734,0,0,0.0926,108.003,0.72,12
4,6N6tiFZ9vLTSOIxkj8qKrd,"Polonaise-Fantaisie in A-Flat Major, Op. 61","['Frédéric Chopin', 'Vladimir Horowitz']",0.99,0.21,0.204,0.908,0,11,0.098,-16.829,1,1,0.0424,62.149,0.0693,9
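One way to start labelling clusters as genres is to compare the average feature values per cluster. The sketch below assumes a table with a `cluster` column like the one above. The helper name `cluster_profile` and the `n_songs` column are hypothetical, and the four toy rows reuse the `tempo` and `energy` values of rows 0, 1, 2, and 4 from the preview.

```python
import pandas as pd


def cluster_profile(songs: pd.DataFrame, feature_cols, label_col: str = "cluster") -> pd.DataFrame:
    """Mean of each feature per cluster, plus the number of songs in each cluster."""
    grouped = songs.groupby(label_col)
    profile = grouped[list(feature_cols)].mean()
    profile["n_songs"] = grouped.size()
    return profile


# Four rows copied from the preview above (two from cluster 9, two from cluster 12).
toy = pd.DataFrame({
    "tempo": [118.469, 83.972, 107.177, 62.149],
    "energy": [0.195, 0.0135, 0.22, 0.204],
    "cluster": [12, 9, 12, 9],
})

profile = cluster_profile(toy, ["tempo", "energy"])
print(profile)
```

On the full table, the same call with all feature columns gives one row per cluster, which is easier to scan than the first five songs of each cluster.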


In [35]:
import ast
df_new = df.copy(deep=True)

In [36]:
# `artists` is stored as a stringified Python list; parse it and keep only the first listed artist
df_new['artists'] = df_new['artists'].apply(lambda x: ast.literal_eval(x)[0])

In [40]:
for i in range(k):
    to_print = df_new[df_new['cluster'] == i]
    print("Cluster {}".format(i))
    display(to_print.head())
    print("----------------------------------------------------------------------------------------------------")

Cluster 0


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
1763,6M2da2eRNj08QJ4TT5vYuU,The Man I Love - 1995 Digital Remaster,Peggy Lee,0.968,0.262,0.219,0.00439,0,10,0.158,-13.534,0,19,0.0323,112.194,0.118,0
1764,6SEDXcebrNoYClqPORDe91,All Too Soon,Ella Fitzgerald,0.854,0.429,0.157,0.0,0,1,0.104,-14.926,1,16,0.0354,125.212,0.209,0
1770,76aRzC9R2XaYOY09k3JfDX,Easy Living,John Lewis,0.994,0.526,0.0872,0.00394,0,8,0.13,-13.066,1,18,0.0446,129.201,0.401,0
1771,79uBIicCUMDSRGomaKI79J,Metaphor,Yusef Lateef,0.847,0.448,0.202,0.0412,0,2,0.107,-19.357,0,17,0.0429,120.777,0.281,0
1777,0O6GI9vbKCe9kj5hj4hSBf,Right String but the Wrong Yo Yo,Carl Perkins,0.284,0.613,0.888,5.1e-05,0,0,0.238,-7.684,1,16,0.0628,117.759,0.889,0


----------------------------------------------------------------------------------------------------
Cluster 1


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
33,6Z6DJ8L7WMy2WHbXe3jR9I,A Shropshire Lad: Oh Fair Enough are Sky and P...,George Butterworth,0.987,0.238,0.0324,0.000986,0,7,0.112,-24.361,0,0,0.0541,172.321,0.0379,1
81,6yanEXQBzoNKlWJzKGBpsb,Reaching for Someone (And Not Finding Anyone T...,Paul Whiteman,0.993,0.523,0.0867,0.124,0,10,0.14,-17.528,1,1,0.0707,161.687,0.697,1
101,3yZj8i9TwsWAVmBWKdMbaa,Cobardia - Remasterizado,Ignacio Corsini,0.991,0.545,0.133,0.444,0,7,0.115,-20.53,0,0,0.127,185.062,0.691,1
114,40jiorcB4ujhNCJEM3NYRz,Cantando Bajo la Lluvia - Remasterizado,Francisco Canaro,0.989,0.445,0.288,0.758,0,2,0.111,-8.925,1,0,0.0699,205.423,0.697,1
118,41fXcqGebIUYOFQvDzqLJG,A Lo Lejos - Remasterizado,Francisco Canaro,0.99,0.518,0.462,0.85,0,4,0.157,-9.632,0,0,0.377,178.88,0.696,1


----------------------------------------------------------------------------------------------------
Cluster 2


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
2782,3M0nozpvaNSj5WOF5JeW45,Ride Your Pony,Betty Harris,0.0895,0.694,0.691,0.00387,0,10,0.293,-7.499,0,41,0.053,123.244,0.961,2
2845,4J2xMy0kakU9sAin1uppxb,La Balsa,Los Gatos,0.403,0.506,0.611,0.0,0,11,0.19,-4.977,1,46,0.0313,125.925,0.646,2
2963,6SsQ4eeyzgJirHaWABOK9Q,Désormais,Charles Aznavour,0.57,0.438,0.687,0.0,0,0,0.14,-5.968,0,45,0.0448,123.178,0.46,2
3022,5gpQ5GGP8u7GETtKIlGPVY,Now and Forever,Jimmy Cliff,0.123,0.555,0.474,0.0,0,5,0.132,-8.646,1,38,0.0277,123.323,0.316,2
3028,22zbgTMLuj0sNFLxCJGdHR,Jumpin' Jack Flash,Thelma Houston,0.0767,0.525,0.706,0.432,0,8,0.176,-9.125,1,35,0.0345,132.902,0.84,2


----------------------------------------------------------------------------------------------------
Cluster 3


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
1682,5fDFyReOWEeALWPzGYu4SM,"Jumps, Giggles And Shouts",Gene Vincent & His Blue Caps,0.141,0.591,0.803,0.0,0,7,0.204,-9.66,1,15,0.0584,96.198,0.856,3
1754,5aWJ0sKEfLQh5VHGxfpBwO,"6 Little Preludes: No. 4 in D Major, BWV 936",Johann Sebastian Bach,0.993,0.424,0.206,0.903,0,2,0.195,-25.727,1,18,0.0515,100.427,0.961,3
1755,5gGq4KxuYuivZWQpUi0HXT,Doin' My Time (2017 Remaster),Johnny Cash,0.121,0.617,0.587,0.00156,0,10,0.209,-13.291,0,14,0.0315,105.613,0.735,3
1758,5soXG9VjJxo6RQNQ2eNtpg,"Italian Concerto in F Major, BWV 971: I. [ ] -...",Johann Sebastian Bach,0.98,0.314,0.351,0.824,0,5,0.0775,-19.263,1,20,0.0383,95.357,0.905,3
1760,62CB0noXueC3vEE1wdWh8h,"9 Little Preludes, BWV 924-932: Praeludium in ...",Johann Sebastian Bach,0.995,0.609,0.34,0.884,0,5,0.195,-19.569,1,20,0.0752,91.7,0.974,3


----------------------------------------------------------------------------------------------------
Cluster 4


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
1695,6qdHpkWJJa2tAR59v8XkfR,Lover,Ella Fitzgerald,0.621,0.428,0.476,0.0,0,2,0.136,-6.931,1,15,0.0514,82.664,0.464,4
1734,1fEepJX6lz8fqhDRsNxkQp,Overture,Antônio Carlos Jobim,0.871,0.221,0.269,0.173,0,9,0.174,-10.066,0,17,0.034,80.475,0.169,4
1762,6GmTz8aCMthyvw8OkdbDdF,Sonny's Mood,Sonny Clark,0.846,0.544,0.344,0.000646,0,5,0.335,-11.963,0,16,0.0426,75.166,0.573,4
1766,6Z19hWp496a19Us6tbVDWB,Under a Blanket of Blue (with Paul Weston & Hi...,Doris Day,0.944,0.348,0.0596,0.000527,0,7,0.124,-18.77,1,18,0.0355,66.479,0.146,4
1767,6nzeMuNY6K841H1P193VAX,My Bucket's Got A Hole In It,Louis Armstrong,0.793,0.524,0.228,0.0,0,9,0.21,-9.553,0,16,0.0351,80.105,0.34,4


----------------------------------------------------------------------------------------------------
Cluster 5


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
4192,1kxPRJGVKGqjJM7BB44p0p,Errol - Remastered,Australian Crawl,0.0414,0.613,0.772,0.0,0,0,0.24,-6.971,1,56,0.0422,164.825,0.785,5
4640,5fu5vl7owT5ny0ry2Bema2,Si No Fuera Por... - Remasterizado 2007,Soda Stereo,0.0183,0.605,0.855,0.00404,0,0,0.242,-5.25,1,47,0.047,172.275,0.694,5
4689,2ePNAnihorbU5j4jzzfKJT,Las Curvas de Esa Chica,Mecano,0.262,0.655,0.833,4.1e-05,0,5,0.163,-6.866,1,48,0.167,164.197,0.817,5
5138,751bsmv3KNPrytbCUdzQJN,Via con me,Paolo Conte,0.799,0.669,0.425,0.0111,0,5,0.152,-18.404,1,50,0.14,165.314,0.7,5
5139,7A4qkJulbEMvveTLvj7Tbo,The Weeping Song - 2010 Remastered Version,Nick Cave & The Bad Seeds,0.204,0.416,0.723,5e-06,0,7,0.116,-9.793,0,48,0.136,170.3,0.719,5


----------------------------------------------------------------------------------------------------
Cluster 6


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
10,6QBInZBkQNIQYU9gGzT5l4,"Piano Sonata No. 2 in B-Flat Minor, Op. 36: I....",Sergei Rachmaninoff,0.994,0.376,0.0719,0.883,0,10,0.196,-21.849,0,0,0.0352,141.39,0.0393,6
27,6WRmg6x1bYjHxzeMCfKguB,"Scherzo No. 1 in B Minor, Op. 20",Frédéric Chopin,0.991,0.38,0.119,0.89,0,11,0.0601,-21.255,1,0,0.0389,132.005,0.18,6
35,6ZWi1fuonJCUlt4p6o9Uzs,La Noce À Rebecca,Perchicot,0.989,0.769,0.442,1e-06,0,2,0.167,-12.697,0,0,0.273,132.125,0.934,6
39,6bKJYhoaWIohnhspidbdW7,Chapter 2.11 - Zamek kaniowski,Seweryn Goszczyński,0.777,0.738,0.282,0.0,0,1,0.29,-16.556,1,0,0.952,140.612,0.703,6
62,6oZyWbbisoG8jOunmpPH40,Country Blues #1,Taj Mahal,0.872,0.571,0.204,0.779,0,6,0.197,-18.61,1,0,0.0328,129.289,0.0932,6


----------------------------------------------------------------------------------------------------
Cluster 7


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
1977,7kkIAIAKqHp3lb1XhPUG6x,Broto Legal (I'm In Love),Sérgio Murillo,0.662,0.716,0.505,0.000773,0,11,0.0318,-9.069,1,24,0.495,168.302,0.907,7
2005,3UBG6YnD80FP1RiC4uavGj,Beyond The Reef,Billy Vaughn,0.778,0.319,0.301,0.403,0,9,0.133,-11.733,0,22,0.0294,159.097,0.463,7
2148,22dqPEditHWH5vQ8S9pC8q,El Amor Es una Cosa Esplendorosa (Love Is a Ma...,Enrique Guzman,0.72,0.165,0.436,0.0,0,2,0.225,-7.921,1,27,0.0341,175.835,0.314,7
2317,5dILFohd0MWsM5a1ueCiBO,"Vilaines filles, mauvais garçons",Serge Gainsbourg,0.332,0.491,0.766,5e-06,0,0,0.448,-10.684,1,25,0.0432,165.599,0.904,7
2366,71TdR127wNt70AwcpLP4aT,Shut Down (Stereo),The Beach Boys,0.0545,0.589,0.676,0.00185,0,8,0.0442,-10.77,1,17,0.0475,159.892,0.979,7


----------------------------------------------------------------------------------------------------
Cluster 8


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
5375,5lSOVaPDk7x9Ey6c9DqGZx,Sweet Harmony,The Beloved,0.0144,0.573,0.837,0.0889,0,10,0.0544,-8.352,0,57,0.0337,101.59,0.796,8
5483,2FHnN5ELL83TGbtXMDzoiJ,I Like Chopin,Gazebo,0.577,0.779,0.635,0.000258,0,2,0.215,-8.028,0,59,0.0282,108.018,0.53,8
5676,7BZLNqU7zChzcnSo6ETJ5l,Rotterdam (Or Anywhere),The Beautiful South,0.327,0.724,0.585,2e-06,0,0,0.139,-7.68,1,60,0.0312,108.114,0.525,8
5683,5yIVcrwQXdIlDgTMc8pa6z,Contigo,Joaquín Sabina,0.0309,0.49,0.418,2e-05,0,2,0.0615,-10.773,1,62,0.0572,98.507,0.226,8
5830,0tTbk9bpoFRDa19eVghS4d,Caruso,Luciano Pavarotti,0.835,0.332,0.23,4.5e-05,0,7,0.107,-12.146,0,55,0.0338,99.573,0.215,8


----------------------------------------------------------------------------------------------------
Cluster 9


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
1,6KuQTIu1KoTTkLXKrwlLPV,"Fantasiestücke, Op. 111: Più tosto lento",Robert Schumann,0.994,0.379,0.0135,0.901,0,8,0.0763,-28.454,1,0,0.0462,83.972,0.0767,9
4,6N6tiFZ9vLTSOIxkj8qKrd,"Polonaise-Fantaisie in A-Flat Major, Op. 61",Frédéric Chopin,0.99,0.21,0.204,0.908,0,11,0.098,-16.829,1,1,0.0424,62.149,0.0693,9
5,6NxAf7M8DNHOBTmEd3JSO5,Scherzo a capriccio: Presto,Felix Mendelssohn,0.995,0.424,0.12,0.911,0,6,0.0915,-19.242,0,0,0.0593,63.521,0.266,9
6,6O0puPuyrxPjDTHDUgsWI7,"Valse oubliée No. 1 in F-Sharp Major, S. 215/1",Franz Liszt,0.956,0.444,0.197,0.435,0,11,0.0744,-17.226,1,0,0.04,80.495,0.305,9
9,6PrZexNb16cabXR8Q418Xc,Chapter 1.3 - Zamek kaniowski,Seweryn Goszczyński,0.846,0.674,0.205,0.0,0,9,0.17,-20.119,1,0,0.954,81.249,0.759,9


----------------------------------------------------------------------------------------------------
Cluster 10


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
11,6QIONtzbQCbnmWNwn0H1yT,"Piano Sonata No. 2, Op. 35: IV. Finale. Presto",Frédéric Chopin,0.989,0.17,0.0823,0.911,0,10,0.0962,-30.107,0,1,0.0317,85.989,0.346,10
12,6QgdUySTRGVkNo3KwbHpK3,"Piano Sonata in E-Flat Minor, Op. 26: III. Ada...",Samuel Barber,0.99,0.359,0.0435,0.899,0,7,0.109,-20.858,1,0,0.0424,96.645,0.042,10
22,6VUm7Dg5sufmG01IYcoJE3,"Andante spianato in E-Flat Major, Op. 22",Frédéric Chopin,0.975,0.277,0.09,0.949,0,7,0.125,-26.188,1,0,0.0316,105.031,0.168,10
31,6XoyWGdCJwFaJV1Pnmphwr,Por una Mujer - Remasterizado,Ignacio Corsini,0.995,0.531,0.124,0.0168,0,2,0.118,-23.243,1,0,0.0711,101.902,0.555,10
74,6v13FCz4z385EbNyPaXYCU,"Piano Sonata in E-Flat Minor, Op. 26: II. Alle...",Samuel Barber,0.987,0.242,0.149,0.877,0,0,0.134,-26.742,1,0,0.0369,88.188,0.279,10


----------------------------------------------------------------------------------------------------
Cluster 11


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
2106,67LQpgGMjI0jnXo9lRj07C,A Volta do Boêmio,Nelson Gonçalves,0.892,0.596,0.314,1e-06,0,9,0.168,-10.024,0,42,0.029,99.442,0.599,11
2186,5zUJlRQyzxw09Jv1hDgL5h,Twist à Saint-Tropez,Les Chats Sauvages,0.00318,0.39,0.637,0.000129,0,2,0.443,-4.498,1,37,0.0273,90.328,0.681,11
2201,0Cx7w1aXcvsPBdfdND0Pju,Presumida (High Class Baby),Los Teen Tops,0.556,0.571,0.751,0.0,0,2,0.0646,-9.544,1,36,0.0585,91.002,0.966,11
2966,70gCYpuRthr9sfC3WV8g5P,Melody Fair,Bee Gees,0.0274,0.449,0.423,0.0,0,2,0.0785,-8.535,1,40,0.0239,81.187,0.453,11
3007,3Tz5fLDzaPYxvd5MY6gtS1,"On Days Like These - From ""The Italian Job"" So...",Quincy Jones,0.77,0.472,0.338,0.0,0,7,0.199,-10.747,0,38,0.026,95.899,0.31,11


----------------------------------------------------------------------------------------------------
Cluster 12


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
0,6KbQ3uYMLKb5jDxLF7wYDD,Singende Bataillone 1. Teil,Carl Woitschach,0.995,0.708,0.195,0.563,0,10,0.151,-12.428,1,0,0.0506,118.469,0.779,12
2,6L63VW0PibdM1HDSBoqnoM,Chapter 1.18 - Zamek kaniowski,Seweryn Goszczyński,0.604,0.749,0.22,0.0,0,5,0.119,-19.924,0,0,0.929,107.177,0.88,12
3,6M94FkXd15sOAOQYRnWPN8,Bebamos Juntos - Instrumental (Remasterizado),Francisco Canaro,0.995,0.781,0.13,0.887,0,1,0.111,-14.734,0,0,0.0926,108.003,0.72,12
7,6OJjveoYwJdIt76y0Pxpxw,Per aspera ad astra,Carl Woitschach,0.988,0.555,0.421,0.836,0,1,0.105,-9.878,1,0,0.0474,123.31,0.857,12
8,6OaJ8Bh7lsBeYoBmwmo2nh,Moneda Corriente - Remasterizado,Francisco Canaro,0.995,0.683,0.207,0.206,0,9,0.337,-9.801,0,0,0.127,119.833,0.493,12


----------------------------------------------------------------------------------------------------
Cluster 13


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
4312,1y96w7WDPlwByq9aISEi6G,Barro Tal Vez,Luis Alberto Spinetta,0.938,0.392,0.124,0.0,0,2,0.134,-15.154,0,55,0.0424,122.422,0.141,13
4617,5WeBnrDPyLhxruxVryHCkn,La Bestia Pop,Patricio Rey y sus Redonditos de Ricota,0.0506,0.641,0.677,0.112,0,7,0.0687,-10.142,1,58,0.0327,120.666,0.896,13
5085,747fyAuUDG3feXRd6pILnx,Your Song,Elton John,0.78,0.554,0.33,3e-06,0,3,0.107,-10.866,1,66,0.03,128.214,0.297,13
5352,5UkoitnvaDUSsq7cVsOdOh,Be My Baby,Vanessa Paradis,0.447,0.731,0.683,5e-06,0,0,0.117,-8.045,1,56,0.0294,129.062,0.963,13
5377,5bxQHscWvyaQbm37igKP4K,La solitudine,Laura Pausini,0.738,0.574,0.467,0.0,0,1,0.149,-6.374,1,59,0.0381,131.89,0.256,13


----------------------------------------------------------------------------------------------------
Cluster 14


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
1832,0KnvDXxUgUuKIi8im5ywpE,Rocks In My Bed,Ella Fitzgerald,0.829,0.392,0.205,0.0,0,8,0.0901,-13.366,1,15,0.0522,212.242,0.373,14
2013,4EpaFP7yN1aSWWHmAcnU5U,That's All There Is to That,Dinah Washington,0.851,0.283,0.285,0.000122,0,8,0.119,-14.016,1,20,0.0455,204.13,0.359,14
2017,4ZD4ZwMVEPsnhCSJ3eBiQZ,East of the Sun,Paul Desmond,0.922,0.453,0.167,0.49,0,5,0.192,-18.961,0,15,0.0661,207.667,0.589,14
2077,2cr2rjdV2JOz1J0prTiPOG,Shake Your Money Maker,Elmore James,0.695,0.419,0.675,0.00032,0,2,0.0762,-5.491,1,14,0.0776,200.96,0.65,14
2231,34f4J8BuDUHjSYqP0EHPwA,Kahin Deep Jale Kahin Dil,Lata Mangeshkar,0.965,0.218,0.327,0.571,0,6,0.338,-11.444,0,14,0.0376,207.42,0.294,14


----------------------------------------------------------------------------------------------------
Cluster 15


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
1269,19Ttq3sjIl5pW9wT7ZFfmY,"Götterdämmerung : Act 1 : Willkommen, Gast, In...",Richard Wagner,0.976,0.0,0.0854,0.000105,0,7,0.343,-21.505,1,0,0.0,0.0,0.0,15
2721,0P7TUyrm6OfIDJJKcidvnu,My Kind Of Town (Reprise) - Live At The Sands ...,Frank Sinatra,0.0995,0.0,0.906,1.8e-05,0,1,0.366,-6.227,1,22,0.0,0.0,0.0,15
3387,2mex2o4uA69pMcLjMtyyGb,Ride Me Down Easy,Waylon Jennings,0.756,0.0,0.0484,0.000144,0,4,0.166,-18.198,1,29,0.0,0.0,0.0,15
6930,3oKBZhpwrMiOhosXauv3lP,Ocean Waves,Crain & Taylor,0.931,0.0,7.5e-05,0.892,0,1,0.115,-19.703,0,47,0.0,0.0,0.0,15
7411,7foc25ig7dibxvULPU2kBG,Brown Noise - 90 Minutes,Sound Dreamer,0.111,0.0,9.9e-05,0.392,0,2,0.137,-21.669,1,50,0.0,0.0,0.0,15


----------------------------------------------------------------------------------------------------
Cluster 16


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
3122,3yGyWqmw9eCQPdJJ6iJLWs,4/3/1943,Lucio Dalla,0.891,0.596,0.305,2e-06,0,0,0.192,-13.27,1,54,0.046,134.381,0.782,16
3312,1sdW2whJr8CLa6bMA67LPQ,Clube Da Esquina Nº 2,Milton Nascimento,0.44,0.47,0.454,0.424,0,2,0.244,-14.921,1,45,0.0289,151.519,0.738,16
3685,5oLfyNezPv2IZdAFC9cYsh,Miss You Nights - 2001 Remaster,Cliff Richard,0.697,0.284,0.25,1.5e-05,0,5,0.103,-12.217,1,44,0.0328,140.908,0.217,16
3926,1EtGuTEIAqZkBIjzY3MWCf,オリビアを聴きながら,Anri,0.733,0.453,0.486,2e-06,0,7,0.157,-7.934,1,48,0.0261,141.31,0.116,16
4271,3BLewreWlYMr2MbVUBfBS2,Almost With You - 2002 Digital Remaster,The Church,0.000112,0.366,0.816,0.021,0,0,0.136,-5.135,1,44,0.0331,152.085,0.608,16


----------------------------------------------------------------------------------------------------
Cluster 17


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
3386,689lBKIELtWGHfsPWpR6rv,Todas las Hojas Son del Viento,Pescado Rabioso,0.657,0.523,0.169,1.1e-05,0,0,0.115,-14.107,1,54,0.0403,76.049,0.392,17
3402,1BoLhF18bW0zMb5P4BAEEf,Help Me Make It Through the Night,John Holt,0.233,0.654,0.567,0.0274,0,4,0.357,-8.766,1,44,0.0553,78.788,0.956,17
3460,3XHm2zKyY6gcYPmF0DtBXW,Vino Griego,José Velez,0.251,0.504,0.643,0.0,0,5,0.223,-4.206,1,45,0.0267,76.077,0.656,17
3481,5VpZzYUnr2y3ztWREovItM,We Said Goodbye,Dave Maclean,0.0366,0.381,0.469,0.00368,0,5,0.125,-9.927,1,44,0.03,71.448,0.348,17
4650,5TjYiOt2CSqApOFo8gMkCb,The Island,Paul Brady,0.971,0.567,0.103,9e-06,0,5,0.0923,-15.332,1,47,0.0308,82.611,0.312,17


----------------------------------------------------------------------------------------------------
Cluster 18


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
2628,7lN2ZGdAmVQFYvfBhrv0yK,Desesperadamente,Eydie Gormé,0.728,0.537,0.415,0.0,0,8,0.242,-12.961,1,37,0.0338,112.958,0.833,18
2675,5yQgbjGfbj2U02m3M2wX3N,Stand by Your Man,Tammy Wynette,0.798,0.514,0.326,0.00076,0,9,0.153,-9.69,1,44,0.0291,106.504,0.559,18
2994,3GuUtxubpUewt8tckD79Bl,La Mujer Que Yo Quiero,Joan Manuel Serrat,0.613,0.551,0.492,0.0,0,1,0.103,-7.188,0,41,0.0288,116.49,0.733,18
3068,21YtaaAivXcpcFaYbTjgKP,To Be Young Gifted and Black,Bob & Marcia,0.418,0.646,0.454,0.000532,0,0,0.353,-10.821,1,41,0.0308,117.125,0.922,18
3072,5FD1IZKwXVSL3zGwlNvLoF,Celoso,Roberto Luti,0.84,0.526,0.501,0.0,0,0,0.139,-8.088,1,39,0.0367,113.546,0.762,18


----------------------------------------------------------------------------------------------------
Cluster 19


Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
1688,6443igXMStgLhDvCveDZfQ,Lo Que Quiera Lola,Bobby Capo,0.741,0.782,0.485,0.000122,0,0,0.334,-11.593,1,19,0.149,133.127,0.799,19
1759,5zTLkmFCtatsL66leWDzYu,"Italian Concerto in F Major, BWV 971: III. Pre...",Johann Sebastian Bach,0.978,0.263,0.339,0.883,0,5,0.132,-19.828,1,19,0.0323,141.828,0.902,19
1772,7Ip6RxnfiDWXO3kp4FAzlO,De Ti Enamorado,La Sonora Matancera,0.828,0.742,0.53,0.0,0,10,0.0497,-11.41,1,24,0.0409,137.16,0.973,19
1817,66zvey9033Ttmtj09VfwF0,Exactly Like You,Carmen McRae,0.601,0.694,0.353,0.0,0,5,0.105,-7.713,1,21,0.0437,135.258,0.563,19
1843,2DHSI9LXjHAdPT8gc5ghbf,Proud Of You,Eddie Cochran,0.678,0.506,0.42,0.0,0,1,0.202,-9.64,1,18,0.0481,141.016,0.859,19


----------------------------------------------------------------------------------------------------


In [43]:
df_new[df_new['artists'] == 'Bobby Capo']

Unnamed: 0,id,name,artists,acousticness,danceability,energy,instrumentalness,explicit,key,liveness,loudness,mode,popularity,speechiness,tempo,valence,cluster
1688,6443igXMStgLhDvCveDZfQ,Lo Que Quiera Lola,Bobby Capo,0.741,0.782,0.485,0.000122,0,0,0.334,-11.593,1,19,0.149,133.127,0.799,19
9805,0VntiMue6qOcWmJTDhCKXF,Luna De Miel En Puerto Rico,Bobby Capo,0.834,0.746,0.394,0.00393,0,8,0.119,-12.41,1,10,0.145,131.573,0.826,6
71596,0JaPsiY4rZB6g3dGoeYmiR,El Cucú,Bobby Capo,0.817,0.644,0.383,0.0,0,2,0.0691,-8.933,0,3,0.179,95.119,0.804,10
137875,2M3YIKtGu9e3SQDcOdtPTB,Me Lo Dijo Adela,Bobby Capo,0.774,0.848,0.593,0.000482,0,3,0.103,-11.769,0,17,0.12,135.878,0.963,6
163416,2RaXEKmOuAwQ1NAXbsdbwU,Locamente Enamorado,Bobby Capo,0.867,0.664,0.367,0.000231,0,4,0.382,-10.599,0,16,0.0501,113.68,0.867,0
