# Spotify Data Case Study
I listen to a lot of music on Spotify, as do many other people. But this is about me. My problem is that I promised a friend a list of my top 50 albums and I seriously don't know what they are. I can list maybe 20 before it becomes really hard to decide. The aim of this case study is to solve my pressing issue using supervised learning. And I guess other people can also download their Spotify data and do the same thing.

I decided on using a neural network since my music taste is quite sophisticated and therefore contains many complex patterns. Specifically, it should handle the problem of varying feature length better than other methods (since I listen to some albums much more than others, each album has a different number of observations). For this case study I won't be working with the Spotify API or any song metadata, nor will I reference album, artist, or track as features in the learning.

Step 1: Contact Spotify support to download my full listening history.
Step 2: Clean the data, manipulate it to work with a neural network, fit the neural network, predict my favorite albums.

In [1]:
import json
import pandas as pd
import numpy as np
import tensorflow as tf
import seaborn as sns
import time
import datetime

In [2]:
# read json files to df
frames = [pd.read_json('MyData/endsong_' + str(i) + '.json') for i in range(0, 5)]
# ignore_index makes sure they fully combine
# (pd.concat replaces DataFrame.append, which was removed in pandas 2.0)
df = pd.concat(frames, ignore_index=True)
y_train = pd.read_csv('y_train.csv')

In [3]:
%%capture
# drop podcasts
# (drop returns a new frame, so the result has to be assigned back)
df = df.drop(df[~df['episode_show_name'].isnull()].index)
# drop useless columns
renames = {"master_metadata_track_name":"track","master_metadata_album_artist_name":"artist","master_metadata_album_album_name":"album"}
drops = ["username","conn_country","ip_addr_decrypted","city","region","metro_code","longitude","latitude","offline","offline_timestamp","incognito_mode","user_agent_decrypted","platform","episode_name","episode_show_name","spotify_episode_uri"]
df.rename(columns = renames, inplace = True)
df = df.drop(drops, axis = 1)
# drop Sleepy John
df = df.drop(df[df['artist'].str.match("Sleepy John",case=False,na=False)].index)
# drop empty observations
# (drop returns a new frame, so the result has to be assigned back)
df = df.drop(df[df['spotify_track_uri'].isnull()].index)
# if I do not have over 20 observations of an album I assume it can't be one of my favorites
df = df.groupby(['artist', 'album']).filter(lambda x: len(x) > 20).reset_index()
# merge df with y_train file
df = df.merge(y_train, how='left', on=['artist', 'album'])
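
The play-count threshold used in this cell can be tried on a toy frame first. This is only a sketch: the rows are made up, and `min_plays` is a hypothetical name set to 2 here rather than the 20 used above. It shows that `groupby(...).filter(...)` keeps every row of each group that passes the size test.

```python
import pandas as pd

# Made-up listening rows; artists and albums are placeholders.
toy = pd.DataFrame({
    "artist": ["A", "A", "A", "B", "B", "C"],
    "album": ["x", "x", "x", "y", "y", "z"],
    "ms_played": [1000, 2000, 3000, 4000, 5000, 6000],
})

min_plays = 2  # hypothetical threshold; the notebook uses 20

# Keep only albums with more than `min_plays` rows.
kept = (
    toy.groupby(["artist", "album"])
       .filter(lambda g: len(g) > min_plays)
       .reset_index(drop=True)
)
print(kept)
```

Unlike the cell above, this sketch passes `drop=True` to `reset_index`, so the old row labels are discarded instead of being kept as an extra `index` column.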

In [4]:
# create a date column
df['date'] = df['ts'].str.split("T", expand=True)[0]
# create a timestamp for number of days since unix epoch
# technically it's a Timedelta (stored with nanosecond resolution) rather than a plain day count, but this makes no difference
df['date_ts'] = (pd.to_datetime(df['date']) - np.datetime64('1970-01-01T00:00:00'))
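
If a plain integer day count is preferred over a Timedelta, the `.dt.days` accessor can do the conversion. The sketch below uses two made-up timestamps in the same format as the `ts` column. It is an alternative illustration, not what the rest of this notebook relies on.

```python
import pandas as pd

# Made-up timestamps in the same ISO format as the `ts` column.
ts = pd.Series(["2018-07-15T17:15:45Z", "1970-01-02T00:00:00Z"])

# Take the date part, parse it, and subtract the Unix epoch.
dates = pd.to_datetime(ts.str.split("T", expand=True)[0])
delta = dates - pd.Timestamp("1970-01-01")

# `.dt.days` turns each Timedelta into a plain integer day count.
days = delta.dt.days
print(days.tolist())
```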

In [5]:
# here's what the data looks like
df.head(10)

Unnamed: 0,index,ts,ms_played,track,artist,album,spotify_track_uri,reason_start,reason_end,shuffle,skipped,predict,date,date_ts
0,0,2018-07-15T17:15:45Z,153351,4th Dimension,KIDS SEE GHOSTS,KIDS SEE GHOSTS,spotify:track:6JyEh4kl9DLwmSAoNDRn5b,trackdone,trackdone,False,,,2018-07-15,17727 days
1,4,2019-07-25T23:58:38Z,44912,I'm in Love Again,Tomppabeats,Harbor,spotify:track:3Pm3R9cbWkanONrubREjW9,trackdone,trackdone,False,,,2019-07-25,18102 days
2,7,2018-11-08T18:54:12Z,46359,Shimmy,System Of A Down,Toxicity,spotify:track:1a3X8Y882vwSnlnHqf9ztF,trackdone,endplay,False,,,2018-11-08,17843 days
3,8,2019-02-15T06:12:55Z,245098,Kids See Ghosts,KIDS SEE GHOSTS,KIDS SEE GHOSTS,spotify:track:2I3dW2dCBZAJGj5X21E53k,trackdone,trackdone,False,,,2019-02-15,17942 days
4,9,2017-04-17T15:17:46Z,8904,"Sing About Me, I'm Dying Of Thirst",Kendrick Lamar,"good kid, m.A.A.d city",spotify:track:0sd6BRTa0O96tfEbFGhJF9,clickrow,endplay,False,,1.0,2017-04-17,17273 days
5,11,2017-03-24T19:19:26Z,493400,Holiday / Boulevard of Broken Dreams,Green Day,American Idiot,spotify:track:0MsrWnxQZxPAcov7c74sSo,trackdone,trackdone,False,,,2017-03-24,17249 days
6,15,2018-10-14T13:56:24Z,314173,Man On The Moon,R.E.M.,Automatic For The People,spotify:track:4jLv3tDBu8ww2R07DvL12s,trackdone,trackdone,False,,,2018-10-14,17818 days
7,16,2016-09-05T08:28:08Z,242906,Must Be The Ganja,Eminem,Relapse,spotify:track:0g6Wn9gocuOVr72Yaf8UBF,trackdone,trackdone,True,,,2016-09-05,17049 days
8,19,2019-10-18T18:09:57Z,214906,Greedy,Ariana Grande,Dangerous Woman,spotify:track:2tpIAmAq9orm1Owh5pja1w,trackdone,trackdone,False,,,2019-10-18,18187 days
9,20,2018-12-13T17:34:10Z,149193,Salad Days,Mac DeMarco,Salad Days Demos,spotify:track:4DPmeebOrpe4tLgSI8hGC0,trackdone,trackdone,False,,,2018-12-13,17878 days


In [6]:
# describe all features
df.describe(include="all")

Unnamed: 0,index,ts,ms_played,track,artist,album,spotify_track_uri,reason_start,reason_end,shuffle,skipped,predict,date,date_ts
count,35355.0,35355,35355.0,35355,35355,35355,35355,35355,35355,35355,70.0,6026.0,35355,35355
unique,,34699,,6169,318,526,7054,10,10,2,,,1423,
top,,2016-11-27T18:19:28Z,,Remedy Ft. Leah Culver,Red Hot Chili Peppers,Blonde,spotify:track:3gAxPDWaeiRLRaPcSSkjdT,trackdone,trackdone,False,,,2020-01-20,
freq,,62,,116,1159,498,116,26628,28022,26596,,,206,
mean,31139.628567,,194894.9,,,,,,,,0.171429,0.597577,,17889 days 00:02:31.514637056
std,18027.797133,,125158.2,,,,,,,,0.379604,0.490427,,432 days 11:40:19.719840376
min,0.0,,0.0,,,,,,,,0.0,0.0,,16964 days 00:00:00
25%,15475.5,,123402.5,,,,,,,,0.0,0.0,,17569 days 00:00:00
50%,31140.0,,205266.0,,,,,,,,0.0,1.0,,17970 days 00:00:00
75%,46791.0,,257380.0,,,,,,,,0.0,1.0,,18245 days 00:00:00


In [7]:
# see how often each reason_end and reason_start value occurs
display(df["reason_end"].value_counts())
display(df["reason_start"].value_counts())

trackdone                       28022
endplay                          4583
fwdbtn                           1408
unexpected-exit-while-paused      497
remote                            400
logout                            206
backbtn                           168
unknown                            43
unexpected-exit                    27
trackerror                          1
Name: reason_end, dtype: int64

trackdone     26628
clickrow       5767
fwdbtn         1373
remote          537
appload         431
playbtn         406
backbtn         190
                 11
trackerror        9
unknown           3
Name: reason_start, dtype: int64

In [8]:
# Observe one of the albums I have listened to most: Modal Soul
album_grp = df.groupby(['album'])
album_grp.get_group('Modal Soul').head(10)

Unnamed: 0,index,ts,ms_played,track,artist,album,spotify_track_uri,reason_start,reason_end,shuffle,skipped,predict,date,date_ts
77,137,2019-03-23T18:37:36Z,257893,reflection eternal,Nujabes,Modal Soul,spotify:track:6eGMwVVABqVTe9bWRIm498,trackdone,trackdone,False,,1.0,2019-03-23,17978 days
99,176,2020-04-18T04:21:28Z,29570,World's end Rhapsody,Nujabes,Modal Soul,spotify:track:7BBZGDSZbsb4Esi8YB94HT,clickrow,endplay,False,,1.0,2020-04-18,18370 days
171,288,2018-06-13T20:47:35Z,336866,Luv(sic.) pt3 (feat. Shing02),Nujabes,Modal Soul,spotify:track:4xlpJ99yL9xYQtzG6c3hwk,trackdone,trackdone,False,,1.0,2018-06-13,17695 days
306,531,2018-04-17T04:40:41Z,214333,Eclipse (feat. Substantial),Nujabes,Modal Soul,spotify:track:7mEPuj0XW6eK14Unu6IUc1,trackdone,trackdone,False,,1.0,2018-04-17,17638 days
392,679,2019-03-01T02:42:02Z,336866,Luv(sic.) pt3 (feat. Shing02),Nujabes,Modal Soul,spotify:track:4xlpJ99yL9xYQtzG6c3hwk,trackdone,trackdone,False,,1.0,2019-03-01,17956 days
396,685,2019-03-21T03:13:19Z,289800,The Sign (feat. Pase Rock),Nujabes,Modal Soul,spotify:track:2g8vK3m0npTrzsADQAnbVO,trackdone,trackdone,False,,1.0,2019-03-21,17976 days
412,708,2019-03-22T04:16:10Z,14071,sea of cloud,Nujabes,Modal Soul,spotify:track:4rAcMik7N6LlIs61u5bzYo,clickrow,endplay,False,,1.0,2019-03-22,17977 days
554,991,2019-10-31T03:34:12Z,235000,Light on the land,Nujabes,Modal Soul,spotify:track:2rUaktaAxshyPzIdAKzk1Y,trackdone,trackdone,False,,1.0,2019-10-31,18200 days
669,1185,2019-03-06T22:53:43Z,336866,Luv(sic.) pt3 (feat. Shing02),Nujabes,Modal Soul,spotify:track:4xlpJ99yL9xYQtzG6c3hwk,trackdone,trackdone,False,,1.0,2019-03-06,17961 days
683,1209,2020-02-13T22:55:09Z,260467,Music is mine,Nujabes,Modal Soul,spotify:track:45ejirK0hfPnsjgzj3s7gP,trackdone,trackdone,False,,1.0,2020-02-13,18305 days


In [9]:
df.groupby(['artist', 'album']).size().sort_values(ascending=False).head(60)

artist                 album                                     
Frank Ocean            Blonde                                        498
Chon                   Homey                                         442
Flying Lotus           You're Dead!                                  388
Radiohead              Kid A                                         324
Porter Robinson        Worlds                                        317
Nujabes                Modal Soul                                    315
Jon Bellion            The Human Condition                           301
Kanye West             The Life Of Pablo                             299
Chon                   Grow                                          282
Tokyo Police Club      A Lesson In Crime                             270
Taylor Swift           Lover                                         258
BROCKHAMPTON           SATURATION II                                 258
Red Hot Chili Peppers  Stadium Arcadium                   

In [10]:
df[df['artist'].str.match("ecco",case=False,na=False)].head(10)

Unnamed: 0,index,ts,ms_played,track,artist,album,spotify_track_uri,reason_start,reason_end,shuffle,skipped,predict,date,date_ts
133,229,2020-08-05T05:29:44Z,154057,Cc,Ecco2k,E,spotify:track:0vsHcvOXID7A5RcpiNiLGj,trackdone,trackdone,False,,1.0,2020-08-05,18479 days
262,463,2021-01-20T11:03:11Z,4351,Fragile,Ecco2k,E,spotify:track:7be5hJjOIHkQ210gdVXcOL,clickrow,endplay,False,,1.0,2021-01-20,18647 days
345,598,2020-08-08T15:17:52Z,15893,Fragile,Ecco2k,E,spotify:track:7be5hJjOIHkQ210gdVXcOL,fwdbtn,fwdbtn,False,,1.0,2020-08-08,18482 days
596,1056,2021-01-20T03:57:49Z,224643,Calcium,Ecco2k,E,spotify:track:0D6MyCMzbniMYWHYyCcv5N,trackdone,trackdone,False,,1.0,2021-01-20,18647 days
642,1138,2020-08-08T13:25:51Z,1557,Bliss Fields,Ecco2k,E,spotify:track:779Wd2zuFcEDhwgX1Jtwq3,fwdbtn,fwdbtn,False,,1.0,2020-08-08,18482 days
760,1339,2021-01-19T09:18:13Z,252000,AAA Powerline,Ecco2k,E,spotify:track:6y6VuKUvAqVzONMPL4hXmU,clickrow,trackdone,True,,1.0,2021-01-19,18646 days
835,1465,2021-01-20T13:42:09Z,66634,Security!,Ecco2k,E,spotify:track:4XJlKNehaCUUeviadnz1Fg,clickrow,endplay,False,,1.0,2021-01-20,18647 days
1243,2181,2021-01-20T13:46:54Z,104187,Peroxide,Ecco2k,E,spotify:track:5r0nz4nalNOBQAPKchQKRY,clickrow,endplay,False,,1.0,2021-01-20,18647 days
1281,2257,2021-01-20T11:03:37Z,22234,Bliss Fields,Ecco2k,E,spotify:track:779Wd2zuFcEDhwgX1Jtwq3,clickrow,trackdone,False,,1.0,2021-01-20,18647 days
1573,2769,2021-01-09T17:30:21Z,155459,Security!,Ecco2k,E,spotify:track:4XJlKNehaCUUeviadnz1Fg,trackdone,trackdone,False,,1.0,2021-01-09,18636 days


In [11]:
# parentheses are needed because | binds more tightly than ==
df[(df['ms_played'] == 0) | df['artist'].isnull()]

Unnamed: 0,index,ts,ms_played,track,artist,album,spotify_track_uri,reason_start,reason_end,shuffle,skipped,predict,date,date_ts
271,477,2018-07-16T00:35:45Z,0,"The Island, Pt. 1 (Dawn) - Skrillex Remix",Pendulum,The Reworks,spotify:track:0jSvm9INXBdV6fzFbONmD7,backbtn,backbtn,False,,,2018-07-16,17728 days
291,508,2017-06-05T06:25:47Z,0,Ivory,Polyphia,Renaissance,spotify:track:6GIJiLJVQumva9G5khVXH6,fwdbtn,fwdbtn,True,,,2017-06-05,17322 days
375,652,2019-01-22T07:24:11Z,0,Antebellum,Vienna Teng,Inland Territory,spotify:track:7hSPU5mA6fRQVUpghuth1i,fwdbtn,fwdbtn,True,,,2019-01-22,17918 days
407,698,2019-01-22T07:24:13Z,0,Love Song,Elton John,Tumbleweed Connection,spotify:track:3B1uIhPK3xaWlWB4iELL09,fwdbtn,fwdbtn,True,,,2019-01-22,17918 days
464,801,2018-11-12T21:47:00Z,0,La Lune (feat. Dan Smith),Madeon,Adventure (Deluxe),spotify:track:1UCDE2UIE5cC6B1o0nsrSv,clickrow,endplay,True,,,2018-11-12,17847 days
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
35230,62005,2019-03-07T17:41:13Z,0,Uh Oh!,Tennyson,Uh Oh!,spotify:track:6rQBbUcQCijl6ndHa4m9sj,fwdbtn,fwdbtn,True,,,2019-03-07,17962 days
35289,62100,2018-01-24T06:29:11Z,0,The General,Dispatch,Bang Bang,spotify:track:6n6EXIwLtNwe4u4CFzENYm,fwdbtn,backbtn,True,,,2018-01-24,17555 days
35302,62122,2017-10-21T08:59:07Z,0,Engagement Party,Justin Hurwitz,La La Land,spotify:track:2bonbKENtFAQQh8U4UEAu5,fwdbtn,fwdbtn,True,,,2017-10-21,17460 days
35308,62132,2016-11-23T18:17:47Z,0,Blow,Kesha,Cannibal,spotify:track:5FQazQxWUHsJ8QDaXLdFzR,fwdbtn,fwdbtn,True,,,2016-11-23,17128 days
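
When two conditions are combined in a pandas mask, each comparison needs its own parentheses, because `|` binds more tightly than `==`. A minimal sketch on made-up rows:

```python
import pandas as pd

# Made-up rows: two zero-length plays and one missing artist.
toy = pd.DataFrame({
    "ms_played": [0, 1500, 0, 3000],
    "artist": ["A", None, "B", "C"],
})

# Parenthesise the comparison so it is evaluated before the `|`.
mask = (toy["ms_played"] == 0) | toy["artist"].isnull()
flagged = toy[mask]
print(flagged.index.tolist())
```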


In [12]:
dfgrouped = df.groupby(['artist','album'])
n_obs_max = dfgrouped.size().max()
n_groups = dfgrouped.ngroups

A = []
Y = []
for name, group in dfgrouped:
    # zero-pad each album's features to the length of the largest album
    a = np.array(group['date_ts'])
    a.resize(n_obs_max)
    b = np.array(group['ms_played'])
    b.resize(n_obs_max)
    # one feature vector per album: date_ts + ms_played, element-wise
    A.append(a + b)
    # label for the album (NaN if it is not in y_train)
    Y.append(group['predict'].max())

# shape (n_obs_max, n_groups); the timedeltas become integer nanoseconds
A = np.vstack(A).T
A = A.astype(np.int64)
# keep only the labelled albums for training
Y = np.array(Y)
labelled = ~np.isnan(Y)
X_train = A[:, labelled].T
Y = Y[labelled]
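
The zero-padding idea in this cell can be sketched on its own with `np.pad`. The sequences below are made up and stand in for per-album feature vectors of different lengths. This shows the general technique, not the exact arrays built above.

```python
import numpy as np

# Made-up play-duration sequences for three albums of different sizes.
sequences = [
    np.array([5, 3, 8]),
    np.array([2]),
    np.array([7, 1]),
]

max_len = max(len(s) for s in sequences)

# Right-pad each sequence with zeros so every row has `max_len` entries.
padded = np.vstack([
    np.pad(s, (0, max_len - len(s)), constant_values=0)
    for s in sequences
])
print(padded.shape)
```

Each row of `padded` then has the same width, which is what a dense input layer needs.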

In [13]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(128, activation='tanh'),
    tf.keras.layers.Dense(64, activation='tanh'),
    tf.keras.layers.Dense(64, activation='tanh', kernel_regularizer=tf.keras.regularizers.L2(0.1)),
    tf.keras.layers.Dense(64, activation='tanh'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
loss_fn = tf.keras.losses.BinaryCrossentropy()
model.compile(optimizer='adam', loss=loss_fn, metrics=['accuracy'])

model.fit(X_train, Y, epochs=200)

model.predict(A.T).T[0]

Epoch 1/200
...
Epoch 200/200


array([1.89265609e-03, 4.97132540e-04, 3.20792198e-04, 3.69518995e-04,
       8.51457119e-02, 8.83783221e-01, 7.35878944e-04, 1.00658834e-02,
       2.47421861e-03, 4.32879597e-01, 4.97132540e-04, 2.46390700e-03,
       8.02695751e-04, 3.69518995e-04, 1.22115016e-02, 1.46114826e-03,
       9.92971718e-01, 9.88577604e-01, 7.27921724e-04, 2.76975036e-02,
       7.75736570e-03, 7.57293582e-01, 4.97132540e-04, 9.98620391e-01,
       2.08190084e-03, 3.29195857e-02, 5.73545694e-04, 7.29079008e-01,
       3.20792198e-04, 4.41724211e-01, 4.71144915e-04, 6.05579972e-01,
       6.09517097e-04, 3.20792198e-04, 9.78742719e-01, 3.69518995e-04,
       1.67498589e-02, 9.07245278e-03, 6.09517097e-04, 1.51368976e-03,
       2.50631571e-03, 6.21267200e-01, 9.04190421e-01, 1.00182295e-02,
       8.94105434e-03, 1.29864663e-01, 4.91222739e-03, 9.98594165e-01,
       5.23686409e-03, 9.93489385e-01, 6.34679198e-03, 2.29597092e-04,
       5.18769026e-04, 2.59795785e-03, 5.23686409e-03, 2.63631344e-04,
      

In [14]:
pd.DataFrame(model.predict(A.T).T[0]).describe()

Unnamed: 0,0
count,529.0
mean,0.246667
std,0.378804
min,0.000172
25%,0.002199
50%,0.009072
75%,0.441724
max,0.999745
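
Since the goal is a ranked top 50 rather than a yes/no label, the scores can also be sorted directly. A minimal sketch with made-up album names and scores, where `k` is a hypothetical stand-in for 50:

```python
import numpy as np

# Made-up scores standing in for the model output; names are placeholders.
albums = np.array(["album_a", "album_b", "album_c", "album_d"])
scores = np.array([0.02, 0.97, 0.44, 0.88])

k = 2  # hypothetical; the goal in this notebook is 50

# argsort is ascending, so reverse it and keep the first k indices.
top_idx = np.argsort(scores)[::-1][:k]
top_albums = albums[top_idx].tolist()
print(top_albums)
```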


In [15]:
dfgrouped_merge = pd.DataFrame(dfgrouped.size())
dfgrouped_merge.columns = ['test_y']
dfgrouped_merge['test_y'] = model.predict(A.T).T[0]
df = df.merge(dfgrouped_merge, how='left', on=['artist', 'album'])

In [16]:
# mean predicted score (test_y) and mean label (predict) per album, top 50 by score
df_predictions = pd.DataFrame(df.groupby(['artist','album']).apply(
    lambda x: pd.Series([x.test_y.mean(),x.predict.mean()], index=['test_y','predict']))).sort_values('test_y',ascending=False).head(50)

In [17]:
df_predictions[df_predictions.predict!=1].head(25)

Unnamed: 0_level_0,Unnamed: 1_level_0,test_y,predict
artist,album,Unnamed: 2_level_1,Unnamed: 3_level_1
Pink Floyd,The Dark Side of the Moon,0.999135,
"Tyler, The Creator",IGOR,0.99862,
Anderson .Paak,Ventura,0.99862,
Big K.R.I.T.,4eva Is A Mighty Long Time,0.998594,
Danny Brown,Old,0.99853,
Death Grips,No Love Deep Web,0.997856,
Radiohead,OK Computer,0.997834,
Flying Lotus,You're Dead!,0.996836,
Kendrick Lamar,Section.80,0.99682,
Skrillex,Recess,0.996766,


In [43]:
pd.pivot_table(df,columns=['test_y'],aggfunc=['mean','std'])

Unnamed: 0_level_0,mean,mean,mean,mean,mean,mean,mean,mean,mean,mean,...,std,std,std,std,std,std,std,std,std,std
test_y,0.000079,0.000085,0.000092,0.000096,0.000110,0.000110.1,0.000114,0.000114.1,0.000128,0.000129,...,0.999005,0.999085,0.999235,0.999271,0.999356,0.999533,0.999537,0.999557,0.999833,0.999919
index,32918.714286,26535.444444,25428.339286,33307.925926,30659.37037,30670.446429,34149.666667,30410.52,30565.25,31971.928571,...,19229.748027,18841.424721,18580.622882,18931.770838,17950.058118,17460.531718,17295.397875,18118.981099,17573.302869,17540.298642
ms_played,189493.75,187987.703704,153738.25,156035.0,161533.407407,167488.339286,155216.407407,1411422.0,203796.982143,184690.321429,...,90368.579587,27870.89202,86305.275649,108953.925668,126673.258076,51950.924252,93796.432051,3056.769255,114053.927691,114378.290162
predict,,,,,,,,,,,...,,,,,0.0,0.0,,,0.0,0.0
shuffle,0.0,0.148148,0.428571,0.703704,0.333333,0.410714,0.382716,0.07407407,0.375,0.321429,...,0.502625,0.0,0.40776,0.396173,0.451546,0.423241,0.391684,0.066154,0.442023,0.411138
skipped,,,,,,0.0,0.0,,,,...,,,,,,0.0,0.5,,,


In [88]:
df[df['artist'].str.match("Sleepy John",case=False,na=False)]

Unnamed: 0,index,ts,ms_played,track,artist,album,spotify_track_uri,reason_start,reason_end,shuffle,skipped,predict,date,date_ts,test_y
8,18,2020-04-01T01:42:51Z,60000,"Fireplace Sounds for Sleep, Pt. 40",Sleepy John,Fire Sounds,spotify:track:5skajcSHCtqY9j92zHGT67,trackdone,trackdone,False,,,2020-04-01,18353 days,0.999557
218,364,2020-03-27T19:25:26Z,61124,"Fireplace Sounds for Sleep, Pt. 78",Sleepy John,Fire Sounds,spotify:track:47VxOqtHgwwfkdxSVGMOoZ,trackdone,trackdone,False,,,2020-03-27,18348 days,0.999557
227,386,2020-04-01T01:38:48Z,60000,"Fireplace Sounds for Sleep, Pt. 36",Sleepy John,Fire Sounds,spotify:track:6KPP7eOmnnol7pE6NjhmwH,trackdone,trackdone,False,,,2020-04-01,18353 days,0.999557
254,442,2020-03-27T15:20:51Z,60000,"Fireplace Sounds for Sleep, Pt. 14",Sleepy John,Fire Sounds,spotify:track:77BryQXGnQMf1p3O8oaUVP,trackdone,trackdone,False,,,2020-03-27,18348 days,0.999557
329,559,2020-03-28T23:22:30Z,60000,"Fireplace Sounds for Sleep, Pt. 05",Sleepy John,Fire Sounds,spotify:track:4rhmlccVHnQsHHurlWvf4A,trackdone,trackdone,False,,,2020-03-28,18349 days,0.999557
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
35346,61420,2020-03-27T20:48:11Z,61132,"Fireplace Sounds for Sleep, Pt. 157",Sleepy John,Fire Sounds,spotify:track:0yAjRU3WWgXO6Y3e9rI2WW,trackdone,trackdone,False,,,2020-03-27,18348 days,0.999557
35379,61482,2020-03-27T19:56:53Z,61874,"Fireplace Sounds for Sleep, Pt. 109",Sleepy John,Fire Sounds,spotify:track:0aZ94icu1nKTyJj8wPG3oY,trackdone,trackdone,False,,,2020-03-27,18348 days,0.999557
35605,61872,2020-03-27T19:57:27Z,33055,"Fireplace Sounds for Sleep, Pt. 110",Sleepy John,Fire Sounds,spotify:track:0IcwSjCwBhzZEn4tvqS6Lf,trackdone,fwdbtn,False,,,2020-03-27,18348 days,0.999557
35647,61945,2020-03-27T16:03:26Z,60000,"Fireplace Sounds for Sleep, Pt. 56",Sleepy John,Fire Sounds,spotify:track:0RDZiPDWu2cYQBXRZP3wrD,trackdone,trackdone,False,,,2020-03-27,18348 days,0.999557
