# In tune: analyze your own Spotify song data

## Data processing final project
## Name: Ruben de Klerk



#### Inspiration: 

It's 2020. This year kinda sucks and I am at home playing videogames and listening to lots of music. There is not much else to do other than to work on this project. I have put a lot of effort into my Spotify playlists and I've always wanted to know what makes a song suitable for a certain playlist. 
Are there playlists where the songs all share a certain trait? 
Are all metal songs in minor key, whereas all rock songs are in major? Let's find out! 

#### Data acquisition: 

This script uses the Spotify web API to acquire all of a user's public playlist data. The Spotify web API is a tool for developers and can be used by registering an app on their [website](https://developer.spotify.com/documentation/web-api/). This program uses the [spotipy library](https://github.com/plamere/spotipy) to extract the data from Spotify, and the [bokeh library](https://bokeh.org/) to visualize it.
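
The audio-features records returned by the web API encode `key` as an integer pitch class (0 = C, 1 = C#/Db, and so on, with -1 when no key was detected) and `mode` as 1 for major and 0 for minor. Below is a minimal sketch of turning those numbers into readable labels. The helpers `decode_key` and `decode_mode` and the sample record are made up for illustration and are not necessarily how modules.py does it.

```python
# Pitch-class names for Spotify's integer `key` field (0 = C, 1 = C#/Db, ...).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def decode_key(key):
    """Return a pitch-class name, or "unknown" when no key was detected (-1)."""
    if 0 <= key < len(PITCH_CLASSES):
        return PITCH_CLASSES[key]
    return "unknown"


def decode_mode(mode):
    """Return "Major" for 1 and "minor" for anything else (the API uses 0 for minor)."""
    return "Major" if mode == 1 else "minor"


# Hypothetical audio-features record, trimmed to the fields used here.
features = {"danceability": 0.61, "energy": 0.88, "key": 9, "mode": 0}
label = f"{decode_key(features['key'])} {decode_mode(features['mode'])}"
print(label)  # A minor
```

The "Major" and "minor" spellings are chosen to match the legend labels used in the scatterplot further down.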

#### Setup: 

The following cell runs the setup for the graph. setup.py primarily imports all necessary bokeh, pandas and spotipy modules, along with the functions that I wrote to extract data and transform it into the desired objects. These functions can be found in modules.py. The setup also provides an authentication token for the API. All requests go through the Spotify app that I registered and require this authentication.

Here we also set the default username for the Spotify data. The graphs will display my personal playlists, but if you want to see your own you can change it using the button above the graph. 

The setup also generates a user_object: a dictionary that contains 1) a dictionary with a dataframe for every public playlist the user has and 2) a list of the names and IDs of these playlists. The purpose of having all this data in one object is to make the widgets faster, since they can switch playlists without requesting anything new from Spotify.
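
To make that layout concrete, here is a rough sketch of how such an object could be assembled and read. `build_user_object` is a hypothetical stand-in, not the `objectify_user` function from modules.py, and plain dicts of columns stand in for the pandas dataframes so the example has no dependencies. The playlist names, IDs and songs are invented.

```python
def build_user_object(playlists):
    """Bundle playlist tables and a (name, id) index into one dictionary.

    `playlists` is a list of (name, playlist_id, table) tuples. In the notebook
    each table would be a pandas dataframe; plain dicts of columns stand in here.
    """
    return {
        "dataframes": {name: table for name, _, table in playlists},
        "playlists": [(name, playlist_id) for name, playlist_id, _ in playlists],
    }


# Made-up playlist data for illustration only.
demo = build_user_object([
    ("Metal", "id-001", {"song": ["Song A"], "artist": ["Band A"], "mode": ["minor"]}),
    ("Rock", "id-002", {"song": ["Song B"], "artist": ["Band B"], "mode": ["Major"]}),
])

first_name = demo["playlists"][0][0]          # name of the first playlist
first_table = demo["dataframes"][first_name]  # its table, looked up by name
print(first_name, first_table["song"])
```

The two lookups at the end mirror how the plotting code picks its initial playlist: the name comes from the list, and the name is then the key into the dictionary of tables.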

In [1]:
%run setup.py

#### Interactive bokeh plot

Here's the code for the interactive bokeh plot that takes a *user_object* and graphs the song data for a playlist. As you can see, you may choose what playlist to visualize and determine the axes yourself! __[To find out more on how the variables are defined, click here](https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/).__

Use the button to change the username so you can visualize your friend's or your own data. 
After entering a <u>valid</u> username and pressing the button, wait a few seconds; you will then be instructed to rerun the cell above to update the graph (there is no need to restart the kernel). 

Notes: 

- Sometimes the "danceability" option will return a blank graph. If this happens, pick a different option or just rerun the cell. 

- Sometimes, after switching users, a cascade of warnings will follow. This is a bug that occurs when replacing graphs in the layout. GitHub doesn't have an easy solution for it, so the warnings can be ignored; they have no functional implications for the code. 
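
One possible cause of a blank graph like the one in the first note is an axis name that does not match any column in the data source, for example because of a spelling difference; this is a guess rather than a confirmed diagnosis. A quick check along these lines could rule it in or out. `missing_columns` is a hypothetical helper and the column and option lists are invented for the example.

```python
def missing_columns(options, columns):
    """Return the axis options that have no matching column, in their original order."""
    available = set(columns)
    return [name for name in options if name not in available]


# Hypothetical column names and dropdown options for illustration.
columns = ["song", "artist", "danceability", "energy", "valence"]
options = ["danceabilty", "energy", "valence"]  # first entry is misspelled on purpose

print(missing_columns(options, columns))  # ['danceabilty']
```

An empty list means every dropdown option has a column to plot.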

In [2]:
# button for switching user
def modify_button(doc):
    button = Button(label="Switch user")
    # named text_input to avoid shadowing the built-in input()
    text_input = TextInput(value="enter Spotify username or ID here")
    output = Paragraph()
    output2 = Paragraph()

    # button callback: rebuild the global user_object for the entered username
    def update_button():
        global user_object
        username = str(text_input.value)
        user_object = objectify_user(username)
        output.text = "Hello " + username + ", you can now run the cell above for your song data"
        output2.text = "click ^this^ cell and press Shift + Enter"

    button.on_click(update_button)
    layout = column(output2, text_input, button, output)
    doc.add_root(layout)

In [4]:
# bokeh interactive graphs
def modify_doc(doc):
    global user_object
    
    source = ColumnDataSource(data = user_object["dataframes"][user_object["playlists"][0][0]])
    
    playlist = Select(title="Playlist", value=user_object["playlists"][0][0], options=[x[0] for x in user_object["playlists"]])  
    
    # scatterplot with song features   
    def create_figure():
        
        hover = HoverTool(tooltips=[("title: ", "@song"), ("artist: ", "@artist")])     
  
        p = figure(plot_width = 600, plot_height = 400, tools = [hover])
        p.circle(px.value, 
                 py.value, 
                 size = 12, 
                 source = source,
                 color= factor_cmap('mode',['darkmagenta','darkturquoise'],["Major", "minor"]),
                 legend_field='mode')       

        p.xaxis.axis_label = px.value
        p.yaxis.axis_label = py.value 
        p.axis.axis_label_text_color = "darkmagenta"
        p.axis.axis_label_text_font_style = "bold"
        p.axis.axis_label_text_font_size = "15pt"
        p.title.text = f"'{playlist.value}': {px.value} vs {py.value}"
        p.title.text_font_size = "15pt"
        p.background_fill_color = "black"
        p.background_fill_alpha = 0.9
        p.outline_line_width = 7
        p.outline_line_alpha = 0.6
        p.outline_line_color = "darkmagenta"
        p.grid.grid_line_alpha = 0.3
        p.grid.band_hatch_pattern = "x"
        p.legend.background_fill_alpha = 0.4
        p.legend.location = "top_left"
        p.legend.border_line_color = "darkmagenta"
        p.legend.background_fill_color = "black"
        p.legend.label_text_color = "white"
        return p
    
    # bar plot with key signatures
    def key_plot():
        keys = Counter(list(source.data["key"]))
        key_sigs = list(keys.keys())
        key_counts = list(keys.values())
        
        k = figure(x_range = key_sigs, plot_height = 300, plot_width = 380, title = "key signatures in this playlist", tools = "")
        k.vbar(x= key_sigs, bottom = 0, top= key_counts, width = 0.3, color = "darkturquoise")
        k.title.text_font_size = "15pt"
        k.background_fill_color = "black"
        k.background_fill_alpha = 0.9
        k.outline_line_width = 7
        k.outline_line_alpha = 0.6
        k.outline_line_color = "darkmagenta"
        k.xgrid.visible = False
        k.grid.grid_line_alpha = 0.3
        
        return k
    
    ### callback functions ###
    
    # change the datasource to the dataframe holding the selected playlist
    def update_playlist(attrname, old, new):
        source.data = user_object["dataframes"][playlist.value]
    playlist.on_change('value', update_playlist)

    # update the graphs using the new parameters
    def update(attrname, old, new):
        source.data = user_object["dataframes"][playlist.value]
        graph.children[1] = create_figure()
        controls.children[2] = key_plot()
        
    
    # options for changing the axes; each name must match a dataframe column exactly
    features = ["danceability", "energy", "loudness", "instrumentalness", "liveness", "valence", "popularity"]
    px = Select(title = "x-axis", 
                    value = "danceability", 
                    options = features
                   )
    py = Select(title = "y-axis", 
                    value = "energy", 
                    options = features
                   )
    px.on_change("value", update)
    py.on_change("value", update)
    playlist.on_change("value", update)
    
    # create a layout for everything
    controls = column(px, py, key_plot())
    graph = column(playlist, create_figure())
    layout = row(graph, controls)
    
    doc.add_root(layout)

# display the whole thing
show(modify_button)
show(modify_doc)




__[What do these variables mean?](https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/)__