# Visualizing Spotify Data

Python script by Darby Bates
Based on script from https://github.com/elb0/florence-nightingale-uoft-22/blob/master/plotting.Rmd
Output: HTML files

## Welcome to the Pursue STEM CBSNY 2022 Activity!
If you haven't opened this Python notebook in Google Colab yet, follow the instructions in the "readme.md" file in the Spotify Activity folder on GitHub: https://github.com/DarbyBates/PursueSTEM/blob/main/Spotify%20Activity/readme.md

If you've opened this Python notebook in Google Colab, follow these instructions (in order!) to import the data we'll be working with into your Google Colab session and run this Python script for the first time.

1. On the upper left-hand side of the screen is a column of symbol buttons. Look below the '{x}' symbol and click the folder symbol you see there.
2. A Files tab should open. Click the symbol of a piece of paper with an arrow pointing straight up into it. This is the "Upload to session storage" button.
3. Select the "for_students.csv" file you downloaded from GitHub. You'll find it here: https://github.com/DarbyBates/PursueSTEM/tree/main/Spotify%20Activity
4. In the menu bar at the top of the page is a "Runtime" tab. Click "Runtime", then "Run all".

If you scroll to the bottom of the page, you should see a plot of Energy versus Positiveness (valence) for a few musical artists. If you hover your mouse over a point on the plot, the artist, the name of the song, and the valence and energy values should pop up! You can toggle individual artists on or off by clicking their names in the legend on the right of the plot.
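
If "Run all" stops with a file-not-found error, the upload probably did not finish. Here is a minimal sketch of a check you could run in a new cell. It assumes the file kept its original name and sits in the session's working folder. The helper name `csv_is_uploaded` is made up for this illustration and is not part of the activity script.

```python
from pathlib import Path

def csv_is_uploaded(filename="for_students.csv"):
    """Return True if the file exists in the current working folder."""
    # Hypothetical helper for illustration; not part of the original script.
    return Path(filename).is_file()

if csv_is_uploaded():
    print("Found for_students.csv, ready to run!")
else:
    print("for_students.csv is missing, try the upload step again.")
```

If the check reports the file as missing, repeat the upload step and then choose "Run all" again.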


In [4]:
##Don't change anything in this cell.
#Header
import pandas as pd #for dataframes
import plotly.express as px #for plotly
import plotly.graph_objects as go

In [5]:
##Don't change anything in this cell.
#Load the CSV file into a pandas dataframe. Make sure the .csv file is in the same folder as this notebook.

#A dataframe is a python-readable array of information, 
#like a table of values, that is formatted to be easy to access with code.
#Here, we load the dataframe and name it "dataset", then we isolate the artist names and filter for unique ones.
#This way we can print out a list of all the artist names that we have data for.

dataset=pd.read_csv("for_students.csv")
allnames=dataset['artist_name']
uniquenames=allnames.unique()
print(uniquenames) #Print the artist names included in the dataset

['Bruno Mars' 'Daniel Caesar' 'Billie Eilish' 'Nicki Minaj'
 'Ariana Grande' 'Taylor Swift' 'Lorde' 'Queen' 'Tyler, The Creator'
 'Fall Out Boy' 'NF' 'GOT7' 'Melanie Martinez' 'Harry Styles' 'Rihanna'
 'Rod Wave' 'Kendrick Lamar' 'Rex Orange County' 'Jeff Bernat' 'Mac Ayres'
 'Toosii' 'Lil Tjay' 'Polo G' 'BTS' 'Mitski' 'The Strokes'
 'The Neighbourhood' 'Drake' 'The Weeknd' 'Burna Boy' 'Madison Beer'
 'Shawn Mendes' 'Eve' 'YOASOBI' 'The Orion Experience' 'Pink Floyd'
 'Chemlab' 'Tame Impala' 'Loud Luxury' 'anders' 'The Chainsmokers'
 'Troye Sivan' 'Jack Harlow' 'DaBaby' 'Lil Wayne' 'Tory Lanez' 'ENHYPEN'
 'Justin Bieber' 'Giveon' 'Ne-Yo' 'Phoebe Bridgers' 'Bad Bunny'
 'Daft Punk' '3MFrench' 'Bvlly' 'Mustard' 'Denzel Curry' "Twelve'len"
 'GoldLink' 'Glen Campbell' 'The Jackson 5' 'Kanye West' 'Hippo Campus'
 'Dayglow' 'Jeremy Soule' 'NIKI' 'Olivia Rodrigo' 'Destiny Rogers'
 'Rick Astley' '物語シリーズ' 'Toby Fox' 'Summer Walker' 'Koffee' 'Brent Faiyaz'
 'H.E.R.' 'Afro B' 'Alessia Cara' 'Nirva
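
To see what `unique()` and `isin()` do without the full dataset, here is a small sketch on a made-up table. The artist names, song titles, and numbers are invented for illustration and are not values from "for_students.csv". Only the column names match the ones this notebook uses.

```python
import pandas as pd

# Made-up rows for illustration only; these are NOT values from for_students.csv.
toy = pd.DataFrame({
    "artist_name": ["Artist A", "Artist B", "Artist A", "Artist C"],
    "track_name": ["Song 1", "Song 2", "Song 3", "Song 4"],
    "valence": [0.80, 0.25, 0.60, 0.40],
    "energy": [0.70, 0.30, 0.90, 0.55],
})

# unique() keeps each name once, in the order it first appears.
toy_names = toy["artist_name"].unique()
print(toy_names)

# isin() marks the rows whose artist is in our list; .loc keeps only those rows.
chosen = ["Artist A", "Artist C"]
toy_filtered = toy.loc[toy["artist_name"].isin(chosen)]
print(toy_filtered)
```

"Artist A" has two songs in the toy table but shows up only once in `toy_names`, and the "Artist B" row is dropped from `toy_filtered`.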

In [6]:
##Decide whether you want the plot to have 
# all of the artists,
# two random artists from the list, or 
# particular artists you name.

# Choose your option by making that variable = 1 and the others = 0
random=0
all=1
particular=0

#RANDOM artist comparison
if random==1:
    all=0
    particular=0
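    #Pick two different artists. Sampling from the unique names means the same artist
    # can't be picked twice, and artists with more songs aren't more likely to be picked.
    artist_names=allnames.drop_duplicates().sample(2).tolist()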

#ALL artists
if all==1:
    random=0
    particular=0
    artist_names=uniquenames

##If you want to choose PARTICULAR artists, enter those artist names here
# in single quotes and separated by commas. Spelling counts, so copy each name exactly as printed above!
if particular==1:
    all=0
    random=0
    #artist_names=[]
    artist_names=['Toosii','Fall Out Boy', 'Mitski','Jeff Bernat','Taylor Swift']

#Here, we filter the data based on your selections above.
datafilter=dataset.loc[dataset['artist_name'].isin(artist_names)]

#Now that you've selected which data we want to plot, let's work on the plot itself.    
    
#This section plots the filtered data (the pandas dataframe "datafilter") using a plotting library called plotly.
fig = px.scatter(datafilter,x="valence", y="energy",hover_data=['track_name'],color="artist_name", 
                 labels={"track_name":"Song Title","valence":"Positiveness (valence)",
                         "energy":"Energy","artist_name":"Artist(s)"},
                width=800,height=700)

#Set axes and labels for axes
fig.update_yaxes(range=[-0.01, 1.01])
fig.update_xaxes(range=[-0.01, 1.01])
fig.update_xaxes(title_text='Positiveness (valence)')
fig.update_yaxes(title_text='Energy')
#Add quadrant dividing lines
fig.add_hline(y=0.5, line_width=3)
fig.add_vline(x=0.5, line_width=3)
#Label the quadrants
#If you want to label your quadrants with your own descriptions, add them 
# here in the text=["Q1", "Q2", "Q3","Q4"] lines, 
# replacing the Q1 or Q2 or Q3 or Q4 with what you want that quadrant to say on it.
fig.add_trace(go.Scatter(
    x=[0.95],#These x-coordinates tell you where the text will be. 
    y=[0.95],#These y-coordinates tell you where the text will be. 
    mode="text",
    name="Text",
    text=["Q1"],#This is the line which says what text to print on the graph.
    textfont=dict(size=36,color="lightslategrey"),
    textposition="middle center",showlegend=False
))

fig.add_trace(go.Scatter(
    x=[0.95],#These x-coordinates tell you where the text will be. 
    y=[0.05],#These y-coordinates tell you where the text will be. 
    mode="text",
    name="Text",
    text=["Q2"],#This is the line which says what text to print on the graph.
    textfont=dict(size=36,color="lightslategrey"),
    textposition="middle center",showlegend=False
))
fig.add_trace(go.Scatter(
    x=[0.05],#These x-coordinates tell you where the text will be. 
    y=[0.05],#These y-coordinates tell you where the text will be. 
    mode="text",
    name="Text",
    text=["Q3"],#This is the line which says what text to print on the graph.
    textfont=dict(size=36,color="lightslategrey"),
    textposition="middle center",showlegend=False
))
fig.add_trace(go.Scatter(
    x=[0.05],#These x-coordinates tell you where the text will be. 
    y=[0.95],#These y-coordinates tell you where the text will be. 
    mode="text",
    name="Text",
    text=["Q4"],#This is the line which says what text to print on the graph.
    textfont=dict(size=36,color="lightslategrey"),
    textposition="middle center",showlegend=False
))


# There are several default color schemes and layouts
# you can choose from at https://plotly.com/python/templates/
# Replace "simple_white" with the template you'd like to pick. Be sure to capitalize as shown on the website!
fig.update_layout(template="simple_white")


#Save plot as html file and open in browser.
if all==1:
    fig.write_html('all'+'.html', auto_open=True)
if random==1 or particular==1:
    fig.write_html(''.join(artist_names)+'.html', auto_open=True)
#Show plot below in python.
fig.show()
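
The quadrant labels on the plot can also be worked out in code. The sketch below assumes the same 0.5 dividing lines and the same label positions as the plot (Q1 top right, Q2 bottom right, Q3 bottom left, Q4 top left). The function name `quadrant` and the sample songs are made up for this illustration. Counting a point that sits exactly on a dividing line toward the higher side is an assumption too.

```python
from collections import Counter

def quadrant(valence, energy):
    """Return the quadrant label for one song, using the plot's 0.5 dividing lines."""
    # Assumption: a value of exactly 0.5 counts as "high".
    if valence >= 0.5 and energy >= 0.5:
        return "Q1"  # top right: positive and energetic
    if valence >= 0.5:
        return "Q2"  # bottom right: positive but low energy
    if energy < 0.5:
        return "Q3"  # bottom left: less positive and low energy
    return "Q4"      # top left: less positive but energetic

# Made-up (valence, energy) pairs for illustration; NOT values from the dataset.
sample_songs = [(0.9, 0.8), (0.7, 0.2), (0.1, 0.3), (0.2, 0.9), (0.6, 0.6)]

# Count how many of the sample songs land in each quadrant.
counts = Counter(quadrant(v, e) for v, e in sample_songs)
print(counts)
```

To try this on the real data, one option is `datafilter.apply(lambda row: quadrant(row["valence"], row["energy"]), axis=1)`, which should give one label per song.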