# Creating the Lollapalooza charts from Déborah Mesquita's tutorial

[Tutorial on freecodecamp](https://www.freecodecamp.org/news/how-and-why-i-used-plotly-instead-of-d3-to-visualize-my-lollapalooza-data-d48345e2ca68/)

The following text is copied from the article and placed in this notebook to make it easier for you to render the charts.

Déborah's article provides code that uses plotly.graph_objects. Because we mostly use plotly.express, both versions of each chart are provided here. If you skipped the previous exercise, focus only on the Express version of each chart.

The imports are covered once at the start of the notebook. Run that cell before you try to run any of the chart cells.

The use of Plotly Express is recommended by Plotly in their ebook 'Top 10 Dash Enterprise Tips & Tricks':

"**Tip 3 Import Plotly Express to save time when updating figures**
Plotly Express is a high-level Python visualization library and wrapper for Plotly.py that exposes a simple syntax for complex charts. Users can create figures using Plotly Express one-line functions in addition to the traditional graph objects data and layout syntax. It allows the user to focus on the higher callback logic of their Dash app rather than get stuck in the details of the data visualization layer. One example of a Plotly Express shortcut is for creating and updating figures in Python. After importing Plotly Express, you can create and update figures faster with just a few lines of code using the .update_* and .add_* functions."




In [71]:
import plotly.graph_objects as go
import plotly.express as px
import pandas as pd
import plotly.io as pio
pio.renderers.default = "notebook"

## My Lollapalooza experience

For the 2018 edition of Lollapalooza Brazil, all purchases were made through an RFID-enabled wristband. They send the data to your email address, so I decided to take a look at it. What can we learn about me and my experience by analyzing the purchases I made at the festival?

This is how the data looks:

- purchase date
- purchase hour
- product
- quantity
- stage
- place where I made the purchase

Based on this data, let’s answer some questions.

## Where did I go during the festival?

The data only tells us the name of the location where I made the purchase, and the festival took place at Autódromo de Interlagos. 

I took the map with the stages from here and used the georeferencer tool from georeference.com to get the latitude and longitude coordinates for the stages.

We need to display a map and the markers for each purchase, so we will use Mapbox and the scattermapbox trace. 

You will need to [generate a Mapbox token from the Mapbox site](https://www.mapbox.com/help/define-access-token/) and then place your token in the first line of code after the comment in each of the two cells below. You will need to create an account to do this. If you are not comfortable signing up, you can skip this chart and move on to the next.
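
Pasting a token straight into a cell makes it easy to leak when the notebook is shared. As a minimal sketch of one alternative, the token could be read from an environment variable instead. The variable name `MAPBOX_TOKEN` and the helper `load_mapbox_token` are illustrative assumptions, not part of the original tutorial.

```python
import os


def load_mapbox_token(var_name="MAPBOX_TOKEN"):
    """Return the Mapbox token stored in an environment variable.

    Both the variable name and this helper are hypothetical; adapt them
    to however you manage secrets.
    """
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {var_name} environment variable to your Mapbox token."
        )
    return token
```

With the variable set before the notebook starts, `mapbox_token = load_mapbox_token()` could stand in for the string literal in the cells that follow.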

First let’s plot only the stages to see how this works:

In [73]:
# Plotly graph objects version

mapbox_token = "pk.eyJ1Ijoic2FyYWhzYW5kZXJzIiwiYSI6ImNrZHU5Y2hqeTI2aGYyd3R2ajVjOWVtenMifQ.EKpDFYzW2nTwiILPwecc0A"

df = pd.read_csv("../data/stages.csv")

# go.Marker is deprecated, so the marker settings are passed as a plain dict
trace = go.Scattermapbox(lat=df["latitude"], lon=df["longitude"], text=df["stage"],
                         marker=dict(size=10), mode="markers+text", textposition="top left")

data = [trace]

layout = go.Layout(mapbox=dict(accesstoken=mapbox_token, center=dict(lat = -23.701057,lon = -46.6970635), zoom=14.5))

figure = go.Figure(data = data, layout = layout)

figure.show()

In [74]:
# Plotly Express version

mapbox_token = "pk.eyJ1Ijoic2FyYWhzYW5kZXJzIiwiYSI6ImNrZHU5Y2hqeTI2aGYyd3R2ajVjOWVtenMifQ.EKpDFYzW2nTwiILPwecc0A" #https://www.mapbox.com/help/define-access-token/

px.set_mapbox_access_token(mapbox_token)

df = pd.read_csv("../data/stages.csv")

fig = px.scatter_mapbox(df, 
                        lat="latitude", 
                        lon="longitude", 
                        color="stage",
                        center=dict(lat = -23.701057,lon = -46.6970635),
                        hover_name="stage",
                        zoom=14.5,
                        title='Lollapalooza Brazil 2018 map')

fig.show()

## How did I spend my money?
To answer that, I created a bar chart of my spending on food and beverages for each day and built a heatmap to show when I bought stuff.

We already saw how to build a bar chart (in the previous exercise):

In [75]:
# Plotly graph objects version

df = pd.read_csv("../data/data.csv")

df_purchases_by_place = df.pivot_table(index="place",columns="date",values="price",aggfunc="sum").fillna(0)

data = []

for index,place in df_purchases_by_place.iterrows():
    trace = go.Bar(x = df_purchases_by_place.columns, y = place, name=index)
    data.append(trace)

layout = go.Layout(title="Purchases by place", showlegend=True, barmode="stack")

figure = go.Figure(data=data, layout=layout)

figure.show()

In [76]:
# Plotly express version

df = pd.read_csv("../data/data.csv")

# Add a new column based on price * quantity
df['spend'] = df['price'] * df['quantity']

# Only keep the columns we need
df = df[['date', 'place', 'spend']]

# Group the data by the date and then by place and sum the amount spent
df = df.groupby(['date','place']).sum().reset_index()

fig = px.bar(df, x="spend", y="date", color="place", title="Purchases by place")

fig.show()
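
The two bar chart cells shape the data differently before plotting: the graph objects version pivots it into a wide table (one row per place, one column per date) and loops over the rows, while Plotly Express works from a long table with one row per date and place. The sketch below shows both shapes side by side. The rows are made up for illustration and are not the festival data.

```python
import pandas as pd

# Made-up purchases for illustration only; not the festival data.
df = pd.DataFrame(
    {
        "date": ["day 1", "day 1", "day 2", "day 2"],
        "place": ["Bar A", "Bar A", "Bar A", "Food B"],
        "price": [10.0, 5.0, 8.0, 20.0],
    }
)

# Wide format: one row per place, one column per date.
# This is the shape the graph objects loop iterates over.
wide = df.pivot_table(
    index="place", columns="date", values="price", aggfunc="sum"
).fillna(0)

# Long format: one row per (date, place) pair.
# This is the shape passed to px.bar with color="place".
long = df.groupby(["date", "place"], as_index=False)["price"].sum()

print(wide)
print(long)
```

Both tables hold the same totals. The wide one fills missing combinations with 0, whereas the long one simply omits them.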

Now let’s build a heatmap chart:

In [77]:
# Plotly graph objects version

df = pd.read_csv("../data/data.csv")

df_purchases_by_type = df.pivot_table(index="place",columns="date",values="price",aggfunc="sum").fillna(0)

df["hour_int"] = pd.to_datetime(df["hour"], format="%H:%M", errors='coerce').apply(lambda x: int(x.hour))

df_heatmap = df.pivot_table(index="date",values="price",columns="hour", aggfunc="sum").fillna(0)

trace_heatmap = go.Heatmap(x = df_heatmap.columns, 
                           y = df_heatmap.index, 
                           z = [df_heatmap.iloc[0], 
                                df_heatmap.iloc[1], 
                                df_heatmap.iloc[2]])

data = [trace_heatmap]

layout = go.Layout(title="Purchases by place", showlegend=True)

figure = go.Figure(data=data, layout=layout)

figure.show()

In [78]:
# Plotly Express version

df = pd.read_csv("../data/data.csv")

# Prepare the data

#df["hour_int"] = pd.to_datetime(df["hour"], format="%H:%M", errors='coerce').apply(lambda x: int(x.hour))

df_heatmap = df.pivot_table(index="date",values="price",columns="hour", aggfunc="sum").fillna(0)

# Create the heatmap

fig = px.imshow(df_heatmap, title="Purchases by place")

fig.show()

## Which concerts did I watch?
Now let’s go to the coolest part: could I guess the concerts I attended based only on my purchases?

Ideally, when we are watching a show, we are watching the show (and not buying stuff), so the purchases should be made before or after each concert. For each purchase, I then listed the concerts that took place in the hour before it, in the hour after it, or at the time it was made.
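
The article does not show how this list was built. One way to read the description is that a concert is a candidate for a purchase when the purchase time falls between one hour before the concert starts and one hour after it ends. The sketch below implements that reading; the schedule entries, the function name `candidate_concerts`, and the one-hour `window` parameter are all assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical schedule entries: (concert name, start, end).
schedule = [
    ("Concert A", datetime(2018, 3, 23, 16, 0), datetime(2018, 3, 23, 17, 0)),
    ("Concert B", datetime(2018, 3, 23, 18, 30), datetime(2018, 3, 23, 19, 30)),
    ("Concert C", datetime(2018, 3, 23, 21, 0), datetime(2018, 3, 23, 22, 0)),
]


def candidate_concerts(purchase_time, schedule, window=timedelta(hours=1)):
    """Return concerts whose time span, padded by `window`, contains the purchase."""
    return [
        name
        for name, start, end in schedule
        if start - window <= purchase_time <= end + window
    ]


# A purchase at 17:45 falls after Concert A and before Concert B.
print(candidate_concerts(datetime(2018, 3, 23, 17, 45), schedule))
# → ['Concert A', 'Concert B']
```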

To find out which one of these shows I attended, I calculated the distance from the location of the purchase to each stage. The shows I attended should be the ones with the shortest distance to the concessions.
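
The article does not say which distance formula was used. A common choice for latitude and longitude pairs is the haversine (great-circle) distance, sketched below as an assumption. The stage names and coordinates are placeholders near the venue, not the values from stages.csv.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_stage(point, stages):
    """Return the name of the stage closest to `point`, a (lat, lon) tuple."""
    return min(stages, key=lambda name: haversine_km(*point, *stages[name]))


# Placeholder coordinates for illustration only.
stages = {
    "Stage 1": (-23.7000, -46.6990),
    "Stage 2": (-23.7030, -46.6950),
}
purchase = (-23.7005, -46.6985)

print(nearest_stage(purchase, stages))
# → Stage 1
```

Over an area as small as a festival site, a flat-plane approximation would give almost the same ranking, so the choice of formula matters little here.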

As we want to show each data point, the best choice for a visualization is a table. Let’s build one:


In [79]:
# Plotly graph objects version

df_table = pd.read_csv("../data/concerts_I_attended.csv")

def colorFont(x):    
    if x == "Yes":       
        return "rgb(0,0,9)"    
    else:       
        return "rgb(178,178,178)"
    
df_table["color"] = df_table["correct"].apply(lambda x: colorFont(x))

trace_table = go.Table(header=dict(values=["Concert","Date","Correct?"], fill=dict(color=("rgb(82,187,47)"))),
                       cells=dict(values= [df_table.concert, df_table.date,df_table.correct], font=dict(color=([df_table.color]))))

data = [trace_table]

figure = go.Figure(data = data)

figure.show()

In [None]:
# Plotly Express version

# Plotly Express has no equivalent of go.Table, so use the graph objects version above

Three concerts were missing and four were incorrect, giving us a precision of 67% and recall of 72%.
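
The arithmetic behind those two figures can be checked in a few lines. The sketch assumes 8 correct guesses, the number quoted in the dashboard text of the original article, together with the 4 incorrect and 3 missing concerts stated above. The function name is illustrative.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw counts."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall


# 8 correct guesses, 4 incorrect guesses, 3 attended concerts that were missed.
precision, recall = precision_recall(8, 4, 3)
print(f"precision = {precision:.1%}, recall = {recall:.1%}")
# → precision = 66.7%, recall = 72.7%
```

Precision is the share of guessed concerts that were right (8 of 12), and recall is the share of attended concerts that were found (8 of 11).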

## Putting it all together: dash
We have all the charts, but the goal is to put them all together on a page. To do that we will use Dash (by Plotly).

In [None]:
import dash
import dash_core_components as dcc
import dash_html_components as html
import plotly.graph_objs as go
import pandas as pd

app = dash.Dash()

df_table = pd.read_csv("concerts_I_attended.csv").dropna(subset=["concert"])


def colorFont(x):
    if x == "Yes":
        return "rgb(0,0,9)"
    else:
        return "rgb(178,178,178)"


df_table["color"] = df_table["correct"].apply(lambda x: colorFont(x))

trace_table = go.Table(
    header=dict(values=["Concert", "Date", "Correct?"], fill=dict(color=("rgb(82,187,47)"))),
    cells=dict(values=[df_table.concert, df_table.date, df_table.correct],
               font=dict(color=([df_table.color]))),
)

data_table = [trace_table]

app.layout = html.Div(children=[
    # Page title
    html.Div(
        [
            dcc.Markdown(
                """
                ## My experience at Lollapalooza Brazil 2018
                ***
                """.replace('  ', ''),
                className='eight columns offset-by-two'
            )
        ],
        className='row',
        style=dict(textAlign="center", marginBottom="15px")
    ),
    # Table of guessed concerts
    html.Div([
        html.Div([
            html.H5('Which concerts did I attend?', style=dict(textAlign="center")),
            html.Div('People usually buy things before or after a concert, so I took the list of concerts, got the distances from the location of the purchases to the stages and tried to guess which concerts did I attend. 8 concerts were correct and 3 were missing from a total of 12 concerts.', style=dict(textAlign="center")),
            dcc.Graph(id='table', figure=go.Figure(data=data_table, layout=go.Layout(margin=dict(t=30)))),
        ], className="twelve columns"),
    ], className="row")
])

app.css.append_css({'external_url': 'https://codepen.io/chriddyp/pen/bWLwgP.css'})

if __name__ == '__main__':
    app.run_server(debug=True)
