# Creating the Lollapalooza charts from Déborah Mesquita's tutorial

[Tutorial on freecodecamp](https://www.freecodecamp.org/news/how-and-why-i-used-plotly-instead-of-d3-to-visualize-my-lollapalooza-data-d48345e2ca68/)

The following text is copied from the article and placed in this notebook to make it easier for you to render the charts.

Déborah's article provides code that uses plotly.graph_objects. Because we are using plotly.express, this notebook provides plotly.express versions of the charts.

The imports are covered once at the start of the notebook; you will need to run that cell before you try to run any of the chart cells.

For this activity you are only asked to read and follow along; you don't need to modify the code. However, you could stretch yourself and try to amend or improve some of the charts!


In [None]:
import plotly.express as px
import pandas as pd
import plotly.io as pio
pio.renderers.default = "notebook"

## My Lollapalooza experience

For the 2018 edition of Lollapalooza Brazil, all purchases were made through an RFID-enabled wristband. They send the data to your email address, so I decided to take a look at it. What can we learn about me and my experience by analyzing the purchases I made at the festival?

This is how the data looks:

- purchase date
- purchase hour
- product
- quantity
- stage
- place where I made the purchase

Based on this data, let’s answer some questions.

## Where did I go during the festival?

The data only tells us the name of the location where I made the purchase, and the festival took place at Autódromo de Interlagos. 

I took the map with the stages from here and used the georeferencer tool from georeference.com to get the latitude and longitude coordinates for the stages.

We need to display a map and the markers for each purchase, so we will use Mapbox and the scattermapbox trace. 

You will need to [generate a mapbox token from the mapbox site](https://www.mapbox.com/help/define-access-token/) and then place your token in the first line of code after the import statements in the cell below. You will need to create an account to do this. If you don't feel comfortable signing up, you can skip this chart and move on to the next.

First let’s plot only the stages to see how this works:

In [None]:
mapbox_token = "" # Add your mapbox token here

px.set_mapbox_access_token(mapbox_token)

df = pd.read_csv("stages.csv")

fig = px.scatter_mapbox(df,
                        lat="latitude",
                        lon="longitude",
                        color="stage",
                        center=dict(lat=-23.701057, lon=-46.6970635),
                        hover_name="stage",
                        zoom=14.5,
                        title="Lollapalooza Brazil 2018 map")

fig.show()

## How did I spend my money?
To answer that, I created a bar chart of my spending on food and beverages for each day, and built a heatmap to show when I bought stuff.

We already saw how to build a bar chart (in the previous exercise):

In [None]:
df = pd.read_csv("purchase_data.csv")

# Add a new column: spend = price * quantity
df['spend'] = df['price'] * df['quantity']

# Only keep the columns we need
df = df[['date', 'place', 'spend']]

# Group the data by the date and then by place and sum the amount spent
df = df.groupby(['date','place']).sum().reset_index()

fig = px.bar(df, x="spend", y="date", color="place", title="Purchases by place")

fig.show()

Now let’s build a heatmap chart:

In [None]:
df = pd.read_csv("purchase_data.csv")

# Prepare the data
df_heatmap = df.pivot_table(index="date",values="price",columns="hour", aggfunc="sum").fillna(0)

# Create the heatmap
fig = px.imshow(df_heatmap)
fig.show()
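
To see what the "prepare the data" step hands to `px.imshow`, it can help to run `pivot_table` on a tiny example. The sketch below uses made-up purchases (not the real `purchase_data.csv`) and assumes the `hour` column holds whole hours as integers; the real file may store it differently.

```python
import pandas as pd

# Made-up purchases for illustration only; not the real purchase_data.csv.
purchases = pd.DataFrame({
    "date": ["2018-03-23", "2018-03-23", "2018-03-23", "2018-03-24"],
    "hour": [14, 14, 19, 16],
    "price": [10.0, 5.0, 20.0, 8.0],
})

# One row per date, one column per hour, each cell the summed price.
# Date/hour combinations with no purchase come out as NaN, so fill them with 0.
heat = purchases.pivot_table(
    index="date", columns="hour", values="price", aggfunc="sum"
).fillna(0)

print(heat)
```

In this toy table the two purchases at hour 14 on 2018-03-23 are summed into a single cell of 15.0, and hours with no purchase on a given day show 0. Each cell then becomes one coloured square in the heatmap.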

## Which concerts did I watch?
Now let’s go to the coolest part: could I guess the concerts I attended based only on my purchases?

Ideally, when we are watching a show, we are watching the show (and not buying stuff), so the purchases should be made before or after each concert. For each purchase, I then made a list of the concerts happening one hour before, one hour after, and at the time the purchase was made.
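
The article does not show the code for this step, so the following is only a rough sketch of one way to do it. It assumes that "one hour before, one hour after" means any concert whose time slot overlaps the window from one hour before to one hour after the purchase. The schedule entries and the `candidate_concerts` helper are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical schedule; the names and times are placeholders, not the real line-up.
schedule = [
    {"concert": "Band A", "stage": "Stage 1",
     "start": datetime(2018, 3, 23, 15, 0), "end": datetime(2018, 3, 23, 16, 0)},
    {"concert": "Band B", "stage": "Stage 2",
     "start": datetime(2018, 3, 23, 16, 30), "end": datetime(2018, 3, 23, 17, 30)},
    {"concert": "Band C", "stage": "Stage 1",
     "start": datetime(2018, 3, 23, 20, 0), "end": datetime(2018, 3, 23, 21, 0)},
]


def candidate_concerts(purchase_time, schedule, window=timedelta(hours=1)):
    """Return concerts whose slot overlaps [purchase_time - window, purchase_time + window]."""
    earliest = purchase_time - window
    latest = purchase_time + window
    return [c for c in schedule if c["start"] <= latest and c["end"] >= earliest]


purchase_time = datetime(2018, 3, 23, 16, 10)
candidates = candidate_concerts(purchase_time, schedule)
print([c["concert"] for c in candidates])
```

For a purchase at 16:10 the window runs from 15:10 to 17:10, so Band A and Band B are candidates and Band C is not.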

To find out which one of these shows I attended, I calculated the distance from the location of the purchase to each stage. The shows I attended should be the ones with the shortest distance to the concessions.
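
The article does not say which distance formula was used. One common choice for latitude/longitude pairs is the haversine (great-circle) distance, sketched below as an assumption. The coordinates are invented for illustration, and `haversine_m` and `nearest_stage` are hypothetical helper names.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def nearest_stage(point, stages):
    """Return the name of the stage closest to point, where point is (lat, lon)."""
    return min(stages, key=lambda name: haversine_m(*point, *stages[name]))


# Invented coordinates, not the real stage or concession positions.
stages = {
    "Stage 1": (-23.7005, -46.6985),
    "Stage 2": (-23.7020, -46.6950),
}
purchase_location = (-23.7008, -46.6980)

print(nearest_stage(purchase_location, stages))
```

Combined with a list of candidate concerts for the purchase time, the stage returned here would point to the show most likely attended.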

As we want to show each data point, the best choice for a visualization is a table. Let’s build one:


In [None]:
# Plotly graph objects version as there isn't a table object in plotly express
import plotly.graph_objects as go

df_table = pd.read_csv("concerts_I_attended.csv")

def colorFont(x):    
    if x == "Yes":       
        return "rgb(0,0,9)"    
    else:       
        return "rgb(178,178,178)"
    
df_table["color"] = df_table["correct"].apply(colorFont)

trace_table = go.Table(header=dict(values=["Concert","Date","Correct?"], fill=dict(color=("rgb(82,187,47)"))),
                       cells=dict(values= [df_table.concert, df_table.date,df_table.correct], font=dict(color=([df_table.color]))))

data = [trace_table]

figure = go.Figure(data = data)

figure.show()