# Chart Remake - Google Trends: Hangover Cure

This is a remake of a chartr.co plot using Altair. Note that the dataset here covers a different date range than the original chart.

--- 

@date: 05-Sep-2020 | @author: katnoria

In [3]:
import pandas as pd
import altair as alt

In [4]:
def version_info(cls):
    print(f"{cls.__name__}: {cls.__version__}")

In [5]:
print("Library VersionInfo:")
print("-"*20)
version_info(pd)
version_info(alt)

Library VersionInfo:
--------------------
pandas: 0.24.2
altair: 4.1.0


# Data

I have downloaded the data and made it available on GitHub, so we will just use that.

In [6]:
df = pd.read_csv("../data/us-90days.csv", parse_dates=["Day"])

In [7]:
# Basic info
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 90 entries, 0 to 89
Data columns (total 2 columns):
Day                               90 non-null datetime64[ns]
hangover cure: (United States)    90 non-null int64
dtypes: datetime64[ns](1), int64(1)
memory usage: 1.5 KB


In [8]:
# Basic stats
df.describe()

Unnamed: 0,hangover cure: (United States)
count,90.0
mean,29.633333
std,18.847973
min,6.0
25%,16.0
50%,22.0
75%,42.75
max,100.0


In [9]:
# see first few rows
df.head()

Unnamed: 0,Day,hangover cure: (United States)
0,2020-05-30,55
1,2020-05-31,57
2,2020-06-01,20
3,2020-06-02,12
4,2020-06-03,16


We will rename the columns to make them easier to reference later in the code.

In [10]:
df.columns = ['Day', 'Count']
df.head()

Unnamed: 0,Day,Count
0,2020-05-30,55
1,2020-05-31,57
2,2020-06-01,20
3,2020-06-02,12
4,2020-06-03,16


Here, we add a new column holding the day-of-week index, so we can tell whether a given record falls on a weekday or on the weekend.

In [11]:
# dt.weekday counts from Monday = 0, so 5 = Saturday and 6 = Sunday
df['weekday'] = df.Day.dt.weekday
df.head()

Unnamed: 0,Day,Count,weekday
0,2020-05-30,55,5
1,2020-05-31,57,6
2,2020-06-01,20,0
3,2020-06-02,12,1
4,2020-06-03,16,2
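
As a quick sanity check on the numbering, here is a minimal sketch using only the standard library. Python's `date.weekday()` follows the same Monday = 0 convention as `dt.weekday`, so the five dates from the table above should give the same values. The `is_weekend` list is a hypothetical helper for illustration and is not part of the notebook.

```python
from datetime import date

# The five dates shown in the df.head() output above.
days = [
    date(2020, 5, 30),
    date(2020, 5, 31),
    date(2020, 6, 1),
    date(2020, 6, 2),
    date(2020, 6, 3),
]

# date.weekday() counts from Monday = 0, so Saturday = 5 and Sunday = 6.
weekday_index = [d.weekday() for d in days]

# Hypothetical helper flag (not in the notebook): True for Saturday or Sunday.
is_weekend = [i >= 5 for i in weekday_index]

for d, i, w in zip(days, weekday_index, is_weekend):
    print(d.isoformat(), i, "weekend" if w else "weekday")
```

The printed indices line up with the `weekday` column above, and the `>= 5` test is the same one the plot code uses to pick out the weekends.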


# Plot

At this point, we are ready to start making the plot. Let's begin by creating a line chart 📈.

In [12]:
# Keep the weekend dates (Saturday and Sunday); they are used later for the weekend bands
x_tick_values = df[df.weekday >= 5]['Day']
# Add a label for weekday & weekend
df['label'] = df.weekday.apply(lambda x: 'Saturday \n& Sunday' if x >= 5 else 'Weekday')

In [26]:
# Line chart
alt.Chart(df).mark_line().encode(
    alt.X('Day', axis=alt.Axis(title="", format="%b %d", labelAngle=-45, grid=False)),
    alt.Y('Count', title="Google Search Volume (Indexed, 100 = Maximum)", axis=alt.Axis(grid=False))
).properties(
    title="Data is Beautiful. Hangovers Are Not",
    width=1000,
    height=500
)
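
The axis `format` string uses d3-style time directives, where `%b` is the abbreviated month name and `%d` is the zero-padded day of the month. Python's `strftime` understands the same two directives, so a rough preview of a tick label can be sketched without drawing the chart. This is only an illustration, assuming the default English locale; the variable names are made up for the example.

```python
from datetime import date

# First date in the dataset, taken from the df.head() output.
first_day = date(2020, 5, 30)

# %b = abbreviated month name, %d = zero-padded day of month.
label = first_day.strftime("%b %d")
print(label)  # May 30
```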

Next, let's add the weekend bands.

In [27]:
# To draw vertical bars, we give every weekend row a constant value of 100 (the top of the index)
# I am sure there are better ways to do this, but this works
x_tick_values = pd.DataFrame(x_tick_values)
x_tick_values['Count'] = 100
x_tick_values.head(3)

Unnamed: 0,Day,Count
0,2020-05-30,100
1,2020-05-31,100
7,2020-06-06,100
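
One possible alternative to a constant-height bar per weekend day is to compute one start/end span per weekend and draw it as a rectangle. The sketch below only builds the spans, using the standard library. It assumes the data is daily with no gaps, starting on 2020-05-30 and running for 90 days; `weekend_spans` is a hypothetical helper, not something from the notebook or from Altair.

```python
from datetime import date, timedelta


def weekend_spans(start, n_days):
    """Return one {'start', 'end'} record per weekend in a daily date range.

    'start' is the first weekend day seen and 'end' is the day after the
    last one, so each span covers whole days.
    """
    spans = []
    current = None
    for offset in range(n_days):
        day = start + timedelta(days=offset)
        if day.weekday() >= 5:  # Saturday = 5, Sunday = 6
            if current is None:
                # A new weekend begins on this day.
                current = {"start": day, "end": day + timedelta(days=1)}
                spans.append(current)
            else:
                # Extend the current weekend to include this day.
                current["end"] = day + timedelta(days=1)
        else:
            current = None
    return spans


# Assumption: 90 consecutive days starting on the first date in the dataset.
spans = weekend_spans(date(2020, 5, 30), 90)
print(len(spans), spans[0])
```

These records could then be turned into a DataFrame and drawn with something like `mark_rect`, encoding `x` from `start` and `x2` from `end`, which would avoid the fixed `size=12` bar width. I have not done that here, so treat it as an idea rather than a tested recipe.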


In [31]:
base = alt.Chart(df).mark_line(color="black").encode(
    alt.X('Day:T', axis=alt.Axis(title="", format="%b %d", labelAngle=-45, grid=False)),
    alt.Y('Count:Q', title="Google Search Volume (Indexed, 100 = Maximum)", axis=alt.Axis(grid=False))
).properties(
    title="Data is Beautiful. Hangovers Are Not",
    width=1000,
    height=500
)


bar = alt.Chart(x_tick_values).mark_bar(size=12, opacity=0.2).encode(
    alt.X('Day:T'),
    alt.Y('Count:Q')
)


base + bar

And now we can add the annotation as well.

In [36]:
# Annotate one weekend (2020-06-21 is a Sunday); dy=280 shifts the text down by 280 pixels
text = (
    alt.Chart(df.query("Day == '2020-06-21'"))
    .mark_text(dy=280, color="#4F61A1")
    .encode(x=alt.X("Day:T"), y=alt.Y("Count:Q"), text=alt.Text("label"))
)

c = base + bar + text
c

And some final touches to the plot.

In [42]:
c.properties(
    title={
        "text": "Data is Beautiful. Hangovers Are Not",
        "subtitle": "[Google Search Volume for \"hangover cure\"]"
    }
).configure_title(
    fontSize=32,
    color="darkblue"
)

And that is it, folks. If you're interested, you can check out the implementations in other libraries and languages as well.