# #MakeoverMonday - What is America's Favourite Sport?
>  Gallup has analysed which sports Americans like to watch.

- toc: false
- badges: true
- comments: true
- categories: makeovermonday, altair, python, visualisation
- image: images/sports.png

## What is America's Favourite Sport?

Gallup has analysed which sports Americans like to watch. Football tops the list at roughly 40%; the next-largest group, at 15%, is Americans who have no preference.

Source: https://news.gallup.com/poll/4735/sports.aspx#1

In [151]:
# hide
import pandas as pd
import altair as alt

In [152]:
# hide
df = pd.read_excel("Data.xlsx")
df

,Sport,2017,2013,2008,2007,2006,2005,2004
0,Football,37,39,41,43,43,34,37
1,Basketball,11,12,9,11,12,12,13
2,Baseball,9,14,10,13,11,12,10
3,Soccer,7,4,3,2,2,3,2
4,Ice hockey,4,3,4,4,2,4,3
5,Auto racing,2,2,3,3,4,5,5
6,Tennis,2,3,1,1,1,3,2
7,Golf,1,2,2,2,3,2,2
8,Volleyball,1,0,1,0,0,1,0
9,Boxing,1,1,2,1,2,1,1


In [153]:
# hide
df.columns = df.columns.astype(str)

In [154]:
# hide
# The data doesn't quite add up to 100%
df["2017"].sum()

98
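
To see whether the shortfall is specific to 2017, the same check can be run on every year column at once. This is only a minimal sketch: it uses three rows copied from the table above rather than the full spreadsheet, so the totals here are far below 100, and the names `sample`, `year_totals` and `shortfall` are made up for illustration.

```python
import pandas as pd

# Three rows copied from the table above; the real spreadsheet has more sports
sample = pd.DataFrame(
    {
        "Sport": ["Football", "Basketball", "Baseball"],
        "2017": [37, 11, 9],
        "2013": [39, 12, 14],
    }
)

# Sum every year column in one go (Sport becomes the index, so it is not summed)
year_totals = sample.set_index("Sport").sum()

# How far each year is from 100%
shortfall = 100 - year_totals
print(shortfall)
```

On the full dataset, a small shortfall in every year would most likely point to rounding in the published figures, although that is an assumption rather than something stated in the source.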

In [155]:
# hide
# Reshape data to obtain a long and thin dataset
df.set_index("Sport", inplace = True)
df.columns.name = "Year"
df = df.stack(level=-1)

In [156]:
# hide
# Convert series back to df
df = df.to_frame(name = "Percentage").reset_index()
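
The stack-and-rename steps above can also be written as a single `melt` call. A minimal sketch on two rows copied from the table above (the names `wide` and `long` are only for illustration):

```python
import pandas as pd

# Two rows and two year columns copied from the table above
wide = pd.DataFrame(
    {
        "Sport": ["Football", "Soccer"],
        "2017": [37, 7],
        "2013": [39, 4],
    }
)

# One row per (Sport, Year) pair, with the value in a Percentage column
long = wide.melt(id_vars="Sport", var_name="Year", value_name="Percentage")
print(long)
```

The row order differs from the `stack` version: `melt` lists all sports for one year before moving to the next year, whereas `stack` lists all years for one sport. Since the data gets sorted later anyway, this should not matter here.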

In [157]:
# hide
# Find most relevant sport so the df can be sorted accordingly
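# Select only the Percentage column so the string-valued Year column is not summed too
df_group = df.groupby("Sport")[["Percentage"]].sum().sort_values("Percentage", ascending=False)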
df_group.head()

Sport,Percentage
Football,274
,89
Basketball,80
Baseball,79
Other,29


In [158]:
# hide
df_inner = df.merge(df_group, how="inner", on="Sport")

In [159]:
# hide
df_inner.sort_values(["Percentage_y", "Sport", "Year"], ascending=[False, True, True], inplace = True)

In [160]:
# hide
df_inner.rename(columns={"Percentage_x": "Percentage"}, inplace = True)

In [161]:
# hide
df_inner.head(12)

,Sport,Year,Percentage,Percentage_y
6,Football,2004,37,274
5,Football,2005,34,274
4,Football,2006,43,274
3,Football,2007,43,274
2,Football,2008,41,274
1,Football,2013,39,274
0,Football,2017,37,274
146,,2004,12,89
145,,2005,13,89
144,,2006,12,89
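
The group, merge, sort and rename steps above can also be collapsed by broadcasting each sport's total back onto its rows with `groupby(...).transform`. A minimal sketch on a few values copied from the data above; the column name `Total` is hypothetical and plays the role of `Percentage_y`:

```python
import pandas as pd

# A few long-format rows copied from the data above
long = pd.DataFrame(
    {
        "Sport": ["Soccer", "Soccer", "Football", "Football"],
        "Year": ["2013", "2017", "2013", "2017"],
        "Percentage": [4, 7, 39, 37],
    }
)

# Total per sport, repeated on every row of that sport (no merge needed)
long["Total"] = long.groupby("Sport")["Percentage"].transform("sum")

# Biggest sport first, then alphabetically, then chronologically
ordered = long.sort_values(
    ["Total", "Sport", "Year"], ascending=[False, True, True]
)
print(ordered)
```

This avoids the `_x`/`_y` suffixes and the rename that follows the merge, at the cost of not having a separate summary table to inspect.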


In [163]:
alt.Chart(df_inner).mark_rect(stroke="#ffffff", strokeWidth=0.7).encode(
    alt.X("Year", title=None),
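    # Order sports by their total across all years (Percentage_y), largest first
    alt.Y(
        "Sport:N",
        title=None,
        sort=alt.EncodingSortField(field="Percentage_y", op="max", order="descending"),
    ),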
    color=alt.Color("Percentage:Q", scale=alt.Scale(scheme="purplered")),
    tooltip=["Sport", "Percentage", "Year"],
).configure(font="Calibri").configure_axis(
    grid=False, labelFontSize=12, ticks=False, domainOpacity=0
).properties(
    width=150, height=490
).configure_view(
    strokeOpacity=0
).configure_axisY(  # left aligns y-axis labels
    titleAngle=0, titleY=-10, titleX=-60, labelPadding=110, labelAlign="left"
).properties(
    title={"text": "What is America’s favourite sport to watch?", "subtitle": " "}
).configure_title(
    anchor="start", fontSize=18
).configure_legend(
    labelFontSize=12, titleFontSize=14
)