#### A data exploration in Altair:
## 6. Interaction

Contact: jonas.oesch@nzz.ch

Import the necessary libraries and enable the data_server transformer, so that the data is served separately instead of being embedded in each Vega-Lite specification:

In [53]:
import pandas as pd
import altair as alt

alt.data_transformers.enable('data_server')

DataTransformerRegistry.enable('data_server')

Read the data, convert into correct types and preview:

In [54]:
data = pd.read_excel("Olympics.xlsx")
data.Year = pd.to_datetime(data.Year)
data.head(3)

Unnamed: 0,Year,City,Sport,Discipline,Athlete,Gender,Event,Medal,Country,Code,...,Durability,Endurance,Flexibility,Hand-Eye Coordination,Nerve,Power,Rank,Speed,Strength,Total
0,1896-01-01,Athens,Aquatics,Swimming,"HAJOS, Alfred",Men,100M Freestyle,Gold,Hungary,HUN,...,4.63,9.25,5.5,2.88,2.63,4.63,36,5.5,5.25,46.875
1,1896-01-01,Athens,Aquatics,Swimming,"HAJOS, Alfred",Men,100M Freestyle,Gold,Hungary,HUN,...,3.25,4.13,5.5,2.75,2.5,6.25,45,7.88,5.25,44.125
2,1896-01-01,Athens,Aquatics,Swimming,"HERSCHMANN, Otto",Men,100M Freestyle,Silver,Austria,AUT,...,4.63,9.25,5.5,2.88,2.63,4.63,36,5.5,5.25,46.875
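
The conversion above gives the expected dates for this file. As a hedged aside, if the Year column were stored as plain integers (an assumption, the notebook does not show the dtype), `pd.to_datetime` without a format would read them as nanoseconds since 1970. A minimal sketch of the difference, using a hypothetical `years` series:

```python
import pandas as pd

# Hypothetical stand-in for the Year column, assuming integer years.
years = pd.Series([1896, 1900, 2012], name="Year")

# Without a format, integers are interpreted as nanoseconds since 1970-01-01.
naive = pd.to_datetime(years)

# With format="%Y", each value is parsed as a calendar year.
parsed = pd.to_datetime(years, format="%Y")

print(naive.dt.year.tolist())
print(parsed.dt.year.tolist())
```

Passing the format explicitly is harmless when the column already parses correctly and protects against the integer case.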


Do large countries win more medals?

In [72]:
alt.Chart(data).mark_point().encode(
    x="Population",
    y="count()",
    tooltip="Country"
)
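
The `count()` shorthand in the encoding asks Vega-Lite to group the rows by the other encoded fields (here Population and Country) and count the rows in each group, which gives one point per country. A rough pandas equivalent, on a made-up `toy` table with one row per medal:

```python
import pandas as pd

# Hypothetical stand-in for the medal table: one row per medal won.
toy = pd.DataFrame({
    "Country": ["A", "A", "A", "B", "B", "C"],
    "Population": [10, 10, 10, 200, 200, 3000],
})

# Group by the non-aggregated fields and count the rows in each group.
medals = (
    toy.groupby(["Country", "Population"])
       .size()
       .reset_index(name="count")
)
print(medals)
```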

Maybe try a log scale on the x-axis:

In [73]:
alt.Chart(data).mark_point().encode(
    x=alt.X("Population", scale=alt.Scale(type="log")),
    y="count()",
    tooltip="Country"
)

Or is it a question of being rich? It certainly seems to help.
Here we aggregate the data with pandas (groupby and count) to create a column that contains the medal count per country. Because the count is now a regular column, we can also display it in the tooltip.

In [67]:
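# Count medal rows per country; GDP per Capita is kept as a grouping key
# so that it stays available as a column for the chart.
d1 = (
    data
    .groupby(["Country", "GDP per Capita"]).Athlete.count()
    .reset_index()
    .rename(columns={"Athlete": "Medals"})
)
d1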

Unnamed: 0,Country,GDP per Capita,Medals
0,Algeria,4206.031232,8
1,Argentina,13431.878340,223
2,Armenia,3489.127690,2
3,Australia,56310.962993,1259
4,Austria,43774.985174,211
...,...,...,...
86,United States,56115.718426,4193
87,Uruguay,15573.900919,76
88,Uzbekistan,2132.070368,17
89,Zambia,1304.879014,1


In [78]:
c1 = (alt.Chart(d1)
        .mark_circle(size=100)
        .encode(
            y=alt.Y("Medals"),
            tooltip=["Country", "Medals", "GDP per Capita"],
            x=alt.X("GDP per Capita", scale=alt.Scale(type="log"))
        )
     )
c1

If we want a closer look at the countries with fewer medals, we can make the chart interactive, which enables panning and zooming:

In [79]:
c1.interactive()