<div class="alert alert-info">

# Instructions

- This Jupyter notebook is **for practice only**; you do not need to submit it.
- Still, **invest sufficient time in practicing how Plotly works!** While it is easy to create basic plots, it certainly takes more time to optimize and customize them. Mastering Plotly is fundamental for your success in this course.

</div>

In [43]:
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

In [108]:
df = pd.read_csv("wdi.csv")
df20 = df[df.year==2020]
df.head()

Unnamed: 0,iso3,country,year,continent,region,population,gdp,gdp_capita,life_expectancy,inflation,fertility,maternal_death,infant_mortality_per_1000,suicides_per_100k
0,AFG,Afghanistan,1990,Asia,South Asia,10694796.0,,,45.967,,7.565,,120.9,
1,ALB,Albania,1990,Europe,Europe & Central Asia,3286542.0,8379850000.0,2549.746801,73.144,-0.431369,2.9,,35.4,
2,DZA,Algeria,1990,Africa,Middle East & North Africa,25518074.0,177965000000.0,6974.076379,67.416,30.259599,4.556,,43.6,
3,ASM,American Samoa,1990,Oceania,East Asia & Pacific,47818.0,,,,,,,,
4,AND,Andorra,1990,Europe,Europe & Central Asia,53569.0,,,,7.326244,,,9.1,
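The `Unnamed: 0` column in the preview is most likely a row index that was written out when the CSV was saved. A minimal sketch of how it could be read back as the index instead, assuming the first CSV column has an empty header (the two rows below are copied from the preview and reduced to a few columns):

```python
import io
import pandas as pd

# Two rows copied from the preview above, with only a few of the columns.
csv_text = """,iso3,country,year,fertility
0,AFG,Afghanistan,1990,7.565
1,ALB,Albania,1990,2.9
"""

# Default read: the unnamed first column becomes a regular "Unnamed: 0" column.
df_default = pd.read_csv(io.StringIO(csv_text))

# index_col=0 uses that first column as the row index instead.
df_indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(df_default.columns))
print(list(df_indexed.columns))
```

For the real file this would correspond to `pd.read_csv("wdi.csv", index_col=0)`; whether that is desirable depends on how the file was produced.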


# Part 1

## 1.1

Create a scatter plot showing the relationship between GDP per capita and fertility rate in 2020. No optimizations and customizations are needed at this point.

In [4]:
px.scatter(df20, x='gdp_capita', y='fertility')

What happens if you add the argument `text="country"` to the `px.scatter()` function?

In [5]:
px.scatter(df20, x='gdp_capita', y='fertility', text='country')

Change your code so that the country names are only shown when you hover over the points.

In [6]:
px.scatter(df20, x='gdp_capita', y='fertility', hover_name='country')

- Add the color argument to the `px.scatter()` function so that the color of the markers reflects the `region`.
- Then try out alternative encoding choices such as `symbol` or `size`. 
- Which encoding works best?

In [9]:
px.scatter(df20, x='gdp_capita', y='fertility', color='region', hover_name='country')

In [10]:
px.scatter(df20, x='gdp_capita', y='fertility', color='region', symbol='region', hover_name='country')

Add a title and proper labels for the x- and y-axis and the legend.

In [12]:
px.scatter(df20, x='gdp_capita', y='fertility',
           color='region', symbol='region', hover_name='country',
           title='Do rich people have fewer children?',
           subtitle='GDP per capita vs. fertility rate in 2020',
           labels={'region': 'Region',
                   'gdp_capita': 'GDP per Capita [$]',
                   'fertility': 'Fertility Rate'}
           )

## 1.2

Below you see an SVG image of a line plot. **Can you recreate this figure using Plotly?**

![](figures/fig1.svg)

In [28]:
df_small = df[df["country"].isin(['Morocco', 'Rwanda', 'South Africa', 'South Sudan', 'Uganda'])]
px.line(df_small, x='year', y='life_expectancy', color='country',
        title='Life expectancy in selected African countries (1990-2020)',
        labels={'country': 'Country',
                'year': 'Year',
                'life_expectancy': 'Life Expectancy'
                })

# Part 2

## 2.1

- Run the following code cell and inspect the resulting figure. 
- What happens to the x- and y-axes if you deselect "Africa" in the legend?
- (When) is this behaviour desirable? 

In [29]:
px.scatter(df20, x="fertility", y="life_expectancy", color="continent")

- Change the definition of the x- and y-axes so that they are fixed at appropriate ranges.
- Which of the two approaches do you prefer?

In [None]:
x_min = df20["fertility"].min()
x_max = df20["fertility"].max()

y_min = df20["life_expectancy"].min()
y_max = df20["life_expectancy"].max()

fig = px.scatter(df20, x="fertility", y="life_expectancy", color="continent")

fig.update_xaxes(range=[x_min-1, x_max+1])
fig.update_yaxes(range=[y_min-2, y_max+2])

fig.show()

## 2.2

Below you see an SVG image of a scatter plot. **Try to recreate this figure using Plotly.**  
You can use the code snippet below the image to get started.  
Go step by step and check your figure after each step.

In particular, you will need to:

1. change the data type of the year variable
2. apply changes to the traces of the figure
3. apply changes to the layout of the figure

![](figures/fig2.svg)

In [38]:
# Filter for years 1990 and 2020, and for Africa only
data = df[(df.year.isin([1990, 2020])) & (df.continent=="Africa")].reset_index(drop=True)

In [106]:
data_20 = data[data.year == 2020]
data_90 = data[data.year == 1990]

fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=data_90.fertility,
        y=data_90.life_expectancy,
        mode='markers',
        marker=dict(color="#2B9978", size=10, opacity=0.7),
        name='1990'
    )
)
fig.add_trace(
    go.Scatter(
        x=data_20.fertility,
        y=data_20.life_expectancy,
        mode='markers',
        marker=dict(color="#D86D1B", size=10, opacity=0.7),
        name='2020'
    )
)

fig.update_layout(
    width=800,
    height=400,
    title="How have Africa's fertility and life expectancy changed between 1990 and 2020?",
    xaxis=dict(title="Fertility"),
    yaxis=dict(title="Life Expectancy"),
    template='plotly_white',
    legend=dict(
        orientation="h",
        x=0.75,   # left-right (0 = left, 1 = right)
        y=0.98,   # bottom-top (0 = bottom, 1 = top)
        bordercolor="black",
        borderwidth=0.5
    )
)

fig.update_xaxes(range=[1, 9], dtick=1)
fig.update_yaxes(range=[28, 80])

fig.show()

# Part 3

Create one further visualization of your choice. Then customize and optimize it.

In [133]:
fig = px.choropleth(df,
                    locations='iso3',        # three-letter ISO country codes
                    color='life_expectancy',
                    hover_name='country',
                    animation_frame='year',  # one frame per year
                    color_continuous_scale='Plasma',
                    range_color=[30, 85],    # fixed scale so colors are comparable across years
                    title='Global Life Expectancy Over Time',
                    labels={'year': 'Year',
                            'life_expectancy': 'Life Expectancy'}
                    )

fig.update_layout(
    width=700,
    height=400,
    margin=dict(l=0, r=0, t=40, b=0))

fig.show()