<a href="https://github.com/hturbe/python_visualisation/blob/main/intro_vis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction to Plotly
The aim of this workshop is to introduce two common visualisation libraries in Python:

*   **[Matplotlib](https://matplotlib.org/stable/index.html)**: For simple visualisation
*   **[Plotly](https://plotly.com/)**: For interactive visualisation


## Data Formatting
The first step is to import the data and format them so that they can be used to produce different visualisations.

In [None]:
# We need to import some external libraries that will be used to transform the data
import os
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
!python -m pip install plotly --upgrade

Today we will work with COVID-19 data from the Swiss Confederation [website](https://www.covid19.admin.ch/en/overview).
We first download the data and unzip them into a folder called covid_data.

NB: we can run shell commands by preceding them with !, as done below with the wget and unzip commands.

In [None]:
!rm -r /content/*
!wget https://www.covid19.admin.ch/api/data/20211025-yqqxa0ri/downloads/sources-csv.zip -O data_zipped.zip
!unzip -qd /content/covid_data /content/data_zipped.zip
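
Outside Colab, where ! shell commands may not be available, roughly the same download-and-extract step could be done from Python. The sketch below is not part of the original workshop: the helper name fetch_and_extract is hypothetical, and it relies only on the standard library (urllib.request and zipfile).

```python
import urllib.request
import zipfile
from pathlib import Path


def fetch_and_extract(url, zip_path, out_dir):
    """Download the archive at `url` to `zip_path` and extract it into `out_dir`.

    Hypothetical helper that mirrors the wget and unzip shell commands.
    """
    zip_path = Path(zip_path)
    out_dir = Path(out_dir)
    # Rough equivalent of: wget <url> -O <zip_path>
    urllib.request.urlretrieve(url, zip_path)
    # Rough equivalent of: unzip -qd <out_dir> <zip_path>
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)
    # Return the names of the extracted top-level files and folders
    return sorted(p.name for p in out_dir.iterdir())
```

Calling it with the URL from the wget command above, 'data_zipped.zip' and 'covid_data' would be expected to leave the CSV files in the same data subfolder as the shell version.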

In [None]:
# we can now look at the data we have downloaded
path_data = '/content/covid_data/data'
os.listdir(path_data)

In [None]:
# We can read the CSV file with the pandas library we imported before
df_hosp = pd.read_csv(os.path.join(path_data, 'COVID19Hosp_geoRegion_AKL10_w.csv'))
# We convert the week number into dates so that they are easier to manipulate afterwards
df_hosp.datum = pd.to_datetime(df_hosp.datum.astype(str)+'1',format="%Y%W%w")
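
To see what the format string in the conversion above does, the sketch below applies the same directives to a single value with the standard library's datetime.strptime, which uses the same %Y, %W and %w codes. The sample labels and the helper names are assumptions for illustration. It also shows the ISO variant (%G%V%u): %W counts weeks from the first Monday of the year, so the two conventions can differ by a week in some years. Whether the datum column follows ISO week numbering is not stated here and is worth checking against the data documentation.

```python
from datetime import datetime


def week_to_monday(week_label):
    """Monday of a yyyyww label, with weeks counted from the first Monday of the year."""
    # Appending "1" selects Monday (%w: 0 = Sunday, 1 = Monday, ...)
    return datetime.strptime(week_label + "1", "%Y%W%w").date()


def iso_week_to_monday(week_label):
    """Monday of a yyyyww label, read as ISO year and ISO week number."""
    # %u uses 1 = Monday ... 7 = Sunday
    return datetime.strptime(week_label + "1", "%G%V%u").date()


# Assumed sample labels in the yyyyww layout
print(week_to_monday("202142"))      # 2021-10-18
print(iso_week_to_monday("202142"))  # 2021-10-18
print(week_to_monday("202001"))      # 2020-01-06
print(iso_week_to_monday("202001"))  # 2019-12-30
```

For 2021 both conventions agree because 1 January falls late in the week and the first Monday starts ISO week 1; for 2020 they differ by one week.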


The data are split by canton and by [region](https://en.wikipedia.org/wiki/NUTS_statistical_regions_of_Switzerland).

We first focus on the data for the whole of Switzerland.

In [None]:
# df_hospCH = df_hosp.loc[df_hosp.geoRegion == 'CH',:]
print(df_hosp.columns)

In [None]:
df_hospIndexed = df_hosp.set_index(['geoRegion','altersklasse_covid19', 'datum']).loc[:,['entries','inz_entries']]

In [None]:
df_hospCH = df_hospIndexed.xs('CH')
df_hospCH
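
The set_index and xs calls above can be easier to follow on a tiny table. The sketch below uses made-up numbers that only mimic the layout of the hospitalisation data; the variable names starting with toy are hypothetical.

```python
import pandas as pd

# Made-up values that only mimic the layout of the hospitalisation table
toy = pd.DataFrame({
    "geoRegion": ["CH", "CH", "GE", "GE"],
    "altersklasse_covid19": ["80+", "Unbekannt", "80+", "Unbekannt"],
    "datum": pd.to_datetime(["2021-10-18"] * 4),
    "entries": [20, 1, 2, 0],
})
toy_indexed = toy.set_index(["geoRegion", "altersklasse_covid19", "datum"])

# xs("CH") keeps the rows whose first index level is "CH" and drops that level
toy_ch = toy_indexed.xs("CH")
print(list(toy_ch.index.names))  # ['altersklasse_covid19', 'datum']

# A second xs call selects one age class, leaving only the date level
toy_80 = toy_ch.xs("80+")
print(toy_80["entries"].tolist())  # [20]
```

Each xs call removes the level it selects on, which is why the later cells can call xs('80+') directly on df_hospCH.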

## Simple visualisations: Matplotlib

In [None]:
df_entries = df_hospCH.xs('80+').loc[:,'inz_entries']
plt.plot(df_entries);

### Customising Matplotlib graphs

In [None]:
SMALL_SIZE = 12
MEDIUM_SIZE = 14
BIGGER_SIZE = 16

plt.rc('font', size=SMALL_SIZE)          # controls default text sizes
plt.rc('axes', titlesize=SMALL_SIZE)     # fontsize of the axes title
plt.rc('axes', labelsize=MEDIUM_SIZE)    # fontsize of the x and y labels
plt.rc('xtick', labelsize=SMALL_SIZE)    # fontsize of the tick labels
plt.rc('ytick', labelsize=SMALL_SIZE)    # fontsize of the tick labels
plt.rc('legend', fontsize=SMALL_SIZE)    # legend fontsize

plt.figure(figsize=[12,8]);
plt.plot(df_entries);
plt.title('Hospitalisation for people aged 80+ in CH',fontsize = 18);
plt.ylabel('Weekly Incidence');
plt.xlabel('Date');
plt.xticks(rotation=45);

## Interactive visualisations: Plotly

You can find some of the projects done with Dash and Plotly [here](https://dash.gallery/Portal/).

In [None]:
import plotly.graph_objects as go
fig = go.Figure()
fig.add_trace(go.Scatter(
    x=df_entries.index, y=df_entries.values,
    name='Incidence',
    mode='lines+markers',
))

fig.update_layout(
        template='plotly_white')

In [None]:
import plotly.express as px

color_scheme = px.colors.qualitative.Vivid
age_classes = df_hospCH.index.get_level_values(0).unique()
# print('Age classses:',age_classes.tolist())
fig = go.Figure()
for idx,age_range in enumerate(age_classes):
  if age_range == 'Unbekannt':
    legend_name = 'Unknown'
  else:
    legend_name = age_range 
  df_tmp = df_hospCH.xs(age_range).loc[:,'inz_entries']
  fig.add_trace(go.Scatter(
    x=df_tmp.index, y=df_tmp.values,
    name=legend_name,
    mode='lines+markers',
    marker_color = color_scheme[idx]
  ))

fig.update_layout(
    title="Incidence of hospitalisation in CH",
    xaxis_title="Date",
    yaxis_title="Incidence",
    template='plotly_white')

# fig.update_layout(legend_title_text='Age groups')
    

In [None]:
import plotly.express as px

color_scheme = px.colors.qualitative.Vivid
locations = df_hospIndexed.index.get_level_values(0).unique()
fig = go.Figure()
# We first add the trace for all locations and age classes setting visible to false
for location in locations:
  df_loc = df_hospIndexed.xs(location)
  for idx,age_range in enumerate(age_classes):
    if age_range == 'Unbekannt':
      legend_name = 'Unknown'
    else:
      legend_name = age_range 
      
    df_tmp = df_loc.xs(age_range).loc[:,'inz_entries']
    fig.add_trace(go.Scatter(
      x=df_tmp.index, y=df_tmp.values,
      name=legend_name,
      mode='lines+markers',
      visible=False,
      marker_color = color_scheme[idx]
    ))

steps = []
# Number of traces (one per age class) added for each location
trace_nb = int(len(fig.data)/len(locations))
# We create one dict per location describing which traces to show for it
for i in range(len(locations)):
  area_str = locations[i]
  step = dict(
    label=area_str,
    method="update",
    args=[{"visible": [False] * len(fig.data)},
      # {"title": area_str}
    ],  # layout attribute
  )
  for idx in range(trace_nb):  # Toggle the traces of the i'th location to "visible"
    step["args"][0]["visible"][i*trace_nb+idx] = True
  steps.append(step)

# We update the layout with the list of dicts previously created for the dropdown buttons
fig.update_layout(
  template='plotly_white',
  width=1500,
  height=700,
  xaxis_title="Date",
  yaxis_title="Incidence",
  updatemenus=[
    dict(
      active=0,
      buttons=steps,
      direction="down",
      pad={"r": 10, "t": 10, "b": 15},
      showactive=True,
      x=0,
      xanchor="left",
      yanchor="bottom")],
)

# Show the traces of the first location, matching active=0 in the dropdown
for idx in range(trace_nb):
  fig.data[idx].visible = True

fig.show()
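
The dropdown above works by attaching one list of booleans to each button, with one entry per trace in the figure. The standalone sketch below reproduces only that bookkeeping, without Plotly, so the pattern can be inspected directly. The function name visibility_masks and the toy sizes are assumptions for illustration.

```python
def visibility_masks(n_locations, traces_per_location):
    """Return one boolean mask per location; mask i shows only that location's traces."""
    total = n_locations * traces_per_location
    masks = []
    for i in range(n_locations):
        mask = [False] * total
        # Traces were added location by location, so location i owns a contiguous block
        for idx in range(traces_per_location):
            mask[i * traces_per_location + idx] = True
        masks.append(mask)
    return masks


# Toy sizes (assumed): 3 locations with 2 age classes each
for mask in visibility_masks(3, 2):
    print(mask)
# [True, True, False, False, False, False]
# [False, False, True, True, False, False]
# [False, False, False, False, True, True]
```

This only works because the traces are added in a fixed order (all age classes of one location, then the next location), so the position of a trace in fig.data identifies its location.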

## Your turn
You can pick one of the CSV files included in the dataset we downloaded and create your own graph, e.g. show the vaccinated population per canton.