# **Task 3 - KPIs & Exploratory Data Analysis**

Define 3-6 Key Performance Indicators (KPIs) in addition to those used in the previous tasks and perform exploratory data analysis on multiple sheets in both Excel files. Make sure to document what you are investigating using a combination of markdown and code in the Jupyter notebook. If you are using Python scripts, write markdown alongside the code. Also, document your findings and save all visualizations.

**1. For Vaccination Data:**

1.1 **Vaccination Coverage Rate Municipalities:** The percentage of the population in each municipality that has received at least one dose of the COVID-19 vaccine. Formula: (Number of people vaccinated with at least one dose in the municipality / Total population of the municipality) * 100.

1.2 **Fully Vaccinated Rate Municipalities:** The percentage of the population that has received all recommended doses of the COVID-19 vaccine. Formula: (Number of people fully vaccinated / Total population) * 100.

1.3 **Vaccine Efficacy Rate:** The effectiveness of the vaccine in preventing COVID-19 cases. Formula: (1 - (Number of vaccinated people who contracted COVID-19 / Total number of vaccinated people)) * 100.

**2. For COVID-19 Cases Data:**

2.1 **Incidence Rate:** The number of new COVID-19 cases reported in a specific period per 100,000 population. Formula: (Number of new cases / Total population) * 100,000.

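2.2 **Hospitalization Rate:** The percentage of COVID-19 cases that required intensive care. The case data reports intensive-care admissions, so these are used as the measure of hospitalization. Formula: (Number of intensive-care cases / Total number of COVID-19 cases) * 100.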

2.3 **Case Fatality Rate (CFR):** The percentage of confirmed COVID-19 cases that resulted in death. Formula: (Number of deaths due to COVID-19 / Total number of confirmed COVID-19 cases) * 100.
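All of the formulas above share the same shape: a count divided by a base, scaled to either a percentage or a per-100,000 figure. A minimal sketch of a shared helper is shown below. The function name `kpi_rate` is hypothetical, the case and death counts are taken from the Östergötland rows of the weekly data shown later in this notebook, and the population figure is a rough assumption used only for illustration.

```python
import math


def kpi_rate(numerator, denominator, per=100):
    """Return numerator / denominator scaled by `per` (100 for %, 100_000 for incidence)."""
    if denominator == 0:
        # No cases or no population recorded: the rate is undefined.
        return math.nan
    return numerator / denominator * per


# Case fatality rate from cumulative figures (871 deaths, 97,976 cases).
cfr = kpi_rate(871, 97_976)

# Weekly incidence per 100,000 for 248 new cases.
# ASSUMPTION: 467_000 is an illustrative population, not a value from the data files.
incidence = kpi_rate(248, 467_000, per=100_000)

print(f"CFR: {cfr:.2f} %")
print(f"Incidence: {incidence:.0f} per 100,000")
```

Returning `nan` for a zero denominator keeps early weeks with no reported cases from raising an error when the helper is applied row by row.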

### Import of libraries needed for exploration

In [53]:
import pandas as pd
import seaborn as sns
import plotly.express as px

### **Vaccination Data:**

1. **Vaccination Coverage Rate Municipalities:** The percentage of the population in each municipality that has received at least one dose of the COVID-19 vaccine. Formula: (Number of people vaccinated with at least one dose in the municipality / Total population of the municipality) * 100.

We start by loading the sheet with vaccination data broken down by municipality. We then sort the dataframe in descending order by the share of the total population that received dose one.

In [54]:
# Vaccinated with at least one and at least two doses per municipality
vaccinated_municipality = pd.read_excel("Data/Folkhalsomyndigheten_Covid19_Vaccine.xlsx", sheet_name="Vaccinerade kommun")
vaccinated_municipality.sort_values(by='Andel_dos1', ascending=False)

Unnamed: 0,KnKod,KnNamn,Antal_dos1,Antal_dos2,Andel_dos1,Andel_dos2
187,1761,Hammarö,13046,12916,0.931591,0.922308
273,2480,Umeå,105536,103798,0.928744,0.913449
193,1780,Karlstad,77359,76392,0.922324,0.910795
103,1262,Lomma,19247,18986,0.921218,0.908725
54,584,Vadstena,6204,6160,0.919929,0.913405
...,...,...,...,...,...,...
101,1260,Bjuv,10379,10106,0.764511,0.744402
97,1231,Burlöv,12720,12388,0.763231,0.743310
114,1277,Åstorp,10562,10269,0.762985,0.741819
17,181,Södertälje,62933,59877,0.716744,0.681939


The output above shows the share of residents who received dose one in each municipality, sorted in descending order by 'Andel_dos1' (share with dose one). The rows at the top have the highest share of residents vaccinated with dose one, and the rows at the bottom have the lowest.
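Instead of reading the extremes off a truncated display, the top and bottom of the ranking can be pulled out explicitly. The sketch below uses `nlargest` and `nsmallest` on a small sample built only from rows visible in the printed output, so it does not cover every municipality; the variable names are illustrative.

```python
import pandas as pd

# Sample limited to rows visible in the printed output (not the full sheet).
sample = pd.DataFrame({
    "KnNamn": ["Hammarö", "Umeå", "Karlstad", "Bjuv", "Burlöv", "Åstorp", "Södertälje"],
    "Andel_dos1": [0.931591, 0.928744, 0.922324, 0.764511, 0.763231, 0.762985, 0.716744],
})

# Highest and lowest dose-one shares, each sorted from most to least extreme.
top = sample.nlargest(3, "Andel_dos1")
bottom = sample.nsmallest(3, "Andel_dos1")

print(top)
print(bottom)
```

On the full dataframe the same calls would be `vaccinated_municipality.nlargest(5, "Andel_dos1")` and `vaccinated_municipality.nsmallest(5, "Andel_dos1")`.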

From this we can see that Botkyrka has the lowest share of residents who received dose one, at just 70% of the municipality's total population. At the other end, Hammarö has vaccinated 93% of its population with at least one dose.

In [55]:
labels = {
    'Andel_dos1': "Share of dose one", 
    'KnNamn': 'Municipality color'}
fig = px.strip(vaccinated_municipality, x='Andel_dos1', hover_name='KnNamn', title='Share per Municipality that Received Vaccine Dose 1', color='KnNamn', labels=labels)
fig.show()
# fig.write_html()

In [56]:
vaccinated_gender = pd.read_excel("Data/Folkhalsomyndigheten_Covid19_Vaccine.xlsx", sheet_name="Vaccinerade kön")
vaccinated_gender

Unnamed: 0,Kön,Antal vaccinerade,Andel vaccinerade,Vaccinationsstatus
0,Totalt,7810380,0.858964,Minst 1 dos
1,Totalt,7627588,0.838861,Minst 2 doser
2,Män,3858688,0.845743,Minst 1 dos
3,Män,3759898,0.82409,Minst 2 doser
4,Kvinnor,3951692,0.872279,Minst 1 dos
5,Kvinnor,3867690,0.853737,Minst 2 doser


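If we combine this with the national vaccination figures by gender, where men have a lower vaccination rate than women for both dose levels, we can hypothesize that men in Botkyrka municipality are the group with the lowest vaccination rate. The data is not broken down by gender within each municipality, so this remains an assumption rather than a finding.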
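To make the gender comparison explicit, the long-format table can be pivoted so that each gender is a row and each vaccination status is a column. The sketch below rebuilds the table from the values printed above; the variable names are illustrative.

```python
import pandas as pd

# Values copied from the printed "Vaccinerade kön" output.
vaccinated_gender_sample = pd.DataFrame({
    "Kön": ["Totalt", "Totalt", "Män", "Män", "Kvinnor", "Kvinnor"],
    "Andel vaccinerade": [0.858964, 0.838861, 0.845743, 0.82409, 0.872279, 0.853737],
    "Vaccinationsstatus": ["Minst 1 dos", "Minst 2 doser"] * 3,
})

# One row per gender, one column per vaccination status.
share_by_gender = vaccinated_gender_sample.pivot(
    index="Kön", columns="Vaccinationsstatus", values="Andel vaccinerade"
)

# Gap between women and men in percentage points.
gap_pp = (share_by_gender.loc["Kvinnor"] - share_by_gender.loc["Män"]) * 100

print(share_by_gender)
print(gap_pp.round(2))
```

The gap is positive for both statuses, meaning women have the higher national share for at least one dose and for at least two doses.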

### **COVID-19 Cases Data:**

1. **Incidence Rate:** The number of new COVID-19 cases reported in a specific period per 100,000 population. Formula: (Number of new cases / Total population) * 100,000.

In [66]:
weekly_data_county = pd.read_excel("Data/Folkhalsomyndigheten_Covid19.xlsx", sheet_name="Veckodata Region")
weekly_data_county

Unnamed: 0,år,veckonummer,Region,Antal_fall_vecka,Kum_antal_fall,Antal_intensivvårdade_vecka,Kum_antal_intensivvårdade,Antal_avlidna_vecka,Kum_antal_avlidna,Antal_fall_100000inv_vecka,Kum_fall_100000inv
0,2020,1,Blekinge,0,0,0,0,0,0,0,0
1,2020,2,Blekinge,0,0,0,0,0,0,0,0
2,2020,3,Blekinge,0,0,0,0,0,0,0,0
3,2020,4,Blekinge,0,0,0,0,0,0,0,0
4,2020,5,Blekinge,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...
3082,2022,38,Östergötland,248,97976,1,437,5,871,53,20973
3083,2022,39,Östergötland,238,98214,2,439,7,878,51,21024
3084,2022,40,Östergötland,204,98418,0,439,3,881,44,21067
3085,2022,41,Östergötland,145,98563,1,440,4,885,31,21098
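The cumulative columns in this sheet are enough to derive the case fatality rate and the intensive-care rate per region. The sketch below takes the latest available week for each region and applies the KPI formulas. It runs on a small excerpt of the rows printed above, so the Blekinge values reflect early 2020 only and are not meaningful results; variable names are illustrative.

```python
import pandas as pd

# Excerpt of the rows printed above (not the full sheet).
weekly_sample = pd.DataFrame({
    "år": [2020, 2020, 2022, 2022],
    "veckonummer": [4, 5, 40, 41],
    "Region": ["Blekinge", "Blekinge", "Östergötland", "Östergötland"],
    "Kum_antal_fall": [0, 0, 98418, 98563],
    "Kum_antal_intensivvårdade": [0, 0, 439, 440],
    "Kum_antal_avlidna": [0, 0, 881, 885],
})

# Keep the most recent week per region.
latest = (
    weekly_sample.sort_values(["år", "veckonummer"])
    .groupby("Region")
    .tail(1)
    .set_index("Region")
)

# Regions with zero cumulative cases yield NaN (0 / 0) instead of raising an error.
region_kpis = pd.DataFrame({
    "cfr_percent": latest["Kum_antal_avlidna"] / latest["Kum_antal_fall"] * 100,
    "icu_rate_percent": latest["Kum_antal_intensivvårdade"] / latest["Kum_antal_fall"] * 100,
})

print(region_kpis.round(3))
```

Applied to the full `weekly_data_county` dataframe, the same steps would give one row per region for the last reported week.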
