# Delivery 1 - Relative COVID-19 cases over time (absolute COVID-19 cases / population size)

### Notebook Description
This notebook contains the following time series graphs for the selected countries (i.e. United States, Germany, India and Spain):

1) Line graph of total COVID-19 cases over time

2) Line graph of relative COVID-19 cases over time (absolute COVID-19 cases / population size)

3) Line graphs of people vaccinated over time

4) Line graphs of the vaccination rate (percentage of the population)

The dataset for this Covid-19 project is taken from https://covid.ourworldindata.org/data/owid-covid-data.

### The following libraries are being used:

In [1]:
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt

%matplotlib inline

## Loading the dataset for visualization


URL="https://covid.ourworldindata.org/data/owid-covid-data.csv"

The file is downloaded from this URL and read as a CSV file for analysis.

In [2]:
df = pd.read_csv('owid-covid-data.csv')

In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 197484 entries, 0 to 197483
Data columns (total 67 columns):
 #   Column                                      Non-Null Count   Dtype  
---  ------                                      --------------   -----  
 0   iso_code                                    197484 non-null  object 
 1   continent                                   186033 non-null  object 
 2   location                                    197484 non-null  object 
 3   date                                        197484 non-null  object 
 4   total_cases                                 189559 non-null  float64
 5   new_cases                                   189317 non-null  float64
 6   new_cases_smoothed                          188143 non-null  float64
 7   total_deaths                                170942 non-null  float64
 8   new_deaths                                  170911 non-null  float64
 9   new_deaths_smoothed                         169748 non-null  float64
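
The `date` column is reported with dtype `object`, meaning the dates are held as plain strings. For time-series work it can help to parse them while loading. The following is a minimal sketch under that assumption, not part of the original notebook: the helper name `load_owid` is hypothetical, and the small inline sample stands in for the downloaded file.

```python
import io
import pandas as pd

def load_owid(source):
    """Read an OWID-style CSV and parse the `date` column as datetime64."""
    return pd.read_csv(source, parse_dates=["date"])

# Tiny inline sample standing in for 'owid-covid-data.csv'
# (the two rows mirror the first Afghanistan rows shown by df.head()).
sample_csv = io.StringIO(
    "iso_code,location,date,total_cases\n"
    "AFG,Afghanistan,2020-02-24,5.0\n"
    "AFG,Afghanistan,2020-02-25,5.0\n"
)
sample = load_owid(sample_csv)
print(sample.dtypes["date"])
```

With the real file, the same helper could be called as `load_owid('owid-covid-data.csv')`.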
 

## Viewing columns of dataset

In [4]:
df.columns

Index(['iso_code', 'continent', 'location', 'date', 'total_cases', 'new_cases',
       'new_cases_smoothed', 'total_deaths', 'new_deaths',
       'new_deaths_smoothed', 'total_cases_per_million',
       'new_cases_per_million', 'new_cases_smoothed_per_million',
       'total_deaths_per_million', 'new_deaths_per_million',
       'new_deaths_smoothed_per_million', 'reproduction_rate', 'icu_patients',
       'icu_patients_per_million', 'hosp_patients',
       'hosp_patients_per_million', 'weekly_icu_admissions',
       'weekly_icu_admissions_per_million', 'weekly_hosp_admissions',
       'weekly_hosp_admissions_per_million', 'total_tests', 'new_tests',
       'total_tests_per_thousand', 'new_tests_per_thousand',
       'new_tests_smoothed', 'new_tests_smoothed_per_thousand',
       'positive_rate', 'tests_per_case', 'tests_units', 'total_vaccinations',
       'people_vaccinated', 'people_fully_vaccinated', 'total_boosters',
       'new_vaccinations', 'new_vaccinations_smoothed',
       't

In [5]:
df.head()

Unnamed: 0,iso_code,continent,location,date,total_cases,new_cases,new_cases_smoothed,total_deaths,new_deaths,new_deaths_smoothed,...,female_smokers,male_smokers,handwashing_facilities,hospital_beds_per_thousand,life_expectancy,human_development_index,excess_mortality_cumulative_absolute,excess_mortality_cumulative,excess_mortality,excess_mortality_cumulative_per_million
0,AFG,Asia,Afghanistan,2020-02-24,5.0,5.0,,,,,...,,,37.746,0.5,64.83,0.511,,,,
1,AFG,Asia,Afghanistan,2020-02-25,5.0,0.0,,,,,...,,,37.746,0.5,64.83,0.511,,,,
2,AFG,Asia,Afghanistan,2020-02-26,5.0,0.0,,,,,...,,,37.746,0.5,64.83,0.511,,,,
3,AFG,Asia,Afghanistan,2020-02-27,5.0,0.0,,,,,...,,,37.746,0.5,64.83,0.511,,,,
4,AFG,Asia,Afghanistan,2020-02-28,5.0,0.0,,,,,...,,,37.746,0.5,64.83,0.511,,,,


## List of locations (countries and aggregate regions) in the dataset

In [6]:
df['location'].unique()

array(['Afghanistan', 'Africa', 'Albania', 'Algeria', 'Andorra', 'Angola',
       'Anguilla', 'Antigua and Barbuda', 'Argentina', 'Armenia', 'Aruba',
       'Asia', 'Australia', 'Austria', 'Azerbaijan', 'Bahamas', 'Bahrain',
       'Bangladesh', 'Barbados', 'Belarus', 'Belgium', 'Belize', 'Benin',
       'Bermuda', 'Bhutan', 'Bolivia', 'Bonaire Sint Eustatius and Saba',
       'Bosnia and Herzegovina', 'Botswana', 'Brazil',
       'British Virgin Islands', 'Brunei', 'Bulgaria', 'Burkina Faso',
       'Burundi', 'Cambodia', 'Cameroon', 'Canada', 'Cape Verde',
       'Cayman Islands', 'Central African Republic', 'Chad', 'Chile',
       'China', 'Colombia', 'Comoros', 'Congo', 'Cook Islands',
       'Costa Rica', "Cote d'Ivoire", 'Croatia', 'Cuba', 'Curacao',
       'Cyprus', 'Czechia', 'Democratic Republic of Congo', 'Denmark',
       'Djibouti', 'Dominica', 'Dominican Republic', 'Ecuador', 'Egypt',
       'El Salvador', 'Equatorial Guinea', 'Eritrea', 'Estonia',
       'Eswatini', 'Ethi

In [7]:
df['location'].nunique()  # number of distinct locations (includes aggregates such as 'Africa' and 'Asia')

244
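
The count of 244 includes aggregate entries such as 'Africa' and 'Asia', which appear in the list above. One way to count only individual countries is sketched below. It assumes that aggregate rows carry no `continent` value; the notebook does not verify this, although `df.info()` does report fewer non-null `continent` entries than rows. The function name `count_countries` and the toy frame are hypothetical.

```python
import pandas as pd

def count_countries(frame):
    """Count distinct locations that have a continent assigned."""
    has_continent = frame["continent"].notna()
    return frame.loc[has_continent, "location"].nunique()

# Toy data: two countries plus two aggregate rows without a continent (assumed layout).
toy = pd.DataFrame({
    "location": ["Afghanistan", "Afghanistan", "Africa", "Germany", "Asia"],
    "continent": ["Asia", "Asia", None, "Europe", None],
})
print(count_countries(toy))
```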

## Filtering the data with necessary columns

In [60]:
df_filtered_data=df[['date','location','total_cases','people_vaccinated','people_fully_vaccinated','population']]

In [61]:
df_filtered_data.head()

Unnamed: 0,date,location,total_cases,people_vaccinated,people_fully_vaccinated,population
0,2020-02-24,Afghanistan,5.0,,,39835428.0
1,2020-02-25,Afghanistan,5.0,,,39835428.0
2,2020-02-26,Afghanistan,5.0,,,39835428.0
3,2020-02-27,Afghanistan,5.0,,,39835428.0
4,2020-02-28,Afghanistan,5.0,,,39835428.0


## Selecting the required countries from the filtered data

In [62]:
# Keep only the rows for the four selected countries
countries = ['India', 'Spain', 'United States', 'Germany']
df_final_data = df_filtered_data[df_filtered_data['location'].isin(countries)]

In [63]:
df_final_data

Unnamed: 0,date,location,total_cases,people_vaccinated,people_fully_vaccinated,population
66594,2020-01-27,Germany,1.0,,,83900471.0
66595,2020-01-28,Germany,4.0,,,83900471.0
66596,2020-01-29,Germany,4.0,,,83900471.0
66597,2020-01-30,Germany,4.0,,,83900471.0
66598,2020-01-31,Germany,5.0,,,83900471.0
...,...,...,...,...,...,...
186950,2022-06-24,United States,86909476.0,,,332915074.0
186951,2022-06-25,United States,86948848.0,,,332915074.0
186952,2022-06-26,United States,86967399.0,,,332915074.0
186953,2022-06-27,United States,87092233.0,,,332915074.0


### Resetting the index

In [64]:
df_final_data.reset_index(drop=True,inplace=True)

In [65]:
df_final_data

Unnamed: 0,date,location,total_cases,people_vaccinated,people_fully_vaccinated,population
0,2020-01-27,Germany,1.0,,,83900471.0
1,2020-01-28,Germany,4.0,,,83900471.0
2,2020-01-29,Germany,4.0,,,83900471.0
3,2020-01-30,Germany,4.0,,,83900471.0
4,2020-01-31,Germany,5.0,,,83900471.0
...,...,...,...,...,...,...
3528,2022-06-24,United States,86909476.0,,,332915074.0
3529,2022-06-25,United States,86948848.0,,,332915074.0
3530,2022-06-26,United States,86967399.0,,,332915074.0
3531,2022-06-27,United States,87092233.0,,,332915074.0


## Plotting the data


### Line graph: total COVID-19 cases over time for all selected countries

In [71]:
import plotly.express as px
fig = px.line(df_final_data, x="date", y="total_cases", color ='location', title='COVID cases over time',markers=True)
fig.show()
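
Long-format data such as `df_final_data` (one row per date and country) can also be reshaped into one column per country, which makes it easier to inspect the plotted curves as a table. This is a sketch with toy rows, not a step from the notebook; the Germany values come from the output above, the United States values are illustrative only, and the name `wide` is hypothetical.

```python
import pandas as pd

toy = pd.DataFrame({
    "date": ["2020-01-27", "2020-01-28", "2020-01-27", "2020-01-28"],
    "location": ["Germany", "Germany", "United States", "United States"],
    # Germany: values from the notebook output; United States: illustrative only.
    "total_cases": [1.0, 4.0, 5.0, 5.0],
})

# One row per date, one column per country.
wide = toy.pivot(index="date", columns="location", values="total_cases")
print(wide)
```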

### Dividing the total cases by their respective population

In [72]:
df_final_data['Absolute Cases/Population - (%)']=df_final_data['total_cases']*100/df_final_data['population']



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
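
The message above is pandas' SettingWithCopyWarning: the frame being modified was itself produced by filtering another frame, so pandas cannot tell whether the assignment is meant to reach the original data. A common way to avoid it is to take an explicit `.copy()` when filtering. The sketch below shows this together with the percentage calculation; it is one possible fix rather than the notebook's own code, and the names `filtered`, `selected` and `cases_pct` are hypothetical.

```python
import pandas as pd

# Two rows copied from the notebook output (Germany 2020-01-31, United States 2022-06-27).
filtered = pd.DataFrame({
    "location": ["Germany", "United States"],
    "total_cases": [5.0, 87092233.0],
    "population": [83900471.0, 332915074.0],
})

# Filtering followed by .copy() yields an independent frame, so the
# column assignment below does not trigger SettingWithCopyWarning.
selected = filtered[filtered["location"].isin(["Germany", "United States"])].copy()

# Relative cases in percent: total_cases * 100 / population.
selected["cases_pct"] = selected["total_cases"] * 100 / selected["population"]
print(selected)
```

For the United States row this gives 87092233 * 100 / 332915074, roughly 26.16 %, which agrees with the value shown in the next output.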



In [73]:
df_final_data

Unnamed: 0,date,location,total_cases,people_vaccinated,people_fully_vaccinated,population,Absolute Cases/Population - (%)
0,2020-01-27,Germany,1.0,,,83900471.0,0.000001
1,2020-01-28,Germany,4.0,,,83900471.0,0.000005
2,2020-01-29,Germany,4.0,,,83900471.0,0.000005
3,2020-01-30,Germany,4.0,,,83900471.0,0.000005
4,2020-01-31,Germany,5.0,,,83900471.0,0.000006
...,...,...,...,...,...,...,...
3528,2022-06-24,United States,86909476.0,,,332915074.0,26.105600
3529,2022-06-25,United States,86948848.0,,,332915074.0,26.117426
3530,2022-06-26,United States,86967399.0,,,332915074.0,26.122998
3531,2022-06-27,United States,87092233.0,,,332915074.0,26.160496


### Line graph: COVID-19 cases over time as a percentage of the population

In [74]:
fig = px.line(df_final_data, x="date", y="Absolute Cases/Population - (%)", color ='location', title='Relative COVID-19 cases over time (absolute cases / population size)',markers=True)
fig.show()

# Delivery 2 - The vaccination rate (percentage of the population) over time


In [76]:
df_final_data['People_Vaccinated (% of the population)']=df_final_data['people_vaccinated']*100/df_final_data['population']
df_final_data['People_Fully_Vaccinated (% of the population)']=df_final_data['people_fully_vaccinated']*100/df_final_data['population']



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



In [77]:
df_final_data

Unnamed: 0,date,location,total_cases,people_vaccinated,people_fully_vaccinated,population,Absolute Cases/Population - (%),People_Vaccinated (% of the population),People_Fully_Vaccinated (% of the population)
0,2020-01-27,Germany,1.0,,,83900471.0,0.000001,,
1,2020-01-28,Germany,4.0,,,83900471.0,0.000005,,
2,2020-01-29,Germany,4.0,,,83900471.0,0.000005,,
3,2020-01-30,Germany,4.0,,,83900471.0,0.000005,,
4,2020-01-31,Germany,5.0,,,83900471.0,0.000006,,
...,...,...,...,...,...,...,...,...,...
3528,2022-06-24,United States,86909476.0,,,332915074.0,26.105600,,
3529,2022-06-25,United States,86948848.0,,,332915074.0,26.117426,,
3530,2022-06-26,United States,86967399.0,,,332915074.0,26.122998,,
3531,2022-06-27,United States,87092233.0,,,332915074.0,26.160496,,
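
The first and last rows show `NaN` in both vaccination columns, so the vaccination figures are not available for every date. If gaps between reported values should be bridged, one option is to carry the last reported value forward within each country. The notebook does not do this; the following is a sketch under that assumption, using toy values and a hypothetical column name `people_vaccinated_filled`.

```python
import pandas as pd

# Toy data with gaps; the values are illustrative, not taken from the dataset.
toy = pd.DataFrame({
    "location": ["Germany"] * 3 + ["Spain"] * 3,
    "date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-03"] * 2),
    "people_vaccinated": [None, 100.0, None, 50.0, None, None],
})

# Sort so that "forward" means forward in time within each country.
toy = toy.sort_values(["location", "date"])

# Forward-fill per country; values before the first report stay NaN.
toy["people_vaccinated_filled"] = toy.groupby("location")["people_vaccinated"].ffill()
print(toy)
```

Filling per country rather than over the whole frame keeps one country's last value from leaking into the next country's first rows.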


### Plotting the graphs

### Total number of people vaccinated in respective countries

In [87]:
fig = px.line(df_final_data, x="date", y="people_vaccinated", color ='location', title='Total number of people vaccinated',markers=True)
fig.show()

In [89]:
fig = px.line(df_final_data, x="date", y="people_fully_vaccinated", color ='location', title='Total number of people fully vaccinated',markers=True)
fig.show()

### Line graph: vaccination rate (percentage of the population) over time

In [84]:
fig = px.line(df_final_data, x="date", y="People_Vaccinated (% of the population)", color ='location', title='Vaccination rate (percentage of the population) over time',markers=True)
fig.show()

In [85]:
fig = px.line(df_final_data, x="date", y="People_Fully_Vaccinated (% of the population)", color ='location', title='Full Vaccination rate (percentage of the population) over time',markers=True)
fig.show()
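
To complement the curves with a single number per country, the highest percentage reached can be read off with a group-wise maximum. This is a sketch with toy values, assuming the percentage column has been computed as above; the name `peak_rate` is hypothetical and the figures are illustrative, not results from the dataset.

```python
import pandas as pd

col = "People_Fully_Vaccinated (% of the population)"

# Toy data; the percentages are illustrative only.
toy = pd.DataFrame({
    "location": ["Germany", "Germany", "Spain", "Spain"],
    col: [10.0, 75.0, None, 80.0],
})

# max() ignores missing values, so gaps in reporting do not affect the result.
peak_rate = toy.groupby("location")[col].max()
print(peak_rate)
```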

### Thank you