
In [2]:
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go

In [3]:
data = pd.read_csv("transformed_data.csv")
data2 = pd.read_csv("raw_data.csv")
print(data)

      CODE      COUNTRY        DATE    HDI        TC        TD       STI  \
0      AFG  Afghanistan  2019-12-31  0.498  0.000000  0.000000  0.000000   
1      AFG  Afghanistan  2020-01-01  0.498  0.000000  0.000000  0.000000   
2      AFG  Afghanistan  2020-01-02  0.498  0.000000  0.000000  0.000000   
3      AFG  Afghanistan  2020-01-03  0.498  0.000000  0.000000  0.000000   
4      AFG  Afghanistan  2020-01-04  0.498  0.000000  0.000000  0.000000   
...    ...          ...         ...    ...       ...       ...       ...   
50413  ZWE     Zimbabwe  2020-10-15  0.535  8.994048  5.442418  4.341855   
50414  ZWE     Zimbabwe  2020-10-16  0.535  8.996528  5.442418  4.341855   
50415  ZWE     Zimbabwe  2020-10-17  0.535  8.999496  5.442418  4.341855   
50416  ZWE     Zimbabwe  2020-10-18  0.535  9.000853  5.442418  4.341855   
50417  ZWE     Zimbabwe  2020-10-19  0.535  9.005405  5.442418  4.341855   

             POP    GDPCAP  
0      17.477233  7.497754  
1      17.477233  7.497754  


In [5]:
data.head()

Unnamed: 0,CODE,COUNTRY,DATE,HDI,TC,TD,STI,POP,GDPCAP
0,AFG,Afghanistan,2019-12-31,0.498,0.0,0.0,0.0,17.477233,7.497754
1,AFG,Afghanistan,2020-01-01,0.498,0.0,0.0,0.0,17.477233,7.497754
2,AFG,Afghanistan,2020-01-02,0.498,0.0,0.0,0.0,17.477233,7.497754
3,AFG,Afghanistan,2020-01-03,0.498,0.0,0.0,0.0,17.477233,7.497754
4,AFG,Afghanistan,2020-01-04,0.498,0.0,0.0,0.0,17.477233,7.497754


In [6]:
data2.head()

Unnamed: 0,iso_code,location,date,total_cases,total_deaths,stringency_index,population,gdp_per_capita,human_development_index,Unnamed: 9,Unnamed: 10,Unnamed: 11,Unnamed: 12,Unnamed: 13
0,AFG,Afghanistan,2019-12-31,0.0,0.0,0.0,38928341,1803.987,0.498,#NUM!,#NUM!,#NUM!,17.477233,7.497754494
1,AFG,Afghanistan,2020-01-01,0.0,0.0,0.0,38928341,1803.987,0.498,#NUM!,#NUM!,#NUM!,17.477233,7.497754494
2,AFG,Afghanistan,2020-01-02,0.0,0.0,0.0,38928341,1803.987,0.498,#NUM!,#NUM!,#NUM!,17.477233,7.497754494
3,AFG,Afghanistan,2020-01-03,0.0,0.0,0.0,38928341,1803.987,0.498,#NUM!,#NUM!,#NUM!,17.477233,7.497754494
4,AFG,Afghanistan,2020-01-04,0.0,0.0,0.0,38928341,1803.987,0.498,#NUM!,#NUM!,#NUM!,17.477233,7.497754494


In [7]:
data["COUNTRY"].value_counts()

COUNTRY
Afghanistan        294
Indonesia          294
Macedonia          294
Luxembourg         294
Lithuania          294
                  ... 
Tajikistan         172
Comoros            171
Lesotho            158
Hong Kong           51
Solomon Islands      4
Name: count, Length: 210, dtype: int64

In [8]:
data["COUNTRY"].value_counts().mode()

0    294
Name: count, dtype: int64

The mode is 294, meaning most countries have 294 daily records. We will use it as the divisor when averaging each country's human development index, stringency index, and population samples. Now let's create a new dataset by combining the necessary columns from both datasets:

In [9]:
# Aggregating the data

code = data["CODE"].unique().tolist()
country = data["COUNTRY"].unique().tolist()
hdi = []
tc = []
td = []
sti = []
# Starts from the unique (transformed) POP values, one entry per distinct value
population = data["POP"].unique().tolist()

for i in country:
    # Dividing by 294 (the mode of the row counts) gives a true mean only
    # for countries that have exactly 294 rows
    hdi.append((data.loc[data["COUNTRY"] == i, "HDI"]).sum()/294)
    # total_cases and total_deaths are summed over every daily row
    tc.append((data2.loc[data2["location"] == i, "total_cases"]).sum())
    td.append((data2.loc[data2["location"] == i, "total_deaths"]).sum())
    sti.append((data.loc[data["COUNTRY"] == i, "STI"]).sum()/294)
    population.append((data2.loc[data2["location"] == i, "population"]).sum()/294)

# zip() stops at the shortest list, so only the leading entries of
# `population` are paired with the countries
aggregated_data = pd.DataFrame(list(zip(code, country, hdi, tc, td, sti, population)),
                               columns = ["Country Code", "Country", "HDI",
                                          "Total Cases", "Total Deaths",
                                          "Stringency Index", "Population"])
print(aggregated_data.head())

  Country Code      Country       HDI  Total Cases  Total Deaths  \
0          AFG  Afghanistan  0.498000    5126433.0      165875.0   
1          ALB      Albania  0.600765    1071951.0       31056.0   
2          DZA      Algeria  0.754000    4893999.0      206429.0   
3          AND      Andorra  0.659551     223576.0        9850.0   
4          AGO       Angola  0.418952     304005.0       11820.0   

   Stringency Index  Population  
0          3.049673   17.477233  
1          3.005624   14.872537  
2          3.195168   17.596309  
3          2.677654   11.254996  
4          2.965560   17.307957  
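The loop above divides every sum by 294, which underestimates the average for countries with fewer rows (Albania's HDI of 0.600765 in the output is one such case). As a hedged alternative, the sketch below uses `groupby` with a per-country mean. It also takes the maximum of `total_cases` and `total_deaths`, on the assumption that these columns are running cumulative counts, so that the last value rather than the sum is the country total. The toy frame, its country names, and the output column names are made up for illustration.

```python
import pandas as pd

# Toy stand-in for raw_data.csv; names and numbers are hypothetical.
raw = pd.DataFrame({
    "iso_code": ["AAA", "AAA", "AAA", "BBB", "BBB"],
    "location": ["Country A", "Country A", "Country A", "Country B", "Country B"],
    "total_cases": [0.0, 10.0, 25.0, 5.0, 8.0],
    "total_deaths": [0.0, 1.0, 2.0, 0.0, 1.0],
    "stringency_index": [0.0, 50.0, 70.0, 20.0, 40.0],
    "human_development_index": [0.8, 0.8, 0.8, 0.5, 0.5],
})

# One row per country: true means for the indices, and the largest
# cumulative value for cases and deaths (assumes cumulative columns).
summary = (
    raw.groupby(["iso_code", "location"], as_index=False)
       .agg(hdi=("human_development_index", "mean"),
            total_cases=("total_cases", "max"),
            total_deaths=("total_deaths", "max"),
            stringency=("stringency_index", "mean"))
)
print(summary)
```

Because each group is averaged over its own row count, no fixed divisor is needed. Note that this would give different numbers from the output shown above.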


Sort the data according to the total cases of Covid-19:

In [10]:
# Sorting Data According to Total Cases

data = aggregated_data.sort_values(by=["Total Cases"], ascending=False)
print(data.head())

   Country Code   Country       HDI  Total Cases  Total Deaths  \
27          BRA    Brazil  0.759000  425704517.0    14340567.0   
90          IND     India  0.640000  407771615.0     7247327.0   
42          COL  Colombia  0.581847   60543682.0     1936134.0   
92          IRN      Iran  0.798000   52421884.0     2914070.0   
40          CHL     Chile  0.656622   51268034.0     1283880.0   

    Stringency Index  Population  
27          3.136028   19.174732  
90          3.610552   21.045353  
42          3.357923   17.745037  
92          3.207064   18.246243  
40          2.989194   16.766047  


Select the top 10 countries with the highest number of cases:

In [11]:
# Top 10 Countries with Highest Covid Cases

data = data.head(10).copy()  # copy so that adding columns later does not act on a slice
print(data)

   Country Code     Country       HDI  Total Cases  Total Deaths  \
27          BRA      Brazil  0.759000  425704517.0    14340567.0   
90          IND       India  0.640000  407771615.0     7247327.0   
42          COL    Colombia  0.581847   60543682.0     1936134.0   
92          IRN        Iran  0.798000   52421884.0     2914070.0   
40          CHL       Chile  0.656622   51268034.0     1283880.0   
97          ITA       Italy  0.880000   50752853.0     6664225.0   
68          FRA      France  0.901000   50084335.0     5633444.0   
7           ARG   Argentina  0.707143   47155234.0     1077426.0   
73          DEU     Germany  0.936000   42447678.0     1640691.0   
15          BGD  Bangladesh  0.477714   35266178.0      484534.0   

    Stringency Index  Population  
27          3.136028   19.174732  
90          3.610552   21.045353  
42          3.357923   17.745037  
92          3.207064   18.246243  
40          2.989194   16.766047  
97          3.629838   17.917523  
68    
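The sort-then-`head` steps above can also be written as a single call. A minimal sketch on a toy frame (the values are hypothetical), using pandas' `DataFrame.nlargest`:

```python
import pandas as pd

# Toy data; the country labels and counts are made up for illustration.
toy = pd.DataFrame({
    "Country": ["A", "B", "C", "D"],
    "Total Cases": [120.0, 950.0, 30.0, 400.0],
})

# Keep the 2 rows with the largest "Total Cases", in descending order.
top2 = toy.nlargest(2, "Total Cases")
print(top2)
```

When there are no ties, this returns the same rows, in the same order, as `sort_values(by=["Total Cases"], ascending=False).head(2)`.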

Add two more columns (GDP per capita before Covid-19, GDP per capita during Covid-19) to this dataset:

In [12]:
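# Hard-coded GDP per capita figures, assigned by position:
# each list must follow the row order of `data` shown above
data["GDP Before Covid"] = [65279.53, 8897.49, 2100.75,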
                            11497.65, 7027.61, 9946.03,
                            29564.74, 6001.40, 6424.98, 42354.41]
data["GDP During Covid"] = [63543.58, 6796.84, 1900.71,
                            10126.72, 6126.87, 8346.70,
                            27057.16, 5090.72, 5332.77, 40284.64]
print(data)

   Country Code     Country       HDI  Total Cases  Total Deaths  \
27          BRA      Brazil  0.759000  425704517.0    14340567.0   
90          IND       India  0.640000  407771615.0     7247327.0   
42          COL    Colombia  0.581847   60543682.0     1936134.0   
92          IRN        Iran  0.798000   52421884.0     2914070.0   
40          CHL       Chile  0.656622   51268034.0     1283880.0   
97          ITA       Italy  0.880000   50752853.0     6664225.0   
68          FRA      France  0.901000   50084335.0     5633444.0   
7           ARG   Argentina  0.707143   47155234.0     1077426.0   
73          DEU     Germany  0.936000   42447678.0     1640691.0   
15          BGD  Bangladesh  0.477714   35266178.0      484534.0   

    Stringency Index  Population  GDP Before Covid  GDP During Covid  
27          3.136028   19.174732          65279.53          63543.58  
90          3.610552   21.045353           8897.49           6796.84  
42          3.357923   17.745037      
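Assigning a plain list relies on the values being typed in exactly the same order as the rows. One way to make this less fragile, sketched here with hypothetical country names and figures, is to key the values by country and use `Series.map`:

```python
import pandas as pd

# Toy frame with a non-default index, like the sorted top-10 table.
top = pd.DataFrame({"Country": ["Country B", "Country A"]}, index=[7, 3])

# Hypothetical figures keyed by country name, in any order.
gdp_before = {"Country A": 1000.0, "Country B": 2000.0}

# map() looks each country up by name, so row order does not matter.
top["GDP Before Covid"] = top["Country"].map(gdp_before)
print(top)
```

A country missing from the dictionary would get `NaN`, which makes a gap visible rather than silently shifting values to the wrong row.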

Analyze the spread of covid-19 across the ten countries with the highest number of covid-19 cases:

In [13]:
figure = px.bar(data, y='Total Cases', x='Country',
            title="Countries with Highest Covid Cases")
figure.show()

Total number of deaths among the countries with the highest number of covid-19 cases:

In [14]:
figure = px.bar(data, y='Total Deaths', x='Country',
            title="Countries with Highest Deaths")
figure.show()

Just like the total number of covid-19 cases, Brazil is leading in the deaths, with India and Italy in the second and third positions. One thing to notice here is that the death rate in India, Argentina, and Bangladesh is comparatively low relative to the total number of cases. Now let's compare the total number of cases and total deaths in all these countries:

In [15]:
fig = go.Figure()
fig.add_trace(go.Bar(
    x=data["Country"],
    y=data["Total Cases"],
    name='Total Cases',
    marker_color='indianred'
))
fig.add_trace(go.Bar(
    x=data["Country"],
    y=data["Total Deaths"],
    name='Total Deaths',
    marker_color='lightsalmon'
))
fig.update_layout(barmode='group', xaxis_tickangle=-45)
fig.show()

The percentage of total deaths and total cases among all the countries with the highest number of covid-19 cases:

In [16]:
# Percentage of Total Cases and Deaths
cases = data["Total Cases"].sum()
deceased = data["Total Deaths"].sum()

labels = ["Total Cases", "Total Deaths"]
values = [cases, deceased]

fig = px.pie(values=values, names=labels,
             title='Percentage of Total Cases and Deaths', hole=0.5)
fig.show()

Below is how you can calculate the death rate of Covid-19 cases:

In [17]:
death_rate = (data["Total Deaths"].sum() / data["Total Cases"].sum()) * 100
print("Death Rate = ", death_rate)

Death Rate =  3.5329191090118237
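The figure above is a single rate pooled over all ten countries. As a sketch, the same ratio can be computed per country. The three rows below are copied from the top-10 table printed earlier, and the `Death Rate (%)` column name is an assumption introduced here.

```python
import pandas as pd

# Three rows taken from the top-10 table in this notebook.
sample = pd.DataFrame({
    "Country": ["Brazil", "India", "Italy"],
    "Total Cases": [425704517.0, 407771615.0, 50752853.0],
    "Total Deaths": [14340567.0, 7247327.0, 6664225.0],
})

# Deaths as a percentage of cases, computed row by row.
sample["Death Rate (%)"] = sample["Total Deaths"] / sample["Total Cases"] * 100
print(sample.sort_values("Death Rate (%)", ascending=False))
```

On these rows the ratio differs widely between countries, which the pooled figure hides.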


Another important column in this dataset is the stringency index. It is a composite measure of response indicators, including school closures, workplace closures, and travel bans. It shows how strictly countries are following these measures to control the spread of covid-19:

In [18]:
fig = px.bar(data, x='Country', y='Total Cases',
             hover_data=['Population', 'Total Deaths'],
             color='Stringency Index', height=400,
             title= "Stringency Index during Covid-19")
fig.show()

Analyzing Covid-19 Impacts on Economy

In [19]:
fig = px.bar(data, x='Country', y='Total Cases',
             hover_data=['Population', 'Total Deaths'],
             color='GDP Before Covid', height=400,
             title="GDP Per Capita Before Covid-19")
fig.show()

The GDP per capita during the rise in the cases of covid-19:

In [20]:
fig = px.bar(data, x='Country', y='Total Cases',
             hover_data=['Population', 'Total Deaths'],
             color='GDP During Covid', height=400,
             title="GDP Per Capita During Covid-19")
fig.show()

Compare the GDP per capita before covid-19 and during covid-19 to have a look at the impact of covid-19 on the GDP per capita:

In [21]:
fig = go.Figure()
fig.add_trace(go.Bar(
    x=data["Country"],
    y=data["GDP Before Covid"],
    name='GDP Per Capita Before Covid-19',
    marker_color='indianred'
))
fig.add_trace(go.Bar(
    x=data["Country"],
    y=data["GDP During Covid"],
    name='GDP Per Capita During Covid-19',
    marker_color='lightsalmon'
))
fig.update_layout(barmode='group', xaxis_tickangle=-45)
fig.show()
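The grouped bars show both levels side by side. To put a number on the drop, the relative change can be computed directly. This is a minimal sketch with made-up figures and placeholder country names. Only the two GDP column names come from the notebook, and `GDP Change (%)` is a name introduced here.

```python
import pandas as pd

# Hypothetical GDP per capita figures for illustration only.
gdp = pd.DataFrame({
    "Country": ["Country A", "Country B"],
    "GDP Before Covid": [1000.0, 2000.0],
    "GDP During Covid": [900.0, 1500.0],
})

# Percentage change relative to the pre-Covid level (negative = drop).
gdp["GDP Change (%)"] = (
    (gdp["GDP During Covid"] - gdp["GDP Before Covid"])
    / gdp["GDP Before Covid"] * 100
)
print(gdp.sort_values("GDP Change (%)"))
```

Sorting by the change column lists the largest drops first.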

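Another economic factor is the human development index (HDI), a composite of life expectancy, education, and per capita income indicators. The chart below colors each country's bar by its HDI: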

In [22]:
fig = px.bar(data, x='Country', y='Total Cases',
             hover_data=['Population', 'Total Deaths'],
             color='HDI', height=400,
             title="Human Development Index during Covid-19")
fig.show()