# **COVID-19 GLOBAL DATASET ANALYSIS**



Coronavirus disease 2019 (COVID-19), the respiratory ailment responsible for the COVID-19 pandemic, is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).

This virus has spread across the globe since it was first identified in Wuhan, China, in December 2019. Some people prefer to focus on the positive changes the pandemic has brought about. However, the collateral harm caused by this pandemic should not be underestimated.

This dataset is taken from worldometers.info.

This dataset includes 225 nations.

Most countries have records dated 2020-2-15 to 2022-05-14 (820 days per country).
The exceptions are China, which has data from 2020-1-22 to 2022-05-14 (844 days), and Palau, which has records from 2021-8-25 to 2022-05-14 (263 days).
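
The coverage figures above can be checked once the daily file is loaded. The following is a minimal sketch on a tiny hand-made frame rather than the real CSV; the `sample` rows and the `coverage` name are illustrative assumptions, not part of the dataset.

```python
import pandas as pd

# Toy stand-in for the daily file; the rows are made up for illustration.
sample = pd.DataFrame({
    "date": ["2020-02-15", "2020-02-16", "2020-02-17", "2020-01-22", "2020-01-23"],
    "country": ["Afghanistan", "Afghanistan", "Afghanistan", "China", "China"],
})
sample["date"] = pd.to_datetime(sample["date"])

# First date, last date, and number of records for each country.
coverage = sample.groupby("country")["date"].agg(["min", "max", "count"])
print(coverage)
```

Running the same aggregation on `df_daily` (after its `date` column is converted) should show which countries deviate from the common date range.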

In [109]:
import pandas as pd

Let's import our two datasets.

In [110]:
df_daily=pd.read_csv('worldometer_coronavirus_daily_data.csv')

In [111]:
df_summ=pd.read_csv('worldometer_coronavirus_summary_data.csv')

**Checking for duplicate values in dataset**

In [114]:
df_summ.duplicated().value_counts()

False    226
dtype: int64

In [115]:
df_daily.duplicated().value_counts()

False    184787
dtype: int64

Neither of our datasets has duplicate rows.
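
Note that `duplicated()` without arguments only flags rows that match in every column. As a hedged sketch of a stricter check, the `subset` parameter can restrict the comparison to the key columns, here assumed to be `date` and `country`. The toy rows below are invented for illustration.

```python
import pandas as pd

# Two rows share the same (date, country) key but differ in the count column.
sample = pd.DataFrame({
    "date": ["2020-02-15", "2020-02-15", "2020-02-16"],
    "country": ["UK", "UK", "UK"],
    "daily_new_cases": [0.0, 1.0, 0.0],
})

# Rows identical in every column.
full_row_dupes = sample.duplicated().sum()

# Rows that repeat an earlier (date, country) pair.
key_dupes = sample.duplicated(subset=["date", "country"]).sum()

print(full_row_dupes, key_dupes)
```

On the toy frame the full-row check finds nothing while the key-based check finds one repeated pair, which is the kind of case the default check would miss.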

**Checking for Null Values in our two datasets**

In [116]:
df_daily.isna().sum()

date                           0
country                        0
cumulative_total_cases         0
daily_new_cases            10458
active_cases               18040
cumulative_total_deaths     6560
daily_new_deaths           26937
dtype: int64

In [117]:
df_summ.isna().sum()

country                            0
continent                          0
total_confirmed                    0
total_deaths                       8
total_recovered                   22
active_cases                      22
serious_or_critical               81
total_cases_per_1m_population      0
total_deaths_per_1m_population     8
total_tests                       14
total_tests_per_1m_population     14
population                         0
dtype: int64

Both of our datasets have many null values. Let's replace these null values with 0, since doing so won't affect the cumulative counts in our dataset.
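
A small sketch of what this replacement does and does not change, using an invented column of four values. Sums are unaffected because pandas skips nulls when summing by default, but averages are computed over more rows once the nulls become zeros.

```python
import numpy as np
import pandas as pd

# Toy column with two missing entries (values are made up).
toy = pd.DataFrame({"daily_new_deaths": [2.0, np.nan, 1.0, np.nan]})
filled = toy.fillna(0)

total_before = toy["daily_new_deaths"].sum()    # nulls are skipped by default
total_after = filled["daily_new_deaths"].sum()  # zeros add nothing

mean_before = toy["daily_new_deaths"].mean()    # average over the 2 known values
mean_after = filled["daily_new_deaths"].mean()  # average over all 4 rows

print(total_before, total_after, mean_before, mean_after)
```

This is worth keeping in mind when reading the means in the `describe()` output later on.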

In [118]:
df_daily.fillna(value=0,inplace=True)

In [119]:
df_summ.fillna(value=0,inplace=True)

In [120]:
df_daily.isna().sum()

date                       0
country                    0
cumulative_total_cases     0
daily_new_cases            0
active_cases               0
cumulative_total_deaths    0
daily_new_deaths           0
dtype: int64

In [121]:
df_summ.isna().sum()

country                           0
continent                         0
total_confirmed                   0
total_deaths                      0
total_recovered                   0
active_cases                      0
serious_or_critical               0
total_cases_per_1m_population     0
total_deaths_per_1m_population    0
total_tests                       0
total_tests_per_1m_population     0
population                        0
dtype: int64

Our datasets now contain no null values

**Data Exploration**

In [122]:
print(df_daily.head())

        date      country  cumulative_total_cases  daily_new_cases  \
0  2020-2-15  Afghanistan                     0.0              0.0   
1  2020-2-16  Afghanistan                     0.0              0.0   
2  2020-2-17  Afghanistan                     0.0              0.0   
3  2020-2-18  Afghanistan                     0.0              0.0   
4  2020-2-19  Afghanistan                     0.0              0.0   

   active_cases  cumulative_total_deaths  daily_new_deaths  
0           0.0                      0.0               0.0  
1           0.0                      0.0               0.0  
2           0.0                      0.0               0.0  
3           0.0                      0.0               0.0  
4           0.0                      0.0               0.0  


In [123]:
print(df_summ.head())

       country continent  total_confirmed  total_deaths  total_recovered  \
0  Afghanistan      Asia           179267        7690.0         162202.0   
1      Albania    Europe           275574        3497.0         271826.0   
2      Algeria    Africa           265816        6875.0         178371.0   
3      Andorra    Europe            42156         153.0          41021.0   
4       Angola    Africa            99194        1900.0          97149.0   

   active_cases  serious_or_critical  total_cases_per_1m_population  \
0        9375.0               1124.0                           4420   
1         251.0                  2.0                          95954   
2       80570.0                  6.0                           5865   
3         982.0                 14.0                         543983   
4         145.0                  0.0                           2853   

   total_deaths_per_1m_population  total_tests  total_tests_per_1m_population  \
0                           190.0  

In [124]:
print(df_daily.tail())

             date   country  cumulative_total_cases  daily_new_cases  \
184782  2022-5-10  Zimbabwe                248642.0            106.0   
184783  2022-5-11  Zimbabwe                248778.0            136.0   
184784  2022-5-12  Zimbabwe                248943.0            165.0   
184785  2022-5-13  Zimbabwe                249131.0            188.0   
184786  2022-5-14  Zimbabwe                249206.0             75.0   

        active_cases  cumulative_total_deaths  daily_new_deaths  
184782         963.0                   5481.0               2.0  
184783        1039.0                   5481.0               0.0  
184784        1158.0                   5481.0               0.0  
184785        1283.0                   5482.0               1.0  
184786        1307.0                   5482.0               0.0  


In [125]:
print(df_summ.tail())

                       country          continent  total_confirmed  \
221  Wallis And Futuna Islands  Australia/Oceania              454   
222             Western Sahara             Africa               10   
223                      Yemen               Asia            11819   
224                     Zambia             Africa           320591   
225                   Zimbabwe             Africa           249206   

     total_deaths  total_recovered  active_cases  serious_or_critical  \
221           7.0            438.0           9.0                  0.0   
222           1.0              9.0           0.0                  0.0   
223        2149.0           9009.0         661.0                 23.0   
224        3983.0         315997.0         611.0                  0.0   
225        5482.0         242417.0        1307.0                 12.0   

     total_cases_per_1m_population  total_deaths_per_1m_population  \
221                          41755                           644.0   


In [126]:
print(df_daily.shape)

(184787, 7)


In [127]:
print(df_summ.shape)

(226, 12)


The daily data set has 184787 rows and 7 columns while the summary dataset has 226 rows and 12 columns

Let's convert the date column in df_daily to a datetime object, since this lets us perform many more operations in our data analysis.

In [128]:
print(df_daily.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 184787 entries, 0 to 184786
Data columns (total 7 columns):
 #   Column                   Non-Null Count   Dtype  
---  ------                   --------------   -----  
 0   date                     184787 non-null  object 
 1   country                  184787 non-null  object 
 2   cumulative_total_cases   184787 non-null  float64
 3   daily_new_cases          184787 non-null  float64
 4   active_cases             184787 non-null  float64
 5   cumulative_total_deaths  184787 non-null  float64
 6   daily_new_deaths         184787 non-null  float64
dtypes: float64(5), object(2)
memory usage: 9.9+ MB
None


In [129]:
df_daily['date']=pd.to_datetime(df_daily['date'],format='%Y-%m-%d')

In [130]:
print(df_daily.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 184787 entries, 0 to 184786
Data columns (total 7 columns):
 #   Column                   Non-Null Count   Dtype         
---  ------                   --------------   -----         
 0   date                     184787 non-null  datetime64[ns]
 1   country                  184787 non-null  object        
 2   cumulative_total_cases   184787 non-null  float64       
 3   daily_new_cases          184787 non-null  float64       
 4   active_cases             184787 non-null  float64       
 5   cumulative_total_deaths  184787 non-null  float64       
 6   daily_new_deaths         184787 non-null  float64       
dtypes: datetime64[ns](1), float64(5), object(1)
memory usage: 9.9+ MB
None
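
To illustrate what the datetime dtype makes possible, here is a minimal sketch on three rows taken from the UK tail of the daily data. The `toy` frame and the `weekday` column are assumptions added for the example.

```python
import pandas as pd

toy = pd.DataFrame({
    "date": ["2022-05-12", "2022-05-13", "2022-05-14"],
    "daily_new_cases": [14458.0, 190.0, 0.0],
})
toy["date"] = pd.to_datetime(toy["date"])

# Date components become available through the .dt accessor.
toy["weekday"] = toy["date"].dt.day_name()

# Datetime columns can be compared against a date string.
recent = toy[toy["date"] >= "2022-05-13"]

print(toy)
print(len(recent))
```

With an `object` column, neither the `.dt` accessor nor a reliable chronological comparison would be available.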


In [25]:
print(df_summ.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 226 entries, 0 to 225
Data columns (total 12 columns):
 #   Column                          Non-Null Count  Dtype  
---  ------                          --------------  -----  
 0   country                         226 non-null    object 
 1   continent                       226 non-null    object 
 2   total_confirmed                 226 non-null    int64  
 3   total_deaths                    226 non-null    float64
 4   total_recovered                 226 non-null    float64
 5   active_cases                    226 non-null    float64
 6   serious_or_critical             226 non-null    float64
 7   total_cases_per_1m_population   226 non-null    int64  
 8   total_deaths_per_1m_population  226 non-null    float64
 9   total_tests                     226 non-null    float64
 10  total_tests_per_1m_population   226 non-null    float64
 11  population                      226 non-null    int64  
dtypes: float64(7), int64(3), object(2)
m

In [131]:
df_daily.describe()

Unnamed: 0,cumulative_total_cases,daily_new_cases,active_cases,cumulative_total_deaths,daily_new_deaths
count,184787.0,184787.0,184787.0,184787.0,184787.0
mean,725108.9,2818.548507,56301.67,13393.04,34.025418
std,3681471.0,17305.881476,376215.7,59467.24,167.972177
min,0.0,-322.0,-14321.0,0.0,-39.0
25%,1099.0,0.0,9.0,12.0,0.0
50%,17756.0,41.0,811.0,258.0,0.0
75%,223808.5,636.0,11479.0,3696.0,8.0
max,84209470.0,909610.0,17935430.0,1026646.0,5093.0


In [132]:
df_summ.describe()

Unnamed: 0,total_confirmed,total_deaths,total_recovered,active_cases,serious_or_critical,total_cases_per_1m_population,total_deaths_per_1m_population,total_tests,total_tests_per_1m_population,population
count,226.0,226.0,226.0,226.0,226.0,226.0,226.0,226.0,226.0,226.0
mean,2305651.0,27823.38,2037158.0,61931.42,172.89823,148156.809735,1116.575221,28023820.0,1824185.0,34955210.0
std,7575510.0,98069.42,7262591.0,224185.1,718.311893,155202.909225,1210.214694,104799500.0,3247665.0,139033800.0
min,2.0,0.0,0.0,0.0,0.0,16.0,0.0,0.0,0.0,805.0
25%,24126.0,187.5,8559.5,60.5,0.0,11748.25,125.0,239449.8,91648.75,560512.5
50%,179375.0,1946.5,77871.0,1172.0,4.0,98271.5,709.5,1792364.0,665769.0,5800570.0
75%,1090902.0,13233.75,858169.5,14684.0,42.75,255632.75,1842.75,11178360.0,1987618.0,21872840.0
max,84209470.0,1026646.0,81244260.0,1938567.0,8318.0,704302.0,6297.0,1016883000.0,21842470.0,1439324000.0


Descriptive statistics summarize the central tendency, dispersion and shape of the distributions in our two datasets.
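
The `describe()` output for the daily data reports negative minimums (-322 for `daily_new_cases`, -14321 for `active_cases`, -39 for `daily_new_deaths`), which cannot be literal counts and may reflect corrections in the source. A hedged sketch for locating such rows, shown on an invented frame:

```python
import pandas as pd

# Toy data with two negative entries (values are made up).
toy = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "daily_new_cases": [5.0, -3.0, 0.0, 7.0],
    "daily_new_deaths": [0.0, 1.0, -1.0, 0.0],
})

count_cols = ["daily_new_cases", "daily_new_deaths"]

# Keep every row where at least one of the count columns is below zero.
negative_rows = toy[(toy[count_cols] < 0).any(axis=1)]
print(negative_rows)
```

Applied to `df_daily`, the same mask would list the countries and dates behind the negative minimums.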

**Derive Insights from our Data**

In [133]:
continent_cases=df_summ.groupby('continent')[['total_confirmed','total_deaths','total_recovered','active_cases','serious_or_critical']].sum()

In [134]:
print(continent_cases.head())

                   total_confirmed  total_deaths  total_recovered  \
continent                                                           
Africa                    12042400      254319.0       10137200.0   
Asia                     149999659     1427939.0      126145273.0   
Australia/Oceania          7942867       11413.0        7403813.0   
Europe                   194330079     1830655.0      170861871.0   
North America             99625662     1467234.0       94818163.0   

                   active_cases  serious_or_critical  
continent                                             
Africa                 497766.0                966.0  
Asia                  3260318.0              11768.0  
Australia/Oceania      455469.0                162.0  
Europe                5841832.0               8050.0  
North America         3329149.0               7460.0  


In [135]:
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

**1. Total number of confirmed COVID-19 cases by Continents**

In [42]:
continent_total_confirmed_cases=continent_cases['total_confirmed'].sort_values(ascending=False)
labels=['Continents','Total number of confirmed cases']
bars=px.bar(x=continent_total_confirmed_cases.index,y=continent_total_confirmed_cases.values,color=continent_total_confirmed_cases.index,title='Total number of confirmed COVID-19 cases by Continents',labels={
    'x':'Continents',
    'y':'Total number of confirmed COVID-19 cases',
    'color':'Continents'
})
bars.show()

**2. Total number of COVID-19 deaths by Continents**

In [43]:
continent_death_cases=continent_cases['total_deaths'].sort_values(ascending=False)
labels=['Continents','Total number of deaths']
bars=px.bar(x=continent_death_cases.index,y=continent_death_cases.values,color=continent_death_cases.index,title='Total number of COVID-19 deaths by Continents',labels={
    'x':'Continents',
    'y':'Total number of COVID-19 deaths',
    'color':'Continents'
})
bars.show()

**3. Total number of active COVID-19 cases by Continents**

In [44]:
continent_active_cases=continent_cases['active_cases'].sort_values(ascending=False)
labels=['Continents','Total number of active cases']
bars=px.bar(x=continent_active_cases.index,y=continent_active_cases.values,color=continent_active_cases.index,title='Total number of active COVID-19 cases by Continents',labels={
    'x':'Continents',
    'y':'Total number of active COVID-19 cases',
    'color':'Continents'
})
bars.show()

**4. Ratio of Recovery of Continents from COVID-19 per Total number of confirmed cases**

In [45]:
ratio_recovered=(continent_cases['total_recovered']/continent_cases['total_confirmed'])*100
print(ratio_recovered)

continent
Africa               84.179233
Asia                 84.097040
Australia/Oceania    93.213357
Europe               87.923533
North America        95.174437
South America        89.314757
dtype: float64
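
The percentages above are plain column divisions. As a worked check, the sketch below rebuilds the Africa and Asia rows from the continent totals printed earlier and recomputes the share of confirmed cases that recovered and that died. The `totals` frame and the rounding to two decimals are choices made for this example.

```python
import pandas as pd

# Africa and Asia totals copied from the continent_cases output.
totals = pd.DataFrame(
    {
        "total_confirmed": [12042400, 149999659],
        "total_deaths": [254319.0, 1427939.0],
        "total_recovered": [10137200.0, 126145273.0],
    },
    index=pd.Index(["Africa", "Asia"], name="continent"),
)

# Share of confirmed cases, expressed as a percentage.
recovery_pct = (totals["total_recovered"] / totals["total_confirmed"] * 100).round(2)
death_pct = (totals["total_deaths"] / totals["total_confirmed"] * 100).round(2)

print(recovery_pct)
print(death_pct)
```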


In [47]:
pie=px.pie(
    names=ratio_recovered.index,
    title='Ratio of Recovery from COVID-19 per Total number of confirmed cases',
    labels=ratio_recovered.index,
    values=ratio_recovered.values,

)
pie.update_traces(textinfo='label + percent',textposition='outside')
pie.show()



Support for multi-dimensional indexing (e.g. `obj[:, None]`) is deprecated and will be removed in a future version.  Convert to a numpy array before indexing instead.



**5. Ratio of deaths of Continents from COVID-19 per Total number of confirmed cases**

In [48]:
ratio_dead=(continent_cases['total_deaths']/continent_cases['total_confirmed'])*100
print(ratio_dead)

continent
Africa               2.111863
Asia                 0.951961
Australia/Oceania    0.143689
Europe               0.942034
North America        1.472747
South America        2.269168
dtype: float64


In [49]:
pie=px.pie(
    names=ratio_dead.index,
    title='Ratio of deaths from COVID-19 per Total number of cases',
    labels=ratio_dead.index,
    values=ratio_dead.values,

)
pie.update_traces(textinfo='label + percent',textposition='outside')
pie.show()





In [50]:
country_cases=df_summ.groupby('country')[['total_confirmed','total_deaths','total_recovered','active_cases','serious_or_critical']].sum()
print(country_cases)

                           total_confirmed  total_deaths  total_recovered  \
country                                                                     
Afghanistan                         179267        7690.0         162202.0   
Albania                             275574        3497.0         271826.0   
Algeria                             265816        6875.0         178371.0   
Andorra                              42156         153.0          41021.0   
Angola                               99194        1900.0          97149.0   
...                                    ...           ...              ...   
Wallis And Futuna Islands              454           7.0            438.0   
Western Sahara                          10           1.0              9.0   
Yemen                                11819        2149.0           9009.0   
Zambia                              320591        3983.0         315997.0   
Zimbabwe                            249206        5482.0         242417.0   

In [51]:
country_confirmed_cases=country_cases['total_confirmed'].sort_values(ascending=False)
print(country_confirmed_cases.head(10))

country
USA            84209473
India          43121599
Brazil         30682094
France         29160802
Germany        25780226
UK             22159805
Russia         18260293
South Korea    17782061
Italy          17057873
Turkey         15053168
Name: total_confirmed, dtype: int64


**6. Choropleth of confirmed COVID-19 cases by Countries**

In [56]:
from plotly import graph_objects as go
fig = go.Figure(data = go.Choropleth(locations = df_summ['country'],locationmode = 'country names',z = df_summ['total_confirmed'],text = df_summ['country'],colorscale = 'Viridis',colorbar_title = 'Total Confirmed Cases',))

fig.update_layout(title_text = 'Total Confirmed Covid Cases by Countries',geo = dict(showframe = False,showcoastlines = True,projection_type = 'equirectangular'),)

fig.show()

**7. Choropleth of confirmed COVID-19 deaths by Countries**

In [53]:
from plotly import graph_objects as go
fig = go.Figure(data = go.Choropleth(locations = df_summ['country'],locationmode = 'country names',z = df_summ['total_deaths'],text = df_summ['country'],colorscale = 'portland', colorbar_title = 'Total Death Cases'),)

fig.update_layout(title_text = 'Total number of confirmed COVID-19 deaths by Countries',geo = dict(showframe = False,showcoastlines = True,projection_type = 'equirectangular'),)

fig.show()

**8. Choropleth of active COVID-19 cases by Countries**

In [54]:
from plotly import graph_objects as go
fig = go.Figure(data = go.Choropleth(locations = df_summ['country'],locationmode = 'country names',z = df_summ['active_cases'],text = df_summ['country'],colorscale = 'greens',colorbar_title = 'Total Active Cases',))

fig.update_layout(title_text = 'Total active COVID-19 cases by Countries',geo = dict(showframe = False,showcoastlines = True,projection_type = 'equirectangular'),)

fig.show()

**9. Top 10 Countries with highest Number of confirmed COVID-19 cases**

In [57]:
top_10_coun_high=country_confirmed_cases.iloc[0:10]

In [58]:
plt=px.bar(x=top_10_coun_high.index,y=top_10_coun_high.values,color=top_10_coun_high.index,
           title='Top 10 Countries with highest Number of confirmed COVID-19 cases',labels={
               'x':'Countries',
               'y':'Total Number of Confirmed Cases',
               'color':'Countries'
           })
plt.show()

**10. Top 10 Countries with lowest number of confirmed COVID-19 cases**

In [59]:
# Slice through the end of the sorted Series so that all 10 lowest countries are included
top_10_coun_low=country_confirmed_cases.iloc[-10:]

In [61]:
plt4=px.bar(x=top_10_coun_low.index,y=top_10_coun_low.values,color=top_10_coun_low.index,title='Top 10 Countries with lowest number of confirmed COVID-19 Cases',
           labels={
               'x':'Countries',
               'y':'Total Number of Confirmed Cases',
               'color':'Countries'
           })
plt4.show()
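
When taking the bottom entries of a Series sorted in descending order, the stop index of the slice decides whether the very last entry is included. A minimal sketch on an invented five-item Series, with `nsmallest` shown as an alternative that does not depend on the sort order:

```python
import pandas as pd

# Already sorted in descending order (values are made up).
cases = pd.Series(
    [50, 40, 30, 20, 10],
    index=["A", "B", "C", "D", "E"],
    name="total_confirmed",
)

last_three = cases.iloc[-3:]          # includes the final (smallest) entry
last_without_end = cases.iloc[-3:-1]  # a stop of -1 leaves the final entry out
smallest_three = cases.nsmallest(3)   # smallest first, regardless of prior sorting

print(list(last_three.index))
print(list(last_without_end.index))
print(list(smallest_three.index))
```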

**11. Top 10 Countries with highest number of deaths from COVID-19**

In [62]:
country_death_cases=country_cases['total_deaths'].sort_values(ascending=False)
top_coun_death=country_death_cases.iloc[0:10]

In [64]:
plt=go.Bar(x=top_coun_death.index,y=top_coun_death.values)
data = [plt]
layout = go.Layout(
    title='Top 10 Countries with highest number of Deaths from COVID-19',
    xaxis=dict(title='Countries'),
    yaxis=dict(title='Number of Deaths')
)

plt = go.Figure(data=data, layout=layout)
plt.show()

**12. Top 10 Countries with highest number of active COVID-19 cases**

In [65]:
country_active_cases=country_cases['active_cases'].sort_values(ascending=False)
top_coun_active_high=country_active_cases.iloc[0:10]

In [67]:
plt=px.bar(x=top_coun_active_high.index,y=top_coun_active_high.values, title='Top 10 Countries with highest number Of active COVID-19 cases' ,labels={
    'y':'Number of Active Cases',
    'x':'Countries'
})
plt.show()

**13. Top 10 Countries that conducted the highest number of COVID-19 tests**

In [68]:
countries_test=df_summ.groupby('country')[['total_tests','population']].sum()
ratio_test_per_country=(countries_test['total_tests']/countries_test['population'])*100
countries_test_top=countries_test['total_tests'].sort_values(ascending=False)
countries_test_top=countries_test_top.iloc[0:10]

In [69]:
plt9=px.bar(x=countries_test_top.index, y=countries_test_top.values, title="Top 10 Countries that conducted the highest number of COVID-19 tests",
           labels={
               'x':'Countries',
               'y':'Total Number of Tests Conducted'
           })
plt9.show()

**14. Top 10 Countries with highest COVID-19 testing ratios per Population**

In [70]:
ratio_test_per_country=ratio_test_per_country.sort_values(ascending=False)

In [71]:
ratio_test_per_country=ratio_test_per_country.iloc[0:10]
print(ratio_test_per_country)

country
Denmark                     2184.247220
Austria                     2032.880138
Gibraltar                   1586.727845
Faeroe Islands              1581.011604
United Arab Emirates        1573.262826
Bermuda                     1445.400162
Turks And Caicos Islands    1273.188552
Spain                       1006.735197
Saint Barthelemy             791.764824
Greece                       786.520937
dtype: float64
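
Because the ratio is `total_tests / population * 100`, it reads as tests per 100 people, so a value above 100 means more tests were run than there are residents. Denmark's 2184.247220, for instance, corresponds to roughly 21.8 tests per person. A small sketch with two hypothetical countries, `X` and `Y`, whose numbers are invented:

```python
import pandas as pd

# Hypothetical countries: X tested more often than it has residents, Y did not.
toy = pd.DataFrame(
    {"total_tests": [300.0, 50.0], "population": [100, 200]},
    index=pd.Index(["X", "Y"], name="country"),
)

# Same formula as in the notebook: tests per 100 people.
tests_per_100 = toy["total_tests"] / toy["population"] * 100

# Dividing by 100 again gives tests per person.
tests_per_person = tests_per_100 / 100

print(tests_per_100)
print(tests_per_person)
```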


In [73]:
pie=px.pie(
    labels=ratio_test_per_country.index,
    values=ratio_test_per_country.values,
    names=ratio_test_per_country.index,
    title='Top 10 Countries with Highest COVID-19 Testing Ratio Per Population'
)
pie.update_traces(textposition='outside',textinfo='percent + label')

pie.show()





**15. Top 10 Countries with Highest COVID-19 Death Ratio per Population**

In [76]:
countries_death=df_summ.groupby('country')[['total_deaths','population']].sum()
ratio_death_per_country=(countries_death['total_deaths']/countries_death['population'])*100
ratio_death_per_country=ratio_death_per_country.sort_values(ascending=False)
ratio_death_per_country=ratio_death_per_country.iloc[0:10]
pie=px.pie(
    labels=ratio_death_per_country.index,
    values=ratio_death_per_country.values,
    names=ratio_death_per_country.index,
    title='Top 10 Countries with Highest COVID-19 Death Ratio Per Population'
)
pie.update_traces(textposition='outside',textinfo='percent + label')

pie.show()





**Analyzing COVID-19 Stats for European Countries**

In [77]:
europe_data=df_summ[df_summ['continent']=='Europe']

In [78]:
# Sum only the numeric columns; 'continent' is text and is already fixed to 'Europe' by the filter above
europe_data=europe_data.groupby('country')[['total_confirmed','total_deaths','total_recovered','active_cases','serious_or_critical','total_tests','population']].sum()



In [79]:
europe_data_total_cases=europe_data['total_confirmed'].sort_values(ascending=False)
europe_data_death_cases=europe_data['total_deaths'].sort_values(ascending=False)
europe_data_recovered_cases=europe_data['total_recovered'].sort_values(ascending=False)
europe_data_active_cases=europe_data['active_cases'].sort_values(ascending=False)
europe_data_serious_cases=europe_data['serious_or_critical'].sort_values(ascending=False)
europe_data_test_cases=europe_data['total_tests'].sort_values(ascending=False)

**16. European Countries with Total Number of Confirmed COVID-19 Cases**

In [85]:
plt=px.bar(x=europe_data_total_cases.index,y=europe_data_total_cases.values,color=europe_data_total_cases.index,title='European Countries with Total Number of Confirmed COVID-19 Cases',
            labels={
                'x':'Countries',
                'y':'Number of Total Confirmed Cases',
                'color':'Countries'
            })
plt.show()

**17. European Countries with Total Number of Confirmed COVID-19 Deaths**

In [84]:
plt=px.bar(x=europe_data_death_cases.index,y=europe_data_death_cases.values,color=europe_data_death_cases.index,title='European Countries with Total Number of Confirmed COVID-19 Deaths',
            labels={
                'x':'Countries',
                'y':'Number of Total Deaths',
                'color':'Countries'
            })
plt.show()

**18. European Countries with Total Number of Confirmed COVID-19 Recovery Cases**

In [83]:
plt=px.bar(x=europe_data_recovered_cases.index,y=europe_data_recovered_cases.values,color=europe_data_recovered_cases.index,title='European Countries with Total Number of Confirmed COVID-19 Recovery Cases',
            labels={
                'x':'Countries',
                'y':'Number of Total Recovered Cases',
                'color':'Countries'
            })
plt.show()

**19. European Countries with Total Number of Active COVID-19 Cases**

In [86]:
plt=px.bar(x=europe_data_active_cases.index,y=europe_data_active_cases.values,color=europe_data_active_cases.index,title='European Countries with Total Number of Active COVID-19 Cases',
            labels={
                'x':'Countries',
                'y':'Number of Total Active Cases',
                'color':'Countries'
            })
plt.show()

**20. European Countries with Total Number of Serious or Critical COVID-19 Cases**

In [87]:
plt=px.bar(x=europe_data_serious_cases.index,y=europe_data_serious_cases.values,color=europe_data_serious_cases.index,title='European Countries with Total Number of Serious or Critical COVID-19 Cases',
            labels={
                'x':'Countries',
                'y':'Number of Total Serious or Critical Cases',
                'color':'Countries'
            })
plt.show()

**21. European Countries with Total Number of COVID-19 Tests Conducted**

In [88]:
plt=px.bar(x=europe_data_test_cases.index,y=europe_data_test_cases.values,color=europe_data_test_cases.index,title='European Countries with Total Number of COVID-19 Tests Conducted',
            labels={
                'x':'Countries',
                'y':'Number of Total Tests Conducted',
                'color':'Countries'
            })
plt.show()

**22. Top 10 European Countries with Highest COVID-19 Death Ratio Per Population**

In [89]:
ratio_death_europe=(europe_data['total_deaths']/europe_data['population'])*100
ratio_recovery_europe=(europe_data['total_recovered']/europe_data['population'])*100
ratio_serious_europe=(europe_data['serious_or_critical']/europe_data['population'])*100
ratio_test_europe=(europe_data['total_tests']/europe_data['population'])*100

In [94]:
ratio_death_eur=ratio_death_europe.sort_values(ascending=False)
ratio_death_eur=ratio_death_eur.iloc[0:10]
plt=px.pie(
    names=ratio_death_eur.index,
    labels=ratio_death_eur.index,
    values=ratio_death_eur.values,
    title='Top 10 European Countries with Highest COVID-19 Death Ratio Per Population'
)
plt.update_traces(textinfo='percent + label',textposition='outside',marker=dict(colors=px.colors.diverging.Spectral))
plt.show()





**23. Top 10 European Countries with Highest COVID-19 Recovery Ratio Per Population**

In [95]:
ratio_recover_eur=ratio_recovery_europe.sort_values(ascending=False)
ratio_recover_eur=ratio_recover_eur.iloc[0:10]
plt=px.pie(
    names=ratio_recover_eur.index,
    labels=ratio_recover_eur.index,
    values=ratio_recover_eur.values,
    title='Top 10 European Countries with Highest COVID-19 Recovery Ratio per Population'
)
plt.update_traces(textinfo='percent + label',textposition='outside',marker=dict(colors=px.colors.diverging.Spectral))
plt.show()





**24. Top 10 European Countries with Highest COVID-19 Serious or Critical Cases Ratio Per Population**

In [96]:
ratio_serious_eur=ratio_serious_europe.sort_values(ascending=False)
ratio_serious_eur=ratio_serious_eur.iloc[0:10]
plt=px.pie(
    names=ratio_serious_eur.index,
    labels=ratio_serious_eur.index,
    values=ratio_serious_eur.values,
    title='Top 10 European Countries with Highest COVID-19 Serious or Critical Cases Ratio Per Population'
)
plt.update_traces(textinfo='percent + label',textposition='outside',marker=dict(colors=px.colors.diverging.Spectral))
plt.show()





**25. Top 10 European Countries with Highest COVID-19 Testing Ratio Per Population**

In [97]:
ratio_tests_eur=ratio_test_europe.sort_values(ascending=False)
ratio_tests_eur=ratio_tests_eur.iloc[0:10]
plt=px.pie(
    names=ratio_tests_eur.index,
    labels=ratio_tests_eur.index,
    values=ratio_tests_eur.values,
    title='Top 10 European Countries with Highest COVID-19 Testing Ratio Per Population'
)
plt.update_traces(textinfo='percent + label',textposition='outside',marker=dict(colors=px.colors.diverging.Spectral))
plt.show()





**Analyzing COVID-19 Stats for UK**

In [98]:
data_london=df_daily[df_daily['country']=='UK']

In [99]:
print(data_london)

             date country  cumulative_total_cases  daily_new_cases  \
173307 2020-02-15      UK                    33.0              0.0   
173308 2020-02-16      UK                    33.0              0.0   
173309 2020-02-17      UK                    33.0              0.0   
173310 2020-02-18      UK                    34.0              1.0   
173311 2020-02-19      UK                    34.0              0.0   
...           ...     ...                     ...              ...   
174122 2022-05-10      UK              22144628.0           4341.0   
174123 2022-05-11      UK              22145157.0            529.0   
174124 2022-05-12      UK              22159615.0          14458.0   
174125 2022-05-13      UK              22159805.0            190.0   
174126 2022-05-14      UK              22159805.0              0.0   

        active_cases  cumulative_total_deaths  daily_new_deaths  
173307          25.0                      0.0               0.0  
173308          25.0       

**26. Daily New Cases of COVID-19 in UK**

In [103]:
fig = px.line(data_london, x=data_london['date'], y=data_london['daily_new_cases'],title='Daily New Cases of COVID-19 in UK',labels={
    'date':'Time Period',
    'daily_new_cases':'Daily New Cases'
})
fig.show()

In [104]:
fig = px.scatter(data_london, x=data_london['date'], y=data_london['daily_new_cases'],title='Daily New Cases of COVID-19 in UK',labels={
    'date':'Time Period',
    'daily_new_cases':'Daily New Cases'
}, color=data_london['daily_new_cases'])
fig.show()

The highest number of daily new COVID-19 cases in the UK was recorded on 5 Jan 2022.
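
Rather than reading the peak off the chart, it can also be looked up directly. A minimal sketch using `idxmax` on four rows copied from the UK tail of the daily data; the `toy` frame is a stand-in, and on the full `data_london` frame the same two lines would return the overall peak.

```python
import pandas as pd

toy = pd.DataFrame({
    "date": pd.to_datetime(["2022-05-10", "2022-05-11", "2022-05-12", "2022-05-13"]),
    "daily_new_cases": [4341.0, 529.0, 14458.0, 190.0],
})

# idxmax returns the index label of the largest value; loc fetches that row.
peak_row = toy.loc[toy["daily_new_cases"].idxmax()]

print(peak_row["date"].date(), peak_row["daily_new_cases"])
```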

**27. Daily Active Cases of COVID-19 in UK**

In [105]:
fig=px.line(x=data_london['date'],y=data_london['active_cases'],title='Daily Active Cases of COVID-19 in UK',labels={
    'x':'Time Period',
    'y':'Daily Active Cases'
})
fig.show()

In [106]:
fig = px.scatter(x=data_london['date'], y=data_london['active_cases'],title='Daily Active Cases of COVID-19 in UK',labels={
    'x':'Time Period',
    'y':'Active Cases'
}, color=data_london['active_cases'])
fig.show()

The highest number of daily active COVID-19 cases in the UK was recorded on 14 Jan 2022.

**28. Daily New Deaths from COVID-19 in UK**

In [107]:
fig = px.line(x=data_london['date'], y=data_london['daily_new_deaths'],title=' Daily New Deaths from COVID-19 in UK',labels={
    'x':'Time Period',
    'y':'Daily New Deaths'
})
fig.show()

In [108]:
fig = px.scatter(x=data_london['date'], y=data_london['daily_new_deaths'],title=' Daily New Deaths from COVID-19 in UK',labels={
    'x':'Time Period',
    'y':'Daily New Deaths'
}, color=data_london['daily_new_deaths'])
fig.show()

The highest number of daily new COVID-19 deaths in the UK was recorded on 20 Jan 2021.

**29. Analyzing Trends between Daily COVID-19 Active, New Cases and Deaths in UK**

In [136]:
df=data_london[['date','active_cases','daily_new_deaths','daily_new_cases']]
df=df.set_index('date')
print(df)

fig=px.line(df,x=df.index,y=df.columns)
fig.show()

            active_cases  daily_new_deaths  daily_new_cases
date                                                       
2020-02-15          25.0               0.0              0.0
2020-02-16          25.0               0.0              0.0
2020-02-17          25.0               0.0              0.0
2020-02-18          26.0               0.0              1.0
2020-02-19          26.0               0.0              0.0
...                  ...               ...              ...
2022-05-10      378217.0               0.0           4341.0
2022-05-11      352158.0               0.0            529.0
2022-05-12      344751.0             284.0          14458.0
2022-05-13      325327.0               0.0            190.0
2022-05-14      325327.0               0.0              0.0

[820 rows x 3 columns]
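
The last rows above show zeros sitting next to large spikes (for example 14458, 190 and 0 new cases on consecutive days), which can make the raw lines hard to compare. One common way to smooth such series is a 7-day rolling mean. This is not part of the original notebook; the sketch below applies it to an invented ten-day series.

```python
import pandas as pd

# Invented daily counts with reporting gaps on some days.
dates = pd.date_range("2022-05-01", periods=10, freq="D")
daily = pd.Series(
    [7, 0, 14, 7, 7, 0, 14, 7, 7, 7],
    index=dates,
    dtype="float64",
    name="daily_new_cases",
)

# Mean over each window of 7 consecutive days; the first 6 days have no full window.
weekly_avg = daily.rolling(window=7).mean()

print(weekly_avg)
```

The same call on the columns of `df` would give smoothed series that can be passed to `px.line` in place of the raw values.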
