# **Covid-19 Vaccines Analysis with Python**

The dataset used here for Covid-19 vaccine analysis is taken from Kaggle. Let’s start by importing the necessary Python libraries and the dataset:

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
data = pd.read_csv("/content/country_vaccinations.csv")
data.head()

,country,iso_code,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,daily_vaccinations_per_million,vaccines,source_name,source_website
0,Afghanistan,AFG,2021-02-22,0.0,0.0,,,,0.0,0.0,,,Oxford/AstraZeneca,Government of Afghanistan,https://reliefweb.int/report/afghanistan/afgha...
1,Afghanistan,AFG,2021-02-23,,,,,1367.0,,,,35.0,Oxford/AstraZeneca,Government of Afghanistan,https://reliefweb.int/report/afghanistan/afgha...
2,Afghanistan,AFG,2021-02-24,,,,,1367.0,,,,35.0,Oxford/AstraZeneca,Government of Afghanistan,https://reliefweb.int/report/afghanistan/afgha...
3,Afghanistan,AFG,2021-02-25,,,,,1367.0,,,,35.0,Oxford/AstraZeneca,Government of Afghanistan,https://reliefweb.int/report/afghanistan/afgha...
4,Afghanistan,AFG,2021-02-26,,,,,1367.0,,,,35.0,Oxford/AstraZeneca,Government of Afghanistan,https://reliefweb.int/report/afghanistan/afgha...


Before analyzing which vaccines each country uses, let’s explore the data:



In [2]:
data.describe()

,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,daily_vaccinations_per_million
count,9437.0,8754.0,6502.0,7928.0,15465.0,9437.0,8754.0,6502.0,15465.0
mean,5250013.0,3302754.0,1672178.0,137196.5,79484.45,16.130694,11.678865,5.945094,3007.500873
std,21632690.0,12337160.0,7225403.0,529414.0,364159.6,23.841769,15.622004,10.306655,4693.064582
min,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,63064.0,54036.75,24614.0,3071.75,910.0,1.4,1.3125,0.66,356.0
50%,446285.0,339801.0,181810.5,16130.0,6495.0,6.65,5.13,2.48,1504.0
75%,2003211.0,1387596.0,722847.0,63866.0,30036.0,20.85,15.51,6.76,4020.0
max,297734000.0,149462300.0,108926600.0,11601000.0,7205286.0,215.71,112.75,102.95,118759.0
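
The differing `count` values above suggest that several columns contain missing entries. A minimal sketch of how the share of missing values per column could be checked, using a small hypothetical frame (the values are made up; only the column names are borrowed from the dataset, and `sample` and `missing_share` are illustrative names):

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; the values are made up for illustration only.
sample = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "total_vaccinations": [0.0, np.nan, 10.0, np.nan],
    "daily_vaccinations": [np.nan, 5.0, 7.0, 9.0],
})

# Fraction of missing values in each column, largest first.
missing_share = sample.isna().mean().sort_values(ascending=False)
print(missing_share)
```

Running the same two calls on `data` would show which columns are sparse enough to need care before aggregating.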


In [3]:
data["date"] = pd.to_datetime(data["date"])  # assign the result; to_datetime does not modify data in place
data.country.value_counts()

Canada           144
Russia           143
China            143
Israel           139
United States    138
                ... 
Timor              1
Congo              1
Djibouti           1
Armenia            1
Libya              1
Name: country, Length: 196, dtype: int64
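
Note that `pd.to_datetime` returns a new Series and leaves its input unchanged, so the converted dates only persist if they are assigned back to the column. A small sketch on a hypothetical two-row frame (`sample`, `dtype_before`, and `dtype_after` are illustrative names):

```python
import pandas as pd

# Hypothetical toy frame with dates stored as strings, as they are in a CSV.
sample = pd.DataFrame({"date": ["2021-02-22", "2021-02-23"]})

pd.to_datetime(sample["date"])                   # result is discarded
dtype_before = sample["date"].dtype              # column still holds strings

sample["date"] = pd.to_datetime(sample["date"])  # assign the result back
dtype_after = sample["date"].dtype               # column now has a datetime dtype
print(dtype_before, dtype_after)
```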

**The United Kingdom is made up of England, Scotland, Wales, and Northern Ireland, but the data above also lists these four nations as separate entries, with the same values as in the United Kingdom. This looks like an error in how the data was recorded, so let’s remove those rows:**

In [5]:
# Drop the four UK nations, which are already covered by the "United Kingdom" rows
data = data[~data.country.isin(["England", "Scotland", "Wales", "Northern Ireland"])]
data.country.value_counts()

Canada           144
Russia           143
China            143
Israel           139
United States    138
                ... 
Somalia            1
Armenia            1
Timor              1
Congo              1
Comoros            1
Name: country, Length: 192, dtype: int64
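
Before dropping rows like this, it can help to confirm that the four nations really overlap with the United Kingdom entry. A minimal sketch of one possible check, on hypothetical toy numbers (the figures are made up, and `uk_nations`, `nation_sum`, `uk_total`, and `overlap` are illustrative names). It tests whether the nations' figures add up to the UK figure on each date, which is one way such an overlap could show up:

```python
import pandas as pd

# Hypothetical toy numbers, made up to illustrate the check only.
sample = pd.DataFrame({
    "country": ["United Kingdom", "England", "Scotland", "Wales", "Northern Ireland"],
    "date": ["2021-03-01"] * 5,
    "total_vaccinations": [100.0, 80.0, 10.0, 6.0, 4.0],
})

uk_nations = ["England", "Scotland", "Wales", "Northern Ireland"]
is_nation = sample["country"].isin(uk_nations)

# Per-date sum over the four nations, and the United Kingdom figure per date.
nation_sum = sample[is_nation].groupby("date")["total_vaccinations"].sum()
uk_total = sample[sample["country"] == "United Kingdom"].set_index("date")["total_vaccinations"]

# True on dates where the nations add up to the UK figure,
# in which case keeping both would count the same doses twice.
overlap = nation_sum.eq(uk_total)
print(overlap)
```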

In [6]:
data.vaccines.value_counts()

Oxford/AstraZeneca                                                                    2607
Moderna, Oxford/AstraZeneca, Pfizer/BioNTech                                          1897
Johnson&Johnson, Moderna, Oxford/AstraZeneca, Pfizer/BioNTech                         1532
Oxford/AstraZeneca, Pfizer/BioNTech                                                   1418
Pfizer/BioNTech                                                                       1227
Moderna, Pfizer/BioNTech                                                               604
Oxford/AstraZeneca, Sinopharm/Beijing                                                  587
Oxford/AstraZeneca, Pfizer/BioNTech, Sinovac                                           499
Oxford/AstraZeneca, Pfizer/BioNTech, Sinopharm/Beijing, Sputnik V                      487
Oxford/AstraZeneca, Sinovac                                                            472
Sputnik V                                                                              438
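
Keep in mind that `value_counts()` counts rows, and each country contributes one row per reported date, so the numbers above reflect how many daily records mention a combination rather than how many countries use it. A minimal sketch of the difference, on a hypothetical toy frame with placeholder country names:

```python
import pandas as pd

# Hypothetical toy frame: country A has three daily rows, B and C one each.
sample = pd.DataFrame({
    "country": ["A", "A", "A", "B", "C"],
    "vaccines": ["Sputnik V", "Sputnik V", "Sputnik V",
                 "Pfizer/BioNTech", "Pfizer/BioNTech"],
})

# Number of rows (daily records) per vaccine combination.
rows_per_combo = sample["vaccines"].value_counts()

# Number of distinct countries per vaccine combination.
countries_per_combo = sample.groupby("vaccines")["country"].nunique()

print(rows_per_combo)
print(countries_per_combo)
```

Here "Sputnik V" has the most rows but is used by a single country, which is why the two views can rank combinations differently.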

In [7]:
df = data[["vaccines", "country"]]
df.head()

,vaccines,country
0,Oxford/AstraZeneca,Afghanistan
1,Oxford/AstraZeneca,Afghanistan
2,Oxford/AstraZeneca,Afghanistan
3,Oxford/AstraZeneca,Afghanistan
4,Oxford/AstraZeneca,Afghanistan


In [8]:
# Map each vaccine combination to the set of countries that report using it
vaccines = {}
for combo in df.vaccines.unique():
  vaccines[combo] = set(df.loc[df.vaccines == combo, "country"])

for combo, countries in vaccines.items():
  print(f"{combo}:>>{countries}")

Oxford/AstraZeneca:>>{'Mauritius', 'Kosovo', 'Bahamas', 'Samoa', 'Uganda', "Cote d'Ivoire", 'Ghana', 'Sudan', 'Timor', 'Afghanistan', 'Mali', 'Montserrat', 'Ethiopia', 'Jamaica', 'Brunei', 'Barbados', 'Saint Lucia', 'Taiwan', 'Bangladesh', 'Malawi', 'Belize', 'Antigua and Barbuda', 'Democratic Republic of Congo', 'Suriname', 'Uzbekistan', 'Sierra Leone', 'Guyana', 'Solomon Islands', 'Falkland Islands', 'Saint Vincent and the Grenadines', 'South Sudan', 'Angola', 'Eswatini', 'Grenada', 'Gambia', 'Saint Helena', 'Botswana', 'Trinidad and Tobago', 'Lesotho', 'Togo', 'Dominica', 'Saint Kitts and Nevis', 'Nigeria', 'Vietnam', 'Papua New Guinea', 'Zambia', 'Myanmar', 'Fiji', 'Anguilla', 'Bhutan', 'Djibouti', 'Cape Verde', 'Georgia', 'Nauru', 'Tonga', 'Comoros', 'Sao Tome and Principe'}
Oxford/AstraZeneca, Pfizer/BioNTech, Sinovac, Sputnik V:>>{'Bosnia and Herzegovina', 'Albania'}
Sputnik V:>>{'Syria', 'Belarus', 'Venezuela', 'Guinea', 'Armenia', 'Algeria', 'Paraguay', 'Kazakhstan'}
...
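
The mapping above is keyed by the full combination string, so a single vaccine such as Pfizer/BioNTech is spread across many keys. As a sketch of an alternative view, the combination strings could be split on `", "` and exploded into one row per individual vaccine. The frame below is a hypothetical toy example with placeholder country names, and `per_vaccine` and `countries_by_vaccine` are illustrative names:

```python
import pandas as pd

# Hypothetical toy frame; the country names are placeholders.
sample = pd.DataFrame({
    "country": ["A", "B", "C"],
    "vaccines": ["Moderna, Pfizer/BioNTech", "Pfizer/BioNTech", "Sputnik V"],
})

# One row per (country, individual vaccine) pair.
per_vaccine = (
    sample.drop_duplicates()
    .assign(vaccine=lambda d: d["vaccines"].str.split(", "))
    .explode("vaccine")
)

# Map each individual vaccine to the set of countries using it.
countries_by_vaccine = per_vaccine.groupby("vaccine")["country"].apply(set).to_dict()
print(countries_by_vaccine)
```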

Now let’s visualize which combination of vaccines each country is using:

In [11]:
import plotly.express as px

# Color each country (matched by ISO code) by its vaccine combination
vaccine_map = px.choropleth(data, locations = 'iso_code', color = 'vaccines')
vaccine_map.update_layout(height=300, margin={"r":0,"t":0,"l":0,"b":0})
vaccine_map.show()
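
Since `data` holds one row per country per date, the map receives many repeated rows for each location. A minimal sketch, assuming one row per country is enough for this plot, of reducing the frame first (the toy frame is hypothetical and `per_country` is an illustrative name):

```python
import pandas as pd

# Hypothetical toy frame: two daily rows for Afghanistan, one for Albania.
sample = pd.DataFrame({
    "country": ["Afghanistan", "Afghanistan", "Albania"],
    "iso_code": ["AFG", "AFG", "ALB"],
    "vaccines": [
        "Oxford/AstraZeneca",
        "Oxford/AstraZeneca",
        "Oxford/AstraZeneca, Pfizer/BioNTech, Sinovac, Sputnik V",
    ],
})

# Keep a single row per (country, iso_code, vaccines) so each location appears once.
per_country = sample[["country", "iso_code", "vaccines"]].drop_duplicates()
print(per_country)
```

A frame built this way from `data` could then be passed to `px.choropleth` in place of the full dataset.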