# Covid19 Vaccine Analysis

Many vaccines have been introduced so far to fight Covid-19. No vaccine is guaranteed to be 100% effective, and manufacturers do not claim otherwise, but a vaccine can still save your life by giving you immunity.

Thus, each country tries to vaccinate a large part of its population without depending on a single vaccine. That is what I am going to analyze in this project: how many vaccines each country is using to fight Covid-19. In the sections below, I walk through my Covid-19 vaccine analysis with Python.

### Importing the necessary Python libraries and the dataset

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
data = pd.read_csv("country_vaccinations.csv")
data.head()

Unnamed: 0,country,iso_code,date,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,daily_vaccinations_per_million,vaccines,source_name,source_website
0,Afghanistan,AFG,22-02-2021,0.0,0.0,,,,0.0,0.0,,,Oxford/AstraZeneca,Government of Afghanistan,https://reliefweb.int/report/afghanistan/afgha...
1,Afghanistan,AFG,23-02-2021,,,,,1367.0,,,,35.0,Oxford/AstraZeneca,Government of Afghanistan,https://reliefweb.int/report/afghanistan/afgha...
2,Afghanistan,AFG,24-02-2021,,,,,1367.0,,,,35.0,Oxford/AstraZeneca,Government of Afghanistan,https://reliefweb.int/report/afghanistan/afgha...
3,Afghanistan,AFG,25-02-2021,,,,,1367.0,,,,35.0,Oxford/AstraZeneca,Government of Afghanistan,https://reliefweb.int/report/afghanistan/afgha...
4,Afghanistan,AFG,26-02-2021,,,,,1367.0,,,,35.0,Oxford/AstraZeneca,Government of Afghanistan,https://reliefweb.int/report/afghanistan/afgha...


### Exploring this data before we start analyzing the vaccines taken by countries 

In [4]:
data.describe()

Unnamed: 0,total_vaccinations,people_vaccinated,people_fully_vaccinated,daily_vaccinations_raw,daily_vaccinations,total_vaccinations_per_hundred,people_vaccinated_per_hundred,people_fully_vaccinated_per_hundred,daily_vaccinations_per_million
count,9325.0,8649.0,6425.0,7830.0,15307.0,9325.0,8649.0,6425.0,15307.0
mean,5180765.0,3271412.0,1645949.0,135983.0,79109.11,15.928546,11.555663,5.836851,2997.132292
std,21310660.0,12212710.0,7123496.0,523193.0,361668.6,23.530195,15.459619,10.1169,4693.081687
min,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,62083.0,53490.0,24107.0,3020.5,917.5,1.39,1.3,0.65,358.0
50%,441976.0,338057.0,179292.0,16035.5,6506.0,6.59,5.08,2.46,1501.0
75%,1988844.0,1380430.0,714210.0,63414.0,29875.5,20.61,15.34,6.63,4008.5
max,289627000.0,148562900.0,107346500.0,11601000.0,7205286.0,211.08,111.32,99.76,118759.0
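
The `count` row above differs from column to column (9325 for `total_vaccinations` against 15307 for `daily_vaccinations`), which suggests that many cells are empty. Here is a minimal sketch of how the share of missing values per column could be measured. The `sample` frame and its values are made up for illustration and stand in for `data`.

```python
import numpy as np
import pandas as pd

# Tiny made-up sample standing in for `data`; the values are illustrative only.
sample = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "total_vaccinations": [0.0, np.nan, 10.0, np.nan],
    "daily_vaccinations": [np.nan, 5.0, 7.0, 9.0],
})

# Share of missing values per column, largest first.
missing_share = sample.isna().mean().sort_values(ascending=False)
print(missing_share)
```

Running the same two lines on `data` instead of `sample` would show which columns are reliable enough to analyze.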


In [7]:
# Dates are stored day-first (DD-MM-YYYY), so pass the format explicitly
pd.to_datetime(data.date, format="%d-%m-%Y")

0       2021-02-22
1       2021-02-23
2       2021-02-24
3       2021-02-25
4       2021-02-26
           ...    
15500   2021-04-30
15501   2021-05-01
15502   2021-05-02
15503   2021-05-03
15504   2021-05-04
Name: date, Length: 15505, dtype: datetime64[ns]
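
The `date` column stores dates day-first (22-02-2021 is 22 February 2021), so a string such as 01-05-2021 is ambiguous unless the parser is told the layout. Below is a small sketch of parsing with an explicit `format`. The strings in `raw` are made up in the same layout and are not taken from the dataset.

```python
import pandas as pd

# Day-first strings in the same DD-MM-YYYY layout as the `date` column.
raw = pd.Series(["30-04-2021", "01-05-2021", "02-05-2021"])

# Stating the format explicitly removes any day/month ambiguity.
parsed = pd.to_datetime(raw, format="%d-%m-%Y")
print(parsed.dt.strftime("%Y-%m-%d").tolist())
```

To keep the parsed values for later steps, the result would need to be assigned back, for example `data["date"] = pd.to_datetime(data["date"], format="%d-%m-%Y")`.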

In [8]:
data.country.value_counts()

Canada           143
China            142
Russia           142
Israel           138
United States    137
                ... 
Somalia            1
Djibouti           1
Libya              1
Armenia            1
Congo              1
Name: country, Length: 195, dtype: int64

The United Kingdom is made up of England, Scotland, Wales, and Northern Ireland. In the data above, however, these four nations also appear as separate entries, duplicating what is already reported for the United Kingdom. This may be an error made while recording the data. To fix it, drop their rows:

In [9]:
data = data[~data.country.isin(["England", "Scotland", "Wales", "Northern Ireland"])]
data.country.value_counts()

Canada           143
China            142
Russia           142
Israel           138
United States    137
                ... 
Armenia            1
Congo              1
Somalia            1
Timor              1
Libya              1
Name: country, Length: 191, dtype: int64
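
Before dropping rows, it can help to check whether the four nations really overlap with the United Kingdom entry. The sketch below compares their combined total with the UK total and then applies the same filter. The numbers in `sample` are invented so that the two totals agree; the real figures in `data` may behave differently.

```python
import pandas as pd

UK_NATIONS = ["England", "Scotland", "Wales", "Northern Ireland"]

# Made-up totals for a single date; the real values live in `data`.
sample = pd.DataFrame({
    "country": ["United Kingdom"] + UK_NATIONS + ["France"],
    "total_vaccinations": [100.0, 84.0, 8.0, 5.0, 3.0, 60.0],
})

is_nation = sample["country"].isin(UK_NATIONS)
nations_total = sample.loc[is_nation, "total_vaccinations"].sum()
uk_total = sample.loc[sample["country"] == "United Kingdom", "total_vaccinations"].iloc[0]

# If the two numbers agree, keeping both would count the UK twice.
print(nations_total, uk_total)

# Dropping the four nations leaves a single entry for the UK.
cleaned = sample[~is_nation]
print(cleaned["country"].tolist())
```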

### Exploring the vaccines available in this dataset

In [10]:
data.vaccines.value_counts()

Oxford/AstraZeneca                                                                    2574
Moderna, Oxford/AstraZeneca, Pfizer/BioNTech                                          1886
Johnson&Johnson, Moderna, Oxford/AstraZeneca, Pfizer/BioNTech                         1522
Oxford/AstraZeneca, Pfizer/BioNTech                                                   1402
Pfizer/BioNTech                                                                       1216
Moderna, Pfizer/BioNTech                                                               594
Oxford/AstraZeneca, Sinopharm/Beijing                                                  585
Oxford/AstraZeneca, Pfizer/BioNTech, Sinovac                                           495
Oxford/AstraZeneca, Sinovac                                                            467
Sputnik V                                                                              436
Oxford/AstraZeneca, Pfizer/BioNTech, Sinopharm/Beijing, Sputnik V                      404

So the dataset covers almost all the available Covid-19 vaccines. Next, I will create a new DataFrame with only the `vaccines` and `country` columns to explore which vaccines each country is using:

In [12]:
df = data[['vaccines', 'country']]
df.head()

Unnamed: 0,vaccines,country
0,Oxford/AstraZeneca,Afghanistan
1,Oxford/AstraZeneca,Afghanistan
2,Oxford/AstraZeneca,Afghanistan
3,Oxford/AstraZeneca,Afghanistan
4,Oxford/AstraZeneca,Afghanistan
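
Each value in the `vaccines` column is a comma-separated combination, so counting by combination hides how widely each individual vaccine is used. One possible way to split the combinations is sketched here. The rows in `sample` and the country names are placeholders, not data from the file.

```python
import pandas as pd

# Made-up rows in the same shape as `df` (vaccines, country).
sample = pd.DataFrame({
    "vaccines": [
        "Oxford/AstraZeneca",
        "Oxford/AstraZeneca",
        "Moderna, Pfizer/BioNTech",
        "Oxford/AstraZeneca, Pfizer/BioNTech",
    ],
    "country": ["Country A", "Country A", "Country B", "Country C"],
})

# One row per (country, individual vaccine) pair.
pairs = (
    sample.drop_duplicates()
    .assign(vaccine=lambda d: d["vaccines"].str.split(", "))
    .explode("vaccine")[["country", "vaccine"]]
    .drop_duplicates()
)

# Number of distinct countries using each individual vaccine.
countries_per_vaccine = (
    pairs.groupby("vaccine")["country"].nunique().sort_values(ascending=False)
)
print(countries_per_vaccine)
```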


#### Now let’s see which countries are using each vaccine combination in this data:

In [13]:
# Map each vaccine combination to the set of countries that report it
vaccines = {}
for combo in df.vaccines.unique():
    vaccines[combo] = set(df.loc[df.vaccines == combo, 'country'])

# Print every combination followed by its countries
for combo, countries in vaccines.items():
    print(f'{combo}:>>{countries}')

Oxford/AstraZeneca:>>{'Afghanistan', 'Gambia', 'Malawi', 'Brunei', 'Saint Helena', 'Trinidad and Tobago', 'Myanmar', "Cote d'Ivoire", 'Fiji', 'Sao Tome and Principe', 'Vietnam', 'Timor', 'Grenada', 'Cape Verde', 'Guyana', 'Jamaica', 'Bahamas', 'Montserrat', 'Angola', 'Georgia', 'Sudan', 'Suriname', 'Eswatini', 'Bhutan', 'Djibouti', 'Mauritius', 'Saint Kitts and Nevis', 'Antigua and Barbuda', 'Belize', 'Saint Vincent and the Grenadines', 'Falkland Islands', 'Ghana', 'Democratic Republic of Congo', 'Bangladesh', 'Dominica', 'Anguilla', 'Nauru', 'Samoa', 'Sierra Leone', 'South Sudan', 'Mali', 'Solomon Islands', 'Taiwan', 'Lesotho', 'Botswana', 'Zambia', 'Uzbekistan', 'Uganda', 'Barbados', 'Saint Lucia', 'Kosovo', 'Togo', 'Papua New Guinea', 'Tonga', 'Nigeria', 'Ethiopia'}
Oxford/AstraZeneca, Pfizer/BioNTech, Sinovac, Sputnik V:>>{'Albania', 'Bosnia and Herzegovina'}
Sputnik V:>>{'Armenia', 'Syria', 'Algeria', 'Belarus', 'Venezuela', 'Paraguay', 'Guinea', 'Kazakhstan'}
Oxford/AstraZeneca, 
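
The mapping above goes from vaccine combination to countries, while the question posed at the start is how many vaccines each country uses. Below is a rough sketch of inverting the mapping to answer that. The `vaccines_sample` dict mirrors the shape of `vaccines` but its contents are placeholders.

```python
# Made-up mapping in the same shape as the `vaccines` dict built above:
# vaccine combination -> set of countries reporting it.
vaccines_sample = {
    "Oxford/AstraZeneca": {"Country A", "Country B"},
    "Moderna, Pfizer/BioNTech": {"Country C"},
    "Sputnik V": {"Country D"},
}

# Collect the individual vaccine names seen for each country.
names_by_country = {}
for combo, countries in vaccines_sample.items():
    names = {name.strip() for name in combo.split(",")}
    for country in countries:
        names_by_country.setdefault(country, set()).update(names)

# Country -> number of distinct vaccines.
vaccine_count = {country: len(names) for country, names in names_by_country.items()}

for country, n in sorted(vaccine_count.items()):
    print(country, n)
```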

### Now let’s visualize this data to have a look at what combination of vaccines every country is using:

In [2]:
import plotly.express as px

# `data` is the DataFrame loaded and cleaned in the cells above. Run those cells
# first, otherwise this cell fails with "NameError: name 'data' is not defined".
vaccine_map = px.choropleth(data, locations='iso_code', color='vaccines')
vaccine_map.update_layout(height=300, margin={'r': 0, 't': 0, 'l': 0, 'b': 0})
vaccine_map.show()
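
`data` holds one row per country per day, so the same `iso_code` is passed to the map many times. A possible refinement, sketched below under that assumption, is to reduce the frame to one row per country before plotting. The `sample` rows, the `AAA`/`BBB` codes, and the `draw_map` helper are hypothetical and only illustrate the idea.

```python
import pandas as pd

# Made-up rows in the same shape as `data`; only the columns the map needs.
sample = pd.DataFrame({
    "country": ["Country A", "Country A", "Country B"],
    "iso_code": ["AAA", "AAA", "BBB"],  # placeholder codes, not real ISO codes
    "vaccines": ["Sputnik V", "Sputnik V", "Moderna, Pfizer/BioNTech"],
})

# Keep one row per country so each location reaches the map once.
per_country = sample.drop_duplicates(subset="iso_code")[["country", "iso_code", "vaccines"]]
print(per_country)


def draw_map(frame):
    """Build the choropleth; plotly is imported here so the step above runs without it."""
    import plotly.express as px

    fig = px.choropleth(frame, locations="iso_code", color="vaccines", hover_name="country")
    fig.update_layout(margin={"r": 0, "t": 0, "l": 0, "b": 0})
    return fig
```

With the real data, `draw_map(data.drop_duplicates(subset="iso_code")).show()` would draw the map from one row per country.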