
<h1 style="font-size:36px;" align="center">The astoundingly unequal distribution of COVID-19 vaccinations</h1>
<hr>

<h1 align="center">Introduction</h1>

Over two years into the COVID-19 pandemic, we have a growing arsenal of tools to fight the virus. But we are still lacking a truly global and equitable response to help find a way out of the pandemic.

Safe and effective vaccines have been available for over a year, so why aren’t vaccination rates increasing everywhere in the world? It comes down to three factors: supply, distribution, and last-mile logistics of getting shots into arms.

Rich countries bought the majority of the initial supply of vaccines, leaving low- and lower middle-income countries out of the market for over a year. The global supply of vaccines has only just begun to meaningfully increase — one year after they were made available in high-income countries.

So, now that the supply of COVID-19 vaccines has increased, efforts to vaccinate the world should be easy, right? Wrong. For low- and lower middle-income countries to get shots into arms, supply must remain predictable and available. And high-income countries need to meaningfully follow through on their promises to help end the threat of COVID-19 everywhere.

Omicron challenged the way the world responded to COVID-19. First, it was a wake-up call that COVID is not over, even as news cycles and funding priorities started to move on. Second, it highlighted the importance of other pillars in the global response, including “test and treat” strategies. But even with the increasing importance of scaling up access to tests and treatments, vaccination remains the best tool to tackle COVID globally.

<h1 style="text-align: center;">Data collection</h1>

The data was collected by [Our World in Data](https://ourworldindata.org/) from public information released by official sources. Our World in Data relies on figures published on official websites, in press releases and by social media accounts of national authorities, usually governments, ministries of health, or centres for disease control. Population estimates for per-capita metrics are based on the United Nations World Population Prospects. Income groups are based on the World Bank classification.

The complete [Our World in Data](https://ourworldindata.org/) COVID-19 dataset is a collection of COVID-19 data maintained by Our World in Data, which updates it daily for the duration of the COVID-19 pandemic. It includes the following data:

<table >
<thead>
<tr>
<th>Metrics</th>
<th>Source</th>
<th>Updated</th>
<th>Countries</th>
</tr>
</thead>
<tbody>
<tr>
<td>Vaccinations</td>
<td>Official data collated by the Our World in Data team</td>
<td>Daily</td>
<td>218</td>
</tr>
<tr>
<td>Tests &amp; positivity</td>
<td>Official data collated by the Our World in Data team</td>
<td>No longer updated</td>
<td></td>
</tr>
<tr>
<td>Hospital &amp; ICU</td>
<td>Official data collated by the Our World in Data team</td>
<td>Daily</td>
<td>47</td>
</tr>
<tr>
<td>Confirmed cases</td>
<td>JHU CSSE COVID-19 Data</td>
<td>Daily</td>
<td>217</td>
</tr>
<tr>
<td>Confirmed deaths</td>
<td>JHU CSSE COVID-19 Data</td>
<td>Daily</td>
<td>217</td>
</tr>
<tr>
<td>Reproduction rate</td>
<td>Arroyo-Marioli F, Bullano F, Kucinskas S, Rondón-Moreno C</td>
<td>Daily</td>
<td>192</td>
</tr>
<tr>
<td>Policy responses</td>
<td>Oxford COVID-19 Government Response Tracker</td>
<td>Daily</td>
<td>187</td>
</tr>
<tr>
<td>Other variables of interest</td>
<td>International organizations (UN, World Bank, OECD, IHME…)</td>
<td>Fixed</td>
<td>241</td>
</tr>
</tbody>
</table>


The income groups we use come from the [World Bank income classification](https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups). Low-income economies are defined as those with a GNI per capita, calculated using the [World Bank Atlas method](https://datahelpdesk.worldbank.org/knowledgebase/articles/378832-what-is-the-world-bank-atlas-method), of \$1,085 or less in 2021; lower middle-income economies are those with a GNI per capita between \$1,086 and \$4,255; upper middle-income economies are those with a GNI per capita between \$4,256 and \$13,205; high-income economies are those with a GNI per capita of more than \$13,205.
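
As a rough illustration, the thresholds quoted above can be turned into a small lookup. This is only a sketch: the function name `income_group` is hypothetical and not part of the notebook, and treating a value of exactly \$13,205 as upper middle income is an assumption about the boundary.

```python
def income_group(gni_per_capita):
    """Classify a 2021 GNI per capita (US$, Atlas method) using the thresholds quoted above."""
    if gni_per_capita <= 1085:
        return "Low income"
    if gni_per_capita <= 4255:
        return "Lower middle income"
    # Assumption: exactly 13,205 is treated as upper middle income
    if gni_per_capita <= 13205:
        return "Upper middle income"
    return "High income"


# Illustrative values only, not real country figures
for value in (500, 3000, 10000, 50000):
    print(value, income_group(value))
```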

<h1 align="center">Getting Started</h1>

We make use of Python 3 along with a few imported libraries: pandas, numpy, plotly, scikit-learn, seaborn, and more.

In [73]:
# Necessary libraries
import pandas as pd
import plotly.express as px
import plotly as ply
import seaborn as sns
import matplotlib as mpl
import matplotlib.pyplot as plt
import scipy.stats as sci
import plotly.io as pio
import datetime
import numpy as np
from sklearn import linear_model
from IPython.display import Markdown, display, HTML
from tabulate import tabulate
import math

pio.renderers.default = 'colab'



**Function**

In [74]:
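# Suffixes for each successive power of 1,000
millnames = ['',' Thousand',' Million',' Billion',' Trillion']
def millify(n):
    """Format a number with a magnitude suffix, e.g. 12230000000 -> '12.23 Billion'."""
    n = float(n)
    # Suffix index: floor(log10(|n|) / 3), clamped to the available names
    millidx = max(0,min(len(millnames)-1,
        int(math.floor(0 if n == 0 else math.log10(abs(n))/3))))
    return '{:.2f}{}'.format(n / 10**(3 * millidx), millnames[millidx])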



<h1 align="center">Reading data</h1>

The downloaded data come as a CSV file. In addition to the fields listed above, we want to enrich the data with the income group of each country.
To do this, we download a second CSV file containing the list of countries with their ISO codes and income groups, then merge the two tables on the ISO code field.
The result is shown below.

In [75]:
#The complete Our World in Data COVID-19 dataset 
#Source: https://ourworldindata.org/covid-vaccinations
df_owid_covid_data = pd.read_csv('https://covid.ourworldindata.org/data/owid-covid-data.csv')
df_owid_covid_data = df_owid_covid_data[['iso_code','continent','location','date','total_vaccinations','people_vaccinated','people_fully_vaccinated','total_vaccinations_per_hundred','people_vaccinated_per_hundred','people_fully_vaccinated_per_hundred']]
df_owid_covid_data.columns = ['Code', 'Continent', 'Country', 'Date','Vaccine doses administered','Number of people vaccinated','Number of people fully vaccinated','Vaccine doses administered %','Number of people vaccinated %','Number of people fully vaccinated %']
# Classification by income
#Source: https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups
df_world_income = pd.read_csv('https://raw.githubusercontent.com/Shadyorr/Income-groups/main/WorldIncome.csv')
df_world_income = df_world_income.drop(['Country','Region'], axis=1)
df_world_income.columns = ['Code','Income group']

#Merge Our World in Data COVID-19 dataset and Classification by income
df = pd.merge(df_owid_covid_data, df_world_income, on='Code', how="outer")

display(Markdown('**Our World in Data COVID-19 dataset :**'))
display(Markdown('Number of elements owid-covid-data : {}'.format(len(df_owid_covid_data))))

display(df_owid_covid_data.head(3))

display(Markdown('**Classification by income :**'))
display(Markdown('Number of elements WorldIncome : {}'.format(len(df_world_income))))

display(df_world_income.head(3))

display(Markdown('**Final data :**'))
display(Markdown('Number of elements after merge : {}'.format(len(df))))

display(df.head(3))

**Our World in Data COVID-19 dataset :**

Number of elements owid-covid-data : 202186

Unnamed: 0,Code,Continent,Country,Date,Vaccine doses administered,Number of people vaccinated,Number of people fully vaccinated,Vaccine doses administered %,Number of people vaccinated %,Number of people fully vaccinated %
0,AFG,Asia,Afghanistan,2020-02-24,,,,,,
1,AFG,Asia,Afghanistan,2020-02-25,,,,,,
2,AFG,Asia,Afghanistan,2020-02-26,,,,,,


**Classification by income :**

Number of elements WorldIncome : 218

Unnamed: 0,Code,Income group
0,ABW,High income
1,AFG,Low income
2,AGO,Lower middle income


**Final data :**

Number of elements after merge : 202190

Unnamed: 0,Code,Continent,Country,Date,Vaccine doses administered,Number of people vaccinated,Number of people fully vaccinated,Vaccine doses administered %,Number of people vaccinated %,Number of people fully vaccinated %,Income group
0,AFG,Asia,Afghanistan,2020-02-24,,,,,,,Low income
1,AFG,Asia,Afghanistan,2020-02-25,,,,,,,Low income
2,AFG,Asia,Afghanistan,2020-02-26,,,,,,,Low income
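
The outer merge above returns 202190 rows against 202186 in the OWID data, which is consistent with four codes in the income file having no match in the OWID dataset (assuming each code appears once in the income file). A minimal sketch, using made-up toy tables rather than the real files, of how the `indicator=True` option of `pd.merge` could reveal such unmatched rows:

```python
import pandas as pd

# Toy stand-ins for the two tables (illustrative values, not real data)
covid = pd.DataFrame({
    "Code": ["AFG", "AFG", "ITA", "OWID_WRL"],
    "Date": ["2021-02-22", "2021-02-28", "2021-02-22", "2021-02-22"],
})
income = pd.DataFrame({
    "Code": ["AFG", "ITA", "XXX"],  # "XXX" is a made-up code with no COVID rows
    "Income group": ["Low income", "High income", "High income"],
})

# indicator=True adds a "_merge" column: "both", "left_only" or "right_only"
merged = pd.merge(covid, income, on="Code", how="outer", indicator=True)

only_income = merged.loc[merged["_merge"] == "right_only", "Code"].tolist()
only_covid = merged.loc[merged["_merge"] == "left_only", "Code"].unique().tolist()

print(len(merged))
print("Codes only in the income table:", only_income)
print("Codes only in the COVID table:", only_covid)
```

Aggregate codes such as `OWID_WRL` have no income group, so they would show up as `left_only`; with `how="left"` instead of `how="outer"`, the unmatched income rows would not be added at all.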


<h1 align="center">Data processing and visualization</h1>

Now we have all our data, but some adjustments are needed. In this processing phase, we organize and "correct" the data so that it is easy to use in the subsequent stages of visualization and analysis.
First we delete the rows in which all of the vaccination fields are missing ("nan").

In [76]:
df = df.dropna(subset=['Vaccine doses administered','Number of people vaccinated','Number of people fully vaccinated','Vaccine doses administered %','Number of people vaccinated %','Number of people fully vaccinated %'],how='all')
display(Markdown('**Final data :**'))
display(Markdown('Number of elements after remove missing values : {}'.format(len(df))))

display(df.head(3))

**Final data :**

Number of elements after remove missing values : 56489

Unnamed: 0,Code,Continent,Country,Date,Vaccine doses administered,Number of people vaccinated,Number of people fully vaccinated,Vaccine doses administered %,Number of people vaccinated %,Number of people fully vaccinated %,Income group
364,AFG,Asia,Afghanistan,2021-02-22,0.0,0.0,,0.0,0.0,,Low income
370,AFG,Asia,Afghanistan,2021-02-28,8200.0,8200.0,,0.02,0.02,,Low income
386,AFG,Asia,Afghanistan,2021-03-16,54000.0,54000.0,,0.13,0.13,,Low income
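
Note that `how='all'` removes a row only when every listed column is missing, which is why the Afghanistan rows above still contain some empty cells. A small sketch on a made-up table (column names and values are hypothetical) contrasting it with `how='any'`:

```python
import numpy as np
import pandas as pd

# Toy table: row 0 has no vaccination data, row 1 is partial, row 2 is complete
toy = pd.DataFrame({
    "Date": ["2021-02-21", "2021-02-22", "2021-03-16"],
    "doses": [np.nan, 0.0, 54000.0],
    "fully_vaccinated": [np.nan, np.nan, 1000.0],
})
cols = ["doses", "fully_vaccinated"]

kept_all = toy.dropna(subset=cols, how="all")  # drops only the fully empty row
kept_any = toy.dropna(subset=cols, how="any")  # drops every row with a gap

print(len(kept_all), len(kept_any))
```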


In [77]:
display(Markdown("**<h3>The key numbers</h3>**"))

date_total_vaccinations = df[(df['Code'] == 'OWID_WRL')].max()
display(Markdown("<h4>As of {}, <mark>{}</mark> vaccine doses have been administered in the world:</h4>".format(pd.to_datetime(pd.Series(date_total_vaccinations['Date'])).dt.strftime('%d %B %Y')[0],millify(date_total_vaccinations['Vaccine doses administered'] ))))

low_income = df[(df['Code'] == 'OWID_LIC') & (pd.notna(df['Vaccine doses administered']))].max()
display(Markdown("* **<mark>{}</mark>** doses in low-income countries".format(millify(low_income['Vaccine doses administered']))))

lower_middle_income = df[(df['Code'] == 'OWID_LMC') & (pd.notna(df['Vaccine doses administered']))].max()
display(Markdown("* **<mark>{}</mark>** doses in lower middle-income countries".format(millify(lower_middle_income['Vaccine doses administered']))))

upper_middle_income = df[(df['Code'] == 'OWID_UMC') & (pd.notna(df['Vaccine doses administered']))].max()
display(Markdown("* **<mark>{}</mark>** doses in upper middle-income countries".format(millify(upper_middle_income['Vaccine doses administered']))))


high_income = df[(df['Code'] == 'OWID_HIC') & (pd.notna(df['Vaccine doses administered']))].max()
display(Markdown("* **<mark>{}</mark>** doses in high-income countries".format(millify(high_income['Vaccine doses administered']))))
display(Markdown("<br>"))
display(Markdown("To date, <mark>{}%</mark> of the world’s population has been fully vaccinated. But only <mark>{}%</mark> of people in low-income countries have been fully vaccinated. Lower middle income countries have fully vaccinated <mark>{}%</mark> of their people. That’s a huge difference compared with <mark>{}%</mark> in high income countries, and <mark>{}%</mark> in upper middle-income countries.".format(date_total_vaccinations['Number of people fully vaccinated %'],low_income['Number of people fully vaccinated %'],lower_middle_income['Number of people fully vaccinated %'],high_income['Number of people fully vaccinated %'],upper_middle_income['Number of people fully vaccinated %'])))

**<h3>The key numbers</h3>**

<h4>As of 18 July 2022, <mark>12.23 Billion</mark> vaccine doses have been administered in the world:</h4>

* **<mark>194.34 Million</mark>** doses in low-income countries

* **<mark>4.26 Billion</mark>** doses in lower middle-income countries

* **<mark>5.25 Billion</mark>** doses in upper middle-income countries

* **<mark>2.52 Billion</mark>** doses in high-income countries

<br>

To date, <mark>61.28%</mark> of the world’s population has been fully vaccinated. But only <mark>15.79%</mark> of people in low-income countries have been fully vaccinated. Lower middle income countries have fully vaccinated <mark>55.19%</mark> of their people. That’s a huge difference compared with <mark>73.47%</mark> in high income countries, and <mark>78.68%</mark> in upper middle-income countries.
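
The cell above takes `.max()` of each column independently, which gives the latest figures as long as the totals only ever grow. A hedged alternative, sketched here on made-up numbers with a hypothetical helper `latest_values`, is to sort by date and carry the last known value of each column forward:

```python
import numpy as np
import pandas as pd

# Toy data: the last day has no dose count yet (illustrative values only)
toy = pd.DataFrame({
    "Code": ["OWID_LIC"] * 3,
    "Date": ["2022-07-16", "2022-07-17", "2022-07-18"],
    "doses": [100.0, 120.0, np.nan],
    "fully_pct": [15.0, 15.5, 15.79],
})


def latest_values(frame, code):
    """Return the most recent non-missing value of every column for one code."""
    rows = frame[frame["Code"] == code].sort_values("Date")
    # ffill propagates the last known value down, so the final row is complete
    return rows.ffill().iloc[-1]


latest = latest_values(toy, "OWID_LIC")
print(latest["Date"], latest["doses"], latest["fully_pct"])
```

Unlike a column-wise maximum, this would also reflect a later downward correction in the source data.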

**<h3>Tracking progress against global vaccination targets</h3>**
<br>
In September 2021, world leaders aligned around the target of vaccinating 70% of the population in all countries by September 2022. By the end of 2021, wealthier countries had already met the target.

Based on current trends, low-income countries stand no chance of meeting the 70% vaccination target in 2022.

**At current vaccination rates, only wealthier countries will meet the target**

In [78]:
# .copy() avoids pandas' SettingWithCopyWarning when the column is modified below
df_incomes = df[(df['Code'].isin(['OWID_LIC','OWID_LMC','OWID_UMC','OWID_HIC']) & (df['Date'] >= "2021-01-01"))].copy()

df_incomes['Number of people fully vaccinated %'] = df_incomes['Number of people fully vaccinated %'].fillna(0)

fig = px.line(df_incomes, x="Date", y="Number of people fully vaccinated %", color='Country',
             labels={
                 'Country':"Income group"
             }, line_shape="spline", render_mode="svg")
fig.update_xaxes(
    dtick="M3",
    tickformat="%d %b %Y")

y =(int(df_incomes['Number of people fully vaccinated %'].max()/10)+1)*10 
fig.update_layout(yaxis_range=[0,y])
fig.add_shape(type="rect",
    x0="2021-01-01", x1=date_total_vaccinations['Date'], y0=70, y1=y,
    line=dict(color="rgba(0,0,0,0)",width=3,),
    fillcolor="LightSkyBlue",layer='below'
)
fig.add_annotation(text='Objective met', 
                    align='left',
                    showarrow=False,
                    xref='paper',
                    yref='paper',
                    x=0.1,
                    y=0.95)

fig.show()
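
One way to make the "based on current trends" statement concrete is a straight-line extrapolation of recent coverage. The sketch below is an assumption for illustration, not necessarily the method used in this notebook: the helper `projected_target_date` is hypothetical, it fits a line with `np.polyfit`, and the toy series only loosely echoes the 15.79% low-income figure reported above (the earlier points are invented).

```python
import numpy as np
import pandas as pd


def projected_target_date(dates, coverage_pct, target=70.0):
    """Linearly extrapolate coverage and return the date the target would be reached.

    Returns None when the fitted trend is flat or decreasing.
    """
    dates = pd.to_datetime(pd.Series(dates))
    days = (dates - dates.iloc[0]).dt.days.to_numpy(dtype=float)
    slope, intercept = np.polyfit(days, np.asarray(coverage_pct, dtype=float), 1)
    if slope <= 1e-9:
        return None
    days_needed = (target - intercept) / slope
    return dates.iloc[0] + pd.Timedelta(days=float(np.ceil(days_needed)))


# Invented series: +0.5 percentage points every 10 days
toy_dates = ["2022-06-18", "2022-06-28", "2022-07-08", "2022-07-18"]
toy_coverage = [14.29, 14.79, 15.29, 15.79]

projected = projected_target_date(toy_dates, toy_coverage)
print(projected)
```

A linear fit ignores supply shocks and saturation effects, so the result should be read as an order-of-magnitude indication rather than a forecast.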