<font size=+3 color="#133F6C"><center><b>COVID-19 Vaccination Progress: How do Countries in the EU Compare with the UK?</b></center></font>

<img src="https://images.unsplash.com/photo-1608638479472-b1181125d106?ixid=MXwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHw%3D&ixlib=rb-1.2.1&auto=format&fit=crop&w=746&q=80" width = 400>
<center><em>Photo by Hakan Nural (Unsplash)</em></center>


On the 31st of December 2020, at 11 p.m. GMT, the United Kingdom (UK) completed its separation from the European Union (EU) when the agreed transition period ended. Unsurprisingly, the next controversy in the often-rocky relationship between the two entities came only a month later and concerned the availability of vaccination jabs against COVID-19 ([Source](https://news.sky.com/story/covid-19-how-does-the-uk-compare-in-europes-race-for-vaccines-amid-shortage-warning-12199325)). Moreover, the EU has faced severe criticism regarding its vaccination efficiency, owing to burdensome paperwork, a lack of nurses, and supply shortages
([Source](https://www.nytimes.com/2021/01/05/world/europe/europe-covid-vaccinations.html)). On the other hand, the UK has been praised for its management of the vaccination rollout programme ([Source](https://uk.news.yahoo.com/german-newspaper-bild-we-envy-you-british-covid-vaccine-rollout-104131274.html)).

<br>

As countries race to inoculate their populations, this notebook looks at how the **UK is faring against the countries in the EU**.

<br>

**Table of Contents**

- [Outline](#Outline)
- [Explanation of Features](#Explanation-of-Features)
- [Introduction](#Introduction)
- [Functions](#Functions)<br>
    - [plot_bar_chart()](#plot_bar_chart())<br>
    - [include_flags()](#include_flags())<br>
    - [plot_map()](#plot_map())<br>
    - [plot_bubble_chart()](#plot_bubble_chart())<br>
    - [plot_vacc_progress()](#plot_vacc_progress())<br>
    - [create_animation()](#create_animation())<br>
- [Data](#Data)
    - [Geospatial Dataset](#Geospatial-Dataset-(ne_10m_admin_0_countries.geojson))<br>
    - [Vaccination Dataset](#Vaccination-Dataset-(country_vaccinations.csv))<br>
    - [Extra Information](#Extra-Information-(eucountrydatawb_csv.csv))<br>
    - [Merging the three Datasets](#Merging-the-three-Datasets)<br>
    - [Flags Dataset](#Flags-Dataset-(countries_continents_codes_flags_url.csv))<br>
- [Current State](#Current-State)<br>
    - [Overview](#Overview)<br>
    - [Vaccines](#Vaccines)<br>
    - [Vaccination Schemes](#Vaccination-Schemes)<br>
    - [Total Vaccinations](#Total-Vaccinations)<br>
    - [People Vaccinated](#People-Vaccinated)<br>
    - [People Fully Vaccinated](#People-Fully-Vaccinated)<br>
    - [Daily Vaccinations](#Daily-Vaccinations)
- [Does a country's wealth impact its vaccine rollout?](#Does-a-country's-wealth-impact-its-vaccine-rollout?)
    - [Alternative Comparison](#Alternative-Comparison)
- [Caveat](#Caveat)
- [Vaccination Progress](#Vaccination-Progress)<br>
    - [Total Vaccinations (%)](#Total-Vaccinations-(%))
    - [People Vaccinated (%)](#People-Vaccinated-(%))
    - [People Fully Vaccinated (%)](#People-Fully-Vaccinated-(%))
    - [Daily Vaccinations (per million)](#Daily-Vaccinations-(per-million))
    - [Animation](#Animation)
- [Extra Resources](#Extra-Resources)
- [Conclusions](#Conclusions)

<br>

# Outline

This notebook focuses only on the [27 current member states of the EU](https://en.wikipedia.org/wiki/Member_state_of_the_European_Union) plus the UK (abbreviated as '**EU/UK**' in this notebook). The comparison between countries will be based on four aspects of the vaccination process:

- The total number of vaccinations,
- The number of people vaccinated (1st dose),
- The number of people fully vaccinated (2nd dose), and
- The daily number of new vaccinations.

Apart from comparing the absolute numbers, we also need to factor in the population of each country. Therefore, each of the above features will be normalised per hundred people (for the first three features) or per million people (for the daily number of vaccinations). Our analysis will also factor in parameters such as GDP per capita. Lastly, because immunisations started on different dates in different countries, we will plot vaccination progress against the number of days since each country's first vaccination rather than against calendar dates (Year-Month-Day).
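
The idea of counting days from each country's first vaccination can be sketched in plain Python. This is only an illustration under assumed data: the country names, dates, and the helper `days_since_first()` are hypothetical and are not part of the notebook. The `plot_vacc_progress()` function defined later uses the row position after `reset_index()` instead, which gives the same result only when a country has exactly one row per consecutive day.

```python
from datetime import date

# Hypothetical reporting dates for two made-up countries (illustrative only).
records = [
    ("Country A", date(2020, 12, 8)),
    ("Country A", date(2020, 12, 9)),
    ("Country A", date(2020, 12, 11)),
    ("Country B", date(2020, 12, 27)),
    ("Country B", date(2020, 12, 29)),
]


def days_since_first(records):
    """Replace each date with the number of days since that country's earliest date."""
    first_day = {}
    for country, day in records:
        # Track the earliest date seen for every country.
        if country not in first_day or day < first_day[country]:
            first_day[country] = day
    return [(country, (day - first_day[country]).days) for country, day in records]


aligned = days_since_first(records)
for country, day_number in aligned:
    print(country, day_number)
```

With this alignment, both countries start at day 0 even though their campaigns began almost three weeks apart.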

<br>

# Explanation of Features

As mentioned earlier, we will compare countries based on four aspects of the vaccination process:

- The **total number of vaccinations** (`total_vaccinations`): total number of doses administered. Each dose is counted separately, so this number may not equal the total number of people vaccinated, depending on the dose regime (e.g. when people receive multiple doses). If a person receives one dose of the vaccine, this metric goes up by 1. If they receive a second dose, it goes up by 1 again.
- The **number of people vaccinated** (`people_vaccinated`): total number of people who received at least one vaccine dose. If a person receives the first dose of a 2-dose vaccine, this metric goes up by 1. If they receive the second dose, the metric stays the same.
- The **number of people fully vaccinated** (`people_fully_vaccinated`): total number of people who received all doses prescribed by the vaccination protocol. If a person receives the first dose of a 2-dose vaccine, this metric stays the same. If they receive the second dose, the metric goes up by 1.
- The **daily number of new vaccinations** (`daily_vaccinations`): new doses administered per day.

<br>

As an example of how these metrics are calculated, consider 4 people who take part in a vaccination programme using a vaccine that requires 2 doses to be effective against the disease.

- Dina has received 2 doses;
- Joel has received 1 dose;
- Tommy has received 1 dose;
- Ellie has not received any dose.

In our data:

- The total number of doses administered (`total_vaccinations`) will be equal to **4** (2 + 1 + 1);
- The total number of people vaccinated (`people_vaccinated`) will be equal to **3** (Dina, Joel, Tommy);
- The total number of people fully vaccinated (`people_fully_vaccinated`) will be equal to **1** (Dina).
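
The same bookkeeping can be written as a few lines of Python. This is a minimal sketch of the worked example above; the dictionary `doses_received` and the constant `doses_required` are names introduced here for illustration and do not appear in the dataset.

```python
# Doses received by each person in the worked example.
doses_received = {"Dina": 2, "Joel": 1, "Tommy": 1, "Ellie": 0}
doses_required = 2  # the example vaccine needs two doses

# Every administered dose counts once.
total_vaccinations = sum(doses_received.values())

# A person counts once as soon as they have at least one dose.
people_vaccinated = sum(1 for doses in doses_received.values() if doses >= 1)

# A person counts once when they have completed the protocol.
people_fully_vaccinated = sum(
    1 for doses in doses_received.values() if doses >= doses_required
)

print(total_vaccinations, people_vaccinated, people_fully_vaccinated)  # 4 3 1
```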

You can read more in [Ref. 1](#Extra-Resources).

<br>

# Introduction

Let's start the notebook by importing the necessary libraries.

In [None]:
import numpy as np
import pandas as pd
import geopandas as gpd

import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go

from matplotlib.lines import Line2D
from mpl_toolkits.axes_grid1 import make_axes_locatable
from plotly.subplots import make_subplots
from shapely.geometry import Point

!pip install pywaffle;
from pywaffle import Waffle

In [None]:
print('✔️ Libraries imported!')

We also need to set some default parameters for the whole notebook.

In [None]:
font_size = 22
default_color = '#133F6C'
facecolor = '#f6f5f5'

plt.rcParams['axes.labelsize'] = font_size
plt.rcParams['axes.titlesize'] = font_size + 3
plt.rcParams['xtick.labelsize'] = font_size - 2
plt.rcParams['ytick.labelsize'] = font_size - 2

plt.rcParams['axes.edgecolor'] = default_color
plt.rcParams['xtick.color'] = default_color
plt.rcParams['ytick.color'] = default_color
plt.rcParams['axes.labelcolor'] = default_color
plt.rcParams['axes.titlecolor'] = default_color
plt.rcParams['axes.titleweight'] = 'bold'

plt.rcParams['figure.facecolor'] = facecolor
plt.rcParams['axes.facecolor'] = facecolor

%config InlineBackend.figure_format = 'retina'

pd.options.mode.chained_assignment = None

print('✔️ Default parameters set!')

Finally, our analysis requires two qualitative colour palettes that we will define now.

In [None]:
palette_1 = ['#4E89AE', '#ED6663', '#FFA372', '#726A95', '#A3D2CA']
palette_2 = ['#D73838', '#132743', '#0089BA', '#74C7B8', '#EE6F57']

sns.palplot(palette_1)
plt.title('Palette #1')
plt.tick_params(axis='x', bottom=False, labelbottom=False)

sns.palplot(palette_2)
plt.title('Palette #2')
plt.tick_params(axis='x', bottom=False, labelbottom=False);

<br>

# Functions

Since we are going to reuse parts of the code, it will be helpful to define some functions.

## plot_bar_chart()

In [None]:
def plot_bar_chart(fig, df, column, color, title, height, width):
    '''Create an interactive bar chart with countries sorted by the specified feature.'''
    column_norm = column + '_per_hundred'

    data = df.sort_values(by=[column], ascending=True)[['country', 'iso_code', column, column_norm, 'color']]

    fig.add_trace(
        go.Bar(x=data[column],
               y=data['country'],
               name='Vaccinated',
               hovertemplate='%{x:,.0f}',
               text=data[column_norm],
               textposition="outside",
               insidetextanchor='end',
               textfont_color=data['color'],
               orientation='h',
               marker=dict(color=data['color'],
                           line=dict(color=data['color'], width=1))))

    fig.update_traces(texttemplate='%{text:.1f}%')
    fig.update_xaxes(title_text='Number', range=[0, data[column].max() + 10 * 1E+6])

    fig.update_layout(title=title,
                      font=dict(size=14, color=color),
                      hovermode='y unified',
                      showlegend=False,
                      plot_bgcolor=facecolor,
                      paper_bgcolor=facecolor,
                      height=height,
                      width=width)


print('✔️ Function defined!')

## include_flags()

In [None]:
def include_flags(fig, df, column, offset, sizex, sizey):
    '''Include the flag of the country next to its bar in the bar chart.'''
    data = df.sort_values(by=[column], ascending=True)[['country', 'iso_code', column]]

    for country, value in zip(data['country'], data[column]):
        flag_url = flags_df[flags_df['country'] == country]['image_url'].values[0]

        fig.add_layout_image(
            dict(
                source=flag_url,
                x=int(value) + offset,
                y=country,
                sizex=sizex,
                sizey=sizey,
                xanchor="left",
                yanchor="middle",
                sizing='stretch',
                xref='x',
                yref="y",
            ))


print('✔️ Function defined!')

## plot_map()

In [None]:
def plot_map(fig, df, column, title, colormap, width, height):
    '''Plot a map with each country colored according to its value for the specified feature.'''
    data = df.sort_values(by=[column], ascending=True)[['country', 'iso_code', column, 'color']]

    hovertemplate = "%{z:.1f}%<extra>%{text}</extra>" if column != 'daily_vaccinations_per_million' else "%{z:.0f}<extra>%{text}</extra>"

    fig.add_trace(
        go.Choropleth(locations=data['iso_code'],
                      locationmode='ISO-3',
                      z=data[column],
                      marker=dict(line=dict(color=data['color'], width=1)),
                      text=data['country'],
                      hovertemplate=hovertemplate,
                      colorscale=colormap,
                      colorbar_title=title,
                      colorbar_titleside='right'))

    fig.update_layout(geo=dict(visible=False,
                               showcountries=False,
                               landcolor=facecolor,
                               countrycolor=facecolor,
                               showlakes=False,
                               showocean=True,
                               oceancolor=facecolor,
                               projection_type='natural earth'),
                      title={
                          'text': f"{title} (as of {final_date_str})",
                          'x': 0.5,
                          'xanchor': 'center',
                          'font': {
                              'size': 16
                          }
                      },
                      plot_bgcolor=facecolor,
                      paper_bgcolor=facecolor,
                      width=width,
                      height=height)

    fig.update_geos(lataxis_range=[32, 70],
                    lonaxis_range=[-10, 36],
                    resolution=50)


print('✔️ Function defined!')

## plot_bubble_chart()

In [None]:
def plot_bubble_chart(df, features, size, color, color_discrete_map, axes, title, height, width):
    '''Plot a bubble chart where each bubble describes four variables (through its x- and y-values, size, and color.)'''
    fig = px.scatter(df,
                     x=features[0],
                     y=features[1],
                     size=size,
                     color=color,
                     color_discrete_map=color_discrete_map,
                     hover_name='country')

    fig.update_traces(marker=dict(line=dict(width=1, color='firebrick')))

    fig.update_xaxes(title=axes[0])
    fig.update_yaxes(title=axes[1])

    line_color = default_color
    fig.update_layout(title={'text': title},
                      xaxis=dict(linewidth=1.5,
                                 linecolor=line_color,
                                 mirror=True,
                                 tickfont=dict(size=12, color=line_color)),
                      yaxis=dict(linewidth=1.5,
                                 linecolor=line_color,
                                 mirror=True,
                                 tickfont=dict(size=12, color=line_color)),
                      font=dict(size=14, color=line_color),
                      legend=dict(orientation='h',
                                  xanchor='left',
                                  yanchor='bottom',
                                  x=-0.1,
                                  y=-0.45),
                      plot_bgcolor=facecolor,
                      paper_bgcolor=facecolor,
                      height=height,
                      width=width)

    return fig


print('✔️ Function defined!')

## plot_vacc_progress()

In [None]:
def plot_vacc_progress(fig, df, countries, feature, colors, ylabel, title, height, width, interpolate=True):
    '''Create a line chart describing one feature related to vaccination for all specified countries.'''
    for index, country in enumerate(countries):

        # Define the data.
        df_country = df.loc[df['country'] == country].reset_index()
        y = df_country[feature].interpolate() if interpolate else df_country[feature]

        # Plot the data.
        fig.add_trace(
            go.Scatter(x=df_country.index,
                       y=y,
                       name=country,
                       mode='lines',
                       line=dict(width=3),
                       marker_color=colors[index]))

    # Update the axes parameters.
    line_color = default_color
    fig.update_xaxes(title_text='Days Since First Vaccination')
    fig.update_yaxes(title_text=ylabel, type='log')

    # Update the layout parameters.
    fig.update_layout(title_text=title,
                      xaxis=dict(linewidth=1.5,
                                 linecolor=line_color,
                                 mirror=True,
                                 tickfont=dict(size=12, color=line_color)),
                      yaxis=dict(linewidth=1.5,
                                 linecolor=line_color,
                                 mirror=True,
                                 tickfont=dict(size=12, color=line_color)),
                      font=dict(size=14, color=line_color),
                      hovermode='x unified',
                      legend=dict(orientation='h',
                                  xanchor='right',
                                  yanchor='bottom',
                                  x=1,
                                  y=1.02),
                      plot_bgcolor=facecolor,
                      paper_bgcolor=facecolor,
                      height=height,
                      width=width)

    return fig


print('✔️ Function defined!')

## create_animation()

In [None]:
def create_animation(df, features, size, color_discrete_sequence, range_x, range_y, axes, title, height, width):
    '''Create an animation where each frame is a bubble chart.'''
    fig = px.scatter(df,
                     x=features[0],
                     y=features[1],
                     animation_frame='date',
                     animation_group='country',
                     size=size,
                     color='vaccines',
                     color_discrete_sequence=color_discrete_sequence,
                     hover_name='country',
                     log_x=True,
                     range_x=range_x,
                     range_y=range_y)

    fig.update_traces(marker=dict(line=dict(width=1, color='firebrick')))

    fig.update_xaxes(title=axes[0])
    fig.update_yaxes(title=axes[1])

    line_color = default_color
    fig.update_layout(title={'text': title},
                      xaxis=dict(linewidth=1.5,
                                 linecolor=line_color,
                                 mirror=True,
                                 tickfont=dict(size=12, color=line_color)),
                      yaxis=dict(linewidth=1.5,
                                 linecolor=line_color,
                                 mirror=True,
                                 tickfont=dict(size=12, color=line_color)),
                      font=dict(size=14, color=line_color),
                      showlegend=False,
                      plot_bgcolor=facecolor,
                      paper_bgcolor=facecolor,
                      height=height,
                      width=width)

    return fig


print('✔️ Function defined!')

<br>

# Data

Our analysis requires three (plus one) datasets:

- **ne_10m_admin_0_countries**: a GeoJSON file containing geospatial data for all countries of the world.
- **country_vaccinations**: a CSV file that stores information about how vaccinations progress in many countries.
- **eucountrydatawb_csv**: a CSV file containing information about a country's population and GDP.

The first two datasets contain information even for non-EU countries; therefore, we first need to isolate the data for the 27 EU members + the UK. Then, we will merge all three datasets. 

Additionally, we will use another dataset (**countries_continents_codes_flags_url.csv**) for including a flag for each country in bar charts.


## Geospatial Dataset (ne_10m_admin_0_countries.geojson)

In [None]:
url = 'https://raw.githubusercontent.com/nvkelso/natural-earth-vector/master/geojson/ne_10m_admin_0_countries.geojson'
map_df = gpd.read_file(url)

print('✔️ Raw dataset imported!')

In [None]:
map_df = map_df[map_df['SOVEREIGNT'] == map_df['ADMIN']]  # Exclude overseas territories.

map_df = map_df[['ADMIN', 'geometry']]  # Select only the 'ADMIN' and 'geometry' columns.
map_df.columns = ['country', 'geometry']

map_df['country'].replace({'Czechia': 'Czech Republic'}, inplace=True)  # Change the entry for the Czech Republic.

countries = [
    'Austria', 'Belgium', 'Bulgaria', 'Croatia', 'Cyprus', 'Czech Republic',
    'Denmark', 'Estonia', 'Finland', 'France', 'Germany', 'Greece', 'Hungary',
    'Ireland', 'Italy', 'Latvia', 'Lithuania', 'Luxembourg', 'Malta',
    'Netherlands', 'Poland', 'Portugal', 'Romania', 'Slovakia', 'Slovenia',
    'Spain', 'Sweden', 'United Kingdom'
]

map_eur = map_df[map_df['country'].isin(countries)].reset_index()

map_eur['centroid'] = map_eur['geometry'].set_crs(epsg=3395, allow_override=True).centroid

# We need to change the centroid for some countries.
france_id = map_eur[map_eur['country'] == 'France'].index[0]
cyprus_id = map_eur[map_eur['country'] == 'Cyprus'].index[0]
lux_id = map_eur[map_eur['country'] == 'Luxembourg'].index[0]
malta_id = map_eur[map_eur['country'] == 'Malta'].index[0]

map_eur['centroid'].iat[france_id] = Point(3, 47)
map_eur['centroid'].iat[cyprus_id] = Point(32.989, 36)
map_eur['centroid'].iat[lux_id] = Point(6.07, 49.0)
map_eur['centroid'].iat[malta_id] = Point(14.404, 35.1)

print('✔️ Dataset modified successfully!\nSample of 3 rows:')
map_eur.sample(3)

## Vaccination Dataset (country_vaccinations.csv)

In [None]:
vacc_df = pd.read_csv('../input/covid-world-vaccination-progress/country_vaccinations.csv',
                      parse_dates=['date'])

vacc_df.drop(['source_name', 'source_website'], axis=1, inplace=True)
vacc_df['country'].replace({'Czechia': 'Czech Republic'}, inplace=True)

print('✔️ Dataset imported!\n')
print('It contains information about {} unique countries,'.format(vacc_df['country'].nunique()))
print('and it is organised in {} rows x {} columns.'.format(vacc_df.shape[0], vacc_df.shape[1]))

In [None]:
vacc_eur = vacc_df[vacc_df['country'].isin(countries)]

vacc_eur['color'] = np.where(vacc_eur['country'] == 'United Kingdom', 'firebrick', default_color)

print('✔️ Subset selected!\nLast 3 rows:')
vacc_eur.tail(3)

**Note**: The dataset is not updated on the same day for all countries. Therefore, we need to select the latest date that is common to all 28 countries.
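
To see why the minimum of the per-country last dates is the right cut-off, consider this small plain-Python sketch. The country names and dates are made up for illustration; the notebook itself performs the equivalent step with pandas in the next cell.

```python
from datetime import date

# Hypothetical last reporting date for each country (illustrative only).
last_update = {
    "Country A": date(2021, 7, 6),
    "Country B": date(2021, 7, 5),
    "Country C": date(2021, 7, 7),
}

# The latest date for which every country has reported is the earliest
# of the individual last dates.
common_date = min(last_update.values())

# Hypothetical daily rows; anything after the common date is dropped.
rows = [
    ("Country A", date(2021, 7, 5)),
    ("Country A", date(2021, 7, 6)),
    ("Country B", date(2021, 7, 5)),
    ("Country C", date(2021, 7, 7)),
]
kept = [row for row in rows if row[1] <= common_date]

print(common_date)  # 2021-07-05
print(kept)
```

Any later cut-off would leave at least one country without data for the final day, which would make the comparison uneven.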

In [None]:
vacc_eur_gb = vacc_eur.groupby(['country', 'iso_code', 'vaccines']).last().reset_index()
final_date = vacc_eur_gb['date'].unique().min()
final_date_str = str(pd.to_datetime(final_date).date())

print('The last date that is common in all countries (EU/UK):', final_date_str)

vacc_eur = vacc_eur[vacc_eur['date'] <= final_date_str]

## Extra Information (eucountrydatawb_csv.csv)

In [None]:
url = 'https://pkgstore.datahub.io/opendatafortaxjustice/eucountrydatawb/eucountrydatawb_csv/data/2844298f518898b5ccc36ddc3a0d28be/eucountrydatawb_csv.csv'
eu_info = pd.read_csv(url)

eu_info.columns = ['country', 'population', 'GDP (Billions)', 'GDP per Capita']
eu_info['country'].replace({'Slovak Republic': 'Slovakia'}, inplace=True)

print('✔️ Dataset imported!\nLast 3 rows:')
eu_info.tail(3)

## Merging the three Datasets

Remember that the vaccination dataset stores data since the beginning of vaccination for each country. Consequently, we first need to use the `groupby()` method to group the data based on three features: 'country', 'iso_code', 'vaccines'. We are interested in the final day; therefore, we will keep each group's last entry.

Once this is complete, we will merge the three datasets based on the common 'country' feature. 
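
The two steps (keep each country's final entry, then attach extra information by country) can be sketched without pandas as follows. All names and numbers here are hypothetical. Note one difference from the real code: pandas' `groupby(...).last()` takes the last non-missing value of each column, whereas this sketch simply keeps the last row.

```python
# Hypothetical daily rows, already sorted by date within each country.
daily_rows = [
    {"country": "Country A", "date": "2021-07-04", "total_vaccinations": 100},
    {"country": "Country A", "date": "2021-07-05", "total_vaccinations": 120},
    {"country": "Country B", "date": "2021-07-05", "total_vaccinations": 50},
]

# Hypothetical extra information; Country B is deliberately missing.
extra_info = {"Country A": {"population": 10.0}}

# Step 1: keep the last row per country (later rows overwrite earlier ones).
latest = {}
for row in daily_rows:
    latest[row["country"]] = row

# Step 2: left merge on 'country'. Countries without extra information
# are kept and receive None, mirroring how='left'.
merged = [
    {**row, **extra_info.get(country, {"population": None})}
    for country, row in latest.items()
]

for row in merged:
    print(row)
```

A left merge is a natural choice here because every country on the map should survive the merge, even if one of the other datasets has no matching row.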

In [None]:
vacc_eur_gb = vacc_eur.groupby(['country', 'iso_code', 'vaccines']).last().reset_index()

vacc_merged = pd.merge(map_eur, vacc_eur_gb, on='country', how='left')
vacc_tot = pd.merge(vacc_merged, eu_info, on='country', how='left')

print('✔️ Datasets Merged!\nLast 3 rows:')
vacc_tot.tail(3)

## Flags Dataset (countries_continents_codes_flags_url.csv)

In [None]:
url = '../input/countries-iso-codes-continent-flags-url/countries_continents_codes_flags_url.csv'

flags_df = pd.read_csv(url, usecols=['country', 'image_url', 'alpha-3'])

print('✔️ Dataset imported!\nSample of 3 rows:')
flags_df.sample(3)

<br>

# Current State

The first thing to look at is the current state of vaccinations.

## Overview

In [None]:
# Inspired by Devakumar kp's 'plot_card_grid' function
# in his notebook: https://www.kaggle.com/imdevskp/comparing-pandemics-epidemics-outbreaks

sum_df = vacc_tot.sum().to_frame().T
total_pop = eu_info['population'].sum()

cols = ['total_vaccinations', 'people_vaccinated', 'people_fully_vaccinated', 'daily_vaccinations']
title = 'Current state in the EU/UK (as of {})'.format(final_date_str)

fig, ax = plt.subplots(1, len(cols), figsize=(17, 4))
ax = ax.flatten()

ax[0].text(0.02, 0.9, title, size=22, color=default_color, weight='bold')
ax[0].text(0.02,
           0.7,
           'Total population: {:.1f} Millions'.format(total_pop),
           size=18,
           color=default_color,
           weight='semibold')

for ind, col in enumerate(cols):
    ax[ind].text(0.5,
                 0.5,
                 col.replace('_', ' ').title(),
                 ha='center',
                 va='center',
                 fontsize=17,
                 color='white',
                 weight='heavy',
                 bbox=dict(edgecolor='white',
                           facecolor=default_color,
                           lw=2,
                           pad=10))

    ax[ind].text(0.5,
                 0.2,
                 '{:,}\n({:.1f}%)'.format(
                     int(sum_df[col]),
                     100 * int(sum_df[col]) / (total_pop * 1E+6)),
                 ha='center',
                 va='center',
                 fontfamily='monospace',
                 fontsize=28,
                 fontweight='bold',
                 color='#D23228')

    ax[ind].set_axis_off()

As of 2021-07-05, the total number of vaccinations in the EU/UK exceeds 450 million. More than 275 million people have received one dose of the vaccine, which amounts to almost 55% of the population. Nearly 200 million people have been fully immunised (approximately 38% of the population). The number of new vaccinations on 2021-07-05 exceeds 4 million.

##  Vaccines

In [None]:
vaccines = pd.DataFrame(vacc_tot['vaccines'].str.split(', ').tolist()).stack().value_counts()

fig, ax = plt.subplots(figsize=(16, 6))

sns.barplot(x=vaccines.values,
            y=vaccines.index,
            color=default_color,
            edgecolor='#D23228',
            lw=1.5,
            ax=ax)

for index, value in enumerate(vaccines):
    ax.annotate(value,
                xy=(value + 0.75, index),
                ha='center',
                va='center',
                size=15,
                color='#D23228',
                weight='bold',
                bbox=dict(facecolor='none',
                          edgecolor='#D23228',
                          boxstyle='round',
                          linewidth=2))

ax.set_title('Vaccines used within the EU/UK')

for s in ['right', 'top']:
    ax.spines[s].set_visible(False)

ax.set_xlabel('Number of countries');

Six different vaccines have been used within the EU/UK. The most common vaccine is **Pfizer/BioNTech**, which is in use in all 28 countries. The second and third most common vaccines are Oxford/AstraZeneca (27/28 countries) and Moderna (26/28 countries). Johnson&Johnson is in use in 21 countries. Interestingly, only one country (Hungary, as we will see later) uses the Chinese Sinopharm vaccine and the Russian Sputnik V vaccine.

## Vaccination Schemes

However, countries usually use a combination of different vaccines, which we refer to as 'vaccination schemes'.
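
The difference between counting individual vaccines and counting schemes can be illustrated with a short plain-Python sketch. The three countries and their scheme strings are hypothetical; the only assumption taken from the notebook is that a scheme is stored as a single string with vaccine names separated by `', '`.

```python
from collections import Counter

# Hypothetical scheme strings, one per country (illustrative only).
schemes = {
    "Country A": "Moderna, Pfizer/BioNTech",
    "Country B": "Moderna, Oxford/AstraZeneca, Pfizer/BioNTech",
    "Country C": "Moderna, Pfizer/BioNTech",
}

# Counting schemes: the whole string is the unit.
scheme_counts = Counter(schemes.values())

# Counting vaccines: split each string into its individual vaccine names first.
vaccine_counts = Counter(
    vaccine for scheme in schemes.values() for vaccine in scheme.split(", ")
)

print(scheme_counts)
print(vaccine_counts)
```

Here two distinct schemes cover three countries, while Moderna and Pfizer/BioNTech each appear in all three.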


In [None]:
print('There are {} different vaccination schemes within the studied countries.'.format(vacc_tot['vaccines'].nunique()))

A [waffle plot](https://github.com/gyli/PyWaffle) is an excellent way of visualising the number of countries that use each scheme.

In [None]:
vaccines_dict = dict(vacc_tot['vaccines'].value_counts(normalize=True) * 100)

fig = plt.figure(
    FigureClass=Waffle,
    values=vaccines_dict,
    rows=4,
    columns=7,
    figsize=(16, 9),
    colors=palette_1,
    icons='syringe',
    font_size=100,
    title={
        'label':
        'Distribution of Vaccination Schemes among the EU/UK Countries',
        'loc': 'center',
        'fontdict': {
            'fontsize': 23,
        }
    },
    labels=[f'{k} ({v:.1f}%)' for k, v in vaccines_dict.items()],
    legend={
        'loc': 'lower left',
        'bbox_to_anchor': (0, -0.3),
        'ncol': 1,
        'fontsize': 14,
        'labelcolor': default_color
    })

It is also interesting to create a map and see the vaccination scheme for every country.

In [None]:
vaccine_schemes_list = vacc_tot['vaccines'].value_counts().index

color_mapping = {
    vaccine_schemes_list[0]: palette_1[0],
    vaccine_schemes_list[1]: palette_1[1],
    vaccine_schemes_list[2]: palette_1[2],
    vaccine_schemes_list[3]: palette_1[3],
    vaccine_schemes_list[4]: palette_1[4]
}

fig, ax = plt.subplots(figsize=(12, 12))

vacc_tot['geometry'].boundary.plot(edgecolor='white', linewidth=0.5, ax=ax)
vacc_tot[vacc_tot['country'] == 'United Kingdom']['geometry'].boundary.plot(edgecolor='firebrick', ax=ax)

vacc_tot.plot(color=vacc_tot['vaccines'].map(color_mapping),
              categorical=True,
              ax=ax)

for i in range(len(vacc_tot)):
    ax.text(vacc_tot.iloc[i]['centroid'].x,
            vacc_tot.iloc[i]['centroid'].y,
            vacc_tot.iloc[i]['iso_code'],
            bbox=dict(boxstyle='round', pad=0.2, alpha=0.7, color='white'),
            ha='center',
            va='center',
            size=8,
            color='red',
            weight='semibold')

custom_lines = [Line2D([0], [0], color=color, lw=5) for color in color_mapping.values()]
leg_lines = ax.legend(custom_lines,
                      color_mapping.keys(),
                      fontsize=13,
                      loc=(0, -0.15),
                      labelcolor=default_color)
ax.add_artist(leg_lines)

ax.set_title('Vaccination Scheme in each Country', size=28)
ax.set_xticks([])
ax.set_yticks([])
ax.set_xlim([-13, 35])
ax.set_ylim([33, 71])

ax.set_frame_on(False)

19 out of the 27 countries of the EU use a combination of four vaccines: Johnson&Johnson, Moderna, Oxford/AstraZeneca, and Pfizer/BioNTech. The UK (along with Luxembourg, Croatia, Slovakia, and Finland) uses only Moderna, Oxford/AstraZeneca, and Pfizer/BioNTech. Denmark is the only country that has not used Oxford/AstraZeneca. As mentioned earlier, Hungary is the country with the highest diversity in vaccines since it uses all the previously mentioned ones, plus Sinopharm and Sputnik.

<br>

We will now focus on the four features of the vaccination process (for more info, please [see here](#Explanation-of-Features)). For each one of these features, the comparison between countries will be performed using **two** visualisation methods:

- A **bar chart** with countries sorted according to their value. Next to each bar, a number will be shown indicating the same feature but normalised according to population ('per hundred people' for the first three features and 'per million' for the daily number of vaccinations).
- A **map** showing the normalised number for that feature (again, 'per hundred people' for the first three features and 'per million' for the daily number of vaccinations).
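
The arithmetic behind the normalised figures is simple and can be sketched as below. The notebook reads the normalised columns (e.g. `total_vaccinations_per_hundred`) directly from the dataset, so the helper functions `per_hundred()` and `per_million()` and the numbers used here are hypothetical and serve only to show what such a figure means.

```python
def per_hundred(count, population):
    """Express a count as a number per 100 inhabitants."""
    return 100 * count / population


def per_million(count, population):
    """Express a count as a number per 1,000,000 inhabitants."""
    return 1_000_000 * count / population


# Hypothetical country with 5 million inhabitants (illustrative only).
population = 5_000_000
people_vaccinated = 2_750_000
daily_vaccinations = 40_000

vaccinated_share = per_hundred(people_vaccinated, population)
daily_rate = per_million(daily_vaccinations, population)

print(vaccinated_share)  # 55.0
print(daily_rate)        # 8000.0
```

Normalising in this way lets a small country be compared fairly with a large one, which absolute counts alone cannot do.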

To ease the comparison between the EU and the UK, the <font size=+0 color="firebrick"><b>UK</b></font> will be depicted with a <font size=+0 color="firebrick"><b>dark red</b></font> bar and edge colour in the bar chart and the map, respectively.

## Total Vaccinations

In [None]:
fig = go.Figure()

plot_bar_chart(fig,
               df=vacc_tot,
               column='total_vaccinations',
               color=default_color,
               title='Total Vaccinations (as of {})'.format(final_date_str),
               height=1000,
               width=800)

include_flags(fig,
              vacc_tot,
              'total_vaccinations',
              offset=9*1E+6,
              sizex=3 * 1E+6,
              sizey=0.5)

fig.show()

In [None]:
fig = go.Figure()

plot_map(fig,
         df=vacc_tot,
         column='total_vaccinations_per_hundred',
         title='Total Vaccinations (%)',
         colormap='Blues',
         width=800,
         height=600)

fig.show()

## People Vaccinated

In [None]:
fig = go.Figure()

plot_bar_chart(fig,
               df=vacc_tot,
               column='people_vaccinated',
               color=default_color,
               title='People Vaccinated (as of {})'.format(final_date_str),
               height=1000,
               width=800)

include_flags(fig,
              vacc_tot,
              'people_vaccinated',
              offset=5E+6,
              sizex=2.2 * 1E+6,
              sizey=0.5)

fig.show()

In [None]:
fig = go.Figure()

plot_map(fig,
         df=vacc_tot,
         column='people_vaccinated_per_hundred',
         title='People Vaccinated (%)',
         colormap='Blues',
         width=800,
         height=600)

fig.show()

## People Fully Vaccinated

In [None]:
fig = go.Figure()

plot_bar_chart(
    fig,
    df=vacc_tot,
    column='people_fully_vaccinated',
    color=default_color,
    title='People Fully Vaccinated (as of {})'.format(final_date_str),
    height=1000,
    width=800)

include_flags(fig,
              vacc_tot,
              'people_fully_vaccinated',
              offset=4E+6,
              sizex=1.5 * 1E+6,
              sizey=0.5)

fig.show()

In [None]:
fig = go.Figure()

plot_map(fig,
         df=vacc_tot,
         column='people_fully_vaccinated_per_hundred',
         title='People Fully Vaccinated (%)',
         colormap='Blues',
         width=800,
         height=600)

fig.show()

## Daily Vaccinations

In [None]:
data = vacc_tot.sort_values(by=['daily_vaccinations'], ascending=True)[['country', 'iso_code', 'daily_vaccinations', 'daily_vaccinations_per_million', 'color']]

fig = go.Figure()

fig.add_trace(
    go.Bar(x=data['daily_vaccinations'],
           y=data['country'],
           name='Daily Vaccinations',
           hovertemplate='%{x:,.0f}',
           text=data['daily_vaccinations_per_million'],
           textposition="outside",
           insidetextanchor='start',
           textfont_color=data['color'],
           orientation='h',
           marker=dict(color=data['color'],
                       line=dict(color=data['color'], width=1))))

fig.update_traces(texttemplate='%{text:.0f} per million')
fig.update_xaxes(title_text='Daily Vaccinations',
                 range=[0, data['daily_vaccinations'].max() + 200_000])

line_color = default_color
fig.update_layout(title='Daily Vaccinations (as of {})'.format(final_date_str),
                  font=dict(size=14, color=line_color),
                  barmode='stack',
                  hovermode='y unified',
                  showlegend=False,
                  plot_bgcolor=facecolor,
                  paper_bgcolor=facecolor,
                  height=1000,
                  width=800)

include_flags(fig,
              vacc_tot,
              'daily_vaccinations',
              offset=23E+4,
              sizex=3.5 * 1E+4,
              sizey=0.5)
fig.show()

In [None]:
fig = go.Figure()

plot_map(
    fig,
    df=vacc_tot,
    column='daily_vaccinations_per_million',
    title='Daily Vaccinations (per million)',
    colormap='Blues',
    width=800,
    height=600)

fig.show()

**Malta** currently appears to be the most successful country in normalised terms, as it has the highest percentage of people who have received both the first and the second dose.

Among countries with a large population (> 10 million), the **UK** is leading the vaccination race. Since December, the UK has administered more than 79 million doses. In comparison, Germany and France have administered almost 78 and 57 million doses, respectively.

The UK also leads in terms of vaccinations per capita. Specifically, more than 45 million people have received at least one dose, which amounts to almost 67% of the UK's population. In comparison, Germany, the best-performing EU country in absolute numbers, has immunised nearly 2 million more people than the UK, which amounts to approximately 56.5% of the country's population.

Almost one in two people in the UK has been fully immunised. The corresponding figure is 39% for Germany and 34% for France.

However, on 2021-07-05, the UK delivered 286,000 new vaccinations (almost 4,200 per million), while Germany delivered more than double that number (688,781 jabs or 8,221 per million). France also has a higher number, with 574,717 jabs (or 8,506 per million). Portugal has the highest rate of daily new vaccinations, with 16,560 jabs per million.

<br>

# Does a country's wealth impact its vaccine rollout?

This section is inspired by Ref 2. We will compare a country's GDP per capita with the percentage of total vaccinations administered.

In [None]:
fig = go.Figure()

fig.add_trace(
    go.Scatter(x=vacc_tot['GDP per Capita'],
               y=vacc_tot['total_vaccinations_per_hundred'],
               hovertemplate='<b>%{text}</b>' +
               '<br>GDP per Capita: %{x:.1f}' +
               '<br>Total Vaccinations: %{y:.1f} %<extra>%{text}</extra>',
               text=vacc_tot['country'].values,
               mode='markers',
               marker=dict(size=15,
                           color=vacc_tot['color'],
                           line=dict(color='red', width=1))))

fig.add_shape(type='line',
              x0=vacc_tot['GDP per Capita'].mean(),
              y0=0,
              x1=vacc_tot['GDP per Capita'].mean(),
              y1=156,
              line=dict(color='black', dash='dash'),
              xref='x',
              yref='y')

fig.add_annotation(x=vacc_tot['GDP per Capita'].mean() + 2000,
                   y=120,
                   text='Above Average ↑',
                   textangle=90,
                   showarrow=False,
                   font_size=20)

fig.add_shape(type='line',
              x0=0,
              y0=vacc_tot['total_vaccinations_per_hundred'].mean(),
              x1=110_000,
              y1=vacc_tot['total_vaccinations_per_hundred'].mean(),
              line=dict(color='black', dash='dash'),
              xref='x',
              yref='y')

fig.add_annotation(x=83_000,
                   y=vacc_tot['total_vaccinations_per_hundred'].mean() + 5,
                   text='Above Average ↑',
                   showarrow=False,
                   font_size=20)

fig.update_xaxes(title='GDP per Capita')
fig.update_yaxes(title='Total Vaccinations (%)')

line_color = default_color
fig.update_layout(title = "Does a country's wealth impact its vaccine rollout?",
                  xaxis=dict(linewidth=1.5,
                             linecolor=line_color,
                             mirror=True,
                             tickfont=dict(size=12, color=line_color)),
                  yaxis=dict(linewidth=1.5,
                             linecolor=line_color,
                             mirror=True,
                             tickfont=dict(size=12, color=line_color)),
                  font=dict(size=14, color=line_color),
                  showlegend=False,
                  plot_bgcolor=facecolor,
                  paper_bgcolor=facecolor,
                  height=600,
                  width=850)

fig.show()

It appears that GDP per capita is not a major factor, as most countries are clustered around the mean percentage of total vaccinations. It's interesting that Luxembourg, the richest of the countries featured, has only a slightly above-average percentage. As mentioned earlier, Malta has by far the highest vaccination rate even though its GDP per capita is below the average value.
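
One way to put a number on this visual impression is a correlation coefficient. The sketch below is only an illustration on toy values (all numbers are made up, not results from the dataset). With the real data, one would pass the two columns of `vacc_tot` instead. The Spearman coefficient is computed here as the Pearson correlation of the ranks, which keeps a single very rich country from dominating the result.

```python
import pandas as pd

# Toy values for illustration only; replace with the two vacc_tot columns.
toy = pd.DataFrame({
    'GDP per Capita': [20_000, 30_000, 40_000, 50_000, 110_000],
    'total_vaccinations_per_hundred': [80.0, 95.0, 85.0, 100.0, 90.0],
})

x = toy['GDP per Capita']
y = toy['total_vaccinations_per_hundred']

# Pearson: strength of the linear relationship.
pearson = x.corr(y)

# Spearman: Pearson correlation of the ranks (monotonic relationship),
# less sensitive to an outlier on the GDP axis.
spearman = x.rank().corr(y.rank())

print(f'Pearson:  {pearson:.2f}')
print(f'Spearman: {spearman:.2f}')
```

A coefficient close to zero would support the reading that wealth is not a major factor, although with fewer than 30 countries any such estimate is noisy.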

## Alternative Comparison

Initially, I incorporated GDP through a bubble plot. Bubble plots can be considered a variation of the scatter plot, in which the data points are replaced with bubbles. The advantage of bubble plots is that a third dimension is added: the value of an additional numeric feature is represented through the size of the bubbles. It is even possible to add a fourth dimension by varying the colours of the bubbles.

Here, we will plot the percentage of people vaccinated against the daily number of vaccinations per million people. The size of bubbles will vary according to GDP per capita, while their colour will indicate the vaccine scheme. 

In [None]:
fig = plot_bubble_chart(vacc_tot,
                        features = ['daily_vaccinations_per_million', 'people_vaccinated_per_hundred'],
                        size = 'GDP per Capita',
                        color = 'vaccines',
                        color_discrete_map = palette_1,
                        axes = ['Daily Vaccinations per million', 'People Vaccinated per hundred'],
                        title = 'People Vaccinated (%) vs Daily Vaccinations (per million) <br>'\
                                'grouped by GDP (size) and Vaccine Scheme (color) ({})'.format(final_date_str),
                        height = 600,
                        width = 850)

fig.show()

There are some indications that a higher GDP leads to a higher number of daily vaccinations, which in turn leads to a higher percentage of people vaccinated. However, we should not forget that this figure is just a snapshot of the full vaccination rollout and should not be used to reach conclusions. Therefore, we will create an [animation](#Animation) with one bubble plot per date.

<br>

# Caveat

The most significant caveat with our analysis is probably the fact that not every country began vaccinations at the same time. For instance, the UK approved the Pfizer-BioNTech vaccine on the 2nd of December, and the first jab was administered six days later. Authorisation of the same vaccine in the EU came almost three weeks later (21st of December), with most countries starting vaccination on the 27th of December. 

We can mitigate this caveat by visualising how the vaccination process has evolved since the beginning. However, instead of using dates (Day-Month-Year), we will use days since the first vaccination as the independent variable. In this way, all countries will be placed on the same starting point.  
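
The re-indexing described above can be sketched in a few lines of pandas. This is a minimal illustration on toy data, not necessarily how `plot_vacc_progress()` does it internally; the column name `days_since_first` is an assumption, and the sketch treats each country's earliest row as its first vaccination day.

```python
import pandas as pd

# Toy data for illustration only: country A starts earlier than country B.
toy = pd.DataFrame({
    'country': ['A', 'A', 'A', 'B', 'B'],
    'date': pd.to_datetime(['2020-12-08', '2020-12-09', '2020-12-10',
                            '2020-12-27', '2020-12-28']),
    'total_vaccinations': [10, 25, 40, 5, 12],
})

# Earliest recorded date per country, broadcast back to every row.
first_date = toy.groupby('country')['date'].transform('min')

# Days elapsed since that date: every country now starts at day 0.
toy['days_since_first'] = (toy['date'] - first_date).dt.days

print(toy)
```

Plotting against `days_since_first` instead of `date` aligns the curves of both countries at the origin.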

<br>

# Vaccination Progress

We can gain insights into a country's vaccination efficiency by visualising how the vaccination process has evolved since the beginning.

For this purpose, we will create a time plot with the feature of interest on the y-axis and the number of days since the first vaccination on the x-axis (see [Caveat](#Caveat)). I have chosen to compare the UK with the four EU countries with the highest population (Germany, France, Italy, and Spain). In this way, we will avoid cluttering the plot with 28 lines.

## Total Vaccinations (%)

In [None]:
countries_selection = ['United Kingdom', 'Germany', 'France', 'Italy', 'Spain']

fig = plot_vacc_progress(
    go.Figure(),
    vacc_eur,
    countries_selection,
    feature='total_vaccinations_per_hundred',
    colors=palette_2,
    ylabel='Percentage',
    title=
    'Total Vaccinations (%) in the 5 Most Populated Countries (Log Scale)',
    height=600,
    width=850,
    interpolate=True)

fig.show()

## People Vaccinated (%)

In [None]:
fig = plot_vacc_progress(
    go.Figure(),
    vacc_eur,
    countries_selection,
    feature='people_vaccinated_per_hundred',
    colors=palette_2,
    ylabel='Percentage',
    title='People Vaccinated (%) in the 5 Most Populated Countries (Log Scale)',
    height=600,
    width=850,
    interpolate=True)

fig.show()

## People Fully Vaccinated (%)

In [None]:
fig = plot_vacc_progress(
    go.Figure(),
    vacc_eur,
    countries_selection,
    feature='people_fully_vaccinated_per_hundred',
    colors=palette_2,
    ylabel='Percentage',
    title=
    'People Fully Vaccinated (%) in the 5 Most Populated Countries (Log Scale)',
    height=600,
    width=850,
    interpolate=True)

fig.show()

## Daily Vaccinations (per million)

In [None]:
fig = plot_vacc_progress(
    go.Figure(),
    vacc_eur,
    countries_selection,
    feature='daily_vaccinations_per_million',
    colors=palette_2,
    ylabel='Vaccinations per million',
    title=
    'Daily Vaccinations (per million) in the 5 Most Populated Countries (Log Scale)',
    height=600,
    width=850,
    interpolate=True)

fig.show()

It appears that the UK has been the leading nation since the beginning of vaccinations; however, the four big EU countries have started catching up. Looking at the total number of vaccinations, after 190 days, Germany has administered vaccinations covering almost 93% of its population. The UK achieved the same vaccination percentage on day #165, i.e. more than three weeks faster than the best EU country. Interestingly, Spain is quickly approaching the UK.

The UK is ahead in terms of the percentage of the population having received both the first and second dose of the vaccine. However, there is a significant decline in the number of daily new vaccinations starting from day #165.
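
Comparisons such as "country X reached this percentage on day N" can be read off the curves, or computed. Below is a minimal sketch with a hypothetical helper, `first_day_reaching()`, applied to toy values. It assumes a series indexed by days since the first vaccination and sorted by day.

```python
import pandas as pd


def first_day_reaching(series, threshold):
    """Return the first index label where `series` >= threshold, else None.

    Hypothetical helper: `series` is assumed to be indexed by the number
    of days since the first vaccination, in ascending order.
    """
    reached = series[series >= threshold]
    return None if reached.empty else reached.index[0]


# Toy cumulative percentages for illustration only (index = day number).
toy = pd.Series([0.0, 1.5, 3.0, 4.5, 6.0], index=[0, 1, 2, 3, 4])

print(first_day_reaching(toy, 3.0))   # first day at or above 3 %
print(first_day_reaching(toy, 10.0))  # threshold never reached
```

Applying the helper to two countries and subtracting the results gives the head start of one over the other, in days.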

## Animation

Lastly, we can create an animation of the bubble plot shown earlier.

In [None]:
vacc_eur_new = vacc_eur.copy()
vacc_eur_new['date'] = vacc_eur['date'].dt.strftime('%Y-%m-%d')
vacc_eur_new = pd.merge(vacc_eur_new, eu_info, on='country', how='left')
vacc_eur_new = vacc_eur_new[vacc_eur_new['date'] > '2021-01-15'].sort_values(by=['date'])
vacc_eur_new = vacc_eur_new.dropna(subset=['people_vaccinated_per_hundred', 'daily_vaccinations_per_million'])

fig = create_animation(vacc_eur_new,
                       features = ['daily_vaccinations_per_million', 'people_vaccinated_per_hundred'],
                       size = 'GDP per Capita',
                       color_discrete_sequence = palette_1,
                       range_x = [100, 20000],
                       range_y = [0, 90],
                       axes = ['Daily Vaccinations per Million', 'People Vaccinated per Hundred'],
                       title = 'People Vaccinated (%) vs Daily Vaccinations (per million) <br>'\
                               'grouped by GDP (size) and Vaccine Scheme (color)',
                       height = 600,
                       width = 850)
fig.show()

<br>

# Extra Resources

1. [Data on COVID-19 (coronavirus) vaccinations by *Our World in Data*](https://github.com/owid/covid-19-data/blob/master/public/data/vaccinations/README.md), GitHub repo of the organisation [Our World in Data](https://github.com/owid).
2. [COVID-19 vaccine rollout: How do countries in Europe compare?](https://www.euronews.com/2021/02/24/covid-19-vaccinations-in-europe-which-countries-are-leading-the-way) by Chris Harris, [euronews](https://www.euronews.com/) (Retrieved on July 07, 2021).
3. [COVID-19 Vaccination Progress](https://www.kaggle.com/gpreda/covid-19-vaccination-progress) notebook by [Gabriel Preda](https://www.kaggle.com/gpreda) (Version 41 - Retrieved on July 07, 2021).
4. [Which Country Is Leading the Global Race to Vaccinate](https://www.nytimes.com/interactive/2021/01/25/world/europe/global-vaccination-population-rate.html) by [The New York Times](https://www.nytimes.com/) (Retrieved in March 2021).
5. [COVID-19 Pandemic in Greece: An Overview](https://www.kaggle.com/korfanakis/covid-19-pandemic-in-greece-an-overview#Extra-Resources) by . . . me.

<br>

# Conclusions

This notebook has come to an end. Through a series of simple visualisations, we were able to compare the UK with the EU regarding their progress on vaccinations against COVID-19. You can find more information at the end of each subsection.

<br>

If you liked this notebook, please consider <font size=+0 color="#DF0000"><b>upvoting</b></font>. <font size=+0 color="green"><b>Suggestions</b></font> are always welcome. 🙂

<br>

<br>