# Introduction

### Immunization is essential to reducing under-five mortality. Immunization coverage estimates are used to monitor coverage of immunization services and to guide disease eradication and elimination efforts. Coverage is also a good indicator of health system performance.

This kernel is a STARTER for understanding the percentage of 1-year-olds who have received one dose of bacille Calmette-Guérin (BCG) vaccine in a given year using plotly.

Bacillus Calmette–Guérin (BCG) vaccine is a vaccine primarily used against tuberculosis (TB). In countries where tuberculosis or leprosy is common, one dose is recommended in healthy babies as close to the time of birth as possible. In areas where tuberculosis is not common, only children at high risk are typically immunized, while suspected cases of tuberculosis are individually tested for and treated. Adults who do not have tuberculosis and have not been previously immunized but are frequently exposed may be immunized as well. BCG also has some effectiveness against Buruli ulcer infection and other nontuberculous mycobacteria infections. Additionally it is sometimes used as part of the treatment of bladder cancer.

The BCG vaccine was first used medically in 1921. It is on the World Health Organization's List of Essential Medicines, the safest and most effective medicines needed in a health system ([source: Wikipedia](https://en.wikipedia.org/wiki/BCG_vaccine)). Whether it offers any protection against COVID-19 is not yet known. See also the hackathon [BCG - COVID-19 AI Challenge](https://www.kaggle.com/bcgvaccine/hackathon).

For a better understanding of how vaccines work, here is a short video:

In [None]:
from IPython.display import HTML
HTML('<center><iframe width="1077" height="721" src="https://www.youtube.com/embed/Atrx1P2EkiQ" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center>')

# 1. Import necessary modules and check files

* `import pandas as pd` : data processing, working with CSV file I/O (e.g. pd.read_csv)
* `import plotly.graph_objects as go` : interactive, publication-quality graphs, object `graph_objects.Figure`
* `import plotly.express as px` : wrapper for Plotly.py that exposes a simple syntax for complex charts
* `import os` : access files

In [None]:
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import plotly.graph_objects as go
import plotly.express as px

import os
for dirname, _, filenames in os.walk('/kaggle/input/who-immunization-coverage'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

There are 10 files in the dataset [Immunization coverage estimates by country](https://www.kaggle.com/lsind18/who-immunization-coverage) published by the World Health Organization.   
This notebook is focused on `BCG.csv`: the percentage of 1-year-olds who have received one dose of bacille Calmette-Guérin (BCG) vaccine in a given year. Reports of vaccinations performed by service providers (e.g. district health centres, vaccination teams, physicians) are used for estimates based on service/facility records. The estimate of immunization coverage is derived by dividing the total number of vaccinations given by the number of children in the target population, often based on census projections.
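
The coverage calculation described above can be sketched in a few lines. The function name `coverage_percent` and the numbers are hypothetical, chosen only for illustration; they are not taken from the WHO dataset.

```python
def coverage_percent(doses_given: int, target_population: int) -> float:
    """Vaccinations given divided by the target population, in percent."""
    if target_population <= 0:
        raise ValueError("target population must be positive")
    return 100 * doses_given / target_population


# Hypothetical district: 9,200 BCG doses recorded, 10,000 children in the target population.
print(coverage_percent(9_200, 10_000))  # 92.0
```

Because the denominator is often a projection, an estimate computed this way can exceed 100% when the target population is underestimated.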

### 1.1 Check data

In [None]:
df = pd.read_csv('/kaggle/input/who-immunization-coverage/BCG.csv', skipinitialspace=True)
df

* Rename some columns (not necessary)
* Unpivot data using `melt`
* Sort data by country name (ascending) and Year (descending)
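
The three steps listed above can be tried on a toy table first. This is a minimal sketch: the frame `wide`, its values, and the name `tidy` are hypothetical and only mimic the assumed shape of `BCG.csv` (one row per country, one column per year).

```python
import pandas as pd

# Hypothetical wide table: one row per country, one column per year.
wide = pd.DataFrame({
    "Country": ["Viet Nam", "Czechia"],
    "2018": [95, None],
    "2017": [94, 97],
})

# Optional renaming; the result is assigned back to the column.
wide["Country"] = wide["Country"].replace(
    {"Viet Nam": "Vietnam", "Czechia": "Czech Republic"}
)

# Unpivot: every year column becomes a (Year, Percent) row.
tidy = wide.melt(id_vars=["Country"], var_name="Year", value_name="Percent")

# Country ascending, Year descending (Year is still a string here).
tidy = tidy.sort_values(by=["Country", "Year"], ascending=[True, False])
tidy = tidy.reset_index(drop=True)
print(tidy)
```

Sorting the `Year` strings gives the right order here only because every year has four digits.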

In [None]:
# not necessary
country_names = {'Russian Federation':'Russia', "Democratic People's Republic of Korea":"Korea, Democratic People's Republic of", 
                 'Republic of North Macedonia':'Macedonia, the former Yugoslav Republic of', 'Bolivia (Plurinational State of)': 'Bolivia',
                'Cabo Verde':'Cape Verde', 'Congo' : 'Congo (Brazzaville)', 'Czechia' : 'Czech Republic', 'Democratic Republic of the Congo' : 'Congo (Kinshasa)',
                'Eswatini': 'Swaziland', 'Iran (Islamic Republic of)': 'Iran', 'Libya' : 'Libyan Arab Jamahiriya', 'Micronesia (Federated States of)': 'Micronesia, Federated States of',
                'United Republic of Tanzania' : 'Tanzania', 'Venezuela (Bolivarian Republic of)' : 'Venezuela', 'Viet Nam':'Vietnam'}
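# assign the result back: calling replace(..., inplace=True) on a selected column may not update df in newer pandas versions
df['Country'] = df['Country'].replace(country_names)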

In [None]:
df=df.melt(id_vars=['Country'], var_name='Year', value_name='Percent')
df = df.sort_values(by=['Country', 'Year'], ascending=[True, False])
df

* Check for null values: a null means that no percentage of children who received one dose of BCG vaccine was reported for that country and year.
* Drop these rows
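
A minimal sketch of the null check and the drop, on hypothetical data (the countries, years, and percentages below are made up for illustration). It uses `dropna(subset=...)`, which is a slightly narrower choice than the plain `dropna()` used in this notebook.

```python
import pandas as pd

# Hypothetical long-format data with one missing coverage value.
sample = pd.DataFrame({
    "Country": ["Chad", "Chad", "Peru"],
    "Year": ["2018", "1980", "2018"],
    "Percent": [59.0, None, 81.0],
})

# Count missing values per column.
missing = sample.isnull().sum()
print(missing)

# Drop only the rows where Percent is missing.
clean = sample.dropna(subset=["Percent"])
print(len(sample), "->", len(clean))  # 3 -> 2
```

When `Percent` is the only column that contains nulls, both variants remove the same rows.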

In [None]:
df.isnull().sum(axis = 0)

In [None]:
df = df.dropna()
df

# 2. Create maps using plotly

### 2.1 Interactive map over years

Using `plotly.express`, draw an interactive map of BCG coverage by year: **1980 - 2018**.

In [None]:
px.choropleth(df, locations=df['Country'], locationmode='country names', color = df['Percent'], hover_name=df['Country'], animation_frame=df['Year'],
              color_continuous_scale=px.colors.sequential.RdBu, projection='natural earth')

### Unfortunately, the data are incomplete: for example, there is no BCG information for the Soviet Union, and data for its former republics start only in 1991.

### 2.2 Map over last presented year
* Group the dataframe by Country and take the first row of each group. Because the data are sorted by Year in descending order, that row is the last presented year.
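
`groupby(...).head(1)` returns the first row of each group in the current row order, so the result depends on the earlier sort. A minimal sketch on hypothetical, deliberately unsorted data (all values are made up):

```python
import pandas as pd

# Hypothetical long-format data, left unsorted on purpose.
sample = pd.DataFrame({
    "Country": ["Peru", "Chad", "Peru", "Chad"],
    "Year": ["2016", "2018", "2018", "2017"],
    "Percent": [90.0, 59.0, 81.0, 56.0],
})

# Sort by Year descending within each country so the first row is the latest year.
ordered = sample.sort_values(by=["Country", "Year"], ascending=[True, False])

# Take the first row per country.
latest = ordered.groupby("Country").head(1)
print(latest)
```

If the dataframe were re-sorted or shuffled before this step, `head(1)` would no longer pick the latest year.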

In [None]:
lastyear = df.groupby('Country').head(1)
lastyear

* Create a Figure showing the latest reported BCG percentage worldwide.

In [None]:
fig = go.Figure(data=go.Choropleth(
    locations = lastyear['Country'],
    z = lastyear['Percent'],
    text = lastyear['Year'] + ', ' + lastyear['Country'],
    colorscale = 'RdBu',
    marker_line_color='darkgray',
    marker_line_width=0.5,
    colorbar_title = 'BCG, %',
    locationmode='country names',
))

fig.update_layout(
    title_text='Latest BCG reported worldwide'
)

* update Figure about percentage of children with latest BCG reported in **Europe, Africa and South America**:

In [None]:
fig.update_geos(projection_type="natural earth", scope="europe")
fig.update_layout(
    title_text='Latest BCG reported Europe')

In [None]:
fig.update_geos(projection_type="natural earth", scope="africa")
fig.update_layout(title_text='Latest BCG reported Africa')

In [None]:
fig.update_geos(projection_type="natural earth", scope="south america")
fig.update_layout(title_text='Latest BCG reported South America')

### BCG vaccination is routinely used in children worldwide, especially in countries with endemic TB. 

Upvote this notebook if you find it useful.
You can use it in a similar way to explore the data for the other vaccines:
* `DTP3.csv`: the percentage of 1-year-olds who have received three doses of the combined **diphtheria, tetanus toxoid and pertussis vaccine** in a given year.
* `HepB3.csv`: the percentage of 1-year-olds who have received three doses of **hepatitis B vaccine** in a given year. 
* `Hib3.csv`: the percentage of 1-year-olds who have received three doses of **Haemophilus influenzae type B vaccine** in a given year.
* `MCV1.csv`: the percentage of children under 1 year of age who have received at least one dose of **measles-containing vaccine** in a given year. For countries recommending the first dose of measles vaccine in children over 12 months of age, the indicator is calculated as the proportion of children less than 12-23 months of age receiving one dose of measles-containing vaccine. 
* `MCV2.csv`: the percentage of children who have received two doses of **measles-containing vaccine** (MCV2) in a given year, according to the nationally recommended schedule. 
* `PAB.csv`: the proportion of neonates in a given year that can be considered as having been protected against **tetanus** as a result of maternal immunization. 
* `PCV3.csv`: the percentage of 1-year-olds who have received three doses of **pneumococcal conjugate vaccine** (PCV3) in a given year. 
* `Pol3.csv`: the percentage of 1-year-olds who have received three doses of **polio vaccine** in a given year. 
* `ROTAC.csv`: the percentage of surviving infants who received the final recommended dose of **rotavirus vaccine**, which can be either the 2nd or the 3rd dose depending on the vaccine in a given year. 