# Final Project: Vaccination Coverage for One-Year-Olds Around the World

This project displays and compares vaccination coverage data for one-year-olds around the world. The data come from [*Our World in Data*](https://ourworldindata.org/vaccination). The specific CSV datasets are from https://ourworldindata.org/grapher/the-worlds-number-of-vaccinated-one-year-olds?time=1980..latest and https://ourworldindata.org/vaccination. The two datasets are vaccination coverage by income and the number of one-year-olds who received different vaccinations.


### Hypothesis:
A country's GDP per capita is a strong indicator of whether the country is industrialized and developed: from a high or low GDP per capita, we can infer the relative strength of the country's economy. Using the GDP per capita data together with the vaccination coverage data for one-year-olds, I expect to observe that countries with a high GDP per capita have a higher percentage of vaccinated one-year-olds than countries with a lower GDP per capita.

In [1]:
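# Install plotly before importing it (notebook shell command)
!pip install plotly

import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

# Plotly renderer; None lets plotly pick the default for the environment
renderer = None

def show(fig):
    """Display a figure with the configured renderer."""
    fig.show(renderer=renderer)

# Directory that holds the CSV files
datadir = './data/'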



##### The cell below reads the CSV files and edits the dataframes. The first rows are displayed to show the data we are working with.

In [2]:
# Read both CSV files
# df: number of vaccinated one-year-olds; df2: vaccination coverage by income
df = pd.read_csv(datadir + 'the-worlds-number-of-vaccinated-one-year-olds.csv')
df2 = pd.read_csv(datadir + 'vaccination-coverage-by-income-in.csv')

# Rename 'Entity' to 'Country' in both dataframes and shorten the vaccine column names
d = {'Entity': 'Country',
     'Number of one-year-olds vaccinated with HepB3': 'Hepatitis B3',
     'Number of one-year-olds vaccinated with DTP containing vaccine, 3rd dose': 'Diphteria, Pertussis and Tetanus (DTP)',
     'Number of one-year-olds vaccinated with polio, 3rd dose': 'Polio (3rd Dose)',
     'Number of one-year-olds vaccinated with measles-containing vaccine, 1st dose': 'Measles (1st Dose)',
     'Number of one-year-olds vaccinated with Hib3': 'Hib3',
     'Number of one-year-olds vaccinated with rubella-containing vaccine, 1st dose': 'Rubella (1st Dose)',
     'Number of one-year-olds vaccinated with rotavirus, last dose': 'Rotavirus (Last Dose)',
     'Number of one-year-olds vaccinated with BCG': 'Tuberculosis'}
df = df.rename(columns=d)
df2 = df2.rename(columns=d)

# Remove the columns that are not needed
df.drop(['Population - Sex: all - Age: 0 - Variant: estimates', 'Code'], axis=1, inplace=True)

# Subset of df that holds only the vaccine columns
vaccination_types_columns = df[['Hepatitis B3',
       'Diphteria, Pertussis and Tetanus (DTP)', 'Polio (3rd Dose)',
       'Measles (1st Dose)', 'Hib3', 'Rubella (1st Dose)',
       'Rotavirus (Last Dose)', 'Tuberculosis']]

# Preview the first rows; missing values are shown as 0
df.head(10).fillna(0)



| Unnamed: 0 | Country | Year | Hepatitis B3 | Diphteria, Pertussis and Tetanus (DTP) | Polio (3rd Dose) | Measles (1st Dose) | Hib3 | Rubella (1st Dose) | Rotavirus (Last Dose) | Tuberculosis |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | 1980 | 0.0 | 19716.0 | 0.0 | 54220.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1 | Africa (UN) | 1980 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 2 | Albania | 1980 | 0.0 | 71047.0 | 69535.0 | 68024.0 | 0.0 | 0.0 | 0.0 | 70291.0 |
| 3 | Algeria | 1980 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 4 | American Samoa | 1980 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 5 | Americas (UN) | 1980 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 6 | Andorra | 1980 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 7 | Angola | 1980 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 8 | Anguilla | 1980 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 9 | Antigua and Barbuda | 1980 | 0.0 | 802.0 | 535.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
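
Because the preview is passed through `fillna(0)`, a `0.0` in the table can stand for "no value reported" as well as for a true zero. A minimal sketch of how the share of missing values could be checked before filling them, using a small made-up frame (the helper name `missing_share` and all numbers are illustrative assumptions, not part of the real dataset):

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the vaccination table; the values are made up.
toy = pd.DataFrame({
    'Country': ['A', 'B', 'C'],
    'Year': [1980, 1980, 1980],
    'Hib3': [np.nan, np.nan, 120.0],
    'Tuberculosis': [500.0, np.nan, 0.0],
})

def missing_share(frame, columns):
    """Return the fraction of missing (NaN) entries in each given column."""
    return frame[columns].isna().mean()

share = missing_share(toy, ['Hib3', 'Tuberculosis'])
print(share)
```

Running a check like this on `df` before any `fillna(0)` call would show which vaccine columns are mostly empty in the early years.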


### Graph 1: Bar Graph of the Total Number of Vaccinated One-Year-Olds by Vaccine, 1980 to 2021

This interactive graph shows the worldwide total of one-year-olds who received each vaccine from 1980 to 2021. Over this period, both the number of one-year-olds vaccinated and the number of different vaccines administered increase. Because the dataset also contains regional aggregate rows such as Africa (UN) and Americas (UN), the summed totals may count some children more than once.

In [3]:
# Graph 1 for the-worlds-number-of-vaccinated-one-year-olds.csv
# Names of the vaccine columns, used as the stacked series
vaccination_container = list(vaccination_types_columns.columns)

# Sum each vaccine column per year and stack the bars
fig = px.histogram(df, x='Year', y=vaccination_container, barmode='stack', histfunc='sum')

fig.update_layout(xaxis_title="Years",
                  yaxis_title="Population",
                  legend_title="Vaccination",
                  title="Population Sum of Vaccination Through 1980 to 2021",
                  bargap=0.2, bargroupgap=0.1)
show(fig)
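
The data preview lists regional rows such as `Africa (UN)` and `Americas (UN)` next to individual countries, so a sum over every row can include the same children twice. One possible way to keep only country rows before summing is sketched here. It assumes that aggregates can be recognized by the `(UN)` suffix; `drop_aggregates` and its `extra_names` parameter are hypothetical, and the numbers are made up.

```python
import pandas as pd

def drop_aggregates(frame, suffix='(UN)', extra_names=()):
    """Remove rows whose Country looks like a regional aggregate.

    Assumption: aggregate rows end with `suffix`. `extra_names` is a
    hypothetical hook for any other aggregate labels found in the data.
    """
    is_aggregate = (frame['Country'].str.endswith(suffix)
                    | frame['Country'].isin(extra_names))
    return frame[~is_aggregate]

# Made-up numbers for illustration only.
toy = pd.DataFrame({
    'Country': ['Albania', 'Angola', 'Africa (UN)'],
    'Year': [1980, 1980, 1980],
    'Measles (1st Dose)': [100.0, 200.0, 200.0],
})

countries_only = drop_aggregates(toy)
yearly_total = countries_only.groupby('Year')['Measles (1st Dose)'].sum()
print(yearly_total)
```

Passing the filtered frame to `px.histogram` instead of `df` would then sum countries only.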


### Graph 2: Line Graph Displays Vaccination Coverage for Individual Countries & Continents

To get better insight into how vaccination coverage relates to the year and the population, a line graph was created. It shows, for an individual country or continent, the number of one-year-olds who received each vaccine in each year.

In the Diphtheria, Pertussis and Tetanus (DTP) series for the United States, the number of one-year-olds who received this vaccine has increased since 1980. Many factors could contribute to this, one being that the DTP vaccine is recommended for one-year-olds.

In [4]:

def plot_vaccination_types(frame, country):
    """Line chart of the number of vaccinated one-year-olds per vaccine for one country."""
    vaccines = ['Hepatitis B3', 'Diphteria, Pertussis and Tetanus (DTP)',
                'Polio (3rd Dose)', 'Measles (1st Dose)', 'Hib3',
                'Rubella (1st Dose)', 'Rotavirus (Last Dose)', 'Tuberculosis']
    country_data = frame[frame['Country'] == country]
    filter_country_data = country_data[['Year', 'Country'] + vaccines]

    # One line per vaccine column
    fig = px.line(filter_country_data, x='Year', y=vaccines, markers=True)
    fig.update_layout(title={"text": f'Vaccination Coverage of {country} through 1980 to Current',
                             "x": 0.5,
                             "xanchor": "center"},
                      width=1200, height=500,
                      xaxis_title="Years",
                      yaxis_title="Population",
                      legend_title="Vaccinations")
    return fig


fig = plot_vaccination_types(df, 'United States')
show(fig)
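
The increase read off the line graph can also be expressed as a number. The following is a sketch under assumptions: `reported_change` is a hypothetical helper, the country and values are invented, and missing values are assumed to still be `NaN` rather than filled with 0.

```python
import numpy as np
import pandas as pd

def reported_change(frame, country, vaccine):
    """Return (first_year, last_year, relative_change) for one country and vaccine,
    using only the years in which a value was reported."""
    data = frame[(frame['Country'] == country) & frame[vaccine].notna()]
    data = data.sort_values('Year')
    first, last = data.iloc[0], data.iloc[-1]
    # Relative change between the first and last reported values
    change = (last[vaccine] - first[vaccine]) / first[vaccine]
    return int(first['Year']), int(last['Year']), change

# Invented values for a made-up country.
toy = pd.DataFrame({
    'Country': ['Exampleland'] * 4,
    'Year': [1980, 1981, 1982, 1983],
    'Polio (3rd Dose)': [np.nan, 1000.0, 1100.0, 1500.0],
})

print(reported_change(toy, 'Exampleland', 'Polio (3rd Dose)'))
```

A relative change of 0.5 means the last reported count is 50% higher than the first reported one.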

### Graph 3: Subplots Displays Vaccination Coverage for Individual Countries & Continents

Subplots were created to separate the vaccination data and display each vaccine on its own chart. The data still show vaccination coverage for the United States, with years on the x-axis and population on the y-axis. The DTP vaccine has data from as early as 1980, whereas Polio (3rd Dose) has data only for later years because it was administered later. Both line graphs still show an increase.

In [5]:

def country_vaccination_types(frame, country):
    """Return a 4x2 grid of line charts, one per vaccine, for a single country."""
    vaccines = ['Hepatitis B3', 'Diphteria, Pertussis and Tetanus (DTP)',
                'Polio (3rd Dose)', 'Measles (1st Dose)', 'Hib3',
                'Rubella (1st Dose)', 'Rotavirus (Last Dose)', 'Tuberculosis']
    data = frame[frame['Country'] == country]
    filter_data = data[['Year', 'Country'] + vaccines]

    # Initialize the figure with one titled subplot per vaccine
    fig = make_subplots(rows=4, cols=2, subplot_titles=vaccines)

    # Add one trace per vaccine, filling the grid left to right, top to bottom
    for i, vaccine in enumerate(vaccines):
        row, col = i // 2 + 1, i % 2 + 1
        fig.add_trace(go.Scatter(x=filter_data['Year'],
                                 y=filter_data[vaccine],
                                 name=vaccine),
                      row=row, col=col)
        # Label the x-axis of every subplot (row 3, col 2 was previously missed)
        fig.update_xaxes(title_text="Years", row=row, col=col)

    # Label the y-axis of the left-hand subplots only
    for row in range(1, 5):
        fig.update_yaxes(title_text="Population", row=row, col=1)

    fig.update_layout(height=1000, width=1000,
                      title={"text": f'Vaccination Coverage of {country}',
                             "x": 0.5,
                             "xanchor": "center"},
                      plot_bgcolor='#D3D3D3')
    return fig


fig = country_vaccination_types(df, 'United States')
show(fig)
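
The observation that some vaccine series start later than others can be checked directly. A minimal sketch, assuming missing values are still `NaN` (not filled with 0); `first_reported_year` is a hypothetical helper and the toy values are invented:

```python
import numpy as np
import pandas as pd

def first_reported_year(frame, country, vaccines):
    """Map each vaccine to the earliest year with a reported (non-NaN) value.

    Raises ValueError for a vaccine that has no reported value at all.
    """
    data = frame[frame['Country'] == country]
    return {v: int(data.loc[data[v].notna(), 'Year'].min()) for v in vaccines}

# Invented values for a made-up country.
toy = pd.DataFrame({
    'Country': ['Exampleland'] * 3,
    'Year': [1980, 1981, 1982],
    'Diphteria, Pertussis and Tetanus (DTP)': [10.0, 12.0, 13.0],
    'Polio (3rd Dose)': [np.nan, np.nan, 9.0],
})

starts = first_reported_year(
    toy, 'Exampleland',
    ['Diphteria, Pertussis and Tetanus (DTP)', 'Polio (3rd Dose)'])
print(starts)
```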

##### The cell below edits the second dataframe. The first rows are displayed to show the data we are working with.

In [6]:
# Vaccination coverage by income, focused on DTP3
# Replace missing values with 0, so a 0 in the GDP or region column means "no data"
df2 = df2.fillna(0)
# df2.drop(['World regions according to OWID'],axis=1, inplace=True)
df2.head(10)
# print(df2['Country'].unique())


| Unnamed: 0 | Country | Code | Year | DTP3 (% of one-year-olds immunized) | GDP per capita (output, multiple price benchmarks) | Population (historical) | World regions according to OWID |
|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | AFG | 1980 | 4.0 | 0.0 | 13169312.0 | 0.0 |
| 1 | Afghanistan | AFG | 1981 | 3.0 | 0.0 | 11937587.0 | 0.0 |
| 2 | Afghanistan | AFG | 1982 | 5.0 | 0.0 | 10991382.0 | 0.0 |
| 3 | Afghanistan | AFG | 1983 | 5.0 | 0.0 | 10917986.0 | 0.0 |
| 4 | Afghanistan | AFG | 1984 | 16.0 | 0.0 | 11190220.0 | 0.0 |
| 5 | Afghanistan | AFG | 1985 | 15.0 | 0.0 | 11426855.0 | 0.0 |
| 6 | Afghanistan | AFG | 1986 | 11.0 | 0.0 | 11420074.0 | 0.0 |
| 7 | Afghanistan | AFG | 1987 | 25.0 | 0.0 | 11387824.0 | 0.0 |
| 8 | Afghanistan | AFG | 1988 | 35.0 | 0.0 | 11523299.0 | 0.0 |
| 9 | Afghanistan | AFG | 1989 | 33.0 | 0.0 | 11874088.0 | 0.0 |


### Graph 4: Scatter Geo Displays Vaccination Coverage by Income

This interactive graph gives a globe view of DTP3 vaccination coverage for a given year. In the scatter geo plot, hover text provides each country's GDP per capita, population, name, code, and the percentage of one-year-olds immunized with the DTP3 vaccine. A color scale indicates the population of each country or continent.


In [7]:
# Map of Vaccination Coverage by Income focus on DTP3

def vaccination_coverage_by_income(frame, year):
    """Scatter geo map for one year; hover text shows DTP3 coverage, GDP per capita and population."""
    year_data = frame[frame['Year'] == year]
    fig = px.scatter_geo(year_data, locations='Country', locationmode='country names',
                         hover_name="Country", size="Population (historical)",
                         color="Population (historical)",
                         hover_data=['Country', 'Code', 'Year',
                                     'DTP3 (% of one-year-olds immunized)',
                                     'GDP per capita (output, multiple price benchmarks)',
                                     'Population (historical)'],
                         projection="natural earth", opacity=.8,
                         color_continuous_scale='Turbo', range_color=(1000000, 8000000))

    fig.update_layout(title={"text": f"Vaccination Coverage by Income on DTP3 in {year}",
                             "x": 0.5,
                             "xanchor": "center"},
                      coloraxis=dict(colorbar=dict(title="Population")))
    # Set every marker's size value to 10, replacing the population-based values from `size=`
    fig.update_traces(marker=dict(size=10))
    return fig


fig = vaccination_coverage_by_income(df2, 2012)
show(fig)
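
The map colors markers by population, so the link between income and coverage has to be read from the hover text. A complementary sketch groups countries by GDP per capita and averages DTP3 coverage per group. It assumes missing GDP values are still `NaN` (that is, it runs before `fillna(0)`); `coverage_by_income_group` is a hypothetical helper and the toy numbers are invented.

```python
import pandas as pd

GDP = 'GDP per capita (output, multiple price benchmarks)'
DTP3 = 'DTP3 (% of one-year-olds immunized)'

def coverage_by_income_group(frame, year, groups=3):
    """Split the rows of one year into equal-sized GDP-per-capita groups
    and return the mean DTP3 coverage of each group (0 = lowest GDP)."""
    data = frame[frame['Year'] == year].dropna(subset=[GDP, DTP3])
    labels = pd.qcut(data[GDP], q=groups, labels=False)
    return data.groupby(labels)[DTP3].mean()

# Invented values for six made-up countries.
toy = pd.DataFrame({
    'Country': list('ABCDEF'),
    'Year': [2012] * 6,
    GDP: [1000.0, 2000.0, 8000.0, 9000.0, 40000.0, 50000.0],
    DTP3: [60.0, 70.0, 80.0, 90.0, 95.0, 97.0],
})

result = coverage_by_income_group(toy, 2012)
print(result)
```

Quantile groups are used here instead of fixed income thresholds so that no cut-off values have to be assumed.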

### Graph 5: Line Graph Displays DTP3 Rate for Individual Countries & Continents

This line graph displays the percentage of one-year-olds immunized with the DTP3 vaccine in each year. In the United States the rate tends to stay around 90%, but it declined to 83% in 1993. In 1993, the Childhood Immunization Initiative was implemented, which focused on eliminating indigenous cases of six vaccine-preventable diseases (including diseases covered by DTP3) by 1996. This initiative and other programs helped raise the rate to 94% in 1994.


In [8]:
# DTP3 immunization rate over time
def plot_dtp3_rate(frame, country):
    """Line chart of the DTP3 immunization rate for one country."""
    data = frame[frame['Country'] == country]
    # print("Country Data",data)
    filter_data = data[['Year', 'DTP3 (% of one-year-olds immunized)',
                        'GDP per capita (output, multiple price benchmarks)']]
    graph = px.line(filter_data, x='Year', y='DTP3 (% of one-year-olds immunized)')
    graph.update_layout(title={"text": f'DTP3 Rate of {country} Through 1980 to 2023',
                               "x": 0.5,
                               "xanchor": "center"},
                        width=1200, height=500)
    return graph


fig = plot_dtp3_rate(df2, 'United States')
show(fig)
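
A dip like the one described for 1993 can also be located programmatically rather than by eye. A minimal sketch with invented values for a made-up country; `largest_drop` is a hypothetical helper:

```python
import pandas as pd

DTP3 = 'DTP3 (% of one-year-olds immunized)'

def largest_drop(frame, country):
    """Return (year, change) for the steepest year-on-year fall in DTP3 coverage."""
    data = frame[frame['Country'] == country].sort_values('Year')
    # Percentage-point change from the previous year
    change = data.set_index('Year')[DTP3].diff()
    year = change.idxmin()
    return int(year), float(change.loc[year])

# Invented values for illustration only.
toy = pd.DataFrame({
    'Country': ['Exampleland'] * 4,
    'Year': [2000, 2001, 2002, 2003],
    DTP3: [90.0, 91.0, 84.0, 93.0],
})

print(largest_drop(toy, 'Exampleland'))
```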


### Graph 6: Bar Graph Displays GDP per Capita for Individual Countries & Continents

This bar graph displays the GDP per capita of the country or continent passed to the function. For the United States, GDP per capita increases over the years, but the bar graph could look different for a country with a lower GDP.

In [9]:
def plot_gdp_per_capita(frame, country):
    """Bar chart of GDP per capita by year for one country."""
    data = frame[frame['Country'] == country]
    # print("Country Data",data)
    filter_data = data[['Year', 'GDP per capita (output, multiple price benchmarks)']]
    graph = px.bar(filter_data, x='Year', y='GDP per capita (output, multiple price benchmarks)')
    graph.update_layout(title={"text": f'GDP per Capita of {country} Through 1980 to 2023',
                               "x": 0.5,
                               "xanchor": "center"},
                        width=1200, height=500)
    return graph


fig = plot_gdp_per_capita(df2, 'United States')
show(fig)

## Conclusion
To conclude, a country's GDP per capita matters and can decrease or increase the mortality of one-year-olds. If a country's economy is poor, one-year-olds are more likely to contract a vaccine-preventable disease than in a country with a stronger economy. Both datasets show an increase over the years in the vaccinations administered to one-year-olds and in the population of one-year-olds. These vaccines also help decrease the mortality rate of one-year-olds from vaccine-preventable diseases.

In the GDP bar graph, the size of a country's economy matters, especially for producing and administering vaccines. The United States's GDP per capita has increased over the years, which corresponds to a consistent rate of about 90% in the DTP3 rate line graph. Besides the DTP3 line chart, the Vaccination Coverage of United States through 1980 to Current line graph also shows an increase in the number of one-year-olds who received those vaccines.
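
The graphs above look at a single country. To compare many countries at once, the relationship between GDP per capita and DTP3 coverage could be summarized with a rank correlation for one year. This is only a sketch: `gdp_coverage_correlation` is a hypothetical helper, the toy values are invented, and missing values are assumed to still be `NaN` rather than 0.

```python
import pandas as pd

GDP = 'GDP per capita (output, multiple price benchmarks)'
DTP3 = 'DTP3 (% of one-year-olds immunized)'

def gdp_coverage_correlation(frame, year):
    """Spearman rank correlation between GDP per capita and DTP3 coverage
    across the rows of one year."""
    data = frame[frame['Year'] == year].dropna(subset=[GDP, DTP3])
    # Spearman correlation equals the Pearson correlation of the ranks
    return data[GDP].rank().corr(data[DTP3].rank())

# Invented values for five made-up countries.
toy = pd.DataFrame({
    'Country': list('ABCDE'),
    'Year': [2012] * 5,
    GDP: [1000.0, 3000.0, 9000.0, 20000.0, 50000.0],
    DTP3: [55.0, 70.0, 65.0, 90.0, 95.0],
})

rho = gdp_coverage_correlation(toy, 2012)
print(round(rho, 3))
```

A value near 1 would be consistent with the hypothesis that richer countries have higher coverage, while a value near 0 would not support it. Ranks are used because coverage is capped at 100%, so the relationship is unlikely to be linear.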

