*Last updated on July 31st, 2021.*

**Description:**

This workbook visualizes the progress of COVID-19 vaccination in Japan.

**Data resources:**

(1) The vaccination data are compiled from the Japanese-language page of the [Prime Minister's Office of Japan](https://www.kantei.go.jp/jp/headline/kansensho/vaccine.html) and provided in the dataset *Japan Covid-19 vaccination status*.


   Since June, vaccination has been extended from healthcare staff and the elderly to the general public.
   
   The original dataset includes 4 tables:
*    Number of vaccine doses administered to healthcare staff, by date
*    Number of vaccine doses administered to the general public (the elderly included), by date
*    Number of vaccine doses administered to healthcare staff, by prefecture
*    Number of vaccine doses administered to the general public (the elderly included), by prefecture

Accordingly, the code below visualizes the vaccination progress by date and by prefecture.

(2) The geographic visualization is done with Mapbox. The shape of each prefecture is defined by the dataset *Geographic data of Japan*.
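
A choropleth map only colors a prefecture when the name in the data matches an identifier in the GeoJSON file. Plotly Express matches `locations` against each feature's `id` by default, and its `featureidkey` argument can point to a property instead (for example `properties.<key>`). The sketch below checks for unmatched names on a toy GeoJSON; the property key `name` and the helper `unmatched_locations` are hypothetical and not taken from the actual dataset.

```python
# Toy GeoJSON with a hypothetical property key "name"; the real file may use another key.
toy_geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "Tokyo"}, "geometry": None},
        {"type": "Feature", "properties": {"name": "Osaka"}, "geometry": None},
    ],
}


def unmatched_locations(geojson, locations, property_key):
    """Return the locations that have no feature with a matching property value."""
    feature_names = {
        feature["properties"].get(property_key) for feature in geojson["features"]
    }
    return sorted(set(locations) - feature_names)


# "Kyoto" has no feature in the toy GeoJSON, so it would stay uncolored on the map.
missing = unmatched_locations(toy_geojson, ["Tokyo", "Osaka", "Kyoto"], "name")
print(missing)
```

An empty result suggests that every prefecture in the data has a shape to draw on.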

(3) The population by prefecture is an [estimate](https://uub.jp/rnk/p_j.html) based on past census data, included in the dataset *Japan population*.
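
The vaccinated ratio per prefecture is the number of first doses divided by the estimated population. A minimal sketch on made-up numbers is shown here; the values are illustrative only, and the `validate` and `indicator` checks are an optional safeguard that this workbook itself does not use.

```python
import pandas as pd

# Made-up numbers for illustration; they are not real vaccination or population data.
doses = pd.DataFrame({"Prefecture": ["Tokyo", "Osaka"], "First dose": [700, 300]})
population = pd.DataFrame({"Prefecture": ["Osaka", "Tokyo"], "Population": [1000, 2000]})

# Join on the prefecture name so the row order of the two tables does not matter.
# validate raises an error on duplicated prefectures; indicator marks unmatched rows.
merged = pd.merge(
    doses,
    population,
    on="Prefecture",
    how="left",
    validate="one_to_one",
    indicator=True,
)

# Prefectures without a population entry would show up here.
unmatched = merged.loc[merged["_merge"] != "both", "Prefecture"].tolist()

merged["Vaccinated ratio"] = merged["First dose"] / merged["Population"]
print(merged[["Prefecture", "Vaccinated ratio"]])
```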


**Note**

Please leave a comment if you have any questions about the code or any advice on improving it.

In [None]:
import pandas as pd
import json
import plotly.express as px

# import data as dataframes
df_healthcare_by_prefecture = pd.read_csv('../input/japan-covid19-vaccination-status/Vaccination of the healthcare by prefecture.csv') 
df_general_by_prefecture = pd.read_csv('../input/japan-covid19-vaccination-status/Vaccination of the general public by prefecture.csv') 

# Combine the two data sets to get the total vaccination doses
# (assumes both tables list the prefectures in the same row order)
df_total = pd.DataFrame()
df_total['Prefecture'] = df_healthcare_by_prefecture['Prefecture']
df_total['Total doses'] = df_healthcare_by_prefecture['Total doses'] + df_general_by_prefecture['Total doses']
df_total['First dose total'] = df_healthcare_by_prefecture['First dose'] + df_general_by_prefecture['First dose']

# Import the geographic data of the prefectures of Japan, which defines the shape of each prefecture
with open('../input/geographic-data-of-japan/japan_prefectures.geojson') as f:
    japan = json.load(f)

# Import population data and calculate ratio of the vaccinated
df_population = pd.read_csv('../input/japan-population/Japan estimated population by prefectures 2021.csv')
df_ratio = pd.merge(df_total, df_population,on='Prefecture')
df_ratio['Vaccinated ratio']=df_ratio['First dose total']/df_ratio['Population']

# Visualize the total doses administered in each prefecture
fig = px.choropleth_mapbox(df_total, geojson=japan, locations='Prefecture', color='Total doses',
                           color_continuous_scale='Blues',
                           mapbox_style="carto-positron",
                           zoom=3.79,
                           center = {"lat": 36.40, "lon": 139.65}
                          )
fig.update_layout(title_text='Total administered doses by prefecture', title_x=0.5)
fig.update_layout(margin={"r":0,"t":43,"l":0,"b":0})
fig.show()

In [None]:
# Visualize the ratio of the vaccinated population in each prefecture

fig = px.choropleth_mapbox(df_ratio, geojson=japan, locations='Prefecture', color='Vaccinated ratio',
                           color_continuous_scale="RdBu",range_color=(0, 1),
                           mapbox_style="carto-positron",
                           zoom=3.79,
                           center = {"lat": 36.40, "lon": 139.65}
                          )
fig.update_layout(margin={"r":0,"t":43,"l":0,"b":0})
fig.update_layout(title_text='Vaccination rates by prefecture', title_x=0.5)
fig.show()

In [None]:
import pandas as pd

df_healthcare_by_date = pd.read_csv('../input/japan-covid19-vaccination-status/Vaccination of the healthcare by date.csv') 
df_healthcare_by_date['Date'] = pd.to_datetime(df_healthcare_by_date['Date'])
df_healthcare_by_date['Group'] = 'healthcare'

df_general_by_date = pd.read_csv('../input/japan-covid19-vaccination-status/Vaccination of the general public by date.csv') 
df_general_by_date['Date'] = pd.to_datetime(df_general_by_date['Date'])
df_general_by_date['Group'] = 'general'

# create dataframe copies for plotting cumulative data later
df_general_by_date_copy = df_general_by_date.copy()
df_general_by_date_copy.columns = ['Date', 'Total doses(general)', 'First dose(general)', 'Second dose(general)', 'First dose(general with Pfizer)','First dose(general with Moderna)','Second dose(general with Pfizer)','Second dose(general with Moderna)','Group']
df_healthcare_by_date_copy = df_healthcare_by_date.copy()
df_healthcare_by_date_copy.columns = ['Date', 'Total doses(healthcare)', 'First dose(healthcare)', 'Second dose(healthcare)', 'First dose(healthcare with Pfizer)','First dose(healthcare with Moderna)','Second dose(healthcare with Pfizer)','Second dose(healthcare with Moderna)','Group']

# plot the doses administered per day, grouped by healthcare staff and general public
df_by_date = pd.concat([df_healthcare_by_date,df_general_by_date])
df_by_date = df_by_date.sort_values(by='Date',ascending=True)

fig = px.bar(df_by_date, x='Date', y='Total doses', color='Group', barmode='group')
fig.update_layout(title_text='Administered doses by date', title_x=0.5)
fig.update_layout(
    yaxis_title="Doses administered"  
)
fig.show()

A single dose of the vaccine has been found to provide decent effectiveness. Therefore, the data shown below count the population that has received at least one dose.
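
The cumulative count can be sketched on toy data as follows. The dates and numbers are made up, and the outer join is an assumption of this sketch: it keeps dates that appear in only one of the two tables, so that no first doses are lost from the running total.

```python
import pandas as pd

# Made-up daily first-dose counts; the two groups cover different date ranges.
healthcare = pd.DataFrame({
    "Date": pd.to_datetime(["2021-02-17", "2021-04-12"]),
    "First dose(healthcare)": [100, 50],
})
general = pd.DataFrame({
    "Date": pd.to_datetime(["2021-04-12", "2021-04-13"]),
    "First dose(general)": [10, 20],
})

# Outer join keeps every date; a group with no record on a date contributes 0 doses.
combined = (
    pd.merge(healthcare, general, on="Date", how="outer")
    .sort_values("Date")
    .fillna(0)
)

combined["first_dose_sum_by_date"] = (
    combined["First dose(healthcare)"] + combined["First dose(general)"]
)
# Running total of people who have received at least one dose.
combined["first_dose_cum_sum"] = combined["first_dose_sum_by_date"].cumsum()
print(combined[["Date", "first_dose_sum_by_date", "first_dose_cum_sum"]])
```

With an inner join, the first date in this toy example would be dropped and the running total would start 100 doses too low.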

In [None]:
# combine the healthcare workers and general public data, calculate cumulative sum of the vaccinated (at least one dose)

# outer join keeps dates that appear in only one table; missing counts are treated as 0
df_by_date_copy = pd.merge(df_healthcare_by_date_copy, df_general_by_date_copy, on='Date', how='outer')
df_by_date_copy = df_by_date_copy.sort_values(by='Date', ascending=True)
first_dose_cols = ['First dose(healthcare)', 'First dose(general)']
df_by_date_copy[first_dose_cols] = df_by_date_copy[first_dose_cols].fillna(0)
df_by_date_copy['first_dose_sum_by_date'] = df_by_date_copy['First dose(healthcare)'] + df_by_date_copy['First dose(general)']
df_by_date_copy['first_dose_cum_sum'] = df_by_date_copy['first_dose_sum_by_date'].cumsum()
fig = px.area(df_by_date_copy, x='Date', y='first_dose_cum_sum')
fig.update_layout(title_text='Cumulative population that received at least one dose', title_x=0.5)
fig.update_layout(
    yaxis_title='Cumulative population'  
)
fig.show()

In [None]:
# plot ratio of the population that gets at least one dose, as a progress by date

population = 127128905  # total population, see https://cio.go.jp/c19vaccine_dashboard
df_by_date_copy['first_dose_cum_sum versus population'] = df_by_date_copy['first_dose_cum_sum']/ population
fig = px.area(df_by_date_copy, x='Date', y='first_dose_cum_sum versus population',color_discrete_sequence=['green'])
fig.update_layout(title_text='Vaccination rate (share of the population that received at least one dose)', title_x=0.5)
fig.update_layout(
    yaxis_title="Vaccination rate",
    yaxis_range=[0,1],
    yaxis_tickformat='.0%'  # show ticks as whole percentages, e.g. 40%
)
fig.show()