# EDA: COVID-19 in Tokyo
![](https://images.unsplash.com/photo-1513407030348-c983a97b98d8?ixid=MXwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHw%3D&ixlib=rb-1.2.1&auto=format&fit=crop&w=1052&q=80)
* [Image Source](https://unsplash.com/photos/IocJwyqRv3M)



Hello Kagglers.

I wrote a [notebook about COVID-19 in Japan](https://www.kaggle.com/japandata509/eda-covid-19-pandemic-in-japan) about a month ago, and it received many upvotes. Thank you for your support.
This time, I decided to write a notebook about COVID-19 in Tokyo.

If you like it, please feel free to **upvote**.


# Data Source↓
* [COVID-19 in Tokyo](https://www.kaggle.com/japandata509/covid19-in-tokyo-japan)
* [COVID-19 dataset in Japan](https://www.kaggle.com/lisphilar/covid19-dataset-in-japan)

# Content
* [Current Situation in Tokyo](#1)
* [EDA about COVID-19 cases](#2)
* [EDA about patients of COVID-19](#3)

# About Tokyo

![](https://upload.wikimedia.org/wikipedia/commons/0/01/Tokyo_in_Japan.svg)
* [Image Source](https://en.wikipedia.org/wiki/Tokyo)
* The area painted <span style="color:red">red</span> belongs to Tokyo Metropolis.
* Tokyo (Tokyo Metropolis) is the most populous prefecture and the capital of Japan.
* The population is about 14 million and is increasing year by year.
* The city and its trains are crowded because many people commute to Tokyo from neighboring areas for work.
* As of 30th January, about 100,000 people in Tokyo had tested positive for COVID-19, and about 850 had died.

In [None]:
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

In [None]:
import json
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from IPython.display import HTML,display
#import bar_chart_race as bcr

In [None]:
tokyo_covid = pd.read_csv('../input/covid19-in-tokyo-japan/tokyo_covid19_patients.csv')
prefecture_covid = pd.read_csv('../input/covid19-dataset-in-japan/covid_jpn_prefecture.csv')
tokyo = pd.read_csv('../input/covid19-in-tokyo-japan/tokyo_cases_byarea.csv')
prefecture_covid

<a id="1"></a>
<h2 style='background:#FFFFFF; border:0; color:black'><center>Current Situation in Tokyo</center></h2>

* On 7th January 2021, the Japanese government declared <span style="color:red">a state of emergency</span> for the Tokyo metropolitan area because the number of cases was soaring.
* For more information, please refer to the sites below.
* [COVID-19 Information and Resources(Cabinet Secretariat)](https://corona.go.jp/en/)
* [Novel Coronavirus (Ministry of Health,Labour and Welfare)](https://www.mhlw.go.jp/stf/seisakunitsuite/bunya/0000164708_00079.html)
* [Coronavirus Outbreak (NHK World)](https://www3.nhk.or.jp/nhkworld/en/news/tags/82/)


# As of 30th January
* The number of patients in hospital is 2882 (total beds: 4000)
* The number of patients who are seriously ill is 141 (total beds for severe patients: 250)
* Information about patients can be accessed [here](https://stopcovid19.metro.tokyo.lg.jp/en).
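
The figures above imply a bed utilization rate, which can be worked out directly. This is a minimal sketch using only the numbers quoted in this section; the helper name `utilization` is hypothetical and not part of the original notebook.

```python
def utilization(occupied, capacity):
    """Return occupancy as a percentage of capacity."""
    return 100 * occupied / capacity

# Numbers quoted above (as of 30th January)
hospital_rate = utilization(2882, 4000)  # all hospitalized patients
severe_rate = utilization(141, 250)      # seriously ill patients

print(f"Hospital beds in use: {hospital_rate:.2f}%")
print(f"Severe-case beds in use: {severe_rate:.2f}%")
```

Both rates are well above half of capacity, which is consistent with the concern about pressure on medical services.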

In [None]:
tokyo_current = prefecture_covid[prefecture_covid["Prefecture"] == "Tokyo"].tail(1)
cols = ["Discharged","Fatal","Hosp_require","Hosp_severe"]
tokyo_current = tokyo_current[cols]
tokyo_current.columns = ["Cured","Death","Active","Severe"]

In [None]:
current_df = pd.DataFrame(tokyo_current.T.values,columns=["Cases"],index=["Cured","Death","Active","Severe"])

In [None]:
fig = px.pie(current_df, values='Cases',names=current_df.index,
             title='Current Situation in Tokyo (As of January 29th)',color_discrete_sequence=px.colors.sequential.Rainbow)
fig.show()

<a id="2"></a>
<h2 style='background:#FFFFFF; border:0; color:black'><center>Confirmed Cases by Date</center></h2>

In [None]:
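# Keep only the last row of each date; column 0 of that row is the running total at the end of the day.
last_per_date = tokyo_covid.drop_duplicates(["Date"],keep='last')

# Daily cases = difference between the running totals of consecutive dates (starting from row 102).
cases_byday = []
for i in range(102,len(last_per_date)-1):
    cases_byday.append(last_per_date.iloc[i+1,0]-last_per_date.iloc[i,0])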

In [None]:
date = tokyo_covid.drop_duplicates(["Date"]).iloc[103:,1]
tokyo_cases_bydate = pd.DataFrame({"Date":date,"Number of cases":cases_byday})
tokyo_cases_bydate

In [None]:
fig = px.bar(tokyo_cases_bydate,x='Date',y='Number of cases',title="Cases by Date in Tokyo")
fig.show()

* The number of daily cases reached 2000 for the first time on 7th January.
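
The daily counts above come from differencing a running patient number. As an alternative to the explicit loop, the same idea can be sketched with `groupby` and `diff`. The toy data and the column name `No` are assumptions for illustration, not the real dataset.

```python
import pandas as pd

# Toy data: one row per patient, numbered consecutively.
# The column name "No" is an assumption made for this sketch.
patients = pd.DataFrame({
    "No":   [1, 2, 3, 4, 5, 6],
    "Date": ["2021-01-01", "2021-01-01", "2021-01-02",
             "2021-01-03", "2021-01-03", "2021-01-03"],
})

# Highest patient number per date = cumulative total at the end of that day.
cumulative = patients.groupby("Date")["No"].max()

# Day-over-day difference gives new cases; the first day has no predecessor.
daily = cumulative.diff().dropna().astype(int)
print(daily)
```

This relies on patient numbers increasing without gaps, which is the same assumption the loop makes.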

<h2 style='background:#FFFFFF; border:0; color:black'><center>Confirmed Cases by Area</center></h2>

* In Tokyo, there are 23 wards, such as Shinjuku, Shibuya, and Setagaya.
* The area covering the 23 wards is called "Tokyo Special Wards", and many government agencies and businesses are located there.

![Tokyo Special Ward](https://upload.wikimedia.org/wikipedia/commons/a/aa/Tokyo_23_Special_Wards_Area_Map.svg)

* Areas painted green belong to Tokyo Special Wards.([Image Source](https://en.wikipedia.org/wiki/Special_wards_of_Tokyo))

![](https://upload.wikimedia.org/wikipedia/commons/1/16/Tokyo_special_wards_map.svg)
* The Tokyo Special Wards face Tokyo Bay.
* Their population is about 9.6 million.
* Shinjuku ward has Japan's largest downtown area.
* ([Image Source](https://en.wikipedia.org/wiki/Special_wards_of_Tokyo))

* We analyze the number of cases in the Tokyo Special Wards by drawing a map.
* The GeoJSON file was obtained from [this site](https://github.com/niiyz/JapanCityGeoJson).

* Reference: [Choropleth Maps in Python](https://plotly.com/python/choropleth-maps/)

In [None]:
# Use a context manager so the file is closed after loading.
with open('../input/tokyo-json/tokyo23.json', 'r') as f:
    d = json.load(f)

In [None]:
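features = d['features']
# Unique area IDs; one area may be split across several features.
ids = sorted(set(feat['id'] for feat in features))
new_features = []
for area_id in ids:
    # All coordinate lists that belong to this area
    data = [feat['geometry']['coordinates'] for feat in features if feat['id']==area_id]
    if len(data) == 1:
        # A single shape stays a Polygon.
        feature = dict(
            type = "Feature",
            geometry = dict(
                type = "Polygon",
                coordinates = data[0],
            ),
            id = area_id,
        )
    else:
        # Several shapes with the same ID are merged into one MultiPolygon.
        feature = dict(
            type = "Feature",
            geometry = dict(
                type = "MultiPolygon",
                coordinates = data,
            ),
            id = area_id,
        )
    new_features.append(feature)
d['features'] = new_features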

#with open('new_tokyo23.json', 'w') as f:
#    json.dump(d, f)

In [None]:
fig = px.choropleth(tokyo, geojson=d, color="Positive Cases",
                    locations="Code",title = 'Confirmed Cases by Area'
                   )
fig.update_geos(fitbounds="locations", visible=False)
#fig.update_layout(margin={"r":0,"t":0,"l":0,"b":0})
fig.show()

* We can see that the number of cases is particularly high in Shinjuku and Setagaya.
* Setagaya has the highest number of cases among the 23 wards, probably because it has a large population (about 1 million people).
* Shinjuku has many cases probably because it has a large downtown area (such as "Kabukicho") and many people who work there live nearby.
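
Since population size may explain part of the difference, one way to compare areas more fairly is cases per 100,000 residents. The sketch below uses placeholder values only: the ward names, case counts, and populations are made up for illustration, and the `Area` and `Population` columns are assumptions (the dataset used here does not necessarily include them).

```python
import pandas as pd

# Placeholder values for illustration only; these are NOT real figures.
wards = pd.DataFrame({
    "Area": ["Ward A", "Ward B"],
    "Positive Cases": [6000, 5000],
    "Population": [900_000, 350_000],
})

# Normalize by population so large and small areas become comparable.
wards["Cases per 100k"] = wards["Positive Cases"] / wards["Population"] * 100_000
ranked = wards.sort_values("Cases per 100k", ascending=False)
print(ranked[["Area", "Cases per 100k"]].round(1))
```

In this toy example the area with fewer total cases ranks first once population is taken into account, which is why raw counts alone can be misleading.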

![](https://images.unsplash.com/photo-1558632328-465f59aff245?ixid=MXwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHw%3D&ixlib=rb-1.2.1&auto=format&fit=crop&w=1053&q=80)
* A photo of the streets in Kabukicho ([Image Source](https://unsplash.com/photos/wDehhoMbspg))

* The table below shows the total number of cases by municipality in Tokyo.

In [None]:
tokyo = tokyo.drop('Code',axis=1)
tokyo.sort_values(by='Positive Cases',ascending=False).style.background_gradient(cmap='plasma_r')

<h2 style='background:#FFFFFF; border:0; color:black'><center>Total Tests</center></h2>

In [None]:
tokyo_total = prefecture_covid[prefecture_covid["Prefecture"] == "Tokyo"]
# Map the plotted column name to a readable axis label.
fig = px.bar(tokyo_total,x='Date',y='Tested',labels={"Tested":"Number of tests"},title="Total Tests in Tokyo (As of Jan 29th)")
fig.show()

<h2 style='background:#FFFFFF; border:0; color:black'><center>Total Confirmed Cases</center></h2>

In [None]:
fig = px.bar(tokyo_total,x='Date',y='Positive',labels={"Positive":"Number of cases","value":"Date"},title="Total Cases in Tokyo (As of Jan 29th)") 
fig.show()

* On 14th January, the number of confirmed cases in Tokyo reached 80000.
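
A milestone date like the one above can be read off a cumulative series by taking the first row at or above a threshold. This is a minimal sketch on toy data; the helper `first_date_reaching` is hypothetical, and it assumes the rows are already sorted by date.

```python
import pandas as pd

# Toy cumulative counts, sorted by date (not real figures).
toy = pd.DataFrame({
    "Date": ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04"],
    "Positive": [10, 25, 40, 60],
})

def first_date_reaching(df, column, threshold):
    """Return the earliest Date on which `column` is >= threshold, or None."""
    hits = df.loc[df[column] >= threshold, "Date"]
    return hits.iloc[0] if not hits.empty else None

print(first_date_reaching(toy, "Positive", 30))
```

With the real data, the same call could be applied to `tokyo_total` with `column="Positive"` and `threshold=80000`.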

<h2 style='background:#FFFFFF; border:0; color:black'><center>Active Cases</center></h2>

In [None]:
fig = px.bar(tokyo_total,x='Date',y='Hosp_require',labels={"Hosp_require":"Number of active cases","value":"Date"},title="Active Cases in Tokyo (As of Jan 29th)") 
fig.show()

* The number of active cases has been increasing since November 2020, and approximately 18000 people are now hospitalized or staying at home or in hotels.
* In Tokyo, the bed utilization rate is increasing, and medical services are in danger of collapsing.

<h2 style='background:#FFFFFF; border:0; color:black'><center>Severe Cases</center></h2>

* According to [Update on COVID-19 in Tokyo](https://stopcovid19.metro.tokyo.lg.jp/en), the definition of "Severe" is as follows.


> Within hospitalized patients, "serious symptoms" are considered to be those patients who need ventilators or ECMO.

In [None]:
fig = px.bar(tokyo_total,x='Date',y='Hosp_severe',labels={"Hosp_severe":"Number of severe cases","value":"Date"},title="Severe Cases in Tokyo (As of Jan 29th)") 
fig.show()

* The number of severe cases has been soaring since December 2020 and reached a record high in January.

<h2 style='background:#FFFFFF; border:0; color:black'><center>Total Death</center></h2>

In [None]:
fig = px.bar(tokyo_total.iloc[51:],x='Date',y='Fatal',labels={"Fatal":"Number of death cases","value":"Date"},title="Death Cases in Tokyo (As of Jan 29th)") 
fig.show()

* The number of death cases reached 700 in January.

<a id="3"></a>
<h2 style='background:#FFFFFF; border:0; color:black'><center>Patients of COVID-19 in Tokyo</center></h2>

In [None]:
tokyo_age = tokyo_covid["Age"].value_counts()
fig = px.bar(tokyo_age,labels={"index":"Age","value":"Number of people"},title="Patient Age",text=tokyo_age)
fig.show()

* We can see that many young people were infected with COVID-19.
* Few people in their 20s or 30s died of COVID-19 in Tokyo, but there are concerns about sequelae.
* The Tokyo government asked people, especially young people, to refrain from dining with others and from nonessential, nonurgent outings.
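
To judge how large the share of young patients is, proportions can be easier to read than raw counts. The sketch below uses `value_counts(normalize=True)` on toy data; the age labels such as `"20s"` are an assumption about how the `Age` column is encoded.

```python
import pandas as pd

# Toy age labels for illustration (not the real dataset).
ages = pd.Series(["20s", "20s", "30s", "60s", "20s", "30s", "40s", "20s"], name="Age")

# normalize=True returns fractions instead of counts; convert to percent.
share = ages.value_counts(normalize=True).mul(100).round(1)
print(share)
```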

In [None]:
tokyo_gender = tokyo_covid["Gender"].value_counts()
fig = px.bar(tokyo_gender,labels={"index":"Gender","value":"Number of people"},title="Patient Gender",text=tokyo_gender)
fig.show()

In [None]:
tokyo_reg = tokyo_covid["Region"].value_counts()
fig = px.bar(tokyo_reg,labels={"index":"Region","value":"Number of people"},title="Regions where patients live in",text=tokyo_reg)
fig.show()

* The first three patients live in China, but most patients after that live in Tokyo.
* "Outside Tokyo" means areas near Tokyo Metropolis (e.g. Saitama, Kanagawa, Chiba).

# Thank you for reading, feel free to upvote!