# 20M51702 Yuki Kobayashi
# Geospatial data analysis for environmental studies
# Exercise 6
We look at coronavirus (COVID-19) cases for a specific country.
I selected **South Korea**.

Attention:
For some reason I could not identify, the graphs are sometimes not displayed in the notebook; however, they are correctly written out as HTML files.

The notebook may fail on the first run. If it does, please run it again.

## Loading dataframe

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
np.set_printoptions(threshold=np.inf)  # Print full arrays without truncation

df = pd.read_csv('../input/novel-corona-virus-2019-dataset/covid_19_data.csv',header=0)
# print(df)
# print(df.columns)
# print(df['Country/Region'])
# print(df['Country/Region'].values)
# print(np.unique(df['Country/Region'].values))  #Extracting only unique values

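# Keep only the South Korea rows
selected_country = 'South Korea'
df = df[df['Country/Region'] == selected_country]
# Sum the case columns per observation date; selecting them explicitly
# avoids summing non-numeric columns, which fails or misbehaves on recent pandas
df = df.groupby('ObservationDate')[['Confirmed', 'Deaths', 'Recovered']].sum()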
print(df)

## Calculating and plotting
'df' holds cumulative counts.
So, the day-to-day cases are calculated below as the difference from the previous day.
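
To make this step concrete, here is a minimal sketch with made-up cumulative counts (the numbers and the name `cumulative` are illustrative and are not taken from the dataset). `Series.diff()` subtracts the previous row from each row, so the first day, which has no previous row, becomes `NaN`.

```python
import pandas as pd

# Hypothetical cumulative confirmed counts for four days (not real data)
cumulative = pd.Series(
    [1, 4, 9, 9],
    index=['01/22/2020', '01/23/2020', '01/24/2020', '01/25/2020'],
)

# Day-to-day increase: each value minus the value of the previous day
daily = cumulative.diff()
print(daily)

# The first day has no previous value, so diff() leaves NaN there
print(daily.isna().tolist())
```

This is why the table later in the notebook fills missing values with 0 before styling.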

In [None]:
# print(df['Recovered'].diff())

df['daily_confirmed'] = df['Confirmed'].diff()
df['daily_deaths'] = df['Deaths'].diff()
df['daily_recovery'] = df['Recovered'].diff()

# print(df)

# df['daily_confirmed'].plot()
# df['daily_recovery'].plot()
# plt.show()

In [None]:
print(df)

## Making an interactive chart using the two modules loaded below

In [None]:
from plotly.offline import iplot
import plotly.graph_objs as go

daily_confirmed_object = go.Scatter(x=df.index,y=df['daily_confirmed'].values,name='Daily confirmed')
daily_deaths_object = go.Scatter(x=df.index,y=df['daily_deaths'].values,name='Daily deaths')

layout_object = go.Layout(title='South Korea daily cases 20M51702',xaxis=dict(title='Date'),yaxis=dict(title='Number of people'))
fig = go.Figure(data=[daily_confirmed_object,daily_deaths_object],layout=layout_object)
iplot(fig)

In [None]:
fig.write_html('Original_SouthKorea_daily_cases_20M51702.html')

## I modified the chart a little because the daily deaths were difficult to see. I also added a plot of recovered persons, referring to https://plotly.com

In [None]:
from plotly.subplots import make_subplots


# make_subplots already returns a Figure; secondary_y=True adds a second y-axis on the right
fig2 = make_subplots(specs=[[{"secondary_y": True}]])

daily_recovered_object = go.Scatter(x=df.index,y=df['daily_recovery'].values,name='Daily recovered')
daily_deaths_object2 = go.Bar(x=df.index,y=df['daily_deaths'].values,name='Daily deaths',opacity=0.5)

fig2.add_trace(
    daily_confirmed_object,
    secondary_y=False,
)

fig2.add_trace(
    daily_recovered_object,
    secondary_y=False,
)

fig2.add_trace(
    daily_deaths_object2,
    secondary_y=True,
)

fig2.update_layout(title_text="South Korea daily cases 20M51702")
fig2.update_xaxes(title_text="Date")
fig2.update_yaxes(title_text="Number of dead people", secondary_y=True)
fig2.update_yaxes(title_text="Number of confirmed/recovered people", secondary_y=False)

fig2.update_layout(
    showlegend=False,
    annotations=[
        dict(
            x="02/18/2020",
            y=0,
            text="Super Spreader",
            showarrow=True,
            arrowhead=7,
            ax=0,
            ay=-300
        ),
        dict(
            x="03/19/2020",
            y=0,
            text="Restrict foreigners' entry",
            showarrow=True,
            arrowhead=7,
            ax=0,
            ay=-300
        ),
        dict(
            x="05/03/2020",
            y=0,
            text="Deregulate the restriction",
            showarrow=True,
            arrowhead=7,
            ax=0,
            ay=-300
        ),
    ]
)


# Please ignore below.
'''
from plotly.subplots import make_subplots

# fig3 = go.Figure()

daily_confirmed_object2 = go.Scatter(x=df.index,y=df['daily_confirmed'].values,name='Daily confirmed',yaxis='y1')
daily_recovered_object = go.Scatter(x=df.index,y=df['daily_recovery'].values,name='Daily recovered',yaxis='y1')
daily_deaths_object2 = go.Bar(x=df.index,y=df['daily_deaths'].values,name='Daily deaths',yaxis='y2')

layout = go.Layout(
    title = "South Korea daily cases 20M51702",
    xaxis = dict(title='Date',type='date'),
    yaxis = dict(title="Number of confirmed/recovered people", side='left'),
    yaxis2 = dict(title="Number of dead people",side = 'right',showgrid=False,overlaying='y')
)

fig3 = dict(data = [daily_confirmed_object2, daily_recovered_object, daily_deaths_object2], layout = layout)
iplot(fig3)
'''

iplot(fig2)
fig2.write_html('Modified_SouthKorea_daily_cases_20M51702.html')

### Discussion 1

I collected the main news items from Wikipedia (link: https://ja.wikipedia.org/wiki/韓国における2019年コロナウイルス感染症の流行状況#cite_note-191 (Japanese)). Although the source is Wikipedia, it appears to provide fairly reliable data in cooperation with Google.

According to the Korean government, one woman, a "super spreader", appears to have spread the coronavirus on 18th February, and we can see a rapid increase in daily confirmed cases from then on.

On 20th February, a mass infection occurred in a church.

On 19th March, the government started to restrict foreigners' entry.

We can see that the number of confirmed people decreased and the number of recovered people increased after those days.

Actually, the article says the restriction was not ideal: Korea initially made entrants from Europe wait in hospitals, but more than 1000 people were arriving and there were not enough beds, so the government gave up and had them wait at home instead.

Nevertheless, according to this graph, the restriction seems to have been effective in controlling infections.

## Making informative table
Color the entries with the "jet" colormap, which runs from blue for low values to red for high values, so that large and small values are easy to tell apart.

In [None]:
df1 = df.fillna(0.)  # fill NaN (e.g. the first day of diff()) with 0
styled_object = (
    df1.style.background_gradient(cmap='jet')
    .highlight_max(subset=['daily_confirmed'])
    .set_caption('Daily Summaries')
)
display(styled_object)
# Save as HTML. Styler.render() was removed in pandas 2.0, so use to_html();
# the with-block also makes sure the file is closed.
with open('table_20M51702.html', 'w') as f:
    f.write(styled_object.to_html())

### Discussion 2
Looking at this table and the graph, we can see that the largest number of daily confirmed cases was 851, on March 3rd, 2020.

The number of cases kept decreasing until April; however, we can also see that it has been gradually increasing again since May.
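
The peak and its date can also be read programmatically instead of from the table. The following is a minimal sketch, assuming a daily series indexed by date strings as in this notebook; the values and the name `daily_confirmed` are made up for illustration and are not the South Korea data.

```python
import pandas as pd

# Hypothetical daily confirmed counts (not real data)
daily_confirmed = pd.Series(
    [10.0, 250.0, 120.0, 30.0],
    index=['03/01/2020', '03/02/2020', '03/03/2020', '03/04/2020'],
)

# idxmax() returns the index label (here, the date) of the largest value
peak_date = daily_confirmed.idxmax()
peak_value = daily_confirmed.max()
print(peak_date, peak_value)
```

On the real data, the same two calls on `df['daily_confirmed']` should give the date and value highlighted in the table.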

## Calculating global ranking
Select only the latest date, 06/12/2020, and rank the countries by confirmed cases.
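
The ranking steps used in the next cell can be sketched on a toy table first. The country names `A`, `B`, `C` and the counts are hypothetical; only the sequence of operations (sum per country, sort in descending order, number the rows from 1) mirrors the notebook.

```python
import pandas as pd

# Hypothetical rows for one date; country 'A' reports two regions
toy = pd.DataFrame({
    'Country/Region': ['A', 'A', 'B', 'C'],
    'Confirmed': [5, 7, 30, 2],
})

# Sum the regions of each country
totals = toy.groupby('Country/Region')[['Confirmed']].sum()

# Sort from most to fewest cases; reset_index() gives positions 0, 1, 2, ...
ranked = totals.sort_values(by='Confirmed', ascending=False).reset_index()

# Rank 1 is the country with the most confirmed cases
ranked['Rank'] = ranked.index.values + 1
print(ranked)
```

Note that countries with equal counts still get different ranks with this approach, because the rank is just the row position after sorting.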

In [None]:
df = pd.read_csv('../input/novel-corona-virus-2019-dataset/covid_19_data.csv',header=0)
df1 = df[df['ObservationDate']=='06/12/2020']
df1 = df1.groupby('Country/Region')[['Confirmed', 'Deaths', 'Recovered']].sum()  # case columns only
df2 = df1.sort_values(by='Confirmed', ascending=False).reset_index()
df2['Rank'] = df2.index.values + 1  # 'Rank' is the global ranking by confirmed cases (1 = most)
print(df2)
print(df2[df2['Country/Region']=='South Korea'])
print(df2[df2['Country/Region']=='Japan'])

### Discussion 3
Since 'Rank' indicates the global ranking by cumulative confirmed cases, South Korea was 56th in the world on 06/12/2020.

On 3rd May, the government relaxed its restrictions because the number of confirmed people was decreasing.

In early May, however, there was a mass infection in a nightclub, and there has been a modest increase in newly confirmed people.

So, there is now a fear that another wave of infection may come.

South Korea and Japan can be considered to be in a similar situation now, because Japan, ranked 48th, is also seeing modest increases.