# How long does it take to fully recover from COVID-19?

**Hello, I'm an ordinary undergraduate in Korea. Most Koreans are now wondering when the COVID-19 crisis will be over: business owners need to know when they can reopen, and students need to know when they can go back to school. I think that knowing how long recovery takes will give a sense of when the situation will end. I hope it ends as soon as possible.**


# 0. get DataFrame

(First, load the data into a DataFrame.)

In [None]:
import pandas as pd

In [None]:
df = pd.read_csv('../input/coronavirusdataset/patient.csv')
df.head()

In [None]:
df['age'] = 2020 - df['birth_year']  # approximate age as of 2020

# 1. get data: fully recovered patients

(Select only the data for patients who have fully recovered.)

In [None]:
df_released = df[df['state'] == 'released']
df_released.head()

In [None]:
df_released = df_released.reset_index(drop = True)
df_released.tail()

**Here, some rows of df_released are missing 'released_date'. I will fix this.**

In [None]:
df_released.info()

**A few rows have a null 'released_date'. I will remove them.**

In [None]:
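# .copy() makes an independent frame, so adding columns later does not trigger SettingWithCopyWarning
df_released = df_released[df_released.released_date.notna()].copy()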
df_released.tail()
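Before dropping rows, it can help to count how many are affected. The following is a minimal sketch on a made-up three-row frame; the `toy` data and its values are hypothetical and do not come from `patient.csv`.

```python
import pandas as pd

# Hypothetical toy data standing in for df_released; the values are made up.
toy = pd.DataFrame({
    "confirmed_date": ["2020-02-01", "2020-02-03", "2020-02-05"],
    "released_date": ["2020-02-15", None, "2020-02-20"],
})

# Count the missing values in each column.
missing_counts = toy.isna().sum()
print(missing_counts)

# Keep only the rows that have a release date.
toy_clean = toy[toy["released_date"].notna()].copy()
print(len(toy), "->", len(toy_clean), "rows")
```

Printing the row count before and after makes it easy to see how much data the filter discards.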

# 2. calculate the time it takes to recover fully from COVID-19

(Calculate the number of days from confirmation to release.)

In [None]:
date_cols = ["confirmed_date", "released_date", "deceased_date"]
for col in date_cols:
    df_released[col] = pd.to_datetime(df_released[col])

In [None]:
df_released["timedelta_to_release_since_confirmed"] = df_released["released_date"] - df_released["confirmed_date"]

In [None]:
# .dt.days extracts the whole number of days from each timedelta in one vectorized step
df_released["time_to_release_since_confirmed"] = df_released["timedelta_to_release_since_confirmed"].dt.days
time_to_recover_series = df_released["time_to_release_since_confirmed"]
time_to_recover_series[:5]

In [None]:
time_to_recover = time_to_recover_series.values
time_to_recover 
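To make the date arithmetic concrete, here is a small worked example. The two rows in `toy` are hypothetical; only the column names follow the dataset.

```python
import pandas as pd

# Hypothetical toy rows; the dates are made up for illustration.
toy = pd.DataFrame({
    "confirmed_date": ["2020-02-01", "2020-02-10"],
    "released_date": ["2020-02-15", "2020-03-01"],
})

# Parse the text columns into datetime64 values.
for col in ["confirmed_date", "released_date"]:
    toy[col] = pd.to_datetime(toy[col])

# Subtracting two datetime columns gives a timedelta column.
delta = toy["released_date"] - toy["confirmed_date"]

# .dt.days keeps the whole-day part of each timedelta.
days = delta.dt.days
print(days.tolist())  # [14, 20] (February 2020 has 29 days)
```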

# 3. visualize the time to recover completely

(Visualize the time it takes to recover completely.)

In [None]:
import seaborn as sns
import matplotlib.pyplot as plt

### 1) calculate min_value, max_value, and mean

In [None]:
print('min_value: '+ str(time_to_recover.min()) + ' days')
print('max_value: '+ str(time_to_recover.max()) + ' days')
print('mean: '+ str(time_to_recover.mean()))
print('std: '+ str(time_to_recover.std()))

**On average, it takes about 14 days from confirmation to release, with a standard deviation of about 6.6 days.**
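One detail worth keeping in mind when reading the standard deviation: `.std()` on a NumPy array divides by `n` (population form, `ddof=0`), while `.std()` on a pandas Series divides by `n - 1` by default. The sketch below shows the difference on a hypothetical array `days`; the numbers are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical recovery times in days; not taken from the dataset.
days = np.array([10, 12, 14, 16, 18])

pop_std = days.std()                # NumPy default: ddof=0 (divide by n)
sample_std = days.std(ddof=1)       # sample form: divide by n - 1
pandas_std = pd.Series(days).std()  # pandas default: ddof=1

# The median is a useful companion to the mean when outliers are present.
median = np.median(days)

print("population std:", pop_std)
print("sample std:", sample_std)
print("pandas std:", pandas_std)
print("median:", median)
```

With only a handful of recovered patients the two forms can differ noticeably, so it is worth stating which one is reported.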

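### 2) draw the distribution of the time to recover completely

In [None]:
plt.figure(figsize=(10,5))
# sns.distplot is deprecated since seaborn 0.11; histplot with kde=True draws a histogram plus a density curve
sns.histplot(time_to_recover, kde=True, stat='density', color='red')
plt.xlim(time_to_recover.min(),time_to_recover.max())
plt.xticks(range(time_to_recover.max() + 1));  # + 1 so the maximum gets a tick too
plt.xlabel('day');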

In [None]:
plt.figure(figsize=(10,5))
sns.kdeplot(time_to_recover, color = 'red')
plt.xlim(time_to_recover.min(),time_to_recover.max())
plt.xticks(range(time_to_recover.max() + 1));  # + 1 so the maximum gets a tick too
plt.xlabel('day');

In [None]:
from scipy import stats  # import the submodule explicitly; `import scipy` alone may not expose scipy.stats

In [None]:
print('skewness: ' + str(stats.skew(time_to_recover)))
print('kurtosis: ' + str(stats.kurtosis(time_to_recover)))

**The skewness is negative, so the left tail is slightly longer and the bulk of the distribution sits toward the right. However, the value is close to zero, so the distribution is nearly symmetric.**
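To see what these two numbers measure, they can be computed by hand from central moments. The sketch below uses the moment-based definitions, which should match the defaults of `scipy.stats.skew` and `scipy.stats.kurtosis` (biased estimators, excess kurtosis). The helper name `moment_skew_kurtosis` and the sample lists are hypothetical.

```python
import numpy as np

def moment_skew_kurtosis(values):
    """Return (skewness, excess kurtosis) computed from central moments."""
    x = np.asarray(values, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)  # second central moment (variance, ddof=0)
    m3 = np.mean(centered ** 3)  # third central moment
    m4 = np.mean(centered ** 4)  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skewness, excess_kurtosis

# Hypothetical sample with one unusually small value (a long left tail).
left_tailed = [1, 8, 9, 10, 10, 11, 12]
skew, kurt = moment_skew_kurtosis(left_tailed)
print("skewness:", skew)   # negative, because the long tail is on the left
print("kurtosis:", kurt)

# A perfectly symmetric sample has zero skewness.
sym_skew, _ = moment_skew_kurtosis([1, 2, 3, 4, 5])
print("symmetric skewness:", sym_skew)
```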

# 4. How does the time to full recovery vary with age?

(Recovery time by age)

In [None]:
plt.figure(figsize=(20,10))
ax = sns.barplot(data = df_released, x="age", y="time_to_release_since_confirmed",
                 saturation=1)
plt.title('Time to full recovery by age');

**Apart from one 33-year-old who took 32 days to recover fully, recovery tends to take longer for older patients.**

**Still, recovery time is fairly even across ages overall.**
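With one bar per age, a single patient can dominate a bar. One way to check the trend is to group ages into decades and compare the mean with the median, which is less sensitive to outliers. This is only a sketch: the `toy` frame and its values are hypothetical, and only the column names mirror `df_released`.

```python
import pandas as pd

# Hypothetical toy data; one 33-year-old outlier at 32 days, as in the text.
toy = pd.DataFrame({
    "age": [23, 27, 31, 33, 38, 45, 52, 58, 64, 71],
    "time_to_release_since_confirmed": [10, 12, 12, 32, 13, 14, 15, 16, 17, 19],
})

# Bucket each age into its decade: 23 -> 20, 38 -> 30, and so on.
toy["age_decade"] = (toy["age"] // 10) * 10

# Count, mean, and median recovery time per decade.
by_decade = (
    toy.groupby("age_decade")["time_to_release_since_confirmed"]
       .agg(["count", "mean", "median"])
)
print(by_decade)
```

In this toy example the outlier lifts the mean for the thirties to 19 days while the median stays at 13, which is why the median is the safer number to compare across groups.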

# 5. age distribution of deceased and fully recovered patients

(Compare the ages of patients who died with the ages of those who fully recovered.)

### 1) get data: patients deceased

In [None]:
df_deceased = df[df['state'] == 'deceased']
df_deceased.head()

In [None]:
df_deceased = df_deceased.reset_index(drop = True)
df_deceased.tail()

In [None]:
df_deceased.info()

**'confirmed_date' and 'deceased_date' each have 23 non-null values.**

In [None]:
date_cols = ["confirmed_date", "released_date", "deceased_date"]
for col in date_cols:
    df_deceased[col] = pd.to_datetime(df_deceased[col])
    
df_deceased["timedelta_to_decease_since_confirmed"] = df_deceased["deceased_date"] - df_deceased["confirmed_date"]
# .dt.days extracts whole days and tolerates missing dates (NaT becomes NaN)
df_deceased["time_to_decease_since_confirmed"] = df_deceased["timedelta_to_decease_since_confirmed"].dt.days
df_deceased["time_to_decease_since_confirmed"][:5]

**Some rows have a negative value, meaning the recorded date of death is earlier than the confirmation date. I will remove them.**

In [None]:
df_deceased = df_deceased[df_deceased["time_to_decease_since_confirmed"]>=0]
df_deceased = df_deceased.reset_index(drop = True)
df_deceased.head()
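It can be useful to look at the rows with negative durations before discarding them. The source does not say why they occur; a confirmation recorded after death or a data-entry error are only possible explanations. A minimal sketch on hypothetical data (the `toy` frame and the `days_to_decease` column name are made up):

```python
import pandas as pd

# Hypothetical toy rows; the second one has a death date before confirmation.
toy = pd.DataFrame({
    "confirmed_date": pd.to_datetime(["2020-02-20", "2020-02-25", "2020-02-22"]),
    "deceased_date": pd.to_datetime(["2020-02-23", "2020-02-24", "2020-02-22"]),
})

toy["days_to_decease"] = (toy["deceased_date"] - toy["confirmed_date"]).dt.days

# Rows to inspect before deciding whether to drop them.
suspicious = toy[toy["days_to_decease"] < 0]
print(suspicious)

# Keep non-negative durations; a same-day death (0 days) is retained.
kept = toy[toy["days_to_decease"] >= 0].reset_index(drop=True)
print(len(toy), "->", len(kept), "rows")
```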

In [None]:
# fill=True replaces the deprecated shade=True (seaborn >= 0.11)
sns.kdeplot(x=df_deceased['age'], label='deceased', color='black', fill=True)
sns.kdeplot(x=df_released['age'], label='released', color='red', fill=True)
plt.legend();
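The density curves can be complemented with a few numbers per group. The sketch below summarizes age by `state` on a hypothetical frame; the ages are made up and say nothing about the real dataset.

```python
import pandas as pd

# Hypothetical toy data with the same 'state' and 'age' columns as df.
toy = pd.DataFrame({
    "state": ["released", "released", "released",
              "deceased", "deceased", "deceased"],
    "age": [25, 40, 55, 60, 75, 82],
})

# Count, median, minimum, and maximum age within each state.
age_summary = toy.groupby("state")["age"].agg(["count", "median", "min", "max"])
print(age_summary)
```

On the real data, the same call on `df` would include every value of `state`, so it may be worth filtering to 'released' and 'deceased' first.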