# Coronavirus Data Visualizations 2020

![](https://storage.googleapis.com/kaggle-datasets-images/490520/913071/6f768170042ff67201713be09e3ad3df/dataset-cover.jpg?t=2020-01-26-22-15-00)

## Introduction
The novel coronavirus (provisionally named 2019-nCoV) is a contagious virus that causes respiratory infection. It has been identified as the causative agent of the ongoing 2019–20 Wuhan coronavirus outbreak.

As many early cases were linked to a large seafood and animal market, the virus is thought to have a zoonotic origin, but this has not been confirmed. Comparisons of the genetic sequences of this virus and other virus samples have shown similarities to SARS-CoV (79.5%) and bat coronaviruses (96%), which makes an ultimate origin in bats likely.

The first known human infection occurred on 8 December 2019. An outbreak of 2019-nCoV was first detected in Wuhan, China, in mid-December 2019. The virus subsequently spread to all other provinces of China and to more than twenty other countries in Asia, Europe, North America, and Oceania. Human-to-human spread of the virus has been confirmed in China, Germany, Thailand, Taiwan, Japan, and the United States.

As of 1 February 2020, there were 12,024 confirmed cases of infection, of which 11,860 were within mainland China. Cases outside China, to date, have been in people who either travelled from Wuhan or were in direct contact with someone who travelled from the area. The number of deaths was 259 as of 1 February 2020.

## Data Loading

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here are several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt # data visualization
import seaborn as sns
import dateutil.parser


# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# Any results you write to the current directory are saved as output.

In [None]:
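# Load the cleaned 21-26 January 2020 snapshot of the dataset
file = pd.read_csv("/kaggle/input/2019-coronavirus-dataset-01212020-01262020/2019_nC0v_20200121_20200126_cleaned.csv")
# Drop the unnamed column (most likely an index saved along with the CSV)
file = file.drop(columns=['Unnamed: 0'])
file.info()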

- **Province/State** - Province or state where the cases were reported.
- **Country** - Country where the cases were reported.
- **Date last updated** - Date on which the record was last updated.
- **Confirmed** - Number of cases confirmed as infected.
- **Suspected** - Number of suspected cases.
- **Recovered** - Number of patients who recovered.
- **Deaths** - Number of deaths attributed to the coronavirus.
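Before relying on the columns described above, it can help to confirm that they are all present and to see how many values are missing in each. The following is a minimal sketch, assuming a small made-up DataFrame named `toy` in place of the real `file`; the rows and region names are placeholders, not values from the dataset.

```python
import pandas as pd

# Toy stand-in for the notebook's `file` DataFrame; all values are made up.
toy = pd.DataFrame({
    "Province/State": ["Region A", None, "Region B"],
    "Country": ["Country X", "Country Y", "Country X"],
    "Date last updated": ["1/21/2020", "1/21/2020", "1/22/2020"],
    "Confirmed": [270.0, 1.0, 5.0],
    "Suspected": [11.0, None, 16.0],
    "Recovered": [None, None, None],
    "Deaths": [6.0, None, None],
})

expected_columns = ["Province/State", "Country", "Date last updated",
                    "Confirmed", "Suspected", "Recovered", "Deaths"]

# Columns named in the description but absent from the data
missing_columns = [c for c in expected_columns if c not in toy.columns]

# Number of missing values per column
null_counts = toy.isna().sum()

print(missing_columns)
print(null_counts)
```

Running the same two checks on `file` would show which count columns contain gaps before they are summed.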


Summary statistics for the dataset:

In [None]:
round(file.describe())

In [None]:
# First few records of the dataset
file.head(10)

OK, now that we have a glimpse of the data, let's explore it.

## Data Exploration & Visualization

In [None]:
#file['Date last updated']=pd.to_datetime(file['Date last updated']).apply(lambda x: x.date())
uniq_dates = list(file['Date last updated'].unique())
uniq_dates
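If the update timestamps need to be treated as real dates (for example, to sort them chronologically), one option is to parse the column with `pd.to_datetime` and drop the time of day. This is only a sketch: the `toy` frame and its timestamp format are assumptions, and the actual format in the CSV may differ.

```python
import pandas as pd

# Toy stand-in for `file`; the timestamps and counts are made up.
toy = pd.DataFrame({
    "Date last updated": ["2020-01-22 12:00:00",
                          "2020-01-21 00:00:00",
                          "2020-01-22 12:00:00"],
    "Confirmed": [5, 1, 7],
})

# Parse the strings and keep only the calendar day (time set to midnight)
toy["Date last updated"] = pd.to_datetime(toy["Date last updated"]).dt.normalize()

# Unique days in chronological order
uniq_dates = toy["Date last updated"].drop_duplicates().sort_values().tolist()
print(uniq_dates)
```

Keeping the column as a datetime type (rather than converting to plain `date` objects) lets pandas and matplotlib place the points on a proper time axis.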


In [None]:
confirmed=[]
suspected=[]
recovered=[]
deaths=[]
for x in uniq_dates:
    confirmed.append(file[file['Date last updated']==x].Confirmed.sum())
    suspected.append(file[file['Date last updated']==x].Suspected.sum())
    recovered.append(file[file['Date last updated']==x].Recovered.sum())
    deaths.append(file[file['Date last updated']==x].Deaths.sum())


line_plot= pd.DataFrame()
line_plot['Date']=uniq_dates
line_plot['Confirmed']=confirmed
line_plot['Suspected']=suspected
line_plot['Recovered']=recovered
line_plot['Deaths']=deaths
line_plot.head()
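The per-date totals built with the loop above can also be computed in a single `groupby` call. The sketch below uses a made-up `toy` frame and a hypothetical result name `line_plot_alt`; `sort=False` keeps the dates in order of first appearance, which matches what `unique()` returns.

```python
import pandas as pd

# Toy stand-in for `file`; all values are made up.
toy = pd.DataFrame({
    "Date last updated": ["1/21/2020", "1/21/2020", "1/22/2020"],
    "Confirmed": [10, 2, 15],
    "Suspected": [4, 0, 6],
    "Recovered": [0, 0, 1],
    "Deaths": [1, 0, 2],
})

count_cols = ["Confirmed", "Suspected", "Recovered", "Deaths"]

# Sum each count column per date, then expose the date as a 'Date' column
line_plot_alt = (
    toy.groupby("Date last updated", sort=False)[count_cols]
    .sum()
    .rename_axis("Date")
    .reset_index()
)
print(line_plot_alt)
```

This avoids filtering the full frame four times per date, which matters little at this size but scales better as the dataset grows.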

In [None]:

line_plot = line_plot.set_index('Date')
plt.figure(figsize=(20,15))
sns.lineplot(data=line_plot)
plt.xticks(rotation=15)
plt.title('Number of Coronavirus Cases Over Time', size=20)
plt.xlabel('Time', size=20)
plt.ylabel('Number of Cases', size=20)
plt.show()

### Relationship Between Confirmed, Suspected, Recovered, and Deaths by Country and State

In [None]:
# pairplot creates its own figure; `height` replaces the deprecated `size` argument
sns.pairplot(file, height=3.5);

In [None]:
# Colour the points by country; `height` replaces the deprecated `size` argument
sns.pairplot(file, hue='Country', height=3.5);

In [None]:
# Colour the points by province/state; `height` replaces the deprecated `size` argument
sns.pairplot(file, hue='Province/State', height=3.5);

### Country and State wise Explorations

In [None]:
# Total each count column per country (a list of columns needs double brackets)
data = file.groupby('Country')[['Confirmed', 'Suspected', 'Recovered', 'Deaths']].sum().reset_index()
data.head(19)

In [None]:
data = file.groupby('Country')[['Confirmed', 'Suspected', 'Recovered', 'Deaths']].sum().reset_index()

data.sort_values(by=['Confirmed'], inplace=True,ascending=False)

plt.figure(figsize=(12,6))

# Title
plt.title("Number of Confirmed Coronavirus Cases, by Country")

# Horizontal bar chart of confirmed cases, by country
sns.barplot(y=data['Country'], x=data['Confirmed'], orient='h')

# Label the horizontal axis, which carries the counts
plt.xlabel("Number of Confirmed Patients")

In [None]:
data.sort_values(by=['Suspected'], inplace=True,ascending=False)

plt.figure(figsize=(12,6))

# Title
plt.title("Number of Suspected Coronavirus Cases, by Country")

# Horizontal bar chart of suspected cases, by country
sns.barplot(y=data['Country'], x=data['Suspected'], orient='h')

# Label the horizontal axis, which carries the counts
plt.xlabel("Number of Suspected Patients")

In [None]:
data.sort_values(by=['Recovered'], inplace=True,ascending=False)

plt.figure(figsize=(12,6))

# Title
plt.title("Number of Patients Recovered from Coronavirus, by Country")

# Horizontal bar chart of recovered patients, by country
sns.barplot(y=data['Country'], x=data['Recovered'], orient='h')

# Label the horizontal axis, which carries the counts
plt.xlabel("Number of Recovered Patients")

In [None]:
data.sort_values(by=['Deaths'], inplace=True,ascending=False)

plt.figure(figsize=(12,6))

# Title
plt.title("Number of Coronavirus Deaths, by Country")

# Horizontal bar chart of deaths, by country
sns.barplot(y=data['Country'], x=data['Deaths'], orient='h')

# Label the horizontal axis, which carries the counts
plt.xlabel("Number of Deaths")
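The four country-level charts above differ only in the column being plotted, so the repeated cell could be folded into one helper. The sketch below is one possible shape for it, not the notebook's own code: `plot_counts_by_group` is a hypothetical name, the `toy` data is made up, and it draws with matplotlib's `barh` instead of `sns.barplot` to keep the example dependency-light.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd


def plot_counts_by_group(df, group_col, value_col, title):
    """Draw a horizontal bar chart of value_col per group, largest first."""
    ordered = df.sort_values(by=value_col, ascending=False)
    fig, ax = plt.subplots(figsize=(12, 6))
    ax.barh(ordered[group_col], ordered[value_col])
    ax.invert_yaxis()          # largest bar at the top
    ax.set_title(title)
    ax.set_xlabel(value_col)   # counts run along the horizontal axis
    ax.set_ylabel(group_col)
    return ax


# Toy stand-in for the per-country totals in `data`; all values are made up.
toy = pd.DataFrame({
    "Country": ["Country A", "Country B", "Country C"],
    "Confirmed": [5, 50, 1],
    "Suspected": [3, 20, 0],
    "Recovered": [0, 2, 0],
    "Deaths": [0, 4, 0],
})

count_cols = ["Confirmed", "Suspected", "Recovered", "Deaths"]
axes = [
    plot_counts_by_group(toy, "Country", col, f"{col} Cases, by Country")
    for col in count_cols
]
```

Because the helper sorts a copy with `sort_values` rather than using `inplace=True`, the order of the input frame is left untouched between charts.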

These charts suggest that China and some neighbouring countries have the most cases.
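One way to put a number on this observation is to express each country's confirmed count as a share of the total. This is a minimal sketch with made-up values; `toy_data` stands in for the per-country `data` frame, and `Share (%)` is a hypothetical column name.

```python
import pandas as pd

# Toy stand-in for the per-country totals; all values are made up.
toy_data = pd.DataFrame({
    "Country": ["Country A", "Country B", "Country C"],
    "Confirmed": [90, 6, 4],
})

# Each country's percentage of all confirmed cases, rounded to one decimal
total_confirmed = toy_data["Confirmed"].sum()
toy_data["Share (%)"] = (100 * toy_data["Confirmed"] / total_confirmed).round(1)

print(toy_data.sort_values(by="Share (%)", ascending=False))
```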

## States of China

In [None]:
china= file[file['Country'] == 'Mainland China']
china_data = china.groupby('Province/State')[['Confirmed', 'Suspected', 'Recovered', 'Deaths']].sum().reset_index()
china_data.head(35)

In [None]:
china_data.sort_values(by=['Confirmed'], inplace=True,ascending=False)

plt.figure(figsize=(25,10))

# Title
plt.title("Number of Confirmed Coronavirus Cases, by State")

# Vertical bar chart of confirmed cases, by province/state
sns.barplot(x=china_data['Province/State'], y=china_data['Confirmed'], orient='v')


# Add label for vertical axis
plt.ylabel("Number of Confirmed Patients")

In [None]:
china_data.sort_values(by=['Deaths'], inplace=True,ascending=False)

plt.figure(figsize=(25,10))

# Title
plt.title("Number of Coronavirus Deaths, by State")

# Vertical bar chart of deaths, by province/state
sns.barplot(x=china_data['Province/State'], y=china_data['Deaths'], orient='v')


# Add label for vertical axis
plt.ylabel("Number of Deaths")

This is not the end; I am still exploring the data.