# Coronavirus Analysis

<img src='https://bolnews.s3.amazonaws.com/wp-content/uploads/2020/01/Coranavirus-2.jpg1000-x-500.jpg' width='600' height='500' >

### How Coronavirus Started

The virus appears to have originated from a Wuhan seafood market where wild animals, including marmots, birds, rabbits, bats and snakes, are traded illegally. Coronaviruses are known to jump from animals to humans, so it’s thought that the first people infected with the disease – a group primarily made up of stallholders from the seafood market – contracted it from contact with animals.

Although an initial analysis of the virus suggested it was similar to coronaviruses seen in snakes, it now seems more likely that it came from bats. A team of virologists at the Wuhan Institute of Virology released a detailed paper showing that the new coronavirus's genetic makeup is 96 per cent identical to that of a coronavirus found in bats. Bats were also the original source of the SARS virus.

Source -> https://www.wired.co.uk/article/china-coronavirus

Coronaviruses are a group of viruses that cause diseases in mammals and birds. In humans, they cause respiratory infections that are typically mild, such as the common cold, but rarer forms like SARS and MERS can be lethal. In cows and pigs they may cause diarrhea, while in chickens they can cause an upper respiratory disease. At the time of writing (early 2020), no vaccines or antiviral drugs are approved for prevention or treatment.

Coronaviruses are viruses in the subfamily Orthocoronavirinae in the family Coronaviridae, in the order Nidovirales. Coronaviruses are enveloped viruses with a positive-sense single-stranded RNA genome and with a nucleocapsid of helical symmetry. The genomic size of coronaviruses ranges from approximately 26 to 32 kilobases, the largest for an RNA virus.

The name "coronavirus" is derived from the Latin corona, meaning crown or halo, which refers to the characteristic appearance of the virus particles (virions): they have a fringe reminiscent of a royal crown or of the solar corona.
    
Source -> https://en.wikipedia.org/wiki/Coronavirus

### Import Libraries

In [None]:
import numpy as np
import pandas as pa
import matplotlib.pyplot as plt
import seaborn as sn

#Import Plotly
import plotly.express as px
import plotly.offline as py
import plotly.graph_objects as go
import folium 

%matplotlib inline
import squarify
import matplotlib
import matplotlib as mpl
import matplotlib.cm as cm

font = {'family' : 'normal',
        'weight' : 'bold',
        'size'   : 32}

matplotlib.rcParams.update({'font.size': 30})

import warnings
warnings.filterwarnings('ignore')
plt.style.use('fivethirtyeight')

In [None]:
original_data = pa.read_csv('../input/2019-coronavirus-dataset-01212020-01262020/2019_nCoV_20200121_20200206.csv')
# Work on a copy so that later edits to `data` do not modify `original_data`
data = original_data.copy()

In [None]:
# Label rows without a province as 'Other <country>'
missing_state = data['Province/State'].isna()
data.loc[missing_state,'Province/State'] = 'Other ' + data.loc[missing_state,'Country/Region'].astype(str)

data = data.fillna(0)

data['Province/State'] = data['Province/State'].astype('category')
data['Country/Region'] = data['Country/Region'].astype('category')
data['Last Update'] = pa.to_datetime(data['Last Update'])

In [None]:
data.info()

In [None]:
data.head()

In [None]:
'''country_count = data['Country/Region'].value_counts()
country = pa.DataFrame({'Name':country_count.index,'Values':country_count.values})
names = []
for values in country_count.index:
    if ((values == 'Mainland China') | (values == 'Cambodia') |(values == 'South Korea') | (values == 'Hong Kong')| (values == 'Japan' ) | (values == 'Thailand')|(values == 'Taiwan') | (values == 'Macau') | (values =='Singapore')|(values =='Vietnam')|(values =='Malaysia')|(values =='Philippines')|(values =='Nepal') | (values =='Sri Lanka')):
        names.append('Asia')
    if (( values == 'United States') | ( values == 'Mexico' ) | ( values =='Colombia') | ( values == 'Brazil') | ( values == 'Canada')):
        names.append('America')
    if (( values == 'France') | ( values == 'Germany' )):  
        names.append('Europe') 
    if (( values == 'Ivory Coast')):
        names.append('Africa')
    if (( values == 'Australia' )):
        names.append('Oceania')

country['Continent'] = names'''

#### Number of records per Country/Region

In [None]:
matplotlib.rcParams.update({'font.size': 35})

plt.figure(figsize=(60,17))
country_count = data['Country/Region'].value_counts()
squarify.plot(sizes=country_count.values,label=country_count.index,alpha=0.7)

Most records in the dataset come from Mainland China, so the coronavirus is reported far more often in China than in any other country.
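Note that `value_counts()` counts rows (one per report), not cases. The following is a minimal sketch of the difference, using a small made-up table rather than the real dataset; the frame `toy` and its numbers are illustrative assumptions.

```python
import pandas as pd

# Hypothetical toy records (not taken from the real dataset): one row per report.
toy = pd.DataFrame({
    'Country/Region': ['Mainland China', 'Mainland China', 'Mainland China', 'Japan'],
    'Confirmed': [100, 250, 40, 2],
})

# Number of report rows per country: the quantity a value_counts() treemap is sized by.
rows_per_country = toy['Country/Region'].value_counts()

# Total confirmed cases per country: a different quantity.
cases_per_country = toy.groupby('Country/Region')['Confirmed'].sum()

print(rows_per_country)
print(cases_per_country)
```

If the goal is to compare case totals rather than reporting frequency, the grouped sum is probably the better input for the plot.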

### Get all the latest updated values

In [None]:
#data['Last Update'] = pa.to_datetime(data['Last Update'])
final_data = pa.DataFrame(columns=['Province/State','Country/Region','Last Update','Confirmed','Suspected','Recovered','Death'])

In [None]:
def search_values(state):
    # True if this Province/State has already been added to final_data
    return bool((final_data['Province/State'] == state).any())

In [None]:
# Keep only the first row seen for each Province/State.
# Note: this gives the latest values only if the file lists the newest updates first.
columns = list(final_data.columns)
for i,(row_name,row) in enumerate(data.iterrows()):
    if not search_values(row['Province/State']):
        final_data.loc[i] = [row[c] for c in columns]
final_data = final_data.reset_index(drop=True)
final_data.head(5)

In [None]:
data= final_data
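
The loop above keeps the first row it encounters for each `Province/State`, so it returns the latest values only if the file lists the newest updates first. A possible alternative, sketched here on made-up rows, sorts by `Last Update` explicitly and then uses `drop_duplicates`. The frame `toy`, its values and the name `latest` are illustrative assumptions, not part of the notebook.

```python
import pandas as pd

# Hypothetical rows (illustrative values only), deliberately not in date order.
toy = pd.DataFrame({
    'Province/State': ['Hubei', 'Hubei', 'Henan', 'Henan'],
    'Last Update': pd.to_datetime(['2020-01-22 12:00', '2020-01-23 12:00',
                                   '2020-01-23 12:00', '2020-01-22 12:00']),
    'Confirmed': [444, 549, 9, 5],
})

# Newest rows first, then keep the first (newest) row for each state.
latest = (toy.sort_values('Last Update', ascending=False)
             .drop_duplicates(subset='Province/State', keep='first')
             .reset_index(drop=True))

print(latest)
```

Sorting first makes the result independent of the row order in the CSV file.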

In [None]:
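# Sum only the numeric count columns (dates and country names cannot be summed)
data_province = data.groupby('Province/State')[['Confirmed','Suspected','Recovered','Death']].sum()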
data_province = data_province[data_province['Confirmed'] > 0].sort_values(by=['Confirmed'],ascending=False)

fig = go.Figure(go.Bar(x=data_province['Confirmed'],
                       y=data_province.index,
                       orientation='h',
        marker={
        'color': [np.random.randint(10,255) for x in range(0,len(data_province))],
        'colorscale': 'Viridis'
        }
        ))

fig.update_layout(yaxis=dict(title='States'),width=900,height=500,title='Total Confirmed Coronavirus cases over States',
                 xaxis=dict(title='Confirmed'))
fig.show()

Confirmed coronavirus cases are most common in Hubei, Henan, Jiangxi, Beijing and Jiangsu.

In [None]:
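# Sum only the numeric count columns (dates and country names cannot be summed)
data_province = data.groupby('Province/State')[['Confirmed','Suspected','Recovered','Death']].sum()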
data_province = data_province[data_province['Recovered'] > 0].sort_values(by=['Recovered'],ascending=False)

fig = go.Figure(go.Bar(x=data_province['Recovered'],
                       y=data_province.index,
                       orientation='h',
        marker={
        'color': [np.random.randint(10,255) for x in range(0,len(data_province))],
        'colorscale': 'Viridis'
        }
        ))

fig.update_layout(yaxis=dict(title='States'),width=900,height=500,title='Total Recovered Coronavirus cases over States',
                 xaxis=dict(title='Recovered'))
fig.show()

### Death percentage by state and country

In [None]:
fig_c = go.Pie(labels=data['Country/Region'],values=data['Death'], textinfo='label+percent',hole=0.4,domain={'x': [0,0.40]})

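# Recompute the per-state totals: the previous data_province only kept states with recoveries
data_province = data.groupby('Province/State')[['Confirmed','Suspected','Recovered','Death']].sum()
data_province = data_province[data_province['Death'] > 0].sort_values(by=['Death'],ascending=False)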

fig_s = go.Pie(labels=data_province.index,values=data_province['Death'], textinfo='label+percent',hole=0.4,
               domain={'x': [0.46,1]})

layout = dict(font=dict(size=10), legend=dict(orientation="v"),
              annotations = [dict(x=0.14, y=0.5, text='Country', showarrow=False, font=dict(size=20)),
                             dict(x=0.77, y=0.5, text='State', showarrow=False, font=dict(size=20)) ])


fig = dict(data=[fig_s, fig_c],layout=layout)
py.iplot(fig)

All recorded deaths are from Mainland China (492), Hong Kong (1) and the Philippines (1). Hubei alone accounts for 479 deaths, about 97.4% of the deaths in Mainland China, which means the remaining 13 occurred in other Chinese provinces.
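The percentages can be checked with a few lines of arithmetic. This is a small sketch that only uses the death counts quoted in the text; the variable names are illustrative.

```python
# Death counts quoted in the text above.
deaths_by_country = {'Mainland China': 492, 'Hong Kong': 1, 'Philippines': 1}
hubei_deaths = 479

china_deaths = deaths_by_country['Mainland China']
total_deaths = sum(deaths_by_country.values())

# Share of Mainland China's deaths that occurred in Hubei.
hubei_share_of_china = 100 * hubei_deaths / china_deaths
# Deaths in the rest of Mainland China.
other_provinces = china_deaths - hubei_deaths

print(f"Hubei share of Mainland China deaths: {hubei_share_of_china:.1f}%")
print(f"Deaths in other Chinese provinces: {other_provinces}")
print(f"Deaths in the whole dataset: {total_deaths}")
```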

### Confirmed and recovered cases by country, excluding China

In [None]:
data_other_country = data[data['Country/Region']!= 'Mainland China']

fig_c = go.Pie(labels=data_other_country['Country/Region'],values=data_other_country['Confirmed'], textinfo='label+percent',
               hole=0.4,domain={'x': [0,0.40]})

fig_s = go.Pie(labels=data_other_country['Country/Region'],values=data_other_country['Recovered'], 
               textinfo='label+percent',hole=0.4,
               domain={'x': [0.46,1]})

layout = dict(font=dict(size=10), legend=dict(orientation="v"),
              annotations = [dict(x=0.13, y=0.5, text='Confirmed', showarrow=False, font=dict(size=20)),
                             dict(x=0.80, y=0.5, text='Recovered', showarrow=False, font=dict(size=20)) ])


fig = dict(data=[fig_s, fig_c],layout=layout)
py.iplot(fig)

The first plot shows that, outside Mainland China, most confirmed cases are in Thailand, Taiwan, Hong Kong, Japan, Singapore and Macau, all of which are either regions of China or its neighbouring countries.

#### Analysis of recovered, confirmed and death cases across all states of China

In [None]:
# Select several columns with a list; tuple-style selection is not supported by recent pandas
data_cn = data.groupby('Country/Region')[['Recovered','Death']].sum()
data_cn.loc['Mainland China']
#sn.barplot(x = data_cn.loc['Mainland China'].index,y=data_cn.loc['Mainland China'].values)

fig = go.Figure(go.Bar(y=data_cn.loc['Mainland China'].values,
                       x=data_cn.loc['Mainland China'].index,
                       orientation='v',
        marker={
        'color': [np.random.randint(0,225) for x in range(0,2)]
        
        }
        ))

fig.update_layout(yaxis=dict(title='Values'),width=900,height=500,title='Total number of recovered and deaths cases in China',
                 xaxis=dict(title='Status'))

fig.show()

The number of recovered cases is higher than the number of deaths.

In [None]:
data_china = data[data['Country/Region'] == 'Mainland China']
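# Keep the four count columns in a fixed order so they match the bar labels used later
data_china_val = data_china.groupby(['Country/Region'])[['Confirmed','Suspected','Recovered','Death']].sum()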

data_chinac = data_china[(data_china['Recovered']!= 0) | (data_china['Death']!=0.0)]

st = data_chinac['Province/State']
rec = data_chinac['Recovered']
de = data_chinac['Death']

fig = go.Figure()

fig.add_trace(go.Scatter(y=rec, x=st,name='Recovered'))
fig.add_trace(go.Bar(x=st,
                       y=rec,showlegend=False,
                       orientation='v'))


fig.add_trace(go.Scatter(x=st, y=de,name='Death'))
fig.add_trace(go.Bar(x=st,
                       y=de,showlegend=False,
                       orientation='v'))

fig.update_layout(yaxis=dict(title='Recovered & Death'),width=900,height=500,
                  title="Analysis of China's Recovered and Death cases",
                  xaxis=dict(title='Province/State'))

fig.show()

In [None]:
china_locations = pa.read_csv('../input/china-locations-states/china_locations_states.csv')
china_locations = china_locations.rename({'admin':'Province/State'},axis=1)
china_locations = china_locations.drop(['population'],axis=1)
# Average only the coordinates used for the map markers
china_locations = china_locations.groupby('Province/State')[['lat','lng']].mean()
china_locations.reset_index(inplace=True)

data_china_loc = data_china.groupby('Province/State')[['Confirmed','Suspected','Recovered','Death']].sum()
data_china_loc.reset_index(inplace=True)
data_china_loc = data_china_loc[(data_china_loc['Confirmed'] >0) | (data_china_loc['Suspected'] >0) |
                                (data_china_loc['Recovered'] >0)]

final_china_state = pa.merge(data_china_loc,china_locations, on='Province/State')

print('Blue represents deaths and orange represents recovered cases')

folium_map = folium.Map(location=[32.8617,109.1954],zoom_start=5,tiles='CartoDB dark_matter')
# One marker colour per status: orange for recovered, blue for deaths
status_colors = {'Recovered': '#E37222', 'Death': '#0A8A9F'}
for index, row in final_china_state.iterrows():
    for status, color in status_colors.items():
        # Scale the count down so that it can be used as a marker radius
        value = row[status] / 10
        if value != 0:
            folium.CircleMarker(location=(row['lat'],row['lng']),radius=value,color=color,
                                popup = ('<strong><u>State</u></strong>: ' + str(row['Province/State']).capitalize())
                                ,fill=True).add_to(folium_map)
    
folium_map

Hubei has the highest number of recoveries and deaths of all the provinces.
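
In the map above the marker radius is the raw count divided by 10, so a province with very large counts can cover its neighbours. One possible alternative, which is an assumption and not what this notebook does, is square-root scaling, where the marker area rather than the radius grows in proportion to the count. The helper `marker_radius` and its `scale` parameter are hypothetical names used only for this sketch.

```python
import math

def marker_radius(count, scale=1.0):
    """Hypothetical helper: radius whose circle area grows linearly with count."""
    if count <= 0:
        return 0.0
    return scale * math.sqrt(count)

# Compare a few counts: a 100x larger count gives a 10x larger radius.
for count in (0, 4, 100, 400):
    print(count, marker_radius(count))
```

The returned value could be passed as the `radius` argument of `folium.CircleMarker` in place of `row[status] / 10`.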

There is currently a severe shortage of medical supplies, not just in Wuhan but in surrounding cities as well, the governor of Hubei Province, Wang Xiaodong, said at a press conference on Wednesday.
The mask shortage has become a country-wide problem since the new coronavirus outbreak spread domestically. Everyone who goes outside is advised to wear a mask, but masks are hard to get. Moreover, even the ordinary surgical masks that experts recommend have to be replaced every four hours.

The plot above also suggests that China is recovering quite fast.

Source - https://news.cgtn.com/news/2020-01-30/Hubei-has-a-severe-shortage-of-medical-supplies-says-governor-NFDtX4DR7i/index.html

In [None]:
values = data_china_val.loc['Mainland China'].values

fig = go.Figure(go.Bar(y=values,
                       x=['Confirmed','Suspected','Recovered','Death'],
                       orientation='v',
        marker={
        'color': [np.random.randint(100,255) for x in range(0,len(values))],
        'colorscale': 'Viridis'
        }
        ))

fig.update_layout(yaxis=dict(title='Values'),width=900,height=500,
                  title='Analysis China Recovered and Confirmed and Death',
                  xaxis=dict(title='Status'))
fig.show()

#### Confirmed cases by Province/State in China

In [None]:
fig = go.Figure(go.Treemap(
    
    labels = data_china['Province/State'],
    values = data_china.Confirmed,
    parents = data_china['Country/Region']
))

fig.show()

In [None]:
folium_map = folium.Map(location=[35.8617,104.1954],zoom_start=5,tiles='CartoDB dark_matter')

for index, row in final_china_state.iterrows():
    folium.CircleMarker(location=(row['lat'],row['lng']),radius=row['Confirmed']/400
                        ,color='#E37222',
                        popup = ('<strong><u>State</u></strong>: ' + str(row['Province/State']).capitalize())
                        ,fill=True).add_to(folium_map)


folium_map

### Analysing Hubei

In [None]:
z = original_data[original_data['Province/State'] == 'Hubei']
z = z.fillna(0)

### Check the number of confirmed cases for each update time in Hubei 

In [None]:
z['Last Update'] = pa.to_datetime(z['Last Update'])
df = z.pivot_table(columns=z['Last Update'].dt.hour,index = z['Last Update'].dt.day,values='Confirmed')
df = df.fillna(0)

matplotlib.rcParams.update({'font.size': 12})

def pie_heatmap(table, cmap='coolwarm_r', vmin=None, vmax=None,inner_r=0.25):
    n, m = table.shape
    vmin= table.min().min() if vmin is None else vmin
    vmax= table.max().max() if vmax is None else vmax

    centre_circle = plt.Circle((0,0),inner_r,edgecolor='black',facecolor='white',fill=True,linewidth=0.3)
    plt.gcf().gca().add_artist(centre_circle)
    norm = mpl.colors.Normalize(vmin=vmin, vmax=vmax)
    cmapper = cm.ScalarMappable(norm=norm, cmap=cmap)
  
    for i, (row_name, row) in enumerate(table.iterrows()):
        labels = None if i > 0 else table.columns
        wedges = plt.pie([1] * m,radius=inner_r+float(n-i)/n, colors=[cmapper.to_rgba(x) for x in row.values], 
            labels=labels, startangle=90, counterclock=False, wedgeprops={'linewidth':-1})
        plt.setp(wedges[0], edgecolor='grey',linewidth=1.8)
        wedges = plt.pie([1], radius=inner_r+float(n-i-1.2)/n, colors=['w'], labels=[row_name], startangle=-90, wedgeprops={'linewidth':0})
        plt.setp(wedges[0], edgecolor='grey',linewidth=1.8)
        
plt.figure(figsize=(8,8))
plt.title("Timewheel of Hour Vs Date",y=1.08,fontsize=30)
pie_heatmap(df,vmin=-10,vmax=25,inner_r=0.25)

**In the timewheel above, the labels around the edge [0,9,11,12,13,14,18,19,20,21,22,23] are the hour of the update (0 means 12:00 am, and so on), and the ring labels [21,22,23,...,31] are the day of the month.**

**For example, on 24 January the number of confirmed cases was updated twice: once at 12:00 am and again at 12:00 pm.**
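
The table behind the timewheel is a pivot of the update timestamps into day rows and hour columns. The following is a minimal sketch of that step on made-up updates; the frame `toy`, its values and the helper columns `day` and `hour` are illustrative assumptions.

```python
import pandas as pd

# Hypothetical update log (illustrative values, not taken from the dataset).
toy = pd.DataFrame({
    'Last Update': pd.to_datetime(['2020-01-24 00:00', '2020-01-24 12:00',
                                   '2020-01-25 12:00']),
    'Confirmed': [549, 729, 1052],
})

# Split each timestamp into a day-of-month row key and an hour column key.
toy['day'] = toy['Last Update'].dt.day
toy['hour'] = toy['Last Update'].dt.hour

# One row per day, one column per hour; hours without an update become 0.
wheel = toy.pivot_table(index='day', columns='hour', values='Confirmed').fillna(0)

print(wheel)
```

Each row of this table becomes one ring of the wheel and each column one wedge, so a day with two updates shows two coloured wedges.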



In [None]:
import plotly.express as px
fig = go.Figure()

fig.add_trace(go.Scatter(x=z['Last Update'], y=z['Confirmed'],name="Confirmed"))
fig.add_trace(go.Bar(x=z['Last Update'],
                       y=z['Confirmed'],
                       showlegend=False,
                       orientation='v'))


fig.add_trace(go.Scatter(x=z['Last Update'], y=z['Recovered'],name="Recovered"))
fig.add_trace(go.Bar(x=z['Last Update'],
                       y=z['Recovered'],
                       showlegend=False,
                       orientation='v'))

fig.add_trace(go.Scatter(x=z['Last Update'], y=z['Death'],name="Death"))
fig.add_trace(go.Bar(x=z['Last Update'],
                       y=z['Death'],
                       showlegend=False,
                       orientation='v'))

fig.add_trace(go.Scatter(x=z['Last Update'], y=z['Suspected'],name="Suspected"))
fig.add_trace(go.Bar(x=z['Last Update'],
                       y=z['Suspected'],
                       showlegend=False,
                       orientation='v'))

fig.update_layout(yaxis=dict(title='Values'),width=900,height=500,
                  title='Analysis of Death/Recovered/Confirmed in Hubei with time',
                  xaxis=dict(title='Time in Date'))

The plot above clearly shows that the number of victims of this virus is increasing in Hubei. The virus originated in Wuhan, the capital of Central China’s Hubei province.
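
The trend can also be checked numerically instead of being read off the plot. The sketch below uses made-up cumulative counts and derives the new cases between updates with `diff()`; the frame `toy`, its values and the column name `New confirmed` are illustrative assumptions.

```python
import pandas as pd

# Hypothetical cumulative counts (illustrative values only).
toy = pd.DataFrame({
    'Last Update': pd.to_datetime(['2020-01-22', '2020-01-23',
                                   '2020-01-24', '2020-01-25']),
    'Confirmed': [444, 549, 729, 1052],
}).sort_values('Last Update')

# New cases since the previous update; the first row has no predecessor.
toy['New confirmed'] = toy['Confirmed'].diff().fillna(0).astype(int)

# True when the cumulative count never decreases over time.
is_increasing = bool(toy['Confirmed'].is_monotonic_increasing)

print(toy)
print('Confirmed count never decreases:', is_increasing)
```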

In [None]:
data_country = data.groupby('Country/Region')['Confirmed'].sum()

# 'autotick' is not a valid colorbar attribute in current plotly, and projection types are lowercase
worldmap = [dict(type = 'choropleth', locations = data_country.index, locationmode = 'country names',
                 z = data_country.values, colorscale = "Inferno", reversescale = True, 
                 marker = dict(line = dict( width = 0.5)), 
                 colorbar = dict(title = 'Number of Confirmed cases'))]

layout = dict(title = 'Confirmed coronavirus cases across the world', geo = dict(showframe = False, showcoastlines = True, 
                                                                projection = dict(type = 'mercator')))

fig = dict(data=worldmap, layout=layout)
py.iplot(fig, validate=False)
