
**This notebook is a continuation of https://www.kaggle.com/sauravmishra1710/covid-19-sars-cov-2-a-geo-statistical-analysis/**

# What is Coronavirus?

    Coronaviruses (CoV) are a large family of viruses that cause illness ranging from the common cold to more severe 
    diseases such as Middle East Respiratory Syndrome (MERS-CoV) and Severe Acute Respiratory Syndrome (SARS-CoV). 
    A novel coronavirus (nCoV) is a new strain that has not been previously identified in humans.  

    Coronaviruses are zoonotic, meaning they are transmitted between animals and people.  Detailed investigations found 
    that SARS-CoV was transmitted from civet cats to humans and MERS-CoV from dromedary camels to humans. Several 
    known coronaviruses are circulating in animals that have not yet infected humans. 

    Common signs of infection include respiratory symptoms, fever, cough, shortness of breath and breathing difficulties.
    In more severe cases, infection can cause pneumonia, severe acute respiratory syndrome, kidney failure and even death. 

    Standard recommendations to prevent infection spread include regular hand washing, covering mouth and nose when coughing
    and sneezing, and thoroughly cooking meat and eggs. Avoid close contact with anyone showing symptoms of respiratory illness
    such as coughing and sneezing.


    The World Health Organization has declared the novel coronavirus outbreak a public health emergency, which has 
    increased fear among the general public. Many countries have heightened their measures to fight the virus while the 
    situation in China remains sensitive. More than 20 countries and territories outside mainland China have confirmed 
    cases of the virus, spanning Asia, Europe, North America and the Middle East, as India, Italy and the Philippines 
    reported their first cases.



# Data Source

1. The data has been shared on Kaggle at https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset.


2. Daily updated data is also made available by Johns Hopkins University at 
    https://github.com/CSSEGISandData/COVID-19


# Interactive Real Time Data Visualizations

1. An interactive real-time data visualization provided by Johns Hopkins University - https://arcg.is/0fHmTX


2. An interactive web-based dashboard to track COVID-19 in real time -    https://www.thelancet.com/journals/laninf/article/PIIS1473-3099(20)30120-1/fulltext


3. Real time data visuals - https://www.worldometers.info/coronavirus/


4. World Health Organization Coverage on Covid-19 - https://www.who.int/health-topics/coronavirus


# Previous Work

   Initially I worked on the same dataset with a more statistical approach. The work can be viewed at https://www.kaggle.com/sauravmishra1710/covid-19-sars-cov-2-a-geo-statistical-analysis
   That work focused on statistical and time series analysis. This notebook is a continuation of the previous one and concentrates more on geographical maps. There will
   also be some treemap based visuals.


   This analysis leans on what I have learned from other works on the platform. A real inspiration to start this all over again has been the notebook at https://www.kaggle.com/imdevskp/corona-virus-covid-19-ncov-19-outbreak-analysis. It has some really nice, innovative visualizations and has been a great source for learning about treemaps and geographical animations.

# Getting Started - Import Required Packages & Libraries

In [2]:
# import the necessary libraries

import numpy as np 
import pandas as pd
from datetime import date
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import Markdown, display
import plotly.graph_objs as go
import plotly.offline as py
from plotly.subplots import make_subplots
import plotly.express as px
from plotly.offline import init_notebook_mode, plot, iplot, download_plotlyjs
import plotly as ply
import pycountry
import folium 
from folium import plugins
import json


%config InlineBackend.figure_format = 'retina'
init_notebook_mode(connected=True)


# Utility Functions

def formatted_text(string):
    '''Display markdown formatted output like bold, italic bold etc.'''
    display(Markdown(string))


def highlight_max(data, color='yellow'):
    '''Highlight the maximum in a Series or DataFrame.'''
    attr = 'background-color: {}'.format(color)
    if data.ndim == 1:  # Series from .apply(axis=0) or axis=1
        is_max = data == data.max()
        return [attr if v else '' for v in is_max]
    else:  # from .apply(axis=None)
        is_max = data == data.max().max()
        return pd.DataFrame(np.where(is_max, attr, ''), index=data.index, columns=data.columns)

# Import Dataset

In [3]:
covid_19 = pd.read_csv('./corona-virus-report/covid_19_clean_complete.csv', parse_dates=['Date'])

In [4]:
print("Covid_19 Shape", covid_19.shape)

Covid_19 Shape (3636, 8)


In [5]:
covid_19.head()

Unnamed: 0,Province/State,Country/Region,Lat,Long,Date,Confirmed,Deaths,Recovered
0,Anhui,Mainland China,31.82571,117.2264,2020-01-22,1,0,0
1,Beijing,Mainland China,40.18238,116.4142,2020-01-22,14,0,0
2,Chongqing,Mainland China,30.05718,107.874,2020-01-22,6,0,0
3,Fujian,Mainland China,26.07783,117.9895,2020-01-22,1,0,0
4,Gansu,Mainland China,36.0611,103.8343,2020-01-22,0,0,0


# Data pre-processing

In [6]:
formatted_text('***Covid 19 data information -***')
covid_19.info()

***Covid 19 data information -***

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3636 entries, 0 to 3635
Data columns (total 8 columns):
Province/State    2160 non-null object
Country/Region    3636 non-null object
Lat               3636 non-null float64
Long              3636 non-null float64
Date              3636 non-null datetime64[ns]
Confirmed         3636 non-null int64
Deaths            3636 non-null int64
Recovered         3636 non-null int64
dtypes: datetime64[ns](1), float64(2), int64(3), object(2)
memory usage: 227.4+ KB


In [7]:
formatted_text('***NULL values in the data -***')
covid_19.isnull().sum()

***NULL values in the data -***

Province/State    1476
Country/Region       0
Lat                  0
Long                 0
Date                 0
Confirmed            0
Deaths               0
Recovered            0
dtype: int64

**Initial Insights**

- The **'Province/State'** column has some missing values. As we do not know the state for these records, 
  let's fill them in with the corresponding country name for now.
   
- The column names **'Province/State'** and **'Country/Region'** can be simplified. Let's rename them to 'State' and  
  'Country' respectively.
  
- China is recorded as 'Mainland China'. We will rename it to 'China'.

In [8]:
# Make sure the 'Date' column is a datetime object
covid_19['Date'] = pd.to_datetime(covid_19['Date'])

# Fill the missing values in 'Province/State' with the country name.
covid_19['Province/State'] = covid_19['Province/State'].fillna(covid_19['Country/Region'])

# Rename the columns 'Province/State' and 'Country/Region' to 'State' and 'Country'.
covid_19.rename(columns={'Country/Region': 'Country', 'Province/State': 'State'}, inplace=True)

# Convert 'Mainland China' to 'China'
covid_19['Country'] = covid_19['Country'].replace('Mainland China', 'China')

# Data Glimpse
covid_19.head()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered
0,Anhui,China,31.82571,117.2264,2020-01-22,1,0,0
1,Beijing,China,40.18238,116.4142,2020-01-22,14,0,0
2,Chongqing,China,30.05718,107.874,2020-01-22,6,0,0
3,Fujian,China,26.07783,117.9895,2020-01-22,1,0,0
4,Gansu,China,36.0611,103.8343,2020-01-22,0,0,0


In [9]:
# Check for the missing values again to ensure that there are no more remaining
formatted_text('***NULL values in the data -***')
covid_19.isnull().sum()

***NULL values in the data -***

State        0
Country      0
Lat          0
Long         0
Date         0
Confirmed    0
Deaths       0
Recovered    0
dtype: int64

**==>`The data looks clean now. This should be good to continue with further analysis`**.

In [10]:
# Lets check the total #Countries affected by nCoV

formatted_text('***Affected Countries -***')
Covid_19_Countries = covid_19['Country'].unique().tolist()
print(Covid_19_Countries)
print("\n------------------------------------------------------------------")
print("\nTotal countries affected by nCoV: ",len(Covid_19_Countries))

***Affected Countries -***

['China', 'Thailand', 'Japan', 'South Korea', 'Taiwan', 'US', 'Macau', 'Hong Kong', 'Singapore', 'Vietnam', 'France', 'Nepal', 'Malaysia', 'Canada', 'Australia', 'Cambodia', 'Sri Lanka', 'Germany', 'Finland', 'United Arab Emirates', 'Philippines', 'India', 'Italy', 'UK', 'Russia', 'Sweden', 'Spain', 'Belgium', 'Others', 'Egypt', 'Iran', 'Lebanon', 'Iraq', 'Oman', 'Afghanistan', 'Bahrain', 'Kuwait', 'Algeria', 'Croatia', 'Switzerland', 'Austria', 'Israel', 'Pakistan', 'Brazil', 'Georgia', 'Greece', 'North Macedonia', 'Norway', 'Romania']

------------------------------------------------------------------

Total countries affected by nCoV:  49


So there are a total of 49 entries in the country list.

One stand-out in the above list is the item **`Others`**. Let's check what these records are.

In [11]:
# Now lets see the Country - 'Others' which is there in the list above
formatted_text('***Affected Country - Others***')
covid_19[covid_19['Country'] == 'Others']

***Affected Country - Others***

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered
71,Diamond Princess cruise ship,Others,35.4437,139.638,2020-01-22,0,0,0
172,Diamond Princess cruise ship,Others,35.4437,139.638,2020-01-23,0,0,0
273,Diamond Princess cruise ship,Others,35.4437,139.638,2020-01-24,0,0,0
374,Diamond Princess cruise ship,Others,35.4437,139.638,2020-01-25,0,0,0
475,Diamond Princess cruise ship,Others,35.4437,139.638,2020-01-26,0,0,0
576,Diamond Princess cruise ship,Others,35.4437,139.638,2020-01-27,0,0,0
677,Diamond Princess cruise ship,Others,35.4437,139.638,2020-01-28,0,0,0
778,Diamond Princess cruise ship,Others,35.4437,139.638,2020-01-29,0,0,0
879,Diamond Princess cruise ship,Others,35.4437,139.638,2020-01-30,0,0,0
980,Diamond Princess cruise ship,Others,35.4437,139.638,2020-01-31,0,0,0


This is the **`Diamond Princess cruise ship`**, which has been quarantined at **`Yokohama, Japan`** since early February. The number of patients infected with the COVID-19 coronavirus aboard the ship has continued to rise, making it the largest cluster of the virus outside China.

With nearly 6% of the 3,711 passengers and crew members now infected, the 952-foot cruise ship also has the **`highest infection rate of the coronavirus`** anywhere in the world. Wuhan, China, the city where the disease is believed to have originated, has nearly 33,000 official cases, but spread across a population of more than 11 million, that is an infection rate of less than 0.3%.
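
The percentages quoted above can be reproduced with a few lines of arithmetic. This is only a quick check using the approximate figures from the text; the variable names are illustrative and not part of the notebook.

```python
# Figures quoted in the paragraph above (approximate, as reported at the time).
ship_population = 3711
ship_infection_rate = 0.06          # "nearly 6%"
wuhan_cases = 33_000                # "nearly 33,000 official cases"
wuhan_population = 11_000_000       # "more than 11 million"

# Number of infected people implied by the ship's infection rate.
implied_ship_cases = ship_population * ship_infection_rate

# Wuhan's rate; a population above 11 million pushes this below 0.3%.
wuhan_infection_rate = wuhan_cases / wuhan_population

print(f"Implied cases on board: about {implied_ship_cases:.0f}")
print(f"Wuhan infection rate: {wuhan_infection_rate:.2%}")
print(f"Ship rate / Wuhan rate: about {ship_infection_rate / wuhan_infection_rate:.0f}x")
```

Under these rounded inputs the ship's infection rate comes out roughly twenty times that of Wuhan.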

Health experts say the Diamond Princess highlights the high risk that an infection will spread in confined spaces. According to the U.S. Centers for Disease Control and Prevention, **`close-contact environments such as cruises can facilitate the transmission of viruses through droplets or contaminated surfaces`**.

Deaths have been reported from the ship. Two Japanese passengers who had been on the quarantined Diamond Princess cruise ship have died after being infected with the novel coronavirus. Japan's health ministry says the male and female passengers were hospitalized last week. They were both in their 80s.
The man and woman are the first Diamond Princess passengers to die during the virus outbreak. The cruise ship has been under quarantine at Yokohama's port near Tokyo since Feb. 3.

Source - 

 - https://time.com/5783451/covid-19-princess-diamond-cruise-ship/
 - https://www.channelnewsasia.com/news/asia/covid19-coronavirus-diamond-princess-japan-79-test-positive-12450498
 - https://www.npr.org/sections/goatsandsoda/2020/02/20/807745305/coronavirus-2-princess-diamond-cruise-ship-passengers-die-after-contracting-covi

# Diamond Princess Cruise Ship Data

In [12]:
# Lets create a subset of the data for the cruise ship

diamond_cruise_ship_cases = covid_19[covid_19['Country'] == 'Others']

# Data Glimpse
diamond_cruise_ship_cases.head()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered
71,Diamond Princess cruise ship,Others,35.4437,139.638,2020-01-22,0,0,0
172,Diamond Princess cruise ship,Others,35.4437,139.638,2020-01-23,0,0,0
273,Diamond Princess cruise ship,Others,35.4437,139.638,2020-01-24,0,0,0
374,Diamond Princess cruise ship,Others,35.4437,139.638,2020-01-25,0,0,0
475,Diamond Princess cruise ship,Others,35.4437,139.638,2020-01-26,0,0,0


# World Data

In [13]:
# Now that we have created a different subset for the cruise ship data, lets derive a subset with only the country data
covid_19_world_data = covid_19[covid_19['Country'] != 'Others']

formatted_text('***World Data -***')
# Data Glimpse
covid_19_world_data.head()

***World Data -***

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered
0,Anhui,China,31.82571,117.2264,2020-01-22,1,0,0
1,Beijing,China,40.18238,116.4142,2020-01-22,14,0,0
2,Chongqing,China,30.05718,107.874,2020-01-22,6,0,0
3,Fujian,China,26.07783,117.9895,2020-01-22,1,0,0
4,Gansu,China,36.0611,103.8343,2020-01-22,0,0,0


In [14]:
formatted_text('***World Data Countries Affected -***')

print(covid_19_world_data.Country.unique().tolist())
print("\nTotal number of countries: ", covid_19_world_data.Country.nunique())

***World Data Countries Affected -***

['China', 'Thailand', 'Japan', 'South Korea', 'Taiwan', 'US', 'Macau', 'Hong Kong', 'Singapore', 'Vietnam', 'France', 'Nepal', 'Malaysia', 'Canada', 'Australia', 'Cambodia', 'Sri Lanka', 'Germany', 'Finland', 'United Arab Emirates', 'Philippines', 'India', 'Italy', 'UK', 'Russia', 'Sweden', 'Spain', 'Belgium', 'Egypt', 'Iran', 'Lebanon', 'Iraq', 'Oman', 'Afghanistan', 'Bahrain', 'Kuwait', 'Algeria', 'Croatia', 'Switzerland', 'Austria', 'Israel', 'Pakistan', 'Brazil', 'Georgia', 'Greece', 'North Macedonia', 'Norway', 'Romania']

Total number of countries:  48


In [15]:
formatted_text('***Country and State wise grouped data -***')

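# Select several columns with a list (double brackets); tuple-style selection is not supported by recent pandas.
covid_19_country_wise_data = covid_19_world_data.groupby(['Country', 'State'])[['Confirmed', 'Deaths', 'Recovered']].max()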
covid_19_country_wise_data

***Country and State wise grouped data -***

Country,State,Confirmed,Deaths,Recovered
Afghanistan,Afghanistan,1,0,0
Algeria,Algeria,1,0,0
Australia,From Diamond Princess,7,0,0
Australia,New South Wales,4,0,4
Australia,Queensland,5,0,1
Australia,South Australia,2,0,2
Australia,Victoria,4,0,4
Austria,Austria,2,0,0
Bahrain,Bahrain,33,0,0
Belgium,Belgium,1,0,1


### Latest World Data

In [16]:
strDate = covid_19_world_data['Date'][-1:].astype('str')
year = int(strDate.values[0].split('-')[0])
month = int(strDate.values[0].split('-')[1])
day = int(strDate.values[0].split('-')[2].split()[0])

formatted_text('***Last reported case date-time***')
print(strDate)
print(year)
print(month)
print(strDate.values[0].split('-')[2].split())

***Last reported case date-time***

3635    2020-02-26
Name: Date, dtype: object
2020
2
['26']


In [17]:
latest_covid_19_data = covid_19_world_data[covid_19_world_data['Date'] == pd.Timestamp(date(year,month,day))]

# Assign the result instead of modifying the filtered slice in place.
latest_covid_19_data = latest_covid_19_data.reset_index(drop=True)

latest_covid_19_data.head()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered
0,Anhui,China,31.82571,117.2264,2020-02-26,989,6,744
1,Beijing,China,40.18238,116.4142,2020-02-26,400,4,235
2,Chongqing,China,30.05718,107.874,2020-02-26,576,6,384
3,Fujian,China,26.07783,117.9895,2020-02-26,294,1,218
4,Gansu,China,36.0611,103.8343,2020-02-26,91,2,81
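
The latest snapshot can also be selected without parsing the date string, by comparing the `Date` column against its own maximum. The sketch below uses a small made-up frame rather than the real data:

```python
import pandas as pd

# Made-up numbers standing in for the time series data.
toy = pd.DataFrame({
    'State': ['Anhui', 'Beijing', 'Anhui', 'Beijing'],
    'Date': pd.to_datetime(['2020-02-25', '2020-02-25', '2020-02-26', '2020-02-26']),
    'Confirmed': [10, 20, 11, 22],
})

# The most recent timestamp in the column, then the rows recorded on that date.
latest_date = toy['Date'].max()
latest = toy[toy['Date'] == latest_date].reset_index(drop=True)

print(latest_date.date())
print(latest)
```

This does not depend on the row order of the frame, whereas reading the last row assumes the data is sorted by date.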


In [18]:
CountryWiseData = latest_covid_19_data.groupby('Country')[['Confirmed', 'Deaths', 'Recovered']].sum()
CountryWiseData['Country'] = CountryWiseData.index
CountryWiseData.index = np.arange(1, len(CountryWiseData)+1)

CountryWiseData = CountryWiseData[['Country','Confirmed', 'Deaths', 'Recovered']]

formatted_text('***Country wise numbers of ''Confirmed'', ''Deaths'', ''Recovered'' Cases***')

#CountryWiseData = pd.merge(latest_covid_19_data[['Country', 'Lat','Long']], CountryWiseData, on='Country')

#CountryWiseData = CountryWiseData.drop_duplicates(subset = "Country", keep = 'first', inplace = True) 

CountryWiseData

***Country wise numbers of Confirmed, Deaths, Recovered Cases***

Unnamed: 0,Country,Confirmed,Deaths,Recovered
1,Afghanistan,1,0,0
2,Algeria,1,0,0
3,Australia,22,0,11
4,Austria,2,0,0
5,Bahrain,33,0,0
6,Belgium,1,0,1
7,Brazil,1,0,0
8,Cambodia,1,0,1
9,Canada,11,0,3
10,China,78065,2715,30053


In [20]:
# Import the world latitude/longitude data

world_lat_lon_coordinates = pd.read_csv('world_coordinates.csv')
world_lat_lon_coordinates.head()

Unnamed: 0,Code,Country,latitude,longitude
0,AD,Andorra,42.546245,1.601554
1,AE,United Arab Emirates,23.424076,53.847818
2,AF,Afghanistan,33.93911,67.709953
3,AG,Antigua and Barbuda,17.060816,-61.796428
4,AI,Anguilla,18.220554,-63.068615


In [21]:
# Merge the country coordinates above into the country wise data we created.

CountryWiseData = pd.merge(world_lat_lon_coordinates, CountryWiseData, on='Country')
CountryWiseData.head()

Unnamed: 0,Code,Country,latitude,longitude,Confirmed,Deaths,Recovered
0,AE,United Arab Emirates,23.424076,53.847818,13,0,4
1,AF,Afghanistan,33.93911,67.709953,1,0,0
2,AT,Austria,47.516231,14.550072,2,0,0
3,AU,Australia,-25.274398,133.775136,22,0,11
4,BE,Belgium,50.503887,4.469936,1,0,1
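
`pd.merge` performs an inner join by default, so any country whose name is spelled differently in the two tables is dropped without warning. A hedged sketch on toy data shows how a left join with `indicator=True` can surface such rows. The `name_fixes` mapping is hypothetical; the real list depends on the contents of the coordinates file.

```python
import pandas as pd

# Toy stand-ins for the two tables (values are illustrative).
cases = pd.DataFrame({'Country': ['Austria', 'US', 'UK'], 'Confirmed': [2, 5, 3]})
coords = pd.DataFrame({
    'Country': ['Austria', 'United States', 'United Kingdom'],
    'latitude': [47.5, 37.1, 55.4],
    'longitude': [14.6, -95.7, -3.4],
})

# A left join keeps every row from `cases`; `_merge` shows which ones found a match.
check = cases.merge(coords, on='Country', how='left', indicator=True)
unmatched = check.loc[check['_merge'] == 'left_only', 'Country'].tolist()
print(unmatched)

# Hypothetical name fixes applied before the real merge.
name_fixes = {'US': 'United States', 'UK': 'United Kingdom'}
fixed = cases.assign(Country=cases['Country'].replace(name_fixes))
merged = fixed.merge(coords, on='Country', how='inner')
print(len(merged))
```

Running a check like this before the merge makes it clear whether any countries would be missing from the maps that follow.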


# Map Visualizations

# Country wise distribution

In [22]:
WorldMap = folium.Map(location=[0,0], zoom_start=1.5,tiles='cartodbpositron')

formatted_text('***Click on the pin to view detailed stats***')

for lat, long, confirmed, deaths, recovered, country in zip(CountryWiseData['latitude'],
                                                           CountryWiseData['longitude'],
                                                           CountryWiseData['Confirmed'],
                                                           CountryWiseData['Deaths'],
                                                           CountryWiseData['Recovered'], 
                                                           CountryWiseData['Country']):

    # Red pin if the country has reported deaths, dark blue otherwise.
    pin_color = 'darkblue' if deaths == 0 else 'red'

    folium.Marker(location=[lat, long]
                , popup = ('<strong>nCov Numbers:</strong> ' + '<br>' + 
                           '<strong>Country:</strong> ' + str(country) + '<br>'
                           '<strong>Confirmed:</strong> ' + str(int(confirmed)) + '<br>'
                           '<strong>Deaths:</strong> ' + str(int(deaths)) + '<br>'
                           '<strong>Recovered:</strong> ' + str(int(recovered)) + '<br>')
                , icon=folium.Icon(color=pin_color, icon='info-sign')
                , tooltip = str(country)).add_to(WorldMap)
        
WorldMap

***Click on the pin to view detailed stats***
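
The popup HTML is assembled the same way for every map in this notebook, so it could be factored into a small helper. The functions below are hypothetical and not part of the original notebook. They only build strings, so they can be tried without folium installed.

```python
def build_popup(label, name, confirmed, deaths, recovered):
    """Return the HTML shown in a marker popup (hypothetical helper)."""
    rows = [
        '<strong>nCov Numbers:</strong> ',
        f'<strong>{label}:</strong> {name}',
        f'<strong>Confirmed:</strong> {int(confirmed)}',
        f'<strong>Deaths:</strong> {int(deaths)}',
        f'<strong>Recovered:</strong> {int(recovered)}',
    ]
    return '<br>'.join(rows) + '<br>'


def marker_color(deaths, no_deaths='darkblue', with_deaths='red'):
    """Pick the pin colour using the same rule as the loop above."""
    return no_deaths if deaths == 0 else with_deaths


print(build_popup('Country', 'Austria', 2, 0, 0))
print(marker_color(0), marker_color(3))
```

A marker could then be created with `popup=build_popup('Country', country, confirmed, deaths, recovered)` and `icon=folium.Icon(color=marker_color(deaths), icon='info-sign')`.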

# Country & State wise distribution

In [23]:
WorldMap = folium.Map(location=[0,0], zoom_start=1.5,tiles='Stamen Toner')

formatted_text('***Click on the dots to view detailed stats***')

for lat, long, confirmed, deaths, recovered, country, state in zip(latest_covid_19_data['Lat'],
                                                           latest_covid_19_data['Long'],
                                                           latest_covid_19_data['Confirmed'],
                                                           latest_covid_19_data['Deaths'],
                                                           latest_covid_19_data['Recovered'], 
                                                           latest_covid_19_data['Country'],
                                                           latest_covid_19_data['State']):

    # Red dot if the location has reported deaths, blue otherwise.
    dot_color = 'blue' if deaths == 0 else 'red'

    folium.CircleMarker(location=[lat, long]
                , radius=3
                , popup = ('<strong>nCov Numbers:</strong> ' + '<br>' + 
                           '<strong>Country:</strong> ' + str(country) + '<br>'
                           '<strong>State:</strong> ' + str(state) + '<br>'
                           '<strong>Confirmed:</strong> ' + str(int(confirmed)) + '<br>'
                           '<strong>Deaths:</strong> ' + str(int(deaths)) + '<br>'
                           '<strong>Recovered:</strong> ' + str(int(recovered)) + '<br>')
                , color=dot_color
                , tooltip = str(state)
                , fill_color=dot_color
                , fill_opacity=0.7).add_to(WorldMap)
        
WorldMap

***Click on the dots to view detailed stats***

# Choropleth Global

## Confirmed Cases

In [24]:
choropleth_map_confirmed = px.choropleth(CountryWiseData, locations='Country', 
                    locationmode='country names', color='Confirmed', 
                    hover_name='Country', range_color=[1,30], 
                    color_continuous_scale='reds', 
                    title='Covid-19 Globally Confirmed Countries')

choropleth_map_confirmed.update(layout_coloraxis_showscale=False)
iplot(choropleth_map_confirmed)

**`China is the country worst affected by Covid-19. The virus has spread to neighbouring countries, where cases have also been reported, 
although the numbers are not as high as in China. Some distant countries in Europe, North America and Australia have also seen cases of Covid-19. 
This could be because some of their citizens were in China at the time of the outbreak and unknowingly carried the virus with them on 
their return journey to their respective countries.`**

## Global Deaths 

In [25]:
choropleth_map_deaths = px.choropleth(CountryWiseData, locations='Country', locationmode='country names', color='Deaths', hover_name='Country', range_color=[1,30], 
                                      color_continuous_scale='reds', title='Covid-19 Global Deaths Numbers')

choropleth_map_deaths.update(layout_coloraxis_showscale=False)
iplot(choropleth_map_deaths)

**`As China has the most reported cases, it also has the highest number of deaths. The virus did not match any other known virus. 
This raised concern because when a virus is new, we do not know how it affects people. There were no existing medications available. 
This lack of timely medication may have contributed to the high number of deaths.`**

## Global Recovered Cases

In [26]:
choropleth_map_recovered = px.choropleth(CountryWiseData, locations='Country', 
                    locationmode='country names', color='Recovered', 
                    hover_name='Country', range_color=[1,30], 
                    color_continuous_scale='reds', 
                    title='Covid-19 Global Recovered Cases')

choropleth_map_recovered.update(layout_coloraxis_showscale=False)
iplot(choropleth_map_recovered)

**`The recovery rate has been a little slower than expected. The virus did not match any other known virus. 
This raised concern because when a virus is new, we do not know how it affects people. There were no existing medications available. 
However, a team of doctors in Thailand has seen some apparent success treating the coronavirus with a drug cocktail. The doctors combined the anti-flu drug oseltamivir with lopinavir 
and ritonavir, anti-virals used to treat HIV, Kriengsak said, adding that the ministry was awaiting research results to prove the findings.`**

Read more at: https://economictimes.indiatimes.com/news/international/world-news/thailand-sees-apparent-success-treating-virus-with-drug-cocktail/articleshow/73879572.cms?utm_source=contentofinterest&utm_medium=text&utm_campaign=cppst

# China State wise distribution

In [27]:
chinese_data_over_time = covid_19[(covid_19['Country'] == 'China')]
chinese_data_over_time.head()

Unnamed: 0,State,Country,Lat,Long,Date,Confirmed,Deaths,Recovered
0,Anhui,China,31.82571,117.2264,2020-01-22,1,0,0
1,Beijing,China,40.18238,116.4142,2020-01-22,14,0,0
2,Chongqing,China,30.05718,107.874,2020-01-22,6,0,0
3,Fujian,China,26.07783,117.9895,2020-01-22,1,0,0
4,Gansu,China,36.0611,103.8343,2020-01-22,0,0,0


In [28]:
china_statewise_data = chinese_data_over_time.groupby(['State'])[['Confirmed', 'Deaths', 'Recovered']].max()

china_statewise_data['State'] = china_statewise_data.index
china_statewise_data.index = np.arange(1, len(china_statewise_data)+1)

china_statewise_data = china_statewise_data[['State','Confirmed', 'Deaths', 'Recovered']]

formatted_text('***State wise numbers of ''Confirmed'', ''Deaths'', ''Recovered'' Cases***')

china_statewise_data.head()

***State wise numbers of Confirmed, Deaths, Recovered Cases***

Unnamed: 0,State,Confirmed,Deaths,Recovered
1,Anhui,989,6,744
2,Beijing,400,4,235
3,Chongqing,576,6,384
4,Fujian,294,1,218
5,Gansu,91,2,81


In [29]:
# Extract the state latitude and longitude coordinates from the time series data.
# Chaining drop_duplicates() returns a new frame and avoids modifying a slice in place.
china_coordinates = chinese_data_over_time[['State','Lat','Long']].drop_duplicates(keep='first')

china_coordinates.index = np.arange(1, len(china_coordinates)+1)

china_coordinates.head()

Unnamed: 0,State,Lat,Long
1,Anhui,31.82571,117.2264
2,Beijing,40.18238,116.4142
3,Chongqing,30.05718,107.874
4,Fujian,26.07783,117.9895
5,Gansu,36.0611,103.8343


In [30]:
china_statewise_data = pd.merge(china_coordinates, china_statewise_data, on='State')

china_statewise_data.head()

Unnamed: 0,State,Lat,Long,Confirmed,Deaths,Recovered
0,Anhui,31.82571,117.2264,989,6,744
1,Beijing,40.18238,116.4142,400,4,235
2,Chongqing,30.05718,107.874,576,6,384
3,Fujian,26.07783,117.9895,294,1,218
4,Gansu,36.0611,103.8343,91,2,81


# China - Distribution on Map

In [31]:
china_lat = 35.8617
china_lon = 104.1954

formatted_text('***Click on the pin to view detailed stats***')

ChinaMap = folium.Map(location=[china_lat, china_lon], zoom_start=4, tiles='cartodbpositron')

for lat, long, confirmed, deaths, recovered, state in zip(china_statewise_data['Lat'],
                                                           china_statewise_data['Long'],
                                                           china_statewise_data['Confirmed'],
                                                           china_statewise_data['Deaths'],
                                                           china_statewise_data['Recovered'], 
                                                           china_statewise_data['State']):
    
    # Red pin if the state has reported deaths, dark blue otherwise.
    pin_color = 'darkblue' if deaths == 0 else 'red'

    folium.Marker(location=[lat, long]
                , popup = ('<strong>nCov Numbers:</strong> ' + '<br>' + 
                             '<strong>State:</strong> ' + str(state).capitalize() + '<br>'
                             '<strong>Confirmed:</strong> ' + str(int(confirmed)) + '<br>'
                             '<strong>Deaths:</strong> ' + str(int(deaths)) + '<br>'
                             '<strong>Recovered:</strong> ' + str(int(recovered)) + '<br>')
                , icon=folium.Icon(color=pin_color, icon='info-sign')
                , tooltip = str(state).capitalize()).add_to(ChinaMap)
    
    
ChinaMap

***Click on the pin to view detailed stats***

# Choropleth China

In [33]:
# Load the China GeoJSON file

with open('china_geojson.json') as file:
    china = json.load(file)
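
A choropleth trace matches each entry in `locations` against an identifier in the GeoJSON features, so states whose names do not appear in the file are left blank on the map. The sketch below checks for such mismatches on a toy FeatureCollection. The structure of `china_geojson.json` is not shown here, so storing the name under `properties['name']` is an assumption, and `unmatched_locations` is a hypothetical helper.

```python
# Toy GeoJSON shaped like a FeatureCollection; the 'name' property is an
# assumption, since the real file's property names are not shown.
toy_geojson = {
    'type': 'FeatureCollection',
    'features': [
        {'type': 'Feature', 'id': 'Anhui', 'properties': {'name': 'Anhui'}, 'geometry': None},
        {'type': 'Feature', 'id': 'Beijing', 'properties': {'name': 'Beijing'}, 'geometry': None},
    ],
}
states = ['Anhui', 'Beijing', 'Inner Mongolia']


def unmatched_locations(geojson, locations, key='name'):
    """Return the locations that have no feature with a matching property."""
    known = {feature['properties'].get(key) for feature in geojson['features']}
    return sorted(set(locations) - known)


print(unmatched_locations(toy_geojson, states))
```

If the identifiers live under a property rather than the top-level feature `id`, the trace's `featureidkey` argument (for example `'properties.name'`) tells plotly where to look.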

## Confirmed Cases

In [34]:
china_conf_choropleth = go.Figure(go.Choroplethmapbox(geojson=china, locations=china_statewise_data['State'],
                                                      z=china_statewise_data['Confirmed'], colorscale='Aggrnyl',
                                                      zmin=0, zmax=10000, marker_opacity=0.5, marker_line_width=0))

china_conf_choropleth.update_layout(mapbox_style="carto-positron", mapbox_zoom=3, 
                                    mapbox_center = {"lat": china_lat, "lon": china_lon})

china_conf_choropleth.update_layout(margin={"r":0,"t":0,"l":0,"b":0})

iplot(china_conf_choropleth)

**`Hubei is the worst affected state in China, with almost 92% of the reported cases. The outbreak is believed to have started in Wuhan, Hubei.`**
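
A share like the one quoted above can be computed directly from a state wise table. This is a minimal sketch with made-up numbers standing in for `china_statewise_data`, so the printed percentage is illustrative only.

```python
import pandas as pd

# Made-up numbers standing in for china_statewise_data.
toy = pd.DataFrame({
    'State': ['Hubei', 'Anhui', 'Beijing'],
    'Confirmed': [900, 60, 40],
})

# Share of the national total contributed by each state, in percent.
toy['Share (%)'] = 100 * toy['Confirmed'] / toy['Confirmed'].sum()

# The state with the largest share.
top = toy.sort_values('Share (%)', ascending=False).iloc[0]
print(top['State'], round(top['Share (%)'], 1))
```

Applying the same two lines to the real table gives the exact share for the latest date in the data.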

## Death Cases

In [35]:
china_death_choropleth = go.Figure(go.Choroplethmapbox(geojson=china, locations=china_statewise_data['State'],
                                                      z=china_statewise_data['Deaths'], colorscale='Sunset',
                                                      zmin=0, zmax=3000, marker_opacity=0.5, marker_line_width=0))

china_death_choropleth.update_layout(mapbox_style="carto-positron", mapbox_zoom=3, 
                                    mapbox_center = {"lat": china_lat, "lon": china_lon})

china_death_choropleth.update_layout(margin={"r":0,"t":0,"l":0,"b":0})

iplot(china_death_choropleth)

## Recovered Cases

In [36]:
china_recovered_choropleth = go.Figure(go.Choroplethmapbox(geojson=china, locations=china_statewise_data['State'],
                                                      z=china_statewise_data['Recovered'], colorscale='Brbg',
                                                      zmin=0, zmax=8000, marker_opacity=0.5, marker_line_width=0))

china_recovered_choropleth.update_layout(mapbox_style="carto-positron", mapbox_zoom=3, 
                                    mapbox_center = {"lat": china_lat, "lon": china_lon})

china_recovered_choropleth.update_layout(margin={"r":0,"t":0,"l":0,"b":0})

iplot(china_recovered_choropleth)

**`Recovery has been a little slower than expected. The virus did not match any other known virus, which raised concern: when a virus is new, we do not 
know how it affects people, and no existing medications were available. However, a team of doctors in Thailand has seen some apparent success treating the coronavirus with a 
drug cocktail. The doctors combined the anti-flu drug oseltamivir with lopinavir and ritonavir, antivirals used to treat HIV, Kriengsak said, adding that the ministry was 
awaiting research results to prove the findings.`**

Read more at: https://economictimes.indiatimes.com/news/international/world-news/thailand-sees-apparent-success-treating-virus-with-drug-cocktail/articleshow/73879572.cms?utm_source=contentofinterest&utm_medium=text&utm_campaign=cppst

# Rest of World

In [37]:
rest_of_world = CountryWiseData[CountryWiseData['Country'] != 'China']
rest_of_world.head()

Unnamed: 0,Code,Country,latitude,longitude,Confirmed,Deaths,Recovered
0,AE,United Arab Emirates,23.424076,53.847818,13,0,4
1,AF,Afghanistan,33.93911,67.709953,1,0,0
2,AT,Austria,47.516231,14.550072,2,0,0
3,AU,Australia,-25.274398,133.775136,22,0,11
4,BE,Belgium,50.503887,4.469936,1,0,1


## Rest of World - Confirmed Cases

In [38]:
rest_of_world_confirmed = px.choropleth(rest_of_world, locations='Country', 
                    locationmode='country names', color='Confirmed', 
                    hover_name='Country', range_color=[1, 20], 
                    color_continuous_scale='Geyser', 
                    title='Covid-19 Rest of World Confirmed Cases')

iplot(rest_of_world_confirmed)

## Rest of World - Death Cases

In [39]:
rest_of_world_death = px.choropleth(rest_of_world, locations='Country', 
                    locationmode='country names', color='Deaths', 
                    hover_name='Country', range_color=[0, 1], 
                    color_continuous_scale='Picnic', 
                    title='Covid-19 Rest of World Death Cases')

iplot(rest_of_world_death)

**`Only a small number of deaths have been reported outside China.`**

## Rest of World - Recovered Cases

In [40]:
rest_of_world_recovered = px.choropleth(rest_of_world, locations='Country', 
                    locationmode='country names', color='Recovered', 
                    hover_name='Country', range_color=[1,20], 
                    color_continuous_scale='viridis', 
                    title='Covid-19 Rest of World Recovered Cases')

iplot(rest_of_world_recovered)

**`Outside China, there are countries where all confirmed cases have recovered from the virus.`**

Let's see how many such countries there are:

In [41]:
formatted_text('***Countries with all reported cases recovered -***')
print(rest_of_world[rest_of_world['Confirmed'] == 
                    rest_of_world['Recovered']][['Country','Confirmed', 'Recovered']].reset_index())

***Countries with all reported cases recovered -***

   index    Country  Confirmed  Recovered
0      4    Belgium          1          1
1     22      India          3          3
2     27   Cambodia          1          1
3     31  Sri Lanka          1          1
4     35      Nepal          1          1
5     40     Russia          2          2
6     46    Vietnam         16         16
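
The filter above keeps rows where `Confirmed` equals `Recovered`. A plain-Python sketch of the same comparison follows, using the five rows shown earlier in the `rest_of_world.head()` output. The function name `fully_recovered` and the extra `Confirmed > 0` guard are assumptions added for illustration; the guard only matters if the data ever contains a country with zero confirmed cases, which would otherwise also pass the equality test:

```python
# First five rows of `rest_of_world`, copied from the head() output earlier.
rows = [
    {"Country": "United Arab Emirates", "Confirmed": 13, "Recovered": 4},
    {"Country": "Afghanistan", "Confirmed": 1, "Recovered": 0},
    {"Country": "Austria", "Confirmed": 2, "Recovered": 0},
    {"Country": "Australia", "Confirmed": 22, "Recovered": 11},
    {"Country": "Belgium", "Confirmed": 1, "Recovered": 1},
]


def fully_recovered(records):
    """Return countries whose confirmed cases have all recovered."""
    return [
        r["Country"]
        for r in records
        # Require at least one case so that 0 == 0 does not count.
        if r["Confirmed"] > 0 and r["Confirmed"] == r["Recovered"]
    ]


recovered_countries = fully_recovered(rows)
print(recovered_countries)
```

Of these five rows only Belgium qualifies, which agrees with the first entry of the table above.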


# Diamond Princess Cruise Ship

In [42]:
diamond_cruise_ship_cases.reset_index(drop=True, inplace=True)

# We only need the latest data here
temp_ship = diamond_cruise_ship_cases.sort_values(by='Date', ascending=False).head(1)[['State', 'Confirmed', 
                                                                                       'Deaths', 'Recovered']]

temp_ship

Unnamed: 0,State,Confirmed,Deaths,Recovered
35,Diamond Princess cruise ship,705,4,10


In [43]:
formatted_text('***Click on the pin to view detailed stats***')
cruiseMap = folium.Map(location=[diamond_cruise_ship_cases.iloc[0]['Lat'], diamond_cruise_ship_cases.iloc[0]['Long']], 
                       tiles='cartodbpositron', min_zoom=8, max_zoom=12, zoom_start=12)

# Each label is wrapped in a closed <strong> tag so only the label is bold
folium.Marker(location=[diamond_cruise_ship_cases.iloc[0]['Lat'], diamond_cruise_ship_cases.iloc[0]['Long']],
        popup =   '<strong>Ship : </strong>' + str(temp_ship.iloc[0]['State']) + '<br>' +
                    '<strong>Confirmed : </strong>' + str(temp_ship.iloc[0]['Confirmed']) + '<br>' +
                    '<strong>Deaths : </strong>' + str(temp_ship.iloc[0]['Deaths']) + '<br>' +
                    '<strong>Recovered : </strong>' + str(temp_ship.iloc[0]['Recovered'])
                    , icon=folium.Icon(color='red', icon='info-sign'), color='rgb(26, 118, 255)'
                    , tooltip = str(temp_ship.iloc[0]['State']), fill_color='rgb(26, 118, 255)').add_to(cruiseMap)

cruiseMap

***Click on the pin to view detailed stats***

This is the **`Diamond Princess cruise ship`** which started on the **`5th of February`** from **`Yokohama, Japan`**. The number of patients infected with the COVID-19 coronavirus aboard the quarantined ship, docked in Yokohama, has continued to rise, making it the largest cluster of the virus outside China.

With nearly 6% of the 3,711 passengers and crew members now infected, the 952-foot cruise ship also has the **`highest infection rate of the coronavirus`** anywhere in the world. Wuhan, China, the city where the disease is believed to have originated, has nearly 33,000 official cases, but spread across a population of more than 11 million, that is an infection rate of less than 0.3%.
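
The infection rates quoted above are simply cases divided by population. A minimal sketch of that arithmetic, using the rounded figures from the text (33,000 cases and 11 million people for Wuhan) and, for comparison, the 705 confirmed cases from the latest ship record shown earlier in this notebook. It assumes the 3,711 headcount still applies to that later count; `infection_rate` is a hypothetical helper name:

```python
def infection_rate(cases, population):
    """Return the share of the population infected, in percent."""
    return 100.0 * cases / population


# Rounded figures quoted in the text above.
wuhan = infection_rate(33_000, 11_000_000)

# Latest confirmed count from the notebook's ship record (705), divided by
# the 3,711 passengers and crew quoted above (assumed unchanged).
ship_latest = infection_rate(705, 3_711)

print(f"Wuhan: {wuhan:.1f}%")
print(f"Diamond Princess (705 cases): {ship_latest:.1f}%")
```

With the later count of 705 the ship's rate comes out at roughly 19%, well above the "nearly 6%" quoted from the article, which was written when fewer cases had been confirmed on board.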

Health experts say the Diamond Princess highlights the high risk that an infection will spread in confined spaces. According to the U.S. Centers for Disease Control and Prevention, **`close-contact environments such as cruises can facilitate the transmission of viruses through droplets or contaminated surfaces`**.

Deaths have been reported from the ship. Two Japanese passengers who had been on the quarantined Diamond Princess cruise ship have died after being infected with the novel coronavirus. Japan's health ministry says the male and female passengers were hospitalized last week. They were both in their 80s.
The man and woman are the first Diamond Princess passengers to die during the virus outbreak. The cruise ship has been under a quarantine at Yokohama's port near Tokyo since Feb. 3.

Source - 

 - https://time.com/5783451/covid-19-princess-diamond-cruise-ship/
 - https://www.channelnewsasia.com/news/asia/covid19-coronavirus-diamond-princess-japan-79-test-positive-12450498
 - https://www.npr.org/sections/goatsandsoda/2020/02/20/807745305/coronavirus-2-princess-diamond-cruise-ship-passengers-die-after-contracting-covi

### *More to be continued...*

# Data Insights:

`Covid-19: The new coronavirus disease now officially has a name.`
Read more: https://www.newscientist.com/article/2233218-covid-19-the-new-coronavirus-disease-now-officially-has-a-name/#ixzz6DpiKd2Ee

1. China is the worst-affected country, and its numbers continue to rise.


2. Hubei, where the virus is believed to have originated, is the worst-affected state in China.


3. The virus has spread to other countries as well. This could be because people who were in China/Hubei at the time of the outbreak unknowingly carried the virus with them when 
   they returned to their home countries.


4. Many confirmed cases have been reported outside Hubei, with **Zhejiang, Guangdong, Henan, Hunan** being the top four.


5. Recovery from the virus outside Hubei has not been fast. The virus did not match any other known virus, which raised concern: when a virus is new, we do not know how it 
   affects people, and no existing medications were available. However, a team of doctors in Thailand has seen some apparent success treating the coronavirus with a drug cocktail. The 
   doctors combined the anti-flu drug oseltamivir with lopinavir and ritonavir, antivirals used to treat HIV, Kriengsak said, adding that the ministry was awaiting research results to 
   prove the findings.

Read more at: https://economictimes.indiatimes.com/news/international/world-news/thailand-sees-apparent-success-treating-virus-with-drug-cocktail/articleshow/73879572.cms?utm_source=contentofinterest&utm_medium=text&utm_campaign=cppst

6. Cases have been reported from 27 different countries outside China. Most of these cases have been reported from Thailand, Singapore and Japan.


7. Hong Kong, Australia and South Korea have also reported more than 10 cases each.


8. India has reported 3 cases so far. 

9. In some countries, all reported cases have recovered. According to the output above, these countries are **Belgium, India, Cambodia, Sri Lanka, Nepal, Russia, Vietnam**.


