# **VISUAL ANALYSIS OF COVID-19 IN INDIA**

COVID-19 originated in the city of Wuhan, China, in December 2019 and spread like wildfire throughout the world. Its effect in India was felt as early as February 2020, but the drastic jump in cases came in April. As of now, there are almost 2 million patients of this virus in India. In this notebook, we will briefly study the statistics of the COVID-19 cases in India.

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go

In [None]:

ageGroupDetails = pd.read_csv("../input/covid19-in-india/AgeGroupDetails.csv")
hospitalBedDetails = pd.read_csv("../input/covid19-in-india/HospitalBedsIndia.csv")
ICMRTestingLabs = pd.read_csv("../input/covid19-in-india/ICMRTestingLabs.csv")
individualDetails = pd.read_csv("../input/covid19-in-india/IndividualDetails.csv")
statewiseTestingDetails = pd.read_csv("../input/covid19-in-india/StatewiseTestingDetails.csv")
covid19India = pd.read_csv("../input/covid19-in-india/covid_19_india.csv")
populationIndiaCensus2011 = pd.read_csv("../input/covid19-in-india/population_india_census2011.csv")

Let us first study the population distribution of India. As the bar chart below shows, most of India's population resides in rural areas and villages. The only places where the urban population exceeds the rural population are union territories such as Delhi and Chandigarh.

In [None]:
fig = px.bar(populationIndiaCensus2011, x = "State / Union Territory", y = ["Rural population", "Urban population"], title = "India Population state/union territory wise")
fig.show()
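
The rural/urban comparison can also be checked numerically instead of by eye. The following is a minimal sketch, assuming the same column names as the census file used above; the figures in `sample` and the helper name `urban_share` are invented for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical figures for illustration only; not taken from the census file.
sample = pd.DataFrame({
    "State / Union Territory": ["State A", "State B", "Territory C"],
    "Rural population": [800, 600, 50],
    "Urban population": [200, 400, 950],
})

def urban_share(df):
    """Return a copy of df with the urban share of the population in percent."""
    out = df.copy()
    total = out["Rural population"] + out["Urban population"]
    out["UrbanSharePct"] = 100 * out["Urban population"] / total
    return out

shares = urban_share(sample)

# Regions where urban residents outnumber rural residents
mostly_urban = shares.loc[shares["UrbanSharePct"] > 50, "State / Union Territory"].tolist()
print(mostly_urban)
```

Calling `urban_share(populationIndiaCensus2011)` should work the same way, provided the two population columns are already numeric.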

In [None]:
# Drop the trailing "%" character and convert the percentage strings to floats
ageGroupDetails["NumericPercentage"] = ageGroupDetails["Percentage"].str[:-1].astype(float)
ageGroupDetails.head()

The bar chart below shows how COVID-19 patients are distributed across age groups. Most patients fall in the 20-40 age group.

In [None]:
fig, axes= plt.subplots(figsize = (25,10))
sns.set_color_codes("pastel")
sns.barplot(x = "AgeGroup", y = "TotalCases", data = ageGroupDetails, label = "Total Cases", color = "b")
sns.set_color_codes("muted")
sns.barplot(x = "AgeGroup", y = "NumericPercentage", data = ageGroupDetails, label = "Percentage", color = "b")
axes.legend(ncol=2, loc = "right", frameon=True)
# No xlim here: the x axis is categorical, so a fixed numeric range would squeeze the bars to one side
axes.set(ylabel="TotalCases / Percentage", xlabel="AgeGroup")
sns.despine(left=True, bottom=True)
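
The share of cases in a given age range can be computed directly from the case counts. This is a rough sketch, assuming a table shaped like `ageGroupDetails` with `AgeGroup` and `TotalCases` columns; the rows and the age-group labels below are invented for illustration.

```python
import pandas as pd

# Hypothetical rows in the shape of AgeGroupDetails.csv (values invented).
ages = pd.DataFrame({
    "AgeGroup": ["0-19", "20-29", "30-39", "40-59", ">=60"],
    "TotalCases": [10, 30, 25, 25, 10],
})

# Share of all cases that falls in each age group, in percent
ages["Share"] = 100 * ages["TotalCases"] / ages["TotalCases"].sum()

# Combined share of the 20-29 and 30-39 groups
young_adult_share = ages.loc[ages["AgeGroup"].isin(["20-29", "30-39"]), "Share"].sum()

# Age group with the highest case count
largest_group = ages.loc[ages["TotalCases"].idxmax(), "AgeGroup"]

print(young_adult_share, largest_group)
```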

The chart below shows the number of health facilities and hospital beds in the various states and union territories of India. Most of the beds are in public facilities and rural areas.

In [None]:
fig = go.Figure(go.Bar(x = hospitalBedDetails["State/UT"], y = hospitalBedDetails["NumPrimaryHealthCenters_HMIS"], name = "NumberOfPrimaryHealthCenters"))
fig.add_trace(go.Bar(x = hospitalBedDetails["State/UT"], y = hospitalBedDetails["NumCommunityHealthCenters_HMIS"], name = "NumberOfCommunityHealthCenters"))
fig.add_trace(go.Bar(x = hospitalBedDetails["State/UT"], y = hospitalBedDetails["NumSubDistrictHospitals_HMIS"], name = "NumberOfSubDistrictHospitals"))
fig.add_trace(go.Bar(x = hospitalBedDetails["State/UT"], y = hospitalBedDetails["NumDistrictHospitals_HMIS"], name = "NumberOfDistrictHospitals"))
fig.add_trace(go.Bar(x = hospitalBedDetails["State/UT"], y = hospitalBedDetails["TotalPublicHealthFacilities_HMIS"], name = "TotalPublicHealthFacilities"))
fig.add_trace(go.Bar(x = hospitalBedDetails["State/UT"], y = hospitalBedDetails["NumPublicBeds_HMIS"], name = "NumberOfPublicBeds"))
fig.add_trace(go.Bar(x = hospitalBedDetails["State/UT"], y = hospitalBedDetails["NumRuralHospitals_NHP18"], name="NumberRuralHospitals"))
fig.add_trace(go.Bar(x = hospitalBedDetails["State/UT"], y = hospitalBedDetails["NumUrbanHospitals_NHP18"], name="NumberUrbanHospitals"))
fig.add_trace(go.Bar(x = hospitalBedDetails["State/UT"], y = hospitalBedDetails["NumUrbanBeds_NHP18"], name="NumberOfUrbanBeds"))
fig.update_layout(barmode='stack', xaxis={'categoryorder':'category ascending'})
fig.show()

In [None]:
# Add integer versions of the count columns.
# pd.to_numeric(errors="coerce") turns non-numeric entries into missing values,
# and the nullable "Int64" dtype keeps them as <NA> instead of failing.
for source, target in [("Cured", "CuredInt"), ("Deaths", "DeathInt"), ("Confirmed", "ConfirmedInt")]:
    covid19India[target] = pd.to_numeric(covid19India[source], errors="coerce").astype("Int64")

covid19India.tail()

In [None]:
statewiseTestingDetails["TotalSamples"] = statewiseTestingDetails["TotalSamples"].fillna(0)
statewiseTestingDetails["Negative"] = statewiseTestingDetails["Negative"].fillna(0)
statewiseTestingDetails["Positive"] = statewiseTestingDetails["Positive"].fillna(0)

statewiseTestingDetails

In [None]:
# Sum the samples and positives per state over all rows (no hard-coded row count).
# Caveat: if these columns hold cumulative totals per date, summing over dates
# overcounts; taking .max() per state would then give the totals.
data = (statewiseTestingDetails
        .groupby("State", as_index=False, sort=False)[["TotalSamples", "Positive"]]
        .sum()
        .astype({"TotalSamples": int, "Positive": int}))
data.head()

The chart shown below illustrates the total samples analysed along with the total positive cases found. 

In [None]:

fig = px.bar(data, x= "State", y = ["TotalSamples", "Positive"], color_discrete_sequence = ["cyan", "red"])
fig.show()
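
Raw counts are hard to compare across states of very different size, so a positivity rate (positives per 100 samples) is a useful companion figure. The sketch below assumes a frame with the same `State`, `TotalSamples` and `Positive` columns as `data`; the numbers and the helper name `positivity_rate` are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical values for illustration only.
toy = pd.DataFrame({
    "State": ["State A", "State B", "State C"],
    "TotalSamples": [1000, 500, 0],
    "Positive": [50, 100, 0],
})

def positivity_rate(df):
    """Positive results per 100 samples; NaN where no samples were recorded."""
    samples = df["TotalSamples"].replace(0, np.nan)  # avoid division by zero
    return 100 * df["Positive"] / samples

toy["PositivityPct"] = positivity_rate(toy)
print(toy)
```

States with zero recorded samples get a missing value rather than a misleading 0% rate.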

In [None]:
covidConfirmed = covid19India.drop(["Sno", "Time", "ConfirmedIndianNational", "ConfirmedForeignNational", "Cured", "Deaths", "CuredInt", "DeathInt", "ConfirmedInt"],axis=1)
covidDeaths = covid19India.drop(["Sno","Time", "Confirmed", "ConfirmedIndianNational","ConfirmedForeignNational", "Cured", "CuredInt", "DeathInt", "ConfirmedInt"], axis = 1)

In [None]:
from datetime import date

In [None]:
# Parse "dd/mm/yy" strings into datetime.date objects
covidConfirmed["Date"] = pd.to_datetime(covidConfirmed["Date"], format="%d/%m/%y").dt.date

In [None]:
covidConfirmed.head(10)

In [None]:
covidConfirmed = covidConfirmed.pivot_table(index="Date",columns="State/UnionTerritory",values="Confirmed",aggfunc="sum")

In [None]:
covidConfirmed = covidConfirmed.fillna(0)
covidConfirmed.head()

In [None]:
%pip install bar_chart_race

In [None]:
import bar_chart_race as bcr

Day-by-day analysis of COVID-19 patients in the various states and union territories of India, starting from 31st January 2020.

In [None]:
bcr.bar_chart_race(df = covidConfirmed,
                   title = "State wise confirmed cases",
                   n_bars = 10,
                   filter_column_colors = True,
                   figsize = (5,3)
                  )
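
The race above animates the confirmed counts as stored in the table. If those counts are cumulative (an assumption about the dataset, not something verified here), daily new cases can be derived by differencing consecutive rows. A minimal sketch on an invented wide table shaped like `covidConfirmed` (dates as index, one column per state):

```python
import pandas as pd

# Hypothetical cumulative counts for illustration only.
cumulative = pd.DataFrame(
    {"State A": [1, 3, 6, 6], "State B": [0, 0, 2, 5]},
    index=pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-03", "2020-03-04"]),
)

# diff() subtracts the previous row from each row. The first row has no
# predecessor, so it is filled with its own cumulative value. clip() guards
# against negative values caused by downward corrections in the data.
daily_new = (cumulative.diff()
             .fillna(cumulative.iloc[0])
             .clip(lower=0)
             .astype(int))
print(daily_new)
```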

In [None]:
# Parse "dd/mm/yy" strings into datetime.date objects
covidDeaths["Date"] = pd.to_datetime(covidDeaths["Date"], format="%d/%m/%y").dt.date

In [None]:
# Later rows overwrite earlier ones, so each state ends up with its last reported death count.
# Iterating over the columns avoids a hard-coded row count.
deaths = {}
for state, count in zip(covidDeaths["State/UnionTerritory"], covidDeaths["Deaths"]):
    deaths[state] = count
deaths

In [None]:
l = list(deaths.keys())    # state / union territory names
m = list(deaths.values())  # last reported death count for each

In [None]:
# Build the frame directly (avoids shadowing the built-in dict); the last two entries are dropped
data = pd.DataFrame({"State/UnionTerritory" : l[:-2], "deaths" : m[:-2]})
data.head()

State wise Death toll due to COVID-19 in India

In [None]:
# A list of colours belongs in color_discrete_sequence; color_discrete_map expects a dict.
fig = px.pie(data, values = 'deaths', names = 'State/UnionTerritory', color_discrete_sequence = px.colors.sequential.Plasma_r, hole = 0.5, title = "State wise death toll due to COVID-19 till " + str(covidDeaths["Date"].max()))
fig.update_traces(textposition = "inside")
fig.show()