# <center>**Latest COVID-19 Statewise Data for India**</center>

![](https://assets2.rappler.com/2021/05/india-covid-variant-1280-1620622482819.jpg)

### **Importing Libraries**

In [None]:
import warnings
warnings.filterwarnings('ignore')
import pandas as pd
import numpy as np
import seaborn as sns
import plotly.express as px
import matplotlib.pyplot as plt
%matplotlib inline

### **Importing Datasets**

In [None]:
covid_filepath = '../input/latest-covid19-india-statewise-data/Latest Covid-19 India Status.csv'
covid_data = pd.read_csv(covid_filepath)
covid_data.head()

### **Identification of Data Types**

In [None]:
covid_data.dtypes

* Each column is stored in an appropriate data type.
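The dtype check can also be done programmatically rather than by reading the output. The sketch below uses a small made-up frame (`toy`, with placeholder states and values that are not taken from the dataset) and `pandas.api.types.is_numeric_dtype` to confirm that count columns were parsed as numbers.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Toy stand-in for covid_data; the rows are invented for illustration only.
toy = pd.DataFrame({
    'State/UTs': ['State A', 'State B', 'State C'],
    'Total Cases': [1000, 250, 40],
    'Deaths': [20, 5, 1],
})

# Columns that are expected to hold numbers.
numeric_cols = ['Total Cases', 'Deaths']

# Map each expected-numeric column to whether pandas parsed it as numeric.
dtype_ok = {col: is_numeric_dtype(toy[col]) for col in numeric_cols}
print(dtype_ok)
```

With the real data, the same dictionary comprehension could be run on `covid_data` with its own list of numeric columns.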

## **Size of the dataset**

In [None]:
covid_data.shape

* The dataset has **36 entries** (rows) with **8 variables** (columns).

## <center>**STATISTICAL ANALYSIS**</center>

In [None]:
covid_data.describe()

## **Missing Values Check**

In [None]:
covid_data.isnull().sum()

* There are **no missing values** in this dataset.
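For a dataset that does contain gaps, the per-column counts can be filtered down to the affected columns. This is a minimal sketch on a made-up frame (`toy` and its values are hypothetical, with one missing entry planted on purpose).

```python
import numpy as np
import pandas as pd

# Toy frame with one deliberately missing value; not real data.
toy = pd.DataFrame({
    'State/UTs': ['State A', 'State B', 'State C'],
    'Active': [12, np.nan, 3],
    'Deaths': [20, 5, 1],
})

# Count missing entries per column, then keep only columns that have any.
missing_per_column = toy.isnull().sum()
columns_with_gaps = missing_per_column[missing_per_column > 0].index.tolist()

# Single yes/no answer for the whole frame.
has_missing = bool(toy.isnull().values.any())
print(columns_with_gaps, has_missing)
```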

## <center>**VISUALIZATION OF DATA**</center>

### <center>**Total number of cases for each state in India**</center>

In [None]:
sns.set_style('dark')
plt.figure(figsize = (10, 12))
plt.title('Total cases for each state in India')
sns.barplot(data = covid_data, y = 'State/UTs', x = 'Total Cases')
plt.xlabel('Total Cases (million)')


* The bar chart shows that **Maharashtra** has the **highest** total number of cases, while **Andaman and Nicobar** has the **lowest** total number of cases in India.
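Reading the extremes off a bar chart can be cross-checked with `nlargest` and `nsmallest`. The sketch below runs on a made-up frame (`toy`, with placeholder names and counts), not on the actual dataset.

```python
import pandas as pd

# Toy stand-in for covid_data; states and counts are invented.
toy = pd.DataFrame({
    'State/UTs': ['State A', 'State B', 'State C', 'State D'],
    'Total Cases': [1000, 250, 40, 600],
})

# Two states with the most cases and the one with the fewest.
top_two = toy.nlargest(2, 'Total Cases')['State/UTs'].tolist()
bottom_one = toy.nsmallest(1, 'Total Cases')['State/UTs'].tolist()
print(top_two, bottom_one)
```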

### <center>**Death Ratio for each state in India**</center>

In [None]:
# Work on a copy so the original frame stays untouched
cdheatmap = covid_data.copy()
# Grid coordinates for a 6 x 6 layout (this assumes exactly 36 rows)
xcols = [1, 2, 3, 4, 5, 6] * 6
yrows = [1]*6 + [2]*6 + [3]*6 + [4]*6 + [5]*6 + [6]*6
# Shorten this long name, presumably so the label fits inside its cell
cdheatmap = cdheatmap.replace('Dadra and Nagar Haveli and Daman and Diu', 'Daman and Diu')
# Sort by death ratio so the grid runs from lowest to highest
cdheatmap = cdheatmap.sort_values(by = 'Death Ratio (%)')
cdheatmap['xcols'] = xcols
cdheatmap['yrows'] = yrows

# Reshape names and ratios into 6 x 6 arrays to build the cell labels
state = np.asarray(cdheatmap['State/UTs']).reshape(6, 6)
dr = np.asarray(cdheatmap['Death Ratio (%)']).reshape(6, 6)
result = cdheatmap.pivot(index = 'yrows', columns = 'xcols', values = 'Death Ratio (%)')
labels = (np.asarray(["{0} \n {1:.2f}".format(region, value)
                     for region, value in zip(state.flatten(),
                                           dr.flatten())])).reshape(6, 6)
fig, ax = plt.subplots(figsize=(20, 7))
title = 'Death Ratio for each state'
plt.title(title, fontsize = 18)
# Hide ticks and axes; the cell labels carry all the information
ax.set_xticks([])
ax.set_yticks([])
ax.axis('off')

sns.heatmap(result, annot = labels, fmt = "", cmap = 'Reds', linewidths = 0.3, ax=ax)
plt.show()

* From the heatmap above, **Punjab** has the **highest mortality rate**, even though its total number of cases is low relative to other states.
* From the earlier bar chart, **Maharashtra** has the **highest number of total cases** in India, and the heatmap shows it has the **3rd highest mortality rate** in the country.
* **Andaman and Nicobar** is among the states and union territories with the **lowest number of total cases**, yet it has one of the highest mortality rates in the country.
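A region with few cases can still rank high on mortality because the ratio divides deaths by cases. The sketch below assumes that `Death Ratio (%)` is defined as deaths divided by total cases, times 100. That definition is an assumption, since the dataset's documentation is not shown here, and the frame `toy` holds made-up numbers.

```python
import pandas as pd

# Toy stand-in for covid_data; all counts are invented.
toy = pd.DataFrame({
    'State/UTs': ['State A', 'State B', 'State C'],
    'Total Cases': [100000, 20000, 500],
    'Deaths': [1500, 500, 10],
})

# Assumed definition: deaths as a percentage of total cases.
toy['Death Ratio (%)'] = toy['Deaths'] / toy['Total Cases'] * 100

# Rank from highest to lowest ratio; the largest state does not come first.
ranked = toy.sort_values('Death Ratio (%)', ascending=False)
print(ranked[['State/UTs', 'Total Cases', 'Death Ratio (%)']])
```

In this toy example the state with the most cases has the lowest ratio, which mirrors the pattern described in the bullets above.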

### <center>**Number of active cases vs number of total cases**</center>

In [None]:
plt.figure(figsize = (13, 8))
plt.title("Active case vs Total Case trend", fontsize = 16)
sns.regplot(data = covid_data, x = 'Total Cases', y = 'Active', color = 'red')
plt.xlabel('Total Cases (millions)')

* The plot shows that as the total number of cases increases, the number of active cases also increases.

### <center>**Number of deaths vs number of total cases**</center>

In [None]:
plt.figure(figsize = (13, 8))
plt.title("Deaths vs Total Case trend", fontsize = 16)
sns.regplot(data = covid_data, x = 'Total Cases', y = 'Deaths', color = 'black')
plt.xlabel('Total Cases (millions)')

* The regression plot shows that as the total number of cases increases, the number of deaths also increases.
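The upward trends in these regression plots can be summarized with a Pearson correlation coefficient. The sketch below uses `Series.corr` on a made-up frame (`toy`, with invented counts). A value close to 1 indicates a strong positive linear relationship.

```python
import pandas as pd

# Toy stand-in for covid_data; all counts are invented.
toy = pd.DataFrame({
    'Total Cases': [100, 200, 300, 400, 500],
    'Deaths': [2, 3, 7, 8, 11],
})

# Pearson correlation (the default method) between the two columns.
r = toy['Total Cases'].corr(toy['Deaths'])
print(round(r, 3))
```

On the real data, the same call on `covid_data` would give one number per pair of columns. Note that raw counts tend to rise together simply because larger states have more of everything.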

### <center>**Number of active cases vs number of total cases for each region in India**</center>

In [None]:
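# Assuming one row per state, each hue group has a single point, so no per-state line can be fitted
sns.lmplot(data = covid_data, x = 'Total Cases', y = 'Active', hue = 'State/UTs', fit_reg = False)
plt.title('Active cases vs total cases for each state', fontsize = 12)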


* From this chart, most states in India cluster at a low number of total cases, while a few states have much higher totals.

### <center>**Number of deaths vs number of total cases for each region in India**</center>

In [None]:
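# Assuming one row per state, each hue group has a single point, so no per-state line can be fitted
sns.lmplot(data = covid_data, x = 'Total Cases', y = 'Deaths', hue = 'State/UTs', fit_reg = False)
plt.title('Deaths vs total cases for each state', fontsize = 12)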


* From this chart, most states in India cluster at a low number of deaths, while a few states have much higher death counts.

### <center>**Number of recoveries vs number of total cases**</center>

In [None]:
plt.figure(figsize = (13, 8))
plt.title("Discharged vs Total Case trend", fontsize = 16)
sns.regplot(data = covid_data, x = 'Total Cases', y = 'Discharged', color = 'green')
plt.xlabel('Total Cases (millions)')

* The regression plot shows that as the total number of cases increases, the number of recoveries also increases.

### <center>**Active and discharge ratios of COVID-19 patients in India**</center>

In [None]:
plt.figure(figsize = (13, 8))
# `fill` replaces the deprecated `shade` argument in seaborn 0.11 and later
sns.kdeplot(data = covid_data['Discharge Ratio (%)'], fill = True, label = 'Discharge Ratio')
sns.kdeplot(data = covid_data['Active Ratio (%)'], fill = True, label = 'Active Ratio', color = 'red')
plt.legend()
plt.xlabel("")
plt.title('Active and Discharge Ratio of COVID-19 Patients', fontsize = 15)


* Based on the plot above, almost **95% of the patients** who had COVID-19 in India **recovered**.
* The plot also shows that the share of active cases in the country is **low**.
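A density plot of state-level ratios gives every state equal weight, so a country-level recovery figure is better read from the summed counts. The sketch below contrasts the two on a made-up frame (`toy`, with invented counts) and assumes the discharge ratio means discharged patients as a percentage of total cases.

```python
import pandas as pd

# Toy stand-in for covid_data; one large and one small state, invented counts.
toy = pd.DataFrame({
    'State/UTs': ['State A', 'State B'],
    'Total Cases': [1000, 100],
    'Discharged': [950, 80],
})

# Country-level ratio: pool the counts first, then divide.
overall_ratio = toy['Discharged'].sum() / toy['Total Cases'].sum() * 100

# Unweighted mean of per-state ratios: small states count as much as large ones.
mean_of_ratios = (toy['Discharged'] / toy['Total Cases'] * 100).mean()
print(round(overall_ratio, 2), round(mean_of_ratios, 2))
```

The two numbers differ whenever states of different sizes have different ratios, so it helps to say which one a quoted percentage refers to.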

## <center>**SUMMARY OF REPORT**</center>

### **State that has the highest number of Total Cases in India**

In [None]:
high_total_case = covid_data[covid_data['Total Cases'] == max(covid_data['Total Cases'])]
high_total_case

### **State that has the lowest number of Total Cases in India**

In [None]:
low_total_case = covid_data[covid_data['Total Cases'] == min(covid_data['Total Cases'])]
low_total_case

### **State that has the highest number of Active Cases in India**

In [None]:
high_active_case = covid_data[covid_data['Active'] == max(covid_data['Active'])]
high_active_case

### **State that has the lowest number of Active Cases in India**

In [None]:
low_active_case = covid_data[covid_data['Active'] == min(covid_data['Active'])]
low_active_case

### **State that has the highest number of Recoveries in India**

In [None]:
high_dis_case = covid_data[covid_data['Discharged'] == max(covid_data['Discharged'])]
high_dis_case

### **State that has the lowest number of Recoveries in India**

In [None]:
low_dis_case = covid_data[covid_data['Discharged'] == min(covid_data['Discharged'])]
low_dis_case

### **State that has the highest number of Deaths in India**

In [None]:
high_dr_case = covid_data[covid_data['Deaths'] == max(covid_data['Deaths'])]
high_dr_case

### **State that has the lowest number of Deaths in India**

In [None]:
low_dr_case = covid_data[covid_data['Deaths'] == min(covid_data['Deaths'])]
low_dr_case
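The eight lookups above can also be collected into a single table. This is a minimal sketch using `DataFrame.idxmax` and `DataFrame.idxmin` on a made-up frame (`toy`, with placeholder names and counts). It is an alternative to the boolean-mask approach used in the cells above, not a replacement for it.

```python
import pandas as pd

# Toy stand-in for covid_data; states and counts are invented.
toy = pd.DataFrame({
    'State/UTs': ['State A', 'State B', 'State C'],
    'Total Cases': [1000, 250, 40],
    'Active': [5, 30, 2],
    'Discharged': [975, 215, 37],
    'Deaths': [20, 5, 1],
})

# Index by state so idxmax/idxmin return state names instead of row numbers.
by_state = toy.set_index('State/UTs')

# One row per metric, with the state holding the highest and lowest value.
summary = pd.DataFrame({
    'highest': by_state.idxmax(),
    'lowest': by_state.idxmin(),
})
print(summary)
```

One difference to keep in mind: `idxmax` and `idxmin` return only the first match when several states tie, whereas the boolean masks above return every tied row.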