I had stayed away from analyzing Covid-19 data for a long time, primarily because it was everywhere around me. There were apps, dashboards and notifications for every single % rise and drop in cases, and it was mentally taxing to watch those numbers go up and wait for them to plateau and fall.

But soon, people around me spoke only about the analyses they had done using the numbers reported by various sources, and I wasn't part of any of it. In my 7-year career, I've learnt that one of the things you have to deal with as a data scientist is that you always have to do something, something very small if need be, but something to live up to the hype! To be a part of the crowd! To stay in the news!

This notebook is my contribution to 'joining the conversation'.

![JoinConvoUrl](https://media.giphy.com/media/huU5VnGBtP849VaZLh/giphy.gif "JoinConvoUrl")

It is a simple notebook where the numbers match up with everything we've seen, read and heard from news channels and people. It's not going to be a story, just a presentation of facts using the tidy dataset that was recently uploaded.

In [None]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import geopandas as gpd
from geopandas import GeoDataFrame

In [None]:
covid_df= pd.read_csv('../input/latest-covid19-india-statewise-data/Latest Covid-19 India Status.csv')
covid_df

#### Changing the names to map them as per Shapefile

Note: Dadra and Nagar Haveli and Daman and Diu is a single row in the dataset but two separate rows in the shapefile. As I am not sure how to split the numbers between the two places, I am excluding them from the analysis.

In [None]:
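# Rename by row position so the names match the shapefile's 'st_nm' column.
# A single .iloc[row, col] call avoids chained assignment, which newer pandas
# versions warn about and may not apply to the original frame.
name_col = covid_df.columns.get_loc('State/UTs')
covid_df.iloc[35, name_col] = 'Andaman & Nicobar Island'
covid_df.iloc[28, name_col] = 'Arunanchal Pradesh'
covid_df.iloc[20, name_col] = 'Jammu & Kashmir'
covid_df.iloc[15, name_col] = 'Telangana'
covid_df.iloc[7, name_col] = 'NCT of Delhi'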
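
Hard-coded row positions break if the dataset's row order ever changes, so it can help to first list which names actually differ between the two sources. Below is a minimal sketch on toy data; the values in `covid_names` and `shape_names` are made up for illustration and are not read from the actual files.

```python
import pandas as pd

# Toy stand-ins for covid_df['State/UTs'] and map_df['st_nm'] (illustrative values only).
covid_names = pd.Series(['Kerala', 'Delhi', 'Goa'])
shape_names = pd.Series(['Kerala', 'NCT of Delhi', 'Goa'])

# Names present in one source but missing from the other.
only_in_covid = sorted(set(covid_names) - set(shape_names))
only_in_shape = sorted(set(shape_names) - set(covid_names))
print(only_in_covid)  # ['Delhi']
print(only_in_shape)  # ['NCT of Delhi']

# A dictionary-based rename does not depend on row order.
renamed = covid_names.replace({'Delhi': 'NCT of Delhi'})
print(renamed.tolist())  # ['Kerala', 'NCT of Delhi', 'Goa']
```

On the real data, the same set difference between `covid_df['State/UTs']` and `map_df['st_nm']` would show which rows need renaming.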

In [None]:
fp = "../input/indian-states-shapefiles/Indian_States.shp"
map_df = gpd.read_file(fp)
map_df.head()

In [None]:
merged = map_df.set_index('st_nm').join(covid_df.set_index('State/UTs')).reset_index()
merged = GeoDataFrame(merged)
merged
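
A left join keeps every shape, so a state whose name did not match ends up with empty case columns. One way to spot such rows is pandas' `merge` with `indicator=True`. This is a sketch on toy frames; the names and numbers are illustrative assumptions, not values from the dataset.

```python
import pandas as pd

# Toy frames standing in for map_df and covid_df (illustrative values only).
shapes = pd.DataFrame({'st_nm': ['Kerala', 'Goa', 'Daman & Diu']})
cases = pd.DataFrame({'State/UTs': ['Kerala', 'Goa'], 'Total Cases': [100, 50]})

# indicator=True adds a '_merge' column; for a left join it is 'both' or 'left_only'.
check = shapes.merge(cases, how='left', left_on='st_nm',
                     right_on='State/UTs', indicator=True)

# Shapes that found no matching row in the case data.
unmatched = check.loc[check['_merge'] == 'left_only', 'st_nm'].tolist()
print(unmatched)  # ['Daman & Diu']
```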

### Total Cases

In [None]:
fig, ax = plt.subplots(1, figsize=(10, 8))
ax.axis('off')
ax.set_title('State Wise Total Cases as of June 30th 2021', fontdict={'fontsize': '15', 'fontweight' : '3'})

# plot the figure
merged.plot(column='Total Cases', cmap='YlOrRd', linewidth=0.8, ax=ax, edgecolor='0.5', legend=True)
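
The same plotting lines recur for every metric in this notebook, so they could be wrapped in a small helper. The sketch below uses a hypothetical function name, `plot_state_map`, and assumes `gdf` is a GeoDataFrame such as `merged`; the styling arguments are copied from the cell above.

```python
import matplotlib.pyplot as plt

def plot_state_map(gdf, column, title):
    """Draw one choropleth of `column` with the notebook's styling (hypothetical helper)."""
    fig, ax = plt.subplots(1, figsize=(10, 8))
    ax.axis('off')
    ax.set_title(title, fontdict={'fontsize': '15', 'fontweight': '3'})
    # GeoDataFrame.plot colours each shape by the value in `column`.
    gdf.plot(column=column, cmap='YlOrRd', linewidth=0.8, ax=ax,
             edgecolor='0.5', legend=True)
    return fig, ax

# Example call, assuming `merged` exists as built earlier in the notebook:
# plot_state_map(merged, 'Total Cases', 'State Wise Total Cases as of June 30th 2021')
```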

In [None]:
merged.sort_values(by='Total Cases', ascending=False)[['st_nm', 'Total Cases']].head(5)

### Active Cases

In [None]:
fig, ax = plt.subplots(1, figsize=(10, 8))
ax.axis('off')
ax.set_title('State Wise Active Cases as of June 30th 2021', fontdict={'fontsize': '15', 'fontweight' : '3'})

# plot the figure
merged.plot(column='Active', cmap='YlOrRd', linewidth=0.8, ax=ax, edgecolor='0.5', legend=True)

#### Top 5 States with Highest Active Cases

In [None]:
merged.sort_values(by='Active', ascending=False)[['st_nm', 'Active']].head(5)

#### Top 5 States with Lowest Active Cases

In [None]:
merged.sort_values(by='Active', ascending=False)[['st_nm', 'Active']].dropna().tail(5)
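
As an alternative to sorting in descending order and taking the tail, pandas' `nsmallest` asks for the lowest values directly. A minimal sketch on toy data (the state labels and counts are made up); note that it returns the rows smallest-first, the reverse of the order produced by the cell above.

```python
import numpy as np
import pandas as pd

# Toy frame; 'B' has no data, like a shape that did not match during the join.
df = pd.DataFrame({'st_nm': ['A', 'B', 'C', 'D'],
                   'Active': [30.0, np.nan, 10.0, 20.0]})

# Drop rows without a value first, then take the two smallest.
lowest = df.dropna(subset=['Active']).nsmallest(2, 'Active')[['st_nm', 'Active']]
print(lowest['st_nm'].tolist())  # ['C', 'D']
```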

### Discharged Cases

In [None]:
fig, ax = plt.subplots(1, figsize=(10, 8))
ax.axis('off')
ax.set_title('State Wise Discharged Cases as of June 30th 2021', fontdict={'fontsize': '15', 'fontweight' : '3'})
merged.plot(column='Discharged', cmap='YlOrRd', linewidth=0.8, ax=ax, edgecolor='0.5', legend=True)

#### Top 5 States with Highest Discharged Cases

In [None]:
merged.sort_values(by='Discharged', ascending=False)[['st_nm', 'Discharged']].head(5)

#### Top 5 States with Lowest Discharged Cases

In [None]:
merged.sort_values(by='Discharged', ascending=False)[['st_nm', 'Discharged']].dropna().tail(5)

### Deaths

In [None]:
fig, ax = plt.subplots(1, figsize=(10, 8))
ax.axis('off')
ax.set_title('State Wise Deaths as of June 30th 2021', fontdict={'fontsize': '15', 'fontweight' : '3'})
merged.plot(column='Deaths', cmap='YlOrRd', linewidth=0.8, ax=ax, edgecolor='0.5', legend=True)

#### Top 6 States with Highest Deaths

In [None]:
merged.sort_values(by='Deaths', ascending=False)[['st_nm', 'Deaths']].head(6)

According to a Hindustan Times report of June 2nd, 2021, based on data shared by the Union ministry of health and family welfare, six states account for more than 70% of India's Covid-19 deaths: Maharashtra, Tamil Nadu, Karnataka, Kerala, Uttar Pradesh and West Bengal.
Read more at https://www.hindustantimes.com/india-news/six-states-account-for-more-than-70-of-india-s-covid-19-deaths-101622617249066.html

#### Top 5 States with Lowest Deaths

In [None]:
merged.sort_values(by='Deaths', ascending=False)[['st_nm', 'Deaths']].dropna().tail(5)

### Active %

In [None]:
fig, ax = plt.subplots(1, figsize=(10, 8))
ax.axis('off')
ax.set_title('State Wise % of Active Cases as of June 30th 2021', fontdict={'fontsize': '15', 'fontweight' : '3'})

# plot the figure
merged.plot(column='Active Ratio', cmap='YlOrRd', linewidth=0.8, ax=ax, edgecolor='0.5', legend=True)

In [None]:
merged.sort_values(by='Active Ratio', ascending=False)[['st_nm', 'Active Ratio']].head(5)

As of June 30th, 2021, Mizoram shows the highest % of active cases at 20.55%. The ratios here are relative, because the total number of cases varies by a huge margin from state to state.
Maharashtra, which has the highest number of reported total cases at 6,061,404, has almost 296 times the total cases reported by Mizoram (20,492). Mizoram's latest number as of 7th July is 21,246, as per https://www.business-standard.com/article/current-affairs/mizoram-reports-243-new-coronavirus-cases-one-fatality-in-the-last-24-hrs-121070400353_1.html

Hence this feature tells us how each state is handling the pandemic at its peak, independent of the size of its outbreak.
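
The comparison above can be checked with a few lines of arithmetic. This sketch uses only the figures quoted in the text and assumes that the dataset's `Active Ratio` is active cases as a percentage of total cases, so the implied active count is an approximation rather than a reported number.

```python
# Figures quoted in the text above.
maharashtra_total = 6061404
mizoram_total = 20492
mizoram_active_ratio = 20.55  # percent

# How many times larger Maharashtra's total is than Mizoram's.
multiple = maharashtra_total / mizoram_total
print(round(multiple, 1))  # 295.8

# Assumption: Active Ratio = Active / Total Cases * 100.
implied_active = mizoram_total * mizoram_active_ratio / 100
print(round(implied_active))  # 4211
```

So a 20.55% active share in Mizoram corresponds to only a few thousand people, which is why the ratio and the absolute counts tell different stories.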

### Discharge %

In [None]:
fig, ax = plt.subplots(1, figsize=(10, 8))
ax.axis('off')
ax.set_title('State Wise % of Discharges as of June 30th 2021', fontdict={'fontsize': '15', 'fontweight' : '3'})

# plot the figure
merged.plot(column='Discharge Ratio', cmap='YlOrRd', linewidth=0.8, ax=ax, edgecolor='0.5', legend=True)

In [None]:
merged.sort_values(by='Discharge Ratio', ascending=False)[['st_nm', 'Discharge Ratio']].head(5)

### Death %

In [None]:
fig, ax = plt.subplots(1, figsize=(10, 8))
ax.axis('off')
ax.set_title('State Wise % of Death as of June 30th 2021', fontdict={'fontsize': '15', 'fontweight' : '3'})

# plot the figure
merged.plot(column='Death Ratio', cmap='YlOrRd', linewidth=0.8, ax=ax, edgecolor='0.5', legend=True)

Though the notebook is titled 2021 Mid-Year report, I will most probably not do a Year-End report. I guess most of you will agree with me that we all see enough Covid numbers in our daily lives.

![OverandOut](https://media.giphy.com/media/3oEdv8etnVogPOs18I/giphy.gif "overandout")