# **Unemployment in India during COVID-19**

India's massive workforce is being hit hard by the coronavirus. As many as 21 million salaried jobs were lost between April and August, according to a recent report by the Centre for Monitoring Indian Economy (CMIE).
Hospitality, aviation, media, entertainment and automobile parts manufacturing are among the sectors seeing the most job cuts.

In September 2020, India saw an unemployment rate of over six percent, a significant improvement on the previous months. A total lockdown was bound to damage an economy as large as India's: unemployment went up to 24 percent on May 17, 2020, possibly as a result of falling demand and of the workforce disruption that companies faced. This also caused a GVA loss of more than nine percent for the Indian economy that month.

## **Objective of the analysis**
To understand the impact of COVID-19 on the job market.

**Which states held up, and which were hit hardest?**

This dataset contains the unemployment rate of all the states in India. Its columns are:

- States = states in India
- Date = date on which the unemployment rate was observed
- Frequency = measuring frequency (Monthly)
- Estimated Unemployment Rate (%) = percentage of people unemployed in each state of India
- Estimated Employed = number of people employed
- Estimated Labour Participation Rate (%) = the portion of the working population in the 16-64 years' age group that is currently in employment or seeking employment
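
As a quick illustration of how the two rates relate, the sketch below computes them from made-up head counts. It assumes the usual convention that the unemployment rate is measured against the labour force (employed people plus job seekers), which the dataset description does not state explicitly. All numbers and variable names are hypothetical.

```python
# Hypothetical head counts for one state in one month (not taken from the dataset)
working_age_population = 1_000_000   # people aged 16-64
employed = 380_000                   # people currently in employment
seeking_work = 20_000                # unemployed people looking for a job

# Labour force = everyone in employment or seeking employment
labour_force = employed + seeking_work

# Labour participation rate: share of the working-age population in the labour force
participation_rate = 100 * labour_force / working_age_population

# Unemployment rate: share of the labour force without a job (assumed convention)
unemployment_rate = 100 * seeking_work / labour_force

print(f"Labour participation rate: {participation_rate:.1f}%")
print(f"Unemployment rate: {unemployment_rate:.1f}%")
```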

Libraries used

- pandas -- to handle the DataFrame
- datetime -- to handle dates and times
- calendar -- to build a separate month-name column
- Plotly Express -- for visualization
- Matplotlib -- for the final line chart

In [None]:
import pandas as pd
import datetime as dt
import calendar

Read the dataset using pandas

In [None]:
df=pd.read_csv("../input/unemployment-in-india-due-to-covid19/real_1_Unemployment_Rate_upto_11_2020.csv")

In [None]:
df.head()

## Data Preprocessing

Identifying null values

In [None]:
df.isnull().sum()
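
To show what this check reports, here is a minimal sketch on a small made-up frame; the values are hypothetical and only two of the dataset's columns are mimicked.

```python
import numpy as np
import pandas as pd

# Small made-up frame with one missing value in each column
demo = pd.DataFrame({
    "States": ["Bihar", "Goa", None],
    "Estimated Unemployment Rate": [10.6, np.nan, 7.2],
})

# isnull() marks missing cells as True; sum() counts them per column
null_counts = demo.isnull().sum()
print(null_counts)
```

A count of zero in every column means no rows need to be dropped or filled before the analysis.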

In [None]:
df1 = df.rename(columns = {"Region":"State",
                           "Region.1":"Region"})

In [None]:
df1.head()

In [None]:
df1.columns =['States','Date','Frequency','Estimated Unemployment Rate','Estimated Employed','Estimated Labour Participation Rate','Region','longitude','latitude']

In [None]:
df1['Date']

Adding a separate month column so the year can be analysed month by month

In [None]:
# Parse the dates on df1 itself, so its Date column becomes a datetime column
df1['Date'] = pd.to_datetime(df1['Date'],dayfirst=True)

In [None]:
df1['Month_integer'] = df1['Date'].dt.month

In [None]:
df1.head()

Using the calendar module to build a month-name column

In [None]:
calendar.month_abbr[8]

In [None]:
df1['Month_name'] =  df1['Month_integer'].apply(lambda x: calendar.month_abbr[x])

In [None]:
df1.head()
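
The date handling above can be tried in isolation. The sketch below uses three made-up date strings and assumes they are written day first, as the `dayfirst=True` argument suggests; the frame name `demo` is hypothetical.

```python
import calendar
import pandas as pd

# Made-up dates written day-month-year
demo = pd.DataFrame({"Date": ["31-01-2020", "29-02-2020", "31-03-2020"]})

# dayfirst=True makes pandas read the first number as the day
demo["Date"] = pd.to_datetime(demo["Date"], dayfirst=True)

# Month number from the parsed date, then its abbreviated name
demo["Month_integer"] = demo["Date"].dt.month
demo["Month_name"] = demo["Month_integer"].apply(lambda m: calendar.month_abbr[m])
print(demo)
```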

In [None]:
df1.Month_integer

In [None]:
df_stats = df1[['Estimated Unemployment Rate','Estimated Employed','Estimated Labour Participation Rate']]
round(df_stats.describe().T,2)

Finding the average unemployment rate, employment and labour participation rate for each region

In [None]:
region_stats = df1.groupby(['Region'])[['Estimated Unemployment Rate','Estimated Employed','Estimated Labour Participation Rate']].mean().reset_index()
region_stats = round(region_stats,2)
region_stats
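
The same group-and-average pattern is easier to follow on a tiny frame. This is a minimal sketch with made-up values; `demo` and `region_demo` are hypothetical names.

```python
import pandas as pd

# Made-up rows: two observations per region
demo = pd.DataFrame({
    "Region": ["North", "North", "South", "South"],
    "Estimated Unemployment Rate": [10.0, 20.0, 4.0, 6.0],
    "Estimated Employed": [100, 300, 200, 400],
})

# One row per region, holding the mean of each numeric column
region_demo = (
    demo.groupby("Region")[["Estimated Unemployment Rate", "Estimated Employed"]]
    .mean()
    .round(2)
    .reset_index()
)
print(region_demo)
```

Note that this is an unweighted mean: every state-month row counts equally, regardless of how large that state's labour force is.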

In [None]:
#pip install --upgrade plotly

In [None]:
import plotly.express as px

The box plot below makes it easy to spot outliers in each state's unemployment rate

In [None]:
fig = px.box(df1,x='States',y='Estimated Unemployment Rate',color='States',title='Unemployment rate',template='plotly')
fig.update_layout(xaxis={'categoryorder':'total descending'})
fig.show()
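
Outliers can also be listed numerically. The sketch below applies the common 1.5 x IQR rule that box plots conventionally use, on made-up rates. Quartile methods differ slightly between libraries, so borderline points may not match the chart exactly.

```python
import pandas as pd

# Made-up unemployment rates for one state, with one extreme month
rates = pd.Series([4.0, 5.0, 6.0, 7.0, 8.0, 30.0], name="Estimated Unemployment Rate")

# Interquartile range: distance between the 25th and 75th percentiles
q1 = rates.quantile(0.25)
q3 = rates.quantile(0.75)
iqr = q3 - q1

# Points beyond 1.5 * IQR from the quartiles are flagged as outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = rates[(rates < lower_fence) | (rates > upper_fence)]
print(outliers)
```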

The scatter-plot matrix below lets us compare the columns against one another.

In [None]:
fig = px.scatter_matrix(df1,template='plotly',
    dimensions=['Estimated Unemployment Rate','Estimated Employed',
                'Estimated Labour Participation Rate'],
    color='Region')
fig.show()

The bar graph below shows the average unemployment rate of each state over the whole period

In [None]:
plot_ump = df1[['Estimated Unemployment Rate','States']]

df_unemp = plot_ump.groupby('States').mean().reset_index()

df_unemp = df_unemp.sort_values('Estimated Unemployment Rate')

fig = px.bar(df_unemp, x='States',y='Estimated Unemployment Rate',color='States',
            title='Average Unemployment Rate in each state',template='plotly')

fig.show()

The animated chart below compares regions and states across the months of the year

In [None]:
fig = px.bar(df1, x='Region',y='Estimated Unemployment Rate',animation_frame = 'Month_name',color='States',
            title='Unemployment rate across region from Jan.2020 to Oct.2020', height=700,template='plotly')

fig.update_layout(xaxis={'categoryorder':'total descending'})

fig.layout.updatemenus[0].buttons[0].args[1]["frame"]["duration"] = 2000

fig.show()

In [None]:
unemplo_df = df1[['States','Region','Estimated Unemployment Rate','Estimated Employed','Estimated Labour Participation Rate']]

unemplo = unemplo_df.groupby(['Region','States'])['Estimated Unemployment Rate'].mean().reset_index()

In [None]:
unemplo

In [None]:
fig = px.sunburst(unemplo, path=['Region','States'], values='Estimated Unemployment Rate',color_continuous_scale='Plasma',title= 'unemployment rate in each region and state',height=650,template='ggplot2')


fig.show()

In [None]:
df1.columns

# Impact of Lockdown on States' Estimated Employment
On 24 March 2020, the Government of India under Prime Minister Narendra Modi ordered a nationwide lockdown for 21 days. COVID-19 cases rose rapidly afterwards, so the government extended the lockdown.

The map below shows the impact of the lockdown on employment across the regions of India.

In [None]:
fig = px.scatter_geo(df1,'longitude', 'latitude', color="Region",
                     hover_name="States", size="Estimated Unemployment Rate",
                     animation_frame="Month_name",scope='asia',template='ggplot2',title='Impact of lockdown on employment across regions' )

fig.layout.updatemenus[0].buttons[0].args[1]["frame"]["duration"] = 2000

fig.update_geos(lataxis_range=[5,35], lonaxis_range=[65, 100])

fig.show()

Now we split the dataset into two parts, before lockdown and after lockdown, to see the impact of COVID-19 on unemployment

In [None]:
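# Lockdown period: April to July
lock = df1[(df1['Month_integer'] >= 4) & (df1['Month_integer'] <=7)]

# Pre-lockdown period: January to March, so April is not counted in both periods
bf_lock = df1[(df1['Month_integer'] >= 1) & (df1['Month_integer'] <=3)]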

In [None]:
g_lock = lock.groupby('States')['Estimated Unemployment Rate'].mean().reset_index()

g_bf_lock = bf_lock.groupby('States')['Estimated Unemployment Rate'].mean().reset_index()

# Join on the state name rather than on row position, so a state missing from one period cannot shift the rows
g_lock = g_lock.merge(g_bf_lock, on='States', how='left', suffixes=(' after', ' before'))

g_lock.columns = ['States','Unemployment Rate after lockdown','Unemployment Rate before lockdown']

g_lock
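
When two grouped results are combined, pairing rows by index position and pairing them by state name can give different answers if a state is missing from one period. The sketch below shows the difference on made-up numbers; the frames and the `rate` column are hypothetical.

```python
import pandas as pd

# Made-up averages; Goa has no rows in the "before" period
after = pd.DataFrame({"States": ["Bihar", "Goa", "Sikkim"], "rate": [30.0, 15.0, 9.0]})
before = pd.DataFrame({"States": ["Bihar", "Sikkim"], "rate": [10.0, 6.0]})

# Column assignment aligns on the index: row 1 (Goa) receives Sikkim's value
by_position = after.copy()
by_position["before"] = before["rate"]

# Merging on the key pairs rows by state name and leaves Goa empty instead
by_key = after.merge(before, on="States", how="left", suffixes=("_after", "_before"))

print(by_position)
print(by_key)
```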

Now we compute the percentage change in each state's unemployment rate after the lockdown

In [None]:

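# Parentheses make the whole difference the numerator; * 100 turns the ratio into a percentage
g_lock['percentage change in unemployment'] = round((g_lock['Unemployment Rate after lockdown'] - g_lock['Unemployment Rate before lockdown'])/g_lock['Unemployment Rate before lockdown']*100,2)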
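
Percentage change is the difference between the new and old value, divided by the old value, times 100. Operator precedence matters when this is written in Python, as the following sketch with two made-up rates shows.

```python
before = 10.0   # hypothetical unemployment rate before lockdown (%)
after = 25.0    # hypothetical unemployment rate after lockdown (%)

# Division binds tighter than subtraction: this is after - (before / before) = after - 1
without_parentheses = after - before / before

# Parentheses give the relative change; * 100 expresses it as a percentage
percentage_change = (after - before) / before * 100

print(without_parentheses)
print(percentage_change)
```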

In [None]:
plot_per = g_lock.sort_values('percentage change in unemployment')

In [None]:
fig = px.bar(plot_per, x='States',y='percentage change in unemployment',color='percentage change in unemployment',
            title='percentage change in Unemployment in each state after lockdown',template='ggplot2')

fig.show()

# Most impacted states/UT
- Puducherry
- Jharkhand
- Bihar
- Haryana
- Tripura

Function to label each state by the impact of the COVID-19 lockdown

In [None]:
def sort_impact(x):
    if x <= 10:
        return 'impacted States'
    elif x <= 20:
        return 'hard impacted States'
    elif x <= 30:
        return 'harder impacted States'
    # Anything above 30 gets the top label, so the function always returns a string
    return 'hardest impacted States'

In [None]:
plot_per['impact status'] = plot_per['percentage change in unemployment'].apply(lambda x:sort_impact(x))
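
The same kind of labelling can be done without a custom function. The sketch below is an alternative, not the approach used in this notebook: it uses `pd.cut` on made-up values and assumes an open-ended top bin, which is a choice of this example.

```python
import numpy as np
import pandas as pd

# Made-up percentage changes
changes = pd.Series([5.0, 10.0, 18.0, 27.0, 55.0])

labels = ["impacted States", "hard impacted States",
          "harder impacted States", "hardest impacted States"]

# Right-closed bins: (-inf, 10], (10, 20], (20, 30], (30, inf)
impact = pd.cut(changes, bins=[-np.inf, 10, 20, 30, np.inf], labels=labels)
print(impact.tolist())
```

Because the result is an ordered categorical, the categories keep their low-to-high order instead of being treated as unrelated strings.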

In [None]:
fig = px.bar(plot_per, y='States',x='percentage change in unemployment',color='impact status',
            title='Impact of lockdown on employment across states',template='ggplot2',height=650)


fig.show()

In [None]:
month_mean = df1.groupby('Month_integer')['Estimated Unemployment Rate'].mean().reset_index()

In [None]:
x=month_mean.iloc[:,0]

In [None]:
y=month_mean.iloc[:,1]

In [None]:
month_mean

In [None]:
import matplotlib.pyplot as plt

In [None]:
plt.plot(x,y, '-o')
plt.ylabel("Unemployment rate")
plt.xlabel("Month(integer)")
plt.show()
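
To put numbers on the month-to-month movement that a line chart like this shows, the change from the previous month can be computed with `diff()`. This is a minimal sketch on made-up monthly averages; `month_mean_demo` is a hypothetical name.

```python
import pandas as pd

# Made-up monthly averages
month_mean_demo = pd.DataFrame({
    "Month_integer": [3, 4, 5, 6],
    "Estimated Unemployment Rate": [10.0, 22.0, 23.0, 11.0],
})

# diff() subtracts the previous row; the first month has nothing to compare against
month_mean_demo["Change vs previous month"] = (
    month_mean_demo["Estimated Unemployment Rate"].diff()
)
print(month_mean_demo)
```

Negative values mark months in which the average unemployment rate fell.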

As the chart shows, the unemployment rate fell rapidly from July to October, a significant improvement on the previous months.

Thanks for Reading