# **Unemployment Analysis in India during the COVID-19 Pandemic**

* Unemployment is measured by the unemployment rate, which is the number of unemployed people expressed as a percentage of the total labour force.
* During the COVID-19 period, there was an increase in the unemployment rate.
* The aim is to analyze the unemployment rate using Python.
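
The definition of the unemployment rate given above can be written as a one-line calculation. The sketch below is only an illustration: the function name `unemployment_rate` and the figures passed to it are hypothetical and are not taken from the dataset.

```python
def unemployment_rate(unemployed: int, labour_force: int) -> float:
    """Return the number of unemployed people as a percentage of the labour force."""
    if labour_force <= 0:
        raise ValueError("labour_force must be positive")
    return 100.0 * unemployed / labour_force


# Hypothetical figures, for illustration only (not taken from the dataset).
example_rate = unemployment_rate(unemployed=50, labour_force=1000)
print(example_rate)  # 5.0
```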


Goal:

This analysis evaluates the impact of the COVID-19 pandemic on India's employment landscape. The dataset tracks how unemployment rates fluctuated across Indian states and includes the following indicators: state, date, measurement frequency, Estimated Unemployment Rate (%), Estimated Employed Individuals, and Estimated Labour Participation Rate (%).






Dataset Overview:

The dataset describes unemployment across the states of India:

* States: The Indian states covered by the dataset.
* Date: The date on which each unemployment rate was recorded.
* Measuring Frequency: How often measurements were collected (Monthly).
* Estimated Unemployment Rate (%): The percentage of the labour force that is unemployed in each state.
* Estimated Employed Individuals: The estimated number of people currently employed.
* Estimated Labour Participation Rate (%): The percentage of the working-age populace (16-64 years) actively involved in the job market, including both employed individuals and those actively seeking jobs.
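
The three numeric columns are linked by the definitions above. Assuming the labour force is the sum of employed and unemployed people, an employed count and an unemployment rate imply a labour force size, and the participation rate then implies a working-age population. The following is a minimal sketch of that arithmetic; the helper names and the input figures are hypothetical.

```python
def implied_labour_force(employed: float, unemployment_rate_pct: float) -> float:
    """Labour force implied by an employed count and an unemployment rate (%).

    Assumes labour force = employed + unemployed, so
    employed = labour_force * (1 - rate / 100).
    """
    if not 0 <= unemployment_rate_pct < 100:
        raise ValueError("unemployment_rate_pct must be in [0, 100)")
    return employed / (1 - unemployment_rate_pct / 100)


def implied_working_age_population(labour_force: float, participation_rate_pct: float) -> float:
    """Working-age population implied by a labour force and a participation rate (%)."""
    if not 0 < participation_rate_pct <= 100:
        raise ValueError("participation_rate_pct must be in (0, 100]")
    return labour_force / (participation_rate_pct / 100)


# Hypothetical figures, for illustration only.
labour_force = implied_labour_force(employed=950, unemployment_rate_pct=5)
working_age = implied_working_age_population(labour_force, participation_rate_pct=40)
print(round(labour_force), round(working_age))
```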

This dataset is a useful resource for understanding how unemployment varied across India's states during the COVID-19 pandemic. It shows the effect on unemployment rates, employment numbers, and labour participation rates in different regions of the country. The analysis aims to provide insight into the pandemic's socio-economic effects on India's workforce.

In [None]:
#import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import datetime as dt
import calendar 
import plotly.graph_objects as go

import warnings
warnings.filterwarnings("ignore")
%matplotlib inline

Load the CSV file into a pandas DataFrame

In [None]:

df = pd.read_csv("/kaggle/input/unemployment-in-india/Unemployment_Rate_upto_11_2020.csv")
df.head()

In [None]:
df.tail()

In [None]:
df.info()

### Renaming the attributes
1. Region = state
2. Date = date
3. Frequency = frequency
4. Estimated Unemployment Rate (%) = estimated unemployment rate
5. Estimated Employed = estimated employed
6. Estimated Labour Participation Rate (%) = estimated labour participation rate
7. Region.1 = region
8. longitude = longitude
9. latitude = latitude


Updating column names:

In [None]:
df.columns = ['state','date','frequency','estimated unemployment rate','estimated employed','estimated labour participation rate','region','longitude','latitude']
df.head()
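
Assigning a list to `df.columns` depends on the columns arriving in exactly the expected order. A possible alternative, sketched here under that concern, is to rename by explicit mapping so that a missing or reordered column fails loudly. `RENAME_MAP` and `rename_columns` are hypothetical names, the stand-in frame is empty, and the stray leading space in one header is an assumption made only to show the whitespace handling.

```python
import pandas as pd

# Original header -> new name, matching the names assigned above.
RENAME_MAP = {
    "Region": "state",
    "Date": "date",
    "Frequency": "frequency",
    "Estimated Unemployment Rate (%)": "estimated unemployment rate",
    "Estimated Employed": "estimated employed",
    "Estimated Labour Participation Rate (%)": "estimated labour participation rate",
    "Region.1": "region",
    "longitude": "longitude",
    "latitude": "latitude",
}


def rename_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Strip whitespace from headers, check they are all present, then rename."""
    cleaned = frame.rename(columns=lambda name: name.strip())
    missing = [name for name in RENAME_MAP if name not in cleaned.columns]
    if missing:
        raise KeyError(f"Expected columns not found: {missing}")
    return cleaned.rename(columns=RENAME_MAP)


# Empty stand-in frame; the leading space in " Date" is a hypothetical defect.
raw = pd.DataFrame(columns=[" Date" if name == "Date" else name for name in RENAME_MAP])
renamed = rename_columns(raw)
print(list(renamed.columns))
```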

Basic information about the dataset

In [None]:
df.shape

In [None]:
df.columns

In [None]:
df.describe()

In [None]:
df.isnull().sum()

In [None]:
df.duplicated().any()

In [None]:
df.state.value_counts()

### Changing the datatype of 'date' from object to datetime

In [None]:
df['date'] = pd.to_datetime(df['date'],dayfirst = True)
df.info()
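
`dayfirst=True` tells pandas to read ambiguous strings such as `01-02-2020` as 1 February rather than 2 January. When the layout is known, passing an explicit `format` is a stricter alternative. The sketch below assumes day-month-year strings separated by hyphens; the sample values, including the stray leading space, are hypothetical.

```python
import pandas as pd

# Hypothetical date strings in day-month-year order; one has a stray leading space.
sample_dates = pd.Series([" 31-01-2020", "29-02-2020", "31-03-2020"])

# Strip whitespace first, then parse with an explicit format so that a
# malformed value raises an error instead of being guessed.
parsed = pd.to_datetime(sample_dates.str.strip(), format="%d-%m-%Y")

print(parsed.dt.month.tolist())  # [1, 2, 3]
```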

### Extracting month from date attribute

In [None]:
df['month_int'] = df['date'].dt.month
df.head()

The months are stored as integers. We convert them to abbreviated month names for easier analysis.

In [None]:
df['month'] = df['month_int'].apply(lambda x: calendar.month_abbr[x])
df.head()

Numeric data grouped by months

In [None]:
# Mean of each numeric indicator per month; reset_index() turns 'month' back into a column
data = df.groupby(['month'])[['estimated unemployment rate','estimated employed','estimated labour participation rate']].mean().reset_index()
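
Grouping by month abbreviations sorts the result alphabetically (Apr, Aug, Feb, ...), which is why the plots need an explicit category order. One way to keep calendar order in the grouped table itself is an ordered categorical. This is a minimal sketch on a hypothetical sample frame, not the notebook's data.

```python
import calendar

import pandas as pd

# 'Jan' ... 'Dec' in calendar order (index 0 of month_abbr is an empty string).
month_order = list(calendar.month_abbr)[1:]

# Hypothetical sample values, for illustration only.
sample = pd.DataFrame({
    "month": ["Mar", "Jan", "Feb", "Jan"],
    "rate": [8.0, 5.0, 7.0, 6.0],
})

# An ordered categorical makes groupby sort by calendar position, not alphabetically.
sample["month"] = pd.Categorical(sample["month"], categories=month_order, ordered=True)

# observed=True keeps only the months that actually occur in the data.
monthly = sample.groupby("month", observed=True)["rate"].mean().reset_index()
print(monthly)
```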

Bar plot of unemployment rate and labour participation rate

In [None]:
month = data.month
unemployment_rate = data['estimated unemployment rate']
labour_participation_rate = data['estimated labour participation rate']

fig = go.Figure()

fig.add_trace(go.Bar(x = month,y = unemployment_rate,name = 'Unemployment Rate'))
fig.add_trace(go.Bar(x = month,y = labour_participation_rate,name = 'Labour Participation Rate'))

fig.update_layout(title = 'Unemployment Rate and Labour Participation',
                     xaxis = {'categoryorder':'array','categoryarray':['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct']}      )
fig.show()

Bar plot of the estimated number of employed people in each month

In [None]:
import plotly.express as px

In [None]:
fig = px.bar(data,x='month',y='estimated employed',color='month',
            category_orders ={'month':['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct']},
            title='Estimated employed people from Jan 2020 to Oct 2020')
fig.show()

## State wise Analysis

In [None]:
# Mean of each numeric indicator per state
state = df.groupby(['state'])[['estimated unemployment rate','estimated employed','estimated labour participation rate']].mean().reset_index()

In [None]:
# Box plot

fig = px.box(data_frame=df,x='state',y='estimated unemployment rate',color='state',title='Unemployment rate')
fig.update_layout(xaxis={'categoryorder':'total descending'})
fig.show()

In [None]:
# average unemployment rate bar plot

fig = px.bar(state,x='state',y='estimated unemployment rate',color='state',title='Average unemployment rate (State)')
fig.update_layout(xaxis={'categoryorder':'total descending'})
fig.show()

> Haryana and Tripura had the highest average unemployment rates.

> Meghalaya had the lowest average unemployment rate.

In [None]:
# Bar plot Unemployment Rate (monthly)

fig = px.bar(df,x='state',y='estimated unemployment rate',animation_frame='month',color='state',
            title='Unemployment rate from Jan 2020 to Oct 2020(State)')

fig.update_layout(xaxis={'categoryorder':'total descending'})
fig.show()


**Monthly unemployment rate**

In [None]:
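# Note: the column named 'longitude' is passed as lat and 'latitude' as lon, exactly as the
# positional arguments did. This plots correctly only if the dataset's coordinate columns
# are labelled the other way round; check the values with df.head() before relying on it.
fig=px.scatter_geo(df,lat='longitude',lon='latitude',color='state',
                  hover_name='state',size='estimated unemployment rate',
                  animation_frame='month',scope='asia',title='Impact of lockdown on employment in India')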

fig.layout.updatemenus[0].buttons[0].args[1]['frame']['duration'] =2000
fig.update_geos(lataxis_range=[5,40],lonaxis_range=[65,100],oceancolor='lightblue',
               showocean=True)

fig.show()

## Regional Analysis

In [None]:
df.region.unique()

In [None]:
# numeric data grouped by region

region = df.groupby(['region'])[['estimated unemployment rate','estimated employed','estimated labour participation rate']].mean().reset_index()

In [None]:
# Scatter matrix of the three numeric indicators, colored by region

fig= px.scatter_matrix(df,dimensions=['estimated unemployment rate','estimated employed','estimated labour participation rate'],color='region')
fig.show()

In [None]:
# Average Unemployment Rate

fig = px.bar(region,x='region',y='estimated unemployment rate',color='region',title='Average unemployment rate(region)')
fig.update_layout(xaxis={'categoryorder':'total descending'})
fig.show()

In [None]:
fig = px.bar(df,x='region',y='estimated unemployment rate',animation_frame='month',color='state',
            title='Unemployment rate from Jan 2020 to Oct 2020')

fig.update_layout(xaxis={'categoryorder':'total descending'})
fig.layout.updatemenus[0].buttons[0].args[1]['frame']['duration'] =2000

fig.show()

In [None]:
unemployment =df.groupby(['region','state'])['estimated unemployment rate'].mean().reset_index()
unemployment.head()

In [None]:
fig = px.sunburst(unemployment,path=['region','state'],values='estimated unemployment rate',
                 title ='Unemployment rate in state and region',height=600)
fig.show()

## Unemployment rate before and after Lockdown

In [None]:
# Split the data into pre-lockdown (Jan-Mar) and post-lockdown (Apr-Jun) periods

before_lockdown = df[(df['month_int'] >= 1) & (df['month_int'] < 4)]
after_lockdown = df[(df['month_int'] >= 4) & (df['month_int'] <= 6)]

In [None]:
# Mean unemployment rate per state in each period
af_lockdown = after_lockdown.groupby('state')['estimated unemployment rate'].mean().reset_index()

lockdown = before_lockdown.groupby('state')['estimated unemployment rate'].mean().reset_index()
# Join on 'state' so the two periods stay aligned even if a state is missing from one of them
lockdown = lockdown.merge(af_lockdown, on='state', suffixes=(' before', ' after'))

lockdown.columns = ['state','unemployment rate before lockdown','unemployment rate after lockdown']
lockdown.head()

In [None]:
# Percentage change in unemployment rate after lockdown, relative to the pre-lockdown rate

lockdown['rate change in unemployment'] = round((lockdown['unemployment rate after lockdown'] - lockdown['unemployment rate before lockdown'])
                                              / lockdown['unemployment rate before lockdown'] * 100, 2)
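
As a worked check of a percentage-change calculation, the sketch below joins two small per-state tables on the state name and computes the change relative to the earlier rate, that is `(after - before) / before * 100`. The state labels, rates, and column names are hypothetical and chosen so the result is easy to verify by hand.

```python
import pandas as pd

# Hypothetical per-state mean rates (%); note the rows are in a different order.
before = pd.DataFrame({"state": ["A", "B"], "rate": [5.0, 10.0]})
after = pd.DataFrame({"state": ["B", "A"], "rate": [15.0, 10.0]})

# Merging on 'state' aligns rows by name rather than by position.
merged = before.merge(after, on="state", suffixes=("_before", "_after"))

# Percentage change relative to the earlier rate.
merged["pct_change"] = (
    (merged["rate_after"] - merged["rate_before"]) / merged["rate_before"] * 100
).round(2)

print(merged)
```

State A goes from 5% to 10%, a 100% increase, and state B goes from 10% to 15%, a 50% increase.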

In [None]:
fig = px.bar(lockdown,x='state',y='rate change in unemployment',color='rate change in unemployment',
            title='Percentage change in Unemployment rate in each state after lockdown',template='ggplot2')
fig.update_layout(xaxis={'categoryorder':'total ascending'})
fig.show()