Unemployment refers to the share of the labor force that is without work but available for and seeking employment. The Centre for Monitoring Indian Economy (CMIE), a private organization, estimates India's unemployment rate at around 7.45% at present: 7.93% in urban India and 7.44% in rural India. The reasons for this unemployment situation include a high population, a defective education system, an excessive burden on agriculture, low productivity in the agricultural sector combined with a lack of alternative opportunities for agricultural workers, and an unskilled workforce. This dataset contains the unemployment rate of all the states in India.
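
As a quick illustration of the definition above, the rate is the number of unemployed people divided by the labour force (employed plus unemployed), expressed as a percentage. The function name and the figures in this sketch are made up for illustration and do not come from the CMIE data.

```python
def unemployment_rate(unemployed: float, employed: float) -> float:
    """Return the unemployment rate in percent.

    The labour force is taken as employed + unemployed, where "unemployed"
    means without work but available for and seeking it.
    """
    labour_force = employed + unemployed
    if labour_force <= 0:
        raise ValueError("labour force must be positive")
    return 100.0 * unemployed / labour_force


# Hypothetical figures, for illustration only
rate = unemployment_rate(unemployed=75, employed=925)
print(f"{rate:.2f}%")
```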

### Unemployment in India - Exploratory Data Analysis

Importing libraries

In [None]:
import pandas as pd
import numpy as np

import warnings
warnings.filterwarnings('ignore')

import matplotlib.pyplot as plt
import seaborn as sns

## Load Data

In [None]:
df=pd.read_csv('/kaggle/input/unemployment-in-india/Unemployment in India.csv')

## Data Exploration

In [None]:
print('Shape of the dataset : ',df.shape)

In [None]:
df.head()

In [None]:
df.columns

In [None]:
df.info()

In [None]:
df.describe().transpose()

In [None]:
#checking for null values
df.isnull().sum()

No null values present in the dataset

In [None]:
#remove unwanted space before column names
df.columns=df.columns.str.strip()
df.columns
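
Stray spaces in column names matter because a lookup such as `df['Date']` only works once the header matches exactly. A minimal sketch on a toy frame (the data is hypothetical; only the column name follows the dataset):

```python
import pandas as pd

# Toy frame with made-up values; one header carries a leading space
toy = pd.DataFrame({
    "Region": ["A", "B"],
    " Estimated Unemployment Rate (%)": [5.0, 7.5],
})

# Before stripping, the clean name is not found among the columns
has_clean_name_before = "Estimated Unemployment Rate (%)" in toy.columns

# Strip leading/trailing whitespace from every column name
toy.columns = toy.columns.str.strip()
has_clean_name_after = "Estimated Unemployment Rate (%)" in toy.columns

print(has_clean_name_before, has_clean_name_after)
```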

In [None]:
#the Date column is currently of object type, so convert it to datetime
#dayfirst=True assumes the dates are written day-month-year (e.g. 31-05-2019)
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
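
Day-month-year strings can be misread if the parser assumes month-first order. The following is a minimal sketch of parsing with an explicit format, assuming dates stored like `31-05-2019` with possible stray spaces. That layout is an assumption about the file, and the values below are made up.

```python
import pandas as pd

# Hypothetical raw values with leading spaces, written day-month-year
raw = pd.Series([" 31-05-2019", " 30-06-2019", " 05-07-2019"])

# Strip whitespace, then parse with an explicit day-month-year format
parsed = pd.to_datetime(raw.str.strip(), format="%d-%m-%Y")

print(parsed.dt.day.tolist())
print(parsed.dt.month.tolist())
```

With an explicit format, an ambiguous value such as `05-07-2019` is read as 5 July rather than 7 May.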

In [None]:
import datetime as dt  #optional: the .dt accessor used on the Date column is built into pandas

In [None]:
#creating new columns for better analysis
#extracting the calendar year and month from the Date column
#(.dt.year is used instead of the ISO year, which can differ at year boundaries)
df['year']=df['Date'].dt.year
df['month']=df['Date'].dt.month
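
When extracting a year from dates, note that the ISO week-based year and the calendar year can disagree for dates close to 1 January. A small sketch with two illustrative dates (chosen for the example, not taken from the dataset):

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(["2019-12-31", "2020-01-31"]))

# Calendar year, as printed on the date itself
calendar_year = dates.dt.year.tolist()

# ISO year: 31 December 2019 falls in ISO week 1 of 2020
iso_year = dates.dt.isocalendar().year.tolist()

print(calendar_year)
print(iso_year)
```

For yearly summaries of monthly data, the calendar year is usually the intended grouping.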

In [None]:
#dropping rows with null values
df=df.dropna()

In [None]:
#final shape of the dataset
df.shape

In [None]:
#inspect the values of Frequency, then drop the column as it is not needed for the analysis
print(df['Frequency'].unique())
df=df.drop(['Frequency'],axis=1)

In [None]:
df.head()

## EDA

In [None]:
#correlation heatmap (numeric columns only; text columns such as Region cannot be correlated)
plt.figure(figsize=(12,10))
sns.heatmap(df.corr(numeric_only=True),annot=True,cmap='Blues_r')
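
A correlation matrix is only defined for numeric columns, so text columns have to be left out before it is computed. A minimal sketch on a toy frame (column names and values are hypothetical):

```python
import pandas as pd

# Made-up data: 'employed' falls linearly as 'rate' rises
toy = pd.DataFrame({
    "Region": ["A", "A", "B", "B"],
    "rate": [5.0, 6.0, 9.0, 10.0],
    "employed": [100, 90, 60, 50],
})

# Keep numeric columns only, then compute pairwise Pearson correlations
corr = toy.select_dtypes(include="number").corr()

print(corr.round(2))
```

Here `select_dtypes` is one way to restrict the frame; passing `numeric_only=True` to `corr` is an alternative in recent pandas versions.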

#### Unemployment rate by Area

In [None]:
count_by_area=df['Area'].value_counts().rename_axis('Area').reset_index(name='Count')
count_by_area.sort_values(by='Count',ascending=False)

In [None]:
sns.countplot(x='Area',data=df,palette=['pink','silver']) 
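
The counts above show how many records fall in each area. To compare the unemployment rate itself, one option is a grouped mean. A minimal sketch on made-up numbers (the toy frame is hypothetical; only the column names follow the dataset):

```python
import pandas as pd

# Hypothetical records, two per area
toy = pd.DataFrame({
    "Area": ["Rural", "Rural", "Urban", "Urban"],
    "Estimated Unemployment Rate (%)": [4.0, 6.0, 8.0, 12.0],
})

# Average unemployment rate within each area
mean_by_area = toy.groupby("Area")["Estimated Unemployment Rate (%)"].mean()

print(mean_by_area)
```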

#### Unemployment rate by state

In [None]:
count_by_region=df['Region'].value_counts().rename_axis('State').reset_index(name='Count')
count_by_region.style.background_gradient(cmap='Blues')

In [None]:
sns.countplot(y='Region',data=df)

In [None]:
plt.title('Unemployment Rate based on Area')
sns.scatterplot(y=df['Region'],x=df['Estimated Unemployment Rate (%)'],hue=df['Area'])

#### States with High Unemployment Rate

In [None]:
df[df['Estimated Unemployment Rate (%)']>50].Region.value_counts()

In [None]:
plt.title('States with High Unemployment Rate')
#plot only the filtered rows so that x, y and hue all come from the same frame
df1=df[df['Estimated Unemployment Rate (%)']>50]
sns.scatterplot(data=df1,y='Region',x='Estimated Unemployment Rate (%)',hue='Area')

In [None]:
#number of records with an unemployment rate above 40%, by area
df[df['Estimated Unemployment Rate (%)']>40].Area.value_counts()

#### States with High Estimated Labour Participation Rate

In [None]:
df[df['Estimated Labour Participation Rate (%)']>60].Region.value_counts()

In [None]:
#states with records of more than 7.5 million estimated employed
df[df['Estimated Employed']>7500000].Region.value_counts()

In [None]:
df=df.drop(['month'],axis=1)
#average the numeric columns per year; numeric_only skips text and date columns
by_year=df.groupby(['year']).mean(numeric_only=True).round()
by_year.style.background_gradient(cmap='Blues_r')

In [None]:
df2 = pd.read_csv("/kaggle/input/unemployment-in-india/Unemployment_Rate_upto_11_2020.csv")
df2.head()

In [None]:
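import plotly.express as px
#column names in this file keep their leading spaces
#'Region.1' is the name pandas gives to a second column called 'Region'; it forms the inner ring of the chart's hierarchy, with 'Region' as the outer leaves
unemployment = df2[['Region', 'Region.1', ' Estimated Unemployment Rate (%)', ' Estimated Employed', ' Estimated Labour Participation Rate (%)']]
#mean unemployment rate for each ('Region.1', 'Region') pair
unemployment = unemployment.groupby(['Region.1', 'Region'])[' Estimated Unemployment Rate (%)'].mean().reset_index()
fig = px.sunburst(unemployment, path=['Region.1', 'Region'], values=' Estimated Unemployment Rate (%)', title='Unemployment rate in each State and Region',height=850)
fig.show()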