# Crime Against Women

This data is collated from https://data.gov.in. It has state-wise data on crimes committed against women between 2001 and 2021, including Rape, Kidnapping and Abduction, and Dowry Deaths.


NOTE:

*   Andhra Pradesh and Telangana split to form separate states in 2014. Data for Telangana is reported separately from 2014 to 2021.
*   Even though Jammu & Kashmir and Ladakh split to form different union territories, their data is combined from 2019 to 2021.



### Importing Libraries




In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go

### Reading Dataset

In [None]:
df = pd.read_csv("CrimesOnWomenData.csv")

df.iloc[:,1:].head()

Unnamed: 0,State,Year,Rape,Kidnapping & Assault,Dowry Deaths,Assault against Women,Assault of Modesty,Domestic Violece,Human trafficking
0,ANDHRA PRADESH,2001,871,765,420,3544,2271,5791,7
1,ARUNACHAL PRADESH,2001,33,55,0,78,3,11,0
2,ASSAM,2001,817,1070,59,850,4,1248,0
3,BIHAR,2001,888,518,859,562,21,1558,83
4,CHHATTISGARH,2001,959,171,70,1763,161,840,0


### Cleaning Data

In [None]:
df['State'] = df['State'].str.title()
df['State'].unique()

array(['Andhra Pradesh', 'Arunachal Pradesh', 'Assam', 'Bihar',
       'Chhattisgarh', 'Goa', 'Gujarat', 'Haryana', 'Himachal Pradesh',
       'Jammu & Kashmir', 'Jharkhand', 'Karnataka', 'Kerala',
       'Madhya Pradesh', 'Maharashtra', 'Manipur', 'Meghalaya', 'Mizoram',
       'Nagaland', 'Odisha', 'Punjab', 'Rajasthan', 'Sikkim',
       'Tamil Nadu', 'Tripura', 'Uttar Pradesh', 'Uttarakhand',
       'West Bengal', 'A & N Islands', 'Chandigarh', 'D & N Haveli',
       'Daman & Diu', 'A', 'Puducherry', 'Lakshadweep', 'Telangana',
       'D&N Haveli', 'Delhi Ut'], dtype=object)

In [None]:
replacements = {'D & N Haveli': 'D & D', 'D&N Haveli': 'D & D', 'Daman & Diu': 'D & D', 'A':'Lakshadweep'}
df['State'] = df['State'].replace(replacements)

In [None]:
df['State'].unique()
# len(df['State'].unique())

array(['Andhra Pradesh', 'Arunachal Pradesh', 'Assam', 'Bihar',
       'Chhattisgarh', 'Goa', 'Gujarat', 'Haryana', 'Himachal Pradesh',
       'Jammu & Kashmir', 'Jharkhand', 'Karnataka', 'Kerala',
       'Madhya Pradesh', 'Maharashtra', 'Manipur', 'Meghalaya', 'Mizoram',
       'Nagaland', 'Odisha', 'Punjab', 'Rajasthan', 'Sikkim',
       'Tamil Nadu', 'Tripura', 'Uttar Pradesh', 'Uttarakhand',
       'West Bengal', 'A & N Islands', 'Chandigarh', 'D & D',
       'Lakshadweep', 'Puducherry', 'Telangana', 'Delhi Ut'], dtype=object)
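
The relabelling step can be checked on a small frame. This is a minimal sketch, assuming a toy `State` column: the `toy` frame and its rows are illustrative and not read from the CSV, while the mapping entries are the ones used above.

```python
import pandas as pd

# Upper-case variants of labels listed above (illustrative rows, not read from the CSV).
toy = pd.DataFrame({"State": ["D & N HAVELI", "D&N HAVELI", "DAMAN & DIU", "GOA"]})

# Title-case first so that the replacement keys match exactly.
toy["State"] = toy["State"].str.title()

replacements = {"D & N Haveli": "D & D", "D&N Haveli": "D & D", "Daman & Diu": "D & D"}
toy["State"] = toy["State"].replace(replacements)

# nunique() gives a quick count of distinct labels after the merge.
print(toy["State"].unique(), toy["State"].nunique())
```

Comparing `nunique()` before and after the replacement is a quick way to confirm that the variant spellings were merged.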

### Analysing Data

In [None]:
df.shape

(736, 10)

In [None]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 736 entries, 0 to 735
Data columns (total 10 columns):
 #   Column                 Non-Null Count  Dtype 
---  ------                 --------------  ----- 
 0   Unnamed: 0             736 non-null    int64 
 1   State                  736 non-null    object
 2   Year                   736 non-null    int64 
 3   Rape                   736 non-null    int64 
 4   Kidnapping & Assault   736 non-null    int64 
 5   Dowry Deaths           736 non-null    int64 
 6   Assault against Women  736 non-null    int64 
 7   Assault of Modesty     736 non-null    int64 
 8   Domestic Violece       736 non-null    int64 
 9   Human trafficking      736 non-null    int64 
dtypes: int64(9), object(1)
memory usage: 57.6+ KB


In [None]:
df.describe()

Unnamed: 0.1,Unnamed: 0,Year,Rape,Kidnapping & Assault,Dowry Deaths,Assault against Women,Assault of Modesty,Domestic Violece,Human trafficking
count,736.0,736.0,736.0,736.0,736.0,736.0,736.0,736.0,736.0
mean,367.5,2011.149457,727.855978,1134.54212,215.692935,1579.115489,332.722826,2595.078804,28.744565
std,212.609188,6.053453,977.024945,1993.536828,424.927334,2463.962518,806.024551,4042.004953,79.99966
min,0.0,2001.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,183.75,2006.0,35.0,24.75,1.0,34.0,3.0,13.0,0.0
50%,367.5,2011.0,348.5,290.0,29.0,387.5,31.0,678.5,0.0
75%,551.25,2016.0,1069.0,1216.0,259.0,2122.25,277.5,3545.0,15.0
max,735.0,2021.0,6337.0,15381.0,2524.0,14853.0,9422.0,23278.0,549.0


### Checking for Missing Values

In [None]:
df.isnull().sum()

Unnamed: 0,0
Unnamed: 0,0
State,0
Year,0
Rape,0
Kidnapping & Assault,0
Dowry Deaths,0
Assault against Women,0
Assault of Modesty,0
Domestic Violece,0
Human trafficking,0


## Crimes over the years in India

In [None]:
df.columns.tolist()

['Unnamed: 0',
 'State',
 'Year',
 'Rape',
 'Kidnapping & Assault',
 'Dowry Deaths',
 'Assault against Women',
 'Assault of Modesty',
 'Domestic Violece',
 'Human trafficking']

In [None]:
def crimes_by_different_order(dataframe,column_name):
    """Fill `dataframe` with the total of each crime column of the global `df`, grouped by `column_name`."""
    # The crime columns start at position 3 (after 'Unnamed: 0', 'State' and 'Year').
    for column in list(df.columns)[3:]:
        dataframe[column]=df.groupby([column_name])[column].sum()
    return dataframe

total_crimes_in_years_df=pd.DataFrame()
total_crimes_in_years_df=crimes_by_different_order(total_crimes_in_years_df,"Year")
total_crimes_in_years_df.reset_index(inplace=True)
total_crimes_in_years_df

Unnamed: 0,Year,Rape,Kidnapping & Assault,Dowry Deaths,Assault against Women,Assault of Modesty,Domestic Violece,Human trafficking
0,2001,15694,13681,6738,33622,9656,49032,114
1,2002,15970,13613,6687,33497,10027,49102,76
2,2003,15357,12499,6078,32450,12220,49492,46
3,2004,17682,14697,6900,33966,9871,56867,89
4,2005,17701,14644,6673,33413,9759,56995,148
5,2006,18725,16348,7481,35899,9822,61400,67
6,2007,20139,19249,7955,37866,10783,74143,61
7,2008,21001,21803,8043,39802,12084,79957,67
8,2009,20928,24086,8242,38159,10891,88263,48
9,2010,21665,28055,8248,40012,9881,92637,36
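
The helper above builds the table one column at a time. A roughly equivalent single `groupby` call is sketched below on a toy frame. The names `toy`, `crime_cols` and `by_year` are hypothetical, and the rows are copied from the sample outputs in this notebook (Andhra Pradesh 2001, Bihar 2001 and 2002), restricted to two crime columns.

```python
import pandas as pd

# Rows copied from the sample outputs in this notebook (two states, two crime columns).
toy = pd.DataFrame({
    "State": ["Andhra Pradesh", "Bihar", "Bihar"],
    "Year": [2001, 2001, 2002],
    "Rape": [871, 888, 1040],
    "Dowry Deaths": [420, 859, 927],
})

crime_cols = ["Rape", "Dowry Deaths"]  # hypothetical name for the count columns

# One groupby sums every crime column per year; reset_index turns Year back into a column.
by_year = toy.groupby("Year")[crime_cols].sum().reset_index()
print(by_year)
```

Selecting the crime columns explicitly also keeps non-count columns such as `State` out of the aggregation.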


In [None]:
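# Sum only the crime columns; including "Year" would add each year's value to its total
total_crimes_in_years_df['Total Number of Cases']=total_crimes_in_years_df.drop(columns="Year").sum(axis=1)
total_crimes_in_years_df["Total Number of Cases"].sum()

4867722

*More than 48 lakh cases registered between 2001 and 2021*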
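
One thing worth checking in a row-wise total is which columns take part in it. The sketch below uses the first two years of the yearly table above (two crime columns only) in a hypothetical `toy` frame. It shows that `sum(axis=1)` also adds an integer `Year` column unless that column is excluded.

```python
import pandas as pd

# First two rows of the yearly table above, restricted to two crime columns.
toy = pd.DataFrame({"Year": [2001, 2002], "Rape": [15694, 15970], "Dowry Deaths": [6738, 6687]})

with_year = toy.sum(axis=1)                          # Year is numeric, so it is summed too
without_year = toy.drop(columns="Year").sum(axis=1)  # crime columns only

print(with_year.tolist())     # each value is inflated by the year itself
print(without_year.tolist())
```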

Function to plot graphs

In [None]:
#using plotly.graph_objects

def plot_figure(title: str, crime: str,ylabel: str,color: str="red"):
    """This function generates a bar graph using total_crimes_in_years_df.
    Parameters:
    title: title of the bar graph,
    crime: column name in the dataframe,
    ylabel: y-axis label of the bar graph,
    color: bar color, any value accepted by plotly.graph_objects"""
    fig = go.Figure(data=[go.Bar(
            x=total_crimes_in_years_df["Year"], y=total_crimes_in_years_df[crime],
            text=total_crimes_in_years_df[crime],
        )])
    fig.update_traces(texttemplate='%{text:.2s}', textposition='outside',marker_color=color)
    fig.update_layout(title=title,
                       xaxis = dict(title='Years',
                                    tickmode = 'linear',
                                    tickangle=-30 ),
                   yaxis_title=ylabel,uniformtext_minsize=8, uniformtext_mode='show')

    fig.show()

In [None]:
# Rape cases in India in 2001-2021
plot_figure("Rape cases in India in 2001-2021","Rape","Cases of Rape in India", color='blue')

**Analysis:**

*   **Increasing Trend:** There is a clear upward trend in the number of rape cases in India from 2001 to 2014. The number of cases generally increases each year, with a few minor fluctuations.
*   **Significant Growth:** The increase in rape cases is substantial, with the number of cases rising from around 16,000 in 2001 to over 39,000 in 2016.





In [None]:
# Kidnapping and Assault vs years
plot_figure("Kidnapping and Assault cases in India in 2001-2021","Kidnapping & Assault","Cases of Kidnapping and Assault in India", color='crimson')

**Analysis:**

*   **Steady Increase:** The graph reveals a consistent upward trend in the number of Kidnapping and Assault cases in India from 2001 to 2021.
*   **Significant Growth:** The number of cases has nearly tripled during this period, indicating a serious concern.
*   **Fluctuations:** While the overall trend is upward, there are some year-to-year fluctuations. This could be attributed to various factors like changes in reporting mechanisms, socio-economic conditions, and law enforcement efforts.
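
A growth statement like the one above can be backed by a simple ratio. This is a minimal sketch using the 2001 and 2010 "Kidnapping & Assault" totals from the yearly table shown earlier. Later years are not visible in that preview, so they are not used here, and the variable names are hypothetical.

```python
# Totals taken from the yearly table shown earlier in this notebook.
cases_2001 = 13681
cases_2010 = 28055

growth_multiple = cases_2010 / cases_2001        # how many times larger
percent_increase = (growth_multiple - 1) * 100   # same change as a percentage

print(f"{growth_multiple:.2f}x, {percent_increase:.0f}% increase")
```

Repeating the same calculation with the 2021 total would give the multiple for the full period.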

In [None]:
# Dowry Deaths
plot_figure("Dowry Death cases in India in 2001-2021","Dowry Deaths","Cases of Dowry Deaths in India", color='navy')

In [None]:
# Assault
plot_figure("Assault against women cases in India in 2001-2021","Assault against Women","Cases of Assault against women in India", color='rgb(13, 154, 218)')

**Analysis**
*  **Increasing Trend:** The graph reveals a clear upward trend in the number of Assault Against Women cases in India from 2001 to 2021.
*  **Significant Growth:** The number of cases has increased significantly over the period, highlighting a serious issue.
*  **Fluctuations:** While the overall trend is upward, there are some year-to-year fluctuations. This could be attributed to various factors like changes in reporting mechanisms, socio-economic conditions, and law enforcement efforts.
*  **Data Gap:** The missing data for 2011 could potentially skew the overall trend. Further investigation is needed to understand the reasons for this data gap and its impact on the analysis.

In [None]:
# Assault of Modesty
plot_figure("Assault of Modesty cases in India in 2001-2021","Assault of Modesty","Cases of Assault of Modesty in India", color='rgb(123, 19, 60)')

**Analysis**
*  **Increasing Trend:** The graph shows a general upward trend in the number of Assault of Modesty cases in India from 2001 to 2021.
*  **Significant Spike:** There's a notable spike in cases around 2016, followed by a slight decline in subsequent years.
*  **Fluctuations:** While the overall trend is increasing, there are some year-to-year fluctuations. This could be attributed to various factors like changes in reporting mechanisms, socio-economic conditions, and law enforcement efforts.

In [None]:
# Domestic violence
plot_figure("Domestic Violence cases in India in 2001-2021","Domestic Violece","Cases of Domestic Violence in India", color='rgb(222, 148, 54)')

**Analysis**
* **Increasing Trend:** The graph reveals a clear upward trend in the number of Domestic Violence cases in India from 2001 to 2021.
* **Significant Growth:** The number of cases has increased substantially over the period, highlighting a serious societal issue.
* **Fluctuations:** While the overall trend is upward, there are some year-to-year fluctuations. This could be attributed to various factors like changes in reporting mechanisms, socio-economic conditions, and law enforcement efforts.

In [None]:
# Human Trafficking
plot_figure("Human Trafficking cases in India in 2001-2021","Human trafficking","Cases of Human Trafficking in India", color='lightsalmon')


**Analysis**

*  **Increasing Trend:** The graph reveals a clear upward trend in the number of Human Trafficking cases in India from 2001 to 2021.
*  **Significant Growth:** The number of cases has increased substantially over the period, highlighting a serious societal issue.
*  **Fluctuations:** While the overall trend is upward, there are some year-to-year fluctuations. This could be attributed to various factors like changes in reporting mechanisms, socio-economic conditions, and law enforcement efforts.

## Year wise Analysis of Crimes Against Women in India

### Year wise Crime Against Women in India

In [None]:
# Total cases per year (all crimes combined)

fig = go.Figure(data=[go.Scatter(
            x=total_crimes_in_years_df["Year"], y=total_crimes_in_years_df["Total Number of Cases"],
            text=total_crimes_in_years_df["Total Number of Cases"],
        )])
fig.update_layout(title="Year wise crime against women in India(including States & UT)",
                       xaxis = dict(title='Years',
                                    tickmode = 'linear',
                                    tickangle=-30 ),
                      #yaxis=dict(title=, tickmode = 'linear', tick0 = 0.0,dtick = 0.25)
                   yaxis_title="Cases",yaxis_tickformat = 's',uniformtext_minsize=8, uniformtext_mode='show')

fig.show()


### Rate of Change of Different Crimes


In [None]:
#Rate of change of different crimes over time

fig = go.Figure()
color=["blue","crimson","navy","rgb(13, 154, 218)",'rgb(123, 19, 60)', 'rgb(222, 148, 54)','lightsalmon']
dash=["solid","dash","dot","solid","dash","dot", "solid"]
j=0
for i in total_crimes_in_years_df.columns:
    if i=="Total Number of Cases":
        break
    if i == "Year":
        continue
    fig.add_trace(go.Scatter(x=total_crimes_in_years_df.Year, y=total_crimes_in_years_df[i], name=i,
                         line=dict(color=color[j % len(color)], width=4, dash=dash[j % len(dash)])))
    j=j+1
# Edit the layout
fig.update_layout(title='Rate of change of different crime over time',
                   xaxis_title='Years',
                   yaxis_title='Spread of Cases')

fig.show()

In [None]:
crimes_in_years_df=total_crimes_in_years_df.drop("Total Number of Cases", axis=1)
crimes_in_years_df.set_index("Year", inplace = True)
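
The line chart in this section plots yearly counts rather than a rate. If an actual year-over-year rate of change is wanted, `pct_change` is one option. This is a minimal sketch on the first four "Rape" totals from the yearly table; the names `toy` and `yoy_percent` are hypothetical.

```python
import pandas as pd

# First four yearly "Rape" totals from the table shown earlier.
toy = pd.Series([15694, 15970, 15357, 17682], index=[2001, 2002, 2003, 2004], name="Rape")

# pct_change compares each year with the previous one; the first year has no predecessor (NaN).
yoy_percent = (toy.pct_change() * 100).round(2)
print(yoy_percent)
```

Applied to `crimes_in_years_df`, the same call would work column by column, since `Year` is the index there.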

### Highest and Lowest Reported Crimes

In [None]:
pd.DataFrame(crimes_in_years_df.sum(axis=0),columns=['Count']).sort_values(by='Count',ascending=False)

Unnamed: 0,Count
Domestic Violece,1909978
Assault against Women,1162229
Kidnapping & Assault,835023
Rape,535702
Assault of Modesty,244884
Dowry Deaths,158750
Human trafficking,21156


In [None]:
fig = px.pie(pd.DataFrame(crimes_in_years_df.sum(axis=0),columns=['Count']), values='Count', names=pd.DataFrame(crimes_in_years_df.sum(axis=0)).index, title='Percentage of Each Crime between 2001 - 2021')
fig.show()
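
The percentages in the pie chart are each crime's share of the grand total. This is a minimal sketch that computes those shares from the counts in the table above; `counts` and `share_percent` are hypothetical names.

```python
import pandas as pd

# Counts copied from the table above (column names kept exactly as in the dataset).
counts = pd.Series({
    "Domestic Violece": 1909978,
    "Assault against Women": 1162229,
    "Kidnapping & Assault": 835023,
    "Rape": 535702,
    "Assault of Modesty": 244884,
    "Dowry Deaths": 158750,
    "Human trafficking": 21156,
})

# Each count divided by the grand total, expressed as a percentage.
share_percent = (counts / counts.sum() * 100).round(2)
print(share_percent)
```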

## State/UT Analysis

### Total Crimes Reported across all States in 2001-2021

In [None]:
state_ut_crimes_df=pd.DataFrame()
state_ut_crimes_df=crimes_by_different_order(state_ut_crimes_df,"State")
state_ut_crimes_df

Unnamed: 0_level_0,Rape,Kidnapping & Assault,Dowry Deaths,Assault against Women,Assault of Modesty,Domestic Violece,Human trafficking
State,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
A & N Islands,424,363,13,558,135,365,10
Andhra Pradesh,23424,21707,8165,92084,54684,188511,2548
Arunachal Pradesh,1153,1272,4,1584,97,705,7
Assam,32190,71462,2757,47115,2698,135415,343
Bihar,21006,77687,24428,9953,800,55187,825
Chandigarh,795,1547,68,954,281,1689,7
Chhattisgarh,25178,16536,1797,33310,4382,16381,103
D & D,2962,8873,286,7619,898,8261,41
Delhi Ut,13065,31251,1106,26039,6046,27119,103
Goa,1019,720,23,1475,475,356,111


### Top 10 States/Union Territories with Highest number of reported crimes

In [None]:
pd.DataFrame(state_ut_crimes_df.sum(axis=1),columns=['Total Cases'] ).sort_values(by='Total Cases',ascending=False).head(10)

Unnamed: 0_level_0,Total Cases
State,Unnamed: 1_level_1
Uttar Pradesh,529734
Madhya Pradesh,413157
West Bengal,409242
Andhra Pradesh,391123
Rajasthan,379264
Maharashtra,365632
Assam,291980
Kerala,202986
Odisha,194774
Bihar,189886


### Top 10 States/Union Territories with Lowest number of reported crimes

In [None]:
pd.DataFrame(state_ut_crimes_df.sum(axis=1),columns=['Total Cases'] ).sort_values(by='Total Cases', ascending=True).head(10)

Unnamed: 0_level_0,Total Cases
State,Unnamed: 1_level_1
Lakshadweep,72
Puducherry,1793
A & N Islands,1868
Mizoram,2844
Goa,4179
Meghalaya,4294
Manipur,4390
Arunachal Pradesh,4822
Chandigarh,5341
Sikkim,7454
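
The same rankings can also be obtained with `nlargest` and `nsmallest`. This is a sketch on a few totals copied from the two tables above; `totals` is a hypothetical name.

```python
import pandas as pd

# A few totals copied from the two ranking tables above.
totals = pd.Series({
    "Uttar Pradesh": 529734,
    "Madhya Pradesh": 413157,
    "West Bengal": 409242,
    "Lakshadweep": 72,
    "Puducherry": 1793,
    "A & N Islands": 1868,
}, name="Total Cases")

print(totals.nlargest(2))   # highest totals first
print(totals.nsmallest(2))  # lowest totals first
```

These are raw counts, so large states rank high partly because of their population.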


## Which crime was reported the most in which state

In [None]:
pd.DataFrame(state_ut_crimes_df.idxmax(),columns=['State'])

Unnamed: 0,State
Rape,Madhya Pradesh
Kidnapping & Assault,Uttar Pradesh
Dowry Deaths,Uttar Pradesh
Assault against Women,Madhya Pradesh
Assault of Modesty,Andhra Pradesh
Domestic Violece,West Bengal
Human trafficking,Tamil Nadu


## Which crime was reported the least in which state

In [None]:
pd.DataFrame(state_ut_crimes_df.idxmin(),columns=['State'])

Unnamed: 0,State
Rape,Lakshadweep
Kidnapping & Assault,Lakshadweep
Dowry Deaths,Lakshadweep
Assault against Women,Lakshadweep
Assault of Modesty,Lakshadweep
Domestic Violece,Lakshadweep
Human trafficking,Lakshadweep
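
`idxmax` and `idxmin` return, for each column, the index label of the row holding the largest or smallest value. This is a sketch on three rows and two columns copied from the state-wise table above; `toy` is a hypothetical name.

```python
import pandas as pd

# Three states and two crime columns copied from the state-wise table above.
toy = pd.DataFrame(
    {"Rape": [23424, 32190, 21006], "Dowry Deaths": [8165, 2757, 24428]},
    index=["Andhra Pradesh", "Assam", "Bihar"],
)

print(toy.idxmax())  # state with the highest count in each column
print(toy.idxmin())  # state with the lowest count in each column
```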


## Function to Analyse States

In [None]:
def analyze_state(state_name):
    """Show a pie chart of the crime distribution for one state/UT."""
    try:
        # .loc raises KeyError when the name is not in the index
        state_row = state_ut_crimes_df.loc[state_name]
        fig = px.pie(values=state_row.values, names=state_row.index,
                     title='Total Crime Rate Distribution for {}'.format(state_name))
        fig.show()
    except KeyError:
        print('You Entered Wrong STATE/UT Name')

state_name=input('Enter Name of State/UT : ').title()
analyze_state(state_name)

Enter Name of State/UT : Uttar Pradesh


In [None]:
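def get_state_year_data(df, state_name):
    """Return year-wise totals for one state/UT (an empty frame if the name is unknown)."""
    state_data = df[df['State'] == state_name]
    # A boolean filter never raises KeyError; an unknown name simply matches no rows.
    if state_data.empty:
        print('You Entered Wrong STATE/UT Name')
    year_wise_data = state_data.groupby('Year').sum()

    return year_wise_data

state_name = input('Enter Name of State/UT : ').title()
year_wise_data = get_state_year_data(df, state_name)
# Drop the first column ('Unnamed: 0'), which is only a row counter
year_wise_data.iloc[:,1:]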

Enter Name of State/UT : Bihar


Unnamed: 0_level_0,State,Rape,Kidnapping & Assault,Dowry Deaths,Assault against Women,Assault of Modesty,Domestic Violece,Human trafficking
Year,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
2001,Bihar,888,518,859,562,21,1558,83
2002,Bihar,1040,744,927,621,6,1577,38
2003,Bihar,985,674,909,688,11,1880,37
2004,Bihar,1390,997,1029,704,13,2679,35
2005,Bihar,1147,929,1014,451,13,1574,74
2006,Bihar,1232,1084,1188,530,53,1689,42
2007,Bihar,1555,1260,1172,853,12,1635,56
2008,Bihar,1302,1789,1210,999,21,1992,22
2009,Bihar,929,1986,1295,726,12,2532,31
2010,Bihar,795,2569,1257,534,16,2271,8
