# Goal
In this notebook we analyse the persons arrested and their disposal by the police. This gives an idea of how many people are involved in each type of crime, which we can then use to predict which types of crime will require the most police attention.

### Problem Statement
Prediction of crime, prognosis and patrol route map forecasting:


Predict crime hotspots by analysing historical crime data, socio-economic factors, and
environmental variables. Also generate dynamic patrol routes by considering predicted crime
hotspots, traffic conditions, and priority areas.

#### Data Loading and Joining

In [1]:
import pandas as pd

In [2]:
prsn_arr_cr_ag_ch_12 = pd.read_csv("AggregrateData/03_Persons_arrested_and_their_disposal_by_police_and_court_under_crime_against_children_2012.csv")
prsn_arr_cr_ag_ch_13 = pd.read_csv("AggregrateData/03_Persons_arrested_and_their_disposal_by_police_and_court_under_crime_against_children_2013.csv")
prsn_arr_cr_ag_ch_14 = pd.read_csv("AggregrateData/03_Persons_arrested_and_their_disposal_by_police_and_court_under_crime_against_children_2014.csv")

In [3]:
prsn_arr_SLL_12 = pd.read_csv("AggregrateData/04_01_Person_arrested_and_their_disposal_by_police_and_court_SLL_crime_2012.csv")
prsn_arr_SLL_13 = pd.read_csv("AggregrateData/04_01_Person_arrested_and_their_disposal_by_police_and_court_SLL_crime_2013.csv")
prsn_arr_SLL_14 = pd.read_csv("AggregrateData/04_01_Person_arrested_and_their_disposal_by_police_and_court_SLL_crime_2014.csv")

In [4]:
prsn_arr_IPC_12 = pd.read_csv("AggregrateData/04_02_Person_arrested_and_their_disposal_by_police_and_court_IPC_crime_2012.csv")
prsn_arr_IPC_13 = pd.read_csv("AggregrateData/04_02_Person_arrested_and_their_disposal_by_police_and_court_IPC_crime_2013.csv")
prsn_arr_IPC_14 = pd.read_csv("AggregrateData/04_02_Person_arrested_and_their_disposal_by_police_and_court_IPC_crime_2014.csv")
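
The nine reads above follow one naming pattern, so they could also be written as a loop. The sketch below is only an illustration: `FILE_STEMS`, `build_paths`, and `load_all` are hypothetical names that are not used elsewhere in this notebook, and the directory and file-name stems are copied from the cells above.

```python
import pandas as pd

# File-name stems copied from the read_csv calls above, keyed by a short label.
FILE_STEMS = {
    "cr_ag_ch": "03_Persons_arrested_and_their_disposal_by_police_and_court_under_crime_against_children",
    "SLL": "04_01_Person_arrested_and_their_disposal_by_police_and_court_SLL_crime",
    "IPC": "04_02_Person_arrested_and_their_disposal_by_police_and_court_IPC_crime",
}
YEARS = (2012, 2013, 2014)


def build_paths(data_dir="AggregrateData"):
    """Return {(label, year): path} for every category/year combination."""
    return {
        (label, year): f"{data_dir}/{stem}_{year}.csv"
        for label, stem in FILE_STEMS.items()
        for year in YEARS
    }


def load_all(data_dir="AggregrateData"):
    """Read every file into a dict of DataFrames keyed by (label, year)."""
    return {key: pd.read_csv(path) for key, path in build_paths(data_dir).items()}
```

With this layout, `load_all()[("IPC", 2013)]` would hold the same frame as `prsn_arr_IPC_13`.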

#### Data Overview and Descriptive Statistics
We start by checking the data types of all features and whether there are any null values.

In [5]:
prsn_arr_cr_ag_ch_12.dtypes

STATE/UT                                                                                                   object
CRIME HEAD                                                                                                 object
Persons in custody or on bail during the stage of investigation at the beginning of the year                int64
Persons arrested during the year                                                                            int64
Persons released or freed by Police or Magistrate before trial for want of evidence or any other reason     int64
Persons in custody or on bail during the stage of investigation at the end of the year                      int64
Persons in whose cases charge sheets were laid during the year                                              int64
Persons under trial at the beginning of the year                                                            int64
Total number of persons under trial during the year                                     

In [6]:
prsn_arr_cr_ag_ch_12.isna().sum()

STATE/UT                                                                                                   0
CRIME HEAD                                                                                                 0
Persons in custody or on bail during the stage of investigation at the beginning of the year               0
Persons arrested during the year                                                                           0
Persons released or freed by Police or Magistrate before trial for want of evidence or any other reason    0
Persons in custody or on bail during the stage of investigation at the end of the year                     0
Persons in whose cases charge sheets were laid during the year                                             0
Persons under trial at the beginning of the year                                                           0
Total number of persons under trial during the year                                                        0
Persons against who

In [7]:
prsn_arr_cr_ag_ch_13.isna().sum()

STATE/UT                                                                                                   0
CRIME HEAD                                                                                                 0
Persons in custody or on bail during the stage of investigation at the beginning of the year               0
Persons arrested during the year                                                                           0
Persons released or freed by Police or Magistrate before trial for want of evidence or any other reason    0
Persons in custody or on bail during the stage of investigation at the end of the year                     0
Persons in whose cases charge sheets were laid during the year                                             0
Persons under trial at the beginning of the year                                                           0
Total number of persons under trial during the year                                                        0
Persons against who

In [8]:
prsn_arr_cr_ag_ch_14.isna().sum()

States/UTs                                                          0
Crime Head                                                          0
Year                                                                0
Persons in custody during inv stage at beginning of Year_Male       0
Persons in custody during inv stage at beginning of Year_Female     0
Persons in custody during inv stage at beginning of Year_Total      0
Persons on bail during inv stage at beginning of Year_Male          0
Persons on bail during inv stage at beginning of Year_Female        0
Persons on bail during inv stage at beginning of Year_Total         0
Persons arrested during the year_Male                               0
Persons arrested during the year_Female                             0
Persons arrested during the year_Total                              0
Persons released or freed before trial for want of evidence_Male    0
Persons released or freed before trial for want of evidence_Fem     0
Persons released or 

In [9]:
prsn_arr_IPC_12.isna().sum()

STATE/UT                                                                                                   0
CRIME HEAD                                                                                                 0
Persons in custody or on bail during the stage of investigation at the beginning of the year               0
Persons arrested during the year                                                                           0
Persons released or freed by Police or Magistrate before trial for want of evidence or any other reason    0
Persons in custody or on bail during the stage of investigation at the end of the year                     0
Persons in whose cases charge sheets were laid during the year                                             0
Persons under trial at the beginning of the year                                                           0
Total number of persons under trial during the year                                                        0
Persons against who

In [10]:
prsn_arr_IPC_13.isna().sum()

STATE/UT                                                                                                   0
CRIME HEAD                                                                                                 0
Persons in custody or on bail during the stage of investigation at the beginning of the year               0
Persons arrested during the year                                                                           0
Persons released or freed by Police or Magistrate before trial for want of evidence or any other reason    0
Persons in custody or on bail during the stage of investigation at the end of the year                     0
Persons in whose cases charge sheets were laid during the year                                             0
Persons under trial at the beginning of the year                                                           0
Total number of persons under trial during the year                                                        0
Persons against who

In [11]:
prsn_arr_IPC_14.isna().sum()

States/UTs                                                          0
Crime Head                                                          0
Year                                                                0
Persons in custody during inv stage at beginning of Year_Male       0
Persons in custody during inv stage at beginning of Year_Female     0
Persons in custody during inv stage at beginning of Year_Total      0
Persons on bail during inv stage at beginning of Year_Male          0
Persons on bail during inv stage at beginning of Year_Female        0
Persons on bail during inv stage at beginning of Year_Total         0
Persons arrested during the year_Male                               0
Persons arrested during the year_Female                             0
Persons arrested during the year_Total                              0
Persons released or freed before trial for want of evidence_Male    0
Persons released or freed before trial for want of evidence_Fem     0
Persons released or 

None of the columns in the datasets checked above contain null values.
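
Rather than reading each `isna().sum()` listing by eye, the check can be condensed into a single number per dataset. The following is a minimal sketch on toy frames; `null_report` is a hypothetical helper, not part of the original notebook.

```python
import pandas as pd


def null_report(frames):
    """Return a Series with the total number of missing cells per named frame."""
    return pd.Series({name: int(df.isna().sum().sum()) for name, df in frames.items()})


# Toy frames standing in for the real datasets.
demo = {
    "clean": pd.DataFrame({"a": [1, 2], "b": [3, 4]}),
    "gappy": pd.DataFrame({"a": [1, None], "b": [None, None]}),
}
report = null_report(demo)
print(report)
```

Passing a dict such as `{"IPC_12": prsn_arr_IPC_12, "SLL_12": prsn_arr_SLL_12}` would cover the real frames, including the SLL ones that are not listed individually above.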

#### Combining the datasets
To combine the datasets we need to know which features are present in each of them.

In [12]:
prsn_arr_cr_ag_ch_12.info()
prsn_arr_cr_ag_ch_13.info()
prsn_arr_cr_ag_ch_14.info()
prsn_arr_IPC_12.info()
prsn_arr_IPC_13.info()
prsn_arr_IPC_14.info()
prsn_arr_SLL_12.info()
prsn_arr_SLL_13.info()
prsn_arr_SLL_14.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 494 entries, 0 to 493
Data columns (total 14 columns):
 #   Column                                                                                                   Non-Null Count  Dtype 
---  ------                                                                                                   --------------  ----- 
 0   STATE/UT                                                                                                 494 non-null    object
 1   CRIME HEAD                                                                                               494 non-null    object
 2   Persons in custody or on bail during the stage of investigation at the beginning of the year             494 non-null    int64 
 3   Persons arrested during the year                                                                         494 non-null    int64 
 4   Persons released or freed by Police or Magistrate before trial for want of evidence or any o

The 2014 files contain additional columns. To make predictions we need the 2012 and 2013 data alongside the 2014 data, so we identify the features shared by all three years and combine on those. The columns that exist only in 2014 are set aside and can be used later for a single-year analysis.

In [13]:
# Common columns need to be filtered out first
common_columns_ch = set(prsn_arr_cr_ag_ch_12.columns) & set(prsn_arr_cr_ag_ch_13.columns) & set(prsn_arr_cr_ag_ch_14.columns)
len(common_columns_ch)

0

None of the feature names in the 2014 dataset match those in the 2012 and 2013 datasets, so we inspect the 2014 features and bring them in line with the earlier naming.

The 2014 dataset splits each count into male, female, and total columns. We only need the totals.
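
The empty intersection can be reproduced on a handful of the column names listed earlier. This is a small sketch only; `common_columns` is a hypothetical helper that generalises the `&` chain above to any number of column lists.

```python
from functools import reduce


def common_columns(*column_lists):
    """Return the set of names present in every one of the given lists."""
    return reduce(set.intersection, (set(cols) for cols in column_lists))


# A few names taken from the 2012 and 2014 listings above.
cols_2012 = ["STATE/UT", "CRIME HEAD", "Persons arrested during the year"]
cols_2014 = ["States/UTs", "Crime Head", "Persons arrested during the year_Total"]

shared = common_columns(cols_2012, cols_2014)
print(len(shared))
```

Every 2014 name differs in case, wording, or suffix, so nothing survives the intersection until the names are harmonised.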

In [14]:
# Selecting the columns ending with "Total" or "Tot"
Selected_columns_ch = [col for col in prsn_arr_cr_ag_ch_14.columns if col.endswith("Total") or col.endswith("Tot")]
Selected_columns_IPC = [col for col in prsn_arr_IPC_14.columns if col.endswith("Total") or col.endswith("Tot")]
Selected_columns_SLL = [col for col in prsn_arr_SLL_14.columns if col.endswith("Total") or col.endswith("Tot")]
print(len(Selected_columns_ch))
print(len(Selected_columns_IPC))
print(len(Selected_columns_SLL))

18
18
18
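
As a cross-check on the suffix-based selection, the columns can be grouped by base name to see which suffixes each one carries. The sketch below assumes the suffixes `_Male`, `_Female`, `_Fem`, `_Total`, and `_Tot` (the abbreviated forms appear in the listings above); `group_by_base` is a hypothetical helper.

```python
from collections import defaultdict

# Suffixes seen in the 2014 column listings; "_Fem" and "_Tot" are abbreviated variants.
SUFFIXES = ("_Male", "_Female", "_Fem", "_Total", "_Tot")


def group_by_base(columns):
    """Map each base column name to the list of suffixes found for it."""
    groups = defaultdict(list)
    for col in columns:
        for suffix in SUFFIXES:
            if col.endswith(suffix):
                groups[col[: -len(suffix)]].append(suffix)
                break
    return dict(groups)


sample = [
    "Year",
    "Persons arrested during the year_Male",
    "Persons arrested during the year_Female",
    "Persons arrested during the year_Total",
    "Persons released or freed before trial for want of evidence_Male",
    "Persons released or freed before trial for want of evidence_Fem",
    "Persons released or freed before trial for want of evidence_Tot",
]
groups = group_by_base(sample)
```

Columns without one of these suffixes, such as `Year`, are simply left out of the grouping.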


Next we strip the suffix from the 2014 total columns and check which of the resulting names also appear among the 14 features of the 2012 and 2013 datasets

In [15]:
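# str.rstrip strips a set of characters rather than a suffix, so use str.removesuffix (Python 3.9+)
updated_columns_ch = [s.removesuffix("_Total").removesuffix("_Tot") for s in Selected_columns_ch]
updated_columns_IPC = [s.removesuffix("_Total").removesuffix("_Tot") for s in Selected_columns_IPC]
updated_columns_SLL = [s.removesuffix("_Total").removesuffix("_Tot") for s in Selected_columns_SLL]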

In [16]:
# Each category is compared against its own stripped 2014 column list
final_ch = list(set(updated_columns_ch) & set(prsn_arr_cr_ag_ch_12.columns))
final_IPC = list(set(updated_columns_IPC) & set(prsn_arr_IPC_12.columns))
final_SLL = list(set(updated_columns_SLL) & set(prsn_arr_SLL_12.columns))

In [17]:
final_ch

['Persons convicted', 'Persons arrested during the year', 'Persons acquitted']
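
Stripping a suffix from column names is easy to get subtly wrong, because `str.rstrip` treats its argument as a set of characters rather than as a literal suffix. The sketch below contrasts the two behaviours on one of the 2014 column names; `strip_total_suffix` is a hypothetical helper that also works on Python versions without `str.removesuffix`.

```python
def strip_total_suffix(name):
    """Remove a trailing "_Total" or "_Tot" without touching the rest of the name."""
    for suffix in ("_Total", "_Tot"):
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


name = "Total number of persons under Trial_Total"

# rstrip keeps removing trailing characters while they belong to {_, T, o, t, a, l},
# so it also eats the "al" at the end of "Trial".
char_stripped = name.rstrip("_Total")

# The helper removes only the exact suffix.
suffix_stripped = strip_total_suffix(name)

print(char_stripped)
print(suffix_stripped)
```

Names such as `Persons convicted_Total` happen to come out the same either way, which is why the problem is easy to miss.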

Only three names match, because the 2014 files word the same quantities differently from the earlier years. The remaining columns therefore have to be mapped manually.

In [18]:
prsn_arr_cr_ag_ch_14["Persons in custody or on bail during the stage of investigation at the beginning of the year"] = prsn_arr_cr_ag_ch_14["Persons in custody during inv stage at beginning of Year_Total"] + prsn_arr_cr_ag_ch_14["Persons on bail during inv stage at beginning of Year_Total"]
prsn_arr_cr_ag_ch_14["Persons in custody or on bail during the stage of investigation at the end of the year"] = prsn_arr_cr_ag_ch_14["Persons on Bail during inv stage at year end_Total"] + prsn_arr_cr_ag_ch_14["Persons in custody during inv stage at year end_Total"]
prsn_arr_cr_ag_ch_14["Persons under trial at the beginning of the year"] = prsn_arr_cr_ag_ch_14["Persons on Bail during trial stage at begin of year_Total"] + prsn_arr_cr_ag_ch_14["Persons in custody during trial stage at begin of year_Total"]
prsn_arr_cr_ag_ch_14["Persons in custody or on bail during the stage of trial at the end of the year"] = prsn_arr_cr_ag_ch_14["Persons in custody during trial stage at Year end_Total"] + prsn_arr_cr_ag_ch_14["Persons on bail during trial stage at Year End_Total"]

column_names_to_be_changed = {"Persons arrested during the year_Total":"Persons arrested during the year",
                              "Persons released or freed before trial for want of evidence_Tot":"Persons released or freed by Police or Magistrate before trial for want of evidence or any other reason",
                              "Persons charge sheeted_Total":"Persons in whose cases charge sheets were laid during the year",
                              "Total number of persons under Trial_Total":"Total number of persons under trial during the year",
                              "Persons against whom cases were compounded by Courts_Total":"Persons in whose cases trials were completed during the year",
                              "Persons convicted_Total":"Persons convicted",
                              "Persons acquitted_Total":"Persons acquitted"
                              }
prsn_arr_cr_ag_ch_14.rename(columns = column_names_to_be_changed, inplace=True)

In [19]:
prsn_arr_IPC_14["Persons in custody or on bail during the stage of investigation at the beginning of the year"] = prsn_arr_IPC_14["Persons in custody during inv stage at beginning of Year_Total"] + prsn_arr_IPC_14["Persons on bail during inv stage at beginning of Year_Total"]
prsn_arr_IPC_14["Persons in custody or on bail during the stage of investigation at the end of the year"] = prsn_arr_IPC_14["Persons on Bail during inv stage at year end_Total"] + prsn_arr_IPC_14["Persons in custody during inv stage at year end_Total"]
prsn_arr_IPC_14["Persons under trial at the beginning of the year"] = prsn_arr_IPC_14["Persons on Bail during trial stage at begin of year_Total"] + prsn_arr_IPC_14["Persons in custody during trial stage at begin of year_Total"]
prsn_arr_IPC_14["Persons in custody or on bail during the stage of trial at the end of the year"] = prsn_arr_IPC_14["Persons in custody during trial stage at Year end_Total"] + prsn_arr_IPC_14["Persons on bail during trial stage at Year End_Total"]

column_names_to_be_changed = {"Persons arrested during the year_Total":"Persons arrested during the year",
                              "Persons released or freed before trial for want of evidence_Tot":"Persons released or freed by Police or Magistrate before trial for want of evidence or any other reason",
                              "Persons charge sheeted_Total":"Persons in whose cases charge sheets were laid during the year",
                              "Total number of persons under Trial_Total":"Total number of persons under trial during the year",
                              "Persons against whom cases were compounded by Courts_Total":"Persons in whose cases trials were completed during the year",
                              "Persons convicted_Total":"Persons convicted",
                              "Persons acquitted_Total":"Persons acquitted"
                              }
prsn_arr_IPC_14.rename(columns = column_names_to_be_changed, inplace=True)

In [20]:
prsn_arr_SLL_14["Persons in custody or on bail during the stage of investigation at the beginning of the year"] = prsn_arr_SLL_14["Persons in custody during inv stage at beginning of Year_Total"] + prsn_arr_SLL_14["Persons on bail during inv stage at beginning of Year_Total"]
prsn_arr_SLL_14["Persons in custody or on bail during the stage of investigation at the end of the year"] = prsn_arr_SLL_14["Persons on Bail during inv stage at year end_Total"] + prsn_arr_SLL_14["Persons in custody during inv stage at year end_Total"]
prsn_arr_SLL_14["Persons under trial at the beginning of the year"] = prsn_arr_SLL_14["Persons on Bail during trial stage at begin of year_Total"] + prsn_arr_SLL_14["Persons in custody during trial stage at begin of year_Total"]
prsn_arr_SLL_14["Persons in custody or on bail during the stage of trial at the end of the year"] = prsn_arr_SLL_14["Persons in custody during trial stage at Year end_Total"] + prsn_arr_SLL_14["Persons on bail during trial stage at Year End_Total"]

column_names_to_be_changed = {"Persons arrested during the year_Total":"Persons arrested during the year",
                              "Persons released or freed before trial for want of evidence_Tot":"Persons released or freed by Police or Magistrate before trial for want of evidence or any other reason",
                              "Persons charge sheeted_Total":"Persons in whose cases charge sheets were laid during the year",
                              "Total number of persons under Trial_Total":"Total number of persons under trial during the year",
                              "Persons against whom cases were compounded by Courts_Total":"Persons in whose cases trials were completed during the year",
                              "Persons convicted_Total":"Persons convicted",
                              "Persons acquitted_Total":"Persons acquitted"
                              }
prsn_arr_SLL_14.rename(columns = column_names_to_be_changed, inplace=True)
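
The three cells above apply the same sums and renames to three frames, so the steps could be wrapped in one function. This is a minimal sketch on a toy frame, not the notebook's own method: `harmonise` is a hypothetical helper, and it uses `DataFrame.sum(axis=1)`, which treats missing values as zero, whereas `+` would propagate them (the two agree here because the data has no nulls).

```python
import pandas as pd


def harmonise(df, summed, renamed):
    """Return a copy of df with summed columns added and columns renamed.

    summed:  {new_name: [existing columns to add together]}
    renamed: {old_name: new_name}
    """
    out = df.copy()
    for new_name, parts in summed.items():
        out[new_name] = out[parts].sum(axis=1)
    return out.rename(columns=renamed)


# Toy frame using three of the 2014 column names from the cells above.
toy = pd.DataFrame({
    "Persons in custody during inv stage at beginning of Year_Total": [1, 2],
    "Persons on bail during inv stage at beginning of Year_Total": [10, 20],
    "Persons convicted_Total": [5, 6],
})
begin_inv = "Persons in custody or on bail during the stage of investigation at the beginning of the year"
result = harmonise(
    toy,
    summed={begin_inv: [
        "Persons in custody during inv stage at beginning of Year_Total",
        "Persons on bail during inv stage at beginning of Year_Total",
    ]},
    renamed={"Persons convicted_Total": "Persons convicted"},
)
```

Because the function returns a copy, the same two dictionaries could be reused for the crime-against-children, IPC, and SLL frames without modifying the originals in place.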

In [21]:
prsn_arr_IPC_14.rename(columns = {"States/UTs":"STATE/UT", "Crime Head":"CRIME HEAD"}, inplace = True)
# Rename the 2014 SLL frame (not the 2012 one) so that STATE/UT and CRIME HEAD survive the column intersection
prsn_arr_SLL_14.rename(columns = {"States/UTs":"STATE/UT", "Crime Head":"CRIME HEAD"}, inplace = True)
prsn_arr_cr_ag_ch_14.rename(columns = {"States/UTs":"STATE/UT", "Crime Head":"CRIME HEAD"}, inplace = True)

Re-running the code that finds the common columns


In [22]:
final_ch = list(set(prsn_arr_cr_ag_ch_14.columns) & set(prsn_arr_cr_ag_ch_12.columns))
final_IPC = list(set(prsn_arr_IPC_14.columns) & set(prsn_arr_IPC_12.columns))
final_SLL = list(set(prsn_arr_SLL_14.columns) & set(prsn_arr_SLL_12.columns))

In [23]:
# .copy() avoids a SettingWithCopyWarning when the "Year" column is assigned later
prsn_arr_SLL_14 = prsn_arr_SLL_14[final_SLL].copy()
prsn_arr_cr_ag_ch_14 = prsn_arr_cr_ag_ch_14[final_ch].copy()
prsn_arr_IPC_14 = prsn_arr_IPC_14[final_IPC].copy()

In [24]:
final_ch

['Persons convicted',
 'Persons in custody or on bail during the stage of trial at the end of the year',
 'Persons under trial at the beginning of the year',
 'CRIME HEAD',
 'Total number of persons under trial during the year',
 'Persons in custody or on bail during the stage of investigation at the end of the year',
 'Persons acquitted',
 'Persons in whose cases charge sheets were laid during the year',
 'Persons in whose cases trials were completed during the year',
 'Persons in custody or on bail during the stage of investigation at the beginning of the year',
 'Persons released or freed by Police or Magistrate before trial for want of evidence or any other reason',
 'STATE/UT',
 'Persons arrested during the year']

In [25]:
final_SLL

['Persons convicted',
 'Persons in custody or on bail during the stage of trial at the end of the year',
 'Persons under trial at the beginning of the year',
 'CRIME HEAD',
 'Total number of persons under trial during the year',
 'Persons in custody or on bail during the stage of investigation at the end of the year',
 'Persons acquitted',
 'Persons in whose cases charge sheets were laid during the year',
 'Persons in whose cases trials were completed during the year',
 'Persons in custody or on bail during the stage of investigation at the beginning of the year',
 'Persons released or freed by Police or Magistrate before trial for want of evidence or any other reason',
 'STATE/UT',
 'Persons arrested during the year']

In [26]:
prsn_arr_cr_ag_ch_12["Year"] = 2012
prsn_arr_cr_ag_ch_13["Year"] = 2013
prsn_arr_cr_ag_ch_14["Year"] = 2014
prsn_arr_IPC_12["Year"] = 2012
prsn_arr_IPC_13["Year"] = 2013
prsn_arr_IPC_14["Year"] = 2014
prsn_arr_SLL_12["Year"] = 2012
prsn_arr_SLL_13["Year"] = 2013
prsn_arr_SLL_14["Year"] = 2014

In [29]:
prsn_arr_cr_ag_ch = pd.concat([prsn_arr_cr_ag_ch_12,prsn_arr_cr_ag_ch_13, prsn_arr_cr_ag_ch_14], ignore_index=True)
prsn_arr_IPC = pd.concat([prsn_arr_IPC_12, prsn_arr_IPC_13, prsn_arr_IPC_14], ignore_index=True)
prsn_arr_SLL = pd.concat([prsn_arr_SLL_12, prsn_arr_SLL_13, prsn_arr_SLL_14], ignore_index=True)

In [30]:
prsn_arr_IPC["Legal Framework"] = "IPC"
prsn_arr_SLL["Legal Framework"] = "SLL"
prsn_arr_cr_ag_ch["Legal Framework"] = "Crime against children"

In [31]:
Persons_arrested = pd.concat([prsn_arr_cr_ag_ch, prsn_arr_IPC, prsn_arr_SLL], ignore_index=True)
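
`pd.concat` aligns frames on column names and fills any column that a frame lacks with NaN, which also converts an integer column to float. A quick sanity check after concatenating is therefore to count rows per group and missing values per column. The sketch below uses toy frames with made-up column names `a` and `b`; it illustrates the check and is not output from the real data.

```python
import pandas as pd

# Toy frames: the second one lacks column "b", much as a 2014 frame lacks a 2012-only column.
first = pd.DataFrame({"Year": [2012, 2012], "a": [1, 2], "b": [3, 4]})
second = pd.DataFrame({"Year": [2014], "a": [5]})
combined = pd.concat([first, second], ignore_index=True)

rows_per_year = combined.groupby("Year").size()  # how many rows each year contributed
missing_per_column = combined.isna().sum()       # NaN introduced by the column mismatch

print(rows_per_year)
print(missing_per_column)
print(combined.dtypes)
```

On the real frame, the equivalent calls would be `Persons_arrested.groupby(["Legal Framework", "Year"]).size()` and `Persons_arrested.isna().sum()`.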

In [32]:
Persons_arrested.head()

Unnamed: 0,STATE/UT,CRIME HEAD,Persons in custody or on bail during the stage of investigation at the beginning of the year,Persons arrested during the year,Persons released or freed by Police or Magistrate before trial for want of evidence or any other reason,Persons in custody or on bail during the stage of investigation at the end of the year,Persons in whose cases charge sheets were laid during the year,Persons under trial at the beginning of the year,Total number of persons under trial during the year,Persons against whom cases were compounded or withdrawn,Persons in custody or on bail during the stage of trial at the end of the year,Persons in whose cases trials were completed during the year,Persons convicted,Persons acquitted,Year,Legal Framework
0,ANDHRA PRADESH,INFANTICIDE (SECTION 315 IPC),0,6,0,5,1,4,5,0.0,5,0,0,0,2012,Crime against children
1,ARUNACHAL PRADESH,INFANTICIDE (SECTION 315 IPC),0,0,0,0,0,0,0,0.0,0,0,0,0,2012,Crime against children
2,ASSAM,INFANTICIDE (SECTION 315 IPC),0,0,0,0,0,0,0,0.0,0,0,0,0,2012,Crime against children
3,BIHAR,INFANTICIDE (SECTION 315 IPC),0,2,0,0,2,7,9,0.0,6,3,1,2,2012,Crime against children
4,CHHATTISGARH,INFANTICIDE (SECTION 315 IPC),0,5,0,0,5,16,21,0.0,17,4,2,2,2012,Crime against children
