<div style="height: 200px; background-image: url('https://legalserviceindia.com/legal/uploads/offencesagainstchildrenunderipc_2824045185.jpg'); background-size: cover; background-position: center;"></div>

## Introduction

- This project analyzes crimes against children in India for the year 2014. The dataset is sourced from the Government of India's official crime statistics, which provide detailed counts of the various types of crimes reported across the states and union territories.

### Importing required Libraries

##### First, import the libraries required for data analysis and visualization

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

#### Importing csv file as pandas data frame

In [2]:
crime_csv = 'chCAC_2014_1.csv'  # path to the raw 2014 dataset
df = pd.read_csv(crime_csv)

In [8]:
df.head(10)

Unnamed: 0,States/UTs,Crime Head,2014
0,Andhra Pradesh,1 - Murder (Section 302 and 303 IPC),45
1,Andhra Pradesh,2 - Infanticide (Section 315 IPC),2
2,Andhra Pradesh,3 - Rape,477
3,Andhra Pradesh,4 - Assault on women with intent to outrage he...,274
4,Andhra Pradesh,4.1 - Sexual Harassment (Section 354A IPC),67
5,Andhra Pradesh,4.2 - Assault on women with intent to Disrobe ...,19
6,Andhra Pradesh,4.3 - Voyeurism (Section 354C IPC),5
7,Andhra Pradesh,4.4 - Stalking (Section 354D IPC),94
8,Andhra Pradesh,4.5 - Others assault,89
9,Andhra Pradesh,5 - Insult to the Modesty of Women (Girls Chil...,75


In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2028 entries, 0 to 2027
Data columns (total 3 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   States/UTs  2028 non-null   object
 1   Crime Head  2028 non-null   object
 2   2014        2028 non-null   int64 
dtypes: int64(1), object(2)
memory usage: 47.7+ KB


In [5]:
df.describe()

Unnamed: 0,2014
count,2028.0
mean,351.063609
std,3113.808306
min,0.0
25%,0.0
50%,1.0
75%,23.0
max,89423.0


In [7]:
df.isnull().sum()

States/UTs    0
Crime Head    0
2014          0
dtype: int64

### Disclaimer

- The provider of this dataset, the Government of India, makes no claims, promises, or guarantees about the accuracy, completeness, or adequacy of the data and expressly disclaims liability for errors and omissions in the contents of this dataset. No warranty of any kind, implied, expressed, or statutory, including but not limited to the warranties of non-infringement of third party rights, title, merchantability, fitness for a particular purpose, and freedom from computer virus, is given with respect to the contents of this dataset or its links to other internet resources.



## Cleaning up the data

#### Disclaimer on Data Cleaning Process

- As part of this project, one crime type has been excluded from the analysis to keep the focus on the relevant categories. The following steps were taken to clean the data:

- Initial Inspection: The dataset was inspected to understand its structure and content.
- Exclusion of Specific Crime Types: Foeticide (crime head 7) was excluded from the dataset so that the analysis covers only crimes committed against children after birth. Infanticide (crime head 2) is retained.

In [9]:
df = df.drop(df[df['Crime Head'] == '7 - Foeticide (Section 315 and 316 IPC)'].index)

In [11]:
df.head(10)

Unnamed: 0,States/UTs,Crime Head,2014
0,Andhra Pradesh,1 - Murder (Section 302 and 303 IPC),45
1,Andhra Pradesh,2 - Infanticide (Section 315 IPC),2
2,Andhra Pradesh,3 - Rape,477
3,Andhra Pradesh,4 - Assault on women with intent to outrage he...,274
4,Andhra Pradesh,4.1 - Sexual Harassment (Section 354A IPC),67
5,Andhra Pradesh,4.2 - Assault on women with intent to Disrobe ...,19
6,Andhra Pradesh,4.3 - Voyeurism (Section 354C IPC),5
7,Andhra Pradesh,4.4 - Stalking (Section 354D IPC),94
8,Andhra Pradesh,4.5 - Others assault,89
9,Andhra Pradesh,5 - Insult to the Modesty of Women (Girls Chil...,75


#### Foeticide rows were removed because this analysis covers crimes against children after their birth
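
The same kind of exclusion can also be written as a boolean mask, which some readers find easier to follow than `drop(...index)`. The sketch below is only an illustration on a made-up toy frame: the name `toy`, the shortened crime-head labels, and the counts are assumptions, and matching on the code prefix `'7 - '` is an alternative to the exact-string match used above.

```python
import pandas as pd

# Toy frame with the notebook's column names; labels and counts are made up.
toy = pd.DataFrame({
    'States/UTs': ['State A', 'State A', 'State A'],
    'Crime Head': ['1 - Murder', '7 - Foeticide', '8 - Abetment of Suicide'],
    '2014': [5, 1, 2],
})

# True for rows whose crime head starts with the code "7 - ".
is_foeticide = toy['Crime Head'].str.startswith('7 - ')

# Keep everything except the flagged rows.
filtered = toy[~is_foeticide]

print(filtered['Crime Head'].tolist())
```

Matching on the prefix avoids retyping the full label, but it assumes every crime head begins with its numeric code followed by `' - '`.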

### Removing sub-rows for better understanding

In [13]:
df = df.drop(df[df['Crime Head'] == '4.1 - Sexual Harassment (Section 354A IPC)'].index)

### Methodology

- Rows that contain totals for states and union territories are deleted before analysis. These rows aggregate the state-level rows, so leaving them in would count every crime more than once and distort any sum, mean, or ranking computed over the column.

- Removing total and aggregate rows ensures that the findings are based on unit-level information (one row per state or union territory and crime head), which keeps the conclusions drawn from the analysis reliable.
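
As a hedged illustration of this idea, aggregate rows can be identified by their label instead of by row position. The toy frame below is an assumption made for the example: the Puducherry count and the total are invented, and only the label `Total (All India)` is taken from this dataset. Whether the other aggregate rows also start with `Total` should be checked against the real data.

```python
import pandas as pd

# Toy frame; the Puducherry count and the total row value are invented.
toy = pd.DataFrame({
    'States/UTs': ['Andhra Pradesh', 'Puducherry', 'Total (All India)'],
    'Crime Head': ['3 - Rape', '3 - Rape', '3 - Rape'],
    '2014': [477, 10, 487],
})

# Flag aggregate rows by their label (assumes they all begin with "Total").
is_total = toy['States/UTs'].str.startswith('Total')

# Keep only unit-level rows.
units_only = toy[~is_total]

print(units_only['States/UTs'].tolist())
print(units_only['2014'].sum())
```

With the total row removed, the column sum equals the sum of the individual states and union territories instead of twice that amount.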

In [14]:
df.head(10)

Unnamed: 0,States/UTs,Crime Head,2014
0,Andhra Pradesh,1 - Murder (Section 302 and 303 IPC),45
1,Andhra Pradesh,2 - Infanticide (Section 315 IPC),2
2,Andhra Pradesh,3 - Rape,477
3,Andhra Pradesh,4 - Assault on women with intent to outrage he...,274
5,Andhra Pradesh,4.2 - Assault on women with intent to Disrobe ...,19
6,Andhra Pradesh,4.3 - Voyeurism (Section 354C IPC),5
7,Andhra Pradesh,4.4 - Stalking (Section 354D IPC),94
8,Andhra Pradesh,4.5 - Others assault,89
9,Andhra Pradesh,5 - Insult to the Modesty of Women (Girls Chil...,75
10,Andhra Pradesh,"6 - Kidnapping & Abduction_Total (Section 363,...",600


In [21]:
df.tail(10)

Unnamed: 0,States/UTs,Crime Head,2014
2018,Total (All India),19.1 - Under PCSO Act Section 4,4131
2019,Total (All India),19.2 - Under PCSO Act Section 6,764
2020,Total (All India),19.3 - Under PCSO Act Section 8,252
2021,Total (All India),19.4 - Under PCSO Act Section 10,2137
2022,Total (All India),19.5 - Under PCSO Act Sections 14 & 15,40
2023,Total (All India),19.6 - Other Acts of PCSO,1580
2024,Total (All India),20 - Attempt to commit Murder u/s 307 IPC,840
2025,Total (All India),21 - Unnatural Offences u/s 377 IPC,765
2026,Total (All India),22 - Other crimes committed against children,8484
2027,Total (All India),23 - Total crimes against Children,89423


In [20]:
row_position = 1798
row_data = df.iloc[row_position]

print(row_data)

States/UTs                                      Puducherry
Crime Head    22 - Other crimes committed against children
2014                                                     1
Name: 1870, dtype: object
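
The output above shows `Name: 1870` even though the row was fetched at position 1798. This is because `iloc` selects by position, while the original index labels are kept after rows are dropped. A minimal sketch of the difference, using a made-up frame (`toy` and its values are assumptions):

```python
import pandas as pd

toy = pd.DataFrame({'value': [10, 20, 30, 40]})  # index labels 0, 1, 2, 3
toy = toy.drop(index=1)                          # labels are now 0, 2, 3

by_position = toy.iloc[1]  # second remaining row, selected by position
by_label = toy.loc[2]      # row whose index label is 2

# Both lookups return the same row; its label (name) is 2, not 1.
print(by_position.name, by_position['value'])
print(by_label.name, by_label['value'])
```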


In [22]:
start_index = 1799  # position right after the row inspected above (inclusive)
end_index = 2027    # slice end (exclusive); slicing clips to the last remaining row

# Drop rows by position, from start_index to the end of the frame
df = df.drop(df.index[start_index:end_index])

In [23]:
df.tail()

Unnamed: 0,States/UTs,Crime Head,2014
1866,Puducherry,19.5 - Under PCSO Act Sections 14 & 15,0
1867,Puducherry,19.6 - Other Acts of PCSO,4
1868,Puducherry,20 - Attempt to commit Murder u/s 307 IPC,1
1869,Puducherry,21 - Unnatural Offences u/s 377 IPC,0
1870,Puducherry,22 - Other crimes committed against children,1


### Deleting subrows

- Deleting subrows, such as the specific categories listed under a main heading (for example 4.1, 4.2, and 4.3 under main heading 4), keeps the data consistent and the analysis clear. The reasons are:
  1. Avoiding Double Counting: Subrows are detailed categories that are already aggregated in their main heading. Including both the main heading and its subrows would count the same cases twice. For example, for Andhra Pradesh the subrows 4.1 to 4.5 (67, 19, 5, 94, and 89) add up to 274, which is exactly the value of main heading 4. Keeping both would inflate the state's total.
  2. Maintaining Data Hierarchy: The data is hierarchical, with main headings aggregating the values of their subrows. Analyzing both levels at once mixes summary data (main headings) with detailed breakdowns (subrows) and distorts the results.
  3. Ensuring Consistency in Analysis: Keeping only one level of aggregation makes comparisons across states and crime heads valid.

- In summary, deleting subrows prevents double counting and keeps the analysis at a single, consistent level of the hierarchy.

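
The double-counting effect can be checked with the Andhra Pradesh values shown in the earlier `df.head(10)` output. The sketch below is illustrative only: the label for main heading 4 is abbreviated because the notebook output truncates it, and detecting subrows by a dot in the numeric code is an assumed alternative to listing every subrow by name.

```python
import pandas as pd

# Counts are the Andhra Pradesh values from the head() output above.
# The label of main heading 4 is abbreviated here (truncated in the output).
toy = pd.DataFrame({
    'Crime Head': [
        '4 - Assault on women (abbreviated)',
        '4.1 - Sexual Harassment (Section 354A IPC)',
        '4.2 - Assault on women with intent to Disrobe (Section 354B IPC)',
        '4.3 - Voyeurism (Section 354C IPC)',
        '4.4 - Stalking (Section 354D IPC)',
        '4.5 - Others assault',
    ],
    '2014': [274, 67, 19, 5, 94, 89],
})

# The numeric code is the text before the first " - ".
codes = toy['Crime Head'].str.split(' - ', n=1).str[0]

# Assumption: a subrow is any row whose code contains a dot (4.1, 16.1.1, ...).
is_subrow = codes.str.contains('.', regex=False)

naive_total = toy['2014'].sum()                  # main heading plus subrows
clean_total = toy.loc[~is_subrow, '2014'].sum()  # main heading only

print(naive_total, clean_total)
```

Here the naive sum is twice the main-heading value, which is the inflation that removing subrows avoids.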



In [24]:
rows_to_drop = ['4.2 - Assault on women with intent to Disrobe (Section 354B IPC)',
                '4.3 - Voyeurism (Section 354C IPC)',
                '4.4 - Stalking (Section 354D IPC)',
                '4.5 - Others assault',
                '6.1 - Kidnapping & Abduction (Section 363 IPC)',
                '6.2 - Kidnaping & Abduction in order to Murder (Section 364 IPC)',
                '6.3 - Kidnapping for Ransom (Section 364A IPC)',
                '6.4 - Kidnapping & Abduction of Women to compel her for marriage (Section 366 IPC)',
                '6.5 - Other Kidnapping',
                '16.1 - Offences committed against Migrant',
                '16.1.1 - Offences committed against SC migrants',
                '16.1.2 - Offences committed against ST migrants',
                '16.1.3 - Offences committed against others migrants',
                '16.2 - Offences committed against Locals',
                '16.2.1 - Offences committed against Local SCs',
                '16.2.2 - Offences committed against Local STs',
                '16.2.3 - Offences committed against others locals',
                '17.1 - Under ITP Section 5',
                '17.2 - Under ITP Section 6',
                '17.3 - Under ITP Section 7',
                '17.4 - Under ITP Section 8',
                '17.5 - Other Section under ITP Act',
                '19.1 - Under PCSO Act Section 4',
                '19.2 - Under PCSO Act Section 6',
                '19.3 - Under PCSO Act Section 8',
                '19.4 - Under PCSO Act Section 10',
                '19.5 - Under PCSO Act Sections 14 & 15',
                '19.6 - Other Acts of PCSO',
                '23 - Total crimes against Children'
               ]
                

In [25]:
df = df[~df['Crime Head'].isin(rows_to_drop)]

In [27]:
df.head(20)

Unnamed: 0,States/UTs,Crime Head,2014
0,Andhra Pradesh,1 - Murder (Section 302 and 303 IPC),45
1,Andhra Pradesh,2 - Infanticide (Section 315 IPC),2
2,Andhra Pradesh,3 - Rape,477
3,Andhra Pradesh,4 - Assault on women with intent to outrage he...,274
9,Andhra Pradesh,5 - Insult to the Modesty of Women (Girls Chil...,75
10,Andhra Pradesh,"6 - Kidnapping & Abduction_Total (Section 363,...",600
17,Andhra Pradesh,8 - Abetment of Suicide of child (Section 305 ...,8
18,Andhra Pradesh,9 - Exposure and Abandonment (Section 317 IPC),26
19,Andhra Pradesh,10 - Procuration of minor girls (Section 366-A...,37
20,Andhra Pradesh,11 - Importation of Girls from Foreign Country...,0


In [28]:
df.tail(10)

Unnamed: 0,States/UTs,Crime Head,2014
1842,Puducherry,13 - Selling of minors for prostitution (Secti...,0
1843,Puducherry,"14 - Prohibition of Child Marriage Act, 2006",3
1844,Puducherry,"15 - Transplantation of Human Organs Act, 1994",0
1845,Puducherry,16 - Child Labour (Prohibition & Regulation) A...,0
1854,Puducherry,"17 - Immoral Traffic (Prevention) Act, 1956",0
1860,Puducherry,18 - Juvenile Justice (Care and Protection of ...,0
1861,Puducherry,19 - Protection of Children from Sexual Offenc...,21
1868,Puducherry,20 - Attempt to commit Murder u/s 307 IPC,1
1869,Puducherry,21 - Unnatural Offences u/s 377 IPC,0
1870,Puducherry,22 - Other crimes committed against children,1


In [29]:
df = df.reset_index(drop = True)

In [30]:
df.head(10)

Unnamed: 0,States/UTs,Crime Head,2014
0,Andhra Pradesh,1 - Murder (Section 302 and 303 IPC),45
1,Andhra Pradesh,2 - Infanticide (Section 315 IPC),2
2,Andhra Pradesh,3 - Rape,477
3,Andhra Pradesh,4 - Assault on women with intent to outrage he...,274
4,Andhra Pradesh,5 - Insult to the Modesty of Women (Girls Chil...,75
5,Andhra Pradesh,"6 - Kidnapping & Abduction_Total (Section 363,...",600
6,Andhra Pradesh,8 - Abetment of Suicide of child (Section 305 ...,8
7,Andhra Pradesh,9 - Exposure and Abandonment (Section 317 IPC),26
8,Andhra Pradesh,10 - Procuration of minor girls (Section 366-A...,37
9,Andhra Pradesh,11 - Importation of Girls from Foreign Country...,0


In [31]:
df.tail(10)

Unnamed: 0,States/UTs,Crime Head,2014
746,Puducherry,13 - Selling of minors for prostitution (Secti...,0
747,Puducherry,"14 - Prohibition of Child Marriage Act, 2006",3
748,Puducherry,"15 - Transplantation of Human Organs Act, 1994",0
749,Puducherry,16 - Child Labour (Prohibition & Regulation) A...,0
750,Puducherry,"17 - Immoral Traffic (Prevention) Act, 1956",0
751,Puducherry,18 - Juvenile Justice (Care and Protection of ...,0
752,Puducherry,19 - Protection of Children from Sexual Offenc...,21
753,Puducherry,20 - Attempt to commit Murder u/s 307 IPC,1
754,Puducherry,21 - Unnatural Offences u/s 377 IPC,0
755,Puducherry,22 - Other crimes committed against children,1


### Data cleaning completed

- Now I am exporting the current DataFrame as a CSV file for further analysis

In [32]:
file_path = 'cleaned_data.csv'
df.to_csv(file_path, index = False)

- The data is now exported and ready for further analysis
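
Before relying on the exported file, it can be worth reading it back and confirming that nothing changed on the way to disk. The following is a minimal sketch of such a round-trip check on a toy frame written to a temporary directory; the names `toy`, `reloaded`, and `round_trip_ok` are assumptions for illustration, and the two rows reuse the Puducherry values shown in the output above.

```python
import os
import tempfile

import pandas as pd

# Two rows copied from the tail() output above, used as a small stand-in.
toy = pd.DataFrame({
    'States/UTs': ['Puducherry', 'Puducherry'],
    'Crime Head': ['20 - Attempt to commit Murder u/s 307 IPC',
                   '21 - Unnatural Offences u/s 377 IPC'],
    '2014': [1, 0],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'cleaned_data.csv')
    toy.to_csv(path, index=False)  # index=False avoids an extra "Unnamed: 0" column
    reloaded = pd.read_csv(path)

# Compare column names, shape, and values after the round trip.
round_trip_ok = (
    list(reloaded.columns) == list(toy.columns)
    and reloaded.shape == toy.shape
    and reloaded['2014'].tolist() == toy['2014'].tolist()
    and reloaded['Crime Head'].tolist() == toy['Crime Head'].tolist()
)

print(round_trip_ok)
```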