# U.S. CRIME ANALYZATION DASHBOARD

## Members
1) Filzah Maisarah binti Elmy Khairull (2419934)
2) Insyirah Mardhiah binti Suhaimi (2415054)

## Background 
This dataset contains detailed records of criminal incidents across the United States, including attributes of both offenders and victims. Each record captures the crime category, disposition, and report type, together with demographic details such as the status, race, gender, and age of the individuals involved. The crime categories include theft, vandalism, violence, sexual crimes, and drug- or weapon-related offenses. This breadth of information enables an in-depth analysis of criminal patterns in the United States and of potential correlations between demographic characteristics and the nature of the crimes committed. Such insights can be valuable for research, policy-making, law enforcement strategies, and social interventions aimed at crime prevention and justice reform.

The dashboard includes eight graphs that can be filtered by disposition (all, open, or closed) and by crime category. These filters allow a closer view of the data, showing how crimes are distributed by offender and victim age, race, and gender, along with a breakdown of report types and victim injury status. In addition, a bar chart displays the overall crime categories and is unaffected by the filters. Finally, the raw data is presented in a table at the end of the dashboard for easy reference.
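The two filters described above can be sketched in plain pandas. This is only an illustration, not the dashboard's actual code: the helper name `filter_crimes`, the `"ALL"` sentinel, and the `"OPEN"` disposition label are assumptions, and the sample rows are made up.

```python
import pandas as pd

def filter_crimes(df: pd.DataFrame, disposition: str = "ALL",
                  category: str = "ALL") -> pd.DataFrame:
    """Return rows matching the chosen disposition and crime category.

    "ALL" is a hypothetical sentinel that leaves a filter switched off.
    """
    mask = pd.Series(True, index=df.index)
    if disposition != "ALL":
        mask &= df["Disposition"] == disposition
    if category != "ALL":
        mask &= df["Category"] == category
    return df[mask]

# Made-up sample using the column names of the cleaned dataset.
sample = pd.DataFrame({
    "Disposition": ["CLOSED", "OPEN", "CLOSED", "OPEN"],
    "Category": ["Theft", "Theft", "Violence", "Vandalism"],
})

closed_theft = filter_crimes(sample, disposition="CLOSED", category="Theft")
print(len(closed_theft))
```

A chart that should ignore the filters, like the overall crime category bar chart, would simply be drawn from the unfiltered frame.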

## Objectives 
1) To investigate the correlation between the age of offenders and the type of crime committed.
2) To identify and analyze gender patterns among victims, contributing to a deeper understanding of how different crimes affect different groups.
3) To explore the breakdown of report types in criminal cases, specifically distinguishing between incident reports and supplemental reports.

## Clean Data

In [24]:
import pandas as pd
import numpy as np
df = pd.read_csv("crime_data.csv")  # load the raw crime records
print(df)

     Disposition OffenderStatus Offender_Race Offender_Gender  Offender_Age  \
0         CLOSED       ARRESTED         BLACK            MALE          30.0   
1         CLOSED       ARRESTED         BLACK            MALE          30.0   
2         CLOSED       ARRESTED         BLACK            MALE          30.0   
3         CLOSED       ARRESTED         BLACK            MALE          30.0   
4         CLOSED       ARRESTED         BLACK            MALE          30.0   
...          ...            ...           ...             ...           ...   
6633      CLOSED       ARRESTED         BLACK          FEMALE          30.0   
6634      CLOSED       ARRESTED         WHITE          FEMALE          20.0   
6635      CLOSED       ARRESTED         BLACK          FEMALE          26.0   
6636      CLOSED       ARRESTED         WHITE            MALE          38.0   
6637      CLOSED       ARRESTED         WHITE            MALE          38.0   

     PersonType Victim_Race Victim_Gender  Victim_A

In [26]:
data = df.copy()  # work on a copy so the raw frame stays untouched

In [28]:
data

,Disposition,OffenderStatus,Offender_Race,Offender_Gender,Offender_Age,PersonType,Victim_Race,Victim_Gender,Victim_Age,Victim_Fatal_Status,Report Type,Category
0,CLOSED,ARRESTED,BLACK,MALE,30.0,VICTIM,BLACK,FEMALE,29.0,Non-fatal,Supplemental Report,Theft
1,CLOSED,ARRESTED,BLACK,MALE,30.0,VICTIM,BLACK,FEMALE,29.0,Non-fatal,Supplemental Report,Theft
2,CLOSED,ARRESTED,BLACK,MALE,30.0,VICTIM,BLACK,FEMALE,29.0,Non-fatal,Supplemental Report,Theft
3,CLOSED,ARRESTED,BLACK,MALE,30.0,VICTIM,BLACK,FEMALE,29.0,Non-fatal,Supplemental Report,Theft
4,CLOSED,ARRESTED,BLACK,MALE,30.0,VICTIM,BLACK,FEMALE,29.0,Non-fatal,Supplemental Report,Theft
...,...,...,...,...,...,...,...,...,...,...,...,...
6633,CLOSED,ARRESTED,BLACK,FEMALE,30.0,VICTIM,WHITE,MALE,34.0,Non-fatal,Supplemental Report,Theft
6634,CLOSED,ARRESTED,WHITE,FEMALE,20.0,VICTIM,WHITE,MALE,34.0,Non-fatal,Supplemental Report,Theft
6635,CLOSED,ARRESTED,BLACK,FEMALE,26.0,VICTIM,BLACK,FEMALE,34.0,Non-fatal,Supplemental Report,Violence
6636,CLOSED,ARRESTED,WHITE,MALE,38.0,VICTIM,UNKNOWN,MALE,31.0,Non-fatal,Supplemental Report,Violence


In [30]:
# rename the columns that do not follow the Word_Word naming pattern
data = data.rename(columns={
    'OffenderStatus': 'Offender_Status',
    'PersonType': 'Person_Type',
    'Report Type': 'Report_Type',
})
data

,Disposition,Offender_Status,Offender_Race,Offender_Gender,Offender_Age,Person_Type,Victim_Race,Victim_Gender,Victim_Age,Victim_Fatal_Status,Report_Type,Category
0,CLOSED,ARRESTED,BLACK,MALE,30.0,VICTIM,BLACK,FEMALE,29.0,Non-fatal,Supplemental Report,Theft
1,CLOSED,ARRESTED,BLACK,MALE,30.0,VICTIM,BLACK,FEMALE,29.0,Non-fatal,Supplemental Report,Theft
2,CLOSED,ARRESTED,BLACK,MALE,30.0,VICTIM,BLACK,FEMALE,29.0,Non-fatal,Supplemental Report,Theft
3,CLOSED,ARRESTED,BLACK,MALE,30.0,VICTIM,BLACK,FEMALE,29.0,Non-fatal,Supplemental Report,Theft
4,CLOSED,ARRESTED,BLACK,MALE,30.0,VICTIM,BLACK,FEMALE,29.0,Non-fatal,Supplemental Report,Theft
...,...,...,...,...,...,...,...,...,...,...,...,...
6633,CLOSED,ARRESTED,BLACK,FEMALE,30.0,VICTIM,WHITE,MALE,34.0,Non-fatal,Supplemental Report,Theft
6634,CLOSED,ARRESTED,WHITE,FEMALE,20.0,VICTIM,WHITE,MALE,34.0,Non-fatal,Supplemental Report,Theft
6635,CLOSED,ARRESTED,BLACK,FEMALE,26.0,VICTIM,BLACK,FEMALE,34.0,Non-fatal,Supplemental Report,Violence
6636,CLOSED,ARRESTED,WHITE,MALE,38.0,VICTIM,UNKNOWN,MALE,31.0,Non-fatal,Supplemental Report,Violence


In [32]:
data.isnull().sum() #check for missing values

Disposition            0
Offender_Status        0
Offender_Race          0
Offender_Gender        0
Offender_Age           0
Person_Type            0
Victim_Race            0
Victim_Gender          0
Victim_Age             0
Victim_Fatal_Status    0
Report_Type            0
Category               0
dtype: int64

In [34]:
data.duplicated().sum() #identify duplicate

1944

In [36]:
data.drop_duplicates(inplace = True) #remove duplicate
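
The row counts in this notebook are consistent with each other: the raw frame is indexed 0 to 6637 (6638 rows), 1944 rows are flagged as duplicates, and 6638 - 1944 = 4694 matches the entry count that `DF.info()` reports for the cleaned file. The sketch below reproduces the same bookkeeping on a made-up four-row frame; the toy values are assumptions for illustration only.

```python
import pandas as pd

# Made-up rows: the first three are identical, mirroring the repeated
# records visible at the top of the raw data.
toy = pd.DataFrame({
    "Offender_Age": [30.0, 30.0, 30.0, 27.0],
    "Category": ["Theft", "Theft", "Theft", "Miscellaneous"],
})

n_before = len(toy)
n_dupes = int(toy.duplicated().sum())  # rows that repeat an earlier row
deduped = toy.drop_duplicates()        # keeps the first occurrence of each row

# Rows kept = rows before - rows flagged as duplicates.
assert len(deduped) == n_before - n_dupes
print(n_before, n_dupes, len(deduped))
```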

In [38]:
data

,Disposition,Offender_Status,Offender_Race,Offender_Gender,Offender_Age,Person_Type,Victim_Race,Victim_Gender,Victim_Age,Victim_Fatal_Status,Report_Type,Category
0,CLOSED,ARRESTED,BLACK,MALE,30.0,VICTIM,BLACK,FEMALE,29.0,Non-fatal,Supplemental Report,Theft
5,CLOSED,ARRESTED,NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER,MALE,27.0,VICTIM,WHITE,FEMALE,62.0,Non-fatal,Incident Report,Miscellaneous
6,CLOSED,ARRESTED,NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER,MALE,27.0,VICTIM,WHITE,MALE,39.0,Non-fatal,Incident Report,Miscellaneous
7,CLOSED,ARRESTED,NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER,MALE,27.0,VICTIM,WHITE,MALE,50.0,Non-fatal,Incident Report,Miscellaneous
9,CLOSED,ARRESTED,BLACK,FEMALE,22.0,VICTIM,BLACK,FEMALE,27.0,Non-fatal,Incident Report,Vandalism
...,...,...,...,...,...,...,...,...,...,...,...,...
6632,CLOSED,ARRESTED,BLACK,FEMALE,30.0,VICTIM,WHITE,MALE,34.0,Non-fatal,Supplemental Report,Theft
6634,CLOSED,ARRESTED,WHITE,FEMALE,20.0,VICTIM,WHITE,MALE,34.0,Non-fatal,Supplemental Report,Theft
6635,CLOSED,ARRESTED,BLACK,FEMALE,26.0,VICTIM,BLACK,FEMALE,34.0,Non-fatal,Supplemental Report,Violence
6636,CLOSED,ARRESTED,WHITE,MALE,38.0,VICTIM,UNKNOWN,MALE,31.0,Non-fatal,Supplemental Report,Violence


In [40]:
# replace data: shorten the long race labels in both race columns
race_labels = {
    "NATIVE HAWAIIAN OR OTHER PACIFIC ISLANDER": "OTHERS",
    "AMER. IND.": "AMERIND",
}
for col in ['Offender_Race', 'Victim_Race']:
    data[col] = data[col].replace(race_labels)

In [42]:
data

,Disposition,Offender_Status,Offender_Race,Offender_Gender,Offender_Age,Person_Type,Victim_Race,Victim_Gender,Victim_Age,Victim_Fatal_Status,Report_Type,Category
0,CLOSED,ARRESTED,BLACK,MALE,30.0,VICTIM,BLACK,FEMALE,29.0,Non-fatal,Supplemental Report,Theft
5,CLOSED,ARRESTED,OTHERS,MALE,27.0,VICTIM,WHITE,FEMALE,62.0,Non-fatal,Incident Report,Miscellaneous
6,CLOSED,ARRESTED,OTHERS,MALE,27.0,VICTIM,WHITE,MALE,39.0,Non-fatal,Incident Report,Miscellaneous
7,CLOSED,ARRESTED,OTHERS,MALE,27.0,VICTIM,WHITE,MALE,50.0,Non-fatal,Incident Report,Miscellaneous
9,CLOSED,ARRESTED,BLACK,FEMALE,22.0,VICTIM,BLACK,FEMALE,27.0,Non-fatal,Incident Report,Vandalism
...,...,...,...,...,...,...,...,...,...,...,...,...
6632,CLOSED,ARRESTED,BLACK,FEMALE,30.0,VICTIM,WHITE,MALE,34.0,Non-fatal,Supplemental Report,Theft
6634,CLOSED,ARRESTED,WHITE,FEMALE,20.0,VICTIM,WHITE,MALE,34.0,Non-fatal,Supplemental Report,Theft
6635,CLOSED,ARRESTED,BLACK,FEMALE,26.0,VICTIM,BLACK,FEMALE,34.0,Non-fatal,Supplemental Report,Violence
6636,CLOSED,ARRESTED,WHITE,MALE,38.0,VICTIM,UNKNOWN,MALE,31.0,Non-fatal,Supplemental Report,Violence


In [44]:
data['Victim_Age'] = data['Victim_Age'].astype('int')
data

,Disposition,Offender_Status,Offender_Race,Offender_Gender,Offender_Age,Person_Type,Victim_Race,Victim_Gender,Victim_Age,Victim_Fatal_Status,Report_Type,Category
0,CLOSED,ARRESTED,BLACK,MALE,30.0,VICTIM,BLACK,FEMALE,29,Non-fatal,Supplemental Report,Theft
5,CLOSED,ARRESTED,OTHERS,MALE,27.0,VICTIM,WHITE,FEMALE,62,Non-fatal,Incident Report,Miscellaneous
6,CLOSED,ARRESTED,OTHERS,MALE,27.0,VICTIM,WHITE,MALE,39,Non-fatal,Incident Report,Miscellaneous
7,CLOSED,ARRESTED,OTHERS,MALE,27.0,VICTIM,WHITE,MALE,50,Non-fatal,Incident Report,Miscellaneous
9,CLOSED,ARRESTED,BLACK,FEMALE,22.0,VICTIM,BLACK,FEMALE,27,Non-fatal,Incident Report,Vandalism
...,...,...,...,...,...,...,...,...,...,...,...,...
6632,CLOSED,ARRESTED,BLACK,FEMALE,30.0,VICTIM,WHITE,MALE,34,Non-fatal,Supplemental Report,Theft
6634,CLOSED,ARRESTED,WHITE,FEMALE,20.0,VICTIM,WHITE,MALE,34,Non-fatal,Supplemental Report,Theft
6635,CLOSED,ARRESTED,BLACK,FEMALE,26.0,VICTIM,BLACK,FEMALE,34,Non-fatal,Supplemental Report,Violence
6636,CLOSED,ARRESTED,WHITE,MALE,38.0,VICTIM,UNKNOWN,MALE,31,Non-fatal,Supplemental Report,Violence


In [46]:
data['Offender_Age'] = data['Offender_Age'].astype('int')
data

,Disposition,Offender_Status,Offender_Race,Offender_Gender,Offender_Age,Person_Type,Victim_Race,Victim_Gender,Victim_Age,Victim_Fatal_Status,Report_Type,Category
0,CLOSED,ARRESTED,BLACK,MALE,30,VICTIM,BLACK,FEMALE,29,Non-fatal,Supplemental Report,Theft
5,CLOSED,ARRESTED,OTHERS,MALE,27,VICTIM,WHITE,FEMALE,62,Non-fatal,Incident Report,Miscellaneous
6,CLOSED,ARRESTED,OTHERS,MALE,27,VICTIM,WHITE,MALE,39,Non-fatal,Incident Report,Miscellaneous
7,CLOSED,ARRESTED,OTHERS,MALE,27,VICTIM,WHITE,MALE,50,Non-fatal,Incident Report,Miscellaneous
9,CLOSED,ARRESTED,BLACK,FEMALE,22,VICTIM,BLACK,FEMALE,27,Non-fatal,Incident Report,Vandalism
...,...,...,...,...,...,...,...,...,...,...,...,...
6632,CLOSED,ARRESTED,BLACK,FEMALE,30,VICTIM,WHITE,MALE,34,Non-fatal,Supplemental Report,Theft
6634,CLOSED,ARRESTED,WHITE,FEMALE,20,VICTIM,WHITE,MALE,34,Non-fatal,Supplemental Report,Theft
6635,CLOSED,ARRESTED,BLACK,FEMALE,26,VICTIM,BLACK,FEMALE,34,Non-fatal,Supplemental Report,Violence
6636,CLOSED,ARRESTED,WHITE,MALE,38,VICTIM,UNKNOWN,MALE,31,Non-fatal,Supplemental Report,Violence
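
Casting a float column with `astype('int')` drops any fractional part and fails on missing values, so it can be worth confirming that every age is a whole number first. The check below is an optional sketch on made-up values, not a step from the original notebook.

```python
import pandas as pd

# Made-up ages stored as floats, as in the raw data.
ages = pd.Series([30.0, 27.0, 22.0], name="Offender_Age")

# True only if every value is a whole number; NaN fails this test too.
is_whole = bool((ages % 1 == 0).all())
assert is_whole, "some ages are fractional or missing"

ages_int = ages.astype("int64")
print(ages_int.tolist())
```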


In [48]:
data.to_csv('clean_data_crime.csv', index=False)

In [50]:
DF = pd.read_csv('clean_data_crime.csv')  # reload the cleaned file to verify it

In [52]:
DF.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4694 entries, 0 to 4693
Data columns (total 12 columns):
 #   Column               Non-Null Count  Dtype 
---  ------               --------------  ----- 
 0   Disposition          4694 non-null   object
 1   Offender_Status      4694 non-null   object
 2   Offender_Race        4694 non-null   object
 3   Offender_Gender      4694 non-null   object
 4   Offender_Age         4694 non-null   int64 
 5   Person_Type          4694 non-null   object
 6   Victim_Race          4694 non-null   object
 7   Victim_Gender        4694 non-null   object
 8   Victim_Age           4694 non-null   int64 
 9   Victim_Fatal_Status  4694 non-null   object
 10  Report_Type          4694 non-null   object
 11  Category             4694 non-null   object
dtypes: int64(2), object(10)
memory usage: 440.2+ KB


## Dashboard link
https://dashboard-filzahinsyi.streamlit.app/

## Discussion 
1) To investigate the correlation between the age of offenders and the type of crime committed.

The first bar graph illustrates the distribution of crime by offender age. With no filter on disposition or crime category, the 30 to 34 age group has the highest number of cases, totaling 717. This indicates that individuals in this age range are the most frequently recorded offenders in the dataset. Within this group, violent crimes are the most common with 393 cases, followed by miscellaneous offenses (157 cases) and theft (127 cases).

However, criminal activity is not limited to adults. The data also reveals involvement from both teenagers and the elderly. Crimes have been recorded among offenders as young as 10 years old: the 10 to 14 age group accounts for about 63 cases, and the 15 to 19 age group shows a sharp rise to 467 cases. The exposure of youth to criminal behaviour at such early ages is highly concerning and underscores the importance of early preventive measures to reduce their involvement in crime.

Among elderly offenders, the data shows 159 cases for the 60–64 age group, followed by 62 cases for those aged 65–69, 23 cases for ages 70–74, and 13 cases for the 75–79 group. These findings suggest that crime prevention strategies should be reinforced across all age groups to effectively reduce criminal activity throughout the country.
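
The age groups quoted above (10 to 14, 15 to 19, ..., 75 to 79) correspond to five-year bins. The dashboard's own binning code is not shown here, so the following is a minimal sketch of one way to produce such counts with `pd.cut`, assuming bins closed on the left and using made-up rows.

```python
import pandas as pd

# Made-up ages and categories; column names follow the cleaned dataset.
toy = pd.DataFrame({
    "Offender_Age": [12, 17, 19, 30, 31, 34, 62],
    "Category": ["Theft", "Theft", "Violence", "Violence",
                 "Violence", "Miscellaneous", "Theft"],
})

# Five-year bins closed on the left: [10, 15), [15, 20), ..., [75, 80).
edges = list(range(10, 85, 5))
labels = [f"{lo}-{lo + 4}" for lo in edges[:-1]]
toy["Age_Group"] = pd.cut(toy["Offender_Age"], bins=edges,
                          labels=labels, right=False)

# Cases per age group, then the category breakdown within each group.
per_group = toy["Age_Group"].value_counts().sort_index()
by_category = pd.crosstab(toy["Age_Group"], toy["Category"])

print(per_group["30-34"])
print(by_category.loc["30-34", "Violence"])
```

Counting cases per bin and then splitting each bin by `Category` is the same two-step reading used in the discussion: first find the busiest age group, then look at which crime types dominate inside it.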

2) To identify and analyze gender patterns among victims, contributing to a deeper understanding of how different crimes affect different groups.

According to the pie chart in the dashboard, victims are categorized by gender as female, male, or unknown. Females make up the majority of victims, accounting for 56.9%, followed by males at 43%, and 2% whose gender is unknown. This suggests that, across the dataset as a whole, females are targeted more often than males.

However, for drug- and weapon-related crimes and for theft, the majority of victims are male. Several factors may contribute to this. Males, particularly young men, are more frequently exposed to high-risk environments such as street crime and illegal markets, where these types of crimes are more common. This increases their chances of being both perpetrators and victims. In addition, social norms and gender expectations can push male behaviour toward more aggressive responses, which can escalate encounters involving drugs or weapons. On the other hand, female victims in these categories may be underrepresented in the data because of underreporting, often caused by fear, stigma, or limited access to support systems, particularly in domestic or intimate partner contexts where weapons or drugs are involved.

In conclusion, while the overall victim data shows a higher percentage of females, specific crime types such as drug and weapon offenses and theft mostly affect males, likely because of their greater exposure and risk factors. This highlights the need for targeted prevention strategies that address the unique vulnerabilities and social dynamics affecting different groups.
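
The comparison above relies on two views of the same column: the overall gender share, and the share within each crime category. A minimal pandas sketch of both is given below; the rows are made up and the resulting percentages are not the dashboard's figures.

```python
import pandas as pd

# Made-up victims: females are the majority overall, males within Theft.
toy = pd.DataFrame({
    "Victim_Gender": ["FEMALE", "FEMALE", "FEMALE", "FEMALE",
                      "FEMALE", "MALE", "MALE", "MALE"],
    "Category": ["Violence", "Violence", "Violence", "Vandalism",
                 "Theft", "Theft", "Theft", "Violence"],
})

# Overall share of each gender, in percent.
overall_pct = toy["Victim_Gender"].value_counts(normalize=True).mul(100).round(1)

# Share of each gender within every crime category (each row sums to 100).
by_category_pct = pd.crosstab(
    toy["Category"], toy["Victim_Gender"], normalize="index"
).mul(100).round(1)

print(overall_pct)
print(by_category_pct)
```

Normalizing by row (`normalize="index"`) is what lets a group that is a minority overall still show up as the majority inside a single category.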

3) To explore the breakdown of report types in criminal cases, specifically distinguishing between incident reports and supplemental reports.

The doughnut chart illustrates the report type breakdown, which consists of incident reports and supplemental reports. An incident report is the initial and primary documentation of a crime, typically created when the crime is first discovered or reported. In contrast, a supplemental report is an add-on or update to the original incident report. Multiple supplemental reports can be generated for a single case as the investigation progresses.

According to the dashboard, incident reports make up the majority, accounting for 57.3% of all reports, while supplemental reports represent 42.7%. Certain crime categories, such as drug- and weapon-related crimes and sexual crimes, are documented entirely as incident reports. For miscellaneous crimes, 67.3% are incident reports and 32.7% are supplemental. Vandalism and violence show a similar trend, with 61.5% and 64.9% incident reports respectively. The pattern reverses for theft-related crimes, where supplemental reports dominate at 72.2%, compared with only 27.8% incident reports.

In conclusion, while incident reports form the bulk of crime documentation overall, the breakdown varies by crime type. Categories involving property loss or ongoing updates, such as theft, tend to generate more supplemental reports, possibly due to follow-up investigations or delayed discoveries. This variation underscores the importance of report type in interpreting crime patterns and investigative practices.
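
One way to find the categories where the pattern reverses is to compute the report type share per category and pick out those where supplemental reports are the largest share. The sketch below assumes the `Report_Type` labels seen in the cleaned data and uses made-up rows.

```python
import pandas as pd

# Made-up reports; labels match those in the cleaned dataset.
toy = pd.DataFrame({
    "Category": ["Theft", "Theft", "Theft",
                 "Violence", "Violence", "Violence", "Vandalism"],
    "Report_Type": ["Supplemental Report", "Supplemental Report", "Incident Report",
                    "Incident Report", "Incident Report", "Supplemental Report",
                    "Incident Report"],
})

# Percentage of each report type within every category.
share = pd.crosstab(toy["Category"], toy["Report_Type"], normalize="index") * 100

# Report type with the largest share in each category.
dominant = share.idxmax(axis=1)
supplemental_led = dominant[dominant == "Supplemental Report"].index.tolist()

print(supplemental_led)
```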

## Conclusion
The analysis of the crime dataset reveals important patterns in criminal activity across the United States. The 30-34 age group is the most common among offenders, particularly in violent crimes, though criminal activity is observed across the full recorded age range, from 10 to 79 years old. This indicates the need for prevention strategies that span all ages.

Moreover, the data shows that females represent the majority of victims, suggesting their increased vulnerability in many crime types. However, males are more often victims in drug- and weapon-related offenses and theft. These trends underscore the importance of tailored intervention efforts based on gender and crime type. The report type breakdown further clarifies how crimes are documented. Incident reports make up 57.3% of all records and are dominant in most categories, whereas supplemental reports are more common in theft, suggesting frequent updates or ongoing investigations in property-related cases.

In summary, the dataset offers valuable insights into the relationships between offender and victim characteristics, crime types, and reporting practices. These findings can guide policy-making, law enforcement priorities, and community-based interventions to reduce crime and promote public safety more effectively across different population segments and crime categories.