In this presentation, we explore reported crime incidents in Chicago, using a dataset that extends from 2001 to the present, excluding the most recent seven days. The data is extracted from the Chicago Police Department's Citizen Law Enforcement Analysis and Reporting (CLEAR) system and reflects a broad spectrum of crime incidents, with a notable exception for murders, where data is recorded separately for each victim.
A key aspect of this dataset is its commitment to the privacy of crime victims. To this end, the information is generalized to the block level, without pinpointing specific locations. It's important to highlight that the dataset encompasses unverified reports and preliminary crime classifications that may be subject to change following further investigation. This aspect underscores the dynamic and somewhat tentative nature of the data.
Given the potential for mechanical or human error, the Chicago Police Department explicitly states that the accuracy, completeness, timeliness, or correct sequencing of the data cannot be guaranteed. As a result, this dataset should not be used for time-based comparative purposes.
This presentation aims to provide a data-driven narrative on public safety in Chicago. We will delve into this rich dataset, publicly available under the terms provided by the City of Chicago and offered 'AS IS' by Google, to uncover patterns, understand trends, and offer insights into the complex domain of urban crime and safety. For more information visit: click here
Querying massive datasets can be time-consuming and expensive without the right hardware and infrastructure. Google BigQuery solves this problem by enabling super-fast SQL queries against append-mostly tables, using the processing power of Google's infrastructure.
SQL Code Configuration
SQL Code: Pull Request raw code
Quick overview of SQL alias list.
-- Gets data from the last 5 years
SELECT * FROM `bigquery-public-data.chicago_crime.crime` WHERE year >= EXTRACT(YEAR FROM CURRENT_DATE()) - 5;
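To make the cutoff in the query above concrete, here is a minimal Python sketch that mirrors the `WHERE year >= EXTRACT(YEAR FROM CURRENT_DATE()) - 5` condition on a handful of made-up rows. The helper names (`cutoff_year`, `filter_recent`), the sample rows, and the fixed "today" date are assumptions for illustration only; they are not part of the project's code.

```python
from datetime import date


def cutoff_year(today: date, years_back: int = 5) -> int:
    """Mirror EXTRACT(YEAR FROM CURRENT_DATE()) - 5 from the SQL query."""
    return today.year - years_back


def filter_recent(rows, today: date, years_back: int = 5):
    """Keep rows whose `year` is at or after the cutoff (inclusive, like >=)."""
    cutoff = cutoff_year(today, years_back)
    return [row for row in rows if row["year"] >= cutoff]


# Made-up rows for illustration; real rows come from the BigQuery table.
sample_rows = [
    {"unique_key": 1, "year": 2017, "primary_type": "THEFT"},
    {"unique_key": 2, "year": 2019, "primary_type": "BATTERY"},
    {"unique_key": 3, "year": 2021, "primary_type": "THEFT"},
    {"unique_key": 4, "year": 2024, "primary_type": "ASSAULT"},
]

# A fixed date keeps the example reproducible.
recent = filter_recent(sample_rows, today=date(2024, 1, 5))
recent_years = sorted({row["year"] for row in recent})
print(recent_years)  # [2019, 2021, 2024]
```

Because the comparison is inclusive, running the query in 2024 returns rows for 2019 through 2024, which is five full calendar years plus the current year to date.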
This project is intended for law enforcement personnel and anyone interested in crime data. Its main intention is to raise awareness of crimes and hotspots.
If you are interested in the SQL pull, see the previous section. I devoted most of my effort to visuals that show the rise of crime from 2019 to 2023. The Tableau link includes a high-level overview of the data. For programmers, I have attached links to a Python file and R code that present a temporal analysis. Crime rates have been up, and the goal is to raise awareness of how important government officials are. The analysis specifically highlights hotspots. For a quick Excel snapshot: click here.
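As a rough idea of what a temporal and block-level tally can look like, the following sketch counts incidents per month and crime type, and ranks blocks by incident count, using only the Python standard library. It is not the attached Python file: the records are invented, and the field names (`date`, `primary_type`, `block`) are assumed to match the dataset's columns. The Chicago Police Department's caveat about time-based comparisons applies to any counts produced this way.

```python
from collections import Counter
from datetime import datetime


def monthly_counts(records):
    """Count incidents per (year, month, crime type)."""
    counts = Counter()
    for rec in records:
        ts = datetime.fromisoformat(rec["date"])
        counts[(ts.year, ts.month, rec["primary_type"])] += 1
    return counts


def top_blocks(records, n=3):
    """Return the n blocks with the most incidents, as (block, count) pairs."""
    return Counter(rec["block"] for rec in records).most_common(n)


# Invented records for illustration only.
records = [
    {"date": "2022-01-15T10:30:00", "primary_type": "THEFT", "block": "001XX N STATE ST"},
    {"date": "2022-01-20T23:05:00", "primary_type": "THEFT", "block": "001XX N STATE ST"},
    {"date": "2022-02-03T08:00:00", "primary_type": "BATTERY", "block": "0000X W MADISON ST"},
    {"date": "2023-01-09T14:45:00", "primary_type": "THEFT", "block": "001XX N STATE ST"},
    {"date": "2023-01-11T02:10:00", "primary_type": "BATTERY", "block": "0000X W MADISON ST"},
]

by_month = monthly_counts(records)
busiest = top_blocks(records, n=1)

for key in sorted(by_month):
    print(key, by_month[key])
print(busiest)
```

Grouping by `(year, month, type)` keeps the tally easy to pivot later, for example to compare the same month across years for one crime type.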
The report titled 'Chicago Crime Data Reporting,' dated January 5, 2024, offers a thorough exploratory data analysis (EDA) and a detailed temporal evaluation of crime statistics in Chicago for the years 2022-2023. Executed using R and various libraries including readr, lubridate, dplyr, ggplot2, leaflet, leaflet.extras, and cluster, the analysis covers data preparation, identifying predominant crime types through bar plots, and a block-level scrutiny to pinpoint regions with heightened specific criminal activities. The study further investigates time-related trends in these areas, revealing notable shifts in crime frequencies. A key highlight of this analysis is the fusion of geospatial and temporal visualizations with interactive maps, augmented by machine learning techniques such as K-means clustering to delineate crime hotspots. This all-encompassing method not only sheds light on crime concentration and critical zones but also provides valuable insights for decision-making by policymakers and law enforcement bodies. This analysis serves as a powerful instrument in comprehending and tackling the nuances of urban crime.
- R Markdown file
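The hotspot step described in the report can be illustrated with a small K-means sketch. The report itself was done in R; this is a plain-Python version of Lloyd's iteration with a deterministic farthest-first seeding, chosen here only so the example is reproducible, and it is not necessarily the variant the R analysis used. The coordinates are invented, and treating latitude and longitude as flat Euclidean coordinates is a simplifying assumption that is only reasonable over a city-sized area.

```python
from math import dist


def farthest_first_seeds(points, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from all seeds chosen so far (deterministic, illustrative)."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(dist(p, s) for s in seeds)))
    return seeds


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until
    the labels stop changing or max_iter is reached."""
    centroids = farthest_first_seeds(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda i: dist(p, centroids[i])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Invented (latitude, longitude) pairs forming two loose groups.
incident_points = [
    (41.880, -87.630), (41.882, -87.628), (41.878, -87.632),
    (41.750, -87.600), (41.752, -87.602), (41.748, -87.598),
]

hotspot_centroids, hotspot_labels = kmeans(incident_points, k=2)
print(hotspot_labels)
print(hotspot_centroids)
```

Each centroid can then be plotted as a candidate hotspot marker on a map, with the label list indicating which incidents belong to which cluster.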
Check out my Tableau dashboard: https://public.tableau.com/shared/2ZZDXFHQX?:display_count=n&:origin=viz_share_link . For a high-level overview, click on the PowerPoint in the resources section below.