NeoRecasata/crime-data-analysis

LAPD Crime Data Analysis (2010-2017)

Data cleaning and exploratory analysis of LAPD crime incident records, prepared for visualization in Tableau.

Overview

This project analyzes over 1.5 million crime incidents reported by the Los Angeles Police Department between 2010 and 2017. The Python script, exploration.py, cleans and enriches the raw dataset for use in Tableau dashboards.

Data Cleaning

  • Normalizes whitespace in address fields
  • Converts integer time values to HH:MM format for Tableau compatibility
  • Splits combined lat/long Location field into separate Latitude and Longitude columns
  • Drops the MO Codes column, whose values are too vague to analyze
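
As a rough illustration of the steps above, the following pandas sketch applies each one to a DataFrame. The column names (Address, Time Occurred, Location, MO Codes), the "(lat, long)" string format of Location, and the function name clean_crime_data are assumptions made for this example and may not match what exploration.py actually does.

```python
import pandas as pd


def clean_crime_data(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the four cleaning steps to a copy of df (column names are assumed)."""
    out = df.copy()

    # Collapse runs of whitespace inside addresses and trim both ends.
    out["Address"] = (
        out["Address"].str.replace(r"\s+", " ", regex=True).str.strip()
    )

    # Integer times such as 5 or 1430 become "00:05" and "14:30".
    hhmm = out["Time Occurred"].astype(int).astype(str).str.zfill(4)
    out["Time Occurred"] = hhmm.str[:2] + ":" + hhmm.str[2:]

    # Pull two signed decimal numbers out of a string like "(34.0395, -118.2656)".
    coords = out["Location"].str.extract(
        r"\(?\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)"
    )
    out["Latitude"] = coords[0].astype(float)
    out["Longitude"] = coords[1].astype(float)

    # Remove the MO Codes column entirely.
    return out.drop(columns=["MO Codes"])
```

Working on a copy keeps the raw DataFrame untouched, which makes it easier to compare values before and after cleaning.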

Feature Engineering

  • Days Between Date Occurred and Date Reported - reporting delay in days
  • Month Occurred - extracted month name for seasonal analysis
  • Victim Age Group - bucketed age ranges (0-10, 11-20, ..., 91-99)
  • Time Category - time-of-day buckets (Late Night, Early Morning, Morning, etc.)
  • Victim Descent Type - full ethnicity names mapped from LAPD descent codes
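
The derived columns above could be computed along these lines. This is a minimal sketch, not the project's code: the repository maps values through the CSV files in data/lookup/, whereas this example swaps in pd.cut and a small inline dictionary so it stays self-contained. The input column names, the MM/DD/YYYY date format, the function name add_features, and the two descent entries shown are assumptions. Time Category is left out because its bucket boundaries are not stated here.

```python
import pandas as pd

# Hypothetical stand-in for data/lookup/victimdescent.csv (only two codes shown).
DESCENT_NAMES = {"H": "Hispanic/Latin/Mexican", "W": "White"}

# Bin edges for the age groups 0-10, 11-20, ..., 81-90, 91-99.
AGE_EDGES = [-1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 99]
AGE_LABELS = ["0-10"] + [
    f"{lo + 1}-{hi}" for lo, hi in zip(AGE_EDGES[1:-1], AGE_EDGES[2:])
]


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add the engineered columns to a copy of df (column names are assumed)."""
    out = df.copy()
    occurred = pd.to_datetime(out["Date Occurred"], format="%m/%d/%Y")
    reported = pd.to_datetime(out["Date Reported"], format="%m/%d/%Y")

    # Reporting delay in whole days.
    out["Days Between Date Occurred and Date Reported"] = (
        reported - occurred
    ).dt.days

    # Full month name, e.g. "January".
    out["Month Occurred"] = occurred.dt.month_name()

    # Each bin includes its upper edge, so age 10 falls in "0-10".
    out["Victim Age Group"] = pd.cut(
        out["Victim Age"], bins=AGE_EDGES, labels=AGE_LABELS
    )

    # Codes missing from the mapping become NaN rather than raising.
    out["Victim Descent Type"] = out["Victim Descent"].map(DESCENT_NAMES)
    return out
```

A lookup file has the advantage that analysts can adjust the buckets without touching the script, while pd.cut avoids listing every individual age.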

Dataset

The raw dataset is too large for GitHub (~456 MB). Download it from the source:

Crime Data from 2010 to 2019 - City of Los Angeles Open Data

Place the downloaded CSV as data/Crime_Data_2010_2017.csv before running the script.

Usage

pip install pandas numpy

python exploration.py

The cleaned dataset will be saved to data/lapd_crime_dataset_cleaned.csv.

Project Structure

├── exploration.py                  # Data cleaning and feature engineering script
├── data/
│   ├── lookup/
│   │   ├── agegroup.csv            # Age to age group mapping
│   │   ├── time.csv                # Time to time-of-day category mapping
│   │   └── victimdescent.csv       # LAPD descent code to ethnicity mapping
│   └── LAPD_Reporting_Districts.zip  # LAPD reporting district boundaries (shapefile)
└── README.md

Tools

  • Python (Pandas, NumPy) - data cleaning and transformation
  • Tableau - visualization and dashboards

Authors

  • Franco Neo Recasata
  • Jaime Tanedo
