# Baltimore Crime Data Analysis

This notebook performs an exploratory data analysis (EDA) of Baltimore crime data. The goal is to identify temporal and spatial patterns in the recorded incidents.

In [1]:
# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set visualization style
sns.set(style="whitegrid")

## Load the Data

Load the crime data from the CSV file into a pandas DataFrame.

In [2]:
# Load the crime data
crime_data = pd.read_csv('Part_1_Crime_Data.csv')

# Display the first few rows of the dataset
crime_data.head()

## Data Cleaning

Perform initial data cleaning: remove exact duplicate rows, then drop every row that has a missing value in any column.

In [3]:
# Remove duplicates
crime_data.drop_duplicates(inplace=True)

# Drop rows that have a missing value in any column
crime_data.dropna(inplace=True)

# Display the cleaned dataset
crime_data.head()
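
Dropping every row with any missing value can discard far more data than intended when one optional column is mostly empty. The sketch below shows one way to check this before cleaning. It uses a small made-up frame rather than the real file, and the `Weapon` column is a hypothetical example of a sparsely filled field.

```python
import numpy as np
import pandas as pd

# Toy stand-in for crime_data; values and the 'Weapon' column are assumptions.
toy = pd.DataFrame({
    "CrimeDate": ["01/01/2020", "01/01/2020", "01/02/2020", "01/03/2020"],
    "Neighborhood": ["Downtown", "Downtown", None, "Canton"],
    "Weapon": [np.nan, np.nan, "KNIFE", np.nan],
})

# Share of missing values per column, largest first
missing_share = toy.isna().mean().sort_values(ascending=False)
print(missing_share)

deduplicated = toy.drop_duplicates()

# Rows that survive a blanket dropna() ...
rows_all = len(deduplicated.dropna())
# ... versus dropping only where the columns used later are missing
rows_subset = len(deduplicated.dropna(subset=["CrimeDate", "Neighborhood"]))
print(rows_all, rows_subset)
```

If the two row counts differ a lot on the real data, restricting `dropna` to the columns the analysis needs may be the safer choice.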

## Descriptive Statistics

Calculate descriptive statistics for the numeric columns to see how their values are distributed.

In [4]:
# Calculate descriptive statistics
crime_data.describe()
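
On a frame with mixed column types, `describe()` summarizes only the numeric columns, so text fields such as the neighborhood name are left out. A minimal sketch of summarizing text columns as well, using a made-up frame; the `Description` column and its values are assumptions for illustration:

```python
import pandas as pd

# Toy stand-in for crime_data; column names and values are assumptions.
toy = pd.DataFrame({
    "Neighborhood": ["Downtown", "Canton", "Downtown", "Fells Point"],
    "Description": ["LARCENY", "BURGLARY", "LARCENY", "ROBBERY"],
})

# include="all" adds count, unique, top (most frequent value) and freq for text columns
summary = toy.describe(include="all")
print(summary)

# Frequency of each category, most common first
counts = toy["Description"].value_counts()
print(counts)
```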

## Visualization

Create visualizations to identify temporal and spatial patterns in the crime data.

In [5]:
# Count incidents per neighborhood (rows) and date (columns), then plot the counts as a heatmap
incident_counts = crime_data.pivot_table(
    index='Neighborhood', columns='CrimeDate', values='CrimeID', aggfunc='count'
)
plt.figure(figsize=(12, 6))
sns.heatmap(incident_counts, cmap='YlGnBu')
plt.title('Heatmap of Crime Incidents by Neighborhood')
plt.xlabel('Date')
plt.ylabel('Neighborhood')
plt.show()
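
A heatmap with one column per individual date can become too wide to read. One possible alternative is to count incidents per neighborhood and month first. The sketch below builds such a table from made-up rows; the `%m/%d/%Y` date format is an assumption about how `CrimeDate` is stored.

```python
import pandas as pd

# Toy stand-in for crime_data; the date format is an assumption.
toy = pd.DataFrame({
    "CrimeDate": ["01/05/2020", "01/20/2020", "02/03/2020", "02/10/2020", "02/11/2020"],
    "Neighborhood": ["Downtown", "Downtown", "Canton", "Downtown", "Canton"],
})

# Label each incident with its year-month, e.g. "2020-01"
month = pd.to_datetime(toy["CrimeDate"], format="%m/%d/%Y").dt.strftime("%Y-%m")

# Rows: neighborhoods, columns: months, cells: incident counts (0 where none occurred)
monthly_table = pd.crosstab(toy["Neighborhood"], month)
print(monthly_table)
```

The resulting table can be passed to `sns.heatmap` in place of the per-date pivot table.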

In [6]:
# Create a line chart of crime incidents over time
plt.figure(figsize=(12, 6))
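# Parse the dates and use them as the index; later cells rely on this index
crime_data['CrimeDate'] = pd.to_datetime(crime_data['CrimeDate'])
crime_data.set_index('CrimeDate', inplace=True)
# Count incidents per calendar month ('M' = month end; pandas 2.2+ prefers the alias 'ME')
crime_data.resample('M').size().plot()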
plt.title('Crime Incidents Over Time')
plt.xlabel('Date')
plt.ylabel('Number of Incidents')
plt.show()
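
The monthly counts can also be computed without moving `CrimeDate` into the index, which leaves the frame unchanged for other cells. This is a sketch of that alternative on made-up dates, not the approach used in this notebook:

```python
import pandas as pd

# Toy stand-in for crime_data with already parsed dates (values are made up).
toy = pd.DataFrame({
    "CrimeDate": pd.to_datetime(["2020-01-05", "2020-01-20", "2020-03-03"]),
})

# Group by calendar month; the column stays in place and the index is untouched
monthly_counts = toy.groupby(toy["CrimeDate"].dt.to_period("M")).size()

# groupby skips months with no incidents, so reindex to a full monthly range
full_range = pd.period_range(
    monthly_counts.index.min(), monthly_counts.index.max(), freq="M"
)
monthly_counts = monthly_counts.reindex(full_range, fill_value=0)
print(monthly_counts)
```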

## Grouping by Neighborhood

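Count incidents per neighborhood and year so that neighborhoods can be compared year over year. This cell relies on the `CrimeDate` datetime index set in the previous cell.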

In [7]:
# Extract the year from the datetime index
crime_data['Year'] = crime_data.index.year
# Count incidents per neighborhood (rows) and year (columns); missing combinations become 0
neighborhood_crime = crime_data.groupby(['Neighborhood', 'Year']).size().unstack(fill_value=0)

# Display the aggregated data
neighborhood_crime.head()
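
Once the counts are laid out as neighborhoods by years, a year-over-year comparison reduces to subtracting one column from another. The sketch below assumes a table of that shape; the counts are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Toy table shaped like neighborhood_crime; the counts are invented.
neighborhood_by_year = pd.DataFrame(
    {2019: [10, 4], 2020: [12, 2]},
    index=pd.Index(["Downtown", "Canton"], name="Neighborhood"),
)

# Change in incident count from 2019 to 2020 for each neighborhood
change = neighborhood_by_year[2020] - neighborhood_by_year[2019]

# Largest increases first
ranked = change.sort_values(ascending=False)
print(ranked)
```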