[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/zjelveh/zjelveh.github.io/blob/master/files/cfc/ps/problem_set_2_solutions.ipynb)

**IMPORTANT**: Save your own copy!
1. Click File → Save a copy in Drive
2. Rename it like: "PS2_YourName"
3. Work in YOUR copy, not the original


---


# Problem Set 2: DC Crime Analysis with Pandas - SOLUTIONS
## CCJS 418E - Fall 2025

**Due**: Sunday, October 26, 11:59pm

**Submit**: If you are working on Colab, download your notebook **TWICE**: once as .ipynb and once as .py. **UPLOAD BOTH FILES.**
- [Submit here](https://umd.instructure.com/courses/1389501/assignments/7376099)

**Code Review**: Tuesday, October 29 in class

---

## Scenario

On August 7, 2025, the federal government initiated an emergency intervention in DC's public safety system, deploying federal resources to assist the Metropolitan Police Department. Local community organizations want to understand whether this "Federal Surge" has had any measurable impact on crime patterns.

You've been asked to analyze DC crime data from 2025 to help these organizations understand:
- What crime patterns looked like before the federal intervention
- Whether there have been changes since August 7
- Which wards have been most affected

This analysis will inform community advocacy and help residents understand the effectiveness of the federal intervention.

You may use AI tools to help write and debug your code. During the code review, you'll explain your logic and demonstrate you understand how pandas operations work.
- [Link to TerpAI](https://terpai.umd.edu/chat)

## Setup: Import Libraries and Load Data

First, we need to import pandas and load the DC crime data. The dataset contains incidents reported from 30 days before to 30 days after Aug 7, 2025.

**Note**: The dataset includes helpful pre-calculated columns:
- `intervention_period`: Labels crimes as "Pre-Intervention" or "Post-Intervention" (before/after August 7)
- `days_before_after`: Number of days from intervention (negative = before, positive = after)
- `is_violent`, `is_property`, `is_weekend`: Boolean flags for crime types and timing
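For reference, helper columns like these could be derived from `report_dat` roughly as follows. This is only a sketch on a hypothetical three-row sample, not the script that built the course file. In particular, treating day 0 (August 7 itself) as "Post-Intervention" is an assumption, and `sample` and `INTERVENTION_DATE` are illustrative names.

```python
import pandas as pd

# Hypothetical mini-sample; the real CSV already ships with these columns
sample = pd.DataFrame({"report_dat": ["2025-07-08", "2025-08-06", "2025-08-09"]})
sample["report_dat"] = pd.to_datetime(sample["report_dat"])

INTERVENTION_DATE = pd.Timestamp("2025-08-07")

# Signed offset in days from the intervention date (negative = before)
sample["days_before_after"] = (sample["report_dat"] - INTERVENTION_DATE).dt.days

# Assumption: day 0 and later are labeled "Post-Intervention"
sample["is_after_intervention"] = sample["days_before_after"] >= 0
sample["intervention_period"] = sample["is_after_intervention"].map(
    {True: "Post-Intervention", False: "Pre-Intervention"}
)

# pandas numbers Monday as 0, so Saturday is 5 and Sunday is 6
sample["day_of_week"] = sample["report_dat"].dt.day_name()
sample["is_weekend"] = sample["report_dat"].dt.dayofweek >= 5

print(sample)
```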

In [1]:
# Import pandas
import pandas as pd

# Load the DC crime data
# Dataset columns: report_dat, offense, ward, day_of_week,
# intervention_period, days_before_after, is_after_intervention,
# is_violent, is_property, is_weekend

url = "https://raw.githubusercontent.com/zjelveh/zjelveh.github.io/refs/heads/master/files/cfc/ps/dc_crime_2025_sample.csv"
df = pd.read_csv(url)

# Display basic information about the dataset
print("DC Crime Data loaded successfully!")
print(f"Shape of data: {df.shape}")
print(f"\nColumn names: {list(df.columns)}")

DC Crime Data loaded successfully!
Shape of data: (4474, 10)

Column names: ['report_dat', 'offense', 'ward', 'day_of_week', 'intervention_period', 'days_before_after', 'is_after_intervention', 'is_violent', 'is_property', 'is_weekend']


In [2]:
# Important constants for calculating daily averages
DAYS_BEFORE_INTERVENTION = 30  # The thirty days before Aug 7 
DAYS_AFTER_INTERVENTION = 30    # The thirty days after Aug 7

print(f"\nDays before intervention: {DAYS_BEFORE_INTERVENTION}")
print(f"Days after intervention: {DAYS_AFTER_INTERVENTION}")


Days before intervention: 30
Days after intervention: 30


## Part 1: Data Exploration and Basic Operations
*Using concepts from lectures: Loading data, .head(), .info(), column operations*

Let's explore the dataset to understand DC's crime landscape in 2025.

In [3]:
# Task 1.1: Display the first 10 rows of the dataset
df.head(10)

| | report_dat | offense | ward | day_of_week | intervention_period | days_before_after | is_after_intervention | is_violent | is_property | is_weekend |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2025-07-08 | THEFT/OTHER | 3.0 | Tuesday | Pre-Intervention | -30 | False | False | True | False |
| 1 | 2025-07-08 | THEFT F/AUTO | 4.0 | Tuesday | Pre-Intervention | -30 | False | False | True | False |
| 2 | 2025-07-08 | THEFT/OTHER | 2.0 | Tuesday | Pre-Intervention | -30 | False | False | True | False |
| 3 | 2025-07-08 | THEFT F/AUTO | 8.0 | Tuesday | Pre-Intervention | -30 | False | False | True | False |
| 4 | 2025-07-08 | THEFT/OTHER | 5.0 | Tuesday | Pre-Intervention | -30 | False | False | True | False |
| 5 | 2025-07-08 | ASSAULT W/DANGEROUS WEAPON | 8.0 | Tuesday | Pre-Intervention | -30 | False | True | False | False |
| 6 | 2025-07-08 | MOTOR VEHICLE THEFT | 3.0 | Tuesday | Pre-Intervention | -30 | False | False | True | False |
| 7 | 2025-07-08 | THEFT F/AUTO | 4.0 | Tuesday | Pre-Intervention | -30 | False | False | True | False |
| 8 | 2025-07-08 | THEFT/OTHER | 6.0 | Tuesday | Pre-Intervention | -30 | False | False | True | False |
| 9 | 2025-07-08 | THEFT/OTHER | 5.0 | Tuesday | Pre-Intervention | -30 | False | False | True | False |


In [4]:
# Task 1.2: Use .info() to understand the data types
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4474 entries, 0 to 4473
Data columns (total 10 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   report_dat             4474 non-null   object 
 1   offense                4474 non-null   object 
 2   ward                   4474 non-null   float64
 3   day_of_week            4474 non-null   object 
 4   intervention_period    4474 non-null   object 
 5   days_before_after      4474 non-null   int64  
 6   is_after_intervention  4474 non-null   bool   
 7   is_violent             4474 non-null   bool   
 8   is_property            4474 non-null   bool   
 9   is_weekend             4474 non-null   bool   
dtypes: bool(4), float64(1), int64(1), object(4)
memory usage: 227.3+ KB


In [5]:
# Task 1.3: Calculate basic statistics for the entire dataset
# How many total crimes are in the dataset?
# How many unique offense types are there?
# How many unique wards?

total_crimes = len(df)  # or df.shape[0]
unique_offenses = df['offense'].nunique()
unique_wards = df['ward'].nunique()

print(f"Total crimes: {total_crimes}")
print(f"Unique offense types: {unique_offenses}")
print(f"Unique wards: {unique_wards}")

Total crimes: 4474
Unique offense types: 9
Unique wards: 8


In [6]:
# Task 1.4: What are the top 5 most common crime types in DC?
# Use value_counts() to find out

top_5_crimes = df['offense'].value_counts().head(5)

print("Top 5 most common offenses in DC:")
print(top_5_crimes)

Top 5 most common offenses in DC:
offense
THEFT/OTHER                   2245
THEFT F/AUTO                  1067
MOTOR VEHICLE THEFT            649
ROBBERY                        189
ASSAULT W/DANGEROUS WEAPON     168
Name: count, dtype: int64


In [7]:
# Task 1.5: How are crimes distributed across the intervention periods?
# Count how many crimes occurred before vs after the intervention

period_counts = df['intervention_period'].value_counts()

print("Crimes by intervention period:")
print(period_counts)

Crimes by intervention period:
intervention_period
Pre-Intervention     2350
Post-Intervention    2124
Name: count, dtype: int64


## Part 2: Overall Impact - Filtering to Compare Before and After
*Using concepts from lectures: Boolean filtering, multiple conditions*

**Focus**: Use filtering to understand the city-wide impact of the intervention. Did crime decrease overall? Did violent and property crimes respond differently?

In [8]:
# Task 2.1: Filter the data to create separate DataFrames for before and after
pre_intervention = df[df['intervention_period'] == 'Pre-Intervention']
post_intervention = df[df['intervention_period'] == 'Post-Intervention']

print(f"Crimes before intervention: {len(pre_intervention)}")
print(f"Crimes after intervention: {len(post_intervention)}")

# Calculate daily averages (total crimes divided by number of days)
daily_pre = len(pre_intervention) / DAYS_BEFORE_INTERVENTION
daily_post = len(post_intervention) / DAYS_AFTER_INTERVENTION

print(f"\nAverage daily crimes BEFORE: {daily_pre:.1f}")
print(f"Average daily crimes AFTER: {daily_post:.1f}")
print(f"Percent Change (new - old) / old: {((daily_post - daily_pre) / daily_pre * 100):+.1f}%")

Crimes before intervention: 2350
Crimes after intervention: 2124

Average daily crimes BEFORE: 78.3
Average daily crimes AFTER: 70.8
Percent Change (new - old) / old: -9.6%
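The percent-change formula used above can be wrapped in a small helper so the same arithmetic is reused in later tasks. This is a minimal sketch; `percent_change` is a hypothetical name, and the counts are copied from the output above.

```python
def percent_change(old: float, new: float) -> float:
    """Return the change from old to new as a percentage of old."""
    if old == 0:
        raise ValueError("percent change is undefined when the baseline is 0")
    return (new - old) / old * 100


# Counts from the Task 2.1 output, divided by 30 days per period
daily_pre = 2350 / 30
daily_post = 2124 / 30

change = percent_change(daily_pre, daily_post)
print(f"Percent change in daily crimes: {change:+.1f}%")
```

Because both periods cover the same number of days, the percent change in daily averages equals the percent change in raw counts.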


In [9]:
# Task 2.2: Filter for violent crimes city-wide
# The dataset includes an 'is_violent' column (True/False)

violent_crimes = df[df['is_violent'] == True]

print(f"Total violent crimes: {len(violent_crimes)}")
print(f"Percentage of all crimes that are violent: {len(violent_crimes)/len(df)*100:.1f}%")

Total violent crimes: 385
Percentage of all crimes that are violent: 8.6%


In [10]:
# Task 2.3: Compare violent crime rates before and after the intervention
# What percentage of crimes were violent before vs after?

violent_pre = df[(df['is_violent'] == True) & (df['intervention_period'] == 'Pre-Intervention')]
violent_post = df[(df['is_violent'] == True) & (df['intervention_period'] == 'Post-Intervention')]

# Calculate percentages
violent_rate_pre = (len(violent_pre) / len(pre_intervention)) * 100
violent_rate_post = (len(violent_post) / len(post_intervention)) * 100

print(f"Violent crime rate BEFORE intervention: {violent_rate_pre:.1f}%")
print(f"Violent crime rate AFTER intervention: {violent_rate_post:.1f}%")
print(f"Change: {violent_rate_post - violent_rate_pre:+.1f} percentage points")

Violent crime rate BEFORE intervention: 8.6%
Violent crime rate AFTER intervention: 8.6%
Change: +0.0 percentage points
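A flat violent-crime *share* does not mean the *number* of violent crimes stayed flat. The sketch below separates the two, using the violent counts that the Task 4.2 output reports (202 before, 183 after); the variable names are illustrative.

```python
# Counts taken from the notebook's own outputs
violent_pre, total_pre = 202, 2350
violent_post, total_post = 183, 2124

# Share of all crimes that are violent, in percent
share_pre = violent_pre / total_pre * 100
share_post = violent_post / total_post * 100

# Difference between two percentages is measured in percentage points
point_change = share_post - share_pre

# Relative change in the violent crime count itself
count_change = (violent_post - violent_pre) / violent_pre * 100

print(f"Share: {share_pre:.1f}% -> {share_post:.1f}% ({point_change:+.1f} points)")
print(f"Count: {violent_pre} -> {violent_post} ({count_change:+.1f}%)")
```

The count fell by roughly the same proportion as total crime, which is why the share barely moved.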


In [11]:
# Task 2.4: Focus on property crimes city-wide
# The dataset includes an 'is_property' column (True/False)
# Have property crimes decreased since the intervention?

property_pre = df[(df['is_property'] == True) & (df['intervention_period'] == 'Pre-Intervention')]
property_post = df[(df['is_property'] == True) & (df['intervention_period'] == 'Post-Intervention')]

# Calculate daily averages
daily_property_pre = len(property_pre) / DAYS_BEFORE_INTERVENTION
daily_property_post = len(property_post) / DAYS_AFTER_INTERVENTION

print(f"Average daily property crimes BEFORE: {daily_property_pre:.1f}")
print(f"Average daily property crimes AFTER: {daily_property_post:.1f}")
print(f"Change: {((daily_property_post - daily_property_pre) / daily_property_pre * 100):+.1f}%")

Average daily property crimes BEFORE: 71.6
Average daily property crimes AFTER: 64.7
Change: -9.6%


In [12]:
# Task 2.5: Look at the immediate impact - the first week after intervention
# Filter for crimes where days_before_after is in specific ranges

week_before = df[(df['days_before_after'] >= -7) & (df['days_before_after'] <= -1)]
week_after = df[(df['days_before_after'] >= 1) & (df['days_before_after'] <= 7)]

print(f"Crimes in week BEFORE intervention: {len(week_before)}")
print(f"Crimes in week AFTER intervention: {len(week_after)}")
print(f"Immediate change: {((len(week_after) - len(week_before)) / len(week_before) * 100):+.1f}%")

Crimes in week BEFORE intervention: 607
Crimes in week AFTER intervention: 546
Immediate change: -10.0%
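The two-sided range filters above can also be written with `Series.between`, which is inclusive on both ends by default. A small sketch on made-up values (`toy` is a hypothetical DataFrame, not the course data):

```python
import pandas as pd

# Made-up day offsets, including values just outside each window
toy = pd.DataFrame({"days_before_after": [-8, -7, -3, -1, 0, 1, 5, 7, 8]})

# Equivalent to (col >= -7) & (col <= -1)
week_before = toy[toy["days_before_after"].between(-7, -1)]

# Equivalent to (col >= 1) & (col <= 7)
week_after = toy[toy["days_before_after"].between(1, 7)]

print(len(week_before), len(week_after))
```

As with the original filters, day 0 falls in neither window.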


## Part 3: Geographic Analysis - Identifying High-Crime Wards
*Using concepts from lectures: Groupby, creating columns, sorting*

**Focus**: Crime isn't evenly distributed across DC. Identify which wards have the most crime and create a column to flag them for later analysis.

<center>
<img src='https://planning.dc.gov/sites/default/files/dc/sites/op/page_content/images/2022Wards_small.png'>
</center>

In [16]:
# Task 3.1: Identify the high-crime wards
# Count how many crimes occurred in each ward

ward_crime_counts = df.groupby('ward').size()

print("Crimes by ward:")
print(ward_crime_counts.sort_values(ascending=False))

# Now identify the top 3 wards with the most crime
# The result is a Series: its .index holds the ward numbers and its values hold the counts
top_3_ward_numbers = ward_crime_counts.sort_values(ascending=False).head(n=3)

print(top_3_ward_numbers)

Crimes by ward:
ward
2.0    796
5.0    707
6.0    640
1.0    586
7.0    545
4.0    522
8.0    365
3.0    313
dtype: int64
ward
2.0    796
5.0    707
6.0    640
dtype: int64


In [17]:
# Task 3.2: Create a flag column for high-crime wards
# Use the top 3 ward numbers you just identified to create a True/False column

df['is_high_crime_ward'] = df['ward'].isin([2, 5, 6])

# Verify it worked
high_ward_crimes = (df['is_high_crime_ward'] == True).sum()
print(f"\nCrimes in top 3 wards: {high_ward_crimes} ({high_ward_crimes / len(df) * 100:.1f}%)")
print(f"This means {high_ward_crimes / len(df) * 100:.1f}% of DC's crime is concentrated in just 3 wards")


Crimes in top 3 wards: 2143 (47.9%)
This means 47.9% of DC's crime is concentrated in just 3 wards
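Typing `[2, 5, 6]` by hand works here, but the list could also be pulled from the counts so it stays correct if the data changes. A sketch under that idea: the Series below is rebuilt from the Task 3.1 output, and `incidents` is a hypothetical four-row stand-in for `df`.

```python
import pandas as pd

# Ward totals copied from the Task 3.1 output
ward_crime_counts = pd.Series(
    [586, 796, 313, 522, 707, 640, 545, 365],
    index=pd.Index([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], name="ward"),
)

# .nlargest(3) keeps the three biggest counts; .index holds their ward labels
top_3_wards = ward_crime_counts.nlargest(3).index.tolist()
print(top_3_wards)

# Hypothetical incident rows to flag
incidents = pd.DataFrame({"ward": [2.0, 3.0, 6.0, 8.0]})
incidents["is_high_crime_ward"] = incidents["ward"].isin(top_3_wards)
print(incidents)
```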


In [18]:
# Task 3.3: Which wards have the lowest crime?
# Find the 3 wards with the fewest crimes

bottom_3_wards = ward_crime_counts.nsmallest(3)

print("3 wards with lowest crime:")
print(bottom_3_wards)

3 wards with lowest crime:
ward
3.0    313
8.0    365
4.0    522
dtype: int64


## Part 4: Breaking Down the Impact - Using Groupby to Find Patterns
*Using concepts from lectures: .groupby(), .size(), .sum(), .mean()*

**Focus**: Now that we know how crime changed overall after the intervention, let's use groupby to break down the change by different categories. Which wards saw the biggest changes? Which days of the week? Which offense types?

In [19]:
# Task 4.1: Compare crime levels by intervention period using groupby
# Group by intervention_period and count total crimes

crimes_by_period = df.groupby('intervention_period').size()

print("Total crimes by intervention period:")
print(crimes_by_period)

# Calculate daily averages for each period
print("\nDaily averages:")
print(f"Pre-intervention: {crimes_by_period['Pre-Intervention'] / DAYS_BEFORE_INTERVENTION:.1f} crimes/day")
print(f"Post-intervention: {crimes_by_period['Post-Intervention'] / DAYS_AFTER_INTERVENTION:.1f} crimes/day")

Total crimes by intervention period:
intervention_period
Post-Intervention    2124
Pre-Intervention     2350
dtype: int64

Daily averages:
Pre-intervention: 78.3 crimes/day
Post-intervention: 70.8 crimes/day


In [20]:
# Task 4.2: How many violent vs property crimes in each period?
# Group by intervention_period and sum the is_violent and is_property columns

violence_by_period = df.groupby('intervention_period')['is_violent'].sum()
property_by_period = df.groupby('intervention_period')['is_property'].sum()

print("Violent crimes by period:")
print(violence_by_period)
print("\nProperty crimes by period:")
print(property_by_period)

Violent crimes by period:
intervention_period
Post-Intervention    183
Pre-Intervention     202
Name: is_violent, dtype: int64

Property crimes by period:
intervention_period
Post-Intervention    1941
Pre-Intervention     2147
Name: is_property, dtype: int64
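The two separate groupby calls can be combined into one summary table by selecting both Boolean columns at once. A minimal sketch on made-up rows (`toy` is hypothetical):

```python
import pandas as pd

# Made-up rows: three before the intervention, two after
toy = pd.DataFrame({
    "intervention_period": ["Pre-Intervention"] * 3 + ["Post-Intervention"] * 2,
    "is_violent": [True, False, False, False, True],
    "is_property": [False, True, True, True, False],
})

# Summing Booleans counts the True values in each group
summary = toy.groupby("intervention_period")[["is_violent", "is_property"]].sum()

# .size() counts all rows per group; assignment aligns on the period labels
summary["total"] = toy.groupby("intervention_period").size()

print(summary)
```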


In [21]:
# Task 4.3: Which wards have the most crimes in each period?
# This helps us see if crime shifted geographically after the intervention

# For pre-intervention period
wards_pre = pre_intervention.groupby('ward').size()
top_5_wards_pre = wards_pre.nlargest(5)

# For post-intervention period
wards_post = post_intervention.groupby('ward').size()
top_5_wards_post = wards_post.nlargest(5)

print("Top 5 wards by crime count:")
print("\nPre-intervention:")
print(top_5_wards_pre)
print("\nPost-intervention:")
print(top_5_wards_post)

Top 5 wards by crime count:

Pre-intervention:
ward
2.0    412
5.0    382
1.0    322
6.0    308
7.0    307
dtype: int64

Post-intervention:
ward
2.0    384
6.0    332
5.0    325
1.0    264
4.0    248
dtype: int64
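Two separate top-5 lists are hard to compare by eye. One option, sketched here on made-up rows, is `pd.crosstab`, which puts both periods side by side so a per-ward change can be computed directly. The `toy` data and the `change` column are illustrative only.

```python
import pandas as pd

# Made-up incidents for three wards
toy = pd.DataFrame({
    "ward": [1, 1, 1, 2, 2, 2, 2, 3, 3],
    "intervention_period": [
        "Pre-Intervention", "Pre-Intervention", "Post-Intervention",
        "Pre-Intervention", "Post-Intervention", "Post-Intervention", "Post-Intervention",
        "Pre-Intervention", "Post-Intervention",
    ],
})

# Rows = wards, columns = periods, cells = incident counts
by_ward = pd.crosstab(toy["ward"], toy["intervention_period"])

# Put the columns in chronological order (crosstab sorts them alphabetically)
by_ward = by_ward[["Pre-Intervention", "Post-Intervention"]]

by_ward["change"] = by_ward["Post-Intervention"] - by_ward["Pre-Intervention"]
print(by_ward.sort_values("change"))
```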


In [22]:
# Task 4.4: Which specific offense types are most common in each period?
# This shows whether the mix of crimes has changed

# For pre-intervention period
offenses_pre = pre_intervention.groupby('offense').size()
top_5_offenses_pre = offenses_pre.nlargest(5)

# For post-intervention period
offenses_post = post_intervention.groupby('offense').size()
top_5_offenses_post = offenses_post.nlargest(5)

print("Top 5 offense types:")
print("\nPre-intervention:")
print(top_5_offenses_pre)
print("\nPost-intervention:")
print(top_5_offenses_post)

Top 5 offense types:

Pre-intervention:
offense
THEFT/OTHER                   1109
THEFT F/AUTO                   600
MOTOR VEHICLE THEFT            373
ROBBERY                        101
ASSAULT W/DANGEROUS WEAPON      88
dtype: int64

Post-intervention:
offense
THEFT/OTHER                   1136
THEFT F/AUTO                   467
MOTOR VEHICLE THEFT            276
ROBBERY                         88
ASSAULT W/DANGEROUS WEAPON      80
dtype: int64


In [23]:
# Task 4.5: Analyze day-of-week patterns before and after
# Has the federal presence changed which days see the most crime?

# For pre-intervention period
dow_pre = pre_intervention.groupby('day_of_week').size()

# For post-intervention period  
dow_post = post_intervention.groupby('day_of_week').size()

# Sort both to see which day is busiest
dow_pre_sorted = dow_pre.sort_values(ascending=False)
dow_post_sorted = dow_post.sort_values(ascending=False)

print("Crimes by day of week (sorted by count):")
print("\nPre-intervention:")
print(dow_pre_sorted)
print("\nPost-intervention:")
print(dow_post_sorted)

Crimes by day of week (sorted by count):

Pre-intervention:
day_of_week
Wednesday    407
Friday       383
Tuesday      376
Thursday     331
Monday       302
Saturday     285
Sunday       266
dtype: int64

Post-intervention:
day_of_week
Thursday     371
Friday       369
Wednesday    329
Saturday     322
Tuesday      280
Monday       235
Sunday       218
dtype: int64


## Part 5: Advanced Comparisons
*Combining filtering, grouping, and calculations*

**Focus**: Answer more nuanced questions by combining the techniques you've learned.

In [24]:
# Task 5.1: Compare violent crime rates across wards before and after
# For each ward, what percentage of crimes are violent in each period?

# Pre-intervention: for each ward, calculate percent violent
pre_data = df[df['intervention_period'] == 'Pre-Intervention']
ward_violent_pct_pre = pre_data.groupby('ward')['is_violent'].mean() * 100

# Post-intervention: for each ward, calculate percent violent
post_data = df[df['intervention_period'] == 'Post-Intervention']
ward_violent_pct_post = post_data.groupby('ward')['is_violent'].mean() * 100

print("Violent crime percentage by ward:")
print("\nPre-intervention:")
print(ward_violent_pct_pre)
print("\nPost-intervention:")
print(ward_violent_pct_post)

Violent crime percentage by ward:

Pre-intervention:
ward
1.0    11.180124
2.0     3.398058
3.0     1.807229
4.0     6.934307
5.0     6.806283
6.0     5.844156
7.0    12.703583
8.0    26.256983
Name: is_violent, dtype: float64

Post-intervention:
ward
1.0     8.712121
2.0     3.125000
3.0     2.721088
4.0     6.854839
5.0     8.615385
6.0     4.216867
7.0    18.067227
8.0    22.580645
Name: is_violent, dtype: float64
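The mean of a True/False column is the fraction of True values, which is why `.mean() * 100` gives a percentage. Grouping by two columns and unstacking gives the same numbers in one table, with a percentage-point change per ward. A sketch on made-up rows (`toy` and `point_change` are hypothetical names):

```python
import pandas as pd

# Made-up rows: two wards, two incidents per ward in each period
toy = pd.DataFrame({
    "ward": [7, 7, 7, 7, 8, 8, 8, 8],
    "intervention_period": [
        "Pre-Intervention", "Pre-Intervention",
        "Post-Intervention", "Post-Intervention",
    ] * 2,
    "is_violent": [True, False, True, True, True, False, False, False],
})

violent_pct = (
    toy.groupby(["ward", "intervention_period"])["is_violent"]
    .mean()                          # fraction of True values per group
    .mul(100)                        # convert to percent
    .unstack("intervention_period")  # periods become columns
)

violent_pct["point_change"] = (
    violent_pct["Post-Intervention"] - violent_pct["Pre-Intervention"]
)
print(violent_pct)
```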


In [25]:
# Task 5.2: Weekend vs weekday crime patterns
# Are there different offense types on weekends vs weekdays?

weekend_crimes = df[df['is_weekend'] == True]
weekday_crimes = df[df['is_weekend'] == False]

# What are the top 5 offenses on weekends?
weekend_offenses = weekend_crimes.groupby('offense').size()
top_5_weekend = weekend_offenses.nlargest(5)

# What are the top 5 offenses on weekdays?
weekday_offenses = weekday_crimes.groupby('offense').size()
top_5_weekday = weekday_offenses.nlargest(5)

print("Top 5 weekend offenses:")
print(top_5_weekend)
print("\nTop 5 weekday offenses:")
print(top_5_weekday)

Top 5 weekend offenses:
offense
THEFT/OTHER                   513
THEFT F/AUTO                  239
MOTOR VEHICLE THEFT           189
ASSAULT W/DANGEROUS WEAPON     57
ROBBERY                        57
dtype: int64

Top 5 weekday offenses:
offense
THEFT/OTHER                   1732
THEFT F/AUTO                   828
MOTOR VEHICLE THEFT            460
ROBBERY                        132
ASSAULT W/DANGEROUS WEAPON     111
dtype: int64


In [26]:
# Task 5.3: Compare high-crime wards to other wards
# Did high-crime wards improve more or less than other wards?

# High-crime wards (where is_high_crime_ward == True)
high_ward_pre = df[(df['is_high_crime_ward'] == True) & (df['intervention_period'] == 'Pre-Intervention')]
high_ward_post = df[(df['is_high_crime_ward'] == True) & (df['intervention_period'] == 'Post-Intervention')]

# Other wards (where is_high_crime_ward == False)
other_ward_pre = df[(df['is_high_crime_ward'] == False) & (df['intervention_period'] == 'Pre-Intervention')]
other_ward_post = df[(df['is_high_crime_ward'] == False) & (df['intervention_period'] == 'Post-Intervention')]

# Calculate daily averages
high_daily_pre = len(high_ward_pre) / DAYS_BEFORE_INTERVENTION
high_daily_post = len(high_ward_post) / DAYS_AFTER_INTERVENTION
other_daily_pre = len(other_ward_pre) / DAYS_BEFORE_INTERVENTION
other_daily_post = len(other_ward_post) / DAYS_AFTER_INTERVENTION

print("Daily crime rates:")
print(f"\nHigh-crime wards:")
print(f"  Before: {high_daily_pre:.1f} crimes/day")
print(f"  After: {high_daily_post:.1f} crimes/day")
print(f"\nOther wards:")
print(f"  Before: {other_daily_pre:.1f} crimes/day")
print(f"  After: {other_daily_post:.1f} crimes/day")

Daily crime rates:

High-crime wards:
  Before: 36.7 crimes/day
  After: 34.7 crimes/day

Other wards:
  Before: 41.6 crimes/day
  After: 36.1 crimes/day
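To answer "more or less" directly, the daily rates can be turned into percent changes. The sketch below redoes the arithmetic from counts already printed in the Task 2.1 and Task 4.3 outputs rather than from `df`, so the variable names are illustrative.

```python
# Period totals from the Task 2.1 output
pre_total, post_total = 2350, 2124

# Wards 2, 5, and 6, from the Task 4.3 output
high_pre = 412 + 382 + 308
high_post = 384 + 325 + 332

# Everything else is the remainder
other_pre = pre_total - high_pre
other_post = post_total - high_post

# Both periods span 30 days, so count changes equal daily-rate changes
high_change = (high_post - high_pre) / high_pre * 100
other_change = (other_post - other_pre) / other_pre * 100

print(f"High-crime wards: {high_change:+.1f}%")
print(f"Other wards: {other_change:+.1f}%")
```

On these numbers the high-crime wards declined less (about 5.5%) than the other wards (about 13.2%).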


## Part 6: Policy Impact Summary

Create a comprehensive summary that community organizations can use to understand the federal intervention's impact. Use specific numbers from your analysis above.

In [27]:
# Create a data-driven summary of the federal intervention's impact
# Include specific statistics from your analysis

print("=" * 70)
print("FEDERAL INTERVENTION IMPACT ANALYSIS: DC CRIME DATA")
print("Intervention Date: August 7, 2025")
print("=" * 70)
print()

print("KEY FINDINGS:")
print()
print("1. Overall Impact:")
print(f"   - Daily crime rate DECREASED from {daily_pre:.1f} to {daily_post:.1f} crimes/day")
print(f"   - Overall change: {((daily_post - daily_pre) / daily_pre * 100):+.1f}%")
print(f"   - Immediate effect (first week): {((len(week_after) - len(week_before)) / len(week_before) * 100):+.1f}% change")
print()

print("2. Crime Type Analysis:")
print(f"   - Violent crime share BEFORE: {violent_rate_pre:.1f}%")
print(f"   - Violent crime share AFTER: {violent_rate_post:.1f}%")
print(f"   - Change: {violent_rate_post - violent_rate_pre:+.1f} percentage points")
print(f"   - Property crime daily rate: {daily_property_pre:.1f} → {daily_property_post:.1f} ({((daily_property_post - daily_property_pre) / daily_property_pre * 100):+.1f}%)")
print()

print("3. Geographic Concentration:")
print(f"   - Top 3 high-crime wards: {top_3_ward_numbers.index.astype(int).tolist()}")
print(f"   - These wards account for {high_ward_crimes / len(df) * 100:.1f}% of all crime")
print(f"   - High-crime wards: {high_daily_pre:.1f} → {high_daily_post:.1f} crimes/day")
print(f"   - Other wards: {other_daily_pre:.1f} → {other_daily_post:.1f} crimes/day")
print()

print("4. Most Common Offenses:")
print(f"   Top offense before intervention: {top_5_offenses_pre.index[0]} ({top_5_offenses_pre.iloc[0]} incidents)")
print(f"   Top offense after intervention: {top_5_offenses_post.index[0]} ({top_5_offenses_post.iloc[0]} incidents)")
print()

print("5. Temporal Patterns:")
print(f"   Busiest day before: {dow_pre_sorted.index[0]} ({dow_pre_sorted.iloc[0]} crimes)")
print(f"   Busiest day after: {dow_post_sorted.index[0]} ({dow_post_sorted.iloc[0]} crimes)")
print()

print("=" * 70)
print("CONCLUSION:")
if daily_post < daily_pre:
    print("The federal intervention coincided with a DECREASE in overall crime.")
else:
    print("The federal intervention coincided with an INCREASE in overall crime.")

print(f"Analysis based on {DAYS_BEFORE_INTERVENTION} days before and {DAYS_AFTER_INTERVENTION} days after Aug 7, 2025.")
print("Further research needed to determine causality and sustainability of trends.")
print("=" * 70)

======================================================================
FEDERAL INTERVENTION IMPACT ANALYSIS: DC CRIME DATA
Intervention Date: August 7, 2025
======================================================================

KEY FINDINGS:

1. Overall Impact:
   - Daily crime rate DECREASED from 78.3 to 70.8 crimes/day
   - Overall change: -9.6%
   - Immediate effect (first week): -10.0% change

2. Crime Type Analysis:
   - Violent crime share BEFORE: 8.6%
   - Violent crime share AFTER: 8.6%
   - Change: +0.0 percentage points
   - Property crime daily rate: 71.6 → 64.7 (-9.6%)

3. Geographic Concentration:
   - Top 3 high-crime wards: [2, 5, 6]
   - These wards account for 47.9% of all crime
   - High-crime wards: 36.7 → 34.7 crimes/day
   - Other wards: 41.6 → 36.1 crimes/day

4. Most Common Offenses:
   Top offense before intervention: THEFT/OTHER (1109 incidents)
   Top offense after intervention: THEFT/OTHER (1136 incidents)

5. Temporal Patterns:
   Busiest day before: Wednesday (407 crimes)
   Busiest day after: Thursday (371 crimes)

======================================================================
CONCLUSION:
The federal intervention coincided with a DECREASE in overall crime.
Analysis based on 30 days before and 30 days after Aug 7, 2025.
Further research needed to determine causality and sustainability of trends.
======================================================================

## Submission Checklist

Before submitting, make sure:
- [ ] All code cells run without errors
- [ ] All tasks are completed (look for `YOUR CODE HERE` markers)
- [ ] Your results make sense (check for reasonable percentages and counts)
- [ ] You've calculated daily averages correctly (dividing by the right number of days)
- [ ] You've included the policy impact summary with specific numbers from your analysis
- [ ] File is saved as PS2_YourFirstName_YourLastName.ipynb

Remember: During code review, be prepared to:
- Explain what any line of your code does
- Modify filters or groupby operations when asked
- Describe what the federal intervention analysis reveals
- Calculate daily averages
- Discuss whether your findings suggest the intervention is working
- Explain the difference between .size(), .count(), and .sum()
- Show how you combined multiple conditions with & and |
- Explain why we use groupby for some questions and filtering for others
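As a study aid for the last few bullets, here is a small sketch of how `.size()`, `.count()`, and `.sum()` differ, and how `&` and `|` combine conditions. The `toy` DataFrame is made up, with one missing offense value so the difference shows.

```python
import pandas as pd

toy = pd.DataFrame({
    "ward": [1, 1, 1, 2, 2],
    "offense": ["ROBBERY", None, "THEFT/OTHER", "ROBBERY", "THEFT/OTHER"],
    "is_violent": [True, False, False, True, False],
})

grouped = toy.groupby("ward")

sizes = grouped.size()                 # rows per group, missing values included
counts = grouped["offense"].count()    # non-missing values per group
violent = grouped["is_violent"].sum()  # True counts as 1, so this counts violent rows

print(sizes, counts, violent, sep="\n\n")

# & keeps rows where BOTH conditions hold; | keeps rows where EITHER holds.
# Each condition needs its own parentheses.
both = toy[(toy["ward"] == 1) & (toy["is_violent"])]
either = toy[(toy["ward"] == 2) | (toy["is_violent"])]

print(len(both), len(either))
```

Ward 1 has three rows but only two non-missing offenses, so `.size()` and `.count()` disagree there.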