# Temporal patterns

- To start off, count the number of crimes per year:
    - What is the year with the most crimes?
    - What is the year with the fewest crimes?
- Create a barplot of crimes-per-year (years on the x-axis, crime-counts on the y-axis).
- Finally, Police Chief Suneman is interested in the temporal development of only a subset of categories, the so-called *focus crimes*. Those categories are listed below (for convenient copy-pasting). Create bar charts displaying the year-by-year development of each of these categories across the years 2003-2017.
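
As a rough illustration of the last bullet, the year-by-category counts can be tabulated before any plotting. This is only a sketch on made-up rows: the `toy` frame and its values are hypothetical, and only the column names `Incident Date` and `Incident Category` are taken from the dataset used in this notebook.

```python
import pandas as pd

# Toy stand-in for the incident table; the rows are made up for illustration.
toy = pd.DataFrame({
    "Incident Date": pd.to_datetime(
        ["2003-03-01", "2003-07-09", "2004-01-20", "2004-05-02", "2018-02-02"]
    ),
    "Incident Category": ["ROBBERY", "ASSAULT", "ROBBERY", "ROBBERY", "ASSAULT"],
})

# Restrict to the 2003-2017 window named in the exercise.
year = toy["Incident Date"].dt.year
in_window = toy[year.between(2003, 2017)]

# Rows = years, columns = categories, cells = incident counts.
counts = pd.crosstab(in_window["Incident Date"].dt.year, in_window["Incident Category"])
print(counts)
```

With a table in this shape, `counts.plot(kind="bar", subplots=True)` would draw one bar chart per category, which is one of several reasonable ways to lay out the focus-crime plots.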

In [2]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import sys

# Make clean_crime_data.py importable by adding its folder to the module search path
sys.path.append("../Assignment-1/Clean-Crime_Data")

from clean_crime_data import clean_crime_data  # Project helper that cleans and filters the data

# Path to the cleaned dataset, relative to this notebook
file_path = "../Assignment-1/Data/SF_Crime_Data_Cleaned.csv"
df = pd.read_csv(file_path, parse_dates=["Incident Date"])

# Clean and filter data using the function
df_focus = clean_crime_data(df)

# Replace "Incident Time" (HH:MM strings) with the hour of day; unparseable values become missing
df_focus["Incident Time"] = pd.to_datetime(df_focus["Incident Time"], format="%H:%M", errors="coerce").dt.hour

# Now df_focus is ready for analysis!
print("Unique Crime Categories in df_focus:", df_focus["Incident Category"].unique())  # Check cleaned categories
df_focus.head()


Unique Crime Categories in df_focus: ['ROBBERY' 'VEHICLE THEFT' 'ASSAULT' 'TRESPASS' 'BURGLARY' 'LARCENY/THEFT'
 'DRUG/NARCOTIC' 'VANDALISM' 'WEAPON LAWS' 'DISORDERLY CONDUCT'
 'PROSTITUTION' 'DRUNKENNESS' 'DRIVING UNDER THE INFLUENCE'
 'STOLEN PROPERTY']


Unnamed: 0,Incident Date,Incident Time,Incident Day of Week,Incident Category,Incident Description,Police District,Latitude,Longitude
0,2004-11-22,17,Monday,ROBBERY,"ROBBERY, BODILY FORCE",INGLESIDE,37.708311,-122.420084
1,2005-10-18,20,Tuesday,VEHICLE THEFT,STOLEN AUTOMOBILE,PARK,90.0,-120.5
2,2004-02-15,2,Sunday,VEHICLE THEFT,STOLEN AUTOMOBILE,SOUTHERN,90.0,-120.5
4,2010-11-21,17,Sunday,ASSAULT,BATTERY,SOUTHERN,37.770913,-122.410541
5,2013-04-02,15,Tuesday,ASSAULT,BATTERY,TARAVAL,37.745158,-122.470366
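
The hour extraction used in the cell above can be checked in isolation. The following is a minimal sketch on hypothetical time strings (the values in `times` are invented, including two deliberately invalid entries) showing what `errors="coerce"` does and why the resulting hours may come back as floats.

```python
import pandas as pd

# Hypothetical sample values; "25:99" and None are deliberately invalid.
times = pd.Series(["17:05", "02:30", "25:99", None])

# errors="coerce" turns unparseable entries into NaT instead of raising.
parsed = pd.to_datetime(times, format="%H:%M", errors="coerce")

# .dt.hour returns floats here because the NaT entries become NaN.
hours = parsed.dt.hour

# A nullable integer dtype keeps whole-number hours alongside missing values.
hours_int = hours.astype("Int64")
print(hours_int.tolist())
```

If the real column contains no invalid times, the conversion yields plain integers, which matches the `Incident Time` values shown in the table above.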


## Years with the most and fewest crimes

In [7]:
import pandas as pd

# 2025 is only partially covered by the data, so it is excluded from the ranking
current_year = 2025

# Count crimes per year, ordered chronologically
crimes_per_year = df_focus["Incident Date"].dt.year.value_counts().sort_index()

# Keep only complete years for the most/fewest comparison
complete_years = crimes_per_year[crimes_per_year.index < current_year]

# Year with the most crimes (excluding incomplete 2025)
year_most_crimes = complete_years.idxmax()
num_most_crimes = complete_years[year_most_crimes]

# Year with the fewest crimes (excluding incomplete 2025)
year_fewest_crimes = complete_years.idxmin()
num_fewest_crimes = complete_years[year_fewest_crimes]

# Print results
print(f"📅 Year with the most crimes: {year_most_crimes} ({num_most_crimes} crimes)")
print(f"📅 Year with the fewest crimes: {year_fewest_crimes} ({num_fewest_crimes} crimes)")

# Report the incomplete year separately
crimes_2025 = crimes_per_year.get(current_year, 0)
print(f"📅 Crimes in the incomplete year {current_year}: {crimes_2025}")


📅 Year with the most crimes: 2017 (93149 crimes)
📅 Year with the fewest crimes: 2024 (61322 crimes)
📅 Crimes in the incomplete year 2025: 5994
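
The same per-year counts feed directly into the bar plot the exercise asks for (years on the x-axis, crime counts on the y-axis). Below is a minimal sketch on toy data: the `dates` series is invented for illustration and stands in for `df_focus["Incident Date"]`, and the figure size and labels are arbitrary choices.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy stand-in for df_focus["Incident Date"]; the dates are made up for illustration.
dates = pd.Series(pd.to_datetime([
    "2003-01-05", "2003-06-11", "2004-02-15", "2004-11-22",
    "2004-12-01", "2005-10-18",
]))

# Count incidents per calendar year, ordered chronologically.
crimes_per_year = dates.dt.year.value_counts().sort_index()

# One bar per year; string labels keep the x-axis ticks on the actual years.
fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(crimes_per_year.index.astype(str), crimes_per_year.values)
ax.set_xlabel("Year")
ax.set_ylabel("Number of crimes")
ax.set_title("Crimes per year (toy data)")
fig.tight_layout()
```

On the real data it may be worth dropping or visually marking the incomplete year 2025, since its bar would otherwise suggest a sharp drop that only reflects partial coverage.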
