In [62]:
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
import numpy as np
import math
import seaborn as sns
import plotly.express as px
import geopandas as gpd
import folium
from folium.plugins import HeatMap
import itertools
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from bokeh.plotting import figure, show, output_file, save
from bokeh.models import ColumnDataSource, FactorRange, HoverTool, CustomJS, Select, Slider
from bokeh.layouts import column, row
from bokeh.io import output_notebook
import calplot

from folium.plugins import HeatMapWithTime
import json
import urllib.request

In [63]:
df = pd.read_csv('../Assignment/merged_data.csv')
print(df.head())

        Category PdDistrict           X          Y        Date   Time  \
0        ROBBERY  INGLESIDE -122.420084  37.708311  2004-11-22  17:50   
1  VEHICLE THEFT       PARK -120.500000  90.000000  2005-10-18  20:00   
2  VEHICLE THEFT   SOUTHERN -120.500000  90.000000  2004-02-15  02:00   
3          ARSON  INGLESIDE -122.436220  37.724377  2011-02-18  05:27   
4        ASSAULT   SOUTHERN -122.410541  37.770913  2010-11-21  17:00   

   TimeOfDay DayOfWeek  DayOfMonth     Month  Year  
0         17    Monday          22  November  2004  
1         20   Tuesday          18   October  2005  
2          2    Sunday          15  February  2004  
3          5    Friday          18  February  2011  
4         17    Sunday          21  November  2010  


In [64]:
#df = pd.read_csv('../Assignment/merged_data.csv')

# Filter data for the category "Drug/Narcotic"
df = df[df['Category'] == 'DRUG/NARCOTIC']
# Group data by year and count occurrences
df = df['Year'].value_counts().sort_index()
# Keep only the years 2014 through 2024 (inclusive)
df = df[(df.index >= 2014) & (df.index <= 2024)]

San Francisco’s crime landscape tells a story of resilience and struggle, reflecting the city's complex socio-economic dynamics. Using the San Francisco Police Department’s Incident Report dataset, we explore how drug-related crimes have evolved over time, impacted specific neighborhoods, and intersected with broader societal challenges. This analysis combines temporal trends, geographic mapping, and interactive visualizations to provide a comprehensive understanding of the data.

The story behind these visualizations is deeply tied to San Francisco’s ongoing battle against drug addiction, particularly the fentanyl crisis. In 2023, overdose deaths peaked as fentanyl infiltrated the local drug market. Neighborhoods like the Tenderloin saw disproportionate impacts due to high rates of homelessness and poverty. Despite efforts like naloxone distribution and harm reduction programs, systemic inequities persisted.

By 2024, overdose fatalities showed signs of decline due to expanded treatment options and targeted outreach efforts. Yet, challenges remain, as marginalized communities continue to bear the brunt of this epidemic. This analysis sheds light on how data can inform policy decisions, resource allocation, and community engagement to address complex urban issues.

References
"Tracking San Francisco’s Drug Overdose Epidemic" - SF Chronicle

"Reducing Violent Crime and Drug Sales in the Tenderloin" - SF.gov

"Exploratory Data Analysis And Crime Prediction In San Francisco" - Semantic Scholar

In this analysis, we focus on **drug-related incidents** from the San Francisco Police
Department’s Incident Report dataset between 2014 and 2024. The goal is to illustrate
**how these crimes evolved over time** and to bring attention to factors such as
the fentanyl crisis, homelessness, and law enforcement interventions.

We'll start with a brief overview of the dataset and then delve into a **temporal visualization** showing
yearly counts of drug-related incidents. This single-page, magazine-style layout weaves textual
explanation around the data visualization, allowing a linear, top-to-bottom narrative.

- **Time Frame**: 2014–2024
- **Categories**: We filtered the original dataset to include only drug-related incidents (e.g., `DRUG/NARCOTIC`).
- **Location Fields**: PdDistrict, X, Y, plus additional columns for date and time.
- **Data Source**: San Francisco Police Department Incident Reports (publicly available).

The counts plotted below come from the aggregation cell earlier in this notebook: it filters
the dataset to `Category == 'DRUG/NARCOTIC'`, counts the rows per `Year`, and keeps only the
years 2014–2024. The result is a series of yearly incident counts indexed by year.
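
The filter, group, and count steps described above can be sketched as a small helper. This is a minimal sketch, not the notebook's own code: the function name `yearly_counts` and the toy rows are made up for illustration, and only the column names `Category` and `Year` come from the dataset.

```python
import pandas as pd

def yearly_counts(incidents: pd.DataFrame,
                  category: str = "DRUG/NARCOTIC",
                  start: int = 2014,
                  end: int = 2024) -> pd.Series:
    """Count incidents of one category per year, keeping start..end inclusive."""
    mask = (incidents["Category"] == category) & incidents["Year"].between(start, end)
    return incidents.loc[mask, "Year"].value_counts().sort_index()

# Toy rows (hypothetical), only to show the shape of the result.
toy = pd.DataFrame({
    "Category": ["DRUG/NARCOTIC", "ROBBERY", "DRUG/NARCOTIC",
                 "DRUG/NARCOTIC", "DRUG/NARCOTIC"],
    "Year": [2014, 2014, 2014, 2015, 2004],
})
counts = yearly_counts(toy)
print(counts)
```

Combining both conditions in one mask avoids reassigning `df` several times, so the raw DataFrame stays available for later cells.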

In [65]:
%matplotlib inline

In [66]:
df_interactive = df.reset_index()
df_interactive.columns = ['Year', 'IncidentCount']

# Create an interactive bar chart with Plotly Express
fig = px.bar(
    df_interactive,
    x='Year',
    y='IncidentCount',
    title='Drug/Narcotic Incidents (2014–2024)',
    labels={'Year': 'Year', 'IncidentCount': 'Number of Incidents'},
    template='plotly_white'
)

# Update traces so that they have a name and show in the legend
fig.update_traces(name='Drug Incidents', showlegend=True)

# Optionally adjust layout to ensure the legend is shown
fig.update_layout(
    legend_title_text='Legend',
    xaxis=dict(dtick=1)  # Show every year on x-axis
)
fig.show()

The bar chart illustrates the number of drug/narcotic incidents reported in San Francisco from 2014 to 2024, showcasing significant fluctuations over the decade. The data aligns with the narrative of San Francisco's evolving struggle with substance abuse, particularly the opioid and fentanyl crises.

2014–2018: The graph shows high incident counts, peaking in 2018. This period likely reflects the growing prevalence of drug-related activity, coinciding with the arrival of fentanyl in 2017. Fentanyl’s potency and accessibility contributed to increased drug use and overdose rates during this time, particularly in neighborhoods like the Tenderloin and SOMA.

2019–2021: A decline in incidents is visible, potentially reflecting initial harm reduction efforts such as naloxone distribution and public health campaigns. However, systemic challenges and gaps in treatment accessibility likely limited sustained progress.

2022–2024: Incidents rise again, peaking in 2023 before slightly declining in 2024. This resurgence corresponds with the worsening fentanyl crisis, which saw record overdose deaths in 2023. The slight decline in 2024 may reflect the impact of expanded treatment programs and outreach efforts, such as those implemented under San Francisco’s Overdose Prevention Plan.

This visualization underscores the cyclical nature of drug-related incidents in San Francisco, driven by external factors like drug market shifts and public health interventions. It highlights the city’s ongoing battle to address addiction through harm reduction, treatment expansion, and community-level strategies.
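
To check the rises and declines described above against the numbers rather than by eye, year-over-year change can be computed from the yearly counts. The sketch below is an illustration under assumptions: `summarize_trend` is a hypothetical helper, and the three counts are placeholder values, not figures from the dataset.

```python
import pandas as pd

def summarize_trend(counts: pd.Series) -> pd.DataFrame:
    """Return yearly counts alongside their percent change from the previous year."""
    return pd.DataFrame({
        "IncidentCount": counts,
        "PctChange": counts.pct_change() * 100,  # NaN for the first year
    })

# Placeholder counts (hypothetical), indexed by year.
toy_counts = pd.Series([100, 150, 75], index=[2014, 2015, 2016])
trend = summarize_trend(toy_counts)
peak_year = toy_counts.idxmax()  # year with the highest count
print(trend)
print("Peak year:", peak_year)
```

Running the same helper on the real yearly series would show which years actually peak and how large each change is.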

Sources
"Tracking San Francisco’s Drug Overdose Epidemic" - SF Chronicle

"Fentanyl State of Emergency Declared by Mayor Daniel Lurie" - CBS News

"San Francisco Overdose Prevention Plan (2024)" - SF.gov

In [67]:
df = pd.read_csv('../Assignment/merged_data.csv')

# Filter data for the category "Drug/Narcotic"
df = df[df['Category'] == 'DRUG/NARCOTIC']
# Group data by year and count occurrences
df = df['Year'].value_counts().sort_index()
# Keep only the years 2014 through 2024 (inclusive)
df = df[(df.index >= 2014) & (df.index <= 2024)]

In [68]:
# Reload the raw incident data, because df was overwritten with the yearly counts above
df = pd.read_csv('../Assignment/merged_data.csv')

# Filter data for the years 2014 to 2024
df_filtered = df[(df['Year'] >= 2014) & (df['Year'] <= 2024)]
# Filter data for the category "Drug/Narcotic"
df_filtered = df_filtered[df_filtered['Category'] == 'DRUG/NARCOTIC']
# Extract the required columns
df_map = df_filtered[['X', 'Y', 'Year']]

df_map['Year'].unique()

array([2015, 2017, 2016, 2018, 2014, 2022, 2023, 2024, 2020, 2021, 2019])
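
The first rows printed at the top of the notebook include points at X = -120.5, Y = 90.0, which cannot lie in San Francisco and are probably placeholder coordinates. One way to keep such rows out of the maps is a bounding-box filter. This is a sketch under assumptions: the box limits are rough values chosen for illustration, and `drop_out_of_bounds` is a hypothetical helper.

```python
import pandas as pd

# Rough bounding box around San Francisco (assumed limits, adjust as needed).
SF_LON = (-122.52, -122.35)
SF_LAT = (37.70, 37.84)

def drop_out_of_bounds(points: pd.DataFrame) -> pd.DataFrame:
    """Keep only rows whose X (longitude) and Y (latitude) fall inside the box."""
    in_lon = points["X"].between(*SF_LON)
    in_lat = points["Y"].between(*SF_LAT)
    return points[in_lon & in_lat].copy()

# Three rows copied from the df.head() output above.
sample = pd.DataFrame({
    "X": [-122.420084, -120.5, -122.410541],
    "Y": [37.708311, 90.0, 37.770913],
    "Year": [2004, 2005, 2010],
})
clean = drop_out_of_bounds(sample)
print(clean)
```

Applied to `df_map` before building the heat layers, this would drop the Y = 90 rows. Whether any such rows remain in the 2014–2024 subset has not been checked here.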

In [69]:
df.columns

Index(['Category', 'PdDistrict', 'X', 'Y', 'Date', 'Time', 'TimeOfDay',
       'DayOfWeek', 'DayOfMonth', 'Month', 'Year'],
      dtype='object')

In [70]:
sf_center = [37.7749, -122.4194]
base_map = folium.Map(location=sf_center, zoom_start=12)

# ----------------------------------------------------------------
# Static HeatMap: all 2014–2024 incidents in a single layer
# ----------------------------------------------------------------
# Folium expects [lat, lon] pairs, so Y (latitude) comes before X (longitude).
heat_data = df_map[['Y','X']].values.tolist()

HeatMap(heat_data,
        radius=8,
        blur=15,
        min_opacity=0.4,
        max_opacity=0.8,
        name='Drug Hotspots').add_to(base_map)

# ----------------------------------------------------------------
# HeatMapWithTime: one heat layer per year
# ----------------------------------------------------------------
# HeatMapWithTime takes a list of point lists (one per time step)
# plus a matching list of labels, passed below as `index`.

year_slices = []
time_index = []

for yr in sorted(df_map['Year'].unique()):
    # Filter for that year
    df_year = df_map[df_map['Year'] == yr]
    
    # Convert to a list of [lat, lon] pairs; no explicit weight is supplied,
    # so every incident contributes equally to the heat layer
    locations = df_year[['Y','X']].values.tolist()
    
    year_slices.append(locations)
    time_index.append(str(yr))  # Label each time slice by the year

# Now add the HeatMapWithTime layer
HeatMapWithTime(
    year_slices,
    radius=7,
    gradient={0.2: 'blue', 0.4: 'lime', 0.6: 'yellow', 1.0: 'red'},
    min_opacity=0.3,
    max_opacity=0.9,
    use_local_extrema=False,  # If True, each time-slice's colors scale independently
    auto_play=False,
    display_index=True,
    index=time_index,
    name='Drug Hotspots Over Time'
).add_to(base_map)

# ----------------------------------------------------------------
# Layer control and display
# ----------------------------------------------------------------
folium.LayerControl().add_to(base_map)

# Display 
base_map
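
The per-year loop above can also be written with `groupby`, which makes the input shape that `HeatMapWithTime` needs easy to inspect before any map is drawn. This is a minimal sketch: `build_year_slices` is a hypothetical helper and the toy coordinates are made up.

```python
import pandas as pd

def build_year_slices(points: pd.DataFrame):
    """Return (slices, labels): one list of [lat, lon] pairs per year, in year order."""
    slices, labels = [], []
    for year, group in points.groupby("Year", sort=True):
        slices.append(group[["Y", "X"]].values.tolist())  # latitude first
        labels.append(str(year))
    return slices, labels

# Toy points (hypothetical) in the same X/Y/Year layout as df_map.
toy = pd.DataFrame({
    "X": [-122.41, -122.42, -122.43],
    "Y": [37.77, 37.78, 37.76],
    "Year": [2015, 2014, 2015],
})
slices, labels = build_year_slices(toy)
print(labels)
print([len(s) for s in slices])
```

The two returned lists correspond to the `year_slices` and `time_index` variables used in the cell above, so they could be passed to `HeatMapWithTime` in the same way.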


In [71]:
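# Restrict to 2014–2024 so the counts match the choropleth's legend and title
df_drug = df[(df['Category'] == 'DRUG/NARCOTIC') & df['Year'].between(2014, 2024)].copy()
district_counts = df_drug.groupby('PdDistrict').size().reset_index(name='DrugCount')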

In [72]:
df = pd.read_csv('../Assignment/merged_data.csv')

In [73]:
from folium.features import GeoJsonTooltip
gdf_sfpd = gpd.read_file("sfpd_districts.geojson")

# This assumes the GeoJSON stores the district name in a 'DISTRICT' column;
# check gdf_sfpd.columns and adjust the rename if it differs.
# Renaming it to 'PdDistrict' lets it serve as the merge key below.
gdf_sfpd = gdf_sfpd.rename(columns={'DISTRICT': 'PdDistrict'})

# -------------------------------------------------------------------
# Merge the aggregated counts with the district polygons
# -------------------------------------------------------------------
# A left merge on 'PdDistrict' keeps every polygon, even those without counts
gdf_merged = gdf_sfpd.merge(district_counts, on='PdDistrict', how='left')

# Fill NA if any district had zero drug incidents in that timeframe
gdf_merged['DrugCount'] = gdf_merged['DrugCount'].fillna(0)

# Convert merged GeoDataFrame to JSON for Folium
geojson_data = gdf_merged.to_json()

# -------------------------------------------------------------------
# Create a Folium choropleth map
# -------------------------------------------------------------------
sf_center = [37.7749, -122.4194]  # approximate center
sf_map = folium.Map(location=sf_center, zoom_start=12)

# Add a choropleth layer referencing the property name 'PdDistrict'
choropleth = folium.Choropleth(
    geo_data=geojson_data,
    data=gdf_merged,
    columns=['PdDistrict', 'DrugCount'],
    key_on='feature.properties.PdDistrict',  # depends on the rename above
    fill_color='YlOrRd',
    fill_opacity=0.6,
    line_opacity=0.8,
    legend_name='Drug/Narcotic Incidents (2014–2024)',
    highlight=True
).add_to(sf_map)
title_html = '''
     <h3 align="center" style="font-size:16px;">
         <b>Drug/Narcotic Incidents (2014–2024) by District</b>
     </h3>
'''
# -------------------------------------------------------------------
# Optional: add tooltips for district name and count
# -------------------------------------------------------------------
folium.GeoJson(
    geojson_data,
    style_function=lambda x: {'color':'black','weight':0.5,'fillOpacity':0},
    tooltip=GeoJsonTooltip(
        fields=['PdDistrict','DrugCount'],
        aliases=['District:','Drug Crimes:'],
        localize=True
    )
).add_to(sf_map)
sf_map.get_root().html.add_child(folium.Element(title_html))

# Add layer control
folium.LayerControl().add_to(sf_map)

# Display in a Jupyter environment
sf_map
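
Because the choropleth relies on a left merge followed by `fillna(0)`, a district whose name is spelled or cased differently in the GeoJSON and in the incident data would silently appear with zero incidents. An outer merge with `indicator=True` can surface such mismatches before mapping. This is a sketch under assumptions: `unmatched_districts` is a hypothetical helper, and the toy names and counts are made up.

```python
import pandas as pd

def unmatched_districts(polygons: pd.DataFrame, counts: pd.DataFrame,
                        key: str = "PdDistrict"):
    """Return (only_in_polygons, only_in_counts) for the given join key."""
    merged = polygons[[key]].merge(counts[[key]], on=key,
                                   how="outer", indicator=True)
    only_polygons = sorted(merged.loc[merged["_merge"] == "left_only", key])
    only_counts = sorted(merged.loc[merged["_merge"] == "right_only", key])
    return only_polygons, only_counts

# Toy inputs (hypothetical): one district differs only in letter case.
polygons = pd.DataFrame({"PdDistrict": ["PARK", "SOUTHERN", "Ingleside"]})
counts = pd.DataFrame({"PdDistrict": ["PARK", "SOUTHERN", "INGLESIDE"],
                       "DrugCount": [1, 2, 3]})
only_polygons, only_counts = unmatched_districts(polygons, counts)
print(only_polygons, only_counts)
```

If both lists come back empty for `gdf_sfpd` and `district_counts`, every polygon found a matching count and the zeros on the map are genuine.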