## Calls For Service
#### Questions to ask:
<b>Descriptive Analysis</b>
- What are the most common call types this year vs. last year?
- What times of day see the highest volume of calls by shift?
- Which addresses have the highest number of repeated calls?
- What percentage of calls resulted in a report being taken or an arrest?

<b>Trend & Comparative Analysis</b>
- How do weekly call volumes compare to the same week last year and to the 5-year average?
- Are we seeing a year-over-year increase in disturbance calls?
- What are the call volume trends for mental health or overdose-related incidents?
- How have reportable vs. non-reportable calls trended over time?
- Is there an increase in traffic-related calls during certain weather events?

<b>Spatial Analysis</b>
- What are the top 10 hotspots by call volume in the past 7, 30, and 90 days?
- Which areas have the highest concentration of violent crime calls?
- Are there emerging hotspots not present in previous periods?

<b>Operational Efficiency</b>
- What is the average time between call creation and unit arrival for high-priority calls?
- How many calls per shift are handled without report, arrest, or further follow-up?
- Which call types consume the most officer time with least actionable outcome?
- Are officers or shifts disproportionately handling more serious incidents?

In [1]:
import os
import pandas as pd
import inc.functions as fn
from inc.credential_manager import inject_decrypted_env, get_passphrase
import h3

# Obtain the passphrase from hidden file or user input
passphrase = get_passphrase()

# Inject decrypted environment variables
inject_decrypted_env(environment="prod", passphrase=passphrase)

True

### Production View

In [None]:
cfs_df = fn.fetch_calls_for_service(ori="OH0760400", start_date=fn.last_week("str"), end_date=fn.yesterday("str"), data_type="Stats")

### Development View

In [2]:
cfs_df = pd.read_excel("resources/cfs/CFS_data_2025-05-30.xlsx")

# Preprocess the DataFrame. fetch_calls_for_service already does this
# for the production view, so it is only needed for the Excel extract.
cfs_df = fn.preprocess_calls(cfs_df)
fn.update_daily_summary(cfs_df, csv_filename="resources/cfs/call_type_daily_summary.csv")

No new data to process.


In [3]:
cfs_df["Hex_ID_10"] = cfs_df.apply(
    lambda row: fn.safe_latlng_to_hex(row["LatitudeY"], row["LongitudeX"], resolution=10),
    axis=1
)
cfs_df["Hex_ID_7"] = cfs_df.apply(
    lambda row: fn.safe_latlng_to_hex(row["LatitudeY"], row["LongitudeX"], resolution=7),
    axis=1
)
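
With a hex ID on every call, the hotspot questions under Spatial Analysis reduce to a windowed count per cell. The following is a minimal sketch, not the notebook's method. It assumes `LocalDatetime` parses as a datetime and that `fn.safe_latlng_to_hex` leaves a missing value for calls without usable coordinates (its behavior is not shown here). `top_hex_hotspots` and its parameters are hypothetical names.

```python
import pandas as pd

def top_hex_hotspots(df, hex_col="Hex_ID_10", days=7, top_n=10, as_of=None):
    """Count calls per hex cell over the trailing `days` window (hypothetical helper)."""
    times = pd.to_datetime(df["LocalDatetime"], errors="coerce")
    # Default the window end to the newest call in the data
    as_of = times.max() if as_of is None else pd.Timestamp(as_of)
    window_start = as_of - pd.Timedelta(days=days)

    # Keep calls inside the window that also have a hex ID
    in_window = (times > window_start) & (times <= as_of) & df[hex_col].notna()
    return df.loc[in_window, hex_col].value_counts().head(top_n)


# Toy data with made-up hex labels, for illustration only
demo = pd.DataFrame({
    "LocalDatetime": [
        "2025-05-29 10:00", "2025-05-29 23:15", "2025-05-20 08:30",
        "2025-03-15 12:00", "2025-05-28 01:00",
    ],
    "Hex_ID_10": ["hex_a", "hex_a", "hex_b", "hex_b", None],
})
hotspots_7 = top_hex_hotspots(demo, days=7)
hotspots_90 = top_hex_hotspots(demo, days=90)
```

Calling the helper with `days=7`, `30`, and `90` and comparing the resulting indexes is one way to spot cells that appear only in the shortest window, which is a rough proxy for an emerging hotspot.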

## Questions

What are the most common call types this year vs. last year?

In [4]:
# Restrict to reportable calls for this agency (ORI)
df = cfs_df.loc[(cfs_df['Reportable'] == True) & (cfs_df['ORI'] == "OH0760400")].copy()

def most_common_call_types_this_vs_last_year(df, top_n=10):
    df = df.copy()
    df['LocalDatetime'] = pd.to_datetime(df['LocalDatetime'], errors='coerce')
    df['Year'] = df['LocalDatetime'].dt.year

    current_year = pd.Timestamp.today().year
    last_year = current_year - 1

    this_year_calls = (
        df[df['Year'] == current_year]['CallType']
        .value_counts()
        .head(top_n)
        .rename('ThisYear')
    )

    last_year_calls = (
        df[df['Year'] == last_year]['CallType']
        .value_counts()
        .head(top_n)
        .rename('LastYear')
    )

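    # Combine both for comparison. A call type missing from one year's top N
    # is filled with 0 here, even if it had calls that year.
    comparison_df = pd.concat([this_year_calls, last_year_calls], axis=1).fillna(0).astype(int)
    comparison_df['Change'] = comparison_df['ThisYear'] - comparison_df['LastYear']
    # NOTE: replace(0, 1) avoids division by zero, but when LastYear is 0 the
    # result is just Change * 100, not a meaningful percentage.
    comparison_df['% Change'] = ((comparison_df['Change'] / comparison_df['LastYear'].replace(0, 1)) * 100).round(1)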

    return comparison_df.sort_values(by='ThisYear', ascending=False)

result = most_common_call_types_this_vs_last_year(df)
print(result)

                   ThisYear  LastYear  Change  % Change
CallType                                               
Welfare Check          1278         0    1278  127800.0
911 Hangup              985         0     985   98500.0
Disturbance             784         0     784   78400.0
Parking Complaint       633         0     633   63300.0
Domestic                539         0     539   53900.0
Alarm - Business        475         0     475   47500.0
Accident                377         0     377   37700.0
Suspicious Person       355         0     355   35500.0
Theft                   337         0     337   33700.0
Noise Complaint         287         0     287   28700.0
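
The `% Change` column above is an artifact of `replace(0, 1)`: every `LastYear` count is 0, so each percentage is simply `ThisYear * 100`. The zeros can come from two places: the extract may hold no prior-year rows, or a call type may fall outside last year's top N and be filled with 0 by the `concat`. A minimal sketch of one way to guard against both, assuming `CallType` and `Year` columns as in the function above. `compare_call_types` is a hypothetical name, and leaving the percentage missing when the baseline is zero is a design choice, not the notebook's method.

```python
import pandas as pd

def compare_call_types(df, current_year, top_n=10):
    """Year-over-year counts for this year's top call types (hypothetical helper)."""
    # Count every call type in every year first, then pick the top N
    counts = df.groupby(["CallType", "Year"]).size().unstack(fill_value=0)
    # Make sure both year columns exist even if one year has no rows
    counts = counts.reindex(columns=[current_year - 1, current_year], fill_value=0)
    counts.columns = ["LastYear", "ThisYear"]

    top = counts.nlargest(top_n, "ThisYear").copy()
    top["Change"] = top["ThisYear"] - top["LastYear"]
    # A zero baseline becomes NaN, so the percentage is missing, not inflated
    baseline = top["LastYear"].where(top["LastYear"] > 0)
    top["% Change"] = (top["Change"] / baseline * 100).round(1)
    return top[["ThisYear", "LastYear", "Change", "% Change"]]


# Toy data for illustration only
demo = pd.DataFrame({
    "CallType": ["Theft", "Theft", "Theft", "Domestic", "Domestic", "Alarm"],
    "Year": [2025, 2025, 2024, 2025, 2024, 2025],
})
comparison = compare_call_types(demo, current_year=2025)
```

In the toy data, `Alarm` has no 2024 calls, so its `% Change` is missing while its `Change` is still reported.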


What times of day see the highest volume of calls by shift?

In [5]:
def call_volume_by_hour_and_shift(df):
    df = df.copy()
    df['LocalDatetime'] = pd.to_datetime(df['LocalDatetime'], errors='coerce')
    df['Hour'] = df['LocalDatetime'].dt.hour

    # Ensure Shift is present or assign it
    if 'Shift' not in df.columns:
        df['Shift'] = df['LocalDatetime'].apply(fn.assign_shift)

    # Group by Shift and Hour
    grouped = df.groupby(['Shift', 'Hour']).size().reset_index(name='CallVolume')

    # Pivot for better readability
    pivot_table = grouped.pivot(index='Hour', columns='Shift', values='CallVolume').fillna(0).astype(int)

    return pivot_table
pivot = call_volume_by_hour_and_shift(df)
print(pivot)

Shift  1st Shift  2nd Shift  3rd Shift
Hour                                  
0            445          0          0
1            354          0          0
2            287          0          0
3            262          0          0
4            209          0          0
5            165          0          0
6              0        194          0
7              0        309          0
8              0        410          0
9              0        509          0
10             0        594          0
11             0        616          0
12             0        575          0
13             0        668          0
14             0          0        615
15             0          0        648
16             0          0        704
17             0          0        652
18             0          0        589
19             0          0        629
20             0          0        574
21             0          0        562
22           493          0          0
23           489         
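
Each hour in the table maps to exactly one shift, which is why every row has a single non-zero column. The boundaries implied by the output are 22:00 to 05:59 for 1st Shift, 06:00 to 13:59 for 2nd Shift, and 14:00 to 21:59 for 3rd Shift. `fn.assign_shift` itself is not shown, so the following is only a sketch consistent with that output, under a hypothetical name.

```python
from datetime import datetime

def assign_shift_sketch(ts):
    """Map a timestamp to a shift label, using boundaries inferred from the table above."""
    hour = ts.hour
    if 6 <= hour < 14:
        return "2nd Shift"
    if 14 <= hour < 22:
        return "3rd Shift"
    return "1st Shift"  # 22:00 to 05:59 wraps past midnight


# Spot-check the boundary hours
labels = [assign_shift_sketch(datetime(2025, 5, 30, h)) for h in (0, 6, 13, 14, 22)]
```

Since the shift follows from the hour, the busiest hour per shift can be read straight from the pivot with `pivot.idxmax()`. Unparseable timestamps (`NaT`) would need separate handling before a function like this is applied.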

Which addresses have the highest number of repeated calls?

In [6]:
def top_repeat_call_addresses(df, top_n=20):
    df = df.copy()
    df['FullAddress'] = df['FullAddress'].fillna('Unknown')

    # Count occurrences of each address
    address_counts = df['FullAddress'].value_counts().reset_index()
    address_counts.columns = ['FullAddress', 'CallCount']

    # Return top N
    return address_counts.head(top_n)
repeat_calls = top_repeat_call_addresses(df)
print(repeat_calls)


                         FullAddress  CallCount
0                         Not Listed        286
1                       3200 US 62          114
2                700 MCKINLEY AVE NW         84
3               4004 TUSCARAWAS ST W         82
4                    200 HIGH AVE SW         52
5                     2600 6TH ST SW         49
6   100 SOMEWHERE IN THE CITY   &            46
7                      131 5TH ST NE         43
8                643 ALAN PAGE DR SE         40
9               3131 TUSCARAWAS ST W         39
10               319 TUSCARAWAS ST E         31
11              2215 TUSCARAWAS ST E         31
12                     221 3RD ST SW         30
13              2210 TUSCARAWAS ST W         30
14                 112 CHERRY AVE SE         30
15                 1000 MARKET AVE N         29
16                1114 GONDER AVE SE         29
17                   1212 12TH ST NW         29
18                   1700 55TH ST NE         28
19                 626 WALNUT AVE NE    
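
The top entry, `Not Listed`, is a placeholder rather than a location, and `100 SOMEWHERE IN THE CITY   &` looks like one as well, so both crowd out real repeat-call addresses. A minimal sketch that normalizes case and whitespace and drops placeholders before counting. The `PLACEHOLDERS` set is an assumption built from the values visible above and would need review against the full data; `top_repeat_addresses_clean` is a hypothetical name.

```python
import pandas as pd

# Assumed placeholder values; extend after reviewing the real address data
PLACEHOLDERS = {"NOT LISTED", "UNKNOWN", "100 SOMEWHERE IN THE CITY &"}

def top_repeat_addresses_clean(df, top_n=20):
    """Count calls per address after normalizing text and dropping placeholders."""
    addresses = (
        df["FullAddress"]
        .fillna("Unknown")
        .str.upper()
        .str.replace(r"\s+", " ", regex=True)  # collapse runs of whitespace
        .str.strip()
    )
    counts = addresses[~addresses.isin(PLACEHOLDERS)].value_counts()
    return counts.head(top_n).rename_axis("FullAddress").reset_index(name="CallCount")


# Toy data for illustration only
demo = pd.DataFrame({"FullAddress": [
    "Not Listed", "3200 US 62  ", "3200 us 62", None,
    "100 SOMEWHERE IN THE CITY   &", "700 MCKINLEY AVE NW",
]})
clean = top_repeat_addresses_clean(demo)
```

Normalizing before counting also merges variants of the same address that differ only in spacing or case, which would otherwise be split across rows.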

What percentage of calls resulted in a report being taken or an arrest?

In [7]:
def percentage_report_or_arrest(df):
    df = df.copy()
    df['dispo'] = df['dispo'].fillna('').str.upper()

    total_calls = len(df)
    report_calls = df['dispo'].str.contains('REPORT TAKEN').sum()
    arrest_calls = df['dispo'].str.contains('ARREST').sum()

    # Avoid double-counting calls that include both
    report_or_arrest_calls = df[
        df['dispo'].str.contains('REPORT TAKEN') | df['dispo'].str.contains('ARREST')
    ]

    percent_report = (report_calls / total_calls) * 100
    percent_arrest = (arrest_calls / total_calls) * 100
    percent_combined = (len(report_or_arrest_calls) / total_calls) * 100

    return {
        "Total Calls": total_calls,
        "Report Taken %": round(percent_report, 2),
        "Arrest %": round(percent_arrest, 2),
        "Report or Arrest %": round(percent_combined, 2)
    }
results = percentage_report_or_arrest(df)
print(results)


{'Total Calls': 11552, 'Report Taken %': 24.82, 'Arrest %': 5.14, 'Report or Arrest %': 26.46}
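
The three percentages are linked by inclusion-exclusion: a call whose disposition mentions both a report and an arrest is counted in each individual figure but only once in the combined one. From the rounded values printed above, the overlap is about 24.82 + 5.14 - 26.46 = 3.50% of calls, roughly 404 of the 11,552. The sketch below makes that overlap explicit and guards against an empty frame. `disposition_breakdown` is a hypothetical name, and it keeps the notebook's substring matching, which would also match a disposition such as a hypothetical `NO ARREST`.

```python
import pandas as pd

def disposition_breakdown(df):
    """Report/arrest shares with the overlap reported separately (hypothetical helper)."""
    total = len(df)
    if total == 0:
        return {"Total Calls": 0}  # nothing to divide by

    dispo = df["dispo"].fillna("").str.upper()
    # Build each mask once and reuse it
    report = dispo.str.contains("REPORT TAKEN", regex=False)
    arrest = dispo.str.contains("ARREST", regex=False)

    def pct(mask):
        return round(float(mask.sum()) / total * 100, 2)

    return {
        "Total Calls": total,
        "Report Taken %": pct(report),
        "Arrest %": pct(arrest),
        "Report and Arrest %": pct(report & arrest),
        "Report or Arrest %": pct(report | arrest),
    }


# Toy data for illustration only
demo = pd.DataFrame({"dispo": ["Report Taken", "Arrest", "Report Taken; Arrest", "Advised", None]})
breakdown = disposition_breakdown(demo)
```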


In [9]:
def summarize_section(api_key, section_title, df_or_text):
    from langchain_openai import ChatOpenAI
    from langchain.chains import ConversationChain
    from langchain.prompts import PromptTemplate
    from langchain.memory import ConversationBufferMemory

    if isinstance(df_or_text, pd.DataFrame):
        data_str = df_or_text.to_string(index=False)
    elif isinstance(df_or_text, dict):
        data_str = "\n".join(f"{k}: {v}" for k, v in df_or_text.items())
    else:
        data_str = str(df_or_text)

    prompt_template = PromptTemplate(
        input_variables=["history", "input"],
        template=f"""You are a police analyst assistant. Provide a concise summary of the following section titled "{section_title}".
Focus on insights, trends, or red flags for command staff. Keep your summary under 100 words.

Conversation history:
{{history}}

Human: {{input}}
AI:"""
    )

    chain = ConversationChain(
        llm=ChatOpenAI(openai_api_key=api_key, model_name="gpt-3.5-turbo", temperature=0.3),
        memory=ConversationBufferMemory(),
        prompt=prompt_template
    )

    input_text = f"Section Title: {section_title}\nData:\n{data_str}"
    return chain.run(input_text)

api_key = os.environ['OPEN_API_KEY']

summaries = []
summaries.append(summarize_section(api_key, "Call Types This Year vs. Last Year", most_common_call_types_this_vs_last_year(df)))
summaries.append(summarize_section(api_key, "Hourly Call Volume by Shift", call_volume_by_hour_and_shift(df)))
summaries.append(summarize_section(api_key, "Top Repeat Call Addresses", top_repeat_call_addresses(df)))
summaries.append(summarize_section(api_key, "Report/Arrest Percentage", percentage_report_or_arrest(df)))

final_report = "\n\n".join(summaries)
print(final_report)


Insights: Significant increase in call types this year compared to last year, with percentage changes ranging from 28,700% to 127,800%. Trend indicates a potential rise in demand for police services. Command staff should assess resources and adjust deployment strategies accordingly.

The data shows a significant disparity in call volume between the 1st shift and the 2nd/3rd shifts. The 2nd and 3rd shifts have consistently low call volumes, indicating a potential issue with resource allocation or demand forecasting. Command staff should investigate the reasons behind this trend and consider adjusting staffing levels accordingly.

Insights from the "Top Repeat Call Addresses" section show that certain locations, such as 3200 US 62 and 700 MCKINLEY AVE NW, have a high number of calls. This could indicate areas with recurring issues that may require increased police presence or intervention. Command staff should consider allocating resources to address the underlying problems at these loca

### Break out specifics

In [10]:
# Load the daily summary of call types by ORI and date
df_summary = fn.load_summary_data()

# Add call types to the Summary Data to focus on Reportable calls only
cfs_types = pd.read_excel("resources/cfs/lib_call_types.xlsx", usecols=["CallType","Reportable","CodeType"])
df_summary = df_summary.merge(cfs_types, on="CallType", how="left")
df_summary_reportable = df_summary.loc[df_summary['Reportable'] == True].copy()

# Subset the summary by call type
traffic_stops = df_summary.loc[df_summary['CallType'] == 'Traffic Stop'].copy()
shots_fired = df_summary.loc[df_summary['CallType'] == 'Shots Fired'].copy()
# na=False keeps rows with a missing CallType out of the mask instead of raising
accidents = df_summary.loc[df_summary['CallType'].str.contains('Accident', na=False)].copy()

past_10_weeks = fn.compute_total_cfs_past_10_weeks(df_summary_reportable)

ytd_cfs = fn.compute_total_cfs_ytd(df_summary_reportable)
ytd_traffic_stops = fn.compute_total_cfs_ytd(traffic_stops)
ytd_shots_fired = fn.compute_total_cfs_ytd(shots_fired)
ytd_accidents = fn.compute_total_cfs_ytd(accidents)

In [11]:
df_summary_reportable.groupby(["CallType","Shift"]).size().unstack(fill_value=0)

Shift                       1st Shift  2nd Shift  3rd Shift
CallType
911 Hangup                       1981       2005       2097
ATV Complaint                     119        158        757
Abandoned Vehicle                 147       1049        654
Abuse/Neglect                      49         96        125
Accident                          680       1664       1840
...                               ...        ...        ...
Vehicle - Unauthorized Use        187        312        325
Violation of Order                 60        150        249
Warrant Arrest                    342        700        695
Warrant/Civil Process              20         82         43
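
Raw counts make it hard to compare call types of very different volume across shifts. Dividing each row by its total turns the table into shift shares. A small sketch using three rows copied from the output above; the variable names are illustrative.

```python
import pandas as pd

# Three rows copied from the CallType-by-Shift output above
by_shift = pd.DataFrame(
    {
        "1st Shift": [1981, 119, 147],
        "2nd Shift": [2005, 158, 1049],
        "3rd Shift": [2097, 757, 654],
    },
    index=pd.Index(["911 Hangup", "ATV Complaint", "Abandoned Vehicle"], name="CallType"),
)

# Share of each call type's rows that fall in each shift, in percent
shift_share = by_shift.div(by_shift.sum(axis=1), axis=0).mul(100).round(1)
```

On these rows, 911 hangups are spread almost evenly across shifts, while about 73% of ATV complaints fall on 3rd Shift. Note that `.size()` counts rows. If the daily summary carries a per-row call count column (not visible here), summing that column would be needed to get call totals.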
