# Replicating the Police Diversion Analysis

This document breaks down and replicates the methodology used in the 2021 review of the CAHOOTS organization by the EPD Crime Analysis Unit.
The formulas below are the ones used to calculate the diversion rates reported in that review.


### Gross Diversion Rates

| Rate | Formula | Description |
|---|---|---|
| **Diversion Rate 1** |  $ \frac{\text{CAHOOTS Associations}}{\text{Total Calls}} $  | This rate represents all instances where CAHOOTS was associated with a call, regardless of dispatch or arrival status.  |
| **Diversion Rate 2** | $ \frac{\text{CAHOOTS Dispatched}}{\text{Total Calls}} $ | This rate considers all calls where CAHOOTS was dispatched. |
| **Diversion Rate 3** | $ \frac{\text{CAHOOTS Only Arrived}}{\text{Total Calls}} $  | This rate considers all calls where CAHOOTS alone arrived at the scene. |

### Adjusted Diversion Rates

| Rate | Formula | Description |
|---|---|---|
| **Diversion Rate 4** | $ \frac{\text{CAHOOTS Only Arrived} - \text{Top 3 CAHOOTS Incident Count (dataset 5)}}{\text{Total Calls}} $ | Removes the combined count of the three most common call types handled by CAHOOTS from the numerator. |
| **Diversion Rate 5** | $ \frac{\text{CAHOOTS Only Arrived} - \text{Top 3 CAHOOTS Incident Count (dataset 5)}}{\text{Arrived-Only Calls} - \text{Top 3 CAHOOTS Incident Count (dataset 5)}} $ | Same numerator as Diversion Rate 4, but the denominator is limited to calls where a unit arrived on scene, less the same top-3 count. |
| **Diversion Rate 6 (Low)** | $ \frac{\text{Dispatched CAHOOTS Welfare Calls} \times 0.74}{\text{Total Calls}} $ | Assumes that welfare checks are the only diverted call type and that 74% of them are diversions. |
| **Diversion Rate 6 (High)** | $ \frac{\text{Dispatched CAHOOTS Welfare Calls} \times 0.74}{\text{Arrived-Only Calls} - \text{Top 3 CAHOOTS Incident Count (dataset 2)}} $ | Same numerator as Diversion Rate 6 (Low), but divided by the adjusted arrived-only total (arrived-only calls minus the dataset 2 top-3 count). |

# Police Diversion Criteria

* The call is received by the call center
* Police are normally dispatched to that call type
* The call is dispatched to CAHOOTS, and CAHOOTS arrives on scene
* No EPD resources are dispatched to the call
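
The four criteria can be expressed as a single check per incident. The sketch below is a minimal illustration in plain Python, not the EPD unit's implementation. `POLICE_CALL_TYPES` is a hypothetical placeholder for whatever list of call types police are normally dispatched to, and every non-CAHOOTS unit is assumed to be an EPD resource. Each unit row is assumed to be a dict with the `RespondingUnitCallSign`, `Unit_Dispatched_Time`, and `Unit_OnScene_Time` fields from the CAD export, with CAHOOTS call signs already standardized to `CAHOOT`.

```python
# Hypothetical placeholder: call types police are normally dispatched to.
POLICE_CALL_TYPES = {"CHECK WELFARE", "SUICIDAL SUBJECT"}


def is_diversion(call_type, unit_rows):
    """Return True if an incident meets all four diversion criteria.

    unit_rows holds one dict per responding unit for a single incident.
    The existence of the incident record stands in for criterion 1
    (the call was received by the call center).
    """
    # Criterion 2: police are normally dispatched to this call type.
    if call_type not in POLICE_CALL_TYPES:
        return False

    dispatched = [r for r in unit_rows if r.get("Unit_Dispatched_Time")]
    cahoots = [r for r in dispatched if r["RespondingUnitCallSign"] == "CAHOOT"]
    # Assumption: any dispatched unit that is not CAHOOTS is an EPD resource.
    others = [r for r in dispatched if r["RespondingUnitCallSign"] != "CAHOOT"]

    # Criterion 3: CAHOOTS was dispatched and arrived on scene.
    cahoots_arrived = any(r.get("Unit_OnScene_Time") for r in cahoots)

    # Criterion 4: no EPD unit was dispatched.
    return cahoots_arrived and not others


cahoots_only = [{"RespondingUnitCallSign": "CAHOOT",
                 "Unit_Dispatched_Time": "01/01/2021 00:02:53",
                 "Unit_OnScene_Time": "01/01/2021 00:06:38"}]
joint = cahoots_only + [{"RespondingUnitCallSign": "4E53",
                         "Unit_Dispatched_Time": "01/01/2021 00:03:10",
                         "Unit_OnScene_Time": "01/01/2021 00:07:00"}]

print(is_diversion("CHECK WELFARE", cahoots_only))  # True
print(is_diversion("CHECK WELFARE", joint))         # False
print(is_diversion("TRANSPORT", cahoots_only))      # False
```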

# Exact Replication of the Numbers

The study does not provide its exact calculations, so I extracted the relevant counts from the available figures and used the stated numbers where raw totals were not visible. Because many of the calculations are not shown, some reverse engineering was needed to derive these formulas.

In [13]:
# Dataset 1: ALL CAHOOTS ASSOCIATIONS
dataset1_len = 22055

# Dataset 2: ALL CAHOOTS DISPATCHED CFS
dataset2_len = 18106
dataset2_dispatched_wellfare = 5546 # 6003 is the true value
dataset2_dispatched_public = 5788
dataset2_dispatched_transport = 1803

# Dataset 3: ALL CAHOOTS ARRIVED CFS
dataset3_len = 16218

# Dataset 4: ALL CAHOOTS ONLY ASSOCIATIONS
dataset4_len = 18971

# Dataset 5: CAHOOTS ONLY ARRIVED CFS
dataset5_len = 14212
dataset5_assist_pub = 5058
dataset5_transport =  1587
dataset5_wellfare = 5022

# Total calls for police and Cahoots
total_calls = 109854

# Total calls for police and CAHOOTS (arrived only). The paper says "dispatched only", but the numbers it cites for CAHOOTS are "arrived only".
# Either this was a misprint and both police and CAHOOTS were assessed on an arrived-only basis, or police calls were counted on a dispatched-only basis and compared to CAHOOTS on an arrived-only basis.
both_total_dis = 68427

### Gross Diversion Rates 

In [14]:
# Diversion Rate 1:  CAHOOTS Associations / Total Calls
Divert_rate_1 = dataset1_len / total_calls

# Diversion Rate 2:  CAHOOTS Dispatched CFS / Total Calls
Divert_rate_2 = dataset2_len / total_calls

# Diversion Rate 3:  CAHOOTS Only Arrived CFS / Total Calls
Divert_rate_3 = dataset5_len / total_calls

# Display Gross Diversion Rates
print("Gross Diversion Rates:")
print(f"Diversion Rate 1: {Divert_rate_1:.4f}")
print(f"Diversion Rate 2: {Divert_rate_2:.4f}")
print(f"Diversion Rate 3: {Divert_rate_3:.4f}")
print("-" * 30)

Gross Diversion Rates:
Diversion Rate 1: 0.2008
Diversion Rate 2: 0.1648
Diversion Rate 3: 0.1294
------------------------------


### Adjusted Diversion Rates

In [15]:
# Calculate adjusted totals for dispatched calls
total_dispatched_adjusted_5 = both_total_dis - (dataset5_assist_pub + dataset5_transport + dataset5_wellfare)
total_dispatched_adjusted_2 = both_total_dis - (dataset2_dispatched_public + dataset2_dispatched_transport + dataset2_dispatched_wellfare)

# Diversion Rate 4: (CAHOOTS Only Arrived CFS - (Public Assist + Transport + Welfare)) / Total Calls
Divert_rate_4 = (dataset5_len - sum([dataset5_assist_pub, dataset5_transport, dataset5_wellfare])) / total_calls

# Diversion Rate 5: (CAHOOTS Only Arrived - (Public Assist + Transport + Welfare)) / Total Dispatched Adjusted (Arrived Only)
Divert_rate_5 = (dataset5_len - sum([dataset5_assist_pub, dataset5_transport, dataset5_wellfare])) / total_dispatched_adjusted_5

# Apply 0.74 Adjustment 
adjusted_wellfare_dataset2 = dataset2_dispatched_wellfare * 0.74 

# Diversion Rate 6 (Low):  Adjusted Welfare Calls / Total Calls
Divert_rate_6_low = adjusted_wellfare_dataset2 / total_calls

# Diversion Rate 6 (High): Adjusted Welfare Calls / Total Dispatched Adjusted
Divert_rate_6_high = adjusted_wellfare_dataset2 / total_dispatched_adjusted_2

# Display Adjusted Diversion Rates
print("Adjusted Diversion Rates:")
print(f"Diversion Rate 4: {Divert_rate_4:.4f}")
print(f"Diversion Rate 5: {Divert_rate_5:.4f}")
print(f"Diversion Rate 6 (Low): {Divert_rate_6_low:.4f}")
print(f"Diversion Rate 6 (High): {Divert_rate_6_high:.4f}")
print("-" * 35)

Adjusted Diversion Rates:
Diversion Rate 4: 0.0232
Diversion Rate 5: 0.0448
Diversion Rate 6 (Low): 0.0374
Diversion Rate 6 (High): 0.0742
-----------------------------------
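
The comment on `dataset2_dispatched_wellfare` notes 6003 as the true value, while 5546 is the figure used above. As a rough check on how much that choice matters, the sketch below recomputes both versions of Diversion Rate 6 with each count. It restates the constants from the earlier cell so that it runs on its own. The helper name `rate_6` is introduced here for illustration only.

```python
TOTAL_CALLS = 109854
BOTH_TOTAL_DIS = 68427        # arrived-only total for police and CAHOOTS
DISPATCHED_PUBLIC = 5788      # dataset 2 public assist count
DISPATCHED_TRANSPORT = 1803   # dataset 2 transport count
WELFARE_FACTOR = 0.74


def rate_6(welfare_count):
    """Return (low, high) Diversion Rate 6 for a given dispatched welfare count."""
    adjusted_welfare = welfare_count * WELFARE_FACTOR
    adjusted_total = BOTH_TOTAL_DIS - (
        DISPATCHED_PUBLIC + DISPATCHED_TRANSPORT + welfare_count
    )
    return adjusted_welfare / TOTAL_CALLS, adjusted_welfare / adjusted_total


for count in (5546, 6003):
    low, high = rate_6(count)
    print(f"welfare={count}: low={low:.4f}, high={high:.4f}")
# welfare=5546: low=0.0374, high=0.0742
# welfare=6003: low=0.0404, high=0.0810
```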


### To get the final range, you have to assume that the only call type CAHOOTS can divert is a welfare check.

In [127]:
import pandas as pd
file_path = 'Data/cleaned_data/cleaned_CAD_data_2021.csv'
cleaned_data = pd.read_csv(file_path)
cleaned_data

Unnamed: 0,IncidentNumber,Call_Created_Time,Call_First_Dispatched_Time,Call_First_On_Scene,Call_Cleared,Call_Zipcode,Beat,Call_Source,Call_Priority,InitialIncidentTypeDescription,IsPrimary,PrimaryUnitCallSign,RespondingUnitCallSign,Unit_Dispatched_Time,Unit_OnScene_Time,Unit_Cleared_Time,Disposition,year,Cahoots_related
0,OR-2021-01-01-21000001,2021-01-01 00:00:58,01/01/2021 00:22:41,,01/01/2021 00:22:47,97403.0,EP03,PHONE,5,BEAT INFORMATION,1,6E31,6E31,01/01/2021 00:22:41,,01/01/2021 00:22:47,INFORMATION ONLY,2021,0
1,OR-2021-01-01-21000002,2021-01-01 00:01:03,,,,97404.0,LS13,E911,5,ILLEGAL FIREWORKS,0,,,,,,RELAYED TO LANE COUNTY SHERIFFS OFFICE,2021,0
2,OR-2021-01-01-21000004,2021-01-01 00:01:48,01/01/2021 00:02:53,01/01/2021 00:06:38,01/01/2021 00:23:46,97402.0,EP05,W911,3,TRAFFIC HAZARD,1,4E53,4E53,01/01/2021 00:18:25,01/01/2021 00:18:25,01/01/2021 00:23:46,PATROL CHECK,2021,0
3,OR-2021-01-01-21000004,2021-01-01 00:01:48,01/01/2021 00:02:53,01/01/2021 00:06:38,01/01/2021 00:23:46,97402.0,EP05,W911,3,TRAFFIC HAZARD,0,4E53,5E47,01/01/2021 00:02:53,01/01/2021 00:06:38,01/01/2021 00:23:33,PATROL CHECK,2021,0
4,OR-2021-01-01-21000004,2021-01-01 00:01:48,01/01/2021 00:02:53,01/01/2021 00:06:38,01/01/2021 00:23:46,97402.0,EP05,W911,3,TRAFFIC HAZARD,0,4E53,CMD16,01/01/2021 00:07:49,01/01/2021 00:07:49,01/01/2021 00:14:38,PATROL CHECK,2021,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
217278,OR-2021-12-31-21336952,2021-12-31 23:51:41,12/31/2021 23:51:41,12/31/2021 23:51:41,01/01/2022 00:18:31,97401.0,EP02,SELF,P,FIGHT,0,4F64,4E23,01/01/2022 00:02:03,01/01/2022 00:03:28,01/01/2022 00:18:31,CITED IN LIEU OF CUSTODY,2021,0
217279,OR-2021-12-31-21336952,2021-12-31 23:51:41,12/31/2021 23:51:41,12/31/2021 23:51:41,01/01/2022 00:18:31,97401.0,EP02,SELF,P,FIGHT,1,4F64,4F64,12/31/2021 23:51:41,12/31/2021 23:51:41,01/01/2022 00:18:10,CITED IN LIEU OF CUSTODY,2021,0
217280,OR-2021-12-31-21336952,2021-12-31 23:51:41,12/31/2021 23:51:41,12/31/2021 23:51:41,01/01/2022 00:18:31,97401.0,EP02,SELF,P,FIGHT,0,4F64,4F65,12/31/2021 23:51:41,12/31/2021 23:51:41,01/01/2022 00:18:10,CITED IN LIEU OF CUSTODY,2021,0
217281,OR-2021-12-31-21336961,2021-12-31 23:59:50,,,,97402.0,EP05,E911,1,POISONING,0,,,,,,DISREGARD,2021,0


In [128]:
import pandas as pd

file_path = 'Data/cleaned_data/cleaned_CAD_data_2021.csv'
cleaned_data = pd.read_csv(file_path)

datetime_columns = ['Call_Created_Time', 'Call_First_Dispatched_Time', 'Call_First_On_Scene', 'Call_Cleared']
for col in datetime_columns:
    cleaned_data[col] = pd.to_datetime(cleaned_data[col], errors='coerce')
    
cleaned_data = cleaned_data.dropna(subset=['Call_Created_Time', 'Call_Source', 'InitialIncidentTypeDescription'])
cleaned_data

Unnamed: 0,IncidentNumber,Call_Created_Time,Call_First_Dispatched_Time,Call_First_On_Scene,Call_Cleared,Call_Zipcode,Beat,Call_Source,Call_Priority,InitialIncidentTypeDescription,IsPrimary,PrimaryUnitCallSign,RespondingUnitCallSign,Unit_Dispatched_Time,Unit_OnScene_Time,Unit_Cleared_Time,Disposition,year,Cahoots_related
0,OR-2021-01-01-21000001,2021-01-01 00:00:58,2021-01-01 00:22:41,NaT,2021-01-01 00:22:47,97403.0,EP03,PHONE,5,BEAT INFORMATION,1,6E31,6E31,01/01/2021 00:22:41,,01/01/2021 00:22:47,INFORMATION ONLY,2021,0
1,OR-2021-01-01-21000002,2021-01-01 00:01:03,NaT,NaT,NaT,97404.0,LS13,E911,5,ILLEGAL FIREWORKS,0,,,,,,RELAYED TO LANE COUNTY SHERIFFS OFFICE,2021,0
2,OR-2021-01-01-21000004,2021-01-01 00:01:48,2021-01-01 00:02:53,2021-01-01 00:06:38,2021-01-01 00:23:46,97402.0,EP05,W911,3,TRAFFIC HAZARD,1,4E53,4E53,01/01/2021 00:18:25,01/01/2021 00:18:25,01/01/2021 00:23:46,PATROL CHECK,2021,0
3,OR-2021-01-01-21000004,2021-01-01 00:01:48,2021-01-01 00:02:53,2021-01-01 00:06:38,2021-01-01 00:23:46,97402.0,EP05,W911,3,TRAFFIC HAZARD,0,4E53,5E47,01/01/2021 00:02:53,01/01/2021 00:06:38,01/01/2021 00:23:33,PATROL CHECK,2021,0
4,OR-2021-01-01-21000004,2021-01-01 00:01:48,2021-01-01 00:02:53,2021-01-01 00:06:38,2021-01-01 00:23:46,97402.0,EP05,W911,3,TRAFFIC HAZARD,0,4E53,CMD16,01/01/2021 00:07:49,01/01/2021 00:07:49,01/01/2021 00:14:38,PATROL CHECK,2021,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
217278,OR-2021-12-31-21336952,2021-12-31 23:51:41,2021-12-31 23:51:41,2021-12-31 23:51:41,2022-01-01 00:18:31,97401.0,EP02,SELF,P,FIGHT,0,4F64,4E23,01/01/2022 00:02:03,01/01/2022 00:03:28,01/01/2022 00:18:31,CITED IN LIEU OF CUSTODY,2021,0
217279,OR-2021-12-31-21336952,2021-12-31 23:51:41,2021-12-31 23:51:41,2021-12-31 23:51:41,2022-01-01 00:18:31,97401.0,EP02,SELF,P,FIGHT,1,4F64,4F64,12/31/2021 23:51:41,12/31/2021 23:51:41,01/01/2022 00:18:10,CITED IN LIEU OF CUSTODY,2021,0
217280,OR-2021-12-31-21336952,2021-12-31 23:51:41,2021-12-31 23:51:41,2021-12-31 23:51:41,2022-01-01 00:18:31,97401.0,EP02,SELF,P,FIGHT,0,4F64,4F65,12/31/2021 23:51:41,12/31/2021 23:51:41,01/01/2022 00:18:10,CITED IN LIEU OF CUSTODY,2021,0
217281,OR-2021-12-31-21336961,2021-12-31 23:59:50,NaT,NaT,NaT,97402.0,EP05,E911,1,POISONING,0,,,,,,DISREGARD,2021,0
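
The raw export mixes two timestamp layouts: `Call_Created_Time` uses `YYYY-MM-DD HH:MM:SS`, while the other columns use `MM/DD/YYYY HH:MM:SS`. With `errors='coerce'`, any value that fails to parse becomes `NaT` without warning, so it can be worth confirming that the only missing values are the ones that were blank to begin with. The following is a small standard-library sketch of that check. The function names and sample values are illustrative, and the two formats are the ones visible in the rows above.

```python
from datetime import datetime

# The two layouts visible in the CAD export shown above.
KNOWN_FORMATS = ("%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M:%S")


def parse_timestamp(value):
    """Parse a CAD timestamp string; return None for blanks or unknown layouts."""
    if not value:
        return None
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    return None


def count_unparsed(values):
    """Count non-blank values that none of the known formats could parse."""
    return sum(1 for v in values if v and parse_timestamp(v) is None)


sample = ["2021-01-01 00:00:58", "01/01/2021 00:22:41", "", "not a date"]
print(count_unparsed(sample))  # 1
```

A non-zero count on a real column would mean that some `NaT` values come from parsing failures rather than from genuinely missing timestamps.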


In [129]:
import pandas as pd

def create_datasets(cleaned_data):
    # Filtering out the 'SELF' entries from the dataset
    data_filtered = cleaned_data[cleaned_data['Call_Source'] != 'SELF']

    # Removing duplicate incident numbers, keeping only the first occurrence
    data_unique_incidents = data_filtered.drop_duplicates(subset='IncidentNumber', keep='first')

    # Dataset 1: ALL CAHOOTS ASSOCIATIONS
    cahoots_associations = data_unique_incidents[data_unique_incidents['Cahoots_related'] == 1]

    # Dataset 2: ALL CAHOOTS DISPATCHED CFS
    cahoots_dispatched = cahoots_associations.dropna(subset=['Call_First_Dispatched_Time'])

    # Dataset 3: ALL CAHOOTS ARRIVED CFS
    cahoots_arrived = cahoots_dispatched.dropna(subset=['Call_First_On_Scene'])

    # Dataset 4: ALL CAHOOTS ONLY ASSOCIATIONS
    # CAHOOTS-associated incidents whose retained row belongs to the primary unit
    cahoots_only_associations = cahoots_associations[cahoots_associations['IsPrimary'] == 1]

    # Dataset 5: CAHOOTS ONLY ARRIVED CFS
    cahoots_only_arrived = cahoots_only_associations.dropna(subset=['Call_First_On_Scene'])

    # Dataset 6: JOINT CAHOOTS / EPD CFS
    joint_cahoots_epd_responses = data_unique_incidents[
        (data_unique_incidents["Cahoots_related"] == 1) &
        ~(data_unique_incidents["Call_First_On_Scene"].isna()) &
        (data_unique_incidents["IsPrimary"] == 0)
    ]

    
    # Calculate total calls
    total_calls = data_unique_incidents.shape[0]

    # Filter top 3 CAHOOTS CFS natures
    top_3_natures = ['ASSIST PUBLIC- POLICE', 'CHECK WELFARE', 'TRANSPORT']
    top_3_cahoots_natures = cahoots_only_arrived[cahoots_only_arrived['InitialIncidentTypeDescription'].isin(top_3_natures)]
    top_3_cahoots_natures_count = top_3_cahoots_natures.shape[0]
    

    # Specific counts for top 3 natures
    check_welfare_count = top_3_cahoots_natures[top_3_cahoots_natures['InitialIncidentTypeDescription'] == 'CHECK WELFARE'].shape[0]

    # Gross Divert Rates
    gross_divert_rate_1 = (cahoots_associations.shape[0] / total_calls) * 100
    gross_divert_rate_2 = (cahoots_dispatched.shape[0] / total_calls) * 100
    gross_divert_rate_3 = (cahoots_only_arrived.shape[0] / total_calls) * 100

    # Adjusted Divert Rates
    adjusted_cahoots_only_arrived = cahoots_only_arrived.shape[0] - top_3_cahoots_natures_count
    adjusted_divert_rate = (adjusted_cahoots_only_arrived / total_calls) * 100
    
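    # Dispatched-only divert rate: dispatched CAHOOTS calls outside the top 3 natures,
    # as a percentage of all dispatched calls
    dispatched_top_3_count = cahoots_dispatched[cahoots_dispatched['InitialIncidentTypeDescription'].isin(top_3_natures)].shape[0]
    total_dispatched_calls = data_unique_incidents.dropna(subset=['Call_First_Dispatched_Time']).shape[0]
    adjusted_cahoots_police_dispatched = ((cahoots_dispatched.shape[0] - dispatched_top_3_count) / total_dispatched_calls) * 100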

    # Applying 74% adjustment to Check Welfare calls
    likely_check_welfare_diverts = check_welfare_count * 0.74
    total_diverts = adjusted_cahoots_only_arrived + likely_check_welfare_diverts
    adjusted_total_calls = total_calls - top_3_cahoots_natures_count
    adjusted_divert_rate_with_welfare = (total_diverts / adjusted_total_calls) * 100

    # Final Results
    results = {
        "Gross Divert Rate 1 (All Cahoots Associations)": gross_divert_rate_1,
        "Gross Divert Rate 2 (All Cahoots Dispatched CFS)": gross_divert_rate_2,
        "Gross Divert Rate 3 (All Cahoots Arrived CFS)": gross_divert_rate_3,
        "Adjusted Divert Rate (Excluding Top 3 Natures)": adjusted_divert_rate,
        "Adjusted Divert Rate (Excluding Top 3 Natures dispatch only)": adjusted_cahoots_police_dispatched,
        "Adjusted Divert Rate with Check Welfare Adjustment": adjusted_divert_rate_with_welfare
    }

    return {
        'cahoots_associations': cahoots_associations,
        'cahoots_dispatched': cahoots_dispatched,
        'cahoots_arrived': cahoots_arrived,
        'cahoots_only_associations': cahoots_only_associations,
        'cahoots_only_arrived': cahoots_only_arrived,
        'joint_cahoots_epd_responses': joint_cahoots_epd_responses,
        'results': results
    }

# Example usage:
datasets_and_results = create_datasets(cleaned_data)
datasets_and_results['results']


{'Gross Divert Rate 1 (All Cahoots Associations)': 17.928730512249444,
 'Gross Divert Rate 2 (All Cahoots Dispatched CFS)': 15.348470856646207,
 'Gross Divert Rate 3 (All Cahoots Arrived CFS)': 12.877759067122966,
 'Adjusted Divert Rate (Excluding Top 3 Natures)': 2.3602585691781193,
 'Adjusted Divert Rate (Excluding Top 3 Natures dispatch only)': 5.151921228187724,
 'Adjusted Divert Rate with Check Welfare Adjustment': 6.397685077450752}
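
`create_datasets` keeps only the first row per `IncidentNumber`, but `Cahoots_related` is set per unit row. If a CAHOOTS unit appears only on a later row of an incident, the retained row may not reflect it. One way to guard against that, sketched below in plain Python with made-up rows, is to flag an incident as CAHOOTS-related when any of its rows is. Whether this matches the study's own counting is an assumption, not something the study states.

```python
from collections import defaultdict


def cahoots_flag_by_incident(rows):
    """Map each IncidentNumber to 1 if any of its unit rows is CAHOOTS-related."""
    flags = defaultdict(int)
    for row in rows:
        incident = row["IncidentNumber"]
        flags[incident] = max(flags[incident], row["Cahoots_related"])
    return dict(flags)


# Made-up rows: CAHOOTS appears only on the second row of incident "A".
rows = [
    {"IncidentNumber": "A", "RespondingUnitCallSign": "4E53", "Cahoots_related": 0},
    {"IncidentNumber": "A", "RespondingUnitCallSign": "CAHOOT", "Cahoots_related": 1},
    {"IncidentNumber": "B", "RespondingUnitCallSign": "5E47", "Cahoots_related": 0},
]

# Mirrors drop_duplicates(keep='first'): only the first row per incident counts.
first_row_flags = {}
for row in rows:
    first_row_flags.setdefault(row["IncidentNumber"], row["Cahoots_related"])

print(first_row_flags)                 # {'A': 0, 'B': 0}
print(cahoots_flag_by_incident(rows))  # {'A': 1, 'B': 0}
```

In pandas, the same idea could be written as `groupby('IncidentNumber')['Cahoots_related'].max()` applied before deduplicating.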

# Problems
- Assumption of uniform impact: The analysis assumes that the impact of CAHOOTS on diversion rates is uniform across all incident types. It does not account for which CAD call types are actually appropriate for CAHOOTS.

- Simply excluding the top three CAHOOTS-centric incident types assumes that these calls would never require police intervention. However, police still respond to quite a few calls of these types.

- Trying to measure what call volume would be if CAHOOTS did not exist does not make sense, since CAHOOTS provides other preventative community services that could themselves reduce call volume.

- The methodology behind the 74% figure is vague and somewhat subjective (it is based on dispatchers' opinions).
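
Because the 74% figure rests on judgment, it helps to see how far Diversion Rate 6 (Low) moves as that share changes. The sketch below recomputes the rate for a few alternative shares, using the dispatched welfare count and total calls from the replication above. The alternative shares are arbitrary illustration values, not figures from the study.

```python
DISPATCHED_WELFARE = 5546   # dataset 2 welfare count used in the replication
TOTAL_CALLS = 109854


def rate_6_low(share):
    """Diversion Rate 6 (Low) for a given share of welfare calls counted as diversions."""
    return DISPATCHED_WELFARE * share / TOTAL_CALLS


for share in (0.50, 0.74, 1.00):
    print(f"share={share:.2f}: rate={rate_6_low(share):.4f}")
# share=0.50: rate=0.0252
# share=0.74: rate=0.0374
# share=1.00: rate=0.0505
```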




# Natural Experiment

In [1]:
import pandas as pd
CAD_data = pd.read_csv("data/call_data_from_CAD.csv")

CAD_data["Call_Created_Time"] = pd.to_datetime(CAD_data['Call_Created_Time'], errors='coerce')
CAD_data["year"] = CAD_data["Call_Created_Time"].dt.year
CAD_data["Call_First_Dispatched_Time"] = pd.to_datetime(CAD_data['Call_First_Dispatched_Time'], errors='coerce')


# Standardize Cahoots identifiers 
cahoots_identifiers = r"1J77\s*|3J79\s*|3J78\s*|3J77\s*|4J79\s*|3J81\s*|3J76\s*|2J28\s*|2J29\s*|CAHOOT\s*|CAHOT\s*|CAHO\s*"

CAD_data["PrimaryUnitCallSign"] = CAD_data["PrimaryUnitCallSign"].replace(cahoots_identifiers, 'CAHOOT', regex=True)
CAD_data["RespondingUnitCallSign"] = CAD_data["RespondingUnitCallSign"].replace(cahoots_identifiers, 'CAHOOT', regex=True)

# Create an identifier for Cahoots involvement 
CAD_data['Cahoots_related'] = ((CAD_data['PrimaryUnitCallSign'] == 'CAHOOT') | (CAD_data['RespondingUnitCallSign'] == 'CAHOOT')).astype(int)
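
Note that `replace(..., regex=True)` substitutes the matched portion of each string rather than the whole value. The short sketch below applies the same pattern with Python's `re.sub` to a few sample call signs to show what that means in practice. The sample values (including `CAHOOTS`) are made up for illustration and may not occur in the CAD data.

```python
import re

CAHOOTS_PATTERN = (
    r"1J77\s*|3J79\s*|3J78\s*|3J77\s*|4J79\s*|3J81\s*|3J76\s*|"
    r"2J28\s*|2J29\s*|CAHOOT\s*|CAHOT\s*|CAHO\s*"
)


def normalize_call_sign(call_sign):
    """Replace every match of the CAHOOTS pattern with 'CAHOOT'."""
    return re.sub(CAHOOTS_PATTERN, "CAHOOT", call_sign)


for sign in ["3J79 ", "CAHOT", "CAHO", "4E53", "CAHOOTS"]:
    print(repr(sign), "->", repr(normalize_call_sign(sign)))
# '3J79 ' -> 'CAHOOT'
# 'CAHOT' -> 'CAHOOT'
# 'CAHO' -> 'CAHOOT'
# '4E53' -> '4E53'
# 'CAHOOTS' -> 'CAHOOTS'
```

A value such as `CAHOOTS` would therefore not equal `'CAHOOT'` in the `Cahoots_related` comparison. Running `value_counts()` on the two call-sign columns is one way to confirm that no such variants remain.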

In [2]:
CAD_data = CAD_data[CAD_data['Call_Source'] != 'SELF'].copy()
CAD_data.drop(columns=["Unnamed: 0"], inplace=True)
CAD_data

Unnamed: 0,IncidentNumber,Call_Created_Time,Call_First_Dispatched_Time,Call_First_On_Scene,Call_Cleared,Call_Zipcode,Beat,Call_Source,Call_Priority,InitialIncidentTypeDescription,IsPrimary,PrimaryUnitCallSign,RespondingUnitCallSign,Unit_Dispatched_Time,Unit_OnScene_Time,Unit_Cleared_Time,Disposition,year,Cahoots_related
0,OR-2016-01-01-16000001,2016-01-01 00:00:04,2016-01-01 00:04:58,01/01/2016 00:09:41,01/01/2016 00:54:19,97402.0,EP05,E911,3,ASSAULT,1,5E57,5E57,01/01/2016 00:04:58,01/01/2016 00:09:56,01/01/2016 00:54:19,ADVISED,2016,0
1,OR-2016-01-01-16000001,2016-01-01 00:00:04,2016-01-01 00:04:58,01/01/2016 00:09:41,01/01/2016 00:54:19,97402.0,EP05,E911,3,ASSAULT,0,5E57,4X40,01/01/2016 00:09:41,01/01/2016 00:09:41,01/01/2016 00:46:59,ADVISED,2016,0
2,OR-2016-01-01-16000001,2016-01-01 00:00:04,2016-01-01 00:04:58,01/01/2016 00:09:41,01/01/2016 00:54:19,97402.0,EP05,E911,3,ASSAULT,0,5E57,4E53,01/01/2016 00:04:58,01/01/2016 00:12:26,01/01/2016 00:51:58,ADVISED,2016,0
4,OR-2016-01-01-16000004,2016-01-01 00:02:45,2016-01-01 00:04:05,01/01/2016 00:04:05,01/01/2016 00:18:22,97401.0,EP02,E911,3,CHECK WELFARE,0,3X90,3F61,01/01/2016 00:04:12,,01/01/2016 00:08:13,ASSISTED,2016,0
5,OR-2016-01-01-16000004,2016-01-01 00:02:45,2016-01-01 00:04:05,01/01/2016 00:04:05,01/01/2016 00:18:22,97401.0,EP02,E911,3,CHECK WELFARE,0,3X90,4F72,01/01/2016 00:04:49,01/01/2016 00:04:49,01/01/2016 00:08:07,ASSISTED,2016,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1616824,OR-2023-12-31-23351877,2023-12-31 23:23:05,NaT,,,97402.0,EP05,PHONE,3,CRIMINAL TRESPASS,0,,,,,,DISREGARD,2023,0
1616825,OR-2023-12-31-23351878,2023-12-31 23:26:00,NaT,,,97401.0,EP07,PHONE,P,AUDIBLE ALARM,0,,,,,,INFORMATION ONLY,2023,0
1616826,OR-2023-12-31-23351884,2023-12-31 23:32:06,NaT,,,97408.0,EP01,PHONE,P,SUSPICIOUS CONDITIONS,0,,,,,,QUALITY OF LIFE - NO DISPATCH,2023,0
1616827,OR-2023-12-31-23351886,2023-12-31 23:33:30,NaT,,,97401.0,EP01,PHONE,3,SHOTS FIRED,0,,,,,,REFERRED TO OTHER AGENCY,2023,0


In [3]:
data_2016 = CAD_data[(CAD_data["Call_First_Dispatched_Time"].dt.year == 2016) & 
                         (CAD_data["Call_First_Dispatched_Time"].dt.hour >= 5) & 
                         (CAD_data["Call_First_Dispatched_Time"].dt.hour < 10)].copy()

data_2017 = CAD_data[(CAD_data["Call_First_Dispatched_Time"].dt.year == 2017) & 
                         (CAD_data["Call_First_Dispatched_Time"].dt.hour >= 5) & 
                         (CAD_data["Call_First_Dispatched_Time"].dt.hour < 10)].copy()

data_2016.dropna(subset=['Call_First_Dispatched_Time'], inplace=True)
data_2017.dropna(subset=['Call_First_Dispatched_Time'], inplace=True)

In [10]:
def calculate_overlap_proportion(data_2016, data_2017, incident_type):
    """Calculates the ratio of police-only calls of a given incident type
    in 2016 (before the CAHOOTS expansion) to all calls of that type in 2017,
    treating it as a baseline weight. The ratio can exceed 1 when 2016
    police-only volume is higher than 2017 total volume.

    Args:
        data_2016: Pandas DataFrame containing 2016 CAD data filtered for 5am-10am.
        data_2017: Pandas DataFrame containing 2017 CAD data filtered for 5am-10am.
        incident_type: String representing the CAHOOTS incident type.

    Returns:
        float: The calculated ratio, representing the baseline weight.
             Returns None if there were no calls of that type in 2017.
    """

    # 2016 Baseline Calls (Police Only):
    baseline_police_calls = len(data_2016[
        (data_2016["Cahoots_related"] == 0) & 
        (data_2016["InitialIncidentTypeDescription"] == incident_type)
    ])

    # 2017 Total Calls (Police & CAHOOTS):
    total_calls_2017 = len(data_2017[
        data_2017["InitialIncidentTypeDescription"] == incident_type
    ]) 

    # Handle cases with zero calls to avoid ZeroDivisionError:
    if total_calls_2017 == 0:
        print(f"Warning: No calls of type '{incident_type}' in 2017. Cannot calculate proportion.")
        return None

    # Calculate Proportion (baseline / total_2017):
    proportion = baseline_police_calls / total_calls_2017
    return proportion

def estimate_diverted_calls(data_2017, incident_type, proportion):
    """Estimates the number of calls diverted from police to CAHOOTS
    based on the calculated proportion/weight.

    Args:
        data_2017: Pandas DataFrame containing 2017 CAD data filtered for 5am-10am.
        incident_type: String representing the CAHOOTS incident type.
        proportion: Float representing the baseline weight calculated previously.

    Returns:
        int: The estimated number of diverted calls.
    """

    total_calls_2017 = len(data_2017[
        data_2017["InitialIncidentTypeDescription"] == incident_type
    ])
    diverted_calls = int(total_calls_2017 * (1 - proportion))
    return diverted_calls

incident_types = data_2016[data_2016["Cahoots_related"] == 1]["InitialIncidentTypeDescription"].value_counts().head(5).index.tolist()

diverted_calls_by_type = {}
for incident in incident_types:
    weight = calculate_overlap_proportion(data_2016, data_2017, incident)

    if weight is not None:  # Check for successful proportion calculation
        diverted_calls = estimate_diverted_calls(data_2017, incident, weight)
        diverted_calls_by_type[incident] = diverted_calls

diverted_calls_by_type

overlap_proportions = {}
for incident in incident_types:
    overlap = calculate_overlap_proportion(data_2016, data_2017, incident)
    overlap_proportions[incident] = overlap

overlap_proportions

{'CHECK WELFARE': 0.8883174136664217,
 'ASSIST PUBLIC- POLICE': 0.06174957118353345,
 'TRANSPORT': 0.046153846153846156,
 'SUICIDAL SUBJECT': 0.6749226006191951,
 'CRIMINAL TRESPASS': 1.035206499661476}

In [24]:
import numpy as np

np.round(((1052 + 108 + 5282 + 236 + 372 + 255 + 219 + 192) / total_dispatched_adjusted_2), 3)

0.14