# 🚨 **Project: Emergency Response Efficiency Analysis** 🚨

### *Using NumPy, Pandas, Matplotlib, and Seaborn*

---



## 🛠️ **Step 1: Dataset Loading and Initial Cleaning**

### 👉 Operation: Load the dataset and convert datetime columns.

In [33]:
# Import necessary libraries
import pandas as pd
import numpy as np
import warnings
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
df = pd.read_csv('dataset2_cleaned.csv') # Load the dataset

In [None]:
df.info() # print dataset information

In [None]:
df.head() # Display the first 5 rows of the DataFrame 

In [None]:
df.tail() # Display the last 5 rows of the DataFrame

In [None]:
print(df.isnull().sum()) # Check for missing values in each column

In [34]:
warnings.filterwarnings("ignore")  # Ignore warnings for cleaner output
# Convert datetime columns
datetime_columns = ['Received DtTm', 'Entry DtTm', 'Dispatch DtTm', 'Response DtTm', 'On Scene DtTm',
                    'Transport DtTm', 'Hospital DtTm', 'Available DtTm']

for col in datetime_columns:
    df[col] = pd.to_datetime(df[col], errors='coerce')  # unparseable values become NaT instead of raising an error

print("✅ Dataset loaded successfully!")
print(f"📊 Dataset shape: {df.shape}")
print(f"📅 Datetime columns converted: {len(datetime_columns)}")
print(f"🗓️ Date range: {df['Received DtTm'].min()} to {df['Received DtTm'].max()}")

✅ Dataset loaded successfully!
📊 Dataset shape: (9999, 23)
📅 Datetime columns converted: 8
🗓️ Date range: 2003-08-17 12:28:59 to 2025-06-17 14:08:49


### ✅ **Explanation:**

We convert the date-time columns to the pandas `datetime` type so that **time differences can be calculated by simple subtraction.**
`errors='coerce'` sets any value that cannot be parsed to `NaT` (missing) instead of raising an error, so those rows can be filtered out in Step 3.
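
As a small illustration of the coercion behaviour, the sketch below parses three hypothetical values (they are not taken from the dataset) and counts how many end up as `NaT`. Running the same `isna().sum()` check on each converted column is one way to see how much data the conversion lost.

```python
import pandas as pd

# Hypothetical sample values; the real columns come from dataset2_cleaned.csv
raw = pd.Series(['2025-06-17 14:08:49', 'not a date', None])

# Unparseable and missing entries become NaT instead of raising an error
parsed = pd.to_datetime(raw, errors='coerce')

print(parsed)
print(f"Values coerced to NaT: {parsed.isna().sum()}")
```

Here the first value parses normally, while the invalid string and the missing entry both become `NaT`.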

---

## 🛠️ **Step 2: Calculate Response Metrics**

### 👉 Operation: Calculate Dispatch Delay, Travel Time, and Total Response Time.

In [None]:

# Calculate time differences in minutes
df['Dispatch Delay (min)'] = (df['Dispatch DtTm'] - df['Received DtTm']).dt.total_seconds() / 60
df['Travel Time (min)'] = (df['On Scene DtTm'] - df['Dispatch DtTm']).dt.total_seconds() / 60
df['Total Response Time (min)'] = (df['On Scene DtTm'] - df['Received DtTm']).dt.total_seconds() / 60



### ✅ **Explanation:**

* `Dispatch Delay` ➜ Time taken to dispatch a unit after receiving the call.
* `Travel Time` ➜ Time taken to reach the scene after dispatch.
* `Total Response Time` ➜ Full time from call received to arrival on the scene.
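
The three definitions above can be checked by hand on a single call. The sketch below uses one hypothetical call (received 10:00:00, dispatched 10:02:30, on scene 10:09:30), which should give 2.5, 7 and 9.5 minutes. It also shows that, by construction, the total equals dispatch delay plus travel time.

```python
import pandas as pd

# Hypothetical single call, for illustration only
call = pd.DataFrame({
    'Received DtTm': pd.to_datetime(['2025-01-01 10:00:00']),
    'Dispatch DtTm': pd.to_datetime(['2025-01-01 10:02:30']),
    'On Scene DtTm': pd.to_datetime(['2025-01-01 10:09:30']),
})

# Same formulas as in the notebook cell: timedelta -> seconds -> minutes
dispatch_delay = (call['Dispatch DtTm'] - call['Received DtTm']).dt.total_seconds() / 60
travel_time = (call['On Scene DtTm'] - call['Dispatch DtTm']).dt.total_seconds() / 60
total_time = (call['On Scene DtTm'] - call['Received DtTm']).dt.total_seconds() / 60

print(dispatch_delay.iloc[0], travel_time.iloc[0], total_time.iloc[0])
print("Total equals delay + travel:", total_time.iloc[0] == dispatch_delay.iloc[0] + travel_time.iloc[0])
```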

---

## 🛠️ **Step 3: Clean the Data**

### 👉 Operation: Remove invalid or negative response times.

In [None]:
# Keep rows where all three metrics are non-negative.
# Comparisons with NaN evaluate to False, so rows with missing times are dropped too.
df_clean = df[(df['Dispatch Delay (min)'] >= 0) &
              (df['Travel Time (min)'] >= 0) &
              (df['Total Response Time (min)'] >= 0)].copy()

### ✅ **Explanation:**

Negative times usually mean data entry errors.
Rows with a missing timestamp are dropped as well, because a `NaN` metric fails the `>= 0` check, so only **valid, logical response times** remain for the analysis.
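
The filter also removes rows with missing values, because any comparison with `NaN` is `False`. The sketch below demonstrates this on a hypothetical three-row frame with one valid row, one negative delay and one missing delay. Only two of the metric columns are used to keep it short.

```python
import numpy as np
import pandas as pd

# Hypothetical values: one valid row, one negative delay, one missing delay
sample = pd.DataFrame({
    'Dispatch Delay (min)': [1.5, -3.0, np.nan],
    'Travel Time (min)': [6.0, 5.0, 4.0],
})

# NaN >= 0 is False, so the row with the missing delay fails the mask
mask = (sample['Dispatch Delay (min)'] >= 0) & (sample['Travel Time (min)'] >= 0)
kept = sample[mask].copy()

print(f"Rows kept: {len(kept)} of {len(sample)}")
print(f"Rows with a missing delay: {sample['Dispatch Delay (min)'].isna().sum()}")
```

Comparing `len(df)` with `len(df_clean)` in the notebook gives the same kind of count for the real data.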

---

## 📊 **Step 4: Descriptive Statistics**

### 👉 Operation: Calculate mean, median, and maximum response times.

In [None]:
response_stats = df_clean[['Dispatch Delay (min)', 'Travel Time (min)', 'Total Response Time (min)']].describe()
response_stats

### ✅ **Explanation:**

This will show:

* **Average response time** (`mean`)
* **Minimum and maximum response time**
* **Percentiles** (25%, 50%, 75%) to understand spread and outliers; the 50% row is the median
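
To see why the median and percentiles matter alongside the mean, the sketch below uses ten hypothetical response times with one extreme value. The `percentiles` argument of `describe()` is used here to request the 90th percentile in addition to the median; the choice of 0.9 is only an example.

```python
import pandas as pd

# Hypothetical response times in minutes, with one extreme value
times = pd.Series([4, 5, 6, 7, 8, 9, 10, 11, 12, 120], name='Total Response Time (min)')

# Request the median (50%) and the 90th percentile explicitly
summary = times.describe(percentiles=[0.5, 0.9])
print(summary)

# A single outlier pulls the mean far above the median
print(f"Mean: {times.mean():.1f}, Median: {times.median():.1f}")
```

In this sample the mean is 19.2 minutes while the median is 8.5 minutes, so the median describes a typical call better.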

---

## 🎨 **Step 5: Visualize Response Time Distribution**

### 👉 Operation: Plot response time distribution using Seaborn.

In [None]:
plt.figure(figsize=(12, 6))
sns.histplot(df_clean['Total Response Time (min)'], bins=50, kde=True, color='crimson')
plt.title('Total Emergency Response Time Distribution', fontsize=16, color='darkblue')
plt.xlabel('Total Response Time (minutes)', fontsize=14)
plt.ylabel('Frequency', fontsize=14)
plt.grid(True, linestyle='--', alpha=0.5)
plt.show()

### ✅ **Explanation:**

We are plotting **how emergency response times are distributed** to check for patterns or delays.

---

## 📍 **Step 6: Neighborhood-wise Average Response Time**

### 👉 Operation: Calculate and plot average response time by neighborhood.

In [None]:
neighborhood_response = df_clean.groupby('Analysis Neighborhoods')['Total Response Time (min)'].mean().sort_values()

plt.figure(figsize=(14, 8))
sns.barplot(x=neighborhood_response.values, y=neighborhood_response.index, hue=neighborhood_response.index, palette='coolwarm', legend=False)
plt.title('Average Emergency Response Time per Neighborhood', fontsize=16, color='darkgreen')
ax = plt.gca()
ax.xaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'{x:.0f}'))
plt.xlabel('Average Total Response Time (minutes)', fontsize=14)
plt.ylabel('Neighborhood', fontsize=14)
plt.grid(True, linestyle='--', alpha=0.5)
plt.show()

### ✅ **Explanation:**

This shows **which neighborhoods are getting faster or slower emergency responses.**
Important for finding hotspots where delays are frequent.
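
An average based on very few calls can be misleading, so it may help to look at the number of calls next to each mean. The sketch below is one possible way to do that with `agg`; the neighborhood labels and times are invented for illustration.

```python
import pandas as pd

# Hypothetical data; neighborhood names and times are invented
calls = pd.DataFrame({
    'Analysis Neighborhoods': ['A', 'A', 'A', 'B', 'B', 'C'],
    'Total Response Time (min)': [6.0, 8.0, 10.0, 5.0, 7.0, 30.0],
})

# Mean response time and number of calls per neighborhood, fastest first
by_neighborhood = (
    calls.groupby('Analysis Neighborhoods')['Total Response Time (min)']
    .agg(['mean', 'count'])
    .sort_values('mean')
)
print(by_neighborhood)
```

Neighborhood `C` looks slowest here, but its mean rests on a single call, which the `count` column makes visible.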

---

## ⏰ **Step 7: Time of Day vs Response Time**

### 👉 Operation: Extract hour and analyze response times across different hours.

In [None]:
df_clean['Hour of Call'] = df_clean['Received DtTm'].dt.hour

plt.figure(figsize=(12, 6))
sns.boxplot(x='Hour of Call', y='Total Response Time (min)', data=df_clean, hue='Hour of Call', palette='Spectral', legend=False)
plt.title('Response Time by Hour of the Day', fontsize=16, color='darkred')
plt.xlabel('Hour of the Day', fontsize=14)
plt.ylabel('Total Response Time (minutes)', fontsize=14)
plt.grid(True, linestyle='--', alpha=0.5)
plt.show()

### ✅ **Explanation:**

This shows **peak delay hours vs. quick response hours.**
Ideal for resource planning and peak-hour alertness.
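
The box plot can be complemented by a numeric summary per hour. The sketch below extracts the hour in the same way as the notebook and computes the median and call count for each hour; the timestamps and times are hypothetical.

```python
import pandas as pd

# Hypothetical calls: two at night, three in the late afternoon
calls = pd.DataFrame({
    'Received DtTm': pd.to_datetime([
        '2025-01-01 02:15:00', '2025-01-01 02:40:00',
        '2025-01-01 17:05:00', '2025-01-01 17:30:00', '2025-01-01 17:55:00',
    ]),
    'Total Response Time (min)': [5.0, 7.0, 9.0, 12.0, 15.0],
})
calls['Hour of Call'] = calls['Received DtTm'].dt.hour  # integer hour, 0 to 23

# Median response time and number of calls for each hour present in the data
hourly = calls.groupby('Hour of Call')['Total Response Time (min)'].agg(['median', 'count'])
print(hourly)
```

The median is used here instead of the mean because it is less sensitive to a few very long responses within an hour.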

---


## 🚑 **Step 8: Call Type Group Analysis**

### 👉 Operation: Compare response time across different emergency types.

In [None]:
plt.figure(figsize=(14, 7))
sns.boxplot(x='Call Type Group', y='Total Response Time (min)', hue='Call Type Group', data=df_clean, palette='Set2', legend=False)
plt.title('Response Time by Emergency Type', fontsize=16, color='purple')
plt.xlabel('Emergency Type', fontsize=14)
plt.ylabel('Total Response Time (minutes)', fontsize=14)
plt.xticks(rotation=45)
plt.grid(True, linestyle='--', alpha=0.5)
plt.show()

### ✅ **Explanation:**

This reveals **which emergency types get prioritized** and which ones are delayed.
Example: Life-threatening calls should ideally have the fastest response.

---

## ⚡ **Step 9: Response Time by Final Priority**

### 👉 Operation: Analyze response time based on emergency priority levels.

In [None]:
plt.figure(figsize=(10, 6))
sns.boxplot(x='Final Priority', y='Total Response Time (min)', hue='Final Priority', data=df_clean, palette='viridis', legend=False)
plt.title('⚡ Response Time by Final Priority', fontsize=16, color='darkorange')
plt.xlabel('Final Priority', fontsize=14)
plt.ylabel('Total Response Time (minutes)', fontsize=14)
plt.grid(True, linestyle='--', alpha=0.5)
plt.show()

### ✅ **Explanation:**

This shows **how emergency priority levels affect response times.**
Higher priority emergencies should ideally receive faster responses.

---

## 🚑 **Step 10: ALS Unit vs Non-ALS Unit Response Time**

### 👉 Operation: Compare response times between Advanced Life Support and regular units.

In [None]:
plt.figure(figsize=(10, 6))
sns.boxplot(x='ALS Unit', y='Total Response Time (min)', hue='ALS Unit', data=df_clean, palette='cool', legend=False)
plt.title('ALS Unit vs Non-ALS Unit Response Time', fontsize=16, color='darkblue')
plt.xlabel('ALS Unit (Advanced Life Support)', fontsize=14)
plt.ylabel('Total Response Time (minutes)', fontsize=14)
plt.grid(True, linestyle='--', alpha=0.5)
plt.show()

### ✅ **Explanation:**

This compares **Advanced Life Support (ALS) units vs regular units** to see if specialized units respond faster.

---

## 🚒 **Step 11: Unit Type-wise Response Time**

### 👉 Operation: Analyze response times across different types of emergency units.

In [None]:
plt.figure(figsize=(14, 7))
sns.boxplot(x='Unit Type', y='Total Response Time (min)', hue='Unit Type', data=df_clean, palette='Set3', legend=False)
plt.title('Response Time by Unit Type', fontsize=16, color='darkgreen')
plt.xlabel('Unit Type', fontsize=14)
plt.ylabel('Total Response Time (minutes)', fontsize=14)
plt.xticks(rotation=45)
plt.grid(True, linestyle='--', alpha=0.5)
plt.show()

### ✅ **Explanation:**

This shows **which unit types are most efficient** in responding to emergencies.
Different unit types may have varying response capabilities.

---

## 📍 **Step 12: Response Time by Zipcode of Incident**

### 👉 Operation: Analyze average response times across different zip codes.

In [None]:
plt.figure(figsize=(14, 8))
sns.barplot(x='Zipcode of Incident', y='Total Response Time (min)', hue='Zipcode of Incident', data=df_clean, errorbar=None, palette='YlOrRd', legend=False)
plt.title('Average Response Time by Zipcode', fontsize=16, color='darkred')
plt.xlabel('Zipcode', fontsize=14)
plt.ylabel('Average Total Response Time (minutes)', fontsize=14)
plt.xticks(rotation=45)
plt.grid(True, linestyle='--', alpha=0.5)
plt.show()

### ✅ **Explanation:**

This reveals **geographic patterns in emergency response times** across different zip codes.
Useful for identifying areas that may need additional emergency resources.

---

## 🔗 **Step 13: Correlation Matrix of Response Time Metrics**

### 👉 Operation: Analyze correlation between response time metrics.


The heatmap and table below show how strongly each response time metric is linearly related to the others. Values close to 1 or -1 indicate a strong relationship and values near 0 a weak one, which helps identify which delays most affect total response time.
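
One caveat when reading these values: by the Step 2 definitions, Total Response Time is the sum of Dispatch Delay and Travel Time, so some correlation with the total is built in. The sketch below illustrates this with synthetic, independently generated components (the distributions and scales are assumptions, not estimates from the dataset).

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Hypothetical, independently generated components in minutes
metrics = pd.DataFrame({
    'Dispatch Delay (min)': rng.exponential(scale=2.0, size=1000),
    'Travel Time (min)': rng.exponential(scale=6.0, size=1000),
})
# The total is defined as the sum of the two components, as in Step 2
metrics['Total Response Time (min)'] = (
    metrics['Dispatch Delay (min)'] + metrics['Travel Time (min)']
)

corr = metrics.corr()  # Pearson correlation by default
print(corr.round(2))
```

Even though the two components are unrelated here, both correlate positively with the total, and the component with the larger spread correlates more strongly. A high correlation with the total therefore reflects how much a component varies, not necessarily a causal effect.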

In [None]:
# Create a correlation matrix for relevant numerical columns
correlation_matrix = df_clean[['Dispatch Delay (min)', 'Travel Time (min)', 'Total Response Time (min)']].corr()

# Plot the heatmap for the correlation matrix
plt.figure(figsize=(8, 6))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt=".2f", square=True)
plt.title('Correlation Matrix of Response Time Metrics', fontsize=16, color='darkblue')
plt.yticks(rotation=45)
plt.xticks(rotation=45)
plt.show()

# Display correlation matrix values
correlation_matrix


## 🎯 **Project Conclusion**

### 📊 **Key Findings:**

✅ **Dataset Analysis Complete**: Loaded 9,999 emergency response records and analyzed those with valid response times  
✅ **Time Metrics Calculated**: Dispatch delay, travel time, and total response time  
✅ **Geographic Insights**: Identified neighborhood and zipcode response patterns  
✅ **Temporal Analysis**: Discovered peak delay hours vs. quick response periods  
✅ **Priority Assessment**: Analyzed response efficiency by emergency priority levels  
✅ **Unit Performance**: Compared response times of ALS and regular units  

---