# Notebook 01: Data Exploration

In this notebook, we explore the synthetic dataset generated for the Abu Dhabi Emergency Ambulance Coverage Optimization project. Understanding the geographic and demographic layout of the emirate is the first step toward optimizing emergency response.

## 1. Setup

In [None]:
import os
import geopandas as gpd
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

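# Project root is the parent of the current working directory,
# so this assumes the notebook is run from a direct subfolder of the root.
PROJECT_ROOT = os.path.dirname(os.getcwd())
DATA_DIR = os.path.join(PROJECT_ROOT, "data", "synthetic")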

plt.style.use('ggplot')
%matplotlib inline

## 2. Load Zone Data

In [None]:
zones_gdf = gpd.read_file(os.path.join(DATA_DIR, "zones.geojson"))
print(f"Loaded {len(zones_gdf)} zones.")
zones_gdf.head()
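
The later cells rely on the `population` and `zone_type` columns and on the geometry column, so it can help to confirm they are present right after loading. The helper below is a minimal sketch of such a check; the function name and the required-column tuple are illustrative and not part of the project code.

```python
# Columns the rest of this notebook reads from the zone layer.
REQUIRED_ZONE_COLUMNS = ("population", "zone_type", "geometry")


def missing_columns(columns, required=REQUIRED_ZONE_COLUMNS):
    """Return the required column names that are absent from `columns`.

    `columns` can be any iterable of names, e.g. a DataFrame's `.columns`.
    The result preserves the order of `required`.
    """
    present = set(columns)
    return [name for name in required if name not in present]
```

In this notebook it could be called as `missing_columns(zones_gdf.columns)`; an empty list means the later cells have the columns they need.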

## 3. Visualize Population Distribution

Abu Dhabi's population is very unevenly distributed: a dense urban core (the island) is surrounded by large, sparsely populated peripheral zones.

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(12, 10))
zones_gdf.plot(column='population', ax=ax, legend=True, cmap='viridis', 
               legend_kwds={'label': "Estimated Population"})
ax.set_title("Population Distribution across Abu Dhabi Zones")
ax.axis('off')
plt.show()
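
A map of raw population counts can understate the contrast between the urban core and the periphery, because the peripheral zones are much larger. One way to make the contrast explicit is to compute population per square kilometre. The sketch below assumes the zone layer has a CRS set (GeoJSON is normally read as EPSG:4326) and uses GeoPandas' `estimate_utm_crs()` to pick a metric projection. The column names `area_km2` and `pop_density_km2` and the two demo squares are made up for illustration and are not project data.

```python
import geopandas as gpd
from shapely.geometry import box


def add_population_density(zones: gpd.GeoDataFrame) -> gpd.GeoDataFrame:
    """Return a copy of `zones` with `area_km2` and `pop_density_km2` columns."""
    # Areas measured in degrees are not meaningful, so reproject to a
    # metric CRS first. A single UTM zone is only an approximation for a
    # region as wide as the emirate.
    metric = zones.to_crs(zones.estimate_utm_crs())
    out = zones.copy()
    out["area_km2"] = metric.geometry.area / 1e6  # square metres -> km^2
    out["pop_density_km2"] = out["population"] / out["area_km2"]
    return out


# Illustrative input only: two equal 0.1-degree squares with different populations.
demo = gpd.GeoDataFrame(
    {"population": [50_000, 500]},
    geometry=[box(54.3, 24.4, 54.4, 24.5), box(54.4, 24.4, 54.5, 24.5)],
    crs="EPSG:4326",
)
demo_density = add_population_density(demo)
```

Applied to the real layer, `add_population_density(zones_gdf)` returns a frame that could be plotted with `column='pop_density_km2'` in place of `column='population'`.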

## 4. Analyze Zone Types

We categorized zones into types to better model travel speeds and service requirements.

In [None]:
plt.figure(figsize=(10, 6))
sns.countplot(data=zones_gdf, x='zone_type')
plt.title("Number of Zones by Type")
plt.show()

# Sum the population within each zone type and print it with thousands separators
pop_by_type = zones_gdf.groupby('zone_type')['population'].sum()
print("Total Population by Zone Type:")
print(pop_by_type.apply(lambda x: f"{x:,.0f}"))
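
Absolute totals are easier to compare across zone types when expressed as shares of the whole. The following is a small sketch of that conversion; the function name is illustrative, and it expects a Series shaped like `pop_by_type` (zone type as index, total population as values).

```python
import pandas as pd


def population_share(pop_by_type: pd.Series) -> pd.Series:
    """Convert per-type population totals into percentage shares.

    Returns percentages rounded to one decimal, largest share first.
    """
    total = pop_by_type.sum()
    if total == 0:
        raise ValueError("total population is zero; shares are undefined")
    return (100 * pop_by_type / total).round(1).sort_values(ascending=False)
```

In this notebook it could be used as `print(population_share(pop_by_type))`.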

## 5. Demand Nodes & Baseline Stations

We also generated 235 demand nodes (5 per zone) and identified 12 baseline stations based on current health authority maps. Notice the concentration of baseline stations in the urban core.

In [None]:
demand_gdf = gpd.read_file(os.path.join(DATA_DIR, "demand_nodes.geojson"))
existing_gdf = gpd.read_file(os.path.join(DATA_DIR, "existing_stations.geojson"))

fig, ax = plt.subplots(1, 1, figsize=(12, 10))
zones_gdf.plot(ax=ax, color='lightgray', alpha=0.5)
demand_gdf.plot(ax=ax, color='red', markersize=1, alpha=0.3, label='Demand Nodes')
existing_gdf.plot(ax=ax, color='blue', markersize=30, marker='^', label='Baseline Stations')
ax.set_title("Demand Nodes and Existing Station Infrastructure")
ax.legend()
ax.axis('off')
plt.show()
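
The figures quoted above (235 demand nodes at 5 per zone) imply 47 zones, which can be checked against the loaded layers. The sketch below counts nodes per zone and reports any zone whose count differs from the expected value. It assumes the demand nodes carry a zone identifier column. The name `zone_id` used in the usage note is hypothetical and should be replaced with whatever the dataset actually provides.

```python
from collections import Counter


def nodes_per_zone_mismatches(zone_ids, expected=5):
    """Return {zone_id: count} for zones whose node count is not `expected`.

    `zone_ids` is an iterable with one zone identifier per demand node.
    Zones that have no nodes at all never appear in `zone_ids`, so they
    cannot be detected here; compare the number of distinct ids with the
    number of zones separately.
    """
    counts = Counter(zone_ids)
    return {zone: n for zone, n in counts.items() if n != expected}
```

A possible call is `nodes_per_zone_mismatches(demand_gdf["zone_id"])`, combined with a simple check that `len(demand_gdf) == 5 * len(zones_gdf)`; an empty dictionary means every zone that appears has exactly five nodes.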