# 🏥 Hospital Patient Flow Simulation at IMH

## 📄 Overview
This document generates **synthetic hospital patient flow data** for a single day at the Institute of Mental Health (IMH), simulating realistic patient demographics and arrival patterns.

---

## 📅 Simulation Details

- **Duration**: 1 working day (8:00 AM to 5:00 PM)
- **Number of Patients**: 100

---

## ⏱️ Patient Arrival Times

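- **Distribution**: Approximates a **non-homogeneous Poisson process** using per-minute Poisson counts with a time-varying rate
- **Pattern**:
  - Lower arrival rate in the morning (8:00 AM to 11:00 AM)
  - Peak arrival activity around midday (11:00 AM to 2:00 PM)
  - Lower arrival rate in the late afternoon (2:00 PM to 5:00 PM)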
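
A time-varying arrival rate like the one described above can also be sampled by thinning, which is one standard way to simulate a non-homogeneous Poisson process. The sketch below is a minimal illustration, not the method used in this notebook's code cell. The names `rate_per_minute` and `sample_nhpp` and the rate values are hypothetical.

```python
import numpy as np

def rate_per_minute(t):
    """Hypothetical piecewise-constant arrival rate (patients per minute)."""
    if t < 180:      # first three hours of the day
        return 0.1
    if t < 360:      # middle three hours (peak)
        return 0.4
    return 0.15      # last three hours

def sample_nhpp(rate_fn, horizon, rate_max, rng):
    """Sample arrival times on [0, horizon) by thinning a homogeneous process."""
    arrivals = []
    t = 0.0
    while True:
        # Candidate gaps come from a homogeneous process with rate rate_max
        t += rng.exponential(1.0 / rate_max)
        if t >= horizon:
            break
        # Keep the candidate with probability rate(t) / rate_max
        if rng.uniform() < rate_fn(t) / rate_max:
            arrivals.append(t)
    return np.array(arrivals)

rng = np.random.default_rng(42)
arrivals = sample_nhpp(rate_per_minute, horizon=540, rate_max=0.4, rng=rng)
```

With thinning, the number of arrivals is itself random (about 117 on average for these assumed rates), so it would not be fixed at exactly 100 patients.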

---

## 🧍 Patient Demographics

- **Age Range**: Random integers between 12 and 90
- **Gender Distribution**: 50% Male, 50% Female (random assignment)

---

## 🧠 Health Conditions

Each patient is randomly assigned one of the following mental health conditions according to the estimated population prevalence:

| Condition                               | Estimated % |
|-----------------------------------------|-------------|
| Depression                              | 25%         |
| Anxiety Disorders (incl. GAD)           | 20%         |
| Obsessive-Compulsive Disorder (OCD)     | 12%         |
| Schizophrenia                           | 10%         |
| Bipolar Disorder                        | 8%          |
| Alcohol Use Disorders                   | 7%          |
| Dementia                                | 6%          |
| Post-Traumatic Stress Disorder (PTSD)   | 5%          |
| Personality Disorders                   | 4%          |
| ADHD                                    | 3%          |

---

## 🛡️ Insurance Providers

Patients are assigned a **uniform random insurance provider** from the following top 5:

- AIA HealthShield
- Prudential
- Great Eastern
- NTUC Income
- HSBC

---

In [24]:
import pandas as pd
import numpy as np
import plotly.express as px
from datetime import datetime, timedelta, time

# Set seed for reproducibility
np.random.seed(42)

# Number of patients
n_patients = 100

# Generate patient IDs
patient_ids = [f"P{str(i+1).zfill(3)}" for i in range(n_patients)]

# Generate ages (uniform between 12 and 90)
ages = np.random.randint(12, 91, size=n_patients)

# Generate gender (50/50 split)
genders = np.random.choice(['Male', 'Female'], size=n_patients)

# Health conditions with specific probabilities
conditions = [
    ('Depression', 0.25),
    ('Anxiety Disorders', 0.20),
    ('OCD', 0.12),
    ('Schizophrenia', 0.10),
    ('Bipolar Disorder', 0.08),
    ('Alcohol Use Disorders', 0.07),
    ('Dementia', 0.06),
    ('PTSD', 0.05),
    ('Personality Disorders', 0.04),
    ('ADHD', 0.03)
]

condition_names, condition_probs = zip(*conditions)
health_conditions = np.random.choice(condition_names, size=n_patients, p=condition_probs)

# Insurance providers (uniform probability)
insurance_options = ['AIA', 'Prudential', 'Great Eastern', 'NTUC Income', 'HSBC']
insurances = np.random.choice(insurance_options, size=n_patients)

# Approximate a non-homogeneous Poisson process with per-minute Poisson counts
base_datetime = datetime.strptime("08:00", "%H:%M")
end_datetime = base_datetime + timedelta(hours=9)
total_minutes = int((end_datetime - base_datetime).total_seconds() / 60)  # 540 minutes

# Per-minute arrival counts for three consecutive 180-minute blocks
time_profile = np.concatenate([
    np.random.poisson(lam=0.3, size=180),  # 08:00-11:00, low rate
    np.random.poisson(lam=1.2, size=180),  # 11:00-14:00, peak rate
    np.random.poisson(lam=0.5, size=180)   # 14:00-17:00, lower rate
])

# Choose n_patients distinct minutes that had at least one arrival
# (np.random.choice raises if fewer than n_patients such minutes exist)
arrival_indices = np.where(time_profile > 0)[0]
arrival_minutes = np.sort(np.random.choice(arrival_indices, size=n_patients, replace=False))
arrival_times = [(base_datetime + timedelta(minutes=int(i))).time() for i in arrival_minutes]

## 🧾 Check-In Desk Simulation

This section models the **Check-In Desk process** as part of the hospital patient flow simulation.

---

### 🧑‍💼 Check-In Desk Setup

- There is **1 check-in desk** available.
- Patients are processed **on a first-come, first-served** basis.
- Check-in starts:
  - **Immediately upon patient arrival**, if the desk is free.
  - **After the current patient**, if the desk is occupied.
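
The first-come, first-served rule above reduces to a simple recurrence: each check-in starts at the later of the patient's arrival and the previous patient's check-in end. A minimal sketch follows. The function name `fcfs_single_server` and the example numbers are hypothetical.

```python
def fcfs_single_server(arrivals, durations):
    """Return (start, end, wait) lists for one desk serving in arrival order."""
    starts, ends, waits = [], [], []
    desk_free_at = 0
    for arrival, duration in zip(arrivals, durations):
        start = max(arrival, desk_free_at)   # wait only if the desk is busy
        end = start + duration
        starts.append(start)
        ends.append(end)
        waits.append(start - arrival)
        desk_free_at = end
    return starts, ends, waits

# Hypothetical arrivals (minutes since midnight) and check-in durations (minutes)
starts, ends, waits = fcfs_single_server([480, 481, 482, 490], [3, 2, 1, 2])
print(waits)  # [0, 2, 3, 0]
```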

---

### ⏱️ Check-In Duration

- The **duration of check-in** is sampled from an **exponential distribution** (mean = 2 minutes).
- The value is **clipped** between:
  - **Minimum**: 1 minute  
  - **Maximum**: 3 minutes
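
Clipping changes the shape of the distribution noticeably. For an exponential with mean 2, about 39% of raw draws fall at or below 1 minute (1 − e^(−0.5)) and about 22% at or above 3 minutes (e^(−1.5)). The sketch below checks this empirically. It uses NumPy's `default_rng` generator rather than the notebook's global seed, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Exponential with mean 2 minutes, clipped to [1, 3] and rounded to whole minutes
raw = rng.exponential(scale=2.0, size=10_000)
clipped = np.clip(raw, 1, 3)
durations = np.rint(clipped).astype(int)

# Share of raw draws that end up on each bound
share_at_min = np.mean(raw <= 1)
share_at_max = np.mean(raw >= 3)
print(f"at minimum: {share_at_min:.2f}, at maximum: {share_at_max:.2f}")
```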

---

### 📊 Tracked Metrics per Patient

For every patient:
- **Check-in Start Time**
- **Check-in End Time**
- **Check-in Duration** (1 to 3 minutes)
- **Wait Time** (if the desk is busy)
- **Queue Size** upon arrival

---

In [25]:
# Simulate check-in desk process
checkin_start_times = []
checkin_end_times = []
checkin_wait_times = []
checkin_durations = []
patients_waiting_list = []

ongoing_checkins = []

for i, arrival_minute in enumerate(arrival_minutes):
    arrival_time_minutes = arrival_minute + 8 * 60 # convert to minutes since midnight

    # Drop check-ins that have already finished by the time this patient arrives
    ongoing_checkins = [end for end in ongoing_checkins if end > arrival_time_minutes]

    # Patients still at, or waiting for, the desk when this patient arrives
    queue_length = len(ongoing_checkins)

    # Determine check-in start time
    checkin_start_minutes = max(arrival_time_minutes, max(ongoing_checkins) if ongoing_checkins else arrival_time_minutes)

    # Sample check-in duration (1-3 min from exponential distribution)
    raw_duration = np.random.exponential(scale=2)
    duration = int(round(min(max(raw_duration, 1), 3)))

    # Determine end time and wait time
    checkin_end_minutes = checkin_start_minutes + duration
    wait_time = int(round(checkin_start_minutes - arrival_time_minutes))

    # Convert to time objects
    checkin_start = time(int(checkin_start_minutes // 60), int(checkin_start_minutes % 60))
    checkin_end = time(int(checkin_end_minutes // 60), int(checkin_end_minutes % 60))

    # Store values
    checkin_start_times.append(checkin_start)
    checkin_end_times.append(checkin_end)
    checkin_durations.append(duration)
    checkin_wait_times.append(wait_time)
    patients_waiting_list.append(queue_length)

    # Update the check-in schedule
    ongoing_checkins.append(checkin_end_minutes)

## 🔄 update_queue(): Queue Management Function

This utility function is used to **manage queues** in the hospital simulation. It ensures that only patients who are still **actively waiting** remain in the queue, and handles the addition of newly arriving patients.

---

### 🧠 Purpose

- Adds a **new patient** to the queue.
- **Removes** patients who have already been served (i.e., their wait has ended).
- Ensures the queue is always **up-to-date and consistent**.

---

### 📥 Inputs

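| Parameter        | Description                                                                     |
|------------------|---------------------------------------------------------------------------------|
| `queue`          | Check-in end times (minutes since midnight) of patients waiting for triage      |
| `queue_starts`   | Scheduled triage start times of those waiting patients                          |
| `end_minutes`    | Check-in end time of the newly arriving patient                                 |
| `start_minutes`  | Scheduled triage start time of the newly arriving patient                       |

---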

### 📤 Outputs

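| Returns          | Description                                                      |
|------------------|------------------------------------------------------------------|
| `queue`          | Updated check-in end times of the patients still waiting         |
| `queue_starts`   | Updated triage start times of the patients still waiting         |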

---

### 🧮 Step-by-Step Logic

1. **Initial Queue Check**:
   - If `queue` is empty, add the new patient.

2. **Iterate Over Queue**:
   - Loop through each waiting patient's triage start time in `queue_starts`.
   - If a patient's start time is later than `end_minutes`, that patient is still waiting → add the new patient and stop checking further.
   - Otherwise, mark the index to **delete**, as the patient has already moved on.

3. **Clean Up Queue**:
   - Remove all patients who have already been served using the list of marked deletion indices.

4. **Final Check**:
   - If all previous entries were removed and the queue is now empty, **add** the new patient to ensure no one is missed.

5. **Return Updated Queue**:
   - Return the `queue` and `queue_starts` with all updates applied.
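
The steps above amount to "drop everyone whose triage has already started, then append the new patient". A compact sketch of the same idea is shown below. It assumes `queue_starts` is in ascending order, and the name `update_queue_sketch` is hypothetical. Unlike the notebook's function, it returns new lists instead of modifying its inputs.

```python
def update_queue_sketch(queue, queue_starts, end_minutes, start_minutes):
    """Drop patients whose triage has started, then enqueue the new patient."""
    # Keep only entries whose triage start is still in the future
    still_waiting = [
        (end, start)
        for end, start in zip(queue, queue_starts)
        if start > end_minutes
    ]
    still_waiting.append((end_minutes, start_minutes))
    new_queue = [end for end, _ in still_waiting]
    new_starts = [start for _, start in still_waiting]
    return new_queue, new_starts

# Hypothetical state: two patients waiting, with triage starts at 500 and 504
queue, starts = update_queue_sketch([495, 497], [500, 504], 502, 507)
print(queue, starts)  # [497, 502] [504, 507]
```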

---


In [26]:
def update_queue(queue, queue_starts, end_minutes, start_minutes):
    """Drop patients whose triage has already started and enqueue the new patient.

    queue holds check-in end times and queue_starts the matching triage start
    times (minutes since midnight). Both lists are modified in place.
    """
    if not queue:
        queue.append(end_minutes)
        queue_starts.append(start_minutes)
    else:
        # Track indices to delete after the loop
        indices_to_delete = []

        for j, t in enumerate(queue_starts):
            if t > end_minutes:
                # Patient j is still waiting, so the new patient joins the queue
                queue.append(end_minutes)
                queue_starts.append(start_minutes)
                break
            else:
                # Patient j has already been assigned to a station and is no longer waiting
                indices_to_delete.append(j)

        # Delete from the end so that earlier indices do not shift
        for idx in reversed(indices_to_delete):
            del queue[idx]
            del queue_starts[idx]

        # If the queue was cleared by the logic above, the new patient still needs to be added
        if not queue:
            queue.append(end_minutes)
            queue_starts.append(start_minutes)

    return queue, queue_starts


## 🩺 Triage Station Simulation

This section models the **triage process** patients go through immediately after check-in. It simulates how patients are routed to one of **two triage stations**, how wait times and queues are determined, and how patients are classified into **Emergency** or **Routine** categories.

---

### 🧠 Triage Station Setup

- There are **2 triage stations** available: Station `A` and Station `B`.
- Patients proceed to triage **immediately after completing check-in**.
- Triage begins:
  - **Immediately**, if a station is free.
  - **After waiting**, if both stations are occupied.
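- Patients are classified as:
  - **Emergency (20%)**: Prioritized in the consultation and diagnostic test queues that follow triage.
  - **Routine (80%)**: Processed on a **First-In-First-Out (FIFO)** basis.
- In the triage code below, patients are served in check-in order regardless of classification.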
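
Routing a patient to one of two stations can be sketched as picking the station that frees up first and starting no earlier than the moment the patient is ready. This is a simplified illustration. The helper `assign_station` and the example times are hypothetical, and the notebook's own loop takes the first free station in dictionary order rather than the earliest-free one.

```python
def assign_station(station_free_at, ready_minutes):
    """Pick the station that frees up first and compute when triage can start."""
    station = min(station_free_at, key=station_free_at.get)
    start = max(ready_minutes, station_free_at[station])
    return station, start

# Hypothetical state: both stations busy, A until 08:12 and B until 08:10
station_free_at = {'A': 492, 'B': 490}
station, start = assign_station(station_free_at, ready_minutes=488)
print(station, start)  # B 490
```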

---

### ⏱️ Triage Duration

- The **duration of triage** is sampled from a **uniform distribution** between:
  - **Minimum**: 3 minutes  
  - **Maximum**: 5 minutes

---

### 📊 Tracked Metrics per Patient

For every patient:
- **Triage Start Time** (when triage begins)
- **Triage End Time** (when triage ends)
- **Triage Duration** (3 to 5 minutes)
- **Wait Time for Triage** (time spent waiting after check-in)
- **Number of Patients Ahead in Queue** (if any)
- **Assigned Triage Station** (`A` or `B`)
- **Triage Classification** (`Emergency` or `Routine`)

---


In [27]:
# Simulate triage process
triage_in_times = []
triage_out_times = []
triage_wait_times = []
triage_durations = []
triage_classification = []
triage_station_assignment = []
triage_waiting_list = []

triage_station_end_times = {'A': None, 'B': None}
triage_end_minutes_list = []

triage_queue = []  # Keep track of patients waiting for triage
triage_queue_starts = [] # Keep track of start times of patients

for i, checkin_end in enumerate(checkin_end_times):
    checkin_end_minutes = checkin_end.hour * 60 + checkin_end.minute
    is_emergency = np.random.rand() < 0.20
    triage_classification.append('Emergency' if is_emergency else 'Routine')

    # Look for a station that is already free when this patient's check-in ends
    available_station = None
    for station, end_time in triage_station_end_times.items():
        if end_time is None or end_time <= checkin_end_minutes:
            # A free station means nobody is waiting, so the queue is reset
            triage_queue = []
            triage_queue_starts = []
            available_station = station
            triage_start_minutes = max(checkin_end_minutes, triage_station_end_times[available_station] or 0)
            break

    # Station that frees up first (a station that was never used sorts last)
    soonest_station = min(triage_station_end_times.items(), key=lambda x: float('inf') if x[1] is None else x[1])

    if available_station is None:
        # Both stations are busy: wait for the one that frees up first
        available_station = soonest_station[0]
        triage_start_minutes = max(checkin_end_minutes, triage_station_end_times[available_station] or 0)

        # Refresh the waiting list and add the current patient to it
        triage_queue, triage_queue_starts = update_queue(triage_queue, triage_queue_starts, checkin_end_minutes, triage_start_minutes)

    waiting_count = len(triage_queue)
    triage_waiting_list.append(waiting_count)

    triage_duration = int(round(np.random.uniform(3, 5)))
    triage_end_minutes = triage_start_minutes + triage_duration
    triage_wait = triage_start_minutes - checkin_end_minutes

    triage_in = time(triage_start_minutes // 60, triage_start_minutes % 60)
    triage_out = time(triage_end_minutes // 60, triage_end_minutes % 60)

    triage_in_times.append(triage_in)
    triage_out_times.append(triage_out)
    triage_durations.append(triage_duration)
    triage_wait_times.append(triage_wait)
    triage_station_assignment.append(available_station)

    triage_station_end_times[available_station] = triage_end_minutes
    triage_end_minutes_list.append(triage_end_minutes)




### 🧮 `pop_next_patient()` Function Overview

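This function selects the **next patient** from the **emergency or routine queue** based on:

- **Priority**: The emergency patient at the front of the queue is served first whenever they have already arrived by the time the next doctor (or test station) is free. A routine patient goes first only if they are ready earlier and the emergency patient has not yet arrived.
- **FIFO Rule**: Within each category, patients are served in the order they arrived.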

#### 🔁 Returns
- The selected patient's index and arrival time.
- The updated emergency and routine queues after removing the selected patient.
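
As a simplified illustration of the priority and FIFO rules, the sketch below applies strict priority: any waiting emergency patient is popped before any routine patient. It leaves out the arrival-time check against the next free doctor that the notebook's function performs. The name `pop_next` and the queue contents are hypothetical.

```python
from collections import deque

def pop_next(emergency_queue, routine_queue):
    """Strict priority: any waiting emergency patient goes before routine ones."""
    if emergency_queue:
        return emergency_queue.popleft()
    if routine_queue:
        return routine_queue.popleft()
    return None

# Hypothetical (patient_index, arrival_minute) entries
emergency = deque([(3, 505)])
routine = deque([(1, 500), (2, 502)])
order = []
while emergency or routine:
    order.append(pop_next(emergency, routine)[0])
print(order)  # [3, 1, 2]
```

A `deque` is used here because `popleft()` runs in constant time, whereas `list.pop(0)` shifts every remaining element.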


In [28]:
def pop_next_patient(emergency_queue, routine_queue, doctor_end_times, next_doctor):
    """Return next patient from appropriate queue based on emergency priority and FIFO, along with updated queues."""
    if emergency_queue and routine_queue:
        emergency_pid, emergency_arrival = emergency_queue[0]
        routine_pid, routine_arrival = routine_queue[0]

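        if emergency_pid < routine_pid:
            selected_queue = 'emergency'
        else:
            next_doctor_time = doctor_end_times[next_doctor]
            if next_doctor_time >= emergency_arrival:
                # Emergency patient is already waiting when the doctor frees up
                selected_queue = 'emergency'
            elif next_doctor_time >= routine_arrival:
                # Only the routine patient has arrived by then
                selected_queue = 'routine'
            else:
                # Doctor is free before either patient arrives: serve the earlier arrival
                selected_queue = 'emergency' if emergency_arrival <= routine_arrival else 'routine'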

        if selected_queue == 'emergency':
            patient_idx, patient_arrival = emergency_queue.pop(0)
        else:
            patient_idx, patient_arrival = routine_queue.pop(0)

    elif emergency_queue:
        patient_idx, patient_arrival = emergency_queue.pop(0)

    elif routine_queue:
        patient_idx, patient_arrival = routine_queue.pop(0)

    else:
        patient_idx = patient_arrival = None  # fallback if both queues are empty

    return patient_idx, patient_arrival, emergency_queue, routine_queue


## 🩻 Doctor Consultation Simulation

This section models the **doctor consultation process** that patients enter after completing triage. It includes logic for doctor assignment, priority handling for Emergency patients, FIFO ordering for Routine patients, and tracking of wait times, consultation times, and the emergency and routine queue lengths.

---

### 🧠 Consultation Setup

- There are **5 doctors** available: `A`, `B`, `C`, `D`, and `E`.
- All doctors have the specialty **Psychiatry**.
- Patients proceed to consultation **immediately after triage**, depending on doctor availability.
- Patients are prioritized based on triage classification:
  - **Emergency** patients are placed in a **priority queue**.
  - **Routine** patients are placed in a **First-In-First-Out (FIFO)** queue.

---

### 🧬 Consultation Logic

- If **one or more doctors are available**, a patient is assigned to the doctor with the **earliest available time**.
- If **all doctors are busy**, the patient is assigned to the **first doctor who becomes available**.
- Patients in the **Emergency queue** are always assigned first.
  - Among emergencies, the one with the **earliest triage end time** is selected.
- Routine patients are selected based on **earliest triage out time** (FIFO order).
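
The "earliest available doctor" rule can be sketched with a min-heap keyed on each doctor's next free time. This minimal version ignores the emergency/routine split and serves patients strictly in arrival order. The function name `assign_doctors` and the example values are hypothetical.

```python
import heapq

def assign_doctors(arrivals, durations, doctors):
    """Assign each patient, in arrival order, to the doctor who is free earliest."""
    # Heap of (free_at_minute, doctor_name); every doctor starts free at minute 0
    free_heap = [(0, name) for name in doctors]
    heapq.heapify(free_heap)
    schedule = []
    for arrival, duration in zip(arrivals, durations):
        free_at, name = heapq.heappop(free_heap)
        start = max(arrival, free_at)
        end = start + duration
        schedule.append((name, start, end))
        heapq.heappush(free_heap, (end, name))
    return schedule

# Hypothetical example with two doctors and three patients
schedule = assign_doctors([500, 501, 503], [17, 15, 20], ['A', 'B'])
print(schedule)  # [('A', 500, 517), ('B', 501, 516), ('B', 516, 536)]
```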

---

### ⏱️ Consultation Duration

- The **duration of consultation** is sampled from a **normal distribution** with:
  - **Mean**: 17.5 minutes  
  - **Standard Deviation**: 1.5 minutes  
- Duration values are **clipped** between:
  - **Minimum**: 15 minutes  
  - **Maximum**: 20 minutes
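
With a mean of 17.5 and a standard deviation of 1.5, the bounds of 15 and 20 sit about 1.67 standard deviations from the mean, so roughly one draw in ten is moved onto a bound by clipping. The sketch below computes that share from the normal CDF using only the standard library. The helper name `normal_cdf` is hypothetical.

```python
import math

def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

mean, std, low, high = 17.5, 1.5, 15.0, 20.0

# Probability that a raw draw falls outside [low, high] and is moved to a bound
share_clipped = normal_cdf(low, mean, std) + (1.0 - normal_cdf(high, mean, std))
print(f"{share_clipped:.3f}")
```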

---

### 📊 Tracked Metrics per Patient

For every patient:
- **Consultation Start Time** (`consultation_in_time`)
- **Consultation End Time** (`consultation_out_time`)
- **Total Consultation Time** (15 to 20 minutes)
- **Waiting Time** before consultation (difference between triage out and consultation start)
- **Assigned Doctor** (`A` to `E`)
- **Doctor Specialty** (always "Psychiatry")
- **Number of Patients Ahead in Emergency and Routine Queue**:
  - If a doctor is available immediately, this value is **zero**.
  - Otherwise, it reflects the number of emergency or routine patients waiting before this one (excluding those already in consultation).

---

In [29]:
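# Doctor Consultation according to Emergency or Routine Queue

# Integer patient indices used to index the tracking lists below.
# Note: this rebinds `patient_ids`, replacing the "P001"-style strings from the first cell.
patient_ids = np.arange(n_patients)
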

# Initialize tracking lists
consultation_in_times = [None] * n_patients
consultation_out_times = [None] * n_patients
consultation_durations = [None] * n_patients
consultation_wait_times = [None] * n_patients
consultation_doctor = [None] * n_patients
consultation_specialty = ['Psychiatry'] * n_patients
routine_waiting_list = [None] * n_patients
emergency_waiting_list = [None] * n_patients
consultation_end_minutes_list = [None] * n_patients

doctor_end_times = {doc: 0 for doc in ['A', 'B', 'C', 'D', 'E']}
doctor_busy_slots = {doc: [] for doc in doctor_end_times}

# Queues
emergency_queue = []
routine_queue = []

# Simulation
for i in patient_ids:
    arrival_time = triage_end_minutes_list[i]
    category = triage_classification[i]

    # Add patient to appropriate queue
    if category == 'Emergency':
        emergency_queue.append((i, arrival_time))
    else:
        routine_queue.append((i, arrival_time))

    # Step 1: While there are idle doctors at this patient's arrival time, assign from queues
    while True:
        idle_doctors = {doc: end_time for doc, end_time in doctor_end_times.items() if end_time <= arrival_time}
        # Break if all doctors are busy or both queues are empty
        if not idle_doctors or not (emergency_queue or routine_queue):
            routine_waiting_list[i] = len(routine_queue)
            emergency_waiting_list[i] = len(emergency_queue)
            break

        # Assign the next available doctor
        next_doctor = min(idle_doctors, key=idle_doctors.get)

        patient_idx, patient_arrival, emergency_queue, routine_queue = pop_next_patient(emergency_queue, routine_queue, doctor_end_times, next_doctor)


        consultation_start = max(patient_arrival, doctor_end_times[next_doctor])
        duration = int(round(np.clip(np.random.normal(loc=17.5, scale=1.5), 15, 20)))
        consultation_end = consultation_start + duration

        doctor_end_times[next_doctor] = consultation_end
        doctor_busy_slots[next_doctor].append((consultation_start, consultation_end))

        consultation_in_times[patient_idx] = time(consultation_start // 60, consultation_start % 60)
        consultation_out_times[patient_idx] = time(min(23, consultation_end // 60), min(59, consultation_end % 60))
        consultation_durations[patient_idx] = duration
        consultation_wait_times[patient_idx] = consultation_start - patient_arrival
        consultation_doctor[patient_idx] = next_doctor
        consultation_end_minutes_list[patient_idx] = consultation_end

# After processing all patients, assign remaining in queues when doctors are free
while emergency_queue or routine_queue:

    next_available_time = min(doctor_end_times.values())
    idle_doctors = {doc: end_time for doc, end_time in doctor_end_times.items() if end_time == next_available_time}
    next_doctor = min(idle_doctors, key=idle_doctors.get)

    patient_idx, patient_arrival, emergency_queue, routine_queue = pop_next_patient(emergency_queue, routine_queue, doctor_end_times, next_doctor)

    consultation_start = max(patient_arrival, doctor_end_times[next_doctor])
    duration = int(round(np.clip(np.random.normal(loc=17.5, scale=1.5), 15, 20)))
    consultation_end = consultation_start + duration

    doctor_end_times[next_doctor] = consultation_end
    doctor_busy_slots[next_doctor].append((consultation_start, consultation_end))

    consultation_in_times[patient_idx] = time(consultation_start // 60, consultation_start % 60)
    consultation_out_times[patient_idx] = time(min(23, consultation_end // 60), min(59, consultation_end % 60))
    consultation_durations[patient_idx] = duration
    consultation_wait_times[patient_idx] = consultation_start - patient_arrival
    consultation_doctor[patient_idx] = next_doctor
    consultation_end_minutes_list[patient_idx] = consultation_end


## 🧪 Diagnostic Test Assignment

This section simulates the **assignment of patients to diagnostic tests** following their doctor consultation.

---

### 🎯 Assignment Logic

- After consultation, patients may be referred to one of the following tests:
  - **Blood Test** (20% probability)
  - **Neuroimaging** (10% probability)
  - **EEG** (10% probability)
- The remaining **60% of patients are not assigned** any diagnostic test.
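
The assignment probabilities above can be checked with a quick draw. The sketch below samples 1,000 outcomes and tabulates their observed shares. It uses NumPy's `default_rng` generator and a hypothetical `'No Test'` label in place of `None`, so it is an illustration and not the notebook's own cell.

```python
import numpy as np

rng = np.random.default_rng(42)

# Outcome labels and probabilities from the list above; 'No Test' is a
# hypothetical placeholder label for patients without a referral
outcomes = ['Blood Test', 'Neuroimaging', 'EEG', 'No Test']
probs = [0.20, 0.10, 0.10, 0.60]
assert abs(sum(probs) - 1.0) < 1e-9

assignments = rng.choice(outcomes, size=1_000, p=probs)
labels, counts = np.unique(assignments, return_counts=True)
shares = dict(zip(labels.tolist(), (counts / counts.sum()).tolist()))
print(shares)
```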

---

### 📊 Data Tracked

For each patient:
- **Patient ID**
- **Consultation End Time** (used to determine test start time)
- **Assigned Test Type** (or `None`)
- **Triage Category** (Emergency or Routine)

---


In [30]:
# Assign each patient a diagnostic test (or None) with fixed probabilities
np.random.seed(42)
test_types = ['Blood Test', 'Neuroimaging', 'EEG']
test_probs = [0.20, 0.10, 0.10]
test_assignments = np.random.choice(
    test_types + [None], size=n_patients,
    p=test_probs + [1 - sum(test_probs)]
)

tests = pd.DataFrame({
    'patient_id': patient_ids,
    'consultation_end_min': consultation_end_minutes_list,
    'test_type': test_assignments,
    'category': triage_classification
})


### ⚙️ `process_test()` Function Overview

This function simulates the **execution of a diagnostic test** (Blood, Neuroimaging, or EEG) for a patient at a given station.

#### 🔁 Workflow Summary
- **Start Time**: The later of the patient's arrival and the time the station becomes free.
- **Duration**: Sampled from a **clipped normal distribution** with the given mean and standard deviation, then rounded to whole minutes.
- **Result Days**: A random **integer** between `min_days` and `max_days` (inclusive), representing the number of days until results are available.
- **End Time**: Start time plus duration.

#### 📌 Updates Performed
- Records the **start and end times** of the test.
- Records the **test duration, patient wait time, and result days**.
- Updates the **station's availability** (`test_end_times`).
- Tracks:
  - `test_in_times`, `test_out_times`
  - `test_durations`, `test_wait_times`
  - `test_station_assigned`, `test_result_days`

#### ✅ Returns
- Updated lists and dictionaries with the new test details for the given patient.

This function ensures consistent tracking and timing for diagnostic stations handling tests.


In [31]:
def process_test(
    patient_idx,
    patient_arrival,
    next_station,
    test_end_times,
    diagnostic_tests,
    test_in_times,
    test_out_times,
    test_durations,
    test_wait_times,
    test_station_assigned,
    test_result_days,
    mean,
    std,
    min_duration,
    max_duration,
    min_days,
    max_days
):
    # Determine test start time based on availability
    test_start = max(patient_arrival, test_end_times[next_station])

    # Sample test duration from a clipped normal distribution
    duration = int(round(np.clip(np.random.normal(mean, std), min_duration, max_duration)))

    # Sample result days from a uniform integer distribution
    result_days = np.random.randint(min_days, max_days + 1)

    # Calculate test end time
    test_end = test_start + duration

    # Update end time for the station
    test_end_times[next_station] = test_end

    # Get patient ID
    pid = diagnostic_tests.iloc[patient_idx]['patient_id']

    # Store results
    test_in_times[pid] = time(test_start // 60, test_start % 60)
    test_out_times[pid] = time(min(23, test_end // 60), min(59, test_end % 60))
    test_durations[pid] = duration
    test_wait_times[pid] = test_start - patient_arrival
    test_station_assigned[pid] = next_station
    test_result_days[pid] = result_days

    return (
        test_end_times,
        test_in_times,
        test_out_times,
        test_durations,
        test_wait_times,
        test_station_assigned,
        test_result_days
    )


## 🧪 Diagnostic Tests Simulation

This section simulates the **diagnostic testing stage** of patient flow, focusing on three types of tests: **Blood Test**, **Neuroimaging**, and **EEG**.

---

### 🔀 Test Assignment Logic

- After consultation, patients are randomly assigned to one of the following tests:
  - **20%** → Blood Test
  - **10%** → Neuroimaging
  - **10%** → EEG
  - Remaining **60%** are not assigned any diagnostic test.
- A new column `test_type` is added to reflect this assignment, with `null` values for patients not requiring further tests.

---

### 🏥 Test Station Setup

Each test type is handled by **2 specialized stations**:

| Test Type     | Stations |
|---------------|----------|
| Blood Test    | B1, B2   |
| Neuroimaging  | N1, N2   |
| EEG           | E1, E2   |

Patients are routed to the first available station once their consultation ends.

---

### 🧑‍⚕️ Queue Management

- Patients are **sorted by `consultation_out_time`** to maintain the correct order of flow.
- They are added to two separate queues based on triage category:
  - Emergency → `emergency_queue`
  - Routine → `routine_queue`
- When a station becomes free:
  - It **checks the queues**, giving **priority to emergency patients**.
  - Within each category, **First-In-First-Out** (FIFO) is enforced.

---

### ⏱️ Test Timing and Wait Tracking

Each test type has specific timing characteristics:

| Test Type     | Duration (mins)              | Result Days (Uniform Range) |
|---------------|------------------------------|-----------------------------|
| Blood Test    | 10–15 (mean=12.5, std=1.5)   | 1–3 days                    |
| Neuroimaging  | 30–60 (mean=45, std=7.5)     | 2–4 days                    |
| EEG           | 45–60 (mean=52.5, std=3.75)  | 7–14 days                   |

- A **normal distribution** is used for test durations, clipped to the valid range.
- A **uniform distribution** over whole days determines when results will be available.
- **Wait time** is computed as `test_in_time` minus the consultation end time.
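
One way to keep these per-test settings in a single place is a small parameter table that a sampling helper reads from. The sketch below assumes this layout. `TEST_PARAMS` and `sample_test` are hypothetical names, and the EEG standard deviation is set to 3.75, the value passed in the notebook's EEG code cell.

```python
import numpy as np

# Duration and result-day ranges per test type. The EEG standard deviation
# is 3.75 here, the value passed in the notebook's EEG cell.
TEST_PARAMS = {
    'Blood Test':   dict(mean=12.5, std=1.5,  min_duration=10, max_duration=15, min_days=1, max_days=3),
    'Neuroimaging': dict(mean=45.0, std=7.5,  min_duration=30, max_duration=60, min_days=2, max_days=4),
    'EEG':          dict(mean=52.5, std=3.75, min_duration=45, max_duration=60, min_days=7, max_days=14),
}

def sample_test(test_type, rng):
    """Return (duration_minutes, result_days) for one test of the given type."""
    p = TEST_PARAMS[test_type]
    raw = rng.normal(p['mean'], p['std'])
    duration = int(round(float(np.clip(raw, p['min_duration'], p['max_duration']))))
    # rng.integers excludes the upper bound unless endpoint=True
    result_days = int(rng.integers(p['min_days'], p['max_days'], endpoint=True))
    return duration, result_days

rng = np.random.default_rng(1)
samples = [sample_test('EEG', rng) for _ in range(200)]
```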

---

### ⚙️ Processing Logic

The diagnostic test workflow follows this sequence:

1. **Patients filtered by test type**.
2. **Sorted by consultation end time** to determine order.
3. **Queued based on triage category**.
4. **Assigned to the first available station**.
5. **Test durations, wait times, and result days** computed.
6. **All values tracked per patient** using the `process_test()` function.
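
Steps 1 to 3 of this sequence (filter, sort, queue by category) can be sketched on a small table. The frame `tests_demo` and its values are hypothetical and only mirror the column names used in this notebook. Station assignment and timing are left out.

```python
import pandas as pd

# Hypothetical mini-table with the same columns as the notebook's `tests` frame
tests_demo = pd.DataFrame({
    'patient_id': [0, 1, 2, 3],
    'consultation_end_min': [530, 515, 522, 540],
    'test_type': ['EEG', 'Blood Test', 'EEG', 'EEG'],
    'category': ['Routine', 'Routine', 'Emergency', 'Routine'],
})

# Steps 1-2: filter by test type, then order by consultation end time
eeg = (
    tests_demo[tests_demo['test_type'] == 'EEG']
    .sort_values('consultation_end_min')
    .reset_index(drop=True)
)

# Step 3: split into per-category queues of (row_index, arrival_minute)
emergency_queue, routine_queue = [], []
for idx, row in eeg.iterrows():
    entry = (idx, int(row['consultation_end_min']))
    if row['category'] == 'Emergency':
        emergency_queue.append(entry)
    else:
        routine_queue.append(entry)

print(emergency_queue)  # [(0, 522)]
print(routine_queue)    # [(1, 530), (2, 540)]
```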

---

### 📦 Output Tracked for Each Patient

- `test_type`: Type of diagnostic test assigned (or `None`)
- `test_in_time`: Time when test began
- `test_out_time`: Time when test ended
- `test_duration`: Total time spent in test
- `test_wait_time`: Time waited after consultation before test began
- `test_station_assigned`: Station that performed the test
- `test_result_days`: Number of days until results become available

---

This simulation approximates clinical operations in which emergency patients are prioritized and patients flow through specialized diagnostic paths following consultation.


In [32]:
# Filter patients who need a blood test and sort by consultation end time,
# keeping patient id and triage category
blood_tests = tests[tests['test_type'] == 'Blood Test']
blood_tests = blood_tests.sort_values(by='consultation_end_min').reset_index(drop=True)

# Generate flow for blood tests

# Initialize tracking lists
test_in_times = [None] * n_patients
test_out_times = [None] * n_patients
test_station_assigned = [None] * n_patients
test_wait_times = [None] * n_patients
test_durations = [None] * n_patients
test_result_days = [None] * n_patients

blood_test_end_times = {doc: 0 for doc in ['B1', 'B2']}

# Blood Test Queues
emergency_queue_blood = []
routine_queue_blood = []

# Simulation for blood test
for idx, row in blood_tests.iterrows():

  arrival_time = row['consultation_end_min']
  category = row['category']

  # Add patient to appropriate queue
  if category == 'Emergency':
    emergency_queue_blood.append((idx, arrival_time))
  else:
    routine_queue_blood.append((idx, arrival_time))

  # Step 1: While there are idle stations before this patient's arrival, assign from queues
  while True:
    idle_blood_tests = {doc: end_time for doc, end_time in blood_test_end_times.items() if end_time <= arrival_time}

    # This loop breaks if all stations are busy or both queues are empty
    if not idle_blood_tests or not (emergency_queue_blood or routine_queue_blood):
      break

    # Assign the next available station
    next_station = min(idle_blood_tests, key=idle_blood_tests.get)

    patient_idx, patient_arrival, emergency_queue_blood, routine_queue_blood = pop_next_patient(emergency_queue_blood, routine_queue_blood, blood_test_end_times, next_station)

    blood_test_end_times, test_in_times, test_out_times, test_durations, test_wait_times, test_station_assigned, test_result_days = process_test(patient_idx,patient_arrival,
    next_station,blood_test_end_times,blood_tests,test_in_times,test_out_times,
    test_durations,test_wait_times,test_station_assigned,test_result_days,
    mean=12.5,std=1.5,min_duration=10,max_duration=15,min_days=1,max_days=3)

# Serve any patients still queued after the last arrival has been processed
while emergency_queue_blood or routine_queue_blood:

  next_available_time = min(blood_test_end_times.values())
  idle_blood_tests = {doc: end_time for doc, end_time in blood_test_end_times.items() if end_time == next_available_time}
  next_station = min(idle_blood_tests, key=idle_blood_tests.get)

  patient_idx, patient_arrival, emergency_queue_blood, routine_queue_blood = pop_next_patient(emergency_queue_blood, routine_queue_blood, blood_test_end_times, next_station)

  # Run the test and record its results
  blood_test_end_times, test_in_times, test_out_times, test_durations, test_wait_times, test_station_assigned, test_result_days = process_test(patient_idx,patient_arrival,
  next_station,blood_test_end_times,blood_tests,test_in_times,test_out_times,
  test_durations,test_wait_times,test_station_assigned,test_result_days,
  mean=12.5,std=1.5,min_duration=10,max_duration=15,min_days=1,max_days=3)






In [33]:
# Neuroimaging Tests
neuro_tests = tests[tests['test_type'] == 'Neuroimaging']
neuro_tests = neuro_tests.sort_values(by='consultation_end_min').reset_index(drop=True)

neuro_test_end_times = {doc: 0 for doc in ['N1', 'N2']}
emergency_queue_neuro = []
routine_queue_neuro = []

for idx, row in neuro_tests.iterrows():
    arrival_time = row['consultation_end_min']
    category = row['category']

    if category == 'Emergency':
        emergency_queue_neuro.append((idx, arrival_time))
    else:
        routine_queue_neuro.append((idx, arrival_time))

    while True:
        idle_stations = {doc: end for doc, end in neuro_test_end_times.items() if end <= arrival_time}
        if not idle_stations or not (emergency_queue_neuro or routine_queue_neuro):
            break

        next_station = min(idle_stations, key=idle_stations.get)

        patient_idx, patient_arrival, emergency_queue_neuro, routine_queue_neuro = pop_next_patient(emergency_queue_neuro, routine_queue_neuro, neuro_test_end_times, next_station)

        neuro_test_end_times, test_in_times, test_out_times, test_durations, test_wait_times, test_station_assigned, test_result_days = process_test(patient_idx,patient_arrival,
        next_station,neuro_test_end_times,neuro_tests,test_in_times,test_out_times,
        test_durations,test_wait_times,test_station_assigned,test_result_days,
        mean=45,std=7.5,min_duration=30,max_duration=60,min_days=2,max_days=4)

while emergency_queue_neuro or routine_queue_neuro:
    next_available_time = min(neuro_test_end_times.values())
    idle_stations = {doc: end for doc, end in neuro_test_end_times.items() if end == next_available_time}
    next_station = min(idle_stations, key=idle_stations.get)

    patient_idx, patient_arrival, emergency_queue_neuro, routine_queue_neuro = pop_next_patient(emergency_queue_neuro, routine_queue_neuro, neuro_test_end_times, next_station)

    neuro_test_end_times, test_in_times, test_out_times, test_durations, test_wait_times, test_station_assigned, test_result_days = process_test(patient_idx,patient_arrival,
    next_station,neuro_test_end_times,neuro_tests,test_in_times,test_out_times,
    test_durations,test_wait_times,test_station_assigned,test_result_days,
    mean=45,std=7.5,min_duration=30,max_duration=60,min_days=2,max_days=4)



In [34]:
# EEG Tests
eeg_tests = tests[tests['test_type'] == 'EEG']
eeg_tests = eeg_tests.sort_values(by='consultation_end_min').reset_index(drop=True)

eeg_test_end_times = {doc: 0 for doc in ['E1', 'E2']}
emergency_queue_eeg = []
routine_queue_eeg = []

for idx, row in eeg_tests.iterrows():
    arrival_time = row['consultation_end_min']
    category = row['category']

    if category == 'Emergency':
        emergency_queue_eeg.append((idx, arrival_time))
    else:
        routine_queue_eeg.append((idx, arrival_time))

    while True:
        idle_stations = {doc: end for doc, end in eeg_test_end_times.items() if end <= arrival_time}
        if not idle_stations or not (emergency_queue_eeg or routine_queue_eeg):
            break

        next_station = min(idle_stations, key=idle_stations.get)

        patient_idx, patient_arrival, emergency_queue_eeg, routine_queue_eeg = pop_next_patient(emergency_queue_eeg, routine_queue_eeg, eeg_test_end_times, next_station)

        eeg_test_end_times, test_in_times, test_out_times, test_durations, test_wait_times, test_station_assigned, test_result_days = process_test(patient_idx,patient_arrival,
        next_station,eeg_test_end_times,eeg_tests,test_in_times,test_out_times,
        test_durations,test_wait_times,test_station_assigned,test_result_days,
        mean=52.5,std=3.75,min_duration=45,max_duration=60,min_days=7,max_days=14)

while emergency_queue_eeg or routine_queue_eeg:
    next_available_time = min(eeg_test_end_times.values())
    idle_stations = {doc: end for doc, end in eeg_test_end_times.items() if end == next_available_time}
    next_station = min(idle_stations, key=idle_stations.get)

    patient_idx, patient_arrival, emergency_queue_eeg, routine_queue_eeg = pop_next_patient(emergency_queue_eeg, routine_queue_eeg, eeg_test_end_times, next_station)

    eeg_test_end_times, test_in_times, test_out_times, test_durations, test_wait_times, test_station_assigned, test_result_days = process_test(patient_idx,patient_arrival,
    next_station,eeg_test_end_times,eeg_tests,test_in_times,test_out_times,
    test_durations,test_wait_times,test_station_assigned,test_result_days,
    mean=52.5,std=3.75,min_duration=45,max_duration=60,min_days=7,max_days=14)


## 💊 Pharmacy Flow Simulation

This section simulates the **pharmacy stage** in a clinic where patients collect their medications after consultation and/or diagnostic tests.

### 🔄 Overview of Process
- Each patient’s **prescription drop-off time** is determined as the later of their consultation or test end times.
- Patients are sorted by drop-off time, and ties are resolved by sorting on patient ID.
- The simulation uses **three pharmacists**: `P1`, `P2`, and `P3`.
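
As a minimal sketch of the drop-off rule above, the snippet below takes the later of two optional end times and then orders patients by drop-off time with ties broken on patient ID. The helper name `dropoff_time` and the three example patients are hypothetical and only illustrate the rule; they are not taken from the notebook.

```python
from datetime import time

def dropoff_time(test_out, consultation_out):
    """Return the later of two optional datetime.time values (None = stage skipped)."""
    candidates = [t for t in (test_out, consultation_out) if t is not None]
    return max(candidates) if candidates else None

# Hypothetical patients: (patient_id, test end, consultation end)
patients = [
    (2, time(10, 45), time(10, 5)),   # test finished after the consultation
    (0, None, time(9, 30)),           # no diagnostic test
    (1, None, time(9, 30)),           # same drop-off time as patient 0
]

# Sort by drop-off time, breaking ties on patient ID
queue = sorted((dropoff_time(t, c), pid) for pid, t, c in patients)
print(queue)
```

Sorting `(time, id)` tuples gives the same ordering as sorting a DataFrame on the drop-off column first and the patient ID second.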

### 👥 Pharmacist Assignment
- If a pharmacist is idle at the patient’s arrival time, that pharmacist is assigned.
- If all are busy, the patient waits for the next available pharmacist.
- The **pharmacy queue** is updated to track patients waiting.
- There is no emergency queue in the pharmacy.
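
The assignment rule can be sketched as a small function over a dictionary of "busy until" times, expressed in minutes since midnight. The function name `assign_station` and the example times are assumptions made for illustration.

```python
def assign_station(end_times, arrival_minutes):
    """Pick a station and the minute at which service can start."""
    # Stations whose previous patient finished by the time of arrival
    idle = {s: end for s, end in end_times.items() if end <= arrival_minutes}
    if idle:
        station = min(idle, key=idle.get)
        return station, arrival_minutes          # served immediately
    # Everyone is busy: wait for whoever frees up first
    station = min(end_times, key=end_times.get)
    return station, end_times[station]

# Hypothetical state: P1 busy until 10:00, P2 until 09:50, P3 until 10:20
end_times = {"P1": 600, "P2": 590, "P3": 620}

print(assign_station(end_times, 595))   # P2 is idle at 09:55
print(assign_station(end_times, 580))   # all busy at 09:40, wait for P2
```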

### ⏱️ Processing Steps per Patient
- **Medication finding time** is sampled from a **normal distribution (mean = 6, SD = 0.5)** and clipped to **5–7 minutes**.
- **Explanation time** (pharmacist explains medication to patient) is sampled from a **normal distribution (mean = 4, SD = 0.5)** and clipped to **3–5 minutes**.
- Start and end times are tracked and stored for both steps.
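
A rough sketch of the sampling step follows: draw from a normal distribution, clip to the allowed range, and round to whole minutes. It uses the standard-library `random` module instead of NumPy purely to stay self-contained, and the helper name `sample_minutes` is hypothetical.

```python
import random

def sample_minutes(rng, mean, sd, low, high):
    """Draw a duration from N(mean, sd), clip it to [low, high], round to minutes."""
    draw = rng.gauss(mean, sd)
    clipped = min(max(draw, low), high)
    return int(round(clipped))

rng = random.Random(42)  # fixed seed so the sketch is repeatable
find_times = [sample_minutes(rng, 6, 0.5, 5, 7) for _ in range(1000)]
explain_times = [sample_minutes(rng, 4, 0.5, 3, 5) for _ in range(1000)]

print(sorted(set(find_times)), sorted(set(explain_times)))
```

Because each clipping bound sits two standard deviations from the mean, clipping only changes roughly 5% of draws. Rounding to whole minutes has the larger effect, collapsing each step to one of three possible values.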

### 📊 Tracked Metrics per Patient

For every patient visiting the pharmacy:

- **Pharmacy Arrival Time** (`prescription_dropoff_times`)  
  - Computed as the **later** of consultation end or diagnostic test end time.
- **Assigned Pharmacist** (`P1`, `P2`, or `P3`)  
  - Based on availability at arrival; if all busy, the **next available pharmacist** is assigned.
- **Medication Finding Time** (`prescription_find_times`)  
  - Sampled from a **normal distribution** (mean = 6 min, SD = 0.5 min), clipped to **5–7 minutes**.
- **Explanation Time** (`patient_explaining_times`)  
  - Sampled from a **normal distribution** (mean = 4 min, SD = 0.5 min), clipped to **3–5 minutes**.
- **Pharmacy In/Out Time** (`pharmacist_in_times`, `pharmacist_out_times`)  
  - In-time is when the pharmacist starts processing the prescription.
  - Out-time is when the pharmacist returns to the counter with the medication to meet the patient.
- **Prescription Collection Time** (`prescription_collection_times`)
  - When the patient has heard the pharmacist's explanation and leaves the pharmacy with the medication.
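- **Pharmacy Wait Time** (`pharmacy_wait_times`)  
  - Time from the patient's arrival until the pharmacist returns with the medication, i.e. any time spent waiting for a free pharmacist plus the medication finding time.
- **Pharmacy Queue** (`pharmacy_waiting_list`)
  - Number of patients waiting for the pharmacy.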
- **Medication Availability** (`medication_availability`)  
  - Assigned randomly:
    - 90% → `'Available'`
    - 10% → `'Unavailable'`





In [35]:
# Pharmacy

# Initialize pharmacy lists
prescription_dropoff_times = [None] * n_patients
pharmacist_in_times = [None] * n_patients
pharmacist_out_times = [None] * n_patients
pharmacist_assigned = [None] * n_patients
prescription_collection_times = [None] * n_patients

prescription_find_times = [None] * n_patients
patient_explaining_times = [None] * n_patients
pharmacy_wait_times = [None] * n_patients
medication_availability = [None] * n_patients
pharmacy_waiting_list = [None] * n_patients

pharmacy_queue = []
pharmacy_queue_starts = []

pharmacist_end_times = {doc: 0 for doc in ['P1', 'P2', 'P3']}
prescription_dropoff_times = [max(t, c) if t and c else t or c for t, c in zip(test_out_times, consultation_out_times)]

pharm = pd.DataFrame({
    'patient_id': patient_ids,
    'prescription_dropoff_times': prescription_dropoff_times
})

pharm = pharm.sort_values(by=['prescription_dropoff_times', 'patient_id']).reset_index(drop=True)

for idx, row in pharm.iterrows():

    pid = row['patient_id']
    arrival_time = row['prescription_dropoff_times']
    arrival_minutes = arrival_time.hour * 60 + arrival_time.minute


    # Pharmacists who are already free when the patient arrives
    idle_stations = {doc: end for doc, end in pharmacist_end_times.items() if end <= arrival_minutes}

    if idle_stations:
        # A pharmacist is free, so nobody is waiting: reset the queue
        pharmacy_queue = []
        pharmacy_queue_starts = []
        next_station = min(idle_stations, key=idle_stations.get)
        pharmacist_start_minutes = max(arrival_minutes, pharmacist_end_times[next_station] or 0)

    else:
        # All pharmacists are busy: take the one who frees up first and record the wait in the queue
        next_station = min(pharmacist_end_times, key=pharmacist_end_times.get)
        pharmacist_start_minutes = max(arrival_minutes, pharmacist_end_times[next_station] or 0)

        pharmacy_queue, pharmacy_queue_starts = update_queue(pharmacy_queue, pharmacy_queue_starts, arrival_minutes, pharmacist_start_minutes)

    waiting_count = len(pharmacy_queue)
    pharmacy_waiting_list[pid] = waiting_count

    pharmacist_in_times[pid] = time(pharmacist_start_minutes // 60, pharmacist_start_minutes % 60)
    pharmacist_assigned[pid] = next_station
    find_times = int(round(np.clip(np.random.normal(loc=6, scale=0.5), 5, 7)))
    find_end = pharmacist_start_minutes + find_times
    prescription_find_times[pid] = find_times
    pharmacist_out_times[pid] = time(min(23, find_end // 60), min(59, find_end % 60))

    explain_times = int(round(np.clip(np.random.normal(loc=4, scale=0.5), 3, 5)))
    explain_end = find_end + explain_times
    patient_explaining_times[pid] = explain_times
    prescription_collection_times[pid] = time(min(23, explain_end // 60), min(59, explain_end % 60))

    pharmacist_end_times[next_station] = explain_end
    # Measured up to find_end, so this includes the medication finding time as well as any queueing
    pharmacy_wait_times[pid] = find_end - arrival_minutes
    medication_availability[pid] = np.random.choice(['Available', 'Unavailable'], p=[0.9, 0.1])


## 🧾 Billing Flow Simulation

This section simulates the **billing stage** in a clinic, where patients pay for services after collecting their medications.

### 🔄 Overview of Process
- There are **two billing stations**: `B1` and `B2`.
- Patients proceed to billing **immediately after collecting their medications**.
- If a billing station is free, the patient is processed right away.
- If both stations are busy, the patient is added to a **billing queue** and **waits for the next available station**.
- The system follows a **First-In, First-Out (FIFO)** queue, ordered by `prescription_collection_times`.
- There is no emergency queue.
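
The FIFO process above can be sketched end to end with fixed, hypothetical durations in place of the random draws. The function name `simulate_fifo` and the arrival and duration values are assumptions for illustration; times are minutes since midnight.

```python
def simulate_fifo(arrivals, durations, stations=("B1", "B2")):
    """Serve patients in arrival order at whichever station frees up first."""
    end_times = {s: 0 for s in stations}
    schedule = []
    # FIFO: order by arrival time, ties broken by patient index
    order = sorted(range(len(arrivals)), key=lambda i: (arrivals[i], i))
    for i in order:
        station = min(end_times, key=end_times.get)
        start = max(arrivals[i], end_times[station])
        end_times[station] = start + durations[i]
        schedule.append({"patient": i, "station": station,
                         "start": start, "wait": start - arrivals[i]})
    return schedule

# Three patients arrive within two minutes, a fourth arrives later
schedule = simulate_fifo(arrivals=[600, 601, 602, 610], durations=[6, 6, 5, 5])
for entry in schedule:
    print(entry)
```

In this example only the third patient waits (4 minutes), because both stations are still occupied when that patient arrives.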

### ⏱️ Processing Steps per Patient
- **Billing duration** is sampled from a **normal distribution** (mean = 5.5, SD = 0.5) and **clipped to 4–7 minutes**.


### 📊 Tracked Metrics per Patient

For every patient visiting the billing counter:

- **Billing Arrival Time** (`prescription_collection_times`)  
  - Time at which the patient arrives at billing, directly after pharmacy.

- **Assigned Billing Station** (`B1` or `B2`)  
  - Determined by station availability; assigned immediately if a station is free.

- **Billing In Time** (`billing_in_times`)  
  - Timestamp when the patient’s billing begins.

- **Billing Out Time** (`billing_out_times`)  
  - Timestamp when billing is completed.

- **Billing Duration** (`billing_durations`)  
  - Duration of the billing process, **sampled from a normal distribution**, clipped to **4–7 minutes**.

- **Billing Wait Time** (`billing_wait_times`)  
  - Time spent waiting before billing begins if no station is free.

- **Billing Queue Size** (`billing_waiting_list`)  
  - Number of patients already in queue when the current patient arrives.

In [36]:
# Billing

# Initialize billing lists
billing_in_times = [None] * n_patients
billing_out_times = [None] * n_patients
billing_station_assigned = [None] * n_patients
billing_durations = [None] * n_patients
billing_wait_times = [None] * n_patients

# Billing Queue
billing_waiting_list = [None] * n_patients
billing_queue = []
billing_queue_starts = []

billing_end_times = {doc: 0 for doc in ['B1', 'B2']}

billing = pd.DataFrame({
    'patient_id': patient_ids,
    'prescription_collection_times': prescription_collection_times
})

billing = billing.sort_values(by=['prescription_collection_times', 'patient_id']).reset_index(drop=True)

for idx, row in billing.iterrows():

    pid = row['patient_id']
    arrival_time = row['prescription_collection_times']
    arrival_minutes = arrival_time.hour * 60 + arrival_time.minute


    # Billing stations that are already free when the patient arrives
    idle_stations = {doc: end for doc, end in billing_end_times.items() if end <= arrival_minutes}

    if idle_stations:
        # A station is free, so nobody is waiting: reset the queue
        billing_queue = []
        billing_queue_starts = []
        next_station = min(idle_stations, key=idle_stations.get)
        billing_start_minutes = max(arrival_minutes, billing_end_times[next_station] or 0)

    else:
        # Both stations are busy: take the one that frees up first and record the wait in the queue
        next_station = min(billing_end_times, key=billing_end_times.get)
        billing_start_minutes = max(arrival_minutes, billing_end_times[next_station] or 0)

        billing_queue, billing_queue_starts = update_queue(billing_queue, billing_queue_starts, arrival_minutes, billing_start_minutes)

    # patients waiting
    waiting_count = len(billing_queue)
    billing_waiting_list[pid] = waiting_count

    # billing times
    billing_in_times[pid] = time(billing_start_minutes // 60, billing_start_minutes % 60)
    billing_station_assigned[pid] = next_station

    billing_time = int(round(np.clip(np.random.normal(loc=5.5, scale=0.5), 4, 7)))
    billing_end = billing_start_minutes + billing_time

    billing_durations[pid] = billing_time
    billing_out_times[pid] = time(min(23, billing_end // 60), min(59, billing_end % 60))

    billing_wait_times[pid] = billing_start_minutes - arrival_minutes
    billing_end_times[next_station] = billing_end


### 🧾 Patient Flow DataFrame Creation

This cell constructs a comprehensive **DataFrame** that captures each patient's journey through the clinic—from arrival and check-in to triage, consultation, diagnostic tests, pharmacy, and billing. It combines all relevant timestamps, durations, queue sizes, and station assignments into a single table for easy analysis. The resulting data is exported as a CSV file and the first 20 rows are displayed.


In [37]:
# Build DataFrame
patients_df = pd.DataFrame({
    'PatientID': patient_ids,
    'Age': ages,
    'Gender': genders,
    'HealthCondition': health_conditions,
    'Insurance': insurances,
    'ArrivalTime': arrival_times,
    'CheckInStart': checkin_start_times,
    'CheckInEnd': checkin_end_times,
    'CheckInDurationMin': checkin_durations,
    'CheckInWaitMin': checkin_wait_times,
    'PatientsWaitingCheckIn': patients_waiting_list,
    'TriageIn': triage_in_times,
    'TriageOut': triage_out_times,
    'TriageDurationMin': triage_durations,
    'TriageWaitMin': triage_wait_times,
    'TriageCategory': triage_classification,
    'TriageStation': triage_station_assignment,
    'PatientsWaitingTriage': triage_waiting_list,
    'ConsultationIn': consultation_in_times,
    'ConsultationOut': consultation_out_times,
    'TotalConsultationTimeMin': consultation_durations,
    'ConsultationWaitMin': consultation_wait_times,
    'Doctor': consultation_doctor,
    'Specialty': consultation_specialty,
    'RoutineQueueSize': routine_waiting_list,
    'EmergencyQueueSize': emergency_waiting_list,
    'DiagnosticTests': test_assignments,
    'TestIn': test_in_times,
    'TestOut': test_out_times,
    'TestDurationMin': test_durations,
    'TestWaitMin': test_wait_times,
    'TestStation': test_station_assigned,
    'TestResultDays': test_result_days,
    'PrescriptionDropoff': prescription_dropoff_times,
    'PharmacistIn': pharmacist_in_times,
    'PharmacistOut' : pharmacist_out_times,
    'PrescriptionFindTime' : prescription_find_times,
    'PrescriptionCollectionTime' : prescription_collection_times,
    'ExplanationTimeMin' : patient_explaining_times,
    'PharmacyWaitMin' : pharmacy_wait_times,
    'PharmacistAssigned' : pharmacist_assigned,
    'MedicationAvailability' : medication_availability,
    'PharmacyQueueSize' : pharmacy_waiting_list,
    'BillingIn' : billing_in_times,
    'BillingOut' : billing_out_times,
    'BillingDurationMin' : billing_durations,
    'BillingWaitMin' : billing_wait_times,
    'BillingStation' : billing_station_assigned,
    'BillingQueueSize' : billing_waiting_list

})

patients_df.to_csv('patient_flow_data.csv')
patients_df.iloc[0:20]

Unnamed: 0,PatientID,Age,Gender,HealthCondition,Insurance,ArrivalTime,CheckInStart,CheckInEnd,CheckInDurationMin,CheckInWaitMin,...,PharmacyWaitMin,PharmacistAssigned,MedicationAvailability,PharmacyQueueSize,BillingIn,BillingOut,BillingDurationMin,BillingWaitMin,BillingStation,BillingQueueSize
0,0,63,Female,Bipolar Disorder,AIA,08:28:00,08:28:00,08:29:00,1,0,...,6,P3,Available,0,09:48:00,09:54:00,6,0,B1,0
1,1,26,Female,Anxiety Disorders,NTUC Income,08:40:00,08:40:00,08:41:00,1,0,...,6,P1,Available,0,09:13:00,09:19:00,6,0,B1,0
2,2,83,Male,Depression,AIA,09:06:00,09:06:00,09:07:00,1,0,...,6,P2,Available,0,09:36:00,09:41:00,5,0,B2,0
3,3,72,Female,Depression,HSBC,09:27:00,09:27:00,09:28:00,1,0,...,7,P1,Available,0,10:01:00,10:07:00,6,0,B2,0
4,4,32,Male,Depression,NTUC Income,09:34:00,09:34:00,09:35:00,1,0,...,6,P3,Available,0,10:19:00,10:25:00,6,0,B2,0
5,5,86,Male,OCD,HSBC,09:35:00,09:35:00,09:36:00,1,0,...,6,P2,Available,0,10:18:00,10:24:00,6,0,B1,0
6,6,86,Female,Anxiety Disorders,Great Eastern,09:43:00,09:43:00,09:46:00,3,0,...,6,P1,Available,0,10:31:00,10:38:00,7,0,B1,0
7,7,35,Female,Depression,NTUC Income,10:13:00,10:13:00,10:14:00,1,0,...,6,P2,Available,0,10:45:00,10:51:00,6,0,B2,0
8,8,14,Male,Anxiety Disorders,Great Eastern,10:16:00,10:16:00,10:17:00,1,0,...,5,P3,Available,0,10:45:00,10:50:00,5,0,B1,0
9,9,33,Male,Depression,AIA,10:35:00,10:35:00,10:37:00,2,0,...,6,P1,Available,0,11:07:00,11:13:00,6,0,B1,0


### Patient Demographics Plots

This visualization summarizes key demographic characteristics of the patients in the dataset using a set of six interactive Plotly subplots:

- **Bar Plot**: Number of patients per disorder, based on the `HealthCondition` column.
- **Pie Chart**: Distribution of insurance types (`Insurance`).
- **Pie Chart**: Gender breakdown (`Gender`).
- **Histogram**: Age distribution of patients in 10-year bins (`Age`).
- **Pie Chart**: Percentage of Emergency vs Routine patients (`TriageCategory`).
- **Pie Chart**: Medication availability across patients (`MedicationAvailability`).

In [38]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import pandas as pd


# Create subplots
fig = make_subplots(
    rows=3, cols=2,
    subplot_titles=[
        "Number of Patients per Disorder",
        "Insurance Distribution",
        "Gender Distribution",
        "Age Distribution (10-Year Bins)",
        "Triage Category",
        "Medication Availability"
    ],
    specs=[[{"type": "bar"}, {"type": "pie"}],
           [{"type": "pie"}, {"type": "histogram"}],
           [{"type": "pie"}, {"type": "pie"}]]
)

# 1. Bar plot - HealthCondition
condition_counts = patients_df['HealthCondition'].value_counts()
fig.add_trace(
    go.Bar(
        x=condition_counts.index,
        y=condition_counts.values,
        marker_color='indianred',
        name="Disorder Count"
    ),
    row=1, col=1
)
fig.update_yaxes(title_text="Number of Patients", row=1, col=1)

# 2. Pie chart - Insurance
insurance_counts = patients_df['Insurance'].value_counts()
fig.add_trace(
    go.Pie(
        labels=insurance_counts.index,
        values=insurance_counts.values,
        marker=dict(colors=['royalblue', 'lightblue', 'mediumpurple']),
        name="Insurance",
        textposition='inside',
        textinfo='percent+label',
        hoverinfo='label+percent'
    ),
    row=1, col=2
)


# 3. Pie chart - Gender
gender_counts = patients_df['Gender'].value_counts()
fig.add_trace(
    go.Pie(
        labels=gender_counts.index,
        values=gender_counts.values,
        marker=dict(colors=['lightpink', 'lightgreen']),
        name="Gender",
        textposition='inside',
        textinfo='percent+label',
        hoverinfo='label+percent'
    ),
    row=2, col=1
)

# 4. Histogram - Age
fig.add_trace(
    go.Histogram(
        x=patients_df['Age'],
        xbins=dict(start=0, end=100, size=10),
        marker_color='steelblue',
        name="Age"
    ),
    row=2, col=2
)
fig.update_xaxes(title_text="Age (years)", row=2, col=2)
fig.update_yaxes(title_text="Number of Patients", row=2, col=2)


# 5. Pie chart - TriageCategory
triage_counts = patients_df['TriageCategory'].value_counts()
fig.add_trace(
    go.Pie(
        labels=triage_counts.index,
        values=triage_counts.values,
        marker=dict(colors=['tomato', 'gold']),
        name="Triage Category",
        textposition='inside',
        textinfo='percent+label',
        hoverinfo='label+percent'
    ),
    row=3, col=1
)

# 6. Pie chart - MedicationAvailability
medication_counts = patients_df['MedicationAvailability'].value_counts()
fig.add_trace(
    go.Pie(
        labels=medication_counts.index,
        values=medication_counts.values,
        marker=dict(colors=['mediumseagreen', 'salmon']),
        name="Medication Availability",
        textposition='inside',
        textinfo='percent+label',
        hoverinfo='label+percent'
    ),
    row=3, col=2
)

# Layout settings
fig.update_layout(
    height=900,
    width=1000,
    title_text="Patient Demographics Overview",
    title_x=0.5,
    showlegend=False,
    template="plotly_white"
)

fig.show()


### Clinic Flow Mapping – Sankey Diagram

This visualization shows the flow of patients through different stages of the clinic using a **Sankey diagram**.

A **Sankey diagram** is used to visualize flow between stages, with the width of each link proportional to the quantity of flow. It’s especially useful for mapping how patients move through a system and branch into different paths.

#### Patient Journey:
- **Check-In → Triage → Consultation**
- After Consultation, patients either:
  - Go directly to **Pharmacy** if no tests are needed
  - Or proceed to **one of three diagnostic tests**:
    - Blood Test
    - Neuroimaging
    - EEG
- All patients eventually reach **Pharmacy → Billing**

The plot reflects actual patient counts from the dataset at each transition.
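
As a sketch of how such a diagram is fed, a Sankey trace takes three parallel lists: source node indices, target node indices, and flow values. The snippet below builds them from a dictionary of links and checks that every intermediate stage passes on as many patients as it receives. The test counts (30, 15, 10) are hypothetical placeholders, not results from the simulation.

```python
stages = ["Check-In", "Triage", "Consultation",
          "Blood Test", "Neuroimaging", "EEG", "Pharmacy", "Billing"]
stage_idx = {name: i for i, name in enumerate(stages)}

# Hypothetical counts for illustration only
total_patients = 100
test_counts = {"Blood Test": 30, "Neuroimaging": 15, "EEG": 10}
no_test = total_patients - sum(test_counts.values())

links = {
    ("Check-In", "Triage"): total_patients,
    ("Triage", "Consultation"): total_patients,
    ("Consultation", "Pharmacy"): no_test,
    ("Pharmacy", "Billing"): total_patients,
}
for test, n in test_counts.items():
    links[("Consultation", test)] = n   # routed to a test
    links[(test, "Pharmacy")] = n       # and on to the pharmacy afterwards

# Parallel lists in the shape a Sankey link definition expects
source = [stage_idx[s] for s, _ in links]
target = [stage_idx[t] for _, t in links]
value = list(links.values())

def inflow(node):
    return sum(v for (_, t), v in links.items() if t == node)

def outflow(node):
    return sum(v for (s, _), v in links.items() if s == node)

# Flow conservation at every stage between Check-In and Billing
balanced = all(inflow(n) == outflow(n) for n in stages[1:-1])
print(balanced, inflow("Pharmacy"))
```

A failed balance check would indicate that some patients are counted in more than one branch or in none.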


In [39]:
import plotly.graph_objects as go

# Patient Flow: Check-In → Triage → Consultation → Diagnostics (Blood Test, Neuroimaging, EEG) → Pharmacy → Billing

# Count how many patients went through each test
test_counts = patients_df['DiagnosticTests'].value_counts()
total_patients = len(patients_df)
patients_with_test = patients_df[patients_df['DiagnosticTests'].notna()]
patients_without_test = total_patients - len(patients_with_test)

# ------------------- Sankey Diagram -------------------

stages = [
    "Check-In", "Triage", "Consultation",
    "Blood Test", "Neuroimaging", "EEG", "Pharmacy", "Billing"
]
stage_idx = {name: i for i, name in enumerate(stages)}

sankey_links = {
    ('Check-In', 'Triage'): total_patients,
    ('Triage', 'Consultation'): total_patients,
    ('Consultation', 'Blood Test'): test_counts.get('Blood Test', 0),
    ('Consultation', 'Neuroimaging'): test_counts.get('Neuroimaging', 0),
    ('Consultation', 'EEG'): test_counts.get('EEG', 0),
    ('Consultation', 'Pharmacy'): patients_without_test,
    ('Blood Test', 'Pharmacy'): test_counts.get('Blood Test', 0),
    ('Neuroimaging', 'Pharmacy'): test_counts.get('Neuroimaging', 0),
    ('EEG', 'Pharmacy'): test_counts.get('EEG', 0),
    ('Pharmacy', 'Billing'): total_patients,
}

sankey_fig = go.Figure(data=[go.Sankey(
    node=dict(
        pad=20,
        thickness=20,
        line=dict(color="black", width=0.5),
        label=stages,
        color=["#6baed6", "#9ecae1", "#c6dbef", "#fdae6b", "#fd8d3c", "#e6550d", "#a1d99b", "#74c476"]
    ),
    link=dict(
        source=[stage_idx[s[0]] for s in sankey_links.keys()],
        target=[stage_idx[s[1]] for s in sankey_links.keys()],
        value=list(sankey_links.values()),
        color = ["rgba(173, 216, 230, 0.5)",  # LightBlue
                      "rgba(144, 238, 144, 0.5)",  # LightGreen
                      "rgba(255, 182, 193, 0.5)",  # LightPink
                      "rgba(255, 255, 224, 0.5)",  # LightYellow
                      "rgba(221, 160, 221, 0.5)",  # Plum
                      "rgba(240, 230, 140, 0.5)",  # Khaki
                      "rgba(255, 228, 225, 0.5)",  # MistyRose
                      "rgba(175, 238, 238, 0.5)",  # PaleTurquoise
                      "rgba(250, 235, 215, 0.5)",  # AntiqueWhite
                      "rgba(255, 239, 213, 0.5)"   # PapayaWhip
                      ]
    )
)])

sankey_fig.update_layout(
    title_text="Patient Flow Through Clinic – Sankey Diagram",
    font_size=12,
    height=500,
    width=1000
)

# Show plot
sankey_fig.show()


### 🕒 Gantt-Style Chart: Patient Timelines Through the Clinic

This Gantt-style visualization shows the flow of 10 randomly sampled patients through various clinic stages on **May 15, 2025**. The stages include:

- Check-In
- Triage
- Consultation
- Test
- Pharmacy
- Billing

Each patient's journey is represented as a horizontal bar, color-coded by stage. The chart helps identify bottlenecks and overlaps in patient processing times.

📊 **Use case**: Visual audit of patient flow, clinic efficiency, and resource allocation.

🔍 **Sample size**: 10 patients  
📅 **Date used**: May 15, 2025  
🎨 **Chart type**: Interactive timeline (Gantt style)
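
A timeline chart needs one row per patient and stage, with full datetimes for the start and end. The sketch below shows that reshaping for a single hypothetical patient using only the standard library: bare `time` values are anchored to the fixed date, and stages the patient skipped are dropped. The patient record and the `stage_columns` mapping are illustrative assumptions.

```python
from datetime import date, datetime, time

base_date = date(2025, 5, 15)

def to_datetime(t):
    """Attach the fixed base date to a time of day; keep None for skipped stages."""
    return datetime.combine(base_date, t) if t is not None else None

# Hypothetical patient who had no diagnostic test
patient = {
    "PatientID": 7,
    "CheckInStart": time(10, 13), "CheckInEnd": time(10, 14),
    "TestIn": None, "TestOut": None,
    "BillingIn": time(10, 45), "BillingOut": time(10, 51),
}

stage_columns = {
    "Check-In": ("CheckInStart", "CheckInEnd"),
    "Test": ("TestIn", "TestOut"),
    "Billing": ("BillingIn", "BillingOut"),
}

rows = []
for stage, (start_col, end_col) in stage_columns.items():
    start, end = to_datetime(patient[start_col]), to_datetime(patient[end_col])
    if start is None or end is None:
        continue  # stage not visited, so no bar is drawn
    rows.append({"PatientID": patient["PatientID"], "Stage": stage,
                 "Start": start, "End": end})

for row in rows:
    print(row["Stage"], row["Start"], row["End"])
```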


In [40]:
from datetime import datetime, date, time

# Sample size
sample_size = 10
sample_df = patients_df.sample(n=sample_size, random_state=42).copy()

# Define the base date to use
base_date = date(2025, 5, 15)

# Convert all time columns to datetime using the fixed base date
time_cols = [
    'CheckInStart', 'CheckInEnd',
    'TriageIn', 'TriageOut',
    'ConsultationIn', 'ConsultationOut',
    'TestIn', 'TestOut',
    'PrescriptionDropoff', 'PrescriptionCollectionTime',
    'BillingIn', 'BillingOut'
]

# Combine time with base_date to create full datetime
for col in time_cols:
    sample_df[col] = sample_df[col].apply(
        lambda t: datetime.combine(base_date, t) if pd.notnull(t) else pd.NaT
    )

# Create long-form DataFrame for Gantt chart
timeline_data = []
for _, row in sample_df.iterrows():
    pid = row['PatientID']
    timeline_data.extend([
        {'PatientID': pid, 'Stage': 'Check-In', 'Start': row['CheckInStart'], 'End': row['CheckInEnd']},
        {'PatientID': pid, 'Stage': 'Triage', 'Start': row['TriageIn'], 'End': row['TriageOut']},
        {'PatientID': pid, 'Stage': 'Consultation', 'Start': row['ConsultationIn'], 'End': row['ConsultationOut']},
        {'PatientID': pid, 'Stage': 'Test', 'Start': row['TestIn'], 'End': row['TestOut']},
        {'PatientID': pid, 'Stage': 'Pharmacy', 'Start': row['PrescriptionDropoff'], 'End': row['PrescriptionCollectionTime']},
        {'PatientID': pid, 'Stage': 'Billing', 'Start': row['BillingIn'], 'End': row['BillingOut']},
    ])

timeline_df = pd.DataFrame(timeline_data)

# Drop rows with missing start or end times
timeline_df.dropna(subset=['Start', 'End'], inplace=True)

# Plot using Plotly Express
fig = px.timeline(
    timeline_df,
    x_start="Start", x_end="End",
    y="PatientID",
    color="Stage",
    title="🕒 Patient Timelines Through the Clinic",
    labels={"PatientID": "Patient ID", "Stage": "Clinic Stage"},
)

# Uniform and larger bar thickness
fig.update_traces(width=1.5)  # Bar thickness in y-axis (PatientID) units; reduce it if bars of neighbouring patient IDs overlap

# Improve appearance
fig.update_yaxes(autorange="reversed")
fig.update_layout(
    height=1000,
    margin=dict(l=50, r=50, t=80, b=40),
    plot_bgcolor='rgb(245, 245, 245)',
    paper_bgcolor='rgb(245, 245, 245)',
)

fig.show()


### 📦 Box Plot: Processing Time per Station

This box plot visualizes the **distribution of processing times** (in minutes) across key clinical stations:  
**Check-In, Triage, Consultation, Diagnostic Tests (Blood Test, Neuroimaging, EEG), Pharmacy,** and **Billing**.

- Each box represents the **median**, **interquartile range (Q1–Q3)**, and **outliers**.
- Diagnostic tests include only patients routed to those tests.
- Helps identify time-intensive stages and variability across stations.
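
To make the box components concrete, the sketch below computes quartiles and flags outliers with the common 1.5 × IQR rule for a hypothetical list of durations, skipping `None` entries the way non-participants are excluded from the test boxes. It uses `statistics.quantiles` with the `inclusive` method, which is an assumption for illustration; the plotting library may compute quartiles slightly differently.

```python
from statistics import quantiles

def box_summary(values):
    """Quartiles and 1.5 * IQR outliers, ignoring missing (None) entries."""
    data = sorted(v for v in values if v is not None)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}

# Hypothetical durations in minutes; None marks patients who skipped the station
durations = [20, 22, None, 25, 27, 30, 31, None, 33, 35, 60]
summary = box_summary(durations)
print(summary)
```

Here the 60-minute value lies above the upper fence (33 + 1.5 × 8 = 45), so it would be drawn as an individual outlier point instead of stretching the whisker.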

In [41]:
# 1. Create processing time DataFrame
processing_df = pd.DataFrame({
    'Check-In': patients_df['CheckInDurationMin'],
    'Triage': patients_df['TriageDurationMin'],
    'Consultation': patients_df['TotalConsultationTimeMin'],
    'Pharmacy': patients_df['PrescriptionFindTime'] + patients_df['ExplanationTimeMin'],
    'Billing': patients_df['BillingDurationMin']
})

# Add diagnostic test durations with NaNs for non-participants
for test in ['Blood Test', 'Neuroimaging', 'EEG']:
    mask = patients_df['DiagnosticTests'] == test
    processing_df[test] = np.where(mask, patients_df['TestDurationMin'], np.nan)

# 2. Plotly box plot figure
colors = [
    'rgba(93, 164, 214, 0.5)',      # Check-In
    'rgba(255, 144, 14, 0.5)',      # Triage
    'rgba(44, 160, 101, 0.5)',      # Consultation
    'rgba(255, 65, 54, 0.5)',       # Pharmacy
    'rgba(207, 114, 255, 0.5)',     # Billing
    'rgba(127, 96, 0, 0.5)',        # Blood Test
    'rgba(127, 166, 238, 0.5)',     # Neuroimaging
    'rgba(234, 153, 153, 0.5)'      # EEG
]

fig = go.Figure()

# 3. Add traces
for col, color in zip(processing_df.columns, colors):
    fig.add_trace(go.Box(
        y=processing_df[col],
        name=col,
        whiskerwidth=1,
        fillcolor=color,
        line_width=2
    ))

# 4. Update layout
fig.update_layout(
    title=dict(text='Processing Time per Station (Minutes)'),
    yaxis=dict(
        autorange=True,
        showgrid=True,
        zeroline=True,
        dtick=5,
        gridcolor='rgb(255, 255, 255)',
        gridwidth=1,
        zerolinecolor='rgb(255, 255, 255)',
        zerolinewidth=2,
    ),
    margin=dict(l=40, r=30, b=80, t=100),
    paper_bgcolor='rgb(243, 243, 243)',
    plot_bgcolor='rgb(243, 243, 243)',
    showlegend=False
)

fig.show()


### 📦 Box Plot: Wait Time per Station

This box plot visualizes the **distribution of wait times** (in minutes) across key clinical stations:  
**Check-In, Triage, Consultation, Diagnostic Tests (Blood Test, Neuroimaging, EEG), Pharmacy,** and **Billing**.

- Each box represents the **median**, **interquartile range (Q1–Q3)**, and **outliers**.
- Diagnostic tests include only patients routed to those tests.
- Helps identify wait-time-intensive stages and variability across stations.

In [42]:
# 1. Box plot: Wait Time per Station
wait_df = pd.DataFrame({
    'Check-In': patients_df['CheckInWaitMin'],
    'Triage': patients_df['TriageWaitMin'],
    'Consultation': patients_df['ConsultationWaitMin'],
    'Pharmacy': patients_df['PharmacyWaitMin'],
    'Billing': patients_df['BillingWaitMin']
})
for test in ['Blood Test', 'Neuroimaging', 'EEG']:
    mask = patients_df['DiagnosticTests'] == test
    wait_df[test] = np.where(mask, patients_df['TestWaitMin'], np.nan)

# 2. Plotly box plot figure
colors = [
    'rgba(93, 164, 214, 0.5)',      # Check-In
    'rgba(255, 144, 14, 0.5)',      # Triage
    'rgba(44, 160, 101, 0.5)',      # Consultation
    'rgba(255, 65, 54, 0.5)',       # Pharmacy
    'rgba(207, 114, 255, 0.5)',     # Billing
    'rgba(127, 96, 0, 0.5)',        # Blood Test
    'rgba(127, 166, 238, 0.5)',     # Neuroimaging
    'rgba(234, 153, 153, 0.5)'      # EEG
]

fig = go.Figure()

# 3. Add traces
for col, color in zip(wait_df.columns, colors):
    fig.add_trace(go.Box(
        y=wait_df[col],
        name=col,
        whiskerwidth=1,
        fillcolor=color,
        line_width=2
    ))

# 4. Update layout
fig.update_layout(
    title=dict(text='Wait Time per Station (Minutes)'),
    yaxis=dict(
        autorange=True,
        showgrid=True,
        zeroline=True,
        dtick=5,
        gridcolor='rgb(255, 255, 255)',
        gridwidth=1,
        zerolinecolor='rgb(255, 255, 255)',
        zerolinewidth=2,
    ),
    margin=dict(l=40, r=30, b=80, t=100),
    paper_bgcolor='rgb(243, 243, 243)',
    plot_bgcolor='rgb(243, 243, 243)',
    showlegend=False
)

fig.show()


## 📊 Time Analysis of Patient Flow

This dashboard presents four key visualizations to analyze patient timing patterns.

---

### 1. ⏰ Bar Plot: Hourly Patient Arrivals

- **What it shows**:  
  The number of patients arriving at the clinic for each hour of the day, extracted from the `ArrivalTime` column.

- **Insight**:  
  Identifies peak arrival hours and helps in optimizing staff allocation during busy periods.
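
  The hourly count can be sketched with a `Counter` over the hour component of each arrival. The ten arrival times below are the ones shown in the displayed rows of the patient flow table; the full dataset has 100.

```python
from collections import Counter
from datetime import time

# Arrival times of the ten patients shown in the table preview
arrival_times = [
    time(8, 28), time(8, 40), time(9, 6), time(9, 27), time(9, 34),
    time(9, 35), time(9, 43), time(10, 13), time(10, 16), time(10, 35),
]

hourly_counts = Counter(t.hour for t in arrival_times)
for hour in sorted(hourly_counts):
    print(f"{hour:02d}:00  {hourly_counts[hour]} patients")
```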

---

### 2. 🚑 Grouped Bar Plot: Wait Times by Triage Category

- **What it shows**:  
  Average wait times for **Emergency** vs **Routine** patients across:
  - Consultation
  - Blood Test
  - Neuroimaging
  - EEG

- **Insight**:  
  Reveals disparities in wait times between Emergency and Routine categories, useful for service prioritization and triage efficiency.
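
  A minimal sketch of the grouping behind this plot: collect wait times per (station, category) pair, average them, and fall back to 0 where a group has no patients. The records are hypothetical, and the name `stations` is chosen for this sketch only.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (station, triage category, wait in minutes) records
records = [
    ("Consultation", "Emergency", 2),
    ("Consultation", "Emergency", 4),
    ("Consultation", "Routine", 10),
    ("Consultation", "Routine", 15),
    ("EEG", "Routine", 20),
]
stations = ["Consultation", "Blood Test", "Neuroimaging", "EEG"]

waits = defaultdict(list)
for station, category, wait in records:
    waits[(station, category)].append(wait)

# One list of averages per category, in station order; empty groups become 0
bar_data = {
    category: [
        round(mean(waits[(s, category)]), 2) if waits[(s, category)] else 0
        for s in stations
    ]
    for category in ("Emergency", "Routine")
}
print(bar_data)
```

  A bar of height 0 is therefore ambiguous: it can mean "no wait" or "no patients in this group", so group sizes are worth checking before comparing categories.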

---

### 3. ⌛ Histogram: Total Waiting Time per Patient

- **What it shows**:  
  The distribution of the total time each patient spends **waiting**.

- **Insight**:  
  Shows whether total wait times become excessive and helps judge how much they need to be reduced.

---

### 4. ⌛ Histogram: Total Time Spent in Clinic per Patient

- **What it shows**:  
  The complete duration a patient spends in the clinic, from arrival to checkout.

- **Insight**:  
  Measures end-to-end time spent in the clinic, supporting overall efficiency analysis and time management improvements.
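
The two per-patient totals can be sketched as follows. The set of wait columns summed here and the idea that "checkout" corresponds to `BillingOut` are assumptions, since the notebook's own calculation is not shown in this part. The arrival and billing-out times match the first displayed row of the table; the triage and consultation waits are hypothetical.

```python
from datetime import date, datetime, time

# Assumed set of wait columns; a missing test wait (None) counts as 0
WAIT_COLUMNS = ["CheckInWaitMin", "TriageWaitMin", "ConsultationWaitMin",
                "TestWaitMin", "PharmacyWaitMin", "BillingWaitMin"]

def minutes_between(start, end):
    """Whole minutes between two times of day on the same date."""
    day = date(2025, 5, 15)  # any date works, both values share it
    delta = datetime.combine(day, end) - datetime.combine(day, start)
    return int(delta.total_seconds() // 60)

patient = {
    "ArrivalTime": time(8, 28), "BillingOut": time(9, 54),
    "CheckInWaitMin": 0, "TriageWaitMin": 3, "ConsultationWaitMin": 12,
    "TestWaitMin": None, "PharmacyWaitMin": 6, "BillingWaitMin": 0,
}

total_wait = sum(patient[col] or 0 for col in WAIT_COLUMNS)
total_time = minutes_between(patient["ArrivalTime"], patient["BillingOut"])
print(total_wait, total_time)
```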


In [43]:
# Create subplots
fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=[
        "Hourly Patient Arrivals",
        "Wait Times: Emergency vs Routine",
        "Total Waiting Time Distribution",
        "Total Time Spent in Clinic Distribution"
    ],
    specs=[[{"type": "bar"}, {"type": "bar"}],
           [{"type": "histogram"}, {"type": "histogram"}]]
)

# 1. Hourly Patient Arrivals: extract the hour from each arrival time
patients_df['ArrivalHour'] = patients_df['ArrivalTime'].apply(lambda t: t.hour)

# Count number of patients per hour
hourly_counts = patients_df['ArrivalHour'].value_counts().sort_index()

# Plot
fig.add_trace(
    go.Bar(
        x=hourly_counts.index,
        y=hourly_counts.values,
        marker_color='royalblue',
        name="Arrivals",
        showlegend=False
    ),
    row=1, col=1
)

# Update axes for first subplot
fig.update_xaxes(title_text='Hour', tickvals = hourly_counts.index, row=1, col=1)
fig.update_yaxes(title_text='No of Patients', row=1, col=1)

# 2. Grouped Bar Plot: Emergency vs Routine Wait Times
bar_data = {'Emergency': [], 'Routine': []}
tests = ['Consultation', 'Blood Test', 'Neuroimaging', 'EEG']

# Collect mean wait times per station, rounded to 2 decimals (0 if no patients match)
for test in tests:
    if test == 'Consultation':
        emergency_time = patients_df[patients_df['TriageCategory'] == 'Emergency']['ConsultationWaitMin'].mean()
        routine_time = patients_df[patients_df['TriageCategory'] == 'Routine']['ConsultationWaitMin'].mean()
    else:
        emergency_time = patients_df[(patients_df['TriageCategory'] == 'Emergency') & (patients_df['DiagnosticTests'] == test)]['TestWaitMin'].mean()
        routine_time = patients_df[(patients_df['TriageCategory'] == 'Routine') & (patients_df['DiagnosticTests'] == test)]['TestWaitMin'].mean()

    bar_data['Emergency'].append(round(emergency_time, 2) if pd.notna(emergency_time) else 0)
    bar_data['Routine'].append(round(routine_time, 2) if pd.notna(routine_time) else 0)

# Add Emergency trace with custom hover
fig.add_trace(
    go.Bar(x=tests, y=bar_data['Emergency'], name='Emergency', marker_color='tomato', showlegend=True,
        hovertemplate='Emergency: %{y:.2f} min<extra></extra>'),
    row=1, col=2
)

# Add Routine trace with custom hover
fig.add_trace(
    go.Bar(x=tests, y=bar_data['Routine'], name='Routine', marker_color='lightseagreen', showlegend=True,
        hovertemplate='Routine: %{y:.2f} min<extra></extra>'),
    row=1, col=2
)

# Update axes for second subplot
fig.update_xaxes(title_text='Station', row=1, col=2)
fig.update_yaxes(title_text='Average Wait Time (Mins)', row=1, col=2)


# 3. Histogram: Total Wait Time per Patient
total_wait = (
    patients_df['CheckInWaitMin'] +
    patients_df['TriageWaitMin'] +
    patients_df['ConsultationWaitMin'] +
    patients_df['TestWaitMin'].fillna(0) +
    patients_df['PharmacyWaitMin'] +
    patients_df['BillingWaitMin']
)
fig.add_trace(
    go.Histogram(x=total_wait, name="Total Wait Time", marker_color='mediumpurple',
                 xbins=dict(start=0, end=70, size=5),
                 showlegend=False),
    row=2, col=1
)

fig.update_xaxes(title_text='Total Waiting Time (Mins)',tickmode='array', tickvals=[10, 20, 30, 40, 50, 60], row=2, col=1)
fig.update_yaxes(title_text='No of Patients', row=2, col=1)

# 4. Histogram: Total Clinic Time
total_clinic_time = (
    patients_df['CheckInWaitMin'] + patients_df['CheckInDurationMin'] +
    patients_df['TriageWaitMin'] + patients_df['TriageDurationMin'] +
    patients_df['ConsultationWaitMin'] + patients_df['TotalConsultationTimeMin'] +
    patients_df['TestWaitMin'].fillna(0) + patients_df['TestDurationMin'].fillna(0) +
    patients_df['PharmacyWaitMin'] + patients_df['ExplanationTimeMin'] +
    patients_df['BillingDurationMin'] + patients_df['BillingWaitMin']
)
fig.add_trace(
    go.Histogram(x=total_clinic_time, name="Total Clinic Time", marker_color='cadetblue',
                 xbins=dict(start=0, end=150, size=10),
                 showlegend=False),
    row=2, col=2
)

fig.update_xaxes(title_text='Total Time at Clinic (Mins)', tickmode='array', tickvals=[10, 30, 50, 70, 90, 110, 130, 150], row=2, col=2)
fig.update_yaxes(title_text='No of Patients', row=2, col=2)


# Final layout
fig.update_layout(
    height=1000,
    width=1000,
    title="Time Analysis of Patient Flow",
    title_x=0.5,
    template="plotly_white",
    barmode="group",
    # showlegend=False  # Hide full legend
)

fig.show()


## 📊 Queue Monitoring and Doctor Workload

This section presents visual analyses of queue dynamics and doctor workload based on the clinical simulation data.

---

### 1. 📈 Line Plot: Consultation Queue Sizes Over Time

- **What it shows**:  
  How queue sizes evolve at the **consultation stage** for Routine and Emergency patients.

---

### 2. 📈 Line Plot: Pharmacy Queue Size Over Time

- **What it shows**:  
  The queue size at the **pharmacy** over time.

---

### 3. 📊 Bar Plot: Number of Patients Seen per Doctor

- **What it shows**:  
  The count of patients **grouped by doctor**.
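
For readers unfamiliar with how a queue-size series can be built, one common approach is to treat each patient joining or leaving a queue as an event and take a running total. The sketch below illustrates that idea only; it is not necessarily how the queue-size columns in this notebook were generated. The `QueueEnter` and `QueueLeave` columns and all timestamps are hypothetical.

```python
import pandas as pd

# Hypothetical enter/leave timestamps for three patients at one station
demo_df = pd.DataFrame({
    "QueueEnter": pd.to_datetime(["2024-01-01 09:00", "2024-01-01 09:05", "2024-01-01 09:10"]),
    "QueueLeave": pd.to_datetime(["2024-01-01 09:08", "2024-01-01 09:20", "2024-01-01 09:25"]),
})

# +1 when a patient joins the queue, -1 when they leave
events = pd.concat([
    pd.DataFrame({"Time": demo_df["QueueEnter"], "Change": 1}),
    pd.DataFrame({"Time": demo_df["QueueLeave"], "Change": -1}),
]).sort_values("Time", kind="stable")

# The running total of the changes is the queue size after each event
events["QueueSize"] = events["Change"].cumsum()

print(events[["Time", "QueueSize"]].to_string(index=False))
```

Plotting `QueueSize` against `Time` gives the same kind of line chart as the queue plots in this section.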

In [44]:
# Group and sort data for plot 1 (Consultation queue over time)
consultation_queue_df = patients_df[['TriageOut', 'RoutineQueueSize', 'EmergencyQueueSize']].dropna()
consultation_queue_df = consultation_queue_df.sort_values(by='TriageOut')

# Group and sort data for plot 2 (Pharmacy queue over time)
pharmacy_queue_df = patients_df[['PrescriptionDropoff', 'PharmacyQueueSize']].dropna()
pharmacy_queue_df = pharmacy_queue_df.sort_values(by='PrescriptionDropoff')

# Group data for plot 3 (patients seen per doctor)
doctor_counts = patients_df['Doctor'].value_counts()


# Colors
colors = {
    'Routine': 'royalblue',
    'Emergency': 'firebrick',
    'Pharmacy': 'seagreen',
    'Doctor': 'darkslateblue'
}

# Create 3x1 subplot
fig = make_subplots(
    rows=3, cols=1,
    subplot_titles=[
        "Consultation Queue Sizes Over Time",
        "Pharmacy Queue Size Over Time",
        "Patients Seen per Doctor"
    ]
)

# Plot 1: Routine vs Emergency queue over time
fig.add_trace(go.Scatter(
    x=consultation_queue_df['TriageOut'],
    y=consultation_queue_df['RoutineQueueSize'],
    mode='lines', name='Routine Queue',
    line=dict(color=colors['Routine']),
    showlegend = True
), row=1, col=1)

fig.add_trace(go.Scatter(
    x=consultation_queue_df['TriageOut'],
    y=consultation_queue_df['EmergencyQueueSize'],
    mode='lines', name='Emergency Queue',
    line=dict(color=colors['Emergency']),
    showlegend = True
), row=1, col=1)

# Plot 2: Pharmacy queue over time
fig.add_trace(go.Scatter(
    x=pharmacy_queue_df['PrescriptionDropoff'],
    y=pharmacy_queue_df['PharmacyQueueSize'],
    mode='lines', name='Pharmacy Queue',
    line=dict(color=colors['Pharmacy']),
    showlegend = False
), row=2, col=1)

# Plot 3: Patients seen per doctor
fig.add_trace(go.Bar(
    x=doctor_counts.index,
    y=doctor_counts.values,
    name='Patients per Doctor',
    marker_color=colors['Doctor'],
    showlegend = False
), row=3, col=1)

# Update axis labels
fig.update_xaxes(title_text="Time", row=1, col=1)
fig.update_yaxes(title_text="Queue Size", row=1, col=1)

fig.update_xaxes(title_text="Time", row=2, col=1)
fig.update_yaxes(title_text="Queue Size", row=2, col=1)

fig.update_xaxes(title_text="Doctor", row=3, col=1)
fig.update_yaxes(title_text="Patients Seen", row=3, col=1)


# Final layout
fig.update_layout(
    height=1000, width=700,
    title="Queue Monitoring and Doctor Load Analysis",
    title_x=0.5,
    template="plotly_white"
)

fig.show()


## 🔬 Diagnostic Tests & Outcomes

This section analyzes patterns in diagnostic test usage and turnaround times using two visualizations.

---

### 1. 📦 Box Plot: Days to Receive Diagnostic Test Results

- **What it shows**:  
  The distribution of **number of days** patients waited to receive results for each diagnostic test type.

- **Insight**:  
  Highlights variability in turnaround times across test types.

---

### 2. 📊 Bar Plot: Count of Diagnostic Tests Ordered

- **What it shows**:  
  The frequency with which each diagnostic test type was ordered.

- **Insight**:  
  Reveals the most commonly ordered tests, helping stakeholders understand clinical demand and optimize test station capacity accordingly.
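
The numbers behind both plots can be summarized in a single table. This is a minimal sketch, assuming the `DiagnosticTests` and `TestResultDays` columns used in this notebook; the rows in `demo_df` are invented.

```python
import pandas as pd

# Hypothetical mini data set; the last patient had no diagnostic test
demo_df = pd.DataFrame({
    "DiagnosticTests": ["Blood Test", "Blood Test", "EEG",
                        "Neuroimaging", "Neuroimaging", None],
    "TestResultDays": [1, 2, 3, 5, 7, None],
})

# Per test type: number of orders, median and maximum turnaround in days
summary = (
    demo_df.dropna(subset=["DiagnosticTests", "TestResultDays"])
    .groupby("DiagnosticTests")["TestResultDays"]
    .agg(orders="count", median_days="median", max_days="max")
    .sort_values("orders", ascending=False)
)

print(summary)
```

The `orders` column corresponds to the bar plot, while `median_days` and `max_days` give a compact view of what the box plot shows.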

---


In [45]:
# Filter out rows with missing values for diagnostic tests
diagnostic_df = patients_df[['DiagnosticTests', 'TestResultDays']].dropna()

# Colors for each diagnostic test; test names not listed here fall back to a default color
color_map = {
    'Blood Test': 'rgba(93, 164, 214, 0.5)',
    'Neuroimaging': 'rgba(255, 144, 14, 0.5)',
    'EEG': 'rgba(44, 160, 101, 0.5)'
}

# Unique diagnostic tests
unique_tests = diagnostic_df['DiagnosticTests'].unique()

# Create subplot (1 row, 2 columns)
fig = make_subplots(
    rows=1, cols=2,
    subplot_titles=(
        "Days to Receive Diagnostic Test Results",
        "Diagnostic Test Order Counts"
    )
)

# === Plot 1: Box Plot (TestResultDays by Diagnostic Test) ===
for test in unique_tests:
    filtered = diagnostic_df[diagnostic_df['DiagnosticTests'] == test]
    fig.add_trace(go.Box(
        y=filtered['TestResultDays'],
        name=test,
        whiskerwidth=0.5,
        fillcolor=color_map.get(test, 'rgba(93, 164, 214, 0.5)'),  # fallback color
        line_width=1
    ), row=1, col=1)

# === Plot 2: Bar Plot (Test Type Counts) ===
test_counts = diagnostic_df['DiagnosticTests'].value_counts()
bar_colors = [color_map.get(test, 'rgba(93, 164, 214, 0.8)') for test in test_counts.index]

fig.add_trace(go.Bar(
    x=test_counts.index,
    y=test_counts.values,
    marker_color=bar_colors,
    name='Test Orders'
), row=1, col=2)

# === Update Layout ===
fig.update_layout(
    title=dict(text="Diagnostic Test Analysis", x=0.5),
    height=500, width=1000,
    paper_bgcolor='rgb(243, 243, 243)',
    plot_bgcolor='rgb(243, 243, 243)',
    margin=dict(l=50, r=50, t=100, b=60),
    showlegend=False
)

# Y-axis tweaks for box plot
fig.update_yaxes(
    title_text="Days to Receive Results", row=1, col=1,
    gridcolor='white', zerolinecolor='white', dtick=1
)

# Y-axis for bar plot
fig.update_yaxes(
    title_text="Count of Tests Ordered", row=1, col=2,
    gridcolor='white', zerolinecolor='white', dtick=5
)

# Show plot
fig.show()


## ⚙️ Efficiency & Bottleneck Insights

This section highlights how patient queues and wait times can signal areas needing process improvement.

---

### 1. 📊 Bar Plot: Average Queue Length per Station

- **What it shows**:  
  The average number of patients waiting at each key station.

- **Insight**:  
  Helps identify service areas experiencing consistent bottlenecks or staffing shortages.

---

### 2. ⏳ Scatter Plot: Wait Time vs Arrival Time

- **What it shows**:  
  Plots each patient's `ArrivalTime` on the x-axis against their **total wait time** on the y-axis.

- **Insight**:  
  Highlights specific time windows during the day when bottlenecks and delays peak, useful for shift planning and patient flow optimization.
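
To complement the scatter plot, the time window with the longest waits can be located by averaging total wait time per arrival hour. The sketch below assumes `ArrivalTime` holds `datetime.time` values and that a `TotalWaitTime` column in minutes exists, as in this notebook. The values in `demo_df` are made up.

```python
from datetime import time

import pandas as pd

# Hypothetical mini data set standing in for patients_df
demo_df = pd.DataFrame({
    "ArrivalTime": [time(9, 10), time(9, 45), time(13, 5),
                    time(13, 20), time(13, 50), time(16, 0)],
    "TotalWaitTime": [12.0, 18.0, 35.0, 40.0, 45.0, 10.0],
})

# Group total wait time by the hour of arrival
arrival_hour = demo_df["ArrivalTime"].apply(lambda t: t.hour)
wait_by_hour = demo_df.groupby(arrival_hour)["TotalWaitTime"].agg(["mean", "count"])

# Hour whose arrivals waited longest on average
peak_hour = wait_by_hour["mean"].idxmax()

print(wait_by_hour)
print("Peak hour:", peak_hour)
```

Keeping the `count` column alongside the mean helps avoid over-reading an hour that contains only one or two patients.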


In [46]:
# Plot 1: Average queue lengths
avg_queues = {
    'Check-In': patients_df['PatientsWaitingCheckIn'].mean(),
    'Triage': patients_df['PatientsWaitingTriage'].mean(),
    'Consultation': (patients_df['RoutineQueueSize'] + patients_df['EmergencyQueueSize']).mean(),
    'Pharmacy': patients_df['PharmacyQueueSize'].mean(),
    'Billing': patients_df['BillingQueueSize'].mean()
}

# Plot 2: Wait time vs ArrivalTime
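# Assign the result back instead of a chained inplace fillna, which pandas warns about
patients_df['TestWaitMin'] = patients_df['TestWaitMin'].fillna(0)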
patients_df['TotalWaitTime'] = (
    patients_df['CheckInWaitMin'] +
    patients_df['TriageWaitMin'] +
    patients_df['ConsultationWaitMin'] +
    patients_df['TestWaitMin'] +
    patients_df['PharmacyWaitMin'] +
    patients_df['BillingWaitMin']
)

# Subplots
fig = make_subplots(
    rows=1, cols=2,
    subplot_titles=(
        "Average Queue Length per Station",
        "Total Wait Time vs. Arrival Time"
    )
)

# Bar plot
fig.add_trace(go.Bar(
    x=list(avg_queues.keys()),
    y=list(avg_queues.values()),
    marker_color='rgba(93, 164, 214, 0.7)',
    name="Avg Queue Length"
), row=1, col=1)

# Scatter plot
fig.add_trace(go.Scatter(
    x=patients_df['ArrivalTime'],
    y=patients_df['TotalWaitTime'],
    mode='markers',
    marker=dict(size=6, color='rgba(255, 65, 54, 0.7)'),
    name="Wait Time"
), row=1, col=2)

# Layout
fig.update_layout(
    title=dict(text="Queue Lengths and Wait Time Analysis", x=0.5),
    height=500, width=1000,
    margin=dict(l=50, r=50, t=100, b=60),
    showlegend=False
)

fig.update_yaxes(title_text="Average Queue Length", row=1, col=1)
fig.update_yaxes(title_text="Total Wait Time (min)", row=1, col=2)
fig.update_xaxes(title_text="Station", row=1, col=1)
fig.update_xaxes(title_text="Arrival Time", row=1, col=2)

fig.show()






