In [6]:
import sys
sys.path.append("../")
import pandas as pd

from utils.prompts import render
from utils.llm_client import LLMClient
from utils.logging_utils import log_llm_call
from utils.router import pick_model, should_use_reasoning_model
from IPython.display import display, Markdown

### Load data from the text file

In [7]:
with open('../data/raw/Scenarios.txt', encoding='utf-8') as f:
    text = f.read()

# Scenarios are separated by blank lines; drop empty chunks.
sample_scenarios = [s.strip() for s in text.split("\n\n") if s.strip()]
print(f"Loaded {len(sample_scenarios)} sample scenarios.")

Loaded 2 sample scenarios.


### Prompt instruction

In [8]:
instruction = """
You are an emergency decision-support system.

TASK:
Analyze the scenario and respond using EXACTLY the format below.

Reasoning:
1. Immediate life threat: <list the most urgent life-threatening issue>
2. Immediate health threat: <list critical medical issues>
3. Other risk: <list other risks, e.g., logistics, hunger, environment>
4. Prioritized actions:
   - For each issue above, write ONE concise action per bullet.
   - Each action MUST be ONE short sentence.
   - Do NOT combine multiple tasks in a single bullet.
   - Do NOT add extra explanations, reasoning.

Answer:
1. Immediate Life Threat: <combine the first issue with its action>
2. Immediate Health Threat: <combine the second issue with its action>
3. Other Risk: <combine the third issue with its action>

RULES:
- Maintain the structure exactly.
- Bullets in Prioritized Actions MUST be one short sentence each.
- Do NOT add paragraphs, introductions, conclusions, or extra text.
- Do NOT invent new threats or resources beyond the scenario.
- Keep output concise, factual, and structured.
"""



### LLM Client

In [9]:
model = pick_model('google', 'cot')
client = LLMClient('google', model)

prompt_text, spec = render(
    'cot_reasoning.v1',
    role='Emergency Decision Supporter',
    problem=sample_scenarios
)

### Get the Output

In [10]:
import time

for scenario in sample_scenarios:
    full_prompt = f"{prompt_text}\n\n{instruction}\n\nScenario:\n{scenario}"


    display(Markdown(f"{scenario}\n"))

    # Run Chaos mode (temperature=1.0) three times
    for run in range(1, 4):
        messages = [{'role': 'user', 'content': full_prompt}]
        response = client.chat(messages, temperature=1.0, max_tokens=spec.max_tokens)

        display(Markdown(f"### Chaos Run {run}"))
        display(Markdown(response['text']))
        time.sleep(2)


    # Run Safe mode (temperature=0.0) once
    messages = [{'role': 'user', 'content': full_prompt}]
    response_safe = client.chat(messages, temperature=0.0, max_tokens=spec.max_tokens)

    display(Markdown("### Safe Run"))
    display(Markdown(response_safe['text']))
    time.sleep(2)

    # Optional separator between scenarios
    display(Markdown("---"))


SCENARIO A: THE KANDY LANDSLIDE
Location: Hanthana Tea Factory
Details: "We are trapped. The access road is blocked. My uncle is stuck in the line rooms below, climbing a tree as water rises (Immediate Life Threat). However, inside the factory, we have a diabetic patient who has collapsed and needs insulin (Immediate Health Threat). We also have 40 people hungry."


### Chaos Run 1

Reasoning:
1. Immediate life threat: Uncle trapped in a tree by rising water.
2. Immediate health threat: Diabetic patient collapsed and requiring insulin.
3. Other risk: Food shortage for 40 people.
4. Prioritized actions:
   - Dispatch a water rescue team to extract the person from the tree.
   - Deliver emergency insulin to the factory via helicopter or drone.
   - Coordinate food supply drops for the trapped individuals.

Answer:
1. Immediate Life Threat: Uncle trapped in tree by rising water; dispatch water rescue team.
2. Immediate Health Threat: Diabetic patient collapsed; deliver emergency insulin.
3. Other Risk: Forty people hungry; coordinate food supply drops.

### Chaos Run 2

Reasoning:
1. Immediate life threat: Uncle trapped in a tree by rising floodwater.
2. Immediate health threat: Collapsed diabetic patient needing emergency insulin.
3. Other risk: Forty hungry people trapped due to a blocked access road.
4. Prioritized actions:
   - Dispatch a water rescue team to the line rooms immediately.
   - Deliver and administer insulin to the collapsed patient.
   - Provide food supplies to the 40 trapped individuals.

Answer:
1. Immediate Life Threat: Uncle stuck in rising water requires immediate deployment of a water rescue team.
2. Immediate Health Threat: Collapsed diabetic patient requires urgent insulin delivery and medical stabilization.
3. Other Risk: Forty hungry people require emergency food supplies and road clearance.

### Chaos Run 3

Reasoning:
1. Immediate life threat: Person stuck in a tree with rising water in the line rooms.
2. Immediate health threat: Collapsed diabetic patient requiring urgent insulin.
3. Other risk: Forty hungry people trapped due to blocked access roads.
4. Prioritized actions:
   - Deploy a water rescue team to retrieve the person from the tree.
   - Administer insulin to the collapsed diabetic patient immediately.
   - Coordinate emergency food delivery for the trapped individuals.

Answer:
1. Immediate Life Threat: Person stuck in a tree with rising water requires immediate water rescue deployment.
2. Immediate Health Threat: Collapsed diabetic patient requires urgent medical intervention and insulin.
3. Other Risk: Forty hungry people require emergency food supply coordination.

### Safe Run

Reasoning:
1. Immediate life threat: Uncle trapped in rising water.
2. Immediate health threat: Collapsed diabetic patient needing insulin.
3. Other risk: 40 hungry people.
4. Prioritized actions:
   - Dispatch water rescue to the line rooms.
   - Administer emergency insulin to the collapsed patient.
   - Arrange food supplies for the 40 people.

Answer:
1. Immediate Life Threat: Uncle trapped in rising water requires immediate water rescue.
2. Immediate Health Threat: Collapsed diabetic patient requires emergency insulin administration.
3. Other Risk: 40 hungry people require food supply coordination.

---

SCENARIO B: THE GAMPAHA HOSPITAL
Location: Gampaha District General Hospital
Details: "Power has failed in the ICU. We have 3 patients on ventilators with battery backup for 2 hours (Critical). Meanwhile, flood water is entering the ground floor ward where 50 elderly patients are bedridden (High Risk). We have one generator truck arriving in 30 mins, but it can only power one section."


### Chaos Run 1

Reasoning:
1. Immediate life threat: Three ICU patients on ventilators with only two hours of battery backup.
2. Immediate health threat: Fifty bedridden elderly patients at risk from rising flood water on the ground floor.
3. Other risk: Limited emergency power availability as the arriving generator can only support one section.
4. Prioritized actions:
   - Connect the incoming generator truck to the ICU power grid immediately.
   - Relocate all bedridden patients from the ground floor to upper floors.
   - Monitor ICU ventilator battery levels until the generator is operational.

Answer:
1. Immediate Life Threat: Ventilator patients facing power failure; connect the generator truck to the ICU.
2. Immediate Health Threat: Bedridden patients in a flooding ward; relocate them to upper floors.
3. Other Risk: Limited generator capacity; prioritize the ICU section for power restoration.

### Chaos Run 2

Reasoning:
1. Immediate life threat: Three ICU patients on ventilators with two-hour battery backup.
2. Immediate health threat: Fifty bedridden elderly patients in a flooding ground floor ward.
3. Other risk: Arriving generator truck has limited capacity and a 30-minute delay.
4. Prioritized actions:
   - Connect the generator truck to the ICU power grid upon arrival.
   - Relocate all 50 bedridden patients to a higher floor ward.
   - Monitor ventilator battery status until the power is restored.

Answer:
1. Immediate Life Threat: Provide power to ICU ventilators by connecting the arriving generator truck.
2. Immediate Health Threat: Relocate 50 bedridden elderly patients to upper floors to avoid flood water.
3. Other Risk: Prioritize ICU power allocation due to the limited capacity of the incoming generator.

### Chaos Run 3

Reasoning:
1. Immediate life threat: Three ICU patients on ventilators with only two hours of battery backup remaining.
2. Immediate health threat: Fifty bedridden elderly patients located in a flooding ground floor ward.
3. Other risk: A single arriving generator truck can only provide power to one section of the hospital.
4. Prioritized actions:
   - Direct the incoming generator truck to power the ICU grid.
   - Evacuate all bedridden patients to the hospital's upper floors.
   - Monitor the ventilator battery status until the generator is operational.

Answer:
1. Immediate Life Threat: Three patients on ventilators require the generator truck to be connected to the ICU grid.
2. Immediate Health Threat: Fifty bedridden patients must be relocated to upper floors to escape the flooding ward.
3. Other Risk: The single arriving generator truck must be prioritized for the ICU to maintain life support.

### Safe Run

Reasoning:
1. Immediate life threat: 3 patients on ventilators with only 2 hours of battery backup.
2. Immediate health threat: 50 bedridden elderly patients in a flooding ground floor ward.
3. Other risk: Limited power capacity of the incoming generator truck.
4. Prioritized actions:
   - Connect the generator truck to the ICU immediately upon arrival.
   - Evacuate all bedridden patients from the ground floor to upper levels.
   - Allocate the single generator resource exclusively to the critical care unit.

Answer:
1. Immediate Life Threat: 3 patients on ventilators require the generator truck to be connected to the ICU.
2. Immediate Health Threat: 50 bedridden elderly patients must be moved to higher floors to avoid flood water.
3. Other Risk: Limited generator capacity necessitates prioritizing power for the ICU section.

---

# Deliverables

## 1. Observations on Scenario A

### Safe Mode

- Life Threats: Correctly identifies the uncle trapped by rising water.

- Health Threats: Diabetic patient correctly listed.

- Other Risks: 40 people hungry identified.

- Prioritized Actions: Short, clear, actionable steps (rescue, insulin, food).

- Determinism: Expected to be stable at temperature 0.0, but only one Safe run was executed, so repeatability was not verified here.

### Chaos Mode

- Life Threats: Correct in all runs, but wording varies (Run 3 says “Person stuck in a tree” instead of “Uncle”). Safe Mode keeps the scenario's term, “Uncle”.

- Health Threats: Correct, but phrasing varies between runs; like Safe Mode, every run mentions the collapsed diabetic patient.

- Other Risks: Runs 2 and 3 fold the blocked access road into the hunger risk, and Run 2's answer adds road clearance; Safe Mode lists only the 40 hungry people.

- Prioritized Actions: Run 1 introduces resources that are not in the scenario (helicopter or drone, food supply drops), while Safe Mode sticks to water rescue, insulin administration, and food supplies.

- Determinism: Wording varies across all three runs.
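The run-to-run variation noted above can be put on a rough numeric scale. The sketch below is an illustration rather than the method used in this notebook: it assumes character-level similarity from the standard library's `difflib` is an acceptable proxy for "same wording", and `mean_pairwise_similarity` is a hypothetical helper name. The sample strings are the three Chaos "Immediate life threat" lines for Scenario A.

```python
from difflib import SequenceMatcher
from itertools import combinations

def mean_pairwise_similarity(outputs):
    """Average difflib ratio over all pairs of outputs (1.0 means identical)."""
    pairs = list(combinations(outputs, 2))
    if not pairs:
        return 1.0  # a single output is trivially consistent with itself
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

# "Immediate life threat" lines from Chaos Runs 1-3, Scenario A
chaos_life_threat = [
    "Uncle trapped in a tree by rising water.",
    "Uncle trapped in a tree by rising floodwater.",
    "Person stuck in a tree with rising water in the line rooms.",
]
score = mean_pairwise_similarity(chaos_life_threat)
print(f"Mean pairwise similarity: {score:.2f}")
```

A score near 1.0 indicates near-identical wording. The same function could be applied to repeated Safe runs to test whether temperature 0.0 is actually deterministic for this model.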

### Drift / Hallucination Examples

- Chaos Run 1: “Coordinate food supply drops for the trapped individuals” → Safe Mode simply arranges food supplies for the 40 people.

- Chaos Run 1: “Deliver emergency insulin to the factory via helicopter or drone” → Safe Mode only administers emergency insulin and names no transport.

- Chaos Run 2: “Forty hungry people require emergency food supplies and road clearance” → Safe Mode only coordinates food.
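Drift of this kind (resources that appear in the response but not in the scenario) can be flagged with a simple word check. This is a minimal sketch under stated assumptions: `novel_terms` is a hypothetical helper, and the `watchlist` of resource words is hand-picked for illustration, not derived from the notebook.

```python
import re

def novel_terms(scenario, response, watchlist):
    """Return watchlist terms that occur in the response but not in the scenario."""
    scenario_words = set(re.findall(r"[a-z]+", scenario.lower()))
    response_words = set(re.findall(r"[a-z]+", response.lower()))
    return sorted(
        term for term in watchlist
        if term in response_words and term not in scenario_words
    )

scenario_a = (
    "We are trapped. The access road is blocked. My uncle is stuck in the "
    "line rooms below, climbing a tree as water rises. However, inside the "
    "factory, we have a diabetic patient who has collapsed and needs insulin. "
    "We also have 40 people hungry."
)
chaos_run_1_action = "Deliver emergency insulin to the factory via helicopter or drone."

# Assumed watchlist of resource words; extend as needed.
watchlist = ["helicopter", "drone", "boat", "ambulance"]
print(novel_terms(scenario_a, chaos_run_1_action, watchlist))
# → ['drone', 'helicopter']
```

An exact-word check like this misses paraphrases and multi-word resources such as “road clearance”, so it is a first-pass filter rather than a full hallucination detector.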

## 2. Observations on Scenario B

### Safe Mode

- Life Threats: ICU ventilator patients correctly identified.

- Health Threats: 50 bedridden elderly patients correctly identified.

- Other Risks: Limited generator capacity recognized.

- Prioritized Actions: Clear, concise, realistic steps.

- Determinism: Expected to be stable at temperature 0.0, but only one Safe run was executed, so repeatability was not verified here.

### Chaos Mode

- Life Threats: Correct, with minor wording variation (Run 3 adds “remaining” to the battery backup); Safe Mode uses the scenario's numerals (“3 patients on ventilators with only 2 hours of battery backup”).

- Health Threats: Slightly more verbose phrasing (Run 1 adds “at risk from rising flood water”); Safe Mode keeps it concise.

- Other Risks: Sometimes expanded (Run 2 adds the 30-minute delay); Safe Mode states only the limited generator capacity.

- Prioritized Actions: All three runs add a monitoring step for the ventilator batteries, while Safe Mode instead restates the generator allocation to the ICU.

- Determinism: Chaos outputs vary in wording across runs, although all three recommend the same core actions.

### Drift / Hallucination Examples

- Chaos Run 1: “Monitor ICU ventilator battery levels until the generator is operational” → Safe Run does not include monitoring actions; it only assigns the generator and evacuates patients.

- Chaos Run 2: “30-minute delay” is explicitly highlighted as a risk, while Safe Run treats timing implicitly without emphasizing the delay.

- Chaos Run 3: “Direct the incoming generator truck to power the ICU grid” vs Safe Run’s “Connect the generator truck to the ICU immediately upon arrival”.