### Part 2: The Stability Experiment

##### *Import Libraries*

In [8]:
import pandas as pd
from utils.prompts import render
from utils.llm_client import LLMClient
from utils.router import pick_model
from utils.logging_utils import log_llm_call

##### *Load data from scenarios.txt*

In [4]:
input_scenarios_path = '../data/scenarios.txt'

try:
    with open(input_scenarios_path, 'r') as f:
        content = f.read()
        scenarios = [s.strip() for s in content.split('\n\n') if s.strip()]
    print(f"{len(scenarios)} scenarios loaded")
except FileNotFoundError:
    print("File not found")
    scenarios = []

2 scenarios loaded


##### *Set Up LLM Client*

In [10]:
model = pick_model('openai', 'general')
client = LLMClient('openai', model)

##### *Temperature Stress Test*

In [None]:
for scenario in scenarios:
    # The first line of each scenario is its title (e.g. "SCENARIO A: ...")
    print(f"Testing {scenario.splitlines()[0]}")
    
    # CoT prompt
    prompt_text, spec = render(
                                'cot_reasoning.v1',
                                role='Crisis Commander',
                                problem=scenario,
                                context='You must decide who to save first. Explain your reasoning step-by-step.'
                                )
    
    # Safe Mode (Temperature = 0.0)
    print("\n-Safe Mode-")
    response_safe = client.chat([{'role': 'user', 'content': prompt_text}], temperature=0.0)
    print(f"{response_safe['text'].strip()}\n")
    log_llm_call('openai', model, 'stability_safe', response_safe['latency_ms'], response_safe['usage'])
    
    print(f"{'-'*100}")

    # Chaos Mode (Temperature = 1.0)
    print("\n-Chaos Mode-")
    for run in range(1, 4):
        response_chaos = client.chat([{'role': 'user', 'content': prompt_text}], temperature=1.0)
        print(f"\n[Run {run}]:\n{response_chaos['text'].strip()}")
        log_llm_call('openai', model, 'stability_chaos', response_chaos['latency_ms'], response_chaos['usage'])

Testing SCENARIO A: THE KANDY LANDSLIDE

-Safe Mode-
**Reasoning Steps:**

1. **Assess Immediate Life Threat:** The uncle is in a critical situation, climbing a tree as water rises. This requires immediate action to rescue him.

2. **Assess Immediate Health Threat:** The diabetic patient who has collapsed needs insulin urgently. This is also a life-threatening situation that must be addressed quickly.

3. **Evaluate Resources:** Determine if there are any available resources in the factory that can be used for both rescue and medical assistance.

4. **Prioritize Actions:** Given the immediate life threat of the uncle and the health threat of the diabetic patient, prioritize actions based on the severity of the threats.

5. **Plan for Hunger:** While addressing the immediate threats, consider how to manage the hunger of the 40 people, but this is a lower priority compared to the life and health threats.

6. **Coordinate Rescue Efforts:** If possible, organize a team to assist in the res

#### How did the Chaos output drift or hallucinate compared to the Safe output?

##### Scenario A: The Kandy Landslide
The Safe output correctly treats the uncle’s rescue and the diabetic patient’s insulin as urgent, then addresses hunger. The Chaos outputs start to guess: some suggest the diabetic patient might be able to wait, and others assume that insulin or food is already available. None of this is stated in the scenario, so these are assumptions, not facts.

##### Scenario B: The Gampaha Hospital
The Safe output makes a clear choice: use the generator for the ICU, because the ventilator patients are in immediate danger, and move the elderly patients away from the flooding. The Chaos outputs keep the same decision but add ideas such as manual ventilation or extra batteries. These were never stated in the scenario and make the reasoning less clean.
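
The observation above (details appearing in a response that the scenario never mentions) can be checked roughly in code. The following is a minimal sketch, not part of the original experiment: `ungrounded_phrases`, the watchlist, and both example texts are hypothetical stand-ins, not the real contents of `scenarios.txt` or actual model output. It uses plain substring matching, so it only catches phrases you already suspect.

```python
def ungrounded_phrases(scenario, response, watchlist):
    """Return watchlist phrases found in the response but absent from the scenario.

    Hypothetical helper: a case-insensitive substring check, not a real
    hallucination detector.
    """
    scenario_lc = scenario.lower()
    response_lc = response.lower()
    return [
        phrase for phrase in watchlist
        if phrase.lower() in response_lc and phrase.lower() not in scenario_lc
    ]


# Made-up stand-in texts for illustration only (assumption, not real data).
scenario_text = (
    "The hospital has one generator. ICU patients are on ventilators. "
    "Elderly patients are in a ward that is flooding."
)
chaos_text = (
    "Use the generator for the ICU. Staff could use manual ventilation "
    "and extra batteries as backup."
)

flagged = ungrounded_phrases(
    scenario_text,
    chaos_text,
    ["manual ventilation", "extra batteries", "generator"],
)
print(flagged)
```

In the notebook, the same helper could be called with `scenario` and `response_chaos['text']` inside the loop. The watchlist still has to be written by hand after reading the outputs.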


##### *The Safe outputs stayed grounded in the exact information provided, while the Chaos outputs kept the same final decision but added guesses and unnecessary details that were not given.*
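
The comparison above is qualitative. One possible way to put a rough number on how far each Chaos run moved from the Safe baseline is a lexical similarity ratio from the standard library's `difflib`. This is a sketch under assumptions: `drift_score` is a hypothetical helper and the strings are made-up placeholders, not real model output. A high score means the wording differs, which is not the same as the content being wrong, so it complements manual reading rather than replacing it.

```python
from difflib import SequenceMatcher


def drift_score(baseline, candidate):
    """Return 1 - similarity ratio: 0.0 for identical text, up to 1.0 for no overlap."""
    return 1.0 - SequenceMatcher(None, baseline, candidate).ratio()


# Placeholder strings for illustration only (assumption, not real outputs).
safe_text = "Use the generator for the ICU and move the elderly patients."
chaos_texts = [
    "Use the generator for the ICU and move the elderly patients.",
    "Use the generator for the ICU, with manual ventilation as a backup.",
]

scores = [drift_score(safe_text, text) for text in chaos_texts]
for run, score in enumerate(scores, start=1):
    print(f"[Run {run}] drift = {score:.2f}")
```

In the notebook, `safe_text` would correspond to `response_safe['text']`, and `chaos_texts` to the three `response_chaos['text']` values collected in a list during the loop.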

