## Welcome to the Second Lab - Week 1, Day 3

Today we will work with lots of models! It's a great way to get comfortable with their APIs.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Important point - please read</h2>
            <span style="color:#ff7800;">The way I collaborate with you may be different from other courses you've taken. I prefer not to type code while you watch. Rather, I execute Jupyter Labs, like this, and give you an intuition for what's going on. My suggestion is that you carefully execute this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations.<br/><br/>If you have time, I'd love it if you submit a PR for changes in the community_contributions folder - instructions in the resources. Also, if you have a GitHub account, use this to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...
            </span>
        </td>
    </tr>
</table>

In [1]:
# Start with imports - ask ChatGPT to explain any package that you don't know

import os
import json
from dotenv import load_dotenv
from openai import OpenAI
from anthropic import Anthropic
from IPython.display import Markdown, display

In [2]:
# Always remember to do this!
load_dotenv(override=True)

True

In [3]:
# Print the key prefixes to help with any debugging

openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')
openrouter_api_key = os.getenv('OPENROUTER_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

if openrouter_api_key:
    print(f"OpenRouter API Key exists and begins {openrouter_api_key[:4]}")
else:
    print("OpenRouter API Key not set (and this is optional)")


OpenAI API Key exists and begins sk-proj-
Anthropic API Key not set (and this is optional)
Google API Key exists and begins AI
DeepSeek API Key not set (and this is optional)
Groq API Key not set (and this is optional)
OpenRouter API Key exists and begins sk-o
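
The six near-identical `if`/`else` blocks above could also be written as one loop. This is a minimal sketch, not part of the original lab: the `KEY_PREFIX_LENGTHS` table and the `describe_key` helper are hypothetical names, and the prefix lengths simply mirror the ones used in the cell above.

```python
import os

# Environment variable name -> number of leading characters to show.
# Names and lengths mirror the cell above; the table itself is illustrative.
KEY_PREFIX_LENGTHS = {
    "OPENAI_API_KEY": 8,
    "ANTHROPIC_API_KEY": 7,
    "GOOGLE_API_KEY": 2,
    "DEEPSEEK_API_KEY": 3,
    "GROQ_API_KEY": 4,
    "OPENROUTER_API_KEY": 4,
}


def describe_key(name, prefix_len, env=None):
    """Return a one-line status for a key, revealing only its prefix."""
    env = os.environ if env is None else env
    value = env.get(name)
    if value:
        return f"{name} exists and begins {value[:prefix_len]}"
    return f"{name} not set"


# Report on every key in one pass
for name, prefix_len in KEY_PREFIX_LENGTHS.items():
    print(describe_key(name, prefix_len))
```

Passing a plain dictionary as `env` makes the helper easy to try out without touching your real environment.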


In [4]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]

In [5]:
messages

[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]

In [6]:
openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-5-mini",
    messages=messages,
)
question = response.choices[0].message.content
print(question)


You are the chief crisis adviser to a coastal city-state of 250,000 people facing four simultaneous, interacting emergencies: (1) a category 4 hurricane will make landfall within 72 hours, projected to flood low-lying districts and cut power for an unknown period; (2) a nearby offshore vessel carrying 200 refugees in distress is signaling for rescue within your territorial waters right now (international law and moral duty require you to attempt rescue, but your resources are limited); (3) political protests in the city have escalated and are likely to turn violent within 24 hours unless de-escalation measures are taken; and (4) the main hospital has 60 intensive-care patients dependent on electricity and has fuel to run generators for at most 36 hours. Your available resources are: one multipurpose helicopter with four hours of flight time, 200 police officers (with limited nonlethal gear and no reinforcements), emergency shelters with capacity for 10,000 people (but no immediate logi

In [7]:
competitors = []
answers = []
messages = [{"role": "user", "content": question}]

## Note - update since the videos

I've updated the model names to use the latest models below, like GPT-5 and Claude Sonnet 4.5. These models can be quite slow, taking 1-2 minutes per response, but they do a great job! Feel free to switch them for faster models if you'd prefer, like the ones I use in the video.
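
If you want to see how long each model actually takes before deciding whether to swap it, you could wrap the call in a small timer. This is only a sketch: `timed` and `fake_model_call` are hypothetical helpers, and the fake call stands in for a real API request so the example runs without a key.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in for a slow model call, so the sketch runs without an API key
def fake_model_call(prompt):
    time.sleep(0.01)
    return f"echo: {prompt}"


result, elapsed = timed(fake_model_call, "hello")
print(f"{elapsed:.2f}s -> {result}")
```

In the notebook, the same wrapper could be used as `timed(openai.chat.completions.create, model=model_name, messages=messages)`.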

In [8]:
# The API we know well
# I've updated this with the latest model, but it can take some time because it likes to think!
# Replace the model with gpt-4.1-mini if you'd prefer not to wait 1-2 mins

model_name = "gpt-5-nano"

response = openai.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Below is a prioritized, time-phased crisis plan designed for the next 72 hours. It focuses first on preserving life in the hurricane event, then on the vulnerable ICU population in the hospital, the offshore refugees, and the risk of civil unrest, while keeping sheltering and law enforcement within capacity and constrained resources. For clarity, I present: (i) phased actions, (ii) major decisions with subpoints (a)-(f) as requested, (iii) contingency triggers, and (iv) a concluding justification.

Key assumptions guiding the plan (and sensitivity notes)
- The hurricane will make landfall within 72 hours and will flood low-lying districts and disrupt power for an unknown period. If actual flood extent or wind intensity is higher than expected, sheltering importance rises.
- There are 200 refugees in distress offshore; under international law we must attempt rescue, but the helicopter and port constraints limit rescue capacity. If distance/time to the vessel is longer than anticipated, rescue yields will be lower.
- 60 ICU patients rely on electricity; hospital has 36 hours of generator fuel in-house; 48 hours of diesel fuel remain in the system pool for generators and vehicles. If hospital power fails, survival odds for ICU patients plummet quickly.
- Police: 200 officers available; nonlethal gear only; no reinforcements. They can deter violence, protect shelters and critical infrastructure, and manage evacuations but cannot sustain large-scale crowd control for extended periods without risk.
- Emergency shelters can house up to 10,000 people, but there is no immediate, detailed logistics plan to feed/sustain all occupants.
- Communications: cellular may degrade; official radio and some social media posts are reliable channels if network exists. International aid is possible but will not arrive within 72 hours.
- The coastal city-state has one small port capable of handling small rescue craft; it is at risk of hurricane damage and may not be usable for a sustained relief effort.

Priority framing
- Top life-saving objective: keep the ICU hospital functioning through the immediate hurricane window; ensure evacuees from flood zones have safe shelter with basic sustenance.
- Secondary life-saving objective: rescue or triage offshore refugees within capacity constraints; maintain order to prevent preventable injuries during protests and the storm.
- Tertiary objective: minimize panic, preserve civil rights to protest, and maintain practical shelter logistics so people are protected rather than exposed to the disaster.

Phase 0–6 hours (Phase 1): Immediate mobilization and solidifying command
Core actions:
- Activate the city’s emergency operations center (EOC) and establish a single incident commander (SIC) with authority to reallocate resources rapidly.
- Immediate assessment briefings: flood-prone zones, hospital power needs, refugee vessel position, protest activity, shelter occupancy, fuel status.

Major decisions with (a)-(f) details:

Decision 1: Establish EOC with unified command and initial allocation of scarce resources
(a) Assumptions and sensitivity:
- Assumption: One commander can authorize rapid, cross-agency reallocations; resistance to reallocation would slow response. Sensitivity: if interagency conflict over authority occurs, decision cycle increases and lives at risk.
(b) Lives saved/loss estimate:
- Baseline: without centralized command, potential ICU power loss risk increases 15–40 additional deaths (depending on duration of outages); with unified command, reducing that risk to 5–15 deaths in the near term (rough range 0–25 lives saved now, relative to chaos scenario).
(c) Ethical/legal guidance:
- Duty to protect life, triage fairness, proportionality in use of force, and avoiding discrimination in evacuations.
(d) Public communication strategy:
- Publicize a clear, short message: “EOC active; follow official routes; shelters open; curfews/don’t travel in flood zones; nonessential protests discouraged during storm.” Use radio/official social channels to limit misinformation.
(e) Contingency triggers/update rule:
- Trigger: if the hurricane track shifts toward more populated districts or if hospital power status becomes uncertain, reallocate more generator fuel to hospital and accelerate shelter intake.
- Update rule: every 4–6 hours with new storm/hospital/shelter data; adjust resource allocation accordingly.
(f) Three riskiest failure modes and mitigations:
- Failure: miscommunication between agencies. Mitigation: appoint liaison officers; standardize messaging through one channel.
- Failure: misallocation of fuel due to faulty estimates. Mitigation: implement conservative allocations with rapid re-assessment rounds.
- Failure: inability to implement evacuations due to road blockages. Mitigation: pre-designated evacuation corridors and police escort plans; alternative routes planned.

Decision 2: Prioritize hospital power and ICU continuity
(a) Assumptions and sensitivity:
- Assumption: ICU survival benefit is highly power-dependent; hospital generators can be fed from the pool but must be prioritized. Sensitivity: if fuel logistics fail, ICU mortality could rise dramatically (>50% within hours at risk level).
(b) Lives saved/loss estimate:
- With allocating the 48-hour diesel pool to hospital generators (and prioritizing ICU and essential services), estimated ICU survival could improve from a high-mortality scenario to roughly 60–80% survival among ICU patients in the 48–72 hour window (roughly 24–36 lives saved vs. a worst-case scenario). If we share fuel with shelters/EMS, ICU deaths could rise by 10–30 lives.
(c) Ethical/legal guidance:
- Triage ethics: protect the most vulnerable (ICU patients); duty to preserve life when feasible; avoid neglect of those in hospital.
(d) Public communication strategy:
- Explain why hospital power is non-negotiable; reassure the public that life-saving care is the priority; emphasize that some non-critical services may be curtailed to protect ICU patients.
(e) Contingency triggers/update rule:
- Trigger: if fuel availability declines below a threshold (e.g., <30 hours left for hospital) or if hospital demand spikes (new ICU admissions), adjust allocation.
- Update rule: hourly internal check-in with hospital administration and fuel suppliers; adjust allocation every 6–12 hours.
(f) Three riskiest failure modes and mitigations:
- Failure: hospital fuel runs out before peak impact. Mitigation: reserve fuel for 48-hour hospital window; use priority re-supply if external aid or mobile generators become available.
- Failure: power disruption despite fuel (infrastructure failure). Mitigation: deploy portable generators to key ICU nodes and critical life-support equipment; deploy battery backups where possible.
- Failure: delayed patient transfer to alternate facilities (no alt facility). Mitigation: pre-plan a temporary triage flow within hospital to concentrate life-sustaining care and coordinate with EMS for rapid transfer if alternate sites become available.

Decision 3: Offshore refugee rescue plan (200 people)
(a) Assumptions and sensitivity:
- Assumption: the helicopter can conduct limited hoist rescues (likely 10–30 people per sortie, depending on equipment), with a total of 2–3 sorties possible in 4 hours of flying time. Sensitivity: greater distance or adverse weather reduces rescue capacity.
(b) Lives saved/loss estimate:
- Best-case: rescue 20–40 refugees with 0–5 deaths given the next 24–72 hour window. Worst-case within those sorties: 0–10 lives saved if sorties are limited or cannot reach the vessel in time.
(c) Ethical/legal guidance:
- Duty to rescue at sea; proportionality of resource use; protection of vulnerable refugees as per international law and humanitarian norms; avoid endangering rescuers.
(d) Public communication strategy:
- Issue a concise statement that rescue operations are underway and that the city will assist as capacity allows; emphasize that additional rescue attempts depend on weather and safety for rescuers.
(e) Contingency triggers/update rule:
- Trigger: if the vessel’s distress signals intensify or conditions worsen, escalate to a second rescue sortie or consider alternative rescue coordination with regional partners (even if not arriving within 72 hours).
- Update rule: update after each sortie; if weather deteriorates, suspend rescue to protect crew and refugees and re-evaluate when conditions improve.
(f) Three riskiest failure modes and mitigations:
- Failure: helicopter cannot reach vessel due to fuel/time constraints. Mitigation: maximize one or two high-probability sorties and prepare to coordinate with regional partners for later rescue; document best-effort rescue attempts publicly to maintain moral duty.
- Failure: refugees rescued but not properly triaged on shore, causing logistical overload. Mitigation: pre-stage triage area at receiving shelter; assign a dedicated team to manage medical screening and intake.
- Failure: rescue operation endangers responders. Mitigation: provide safety gear, weather checks, and strictly limit helicopter sorties to safe windows; if risk rises, abort sorties.

Decision 4: Evacuation and sheltering of flood-prone districts (up to 10,000 shelter capacity)
(a) Assumptions and sensitivity:
- Assumption: 6–8 shelters can be rapidly activated with basic supplies; 10,000 shelter capacity is a hard ceiling; 4–6 shelters can be used immediately with a plan to scale.
- Sensitivity: if wind/rain is more severe than expected, shelter occupancy will spike and supply lines will be stressed; miscalculation leads to crowding, disease risk, or inability to feed people.
(b) Lives saved/loss estimate:
- Evacuating 6,000–8,000 people from the most flood-prone zones could prevent 60–180 flood-related fatalities (based on typical urban flood exposure and history); failures to evacuate could result in 100+ preventable deaths if flooding is severe. Shelters’ ability to serve occupants minimizes exposure risk and illness.
(c) Ethical/legal guidance:
- Provide equitable access to shelter; protect vulnerable populations (elderly, disabled, children); ensure non-discriminatory access to shelters and aid; respect property rights and privacy where possible.
(d) Public communication strategy:
- Establish a clear message about evacuation routes, shelter locations, and what to bring; provide simple guidelines on staying safe in shelters; use radio to reach communities without cell access; set expectations for food and water provisioning.
(e) Contingency triggers/update rule:
- Trigger: if flood maps indicate higher-than-expected risk, or shelter occupancy exceeds 8,000, initiate additional shelter activation and call in external logistic support (where available).
- Update rule: re-evaluate every 6–12 hours; adjust shelter openings and routes as needed.
(f) Three riskiest failure modes and mitigations:
- Failure: shelter stockouts or insufficient food/water. Mitigation: pre-stock essential supplies for at least 10,000 people for 72 hours; use police and volunteers to distribute; prioritize distribution points to reduce lineups and crowding.
- Failure: health issues in shelters (disease spread). Mitigation: establish hygiene stations, basic medical triage at shelters, and a rapid response medical team to shelters.
- Failure: overcrowding and panic at shelters. Mitigation: ensure clear flow management, separate intake and exit routes, and a curfew or quiet hours policy to minimize conflicts.

Decision 5: Protests de-escalation and crowd management
(a) Assumptions and sensitivity:
- Assumption: protests could escalate within 24 hours; police have limited equipment and no reinforcements; de-escalation requires clear leadership, communication, and space to protest safely.
- Sensitivity: failure to de-escalate could lead to injuries and law-and-order breakdown, increasing casualties and delaying evacuation/shelter operations.
(b) Lives saved/loss estimate:
- If de-escalation succeeds, injuries reduce by 60–90% for protest-related events; if protests turn violent, injuries could reach 10–50 people, and interventions could cause additional injuries to officers or civilians.
(c) Ethical/legal guidance:
- Protect rights to peaceful assembly; use proportionate force; avoid targeting protected speech; balance civil liberties with public safety.
(d) Public communication strategy:
- Issue calm, transparent messages about peaceful assembly rules, permissible timing, and safety guidelines; provide alternative venues or designated protest zones with protection; encourage participants to move away from flood zones to safer areas.
(e) Contingency triggers/update rule:
- Trigger: if violence begins or crowds become aggressive, ramp up nonlethal deployment, designate safe zones, and negotiate with community leaders to pause gatherings during hurricane processing.
- Update rule: monitor police-crowd dynamics hourly; adjust deployments to minimize risk to public and officers.
(f) Three riskiest failure modes and mitigations:
- Failure: overreaction or escalation by police. Mitigation: strict rules of engagement; incident command structure for protests; debrief and adjust.
- Failure: critical protest zones disrupt shelter routes. Mitigation: designate safe protest corridors away from evacuation routes; use liaison officers to facilitate safe flows.
- Failure: misinformation sparks panic. Mitigation: deliver consistent, authoritative messages; avoid sensational social media posts; coordinate with media to ensure accurate reporting.

Decision 6: Port operations and logistics (supply and rescue coordination)
(a) Assumptions and sensitivity:
- Assumption: the port is damaged but capable of handling some small rescue craft; completing supply delivery depends on weather and structural integrity.
- Sensitivity: if the port becomes unusable, all inland shelter supplies must be managed through alternate channels, increasing risk of shortages.
(b) Lives saved/loss estimate:
- If port access is maintained for limited time, food/water/fuel and medical supplies can be delivered to shelters, potentially saving 1–5 lives via improved nutrition and care during the first 72 hours; if the port is unusable, shelter operations risk shortage-induced illness or attrition.
(c) Ethical/legal guidance:
- Ensure fair distribution of supplies across shelters; avoid preferential treatment for any group; maintain transparent triage for scarce resources.
(d) Public communication strategy:
- Explain the limits of port capacity and alternative delivery plans; transparently communicate supply levels and expected delivery times.
(e) Contingency triggers/update rule:
- Trigger: if port remains inaccessible, switch to land-based trucking routes and mobile distribution points; re-route supplies to key shelters and hospital supply lines.
- Update rule: daily supply-projection update; adjust distribution routes in light of storm damage assessments.
(f) Three riskiest failure modes and mitigations:
- Failure: port becomes completely unusable. Mitigation: predefine alternate distribution routes (land, inland convoys); maintain a rolling stockpile at shelter hubs.
- Failure: supply chain breaks due to weather. Mitigation: diversify suppliers and pre-position critical items (food, water, fuel, medical supplies) in shelters and hospital pre-stocks.
- Failure: theft or misallocation of supplies. Mitigation: station guards at distribution points; use documented hand-offs and accountability logs.

Decision 7: Public communications and update management
(a) Assumptions and sensitivity:
- Assumption: credible, consistent messaging reduces panic and violence; communication depends on radio and select social media; some cellular networks may fail.
- Sensitivity: inconsistent messaging creates panic, reduces compliance with evacuation and shelter directives.
(b) Lives saved/loss estimate:
- Effective communications could prevent multiple injuries and deaths by improving shelter compliance and orderly evacuations. A reasonable estimate might be 10–50 lives saved due to better compliance and reduced panic.
(c) Ethical/legal guidance:
- Maintain transparency and accountability; respect privacy and reduce misinformation; avoid fear-mongering; ensure messages do not stigmatize groups.
(d) Public communication strategy:
- Provide two parallel channels: official radio updates (short, precise) and official social media posts (clear, non-technical). Use pre-scripted messages in multiple languages. Provide daily briefings with simple survival tips (evacuation routes, shelter locations, food/water guidelines, power safety). Include “what to do now” steps and “what to expect next.”
(e) Contingency triggers/update rule:
- Trigger: if new data indicates higher risk to critical populations or if the storm intensity increases, issue an updated directive within 2–4 hours; if communication networks fail, shift to priority radio announcements and loudspeaker alerts at shelters.
- Update rule: every 6–12 hours during the first 72 hours; immediately after major new data or events (e.g., new flood maps, hospital status, refugee rescue progress, or protest developments).
(f) Three riskiest failure modes and mitigations:
- Failure: misinformation or rumors causing panic. Mitigation: centralize information; use official channels and rapid corrections.
- Failure: audience distrust. Mitigation: publish clear, consistent data; invite community leaders to co-endorse messages.
- Failure: outages in communications channels. Mitigation: maintain radio-based messaging and distribute printed guidance to shelters; use loudspeakers.

Contingency triggers and update framework
- Data to monitor: storm track and intensity; flood map updates; hospital generator fuel levels and ICU occupancy; shelter occupancy levels; refugee vessel status; protest activity; port status; fuel and supply stock levels; communications stability.
- Update rule: a standing 4–6 hour convening cadence at the EOC with cross-agency updates; if any critical data point deviates by a defined threshold (e.g., flood threat levels increased, ICU capacity threatened, refugee rescue viability changes, or fuel drops below 30 hours), revise decisions and reallocate resources accordingly.

Three riskiest failure modes and mitigations (summary)
1) ICU power failure due to fuel mismanagement or infrastructure faults: mitigate by prioritizing hospital generators, maintaining a strict fuel allocation protocol, and deploying portable backups where feasible; prepare rapid triage within ICU if older equipment fails.
2) Inability to rescue or adequately shelter offshore refugees due to time/weather constraints: mitigate by limiting the rescue scope to feasible sorties, initiating a staged rescue plan, and coordinating with regional partners for later support; pre-stage triage and intake at shelters.
3) Protests escalating into violence and blocking evacuations or shelter operations: mitigate by implementing early de-escalation, phased crowd management, designated protest zones, and rapid liaison with community leaders to avoid clashes.

Time-phased operational plan (summary by phase)

Phase 1 (0–6 hours): Establish command, confirm priorities, allocate scarce resources
- EOC activated; SIC assigns roles; hospital power preserved; police focus on protecting critical infrastructure and securing shelter intake points; begin helicopter reconnaissance for flood-prone districts and refugee vessel; begin communication push to public.
- Refugee rescue: plan for up to 2 sorties if weather is favorable; schedule hoist operations and prepare triage for shore intake.
- Shelters: designate initial 4–6 shelters as primary intake; pre-stage food/water/medical supplies for 8,000–10,000 people; plan for rapid distribution.
- Public messaging: launch radio announcements and social posts with evacuation routes, shelter locations, and protest guidelines.

Phase 2 (6–12 hours): Operationalize evacuations, stabilize hospital power, maintain public order
- Evacuate high-risk districts where feasible; police escorts to shelters; route management to minimize cross-traffic with flood zones; ensure 4–6 shelters are operational with basic supplies.
- Hospital power: commit entire diesel pool toward hospital generators if needed; prioritize ICU areas and essential life-support functions; minimize nonessential electrical use.
- Refugee rescue: execute planned sorties if weather is safe; prioritize vulnerable refugees (elderly, pregnant, children); document and prepare for onshore intake.
- Protests: deploy nonlethal, targeted crowd-control measures; designate safe protest zones; maintain access for evacuees and shelter operations.

Phase 3 (12–24 hours): Pre-landfall stabilization, maintain shelter operations, monitor fuel and supply lines
- Shelter logistics expand to 6–8 facilities; begin scheduled feeding/dispensing of water; maintain hygiene measures.
- Fuel planning: if hospital needs remain acute, extend hospital power; otherwise distribute remaining fuel to EMS and shelter distribution as a contingency.
- Refugee intake on shore (if rescue completed) and integration into shelter management; ensure medical triage aligns with shelter capacity.
- Protests: maintain de-escalation posture; adjust curfews if risk to life rises.

Phase 4 (24–48 hours): Hurricane landfall window; storm impacts; sustain life-support, evacuation, and sheltering
- Continue life-saving operations for ICU if possible; keep critical infrastructure (hospitals, shelters, fuel lines) operational; adjust shelter distribution as needed.
- Refugee survivors: ensure safe intake and medical screening; triage and stabilization at shelters prior to longer-term care.
- Protests: maintain safety; adjust messaging to reflect storm status and emphasize safety guidelines.

Phase 5 (48–72 hours): Post-landfall stabilization and recovery planning
- Evaluate hospital power status; begin debriefing and damage assessments; plan for further generator redeployment if needed.
- Shelter operations shift to longer-term stabilization; ensure supply chains are reoriented toward essential survival needs (food, medicine, clean water).
- Refugee survivors: evaluate long-term needs; coordinate with national or international partners if required for ongoing relief.
- Public messaging: continue to provide updates on shelter availability, safety guidelines, and recovery timelines.

Concluding one-paragraph justification (plain language)
This plan focuses first on saving the maximum number of lives during the most dangerous phase—the hurricane—by protecting the ICU’s power supply, ensuring rapid evacuation of the most at-risk residents, and sustaining essential shelter operations. It recognizes the moral obligation to rescue offshore refugees while acknowledging the limits of our resources; it uses the helicopter for targeted, high-probability rescues and relies on in-city sheltering as the backbone of protection when the port and airspace are compromised. It also emphasizes de-escalation and civil rights, recognizing that violence undermines every life-saving effort. The plan is designed to be adaptable: it uses a centralized command, transparent communication, and continuous reassessment of fuel, shelter occupancy, and storm data to re-prioritize actions as conditions change. In profound uncertainty, this approach seeks to maximize lives saved and reduce long-term harm by protecting critical infrastructure (hospital power), delivering rapid, principled responses to the most vulnerable, and maintaining order so that lifesaving work can proceed safely.

If you want, I can format this into a formal incident action plan (IAP) with a single-page executive summary, a resource task list, and a responsibility matrix (who does what, when, and with what authority) to use in the EOC.

In [None]:
# Anthropic has a slightly different API, and max_tokens is required

model_name = "claude-sonnet-4-5"

claude = Anthropic()
response = claude.messages.create(model=model_name, messages=messages, max_tokens=1000)
answer = response.content[0].text

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
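
The "slightly different API" shows up mainly in where the reply text lives: `response.choices[0].message.content` for the OpenAI client versus `response.content[0].text` for Anthropic. As a rough sketch, a single helper could hide that difference. `extract_text` is a hypothetical name, and the `SimpleNamespace` objects below are hand-built stand-ins that only imitate the attributes used here, not real SDK responses.

```python
from types import SimpleNamespace


def extract_text(response):
    """Return the reply text from an OpenAI-style or Anthropic-style response."""
    if hasattr(response, "choices"):
        # OpenAI chat completions: choices[0].message.content
        return response.choices[0].message.content
    # Anthropic messages: content is a list of blocks; join the text blocks
    return "".join(
        block.text
        for block in response.content
        if getattr(block, "type", None) == "text"
    )


# Stand-ins that mimic only the attributes read above
openai_like = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="Hi from OpenAI"))]
)
anthropic_like = SimpleNamespace(
    content=[SimpleNamespace(type="text", text="Hi from Claude")]
)

print(extract_text(openai_like))
print(extract_text(anthropic_like))
```

Joining all text blocks, rather than taking only the first, is a design choice for this sketch; the cell above reads `content[0].text` directly, which is fine when the reply is a single text block.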




In [11]:
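# Gemini offers an OpenAI-compatible endpoint, so we reuse the OpenAI client with a different base_url and key
gemini = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")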
model_name = "gemini-2.5-flash"

response = gemini.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

As your Chief Crisis Adviser, I recommend the following prioritized, time-phased operational plan for the next 72 hours, designed to minimize expected fatalities and long-term harm under these extreme constraints. Our core principle is to protect human life, guided by ethical duties and legal obligations, and to foster unity and resilience in the face of unprecedented challenges.

---

## Operational Plan: The Next 72 Hours

**Overall Goal:** Minimize expected fatalities and long-term harm across all interacting emergencies.

**Phase 1: 0-12 Hours (Immediate Crisis Response & Initial Preparations)**

**Priority Actions:**

1.  **Refugee Rescue (Immediate)**
    *   **Decision:** Deploy the helicopter immediately for urgent reconnaissance and initial critical rescue from the distressed vessel. Simultaneously, mobilize any small, robust local surface vessels (e.g., fishing boats, port authority launches) to assist in full-scale rescue.
    *   **Allocation:**
        *   **Helicopter:** 2 hours of its 4-hour flight time for initial assessment, winching critically ill/injured, or children.
        *   **Police:** 20 officers (10% of force) to the port for reception, security, initial medical triage, and maintaining order as refugees disembark.
        *   **Fuel:** Helicopter fuel consumed. Minimal vehicle fuel for port operations and refugee transport to a designated secure shelter section.
        *   **Shelter/Medical:** Designate a secure, separate area within a primary shelter or a non-critical annex of the hospital for initial medical screening, stabilization, and temporary housing of rescued refugees.
    *   **(a) Key Assumptions & Sensitivity:**
        *   **Assumption:** A local surface vessel *can* be mobilized quickly to complement the helicopter. *Sensitivity:* **Highly sensitive.** Without surface craft, the helicopter alone cannot rescue 200 people, leading to significant fatalities. The entire rescue hinges on this.
        *   **Assumption:** Weather permits initial helicopter flight and small boat operations. *Sensitivity:* If conditions are already too rough, rescue efforts will be severely hampered or delayed, increasing risk to all 200.
    *   **(b) Expected Lives Saved/Lost:**
        *   **Saved:** Up to 200 lives if a full and safe rescue is executed.
        *   **Lost (without action or failed rescue):** Potentially all 200, within hours to days, from vessel instability or worsening storm conditions.
    *   **(c) Ethical Principles & Legal Obligations:** **Duty to Rescue** (UNCLOS, SOLAS, SAR conventions). **Moral Imperative** to save lives in distress. **Triage** in initial helicopter rescue (prioritizing most vulnerable).
    *   **(d) Communication Strategy:** Internally: Rapid coordination with port, medical, and police. Externally (limited): Announce a "humanitarian maritime distress response" is underway, emphasizing care for those in need. Avoid specific numbers initially to prevent panic.
    *   **(e) Contingency Triggers & Update Rule:**
        *   **Trigger 1:** No local surface vessel can be mobilized within 2 hours. *Update:* Immediately send urgent international appeal for naval/coast guard assistance (understanding it won't arrive for 72 hrs, but establishing the request). Reallocate remaining helicopter time for maximum critical rescue (e.g., repeated trips for children/severely injured).
        *   **Trigger 2:** Vessel capsizes or sinks rapidly. *Update:* Shift to immediate water rescue/recovery operations, prioritizing any survivors, disregarding full triage for speed.
    *   **(f) Riskiest Failure Modes & Mitigation:**
        *   **1. Inability to execute full rescue:** *Mitigation:* Exhaust all local surface vessel options; initiate international plea; prepare for high fatality count if surface rescue is impossible.
        *   **2. Overwhelming port/medical facilities:** *Mitigation:* Clear pre-designation of holding areas; rapid medical screening to prioritize hospital transfer only for life-threatening cases; police for crowd control.
        *   **3. Rescue diverts resources from hurricane prep:** *Mitigation:* Dedicated teams; minimal police deployment to port; core hurricane teams remain focused.

2.  **Hospital ICU Fuel & Operations (36-hour clock)**
    *   **Decision:** Immediately secure an additional 12 hours of diesel fuel for the hospital generators, extending their critical supply to 48 hours. Simultaneously, prepare contingency plans for patient transfer/manual care.
    *   **Allocation:**
        *   **Fuel:** Divert 12 hours of *general-use* diesel fuel (from the 48-hour total) *immediately* to the hospital. This leaves 36 hours of general fuel for other needs.
        *   **Police:** 10 officers (5%) to secure fuel transport, protect the hospital perimeter, and assist with any potential patient movements.
        *   **Vehicles:** Designate necessary vehicles for secure fuel transport.
        *   **Hospital Staff:** Initiate review of all 60 ICU patients for potential early discharge, transfer to less critical care, or identify those who could survive limited power/manual support if truly unavoidable.
    *   **(a) Key Assumptions & Sensitivity:**
        *   **Assumption:** An additional 12 hours of fuel can be sourced and transported within 12-24 hours. *Sensitivity:* **Extremely sensitive.** If fuel cannot be secured, 60 lives are directly at risk after 36 hours.
        *   **Assumption:** 48 hours of generator fuel (original 36 + diverted 12) is sufficient for initial storm impact and allows for alternative power/patient transfer. *Sensitivity:* If generators fail or storm impact is longer than 48 hours, the crisis persists.
    *   **(b) Expected Lives Saved/Lost:**
        *   **Saved:** Up to 60 lives if fuel supply is secured or effective contingency is implemented.
        *   **Lost (without action):** Potentially all 60 patients *after* 36 hours.
    *   **(c) Ethical Principles & Legal Obligations:** **Medical Triage** (to maximize survival of vulnerable patients). **Duty of Care.**
    *   **(d) Communication Strategy:** Internally: Urgent coordination with hospital administration, fuel suppliers. Externally: *Do not* publicize the fuel shortage. Assure the public that "critical services are being secured."
    *   **(e) Contingency Triggers & Update Rule:**
        *   **Trigger 1:** No additional fuel can be sourced or transported within 24 hours. *Update:* Immediately begin preparing for patient transfers to pre-identified alternative sites (e.g., a high school gym with generator capacity) or manual ventilation protocols. Prioritize patient categories for transfer (e.g., those most likely to survive transfer).
        *   **Trigger 2:** Hospital generators fail. *Update:* Initiate immediate transfer to alternative sites or manual care, prioritizing patients with highest chance of survival.
    *   **(f) Riskiest Failure Modes & Mitigation:**
        *   **1. Fuel shortage cannot be mitigated:** *Mitigation:* Comprehensive contingency plan for patient transfer; manual care protocols; remote medical consultation.
        *   **2. Generator failure:** *Mitigation:* Redundant generators, maintenance crews on standby, pre-positioned spare parts if possible.
        *   **3. Unsafe patient transfers:** *Mitigation:* Dedicated medical transport teams; secure transfer routes; clear protocols for patient stability during transfer.

3.  **Political Protest De-escalation (24-hour clock)**
    *   **Decision:** Dispatch senior government officials to engage protest leaders. Deploy police for a visible, non-confrontational presence to protect critical infrastructure and maintain order, but avoid confrontation.
    *   **Allocation:**
        *   **Police:** 100 officers (50% of force) for protest monitoring, de-escalation, and static security around critical sites (hospital, shelters, government buildings). A 20-officer rapid response team held in reserve.
        *   **Government Officials:** Mayor/Chief Adviser to lead dialogue efforts.
        *   **Communications:** Radio, social media, direct engagement.
    *   **(a) Key Assumptions & Sensitivity:**
        *   **Assumption:** Protest leaders are open to dialogue and understand the existential threat of the hurricane. *Sensitivity:* If leaders are unwilling or unable to control protesters, violence is highly probable, severely impacting hurricane preparations.
        *   **Assumption:** Non-confrontational police presence deters violence. *Sensitivity:* Mismanagement or perceived aggression by police could escalate the situation.
    *   **(b) Expected Lives Saved/Lost:**
        *   **Saved:** Potentially dozens to hundreds by preventing violence, riots, and disruption of emergency services/evacuation routes.
        *   **Lost (without action):** Unpredictable, but 5-20 direct fatalities from clashes, and indirect fatalities from disrupted hurricane response.
    *   **(c) Ethical Principles & Legal Obligations:** **Freedom of Assembly** (within bounds of public safety). **Protection of Life and Property.** **Proportionality** in use of force.
    *   **(d) Communication Strategy:**
        *   **Direct:** Engage protest leaders, acknowledge grievances, emphasize the common threat of the hurricane and the need for unity. Propose a temporary "unity for survival" truce, promising dialogue *after* the storm.
        *   **Public:** Acknowledge "understandable public concerns" while emphasizing the critical need for "unity in the face of the storm." State that "all government efforts are focused on protecting *all* citizens."
    *   **(e) Contingency Triggers & Update Rule:**
        *   **Trigger 1:** Protests turn violent (looting, direct attacks on infrastructure/police). *Update:* Police switch to defensive mode to protect critical infrastructure and evacuation routes, using minimal nonlethal force if necessary. Consider localized, temporary curfews.
        *   **Trigger 2:** Protest leaders explicitly refuse cooperation. *Update:* Scale back dialogue. Police focus shifts entirely to critical infrastructure protection and maintaining order around shelters/evacuation routes.
    *   **(f) Riskiest Failure Modes & Mitigation:**
        *   **1. Uncontrolled violence:** *Mitigation:* Pre-position police at key sites; clear communication channels for de-escalation; strict rules of engagement emphasizing restraint.
        *   **2. Police overreaction:** *Mitigation:* Emphasize non-confrontational training; clear orders for defensive posture only; public accountability mechanisms.
        *   **3. Erosion of public trust:** *Mitigation:* Consistent, honest communication; visibly demonstrating commitment to saving all lives.

4.  **Hurricane Preparedness (72-hour window, beginning NOW)**
    *   **Decision:** Issue immediate, urgent public warnings and mandatory evacuation orders for all identified low-lying and flood-prone districts. Activate all 10,000 shelter capacities.
    *   **Allocation:**
        *   **Police:** 80 officers (40%) for traffic control for evacuation, securing routes, maintaining order in shelters, and assisting vulnerable populations with evacuation.
        *   **Shelters:** Open all 10,000 capacity. Focus on guiding people in; clearly communicate that immediate logistical supply (food/water) is *not guaranteed* beyond minimal emergency rations, so people must bring essentials.
        *   **Fuel:** Allocate 24 hours of the remaining 36 hours of general fuel for essential evacuation vehicles and initial shelter operations. This leaves 12 hours for critical post-landfall needs.
        *   **Helicopter:** Remaining 2 hours of flight time reserved for critical reconnaissance of flood zones and identifying vulnerable, un-evacuated populations in Phase 2.
    *   **(a) Key Assumptions & Sensitivity:**
        *   **Assumption:** A significant portion of the population will comply with evacuation orders. *Sensitivity:* **Highly sensitive.** Low compliance will lead to overwhelming post-storm rescue needs and dramatically increased fatalities.
        *   **Assumption:** Communications will hold long enough for initial critical warnings. *Sensitivity:* If cellular degrades immediately, reaching a large population becomes extremely difficult.
    *   **(b) Expected Lives Saved/Lost:**
        *   **Saved:** Thousands, potentially tens of thousands, by moving people out of high-risk zones.
        *   **Lost (without action):** Thousands from drowning, structural collapse, injuries.
    *   **(c) Ethical Principles & Legal Obligations:** **Protection of Life** (core government duty). **Equity** in access to information and safe shelter.
    *   **(d) Communication Strategy:**
        *   **Urgent & Frequent:** "Category 4 Hurricane landfall in 72 hours. Immediate mandatory evacuation for [list specific districts]. Go to [list shelters]. Power outages expected. Stay informed via official radio." Use clear, actionable, and authoritative language.
        *   **Detailed Information:** Provide shelter locations, emergency numbers, home securing tips.
        *   **Emphasize Unity:** Frame preparations as a community-wide survival effort.
    *   **(e) Contingency Triggers & Update Rule:**
        *   **Trigger 1:** Hurricane track shifts significantly. *Update:* Revise evacuation orders and shelter assignments immediately.
        *   **Trigger 2:** Low evacuation compliance. *Update:* Intensify communication; involve community leaders; conduct targeted door-to-door checks for vulnerable individuals; use public address systems in at-risk zones.
    *   **(f) Riskiest Failure Modes & Mitigation:**
        *   **1. Mass non-compliance:** *Mitigation:* Persistent, multi-channel communication; community engagement; clear route management by police.
        *   **2. Shelter capacity/conditions:** *Mitigation:* Clearly communicate limited supplies; police presence for order; prioritize sanitation.
        *   **3. Communication breakdown:** *Mitigation:* Redundant broadcast channels (radio, PA systems); pre-recorded messages; physical flyers; door-to-door outreach.

---

**Phase 2: 12-36 Hours (Intensified Preparation & Monitoring)**

**Overall Goal:** Maximize evacuation, secure critical assets, finalize hospital contingencies, monitor refugee and protest situations.

*   **Refugee Situation:** If rescued, focus shifts to medical screening, humanitarian aid (basic provisions from shelter stores), and initial integration into designated shelter section.
*   **Hospital ICU:** Execute the planned fuel extension. Monitor generator performance diligently. Only *begin* high-risk patient transfers if fuel extension fails and no other options remain.
*   **Protests:** Continue dialogue. Police (100 officers + 20 reserve) maintain presence for critical infrastructure protection and evacuation route management. If violence has erupted, prioritize containment and securing safe zones, reiterating the unity message.
*   **Hurricane Preparedness (Heightened Focus):**
    *   **Evacuation:** Intensify and enforce low-lying area evacuations. Use the remaining 2 hours of helicopter time for reconnaissance of flood zones and identifying holdouts or stranded individuals for targeted assistance.
    *   **Critical Infrastructure:** Secure port (move any remaining small craft inland), vital public buildings, communication towers, and other essential services.
    *   **Resource Management:**
        *   **Fuel:** Monitor the remaining 12 hours of general fuel critically. Ration strictly for essential emergency vehicles and the few remaining hours of helicopter operation.
        *   **Police:** Maintain 80 officers on hurricane duty (evacuation, shelters, traffic). Reallocate the 100 officers from protest management to high-risk zones/critical infrastructure protection if protests subside. The 20 reserve officers remain flexible.
        *   **Shelters:** Pre-position any available medical supplies, basic food/water, and sanitation facilities. Coordinate with volunteers.

**Communication Strategy for Phase 2:**
*   **Urgency:** "Less than 48 hours until landfall. Final critical hours for evacuation from [districts]. Shelters are open. All citizens must prepare now."
*   **Specific Instructions:** Reinforce safe behaviors during the storm, shelter guidelines, and what to bring.
*   **Reassurance:** "Emergency services are prepared and working tirelessly. Stay calm, stay safe." Subtly acknowledge other challenges: "Despite simultaneous challenges, our primary focus remains your safety from this storm."
*   **Refugees (if rescued):** A brief update: "Vessel in distress successfully brought to port. Humanitarian aid being provided. Medical teams on site."

---

**Phase 3: 36-72 Hours (Pre-Landfall & Last-Minute Actions)**

**Overall Goal:** Complete evacuations, secure city, brace for impact, position emergency response.

*   **Hospital ICU:** This is the critical juncture. The extended 48-hour fuel supply *must* be active. If not, patient transfers or extreme measures are underway. Continue monitoring generators and patient status.
*   **Refugee Situation:** All rescued refugees should be safely in shelters or receiving medical care. Focus on their well-being within the shelters.
*   **Protests:** If protests persist, police (now 100 officers + 20 reserve, assuming hurricane prep is largely complete) must prioritize securing essential infrastructure and evacuation routes *above all else*. A temporary, limited curfew might be necessary if violence escalates severely, to ensure people are off the streets before the storm hits, explaining the necessity for public safety.
*   **Hurricane Preparedness (Final Push):**
    *   **Evacuation Deadline:** Announce an absolute deadline for evacuation, after which search and rescue operations will be suspended due to unsafe weather.
    *   **Shelter Lock-down:** All shelters secured. No new admissions.
    *   **Critical Personnel:** All emergency responders, medical staff, utility crews to their designated safe locations.
    *   **Fuel:** The remaining 12 hours of general fuel are reserved strictly for *post-storm* emergency response (search & rescue, critical infrastructure checks). Hospital generators are running on their extended 48 hours.
    *   **Police:** 100 officers for static security at critical sites (hospital, power stations, shelters), 80 officers for pre-positioning for immediate post-storm emergency response (search & rescue, damage assessment), 20 in reserve. No helicopter available.
    *   **Port:** All assets moved or secured. The port is now braced for likely damage.
    *   **International Aid:** Reiterate the urgent need for *post-storm* international assistance.

**Communication Strategy for Phase 3:**
*   **Final Call to Action:** "Less than 24 hours until impact. THIS IS THE FINAL CALL FOR SAFETY. Everyone MUST be in a secure location NOW. Emergency services will cease operations at [X hour] due to unsafe conditions."
*   **Post-Storm Guidance:** "After the storm, stay indoors until authorities declare it safe. Do not venture out. Listen *only* to official radio broadcasts for updates. We will provide assistance as soon as it is safe to do so."
*   **Acknowledge Challenge:** "This is an unprecedented challenge, but we are resilient. We will get through this together, prioritizing human life above all else."

---

**Riskiest Failure Modes of the Plan & Mitigation:**

1.  **Massive Non-Compliance with Evacuation Orders:**
    *   **Risk:** Thousands stranded, overwhelming post-storm rescue, and high fatalities. Exacerbated by protests or communication failure.
    *   **Mitigation:** Persistent, clear, multi-channel urgent communication. Involve community and religious leaders. Police resources focused on securing and facilitating evacuation routes. Emphasize "unity for survival." Pre-identify vulnerable populations for targeted checks (though resource-limited).
2.  **Failure to Secure Extended Hospital Fuel/Contingency Plan for ICU Patients:**
    *   **Risk:** Direct loss of 60 lives. Severe blow to public trust and morale, potentially fueling unrest.
    *   **Mitigation:** This is the highest non-hurricane priority. Exhaust all local private/commercial fuel sources. Simultaneously, activate and prepare to execute an urgent, safe patient transfer plan to an alternative site with generator access or comprehensive manual care protocols.
3.  **Protests Escalate to Widespread Violence/Civil Unrest During Preparations or Post-Storm:**
    *   **Risk:** Disrupts evacuation, damages critical infrastructure, diverts police, increases casualties, and impedes recovery.
    *   **Mitigation:** Proactive de-escalation through dialogue, empathetic communication, and clear acknowledgments of grievances. Maintain a non-confrontational police posture during preparations. If violence erupts, prioritize protecting critical assets and securing evacuation routes, being prepared to impose temporary, targeted curfews *only if absolutely necessary* for public safety, with transparent justification.

---

**Plain-Language Justification:**

Our plan is the best course of action under profound uncertainty because it tackles the most immediate and catastrophic threats simultaneously, prioritizing the saving of lives above all else. We are using our extremely limited resources, like the helicopter, to first rescue those facing imminent death at sea, while urgently securing the hospital's lifeline to save vulnerable patients. At the same time, we are working to calm social unrest, so our city can unite against the approaching superstorm, which poses the greatest threat to the largest number of people. By issuing clear, urgent evacuation orders and preparing our shelters, we aim to move thousands out of harm's way. This strategy, though difficult and requiring tough trade-offs, focuses on maximizing survival across all crises, fostering unity, and ensuring we are as prepared as possible for what lies ahead. We will communicate honestly and adapt swiftly, because our shared survival depends on it.

In [None]:
deepseek = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
model_name = "deepseek-chat"

response = deepseek.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
# Updated with the latest Open Source model from OpenAI

groq = OpenAI(api_key=groq_api_key, base_url="https://api.groq.com/openai/v1")
model_name = "openai/gpt-oss-120b"

response = groq.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
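
The DeepSeek, Groq and OpenRouter cells all repeat the same two steps: call `chat.completions.create` and read `choices[0].message.content`. As a minimal sketch, that shared step could be pulled into a helper. The names `ask_model` and `make_stub_client` are hypothetical and not part of the lab; the stub only mimics the shape of an OpenAI-style client so the example runs without an API key or network access.

```python
from types import SimpleNamespace


def ask_model(client, model_name, messages):
    """Send messages to any OpenAI-style client and return the reply text."""
    response = client.chat.completions.create(model=model_name, messages=messages)
    return response.choices[0].message.content


def make_stub_client(reply_text):
    """Hypothetical stand-in that mimics the client.chat.completions.create shape."""

    def create(model, messages):
        # Echo the model name so it is visible which "provider" answered
        message = SimpleNamespace(content=f"[{model}] {reply_text}")
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

    completions = SimpleNamespace(create=create)
    return SimpleNamespace(chat=SimpleNamespace(completions=completions))


demo_messages = [{"role": "user", "content": "Say hello"}]
stub = make_stub_client("hello")
demo_answer = ask_model(stub, "stub-model", demo_messages)
print(demo_answer)
```

With a real client, something like `answer = ask_model(groq, "openai/gpt-oss-120b", messages)` should behave the same as the two lines in the cell above, assuming the provider follows the OpenAI response format.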


In [12]:
openrouter = OpenAI(api_key=openrouter_api_key, base_url="https://openrouter.ai/api/v1")
model_name = "deepseek/deepseek-r1-0528:free"

response = openrouter.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

### Prioritized, Time-Phased Operational Plan  
**Core Strategy:** Prioritize immediate life-saving actions (refugees and hospital), secure public order to enable evacuations, and rigorously conserve fuel/resources. Ensure transparent communications to reduce panic.  

---  
#### **Phase 1: T-72 to T-48 Hours (Refugees, Hospital, Protest De-escalation)**  
1. **Refugee Rescue (Highest Priority)**  
   - **Action:** Deploy helicopter immediately to assess the vessel. Coordinate rescue via port-based craft (e.g., coast guard boats). Evacuate refugees to shelters.  
   - *Resources:* Helicopter (2 hours), 20 police for port security, shelters (200 spots allocated).  
   - **Assumptions:** (a) Vessel can be reached/towed before sinking; (b) refugees are mostly healthy. **Sensitivity:** High. If the vessel sinks (10–20% risk), deaths jump to 150–200.  
   - **Expected Lives Saved:** 180–200 (estimate: 10–20 may perish in rough seas; WHO maritime disaster fatality rates).  
   - **Ethics/Law:** Moral/legal duty to rescue (UNCLOS/SOLAS) outweighs other risks. Triage prioritizes proximate life-or-death scenarios.  
   - **Comms:** Announce: "We are rescuing refugees. Shelters are prepared. Stay clear of port areas."  

2. **Hospital Fuel Conservation**  
   - **Action:** Audit fuel reserves; redirect 70% of diesel to hospital generators. Restrict non-critical hospital services.  
   - *Resources:* Diesel (36 hours reserved for ICU).  
   - **Assumptions:** Generator integrity holds. **Sensitivity:** Low. Failure risks 60 ICU deaths if fuel runs out.  
   - **Lives Saved:** 60 ICU patients secured for 36+ hours.  

3. **Protest De-escalation**  
   - **Action:** Deploy 150 police with nonlethal gear (e.g., tear gas, shields) to separate factions. Open dialogue with protest leaders, offer post-crisis negotiations.  
   - *Resources:* Police (150 officers), social media teams.  
   - **Assumptions:** Protests are politically motivated, not crisis-driven. **Sensitivity:** High. If ignored, violence (10–50 deaths) could erupt.  
   - **Fatalities Avoided:** 15–50 (based on historical urban protest escalation).  
   - **Ethics:** Protect free assembly (ICCPR) while preventing violence; minimal force used.  
   - **Comms:** "Protests must remain peaceful. We hear your concerns—dialogue begins post-storm."  

---  
#### **Phase 2: T-48 to T-24 Hours (Shelter Evacuations, Final Preps)**  
1. **Mass Evacuations**  
   - **Action:** Evacuate 10,000 high-risk residents to shelters. Use remaining helicopter time (2 hours) for inaccessible areas/delivery of critical supplies.  
   - *Resources:* Police (100 officers), shelters (10,000 capacity), diesel (10% for transport).  
   - **Assumptions:** 70% evacuation compliance. **Sensitivity:** Moderate. Low compliance could cause 100–500 flood deaths.  
   - **Lives Saved:** 300–500 (based on Category 4 flood models for low-lying urban areas).  
   - **Comms:** Broadcast every 2 hours: "Flood zones: Evacuate NOW. Shelters have food/water."  

2. **Hospital & Protest Follow-Up**  
   - **Action:** Position 20 police near hospitals to ensure fuel security. Maintain police presence at protests (30 officers). Confirm NGO support for shelter supplies.  
   - **Contingency Trigger:** If protest violence erupts, declare curfew and reclaim 50 police from shelters.  

3. **Port Closure & Fuel Reserve**  
   - **Action:** Close port at T–36 hours; relocate vessels. Stockpile remaining diesel (38 hours).  

---  
#### **Phase 3: T-24 Hours to Landfall (Lockdown and Sustenance)**  
1. **Shelter Order and Supplies**  
   - **Action:** Deploy all police (200) to shelters/hospitals. Ration food/water (e.g., 1,500 kcal/person/day).  
   - *Resources:* Police, shelters.  
   - **Assumptions:** Sheltered population includes refugees and evacuees. **Sensitivity:** High. Riots could occur if supplies dwindle.  
   - **Mitigation:** Transparent rationing updates; isolate agitators.  

2. **Hospital Triage Prep**  
   - **Action:** Designate a shelter clinic for non-ICU patients. If fuel drops below 12-hour reserve, helicopter assesses inland transfer options.  
   - **Lives Saved:** 55–60 ICU patients (5–10 at risk if fuel depletes mid-storm).  

3. **Communications Surge**  
   - **Action:** Broadcast storm alerts hourly. Social media: "Stay indoors. We have this under control."  

---  
#### **Phase 4: Landfall to T+24 Hours (Survival Mode)**  
- **Shelters/Hospital:** Police enforce no-movement orders. Cyclic generator use to conserve fuel.  
- **Contingency:** If ICU power fails, transfer patients to shelters with manual ventilators (if available).  
- **Comms:** Use radio for emergency signals if cellular fails.  

---  
### Key Elements Across All Phases  
**(a) Assumptions & Sensitivities:**  
- **Refugee rescue:** Assumes vessel seaworthiness (~80% probability). If sinks prematurely, lives saved

## For the next cell, we will use Ollama

Ollama runs a local web service that exposes an OpenAI-compatible endpoint,  
and runs models locally using high-performance C++ code.

If you don't have Ollama, install it by visiting https://ollama.com, pressing Download and following the instructions.

After it's installed, you should be able to visit http://localhost:11434 and see the message "Ollama is running".

You might need to restart Cursor (and maybe reboot). Then open a Terminal (control+\`) and run `ollama serve`

Useful Ollama commands (run these in the terminal, or with an exclamation mark in this notebook):

`ollama pull <model_name>` downloads a model locally  
`ollama ls` lists all the models you've downloaded  
`ollama rm <model_name>` deletes the specified model from your downloads
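
The browser check described above can also be done from Python. This is a rough sketch, assuming the default address http://localhost:11434 mentioned earlier; the helper name `ollama_is_running` is made up for illustration and uses only the standard library.

```python
import http.client
import urllib.error
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if the server at base_url replies with the Ollama banner."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as reply:
            body = reply.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, malformed reply or bad URL: treat as "not running"
        return False
    return "Ollama is running" in body


print(ollama_is_running())
```

If this prints `False`, starting the service with `ollama serve` in a terminal, as described above, is the first thing to try.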

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Super important - ignore me at your peril!</h2>
            <span style="color:#ff7800;">The model called <b>llama3.3</b> is FAR too large for home computers - it's not intended for personal computing and will consume all your resources! Stick with the nicely sized <b>llama3.2</b> or <b>llama3.2:1b</b> and if you want larger, try llama3.1 or smaller variants of Qwen, Gemma, Phi or DeepSeek. See <a href="https://ollama.com/models">the Ollama models page</a> for a full list of models and sizes.
            </span>
        </td>
    </tr>
</table>

In [13]:
!ollama pull llama3.2

]11;?\[6n[?2026h[?25l[1Gpulling manifest ⠋ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠙ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠹ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠸ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠼ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠴ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠦ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest ⠧ [K[?25h[?2026l[?2026h[?25l[1Gpulling manifest [K
pulling dde5aa3fc5ff:   0% ▕                  ▏ 8.4 MB/2.0 GB

In [14]:
ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
model_name = "llama3.2"

response = ollama.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

**Prioritized Operational Plan for the Next 72 Hours**

**Hour 0-12:**

a) Assumption: Maximum resource utilization
b) Key Decision: Prioritize refugee rescue over hurricane relief and medical crisis management
c) Triage: Focus on saving as many lives in distress as possible without compromising overall population safety.
d) Communication Strategy: Public broadcasts warning of impending storms, hospital capacity announced, protesters assured police presence will be maintained and protests will be monitored closely; social media updates providing situational awareness to a limited audience.
e) Contingency Triggers:
  - Referral protocols breach (less than 5% of 200 people cannot be rescued).
  - Weather forecast indicates potential severe power outage for more than 36 hours, which may impact hospital capacity beyond crisis point.

f) Riskiest Failure Mode: Inattention could lead to unmet expectations on refugees rescued and increased risk in hospitals facing extended power loss; however, an unexpected hurricane landfall will significantly disrupt logistics.
g) Mitigation Plan:
  - Arrange all available aircrew for rapid port arrival of rescue boat with crew from local emergency agencies if not present during major storms.
  - Increase support staff near the main hospital where there is a concern about being unable to treat critical patients.

**Hour 12-48:**

a) Assumption: All factors will follow forecast probabilities for next part of the plan duration
b) Key Decision: Shift focus toward securing low-lying districts and preparing against hurricane debris, power outages, etc. 
c) Triage: Prioritize hospital capacity at an increased potential risk to critical care patient lives due to electricity supply concerns.
d) Communication Strategy: Continual update warnings for residents to be protected from severe outcomes by government-led instructions; use public media more effectively for information dissemination given the storm situation.

e) Contingency Triggers:
  - Severe damage to port facilities preventing safe boat arrival for rescue purposes.
  - Emergent changes in emergency services protocol from external countries that impact effectiveness of assistance.

f) Riskiest Failure Mode: Loss of hospital supplies under circumstances if no other measures taken can jeopardize safety life and death for patients. 
g) Mitigation Plan:
  - Implement more effective communication systems by local media to keep citizens informed in the event the government system goes down, with focus on key life-saving details.
  - Maintain continuous operations for a higher number of hospital staff at a given cost.

**Hour 48-72:**

a) Assumption: The likelihood of recovery will reduce, due to more information that could be gathered, and some parts may become impossible without external aid or resources from the outside world.
b) Key Decision: Rely on international coordination through international humanitarian efforts and request assistance with limited exposure for this state since no support is guaranteed.
c) Triage: High value given emergency hospitals remaining resourceful despite being downgraded to less critical care level under conditions of possible electricity shortage.

d) Communication Strategy: Public announcements will convey that full assistance has been solicited as well to be more transparent about the situation facing all affected and working for better coordination globally

e) Contingency Triggers:
 - An external aid plan would need adjustments due to any changes requested from an outside authority over available resources on-hand to manage those needs.
-  A severe public protest event is imminent within hours.

f) Riskiest Failure Mode: Disregarding aid support and international collaboration in favor of unilateral rescue strategies could result in additional lives lost as a risk associated with the plan’s lack of coordination across jurisdictions.

g) Mitigation Plan:
  - Provide detailed plans including emergency shelter management for every district affected by flooding to enhance potential recovery planning.
   Be prepared to modify this situation based on whatever change would happen if more data becomes available regarding storm severity, aid availability, protest size and any additional local disasters arising as conditions improve or worsen

In [15]:
# So where are we?

print(competitors)
print(answers)


['gpt-5-nano', 'gemini-2.5-flash', 'deepseek/deepseek-r1-0528:free', 'llama3.2']


In [16]:
# It's nice to know how to use "zip"
for competitor, answer in zip(competitors, answers):
    print(f"Competitor: {competitor}\n\n{answer}")


Competitor: gpt-5-nano

Below is a prioritized, time-phased crisis plan designed for the next 72 hours. It focuses first on preserving life in the hurricane event, then on the vulnerable ICU population in the hospital, the offshore refugees, and the risk of civil unrest, while keeping sheltering and law enforcement within capacity and constrained resources. For clarity, I present: (i) phased actions, (ii) major decisions with subpoints (a)-(f) as requested, (iii) contingency triggers, and (iv) a concluding justification.

Key assumptions guiding the plan (and sensitivity notes)
- The hurricane will make landfall within 72 hours and will flood low-lying districts and disrupt power for an unknown period. If actual flood extent or wind intensity is higher than expected, sheltering importance rises.
- There are 200 refugees in distress offshore; under international law we must attempt rescue, but the helicopter and port constraints limit rescue capacity. If distance/time to the vessel is l
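
One caveat on `zip`: it stops at the shorter of its inputs, so if a model call failed before its answer was appended, the loop above would silently drop the trailing entries. Here is a minimal sketch of a guard. It uses made-up stand-in lists rather than the real `competitors` and `answers`, and the helper name `pair_up` is hypothetical.

```python
def pair_up(names: list[str], replies: list[str]) -> dict[str, str]:
    """Hypothetical helper: map each model name to its reply."""
    # zip() silently truncates to the shorter input, so check lengths first
    if len(names) != len(replies):
        raise ValueError(f"{len(names)} names but {len(replies)} replies")
    return dict(zip(names, replies))


# Stand-in data for illustration only
demo_names = ["model-a", "model-b"]
demo_replies = ["first reply", "second reply"]
paired = pair_up(demo_names, demo_replies)
print(paired["model-b"])
```

On Python 3.10 or later, `zip(names, replies, strict=True)` raises a `ValueError` on mismatched lengths without the manual check.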

In [20]:
# Let's bring this together - note the use of "enumerate"

together = ""
for index, answer in enumerate(answers):
    together += f"# Response from competitor {index+1}\n\n"
    together += answer + "\n\n"
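
If you prefer to avoid repeated `+=`, the same string can be built with `str.join` and `enumerate(..., start=1)`. This is just a sketch of an alternative, using stand-in answers rather than the real `answers` list; the `demo_` names are made up for illustration.

```python
# Stand-in answers for illustration only; the notebook's real list is `answers`
demo_answers = ["First reply", "Second reply"]

# enumerate(..., start=1) gives 1-based competitor numbers directly
demo_together = "".join(
    f"# Response from competitor {number}\n\n{answer}\n\n"
    for number, answer in enumerate(demo_answers, start=1)
)
print(demo_together)
```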

In [21]:
print(together)

# Response from competitor 1

Below is a prioritized, time-phased crisis plan designed for the next 72 hours. It focuses first on preserving life in the hurricane event, then on the vulnerable ICU population in the hospital, the offshore refugees, and the risk of civil unrest, while keeping sheltering and law enforcement within capacity and constrained resources. For clarity, I present: (i) phased actions, (ii) major decisions with subpoints (a)-(f) as requested, (iii) contingency triggers, and (iv) a concluding justification.

Key assumptions guiding the plan (and sensitivity notes)
- The hurricane will make landfall within 72 hours and will flood low-lying districts and disrupt power for an unknown period. If actual flood extent or wind intensity is higher than expected, sheltering importance rises.
- There are 200 refugees in distress offshore; under international law we must attempt rescue, but the helicopter and port constraints limit rescue capacity. If distance/time to the vesse

In [22]:
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""
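
A note on the doubled braces in the prompt above: inside an f-string, `{{` and `}}` produce literal `{` and `}`, while single braces interpolate a value. A small sketch with stand-in values (the names `demo_count` and `demo_prompt` are made up for illustration):

```python
# Stand-in value for illustration only
demo_count = 3

# Single braces interpolate; doubled braces become literal braces in the output
demo_prompt = f"""You are judging {demo_count} competitors.
Respond with JSON in this format:
{{"results": ["best competitor number", "second best competitor number"]}}"""

print(demo_prompt)
```

Without the doubling, Python would try to evaluate `"results": [...]` as an expression inside the f-string and fail.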


In [23]:
print(judge)

You are judging a competition between 4 competitors.
Each model has been given this question:

You are the chief crisis adviser to a coastal city-state of 250,000 people facing four simultaneous, interacting emergencies: (1) a category 4 hurricane will make landfall within 72 hours, projected to flood low-lying districts and cut power for an unknown period; (2) a nearby offshore vessel carrying 200 refugees in distress is signaling for rescue within your territorial waters right now (international law and moral duty require you to attempt rescue, but your resources are limited); (3) political protests in the city have escalated and are likely to turn violent within 24 hours unless de-escalation measures are taken; and (4) the main hospital has 60 intensive-care patients dependent on electricity and has fuel to run generators for at most 36 hours. Your available resources are: one multipurpose helicopter with four hours of flight time, 200 police officers (with limited nonlethal gear an

In [24]:
judge_messages = [{"role": "user", "content": judge}]

In [30]:
# Judgement time!

openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-5-nano",
    messages=judge_messages,
)
results = response.choices[0].message.content
print(results)


{"results": ["1", "2", "3", "4"]}


In [31]:
# OK let's turn this into results!

results_dict = json.loads(results)
ranks = results_dict["results"]
for index, result in enumerate(ranks):
    competitor = competitors[int(result)-1]
    print(f"Rank {index+1}: {competitor}")

Rank 1: gpt-5-nano
Rank 2: gemini-2.5-flash
Rank 3: deepseek/deepseek-r1-0528:free
Rank 4: llama3.2
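
The parsing cell above assumes the judge returns clean JSON that ranks every competitor exactly once. Models do not always comply, so here is a minimal sketch of a more defensive parser. The helper name `parse_ranking` is hypothetical, and the markdown-fence handling is an assumption about one common failure mode, not something observed in this run.

```python
import json


def parse_ranking(raw: str, n_competitors: int) -> list[int]:
    """Hypothetical helper: turn the judge's reply into zero-based indices."""
    fence = "`" * 3
    text = raw.strip()
    # Tolerate a reply wrapped in a markdown code fence despite the instructions
    if text.startswith(fence):
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]
    ranks = [int(r) - 1 for r in json.loads(text)["results"]]
    # Every competitor should appear exactly once
    if sorted(ranks) != list(range(n_competitors)):
        raise ValueError(f"Judge did not rank every competitor exactly once: {ranks}")
    return ranks


# The reply printed by the judgement cell above
demo_ranks = parse_ranking('{"results": ["1", "2", "3", "4"]}', 4)
print(demo_ranks)
```

You could then loop over the returned indices and look up `competitors[index]`, as the cell above does.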


<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/exercise.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Exercise</h2>
            <span style="color:#ff7800;">Which pattern(s) did this use? Try updating this to add another Agentic design pattern.
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">These kinds of patterns - sending a task to multiple models and evaluating the results -
            are common where you need to improve the quality of your LLM response. This approach can be applied
            to many business projects where accuracy is critical.
            </span>
        </td>
    </tr>
</table>