## Welcome to the Second Lab - Week 1, Day 3

Today we will call lots of different models! It's a good way to get comfortable with their APIs.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Important point - please read</h2>
            <span style="color:#ff7800;">The way I collaborate with you may be different to other courses you've taken. I prefer not to type code while you watch. Rather, I run Jupyter labs, like this one, and give you an intuition for what's going on. My suggestion is that you carefully run this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations.<br/><br/>If you have time, I'd love it if you submitted a PR for changes in the community_contributions folder - instructions in the resources. Also, if you have a GitHub account, use it to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...
            </span>
        </td>
    </tr>
</table>

In [1]:
# Start with imports - ask ChatGPT to explain any package that you don't know

import os
import json
from dotenv import load_dotenv
from openai import OpenAI
from anthropic import Anthropic
from IPython.display import Markdown, display

In [2]:
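# Always remember to do this! It loads your .env file;
# override=True lets values in .env replace any already set in the environment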
load_dotenv(override=True)

True

In [3]:
# Print the key prefixes to help with any debugging

openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key not set (and this is optional)
Google API Key not set (and this is optional)
DeepSeek API Key not set (and this is optional)
Groq API Key not set (and this is optional)


In [4]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]

In [5]:
messages

[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]

In [6]:
openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-5-mini",
    messages=messages,
)
question = response.choices[0].message.content
print(question)


Imagine you are an independent expert advising a mid-sized coastal city of 120,000 residents with a $25M annual operating budget, 20% of households below the poverty line, and a 10 km coastline that has experienced three events previously described as "100‑year" storms in the last 15 years; sea‑level rise projections indicate a likely range of 0.5–1.2 m by 2100 and the city's critical infrastructure (water treatment plant, hospital, power substation) is currently within today's 1‑in‑100‑year floodplain—politically, the city council is divided (two‑thirds oppose large spending on relocation) and many residents distrust government—and you have an initial one‑time resilience fund of $50M and authority to propose reallocations of up to 10% of the annual budget for the next 10 years: design a prioritized, actionable 10‑year coastal resilience strategy that (A) minimizes expected loss of life and critical service disruption over the next 50 years under the sea‑level uncertainty above, (B) ad

In [7]:
competitors = []
answers = []
messages = [{"role": "user", "content": question}]

## Note - update since the videos

I've updated the model names below to use more recent models, from the GPT 5 and Claude Sonnet 4.5 families. It's worth noting that these models can be quite slow (around 1-2 minutes per call), but they do a great job! Feel free to switch them for faster models if you'd prefer, like the ones I use in the video.
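
If you'd like to see for yourself how long each call takes, here is a minimal sketch of a timing wrapper. The names `timed` and `fake_call` are hypothetical and not part of this lab; `fake_call` simply stands in for a slow API call so that the example runs without an API key.

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Hypothetical stand-in for a slow API call. In this notebook you could
# instead pass openai.chat.completions.create with model= and messages=.
def fake_call(prompt):
    time.sleep(0.05)
    return f"echo: {prompt}"

result, seconds = timed(fake_call, "hello")
print(f"{result!r} took {seconds:.2f}s")
```

In the lab itself, something like `response, seconds = timed(openai.chat.completions.create, model=model_name, messages=messages)` should work the same way, assuming the client has been created as in the cells here.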

In [8]:
# The API we know well
# I've updated this with the latest model, but it can take some time because it likes to think!
# Replace the model with gpt-4.1-mini if you'd prefer not to wait 1-2 mins

model_name = "gpt-5-nano"

response = openai.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Below is a concise, actionable 10-year coastal resilience strategy tailored to a 120,000-resident coastal city with a $25M annual operating budget, 20% poverty, a 10 km coastline, frequent high-risk events, and the political/operational constraints you described. All figures are order-of-magnitude estimates intended to guide planning, with explicit assumptions and calculations shown at the end. Interventions are prioritized to protect lives and critical services first, advance equity, and remain politically feasible by balancing near-term gains with voluntary, phased actions.

Assumptions (key starting points used in calculations)
- Sea-level rise scenario: 0.5–1.2 m by 2100; near-term risk rising gradually; three “100-year” storms in 15 years indicates high annual hazard and clustered event risk.
- Baseline risk (current conditions): annualized expected loss (AEL) from coastal hazards approximated at $7.5M/year citywide (damages, business interruption, and service outages); probability of a critical service failure during a major storm ≈ 22% per major storm event.
- Critical-infrastructure exposure: water treatment plant, hospital, and power substation currently in today’s 1-in-100-year floodplain; thus high leverage to harden or relocate.
- Budget constraints: initial one-time resilience fund = $50M; ability to reallocate up to 10% of the annual operating budget = $2.5M/year (for 10 years).
- Equity focus: 20% of households in poverty; low-income neighborhoods disproportionately exposed; resilience plan should directly benefit these communities (e.g., through protection of essential services, housing moves that avoid displacement, subsidized insurance, and community engagement).
- Political feasibility: avoid mass relocation mandates; emphasize voluntary, phased relocation paired with path-dependent protections (hardening, natural defenses, better planning) and transparent engagement.

A. Interventions: concise descriptions, costs, risk reductions, risks, data needs, and monitoring metrics
1) Intervention A — Critical Infrastructure Floodproofing and Redundancy (floodproof, elevate, and harden water, hospital, and power substation; plus backup power and redundancies)
- a) One-sentence description: Harden and floodproof the city’s three critical facilities, with elevated or watertight equipment rooms, flood barriers, and reliable back-up power to ensure essential services operate through a major flood.
- b) 10-year cost and allocations:
  - Initial from $50M: $12.0M
  - Annual budget reallocation: $0.60M/year for 6 years (total $3.6M)
  - Total 10-year cost: $15.6M
  - Allocation mix: 12.0 + 3.6 = 15.6; distributed within the 6-year window after Year 0
- c) Approximate risk reductions:
  - Expected annualized loss (AEL) reduction: from $7.5M to ≈ $4.5M (≈ 40% reduction; saves about $3.0M per year)
  - Reduction in probability of critical service failure during a major storm: from 22% to ≈ 11% (≈ 50% relative reduction)
- d) Key risks/trade-offs: high upfront capital needs; potential construction disruption during execution; requires long-term maintenance; performance depends on design to meet uncertainty of flood depths; avoids relocation but could be overwhelmed by extreme events beyond design specs.
- e) Three highest-value data points to collect first:
  - Facility-specific flood depth and duration exposure (methods: hydrodynamic/flood-model simulations; timeframe: 0–12 months; cost: ~$0.3M)
  - Reliability and redundancy assessments for backup power and critical equipment (methods: outage and maintenance data, red/green-zone drills; timeframe: 6–12 months; cost: ~$0.15M)
  - Post-construction performance testing under staged flood scenarios (methods: shop/field tests, independent peer review; timeframe: 6–12 months; cost: ~$0.15M)
- f) Monitoring metrics and pivot triggers:
  - Metric 1: AEL attributable to critical infra after project completion; threshold to trigger pivot: if AEL reduction < 28% within 6 years, reassess design/scope
  - Metric 2: Probability of critical service failure during a modeled major storm; threshold: fail rate > 12% after 7 years; if so, re-evaluate protection levels or add additional redundancies

2) Intervention B — Evacuation Planning and Public Alert System (enhanced evacuation routes, shelters, automated warnings)
- a) One-sentence description: Develop scalable, equity-focused evacuation plans with staged routes, clearly identified shelters, and robust, timely public alert and information systems.
- b) 10-year cost and allocations:
  - Initial: $4.0M
  - Annual reallocations: $0.30M/year for 10 years
  - Total 10-year cost: $7.0M
  - Allocation mix: initial 4.0; ongoing 3.0 over 10 years
- c) Risk reductions:
  - AEL reduction: ≈ $0.7–$0.9M/year (≈ 9–12% reduction)
  - Probability of critical service failure: minor direct effect on service facilities, but substantial indirect effect by reducing casualties and demand shock; modeled reduction ≈ 2–4 percentage points
- d) Risks/trade-offs: requires sustained public trust; must reach and be usable by low-literacy and non-English-speaking residents; shelters must be flood-safe and accessible; ongoing maintenance costs
- e) Data points to collect:
  - Evacuation time and shelter capacity utilization (methods: tabletop and full-scale drills; timeframe: 12–18 months; cost: ~$0.2M)
  - Population at risk and travel-time to shelters, by neighborhood (methods: GIS analysis, household surveys; timeframe: 6–12 months; cost: ~$0.15M)
  - Messaging effectiveness and accessibility (methods: social-media analytics, multilingual communications testing; timeframe: 6–12 months; cost: ~$0.05M)
- f) Monitoring metrics:
  - Metric 1: Time to evacuate critical neighborhoods to safety; threshold: average evacuation time > 2x target; trigger within 2–3 years
  - Metric 2: Shelter occupancy rate and shelter access equity (share of vulnerable populations served); threshold: <90% target occupancy in key districts; trigger if <80% consistently for two drills

3) Intervention C — Managed Relocation and Phased Relocation (voluntary buyouts for the highest-risk areas; phased relocation with affordable replacement housing)
- a) One-sentence description: Implement voluntary, phased relocation for the most flood-vulnerable residences and public facilities, prioritizing low-income areas, with buyouts and replacement housing in safer locations.
- b) 10-year cost and allocations:
  - Initial: $7.0M
  - Annual reallocations: $0.40M/year for 5 years
  - Total 10-year cost: $9.0M
  - Allocation mix: initial 7.0; ongoing 2.0
- c) Risk reductions:
  - AEL reduction: ~ $1.0–$2.0M/year (13–27%), depending on how many structures are relocated
  - Probability of critical service failure: reduction of 5–10 percentage points in high-risk corridors; overall system risk reduces as critical facilities are moved farther from exposure
- d) Risks/trade-offs: political sensitivity; potential displacement concerns; requires fair buyout pricing, affordable replacement housing, and strong housing market stabilization; must ensure social equity and adequate transition services
- e) Data points to collect:
  - Property exposure and buyout feasibility in target zones (methods: parcel-level risk mapping; timeframe: 9–12 months; cost: ~$0.25M)
  - Replacement housing supply and affordability analysis (methods: market analysis and developer partnerships; timeframe: 12–24 months; cost: ~$0.15M)
  - Community acceptance and relocation willingness surveys (methods: random-sample surveys; timeframe: 6–12 months; cost: ~$0.05M)
- f) Monitoring metrics:
  - Metric 1: Percentage of at-risk housing units relocated or offered buyouts; threshold: 15% of identified high-risk units within 6–7 years
  - Metric 2: Time-to-relocation milestones (average months per household) with threshold: average relocation time > 18 months triggers re-plan

4) Intervention D — Green Infrastructure and Nature-based Defenses (wetland restoration, dune systems, permeable surfaces, rain gardens)
- a) One-sentence description: Deploy nature-based defenses to attenuate flood peaks, absorb rainfall, and restore coastal ecosystems as a long-term buffer against sea-level rise.
- b) 10-year cost and allocations:
  - Initial: $6.0M
  - Annual reallocations: $0.50M/year for 6 years
  - Total 10-year cost: $9.0M
  - Allocation mix: initial 6.0; ongoing 3.0
- c) Risk reductions:
  - AEL reduction: ≈ $0.6–$1.2M/year (8–16%)
  - Probability of critical service failure: small reductions (1–5 percentage points) through reduced flooding exposure to nearby facilities and lower inland travel disruption
- d) Risks/trade-offs: effectiveness depends on scale; maintenance costs and land-use constraints; potential land ownership and permitting complexities
- e) Data points to collect:
  - Hydrological performance of restored wetlands and green spaces (methods: before-after monitoring; timeframe: 12–24 months; cost: ~$0.20M)
  - Infiltration and drainage performance in urban areas (methods: green infrastructure monitoring; timeframe: 12–18 months; cost: ~$0.15M)
  - Habitat value and community benefits (methods: ecological surveys and resident surveys; timeframe: 12–24 months; cost: ~$0.05M)
- f) Monitoring metrics:
  - Metric 1: Area of floodplain restored and functional drainage capacity; threshold: 20–25% of target area achieved by year 7
  - Metric 2: Peak flood height reduction and street-level inundation changes; threshold: measureable decline in inundation depth in target corridors by year 5

5) Intervention E — Coastal Hard Defenses and Beach/ dune stabilization (where feasible near critical lines)
- a) One-sentence description: Construct or stabilize dune systems and, where appropriate, seawalls or revetments in targeted areas to reduce wave run-up and inland flooding.
- b) 10-year cost and allocations:
  - Initial: $5.0M
  - Annual reallocations: $0.50M/year for 6 years
  - Total 10-year cost: $8.0M
  - Allocation mix: initial 5.0; ongoing 3.0
- c) Risk reductions:
  - AEL reduction: ≈ $2.0–$3.0M/year (26–40%)
  - Probability of critical service failure: reduction by ~8–12 percentage points in protected zones
- d) Risks/trade-offs: ecological and aesthetic considerations; maintenance costs; potential displacement concerns if protective works increase risk to other areas (perceived inequity)
- e) Data points to collect:
  - Sediment transport and dune stability data (methods: sediment sampling and lidar; timeframe: 12–24 months; cost: ~$0.25M)
  - Wave run-up and flood depth modeling with defense in place (methods: hydrodynamic models; timeframe: 6–12 months; cost: ~$0.25M)
  - Public acceptance and ecosystem impacts (methods: stakeholder engagement, ecology surveys; timeframe: 6–12 months; cost: ~$0.05M)
- f) Monitoring metrics:
  - Metric 1: Inundation depth beneath the defenses during modeled storm events; threshold: reduce modeled inundation by at least 0.5–1.0 m in targeted zones by year 5
  - Metric 2: Maintenance compliance rate and annual repair costs; threshold: maintenance budget within plan and no cost overrun >15% in any year

6) Intervention F — Zoning and Permitting Changes (risk-aware land-use planning; reduced development in floodplains; updated building codes)
- a) One-sentence description: Update zoning, permitting, and building code standards to reduce future exposure and encourage protective measures for new development and rebuilds.
- b) 10-year cost and allocations:
  - Initial: $0.5M
  - Annual reallocations: $0.10M/year for 5 years
  - Total 10-year cost: $1.5M
  - Allocation mix: initial 0.5; ongoing 1.0
- c) Risk reductions:
  - AEL reduction: modest, ~$0.3–$0.6M/year (4–8%)
  - Probability of critical service failure: small reductions (1–3 percentage points) by limiting exposure growth
- d) Risks/trade-offs: potential development slowdown; political pushback; ensure equity by protecting vulnerable neighborhoods from inadvertent disinvestment
- e) Data points to collect:
  - Floodplain mapping accuracy and updated risk zones (methods: LiDAR, hydrological models; timeframe: 6–12 months; cost: ~$0.15M)
  - Building code compliance and enforcement capacity (methods: plan review metrics; timeframe: 6–12 months; cost: ~$0.05M)
  - Housing market impact and displacement risk estimates (methods: market analysis; timeframe: 9–15 months; cost: ~$0.05M)
- f) Monitoring metrics:
  - Metric 1: Percentage of new developments within high-risk zones that adopt enhanced flood protection; threshold: 90% compliance by year 7
  - Metric 2: Number of at-risk dwellings insulated or elevated due to updated codes; threshold: 1,000+ units by year 7

7) Intervention G — Insurance Subsidies and Risk Transfer (subsidies or premium assistance to encourage uptake of flood insurance and risk-transfer products)
- a) One-sentence description: Expand and subsidize flood insurance options and risk-transfer tools to households and small businesses in high-risk areas, with outreach to low-income communities.
- b) 10-year cost and allocations:
  - Initial: $3.0M
  - Annual reallocations: $0.25M/year for 6 years
  - Total 10-year cost: $4.5M
  - Allocation mix: initial 3.0; ongoing 1.5
- c) Risk reductions:
  - AEL reduction: ≈ $0.4–$1.0M/year (5–13%), depending on uptake
  - Probability of critical service failure: potential indirect effect via better funding for floodproofing and quick recovery; modest 1–3 percentage points reduction after 6–7 years
- d) Risks/trade-offs: subsidies can create moral hazard if not paired with risk-reduction measures; needs program integrity and affordability for low-income groups
- e) Data points to collect:
  - Uptake rates by income group and neighborhood (methods: enrollment data; timeframe: 6–12 months; cost: ~$0.05M)
  - Claim timing and average payout (methods: insurer data sharing; timeframe: annual; cost: ~$0.05M)
  - Pre/post resilience investments and insurance compatibility (methods: cross-reference with physical-protection investments; timeframe: 12–24 months; cost: ~$0.05M)
- f) Monitoring metrics:
  - Metric 1: Coverage rate among households in flood-prone zones; threshold: 75% of eligible households
  - Metric 2: Average claim processing time and payout reliability; threshold: median processing time < 30 days; trigger if > 60 days for more than 3 events/year

8) Intervention H — Community Engagement and Equity Programs (outreach to vulnerable residents; co-design of resilience actions)
- a) One-sentence description: Engage with residents, especially low-income neighborhoods, to co-design risk-reduction measures, ensure access to information, and build local capacity for resilience actions.
- b) 10-year cost and allocations:
  - Initial: $1.5M
  - Annual reallocations: $0.20M/year for 6 years
  - Total 10-year cost: $2.7M
  - Allocation mix: initial 1.5; ongoing 1.2
- c) Risk reductions:
  - AEL reduction: modest, ≈ $0.2–$0.5M/year (3–7%)
  - Probability of critical service failure: small reductions through better utilization of local assets, community-first planning; ≈ 1–2 percentage points
- d) Risks/trade-offs: requires sustained trust; may uncover competing interests; ensure transparent governance and clear benefit sharing
- e) Data points to collect:
  - Community risk perception and needs assessment results (methods: participatory sessions; timeframe: 6–12 months; cost: ~$0.15M)
  - Local capacity and action plan uptake (methods: community surveys and action-tracking; timeframe: 12–18 months; cost: ~$0.10M)
  - Participation metrics and equity outcomes (methods: attendance, co-design outputs; timeframe: ongoing; cost: ~$0.05M)
- f) Monitoring metrics:
  - Metric 1: Share of at-risk households directly engaged in planning and decision-making processes; threshold: ≥40% of targeted neighborhoods
  - Metric 2: Implementation rate of resident-driven resilience actions; threshold: ≥50 actionable items in year 5

B. Consolidated 10-year budget and timeline (sequencing, dependencies, and explicit calculations)
- Summary of initial allocations (one-time from the $50M fund)
  - A: 12.0M
  - B: 4.0M
  - C: 7.0M
  - D: 6.0M
  - E: 5.0M
  - F: 0.5M
  - G: 3.0M
  - H: 1.5M
  - Total initial investments: 39.5M
- Summary of planned annual reallocations (up to $2.5M/year allowable)
  - A: 0.60M/year for 6 years = 3.6M
  - B: 0.30M/year for 10 years = 3.0M
  - C: 0.40M/year for 5–6 years = 2.4–2.4M
  - D: 0.50M/year for 6 years = 3.0M
  - E: 0.50M/year for 6 years = 3.0M
  - F: 0.10M/year for 8 years = 0.8M
  - G: 0.25M/year for 6 years = 1.5M
  - H: 0.20M/year for 6 years = 1.2M
  - Total annual reallocations (sum across interventions, averaged per year): ~2.6M–2.9M in peak years; average ≈ 2.0–2.6M/year
- 10-year timeline (high-level sequencing and dependencies)
  - Years 0–1: Complete design and procurement for A, B, and C; begin initial works on C buyouts where willing; begin zoning updates (F) and community engagement (H) groundwork; initiate green infrastructure (D) planning
  - Years 1–2: Implement core A floodproofing modules at the three facilities; deploy first wave of evacuation-system upgrades and alerting; advance D green infrastructure sites; begin soft buyouts in low-wealth, high-exposure blocks; initiate F changes to code and permitting
  - Years 2–4: Scale A protections; expand B evacuation drills and sheltering capacity; begin targeted relocation in C in the most feasible early neighborhoods; implement E dune stabilization where suitable; continue D restoration and maintenance
  - Years 4–6: Accelerate C relocation where voluntary offers are accepted; widen G insurance subsidies to more residents; expand H community-engagement programs; complete F zoning/permitting framework adoption
  - Years 6–7: Evaluate AEL and service-disruption metrics citywide; adjust protections (increase or reallocate funds if targets unmet)
  - Years 7–10: Complete the majority of C relocations; maintain protective measures; reassess risk models, refine plans; use lessons learned to update ordinances and codes
- Dependencies
  - Securing buy-in and ensuring that near-term protective works do not delay relocation options
  - Coordination with hospital, water, and electricity providers for critical-infrastructure work
  - Securing land assembly and affordable housing options for relocation in phase two
  - Ongoing public outreach to maintain trust and minimize fear or misinformation

C. Ambiguities and clarifying questions for the city council (up to 5)
1) Which neighborhoods should be prioritized for relocation buyouts, and what is the target share of at-risk housing to relocate by year 10?
2) What specific performance standards (design flood levels, tidal and storm surge depths) should drive the hardening of critical infrastructure, and how should risk be allocated if extreme events exceed design assumptions?
3) Are there preferred partners (state, federal, or regional) for funding, land acquisition, or buyout programs, and what is the appetite for leveraging grants or municipal debt alongside the $50M fund?
4) What is the acceptable balance between near-term protective investments (hardening, evacuation) and longer-term adaptations (relocation, nature-based defenses) given the political resistance to “large relocation”?
5) How should outcomes be measured for equity-focused components (e.g., which metrics should be used to ensure low-income neighborhoods receive a fair share of protections and relocation options, and what thresholds would trigger an early plan reevaluation)?

Concise calculations, step summaries, and assumptions used
- Baseline risk (AEL) used for planning: $7.5M/year; baseline probability of critical service failure during a major storm: 22% per event
- 10-year budget capacity for annual reallocations: up to $2.5M/year, total up to $25M over 10 years
- Total initial investments from the $50M resilience fund: 39.5M across A–H
- Estimated AEL reductions per intervention are conservative and order-of-magnitude; sector-specific reductions are additive but assume some overlap
- All costs are 2025 USD; inflation not applied; maintenance costs beyond the 10-year window not included unless specified
- Equity objective: prioritize interventions with direct, visible benefits to low-income neighborhoods and essential services

If you want, I can re-format these into a compact one-page appendix (with a table-like layout) or tailor the cost ranges to specific project bids or grant opportunities, and adjust for a more aggressive or more conservative funding profile.

In [None]:
# Anthropic has a slightly different API, and max_tokens is required

model_name = "claude-sonnet-4-5"

claude = Anthropic()
response = claude.messages.create(model=model_name, messages=messages, max_tokens=1000)
answer = response.content[0].text

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
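# Gemini offers an OpenAI-compatible endpoint, so we reuse the OpenAI client with a different base_url
gemini = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")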
model_name = "gemini-2.5-flash"

response = gemini.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
# DeepSeek is also OpenAI-compatible: same client, different base_url and key
deepseek = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
model_name = "deepseek-chat"

response = deepseek.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
# Updated with the latest open-source model from OpenAI, served here through Groq

groq = OpenAI(api_key=groq_api_key, base_url="https://api.groq.com/openai/v1")
model_name = "openai/gpt-oss-120b"

response = groq.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
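
The Gemini, DeepSeek and Groq cells all repeat the same steps: call the model, pull out the text, then record the model name and the answer. As a rough sketch, that pattern could be wrapped in one helper. The function `ask_and_record` and the fake client below are hypothetical names used only for illustration. This only applies to clients that follow the OpenAI chat completions shape; the Anthropic client returns its text differently, as shown in its own cell.

```python
from types import SimpleNamespace

def ask_and_record(client, model_name, messages, competitors, answers):
    """Query an OpenAI-compatible client, then record the model name and its answer."""
    response = client.chat.completions.create(model=model_name, messages=messages)
    answer = response.choices[0].message.content
    competitors.append(model_name)
    answers.append(answer)
    return answer

# Hypothetical fake client so the sketch runs without any API key.
# It mimics only the attributes the helper touches.
def _fake_create(model, messages):
    content = f"{model} says: {messages[-1]['content']}"
    message = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])

fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=_fake_create))
)

demo_competitors, demo_answers = [], []
ask_and_record(
    fake_client,
    "demo-model",
    [{"role": "user", "content": "Hi"}],
    demo_competitors,
    demo_answers,
)
print(demo_competitors, demo_answers)
```

With a real client, a call such as `ask_and_record(groq, "openai/gpt-oss-120b", messages, competitors, answers)` would be expected to behave like the cell above, minus the `display(Markdown(answer))` step.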


## For the next cell, we will use Ollama

Ollama runs a local web service that exposes an OpenAI-compatible endpoint,  
and runs models locally using high-performance C++ code.

If you don't have Ollama, install it by visiting https://ollama.com, pressing Download, and following the instructions.

After it's installed, you should be able to visit http://localhost:11434 and see the message "Ollama is running".

You might need to restart Cursor (and maybe reboot). Then open a Terminal (control+\`) and run `ollama serve`.

Useful Ollama commands (run these in the terminal, or with an exclamation mark in this notebook):

`ollama pull <model_name>` downloads a model locally  
`ollama ls` lists all the models you've downloaded  
`ollama rm <model_name>` deletes the specified model from your downloads
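
The browser check described above (visiting http://localhost:11434 and looking for "Ollama is running") can also be done from Python. This is a minimal sketch using only the standard library; the function name `ollama_is_running` and its `opener` parameter are hypothetical, added so that the example can be tried with a fake server response.

```python
import io
import urllib.request

def ollama_is_running(base_url="http://localhost:11434", timeout=2.0,
                      opener=urllib.request.urlopen):
    """Return True if the page at base_url contains the Ollama status message."""
    try:
        with opener(base_url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except OSError:
        # Covers connection refused, timeouts and urllib's URLError
        return False
    return "Ollama is running" in body

# Hypothetical fake openers so the sketch runs without Ollama installed
def fake_opener(url, timeout):
    return io.BytesIO(b"Ollama is running")

def refusing_opener(url, timeout):
    raise ConnectionRefusedError("nothing listening")

print(ollama_is_running(opener=fake_opener))
print(ollama_is_running(opener=refusing_opener))
```

Calling `ollama_is_running()` with no arguments checks the real local service; if it returns `False`, try `ollama serve` in a terminal as described above.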

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Super important - ignore me at your peril!</h2>
            <span style="color:#ff7800;">The model called <b>llama3.3</b> is FAR too large for home computers - it's not intended for personal computing and will consume all your resources! Stick with the nicely sized <b>llama3.2</b> or <b>llama3.2:1b</b> and if you want larger, try llama3.1 or smaller variants of Qwen, Gemma, Phi or DeepSeek. See <a href="https://ollama.com/models">the Ollama models page</a> for a full list of models and sizes.
            </span>
        </td>
    </tr>
</table>

In [None]:
!ollama pull llama3.2

In [9]:
ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
model_name = "llama3.2"

response = ollama.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

**10-Year Coastal Resilience Strategy**

**A. Minimize Expected Loss of Life and Critical Service Disruption**

1. Floodproofing critical infrastructure: Estimated cost - $15M ($50M * 30%), Annual budget reallocation - $1.25M (5%).
   - Reduction in expected annualized loss: -$300,000 (-$2 million * 0.015); Probability of critical service failure: -3.4% (from 19.8% to 16.4%).

2. Green infrastructure (e.g., sea walls with plantings): Estimated cost - $12M ($50M * 24%), Annual budget reallocation - $1 million (4%).
   - Reduction in expected annualized loss: -$600,000 (-$3 million * 0.02); Probability of critical service failure: -10.6% (from 28.9% to 18.3%).

3. Evacuation planning and education programs: Estimated cost - $1M ($50M * 2%), Annual budget reallocation - $100,000 (0.4%).
   - Reduction in expected annualized loss: -$5,000 (-$200,000 * 0.025); Probability of critical service failure: -0.6% (from 10%.

4. Community engagement and outreach programs: Estimated cost - $1M ($50M * 2%), Annual budget reallocation - $100,000 (0.4%).
   - Reduction in expected annualized loss: -$5,000 (-$200,000 * 0.025); Probability of critical service failure: -0.6% (from 10%.

**B. Address Equity for Low-Income Neighborhoods**

1. Managed retreat (e.g., purchasing land to relocate residents): Estimated cost - $20M ($50M * 40%), Annual budget reallocation - $2 million (8%).
   - Reduction in expected annualized loss: -$600,000 (-$3 million * 0.02); Probability of critical service failure: -16.4% (from 28.9%).

2. Affordable housing and relocation programs: Estimated cost - $10M ($50M * 20%), Annual budget reallocation - $1 million (4%).
   - Reduction in expect

In [10]:
# So where are we?

print(competitors)
print(answers)


['gpt-5-nano', 'llama3.2']


In [11]:
# It's nice to know how to use "zip"
for competitor, answer in zip(competitors, answers):
    print(f"Competitor: {competitor}\n\n{answer}")


Competitor: gpt-5-nano

Below is a concise, actionable 10-year coastal resilience strategy tailored to a 120,000-resident coastal city with a $25M annual operating budget, 20% poverty, a 10 km coastline, frequent high-risk events, and the political/operational constraints you described. All figures are order-of-magnitude estimates intended to guide planning, with explicit assumptions and calculations shown at the end. Interventions are prioritized to protect lives and critical services first, advance equity, and remain politically feasible by balancing near-term gains with voluntary, phased actions.

Assumptions (key starting points used in calculations)
- Sea-level rise scenario: 0.5–1.2 m by 2100; near-term risk rising gradually; three “100-year” storms in 15 years indicates high annual hazard and clustered event risk.
- Baseline risk (current conditions): annualized expected loss (AEL) from coastal hazards approximated at $7.5M/year citywide (damages, business interruption, and serv

In [12]:
# Let's bring this together - note the use of "enumerate"

together = ""
for index, answer in enumerate(answers):
    together += f"# Response from competitor {index+1}\n\n"
    together += answer + "\n\n"
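
If `zip` and `enumerate` are new to you, here is a minimal sketch using toy data. The model names and answers below are hypothetical stand-ins, not output from this lab. It builds the same kind of combined string with `enumerate(..., start=1)` and `"".join`, and guards against the fact that `zip` silently stops at the shorter list.

```python
# Toy stand-ins for the notebook's lists (hypothetical values)
competitors = ["model-a", "model-b"]
answers = ["First answer", "Second answer"]

# zip stops at the shorter input without complaint, so check lengths first
if len(competitors) != len(answers):
    raise ValueError("competitors and answers must be the same length")

# zip pairs items by position
pairs = list(zip(competitors, answers))

# enumerate(start=1) numbers items from 1, so there is no need for index+1
sections = [
    f"# Response from competitor {number}\n\n{answer}\n\n"
    for number, answer in enumerate(answers, start=1)
]
together = "".join(sections)

print(pairs)
print(together)
```

Building a list and joining it once is a common alternative to repeated `+=` on a string; for two or three answers either approach is fine.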

In [13]:
print(together)

# Response from competitor 1

Below is a concise, actionable 10-year coastal resilience strategy tailored to a 120,000-resident coastal city with a $25M annual operating budget, 20% poverty, a 10 km coastline, frequent high-risk events, and the political/operational constraints you described. All figures are order-of-magnitude estimates intended to guide planning, with explicit assumptions and calculations shown at the end. Interventions are prioritized to protect lives and critical services first, advance equity, and remain politically feasible by balancing near-term gains with voluntary, phased actions.

Assumptions (key starting points used in calculations)
- Sea-level rise scenario: 0.5–1.2 m by 2100; near-term risk rising gradually; three “100-year” storms in 15 years indicates high annual hazard and clustered event risk.
- Baseline risk (current conditions): annualized expected loss (AEL) from coastal hazards approximated at $7.5M/year citywide (damages, business interruption, an

In [14]:
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""
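
A note on the doubled braces in the prompt above: inside an f-string, `{{` and `}}` produce literal `{` and `}`, while single braces substitute a value. Here is a small sketch with a hypothetical `count` variable to show the difference.

```python
import json

count = 2  # hypothetical number of competitors

# {count} is substituted; {{ and }} come out as literal braces
template = f"""There are {count} competitors.
Respond with JSON like this:
{{"results": ["1", "2"]}}"""

print(template)

# The last line of the rendered prompt is itself valid JSON
example = json.loads(template.splitlines()[-1])
print(example["results"])
```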


In [15]:
print(judge)

You are judging a competition between 2 competitors.
Each model has been given this question:

Imagine you are an independent expert advising a mid-sized coastal city of 120,000 residents with a $25M annual operating budget, 20% of households below the poverty line, and a 10 km coastline that has experienced three events previously described as "100‑year" storms in the last 15 years; sea‑level rise projections indicate a likely range of 0.5–1.2 m by 2100 and the city's critical infrastructure (water treatment plant, hospital, power substation) is currently within today's 1‑in‑100‑year floodplain—politically, the city council is divided (two‑thirds oppose large spending on relocation) and many residents distrust government—and you have an initial one‑time resilience fund of $50M and authority to propose reallocations of up to 10% of the annual budget for the next 10 years: design a prioritized, actionable 10‑year coastal resilience strategy that (A) minimizes expected loss of life and c

In [20]:
judge_messages = [{"role": "user", "content": judge}]

In [21]:
# Judgement time!

openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-5-mini",
    messages=judge_messages,
)
results = response.choices[0].message.content
print(results)


{"results": ["1", "2"]}


In [None]:
# Judgement time with Ollama!
# This needs an Ollama server running locally with the llama3.2 model available

ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
response = ollama.chat.completions.create(
    model="llama3.2",
    messages=judge_messages,
)
results_ollama = response.choices[0].message.content
print(results_ollama)
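
Once you have more than one judge, you may want to combine their rankings. One simple option (an assumption here, not something this lab prescribes) is a Borda count: each judge gives n-1 points to first place down to 0 for last, and the points are summed. The judge replies below are hypothetical strings in the same JSON format the prompt asks for.

```python
import json
from collections import defaultdict

def borda_scores(rankings, n):
    """Sum n-1 points for first place down to 0 for last, across all judges."""
    scores = defaultdict(int)
    for ranking in rankings:
        for position, competitor_number in enumerate(ranking):
            scores[competitor_number] += n - 1 - position
    return dict(scores)

# Hypothetical raw replies from three judges
judge_replies = [
    '{"results": ["1", "2"]}',
    '{"results": ["2", "1"]}',
    '{"results": ["1", "2"]}',
]

# Convert each reply into a list of 1-based competitor numbers
rankings = [[int(item) for item in json.loads(reply)["results"]] for reply in judge_replies]

scores = borda_scores(rankings, n=2)
overall = sorted(scores, key=lambda number: scores[number], reverse=True)

print(scores)
print(overall)
```

With only two judges, ties are likely, so an odd number of judges or a tie-break rule is worth considering.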


In [19]:
# OK let's turn this into results!

results_dict = json.loads(results)
ranks = results_dict["results"]
for index, result in enumerate(ranks):
    # The judge returns 1-based competitor numbers as strings, so convert to a 0-based list index
    competitor = competitors[int(result)-1]
    print(f"Rank {index+1}: {competitor}")

Rank 1: gpt-5-nano
Rank 2: llama3.2
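
The cell above trusts the judge to return clean JSON. Models sometimes wrap their reply in a markdown code fence or repeat a number, so here is a hedged sketch of a more defensive parser. The helper name `parse_ranking` is hypothetical and not part of any library.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way so the example is easy to embed

def parse_ranking(raw, n):
    """Parse a judge reply into a list of 1-based competitor numbers and validate it."""
    text = raw.strip()
    # Remove a surrounding markdown code fence if the model added one anyway
    fenced = re.match(rf"^{FENCE}(?:json)?\s*(.*?)\s*{FENCE}$", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    ranks = [int(item) for item in json.loads(text)["results"]]
    # Every competitor number from 1 to n must appear exactly once
    if sorted(ranks) != list(range(1, n + 1)):
        raise ValueError(f"Expected each of 1..{n} exactly once, got {ranks}")
    return ranks

print(parse_ranking('{"results": ["1", "2"]}', n=2))
```

In the notebook you could call it as `parse_ranking(results, len(competitors))` and then index into `competitors` as before.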


<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/exercise.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Exercise</h2>
            <span style="color:#ff7800;">Which pattern(s) did this use? Try updating this to add another Agentic design pattern.
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">Patterns like this - sending a task to multiple models, then having another model evaluate the results -
            are common when you need to improve the quality of your LLM responses. The approach applies to most
            business projects where accuracy is critical.
            </span>
        </td>
    </tr>
</table>