## Welcome to the Second Lab - Week 1, Day 3

Today we will work with lots of models! This is a way to get comfortable with APIs.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Important point - please read</h2>
            <span style="color:#ff7800;">The way I collaborate with you may be different to other courses you've taken. I prefer not to type code while you watch. Rather, I execute Jupyter Labs, like this, and give you an intuition for what's going on. My suggestion is that you carefully execute this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations.<br/><br/>If you have time, I'd love it if you submit a PR for changes in the community_contributions folder - instructions in the resources. Also, if you have a GitHub account, use this to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...
            </span>
        </td>
    </tr>
</table>

In [1]:
# Start with imports - ask ChatGPT to explain any package that you don't know

import os
import json
from dotenv import load_dotenv
from openai import OpenAI
# from anthropic import Anthropic
from IPython.display import Markdown, display

In [2]:
# Always remember to do this!
load_dotenv(override=True)

True

In [3]:
# Print the key prefixes to help with any debugging

openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key not set (and this is optional)
Google API Key not set (and this is optional)
DeepSeek API Key not set (and this is optional)
Groq API Key not set (and this is optional)
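The five near-identical `if`/`else` blocks above could be collapsed into a loop. The following is only a sketch: `describe_key` and `KEY_PREFIX_LENGTHS` are hypothetical names introduced here, the prefix lengths are copied from the cell above, and the messages are simplified compared with the originals.

```python
import os

# Prefix lengths copied from the cell above; the dict name is hypothetical.
KEY_PREFIX_LENGTHS = {
    "OPENAI_API_KEY": 8,
    "ANTHROPIC_API_KEY": 7,
    "GOOGLE_API_KEY": 2,
    "DEEPSEEK_API_KEY": 3,
    "GROQ_API_KEY": 4,
}


def describe_key(name, value, prefix_len):
    """Return a short status line that reveals only the first few characters."""
    if value:
        return f"{name} exists and begins {value[:prefix_len]}"
    return f"{name} not set"


for name, prefix_len in KEY_PREFIX_LENGTHS.items():
    print(describe_key(name, os.getenv(name), prefix_len))
```

Printing only a short prefix keeps the full secret out of the notebook output while still showing whether the right kind of key was loaded.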


In [4]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]

In [5]:
messages

[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]
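As the output above shows, `messages` is just a list of dicts, each with a `role` and a `content`. Here is a small sketch of how such a list might grow over a multi-turn exchange; the conversation text is made up for illustration and no API call is made.

```python
# A conversation is an ordered list of {"role", "content"} dicts.
conversation = [{"role": "user", "content": "What is 2 + 2?"}]

# Hypothetical reply, written by hand here rather than returned by a model.
conversation.append({"role": "assistant", "content": "4"})

# A follow-up question is appended in the same way, so the model sees the history.
conversation.append({"role": "user", "content": "And if you double that?"})

for turn in conversation:
    print(f"{turn['role']}: {turn['content']}")
```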

In [6]:
openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-5-mini",
    messages=messages,
)
question = response.choices[0].message.content
print(question)


You are an independent expert advisor to the mayor of a mid-sized city that has just learned its main reservoir is both critically low from an unprecedented three-week drought and contaminated with E. coli; the city has limited emergency funds, a population with large low-income and medically vulnerable groups, aging distribution infrastructure, a politically polarized council, and a nonfunctional water-quality lab — within the next 30 days propose a prioritized, evidence-based action plan that (a) describes immediate lifesaving measures, short-term stabilization (days to weeks), and medium-term recovery steps (weeks to months), (b) quantifies resource needs and likely outcomes and uncertainties for each action (people, time, cost, expected reduction in morbidity/mortality, confidence intervals or qualitative uncertainty levels), (c) explains ethical and legal trade-offs (e.g., rationing, prioritization, use of nonpotable sources) and how you would justify them to the public, (d) lists

In [7]:
competitors = []
answers = []
messages = [{"role": "user", "content": question}]

## Note - update since the videos

I've updated the model names to use the latest models below, like GPT 5 and Claude Sonnet 4.5. It's worth noting that these models can be quite slow - like 1-2 minutes - but they do a great job! Feel free to switch them for faster models if you'd prefer, like the ones I use in the video.
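If you want to see how slow a given model really is before deciding whether to swap it, a small timing wrapper may help. This is a sketch only: `timed_call` is a hypothetical helper, demonstrated here on the built-in `sum` so that it runs without any API key.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Demonstration on a local function; no network call is made here.
result, seconds = timed_call(sum, [1, 2, 3])
print(f"result={result}, took {seconds:.4f}s")
```

In the notebook, the same wrapper could presumably be used as `response, seconds = timed_call(openai.chat.completions.create, model=model_name, messages=messages)`.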

In [8]:
# The API we know well
# I've updated this with the latest model, but it can take some time because it likes to think!
# Replace the model with gpt-4.1-mini if you'd prefer not to wait 1-2 mins

model_name = "gpt-5-nano"

response = openai.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Below is a structured, evidence-based plan you could adapt for a mid-sized city facing a critically low reservoir and E. coli contamination, with a nonfunctional local water-quality lab and limited funds. The plan is organized to cover (a) immediate lifesaving actions, (b) short-term stabilization (days–weeks), and (c) medium-term recovery (weeks–months); (d) resource needs and likely outcomes with uncertainties; (e) ethical/legal trade-offs and public communication; (f) data/tests you should request immediately; (g) failure modes, biases, and political obstacles with mitigation; and (h) objective criteria and triggers to move between phases. Assumptions and key unknowns are stated explicitly.

Assumptions and key unknowns you should treat as inputs to revise the plan
- City size and context: mid-sized city with roughly 100,000–180,000 residents; a significant share are low-income and medically vulnerable (elderly, immunocompromised, children).
- Water system status: main reservoir critically low due to drought; reservoir water is contaminated with E. coli; distribution infrastructure aging; local water-quality lab nonfunctional.
- Fiscal constraints: limited emergency funds; potential access to state/federal emergency assistance but needs clear requests and justification.
- External capacity: external labs and state public health resources are available but may have limited capacity; possible need to rent or deploy mobile testing and mobile treatment units.
- Unknowns with the most impact: exact level and persistence of reservoir contamination, the time required to restore safe supply, the capacity to deploy bottled-water and mobile-treatment assets, and the speed with which alternative safe-water sources can be scaled up.

(a) Immediate lifesaving measures (today through 48–72 hours)
Goal: prevent waterborne morbidity and mortality while you test and stabilize. Actions prioritize vulnerable groups and ensure basic drinking and hygiene needs, with rapid socialization of risks and actions.

1) Declarative actions and public health protections
- Issue an emergency public health declaration and activate the city emergency operations center (EOC) if not already active.
- Establish a centralized, transparent public information portal (24/7 updates) with plain-language guidance on drinking-water safety, boil-water instructions, and where to obtain bottled water and water-distribution sites.
- Create a prioritization framework for water distribution (see section on ethics).

Resource implications
- People: Incident Commander, EOC staff, communications lead, liaison with state/federal partners (3–6 FTEs for the first 2 weeks).
- Time: immediate setup; ongoing daily operations.
- Cost: mostly personnel and communications; minor operational costs at start (estimated $50k–$200k initial for setup, content production, and communications, depending on existing capacity).
- Expected morbidity/mortality impact: high if no action; with declaration and clear messaging, risk of inappropriate use and panic reduces; qualitative uncertainty high to moderate.

2) Boil-water advisory and disinfection guidance
- Issue a city-wide boil-water advisory for all drinking and food-preparation uses at all public and private facilities; publish altitude-adjusted boil time if applicable (usually 1 minute at rolling boil; longer at higher elevations).
- In parallel, provide official, step-by-step guidance for households using point-of-use boiling, chilling, and safe storage.

Resource implications
- People: health department reviewers, public communications, and partner agencies to disseminate guidance; minimal additional staff.
- Time: immediate; ongoing messaging.
- Cost: minimal direct cost; potential cost of translated materials.
- Expected morbidity/mortality impact: high reduction in ingestion of contaminated water; uncertainty moderate (depends on public compliance).

3) Rapid provision of safe drinking water to critical populations
- Distribute bottled water and/or provide on-site treated water (mobile water-delivery and/or water-tank fill stations) prioritized for:
  - Hospitals, clinics, nursing homes, and long-term care facilities.
  - Schools, daycare centers, and emergency shelters.
  - Homeless services and any congregate living facilities.
  - Households without reliable access to safe water (door-to-door or at central distribution points).

Resource implications
- People: logistics teams for water transport and distribution (10–30 individuals per shift across multiple sites); security and traffic management.
- Time: immediate (0–72 hours) for siting distribution points and ramping up deliveries.
- Cost: high but critical; bottled-water delivery could cost roughly $0.40–$0.70 per liter including freight, with a practical daily need of 2–3 liters per person for drinking plus hydration for hygiene; for 100,000 residents, daily cost could be in the range of $80k–$250k/day depending on logistics and volume; 7–14 days of initial stock could total $1–5 million, with higher variability.
- Expected morbidity/mortality impact: substantial reduction in acute diarrheal disease risk when coverage is high; uncertainty depends on uptake and duration of supply.

4) Temporary water treatment and distribution enhancements at the source
- Deploy portable water treatment units (PWTUs) or mobile treatment trucks at distribution points or near the reservoir to provide disinfected water (chlorination, filtration, or UV disinfection per unit capability) while the longer-term solution is arranged.
- Increase residual chlorine monitoring in the distribution system and at key access points; provide guidance on safe chlorine residual targets.

Resource implications
- People: 5–15 operators per PWTU plus supervisors; supply chain coordinators.
- Time: 0–7 days to install and begin operation; continuous during stabilization.
- Cost: equipment rental/leasing, chlorine and consumables, power, and transport; estimate $2–6 million depending on number of units and duration.
- Expected morbidity/mortality impact: lowers risk from non-crystal-clear source water; uncertainty depends on unit effectiveness and deployment coverage.

5) Immediate access to alternative safe-watering options in facilities
- If feasible, set up safe-water stations in shelters and public buildings with on-site chlorination and certified water that meets safety standards for drinking and cooking.
- Develop and implement rapid hygiene stations to reduce disease transmission (handwashing stations with soap, sanitizers) at critical sites.

Resource implications
- People: site managers, sanitation/policy staff, volunteers.
- Time: 1–3 days to establish; ongoing.
- Cost: modest per-site setup; overall costs depend on number of sites (estimate $50k–$500k initially).
- Expected morbidity/mortality impact: reduces secondary transmission; uncertainty moderate.

(Short-term stabilization objectives for the next 2–7 days: maintain safe access to drinking water, minimize exposure to contaminated source, and gather critical data.)

(b) Short-term stabilization (days to weeks)
Goal: stabilize health risk, begin data collection, and begin planning for longer-term recovery; prevent service gaps and equity issues.

1) Expand and optimize bottled-water access and alternative safe-water routes
- Scale up distribution capacity (more sites, mobile delivery, 24/7 operations as needed) to reach all neighborhoods, with a focus on high-poverty areas and medically vulnerable households.
- Implement a water-resilience plan for essential institutions (hospitals, schools, shelters) including back-up power, redundancy in water supply lines, and rapid testing at point-of-use.

Resource implications
- People: expanded logistics team (15–40 staff); volunteer coordination; security at distribution sites.
- Time: 7–14 days to reach steady-state distribution; ongoing thereafter.
- Cost: $5–15 million, depending on scale (bottled water volumes, transport, staffing, site setup).
- Expected morbidity/mortality impact: substantial reduction in water-related illnesses; uncertainty depends on coverage and duration.

2) Establish interim laboratory testing capacity (even with a nonfunctional city lab)
- Contract or partner with state/public health labs and private labs for routine E. coli, total coliform, turbidity, residual chlorine, and contaminant testing.
- Implement a standardized sampling plan across the distribution network (number of samples, frequency) and ensure rapid turnaround (24–48 hours for most tests, longer for some analyses).
- Move toward a simple, auditable QA/QC program and data-sharing with the city and the public.

Resource implications
- People: dedicated lab liaison, contract-lacrosse network; 2–4 full-time equivalents (FTEs) for coordination; lab technicians as needed.
- Time: 48–72 hours to establish contracts; ongoing testing cadence.
- Cost: testing costs depend on volume; rough daily testing budget could be $20k–$200k depending on frequency and lab type; initial setup could be $200k–$1M.
- Expected morbidity/mortality impact: improves confidence in safety and allows more targeted actions; uncertainty high if test coverage is uneven.

3) Public health risk communication and behavioral guidance
- Provide clear, consistent messaging about boiling, safe handling of water, and when to seek care for diarrheal illness.
- Use multiple channels (TV, radio, SMS, social media, community leaders) in multiple languages; hold regular press briefings.
- Launch trust-building measures to reduce misinformation and political polarization: transparent dashboards, a public FAQ, and open data sharing.

Resource implications
- People: communications team, translators, community liaisons.
- Time: immediate to establish; ongoing.
- Cost: moderate; $100k–$500k depending on channel reach and translation needs.
- Expected morbidity/mortality impact: improves compliance and reduces unnecessary demand on emergency services; uncertainty moderate.

4) Health system and social-support overlay
- Prioritize water access within health and social systems: ensure hospital, clinic, and shelter water needs are met; provide direct support to food banks, shelters, and homebound residents.
- Initiate or expand water-related procurement for essential facilities (kitchen, sanitation).

Resource implications
- People: hospital liaisons, shelter coordinators, logistics staff.
- Time: immediate; ongoing for 2–4 weeks.
- Cost: depends on numbers of facilities supported; estimate $1–4 million for broad coverage in initial weeks.
- Expected morbidity/mortality impact: reduces risk for vulnerable groups; uncertainty moderate.

5) Data-driven targeting and equity controls
- Start a district-by-district water-access map showing which areas have reliable access, which are dry, and where bottlenecks exist.
- Use vulnerability data (age, health status, income) to reallocate resources to the highest-need neighborhoods.

Resource implications
- People: data analysts, GIS staff, program evaluators.
- Time: immediate setup; ongoing updates.
- Cost: modest ($50k–$250k) depending on existing data infrastructure.
- Expected morbidity/mortality impact: improves equity and effectiveness; uncertainty moderate.

(c) Medium-term recovery steps (weeks to months)
Goal: restore safe, reliable water supply; reduce risk of future crises; rebuild systems with resilience; re-establish public trust.

1) Restore and secure a safe water source
- Rapidly evaluate options for a safe, scalable water supply (alternative groundwater, treated surface water, interconnections with neighboring utilities, or a combination of portable treatment and backup reservoirs).
- Plan for a formally engineered temporary-to-permanent arrangement: interim interconnection with a neighboring utility if feasible; temporary storage and distribution improvements; continuous disinfection and testing.

Resource implications
- People: water-supply engineers, civil engineers, procurement, legal/contract staff.
- Time: weeks to months for interconnection or alternative supply to be fully realized; parallel deployment of interim measures for continuity.
- Cost: potentially large; anticipate $10–100 million depending on the scale and whether an interconnection or new facility is required. Seek state/federal cost-sharing and low-interest financing.
- Expected morbidity/mortality impact: substantial long-term reductions in risk; uncertainty depends on duration to full supply restoration.

2) Infrastructure upgrades and resilience
- Begin phased investments to repair/modernize aging pipes and distribution infrastructure; install backflow preventers, improved chlorine-residual monitoring, and automated remote sensing of water quality where feasible.
- Invest in a small-scale, city-run or partner-supported water quality lab capacity (mobile/temporary lab units, remote data transfer, QA/QC program) to prevent future lab outages.

Resource implications
- People: civil engineers, project managers, QA/QC technicians, contractors.
- Time: 3–12 months for initial upgrades; longer for full resilience.
- Cost: substantial; initial phases could be $5–20 million, with larger future investments possible.
- Expected morbidity/mortality impact: reduces risk of future outages and contamination events; uncertainty depends on scope.

3) Policy, governance, and nonpotable use framework
- Develop a policy framework for nonpotable uses (e.g., toilet flushing, landscape irrigation) with risk-based treatment or acceptability standards during shortages, ensuring ethical considerations and public safety.
- Establish a formal drought contingency and water-use-restriction plan, including tiered restrictions and equity considerations.

Resource implications
- People: policy lawyers, public-works leadership, urban planners.
- Time: 1–3 months to draft; ongoing enforcement.
- Cost: moderate to low; $50k–$500k for plan development and initial enforcement.
- Expected morbidity/mortality impact: increases system resilience; uncertainty depends on adherence and enforcement.

4) Public-health sustained surveillance and readiness
- Create a sustained, city-wide water quality surveillance program, with clear triggers for escalation.
- Develop an ongoing readiness plan, including stockpiling critical supplies (bottled water, purification tablets, PPE), mutual-aid agreements with neighboring jurisdictions, and rapid-response playbooks for future droughts or contamination events.

Resource implications
- People: surveillance epidemiologists, public health nurses, procurement specialists.
- Time: 1–3 months to finalize; ongoing thereafter.
- Cost: ongoing annual cost but manageable with phased investment; initial setup $200k–$2M depending on scope.
- Expected morbidity/mortality impact: improves long-term resilience; uncertainty moderate.

(d) Data and tests to request immediately (and why)
- Water testing: E. coli and total coliforms in multiple distribution zones; residual chlorine; turbidity; pH; conductivity; heavy metals if risk suspected; microplastics if relevant.
- Source-water quality monitoring: reservoir inflows, upstream tributaries, groundwater sources (if any).
- Material and system testing: cross-connection checks, backflow prevention performance, pipe cleanliness, corrosion products in aging mains.
- Public-health outcomes: real-time diarrheal disease surveillance data (ED visits, clinic visits) to gauge impact of interventions.
- Operational data: water-usage patterns by district; bottling/distribution performance metrics; inventory levels for water supplies; interconnection status with neighboring utilities.
- Why: to adapt actions quickly, verify safety, allocate resources efficiently, and provide evidence for public communications and legal justifications.

(e) Failure modes, biases, and obstacles; detection and mitigation
- Data gaps and bias: limited upstream data due to nonfunctional lab; mitigate with external labs, standardized sampling plan, QA/QC, transparent data sharing, and back-up lab partners.
- Public miscommunication and panic: mitigate with a single authoritative spokesperson, daily live briefings, multilingual materials, and simple, actionable guidance; correct misinformation through rapid response.
- Supply chain disruptions: risk of bottlenecks in bottled-water supply, trucking, or staff shortages; mitigate by diversifying suppliers, pre-negotiated contracts, and mutual-aid agreements with neighboring jurisdictions.
- Political obstacles and polarization: risk of inconsistent policy; mitigate with an independent, data-driven dashboard; use citizen advisory groups, public forums, and third-party oversight.
- Equity gaps and mistrust: risk that some communities receive less water; mitigate with explicit equity metrics, targeted outreach, and accountability reporting.

(f) Objective criteria and triggers for transitions between phases or escalation
Define clear, quantitative triggers for escalation and de-escalation. Use a public health risk framework combining water quality data, supply reliability, and equity indicators.

Phase transitions and triggers
- Phase A (Immediate lifesaving measures: 0–3 days)
  - Trigger to enter Phase A: confirmation of E. coli presence in reservoir or distribution system; boil-water advisory issued citywide; essential water-distribution points established.
  - Exit trigger: none; move to Phase B when a stable interim supply with clear coverage for high-priority populations is in operation for at least 72 hours.

- Phase B (Short-term stabilization: days–weeks)
  - Triggers to remain Phase B: (1) coverage of safe water to ≥85% of residents in a 5–7 day rolling window; (2) external lab results showing no active contamination or contaminant trend; (3) ability to sustain at least two concurrent safe-water routes (bottled water and treated water) for at least 14 days; (4) a functioning system to monitor water quality with 24–48 hour turnaround.
  - Trigger to escalate to Phase C: sustained supply to ≥95% of residents with confirmed safe water at central testing sites for 14 days; plan in place for source replacement or interconnection if needed.

- Phase C (Medium-term recovery: weeks–months)
  - Triggers to escalate from Phase B to Phase C: (1) new supply arrangement (interconnection or alternative source) legally secured and functioning; (2) distribution system improvements deployed (new tanks, piping work) with completion milestones; (3) local lab capacity fully restored or replaced with external lab capacity; (4) risk to vulnerable populations is reduced as measured by service access and disease surveillance.
  - Exit criteria for Phase C: stable, ongoing safe-water supply with automated monitoring and QA/QC program in place for at least 3 months; no sustained contamination signals; written recovery plan enacted.

- Phase D (Long-term resilience and plan normalization)
  - Triggers to stay Phase D: permanent upgrades completed; drought contingency plan activated; systemic readiness for future events; routine public-health water quality surveillance in place.

Key public-communication and governance note
- Maintain public transparency about what is known, what is not known, and how decisions are made (including ethical trade-offs). Use a public dashboard with data on water quality, supply coverage, and distribution equity. Explain the rationale for rationing or prioritization using explicit ethical principles (see below).

(g) Assumptions and the most important unknowns your plan depends on
- Assumptions:
  - The city can rapidly mobilize external testing capacity and bottled-water supply at scale.
  - Alternative water sources or interconnections can be established within weeks to months.
  - Governance can negotiate inter-jurisdictional agreements and funding approvals in a timely manner.
  - The population will respond to communications and use the water guidance appropriately.
- Most important unknowns:
  - The exact extent and persistence of reservoir contamination and whether any contaminants exceed safe levels for extended periods.
  - The capacity and speed to deploy mobile treatment units and interconnect with neighboring utilities.
  - The rate at which households and institutions will adopt boil-water guidance and water-use restrictions.
  - Financial availability and speed of federal/state support.
  - The reliability of external testing partners and the turnaround time for results.
  - The potential for secondary health effects (e.g., load on healthcare facilities, mental health impacts from water insecurity).

Ethical and legal trade-offs: how to justify to the public
- Core ethical framework: prioritize the most vulnerable and essential services; maximize lives saved; use equity-centered distribution; ensure transparency and accountability.
- Rationing and prioritization: 
  - Give priority to hospitals, clinics, long-term care facilities, shelters, and schools serving vulnerable populations.
  - Provide baseline safe-water access for all residents, with additional supplies for vulnerable households (elderly, immunocompromised, children).
  - If shortages persist, implement a transparent tiered allocation system that is publicly documented and auditable; avoid favoritism by relying on objective criteria (household need, size of household, disability status, and risk level).
- Use of nonpotable sources:
  - If nonpotable sources are to be used (e.g., graywater for non-drinking uses or treated non-potable options in limited contexts), ensure strict safety barriers and legal compliance; clearly communicate limits and the rationale, and clearly separate nonpotable uses from drinking water.
  Ethical justification: emergency situations justify temporary measures that reduce risk and save lives, provided they do not create greater harm, include safeguards to protect the most vulnerable, and are transparent in rationale and duration.
- Public communication and trust:
  - Be explicit about uncertainties and the plan to reduce them; provide regular updates on data quality, supply status, and timelines.
  - Acknowledge constraints and trade-offs; invite public input and provide channels for concerns.
  - Ensure accessibility and language equity in communications to reach all communities.

What to do next: concrete set of ready-to-execute steps
- Immediate
  - Activate EOC and declare emergency status; appoint incident commander.
  - Issue boil-water advisory; publish authoritative guidance.
  - Establish primary and alternate water-distribution sites; begin bottled-water provisioning at scale; target highest-need facilities first.
  - Initiate external lab contracts and a sampling plan; set 24–48 hour turnaround for core tests.
  - Launch public communication plan with daily briefings and a single, trusted source of truth.
- 0–7 days
  - Scale up bottled-water procurement and distribution; begin mobile-treatment deployments.
  - Implement interim interagency MOUs with neighboring utilities or state agencies to secure alternate supplies.
  - Begin district-based water-use planning and equity targeting; begin data governance and dashboards.
- 1–4 weeks
  - Complete interim testing plan; validate external lab results; adjust guidance as needed.
  - Start infrastructure resilience planning and procurement for emergency upgrades; advance interconnections if feasible.
  - Sign off on a multi-week to multi-month recovery plan with clear actions and milestones.

- 1–6 months
  - Implement long-term water-supply solution (interconnection, new treatment capacity, or alternate supply).
  - Initiate permanent resilience improvements (infrastructure upgrades, QA/QC lab capacity, data systems).
  - Transition from emergency governance to a formal resilience and drought-management program.

Important caveats
- All estimates (costs, timelines, and health impacts) are highly contingent on external factors: availability of bottled water, lab capacity, interconnection approvals, and the political environment. Prepare to adjust rapidly.
- Ensure legal clearance for emergency expenditures and inter-jurisdictional agreements; coordinate with the state attorney general’s office for any waivers or authority required during an emergency.
- Maintain a strong focus on protecting the most vulnerable and ensuring equity throughout the response.

If you’d like, I can tailor the plan to your city’s exact population, existing infrastructure, and the specific legal/regulatory framework you must operate under, and convert the above into a 2–3 page executive directive with a phased Gantt-style timeline and a cost envelope.

In [9]:
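# Anthropic has a slightly different API, and max_tokens is required
# This cell needs `from anthropic import Anthropic`, which is commented out in the imports cell;
# uncomment it there first, otherwise this cell raises the NameError shown below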

model_name = "claude-sonnet-4-5"

claude = Anthropic()
response = claude.messages.create(model=model_name, messages=messages, max_tokens=1000)
answer = response.content[0].text

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

NameError: name 'Anthropic' is not defined
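The cell above shows the two differences from the OpenAI-style calls that are visible in this notebook: `max_tokens` must be passed, and the text is read from `response.content[0].text` rather than `response.choices[0].message.content`. The sketch below wraps that pattern in a hypothetical helper, `ask_claude`, and exercises it with a stand-in client object so that it runs without the `anthropic` package or an API key.

```python
from types import SimpleNamespace


def ask_claude(client, model_name, messages, max_tokens=1000):
    """Send messages through an Anthropic-style client and return the reply text."""
    response = client.messages.create(
        model=model_name,
        messages=messages,
        max_tokens=max_tokens,  # required by this API, unlike the OpenAI calls above
    )
    return response.content[0].text


# Stand-in client (hypothetical) that mimics only the attributes used above.
def _fake_create(model, messages, max_tokens):
    text = f"[{model}] {messages[-1]['content']}"
    return SimpleNamespace(content=[SimpleNamespace(text=text)])


fake_claude = SimpleNamespace(messages=SimpleNamespace(create=_fake_create))

reply = ask_claude(fake_claude, "claude-sonnet-4-5", [{"role": "user", "content": "Hello"}])
print(reply)
```

With the real library installed and a key set, passing `Anthropic()` in place of `fake_claude` should follow the same path as the cell above.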

We're using OpenAI() to call Gemini below, which can look confusing. Remember, OpenAI() is not an LLM. It's a lightweight client library that wraps HTTP calls to an endpoint with a particular, well-known structure: the request carries a list of dicts (the messages) along with settings such as the model name. OpenAI built these endpoints and they became very popular, so many other providers decided to offer endpoints with exactly the same format and spec. Of the providers in this lab, Anthropic is the one we call through its own client library instead. OpenAI designed their Python client library so that when you create a new client instance, you can pass in a base URL.

Google has made special endpoints that have exactly the same format as OpenAI's, and they are served at this base URL. That's why the URL contains "/openai/". You can pass in your Google key and point the OpenAI client at Google's endpoint, which uses the same format.

Same for DeepSeek and Groq.
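To make the "thin wrapper over HTTP" idea concrete, here is a rough sketch of the kind of request such a client sends. It only builds the request object and does not send it. The helper name `build_chat_request` and the placeholder key are assumptions for illustration, and the real client library adds further headers, retries and response parsing.

```python
import json
import urllib.request


def build_chat_request(base_url, api_key, model_name, messages):
    """Build (but do not send) an OpenAI-style chat completions request."""
    url = base_url.rstrip("/") + "/chat/completions"
    body = json.dumps({"model": model_name, "messages": messages}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Same request shape, different provider: only the base URL, key and model change.
req = build_chat_request(
    "https://generativelanguage.googleapis.com/v1beta/openai/",
    "placeholder-key",  # hypothetical value, not a real key
    "gemini-2.5-flash",
    [{"role": "user", "content": "Hello"}],
)
print(req.get_method(), req.full_url)
print(json.loads(req.data))
```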

In [None]:
gemini = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
model_name = "gemini-2.5-flash"

response = gemini.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
deepseek = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
model_name = "deepseek-chat"

response = deepseek.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
# Updated with the latest Open Source model from OpenAI

groq = OpenAI(api_key=groq_api_key, base_url="https://api.groq.com/openai/v1")
model_name = "openai/gpt-oss-120b"

response = groq.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)


## For the next cell, we will use Ollama

Ollama runs a local web service that provides an OpenAI-compatible endpoint,  
and runs models locally using high-performance C++ code.

If you don't have Ollama, install it by visiting https://ollama.com, pressing Download, and following the instructions.

After it's installed, you should be able to visit http://localhost:11434 and see the message "Ollama is running".

You might need to restart Cursor (and maybe reboot). Then open a Terminal (control+\`) and run `ollama serve`
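
If you'd rather check from Python than from the browser, the sketch below requests the same local URL and looks for the "Ollama is running" message. The function name `ollama_is_running` is made up for illustration, and it assumes Ollama is listening on its default port as described above.

```python
import urllib.request


def ollama_is_running(url="http://localhost:11434", timeout=2):
    """Return True if the local Ollama service answers with its status message."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return b"Ollama is running" in response.read()
    except OSError:
        # Covers refused connections, timeouts and urllib's URLError
        return False


print(ollama_is_running())
```

If this prints `False`, start the service with `ollama serve` and try again.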

Useful Ollama commands (run these in the terminal, or with an exclamation mark in this notebook):

`ollama pull <model_name>` downloads a model locally  
`ollama ls` lists all the models you've downloaded  
`ollama rm <model_name>` deletes the specified model from your downloads

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Super important - ignore me at your peril!</h2>
            <span style="color:#ff7800;">The model called <b>llama3.3</b> is FAR too large for home computers - it's not intended for personal computing and will consume all your resources! Stick with the nicely sized <b>llama3.2</b> or <b>llama3.2:1b</b> and if you want larger, try llama3.1 or smaller variants of Qwen, Gemma, Phi or DeepSeek. See <a href="https://ollama.com/models">the Ollama models page</a> for a full list of models and sizes.
            </span>
        </td>
    </tr>
</table>

By default, Ollama is designed to run LLMs locally on your computer.

Key benefits of local execution:
- Privacy: Data stays on your machine.
- Control: You manage the models and their versions.
- Cost-Effective: No per-token cloud costs.
- Offline Access: Works without an internet connection after setup. 

In [None]:
# The exclamation mark means this line runs as a shell command rather than as Python code.
!ollama pull llama3.2

In [None]:
ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
model_name = "llama3.2"

response = ollama.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
# So where are we?

print(competitors)
print(answers)


In [None]:
# It's nice to know how to use "zip"
for competitor, answer in zip(competitors, answers):
    print(f"Competitor: {competitor}\n\n{answer}")


In [None]:
# Let's bring this together - note the use of "enumerate"

together = ""
for index, answer in enumerate(answers):  # enumerate yields (index, item) pairs, so there's no need to keep a manual counter
    together += f"# Response from competitor {index+1}\n\n"
    together += answer + "\n\n"
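
As a side note, `zip` and `enumerate` combine nicely. The sketch below uses made-up names and replies (not the notebook's `competitors` and `answers`) and starts the count at 1 so there's no need for `index+1`.

```python
names = ["model-a", "model-b"]  # made-up competitor names
replies = ["First reply", "Second reply"]  # made-up answers

labelled = ""
# zip pairs each name with its reply; enumerate(..., start=1) numbers the pairs from 1
for number, (name, reply) in enumerate(zip(names, replies), start=1):
    labelled += f"# Response from competitor {number} ({name})\n\n{reply}\n\n"

print(labelled)
```

Including the name is only for illustration here; the notebook's `together` string identifies competitors by number alone.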

In [None]:
print(together)

In [None]:
# To make a literal { or } appear in an f-string, double it: {{ and }}. Otherwise Python treats the
# braces as a placeholder to fill in. That's why the JSON example below is written as
# {{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""
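
Here is the brace-doubling rule in isolation, as a tiny sketch with a made-up value: single braces are filled in, doubled braces come out as literal braces.

```python
count = 3  # made-up value for illustration

# {count} is substituted; {{ and }} become literal { and }
prompt = f'Rank {count} competitors. Reply as {{"results": [...]}}'

print(prompt)  # Rank 3 competitors. Reply as {"results": [...]}
```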


In [None]:
print(judge)

In [None]:
judge_messages = [{"role": "user", "content": judge}]

In [None]:
# Judgement time!

openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-5-mini",
    messages=judge_messages,
)
results = response.choices[0].message.content
print(results)


In [None]:
# OK let's turn this into results!

results_dict = json.loads(results)
ranks = results_dict["results"]
for index, result in enumerate(ranks):
    competitor = competitors[int(result)-1]
    print(f"Rank {index+1}: {competitor}")
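
The cell above assumes the judge returned clean JSON containing each competitor number exactly once. Models sometimes wrap JSON in a Markdown code fence despite instructions, so here is one possible, more defensive sketch. The helper name `parse_ranking` and the sample inputs are made up for illustration.

```python
import json


def parse_ranking(raw, names):
    """Turn the judge's JSON reply into a list of names, best first."""
    text = raw.strip()
    fence = "`" * 3  # a Markdown code fence marker
    if text.startswith(fence):
        # Drop the surrounding backticks and an optional "json" language tag
        text = text.strip("`").strip()
        if text.startswith("json"):
            text = text[len("json"):]
    numbers = [int(item) for item in json.loads(text)["results"]]
    # Every competitor number from 1..N should appear exactly once
    if sorted(numbers) != list(range(1, len(names) + 1)):
        raise ValueError(f"Unexpected ranking: {numbers}")
    return [names[number - 1] for number in numbers]


print(parse_ranking('{"results": ["2", "1"]}', ["model-a", "model-b"]))
# ['model-b', 'model-a']
```

In the notebook this could be called as `parse_ranking(results, competitors)`.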

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/exercise.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Exercise</h2>
            <span style="color:#ff7800;">Which pattern(s) did this use? Try updating this to add another Agentic design pattern.
            </span>
        </td>
    </tr>
</table>

Ufo's answer: Parallelization and Evaluator-Optimizer 

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">Patterns like this, where you send a task to multiple models and evaluate the results,
            are common when you need to improve the quality of your LLM responses. The approach applies broadly
            to business projects where accuracy is critical.
            </span>
        </td>
    </tr>
</table>