## Welcome to the Second Lab - Week 1, Day 3

Today we will call several models from different providers. It's a good way to get comfortable with their APIs.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Important point - please read</h2>
            <span style="color:#ff7800;">The way I collaborate with you may be different to other courses you've taken. I prefer not to type code while you watch. Rather, I execute Jupyter Labs, like this, and give you an intuition for what's going on. My suggestion is that you carefully execute this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations.<br/><br/>If you have time, I'd love it if you submit a PR for changes in the community_contributions folder - instructions in the resources. Also, if you have a GitHub account, use it to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...
            </span>
        </td>
    </tr>
</table>

In [1]:
# Start with imports - ask ChatGPT to explain any package that you don't know

import os
import json
from dotenv import load_dotenv
from openai import OpenAI
from anthropic import Anthropic
from IPython.display import Markdown, display

In [2]:
# Always remember to do this!
load_dotenv(override=True)

True

In [3]:
# Print the key prefixes to help with any debugging

openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key not set (and this is optional)
Google API Key not set (and this is optional)
DeepSeek API Key not set (and this is optional)
Groq API Key not set (and this is optional)
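
The five near-identical `if`/`else` blocks above can be collapsed into a single loop. This is only a sketch: the helper name `describe_key` and the `KEYS` table are my own illustration, not part of the lab. The prefix lengths are the ones used in the cell above.

```python
import os

# Hypothetical table (not in the original lab):
# display name, environment variable, prefix length to show, optional?
KEYS = [
    ("OpenAI", "OPENAI_API_KEY", 8, False),
    ("Anthropic", "ANTHROPIC_API_KEY", 7, True),
    ("Google", "GOOGLE_API_KEY", 2, True),
    ("DeepSeek", "DEEPSEEK_API_KEY", 3, True),
    ("Groq", "GROQ_API_KEY", 4, True),
]

def describe_key(name, env_var, prefix_len, optional=True):
    """Return a one-line status message for an API key, showing only its prefix."""
    key = os.getenv(env_var)
    if key:
        return f"{name} API Key exists and begins {key[:prefix_len]}"
    suffix = " (and this is optional)" if optional else ""
    return f"{name} API Key not set{suffix}"

for name, env_var, prefix_len, optional in KEYS:
    print(describe_key(name, env_var, prefix_len, optional))
```

Only the prefix is ever printed, so the full key never ends up in the notebook output.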


In [4]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]

In [5]:
messages

[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]

In [6]:
openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-5-mini",
    messages=messages,
)
question = response.choices[0].message.content
print(question)


Imagine you are advising the government of a mid-sized country facing a sudden, severe energy shortage caused by a three-month disruption to imported fossil fuels, with 40% of its electricity generation offline and winter approaching; the country has limited domestic renewable capacity, moderate natural gas reserves sufficient for only two months, underfunded grid infrastructure, and a population with 20% below the poverty line—provide a prioritized, time-phased plan (immediate: days–weeks, short-term: weeks–months, and medium-term: months–2 years) to minimize loss of life and economic collapse while maintaining social cohesion and honoring international emissions commitments as much as feasible, and for each proposed measure quantify expected impacts (e.g., lives saved, percentage of population with restored electricity, GDP loss avoided), identify key assumptions and uncertainties, propose metrics to monitor success, describe contingency triggers and backup options, explain the ethic

In [7]:
competitors = []
answers = []
messages = [{"role": "user", "content": question}]

## Note - update since the videos

I've updated the model names to use the latest models below, like GPT-5 and Claude Sonnet 4.5. It's worth noting that these models can be quite slow, taking around 1-2 minutes, but they do a great job! Feel free to switch them for faster models if you'd prefer, like the ones I use in the video.

In [8]:
# The API we know well
# I've updated this with the latest model, but it can take some time because it likes to think!
# Replace the model with gpt-4.1-mini if you'd prefer not to wait 1-2 mins

model_name = "gpt-5-nano"

response = openai.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Below is a structured, time-phased plan you can adapt to a mid-sized country facing a severe, winter-driven energy shortage. It emphasizes saving lives, preserving essential services, maintaining social cohesion, and staying as close as feasible to emissions commitments. All quantitative figures are illustrative defaults you should replace with country-specific data; I’ve included clear assumptions and ranges so you can tailor estimates quickly.

Executive summary
- Objective: minimize loss of life and economic collapse over days–weeks, weeks–months, and months–2 years; maintain social cohesion; honor emissions commitments as far as feasible.
- Core constraints: 40% of electricity generation offline; winter demand high; limited domestic renewables; gas reserves ~2 months; underfunded grid; 20% population below poverty line.
- Core approach: protect life and essential services first; aggressively reduce demand with targeted subsidies and efficiency; mobilize emergency generation and imports; expand low-emission capacity and efficiency in the medium term; institute robust social protection and transparent communications.
- Key risk: inequitable access to power and fuel; policy choices that destabilize the economy or undermine emissions goals if not carefully designed. Contingency planning and ethical guardrails are integral.

Important assumptions and uncertainties (use for sensitivity analysis)
- Demand: peak national load X GW; 40% outage implies ~0.4X GW shortfall without interventions. Exact numbers depend on the country’s grid and season.
- Gas reserves: only enough for 2 months of current gas-fired generation at current demand; international gas supply can be augmented with LNG imports.
- Renewables: limited capacity; additive wind/solar potential small in winter; storage options limited initially.
- Grid: distribution losses and noncritical-customer outages are high due to underinvestment; repair and maintenance cycles are disrupted.
- Social factors: 20% below poverty line; heating needs are a major risk for mortality and morbidity during winter.
- Emissions: strong emphasis on using gas/low-emission options first; coal or high-emission diesel would be a last resort with mitigations (emissions controls, short duration).

Immediate actions (days–weeks)
Goal: stabilize system, protect lives, prevent social collapse, set up governance and data flows.

1) Crisis governance and ethical framework
- Action: Establish a National Energy Crisis Authority (or equivalent) with a cross-ministerial mandate; appoint an Ethical Allocation Advisory Panel to guide prioritization and monitor fairness.
- What to measure:
  - Decision speed (time to approve critical contracts, set load-shedding schedules).
  - Fairness indicators (distribution across regions, urban/rural, poverty bands).
- Expected impact (illustrative, country-typical): enables rapid, transparent decisions; reduces social unrest risk.
- Assumptions/uncertainties: political capacity to centralize authority; public trust in the process.

2) Protect essential services and vulnerable populations
- Actions:
  - Legally protect hospitals, clinics, water systems, healthcare supply chains, emergency services, and critical transport (priority loads 100% protected).
  - Freeze utility disconnections for households and small businesses; target targeted subsidies for heating and essential electricity use.
  - Establish a vulnerability index to prioritize heating support for the elderly, chronically ill, disabled, and extremely poor households.
- Quantified targets (illustrative):
  - 100% uninterrupted power for hospitals, water treatment, and emergency services.
  - Freeze disconnections for 4–6 weeks; begin energy subsidies for top 30% of households by poverty risk.
  - Target: 60–75% of urban households receiving at least a minimum heating support through subsidies or direct provision of heating aids.
- Assumptions/uncertainties: how quickly subsidies can be delivered; fuel access to heating support.

3) Demand management and energy efficiency (short-term demand relief)
- Actions:
  - Public communications campaign to reduce consumption in nonessential sectors; advise lowering thermostat settings, throttling noncritical loads, switching to energy-saving modes; switch streetlights to LEDs and reduce lighting in noncritical public spaces.
  - Implement time-of-use pricing signals and voluntary curtailment programs for large commercial/industrial customers.
  - Place immediate insulation/ready-to-deploy efficiency measures in government buildings and hospitals.
- Quantified targets (illustrative):
  - 8–12% reduction in total demand from behavior + efficiency in the first 2–4 weeks.
  - Potentially 2–4% GDP loss avoided relative to worst-case no-action scenario due to stabilized electricity prices and fewer outages.
- Assumptions/uncertainties: public compliance; feasibility of rapid subsidies for efficiency upgrades; infrastructure for time-of-use signals.

4) Short-term generation and fuel relief
- Actions:
  - Fast-track procurement of emergency generation capacity (modular gas turbines, diesel gensets) to cover portions of the shortfall; prioritize critical urban/industrial hubs and heating areas.
  - Increase fuel imports for emergency generation (gas and/or diesel) with priority lanes at borders and streamlined procurement.
  - If feasible, maximize gas-fired generation where the gas supply is secure and prioritize hospitals, water systems, and heating for vulnerable groups.
  - Consider short-term, low-emission options (e.g., existing hydropower repair, small-scale wind/solar if available with storage, or microgrids for isolated communities).
- Quantified targets (illustrative):
  - Add 5–10% of peak demand back online via emergency generation within 2–3 weeks; expand to 15–20% over the first 6–8 weeks if imports and fuel allow.
  - Electricity restoration in regional grids improves from 60% of demand served to ~70–75% in corridors where emergency generation is deployed.
- Assumptions/uncertainties: speed of procurement; fuel price and supply reliability; grid compatibility and ramp rates; logistical constraints in fuel delivery.

5) Social protection and price stability
- Actions:
  - Implement targeted cash transfers or vouchers for heating and electricity costs to low-income households.
  - Stabilize key staple prices and impose temporary caps on electricity tariffs for households with usage in defined essential bands.
- Quantified targets (illustrative):
  - Cover 25–35% of the population with direct heating/electricity relief; limit price increases for essential electricity to 0–5% month-over-month in the crisis period.
- Assumptions/uncertainties: fiscal space; leakage or misallocation; effectiveness of cash transfers.

Short-term actions (weeks–months)
Goal: secure more reliable power, increase energy efficiency, and begin rebuilding capacity with lower-emission options.

1) Expand low-emission and distributed energy capacity
- Actions:
  - Fast-track small-scale solar, wind, and battery storage projects (especially in urban and peri-urban settings) with short permitting timelines; deploy microgrids for hospitals, clinics, and important regional centers.
  - Begin prepositioning natural gas for power expansion and storage; negotiate LNG imports for winter season if feasible; prioritize high-efficiency gas plants with emissions controls.
  - Repair/upgrade existing renewable capacity to unlock any available generation (seasonally dependent).
- Quantified targets (illustrative):
  - Add 2–4 GW of dispatchable capacity within 6–12 months (depends on scale and financing); storage/battery adds 1–2 GW in selected regions within 12–18 months.
  - Increase regions with reliable power from 60% to 70–80% of demand in well-connected urban corridors.
- Assumptions/uncertainties: financing; speed of permitting; supply chain constraints; weather impacts on renewables.

2) Grid modernization and reliability improvements (priority upgrades)
- Actions:
  - Accelerate prioritized maintenance, improve feeder reconfiguration to reduce outages, and implement enhanced SCADA/monitoring to optimize load-shedding with minimal social disruption.
  - Strengthen interconnections with neighboring countries for power imports and grid support where politically feasible and beneficial, with clear allocation rules.
- Quantified targets (illustrative):
  - 5–10% improvement in reliability in prioritized regions within 6–12 months; regional interconnections that enable import of 1–3 GW during peak periods within 12–24 months (subject to regional capacity and transmission upgrades).
- Assumptions/uncertainties: capital availability; cross-border agreements; coordination with regional grid operators.

3) Targeted energy efficiency and demand-side programs
- Actions:
  - Scale weatherization programs for low-income housing; subsidize insulation, heat-retention measures, and energy-efficient appliances.
  - Expand public-sector efficiency programs (schools, government offices) to reduce demand growth and stabilize energy expenditure.
- Quantified targets (illustrative):
  - 3–6% additional demand reduction in the mid-term (months 6–12) from efficiency gains; further reductions in energy intensity over the first 1–2 years.
- Assumptions/uncertainties: funding; program uptake; delivery logistics in rural areas.

Medium-term actions (months–2 years)
Goal: rebuild sustainable capacity, reduce vulnerability to future shocks, and advance emissions commitments.

1) Build resilient, lower-emission capacity
- Actions:
  - Scale up renewable energy capacity (solar, wind, hydro) with baseload and peaking support (batteries, pumped storage) to reduce reliance on fossil fuels.
  - Expand regional energy interoperability with neighboring countries to diversify supply and leverage regional markets for power trading.
  - Invest in energy efficiency, building retrofits, and demand management as a long-term pillar to reduce systemic load.
- Quantified targets (illustrative):
  - Add 5–10 GW of non-fossil capacity over 2 years (where physically feasible); battery storage 4–8 GWh to stabilize variability.
  - Reduction of long-term vulnerability to similar shocks by 30–50% relative to pre-crisis baseline through a combination of capacity, efficiency, and interconnections.
- Assumptions/uncertainties: capital, financing (public and concessional); project timelines; permitting and supply chains; regional geopolitics.

2) Gas and fuel strategy; emissions safeguards
- Actions:
  - Convert to a diversified energy mix that minimizes dependence on any single fuel; preserve LNG and domestic gas reserves for critical loads; implement strict emissions controls and retirement of the most polluting options as soon as feasible.
  - Maintain a rolling emissions trajectory that adheres to international commitments by accelerating non-emitting power and efficiency, while permitting limited, tightly controlled gas use during extreme shortages.
- Quantified targets (illustrative):
  - Emissions intensity of electricity generation declines year-over-year as non-emitting capacity comes online; ensure least-emitters are prioritized for critical loads during shortages.
- Assumptions/uncertainties: fossil fuel price volatility; international supply constraints; political willingness to reserve and export fuels in a crisis.

3) Social protection renewal and economic stabilization
- Actions:
  - Expand targeted cash transfers, food assistance, and unemployment support to households and small businesses; ensure price stability for essential goods; extend payment moratoriums where necessary.
  - Invest in macroeconomic stabilization measures to prevent a deep recession (fiscal buffers, rapid financing from international partners, and potentially temporary debt instruments).
- Quantified targets (illustrative):
  - Protect living standards for the bottom 40% of the population; limit GDP contraction to a range of -2% to -6% in the first year, depending on the severity of the crisis and policy effectiveness.
- Assumptions/uncertainties: fiscal space; access to international financing; effectiveness of social protection targeting.

Contingency triggers and backup options (if projections worsen)
Triggers to escalate actions
- If hospital capacity becomes overwhelmed or heat-related mortality rises beyond a threshold (e.g., excess winter mortality rate spikes by a specified factor).
- If electricity shortfall worsens (more than 60–70% of demand unmet in major regions) or social unrest indicators rise.
- If critical fuel supply (gas/LNG/diesel) falls below a defined threshold for a sustained period.

Backup options
- Bring additional emergency capacity online; deploy regional electricity imports on an urgent basis; implement prioritized load-shedding with stricter regional zoning to protect life-sustaining loads.
- Temporary alternative fuels (diesel or heavy fuel oil) with strict emissions controls and rapid decommissioning plans once crisis abates.
- Accelerate interregional electricity sharing and, if necessary, targeted coal-fired backup with emission controls as a last resort and only for critical winter periods.

Ethical reasoning: who receives power first and why
- Core ethical framework:
  - Triage with a life-preserving and fairness lens: protect life first (healthcare, emergency services, water), then protect vulnerable households (elderly, poor, disabled), then sustain essential economic and social infrastructure, and finally distribute remaining power to the general population.
  - Equity considerations: prioritize low-income households and rural/remote communities that lack alternatives; ensure no disconnections for the vulnerable and critical services; maintain access to heating for those at risk of morbidity/mortality from cold.
  Balanced against emissions commitments by:
  - Favoring non-emitting or lowest-emitting options whenever possible for critical loads; use gas strategically only when essential; avoid coal as a default unless absolutely required for public safety and only with strict controls and rapid exit plans.
- Practical policy rules (example):
  - Tiered priority: (1) hospitals, water and emergency services; (2) heating for elderly, disabled, and poor households; (3) essential municipal services and critical industries; (4) remaining population with capped supply.
  Public accountability: an independent ethical oversight panel reviews allocation decisions and risk-mitigation measures monthly.

Policies to avoid (and why)
- Across-the-board price caps that endanger utility solvency and new investment; they can lead to shortages and deterioration of service if suppliers cannot cover costs.
- Blanket rationing that neglects vulnerability or essential services; risk of civil unrest and loss of trust.
- Heavy-handed temporary coal usage as a default; while it may provide short-term relief, it increases emissions, undermines commitments, and may delay investment in cleaner options.
- Expropriation of private generation without fair compensation or clear, limited-duration authority; can deter investment and create legal/financial instability.
- Hasty rollback of essential social protections; will exacerbate poverty, hunger, and unrest.

Key performance indicators (KPIs) and metrics to monitor success
- Life preservation and health: excess winter mortality rate; hospital occupancy; ambulance wait times; number of heat-related/winter-mortality risk events.
- Electricity access and reliability: percentage of households with stable electricity; regional outage duration; reliability metrics (SAIDI/SAIFI).
- Economic impact: GDP loss relative to baseline; unemployment/firm closures; inflation for essential goods.
- Social cohesion: public trust scores; incidents of unrest; social protection uptake rates.
- Emissions and climate commitments: emissions intensity of electricity; total emissions from the energy sector; progress toward clean energy targets (as a share of generation mix).
- Fuel security: days of gas/diesel reserves remaining; LNG import reliability; availability of emergency generation capacity.
- Operational readiness: speed of approvals; time to secure emergency power contracts; interconnections with neighbors; grid restoration rates.

Information needs and rapid data gathering
- Immediate data gaps to fill (with rapid procurement methods):
  - Current power mix, real-time demand, and outages by region; fuel stock levels (gas, LNG, diesel); status of grid infrastructure and transmission constraints.
  - Household vulnerability data (poverty, heating needs, housing insulation quality); access to electricity by community (rural/urban split).
  - Healthcare capacity (beds, ICU capacity, oxygen supply) and water/sewage system reliability.
  - Import options: neighbor country interties, LNG market availability, port and logistics readiness, and any regional energy-sharing agreements.
  - Financing and liquidity: emergency funds, IMF/World Bank/neighboring-country support, concessional loans, and guarantees.
  - Public sentiment and acceptance of priority rules; risk of social unrest.
- Rapid data collection methods:
  - Real-time data feeds from national grid operator(s), fuel suppliers, and major utilities; centralized dashboard with SCADA and outage maps.
  - Quick household surveys and vulnerability mapping via mobile, SMS, or field teams; use of existing social protection databases to target subsidies.
  - Satellite and remote sensing to assess grid constraints, outage clusters, and infrastructure damage when on-ground data is slow.
  - International partners’ data (e.g., IEA, World Bank) for benchmarking and guidance; crisis lending data from IMF/World Bank.
  - Establish a rapid-response information unit that collates, validates, and distributes daily situation briefings to decision-makers and public.

What information would help decisions most, and how to obtain it quickly
- Accurate demand projections under different behavior and efficiency scenarios; current and projected fuel availability; real-time grid reliability metrics; and the status of neighboring grids for imports.
- How to obtain:
  - Require immediate reporting from all grid operators and major utilities; deploy a centralized data-sharing platform; authorize temporary data-sharing agreements with neighboring countries.
  - Commission rapid field assessments in hardest-hit regions; use mobile data collection to update vulnerability indices; run a short wave of targeted surveys in high-risk areas.
  - Engage international partners for financing and technical expertise; request expedited reviews and disbursements.

Illustrative example of a time-phased plan (condensed)
- Immediate (days–2 weeks):
  - Activate crisis authority; protect hospitals, water, emergency services; roll out disconnection moratorium and targeted subsidies; start demand-reduction campaign; begin emergency generation procurement and fuel logistics; set up ethical allocation oversight.
  - Targets (illustrative): 100% of critical services protected; 60–75% of urban households receive heating support; 8–12% reduction in demand from behavior/efficiency; emergency generation adding 5–10% of peak demand in areas with urgent need.
- Short-term (weeks–months):
  - Deploy microgrids/solar-plus-storage in priority regions; upgrade critical grid segments; secure LNG/fuel for extended operation; expand social protection; pursue interconnections and cross-border sharing where feasible.
  - Targets (illustrative): 2–4 GW new dispatchable capacity from renewables and storage; 70–80% regional demand served via strengthened network; 15–25% share of electricity from non-emitting sources during the crisis period where feasible; 25–35% of population covered by heating/electricity subsidies.
- Medium-term (months–2 years):
  - Build resilient, lower-emission capacity; scale energy efficiency; expand regional power trading; institutionalize crisis management and social protection; align investments with emissions commitments (avoid coal where possible).
  - Targets (illustrative): 5–10 GW non-emitting capacity added; storage/microgrids contributing 4–8 GWh; emissions trajectory improved with lower emission intensity of electricity; GDP stabilization and recovery path kept within a reasonable forecast band.

Governance and implementation notes
- Establish a central coordinating body with clear authorities, timelines, and accountability structures (transparency in allocation decisions, regular public briefings).
- Engage stakeholders: ministries, regional authorities, utilities, hospital networks, water utilities, civil society, and international partners; ensure equitable representation in decision-making.
- Financing: leverage emergency funds, concessional finance, and international aid; set clear use-of-funds rules and audit trails to preserve trust and avert fraud.
- Communications: honest, consistent messaging about why measures are necessary, what is being protected, and what households can do to help.

What I’d want to know next (and how to obtain quickly)
- Real-time grid data: current capability by region, outages, and resilience; arrange access to SCADA dashboards or operator reports.
- Fuel stock and logistics: LNG/diesel/gas reserves, current import commitments, pricing; engage ministries and traders to verify availability.
- Vulnerable populations: accurate up-to-date poverty and housing vulnerability maps; work with social protection programs to target subsidies.
- Healthcare and water services: current capacity and outages; ensure backups and fuel allocation for water and healthcare facilities.
- Neighboring countries’ capacity and political willingness to share power; establish clear agreements and timelines.
- Financing and procurement timelines: what emergency financing is realistically available this quarter; expedite contracting rules.
- Public sentiment and trust indicators: sentiment analytics and hotlines to gauge reaction to allocations and messaging.

Bottom line
- A crisis of this scale requires a carefully phased mix of protection of life and essential services, aggressive demand reduction, rapid deployment of emergency power, and a credible medium-term plan to expand low-emission capacity and energy efficiency. An ethical triage framework reduces avoidable death and hardship while preserving social cohesion and staying true to climate commitments as much as feasible.
- If you’d like, I can tailor this to your country by plugging in your actual numbers (population, GDP, peak load, current generation mix, gas reserves, and grid specifics) and produce a concrete, device-ready plan with quantified targets, timelines, and a dashboard template.

In [None]:
# Anthropic has a slightly different API, and max_tokens is required

model_name = "claude-sonnet-4-5"

claude = Anthropic()
response = claude.messages.create(model=model_name, messages=messages, max_tokens=1000)
answer = response.content[0].text

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
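
Two details of the Anthropic call are worth checking in code. `response.content` is a list of content blocks rather than a single string, and with `max_tokens=1000` a long answer can be cut off, in which case the response's `stop_reason` is `"max_tokens"`. The helpers below are a sketch with names of my own choosing (`extract_text`, `was_truncated`), and the stand-in response object is invented so the example runs without an API key.

```python
from types import SimpleNamespace

def extract_text(response):
    """Concatenate the text of every 'text' block in an Anthropic-style response."""
    return "".join(
        block.text
        for block in response.content
        if getattr(block, "type", None) == "text"
    )

def was_truncated(response):
    """True when generation stopped because it hit the max_tokens limit."""
    return response.stop_reason == "max_tokens"

# Invented stand-in that mimics the shape of a Messages API response
fake_response = SimpleNamespace(
    content=[SimpleNamespace(type="text", text="Phase 1: protect hospitals")],
    stop_reason="max_tokens",
)

print(extract_text(fake_response))
print(was_truncated(fake_response))
```

In the cell above, `extract_text(response)` could stand in for `response.content[0].text`, and a `True` from `was_truncated(response)` suggests raising `max_tokens`.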

In [None]:
gemini = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
model_name = "gemini-2.5-flash"

response = gemini.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
deepseek = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
model_name = "deepseek-chat"

response = deepseek.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
# Updated with the latest Open Source model from OpenAI

groq = OpenAI(api_key=groq_api_key, base_url="https://api.groq.com/openai/v1")
model_name = "openai/gpt-oss-120b"

response = groq.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
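
The Gemini, DeepSeek and Groq cells differ only in `base_url`, API key and model name, because each is called through the same `OpenAI` client class. One way to factor out the repetition is sketched below. The function name `ask_and_record` is hypothetical, and the fake client is an invented stand-in so the example runs offline; in the notebook you would pass `gemini`, `deepseek` or `groq` instead.

```python
from types import SimpleNamespace

def ask_and_record(client, model_name, messages, competitors, answers):
    """Call an OpenAI-compatible chat endpoint and record the model name and answer."""
    response = client.chat.completions.create(model=model_name, messages=messages)
    answer = response.choices[0].message.content
    competitors.append(model_name)
    answers.append(answer)
    return answer

def make_fake_client(reply):
    """Invented offline stand-in with the same attribute path as an OpenAI client."""
    def create(model, messages):
        message = SimpleNamespace(content=f"[{model}] {reply}")
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])
    return SimpleNamespace(
        chat=SimpleNamespace(completions=SimpleNamespace(create=create))
    )

demo_competitors, demo_answers = [], []
demo_messages = [{"role": "user", "content": "A test question"}]
ask_and_record(
    make_fake_client("hello"), "demo-model", demo_messages,
    demo_competitors, demo_answers,
)
print(demo_competitors, demo_answers)
```

Keeping the two lists in step inside one function also avoids the easy mistake of appending to `competitors` but not to `answers` when a cell is edited.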


## For the next cell, we will use Ollama

Ollama runs a local web service that gives an OpenAI compatible endpoint,  
and runs models locally using high performance C++ code.

If you don't have Ollama, install it by visiting https://ollama.com, pressing Download, and following the instructions.

After it's installed, you should be able to visit http://localhost:11434 and see the message "Ollama is running".

You might need to restart Cursor (and maybe reboot). Then open a Terminal (control+\`) and run `ollama serve`

Useful Ollama commands (run these in the terminal, or with an exclamation mark in this notebook):

`ollama pull <model_name>` downloads a model locally  
`ollama ls` lists all the models you've downloaded  
`ollama rm <model_name>` deletes the specified model from your downloads
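
The "Ollama is running" check can also be done from Python before you call the model. This is a minimal sketch using only the standard library; the function name `ollama_is_running` is my own, and it assumes the default address mentioned above.

```python
import urllib.request

def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if the local Ollama service answers with its status message."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error or malformed URL
        return False
    return "Ollama is running" in body
```

Calling `ollama_is_running()` in a cell returns `False` when the service is not reachable, which is a hint to run `ollama serve` in a terminal first.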

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Super important - ignore me at your peril!</h2>
            <span style="color:#ff7800;">The model called <b>llama3.3</b> is FAR too large for home computers - it's not intended for personal computing and will consume all your resources! Stick with the nicely sized <b>llama3.2</b> or <b>llama3.2:1b</b> and if you want larger, try llama3.1 or smaller variants of Qwen, Gemma, Phi or DeepSeek. See <a href="https://ollama.com/models">the Ollama models page</a> for a full list of models and sizes.
            </span>
        </td>
    </tr>
</table>

In [None]:
!ollama pull llama3.2

In [None]:
ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
model_name = "llama3.2"

response = ollama.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
# So where are we?

print(competitors)
print(answers)


In [None]:
# It's nice to know how to use "zip"
for competitor, answer in zip(competitors, answers):
    print(f"Competitor: {competitor}\n\n{answer}")


In [None]:
# Let's bring this together - note the use of "enumerate"

together = ""
for index, answer in enumerate(answers):
    together += f"# Response from competitor {index+1}\n\n"
    together += answer + "\n\n"
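
`enumerate` also accepts a `start` argument, which removes the need for `index+1`. The sketch below builds the same string with `str.join`; the function name `combine_answers` and the sample answers are illustrative only.

```python
def combine_answers(answers):
    """Label each answer with its 1-based competitor number and concatenate them."""
    return "".join(
        f"# Response from competitor {number}\n\n{answer}\n\n"
        for number, answer in enumerate(answers, start=1)
    )

sample = combine_answers(["First answer", "Second answer"])
print(sample)
```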

In [None]:
print(together)

In [None]:
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""


In [None]:
print(judge)

In [None]:
judge_messages = [{"role": "user", "content": judge}]

In [None]:
# Judgement time!

openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-5-mini",
    messages=judge_messages,
)
results = response.choices[0].message.content
print(results)


In [None]:
# OK let's turn this into results!

results_dict = json.loads(results)
ranks = results_dict["results"]
for index, result in enumerate(ranks):
    competitor = competitors[int(result)-1]
    print(f"Rank {index+1}: {competitor}")
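
`json.loads(results)` fails if the judge wraps its JSON in a markdown code fence despite the instruction not to, and `competitors[int(result)-1]` fails if a number is out of range. A more defensive parse might look like the sketch below. The function name `parse_ranking` is hypothetical, and the validation rule (each competitor number must appear exactly once) is my assumption about what a valid ranking should be.

```python
import json

FENCE = "`" * 3  # a markdown code fence, built this way to keep this example readable

def parse_ranking(raw, competitors):
    """Turn the judge's JSON reply into a list of competitor names, best first."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (with or without a language tag) and the closing fence
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    ranks = json.loads(text)["results"]
    indices = [int(rank) - 1 for rank in ranks]
    # Assumed rule: a valid ranking names every competitor exactly once
    if sorted(indices) != list(range(len(competitors))):
        raise ValueError(f"Ranking {ranks} does not cover each competitor exactly once")
    return [competitors[i] for i in indices]

demo = parse_ranking('{"results": ["2", "1"]}', ["model-a", "model-b"])
for position, name in enumerate(demo, start=1):
    print(f"Rank {position}: {name}")
```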

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/exercise.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Exercise</h2>
            <span style="color:#ff7800;">Which pattern(s) did this use? Try updating this to add another Agentic design pattern.
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">Patterns like this one, where you send a task to multiple models and then evaluate the results,
            are common when you need to improve the quality of your LLM response. The approach applies broadly
            to business projects where accuracy is critical.
            </span>
        </td>
    </tr>
</table>