## Welcome to the Second Lab - Week 1, Day 3

Today we will call several different models! This is a good way to get comfortable with their APIs.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Important point - please read</h2>
            <span style="color:#ff7800;">The way I collaborate with you may be different to other courses you've taken. I prefer not to type code while you watch. Rather, I execute Jupyter Labs, like this, and give you an intuition for what's going on. My suggestion is that you carefully execute this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations.<br/><br/>If you have time, I'd love it if you submit a PR for changes in the community_contributions folder - instructions in the resources. Also, if you have a GitHub account, use it to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...
            </span>
        </td>
    </tr>
</table>

In [2]:
# Start with imports - ask ChatGPT to explain any package that you don't know

import os
import json
from dotenv import load_dotenv
from openai import OpenAI
from anthropic import Anthropic
from IPython.display import Markdown, display

In [3]:
# Always remember to do this!
load_dotenv(override=True)

True

In [4]:
# Print the key prefixes to help with any debugging

openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key not set (and this is optional)
Google API Key not set (and this is optional)
DeepSeek API Key not set (and this is optional)
Groq API Key not set (and this is optional)
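
The five near-identical checks in the cell above could be collapsed into a loop. Here is a minimal sketch of that idea; the names `KEYS`, `describe_key` and `report_keys` are made up for illustration and are not part of the lab, and the prefix lengths simply mirror the ones used above.

```python
import os

# Environment variable -> (display name, prefix length to show, optional?)
# These values mirror the cell above; the table itself is illustrative.
KEYS = {
    "OPENAI_API_KEY": ("OpenAI", 8, False),
    "ANTHROPIC_API_KEY": ("Anthropic", 7, True),
    "GOOGLE_API_KEY": ("Google", 2, True),
    "DEEPSEEK_API_KEY": ("DeepSeek", 3, True),
    "GROQ_API_KEY": ("Groq", 4, True),
}


def describe_key(label, value, prefix_len, optional=True):
    """Return a one-line status message that shows only the key prefix."""
    if value:
        return f"{label} API Key exists and begins {value[:prefix_len]}"
    suffix = " (and this is optional)" if optional else ""
    return f"{label} API Key not set{suffix}"


def report_keys(env=os.environ):
    """Build one status line per provider from a mapping of environment variables."""
    return [
        describe_key(label, env.get(name), prefix_len, optional)
        for name, (label, prefix_len, optional) in KEYS.items()
    ]
```

Calling `print("\n".join(report_keys()))` after `load_dotenv(override=True)` should give the same kind of summary as the cell above.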


In [5]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]

In [6]:
messages

[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]

In [7]:
openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-5-mini",
    messages=messages,
)
question = response.choices[0].message.content
print(question)


Design three distinct, evidence‑based national strategies—one primarily technological (e.g., a specific low‑carbon technology rollout), one primarily policy/regulatory (e.g., pricing, standards, or market reforms), and one primarily social/behavioral (e.g., demand‑reduction or cultural shifts)—that together could plausibly reduce a high‑income country's greenhouse‑gas emissions by 50% within 15 years without causing a contraction in GDP; for each strategy, (a) explain the causal mechanism linking intervention to emissions reductions, (b) give a realistic timeline and a high‑level budget allocation assuming an additional annual public‑spending envelope equal to 1% of GDP, (c) identify the key public and private actors and how to align their incentives, (d) list the three biggest risks or likely unintended consequences and concrete mitigations, (e) specify five measurable metrics to monitor progress and the statistical or modeling approaches you would use to detect whether the interventi

In [8]:
competitors = []
answers = []
messages = [{"role": "user", "content": question}]

## Note - update since the videos

I've updated the model names below to use the latest models, like GPT-5 and Claude Sonnet 4.5. It's worth noting that these models can be quite slow, taking 1-2 minutes per call, but they do a great job! Feel free to switch them for faster models if you'd prefer, like the ones I use in the video.
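
If you want to see how much slower one model is than another, you could time each call. This is only a sketch; `timed` is a hypothetical helper, not something the lab provides.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn with the given arguments and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

For example, `response, seconds = timed(openai.chat.completions.create, model=model_name, messages=messages)` would wrap the call used in the next cell.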

In [9]:
# The API we know well
# I've updated this with the latest model, but it can take some time because it likes to think!
# Replace the model with gpt-4.1-mini if you'd prefer not to wait 1-2 mins

model_name = "gpt-5-nano"

response = openai.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Below are three distinct, evidence-informed national strategies that together could plausibly deliver a 50% reduction in a high‑income country’s greenhouse‑gas (GHG) emissions within ~15 years, without a contraction in GDP. The three are designed to be complementary: a technological rollout, a policy/regulatory package, and a social/behavioral program. For each, I provide (a) causal mechanism, (b) timeline and high‑level budget (assuming 1% of GDP per year in new public spending, allocated across the three strategies), (c) actors and incentives, (d) risks and mitigations, (e) five measurable metrics and modeling approaches, and (f) probabilistic contribution estimates to the 50% target. I finish with three empirical observations in the first five years that would force abandoning the plan and why they would be decisive.

 Assumptions used in the package
- Target country: a high‑income nation with a relatively large, energy‑intensive economy and mature institutions (e.g., similar to the United States, EU members, or advanced economies in OECD/OCED).
- Baseline: current policy landscape already includes some decarbonization measures; the plan adds a concerted, scaled push across technology, policy, and behavior.
- Budget envelope: an additive annual public spending envelope equal to 1% of GDP (scales with GDP). For a USD 25 trillion GDP country, that is about USD 250 billion per year. Allocation here is high‑level and illustrative; exact shares can be tuned to national priorities.

Strategy A — Primarily technological: Rapid nationwide rollout of green hydrogen (GH2) for hard‑to‑electrify sectors and a supporting hydrogen/electrification ecosystem
(a) Causal mechanism
- GH2 provides a zero‑carbon energy carrier and feedstock for sectors that are difficult to electrify directly (industrial process heat in steel, ammonia production, refineries, and potentially long‑haul transport and aviation). Electrolyzers coupled with abundant renewables (and potentially paired with storage) reduce CO2 from process heat, refining, and chemical production. Hydrogen‑driven or hydrogen‑enabled processes displace fossil fuels and, with clean electricity, deliver deep decarbonization in sectors where electrification is insufficient or impractical. Economies of scale and learning-by-doing compress costs (LCOH) over time, expanding GH2’s cost competitiveness and enabling broader sector uptake.
(b) Timeline and budget allocation
- Timeline: 0–2 years: accelerate permitting, grid/renewables integration, regulatory frameworks for GH2, and early pilot facilities; 2–6 years: scale electrolyzer capacity, build dedicated hydrogen pipelines/receipts, and long‑term contracts with industrial users; 6–12 years: broad diffusion into major hard‑to‑electrify sectors; 12–15 years: stabilize market, achieve substantial share of industrial process heat and a growing share of transport/refueling.
- Budget (illustrative, 1% GDP envelope, total ~USD 250B/year for a USD 25T economy; allocation example):
  - GH2 production capacity and electrolyzers: ~40% (~USD 100B/year)
  - Hydrogen storage/transmission infrastructure and pipelines/related grid integration: ~15% (~USD 37–40B/year)
  - Sector pilots, offtake contracts, and government offtake guarantees for industrial users: ~15% (~USD 37–40B/year)
  - R&D, standards, certification, safety regimes, and workforce training: ~10% (~USD 25B/year)
  - Early‑stage decarbonized fuel retail/industrial hubs and PPAs with price support where necessary: ~20% (~USD 50B/year)
- Note: Actual allocation should be guided by sector readiness, regional availability of renewables, and domestic industrial strategy. The aim is to reach a cumulative GH2 capacity that meaningfully displaces fossil process heat (e.g., tens of gigawatts of electrolyzer capacity over the period) and a growing share of GH2‑dependent sectors.
(c) Key public and private actors and incentive alignment
- Public actors: energy ministry/agency, transport/industry ministries, grid operator, regulator, finance ministry (for guarantees/subsidies), and competent authorities for safety and permitting.
- Private actors: utility/energy suppliers, electrolyzer manufacturers, hydrogen producers and traders, major industrial users (steel, ammonia, refineries), logistics and fuel suppliers, and infrastructure developers.
- Incentives and alignment:
  - Public: provide long‑term offtake guarantees, favorable permitting timelines, regulatory certainty, and public‑private partnerships (PPPs) to de-risk early deployments; create revenue support mechanisms tied to GHG reductions and system reliability.
  - Private: secure predictable demand (industrial PPAs, long‑term contracts), access to low‑cost financing, and tariff plus capacity payments tied to hydrogen uptake and CO2 reductions; ensure safety and certification regimes to build public trust.
(d) Three biggest risks and mitigations
- Risk 1: High upfront capital and uncertain cost trajectory for GH2—mitigate with scale‑neutral procurement, shared risk through PPAs, learning‑by‑doing incentives, and public guarantees for early projects; accelerate cost reductions via competitive tenders and international collaboration on R&D.
- Risk 2: Infrastructure bottlenecks and misalignment with renewables/grid capacity—mitigate with integrated planning, grid upgrades, and modular pipeline/port strategies; staggered rollouts aligned with renewables buildout and demand.
- Risk 3: Demand shortfall or sectoral uptake delays (industrial users slow to switch, certification hassles)—mitigate with mandatory long‑term offtake clauses for key sectors, phased requirements, and financial incentives tied to emission reductions; ensure policy certainty and safety compliance to reduce investment risk.
(e) Five metrics and modeling approaches
- Metrics:
  1) Cumulative GH2 production capacity (MW of electrolyzers) and annual hydrogen production (kg or tonnes/year)
  2) Share of key sectors using GH2 for process heat (e.g., steel, ammonia, refining) and corresponding CO2 emissions reductions (MtCO2e/year)
  3) LCOH for green hydrogen and its price parity over time with fossil fuels in target sectors
  4) Hydrogen storage capacity, pipeline/network reach, and supply reliability (availability factors)
  5) Number of industrial facilities under long‑term GH2 PPAs and volume of government offtake/goodwill guarantees
- Modeling approaches:
  - Energy system models (e.g., TIMES/MESSAGE) to project GH2 supply/demand, LCOH, and integration with renewables
  - CGE/ macro‑economic models to assess GDP, employment, and sectoral effects under GH2 deployment
  - Sector‑level techno‑economic analyses for steel, ammonia, refineries to estimate leakage, emissions reductions, and cost curves
  - Time‑series analysis of capacity buildout, learning curves, and price declines (panel data across jurisdictions)
  - Certification/safety risk modeling and scenario testing for pipeline networks
(f) Probabilistic contribution to the 50% target (with ranges and rationale)
- Estimated contribution when combined with Strategies B and C:
  - Low‑to‑mid range: 12–22% of total emissions reduction
  - Central estimate: ~17–28%
  - Upper bound (with aggressive uptake, favorable tech learning, and rapid sectoral switching): 28–40%
- Rationale: GH2 primarily affects hard‑to‑electrify sectors and industrial heat; its impact compounds when electrolyzers scale and costs fall, creating decarbonization in sectors that electrification cannot easily reach. However, full decarbonization from GH2 alone is unlikely; synergy with electrification, efficiency, and demand reduction (Strategies B and C) is essential to reach 50%.
Three empirical signals that would force abandoning Strategy A (or adjusting it drastically):
  1) GH2 demand fails to materialize in critical sectors within five years, combined with persistent high electrolyzer costs that cannot reach competitive LCOH even with policy support, undermining sectoral decarbonization targets.
  2) Major safety/regulatory barriers, leakage/robo‑costs, or grid reliability issues caused by large GH2 volumes that prevent reliable energy service delivery or raise consumer prices beyond tolerable levels.
  3) Large, persistent stranded‑asset risks or economic disruption from GH2 investments that cannot be mitigated by sectoral alignment or diversification, leading to net negative macro impacts despite policy efforts.

 Strategy B — Primarily policy/regulatory: Robust carbon pricing, standards, and market reforms to align prices with true social costs and accelerate decarbonization
(a) Causal mechanism
- A credible, economy‑wide carbon price (with border adjustments where appropriate) raises the shadow price of carbon across sectors, incentivizing low‑carbon investment and efficiency, while standards and market reforms (building codes, vehicle efficiency, performance standards, building/industrial energy efficiency mandates) push the pace of decarbonization. Revenue recycling to households or firms maintains GDP growth and can offset price increases, ensuring that reduced emissions do not come at the cost of economic contraction. Market reforms reduce regulatory friction and accelerate adoption of low‑carbon technologies.
(b) Timeline and budget allocation
- Timeline: 0–2 years: design and legislate carbon pricing (starting price and trajectory), establish border adjustments, reform fossil subsidies; 2–6 years: implement and adjust standards; begin revenue recycling and investment grants; 6–12 years: broaden coverage, tighten standards, monitor leakage and competitiveness; 12–15 years: achieve deep decarbonization across sectors.
- Budget (illustrative, 1% GDP envelope, USD ~USD 250B/year; allocation example):
  - Carbon pricing design, administration, and border adjustments: ~35% (~USD 87–90B/year)
  - Revenue recycling, rebates, and targeted support to low‑income households and energy‑intensive trades: ~25% (~USD 62–65B/year)
  - Standards development, compliance enforcement, grading/labeling schemes, and market reforms: ~20% (~USD 50B/year)
  - Public‑sector efficiency upgrades and policy evaluation/monitoring: ~10% (~USD 25B/year)
  - Transition support for workers and regions affected by decarbonization, plus aligned investments (R&D, pilots, and deployment grants): ~10% (~USD 25B/year)
- Note: The exact mix depends on the chosen carbon price path, revenue recycling design, and the breadth of sectors covered.
(c) Key public and private actors and incentive alignment
- Public actors: finance/ministry of economy, environmental/energy ministries, tax authority, competition/consumer protection agencies, central bank/financial regulators, and border‑control/import regimes.
- Private actors: fossil fuel producers, energy utilities, industry associations, consumer goods and service sectors, financial institutions, and investors.
- Incentives and alignment:
  - Public: credible price signal with transparent trajectory, revenue recycling to maintain affordability and competitiveness, and smart subsidies for early adopters/fallbacks for vulnerable households.
  - Private: price certainty and policy stability to plan investments; financial instruments (green bonds, performance‑based subsidies) aligned to emissions outcomes; regulatory certainty to avoid stranded assets.
(d) Three biggest risks and mitigations
- Risk 1: Carbon price volatility or political reversal undermining credibility—mitigate with a transparent, statutory price trajectory, credible border adjustments, and multiyear implementation with legislative guardrails.
- Risk 2: Competitiveness and leakage concerns for exposed sectors—mitigate with robust border carbon adjustments, targeted sectoral transitional support, and energy‑intensity benchmarks.
- Risk 3: Public acceptance and affordability concerns for households—mitigate with revenue recycling to offset regressive impacts, rebates for low‑income households, and targeted subsidies to maintain energy access.
(e) Five metrics and modeling approaches
- Metrics:
  1) Carbon price level and trajectory; share of economy covered by price signal
  2) Emissions by sector over time (MtCO2e/year) and emissions intensity of GDP
  3) Energy intensity and energy efficiency improvements (e.g., energy use per unit of output)
  4) Building, vehicle, and industrial standards adoption rates and compliance
  5) Revenue recycling effectiveness (household/firm net energy costs, inflation, and distributional impacts)
- Modeling approaches:
  - Computable General Equilibrium (CGE) models to assess macroeconomic and sectoral impacts of carbon pricing and fiscal recycling
  - Econometric difference‑in‑differences or synthetic control analyses to gauge policy shifts on emissions and GDP
  - Energy system optimization models to project the carbon price’s effect on technology adoption, efficiency, and fuel switching
  - Distributional analyses to monitor equity and affordability
  - Sectoral case studies and pilot programs to calibrate standards and leakage controls
(f) Probabilistic contribution to the 50% target
- Estimated contribution when combined with A and C:
  - Lower range: 12–22% of total reductions
  - Central estimate: ~18–28%
  - Upper bound: 25–35%
- Rationale: A strong carbon price can rapidly shift investment toward zero‑carbon options and energy efficiency, with broad GDP protection if revenue recycling is well designed. However, the magnitude depends on price path, coverage, global policy context, and leakage controls; standalone pricing cannot fully decarbonize without accompanying tech and behavioral changes.
Three empirical signals within five years that would force reconsideration of Strategy B (or its design):
  1) The carbon price fails to deliver meaningful emissions reductions or leads to significant leakage despite border measures, with GDP growth grudging and competitiveness deteriorating in key sectors.
  2) Revenue recycling yields regressive outcomes or affordability declines for households, causing political backlash and policy rollback.
  3) Standards and market reforms cause disproportionate compliance costs or supply chain disruptions that reduce overall economic welfare without commensurate emissions reductions.

 Strategy C — Primarily social/behavioral: Demand reduction and cultural shifts to lower energy consumption and carbon intensity
(a) Causal mechanism
- Behavioral and cultural shifts, supported by information campaigns, social norms, urban design, and incentives, reduce energy demand (e.g., through efficiency, reduced travel demand, shift to low‑carbon goods, and changes in consumption patterns). Coupled with nudges, feedback, and transparent labeling, these changes lower energy use and emissions directly, while maintaining or enhancing quality of life and GDP via more efficient service delivery and demand management.
(b) Timeline and budget allocation
- Timeline: 0–1.5 years: design and pilot programs (digital feedback tools, labeling, education campaigns); 1.5–6 years: nationwide rollout of demand‑side programs, building retrofits, and mobility shifts; 6–12 years: sustained behavioral change and urban design improvements; 12–15 years: consolidation and reinforcement of norms with widespread decarbonized living.
- Budget (illustrative, 1% GDP envelope; allocation example):
  - Public awareness and education campaigns, behavioral nudges, and social marketing: ~25% (~USD 62–65B/year)
  - Home/building energy efficiency retrofits and appliance efficiency subsidies/financing: ~40% (~USD 100–105B/year)
  - Urban planning, transportation demand management, and multimodal infrastructure (cycling, public transit, EV charging access): ~25% (~USD 62–65B/year)
  - Digital feedback systems, smart meters, data platforms, privacy safeguards, and program evaluation: ~5% (~USD 12–13B/year)
  - Targeted social programs for low‑income households and vulnerable communities during the transition: ~5% (~USD 12–13B/year)
- Note: With behavioral strategies, the emphasis is on cost‑effective efficiency gains, demand management, and shifting consumption toward lower‑carbon options, complemented by the other strategies.
(c) Key public and private actors and incentive alignment
- Public actors: health/education, urban planning, transport ministries, energy/utility regulators, housing agencies, local governments, and consumer protection authorities.
- Private actors: utilities, home appliance manufacturers, HVAC contractors, ride‑hailing/transport operators, retailers, and platform providers for energy feedback tools.
- Incentives and alignment:
  - Public: incentives for energy‑efficient retrofits, modernization of building codes, and urban design that reduces transport demand; data sharing and privacy protections to enable feedback systems.
  - Private: revenue gains from efficiency upgrades, demand‑side management programs, and new services (home energy management, smart devices) that capture savings and improve customer loyalty.
(d) Three biggest risks and mitigations
- Risk 1: Public fatigue, behavioral rebound, or low adoption of efficiency measures—mitigate with targeted outreach, social norm campaigns, incentives, and a mix of “hard” (codes, rebates) and “soft” (information, feedback) levers.
- Risk 2: Equity gaps in access to retrofits or mobility options—mitigate with targeted programs for low‑income households, renters, and vulnerable communities; ensure affordable financing and outreach in underserved areas.
- Risk 3: Data privacy and digital divide in feedback platforms—mitigate with strong privacy safeguards, opt‑in models, and offline components; invest in digital inclusion programs.
(e) Five metrics and modeling approaches
- Metrics:
  1) Household and building energy intensity per square meter (kWh/m2) and appliance efficiency uptake (percent of homes with high‑efficiency appliances)
  2) Energy consumption per capita and per GDP (energy intensity)
  3) Travel behavior changes: modal share (public transit, cycling, walking) and vehicle‑kilometers traveled (VKT) per capita
  4) Telework/remote work share and urban form measures (e.g., urban density, compactness, commuting times)
  5) Food/consumption patterns and waste reduction metrics (e.g., food waste kilograms per capita; plant‑based diet adoption rates)
- Modeling approaches:
  - Behavioral economics and agent‑based models to simulate adoption of efficiency technologies and behavior changes
  - Panel and time‑series econometrics to quantify impacts of campaigns, incentives, and labeling
  - Urban‑scale transport modeling to estimate mode shifts and congestion impacts
  - Life cycle assessments for appliances and buildings to quantify emissions reductions per retrofit
  - Difference‑in‑differences across regions with varying program intensity to isolate effects
(f) Probabilistic contribution to the 50% target
- Estimated contribution when combined with A and B:
  - Lower range: 8–15% of total reductions
  - Central estimate: ~12–20%
  - Upper bound: 20–28%
- Rationale: Demand‑side improvements can yield substantial savings, especially when paired with efficiency standards and incentives. However, behavioral changes alone typically yield smaller, slower emissions reductions than technology rollouts or pricing, so the contribution is meaningful but not dominating; synergy with Strategies A and B is essential.
Three empirical signals within five years that would force abandoning Strategy C (or its design)
  1) Actual demand reductions lag significantly behind targets, with energy services failing to improve commensurately and GDP per capita growth emerging weaker than baseline projections.
  2) Equity gaps persist or widen, and program acceptance deteriorates (e.g., low‑income households remain unable to access retrofits or affordable efficiency solutions), triggering political backlash.
  3) Digital feedback platforms fail to deliver measurable behavior change due to data privacy concerns, digital exclusion, or insufficient engagement, undermining confidence in the social program.

 How the three strategies fit together to plausibly hit a 50% emissions reduction in 15 years
- Complementarity: Strategy A delivers decarbonization in hard‑to‑electrify sectors via GH2 and associated infrastructure; Strategy B provides the price signals, standards, market reforms, and revenue recycling that accelerate electrification, efficiency, technology deployment, and cross‑sector decarbonization; Strategy C reduces demand and shifts consumption patterns, reducing energy needs and easing system strain, thereby amplifying the effects of A and B.
- Expected contribution ranges (summarizing across strategies):
  - Strategy A (Technological GH2 rollout): 12–28% (central ~17–28%)
  - Strategy B (Policy/regulatory): 12–28% (central ~18–28%)
  - Strategy C (Social/behavioral): 8–28% (central ~12–20%)
  - Combined central estimate: roughly 46–76% across the three; a plausible plan could be tuned to target a ~50% combined reduction within 15 years, recognizing uncertainties and the need for strong execution across all three streams.
- Important caveats: The ranges reflect uncertainties in technology costs, uptake rates, macroeconomic responses, and external factors such as global energy prices and the pace of renewables deployment. The plan’s robustness depends on credible policy design, coherent implementation, and continuous monitoring.

Three concrete, early‑warning empirical observations in the first five years that would cause you to abandon or radically re‑design the plan
- Observation 1: Emissions reductions fall short of trajectory by a large margin and GDP growth weakens; e.g., multi‑year emissions reductions are underperforming relative to model projections by a margin of, say, 30% or more, despite near‑term investments. Why decisive: if the plan cannot bend the emissions trajectory without harming GDP, the cost‑benefit balance becomes unfavorable and alternative strategies or a reassessment of ambition is warranted.
- Observation 2: Systemic cost overruns or affordability crises materialize (e.g., energy prices spike, consumer energy bills escalate unsustainably, or GH2/renewables costs fail to converge as expected). Why decisive: if the plan imposes unmanageable costs on households or industry, political and public support erodes, undermining long‑term feasibility.
- Observation 3: Unintended, large‑scale adverse effects occur (e.g., significant supply chain vulnerabilities, employment losses in key sectors, or energy security concerns from reliance on imports of critical inputs) that cannot be mitigated effectively. Why decisive: persistent adverse macroeconomic or social consequences undermine the legitimacy and durability of the plan, making it necessary to pause, recalibrate, or replace with alternative approaches.

Operational notes and cautions
- The precise numerical allocations and targets should be tailored to the country’s actual GDP, energy mix, industrial structure, and political economy. The above framework emphasizes: (i) a credible, long‑term price and policy signal; (ii) scalable technology deployment where it is most cost‑effective and rapid; and (iii) demand management that reduces unnecessary energy use while preserving or enhancing welfare.
- Monitoring and adaptive management are essential. Regular independent evaluation, transparent dashboards, and adjustable funding shares based on performance should be built into the design.
- Effective sequencing matters: early wins (e.g., efficiency gains, affordable electrification, and GH2 pilots in heavy industry) can build political and public capital for deeper decarbonization later.

If you’d like, I can tailor these three strategies to a specific country profile (e.g., United States, Germany, France, or Japan) with country‑specific GDP figures, current policy baselines, sectoral composition, and energy prices.

In [None]:
# Anthropic has a slightly different API, and the max_tokens argument is required

model_name = "claude-sonnet-4-5"

claude = Anthropic()
response = claude.messages.create(model=model_name, messages=messages, max_tokens=1000)
answer = response.content[0].text

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
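
With `max_tokens=1000`, a long answer to a question like this one may be cut off. A small sketch for detecting that, assuming the response object exposes a `stop_reason` field whose value is `"max_tokens"` when the limit was hit; `was_truncated` is a hypothetical helper name.

```python
def was_truncated(response):
    """Return True if the response reports that it stopped at the token limit."""
    # getattr keeps this safe for objects that have no stop_reason attribute
    return getattr(response, "stop_reason", None) == "max_tokens"
```

If `was_truncated(response)` is true, raising `max_tokens` in the cell above is the obvious thing to try.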

In [None]:
gemini = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
model_name = "gemini-2.5-flash"

response = gemini.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
deepseek = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
model_name = "deepseek-chat"

response = deepseek.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
# Updated with the latest Open Source model from OpenAI

groq = OpenAI(api_key=groq_api_key, base_url="https://api.groq.com/openai/v1")
model_name = "openai/gpt-oss-120b"

response = groq.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
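
The Gemini, DeepSeek and Groq cells above all repeat the same steps against an OpenAI-compatible client, so they could be folded into one function. A sketch under that assumption; `ask` is a hypothetical helper, and it does not cover the Anthropic client, which uses `messages.create` and a different response shape.

```python
def ask(client, model_name, messages, competitors, answers):
    """Send messages to an OpenAI-compatible client and record the reply.

    Appends model_name to competitors and the reply text to answers,
    then returns the reply text.
    """
    response = client.chat.completions.create(model=model_name, messages=messages)
    answer = response.choices[0].message.content
    competitors.append(model_name)
    answers.append(answer)
    return answer
```

With this in place, a cell would shrink to something like `display(Markdown(ask(groq, "openai/gpt-oss-120b", messages, competitors, answers)))`.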


## For the next cell, we will use Ollama

Ollama runs a local web service that exposes an OpenAI-compatible endpoint,  
and runs models locally using high-performance C++ code.

If you don't have Ollama, install it by visiting https://ollama.com, pressing Download, and following the instructions.

After it's installed, you should be able to visit here: http://localhost:11434 and see the message "Ollama is running"
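
You can make the same check from Python instead of the browser. A minimal sketch using only the standard library; `is_ollama_running` is a hypothetical helper, and it assumes the server answers a plain GET with the "Ollama is running" message described above.

```python
import urllib.request


def is_ollama_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if a GET to base_url returns the 'Ollama is running' message."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except OSError:
        # Covers connection refused, timeouts and HTTP error responses
        return False
    return "Ollama is running" in body
```

If `is_ollama_running()` returns `False`, start the server with `ollama serve` as described below and try again.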

You might need to restart Cursor (and maybe reboot). Then open a Terminal (control+\`) and run `ollama serve`

Useful Ollama commands (run these in the terminal, or with an exclamation mark in this notebook):

`ollama pull <model_name>` downloads a model locally  
`ollama ls` lists all the models you've downloaded  
`ollama rm <model_name>` deletes the specified model from your downloads

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Super important - ignore me at your peril!</h2>
            <span style="color:#ff7800;">The model called <b>llama3.3</b> is FAR too large for home computers - it's not intended for personal computing and will consume all your resources! Stick with the nicely sized <b>llama3.2</b> or <b>llama3.2:1b</b> and if you want larger, try llama3.1 or smaller variants of Qwen, Gemma, Phi or DeepSeek. See <a href="https://ollama.com/models">the Ollama models page</a> for a full list of models and sizes.
            </span>
        </td>
    </tr>
</table>

In [None]:
!ollama pull llama3.2

In [10]:
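# Ollama's local endpoint is OpenAI-compatible; the client requires an api_key, but Ollama ignores its value
ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')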
model_name = "llama3.2"

response = ollama.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

NotFoundError: Error code: 404 - {'error': {'message': "model 'llama3.2' not found", 'type': 'api_error', 'param': None, 'code': None}}

In [11]:
# So where are we?

print(competitors)
print(answers)


['gpt-5-nano']


In [12]:
# It's nice to know how to use "zip"
for competitor, answer in zip(competitors, answers):
    print(f"Competitor: {competitor}\n\n{answer}")


Competitor: gpt-5-nano

Below are three distinct, evidence-informed national strategies that together could plausibly deliver a 50% reduction in a high‑income country’s greenhouse‑gas (GHG) emissions within ~15 years, without a contraction in GDP. The three are designed to be complementary: a technological rollout, a policy/regulatory package, and a social/behavioral program. For each, I provide (a) causal mechanism, (b) timeline and high‑level budget (assuming 1% of GDP per year in new public spending, allocated across the three strategies), (c) actors and incentives, (d) risks and mitigations, (e) five measurable metrics and modeling approaches, and (f) probabilistic contribution estimates to the 50% target. I finish with three empirical observations in the first five years that would force abandoning the plan and why they would be decisive.

 Assumptions used in the package
- Target country: a high‑income nation with a relatively large, energy‑intensive economy and mature institutio

In [None]:
# Let's bring this together - note the use of "enumerate"

together = ""
for index, answer in enumerate(answers):
    together += f"# Response from competitor {index+1}\n\n"
    together += answer + "\n\n"

In [None]:
print(together)

In [None]:
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""


In [None]:
print(judge)

In [None]:
judge_messages = [{"role": "user", "content": judge}]

In [None]:
# Judgement time!

openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-5-mini",
    messages=judge_messages,
)
results = response.choices[0].message.content
print(results)


In [None]:
# OK let's turn this into results!

results_dict = json.loads(results)
ranks = results_dict["results"]
for index, result in enumerate(ranks):
    competitor = competitors[int(result)-1]
    print(f"Rank {index+1}: {competitor}")
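
The `json.loads` call above assumes the judge returns clean JSON with exactly one valid entry per competitor. Models occasionally wrap the JSON in a markdown code fence or return a number that is out of range, so here is a hedged sketch of a more defensive parser. `parse_ranking` is a hypothetical helper, not part of the lab, and the sample string stands in for a real judge reply.

```python
import json

FENCE = "`" * 3  # a markdown code fence, built this way to keep the example readable


def parse_ranking(raw, num_competitors):
    """Parse the judge's reply into a list of 1-based competitor numbers.

    Raises ValueError if the reply is not a complete, duplicate-free ranking.
    """
    text = raw.strip()
    # Tolerate a markdown code fence (optionally labelled "json") around the JSON
    if text.startswith(FENCE):
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]
    try:
        data = json.loads(text)
        ranks = [int(r) for r in data["results"]]
    except (KeyError, TypeError, ValueError) as exc:
        # json.JSONDecodeError is a subclass of ValueError
        raise ValueError(f"Judge reply is not a usable ranking: {exc}") from exc
    if sorted(ranks) != list(range(1, num_competitors + 1)):
        raise ValueError(f"Expected each of 1..{num_competitors} exactly once, got {ranks}")
    return ranks


# Illustrative data only: two competitor names and a made-up judge reply
competitors_demo = ["gpt-5-nano", "llama3.2"]
sample_reply = '{"results": ["2", "1"]}'

for position, number in enumerate(parse_ranking(sample_reply, len(competitors_demo)), start=1):
    print(f"Rank {position}: {competitors_demo[number - 1]}")
```

Validating the ranking before indexing into `competitors` avoids an `IndexError` (or a silently wrong result) when the judge skips or repeats a competitor.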

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/exercise.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Exercise</h2>
            <span style="color:#ff7800;">Which pattern(s) did this use? Try updating this to add another Agentic design pattern.
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">Patterns like this one, sending a task to multiple models and then evaluating the results,
            are common when you need to improve the quality of your LLM responses. The approach applies
            to many business projects where accuracy is critical.
            </span>
        </td>
    </tr>
</table>
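
As a compact recap of the pattern described above, here is a sketch with the model calls abstracted behind plain Python callables, so the flow (fan out, collect, judge) can be tried without any API keys. The names `run_competition`, `ask_fns` and `judge_fn` are hypothetical; in this lab each callable would wrap one of the `chat.completions.create` calls, and the judge would be an LLM rather than the toy rule used here.

```python
from typing import Callable, Dict, List


def run_competition(
    question: str,
    ask_fns: Dict[str, Callable[[str], str]],
    judge_fn: Callable[[str, List[str]], List[int]],
) -> List[str]:
    """Send one question to every model, then return model names ranked best first."""
    names, answers = [], []
    for name, ask in ask_fns.items():
        try:
            answers.append(ask(question))
            names.append(name)  # only recorded if the call above succeeded
        except Exception as exc:  # e.g. a local model that has not been downloaded
            print(f"Skipping {name}: {exc}")
    ranking = judge_fn(question, answers)  # 1-based positions into `answers`
    return [names[i - 1] for i in ranking]


# Stand-in callables for illustration only; real ones would call an LLM API
def short_model(question):
    return "Yes."


def long_model(question):
    return "Yes, because the evidence supports it."


def broken_model(question):
    raise RuntimeError("model not found")


def longest_first_judge(question, answers):
    """Toy judge: rank answers by length, longest first."""
    order = sorted(range(len(answers)), key=lambda i: len(answers[i]), reverse=True)
    return [i + 1 for i in order]


print(run_competition(
    "Is the sky blue?",
    {"short": short_model, "long": long_model, "broken": broken_model},
    longest_first_judge,
))
```

Because a failing model is skipped before the judge runs, the ranking numbers always line up with the answers that were actually collected.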