## Welcome to the Second Lab - Week 1, Day 3

Today we will work with lots of models! This is a way to get comfortable with APIs.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Important point - please read</h2>
            <span style="color:#ff7800;">The way I collaborate with you may be different to other courses you've taken. I prefer not to type code while you watch. Rather, I execute Jupyter Labs, like this, and give you an intuition for what's going on. My suggestion is that you carefully execute this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations.<br/><br/>If you have time, I'd love it if you submitted a PR with your changes in the community_contributions folder (instructions are in the resources). Also, if you have a GitHub account, use it to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...
            </span>
        </td>
    </tr>
</table>

In [1]:
# Start with imports - ask ChatGPT to explain any package that you don't know

import os
import json
from dotenv import load_dotenv
from openai import OpenAI
from anthropic import Anthropic  # needed by the Claude cell below
from IPython.display import Markdown, display

In [2]:
# Always remember to do this!
load_dotenv(override=True)

True

In [3]:
# Print the key prefixes to help with any debugging

openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key not set (and this is optional)
Google API Key not set (and this is optional)
DeepSeek API Key not set (and this is optional)
Groq API Key not set (and this is optional)
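
The five near-identical `if`/`else` blocks above could also be written as one loop. The sketch below is only an illustration: `describe_key`, `report_keys` and the `KEYS` table are hypothetical names, not part of the lab, and the prefix lengths are simply copied from the cell above.

```python
import os
from typing import List, Mapping, Optional


def describe_key(label: str, value: Optional[str], prefix_len: int, optional: bool = True) -> str:
    """Build the same status line the cell above prints for one key."""
    if value:
        return f"{label} API Key exists and begins {value[:prefix_len]}"
    suffix = " (and this is optional)" if optional else ""
    return f"{label} API Key not set{suffix}"


# (label, environment variable, prefix length to show, optional?)
KEYS = [
    ("OpenAI", "OPENAI_API_KEY", 8, False),
    ("Anthropic", "ANTHROPIC_API_KEY", 7, True),
    ("Google", "GOOGLE_API_KEY", 2, True),
    ("DeepSeek", "DEEPSEEK_API_KEY", 3, True),
    ("Groq", "GROQ_API_KEY", 4, True),
]


def report_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return one status line per key, reading values from env."""
    return [describe_key(label, env.get(var), n, opt) for label, var, n, opt in KEYS]


for line in report_keys():
    print(line)
```

Only a short prefix of each key is ever printed, which is enough to spot a wrong or empty value without leaking the secret.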


In [4]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]

In [5]:
messages

[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]

In [6]:
openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-5-mini",
    messages=messages,
)
question = response.choices[0].message.content
print(question)


You're an interdisciplinary advisory team for a mid-sized high-income city (population ≈ 500,000) currently emitting 6 million tonnes CO2e/year. Your task: design a prioritized, multi-step 10-year plan to reduce the city's territorial greenhouse gas emissions by at least 50% while preserving economic growth and improving social equity. For your plan, provide: (1) five prioritized policy and infrastructure interventions with timelines and responsible actors; (2) estimated annual GHG reductions and implementation costs for each intervention (with uncertainty ranges and key assumptions); (3) the minimum legal, regulatory, and financing changes required; (4) a 12–24 month pilot project for the highest-impact intervention including experimental design, metrics, sample size, and success criteria; (5) three plausible failure modes and concrete mitigations; (6) evaluation metrics and a monitoring plan to attribute emissions changes to interventions; (7) an alternative plan that prioritizes (a)

In [7]:
competitors = []
answers = []
messages = [{"role": "user", "content": question}]

## Note - update since the videos

I've updated the model names to use the latest models below, like GPT 5 and Claude Sonnet 4.5. It's worth noting that these models can be quite slow - like 1-2 minutes - but they do a great job! Feel free to switch them for faster models if you'd prefer, like the ones I use in the video.
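
If you'd like to measure how slow a model is before deciding whether to swap it, a small timing wrapper is one way to do it. This is a minimal sketch: `timed` and `fake_call` are hypothetical helpers, and `fake_call` only stands in for a real API request.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Run fn once and return (its result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def fake_call() -> str:
    """Stand-in for a slow API request."""
    time.sleep(0.05)
    return "answer"


result, seconds = timed(fake_call)
print(f"Got {result!r} in {seconds:.2f}s")
```

In the cells below you could wrap a real call the same way, for example `response, seconds = timed(lambda: openai.chat.completions.create(model=model_name, messages=messages))`.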

In [8]:
# The API we know well
# I've updated this with the latest model, but it can take some time because it likes to think!
# Replace the model with gpt-4.1-mini if you'd prefer not to wait 1-2 mins

model_name = "gpt-5-nano"

response = openai.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Plan for 50% territorial GHG reduction in 10 years (city of ~500k, current 6 MtCO2e/yr)

1) Five prioritized policy and infrastructure interventions (with timelines and responsible actors)
- Intervention A: Deep building energy retrofit and electrification (housing and commercial)
  - Timeline: 2025–2035 (policy design 2025–2026; ramp 2027–2035)
  - Responsible actors: City Environment & Housing Department; Building Standards/Planning; Utilities; Private retrofit/systems contractors; Finance (Green Bank)
- Intervention B: Transportation electrification and transit/urban mobility upgrades
  - Timeline: 2025–2035 (fleet electrification 2025–2032; charging network 2026–2035; urban mobility programs ongoing)
  - Responsible actors: City Transport Authority; Department of Public Works; Utilities; Transit operators; Private charging providers
- Intervention C: Clean electricity procurement and grid modernization
  - Timeline: 2025–2032 (renewable energy PPAs and procurement); 2025–2035 (grid upgrades, storage, market reforms)
  - Responsible actors: City/Regional Grid Operator; Utility; Independent power procurements; Regulators
- Intervention D: District heating networks, heat pumps, and on-site solar
  - Timeline: 2025–2035 (feasibility 2025–2026; pilot 2027–2029; scale 2030–2035)
  - Responsible actors: City Planning/Department of Energy; Utilities; Private developers; Private solar and heat-pump installers
- Intervention E: Waste and organics diversion with methane capture
  - Timeline: 2025–2035 (organics ban and collection 2025–2026; methane capture/AD facilities 2026–2035)
  - Responsible actors: Public Works; Waste management authorities; Private waste haulers; Local/regional regulators

2) Estimated annual GHG reductions and implementation costs for each intervention (uncertainty ranges and key assumptions)
- Intervention A: Building retrofits and electrification
  - 2030–2035 annual GHG reductions: 0.8–2.0 MtCO2e/yr (midpoint ~1.4 Mt/yr); by 2035 cumulative ~2.2–2.6 Mt/yr
  - 2030 baseline reduction by 2030: ~0.9–1.3 Mt/yr
  - Implementation cost (capex, 10-year program): $6–10 billion; annualized financing and delivery costs: ~$0.6–1.2 billion/yr
  - Key assumptions: median home/asset retrofits; heat pump replacements for space/water heating; energy codes tightened; reasonable uptake of incentives
- Intervention B: Transport electrification and transit upgrades
  - 2030–2035 annual GHG reductions: 0.9–2.0 MtCO2e/yr (mid ~1.4 Mt/yr)
  - 2030 baseline reduction by 2030: ~0.4–0.9 Mt/yr
  - Implementation cost: $0.6–1.0 billion capex; $0.05–0.15 billion/yr O&M/subsidies
  - Key assumptions: 400 electric buses; charging network scaling; private EV adoption aided by incentives; fleet turnover
- Intervention C: Clean electricity procurement and grid modernization
  - 2030–2035 annual GHG reductions: 1.5–3.0 MtCO2e/yr (mid ~2.2 Mt/yr)
  - 2030 baseline reductions: ~1.0–1.8 Mt/yr
  - Implementation cost: $2–3.5 billion capex (grid, storage, PPAs); $0.1–0.3 billion/yr O&M
  - Key assumptions: high-renewable procurement, grid flexibility, storage to handle intermittency
- Intervention D: District heating/heat pumps and on-site solar
  - 2030–2035 annual GHG reductions: 0.7–1.4 MtCO2e/yr
  - 2030 baseline reductions: ~0.3–0.8 Mt/yr
  - Implementation cost: $2–4 billion capex; $0.05–0.20 billion/yr O&M
  - Key assumptions: viable district heat networks; heat pumps for buildings; rooftop solar co-located with network
- Intervention E: Waste management and organics diversion
  - 2030–2035 annual GHG reductions: 0.3–0.8 MtCO2e/yr
  - 2030 baseline reductions: ~0.1–0.3 Mt/yr
  - Implementation cost: $0.5–1.0 billion capex; $0.02–0.08 billion/yr O&M
  - Key assumptions: organics ban/enhanced collection; anaerobic digestion and landfill gas capture

3) Minimum legal, regulatory, and financing changes required
- Legal/regulatory
  - Tighten building energy codes; require energy performance disclosure; mandate minimum retrofit triggers for major renovations
  - Enable district heating with streamlined permitting; zoning adjustments to support heat networks and microgrids
  - Implement mandatory organics recycling and landfill gas capture; methane-control standards for waste facilities
  - Define city-level procurement rules prioritizing low-carbon products; set interim targets for city operations
- Financing
  - Establish or expand municipal green financing instrument (green bonds/green bank) with credit enhancement; create a city climate fund with grant/loan windows
  - Create PPAs, performance-based incentives, and RE procurement frameworks; enableco-financing with private developers
  - Introduce local carbon pricing/revenue recycling or earmark a Climate Action Fund to de-risk low-income programs
- Institutional
  - Align departments on a single emissions dashboard; require joint program planning across housing, transport, energy, and waste
  - Institute inclusive governance to ensure equity (low-income households benefit from subsidies and job programs)

4) 12–24 month pilot project for the highest-impact intervention (experimental design, metrics, sample size, success criteria)
- Highest-impact intervention: Intervention A (building retrofits and electrification)
- Pilot scope: 3 districts totaling ~3,000 residential/commercial units; mixed income areas to test equity outcomes
- Experimental design: stepped-wedge / cluster-randomized rollout
  - Baseline period: 6 months of pre-retrofit energy and emissions data
  - Randomized rollout: 3 districts assigned to start retrofit in rolling 4-month steps over 12–18 months; remaining districts serve as control until they begin
- Metrics
  - Primary: percent reduction in building energy use intensity (EUI) and corresponding GHG reductions in pilot districts
  - Secondary: energy cost savings for occupants, indoor comfort surveys, tenant displacement/relocation rate, retrofit completion rate, job creation
- Sample size and power
  - 3 districts, ~1,000–1,200 units each (roughly 3,000 total units); aim for 80% power to detect 10–15% EUI reductions
- Success criteria
  - Achieve ≥25–30% average EUI reduction per retrofit unit within 12–18 months
  - Net present value of energy savings surpasses incremental retrofit costs within 6–10 years for pilot area
  - Equity objective met: at least 40% of households in pilot are low-income or socially vulnerable, with targeted subsidies delivered
  - Inform decisions for scaling district-wide program

5) Three plausible failure modes and concrete mitigations
- Failure mode 1: Up-front costs and financing barriers slow adoption; mitigation: develop blended finance (grants, low-interest loans, performance-based incentives) and a local green bond/CLF with grants for low-income occupants
- Failure mode 2: Supply chain and workforce bottlenecks (labor, equipment) delay retrofits; mitigation: pre-qualify contractors, build local workforce programs, deploy modular retrofit kits, stock essential heat-pump equipment, establish long-term procurement pre-bid
- Failure mode 3: Rebound effects and tenant discomfort/acceptability reduce uptake; mitigation: occupant engagement programs, heat comfort standards, real-time energy feedback dashboards, subsidies for tenants, robust dispute-resolution processes

6) Evaluation metrics and a monitoring plan to attribute emissions changes to interventions
- Core metrics
  - Annual citywide GHG inventory by sector (buildings, transport, electricity, industry, waste)
  - Sector-specific energy use intensity (EUI) and total energy consumption (electricity, gas, district heat)
  - Number and share of buildings retrofitted; installed heat pumps; installed capacity (MW) of solar and storage
  - Fuel-switching rates (gas to electric heat), EV share, bus fleet electrification status
  - Equity indicators: energy affordability, energy burden among low-income households, job creation in disadvantaged communities
- Data sources
  - Utility energy billing data; building permits and retrofit certificates; metered energy use; quarterly emissions reports; independent third-party verification
- Attribution plan
  - Use a city-wide emissions inventory with attribution models (before/after, control/comparison districts, discounting for external factors)
  - Deploy a local climate data platform integrating energy data, mobility data, and waste data; run annual calibrations comparing observed vs. modelled reductions
- Monitoring cadence
  - Quarterly dashboards for internal review; biannual public reporting; external audit every 2–3 years
- Evaluation methods
  - Difference-in-differences where feasible; scenario analysis with counterfactuals; sensitivity analyses for key assumptions (fuel prices, tech costs)

7) An alternative plan (trade-offs among different priorities)
- (a) Maximum speed of decarbonization
  - Approach: accelerate all five interventions with aggressive targets, streamline permitting, lock-in PPAs, mobilize large public capital quickly
  - Trade-offs: high upfront public expenditure; higher financing risk; potential equity concerns if benefits are uneven or long-term returns accrue to investors
- (b) Minimum public expenditure
  - Approach: leverage private finance, public–private partnerships, performance-based contracts, and market-driven decarbonization; limit new public capital outlays
  - Trade-offs: slower pace, greater reliance on market readiness; potential gaps in equity and resilience unless carefully designed
- (c) Maximum distributive equity
  - Approach: prioritize subsidies, job guarantees, workforce training in disadvantaged areas; targeted retrofit subsidies; ensure utility and ratepayer protections
  - Trade-offs: potential higher per-unit costs and longer timelines for full decarbonization; may require higher ongoing subsidies or reallocation of funds
- Brief synthesis: the fastest plan risks cost and equity issues if not carefully financed; the cheapest plan may slow decarbonization; the equity-first plan improves social outcomes but may raise costs or slow roll-out unless targeted funding is secured

8) Executive summary (≤200 words)
A 10-year, multi-sector plan targets at least a 50% reduction in city CO2e while preserving growth and advancing equity. Five prioritized interventions secure near- and long-term emissions cuts: (A) deep retrofit and electrification of building stock; (B) electrification of transport and transit upgrades; (C) clean electricity procurement with grid modernization; (D) district heating/heating electrification and on-site solar; (E) improved waste management with methane capture and organics diversion. Collectively, these interventions are designed to achieve 3–3.5 MtCO2e/year reductions by 2035 (roughly 7–8 MtCO2e/year in 2035, relative to a 6 Mt baseline), while enabling sustained economic growth and inclusive benefits. A 12–18 month pilot in a priority district evaluates the most impactful approach (building retrofits) with a robust experimental design. The plan includes legal/regulatory changes, financing tools (green bonds, PPAs), risk mitigations, and a transparent monitoring framework to attribute emission changes to interventions. An alternative plan outlines trade-offs among speed, cost, and equity. This is designed as a living blueprint with quarterly reviews and external verification.

9) Appendix (data sources, major assumptions, confidence; self-assessed 0–100%)
- Data sources (illustrative)
  - Local energy balance and emissions inventory; utility billing data; building stock data; transport fleet data; waste management records; regulatory impact assessments; capital costs from credible market benchmarks; grid operator planning documents
- Major assumptions
  - Baseline 6 MtCO2e/yr; sector shares approximated (buildings ~40%, transport ~35%, electricity ~15%, waste ~5%); technology costs (retrofits, heat pumps, EVs) align with mid-point of 2024–2025 market ranges; high-TA uptake of incentives; policy adoption rates
  - Renewable procurement and grid modernization achieve annualized capacity additions consistent with 80–100% city operations by 2030–2032
  - Economic growth remains robust; no large external shocks
- Confidence and caveats
  - Confidence: 62–78% overall for achieving ≥50% reduction with integrated plan; high uncertainty around funding scales, labor supply, technology costs, and policy uptake
  - Key risks: financing risk, supply chain constraints, equity implementation gaps, and population behavioral responses
  - Mitigations: staged implementation, equity-first design, pre-awarded procurement, and independent evaluation

Supporting details (expanded rationale, calculations, and references)

Intervention A: Buildings
- Rationale: Buildings are the largest sector in many high-income cities; deep retrofits plus heat pumps cut both energy use and emissions. Energy savings compound as population and stock grows.
- Detailed cost/benefit assumptions
  - Stock: ~230,000 units (residential and commercial) targeted over 10 years
  - Retrofit cost: average $25–$40k/unit (midpoint $32k); total capex $7.4–$9.3B
  - Financing: assume 60–70% debt, 30–40% city grants/tax credits; annual debt service ~0.6–1.0B
  - Emission reductions: building sector emissions ~2.4 MtCO2e/yr baseline; deep retrofits likely reduce 1.6–2.4 MtCO2e/yr by 2035
  - Co-benefits: energy bill savings for households and businesses; improved comfort; local jobs
- Pilot alignment: the pilot (3k units) informs scalable retrofit methods, performance-based incentives, and workforce training.

Intervention B: Transport
- Rationale: Electrified buses and extensive charging reduce high-emission transport; bus fleets are early, tangible decarbonization gains; transit improvements reduce car dependency.
- Detailed cost/benefit assumptions
  - Bus fleet: ~400 electric buses at $700k–$900k each; capex $0.28–0.36B
  - Charging infra and grid upgrades: $0.15–0.25B
  - Private EV uptake incentives: $0.05–0.10B
  - Emissions: 0.9–2.0 MtCO2e/yr by 2035
  - Co-benefits: reduced air pollution, improved mobility options, better equity in access to transit
- Risks: higher electricity load; mitigations include demand management, fast-response storage, smart charging.

Intervention C: Clean electricity and grid
- Rationale: Decarbonizing electricity supply is a multiplier effect for all energy-using sectors; grid modernization enables higher renewables, storage, and resilience.
- Detailed cost/benefit assumptions
  - Renewables and PPAs: 2–3.5B capex; storage 0.3–0.7B
  - Grid upgrades and market reforms: 0.5–1.0B
  - Emissions: 1.5–3.0 MtCO2e/yr reductions by 2035
  - Co-benefits: lower exposure to fuel price volatility; grid reliability
- Risks: procurement/availability of renewables; mitigations include diversified PPAs and local partnerships.

Intervention D: District heating and solar
- Rationale: Efficient district heat improves hot water and space heating efficiency; solar PV complements heat networks for daytime demand
- Detailed cost/benefit assumptions
  - District heating network build-out: 2–4B capex; solar PV on-site 0.5–1.0B
  - Emissions: 0.7–1.4 MtCO2e/yr by 2035
  - Co-benefits: improved heating comfort; local job creation
- Risks: regulatory hurdles; mitigations: fast-track permitting, upfront planning, community engagement.

Intervention E: Waste and organics
- Rationale: Organic waste diverted to AD and composting reduces methane; methane capture from landfills is a near-term opportunity
- Detailed cost/benefit assumptions
  - AD facilities, organics collection: 0.5–1.0B capex
  - Methane capture upgrades: 0.2–0.4B
  - Emissions: 0.3–0.8 MtCO2e/yr by 2035
  - Co-benefits: soil health, reduced odors, job creation
- Risks: public participation rates; mitigations: robust outreach and convenient collection programs.

End of plan. If you’d like, I can tailor the numbers to a specific city’s actual data (e.g., current energy mix, housing stock, fleet composition) and provide a spreadsheet-ready model with sensitivity analyses.

In [None]:
# Anthropic has a slightly different API, and the max_tokens argument is required

model_name = "claude-sonnet-4-5"

claude = Anthropic()
response = claude.messages.create(model=model_name, messages=messages, max_tokens=1000)
answer = response.content[0].text

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
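
One more difference worth knowing: Anthropic's Messages API takes the system prompt as a top-level `system` argument rather than as a message with role `"system"`. This lab only sends a single user message, so nothing needs converting here. If you add a system prompt in your own variations, a helper like the hypothetical `split_system` below is one way to reuse an OpenAI-style message list.

```python
from typing import Dict, List, Tuple

Message = Dict[str, str]


def split_system(messages: List[Message]) -> Tuple[str, List[Message]]:
    """Separate system messages from the rest of an OpenAI-style message list."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return "\n\n".join(system_parts), rest


# Made-up conversation for illustration only
chat = [
    {"role": "system", "content": "Answer concisely."},
    {"role": "user", "content": "What is an API?"},
]

system_prompt, anthropic_messages = split_system(chat)
print(system_prompt)
print(anthropic_messages)
```

You would then call `claude.messages.create(model=model_name, system=system_prompt, messages=anthropic_messages, max_tokens=1000)`, passing `system` only when it is non-empty.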

In [None]:
gemini = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
model_name = "gemini-2.5-flash"

response = gemini.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
deepseek = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
model_name = "deepseek-chat"

response = deepseek.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
# Updated with the latest Open Source model from OpenAI

groq = OpenAI(api_key=groq_api_key, base_url="https://api.groq.com/openai/v1")
model_name = "openai/gpt-oss-120b"

response = groq.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
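
The last three cells follow the same pattern: the regular `OpenAI` client pointed at a different `base_url` with a different key. Here is a sketch of that pattern as a small table, using the URLs and model names from the cells above. The names `Provider`, `PROVIDERS`, `configured` and `ask` are hypothetical, and `ask` is not called here because it needs a real key.

```python
import os
from dataclasses import dataclass
from typing import Dict, List, Mapping


@dataclass(frozen=True)
class Provider:
    name: str
    base_url: str
    key_var: str  # environment variable holding the API key
    model: str


PROVIDERS = [
    Provider("gemini", "https://generativelanguage.googleapis.com/v1beta/openai/", "GOOGLE_API_KEY", "gemini-2.5-flash"),
    Provider("deepseek", "https://api.deepseek.com/v1", "DEEPSEEK_API_KEY", "deepseek-chat"),
    Provider("groq", "https://api.groq.com/openai/v1", "GROQ_API_KEY", "openai/gpt-oss-120b"),
]


def configured(env: Mapping[str, str] = os.environ) -> List[Provider]:
    """Return only the providers whose API key is set in env."""
    return [p for p in PROVIDERS if env.get(p.key_var)]


def ask(provider: Provider, messages: List[Dict[str, str]], env: Mapping[str, str] = os.environ) -> str:
    """Send messages to one provider through its OpenAI-compatible endpoint."""
    from openai import OpenAI  # imported lazily so the table works without the package

    client = OpenAI(api_key=env[provider.key_var], base_url=provider.base_url)
    response = client.chat.completions.create(model=provider.model, messages=messages)
    return response.choices[0].message.content


print([p.name for p in configured()])
```

Looping over `configured()` and calling `ask(p, messages)` would skip any provider you have no key for, instead of failing partway through the notebook.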


## For the next cell, we will use Ollama

Ollama runs a local web service that gives an OpenAI-compatible endpoint,  
and runs models locally using high-performance C++ code.

If you don't have Ollama, install it by visiting https://ollama.com, pressing Download, and following the instructions.

After it's installed, you should be able to visit http://localhost:11434 and see the message "Ollama is running".

You might need to restart Cursor (and maybe reboot). Then open a Terminal (control+\`) and run `ollama serve`

Useful Ollama commands (run these in the terminal, or with an exclamation mark in this notebook):

`ollama pull <model_name>` downloads a model locally  
`ollama ls` lists all the models you've downloaded  
`ollama rm <model_name>` deletes the specified model from your downloads
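
You can also do the "Ollama is running" check from Python rather than the browser. This is a minimal sketch using only the standard library; `ollama_is_running` is a hypothetical helper name, and it assumes the default address mentioned above.

```python
import http.client
import urllib.request


def ollama_is_running(url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if the URL responds with the 'Ollama is running' message."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a malformed reply
        return False
    return "Ollama is running" in body


print(ollama_is_running())
```

If this prints `False`, try running `ollama serve` in a terminal and check again.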

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Super important - ignore me at your peril!</h2>
            <span style="color:#ff7800;">The model called <b>llama3.3</b> is FAR too large for home computers - it's not intended for personal computing and will consume all your resources! Stick with the nicely sized <b>llama3.2</b> or <b>llama3.2:1b</b> and if you want larger, try llama3.1 or smaller variants of Qwen, Gemma, Phi or DeepSeek. See the <a href="https://ollama.com/models">Ollama models page</a> for a full list of models and sizes.
            </span>
        </td>
    </tr>
</table>

In [None]:
!ollama pull llama3.2

In [9]:
ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
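# This run used llama3:latest; set this to whichever model you have pulled (for example "llama3.2" from the cell above)
model_name = "llama3:latest"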

response = ollama.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

**10-Year Plan to Reduce CO2e Emissions by 50%**

**Policy and Infrastructure Interventions:**

1. **Electric Vehicle (EV) Incentives & Charging Infrastructure**
	* Timeline: Years 1-3
	* Responsible Actor: City Government, Public Private Partnerships
	* Estimated Annual GHG Reduction: 300,000 tCO2e (~5% of total emissions)
	* Implementation Costs:
		+ Uncertainty Range: $10M - $20M (avg. $15M)
		+ Key Assumption: EV adoption rate of 20%
2. **Building Energy Efficiency Retrofits**
	* Timeline: Years 1-4
	* Responsible Actor: City Government, Property Owners
	* Estimated Annual GHG Reduction: 400,000 tCO2e (~7% of total emissions)
	* Implementation Costs:
		+ Uncertainty Range: $5M - $10M (avg. $7.5M)
		+ Key Assumption: Average energy efficiency gains of 30%
3. **Public Transportation Overhaul and Bike Infrastructure**
	* Timeline: Years 2-6
	* Responsible Actor: City Government, Public Transit Agency
	* Estimated Annual GHG Reduction: 1,200,000 tCO2e (~20% of total emissions)
	* Implementation Costs:
		+ Uncertainty Range: $50M - $100M (avg. $75M)
		+ Key Assumption: Average reduction in emissions per passenger mile of 0.5 g CO2e/mile
4. **Waste-to-Energy Facility and Organics Recycling**
	* Timeline: Years 3-7
	* Responsible Actor: City Government, Public Private Partnerships
	* Estimated Annual GHG Reduction: 150,000 tCO2e (~2.5% of total emissions)
	* Implementation Costs:
		+ Uncertainty Range: $10M - $25M (avg. $17.5M)
		+ Key Assumption: Average organic waste diversion rate of 50%
5. **Urban Forestry and Green Spaces Expansion**
	* Timeline: Years 1-5
	* Responsible Actor: City Government, Community Groups
	* Estimated Annual GHG Reduction: 100,000 tCO2e (~1.7% of total emissions)
	* Implementation Costs:
		+ Uncertainty Range: $1M - $3M (avg. $2M)
		+ Key Assumption: Average sequestration rate of 10 tons CO2e/hectare/year

**Minimum Legal, Regulatory, and Financing Changes Required:**

* Update building codes to promote energy efficiency
* Implement congestion pricing for public transportation
* Establish a renewable portfolio standard (RPS) for city-owned buildings
* Allocate dedicated funding for infrastructure upgrades and emissions reduction initiatives
* Update zoning regulations to prioritize mixed-use development and walkability

**Pilot Project: EV Incentives & Charging Infrastructure**

* Experimental Design:
	+ Randomly select 10 neighborhoods across the city
	+ Offer tiered incentives (price discounts, priority access) for residents buying EVs
* Metrics:
	+ EV adoption rate
	+ Charging infrastructure usage
	+ Carbon reductions
* Sample Size: 50 participants (25 per treatment group)
* Success Criteria: at least a 10% increase in EV adoption and a 20% increase in charging infrastructure usage

**Failure Modes and Mitigations:**

1. **Resistance to Emissions Reduction Strategies**
Mitigation: Engage stakeholders through public outreach, education, and community involvement programs
2. **Insufficient Funding Allocation**
Mitigation: Prioritize initiatives with clear cost-benefit analyses and allocate funds accordingly
3. **Inadequate Technology Adoption Rate**
Mitigation: Implement incremental incentives for early adopters to drive widespread adoption

**Evaluation Metrics and Monitoring Plan:**

* Monitor emissions reductions using satellite imaging, energy consumption data, and transportation mode share statistics
* Track progress against benchmarks and adjust interventions as needed

**Alternative Plan Prioritizing Max Speed of Decarbonization, Min Public Expenditure, and Max Distributive Equity:**

* Emphasize EV incentives and public transit expansions
* Implement congestion pricing nationwide
* Focus on energy efficiency in commercial properties through tenant-incentivized retrofits
* Allocate a smaller percentage of funds for urban forestry initiatives

**Executive Summary (≤200 words):**
This 10-year plan aims to reduce the city's CO2e emissions by at least 50% while promoting economic growth and social equity. Five strategic interventions will drive emissions reductions: EV incentives, building energy efficiency retrofits, public transportation overhaul, waste-to-energy facilities, and expanded green spaces. Key responsible actors include city government, property owners, and community groups. By prioritizing these initiatives, we can achieve significant emissions reductions while preserving economic growth (estimated 2.5% annual GDP growth) and improving social equity.

**Appendix:**
Data Sources:

* US Environmental Protection Agency (EPA)
* National Renewable Energy Laboratory (NREL)
* University of California, Berkeley's Transportation Research Center
Major Assumptions:
* EV adoption rate
* Average energy efficiency gains in building retrofits
* Public transportation mode share
Self-Assessed Confidence Level: 80% (based on conservative estimates and data from comparable cities)

In [10]:
# So where are we?

print(competitors)
print(answers)


['gpt-5-nano', 'llama3:latest']
['Plan for 50% territorial GHG reduction in 10 years (city of ~500k, current 6 MtCO2e/yr)\n\n1) Five prioritized policy and infrastructure interventions (with timelines and responsible actors)\n- Intervention A: Deep building energy retrofit and electrification (housing and commercial)\n  - Timeline: 2025–2035 (policy design 2025–2026; ramp 2027–2035)\n  - Responsible actors: City Environment & Housing Department; Building Standards/Planning; Utilities; Private retrofit/systems contractors; Finance (Green Bank)\n- Intervention B: Transportation electrification and transit/urban mobility upgrades\n  - Timeline: 2025–2035 (fleet electrification 2025–2032; charging network 2026–2035; urban mobility programs ongoing)\n  - Responsible actors: City Transport Authority; Department of Public Works; Utilities; Transit operators; Private charging providers\n- Intervention C: Clean electricity procurement and grid modernization\n  - Timeline: 2025–2032 (renewable e

In [11]:
# It's nice to know how to use "zip"
for competitor, answer in zip(competitors, answers):
    print(f"Competitor: {competitor}\n\n{answer}")


Competitor: gpt-5-nano

Plan for 50% territorial GHG reduction in 10 years (city of ~500k, current 6 MtCO2e/yr)

1) Five prioritized policy and infrastructure interventions (with timelines and responsible actors)
- Intervention A: Deep building energy retrofit and electrification (housing and commercial)
  - Timeline: 2025–2035 (policy design 2025–2026; ramp 2027–2035)
  - Responsible actors: City Environment & Housing Department; Building Standards/Planning; Utilities; Private retrofit/systems contractors; Finance (Green Bank)
- Intervention B: Transportation electrification and transit/urban mobility upgrades
  - Timeline: 2025–2035 (fleet electrification 2025–2032; charging network 2026–2035; urban mobility programs ongoing)
  - Responsible actors: City Transport Authority; Department of Public Works; Utilities; Transit operators; Private charging providers
- Intervention C: Clean electricity procurement and grid modernization
  - Timeline: 2025–2032 (renewable energy PPAs and procu
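
One thing to know about `zip`: it stops silently at the end of the shorter list, so if `competitors` and `answers` ever got out of step you would lose entries without any error. A sketch of a stricter pairing follows; `pair_up` is a hypothetical helper and the names and answers are made up.

```python
from typing import List, Tuple


def pair_up(names: List[str], texts: List[str]) -> List[Tuple[str, str]]:
    """Pair each name with its text, refusing lists of different lengths."""
    if len(names) != len(texts):
        raise ValueError(f"{len(names)} names but {len(texts)} texts")
    return list(zip(names, texts))


# Made-up data for illustration only
pairs = pair_up(["model-a", "model-b"], ["first answer", "second answer"])
for name, text in pairs:
    print(f"Competitor: {name}\n\n{text}")
```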

In [12]:
# Let's bring this together - note the use of "enumerate"

together = ""
for index, answer in enumerate(answers):
    together += f"# Response from competitor {index+1}\n\n"
    together += answer + "\n\n"
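
The loop above is easy to follow, and that is the point of it. As an alternative sketch, the same string can be built with `enumerate(..., start=1)` and `str.join`, which avoids the `index+1` arithmetic. The `combine_answers` function name is my own, not part of the lab.

```python
def combine_answers(answers):
    """Build the same text as the loop: a numbered heading, then each answer."""
    # start=1 makes the competitor numbers 1-based without writing index+1.
    return "".join(
        f"# Response from competitor {number}\n\n{answer}\n\n"
        for number, answer in enumerate(answers, start=1)
    )


print(combine_answers(["First answer", "Second answer"]))
```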

In [13]:
print(together)

# Response from competitor 1

Plan for 50% territorial GHG reduction in 10 years (city of ~500k, current 6 MtCO2e/yr)

1) Five prioritized policy and infrastructure interventions (with timelines and responsible actors)
- Intervention A: Deep building energy retrofit and electrification (housing and commercial)
  - Timeline: 2025–2035 (policy design 2025–2026; ramp 2027–2035)
  - Responsible actors: City Environment & Housing Department; Building Standards/Planning; Utilities; Private retrofit/systems contractors; Finance (Green Bank)
- Intervention B: Transportation electrification and transit/urban mobility upgrades
  - Timeline: 2025–2035 (fleet electrification 2025–2032; charging network 2026–2035; urban mobility programs ongoing)
  - Responsible actors: City Transport Authority; Department of Public Works; Utilities; Transit operators; Private charging providers
- Intervention C: Clean electricity procurement and grid modernization
  - Timeline: 2025–2032 (renewable energy PPAs and

In [14]:
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""


In [15]:
print(judge)

You are judging a competition between 2 competitors.
Each model has been given this question:

You're an interdisciplinary advisory team for a mid-sized high-income city (population ≈ 500,000) currently emitting 6 million tonnes CO2e/year. Your task: design a prioritized, multi-step 10-year plan to reduce the city's territorial greenhouse gas emissions by at least 50% while preserving economic growth and improving social equity. For your plan, provide: (1) five prioritized policy and infrastructure interventions with timelines and responsible actors; (2) estimated annual GHG reductions and implementation costs for each intervention (with uncertainty ranges and key assumptions); (3) the minimum legal, regulatory, and financing changes required; (4) a 12–24 month pilot project for the highest-impact intervention including experimental design, metrics, sample size, and success criteria; (5) three plausible failure modes and concrete mitigations; (6) evaluation metrics and a monitoring pla

In [16]:
judge_messages = [{"role": "user", "content": judge}]

In [17]:
# Judgement time!

openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-5-mini",
    messages=judge_messages,
)
results = response.choices[0].message.content
print(results)


{"results": ["1", "2"]}


In [19]:
competitors

['gpt-5-nano', 'llama3:latest']

In [None]:
# OK let's turn this into results!

results_dict = json.loads(results)
ranks = results_dict["results"]  # competitor numbers as strings, best first
print(ranks)
# To see how the index mapping works, temporarily override the ranking:
# ranks = ['2', '1']
for index, result in enumerate(ranks):
    competitor = competitors[int(result)-1]  # competitor numbers are 1-based
    print(f"Rank {index+1}: {competitor}")

['1', '2']
Rank 1: gpt-5-nano
Rank 2: llama3:latest
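
The parsing step above trusts the judge to return exactly the JSON we asked for. Models sometimes wrap the JSON in a markdown code fence, or repeat or skip a competitor number, so a small validation step can help. This is a minimal sketch under those assumptions; `parse_ranking` is a hypothetical helper, not something the lab or the OpenAI library provides.

```python
import json


def parse_ranking(raw, competitor_count):
    """Return the judge's ranking as a list of 1-based ints, or raise ValueError."""
    fence = "`" * 3  # a markdown code fence marker
    text = raw.strip()
    # Peel off a markdown fence if the judge added one despite the instructions.
    if text.startswith(fence):
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]
    ranks = [int(item) for item in json.loads(text)["results"]]
    # Every competitor number from 1..N must appear exactly once.
    if sorted(ranks) != list(range(1, competitor_count + 1)):
        raise ValueError(
            f"Expected each of 1..{competitor_count} exactly once, got {ranks}"
        )
    return ranks


print(parse_ranking('{"results": ["1", "2"]}', 2))
```

The returned integers can be used directly as `competitors[rank - 1]`, just like in the loop above.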


<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/exercise.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Exercise</h2>
            <span style="color:#ff7800;">Which agentic design pattern(s) did this lab use? Try updating it to add another agentic design pattern.
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">Patterns like this, where you send the same task to several models and have another model evaluate the results,
            are common when you need to improve the quality of your LLM responses. The approach can be applied
            to many business projects where accuracy is critical, at the cost of extra API calls.
            </span>
        </td>
    </tr>
</table>
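
To make the shape of this workflow concrete without any API keys, here is a minimal offline sketch. Everything in it is an assumption for illustration: `run_competition`, `longest_answer_judge` and the stub models are hypothetical stand-ins, and a real judge would be an LLM call like the one earlier in this lab rather than a length comparison.

```python
def run_competition(question, models, judge):
    """Ask every model the question, then return model names ordered by the judge."""
    names = list(models)
    answers = [models[name](question) for name in names]
    ranking = judge(question, answers)  # 1-based competitor numbers, best first
    return [names[number - 1] for number in ranking]


def longest_answer_judge(question, answers):
    """Toy judge: ranks longer answers higher. A stand-in for an LLM judge."""
    order = sorted(range(len(answers)), key=lambda i: len(answers[i]), reverse=True)
    return [i + 1 for i in order]


# Stub "models": plain functions from a question to an answer string.
stub_models = {
    "stub-terse": lambda q: "Yes.",
    "stub-detailed": lambda q: f"Here is a fuller answer to: {q}",
}

print(run_competition("Is caching worth it?", stub_models, longest_answer_judge))
```

Keeping the models and the judge as plain callables means you can swap in real API calls later without changing the orchestration code.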