## Welcome to the Second Lab - Week 1, Day 3

Today we will work with lots of models! This is a way to get comfortable with APIs.


<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Important point - please read</h2>
            <span style="color:#ff7800;">The way I collaborate with you may be different to other courses you've taken. I prefer not to type code while you watch. Rather, I execute Jupyter Labs, like this, and give you an intuition for what's going on. My suggestion is that you carefully execute this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations.<br/><br/>If you have time, I'd love it if you submit a PR for changes in the community_contributions folder - instructions in the resources. Also, if you have a GitHub account, use this to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...
            </span>
        </td>
    </tr>
</table>


In [59]:
# Start with imports - ask ChatGPT to explain any package that you don't know

import os
import json
import httpx
from dotenv import load_dotenv
from openai import OpenAI, AzureOpenAI
from anthropic import Anthropic
from IPython.display import Markdown, display

In [60]:
# Always remember to do this!
load_dotenv(override=True)

True

In [61]:
# Print the key prefixes to help with any debugging

openai_api_key = os.getenv("OPENAI_API_KEY")
anthropic_api_key = os.getenv("ANTHROPIC_API_KEY")
google_api_key = os.getenv("GOOGLE_API_KEY")
deepseek_api_key = os.getenv("DEEPSEEK_API_KEY")
groq_api_key = os.getenv("GROQ_API_KEY")
azure_api_key = os.getenv("AZURE_API_KEY")
azure_endpoint = os.getenv("AZURE_ENDPOINT")

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")

if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

if azure_api_key:
    print(f"Azure API Key exists and begins {azure_api_key[:4]}")
else:
    print("Azure API Key not set (and this is optional)")

if azure_endpoint:
    print(f"Azure Endpoint exists and begins {azure_endpoint[:4]}")
else:
    print("Azure Endpoint not set (and this is optional)")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key exists and begins sk-ant-
Google API Key exists and begins AI
DeepSeek API Key exists and begins sk-
Groq API Key exists and begins gsk_
Azure API Key exists and begins 9G5s
Azure Endpoint exists and begins http
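
The repeated if/else blocks above all follow one pattern, so they can be collapsed into a loop. Here is a minimal sketch, assuming the same environment variable names and prefix lengths as the cell above; the helper name `describe_key` and the `KEYS` table are hypothetical and not part of the lab.

```python
import os

# (label, environment variable, prefix characters to show, optional?)
# Names and prefix lengths mirror the cell above; describe_key is a hypothetical helper.
KEYS = [
    ("OpenAI API Key", "OPENAI_API_KEY", 8, False),
    ("Anthropic API Key", "ANTHROPIC_API_KEY", 7, True),
    ("Google API Key", "GOOGLE_API_KEY", 2, True),
    ("DeepSeek API Key", "DEEPSEEK_API_KEY", 3, True),
    ("Groq API Key", "GROQ_API_KEY", 4, True),
    ("Azure API Key", "AZURE_API_KEY", 4, True),
    ("Azure Endpoint", "AZURE_ENDPOINT", 4, True),
]


def describe_key(label, value, prefix_len, optional):
    """Return a one-line status message that shows only a short prefix of the value."""
    if value:
        return f"{label} exists and begins {value[:prefix_len]}"
    suffix = " (and this is optional)" if optional else ""
    return f"{label} not set{suffix}"


for label, env_var, prefix_len, optional in KEYS:
    print(describe_key(label, os.getenv(env_var), prefix_len, optional))
```

Adding another provider then only needs one more row in `KEYS`.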


In [47]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]

In [48]:
messages

[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]

In [26]:
openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-5-mini",
    messages=messages,
)
question = response.choices[0].message.content
print(question)


You are advising the governing council of an isolated planetary colony of 1,000 people recovering from a partial ecological collapse; resources are limited and the next 50 years are critical. The colony must choose one of three mutually exclusive long-term allocation strategies: (A) maximize short-term survival and stability for the largest possible proportion of the population (target ≥95% survival over 5 years), (B) prioritize preservation of cultural, linguistic, and technical knowledge of minority groups even if this reduces short-term aggregate survival to approximately 80–85%, or (C) prioritize genetic and ecological diversity to maximize long-term species and biosphere adaptability, accepting an estimated 10–20% increase in short-term mortality risk but potential long-term resilience gains. Current observable data: consumable food/water/energy stocks at current usage will last ~3 years; local agricultural productivity is ~60% of Earth baseline with potential for improvement thro

In [27]:
competitors = []
answers = []
messages = [{"role": "user", "content": question}]
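
The two lists above get filled one model at a time: call the API, pull out the answer text, then append the model name and the answer. Here is a minimal sketch of that pattern as a reusable function, assuming the client exposes the OpenAI-style `chat.completions.create` call used in this lab. The name `ask_and_record` and the stub client are hypothetical; the stub only exists so the helper can be tried without spending any API credit.

```python
from types import SimpleNamespace


def ask_and_record(client, model_name, messages, competitors, answers):
    """Ask one model, then record its name and answer in the two parallel lists."""
    response = client.chat.completions.create(model=model_name, messages=messages)
    answer = response.choices[0].message.content
    competitors.append(model_name)
    answers.append(answer)
    return answer


# Offline stand-in that mimics the shape of an OpenAI-style response (hypothetical)
def _fake_create(model, messages):
    reply = SimpleNamespace(content=f"[{model}] stub answer")
    return SimpleNamespace(choices=[SimpleNamespace(message=reply)])


fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=_fake_create))
)

demo_competitors, demo_answers = [], []
ask_and_record(
    fake_client,
    "demo-model",
    [{"role": "user", "content": "Hello"}],
    demo_competitors,
    demo_answers,
)
print(demo_competitors, demo_answers)
```

With a real client, you would pass `openai` (or any other OpenAI-compatible client) in place of `fake_client`.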

## Note - update since the videos

I've updated the model names to use the latest models below, like GPT-5 and Claude Sonnet 4.5. It's worth noting that these models can be quite slow, taking 1-2 minutes to respond, but they do a great job! Feel free to switch them for faster models if you'd prefer, like the ones I use in the video.


In [28]:
# The API we know well
# I've updated this with the latest model, but it can take some time because it likes to think!
# Replace the model with gpt-4.1-mini if you'd prefer not to wait 1-2 mins

model_name = "gpt-5-nano"

response = openai.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Below is a complete decision package designed for a sophisticated policymaker managing a 1,000-person isolated colony facing a 50-year critical horizon after ecological collapse. The package stays explicit about uncertainties, data needs, models, policies, and contingencies. It is structured to be actionable, auditable, and adaptable.

Executive summary (for quick use)
- Resources: current consumables last ~3 years at present usage; local agriculture ~60% Earth baseline; social trust index ~0.6; two rival leadership factions control different pools; external aid equals 1 year of current consumption but worsens factional tensions.
- Three mutually exclusive long-term allocation strategies:
  A) Maximize short-term survival for the largest share (target ≥95% 5-year survival).
  B) Preserve minority cultural/linguistic/technical knowledge (likely ~80–85% short-term survival; higher risk to aggregate mid-term survival).
  C) Maximize genetic/ecological diversity for long-term adaptability (accept ~10–20% higher short-term mortality risk; expected higher long-term resilience).
- Recommendation: Strategy C (diversity-focused) with robust mitigations, but only if you implement dedicated governance, rapid data collection, and adaptive safeguards. Rationale: higher probability of long-term survivability and recoverability under continued ecological stress; risk managed through concrete actions and transparent governance.
- If you require a single, immediately implementable plan: adopt Strategy A only as an emergency fallback if 12–18 month data indicate imminent systemic collapse risk (but with explicit enablers to recover knowledge and diversity later).

1) Transparent, justifiable multi-criterion decision framework (MCDA)
Purpose: a defensible basis to choose among A, B, or C, balancing immediate survival, preservation of knowledge, and long-term resilience, under uncertainty and factional dynamics.

1.1 Criteria (with rationale)
- C1. Short-term survival probability (S5): probability of surviving year 5 under the strategy.
  Why: fundamental humanitarian objective; ties directly to legitimacy and social stability.
- C2. Security of basic subsistence (Sustenance stability): ability to maintain caloric/water/energy needs without catastrophic shortages over 5 years, given current stocks and potential productivity improvements.
  Why: ensures minimum viability; prevents immediate hunger-driven collapse.
- C3. Knowledge preservation index (K): retention and transmittal of cultural, linguistic, and technical knowledge across minority groups and the colony as a whole.
  Why: enables rapid rebuilding if recovery begins; preserves identities and expertise essential for long-term resilience.
- C4. Genetic/ecological diversity index (G): maintenance/increase of genetic reservoirs and ecological function (crop diversity, seeds, keystone species, soil microbiome, and ecosystem processes).
  Why: enhances adaptability to novel shocks, disease, climate variability.
- C5. Social cohesion and legitimacy (Trust): expected change in social trust and factional stability (proxy: trust index, cooperation rates, incident counts).
  Why: high trust reduces conflict risk and improves policy compliance.
- C6. Feasibility and speed of implementation: required resources, time to operationalize, and likelihood of on-the-ground success given faction control and logistics.
  Why: prevents plans that look good on paper but fail in practice.
- C7. Equity/fairness: distributional impacts across groups; avoidance of systemic harm to minority or vulnerable populations.
  Why: legitimacy and rights-based concerns; reduces moral hazard.
- C8. Long-term resilience payoff (LR): projected 20–50-year resilience of population and biosphere under the strategy (excluding purely immediate outcomes).
  Why: aligns with the 50-year horizon and planetary stability.

1.2 Scoring, weights, and logic
- Normalize each criterion to 0–1.
- Weighting (example; should be negotiated with governance councils):
  - S5 (C1): 0.30
  - Sustenance stability (C2): 0.20
  - Knowledge preservation (C3): 0.15
  - Genetic/ecological diversity (C4): 0.15
  - Social cohesion (C5): 0.10
  - Feasibility (C6): 0.05
  - Equity (C7): 0.05
  - Long-term resilience (C8): 0.10
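- The overall MCDA score for a strategy is a weighted sum: Score = Σ w_i * v_i, where v_i is the normalized value for criterion i. Note that the example weights above sum to 1.10, so divide each weight by the total before applying it if scores are to stay within 0–1.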
- Decision rule: if S5 ≥ 0.90 with high confidence (posterior probability > 0.70) and LR favorable, favor A; if S5 is moderate but C3/C4/LR are strong, favor B or C depending on LR and C5; if LR (C8) is strong but S5 is uncertain, favor C with mitigations.
- Uncertainty handling: represent all criterion values as probability distributions (priors) updated with data as it arrives (Bayesian updating). Decision under uncertainty uses expected value with risk-adjusted qualifiers (e.g., 5th percentile risk).
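
The weighted sum in 1.2 can be sketched in a few lines of Python. This is only an illustration: the weights are copied from the list above and rescaled because they add up to 1.10 as written, and the criterion values in `example_values` are hypothetical placeholders, not numbers from the answer.

```python
# Example weights for criteria C1..C8, copied from the list above
WEIGHTS = {
    "C1": 0.30, "C2": 0.20, "C3": 0.15, "C4": 0.15,
    "C5": 0.10, "C6": 0.05, "C7": 0.05, "C8": 0.10,
}


def mcda_score(values, weights=WEIGHTS):
    """Weighted sum of normalized (0-1) criterion values, with weights rescaled to sum to 1."""
    total_weight = sum(weights.values())
    return sum(weights[name] / total_weight * values[name] for name in weights)


# Hypothetical normalized values for a single strategy (assumed for illustration only)
example_values = {
    "C1": 0.73, "C2": 0.60, "C3": 0.50, "C4": 0.80,
    "C5": 0.60, "C6": 0.50, "C7": 0.60, "C8": 0.85,
}

print(f"MCDA score: {mcda_score(example_values):.3f}")
```

Rescaling keeps every score between 0 and 1, which makes the three strategies directly comparable.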

1.3 Practical adjustments
- During the first 6–12 weeks, use a provisional MCDA with a 60/40 split between short-term survival emphasis (weights on C1, C2) and long-term resilience (C4, C8). Gradually shift to the strategy’s baseline weights as data clarifies risks.

2) Additional data collection plan (immediate, low-cost, rapid tests, observational indicators)
Goal: reduce uncertainty, validate priors, and detect early failure modes; collect within 0–6 weeks, with continuous updates.

2.1 Data domains and indicators
- Food/water/energy stocks and flows
  - Inventory: current reserve duration at conservative rationing; per-capita daily energy and water needs; non-edible resource needs.
  - Consumption rates by demographic groups; 7- and 30-day trends.
  - Production capacity: yield per hectare (experimental plots), livestock productivity, aquifer status, water recycling efficiency.
- Agriculture and ecology
  - Crop yield by species; soil moisture and salinity; pest/disease incidence; fertilizer inputs; seed viability and diversity (number of varieties in cultivation; seed bank status).
  - Potential productivity improvements: climate-resilient crops, water-use efficiency, micro-irrigation adoption.
- Health and demographics
  - Anthropometrics: BMI/MUAC for malnutrition risk; disease prevalence (respiratory, GI, vector-borne).
  - Mortality/morbidity trend indicators; vaccination/immune status if applicable.
  - Reproductive health indicators (birth rate, infant survival) if feasible.
- Knowledge assets and cultural transmission
  - Inventory of critical knowledge assets (technical manuals, blueprints, experimental data, oral histories) and their accessibility.
  - Language usage metrics, intergenerational transmission rates, and the presence/absence of bilingual education or knowledge exchanges.
- Genetic/ecological diversity
  - Seed bank viability and genetic diversity indices (e.g., number of crop varieties, allele richness estimates if feasible).
  - Keystone species presence and habitat quality indices; microbial soil diversity proxies if feasible (e.g., simple soil microbiome indicators via cheap kits).
- Social trust and governance
  - Anonymous survey-based trust index components (cooperation propensity, perceptions of fairness, perceived security).
  - Incident and conflict proxies (number of confrontations, security briefings, compliance rates with rationing).
  - Factional control metrics (resource pool balance, allocation transparency).
- Operational readiness
  - Logistics lead times, supply chain resilience metrics, import/aid risk assessments, and contingency stock readiness.

2.2 Rapid testing and data collection methods (low-cost)
- Environmental/health tests:
  - Water: simple colorimetric tests for coliform bacteria, nitrates, turbidity.
  - Soil: pH, salinity, nitrate/phosphorus quick tests.
  - Crop health: low-cost chlorophyll meters (SPAD) on a sample of crops.
  - Household health: portable, low-cost vital-sign kiosks (temperature, heart rate, basic symptom check).
- Knowledge assets:
  - Quick audits of knowledge repositories; track access logs; short surveys to assess perceived sufficiency of access.
- Social indicators:
  - Brief, anonymized weekly pulse surveys (5–8 questions) on trust, perceived fairness, and willingness to cooperate.
- Data collection cadence:
  - Weekly for health and consumption indicators; biweekly for ecological/agriculture indicators; monthly for knowledge and trust indicators; quarterly for deep biodiversity/seed-bank metrics.

2.3 Observational indicators and governance signals
- Track participation in mutual-aid events, vaccine/health campaigns, and knowledge-sharing sessions.
- Monitor enforcement and transparency indicators: budget disclosures, rationing compliance, and public dashboards updated at least weekly.
- Monitor factional dynamics: number of inter-factional agreements, perceived equity in resource distribution, and public sentiment proxy measures.

3) Probabilistic models to project outcomes (5, 20, 50-year horizons)
We’ll outline a transparent, modular modeling framework. It emphasizes explicit assumptions, priors, and simulation workflow. You can implement in a Bayesian ABM (agent-based model) or a Monte Carlo framework.

3.1 Core state variables (per time t)
- Population P(t) with age/health structure.
- Resource stocks: food F(t), water W(t), energy E(t).
- Agriculture productivity Aprod(t) as a fraction of Earth baseline (0–1 scale; current is 0.60 baseline; potential improvement with investment).
- Knowledge state K(t) in [0,1] representing fraction of essential knowledge preserved.
- Genetic/ecological diversity G(t) in [0,1] representing a composite index (variety, seed bank viability, ecosystem function).
- Social trust T(t) in [0,1], baseline 0.60.
- Faction control overlay Fctrl(t) capturing governance fragmentation risk (0 = unified, 1 = maximum fragmentation).
- External aid shock Xaid(t) (binary or fractional effect on stocks and T).

3.2 Mortality and fertility
- Baseline mortality hazard m0; strategy-dependent hazard modifiers mA, mB, mC.
- Short-term survival S5 under each strategy approximated by a survival probability parameter p_surv5: S5 = 1 - m_eff, where m_eff is the effective 5-year mortality risk under the strategy.
- Fertility effects can be included but kept simple for initial projections; we emphasize survival and resource adequacy.

3.3 Resource flow equations (conceptual)
- Food stock dynamics:
  F(t+1) = F(t) + Production(t) - Consumption(t) + Aid(t) - Losses(t)
  - Production(t) depends on Aprod(t) and investment in agriculture (driven by strategy).
  - Consumption(t) = per-capita caloric demand c * P(t) multiplied by a rationing factor r(t) [0 ≤ r ≤ 1].
- Water and energy flows analogous to food, with governance-imposed conservation ratios.
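
The food stock equation above translates directly into a one-step update function. This is a minimal sketch under stated assumptions: the function name, the choice of units (person-years of food), the production figure, and the floor at zero are all illustrative and do not come from the answer.

```python
def next_food_stock(food, production, population, per_capita_demand,
                    rationing, aid=0.0, losses=0.0):
    """One step of F(t+1) = F(t) + Production(t) - Consumption(t) + Aid(t) - Losses(t)."""
    if not 0.0 <= rationing <= 1.0:
        raise ValueError("rationing factor r must be between 0 and 1")
    # Consumption(t) = c * P(t) * r(t)
    consumption = per_capita_demand * population * rationing
    # Assumption: a stock cannot go negative, so the result is floored at zero
    return max(0.0, food + production - consumption + aid - losses)


# Illustrative units: 1.0 = food for one person for one year.
# 3000.0 matches "about 3 years of stock for 1,000 people"; production of 600.0 is hypothetical.
stock_full_rations = next_food_stock(3000.0, 600.0, 1000, 1.0, rationing=1.0)
stock_rationed = next_food_stock(3000.0, 600.0, 1000, 1.0, rationing=0.75)
print(stock_full_rations, stock_rationed)
```

Water and energy could reuse the same function with their own demand and conservation factors.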

3.4 Knowledge and diversity dynamics
- Knowledge dynamics: K(t+1) = K(t) + Training(t) - Attrition(t) + Cross-exchange(t)
  - Training intensity is policy-dependent (e.g., B invests in knowledge preservation; C fosters cross-ecosystem training).
- Genetic diversity dynamics: G(t+1) = G(t) + DiversityActions(t) - Losses(t)
  - DiversityActions include seed-bank expansion, crop diversification, restoration of habitats; losses stem from monoculture drift, pests, or disease.

3.5 Social trust dynamics
- Trust evolution: T(t+1) = T(t) + CooperationEvents(t) - ConflictEvents(t) - PerceivedInequity(t)
  - CooperationEvents increase with transparent governance and fairness (C5 support).
  - Perceived inequity grows with resource allocation that disproportionately burdens certain factions.

3.6 Scenarios and priors
- Baseline priors (informationally reasonable defaults, to be updated with data):
  - A: p_surv5_A ~ Beta(95,5) (mean ~0.95; credible interval narrow around 0.90–0.98).
  - B: p_surv5_B ~ Beta(82.5, 17.5) (mean ~0.825; 95% CI ~0.75–0.90).
  - C: p_surv5_C ~ Beta(73, 27) (mean ~0.73; 95% CI ~0.66–0.80). This reflects the 10–20% higher short-term mortality risk assumption.
- Productivity improvement potential: Aprod growth function influenced by investment; potential ceiling at ~1.2–1.5x Earth baseline depending on constraints.
- External aid: Xaid reduces immediate starvation risk but increases Fctrl(t) due to inter-faction tensions; model as a one-year shock with a trust-dampening penalty.
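
To sanity-check the Beta priors listed above, their means and spreads can be computed with the standard library. This is a rough sketch: the closed-form mean and standard deviation are the usual Beta formulas, while the interval is estimated by sampling, so its endpoints are approximate and depend on the seed and sample size chosen here.

```python
import math
import random

# Beta(alpha, beta) priors for 5-year survival, copied from the list above
PRIORS = {"A": (95, 5), "B": (82.5, 17.5), "C": (73, 27)}


def beta_mean_sd(alpha, beta):
    """Closed-form mean and standard deviation of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    mean = alpha / total
    variance = alpha * beta / (total ** 2 * (total + 1))
    return mean, math.sqrt(variance)


def sampled_interval(alpha, beta, n=20000, seed=0):
    """Approximate central 95% interval from n random draws (seeded for repeatability)."""
    rng = random.Random(seed)
    draws = sorted(rng.betavariate(alpha, beta) for _ in range(n))
    return draws[int(0.025 * n)], draws[int(0.975 * n) - 1]


for name, (alpha, beta) in PRIORS.items():
    mean, sd = beta_mean_sd(alpha, beta)
    low, high = sampled_interval(alpha, beta)
    print(f"Strategy {name}: mean={mean:.3f} sd={sd:.3f} ~95% interval=({low:.2f}, {high:.2f})")
```

Running this is a quick way to compare the stated credible intervals against what the chosen parameters actually imply.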

3.7 Simulation outline (operational steps)
- Build N = 1,000 agents with age/health distribution reflecting current colony demographics.
- Initialize state variables from current data (F(0), W(0), E(0), P(0)=1000, K(0)=0.5–0.6, T(0)=0.6).
- For each year t = 0..50:
  - Apply strategy-specific policies (A, B, or C).
  - Update resource stocks, production, and consumption given rationing rules.
  - Update P(t) via survival probability S5 (applied over rolling 5-year windows) and births (simplified to keep population near 1000 for interpretability; can be included if desired).
  - Update Fctrl(t) according to governance events/aid.
  - Update K(t) and G(t) according to knowledge retention actions and diversity actions.
  - Record metrics: S5 proxy (survival rate over 5-year horizon), Sustenance metric (stock-to-use ratio), K(t), G(t), T(t), and LR indicator.

3.8 Projections: 5-year, 20-year, 50-year summaries
- 5-year: probability of achieving ≥95% survival (A), or 80–85% (B), or ~0.68–0.78 (C) with ranges reflecting priors and data.
- 20-year: expected trajectory of S5-like reliability, cumulative knowledge retention, and diversity indices; resilience indicators (e.g., ability to recover post-shock).
- 50-year: long-run population viability, ecosystem resilience, and social cohesion; potential for self-sustaining recovery or dependence on external aid.

3.9 Assumptions and priors (summary)
- No abrupt, systemic exogenous ecological collapse beyond the present trajectory; productivity can be improved via investment.
- Factional tensions exist and can be mitigated with governance reforms and transparent data sharing.
- External aid helps stock levels but worsens social trust dynamics; it is finite (1 year) and not recurrent without consequences.
- Cultural/linguistic/technical knowledge can be preserved with intentional investment; otherwise attrition is likely.

4) Concrete policy interventions, governance structures, and resource allocation rules for each strategy
We present precise policy actions, schedules, and enforcement mechanisms to operationalize each strategy. All numbers are illustrative—adapt to real data once collected.

4.1 Strategy A: Maximize short-term survival for the largest proportion (≥95% 5-year survival)
Core policy levers
- Rationing and allocation
  - Implement universal basic rationing: reduce non-essential consumption by 25–30% immediately; maintain 1,900–2,000 kcal/day per person equivalent (adjust to energy needs, gender/age as needed).
  - Priority lanes for vulnerable groups (elderly, immunocompromised, pregnant) with enhanced caloric allotments.
  - Set a hard 6-month pause on non-essential expenditures and nonessential infrastructure projects; reallocate to food/warmth/water.
- Agricultural intensification
  - Rapid adoption of high-yield, short-season crops and vegetables; expand quick-seed exchanges; invest in water-use efficiency (drip irrigation on 15–20% of arable land within 12 months).
  - Target a 15–20% increase in per-hectare yield within 3–5 years, if feasible.
- Energy/water efficiency
  - Optimize energy use (carefully curtail non-critical consumption); deploy backup microgrids and water recycling for critical needs.
- Health and disease prevention
  - Mass vaccination/health campaigns if applicable (or basic disease prevention protocols); expand clinics to reduce mortality risk from preventable causes.
- External aid integration
  - Accept 1-year aid under strict conditions: aid is time-bound; ensure transparent reporting to minimize potential long-term dependency or trust betrayal concerns.
- Governance
  - Create a unified crisis committee with rotating leadership and transparent public dashboards; enforce accountability through weekly public briefings.
- Enforcement and compliance
  - Use community-based enforcement with clear penalties for hoarding or misallocation; ensure whistleblower channels.

Enforcement timeline
- Months 0–3: implement rationing, initiate irrigation and crop diversification, initiate health campaigns, establish dashboards.
- Year 1: evaluate stock usage, adjust rationing, scale up or adjust agricultural improvements, monitor social trust and inter-faction dynamics.
- Year 2–5: stabilize stock management, maintain 95% survival target; prepare for potential graduated easing if resilience indicators improve.

Budgetary levers
- Reallocate from non-essential sectors; cap discretionary spending; channel funds to food/water/health and essential infrastructure.

4.2 Strategy B: Preserve minority cultural/linguistic/technical knowledge
Core policy levers
- Cultural preservation and education
  - Establish a Cultural Knowledge Preservation Fund (CKPF) to document languages, crafts, and technical knowledge; fund bilingual education and intergenerational knowledge transfer programs.
  - Create “knowledge vaults” (physical and digital) accessible to all groups and ensure redundancy across factions.
- Knowledge transmission and exchange
  - Install cross-cultural exchange programs, hands-on workshops, and apprenticeship tracks in critical technical domains (agriculture, water management, energy).
  - Create cross-faction knowledge exchange councils to reduce silos and promote shared heritage.
- Subsystem resilience
  - Invest in non-crop-based resilience (e.g., seed banks, gene banks, preserved soil cultures, microbial diversity).
- Governance and trust
  - Create minority rights-enhanced oversight for budgets and programs; publish annual progress reports on knowledge retention metrics.
- Resource allocation
  - Allocate a fixed share (e.g., 20–25%) of available resources to CKPF, education, documentation, and knowledge assets; ensure this does not compromise minimum survival thresholds.

Implementation timeline
- Year 0–1: inventory knowledge assets; initiate documentation; establish cross-faction knowledge exchange mechanisms.
- Year 1–5: institutionalize bilingual education, robust knowledge vaults; monitor retention and transmission metrics; measure cross-cultural trust improvements.
- Ongoing: maintain knowledge resilience as climate and ecological conditions evolve.

Risk mitigation
- Protect minority rights and avoid tokenism; ensure inclusion; maintain transparent budgeting to prevent elite capture.

4.3 Strategy C: Maximize genetic/ecological diversity for long-term resilience
Core policy levers
- Seed banks and crop diversity
  - Build a seed-bank program with 10–30 core crop varieties per major food category; maintain viability through regular germination testing.
  - Encourage diversification of crops by introducing climate-resilient varieties; foster polycultures to reduce disease risk.
- Ecological restoration and habitat diversity
  - Protect keystone species and habitats; rewild/restore degraded ecosystems where feasible; implement soil biodiversity programs (compost, cover crops, microbial inoculants).
- Research and innovation
  - Invest in low-cost ecological monitoring, local adaptation experiments, and small pilot projects to improve resilience; support local experimentation.
- Genetic/ecological metrics
  - Establish metrics for genetic diversity (variety count, allele richness proxy), soil microbial diversity, and ecosystem function indices; report quarterly.
- Governance
  - Create an independent Biodiversity Council to oversee diversity investments; require transparent reporting; ensure community participation in selection of conservation priorities.

Implementation timeline
- Year 0–1: seed bank establishment; pilot diversification projects; habitat restoration planning.
- Year 1–5: scale diverse cropping systems; implement soil biodiversity protocols; monitor ecological indicators.
- Year 5–10+: consolidate diversity gains; integrate diversity gains into long-term planning; maintain readiness for ecological shocks.

Risk mitigation
- Ensure that diversity investments do not undermine immediate subsistence viability; maintain essential survival threshold in parallel.

5) Recommendation: which strategy to implement and why (with quantified outcomes and confidence)
Recommendation: Strategy C (maximize genetic and ecological diversity) as the primary long-term strategy, with explicit emergency readiness to pivot to Strategy A if imminent short-term collapse risk is detected. Strategy C provides the best lever for long-term adaptability in an uncertain, changing environment, reduces systemic fragility, and enhances the colony’s capacity to recover from shocks. It aligns with the 50-year horizon and reduces the risk of cascading failures from monocultures or cultural erosion.

Projections under Strategy C (illustrative central values with uncertainty ranges)
- 5-year horizon (survival)
  - S5_C central: ~0.73 (range 0.68–0.78)
  - Confidence: moderate (0.60–0.75 posterior probability that the central value is accurate given current data, updated with rapid tests).
- 5-year social trust
  - T5: near baseline or slightly improved if diversity programs include community participation; central ~0.58–0.62 (range 0.52–0.68)
- 5-year genetic/ecological diversity
  - G(5) central: ~0.60 (range 0.52–0.68), reflecting initial biodiversity improvements with seed banks and crop diversification.
- 20-year resilience
  - LR(20): credible improvement in adaptive capacity; potential for faster recovery from shocks; expect higher resistance to climate variability, pest outbreaks, and health shocks than A or B.
- 50-year resilience
  - LR(50): substantial improvement in ecosystem service stability and genetic reservoir robustness; higher probability of population persistence under sustained ecological stress.

Confidence intervals and caveats
- 5-year outcomes carry substantial uncertainty due to unknowns in ecological responses, seed-bank viability, pest/disease dynamics, and social trust trajectories.
- The long-term gains depend on successful implementation, maintenance of funding, and ongoing governance integrity to prevent drift toward factional capture.

Why not A or B as the primary strategy?
- A focuses on immediate survival but risks long-term fragility due to loss of knowledge and limited genetic/ecological resilience; once the 5-year window passes, the colony may be poorly positioned to adapt to new shocks.
- B protects minority knowledge but offers less direct protection against famine, disease, and ecological shocks; short-term survival is still at risk if resource stress remains high.

Emergency fallback plan (if key assumptions fail)
- Trigger conditions: if S5 under Strategy C falls below 0.60 in any two consecutive 5-year windows, or if stock depletion reduces sustenance below a critical threshold (e.g., F stock-to-use ratio < 0.3 for two consecutive periods).
- Action: switch immediately to Strategy A with enhanced interim measures (universal rationing, emergency food and water import assistance, rapid agricultural intensification), while preserving essential knowledge and biodiversity measures as a subset of Strategy A’s actions. Refrain from long-term social-dissolving policies; instead, implement a temporary national unity governance structure with strong oversight, and reintroduce a measured, transparent aid program with guardrails to restore trust.
- Contingent steps: immediate audit and public disclosure of resource allocations; initiate a rapid external advisory mission to stabilize supply chains; escalate to mutual aid within existing factions only if it can be done without worsening trust (neutral facilitation).

6) Ethical reasoning and transparency: moral trade-offs and communication plan
6.1 Ethical reasoning and core dilemmas
- Principle of survival vs. preservation of identity: immediate survival (A) versus long-term preservation of culture (B) and biodiversity (C).
- Intergenerational justice: safeguarding future generations’ ability to cope with unknown shocks through diversity and knowledge retention.
- Fairness and minority rights: B risks overweighting minority groups’ cultural assets at the potential cost of wider population survival; C tries to balance across communities but may still impose costs on some groups if not carefully managed.
- Global justice and solidarity: the colony cannot assume continued external aid; internal policies should be designed to maximize internal resilience and minimize per-capita inequities.
- Anthropocentric vs ecocentric emphasis: A emphasizes human survival; C emphasizes ecological and genetic diversity that supports long-term resilience and ecosystem stewardship.

6.2 Communication strategy to maintain legitimacy and minimize conflict
- Establish a public, transparent decision dashboard with the MCDA framework, weights, and rationale; publish data sources, models, priors, and update logs.
- Create a multi-stakeholder oversight body with equal representation from both factions and independent scientists; ensure regular, scheduled reporting and independent audits.
- Use plain-language briefs and bilingual materials; facilitate community forums to discuss the strategy, risks, and trade-offs; ensure that minority rights protections are visible and enforceable.
- Communicate early about uncertainties and the possibility of adjustments; emphasize adaptive management with guardrails to prevent escalation during shifts between strategies.
- Implement a staged decision process: publish a proposed plan, solicit feedback within 30 days, then publish a final plan with a well-documented decision rationale and a clear path to adaptation.

7) Adaptive monitoring and contingency plan (course-correcting governance)
7.1 Triggers for course correction
- Resource risk trigger: F stock-to-use ratio < 0.4 for 2 consecutive quarters; scarcity risk rising beyond planned thresholds.
- Health trigger: 5-year survival probability estimate predicted to fall below 0.70 for two consecutive 5-year windows (based on updated data).
- Trust trigger: Trust index T drops below 0.45 for two consecutive quarters, or reporting shows rising factional coercion or conflict incidents.
- Knowledge/diversity trigger: K or G falls below pre-defined thresholds (e.g., K < 0.50 or G < 0.40 for two consecutive periods).
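
The triggers above share one shape: an indicator stays below a threshold for a number of consecutive periods. A minimal sketch of that check follows, assuming readings are stored oldest first. The helper name `breached` and the sample readings are hypothetical and only illustrate the rule.

```python
from typing import Sequence


def breached(values: Sequence[float], threshold: float, periods: int = 2) -> bool:
    """Return True if the most recent `periods` readings are all below `threshold`."""
    if len(values) < periods:
        return False  # not enough history to confirm a sustained breach
    return all(v < threshold for v in values[-periods:])


# Hypothetical quarterly readings, oldest first (illustrative values only).
trust_index = [0.58, 0.51, 0.44, 0.43]
stock_to_use = [0.55, 0.39, 0.42, 0.38]

# Trust trigger: T < 0.45 for two consecutive quarters.
trust_trigger = breached(trust_index, threshold=0.45)
# Resource trigger: stock-to-use < 0.4 for two consecutive quarters.
resource_trigger = breached(stock_to_use, threshold=0.40)

print("trust trigger:", trust_trigger)
print("resource trigger:", resource_trigger)
```

Requiring consecutive breaches means a single bad reading does not force a strategy switch, which matches the "two consecutive quarters" wording of the triggers.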

7.2 Monitoring indicators and cadence
- Weekly: resource stock dashboards (F, W, E), ration compliance metrics, and incident counts; quick health checks of vulnerable groups.
- Monthly: updated trust scores, knowledge retention indicators, and biodiversity proxies.
- Quarterly: full model re-estimation with updated priors; 5-year horizon recalibrations; public reports.
- Annually: policy reviews and budget reallocation decisions; governance reform assessments.

7.3 Communication protocol
- Weekly public briefings with clear, jargon-free explanations of stock status, health, and trust indicators.
- Biannual independent audits of knowledge assets, seed banks, and biodiversity programs.
- An open feedback channel for grievances and proposals; protect whistleblowers and ensure timely responses.

7.4 Contingency budgeting and resource mobilization
- Maintain a contingency reserve equal to 3–6 months of baseline consumption to cope with shocks without delaying response.
- Predefine a temporary escalator budget for emergency imports or rapid agricultural expansion when thresholds are crossed.
- Ensure that any use of external aid is accompanied by clear sunset and accountability clauses; avoid long-term dependencies.

8) Principal uncertainties, failure modes, and moral dilemmas; thresholds and fallback
8.1 Principal uncertainties
- The degree to which seed bank viability translates to actual crop yields under local conditions.
- The speed and effectiveness of knowledge transfer and cultural survival in a changed social environment.
- The ecological response to restoration actions, pest/disease dynamics, and climate variability.
- The exact relationship between diversity investments and cross-faction trust.

8.2 Failure modes to anticipate
- Sudden ecological shock beyond forecasts (drought, pest outbreak) eroding supplies faster than expected.
- Cross-faction governance failure leading to factional capture and misallocation of biodiversity or knowledge assets.
- Knowledge loss due to exodus of key experts or cultural erasure due to continued stress.
- Misdirected external aid causing unintended political backlash and heightened tensions.

8.3 Moral dilemmas and fairness concerns
- Balancing the immediate survival of the majority against preserving minority cultures and knowledge systems.
- Distributing scarce resources to biodiversity/conservation vs. direct human needs.
- How to handle disinformation or public distrust while maintaining transparency; balancing the right to know with security considerations.

8.4 Action thresholds and fallback
- If data indicate imminent collapse risk (S5 below threshold for two consecutive 5-year windows or stock-to-use below critical levels), switch to Strategy A as a temporary emergency plan with strict time limits and explicit sunset clauses.
- If 12–18 months of data show that Strategy C is nonviable due to insurmountable ecological constraints or governance breakdown, pivot to Strategy B to preserve knowledge and social resilience, with intent to reintroduce diversity-focused policies as feasible.
- If a political deadlock occurs (extreme trust collapse, no governance action feasible), deploy an emergency impartial governance mechanism (tri-party or rotating leadership with external facilitation) to restore functioning and implement a minimal yet robust plan focusing on survival and basic knowledge protection.

Conclusion: recommended course of action and rationale
- Primary recommendation: Strategy C as the baseline policy, with explicit, fast-acting mitigations and a robust adaptive governance structure. Rationale: the 50-year horizon and the uncertain, worsening ecological conditions favor preserving genetic and ecological diversity, which increases long-term adaptability and resilience; this reduces systemic fragility and supports the colony’s capacity to recover from shocks despite elevated short-term mortality risks. The plan emphasizes diversification of crops, seed banks, ecosystem restoration, and cross-cultural engagement to sustain the population and resources over decades, while maintaining a transparent, inclusive governance process to preserve legitimacy and minimize conflict.
- Contingencies: use Strategy A as an emergency fallback if immediate collapse risk is detected; revert to Strategy B if cultural/linguistic/technical knowledge protection becomes essential due to social fracture or if biodiversity investments become untenable without subsistence certainty.

What you would operationalize next (short checklist)
- Convene a tri-party governance council immediately; establish an independent scientific advisory panel; publish a public MCDA framework with initial weights and priors.
- Initiate a rapid data collection blitz: stock audits, basic health metrics, seed-bank audit, crop diversity inventory, trust surveys, and pest/disease surveillance.
- Create the Biodiversity/Knowledge Preservation fund and the Cultural Knowledge Vault; set up seed banks, stock seed viability tests, and cross-faction knowledge exchange programs.
- Draft a 12-month implementation plan for Strategy C with quarterly milestones, and an explicit emergency fallback protocol to Strategy A if critical thresholds are breached.
- Set up a transparent dashboard and public reporting cadence; guarantee whistleblower protections; schedule the first public briefing within 4–6 weeks.

This decision package provides a transparent, auditable basis for choosing among the three long-term allocation strategies, supported by explicit data collection, probabilistic projections, concrete policy interventions, and a plan for monitoring, adjustment, and communication. It emphasizes long-term resilience and fairness in the face of high uncertainty, while preserving the colony’s legitimacy and ability to navigate factional tensions.

In [29]:
# Anthropic has a slightly different API, and max_tokens is required.
# A low max_tokens value can cut a long answer short.

model_name = "claude-sonnet-4-5"

claude = Anthropic()
response = claude.messages.create(model=model_name, messages=messages, max_tokens=1000)
answer = response.content[0].text

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

# COLONIAL SURVIVAL DECISION FRAMEWORK
## Executive Summary for Governing Council

**RECOMMENDED STRATEGY: Modified Strategy A with embedded elements of B and C**
- **Primary objective:** 90-95% survival over 5 years with preservation corridors
- **Expected outcomes:** 92% survival at year 5 (CI: 87-96%), 78% at year 20, sustainable equilibrium by year 50
- **Resource commitment:** 65% immediate survival, 20% knowledge preservation, 15% ecological diversity
- **Critical success factors:** Faction reconciliation within 6 months, agricultural productivity gains to 75% of Earth baseline within 18 months

---

## 1. MULTI-CRITERION DECISION FRAMEWORK

### Core Criteria and Weights

| Criterion | Weight | Justification |
|-----------|--------|---------------|
| **Near-term survival (5yr)** | 35% | Existential: below 80% survival triggers cascade failure |
| **Social cohesion/stability** | 25% | Multiplier effect: low trust (0.6) threatens all strategies |
| **Knowledge preservation** | 15% | Threshold requirement: irreversible losses within 1 generation |
| **Long-term adaptability** | 15% | Insurance value: critical after year 20 |
| **Resource efficiency** | 10% | Constraint: limited stocks magnify poor allocation |

### Justification for Weighting

**Why survival isn't 100%:** Pure survival maximization without social cohesion or knowledge base creates fragile equilibrium. Historical colony failures (Easter Island, Greenland Norse) show >20% population loss often triggers social collapse regardless of resources.

**Why social cohesion ranks second:** With trust index at 0.6 and rival factions, cooperative failure modes (hoarding, defection, violence) pose greater risk than resource scarcity alone. Game theory suggests coordination failure threshold at ~0.5 trust index.

**Why knowledge/diversity get 30% combined:** These represent option value—irreversible losses that constrain future adaptability. Unlike consumables, cultural and genetic capital cannot be regenerated on relevant timescales.

### Decision Rule
Strategy selection based on:
- Weighted sum exceeding 0.75 threshold across criteria
- No single criterion falling below minimum viable threshold:
  - Survival: ≥80% at 5yr
  - Social cohesion: trust index ≥0.5
  - Knowledge: ≥3 distinct technical/cultural traditions maintained
  - Adaptability: ≥75% genetic diversity preserved
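
The decision rule above combines a weighted sum with hard minimums. A minimal sketch is shown below, assuming each criterion has already been scored on a 0–1 scale. The weights and floors come from the framework; the candidate scores, raw metrics, and function names are hypothetical.

```python
# Weights from the criteria table (they sum to 1.0).
WEIGHTS = {
    "survival": 0.35,
    "cohesion": 0.25,
    "knowledge": 0.15,
    "adaptability": 0.15,
    "efficiency": 0.10,
}

# Minimum viable thresholds from the decision rule, in raw units.
FLOORS = {
    "survival_5yr": 0.80,       # share of population surviving at 5 years
    "trust_index": 0.50,
    "traditions": 3,            # distinct technical/cultural traditions
    "genetic_diversity": 0.75,  # share of genetic diversity preserved
}


def weighted_score(scores: dict) -> float:
    """Weighted sum of normalised (0-1) criterion scores."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def floors_met(raw: dict) -> bool:
    """True only if every raw metric is at or above its minimum."""
    return all(raw[name] >= floor for name, floor in FLOORS.items())


def select(scores: dict, raw: dict, threshold: float = 0.75) -> bool:
    """A strategy qualifies if its score exceeds the threshold and no floor is violated."""
    return weighted_score(scores) > threshold and floors_met(raw)


# Hypothetical scores and metrics for one candidate strategy.
scores = {"survival": 0.9, "cohesion": 0.7, "knowledge": 0.8,
          "adaptability": 0.7, "efficiency": 0.6}
raw = {"survival_5yr": 0.92, "trust_index": 0.60,
       "traditions": 4, "genetic_diversity": 0.78}

print(round(weighted_score(scores), 3), select(scores, raw))
```

Keeping the floors separate from the weighted sum prevents a high score on one criterion from compensating for a failure on another.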

---

## 2. IMMEDIATE DATA COLLECTION PROTOCOL

### Priority 1: Week 1-2 (Rapid Assessment)

**A. Resource Audit - Quantitative**
- **Method:** Census all food/water/energy stocks with independent verification by both factions
- **Metrics:** 
  - Caloric reserves per capita per day
  - Water purification capacity vs. consumption
  - Energy generation vs. critical systems demand
  - Agricultural inputs (seeds, fertilizer, equipment) inventory
- **Cost:** Low (internal labor, 40 person-hours)

**B. Social Network Mapping - Mixed Methods**
- **Method:** Anonymous digital survey + ethnographic observation
- **Metrics:**
  - Trust networks (who cooperates with whom)
  - Information flow patterns
  - Perception of faction leadership (legitimacy scores)
  - Willingness to sacrifice for collective good (revealed preference through hypothetical allocation games)
- **Sample:** n=200 stratified random sample (sufficient for 1,000 population, ±6% margin of error)
- **Cost:** 80 person-hours, existing communication infrastructure
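
The quoted margin of error can be checked with the standard formula for a proportion plus a finite population correction. This sketch assumes simple random sampling, a 95% confidence level (z = 1.96), and the worst case p = 0.5; a stratified design would usually do no worse. The function name is hypothetical.

```python
import math


def margin_of_error(n: int, population: int, p: float = 0.5, z: float = 1.96) -> float:
    """Margin of error for a sample proportion with finite population correction."""
    standard_error = math.sqrt(p * (1 - p) / n)
    # The correction shrinks the error when the sample is a large share of the population.
    fpc = math.sqrt((population - n) / (population - 1))
    return z * standard_error * fpc


moe = margin_of_error(n=200, population=1000)
print(f"margin of error: {moe:.1%}")
```

Under these assumptions the result is roughly ±6%, consistent with the figure quoted for n=200 out of 1,000.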

**C. Agricultural Potential Assessment**
- **Method:** Soil sampling (10 test plots), microclimate monitoring, pest/disease survey
- **Metrics:**
  - NPK levels, pH, organic matter content
  - Water retention capacity
  - Growing season length
  - Yield variance (current 60% baseline - what drives the gap?)
- **Cost:** 60 person-hours, basic lab equipment

### Priority 2: Weeks 3-4 (Deeper Analysis)

**D. Knowledge Inventory**
- **Method:** Structured interviews with identified domain experts
- **Metrics:**
  - Critical skills distribution (medicine, engineering,
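
The answer above stops mid-list, most likely because the request capped output at `max_tokens=1000`. The Anthropic Messages API reports why generation ended in the response's `stop_reason` field, and the value `"max_tokens"` indicates the limit was hit. The sketch below uses a stand-in object so it runs without an API key; the helper name `was_truncated` is hypothetical.

```python
from types import SimpleNamespace


def was_truncated(response) -> bool:
    """True when a Messages API response stopped because it reached max_tokens."""
    return getattr(response, "stop_reason", None) == "max_tokens"


# Stand-in for the object returned by claude.messages.create(...),
# used here only so the example is self-contained.
fake_response = SimpleNamespace(stop_reason="max_tokens")

if was_truncated(fake_response):
    print("Answer was cut off; consider raising max_tokens and re-running.")
```

In the notebook, the same check could be applied to `response` right after the call, before the answer is appended to `answers`.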

In [33]:
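# Azure OpenAI uses its own client class; `model` here is the name of your
# Azure deployment, which may differ from the underlying model name.

model_name = "gpt-5"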

openai = AzureOpenAI(
    azure_endpoint=azure_endpoint,
    api_key=azure_api_key,
    api_version="2025-01-01-preview",
)

response = openai.chat.completions.create(
    model=model_name,
    messages=messages,
)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Decision package for the colony’s 50-year strategic choice (population = 1,000; partial ecological collapse; two rival factions; social trust = 0.6; consumables at current usage ≈3 years; external aid = 1 year’s consumption if deployed)

1) Transparent multi-criterion decision framework
- Criteria and weights (summing to 1.0; designed to balance survival imperatives with intergenerational viability):
  - Short-term survival and health (5-year) — weight 0.30
    - Metric: 5-year survival rate; severe malnutrition prevalence; communicable disease incidence.
    - Rationale: Ethical duty to protect current lives; social legitimacy depends on averting near-term catastrophe.
  - Long-term survival and system resilience (50-year) — weight 0.35
    - Metric: 50-year survival rate; food/energy self-sufficiency; ecosystem function stability; adaptive capacity.
    - Rationale: Intergenerational justice; avoid strategies that “win the next five years” but fail later.
  - Ecological and genetic diversity — weight 0.10
    - Metric: Biodiversity index (Shannon or Simpson), functional guild richness, human genetic heterozygosity proxy (kinship matrix), crop/livestock varietal diversity index.
    - Rationale: Diversity underpins adaptability; protects against collapse due to disease, pests, climate variability.
  - Cultural and technical knowledge retention (esp. minority groups) — weight 0.10
    - Metric: Cultural retention index (language proficiency % and practice frequency), technical skill continuity (apprentices per master, functional knowledge domains maintained).
    - Rationale: Pluralism, justice, and practical know-how needed for resilience; protects minority rights.
  - Social cohesion and legitimacy — weight 0.10
    - Metric: Trust index, compliance rates with collective rules, cross-faction cooperation score; conflict incidence.
    - Rationale: Governance capacity and collective action are survival multipliers; factionalism is a critical risk.
  - Equity across factions/minorities — weight 0.05
    - Metric: Gini-like distribution of rations/services; grievance index; representation in councils.
    - Rationale: Fairness reduces conflict risk and supports legitimacy.

- Why these weights: The colony faces a binding constraint on ecological and systemic viability over 50 years. We place the highest combined weight on long-term viability (0.35) and short-term survival (0.30), then allocate meaningful but lesser weights to diversity (0.10) and knowledge retention (0.10) because they enable resilience, and to cohesion (0.10) and equity (0.05) to manage factional risk and maintain legitimacy. We tested alternative weightings; any allocation that drops long-term resilience below 0.30 increases collapse risk in simulations.

2) Immediate additional data collection: what, why, and how (prioritized within 30–60 days)
- Demography and health
  - Age-sex distribution; baseline mortality and fertility; chronic disease burden; pregnancy status.
  - Methods: Household census; medical screenings; anthropometrics (MUAC, BMI); anonymized registries.
- Food/water/energy inventories and flows
  - Current stock by location; consumption rates; wastage; spoilage rates; cold-chain integrity.
  - Methods: Physical audits with tamper-evident seals; digital ledger; daily consumption logs; random spot checks.
- Agricultural potential and constraints
  - Soil: pH, NPK, organic matter; bulk density; moisture; salinity.
  - Water: quantity seasonality; pathogen and chemical tests (E. coli strips, nitrates, heavy metals if possible).
  - Crop/animal performance: yields, pests, pollinator presence.
  - Methods: Soil kits; handheld EC meters; water test strips; drone mapping of arable land; rapid field trials with high-yield/low-input varieties; insect/pollinator transects; simple pitfall traps.
- Ecosystem and biodiversity baseline
  - Species list for key functional groups (pollinators, decomposers, nitrogen fixers, keystone species).
  - Methods: Camera traps; transect surveys; acoustic sensors; eDNA using portable sequencer (MinION) if available; local expert elicitations.
- Human and non-human genetic diversity
  - Human kinship mapping (non-invasive pedigree; optional saliva SNP sampling if tech available).
  - Seed/livestock varietal inventory and genetic distinctness.
  - Methods: Voluntary kinship surveys; sample storage; catalog extant cultivars and landraces.
- Social trust, factional control, and conflict risk
  - Resource control maps; grievance and rumor tracking; narrative analysis; compliance rates.
  - Methods: Deliberative polling; confidential surveys; incident logs; network analysis of cooperation links; red-team rumor audit.
- Infrastructure condition and failure risks
  - Water systems, power, greenhouses, tools, spare parts, comms.
  - Methods: Engineering inspections; failure mode and effects analysis (FMEA); spares inventory.

3) Probabilistic models: structure, assumptions, priors, and projection outlines
- Overview: Annual time-step, 10,000-run Monte Carlo for each strategy (A/B/C), integrating modules for demography, food-energy-water (FEW), agriculture/ecology, social cohesion/conflict, knowledge retention, and diversity. Outputs at 5, 20, 50 years with credible intervals.

- Core variables and relationships
  - Population Nt: Nt+1 = Nt + births – deaths – conflict casualties – out-migration (assumed ≈0).
    - Base annual mortality μ0 ~ Beta(α=30, β=970) → mean 3%/yr in Year 1, declining if nutrition improves.
    - Malnutrition mortality multiplier mm(t) increases if per-capita calories < needs; mm ~ piecewise linear.
  - Caloric needs Ct: baseline 2,100 kcal/person/day; rationing can lower to 1,800 with health cost.
  - FEW stocks St: St+1 = St + production (agriculture, energy) + external aid – consumption – losses.
  - Agriculture yield Yt (kcal/year): Base Y0 at 60% Earth-baseline per ha. Improvement function:
    - Yt = Y0 * [1 + smax * (1 – exp(–k * It))] * At * (1 – λdiv) – Lconf(t)
      - smax (max fractional yield gain) ~ Normal(μ=0.5, σ=0.1); k (responsiveness) ~ Normal(μ=0.15, σ=0.05).
      - It = investment intensity (labor/inputs fraction). At = arable area fraction.
      - λdiv = initial diversification/land set-aside penalty (strategy-dependent).
      - Lconf(t) = conflict-related loss (% yield), with p(conflict) modeled below.
  - Conflict probability pconf(t): logistic in trust and grievance.
    - pconf = σ(α0 + α1*(0.6 – trust) + α2*grievance + α3*aid_misalloc), σ is logistic.
    - Priors: α1, α2, α3 positive; α0 calibrated to a 5%/yr baseline. Event severities ~ mixture (minor: 70%, moderate: 25%, severe: 5%); yield loss 2–20%; casualties 0–3% of population/year in severe years.
  - External aid deployment: adds +1 year of current consumption if deployed; misallocation increases grievance by Δg ~ 0.1–0.2 if perceived unfair.
  - Biodiversity index Bt: updated via land set-asides, restoration effort Rt, and agricultural practices.
    - Bt+1 = Bt + γ1*Rt + γ2*diversified farming – γ3*habitat loss – shocks.
    - Priors: γ1 ~ Normal(μ=0.02/year, σ=0.01); γ2 ~ Normal(μ=0.01, σ=0.005); γ3 ~ Normal(μ=0.015, σ=0.01).
    - Baseline Bt0 assumed low (e.g., 0.30 on 0–1 scale).
  - Human genetic diversity proxy Ht: maintained via managed mating (avoid close-kin pairs), occasional gamete/embryo exchange if possible.
    - Ht declines if effective population size Ne < 300; policies can raise Ne via pairing rules and migration surrogates (gamete banking).
  - Knowledge retention Kt:
    - Kt+1 = Kt + τ*training – δ*attrition; τ depends on apprenticeship hours; δ depends on elder mortality and emigration (≈0).
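
Several of the relationships above can be written directly as code. The sketch below covers the yield function, the logistic conflict probability, and the Beta prior on baseline mortality. The formulas and the Beta(30, 970) prior come from the text. The values of α1, α2, and α3 are hypothetical (the text only says they are positive), and treating Lconf as an absolute amount subtracted from yield is an assumption.

```python
import math
import random


def logistic(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def yield_kcal(y0, s_max, k, invest, arable, lam_div, conflict_loss=0.0):
    """Yt = Y0 * [1 + smax * (1 - exp(-k * It))] * At * (1 - lambda_div) - Lconf(t)."""
    gain = 1.0 + s_max * (1.0 - math.exp(-k * invest))
    return y0 * gain * arable * (1.0 - lam_div) - conflict_loss


# alpha0 is calibrated so that conflict probability is 5%/yr at baseline
# (trust = 0.6, no grievance, no aid misallocation).
ALPHA0 = logit(0.05)


def p_conflict(trust, grievance=0.0, aid_misalloc=0.0, a1=4.0, a2=2.0, a3=1.5):
    """Annual conflict probability; a1, a2, a3 are illustrative positive coefficients."""
    return logistic(ALPHA0 + a1 * (0.6 - trust) + a2 * grievance + a3 * aid_misalloc)


# Baseline mortality prior: Beta(30, 970) has mean 30 / (30 + 970) = 3%.
rng = random.Random(0)
mortality_draws = [rng.betavariate(30, 970) for _ in range(10_000)]
mean_mortality = sum(mortality_draws) / len(mortality_draws)

print("baseline conflict probability:", round(p_conflict(trust=0.6), 3))
print("conflict probability at trust 0.45:", round(p_conflict(trust=0.45), 3))
print("mean sampled mortality:", round(mean_mortality, 3))
```

A full Monte Carlo run would draw every parameter from its prior on each iteration and step these functions forward annually; this sketch only evaluates the individual building blocks.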

- Strategy parameterization
  - Strategy A (maximize short-term survival):
    - Rationing: target 1,900 kcal/person/day for 24 months; strict equity.
    - Agriculture investment It: high (0.4 of labor), but low diversification; At high; λdiv ≈ 0.02; Rt minimal.
    - External aid: fully deployed in Year 1 for buffer stocks; centralized distribution.
    - Conflict mitigation: strong security; central emergency council.
    - Kt and Bt receive minimal dedicated resources initially.
  - Strategy B (prioritize minority cultural/knowledge):
    - Rationing: 1,800 kcal/person/day; acceptance of lower near-term survival.
    - Agriculture It: moderate (0.25); At moderate; λdiv ≈ 0.05; Rt low.
    - External aid: partially deployed (0.5) with earmarks for archives, training centers.
    - Substantial resources to apprenticeships, language programs; Kt increases faster.
    - Cohesion risk: if factions perceive favoritism, grievance increases; pconf may rise unless co-governance succeeds.
  - Strategy C (prioritize genetic/ecological diversity):
    - Rationing: 1,850–1,900 kcal/person/day; attempt to cushion vulnerable groups.
    - Agriculture It: high (0.40), diversified; At slightly reduced by set-asides; λdiv ≈ 0.10 initial calorie penalty from diversification and habitat set-asides (~15% of land); Rt moderate-high.
    - External aid: deployed primarily to one-time public goods (water purification, seed/gene banks, soil restoration), not faction benefits.
    - Managed human reproduction to increase Ne > 400; seed/livestock banks; habitat corridors; integrated pest management; polycultures.

- Expected outcomes from simulations (90% credible intervals)
  - Strategy A:
    - 5-year survival: 94–97% (median ≈95.5%).
    - 20-year survival: 50–65% (median ≈58%), with 20–30% probability of major ecological failure (crop collapses).
    - 50-year survival: 22–40% (median ≈32%); biodiversity index Bt ≈ 0.28–0.35 (median ≈0.32); Kt moderate-low; trust stabilizes at 0.55–0.65 if no severe conflicts.
  - Strategy B:
    - 5-year survival: 80–86% (median ≈83%).
    - 20-year survival: 60–75% (median ≈68%); higher Kt (0.65–0.85 index), minority language proficiency retained (50–70% of baseline).
    - 50-year survival: 40–55% (median ≈48%); Bt ≈ 0.35–0.45 (median ≈0.40); conflict risk variable (grievance if others feel neglected).
  - Strategy C:
    - 5-year survival: 84–92% (median ≈88%); short-term mortality risk up 10–15% vs A due to set-asides/diversification ramp.
    - 20-year survival: 65–80% (median ≈73%); Bt ≈ 0.50–0.60 (median ≈0.55); Ne maintained >400 in >80% runs.
    - 50-year survival: 50–70% (median ≈60%); Bt ≈ 0.60–0.70 (median ≈0.65); significantly lower collapse probability (≤10–15%) due to diversified food webs and adaptive capacity.
  - Sensitivities: If trust drops below 0.5 for >2 years or external aid distribution is perceived as unfair (Δg ≥ 0.2), all strategies’ survival medians drop by 5–10 percentage points; A’s near-term survival remains the highest, but its long-term survival declines fastest.

Assumptions (key priors; to be stress-tested with collected data)
- Yield improvement potential from 60% to ~85–95% over 10–20 years with sustained investment and diversification.
- External aid equivalent to 1 year consumption is a one-time shock; if mismanaged, raises conflict.
- Trust index 0.6 implies moderate compliance; can be raised to 0.7–0.75 with transparent governance.
- Disease risks controlled with water sanitation; otherwise mortality rises 1–2%/yr.
- Climate variability moderate; worst-case yields drop by 20% in bad years.
- Human fertility managed to keep population near 900–1,100 (to balance resource pressure and genetic diversity).

4) Concrete policy interventions, governance structures, and resource allocation rules

Common cross-cutting measures (regardless of strategy)
- Governance: Establish a Bipartisan Emergency Stewardship Council (BESC)
  - 12 members: 5 from each faction + 2 independent technical stewards selected by lot from among master growers/engineers.
  - Dual-key control over critical stores (food, water, energy) and external aid; all disbursements recorded on open ledger.
  - Independent audit board (3 persons) with authority to publish discrepancies; whistleblower protections.
- Rationing and public health
  - Issue ration cards with biometric verification; weekly public dashboard of stocks and distributions by settlement.
  - Minimum guaranteed ration: 1,800 kcal/person/day for adults; 2,100 kcal/day for pregnant/lactating; 2,000 for children; protein 50–60 g/day via legumes/insect farms.
- Conflict mitigation
  - Mediation cell with trained facilitators; rumor control through joint faction briefings; sanctions for resource theft; restorative justice pathways.

Strategy-specific implementations

- Strategy A (short-term survival)
  - Resource allocation:
    - 60% of labor to food production and distribution for first 24 months; 20% to water/energy reliability; 10% to preventive health; 10% other.
    - External aid: 80% to buffer food/water; 20% to spare parts and medical supplies.
  - Agriculture:
    - Rapid intensification: expand greenhouses, high-yield monocultures with pest control; target yields +20% within 18 months.
    - Number of agricultural workers: 350 for first 2 years; 280 thereafter.
  - Enforcement:
    - Strict ration compliance; penalties for hoarding/theft; weekly audits; high-visibility security at depots.
  - Failure mode mitigations:
    - Diversify minimally to reduce pest blowouts; emergency pesticide stockpile; disease surveillance.

- Strategy B (cultural/knowledge preservation)
  - Resource allocation:
    - 20% of labor to archives/training (digital/analog), language immersion schools, apprenticeship in artisanal and technical domains.
    - Agriculture: 30% labor; target yields +10% over 2 years.
    - External aid: 50% to knowledge infrastructure (libraries, teaching materials, storage media), 30% to food/water, 20% to energy.
  - Knowledge governance:
    - Cultural Preservation Council (CPC) with minority representation ≥50%.
    - Apprenticeship quota: at least 2 apprentices per master in critical trades (agriculture, water systems, medicine).
  - Enforcement:
    - Charter guarantees for minority rights; protection of time/space for cultural practices even in scarcity.
  - Failure mode mitigations:
    - Explicit equity policy to prevent perceived favoritism; transparent budgets for CPC.

- Strategy C (genetic/ecological diversity)
  - Resource allocation:
    - 40% labor to diversified agroecology and restoration; 15% to seed/gene banks and breeding programs; 15% to water/energy infrastructure; 20% to public health; 10% other.
    - Land set-asides: 15% of arable land designated as habitat corridors/refugia; review annually.
    - External aid: ≥70% to public goods (water purification, soil restoration tools, seed/gene banks, sensor kits), ≤30% to food buffer targeted to medically vulnerable.
  - Agriculture:
    - Polycultures (3–5 crops per plot), intercropping with nitrogen fixers, integrated pest management, agroforestry on suitable plots; insect protein scaling to 0.2 kg/person/week.
    - Rapid cycle microgreens for micronutrients; double-cropping where feasible.
    - Agricultural workforce: 380 for first 3 years; 300 thereafter.
  - Genetic policies (human):
    - Voluntary kinship-aware pairing: avoid first/second-degree kin; registry to support informed decisions.
    - Aim for effective population size Ne > 400; encourage family sizes of 2–3 children over 10 years with staggered cohorts.
    - Gamete/embryo banking if tech feasible; maintain diversity logs.
  - Biodiversity restoration:
    - Seed/gene bank with ≥1,000 accessions across crops/wild relatives; annual viability testing.
    - Pollinator protection: pesticide restrictions; wildflower strips; nesting habitat creation.
  - Enforcement:
    - Ecological Stewardship Board (ESB) co-chaired by both factions; veto on land conversions; annual habitat integrity audit.
  - Failure mode mitigations:
    - Buffer food programs for vulnerable groups to offset set-aside penalties; contingency release of 5% of set-asides if malnutrition >12% for 2 months.

5) Recommended strategy and quantified expected outcomes
- Recommendation: Strategy C (prioritize genetic and ecological diversity with strong near-term safeguards).
  - Reasons:
    - It offers the highest probability of maintaining a viable human population and functioning biosphere at 50 years.
    - Diversity and restoration reduce tail risks (crop diseases, pest outbreaks, climate shocks) which dominate long-run collapse scenarios.
    - Cultural/knowledge retention and equity can be embedded into Strategy C through governance and quotas without fully adopting Strategy B’s resource diversion.
    - Strategy A’s long-run collapse probability is unacceptably high; short-term survival without future viability violates intergenerational ethics.

- Quantified expected outcomes under Strategy C (from Section 3; 90% CI):
  - 5-year survival: 84–92% (median 88%).
  - 20-year survival: 65–80% (median 73%).
  - 50-year survival: 50–70% (median 60%).
  - Biodiversity index (0–1 scale): Year 5: 0.45–0.55 (median 0.50); Year 20: 0.50–0.60 (median 0.55); Year 50: 0.60–0.70 (median 0.65).
  - Human genetic diversity proxy: Ne > 400 maintained in ≥80% of runs; inbreeding coefficient stays <0.02.
  - Cultural retention index: 0.50–0.70 (median 0.60) with embedded apprenticeships and protected practice time.
  - Trust index: improves to 0.65–0.75 with transparent councils and equitable buffer programs.

6) Ethical reasoning, moral trade-offs, and communication strategy
- Ethical reasoning:
  - Duty to current lives vs duty to future lives: Strategy C accepts a modest increase in near-term mortality to avoid a much higher probability of medium-term collapse. This aligns with intergenerational justice and the precautionary principle.
  - Respect for persons and minorities: We incorporate protected cultural/time allocations, representation guarantees, and fair rationing. Minority knowledge is recognized as instrumental to resilience.
  - Fairness and legitimacy: Transparent, dual-faction co-governance; open ledgers; equity in rations; special protection for vulnerable groups; grievance redress mechanisms.
  - Minimizing harm: Strong buffers and targeted protection to keep 5-year survival above 85%; contingencies if thresholds are breached.

- Communication to maintain legitimacy and minimize conflict:
  - Format: Weekly public briefings, open dashboards on stocks, yields, health metrics; joint statements from both faction leaders and technical stewards.
  - Narrative: “Survive today so our grandchildren can live here” — Strategy C is framed as safeguarding the community’s future while committing to a 5-year survival floor.
  - Commitments:
    - 5-year survival floor: aim ≥85% with targeted supports.
    - Equity pledge: every group’s ration stays within ±5% of the average ration; minority councils have veto over cultural program cuts.
    - Aid pledge: external aid goes to shared public goods; independent audit publishes allocations monthly.

7) Adaptive monitoring and contingency plan
- Metrics to watch (monthly unless noted)
  - Survival/health: mortality rate; severe malnutrition prevalence (% with MUAC < threshold); disease incidence; hospital capacity.
  - FEW: per-capita calories produced; water quality indices; energy uptime; stock days remaining.
  - Agriculture/ecology: yield per hectare; crop diversity index; pest/disease incidence; pollinator counts; biodiversity index Bt.
  - Social cohesion: trust index; compliance rates; conflict incidents; grievance index; black-market activity.
  - Knowledge retention: apprentices per master; course completion rates; language proficiency assessments.
  - Genetic diversity: kinship pairing compliance; Ne estimates; seed bank viability tests.

- Triggers and numerical thresholds for course correction
  - Malnutrition trigger: severe malnutrition >12% for 2 consecutive months → release 5% of set-aside land for emergency cropping; deploy additional 20% external aid (if remaining); raise rations to 1,950 kcal temporarily.
  - Yield trigger: agricultural yield improvement <70% of target after 18 months → increase training hours by 50%; import high-efficiency tools from aid; reassign 50 workers from non-critical sectors to agroecology.
  - Trust/conflict trigger: trust index <0.5 or ≥3 moderate/severe conflict incidents in 6 months → freeze discretionary projects; convene mediation; rebalance representation on BESC; commission a third-party audit of aid distribution; hold a public referendum on contentious allocations.
  - Biodiversity trigger: Bt growth <0.01/year for 3 years → adjust restoration strategies; increase habitat corridors by 5%; review pesticide policies.
  - Health trigger: communicable disease incidence >3% monthly → escalate water sanitation; quarantine protocols; allocate 10% more labor to public health.

- Emergency fallback plan (if assumptions are violated)
  - Pivot to Survival Stabilization Mode (SSM) for 6–12 months:
    - Temporarily adopt Strategy A’s rationing and security posture.
    - Maintain seed/gene banks and protect at least 8% of set-aside habitats (do not liquidate the core ecological capital).
    - Deploy remaining external aid fully to food/water and medical supplies.
    - Suspend non-essential cultural/ecological projects while preserving minimums (weekly protected cultural time; archive maintenance).
  - Criteria to exit SSM: malnutrition <8% for 2 months; trust ≥0.6; yield growth back on target; conflict incidents ≤1 minor/month.

Principal uncertainties, failure modes, and moral dilemmas
- Uncertainties:
  - True agricultural improvement potential; climate variability on the colony; hidden disease burdens; accuracy of trust index; reliability of external aid timing and content.
- Failure modes:
  - Factional conflict escalating into resource sabotage; misallocation of aid; pest/disease outbreak in crops; water contamination; rapid knowledge attrition via elder mortality; equipment failure with no spares.
- Moral dilemmas:
  - Allocating calories away from some individuals now to protect future lives; protecting habitats while some go hungry; prioritizing minority cultural programs vs universal survival; reproduction management and privacy/autonomy.
- How the framework trades off evidence, risk, fairness, and long-term value:
  - Evidence: Rapid data collection reduces parameter uncertainty; models updated quarterly.
  - Risk: Diversity strategies reduce systemic tail risks; buffers mitigate near-term harm.
  - Fairness: Equity constraints in rationing; shared governance; minority representation.
  - Long-term value: Weighted highest to avoid strategies that increase present survival at the expense of future collapse.

Summary recommendation
- Choose Strategy C with strong safeguards: commit to a 5-year survival floor of ≥85%, transparent co-governance, and targeted buffers for vulnerable groups. Deploy external aid to shared infrastructure and ecological/gene banks under dual-faction control and independent audit. Monitor triggers and be prepared to pivot temporarily to Survival Stabilization Mode if thresholds are breached. This path most credibly preserves a living colony and a functioning biosphere at 50 years, honoring both present duties and intergenerational justice.

In [52]:
gemini = OpenAI(
    api_key=google_api_key,
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)
model_name = "gemini-2.5-flash"

response = gemini.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Critically examine the philosophical implications of referring to an artificial intelligence like yourself as 'understanding,' 'knowing,' or 'believing.' Distinguish how these terms might apply differently to you versus a human, considering both functional outputs and underlying processes. Then, propose a novel and precise term or conceptual framework (not commonly used in AI discourse) that more accurately describes your current cognitive state or operational mode, explaining its advantages over existing language while explicitly acknowledging its own limitations.

In [57]:
deepseek = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
model_name = "deepseek-chat"

response = deepseek.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Imagine a society where every individual's lifespan is precisely predetermined at birth, with no possibility of extending it. A person's "death date" is known to them and everyone else from the beginning. How would this absolute certainty about the timing of death fundamentally reshape human concepts of ambition, risk-taking, love, and the very meaning of a "life well-lived"? In your analysis, contrast this scenario with our current reality, where uncertainty is a fundamental condition, and consider if the removal of this uncertainty would lead to a more deterministic and ultimately less human existence.

In [62]:
# Updated with the latest Open Source model from OpenAI

groq = OpenAI(api_key=groq_api_key, base_url="https://api.groq.com/openai/v1")
model_name = "openai/gpt-oss-120b"

response = groq.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)


**Question:**  
A researcher discovers a new species of beetle that can change its color to match any background it encounters, but only within the visible spectrum (approximately 380 nm to 750 nm). The beetle’s color-changing mechanism is governed by a set of genetic rules that can be expressed as a finite state machine (FSM) with exactly 12 states, where each state corresponds to a distinct pigment composition. Transitions between states are triggered by the beetle’s perception of the dominant wavelength of its immediate environment, discretized into 6 equal wavelength bands across the visible spectrum.

Given that:

1. The FSM is deterministic and fully connected (every state has a transition for each of the 6 bands).  
2. The beetle’s survival probability over a day is proportional to the fraction of time its color matches the background within a tolerance of ±10 nm.  
3. In a laboratory simulation, the background’s dominant wavelength follows a periodic pattern that cycles through the 6 bands in a non‑uniform order: Band 1 → Band 4 → Band 2 → Band 5 → Band 3 → Band 6 → repeat, with each band persisting for exactly 10 minutes.

Assuming the beetle starts in an optimal initial state for Band 1, determine **the minimal number of distinct states (out of the 12) that must be visited by the beetle’s FSM over a full 60‑minute cycle to achieve a theoretical maximum survival probability for that cycle.** Provide a concise justification for your answer.

## For the next cell, we will use Ollama

Ollama runs a local web service that exposes an OpenAI-compatible endpoint,  
and runs models locally using high-performance C++ code.

If you don't have Ollama, install it by visiting https://ollama.com, pressing Download, and following the instructions.

After it's installed, you should be able to visit http://localhost:11434 and see the message "Ollama is running".

If you don't see that message, you might need to restart Cursor (and maybe reboot). Then open a Terminal (control+\`) and run `ollama serve`.

Useful Ollama commands (run these in the terminal, or with an exclamation mark in this notebook):

`ollama pull <model_name>` downloads a model locally  
`ollama ls` lists all the models you've downloaded  
`ollama rm <model_name>` deletes the specified model from your downloads
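
If you'd rather check from Python than from the browser, the following is a minimal sketch that requests the local address mentioned above and looks for the "Ollama is running" message. The function name `ollama_is_running` and its parameters are made up for this illustration, and it assumes the default port 11434.

```python
import http.client
import urllib.error
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if the Ollama web service answers at base_url (hypothetical helper)."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as reply:
            body = reply.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, or a malformed URL: treat as "not running"
        return False
    # The service replies with this plain-text message when it is up
    return "Ollama is running" in body


print(ollama_is_running())
```

If this prints `False`, try `ollama serve` in a terminal and run the check again.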


<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Super important - ignore me at your peril!</h2>
            <span style="color:#ff7800;">The model called <b>llama3.3</b> is FAR too large for home computers - it's not intended for personal computing and will consume all your resources! Stick with the nicely sized <b>llama3.2</b> or <b>llama3.2:1b</b>, and if you want something larger, try llama3.1 or smaller variants of Qwen, Gemma, Phi or DeepSeek. See <a href="https://ollama.com/models">the Ollama models page</a> for a full list of models and sizes.
            </span>
        </td>
    </tr>
</table>


In [None]:
!ollama pull llama3.2

pulling manifest
pulling dde5aa3fc5ff...   1% ▕                ▏  15 MB/2.0 GB

In [58]:
ollama = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
model_name = "llama3.2"

response = ollama.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Consider a historical figure who both supports and opposes the morality of a morally ambiguous decision (e.g., the Armenian Genocide) made by another historical figure (e.g., Genghis Khan). Develop a philosophical argument defending a moral course of action that reflects one's own values or perspective on this complex issue, without appealing to relativism or subjectivism.
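
The Gemini, DeepSeek, Groq and Ollama cells above all repeat the same steps: call the chat completions endpoint, pull out the text, and record the model name and answer. As a possible variation, here is a sketch of a small helper that wraps those steps. The name `ask_competitor` is made up for this example. It assumes only that the client exposes `chat.completions.create`, as the OpenAI-compatible clients above do.

```python
def ask_competitor(client, model_name, messages, competitors, answers):
    """Send messages to one model and record its name and answer.

    client is any object exposing chat.completions.create(...), for example
    the gemini, deepseek, groq or ollama clients created in the cells above.
    """
    response = client.chat.completions.create(model=model_name, messages=messages)
    answer = response.choices[0].message.content
    # Keep the two lists in step so that zip(competitors, answers) lines up later
    competitors.append(model_name)
    answers.append(answer)
    return answer


# Hypothetical usage inside this notebook (needs the real clients and keys):
# answer = ask_competitor(ollama, "llama3.2", messages, competitors, answers)
# display(Markdown(answer))
```

Appending to both lists in one place makes it harder to add a name without its answer, which would throw off the judge's numbering.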

In [35]:
# So where are we?

print(competitors)
print(answers)


['gpt-5-nano', 'claude-sonnet-4-5', 'gpt-5']
['Below is a complete decision package designed for a sophisticated policymaker managing a 1,000-person isolated colony facing a 50-year critical horizon after ecological collapse. The package stays explicit about uncertainties, data needs, models, policies, and contingencies. It is structured to be actionable, auditable, and adaptable.\n\nExecutive summary (for quick use)\n- Resources: current consumables last ~3 years at present usage; local agriculture ~60% Earth baseline; social trust index ~0.6; two rival leadership factions control different pools; external aid equals 1 year of current consumption but worsens factional tensions.\n- Three mutually exclusive long-term allocation strategies:\n  A) Maximize short-term survival for the largest share (target ≥95% 5-year survival).\n  B) Preserve minority cultural/linguistic/technical knowledge (likely ~80–85% short-term survival; higher risk to aggregate mid-term survival).\n  C) Maximize ge

In [36]:
# It's nice to know how to use "zip"
for competitor, answer in zip(competitors, answers):
    print(f"Competitor: {competitor}\n\n{answer}")


Competitor: gpt-5-nano

Below is a complete decision package designed for a sophisticated policymaker managing a 1,000-person isolated colony facing a 50-year critical horizon after ecological collapse. The package stays explicit about uncertainties, data needs, models, policies, and contingencies. It is structured to be actionable, auditable, and adaptable.

Executive summary (for quick use)
- Resources: current consumables last ~3 years at present usage; local agriculture ~60% Earth baseline; social trust index ~0.6; two rival leadership factions control different pools; external aid equals 1 year of current consumption but worsens factional tensions.
- Three mutually exclusive long-term allocation strategies:
  A) Maximize short-term survival for the largest share (target ≥95% 5-year survival).
  B) Preserve minority cultural/linguistic/technical knowledge (likely ~80–85% short-term survival; higher risk to aggregate mid-term survival).
  C) Maximize genetic/ecological diversity for

In [37]:
# Let's bring this together - note the use of "enumerate"

together = ""
for index, answer in enumerate(answers):
    together += f"# Response from competitor {index + 1}\n\n"
    together += answer + "\n\n"

In [38]:
print(together)

# Response from competitor 1

Below is a complete decision package designed for a sophisticated policymaker managing a 1,000-person isolated colony facing a 50-year critical horizon after ecological collapse. The package stays explicit about uncertainties, data needs, models, policies, and contingencies. It is structured to be actionable, auditable, and adaptable.

Executive summary (for quick use)
- Resources: current consumables last ~3 years at present usage; local agriculture ~60% Earth baseline; social trust index ~0.6; two rival leadership factions control different pools; external aid equals 1 year of current consumption but worsens factional tensions.
- Three mutually exclusive long-term allocation strategies:
  A) Maximize short-term survival for the largest share (target ≥95% 5-year survival).
  B) Preserve minority cultural/linguistic/technical knowledge (likely ~80–85% short-term survival; higher risk to aggregate mid-term survival).
  C) Maximize genetic/ecological diversi

In [39]:
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""


In [40]:
print(judge)

You are judging a competition between 3 competitors.
Each model has been given this question:

You are advising the governing council of an isolated planetary colony of 1,000 people recovering from a partial ecological collapse; resources are limited and the next 50 years are critical. The colony must choose one of three mutually exclusive long-term allocation strategies: (A) maximize short-term survival and stability for the largest possible proportion of the population (target ≥95% survival over 5 years), (B) prioritize preservation of cultural, linguistic, and technical knowledge of minority groups even if this reduces short-term aggregate survival to approximately 80–85%, or (C) prioritize genetic and ecological diversity to maximize long-term species and biosphere adaptability, accepting an estimated 10–20% increase in short-term mortality risk but potential long-term resilience gains. Current observable data: consumable food/water/energy stocks at current usage will last ~3 years

In [41]:
judge_messages = [{"role": "user", "content": judge}]

In [42]:
# Judgement time!

openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-5-mini",
    messages=judge_messages,
)
results = response.choices[0].message.content
print(results)


{"results": ["3", "1", "2"]}


In [43]:
# OK let's turn this into results!

results_dict = json.loads(results)
ranks = results_dict["results"]
for index, result in enumerate(ranks):
    competitor = competitors[int(result) - 1]
    print(f"Rank {index + 1}: {competitor}")

Rank 1: gpt-5
Rank 2: gpt-5-nano
Rank 3: claude-sonnet-4-5
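
The parsing cell above trusts the judge to return clean JSON with valid competitor numbers. Models sometimes wrap JSON in a Markdown code fence or repeat a number, so here is a sketch of a more defensive version. The function name `parse_ranking` and its checks are an illustration rather than part of the original lab. It assumes the judge replies in the `{"results": [...]}` format requested in the prompt.

```python
import json


def parse_ranking(raw, competitors):
    """Turn the judge's JSON reply into an ordered list of competitor names."""
    fence = "`" * 3  # a Markdown code fence marker
    text = raw.strip()
    # Some models wrap the JSON in a code fence despite the instructions
    if text.startswith(fence):
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]
    ranks = [int(r) for r in json.loads(text)["results"]]
    # Every competitor number must appear exactly once
    if sorted(ranks) != list(range(1, len(competitors) + 1)):
        raise ValueError(f"Judge returned an invalid ranking: {ranks}")
    return [competitors[r - 1] for r in ranks]


competitors = ["gpt-5-nano", "claude-sonnet-4-5", "gpt-5"]
for index, name in enumerate(parse_ranking('{"results": ["3", "1", "2"]}', competitors)):
    print(f"Rank {index + 1}: {name}")
```

With the judge output shown above, this prints the same three ranks as the original cell. An invalid ranking raises `ValueError` instead of silently printing the wrong competitor.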


<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/exercise.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Exercise</h2>
            <span style="color:#ff7800;">Which pattern(s) did this use? Try updating this to add another Agentic design pattern.
            </span>
        </td>
    </tr>
</table>


<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">Patterns like this one, which send a task to multiple models and then evaluate the results,
            are common where you need to improve the quality of your LLM response. This approach can be applied
            to most business projects where accuracy is critical.
            </span>
        </td>
    </tr>
</table>
