# Comparing LLMs Using an LLM as Judge

## Import the Needed Python Packages 

In [1]:
import os
import json
from dotenv import load_dotenv
from openai import OpenAI
from anthropic import Anthropic
from IPython.display import Markdown, display


In [2]:
load_dotenv(override=True)

True

In [3]:
openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key exists and begins sk-ant-
Google API Key exists and begins AI
DeepSeek API Key exists and begins sk-
Groq API Key exists and begins gsk_
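
The five near-identical checks above could be collapsed into a single loop. The following is a minimal sketch, assuming the same environment variable names and prefix lengths as the cell above; `KEY_SPECS` and `describe_keys` are hypothetical names introduced here for illustration, not part of the notebook.

```python
import os

# (label, environment variable, characters of the key to reveal, optional?)
# Values mirror the cell above; the structure itself is an illustrative assumption.
KEY_SPECS = [
    ("OpenAI", "OPENAI_API_KEY", 8, False),
    ("Anthropic", "ANTHROPIC_API_KEY", 7, True),
    ("Google", "GOOGLE_API_KEY", 2, True),
    ("DeepSeek", "DEEPSEEK_API_KEY", 3, True),
    ("Groq", "GROQ_API_KEY", 4, True),
]

def describe_keys(env=os.environ):
    """Return one status line per provider, showing only a short key prefix."""
    lines = []
    for label, var, prefix_len, optional in KEY_SPECS:
        key = env.get(var)
        if key:
            lines.append(f"{label} API Key exists and begins {key[:prefix_len]}")
        else:
            suffix = " (and this is optional)" if optional else ""
            lines.append(f"{label} API Key not set{suffix}")
    return lines

for line in describe_keys():
    print(line)
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to try without touching real credentials.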


## LLM Evaluation

In [4]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]
print(messages)

[{'role': 'user', 'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]


### Generate a challenge question with OpenAI

In [5]:
model = OpenAI()
response = model.chat.completions.create(
    model="gpt-4.1-mini",
    messages=messages,
)
question = response.choices[0].message.content
print(question)

If you could design a hypothetical society that optimally balances technological advancement, environmental sustainability, and social equity, what core principles and systems would you implement, and how would you address the inevitable conflicts among these priorities?


In [6]:
messages = [{"role": "user", "content": question}]
competitors = []
answers = []
models = []
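
The parallel lists above stay aligned only as long as every cell appends to each of them. As one possible alternative, a sketch that keeps each model name and its answer together in a single record; `Entry` and `record` are hypothetical names, and the sample values are placeholders rather than real results.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entry:
    """One competitor's result: the model identifier and the text it returned."""
    model_name: str
    answer: str

def record(entries, model_name, answer):
    """Append a paired result so names and answers cannot drift out of step."""
    entries.append(Entry(model_name, answer))
    return entries

# Placeholder data, used only to show the shape of the structure.
entries = []
record(entries, "model-a", "first answer")
record(entries, "model-b", "second answer")

for index, entry in enumerate(entries, start=1):
    print(f"Competitor {index}: {entry.model_name} ({len(entry.answer)} chars)")
```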

### Response: OpenAI

In [7]:
model_name = "gpt-4.1-mini"

response = model.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
models.append(model)

Designing a hypothetical society that optimally balances **technological advancement**, **environmental sustainability**, and **social equity** requires integrating multidisciplinary approaches, values, and adaptive governance structures. Here’s a comprehensive outline of core principles and systems I would implement, along with strategies to manage conflicts among these priorities:

---

### Core Principles

1. **Holistic Well-being as the Goal**  
   Prioritize human and planetary well-being over narrow economic metrics like GDP. Success is measured by holistic indicators such as health, happiness, ecological footprint, equity, and innovation impact.

2. **Intergenerational Justice**  
   Ensure policies and development consider long-term impacts on future generations, embedding sustainability into the legal and cultural frameworks.

3. **Inclusive Participation and Deliberative Democracy**  
   Empower all stakeholders—citizens, marginalized groups, scientists, and policymakers—to co-create decisions, ensuring diverse voices shape technological and environmental policies.

4. **Precautionary and Adaptive Innovation**  
   Encourage innovation with precautionary principles, allowing experimentation but with rigorous impact assessments and adaptability to unforeseen consequences.

5. **Equity as Structural, Not Just Redistributive**  
   Address root causes of social inequity by redesigning systems related to education, healthcare, economy, and governance rather than relying solely on redistribution.

6. **Circular and Regenerative Economy**  
   Transition from extractive, linear economic models to circular systems that reuse resources, regenerate ecosystems, and minimize waste.

---

### Systems and Structures

#### 1. **Governance and Decision-Making**

- **Polycentric Governance:**  
  A multi-layered governance model where local, regional, national, and global entities collaborate. Local communities manage resources sustainably; higher levels coordinate large-scale technological innovation and environmental regulation.

- **Deliberative Assemblies and Citizen Juries:**  
  Use citizens' assemblies regularly to decide ethically charged issues, integrating scientific advice and public values to legitimize tough trade-offs.

- **Dynamic Regulatory Sandboxes:**  
  Innovation zones where technologies can be tested under close monitoring—balancing advancement with environmental and social safeguards.

#### 2. **Technological Innovation Framework**

- **Mission-Driven Innovation:**  
  Focus technological research on addressing grand challenges aligned with sustainability and equity goals (e.g., renewable energy, carbon capture, affordable healthcare tech).

- **Open-Source and Collaborative Platforms:**  
  Promote publicly accessible technologies to reduce monopolies, enabling broad-based participation in innovation and diffusion of benefits.

- **Technology Impact Assessment:**  
  Formalized multidisciplinary review (environmental, social, ethical) before large-scale deployment.

#### 3. **Environmental Sustainability Mechanisms**

- **Strict Environmental Limits:**  
  Enforce ecological boundaries, e.g., carbon budgets, biodiversity preservation zones, water use quotas, backed by transparent monitoring technologies.

- **Regenerative Agriculture and Urban Design:**  
  Support farming methods that restore soil health, biodiversity corridors in urban planning, green infrastructure, and renewable energy microgrids.

- **Decentralized Resource Management:**  
  Empower communities to manage natural resources sustainably, supported by real-time data and adaptive management techniques.

#### 4. **Social Equity Foundations**

- **Universal Basic Services (UBS):**  
  Guarantee access to healthcare, education, housing, and digital connectivity enabling full participation in society and technology benefits.

- **Equitable Education Systems:**  
  Focus on lifelong learning with emphasis on critical thinking, ethics, sustainability literacy, and technological proficiency.

- **Inclusive Economic Models:**  
  Co-operative enterprises, social entrepreneurship, and platform cooperatives to democratize economic power and share wealth generated by technology.

---

### Addressing Conflicts Among Priorities

Balancing these priorities inevitably involves trade-offs, for example, rapid technological growth might strain the environment or exacerbate inequalities. Strategies to manage conflicts include:

1. **Transparent Trade-off Deliberation:**  
   Use participatory processes to make explicit trade-offs visible; decisions reflect a collective balance rather than top-down impositions.

2. **Scenario Planning and Impact Modeling:**  
   Employ advanced modeling to foresee outcomes and unintended consequences, reducing risks and informing adaptive policies.

3. **Flexibility and Learning:**  
   Adopt adaptive governance that can shift course as new information emerges, avoiding rigid fixes that harm one priority over others.

4. **Integrative Policies:**  
   Design policies that co-benefit multiple goals—for example, renewable energy projects that create jobs in disadvantaged communities, or tech innovations that reduce resource use while improving access.

5. **Conflict Resolution Institutions:**  
   Establish impartial bodies to mediate disputes among stakeholders when priorities clash, ensuring procedural justice.

---

### Summary

This hypothetical society is governed by a vision of sustainable prosperity defined holistically, where innovation is responsible and inclusive, the environment is protected through regenerative systems, and equity is structurally embedded. Through participatory governance, adaptive regulation, and integrative policy design, it continuously negotiates tensions among priorities, learning and evolving dynamically toward a balanced and thriving future.

---

If you'd like, I can also provide concrete examples or explore how such systems might be phased in from current societal conditions!
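
The request-and-record pattern used for this response repeats for every OpenAI-compatible endpoint in the notebook. A minimal sketch of a helper that wraps it, assuming the client exposes `chat.completions.create` as the `OpenAI` client does; `ask` is a hypothetical name, and the fake client below exists only so the example runs offline.

```python
from types import SimpleNamespace

def ask(client, model_name, messages, competitors, answers):
    """Send `messages` to an OpenAI-compatible client and record the reply."""
    response = client.chat.completions.create(model=model_name, messages=messages)
    answer = response.choices[0].message.content
    competitors.append(model_name)
    answers.append(answer)
    return answer

# Offline stand-in that mimics only the attributes `ask` reads from a real response.
def _fake_create(model, messages):
    reply = SimpleNamespace(content=f"[{model}] echo: {messages[-1]['content']}")
    return SimpleNamespace(choices=[SimpleNamespace(message=reply)])

fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=_fake_create))
)

demo_competitors, demo_answers = [], []
print(ask(fake_client, "demo-model", [{"role": "user", "content": "hi"}],
          demo_competitors, demo_answers))
```

With the real client from the notebook, the equivalent call would be `ask(model, model_name, messages, competitors, answers)`, followed by `display(Markdown(...))` on the returned text.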

### Response: Anthropic

In [8]:
model_name = "claude-3-7-sonnet-latest"

model = Anthropic()
response = model.messages.create(model=model_name, messages=messages, max_tokens=1000)
answer = response.content[0].text

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
models.append(model)

# Designing a Balanced Society: Technology, Environment, and Equity

## Core Principles

1. **Long-term Stewardship**: Decisions evaluated on multi-generational timescales, not quarterly profits or election cycles
2. **Technological Humility**: Progress paired with precautionary principles and rigorous impact assessment
3. **Distributed Resilience**: Power, resources, and decision-making distributed across scales to prevent single points of failure
4. **Adaptive Governance**: Systems designed to evolve based on evidence, changing conditions, and diverse perspectives

## Key Systems

**Economic Structure**:
- Mixed economy with strong commons management for natural resources
- Market mechanisms with externalities fully priced in
- Universal basic services ensuring fundamental needs
- Progressive wealth limits to prevent extreme power concentration

**Technological Development**:
- Public interest technology assessment boards with diverse membership
- Open knowledge commons balanced with time-limited IP protections
- Research priorities set by participatory processes
- Technological benefits distributed equitably

**Environmental Systems**:
- Circular resource flows modeled on natural systems
- Strong biosphere preservation zones
- Regenerative agriculture and habitat restoration
- Carbon and pollution pricing reflecting true costs

## Addressing Conflicts

The hardest challenges would come at intersections:

- When rapid technological advancement threatens environmental stability, I'd implement staged deployment with careful monitoring
- When equity concerns conflict with efficiency, I'd create transparent processes that explicitly weigh tradeoffs and compensate losers
- When local autonomy clashes with global needs, I'd build nested governance structures with clear subsidiarity principles

The ideal system acknowledges these tensions rather than pretending perfect harmony is possible, creating forums explicitly designed to navigate these difficult tradeoffs democratically.
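
The Anthropic call above sets `max_tokens=1000`, so a long answer can be cut off, which would disadvantage that model when the answers are later compared. A small sketch of how truncation might be detected: the Messages API reports `stop_reason == "max_tokens"` when the limit is hit. `extract_text` is a hypothetical helper, and the fake response stands in for a real one so the example runs offline.

```python
from types import SimpleNamespace

def extract_text(response):
    """Join the text blocks of a Messages response and flag truncation."""
    text = "".join(block.text for block in response.content if block.type == "text")
    truncated = response.stop_reason == "max_tokens"
    return text, truncated

# Offline stand-in carrying only the fields that `extract_text` reads.
fake_response = SimpleNamespace(
    content=[SimpleNamespace(type="text", text="A partial answer that stops mid")],
    stop_reason="max_tokens",
)

text, truncated = extract_text(fake_response)
if truncated:
    print("Warning: answer hit the max_tokens limit and may be incomplete.")
```

If truncation is reported, raising `max_tokens` and re-running the cell is one way to obtain a complete answer.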

### Response: Google

In [9]:
model = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
model_name = "gemini-2.0-flash"

response = model.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
models.append(model)

Okay, this is a fascinating thought experiment! Designing a perfectly balanced society is, of course, impossible in reality due to the inherent complexities of human nature and unforeseen consequences. However, we can strive for an optimal balance using guiding principles and adaptive systems. Here's my attempt at outlining a hypothetical society:

**Core Principles:**

*   **Interconnectedness and Systems Thinking:** The foundation is understanding that everything is interconnected. Actions in one area (technology, environment, society) ripple outwards, impacting the others. Decisions are made holistically, considering long-term consequences.
*   **Precautionary Principle:** Where potential harm exists (especially to the environment or future generations), action should be taken to prevent or minimize that harm, even if the science is not yet conclusive.  Burden of proof rests on demonstrating safety.
*   **Universal Basic Services (UBS):**  Guaranteed access to essential resources: food, water, shelter, healthcare, education, information, and energy. These are considered human rights, ensuring a baseline level of well-being for all citizens, regardless of economic contribution.
*   **Decentralization and Localization:** Emphasizing local production, governance, and resource management where feasible. This increases resilience, reduces transportation costs, and allows communities to tailor solutions to their specific needs.
*   **Participatory Democracy and Deliberative Governance:** Citizens are actively involved in decision-making through direct democracy tools, citizen assemblies, and deliberative processes. Expert knowledge is integrated with community values.
*   **Regenerative Economy:** Moving beyond simply "sustainable" practices.  Focus is on actively restoring and enhancing ecosystems and social systems. Circular economy principles are paramount, minimizing waste and maximizing resource utilization.
*   **Ethical Technology Development and Deployment:**  Technology is viewed as a tool, not an end in itself. Development is guided by ethical frameworks that prioritize human well-being, privacy, and social justice.
*   **Lifelong Learning and Adaptive Capacity:** Education is not limited to childhood but is a continuous process, emphasizing critical thinking, problem-solving, and adaptability to change. Society fosters a culture of experimentation and learning from mistakes.
*   **Intrinsic Value of Nature:** Recognizing that nature has inherent value, independent of its usefulness to humans. This shapes policies related to resource extraction, land use, and conservation.
*   **Justice and Equity:** Striving for equitable distribution of resources, opportunities, and burdens. Addressing historical injustices and systemic inequalities.

**Systems Implementation:**

*   **Economic System:**
    *   **Hybrid Economy:** A blend of market mechanisms, cooperative ownership, and public services.
    *   **Complementary Currencies:** Local currencies to support regional economies and foster community engagement.
    *   **Taxation:** Progressive taxation system that internalizes environmental costs (e.g., carbon tax) and funds universal basic services.
    *   **Wealth Redistribution:** Mechanisms to address extreme wealth inequality, such as wealth taxes or inheritance taxes.
    *   **Focus on Well-being Indicators:**  Moving beyond GDP as the primary measure of progress, using indicators that reflect social and environmental well-being (e.g., Genuine Progress Indicator).

*   **Technological System:**
    *   **Open-Source and Collaborative Innovation:** Promoting open access to knowledge and technology to accelerate innovation and ensure wider benefits.
    *   **Technology Assessment and Regulation:** Rigorous assessment of the potential social, environmental, and ethical impacts of new technologies *before* widespread deployment.  Independent regulatory bodies with strong enforcement powers.
    *   **Decentralized Energy Systems:** Transition to renewable energy sources (solar, wind, geothermal) with decentralized grids and energy storage solutions.
    *   **Precision Agriculture:** Using technology to optimize agricultural practices, reduce water consumption, minimize fertilizer use, and enhance soil health.
    *   **Sustainable Manufacturing:** Embracing circular economy principles, using sustainable materials, and minimizing waste in manufacturing processes.
    *   **AI Ethics and Governance:**  Developing ethical guidelines and regulations for artificial intelligence to prevent bias, ensure accountability, and protect privacy.

*   **Environmental System:**
    *   **Protected Areas:** Establishing and maintaining extensive protected areas for biodiversity conservation and ecosystem services.
    *   **Ecological Restoration:** Investing in the restoration of degraded ecosystems (forests, wetlands, coral reefs).
    *   **Water Management:** Implementing integrated water resource management strategies that prioritize water conservation, efficient irrigation, and wastewater treatment.
    *   **Sustainable Agriculture:** Promoting agroecological farming practices that enhance soil health, reduce reliance on synthetic inputs, and support biodiversity.
    *   **Climate Change Mitigation and Adaptation:** Aggressively reducing greenhouse gas emissions through renewable energy transition, energy efficiency measures, and carbon sequestration.  Developing strategies to adapt to the impacts of climate change, such as sea-level rise and extreme weather events.

*   **Social System:**
    *   **Education for Sustainability:** Integrating sustainability principles into all levels of education, fostering environmental literacy, and promoting responsible citizenship.
    *   **Community Building:** Supporting community initiatives that promote social cohesion, resilience, and civic engagement.
    *   **Healthcare:** Universal healthcare system that emphasizes preventive care, mental health services, and equitable access.
    *   **Justice System:** Restorative justice practices that focus on rehabilitation, reconciliation, and addressing the root causes of crime.
    *   **Media and Information:** Promoting independent, fact-based journalism and addressing misinformation and disinformation. Supporting media literacy education.

**Addressing Conflicts:**

The key to managing conflicts lies in transparent decision-making processes, robust stakeholder engagement, and a commitment to adaptive management. Here are some specific approaches:

*   **Multi-Criteria Decision Analysis (MCDA):** Using MCDA to evaluate different policy options based on a range of criteria (economic, social, environmental) and stakeholder preferences.
*   **Scenario Planning:** Developing and analyzing different future scenarios to anticipate potential challenges and opportunities.
*   **Adaptive Management:**  Recognizing that the future is uncertain, policies are implemented as experiments, with ongoing monitoring and adjustments based on feedback.
*   **Conflict Resolution Mechanisms:** Establishing independent bodies to mediate disputes between stakeholders and ensure fair and equitable outcomes.
*   **Trade-offs and Prioritization:**  Acknowledging that some trade-offs are inevitable. Making these trade-offs transparently and democratically, based on the best available evidence and ethical considerations. For example, a new renewable energy project might impact a local ecosystem. The decision to proceed would involve a thorough environmental impact assessment, stakeholder consultations, and mitigation measures to minimize harm. Compensation or restoration efforts might be required.
*   **Emphasis on Collaboration and Dialogue:** Fostering a culture of collaboration and dialogue among different stakeholders, encouraging them to find common ground and develop mutually beneficial solutions.

**Challenges and Limitations:**

*   **Human Nature:** Overcoming inherent biases, self-interest, and resistance to change.
*   **Complexity:** Managing the complexity of interconnected systems and anticipating unintended consequences.
*   **Power Dynamics:** Addressing power imbalances and ensuring that marginalized voices are heard.
*   **Global Cooperation:** Achieving global cooperation on issues such as climate change, resource management, and technological development.
*   **Unforeseen Events:**  Adapting to unexpected events and crises.

**Conclusion:**

This hypothetical society is not a utopia, but rather a framework for continuous improvement. It is based on the understanding that technological advancement, environmental sustainability, and social equity are not mutually exclusive goals, but rather interconnected aspects of a thriving society. By embracing systems thinking, ethical principles, and participatory governance, we can create a society that is both prosperous and just for all. The design is meant to be flexible, adaptive, and constantly evolving based on new knowledge and experiences. It requires a fundamental shift in values, from individualistic consumerism to collective well-being and planetary stewardship.


### Response: DeepSeek

In [10]:
model = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
model_name = "deepseek-chat"

response = model.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
models.append(model)

Designing a hypothetical society that optimally balances **technological advancement**, **environmental sustainability**, and **social equity** requires a holistic, adaptive framework that integrates these priorities while managing trade-offs. Below are core principles, systems, and conflict-resolution strategies:

---

### **Core Principles**  
1. **Regenerative Stewardship**  
   - The economy operates within **planetary boundaries**, prioritizing circular production, renewable energy, and ecosystem restoration.  
   - **"Do no harm"** is replaced with **"actively regenerate"**—industries must prove net-positive environmental impact.  

2. **Distributed Innovation**  
   - Technology is democratized: open-source platforms, decentralized energy grids, and localized manufacturing (e.g., 3D printing) reduce monopolies.  
   - AI and automation are directed toward **public goods** (e.g., healthcare, education, climate modeling) rather than surveillance or labor exploitation.  

3. **Equitable Access by Design**  
   - **Universal Basic Services** (housing, healthcare, education, internet) are guaranteed via public-private-commons partnerships.  
   - **Wealth/Resource Caps** prevent extreme inequality (e.g., maximum income ratios, land ownership limits).  

4. **Participatory Governance**  
   - **Citizen Assemblies** with algorithmic support (e.g., AI-facilitated deliberation) guide long-term policy, balancing expert knowledge with public values.  
   - **Local Autonomy** within global frameworks lets communities adapt solutions (e.g., urban farming vs. vertical agriculture).  

5. **Resilience Over Growth**  
   - Success metrics shift from GDP to **well-being indices** (health, leisure, ecological health).  
   - **Jubilee Mechanisms** forgive unsustainable debts (personal, national) to reset inequities.  

---

### **Key Systems**  
1. **Energy & Production**  
   - **100% Renewable Microgrids** with storage backups, paired with degrowth in extractive sectors.  
   - **Closed-Loop Manufacturing**: Mandatory producer responsibility for waste (e.g., "right to repair" laws).  

2. **Food Systems**  
   - **Agroecology & Lab-Grown Proteins**: Transition subsidies from industrial agriculture to regenerative farming and cellular agriculture.  
   - **Food Sovereignty**: Communities control land use to prevent corporate monocultures.  

3. **Technology Governance**  
   - **Precautionary Principle**: New tech must pass societal impact assessments (e.g., banning facial recognition in policing).  
   - **Public Data Trusts**: Personal data is a collective resource, not a commodity.  

4. **Economic Structure**  
   - **Post-Capitalist Models**: Cooperatives, commons-based peer production, and gift economies supplement markets.  
   - **Carbon/Resource Taxes**: Fees fund universal dividends or green R&D.  

5. **Education & Culture**  
   - **Critical Thinking & Empathy Training**: Schools teach systems thinking, conflict resolution, and ecological literacy.  
   - **Media Reform**: Algorithms prioritize factual, non-polarizing content; ad-driven models are replaced by public funding.  

---

### **Conflict Resolution Strategies**  
1. **Trade-off Transparency**  
   - Use **participatory modeling** (e.g., simulations) to visualize how choices impact equity, ecology, and tech progress.  

2. **Iterative Adaptation**  
   - **Policy "Sandboxes"**: Test solutions in pilot regions (e.g., a city banning private cars) before scaling.  

3. **Redistribution Mechanisms**  
   - When tech displaces workers, **automation taxes** fund retraining or reduced work hours (e.g., 4-day weeks).  

4. **Conflict Arbitration**  
   - **Multistakeholder Councils** mediate disputes (e.g., Indigenous land rights vs. renewable energy projects) with binding ecological/social impact criteria.  

5. **Cultural Shifts**  
   - **Narrative Change**: Elevate stories of interdependence (e.g., "sufficiency" over consumerism) via art and media.  

---

### **Example Scenario: Urban Expansion vs. Greenbelts**  
- **Conflict**: A city needs housing (equity) but risks destroying wetlands (sustainability).  
- **Solution**:  
  1. **Citizen Assembly** debates options, informed by ecological data.  
  2. **Compromise**: High-density eco-housing with mandatory green roofs, paired with wetland restoration offsets.  
  3. **Funding**: Tax luxury developments to subsidize affordable units.  

---

### **Conclusion**  
This society thrives by **embedding equity and sustainability into technological design**, using democratic governance to navigate tensions, and valuing resilience over short-term gains. Conflicts become opportunities for innovation rather than zero-sum battles. The goal isn’t perfection, but **dynamic balance**—a civilization that learns, adapts, and uplifts all life.  

Would you emphasize any additional principles or systems?

### Response: Groq

In [11]:
model = OpenAI(api_key=groq_api_key, base_url="https://api.groq.com/openai/v1")
model_name = "llama-3.3-70b-versatile"

response = model.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
models.append(model)

Designing a hypothetical society that balances technological advancement, environmental sustainability, and social equity requires a multifaceted approach. Here are the core principles and systems I would implement, along with strategies to address potential conflicts:

**Core Principles:**

1. **Circular Economy**: Design production and consumption systems that minimize waste, promote recycling, and utilize renewable resources.
2. **Social and Environmental Justice**: Ensure equal access to resources, opportunities, and benefits for all members of society, regardless of race, gender, class, or ability.
3. **Participatory Governance**: Foster inclusive, transparent, and accountable decision-making processes that involve diverse stakeholders and prioritize the well-being of both people and the planet.
4. **Technological Innovation**: Encourage the development and application of sustainable, environmentally benign technologies that support social equity and environmental sustainability.
5. **Education and Lifelong Learning**: Provide access to quality education and skills training, enabling individuals to adapt to changing technological and environmental landscapes.

**Systems:**

1. **Renewable Energy Grid**: Transition to 100% renewable energy sources, such as solar, wind, and hydroelectric power, to minimize greenhouse gas emissions and ensure energy security.
2. **Sustainable Agriculture**: Implement regenerative agriculture practices that prioritize soil health, biodiversity, and efficient water use, reducing synthetic fertilizers and pesticides.
3. **Closed-Loop Production**: Design manufacturing systems that minimize waste, use recycled materials, and promote product reuse and recycling.
4. **Universal Basic Services**: Provide essential services like healthcare, education, and social welfare programs to ensure a basic level of well-being for all members of society.
5. **Green Infrastructure**: Invest in urban planning, green spaces, and eco-friendly transportation systems to promote sustainable cities and reduce environmental impacts.

**Addressing Conflicts:**

1. **Trade-Off Analysis**: Regularly assess the potential trade-offs between technological advancement, environmental sustainability, and social equity to identify areas where compromises can be made.
2. **Stakeholder Engagement**: Foster dialogue and collaboration among diverse stakeholders, including community groups, businesses, and government agencies, to ensure that all perspectives are considered.
3. **Adaptive Governance**: Establish a flexible, iterative governance framework that can respond to emerging challenges and opportunities, balancing short-term needs with long-term sustainability goals.
4. **Innovation Funding**: Allocate resources to support research and development of sustainable technologies, as well as social and environmental innovation initiatives.
5. **Monitoring and Evaluation**: Establish a robust monitoring and evaluation system to track progress, identify areas for improvement, and adjust policies and programs accordingly.

**Conflict Resolution Strategies:**

1. **Mediation and Facilitation**: Use neutral third-party facilitators to help resolve conflicts and find mutually beneficial solutions.
2. **Consensus-Based Decision-Making**: Foster a culture of consensus-based decision-making, where all stakeholders work together to find solutions that balance competing interests.
3. **Compromise and Trade-Offs**: Identify areas where compromise is possible, and negotiate trade-offs that balance competing priorities.
4. **Education and Awareness**: Provide education and awareness programs to help stakeholders understand the interconnectedness of technological, environmental, and social issues.
5. **Incentives and disincentives**: Implement incentives for sustainable practices and disincentives for unsustainable behaviors, such as carbon pricing or pollution taxes.

**Implementation Roadmap:**

1. **Short-Term (0-5 years)**: Establish a national or global task force to develop a comprehensive plan for transitioning to a sustainable, equitable society.
2. **Medium-Term (5-20 years)**: Implement pilot projects and demonstration initiatives to test and refine sustainable technologies, policies, and practices.
3. **Long-Term (20-50 years)**: Scale up successful initiatives, integrate sustainable practices into mainstream culture, and continue to monitor and evaluate progress.
4. **Continuous Improvement**: Regularly assess and adjust policies, programs, and technologies to ensure that the society remains on a path toward optimal balance between technological advancement, environmental sustainability, and social equity.

By implementing these core principles, systems, and conflict resolution strategies, a hypothetical society can strive to achieve a balance between technological advancement, environmental sustainability, and social equity, creating a more just, resilient, and thriving world for all.
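
The Google, DeepSeek, and Groq cells differ only in API key, base URL, and model name. One way to express that is a small configuration table; the sketch below reuses the URLs and model names from the cells above, while `PROVIDERS` and `configured_providers` are hypothetical names introduced for illustration.

```python
# Endpoint details copied from the cells above; the table layout is an assumption.
PROVIDERS = [
    {
        "model_name": "gemini-2.0-flash",
        "base_url": "https://generativelanguage.googleapis.com/v1beta/openai/",
        "key_env": "GOOGLE_API_KEY",
    },
    {
        "model_name": "deepseek-chat",
        "base_url": "https://api.deepseek.com/v1",
        "key_env": "DEEPSEEK_API_KEY",
    },
    {
        "model_name": "llama-3.3-70b-versatile",
        "base_url": "https://api.groq.com/openai/v1",
        "key_env": "GROQ_API_KEY",
    },
]

def configured_providers(env):
    """Return only the providers whose API key is present in `env`."""
    return [p for p in PROVIDERS if env.get(p["key_env"])]

# Example with a stand-in environment; pass os.environ in the notebook.
for provider in configured_providers({"GROQ_API_KEY": "gsk_example"}):
    print(provider["model_name"], "->", provider["base_url"])
```

In the notebook, each selected entry could be turned into a client with `OpenAI(api_key=os.environ[p["key_env"]], base_url=p["base_url"])`, so providers without a key are skipped rather than failing at request time.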

### Response: Ollama (local service): mistral-small, 24B

In [12]:
# Run `ollama serve` in the terminal first
#!ollama pull llama3.2:latest
!ollama pull mistral-small:latest # https://ollama.com/library/mistral-small

pulling manifest
pulling 102a747c1376: 100% ▕██████████████████▏  14 GB
pulling 6db27cd4e277: 100% ▕██████████████████▏  695 B
pulling 6d7b25ffd247: 100% ▕██████████████████▏  644 B
pulling 43070e2d4e53: 100% ▕██████████████████▏  11 KB
pulling e0daf17ff83e: 100% ▕██████████████████▏   21 B
pulling 41ffc852c4b6: 100% ▕██████████████████▏  562 B
verifying sha256 digest
writing manifest
success


In [13]:
model = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
model_name = "mistral-small:latest"

response = model.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
models.append(model)

Designing a hypothetical society that optimizes technological advancement, environmental sustainability, and social equity requires a multi-faceted approach. Here are some core principles and systems I would implement, along with strategies to address potential conflicts:

### Core Principles

1. **Equitable Access to Resources:**
   - Ensure everyone has access to basic needs such as clean water, food, healthcare, and education.
   - Implement Universal Basic Income (UBI) or similar programs to provide financial security.

2. **Sustainable Practices:**
   - Foster a circular economy where waste is minimized and resources are reused.
   - Promote renewable energy sources like solar, wind, and hydro power.
   - Encourage regenerative agriculture and sustainable forestry practices.

3. **Innovation with Ethics:**
   - Develop technologies that prioritize environmental and social benefits over immediate profits.
   - Ensure transparency in technological development to mitigate risks and biases.

4. **Democratic Governance:**
   - Implement participatory governance where citizens have a direct say in decision-making processes.
   - Encourage diversity and inclusion in leadership roles to reflect the society at large.

5. **Education and Lifelong Learning:**
   - Provide high-quality education that focuses on critical thinking, environmental literacy, and technological skills.
   - Support continuous learning programs to adapt to changing technological landscapes.

6. **Community and Collaboration:**
   - Foster a culture of community and collaboration over individualism.
   - Encourage cross-sector partnerships between government, private sector, and civil society.

### Systems Implementation

1. **Renewable Energy Infrastructure:**
   - Invest in widespread renewable energy grids and storage systems.
   - Provide incentives for individuals and businesses to adopt renewable energy solutions.

2. **Eco-Cities Design:**
   - Urban planning should prioritize green spaces, sustainable transportation (e.g., electric vehicles, public transit), and energy-efficient buildings.

3. **Agriculture Innovation:**
   - Implement vertical farming and aquaponics to reduce land use and water consumption.
   - Support local farming initiatives and community-supported agriculture.

4. **Ethical AI and Automation:**
   - Develop robust regulations for AI to ensure it benefits society as a whole and doesn't exacerbate inequality.
   - Ensure automation increases productivity without displacing jobs, possibly through a job guarantee program or shorter work weeks.

5. **Social Safety Nets:**
   - Establish comprehensive social welfare programs that support those transitioning in the economy due to technological changes.
   - Provide lifelong learning opportunities for continuous skill development.

### Addressing Conflicts

1. **Conflict Resolution Mechanisms:**
   - Create independent regulatory bodies with oversight on each sector (technology, environment, and society) while ensuring they collaborate rather than operate in isolation.
   - Implement a conflict resolution framework to mediate disputes between competing priorities.

2. **Balanced Legislative Framework:**
   - Legislate balanced policies that ensure all three pillars are equally considered, such as mandating environmental impact assessments for significant technological projects and vice versa.
   - Use cost-benefit analyses that quantitatively measure social benefits versus technological innovations and environmental impacts.

3. **Public Engagement and Participation:**
   - Involve the public in decision-making processes through referendums and community consultations to foster understanding and consensus on complex issues.
   - Use crowdfunding models and collaborative platforms to support initiatives led by community groups rather than top-down approaches only.

### Example Scenario

Consider a scenario where a new technological advancement, like advanced AI systems for agriculture, promises significant agricultural productivity gains but could also lead to job displacement in rural areas. The society would address such challenges through:

1. **Preemptive Training Programs:**
   - Implement re-training programs that target farmers and agronomists, offering them skill development opportunities relevant to technological advancements.

2. **Incentivizing Adaptation:**
   - Provide tax breaks or subsidies for adopting AI-enhanced agricultural practices while ensuring fair employment transitions.

3. **Localized Community Support:**
   - Support local economies transitioning towards tech-enhanced agriculture by creating support groups and community-driven projects.

By balancing technological advancement, sustainability, and equity through participatory governance and innovative solutions, this hypothetical society strives to be resilient and inclusive in the face of ongoing challenges.

## Comparison

In [14]:
print(competitors)
#print(answers)

['gpt-4.1-mini', 'claude-3-7-sonnet-latest', 'gemini-2.0-flash', 'deepseek-chat', 'llama-3.3-70b-versatile', 'mistral-small:latest']


In [15]:
#for competitor, answer in zip(competitors, answers):
#    print(f"Competitor: {competitor}\n\n{answer}")

In [16]:
# Prepare the final input for comparison
together = ""
for index, answer in enumerate(answers):
    together += f"# Response from competitor {index+1}\n\n"
    together += answer + "\n\n"
print(together)

# Response from competitor 1

Designing a hypothetical society that optimally balances **technological advancement**, **environmental sustainability**, and **social equity** requires integrating multidisciplinary approaches, values, and adaptive governance structures. Here’s a comprehensive outline of core principles and systems I would implement, along with strategies to manage conflicts among these priorities:

---

### Core Principles

1. **Holistic Well-being as the Goal**  
   Prioritize human and planetary well-being over narrow economic metrics like GDP. Success is measured by holistic indicators such as health, happiness, ecological footprint, equity, and innovation impact.

2. **Intergenerational Justice**  
   Ensure policies and development consider long-term impacts on future generations, embedding sustainability into the legal and cultural frameworks.

3. **Inclusive Participation and Deliberative Democracy**  
   Empower all stakeholders—citizens, marginalized groups, sci

## LLMs as judge

In [17]:
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the ranked order of the competitors in a form of a Python list [number of the best competitor, number of second best competitor, ...], nothing else. Do not include markdown formatting, nor code blocks. Return only the list."""
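
The prompt above asks for a JSON object and then, in its last line, for a bare Python list. That mismatch may be why the judges reply in different formats. A minimal sketch of a prompt builder that states only one output format follows; `build_judge_prompt` is a hypothetical helper name and the example inputs are placeholders, not values from this notebook.

```python
def build_judge_prompt(question: str, together: str, n_competitors: int) -> str:
    """Build a judge prompt that requests exactly one output format (a JSON object)."""
    return (
        f"You are judging a competition between {n_competitors} competitors.\n"
        f"Each model has been given this question:\n\n{question}\n\n"
        "Your job is to evaluate each response for clarity and strength of argument, "
        "and rank them in order of best to worst.\n"
        "Respond with JSON, and only JSON, with the following format:\n"
        '{"results": ["best competitor number", "second best competitor number", ...]}\n\n'
        f"Here are the responses from each competitor:\n\n{together}\n\n"
        "Do not include markdown formatting or code blocks."
    )


# Placeholder inputs for illustration only
prompt = build_judge_prompt("What is 2 + 2?", "# Response from competitor 1\n\n4\n\n", 1)
print(prompt)
```

Even with a single format requested, some models may still wrap the JSON in a code fence, so a tolerant parser remains useful.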


In [18]:
#print(judge)

In [19]:
judge_messages = [{"role": "user", "content": judge}]

#### A single judge evaluation

In [20]:
openai = OpenAI()
response = openai.chat.completions.create(
    model="o3-mini",
    messages=judge_messages,
)
results = response.choices[0].message.content
print(results)

[1, 4, 3, 2, 5, 6]


In [25]:
results

'[1, 4, 3, 2, 5, 6]'

In [27]:
import ast

# The judge replied with a bare Python-style list rather than the JSON object
# {"results": [...]}, so parse it with ast.literal_eval instead of json.loads.
parsed = ast.literal_eval(results)
results2 = [int(x) for x in parsed]

for index, result in enumerate(results2):
    # Competitor numbers are 1-based; list indices are 0-based.
    competitor = competitors[result - 1]
    print(f"Rank {index+1}: {competitor}")

Rank 1: gpt-4.1-mini
Rank 2: deepseek-chat
Rank 3: gemini-2.0-flash
Rank 4: claude-3-7-sonnet-latest
Rank 5: llama-3.3-70b-versatile
Rank 6: mistral-small:latest
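
Indexing `competitors[result - 1]` assumes the judge returned every competitor number exactly once. A minimal sketch of a check for that assumption follows; `is_complete_ranking` is a hypothetical helper, not part of the notebook.

```python
def is_complete_ranking(ranking: list[int], n_competitors: int) -> bool:
    """Return True if ranking contains each number 1..n_competitors exactly once."""
    return sorted(ranking) == list(range(1, n_competitors + 1))


print(is_complete_ranking([1, 4, 3, 2, 5, 6], 6))  # True
print(is_complete_ranking([1, 5, 6, 2], 6))        # False: competitors 3 and 4 are missing
```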


#### Cross, Multi-LLM Evaluation

In [28]:
models

[<openai.OpenAI at 0x70b6cb9a9a90>,
 <anthropic.Anthropic at 0x70b6cb2a0050>,
 <openai.OpenAI at 0x70b6cb25dbd0>,
 <openai.OpenAI at 0x70b6cb25fc50>,
 <openai.OpenAI at 0x70b6cb51bbb0>,
 <openai.OpenAI at 0x70b6cb102060>]

In [40]:
# Query each LLM for its ranking (sequential, blocking calls: one request at a time)
import re

models_scores = []

for index, model in enumerate(models):

    if "claude" in competitors[index].lower():
        response = model.messages.create(model=competitors[index], messages=judge_messages, max_tokens=1000)
        model_scoring = response.content[0].text

    else:
        response = model.chat.completions.create(model=competitors[index], messages=judge_messages)
        model_scoring = response.choices[0].message.content

    models_scores.append(model_scoring)
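
The loop above waits for each judge before calling the next. If the calls should overlap, one option is a thread pool. The sketch below assumes each judge is wrapped in a zero-argument callable that returns the reply text; `run_judges` and the stand-in judges are hypothetical and make no API calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_judges(judges: list[Callable[[], str]], max_workers: int = 6) -> list[str]:
    """Call every judge in its own thread and return the replies in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(judge) for judge in judges]
        # Collecting in submission order keeps replies aligned with the judge list.
        return [future.result() for future in futures]


# Stand-in judges for illustration; real ones would wrap the two request branches above.
fake_judges = [lambda: "[1, 2]", lambda: '{"results": [2, 1]}']
print(run_judges(fake_judges))
```

Because results are gathered in submission order, the returned list can be indexed the same way as `competitors`.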

In [43]:
models_scores

['[1, 4, 3, 2, 6, 5]',
 '{"results": [1, 3, 4, 2, 6, 5]}',
 '["1", "4", "3", "2", "6", "5"]\n',
 '```json\n{"results": ["1", "4", "3", "2", "6", "5"]}\n```',
 '{"results": [1, 4, 2, 6, 3, 5]}',
 '[1, 5, 6, 2]']

In [44]:
# Standardize the scores to a common format
import ast
import json
import re

results = []

for item in models_scores:
    item = item.strip()
    
    # JSON wrapped in markdown
    if item.startswith('```json'):
        clean_json = re.sub(r'^```json\n|```$', '', item).strip()
        parsed = json.loads(clean_json)
        results.append([int(x) for x in parsed['results']])
    
    # Plain JSON with results list
    elif item.startswith('{') and '"results"' in item:
        parsed = json.loads(item)
        results.append([int(x) for x in parsed['results']])
    
    # Python-style list string
    elif item.startswith('[') and item.endswith(']'):
        parsed = ast.literal_eval(item)
        results.append([int(x) for x in parsed])
    
    else:
        raise ValueError(f"Unrecognized format: {item}")

print(results)

[[1, 4, 3, 2, 6, 5], [1, 3, 4, 2, 6, 5], [1, 4, 3, 2, 6, 5], [1, 4, 3, 2, 6, 5], [1, 4, 2, 6, 3, 5], [1, 5, 6, 2]]
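
The branch-per-format approach above raises an error as soon as a judge replies in a format that none of the branches covers. A more tolerant alternative is to pull the first flat `[...]` list out of the reply, whatever surrounds it. This is a sketch under that assumption; `parse_ranking` is a hypothetical helper, and the sample replies mirror the formats seen in `models_scores`.

```python
import ast
import re


def parse_ranking(reply: str) -> list[int]:
    """Extract the first flat [...] list from a judge reply and return it as integers."""
    match = re.search(r"\[[^\[\]]*\]", reply)
    if match is None:
        raise ValueError(f"No list found in reply: {reply!r}")
    # literal_eval accepts both [1, 2] and ["1", "2"]
    return [int(x) for x in ast.literal_eval(match.group(0))]


fence = "`" * 3  # markdown code fence, built here to keep this example readable
sample_replies = [
    "[1, 4, 3, 2, 6, 5]",
    '{"results": [1, 3, 4, 2, 6, 5]}',
    '["1", "4", "3", "2", "6", "5"]\n',
    fence + 'json\n{"results": ["1", "4", "3", "2", "6", "5"]}\n' + fence,
]
print([parse_ranking(reply) for reply in sample_replies])
```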


In [45]:
print("Model ranking results:")
for index, model in enumerate(models):
    print(competitors[index])  # the model acting as judge
    
    for index2, result in enumerate(results[index]):
        competitor = competitors[int(result)-1]
        print(f"   Rank {index2+1}: {competitor}")

Model ranking results:
gpt-4.1-mini
   Rank 1: gpt-4.1-mini
   Rank 2: deepseek-chat
   Rank 3: gemini-2.0-flash
   Rank 4: claude-3-7-sonnet-latest
   Rank 5: mistral-small:latest
   Rank 6: llama-3.3-70b-versatile
claude-3-7-sonnet-latest
   Rank 1: gpt-4.1-mini
   Rank 2: gemini-2.0-flash
   Rank 3: deepseek-chat
   Rank 4: claude-3-7-sonnet-latest
   Rank 5: mistral-small:latest
   Rank 6: llama-3.3-70b-versatile
gemini-2.0-flash
   Rank 1: gpt-4.1-mini
   Rank 2: deepseek-chat
   Rank 3: gemini-2.0-flash
   Rank 4: claude-3-7-sonnet-latest
   Rank 5: mistral-small:latest
   Rank 6: llama-3.3-70b-versatile
deepseek-chat
   Rank 1: gpt-4.1-mini
   Rank 2: deepseek-chat
   Rank 3: gemini-2.0-flash
   Rank 4: claude-3-7-sonnet-latest
   Rank 5: mistral-small:latest
   Rank 6: llama-3.3-70b-versatile
llama-3.3-70b-versatile
   Rank 1: gpt-4.1-mini
   Rank 2: deepseek-chat
   Rank 3: claude-3-7-sonnet-latest
   Rank 4: mistral-small:latest
   Rank 5: gemini-2.0-flash
   Rank 6: llama-3.

In [50]:
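# Average the values found at each list index across all judges.
# Note: each list holds competitor numbers ordered from best to worst, so this
# averages the competitor numbers placed at each rank position. It is not each
# competitor's average rank (that is computed per competitor in cell In [54]).
# Illustrative example with four judges:
# Index 0 values → [1, 2, 1, 2]
# Index 1 values → [2, 1, 2, 1]
# Index 2 values → [3, 3, 3, 3]
# Index 3 values → [4, 3, 2, 3]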

from statistics import mean

data = results

# Find the maximum length among the lists (normally all lists should be of the same length)
max_len = max(len(sublist) for sublist in data)

# Compute the average for each index across all lists
averages = []
for i in range(max_len):
    values_at_i = [sublist[i] for sublist in data if i < len(sublist)]
    averages.append(mean(values_at_i))

print(averages)

[1, 4, 3.5, 2.6666666666666665, 5.4, 5]


In [47]:
# Transpose the data and calculate the average of each column
# (zip stops at the shortest list, so only the first four positions are averaged here)
averages = [sum(col) / len(col) for col in zip(*results)]
print(averages)

[1.0, 4.0, 3.5, 2.6666666666666665]
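
Only four averages are printed because one judge returned a four-item ranking and `zip` drops every column beyond the shortest row. A small sketch on illustrative data (not the notebook's results) compares `zip` with `itertools.zip_longest`:

```python
from itertools import zip_longest

rows = [[1, 2, 3], [1, 3, 2], [2, 1]]  # illustrative data; the last row is incomplete

print(list(zip(*rows)))          # [(1, 1, 2), (2, 3, 1)]
print(list(zip_longest(*rows)))  # [(1, 1, 2), (2, 3, 1), (3, 2, None)]
```

With `zip_longest`, the `None` fill values would need to be filtered out before averaging.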


In [48]:
import statistics

# Transpose and compute medians
medians = [statistics.median(col) for col in zip(*results)]
print(medians)

[1.0, 4.0, 3.0, 2.0]


In [51]:
# Rank the averages

values = averages
sorted_values = sorted(set(values))
rank_dict = {v: i + 1 for i, v in enumerate(sorted_values)}
ranks = [rank_dict[v] for v in values]

print(competitors)
print(ranks)

['gpt-4.1-mini', 'claude-3-7-sonnet-latest', 'gemini-2.0-flash', 'deepseek-chat', 'llama-3.3-70b-versatile', 'mistral-small:latest']
[1, 4, 3, 2, 6, 5]
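
Ranking through `sorted(set(values))` gives equal values the same rank and leaves no gaps in the numbering. The same logic, wrapped in a hypothetical helper and run on illustrative values with a tie:

```python
def dense_ranks(values: list[float]) -> list[int]:
    """Rank values ascending; equal values share a rank and no rank is skipped."""
    rank_of = {v: i + 1 for i, v in enumerate(sorted(set(values)))}
    return [rank_of[v] for v in values]


print(dense_ranks([1.0, 2.5, 2.5, 4.0]))  # [1, 2, 2, 3]
```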


In [53]:
rankings = results

In [54]:
from collections import defaultdict

#rankings = [
#    [3, 1, 4, 6, 2, 5],
#    [3, 4, 1, 6, 2, 5],
#    [3, 4, 1, 6, 5, 2],
#    [1, 4, 3, 6, 2, 5],
#    [1, 3, 2, 4, 6, 5],
#    [5, 4, 3, 6, 1, 2]
#]

# Gather all positions per competitor
positions = defaultdict(list)

for race in rankings:
    for place, competitor_id in enumerate(race):
        positions[competitor_id].append(place + 1)  # +1 because place 0 = 1st

# Compute average position (lower is better)
average_ranks = {
    competitor: sum(places) / len(places)
    for competitor, places in positions.items()
}

# Sort competitors by average rank
ranked_competitors = sorted(average_ranks.items(), key=lambda x: x[1])

# Output
for i, (competitor, avg_rank) in enumerate(ranked_competitors, start=1):
    print(f"{i}. {competitors[competitor-1]} -- average rank = {avg_rank:.4f}")

1. gpt-4.1-mini -- average rank = 1.0000
2. deepseek-chat -- average rank = 2.2000
3. gemini-2.0-flash -- average rank = 3.2000
4. claude-3-7-sonnet-latest -- average rank = 3.8333
5. mistral-small:latest -- average rank = 4.5000
6. llama-3.3-70b-versatile -- average rank = 5.3333
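
One judge returned only four of the six competitors, so the averages above are taken over different numbers of votes per competitor. A minimal sketch that skips incomplete rankings before averaging follows; `average_positions` is a hypothetical helper, and `sample_rankings` copies the parsed results shown earlier.

```python
from collections import defaultdict


def average_positions(rankings: list[list[int]], n_competitors: int) -> dict[int, float]:
    """Average each competitor's position over complete rankings only."""
    expected = list(range(1, n_competitors + 1))
    positions = defaultdict(list)
    for ranking in rankings:
        if sorted(ranking) != expected:
            continue  # skip rankings that miss or repeat a competitor
        for place, competitor_id in enumerate(ranking, start=1):
            positions[competitor_id].append(place)
    return {cid: sum(places) / len(places) for cid, places in positions.items()}


sample_rankings = [
    [1, 4, 3, 2, 6, 5], [1, 3, 4, 2, 6, 5], [1, 4, 3, 2, 6, 5],
    [1, 4, 3, 2, 6, 5], [1, 4, 2, 6, 3, 5], [1, 5, 6, 2],
]
filtered = average_positions(sample_rankings, 6)
for cid, avg in sorted(filtered.items(), key=lambda kv: kv[1]):
    print(f"competitor {cid}: average rank = {avg:.4f}")
```

With these rankings the order of competitors stays the same, although the averages for the last three competitors shift once the incomplete ranking is excluded.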
