## Welcome to the Second Lab - Week 1, Day 3

Today we will work with lots of models! This is a way to get comfortable with APIs.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Important point - please read</h2>
            <span style="color:#ff7800;">The way I collaborate with you may be different to other courses you've taken. I prefer not to type code while you watch. Rather, I execute Jupyter Labs, like this, and give you an intuition for what's going on. My suggestion is that you carefully execute this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations.<br/><br/>If you have time, I'd love it if you submit a PR for changes in the community_contributions folder - instructions in the resources. Also, if you have a GitHub account, use this to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...
            </span>
        </td>
    </tr>
</table>

In [1]:
# Start with imports - ask ChatGPT to explain any package that you don't know

import os
import json
from dotenv import load_dotenv
from openai import OpenAI
from anthropic import Anthropic
from IPython.display import Markdown, display

In [2]:
# Always remember to do this!
load_dotenv(override=True)

True

In [3]:
# Print the key prefixes to help with any debugging

openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key exists and begins sk-ant-
Google API Key exists and begins AI
DeepSeek API Key not set (and this is optional)
Groq API Key exists and begins gsk_
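The five near-identical `if`/`else` blocks above can be collapsed into a loop. This is only a sketch: the table `KEY_PREFIX_LENGTHS` and the helper `describe_key` are hypothetical names that are not part of the lab, and the prefix lengths are simply copied from the cell above.

```python
import os

# Environment variable -> (display name, prefix length to show, is it optional?)
# Prefix lengths are copied from the cell above; the names here are illustrative.
KEY_PREFIX_LENGTHS = {
    "OPENAI_API_KEY": ("OpenAI", 8, False),
    "ANTHROPIC_API_KEY": ("Anthropic", 7, True),
    "GOOGLE_API_KEY": ("Google", 2, True),
    "DEEPSEEK_API_KEY": ("DeepSeek", 3, True),
    "GROQ_API_KEY": ("Groq", 4, True),
}

def describe_key(env_name, environ=None):
    """Return the same status line the cell above prints for one key."""
    environ = os.environ if environ is None else environ
    label, prefix_length, optional = KEY_PREFIX_LENGTHS[env_name]
    value = environ.get(env_name)
    if value:
        return f"{label} API Key exists and begins {value[:prefix_length]}"
    suffix = " (and this is optional)" if optional else ""
    return f"{label} API Key not set{suffix}"

for env_name in KEY_PREFIX_LENGTHS:
    print(describe_key(env_name))
```

Passing a plain dict as `environ` makes the helper easy to try out without touching your real environment.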


In [4]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]

In [5]:
messages

[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]

In [6]:
openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
)
question = response.choices[0].message.content
print(question)


How would you reconcile the ethical implications of AI decision-making in life-and-death scenarios with the necessity for rapid, automated responses in critical situations?


In [7]:
competitors = []
answers = []
messages = [{"role": "user", "content": question}]

In [8]:
# The API we know well

model_name = "gpt-4o-mini"

response = openai.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Reconciling the ethical implications of AI decision-making in life-and-death situations with the necessity for rapid, automated responses involves a multifaceted approach that includes ethical frameworks, robust governance, transparent algorithms, and human oversight. Here are several strategies to consider:

1. **Establish Ethical Frameworks**: Develop a set of ethical guidelines or frameworks that govern AI deployment in critical scenarios. These could be based on principles such as transparency, accountability, fairness, and the priority of human life. Engaging ethicists, policymakers, industry experts, and the public in a participatory process can help shape these frameworks.

2. **Human-in-the-Loop Systems**: Implement designs where AI assists but does not entirely replace human judgment in critical decisions. 'Human-in-the-loop' systems can allow for rapid automated processing while ensuring that a human decision-maker has the final say or can intervene if needed.

3. **Prioritize Safety and Reliability**: Focus on rigorous testing and validation of AI systems in simulated environments to ensure reliability in high-stakes situations. This includes stress-testing the algorithms under diverse scenarios to understand their limitations and ensure they can function effectively without causing harm.

4. **Transparency and Explainability**: Ensure AI systems are designed with transparency in mind, allowing stakeholders to understand how decisions are made. Explainable AI can help users comprehend the reasoning behind an AI’s decisions, which is crucial in life-and-death scenarios where the rationale for a decision needs to be clear.

5. **Robust Governance and Regulation**: Establish regulatory frameworks that oversee the development and deployment of AI in critical applications. This could involve creating oversight bodies to monitor AI systems and enforce compliance with ethical standards and safety protocols.

6. **Continuous Monitoring and Feedback**: Implement systems that continuously monitor AI decision-making processes in real time. Gathering data on outcomes can help refine algorithms over time, improving their effectiveness and ethical compliance as new insights emerge.

7. **Training and Education**: Ensure that operators and stakeholders using AI systems are trained in both the technology and ethical considerations involved. This can help them make informed decisions and recognize potential issues early.

8. **Public Engagement and Trust**: Engage the public in discussions about the role of AI in life-and-death scenarios. Building trust through transparency and involving diverse perspectives can help address concerns about automated decision-making.

9. **Scenario-Specific Protocols**: Recognize that different critical situations may require different approaches. Establish protocols tailored to specific scenarios (e.g., medical emergencies, military operations), allowing for flexibility in how AI is implemented based on urgency and context.

By integrating these strategies, it is possible to develop AI systems that not only respond quickly to life-and-death situations but do so in a manner that respects ethical considerations and prioritizes human safety and dignity. Balancing efficiency with ethics is vital in ensuring responsible AI deployment in critical domains.

In [9]:
# Anthropic has a slightly different API, and max_tokens is required

model_name = "claude-3-7-sonnet-latest"

claude = Anthropic()
response = claude.messages.create(model=model_name, messages=messages, max_tokens=1000)
answer = response.content[0].text

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

# Reconciling Ethics and Efficiency in AI Life-or-Death Decisions

This is one of the most challenging questions in AI ethics. I see several approaches to balance these competing concerns:

**Hybrid systems** might offer the best compromise - using AI for rapid initial assessment while maintaining human oversight for morally complex decisions when time permits. The key is designing systems where humans and AI complement each other's limitations.

**Value alignment engineering** is crucial - embedding carefully considered ethical frameworks developed through inclusive, democratic processes rather than simply reflecting developer values.

**Tiered autonomy** could help, where an AI's decision-making authority scales with:
- The time pressure of the situation
- The confidence level of its assessment
- The severity of potential consequences

For truly unavoidable automated decisions, we should insist on robust transparency allowing for meaningful post-hoc review, continuous improvement, and accountability.

What specific domain of life-or-death decisions were you thinking about? The appropriate balance might differ significantly between medical triage, autonomous vehicles, or military applications.
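The two cells above read the reply text from different places: `response.choices[0].message.content` for OpenAI and `response.content[0].text` for Anthropic. One way to hide that difference is a small helper. The sketch below uses a hypothetical function `extract_text` and stand-in objects built with `SimpleNamespace`, so it runs without any API call; it only mimics the attributes used in this lab and is not a full model of either SDK's response objects.

```python
from types import SimpleNamespace

def extract_text(response):
    """Return the reply text from either response shape used in this lab."""
    if hasattr(response, "choices"):
        # OpenAI-style chat completion
        return response.choices[0].message.content
    # Anthropic-style message
    return response.content[0].text

# Stand-in objects that mimic only the attributes accessed above
openai_like = SimpleNamespace(
    choices=[SimpleNamespace(message=SimpleNamespace(content="hello from OpenAI"))]
)
anthropic_like = SimpleNamespace(content=[SimpleNamespace(text="hello from Claude")])

print(extract_text(openai_like))
print(extract_text(anthropic_like))
```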

In [10]:
gemini = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
model_name = "gemini-2.0-flash"

response = gemini.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Reconciling the ethical implications of AI decision-making in life-and-death scenarios with the need for rapid, automated responses is a complex challenge requiring a multi-pronged approach. Here's a breakdown of the key considerations and potential strategies:

**1. Defining the Ethical Stakes:**

*   **Transparency and Explainability:** How can we understand *why* an AI made a particular decision?  Black box algorithms make it difficult to assess fairness, bias, and potential errors.
*   **Accountability and Responsibility:** Who is responsible when an AI makes a wrong decision that leads to harm?  The programmer, the owner, the user, or the AI itself? Current legal and ethical frameworks struggle with assigning responsibility to non-human entities.
*   **Fairness and Bias:**  AI models are trained on data, and if that data reflects existing societal biases (e.g., racial, gender, socioeconomic), the AI will likely perpetuate and even amplify those biases, leading to discriminatory outcomes in critical situations.
*   **Autonomy vs. Human Oversight:**  How much autonomy should be granted to the AI?  When and how should human intervention be triggered?
*   **Data Privacy and Security:**  Life-and-death scenarios often involve highly sensitive data.  How can we ensure this data is protected and used ethically?
*   **The Value of Human Life:** Different ethical frameworks prioritize different values. How do we program an AI to make value judgments about who to save when it can't save everyone?

**2. Strategies for Mitigation and Reconciliation:**

*   **Develop Explainable AI (XAI):** Invest heavily in research and development of XAI techniques.  This includes:
    *   **Model simplification:**  Using simpler, more interpretable models where possible.
    *   **Feature importance analysis:**  Identifying the most influential factors in the AI's decision-making process.
    *   **Counterfactual explanations:**  Explaining what would need to change for the AI to arrive at a different decision.
*   **Robust Bias Detection and Mitigation:**
    *   **Diverse Training Data:**  Ensure training datasets are representative of the populations the AI will be used to serve.
    *   **Bias Auditing:**  Regularly audit AI systems for bias, using various metrics and techniques.
    *   **Algorithmic Fairness Techniques:**  Employ algorithms designed to mitigate bias during training and inference.  (e.g., adversarial debiasing, re-weighting techniques).
*   **Human-in-the-Loop Systems:**  Design systems that allow for human oversight and intervention, especially in high-stakes situations. This involves:
    *   **Alerting mechanisms:**  Flag potentially problematic decisions or situations where human judgment is needed.
    *   **Override capabilities:**  Enable humans to override the AI's decision when necessary.
    *   **Training for human operators:**  Train individuals on how to effectively interact with and manage AI systems in critical contexts.
*   **Clear Lines of Responsibility and Accountability:**
    *   **Legal Frameworks:**  Develop legal and regulatory frameworks that clearly define liability and accountability for AI-related harms.  This may require rethinking traditional notions of fault and causation.
    *   **Ethical Guidelines and Standards:**  Establish clear ethical guidelines and standards for the development, deployment, and use of AI in life-and-death scenarios. These guidelines should address issues such as transparency, fairness, and accountability.
*   **Scenario Planning and Testing:**
    *   **Simulations and Red Teaming:**  Extensively test AI systems in simulated environments that mimic real-world scenarios, including edge cases and high-stress situations.
    *   **"Red Teaming":** Employ experts to actively try to break or exploit the AI system to identify vulnerabilities and weaknesses.
*   **Prioritize Speed vs. Accuracy Trade-offs:**
    *   **Risk Assessment:** Conduct thorough risk assessments to determine the appropriate level of automation in different scenarios. In situations where speed is paramount, accept a higher degree of error, but implement fail-safes. In situations where accuracy is paramount, accept a slower response time.
    *   **Phased Implementation:**  Roll out AI systems gradually, starting with less critical applications and gradually expanding their use as trust and confidence grow.
*   **Ongoing Monitoring and Evaluation:**
    *   **Performance Monitoring:** Continuously monitor the performance of AI systems in real-world settings.
    *   **Feedback Loops:**  Establish mechanisms for gathering feedback from users and stakeholders to identify areas for improvement.
    *   **Adaptive Learning:** Design systems that can learn from their mistakes and adapt to changing circumstances.
*   **Data Governance and Security:**
    *   **Data Minimization:** Collect only the data that is strictly necessary for the AI to perform its function.
    *   **Data Anonymization and Encryption:**  Employ techniques to protect the privacy and security of sensitive data.
    *   **Access Controls:**  Implement strict access controls to limit who can access and modify data.
*   **Ethical Education and Awareness:**
    *   **Training for AI Developers:** Educate AI developers about the ethical implications of their work.
    *   **Public Awareness Campaigns:** Raise public awareness about the potential benefits and risks of AI.
*   **Collaboration and Interdisciplinary Approaches:**  Foster collaboration between AI researchers, ethicists, legal experts, policymakers, and domain experts (e.g., doctors, emergency responders) to address these complex challenges from multiple perspectives.

**3.  Acknowledging Inherent Limitations:**

*   **No Perfect Solution:**  It's crucial to acknowledge that there is no perfect solution that can eliminate all ethical risks. There will always be trade-offs and difficult decisions to make.
*   **Continuous Refinement:**  The ethical landscape of AI is constantly evolving. It is essential to continuously refine our approaches and adapt to new challenges as they arise.

**In Summary:**

Reconciling the need for rapid, automated AI responses with ethical considerations requires a proactive, multi-faceted approach.  It is not a problem that can be solved once and for all, but rather a continuous process of evaluation, improvement, and adaptation. By focusing on transparency, fairness, accountability, and human oversight, we can harness the power of AI while mitigating the risks and ensuring that it is used in a way that benefits humanity.  It also requires a realistic acceptance of unavoidable imperfections and a commitment to constant improvement.


In [11]:
if deepseek_api_key:
    deepseek = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
    model_name = "deepseek-chat"

    response = deepseek.chat.completions.create(model=model_name, messages=messages)
    answer = response.choices[0].message.content

    display(Markdown(answer))
    competitors.append(model_name)
    answers.append(answer)

In [12]:
groq = OpenAI(api_key=groq_api_key, base_url="https://api.groq.com/openai/v1")
model_name = "llama-3.3-70b-versatile"

response = groq.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)


Reconciling the ethical implications of AI decision-making in life-and-death scenarios with the necessity for rapid, automated responses in critical situations requires a multi-faceted approach. Here are some strategies to help strike a balance between these competing demands:

1. **Design for transparency and explainability**: Develop AI systems that provide clear explanations for their decisions, enabling humans to understand the reasoning behind the AI's actions. This transparency can help build trust and facilitate accountability.
2. **Human oversight and review**: Implement a human review process to oversee AI decision-making, especially in high-stakes situations. This allows for the detection and correction of potential errors or biases.
3. **Value alignment**: Ensure that AI systems are designed to align with human values, such as the preservation of human life and the prevention of harm. This can be achieved through the development of value-based objectives and reward functions.
4. **Robust testing and validation**: Conduct thorough testing and validation of AI systems to ensure they can handle a wide range of scenarios and edge cases, including those that may involve life-and-death decisions.
5. **Regulatory frameworks**: Establish regulatory frameworks that address the use of AI in critical situations. This can include guidelines for the development, deployment, and use of AI systems, as well as mechanisms for accountability and oversight.
6. **Human-AI collaboration**: Design systems that facilitate human-AI collaboration, allowing humans to provide input and guidance in critical situations. This can help ensure that AI decisions are informed by human values and expertise.
7. **Continuous monitoring and improvement**: Regularly monitor AI system performance and update the systems as needed to ensure they remain effective and aligned with human values.
8. **Ethics by design**: Incorporate ethical considerations into the design process from the outset, rather than as an afterthought. This involves considering the potential consequences of AI decisions and designing systems that prioritize human well-being and safety.
9. **Accountability mechanisms**: Establish mechanisms for holding AI developers, deployers, and users accountable for the consequences of AI decisions, including those that result in harm or loss of life.
10. **Public engagement and education**: Engage with the public and educate them about the benefits and risks of AI decision-making in critical situations, as well as the measures being taken to ensure accountability and transparency.

**Rapid response strategies**:

1. **Hybrid systems**: Develop hybrid systems that combine the strengths of both human and AI decision-making, such as using AI to provide initial assessments and humans to review and validate the decisions.
2. **Real-time monitoring**: Implement real-time monitoring systems that can detect and respond to critical situations, with AI providing support and guidance to human decision-makers.
3. **Automated alerts**: Use automated alerts and notifications to notify humans of potential critical situations, allowing them to intervene and make decisions in a timely manner.
4. **AI-assisted decision support**: Provide AI-assisted decision support systems that can help humans make informed decisions in high-pressure situations, while still maintaining human oversight and control.

**Future research directions**:

1. **Value-based AI**: Develop AI systems that can learn and adapt to human values, ensuring that AI decisions align with human principles and ethics.
2. **Explainable AI**: Improve the explainability of AI decisions, enabling humans to understand the reasoning behind AI actions and build trust in AI systems.
3. **Human-AI collaboration**: Investigate new ways to facilitate human-AI collaboration, including the development of shared decision-making frameworks and interfaces.
4. **Regulatory frameworks**: Develop and refine regulatory frameworks that address the use of AI in critical situations, ensuring accountability, transparency, and safety.

By pursuing these strategies and research directions, we can develop AI systems that balance the need for rapid, automated responses in critical situations with the ethical implications of AI decision-making in life-and-death scenarios.

## For the next cell, we will use Ollama

Ollama runs a local web service that exposes an OpenAI-compatible endpoint,  
and runs models locally using high-performance C++ code.

If you don't have Ollama, install it by visiting https://ollama.com, pressing Download, and following the instructions.

After it's installed, you should be able to visit http://localhost:11434 and see the message "Ollama is running".

You might need to restart Cursor (and maybe reboot). Then open a Terminal (control+\`) and run `ollama serve`

Useful Ollama commands (run these in the terminal, or with an exclamation mark in this notebook):

`ollama pull <model_name>` downloads a model locally  
`ollama ls` lists all the models you've downloaded  
`ollama rm <model_name>` deletes the specified model from your downloads
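The browser check described above ("Ollama is running") can also be done from Python before you call the model. This is a minimal sketch using only the standard library; the function name `is_ollama_running` and the 2-second timeout are assumptions for illustration, not part of Ollama or of this lab.

```python
import http.client
import urllib.error
import urllib.request

def is_ollama_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if the URL answers with the 'Ollama is running' message."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as reply:
            body = reply.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, or a malformed reply: treat as not running
        return False
    return "Ollama is running" in body

print(is_ollama_running())
```

If this prints `False`, run `ollama serve` in a terminal and try again.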

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Super important - ignore me at your peril!</h2>
            <span style="color:#ff7800;">The model called <b>llama3.3</b> is FAR too large for home computers - it's not intended for personal computing and will consume all your resources! Stick with the nicely sized <b>llama3.2</b> or <b>llama3.2:1b</b> and if you want larger, try llama3.1 or smaller variants of Qwen, Gemma, Phi or DeepSeek. See <a href="https://ollama.com/models">the Ollama models page</a> for a full list of models and sizes.
            </span>
        </td>
    </tr>
</table>

In [13]:
#!ollama pull llama3.2

In [14]:
ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
model_name = "llama3.2"

response = ollama.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Reconciling the ethical implications of AI decision-making in life-and-death scenarios with the need for rapid, automated responses is a complex and multifaceted challenge. Here are some possible approaches to addressing this issue:

1. **Hybrid approach**: Implementing hybrid systems that combine human oversight and review with AI-driven decision-making can help mitigate potential ethical concerns. This would ensure that while AI makes initial decisions, humans have the final say to review and correct any errors or inconsistencies.
2. **Transparency and explainability**: Developing more transparent and explainable algorithms can help build trust in AI-driven decision-making processes. By providing clear explanations for their recommendations, AI systems can demonstrate their thought process and underlying logic, making it easier for humans to evaluate the decisions made.
3. **Value alignment**: Ensuring that AI systems are aligned with human values and ethical principles is crucial. This involves developing AI systems that prioritize human well-being, dignity, and life preservation above other considerations.
4. **Human-AI collaboration**: Designing AI systems that facilitate human-AI collaboration can help ensure that humans remain in control of critical decision-making processes. For example, AI could assist humans in analyzing data or identifying patterns, while humans make the final decisions.
5. **Risk assessment and mitigation**: Conducting thorough risk assessments and developing strategies to mitigate adverse outcomes is essential when deploying AI systems in life-and-death scenarios. This may involve implementing safety protocols, developing redundant systems, or establishing procedures for manual override.
6. **Continuous monitoring and evaluation**: Regularly monitoring and evaluating the performance of AI systems in life-and-death scenarios can help identify areas for improvement and ensure that they are upholding ethical standards.
7. **Regulatory frameworks**: Establishing regulatory frameworks that specifically address AI decision-making in life-and-death scenarios can provide a clear framework for organizations to operate within, ensuring that AI systems are designed and deployed with ethics in mind.
8. **Education and awareness**: Educating users about the capabilities and limitations of AI systems, as well as the ethical considerations involved, is crucial. This can help build trust and ensure that stakeholders understand when AI-driven decisions should be used or overridden.
9. **Ethics-based design**: Incorporating ethics into the design phase of AI systems can help ensure that they are developed with humanvalues and ethical principles in mind from the outset.
10. **Human-centered design approach**: Adopting a human-centered design approach can help prioritize human well-being, dignity, and life preservation when designing AI systems for life-and-death scenarios.

To reconcile the necessity for rapid, automated responses in critical situations with the need to avoid potential ethical pitfalls, it's essential to adopt a holistic approach that incorporates multiple strategies. This may involve:

* Implementing decision-support systems that augment human judgment rather than replace it
* Developing AI systems that prioritize human well-being and life preservation above other considerations
* Establishing clear guidelines and procedures for AI-driven decision-making in critical situations
* Conducting continuous monitoring and evaluation to ensure that AI systems are upholding ethical standards
* Encouraging transparency, explainability, and human-AI collaboration to build trust and confidence in AI-driven decision-making processes.

Ultimately, reconciling the ethical implications of AI decision-making in life-and-death scenarios requires a multifaceted approach that prioritizes human well-being, dignity, and life preservation.
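Gemini, DeepSeek, Groq and Ollama were all called through the same OpenAI client, with only the `base_url`, key and model name changing. That repetition can be expressed as a table plus a loop. The sketch below is an illustration under assumptions: `OPENAI_COMPATIBLE`, `collect_answers` and `make_stub_client` are hypothetical names, the base URLs are copied from the cells above, and the stub client stands in for `OpenAI(...)` so the example runs offline.

```python
from types import SimpleNamespace

# Model name -> base URL, copied from the cells above
OPENAI_COMPATIBLE = {
    "gemini-2.0-flash": "https://generativelanguage.googleapis.com/v1beta/openai/",
    "deepseek-chat": "https://api.deepseek.com/v1",
    "llama-3.3-70b-versatile": "https://api.groq.com/openai/v1",
    "llama3.2": "http://localhost:11434/v1",
}

def collect_answers(messages, api_keys, make_client):
    """Ask every model that has a key; return (competitors, answers)."""
    competitors, answers = [], []
    for model_name, base_url in OPENAI_COMPATIBLE.items():
        api_key = api_keys.get(model_name)
        if not api_key:
            continue  # skip providers without a key, as the DeepSeek cell does
        client = make_client(api_key=api_key, base_url=base_url)
        response = client.chat.completions.create(model=model_name, messages=messages)
        competitors.append(model_name)
        answers.append(response.choices[0].message.content)
    return competitors, answers

def make_stub_client(api_key, base_url):
    """Offline stand-in for OpenAI(api_key=..., base_url=...), for demonstration only."""
    def create(model, messages):
        reply = SimpleNamespace(content=f"stub answer from {model}")
        return SimpleNamespace(choices=[SimpleNamespace(message=reply)])
    return SimpleNamespace(chat=SimpleNamespace(completions=SimpleNamespace(create=create)))

demo_keys = {"gemini-2.0-flash": "fake-key", "llama3.2": "ollama"}
names, replies = collect_answers([{"role": "user", "content": "Hi"}], demo_keys, make_stub_client)
print(names)
```

In the notebook itself, `make_client` would be the real `OpenAI` class, and `api_keys` would map each model name to the matching key variable (with `"ollama"` as the placeholder key for the local model).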

In [15]:
# So where are we?

print(competitors)
print(answers)


['gpt-4o-mini', 'claude-3-7-sonnet-latest', 'gemini-2.0-flash', 'llama-3.3-70b-versatile', 'llama3.2']
["Reconciling the ethical implications of AI decision-making in life-and-death situations with the necessity for rapid, automated responses involves a multifaceted approach that includes ethical frameworks, robust governance, transparent algorithms, and human oversight. Here are several strategies to consider:\n\n1. **Establish Ethical Frameworks**: Develop a set of ethical guidelines or frameworks that govern AI deployment in critical scenarios. These could be based on principles such as transparency, accountability, fairness, and the priority of human life. Engaging ethicists, policymakers, industry experts, and the public in a participatory process can help shape these frameworks.\n\n2. **Human-in-the-Loop Systems**: Implement designs where AI assists but does not entirely replace human judgment in critical decisions. 'Human-in-the-loop' systems can allow for rapid automated proces

In [16]:
# It's nice to know how to use "zip"
for competitor, answer in zip(competitors, answers):
    print(f"Competitor: {competitor}\n\n{answer}")


Competitor: gpt-4o-mini

Reconciling the ethical implications of AI decision-making in life-and-death situations with the necessity for rapid, automated responses involves a multifaceted approach that includes ethical frameworks, robust governance, transparent algorithms, and human oversight. Here are several strategies to consider:

1. **Establish Ethical Frameworks**: Develop a set of ethical guidelines or frameworks that govern AI deployment in critical scenarios. These could be based on principles such as transparency, accountability, fairness, and the priority of human life. Engaging ethicists, policymakers, industry experts, and the public in a participatory process can help shape these frameworks.

2. **Human-in-the-Loop Systems**: Implement designs where AI assists but does not entirely replace human judgment in critical decisions. 'Human-in-the-loop' systems can allow for rapid automated processing while ensuring that a human decision-maker has the final say or can intervene i

In [17]:
# Let's bring this together - note the use of "enumerate"

together = ""
for index, answer in enumerate(answers):
    together += f"# Response from competitor {index+1}\n\n"
    together += answer + "\n\n"

In [18]:
print(together)

# Response from competitor 1

Reconciling the ethical implications of AI decision-making in life-and-death situations with the necessity for rapid, automated responses involves a multifaceted approach that includes ethical frameworks, robust governance, transparent algorithms, and human oversight. Here are several strategies to consider:

1. **Establish Ethical Frameworks**: Develop a set of ethical guidelines or frameworks that govern AI deployment in critical scenarios. These could be based on principles such as transparency, accountability, fairness, and the priority of human life. Engaging ethicists, policymakers, industry experts, and the public in a participatory process can help shape these frameworks.

2. **Human-in-the-Loop Systems**: Implement designs where AI assists but does not entirely replace human judgment in critical decisions. 'Human-in-the-loop' systems can allow for rapid automated processing while ensuring that a human decision-maker has the final say or can interv
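As a small variation on the loop above, `enumerate` accepts a `start` argument, which avoids the `index+1` arithmetic. The sketch below wraps the same logic in a hypothetical helper `build_together` (not part of the lab) and should produce the same string as the original loop.

```python
def build_together(answers):
    """Join the answers under numbered headings, numbering from 1."""
    sections = []
    for number, answer in enumerate(answers, start=1):
        sections.append(f"# Response from competitor {number}\n\n{answer}\n\n")
    return "".join(sections)

print(build_together(["First answer", "Second answer"]))
```

Note that only the competitor number goes into the text, not the model name, so the judge does not see which model wrote which answer.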

In [19]:
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""


In [20]:
print(judge)

You are judging a competition between 5 competitors.
Each model has been given this question:

How would you reconcile the ethical implications of AI decision-making in life-and-death scenarios with the necessity for rapid, automated responses in critical situations?

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}

Here are the responses from each competitor:

# Response from competitor 1

Reconciling the ethical implications of AI decision-making in life-and-death situations with the necessity for rapid, automated responses involves a multifaceted approach that includes ethical frameworks, robust governance, transparent algorithms, and human oversight. Here are several strategies to consider:

1. **Establish Ethical Frameworks**: Develop a set of e

In [21]:
judge_messages = [{"role": "user", "content": judge}]

In [22]:
# Judgement time!

openai = OpenAI()
response = openai.chat.completions.create(
    model="o3-mini",
    messages=judge_messages,
)
results = response.choices[0].message.content
print(results)


{"results": ["3", "1", "4", "5", "2"]}


In [23]:
# OK let's turn this into results!

results_dict = json.loads(results)
ranks = results_dict["results"]
for index, result in enumerate(ranks):
    competitor = competitors[int(result)-1]
    print(f"Rank {index+1}: {competitor}")

Rank 1: gemini-2.0-flash
Rank 2: gpt-4o-mini
Rank 3: llama-3.3-70b-versatile
Rank 4: llama3.2
Rank 5: claude-3-7-sonnet-latest
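The cell above trusts the judge to return clean JSON with every competitor number exactly once. A model may not always comply, so a more defensive parse can help. This is a sketch under assumptions: `parse_ranking` is a hypothetical helper, and the fence-stripping step is a guess at one common failure (a reply wrapped in a Markdown code block) rather than an exhaustive fix.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, built this way to keep this example readable

def parse_ranking(raw, competitors):
    """Turn the judge's JSON reply into a list of model names, best first."""
    text = raw.strip()
    # Tolerate a Markdown code fence even though the prompt asks for none
    if text.startswith(FENCE):
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]
    ranks = [int(rank) for rank in json.loads(text)["results"]]
    expected = list(range(1, len(competitors) + 1))
    if sorted(ranks) != expected:
        raise ValueError(f"Judge returned {ranks}, expected a permutation of {expected}")
    return [competitors[rank - 1] for rank in ranks]

demo_competitors = ["gpt-4o-mini", "claude-3-7-sonnet-latest", "gemini-2.0-flash",
                    "llama-3.3-70b-versatile", "llama3.2"]
demo_reply = '{"results": ["3", "1", "4", "5", "2"]}'
for position, name in enumerate(parse_ranking(demo_reply, demo_competitors), start=1):
    print(f"Rank {position}: {name}")
```

Raising a `ValueError` on a duplicated or missing number means a bad reply fails loudly instead of silently misreporting the ranking.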


<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/exercise.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Exercise</h2>
            <span style="color:#ff7800;">Which pattern(s) did this use? Try updating this to add another Agentic design pattern.
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">These kinds of patterns - sending a task to multiple models and evaluating the results -
            are common where you need to improve the quality of your LLM response. This approach can be applied
            to many business projects where accuracy is critical.
            </span>
        </td>
    </tr>
</table>