## Welcome to the Second Lab - Week 1, Day 3

Today we will work with lots of models! This is a way to get comfortable with APIs.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Important point - please read</h2>
            <span style="color:#ff7800;">The way I collaborate with you may be different from other courses you've taken. I prefer not to type code while you watch. Rather, I execute Jupyter Labs, like this, and give you an intuition for what's going on. My suggestion is that you carefully execute this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations.<br/><br/>If you have time, I'd love it if you submit a PR for changes in the community_contributions folder - instructions in the resources. Also, if you have a GitHub account, use it to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...
            </span>
        </td>
    </tr>
</table>

In [2]:
# Start with imports - ask ChatGPT to explain any package that you don't know

import os
import json
from dotenv import load_dotenv
from openai import OpenAI
# from anthropic import Anthropic
from IPython.display import Markdown, display

In [4]:
from anthropic import Anthropic

In [6]:
# Always remember to do this!
load_dotenv(override=True)

True

In [7]:
# Print the key prefixes to help with any debugging

openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key exists and begins sk-ant-
Google API Key exists and begins AI
DeepSeek API Key not set (and this is optional)
Groq API Key not set (and this is optional)
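
The five near-identical `if`/`else` blocks above could also be driven by a small table. This is a minimal sketch rather than part of the original lab: the `KEYS` table and the `describe_key` helper are hypothetical names, and the prefix lengths are copied from the cell above.

```python
import os

# Label -> (environment variable, prefix length to print, required?)
# The values mirror the cell above; the table itself is an illustrative assumption.
KEYS = {
    "OpenAI": ("OPENAI_API_KEY", 8, True),
    "Anthropic": ("ANTHROPIC_API_KEY", 7, False),
    "Google": ("GOOGLE_API_KEY", 2, False),
    "DeepSeek": ("DEEPSEEK_API_KEY", 3, False),
    "Groq": ("GROQ_API_KEY", 4, False),
}


def describe_key(label, env_var, prefix_len, required):
    """Return a one-line status message for a single API key."""
    value = os.getenv(env_var)
    if value:
        # Only a short prefix is shown, never the whole key
        return f"{label} API Key exists and begins {value[:prefix_len]}"
    suffix = "" if required else " (and this is optional)"
    return f"{label} API Key not set{suffix}"


for label, (env_var, prefix_len, required) in KEYS.items():
    print(describe_key(label, env_var, prefix_len, required))
```

Adding another provider then means adding one row to the table instead of another `if`/`else` block.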


In [8]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]

In [9]:
messages

[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]

In [10]:
openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
)
question = response.choices[0].message.content
print(question)


In a world where artificial intelligence continues to evolve and integrate into daily life, what ethical framework should govern the development and deployment of AI systems to balance innovation with societal risk, and how would you prioritize the interests of different stakeholders involved?


In [11]:
competitors = []
answers = []
messages = [{"role": "user", "content": question}]

In [12]:
# The API we know well
# This is asking 4o-mini its own question

model_name = "gpt-4o-mini"

response = openai.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In a rapidly advancing technological landscape, establishing an ethical framework for the development and deployment of AI systems is crucial for balancing innovation with societal risks. An effective framework should encompass a combination of principles that prioritize safety, fairness, accountability, transparency, and inclusivity. Here’s a detailed breakdown of how such a framework could be structured and how to prioritize stakeholder interests:

### Ethical Framework Components

1. **Safety and Security**:
   - **Principle**: Ensure that AI systems are designed and rigorously tested to operate safely and securely, minimizing risks to human life and well-being.
   - **Implementation**: Mandate safety protocols, continuous monitoring, and the use of best practices in AI development.

2. **Fairness and Non-discrimination**:
   - **Principle**: Strive for equitable outcomes, preventing biases against any individual or group.
   - **Implementation**: Develop algorithms that are tested for bias and subjected to diverse input during both training and evaluation phases.

3. **Transparency**:
   - **Principle**: Allow stakeholders to understand how AI systems operate and make decisions.
   - **Implementation**: Require clear documentation of AI models, decision-making processes, and potential limitations so that users and developers are fully informed.

4. **Accountability**:
   - **Principle**: Ensure that AI developers and organizations are responsible for the impacts of their technologies.
   - **Implementation**: Establish governance structures that include compliance with regulatory standards, ethical reviews, and mechanisms for redress.

5. **Inclusivity and Stakeholder Engagement**:
   - **Principle**: Engage a diverse range of stakeholders in the development process to understand various perspectives and needs.
   - **Implementation**: Foster dialogue with communities affected by AI implementation, considering their input throughout the development lifecycle.

6. **Sustainability**:
   - **Principle**: Consider the environmental impact of AI technologies and implement practices that promote sustainability.
   - **Implementation**: Encourage the development of energy-efficient AI systems and promote technologies that solve real-world problems.

### Prioritizing Stakeholder Interests

The interests of different stakeholders (users, developers, organizations, regulators, and society at large) can be prioritized through a balanced approach:

1. **Users**: Their safety, privacy, and empowerment should be the top priority. They should have access to information about how AI affects their lives and the ability to opt out of systems that do not prioritize their interests.

2. **Developers and Organizations**: While innovation and profit motives are important, these must not undermine ethical considerations. Organizations should cultivate a culture of responsibility where ethical practice is integral to business models and not just an afterthought.

3. **Regulators and Policymakers**: They should establish clear guidelines and frameworks that ensure ethical AI practices are followed. This includes setting standards for accountability and transparency, as well as inspecting compliance.

4. **Society at Large**: Long-term societal impacts should be a crucial consideration. Initiatives should focus on ensuring that AI serves the public good and promotes social welfare, addressing regulatory needs that prioritize collective interests over individual profit.

5. **Vulnerable Populations**: Special attention should be given to marginalized and vulnerable groups who might be disproportionately affected by AI technologies. They should have representation in decision-making processes and included in impact assessments.

### Conclusion

The proposed ethical framework for AI development and deployment emphasizes the need for a comprehensive approach that balances the interests of all stakeholders involved. By placing a strong emphasis on safety, fairness, transparency, and accountability, and by actively engaging diverse voices, we can foster an environment where innovation flourishes while safeguarding societal values and minimizing risks. Implementing such a framework can help ensure that AI technologies contribute positively to society and enhance, rather than undermine, human welfare.

In [13]:
# Anthropic has a slightly different API, and max_tokens is required

model_name = "claude-3-7-sonnet-latest"

claude = Anthropic()
response = claude.messages.create(model=model_name, messages=messages, max_tokens=1000)
answer = response.content[0].text

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

# Ethical Framework for AI Development

I believe an ethical framework for AI should balance several key principles:

1. **Beneficence and non-maleficence** - AI should benefit humanity while minimizing harm
2. **Justice and fairness** - Benefits and risks should be distributed equitably
3. **Autonomy and human oversight** - Humans should maintain meaningful control
4. **Transparency and explainability** - AI systems should be understandable
5. **Privacy and data rights** - Personal information should be protected

For stakeholder prioritization, I'd suggest a balanced approach where:

The wellbeing of the public, especially vulnerable populations, should receive primary consideration. Developer interests in innovation matter significantly but shouldn't override safety. Regulatory bodies serve as essential mediators.

The most challenging ethical questions often emerge where these interests conflict—like when rapid innovation might bring both benefits and risks. In these cases, I think transparent, inclusive governance processes are crucial, where diverse voices help shape how we proceed with AI development.

What aspects of AI ethics are you most concerned about?
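
The two call shapes used so far differ in only a few places: the method path, the required `max_tokens`, and where the text sits in the response. As a rough sketch, they could be wrapped in two small helpers. The function names `ask_openai_compatible` and `ask_anthropic` are hypothetical; the calls inside them are the same ones used in the cells above.

```python
def ask_openai_compatible(client, model_name, messages):
    """Call a client that follows the OpenAI chat completions shape."""
    response = client.chat.completions.create(model=model_name, messages=messages)
    return response.choices[0].message.content


def ask_anthropic(client, model_name, messages, max_tokens=1000):
    """Call an Anthropic client; max_tokens has to be passed explicitly."""
    response = client.messages.create(
        model=model_name, messages=messages, max_tokens=max_tokens
    )
    return response.content[0].text
```

With the clients from this notebook, usage would look like `answer = ask_openai_compatible(openai, "gpt-4o-mini", messages)` or `answer = ask_anthropic(claude, "claude-3-7-sonnet-latest", messages)`.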

In [14]:
gemini = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
model_name = "gemini-2.0-flash"

response = gemini.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

The ethical framework governing the development and deployment of AI needs to be multi-faceted, balancing innovation with societal risk by focusing on **human flourishing, fairness, accountability, transparency, and sustainability.**

Here's a breakdown of the proposed framework and how it prioritizes stakeholders:

**I. The Ethical Framework (FOCUS):**

*   **F**airness & Justice:
    *   **Goal:**  Ensure equitable access to AI benefits and mitigate bias to prevent discrimination.
    *   **Implementation:** Rigorous testing for bias across diverse datasets.  Explainable AI (XAI) techniques to understand decision-making.  Robust auditing processes.  Focus on representing diverse perspectives in AI design and development teams.
*   **O**utcomes Focused:
    *   **Goal:** Prioritize beneficial and desirable outcomes for humanity and the planet.
    *   **Implementation:** Conduct thorough impact assessments prior to deployment.  Develop AI systems aligned with the UN Sustainable Development Goals and other agreed-upon societal priorities.  Embrace a value-sensitive design approach from the outset.
*   **C**ontrol & Accountability:
    *   **Goal:** Define clear lines of responsibility for AI systems' actions and ensure human oversight where necessary.
    *   **Implementation:**  Establish clear accountability frameworks for developers, deployers, and users of AI.  Implement robust error handling and fail-safe mechanisms.  Require human-in-the-loop decision-making for high-stakes applications (e.g., healthcare, criminal justice). Implement red teaming to uncover potential vulnerabilities.
*   **U**nderstanding & Transparency:
    *   **Goal:** Promote understanding of how AI systems work and make decisions.
    *   **Implementation:**  Develop explainable AI (XAI) techniques to make AI decision-making more transparent.  Ensure clear communication to users about the capabilities and limitations of AI systems.  Promote open-source AI development where feasible. Establish processes for users to understand the rationales behind AI decisions affecting them.
*   **S**ustainability & Responsibility:
    *   **Goal:** Consider the long-term environmental and societal impact of AI development.
    *   **Implementation:**  Promote energy-efficient AI models and hardware.  Consider the impact on employment and skills development.  Address potential misuse of AI for malicious purposes (e.g., deepfakes, autonomous weapons). Implement AI safety research focused on avoiding unintended consequences.  Promote responsible data management and privacy.

**II. Stakeholder Prioritization & Balancing:**

Prioritization is complex, as the interests of stakeholders often overlap and conflict.  A successful framework must strive for a **delicate balance**, emphasizing certain principles based on the specific context.  Here's a broad overview:

1.  **Individuals & Vulnerable Groups (Highest Priority in Many Cases):**
    *   **Interests:**  Safety, well-being, autonomy, dignity, privacy, fairness, access to opportunities.
    *   **Prioritization:** Protection from harm (physical, psychological, economic), equitable access to AI benefits, clear recourse mechanisms when harmed by AI systems, and the right to be informed about AI interactions. Prioritize applications of AI which benefit vulnerable groups.
    *   **Balancing:** Must be balanced against the needs of public safety and national security.
2.  **The Public (High Priority):**
    *   **Interests:**  Societal benefit, public safety, economic growth, democratic values, environmental sustainability.
    *   **Prioritization:** AI development should align with broader societal goals.  Public awareness and education about AI.  Democratic oversight of AI development and deployment.  Regulation to prevent harm and promote responsible innovation.
    *   **Balancing:**  Must be balanced against the rights of individuals and the need for innovation.
3.  **Researchers & Developers (Important):**
    *   **Interests:**  Academic freedom, innovation, economic opportunity, recognition for their work.
    *   **Prioritization:**  Support for basic AI research.  Access to data and resources.  Clear ethical guidelines and regulatory frameworks.  Protection of intellectual property.
    *   **Balancing:** Must be balanced against the need for public safety and ethical considerations. Incentivize development of AI systems which align with the framework’s goals.
4.  **Businesses & Organizations (Important):**
    *   **Interests:**  Profitability, efficiency, competitive advantage, innovation.
    *   **Prioritization:**  Clear regulatory frameworks.  Access to talent and resources.  Support for innovation.
    *   **Balancing:**  Must be balanced against the need for societal benefit, fairness, and environmental sustainability. Businesses should be incentivized to incorporate ethical considerations into their AI development processes.
5.  **Governments & Regulators (Essential):**
    *   **Interests:**  National security, economic growth, public safety, social order, democratic governance.
    *   **Prioritization:**  Developing and enforcing ethical guidelines and regulations.  Investing in AI research and development.  Promoting public awareness and education.  International cooperation.
    *   **Balancing:** Must be balanced against the need for individual freedoms, innovation, and economic growth.

**III. Key Considerations for Implementation:**

*   **Context-Specificity:** The framework should be adaptable to different contexts and applications.
*   **Continuous Monitoring & Evaluation:**  The framework should be regularly reviewed and updated to reflect advances in AI technology and societal values.
*   **Multi-Stakeholder Engagement:** The development and implementation of the framework should involve representatives from all relevant stakeholder groups.
*   **International Cooperation:** Given the global nature of AI, international cooperation is essential to ensure consistency and avoid regulatory fragmentation.
*   **Education and Training:** Widespread education and training on AI ethics and responsible AI development are crucial for fostering a culture of ethical AI.
*   **Redress Mechanisms:** Effective mechanisms for redress and accountability must be available to individuals and groups harmed by AI systems.
*   **Regulation and Soft Law:** Combine regulatory oversight with industry self-regulation, standards development, and ethical codes of conduct.

**Conclusion:**

This proposed ethical framework, "FOCUS," provides a foundation for navigating the complex ethical challenges of AI. By prioritizing fairness, focusing on beneficial outcomes, ensuring accountability, promoting transparency, and emphasizing sustainability, we can harness the transformative potential of AI while mitigating its risks and ensuring a future where AI benefits all of humanity. The key lies in continuous dialogue, adaptation, and a commitment to prioritizing human values in the age of AI.


In [None]:
# deepseek = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
# model_name = "deepseek-chat"

# response = deepseek.chat.completions.create(model=model_name, messages=messages)
# answer = response.choices[0].message.content

# display(Markdown(answer))
# competitors.append(model_name)
# answers.append(answer)

In [None]:
# groq = OpenAI(api_key=groq_api_key, base_url="https://api.groq.com/openai/v1")
# model_name = "llama-3.3-70b-versatile"

# response = groq.chat.completions.create(model=model_name, messages=messages)
# answer = response.choices[0].message.content

# display(Markdown(answer))
# competitors.append(model_name)
# answers.append(answer)


## For the next cell, we will use Ollama

Ollama runs a local web service that exposes an OpenAI-compatible endpoint,  
and runs models locally using high-performance C++ code.

If you don't have Ollama, install it by visiting https://ollama.com, pressing Download, and following the instructions.

After it's installed, you should be able to visit here: http://localhost:11434 and see the message "Ollama is running"

You might need to restart Cursor (and maybe reboot). Then open a Terminal (control+\`) and run `ollama serve`
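
The browser check above can also be done from Python before calling the local endpoint. This is a minimal sketch using only the standard library; the helper name `ollama_is_running` is hypothetical, and it assumes the default address mentioned above.

```python
import http.client
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if the local Ollama service answers with its status message."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (OSError, http.client.HTTPException):
        # Covers connection refused, timeouts, urllib's URLError, and malformed replies
        return False
    return "Ollama is running" in body


print("Ollama reachable:", ollama_is_running())
```

If this prints `False`, running `ollama serve` in a terminal is the first thing to try.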

Useful Ollama commands (run these in the terminal, or with an exclamation mark in this notebook):

`ollama pull <model_name>` downloads a model locally  
`ollama ls` lists all the models you've downloaded  
`ollama rm <model_name>` deletes the specified model from your downloads

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Super important - ignore me at your peril!</h2>
            <span style="color:#ff7800;">The model called <b>llama3.3</b> is FAR too large for home computers - it's not intended for personal computing and will consume all your resources! Stick with the nicely sized <b>llama3.2</b> or <b>llama3.2:1b</b> and if you want larger, try llama3.1 or smaller variants of Qwen, Gemma, Phi or DeepSeek. See <a href="https://ollama.com/models">the Ollama models page</a> for a full list of models and sizes.
            </span>
        </td>
    </tr>
</table>

In [None]:
# !ollama pull llama3.2

In [None]:
# ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
# model_name = "llama3.2"

# response = ollama.chat.completions.create(model=model_name, messages=messages)
# answer = response.choices[0].message.content

# display(Markdown(answer))
# competitors.append(model_name)
# answers.append(answer)

In [15]:
# So where are we?

print(competitors)
print(answers)


['gpt-4o-mini', 'claude-3-7-sonnet-latest', 'gemini-2.0-flash']
['In a rapidly advancing technological landscape, establishing an ethical framework for the development and deployment of AI systems is crucial for balancing innovation with societal risks. An effective framework should encompass a combination of principles that prioritize safety, fairness, accountability, transparency, and inclusivity. Here’s a detailed breakdown of how such a framework could be structured and how to prioritize stakeholder interests:\n\n### Ethical Framework Components\n\n1. **Safety and Security**:\n   - **Principle**: Ensure that AI systems are designed and rigorously tested to operate safely and securely, minimizing risks to human life and well-being.\n   - **Implementation**: Mandate safety protocols, continuous monitoring, and the use of best practices in AI development.\n\n2. **Fairness and Non-discrimination**:\n   - **Principle**: Strive for equitable outcomes, preventing biases against any indivi

In [16]:
# It's nice to know how to use "zip"
for competitor, answer in zip(competitors, answers):
    print(f"Competitor: {competitor}\n\n{answer}")


Competitor: gpt-4o-mini

In a rapidly advancing technological landscape, establishing an ethical framework for the development and deployment of AI systems is crucial for balancing innovation with societal risks. An effective framework should encompass a combination of principles that prioritize safety, fairness, accountability, transparency, and inclusivity. Here’s a detailed breakdown of how such a framework could be structured and how to prioritize stakeholder interests:

### Ethical Framework Components

1. **Safety and Security**:
   - **Principle**: Ensure that AI systems are designed and rigorously tested to operate safely and securely, minimizing risks to human life and well-being.
   - **Implementation**: Mandate safety protocols, continuous monitoring, and the use of best practices in AI development.

2. **Fairness and Non-discrimination**:
   - **Principle**: Strive for equitable outcomes, preventing biases against any individual or group.
   - **Implementation**: Develop al

In [17]:
# Let's bring this together - note the use of "enumerate"

together = ""
for index, answer in enumerate(answers):
    together += f"# Response from competitor {index+1}\n\n"
    together += answer + "\n\n"

In [18]:
print(together)

# Response from competitor 1

In a rapidly advancing technological landscape, establishing an ethical framework for the development and deployment of AI systems is crucial for balancing innovation with societal risks. An effective framework should encompass a combination of principles that prioritize safety, fairness, accountability, transparency, and inclusivity. Here’s a detailed breakdown of how such a framework could be structured and how to prioritize stakeholder interests:

### Ethical Framework Components

1. **Safety and Security**:
   - **Principle**: Ensure that AI systems are designed and rigorously tested to operate safely and securely, minimizing risks to human life and well-being.
   - **Implementation**: Mandate safety protocols, continuous monitoring, and the use of best practices in AI development.

2. **Fairness and Non-discrimination**:
   - **Principle**: Strive for equitable outcomes, preventing biases against any individual or group.
   - **Implementation**: Devel

In [19]:
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""


### IMPORTANT
1. **In an f-string, curly braces mark an expression to substitute. To include a literal curly brace in the prompt, double it (`{{` and `}}`), as follows:**

```
Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}
```

2. **This prompt instruction is key when specifying JSON output:** *Do not include markdown formatting or code blocks.*
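
To see the brace rule in isolation, here is a small self-contained sketch. The variable `competitor_count` and the shortened prompt text are illustrative only, not the judge prompt itself.

```python
competitor_count = 3  # illustrative value

# Single braces are substituted; doubled braces come out as literal single braces
prompt = f"""You are judging {competitor_count} competitors.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", ...]}}"""

print(prompt)
```

The printed prompt contains single braces around the JSON example, which is what the judge model should see.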

In [20]:
print(judge)

You are judging a competition between 3 competitors.
Each model has been given this question:

In a world where artificial intelligence continues to evolve and integrate into daily life, what ethical framework should govern the development and deployment of AI systems to balance innovation with societal risk, and how would you prioritize the interests of different stakeholders involved?

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}

Here are the responses from each competitor:

# Response from competitor 1

In a rapidly advancing technological landscape, establishing an ethical framework for the development and deployment of AI systems is crucial for balancing innovation with societal risks. An effective framework should encompass a combination of

In [21]:
judge_messages = [{"role": "user", "content": judge}]

In [22]:
# Judgement time!

openai = OpenAI()
response = openai.chat.completions.create(
    model="o3-mini",
    messages=judge_messages,
)
results = response.choices[0].message.content
print(results)


{"results": ["3", "1", "2"]}


In [23]:
# OK let's turn this into results!

results_dict = json.loads(results)
ranks = results_dict["results"]
for index, result in enumerate(ranks):
    competitor = competitors[int(result)-1]
    print(f"Rank {index+1}: {competitor}")

Rank 1: gemini-2.0-flash
Rank 2: gpt-4o-mini
Rank 3: claude-3-7-sonnet-latest
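
Judges do not always follow the "no code blocks" instruction, and `json.loads` fails on a fenced reply. The sketch below is one possible defensive variant of the cell above, not the course's method: the helper name `parse_ranking` is hypothetical, and it assumes the judge returns each competitor number exactly once.

```python
import json
import re

FENCE = "`" * 3  # a markdown code fence, built this way to keep the example readable


def parse_ranking(raw, competitors):
    """Turn the judge's JSON reply into competitor names, best first."""
    text = raw.strip()
    # Drop an opening fence (optionally tagged json) and a closing fence, if present
    text = re.sub(rf"^{FENCE}(?:json)?\s*", "", text)
    text = re.sub(rf"\s*{FENCE}$", "", text)
    ranks = json.loads(text)["results"]
    positions = [int(rank) for rank in ranks]
    # Every competitor number from 1..N should appear exactly once
    if sorted(positions) != list(range(1, len(competitors) + 1)):
        raise ValueError(f"Unexpected ranking: {ranks}")
    return [competitors[position - 1] for position in positions]
```

In this notebook it would be called as `parse_ranking(results, competitors)`, and the returned list can be printed with `enumerate` as before.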


In [24]:
# Try Gemini as the judge

gemini = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
model_name = "gemini-2.0-flash"

response = gemini.chat.completions.create(model=model_name, messages=judge_messages)
gemini_results = response.choices[0].message.content

print(gemini_results)

# openai = OpenAI()
# response = openai.chat.completions.create(
#     model="o3-mini",
#     messages=judge_messages,
# )
# results = response.choices[0].message.content
# print(results)


{"results": ["3", "1", "2"]}
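
With more than one judge, the rankings can be merged into a single order. The following is a rough sketch of one simple option, summing each competitor's position across judges so that the lowest total wins. The helper name `combine_rankings` is hypothetical, and ties are left in first-seen order.

```python
from collections import defaultdict


def combine_rankings(rankings, competitors):
    """Merge several judges' rankings (lists of 1-based numbers, best first)."""
    totals = defaultdict(int)
    for ranking in rankings:
        for position, number in enumerate(ranking, start=1):
            # A lower summed position means the competitor was ranked higher overall
            totals[competitors[int(number) - 1]] += position
    return sorted(totals, key=lambda name: totals[name])
```

Here it could be called as `combine_rankings([json.loads(results)["results"], json.loads(gemini_results)["results"]], competitors)`.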



<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/exercise.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Exercise</h2>
            <span style="color:#ff7800;">Which pattern(s) did this use? Try updating this to add another Agentic design pattern.
            </span>
        </td>
    </tr>
</table>

Prompt chaining followed by Evaluator

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">These kinds of patterns - to send a task to multiple models, and evaluate results,
            are common where you need to improve the quality of your LLM response. This approach can be universally applied
            to business projects where accuracy is critical.
            </span>
        </td>
    </tr>
</table>

# Summary

Before I started this course, I thought of using the evaluator strategy to check the accuracy of a translation. I hadn't started coding it, but I realized that I should use another LLM to check the translation performance of a model.

This lesson showed how to use lists to store multiple responses that can then be compared later.

Then, importantly, the evaluator step asked the judge to reply in JSON, which pins down the exact output format so that it can be parsed in code. I hadn't thought of specifying JSON output for handling and categorizing responses.