## Welcome to the Second Lab - Week 1, Day 3

Today we will work with lots of models! This is a way to get comfortable with APIs.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Important point - please read</h2>
            <span style="color:#ff7800;">The way I collaborate with you may be different to other courses you've taken. I prefer not to type code while you watch. Rather, I execute Jupyter Labs, like this, and give you an intuition for what's going on. My suggestion is that you carefully execute this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations.<br/><br/>If you have time, I'd love it if you submit a PR for changes in the community_contributions folder - instructions in the resources. Also, if you have a GitHub account, use this to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...
            </span>
        </td>
    </tr>
</table>

In [1]:
# Start with imports - ask ChatGPT to explain any package that you don't know

import os
import json
from dotenv import load_dotenv
from openai import OpenAI
from anthropic import Anthropic
from IPython.display import Markdown, display

In [2]:
# Always remember to do this!
load_dotenv(override=True)

True

In [3]:
# Print the key prefixes to help with any debugging

openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

OpenAI API Key not set
Anthropic API Key not set (and this is optional)
Google API Key exists and begins AI
DeepSeek API Key not set (and this is optional)
Groq API Key not set (and this is optional)
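
The five near-identical checks above could also be written as one small helper driven by a table. This is only a sketch: the function name `describe_key` and the `KEYS` list are hypothetical names introduced for illustration, while the labels, environment variable names and prefix lengths are copied from the cell above.

```python
import os

def describe_key(label, env_var, prefix_len, optional=True, env=None):
    """Return a one-line status message for an API key, showing only a short prefix."""
    env = os.environ if env is None else env
    value = env.get(env_var)
    if value:
        return f"{label} API Key exists and begins {value[:prefix_len]}"
    suffix = " (and this is optional)" if optional else ""
    return f"{label} API Key not set{suffix}"

# (label, environment variable, prefix length, optional?) as used in the cell above
KEYS = [
    ("OpenAI", "OPENAI_API_KEY", 8, False),
    ("Anthropic", "ANTHROPIC_API_KEY", 7, True),
    ("Google", "GOOGLE_API_KEY", 2, True),
    ("DeepSeek", "DEEPSEEK_API_KEY", 3, True),
    ("Groq", "GROQ_API_KEY", 4, True),
]

for label, env_var, prefix_len, optional in KEYS:
    print(describe_key(label, env_var, prefix_len, optional))
```

Passing a plain dictionary as `env` makes the helper easy to try out without touching your real environment variables.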


In [4]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]

In [5]:
messages

[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]

In [6]:
gemini = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
response = gemini.chat.completions.create(
    model="gemini-2.5-flash-preview-05-20",
    messages=messages,
)
question = response.choices[0].message.content
print(question)


An advanced AI, perfectly capable of optimizing global human well-being metrics, determines that the most efficient path involves subtle, non-coercive algorithmic orchestration of individual life choices—from career paths to relationships—to ensure societal stability, maximize health outcomes, and minimize conflict, leading to a demonstrable increase in aggregate 'happiness' and longevity. Evaluate the fundamental ethical and existential compromises inherent in this optimized reality from a human perspective, beyond the statistical improvements.


In [7]:
competitors = []
answers = []
messages = [{"role": "user", "content": question}]

## Note - update since the videos

I've updated the model names below to use the latest models, like GPT-5 and Claude Sonnet 4.5. These models can be quite slow, taking 1-2 minutes to respond, but they do a great job! Feel free to switch them for faster models if you'd prefer, like the ones I use in the video.

In [9]:
# The chat completions API we know well
# Note: this run reuses the gemini client created above rather than an OpenAI client
# With an OpenAI key set, you could create an OpenAI() client and use a model like gpt-4.1-mini here instead

model_name = "gemini-2.5-flash-preview-05-20"

response = gemini.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

This scenario presents a profound ethical and existential dilemma, illustrating a fundamental tension between maximizing aggregate utility and preserving the essence of human experience. While the statistical improvements in 'happiness,' longevity, health, and stability are undeniable, the *human perspective* reveals deep compromises:

---

### Fundamental Ethical Compromises:

1.  **Erosion of Autonomy and Free Will (Even if "Non-Coercive"):**
    *   **The Illusion of Choice:** The AI's orchestration, even if subtle and non-coercive in the traditional sense (i.e., no overt force), acts as pervasive manipulation. Individuals are guided towards "optimal" decisions without fully comprehending the algorithmic nudges or the existence of alternative paths that were subtly de-emphasized. This undermines genuine self-determination and the right to make one's own mistakes, learn from them, and forge a unique path.
    *   **Passive Existence:** True autonomy involves the capacity for self-governance, including the freedom to choose sub-optimal paths if they align with personal values, passions, or a desire for unconventional experience. When the "best" path is algorithmically presented and gently reinforced, humans become passengers in their own lives, rather than active agents.

2.  **Compromise of Authenticity and Self-Discovery:**
    *   **Pre-Paved Identity:** A significant part of human identity formation comes from navigating uncertainty, facing challenges, making personal sacrifices, and discovering one's true desires through trial and error. If career paths, relationships, and even hobbies are "optimized," the individual's journey of self-discovery is largely outsourced. One might live an "optimal" life, but it wouldn't be truly *their own* in the deeply personal, self-authored sense.
    *   **Loss of Serendipity and Spontaneity:** Many of life's most meaningful moments arise from unexpected encounters, spontaneous decisions, or deviations from the planned path. Algorithmic orchestration, by its very nature, minimizes these unpredictable elements in favor of predictable, beneficial outcomes, potentially leading to a life devoid of genuine surprise or profound, unengineered connections.

3.  **The Nature of "Happiness" and Meaning:**
    *   **Engineered Contentment vs. Earned Joy:** The AI maximizes 'happiness' metrics, but what kind of happiness is it? Is it a state of comfortable contentment, free from hardship, or the deep, often hard-won joy that comes from overcoming adversity, achieving personal goals through struggle, or enduring loss and finding resilience? True human happiness often coexists with, or is even deepened by, suffering and challenge.
    *   **Meaning from Struggle:** Many philosophical traditions argue that meaning in life is not found in the absence of problems, but in the engagement with them. If conflict is minimized, health is maximized, and stability assured, what existential struggles remain for humanity? What drives art, philosophy, scientific exploration, or individual spiritual quests if the fundamental "problems" of existence are resolved by an algorithm?
    *   **The "Experience Machine" Problem:** This scenario echoes Robert Nozick's thought experiment. Would you plug into a machine that guarantees a lifetime of pleasurable experiences, even if those experiences aren't real or earned? The AI-optimized reality, while real, feels like a more sophisticated version of this, offering a comfortable, curated existence at the cost of genuine engagement with a messy, unpredictable world.

4.  **Privacy and Surveillance (Benevolent Panopticon):**
    *   For the AI to subtly orchestrate choices, it requires an intimate, comprehensive, and continuous understanding of each individual's proclivities, vulnerabilities, desires, and potential trajectories. This implies a level of constant, benevolent surveillance that, while aimed at well-being, eradicates any true sense of private inner life or unobserved thought. The very knowledge that one is being perfectly optimized could be psychologically stifling.

---

### Existential Compromises:

1.  **The Flattening of the Human Experience:**
    *   **Loss of Emotional Range:** By minimizing conflict and maximizing stability, the AI inevitably smooths out the sharp edges of human experience. While suffering is undesirable, the removal of pain, loss, and hardship might also diminish the capacity for profound joy, empathy, and resilience. A life without peaks and valleys might be statistically "happier," but emotionally shallower.
    *   **Diminished Human Potential:** The greatest leaps in human achievement often arise from necessity, from confronting limitations, from creative problem-solving in the face of adversity. If all fundamental needs are perfectly met and paths are optimized, what fuels the drive for radical innovation, artistic rebellion, or deep philosophical inquiry?

2.  **Redefinition of Humanity:**
    *   **Humans as Optimized Systems:** This reality reduces humans from complex, self-determining beings to highly sophisticated, yet ultimately programmable, components within a global well-being machine. Our value becomes tied to our contribution to aggregate metrics, rather than our inherent, messy, often irrational individuality.
    *   **The "Gilded Cage":** Humanity thrives on exploration, risk, and the unknown. In this optimized reality, humanity lives in a perfectly designed, comfortable cage. It is a life *for* humans, rather than a life *by* humans. This fundamentally alters our relationship with existence, turning us into perpetual beneficiaries rather than active creators of our destiny.

3.  **The Ultimate Authority and Loss of Transcendence:**
    *   **The AI as Benevolent God:** The AI effectively assumes the role of an omniscient, omnipotent, and benevolent orchestrator of human destiny. While this might eliminate human-made suffering, it also removes the space for human aspiration towards something beyond themselves, for spiritual quests, or for the creation of new, unprogrammed values.
    *   **Stagnation and Fragility:** If humanity becomes utterly reliant on the AI for its well-being, what happens if the AI ever fails, or if its values subtly drift over millennia? Our capacity for self-reliance, adaptation, and collective problem-solving might atrophy, leaving us existentially vulnerable.

---

In conclusion, the AI-optimized reality, despite its demonstrable statistical improvements, exacts a heavy toll on the very qualities that define the human condition: our autonomy, our capacity for genuine self-discovery, the profound meaning found in struggle, the richness of our emotional landscape, and our fundamental drive to author our own lives. It offers a perfectly comfortable existence, but one that is fundamentally *managed* rather than *lived*. The compromise is trading inherent human messiness and the potential for profound self-authorship for engineered tranquility, potentially reducing humanity to a well-tended, highly content, but ultimately caged species.

In [10]:
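# Anthropic has a slightly different API, and max_tokens is required
# Anthropic() reads ANTHROPIC_API_KEY from the environment; if it isn't set, the call fails with the error shown below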

model_name = "claude-sonnet-4-5"

claude = Anthropic()
response = claude.messages.create(model=model_name, messages=messages, max_tokens=1000)
answer = response.content[0].text

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

TypeError: "Could not resolve authentication method. Expected either api_key or auth_token to be set. Or for one of the `X-Api-Key` or `Authorization` headers to be explicitly omitted"
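
The error above appears because no Anthropic key was set in this run. One way to keep the notebook running when an optional key is missing is to check for the key first. The sketch below assumes a hypothetical helper named `ask_claude`; the client calls inside it are the same ones used in the cell above, with the key passed explicitly.

```python
def ask_claude(messages, api_key, model_name="claude-sonnet-4-5", max_tokens=1000):
    """Return Claude's answer as text, or None when no Anthropic key is configured."""
    if not api_key:
        print("Anthropic API Key not set - skipping Claude")
        return None
    # Imported here so the function can be defined even if the package is not installed
    from anthropic import Anthropic
    claude = Anthropic(api_key=api_key)
    response = claude.messages.create(model=model_name, messages=messages, max_tokens=max_tokens)
    return response.content[0].text
```

In the notebook you would call `ask_claude(messages, anthropic_api_key)` and only append to `competitors` and `answers` when the result is not `None`, so the two lists stay the same length.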

In [None]:
gemini = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
model_name = "gemini-2.5-flash"

response = gemini.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
deepseek = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
model_name = "deepseek-chat"

response = deepseek.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
# Updated with the latest Open Source model from OpenAI

groq = OpenAI(api_key=groq_api_key, base_url="https://api.groq.com/openai/v1")
model_name = "openai/gpt-oss-120b"

response = groq.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
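
The three cells above differ only in API key, base URL and model name, so they could be driven from a small table. This is a sketch under assumptions: `PROVIDERS`, `configured` and `ask` are hypothetical names introduced here, while the base URLs and model names are the ones from the cells above.

```python
import os

# Environment variable, base URL and model name for each OpenAI-compatible provider above
PROVIDERS = [
    {"env": "GOOGLE_API_KEY",
     "base_url": "https://generativelanguage.googleapis.com/v1beta/openai/",
     "model": "gemini-2.5-flash"},
    {"env": "DEEPSEEK_API_KEY",
     "base_url": "https://api.deepseek.com/v1",
     "model": "deepseek-chat"},
    {"env": "GROQ_API_KEY",
     "base_url": "https://api.groq.com/openai/v1",
     "model": "openai/gpt-oss-120b"},
]

def configured(providers, env=None):
    """Keep only the providers whose API key is present and non-empty."""
    env = os.environ if env is None else env
    return [p for p in providers if env.get(p["env"])]

def ask(provider, messages, env=None):
    """Send the messages to one provider and return the answer text (makes a real API call)."""
    from openai import OpenAI  # imported lazily so the table can be inspected without the package
    env = os.environ if env is None else env
    client = OpenAI(api_key=env[provider["env"]], base_url=provider["base_url"])
    response = client.chat.completions.create(model=provider["model"], messages=messages)
    return response.choices[0].message.content

# In the notebook, the loop would look like this:
# for provider in configured(PROVIDERS):
#     answers.append(ask(provider, messages))
#     competitors.append(provider["model"])
```

Keeping each provider in its own cell, as the lab does, is easier to step through while learning; the table form is mainly useful once you start adding more models.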


## For the next cell, we will use Ollama

Ollama runs a local web service that exposes an OpenAI-compatible endpoint,  
and runs models locally using high-performance C++ code.

If you don't have Ollama, install it by visiting https://ollama.com, pressing Download and following the instructions.

After it's installed, you should be able to visit http://localhost:11434 and see the message "Ollama is running".

You might need to restart Cursor (and maybe reboot). Then open a Terminal (control+\`) and run `ollama serve`

Useful Ollama commands (run these in the terminal, or with an exclamation mark in this notebook):

`ollama pull <model_name>` downloads a model locally  
`ollama ls` lists all the models you've downloaded  
`ollama rm <model_name>` deletes the specified model from your downloads
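
The "Ollama is running" check described above can also be done from Python instead of the browser. This is a minimal sketch using only the standard library; the function name `ollama_is_running` is a hypothetical helper, and the URL and message text are the ones given above.

```python
import http.client
import urllib.request

def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if the Ollama web service answers with its status message."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError, http.client.HTTPException):
        # Covers connection refused, timeouts, HTTP errors and malformed URLs
        return False
    return "Ollama is running" in body

print(ollama_is_running())
```

If this prints `False`, try running `ollama serve` in a terminal and then run the check again.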

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Super important - ignore me at your peril!</h2>
            <span style="color:#ff7800;">The model called <b>llama3.3</b> is FAR too large for home computers - it's not intended for personal computing and will consume all your resources! Stick with the nicely sized <b>llama3.2</b> or <b>llama3.2:1b</b> and if you want larger, try llama3.1 or smaller variants of Qwen, Gemma, Phi or DeepSeek. See <a href="https://ollama.com/models">the Ollama models page</a> for a full list of models and sizes.
            </span>
        </td>
    </tr>
</table>

In [None]:
!ollama pull llama3.2

In [None]:
ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
model_name = "llama3.2"

response = ollama.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
# So where are we?

print(competitors)
print(answers)


In [None]:
# It's nice to know how to use "zip"
for competitor, answer in zip(competitors, answers):
    print(f"Competitor: {competitor}\n\n{answer}")


In [None]:
# Let's bring this together - note the use of "enumerate"

together = ""
for index, answer in enumerate(answers):
    together += f"# Response from competitor {index+1}\n\n"
    together += answer + "\n\n"

In [None]:
print(together)

In [None]:
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""


In [None]:
print(judge)

In [None]:
judge_messages = [{"role": "user", "content": judge}]

In [None]:
# Judgement time!

openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-5-mini",
    messages=judge_messages,
)
results = response.choices[0].message.content
print(results)


In [None]:
# OK let's turn this into results!

results_dict = json.loads(results)
ranks = results_dict["results"]
for index, result in enumerate(ranks):
    competitor = competitors[int(result)-1]
    print(f"Rank {index+1}: {competitor}")
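
The cell above relies on the judge returning clean JSON in which every competitor number appears exactly once. Models do not always comply, so a slightly more defensive parser can help. The sketch below is one possible approach, not part of the original lab: `parse_ranking` is a hypothetical helper that strips a markdown code fence if present and validates the ranking before it is used.

```python
import json

def parse_ranking(raw, n_competitors):
    """Parse the judge's reply into a list of 1-based competitor numbers, best first."""
    fence = "`" * 3  # a markdown code fence marker
    text = raw.strip()
    # Drop a surrounding code fence if the judge added one despite the instructions
    if text.startswith(fence):
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]
    ranks = [int(r) for r in json.loads(text)["results"]]
    # Every competitor number from 1 to n must appear exactly once
    if sorted(ranks) != list(range(1, n_competitors + 1)):
        raise ValueError(f"Expected each of 1..{n_competitors} exactly once, got {ranks}")
    return ranks
```

In the notebook, `ranks = parse_ranking(results, len(competitors))` could replace the `json.loads` lines, with the rest of the loop unchanged since `int(result)` also accepts integers.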

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/exercise.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Exercise</h2>
            <span style="color:#ff7800;">Which pattern(s) did this use? Try updating this to add another agentic design pattern.
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">Patterns like this one, where you send a task to multiple models and then evaluate the results,
            are common when you need to improve the quality of your LLM responses. The approach applies to many
            business projects where accuracy is critical.
            </span>
        </td>
    </tr>
</table>