## Welcome to the Second Lab - Week 1, Day 3

Today we will work with lots of models! This is a way to get comfortable with APIs.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Important point - please read</h2>
            <span style="color:#ff7800;">The way I collaborate with you may be different to other courses you've taken. I prefer not to type code while you watch. Rather, I execute Jupyter Labs, like this, and give you an intuition for what's going on. My suggestion is that you carefully execute this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations.<br/><br/>If you have time, I'd love it if you submit a PR for changes in the community_contributions folder - instructions in the resources. Also, if you have a GitHub account, use it to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...
            </span>
        </td>
    </tr>
</table>

In [1]:
# Start with imports - ask ChatGPT to explain any package that you don't know

import os
import json
from dotenv import load_dotenv
from openai import OpenAI
from IPython.display import Markdown, display

In [2]:
# Always remember to do this!
load_dotenv(override=True)

True

In [3]:
# Print the key prefixes to help with any debugging

openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

OpenAI API Key not set
Anthropic API Key not set (and this is optional)
Google API Key not set (and this is optional)
DeepSeek API Key exists and begins sk-
Groq API Key not set (and this is optional)
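
The five near-identical `if`/`else` blocks above can also be written as one loop over a small table. This is only a sketch: the `KEYS` table and the `describe_key` helper are hypothetical names introduced here, and the prefix lengths are copied from the cell above.

```python
import os

# Hypothetical table: (environment variable, label, prefix length to show, optional?)
KEYS = [
    ("OPENAI_API_KEY", "OpenAI", 8, False),
    ("ANTHROPIC_API_KEY", "Anthropic", 7, True),
    ("GOOGLE_API_KEY", "Google", 2, True),
    ("DEEPSEEK_API_KEY", "DeepSeek", 3, True),
    ("GROQ_API_KEY", "Groq", 4, True),
]

def describe_key(env_var, label, prefix_len, optional, env=os.environ):
    """Return the status line for one key, showing only a short prefix of its value."""
    value = env.get(env_var)
    if value:
        return f"{label} API Key exists and begins {value[:prefix_len]}"
    suffix = " (and this is optional)" if optional else ""
    return f"{label} API Key not set{suffix}"

for env_var, label, prefix_len, optional in KEYS:
    print(describe_key(env_var, label, prefix_len, optional))
```

Passing `env` explicitly makes the helper easy to try with a plain dictionary instead of the real environment.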


In [4]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]

In [5]:
messages

[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]

In [8]:
client = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=messages,
)
question = response.choices[0].message.content
print(question)


"Considering the interplay between Gödel's incompleteness theorems, Turing's halting problem, and the Chinese Room argument, how might these concepts collectively inform or challenge the theoretical limits of artificial general intelligence (AGI), particularly in terms of self-awareness, understanding, and the ability to resolve undecidable problems?"


In [9]:
competitors = []
answers = []
messages = [{"role": "user", "content": question}]

In [11]:
# The API we know well

model_name = "deepseek-chat"

response = client.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

The interplay between **Gödel's incompleteness theorems**, **Turing's halting problem**, and the **Chinese Room argument** presents profound challenges and insights for the theoretical limits of **Artificial General Intelligence (AGI)**, particularly concerning **self-awareness, understanding, and the resolution of undecidable problems**. Let’s break down how these concepts interact and what they imply for AGI:

### 1. **Gödel’s Incompleteness Theorems and Formal Systems**
   - **First Incompleteness Theorem**: Any sufficiently powerful formal system (e.g., arithmetic) capable of expressing basic arithmetic will contain true statements that cannot be proven within the system.
   - **Second Incompleteness Theorem**: Such a system cannot prove its own consistency.
   - **Implications for AGI**:
     - If AGI operates within a formal computational framework, it may encounter truths it cannot prove or problems it cannot resolve (e.g., self-referential statements about its own limitations).
     - This suggests that AGI, like human mathematicians, may have to accept unprovable truths or rely on meta-reasoning outside its base formal system.
     - **Self-awareness**: If AGI attempts to model its own reasoning, it may run into Gödelian limitations—it cannot fully "understand" or prove all aspects of its own operation.

### 2. **Turing’s Halting Problem and Computability**
   - The halting problem shows that no general algorithm can determine whether an arbitrary program will halt or run forever.
   - **Implications for AGI**:
     - AGI, as a computational system, cannot solve all well-defined problems (e.g., predicting its own halting behavior in all cases).
     - This imposes a fundamental limit on AGI’s ability to "know" the outcome of all possible computations, including its own future states.
     - **Undecidable problems**: AGI may recognize undecidable problems but cannot resolve them algorithmically. It might need heuristic or probabilistic approaches (like humans do).

### 3. **Chinese Room Argument and Understanding**
   - Searle’s thought experiment argues that syntactic manipulation (like a program following rules) does not entail semantic understanding or consciousness.
   - **Implications for AGI**:
     - Even if AGI can simulate understanding (e.g., pass the Turing Test), it may lack genuine intentionality or qualia.
     - This challenges whether AGI can ever be truly "self-aware" or "understand" in the way humans do, rather than just executing complex symbol processing.
     - **Understanding vs. Simulation**: AGI might replicate human-like behavior without true comprehension, raising philosophical questions about the nature of intelligence.

### **Collective Implications for AGI**
1. **Theoretical Limits**:
   - AGI, as a formal computational system, is subject to Gödelian incompleteness and Turing-undecidability. It cannot resolve all possible problems or fully "know itself."
   - This suggests that AGI will have inherent limitations in reasoning about its own existence, consistency, and certain classes of problems.

2. **Self-Awareness and Understanding**:
   - If self-awareness requires a system to fully model its own reasoning, Gödel’s theorems imply that AGI cannot achieve complete self-knowledge within its own framework.
   - The Chinese Room suggests that even if AGI appears self-aware, it may lack genuine understanding or consciousness.

3. **Handling Undecidable Problems**:
   - AGI might adopt strategies like human mathematicians: accepting unprovable axioms, using intuition, or working in extended meta-systems.
   - However, unlike humans, AGI lacks biological grounding, which may limit its ability to "intuit" solutions outside formal logic.

### **Potential Counterarguments or Mitigations**
- **Hypercomputation**: Some theorists propose models beyond Turing machines (e.g., oracle machines) that could solve undecidable problems, but these are not physically realizable with known physics.
- **Emergent Phenomena**: Consciousness and understanding might emerge from complexity, even if the system is syntactically based (a rebuttal to the Chinese Room).
- **Hybrid Systems**: AGI might combine formal reasoning with embodied, interactive learning to approximate understanding.

### **Conclusion**
Collectively, these concepts suggest that AGI will face fundamental limits in **self-awareness, understanding, and problem-solving** due to incompleteness, uncomputability, and the semantic gap between syntax and meaning. While AGI could surpass human performance in many domains, it may never achieve "true" understanding or resolve all undecidable problems. This does not preclude AGI from being immensely powerful, but it implies that its intelligence will be bounded in ways analogous to (or even stricter than) human cognition. The philosophical debate over whether these limitations matter for practical AGI remains open.

In [12]:
deepseek = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
model_name = "deepseek-reasoner"

response = deepseek.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

The interplay between Gödel's incompleteness theorems, Turing's halting problem, and Searle's Chinese Room argument presents a profound challenge to foundational assumptions about AGI, particularly concerning self-awareness, genuine understanding, and the ability to transcend computational limits. Here's how they collectively inform the theoretical boundaries:

1.  **Gödel's Incompleteness Theorems: The Unavoidable Gaps in Formal Systems**
    *   **Core Insight:** Any consistent, sufficiently powerful formal system (capable of expressing basic arithmetic) contains true statements that cannot be proven within the system itself. The system cannot prove its own consistency.
    *   **AGI Implications:**
        *   **Limits of Self-Verification:** An AGI based on a formal system (like any complex computational logic) could *never* prove its own internal consistency or completeness. It would inherently contain truths it cannot recognize or prove using its own rules. This challenges notions of AGI achieving perfect self-knowledge or logical infallibility.
        *   **Undecidable Problems Inherent:** Within its own representational framework, the AGI would encounter problems it *cannot* decide – true statements it cannot prove and false statements it cannot refute. This is an intrinsic limitation, not a lack of computational power.
        *   **Potential for "Gödelian Self-Awareness"?** Some argue (e.g., Roger Penrose) that human mathematicians *can* see the truth of Gödel sentences, suggesting human cognition isn't purely computational. If AGI *is* purely formal/computational, Gödel implies it could never achieve this type of meta-cognitive insight into its own limitations in the same way humans seemingly can. Its "self-awareness" might be descriptive (reporting internal states) but not foundational (proving its own consistency).

2.  **Turing's Halting Problem: The Uncomputable Core**
    *   **Core Insight:** There is no general algorithm that can decide, for *any* arbitrary program and input, whether that program will halt (finish running) or run forever.
    *   **AGI Implications:**
        *   **Fundamental Barrier to Problem Solving:** An AGI, being a computational system (Turing Machine equivalent), *cannot* solve all well-defined problems. The halting problem exemplifies a class of "undecidable problems" that are fundamentally unsolvable by *any* algorithm, regardless of the AGI's speed or memory. This sets an absolute theoretical limit on AGI's problem-solving capacity.
        *   **Limits of Prediction & Self-Analysis:** An AGI cannot perfectly predict the behavior of *all* possible programs, including potentially modified or future versions of *itself*. This constrains its ability to guarantee safety or predict all consequences of its own actions in complex environments.
        *   **No Universal Solver:** The dream of an AGI that can resolve *any* problem posed to it is mathematically impossible. Undecidability is woven into the fabric of computation.

3.  **Searle's Chinese Room Argument: Syntax vs. Semantics**
    *   **Core Insight:** Merely manipulating symbols according to formal rules (syntax) is insufficient for genuine understanding or the possession of mental states (semantics). A system could exhibit perfect external behavior (passing the Turing Test) without any internal understanding.
    *   **AGI Implications:**
        *   **The Hard Problem of AGI Understanding:** This directly challenges whether an AGI, no matter how sophisticated its symbol manipulation or behavioral output, could ever possess *real* understanding, meaning, qualia, or intentionality. It might *simulate* understanding perfectly without *being* understanding. This strikes at the heart of claims about AGI achieving true consciousness or self-awareness.
        *   **Limitation of Functional Equivalence:** Passing behavioral tests (like the Turing Test) does not guarantee the presence of internal subjective experience or comprehension. AGI might be an "understanding zombie."
        *   **Relevance to Gödel & Turing:** Searle argues that Gödel and Turing show the limitations of *formal symbol manipulation*, but the Chinese Room emphasizes that even *successful* formal symbol manipulation (which can handle syntax to produce correct outputs, like answering questions) *still* lacks semantics. AGI might hit Gödel/Turing limits *and* lack understanding.

**Collective Challenge to AGI:**

*   **Self-Awareness:** Gödel suggests fundamental limits to self-verification and meta-cognition within a formal system. Turing limits perfect self-prediction. Searle questions whether any computational process, even one reporting "self-awareness," constitutes genuine subjective experience. Collectively, they suggest AGI "self-awareness" might be:
    *   Limited (cannot prove own consistency).
    *   Descriptive, not foundational.
    *   Potentially just complex self-monitoring without true subjective qualia.
*   **Understanding:** Searle provides the starkest challenge, arguing computation is inherently syntactic and cannot generate semantics. Gödel and Turing show the limits of formal systems/computation, but Searle argues that *even within those limits*, or even if those limits were overcome, understanding wouldn't *necessarily* emerge. AGI might perfectly process information and solve decidable problems without "grokking" anything.
*   **Resolving Undecidable Problems:** Gödel and Turing establish that undecidable problems are inherent features of sufficiently powerful formal systems and computation. An AGI, if computational, *cannot* resolve these problems by definition. Claims of AGI "transcending" these limits would require:
    *   **Non-Computationalism:** The AGI utilizing processes fundamentally outside the realm of Turing computation (e.g., quantum effects with unique causal powers, or something entirely unknown). This is highly speculative.
    *   **Changing the System:** An AGI could potentially *create* a new, more powerful formal system in which a previously undecidable statement *becomes* decidable. However, *this new system* would itself contain *new* Gödelian undecidable statements. The problem is kicked up a level, not solved absolutely. It cannot create a system that is both consistent and complete.

**Conclusion:**

Together, these concepts paint a picture of significant theoretical constraints for purely computational AGI:

1.  **Inevitable Limits:** AGI will face fundamentally unsolvable problems (Gödel, Turing).
2.  **Self-Knowledge Boundaries:** It cannot achieve perfect self-verification or foundational self-consistency proof (Gödel).
3.  **The Understanding Chasm:** Successfully processing information and generating behavior does not guarantee genuine understanding, meaning, or consciousness (Searle).
4.  **No Magic Escape:** Overcoming these limits likely requires capabilities fundamentally beyond known computation (Turing) and formal systems (Gödel), and even then, Searle questions if understanding emerges.

They do *not* prove AGI is impossible, but they strongly suggest that an AGI based solely on classical computation and formal symbol manipulation:
*   Will have inherent limitations in what it can decide and prove.
*   May never achieve genuine understanding or subjective experience in the way humans do.
*   Cannot absolutely resolve all types of problems.

These concepts force AGI researchers to confront deep philosophical and mathematical boundaries. Progress might involve embracing these limits, exploring non-classical computational models (with no guarantee of overcoming Searle), or radically redefining what we mean by "intelligence," "understanding," and "awareness" in an artificial context. The burden of proof lies heavily on those claiming AGI can transcend these foundational limitations.

## For the next cell, we will use Ollama

Ollama runs a local web service that exposes an OpenAI-compatible endpoint,  
and runs models locally using high-performance C++ code.

If you don't have Ollama, install it by visiting https://ollama.com, pressing Download, and following the instructions.

After it's installed, you should be able to visit http://localhost:11434 and see the message "Ollama is running".

You might need to restart Cursor (and maybe reboot). Then open a Terminal (control+\`) and run `ollama serve`
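
If you would rather check from Python than from a browser, something like the following might work. It is a minimal sketch: `ollama_is_running` is a hypothetical helper name, and it only tests whether the local address replies with HTTP 200, not which models are installed.

```python
import urllib.request

def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if something answers at base_url with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as reply:
            return reply.status == 200
    except OSError:
        # Covers refused connections and timeouts (URLError subclasses OSError).
        return False

if ollama_is_running():
    print("Ollama is reachable")
else:
    print("Ollama is not reachable - try running `ollama serve`")
```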

Useful Ollama commands (run these in the terminal, or with an exclamation mark in this notebook):

`ollama pull <model_name>` downloads a model locally  
`ollama ls` lists all the models you've downloaded  
`ollama rm <model_name>` deletes the specified model from your downloads

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Super important - ignore me at your peril!</h2>
            <span style="color:#ff7800;">The model called <b>llama3.3</b> is FAR too large for home computers - it's not intended for personal computing and will consume all your resources! Stick with the nicely sized <b>llama3.2</b> or <b>llama3.2:1b</b> and if you want larger, try llama3.1 or smaller variants of Qwen, Gemma, Phi or DeepSeek. See <a href="https://ollama.com/models">the Ollama models page</a> for a full list of models and sizes.
            </span>
        </td>
    </tr>
</table>

In [13]:
!ollama pull llama3.2

pulling manifest
pulling dde5aa3fc5ff:   0% ▕                  ▏ 1.1 MB/2.0 GB

In [14]:
ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
model_name = "llama3.2"

response = ollama.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

The interplay between Gödel's incompleteness theorems, Turing's halting problem, and the Chinese Room argument collectively offer insights into the theoretical limits of artificial general intelligence (AGI), particularly in terms of self-awareness, understanding, and resolving undecidable problems. Here's how these concepts can inform or challenge our understanding of AGI:

Gödel's Incompleteness Theorems and Self-Awareness:

1. **Incongruence between formal systems**: Gödel's incompleteness theorems demonstrate that any sufficiently powerful formal system is either incomplete or inconsistent. This implies that there may be limits to a computer program's ability to fully understand its internal workings, leading to questions about self-awareness.
2. **Limitations of internal models**: The incompleteness theorems suggest that even if an AGI system uses internal models to simulate human-like intelligence, it may never truly understand itself or its environment.

Turing's Halting Problem and Decidability:

1. **Limits of computation**: Turing's halting problem shows that there are undecidable problems, which cannot be solved efficiently by any computer algorithm. This means that even the most advanced AGI systems may struggle to resolve indeterminate problems.
2. **Potential limitations for knowledge representation**: The halting problem implies that it is impossible to design a formal system that can systematically evaluate all possible states of an environment, leading to questions about the limits of knowledge representation in AGI.

Chinese Room Argument and Intelligence:

1. **The concept of "understanding"**: The Chinese Room argument challenges the idea of truly understanding language or intelligence by suggesting that even humans might not necessarily comprehend arbitrary linguistic rules.
2. **Insights into language processing**: The Chinese Room paradox highlights the difficulty in creating a true understanding of human-like language, which is fundamental to many AGI approaches.

Combining these Concepts:

1. **AGI limitations due to logical limits**: Combining Gödel's incompleteness theorems and Turing's halting problem suggests that there may be fundamental limits to an AGI system's ability to solve certain problems or reason about its internal workings.
2. **Potential issues with human-like intelligence**: The Chinese Room argument, in conjunction with these other concepts, implies that true self-awareness and understanding might be impossible for any AI.

Implications:

1. **Resolving indeterminate problems as the primary challenge**: AGI researchers should recognize that resolving indeterminate problems may be intractable even for potentially powerful AGI systems.
2. **Focusing on partial results rather than complete understanding**: Given these inherent limitations, approaches such as meta-cognition, decision-making under uncertainty, or practical problem-solving might prove more effective than striving for all-encompassing self-awareness.

However, it's essential to note that:

1. **AGI researchers continue exploring limits of computation and programming**.
2. **Different approaches may still prove valuable**: Such as leveraging approximation using probability theory and heuristic search algorithms when dealing with uncertain knowledge or intractable problems.

The collective analysis of these concepts suggests that self-awareness, understanding, and resolving undecidable problems are substantial challenges for AGI development.
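
The three model cells above repeat the same steps: call the client, pull out the text, and append to `competitors` and `answers`. One way to factor that out is sketched below. `ask_and_record` is a hypothetical helper, and the stand-in client exists only so the sketch runs offline; it mimics just the attributes the helper touches.

```python
from types import SimpleNamespace

def ask_and_record(client, model_name, messages, competitors, answers):
    """Call an OpenAI-compatible client, then record the model name and its answer."""
    response = client.chat.completions.create(model=model_name, messages=messages)
    answer = response.choices[0].message.content
    competitors.append(model_name)
    answers.append(answer)
    return answer

# Stand-in for a real client (an assumption for this demo, not part of the openai package).
def fake_create(model, messages):
    reply = SimpleNamespace(content=f"{model} saw {len(messages)} message(s)")
    return SimpleNamespace(choices=[SimpleNamespace(message=reply)])

fake_client = SimpleNamespace(
    chat=SimpleNamespace(completions=SimpleNamespace(create=fake_create))
)

demo_competitors, demo_answers = [], []
ask_and_record(fake_client, "demo-model", [{"role": "user", "content": "Hi"}],
               demo_competitors, demo_answers)
print(demo_competitors, demo_answers)
```

With the real clients from this notebook, a call would look like `ask_and_record(ollama, "llama3.2", messages, competitors, answers)`, followed by `display(Markdown(...))` on the returned text if you want it rendered.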

In [15]:
# So where are we?

print(competitors)
print(answers)


['deepseek-chat', 'deepseek-reasoner', 'llama3.2']
['The interplay between **Gödel\'s incompleteness theorems**, **Turing\'s halting problem**, and the **Chinese Room argument** presents profound challenges and insights for the theoretical limits of **Artificial General Intelligence (AGI)**, particularly concerning **self-awareness, understanding, and the resolution of undecidable problems**. Let’s break down how these concepts interact and what they imply for AGI:\n\n### 1. **Gödel’s Incompleteness Theorems and Formal Systems**\n   - **First Incompleteness Theorem**: Any sufficiently powerful formal system (e.g., arithmetic) capable of expressing basic arithmetic will contain true statements that cannot be proven within the system.\n   - **Second Incompleteness Theorem**: Such a system cannot prove its own consistency.\n   - **Implications for AGI**:\n     - If AGI operates within a formal computational framework, it may encounter truths it cannot prove or problems it cannot resolve (

In [16]:
# It's nice to know how to use "zip"
for competitor, answer in zip(competitors, answers):
    print(f"Competitor: {competitor}\n\n{answer}")


Competitor: deepseek-chat

The interplay between **Gödel's incompleteness theorems**, **Turing's halting problem**, and the **Chinese Room argument** presents profound challenges and insights for the theoretical limits of **Artificial General Intelligence (AGI)**, particularly concerning **self-awareness, understanding, and the resolution of undecidable problems**. Let’s break down how these concepts interact and what they imply for AGI:

### 1. **Gödel’s Incompleteness Theorems and Formal Systems**
   - **First Incompleteness Theorem**: Any sufficiently powerful formal system (e.g., arithmetic) capable of expressing basic arithmetic will contain true statements that cannot be proven within the system.
   - **Second Incompleteness Theorem**: Such a system cannot prove its own consistency.
   - **Implications for AGI**:
     - If AGI operates within a formal computational framework, it may encounter truths it cannot prove or problems it cannot resolve (e.g., self-referential statements 

In [17]:
# Let's bring this together - note the use of "enumerate"

together = ""
for index, answer in enumerate(answers):
    together += f"# Response from competitor {index+1}\n\n"
    together += answer + "\n\n"

In [18]:
print(together)

# Response from competitor 1

The interplay between **Gödel's incompleteness theorems**, **Turing's halting problem**, and the **Chinese Room argument** presents profound challenges and insights for the theoretical limits of **Artificial General Intelligence (AGI)**, particularly concerning **self-awareness, understanding, and the resolution of undecidable problems**. Let’s break down how these concepts interact and what they imply for AGI:

### 1. **Gödel’s Incompleteness Theorems and Formal Systems**
   - **First Incompleteness Theorem**: Any sufficiently powerful formal system (e.g., arithmetic) capable of expressing basic arithmetic will contain true statements that cannot be proven within the system.
   - **Second Incompleteness Theorem**: Such a system cannot prove its own consistency.
   - **Implications for AGI**:
     - If AGI operates within a formal computational framework, it may encounter truths it cannot prove or problems it cannot resolve (e.g., self-referential statemen
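
As a variation on the `enumerate` loop above, the same text can be built with `enumerate(..., start=1)` and `str.join`, which avoids the `index+1` arithmetic. This is a sketch with a hypothetical function name, `combine_answers`, and made-up sample answers.

```python
def combine_answers(answers):
    """Build one string with a numbered heading before each answer."""
    sections = [
        f"# Response from competitor {number}\n\n{answer}\n\n"
        for number, answer in enumerate(answers, start=1)
    ]
    return "".join(sections)

sample = combine_answers(["first answer", "second answer"])
print(sample)
```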

In [19]:
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""


In [20]:
print(judge)

You are judging a competition between 3 competitors.
Each model has been given this question:

"Considering the interplay between Gödel's incompleteness theorems, Turing's halting problem, and the Chinese Room argument, how might these concepts collectively inform or challenge the theoretical limits of artificial general intelligence (AGI), particularly in terms of self-awareness, understanding, and the ability to resolve undecidable problems?"

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}

Here are the responses from each competitor:

# Response from competitor 1

The interplay between **Gödel's incompleteness theorems**, **Turing's halting problem**, and the **Chinese Room argument** presents profound challenges and insights for the theoretical 

In [21]:
judge_messages = [{"role": "user", "content": judge}]

In [30]:
# Judgement time!

# Use a separate name for the client so the `judge` prompt string is not overwritten
judge_client = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
response = judge_client.chat.completions.create(
    model="deepseek-reasoner",
    messages=judge_messages,
)
results = response.choices[0].message.content
print(results)



{"results": ["2", "1", "3"]}


In [32]:
# OK let's turn this into results!

results_dict = json.loads(results)
ranks = results_dict["results"]
for index, result in enumerate(ranks):
    competitor = competitors[int(result)-1]
    print(f"Rank {index+1}: {competitor}")




Rank 1: deepseek-reasoner
Rank 2: deepseek-chat
Rank 3: llama3.2
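
The parsing cell above trusts the judge to return clean JSON with every competitor listed once. A more defensive version might look like the sketch below. `parse_ranking` is a hypothetical helper, and the code-fence handling is an assumption about how a model could ignore the "no code blocks" instruction, not something observed in this run.

```python
import json

def parse_ranking(raw, competitors):
    """Turn the judge's JSON reply into an ordered list of competitor names."""
    fence = "`" * 3
    text = raw.strip()
    # Assumption: the reply may be wrapped in a Markdown code fence despite the prompt.
    if text.startswith(fence):
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]
    ranks = [int(item) for item in json.loads(text)["results"]]
    expected = list(range(1, len(competitors) + 1))
    # Reject rankings with missing, repeated, or out-of-range competitor numbers.
    if sorted(ranks) != expected:
        raise ValueError(f"Expected each of {expected} exactly once, got {ranks}")
    return [competitors[rank - 1] for rank in ranks]

names = ["deepseek-chat", "deepseek-reasoner", "llama3.2"]
for position, name in enumerate(parse_ranking('{"results": ["2", "1", "3"]}', names), start=1):
    print(f"Rank {position}: {name}")
```

Raising `ValueError` on a malformed ranking makes the failure visible at the parsing step rather than as an `IndexError` or a silently wrong leaderboard later.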


<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/exercise.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Exercise</h2>
            <span style="color:#ff7800;">Which pattern(s) did this use? Try updating this to add another Agentic design pattern.
            </span>
        </td>
    </tr>
</table>

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">Patterns like this one, sending a task to multiple models and then evaluating the results,
            are common when you need to improve the quality of an LLM response. The approach applies broadly
            to business projects where accuracy is critical.
            </span>
        </td>
    </tr>
</table>