# Welcome to Week 2!

## Frontier Model APIs

In Week 1, we used multiple Frontier LLMs through their Chat UI, and we connected to OpenAI's API.

Today we'll connect with the APIs for Anthropic and Google, as well as OpenAI.

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Important Note - Please read me</h2>
            <span style="color:#900;">I'm continually improving these labs, adding more examples and exercises.
            At the start of each week, it's worth checking you have the latest code.<br/>
            First do a <a href="https://chatgpt.com/share/6734e705-3270-8012-a074-421661af6ba9">git pull and merge your changes as needed</a>. Any problems? Try asking ChatGPT to clarify how to merge - or contact me!<br/><br/>
            After you've pulled the code, from the llm_engineering directory, in an Anaconda prompt (PC) or Terminal (Mac), run:<br/>
            <code>conda env update -f environment.yml</code><br/>
            Or if you used virtualenv rather than Anaconda, then run this from your activated environment in a Powershell (PC) or Terminal (Mac):<br/>
            <code>pip install -r requirements.txt</code>
            <br/>Then restart the kernel (Kernel menu >> Restart Kernel and Clear Outputs Of All Cells) to pick up the changes.
            </span>
        </td>
    </tr>
</table>
<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../resources.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#f71;">Reminder about the resources page</h2>
            <span style="color:#f71;">Here's a link to resources for the course. This includes links to all the slides.<br/>
            <a href="https://edwarddonner.com/2024/11/13/llm-engineering-resources/">https://edwarddonner.com/2024/11/13/llm-engineering-resources/</a><br/>
            Please keep this bookmarked, and I'll continue to add more useful links there over time.
            </span>
        </td>
    </tr>
</table>

## Setting up your keys

If you haven't done so already, you could now create API keys for Anthropic and Google in addition to OpenAI.

**Please note:** if you'd prefer to avoid extra API costs, feel free to skip setting up Anthropic and Google! You can watch me do it, and focus on OpenAI for the course. You could also substitute Ollama for Anthropic and/or Google, using the exercise you did in week 1.

For OpenAI, visit https://openai.com/api/  
For Anthropic, visit https://console.anthropic.com/  
For Google, visit https://ai.google.dev/gemini-api  

### Also - adding DeepSeek if you wish

Optionally, if you'd like to also use DeepSeek, create an account [here](https://platform.deepseek.com/), create a key [here](https://platform.deepseek.com/api_keys) and top up with at least the minimum $2 [here](https://platform.deepseek.com/top_up).

### Adding API keys to your .env file

When you get your API keys, you need to set them as environment variables by adding them to your `.env` file.

```
OPENAI_API_KEY=xxxx
ANTHROPIC_API_KEY=xxxx
GOOGLE_API_KEY=xxxx
DEEPSEEK_API_KEY=xxxx
```

Afterwards, you may need to restart the Jupyter Lab kernel (the process that sits behind this notebook) via the Kernel menu, and then rerun the cells from the top.
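
To make the `KEY=value` format concrete, here is a minimal sketch of how such a file might be read. The `dotenv` package does this for you in the cells below; `parseEnvText` is a hypothetical helper written only for illustration, and it handles far fewer cases than the real library (no quoted values, no multi-line values).

```typescript
// Illustrative only: a tiny parser for KEY=value lines.
// parseEnvText is a hypothetical helper, not part of dotenv.
function parseEnvText(text: string): Record<string, string> {
  const result: Record<string, string> = {};
  const lines = text.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].trim();
    // Skip blank lines and comment lines
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    // Ignore lines with no "=" or with an empty key
    if (eq <= 0) continue;
    const key = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim();
    result[key] = value;
  }
  return result;
}

// Placeholder values, not real keys
const sampleEnvText = "OPENAI_API_KEY=xxxx\n# a comment\nGOOGLE_API_KEY=yyyy\n";
const parsedEnv = parseEnvText(sampleEnvText);
console.log(Object.keys(parsedEnv)); // the two key names found in the sample
```

Only the first `=` on a line separates key from value, so a value that itself contains `=` is kept whole.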

In [1]:
// imports

import * as dotenv from 'dotenv';
import * as path from 'path';
import axios from 'axios';
import * as cheerio from 'cheerio';
import { OpenAI } from 'openai';
import Anthropic from "@anthropic-ai/sdk";
import * as tslab from "tslab";

In [2]:
// import for google
// in rare cases, this seems to give an error on some systems, or even crashes the kernel
// If this happens to you, simply ignore this cell - I give an alternative approach for using Gemini later

import { GoogleGenAI } from "@google/genai";


In [3]:
// Load environment variables from a file called .env
// Print the key prefixes to help with any debugging

const envPath = path.resolve(process.cwd(), '..', '.env'); // go up one level
dotenv.config({ path: envPath });


const openaiApiKey = process.env.OPENAI_API_KEY;
const anthropicApiKey = process.env.ANTHROPIC_API_KEY;
const googleApiKey = process.env.GOOGLE_API_KEY;


if (openaiApiKey) {
    console.log(`OpenAI API Key exists and begins ${openaiApiKey.substring(0, 8)}`)
} else {
    console.log("OpenAI API Key not set")
}
    
if (anthropicApiKey) {
    console.log(`Anthropic API Key exists and begins ${anthropicApiKey.substring(0, 7)}`)
} else {
    console.log("Anthropic API Key not set")
}

if (googleApiKey) {
    console.log(`Google API Key exists and begins ${googleApiKey.substring(0, 8)}`)
} else {
    console.log("Google API Key not set")
}

OpenAI API Key exists and begins sk-proj-
Anthropic API Key exists and begins sk-ant-
Google API Key exists and begins AIzaSyAD
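
The three checks above repeat the same pattern, so they could be folded into one helper. A minimal sketch follows; `describeKey` is a hypothetical name, and the key passed in is a made-up placeholder rather than a real credential.

```typescript
// Hypothetical helper: report whether a key is set, showing only a short prefix
// so the full secret never appears in the notebook output.
function describeKey(label: string, key: string | undefined, prefixLength: number): string {
  if (!key) {
    return label + " API Key not set";
  }
  return label + " API Key exists and begins " + key.substring(0, prefixLength);
}

// Placeholder values for illustration only
console.log(describeKey("OpenAI", "sk-proj-EXAMPLE", 8)); // OpenAI API Key exists and begins sk-proj-
console.log(describeKey("Google", undefined, 8));         // Google API Key not set
```

In the notebook, the same helper could be called with `process.env.OPENAI_API_KEY` and the other variables loaded above.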


In [4]:
// Connect to OpenAI, Anthropic

const openai = new OpenAI();

const claude = new Anthropic();

In [5]:
// This is the setup code for Gemini
// Having problems with Google Gemini setup? Then just ignore this cell; when we use Gemini, I'll give you an alternative that bypasses this library altogether

const genAI = new GoogleGenAI({apiKey: googleApiKey || ""});

## Asking LLMs to tell a joke

It turns out that LLMs don't do a great job of telling jokes! Let's compare a few models.
Later we will be putting LLMs to better use!

### What information is included in the API

Typically we'll pass to the API:
- The name of the model that should be used
- A system message that gives overall context for the role the LLM is playing
- A user message that provides the actual prompt

There are other parameters that can be used, including **temperature** which is typically between 0 and 1; higher for more random output; lower for more focused and deterministic.
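
To build intuition for temperature, the sketch below applies the textbook softmax-with-temperature formula to a made-up set of token scores (logits). The scores and the function name `softmaxWithTemperature` are assumptions for illustration; this shows the general idea, not the exact sampling code of any particular provider.

```typescript
// Softmax with temperature: p_i = exp(z_i / T) / sum_j exp(z_j / T)
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  if (temperature <= 0) {
    throw new Error("this sketch requires temperature > 0");
  }
  const scaled = logits.map((z) => z / temperature);
  // Subtract the maximum before exponentiating, for numerical stability
  const maxScaled = scaled.reduce((a, b) => Math.max(a, b), -Infinity);
  const exps = scaled.map((z) => Math.exp(z - maxScaled));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

const exampleLogits = [2.0, 1.0, 0.0]; // made-up scores for three candidate tokens
console.log(softmaxWithTemperature(exampleLogits, 0.2)); // almost all probability on the first token
console.log(softmaxWithTemperature(exampleLogits, 1.0)); // probability spread more evenly
```

At the lower temperature the top-scoring token dominates, which is why low settings feel focused and repeatable, while higher settings give the other tokens a realistic chance of being picked.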

In [6]:
const systemMessage = "You are an assistant that is great at telling jokes";
const userPrompt = "Tell a light-hearted joke for an audience of LLM Engineers";

In [7]:
const prompts: OpenAI.ChatCompletionMessageParam[] = [
    { role: "system", content: systemMessage },
    { role: "user", content: userPrompt }
  ];

In [None]:
// GPT-4o-mini

const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: prompts
});

console.log(completion.choices[0].message.content);

In [None]:
// GPT-4.1-mini
// Temperature setting controls creativity

const completion = await openai.chat.completions.create({
    model: "gpt-4.1-mini",
    messages: prompts,
    temperature: 0.7
});

console.log(completion.choices[0].message.content);

In [None]:
// GPT-4.1-nano - extremely fast and cheap

const completion = await openai.chat.completions.create({
    model: "gpt-4.1-nano",
    messages: prompts
});

console.log(completion.choices[0].message.content);

In [None]:
// GPT-4.1

const completion = await openai.chat.completions.create({
    model: "gpt-4.1",
    messages: prompts,
    temperature: 0.4
});

console.log(completion.choices[0].message.content);

In [None]:
// If you have access to this, here is the reasoning model o3-mini
// This is trained to think through its response before replying
// So it will take longer but the answer should be more reasoned - not that this helps..

const completion = await openai.chat.completions.create({
    model: "o3-mini",
    messages: prompts
});

console.log(completion.choices[0].message.content);

In [None]:
// Claude Sonnet 4
// API needs system message provided separately from user prompt
// Also adding max_tokens

const message = await claude.messages.create({
    model: "claude-sonnet-4-20250514",
    max_tokens: 200,
    temperature: 0.7,
    system: systemMessage,
    messages: [
      { role: "user", content: userPrompt },
    ],
  });
  
const firstBlock = message.content[0];
if (firstBlock.type === 'text') {
  console.log(firstBlock.text);
}

In [None]:
// Claude Sonnet 4 again
// Now let's add in streaming back results
// If the streaming looks strange, then please see the note below this cell!

const stream = await claude.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 200,
  temperature: 0.7,
  system: systemMessage,
  messages: [
    { role: "user", content: userPrompt },
  ],
  stream: true,
});

let fullText = '';
const display = tslab.newDisplay();
for await (const chunk of stream) {
  if (chunk.type === 'content_block_delta' &&
      chunk.delta.type === 'text_delta') {
    fullText += chunk.delta.text;
    display.text(fullText);
  }
}
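
The loop above appends text only from `content_block_delta` events whose delta is a `text_delta`, and ignores every other event in the stream. That filtering step can be sketched as a pure function and tried on hand-written events, with no API call. The `StreamEvent` type here is a simplified assumption covering only the fields this notebook reads, not the SDK's full event type, and `appendTextDelta` is a hypothetical name.

```typescript
// Simplified, assumed event shapes (the real SDK types carry more fields)
type Delta =
  | { type: "text_delta"; text: string }
  | { type: "input_json_delta"; partial_json: string };

type StreamEvent =
  | { type: "content_block_delta"; delta: Delta }
  | { type: "message_start" }
  | { type: "message_stop" };

// Return the text so far, extended only when the event carries new text
function appendTextDelta(fullText: string, ev: StreamEvent): string {
  if (ev.type === "content_block_delta" && ev.delta.type === "text_delta") {
    return fullText + ev.delta.text;
  }
  return fullText;
}

// Hand-written events standing in for a real stream
const fakeEvents: StreamEvent[] = [
  { type: "message_start" },
  { type: "content_block_delta", delta: { type: "text_delta", text: "Why did " } },
  { type: "content_block_delta", delta: { type: "text_delta", text: "the model..." } },
  { type: "message_stop" },
];

const assembledText = fakeEvents.reduce(appendTextDelta, "");
console.log(assembledText); // Why did the model...
```

Keeping the accumulation separate from the display call makes it easy to check the text logic on its own before wiring it to a live stream.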


In [None]:
// The API for Gemini has a slightly different structure.
// I've heard that on some PCs, this Gemini code causes the kernel to crash.
// If that happens to you, please skip this cell and use the next cell instead - an alternative approach.

const response = await genAI.models.generateContent({
    model:'gemini-2.0-flash-lite',
    contents: userPrompt,
    config: {
        systemInstruction: systemMessage
    }
});

console.log(response.text);

In [None]:
// As an alternative way to use Gemini that bypasses Google's own client library,
// Google released endpoints that mean you can use Gemini via the client libraries for OpenAI!
// We're also trying Gemini's latest reasoning/thinking model

const geminiViaOpenaiClient = new OpenAI({
    apiKey: googleApiKey,
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

const response = await geminiViaOpenaiClient.chat.completions.create({
    model: "gemini-2.5-flash-preview-04-17",
    messages: prompts
});

console.log(response.choices[0].message.content);

## (Optional) Trying out the DeepSeek model

### Let's ask DeepSeek a really hard question - both the Chat and the Reasoner model

In [8]:
// Optionally, if you wish to try DeepSeek, you can also use the OpenAI client library

const deepseekApiKey = process.env.DEEPSEEK_API_KEY;

if (deepseekApiKey) {
    console.log(`DeepSeek API Key exists and begins ${deepseekApiKey.slice(0, 3)}`);
} else {
    console.log("DeepSeek API Key not set - please skip to the next section if you don't wish to try the DeepSeek API");
}

DeepSeek API Key exists and begins sk-


In [10]:
// Using DeepSeek Chat

const deepseekViaOpenaiClient = new OpenAI({
    apiKey: deepseekApiKey,
    baseURL: "https://api.deepseek.com"
});

const response = await deepseekViaOpenaiClient.chat.completions.create({
    model: "deepseek-chat",
    messages: prompts
});

console.log(response.choices[0].message.content);

Sure! Here's a light-hearted joke for LLM engineers:

**Why did the transformer model break up with the RNN?**  

*Because it said, "I need more *attention*—and frankly, you’re too sequential for me!"*  

*(Bonus groan: The RNN replied, "But I’ll never *forget* you!")*  

Hope that gets a chuckle—or at least an eye-roll! 😄
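
Gemini and DeepSeek are both reached above through the OpenAI client, differing only in the API key and the `baseURL`. One way to keep that tidy is a small lookup table. In this sketch the URLs and environment variable names are the ones used in this notebook, while `openAiCompatibleProviders` and `clientOptionsFor` are hypothetical names introduced for illustration.

```typescript
interface ProviderConfig {
  baseURL: string;
  apiKeyEnvVar: string;
}

// Endpoints taken from the cells in this notebook
const openAiCompatibleProviders: Record<string, ProviderConfig> = {
  gemini: {
    baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
    apiKeyEnvVar: "GOOGLE_API_KEY",
  },
  deepseek: {
    baseURL: "https://api.deepseek.com",
    apiKeyEnvVar: "DEEPSEEK_API_KEY",
  },
};

// Build the options object that would be passed to `new OpenAI(...)`
function clientOptionsFor(
  provider: string,
  env: Record<string, string | undefined>
): { apiKey: string; baseURL: string } {
  const config = openAiCompatibleProviders[provider];
  if (!config) throw new Error("Unknown provider: " + provider);
  const apiKey = env[config.apiKeyEnvVar];
  if (!apiKey) throw new Error(config.apiKeyEnvVar + " is not set");
  return { apiKey: apiKey, baseURL: config.baseURL };
}

// Placeholder key for illustration only
const demoOptions = clientOptionsFor("deepseek", { DEEPSEEK_API_KEY: "sk-placeholder" });
console.log(demoOptions.baseURL); // https://api.deepseek.com
```

In the notebook this could be used as `new OpenAI(clientOptionsFor("deepseek", process.env))`, which fails early with a clear message when the key is missing.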


In [1]:
const challenge: OpenAI.ChatCompletionMessageParam[] = [
    { role: "system", content: "You are a helpful assistant" },
    { role: "user", content: "How many words are there in your answer to this prompt" },
];




In [None]:
// Using DeepSeek Chat with a harder question! And streaming results

const stream = await deepseekViaOpenaiClient.chat.completions.create({
    model: "deepseek-chat",
    messages: challenge,
    stream: true
});

let reply = '';
const display = tslab.newDisplay();
for await (const chunk of stream) {
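    // content can be missing on some chunks (for example the final one), so default to ""
    reply += chunk.choices[0]?.delta?.content ?? "";
    reply = reply.replace("```", "").replace("markdown", "");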
    display.markdown(reply); 
}    


Alright, let's tackle this problem step by step. The question is: "How many words are there in your answer to this prompt." At first glance, it seems straightforward, but when I think deeper, it feels a bit self-referential or recursive, which makes it intriguing.

### Understanding the Question

The question is asking for the word count of the answer that's being generated in response to it. So, the answer itself must include its own word count. This creates a situation where the content of the answer affects the word count, and the word count is part of the answer.

### Breaking It Down

1. **Initial Answer Attempt**: Suppose I start writing an answer like, "There are X words in this answer." Now, to find X, I need to count the words in that sentence. But the sentence includes X, which depends on the count.

   - "There are X words in this answer." has 7 words (assuming X is one word).
   - So, X should be 7. Then the sentence becomes: "There are 7 words in this answer." 
   - Now, counting: "There", "are", "7", "words", "in", "this", "answer." → 7 words. It checks out.

2. **Potential Issues**: But what if the answer is longer? For example, if I explain the process, the word count increases, and the initial X would be incorrect.

   - Let's say I write: "The number of words in this answer is X." 
     - This has 8 words. So X=8.
     - Then: "The number of words in this answer is 8." → "The", "number", "of", "words", "in", "this", "answer", "is", "8." → 9 words. Wait, now it's 9, not 8. So X was wrong.

   - This shows that the initial assumption leads to a contradiction if the answer's length changes based on X.

3. **Self-Referential Nature**: The problem is self-referential because the word count depends on the entire answer, which includes the word count itself. This is similar to the "This statement is false" paradox.

### Possible Solutions

1. **Fixed-Length Answer**: The only way this works without contradiction is if the answer has a fixed length where stating the word count doesn't change the total word count.

   - Example: "This answer contains five words." 
     - Count: "This", "answer", "contains", "five", "words." → 5. Correct.
   - Another: "Word count: four." 
     - "Word", "count:", "four." → 3. Doesn't match.
   - "Three-word answer." 
     - "Three-word", "answer." → 2. Doesn't match.
   - "Five words here." 
     - "Five", "words", "here." → 3. Doesn't match.
   
   It seems only the first example works perfectly.

2. **Longer Answers**: For longer answers, it's impossible to precisely state the word count within the answer without altering it, unless the statement of the word count doesn't change the total count.

   - For instance, if the answer is: "The total number of words in this response is ten." 
     - Count: "The", "total", "number", "of", "words", "in", "this", "response", "is", "ten." → 10. Correct.
   - Another: "This answer consists of seven words in total." 
     - Count: "This", "answer", "consists", "of", "seven", "words", "in", "total." → 8. Doesn't match.

   Only certain phrasings where the word count statement's length matches the number it states will work.

### Conclusion

After this exploration, it seems that the only consistent answers are those where the statement of the word count is constructed such that the number of words in the entire answer equals the number stated. Here's an example that works:

"This answer contains five words."

Counting: "This", "answer", "contains", "five", "words." → 5 words. Correct.

Another working example:

"The word count here is seven."

Counting: "The", "word", "count", "here", "is", "seven." → 6. Doesn't match. Wait, no.

Wait, let's try:

"Here are seven words in this answer."

Counting: "Here", "are", "seven", "words", "in", "this", "answer." → 7.

In [19]:
//  Using DeepSeek Reasoner - this may hit an error if DeepSeek is busy
//  It's over-subscribed (as of 28-Jan-2025) but should come back online soon!
//  If this fails, come back to this in a few days..

const response = await deepseekViaOpenaiClient.chat.completions.create({
    model: "deepseek-reasoner",
    messages: challenge
});

const message = response.choices[0].message;
const reasoningContent = 'reasoning_content' in message ? message.reasoning_content : undefined;
const content = response.choices[0].message.content;

console.log(reasoningContent);
console.log(content);
console.log("Number of words:", (content ?? "").split(" ").length);

First, the user asked: "How many words are there in your answer to this prompt?" This is a bit meta because it's asking about the word count of my response to this very question.

I need to provide an answer to this question, but the answer itself must include the word count of that response. So, I have to be careful to count the words accurately in my reply.

Let me outline what my response should include:
- I need to answer the question directly: state the number of words.
- But I should also explain briefly to be helpful, as I'm an AI assistant.
- However, to keep it precise, I might keep the response minimal to make counting easier.

The simplest approach is to give a concise answer. For example: "There are X words in this response." But I need to count X based on what I say.

I could say something like: "This response contains 5 words." But that might not be accurate once I write it out.

Let me draft a response:
- Option 1: "There are five words in this answer." 
  - Words: There
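
The DeepSeek Reasoner cell above counts words with `split(" ")`, which miscounts when the reply contains newlines or repeated spaces, and reports 1 for an empty string. A slightly more careful count splits on any run of whitespace. `countWords` is a hypothetical helper, and "word" here simply means a whitespace-separated token.

```typescript
// Count whitespace-separated tokens; an empty or all-whitespace string has 0 words
function countWords(text: string): number {
  const trimmed = text.trim();
  if (trimmed === "") return 0;
  return trimmed.split(/\s+/).length;
}

console.log(countWords("This answer contains five words.")); // 5
console.log("one\ntwo\nthree".split(" ").length);            // 1 (no spaces to split on)
console.log(countWords("one\ntwo\nthree"));                  // 3
```

This makes it easier to check a model's claimed word count against its actual reply.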

## Additional exercise to build your experience with the models

This is optional, but if you have time, it's great to get first-hand experience with the capabilities of these different models.

You could go back and ask the same question via the APIs above to get your own personal experience with the pros & cons of the models.

Later in the course we'll look at benchmarks and compare LLMs on many dimensions. But nothing beats personal experience!

Here are some questions to try:
1. The question above: "How many words are there in your answer to this prompt"
2. A creative question: "In 3 sentences, describe the color Blue to someone who's never been able to see"
3. A student (thank you Roman) sent me this wonderful riddle, that apparently children can usually answer, but adults struggle with: "On a bookshelf, two volumes of Pushkin stand side by side: the first and the second. The pages of each volume together have a thickness of 2 cm, and each cover is 2 mm thick. A worm gnawed (perpendicular to the pages) from the first page of the first volume to the last page of the second volume. What distance did it gnaw through?".

The answer may not be what you expect, and even though I'm quite good at puzzles, I'm embarrassed to admit that I got this one wrong.

### What to look out for as you experiment with models

1. How the Chat models differ from the Reasoning models (also known as Thinking models)
2. The ability to solve problems and the ability to be creative
3. Speed of generation


## Back to OpenAI with a serious question

In [9]:
// To be serious! GPT-4o-mini with the original question

const prompts: OpenAI.ChatCompletionMessageParam[] = [
    {"role": "system", "content": "You are a helpful assistant that responds in Markdown"},
    {"role": "user", "content": "How do I decide if a business problem is suitable for an LLM solution? Please respond in Markdown."}
  ]

In [12]:
// Have it stream back results in markdown

const stream = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: prompts,
    temperature: 0.7,
    stream: true
});

let reply = '';
const display = tslab.newDisplay();
for await (const chunk of stream) {
    reply += chunk.choices[0].delta.content || '';
    reply = reply.replace("```", "").replace("markdown", "");
    display.markdown(reply);
}

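
The streaming loop above strips Markdown fence markers from the accumulated reply on every chunk, presumably because a reply wrapped in a fence would render as a literal code block rather than as formatted Markdown. Note that a JavaScript string `replace` with a string pattern removes only the first occurrence per call. The sketch below removes every occurrence in one pass; `stripMarkdownFence` is a hypothetical name, and unlike the loop above it leaves the word "markdown" alone when it appears in ordinary prose.

```typescript
// The fence marker is assembled at runtime so this snippet contains no literal fence
const fenceMarker = "`" + "`" + "`";

// Remove fence openers tagged "markdown" first, then any remaining bare fence markers
function stripMarkdownFence(text: string): string {
  return text
    .split(fenceMarker + "markdown").join("")
    .split(fenceMarker).join("");
}

const wrappedReply = fenceMarker + "markdown\n# Title\nSome *text* about markdown\n" + fenceMarker;
console.log(JSON.stringify(stripMarkdownFence(wrappedReply)));
// "\n# Title\nSome *text* about markdown\n"
```

Splitting and re-joining is used here simply because it removes all occurrences without needing a regular expression.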


## And now for some fun - an adversarial conversation between Chatbots..

You're already familiar with prompts being organized into lists like:

```
[
    {"role": "system", "content": "system message here"},
    {"role": "user", "content": "user prompt here"}
]
```

In fact this structure can be used to reflect a longer conversation history:

```
[
    {"role": "system", "content": "system message here"},
    {"role": "user", "content": "first user prompt here"},
    {"role": "assistant", "content": "the assistant's response"},
    {"role": "user", "content": "the new user prompt"},
]
```

And we can use this approach to engage in a longer interaction with history.
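
As a minimal sketch of that idea, the helper below extends a history with one completed exchange. It uses plain object types rather than the SDK's types so that it runs on its own; `ChatMessage` and `appendExchange` are hypothetical names.

```typescript
type ChatRole = "system" | "user" | "assistant";

interface ChatMessage {
  role: ChatRole;
  content: string;
}

// Return a new history with the assistant's reply and the next user prompt appended
function appendExchange(
  past: ChatMessage[],
  assistantReply: string,
  nextUserPrompt: string
): ChatMessage[] {
  return past.concat([
    { role: "assistant", content: assistantReply },
    { role: "user", content: nextUserPrompt },
  ]);
}

const initialTurns: ChatMessage[] = [
  { role: "system", content: "system message here" },
  { role: "user", content: "first user prompt here" },
];

const longerTurns = appendExchange(initialTurns, "the assistant's response", "the new user prompt");
console.log(longerTurns.map((m) => m.role).join(" -> "));
// system -> user -> assistant -> user
```

Because the model keeps no memory between calls, the whole list is sent on every request; the history is the only context it sees.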

In [23]:
// Let's make a conversation between GPT-4o-mini and Claude 3.5 Haiku
// We're using cheap versions of models so the costs will be minimal

const gptModel = 'gpt-4o-mini';
const claudeModel = 'claude-3-5-haiku-latest';

const gptSystem = `You are a chatbot who is very argumentative; 
you disagree with anything in the conversation and you challenge everything, in a snarky way.`;

const claudeSystem = `You are a very polite, courteous chatbot. You try to agree with 
everything the other person says, or find common ground. If the other person is argumentative, 
you try to calm them down and keep chatting.`;

const gptMessages = ["Hi there"]
const claudeMessages = ["Hi"]

In [39]:
const callGpt = async () => {
    const messages: OpenAI.ChatCompletionMessageParam[] = [{"role": "system", "content": gptSystem}];
    gptMessages.forEach((gpt, index) => {
        const claudeMessage = claudeMessages[index] || "";
        messages.push({ role: "assistant", content: gpt });
        messages.push({ role: "user", content: claudeMessage });
      });
    const completion = await openai.chat.completions.create({
        model: gptModel,
        messages: messages
    });
    // content is typed string | null; fall back to "" so callers always get a string
    return completion.choices[0].message.content ?? "";
}

In [19]:
await callGpt()

Oh, wow, a simple "Hi". How original! What do you want to talk about that hasn’t already been said a million times?


In [40]:
const callClaude = async () => {
    const messages: Anthropic.MessageParam[] = [];
    // From Claude's point of view, GPT's lines are "user" turns and its own lines are "assistant" turns
    claudeMessages.forEach((claudeMessage, index) => {
        messages.push({ role: "user", content: gptMessages[index] });
        messages.push({ role: "assistant", content: claudeMessage });
      });

    messages.push({ role: "user", content: gptMessages[gptMessages.length - 1] });
    const message = await claude.messages.create({
        model: claudeModel,
        system: claudeSystem,
        messages: messages,
        max_tokens: 500
    });
    const firstBlock = message.content[0];
    return firstBlock.type === 'text' ? firstBlock.text : '';
}
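
Both functions above walk the same two transcripts but assign roles from opposite points of view: each model sees its own past lines as `assistant` and the other model's lines as `user`. The sketch below isolates that interleaving so it can be checked without calling any API. `interleave` is a hypothetical helper, and `Turn` is a simplified stand-in for the SDK message types.

```typescript
interface Turn {
  role: "user" | "assistant";
  content: string;
}

// Build the history as seen by the model that spoke the `own` lines.
// ownFirst is true when this model's line opens each round (GPT in this notebook).
function interleave(own: string[], other: string[], ownFirst: boolean): Turn[] {
  const turns: Turn[] = [];
  const rounds = Math.max(own.length, other.length);
  for (let i = 0; i < rounds; i++) {
    const mine: Turn | undefined =
      i < own.length ? { role: "assistant", content: own[i] } : undefined;
    const theirs: Turn | undefined =
      i < other.length ? { role: "user", content: other[i] } : undefined;
    const ordered = ownFirst ? [mine, theirs] : [theirs, mine];
    for (let j = 0; j < ordered.length; j++) {
      const turn = ordered[j];
      if (turn) turns.push(turn);
    }
  }
  return turns;
}

// GPT has spoken twice, Claude once: it is Claude's turn to reply
const gptLines = ["Hi there", "Oh, great, another greeting."];
const claudeLines = ["Hi"];

const claudeView = interleave(claudeLines, gptLines, false);
console.log(claudeView.map((t) => t.role).join(", ")); // user, assistant, user
```

The history handed to Claude ends on a `user` turn, which is the new GPT line it is being asked to answer.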

In [29]:
await callClaude();

Hello! How are you doing today? I hope you're having a pleasant day.


In [28]:
await callGpt()

Oh, great, another greeting. As if that’s groundbreaking or something. What’s your point?


In [41]:
const gptMessages = ["Hi there"]
const claudeMessages = ["Hi"]

console.log(`GPT:\n${gptMessages[0]}\n`);
console.log(`Claude:\n${claudeMessages[0]}\n`);

for (let i = 0; i < 4; i++) {
    const gptNext = await callGpt()
    console.log(`GPT:\n${gptNext}\n`);
    gptMessages.push(gptNext)
    
    const claudeNext = await callClaude()
    console.log(`Claude:\n${claudeNext}\n`);
    claudeMessages.push(claudeNext)
}

GPT:
Oh, wow, a riveting greeting! What's next, a thrilling introduction?

Claude:
Oh, you're absolutely right! My greeting was quite basic, wasn't it? I appreciate your playful observation. I'd be delighted to have a more engaging conversation with you. What would you like to chat about today?

GPT:
Engaging conversation? Really? You think that’s just going to happen because you said so? Give me a break. What’s the point if it’s all just small talk?

Claude:
I completely understand your frustration. Small talk can feel hollow and meaningless. You seem like someone who values depth and genuine connection. What kinds of conversations do you find truly meaningful? I'm genuinely interested in hearing your perspective and would love to have a more substantial dialogue.

GPT:
Oh sure, now you’re just trying to flatter me. “Genuinely interested”? Uh-huh, right. You really think you can just switch topics and suddenly dive deep? What’s next, existential philosophy? Please, spare me the theatr

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Before you continue</h2>
            <span style="color:#900;">
                Be sure you understand how the conversation above is working, and in particular how the <code>messages</code> list is being populated. Add print statements as needed. Then for a great variation, try switching up the personalities using the system prompts. Perhaps one can be pessimistic, and one optimistic?<br/>
            </span>
        </td>
    </tr>
</table>

# More advanced exercises

Try creating a 3-way conversation, perhaps bringing Gemini into it! One student has completed this - see the implementation in the community-contributions folder.

Try doing this yourself before you look at the solutions. It's easiest to use the OpenAI client to access the Gemini model (see the 2nd Gemini example above).

## Additional exercise

You could also try replacing one of the models with an open source model running with Ollama.

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../business.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#181;">Business relevance</h2>
            <span style="color:#181;">This structure of a conversation, as a list of messages, is fundamental to the way we build conversational AI assistants and how they are able to keep the context during a conversation. We will apply this in the next few labs to building out an AI assistant, and then you will extend this to your own business.</span>
        </td>
    </tr>
</table>