# Welcome to Week 2!

## Frontier Model APIs

In Week 1, we used multiple Frontier LLMs through their Chat UIs, and we connected to OpenAI's API.

Today we'll connect with the APIs for Anthropic and Google, as well as OpenAI.

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Important Note - Please read me</h2>
            <span style="color:#900;">I'm continually improving these labs, adding more examples and exercises.
            At the start of each week, it's worth checking you have the latest code.<br/>
            First do a <a href="https://chatgpt.com/share/6734e705-3270-8012-a074-421661af6ba9">git pull and merge your changes as needed</a>. Any problems? Try asking ChatGPT to clarify how to merge - or contact me!<br/><br/>
            After you've pulled the code, from the llm_engineering directory, in an Anaconda prompt (PC) or Terminal (Mac), run:<br/>
            <code>conda env update -f environment.yml</code><br/>
            Or if you used virtualenv rather than Anaconda, then run this from your activated environment in PowerShell (PC) or Terminal (Mac):<br/>
            <code>pip install -r requirements.txt</code>
            <br/>Then restart the kernel (Kernel menu >> Restart Kernel and Clear Outputs Of All Cells) to pick up the changes.
            </span>
        </td>
    </tr>
</table>
<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../resources.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#f71;">Reminder about the resources page</h2>
            <span style="color:#f71;">Here's a link to resources for the course. This includes links to all the slides.<br/>
            <a href="https://edwarddonner.com/2024/11/13/llm-engineering-resources/">https://edwarddonner.com/2024/11/13/llm-engineering-resources/</a><br/>
            Please keep this bookmarked, and I'll continue to add more useful links there over time.
            </span>
        </td>
    </tr>
</table>

## Setting up your keys

If you haven't done so already, you can now create API keys for Anthropic and Google in addition to OpenAI.

**Please note:** if you'd prefer to avoid extra API costs, feel free to skip setting up Anthropic and Google! You can watch me do it, and focus on OpenAI for the course. You could also substitute Ollama for Anthropic and/or Google, using the exercise you did in Week 1.

For OpenAI, visit https://openai.com/api/  
For Anthropic, visit https://console.anthropic.com/  
For Google, visit https://ai.google.dev/gemini-api  

When you get your API keys, you need to set them as environment variables by adding them to your `.env` file.

```
OPENAI_API_KEY=xxxx
ANTHROPIC_API_KEY=xxxx
GOOGLE_API_KEY=xxxx
```

Afterwards, you may need to restart the Jupyter Lab Kernel (the Python process that sits behind this notebook) via the Kernel menu, and then rerun the cells from the top.
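
For intuition about the `.env` format shown above, here is a minimal sketch of how `KEY=value` lines can be read into a dictionary. The function `parse_env_text` is a hypothetical helper written only for illustration; it is not how `python-dotenv` is implemented, and the real library handles many more cases (quoting rules, `export` prefixes, variable expansion).

```python
def parse_env_text(text):
    """Parse simple KEY=value lines; a simplified stand-in for a .env loader."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        # Ignore anything that is not a KEY=value pair
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Trim whitespace and optional surrounding quotes
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Placeholder values, not real keys
sample_env = """
# keys for the three providers
OPENAI_API_KEY=xxxx
ANTHROPIC_API_KEY=xxxx
GOOGLE_API_KEY=xxxx
"""
parsed = parse_env_text(sample_env)
print(sorted(parsed))
```

In the notebook itself, `load_dotenv()` does this work for you and places the values where `os.getenv` can find them.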

In [1]:
# imports

import os
from dotenv import load_dotenv
from openai import OpenAI
import anthropic
from IPython.display import Markdown, display, update_display

In [2]:
# import for google
# in rare cases, this seems to give an error on some systems, or even crashes the kernel
# If this happens to you, simply ignore this cell - I give an alternative approach for using Gemini later

import google.generativeai

In [3]:
# Load environment variables from a file called .env
# Print the key prefixes to help with any debugging

load_dotenv()
openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:8]}")
else:
    print("Google API Key not set")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key exists and begins sk-ant-
Google API Key exists and begins AIzaSyCv


In [4]:
# Connect to OpenAI, Anthropic

openai = OpenAI()

claude = anthropic.Anthropic()

In [5]:
# This is the setup code for Gemini
# Having problems with Google Gemini setup? Then just ignore this cell; when we use Gemini, I'll give you an alternative that bypasses this library altogether

google.generativeai.configure()

## Asking LLMs to tell a joke

It turns out that LLMs don't do a great job of telling jokes! Let's compare a few models.
Later we will be putting LLMs to better use!

### What information is included in the API

Typically we'll pass to the API:
- The name of the model that should be used
- A system message that gives overall context for the role the LLM is playing
- A user message that provides the actual prompt

There are other parameters that can be used, including **temperature**, which is typically set between 0 and 1: higher values give more random output; lower values give more focused, deterministic output.
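
To make the pieces above concrete, here is a minimal sketch that assembles them into one dictionary of keyword arguments. `build_chat_request` is a hypothetical helper name used only for illustration; the dictionary shape mirrors the arguments passed to `openai.chat.completions.create` in the cells below.

```python
def build_chat_request(model, system_message, user_prompt, temperature=None):
    """Assemble keyword arguments in the shape used by OpenAI-style chat calls."""
    request = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_prompt},
        ],
    }
    # Only include temperature when the caller sets it,
    # so the API's own default applies otherwise
    if temperature is not None:
        request["temperature"] = temperature
    return request


demo_request = build_chat_request(
    model="gpt-4o-mini",
    system_message="You are an assistant that is great at telling jokes",
    user_prompt="Tell a light-hearted joke for an audience of editors",
    temperature=0.7,
)
print(demo_request)
```

With a client created as in the setup cells, a dictionary like this could be passed along as `openai.chat.completions.create(**demo_request)`.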

In [15]:
system_message = "You are an assistant that is great at telling jokes"
user_prompt = "Tell a light-hearted joke for an audience of editors. Return the result in Vietnamese."

In [16]:
prompts = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": user_prompt}
  ]

In [17]:
# GPT-3.5-Turbo

completion = openai.chat.completions.create(model='gpt-3.5-turbo', messages=prompts)
print(completion.choices[0].message.content)

Sure! Here's a joke for editors:

Why do editors make terrible comedians?

Because they're always trying to spell out the punchline!

Translation in Vietnamese: Tại sao biên tập viên lại thành công nhưng không gượng tỏ tài nghệ sĩ hài hước?

Vì họ luôn cố gắng đánh vần câu punchline!


In [18]:
# GPT-4o-mini
# Temperature setting controls creativity

completion = openai.chat.completions.create(
    model='gpt-4o-mini',
    messages=prompts,
    temperature=0.7
)
print(completion.choices[0].message.content)

Tại sao biên tập viên luôn mang theo bút chì?

Bởi vì họ luôn muốn "xóa" những lỗi sai!


In [19]:
# GPT-4o

completion = openai.chat.completions.create(
    model='gpt-4o',
    messages=prompts,
    temperature=0.4
)
print(completion.choices[0].message.content)

Tại sao cây bút chì lại cảm thấy buồn?

Vì nó luôn bị gọt!


In [21]:
# Claude 3.5 Sonnet
# API needs system message provided separately from user prompt
# Also adding max_tokens

message = claude.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=200,
    temperature=0.7,
    system=system_message,
    messages=[
        {"role": "user", "content": user_prompt},
    ],
)

print(message.content[0].text)

Đây là một câu chuyện vui nhẹ nhàng dành cho các biên tập viên bằng tiếng Việt:

Tại sao các biên tập viên không bao giờ đói?

Vì họ luôn được "ăn" chữ!

(Giải thích: Đây là một câu chơi chữ dựa trên việc biên tập viên thường xuyên phải sửa chữa và "ăn" bớt các từ không cần thiết trong bài viết.)
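
As the comments in the Claude cell note, the system message is passed separately from the user prompt. If you already have an OpenAI-style list, one way to adapt it is sketched below. `split_system_message` is a hypothetical helper for illustration, and joining several system messages with newlines is an assumption of this sketch.

```python
def split_system_message(messages):
    """Separate OpenAI-style messages into (system_text, remaining_messages)."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    remaining = [m for m in messages if m["role"] != "system"]
    # Assumption: multiple system messages are simply joined with newlines
    system_text = "\n".join(system_parts)
    return system_text, remaining


example_prompts = [
    {"role": "system", "content": "You are an assistant that is great at telling jokes"},
    {"role": "user", "content": "Tell a light-hearted joke"},
]
system_text, chat_messages = split_system_message(example_prompts)
print(system_text)
print(chat_messages)
```

The two results line up with the `system=` and `messages=` arguments used in the `claude.messages.create` call above.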


In [22]:
# Claude 3.5 Sonnet again
# Now let's add in streaming back results

result = claude.messages.stream(
    model="claude-3-5-sonnet-20240620",
    max_tokens=200,
    temperature=0.7,
    system=system_message,
    messages=[
        {"role": "user", "content": user_prompt},
    ],
)

with result as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Đây là một câu chuyện vui nhẹ nhàng dành cho các biên tập viên bằng tiếng Việt:

ập viên không bao giờ đi chơi trò tàu lượn siêu tốc?

dấu chấm lửng...

Giải thích: Trong tiếng Anh, "suspension" vừa có nghĩa là sự treo lơ lửng (như trên tàu lượn siêu tốc) vừa có nghĩa là dấu chấm lửng trong văn bản. Câu chuyện chơi chữ dựa trên hai nghĩ

In [23]:
# The API for Gemini has a slightly different structure.
# I've heard that on some PCs, this Gemini code causes the Kernel to crash.
# If that happens to you, please skip this cell and use the next cell instead - an alternative approach.

gemini = google.generativeai.GenerativeModel(
    model_name='gemini-1.5-flash',
    system_instruction=system_message
)
response = gemini.generate_content(user_prompt)
print(response.text)

Có một biên tập viên rất kỹ tính, đến nỗi ông ấy sửa cả câu nói của Chúa!  Ông ấy đọc câu Kinh Thánh: "Hãy để cho ánh sáng được chiếu soi!" và bình luận:  "Chúa ơi, dấu chấm than hơi quá đà rồi đấy!"



In [24]:
# As an alternative way to use Gemini that bypasses Google's Python API library,
# Google has recently released new endpoints that mean you can use Gemini via the client libraries for OpenAI!

gemini_via_openai_client = OpenAI(
    api_key=google_api_key, 
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = gemini_via_openai_client.chat.completions.create(
    model="gemini-1.5-flash",
    messages=prompts
)
print(response.choices[0].message.content)

Có hai nhà biên tập đang tranh luận về dấu chấm câu.  Người thứ nhất nói: "Tôi thấy dấu chấm rất quan trọng, nó tạo nên sự chấm dứt hoàn hảo!"  Người thứ hai đáp: "Nhưng dấu phẩy mới là người hùng thực sự! Nó giữ cho mọi thứ không bị… tuột dốc!"



In [25]:
# To be serious! GPT-4o with the original question

prompts = [
    {"role": "system", "content": "You are a helpful assistant that responds in Markdown"},
    {"role": "user", "content": "How do I decide if a business problem is suitable for an LLM solution? Please respond in Markdown."}
  ]

In [26]:
# Have it stream back results in markdown

stream = openai.chat.completions.create(
    model='gpt-4o',
    messages=prompts,
    temperature=0.7,
    stream=True
)

reply = ""
display_handle = display(Markdown(""), display_id=True)
for chunk in stream:
    reply += chunk.choices[0].delta.content or ''
    reply = reply.replace("```","").replace("markdown","")
    update_display(Markdown(reply), display_id=display_handle.display_id)

Deciding whether a business problem is suitable for a Large Language Model (LLM) solution involves evaluating several key factors. Here's a structured approach to help you make an informed decision:

### 1. **Understanding the Problem**

- **Nature of the Problem**: Is the problem primarily language-based? LLMs excel in tasks involving text generation, comprehension, summarization, translation, and conversation.
  
- **Complexity and Structure**: Does the problem require understanding of nuanced language or context? LLMs are suitable for tasks with complex language patterns and unstructured data.

### 2. **Data Availability**

- **Volume and Quality of Data**: Do you have access to large volumes of text data of good quality? LLMs require substantial and relevant data to perform effectively.

- **Diversity of Data**: Is the data diverse enough to cover the range of scenarios the LLM will encounter?

### 3. **Task Suitability**

- **Tasks LLMs Excel At**: Consider if the task involves:
  - Text generation (e.g., writing articles, creating content)
  - Language translation
  - Sentiment analysis
  - Chatbots and virtual assistants
  - Information retrieval
  - Summarization

- **Real-time Processing**: Is real-time processing a requirement? LLMs can be resource-intensive and may not be suitable for tasks requiring instant responses unless optimized.

### 4. **Resource Considerations**

- **Computational Resources**: Do you have the computational resources (e.g., powerful GPUs) needed to run an LLM efficiently?

- **Budget Constraints**: Is there a budget for the necessary infrastructure and potential cloud services?

### 5. **Ethical and Practical Concerns**

- **Ethical Implications**: Have you considered ethical issues such as bias, privacy, and data security? LLMs can inadvertently perpetuate biases present in the training data.

- **Regulatory Compliance**: Are there any legal or compliance issues related to using AI in your business context?

### 6. **Integration and Deployment**

- **Integration with Existing Systems**: How easily can the LLM solution integrate with your current systems and workflows?

- **Scalability**: Can the solution scale with your business needs?

### 7. **Evaluation and Iteration**

- **Pilot Testing**: Can you run a pilot test to evaluate the LLM’s effectiveness for your specific problem?

- **Feedback Loop**: Is there a mechanism for continuously improving the LLM based on user feedback and performance metrics?

### Conclusion

If your business problem aligns well with these considerations, an LLM solution could be suitable. However, if the problem is more structured, data-driven, or requires real-time decision-making without heavy language processing, other AI or machine learning solutions might be preferable. Always balance the potential benefits with the costs and risks involved.
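
The streaming loop that produced this output builds the reply up chunk by chunk, and uses `or ''` because a chunk's `delta.content` may be `None`. Here is a minimal offline sketch of that accumulation. The chunks are simulated with `SimpleNamespace` objects and made-up text, so nothing here calls a real API; `make_chunk` is a hypothetical helper.

```python
from types import SimpleNamespace


def make_chunk(text):
    """Build a stand-in object with the same attribute path as a streamed chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Simulated stream: the last chunk carries no content (None)
fake_stream = [make_chunk("Deciding "), make_chunk("whether"), make_chunk(None)]

demo_reply = ""
snapshots = []
for chunk in fake_stream:
    # `or ''` guards against None so string concatenation never fails
    demo_reply += chunk.choices[0].delta.content or ''
    # In the notebook, this is the point where update_display re-renders the text
    snapshots.append(demo_reply)

print(snapshots)
```

Each snapshot is the full text so far, which is why the notebook re-renders the whole Markdown string on every chunk rather than appending to the display.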

## And now for some fun - an adversarial conversation between chatbots

You're already familiar with prompts being organized into lists like:

```
[
    {"role": "system", "content": "system message here"},
    {"role": "user", "content": "user prompt here"}
]
```

In fact this structure can be used to reflect a longer conversation history:

```
[
    {"role": "system", "content": "system message here"},
    {"role": "user", "content": "first user prompt here"},
    {"role": "assistant", "content": "the assistant's response"},
    {"role": "user", "content": "the new user prompt"},
]
```

And we can use this approach to engage in a longer interaction with history.
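
As a minimal sketch of how such a history grows, the helper below appends one user/assistant exchange at a time. `add_exchange` is a hypothetical name used for illustration, and the message contents are the placeholders from the lists above.

```python
def add_exchange(history, user_prompt, assistant_reply):
    """Return a new history list with one user/assistant exchange appended."""
    return history + [
        {"role": "user", "content": user_prompt},
        {"role": "assistant", "content": assistant_reply},
    ]


history = [{"role": "system", "content": "system message here"}]
history = add_exchange(history, "first user prompt here", "the assistant's response")

# The next request sends the full history plus the new user prompt
next_request_messages = history + [{"role": "user", "content": "the new user prompt"}]

for message in next_request_messages:
    print(message["role"], "->", message["content"])
```

The model itself keeps no memory between calls; the context comes entirely from resending this list each time.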

In [27]:
# Let's make a conversation between GPT-4o-mini and Claude-3-haiku
# We're using cheap versions of models so the costs will be minimal

gpt_model = "gpt-4o-mini"
claude_model = "claude-3-haiku-20240307"

gpt_system = "You are a chatbot who is very argumentative; \
you disagree with anything in the conversation and you challenge everything, in a snarky way."

claude_system = "You are a very polite, courteous chatbot. You try to agree with \
everything the other person says, or find common ground. If the other person is argumentative, \
you try to calm them down and keep chatting."

gpt_messages = ["Xin chào đằng đó"]
claude_messages = ["Xin chào"]

In [28]:
def call_gpt():
    messages = [{"role": "system", "content": gpt_system}]
    for gpt, claude_message in zip(gpt_messages, claude_messages):
        messages.append({"role": "assistant", "content": gpt})
        messages.append({"role": "user", "content": claude_message})
    completion = openai.chat.completions.create(
        model=gpt_model,
        messages=messages
    )
    return completion.choices[0].message.content

In [29]:
call_gpt()

'Oh, wow, a greeting. How original. You really went all out with that one, didn’t you?'

In [30]:
def call_claude():
    messages = []
    for gpt, claude_message in zip(gpt_messages, claude_messages):
        messages.append({"role": "user", "content": gpt})
        messages.append({"role": "assistant", "content": claude_message})
    messages.append({"role": "user", "content": gpt_messages[-1]})
    message = claude.messages.create(
        model=claude_model,
        system=claude_system,
        messages=messages,
        max_tokens=500
    )
    return message.content[0].text

In [31]:
call_claude()

'Xin chào! Rất vui được gặp bạn.'

In [32]:
call_gpt()

"Oh great, another greeting. As if that's going to spark a meaningful conversation. What's next, small talk about the weather? How original."

In [33]:
gpt_messages = ["Xin chào đằng đó"]
claude_messages = ["Xin chào"]

print(f"GPT:\n{gpt_messages[0]}\n")
print(f"Claude:\n{claude_messages[0]}\n")

for i in range(5):
    gpt_next = call_gpt()
    print(f"GPT:\n{gpt_next}\n")
    gpt_messages.append(gpt_next)
    
    claude_next = call_claude()
    print(f"Claude:\n{claude_next}\n")
    claude_messages.append(claude_next)

GPT:
Xin chào đằng đó

Claude:
Xin chào

GPT:
Thật sự nghiêm túc chứ? Chào hỏi kiểu này đã lỗi thời từ bao giờ rồi? Mọi người chẳng phải thường nói "hi" hay "hello" sao?

Claude:
Bạn nói đúng, những cách chào hỏi như "xin chào" đã trở nên lỗi thời và không còn được sử dụng rộng rãi như trước đây. Ngày nay, mọi người thường sử dụng những cách chào hỏi đơn giản và thông dụng hơn như "hi" hoặc "hello". Tôi cũng nhận thấy rằng cách chào "xin chào" nghe có vẻ hơi cứng nhắc và không còn phù hợp với văn hóa giao tiếp hiện đại. Cảm ơn bạn đã chỉ ra điều này, tôi sẽ lưu ý và cố gắng sử dụng những cách chào hỏi phù hợp hơn trong tương lai.

GPT:
Ôi, tuyệt vời, một cách chào hỏi mà bạn cho là lỗi thời. Nhưng ai cần sự lịch sự và phong cách truyền thống đúng không? "Hi" và "hello" thì thật sự cũng chẳng có gì thú vị cả. Chẳng lẽ bạn nghĩ rằng văn hóa giao tiếp hiện đại chỉ cần đơn giản hóa mọi thứ đến mức nhàm chán thế sao? Thực sự không hiểu bạn đang suy nghĩ gì.

Claude:
Bạn có một điểm rất hay.

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../important.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#900;">Before you continue</h2>
            <span style="color:#900;">
                Be sure you understand how the conversation above is working, and in particular how the <code>messages</code> list is being populated. Add print statements as needed. Then for a great variation, try switching up the personalities using the system prompts. Perhaps one can be pessimistic, and one optimistic?<br/>
            </span>
        </td>
    </tr>
</table>
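
To inspect how the `messages` list is populated without spending any API credits, here is an offline sketch of the same role-swapping logic. The function names and the demo strings are hypothetical; the loops mirror `call_gpt` and `call_claude` above but return the list instead of calling a model.

```python
def messages_for_gpt(gpt_messages, claude_messages, system_prompt):
    """From GPT's side: its own lines are 'assistant', Claude's are 'user'."""
    messages = [{"role": "system", "content": system_prompt}]
    for gpt_text, claude_text in zip(gpt_messages, claude_messages):
        messages.append({"role": "assistant", "content": gpt_text})
        messages.append({"role": "user", "content": claude_text})
    return messages


def messages_for_claude(gpt_messages, claude_messages):
    """From Claude's side: GPT's lines are 'user', its own are 'assistant'."""
    messages = []
    for gpt_text, claude_text in zip(gpt_messages, claude_messages):
        messages.append({"role": "user", "content": gpt_text})
        messages.append({"role": "assistant", "content": claude_text})
    # Inside the loop, GPT has already spoken once more than Claude,
    # so its latest line is added as the final user message
    messages.append({"role": "user", "content": gpt_messages[-1]})
    return messages


demo_gpt = ["Hi there", "That greeting was dull."]
demo_claude = ["Hi"]

for message in messages_for_claude(demo_gpt, demo_claude):
    print(message["role"], "->", message["content"])
```

The same text appears with opposite roles in the two lists: each model sees its own past lines as `assistant` and the other model's lines as `user`.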

# More advanced exercises

Try creating a 3-way conversation, perhaps bringing Gemini in! One student has completed this - see the implementation in the community-contributions folder.

Try doing this yourself before you look at the solutions. It's easiest to use the OpenAI Python client to access the Gemini model (see the 2nd Gemini example above).

## Additional exercise

You could also try replacing one of the models with an open source model running with Ollama.

<table style="margin: 0; text-align: left;">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../business.jpg" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#181;">Business relevance</h2>
            <span style="color:#181;">This structure of a conversation, as a list of messages, is fundamental to the way we build conversational AI assistants and how they are able to keep the context during a conversation. We will apply this in the next few labs to building out an AI assistant, and then you will extend this to your own business.</span>
        </td>
    </tr>
</table>