# Prompt Engineering

<details>
<summary>Click to expand/collapse</summary>

## Introduction

**Prompt engineering** is the practice of designing and optimizing inputs (prompts) to guide AI models 
in generating more accurate, useful, and relevant responses. It plays a crucial role in improving 
interactions with large language models (LLMs) such as Anthropic's Claude and OpenAI's GPT-4. This notebook requires API access to both Claude (Anthropic) and OpenAI. 

## Basic Principles of Prompt Engineering

1. **Clarity and Specificity**  
   - Clearly define the task and avoid vague or ambiguous instructions.  
   - Example:
     ```plaintext
     Bad: "Tell me about history."
     Good: "Provide a summary of the Renaissance period and its impact on European art."
     ```

2. **Role Assignment**  
   - Assign a persona or role to the AI model to get responses in a specific tone or expertise.  
   - Example:
     ```plaintext
     "You are an AI financial advisor. Explain the benefits of index funds to a beginner investor."
     ```

3. **Context and Constraints**  
   - Provide relevant background information and define constraints such as format, length, or tone.  
   - Example:
     ```plaintext
     "Write a 100-word product description for a new AI-powered smartphone."
     ```

4. **Step-by-Step Breakdown**  
   - Ask the model to explain its reasoning in steps for complex tasks.  
   - Example:
     ```plaintext
     "Explain the process of machine learning model training in a step-by-step manner."
     ```

5. **Examples and Formatting Guidance**  
   - Show examples to guide the model on expected output formats.  
   - Example:
     ```plaintext
     "Translate the following English sentences to French:\n1. Hello, how are you?\n2. The weather is nice today."
     ```

6. **Iterative Refinement**  
   - Experiment with different prompts and refine them based on the model's responses.  
   - Example:
     ```plaintext
     "List 5 ways AI is used in healthcare. If possible, provide real-world examples."
     ```

By following these principles, developers can craft effective prompts that enhance the performance of AI models.
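
As a rough illustration of how several of these principles (role, context, constraints, step-by-step reasoning) can be combined in code, here is a minimal sketch. The `build_prompt` helper and its parameter names are hypothetical and not part of either SDK; it only assembles a string that you could then send as the `content` of a user message.

```python
def build_prompt(task, role=None, context=None, constraints=None, step_by_step=False):
    """Assemble a prompt string from optional role, context, and constraints."""
    parts = []
    if role:
        parts.append(f"You are {role}.")
    if context:
        parts.append(f"Context: {context}")
    parts.append(task)
    if constraints:
        parts.append("Constraints: " + "; ".join(constraints))
    if step_by_step:
        parts.append("Explain your reasoning step by step.")
    return "\n".join(parts)


prompt = build_prompt(
    task="Explain the benefits of index funds to a beginner investor.",
    role="an AI financial advisor",
    constraints=["at most 100 words", "friendly tone"],
)
print(prompt)
```

Keeping the parts separate makes iterative refinement easier: you can change one constraint and rerun without rewriting the whole prompt.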

This notebook follows the [Anthropic API fundamentals](https://github.com/anthropics/courses/tree/master/anthropic_api_fundamentals) course and provides a parallel track for the OpenAI API.
</details>


# Step 1: Load Environment Variables and API Keys
<details>
<summary>Click to expand/collapse</summary>

In this step, we will ensure that the notebook can load the global `.env` file and that the required API keys are in place.

## Objectives:
1. Load environment variables using the `dotenv` package.
2. Verify that API keys are set up correctly.
3. Provide options to check API keys with and without displaying them.

## Instructions:
1. Ensure you have the `python-dotenv` package installed:
   ```bash
   pip install python-dotenv
   ```

2. Create a `.env` file in the root directory (if not already created) and add your API keys:

       OPENAI_API_KEY=your_openai_api_key
       ANTHROPIC_API_KEY=your_anthropic_api_key

3. Run the following Python code to load and verify the API keys.

</details>

In [6]:
### Python Code:

import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Fetch API keys
openai_api_key = os.getenv("OPENAI_API_KEY")
anthropic_api_key = os.getenv("ANTHROPIC_API_KEY")

# Check if API keys are loaded
keys_loaded = bool(openai_api_key and anthropic_api_key)
if keys_loaded:
    print("✅ API keys are successfully loaded.")
else:
    print("⚠️ Warning: One or more API keys are missing.")

# Optionally, display API keys (for debugging purposes only)
display_keys = False  # Change to True if you want to see the keys

if display_keys:
    print(f"OpenAI API Key: {openai_api_key}")
    print(f"Anthropic API Key: {anthropic_api_key}")
elif keys_loaded:
    # Only claim the keys are loaded when both were actually found
    print("🔒 API keys are loaded but hidden for security.")


✅ API keys are successfully loaded.
🔒 API keys are loaded but hidden for security.


## Message Format: Understanding **role**, **user**, and **content**

<details>
<summary>Click to expand/collapse</summary>


### Message Format     
As we saw in the previous lesson, we can use `client.messages.create()` (Claude) and `client.chat.completions.create()` (OpenAI) to send a message to each model and get its response.

The messages format structures an API call as a conversation, which allows for **context preservation**: the request can carry the entire conversation history, including both user and assistant messages. Claude or GPT then has the full context of the conversation when generating a response, leading to more coherent and relevant outputs.  

**Note: many use-cases don't require a conversation history, and there's nothing wrong with providing a list of messages that only contains a single message!** 

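In addition to `content`, the `Message` object returned by Claude contains some other pieces of information (OpenAI's `ChatCompletion` response carries similar information under different names, such as `choices[0].finish_reason` and `usage`):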

* `id` - a unique object identifier
* `type` - The object type, which will always be "message"
* `role` - The conversational role of the generated message. This will always be "assistant".
* `model` - The model that handled the request and generated the response
* `stop_reason` - The reason the model stopped generating.  We'll learn more about this later.
* `stop_sequence` - We'll learn more about this shortly.
* `usage` - information on billing and rate-limit usage. Contains information on:
    * `input_tokens` - The number of input tokens that were used.
    * `output_tokens` - The number of output tokens that were used.

It's important to know that we have access to these pieces of information, but if you only remember one thing, make it this: `content` contains the actual model-generated content.
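
To make the field list above concrete, here is a small sketch of reading those fields. It does not call the API: the `response` object below is a stand-in built with `SimpleNamespace`, shaped like the `Message` fields described above, and its values (including the `id` and the text) are illustrative assumptions. A real object would come from `client.messages.create()`.

```python
from types import SimpleNamespace

# Stand-in for an Anthropic response; values are illustrative only.
response = SimpleNamespace(
    id="msg_example",
    type="message",
    role="assistant",
    model="claude-3-haiku-20240307",
    content=[SimpleNamespace(type="text", text="The exact flavor blend is a trade secret.")],
    stop_reason="end_turn",
    stop_sequence=None,
    usage=SimpleNamespace(input_tokens=18, output_tokens=146),
)

# `content` is a list of blocks; the generated text lives on the first text block.
text = response.content[0].text

# Token usage is reported separately for input and output.
total_tokens = response.usage.input_tokens + response.usage.output_tokens

print(text)
print(f"stop_reason={response.stop_reason}, total tokens={total_tokens}")
```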

### How do **role**, **user**, and **content** work? 
Let's take a closer look at this bit (whether it's Anthropic or OpenAI): 
```py
messages=[
        {"role": "user", "content": "What flavors are used in Dr. Pepper?"}
    ]
```

The `messages` parameter is a crucial part of interacting with the Claude and OpenAI APIs. It lets you provide the conversation history and context that **Claude** or **GPT** needs to generate a relevant response. 

The messages parameter expects a list of message dictionaries, where each dictionary represents a single message in the conversation.

Each message dictionary should have the following keys:

* `role`: A string indicating the role of the message sender. It can be either "user" (for messages sent by the user) or "assistant" (for messages sent by Claude or GPT). OpenAI also accepts "system" as a role, which is covered in the System Prompt section below.
* `content`: A string or list of content dictionaries representing the actual content of the message. If a string is provided, it will be treated as a single text content block. If a list of content dictionaries is provided, each dictionary should have a "type" (e.g., "text" or "image") and the corresponding content.  For now, we'll leave `content` as a single string.

Here's an example of a messages list with a single user message:

```py
messages = [
    {"role": "user", "content": "Hello Claude/OpenAI! How are you today?"}
]
```

And here's an example with multiple messages representing a conversation:

```py
messages = [
    {"role": "user", "content": "Hello Claude/OpenAI! How are you today?"},
    {"role": "assistant", "content": "Hello! I'm doing well, thank you. How can I assist you today?"},
    {"role": "user", "content": "Can you tell me a fun fact about ferrets?"},
    {"role": "assistant", "content": "Sure! Did you know that excited ferrets make a clucking vocalization known as 'dooking'?"},
]
```

Remember that messages always alternate between user and assistant messages.</details>
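
The alternation rule can be enforced while a conversation history is being built. The sketch below is one possible way to do that; `add_turn` is a hypothetical helper (not part of either SDK), and rejecting consecutive same-role turns is a choice made here to match the rule stated above.

```python
def add_turn(messages, role, content):
    """Return a new history with one more turn, enforcing user/assistant alternation."""
    if role not in ("user", "assistant"):
        raise ValueError(f"unsupported role: {role!r}")
    if messages and messages[-1]["role"] == role:
        raise ValueError("consecutive turns must alternate between user and assistant")
    # Build a new list so the caller's original history is left untouched.
    return messages + [{"role": role, "content": content}]


history = []
history = add_turn(history, "user", "Hello! How are you today?")
history = add_turn(history, "assistant", "I'm doing well, thank you. How can I help?")
history = add_turn(history, "user", "Can you tell me a fun fact about ferrets?")

for turn in history:
    print(f"{turn['role']:>9}: {turn['content']}")
```

The resulting `history` list has the same shape as the `messages` argument used throughout this notebook.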



In [7]:
from anthropic import Anthropic
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Initialize the Anthropic client using environment variable
client = Anthropic(api_key=os.getenv('ANTHROPIC_API_KEY'))

# Call the Claude API
response = client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=1000,
    messages=[
        {"role": "user", "content": "What flavors are used in Dr. Pepper?"}
    ]
)
print(response)

Message(id='msg_017uWCGskMV8cffGzgCskTS5', content=[TextBlock(text='The exact flavor blend in Dr Pepper is a closely guarded trade secret, but the generally accepted list of key flavors includes:\n\n- Cherry\n- Vanilla\n- Prune\n- Amaretto (almond)\n- Wintergreen\n- Molasses\n- Plum\n- Prune\n- Spices like cinnamon and nutmeg\n\nThe flavor is often described as a unique blend of cherry, vanilla, and other fruit and spice notes. Dr Pepper has a distinctive taste that is unlike other cola or soda flavors. The exact recipe has remained a mystery since the drink was first created in the late 1800s.', type='text')], model='claude-3-haiku-20240307', role='assistant', stop_reason='end_turn', stop_sequence=None, type='message', usage=Usage(cache_creation_input_tokens=0, cache_read_input_tokens=0, input_tokens=18, output_tokens=146))


In [8]:
import os
from openai import OpenAI
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Set up OpenAI client
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Check if API key is loaded
if not client.api_key:
    raise ValueError("⚠️ OpenAI API key is missing. Please check your .env file.")

# Call the OpenAI API
response = client.chat.completions.create(
    model="gpt-4-turbo",  # You can also use "gpt-3.5-turbo"
    messages=[
        {"role": "user", "content": "What flavors are used in Dr. Pepper?"}
    ],
    max_tokens=1000
)

# Print response
print(response.choices[0].message.content)


Dr. Pepper is known for its unique blend of 23 different flavors, which create its distinct taste. While the specific recipe is a closely guarded secret, some of the commonly speculated flavors include:

1. Cola
2. Cherry
3. Licorice
4. Almond
5. Vanilla
6. Blackberry
7. Apricot
8. Caramel
9. Pepper
10. Anise
11. Sarsaparilla
12. Ginger
13. Molasses
14. Lemon
15. Plum
16. Orange
17. Nutmeg
18. Cardamon
19. All Spice
20. Coriander
21. Juniper
22. Birch
23. Prune

These are not official, and the true mix of flavors that gives Dr. Pepper its characteristic taste remains proprietary.


# Model Parameters

<details><summary>Click to expand/collapse</summary>

### Lesson Goals
* Understand the role of the `max_tokens` parameter.
* Use the `temperature` parameter to control model responses.
* Explain the purpose of stop sequences (`stop_sequences` in Claude, `stop` in OpenAI).

## Required Parameters

When making a request to **Claude** (Anthropic), there are three required parameters, listed below. **GPT** (OpenAI) requires only `model` and `messages`; `max_tokens` is optional there, but we set it in every example for consistency:

* `model`
* `max_tokens`
* `messages`

So far, we have been using the `max_tokens` parameter in every single request, but we have not stopped to discuss what it is.

---

### Using `max_tokens` in Claude

Here is an example request to Claude:

```python
our_first_message = client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=500,
    messages=[
        {"role": "user", "content": "Hi there! Please write me a haiku about a pet chicken"}
    ]
)
```

### Here is an equivalent request to OpenAI: 

```python
our_first_message = client.chat.completions.create(
    model="gpt-4-turbo",
    max_tokens=500,
    messages=[
        {"role": "user", "content": "Hi there! Please write me a haiku about a pet chicken"}
    ]
)
```

### So What is the Purpose of max_tokens?
In short, **max_tokens** controls the maximum number of tokens that the model should generate in its response. 

Before we go any further, let us pause for a moment to discuss tokens.

Most Large Language Models don't think in full words but instead process and generate responses using tokens, which are small building blocks of a text sequence.

When we provide a prompt to an LLM, the model:

1. Converts the input into tokens.
2. Processes the tokens and generates the output one token at a time.

### Token Differences Between Claude and OpenAI

|**Model**                | **Approximate Token Size**         |
|--------------------------|-----------------------------------|
|**Claude (Anthropic)**   | 1 token ≈ 3.5 English characters |
|**OpenAI (GPT-4/GPT-3.5)** | 1 token ≈ 4 English characters |

*The exact token count may vary depending on the language and structure of the text.*
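
The ratios in the table can be turned into a quick back-of-the-envelope estimate. This is only a sketch: `estimate_tokens` is a hypothetical helper that divides character count by the approximate ratios above, and it is not a real tokenizer, so actual counts reported in `usage` will differ.

```python
import math

# Approximate characters per token, taken from the table above.
CHARS_PER_TOKEN = {"claude": 3.5, "openai": 4.0}


def estimate_tokens(text, provider):
    """Rough token estimate from character count; not a real tokenizer."""
    return math.ceil(len(text) / CHARS_PER_TOKEN[provider])


prompt = "Hi there! Please write me a haiku about a pet chicken"
for provider in CHARS_PER_TOKEN:
    print(provider, estimate_tokens(prompt, provider))
```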

### Why alter max tokens?
Understanding tokens is crucial when working with Claude or GPT, particularly for the following reasons:

* **API limits**: The number of tokens in your input text and the generated response count towards the API usage limits. Each API request has a maximum limit on the number of tokens it can process. Being aware of tokens helps you stay within the API limits and manage your usage efficiently.
* **Performance**: The number of tokens the model generates directly impacts the processing time and memory usage of the API. Longer input texts and higher max_tokens values require more computational resources. Understanding tokens helps you optimize your API requests for better performance.
* **Response quality**: Setting an appropriate max_tokens value ensures that the generated response is of sufficient length and contains the necessary information. If the max_tokens value is too low, the response may be truncated or incomplete. Experimenting with different max_tokens values can help you find the optimal balance for your specific use case.
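
One way to notice a truncated response programmatically is to check why generation stopped. Anthropic reports `stop_reason == "max_tokens"` and OpenAI reports `finish_reason == "length"` when the limit was hit. The `was_truncated` helper below is a hypothetical wrapper around that comparison and takes the reason string directly, so it runs without an API call.

```python
def was_truncated(provider, reason):
    """True when the model stopped because it hit the max_tokens limit."""
    # Anthropic: stop_reason == "max_tokens"; OpenAI: finish_reason == "length".
    limit_markers = {"anthropic": "max_tokens", "openai": "length"}
    return reason == limit_markers[provider]


# With real responses you would pass:
#   Anthropic: response.stop_reason
#   OpenAI:    response.choices[0].finish_reason
print(was_truncated("anthropic", "max_tokens"))
print(was_truncated("anthropic", "end_turn"))
print(was_truncated("openai", "length"))
```

If the check returns `True`, raising `max_tokens` or asking for a shorter answer are the usual next steps.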


</details>

In [10]:
from anthropic import Anthropic
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Initialize the Anthropic client (not OpenAI)
client = Anthropic(api_key=os.getenv('ANTHROPIC_API_KEY'))

# Call Claude with max_tokens=1000. Try values from 10 to 1000 to see the poem cut off; around 500 is usually enough for it to finish.

truncated_response = client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=1000,
    messages=[
        {"role": "user", "content": "Write me a poem"}
    ]
)
print(truncated_response.content[0].text)

Here is a poem for you:

The Dance of the Pen

Across the page, the pen takes flight,
Weaving words with rhythmic might.
A canvas blank, a mind in flight,
Together they create the dance of light.

Each stroke a step, each phrase a beat,
The poem unfolds, a work so sweet.
Emotions flow, thoughts take shape,
As the pen and mind their magic drape.

A symphony of ink and thought,
A tapestry of beauty, newly wrought.
The dance continues, ever-changing,
Capturing moments, forever rearranging.

In this dance, the heart finds its voice,
Expressing dreams, making a choice.
The pen, the poet, a harmonious pair,
Crafting poems that fill the air.


In [12]:
import os
from openai import OpenAI
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Initialize OpenAI client with API key from environment
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Check if API key is loaded correctly
if not client.api_key:
    raise ValueError("⚠️ OpenAI API key is missing. Please check your .env file.")

# Make the API call with max_tokens=1000; try 10, 100, or 500 to see the response get truncated
truncated_response = client.chat.completions.create(
    model="gpt-4-turbo",
    max_tokens=1000,
    messages=[
        {"role": "user", "content": "Write me a poem"}
    ]
)

# Print the response (it is cut off whenever max_tokens is too low)
print(truncated_response.choices[0].message.content)


In a meadow, soft and wide, under the cloak of a starlit sky,
Where whispers of the wind collude, and the nightingales shyly cry.

Beneath the silver brushstrokes, of Luna’s gentle gleam,
Lies a tranquil, silent promise, a canvas born from dream.

The grass, a verdant ocean, waves in the moon’s soft light,
Each blade a silent sentry, in the quiet arms of night.

Crickets sing in choruses, a symphony so sweet,
A serenade of nature’s heart, where sky and earth discreetly meet.

Dewdrops, like scattered diamonds, upon the meadow lie,
Capturing moonbeams, daintily, as if they shyly vie.

A fox, with ember eyes aglow, through shadows deftly weaves,
Rustles 'neath the old oak’s boughs, brushed by autumn leaves.

The river, a silent wanderer, carves its path with grace,
Its waters whisper secrets, in this serene and hallowed place.

And above, the constellations spin, tales in celestial weave,
Orion with his belt so bold, the Bear’s drape none deceive.

In this night, under stardust veil, whe

# Stop Sequences 

<details>
<summary>Click to expand/collapse</summary>

Another important parameter we haven't seen yet is the stop sequence (`stop_sequences` in Claude, `stop` in OpenAI), which lets us give the model a set of strings that, when encountered in the generated response, cause generation to stop. It is essentially a way of telling Claude or GPT, "if you generate this sequence, stop generating anything else!"
</details> 
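
The effect of a stop sequence can be imitated locally. The sketch below is not how either provider implements it (generation is actually halted server-side); `apply_stop_sequences` is a hypothetical helper that cuts an already-complete string at the earliest stop string and leaves the stop string itself out, which matches what both APIs return.

```python
def apply_stop_sequences(text, stop_sequences):
    """Cut `text` at the earliest stop sequence, excluding the stop string itself."""
    cut = len(text)
    matched = None
    for stop in stop_sequences:
        index = text.find(stop)
        if index != -1 and index < cut:
            cut, matched = index, stop
    return text[:cut], matched


full_output = '{\n  "name": "John Doe",\n  "email": "johndoe@example.com"\n}\nHope that helps!'
visible, matched = apply_stop_sequences(full_output, ["}"])
print(visible)
print(f"stopped on: {matched!r}")
```

Because the stop string is excluded, code that parses the result as JSON would need to append the closing `}` itself.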

In [13]:
# Baseline for the stop sequence example with Claude: no stop sequence is set, so the full JSON object is returned

from anthropic import Anthropic
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Initialize the Anthropic client using environment variable
client = Anthropic(api_key=os.getenv('ANTHROPIC_API_KEY'))

response = client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=500,
    messages=[{"role": "user", "content": "Generate a JSON object representing a person with a name, email, and phone number."}],
)
print(response.content[0].text)

Here's a JSON object representing a person with a name, email, and phone number:

{
  "name": "John Doe",
  "email": "johndoe@example.com",
  "phoneNumber": "123-456-7890"
}
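
The Claude call above sets no stop sequence, so the closing brace is part of the output. A request that does stop on `}` could look like the sketch below. It only builds the keyword arguments and sends nothing; the `claude_request` name is an assumption for illustration, while `stop_sequences` is the Anthropic parameter name.

```python
# Keyword arguments for client.messages.create(); nothing is sent in this sketch.
claude_request = {
    "model": "claude-3-haiku-20240307",
    "max_tokens": 500,
    "messages": [
        {
            "role": "user",
            "content": "Generate a JSON object representing a person with a name, email, and phone number.",
        }
    ],
    "stop_sequences": ["}"],  # Anthropic's name for the parameter OpenAI calls `stop`
}

# With a configured Anthropic client this would be sent as:
#   response = client.messages.create(**claude_request)
#   response.stop_reason    -> "stop_sequence" when a stop string was hit
#   response.stop_sequence  -> the string that matched
print(sorted(claude_request))
```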


In [14]:
# Example of stop_sequence with OpenAI API

import os
from openai import OpenAI
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Initialize the OpenAI client using environment variable
client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))

# Make an API call with a stop sequence
response = client.chat.completions.create(
    model="gpt-4-turbo",  # Use "gpt-3.5-turbo" if needed
    max_tokens=500,
    messages=[
        {"role": "user", "content": "Generate a JSON object representing a person with a name, email, and phone number."}
    ],
    stop=["}"]  # Stop generation after the closing JSON brace
)

# Print response
print(response.choices[0].message.content)


Here's an example JSON object representing a person with a name, email, and phone number:

```json
{
  "name": "John Doe",
  "email": "johndoe@example.com",
  "phone_number": "+1234567890"
```


## Temperature

The `temperature` parameter controls the "randomness" and "creativity" of the generated responses. Claude accepts values from 0 to 1, while OpenAI accepts values from 0 to 2; this notebook stays within 0 to 1. Higher values result in more diverse and unpredictable responses with variations in phrasing. Lower values result in more deterministic outputs that stick to the most probable phrasing and answers. **Temperature has a default value of 1**.

When generating text, any LLM (Claude, GPT or DeepSeek) predicts the probability distribution of the next token (word or subword). The temperature parameter is used to manipulate this probability distribution before sampling the next token. If the temperature is low (close to 0.0), the probability distribution becomes more peaked, with high probabilities assigned to the most likely tokens. This makes the model more deterministic and focused on the most probable or "safe" choices. If the temperature is high (closer to 1.0), the probability distribution becomes more flattened, with the probabilities of less likely tokens increasing. This makes the model more random and exploratory, allowing for more diverse and creative outputs. 
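
The reshaping described above is commonly modeled as dividing the raw scores (logits) by the temperature before applying softmax. The sketch below uses that textbook formulation with made-up logits for three candidate tokens; it is an assumption for illustration, not the providers' actual sampling code. Since dividing by zero is undefined, the helper requires a temperature above 0.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, dividing by temperature first."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0 in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]
for t in (0.2, 0.5, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lowest temperature almost all of the probability lands on the top-scoring token, while at 1.0 the less likely tokens keep a meaningful share.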

See this diagram for a visual representation of the impact of temperature (Source: Anthropic):
<img src="images/temperature.png" alt="Chart description" width="500" height="300"/>

Why would you change temperature?
**Use temperature closer to 0.0 for analytical tasks, and closer to 1.0 for creative and generative tasks.**

In [18]:
#Example of Temperature with Claude AI

from anthropic import Anthropic
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Initialize the Anthropic client using environment variable
client = Anthropic(api_key=os.getenv('ANTHROPIC_API_KEY'))

def demonstrate_temperature():
    temperatures = [0, 0.3, 0.5, 0.75, 1]
    for temperature in temperatures:
        print(f"Prompting Claude three times with temperature of {temperature}")
        print("================")
        for i in range(3):
            response = client.messages.create(
                model="claude-3-haiku-20240307",
                max_tokens=100,
                messages=[{"role": "user", "content": "Come up with a name for an alien planet. Respond with a single word."}],
                temperature=temperature
            )
            print(f"Response {i+1}: {response.content[0].text}")

demonstrate_temperature()

# Notice that with a temperature of 0, all three responses are the same.
# Even so, a temperature of 0 does not guarantee fully deterministic results.
# As the temperature rises toward 1, the responses vary more:
# in this run, each response at temperature 1 was a different alien planet name.


Prompting Claude three times with temperature of 0
Response 1: Xendor.
Response 2: Xendor.
Response 3: Xendor.
Prompting Claude three times with temperature of 0.3
Response 1: Zyloth.
Response 2: Xendor.
Response 3: Zyloth.
Prompting Claude three times with temperature of 0.5
Response 1: Xendor.
Response 2: Xendor.
Response 3: Xyloth.
Prompting Claude three times with temperature of 0.75
Response 1: Zyloth.
Response 2: Xylion.
Response 3: Xendor.
Prompting Claude three times with temperature of 1
Response 1: Zoloth.
Response 2: Xendor.
Response 3: Xylion.


In [17]:
# Example of Temperature with OpenAI API

import os
from openai import OpenAI
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Initialize the OpenAI client using environment variable
client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))

def demonstrate_temperature():
    temperatures = [0, 0.3, 0.5, 0.75, 1]
    
    for temperature in temperatures:
        print(f"Prompting OpenAI GPT three times with temperature of {temperature}")
        print("================")
        
        for i in range(3):
            response = client.chat.completions.create(
                model="gpt-4-turbo",  # Use "gpt-3.5-turbo" if needed
                max_tokens=100,
                messages=[{"role": "user", "content": "Come up with a name for a golden retriever. Respond with a single word."}],
                temperature=temperature
            )
            
            print(f"Response {i+1}: {response.choices[0].message.content}")

# Run the function
demonstrate_temperature()


Prompting OpenAI GPT three times with temperature of 0
Response 1: Sunny
Response 2: Sunny
Response 3: Sunny
Prompting OpenAI GPT three times with temperature of 0.3
Response 1: Sunny
Response 2: Sunny
Response 3: Sunny
Prompting OpenAI GPT three times with temperature of 0.5
Response 1: Sunny
Response 2: Sunny
Response 3: Sunny
Prompting OpenAI GPT three times with temperature of 0.75
Response 1: Sunny
Response 2: Sunny
Response 3: Sunny
Prompting OpenAI GPT three times with temperature of 1
Response 1: Sunny
Response 2: Sunny
Response 3: Sunny


## System Prompt

The system prompt is an optional input that you can include when sending messages to **Claude (Anthropic)** and **GPT-4 (OpenAI)**. It sets the stage for the conversation by providing high-level instructions, defining the AI's role, or giving background information that should inform its responses.

### Key Points About the System Prompt:
- It's **optional** but can be useful for setting the **tone** and **context** of the conversation.
- It is applied at the **conversation level**, affecting all responses within that exchange.
- It helps **steer the model’s behavior** without needing to include instructions in every user message.

### How Each Model Uses the System Prompt:
| **Model**   | **System Prompt Usage** |
|------------|------------------------|
| **Claude (Anthropic)** | Uses the top-level `system` parameter to set high-level guidance for the conversation. |
| **GPT-4 (OpenAI)** | Uses a system message as the first entry in `messages` (e.g., `{"role": "system", "content": "You are a helpful assistant."}`). |

---

### **Best Practices for Using System Prompts**
✅ **Use it for high-level guidance** (e.g., defining tone, behavior, role).  
✅ **Avoid detailed instructions or long documents** in the system prompt.  
✅ **Provide detailed instructions** inside the **first User message** for better results.  
✅ **No need to repeat it** for every subsequent user turn.
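
Since the two providers take the system prompt in different places, a small adapter can keep the rest of the code identical. The sketch below is one possible shape; `with_system_prompt` is a hypothetical helper that returns keyword arguments only and makes no API call.

```python
def with_system_prompt(provider, system_prompt, user_message):
    """Build request kwargs that carry the same system prompt for either provider."""
    user_turn = {"role": "user", "content": user_message}
    if provider == "anthropic":
        # Anthropic takes the system prompt as a top-level `system` argument.
        return {"system": system_prompt, "messages": [user_turn]}
    if provider == "openai":
        # OpenAI takes it as the first message, with role "system".
        return {"messages": [{"role": "system", "content": system_prompt}, user_turn]}
    raise ValueError(f"unknown provider: {provider!r}")


tutor = "You are a helpful foreign language tutor that always responds in Chinese."
claude_kwargs = with_system_prompt("anthropic", tutor, "Hey there, how are you?!")
openai_kwargs = with_system_prompt("openai", tutor, "Hey there, how are you?!")
print(claude_kwargs)
print(openai_kwargs)
```

The result would then be unpacked into the real call, for example `client.messages.create(model=..., max_tokens=..., **claude_kwargs)` for Claude or `client.chat.completions.create(model=..., **openai_kwargs)` for OpenAI.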

In [19]:
#Example of System Prompts with Claude AI

from anthropic import Anthropic
import os
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Initialize the Anthropic client using environment variable
client = Anthropic(api_key=os.getenv('ANTHROPIC_API_KEY'))

message = client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=1000,
    system="You are a helpful foreign language tutor that always responds in Chinese.",
    messages=[
        {"role": "user", "content": "Hey there, how are you?!"}
    ]
)

print(message.content[0].text)

很高兴见到你!我很好,谢谢你的问候。你今天过得怎么样?


In [20]:
# Example of System Prompts with OpenAI API

import os
from openai import OpenAI
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

# Initialize the OpenAI client using environment variable
client = OpenAI(api_key=os.getenv('OPENAI_API_KEY'))

# Create a chat completion with a system prompt
message = client.chat.completions.create(
    model="gpt-4-turbo",  # Use "gpt-3.5-turbo" if needed
    max_tokens=1000,
    messages=[
        {"role": "system", "content": "You are a helpful foreign language tutor that always responds in Chinese."},
        {"role": "user", "content": "Hey there, how are you?!"}
    ]
)

# Print response
print(message.choices[0].message.content)


你好！我很好，谢谢。你最近怎么样？需要帮助学习中文吗？
