# Prompt Engineering Techniques

## Global setup

In [3]:
# Global constants
OLLAMA_CHATBOT_URL="http://localhost:11434/api/chat"
OLLAMA_EMBEDDING_URL="http://localhost:11434/api/embed"
MODEL_NAME="llama3.1:8b-instruct-q4_0"
EMBEDDING_MODEL="nomic-embed-text"

In [2]:
# Global imports
import json
import requests

In [4]:
# Global utility functions
def get_chatbot_response(session: requests.Session , model_name: str, messages, temperature: float = 0.0):
    response = session.post(
        OLLAMA_CHATBOT_URL,
        data=json.dumps({
            "model": model_name,
            "messages": messages,
            "options": {
                "temperature": temperature,
                "top_p": 0.8
            },
            "stream": False
        })
    )
    return response.json()["message"]["content"]

def get_embedding(session: requests.Session, model_name: str, text_input):
    response = session.post(
        OLLAMA_EMBEDDING_URL,
        data=json.dumps({
            "model": model_name,
            "input": text_input
        })
    )
    return response.json()

In [5]:
session = requests.Session()

## Get LLM response

An LLM model accepts messages with specific roles to distinguish between the input from the user and the responses from the assistant. This helps the model understand the context and generate appropriate replies. Here's how the roles work:

- **User**: This role represents the input from the user. It's the part of the conversation where the user asks a question or provides information.
- **Assistant**: This role represents the response from the AI assistant. The assistant generates a reply based on the user's input. The assistant's role is to provide accurate and helpful information or continue the conversation.
- **System**: This role represents initial instructions or context for the conversation. It sets the behavior or context of the assistant.

In [22]:
messages = [{"role": "user", "content": "What's the capital of Italy?"}]
response = get_chatbot_response(session, MODEL_NAME, messages)
print(response)

The capital of Italy is Rome (Italian: Roma).


## Structured output

Structured output refers to designing prompts in a way that the AI generates responses in a clear and organized format. This ensures the information is easy to read, process, and utilize, particularly when integrating with other systems or applications.

In [5]:
system_prompt = """
You are a helpful assistant that answers questions about capitals of countries.

Your output should be in a structured json format exactly like the one bellow. Your are not allowed to write anything but the structured json:

[
{
    "country": The country that you will get the capital of
    "capital": The capital of the country stated
}
]
"""

In [28]:
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "What's the capital of Germany?"}
]
response = get_chatbot_response(session, MODEL_NAME, messages)
print(response)

[
{
  "country": "Germany",
  "capital": "Berlin"
}
]


In [29]:
json_response = json.loads(response)
json_response

[{'country': 'Germany', 'capital': 'Berlin'}]

## Input structuring

Structured input is a technique used to organize and format data or requests in a clear, logical, and coherent manner. The goal is to ensure that the AI or system understands the user's needs accurately and can provide the desired output effectively.

In [6]:
user_prompt = """
Get me the capitals of the following countries:

```
1. Spain
2. Portugal
3. Norway
4. Germany
"""

In [7]:
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt}
]
response = get_chatbot_response(session, MODEL_NAME, messages)
print(response)

[
{
  "country": "Spain",
  "capital": "Madrid"
},
{
  "country": "Portugal",
  "capital": "Lisbon"
},
{
  "country": "Norway",
  "capital": "Oslo"
},
{
  "country": "Germany",
  "capital": "Berlin"
}
]


In [9]:
json_response = json.loads(response)
json_response

[{'country': 'Spain', 'capital': 'Madrid'},
 {'country': 'Portugal', 'capital': 'Lisbon'},
 {'country': 'Norway', 'capital': 'Oslo'},
 {'country': 'Germany', 'capital': 'Berlin'}]

## Give the model time to think (Chain of Thought)

The chain of thought technique involves encouraging the AI model to think step-by-step to solve a problem or provide a detailed answer. Instead of giving a direct answer, the model breaks down the problem into smaller, manageable pieces, reasoning through each one to reach a final conclusion. This technique is particularly useful for complex questions that require logical reasoning or multi-step problem-solving.

Here is the paper of the technique: [https://arxiv.org/pdf/2205.11916](https://arxiv.org/pdf/2205.11916)

In [10]:
user_prompt = """
Calculate the result of this equation:

```
1 + 3
```

Your output should be in a structured json format exactly like the one below. You are not allow to write anything but the structured json format:
{
    "result": The final number resulted from calculating the equation above
}
"""

In [11]:
messages = [{"role": "user", "content": user_prompt}]
response = get_chatbot_response(session, MODEL_NAME, messages)
print(response)

{
    "result": 4
}


In [14]:
259/2*8654+91072*33-12971

4113098.0

In [15]:
user_prompt = """
Calculate the result of this equation:

```
259/2*8654+91072*33-12971
```

Your output should be in a structured json format exactly like the one below. You are not allow to write anything but the structured json format:
{
    "result": The final number resulted from calculating the equation above
}
"""

In [16]:
messages = [{"role": "user", "content": user_prompt}]
response = get_chatbot_response(session, MODEL_NAME, messages)
print(response)

{
    "result": 11111111
}


In [17]:
4113098.0 - 11111111

-6998013.0

In [20]:
user_prompt = """
Calculate the result of this equation:

```
259/2*8654+91072*33-12971
```

Your output should be in a structured json format exactly like the one below. You are not allow to write anything but the structured json format:
{
    "steps": This is where you solve the equation step by step following the BEDMAS order of operations. You need to show your work and calculate each step leading to final result. Feel free to write here in free text. 
    "result": The final number resulted from calculating the equation above. Just the number.
}
"""

In [21]:
messages = [{"role": "user", "content": user_prompt}]
response = get_chatbot_response(session, MODEL_NAME, messages)
print(response)

{
    "steps": "First, we divide 259 by 2: 259/2 = 129.5\nThen, we multiply 8654 by 129.5: 8654 * 129.5 = 1119111\nNext, we multiply 91072 by 33: 91072 * 33 = 3002016\nNow, we add 1119111 and 3002016: 1119111 + 3002016 = 4121127\nFinally, we subtract 12971 from 4121127: 4121127 - 12971 = 4100456",
    "result": 4100456
}


In [22]:
4113098.0 - 4100456

12642.0

## RAG (Retrival Augmented Generation)

The Retrieval Augmented Generation (RAG) technique is a method used in prompt engineering to enhance the capabilities of language models by integrating external knowledge sources. The key advantage of RAG is that it allows language models to access up-to-date information without needing to be retrained for specific tasks.

In the example we ask the LLM about a subject that does not know nothing about, in this case we ask it about the iphone 16 which at the time of training of llama3.1 didn't exist.

In [23]:
user_prompt = """
What's new in the iphone 16?
"""

messages = [{"role": "user", "content": user_prompt}]
response = get_chatbot_response(session, MODEL_NAME, messages)
print(response)

There is no such thing as an "iPhone 16". Apple has released several iPhone models, including:

* iPhone (2007)
* iPhone 3G (2008)
* iPhone 4 (2010)
* iPhone 5 (2012)
* iPhone 6 (2014)
* iPhone 7 (2016)
* iPhone X (2017)
* iPhone 11 (2019)
* iPhone 12 (2020)
* iPhone 13 (2021)

However, Apple has announced the following upcoming iPhone models:

* iPhone 14 (expected to be released in September 2022)
* iPhone 15 (rumored to be released in 2023, but no official announcement yet)

If you're looking for information on a specific iPhone model, I'd be happy to help with that!


Now we are going to inject the context for the LLM into the user prompt, this way the LLM can generate an appropriate response.

In [6]:
iphone_16 = """
The iPhone 16 introduces several exciting updates, making it one of Apple's most advanced smartphones to date. It features a larger 6.1-inch display for the base model and a 6.7-inch screen for the iPhone 16 Plus, with thinner bezels and a more durable Ceramic Shield. The iPhone 16 Pro and Pro Max boast even larger displays, measuring 6.3 and 6.9 inches respectively, offering the thinnest bezels seen on any Apple product so far.

Powered by the new A18 chip (A18 Pro for the Pro models), these phones deliver significant performance improvements, with enhanced neural engine capabilities, faster GPU for gaming, and machine learning tasks. The camera systems are also upgraded, with the base iPhone 16 sporting a dual-camera setup with a 48MP main sensor. The Pro models offer a 48MP Ultra Wide and 5x telephoto camera, enhanced by Apple’s "Camera Control" button for more flexible photography options.

Apple also introduced advanced audio features like "Audio Mix," which uses machine learning to separate background sounds from speech, allowing for more refined audio capture during video recording. Battery life has been extended, especially in the iPhone 16 Pro Max, which is claimed to have the longest-lasting battery of any iPhone 
9TO5MAC

APPLEMAGAZINE
.

Additionally, Apple has switched to USB-C for faster charging and data transfer, and the Pro models now support up to 2x faster video encoding. The starting prices remain consistent with previous generations, with the iPhone 16 starting at $799, while the Pro models start at $999
"""

In [26]:
user_prompt = f"""
{iphone_16}

What's new in iphone 16?
"""

messages = [{"role": "user", "content": user_prompt}]
response = get_chatbot_response(session, MODEL_NAME, messages)
print(response)

According to the text, the following are the new features and updates in the iPhone 16:

1. **Larger displays**: The base model has a 6.1-inch display, the Plus model has a 6.7-inch screen, the Pro model has a 6.3-inch display, and the Pro Max model has a 6.9-inch screen.
2. **Thinner bezels**: All models have thinner bezels than previous iPhones.
3. **Ceramic Shield**: The iPhone 16 features a more durable Ceramic Shield.
4. **New A18 chip (A18 Pro for Pro models)**: This provides significant performance improvements, including enhanced neural engine capabilities and faster GPU for gaming and machine learning tasks.
5. **Upgraded camera systems**:
	* Base model has a dual-camera setup with a 48MP main sensor.
	* Pro models have a 48MP Ultra Wide and 5x telephoto camera.
6. **Audio features**: The iPhone 16 introduces "Audio Mix," which uses machine learning to separate background sounds from speech, allowing for more refined audio capture during video recording.
7. **Extended battery 

### Extracting context data from database

Let's imagine we have the descriptions of samsung and iphone, and we are asking the model about the iphone. How can we determine which description we should inject in the user prompt?

In [7]:
samsung_s23 = """
The Samsung Galaxy S23 brings some incremental but notable upgrades to its predecessor, the Galaxy S22. It features the Snapdragon 8 Gen 2 processor, a powerful chip optimized for the S23 series, delivering enhanced performance, especially for gaming and multitasking. This chip ensures top-tier speed and efficiency across all models, from the base S23 to the larger S23+ and S23 Ultra​
STUFF

TECHRADAR
.

In terms of design, the S23's camera module has been streamlined by removing the raised metal contour around the cameras, creating a cleaner, sleeker look. It also sports the same 6.1-inch 120Hz AMOLED display, protected by tougher Gorilla Glass Victus 2, making it more resistant to scratches and drops​
TECHRADAR
.

The S23 Ultra stands out with its 200MP main camera, offering impressive photo clarity, especially in low-light conditions. The selfie camera across the series has been updated to a 12MP sensor, resulting in sharper selfies. The Ultra model also includes productivity tools such as the S-Pen, which remains an essential feature for note-taking and creative tasks​
STUFF

TECHRADAR
.

Battery life is solid, with the S23 Ultra featuring a 5000mAh battery that lasts comfortably through a day of heavy use. However, charging speeds still lag behind some competitors, with 45W wired charging, which is slower than other brands offering up to 125W charging​
STUFF
.

Overall, the Galaxy S23 series enhances performance, durability, and camera quality, making it a strong contender for users seeking a high-performance flagship.
"""

In [8]:
data = [iphone_16, samsung_s23]

In [9]:
user_prompt ="""
What's new in iphone 16?
"""

### Embeddings

Embeddings are a way to represent words as numbers (vectors) so that computers can understand them. Instead of treating words just as unique symbols, embeddings capture their meanings and relationships. Imagine each word is a dot on a big map, and words with similar meanings are close to each other.

Embeddings are used to find relevant information from a large collection of documents. The AI looks for the closest "dots" (documents) to the query on the map.

In [13]:
user_prompt_embedding = get_embedding(session, EMBEDDING_MODEL, user_prompt)
user_prompt_embedding["embeddings"]

[[-0.0051620365,
  0.0052757715,
  -0.14794724,
  0.03539895,
  0.035747394,
  0.012521504,
  -0.00037501642,
  -0.046194173,
  0.0073074712,
  -0.057164375,
  -0.009706688,
  0.023857333,
  0.06739639,
  0.01527999,
  -0.018489953,
  -0.03863316,
  -0.013000933,
  -0.04841448,
  0.025541296,
  0.01169141,
  0.08023607,
  -0.028305853,
  -0.074373424,
  0.050981704,
  0.08993426,
  -0.0032618267,
  0.036975313,
  0.059670378,
  -0.0032444743,
  0.04494527,
  0.06325461,
  -0.017304735,
  -0.0091284085,
  -0.0027310103,
  0.013372992,
  -0.015084373,
  0.046653997,
  -0.040068075,
  -0.014020665,
  0.01851331,
  0.026159499,
  0.016547913,
  -0.0193046,
  -0.010250008,
  0.05560817,
  -0.018424783,
  0.036805987,
  -0.009752596,
  -0.013783014,
  -0.039880402,
  -0.047787614,
  0.031901613,
  0.031045595,
  -0.02602258,
  0.054729536,
  0.016133253,
  0.01775732,
  0.038217817,
  -0.044598415,
  -8.49434e-05,
  0.07135627,
  0.038405195,
  0.00073835225,
  0.063246384,
  0.06391034,
  -

In [16]:
data_embeddings = get_embedding(session, EMBEDDING_MODEL, data)
data_embeddings["embeddings"]

[[0.04533401,
  0.08049695,
  -0.1509854,
  0.041210163,
  0.051370975,
  -0.018109234,
  0.004530468,
  -0.023527859,
  0.02335108,
  -0.014220325,
  -0.0030327905,
  0.022225019,
  0.052140716,
  0.03947962,
  0.028328797,
  -0.01787217,
  -0.01683558,
  -0.06472199,
  0.068538696,
  0.02514994,
  0.041909773,
  -0.023534756,
  -0.13693465,
  0.03180691,
  0.040395442,
  0.0346862,
  -0.008374418,
  0.00950424,
  -0.022766441,
  0.051112734,
  0.1022283,
  -0.058992196,
  -0.021933932,
  -0.0041987025,
  0.0019815713,
  -0.012009788,
  0.04187013,
  -0.0037261478,
  0.004307056,
  0.022960773,
  0.05880568,
  -0.00036613957,
  0.0060297702,
  -0.0010887448,
  0.057366863,
  -0.024379795,
  0.030496707,
  -0.0070623048,
  0.049399626,
  -0.053006943,
  -0.013922116,
  0.06805809,
  -0.004345578,
  0.0056456514,
  0.0627202,
  0.009124002,
  -0.01369975,
  0.06471193,
  -0.052316025,
  -0.0049676024,
  0.08221141,
  0.03499059,
  -0.0073098987,
  0.059999757,
  0.06209644,
  -0.0096825

Once we have the user prompt and the data in a vector form, we can compute the cosine similarity which tells us how similar are two vectors. That's how we can select the most relevant information to the user prompt.  

In [17]:
# Install scikit-learn to use cosine similarity
!uv pip install scikit-learn

[2mUsing Python 3.12.3 environment at: /home/eric/repos/bot-exploration/.venv[0m
[2K[2mResolved [1m5 packages[0m [2min 365ms[0m[0m                                         [0m
[2K[37m⠙[0m [2mPreparing packages...[0m (0/5)                                                   
[2K[1A[37m⠙[0m [2mPreparing packages...[0m (0/5)----[0m[0m     0 B/294.74 KiB                    [1A
[2mthreadpoolctl[0m [32m[2m------------------------------[0m[0m     0 B/17.98 KiB
[2K[2A[37m⠙[0m [2mPreparing packages...[0m (0/5)----[0m[0m     0 B/294.74 KiB                    [2A
[2mthreadpoolctl[0m [32m-------------------------[2m-----[0m[0m 14.88 KiB/17.98 KiB
[2K[2A[37m⠙[0m [2mPreparing packages...[0m (0/5)----[0m[0m     0 B/294.74 KiB                    [2A
[2mthreadpoolctl[0m [32m------------------------------[2m[0m[0m 17.98 KiB/17.98 KiB
[2K[2A[37m⠙[0m [2mPreparing packages...[0m (0/5)----[0m[0m     0 B/294.74 KiB                    [2A
[2K

In [21]:
from sklearn.metrics.pairwise import cosine_similarity

# Compute the similarity of the user prompt with the data.
data_similarity_scores = cosine_similarity([user_prompt_embedding["embeddings"][0]], data_embeddings["embeddings"])

# Get the similarity scores (the greatest score means it's the most similar)
data_similarity_scores

array([[0.74575299, 0.45784973]])

In [23]:
# Get the index of the most similar
most_similar_index = data_similarity_scores.argmax()
print(data[most_similar_index])


The iPhone 16 introduces several exciting updates, making it one of Apple's most advanced smartphones to date. It features a larger 6.1-inch display for the base model and a 6.7-inch screen for the iPhone 16 Plus, with thinner bezels and a more durable Ceramic Shield. The iPhone 16 Pro and Pro Max boast even larger displays, measuring 6.3 and 6.9 inches respectively, offering the thinnest bezels seen on any Apple product so far.

Powered by the new A18 chip (A18 Pro for the Pro models), these phones deliver significant performance improvements, with enhanced neural engine capabilities, faster GPU for gaming, and machine learning tasks. The camera systems are also upgraded, with the base iPhone 16 sporting a dual-camera setup with a 48MP main sensor. The Pro models offer a 48MP Ultra Wide and 5x telephoto camera, enhanced by Apple’s "Camera Control" button for more flexible photography options.

Apple also introduced advanced audio features like "Audio Mix," which uses machine learning

In [24]:
user_prompt_with_data = f"""
{data[most_similar_index]}

{user_prompt}
"""

In [25]:
messages = [{"role": "user", "content": user_prompt_with_data}]
response = get_chatbot_response(session, MODEL_NAME, messages)
print(response)

According to the text, the following are new features and updates in the iPhone 16:

1. Larger display sizes:
	* Base model: 6.1-inch
	* iPhone 16 Plus: 6.7-inch
	* iPhone 16 Pro: 6.3-inch
	* iPhone 16 Pro Max: 6.9-inch
2. Thinner bezels and Ceramic Shield for added durability
3. New A18 chip (A18 Pro for Pro models) with:
	* Enhanced neural engine capabilities
	* Faster GPU for gaming and machine learning tasks
4. Upgraded camera systems:
	* Base iPhone 16: dual-camera setup with 48MP main sensor
	* Pro models: 48MP Ultra Wide and 5x telephoto camera, with "Camera Control" button
5. Advanced audio features:
	* "Audio Mix" for refined audio capture during video recording
6. Extended battery life, especially in the iPhone 16 Pro Max
7. Switch to USB-C for faster charging and data transfer
8. Up to 2x faster video encoding on Pro models
