# **Prompt Engineering**

An **LLM** **predicts tokens** one after another **based on context and training**.
**Prompt engineering** means writing clear, optimized prompts that guide the model toward correct answers.
You need to **experiment** with the prompt's structure, style and length, and with the model's parameters (e.g. temperature, top-k).
Prompts let you do **summaries, Q&A, classification, translation, code generation and more**.
Each LLM can require different prompts: a prompt must be **adapted to the model used** (Gemini, GPT, LLaMA, etc.).

### **LLM Output Configuration**

Once you have chosen a model, it is essential to **configure it well** to get outputs suitable for your task.

#### **Output Length**
- Controls **how many tokens** the model can generate.
- **More tokens** = higher cost, slower responses, more energy consumption.
- A lower limit does not make the output more concise; the model simply stops once the limit is reached.
- Limiting it is useful in techniques like **ReAct**, to avoid useless output after the answer.

#### **Sampling Controls**
LLMs do not pick a fixed token; they **calculate a probability distribution** over the next token. These settings regulate **how random or creative** the output will be:

- **Temperature**
  - **0** = deterministic (always the most likely token).
  - **High** = more randomness and creativity.

- **Top-K**
  - Considers only the **K most likely tokens**.
  - Low K = more reliable answers.
  - High K = more creativity.

- **Top-P** (nucleus sampling)
  - Considers the most likely tokens until their **cumulative probability reaches P**.
  - **Low P** = less variety.
  - **High P** = more freedom.

**Tip**: Experiment with **temperature + top-k + top-p** to balance creativity and accuracy.
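
To make the three controls concrete, here is a minimal sketch in plain Python that applies them to a toy set of logits. The function names, the toy tokens and the order of the filters (temperature, then top-k, then top-p) are assumptions for illustration; real inference stacks may apply them differently.

```python
import math
import random


def sampling_distribution(logits, temperature=1.0, top_k=None, top_p=None):
    """Return {token: probability} after temperature, top-k and top-p filtering."""
    if temperature == 0:
        # Greedy decoding: all probability mass goes to the most likely token.
        best = max(logits, key=logits.get)
        return {best: 1.0}

    # Temperature-scaled softmax (subtract the max for numerical stability).
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(((tok, w / total) for tok, w in weights.items()),
                    key=lambda item: item[1], reverse=True)

    # Top-k: keep only the k most likely tokens.
    if top_k is not None:
        ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches p.
    if top_p is not None:
        kept, cumulative = [], 0.0
        for tok, prob in ranked:
            kept.append((tok, prob))
            cumulative += prob
            if cumulative >= top_p:
                break
        ranked = kept

    # Renormalise the surviving tokens so they sum to 1.
    total = sum(prob for _, prob in ranked)
    return {tok: prob / total for tok, prob in ranked}


def sample_token(distribution, rng=random):
    """Draw one token from a {token: probability} mapping."""
    tokens = list(distribution)
    return rng.choices(tokens, weights=[distribution[t] for t in tokens], k=1)[0]


# Toy logits, invented for the demo.
toy_logits = {"cat": 2.0, "dog": 1.5, "car": 0.3, "sky": -1.0}
print(sampling_distribution(toy_logits, temperature=0))
print(sampling_distribution(toy_logits, temperature=0.7, top_k=3, top_p=0.9))
print(sample_token(sampling_distribution(toy_logits, temperature=0.7, top_k=3)))
```

With `temperature=0`, `top_k=1` or `top_p=0` the sketch collapses to the single most likely token; raising any of the three lets more tokens survive the filters.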

#### **Balancing Temperature, Top-K and Top-P**

**Top-K**, **Top-P** and **Temperature** must be balanced with each other. Each parameter affects the others:

- **Temperature = 0** → deterministic; *top-K* and *top-P* are ignored.
- **Top-K = 1** → only the most probable token is chosen; *temperature* and *top-P* are ignored.
- **Top-P = 0** → effect similar to *top-K = 1*: only the most probable token is considered.
- **High values** for *temperature*, *top-K* or *top-P* → more creative, but less controlled output.

**Recommended settings**:
- Balanced output: `temp = 0.2`, `top-P = 0.95`, `top-K = 30`
- Creative: `temp = 0.9`, `top-P = 0.99`, `top-K = 40`
- Less creative / more precise: `temp = 0.1`, `top-P = 0.9`, `top-K = 20`
- Questions with only one correct answer (e.g. math): `temp = 0`
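
One possible way to keep these starting points in code is a small preset table. This is a sketch: the key names (`temperature`, `top_p`, `top_k`) and the helper `generation_config` are assumptions, so rename them to whatever your SDK expects.

```python
# Starting points taken from the list above; tune them per model and task.
SAMPLING_PRESETS = {
    "balanced": {"temperature": 0.2, "top_p": 0.95, "top_k": 30},
    "creative": {"temperature": 0.9, "top_p": 0.99, "top_k": 40},
    "precise": {"temperature": 0.1, "top_p": 0.9, "top_k": 20},
    # Greedy decoding: top_p and top_k have no effect at temperature 0.
    "single_answer": {"temperature": 0.0},
}


def generation_config(preset, **overrides):
    """Copy a preset and apply per-call overrides without mutating the preset."""
    config = dict(SAMPLING_PRESETS[preset])
    config.update(overrides)
    return config


print(generation_config("balanced"))
print(generation_config("creative", top_k=50))
```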

**Beware of the repetition loop**: the model can get stuck repeating the same words or phrases both at low temperatures (output too rigid) and at high temperatures (output too random). Tune temperature, top-K and top-P together to avoid it.

### **Prompting Techniques**

**General Prompting / Zero-shot**
- Give only instructions or text input **without examples**.
- This is the simplest type of prompt, useful for clear and short tasks.
- Use low temperatures (e.g. `0.1`) for more specific answers.

**One-shot Prompting**
- Include **only one example** in the prompt.
- It is used to show the format or style of the desired answer.
- Useful when the task is not trivial but can be understood with an example.

**Few-shot Prompting**
- Provide **3–5 examples or more**, to help the model spot patterns.
- Increases the likelihood that the model will follow the desired structure.
- Ideal for complex tasks, but be careful about the maximum length of the prompt.

**Tips**:
- Clear and well-written examples = better results.
- Include **edge cases** to increase model robustness.
- Track and document versions of your prompts as you design.
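
The three variants differ only in how many examples the prompt carries, which a small helper can make explicit. This is a sketch: `build_few_shot_prompt`, the `Review:`/`Sentiment:` layout and the sample reviews are invented for illustration.

```python
def build_few_shot_prompt(instruction, examples, new_input):
    """Assemble instruction + worked examples + the new input into one prompt."""
    parts = [instruction.strip(), ""]
    for text, label in examples:
        parts += [f"Review: {text}", f"Sentiment: {label}", ""]
    # The prompt ends on an open "Sentiment:" slot for the model to complete.
    parts += [f"Review: {new_input}", "Sentiment:"]
    return "\n".join(parts)


INSTRUCTION = ("Classify each review as POSITIVE, NEUTRAL, or NEGATIVE. "
               "Return only the uppercase label.")

# Hypothetical examples; the second one is a deliberate edge case (mixed opinion).
EXAMPLES = [
    ("A masterpiece, I would watch it again tomorrow.", "POSITIVE"),
    ("Great acting, terrible script. Hard to say if I liked it.", "NEUTRAL"),
    ("Two hours of my life I will never get back.", "NEGATIVE"),
]

new_review = "The plot dragged, but the soundtrack was lovely."
print(build_few_shot_prompt(INSTRUCTION, EXAMPLES, new_review))      # few-shot
print(build_few_shot_prompt(INSTRUCTION, EXAMPLES[:1], new_review))  # one-shot
print(build_few_shot_prompt(INSTRUCTION, [], new_review))            # zero-shot
```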

### **Types of Prompting (LLM)**

#### **1. System Prompting**
**Objective:** Defines the **general purpose** of the model and imposes **rules** on the output.

**Used to:**
- Specify output format (e.g. JSON only, uppercase labels only)
- Enforce behaviors (e.g. “be respectful”)

**Example:**
> Prompt:
> *"Classify reviews as POSITIVE, NEUTRAL, or NEGATIVE. Return only the uppercase label."*
> Output: `NEGATIVE`

**Another example with JSON:**
> Prompt:
> *"Classify reviews and return JSON with specific structure."*
> Output:
> ```json
> {
>   "movie_reviews": [
>     {
>       "sentiment": "NEGATIVE",
>       "name": "Her"
>     }
>   ]
> }
> ```
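
Asking for JSON pays off when the calling code can check the reply. A minimal sketch, assuming the structure shown above (`movie_reviews`, `sentiment`, `name`); the function name `parse_reviews` is hypothetical.

```python
import json

ALLOWED_SENTIMENTS = {"POSITIVE", "NEUTRAL", "NEGATIVE"}


def parse_reviews(raw_output):
    """Parse the model's reply and check it against the expected structure."""
    # json.loads raises json.JSONDecodeError (a ValueError) on non-JSON text.
    data = json.loads(raw_output)
    reviews = data["movie_reviews"]
    for review in reviews:
        if review["sentiment"] not in ALLOWED_SENTIMENTS:
            raise ValueError(f"unexpected sentiment: {review['sentiment']!r}")
        if not isinstance(review["name"], str):
            raise ValueError("name must be a string")
    return reviews


reply = '{"movie_reviews": [{"sentiment": "NEGATIVE", "name": "Her"}]}'
print(parse_reviews(reply))
```

A reply that fails this check can be retried or logged instead of being passed downstream.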

---

#### **2. Contextual Prompting**
**Objective:** Provides **specific context** for the current prompt, which can be used to generate more relevant responses.

**Used to:**
- Add background to the prompt
- Improve accuracy
- Adapt output to a dynamic situation

**Example:**
> Prompt:
> *"Context: You are writing a blog about arcade games from the 80s. Suggest 3 articles with descriptions."*
> Output:
> - The evolution of arcade cabinets
> - Iconic games of the 80s
> - The rebirth of pixel art

---

#### **3. Role Prompting**
**Objective:** Assign a **role or identity** to the model to influence **tone, style, and knowledge**.

**Used to:**
- Define a consistent behavior (e.g. “you are a teacher”, “you are a tour guide”)
- Modulate the tone (formal, humorous, inspirational…)

**Standard example:**
> Prompt:
> *"Act as a tour guide. I am in Amsterdam and I only want to see museums."*
> Output:
> - Rijksmuseum
> - Van Gogh Museum
> - Stedelijk Museum

**Humorous example:**
> Prompt:
> *"You are a comical tour guide. I am in Manhattan."*
> Output:
> - "Climb to the top of the Empire like King Kong (without a banana)"
> - "MoMA: art that makes you doubt your talent with stick men"
> - "Shopping on Fifth Avenue: prepare to cry… your wallet"

---

### **Conclusion**
- **System Prompt** = *Rules and structure*
- **Contextual Prompt** = *Dynamic and current context*
- **Role Prompt** = *Voice, identity, tone of the model*

You can combine them to get very powerful and personalized prompts, like:

> “Act as an expert teacher (*role*) and explain in a simple way (*system*) how backpropagation works in deep learning, considering that the student is a novice (*context*).”
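
In an application the three parts are often kept separate and joined only when the request is sent. A minimal sketch; `compose_prompt` and the `Role:`/`Rules:`/`Context:`/`Task:` labels are assumptions, not a required format.

```python
def compose_prompt(role, rules, context, task):
    """Join role, system rules, context and the actual task into one prompt."""
    sections = [
        f"Role: {role}",
        f"Rules: {rules}",
        f"Context: {context}",
        f"Task: {task}",
    ]
    return "\n".join(sections)


print(compose_prompt(
    role="You are an expert teacher.",
    rules="Explain in a simple way.",
    context="The student is a novice.",
    task="Explain how backpropagation works in deep learning.",
))
```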

### **Step-Back Prompting – What is it?**
It is a technique where you first ask the model a **general question** to activate its “background knowledge”, and then use that answer as **context** to answer the real question.

Goal: improve accuracy, consistency, and creativity, avoiding generic or stereotyped answers.

#### **How does it work?**

1. **Step 1 – General question** (e.g. “What are 5 iconic settings for an FPS?”)
2. **Step 2 – Specific question** (e.g. “Write a story for a level inspired by one of these settings”)

#### **Example**

**Direct prompt:**
> “Write the plot of a level of an FPS video game.”

Result: generic, in the style of "soldiers in the city, shooting, traps, escape...".

**Step-back prompt:**

General prompt:
> “What are 5 original settings for an FPS?”

Output:
- Abandoned military base
- Cyberpunk city
- Alien spaceship
- Zombie-infested city
- Underwater lab

Final prompt (with context):
> “Choose one of these settings and write the plot of an FPS level.”

Result: Immersive and detailed narrative about an infested underwater lab, with a consistent atmosphere, plot, challenges, and tone.
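
The two steps can be chained in code by feeding the first answer into the second prompt. This is a sketch: `step_back` is a hypothetical helper, and `fake_llm` is a stand-in for a real model call so the example runs offline.

```python
def step_back(llm, general_question, final_template):
    """Ask the general question first, then reuse its answer as context."""
    background = llm(general_question)                           # Step 1
    final_prompt = final_template.format(background=background)  # Step 2
    return final_prompt, llm(final_prompt)


def fake_llm(prompt):
    # Stand-in for a real model: canned replies, no network access.
    if prompt.startswith("What are"):
        return "- Abandoned military base\n- Underwater lab"
    return f"[story generated from a prompt of {len(prompt)} characters]"


final_prompt, story = step_back(
    fake_llm,
    "What are 5 original settings for an FPS?",
    "Context:\n{background}\n\nChoose one of these settings and write "
    "the plot of an FPS level.",
)
print(final_prompt)
print(story)
```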

### **Benefits**
- Greater activation of latent knowledge
- Improved narrative coherence and depth
- Reduces bias and stereotypical responses
- Extremely useful for creative writing, coding, complex explanations

### **What is Chain of Thought (CoT) Prompting?**
It is a technique that **improves the reasoning** of LLMs by asking them to **explain the logical steps** before arriving at the final answer.

#### Objective:
To help the model "think out loud", improving the accuracy, interpretability and consistency of the answers.

### **Why does it work?**
- Activates latent knowledge step by step.
- Makes it possible to identify *where* the model goes wrong.
- Reduces variation in results when the same prompt is used with different models.
- Useful for logic, math, code, synthetic data generation, etc.

### **Simple example**

### Direct prompt:
> "When I was 3, my partner was 3 times my age. Now I'm 20. How old is my partner?"

Wrong answer: **63 years old**

### With CoT:
> "When I was 3, my partner was 3 times my age. Now I'm 20. How old is my partner? **Let's think step by step.**"

Correct answer:
1. At 3, the partner was 3 × 3 = 9 years old.
2. So the age difference is 9 − 3 = 6 years.
3. Now I'm 20 → the partner is 20 + 6 = **26 years old**.

### **One-shot version (example included):**
Give the model **a worked example, including its reasoning, before your question** to further improve the answer.
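
Both variants can be produced by one small helper. A minimal sketch: `cot_prompt`, the `Q:`/`A:` layout and the worked example are assumptions chosen for illustration; the last lines re-check the arithmetic of the example above.

```python
COT_TRIGGER = "Let's think step by step."


def cot_prompt(question, worked_example=None):
    """Build a zero-shot CoT prompt, or a one-shot one if an example is given."""
    parts = []
    if worked_example is not None:
        example_question, example_reasoning = worked_example
        parts.append(f"Q: {example_question}\nA: {example_reasoning}")
    parts.append(f"Q: {question}\nA: {COT_TRIGGER}")
    return "\n\n".join(parts)


question = ("When I was 3, my partner was 3 times my age. "
            "Now I'm 20. How old is my partner?")

# Illustrative worked example with its reasoning spelled out.
example = (
    "When my brother was 2, I was double his age. Now I'm 40. "
    "How old is my brother?",
    "When my brother was 2, I was 2 * 2 = 4, so I am 2 years older. "
    "Now I am 40, so my brother is 40 - 2 = 38. The answer is 38.",
)

print(cot_prompt(question))           # zero-shot CoT
print(cot_prompt(question, example))  # one-shot CoT

# The reasoning the model is expected to reproduce:
age_gap = 3 * 3 - 3       # partner was 9 when I was 3
partner_age = 20 + age_gap
print(partner_age)        # 26
```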

### **Suggested uses:**
- Code (breaking tasks into logical steps)
- Mathematics
- Creating synthetic data
- Generating structured texts (e.g. guided product descriptions)

### **What is Tree of Thoughts (ToT)?**

It is an advanced prompting technique that **generalizes** Chain of Thought (CoT), allowing an LLM to:
> explore **multiple reasoning paths in parallel**, instead of following a single linear path.

#### **How does it work?**

- Each “thought” is a **logical step** or a coherent sequence of text.
- Thoughts are **organized in a tree**, where each branch represents a different direction to solve a problem.
- The model can **expand multiple nodes**, evaluate the best branches and **choose** the most promising one.

#### **Why is it useful?**

- Well suited to **complex tasks** that need **exploration and comparison of several options**, such as:
  - Complex logic problems
  - Optimization
  - Advanced coding
  - Math puzzles

#### **Difference from CoT:**

| Chain of Thought | Tree of Thoughts |
|------------------|------------------|
| Single linear path | Multiple simultaneous paths |
| Step-by-step | Branch-by-branch |
| Less exploration | More creativity, exploration and comparison |
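
The difference in the table can be illustrated with a toy search. This is a sketch under loud assumptions: in a real Tree of Thoughts setup `propose` and `score` would be LLM calls, while here they are plain arithmetic stand-ins (pick three factors from 2, 3, 5, 7 whose product is as close as possible to 70), and keeping the best few branches per level is only one possible search strategy.

```python
import math

CHOICES = (2, 3, 5, 7)
TARGET = 70


def propose(thought):
    """Candidate next thoughts: extend the partial solution by one factor."""
    return [thought + (factor,) for factor in CHOICES]


def score(thought):
    """Evaluate a thought: the closer its product is to TARGET, the better."""
    return -abs(TARGET - math.prod(thought))


def tree_search(root, propose, score, depth, beam_width):
    """Expand level by level, keeping only the best `beam_width` branches."""
    frontier = [root]
    for _ in range(depth):
        # Expand every kept node into its candidate next thoughts.
        candidates = [child for node in frontier for child in propose(node)]
        # Evaluate the branches and keep the most promising ones.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
    return max(frontier, key=score)


single_path = tree_search((), propose, score, depth=3, beam_width=1)
several_paths = tree_search((), propose, score, depth=3, beam_width=2)
print(single_path, math.prod(single_path))      # (7, 7, 2) 98
print(several_paths, math.prod(several_paths))  # (7, 5, 2) 70
```

With a single path (`beam_width=1`, comparable to one linear chain) the search commits early to 7 × 7 and misses the target; keeping two branches alive lets it recover 7 × 5 × 2 = 70.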

### **What is ReAct Prompting?**

**ReAct** (Reason and Act) is a prompting technique that combines:
- **Natural-language reasoning** by the LLM
- **Action** via external tools (e.g. API, searches, calculation tools)

> Inspired by **how a human thinks and acts**: think → act → observe → think again → act again → until the problem is solved.

#### **How the Reason + Act loop works:**

1. **Thought:** the model reasons and generates a plan
2. **Action:** performs an action (e.g. online search, code, API call)
3. **Observation:** observes the result
4. **New Thought:** updates the plan with new information
5. Continues the loop until it reaches a final answer
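
The loop above can be sketched without any framework. Everything here is an assumption made for illustration: `react_loop` is a hypothetical helper, `scripted_model` is a stand-in for an LLM that returns a thought plus an action, and the `stock` tool with its data is invented.

```python
def react_loop(model, tools, question, max_steps=5):
    """Alternate model steps and tool calls until the model gives a final answer."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        step = model("\n".join(transcript))                 # Thought
        transcript.append(f"Thought: {step['thought']}")
        if step["action"] == "finish":
            return step["input"], transcript
        observation = tools[step["action"]](step["input"])  # Action
        transcript.append(f"Action: {step['action']}[{step['input']}]")
        transcript.append(f"Observation: {observation}")    # Observation
    raise RuntimeError("no final answer within max_steps")


STOCK = {"S": 4, "M": 7}  # made-up data for the demo


def stock_tool(size):
    return STOCK[size]


def scripted_model(transcript_text):
    # Stand-in for an LLM: picks the next step from the observations so far.
    seen = [int(line.split(": ")[1]) for line in transcript_text.splitlines()
            if line.startswith("Observation: ")]
    if len(seen) == 0:
        return {"thought": "I need the stock for size S.",
                "action": "stock", "input": "S"}
    if len(seen) == 1:
        return {"thought": "Now I need the stock for size M.",
                "action": "stock", "input": "M"}
    return {"thought": "I can add both counts.",
            "action": "finish", "input": str(sum(seen))}


answer, transcript = react_loop(
    scripted_model, {"stock": stock_tool},
    "How many t-shirts in sizes S and M are in stock in total?")
print("\n".join(transcript))
print("Final answer:", answer)
```

The `max_steps` cap plays the same role as an output-length limit: it stops the loop if the model never reaches a final answer.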

### **Practical example (LangChain + VertexAI)**

**Question:** *How many children do the members of Metallica have?*

#### ReAct execution:
- The LLM searches for each band member on Google (via SerpAPI)
- Reasons: “Hetfield has 3 children… now I look for Ulrich…”
- Adds up the children member by member → **Final answer: 10**

#### **Why use ReAct?**

- It allows the LLM to **interact with the external world**
- Great for **complex, dynamic or knowledge-based tasks**
- A first step toward building **intelligent autonomous agents** (agent modeling)

#### **ReAct requires:**
- Setup with frameworks like **LangChain**
- Access to tools like **SerpAPI**, a **code interpreter**, or **REST APIs**
- Prompts designed to handle **reasoning → action → observation loops**

### Conclusion

ReAct is a powerful paradigm for turning LLMs into **interactive intelligent agents**, capable not only of reasoning, but also **acting in the real world** to get the information they need.

### **Automatic Prompt Engineering (APE)** – What is it?

APE is a **technique to automate prompt writing**, leveraging **the LLMs themselves to generate, test and optimize prompts**. In practice:
> Write a prompt → the model generates **variations** → evaluate them → choose the best one → repeat.

### **How it works (in 3 steps):**

1. **Initial prompt (meta-prompt):**
   You use an LLM (e.g. Gemini, GPT, Claude) to generate alternative versions of an existing prompt.

   **Example:**
   To train a chatbot for a Metallica t-shirt webshop, start from the order:
   > "One Metallica t-shirt size S"

   Ask the model to generate **10 variations with the same meaning**.

2. **Output (generated alternative prompts):**
   - “I’d like to purchase a Metallica t-shirt in size small.”
   - “Can I order a small-sized Metallica t-shirt?”
   - “One Metallica shirt, size small, please.”
   - ...and so on (10 in total)

3. **Automatic evaluation:**
   You compare the variants using metrics like:
   - **BLEU**: n-gram overlap with a reference text, precision-oriented
   - **ROUGE**: n-gram overlap with a reference text, recall-oriented

Then you select the **best**, or modify it and repeat the process.
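
The evaluation step can be sketched with a deliberately simple metric. Note the assumption: `overlap_recall` below is a unigram stand-in written for this example, not an implementation of BLEU or ROUGE; in practice you would use an established implementation of those metrics.

```python
import re


def tokens(text):
    """Lowercase word tokens; hyphenated words such as 't-shirt' stay whole."""
    return set(re.findall(r"[a-z0-9-]+", text.lower()))


def overlap_recall(candidate, reference):
    """Share of the reference's distinct words that also occur in the candidate."""
    reference_tokens = tokens(reference)
    return len(tokens(candidate) & reference_tokens) / len(reference_tokens)


REFERENCE = "One Metallica t-shirt size S"
CANDIDATES = [
    "I'd like to purchase a Metallica t-shirt in size small.",
    "Can I order a small-sized Metallica t-shirt?",
    "One Metallica shirt, size small, please.",
]

# Rank the generated variants from most to least similar to the reference.
ranked = sorted(CANDIDATES, key=lambda c: overlap_recall(c, REFERENCE),
                reverse=True)
for candidate in ranked:
    print(f"{overlap_recall(candidate, REFERENCE):.2f}  {candidate}")
```

A pure similarity score favours near-copies of the seed prompt, so it is usually combined with a task-level check (for example, whether the chatbot handles the variant correctly).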

### **Why is APE useful?**

- **Saves time** otherwise spent testing prompts manually.
- Discovers **more effective** prompts for specific tasks (chatbots, classifiers, parsers…).
- Automates the more “experimental” part of prompt engineering.
- Also generates **realistic synthetic data** for NLP or supervised training.

### Practical example for prompt generation

**Initial APE prompt:**
```text
We have a band merchandise t-shirt webshop. Generate 10 ways a user could order:
"One Metallica t-shirt size S". Maintain same meaning, vary phrasing.
```

**LLM Output:**
1. "I want a Metallica t-shirt, size small."
2. "Please order me one small Metallica tee."
3. "Can I buy a small Metallica shirt?"
...

### **Code Prompting – What is it?**

**Code prompting** is the use of text prompts to:
- Write code
- Explain code
- Translate code
- Debug and review code

Large Language Models (LLMs) like Gemini, GPT, Claude or PaLM can become real **development collaborators**.

### **1. Prompt to write code**

**Example**: Rename all files in a folder with `draft_` in front of the name.

**Prompt:**
> Write a code snippet in Bash that asks for a folder name, then takes the contents of the folder and renames all the files inside by prepending `draft_` to each file name.

**Output Bash:**
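```bash
#!/bin/bash
# Ask for the folder whose files should be renamed
read -r -p "Enter the folder name: " folder_name
# Stop early if the folder does not exist
if [ ! -d "$folder_name" ]; then
  echo "Folder does not exist."
  exit 1
fi
# Prepend "draft_" to each regular file, keeping it in the same folder
for file in "$folder_name"/*; do
  [ -f "$file" ] || continue
  mv -- "$file" "$folder_name/draft_$(basename "$file")"
done
```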
### **2. Prompt to explain code**

Paste the code after a prompt such as:
> "Explain this Bash code:"

**Expected result**:
- Line-by-line analysis
- Clear explanation of input, logic and output

### **3. Prompt to translate code (e.g. Bash to Python)**

**Prompt:**
> Translate this Bash script to Python.

**Output (Python):**
```python
import os
import shutil
import sys

folder_name = input("Enter the folder name: ")
# Stop early if the folder does not exist
if not os.path.isdir(folder_name):
    print("Folder does not exist.")
    sys.exit(1)
# Prepend "draft_" to every entry in the folder
for file in os.listdir(folder_name):
    new_name = f"draft_{file}"
    shutil.move(os.path.join(folder_name, file), os.path.join(folder_name, new_name))
```

---

### **4. Prompt for debugging**

If the code fails with an error (e.g. it calls `toUpperCase`, which does not exist in Python), you can paste it together with the traceback and write:

> Debug this code. Here’s the traceback...

**Result:**
- Explains the problem (e.g. `toUpperCase` does not exist in Python)
- Suggests a fix (`prefix.upper()`)
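
A minimal reproduction of this kind of bug and its fix, as a sketch; the function names `add_prefix` and `add_prefix_fixed` are invented for the example.

```python
def add_prefix(filename, prefix):
    # Bug: Python strings have no toUpperCase(); that is the JavaScript/Java spelling.
    return prefix.toUpperCase() + "_" + filename


def add_prefix_fixed(filename, prefix):
    # Fix: str.upper() is the Python method.
    return prefix.upper() + "_" + filename


try:
    add_prefix("notes.txt", "draft")
except AttributeError as error:
    # This message is the kind of text you would paste into the prompt.
    print(f"AttributeError: {error}")

print(add_prefix_fixed("notes.txt", "draft"))  # DRAFT_notes.txt
```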

### **5. General suggestions from the model**

The model can also:
- Improve code aesthetics (f-strings, consistent names)
- Handle errors with `try-except`
- Handle files with spaces or special characters
- Add logic (e.g. keep file extensions)

### Practical applications:

- **Automation of systematic tasks** (rename, backup, parsing)
- **Explanation of legacy code** (useful for dev teams)
- **Collaborative debugging**
- **Refactoring**
- **Translation between languages** (e.g. Bash → Python, Python → JS…)

### Multimodal Prompting

**Multimodal prompting** means combining text with **other input formats**, such as:

- **Images**
- **Audio**
- **Code**
- **Video**
- **Structured data**

**Example**:
> "Describe the content of this image and generate a creative tweet."

or

> "Analyze this audio and generate a transcript + sentiment analysis."

This type of prompting is only supported by **multimodal models**, such as Gemini Ultra, GPT-4V, Claude 3 Sonnet, PaLM-E, etc.

## Best Practices for Prompt Engineering

### 1. Provide examples (One-shot, Few-shot)

- Examples **teach** the model what to expect and what to generate.
- Improve **accuracy, consistency, and style**.

**Few-shot Example**:
```
Input: "I want a small pizza with mozzarella and tomato sauce."
Output:
{
  "size": "small",
  "toppings": ["mozzarella", "tomato sauce"]
}
```

### 2. Use simple and direct language

Avoid vague or overly complex sentences.

**Before:** “I would like to know interesting facts about this area because I am here on vacation with two 3-year-olds.”

**After:** “Act as a tour guide. Suggest places to visit in Manhattan with 3-year-olds.”

### 3. Be specific about the desired output

Clearly define:

- **Format** (e.g. JSON, paragraph, bulleted list)
- **Style** (conversational, technical, informal)
- **Length** (e.g. “write in 3 paragraphs”, “in 100 words”)

### 4. Prefer **positive instructions** to **negative constraints**

**Do:** "Write a 3-paragraph article describing the 5 best consoles. Include name, company, and year."

**Avoid:** "Write an article but don't include games, don't be too long, don't mention accessories..."

→ **Positive** instructions tell the model what to produce, which supports its creativity and reduces ambiguity.

### 5. Control the number of **tokens**

- Set a maximum token limit in the model configuration, or state the desired length in the prompt:
> “Write it as if it were a tweet.”
> “Limit the answer to 100 words.”

### 6. Use **variables in prompts**

For example in an app:
```python
city = "Amsterdam"
prompt = f"You are a travel guide. Tell me a fact about the city: {city}"
```
### 7. Experiment with **formats and styles**

A prompt can be a:

- **Question** → "What made the Sega Dreamcast revolutionary?"
- **Statement** → "The Sega Dreamcast was a sixth-generation console..."
- **Instruction** → "Write a paragraph about the Dreamcast and its impact."

Each produces different results.