# Prompt Engineering

**by Nedas Jaronis and Abhi Titty**
*Co-directors, Tech Advancements Committee, AI Club*

## What is Prompt Engineering?

Prompt engineering is the process of **designing inputs** to guide large language models toward:
- Reliable outputs
- Structured responses
- Reduced hallucination
- Improved reasoning

---

**LLMs predict tokens — not truth.**

## Why do we need to ENGINEER a prompt?


Large language models such as
- GPT-5.2
- Claude Opus 4.6
- Gemini 3 Pro
- Grok 4.1

are probabilistic: they assign a probability to every possible output sequence, one token at a time.

$$
P(y_1, y_2, \dots, y_n \mid x) = \prod_{t=1}^{n} P(y_t \mid y_{<t}, x)
$$
Outputs are ***non-deterministic***, ***context-bound***, and ***distribution-shaped***.
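
To make the factorization concrete, here is a minimal sketch that multiplies per-token conditional probabilities into a sequence probability. The three probabilities are invented for illustration; a real model would produce them from its output distribution at each step.

```python
import math

# Hypothetical values of P(y_t | y_<t, x) for a three-token completion.
# These numbers are made up for illustration only.
step_probs = [0.60, 0.90, 0.85]


def sequence_probability(probs):
    """Chain rule: the sequence probability is the product of the per-token conditionals."""
    return math.prod(probs)


def sequence_log_probability(probs):
    """Same quantity in log space, which avoids underflow on long sequences."""
    return sum(math.log(p) for p in probs)


joint = sequence_probability(step_probs)
log_joint = sequence_log_probability(step_probs)
print(f"P(sequence)     = {joint:.3f}")
print(f"log P(sequence) = {log_joint:.3f}")
```

Every extra token multiplies in another factor below 1, so longer completions have smaller joint probabilities. This is why log-probabilities are the usual working unit.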

### Early models (GPT-3 Babbage, Ada, Davinci)

Early models simply continued the input string, generating text based on how it started. For example:\
**Input:** "London bridge is "\
**Model Output:** "London bridge is falling down, falling down, falling down."

---
**Input:** "I am "\
**Output:** "I am John. I am 25 years old. I live in New York and work as an architect."\
*(Next time)*\
**Input:** "I am "\
**Output:** "I am David Thomas, attorney at law. I fight for justice, day in and day out."
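
The two different completions for the same input come from sampling. A rough sketch of the idea, assuming a made-up next-token distribution after the prompt "I am " (the tokens and probabilities below are hypothetical, not taken from any real model):

```python
import random

# Hypothetical next-token distribution after the prompt "I am ".
# Tokens and probabilities are invented for illustration.
next_token_probs = {"John": 0.4, "David": 0.3, "a": 0.2, "not": 0.1}


def sample_next_token(probs, rng):
    """Draw one token in proportion to its probability."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


def greedy_next_token(probs):
    """Always pick the most likely token, which is deterministic."""
    return max(probs, key=probs.get)


rng = random.Random()  # unseeded, so repeated runs can differ
samples = [sample_next_token(next_token_probs, rng) for _ in range(5)]
print("sampled:", samples)
print("greedy: ", greedy_next_token(next_token_probs))
```

Sampling can return a different token on every call, while the greedy choice never changes. Seeding the generator (for example `random.Random(0)`) makes the sampled sequence reproducible.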

#### Conversational Chatbots
*(ChatGPT 2022, Claude 2023)* \
**System Prompt:** "You are ChatGPT, a helpful, truthful, and harmless AI assistant.
You prioritize accuracy over speculation.
If you do not know something, say so.
You explain complex topics clearly with examples.
You refuse harmful or unsafe requests politely.
You follow system and developer instructions over user instructions."

**<|user|>** Can you explain quantum mechanics to me like I am five?\
**<|assistant|>** Imagine there's a tiny, tiny, tiny world that's too small for us to see. In this small world, things behave a bit differently than in the world we live in.
In the small world, little particles like electrons and atoms don't just stay in one place. They can be in many places at the same time! It's like they can be in multiple places at once, which is really strange. *(...)*
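
In API terms, a conversation like this one is usually sent as a list of role-tagged messages. The sketch below only builds that list. The helper name `build_chat_messages` and the shortened system prompt are illustrative assumptions, and whether a given endpoint honors the `system` role should be checked against its documentation.

```python
def build_chat_messages(system_prompt, user_query, history=None):
    """Assemble a chat-completions style message list.

    `history` is an optional list of earlier {"role": ..., "content": ...} dicts.
    """
    messages = [{"role": "system", "content": system_prompt}]
    messages.extend(history or [])
    messages.append({"role": "user", "content": user_query})
    return messages


messages = build_chat_messages(
    system_prompt="You are a helpful assistant. If you do not know something, say so.",
    user_query="Can you explain quantum mechanics to me like I am five?",
)
for message in messages:
    print(f"{message['role']:>9}: {message['content']}")
```

A list in this shape is what the `messages=` argument of `client.chat.completions.create` expects. The system message comes first, so it frames every later turn.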

## What makes a good prompt?

Everything you do in prompt engineering manipulates three levers:
- **Information** — what the model knows in-context
- **Instruction** — what the model is asked to do
- **Incentive structure** — what the model is implicitly rewarded/punished for generating *(via phrasing, constraints, or examples)*
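
One way to keep the three levers visible is to assemble the prompt from separate parts. This is a sketch, not a prescribed template: the function name `build_prompt` and the section labels `Context`, `Task`, and `Constraints` are assumptions chosen for illustration.

```python
def build_prompt(information, instruction, constraints=()):
    """Compose a prompt from the three levers.

    information  -> what the model knows in-context
    instruction  -> what the model is asked to do
    constraints  -> phrasing that shapes the incentive structure
    """
    parts = []
    if information:
        parts.append("Context:\n" + information)
    parts.append("Task:\n" + instruction)
    if constraints:
        parts.append("Constraints:\n" + "\n".join(f"- {c}" for c in constraints))
    return "\n\n".join(parts)


prompt = build_prompt(
    information="The audience is first-year students with no history background.",
    instruction="Explain why the Roman Republic became an empire.",
    constraints=["Use at most 150 words.", "Say so if a claim is uncertain."],
)
print(prompt)
```

Keeping the parts separate makes it easy to change one lever at a time and compare outputs.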

In [1]:
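import os

import openai
from dotenv import load_dotenv

# Load OPENAI_API_KEY (and any other settings) from a local .env file.
load_dotenv()

# OpenAI-compatible client pointed at the endpoint used in this notebook.
client = openai.OpenAI(
    base_url="https://api.ai.it.ufl.edu",
    api_key=os.environ.get("OPENAI_API_KEY"),  # the client's default lookup, made explicit
)


def model_output(model, query):
    """Send a single user message to `model` and return the reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": query}],
    )
    return response.choices[0].message.content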

In [2]:
print(model_output("gpt-oss-120b","Tell me the history of the Roman Empire"))

**The Roman Empire – A Concise History**

*From a modest city‑state on the Italian peninsula to a sprawling Mediterranean super‑power, Rome’s story is a saga of conquest, innovation, turmoil, and lasting influence. Below is a streamlined narrative that tracks the empire’s rise, apex, division, and legacy.*

---

## 1. Foundations – The Kingdom and the Republic (753 BC – 27 BC)

| Period | Key Developments |
|--------|------------------|
| **Legendary founding** (753 BC) | Romulus (and Remus) create the city of **Rome**; early monarchy governed by seven kings. |
| **Early Republic** (509 BC) | The last king, **Tarquin the Proud**, is overthrown; a mixed constitution of **Consuls**, **Senate**, and **Assemblies** is established. |
| **Conquest of Italy** (5th‑3rd c. BC) | Through wars (e.g., the Samnite, Latin, and Pyrrhic Wars) Rome subdues the Italian peninsula, extending citizenship and integrating allies. |
| **First Punic War** (264‑241 BC) | Rome defeats Carthage, seizing Sicily—th

In [4]:
roman_empire_query = """You are a historian specializing in Ancient Rome.

Explain the rise and fall of the Roman Empire with an analytical focus rather than a narrative one.

Address the following dimensions:

1. Political foundations — how Augustus consolidated power after the fall of the Republic
2. Administrative excellence during the Five Good Emperors
3. Economic expansion — trade networks, taxation, and infrastructure
4. Military dominance and frontier management
5. Internal decay — political instability, corruption, currency debasement
6. External pressures — barbarian incursions and overstretched borders

For each dimension:
- Explain how it contributed to Rome’s flourishing
- Then explain how it later contributed to decline

Conclude with a synthesis explaining whether Rome fell primarily due to internal weaknesses or external forces.

Write in a structured essay format with clear section headings.
Keep it to one paragraph in length."""

In [5]:
print(model_output("gpt-oss-120b",roman_empire_query))

**The Rise and Fall of the Roman Empire: An Analytical Overview**  

---  

### Introduction  

The Roman Empire’s trajectory from a city‑state to a Mediterranean super‑power and its eventual disintegration cannot be reduced to a single cause. Rather, a series of interlocking political, administrative, economic, military, and social dynamics first propelled Rome to unprecedented heights and later generated the systemic stresses that precipitated its collapse. The following analysis isolates six core dimensions, examines how each functioned as a catalyst for expansion, and then shows how the same mechanisms mutated into liabilities that eroded imperial stability.  

---  

## 1. Political Foundations – Augustus and the Consolidation of Power  

| **Flourishing** | **Decline** |
|-----------------|-------------|
| **Centralisation of authority** – After the civil wars of the late Republic, Octavian (Augustus) presented himself as *princeps* (“first citizen”) while retaining the veneer of

In [6]:
print(model_output("llama-3.3-70b-instruct",roman_empire_query))

# Introduction to the Roman Empire

The Roman Empire, spanning over 12 centuries, is a paradigm of both unparalleled success and catastrophic failure in human history. Its rise to dominance and subsequent fall can be analyzed through several key dimensions: political foundations, administrative excellence, economic expansion, military dominance, internal decay, and external pressures. This essay will examine each of these dimensions, exploring how they initially contributed to Rome's flourishing and later to its decline.

## 1. Political Foundations: The Consolidation of Power

The political foundations laid by Augustus after the fall of the Roman Republic were instrumental in Rome's initial success. Augustus, the first emperor, established the Principate, a system that balanced the power of the emperor with the Senate and the people, providing stability and legitimacy. This political framework allowed for efficient governance and decision-making, contributing to Rome's flourishing by 

## Context

A context window is the model’s **working memory**.\
It is measured in **tokens**, not words.
### Context Window Lengths for some popular models
- GPT-5.2 - \~400k on average *(~300,000 words, ~600 pages of text)*
- Sonnet 4.5 - 200k
- Gemini 3 Pro - 1 million
- Grok 4.2 - 2 million
- Llama 4 Scout - 10 million
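
A quick back-of-the-envelope check for whether a document fits in a context window, using the ratio implied by the figures above (400k tokens for roughly 300,000 words, so about 0.75 words per token). This is a rough heuristic for English prose. The function names and the `reserved_for_output` default are assumptions, and an actual tokenizer will give different counts.

```python
import math

WORDS_PER_TOKEN = 0.75  # rough ratio implied by "400k tokens ~ 300,000 words"


def estimate_tokens(text):
    """Estimate the token count from a whitespace word count (approximation only)."""
    words = len(text.split())
    return math.ceil(words / WORDS_PER_TOKEN)


def fits_in_context(text, context_window, reserved_for_output=1000):
    """Check whether the text plus a reserved output budget fits in the window."""
    return estimate_tokens(text) + reserved_for_output <= context_window


document = "word " * 3000  # stand-in for a 3,000-word document
print("estimated tokens:", estimate_tokens(document))
print("fits in a 200k window:", fits_in_context(document, context_window=200_000))
```

Reserving part of the window for the reply matters because the prompt and the generated output share the same token budget.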

## What are tokens anyway?

A token is a fragment of text that the model reads in sequence.
#### Tokenization Algorithms
- BPE (used by GPT, DeepSeek, Qwen, Grok)
- WordPiece (BERT)
- SentencePiece (earlier Llama models, Gemini)
- Proprietary tokenizer (Claude)

In [9]:
# BPE example

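import re
from collections import Counter

def get_stats(vocab):
    """Count how often each adjacent symbol pair occurs, weighted by word frequency."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for i in range(len(symbols) - 1):
            pairs[(symbols[i], symbols[i+1])] += freq
    return pairs

def merge_vocab(pair, vocab):
    """Merge every occurrence of `pair` into a single symbol."""
    merged_vocab = {}
    bigram = " ".join(pair)
    replacement = "".join(pair)
    # Match the bigram only at symbol boundaries, so ("l", "o") cannot
    # accidentally match the tail of a longer symbol such as "al o".
    pattern = re.compile(r"(?<!\S)" + re.escape(bigram) + r"(?!\S)")
    
    for word in vocab:
        new_word = pattern.sub(lambda _: replacement, word)
        merged_vocab[new_word] = vocab[word]
    
    return merged_vocab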

def train_bpe(corpus, num_merges=10):
    vocab = Counter()
    
    # Initialize character-level vocab
    for word in corpus:
        chars = " ".join(list(word)) + " </w>"
        vocab[chars] += 1
    
    merges = []
    
    for _ in range(num_merges):
        pairs = get_stats(vocab)
        if not pairs:
            break
        
        best = max(pairs, key=pairs.get)
        vocab = merge_vocab(best, vocab)
        merges.append(best)
    
    return merges, vocab

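def tokenize(word, merges):
    """Tokenize `word` by replaying the learned merges in the order they were learned."""
    tokens = list(word) + ["</w>"]
    
    # Applying merges in training order matters: an earlier merge takes
    # priority over a later one that overlaps the same symbols.
    for pair in merges:
        i = 0
        while i < len(tokens) - 1:
            if (tokens[i], tokens[i+1]) == pair:
                tokens[i:i+2] = ["".join(pair)]
            else:
                i += 1
    
    return tokens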


In [13]:
corpus = ["low", "lowest", "newer", "wider"]

merges, vocab = train_bpe(corpus, num_merges=10)

print("Learned merges:")
for m in merges:
    print(m)

print("\nTokenization example:")
print(tokenize("lowest", merges))


Learned merges:
('l', 'o')
('lo', 'w')
('e', 'r')
('er', '</w>')
('low', '</w>')
('low', 'e')
('lowe', 's')
('lowes', 't')
('lowest', '</w>')
('n', 'e')

Tokenization example:
['lowest</w>']


## Retrieval Augmented Generation (RAG)

Imagine you have a large collection of documents (thousands of pages in total). The LLM did not see these documents during training, so it has no knowledge of them. How can you effectively retrieve information from them without overloading the LLM's context window?
### RAG has two parts
**Retriever:** fetches the chunks that contain the relevant information \
**Generator:** the LLM, which answers from the retrieved chunks and discards everything else.
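
The two parts can be sketched with nothing but the standard library. As a simplifying assumption, the retriever below ranks chunks by bag-of-words cosine similarity, a stand-in for the embedding search a production system would more likely use. The chunk texts, the question, and the function names are invented for illustration. The generator step is represented only by the prompt that would be sent to the LLM.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, top_k=2):
    """Retriever: return the top_k chunks most similar to the query."""
    query_vec = bag_of_words(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, bag_of_words(chunk)),
        reverse=True,
    )
    return ranked[:top_k]


def build_rag_prompt(query, retrieved_chunks):
    """Build the generator's prompt: only the retrieved chunks go in as context."""
    context = "\n".join(f"[{i}] {c}" for i, c in enumerate(retrieved_chunks, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Made-up document chunks for illustration.
chunks = [
    "The club meets every Tuesday at 6 pm in the engineering building.",
    "Membership dues are 10 dollars per semester.",
    "The hackathon takes place in March and lasts 24 hours.",
]
query = "When does the club meet?"
top_chunks = retrieve(query, chunks, top_k=1)
print(build_rag_prompt(query, top_chunks))
```

Only the selected chunk reaches the model, so the prompt stays small no matter how large the document collection grows.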


## Zero-shot vs One-shot vs Few-shot

## Zero-Shot Prompting

**Definition:** Task is given with **no examples**.

**Pros:** Fast, no prep.  
**Cons:** Output may be inconsistent.

**Example Instruction:**

> "Summarize the following code file."

**Key idea:** Model relies purely on instructions + general knowledge.


## One-Shot Prompting

**Definition:** Task given with **one example output**.

**Pros:** Clarifies expectations.  
**Cons:** Needs a well-crafted example.

**Example Instruction:**

- Example: summarize a code file  
- Then ask model to summarize a new file


## Few-Shot Prompting

**Definition:** Task given with **multiple example outputs**.

**Pros:** Higher consistency and quality.  
**Cons:** Requires multiple examples, longer prompt.

**Example:** Summarize 2–3 code files → Model summarizes a new code file.
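
The three styles differ only in how many worked examples precede the new input, which a single helper can show. This is a sketch under assumptions: the `Input:` / `Summary:` layout, the function name `build_shot_prompt`, and the tiny example files are all hypothetical.

```python
def build_shot_prompt(instruction, new_input, examples=()):
    """Build a prompt with 0 (zero-shot), 1 (one-shot), or 2+ (few-shot) examples.

    `examples` is a sequence of (example_input, example_output) pairs.
    """
    blocks = [instruction]
    for example_input, example_output in examples:
        blocks.append(f"Input:\n{example_input}\nSummary:\n{example_output}")
    # The final block leaves the summary empty for the model to complete.
    blocks.append(f"Input:\n{new_input}\nSummary:")
    return "\n\n".join(blocks)


instruction = "Summarize the following code file."
new_file = "def area(r):\n    return 3.14159 * r * r"

# Hypothetical example pairs, invented for illustration.
examples = [
    ("def add(a, b):\n    return a + b", "Defines add, which returns the sum of two values."),
    ("def is_even(n):\n    return n % 2 == 0", "Defines is_even, which tests divisibility by 2."),
]

zero_shot = build_shot_prompt(instruction, new_file)
one_shot = build_shot_prompt(instruction, new_file, examples[:1])
few_shot = build_shot_prompt(instruction, new_file, examples)
print(few_shot)
```

Each added example lengthens the prompt, which is the cost side of the consistency gain described above.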


## Multi-Step Claude Workflow

A structured workflow can be built from **engineered prompts**:

1. `/research_codebase` – Explore and document codebase
2. `/create_plan` – Build a detailed implementation plan
3. `/implement_plan` – Execute plan with automated & manual verification

**Observation:**  
- These prompts are highly structured with rules and steps.
- Not pure zero-shot, but can be used in zero/one/few-shot style depending on examples.
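
A multi-step workflow can be thought of as a chain in which each step's output becomes context for the next. The sketch below is a hypothetical illustration of that chaining, not how the actual slash commands are implemented: `run_workflow`, the prompt layout, and the stub `echo_model` are assumptions, and a real run would replace the stub with a model call.

```python
def run_workflow(steps, task, model_fn):
    """Run prompts in order, feeding each step's output into the next step's prompt."""
    previous_output = ""
    transcript = []
    for command, instruction in steps:
        prompt = f"{command}\n{instruction}\n\nTask: {task}"
        if previous_output:
            prompt += f"\n\nOutput of previous step:\n{previous_output}"
        previous_output = model_fn(prompt)
        transcript.append((command, previous_output))
    return transcript


steps = [
    ("/research_codebase", "Explore and document the codebase."),
    ("/create_plan", "Build a detailed implementation plan."),
    ("/implement_plan", "Execute the plan with automated and manual verification."),
]


def echo_model(prompt):
    """Stand-in for a real model call: reports which command it received."""
    return f"[stub output for {prompt.splitlines()[0]}]"


transcript = run_workflow(steps, task="Add input validation to the signup form", model_fn=echo_model)
for command, output in transcript:
    print(command, "->", output)
```

Passing the model in as a function keeps the chaining logic testable without network access.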


## Key Takeaways

- **Zero-shot:** Task only, no examples. Relies on instructions.  
- **One-shot:** Task + one example. Improves clarity.  
- **Few-shot:** Task + multiple examples. Better consistency.  
- **Engineered prompts / workflow:** Multi-step, structured instructions.  
- Claude workflow commands are **structured engineered prompts** that can incorporate zero/one/few-shot techniques depending on how many prior examples you provide.
