<!-- .slide: data-background-color="#0f172a" class="center" -->

# Prompt Engineering

## From Prompts  
## to Production AI Systems

---

**1 Hour Workshop**

Design. Control. Optimize. Deploy.


# What is Prompt Engineering?

Prompt engineering is the process of designing inputs  
to guide large language models toward:

- Reliable outputs
- Structured responses
- Reduced hallucination
- Improved reasoning

---

**LLMs predict tokens ‚Äî not truth.**


# Why Prompt Engineering Matters

Modern LLM systems like:

- ChatGPT  
- Claude  
- Gemini  

are probabilistic systems.

---

Same input  
‚â†  
Same output.


# LLMs Are Not Deterministic

Large Language Models:

- Predict the next token using probability
- Sample from distributions
- Do not compute fixed answers

---

The same question  
can produce  
different outputs.

---

Let‚Äôs test this.


# Live Test

We will use the same prompt  
across multiple models.

Observe:
- Differences in reasoning
- Differences in formatting
- Confidence level


prompt = """A bat and a ball cost $1.10 total. The bat costs $1 more than the ball. How much does the ball cost?"""

print("Run this prompt in multiple models:")
print(prompt)


# Temperature & Randomness

Temperature controls  
how the model samples probabilities.

---

Higher temperature:

- More creative
- More diverse outputs
- Less predictable

---

Lower temperature:

- More deterministic
- More repeatable
- More stable


# Reasoning Models vs Traditional LLMs

## Traditional LLMs

- Fast
- Pattern-based completion
- May shortcut reasoning

---

## Reasoning-Optimized Models

- Explicit step-by-step reasoning
- Greater logical depth
- Slower, but often more accurate


# Reasoning Test

Solve step-by-step:

"If 5 machines take 5 minutes to make 5 widgets,  
how long do 100 machines take to make 100 widgets?"

---

Now try:

"Solve without explanation."

---

Observe:

- Does the answer change?
- Does confidence change?
- Does reasoning depth change?


# Context Windows

**Context Window** = How much text the model can remember at once.

---

Bigger context window:

- Handles longer documents  
- Maintains more history  
- **Does NOT mean smarter**

---

‚ö†Ô∏è Context overflow ‚Üí loss of early information


# Hallucination

**Hallucination** = Confident fabrication of information

---

Why it happens:

- Model fills statistical gaps  
- Optimized for coherence, not truth  
- No internal fact-checking

---

The model prefers **sounding right**  
over admitting uncertainty


# Prompt Versioning

Treat prompts like **code**.

---

**v1:** "Summarize this article."  
**v2:** "Summarize this article in 5 bullet points."  
**v3:** "Summarize this article in 5 executive-level bullet points under 50 words."

---

Track changes.  
Measure improvements.


# A/B Testing Prompts

Same input.  
Different prompt versions.  

Compare output quality.


# Prompt Chaining

Instead of one giant prompt:

1. Extract entities  
2. Analyze sentiment  
3. Generate report

Output of one prompt ‚Üí Input of next


# Zero-Shot vs Multi-Shot Prompting

**Zero-shot:**  
"Translate this to French."

**Multi-shot:**  
Provide 2‚Äì3 examples first.

---

Few-shot prompting dramatically improves format consistency.


# Multimodal Prompting

Modern models can process:

- Text  
- Images  
- Code  
- Audio

---

GPT-4o is a **multimodal model** capable of understanding text, images, and code simultaneously.  
This enables structured extraction, document understanding, and advanced reasoning across data types.

---

References:

- [Springer Chapter on GPT-4o Multimodal Capabilities](https://link.springer.com/chapter/10.1007/978-3-031-92611-2_4)  
- [OpenAI GPT-4o Documentation](https://developers.openai.com/api/docs/models/gpt-4o)


# üü¶ CELL 21 ‚Äî Markdown (Closing Slide)

```markdown
# Key Takeaways

- LLMs are probabilistic systems
- Prompt structure controls behavior
- Versioning and testing improve reliability
- Advanced techniques enable production-grade AI systems

Prompt engineering is system design.
Not just better questions.


# Adversarial Prompting / Jailbreaking

Techniques used to:

- Bypass safeguards  
- Override system instructions  
- Inject malicious instructions

---

Production systems must guard against:

- Prompt injection  
- Role-play exploits  
- Instruction override attacks


# Claude in Production Systems

Capabilities:

- Large context window  
- Strong reasoning  
- Tool use

---

Key Features:

- **Skills**: Predefined abilities to perform structured tasks  
- **Artifacts**: Reusable outputs, templates, or tools for workflows

---

References:

- [Claude Artifacts](https://claude.ai/artifacts)  
- [Claude Agents & Tools / Skills Overview](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview)

---

Claude is often used for structured reasoning workflows and production AI pipelines.


# Structured Outputs with BAML

BAML allows you to define **schemas for AI outputs**  

Instead of parsing messy text,  
you enforce **structured responses**.

---

Example / More Info:

- [BoundaryML ‚Äî BAML](https://boundaryml.com/)


class TradeSignal:
    action: str
    confidence: float
    reasoning: str


# Key Takeaways

- LLMs are probabilistic systems  
- Prompt structure controls behavior  
- Versioning and testing improve reliability  
- Advanced techniques enable production-grade AI systems

---

**Prompt engineering is system design**  
Not just better questions
