# Chapter 6: Prompt Engineering - Hard Tasks (Solutions)

Complete solutions for all Hard Tasks.

## Solutions Summary

### Task 1: Tree-of-Thought

**How it works:**
1. Generate multiple possible next steps
2. Evaluate each option with a prompt
3. Explore promising branches (score >= 5)
4. Prune low-scoring branches (score < 5)
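
The four steps above can be sketched as a small breadth-first search. This is a minimal sketch under stated assumptions, not the chapter's reference implementation: `tree_of_thought`, `generate`, and `evaluate` are hypothetical names, and the two callables stand in for the LLM calls that would send the generation and evaluation prompts. The toy stubs at the bottom replace the model so the search logic can be run on its own.

```python
from typing import Callable, List, Sequence, Tuple

Path = Tuple[str, ...]  # the moves chosen so far


def tree_of_thought(
    generate: Callable[[Path], Sequence[str]],
    evaluate: Callable[[Path, str], float],
    threshold: float = 5.0,
    max_depth: int = 3,
) -> List[Tuple[Path, float]]:
    """Expand paths level by level, keeping only moves scoring >= threshold."""
    frontier: List[Tuple[Path, float]] = [((), 0.0)]
    for _ in range(max_depth):
        next_frontier: List[Tuple[Path, float]] = []
        for path, _score in frontier:
            for move in generate(path):          # step 1: propose next moves
                score = evaluate(path, move)     # step 2: score each move
                if score >= threshold:           # step 3: keep promising branches
                    next_frontier.append((path + (move,), score))
                # step 4: moves below the threshold are dropped (pruned)
        if not next_frontier:                    # every branch was pruned
            break
        frontier = next_frontier
    # Paths whose last move scored highest come first.
    return sorted(frontier, key=lambda item: item[1], reverse=True)


# Toy stand-ins for the two LLM calls (hypothetical, for illustration only).
def toy_generate(path: Path) -> List[str]:
    return ["a", "b", "c"]


def toy_evaluate(path: Path, move: str) -> float:
    return {"a": 8.0, "b": 5.0, "c": 2.0}[move]


results = tree_of_thought(toy_generate, toy_evaluate, max_depth=2)
```

With these stubs, move `c` always scores 2, so it never appears in a surviving path. A real run would replace the stubs with functions that format the prompts and call the model.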

**Key prompts:**

Generate options:
```
{problem}

Current situation: {current_state}

What are {num_options} possible next moves? List them:
1.
```

Evaluate options:
```
{problem}

Proposed move: {option}

On a scale of 0-10, how promising is this move?
Consider: Does it make progress? Does it violate constraints?

Score (0-10):
```
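
The evaluation prompt ends with `Score (0-10):`, so the reply has to be turned into a number before it can be compared with the pruning threshold. One possible parser is sketched below. `parse_score` is a hypothetical helper, and it assumes the model puts the score first in its reply; a reply that restates the scale before giving the score would be misread.

```python
import re
from typing import Optional


def parse_score(reply: str, low: float = 0.0, high: float = 10.0) -> Optional[float]:
    """Return the first number in the reply if it lies in [low, high], else None."""
    match = re.search(r"-?\d+(?:\.\d+)?", reply)
    if match is None:
        return None  # the model did not return a number at all
    value = float(match.group())
    return value if low <= value <= high else None
```

Returning `None` instead of a default score lets the caller decide whether to retry the evaluation or prune the branch.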

**Alternative evaluation criteria:**
- Safety-focused: "How safe is this move?"
- Goal-oriented: "How close does this get us to the goal?"
- Risk-aware: "What's the risk of this move failing?" (invert the score before pruning, since here a higher number means a worse move)

**When to use:**
- Strategic problems with multiple valid approaches
- When exploring alternatives is valuable
- Problems where linear reasoning often fails

### Task 2: Chain Prompting

**4-stage chain:**

**Stage 1 - Extraction:**
```
Extract key information from this review:

Review: {review}

List:
- Issues mentioned:
- Positive aspects:
- Duration of usage:
- Price mentioned:
```

**Stage 2 - Sentiment:**
```
Based on this information:

{stage1_output}

Determine:
1. Overall sentiment (positive/negative/mixed)
2. Satisfaction level (1-10)
3. Main concerns
```

**Stage 3 - Strategy:**
```
Given this sentiment:

{stage2_output}

And these facts:

{stage1_output}

Plan the response:
- What to acknowledge
- What to apologize for
- What actions to offer
```

**Stage 4 - Final Response:**
```
Write a professional response following this strategy:

{stage3_output}

Tone: Professional, empathetic, solution-focused
Length: 2-3 paragraphs

Response:
```
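
The four stages can be wired together as in the sketch below. It is a minimal sketch that assumes the model is reachable through a plain `llm(prompt) -> str` callable; `run_chain`, `llm`, and the template constant names are hypothetical, while the template text and placeholder names come from the prompts above.

```python
from typing import Callable, Dict

EXTRACT = """Extract key information from this review:

Review: {review}

List:
- Issues mentioned:
- Positive aspects:
- Duration of usage:
- Price mentioned:"""

SENTIMENT = """Based on this information:

{stage1_output}

Determine:
1. Overall sentiment (positive/negative/mixed)
2. Satisfaction level (1-10)
3. Main concerns"""

STRATEGY = """Given this sentiment:

{stage2_output}

And these facts:

{stage1_output}

Plan the response:
- What to acknowledge
- What to apologize for
- What actions to offer"""

RESPONSE = """Write a professional response following this strategy:

{stage3_output}

Tone: Professional, empathetic, solution-focused
Length: 2-3 paragraphs

Response:"""


def run_chain(review: str, llm: Callable[[str], str]) -> Dict[str, str]:
    """Run the four stages in order and keep every intermediate output."""
    out: Dict[str, str] = {}
    out["extraction"] = llm(EXTRACT.format(review=review))
    out["sentiment"] = llm(SENTIMENT.format(stage1_output=out["extraction"]))
    out["strategy"] = llm(
        STRATEGY.format(stage1_output=out["extraction"], stage2_output=out["sentiment"])
    )
    out["response"] = llm(RESPONSE.format(stage3_output=out["strategy"]))
    return out
```

Returning all four outputs, not just the final response, is what makes the intermediate results inspectable when a stage misbehaves.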

**Benefits:**
- Each stage is specialized
- Easier to debug and improve
- More thorough analysis
- Can inspect intermediate results

**Trade-offs:**
- More API calls (more expensive)
- Slower than single prompt
- More complex to implement

### Task 3: Output Verification

**Generate code:**
```
Write a Python function:

Task: {task}
Function name: {name}
Parameters: {parameters}
Must include: {must_have}

Write the complete function:
```

**Verification prompt:**
```
Check if this code meets the requirements:

Code:
{code}

Requirements:
- Function name: {name}
- Must have: {must_have}

List any issues found (or write 'No issues'):
```

**Correction prompt:**
```
This code has issues:

Code:
{code}

Issues found:
{verification_feedback}

Fix these issues. Provide the complete corrected function:
```
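
The verification and correction prompts are typically run in a loop until the verifier reports no issues or a retry budget runs out. The sketch below shows one way to do that. `verify_and_correct`, `llm`, and `max_rounds` are hypothetical names, and the stop condition assumes the model follows the instruction to write 'No issues' at the start of its reply.

```python
from typing import Callable, Tuple

VERIFY = """Check if this code meets the requirements:

Code:
{code}

Requirements:
- Function name: {name}
- Must have: {must_have}

List any issues found (or write 'No issues'):"""

CORRECT = """This code has issues:

Code:
{code}

Issues found:
{verification_feedback}

Fix these issues. Provide the complete corrected function:"""


def verify_and_correct(
    code: str,
    name: str,
    must_have: str,
    llm: Callable[[str], str],
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Return (code, passed) after at most max_rounds verify/correct cycles."""
    for _ in range(max_rounds):
        feedback = llm(VERIFY.format(code=code, name=name, must_have=must_have))
        if feedback.strip().lower().startswith("no issues"):
            return code, True  # the verifier accepted the current version
        # Feed the verifier's findings back into the correction prompt.
        code = llm(CORRECT.format(code=code, verification_feedback=feedback))
    return code, False  # budget exhausted; the last version was never verified
```

The boolean in the return value matters: when the budget runs out, the caller gets the last correction but is told it never passed verification.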

**When to use prompt-based verification:**
- Semantic requirements (clarity, style)
- High-level structure
- Documentation quality

**When to use code-based verification:**
- Syntax errors
- Specific patterns (regex)
- Performance-critical checks
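
As an illustration of the code-based side, the sketch below uses Python's standard `ast` module to check syntax, the function name, and the presence of a docstring without calling a model. `check_function` and its parameters are hypothetical; the requirements checked here are examples, not the task's full list.

```python
import ast
from typing import List


def check_function(code: str, name: str, require_docstring: bool = True) -> List[str]:
    """Return a list of issues found by static checks (empty list means none)."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        # Nothing else can be checked if the code does not parse.
        return [f"Syntax error: {exc.msg} (line {exc.lineno})"]

    functions = [n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)]
    target = next((f for f in functions if f.name == name), None)
    if target is None:
        return [f"No function named {name!r}"]

    issues: List[str] = []
    if require_docstring and ast.get_docstring(target) is None:
        issues.append(f"Function {name!r} has no docstring")
    return issues
```

Checks like these are deterministic and cost nothing per call, so they can run before the prompt-based verifier and spare it the obvious failures.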

### Task 4: Multi-Stage Reasoning with Self-Reflection

**4-stage approach:**

**Stage 1 - Initial reasoning:**
```
{problem}

Let's think step-by-step:
```

**Stage 2 - Self-reflection:**
```
{problem}

Here was my initial reasoning:
{initial_reasoning}

Now, critically examine this:
- Did I consider all factors?
- Are calculations correct?
- Did I miss anything?

Critical reflection:
```

**Stage 3 - Revised reasoning:**
```
{problem}

My initial reasoning:
{initial}

After reflection:
{reflection}

Now provide improved reasoning:
```

**Stage 4 - Final answer:**
```
{problem}

After analysis:
{revised_reasoning}

Provide:
1. Final answer (which strategy?)
2. Key reasoning (2-3 sentences)
3. Confidence level (0-100%)
4. Main uncertainty
```

**What reflection typically catches:**
- Calculation errors
- Missed factors
- Hidden assumptions
- Alternative perspectives

**When to use:**
- High-stakes decisions
- Complex reasoning
- When initial answers often have errors
- When you need confidence assessment

## Questions Answered

### Task 1

1. **How did ToT compare to linear CoT?**
   - ToT explored multiple approaches
   - Found better solutions on strategic problems
   - More expensive (multiple evaluations)

2. **Which branches were pruned (scored below 5)?**
   - Moves that violate constraints
   - Dead-end approaches
   - Moves that don't make progress

3. **How did changing the evaluation criteria affect the solution?**
   - Safety-focused: Prioritizes avoiding violations
   - Goal-oriented: Prioritizes moving toward the goal
   - Different criteria can find different solutions

### Task 2

1. **Did the chain approach produce a better response?**
   - Usually yes: the chained response is more thorough
   - Better structure
   - Addresses all aspects systematically

2. **Which stage added the most value?**
   - Varies by use case
   - Often Stage 3 (strategy) is most valuable
   - Stage 1 (extraction) prevents information loss

3. **Could you combine stages 1 and 2? What would you lose?**
   - Yes, but you lose stage specialization
   - Harder to debug
   - May miss nuances

### Task 3

1. **What issues did verification catch?**
   - Missing docstrings
   - Missing input validation
   - Incomplete error handling
   - Style issues

2. **Did the correction prompt successfully fix the issues?**
   - Usually yes, on the first try
   - May need iteration for complex issues
   - Clear feedback leads to better corrections

3. **When would you use prompt-based verification vs code-based checks?**
   - Prompt-based: Semantic requirements, style, documentation
   - Code-based: Syntax, specific patterns, performance
   - Often use both together

### Task 4

1. **What did self-reflection catch that initial reasoning missed?**
   - Hidden assumptions
   - Calculation errors
   - Missed factors
   - Alternative approaches

2. **How did the revised reasoning differ from the initial attempt?**
   - More complete
   - Considered edge cases
   - More careful calculations
   - Better structured

3. **Was the confidence assessment reasonable?**
   - Usually aligns with problem difficulty
   - Lower confidence on ambiguous problems
   - Uncertainty identification is valuable