# 🤖 ReAct & Plan‑then‑Act **Validation Pipeline**  
*Day 5 – Advanced Reasoning Notebook*

Learn to build *reliable* agent pipelines that:

1. **Plan** a series of steps  
2. **Act** (call tools / APIs)  
3. **Reflect & Validate** outputs  
4. **Iterate** until success

We combine **ReAct** (interleaved Thought/Action/Observation) with **Plan‑then‑Act** (think first, execute later) plus automatic self‑critique and retry loops.

---

## 📚 Learning Path

| Section | Concept | Hands‑On |
|---------|---------|----------|
| 0 | Setup & helper functions | Configure API key |
| 1 | ReAct Quickstart | Toy calculator tool |
| 2 | Plan‑then‑Act Pattern | Multi‑step research plan |
| 3 | Validation Pipeline | Generate → Validate → Retry |
| 4 | Reflexion Upgrade | Self‑critique & prompt improvement |
| 5 | Mini Research Agent | Query Wikipedia & cite answer |


## 0️⃣ Setup

In [None]:
%pip -q install --upgrade openai wikipedia tiktoken
import os, json, random, textwrap, re
import wikipedia
from openai import OpenAI

# openai>=1.0 removed openai.ChatCompletion; requests go through a client object.
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY") or "YOUR_API_KEY")
MODEL = "gpt-4o-mini"

def chat(system, user, temperature=0.3, max_tokens=256, model=MODEL):
    """Send one system + user message pair and return the reply text."""
    messages=[{"role":"system","content":system},
              {"role":"user","content":user}]
    resp=client.chat.completions.create(model=model, messages=messages,
                                        temperature=temperature,
                                        max_tokens=max_tokens)
    return resp.choices[0].message.content.strip()


---

## 1️⃣ ReAct Quickstart

ReAct prompts **interleave**:

```
Thought: reason about next step
Action: tool(args)
Observation: result of tool
```
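
To make the format concrete, here is a minimal sketch of splitting a reply in this layout into labelled steps. `parse_react` and `STEP_RE` are hypothetical helpers introduced for illustration, and the sample reply is hand-written rather than real model output.

```python
import re

# Hypothetical helper: recognise the line labels used in the ReAct layout.
STEP_RE = re.compile(r"^(Thought|Action|Observation|Answer):\s*(.*)$")

def parse_react(reply):
    """Return a list of (label, text) pairs, skipping unlabelled lines."""
    steps = []
    for line in reply.splitlines():
        m = STEP_RE.match(line.strip())
        if m:
            steps.append((m.group(1), m.group(2)))
    return steps

# Hand-written sample reply, not real model output.
sample = """Thought: I need to multiply first.
Action: calculator((12+7)*3)
Observation: 57"""

for label, text in parse_react(sample):
    print(f"{label:12s}| {text}")
```

Keeping the labels as data makes it easy to pick out the last `Action` or to detect a final `Answer` without string searches scattered through the loop.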

Below we register a *calculator* tool and let the LLM solve a word‑math query.

In [None]:
# --- Tool registry ---
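def calculator(expr):
    # NOTE: eval() runs arbitrary Python. Emptying __builtins__ blocks the most
    # obvious abuse but is not a sandbox; never expose this to untrusted input.
    try:
        return str(eval(expr, {"__builtins__": {}}, {}))
    except Exception as e:
        return f"Error: {e}"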

TOOLS = {
    "calculator": calculator
}

def run_tool(line:str):
    m=re.match(r"([a-zA-Z_]+)\((.*)\)", line.strip())
    if not m: return "Invalid tool syntax"
    name,args=m.group(1),m.group(2)
    if name not in TOOLS: return "Unknown tool"
    return TOOLS[name](args)

# --- ReAct loop ---
def react_agent(question, max_turns=5):
    # Seed the transcript with the question so the model knows what to solve.
    history = f"Question: {question}\n"
    for t in range(max_turns):
        prompt = f"""You are a ReAct agent. Available tool: calculator(expr).
Respond using steps:
Thought: ...
Action: ...
Observation: ...
(Stop with 'Answer: <final>').

{history}"""
        resp = chat("You are a helpful agent.", prompt)
        print(resp)
        history += resp + "\n"
        # a final answer ends the loop; no further tool call is needed
        if "Answer:" in resp: break
        # parse the last Action line and run the matching tool
        action_lines=[l for l in resp.splitlines() if l.startswith("Action:")]
        if not action_lines: break
        cmd=action_lines[-1].replace("Action:","",1).strip()
        obs=run_tool(cmd)
        history += f"Observation: {obs}\n"
    return history

react_agent("What is (12+7)*3 - 5 ?")


#### 📝 Try‑it: Add a `sqrt(x)` function to the calculator tool.
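
One possible approach to this exercise, sketched under assumptions: instead of `eval`, walk the expression with Python's `ast` module and allow only a whitelist of operators and functions. `safe_calculator`, `_BIN_OPS`, and `_FUNCS` are hypothetical names, and the whitelist is deliberately small.

```python
import ast
import math
import operator

# Assumption: only these operators and functions are permitted.
_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul, ast.Div: operator.truediv}
_FUNCS = {"sqrt": math.sqrt}

def _eval_node(node):
    """Recursively evaluate a whitelisted arithmetic AST node."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval_node(node.left),
                                       _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval_node(node.operand)
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id in _FUNCS and not node.keywords):
        return _FUNCS[node.func.id](*[_eval_node(a) for a in node.args])
    raise ValueError("unsupported expression")

def safe_calculator(expr):
    """Same contract as calculator(): return a string result or 'Error: ...'."""
    try:
        tree = ast.parse(expr.strip(), mode="eval")
        return str(_eval_node(tree.body))
    except Exception as e:
        return f"Error: {e}"

print(safe_calculator("(12+7)*3 - 5"))   # 52
print(safe_calculator("sqrt(9801)"))     # 99.0
print(safe_calculator("__import__('os')"))  # Error: unsupported expression
```

Registering it is a one-line change, `TOOLS["calculator"] = safe_calculator`. The greedy `(.*)` in `run_tool` already passes nested calls such as `calculator(sqrt(9801))` through intact.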

---

## 2️⃣ Plan‑then‑Act

Pattern:

1. **Plan** (list numbered steps)  
2. **Execute** each step with tools  
3. Combine results for final answer


In [None]:
def plan_then_act(query):
    # Step 1: model drafts a plan
    plan = chat("You are good at planning.",
                f"First list 3 numbered steps to answer: {query}\n"
                "Wrap any arithmetic to compute in backticks, e.g. `2*3`.\n"
                "Respond ONLY with the plan.")
    print("Plan →", plan, "\n")

    # Step 2: iterate over each numbered step (one per line)
    aggregated_notes=[]
    for step in re.findall(r"^\s*\d+[.)]\s+(.+)$", plan, flags=re.M):
        # naive tool choice: a backticked expression → calculator.
        # (Sending a whole sentence to the calculator would only yield an error.)
        m=re.search(r"`([^`]+)`", step)
        if m:
            obs=calculator(m.group(1))
        else:
            # fallback reasoning; include the query so the step has context
            obs=chat("You are knowledgeable.", f"Task: {query}\nStep: {step}")
        aggregated_notes.append(f"{step} → {obs}")

    # Step 3: summarize
    notes="\n".join(aggregated_notes)
    final = chat("You are a summarizer.",
                 f"Given these notes:\n{notes}\nProvide the concise answer to '{query}'.")
    return final

print(plan_then_act("How many seconds are there in 2.5 hours?"))
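
The plan-parsing step can be checked offline before spending API calls. The sketch below runs on a hand-written plan; `parse_plan` and `backticked` are hypothetical helpers, and it assumes the model puts one numbered step per line.

```python
import re

def parse_plan(plan_text):
    """Return the text of each numbered step ('1. ...' or '1) ...')."""
    return re.findall(r"^\s*\d+[.)]\s+(.+)$", plan_text, flags=re.MULTILINE)

def backticked(step):
    """Return the first backticked expression in a step, or None."""
    m = re.search(r"`([^`]+)`", step)
    return m.group(1) if m else None

# Hand-written plan standing in for model output.
sample_plan = """1. Recall that one hour has 3600 seconds.
2. Compute `2.5 * 3600`.
3. Report the result with units."""

steps = parse_plan(sample_plan)
for s in steps:
    print(repr(backticked(s)), "<-", s)
```

Step 1 contains digits but no expression, which shows why "digits present" alone is a weak signal for routing a step to the calculator.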


---

## 3️⃣ Validation Pipeline

Wrap any generator (ReAct or Plan‑then‑Act) with a **generate→validate→retry** loop.

In [None]:
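def validator(task, answer):
    # simple: ask LLM to verify
    judge = chat("You are a strict validator.",
                 f"Task: {task}\nAnswer: {answer}\nIs it correct? Start your reply with YES or NO, then give the reason.")
    # tolerate leading whitespace and lower-case verdicts such as "Yes, ..."
    return judge.strip().upper().startswith("YES")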

def generate_validate(generator_fn, task, max_attempts=3):
    for attempt in range(1, max_attempts+1):
        ans = generator_fn(task)
        ok = validator(task, ans)
        print(f"Attempt {attempt}: {'✅' if ok else '❌'}")
        if ok: return ans
    return ans  # return last even if failed

generate_validate(plan_then_act, "Convert 5 miles to kilometers (1 mile = 1.60934 km).")
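
An LLM judge can itself be wrong, so for tasks with a known numeric result a deterministic check is a useful complement. The following is a sketch under assumptions: `numeric_validator` and `generate_validate_with` are hypothetical names, and the generator is a stub that replays canned answers so the loop runs without an API key.

```python
import math
import re

def numeric_validator(answer, expected, rel_tol=1e-3):
    """Pass if any number in the answer text is close to the expected value."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", answer.replace(",", ""))
    return any(math.isclose(float(n), expected, rel_tol=rel_tol)
               for n in numbers)

def generate_validate_with(generator_fn, check_fn, task, max_attempts=3):
    """Generate → validate → retry; returns (answer, attempts_used)."""
    ans = None
    for attempt in range(1, max_attempts + 1):
        ans = generator_fn(task)
        if check_fn(ans):
            return ans, attempt
    return ans, max_attempts  # last answer, even though it failed

# Stub generator: wrong on the first call, right on the second.
canned = iter(["5 miles is about 8.5 km.", "5 miles is 8.0467 km."])
def fake_generator(task):
    return next(canned)

expected = 5 * 1.60934  # ≈ 8.0467 km
result, attempts = generate_validate_with(
    fake_generator,
    lambda a: numeric_validator(a, expected),
    "Convert 5 miles to kilometers (1 mile = 1.60934 km).")
print(result, "| attempts:", attempts)
```

Passing the check as a parameter keeps the retry loop unchanged whether the validator is a regex, a numeric comparison, or the LLM judge above.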


---

## 4️⃣ Reflexion Upgrade

If validation fails, ask the model to **critique** and improve its own plan before retrying.

In [None]:
def reflect_and_retry(task, rounds=2):
    critique = ""
    for r in range(rounds):
        if r == 0:
            answer = plan_then_act(task)
        else:
            # include the task and the previous answer so the rewrite has full context
            answer = chat("You improve answers.",
                          f"Task: {task}\nPrevious answer: {answer}\n"
                          f"Critique: {critique}\nRewrite a better answer.")
        if validator(task, answer):
            print("Success after", r+1, "rounds.")
            return answer
        critique = chat("You are a critical reviewer.",
                        f"Task: {task}\nThe answer '{answer}' is incorrect. Explain why and hint improvements.")
        print("Critique →", critique[:200], "...")
    return answer

reflect_and_retry("What is the square root of 9801?")
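
The control flow of the critique-and-retry idea can be isolated from the model calls. Below is a minimal sketch in which the generator, validator, and critic are injected as plain functions; `reflexion_loop` and the three `fake_*` stubs are hypothetical and exist only so the loop can be run offline.

```python
def reflexion_loop(task, generate, validate, critique, rounds=3):
    """Generate, validate, and on failure feed a critique into the next attempt."""
    feedback = []   # critiques accumulated so far (the agent's "memory")
    answer = None
    for r in range(1, rounds + 1):
        answer = generate(task, feedback)
        if validate(task, answer):
            return answer, r
        feedback.append(critique(task, answer))
    return answer, rounds

# Stubs standing in for LLM calls.
def fake_generate(task, feedback):
    # First try is wrong; any feedback nudges it to the right value.
    return "99" if feedback else "98"

def fake_validate(task, answer):
    return answer.isdigit() and int(answer) ** 2 == 9801

def fake_critique(task, answer):
    return f"{answer} squared is not 9801; try a nearby integer."

print(reflexion_loop("What is the square root of 9801?",
                     fake_generate, fake_validate, fake_critique))
# ('99', 2)
```

In a real run each stub would wrap a `chat(...)` call. Keeping every critique in a list lets later rounds see all earlier feedback rather than only the most recent one.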


---

## 5️⃣ Mini Research Agent

Tool: `wiki_search(query)` – returns the first sentence of the matching Wikipedia page summary.  
Agent: Plan‑then‑Act, wrapped in the validation loop, producing an answer that cites its source.

In [None]:
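def wiki_search(q):
    try:
        page=wikipedia.page(q, auto_suggest=False)
    except wikipedia.exceptions.WikipediaException:
        # exact title failed (missing or ambiguous): fall back to the top search hit
        hits=wikipedia.search(q)
        if not hits:
            return "Not found"
        try:
            page=wikipedia.page(hits[0], auto_suggest=False)
        except wikipedia.exceptions.WikipediaException:
            return "Not found"
    # crude first-sentence split; abbreviations such as "U.S." will cut it short
    return page.summary.split('.')[0] + '.'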

TOOLS['wiki_search']=wiki_search

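def research_agent(question):
    # The instruction is a single-quoted literal so the example call can contain
    # double quotes; the question is included so the plan is about the right topic.
    plan = chat("You are a planning assistant.",
                f"Question: {question}\n"
                'Break down into steps and decide when to call wiki_search. '
                'Use `wiki_search("topic")` syntax.')
    notes=[]
    # backticks are optional in the pattern in case the model omits them
    for topic in re.findall(r'`?wiki_search\("([^"]+)"\)`?', plan):
        obs=wiki_search(topic)
        notes.append(obs)
    answer=chat("You are a writer.",
                f"Context: {' '.join(notes)}\nAnswer the question '{question}' in two sentences and cite Wikipedia.")
    return answer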

print(generate_validate(research_agent, "Who invented the World Wide Web?"))
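
The LLM judge checks correctness but not whether a source is actually cited. A cheap string check can run first, as in this sketch; `cites_wikipedia` and `validate_with_citation` are hypothetical helpers, and the judge is passed in as a callable so the example runs without an API key.

```python
def cites_wikipedia(answer):
    """True if the answer mentions Wikipedia (by name or wikipedia.org URL)."""
    return "wikipedia" in answer.lower()

def validate_with_citation(task, answer, judge):
    """Cheap citation check first; call the costlier judge only if it passes."""
    return cites_wikipedia(answer) and judge(task, answer)

# Stub judge that accepts everything, standing in for the LLM validator.
always_yes = lambda task, answer: True

good = ("Tim Berners-Lee invented the World Wide Web in 1989 "
        "while working at CERN (source: Wikipedia).")
bad = "Tim Berners-Lee invented the World Wide Web."

task = "Who invented the World Wide Web?"
print(validate_with_citation(task, good, always_yes))  # True
print(validate_with_citation(task, bad, always_yes))   # False
```

Because `and` short-circuits, an uncited answer is rejected without a model call.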


---

## 🔗 Further Reading

* Yao et al., “ReAct: Synergizing Reasoning and Acting”, 2023  
* Nakano et al., “Plan-and-Execute”, 2024  
* Shinn et al., “Reflexion”, 2023  
* OpenAI Cookbook – function-calling agents
