Researcher + Writer Agents (LangGraph) — Step-by-step

Goal: Build a LangGraph workflow where a Researcher agent gathers facts (Wikipedia) and a Writer agent turns those facts into a LinkedIn-style post. State persists research + final post. The agents are created with create_react_agent (LangGraph prebuilt ReAct).

0 — Notes / prerequisites (read before running)

You need a Google API key from Google AI Studio (we use Gemini via langchain-google-genai).

This notebook needs no GPU and also runs in Colab. If you use Colab, prefer setting the key via getpass or Colab secrets.

We use @tool to register callable tools for agents; ensure docstrings + type hints are present.

Install dependencies

In [None]:
# Run this cell first (Colab / Jupyter)
!pip install -qU langchain langgraph langchain-google-genai wikipedia pydantic



Set your Google API key securely

In [19]:
# In Colab use getpass to avoid leaking your key in logs
from getpass import getpass
import os

key = getpass("Paste your Google AI Studio API key (keeps it hidden): ")
os.environ["GOOGLE_API_KEY"] = key.strip()
# Do not hard-code the key in a cell: it would be saved with the notebook and leak.

Paste your Google AI Studio API key (keeps it hidden): ··········


Imports & short explanation of core objects.
Why these?

StateGraph + add_node/add_edge are LangGraph primitives used to model the workflow as a graph.

create_react_agent is the recommended prebuilt to create ReAct-style agents in LangGraph.

In [20]:
# Core LLM + LangGraph imports
from langchain_google_genai import ChatGoogleGenerativeAI
from langgraph.prebuilt import create_react_agent
from langgraph.graph import StateGraph, START, END

# Tools decorator
from langchain_core.tools import tool

# Other utilities
from pydantic import BaseModel
from typing import List, Optional, Dict, Any
import wikipedia, json, re, time


Define the typed state model
Why typed state?

Typed state (Pydantic) gives predictable keys, validation, and easier debugging/visualization in LangGraph Studio.

In [21]:
# A typed state helps guarantee consistent communication between agents.
class MultiAgentState(BaseModel):
    topic: str
    research_notes: List[str] = []
    structured_facts: Dict[str, Any] = {}
    draft: Optional[str] = None
    final_post: Optional[str] = None
    trace: List[Dict[str, Any]] = []


Tool: Wikipedia lookup (register as a tool)

In [22]:
# Tools must have a docstring and type annotations. LangChain will use the docstring
# as the tool description presented to the LLM.
@tool
def wiki_lookup(query: str) -> str:
    """
    wiki_lookup(query: str) -> str
    Search Wikipedia for the query and return a short JSON string:
      {"title": "<page title>", "summary": "<two-sentence summary>"}
    Returns a JSON string (so the agent can parse it reliably).
    """
    try:
        titles = wikipedia.search(query, results=3)
        if not titles:
            return json.dumps({"error": "no_results"})
        page = wikipedia.page(titles[0], auto_suggest=False)
        # auto_suggest=False keeps the summary on the exact page resolved above
        summary = wikipedia.summary(page.title, sentences=2, auto_suggest=False)
        return json.dumps({"title": page.title, "summary": summary})
    except Exception as e:
        return json.dumps({"error": str(e)})
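
Because wiki_lookup always returns a JSON string, a caller can tell a successful lookup from a failure by checking for the "error" key. The following is a minimal sketch, assuming only the two payload shapes produced by the tool above; parse_lookup is a hypothetical helper name, not part of LangChain or the wikipedia package.

```python
import json
from typing import Optional, Tuple


def parse_lookup(raw: str) -> Tuple[Optional[dict], Optional[str]]:
    """Split a wiki_lookup JSON string into (result, error). Hypothetical helper."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"invalid_json: {exc.msg}"
    if not isinstance(data, dict):
        return None, "unexpected_payload"
    if "error" in data:
        # the tool reports failures as {"error": "..."}
        return None, str(data["error"])
    result = {"title": data.get("title", ""), "summary": data.get("summary", "")}
    return result, None
```

Returning a (result, error) pair keeps the caller free of try/except blocks and makes the failure reason easy to log in the trace.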


Initialize the LLM and create two agents

In [23]:
# Initialize Gemini (use your environment key)
llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",  # pick the available model in your account
    google_api_key=os.environ.get("GOOGLE_API_KEY"),
    temperature=0.0
)

# Researcher agent: allowed to call wiki_lookup
researcher_agent = create_react_agent(
    model=llm,
    tools=[wiki_lookup],
    prompt=(
        "You are a Researcher agent. Your job: given a topic, gather up to 5 concise facts. "
        "When you need factual verification, call the tool `wiki_lookup(query)`. "
        "Return facts as a JSON array like: "
        '[{"fact": "...", "source":"<title>"}].'
    )
)

# Writer agent: no tools, just craft high-quality LinkedIn-style post using research notes.
writer_agent = create_react_agent(
    model=llm,
    tools=[],  # no tools; the research notes arrive in the user message
    prompt=(
        "You are a Writer agent. Use the research notes provided in the user message to write "
        "a LinkedIn-style post (~150-220 words). Keep it professional and add one sentence "
        "highlighting the strongest fact from the research (mention source)."
    )
)


Node functions: how each graph node uses its agent


Notes on node behavior:

Node functions are intentionally defensive: agents may return free-text; we attempt JSON parse first, then fallback to raw text.

Each node returns only the updated fields (LangGraph merges them into the shared state). This aligns with LangGraph’s node semantics.
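
The merge behaviour described above can be pictured in plain Python. This is a simplified illustration, not LangGraph's implementation: it assumes no reducers are declared on the state, so each key returned by a node overwrites the previous value. The name apply_update and the sample values are hypothetical.

```python
from typing import Any, Dict


def apply_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new state in which keys from `update` overwrite those in `state`."""
    merged = dict(state)   # copy so the previous state is left untouched
    merged.update(update)  # keys absent from `update` keep their old values
    return merged


state = {"topic": "solar power", "research_notes": [], "final_post": None}
# the researcher returns only the fields it changed
state = apply_update(state, {"research_notes": ["fact A (source: Solar power)"]})
# the writer returns only final_post; topic and research_notes are preserved
state = apply_update(state, {"final_post": "Draft post text"})
```

This is why each node below rebuilds the full research_notes and trace lists before returning them: with overwrite semantics, returning only the new items would drop the earlier ones.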

In [None]:
# Node functions: each takes the typed state (MultiAgentState) and returns a dict of updates (LangGraph convention).

def researcher_node(state: MultiAgentState):
    """
    Reads: state.topic
    Writes: research_notes (appended), structured_facts, trace
    """
    topic = state.topic
    # explicit prompt to make agent return structured JSON
    prompt = (
        "Research the following topic and return up to 5 concise facts as a JSON array.\n\n"
        f"Topic: {topic}\n\n"
        "When needed, call wiki_lookup(query). Each fact object should be: "
        '{"fact": "...", "source":"<title>"}'
    )

    resp = researcher_agent.invoke({"messages": [("user", prompt)]})
    # robust extraction of generated text from response
    content = ""
    if isinstance(resp, dict):
        # many LangGraph prebuilt returns include a messages list
        content = resp.get("output") or ""
        if not content and resp.get("messages"):
            content = resp["messages"][-1].content
    else:
        content = str(resp)

    # Try parsing JSON first; if that yields nothing, keep the whole reply as one raw note
    parsed_facts = []
    try:
        parsed = json.loads(content)
        # accept either a bare list or an object of the form {"facts": [...]}
        items = parsed.get("facts", []) if isinstance(parsed, dict) else parsed
        if isinstance(items, list):
            for item in items:
                if isinstance(item, dict):
                    # normalize into a string note
                    parsed_facts.append(f"{item.get('fact', '')} (source: {item.get('source', '')})")
                else:
                    parsed_facts.append(str(item))
    except (json.JSONDecodeError, TypeError):
        parsed_facts = []
    if not parsed_facts:
        # fallback: put the whole content as one research note
        parsed_facts.append(content.strip())

    # update state
    notes = state.research_notes or []
    notes.extend(parsed_facts)
    # naive structured extraction: first integer-like number found
    first_num = None
    m = re.search(r"(\d{1,3}(?:[,\d]{0,})?)", content.replace("\xa0"," "))
    if m:
        try:
            first_num = int(m.group(1).replace(",", ""))
        except ValueError:
            first_num = None

    structured = state.structured_facts or {}
    if first_num is not None and "first_number" not in structured:
        structured["first_number"] = first_num

    trace = state.trace or []
    trace.append({"node": "researcher", "time": time.time(), "raw": content[:1000]})

    return {"research_notes": notes, "structured_facts": structured, "trace": trace}

def writer_node(state: MultiAgentState):
    """
    Reads: state.research_notes
    Writes: final_post (string), trace
    """
    notes = state.research_notes or []
    prompt = (
        "Using the research notes below, write a LinkedIn-style post of ~150-220 words.\n\n"
        "RESEARCH NOTES:\n" + "\n".join(notes) + "\n\n"
        "Post:"
    )

    resp = writer_agent.invoke({"messages": [("user", prompt)]})
    content = ""
    if isinstance(resp, dict):
        content = resp.get("output") or ""
        if not content and resp.get("messages"):
            content = resp["messages"][-1].content
    else:
        content = str(resp)

    trace = state.trace or []
    trace.append({"node": "writer", "time": time.time(), "raw": content[:400]})

    return {"final_post": content.strip(), "trace": trace}
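
Both nodes call .strip() on the last message's content. In LangChain, message content can be either a string or a list of parts (strings or dicts), and .strip() fails on the list form. The following is a minimal sketch of a flattening helper that could be applied before stripping; message_text is a hypothetical name, and the assumption that text parts carry a "text" key should be checked against what your model actually returns.

```python
from typing import Any


def message_text(content: Any) -> str:
    """Flatten message content (a string or a list of parts) into plain text."""
    if content is None:
        return ""
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        parts = []
        for part in content:
            if isinstance(part, str):
                parts.append(part)
            elif isinstance(part, dict) and isinstance(part.get("text"), str):
                # assumption: text blocks look like {"type": "text", "text": "..."}
                parts.append(part["text"])
            # non-text parts (for example images) are skipped
        return "".join(parts)
    return str(content)
```

With this in place, a node could use message_text(resp["messages"][-1].content) wherever it currently reads .content directly.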


Build the graph (nodes + edges) and compile

In [25]:
# Create the StateGraph using the Pydantic model as the schema
graph = StateGraph(MultiAgentState)

# Register nodes
graph.add_node("researcher", researcher_node)
graph.add_node("writer", writer_node)

# Wire the flow: START -> researcher -> writer -> END
graph.add_edge(START, "researcher")
graph.add_edge("researcher", "writer")
graph.add_edge("writer", END)

# Compile to get a runnable app
app = graph.compile()
print("Graph compiled. Ready to run.")


Graph compiled. Ready to run.


Run the graph on a topic and inspect the results
What to expect:

research_notes will contain the Researcher agent’s findings (may be one or multiple items).

final_post will be the Writer agent’s LinkedIn-style result.

trace shows which nodes ran and a snippet of their raw outputs (useful for debugging).

In [None]:
# Provide just the topic; the graph/state model defaults the rest
input_state = {"topic": "How Agentic AI is better than Gen AI"}

# Run the compiled graph app
result = app.invoke(input_state)

# Inspect final output and trace
print("=== FINAL POST ===\n")
print(result.get("final_post") or result.get("draft") or "No final post produced.\n")

print("\n=== RESEARCH NOTES ===")
for idx, n in enumerate(result.get("research_notes", []), 1):
    print(f"{idx}. {n}")

print("\n=== TRACE (short) ===")
for t in result.get("trace", []):
    print(t["node"], "-", t.get("raw", "")[:120])


=== FINAL POST ===

Global food security is facing an unprecedented challenge, and climate change is at the heart of it. Our agricultural systems, vital for feeding a growing population, are increasingly strained, making it harder to provide a stable food supply worldwide.

The impacts are clear and multifaceted. We're seeing how rising global temperatures and unpredictable weather patterns are directly contributing to lower crop yields across various regions. This isn't just about heat; water scarcity, driven by more frequent droughts, intense heat waves, and even devastating floods, further exacerbates the problem, making it harder for essential crops to thrive and reach their full potential.

It's a stark reality: according to "Effects of climate change on agriculture," rising temperatures and changing weather patterns often result in lower crop yields. This critical insight underscores the direct link between our changing climate and the very foundation of our food supply.

Ensurin

(Optional) Visualize the graph in the notebook

In [27]:
# Visualize the compiled graph as mermaid; works in many notebook UIs
from IPython.display import display, Markdown

try:
    mermaid = app.get_graph().draw_mermaid()
    display(Markdown("```mermaid\n" + mermaid + "\n```"))
except Exception as e:
    print("Graph drawing may not be supported in this environment:", e)


```mermaid
---
config:
  flowchart:
    curve: linear
---
graph TD;
	__start__([<p>__start__</p>]):::first
	researcher(researcher)
	writer(writer)
	__end__([<p>__end__</p>]):::last
	__start__ --> researcher;
	researcher --> writer;
	writer --> __end__;
	classDef default fill:#f2f0ff,line-height:1.2
	classDef first fill-opacity:0
	classDef last fill:#bfb6fc

```

# Troubleshooting & Tips (brief but critical)

If you see InvalidArgument: contents is not specified, ensure you call agents with a proper messages payload (example above uses {"messages":[("user", prompt)]}). If your environment requires role-content dicts, try {"messages":[{"role":"user","content":prompt}]}. (Different wrappers accept slightly different message formats.)
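
If you need to switch between the two message formats mentioned above, a small converter avoids rewriting every call site. This is a sketch under the assumption that messages are either (role, content) tuples or role/content dicts; to_role_dicts is a hypothetical helper, not a LangChain function.

```python
from typing import Dict, List, Sequence, Tuple, Union

Message = Union[Tuple[str, str], Dict[str, str]]


def to_role_dicts(messages: Sequence[Message]) -> List[Dict[str, str]]:
    """Convert (role, content) tuples into {"role": ..., "content": ...} dicts."""
    converted = []
    for message in messages:
        if isinstance(message, dict):
            # already in dict form: keep only the two expected keys
            converted.append({"role": message["role"], "content": message["content"]})
        else:
            role, content = message
            converted.append({"role": role, "content": content})
    return converted


payload = {"messages": to_role_dicts([("user", "Research solar power")])}
```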

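If agents return large free text instead of JSON: tighten the agent's prompt (the `prompt` argument of create_react_agent, named `state_modifier` in older LangGraph releases) and the per-call user prompt so that both require JSON-only output with an exact schema example.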
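
Even with a strict prompt, models often wrap JSON in a Markdown code fence or surround it with a sentence of prose. A more tolerant parser can recover the array in those cases. The following is a best-effort sketch, not a complete JSON recovery routine; extract_json is a hypothetical helper, and it only handles fenced output and a single outermost [...] span.

```python
import json
import re
from typing import Any, Optional

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable
FENCE_RE = re.compile(rf"^{FENCE}(?:json)?\s*|\s*{FENCE}$", re.IGNORECASE)


def extract_json(text: str) -> Optional[Any]:
    """Best-effort JSON extraction from model output; returns None on failure."""
    cleaned = FENCE_RE.sub("", text.strip())  # drop a leading/trailing fence if present
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        pass
    # fall back to the outermost [...] span inside surrounding prose
    start, end = cleaned.find("["), cleaned.rfind("]")
    if start != -1 and end > start:
        try:
            return json.loads(cleaned[start:end + 1])
        except json.JSONDecodeError:
            return None
    return None
```

A node could call extract_json(content) in place of json.loads(content) and treat a None result as the signal to fall back to the raw text.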

For production, add timeouts and a step limit (for example, LangGraph's `recursion_limit` run config) to avoid runaway loops. Cache frequent wiki queries to reduce latency/cost.
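
Caching repeated lookups can be sketched with the standard library alone. In the example below, slow_lookup is a hypothetical stand-in for the real network call (such as the body of wiki_lookup), and the call counter exists only to make the cache hits visible; a real setup may also want expiry, which functools.lru_cache does not provide.

```python
from functools import lru_cache

CALLS = {"count": 0}  # counts how often the underlying lookup really runs


def slow_lookup(query: str) -> str:
    """Hypothetical stand-in for a network call such as a Wikipedia search."""
    CALLS["count"] += 1
    return f"summary for {query}"


@lru_cache(maxsize=256)
def _cached_lookup(normalized_query: str) -> str:
    return slow_lookup(normalized_query)


def lookup(query: str) -> str:
    """Normalize the query so trivially different spellings share one cache entry."""
    return _cached_lookup(query.strip().lower())


first = lookup("Solar power")
second = lookup("  solar power ")  # served from the cache, no second call
```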

If Wikipedia misses facts, consider SerpAPI / official data sources for higher reliability (but expect API keys/costs).

# Next steps you can try
(A) add a small "Supervisor" node (optional) that checks research quality and re-runs researcher if notes are too short, or

(B) add a parallel-researchers + aggregator variant (more advanced).
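
For option (A), the core of a supervisor is a routing decision: go back to the researcher when the notes look thin, otherwise continue to the writer. The following is a minimal sketch of that decision only. The thresholds MIN_NOTES and MAX_ATTEMPTS and the function name are hypothetical, and the function takes a plain dict for illustration; with the Pydantic state used in this notebook you would read attributes (state.research_notes, state.trace) instead.

```python
from typing import Any, Dict

MIN_NOTES = 3      # hypothetical quality bar: fewer notes triggers a retry
MAX_ATTEMPTS = 2   # hypothetical cap so the graph cannot loop forever


def route_after_research(state: Dict[str, Any]) -> str:
    """Return the name of the next node: "researcher" to retry, else "writer"."""
    notes = [n for n in state.get("research_notes", []) if n.strip()]
    # each researcher run appends one trace entry, so this counts attempts so far
    attempts = sum(1 for t in state.get("trace", []) if t.get("node") == "researcher")
    if len(notes) < MIN_NOTES and attempts < MAX_ATTEMPTS:
        return "researcher"
    return "writer"
```

In LangGraph, a function like this would typically be attached with graph.add_conditional_edges("researcher", route_after_research) in place of the fixed researcher -> writer edge. The attempt cap matters because a retry loop without one can run until the graph's step limit is hit.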