# Research Workflow

This notebook demonstrates the research [workflow](https://langchain-ai.github.io/langgraph/tutorials/workflows/) that creates comprehensive reports through a series of focused steps. The system:

1. Uses a **graph workflow** with specialized nodes for each report creation stage
2. Enables user **feedback and approval** at critical planning points 
3. Produces a well-structured report with introduction, researched body sections, and conclusion

## From repo 

In [1]:
%cd ..
%load_ext autoreload
%autoreload 2

c:\Users\62687\Desktop\个人\实习相关\标准化研究院实习\QM_report\src


# Compile the Graph-Based Research Workflow

The next step is to compile the LangGraph workflow that orchestrates the report creation process. This defines the sequence of operations and decision points in the research pipeline.
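Before compiling, it can help to confirm that the environment variables this notebook reads are actually set, since a missing key otherwise only surfaces later as a `KeyError` or a `None` API key. The following is a minimal sketch, not part of `open_deep_research`: `missing_env_vars` and `REQUIRED_VARS` are hypothetical names, and the variable list simply mirrors the names used in this notebook.

```python
import os
from typing import Iterable, List, Mapping, Optional


def missing_env_vars(names: Iterable[str], env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Variable names taken from this notebook; adjust them to your own .env file.
REQUIRED_VARS = [
    "PLANNER_MODEL",
    "PLANNER_PROVIDER",
    "WRITER_PROVIDER",
    "TAVILY_API_KEY",
    "DEEPSEEK_API_KEY",
]

# Demonstration with a stand-in environment so the example has no side effects.
example_env = {
    "PLANNER_MODEL": "deepseek-chat",
    "PLANNER_PROVIDER": "deepseek",
    "TAVILY_API_KEY": "",  # empty values count as missing
}
print(missing_env_vars(REQUIRED_VARS, example_env))
```

In the notebook itself, calling `missing_env_vars(REQUIRED_VARS)` after `load_dotenv` would check the real environment.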

In [2]:
# Import required modules and initialize the builder from open_deep_research
import os
import uuid

import open_deep_research
print(open_deep_research.__version__)

from dotenv import load_dotenv
from IPython.display import Image, display, Markdown
from langgraph.checkpoint.memory import MemorySaver
from langgraph.types import Command
from open_deep_research.graph import builder

# Create a memory-based checkpointer and compile the graph
# This enables state persistence and tracking throughout the workflow execution
memory = MemorySaver()
graph = builder.compile(checkpointer=memory)

# Load settings from the repository's .env file and confirm the planner configuration
load_dotenv("../.env")
print(os.environ["PLANNER_MODEL"])
print(os.environ["PLANNER_PROVIDER"])

0.0.15
builder compiling ...
builder compiled.
deepseek-chat
deepseek


In [9]:
import open_deep_research
import open_deep_research.prompts
import open_deep_research.configuration
import open_deep_research.utils
import open_deep_research.graph
from importlib import reload

reload(open_deep_research) 
reload(open_deep_research.prompts)
reload(open_deep_research.configuration)
reload(open_deep_research.utils) 
reload(open_deep_research.graph) 

builder compiling ...
builder compiled.


<module 'open_deep_research.graph' from 'c:\\Users\\62687\\Desktop\\个人\\实习相关\\标准化研究院实习\\open_deep_research\\src\\open_deep_research\\graph.py'>

In [None]:
# Visualize the graph structure
# This shows the nodes and edges in the research workflow

display(Image(graph.get_graph(xray=1).draw_mermaid_png()))

In [None]:
# Define report structure template and configure the research workflow
# This sets parameters for models, search tools, and report organization

# REPORT_STRUCTURE = """Use this structure to create a report on the user-provided topic:

# 1. Introduction (no research needed)
#    - Brief overview of the topic area

# 2. Main Body Sections:
#    - Each section should focus on a sub-topic of the user-provided topic
   
# 3. Conclusion
#    - Aim for 1 structural element (either a list or table) that distills the main body sections
#    - Provide a concise summary of the report"""
# topic = "Overview of Model Context Protocol (MCP), an Anthropic‑backed open standard for integrating external context and tools with LLMs. Give an architectural overview for developers, tell me about interesting MCP servers, and compare to google Agent2Agent (A2A) protocol."

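REPORT_STRUCTURE = """
Chapter 1. Current State of Product Quality Development
   Section 1. General overview: analyze the overall situation using that year's data and policies
   Section 2. Regional overview: analyze the overall situation of each region using that year's data and policies
   Section 3. Industry overview: analyze the overall situation of each industry using that year's data and policies
Chapter 2. Regional Product Quality: contains several sections (corresponding to the regions listed in Chapter 1, Section 2); each section focuses on one specific region and analyzes the current state of its product quality development. If the report scope is national, list all provinces; if the scope is a single province, list all of its prefecture-level cities
Chapter 3. Industry Product Quality: contains 3-5 sections; each section focuses on one specific industry and analyzes the current state of its product quality development, for example equipment manufacturing, resource processing, food, pharmaceuticals, tobacco and alcohol, and consumer goods manufacturing

"""

CONCLUSION_STRUCTURE = """
Chapter 4. Problem Analysis: 3-5 sections that draw on the preceding chapters to analyze the problems and challenges in current product quality development
Chapter 5. Policy Recommendations: 3-5 sections that propose solutions and recommendations for the problems identified in Chapter 4
"""
topic = "Analysis Report on the Product Quality Pass Rate of Manufacturing in Hunan Province (2024)"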

TAVILY_API_KEY = os.environ.get("TAVILY_API_KEY")
DEEPSEEK_API_KEY = os.environ.get("DEEPSEEK_API_KEY")
DEEPSEEK_MODEL = os.environ.get("PLANNER_MODEL", "deepseek-chat")
PROVIDER = os.environ.get("WRITER_PROVIDER", "deepseek")
BASE_URL = 'https://api.deepseek.com/v1'
thread = {"configurable": {"thread_id": str(uuid.uuid4()),
                           "search_api": "tavily",
                           "search_api_config": {"api_key": TAVILY_API_KEY},
                           "planner_provider": PROVIDER,
                           "planner_model": DEEPSEEK_MODEL,
                           "planner_model_kwargs": {
                              "api_key": DEEPSEEK_API_KEY,
                              "base_url": BASE_URL
                           },
                           "writer_provider": PROVIDER,
                           "writer_model": DEEPSEEK_MODEL,
                           "writer_model_kwargs": {
                              "api_key": DEEPSEEK_API_KEY,
                              "base_url": BASE_URL
                           },
                           "summarization_model_provider": PROVIDER,
                           "summarization_model_model": DEEPSEEK_MODEL,
                           "summarization_model_kwargs": {
                              "api_key": DEEPSEEK_API_KEY,
                              "base_url": BASE_URL
                           },      
                           "max_search_depth": 2,
                           "report_structure": REPORT_STRUCTURE,
                           "conclusion_structure": CONCLUSION_STRUCTURE
                           }}

# Run the graph workflow until first interruption (waiting for user feedback)
async for event in graph.astream({"topic": topic}, thread, stream_mode="updates"):
    if '__interrupt__' in event:
        interrupt_value = event['__interrupt__'][0].value
        # display(Markdown(interrupt_value))
        print("Interrupt value:", interrupt_value)

Input string token count: 64380
Exceeds the maximum limit of 15000; string truncated to 15695 characters


CancelledError: 
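The configuration cell above repeats the same provider, model, and keyword arguments for the planner, writer, and summarization roles. A small helper can build that dictionary once and reduce copy-paste mistakes. This is a sketch under assumptions: `build_thread_config` is a hypothetical helper, the key names are copied verbatim from the cell above rather than checked against the library, and the API key shown is a placeholder.

```python
import uuid
from typing import Any, Dict, Optional


def build_thread_config(
    provider: str,
    model: str,
    model_kwargs: Dict[str, Any],
    search_api: str = "tavily",
    search_api_config: Optional[Dict[str, Any]] = None,
    **extra: Any,
) -> Dict[str, Dict[str, Any]]:
    """Build a {"configurable": {...}} dict that reuses one model for every role."""
    configurable: Dict[str, Any] = {
        "thread_id": str(uuid.uuid4()),  # a fresh thread for each run
        "search_api": search_api,
        "search_api_config": dict(search_api_config or {}),
    }
    # Key names below are copied from the configuration cell in this notebook.
    for role in ("planner", "writer"):
        configurable[f"{role}_provider"] = provider
        configurable[f"{role}_model"] = model
        configurable[f"{role}_model_kwargs"] = dict(model_kwargs)
    configurable["summarization_model_provider"] = provider
    configurable["summarization_model_model"] = model
    configurable["summarization_model_kwargs"] = dict(model_kwargs)
    # Remaining options, such as report_structure, pass through unchanged.
    configurable.update(extra)
    return {"configurable": configurable}


example_thread = build_thread_config(
    provider="deepseek",
    model="deepseek-chat",
    model_kwargs={"api_key": "PLACEHOLDER", "base_url": "https://api.deepseek.com/v1"},
    max_search_depth=2,
)
print(sorted(example_thread["configurable"]))
```

Each role receives its own copy of `model_kwargs`, so changing one role's settings later does not silently affect the others.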

# User Feedback Phase

* This allows for providing directed feedback on the initial report plan
* The user can review the proposed report structure and provide specific guidance
* The system will incorporate this feedback into the final report plan
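The cells in this notebook repeat the same loop to pull the interrupt payload out of each streamed update. That check can be factored into a helper, sketched below with the standard library only. `collect_interrupt_values`, `FakeInterrupt`, and the `plan_node` key are hypothetical stand-ins; the only assumption about the real events is the one the notebook already relies on, namely that `event['__interrupt__']` is a sequence of objects with a `.value` attribute.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List


@dataclass
class FakeInterrupt:
    """Stand-in for the interrupt objects in the stream; only `.value` is used."""
    value: Any


def collect_interrupt_values(events: Iterable[Dict[str, Any]]) -> List[Any]:
    """Return the payload of every interrupt found in a list of update events."""
    values: List[Any] = []
    for event in events:
        # Ordinary node updates have no "__interrupt__" key and are skipped.
        for item in event.get("__interrupt__", ()):
            values.append(item.value)
    return values


# Simulated stream: one ordinary update followed by one interrupt.
events = [
    {"plan_node": {"sections": []}},
    {"__interrupt__": (FakeInterrupt("Please provide feedback on the following report outline."),)},
]
print(collect_interrupt_values(events))
```

With the real graph, the events from `graph.astream(...)` could be gathered into a list inside the `async for` loop and then passed to this helper.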

In [4]:
# Submit feedback on the report plan
# The system will continue execution with the updated requirements

# Provide specific feedback to focus and refine the report structure
async for event in graph.astream(Command(resume="Chapter 2 does not cover enough regions; include every prefecture-level city in Hunan Province, with one section per city"), thread, stream_mode="updates"):
    if '__interrupt__' in event:
        interrupt_value = event['__interrupt__'][0].value
        print("Interrupt value:", interrupt_value)
        # display(Markdown(interrupt_value))

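Input string token count: 65549, maximum limit: 15000
Input string token count: 65549, exceeds the maximum limit of 15000; string truncated to 15904 characters
Interrupt value: Please provide feedback on the following report outline.

Chapter 1 Current State of Product Quality Development
	Section 1 General Overview
		Summary: Analyze the overall product quality pass rate of manufacturing in Hunan Province using 2024 data and policies.
		Web search required: Yes
	Section 2 Regional Overview
		Summary: Analyze the overall manufacturing product quality pass rate across the regions of Hunan Province, including key data and policy background.
		Web search required: Yes
	Section 3 Industry Overview
		Summary: Analyze the overall manufacturing product quality pass rate across the industries of Hunan Province, including key data and policy background.
		Web search required: Yes
Chapter 2 Regional Product Quality
	Section 1 Product Quality in Changsha
		Summary: Analyze the manufacturing product quality pass rate in Changsha in detail, including data, policies, and representative cases.
		Web search required: Yes
	Section 2 Product Quality in Zhuzhou
		Summary: Analyze the manufacturing product quality pass rate in Zhuzhou in detail, including data, policies, and representative cases.
		Web search required: Yes
	Section 3 Product Quality in Xiangtan
		Summary: Analyze the manufacturing product quality pass rate in Xiangtan in detail, including data, policies, and representative cases.
		Web search required: Yes
	Section 4 Product Quality in Hengyang
		Summary: Analyze the manufacturing product quality pass rate in Hengyang in detail, including data, policies, and representative cases.
		Web search required: Yes
	Section 5 Product Quality in Shaoyang
		Summary: Analyze the manufacturing product quality pass rate in Shaoyang in detail, including data, policies, and representative cases.
		Web search required: Yes
	Section 6 Product Quality in Yueyang
		Summary: Analyze the manufacturing product quality pass rate in Yueyang in detail, including data, policies, and representative cases.
		Web search required: Yes
	Section 7 Product Quality in Changde
		Summary: Analyze the manufacturing product quality pass rate in Changde in detail, including data, policies, and representative cases.
		Web search required: Yes
	Section 8 Product Quality in Zhangjiajie
		Summary: Analyze the manufacturing product quality pass rate in Zhangjiajie in detail, including data, policies, and representative cases.
		Web search required: Yes
	Section 9 Product Quality in Yiyang
		Summary: Analyze the manufacturing product quality pass rate in Yiyang in detail, including data, policies, and representative cases.
		Web search required: Yes
	Section 10 Product Quality in Chenzhou
		Summary: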

# Final Approval Phase
* After incorporating feedback, approve the plan to start content generation
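In this notebook, resuming with a string sends feedback on the plan, while resuming with `True` approves it. When the reply comes from free-form user input, a small converter can decide which of the two to send. The sketch below is an illustration only: `to_resume_value` and the `APPROVAL_WORDS` set are hypothetical and not part of `open_deep_research` or LangGraph.

```python
from typing import Union

# Hypothetical set of replies treated as approval; extend as needed.
APPROVAL_WORDS = {"y", "yes", "ok", "approve", "approved"}


def to_resume_value(user_input: str) -> Union[bool, str]:
    """Map raw user input to a resume value: True to approve, otherwise feedback text."""
    text = user_input.strip()
    if not text:
        raise ValueError("Empty input: type feedback or an approval word.")
    if text.lower() in APPROVAL_WORDS:
        return True
    return text


print(to_resume_value(" Yes "))
print(to_resume_value("Add one section per prefecture-level city"))
```

The result could then be passed as `Command(resume=to_resume_value(reply))`, where `reply` is whatever the user typed.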

In [6]:
# Approve the final plan and execute the report generation
# This triggers the research and writing phases for all sections

# The system will now:
# 1. Research each section topic
# 2. Generate content with citations
# 3. Create introduction and conclusion
# 4. Compile the final report

async for event in graph.astream(Command(resume=True), thread, stream_mode="updates"):
    print(event)
    print("\n")

NameError: name 'AsyncTavilyClient' is not defined

In [9]:
# Display the final generated report
# Retrieve the completed report from the graph's state and format it for display

final_state = graph.get_state(thread)
report = final_state.values.get('final_report')
Markdown(report)

# Introduction  
Large language models excel at reasoning, but without structured access to the outside world they remain isolated. The Model Context Protocol (MCP) bridges this gap, defining an open, vendor‑neutral way for models to tap files, databases, APIs, and other tools through simple JSON‑RPC exchanges. This report walks developers through the protocol’s architecture, surveys real‑world MCP servers that showcase its flexibility, and contrasts MCP with Google’s emerging Agent‑to‑Agent (A2A) standard. By the end, you should know when, why, and how to weave MCP into your own agentic systems.

## MCP Architectural Overview for Developers

MCP uses a client‑host‑server model: a host process spawns isolated clients, and every client keeps a 1‑to‑1, stateful session with a single server that exposes prompts, resources, and tools through JSON‑RPC 2.0 messages [1][5].  

A session passes through three phases — initialize, operation, shutdown. The client begins with an initialize request that lists its protocolVersion and capabilities; the server replies with a compatible version and its own capabilities. After the client’s initialized notification, both sides may exchange requests, responses, or one‑way notifications under the agreed capabilities [2].  

Two official transports exist. Stdio is ideal for local child processes, while HTTP (SSE/“streamable HTTP”) supports multi‑client, remote scenarios. Both must preserve JSON‑RPC framing, and servers should validate Origin headers, bind to localhost where possible, and apply TLS or authentication to block DNS‑rebind or similar attacks [1][3].  

To integrate MCP, developers can:  
1) implement a server that registers needed primitives and advertises them in initialize.result.capabilities;  
2) validate all inputs and set reasonable timeouts;  
3) or consume existing servers via SDKs—select a transport, send initialize, then invoke or subscribe to tools/resources exactly as negotiated [4][5].  

### Sources  
[1] MCP Protocol Specification: https://www.claudemcp.com/specification  
[2] Lifecycle – Model Context Protocol: https://modelcontextprotocol.info/specification/draft/basic/lifecycle/  
[3] Transports – Model Context Protocol: https://modelcontextprotocol.io/specification/2025-03-26/basic/transports  
[4] Core Architecture – Model Context Protocol: https://modelcontextprotocol.io/docs/concepts/architecture  
[5] Architecture – Model Context Protocol Specification: https://spec.modelcontextprotocol.io/specification/2025-03-26/architecture/

## Ecosystem Spotlight: Notable MCP Servers

Hundreds of MCP servers now exist, spanning core data access, commercial platforms, and hobby projects—proof that the protocol can wrap almost any tool or API [1][2].

Reference servers maintained by Anthropic demonstrate the basics.  Filesystem, PostgreSQL, Git, and Slack servers cover file I/O, SQL queries, repository ops, and chat workflows.  Developers can launch them in seconds with commands like  
`npx -y @modelcontextprotocol/server-filesystem` (TypeScript) or `uvx mcp-server-git` (Python) and then point any MCP‑aware client, such as Claude Desktop, at the spawned process [1].

Platform vendors are adding “first‑party” connectors.  Microsoft cites the GitHub MCP Server and a Playwright browser‑automation server as popular examples that let C# or .NET apps drive code reviews or end‑to‑end tests through a uniform interface [3].  Other partner servers—e.g., Cloudflare for edge resources or Stripe for payments—expose full product APIs while still enforcing user approval through MCP’s tool‑calling flow [2].

Community builders rapidly fill remaining gaps.  Docker and Kubernetes servers give agents controlled shell access; Snowflake, Neon, and Qdrant handle cloud databases; Todoist and Obsidian servers tackle personal productivity.  Because every server follows the same JSON‑RPC schema and ships as a small CLI, developers can fork an existing TypeScript or Python implementation and swap in their own SDK calls to create new connectors in hours, not weeks [2].  

### Sources  
[1] Example Servers – Model Context Protocol: https://modelcontextprotocol.io/examples  
[2] Model Context Protocol Servers Repository: https://github.com/madhukarkumar/anthropic-mcp-servers  
[3] Microsoft partners with Anthropic to create official C# SDK for Model Context Protocol: https://devblogs.microsoft.com/blog/microsoft-partners-with-anthropic-to-create-official-c-sdk-for-model-context-protocol

## Agent‑to‑Agent (A2A) Protocol and Comparison with MCP  

Google’s Agent‑to‑Agent (A2A) protocol, announced in April 2025, gives autonomous agents a common way to talk directly across vendors and clouds [2]. Its goal is to let one “client” agent delegate work to a “remote” agent without sharing internal code or memory, enabling true multi‑agent systems.  

Discovery starts with a JSON Agent Card served at /.well‑known/agent.json, which lists version, skills and endpoints [3]. After discovery, the client opens a Task—an atomic unit that moves through states and exchanges Messages and multimodal Artifacts. HTTP request/response, Server‑Sent Events, or push notifications are chosen based on task length to stream progress safely [2].  

Anthropic’s Model Context Protocol (MCP) tackles a different layer: it links a single language model to external tools and data through a Host‑Client‑Server triad, exposing Resources, Tools and Prompts over JSON‑RPC [1]. Communication is model‑to‑tool, not agent‑to‑agent.  

Google therefore calls A2A “complementary” to MCP: use MCP to give each agent the data and actions it needs; use A2A to let those empowered agents discover one another, coordinate plans and exchange results [1]. In practice, developers might pipe an A2A task that, mid‑flow, invokes an MCP tool or serve an MCP connector as an A2A remote agent, showing the standards can interlock instead of compete.  

### Sources  
[1] MCP vs A2A: Comprehensive Comparison of AI Agent Protocols: https://www.toolworthy.ai/blog/mcp-vs-a2a-protocol-comparison  
[2] Google A2A vs MCP: The New Protocol Standard Developers Need to Know: https://www.trickle.so/blog/google-a2a-vs-mcp  
[3] A2A vs MCP: Comparing AI Standards for Agent Interoperability: https://www.ikangai.com/a2a-vs-mcp-ai-standards/

## Conclusion

Model Context Protocol (MCP) secures a model’s immediate tool belt, while Google’s Agent‑to‑Agent (A2A) protocol enables those empowered agents to find and hire one another. Their scopes differ but interlock, giving developers a layered recipe for robust, multi‑agent applications.

| Aspect | MCP | A2A |
| --- | --- | --- |
| Layer | Model‑to‑tool RPC | Agent‑to‑agent orchestration |
| Session start | `initialize` handshake | Task creation lifecycle |
| Discovery | Client‑supplied server URI | `/.well‑known/agent.json` card |
| Streaming | Stdio or HTTP/SSE | HTTP, SSE, or push |
| Best fit | Embed filesystems, DBs, SaaS APIs into one agent | Delegate subtasks across clouds or vendors |

Next steps: prototype an A2A task that internally calls an MCP PostgreSQL server; harden both layers with TLS and capability scoping; finally, contribute a new open‑source MCP connector to accelerate community adoption.

Trace: 

> Note: uses 80k tokens 

https://smith.langchain.com/public/31eca7c9-beae-42a3-bef4-5bce9488d7be/r