Decompose complex tasks. Dispatch parallel workers. Evolve better strategies from every run.
Web2BigTable is a bi-level multi-agent framework for web-to-table search — given a natural-language query and a target schema, it autonomously searches the open web and returns a structured table whose rows are entities, whose columns are the requested attributes, and whose cells are independently verified against web sources. It handles both wide search (broad-coverage tasks that assemble many consistent rows across heterogeneous sources) and deep search (single complex queries resolved by chaining indirect clues across many hops).
An upper-level orchestrator decomposes the task and dispatches sub-problems to lower-level worker agents that solve them in parallel and coordinate through a shared workspace to reduce redundant exploration and reconcile conflicting evidence. The system is self-evolving through a closed-loop run-verify-reflect process that jointly refines how tasks are decomposed and how sub-tasks are solved — adaptation is mediated through persistent, human-readable external memory, leaving the underlying LLMs frozen throughout.
Benchmark Results · Install · Quick Start · Key Features · Why It Matters · Ecosystem · Citation
We evaluate Web2BigTable on two challenging benchmarks:
- WideSearch — a benchmark for complex, multi-step information retrieval tasks requiring parallel search, data extraction, and structured output across diverse domains.
- XBench-DeepSearch — a benchmark for evaluating deep research capabilities on real-world questions requiring multi-hop reasoning and comprehensive web search.
**WideSearch.** Performance landscape on WideSearch (Avg@4). Position encodes Row F1 (x) and Item F1 (y); label encodes Success Rate. Dashed lines show frontier single-agent Item F1. Web2BigTable dominates all three metrics simultaneously.
**XBench-DeepSearch.** Accuracy on XBench-DeepSearch. Web2BigTable (73.0%) surpasses all open-source agentic models and rivals frontier proprietary systems.
**System Architecture.** Three-stage architecture of Web2BigTable. Stage 1 (Orchestrate): an Orchestrator LLM reads decomposition strategies from Strategy Memory So through a Skill Router and partitions the user query into N subtasks (each with instruction and output schema). Stage 2 (Execute): parallel worker agents resolve execution skills from a Shared Skill Bank Sw (dynamic retrieval + self-repair) and coordinate asynchronously through a Shared Workboard me — file-locked, tag-partitioned — to avoid duplicated work and fill coverage gaps; partial outputs are aggregated into the structured BigTable. Stage 3 (Evolve, training only — red arrows): a Run-Verify-Reflect loop contrasts system output against a gold reference, clusters error patterns, and refines/modularises both decomposition skills (written back to So) and execution skills (written back to Sw). At inference time (black arrows), Stages 1–2 run with frozen So and Sw and no reflection.
**Training (Self-Evolving) Flow.** Training flow of Web2BigTable over one episode k. For each training query q_k, Stage 1 reads the long-term orchestrator skills So and decomposes q_k into subtasks τ. Stage 2 dispatches the subtasks to N parallel workers, which read execution skills from Sw and read/write the short-term workboard me until convergence. Stage 3 verifies the aggregated output X_k against the gold reference, produces the structured reflection r_o^(k+1), and consolidates it into both So (via Mo) and Sw (via Mw). Episodes are processed sequentially: the bottom black loop moves from episode k to k+1 without replanning within an episode. After K episodes, the two skill banks (So*, Sw*) are frozen and returned as the training output, then used unchanged during inference.
**Inference Flow.** Inference flow of Web2BigTable on an unseen user query q. Using the trained skill banks So* and Sw* as frozen read-only inputs, Stage 1 decomposes q into subtasks τ. Stage 2 runs N parallel workers that resolve execution skills from Sw* and coordinate through the shared workboard me (per-query, short-term); their partial outputs {xi} are aggregated into the structured big table X. No verification, reflection, or memory update is performed: the system runs a single forward pass and returns X.
Core question. Web2BigTable is not about building yet another chatbot wrapper. It is about how to decompose hard tasks into parallel subtasks, coordinate workers effectively, and evolve better decomposition strategies from every run.
- **Decompose intelligently.** Evolved orchestrator skills route tasks to the best decomposition strategy — split by entity, time, category, rank, or dependency.
- **Execute in parallel.** Up to 10 Memento-S workers run concurrently, coordinating through a shared workboard to avoid redundant work and merge partial results.
- **Evolve from experience.** Decomposition strategies are evolved from past task executions — the system clusters task patterns and generates specialised orchestrator skills automatically.
| Feature | Why it matters |
|---|---|
| Multi-agent orchestration via MCP | An orchestrator agent decomposes tasks and dispatches subtasks to parallel workers through a FastMCP server, enabling true concurrent execution rather than sequential tool calls. |
| Learned decomposition strategies | Decomposition strategies (task-router + 11 decompose-* patterns) are learned from task experience, so the system continuously improves how it breaks down different types of tasks. |
| Shared workboard coordination | Workers read and edit a shared markdown workboard for inter-agent communication — claim sections, post partial results, and avoid duplicate work without central locking. |
| Semantic skill routing | BM25 + sentence-transformer embeddings + LLM selection ensure each worker picks the best skill for its subtask, even as the skill library grows. |
| Ops-based execution engine | Workers use a JSON ops architecture (not function calling) with filesystem, terminal, web, workboard, and meta operations, enabling fine-grained multi-round execution within each skill. |
| Textual TUI | A rich terminal interface for submitting tasks, inspecting per-worker execution steps, viewing live workboard state, and reading the final synthesised output. |
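The shared-workboard coordination described above can be sketched as a tag-partitioned section store guarded by a lock. This is a minimal illustrative model, not the repository's implementation — the real system persists the board as a file-locked markdown document, and all names below (`Workboard`, `claim`, `post`, `render`) are assumptions:

```python
import threading


class Workboard:
    """Minimal sketch of a tag-partitioned shared workboard.

    Workers atomically claim a tagged section, post partial findings into
    it, and read other sections to avoid duplicate work. A single lock
    stands in for the file lock used by the real system.
    """

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._sections: dict[str, dict] = {}  # tag -> {"owner": ..., "body": ...}

    def claim(self, tag: str, worker: str) -> bool:
        """Atomically claim a section; returns False if already claimed."""
        with self._lock:
            if tag in self._sections:
                return False
            self._sections[tag] = {"owner": worker, "body": ""}
            return True

    def post(self, tag: str, worker: str, text: str) -> None:
        """Append a partial finding to a section the worker owns."""
        with self._lock:
            section = self._sections[tag]
            assert section["owner"] == worker, "section owned by another worker"
            section["body"] += text + "\n"

    def render(self) -> str:
        """Render the board as markdown, one tagged section per claim."""
        with self._lock:
            return "\n".join(
                f"## [{tag}] (owner: {s['owner']})\n{s['body']}"
                for tag, s in self._sections.items()
            )


board = Workboard()
board.claim("brand:acme", "worker-1")
board.claim("brand:acme", "worker-2")  # second claim fails: no duplicate work
board.post("brand:acme", "worker-1", "revenue 2023: found in annual report")
print(board.render())
```

The claim-before-work pattern is what removes the need for central scheduling: whichever worker claims a tag first owns it, and everyone else moves on.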
Web2BigTable is organised as a three-stage Orchestrate → Execute → Evolve pipeline. The first two stages run on every query; the third runs only during training and writes its lessons back into persistent external memory.
| Stage | What it means |
|---|---|
| Orchestrate | An Orchestrator LLM reads decomposition strategies from Strategy Memory So via a Skill Router, picks the best-matching pattern for the incoming query, and partitions it into N self-contained subtasks (instruction + output schema). |
| Execute | Parallel worker agents resolve execution skills from a Shared Skill Bank Sw with dynamic retrieval and self-repair, and coordinate asynchronously through a file-locked, tag-partitioned Shared Workboard me — claiming sections, posting partial findings, and reconciling conflicts as the global state evolves. Outputs are aggregated into a structured BigTable. |
| Evolve (training only) | A Run-Verify-Reflect loop contrasts system output against a gold reference, clusters error patterns, and refines/modularises both decomposition skills (written back to So) and execution skills (written back to Sw). At inference time the two memories are frozen and read-only. |
The key difference from prior multi-agent systems is bi-level co-evolution: most frameworks adapt either how to plan (decomposition) or how to act (execution skills), but not both. Web2BigTable jointly refines them through the same closed-loop reflection, mediated entirely by persistent, human-readable external memory — the underlying LLMs are never fine-tuned.
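The bi-level run-verify-reflect loop can be sketched as follows. All function names and bodies are illustrative stand-ins for the Stage 1–3 components described above, not the actual codebase API:

```python
# Toy sketch of the bi-level run-verify-reflect training loop (Stage 3).
# Every function body is an illustrative stand-in, not the real system.

def train(episodes, So, Sw):
    """Each episode: decompose -> execute -> verify -> reflect -> consolidate."""
    for query, gold in episodes:
        subtasks = decompose(query, So)   # Stage 1 reads strategy memory So
        output = execute(subtasks, Sw)    # Stage 2: parallel workers read Sw
        errors = verify(output, gold)     # contrast with the gold reference
        if errors:
            lesson = reflect(errors)      # cluster error patterns into a lesson
            So.append(lesson["plan"])     # refine decomposition skills
            Sw.append(lesson["act"])      # refine execution skills
    return So, Sw                         # frozen and reused at inference


# Minimal stand-ins so the loop runs end to end:
def decompose(q, So):
    return [f"{q}:{i}" for i in range(2)]

def execute(tasks, Sw):
    return {t: t.upper() for t in tasks}

def verify(output, gold):
    return [] if output == gold else ["mismatch"]

def reflect(errors):
    return {"plan": "split finer", "act": "cross-check sources"}


# One matching episode (no lesson) and one failing episode (both banks updated):
So, Sw = train([("q1", {"q1:0": "Q1:0", "q1:1": "Q1:1"}), ("q2", {})], [], [])
```

The point the sketch makes concrete: a single reflection step writes back to *both* memories, which is what "bi-level co-evolution" means here.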
```bash
curl -sSL https://raw.githubusercontent.com/Web2BigTable/Web2BigTable/main/install.sh | bash
```

One command to install, one command to launch. The installer sets up dependencies, downloads router assets, configures API keys, and creates the `web2bigtable` command.

The installer will:

- Install `uv` (if not present)
- Clone the repository
- Install all dependencies (Memento-S + orchestrator)
- Download router assets (skill catalog + optional embeddings)
- Configure `.env` interactively (API keys)
- Create the `web2bigtable` command
```bash
git clone https://github.com/Web2BigTable/Web2BigTable.git
cd Web2BigTable

# Install Memento-S worker dependencies
cd Memento-S && uv sync --python 3.12 && cd ..

# Install orchestrator dependencies
uv sync --python 3.12
```

Create a `.env` file in the project root:

```bash
OPENROUTER_API_KEY=sk-or-...
OPENROUTER_MODEL=anthropic/claude-sonnet-4-5
OPENROUTER_BASE_URL=https://openrouter.ai/api/v1
SERPER_API_KEY=...
```

Then launch:

```bash
web2bigtable
```

Configuration
All configuration is centralised in environment variables. Key settings:
| Variable | Default | Description |
|---|---|---|
| `OPENROUTER_API_KEY` | — | API key for LLM calls (required) |
| `OPENROUTER_MODEL` | `anthropic/claude-sonnet-4-5` | Model for Memento-S workers |
| `OPENROUTER_BASE_URL` | `https://openrouter.ai/api/v1` | LLM API base URL |
| `SERPER_API_KEY` | — | API key for web search skill (serper.dev) |
| `MAX_WORKERS` | `10` | Max parallel workers per task |
| `SEMANTIC_ROUTER_ENABLED` | `true` | Enable semantic skill pre-filtering |
| `SEMANTIC_ROUTER_TOP_K` | `4` | Number of candidate skills for LLM routing |
| `SKILL_DYNAMIC_FETCH_ENABLED` | `true` | Auto-fetch missing skills from catalog |
| `DEBUG` | `false` | Enable debug logging |
| `WORKSPACE_DIR` | `Memento-S/workspace` | Workboard location shown in TUI |
| Skill | Description |
|---|---|
| `filesystem` | Read, write, edit, search, and manage files and directories |
| `terminal` | Execute shell commands with safety checks |
| `web-search` | Google search via Serper + URL fetching |
| `uv-pip-install` | Python package management via uv/pip |
| `skill-creator` | Dynamically create new skills at runtime |
Workers automatically select the best skill for each subtask via semantic routing (BM25 + embeddings + LLM). If no existing skill matches, the system can dynamically fetch or create new skills on demand.
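A simplified sketch of that routing step: the real router combines BM25 over jieba tokens, bge-m3 embeddings, and an LLM pick, but here both retrieval signals are approximated by a single bag-of-words cosine to keep the sketch dependency-free. The skill descriptions are paraphrased from the table above; `route` is a hypothetical name:

```python
# Dependency-free stand-in for hybrid skill routing (BM25 + embeddings + LLM).
from collections import Counter
import math

SKILLS = {
    "web-search": "web search google via serper and url fetching",
    "filesystem": "read write edit search and manage files and directories",
    "terminal": "execute shell commands with safety checks",
}

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def route(subtask: str, top_k: int = 2) -> list[str]:
    """Score every skill against the subtask and return the top-k shortlist;
    the real system then asks an LLM to pick one skill from this shortlist."""
    query = Counter(subtask.lower().split())
    ranked = sorted(
        SKILLS,
        key=lambda s: cosine(query, Counter(SKILLS[s].split())),
        reverse=True,
    )
    return ranked[:top_k]

print(route("search the web for company revenue"))  # 'web-search' ranks first
```

The two-phase shape — cheap retrieval to a small candidate set, then an LLM choice — is what keeps routing tractable as the skill library grows.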
```bash
web2bigtable
```

- Submit tasks directly from the interface (`Ctrl+Enter` or Run Task)
- Session-scoped worker list with per-worker status (`live`/`finished`)
- Click any worker row to inspect execution steps and events
- Live workboard view showing real-time inter-worker coordination
- Final orchestrator output panel
| Shortcut | Action |
|---|---|
| `Ctrl+Enter` | Run task |
| `r` | Refresh worker list |
| `c` | Copy final output to clipboard |
| `q` | Quit |
Project structure
```text
Web2BigTable/
├── tui_app.py                          # Textual TUI — primary interface
├── main.py                             # Standalone entry point (non-TUI)
├── install.sh                          # One-click installer
├── pyproject.toml                      # Root project (orchestrator deps + entry point)
├── orchestrator/
│   ├── orchestrator_agent.py           # LangChain orchestrator agent
│   └── mcp_server.py                   # FastMCP server (execute_subtasks + workboard)
├── orchestrator_skills/                # Auto-generated decomposition strategies
│   ├── task-router/                    # Routes queries to decompose strategies
│   ├── workboard/                      # Shared workboard coordination
│   ├── decompose-split-by-entity/      # Split by entity/brand
│   ├── decompose-split-by-time-period/ # Split by chronological range
│   ├── decompose-split-by-category/    # Split by categorical dimension
│   ├── decompose-split-by-rank-segment/           # Split by rank ranges
│   ├── decompose-annual-rank-stats/               # Annual ranking statistics
│   ├── decompose-comparative-data-extraction/     # Comparative data extraction
│   ├── decompose-constrained-set-search/          # Constrained set search
│   ├── decompose-entity-benchmarking/             # Entity benchmarking
│   ├── decompose-geographic-registries/           # Geographic registry lookup
│   ├── decompose-linear-multi-hop-dependency/     # Linear multi-hop dependency
│   ├── decompose-multimedia-source-verification/  # Multimedia source verification
│   └── decompose-temporal-event-logs/             # Temporal event log extraction
├── Memento-S/                          # Worker agent (submodule)
│   ├── core/
│   │   ├── agent/memento_s_agent.py    # Worker agent class
│   │   ├── config.py                   # Configuration & constants
│   │   ├── router.py                   # Skill routing (BM25 + embeddings + LLM)
│   │   ├── llm.py                      # LLM wrapper (OpenRouter)
│   │   ├── skill_engine/               # Planning, execution, bridge ops
│   │   └── tools/                      # Tool implementations
│   └── skills/                         # Built-in skill definitions
├── figures/                            # README figures
├── docs/                               # Documentation
└── logs/                               # Worker trajectory logs (*.jsonl)
```
Tech stack
| Layer | Technology |
|---|---|
| Interface | Textual (TUI) |
| Orchestration | LangChain + MCP (Model Context Protocol) |
| Worker framework | Memento-S (ops-based skill execution) |
| LLM access | OpenRouter (multi-provider) |
| Skill routing | BM25 (jieba) + sentence-transformers (BAAI/bge-m3) + LLM selection |
| MCP transport | FastMCP (stdio) |
| Coordination | Shared workboard (thread-safe markdown read/write/edit) |
| Execution | uv sandbox + subprocess isolation |
| Async runtime | asyncio |
| Build and packaging | uv + hatchling |
| Problem | Solution |
|---|---|
| Skills not found | Check that Memento-S/skills/ exists and skill catalog is downloaded. |
| API timeout | Increase the model timeout or switch to a faster model in .env. |
| Import errors | Make sure both virtual environments are active: Memento-S and root. |
| Web search fails | Check whether SERPER_API_KEY is configured in .env. |
| Workers stuck | Check logs/worker-*.jsonl for error details. Increase MAX_WORKERS if tasks queue. |
| Workboard conflicts | Workers use tagged sections — check .workboard.md for malformed edits. |
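For the "workers stuck" case, a small helper can surface the last event from each trajectory log. This is an illustrative snippet that assumes one JSON record per line; `last_events` is a hypothetical name and the actual log schema is not specified here:

```python
# Illustrative helper: read the final JSON record from each worker log.
import glob
import json

def last_events(pattern: str = "logs/worker-*.jsonl") -> dict[str, dict]:
    """Return {log path: last JSON record} for every matching trajectory log.

    Skips blank lines; files with no records are omitted from the result.
    """
    out = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            records = [json.loads(line) for line in f if line.strip()]
        if records:
            out[path] = records[-1]
    return out

# Example: print the last event of every worker to spot where each one stopped.
for path, event in last_events().items():
    print(path, "->", event)
```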
Web2BigTable is part of the broader Memento project family.
| Resource | Link | Description |
|---|---|---|
| Memento Homepage | memento.run | The hub for all Memento series projects and research |
| Memento-Skills | GitHub | Single-agent self-evolving skill framework |
| Web2BigTable | GitHub | Multi-agent orchestration with self-improving decomposition (this repo) |
| Discord Community | Join Discord | Discussion, Q&A, feature requests, and collaboration |
If you find Web2BigTable useful in your research, please cite:
```bibtex
@misc{huang2026web2bigtablebilevelmultiagentllm,
      title={Web2BigTable: A Bi-Level Multi-Agent LLM System for Internet-Scale Information Search and Extraction},
      author={Yuxuan Huang and Yihang Chen and Zhiyuan He and Yuxiang Chen and Ka Yiu Lee and Huichi Zhou and Weilin Luo and Meng Fang and Jun Wang},
      year={2026},
      eprint={2604.27221},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2604.27221},
}
```
Web2BigTable is a multi-agent collaboration system. Its core idea is to decompose complex tasks into subtasks that can run in parallel, have multiple Memento-S worker agents process them concurrently, and coordinate the workers through a shared workboard.

The system is built around an online route → decompose → execute → synthesise pipeline. The Orchestrator agent identifies the task type via the task-router, matches the best decompose-* strategy, and splits the task into independent subtasks; worker agents select the best skill for each subtask via semantic routing and execute in parallel, coordinating through the shared workboard; finally, the Orchestrator aggregates the results into the final response.

On the WideSearch benchmark, Web2BigTable surpasses frontier baselines such as o3-high, Gemini 2.5 Pro, and Claude Sonnet 4 across all three metrics: Row F1 (63.5), Item F1 (80.1), and Success Rate (38.5). On XBench-DeepSearch it reaches 73.0% accuracy, exceeding all open-source agentic models and approaching frontier proprietary systems.
MIT





