🔍 Multi-Search Aggregator

30+ search engines in parallel — one query, comprehensive results from across the web.

📕 Table of Contents

💡 What is this?
✨ Key Features
📡 Supported Engines
🚀 Quick Start
⚙️ Configuration
🌐 Web UI & API
🧩 Extending
📁 Project Structure
🤝 Contributing
📄 License

💡 What is this?

A parallel search aggregator that fires queries to 30+ search engines simultaneously, deduplicates results, ranks by relevance, and returns the best matches — designed specifically for AI agents and research workflows.

Instead of calling one search API and hoping for the best, this tool combines results from multiple sources with intelligent scoring:

Concurrent execution — All engines run in parallel with configurable timeout
Smart ranking — Keyword relevance + engine authority + recency + semantic vector similarity
Real-time progress — SSE streaming shows each engine completing as it happens
Pluggable engines — Add new search backends without touching core code

✨ Key Features

⚡ Full parallelism — 40-thread pool, all engines fire simultaneously
🧠 Intelligent ranking — Multi-signal scoring: keyword match + engine weight + freshness + vector similarity
🔄 Auto-dedup — URL normalization removes duplicates across engines
📡 SSE streaming — Real-time progress in browser, each engine reports as it finishes
🖥️ Web UI — Google-style search page with engine filter sidebar
🚀 REST API — Simple JSON endpoint for programmatic access
🔌 31 engines — From Google to arXiv to Hacker News to Zhihu
⚙️ Hot config — Enable/disable engines, adjust weights, all without restart
🛡️ Graceful degradation — No API key? Engine skipped. Timeout? Partial results returned.

📡 Supported Engines

Category	Engines
General Search	Google, Bing (CN + Global), 360 Search, Sogou, Startpage, Brave, Qwant
Academic	arXiv, OpenAlex, OpenReview, DBLP, AMiner (Papers + Patents)
Developer	Stack Overflow, npm, crates.io, MDN
Social	Reddit, Hacker News, Zhihu, Xiaohongshu
AI-Powered	Zhipu AI (Pro/Sogou/Quark), Z.AI, Bailian
Chinese	WeChat Search, Zhihu, Baidu, WeRead
Browser	Any site via OpenCLI browser automation
Custom	Add your own via MCP, skill scripts, or web_fetch

🚀 Quick Start

1. Install

git clone https://github.com/yourname/multi-search-aggregator.git
cd multi-search-aggregator
pip install -r requirements.txt

2. Configure API Keys

cp .env.example .env
# Edit .env — add keys for engines you want to use:
# ZHIPU_API_KEY=your_key_here
# ZAI_API_KEY=your_key_here
# BAILIAN_API_KEY=your_key_here
# (No key = engine skipped, no errors)

3. Run

# CLI search
python run_search.py "RISC-V vector extension"
python run_search.py "LDPC decoder" --engines zhipu_pro,arxiv --top 10 --json

# Web server (http://localhost:8200)
python server.py

Open http://localhost:8200 → search like Google, results from 30+ engines.

⚙️ Configuration

# config.yaml

# Server
server:
  host: "0.0.0.0"
  port: 8200

# Search defaults
search:
  default_top_k: 50
  max_top_k: 200
  default_timeout: 35

# Engine weights (higher = more authoritative)
weights:
  Google: 10
  Z.AI: 9
  ZhipuPro: 9
  Stack Overflow: 8
  arXiv: 8

# Disable specific engines
engine_overrides:
  some_engine_id:
    enabled: false

🌐 Web UI & API

Search

# GET request
curl "http://localhost:8200/api/search?q=RISC-V&top_k=50"

# POST request (more options)
curl -X POST http://localhost:8200/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "transformer attention", "top_k": 20, "engines": ["zhipu_pro", "arxiv"]}'

# Response:
# {
#   "results": [
#     {"title": "...", "url": "...", "summary": "...", "source": "arXiv", "score": 0.95},
#     ...
#   ],
#   "meta": {"engines_used": 15, "total_results": 234, "elapsed_ms": 3200}
# }

SSE Streaming (real-time progress)

const es = new EventSource('/api/search/stream?q=RISC-V');
es.addEventListener('engine_done', e => {
    console.log(`✅ ${JSON.parse(e.data).engine}: ${JSON.parse(e.data).results} results`);
});
es.addEventListener('complete', e => {
    const data = JSON.parse(e.data);
    console.log(`Done: ${data.total_results} results in ${data.elapsed_ms}ms`);
});

Config & Engine Management

# List engines and their status
curl http://localhost:8200/api/engines

# Toggle engine on/off
curl -X POST http://localhost:8200/api/engines/toggle \
  -d '{"engine_id": "google", "enabled": false}'

# Get/update config
curl http://localhost:8200/api/config
curl -X PUT http://localhost:8200/api/config -d '{"timeout": 20}'

🧩 Extending

Adding a new engine is straightforward:

Create search_agg/engines/my_engine.py
Implement call_my_engine(query: str) -> list[dict]
Register in search_agg/engines/__init__.py

# search_agg/engines/my_engine.py
def call_my_engine(query: str) -> list[dict]:
    """Must return list of dicts with: title, url, summary"""
    import requests
    resp = requests.get(f"https://api.example.com/search?q={query}")
    return resp.json().get("results", [])

# In search_agg/engines/__init__.py, add to ENGINES list:
{
    "id": "my_engine",
    "name": "My Engine",
    "type": "web_fetch",
    "url_tpl": "https://api.example.com/search?q={q}",
    "weight": 7,
}

📁 Project Structure

multi-search-aggregator/
├── server.py                    # FastAPI web server
├── run_search.py                # CLI entry point
├── config.yaml                  # Configuration
│
├── search_agg/
│   ├── __init__.py              # Public API: search(), aggregate()
│   ├── config.py                # Config loader + env vars
│   ├── models.py                # SearchResult data model
│   ├── runner.py                # Parallel engine dispatcher
│   ├── ranker.py                # Dedup + scoring + vector ranking
│   ├── utils.py                 # URL cleanup, process management
│   └── engines/
│       ├── __init__.py          # Engine registry (31 engines)
│       ├── zhipu.py             # Zhipu AI (Pro/Sogou/Quark)
│       ├── zai.py               # Z.AI REST API
│       ├── bailian.py           # Alibaba Bailian MCP
│       ├── opencli.py           # Browser automation engines
│       ├── web_fetch.py         # HTTP fetch + HTML parsers
│       ├── mcp_stdio.py         # MCP stdio protocol
│       └── skill.py             # External script engines
│
├── static/                      # Web UI
│   ├── index.html               # Search page (SSE progress)
│   ├── config.html              # Engine config dashboard
│   └── style.css
│
├── requirements.txt
├── .env.example
├── .gitignore
├── LICENSE
└── README.md

🤝 Contributing

New engines, bug fixes, and improvements are welcome!

Fork the repo
Create a feature branch
Add your engine or fix
Open a PR

📄 License

MIT License — use it however you want.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🔍 Multi-Search Aggregator

💡 What is this?

✨ Key Features

📡 Supported Engines

🚀 Quick Start

1. Install

2. Configure API Keys

3. Run

⚙️ Configuration

🌐 Web UI & API

Search

SSE Streaming (real-time progress)

Config & Engine Management

🧩 Extending

📁 Project Structure

🤝 Contributing

📄 License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
search_agg		search_agg
static		static
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
README_zh.md		README_zh.md
config.yaml		config.yaml
requirements.txt		requirements.txt
run_search.py		run_search.py
server.py		server.py

Folders and files

Latest commit

History

Repository files navigation

🔍 Multi-Search Aggregator

💡 What is this?

✨ Key Features

📡 Supported Engines

🚀 Quick Start

1. Install

2. Configure API Keys

3. Run

⚙️ Configuration

🌐 Web UI & API

Search

SSE Streaming (real-time progress)

Config & Engine Management

🧩 Extending

📁 Project Structure

🤝 Contributing

📄 License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages