Web search tool: DuckDuckGo + Perplexity for research and daily briefs (lightweight) #669

## Summary

Lightweight web search capability via HTTP requests — no browser needed. Two tiers: free (DuckDuckGo) and premium (Perplexity). This is the fast, cheap option for research, daily briefs, and information gathering.

## Why Two Options for Web Access

GAIA needs two distinct web capabilities:

| | **Web Search Tool** (this issue) | **Playwright Computer Use** (#458) |
|---|---|---|
| **What** | HTTP-based search + content extraction | Full browser automation |
| **Use for** | Research, news, daily briefs, fact-checking | Gmail, Calendar, web apps, form filling |
| **Requires** | Nothing (DuckDuckGo) or API key (Perplexity) | Chromium install (~150MB) |
| **Speed** | Fast (~1-2s per query) | Slow (~5-15s per page interaction) |
| **Auth** | No login capability | Can log into web apps |
| **Cost** | Free (DDG) or ~$0.005/query (Perplexity) | Free (local Chromium) |

## Tools

### Tier 1: DuckDuckGo (free, no API key)
- `search_web(query, num_results)` — Search the web, return titles + snippets + URLs
- `fetch_page(url, mode)` — Fetch a page and extract: readable text, raw HTML, links, or tables

### Tier 2: Perplexity (premium, requires API key)
- `search_web_premium(query)` — Higher quality search via Perplexity `sonar` model
- Provider auto-selection: if `PERPLEXITY_API_KEY` is set, searches use Perplexity; otherwise they fall back to DuckDuckGo
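
The fallback rule above could be sketched roughly as follows. This is only an illustration of the selection logic, not the code from #547: the function name `select_search_provider` and the returned provider labels are hypothetical, and only the `PERPLEXITY_API_KEY` variable comes from this issue.

```python
import os
from typing import Mapping, Optional


def select_search_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "perplexity" when an API key is configured, else "duckduckgo".

    Hypothetical helper: the name and return values are assumptions made
    for illustration; only PERPLEXITY_API_KEY is taken from the issue.
    """
    # Allow a mapping to be injected so the rule is easy to test.
    env = os.environ if env is None else env
    # Treat an empty or whitespace-only value the same as "not set".
    key = env.get("PERPLEXITY_API_KEY", "").strip()
    return "perplexity" if key else "duckduckgo"
```

Treating a blank value as unset is a design choice in this sketch; it avoids sending requests with an empty key when the variable is exported but not filled in.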

## Existing Code

- PR #495 includes `browser_tools.py` with `fetch_page` and `search_web` (DuckDuckGo) — already implemented
- #547 adds Perplexity as optional premium provider — port from gaia6
- `src/gaia/web/client.py` in PR #495 has HTTP client with rate limiting, SSRF prevention, content extraction
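
For reviewers unfamiliar with per-domain rate limiting, a minimal sketch of the idea is below. It is not the implementation in PR #495: the class name `DomainRateLimiter`, the `min_interval` parameter, and the injectable `clock`/`sleep` hooks are all assumptions chosen to keep the example testable.

```python
import time
from typing import Callable, Dict
from urllib.parse import urlsplit


class DomainRateLimiter:
    """Enforce a minimum delay between requests to the same host.

    Hypothetical sketch; not thread-safe and not the client from PR #495.
    """

    def __init__(
        self,
        min_interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        # Earliest time at which each host may be contacted again.
        self._next_allowed: Dict[str, float] = {}

    def wait(self, url: str) -> float:
        """Block until the URL's host may be contacted; return seconds waited."""
        host = urlsplit(url).hostname or ""
        now = self._clock()
        ready_at = self._next_allowed.get(host, now)
        delay = max(0.0, ready_at - now)
        if delay:
            self._sleep(delay)
        # Schedule the next slot relative to when this request actually runs.
        self._next_allowed[host] = max(now, ready_at) + self._min_interval
        return delay
```

A caller would invoke `limiter.wait(url)` immediately before each HTTP request. Requests to different hosts do not delay each other.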

## Architecture

```
src/gaia/agents/tools/web_search_tools.py  (NEW — WebSearchToolsMixin)
├── search_web() — DuckDuckGo search (free, default)
├── search_web_premium() — Perplexity search (optional)
├── fetch_page() — HTTP GET + content extraction
└── download_file() — Download with size limits + path validation

src/gaia/web/client.py  (from PR #495)
├── Rate limiting per domain
├── SSRF prevention (blocked schemes, ports, private IPs)
├── Content extraction (BeautifulSoup, boilerplate removal)
└── Table extraction (HTML → JSON)
```
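
The SSRF checks listed for `src/gaia/web/client.py` (blocked schemes, ports, private IPs) might look something like the sketch below. The function name `is_url_allowed` and the specific allow-lists are assumptions for illustration, not the rules in PR #495. The sketch only inspects IP literals; a real client would also need to resolve hostnames and check every resolved address, which is omitted here.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumed allow-lists; the actual client may use different values.
ALLOWED_SCHEMES = {"http", "https"}
ALLOWED_PORTS = {80, 443}


def is_url_allowed(url: str) -> bool:
    """Return False for URLs that look like SSRF attempts (hypothetical helper)."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError for malformed ports
    except ValueError:
        return False
    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    host = parts.hostname
    if not host:
        return False
    if port is not None and port not in ALLOWED_PORTS:
        return False
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. DNS resolution checks are out of scope here.
        return True
    # Reject loopback, private, link-local, and other non-public ranges.
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )
```

The link-local check matters in cloud environments, where `169.254.169.254` commonly serves instance metadata.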

## Use Cases Enabled

- **Daily briefs** — "What's in the news about AI today?"
- **Research** — "Find the latest benchmarks for Qwen3-8B"
- **Fact-checking** — "Is this claim accurate?"
- **Price comparison** — "What's the cheapest flight to Denver next week?"
- **Documentation lookup** — "How do I configure Home Assistant automations?"

## Dependencies

- PR #495 (web client + DuckDuckGo search) — needs merge
- #547 (Perplexity provider) — port from gaia6

## Acceptance Criteria

- [ ] `search_web()` returns structured results from DuckDuckGo
- [ ] `fetch_page()` extracts readable content from any publicly reachable HTTP(S) URL
- [ ] Perplexity auto-selected when API key is present
- [ ] SSRF prevention blocks private IPs and dangerous protocols
- [ ] Rate limiting prevents abuse
- [ ] No browser or Chromium install required
