Skip to content

feat: add Baidu AI Search backend for web_search#2371

Merged
Hmbown merged 5 commits into
Hmbown:mainfrom
jimmyzhuu:feat/baidu-search-provider
May 31, 2026
Merged

feat: add Baidu AI Search backend for web_search#2371
Hmbown merged 5 commits into
Hmbown:mainfrom
jimmyzhuu:feat/baidu-search-provider

Conversation

@jimmyzhuu
Copy link
Copy Markdown
Contributor

@jimmyzhuu jimmyzhuu commented May 30, 2026

Summary

This PR adds Baidu AI Search as an explicit web_search backend so users in mainland China can choose a first-party, China-accessible search API instead of relying only on HTML scraping or providers that may be unreliable from Chinese networks.

What changed:

  • Adds SearchProvider::Baidu with config/env selection through:
    • [search] provider = "baidu"
    • DEEPSEEK_SEARCH_PROVIDER=baidu
    • aliases such as baidu-search and baidu_ai_search
  • Uses Baidu AI Search at https://qianfan.baidubce.com/v2/ai_search/web_search.
  • Sends the official Baidu web-search payload shape, including search_source: "baidu_search_v2" and resource_type_filter[].top_k.
  • Reads the API key from the normal search config key path, with Baidu-specific fallback support:
    • [search] api_key
    • DEEPSEEK_SEARCH_API_KEY
    • BAIDU_SEARCH_API_KEY
  • Normalizes Baidu references[] items into the existing ranked WebSearchResponse shape.
  • Adds the required network-policy host for qianfan.baidubce.com.
  • Keeps missing-key behavior explicit: selecting provider = "baidu" without a key returns a clear error instead of silently falling back to another provider and sending the query elsewhere.
  • Redacts bearer tokens from HTTP error previews before surfacing provider failures.
  • Documents Baidu search setup in the sample config and tool/config docs.

Related context:

Testing

  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets --all-features
  • cargo test --workspace --all-features
  • cargo test -p codewhale-tui baidu
    • 7 passed, including the request-payload test that locks search_source to baidu_search_v2.
  • cargo test -p codewhale-tui sanitize_error_body_redacts_bearer_tokens
  • cargo test -p codewhale-tui tools::web_search::tests
    • 33 passed.
  • cargo check --workspace --all-features
  • Live smoke against Baidu AI Search with BAIDU_SEARCH_API_KEY
    • Endpoint: /v2/ai_search/web_search
    • Payload included search_source: "baidu_search_v2"
    • Query: Rust cargo workspace
    • Result: HTTP 200, 3 references[] returned.
  • Secret scan for accidentally committed Baidu credentials:
    • rg -n 'bce-v3|ALTAK|6edb0bd1c|BAIDU_SEARCH_API_KEY=.*[A-Za-z0-9]' . -S
    • No matches.

Known baseline issue:

  • cargo clippy --workspace --all-targets --all-features -- -D warnings currently fails on pre-existing warnings outside this PR:
    • crates/tui/src/commands/config.rs:476 useless format!
    • crates/tui/src/runtime_log.rs:177 redundant closure

Checklist

  • Updated docs or comments as needed
  • Added or updated tests where relevant
  • Verified TUI behavior manually if UI changes
    • No UI behavior changed; verified the backend with a live Baidu API smoke test instead.

Greptile Summary

This PR adds SearchProvider::Baidu backed by the Baidu AI Search API at qianfan.baidubce.com, giving users in mainland China a first-party, API-backed search option. It also wires in DEEPSEEK_SEARCH_API_KEY via apply_env_overrides (newly functional for all providers), fixes a pre-existing gap where SearchProvider::parse("metaso") returned None silently, and hardens sanitize_error_body with bearer-token redaction that now applies to all providers.

  • New Baidu backend (run_baidu_search): resolves the API key through [search] api_keyDEEPSEEK_SEARCH_API_KEYBAIDU_SEARCH_API_KEY, enforces network-policy checks against qianfan.baidubce.com, and surfaces clear per-status-code error messages (401/403/429) instead of silent fallbacks.
  • DEEPSEEK_SEARCH_API_KEY env override added to apply_env_overrides — previously documented but not implemented — now correctly takes priority over the config-file key for all API-backed search providers.
  • Bearer-token redaction added to sanitize_error_body using a OnceLock-compiled regex; this is a global improvement applied to Tavily, Bocha, and Metaso error paths in addition to Baidu.

Confidence Score: 5/5

This PR is safe to merge. The new Baidu backend is well-isolated, fails explicitly on missing keys, and the global bearer-token redaction tightens existing error paths.

The Baidu search path correctly enforces network-policy checks, implements an explicit key-resolution chain, and returns clear errors rather than silent fallbacks. The bearer-token regex is compile-once and applied defensively. The only gap is a 401/403 error message that omits DEEPSEEK_SEARCH_API_KEY as a remediation step — easily fixed and does not affect correctness or security.

crates/tui/src/tools/web_search.rs — specifically the 401/403 error message for the Baidu provider.

Important Files Changed

Filename Overview
crates/tui/src/tools/web_search.rs Adds run_baidu_search, payload builder, result parser, error helper, and bearer-token redaction in sanitize_error_body (applied globally to all providers). The 401/403 error message omits DEEPSEEK_SEARCH_API_KEY as a remediation path.
crates/tui/src/config.rs Adds SearchProvider::Baidu with serde aliases, wires DEEPSEEK_SEARCH_API_KEY into apply_env_overrides, and fixes a pre-existing gap where SearchProvider::parse("metaso") returned None. Well-tested with 4 new unit tests.
crates/tui/src/core/engine.rs Doc comment update only — mentions Baidu and BAIDU_SEARCH_API_KEY fallback alongside existing providers.
crates/tui/src/tools/spec.rs Doc comment update only — search_api_key field description updated to include Baidu.
config.example.toml Documents baidu as a new provider option and adds env-var override entries for METASO_API_KEY and BAIDU_SEARCH_API_KEY.
docs/CONFIGURATION.md Adds Baidu AI Search setup section and updates provider list in the web-search config block.
docs/TOOL_SURFACE.md One-line update to web_search entry to include Metaso and Baidu as selectable backends.

Sequence Diagram

sequenceDiagram
    participant U as User / ToolContext
    participant W as WebSearchTool
    participant E as apply_env_overrides
    participant B as Baidu AI Search API

    Note over E: DEEPSEEK_SEARCH_API_KEY → config.search.api_key
    Note over E: DEEPSEEK_SEARCH_PROVIDER=baidu → SearchProvider::Baidu

    U->>W: "execute({query: "..."})"
    W->>W: check_policy("qianfan.baidubce.com")
    W->>W: "resolve api_key<br/>1. context.search_api_key<br/>2. BAIDU_SEARCH_API_KEY env"
    alt No key found
        W-->>U: ToolError (missing API key)
    end
    W->>B: "POST /v2/ai_search/web_search<br/>Authorization: Bearer {key}"
    B-->>W: "HTTP 200 {references: [...]}"
    W->>W: baidu_error_message() → None (success)
    W->>W: "parse_baidu_results() → Vec<WebSearchEntry>"
    W-->>U: "ToolResult {query, source:"baidu", results}"
Loading

Comments Outside Diff (1)

  1. crates/tui/src/tools/web_search.rs, line 9-10 (link)

    P2 The module-level docstring comment on line 9 still lists only tavily/bocha/metaso and omits baidu. All other surfaces (tool description, config docs, TOOL_SURFACE.md) were updated — this one was missed.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

    Fix in Codex Fix in Claude Code Fix in Cursor

Fix All in Codex Fix All in Claude Code Fix All in Cursor

Reviews (3): Last reviewed commit: "Merge main into Baidu search provider" | Re-trigger Greptile

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds Baidu AI Search as a new web search provider option. It implements the API client, result parsing, error handling, token redaction, configuration parsing, and comprehensive unit tests, alongside updating the relevant documentation. A review comment points out a discrepancy where DEEPSEEK_SEARCH_API_KEY is documented as an environment variable override but is not actually implemented in the configuration loading logic.

Comment thread crates/tui/src/config.rs
Comment thread crates/tui/src/tools/web_search.rs
@Hmbown
Copy link
Copy Markdown
Owner

Hmbown commented May 31, 2026

great idea - will work on getting this in soon

@jimmyzhuu
Copy link
Copy Markdown
Contributor Author

great idea - will work on getting this in soon

Thanks Hunter! Really excited to see this move forward. Let me know if you need any changes. Happy to help with testing or updating the PR as needed :)

@Hmbown
Copy link
Copy Markdown
Owner

Hmbown commented May 31, 2026

Thanks @jimmyzhuu — an opt-in Baidu AI Search backend is directionally useful, especially for users working from China-network environments where the default web path can be brittle.

I did not harvest this in today’s batch because it needs a current rebase and a very narrow provider boundary: no default search behavior change, no extra service unless explicitly configured, and mockable web_search tests. This is still worth a focused follow-up.

@Hmbown
Copy link
Copy Markdown
Owner

Hmbown commented May 31, 2026

I rebased this onto current main and took care of the review nits while keeping the provider scope narrow:\n\n- kept Baidu as an explicit opt-in web_search.provider = "baidu" backend, with no silent fallback when the API key is missing\n- updated the docs/config wording so the provider list includes Baidu\n- added serde aliases for baidu_search and baidu-ai-search alongside the original names\n- resolved the web-search test import conflict from main\n\nVerified locally on 12c9cd41:\n- cargo fmt --all -- --check\n- git diff --check\n- python3 scripts/check-provider-registry.py\n- cargo test -p codewhale-tui baidu -- --nocapture\n- cargo test -p codewhale-tui tools::web_search::tests -- --nocapture (34 passed)\n- cargo check -p codewhale-tui --all-features --locked\n\nThanks @jimmyzhuu, this is a clean addition to the search-provider set.

@jimmyzhuu
Copy link
Copy Markdown
Contributor Author

@Hmbown Thanks for handling the rebase and nits! LGTM for merge from my side. Appreciate your review :)

@Hmbown Hmbown merged commit 42576a7 into Hmbown:main May 31, 2026
2 checks passed
@jimmyzhuu jimmyzhuu deleted the feat/baidu-search-provider branch May 31, 2026 05:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants