feat: add Firecrawl web search tools by wjiajian · Pull Request #7764 · AstrBotDevs/AstrBot

wjiajian · 2026-04-24T07:30:07Z

Closes #7761

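Add Firecrawl as a built-in web search provider.

This PR integrates Firecrawl into the built-in Function Tool web search layer and aligns it with how the existing Tavily provider is used. Users can select Firecrawl as the web search provider, both for web search and for extracting page content from a specified URL.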

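Modifications

Add the web_search_firecrawl built-in web search tool, which uses Firecrawl /v2/search.
Add the firecrawl_extract_web_page built-in page content extraction tool, which uses Firecrawl /v2/scrape.
Add the provider_settings.websearch_firecrawl_key configuration option, with support for rotating across multiple Firecrawl API keys.
Add firecrawl to the built-in web search provider configuration options.
Update the Agent web search tool injection logic so that selecting firecrawl registers both the search tool and the page extraction tool.
Update the Dashboard configuration metadata translations with the Firecrawl API Key text.
Update the legacy ChatUI web search result parsing logic to recognize web_search_firecrawl.
Add unit tests covering Firecrawl tool registration, configuration migration, search parameter mapping, page extraction output, and Agent tool injection.
This is NOT a breaking change.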
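
The multi-key rotation described for provider_settings.websearch_firecrawl_key can be pictured as a round-robin over the configured list. The following is a minimal sketch only: `KeyRotator` and `next_key` are hypothetical names chosen for illustration, and the PR's actual rotation logic may differ.

```python
import itertools
import threading


class KeyRotator:
    """Hypothetical round-robin selector over a list of API keys."""

    def __init__(self, keys: list[str]) -> None:
        # Drop empty or whitespace-only entries before cycling.
        cleaned = [k.strip() for k in keys if k and k.strip()]
        if not cleaned:
            raise ValueError("at least one non-empty API key is required")
        self._cycle = itertools.cycle(cleaned)
        self._lock = threading.Lock()

    def next_key(self) -> str:
        # The lock keeps the rotation consistent if called from several threads.
        with self._lock:
            return next(self._cycle)


rotator = KeyRotator(["fc-key-a", "fc-key-b"])
picked = [rotator.next_key() for _ in range(3)]
```

Each call hands out the next key and wraps around at the end of the list, so request load is spread evenly across the configured keys.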

Screenshots or Test Results

Log 1

Log 2

Configuration panel

Usage screenshot

Checklist

😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
I raised this feature in the issue "[Feature] About custom engines for web search" #7761.
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
😮 My changes do not introduce malicious code.

Summary by Sourcery

Integrate Firecrawl as a built-in web search provider alongside existing engines, including both search and page-extraction capabilities, configuration, and agent/dashboard wiring.

New Features:

Add Firecrawl-based web search tool web_search_firecrawl backed by the Firecrawl Search API.
Add Firecrawl-based page extraction tool firecrawl_extract_web_page backed by the Firecrawl scrape API.
Expose firecrawl as a selectable web search provider in chat configuration with support for multiple API keys via provider_settings.websearch_firecrawl_key.
Enable old ChatUI to parse and render Firecrawl web search results in the same way as other web search tools.

Enhancements:

Update agent web search tool injection to register both Firecrawl search and extract tools when Firecrawl is selected as the provider.
Normalize legacy config for websearch_firecrawl_key to support list-based key rotation and align with other providers.
Ensure Firecrawl tools are registered and retrievable via the built-in function tool manager.
Add i18n metadata entries for Firecrawl API key configuration across supported locales.
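
To illustrate what normalizing a legacy websearch_firecrawl_key value into a list might involve, here is a hedged sketch. The function names `normalize_key_setting` and `migrate_provider_settings` are hypothetical, and the assumption that the legacy value could be a single string is mine, not stated in the PR.

```python
from typing import Any


def normalize_key_setting(value: Any) -> list[str]:
    """Coerce a legacy (string or missing) key setting into a list of keys."""
    if value is None:
        return []
    if isinstance(value, str):
        value = [value]
    if not isinstance(value, (list, tuple)):
        raise TypeError(f"unsupported key setting type: {type(value).__name__}")
    # Skip nulls and blank entries so rotation never picks an empty key.
    stripped = (str(v).strip() for v in value if v is not None)
    return [s for s in stripped if s]


def migrate_provider_settings(settings: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the settings with the Firecrawl key stored as a list."""
    migrated = dict(settings)
    migrated["websearch_firecrawl_key"] = normalize_key_setting(
        settings.get("websearch_firecrawl_key")
    )
    return migrated
```

Returning a copy rather than mutating the input keeps the migration easy to unit test.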

Tests:

Add unit tests covering Firecrawl config migration, search parameter mapping, scrape output handling, builtin tool registration, and agent tool injection behavior.

sourcery-ai

Hey - I've left some high level feedback:

The Firecrawl search and scrape helpers duplicate the same API key/header/ClientSession setup logic; consider extracting a shared internal helper to reduce duplication and keep future changes (e.g., base URL or headers) in one place.
Both _firecrawl_search and _firecrawl_scrape raise a generic Exception for HTTP errors; using more specific exception types (e.g., a custom web-search error or RuntimeError/ValueError) would make it easier for callers to distinguish between configuration, network, and API-level failures.
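
The two suggestions above could look roughly like the sketch below: one shared helper that builds the request headers, plus a small exception hierarchy so callers can tell configuration failures from API failures. All names here (`WebSearchError`, `WebSearchConfigError`, `WebSearchAPIError`, `build_firecrawl_headers`, `check_response`) are hypothetical, and the Bearer authorization scheme is an assumption about the Firecrawl API rather than something quoted from the PR.

```python
class WebSearchError(Exception):
    """Base class for web search failures (hypothetical)."""


class WebSearchConfigError(WebSearchError):
    """Raised when the provider is not configured correctly."""


class WebSearchAPIError(WebSearchError):
    """Raised when the remote API answers with an error status."""

    def __init__(self, status: int, body: str) -> None:
        super().__init__(f"Firecrawl request failed: HTTP {status}: {body}")
        self.status = status
        self.body = body


def build_firecrawl_headers(api_key: str) -> dict[str, str]:
    """Single place to build headers shared by search and scrape calls."""
    if not api_key or not api_key.strip():
        raise WebSearchConfigError("Firecrawl API key is not configured")
    return {
        "Authorization": f"Bearer {api_key.strip()}",  # assumed auth scheme
        "Content-Type": "application/json",
    }


def check_response(status: int, body: str) -> None:
    """Raise a typed error for HTTP error statuses; do nothing otherwise."""
    if status >= 400:
        raise WebSearchAPIError(status, body)
```

A caller can then catch `WebSearchConfigError` to prompt for a key and `WebSearchAPIError` to inspect `status`, instead of matching on a generic `Exception` message.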

gemini-code-assist

Code Review

This pull request integrates Firecrawl as a new web search and page extraction provider, adding the FirecrawlWebSearchTool and FirecrawlExtractWebPageTool. The changes span the core agent logic, configuration defaults, tool implementations, dashboard UI, and unit tests. Review feedback identifies a critical bug in the Firecrawl API response parsing that would lead to an AttributeError, the inclusion of unsupported search parameters (tbs), potential TypeErrors when handling null arguments, and opportunities to improve performance and code reuse by refactoring HTTP session management.
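
As a rough illustration of the parsing and null-handling concerns raised in this review, the sketch below reads search results defensively. It assumes the response may carry results either as a flat list under `data` or grouped under `data.web`, and that items use `title`, `url`, and `description` fields; these shapes and the function names are assumptions for illustration, not confirmed by the PR text.

```python
from typing import Any


def extract_web_results(response: Any) -> list[dict[str, Any]]:
    """Return web results from a search response, tolerating both shapes."""
    if not isinstance(response, dict):
        return []
    data = response.get("data")
    if isinstance(data, dict):
        # Grouped shape (assumed): {"data": {"web": [...], ...}}
        data = data.get("web")
    if not isinstance(data, list):
        # Covers a missing or null "data" as well as unexpected types.
        return []
    return [item for item in data if isinstance(item, dict)]


def format_result(item: dict[str, Any]) -> dict[str, str]:
    """Map one result to plain strings; `or ""` guards against explicit nulls."""
    return {
        "title": item.get("title") or "",
        "url": item.get("url") or "",
        "snippet": item.get("description") or "",
    }
```

Checking types before calling `.get` avoids an AttributeError when `data` is a list, and the `or ""` fallbacks avoid a TypeError when a field is present but null.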

Soulter

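It would be better to create a new aiohttp.ClientSession instance per call, as before. That prevents resource leaks, and users who do not use the web search feature avoid creating a module-level session instance.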
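
The per-call session pattern requested here can be sketched as follows. This is an illustration under assumptions, not the merged code: `scrape_page` and its `session_factory` parameter are hypothetical (the parameter exists only so the sketch can be exercised without network access), and the base URL and the `formats` payload field are assumptions about the Firecrawl API.

```python
from typing import Any, Callable, Optional

# Assumed endpoint; the PR text only names the path /v2/scrape.
FIRECRAWL_SCRAPE_URL = "https://api.firecrawl.dev/v2/scrape"


async def scrape_page(
    api_key: str,
    url: str,
    session_factory: Optional[Callable[[], Any]] = None,
) -> dict[str, Any]:
    """Fetch one page, opening and closing a session for this call only."""
    if session_factory is None:
        # Imported here so that merely importing this module creates nothing.
        import aiohttp

        session_factory = aiohttp.ClientSession

    headers = {"Authorization": f"Bearer {api_key}"}
    payload = {"url": url, "formats": ["markdown"]}  # assumed payload shape

    # The context manager closes the session even if the request raises.
    async with session_factory() as session:
        async with session.post(
            FIRECRAWL_SCRAPE_URL, json=payload, headers=headers
        ) as resp:
            if resp.status != 200:
                raise RuntimeError(f"Firecrawl scrape failed: HTTP {resp.status}")
            return await resp.json()
```

Because no session exists at module level, nothing needs to be cleaned up at shutdown, and deployments that never call the tool never open a connection pool.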

feat: add Firecrawl web search and extract tools, update configuratio…

eb65e73

…n and tests

auto-assign Bot requested review from advent259141 and anka-afk April 24, 2026 07:30

dosubot Bot added the size:L (this PR changes 100-499 lines, ignoring generated files), area:core (the bug / feature is about astrbot's core, backend), and area:provider (the bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner) labels Apr 24, 2026

sourcery-ai Bot reviewed Apr 24, 2026

gemini-code-assist Bot reviewed Apr 24, 2026

wjiajian added 2 commits April 24, 2026 15:44

feat: implement Firecrawl API integration and error handling in web s…

dae5e15

…earch tools

feat: enhance Firecrawl web search with session management and payloa…

4b0deb3

…d validation

Soulter requested changes Apr 25, 2026

Comment thread astrbot/core/tools/web_search_tools.py Outdated

wjiajian added 4 commits April 25, 2026 17:39

feat: Firecrawl web search to use aiohttp.ClientSession directly for …

cc19cc2

…improved session management as it was

feat: update Firecrawl search to handle grouped web data response and…

b31bb0e

… add corresponding tests

feat: refactor Firecrawl web search to use aiohttp.ClientSession for …

866caf2

…improved error handling and session management

feat: remove unused coercion function and update Firecrawl search to …

ba478c5

…use default limit in payload

wjiajian requested a review from Soulter April 25, 2026 10:41

Soulter approved these changes Apr 26, 2026

dosubot Bot added the lgtm This PR has been approved by a maintainer label Apr 26, 2026

Soulter merged commit 17aea1a into AstrBotDevs:master Apr 26, 2026
20 checks passed

Soulter merged 7 commits into AstrBotDevs:master from wjiajian:feat/firecrawl-web-search
