feat: add Perplexity Search built-in tool #35709

Open: jliounis wants to merge 3 commits into langgenius:main from jliounis:feat/perplexity-search-tool

@jliounis commented Apr 29, 2026

Summary

Adds a new built-in tool provider perplexity with a single tool perplexity_search that wraps Perplexity's Search API. It returns ranked web results (title, URL, snippet, date) as a JSON message, link messages, and a human-readable Markdown text message — mirroring the structure used by the existing Tavily provider that previously lived at api/core/tools/provider/builtin/tavily/.

Files added

  • api/core/tools/builtin_tool/providers/perplexity/perplexity.yaml – provider manifest with a perplexity_api_key secret-input credential.
  • api/core/tools/builtin_tool/providers/perplexity/perplexity.py – provider class. Credential validation runs a tiny query=ping, max_results=1 search through the tool and surfaces any error as ToolProviderCredentialValidationError.
  • api/core/tools/builtin_tool/providers/perplexity/tools/perplexity_search.{yaml,py} – tool manifest and implementation calling POST https://api.perplexity.ai/search with Authorization: Bearer <key>.
  • api/core/tools/builtin_tool/providers/perplexity/_assets/icon.svg – minimal placeholder icon.
  • api/core/tools/builtin_tool/_position.yaml – registers perplexity so it sorts alongside the other built-ins.

Tool parameters

| Name | Type | Notes |
| --- | --- | --- |
| query | string (required, llm) | Search query. |
| max_results | number (form, default 5) | 1–20 results. |
| search_domain_filter | string (form) | Comma-separated allow-list, or -domain.com entries for deny. Don't mix. |
| search_recency_filter | select (form) | hour / day / week / month / year. |
| search_after_date_filter | string (form) | m/d/yyyy (e.g. 1/1/2025). |
| search_before_date_filter | string (form) | m/d/yyyy (e.g. 12/31/2025). |
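
As a rough illustration of how these parameters could be turned into a request body, here is a minimal sketch. The helper names `parse_domain_filter` and `build_payload` are hypothetical (the PR only mentions a `_build_payload` method, whose actual code is not shown here), and the sketch assumes the request body uses the same field names as the tool parameters.

```python
from __future__ import annotations

from typing import Any

DEFAULT_MAX_RESULTS = 5  # default taken from the parameter table above


def parse_domain_filter(raw: Any) -> list[str]:
    """Accept a comma/space-separated string or a list; drop empty entries."""
    if raw is None:
        return []
    if isinstance(raw, str):
        # Treat commas as whitespace, then split on any run of whitespace.
        parts = raw.replace(",", " ").split()
    else:
        parts = [str(item).strip() for item in raw]
    return [part for part in parts if part]


def build_payload(params: dict[str, Any]) -> dict[str, Any]:
    """Assemble a request body, omitting optional filters that are unset."""
    payload: dict[str, Any] = {
        "query": params["query"],
        "max_results": int(params.get("max_results") or DEFAULT_MAX_RESULTS),
    }
    domains = parse_domain_filter(params.get("search_domain_filter"))
    if domains:
        payload["search_domain_filter"] = domains
    for key in (
        "search_recency_filter",
        "search_after_date_filter",
        "search_before_date_filter",
    ):
        value = params.get(key)
        if value:
            payload[key] = value  # forwarded unchanged
    return payload
```

Leaving unset filters out of the body, rather than sending empty values, is a design choice of this sketch and not something the PR description states.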

The user-facing description is: "Search the web for up-to-date information using the Perplexity Search API. Returns ranked results with snippets, titles, URLs, and dates."

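Docs

  • https://docs.perplexity.ai/docs/search/quickstart
  • https://docs.perplexity.ai/api-reference/search-post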

Testing

Nine unit tests in api/tests/unit_tests/core/tools/test_perplexity_search.py mock requests.post and cover:

  1. Default max_results and override behavior in _build_payload.
  2. Domain filter accepting both string (comma/space-separated) and list inputs, with empties dropped.
  3. Successful invocation: JSON, per-result link messages, and a final text message.
  4. Empty results return a single friendly text message.
  5. Missing query short-circuits without an HTTP call.
  6. Missing credentials short-circuits without an HTTP call.
  7. HTTP 500 from the API raises ToolInvokeError.
  8. All filter parameters are forwarded to the API request body verbatim.
  9. The text formatter renders title/URL/snippet/date for each result.
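
The short-circuit cases in the list above can be sketched with `unittest.mock`. This is not the PR's test file: `run_search` is a hypothetical stand-in for the tool's invoke path, the returned messages are invented for illustration, and the HTTP function is injected as an argument so that the example stays self-contained.

```python
from unittest.mock import Mock

SEARCH_URL = "https://api.perplexity.ai/search"


def run_search(query, api_key, post):
    """Hypothetical stand-in for the tool: validate inputs, then call the API."""
    if not query:
        return "Please provide a search query."  # illustrative message
    if not api_key:
        return "Perplexity API key is missing."  # illustrative message
    response = post(
        SEARCH_URL,
        json={"query": query},
        headers={"Authorization": f"Bearer {api_key}"},
    )
    return response.json()


def test_missing_query_skips_http():
    post = Mock()
    assert run_search("", "key", post) == "Please provide a search query."
    post.assert_not_called()  # no HTTP call was attempted


def test_missing_credentials_skips_http():
    post = Mock()
    assert run_search("dify", "", post) == "Perplexity API key is missing."
    post.assert_not_called()


def test_success_calls_api_once():
    post = Mock()
    post.return_value.json.return_value = {"results": []}
    assert run_search("dify", "key", post) == {"results": []}
    post.assert_called_once()
```

Asserting on the mock (`assert_not_called`, `assert_called_once`) is what distinguishes "returned early" from "called the API and happened to return the same message".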

Run:

cd api && uv run pytest tests/unit_tests/core/tools/test_perplexity_search.py

Result: 9 passed. uv run ruff check and uv run ruff format --check are also clean for the new files.

Adds a new built-in tool provider `perplexity` with a single tool
`perplexity_search` that calls the Perplexity Search API
(POST https://api.perplexity.ai/search) and returns ranked web results
(title, url, snippet, date) as a JSON message, link messages, and a
human-readable text message.

The provider mirrors the existing Tavily reference pattern:
- `perplexity.yaml` with a `perplexity_api_key` secret-input credential
- `perplexity.py` provider whose credential validation runs a tiny
  search query through the tool
- `tools/perplexity_search.{yaml,py}` exposing `query`, `max_results`,
  `search_domain_filter`, `search_recency_filter`,
  `search_after_date_filter`, and `search_before_date_filter`
- minimal SVG icon under `_assets/`

Registered in `builtin_tool/_position.yaml` so the provider shows up in
the same UI ordering as the other built-ins.

Tests: nine unit tests in
`api/tests/unit_tests/core/tools/test_perplexity_search.py` covering
payload construction, default/override behavior, domain filter parsing,
result rendering, missing-query and missing-credentials paths, HTTP
error mapping to ToolInvokeError, and end-to-end message generation
with the HTTP layer mocked.

Docs:
- https://docs.perplexity.ai/docs/search/quickstart
- https://docs.perplexity.ai/api-reference/search-post

@dosubot (bot) added the size:L label (this PR changes 100-499 lines, ignoring generated files) on Apr 29, 2026

@github-actions (Contributor)

Pyrefly Diff

base → PR
--- /tmp/pyrefly_base.txt	2026-04-30 06:30:36.218607724 +0000
+++ /tmp/pyrefly_pr.txt	2026-04-30 06:30:26.458514946 +0000
@@ -1845,7 +1845,7 @@
 ERROR Missing argument `app_model` in function `handler` [missing-argument]
   --> tests/unit_tests/controllers/console/app/test_wraps.py:43:16
 ERROR Cannot set item in `OrderedDict[str, bool | list[str] | str]` [unsupported-operation]
-   --> tests/unit_tests/controllers/console/app/workflow_draft_variables_test.py:137:47
+   --> tests/unit_tests/controllers/console/app/workflow_draft_variables_test.py:134:47
 ERROR `None` is not subscriptable [unsupported-operation]
    --> tests/unit_tests/controllers/console/auth/test_login_logout.py:516:16
 ERROR `None` is not subscriptable [unsupported-operation]
@@ -5045,6 +5045,9 @@
 ERROR Object of class `BlobChunkMessage` has no attribute `text`
 ERROR Object of class `BlobChunkMessage` has no attribute `text`
 ERROR Object of class `BlobChunkMessage` has no attribute `text`
+ERROR Object of class `BlobChunkMessage` has no attribute `text`
+ERROR Object of class `BlobChunkMessage` has no attribute `text`
+ERROR Object of class `BlobChunkMessage` has no attribute `text`
 ERROR Object of class `BlobChunkMessage` has no attribute `text`
 ERROR Object of class `BlobChunkMessage` has no attribute `text`
 ERROR Argument `SimpleNamespace` is not assignable to parameter `agent_message` with type `Message` in function `core.tools.tool_engine.ToolEngine._create_message_files` [bad-argument-type]

@Qodo-Free-For-OSS

Hi, PerplexitySearchTool uses requests.post(...) directly, bypassing the repo’s centralized HTTP stack (core.helper.ssrf_proxy) that provides shared proxying/retry/timeout/trace/SSRF protections. This creates inconsistent network behavior versus other HTTP flows and may break deployments that rely on configured proxying/telemetry behavior for outbound requests.

Severity: action required | Category: reliability

How to fix: Use ssrf_proxy/httpx for calls

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

PerplexitySearchTool performs outbound HTTP using requests.post, which bypasses the repo’s standard outbound HTTP layer (core.helper.ssrf_proxy) and its shared behavior (proxying, retries/backoff, default timeouts, SSL verify config, and trace headers).

Issue Context

The codebase implements core.helper.ssrf_proxy.make_request to centralize outbound HTTP settings and protections. New outbound calls should use this layer (or the pooled httpx client mechanisms it relies on) for consistency.

Fix Focus Areas

  • api/core/tools/builtin_tool/providers/perplexity/tools/perplexity_search.py[6-95]
  • api/core/helper/ssrf_proxy.py[134-209]

Notes

  • Replace requests.post(...) with ssrf_proxy.post(...) (or ssrf_proxy.make_request("POST", ...)) and adapt response handling to httpx (response.raise_for_status(), response.json() equivalents).
  • Preserve the existing timeout behavior (map HTTP_TIMEOUT appropriately to httpx timeout config).
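
One way to picture the suggested change is the sketch below. It does not import Dify's `core.helper.ssrf_proxy`; instead the HTTP function is passed in as `post`, on the assumption that `ssrf_proxy.post` accepts httpx-style keyword arguments (`json`, `headers`, `timeout`) and returns an httpx-style response. `ToolInvokeError` here is a local stand-in class, and the 30-second timeout is a placeholder rather than the PR's `HTTP_TIMEOUT` value.

```python
from typing import Any, Callable

SEARCH_URL = "https://api.perplexity.ai/search"


class ToolInvokeError(Exception):
    """Local stand-in for Dify's ToolInvokeError (assumption for this sketch)."""


def post_search(
    post: Callable[..., Any],
    api_key: str,
    payload: dict,
    timeout: float = 30.0,  # placeholder value, not taken from the PR
) -> Any:
    """Send the search request through an injected httpx-style `post`."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    try:
        response = post(SEARCH_URL, json=payload, headers=headers, timeout=timeout)
        response.raise_for_status()  # httpx-style: raises on 4xx/5xx
        return response.json()
    except Exception as exc:
        # A real implementation would likely catch httpx's specific error
        # types; a broad catch keeps this sketch free of third-party imports.
        raise ToolInvokeError(f"Perplexity Search request failed: {exc}") from exc
```

Injecting `post` also makes the function easy to test, because a fake callable can stand in for the network layer.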

We noticed a couple of other issues in this PR as well - happy to share if helpful.


Spotted by Qodo code review - free for open-source projects.

@jliounis (Author)

Thanks for the review. Pushed 5b653cd addressing all feedback:

  • @Qodo-Free-For-OSS SSRF/proxy bypass: replaced direct requests.post(...) with core.helper.ssrf_proxy.post(...) so the call goes through the centralized HTTP stack (proxying, retries, timeouts, SSRF protections). Response handling adapted to httpx (raise_for_status() / .json()), with HTTP errors still surfaced as ToolInvokeError. Unit tests updated to mock ssrf_proxy.post.
  • CI Style Check / Python Style (mypy): fixed the int(Any | None) type error at perplexity_search.py:28 by narrowing/clamping max_results before coercion. make type-check-core, basedpyright, ruff check, ruff format --check, and the 9 unit tests are all green locally.
  • Attribution header: added X-Pplx-Integration: dify/<version> to outbound requests (consistent with how we identify integration traffic across our partner ecosystem). Covered by a new test assertion.
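
For readers unfamiliar with the `int(Any | None)` complaint, the following is a minimal sketch of what narrowing and clamping before coercion can look like, together with a header helper. The function names, the fallback-to-default behaviour, and the `version` parameter are assumptions for illustration; only the 1–20 range, the default of 5, and the header names come from this thread.

```python
from __future__ import annotations

from typing import Any


def coerce_max_results(raw: Any, default: int = 5, low: int = 1, high: int = 20) -> int:
    """Narrow an untyped form value to an int within [low, high]."""
    # Rule out None (and bool, which is an int subclass) before calling int().
    if raw is None or isinstance(raw, bool):
        return default
    if isinstance(raw, (int, float, str)):
        try:
            value = int(raw)
        except (ValueError, OverflowError):
            return default  # e.g. "abc", NaN, or infinity
        return max(low, min(high, value))
    return default


def integration_headers(api_key: str, version: str) -> dict[str, str]:
    """Build request headers, including the attribution header."""
    return {
        "Authorization": f"Bearer {api_key}",
        "X-Pplx-Integration": f"dify/{version}",
    }
```

Because `raw` is checked with `isinstance` before `int()` is called, a type checker sees only `int | float | str` at the call site, which is what removes the error.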

Could a maintainer take another look when you have a moment? Happy to iterate further. 🙏

@jliounis (Author)

Re-checked all review comments at head 5b653cd:

  • @Qodo-Free-For-OSS (SSRF/proxy bypass): already fixed in 5b653cd: requests.post replaced with ssrf_proxy.post, response handling adapted to httpx, X-Pplx-Integration header added.
  • Pyrefly diff (github-actions): the three new BlobChunkMessage has no attribute text errors shown in the diff are in tests/unit_tests/core/plugin/utils/test_chunk_merger.py, which this PR does not modify — they are pre-existing errors in the base branch, not introduced by these changes.

All 9 unit tests pass locally (pytest tests/unit_tests/core/tools/test_perplexity_search.py). No further changes needed.
