feat: add tool search with semantic, local, and auto search modes #322
feat: add tool search with semantic, local, and auto search modes #322shashi-stackone merged 6 commits intomainfrom
Conversation
commit: |
There was a problem hiding this comment.
Pull request overview
This PR introduces a new tool discovery/search API on StackOneToolSet, supporting semantic (cloud) search, local BM25+TF‑IDF fallback, and an auto mode that attempts semantic first. It also refactors the previous “utility tools” approach into dedicated search APIs and updates docs/examples and test mocks accordingly.
Changes:
- Added
searchTools(),searchActionNames(), andgetSearchTool()(+ exportedSearchTooland semantic search client/types). - Implemented semantic search client for
/actions/searchand a local hybrid search index (ToolIndex) as fallback / local-only mode. - Updated tests, mocks, README, and examples; removed the old
utility-toolsexample.
Reviewed changes
Copilot reviewed 37 out of 37 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| src/toolsets.ts | Adds search modes, SearchTool wrapper, semantic+local search implementations, and provider filtering via connector. |
| src/semantic-search.ts | Introduces SemanticSearchClient + SemanticSearchError and response/result mapping. |
| src/local-search.ts | Adds local hybrid BM25+TF‑IDF search index (ToolIndex). |
| src/utils/normalize.ts | Adds action name normalization bridging semantic API names to MCP tool names. |
| src/tool.ts | Adds BaseTool.connector and replaces utilityTools() with Tools.getConnectors(). |
| src/index.ts | Exports new search APIs and semantic search client/types. |
| src/toolsets.test.ts, src/semantic-search.test.ts, src/local-search.test.ts | Adds/updates tests for semantic/local/auto search and normalization. |
| mocks/* + mocks/constants.ts | Centralizes test base URL and updates MSW handlers to use it. |
| examples/*, README.md, examples/README.md | Updates documentation and examples to use the new search APIs; removes utility-tools example. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/toolsets.ts
Outdated
| console.warn( | ||
| `Semantic search failed (${error.message}), falling back to local BM25+TF-IDF search`, | ||
| ); | ||
| return this.localSearch(query, allTools, options); |
There was a problem hiding this comment.
console.warn usage here will violate the repo’s no-console lint rule for non-test code (see .oxlintrc.jsonc). Please remove the console call (silent fallback) or route this through an internal logger/diagnostics hook that doesn’t rely on console.*.
| if (error instanceof SemanticSearchError) { | ||
| console.warn(`Semantic search failed: ${error.message}`); | ||
| return []; | ||
| } |
There was a problem hiding this comment.
console.warn usage here will violate the repo’s no-console lint rule for non-test code (see .oxlintrc.jsonc). Consider returning [] without logging, or expose an optional callback/event emitter for diagnostics instead of writing to stdout.
| const index = new ToolIndex(allTools.toArray()); | ||
| const results = await index.search( | ||
| query, | ||
| options?.topK ?? 5, | ||
| options?.minSimilarity ?? 0.0, | ||
| ); | ||
|
|
||
| const matchedNames = results.map((r) => r.name); | ||
| const toolMap = new Map(allTools.toArray().map((t) => [t.name, t])); | ||
| const filterConnectors = options?.connector | ||
| ? new Set([options.connector.toLowerCase()]) | ||
| : availableConnectors; | ||
|
|
||
| const matchedTools = matchedNames | ||
| .filter((name) => toolMap.has(name)) | ||
| .map((name) => toolMap.get(name)!) | ||
| .filter((tool) => tool.connector && filterConnectors.has(tool.connector)); | ||
|
|
There was a problem hiding this comment.
localSearch applies the connector filter after taking the top-K results from the full index. This can return an empty/too-small result set even when matches exist for the requested connector (because higher-scoring tools from other connectors get filtered out). Filter the tools by connector before building/searching the index, or over-fetch results and keep collecting until you have topK matches for the connector.
src/toolsets.ts
Outdated
| const client = this.getSemanticClient(); | ||
| const allResults: SemanticSearchResult[] = []; |
There was a problem hiding this comment.
In auto mode, a missing API key will currently throw ToolSetConfigError from getSemanticClient() (not a SemanticSearchError), so searchTools() won’t fall back to local search. If search: 'auto' is intended to be resilient, treat “no API key” as a semantic-unavailable condition and fall back to localSearch, reserving hard failures for search: 'semantic'.
src/local-search.ts
Outdated
| import * as orama from '@orama/orama'; | ||
| import { DEFAULT_HYBRID_ALPHA } from './consts'; | ||
| import type { BaseTool } from './tool'; | ||
| import type { ToolParameters } from './types'; |
There was a problem hiding this comment.
Unused type import ToolParameters triggers the repo’s typescript/no-unused-vars rule. Remove the import (it’s not referenced in this module).
| import type { ToolParameters } from './types'; |
| try { | ||
| const client = this.getSemanticClient(); | ||
| let allResults: SemanticSearchResult[] = []; |
There was a problem hiding this comment.
searchActionNames() calls getSemanticClient(), which can throw ToolSetConfigError when no API key is configured. That error isn’t handled here, so the method will throw instead of returning [] as the docs/tests imply for semantic-search failures. Consider handling missing API key the same way as other semantic failures (return [] or add an explicit option to control behavior).
src/toolsets.ts
Outdated
| const index = new ToolIndex(allTools.toArray()); | ||
| const results = await index.search( | ||
| query, |
There was a problem hiding this comment.
localSearch rebuilds a full BM25/TF-IDF index (new ToolIndex(allTools.toArray())) on every search call. For repeated searches this is unnecessarily expensive; consider caching the index (e.g., keyed by a tools snapshot/version) or constructing it lazily once per StackOneToolSet/Tools instance.
src/local-search.test.ts
Outdated
| import { ToolIndex, type ToolSearchResult } from './local-search'; | ||
| import { BaseTool } from './tool'; | ||
| import { ParameterLocation } from './types'; |
There was a problem hiding this comment.
Unused imports (type ToolSearchResult and ParameterLocation) will fail the repo’s typescript/no-unused-vars lint rule. Please remove them or use them in the tests.
| import { ToolIndex, type ToolSearchResult } from './local-search'; | |
| import { BaseTool } from './tool'; | |
| import { ParameterLocation } from './types'; | |
| import { ToolIndex } from './local-search'; | |
| import { BaseTool } from './tool'; |
| async searchTools(query: string, options?: SearchToolsOptions): Promise<Tools> { | ||
| const search = options?.search ?? 'auto'; | ||
| const allTools = await this.fetchTools({ accountIds: options?.accountIds }); | ||
| const availableConnectors = allTools.getConnectors(); | ||
|
|
||
| if (availableConnectors.size === 0) { | ||
| return new Tools([]); | ||
| } | ||
|
|
||
| // Local-only search — skip semantic API entirely | ||
| if (search === 'local') { | ||
| return this.localSearch(query, allTools, options); | ||
| } |
There was a problem hiding this comment.
searchTools is likely to exceed the repo’s configured cyclomatic complexity limit (max 7 in .oxlintrc.jsonc). Consider extracting the semantic-search path and the connector-selection logic into smaller private helpers (e.g., semanticSearchTools() / getConnectorsToSearch() / matchSemanticResultsToTools()) to keep each function under the limit and easier to test.
There was a problem hiding this comment.
5 issues found across 37 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/semantic-search.ts">
<violation number="1" location="src/semantic-search.ts:131">
P1: Custom agent: **Flag Security Vulnerabilities**
Enforce HTTPS for semantic search requests. The constructor accepts any baseUrl and uses it for fetch calls; this allows insecure `http://` endpoints and can leak API keys in transit. Validate the scheme and reject non-HTTPS base URLs per the security rule requiring TLS.</violation>
<violation number="2" location="src/semantic-search.ts:182">
P2: Clear the timeout in a finally block so it’s always cancelled even when fetch throws; otherwise the timer can leak and fire after the request has already failed.</violation>
</file>
<file name="examples/search-tools.ts">
<violation number="1" location="examples/search-tools.ts:105">
P2: Guard the fetchTools call so it only runs when topActions is non-empty; otherwise this example fetches the entire tool catalog whenever no action exceeds the 0.7 threshold.</violation>
</file>
<file name="src/toolsets.ts">
<violation number="1" location="src/toolsets.ts:650">
P2: Bug: `localSearch` applies `topK` limit to the index search *before* filtering by `connector`, which can return fewer results than requested. When a connector filter is active, the index should fetch more results (or all) so that post-filter slicing still yields up to `topK` tools.</violation>
</file>
<file name="src/mcp-client.test.ts">
<violation number="1" location="src/mcp-client.test.ts:50">
P2: Hardcoding `http://localhost/mcp` here makes the test depend on a running local MCP server. Since no server is started in this test and `createMCPClient` performs real HTTP calls, this will fail in CI or for contributors without a local server. Consider using MSW/mocked transport or a test server fixture instead of a hardcoded localhost endpoint.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| timeout?: number; | ||
| }) { | ||
| this.apiKey = apiKey; | ||
| this.baseUrl = baseUrl.replace(/\/+$/, ''); |
There was a problem hiding this comment.
P1: Custom agent: Flag Security Vulnerabilities
Enforce HTTPS for semantic search requests. The constructor accepts any baseUrl and uses it for fetch calls; this allows insecure http:// endpoints and can leak API keys in transit. Validate the scheme and reject non-HTTPS base URLs per the security rule requiring TLS.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/semantic-search.ts, line 131:
<comment>Enforce HTTPS for semantic search requests. The constructor accepts any baseUrl and uses it for fetch calls; this allows insecure `http://` endpoints and can leak API keys in transit. Validate the scheme and reject non-HTTPS base URLs per the security rule requiring TLS.</comment>
<file context>
@@ -0,0 +1,260 @@
+ timeout?: number;
+ }) {
+ this.apiKey = apiKey;
+ this.baseUrl = baseUrl.replace(/\/+$/, '');
+ this.timeout = timeout;
+ }
</file context>
| const topActions = results.filter((r) => r.similarityScore > 0.7).map((r) => r.actionName); | ||
| console.log(`\nFetching tools for top actions: ${topActions.join(', ')}`); | ||
|
|
||
| const tools = await toolset.fetchTools({ actions: topActions }); |
There was a problem hiding this comment.
P2: Guard the fetchTools call so it only runs when topActions is non-empty; otherwise this example fetches the entire tool catalog whenever no action exceeds the 0.7 threshold.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At examples/search-tools.ts, line 105:
<comment>Guard the fetchTools call so it only runs when topActions is non-empty; otherwise this example fetches the entire tool catalog whenever no action exceeds the 0.7 threshold.</comment>
<file context>
@@ -0,0 +1,147 @@
+ const topActions = results.filter((r) => r.similarityScore > 0.7).map((r) => r.actionName);
+ console.log(`\nFetching tools for top actions: ${topActions.join(', ')}`);
+
+ const tools = await toolset.fetchTools({ actions: topActions });
+ console.log(`Fetched ${tools.length} tools`);
+ }
</file context>
src/toolsets.ts
Outdated
| } | ||
|
|
||
| const index = new ToolIndex(allTools.toArray()); | ||
| const results = await index.search( |
There was a problem hiding this comment.
P2: Bug: localSearch applies topK limit to the index search before filtering by connector, which can return fewer results than requested. When a connector filter is active, the index should fetch more results (or all) so that post-filter slicing still yields up to topK tools.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/toolsets.ts, line 650:
<comment>Bug: `localSearch` applies `topK` limit to the index search *before* filtering by `connector`, which can return fewer results than requested. When a connector filter is active, the index should fetch more results (or all) so that post-filter slicing still yields up to `topK` tools.</comment>
<file context>
@@ -265,6 +349,326 @@ export class StackOneToolSet {
+ }
+
+ const index = new ToolIndex(allTools.toArray());
+ const results = await index.search(
+ query,
+ options?.topK ?? 5,
</file context>
| test('createMCPClient can connect and list tools from MCP server', async () => { | ||
| await using mcpClient = await createMCPClient({ | ||
| baseUrl: 'https://api.stackone-dev.com/mcp', | ||
| baseUrl: 'http://localhost/mcp', |
There was a problem hiding this comment.
P2: Hardcoding http://localhost/mcp here makes the test depend on a running local MCP server. Since no server is started in this test and createMCPClient performs real HTTP calls, this will fail in CI or for contributors without a local server. Consider using MSW/mocked transport or a test server fixture instead of a hardcoded localhost endpoint.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/mcp-client.test.ts, line 50:
<comment>Hardcoding `http://localhost/mcp` here makes the test depend on a running local MCP server. Since no server is started in this test and `createMCPClient` performs real HTTP calls, this will fail in CI or for contributors without a local server. Consider using MSW/mocked transport or a test server fixture instead of a hardcoded localhost endpoint.</comment>
<file context>
@@ -47,7 +47,7 @@ test('createMCPClient provides asyncDispose for cleanup', async () => {
test('createMCPClient can connect and list tools from MCP server', async () => {
await using mcpClient = await createMCPClient({
- baseUrl: 'https://api.stackone-dev.com/mcp',
+ baseUrl: 'http://localhost/mcp',
headers: {
Authorization: `Basic ${Buffer.from('test-key:').toString('base64')}`,
</file context>
Summary
searchTools(),searchActionNames(), andgetSearchTool()methods toStackOneToolSet0.2 * BM25 + 0.8 * TF-IDFscoring
automode (default): tries semantic first, falls back to local on failureutility-toolsexampleTest plan
searchTools()returns ranked tools for natural language queriessearchActionNames()returns action names and similarity scoresgetSearchTool()returns reusable search tool for agent loopsSemanticSearchErroron failureSummary by cubic
Adds semantic tool search with an auto fallback to local BM25+TF‑IDF and a reusable SearchTool for agent loops. Adds constructor-level search config and a new semantic search example; examples now use env-based setup with a topK demo.
New Features
Migration
Written for commit 8145e88. Summary will update on new commits.