Skip to content

feat: add tool search with semantic, local, and auto search modes #322

Merged
shashi-stackone merged 6 commits intomainfrom
semantic_search_12112
Mar 6, 2026
Merged

feat: add tool search with semantic, local, and auto search modes #322
shashi-stackone merged 6 commits intomainfrom
semantic_search_12112

Conversation

@shashi-stackone
Copy link
Copy Markdown
Contributor

@shashi-stackone shashi-stackone commented Mar 3, 2026

Summary

  • Add searchTools(), searchActionNames(), and getSearchTool() methods to
    StackOneToolSet
  • Implement semantic search via StackOne's search API with per-connector parallel search
  • Implement local hybrid search (BM25 via Orama + TF-IDF) with 0.2 * BM25 + 0.8 * TF-IDF
    scoring
  • Add auto mode (default): tries semantic first, falls back to local on failure
  • Add URL deduplication in MCP client to prevent duplicate tool definitions
  • Update README and all examples to use new search API
  • Remove old utility-tools example

Test plan

  • searchTools() returns ranked tools for natural language queries
  • searchActionNames() returns action names and similarity scores
  • getSearchTool() returns reusable search tool for agent loops
  • Auto mode falls back to local when semantic API is unavailable
  • Semantic mode throws SemanticSearchError on failure
  • Local mode runs entirely in-process with no network calls
  • URL deduplication prevents duplicate tools from MCP

Summary by cubic

Adds semantic tool search with an auto fallback to local BM25+TF‑IDF and a reusable SearchTool for agent loops. Adds constructor-level search config and a new semantic search example; examples now use env-based setup with a topK demo.

  • New Features

    • Added StackOneToolSet.searchTools(), searchActionNames(), and getSearchTool(); modes: semantic, local, and auto (default).
    • Constructor SearchConfig (method, topK, minSimilarity) with per-call overrides; set search: null to disable.
    • Per-connector parallel semantic queries via /actions/search; local search runs offline.
    • Normalizes versioned action names to MCP tool names.
    • Exported SearchTool, SemanticSearchClient, SemanticSearchError, and related types.
    • New example: examples/search-tools.ts with semantic, agent-loop, action-name-only, local-only, and topK demos.
  • Migration

    • Replaced the utility-tools example with search-tools.
    • README/examples now read STACKONE_API_KEY and STACKONE_ACCOUNT_ID; baseUrl no longer hardcoded.
    • Tests/mocks standardized on a TEST_BASE_URL; minor example command/title fixes.

Written for commit 8145e88. Summary will update on new commits.

Copilot AI review requested due to automatic review settings March 3, 2026 23:10
@shashi-stackone shashi-stackone requested a review from a team as a code owner March 3, 2026 23:10
@pkg-pr-new
Copy link
Copy Markdown

pkg-pr-new bot commented Mar 3, 2026

Open in StackBlitz

npm i https://pkg.pr.new/@stackone/ai@322

commit: cdd0eaf

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new tool discovery/search API on StackOneToolSet, supporting semantic (cloud) search, local BM25+TF‑IDF fallback, and an auto mode that attempts semantic first. It also refactors the previous “utility tools” approach into dedicated search APIs and updates docs/examples and test mocks accordingly.

Changes:

  • Added searchTools(), searchActionNames(), and getSearchTool() (+ exported SearchTool and semantic search client/types).
  • Implemented semantic search client for /actions/search and a local hybrid search index (ToolIndex) as fallback / local-only mode.
  • Updated tests, mocks, README, and examples; removed the old utility-tools example.

Reviewed changes

Copilot reviewed 37 out of 37 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
src/toolsets.ts Adds search modes, SearchTool wrapper, semantic+local search implementations, and provider filtering via connector.
src/semantic-search.ts Introduces SemanticSearchClient + SemanticSearchError and response/result mapping.
src/local-search.ts Adds local hybrid BM25+TF‑IDF search index (ToolIndex).
src/utils/normalize.ts Adds action name normalization bridging semantic API names to MCP tool names.
src/tool.ts Adds BaseTool.connector and replaces utilityTools() with Tools.getConnectors().
src/index.ts Exports new search APIs and semantic search client/types.
src/toolsets.test.ts, src/semantic-search.test.ts, src/local-search.test.ts Adds/updates tests for semantic/local/auto search and normalization.
mocks/* + mocks/constants.ts Centralizes test base URL and updates MSW handlers to use it.
examples/*, README.md, examples/README.md Updates documentation and examples to use the new search APIs; removes utility-tools example.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

src/toolsets.ts Outdated
Comment on lines +527 to +530
console.warn(
`Semantic search failed (${error.message}), falling back to local BM25+TF-IDF search`,
);
return this.localSearch(query, allTools, options);
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

console.warn usage here will violate the repo’s no-console lint rule for non-test code (see .oxlintrc.jsonc). Please remove the console call (silent fallback) or route this through an internal logger/diagnostics hook that doesn’t rely on console.*.

Copilot uses AI. Check for mistakes.
Comment on lines +628 to +631
if (error instanceof SemanticSearchError) {
console.warn(`Semantic search failed: ${error.message}`);
return [];
}
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

console.warn usage here will violate the repo’s no-console lint rule for non-test code (see .oxlintrc.jsonc). Consider returning [] without logging, or expose an optional callback/event emitter for diagnostics instead of writing to stdout.

Copilot uses AI. Check for mistakes.
Comment on lines +649 to +666
const index = new ToolIndex(allTools.toArray());
const results = await index.search(
query,
options?.topK ?? 5,
options?.minSimilarity ?? 0.0,
);

const matchedNames = results.map((r) => r.name);
const toolMap = new Map(allTools.toArray().map((t) => [t.name, t]));
const filterConnectors = options?.connector
? new Set([options.connector.toLowerCase()])
: availableConnectors;

const matchedTools = matchedNames
.filter((name) => toolMap.has(name))
.map((name) => toolMap.get(name)!)
.filter((tool) => tool.connector && filterConnectors.has(tool.connector));

Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

localSearch applies the connector filter after taking the top-K results from the full index. This can return an empty/too-small result set even when matches exist for the requested connector (because higher-scoring tools from other connectors get filtered out). Filter the tools by connector before building/searching the index, or over-fetch results and keep collecting until you have topK matches for the connector.

Copilot uses AI. Check for mistakes.
src/toolsets.ts Outdated
Comment on lines +465 to +466
const client = this.getSemanticClient();
const allResults: SemanticSearchResult[] = [];
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In auto mode, a missing API key will currently throw ToolSetConfigError from getSemanticClient() (not a SemanticSearchError), so searchTools() won’t fall back to local search. If search: 'auto' is intended to be resilient, treat “no API key” as a semantic-unavailable condition and fall back to localSearch, reserving hard failures for search: 'semantic'.

Copilot uses AI. Check for mistakes.
import * as orama from '@orama/orama';
import { DEFAULT_HYBRID_ALPHA } from './consts';
import type { BaseTool } from './tool';
import type { ToolParameters } from './types';
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused type import ToolParameters triggers the repo’s typescript/no-unused-vars rule. Remove the import (it’s not referenced in this module).

Suggested change
import type { ToolParameters } from './types';

Copilot uses AI. Check for mistakes.
Comment on lines +576 to +578
try {
const client = this.getSemanticClient();
let allResults: SemanticSearchResult[] = [];
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

searchActionNames() calls getSemanticClient(), which can throw ToolSetConfigError when no API key is configured. That error isn’t handled here, so the method will throw instead of returning [] as the docs/tests imply for semantic-search failures. Consider handling missing API key the same way as other semantic failures (return [] or add an explicit option to control behavior).

Copilot uses AI. Check for mistakes.
src/toolsets.ts Outdated
Comment on lines +649 to +651
const index = new ToolIndex(allTools.toArray());
const results = await index.search(
query,
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

localSearch rebuilds a full BM25/TF-IDF index (new ToolIndex(allTools.toArray())) on every search call. For repeated searches this is unnecessarily expensive; consider caching the index (e.g., keyed by a tools snapshot/version) or constructing it lazily once per StackOneToolSet/Tools instance.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +3
import { ToolIndex, type ToolSearchResult } from './local-search';
import { BaseTool } from './tool';
import { ParameterLocation } from './types';
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused imports (type ToolSearchResult and ParameterLocation) will fail the repo’s typescript/no-unused-vars lint rule. Please remove them or use them in the tests.

Suggested change
import { ToolIndex, type ToolSearchResult } from './local-search';
import { BaseTool } from './tool';
import { ParameterLocation } from './types';
import { ToolIndex } from './local-search';
import { BaseTool } from './tool';

Copilot uses AI. Check for mistakes.
Comment on lines +435 to +447
async searchTools(query: string, options?: SearchToolsOptions): Promise<Tools> {
const search = options?.search ?? 'auto';
const allTools = await this.fetchTools({ accountIds: options?.accountIds });
const availableConnectors = allTools.getConnectors();

if (availableConnectors.size === 0) {
return new Tools([]);
}

// Local-only search — skip semantic API entirely
if (search === 'local') {
return this.localSearch(query, allTools, options);
}
Copy link

Copilot AI Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

searchTools is likely to exceed the repo’s configured cyclomatic complexity limit (max 7 in .oxlintrc.jsonc). Consider extracting the semantic-search path and the connector-selection logic into smaller private helpers (e.g., semanticSearchTools() / getConnectorsToSearch() / matchSemanticResultsToTools()) to keep each function under the limit and easier to test.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 issues found across 37 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/semantic-search.ts">

<violation number="1" location="src/semantic-search.ts:131">
P1: Custom agent: **Flag Security Vulnerabilities**

Enforce HTTPS for semantic search requests. The constructor accepts any baseUrl and uses it for fetch calls; this allows insecure `http://` endpoints and can leak API keys in transit. Validate the scheme and reject non-HTTPS base URLs per the security rule requiring TLS.</violation>

<violation number="2" location="src/semantic-search.ts:182">
P2: Clear the timeout in a finally block so it’s always cancelled even when fetch throws; otherwise the timer can leak and fire after the request has already failed.</violation>
</file>

<file name="examples/search-tools.ts">

<violation number="1" location="examples/search-tools.ts:105">
P2: Guard the fetchTools call so it only runs when topActions is non-empty; otherwise this example fetches the entire tool catalog whenever no action exceeds the 0.7 threshold.</violation>
</file>

<file name="src/toolsets.ts">

<violation number="1" location="src/toolsets.ts:650">
P2: Bug: `localSearch` applies `topK` limit to the index search *before* filtering by `connector`, which can return fewer results than requested. When a connector filter is active, the index should fetch more results (or all) so that post-filter slicing still yields up to `topK` tools.</violation>
</file>

<file name="src/mcp-client.test.ts">

<violation number="1" location="src/mcp-client.test.ts:50">
P2: Hardcoding `http://localhost/mcp` here makes the test depend on a running local MCP server. Since no server is started in this test and `createMCPClient` performs real HTTP calls, this will fail in CI or for contributors without a local server. Consider using MSW/mocked transport or a test server fixture instead of a hardcoded localhost endpoint.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

timeout?: number;
}) {
this.apiKey = apiKey;
this.baseUrl = baseUrl.replace(/\/+$/, '');
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Custom agent: Flag Security Vulnerabilities

Enforce HTTPS for semantic search requests. The constructor accepts any baseUrl and uses it for fetch calls; this allows insecure http:// endpoints and can leak API keys in transit. Validate the scheme and reject non-HTTPS base URLs per the security rule requiring TLS.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/semantic-search.ts, line 131:

<comment>Enforce HTTPS for semantic search requests. The constructor accepts any baseUrl and uses it for fetch calls; this allows insecure `http://` endpoints and can leak API keys in transit. Validate the scheme and reject non-HTTPS base URLs per the security rule requiring TLS.</comment>

<file context>
@@ -0,0 +1,260 @@
+		timeout?: number;
+	}) {
+		this.apiKey = apiKey;
+		this.baseUrl = baseUrl.replace(/\/+$/, '');
+		this.timeout = timeout;
+	}
</file context>
Fix with Cubic

const topActions = results.filter((r) => r.similarityScore > 0.7).map((r) => r.actionName);
console.log(`\nFetching tools for top actions: ${topActions.join(', ')}`);

const tools = await toolset.fetchTools({ actions: topActions });
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Guard the fetchTools call so it only runs when topActions is non-empty; otherwise this example fetches the entire tool catalog whenever no action exceeds the 0.7 threshold.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At examples/search-tools.ts, line 105:

<comment>Guard the fetchTools call so it only runs when topActions is non-empty; otherwise this example fetches the entire tool catalog whenever no action exceeds the 0.7 threshold.</comment>

<file context>
@@ -0,0 +1,147 @@
+		const topActions = results.filter((r) => r.similarityScore > 0.7).map((r) => r.actionName);
+		console.log(`\nFetching tools for top actions: ${topActions.join(', ')}`);
+
+		const tools = await toolset.fetchTools({ actions: topActions });
+		console.log(`Fetched ${tools.length} tools`);
+	}
</file context>
Fix with Cubic

src/toolsets.ts Outdated
}

const index = new ToolIndex(allTools.toArray());
const results = await index.search(
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Bug: localSearch applies topK limit to the index search before filtering by connector, which can return fewer results than requested. When a connector filter is active, the index should fetch more results (or all) so that post-filter slicing still yields up to topK tools.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/toolsets.ts, line 650:

<comment>Bug: `localSearch` applies `topK` limit to the index search *before* filtering by `connector`, which can return fewer results than requested. When a connector filter is active, the index should fetch more results (or all) so that post-filter slicing still yields up to `topK` tools.</comment>

<file context>
@@ -265,6 +349,326 @@ export class StackOneToolSet {
+		}
+
+		const index = new ToolIndex(allTools.toArray());
+		const results = await index.search(
+			query,
+			options?.topK ?? 5,
</file context>
Fix with Cubic

test('createMCPClient can connect and list tools from MCP server', async () => {
await using mcpClient = await createMCPClient({
baseUrl: 'https://api.stackone-dev.com/mcp',
baseUrl: 'http://localhost/mcp',
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot Mar 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Hardcoding http://localhost/mcp here makes the test depend on a running local MCP server. Since no server is started in this test and createMCPClient performs real HTTP calls, this will fail in CI or for contributors without a local server. Consider using MSW/mocked transport or a test server fixture instead of a hardcoded localhost endpoint.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/mcp-client.test.ts, line 50:

<comment>Hardcoding `http://localhost/mcp` here makes the test depend on a running local MCP server. Since no server is started in this test and `createMCPClient` performs real HTTP calls, this will fail in CI or for contributors without a local server. Consider using MSW/mocked transport or a test server fixture instead of a hardcoded localhost endpoint.</comment>

<file context>
@@ -47,7 +47,7 @@ test('createMCPClient provides asyncDispose for cleanup', async () => {
 test('createMCPClient can connect and list tools from MCP server', async () => {
 	await using mcpClient = await createMCPClient({
-		baseUrl: 'https://api.stackone-dev.com/mcp',
+		baseUrl: 'http://localhost/mcp',
 		headers: {
 			Authorization: `Basic ${Buffer.from('test-key:').toString('base64')}`,
</file context>
Fix with Cubic

Copy link
Copy Markdown

@willleeney willleeney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@shashi-stackone shashi-stackone merged commit feefdc5 into main Mar 6, 2026
18 checks passed
@shashi-stackone shashi-stackone deleted the semantic_search_12112 branch March 6, 2026 09:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants