feat: auto-suggest tags based on content analysis (#243) by RobertLD · Pull Request #273 · RobertLD/libscope

RobertLD · 2026-03-02T20:40:42Z

Closes #243

Adds tag suggestion via TF-IDF-like keyword extraction:

Core: suggestTags() in tags.ts — tokenize, stopword filter, TF scoring, known-tag boosting
MCP: suggest-tags tool
CLI: libscope tag suggest <docId> [--limit <n>]
REST: GET /api/v1/documents/:id/suggest-tags?limit=5
6 new tests
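
The pipeline listed above (tokenize, stopword filter, TF scoring, known-tag boosting) could look roughly like the sketch below. It is not the code in tags.ts: the short stopword list, the minimum token length of 3, the boost factor of 2, the alphabetical tie-break, and the name suggestTagsSketch are all assumptions made for illustration, and the real function reads the document and tags from the database instead of taking them as arguments.

```typescript
// Hypothetical stopword list; the real STOPWORDS set in the PR is longer.
const STOPWORDS = new Set(["the", "and", "for", "with", "that", "this", "are", "our", "your"]);

// Lowercase, split on anything that is not a letter or digit,
// then drop short tokens and stopwords.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length >= 3 && !STOPWORDS.has(t));
}

// Score each term by relative frequency and multiply the score of terms
// that already exist as tags in the system by `boost` (assumed value).
function suggestTagsSketch(
  content: string,
  knownTags: ReadonlySet<string>,
  existingTags: ReadonlySet<string>,
  limit = 5,
  boost = 2,
): string[] {
  const tokens = tokenize(content);
  const counts = new Map<string, number>();
  for (const t of tokens) counts.set(t, (counts.get(t) ?? 0) + 1);

  const scored: Array<[string, number]> = [];
  counts.forEach((count, term) => {
    if (existingTags.has(term)) return; // assumed: skip tags already on the document
    const tf = count / tokens.length;
    scored.push([term, knownTags.has(term) ? tf * boost : tf]);
  });

  // Highest score first; ties are broken alphabetically so output is deterministic.
  scored.sort((a, b) => b[1] - a[1] || (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
  return scored.slice(0, limit).map(([term]) => term);
}
```

With this sketch, a term that is already a known tag needs only half as many occurrences as an unknown term to reach the same score.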

Copilot

Pull request overview

Adds content-based tag suggestions to LibScope, exposing the functionality through core APIs and multiple user-facing interfaces (CLI, REST, MCP) to improve document discoverability.

Changes:

Implemented suggestTags() in src/core/tags.ts using tokenization, stopword filtering, TF scoring, and boosting for existing system tags.
Exposed tag suggestion via MCP tool (suggest-tags), CLI command (libscope tag suggest), and REST endpoint (GET /api/v1/documents/:id/suggest-tags).
Added unit tests covering suggestTags() behavior and error handling.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
`src/core/tags.ts`	Adds stopword list, tokenizer, and `suggestTags()` implementation.
`src/core/index.ts`	Re-exports `suggestTags` from the core entrypoint.
`src/mcp/server.ts`	Adds MCP tool for suggesting tags for a document.
`src/cli/index.ts`	Adds `libscope tag suggest` command to print suggested tags.
`src/api/routes.ts`	Adds REST route matcher + handler for suggest-tags endpoint.
`tests/unit/tags.test.ts`	Adds unit tests for `suggestTags()` including boosting, exclusion, limits, and missing-doc error.

Copilot · 2026-03-02T20:45:52Z

src/api/routes.ts

+    const suggestDocId = matchDocumentSuggestTags(segments);
+    if (suggestDocId && method === "GET") {
+      const limitRaw = url.searchParams.get("limit");
+      const limit = limitRaw ? parseInt(limitRaw, 10) : 5;


limit is parsed with parseInt but not validated for NaN/non-numeric values (unlike other endpoints in this file that convert NaN to undefined). As written, ?limit=foo results in limit = NaN and an empty suggestions list rather than a 400. Validate the query param (finite integer >= 1, and optionally cap it) and return a VALIDATION_ERROR on invalid input.

Suggested change

-      const limit = limitRaw ? parseInt(limitRaw, 10) : 5;
+      let limit = 5;
+      if (limitRaw !== null) {
+        const parsedLimit = Number.parseInt(limitRaw, 10);
+        if (!Number.isFinite(parsedLimit) || !Number.isInteger(parsedLimit) || parsedLimit < 1) {
+          sendError(res, 400, "VALIDATION_ERROR", "Query parameter 'limit' must be a positive integer");
+          return;
+        }
+        const MAX_LIMIT = 100;
+        limit = Math.min(parsedLimit, MAX_LIMIT);
+      }
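
One detail the check above does not cover is that parseInt accepts trailing garbage: parseInt("5abc", 10) is 5, so ?limit=5abc would pass. A stricter variant, as a sketch only, could pull the parsing into a small helper that rejects anything other than plain digits. The helper name parseLimitParam and the LimitResult type are hypothetical and do not exist in the PR; the default of 5 and the cap of 100 are taken from the suggestion.

```typescript
type LimitResult = { ok: true; value: number } | { ok: false; message: string };

// Hypothetical helper: parse a raw `limit` query value into a bounded integer.
function parseLimitParam(raw: string | null, defaultLimit = 5, maxLimit = 100): LimitResult {
  const message = "Query parameter 'limit' must be a positive integer";
  // Parameter absent: fall back to the default.
  if (raw === null) return { ok: true, value: defaultLimit };
  // Digits only. This rejects "", "foo", "-3", "2.5", and "5abc".
  if (!/^\d+$/.test(raw)) return { ok: false, message };
  const parsed = Number.parseInt(raw, 10);
  if (parsed < 1) return { ok: false, message };
  // Cap silently, as the suggested change does with MAX_LIMIT.
  return { ok: true, value: Math.min(parsed, maxLimit) };
}
```

The route handler would then call sendError(res, 400, "VALIDATION_ERROR", result.message) when result.ok is false and pass result.value to suggestTags otherwise.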

Copilot · 2026-03-02T20:45:53Z

src/api/routes.ts

+    // Suggest tags for a document
+    const suggestDocId = matchDocumentSuggestTags(segments);
+    if (suggestDocId && method === "GET") {
+      const limitRaw = url.searchParams.get("limit");
+      const limit = limitRaw ? parseInt(limitRaw, 10) : 5;
+      const suggestions = suggestTags(db, suggestDocId, limit);
+      const took = Math.round(performance.now() - start);
+      sendJson(res, 200, { documentId: suggestDocId, suggestions }, took);
+      return;


This adds a new REST endpoint (GET /api/v1/documents/:id/suggest-tags) but there are existing API route tests in tests/unit/api.test.ts for other /api/v1/documents/* endpoints. Add unit coverage for the new route (success case, invalid limit, and 404 for missing document) to prevent regressions.

Copilot · 2026-03-02T20:45:53Z

src/core/tags.ts

+  const limit = maxSuggestions ?? 5;
+


maxSuggestions is used directly to compute limit, but callers can pass NaN, 0, negative numbers, or very large values (e.g., REST/CLI parseInt failures). This can lead to surprising behavior (e.g., slice(0, NaN) returns an empty list). Validate maxSuggestions inside suggestTags (finite integer, >= 1, and ideally a reasonable max) and throw ValidationError when invalid.

Suggested change

-  const limit = maxSuggestions ?? 5;
+  const defaultLimit = 5;
+  const maxAllowedLimit = 100;
+  let limit: number;
+  if (maxSuggestions === undefined) {
+    limit = defaultLimit;
+  } else {
+    if (
+      !Number.isFinite(maxSuggestions) ||
+      !Number.isInteger(maxSuggestions) ||
+      maxSuggestions < 1 ||
+      maxSuggestions > maxAllowedLimit
+    ) {
+      throw new ValidationError(
+        `maxSuggestions must be an integer between 1 and ${maxAllowedLimit}, got ${String(
+          maxSuggestions,
+        )}`,
+      );
+    }
+    limit = maxSuggestions;
+  }
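
To make the "surprising behavior" concrete, here is how Array.prototype.slice treats the kinds of values an unvalidated limit can take. The ranked array is made-up sample data; the slice semantics are standard JavaScript.

```typescript
// Sample ranked terms, best first (illustrative data only).
const ranked = ["sqlite", "react", "hooks", "state"];

// NaN is converted to 0, so the result is empty and no error is raised.
const withNaN = ranked.slice(0, NaN);

// Zero also yields an empty list.
const withZero = ranked.slice(0, 0);

// A negative end counts from the end of the array: this drops the last item
// and returns three suggestions, which is clearly not what "-1" was meant to say.
const withNegative = ranked.slice(0, -1);

// A fractional end is truncated toward zero.
const withFraction = ranked.slice(0, 2.7);

// A very large end is clamped to the array length.
const withHuge = ranked.slice(0, 1e9);

console.log({ withNaN, withZero, withNegative, withFraction, withHuge });
```

None of these cases throws, which is why validating inside suggestTags (rather than relying on every caller) catches the problem in one place.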

Copilot · 2026-03-02T20:45:53Z

src/core/tags.ts

+  // Get all known tags in the system for boosting
+  const knownTags = new Set(
+    (db.prepare("SELECT name FROM tags").all() as Array<{ name: string }>).map((r) => r.name),
+  );


knownTags is built by selecting all tag names on every suggestTags call. If the tag table grows, this becomes an O(totalTags) read per request/tool invocation. Consider querying only the subset relevant to the extracted terms (e.g., SELECT name FROM tags WHERE name IN (...)) or otherwise caching known tag names for the lifetime of the process.

Suggested change

-  // Get all known tags in the system for boosting
-  const knownTags = new Set(
-    (db.prepare("SELECT name FROM tags").all() as Array<{ name: string }>).map((r) => r.name),
-  );
+  // Get known tags among the candidate terms for boosting
+  const candidateTerms = [...tf.keys()];
+  const knownTags =
+    candidateTerms.length === 0
+      ? new Set<string>()
+      : new Set(
+          (
+            db
+              .prepare(
+                `SELECT name FROM tags WHERE name IN (${candidateTerms
+                  .map(() => "?")
+                  .join(", ")})`,
+              )
+              .all(...candidateTerms) as Array<{ name: string }>
+          ).map((r) => r.name),
+        );
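
A caveat with the IN (...) form: a long document can produce thousands of distinct candidate terms, and each one becomes a bound parameter. Assuming the database is SQLite, the number of parameters per statement is limited (SQLITE_MAX_VARIABLE_NUMBER, which defaults to 999 before SQLite 3.32.0 and 32766 from 3.32.0 on). A possible way around that, sketched below, is to build the query in chunks. The helper name buildKnownTagQueries and the chunk size of 500 are assumptions, not part of the PR.

```typescript
interface InQuery {
  sql: string;
  params: string[];
}

// Hypothetical helper: split candidate terms into chunks and build one
// parameterized IN (...) query per chunk.
function buildKnownTagQueries(terms: readonly string[], chunkSize = 500): InQuery[] {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  // De-duplicate so each term is bound at most once.
  const unique = Array.from(new Set(terms));
  const queries: InQuery[] = [];
  for (let i = 0; i < unique.length; i += chunkSize) {
    const params = unique.slice(i, i + chunkSize);
    const placeholders = params.map(() => "?").join(", ");
    queries.push({ sql: `SELECT name FROM tags WHERE name IN (${placeholders})`, params });
  }
  return queries;
}
```

Each entry could then be executed in the same style as the suggestion, db.prepare(q.sql).all(...q.params), with the resulting names merged into one Set. An empty term list yields no queries, which covers the candidateTerms.length === 0 case.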

Copilot · 2026-03-02T20:45:54Z

src/core/tags.ts

+  "its",
+  "our",
+  "their",
+  "your",
+  "his",
+  "her",
+  "our",
+  "out",
+  "then",
+  "there",
+  "these",
+  "those",
+  "them",
+  "they",
+  "you",
+  "your",
+  "only",


STOPWORDS includes duplicates (e.g., "our" and "your" appear twice). This is harmless at runtime because it's a Set, but it adds noise and makes future maintenance harder. Remove the duplicates to keep the list clean.
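
Duplicates like these are easy to miss by eye because the Set constructor drops them silently. One way to catch them, as a sketch, is to keep the words in a plain array and check it in a unit test. The findDuplicates helper is hypothetical; the excerpt array is the portion of the list quoted in the diff above.

```typescript
// Hypothetical helper: return each word that occurs more than once,
// in the order its first repeat is encountered.
function findDuplicates(words: readonly string[]): string[] {
  const seen = new Set<string>();
  const dupes = new Set<string>();
  for (const w of words) {
    if (seen.has(w)) dupes.add(w);
    else seen.add(w);
  }
  return Array.from(dupes);
}

// The excerpt of the stopword list shown in the diff hunk.
const excerpt = [
  "its", "our", "their", "your", "his", "her", "our", "out", "then",
  "there", "these", "those", "them", "they", "you", "your", "only",
];

console.log(findDuplicates(excerpt)); // [ 'our', 'your' ]
```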

Copilot · 2026-03-02T20:45:54Z

src/cli/index.ts

+  .action((documentId: string, opts: { limit: string }) => {
+    const { db } = initializeApp();
+    try {
+      const limit = parseInt(opts.limit, 10);


This command parses --limit with parseInt directly, but the CLI already has parseIntOption() which validates numeric options and exits with a clear error. Using raw parseInt here allows NaN (or negative/zero) to flow into suggestTags, producing confusing results. Use parseIntOption(opts.limit, "--limit") (and enforce >= 1) for consistent CLI behavior.

Suggested change

-      const limit = parseInt(opts.limit, 10);
+      const limit = parseIntOption(opts.limit, "--limit");
+      if (limit < 1) {
+        console.error("--limit must be at least 1");
+        process.exitCode = 1;
+        return;
+      }

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings March 2, 2026 20:40

Copilot started reviewing on behalf of RobertLD March 2, 2026 20:41

feat: auto-suggest tags based on content analysis (#243)

43146e4

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

RobertLD force-pushed the feat/auto-suggest-tags branch from f93c920 to 43146e4 Compare March 2, 2026 20:47

RobertLD merged commit da158ff into main Mar 2, 2026
4 of 7 checks passed

This was referenced Mar 2, 2026

chore(main): release 1.3.0 #238

Closed

chore(development): release 1.3.0 #346

Closed

chore(development): release 1.3.0 #355

Closed

chore(main): release 1.3.0 #359

Merged
