Skip to content

fix(score): filter URLs, env-vars, Windows paths, and HTTP routes from grounding checker#168

Merged
stackbilt-admin merged 2 commits into
mainfrom
fix/grounding-path-false-positives
May 23, 2026
Merged

fix(score): filter URLs, env-vars, Windows paths, and HTTP routes from grounding checker#168
stackbilt-admin merged 2 commits into
mainfrom
fix/grounding-path-false-positives

Conversation

@stackbilt-admin
Copy link
Copy Markdown
Member

Summary

  • looksLikePath() now rejects env-var assignments (FOO=...), Windows drive-letter paths (C:/...), multi-segment HTTP route paths (/api/v1/users), and hostname:port strings
  • Raw prose regex extractor now requires an explicit .//../ prefix or a known file extension — eliminates Windows path fragments and cross-repo noise from freeform text
  • resolveReferencedPath() uses path.isAbsolute() alongside startsWith('/') for cross-platform correctness
  • Adds one new regression test covering all four false-positive categories

What's still a known gap (follow-up)

Cross-repo references with known extensions (e.g. `stackbilt_llc/policies/oss-infrastructure-update-policy.md` in CLAUDE.md) are still caught by the backtick extractor and marked broken. A config-based exclusion mechanism (groundingExcludePatterns in .charter/config.json) is the right fix and will be addressed in a separate issue.

Test plan

  • pnpm test — 491/491 pass (1 new test added)
  • New test covers: env-var assignment, Windows absolute path, HTTP routes, URL-only link, hostname:port
  • Existing score tests (A-grade repo, plain repo, stale config, custom ai-dir) all still pass

Closes #164, #163

🤖 Generated with Claude Code

…m grounding checker

The path reference extractor was producing false positives for:
- Env-var assignments in code blocks (FOO=http://...)
- Windows drive-letter absolute paths (C:/Users/...)
- HTTP API route paths in tables (/api/v1/users)
- hostname:port strings (localhost:3000)
- Windows path fragments captured by the raw prose regex

Three changes:
1. `looksLikePath()`: add guards for env-var `=`, Windows drive letters,
   multi-segment `/`-prefixed routes with no extension, and hostname:port.
2. `extractPathCandidates()`: tighten the raw prose regex extractor to only
   include candidates that have an explicit `./`/`../` prefix or a known file
   extension, eliminating Windows path fragments and cross-repo noise.
3. `resolveReferencedPath()`: use `path.isAbsolute()` alongside `startsWith('/')`
   for cross-platform correctness.

Adds a regression test covering all false-positive categories.

Closes #164, #163

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Comment thread packages/cli/src/commands/score.ts Outdated
for (const match of content.matchAll(/(^|[\s(])((?:\.{1,2}\/)?(?:[A-Za-z0-9._-]+\/)*[A-Za-z0-9._-]+(?:\.[A-Za-z0-9._-]+)?)/gm)) {
const candidate = normalizePathCandidate(match[2]);
const ext = path.posix.extname(candidate).toLowerCase();
if (!candidate.startsWith('./') && !candidate.startsWith('../') && !KNOWN_PATH_EXTENSIONS.has(ext)) continue;
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This gate looks right in intent, but it currently never sees raw ./ candidates because normalizePathCandidate() strips a leading ./ before this check runs. That means prose refs like ./docs/runbooks (no known extension) get dropped and wont be checked as broken paths. Consider checking for the .//../ prefix before normalization, or preserving a pre-normalized copy for this predicate.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e9044cf — now capture raw = match[2] before normalization and use it for the .//../ prefix check. A new regression test (./docs/runbooks with no extension, missing file → must appear in broken-paths list) confirms the behavior. Thanks for the catch.

…te strips it

normalizePathCandidate() removes the leading ./ from path candidates.
The gate added in the previous commit checked the already-normalized string,
so ./docs/runbooks (no extension) became docs/runbooks, failed the
startsWith('./') check, and was silently dropped — never flagged as a
broken reference even though the intent was unambiguous.

Fix: capture the raw regex match before normalization and use it for the
./ / ../ prefix predicate; apply normalization only for the looksLikePath
check and set insertion.

Adds a regression test: ./docs/runbooks (no extension, missing file) must
appear in the broken-paths list.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@stackbilt-admin stackbilt-admin merged commit 680cf78 into main May 23, 2026
3 checks passed
@stackbilt-admin stackbilt-admin deleted the fix/grounding-path-false-positives branch May 23, 2026 09:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug(score): grounding checker counts URLs, env vars, and HTTP paths as broken file references

1 participant