fix: address security, performance, and documentation audit findings #368

Merged
RobertLD merged 4 commits into development from fix/audit-findings on Mar 6, 2026
@RobertLD RobertLD commented Mar 6, 2026

Summary

Addresses findings from a comprehensive security, performance, and documentation audit of the development branch. All 1089 tests pass and TypeScript compiles cleanly.

Security fixes (closes #364)

  • FTS5 query injection — Added sanitizeFtsWord() helper that strips column-filter syntax, prefix/suffix wildcards, and standalone FTS5 operators (NEAR, AND, OR, NOT) before building queries. Applied to both AND and OR fallback paths.
  • CORS wildcard default — Changed default from ["*"] to ["http://localhost", "http://localhost:3000"]. Callers can configure corsOrigins explicitly for custom setups.
  • Webhook secrets at rest — AES-256-GCM encryption via LIBSCOPE_SECRET_KEY env var. Graceful fallback to plaintext if key not set (with warning log). Backward compatible with existing plaintext secrets.
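
A minimal sketch of what the FTS5 sanitization step could look like, for reviewers who want a concrete picture. The actual sanitizeFtsWord() body is not shown in this description, so the column-filter handling (keeping only the text after the last colon) and the null return for dropped tokens are assumptions. Asterisks are stripped with an index scan rather than a regex.

```typescript
// Hypothetical sketch; the real sanitizeFtsWord() in this PR may differ.
// FTS5 treats these barewords as operators only in upper case,
// so this sketch matches the upper-case spellings only.
const FTS5_OPERATORS = new Set(["AND", "OR", "NOT", "NEAR"]);

// Strip leading/trailing '*' with a linear index scan (no regex backtracking).
function stripAsterisks(word: string): string {
  let start = 0;
  let end = word.length;
  while (start < end && word[start] === "*") start++;
  while (end > start && word[end - 1] === "*") end--;
  return word.slice(start, end);
}

// Returns a safe token, or null when nothing usable is left.
function sanitizeFtsWord(word: string): string | null {
  // Assumption: a column filter such as "title:foo" is reduced to "foo".
  const withoutColumn = word.slice(word.lastIndexOf(":") + 1);
  const token = stripAsterisks(withoutColumn);
  if (token.length === 0 || FTS5_OPERATORS.has(token)) return null;
  return token;
}

// Example: build an AND query from raw user input.
function buildAndQuery(input: string): string {
  return input
    .split(/\s+/)
    .map(sanitizeFtsWord)
    .filter((t): t is string => t !== null)
    .join(" AND ");
}
```

With this sketch, `buildAndQuery("title:foo* OR bar")` yields `foo AND bar`: the column filter, the wildcard, and the standalone operator are all dropped.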

Performance fixes (closes #365)

  • Missing DB indexes (migration v16) — idx_documents_content_hash eliminates full table scans on every dedup check; composite idx_chunks_doc_idx ON chunks(document_id, chunk_index) speeds up context chunk range queries
  • SQLite pragmas: synchronous = NORMAL (safe with WAL, 2–3× faster writes), cache_size = -32000 (32MB), temp_store = MEMORY
  • 9× ANN over-fetch fixed: vectorSearch was applying an inner * 3 on top of the caller's * 3, totalling 9× over-fetch. Removed inner multiplier; effective fetch is now (offset + limit) * 3 (capped at 5000).
  • Ratings query deferred: AVG(rating) subquery no longer inlined in all 4 search paths. When minRating filter is not set, ratings are fetched post-pagination in a single batch query on the final result set (typically 10 docs).
  • getStats() combined — 5 sequential COUNT(*) queries merged into one subquery-based SELECT
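
The over-fetch arithmetic above can be sketched as follows. The function and constant names are hypothetical; only the 3× multiplier and the 5000 cap come from this description, and the "before" function ignores any cap because the old limit is not stated here.

```typescript
// Hypothetical names; only the multiplier (3) and the cap (5000) are from the PR text.
const OVERFETCH_MULTIPLIER = 3;
const MAX_CANDIDATES = 5000;

// Before: the caller multiplied by 3 and vectorSearch multiplied by 3 again.
function oldFetchSize(offset: number, limit: number): number {
  return (offset + limit) * OVERFETCH_MULTIPLIER * OVERFETCH_MULTIPLIER;
}

// After: a single 3x multiplier, capped at 5000 candidates.
function newFetchSize(offset: number, limit: number): number {
  return Math.min((offset + limit) * OVERFETCH_MULTIPLIER, MAX_CANDIDATES);
}
```

For a first page of 10 results, this is 90 candidates before the fix and 30 after.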

Documentation fixes (closes #367)

  • License mismatch — VitePress footer now correctly says "Business Source License 1.1"
  • MCP tool count — Updated from 17 → 26 to match actual registered tools
  • New: How Search Works (docs/guide/how-search-works.md) — Documents the full hybrid pipeline: query embedding → vector ANN → FTS5 → RRF fusion → title boost → pagination. Includes search methods table, scoreExplanation shape, and tuning reference.
  • New: Troubleshooting (docs/guide/troubleshooting.md) — Common issues: sqlite-vec loading, model download, dimension mismatch, search quality, API auth, database locks
  • Dedup modes documented: dedup (skip | warn | force) and dedupOptions added to MCP tools and REST API reference
  • Sidebar updated — New "Deep Dives" section linking both new pages
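
For readers unfamiliar with the RRF fusion step named in the new "How Search Works" page, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists. The function name and the constant k = 60 (a commonly used default) are assumptions; the values libscope uses are documented in the new guide, not here.

```typescript
// Hypothetical sketch of Reciprocal Rank Fusion; not libscope's actual code.
// Each input list holds document IDs ordered best-first.
function rrfFuse(
  rankedLists: string[][],
  k = 60, // assumption: a commonly used default
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score,
  );
}

// Example: vector results and FTS5 results for the same query.
const fused = rrfFuse([
  ["a", "b", "c"], // vector ANN order
  ["b", "c", "d"], // FTS5 order
]);
```

Documents that appear in both lists ("b" and "c") rise above those found by only one method, which is the point of fusing the two rankings.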

Test plan

  • All 1089 tests pass
  • TypeScript: npx tsc --noEmit clean
  • Systematic code review completed by dedicated review agent
  • Migration v16 is idempotent (IF NOT EXISTS)
  • Webhook encryption backward compatible with existing plaintext secrets
  • FTS5 sanitization preserves legitimate queries, strips dangerous tokens

🤖 Generated with Claude Code

Security (closes #364):
- Add sanitizeFtsWord() to strip FTS5 operators, column filters, and
  wildcards before query construction — prevents FTS5 injection
- Change CORS default from ["*"] to localhost-only origins
- Encrypt webhook secrets at rest using AES-256-GCM when
  LIBSCOPE_SECRET_KEY env var is set; graceful plaintext fallback
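
A minimal sketch of AES-256-GCM encryption with a plaintext fallback, using Node's built-in crypto module. The "enc:v1:" prefix, the SHA-256 key derivation, and the function names are assumptions for illustration; the storage format used by this PR is not shown here.

```typescript
import { createCipheriv, createDecipheriv, createHash, randomBytes } from "node:crypto";

// Hypothetical marker that distinguishes encrypted values from legacy plaintext.
const PREFIX = "enc:v1:";

// Assumption: derive a 32-byte key by hashing the env var value.
function deriveKey(secretKey: string): Buffer {
  return createHash("sha256").update(secretKey, "utf8").digest();
}

function encryptSecret(plaintext: string, secretKey: string | undefined): string {
  if (!secretKey) {
    console.warn("LIBSCOPE_SECRET_KEY is not set; storing webhook secret in plaintext");
    return plaintext;
  }
  const iv = randomBytes(12); // 96-bit nonce, fresh per encryption
  const cipher = createCipheriv("aes-256-gcm", deriveKey(secretKey), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag(); // 16 bytes by default
  return PREFIX + Buffer.concat([iv, tag, ciphertext]).toString("base64");
}

function decryptSecret(stored: string, secretKey: string | undefined): string {
  if (!stored.startsWith(PREFIX)) return stored; // legacy plaintext secret
  if (!secretKey) throw new Error("LIBSCOPE_SECRET_KEY is required to decrypt this secret");
  const raw = Buffer.from(stored.slice(PREFIX.length), "base64");
  const iv = raw.subarray(0, 12);
  const tag = raw.subarray(12, 28);
  const ciphertext = raw.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", deriveKey(secretKey), iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString("utf8");
}
```

Checking for a prefix on read is one way to stay backward compatible: rows written before the key was configured are returned unchanged.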

Performance (closes #365):
- Add migration v16: idx_documents_content_hash and idx_chunks_doc_idx
  indexes — eliminates full table scans on dedup and context fetching
- Add SQLite pragmas: synchronous=NORMAL, cache_size=32MB, temp_store=MEMORY
- Remove 9x ANN over-fetch in vectorSearch (was 3x * 3x, now 3x total)
- Defer ratings AVG() join to post-pagination attachRatings() batch query
- Combine getStats() 5 sequential COUNTs into a single subquery SELECT
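
A rough sketch of the deferred ratings lookup: one batch query for the final page of results, then an in-memory merge. The table name `ratings`, the column names, and both function signatures are assumptions; only the idea of a single post-pagination AVG(rating) query comes from this commit.

```typescript
// Hypothetical schema: ratings(document_id, rating). Not taken from the PR.
interface RatingRow {
  document_id: string;
  avg_rating: number;
}

// Build one parameterised query for the whole page of results.
function buildRatingsQuery(documentIds: string[]): { sql: string; params: string[] } | null {
  if (documentIds.length === 0) return null;
  const placeholders = documentIds.map(() => "?").join(", ");
  const sql =
    "SELECT document_id, AVG(rating) AS avg_rating FROM ratings " +
    `WHERE document_id IN (${placeholders}) GROUP BY document_id`;
  return { sql, params: documentIds };
}

// Merge the batch rows back onto the paginated results; unrated docs get null.
function attachRatings<T extends { documentId: string }>(
  results: T[],
  rows: RatingRow[],
): Array<T & { avgRating: number | null }> {
  const byId = new Map(rows.map((r) => [r.document_id, r.avg_rating] as const));
  return results.map((r) => ({ ...r, avgRating: byId.get(r.documentId) ?? null }));
}
```

Because the query runs only on the final page (typically 10 documents), its cost no longer scales with the candidate set.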

Documentation (closes #367):
- Fix VitePress footer license: MIT → Business Source License 1.1
- Fix MCP tool count on homepage: 17 → 26
- Add docs/guide/how-search-works.md: hybrid RRF pipeline, search methods,
  scoreExplanation, and tuning options
- Add docs/guide/troubleshooting.md: common issues and solutions
- Document dedup modes and scoreExplanation in MCP tools reference
- Add new pages to VitePress sidebar

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

RobertLD and others added 3 commits March 5, 2026 21:38
- url-fetcher.ts: move codeql[js/request-forgery] suppression comment
  onto the fetch() line itself so CodeQL recognises it as intentional
  (SSRF is mitigated by validateUrl() + DNS rebinding checks above)
- confluence.ts: cap [^>]* to [^>]{0,500} in ri:attachment regex to
  prevent polynomial ReDoS on maliciously crafted Confluence markup

Both alerts were pre-existing on the development branch (detected
2026-03-04/05) and are not introduced by this PR.
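
To illustrate the bounded-quantifier change, here is a sketch of an attachment-filename match with capped repetition. The pattern below is hypothetical; the real regex in confluence.ts is not reproduced in this commit message, and only the `[^>]{0,500}` cap comes from it.

```typescript
// Hypothetical pattern for illustration; not the regex from confluence.ts.
// Bounded repetition limits how far the engine can scan from each start position.
const ATTACHMENT_RE = /<ri:attachment[^>]{0,500}ri:filename="([^"]{0,500})"/;

function extractAttachmentFilename(markup: string): string | null {
  const match = ATTACHMENT_RE.exec(markup);
  return match?.[1] ?? null;
}

// Example input in Confluence storage format.
const filename = extractAttachmentFilename(
  '<ac:image><ri:attachment ri:filename="diagram.png" /></ac:image>',
);
```

With an unbounded `[^>]*`, input made of many repeated `<ri:attachment` prefixes and no closing `>` forces a rescan to the end of the string from every start position; the cap bounds each attempt to 500 characters.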

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Prettier moved the inline comment inside the object literal on the
previous commit; reposition it as a line comment directly above the
fetch() call so CodeQL recognises the suppression.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

/^\*+|\*+$/g triggers CodeQL js/polynomial-redos on strings with
many consecutive '*' characters. Replace with a simple while-loop
index scan that strips leading/trailing asterisks without regex
backtracking.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@RobertLD RobertLD merged commit 3cb4cdb into development Mar 6, 2026
9 checks passed
@RobertLD RobertLD deleted the fix/audit-findings branch March 6, 2026 02:43