Skip to content

fix: comprehensive security and robustness hardening for registry feature#407

Merged
RobertLD merged 14 commits intomainfrom
fix/registry-audit-hardening
Mar 18, 2026
Merged

fix: comprehensive security and robustness hardening for registry feature#407
RobertLD merged 14 commits intomainfrom
fix/registry-audit-hardening

Conversation

@RobertLD
Copy link
Copy Markdown
Owner

Summary

Full security and robustness audit of the git-based pack registry feature, addressing 20+ findings across critical, high, and medium severity.

Critical fixes

  • Path traversal prevention: Pack names and versions validated against [a-zA-Z0-9._-] allowlist before use in path.join() — in publish, unpublish, and resolve paths
  • Checksum verification on install: verifyChecksum() now called before installing any registry-resolved pack (was defined but never invoked)
  • Runtime type validation: readIndex() validates each index.json entry through a type guard instead of blind as PackSummary[] cast

High severity fixes

  • Symlink protection: core.symlinks=false on clone and persisted via git config on fetch
  • Corrupted cache recovery: fetchRegistry() detects non-git cache dirs, removes them, and throws descriptive error
  • Index corruption: Corrupt index.json now throws instead of silently resetting to []
  • File permissions: Config file written with 0o600, directories with 0o700
  • Credential rejection: URLs with embedded user:pass@ or token@ rejected; URLs sanitized in logs
  • Concurrent sync locking: File-based PID lock prevents parallel git operations on same registry

Medium severity fixes

  • SSH URL support: ssh://git@host:port/path format now accepted (was rejected, forcing SCP format which misinterprets port as path)
  • Semver validation: Version strings must match ^\d+\.\d+\.\d+(-[a-zA-Z0-9._-]+)?$
  • Configurable git timeout: LIBSCOPE_GIT_TIMEOUT_MS env var, capped at 5 minutes
  • origin/HEAD fallback: Tries origin/HEADorigin/mainorigin/master
  • Pack size limit: 50MB cap on serialized pack data
  • Index deduplication: Duplicate pack entries collapsed during publish
  • Stale index cache: Cleared after fetch in publish/unpublish paths
  • Publish rollback: Version directory removed on failure (prevents orphaned dirs blocking re-publish)

Low severity fixes

  • URL whitespace trimming and trailing slash normalization
  • Registry name length limits (2-64 chars)
  • Better git error messages (auth, network, timeout diagnostics)
  • deriveNameFromUrl() handles ssh:// URLs correctly

Tests

  • 46 new tests added (155 → 201 total), covering all security fixes
  • New tests/unit/registry/publish.test.ts with 19 tests
  • Extended config, git, and sync test suites

Test plan

  • All 201 registry tests pass (npx vitest run tests/unit/registry/ tests/integration/registry/)
  • TypeScript compiles clean (pre-existing errors in unrelated epub/pptx/rag modules only)
  • QA review completed — 2 FAIL findings fixed, remaining WARNs documented as acceptable risk
  • Manual test: libscope registry add ssh://git@bitbucket:7999/mdog/libscope-registry.git accepts SSH URL

🤖 Generated with Claude Code

RobertLD and others added 10 commits March 16, 2026 19:21
Bitbucket Server uses ssh://git@host:port/path format which was rejected
by the URL validator. Only SCP-style (git@host:path) and https:// were
accepted. This caused users to fall back to the SCP format with a port
number, which git misinterprets as a path — leading to password prompts
instead of SSH key auth.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Trim whitespace and trailing slashes from git URLs before validation
- Reject URLs with embedded credentials (user:pass@host or token@host) and direct users to SSH keys / git credential helpers
- Add sanitizeUrl() helper (exported) that masks credentials in log output
- Enforce min/max length (2–64 chars) on registry names
- Create ~/.libscope/ with mode 0o700 and chmod config.json to 0o600 after write
- Fix deriveNameFromUrl() to handle ssh:// URLs via URL parser before falling back to SCP regex
- Validate derived registry name before proceeding in `registry add`; give clear error with --name hint

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…k, and dedup to registry publish

- validatePathSegment(): rejects empty, path separators, .., null bytes,
  and chars outside [a-zA-Z0-9._-]; called on pack.name and version in
  publishPack(), and on packName + version in unpublishPack()
- validateSemver(): enforces ^\d+\.\d+\.\d+(-[a-zA-Z0-9._-]+)?$ pattern,
  called after final version is determined in publishPack()
- 50 MB pack data size guard (JSON.stringify length) with clear error
- index.json corruption now throws ValidationError instead of silently
  resetting to [], preserving data integrity
- publish operations wrapped in try/catch; versionDir is removed on any
  failure to prevent orphaned directories blocking future publishes
- duplicate index entries deduplicated by pack name (keep first, log warning)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add `verifyResolvedPackChecksum()` to `src/registry/resolve.ts` that reads
the expected SHA-256 from the pack manifest and calls `verifyChecksum()` on
the cached data file before any installation. The CLI now calls this function
immediately before both `installPack()` sites that use a registry-resolved
data path, ensuring tampered or corrupted packs are rejected with a clear error.

Update `tests/integration/registry/registry-conflict.test.ts` to write real
SHA-256 checksums in `createBareRepoWithPacks` instead of the "placeholder"
string, so the integration tests remain valid under the new verification.

Add 5 unit tests for `verifyResolvedPackChecksum` covering: matching checksum
(passes), tampered file (throws), missing manifest (skips with warning),
empty checksum in manifest (throws), and version not found in manifest (skips).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Runtime validation of index.json entries (type guard for PackSummary)
- Symlink protection on git clone (core.symlinks=false)
- Corrupted cache detection and auto-recovery in fetchRegistry
- File-based sync locking to prevent concurrent operations
- Configurable git timeout via LIBSCOPE_GIT_TIMEOUT_MS env var
- origin/HEAD resolution fallback (tries main, then master)
- Better diagnostic messages for git auth/network/timeout errors

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
QA FAIL fixes:
- resolve.ts: validate packName and version with validatePathSegment
  before path construction (prevents path traversal via malicious index.json)
- publish.ts: clear index cache after fetchRegistry in publish/unpublish
  (prevents stale index reads within same process)
- publish.ts: export validatePathSegment for use by resolve.ts

QA WARN fixes:
- git.ts: persist core.symlinks=false via git config in fetchRegistry
  (not just at clone time via -c flag)

New tests (46 added, 201 total):
- config.test.ts: name length limits, URL trimming, credential rejection,
  ssh:// URLs, sanitizeUrl, file permissions
- git.test.ts: runtime type validation, symlink protection, cache corruption,
  HEAD fallback, timeout config
- publish.test.ts: path traversal, semver, corrupt index, size limit,
  dedup, rollback, re-publish
- sync.test.ts: locking, stale lock cleanup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown

vercel bot commented Mar 17, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
libscope Ignored Ignored Preview Mar 18, 2026 3:09pm

RobertLD and others added 4 commits March 17, 2026 14:41
…rdening

# Conflicts:
#	src/registry/config.ts
- Replace || with ?? for nullish coalescing in GIT_TIMEOUT_MS (src/registry/git.ts:19)
- Remove async keyword from isNeuralModelAvailable since it has no await expressions (tests/integration/retrieval-quality.test.ts:513)

These changes fix the lint-and-typecheck job failures on PR #407.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
@RobertLD RobertLD merged commit cab4baf into main Mar 18, 2026
9 checks passed
@RobertLD RobertLD deleted the fix/registry-audit-hardening branch March 18, 2026 15:27
RobertLD added a commit that referenced this pull request Mar 18, 2026
* fix: accept ssh:// protocol URLs in registry URL validation

Bitbucket Server uses ssh://git@host:port/path format which was rejected
by the URL validator. Only SCP-style (git@host:path) and https:// were
accepted. This caused users to fall back to the SCP format with a port
number, which git misinterprets as a path — leading to password prompts
instead of SSH key auth.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: harden registry config validation and file security

- Trim whitespace and trailing slashes from git URLs before validation
- Reject URLs with embedded credentials (user:pass@host or token@host) and direct users to SSH keys / git credential helpers
- Add sanitizeUrl() helper (exported) that masks credentials in log output
- Enforce min/max length (2–64 chars) on registry names
- Create ~/.libscope/ with mode 0o700 and chmod config.json to 0o600 after write
- Fix deriveNameFromUrl() to handle ssh:// URLs via URL parser before falling back to SCP regex
- Validate derived registry name before proceeding in `registry add`; give clear error with --name hint

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add path traversal validation, semver check, size limit, rollback, and dedup to registry publish

- validatePathSegment(): rejects empty, path separators, .., null bytes,
  and chars outside [a-zA-Z0-9._-]; called on pack.name and version in
  publishPack(), and on packName + version in unpublishPack()
- validateSemver(): enforces ^\d+\.\d+\.\d+(-[a-zA-Z0-9._-]+)?$ pattern,
  called after final version is determined in publishPack()
- 50 MB pack data size guard (JSON.stringify length) with clear error
- index.json corruption now throws ValidationError instead of silently
  resetting to [], preserving data integrity
- publish operations wrapped in try/catch; versionDir is removed on any
  failure to prevent orphaned directories blocking future publishes
- duplicate index entries deduplicated by pack name (keep first, log warning)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: verify pack checksum before installing from registry cache

Add `verifyResolvedPackChecksum()` to `src/registry/resolve.ts` that reads
the expected SHA-256 from the pack manifest and calls `verifyChecksum()` on
the cached data file before any installation. The CLI now calls this function
immediately before both `installPack()` sites that use a registry-resolved
data path, ensuring tampered or corrupted packs are rejected with a clear error.

Update `tests/integration/registry/registry-conflict.test.ts` to write real
SHA-256 checksums in `createBareRepoWithPacks` instead of the "placeholder"
string, so the integration tests remain valid under the new verification.

Add 5 unit tests for `verifyResolvedPackChecksum` covering: matching checksum
(passes), tampered file (throws), missing manifest (skips with warning),
empty checksum in manifest (throws), and version not found in manifest (skips).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: harden registry git operations and sync locking

- Runtime validation of index.json entries (type guard for PackSummary)
- Symlink protection on git clone (core.symlinks=false)
- Corrupted cache detection and auto-recovery in fetchRegistry
- File-based sync locking to prevent concurrent operations
- Configurable git timeout via LIBSCOPE_GIT_TIMEOUT_MS env var
- origin/HEAD resolution fallback (tries main, then master)
- Better diagnostic messages for git auth/network/timeout errors

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address QA findings and add comprehensive audit tests

QA FAIL fixes:
- resolve.ts: validate packName and version with validatePathSegment
  before path construction (prevents path traversal via malicious index.json)
- publish.ts: clear index cache after fetchRegistry in publish/unpublish
  (prevents stale index reads within same process)
- publish.ts: export validatePathSegment for use by resolve.ts

QA WARN fixes:
- git.ts: persist core.symlinks=false via git config in fetchRegistry
  (not just at clone time via -c flag)

New tests (46 added, 201 total):
- config.test.ts: name length limits, URL trimming, credential rejection,
  ssh:// URLs, sanitizeUrl, file permissions
- git.test.ts: runtime type validation, symlink protection, cache corruption,
  HEAD fallback, timeout config
- publish.test.ts: path traversal, semver, corrupt index, size limit,
  dedup, rollback, re-publish
- sync.test.ts: locking, stale lock cleanup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address linting and typecheck CI failures

- Replace || with ?? for nullish coalescing in GIT_TIMEOUT_MS (src/registry/git.ts:19)
- Remove async keyword from isNeuralModelAvailable since it has no await expressions (tests/integration/retrieval-quality.test.ts:513)

These changes fix the lint-and-typecheck job failures on PR #407.

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* chore: fix code formatting via prettier

* chore: fix formatting on all registry files via prettier

* fix: handle gzip-compressed pack files in registry publish

readPackJson() was reading files as UTF-8 text, which corrupts gzip
binary data and produces "is not valid JSON" errors when publishing
.json.gz pack files. Now reads as raw Buffer and auto-detects gzip
via magic bytes (0x1f 0x8b), matching the existing pattern in
src/core/packs.ts:readPackFile.

Also adds project CLAUDE.md with agent guidelines including this
pack file I/O pattern.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: comprehensive CLAUDE.md with full project guidelines

Covers: build/test commands, CI expectations, code style (strict TS,
no-any, prettier config), project structure, error hierarchy, database
schema/migrations, testing patterns and fixtures, pack file gzip
handling, registry system, logging, and configuration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant