
docs(v0.4.0): align README + frozen contract + protocol-comparison with reality #63

Merged
VirusAlex merged 1 commit into main from docs/v040-docs-alignment on May 1, 2026

Conversation

@VirusAlex
Owner

Summary

Closes audit findings HIGH H9/H10/H11/H12/H13 + the medium "one whitelisted code path" claim. Final of 5 PRs gating v0.4.0; pure docs.

After this lands, the Blocker / High / Medium tiers of the v1.0.0 audit are fully closed.

Changes

README

  • Endpoint table replaces the partial 5-route bullet list — pre-v0.4.0 the README listed /api/health, /api/browse, /api/manifest, /api/transfers, /ws/progress and stopped. Anyone evaluating attack surface saw half the picture. Now: full table of 16 routes with auth posture and one-line purpose.
  • Hash claims now match the implementation. Per-chunk = XXH3-128 (wire). Full-file finalize = SHA-256. The historic JSON field name hashHex is documented as kept-for-v0.x-stability, with a note that the rename to sha256Hex is for v1.x.
  • Sidecar contents correctly list FOUR files (added chunks.hashes, the per-chunk-XXH3 file used by the selective re-verify path since v0.2.5).
  • "Tested on Linux" — replaced the "runs on macOS and Windows too" claim that wasn't backed by CI with an honest note that Linux is the only platform under CI / shipped images.
  • Whitelist count fixed: was "one whitelisted code path", actually four classes (SafeFileOps, FileFinalizer, JsonJobStore, FileSidecarStore). New banned APIs (FileChannel.truncate, TRUNCATE_EXISTING) listed. Already covered in the foundation PR but the prose is now consistent.
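
For readers writing a non-UI client, the full-file layer of the hash description can be sketched with the JDK alone. This is a minimal illustration, not the project's code: the class name `FinalizeHashSketch` and its methods are hypothetical, and the lowercase hex output and case-insensitive comparison against `hashHex` are assumptions. The per-chunk XXH3-128 layer is left out because the JDK ships no XXH3 implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: computes the SHA-256 a client could compare against
// the "hashHex" field (which, per this PR, carries SHA-256 despite its name).
final class FinalizeHashSketch {
    private FinalizeHashSketch() {}

    /** Streams the input through SHA-256 and returns lowercase hex. */
    static String sha256Hex(InputStream in) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // Every compliant JRE is required to provide SHA-256.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
        byte[] buffer = new byte[64 * 1024];
        int n;
        while ((n = in.read(buffer)) != -1) {
            digest.update(buffer, 0, n);
        }
        return toHex(digest.digest());
    }

    /** Convenience overload for in-memory data. */
    static String sha256Hex(byte[] data) {
        try {
            return sha256Hex(new ByteArrayInputStream(data));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Assumption: hex case is not significant when comparing digests. */
    static boolean matches(String expectedHashHex, String actualSha256Hex) {
        return expectedHashHex != null && expectedHashHex.equalsIgnoreCase(actualSha256Hex);
    }

    private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }
}
```

A caller would typically pass `Files.newInputStream(path)` for the finalized file and feed the result to `matches` together with the `hashHex` value from the job JSON.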

tasks/contracts/data-formats.md

  • Lowercase enums everywhere — pre-v0.4.0 examples showed "protocol": "TCP", "state": "RUNNING" etc; the actual wire format has been lowercase since v0.3.0 (commit 81ce723). Anyone implementing a parser by reading this contract was getting wrong info.
  • TCP wire = v2 — added DATA_END_V2 (frame type 0x09) carrying the trailer XXH3-128. Documents the negotiated-min handshake (server is v2, downgrades cleanly to v1 against legacy clients) and the single-pass vs two-pass trade-off.
  • schemaVersion field added to JobState and SidecarMeta examples + a top-of-file note explaining read-side reject semantics.
  • New endpoints that grew during 0.x: /api/manifest/register, /api/browse/stats, /api/peer/info, /api/metrics, the acknowledgeOverwrite gate on POST /api/transfers, the per-job metrics block in GET /api/transfers/{id}, and TransferRegistered / TransferDismissed WS events.
  • New section on chunks.hashes (previously undocumented).
  • HELLO-watchdog timeout (30s) and stats-walk depth cap (32) documented.
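
Two of the contract points above, the negotiated-min version handshake and the lowercase enum wire format, can be illustrated with a short sketch. Everything named here is hypothetical (`WireContractSketch`, `negotiateVersion`, `parseLowercase`, and the one-constant `Protocol` enum); only the frame type 0x09, the server version 2, and the lowercase `"tcp"` value come from this PR. Rejecting non-lowercase input is an assumption about how strict a parser should be, not a statement about the real implementation.

```java
import java.util.Locale;

// Hypothetical sketch of two data-formats.md rules; not the project's code.
final class WireContractSketch {
    private WireContractSketch() {}

    /** Server speaks TCP wire v2 as of this PR. */
    static final int SERVER_PROTOCOL_VERSION = 2;

    /** Frame type of DATA_END_V2, which carries the trailer XXH3-128. */
    static final int FRAME_DATA_END_V2 = 0x09;

    /** Only the value named in this PR; the real enum may have more constants. */
    enum Protocol { TCP }

    /** Negotiated-min: both sides settle on the lower of the two versions. */
    static int negotiateVersion(int serverVersion, int clientVersion) {
        if (serverVersion < 1 || clientVersion < 1) {
            throw new IllegalArgumentException("protocol versions start at 1");
        }
        return Math.min(serverVersion, clientVersion);
    }

    /** v2 (single-pass) is the first version that sends DATA_END_V2. */
    static boolean usesDataEndV2(int negotiatedVersion) {
        return negotiatedVersion >= 2;
    }

    /**
     * Maps a lowercase wire value such as "tcp" to a Java enum constant.
     * Assumption: anything that is not already lowercase is rejected.
     */
    static <E extends Enum<E>> E parseLowercase(Class<E> type, String wire) {
        if (!wire.equals(wire.toLowerCase(Locale.ROOT))) {
            throw new IllegalArgumentException("enum values are lowercase on the wire: " + wire);
        }
        // Enum.valueOf throws IllegalArgumentException for unknown names.
        return Enum.valueOf(type, wire.toUpperCase(Locale.ROOT));
    }
}
```

With these definitions, a v2 server talking to a legacy v1 client negotiates version 1 and falls back to the two-pass path, while two v2 peers stay on the single-pass path that ends in the 0x09 frame.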

docs/protocol-comparison.md

  • Removed the _TBD_ results table that the V5 verify task never filled in. The README links here as the source of truth; landing on a page that says "TBD" undermines confidence.
  • Replaced with honest design notes explaining where each protocol wins, plus a manual reproduction guide. Acknowledges that real-world LAN throughput is dominated by disk speed regardless of protocol choice — the Performance modal already exposes per-chunk timings users need to identify their own bottleneck.

Test plan

  • Local mvn compile clean; mvn test -Dtest=ArchitectureTest 8/8 pass.
  • CI green on Linux.
  • Manual: cross-check every endpoint in the new table exists in App.java (or its route registrations).
  • Manual: render the README in the GitHub UI and confirm the table is readable.

🤖 Generated with Claude Code

…th reality

Closes audit findings HIGH H9/H10/H11/H12/H13 + medium "one whitelisted code
path" claim. Final of 5 PRs gating v0.4.0; pure docs.

README
- Replaced the partial endpoint bullet list (5 routes) with a complete table
  of 16 routes — the prior list was missing /api/peer/info, the transfer
  lifecycle endpoints, /api/hash, /api/browse-local, /api/browse/stats,
  /api/manifest/register, /api/relay/push, and /api/metrics. Anyone reasoning
  about attack surface or writing a non-UI client now sees the full picture.
- Replaced the single-paragraph hash claim ("xxh3-128") with a two-layer
  description that matches the implementation: per-chunk wire verification is
  XXH3-128, full-file finalize is SHA-256, and the JSON field name kept its
  historical "hashHex" identifier for v0.x wire-format stability (rename
  scheduled for v1.x).
- Sidecar contents now correctly list FOUR files: data.partial, meta.json,
  chunks.bitmap, AND chunks.hashes (the per-chunk XXH3 file added in v0.2.5
  for selective re-verify on full-file mismatch). The README pre-v0.4.0
  showed three.
- "Tested on Linux. Runs on macOS and Windows too" rephrased to honestly
  reflect the reality: Linux is the only platform under CI / shipped images;
  macOS and Windows are best-effort with documented quirks.

tasks/contracts/data-formats.md
- All persistent-state JSON examples now show LOWERCASE enums
  ("inbound" / "tcp" / "running" / "skip"), matching the wire format since
  v0.3.0. Pre-v0.4.0 examples were uppercase, contradicting every actual
  state file.
- TCP wire-protocol section updated to v2: documents the new DATA_END_V2
  frame (0x09) carrying the trailer XXH3-128, explains the negotiated-min
  protocol-version handshake, and describes the v1 (two-pass) vs v2
  (single-pass) trade-offs.
- Added schemaVersion field to JobState and SidecarMeta examples + a top-of-
  file note explaining read-side reject semantics.
- Added new endpoints that grew during 0.x: /api/manifest/register,
  /api/browse/stats, /api/peer/info, /api/metrics, the
  acknowledgeOverwrite gate on POST /api/transfers, the per-job metrics
  block in GET /api/transfers/{id}, and TransferRegistered /
  TransferDismissed WS events.
- New section on chunks.hashes (previously undocumented).
- Stats-walk depth cap (32) and HELLO-watchdog timeout (30s) documented.

docs/protocol-comparison.md
- Removed the "_TBD_" results table that the V5 verify task never filled
  in. The README links to this doc; landing on a page that says "TBD"
  undermines confidence.
- Replaced with honest design notes describing where each protocol wins,
  acknowledging that real-world LAN throughput is dominated by the slower
  of the two disks (source HDD seek + receiver fsync) regardless of
  protocol choice. Manual reproduction steps preserved.

Local mvn test: ArchitectureTest 8/8 pass; compile clean. Linux CI handles
the Jetty server-tier suites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@VirusAlex VirusAlex merged commit 500ce4f into main May 1, 2026
1 check passed
@VirusAlex VirusAlex deleted the docs/v040-docs-alignment branch May 1, 2026 07:03