HFS+ B-tree fork grow-on-full (§4b A/B/C) + classic-HFS fsck_hfs fix#48
Merged
Conversation
…OS fsck_hfs Implements the deferred §4b of docs/hfsplus_btree_growth_plan.md: the catalog, extents-overflow, and attributes B-trees now grow their backing fork when they run out of nodes (grow_btree_fork) instead of returning DiskFull, so an under-sized or foreign catalog fills in place. Phase A is contiguous tail growth; Phase B spills the 9th+ extent into the extents-overflow B-tree via a new overflow-aware write_fork_data_with_overflow; Phase C appends map nodes past the (node_size-256)*8 header-bitmap cap. Doing this against real macOS fsck_hfs surfaced two foundational blank-builder bugs that our own checker missed and that affected every HFS+ image we build from scratch (blanks and the defrag/clone path used by restore + HFS-expand): - Invalid BTH length: the B-tree header laid out BTHeaderRec as 128 bytes; Apple requires exactly 106 (record 1 at 120, node bitmap at 248). This also resolves the header-vs-map-node capacity discrepancy — the bitmap now addresses (node_size-256)*8 nodes, matching map_nodes_required. - Empty-tree representation: an empty tree was written depth-1 with an empty root leaf (Apple uses depth-0 / no leaf) and the bitmap didn't reserve the trailing block holding the alternate volume header. btree_insert_full now bootstraps the root leaf on the first insert into a depth-0 tree, and returns a clean DiskFull (pre-split free_nodes >= depth+1 check) so the grow-retry can't half-apply a split and duplicate a record. Adds rb-cli new --fs hfsplus [--case-sensitive] [--min-catalog BYTES] (Phase 0) for GUI/CLI parity and to drive the macOS validation recipe. Every grow path plus the defrag-clone output is fsck_hfs "appears to be OK" and mountable. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
build_file_record (and the in-place write_resource_fork update) wrote the fork's first allocation block into filStBlk (record offset 24) and filRStBlk (offset 34). Those are the File Manager's in-memory "first allocation block" cache and must be 0 on disk — the real fork location lives in the extent record at offset 74 (data) / 86 (resource). A non-zero on-disk value makes Apple fsck_hfs report "Reserved fields in the catalog record have incorrect data" (E_CatalogFlagsNotZero); the check is `if (file->dataStartBlock || file->rsrcStartBlock || file->reserved)`. Every classic-HFS file rusty-backup wrote tripped this. Real Mac OS ignores the on-disk value (it recomputes from the extents) so files still worked, and our own fsck never checked it, so it stayed hidden. Confirmed on real Mac OS X 10.4 (Tiger), which fully supports HFS Standard: a fresh multi-file image is now fsck_hfs "appears to be OK" and mounts with all files intact. The reader uses the extent record, never filStBlk, so zeroing is safe. Covers HFV too. Regression test: test_hfs_file_record_first_alloc_block_fields_are_zero. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Classic-Mac HFS/HFS+ allow '/' in catalog names (it's the Finder's display
of the ':' separator), but rb-cli split every in-image path on '/', so a
name like "Oxyd b/w" was unaddressable — the harvest pipeline had to skip
such files.
Add a shared escape/colon-aware tokenizer and route every literal walker
through it, supporting two grammars:
- slash (all filesystems): '\/' = a literal slash in a component, '\\' =
a literal backslash.
- colon (HFS/HFS+ only, gated on a new Filesystem::uses_colon_paths()):
':' as the separator, so '/' is plain data — ':System Folder:Oxyd b/w'.
A colon path is always literal (never globs).
split_image_path()/split_image_parent() in cli::parse are the single source
of truth, replacing ~6 hand-rolled split('/') sites across ls/get/get-binhex/
put/put-binhex/put-macbinary/mkdir/rm/locate/api-hfs. Remote rb:// addressing
stays slash-only. Help text + a "Path grammar" section in cli-reference.md
document both grammars; unit tests cover the tokenizer and a round-trip
against a real HFS volume holding 'Oxyd b/w'.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Implements the deferred §4b of
docs/hfsplus_btree_growth_plan.md(HFS+ B-tree fork grow-on-full), and — because validating it against a real Mac surfaced them — fixes several latent on-disk-format bugs that only an HFS-capable Mac could catch. All acceptance testing was done against realfsck_hfs: modern macOS for HFS+/HFSX, and a Mac OS X 10.4 (Tiger) box for classic HFS.Commit 1 —
feat(hfsplus): §4b grow-on-full (A/B/C)The catalog / extents-overflow / attributes B-trees now grow their backing fork on demand (
grow_btree_fork) instead of returningDiskFull, so an under-sized or foreign catalog fills in place.write_fork_data_with_overflow(the write-side mirror ofread_fork_with_overflow).(node_size-256)*8≈ 30,720-node header-bitmap cap.rb-cli new --fs hfsplus [--case-sensitive] [--min-catalog BYTES](GUI/CLI parity + drives the macOS recipe).Two foundational blank-builder bugs fixed along the way (affected every HFS+ image we build from scratch — blanks and the defrag/clone path used by restore + HFS-expand):
btree_insert_fullnow bootstraps the root leaf on the first insert into a depth-0 tree, and returns a cleanDiskFull(pre-splitfree_nodes >= depth+1check) so the grow-retry can't half-apply a split and duplicate a record.Commit 2 —
fix(hfs): classic-HFSfilStBlk/filRStBlkmust be zerobuild_file_record(andwrite_resource_fork) wrote the fork's first allocation block intofilStBlk(offset 24) /filRStBlk(offset 34). Those are the File Manager's in-memory cache and must be 0 on disk — the real location is the extent record at offset 74/86. A non-zero value makesfsck_hfsreport "Reserved fields in the catalog record have incorrect data" (E_CatalogFlagsNotZero). Pre-existing; real Mac OS ignored the on-disk value so files worked, and our own fsck never checked it. The reader uses the extent record, so zeroing is safe. Covers HFV too.Validation
fsck_hfs -f -n"appears to be OK" and mountable on macOS.fsck_hfs-clean and mounts with all files intact (diffed catalog records against a Tiger-created reference to pin the root cause).test_hfs_file_record_first_alloc_block_fields_are_zero.cargo test --libgreen (2140),build --all-targetszero warnings,clippy -D warningsclean.🤖 Generated with Claude Code