feat(self-packaging #42): selfpath + bundle_locator (EOCD reverse scan) #61

Merged: jrosskopf merged 1 commit into main from feature/gh-42-selfpath-bundle-locator on May 22, 2026
@jrosskopf
Contributor

Part of epic #40. Re-opening #52 after auto-close (base branch gh-41 was deleted on #51's merge).

Summary

Two small modules that together let the running binary discover whether
a ZIP archive has been appended to it.

  • src/selfpath.{hpp,cpp} -- cross-platform self-binary path
    (/proc/self/exe / _NSGetExecutablePath / GetModuleFileNameW).
  • src/bundle_locator.{hpp,cpp} -- reverse-scan a file for a ZIP
    End-of-Central-Directory record, returning BundleLocation{offset, size}
    or nullopt.
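
As a rough illustration of the self-path lookup, here is a minimal sketch built on the three platform calls listed above. The function name `GetSelfPathSketch`, the `std::optional` return type, and the buffer sizes are assumptions made for this example; the actual `src/selfpath.cpp` may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <optional>
#include <string>
#include <vector>

#if defined(_WIN32)
#include <windows.h>
#elif defined(__APPLE__)
#include <mach-o/dyld.h>
#else
#include <unistd.h>
#endif

// Hypothetical name and signature; returns nullopt if the OS query fails.
std::optional<std::filesystem::path> GetSelfPathSketch() {
#if defined(_WIN32)
  std::vector<wchar_t> buf(MAX_PATH);
  for (;;) {
    const DWORD n = GetModuleFileNameW(nullptr, buf.data(),
                                       static_cast<DWORD>(buf.size()));
    if (n == 0) return std::nullopt;
    // A return value equal to the buffer size signals truncation.
    if (n < buf.size()) return std::filesystem::path(std::wstring(buf.data(), n));
    buf.resize(buf.size() * 2);  // grow and retry
  }
#elif defined(__APPLE__)
  std::uint32_t size = 0;
  _NSGetExecutablePath(nullptr, &size);  // fails, but reports the needed size
  std::vector<char> buf(size);
  if (_NSGetExecutablePath(buf.data(), &size) != 0) return std::nullopt;
  return std::filesystem::path(buf.data());
#else
  std::vector<char> buf(256);
  for (;;) {
    const ssize_t n = readlink("/proc/self/exe", buf.data(), buf.size());
    if (n < 0) return std::nullopt;
    // readlink does not NUL-terminate; a full buffer may mean truncation.
    if (static_cast<std::size_t>(n) < buf.size()) {
      return std::filesystem::path(
          std::string(buf.data(), static_cast<std::size_t>(n)));
    }
    buf.resize(buf.size() * 2);  // grow and retry
  }
#endif
}
```

Note that `_NSGetExecutablePath` may return a path containing symlinks or `..` components; whether the PR canonicalizes the result is not stated here.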

What the locator checks

| Check | Rejected as | Why |
|---|---|---|
| EOCD signature `0x06054b50` not present in tail | `nullopt` | obviously no ZIP |
| multi-disk archive | `nullopt` | not a single-binary use case |
| comment length doesn't fit in scanned tail | `nullopt` | claims more than file has |
| bytes after EOCD+comment are non-zero | `nullopt` | not a clean append |
| central dir doesn't fit before EOCD | `nullopt` | impossible layout |

The padding tolerance is load-bearing: in the spike, libarchive's default
tar-block rounding pushed the EOCD away from end-of-file, so the reader
defensively accepts trailing zero bytes after the EOCD and its comment.
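
The checks above can be sketched against an in-memory buffer as follows. This is an illustration under stated assumptions, not the code in `src/bundle_locator.cpp`: `LocateInBuffer` and the helper names are hypothetical, the buffer stands in for the whole file, the bundle start is assumed to be `eocd_pos - cd_size - cd_offset` (that is, the ZIP's internal offsets are relative to its own first byte), and a candidate that fails validation is skipped so the scan can continue, which the PR does not specify.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Mirrors the BundleLocation{offset, size} named in the PR; field types assumed.
struct BundleLocation {
  std::uint64_t offset;
  std::uint64_t size;
};

constexpr std::size_t kEocdFixedSize = 22;  // EOCD record without its comment
constexpr std::uint32_t kEocdSignature = 0x06054b50u;

inline std::uint16_t ReadLe16(const std::uint8_t* p) {
  return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

inline std::uint32_t ReadLe32(const std::uint8_t* p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: reverse-scan `file` for a plausible EOCD record.
std::optional<BundleLocation> LocateInBuffer(const std::vector<std::uint8_t>& file) {
  if (file.size() < kEocdFixedSize) return std::nullopt;
  // Walk candidate positions from the last possible EOCD start down to 0.
  for (std::size_t pos = file.size() - kEocdFixedSize + 1; pos-- > 0;) {
    const std::uint8_t* p = file.data() + pos;
    if (ReadLe32(p) != kEocdSignature) continue;

    const std::uint16_t disk = ReadLe16(p + 4);
    const std::uint16_t cd_disk = ReadLe16(p + 6);
    const std::uint16_t entries_here = ReadLe16(p + 8);
    const std::uint16_t entries_total = ReadLe16(p + 10);
    const std::uint32_t cd_size = ReadLe32(p + 12);
    const std::uint32_t cd_offset = ReadLe32(p + 16);
    const std::uint16_t comment_len = ReadLe16(p + 20);

    if (disk != 0 || cd_disk != 0) continue;        // multi-disk archive
    if (entries_here != entries_total) continue;    // entry counts must match

    const std::uint64_t end = pos + kEocdFixedSize + comment_len;
    if (end > file.size()) continue;                // comment overruns the file

    bool only_zero_padding = true;                  // tolerate zero padding only
    for (std::size_t i = static_cast<std::size_t>(end); i < file.size(); ++i) {
      if (file[i] != 0) { only_zero_padding = false; break; }
    }
    if (!only_zero_padding) continue;

    const std::uint64_t cd_span = std::uint64_t{cd_size} + cd_offset;
    if (cd_span > pos) continue;                    // central dir cannot fit

    const std::uint64_t start = pos - cd_span;
    return BundleLocation{start, end - start};
  }
  return std::nullopt;
}
```

An empty ZIP consists of just the 22-byte EOCD record with all counts and offsets zero, which makes a convenient smallest fixture for exercising each rejection path.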

Tests (7 cases / 14 assertions, all green on Linux x86_64)

  • locate appended ZIP after 4 KiB of random leading bytes
  • tolerate 10 KiB of trailing zero padding
  • nullopt when no EOCD signature exists
  • nullopt when EOCD has impossible cd_size / cd_offset
  • nullopt when bundle truncated by 1 KiB from EOF
  • selfpath returns existing path
  • LocateBundleInSelf returns nullopt on unbundled test binary

Closes #42. Part of #40.

Part of #40, depends on #41. Two small modules that let the running
binary discover an appended ZIP bundle.

- `src/selfpath.{hpp,cpp}` -- cross-platform self-binary path:
  - Linux:   readlink("/proc/self/exe")
  - macOS:   _NSGetExecutablePath
  - Windows: GetModuleFileNameW (with buffer growth loop)

- `src/bundle_locator.{hpp,cpp}` -- reverse-scan a file for a ZIP
  End-of-Central-Directory record:
  - reads tail buffer of 22 + max_comment + 64 KiB padding budget
  - reverse-scans for the 0x06054b50 signature
  - validates: single-disk archive, entry counts match, comment
    length fits in tail, anything after comment must be zero
    (padding tolerance), central directory must fit before EOCD
  - returns BundleLocation{offset, size} or nullopt
  - `LocateBundleInSelf()` convenience wrapper over GetSelfPath()
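
To make the tail-buffer step concrete, here is a minimal sketch of reading only the last `22 + max_comment + 64 KiB` bytes of a file. The names `ReadTail` and `Tail` are hypothetical, and `max_comment` is assumed to be 65535, the largest value the EOCD's 16-bit comment-length field can hold; the PR's actual constant is not shown.

```cpp
#include <algorithm>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <optional>
#include <system_error>
#include <vector>

constexpr std::uint64_t kEocdFixedSize = 22;
constexpr std::uint64_t kMaxComment = 0xFFFF;        // assumed value of max_comment
constexpr std::uint64_t kPaddingBudget = 64 * 1024;  // 64 KiB of tolerated padding
constexpr std::uint64_t kTailBudget = kEocdFixedSize + kMaxComment + kPaddingBudget;

// Hypothetical result type: the bytes read plus where they start in the file,
// so positions found in the buffer can be mapped back to file offsets.
struct Tail {
  std::uint64_t file_offset;
  std::vector<std::uint8_t> bytes;
};

std::optional<Tail> ReadTail(const std::filesystem::path& path,
                             std::uint64_t budget = kTailBudget) {
  std::error_code ec;
  const std::uintmax_t raw_size = std::filesystem::file_size(path, ec);
  if (ec) return std::nullopt;
  const std::uint64_t size = static_cast<std::uint64_t>(raw_size);
  const std::uint64_t want = std::min(budget, size);  // short files: read it all

  std::ifstream in(path, std::ios::binary);
  if (!in) return std::nullopt;
  in.seekg(static_cast<std::streamoff>(size - want));

  Tail tail;
  tail.file_offset = size - want;
  tail.bytes.resize(static_cast<std::size_t>(want));
  in.read(reinterpret_cast<char*>(tail.bytes.data()),
          static_cast<std::streamsize>(want));
  if (static_cast<std::uint64_t>(in.gcount()) != want) return std::nullopt;
  return tail;
}
```

Bounding the read this way keeps the scan cost independent of the binary's size: an EOCD further from end-of-file than the budget allows is simply not found.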

Tests (`test/cpp/bundle_locator_test.cpp`, 7 cases):
- locate a ZIP appended to 4 KiB of random leading bytes
- tolerate 10 KiB of trailing zero padding (the spike-caught case)
- nullopt when no EOCD signature exists in random data
- nullopt when an EOCD signature has impossible cd_size / cd_offset
- nullopt when the bundle is truncated by 1 KiB from EOF
- selfpath returns an existing path
- LocateBundleInSelf returns nullopt against the unbundled test binary

Fixtures reuse `archive_io::WriteArchive` (#41) to produce real ZIPs.

Closes #42.
@jrosskopf jrosskopf marked this pull request as ready for review May 22, 2026 12:48
@jrosskopf jrosskopf merged commit b323fe9 into main May 22, 2026
17 checks passed
@jrosskopf jrosskopf deleted the feature/gh-42-selfpath-bundle-locator branch May 22, 2026 12:48
Development

Successfully merging this pull request may close these issues.

Self-packaging #2: selfpath + bundle_locator (cross-platform EOCD scan)
