
feat: parallel chunk PUT and GET #8

Merged
sanity merged 4 commits into main from fix-7 on May 1, 2026

Conversation


@sanity sanity commented May 1, 2026

Problem

ChunkedPack publish and fetch are serial: each chunk operation waits for the previous to complete. Against the Freenet host under load each chunk takes ~60s, so a 200-chunk push (full freenet-core history at 1 MiB chunks) takes ~3-4 hours of wall time. That's the dominant pain point for routine work on big repos and is called out as the last "not yet supported" item in the README.

Approach

Run chunk operations in parallel across a small pool of WebSocket connections to the local Freenet node.

  • Open N connections up front (Vec<Arc<Mutex<WebApi>>>).
  • Round-robin chunks to connections (chunk_index % N).
  • Drive completion with futures::stream::iter(...).map(...).buffer_unordered(N).
  • For fetch, slot results into Vec<Option<Vec<u8>>> indexed by chunk position so reassembly is in manifest order regardless of completion order.
  • Manifest PUT and re-GET stay single-shot on the first connection (they are one-of-one operations).
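
The slot-indexed reassembly is the load-bearing invariant, so here is a minimal std-only sketch of it. The signature and error strings are assumptions; the real `assemble_chunks` lives in `chunked.rs`:

```rust
/// Concatenate fetched chunks in manifest (slot) order, regardless of the
/// order in which they completed. A missing slot or a total-size mismatch
/// is an error rather than a silently truncated pack.
fn assemble_chunks(slots: Vec<Option<Vec<u8>>>, total_size: u64) -> Result<Vec<u8>, String> {
    let mut out = Vec::with_capacity(total_size as usize);
    for (i, slot) in slots.into_iter().enumerate() {
        match slot {
            Some(bytes) => out.extend_from_slice(&bytes),
            None => return Err(format!("chunk {i} missing from fetch results")),
        }
    }
    if out.len() as u64 != total_size {
        return Err(format!("assembled {} bytes, manifest says {}", out.len(), total_size));
    }
    Ok(out)
}

fn main() {
    // Results that arrived out of order have already been placed by index;
    // assembly only ever walks the slots in order.
    let mut slots: Vec<Option<Vec<u8>>> = vec![None; 3];
    for (idx, bytes) in [(2usize, vec![5, 6]), (0, vec![1, 2]), (1, vec![3, 4])] {
        slots[idx] = Some(bytes);
    }
    assert_eq!(assemble_chunks(slots, 6).unwrap(), vec![1, 2, 3, 4, 5, 6]);
}
```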

N defaults to 8 (DEFAULT_PARALLEL_OPS) and can be overridden with FREENET_GIT_PARALLEL_OPS=K. Setting it to 1 forces serial behavior. The pool is capped at the chunk count to avoid wasted connections for small payloads.
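
The pool-size policy above can be sketched as a pair of pure functions. `parse_parallelism` is the name used later in this PR; `effective_parallelism` is a hypothetical name for the cap-at-chunk-count step, and the exact fallback behavior is an assumption:

```rust
const DEFAULT_PARALLEL_OPS: usize = 8;

/// Parse FREENET_GIT_PARALLEL_OPS. Unset, 0, or a non-integer fall back
/// to the default; 1 forces serial behavior.
fn parse_parallelism(env_val: Option<&str>) -> usize {
    match env_val.and_then(|s| s.trim().parse::<usize>().ok()) {
        Some(n) if n >= 1 => n,
        _ => DEFAULT_PARALLEL_OPS,
    }
}

/// Cap the pool at the chunk count so small payloads don't open idle connections.
fn effective_parallelism(env_val: Option<&str>, chunk_count: usize) -> usize {
    parse_parallelism(env_val).min(chunk_count.max(1))
}

fn main() {
    assert_eq!(effective_parallelism(None, 200), 8);       // default
    assert_eq!(effective_parallelism(Some("4"), 200), 4);  // override
    assert_eq!(effective_parallelism(Some("0"), 200), 8);  // invalid -> default
    assert_eq!(effective_parallelism(Some("8"), 3), 3);    // capped at chunk count
    assert_eq!(effective_parallelism(Some("1"), 200), 1);  // forces serial
}
```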

API change

publish_chunked_pack and fetch_chunked_pack now take ws_url: &str instead of &mut WebApi because they open their own pool internally. The _with_progress variants gain a parallelism: usize parameter. Progress callbacks fire on completion (with completed-count) instead of before each op, which is the correct semantics for out-of-order parallel work.

This is a breaking lib API change. The lib is internal to the binaries in this same crate; no external consumers known. The callers in git-remote-freenet.rs are updated.

Rejected alternatives

  • Multiplex one connection. WebApi is !Sync and the WebSocket stream isn't designed for interleaved txns from one client; one connection per concurrent op is simpler and the resource cost is small.
  • Spawn N worker tasks pulling from a channel. Equivalent in throughput but more code; the buffer_unordered form is shorter and the per-task index propagation is direct.

Testing

New unit tests in chunked.rs::tests:

  • assemble_chunks_concatenates_in_order: gives the assembler the same chunks in slot order and again in shuffled completion order, asserts both produce the same byte vector. Load-bearing piece for parallel fetch correctness.
  • assemble_chunks_errors_on_missing_slot: a None slot must error rather than silently truncate.
  • assemble_chunks_errors_on_total_size_mismatch: defends the manifest total_size invariant.
  • parallelism_from_env_handles_unset_zero_and_invalid: env-var parsing including 0 and non-integer fallback.

Manual e2e against a real Freenet node is the planned next step before merge; will report results in a follow-up comment.

Why didn't CI catch this?

The serial implementation wasn't a bug; it was a perf limitation. There is no slow-operation regression CI check. The new parallel impl could regress in two ways:

  • Reassembly mis-ordering. Closed by the assemble_chunks unit tests.
  • Hung locks / deadlock from holding a mutex guard across an .await. Caught by running the existing test suite under the tokio runtime; could be made stronger with a mock-WebApi integration test in a follow-up.

Out-of-scope (filed for follow-up if needed)

  • Parallelize rescue_chunked in main.rs (different control flow: GET-then-PUT per chunk).
  • Streaming chunks from git pack-objects directly without staging the full pack in memory. Bigger refactor; do separately if it's needed.

Fixes

Closes #7

[AI-assisted - Claude]

sanity and others added 4 commits May 1, 2026 09:47
ChunkedPack publish and fetch were serial: each chunk operation
waited for the previous to complete. Against the Freenet host under
load each chunk takes ~60s, so a 200-chunk push (full freenet-core
history at 1 MiB chunks) took ~3-4 hours.

This PR runs chunk operations in parallel across a small pool of
WebSocket connections. Pool size defaults to 8, configurable via
FREENET_GIT_PARALLEL_OPS=N. Round-robin chunk-to-conn dispatch with
buffer_unordered so completion order is decoupled from chunk index.

Reassembly is the load-bearing piece: chunks complete out of order
but must be glued back in manifest order. The new assemble_chunks
helper takes Vec<Option<Vec<u8>>> indexed by chunk position and
concatenates in slot order, with unit tests covering the in-order,
shuffled, missing-slot, and total-size-mismatch cases.

Manifest PUT and re-GET stay single-shot on the first connection
because they are one-of-one operations.

Expected speedup for a 200-chunk push at 60s/chunk: ~200 minutes
serial -> ~25 minutes 8-way parallel.

API change: publish_chunked_pack and fetch_chunked_pack now take a
ws_url: &str instead of &mut WebApi, because they open their own
pool of N connections internally. Both add a parallelism: usize
parameter on the _with_progress variants. Progress callbacks fire
on completion (with completed-count) instead of before each op,
which is the correct semantics for out-of-order parallel work.

Out-of-scope (filed for follow-up if needed):
- Parallelize rescue_chunked in main.rs (different control flow,
  GET-then-PUT per chunk).
- Streaming chunks: read directly from pack-objects subprocess
  and feed chunks as they're produced. Bigger refactor.

Closes #7

[AI-assisted - Claude]

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Wrap pack_wasm in Arc<Vec<u8>> so phase 2 (verify) and fetch phase
  4 don't deep-clone the WASM bytes per chunk; phase 1 still clones
  because put_pack consumes Vec<u8> for retries.
- Take ownership of chunks via .into_iter() in phase 1 to remove the
  per-chunk clone (chunks isn't used after phase 1).
- Rename mutex guards from `g` to `conn`.
- Replace `Ok::<_, anyhow::Error>(...)` with `anyhow::Ok(...)`.
- Replace `(0..n).map(|_| None).collect()` with `vec![None; n]`.
- Drop a redundant `async {}` wrapper in open_pool.

No behavior change. All 19 unit tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… parsing

Addresses review findings on PR #8:

Codex P1: open_pool now degrades gracefully. If the local node refuses
extra connections beyond the first, we proceed with however many we
got rather than aborting the whole publish/fetch. Only fails if we
cannot open even one connection. The pre-parallel implementation
managed with one, so this matches that compatibility floor.
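
The degradation policy reduces to a small loop. The real open_pool opens WebSocket connections asynchronously; this std-only sketch models only the shape of the policy, with a hypothetical fallible opener passed in as a closure:

```rust
/// Try to open `want` connections. If the node refuses extras after at
/// least one succeeded, proceed with the smaller pool; only error if we
/// cannot open even one (the pre-parallel compatibility floor).
fn open_pool<T, E>(want: usize, mut open_one: impl FnMut() -> Result<T, E>) -> Result<Vec<T>, E> {
    let mut pool = Vec::with_capacity(want);
    for _ in 0..want {
        match open_one() {
            Ok(conn) => pool.push(conn),
            Err(e) if pool.is_empty() => return Err(e), // cannot open even one: hard error
            Err(_) => break, // node refused an extra connection: degrade gracefully
        }
    }
    Ok(pool)
}

fn main() {
    // Node accepts 3 connections, then refuses: we proceed with a 3-wide pool.
    let mut opened = 0;
    let pool = open_pool(8, || {
        if opened < 3 { opened += 1; Ok::<i32, &str>(opened) } else { Err("refused") }
    })
    .unwrap();
    assert_eq!(pool.len(), 3);
}
```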

Testing P1 / Skeptical M1: parallelism_from_env split into a thin env
reader plus a pure parse_parallelism(Option<&str>) helper. The unit
test exercises the parser directly, so we no longer mutate process
env state in tests. Removes a known race that becomes a hard error in
edition 2024 where set_var/remove_var are unsafe.

Testing P1: extracted collect_chunks_into_slots helper from the fetch
chunked-GET loop. Two new tokio::test cases drive it with a
deliberately scrambled stream of (chunk_index, bytes) results, so the
"completion order does not affect slot placement" invariant is now
genuinely tested. The previous assemble_chunks "shuffled" test was
testing Vec iteration, not the load-bearing reassembly logic.
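
The extracted helper can be sketched std-only. The real collect_chunks_into_slots drains an async buffer_unordered stream; this sketch substitutes a plain iterator, and the signature is an assumption:

```rust
/// Place out-of-order (chunk_index, bytes) results into position-indexed
/// slots. Completion order must not affect slot placement.
fn collect_chunks_into_slots(
    n: usize,
    results: impl IntoIterator<Item = Result<(usize, Vec<u8>), String>>,
) -> Result<Vec<Option<Vec<u8>>>, String> {
    let mut slots: Vec<Option<Vec<u8>>> = vec![None; n];
    for r in results {
        let (idx, bytes) = r?;
        if idx >= n {
            return Err(format!("chunk index {idx} out of range (n = {n})"));
        }
        slots[idx] = Some(bytes);
    }
    Ok(slots)
}

fn main() {
    // Deliberately scrambled completion order; placement is by index.
    let scrambled: Vec<Result<(usize, Vec<u8>), String>> =
        vec![Ok((1, vec![3])), Ok((2, vec![4])), Ok((0, vec![1, 2]))];
    let slots = collect_chunks_into_slots(3, scrambled).unwrap();
    assert_eq!(slots, vec![Some(vec![1, 2]), Some(vec![3]), Some(vec![4])]);
}
```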

Code-first / Skeptical N1: removed redundant drop(stream) calls. The
while-let loops only exit when the stream is empty, so the explicit
drops were no-ops.

Skeptical N3: unified the actual_parallelism clamp form between
publish and fetch (both now use parallelism.min((n as usize).max(1))).

Skeptical N4: tightened the Arc<Vec<u8>> comment to reflect that the
phase 1 win is small (avoids one local move) and the real fix is
plumbing &[u8] through wsclient::put_pack, deferred as a follow-up.

Skeptical N5: added a comment that phase 2's BLAKE3 verify is
belt-and-braces with wsclient::get_pack's own check; kept anyway as
defense-in-depth at the publish-verify layer.

Skeptical N7: renamed shadowed `completed` -> `puts_completed` /
`verifies_completed` so the publish phase 1 / phase 2 counters are
visibly distinct.

Skeptical M2: documented the assumption that per-chunk latency is
roughly uniform. With round-robin and one stuck chunk on conn k, every
chunk where i % N == k queues behind it; for content-addressed PUT/GET
this is fine but worth noting if costs become more variable.

Big-picture: updated the two user-facing progress strings in
git-remote-freenet.rs to disclose the parallelism N, so users don't
multiply ~60s × 200 chunks in their head.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

sanity commented May 1, 2026

Review pass complete

Ran four internal review agents (code-first, testing, skeptical, big-picture) and Codex in parallel. Summary of what was applied and what was punted.

Applied to this branch

  • [Codex P1] Graceful pool degradation. `open_pool` no longer aborts the whole publish/fetch if the local node refuses connection N+1; it proceeds with however many it got and warns. Falls back to error only if it cannot open even one (matching the pre-parallel single-connection behavior).
  • [Testing P1 + Skeptical M1] Race-free env parsing. Split `parallelism_from_env` into a thin reader plus a pure `parse_parallelism(Option<&str>)` helper. Tests now exercise the parser directly instead of mutating process env state. Avoids the known anti-pattern that becomes a hard error in edition 2024.
  • [Testing P1] Real out-of-order completion test. Extracted `collect_chunks_into_slots` from the fetch loop and added two `#[tokio::test]` cases that drive it with a deliberately scrambled `stream::iter`. The previous "shuffled" test was testing Vec iteration, not the load-bearing reassembly logic.
  • [Skeptical N1] Removed redundant `drop(stream)` calls that were no-ops.
  • [Skeptical N3] Unified the actual_parallelism clamp form between publish and fetch.
  • [Skeptical N4] Tightened the Arc<Vec<u8>> comment about phase 1 vs phase 2 cloning.
  • [Skeptical N5] Added a comment that phase 2's BLAKE3 verify is belt-and-braces with wsclient::get_pack's own check.
  • [Skeptical N7] Renamed shadowed `completed` to `puts_completed` / `verifies_completed`.
  • [Skeptical M2] Documented the uniform-per-chunk-latency assumption: round-robin + one stuck chunk on conn k means siblings on that conn queue serially.
  • [Big-picture] Updated the two user-facing progress strings to disclose the parallelism N, so users don't multiply ~60s × 200 chunks in their head.
  • [Big-picture] Removed the stale README "parallel chunk uploads not yet shipped" caveat in an earlier commit on this branch.

Filed as follow-ups (out of scope for #7)

Validation gap

No CI in this repo, and there is no mock `WebApi`. The orchestration logic (`buffer_unordered` over a `Vec<Arc<Mutex<WebApi>>>` pool, round-robin chunk dispatch, slot-indexed reassembly) is unit-tested in pieces but not end-to-end. Big-picture review noted this and asked for a real-network smoke test before merge confirming 8-way concurrent ops actually work and the speedup is in the expected ballpark. Pending.

[AI-assisted - Claude]

@sanity sanity merged commit c099ea1 into main May 1, 2026
1 check passed
sanity added a commit that referenced this pull request May 6, 2026
…marize

Addresses all four internal review agents + Codex on PR #17.

Critical:

- Snapshot tree build now uses `git archive HEAD | tar -xf -` instead
  of `tar --exclude=.git`. The previous approach copied the worktree
  and ran `git add .` in the fresh repo, which re-applied .gitignore
  and silently dropped force-added files. `git archive` dumps exactly
  HEAD's tracked tree, and `git add -f .` then commits all of it.
  (Codex P2; Skeptical M1; Code-first nit.)

- Every `${{ inputs.* }}` reference in `run:` blocks now goes through
  `env:` indirection. Reusable callers are in-org and trusted today,
  but the previous direct interpolation made the workflow a footgun
  for any future caller -- a stray `"` in a branch name or version
  string was a shell-injection vector.
  (Skeptical H1, H2; Code-first should-fix #3.)

- Workflow-level `concurrency:` block keyed on `freenet_repo`. The
  repo contract is read-modify-write; two parallel pushes to the same
  prefix race on `update_seq` and the second arriving is silently
  dropped by the CRDT merge. `cancel-in-progress: false` queues new
  runs so a fresh push never loses to an in-flight stale one.
  (Skeptical H3; Code-first should-fix #5; Big-picture future-work.)

- History mode now does `git checkout -B "$BRANCH"` after
  actions/checkout to defend against detached-HEAD checkouts that
  would break the `<branch>:<branch>` refspec.
  (Skeptical M2.)

Should-fix:

- Toolchain pin dropped from `1.93.0` to `dtolnay/rust-toolchain@stable`
  (no `with: toolchain:`). The pin doesn't aid determinism for a CLI
  install, and pinning to a specific version risks both workflows
  breaking simultaneously if a freenet-git release picks up a
  transitive dep with a higher MSRV.
  (Skeptical M3; Code-first nit #8.)

- Bundle-write step now starts with `set -euo pipefail` and validates
  the file with `test -s` rather than running `freenet-git whoami`
  (the latter prints registry contents to CI logs unnecessarily; both
  paths are audited safe per skeptical L4-L5).
  (Code-first should-fix #4.)

- `freenet_repo` input now validated against the `<prefix>/<label>`
  shape in a dedicated step, with a directed error if the caller
  accidentally includes the `freenet:` scheme prefix.
  (Code-first should-fix #6; Skeptical L6.)

- Workflow header gained a "Choosing mode" stanza explaining
  snapshot vs history semantics, and a paragraph noting that
  snapshot mode does not handle submodules or git-LFS.
  (Big-picture #2; Code-first should-fix #1, #2.)

- New `freenet_git_extra_args` passthrough input so caller workflows
  can extend the `git push` invocation (e.g. timeouts) without
  pushing back changes here.
  (Big-picture future-work #1.)

- New `workflow_dispatch` trigger so a maintainer can manually
  exercise the workflow against a throwaway target before relying
  on it from a caller repo.
  (Code-first concern.)

rescue-demos.yml:

- New `summarize` job aggregates matrix results so a partial-green
  "rescue worked for 2/3 repos" registers as a workflow-level
  failure (and triggers GitHub's default failure-notification email).
  (Big-picture #3; Skeptical L3.)

- Header now flags the dependency on the gateway holding the bytes
  -- if the gateway evicts before the next scheduled run, rescue is
  a no-op for that URL. Future `freenet-git rescue --from <git-dir>`
  will let us pass a clone path and remove that dependency.
  (Big-picture #1.)

- Toolchain pin dropped here too; matches mirror-repo.yml.

Pushed back / deferred:

- Code-first #7 (matrix to JSON file): three URLs is fine, would add
  surface area without proportionate benefit.
- Code-first #11 (cron times): subjective, and GH cron is best-effort
  anyway.
- Code-first #12 (README duplication): pre-existing, out of scope.
- Code-first #13 (test-plan checkboxes): cosmetic.
- Big-picture #4 (`latest` vs pin): default kept as `latest` with the
  override input documented; pinning by default would freeze the
  pipeline against the buggy 0.1.12 prior to this PR's release.
- Big-picture #5 (docs/0001-large-repos.md): a design doc for
  ChunkedPack, not a workflow doc. Out of scope.
- Big-picture #7 (agent-memory rot): handled in a separate ~/.claude
  commit after this PR merges, per the global commit-claude-config
  rule.
- Skeptical M4 (`force_push_allowed`): verified safe -- the contract's
  update_seq monotonicity gates writes regardless of the `+` flag.

[AI-assisted - Claude]

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sanity added a commit that referenced this pull request May 6, 2026
* feat: add mirror + rescue workflows; rotate demo URLs

Three new artifacts.

`.github/workflows/mirror-repo.yml` (reusable workflow_call). Caller
repos (freenet-core, freenet-stdlib) invoke it via
`uses: freenet/freenet-git/.github/workflows/mirror-repo.yml@<ref>`
to push their content into a Freenet-hosted repo on every commit to
main. Inputs: target prefix/label, mode (`snapshot` for HEAD-only
orphan commits, `history` for full git history), branch, and an
optional `freenet_git_version` pin. Caller passes the identity bundle
and gateway WS URL via secrets.

`.github/workflows/rescue-demos.yml` (scheduled). Twice a day, runs
`freenet-git rescue` for the three demo URLs so they don't fall out
of peers' LRU caches. Rescue is permissionless, so this only needs
`FREENET_GIT_WS_URL`.

README updates: the demo URLs are rotated to fresh prefixes derived
from a no-passphrase identity bundle, replacing the old
chunked-demo-bundle URLs (`AaRxPZVdWrPh/freenet-core` and
`2pyvKxrozxgT/freenet-stdlib`). The "Keeping a published repo alive"
section now points at the rescue workflow as the canonical
keep-alive mechanism for these specific URLs.

New URLs:
- freenet:3GEERif5ihbf/freenet-core (snapshot, mirrored from
  freenet/freenet-core@main)
- freenet:96rknpy1GYhZ/freenet-stdlib (full history, mirrored from
  freenet/freenet-stdlib@main)
- freenet:99TmCayXn6Tm/freenet-git (this repo's own mirror)

Initial state for the first two is already published; the workflow
just keeps it fresh and re-publishes on upstream pushes.

Required secrets in caller repos:
- FREENET_GIT_IDENTITY_BUNDLE_BASE64 -- base64 of the no-passphrase
  bundle that owns the prefixes above
- FREENET_GIT_WS_URL -- WebSocket URL of a reachable Freenet node
  (we reuse RIVER_GATEWAY_URL in this org)

[AI-assisted - Claude]

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* review fixes: env-var indirection, concurrency, snapshot rebuild, summarize

(Commit body identical to the "review fixes: env-var indirection, concurrency, snapshot rebuild, summarize" message above.)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sanity added a commit that referenced this pull request May 7, 2026
Addresses all four internal reviewers + Codex on PR #26.

Codex P2 / Code-first #1 / Skeptical M1: object existence check now
distinguishes "missing", "non-commit", and "tooling failure" cases.
Renamed `object_exists` -> `commit_exists` and reimplemented via
`git rev-parse --verify --quiet <sha>^{commit}`. The new check:

- Returns Ok(true) only if <sha> is present and is a commit object.
  cat-file -e succeeded on blobs and trees, which would have left
  the cryptic `fatal: not a commit object` rev-list error reachable
  for corrupted local stores.
- Distinguishes missing from corrupt-tooling via exit code 1 vs 128;
  the latter bubbles up rather than silently widening the pack
  (force) or producing a misleading "run git fetch" message
  (non-force).
- New test `commit_exists_rejects_blob_and_tree` pins this against
  blob and tree SHAs from the same repo.
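
The exit-status triage can be modeled as a pure function. The real check shells out to `git rev-parse --verify --quiet <sha>^{commit}`; this sketch (with the hypothetical name `classify_rev_parse`) only captures how its status is interpreted per the description above:

```rust
/// Interpret the exit status of `git rev-parse --verify --quiet <sha>^{commit}`.
/// 0: <sha> exists and is a commit. 1: missing (or not a commit object).
/// Anything else (e.g. 128, or killed by signal): tooling failure, bubble up.
fn classify_rev_parse(status_code: Option<i32>) -> Result<bool, String> {
    match status_code {
        Some(0) => Ok(true),
        Some(1) => Ok(false),
        other => Err(format!("git rev-parse failed unexpectedly: {other:?}")),
    }
}

fn main() {
    assert_eq!(classify_rev_parse(Some(0)), Ok(true));   // commit exists
    assert_eq!(classify_rev_parse(Some(1)), Ok(false));  // missing locally
    assert!(classify_rev_parse(Some(128)).is_err());     // corrupt store / tooling failure
    assert!(classify_rev_parse(None).is_err());          // killed by signal
}
```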

Skeptical M2: directed error message reordered to lead with --force
since snapshot/orphan-style pushes are the canonical case driving
this PR. `git fetch` is still listed for the regular fast-forward
recovery scenario but no longer the first remediation.

Code-first #5: error message wording improved -- "use `git push +`"
isn't a real invocation; rephrased as "use `git push --force` (or
prefix the refspec with `+`)". The prior bail!(...) string was
preserved but rephrased.

Code-first #7 / Skeptical L3: test setup now sets per-repo
commit.gpgsign=false and tag.gpgsign=false, so a developer with
GPG signing on globally without a working key won't trip a prompt
in tests.

Code-first #8: `build_pack_have_present_succeeds` was a no-op
(let _ = pack). Renamed to `build_pack_have_equal_to_want_emits_
empty_pack` and now asserts the resulting pack is smaller than
the no-have pack -- pinning that rev-list short-circuits when
have == want.

Testing must-add #1: new test
`build_pack_fast_forward_have_is_real_ancestor` exercises the
production happy path -- have is the parent commit, want is the
child. Asserts the incremental pack is non-empty AND smaller than
the full pack, so a future regression that drops `^<have>` from
the rev-list args silently would be caught.

Testing should-add #3: directed-error wording now pinned via
substring assertions on "not present", "local repo", "git fetch",
"--force", and the bogus SHA itself.

Big-picture LOW: docstring "Phase 1.0 caveats" block in the helper
gained a "Push semantics" bullet documenting non-force vs force
expectations.

Pushed back / not addressing in this PR:

- Code-first #2 (silent SinglePack/ChunkedPack flip when force drops
  have for large `want` history): real concern but architectural;
  for snapshot mode the orphan commits are small so not a current
  practical issue. Worth a follow-up if a future caller pushes a
  large-history force.
- Code-first #4 (parse_push_spec allows force=true with empty src):
  caller rejects empty src before reading force; future-proofing
  with a defensive parser check is unnecessary today.
- Skeptical L2 (race between cat-file and rev-list under concurrent
  gc): theoretical, two-Command::status() window. The helper is
  invoked synchronously by git push; concurrent gc on the same
  repo is rare. Not in scope.
- Skeptical L4 (git init -b main needs git >= 2.28): all current
  CI runners have a modern git. Not in scope.
- Skeptical L5 (pack uploaded before contract validation runs):
  pre-existing ordering, not introduced here.
- Testing should-add #2 (parse-spec -> build-pack glue test): the
  existing test surface is sufficient given each function is
  unit-tested. A full handle_push test would require WS-API
  scaffolding that's overkill for this scope.
- Testing should-add #4 (bogus `want`): the existing
  `git_resolve_ref` call path catches that earlier in handle_push.
- Big-picture LOW (file follow-up issue for workflow workaround
  cleanup): tracked in issue body as "follow-up"; will file a
  separate ticket once this lands and the caller pin is bumped.

[AI-assisted - Claude]

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sanity added a commit that referenced this pull request May 7, 2026
* fix: build_pack handles force-push when local lacks the remote tip

Closes #21.

## Problem

`build_pack` invoked `git rev-list ^<have> <want>` blindly. When
`<have>` is the remote tip reported by the contract state but the
local repo has never seen it (canonical case: force-pushing an orphan
commit, exactly what snapshot-mode mirroring does), rev-list bailed
with `fatal: bad object <sha>` and the entire push failed.

This was the root cause behind the freenet-core mirror failing 5/5
runs after merge. PR #20 worked around it at the workflow layer with
a `git fetch freenet $BRANCH` before each push; that fix keeps
working but adds bandwidth and only helps callers of the reusable
workflow.

## Fix

Two pieces:

1. `parse_push_spec` now returns `(force, src, dst)` instead of
   silently discarding the leading `+`.
2. `build_pack` takes the new `force: bool` and pre-validates `<have>`
   with `git cat-file -e`. If `<have>` is missing locally and the
   push is a force, drop `<have>` from the rev-list args and send
   everything reachable from `<want>` -- the right semantic for
   "replace whatever was there." If `<have>` is missing and the push
   is non-force, return a directed error pointing the user at
   `git fetch` or `+` rather than the cryptic `fatal: bad object`
   from rev-list.
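
Piece 1 above is a small pure parse. A sketch with an assumed return shape (the real `parse_push_spec` may differ in error handling; note it deliberately does not reject an empty src, which callers handle):

```rust
/// Parse a git push refspec: a leading '+' marks a force push, the rest
/// splits on ':' into (src, dst).
fn parse_push_spec(spec: &str) -> Result<(bool, String, String), String> {
    let (force, rest) = match spec.strip_prefix('+') {
        Some(rest) => (true, rest),
        None => (false, spec),
    };
    match rest.split_once(':') {
        Some((src, dst)) if !dst.is_empty() => Ok((force, src.to_string(), dst.to_string())),
        _ => Err(format!("malformed push spec: {spec}")),
    }
}

fn main() {
    assert_eq!(parse_push_spec("+main:main").unwrap(), (true, "main".into(), "main".into()));
    assert_eq!(parse_push_spec("main:main").unwrap(), (false, "main".into(), "main".into()));
    assert!(parse_push_spec("main").is_err());
}
```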

## Tests

Six new unit tests in `git-remote-freenet.rs` covering:

- `parse_push_spec_force_flag` -- `+` is parsed as force, plain spec
  is non-force.
- `build_pack_no_have_succeeds` -- empty have is the first-push case.
- `build_pack_have_present_succeeds` -- existing fast-forward path.
- `build_pack_missing_have_non_force_fails_with_directed_error` --
  the non-force surface, directed error message.
- `build_pack_missing_have_force_drops_have_and_succeeds` -- the
  exact case that was broken before.
- `object_exists_distinguishes_real_from_bogus` -- pins the
  `cat-file -e` helper.

## End-to-end verification

Reproduced the original failure with the patched binary against the
live demo URL `freenet:3GEERif5ihbf/freenet-core`:

  $ # Fresh orphan repo, force push, no fetch:
  $ git push freenet +main:main
  ==> reading repo state from Freenet
  ==> publishing 226 B pack as a single bundle
  ==> updating repo state on Freenet
  ==> done
   + b2a8836...5d4a733 main -> main (forced update)

  $ # Non-force from fresh orphan -- gets the directed error:
  $ git push freenet main:main
   ! [remote rejected] main -> main (remote tip 5d4a7334... is not
     present in the local repo; run `git fetch` to populate it, or
     use `git push +`/`--force` to overwrite (which will discard the
     remote tip's history))

The freenet-core demo was clobbered briefly during testing and
re-published to its proper snapshot (`b2a8836`) after.

## Follow-up

Once this ships in a release and the caller workflow's pin is
bumped, drop the `git fetch` workaround from
`.github/workflows/mirror-repo.yml`.

[AI-assisted - Claude]

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* review fixes: tighten object check, reorder error, harden tests

Addresses all four internal reviewers + Codex on PR #26.

Codex P2 / Code-first #1 / Skeptical M1: object existence check now
distinguishes "missing", "non-commit", and "tooling failure" cases.
Renamed `object_exists` -> `commit_exists` and reimplemented via
`git rev-parse --verify --quiet <sha>^{commit}`. The new check:

- Returns Ok(true) only if <sha> is present and is a commit object.
  cat-file -e succeeded on blobs and trees, which would have left
  the cryptic `fatal: not a commit object` rev-list error reachable
  for corrupted local stores.
- Distinguishes missing from corrupt-tooling via exit code 1 vs 128;
  the latter bubbles up rather than silently widening the pack
  (force) or producing a misleading "run git fetch" message
  (non-force).
- New test `commit_exists_rejects_blob_and_tree` pins this against
  blob and tree SHAs from the same repo.

Skeptical M2: directed error message reordered to lead with --force
since snapshot/orphan-style pushes are the canonical case driving
this PR. `git fetch` is still listed for the regular fast-forward
recovery scenario but no longer the first remediation.

Code-first #5: error message wording improved -- "use `git push +`"
isn't a real invocation; rephrased as "use `git push --force` (or
prefix the refspec with `+`)". The prior bail!(...) string was
preserved but rephrased.

Code-first #7 / Skeptical L3: test setup now sets per-repo
commit.gpgsign=false and tag.gpgsign=false, so a developer with
GPG signing on globally without a working key won't trip a prompt
in tests.

Code-first #8: `build_pack_have_present_succeeds` was a no-op
(let _ = pack). Renamed to `build_pack_have_equal_to_want_emits_
empty_pack` and now asserts the resulting pack is smaller than
the no-have pack -- pinning that rev-list short-circuits when
have == want.

Testing must-add #1: new test
`build_pack_fast_forward_have_is_real_ancestor` exercises the
production happy path: have is the parent commit, want is the
child. It asserts the incremental pack is non-empty AND smaller
than the full pack, so a regression that silently drops `^<have>`
from the rev-list args would be caught.
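
The property those pack tests pin can be sketched against a scratch
repo (assumed shape, not the crate's test code; file names are
illustrative):

```shell
# A pack built with "want ^have" should be non-empty yet smaller
# than a pack built from want alone, since have-reachable objects
# (here, the big blob) are excluded.
set -u
repo=$(mktemp -d)
g() { git -C "$repo" -c user.name=t -c user.email=t@example.com "$@"; }
g init -q
seq 1 5000 > "$repo/big.txt";  g add big.txt;   g commit -qm one
have=$(g rev-parse HEAD)
echo hello > "$repo/small.txt"; g add small.txt; g commit -qm two
want=$(g rev-parse HEAD)

full=$(printf '%s\n' "$want"              | g pack-objects --revs --stdout | wc -c)
incr=$(printf '%s\n^%s\n' "$want" "$have" | g pack-objects --revs --stdout | wc -c)
echo "full=$full incr=$incr"
```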

Testing should-add #3: directed-error wording now pinned via
substring assertions on "not present", "local repo", "git fetch",
"--force", and the bogus SHA itself.

Big-picture LOW: docstring "Phase 1.0 caveats" block in the helper
gained a "Push semantics" bullet documenting non-force vs force
expectations.

Pushed back / not addressing in this PR:

- Code-first #2 (silent SinglePack/ChunkedPack flip when force drops
  have for large `want` history): real concern but architectural;
  for snapshot mode the orphan commits are small so not a current
  practical issue. Worth a follow-up if a future caller pushes a
  large-history force.
- Code-first #4 (parse_push_spec allows force=true with empty src):
  caller rejects empty src before reading force; future-proofing
  with a defensive parser check is unnecessary today.
- Skeptical L2 (race between cat-file and rev-list under concurrent
  gc): theoretical; the window is between two `Command::status()`
  calls. The helper is invoked synchronously by git push, and
  concurrent gc on the same repo is rare. Not in scope.
- Skeptical L4 (git init -b main needs git >= 2.28): all current
  CI runners have a modern git. Not in scope.
- Skeptical L5 (pack uploaded before contract validation runs):
  pre-existing ordering, not introduced here.
- Testing should-add #2 (parse-spec -> build-pack glue test): the
  existing test surface is sufficient given each function is
  unit-tested. A full handle_push test would require WS-API
  scaffolding that's overkill for this scope.
- Testing should-add #4 (bogus `want`): the existing
  `git_resolve_ref` call path catches that earlier in handle_push.
- Big-picture LOW (file follow-up issue for workflow workaround
  cleanup): tracked in issue body as "follow-up"; will file a
  separate ticket once this lands and the caller pin is bumped.

[AI-assisted - Claude]

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sanity added a commit that referenced this pull request May 7, 2026
…tifier-fail

Addresses Codex P2 + skeptical 1, 3, 8 + big-picture minor on PR #28.

- `olabiniV2/matrix-message@v0.0.1` -> SHA pin
  `5927bc6d98755e4364377fcfe0074aacbc01d237` (with inline `# v0.0.1`
  comment) so a re-tag of v0.0.1 by an external party can't silently
  swap in a different commit with secret access. The org's existing
  `matrix_pr_merge_notify.yml` precedent uses tag-pinning; that's a
  pre-existing footgun this PR opts NOT to extend.
  (Skeptical #3, big-picture minor.)

- `if: failure()` -> `if: failure() || cancelled()` on the mirror
  notification step. A 30-minute job timeout or runner shutdown
  fires `cancelled()` rather than `failure()` and would otherwise
  silently leave the demo behind upstream with no alert.
  (Skeptical #1.)

- The summarize job's `Fail if any rescue failed` step is now
  `if: always() && needs.rescue.result != 'success'`. Without
  always(), a failure in the Matrix-notify step above would skip
  the summary error and the workflow would fail with the notifier's
  error (e.g. missing secrets) instead of the actual rescue failure.
  (Codex P2.)

- Updated the inline comment to acknowledge the
  optional-secrets-mean-extra-red-step UX (the prior wording said
  "errors quietly" which is over-confident -- skeptical #8).

- Tightened wording of the mirror failure message ("Demo URL may
  be behind upstream" rather than the snapshot-mode-specific
  "going stale").
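
Taken together, the workflow changes above look roughly like this
(a sketch: job, step, and script names are illustrative, action
inputs are omitted; only the action name and SHA come from the text):

```yaml
jobs:
  mirror:
    runs-on: ubuntu-latest
    steps:
      - name: Mirror to Freenet
        run: ./mirror.sh            # illustrative
      - name: Notify Matrix on failure
        # cancelled() catches job timeouts and runner shutdowns,
        # which failure() alone would miss.
        if: failure() || cancelled()
        uses: olabiniV2/matrix-message@5927bc6d98755e4364377fcfe0074aacbc01d237 # v0.0.1

  summarize:
    needs: rescue                   # rescue is the matrix job (not shown)
    runs-on: ubuntu-latest
    steps:
      - name: Notify Matrix
        if: needs.rescue.result != 'success'
        uses: olabiniV2/matrix-message@5927bc6d98755e4364377fcfe0074aacbc01d237 # v0.0.1
      - name: Fail if any rescue failed
        # always() keeps this step reachable even if the notify step
        # fails, so the run reports the rescue failure rather than
        # the notifier's error.
        if: always() && needs.rescue.result != 'success'
        run: exit 1
```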

Pushed back / out of scope:

- Skeptical #2 (caller-side gap: explicit `secrets:` map in
  freenet-core / freenet-stdlib doesn't pass the new optional
  Matrix secrets, so the alert is inert for the two production
  callers until they SHA-bump and update their pass-through):
  acknowledged; addressed by companion PRs to those repos that
  land with this one. The "finish the fix" criterion is met by
  the three PRs together, not by this PR alone.

- Skeptical #4 (`secrets.X` in `${{ }}` `with:` input): the
  matrix-message action's API requires this. Switching to a
  better-maintained action with env-var inputs is a bigger
  change worth filing separately, not in this PR.

- Skeptical #5 (push-storm dedup): real concern but requires a
  cross-run notification deduper. Today's worst case is a
  single Matrix room getting N copies of the same alert during
  a sustained gateway outage, which is annoying but actionable.
  Defer until we've seen it happen.

- Skeptical #6 (untrusted input.freenet_repo from fork PRs):
  not reachable today (callers are trusted org repos); the
  workflow header comment already documents that the reusable
  workflow trusts its inputs.

- Skeptical #7 (style: `failure()` vs `result != 'success'`):
  the two halves correctly use different idioms because the
  rescue side is a `summarize` job depending on a matrix job
  (where `result != 'success'` is the canonical aggregate
  check), while the mirror side is a step inside the same job
  (where `failure() || cancelled()` is the canonical step
  predicate).

- Skeptical #9 (snapshot/history wording): reworded to be
  mode-neutral.

- Skeptical #10 (emoji): consistent with org precedent in
  matrix_pr_merge_notify.yml; not changing.

[AI-assisted - Claude]

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
sanity added a commit that referenced this pull request May 7, 2026
* feat: matrix notification on mirror/rescue failure

Closes #23.

## Problem

When `mirror-to-freenet.yml` (in caller repos) or `rescue-demos.yml`
fails, the only signal is GitHub's default email-on-failure-to-author.
In practice nobody actively watches it -- five consecutive
freenet-core mirror failures and a community report of cold clones
on 2026-05-07 went unnoticed for hours despite the email going out.

## Fix

Add a `Notify Matrix on failure` step to both workflows. Uses the
`olabiniV2/matrix-message@v0.0.1` action that's already in use
elsewhere in the org (see `freenet-stdlib/.github/workflows/
matrix_pr_merge_notify.yml`).

`MATRIX_ACCESS_TOKEN` and `MATRIX_ROOM_ID` are org-level secrets at
freenet (visibility: all), so any repo invoking the reusable
workflow with `secrets: inherit` (or passing them explicitly) gets
alerts for free without per-repo setup. Reusable workflow declares
both as optional `secrets:`; if absent, the action errors quietly
inside an already-failing job -- acceptable noise.
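
A minimal sketch of that optional-secrets declaration in the reusable
workflow (structure assumed from the description above):

```yaml
on:
  workflow_call:
    secrets:
      MATRIX_ACCESS_TOKEN:
        required: false
      MATRIX_ROOM_ID:
        required: false
```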

The rescue-demos `summarize` job already aggregated matrix-cell
failures into a workflow-level failure status; the new step posts
to Matrix before failing the workflow.

## Testing

- Both YAML files parse.
- Cannot easily smoke-test without inducing a real failure; will
  observe on next genuine failure.

## Out of scope / follow-up

The caller workflows (freenet-core, freenet-stdlib) currently pass
only `FREENET_GIT_IDENTITY_BUNDLE_BASE64` and `FREENET_GIT_WS_URL`
explicitly. To opt in to Matrix alerts they need to either switch
to `secrets: inherit` (broader blast radius -- skeptical reviewer
flagged this on PR #71) or add `MATRIX_ACCESS_TOKEN` and
`MATRIX_ROOM_ID` to the explicit pass-through. Will follow up with
SHA-bump PRs to the callers after this lands.

[AI-assisted - Claude]

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* review fixes: SHA-pin matrix-message, fail-on-cancel, summarize-on-notifier-fail


---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Successfully merging this pull request may close these issues:

Parallel chunk PUTs for ChunkedPack