Skip to content

chore(dev-loop): kill the slow iteration loop on echo-wesley-gen tests#383

Merged
flyingrobots merged 1 commit into
mainfrom
chore/test-loop-speedup
May 30, 2026
Merged

chore(dev-loop): kill the slow iteration loop on echo-wesley-gen tests#383
flyingrobots merged 1 commit into
mainfrom
chore/test-loop-speedup

Conversation

@flyingrobots
Copy link
Copy Markdown
Owner

@flyingrobots flyingrobots commented May 30, 2026

Summary

  • Pre-built wesley-gen binary (OnceLock<PathBuf>): builds the CLI once per test binary instead of cargo run -p echo-wesley-gen -- per test invocation.
  • Shared CARGO_TARGET_DIR for every generated consumer crate: echo-wasm-abi, echo-registry-api, and serde now compile once for the whole suite instead of once per test.
  • Auto-staged hook formatters (rustfmt + prettier): when the working tree was clean before the formatter ran, the pre-commit hook now restages and continues instead of aborting and forcing a manual git add -A + recommit cycle.

Measured impact

test_toy_contract_no_std_generated_output_checks_in_consumer_crate was the test that dominated wall-clock time during PR #382 review-resolution work:

Before After
warm cache ~96s 1.14s (~85×)
cold cache ~9m + 96s ~46s + 9.06s

Pre-push hook on this branch ran 20 wesley-gen tests in 27.96s (the whole suite was previously upwards of 20–30 minutes warm).

Why

This was identified during PR #382's review-resolution pass, where the verification loop dominated the actual work. The backlog item docs/method/backlog/cool-ideas/PLATFORM_wesley-gen-test-loop-speedup.md (currently sitting on cycle/0025-sessions-as-causal-contexts as part of #382) documents the diagnosis; this PR implements wins #1, #2, and #5 from it. Win #3 (crate-scoped pre-commit) already shipped under scripts/verify-local.sh. Win #4 (cargo-nextest) can ride later.

Determinism / safety

The integration test change is purely test-harness — no production code, no generated wire formats, no codec contract changes. The shared CARGO_TARGET_DIR is opt-in via .env() calls inside the test file only; it does not affect the workspace's normal target directory or CI.

Test plan

  • cargo clippy -p echo-wesley-gen --tests clean
  • Cold first run of one integration test (forces the OnceLock build): completes in ~9s after the cargo binary build, vs ~96s previously.
  • Warm second run: 1.14s.
  • Pre-push hook (scripts/verify-local.sh pre-push) ran the full wesley-gen suite in 27.96s and passed.
  • Reviewer should sanity-check by running cargo test -p echo-wesley-gen --test generation twice in a row and confirming the second run completes in seconds, not minutes.

Summary by CodeRabbit

  • Chores
    • Improved git pre-commit hook formatting behavior to better preserve partial staging when auto-formatting is enabled.
    • Optimized integration test suite performance through cached build artifacts and shared compilation targets, reducing redundant rebuilds during testing.

Review Change Stack

Three targeted fixes that compound: pre-built wesley-gen binary,
shared CARGO_TARGET_DIR for generated consumer crates, and auto-staged
hook formatters. Observed result on the test that previously dominated
wall-clock time (test_toy_contract_no_std_generated_output_checks_in_
consumer_crate):

  Before this commit (warm cache):     ~96s
  After this commit (warm cache):       1.14s
  After this commit (cold cache):      ~9s + build

That is roughly an 85x speedup on the hot path. The mechanics:

1) wesley_gen_binary() -- OnceLock<PathBuf> in tests/generation.rs.
   Builds echo-wesley-gen exactly once per test binary via cargo
   build with --message-format=json-render-diagnostics, captures the
   executable path from the compiler-artifact JSON, then exec's the
   binary directly on every subsequent invocation. Previously every
   test spawned cargo run -p echo-wesley-gen -- and paid a freshness
   re-check across the whole dependency graph per test.

2) shared_consumer_target_dir() -- returns
   target/echo-wesley-gen-test-consumer-cache/ and is wired in via
   .env("CARGO_TARGET_DIR", ...) on every cargo check/cargo test
   invocation against a generated consumer crate. Previously each
   generated crate had its own [workspace] block AND a per-PID
   parent dir, so echo-wasm-abi, echo-registry-api, and serde
   compiled from scratch for every single test. Now they compile
   once and stay cached across both within-run and between-run
   test invocations.

3) .githooks/pre-commit -- both auto-format paths (cargo fmt and
   prettier) now auto-stage their output when the working tree was
   clean before the formatter ran, instead of aborting and forcing a
   manual git add -A + recommit cycle. Partial staging still aborts
   safely: if there were unstaged changes before the formatter, we
   can't tell whether the formatter touched staged hunks, unstaged
   hunks, or both, so we bail with the same message as before. The
   common case (working tree clean, you staged everything you
   wanted) just continues. The cargo verify cost was previously
   paid twice per typical hook-aborted commit; now it's paid once.

The integration test change is purely test-harness; no production
code or generated wire formats are affected, so the determinism
gates are not at risk. Verified by running
test_toy_contract_no_std_generated_output_checks_in_consumer_crate
twice back-to-back -- first run 9.06s cold, second run 1.14s warm.

Implements wins #1, #2, #5 from
docs/method/backlog/cool-ideas/PLATFORM_wesley-gen-test-loop-speedup.md
(currently committed on cycle/0025-sessions-as-causal-contexts as
part of PR #382). Win #3 (crate-scoped pre-commit) already shipped
under scripts/verify-local.sh; win #4 (cargo-nextest) can ride later.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 30, 2026

📝 Walkthrough

Walkthrough

This PR introduces two independent optimizations: the pre-commit hook now snapshots working-tree dirtiness before Rust and Markdown auto-formatting, auto-staging formatted changes only when the tree was clean beforehand to preserve partial-staging integrity; and the test harness caches the echo-wesley-gen binary and reuses a shared cargo target directory across generated-crate tests to avoid redundant builds.

Changes

Pre-commit hook partial-staging safety

Layer / File(s) Summary
Rust formatter dirtiness snapshot and conditional auto-stage
.githooks/pre-commit
ECHO_AUTO_FMT branch snapshots whether the working tree had unstaged changes before running cargo fmt --all, then auto-stages formatted files and refreshes STAGED only if pre-format state was clean; otherwise aborts to preserve partial-staging integrity.
Markdown formatter re-checked dirtiness and conditional auto-stage
.githooks/pre-commit
Prettier markdown flow re-detects working-tree dirtiness at execution time (because earlier rustfmt auto-staging can make the original snapshot stale), then conditionally auto-stages markdown output only if that markdown-specific working tree is clean; otherwise aborts with the partial-staging hint.

Test harness build caching and target directory reuse

Layer / File(s) Summary
Cached binary builder and test helper refactor
crates/echo-wesley-gen/tests/generation.rs
Introduces OnceLock-based wesley_gen_binary() to build and cache the echo-wesley-gen executable (parsing cargo build JSON output for the path), adds shared_consumer_target_dir() to provision a stable workspace target directory, and refactors run_wesley_gen* helpers to spawn the cached binary directly instead of running cargo run per invocation.
Apply shared target directory to generated-crate tests
crates/echo-wesley-gen/tests/generation.rs
Four generated-crate smoke test invocations set CARGO_TARGET_DIR to the shared directory so upstream dependencies compile once and are reused. One no-std/minicbor test case is updated to use the run_wesley_gen_with_args() helper instead of manual cargo run.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • flyingrobots/echo#15: Pre-commit hook auto-format and partial-staging behavior introduced in an earlier iteration of ECHO_AUTO_FMT gating logic.

Poem

🔧 Hook snapshots dirtiness, formats with care,
Stages only when the tree is fair;
Tests now cache binaries, share targets wide—
Builds compile once, and compilers collide no more. ⚡

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed Title directly describes the main objective: optimizing the slow iteration loop on echo-wesley-gen tests through pre-building and caching, which matches the changeset's core improvements.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch chore/test-loop-speedup

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8dda4646b8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread .githooks/pre-commit
exit 1
if [[ "$HAD_UNSTAGED_BEFORE_FMT" -eq 0 ]]; then
echo "pre-commit: rustfmt updated files; auto-staging (working tree was clean)." >&2
git add -u || { echo "pre-commit: failed to re-stage rustfmt changes" >&2; exit 1; }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Limit rustfmt restaging to staged files

When cargo fmt --all rewrites any tracked Rust file outside the user's staged set while the worktree is otherwise clean, this git add -u stages every tracked formatter change, not just the files that were part of the commit. In that scenario the hook silently expands the commit with unrelated formatting edits; capture the originally staged paths and restage only those (or keep aborting when rustfmt touches other files).

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.githooks/pre-commit:
- Around line 167-174: Update the comment above the markdown re-check to clarify
the true reason for re-snapshotting: explain that we explicitly re-check the
cleanliness of the specific MD_FILES to ensure the exact markdown files we may
auto-stage are clean (markdown-specific safety), rather than saying
HAD_UNSTAGED_BEFORE_FMT is simply “stale”; mention the relevant symbols
(MD_FILES, MD_HAD_UNSTAGED, HAD_UNSTAGED_BEFORE_FMT, rustfmt) so readers know
this is a defensive check targeted at markdown auto-staging rather than a
general whole-tree snapshot correction.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 571a57f8-34ee-4db3-b6e2-ad2257dfd18c

📥 Commits

Reviewing files that changed from the base of the PR and between b8b430a and 8dda464.

📒 Files selected for processing (2)
  • .githooks/pre-commit
  • crates/echo-wesley-gen/tests/generation.rs

Comment thread .githooks/pre-commit
Comment on lines +167 to +174
# Re-snapshot working-tree dirtiness for the markdown path specifically.
# The rustfmt step above may have auto-staged its own changes, so the
# earlier HAD_UNSTAGED_BEFORE_FMT value is stale by this point.
if git diff --quiet -- "${MD_FILES[@]}"; then
MD_HAD_UNSTAGED=0
else
MD_HAD_UNSTAGED=1
fi
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial | 💤 Low value

Minor documentation nit: comment reasoning could be clearer.

The comment claims HAD_UNSTAGED_BEFORE_FMT is "stale" after rustfmt auto-staging, but that's not quite accurate:

  • If HAD_UNSTAGED_BEFORE_FMT=0, rustfmt auto-stages and working tree remains clean → not stale.
  • If HAD_UNSTAGED_BEFORE_FMT=1, rustfmt aborts before reaching here → never stale.

The real reason for the re-check is markdown-specific safety: verifying the exact files we're about to auto-stage are clean, independent of the earlier whole-tree snapshot. The logic itself is correct and defensive; just the comment's explanation is slightly off.

📝 Clearer comment wording
-        # Re-snapshot working-tree dirtiness for the markdown path specifically.
-        # The rustfmt step above may have auto-staged its own changes, so the
-        # earlier HAD_UNSTAGED_BEFORE_FMT value is stale by this point.
+        # Check working-tree dirtiness for the markdown files specifically.
+        # Even though we know the whole tree was clean before rustfmt, we verify
+        # these exact files are clean before auto-staging them (defensive check).
         if git diff --quiet -- "${MD_FILES[@]}"; then
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Re-snapshot working-tree dirtiness for the markdown path specifically.
# The rustfmt step above may have auto-staged its own changes, so the
# earlier HAD_UNSTAGED_BEFORE_FMT value is stale by this point.
if git diff --quiet -- "${MD_FILES[@]}"; then
MD_HAD_UNSTAGED=0
else
MD_HAD_UNSTAGED=1
fi
# Check working-tree dirtiness for the markdown files specifically.
# Even though we know the whole tree was clean before rustfmt, we verify
# these exact files are clean before auto-staging them (defensive check).
if git diff --quiet -- "${MD_FILES[@]}"; then
MD_HAD_UNSTAGED=0
else
MD_HAD_UNSTAGED=1
fi
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.githooks/pre-commit around lines 167 - 174, Update the comment above the
markdown re-check to clarify the true reason for re-snapshotting: explain that
we explicitly re-check the cleanliness of the specific MD_FILES to ensure the
exact markdown files we may auto-stage are clean (markdown-specific safety),
rather than saying HAD_UNSTAGED_BEFORE_FMT is simply “stale”; mention the
relevant symbols (MD_FILES, MD_HAD_UNSTAGED, HAD_UNSTAGED_BEFORE_FMT, rustfmt)
so readers know this is a defensive check targeted at markdown auto-staging
rather than a general whole-tree snapshot correction.

@flyingrobots flyingrobots merged commit d3b5ffe into main May 30, 2026
36 checks passed
@flyingrobots flyingrobots deleted the chore/test-loop-speedup branch May 30, 2026 18:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant