chore(dev-loop): kill the slow iteration loop on echo-wesley-gen tests by flyingrobots · Pull Request #383 · flyingrobots/echo

flyingrobots · 2026-05-30T18:14:18Z

Summary

Pre-built wesley-gen binary (OnceLock<PathBuf>): builds the CLI once per test binary instead of cargo run -p echo-wesley-gen -- per test invocation.
Shared CARGO_TARGET_DIR for every generated consumer crate: echo-wasm-abi, echo-registry-api, and serde now compile once for the whole suite instead of once per test.
Auto-staged hook formatters (rustfmt + prettier): when the working tree was clean before the formatter ran, the pre-commit hook now restages and continues instead of aborting and forcing a manual git add -A + recommit cycle.

Measured impact

test_toy_contract_no_std_generated_output_checks_in_consumer_crate was the test that dominated wall-clock time during PR #382 review-resolution work:

	Before	After
warm cache	~96s	1.14s (~85×)
cold cache	~9m + 96s	~46s + 9.06s

Pre-push hook on this branch ran 20 wesley-gen tests in 27.96s (the whole suite was previously upwards of 20–30 minutes warm).

Why

This was identified during PR #382's review-resolution pass, where the verification loop dominated the actual work. The backlog item docs/method/backlog/cool-ideas/PLATFORM_wesley-gen-test-loop-speedup.md (currently sitting on cycle/0025-sessions-as-causal-contexts as part of #382) documents the diagnosis; this PR implements wins #1, #2, and #5 from it. Win #3 (crate-scoped pre-commit) already shipped under scripts/verify-local.sh. Win #4 (cargo-nextest) can ride later.

Determinism / safety

The integration test change is purely test-harness — no production code, no generated wire formats, no codec contract changes. The shared CARGO_TARGET_DIR is opt-in via .env() calls inside the test file only; it does not affect the workspace's normal target directory or CI.

Test plan

cargo clippy -p echo-wesley-gen --tests clean
Cold first run of one integration test (forces the OnceLock build): completes in ~9s after the cargo binary build, vs ~96s previously.
Warm second run: 1.14s.
Pre-push hook (scripts/verify-local.sh pre-push) ran the full wesley-gen suite in 27.96s and passed.
Reviewer should sanity-check by running cargo test -p echo-wesley-gen --test generation twice in a row and confirming the second run completes in seconds, not minutes.

Summary by CodeRabbit

Chores
- Improved git pre-commit hook formatting behavior to better preserve partial staging when auto-formatting is enabled.
- Optimized integration test suite performance through cached build artifacts and shared compilation targets, reducing redundant rebuilds during testing.

Three targeted fixes that compound: pre-built wesley-gen binary, shared CARGO_TARGET_DIR for generated consumer crates, and auto-staged hook formatters. Observed result on the test that previously dominated wall-clock time (test_toy_contract_no_std_generated_output_checks_in_ consumer_crate): Before this commit (warm cache): ~96s After this commit (warm cache): 1.14s After this commit (cold cache): ~9s + build That is roughly an 85x speedup on the hot path. The mechanics: 1) wesley_gen_binary() -- OnceLock<PathBuf> in tests/generation.rs. Builds echo-wesley-gen exactly once per test binary via cargo build with --message-format=json-render-diagnostics, captures the executable path from the compiler-artifact JSON, then exec's the binary directly on every subsequent invocation. Previously every test spawned cargo run -p echo-wesley-gen -- and paid a freshness re-check across the whole dependency graph per test. 2) shared_consumer_target_dir() -- returns target/echo-wesley-gen-test-consumer-cache/ and is wired in via .env("CARGO_TARGET_DIR", ...) on every cargo check/cargo test invocation against a generated consumer crate. Previously each generated crate had its own [workspace] block AND a per-PID parent dir, so echo-wasm-abi, echo-registry-api, and serde compiled from scratch for every single test. Now they compile once and stay cached across both within-run and between-run test invocations. 3) .githooks/pre-commit -- both auto-format paths (cargo fmt and prettier) now auto-stage their output when the working tree was clean before the formatter ran, instead of aborting and forcing a manual git add -A + recommit cycle. Partial staging still aborts safely: if there were unstaged changes before the formatter, we can't tell whether the formatter touched staged hunks, unstaged hunks, or both, so we bail with the same message as before. The common case (working tree clean, you staged everything you wanted) just continues. The cargo verify cost was previously paid twice per typical hook-aborted commit; now it's paid once. The integration test change is purely test-harness; no production code or generated wire formats are affected, so the determinism gates are not at risk. Verified by running test_toy_contract_no_std_generated_output_checks_in_consumer_crate twice back-to-back -- first run 9.06s cold, second run 1.14s warm. Implements wins #1, #2, #5 from docs/method/backlog/cool-ideas/PLATFORM_wesley-gen-test-loop-speedup.md (currently committed on cycle/0025-sessions-as-causal-contexts as part of PR #382). Win #3 (crate-scoped pre-commit) already shipped under scripts/verify-local.sh; win #4 (cargo-nextest) can ride later.

coderabbitai · 2026-05-30T18:14:32Z

📝 Walkthrough

Walkthrough

This PR introduces two independent optimizations: the pre-commit hook now snapshots working-tree dirtiness before Rust and Markdown auto-formatting, auto-staging formatted changes only when the tree was clean beforehand to preserve partial-staging integrity; and the test harness caches the echo-wesley-gen binary and reuses a shared cargo target directory across generated-crate tests to avoid redundant builds.

Changes

Pre-commit hook partial-staging safety

Layer / File(s)	Summary
Rust formatter dirtiness snapshot and conditional auto-stage `.githooks/pre-commit`	`ECHO_AUTO_FMT` branch snapshots whether the working tree had unstaged changes before running `cargo fmt --all`, then auto-stages formatted files and refreshes `STAGED` only if pre-format state was clean; otherwise aborts to preserve partial-staging integrity.
Markdown formatter re-checked dirtiness and conditional auto-stage `.githooks/pre-commit`	Prettier markdown flow re-detects working-tree dirtiness at execution time (because earlier rustfmt auto-staging can make the original snapshot stale), then conditionally auto-stages markdown output only if that markdown-specific working tree is clean; otherwise aborts with the partial-staging hint.

Test harness build caching and target directory reuse

Layer / File(s)	Summary
Cached binary builder and test helper refactor `crates/echo-wesley-gen/tests/generation.rs`	Introduces `OnceLock`-based `wesley_gen_binary()` to build and cache the `echo-wesley-gen` executable (parsing `cargo build` JSON output for the path), adds `shared_consumer_target_dir()` to provision a stable workspace target directory, and refactors `run_wesley_gen*` helpers to spawn the cached binary directly instead of running `cargo run` per invocation.
Apply shared target directory to generated-crate tests `crates/echo-wesley-gen/tests/generation.rs`	Four generated-crate smoke test invocations set `CARGO_TARGET_DIR` to the shared directory so upstream dependencies compile once and are reused. One no-std/minicbor test case is updated to use the `run_wesley_gen_with_args()` helper instead of manual `cargo run`.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

flyingrobots/echo#15: Pre-commit hook auto-format and partial-staging behavior introduced in an earlier iteration of ECHO_AUTO_FMT gating logic.

Poem

🔧 Hook snapshots dirtiness, formats with care,
Stages only when the tree is fair;
Tests now cache binaries, share targets wide—
Builds compile once, and compilers collide no more. ⚡

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	Title directly describes the main objective: optimizing the slow iteration loop on echo-wesley-gen tests through pre-building and caching, which matches the changeset's core improvements.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch chore/test-loop-speedup

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8dda4646b8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-30T18:18:30Z

-      exit 1
+      if [[ "$HAD_UNSTAGED_BEFORE_FMT" -eq 0 ]]; then
+        echo "pre-commit: rustfmt updated files; auto-staging (working tree was clean)." >&2
+        git add -u || { echo "pre-commit: failed to re-stage rustfmt changes" >&2; exit 1; }


Limit rustfmt restaging to staged files

When cargo fmt --all rewrites any tracked Rust file outside the user's staged set while the worktree is otherwise clean, this git add -u stages every tracked formatter change, not just the files that were part of the commit. In that scenario the hook silently expands the commit with unrelated formatting edits; capture the originally staged paths and restage only those (or keep aborting when rustfmt touches other files).

Useful? React with 👍 / 👎.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.githooks/pre-commit:
- Around line 167-174: Update the comment above the markdown re-check to clarify
the true reason for re-snapshotting: explain that we explicitly re-check the
cleanliness of the specific MD_FILES to ensure the exact markdown files we may
auto-stage are clean (markdown-specific safety), rather than saying
HAD_UNSTAGED_BEFORE_FMT is simply “stale”; mention the relevant symbols
(MD_FILES, MD_HAD_UNSTAGED, HAD_UNSTAGED_BEFORE_FMT, rustfmt) so readers know
this is a defensive check targeted at markdown auto-staging rather than a
general whole-tree snapshot correction.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 571a57f8-34ee-4db3-b6e2-ad2257dfd18c

📥 Commits

Reviewing files that changed from the base of the PR and between b8b430a and 8dda464.

📒 Files selected for processing (2)

.githooks/pre-commit
crates/echo-wesley-gen/tests/generation.rs

coderabbitai · 2026-05-30T18:20:30Z

+        # Re-snapshot working-tree dirtiness for the markdown path specifically.
+        # The rustfmt step above may have auto-staged its own changes, so the
+        # earlier HAD_UNSTAGED_BEFORE_FMT value is stale by this point.
+        if git diff --quiet -- "${MD_FILES[@]}"; then
+          MD_HAD_UNSTAGED=0
+        else
+          MD_HAD_UNSTAGED=1
+        fi


🧹 Nitpick | 🔵 Trivial | 💤 Low value

Minor documentation nit: comment reasoning could be clearer.

The comment claims HAD_UNSTAGED_BEFORE_FMT is "stale" after rustfmt auto-staging, but that's not quite accurate:

If HAD_UNSTAGED_BEFORE_FMT=0, rustfmt auto-stages and working tree remains clean → not stale.

If HAD_UNSTAGED_BEFORE_FMT=1, rustfmt aborts before reaching here → never stale.

The real reason for the re-check is markdown-specific safety: verifying the exact files we're about to auto-stage are clean, independent of the earlier whole-tree snapshot. The logic itself is correct and defensive; just the comment's explanation is slightly off.

📝 Clearer comment wording

- # Re-snapshot working-tree dirtiness for the markdown path specifically. - # The rustfmt step above may have auto-staged its own changes, so the - # earlier HAD_UNSTAGED_BEFORE_FMT value is stale by this point. + # Check working-tree dirtiness for the markdown files specifically. + # Even though we know the whole tree was clean before rustfmt, we verify + # these exact files are clean before auto-staging them (defensive check). if git diff --quiet -- "${MD_FILES[@]}"; then

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# Re-snapshot working-tree dirtiness for the markdown path specifically.

# The rustfmt step above may have auto-staged its own changes, so the

# earlier HAD_UNSTAGED_BEFORE_FMT value is stale by this point.

if git diff --quiet -- "${MD_FILES[@]}"; then

MD_HAD_UNSTAGED=0

else

MD_HAD_UNSTAGED=1

fi

# Check working-tree dirtiness for the markdown files specifically.

# Even though we know the whole tree was clean before rustfmt, we verify

# these exact files are clean before auto-staging them (defensive check).

if git diff --quiet -- "${MD_FILES[@]}"; then

MD_HAD_UNSTAGED=0

else

MD_HAD_UNSTAGED=1

fi

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.githooks/pre-commit around lines 167 - 174, Update the comment above the markdown re-check to clarify the true reason for re-snapshotting: explain that we explicitly re-check the cleanliness of the specific MD_FILES to ensure the exact markdown files we may auto-stage are clean (markdown-specific safety), rather than saying HAD_UNSTAGED_BEFORE_FMT is simply “stale”; mention the relevant symbols (MD_FILES, MD_HAD_UNSTAGED, HAD_UNSTAGED_BEFORE_FMT, rustfmt) so readers know this is a defensive check targeted at markdown auto-staging rather than a general whole-tree snapshot correction.

chatgpt-codex-connector Bot reviewed May 30, 2026

View reviewed changes

coderabbitai Bot reviewed May 30, 2026

View reviewed changes

flyingrobots merged commit d3b5ffe into main May 30, 2026
36 checks passed

flyingrobots deleted the chore/test-loop-speedup branch May 30, 2026 18:29

This was referenced May 31, 2026

docs(backlog): two follow-ups from the 2026-05-30 merge session #385

Open

docs(method): five cool-ideas cards from the 2026-05-30 wrap-up #388

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore(dev-loop): kill the slow iteration loop on echo-wesley-gen tests#383

chore(dev-loop): kill the slow iteration loop on echo-wesley-gen tests#383
flyingrobots merged 1 commit into
mainfrom
chore/test-loop-speedup

flyingrobots commented May 30, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 30, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 30, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

flyingrobots commented May 30, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Measured impact

Why

Determinism / safety

Test plan

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Poem

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 30, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

flyingrobots commented May 30, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 30, 2026 •

edited

Loading