Actually fix the software factory shard 1/3 instantiate-validation flake (follow-up to #4782)#4802
Open
habdelra wants to merge 3 commits into
Open
Actually fix the software factory shard 1/3 instantiate-validation flake (follow-up to #4782)#4802habdelra wants to merge 3 commits into
habdelra wants to merge 3 commits into
Conversation
The previous approach (per-file `client.write` + a 90s poll waiting
for the Spec to surface in `_federated-search`) was a polling race
against realm-side async indexing. CI run 25768256725 confirmed even
a 90s budget sometimes never sees the Spec, regardless of write
ordering or whether the bad-example file existed when the indexer
processed the Spec.
Switch to the realm-server's first-class read-after-write boundary:
- Stage all three files in a fresh temp dir.
- `client.sync(..., { preferLocal: true, waitForIndex: true })`
pushes them as a single `_atomic` batch with `?waitForIndex=true`,
which makes the realm-server's `_atomic` handler `await
performIndex()` before responding.
- Cleanup the temp dir.
This trades one push round-trip for a deterministic
"indexer is settled" boundary — the call doesn't return until the
realm has processed every seeded file, so the downstream
`InstantiateValidationStep.discoverRealmSpecs` is guaranteed to see
the Spec on its first search hit.
Also adds one diagnostic log on the way out (realm file listing +
current Spec search hits) so if the realm's `performIndex` returns
without the Spec actually being searchable — the failure mode we
were trying to diagnose — we capture it in CI logs alongside the
`InstantiateValidationStep` warn-log added earlier in this PR.
Removed: the `awaitSpecSearchable` poll-and-warn helper. The
underlying mechanism it papered over (realm acks file write before
indexer surfaces it in search) is now handled at the realm side
via the atomic-upload waitForIndex query param.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Diagnosed the actual flake on PR #4782 by stripping linkedExamples from the Spec — the bare Spec then surfaced in `_federated-search` within seconds, confirming the realm indexer drops a Spec whenever its `linkedExamples` `loadLinks` walk can't settle on the target. That happens both when the target is missing (404 — the variant tried previously in this PR) and when the target is in error_doc state (containsMany got a string — the original CI flake). Either way, seeding the realm directly with a broken-shape example silently disqualifies the Spec from search. So: - Seed the realm-side `TagsCard/bad-example.json` with a well-formed empty-tags placeholder so the Spec's loadLinks walk completes and the Spec surfaces normally. - New `overwriteTagsExampleWithBadShape(workspaceDir)` helper writes the broken `containsMany: "not-an-array"` shape into the workspace copy. The validation pipeline reads example JSON from the workspace path (see `prepareExampleInstance` in instantiate-execution.ts), not from the realm index, so the bad shape only needs to live in the workspace at the moment the step or in-memory tool reads it. - Both `instantiate-validation.spec.ts:91` and `run-instantiate-in-memory.spec.ts:71` call the overwrite helper after `client.pull` so the pull doesn't immediately stomp the workspace file. Local verification: both previously flaking tests now pass deterministically across runs. Two unrelated tests in the same in-memory spec file (197, 295) still flake locally with realm-server "database does not exist" errors during harness teardown — separate infra issue, not from this change. The realm-indexer behavior (dropping a Spec from search whenever its linkedExamples target won't settle) is a real bug worth filing separately; this PR is just unblocking the e2e on the fixture side. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR addresses flakiness in instantiate-validation e2e tests by making fixture seeding deterministic and preventing realm-side indexing from dropping the Spec due to an error_doc linked example.
Changes:
- Seed fixture files via a single
client.sync(..., { waitForIndex: true })to establish a deterministic “indexer settled” boundary. - Seed a well-formed realm-side
TagsCard/bad-example.json, then overwrite the workspace copy post-pullwith the intentionally broken shape. - Update the two affected e2e specs to apply the workspace overwrite after
client.pull.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| packages/software-factory/tests/helpers/instantiate-test-fixtures.ts | Adds sync-based seeding with waitForIndex, introduces valid placeholder example, and provides a workspace overwrite helper for the broken shape |
| packages/software-factory/tests/instantiate-validation.spec.ts | Overwrites the workspace example after pull so validation reads the broken shape without poisoning realm indexing |
| packages/software-factory/tests/run-instantiate-in-memory.spec.ts | Same workspace overwrite after pull in the relevant in-memory e2e cases |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
The helper writes each `files[].path` under a `mkdtempSync` staging dir via `join(stagingDir, path)`. Callers in this file all pass relative literal paths, so an escape isn't actually reachable today, but the guard is near-zero-cost and prevents a future caller (or a copy-paste outside this file) from accidentally pointing the seed at a path outside the temp dir. Reject `isAbsolute(path)` and any path segment of `..`. Both conditions are checked before any disk touch so a bad input fails loudly with a useful message rather than silently writing somewhere unexpected on the test machine. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PR #4782 added diagnostic logging and a soft
awaitSpecSearchablegate, but didn't actually fix the underlying flake. After it merged
I confirmed the root cause and landed two more commits on the
branch that didn't make the merge — this PR brings them up.
The actual bug
The realm-server's indexer drops a Spec from
_federated-searchwhenever its
linkedExamplesloadLinkswalk can't settle on thetarget — both when the target file is missing (404) and when the
target card is in error_doc state. In the failing fixture, the Spec
linked to
TagsCard/bad-example.json, the bad shape(
containsManygot a string) put that example in error_doc onindexing, the Spec got silently disqualified from search, and
InstantiateValidationStep.discoverRealmSpecscame back empty.Confirmed by stripping
linkedExamplesfrom the Spec in anexperiment — the bare Spec surfaces in search within seconds.
What this PR does
Seed instantiate-validation fixtures via sync(waitForIndex:true)Stages the seed files in a temp dir and pushes via
client.sync(..., { preferLocal: true, waitForIndex: true }). The_atomicupload appends?waitForIndex=true, so the realm-serverblocks on
performIndex()before responding. Replaces theper-file
client.write+ polling-for-search approach with adeterministic "indexer is settled" boundary. (This was the
intermediate fix I tried before realizing it didn't help on its
own — but it's still the right shape and pairs cleanly with the
real fix below.)
Keep realm-side example well-formed; overwrite workspace copySeeds the realm-side
TagsCard/bad-example.jsonwith awell-formed empty-tags placeholder so the Spec's
loadLinkswalk completes. New
overwriteTagsExampleWithBadShape(workspaceDir)helper writes the broken
containsMany: "not-an-array"shapeinto the workspace copy after
client.pull. The validationpipeline reads example JSON from
workspaceDir(seeprepareExampleInstanceininstantiate-execution.ts), not fromthe realm index, so the bad shape only needs to live in the
workspace at the moment the step / in-memory tool reads it.
Both
instantiate-validation.spec.ts:91andrun-instantiate-in-memory.spec.ts:71get theoverwriteTagsExampleWithBadShapecall afterclient.pull.Local verification
Both previously flaking tests now pass deterministically across
local runs. The realm-indexer behavior (Spec dropped from search
whenever its
linkedExamplestarget won't settle) is worth filingseparately — this PR just unblocks the e2e on the fixture side.
Test plan
pnpm lintclean (per-package)pnpm test:playwright tests/instantiate-validation.spec.ts tests/run-instantiate-in-memory.spec.ts— both target testspass; two unrelated tests in the same in-memory spec file
(197, 295) still flake locally with realm-server "database
does not exist" errors during harness teardown (separate infra
issue, not from this change)
🤖 Generated with Claude Code