You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite. Learn more
Code Review: feat(workflow-engine): add retryOnTimeout opt-in for step timeout retries
Overview
This PR adds a retryOnTimeout?: boolean flag to StepConfig so step timeouts can opt into normal retry behavior instead of always being treated as critical failures. The change is small and well-targeted.
Correctness
The core logic change in context.ts is correct: the guard error instanceof StepTimeoutError && !config.retryOnTimeout correctly short-circuits to a critical failure only when the flag is absent/false, and falls through to the normal retry path otherwise.
The kind: "timeout" disambiguation for the exhausted-after-retries case is also correct — TryStepCatchKind already had "timeout" as a variant, and it now consistently means "this step failed due to a timeout" whether retried or not.
Issues
Missing tryStep test coverage
The new behavior interacts with tryStep's catch filter, but there is no test for that combination. Specifically:
awaitctx.tryStep({name: "foo",retryOnTimeout: true,maxRetries: 2,catch: ["timeout"],run: async()=>{/* always times out */},});
This should resolve to { ok: false, failure: { kind: "timeout", ... } } rather than throwing. Without a test, it is easy for a future refactor to silently break this. There is a try.test.ts that already covers the tryStep + catch surface — adding a case there would be natural.
Multi-line comment in context.ts
Per project conventions, comments should be one short line max. The new two-line comment at line 946-947 should be condensed:
// Timeout errors are critical by default; opt in to retries with retryOnTimeout: true.if(errorinstanceofStepTimeoutError&&!config.retryOnTimeout){
Grammar nit in retries.md
when retries exhaust on a timeout the try-step failure kind is "timeout"
Should be "when retries are exhausted".
Minor observations
The test timing (5 ms step timeout, 25 ms step duration, 10 ms inter-run wait) is tight but consistent with the existing steps.test.ts pattern for yield-mode tests, so it is acceptable.
The StepConfig JSDoc comment for retryOnTimeout is appropriately concise.
No persistence schema changes are required since retryOnTimeout is call-site config, not stored state.
Summary
The implementation is correct and the docs update is accurate. Two things worth addressing before merge:
Add a tryStep + catch: ["timeout"] + retryOnTimeout: true test to try.test.ts.
Collapse the two-line comment in context.ts to one line.
The grammar fix in the docs is minor and can be bundled with the above.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Please include a summary of the changes and the related issue. Please also include relevant motivation and context.
Type of change
How Has This Been Tested?
Please describe the tests that you ran to verify your changes.
Checklist: