-
Notifications
You must be signed in to change notification settings - Fork 850
flow step retry feature #493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| } | ||
| } | ||
| tracing::error!(job_id = %job.id, "Error handling job: {err}"); | ||
| tracing::error!(job_id = %job.id, err = err.alt(), "error handling job"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The errors are also displayed in the frontend, need to check what impact this has
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tracing::error! shouldn't end up anywhere near the frontend/user? (or does this end up in the job logs somehow -- github won't show me context?)
|
It should show error causes, like messages from underlying errors wrapped using `context()` into `anyhow::Error`. But this might be a bit unpredictable if we are trying to hide underlying errors like from sqlx.
It made debugging a bit easier for a bit but i wouldn't miss it if we want remove it if we dont want to spend time checking if the impact is negative
On August 27, 2022 11:56:55 PM PDT, Ruben Fiszel ***@***.***> wrote:
***@***.*** commented on this pull request.
…
> @@ -240,7 +241,7 @@ pub async fn run_worker(
}
}
}
- tracing::error!(job_id = %job.id, "Error handling job: {err}");
+ tracing::error!(job_id = %job.id, err = err.alt(), "error handling job");
The errors are also displayed in the frontend, need to check what impact this has
--
Reply to this email directly or view it on GitHub:
#493 (review)
You are receiving this because you authored the thread.
Message ID: ***@***.***>
|
| /* pass fail */ | ||
| (0, Some(99)), | ||
| (1, None), | ||
| /* pass pass fail */ | ||
| (0, Some(99)), | ||
| (1, Some(99)), | ||
| (2, None), | ||
| /* pass pass pass */ | ||
| (0, Some(3)), | ||
| (1, Some(5)), | ||
| (2, Some(7)), | ||
| /* fail the last step once */ | ||
| (0xff, None), | ||
| (0xff, Some(9)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i've re-read the tests a few times and I couldn't make much sense of the meaning of 99, and 3,5,7 and attempts tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The particular numbers are arbitrary, but the values from the first two tries should be rejected so having them be 99 should help us know, if we see it, that it shouldn't be there and where it came from.
The attempts assert is to verify the steps are running/re-running in the right order.
rubenfiszel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of great things here, would prefer to merge #491 first
also renamed duration to interval to be more specific about the retry interval/period between tries or attempts
rubenfiszel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing the openapi spec changes
When a job fails and a retry is possible,
push_next_flow_jobcreates a new job scheduled for some time based on the retry configuration.The previous job
argsare not reused because there were some weird things happening to do with for loops running using the wrong arguments passed in. So thelast_result/previous_resultis stored in the retry status so that it can be replayed later if a retry is necessary in order to re-calculate the jobargsagain.