-
Notifications
You must be signed in to change notification settings - Fork 8.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[kbn/optimizer] Force worker exit, extend parent ping timeout #67235
Conversation
Pinging @elastic/kibana-operations (Team:Operations) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
setTimeout(() => { | ||
send( | ||
workerMsgs.error( | ||
new Error('process did not automatically exit within 5 seconds, forcing exit') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably still want to log an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What error? If we call process.exit() the process is going to exit immediately and we won't be able to set a timer or anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - just one minor comment.
💔 Build Failed
Failed CI StepsTest FailuresKibana Pipeline / kibana-intake-agent / Jest Integration Tests.src/dev/code_coverage/ingest_coverage/integration_tests.Ingesting coverage to the coverage index should result in every posted item having a site url that meets all regex assertionsStandard Out
Stack Trace
To update your PR or re-run it, just comment with: |
I'm just going to revert the changes I've made to the optimizer recently, there is clearly something wrong with the strategy here and I'm really unsure that this is going to make things better. |
We're still seeing failures on CI caused by workers who are exiting early (probably because the parent process doesn't response to the ping quickly enough)
As well as workers which don't gracefully close for some reason
We don't know exactly why this is happening, but it's clearly related to the pings we implemented yesterday, hoping that forcefully closing the worker internally, and extending the ping timeout for the parent will be sufficient to avoid this level of failure on CI.