[kbn/optimizer] Force worker exit, extend parent ping timeout #67235

Closed
wants to merge 2 commits into from

Contributor

@spalger spalger commented May 22, 2020

We're still seeing failures on CI caused by workers that exit early (probably because the parent process doesn't respond to the ping quickly enough)

image

We're also seeing workers that don't close gracefully, for reasons we haven't identified

image

We don't know exactly why this is happening, but it's clearly related to the pings we implemented yesterday. The hope is that forcefully closing the worker internally and extending the ping timeout for the parent will be sufficient to avoid this level of failure on CI.
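To illustrate why extending the ping timeout might help, here is a minimal sketch of a ping watchdog. It is not the @kbn/optimizer implementation: the class name `PingWatchdog`, its methods, and the timeout values are all hypothetical, and timestamps are passed in explicitly so the behavior is easy to follow.

```typescript
// Hypothetical sketch; names and values are illustrative, not taken from @kbn/optimizer.
class PingWatchdog {
  private lastPongAt: number;
  private readonly timeoutMs: number;

  constructor(timeoutMs: number, startedAt: number) {
    this.timeoutMs = timeoutMs;
    this.lastPongAt = startedAt;
  }

  // Record that the parent answered a ping at time `now` (milliseconds).
  recordPong(now: number): void {
    this.lastPongAt = now;
  }

  // True once the parent has been silent for longer than the timeout.
  isExpired(now: number): boolean {
    return now - this.lastPongAt > this.timeoutMs;
  }
}

// A short timeout treats a briefly busy parent as gone; a longer one tolerates the same pause.
const strict = new PingWatchdog(1000, 0);
const lenient = new PingWatchdog(5000, 0);
console.log(strict.isExpired(3000), lenient.isExpired(3000)); // prints: true false
```

Under these assumptions, a parent that is busy for three seconds would cause the worker with the shorter timeout to give up, while the worker with the longer timeout keeps running.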

@spalger spalger added Team:Operations Team label for Operations Team v8.0.0 release_note:skip Skip the PR/issue when compiling release notes v7.9.0 v7.8.1 labels May 22, 2020
@spalger spalger requested a review from a team as a code owner May 22, 2020 01:28
@elasticmachine
Contributor

Pinging @elastic/kibana-operations (Team:Operations)

Member

@mistic mistic left a comment

LGTM

setTimeout(() => {
  send(
    workerMsgs.error(
      new Error('process did not automatically exit within 5 seconds, forcing exit')
Contributor

@tylersmalley tylersmalley May 22, 2020

We probably still want to log an error.

Contributor Author

What error? If we call process.exit() the process is going to exit immediately and we won't be able to set a timer or anything.
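The ordering constraint described here can be sketched as follows: anything that should be reported has to happen before the exit call, because nothing scheduled afterwards will run. This is a hedged illustration only. `forceExit`, `report`, and `exit` are hypothetical names; in a real worker, `exit` would presumably be `process.exit` and `report` the worker's message sender. Whether a reported message is actually delivered before the process ends depends on the transport, which this sketch does not address.

```typescript
// Hypothetical sketch; `report` and `exit` stand in for the worker's message
// sender and for process.exit, so the example can run without ending the process.
function forceExit(
  report: (error: Error) => void,
  exit: (code: number) => void,
  graceSeconds: number
): void {
  // Report first: once exit() is process.exit(), no later statement or timer callback runs.
  report(
    new Error(`process did not automatically exit within ${graceSeconds} seconds, forcing exit`)
  );
  exit(1);
}

// Usage with recording stand-ins, so the call order is visible.
const calls: string[] = [];
forceExit(
  (error) => calls.push(`report: ${error.message}`),
  (code) => calls.push(`exit: ${code}`),
  5
);
console.log(calls);
```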

Contributor

@tylersmalley tylersmalley left a comment

LGTM - just one minor comment.

@kibanamachine
Contributor

💔 Build Failed

Failed CI Steps


Test Failures

Kibana Pipeline / kibana-intake-agent / Jest Integration Tests.src/dev/code_coverage/ingest_coverage/integration_tests.Ingesting coverage to the coverage index should result in every posted item having a site url that meets all regex assertions

Link to Jenkins

Standard Out

Failed Tests Reporter:
  - Test has failed 1 times on tracked branches: https://github.com/elastic/kibana/issues/67075


Stack Trace

Error: Failed: 1
    at Env.fail (/var/lib/jenkins/workspace/elastic+kibana+pipeline-pull-request/kibana/node_modules/jest-jasmine2/build/jasmine/Env.js:778:61)
    at ChildProcess.next (/var/lib/jenkins/workspace/elastic+kibana+pipeline-pull-request/kibana/node_modules/jest-jasmine2/build/queueRunner.js:31:24)
    at ChildProcess.emit (events.js:198:13)
    at maybeClose (internal/child_process.js:982:16)
    at Socket.stream.socket.on (internal/child_process.js:389:11)
    at Socket.emit (events.js:198:13)
    at Pipe._handle.close (net.js:607:12)

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@spalger
Contributor Author

spalger commented May 22, 2020

I'm just going to revert the changes I've made to the optimizer recently. There is clearly something wrong with the strategy here, and I'm really unsure that this is going to make things better.

@spalger spalger closed this May 22, 2020
@spalger spalger deleted the extend-parent-ping-timeout branch August 18, 2020 18:01
Labels
release_note:skip Skip the PR/issue when compiling release notes Team:Operations Team label for Operations Team v7.8.1 v7.9.0 v8.0.0
5 participants