Fix error in k8s after job completes #2804

Merged (1 commit) on Jun 6, 2024

DrJosh9000 commented May 28, 2024

Description

There's a spurious error in some k8s job logs:

🚨 Error: Error waiting for client interrupt: context canceled

This happens when a job is ending normally. The cause is that the surrounding context is being cancelled, and the goroutine that logs the error then races with the closure of the log writer.

While I'm here, I noticed that cancelCh is unbuffered, but there is a mismatch between sends and receives which would lead to leaking the goroutine that logs the message. So I fixed that by using a "close-with-guard pattern" (that's what I'm calling it, anyway).
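
For readers unfamiliar with the idea, here is a minimal sketch of closing a channel behind a guard instead of sending on it. It is not the code from this PR: the `canceller` type, its methods, and the use of `sync.Once` as the guard are all assumptions made for illustration. The point is that a close releases every receiver (and any receiver that arrives later), so nothing blocks forever on an unmatched send, while the guard keeps a second cancel from panicking on a double close.

```go
package main

import (
	"fmt"
	"sync"
)

// canceller is a hypothetical type used only for this sketch.
type canceller struct {
	once     sync.Once     // the guard: ensures close happens at most once
	cancelCh chan struct{} // unbuffered; closed (never sent on) to signal cancel
}

func newCanceller() *canceller {
	return &canceller{cancelCh: make(chan struct{})}
}

// Cancel closes cancelCh the first time it is called and does nothing on
// later calls, so callers do not need to coordinate who cancels.
func (c *canceller) Cancel() {
	c.once.Do(func() { close(c.cancelCh) })
}

// Done returns a channel that is closed once Cancel has been called.
func (c *canceller) Done() <-chan struct{} {
	return c.cancelCh
}

func main() {
	c := newCanceller()

	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			<-c.Done() // every waiter is released by the single close
			fmt.Println("waiter released:", id)
		}(i)
	}

	c.Cancel()
	c.Cancel() // safe: the guard prevents a second close

	wg.Wait()
	fmt.Println("all waiters released")
}
```

With a send on an unbuffered channel, exactly one receiver must be present per send; with a guarded close, the number of cancel calls and the number of receivers no longer have to match.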

Context

Noticed a bug, and it was pretty easy to fix.

Changes

  • Don't log the error if the context was cancelled
  • Add some explanatory comments
  • Change the cancelCh usage from sending to closing

Testing

  • Tests have run locally (with go test ./...). Buildkite employees may check this if the pipeline has run automatically.
  • Code is formatted (with go fmt ./...)

DrJosh9000 requested a review from a team May 28, 2024 03:28
DrJosh9000 force-pushed the fix-client-interrupt-wait-error branch from 3eb2f54 to daa9e7a May 28, 2024 07:22
DrJosh9000 enabled auto-merge May 28, 2024 07:22
DrJosh9000 force-pushed the fix-client-interrupt-wait-error branch 2 times, most recently from 7405455 to 4400ca0 June 5, 2024 03:07
DrJosh9000 force-pushed the fix-client-interrupt-wait-error branch from 4400ca0 to 8c4946a June 5, 2024 07:02
zhming0 left a comment

🚀 ! 👍🏿 for all these descriptive code comments.

DrJosh9000 merged commit 548cf08 into main Jun 6, 2024
1 check passed
DrJosh9000 deleted the fix-client-interrupt-wait-error branch June 6, 2024 00:09