Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix go routine leak and refactor concurrency on external checks #548

Merged
merged 12 commits into from
Jul 10, 2020

Conversation

integrii
Copy link
Collaborator

@integrii integrii commented Jul 7, 2020

Should be a more inclusive fix for #537

@integrii integrii closed this Jul 7, 2020
@integrii integrii reopened this Jul 7, 2020
@integrii
Copy link
Collaborator Author

integrii commented Jul 7, 2020

Not only is the memory leak gone, but memory use overall looks much lower after this PR...

image

pkg/checks/external/main.go Outdated Show resolved Hide resolved
pkg/checks/external/main.go Outdated Show resolved Hide resolved
@integrii integrii changed the title Pr fix goroutine leak Fix go routine leak and refactor concurrency on external checks Jul 7, 2020
Copy link
Collaborator

@joshulyne joshulyne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm 👍

cmd/kuberhealthy/Makefile Outdated Show resolved Hide resolved
cmd/kuberhealthy/Makefile Outdated Show resolved Hide resolved
case <-sawRemovalChan: // we saw the watched pod remove
case <-ctx.Done(): // graceful shutdown signal
ext.log("pod shutdown monitor stopping gracefully")
case <-ext.waitForDeletedEvent(watcher): // we saw the watched pod remove
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's possible that waitForDeletedEvent(watcher) can return an error to this channel when a Watch.Error event occurs. In that case, should the we recreate the watcher and call waitForDeletedEvent(watcher) again?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's too hard to implement retries here because we won't know if the error is because the supplied watcher is completely closed, or some other condition. As long as we pass the error upstream, it's probably better to do the retry logic at a higher level in the code.

I have had bad experiences trying to make every thing that tries anything do retries...

@integrii integrii merged commit 0e13380 into master Jul 10, 2020
@integrii integrii deleted the pr-fix-goroutine-leak branch July 13, 2020 22:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants