Fix FS usage goroutine leaks #1051
Awesome, thx @jimmidyson for the quick fix.
985e846 to 09f948c
@jimmidyson: Can you describe the issue you are trying to fix with this PR?
@@ -746,6 +746,9 @@ func (m *manager) registerCollectors(collectorConfigs map[string]string, cont *c

// Create a container.
func (m *manager) createContainer(containerName string) error {
	m.containersLock.Lock()
Now that you have moved goroutine creation out of the container handler initialization, is it still necessary to guard the entire function with the lock?
The guard is really between createContainer & destroyContainer.
@vishh The linked issue (kubernetes/kubernetes#19633) contains the complete details. cAdvisor can leak goroutines.
@vishh Goroutines were leaked, as detailed in kubernetes/kubernetes#19633. @timothysc found that this was due to filesystem usage tracking. The prior version started tracking filesystem usage even if the container had already been seen, and that tracking was never cleaned up. This PR ensures that the usage-tracking goroutine is started only once.
// Check that the container didn't already exist.
_, ok := m.containers[namespacedName]
if ok {
	return true
As an alternative, if we were to invoke handler.Cleanup() here, would that also fix the issue?
handler.Cleanup() should only be called when a container is removed. Not sure why you would do that here?
The problem, IIUC, is that Cleanup() is not invoked before the container object is left for Go's GC.
I meant that if alreadyExists is true, we can invoke Cleanup() right away.
But I prefer not creating the object if it already exists, and invoking Cleanup() only if we return an error instead of starting housekeeping.
That would let us avoid adding the new Start method.
09f948c to 4e9d29a
@jimmidyson: That was quick :) LGTM. I don't like the additional
@vishh I agree, this was just yet another quick fix... We really need to review cAdvisor's structure so we can tidy up things like this.
Reported in kubernetes/kubernetes#19633
/cc @vishh @timothysc