Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix FS usage goroutine leaks #1051

Merged
merged 1 commit into from Jan 14, 2016
Merged

Conversation

jimmidyson
Copy link
Collaborator

@k8s-bot
Copy link
Collaborator

k8s-bot commented Jan 14, 2016

Jenkins GCE e2e

Build/test passed for commit 985e846.

@timothysc
Copy link
Contributor

Awesome, thx @jimmidyson for the quick fix.

@k8s-bot
Copy link
Collaborator

k8s-bot commented Jan 14, 2016

Jenkins GCE e2e

Build/test passed for commit 09f948c.

@vishh
Copy link
Contributor

vishh commented Jan 14, 2016

@jimmidyson: Can you describe the issue you are trying to fix with this PR?

@@ -746,6 +746,9 @@ func (m *manager) registerCollectors(collectorConfigs map[string]string, cont *c

// Create a container.
func (m *manager) createContainer(containerName string) error {
m.containersLock.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that you moved out the creation of goroutines from the initialization of container handlers, is it necessary to guard the entire function with the lock?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The guard is really between createContainer & destroyContainer.

@timothysc
Copy link
Contributor

@vishh the linked issue ( kubernetes/kubernetes#19633) contains the complete details.

cadvisor can leak goroutines.

@jimmidyson
Copy link
Collaborator Author

@vishh Goroutines were leaked as detailed in kubernetes/kubernetes#19633. @timothysc found this was due to tracking filesystem usage. The prior version was starting tracking filesystem usage even if the container had already been seen - this was not being cleaned up. This PR ensures that the track usage goroutine is only started once.

// Check that the container didn't already exist.
_, ok := m.containers[namespacedName]
if ok {
return true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an alternative, if we were to invoke handler.Cleanup() here, would that also fix the issue?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handler.Cleanup() should only be called when a container is removed. Not sure why you would do that here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem IIUC is that of not invoking Cleanup() before letting go GC the container object.
I intended to say that if alreadyExists is true, then we can invoke Cleanup() right away.
But I prefer not creating the object if it already exists and invoking Cleanup() iif we return error instead of starting housekeeping.
That would let us not add the new Start method.

@k8s-bot
Copy link
Collaborator

k8s-bot commented Jan 14, 2016

Jenkins GCE e2e

Build/test passed for commit 4e9d29a.

@vishh
Copy link
Contributor

vishh commented Jan 14, 2016

@jimmidyson: That was quick :) LGTM. I don't like the additional Start and Cleanup to be frank. I will try to find alternative methods to avoid them. For now, let's merge this PR.

vishh added a commit that referenced this pull request Jan 14, 2016
@vishh vishh merged commit 290a844 into google:master Jan 14, 2016
@jimmidyson
Copy link
Collaborator Author

@vishh I agree this was just yet another quick fix... Really need to review cadvisor's structure so we can tidy things up like this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants