allocator: Fix events watch leak #2215

Merged
merged 1 commit into from
Jun 8, 2017
Merged

Conversation

aaronlehmann
Collaborator

The allocator starts a watch on events and never shuts it down. If this node loses leadership, events will continue to pile up in that queue forever, leading to memory exhaustion.

Fix this by giving the watch a well-defined lifecycle. Return it along with a cancel function from an init() method to make it clear it must be cancelled, instead of sticking the cancel function value in a struct where there is no clear responsibility for calling it.
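As an illustration of the lifecycle described above, here is a minimal sketch of the pattern, not the actual patch. The `queue`, `actor`, `watch`, and `watcherCount` names are hypothetical stand-ins and are not swarmkit's API; the only thing taken from the description is that `init()` returns the watch together with its cancel function, so the caller that receives them is the one responsible for cancelling.

```go
package main

import (
	"fmt"
	"sync"
)

// queue is a hypothetical stand-in for the event queue (not swarmkit's API).
type queue struct {
	mu       sync.Mutex
	watchers map[chan string]struct{}
}

func newQueue() *queue {
	return &queue{watchers: make(map[chan string]struct{})}
}

// watch registers a subscriber and returns its channel with a cancel function.
func (q *queue) watch() (<-chan string, func()) {
	ch := make(chan string, 16)
	q.mu.Lock()
	q.watchers[ch] = struct{}{}
	q.mu.Unlock()

	// cancel is safe to call more than once.
	cancel := func() {
		q.mu.Lock()
		defer q.mu.Unlock()
		if _, ok := q.watchers[ch]; ok {
			delete(q.watchers, ch)
			close(ch)
		}
	}
	return ch, cancel
}

// watcherCount reports how many watches are still registered.
func (q *queue) watcherCount() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.watchers)
}

// actor is a hypothetical consumer standing in for the allocator.
type actor struct{ q *queue }

// init starts the watch and returns it together with its cancel function,
// so the caller visibly owns the cleanup.
func (a *actor) init() (<-chan string, func()) {
	return a.q.watch()
}

// run owns the watch for its whole lifetime and always cancels it on return.
func (a *actor) run(stop <-chan struct{}) {
	events, cancel := a.init()
	defer cancel()
	for {
		select {
		case _, ok := <-events:
			if !ok {
				return
			}
			// handle the event here
		case <-stop:
			return // e.g. this node lost leadership
		}
	}
}

func main() {
	q := newQueue()
	a := &actor{q: q}
	stop := make(chan struct{})
	done := make(chan struct{})
	go func() {
		a.run(stop)
		close(done)
	}()
	close(stop)
	<-done
	fmt.Println("watches still registered:", q.watcherCount())
}
```

Because `run` defers `cancel()` immediately after `init()`, no watch stays registered once `run` returns, which is the property the leak was missing.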

In the future, we might consider changing Watch to run a callback function, so shutting down the watch is mandatory (similar to what we do with store transactions).
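For the callback idea, a rough sketch of what such a shape could look like follows. `bus` and `watchFunc` are hypothetical names invented for this example and do not exist in swarmkit; the sketch only shows that when the watch is scoped to a callback, teardown happens when the callback returns and cannot be forgotten by the caller.

```go
package main

import (
	"errors"
	"fmt"
)

// bus is a hypothetical event source (not swarmkit's API).
type bus struct {
	active  int      // number of watches currently open
	pending []string // events delivered by this sketch
}

// watchFunc opens a watch, passes it to cb, and tears it down when cb returns,
// whether cb succeeded or failed. The caller never holds a cancel function.
func (b *bus) watchFunc(cb func(events <-chan string) error) error {
	ch := make(chan string, len(b.pending))
	for _, ev := range b.pending {
		ch <- ev
	}
	close(ch) // the sketch delivers a fixed batch, then ends the stream

	b.active++
	defer func() { b.active-- }()
	return cb(ch)
}

func main() {
	b := &bus{pending: []string{"task-created", "node-updated"}}
	handled := 0
	err := b.watchFunc(func(events <-chan string) error {
		for range events {
			handled++
		}
		return errors.New("lost leadership")
	})
	fmt.Println("handled:", handled, "err:", err, "open watches:", b.active)
}
```

The trade-off is that the consumer's loop has to live inside the callback, which may or may not fit how the allocator is structured.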

cc @briantd @jakegdocker

Signed-off-by: Aaron Lehmann <aaron.lehmann@docker.com>
@codecov

codecov bot commented Jun 5, 2017

Codecov Report

Merging #2215 into master will increase coverage by 0.01%.
The diff coverage is 88.23%.

@@            Coverage Diff             @@
##           master    #2215      +/-   ##
==========================================
+ Coverage   60.03%   60.05%   +0.01%     
==========================================
  Files         124      124              
  Lines       20115    20122       +7     
==========================================
+ Hits        12077    12085       +8     
+ Misses       6680     6674       -6     
- Partials     1358     1363       +5

@rogaha

rogaha commented Jun 5, 2017

👍 great catch @briantd! Thanks for working on a patch for it @aaronlehmann!

@andrewhsu andrewhsu mentioned this pull request Jun 6, 2017
40 tasks
@ijc
Contributor

ijc commented Jun 7, 2017

LGTM.

@cyli
Contributor

cyli commented Jun 8, 2017

LGTM, with the caveat that I don't know the allocator code very well :) So please excuse the dumb question: this looks like it's creating a watch per allocActor? Granted, there is only ever one right now, so it doesn't matter at the moment, but I wasn't sure why there's a for loop. Is the intention that there will be more in the future? If so, would we then have to condense the watches again so that there is only one no matter how many allocActors there are?

@aaronlehmann
Collaborator Author

aaronlehmann commented Jun 8, 2017 via email

@cyli
Contributor

cyli commented Jun 8, 2017

Ah ok, so it's not like a pool of workers who grabs the first event, it's more of a fanout to all actors?
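To make the distinction in this question concrete, here is a small sketch of the fanout reading, where every subscriber gets its own channel and sees every event, as opposed to a worker pool where each event goes to whichever worker reads first. `hub`, `subscribe`, `publish`, and `shutdown` are hypothetical names for this example only and say nothing about how the allocator is actually wired.

```go
package main

import "fmt"

// hub is a hypothetical fanout point: one channel per subscriber.
type hub struct{ subs []chan string }

// subscribe gives the caller a private channel that receives every event.
func (h *hub) subscribe() <-chan string {
	ch := make(chan string, 8)
	h.subs = append(h.subs, ch)
	return ch
}

// publish copies the event to every subscriber (fanout), rather than
// letting subscribers compete for it on one shared channel.
// It blocks if a subscriber's buffer is full; fine for this sketch.
func (h *hub) publish(ev string) {
	for _, ch := range h.subs {
		ch <- ev
	}
}

// shutdown closes every subscriber channel so readers can finish.
func (h *hub) shutdown() {
	for _, ch := range h.subs {
		close(ch)
	}
	h.subs = nil
}

// drain reads a closed-or-closing channel to the end.
func drain(ch <-chan string) []string {
	var got []string
	for ev := range ch {
		got = append(got, ev)
	}
	return got
}

func main() {
	h := &hub{}
	a := h.subscribe()
	b := h.subscribe()
	h.publish("task-created")
	h.publish("node-updated")
	h.shutdown()
	fmt.Println("actor A saw:", drain(a))
	fmt.Println("actor B saw:", drain(b))
}
```

With a single channel shared by both readers, each event would reach only one of them, which is the worker-pool behaviour the question contrasts against.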

@aaronlehmann
Collaborator Author

aaronlehmann commented Jun 8, 2017 via email

@cyli
Contributor

cyli commented Jun 8, 2017

Ok, that makes sense, thanks for explaining! 👍

@dongluochen
Contributor

LGTM

@ijc
Contributor

ijc commented Jun 8, 2017

It didn't make sense to me that the watch used to be shared between all actors.

This was something I tripped over in one of my PoC attempts to refactor some of this stuff for CNI networking on top of #1965. I can confirm that it does not work well at all...

@cyli cyli merged commit ba9fec7 into moby:master Jun 8, 2017
@aaronlehmann aaronlehmann deleted the watch-leak branch June 12, 2017 23:57
silvin-lubecki pushed a commit to silvin-lubecki/docker-ce that referenced this pull request Feb 3, 2020
silvin-lubecki pushed a commit to silvin-lubecki/engine-extract that referenced this pull request Feb 3, 2020
silvin-lubecki pushed a commit to silvin-lubecki/engine-extract that referenced this pull request Mar 10, 2020
silvin-lubecki pushed a commit to silvin-lubecki/engine-extract that referenced this pull request Mar 23, 2020