Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hammer: mon: fix memory leak in prepare_beacon #10238

Merged
merged 1 commit into from Nov 23, 2016
Merged

Conversation

aiicore
Copy link
Contributor

@aiicore aiicore commented Jul 11, 2016

http://tracker.ceph.com/issues/17285

prepare_beacon() case of prepare_update() should put()
message in two more cases, because is the last step
of dispatch()

Signed-off-by: Igor Podoski igor.podoski@ts.fujitsu.com

@@ -501,6 +502,7 @@ bool MDSMonitor::prepare_beacon(MMDSBeacon *m)
dout(0) << "got beacon for MDS in STATE_STOPPING, ignoring requested state change"
<< dendl;
_note_beacon(m);
m->put();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this put should be here when return 'true', because we can go into propose_pending() from dispatch() in mon/PaxosService.cc. I'm not so familiar with MDS monitor code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prepare_update() is responsible to free the message before returns. so this change is correct.

@aiicore
Copy link
Contributor Author

aiicore commented Jul 11, 2016

Memory leaks every time when you see in monitor log:
mon.0@0(leader).mds e1 warning, MDS mds.? XXX.XXX.XXX.XXX:6800/1494583 up but filesystem disabled
Is not very noticeable, however when cluster is in unstable state, e.g. network cards are dropping random packets, monitors tend to fall into election loop. When two monitors are marking themselves as leaders for a while and then start election from the beginning, MDS messages are forwarded from one to another (to leader). After a while amount of MDS messages is quite big and when they hit monitor which is leader (at that point of time) memory leak is very visible and it leads to OOM kill.

@aiicore aiicore changed the title mon/MDSMonitor: fix memory leak in prepare_beacon hammer: mon/MDSMonitor: fix memory leak in prepare_beacon Jul 11, 2016
@liewegas liewegas added bug-fix cephfs Ceph File System labels Jul 11, 2016
@tchaikov tchaikov added this to the hammer milestone Jul 15, 2016
@smithfarm
Copy link
Contributor

Adding DNM just to keep this very new PR out of the next hammer integration testing round. Will remove it afterwards.

@smithfarm smithfarm changed the title hammer: mon/MDSMonitor: fix memory leak in prepare_beacon [DNM] hammer: mon/MDSMonitor: fix memory leak in prepare_beacon Jul 16, 2016
@smithfarm
Copy link
Contributor

@aiicore Is this bug present in infernalis, jewel? Do you know which commit fixed it? When backporting, we prefer to cherry-pick existing fixes.

@aiicore
Copy link
Contributor Author

aiicore commented Sep 15, 2016

@smithfarm This code lasts until v10.0.5. There was big MDSMonitor reorganization in 4e9b953, and above code was removed.

@smithfarm smithfarm changed the title [DNM] hammer: mon/MDSMonitor: fix memory leak in prepare_beacon hammer: mon/MDSMonitor: fix memory leak in prepare_beacon Sep 15, 2016
@smithfarm
Copy link
Contributor

@aiicore OK - is there an existing tracker issue? Or can you make a new one?

@aiicore
Copy link
Contributor Author

aiicore commented Sep 15, 2016

@smithfarm I think there is no related tracker issue. I'll make one soon.

@aiicore
Copy link
Contributor Author

aiicore commented Sep 16, 2016

@smithfarm I've created bug report http://tracker.ceph.com/issues/17285 and assigned to myself, but I can't edit it anymore (wanted to change it to FIX and clear a typo). Am I missing permissions?

@smithfarm
Copy link
Contributor

@aiicore Don't know about your tracker permissions, but I fixed up the issue so it will work with our backporting scripts and saved queries.

@aiicore
Copy link
Contributor Author

aiicore commented Sep 16, 2016

@smithfarm Thanks!

@smithfarm
Copy link
Contributor

@aiicore Could you please rebase.

@smithfarm smithfarm self-assigned this Nov 11, 2016
@smithfarm
Copy link
Contributor

@aiicore Also, please add a Fixes: http://tracker.ceph.com/issues/17285 line right above Signed-off-by thanks.

prepare_beacon() case of prepare_update() should put()
message in two more cases, because is the last step
of dispatch()

Fixes: http://tracker.ceph.com/issues/17285
Signed-off-by: Igor Podoski <igor.podoski@ts.fujitsu.com>
@aiicore
Copy link
Contributor Author

aiicore commented Nov 17, 2016

@smithfarm rebased + additional line in commit

@smithfarm smithfarm changed the title hammer: mon/MDSMonitor: fix memory leak in prepare_beacon hammer: mon: fix memory leak in prepare_beacon Nov 18, 2016
smithfarm added a commit that referenced this pull request Nov 18, 2016
…acon

Reviewed-by: Nathan Cutler <ncutler@suse.com>
smithfarm added a commit that referenced this pull request Nov 20, 2016
…acon

Reviewed-by: Nathan Cutler <ncutler@suse.com>
@smithfarm
Copy link
Contributor

@jcsp @tchaikov This hammer-only fix passed a rados suite at http://tracker.ceph.com/issues/17151#note-7 with failures that I believe have been addressed (except for http://tracker.ceph.com/issues/15139 which is caused by the build system no longer providing dumpling-era packages).

I have rebuilt the integration branch to include the two fixes and scheduled a new run at http://tracker.ceph.com/issues/17151#note-14

Do you think it's OK to merge provided the second run succeeds?

@tchaikov tchaikov self-assigned this Nov 20, 2016
smithfarm added a commit that referenced this pull request Nov 21, 2016
…acon

Reviewed-by: Nathan Cutler <ncutler@suse.com>
smithfarm added a commit that referenced this pull request Nov 21, 2016
…acon

Reviewed-by: Nathan Cutler <ncutler@suse.com>
@smithfarm
Copy link
Contributor

Copy link
Contributor

@tchaikov tchaikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@tchaikov
Copy link
Contributor

lgtm.

@tchaikov tchaikov removed their assignment Nov 22, 2016
@jcsp
Copy link
Contributor

jcsp commented Nov 23, 2016

Looks sensible to me

@smithfarm smithfarm merged commit 9655228 into ceph:hammer Nov 23, 2016
@aiicore aiicore deleted the hammer branch November 24, 2016 06:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug-fix cephfs Ceph File System
Projects
None yet
5 participants