mon: mark mgr reports as no_reply #21057
it should address the failures in http://pulpito.ceph.com/kchai-2018-03-27_08:29:32-rados-wip-kefu-testing-2018-03-27-1407-distro-basic-smithi/
see also: ceph#20517
Fixes: http://tracker.ceph.com/issues/22114
Signed-off-by: Kefu Chai <kchai@redhat.com>
@jecluis @gregsfortytwo ping?
lgtm
@tchaikov gotcha. thanks
@theanalyst this needs to be backported; see #21016
Strangely enough, we still have SLOW_OPS in http://pulpito.ceph.com/kchai-2018-03-29_13:20:02-rados-wip-slow-mon-ops-kefu-distro-basic-smithi/2334154/ . Tracked at http://tracker.ceph.com/issues/23511
Given that we keep adding new "no_reply()" markings, maybe we should mark them that way on construction rather than when the monitor has to handle them? Is that possible?
@gregsfortytwo i am not sure if that's possible or a better way. as i see it, no_reply() is a decision made by the upper layer of the application stack, not by the messenger, where the messages are decoded. or probably i misread you completely.
If @gregsfortytwo was suggesting having the message marked at construction by the sender, then this could possibly be feasible, but I think the monitors have so many case-by-case decisions about whether to drop/no_reply or handle a message that this may be even trickier than addressing individual instances. It would be nice if we knew exactly which messages are not expecting a reply, so that we could simply mark them as such in their ctor or something. But I'm guessing those would be a small subset of the messages we actually handle as no_reply() [he said, without actually looking at the code]. On the other hand, if we were to identify those that are typically no_reply, mark them as such by default, and only reply to them in selected cases... then that may help with things a little. However, this doesn't seem a trivial task to accomplish either.
Yes, I meant what Joao says: the peon monitor could mark them as no-reply, that gets flagged (in the MRoute or similar) and the leader's PaxosService dispatch machinery can be responsible for sending back the blank "handled" message to the peon once the message is resolved. This might be useless, in that it merely moves responsibility for noticing it from the recipient to the sender. Or it might be easier, since we can identify categories of messages that don't need a response and set it all up in their constructors as an obvious decision the author needs to make, instead of something to maintain in far-off code without obvious failures. (So far, they are all static: MMonMgrReport, MOSDFailure, MOSDPGCreated, MOSDBeacon, and MMgrBeacon.) (Or maybe now that we are noticing slow mon ops we don't care any more.)
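To make the "decide it in the constructor" idea concrete, here is a minimal, self-contained sketch. It is not Ceph code: `SketchMessage`, `SketchMgrReport`, `SketchCommand`, and `SketchDispatcher` are hypothetical stand-ins, and the `pending`/`acked` lists only imitate the bookkeeping a real dispatch path would do. The assumption is simply that each message type fixes its reply expectation at construction and a generic dispatcher reads that flag.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class: every message type must state in its
// constructor whether the sender expects a reply.
struct SketchMessage {
  SketchMessage(std::string n, bool reply)
      : name(std::move(n)), expects_reply(reply) {}
  virtual ~SketchMessage() = default;
  std::string name;
  bool expects_reply;
};

// Report-style message (hypothetical): never expects a reply.
struct SketchMgrReport : SketchMessage {
  SketchMgrReport() : SketchMessage("mgr_report", false) {}
};

// Command-style message (hypothetical): always expects a reply.
struct SketchCommand : SketchMessage {
  SketchCommand() : SketchMessage("command", true) {}
};

// Hypothetical dispatcher: no-reply messages are closed out right away
// (standing in for the blank "handled" message), everything else stays
// pending until some handler replies.
struct SketchDispatcher {
  std::vector<std::string> pending;
  std::vector<std::string> acked;

  void dispatch(const SketchMessage& m) {
    assert(!m.name.empty());
    if (m.expects_reply) {
      pending.push_back(m.name);
    } else {
      acked.push_back(m.name);
    }
  }
};
```

Dispatching a `SketchMgrReport` lands in `acked` without any per-handler marking, while a `SketchCommand` stays in `pending`. Whether this is actually simpler than marking individual handlers depends on how many message types need case-by-case treatment, which is the open question above.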