mimic: msg: ceph_abort() when there are enough accepter errors in msg server #25045

Merged
merged 1 commit into ceph:mimic on Nov 26, 2018

tchaikov commented Nov 11, 2018

msg: ceph_abort() when there are enough accepter errors in msg server
In some extreme cases (we have met one in our production cluster), when the Accepter thread breaks out of its accept loop, new clients cannot connect to the OSD. Because the existing heartbeat connections are already established, other OSDs cannot detect the failure and so never notify the monitor to mark the failed OSD down.
With this patch, when there are enough abnormal communication errors, we simply call ceph_abort() so that the OSD goes down quickly and other OSDs can notify the monitor to mark the failed OSD down.
Signed-off-by: penglaiyxy@gmail.com <penglaiyxy@gmail.com>

(cherry picked from commit 00e0ab4)

Conflicts:
	src/common/legacy_config_opts.h
	src/common/options.cc
	src/msg/async/AsyncMessenger.cc: trivial resolution
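
The idea in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the class `AcceptFailureTracker`, the function `on_accept_result`, and the choice to count consecutive failures are all assumptions made for the example, and `std::abort()` stands in for `ceph_abort()`.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper (not Ceph's actual code): counts consecutive
// accept() failures and reports when a configured limit is reached.
class AcceptFailureTracker {
public:
  explicit AcceptFailureTracker(unsigned max_failures)
      : max_failures_(max_failures) {
    assert(max_failures > 0);  // a limit of zero would abort immediately
  }

  // A successful accept() resets the run of failures.
  void record_success() { consecutive_failures_ = 0; }

  // Returns true once the run of consecutive failures reaches the limit.
  bool record_failure() {
    ++consecutive_failures_;
    return consecutive_failures_ >= max_failures_;
  }

  unsigned consecutive_failures() const { return consecutive_failures_; }

private:
  unsigned max_failures_;
  unsigned consecutive_failures_ = 0;
};

// How an accept loop might use the tracker: abort the whole process
// instead of silently leaving the loop, so peers see the daemon go down.
inline void on_accept_result(AcceptFailureTracker& tracker, bool ok) {
  if (ok) {
    tracker.record_success();
  } else if (tracker.record_failure()) {
    std::abort();  // stand-in for ceph_abort()
  }
}
```

Aborting the process closes the already-established heartbeat connections as well, which is what lets the other OSDs notice the failure and report it to the monitor.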

smithfarm requested a review from gregsfortytwo Nov 11, 2018

gregsfortytwo left a comment

LGTM

This comment has been minimized.

yuriw commented Nov 16, 2018

yuriw merged commit 8b9898d into ceph:mimic Nov 26, 2018

4 checks passed

Docs: build check: OK - docs built
Signed-off-by: all commits in this PR are signed
Unmodified Submodules: submodules for project are unmodified
make check: make check succeeded