osd: add osd_fast_shutdown option (default true) #31677

liewegas · 2019-11-15T15:32:03Z

If we get a SIGINT or SIGTERM or are deleted from the OSDMap, do a fast
shutdown by exiting immediately. This has a few important benefits:

- We immediately stop responding (binding) to any sockets, which means
  other OSDs will immediately decide we are down (and dead!). This
  minimizes IO interruption.
- We avoid the complex "clean" shutdown process, which is historically a
  source of bugs.

In reality, the only purpose of the "clean" shutdown is to try to tear down
everything in memory so we can do memory leak checking with valgrind. Set
this option to false for valgrind QA runs so we can still do that.
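
The difference between the two shutdown paths can be illustrated with a small, self-contained Python sketch. This is not Ceph's code (the OSD is C++); `FAST_SHUTDOWN`, `clean_shutdown`, and `time_to_exit` are hypothetical names, and the 5-second sleep is an arbitrary stand-in for a slow teardown. It only shows the idea that a signal handler can either exit immediately or run a longer teardown, depending on a flag.

```python
import signal
import subprocess
import sys
import textwrap
import time

# Child process: a toy "daemon" whose signal handler honors a fast-shutdown flag.
CHILD = textwrap.dedent("""
    import os, signal, sys, time

    FAST_SHUTDOWN = True  # hypothetical stand-in for osd_fast_shutdown

    def clean_shutdown():
        time.sleep(5)  # assumption: pretend in-memory teardown is slow
        sys.exit(0)

    def on_signal(signum, frame):
        if FAST_SHUTDOWN:
            os._exit(0)  # exit immediately; the kernel closes our sockets
        clean_shutdown()

    signal.signal(signal.SIGTERM, on_signal)
    signal.signal(signal.SIGINT, on_signal)
    print("ready", flush=True)
    while True:
        time.sleep(1)
""")


def time_to_exit():
    """Start the toy daemon, send SIGTERM, and return (exit code, seconds to exit)."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD], stdout=subprocess.PIPE, text=True
    )
    proc.stdout.readline()  # wait until the handlers are installed
    start = time.monotonic()
    proc.send_signal(signal.SIGTERM)
    rc = proc.wait(timeout=10)
    return rc, time.monotonic() - start


if __name__ == "__main__":
    rc, elapsed = time_to_exit()
    print(f"exit code {rc}, exited after {elapsed:.3f}s")
```

With the flag set, the child skips the simulated teardown entirely; flipping `FAST_SHUTDOWN` to `False` in the child makes the same signal take the slow path, which is the behavior a leak-checking run would want.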

Note that with the new read leases in octopus, we rely on the default
behavior that an ECONNREFUSED is taken to mean that the OSD is fully dead,
so that we don't have to wait for any leases to time out. This works in
sane environments with normal IP networks, but that behavior could
conceivably be a bad idea if there are some weird network shenanigans
going on. If osd_fast_fail_on_connection_refused were disabled, then this
fast shutdown procedure might be worse than the clean shutdown because
we would have to wait for the heartbeat timeout.
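
The ECONNREFUSED signal this relies on can be reproduced with plain TCP sockets. The following is a minimal Python sketch under stated assumptions: it uses the loopback interface, `probe` is a hypothetical helper, and it does not model Ceph's messenger or heartbeat code. It only shows that once a listening socket is gone, a peer's connect attempt fails right away instead of waiting for a timeout.

```python
import socket


def probe(host, port, timeout=1.0):
    """Return 'up', 'refused', or 'timeout' for a single TCP connect attempt."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "up"
    except ConnectionRefusedError:
        # The host is reachable but nothing is listening on the port.
        return "refused"
    except socket.timeout:
        # No answer at all, e.g. packets silently dropped.
        return "timeout"


# A listener standing in for a running daemon.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
listener.listen(1)
port = listener.getsockname()[1]

before = probe("127.0.0.1", port)  # listener is alive
listener.close()                   # what process exit does implicitly
after = probe("127.0.0.1", port)   # nothing is listening any more

print(before, after)
```

The "timeout" branch corresponds to the unusual network case described above: if refusals never reach the peer, it has nothing to go on except a timeout.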

Signed-off-by: Sage Weil <sage@redhat.com>

liewegas · 2019-11-15T15:33:40Z

In my vstart testing this gives me a sub-second IO interruption when killing an OSD.

jdurgin · 2019-11-15T16:39:42Z

Seems like this would expose potential consistency bugs in the objectstore/rocksdb more frequently, so it would be good for master, but it should not be backported without plenty of testing.

gregsfortytwo · 2019-11-15T21:02:30Z

Hmm is that fast IO interruption going to scale okay to real clusters, since it's relying on the monitor OSD failure tracking?
The MarkMeDown message was historically important because it got the IO hiccup down to a normal OSDMap distribution, rather than being an event you could track just by looking at client IO latencies.

liewegas · 2019-11-15T21:34:47Z

> Hmm is that fast IO interruption going to scale okay to real clusters, since it's relying on the monitor OSD failure tracking?
> The MarkMeDown message was historically important because it got the IO hiccup down to a normal OSDMap distribution, rather than being an event you could track just by looking at client IO latencies.

That used to be true. These days (since luminous or kraken maybe?) when we get ECONNREFUSED we infer that the OSD process is really gone and mark it down immediately.

* refs/pull/31677/head:
  qa/standalone/ceph-helpers.sh: remove osd down check
  qa/standalone/ceph-helpers.sh: destroy_osd: mark osd down
  osd: add osd_fast_shutdown option (default true)

Reviewed-by: Sébastien Han <seb@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>

leseb · 2019-11-26T13:09:49Z

Are we going to backport this?

liewegas · 2019-11-26T13:27:53Z

Eventually, probably... but not in any hurry

liewegas added the core label Nov 15, 2019

liewegas requested review from jdurgin, leseb and neha-ojha November 15, 2019 15:32

jdurgin approved these changes Nov 15, 2019

liewegas added the wip-sage-testing label Nov 15, 2019

travisn mentioned this pull request Nov 15, 2019

Stop osd process more quickly during pod shutdown to reduce IO unresponsiveness rook/rook#4328

Merged

9 tasks

neha-ojha approved these changes Nov 15, 2019

leseb approved these changes Nov 15, 2019

qa/standalone/ceph-helpers.sh: destroy_osd: mark osd down

ede1d36

Stopping the OSD doesn't guarantee that it will be marked down.

Signed-off-by: Sage Weil <sage@redhat.com>

liewegas added wip-sage-testing and removed wip-sage-testing labels Nov 20, 2019

liewegas force-pushed the wip-fast-osd-down branch from 8d53cc9 to 30eb7dd Compare November 24, 2019 02:31

qa/standalone/ceph-helpers.sh: remove osd down check

3a62d16

A kill doesn't induce a mark-down of the OSD with osd_fast_shutdown=true.

Signed-off-by: Sage Weil <sage@redhat.com>

liewegas force-pushed the wip-fast-osd-down branch from 30eb7dd to 3a62d16 Compare November 24, 2019 18:44

liewegas merged commit 3a62d16 into ceph:master Nov 25, 2019

liewegas deleted the wip-fast-osd-down branch November 25, 2019 14:55
