
mgr: Release GIL before calling OSDMap::calc_pg_upmaps() #31064

Merged: 3 commits merged into ceph:master from wip-42432 on Nov 7, 2019

Conversation

@dzafman (Contributor) commented Oct 22, 2019

Fixes: https://tracker.ceph.com/issues/42432

Signed-off-by: David Zafman <dzafman@redhat.com>

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug


@dzafman (Contributor, Author) commented Oct 22, 2019

@liewegas @xiexingguo @jdurgin I'm not sure whether I put the PyEval_SaveThread() in the best location, or whether there are ramifications to dropping the GIL there.

I was able to run ceph balancer status and ceph balancer show xxx while a test sleep was running in OSDMap::calc_pg_upmaps().

@jdurgin (Member) commented Oct 22, 2019

Hmm, looking closer I see one potential race: since the balancer module inserts the plan object in optimize()->plan_create(), a user could call execute() on a partially initialized plan (with the incremental in particular still being updated).

To fix this, we could add the plan to self.plans only after optimize() has finished initializing it.
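
A minimal sketch of that ordering (illustrative names only, not the actual module.py code):

    # Hedged sketch (not the real module.py): the plan becomes visible in
    # self.plans only after optimize() has fully initialized it.
    class BalancerSketch:
        def __init__(self):
            self.plans = {}

        def optimize(self, plan):
            # Stand-in for the long-running OSDMap::calc_pg_upmaps() work that
            # fills in the plan's incremental.
            plan['inc'] = ['pg 1.0 mapped to osd.3']
            return 0, 'done'

        def handle_optimize(self, name):
            plan = {'name': name}            # local only; execute() cannot see it yet
            r, detail = self.optimize(plan)  # may take a long time
            if r == 0:
                self.plans[name] = plan      # publish only the finished plan
            return r, detail

        def handle_execute(self, name):
            plan = self.plans.get(name)      # either absent or fully initialized
            if plan is None:
                return -1, 'plan %s not found' % name
            return 0, 'applying %d upmap entries' % len(plan['inc'])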

@dzafman (Contributor, Author) commented Oct 23, 2019

@jdurgin I fixed the issue with self.plans that you pointed out. In addition, the code now disallows the "ceph balancer optimize ...." command while the balancer is active, which prevents optimize from racing in two threads.

There is still a race:

  1. the active balancer is processing in calc_pg_upmaps()
  2. ceph balancer off
  3. ceph balancer optimize ....

I can fix this with a boolean "optimizing" that is set while the active balancer is optimizing. Then even if active = False, optimizing = True will still cause the balancer optimize command to fail.
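
A rough sketch of that guard (hypothetical names, not the actual module.py code):

    import errno

    # Hedged sketch: an "optimizing" flag that the background thread holds for
    # the duration of the calc_pg_upmaps() call, so a manual optimize is refused
    # even if the balancer was switched off mid-run.
    class BalancerGuardSketch:
        def __init__(self):
            self.active = False
            self.optimizing = False

        def background_cycle(self):
            # Runs in the active balancer's thread.
            self.optimizing = True
            try:
                self.run_calc_pg_upmaps()    # stand-in for the long C++ call
            finally:
                self.optimizing = False

        def handle_optimize_command(self, plan_name):
            if self.active or self.optimizing:
                return -errno.EINVAL, '', 'balancer is active or currently optimizing'
            return 0, '', 'plan %s created' % plan_name

        def run_calc_pg_upmaps(self):
            pass                             # placeholder for the real work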

There is a lock in the ceph-mgr for incoming ceph commands, so if a manual "ceph balancer optimize ..." takes a long time, other ceph balancer commands like status will hang as before. At least this prevents activating the balancer while a manual optimize is already running.

I believe that this is the relevant code:
    bool DaemonServer::handle_command(const ref_t<MMgrCommand>& m)
    {
      std::lock_guard l(lock);   // lock held while handling incoming ceph commands
      ...

@tchaikov (Contributor) left a comment

Two resolved review threads on src/pybind/mgr/balancer/module.py
Commits added to this pull request:

  • Prevent optimize and execute commands from running with active balancer
    Fixes: https://tracker.ceph.com/issues/42432
    Signed-off-by: David Zafman <dzafman@redhat.com>

  • Add balancer status fields so that slow optimizations can be detected
    Signed-off-by: David Zafman <dzafman@redhat.com>
@tserong (Contributor) left a comment

I'm not quite sure about this... @dzafman, are you able to confirm whether it's only mgr balancer commands that are affected, or whether all mgr commands block (try ceph osd status, for example)? If the latter is true, we're hitting a bigger problem (see https://tracker.ceph.com/issues/37514).

@mamahtehok commented

> I'm not quite sure about this... @dzafman, are you able to confirm if it's only mgr balancer commands that are affected, or do all mgr commands block (try ceph osd status for example). If the latter is true, we're hitting a bigger problem (see https://tracker.ceph.com/issues/37514)

Hi, in my case a long balancer task also blocked the ceph osd status command.

@dzafman (Contributor, Author) commented Oct 29, 2019

@tserong @mamahtehok This fix only deals with the active balancer background thread. In that case a slow OSDMap::calc_pg_upmaps() will NOT block other commands. So this pull request would fix using ceph osd status with an active balancer.

If a user runs the ceph balancer optimize command manually, it will still block other commands if it is slow, per https://tracker.ceph.com/issues/37514.

@tserong (Contributor) commented Oct 30, 2019

Thanks for the explanation. As for the placement of PyEval_SaveThread() / PyEval_RestoreThread(), I think what you have makes sense; it drops the GIL for a potentially long-running operation that doesn't need access to Python state while it's running, then reacquires it later. The only possible catch I can think of is that if some other balancer command were invoked while calc_pg_upmaps was running (without the GIL), the PyEval_RestoreThread() would block until that command finished, so this change effectively means interactive tasks take precedence over background tasks.

@dzafman requested a review from @tchaikov on November 1, 2019
liewegas added a commit that referenced this pull request Nov 7, 2019
* refs/pull/31064/head:
	test: Test balancer module commands
	mgr: Improve balancer module status
	mgr: Release GIL before calling OSDMap::calc_pg_upmaps()

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Kefu Chai <kchai@redhat.com>
@liewegas merged commit 3a0e2c8 into ceph:master on Nov 7, 2019
@dzafman deleted the wip-42432 branch on November 8, 2019