mon, osd: fix potential collided *Up Set* after PG remapping #20653

Merged
merged 1 commit into from Mar 2, 2018


xiexingguo
Member

@xiexingguo xiexingguo commented Mar 1, 2018

The mgr balancer module computes its optimizations from snapshots of
the OSDMap taken at certain moments. This turns out to be a cause of
data loss, because in upmap mode a stale snapshot can sometimes
produce bad PG mappings.
For example:

  1. original cluster topology:
-5       2.00000     host host-a
 0   ssd 1.00000         osd.0       up  1.00000 1.00000
 1   ssd 1.00000         osd.1       up  1.00000 1.00000
-7       2.00000     host host-b
 2   ssd 1.00000         osd.2       up  1.00000 1.00000
 3   ssd 1.00000         osd.3       up  1.00000 1.00000
-9       2.00000     host host-c
 4   ssd 1.00000         osd.4       up  1.00000 1.00000
 5   ssd 1.00000         osd.5       up  1.00000 1.00000
  2. mgr balancer applies optimization for PG 3.f:
            pg-upmap-items[3.f : 1->4]
3.f [1 3] + -------------------------> [4 3]
  3. osd.3 is marked out, reweighted, etc., so the original CRUSH mapping
    of 3.f changes (while pg-upmap-items does not):
            pg-upmap-items[3.f : 1->4]
3.f [1 5] + -------------------------> [4 5]
  4. PG 3.f is now mapped to two OSDs (osd.4 and osd.5) on the same host
    (host-c).
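
The steps above can be reproduced with a small toy model. This is only an illustrative sketch, not Ceph's OSDMap code: `OsdSet`, `UpmapItems`, `host_of`, `apply_upmap_items`, and `has_host_collision` are hypothetical names, and the OSD-to-host rule simply mirrors the topology from step 1.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Toy model (assumption): OSD ids are ints, and a pg-upmap-items entry
// is a list of (from, to) pairs.
using OsdSet = std::vector<int>;
using UpmapItems = std::vector<std::pair<int, int>>;

// Hypothetical helper matching the example topology:
// osd.0/1 -> host-a (0), osd.2/3 -> host-b (1), osd.4/5 -> host-c (2).
int host_of(int osd) { return osd / 2; }

// Replace each `from` OSD found in the raw CRUSH result with `to`,
// skipping a pair when `to` is already part of the set.
OsdSet apply_upmap_items(OsdSet raw, const UpmapItems& items) {
  for (const auto& item : items) {
    if (std::find(raw.begin(), raw.end(), item.second) != raw.end())
      continue;
    std::replace(raw.begin(), raw.end(), item.first, item.second);
  }
  return raw;
}

// True if two OSDs in the set live on the same host (failure domain).
bool has_host_collision(const OsdSet& up) {
  for (std::size_t i = 0; i < up.size(); ++i)
    for (std::size_t j = i + 1; j < up.size(); ++j)
      if (host_of(up[i]) == host_of(up[j]))
        return true;
  return false;
}
```

With the entry `1->4`, the raw mapping `[1 3]` becomes `[4 3]` (hosts c and b, no collision), while the later raw mapping `[1 5]` becomes `[4 5]`, where both OSDs sit on host-c.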

Fix the above problem by adding a guard step that runs before these
potentially unsafe upmap remappings are finally encoded into the OSDMap.
If any of them turns out to be inappropriate, we can simply cancel it,
since the balancer can still recalculate and regenerate it later if necessary.
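
A guard of this kind might look roughly like the sketch below. It is a simplified stand-in under stated assumptions, not the code in this PR: `cancel_colliding_upmaps`, `UpmapTable`, and the `raw_mapping` and `host_of` callbacks are hypothetical, and a real check would consult the CRUSH rule's actual failure domain rather than a fixed host lookup.

```cpp
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using OsdSet = std::vector<int>;
using UpmapItems = std::vector<std::pair<int, int>>;
using UpmapTable = std::map<std::string, UpmapItems>;  // pgid -> items

// Drop every pg-upmap-items entry whose resulting up set would place two
// OSDs on the same host. `raw_mapping` (current CRUSH result for a PG)
// and `host_of` (OSD -> host id) are assumed callbacks for illustration.
// Returns the pgids whose entries were cancelled.
std::vector<std::string> cancel_colliding_upmaps(
    UpmapTable& table,
    const std::function<OsdSet(const std::string&)>& raw_mapping,
    const std::function<int(int)>& host_of) {
  std::vector<std::string> cancelled;
  for (auto it = table.begin(); it != table.end();) {
    // Re-evaluate the entry against the *current* raw mapping.
    OsdSet up = raw_mapping(it->first);
    for (const auto& item : it->second)
      for (auto& osd : up)
        if (osd == item.first) osd = item.second;

    // A host seen twice means the failure domain is violated.
    std::set<int> hosts;
    bool collided = false;
    for (int osd : up)
      if (!hosts.insert(host_of(osd)).second) collided = true;

    if (collided) {
      cancelled.push_back(it->first);
      it = table.erase(it);  // cancel; the balancer can regenerate later
    } else {
      ++it;
    }
  }
  return cancelled;
}
```

Cancelling rather than repairing an entry keeps the guard simple: the PG falls back to its plain CRUSH mapping, which respects the failure domain by construction.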

Fixes: http://tracker.ceph.com/issues/23118
Signed-off-by: xie xingguo <xie.xingguo@zte.com.cn>

@liewegas
Member

liewegas commented Mar 1, 2018

A unit test for this would be great: install an invalid mapping and assert that it removes it. Probably in src/test/osd/TestOSDMap.cc?

@liewegas
Member

liewegas commented Mar 1, 2018

Please add Fixes: http://tracker.ceph.com/issues/23118 to the commit message?

@xiexingguo xiexingguo force-pushed the wip-fix-upmap branch 2 times, most recently from 6a0f993 to 9c9da3d Compare March 1, 2018 11:21
@xiexingguo
Member Author

> A unit test for this would be great: install an invalid mapping and assert that it removes it. Probably in src/test/osd/TestOSDMap.cc?

@liewegas All addressed.

@yuriw
Contributor

yuriw commented Mar 1, 2018

@tchaikov tchaikov merged commit 1b5c937 into ceph:master Mar 2, 2018
@xiexingguo xiexingguo deleted the wip-fix-upmap branch March 2, 2018 01:11