Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PG: primary should not be in the peer_info, skip if it is #20189

Merged
merged 1 commit into from Feb 2, 2018

Conversation

neha-ojha
Copy link
Member

Skip the primary when iterating over peer_info to avoid adding the primary twice.

Signed-off-by: Neha Ojha nojha@redhat.com

@dzafman
Copy link
Contributor

dzafman commented Jan 30, 2018

@neha-ojha I'm enhancing this function right now. Actually, the primary is never in peer_info, so I'm not clear this is needed.

Did you see this happen or was this from code inspection?

@neha-ojha
Copy link
Member Author

@dzafman I hit this assert (target >= missing_target_objects.size()) http://qa-proxy.ceph.com/teuthology/nojha-2018-01-29_22:11:59-rados-wip-async-recovery-2018-01-29---basic-smithi/2124263/teuthology.log

Adding some debug logs showed the following:

2018-01-29 22:25:12.812 7fb3b160a700 10 osd.2 pg_epoch: 27 pg[2.e( v 27'1271 lc 17'2 (0'0,27'1271] local-lis/les=26/27 n=1267 ec=12/12 lis/c 26/16 les/c/f 27/17/0 26/26/26) [2,0] r=0 lpr=26 pi=[12,26)/2 luod=27'1270 crt=27'1271 mlcod 17'2 active+recovery_unfound+degraded m=2 u=2] _update_calc_stats target = 2

2018-01-29 22:25:12.812 7fb3b160a700 10 osd.2 pg_epoch: 27 pg[2.e( v 27'1271 lc 17'2 (0'0,27'1271] local-lis/les=26/27 n=1267 ec=12/12 lis/c 26/16 les/c/f 27/17/0 26/26/26) [2,0] r=0 lpr=26 pi=[12,26)/2 luod=27'1270 crt=27'1271 mlcod 17'2 active+recovery_unfound+degraded m=2 u=2] _update_calc_stats missing_target_objects size = 3

2018-01-29 22:25:12.812 7fb3b160a700 10 osd.2 pg_epoch: 27 pg[2.e( v 27'1271 lc 17'2 (0'0,27'1271] local-lis/les=26/27 n=1267 ec=12/12 lis/c 26/16 les/c/f 27/17/0 26/26/26) [2,0] r=0 lpr=26 pi=[12,26)/2 luod=27'1270 crt=27'1271 mlcod 17'2 active+recovery_unfound+degraded m=2 u=2] _update_calc_stats missing_target_objects = 0,2,2,0,2,2

2 being the primary here was getting added twice.

Copy link
Contributor

@dzafman dzafman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a comment that this shouldn't be needed because the primary should be in the peer_info.

@dzafman
Copy link
Contributor

dzafman commented Jan 30, 2018

@neha-ojha Also, change the commit comment to reflect that the primary isn't supposed to be in peer_info.

Signed-off-by: Neha Ojha <nojha@redhat.com>
@neha-ojha neha-ojha changed the title PG: do not add repeated entry in missing_target_objects PG: primary should not be in the peer_info, skip if it is Jan 30, 2018
@jdurgin
Copy link
Member

jdurgin commented Jan 31, 2018

@dzafman we've got a bunch of loops over peer_* data structures that explicitly skip the primary - it does seem like a bad pattern, but they seem to be preventing bugs for now.

@dzafman
Copy link
Contributor

dzafman commented Jan 31, 2018

@jdurgin Yes, I talked with @neha-ojha about it. I created a tracker to figure out later why primary is ending up in peer_info. http://tracker.ceph.com/issues/22834

@tchaikov tchaikov merged commit 85fe7ef into ceph:master Feb 2, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants