Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

osd: For ec pool recovery, only for can recoverable object. #4269

Merged
merged 4 commits into from Apr 22, 2015

Conversation

majianpeng
Copy link
Member

MissingLoc::is_found() don't work for only replicate object error on ec
pool. For example, ec pool with k=3/m=1, if two replicate objects met
data digest error, in repair_object, it can't add ok_peers is
missing_loc and don't add this object into missing_object.
But MissingLoc::is_found() check soid whether in missing_object.
So for this situation, the recovery will endless.

Signed-off-by: Jianpeng Ma jianpeng.ma@intel.com

@majianpeng
Copy link
Member Author

@liewegas . I'm, not sure is_unfound what's mean. It mean we lost primary object. But from the code, it has some other mean? Can you give some suggestion? Thanks!

@loic-bot
Copy link

loic-bot commented Apr 3, 2015

SUCCESS: the output of run-make-check.sh on centos-7 for 9d229e8 is http://paste2.org/Nw80hyIX

:octocat: Sent from GH.

@majianpeng
Copy link
Member Author

@athanatos . Can you review this? Thanks

@loic-bot
Copy link

loic-bot commented Apr 7, 2015

SUCCESS: the output of run-make-check.sh on centos-7 for 92f54d7 is http://paste2.org/G7dNOJV6

:octocat: Sent from GH.

@majianpeng
Copy link
Member Author

@liewegas . Have you time review this?

@tchaikov
Copy link
Contributor

  • as to the EC pool fix, it looks good. but might want to avoid the unnecessary code path for replicated pool.
  • for the early return optimisation, we should return immediately in that case. and my concern is "is it worthwhile for a negative case?".

@loic-bot
Copy link

@loic-bot
Copy link

@loic-bot
Copy link

@ghost
Copy link

ghost commented Apr 13, 2015

@majianpeng #4341 could help diagnose the problems, would you have time to review it ?

@majianpeng
Copy link
Member Author

@tchaikov . For MissingLoc::is_unfound, if pg is replicated pool, the func name maybe ok. But for ec pool, i think it cause misunderstand. How about rename is_unfound to is_unrecover?

@majianpeng
Copy link
Member Author

@athanatos . From the git log, this part mainly added by you. Have you time to review this? Thanks!

@tchaikov
Copy link
Contributor

For MissingLoc::is_unfound, if pg is replicated pool, the func name maybe ok. But for ec pool, i think it cause misunderstand. How about rename is_unfound to is_unrecover?

@majianpeng IMHO, is_lost() would be better. as we have already a member variable named is_recoverable. and we should also rename MissingLoc::num_unfound() also ?

@athanatos what do you think ?

@tchaikov
Copy link
Contributor

looks good to me.

not sure if we need a test case in https://github.com/ceph/ceph-qa-suite/blob/master/tasks/repair_test.py ? maybe i can take this. osd-scrub-repair.sh would be too crowded for negative tests, i think.

@tchaikov tchaikov removed the needs-qa label Apr 17, 2015
@tchaikov
Copy link
Contributor

@majianpeng it would be ideal if you could add a test case for this change? thanks.

@majianpeng
Copy link
Member Author

@tchaikov . add test-case & please review . Thanks!

In repair_object, if bad_peer is replica, it don't add soid in
MissingLoc for ec pool. If there are more bad replica for ec pool
which cause object can't recover, the later recoverying will endless.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Kefu Chai <kchai@redhat.com>
If object unfound, asap return -EIO.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
It is the same as MissingLoc::get_needs_recovery.

Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
tchaikov added a commit that referenced this pull request Apr 22, 2015
osd: For ec pool recovery, only for can recoverable object.

Reviewed-by: Kefu Chai <kchai@redhat.com>
@tchaikov tchaikov merged commit 927f0f9 into ceph:master Apr 22, 2015
@majianpeng majianpeng deleted the add-recoverable branch April 22, 2015 02:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants