osd: For ec pool recovery, only for can recoverable object. #4269
@liewegas I'm not sure what is_unfound means. I thought it means we lost the primary object, but from the code it seems to have some other meaning. Can you give some suggestions? Thanks!
SUCCESS: the output of run-make-check.sh on centos-7 for 9d229e8 is http://paste2.org/Nw80hyIX
@athanatos Can you review this? Thanks.
f92f71f to b5c7787
SUCCESS: the output of run-make-check.sh on centos-7 for 92f54d7 is http://paste2.org/G7dNOJV6
b5c7787 to 44a13f1
@liewegas Do you have time to review this?
44a13f1 to 87dc890
87dc890 to 49658a6
49658a6 to b69a62b
@majianpeng #4341 could help diagnose the problems. Would you have time to review it?
b69a62b to c81c2bd
@tchaikov For MissingLoc::is_unfound, if the PG is in a replicated pool, the function name may be OK. But for an EC pool, I think it causes misunderstanding. How about renaming is_unfound to is_unrecover?
@athanatos From the git log, this part was mainly added by you. Do you have time to review this? Thanks!
@majianpeng IMHO, @athanatos what do you think?
Looks good to me. Not sure if we need a test case in https://github.com/ceph/ceph-qa-suite/blob/master/tasks/repair_test.py? Maybe I can take this. osd-scrub-repair.sh would be too crowded for negative tests, I think.
9451306 to 4e4ba90
@majianpeng It would be ideal if you could add a test case for this change. Thanks.
4e4ba90 to f16bfec
@tchaikov I added a test case. Please review. Thanks!
In repair_object, if bad_peer is a replica, soid is not added to MissingLoc for an EC pool. If an EC pool has so many bad replicas that the object can't be recovered, the later recovery never ends. Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com> Signed-off-by: Kefu Chai <kchai@redhat.com>
If the object is unfound, return -EIO as soon as possible. Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
It is the same as MissingLoc::get_needs_recovery. Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
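The two behaviors described in the commit messages above (return -EIO early for an unfound object, and recover only objects that can be recovered) can be sketched roughly as follows. This is a simplified model, not the Ceph implementation: `MissingLocModel`, `try_read`, and `recovery_targets` are hypothetical names introduced for illustration, and object IDs are plain strings.

```cpp
#include <cerrno>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for MissingLoc: tracks objects with no usable source.
struct MissingLocModel {
  std::set<std::string> unfound;
  bool is_unfound(const std::string& oid) const {
    return unfound.count(oid) > 0;
  }
};

// Fail fast: an unfound object cannot be read, so report -EIO immediately.
inline int try_read(const MissingLocModel& ml, const std::string& oid) {
  return ml.is_unfound(oid) ? -EIO : 0;
}

// Select only the objects that can actually be recovered, so a recovery
// pass never keeps retrying an object that has no usable source.
inline std::vector<std::string> recovery_targets(
    const MissingLocModel& ml, const std::vector<std::string>& missing) {
  std::vector<std::string> out;
  for (const auto& oid : missing) {
    if (!ml.is_unfound(oid)) out.push_back(oid);
  }
  return out;
}
```

In this model, an object marked unfound is rejected by `try_read` and skipped by `recovery_targets`, so the recovery pass only ever sees work it can finish.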
f16bfec to 8f30db8
osd: For ec pool recovery, only for can recoverable object. Reviewed-by: Kefu Chai <kchai@redhat.com>
MissingLoc::is_found() doesn't work when the only errors are in replica objects on an EC
pool. For example, on an EC pool with k=3/m=1, if two replica objects hit
data digest errors, repair_object can't add ok_peers to
missing_loc and doesn't add this object to missing_object.
But MissingLoc::is_found() checks whether soid is in missing_object.
So in this situation, recovery never ends.
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
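
The k=3/m=1 example in the description comes down to simple arithmetic: an erasure-coded object has k+m shards and can be reconstructed only while at least k of them are intact. A minimal sketch of that check, assuming a hypothetical `EcProfile` struct and `is_recoverable` helper that are not part of Ceph:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of an erasure-code profile; not Ceph's actual type.
struct EcProfile {
  std::size_t k;  // number of data chunks
  std::size_t m;  // number of coding chunks
};

// An object can be reconstructed only if at least k of its k+m shards
// are intact.
inline bool is_recoverable(const EcProfile& p, std::size_t bad_shards) {
  assert(bad_shards <= p.k + p.m);  // cannot lose more shards than exist
  const std::size_t good = p.k + p.m - bad_shards;
  return good >= p.k;
}
```

With k=3/m=1, one bad shard leaves 3 good shards and the object is recoverable; two bad shards leave only 2, fewer than k, so recovery has nothing to rebuild from and must not be retried indefinitely.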