New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rgw: fix get object instance returned NoSuchKey error #10820
Conversation
@yehudasa bug report: http://tracker.ceph.com/issues/17111 |
* @copied_obj is used to indicate whether @src_instance is valid. | ||
* */ | ||
string src_instance; | ||
bool copied_obj; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yanghonggang do we really need to keep copied_obj field? Isn't it that we should just check to see that !src_instance.empty()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yehudasa
obj a (instance="", src_instance="") ---copy----> obj b (version_id="abc', src_instance="") ---copy---> obj c
without copied_obj,we can't decide obj c's src_instance's value. should it be obj b's src_instance (obj a's version_id) or obj b's version_id? Here empty src_instance is valid which should be copied into dest object's manifest in this scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yanghonggang if src_instance is not empty, use it, otherwise continue as previously. Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The following example shows that we can not decide dest object's src_instance value
only by source object's src_instance. In the following exmaple, source object's empty src_instance still need to be copied into dest object's src_instance. All the buckets in this example share the same
data pool.
I want to copy object AA to generate object BB, and copy object BB to generate object CC, ...
copy chain:
AA --copy1--> BB --copy2--> CC --copy--> DD ...
Bucket A | Bucket B | Bucket C |
---|---|---|
versioning CLOSED | versioning ENABLED | versioning ENABLED |
Object AA -> | Object BB -> | Object CC |
instance = "" | instance = "abcd123" | instance = "xyz789" |
src_instance = "" | src_instance = "" | src_instance = "???abcd123" |
copy1:
Because object AA's src_instance is empty, use ""(object AA's instance) as object BB's src_instance.
It's OK.
copy2:
Because object BB's src_instance is empty, use "abcd123"(object BB's instance) as object CC's src_instance.
This will cause an ERROR. Because we should use "" as object CC's src_instance, not "abcd123".
So with 'coped_obj' we can decide if src object is generated by a copy operation.
If source object is not a copied object, we must copy its 'instance' to dest object's 'src_instance'.
If source object is a copied object, we can directly copy its 'src_instance' to dest object' 'src_instance'.
In the copy chain, only the first time copy need to copy source object's 'instance' to dest object's
'src_instance'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yanghonggang ok, I understand the issue now. Instead of calling the field "tail_instance", it should really be use_instance. By default it should match the instance field, unless it's copied and in that case it should match the copied object instance.
@yehudasa |
@yanghonggang looking good. Can you squash all the commits together? |
@yehudasa All commits are squashed. Total Test time (real) = 583.88 sec The following tests FAILED: I tried again, and get the following result: 97% tests passed, 4 tests failed out of 150 Total Test time (real) = 460.43 sec The following tests FAILED: |
When accessing a copied destination object, its source object's instance ID information is needed, however it's missing now in the destination object's manifest. In order to fix this problem, we can record source object's version_id/instance into dest object's manifest(a new filed 'tail_instance' is added). When creating a new object(not copy), 'tail_instance' should be equal to its instance value. When copy/get a object, 'tail_instance' should always be used to get the right tail objects. Fixes: http://tracker.ceph.com/issues/17111 Signed-off-by: Yang Honggang <joseph.yang@xtaotech.com>
@yanghonggang these test failures are unrelated |
When accessing the copied destination object, the source object's instance ID
information is needed, however it's missing now in the destination object’s
manifest. This will cause an IO error when get the copied object. In order to fix
this problem, we can record source object's version_id/instance into dest object's
manifest.
Signed-off-by: Yang Honggang joseph.yang@xtaotech.com