rados-Backfill/Recovery: rados object that data size is 0 and this object have a large amount of omap key-value. when primary osd is backfilled, this object will be lost#44450
rados-Backfill/Recovery: a RADOS object whose data size is 0 and that has a large number of omap key-value pairs is lost when the primary OSD is backfilled. issues/53757 Signed-off-by: xingyu wang alcyoneus86@163.com
|
Is this fix under review, or has it been overlooked? This PR seems important for RGW because bucket index objects are exactly the kind of object it describes: a RADOS object whose data size is 0 and that has a large number of omap key-value pairs. |
|
@alcyoneus86 Could you update this PR to pass the CI? |
|
jenkins retest this please |
|
@satoru-takeuchi : the commit signature seems to be in the wrong format ("wrong" here - not accepted by the |
      bool error = pushing[soid].begin()->second.recovery_progress.error;

    - if (!pi->recovery_progress.data_complete && !error) {
    + if ((!pi->recovery_progress.data_complete || !pi->recovery_progress.omap_complete)&& !error) {
a missing blank before the '&&'
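To make the effect of the changed condition concrete, here is a minimal sketch. It assumes the branch decides whether another push is sent for the object. `RecoveryProgress` here is a simplified, hypothetical stand-in with only the three flags named in the diff, and both helper functions are illustrative names, not Ceph code.

```cpp
// Hypothetical, simplified stand-in for the recovery progress flags in the diff.
struct RecoveryProgress {
  bool data_complete = false;
  bool omap_complete = false;
  bool error = false;
};

// Condition before the change: only the data flag decides whether more is pushed.
bool needs_more_pushes_old(const RecoveryProgress& p) {
  return !p.data_complete && !p.error;
}

// Condition after the change: keep pushing until both data and omap are complete.
bool needs_more_pushes_new(const RecoveryProgress& p) {
  return (!p.data_complete || !p.omap_complete) && !p.error;
}
```

With `data_complete == true` and `omap_complete == false`, the old condition is false (no further push) while the new one is true, which is the case the PR description is concerned with.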
Yes, so I've asked @alcyoneus86 to update this PR, but he has not replied yet. I consider the bug fixed by this PR to be serious, so if possible I'd like to open another PR with the same fix. Of course, the Signed-off-by will carry his name. Does that make sense? |
Maybe wait for the formal approval first, in case more issues are raised? Your call. |
OK, I'll wait for a while. |
|
@satoru-takeuchi , @alcyoneus86 : I've now spent two full days trying to create the faulty scenario that is supposed to be fixed by this PR - with no success. It seems as though the code does make sure to never set data_complete before omap_complete |
|
jenkins retest this please |
|
@alcyoneus86 Please tell us if you have a reproducer. In addition, I'd like to know how you found this problem, e.g. by reading the source code or by encountering it in a real system. @ronen-fr Unfortunately I've never reproduced this problem. I just thought it looked serious based on the commit description. Anyway, I'll also try to reproduce it. |
|
cc @vumrao |
It looks like it is only possible with FileStore. The condition here may be fulfilled if extents are removed from the list by readv(): ceph/src/osd/ReplicatedBackend.cc Lines 2143 to 2148 in 6b4382a The readv implementation for FileStore (using the default one in ObjectStore.h) will remove extents that have no data. BlueStore has its own readv implementation which does not modify the interval_set of extents passed in though, so it would only be possible with FileStore using certain allocation patterns and filesystem combinations that resulted in fiemap returning some extents that were in practice 0 length on disk (this is possible because fiemap on linux is an advisory output, not necessarily an authoritative one). So this seems to be fixing a real bug, though I'm uncertain how easy it is to hit - you'd need enough omap to avoid sending them in one push, plus a data allocation pattern in the underlying filesystem that results in empty extents (not sure if this is possible with all-omap objects like bucket indexes or not). |
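The difference between the two read behaviours described above can be sketched as follows. This is a toy model, not Ceph code: `Extents` is a hypothetical stand-in for an interval set, `bytes_on_disk` is a made-up lookup of how many bytes really exist at each offset, and `data_looks_complete` is an assumed stand-in for the condition at the linked lines, whose exact form is not shown in this thread.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>

// Hypothetical stand-in for an interval set of extents: offset -> length.
using Extents = std::map<std::uint64_t, std::uint64_t>;

// Models a read that drops extents with no data behind them,
// as described for the default readv used by FileStore.
Extents read_pruning_empty(const Extents& requested,
                           const std::map<std::uint64_t, std::uint64_t>& bytes_on_disk) {
  Extents kept;
  for (const auto& kv : requested) {
    auto it = bytes_on_disk.find(kv.first);
    if (it != bytes_on_disk.end() && it->second > 0) {
      kept[kv.first] = std::min(kv.second, it->second);
    }
  }
  return kept;
}

// Models a read that leaves the caller's extent list untouched,
// as described for BlueStore's own readv.
Extents read_preserving(const Extents& requested) {
  return requested;
}

// Assumed check: no extents left to send is treated as "data complete".
bool data_looks_complete(const Extents& after_read) {
  return after_read.empty();
}
```

In this model, an extent reported by fiemap that has zero bytes on disk disappears in the pruning read, so the data side can look finished while omap entries are still pending. The preserving read never reaches that state.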
|
@jdurgin Thank you for your input!
Unfortunately, I don't have FileStore OSDs, so for now I'd like to hear @alcyoneus86's answer. In addition, I'll check the source code to understand your comment in detail (I've not read FileStore's code). |
|
This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days. |
|
This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution! |