Skip to content

rados-Backfill/Recovery: rados object that data size is 0 and this object have a large amount of omap key-value. when primary osd is backfilled, this object will be lost#44450

Closed
alcyoneus86 wants to merge 1 commit intoceph:mainfrom
alcyoneus86:master

Conversation

@alcyoneus86
Copy link
Copy Markdown

rados-Backfill/Recovery: rados object that data size is 0 and this object have a large amount of omap key-value. when primary osd is backfilled, this object will be lost. issues/53757
Signed-off-by: xingyu wang alcyoneus86@163.com

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

rados-Backfill/Recovery: rados object that data size is 0 and this object have a large amount of omap key-value. when primary osd is backfilled, this object will be lost. issues/53757
Signed-off-by: xingyu wang alcyoneus86@163.com
@github-actions github-actions bot added the core label Jan 4, 2022
@satoru-takeuchi
Copy link
Copy Markdown
Contributor

Is this fix under review? Or has it been overlooked? This PR seems to be serious for RGW because bucket index objects are "rados object that data size is 0 and this object have a large amount of omap key-value.".

@satoru-takeuchi
Copy link
Copy Markdown
Contributor

@alcyoneus86 Could you update this PR to pass the CI?

@neha-ojha neha-ojha self-requested a review February 7, 2022 22:32
@neha-ojha
Copy link
Copy Markdown
Member

jenkins retest this please

@ronen-fr
Copy link
Copy Markdown
Contributor

ronen-fr commented Mar 6, 2022

@satoru-takeuchi : the commit signature seems to be in the wrong format ("wrong" here - not accepted by the
CI)

bool error = pushing[soid].begin()->second.recovery_progress.error;

if (!pi->recovery_progress.data_complete && !error) {
if ((!pi->recovery_progress.data_complete || !pi->recovery_progress.omap_complete)&& !error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a missing blank before the '&&'

@satoru-takeuchi
Copy link
Copy Markdown
Contributor

@ronen-fr

the commit signature seems to be in the wrong format ("wrong" here - not accepted by the
CI)

Yes, so I've asked @alcyoneus86 to update this PR. But he've not replied yet.

I consider that the bug to be fixed in this PR is serious. So if possible I'd like to make another PR having the same fix as this PR. Of course, signed-off-by will be his name. Does it make sense?

@ronen-fr
Copy link
Copy Markdown
Contributor

ronen-fr commented Mar 7, 2022

Yes, so I've asked @alcyoneus86 to update this PR. But he've not replied yet.

I consider that the bug to be fixed in this PR is serious. So if possible I'd like to make another PR having the same fix as this PR. Of course, signed-off-by will be his name. Does it make sense?

Maybe wait for the formal approval first, just in case there will be more issues suggested? your call

@satoru-takeuchi
Copy link
Copy Markdown
Contributor

Maybe wait for the formal approval first,

OK, I'll wait for a while.

@ronen-fr
Copy link
Copy Markdown
Contributor

ronen-fr commented Mar 9, 2022

@satoru-takeuchi , @alcyoneus86 : I've now spent two full days trying to create the faulty scenario that is supposed to be fixed by this PR - with no success. It seems as though the code does make sure to never set data_complete before omap_complete
(and, btw, omap_complete is checked in ObjectRecoveryProgress.is_complete()).
Can you show me how to reproduce a problem?
Thanks

@ronen-fr
Copy link
Copy Markdown
Contributor

ronen-fr commented Mar 9, 2022

jenkins retest this please

@satoru-takeuchi
Copy link
Copy Markdown
Contributor

@alcyoneus86 Please tell us if you have a reproducer. In addition, I'd like to know how did you find this problem, e.g. when reading source code, or encounted this problem in real system, and so on.

@ronen-fr Unfortunately I've never reproduced this problem. I just thought that this problem seems to be serious as a result of looking at the commit description. Anyway, I'll also try to reproduce this problem.

@vumrao
Copy link
Copy Markdown
Contributor

vumrao commented Mar 10, 2022

cc @vumrao

@jdurgin
Copy link
Copy Markdown
Member

jdurgin commented Mar 10, 2022

@satoru-takeuchi , @alcyoneus86 : I've now spent two full days trying to create the faulty scenario that is supposed to be fixed by this PR - with no success. It seems as though the code does make sure to never set data_complete before omap_complete (and, btw, omap_complete is checked in ObjectRecoveryProgress.is_complete()). Can you show me how to reproduce a problem? Thanks

It looks like it is only possible with FileStore.

The condition here may be fulfilled if extents are removed from the list by readv():

if (out_op->data_included.size() != origin_size) {
dout(10) << __func__ << " some extents get pruned "
<< out_op->data_included.size() << "/" << origin_size
<< dendl;
new_progress.data_complete = true;
}

The readv implementation for FileStore (using the default one in ObjectStore.h) will remove extents that have no data. BlueStore has its own readv implementation which does not modify the interval_set of extents passed in though, so it would only be possible with FileStore using certain allocation patterns and filesystem combinations that resulted in fiemap returning some extents that were in practice 0 length on disk (this is possible because fiemap on linux is an advisory output, not necessarily an authoritative one).

So this seems to be fixing a real bug, though I'm uncertain how easy it is to hit - you'd need enough omap to avoid sending them in one push, plus a data allocation pattern in the underlying filesystem that results in empty extents (not sure if this is possible with all-omap objects like bucket indexes or not).

@satoru-takeuchi
Copy link
Copy Markdown
Contributor

@jdurgin Thank you for your input!

So this seems to be fixing a real bug, though I'm uncertain how easy it is to hit - you'd need enough omap to avoid sending them in one push, plus a data allocation pattern in the underlying filesystem that results in empty extents (not sure if this is possible with all-omap objects like bucket indexes or not).

Unfortunately, I don't have FIleStore OSDs. So I'd like to know @alcyoneus86 's answer for now. In addition, I'll check the source code to understand your comment in detail (I've not read FileStore's code).

@djgalloway djgalloway changed the base branch from master to main May 27, 2022 16:10
@github-actions
Copy link
Copy Markdown

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Jul 28, 2022
@github-actions
Copy link
Copy Markdown

github-actions bot commented Jan 5, 2023

This pull request has been automatically closed because there has been no activity for 90 days. Please feel free to reopen this pull request (or open a new one) if the proposed change is still appropriate. Thank you for your contribution!

@github-actions github-actions bot closed this Jan 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants