
os/bluestore: compact db after bulk omap naming upgrade. #42218

Merged
merged 1 commit on Jul 14, 2021

Conversation

@ifed01 (Contributor) commented Jul 7, 2021

The recently introduced omap naming scheme upgrade might perform bulk data
removal and hence leave the DB in a "degraded" state. Let's compact it.

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
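
For orientation, a minimal self-contained sketch of the pattern this change adds (the KeyValueDB stand-in and every name below are illustrative, not the actual BlueStore code; the real interfaces live in src/os/bluestore/ and src/kv/):

```cpp
#include <atomic>
#include <iostream>

// Illustrative stand-in for Ceph's key-value DB abstraction;
// only the compaction entry point matters for this sketch.
struct KeyValueDB {
  void compact() { std::cout << "compacting whole DB\n"; }
};

// Sketch of the flag this PR threads through the repair/upgrade path:
// bulk removals record that a compaction is wanted, and it runs once
// at the end instead of after every deletion.
class Repairer {
public:
  void request_compaction() { need_compact = true; }
  bool compaction_needed() const { return need_compact; }
private:
  std::atomic<bool> need_compact{ false };
};

int main() {
  KeyValueDB db;
  Repairer repairer;

  // The omap naming-scheme upgrade would run here, bulk-deleting
  // old-format keys and leaving the DB full of tombstones...
  repairer.request_compaction();

  // ...so compact once after the whole upgrade finishes.
  if (repairer.compaction_needed()) {
    db.compact();
  }
  return 0;
}
```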

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug


@ifed01 requested a review from aclamk July 7, 2021 14:11
@github-actions bot added the core label Jul 7, 2021
@aclamk (Contributor) left a comment

Approved, but it is better to use booleans.

@@ -3680,6 +3685,7 @@ class BlueStoreRepairer

private:
std::atomic<unsigned> to_repair_cnt = { 0 };
std::atomic<bool> need_compact = { 0 };

false, not 0.
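
For reference, a minimal sketch of the initialization the comment asks for:

```cpp
#include <atomic>

// Use a boolean literal to initialize the boolean atomic flag.
std::atomic<bool> need_compact = { false };
```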

src/os/bluestore/BlueStore.h (outdated; resolved)
@ifed01 force-pushed the wip-ifed-compact-after-upgrade branch from b10acad to 2a2e2ad on July 8, 2021 12:18
@ifed01 (Contributor, Author) commented Jul 8, 2021

> Approved, but it is better to use booleans.

Right, I just forgot to modify it after switching from unsigned to bool... Thanks!

@ifed01 added the need-qa label Jul 8, 2021
@baergj commented Jul 8, 2021

Perhaps things have improved in Octopus+, but I wanted to note that at least in Nautilus, attempting to perform an online compaction on our rgw index or our spinners leads to them being marked down rather frequently.

@ifed01 (Contributor, Author) commented Jul 8, 2021

> Perhaps things have improved in Octopus+, but I wanted to note that at least in Nautilus, attempting to perform an online compaction on our rgw index or our spinners leads to them being marked down rather frequently.

Yeah, we have https://tracker.ceph.com/issues/50297 for that, but the compaction in this PR is performed during OSD startup, after the repair procedure, which might take a while by itself. Hence the OSD is already down at this point; this would just extend that down period, not cause a down state.
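
To make the sequencing concrete, here is a hypothetical sketch of the startup flow being described, reusing the illustrative KeyValueDB and Repairer types from the earlier sketch (the function name and steps are made up for illustration, not the actual BlueStore entry points):

```cpp
// The fsck/repair pass (including the bulk omap naming upgrade) runs
// before the OSD is marked up, so compacting here only lengthens the
// existing down window instead of taking a live OSD down.
void osd_startup(KeyValueDB& db, Repairer& repairer) {
  // 1. fsck/repair: the omap upgrade may bulk-delete old-format keys
  //    and call repairer.request_compaction().
  // 2. Compact once, while the OSD is still down.
  if (repairer.compaction_needed()) {
    db.compact();  // blocks until compaction completes
  }
  // 3. Continue normal boot: load PGs, peer, and mark the OSD up.
}
```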

@baergj commented Jul 8, 2021

Good to know. My follow-on concern, then, is that this could extend OSD downtime by a significant period of time; we've seen 1hr+ compactions on our spinners before. But I suppose that's better than the OSDs causing an outage when they come back up.

@ifed01 (Contributor, Author) commented Jul 8, 2021

> Good to know. My follow-on concern, then, is that this could extend OSD downtime by a significant period of time; we've seen 1hr+ compactions on our spinners before. But I suppose that's better than the OSDs causing an outage when they come back up.

This is going to be a rare case: only an omap update or repair would trigger the compaction with this patch. Hence it's rather a single-shot thing on upgrade.

@ifed01 (Contributor, Author) commented Jul 8, 2021

Maybe worth a switch, though...

@baergj commented Jul 8, 2021

Yeah, understood. Just nervous about exceeding pglogs and going into backfill during an upgrade. :D

@frittentheke (Contributor) commented Jul 9, 2021

> Good to know. My follow-on concern, then, is that this could extend OSD downtime by a significant period of time; we've seen 1hr+ compactions on our spinners before. But I suppose that's better than the OSDs causing an outage when they come back up.

> This is going to be a rare case: only an omap update or repair would trigger the compaction with this patch. Hence it's rather a single-shot thing on upgrade.

But since the extended downtime of an OSD upgrade with fsck for OMAP conversion is already mentioned in the release notes, this does not come as a surprise.

For us it was rather a surprise to find our OSDs choking with a degraded RocksDB after the upgrade from Nautilus to Octopus. Having a compaction done as part of the upgrade is really a great service to the users.

As for the switch: users can already decide not to do the OMAP conversion on startup, or to do it later. But adding yet another switch to convert OMAP and then skip the compaction seems excessive, IMHO.

The recently introduced omap naming scheme upgrade might perform bulk data
removal and hence leave the DB in a "degraded" state. Let's compact it.

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
@ifed01 force-pushed the wip-ifed-compact-after-upgrade branch from 2a2e2ad to 0e5c140 on July 9, 2021 09:32