
os/bluestore: compact db after bulk omap naming upgrade. #42218

Merged
merged 1 commit on Jul 14, 2021

Conversation

@ifed01 (Contributor) commented Jul 7, 2021

The recently introduced omap naming scheme upgrade might perform bulk data
removal and hence leave the DB in a "degraded" state. Let's compact it.

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
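
For orientation, a minimal self-contained sketch of the pattern this change adds (the KeyValueDB stand-in and every name below are illustrative, not the actual BlueStore code; the real interfaces live in src/os/bluestore/ and src/kv/):

```cpp
#include <atomic>
#include <iostream>

// Illustrative stand-in for Ceph's key-value DB abstraction;
// only the compaction entry point matters for this sketch.
struct KeyValueDB {
  void compact() { std::cout << "compacting whole DB\n"; }
};

// Sketch of the flag this PR threads through the repair/upgrade path:
// bulk removals record that a compaction is wanted, and it runs once
// at the end instead of after every deletion.
class Repairer {
public:
  void request_compaction() { need_compact = true; }
  bool compaction_needed() const { return need_compact; }
private:
  std::atomic<bool> need_compact{ false };
};

int main() {
  KeyValueDB db;
  Repairer repairer;

  // The omap naming-scheme upgrade would run here, bulk-deleting
  // old-format keys and leaving the DB full of tombstones...
  repairer.request_compaction();

  // ...so compact once after the whole upgrade finishes.
  if (repairer.compaction_needed()) {
    db.compact();
  }
  return 0;
}
```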

Checklist

  • References tracker ticket
  • Updates documentation if necessary
  • Includes tests for new functionality or reproducer for bug


@ifed01 requested a review from aclamk July 7, 2021 14:11
@github-actions bot added the core label Jul 7, 2021
@aclamk (Contributor) left a comment

Approved, but it is better to use booleans.

@@ -3680,6 +3685,7 @@ class BlueStoreRepairer

private:
std::atomic<unsigned> to_repair_cnt = { 0 };
std::atomic<bool> need_compact = { 0 };

false, not 0.
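
For reference, a minimal sketch of the initialization the comment asks for:

```cpp
#include <atomic>

// Use a boolean literal to initialize the boolean atomic flag.
std::atomic<bool> need_compact = { false };
```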

src/os/bluestore/BlueStore.h (outdated; resolved)
@ifed01 force-pushed the wip-ifed-compact-after-upgrade branch from b10acad to 2a2e2ad on July 8, 2021 12:18
@ifed01 (Contributor, Author) commented Jul 8, 2021

> Approved, but it is better to use booleans.

Right, I just forgot to modify it after switching from unsigned to bool... Thanks!

@ifed01 added the need-qa label Jul 8, 2021
@baergj commented Jul 8, 2021

Perhaps things have improved in Octopus+, but I wanted to note that at least in Nautilus, attempting to perform an online compaction on our rgw index or our spinners leads to them being marked down rather frequently.

@ifed01 (Contributor, Author) commented Jul 8, 2021

> Perhaps things have improved in Octopus+, but I wanted to note that at least in Nautilus, attempting to perform an online compaction on our rgw index or our spinners leads to them being marked down rather frequently.

Yeah, we have https://tracker.ceph.com/issues/50297 for that, but the compaction in this PR is performed during OSD startup, after the repair procedure, which might take a while by itself. Hence the OSD is already down at this point; this would just extend that down period, not cause a down state.
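
To make the sequencing concrete, here is a hypothetical sketch of the startup flow being described, reusing the illustrative KeyValueDB and Repairer types from the earlier sketch (the function name and steps are made up for illustration, not the actual BlueStore entry points):

```cpp
// The fsck/repair pass (including the bulk omap naming upgrade) runs
// before the OSD is marked up, so compacting here only lengthens the
// existing down window instead of taking a live OSD down.
void osd_startup(KeyValueDB& db, Repairer& repairer) {
  // 1. fsck/repair: the omap upgrade may bulk-delete old-format keys
  //    and call repairer.request_compaction().
  // 2. Compact once, while the OSD is still down.
  if (repairer.compaction_needed()) {
    db.compact();  // blocks until compaction completes
  }
  // 3. Continue normal boot: load PGs, peer, and mark the OSD up.
}
```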

@baergj commented Jul 8, 2021

Good to know. My follow-on concern, then, is that this could extend OSD downtime by a significant period of time; we've seen 1hr+ compactions on our spinners before. But I suppose that's better than the OSDs causing an outage when they come back up.

@ifed01 (Contributor, Author) commented Jul 8, 2021

> Good to know. My follow-on concern, then, is that this could extend OSD downtime by a significant period of time; we've seen 1hr+ compactions on our spinners before. But I suppose that's better than the OSDs causing an outage when they come back up.

This is going to be a rare case: only an omap update or repair would trigger the compaction with this patch. Hence it's rather a single-shot thing on upgrade.

@ifed01 (Contributor, Author) commented Jul 8, 2021

Maybe worth a switch, though...

@baergj commented Jul 8, 2021

Yeah, understood. Just nervous about exceeding pglogs and going into backfill during an upgrade. :D

@frittentheke (Contributor) commented Jul 9, 2021

> Good to know. My follow-on concern, then, is that this could extend OSD downtime by a significant period of time; we've seen 1hr+ compactions on our spinners before. But I suppose that's better than the OSDs causing an outage when they come back up.

> This is going to be a rare case: only an omap update or repair would trigger the compaction with this patch. Hence it's rather a single-shot thing on upgrade.

But since the extended downtime of an OSD upgrade with fsck for OMAP conversion is already mentioned in the release notes, this does not come as a surprise.

For us it was rather a surprise to find our OSDs choking with a degraded RocksDB after the upgrade from Nautilus to Octopus. Having a compaction done as part of the upgrade is really a great service to the users.

As for the switch: users can already decide not to do the OMAP conversion on startup, or to do it later. But adding yet another switch to convert OMAP and then skip the compaction seems excessive, IMHO.

The recently introduced omap naming scheme upgrade might perform bulk data
removal and hence leave the DB in a "degraded" state. Let's compact it.

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
@ifed01 force-pushed the wip-ifed-compact-after-upgrade branch from 2a2e2ad to 0e5c140 on July 9, 2021 09:32