os/bluestore: compact db after bulk omap naming upgrade. #42218
Conversation
Approved, but it is better to use booleans.
src/os/bluestore/BlueStore.h
Outdated
@@ -3680,6 +3685,7 @@ class BlueStoreRepairer
 private:
   std::atomic<unsigned> to_repair_cnt = { 0 };
+  std::atomic<bool> need_compact = { 0 };
false, not 0.
Force-pushed from b10acad to 2a2e2ad (Compare)
Right, I just forgot to modify it after switching from unsigned to bool... Thanks!
Perhaps things have improved in Octopus+, but I wanted to note that at least in Nautilus, attempting to perform an online compaction on our rgw index or our spinners leads to them being marked down rather frequently.
Yeah, we have https://tracker.ceph.com/issues/50297 for that, but the compaction in this PR is performed during OSD startup, after the repair procedure, which might take a while by itself. Hence the OSD is already down at this point, and this would just extend that down period rather than cause a new down state.
Good to know. My follow-on concern, then, is that this could extend OSD downtime by a significant period of time; we've seen 1hr+ compactions on our spinners before. But I suppose that's better than the OSDs causing an outage when they come back up.
This is going to be a rare case - only an omap update or repair would trigger the compaction with this patch. Hence it's rather a single-shot thing on upgrade.
May be worth a switch though... |
Yeah, understood. Just nervous about exceeding pglogs and going into backfill during an upgrade. :D |
But since the extended downtime of an OSD upgrade with fsck for OMAP conversion is already mentioned in the release notes, this does not come as a surprise. For us it was rather a surprise to find our OSDs choking with a degraded RocksDB after the upgrade from Nautilus to Octopus. Having a compaction done as part of the upgrade is really a great service to the users.

As for the switch - users can already decide to not do the OMAP conversion on startup, or to do it later. But adding yet another switch to convert to OMAP and then not do a compaction seems excessive, IMHO.
Omap naming scheme upgrade introduced recently might perform bulk data removal and hence leave DB in a "degraded" state. Let's compact it.

Signed-off-by: Igor Fedotov <ifedotov@suse.com>
Force-pushed from 2a2e2ad to 0e5c140 (Compare)
Checklist
Show available Jenkins commands
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox