os/bluestore/BlueFS: use 64K alloc_size on the shared device #29537
force-pushed from 52eb067 to dfd97b8
needs a rebase since #29425 merged
needs rebase
hopefully tests don't require any changes
.set_description(""),
.set_description("Allocation unit size for DB and WAL devices"),

Option("bluefs_shared_alloc_size", Option::TYPE_SIZE, Option::LEVEL_ADVANCED)
add to bluestore docs/
looks fine to me, @liewegas were you planning to run this through teuthology or should I?
force-pushed from dfd97b8 to 3667cc1
I'm heading out for the day, can you queue up a test? Thanks!
sure!
the make check failure in unittest_bluefs looks related
http://pulpito.ceph.com/nojha-2019-08-08_00:28:04-rados-wip-bluefs-shared-alloc-distro-basic-smithi/ - lot of test failures due to
force-pushed from 3667cc1 to e03e743
fixed
@neha-ojha should I cherry-pick 8128240 here too?
don't think that'll be required, "bluefs_alloc_size" will still be 1 MB
cbt errors are due to a regression introduced in ceph/cbt#178. We merged ceph/cbt#182 yesterday, which turned out to be an incomplete fix. ceph/cbt#184 should be a better fix until we introduce a better settings hierarchy.
Signed-off-by: Neha Ojha <nojha@redhat.com>
Add a separate config option that controls the alloc_size for the shared device (BDEV_SLOW). Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
force-pushed from 9bdbffa to f7233ba
<< " from " << (int)dev_target << dendl;
return -EIO;
}
rewrite = true;
can we break the loop here?
Keep an alloc_size vector so that we have this value handy at all times. Allow bluestore to fetch this value directly instead of looking at the bluefs_* config options since this encapsulates things a bit better, and also isn't vulnerable to the config setting changing at runtime. Signed-off-by: Sage Weil <sage@redhat.com>
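A minimal sketch of the idea in this commit message, not the actual BlueFS code: the per-device allocation sizes are copied out of the config once and served from a vector afterwards, so a later config change cannot alter them. `BlueFSConfig`, `AllocSizes`, and the numeric device ids are hypothetical stand-ins; only the option names, `BDEV_SLOW`, and the 1 MB / 64K values come from this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical device ids for illustration; the numeric values are assumed.
enum BlueFSDevice : unsigned { BDEV_WAL = 0, BDEV_DB = 1, BDEV_SLOW = 2, MAX_BDEV = 3 };

// Hypothetical stand-in for the two config options discussed in this PR.
struct BlueFSConfig {
  uint64_t bluefs_alloc_size = 1024 * 1024;       // 1 MB for DB and WAL
  uint64_t bluefs_shared_alloc_size = 64 * 1024;  // 64K for the shared device
};

// Snapshots the allocation sizes at construction time.
class AllocSizes {
 public:
  explicit AllocSizes(const BlueFSConfig& conf)
      : sizes_(MAX_BDEV, conf.bluefs_alloc_size) {
    // Assumption taken from the commit message: the shared device is BDEV_SLOW.
    sizes_[BDEV_SLOW] = conf.bluefs_shared_alloc_size;
  }

  // Callers ask for the stored value instead of re-reading the config.
  uint64_t get(BlueFSDevice dev) const {
    assert(dev < MAX_BDEV);
    return sizes_[dev];
  }

 private:
  std::vector<uint64_t> sizes_;
};
```

Because `get()` reads the stored vector, changing `bluefs_shared_alloc_size` on the config object after construction has no effect on the values already handed out.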
…sizes The previous implementation moved extents individually. This caused problems when moving an extent with a small alloc_size that wasn't a multiple of the target device's alloc_size. Instead, identify files with extents that need to be moved, and then read the file in its entirety and rewrite it in its entirety. Signed-off-by: Sage Weil <sage@redhat.com>
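To illustrate the mismatch described in this commit message, here is a simplified sketch under stated assumptions. The `Extent` struct and all function names are hypothetical and do not mirror the real BlueFS types, and the file size is approximated by the sum of its extent lengths. It shows a check for whether an extent length is a whole number of target allocation units, an early-exit scan for files that need to move, and the allocation needed when the file is rewritten as a whole.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified extent record; not the real BlueFS type.
struct Extent {
  unsigned dev;      // device the extent lives on
  uint64_t offset;   // byte offset on that device
  uint64_t length;   // byte length
};

uint64_t round_up(uint64_t value, uint64_t unit) {
  assert(unit > 0);
  return (value + unit - 1) / unit * unit;
}

// True only if the extent length is a whole number of target allocation units.
bool fits_target_unit(const Extent& e, uint64_t target_alloc_size) {
  assert(target_alloc_size > 0);
  return e.length % target_alloc_size == 0;
}

// Returns as soon as one extent on the source device is found.
bool needs_rewrite(const std::vector<Extent>& extents, unsigned src_dev) {
  for (const auto& e : extents) {
    if (e.dev == src_dev) {
      return true;
    }
  }
  return false;
}

// Space to allocate on the target when the file is rewritten in its entirety.
uint64_t whole_file_allocation(const std::vector<Extent>& extents,
                               uint64_t target_alloc_size) {
  uint64_t total = 0;
  for (const auto& e : extents) {
    total += e.length;
  }
  return round_up(total, target_alloc_size);
}
```

For a file made of three 64K extents on the shared device, none of the extents is a multiple of a 1 MB target unit, while rewriting the whole file needs a single 1 MB allocation on the target.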
force-pushed from 7baa22e to 9426974
* refs/pull/29537/head:
    os/bluestore/BlueFS: fix device_migrate_to_* to handle varying alloc sizes
    os/bluestore/BlueFS: apply shared_alloc_size to shared device
    os/bluestore: whitespace
    os/bluestore/BlueFS: add bluefs_shared_alloc_size
    os/bluestore/BlueStore.cc: start should be >= _get_ondisk_reserved()

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
@vumrao ^
I see there are also trackers... uff. Will try to get this cleaned up.
@smithfarm Yes, mimic/luminous have slightly different handling, as Josh mentioned in his commit (b5de477). Looks like it's all taken care of now. Thank you.
The shared device is susceptible to fragmentation, which makes the 1 MB bluefs allocations fail. Use a smaller allocation size (64K) for the shared device, while keeping the same large allocations for DB and WAL.
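The effect of fragmentation on the allocation unit can be sketched as follows. This is an illustrative model only, not the BlueStore allocator: `FreeExtent`, `usable_units`, and `fragmented_layout` are hypothetical names, and the layout (every other 512 KiB chunk free across 64 MiB) is made up for the example. The function counts how many aligned allocation units fit entirely inside the free extents.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical free-space record for illustration.
struct FreeExtent {
  uint64_t offset;
  uint64_t length;
};

// Count the alloc_size-aligned units that fit entirely inside free extents.
uint64_t usable_units(const std::vector<FreeExtent>& free_list,
                      uint64_t alloc_size) {
  assert(alloc_size > 0);
  uint64_t units = 0;
  for (const auto& e : free_list) {
    // Round the start up and the end down to the allocation unit.
    uint64_t start = (e.offset + alloc_size - 1) / alloc_size * alloc_size;
    uint64_t end = (e.offset + e.length) / alloc_size * alloc_size;
    if (end > start) {
      units += (end - start) / alloc_size;
    }
  }
  return units;
}

// Made-up fragmented layout: every other 512 KiB chunk of 64 MiB is free.
std::vector<FreeExtent> fragmented_layout() {
  std::vector<FreeExtent> free_list;
  const uint64_t chunk = 512 * 1024;
  const uint64_t device_size = 64ull * 1024 * 1024;
  for (uint64_t off = 0; off < device_size; off += 2 * chunk) {
    free_list.push_back({off, chunk});
  }
  return free_list;
}
```

In this layout half of the device (32 MiB) is free, yet no 1 MB unit fits in any free extent, while 512 units of 64K do.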
master tracker: https://tracker.ceph.com/issues/41301
Backports:
nautilus #30229
mimic #30219
luminous #29910