Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/bluestore/BlueFS: use 64K alloc_size on the shared device #29537

Merged
merged 5 commits into from
Aug 15, 2019

Conversation

liewegas
Copy link
Member

@liewegas liewegas commented Aug 7, 2019

The shared devices is susceptible to fragmentation, making the 1 MB bluefs allocations fail. Use a smaller allocation size (64K) for the shared device, while keeping the same large allocations for DB and WAL.


master tracker: https://tracker.ceph.com/issues/41301

Backports:

nautilus #30229
mimic #30219
luminous #29910

@liewegas liewegas requested review from ifed01, jdurgin and neha-ojha and removed request for ifed01 August 7, 2019 17:42
@neha-ojha
Copy link
Member

needs a rebase since #29425 merged

src/os/bluestore/BlueFS.cc Outdated Show resolved Hide resolved
@ifed01
Copy link
Contributor

ifed01 commented Aug 7, 2019

needs rebase

Copy link
Member

@jdurgin jdurgin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hopefully tests don't require any changes

.set_description(""),
.set_description("Allocation unit size for DB and WAL devices"),

Option("bluefs_shared_alloc_size", Option::TYPE_SIZE, Option::LEVEL_ADVANCED)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add to bluestore docs/

Copy link
Member

@neha-ojha neha-ojha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks fine to me, @liewegas were you planning to run this through teuthology or should I?

@liewegas
Copy link
Member Author

liewegas commented Aug 7, 2019 via email

@neha-ojha
Copy link
Member

I'm heading out for the day, can you queue up a test? Thanks!

sure!

@neha-ojha
Copy link
Member

the make check failure in unittest_bluefs looks related

@neha-ojha
Copy link
Member

http://pulpito.ceph.com/nojha-2019-08-08_00:28:04-rados-wip-bluefs-shared-alloc-distro-basic-smithi/ - lot of test failures due to Floating point exception

@liewegas
Copy link
Member Author

liewegas commented Aug 8, 2019 via email

@liewegas
Copy link
Member Author

liewegas commented Aug 8, 2019

@neha-ojha should i cherry-pick 8128240 here too?

@neha-ojha
Copy link
Member

@neha-ojha should i cherry-pick 8128240 here too?

don't think that'll be required, "bluefs_alloc_size" will still be 1Mb

@tchaikov
Copy link
Contributor

tchaikov commented Aug 8, 2019

@markhpc
Copy link
Member

markhpc commented Aug 8, 2019

cbt errors are due to a regression introduced in ceph/cbt#178. We merged ceph/cbt#182 yesterday which turned out to be an incomplete fix. ceph/cbt#184 should be a better fix until we introduce a better settings hierarchy.

neha-ojha and others added 3 commits August 8, 2019 11:23
Add a separate config option that controls the alloc_size for the shared
device (BDEV_SLOW).

Signed-off-by: Sage Weil <sage@redhat.com>
Signed-off-by: Sage Weil <sage@redhat.com>
<< " from " << (int)dev_target << dendl;
return -EIO;
}
rewrite = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we break the loop here?

Keep an alloc_size vector so that we have this value handy at all times.
Allow bluestore to fetch this value directly instead of looking at the
bluefs_* config options since this encapsulates things a bit better, and
also isn't vulnerable to the config setting changing at runtime.

Signed-off-by: Sage Weil <sage@redhat.com>
…sizes

The previous implementation moved extents individually.  This caused
problems when moving an extent with a small alloc_size that wasn't
a multiple of the target device's alloc_size.

Instead, identify files with extents that need to be moved, and then read
the file in its entirety and rewrite it in its entirety.

Signed-off-by: Sage Weil <sage@redhat.com>
@neha-ojha
Copy link
Member

@liewegas liewegas merged commit 9426974 into ceph:master Aug 15, 2019
liewegas added a commit that referenced this pull request Aug 15, 2019
* refs/pull/29537/head:
	os/bluestore/BlueFS: fix device_migrate_to_* to handle varying alloc sizes
	os/bluestore/BlueFS: apply shared_alloc_size to shared device
	os/bluestore: whitespace
	os/bluestore/BlueFS: add bluefs_shared_alloc_size
	os/bluestore/BlueStore.cc: start should be >= _get_ondisk_reserved()

Reviewed-by: Igor Fedotov <ifedotov@suse.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Neha Ojha <nojha@redhat.com>
@smithfarm
Copy link
Contributor

smithfarm commented Sep 8, 2019

@liewegas @jdurgin @neha-ojha @vumrao I added a backports summary to the description of this PR. The current mimic backport #30219 differs from the nautilus backport #29739 . . . is that OK?

@smithfarm
Copy link
Contributor

@vumrao ^

@smithfarm
Copy link
Contributor

I see there are also trackers. . . uff. Will try to get this cleaned up.

@vumrao
Copy link
Contributor

vumrao commented Sep 9, 2019

@smithfarm Yes, mimic/luminous have little different handling as Josh mentioned in his commit(b5de477). Looks like now all taken care. Thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants