Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/bluestore: set bitmap freelist resolution to min_alloc_size #17610

Merged
merged 5 commits into from Sep 14, 2017

Conversation

Projects
None yet
3 participants
@liewegas
Copy link
Member

commented Sep 8, 2017

min_alloc_size is now fixed at mkfs time, so we may as well set the bitmap so that one
bit == min_alloc_size bytes. For HDD's this is a 16x reduction in size!

liewegas added some commits Sep 8, 2017

os/bluestore: require that bluefs_alloc_size be multiple of min_alloc…
…_size

Signed-off-by: Sage Weil <sage@redhat.com>
os/bluestore: mkfs: choose min_alloc_size earlier
Signed-off-by: Sage Weil <sage@redhat.com>
os/bluestore/FreelistManager: create: accept min alloc size
Accept a block size other than bdev_block_size.  Let's call it, oh, I don't
know, min_alloc_size.

Signed-off-by: Sage Weil <sage@redhat.com>
os/bluestore: align bluefs_extents to min_alloc_size
Signed-off-by: Sage Weil <sage@redhat.com>

@liewegas liewegas added the bluestore label Sep 8, 2017

@liewegas liewegas requested review from markhpc and varadakari Sep 8, 2017

os/bluestore: use min_alloc_size for freelist resolution
For HDD with min_alloc_size=64k, this is a 16x reduction in allocation
metadata!

Signed-off-by: Sage Weil <sage@redhat.com>

@liewegas liewegas force-pushed the liewegas:wip-bluestore-fm-mas branch from aa9b19c to f542a25 Sep 8, 2017

@liewegas liewegas requested a review from xiexingguo Sep 13, 2017

@@ -4555,6 +4555,13 @@ int BlueStore::_open_db(bool create)
bdev->get_size() * (cct->_conf->bluestore_bluefs_min_ratio +
cct->_conf->bluestore_bluefs_gift_ratio);
initial = MAX(initial, cct->_conf->bluestore_bluefs_min);
if (cct->_conf->bluefs_alloc_size % min_alloc_size) {

This comment has been minimized.

Copy link
@xiexingguo

xiexingguo Sep 14, 2017

Member

This is not right. We haven't choose a proper min_alloc_size yet by now. See mkfs.
Sorry for the noise, should go through the whole change set first:-)

@@ -6021,8 +6029,8 @@ int BlueStore::fsck(bool deep)
size_t next = used_blocks.find_next(cur);
if (next != cur + 1) {
derr << __func__ << " error: leaked extent 0x" << std::hex
<< ((uint64_t)start * block_size) << "~"

This comment has been minimized.

Copy link
@xiexingguo

xiexingguo Sep 14, 2017

Member

We introduce some related changes at #17268 to get the wrongly truncated last 4k block back. Seems that won't be able to function as it is supposed to(after this patch is backported)...

This comment has been minimized.

Copy link
@liewegas

liewegas Sep 14, 2017

Author Member

Yeah.. it occurs to me now that we actually have to revisit that fix, because it only corrects the problem if you run fsck, and that might not happen before they upgrade past luminous. I think we should make mount() check detect and correct the condition.

This comment has been minimized.

Copy link
@xiexingguo

xiexingguo Sep 14, 2017

Member

I think we should make mount() check detect and correct the condition.

👍

@xiexingguo

This comment has been minimized.

Copy link
Member

commented Sep 14, 2017

Good patch!

@liewegas liewegas added the needs-qa label Sep 14, 2017

@varadakari
Copy link
Contributor

left a comment

LGTM

r = -EINVAL;
goto out_close_bdev;
}

This comment has been minimized.

Copy link
@varadakari

varadakari Sep 14, 2017

Contributor

a nit, for the readability, can we make this a new function, so that we can make changes locally than moving the whole block and any changes in future will be only to this.

{
bytes_per_block = cct->_conf->bdev_block_size;
bytes_per_block = std::max(cct->_conf->bdev_block_size,
(int64_t)min_alloc_size);

This comment has been minimized.

Copy link
@varadakari

varadakari Sep 14, 2017

Contributor

from uint64_t to int64_t, can we change the other or make everything uint?

@@ -4169,7 +4169,7 @@ int BlueStore::_open_fm(bool create)
bl.append(freelist_type);
t->set(PREFIX_SUPER, "freelist_type", bl);
}
fm->create(bdev->get_size(), t);
fm->create(bdev->get_size(), cct->_conf->bdev_block_size, t);

This comment has been minimized.

Copy link
@varadakari

varadakari Sep 14, 2017

Contributor

should we send min_alloc_size? Anyways we are taking max of bdev_block_size, in freelist manager.

@liewegas liewegas force-pushed the liewegas:wip-bluestore-fm-mas branch from f542a25 to 6b8e4d5 Sep 14, 2017

@liewegas liewegas merged commit 1587f87 into ceph:master Sep 14, 2017

0 of 5 checks passed

Docs: build check Docs: building
Details
Signed-off-by checking if commits are signed
Details
Unmodified Submodules checking if PR has modified submodules
Details
make check running make check
Details
make check (arm64) running make check
Details

@liewegas liewegas deleted the liewegas:wip-bluestore-fm-mas branch Sep 14, 2017

dillaman pushed a commit to dillaman/ceph that referenced this pull request Sep 27, 2017

Ning Yao
FileStore:: fix fiemap issue in xfs when #extents > 1364
Fixes: ceph#17610
Backport: jewel, hammer
Signed-off-by: Ning Yao <yaoning@unitedstack.com>
(cherry picked from commit 1a1c126)

Conflicts:
        src/os/FileStore.cc
            in hammer, there is no _do_seek_hole_data() function so remove it
            in hammer, the logic is in FileStore::fiemap not in _do_fiemap()
            so port the logic to the else branch in FileStore::fiemap
@xiexingguo

This comment has been minimized.

Copy link
Member

commented Sep 30, 2017

Backport: #18050

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.