Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/filestore/FileJournal: set block size via config option #7628

Merged
merged 2 commits into from Mar 7, 2016

Conversation

liewegas
Copy link
Member

We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device. In reality, file systmes
report all kinds of crazy block sizes. Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want. Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes. And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about alignment of memory buffers
for the purposes of direct IO. Rename the option but do not change
the logic. That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil sage@redhat.com

@wjwithagen
Copy link
Contributor

@liewegas
I think that the line with
assert(0 == "bl should be align");
Should move to the bottom of the code block.
Since this assert always triggers, the others in the block are irrelevant.
Changing the order allows the other statements to execute at least their tests.
If nothing trigger we get this line that triggers.

We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
@liewegas
Copy link
Member Author

@wjwithagen thanks, fixed the assert placement.

assert((pos & (CEPH_MINIMUM_BLOCK_SIZE - 1)) == 0);
assert((bl.length() & (CEPH_DIRECTIO_ALIGNMENT - 1)) == 0);
assert((pos & (CEPH_DIRECTIO_ALIGNMENT - 1)) == 0);
assert(0 == "bl should be align");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"aligned", while we're at it, and I think it's clearer to say "bl is not aligned"

Signed-off-by: Sage Weil <sage@redhat.com>
@wjwithagen
Copy link
Contributor

looks good to me.
Testing went from 27 FAIL to 20 FAIL.
So 7 more osd tests running to completion.
And the align assert has NOT occurred at all

wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Feb 19, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Feb 24, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Feb 26, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
@liewegas liewegas added this to the jewel milestone Mar 1, 2016
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Mar 1, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Mar 2, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Mar 2, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Mar 7, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
liewegas added a commit that referenced this pull request Mar 7, 2016
os/filestore/FileJournal: set block size via config option

Reviewed-by: Willem Jan Withagen <wjw@digiware.nl>
@liewegas liewegas merged commit 216967d into ceph:master Mar 7, 2016
@liewegas liewegas deleted the wip-journal-block-size branch March 7, 2016 17:28
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Mar 16, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Mar 17, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Mar 18, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Mar 21, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Mar 21, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Mar 21, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Mar 25, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Mar 30, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Apr 4, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Apr 8, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Apr 15, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Apr 15, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Apr 20, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
wjwithagen added a commit to wjwithagen/ceph that referenced this pull request Apr 21, 2016
We were setting the block_size as the MIN of the min size (4096) and the
block size reported for the journal file/device.  In reality, file systmes
report all kinds of crazy block sizes.  Usually it's the page size, but
sometimes it is larger (e.g., 128KB for ZFS), and that's not actually what
we want.  Using a size smaller than the file systems block size is not
optimal for performance, but it doesn't affect how the IO happens--as long
as it is larger than the hardware sector size, which is either 512 or
4096 bytes.  And our min was hard-coded at 4096.

So, instead, just set a config option to specify teh block size, and
default that to 4096.

The other uses of this constant we about *alignment* of memory buffers
for the purposes of direct IO.  Rename the constant but do not change
the logic.  That means we continue to use 4k alignment for direct io even
if the device has 512 byte sectors, but that's fine--no reason to use
a smaller alignment.

Signed-off-by: Sage Weil <sage@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants