Support for 4kn devices #176

Closed
thorvald opened this Issue Jun 17, 2017 · 2 comments

thorvald commented Jun 17, 2017

Our NVMe devices are usually low-level formatted with a 4k physical block size. With that, dm-writeboost refuses to use them as caching devices, since it cannot read the superblock -- issuing a 512-byte read to a 4kn device always fails.

[  823.102234] blk_update_request: I/O error, dev nvme0n1, sector 2048
[  823.108517] device-mapper: writeboost: read_superblock_header() I/O error(-5), bits(18446744073709551615), dev(259:3), sector(0), read
[  823.120595] device-mapper: writeboost: read_superblock_header failed
[  823.126949] device-mapper: writeboost: audit_cache_device failed

I tried changing the buf_1 pool to be 4096 bytes and modifying read_superblock_header to read 1<<3 sectors, and that allows dm-writeboost to progress further. Encouraged by that, I tried reformatting the NVMe to 512-byte physical blocks, at which point all the errors went away.

Any plans to support 4kn cache devices?

Also, our backing devices are 4kn and don't support changing the block size. My understanding is that dm-writeboost always issues 4k I/O, and initial testing looks good, but are there any caveats with 4kn backing devices?

akiradeveloper (Owner) commented Jun 18, 2017
@thorvald Thanks for reporting. I've investigated the code a bit and found that buf_1_pool is only used to read/write the superblock header and record.

I tried changing the buf_1 pool to be 4096 byte and modifying read_superblock_header to read 1<<3 sectors, and that allows dm-writeboost to progress further

So this should work, but I suppose it fails again once you turn on sync_data_interval, because that path submits a 512-byte-aligned bio. The workaround of switching the pool to buf_8_pool won't work that time.

Any plans to support 4kn cache devices?

I think it's a good idea to support 4KB cache devices.

Also, our backing devices are 4kn and don't support changing the block size. My understanding is that dm-writeboost always issues 4k IO, and initial testing looks good, but are there any caveats with 4kn backing devices?

You will be fine if applications (typically the filesystem) only submit I/O in 4KB alignment; otherwise Writeboost redirects partial (less than 4KB) reads to the backing device.

static int process_read(struct wb_device *wb, struct bio *bio)
{
        struct lookup_result res;
        struct dirtiness dirtiness;
        struct per_bio_data *pbd;

        bool reserved = false;

        mutex_lock(&wb->io_lock);
        cache_lookup(wb, bio, &res);
        if (!res.found)
                reserved = reserve_read_cache_cell(wb, bio);
        mutex_unlock(&wb->io_lock);

        if (!res.found) {
                if (reserved) {
                        /*
                         * Remapping clone bio to the backing store leads to
                         * empty payload in clone_endio().
                         * To avoid caching junk data, we need this workaround
                         * to call dm_io() to certainly fill the bio payload.
                         */
                        if (read_backing_async(wb, bio)) {
                                struct read_backing_async_context ctx = {
                                        .wb = wb,
                                        .bio = bio
                                };
                                read_backing_async_callback_onstack(1, &ctx);
                        }
                        return DM_MAPIO_SUBMITTED;
                } else {
                        bio_remap(bio, wb->backing_dev, bi_sector(bio));
                        return DM_MAPIO_REMAPPED;
                }
        }
        /* ... the cache-hit path continues in the full source ... */

So we need to encourage the upper layer to submit only bios aligned to 4KB. But as far as I understand, the blk_io_limits_* settings are only hints.

https://people.redhat.com/msnitzer/docs/io-limits.txt

Direct I/O best practices

Users must always take care to use properly aligned and sized IO. This
is especially important for Direct I/O access. Direct I/O should be
aligned on a 'logical_block_size' boundary and in multiples of the
'logical_block_size'. With native 4K devices (logical_block_size is 4K)
it is now critical that applications perform Direct I/O that is a
multiple of the device's 'logical_block_size'. This means that
applications that do not perform 4K aligned I/O, but 512-byte aligned
I/O, will break with native 4K devices. Applications may consult a
device's "I/O Limits" to ensure they are using properly aligned and
sized I/O. The "I/O Limits" are exposed through both sysfs and block
device ioctl interfaces (also see: libblkid).


@akiradeveloper akiradeveloper self-assigned this Jun 18, 2017

@akiradeveloper akiradeveloper added the bug label Jun 18, 2017

@akiradeveloper akiradeveloper added this to the v2.2.8 milestone Jun 21, 2017

akiradeveloper added a commit that referenced this issue Oct 11, 2017

close #176 remove buf_1_pool
Submitting a 512-byte request to a 4kn device is not allowed.

Signed-off-by: Akira Hayakawa <ruby.wktk@gmail.com>
akiradeveloper (Owner) commented Oct 11, 2017
@thorvald Hi, if you have spare time, could you test the for-thorvald branch? https://github.com/akiradeveloper/dm-writeboost/tree/for-thorvald

It's OK just to check that you can initialize the device and do some I/O.

