
[RFC, discussion] l2arc i/o errors and poor performance with block size greater than 512b due to compression assuming a smaller min block size than the vdev supports #3436

Closed
kernelOfTruth opened this issue May 21, 2015 · 3 comments
Labels: Status: Inactive (not being actively updated), Status: Stale (no recent activity), Type: Question (issue for discussion)

Comments

@kernelOfTruth
Contributor

There are reports from the FreeBSD side that L2ARC cache devices running on vdevs or underlying block devices with a block size greater than 512B (e.g. 4K, ashift=12) incur I/O errors and poor performance.

I'm currently doing some research in relation to #3400, looking for potential causes of the L2ARC (compression) issue.

This is somewhat related to #3432 and might be the beginning of a set of fixes related to compression (either ARC or L2ARC)

https://bugs.freenas.org/projects/freenas/repository/trueos/revisions/ededc82f03e5dd5a518ccda3b0aae506fbfd8fc8

Use the vdev's ashift to calculate the supported min block size passed to
zio_compress_data(..) when compressing l2arc buffers.

This eliminates l2arc I/O errors, which resulted in very poor performance on
vdev's configured with block size greater than 512b due to compression
assuming a smaller min block size than the vdev supports.

MFC after: 2 days

(cherry picked from commit 1c55b38aeb696ce04e221122ef481710d7baef65)

http://lists.freebsd.org/pipermail/svn-src-head/2013-October/052517.html

On closer inspection, a comparison between ZoL, illumos-gate, and FreeBSD suggests the problem stems from a difference in implementation.

Comments?

@kernelOfTruth added the "[RFC, discussion]" prefix to the issue title on May 21, 2015.
@kernelOfTruth
Contributor Author

account for ashift when choosing buffers to be written to l2arc device

If we don't account for that, then we might end up overwriting disk
area of buffers that have not been evicted yet, because l2arc_evict
operates in terms of disk addresses.

The discrepancy between the write size calculation and the actual increment
to l2ad_hand was introduced in
commit e14bb3258d05c1b1077e2db7cf77088924e56919

Also, consistently use asize / a_sz for the allocated size, psize / p_sz
for the physical size.  Where the latter accounts for possible size
reduction because of compression, whereas the former accounts for possible
size expansion because of alignment requirements.

The code still assumes that either underlying storage subsystems or
hardware is able to do read-modify-write when an L2ARC buffer size is
not a multiple of a disk's block size.  This is true for 4KB sector disks
that provide 512B sector emulation, but may not be true in general.
In other words, we currently do not have any code to make sure that
an L2ARC buffer, whether compressed or not, which is used for physical I/O
has a suitable size.

https://reviews.csiden.org/r/112/

#3433 (comment)

@kernelOfTruth
Contributor Author

Referencing #3492

l2arc: make sure that all writes honor ashift of a cache device

Previously uncompressed buffers did not obey that rule.

Type of b_asize is changed to uint64_t for consistency,
given that this is a zeta-byte filesystem.

l2arc_compress_buf is renamed to l2arc_transform_buf to better reflect
its new utility. Now not only we ensure that a compressed buffer has
a size aligned to ashift, but we also allocate a properly sized
temporary buffer if the original buffer is not compressed and it has
an odd size. This ensures that all I/O to the cache device is always
ashift-aligned, in terms of both a request offset and a request size.

If the aligned data is larger than the original data, then we have to use
a temporary buffer when reading it as well.

Also, enhance physical zio alignment checks using vdev_logical_ashift.
On FreeBSD we have this information, so we can make stricter assertions.

@stale

stale bot commented Aug 25, 2020

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale No recent activity for issue label Aug 25, 2020
@stale stale bot closed this as completed Nov 25, 2020