-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tune zio buffer caches and their alignments #15452
Conversation
Thinking more about it, since most of ZFS memory these days should be in scattered ABD buffers of ARC, it does not make much sense to try and squeeze every byte of transient allocations used by ZIO and relatively small DBUF cache. With many underutilized caches per-CPU overhead on many-CPU systems may end up costing more than that we can save with more granular caches. I am going to change it allocating only half-power of 2 sizes, that should make the code cleaner and life of underlying allocators easier. |
We should not always use PAGESIZE alignment for caches bigger than it and SPA_MINBLOCKSIZE otherwise. Doing that caches for 5, 6, 7, 10 and 14KB rounded up to 8, 12 and 16KB respectively make no sense. Instead specify as alignment the biggest power-of-2 divisor. This way 2KB and 6KB caches are both aligned to 2KB, while 4KB and 8KB are aligned to 4KB. Reduce number of caches to half-power of 2 instead of quarter-power of 2. This removes caches difficult for underlying allocators to fit into page-granular slabs, such as: 2.5, 3.5, 5, 7, 10KB, etc. Since these caches are mostly used for transient allocations like ZIOs and small DBUF cache it does not worth being too aggressive. Due to the above alignment issue some of those caches were not working properly any way. 6KB cache now finally has a chance to work right, placing 2 buffers into 3 pages, that makes sense. Remove explicit alignment in Linux user-space case. I don't think it should be needed any more with the above fixes. As result on FreeBSD instead of such numbers of pages per slab: vm.uma.zio_buf_comb_16384.keg.ppera: 4 vm.uma.zio_buf_comb_14336.keg.ppera: 4 vm.uma.zio_buf_comb_12288.keg.ppera: 3 vm.uma.zio_buf_comb_10240.keg.ppera: 3 vm.uma.zio_buf_comb_8192.keg.ppera: 2 vm.uma.zio_buf_comb_7168.keg.ppera: 2 vm.uma.zio_buf_comb_6144.keg.ppera: 2 <= Broken vm.uma.zio_buf_comb_5120.keg.ppera: 2 vm.uma.zio_buf_comb_4096.keg.ppera: 1 vm.uma.zio_buf_comb_3584.keg.ppera: 7 <= Hard to free vm.uma.zio_buf_comb_3072.keg.ppera: 3 vm.uma.zio_buf_comb_2560.keg.ppera: 2 vm.uma.zio_buf_comb_2048.keg.ppera: 1 vm.uma.zio_buf_comb_1536.keg.ppera: 2 vm.uma.zio_buf_comb_1024.keg.ppera: 1 vm.uma.zio_buf_comb_512.keg.ppera: 1 I am now getting such: vm.uma.zio_buf_comb_16384.keg.ppera: 4 vm.uma.zio_buf_comb_12288.keg.ppera: 3 vm.uma.zio_buf_comb_8192.keg.ppera: 2 vm.uma.zio_buf_comb_6144.keg.ppera: 3 <= Fixed, 2 in 3 pages vm.uma.zio_buf_comb_4096.keg.ppera: 1 vm.uma.zio_buf_comb_3072.keg.ppera: 3 vm.uma.zio_buf_comb_2048.keg.ppera: 1 vm.uma.zio_buf_comb_1536.keg.ppera: 2 vm.uma.zio_buf_comb_1024.keg.ppera: 1 vm.uma.zio_buf_comb_512.keg.ppera: 1 Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed-by: Allan Jude <allan@klarasystems.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Using quarter power of 2 allocations always struct me as a bit excessive.
We should not always use PAGESIZE alignment for caches bigger than it and SPA_MINBLOCKSIZE otherwise. Doing that caches for 5, 6, 7, 10 and 14KB rounded up to 8, 12 and 16KB respectively make no sense. Instead specify as alignment the biggest power-of-2 divisor. This way 2KB and 6KB caches are both aligned to 2KB, while 4KB and 8KB are aligned to 4KB. Reduce number of caches to half-power of 2 instead of quarter-power of 2. This removes caches difficult for underlying allocators to fit into page-granular slabs, such as: 2.5, 3.5, 5, 7, 10KB, etc. Since these caches are mostly used for transient allocations like ZIOs and small DBUF cache it does not worth being too aggressive. Due to the above alignment issue some of those caches were not working properly any way. 6KB cache now finally has a chance to work right, placing 2 buffers into 3 pages, that makes sense. Remove explicit alignment in Linux user-space case. I don't think it should be needed any more with the above fixes. As result on FreeBSD instead of such numbers of pages per slab: vm.uma.zio_buf_comb_16384.keg.ppera: 4 vm.uma.zio_buf_comb_14336.keg.ppera: 4 vm.uma.zio_buf_comb_12288.keg.ppera: 3 vm.uma.zio_buf_comb_10240.keg.ppera: 3 vm.uma.zio_buf_comb_8192.keg.ppera: 2 vm.uma.zio_buf_comb_7168.keg.ppera: 2 vm.uma.zio_buf_comb_6144.keg.ppera: 2 <= Broken vm.uma.zio_buf_comb_5120.keg.ppera: 2 vm.uma.zio_buf_comb_4096.keg.ppera: 1 vm.uma.zio_buf_comb_3584.keg.ppera: 7 <= Hard to free vm.uma.zio_buf_comb_3072.keg.ppera: 3 vm.uma.zio_buf_comb_2560.keg.ppera: 2 vm.uma.zio_buf_comb_2048.keg.ppera: 1 vm.uma.zio_buf_comb_1536.keg.ppera: 2 vm.uma.zio_buf_comb_1024.keg.ppera: 1 vm.uma.zio_buf_comb_512.keg.ppera: 1 I am now getting such: vm.uma.zio_buf_comb_16384.keg.ppera: 4 vm.uma.zio_buf_comb_12288.keg.ppera: 3 vm.uma.zio_buf_comb_8192.keg.ppera: 2 vm.uma.zio_buf_comb_6144.keg.ppera: 3 <= Fixed, 2 in 3 pages vm.uma.zio_buf_comb_4096.keg.ppera: 1 vm.uma.zio_buf_comb_3072.keg.ppera: 3 vm.uma.zio_buf_comb_2048.keg.ppera: 1 vm.uma.zio_buf_comb_1536.keg.ppera: 2 vm.uma.zio_buf_comb_1024.keg.ppera: 1 vm.uma.zio_buf_comb_512.keg.ppera: 1 Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes openzfs#15452
We should not always use PAGESIZE alignment for caches bigger than it and SPA_MINBLOCKSIZE otherwise. Doing that caches for 5, 6, 7, 10 and 14KB rounded up to 8, 12 and 16KB respectively make no sense. Instead specify as alignment the biggest power-of-2 divisor. This way 2KB and 6KB caches are both aligned to 2KB, while 4KB and 8KB are aligned to 4KB. Reduce number of caches to half-power of 2 instead of quarter-power of 2. This removes caches difficult for underlying allocators to fit into page-granular slabs, such as: 2.5, 3.5, 5, 7, 10KB, etc. Since these caches are mostly used for transient allocations like ZIOs and small DBUF cache it does not worth being too aggressive. Due to the above alignment issue some of those caches were not working properly any way. 6KB cache now finally has a chance to work right, placing 2 buffers into 3 pages, that makes sense. Remove explicit alignment in Linux user-space case. I don't think it should be needed any more with the above fixes. As result on FreeBSD instead of such numbers of pages per slab: vm.uma.zio_buf_comb_16384.keg.ppera: 4 vm.uma.zio_buf_comb_14336.keg.ppera: 4 vm.uma.zio_buf_comb_12288.keg.ppera: 3 vm.uma.zio_buf_comb_10240.keg.ppera: 3 vm.uma.zio_buf_comb_8192.keg.ppera: 2 vm.uma.zio_buf_comb_7168.keg.ppera: 2 vm.uma.zio_buf_comb_6144.keg.ppera: 2 <= Broken vm.uma.zio_buf_comb_5120.keg.ppera: 2 vm.uma.zio_buf_comb_4096.keg.ppera: 1 vm.uma.zio_buf_comb_3584.keg.ppera: 7 <= Hard to free vm.uma.zio_buf_comb_3072.keg.ppera: 3 vm.uma.zio_buf_comb_2560.keg.ppera: 2 vm.uma.zio_buf_comb_2048.keg.ppera: 1 vm.uma.zio_buf_comb_1536.keg.ppera: 2 vm.uma.zio_buf_comb_1024.keg.ppera: 1 vm.uma.zio_buf_comb_512.keg.ppera: 1 I am now getting such: vm.uma.zio_buf_comb_16384.keg.ppera: 4 vm.uma.zio_buf_comb_12288.keg.ppera: 3 vm.uma.zio_buf_comb_8192.keg.ppera: 2 vm.uma.zio_buf_comb_6144.keg.ppera: 3 <= Fixed, 2 in 3 pages vm.uma.zio_buf_comb_4096.keg.ppera: 1 vm.uma.zio_buf_comb_3072.keg.ppera: 3 vm.uma.zio_buf_comb_2048.keg.ppera: 1 vm.uma.zio_buf_comb_1536.keg.ppera: 2 vm.uma.zio_buf_comb_1024.keg.ppera: 1 vm.uma.zio_buf_comb_512.keg.ppera: 1 Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes openzfs#15452
We should not always use PAGESIZE alignment for caches bigger than it and SPA_MINBLOCKSIZE otherwise. Doing that caches for 5, 6, 7, 10 and 14KB rounded up to 8, 12 and 16KB respectively make no sense. Instead specify as alignment the biggest power-of-2 divisor. This way 2KB and 6KB caches are both aligned to 2KB, while 4KB and 8KB are aligned to 4KB. Reduce number of caches to half-power of 2 instead of quarter-power of 2. This removes caches difficult for underlying allocators to fit into page-granular slabs, such as: 2.5, 3.5, 5, 7, 10KB, etc. Since these caches are mostly used for transient allocations like ZIOs and small DBUF cache it does not worth being too aggressive. Due to the above alignment issue some of those caches were not working properly any way. 6KB cache now finally has a chance to work right, placing 2 buffers into 3 pages, that makes sense. Remove explicit alignment in Linux user-space case. I don't think it should be needed any more with the above fixes. As result on FreeBSD instead of such numbers of pages per slab: vm.uma.zio_buf_comb_16384.keg.ppera: 4 vm.uma.zio_buf_comb_14336.keg.ppera: 4 vm.uma.zio_buf_comb_12288.keg.ppera: 3 vm.uma.zio_buf_comb_10240.keg.ppera: 3 vm.uma.zio_buf_comb_8192.keg.ppera: 2 vm.uma.zio_buf_comb_7168.keg.ppera: 2 vm.uma.zio_buf_comb_6144.keg.ppera: 2 <= Broken vm.uma.zio_buf_comb_5120.keg.ppera: 2 vm.uma.zio_buf_comb_4096.keg.ppera: 1 vm.uma.zio_buf_comb_3584.keg.ppera: 7 <= Hard to free vm.uma.zio_buf_comb_3072.keg.ppera: 3 vm.uma.zio_buf_comb_2560.keg.ppera: 2 vm.uma.zio_buf_comb_2048.keg.ppera: 1 vm.uma.zio_buf_comb_1536.keg.ppera: 2 vm.uma.zio_buf_comb_1024.keg.ppera: 1 vm.uma.zio_buf_comb_512.keg.ppera: 1 I am now getting such: vm.uma.zio_buf_comb_16384.keg.ppera: 4 vm.uma.zio_buf_comb_12288.keg.ppera: 3 vm.uma.zio_buf_comb_8192.keg.ppera: 2 vm.uma.zio_buf_comb_6144.keg.ppera: 3 <= Fixed, 2 in 3 pages vm.uma.zio_buf_comb_4096.keg.ppera: 1 vm.uma.zio_buf_comb_3072.keg.ppera: 3 vm.uma.zio_buf_comb_2048.keg.ppera: 1 vm.uma.zio_buf_comb_1536.keg.ppera: 2 vm.uma.zio_buf_comb_1024.keg.ppera: 1 vm.uma.zio_buf_comb_512.keg.ppera: 1 Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes openzfs#15452
We should not always use PAGESIZE alignment for caches bigger than it and SPA_MINBLOCKSIZE otherwise. Doing that caches for 5, 6, 7, 10 and 14KB rounded up to 8, 12 and 16KB respectively make no sense. Instead specify as alignment the biggest power-of-2 divisor. This way 2KB and 6KB caches are both aligned to 2KB, while 4KB and 8KB are aligned to 4KB. Reduce number of caches to half-power of 2 instead of quarter-power of 2. This removes caches difficult for underlying allocators to fit into page-granular slabs, such as: 2.5, 3.5, 5, 7, 10KB, etc. Since these caches are mostly used for transient allocations like ZIOs and small DBUF cache it does not worth being too aggressive. Due to the above alignment issue some of those caches were not working properly any way. 6KB cache now finally has a chance to work right, placing 2 buffers into 3 pages, that makes sense. Remove explicit alignment in Linux user-space case. I don't think it should be needed any more with the above fixes. As result on FreeBSD instead of such numbers of pages per slab: vm.uma.zio_buf_comb_16384.keg.ppera: 4 vm.uma.zio_buf_comb_14336.keg.ppera: 4 vm.uma.zio_buf_comb_12288.keg.ppera: 3 vm.uma.zio_buf_comb_10240.keg.ppera: 3 vm.uma.zio_buf_comb_8192.keg.ppera: 2 vm.uma.zio_buf_comb_7168.keg.ppera: 2 vm.uma.zio_buf_comb_6144.keg.ppera: 2 <= Broken vm.uma.zio_buf_comb_5120.keg.ppera: 2 vm.uma.zio_buf_comb_4096.keg.ppera: 1 vm.uma.zio_buf_comb_3584.keg.ppera: 7 <= Hard to free vm.uma.zio_buf_comb_3072.keg.ppera: 3 vm.uma.zio_buf_comb_2560.keg.ppera: 2 vm.uma.zio_buf_comb_2048.keg.ppera: 1 vm.uma.zio_buf_comb_1536.keg.ppera: 2 vm.uma.zio_buf_comb_1024.keg.ppera: 1 vm.uma.zio_buf_comb_512.keg.ppera: 1 I am now getting such: vm.uma.zio_buf_comb_16384.keg.ppera: 4 vm.uma.zio_buf_comb_12288.keg.ppera: 3 vm.uma.zio_buf_comb_8192.keg.ppera: 2 vm.uma.zio_buf_comb_6144.keg.ppera: 3 <= Fixed, 2 in 3 pages vm.uma.zio_buf_comb_4096.keg.ppera: 1 vm.uma.zio_buf_comb_3072.keg.ppera: 3 vm.uma.zio_buf_comb_2048.keg.ppera: 1 vm.uma.zio_buf_comb_1536.keg.ppera: 2 vm.uma.zio_buf_comb_1024.keg.ppera: 1 vm.uma.zio_buf_comb_512.keg.ppera: 1 Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes openzfs#15452
We should not always use PAGESIZE alignment for caches bigger than it and SPA_MINBLOCKSIZE otherwise. Doing that caches for 5, 6, 7, 10 and 14KB rounded up to 8, 12 and 16KB respectively make no sense. Instead specify as alignment the biggest power-of-2 divisor. This way 2KB and 6KB caches are both aligned to 2KB, while 4KB and 8KB are aligned to 4KB.
Reduce number of caches to half-power of 2 instead of quarter-power of 2. This removes caches difficult for underlying allocators to fit into page-granular slabs, such as: 2.5, 3.5, 5, 7, 10KB, etc. Since these caches are mostly used for transient allocations like ZIOs and small DBUF cache it does not worth being too aggressive. ~7/8 of space efficiency on average should be sufficient. Due to the above alignment issue some of those caches were not working properly any way. 6KB cache now finally has a chance to work right, placing 2 buffers into 3 pages, that makes sense.
Remove explicit alignment in Linux user-space case. I don't think it should be needed any more with the above fixes.
As result on FreeBSD instead of such numbers of pages per slab:
I am now getting such:
Types of changes
Checklist:
Signed-off-by
.