
zfs-2.1.6 patchset #13886

Merged
tonyhutter merged 83 commits into openzfs:zfs-2.1-release on Oct 3, 2022

Conversation

tonyhutter
Contributor

Motivation and Context

New release mainly to support the 5.19 kernel.

Description

Fedora 36 is now running the 5.19.8 kernel, and we need to support it.

How Has This Been Tested?

Buildbot will test

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)


behlendorf and others added 30 commits July 14, 2022 10:21
When scrubbing a raidz/draid pool, which contains a replacing or
sparing mirror with multiple online children, only one child will
be read.  This is not normally a serious concern because the DTL
records are used to determine where a good copy of the data is.
As long as the data can be read from one child the mirror vdev
will use it to repair gaps in any of its children.  Furthermore,
even if the data which was read is corrupt the raidz code will
detect this and issue its own repair I/O to correct the damage
in the mirror vdev.

However, in the scenario where the DTL is wrong due to silent
data corruption (say due to overwriting one child) and the scrub
happens to read from a child with good data, then the other damaged
mirror child will not be detected nor repaired.

While this is possible for both raidz and draid vdevs, it's most
pronounced when using draid.  This is because by default the zed
will sequentially rebuild a draid pool to a distributed spare,
and the distributed spare half of the mirror is always preferred
since it delivers better performance.  This means the damaged
half of the mirror will go undetected even after scrubbing.

For system administrators this behavior is non-intuitive, and in a
worst-case scenario it could result in the only good copy of the
data being unknowingly detached from the mirror.

This change resolves the issue by reading all replacing/sparing
mirror children when scrubbing.  When the BP isn't available for
verification, the data buffers from each child are compared instead.
They must all be identical; if not, there's silent damage and an
error is returned to prompt the top-level vdev to issue a repair I/O
to rewrite the data on all of the mirror children.  Since we can't
tell which child was wrong, a checksum error is logged against the
replacing or sparing mirror vdev.

Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#13555
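As a rough illustration of that comparison step, here is a minimal sketch assuming plain byte buffers rather than the ABDs the real mirror code operates on; the function name and signature are hypothetical:

    #include <string.h>

    /*
     * Hypothetical helper: with no BP to checksum against, the successfully
     * read children of a replacing/sparing mirror must at least agree with
     * each other.  Returns 0 if they all match, nonzero if any differ, in
     * which case the caller logs a checksum error against the mirror vdev
     * and triggers a repair write to every child.
     */
    static int
    mirror_children_agree(void *bufs[], int nbufs, size_t size)
    {
        int i;

        for (i = 1; i < nbufs; i++) {
            if (memcmp(bufs[0], bufs[i], size) != 0)
                return (1);    /* silent damage; can't tell which child */
        }
        return (0);
    }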
The only reason for spa_config_*() to use a refcount instead of a
simple non-atomic variable (protected by scl_lock) for scl_count is
tracking, which has been hard-disabled for the last 8 years.  Switching
to a simple int scl_count reduces the lock hold time by avoiding
atomics, and makes the structure fit into a single cache line, reducing
lock contention.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes openzfs#12287
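A userspace analogue of the idea, as a hedged sketch (pthread-based, not the OpenZFS spa_config code; field and function names are illustrative): since scl_lock already serializes the lock state, the reader count can be a plain int with no atomics.

    #include <pthread.h>

    typedef struct config_lock {
        pthread_mutex_t scl_lock;
        pthread_cond_t  scl_cv;
        int             scl_count;   /* readers; protected by scl_lock */
        int             scl_writer;  /* nonzero while a writer holds the lock */
    } config_lock_t;

    static void
    config_lock_enter_read(config_lock_t *scl)
    {
        pthread_mutex_lock(&scl->scl_lock);
        while (scl->scl_writer)
            pthread_cond_wait(&scl->scl_cv, &scl->scl_lock);
        scl->scl_count++;            /* plain increment, no atomic op */
        pthread_mutex_unlock(&scl->scl_lock);
    }

    static void
    config_lock_exit_read(config_lock_t *scl)
    {
        pthread_mutex_lock(&scl->scl_lock);
        if (--scl->scl_count == 0)
            pthread_cond_broadcast(&scl->scl_cv);  /* wake a waiting writer */
        pthread_mutex_unlock(&scl->scl_lock);
    }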
It is wrong for arc_write_ready() to use zfs_abd_scatter_enabled to
decide whether to reallocate/copy the buffer, because the answer is
OS-specific and depends on the buffer size.  Instead, use
abd_size_alloc_linear(), which has been moved into a public header.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes openzfs#12425
The fnvlist versions of the functions are fatal if they fail,
saving each call site from having to check the result.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Igor Kozhukhov <igor@dilos.org>
Signed-off-by: Allan Jude <allan@klarasystems.com>
Use error thresholds from the policy to control whether to scrub data
and/or metadata.  If a threshold is set to UINT64_MAX, the caller
probably does not care about the result and we may skip that part.

By default import neither sets the data error threshold nor reads
the error counter, so skip the data scrub for faster import.
Metadata are still scrubbed, and import fails if even a single error
is found.

While here, for symmetry, return the number of metadata errors when
the threshold is not set to zero and we haven't reached it.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Pavel Zakharov <pavel.zakharov@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Closes openzfs#13022
The previous flushing algorithm limited only the total number of log
blocks to the minimum of 256K and 4x the number of metaslabs in the
pool.  As a result, a system with 1500 disks with 1000 metaslabs each,
touching several new metaslabs each TXG, could grow the spacemap log
to a huge size without much benefit.  We've observed one such system
taking about 45 minutes to import the pool.

This patch improves the situation from five sides:
 - By limiting the maximum period for each metaslab to be flushed to
1000 TXGs, which effectively limits the maximum number of per-TXG
spacemap logs to load to the same number.
 - By making flushing smoother, accounting the number of metaslabs
that were touched after the last flush and actually need another
flush, not just an ms_unflushed_txg bump.
 - By applying zfs_unflushed_log_block_pct to the number of metaslabs
that were touched after the last flush, not to all metaslabs in the
pool.
 - By aggressively prefetching per-TXG spacemap logs up to 16 TXGs in
advance, making the log spacemap load process for a wide HDD pool
CPU-bound and accelerating it by many times.
 - By reducing zfs_unflushed_log_block_max from 256K to 128K, cutting
the single-threaded-by-nature log processing time from ~10 to ~5
minutes.

As a further optimization we could skip bumping ms_unflushed_txg for
metaslabs not touched since the last flush, but that would be an
incompatible change requiring a new pool feature.

Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes openzfs#12789
When calculating mg_aliquot, similarly to openzfs#12046, use the number
of unique data disks in the vdev, not the total number of vdev
children.  Increase the default value of the tunable from 512KB to 1MB
to compensate.

Before this change each disk in a striped pool was getting 512KB of
sequential data, in a 2-wide mirror -- 1MB, in a 3-wide RAIDZ1 --
768KB.  After this change each disk should get 1MB in all cases.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes openzfs#13388
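A hedged sketch of the aliquot math described above (variable names and the exact expression are illustrative, not copied from metaslab.c): scaling the per-vdev aliquot by the number of unique data disks gives every physical disk roughly one tunable's worth (now 1MB) of sequential data regardless of layout.

    #include <stdint.h>
    #include <stdio.h>

    static uint64_t metaslab_aliquot = 1024 * 1024;  /* new 1MB default */

    /* is_mirror: mirrors keep one logical copy, so one "data disk". */
    static uint64_t
    mg_aliquot(uint64_t children, uint64_t nparity, int is_mirror)
    {
        uint64_t data_disks = is_mirror ? 1 : children - nparity;

        return (metaslab_aliquot * (data_disks ? data_disks : 1));
    }

    int
    main(void)
    {
        printf("single disk:   %llu\n", (unsigned long long)mg_aliquot(1, 0, 0));
        printf("2-wide mirror: %llu\n", (unsigned long long)mg_aliquot(2, 0, 1));
        printf("3-wide raidz1: %llu\n", (unsigned long long)mg_aliquot(3, 1, 0));
        return (0);
    }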
- Make the prefetch distance adaptive: up to 4MB it doubles on every
hit, same as before, but beyond that it grows by 1/8 every time the
prefetch read does not complete in time to satisfy the demand read.
My tests show that 4MB is sufficient for a wide NVMe pool to saturate
a single reader thread at 2.5GB/s, while the new 64MB maximum allows
the same thread to reach 1.5GB/s on a wide HDD pool.  Further distance
increases may improve speed even more, but less dramatically and with
higher latency.

 - Allow early reuse of inactive prefetch streams: streams that never
saw hits can be reused immediately if there is demand, while others
can be reused after 1s of inactivity, starting with the oldest.  After
2s of inactivity streams are deleted to free resources, same as before.
This increases strided read performance on an HDD pool several times
in the presence of simultaneous random reads, which previously filled
the zfetch_max_streams limit within seconds and so blocked most
prefetching.

 - Always issue intermediate indirect block reads with SYNC priority.
Each of those reads, if delayed for long, may delay up to 1024 other
block prefetches, which is not good for wide pools.

Reviewed-by: Allan Jude <allan@klarasystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes openzfs#13452
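A rough sketch of the adaptive distance growth (constants taken from the text above; the real zfetch stream tracks considerably more state, so treat this as illustrative only):

    #include <stdint.h>

    #define FAST_GROW_LIMIT (4ULL << 20)   /* below 4MB: double on every hit */
    #define MAX_DISTANCE    (64ULL << 20)  /* new 64MB cap */

    static uint64_t
    next_prefetch_distance(uint64_t dist, int prefetch_was_late)
    {
        if (dist < FAST_GROW_LIMIT)
            dist *= 2;             /* same behaviour as before */
        else if (prefetch_was_late)
            dist += dist / 8;      /* grow by 1/8 when demand outran prefetch */
        if (dist > MAX_DISTANCE)
            dist = MAX_DISTANCE;
        return (dist);
    }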
Modern Clang and GCC can successfully implement simple conditions
without branching, using math and flag operations.  The use of arrays
for translation no longer helps as much as it did 14+ years ago.

Disassembly of the code generated by Clang 13.0.0 on FreeBSD 13.1,
Clang 14.0.4 on FreeBSD 14 and GCC 10.2.1 on Debian 11 with this
change still shows no branching instructions.

Profiling of the CPU-bound scan stage of sorted scrub shows a
reproducible reduction of time spent inside avl_find() from 6.52% to
4.58%.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes openzfs#13540
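The kind of branch-free condition the commit relies on, as a small example (illustrative; the actual change replaces AVL's translation arrays with expressions of this shape):

    #include <stdint.h>

    /* Modern compilers lower this to flag/math instructions, no branches. */
    static inline int
    cmp64(uint64_t a, uint64_t b)
    {
        return ((a > b) - (a < b));    /* -1, 0, or 1 */
    }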
During sorted scrub multiple threads (one per vdev) issue many ZIOs
at the same time, all using the same scn->scn_zio_root ZIO as parent.
This causes huge contention on the single global lock on that ZIO.
Improve it by introducing per-queue null ZIOs, children of that one,
and using them as proxies instead.

For a 12-SSD pool storing 1.5TB of 4KB blocks on an 80-core system
this dramatically reduces lock contention and reduces scrub time from
21 minutes down to 12.5, while the actual read stages (not scan) are
about 3x faster, reaching 100K blocks per second per vdev.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes openzfs#13553
Handle crypto_dispatch() return values the same as crp->crp_etype
errors.  On FreeBSD 12 many drivers returned the same errors both
ways, and the lack of proper handling for the former ended up in an
assertion panic later.  This was changed in FreeBSD 13, but there is
no reason not to be safe.

While here, skip waiting for completion, including the locking and
wakeup() call, for sessions on synchronous crypto drivers, such as
the typical aesni and software ones.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes openzfs#13563
- Reduce the size and comparison complexity of the q_exts_by_size
B-tree.  The previous code used two 64-bit divisions and many other
operations to compare two B-tree elements, creating enormous overhead.
This implementation moves the math to the upper level and stores the
score in the B-tree elements themselves.  Since all we need to store
in that B-tree is the extent score and offset, they can fit into a
single 8-byte value instead of the 24 bytes of a q_exts_by_addr
element and can be compared with a single operation.
 - Better decouple the secondary tree logic from the main range_tree
by moving rt_btree_ops and related functions into dsl_scan.c as
ext_size_ops.  Those functions are too small to worry about code
duplication, and range_tree does not need to know details such as
rt_btree_compare.
 - Instead of accounting the number of pending bytes per pool, which
requires an atomic on a global variable per block, account the number
of non-empty per-vdev queues, which changes much more rarely.
 - When an extent scan is interrupted by TXG end, continue it in the
next TXG instead of selecting the next best extent.  This avoids
leaving one truncated (and so likely no longer the best) extent each
TXG.

On top of some other optimizations this saves about 1.5 minutes out of
10 when scrubbing a pool of 12 SSDs storing 1.5TB of 4KB zvol blocks.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tom Caputi <caputit1@tcnj.edu>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes openzfs#13576
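A hedged illustration of the single 8-byte key idea: pack a size-derived score into the high bits and a (shifted) offset into the low bits so one integer comparison orders the tree. Field widths here are made up for the example and are not taken from dsl_scan.c.

    #include <stdint.h>

    #define SCORE_BITS   16
    #define OFFSET_BITS  (64 - SCORE_BITS)
    #define OFFSET_MASK  ((1ULL << OFFSET_BITS) - 1)

    static inline uint64_t
    ext_size_key(uint64_t score, uint64_t offset)
    {
        return ((score << OFFSET_BITS) | (offset & OFFSET_MASK));
    }

    /* Single comparison, no 64-bit divisions per element. */
    static inline int
    ext_size_compare(uint64_t a, uint64_t b)
    {
        return ((a > b) - (a < b));
    }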
- Introduce a first-element offset within a leaf.  It reduces the
average memmove() size by ~50% when adding/removing elements.  If the
added/removed element is in the first half of the leaf, we may shift
the elements before it and adjust bth_first instead of moving more
elements after it.
 - Use memcpy() instead of memmove() when we know there is no overlap.
 - Switch from uint64_t to uint32_t.  It does not limit anything,
but 32-bit arches should appreciate it greatly in hot paths.
 - Store the leaf capacity in struct btree to avoid 64-bit divisions.
 - Adjust zfs_btree_insert_into_leaf() to always result in balanced
leaves after splitting, no matter where the new element was inserted.
Not that we care about it much, but it should also allow B-trees with
as few as two elements per leaf instead of the previous four.

When scrubbing a pool of 12 SSDs storing 1.5TB of 4KB zvol blocks,
this reduces the amount of time spent in memmove() inside the scan
thread from 13.7% to 5.7% and the total scrub time by ~15 seconds out
of 9 minutes.  It should also reduce spacemap load time, but I haven't
measured it.

Reviewed-by: Paul Dagnelie <pcd@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes openzfs#13582
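A toy model of the first-element-offset idea (not the real zfs_btree_t layout): the leaf keeps its elements in buf[first .. first+count) with slack on both ends, so inserting into the first half moves the shorter prefix left instead of the longer suffix right.

    #include <stdint.h>
    #include <string.h>

    typedef struct leaf {
        uint32_t buf[16];   /* slack assumed on both ends */
        uint32_t first;     /* index of the first valid element */
        uint32_t count;
    } leaf_t;

    static void
    leaf_insert(leaf_t *l, uint32_t idx, uint32_t value)
    {
        if (idx < l->count / 2 && l->first > 0) {
            /* Shift the prefix one slot left; cheaper for small idx. */
            memmove(&l->buf[l->first - 1], &l->buf[l->first],
                idx * sizeof (uint32_t));
            l->first--;
        } else {
            /* Shift the suffix one slot right, as before. */
            memmove(&l->buf[l->first + idx + 1], &l->buf[l->first + idx],
                (l->count - idx) * sizeof (uint32_t));
        }
        l->buf[l->first + idx] = value;
        l->count++;
    }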
Change the math to be like the ARC's, using multiplications instead
of divisions.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes openzfs#13591
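A generic before/after sketch of the idea (not the actual dsl_scan.c expression): instead of dividing for every scanned block to compare against a fraction, multiply the other side of the comparison, the way the ARC sizing math does. Overflow is ignored here for brevity.

    #include <stdbool.h>
    #include <stdint.h>

    /* Before: a 64-bit division on every check. */
    static bool
    over_limit_div(uint64_t used, uint64_t limit, uint64_t pct)
    {
        return (used / 100 * pct > limit);
    }

    /* After: the same check (up to rounding), using a multiplication instead. */
    static bool
    over_limit_mul(uint64_t used, uint64_t limit, uint64_t pct)
    {
        return (used * pct > limit * 100);
    }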
Block statistics calculation during scrub I/O issue in the case of
sorted scrub accounted ditto blocks several times.  Embedded blocks,
on the other hand, were not accounted at all.  This change moves the
accounting from the issue to the scan stage, which fixes both problems
and also avoids the pool-wide locking and the lock contention it
created.

Since these statistics are quite specific and are not even exposed
anywhere at the moment, disable their calculation by default to not
waste CPU time.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes openzfs#13579
When issuing several scrub reads for a block we may use the parent
ZIO's buffer for one of the child ZIOs.  If that read completes
successfully, then we won't need to copy the data explicitly.  If the
block has only one copy (typical for the root vdev, which is also a
mirror inside), then we never need to copy -- succeed or fail as-is.
The previous code also copied data from the buffer of every
successfully completed child ZIO, which just does not make any sense.

On a healthy N-wide mirror this saves all N+1 (or even more in the
case of ditto blocks) memory copies for each scrubbed block, allowing
the CPU to focus mostly on checksumming.  For other vdev types it
should save one memory copy per block copy at the root vdev.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes openzfs#13606
Before this change, for every valid parity column raidz_parity_verify()
allocated a new buffer, copied the existing data into it, then
recalculated the parity and compared the result with the copy.  This
patch removes the memory copy, simply swapping the original buffer
pointers with newly allocated empty ones for parity recalculation and
comparison.  The original buffers with potentially incorrect parity
data are then just freed, while the newly recalculated ones are used
for repair.

On a pool of 12 4-wide raidz vdevs, storing 1.5TB of 16MB blocks, this
change reduces memory traffic during scrub by 17% and total unhalted
CPU time by 25%.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Closes openzfs#13613
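A plain-C sketch of the pointer-swap approach (not the real raidz_map_t code; names are illustrative and error handling is omitted): detach the buffer holding the on-disk parity, recompute parity into a fresh buffer, compare the two, and keep the recomputed one for any repair write.

    #include <stdlib.h>
    #include <string.h>

    typedef struct parity_col {
        void   *pc_data;    /* buffer currently attached to the column */
        size_t  pc_size;
    } parity_col_t;

    /* Returns nonzero if the on-disk parity did not match the recomputed one. */
    static int
    parity_verify_col(parity_col_t *pc, void (*regenerate)(parity_col_t *))
    {
        void *orig = pc->pc_data;              /* parity as read from disk */
        int mismatch;

        pc->pc_data = calloc(1, pc->pc_size);  /* fresh buffer, no copy */
        regenerate(pc);                        /* recompute parity into it */

        mismatch = memcmp(orig, pc->pc_data, pc->pc_size) != 0;
        free(orig);                            /* old data is no longer needed */
        /* pc->pc_data now holds correct parity, ready for a repair write. */
        return (mismatch);
    }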
It may happen that the scan bookmark points to a block that was
turned into part of a big hole.  In such a case dsl_scan_visitbp() may
skip it and dsl_scan_check_resume() will not be called for it.  As a
result, a new scan suspend won't be possible until the end of the
object, which may take hours if the object is a multi-terabyte ZVOL on
a slow HDD pool, stretching the TXG out for all that time and creating
all sorts of problems.

This patch changes the resume condition to any greater or equal block,
so even if we miss the bookmarked block, the next one we find will
delete the bookmark, allowing a new suspend.

Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored-By: iXsystems, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Attila Fülöp <attila@fueloep.org>
Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes openzfs#12895
Closes openzfs#12902
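A minimal sketch of the relaxed resume check from the dsl_scan fix above, written with a simplified bookmark type rather than the real zbookmark_phys_t; clearing the bookmark at any block at or beyond it means a bookmark pointing into a new hole can no longer pin the scan:

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct bookmark {
        uint64_t object;
        uint64_t blkid;
    } bookmark_t;

    static int
    bookmark_compare(const bookmark_t *a, const bookmark_t *b)
    {
        if (a->object != b->object)
            return ((a->object > b->object) - (a->object < b->object));
        return ((a->blkid > b->blkid) - (a->blkid < b->blkid));
    }

    /* Returns true (and clears the bookmark) once the scan may suspend again. */
    static bool
    scan_check_resume(bookmark_t *resume, const bookmark_t *current)
    {
        if (resume->object == 0)
            return (false);    /* no suspend point recorded */
        /* Old behaviour required an exact match; >= also catches holes. */
        if (bookmark_compare(current, resume) >= 0) {
            resume->object = resume->blkid = 0;
            return (true);
        }
        return (false);
    }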
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Also remove -Wno-unused-but-set-variable

Upstream-bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61118
Reviewed-by: Alejandro Colomar <alx.manpages@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes openzfs#13110
Reviewed-by: Alejandro Colomar <alx.manpages@gmail.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Closes openzfs#13110
This code should be kept in line with the upstream lua version as
much as possible.  Therefore, we simply want to silence the warning.
This check was enabled by default as part of -Wall in gcc 12.1.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#13528
Closes openzfs#13575
Restructure the code in zfs_log_xvattr() to use an lr_attr_end
structure when accessing the lr_attr_t elements located after the
variable sized array.  This makes the code more understandable and
resolves the "accessing beyond the end of the field" warnings.

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#13528
Closes openzfs#13575
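The general shape of that restructuring, as a hedged sketch with made-up struct names (the real lr_attr_t / zfs_log.c fields differ): the members that logically follow a variable sized array are grouped into a "tail" struct reached through one computed pointer, instead of indexing past the declared end of the array field.

    #include <stdint.h>

    typedef struct record {
        uint32_t masksize;    /* number of mask words that follow */
        uint32_t mask[1];     /* variable sized array */
    } record_t;

    typedef struct record_end {   /* data located after the mask words */
        uint64_t crtime[2];
        uint64_t scanstamp[4];
    } record_end_t;

    static record_end_t *
    record_tail(record_t *r)
    {
        /* First byte past the variable sized array. */
        return ((record_end_t *)&r->mask[r->masksize]);
    }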
The wrong union member was being accessed in EdonRInit, resulting in
a write-beyond-size-of-field compiler warning.  Reference the correct
member to resolve the warning.  The warning was correct, but in this
case the mistake was harmless.

    In function ‘fortify_memcpy_chk’,
    inlined from ‘EdonRInit’ at zfs/module/icp/algs/edonr/edonr.c:494:3:
    ./include/linux/fortify-string.h:344:25: error: call to
    ‘__write_overflow_field’ declared with attribute warning:
    detected write beyond size of field (1st parameter);
    maybe use struct_group()? [-Werror=attribute-warning]

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#13528
Closes openzfs#13575
The memcpy(), memmove(), and memset() functions have been annotated
to perform bounds checking when using FORTIFY_SOURCE.  A warning is
now generated when writing beyond the end of the specified field.

Alternately, the new struct_group() macro could be used to create
an anonymous union member for use by memcpy().  However, since this
is the only place the macro would be helpful, it's preferable to
restructure the code slightly to avoid the need for additional
compatibility code when the macro does not exist.

https://lore.kernel.org/lkml/20211118183807.1283332-1-keescook@chromium.org/T/

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#13528
Closes openzfs#13575
The private pointer was being used after it was freed.  It's only
used as a tag, so a dereference would never occur, but there's no
harm in inverting the order to resolve the warning.

    module/zfs/dbuf.c: In function 'dbuf_issue_final_prefetch_done':
    module/zfs/dbuf.c:3204:17: error:
    pointer 'private' may be used after 'free' [-Werror=use-after-free]

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#13528
Closes openzfs#13575
The db pointer was being used after it was freed.  It's only used as
a tag, so a dereference would never occur, but there's no reason we
can't invert the order to resolve the warning.

    module/zfs/dbuf.c: In function 'dbuf_destroy':
    module/zfs/dbuf.c:2953:17: error:
    pointer 'db' may be used after 'free' [-Werror=use-after-free]

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#13528
Closes openzfs#13575
Extend the buffer slightly to resolve the warning.

    cmd/zfs/zfs_main.c: In function ‘upgrade_set_callback’:
    cmd/zfs/zfs_main.c:2446:22: error: ‘%llu’ directive output
    may be truncated writing between 1 and 20 bytes into a
    region of size 16 [-Werror=format-truncation=]
    cmd/zfs/zfs_main.c:2445:24: note: ‘snprintf’ output between
    2 and 21 bytes into a destination of size 16

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#13528
Closes openzfs#13575
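The shape of the fix, as a small sketch with assumed sizes (the function and buffer names are not from zfs_main.c): a uint64_t version can print as up to 20 digits, so the destination needs room for that plus the terminator.

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    static void
    print_version(uint64_t version)
    {
        char verstr[24];    /* was 16: too small for a full uint64_t */

        (void) snprintf(verstr, sizeof (verstr), "%" PRIu64, version);
        (void) printf("upgraded to version %s\n", verstr);
    }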
Switch to using asprintf() to satisfy the compiler and resolve the
potential format-overflow warning.  Note that the conditional before
the sprintf() would have prevented this regardless.

    cmd/zfs/zfs_project.c: In function ‘zfs_project_handle_dir’:
    cmd/zfs/zfs_project.c:241:38: error: ‘/’ directive writing
    1 byte into a region of size between 0 and 4352
    [-Werror=format-overflow=]
    cmd/zfs/zfs_project.c:241:17: note: ‘sprintf’ output between
    2 and 4609 bytes into a destination of size 4352

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#13528
Closes openzfs#13575
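A hedged sketch of the asprintf() approach (names are illustrative, and error handling is trimmed): letting the C library size the destination removes any fixed-size buffer the compiler could flag.

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <stdlib.h>

    /* Caller frees the returned string; NULL on allocation failure. */
    static char *
    join_path(const char *dir, const char *name)
    {
        char *full = NULL;

        if (asprintf(&full, "%s/%s", dir, name) == -1)
            return (NULL);
        return (full);
    }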
Since the assembly routines calculating SHA checksums don't use a
standard stack layout, CFI directives are needed to unwind the stack.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Attila Fülöp <attila@fueloep.org>
Closes openzfs#11733
There is an ongoing effort to eliminate this feature.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes openzfs#13908
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes openzfs#13909
See https://cgit.FreeBSD.org/src/commit/?id=a75d1ddd74312f5dd79bc1e965f7077679659f2e

Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes openzfs#13910
@ryao
Contributor

ryao commented Sep 28, 2022

@tonyhutter I have rebased the patch on your 2.1.6 branch and pushed it to ryao/for-tony:

https://github.com/ryao/zfs/tree/for-tony

A build test succeeds. A visual inspection of the rebased patch looks good. I also visually inspected the surrounding code to make sure that the same bug is present, and that also looked good. I even did a diff of the original and rebased patches:

http://dpaste.com/6UKZE4FC5

That looks good too. :)

If you force fault a drive that's resilvering, its scan stats can get
frozen in time, giving the false impression that it's still being
resilvered.  This commit checks the vdev state to see if the vdev is
healthy before reporting "resilvering" or "repairing" in zpool status.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Closes openzfs#13927
Closes openzfs#13930
ryao and others added 2 commits September 28, 2022 17:25
Clang's static analyzer found a bad free caused by skein_mac_atomic().
It will allocate a context on the stack and then pass it to
skein_final(), which attempts to free it. Upon inspection,
skein_digest_atomic() also has the same problem.

These functions were created to match the OpenSolaris ICP API, so I was
curious how we avoided this in other providers and looked at the SHA2
code. It appears that SHA2 has a SHA2Final() helper function that is
called by the exported sha2_mac_final()/sha2_digest_final() as well as
the sha2_mac_atomic() and sha2_digest_atomic() functions. The real work
is done in SHA2Final() while some checks and the free are done in
sha2_mac_final()/sha2_digest_final().

We fix the use after free in the skein code by taking inspiration from
the SHA2 code. We introduce a skein_final_nofree() that does most of the
work, and make skein_final() into a function that calls it and then
frees the memory.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Closes openzfs#13954
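The same split in generic form (a sketch mirroring the description, not the ICP sources): the worker produces the digest without freeing, the exported final function frees heap contexts, and atomic callers with a stack context use the worker directly so stack memory is never passed to free().

    #include <stdlib.h>
    #include <string.h>

    typedef struct digest_ctx { unsigned char state[64]; } digest_ctx_t;

    static void
    digest_final_nofree(digest_ctx_t *ctx, unsigned char *out, size_t outlen)
    {
        /* ... produce the digest from ctx (elided) ... */
        memcpy(out, ctx->state,
            outlen < sizeof (ctx->state) ? outlen : sizeof (ctx->state));
        memset(ctx, 0, sizeof (*ctx));           /* wipe key material */
    }

    static void
    digest_final(digest_ctx_t *ctx, unsigned char *out, size_t outlen)
    {
        digest_final_nofree(ctx, out, outlen);
        free(ctx);                               /* heap contexts only */
    }

    static void
    digest_atomic(unsigned char *out, size_t outlen)
    {
        digest_ctx_t ctx;                        /* stack context */

        memset(&ctx, 0, sizeof (ctx));           /* stand-in for init/update */
        digest_final_nofree(&ctx, out, outlen);  /* never free()d */
    }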
META file and changelog updated.

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
@tonyhutter
Contributor Author

@ryao thank you! I pulled it in

@szubersk
Contributor

szubersk commented Oct 2, 2022

@tonyhutter is it too late to request inclusion of 35d81a7 in the release?

@awehrfritz
Contributor

Linux 6.0 was just released yesterday. Does this PR contain all compatibility patches in order to increment the supported kernel version in META?

I otherwise would not want to delay this release any longer, but given that various distributions will soon pick up the new kernel, we could get a head start here.

@Bronek

Bronek commented Oct 3, 2022

> Linux 6.0 was just released yesterday. Does this PR contain all compatibility patches in order to increment the supported kernel version in META?
>
> I otherwise would not want to delay this release any longer, but given that various distributions will soon pick up the new kernel, we could get a head start here.

I suggest that the best way to get a head start is to get this patchset out the door first, to make space for a new one (which will include patches from this one). The 5.19 release will certainly become EOL in the next few weeks.

@tonyhutter
Contributor Author

@szubersk even though that's low impact, I think we'll have to wait for the next release to pull it in. I've already started building the packages.

tonyhutter merged commit 6a6bd49 into openzfs:zfs-2.1-release on Oct 3, 2022
@szubersk
Contributor

szubersk commented Oct 3, 2022

Let's wait until the Linux 6.0 compat release :)

@awehrfritz
Contributor

> I suggest that the best way to get a head start is to get this patchset out the door first, to make space for a new one (which will include patches from this one).

Not really a head start anymore if you start a day late… but well. I’m happy at least this one is out now!

@awehrfritz
Contributor

awehrfritz commented Oct 3, 2022

@tonyhutter thanks for pushing this release out! 🎉

Not to be a pest about it, but could you please open a PR for the next point release right away?

With 6.0 out now, it will only take a couple of weeks until Fedora picks it up, and in a few more weeks 5.19 will be EOL as well.

@ryao
Contributor

ryao commented Oct 3, 2022

I actually would appreciate a standing PR for the next release too. That way people can comment with commits that they want added as they are merged to master, so that Tony does not need to do a mad dash when he feels it is time to do a 2.1.7 release.

@tonyhutter
Contributor Author

All - zfs-2.1.6 has been released: https://github.com/openzfs/zfs/releases/tag/zfs-2.1.6

I've created a zfs-2.1.7-staging branch too. Feel free to open a pull request against that branch for any commits you want in the next release.

@gdevenyi
Contributor

gdevenyi commented Oct 4, 2022

Sadly #13612 and #13755 were not addressed in the release notes

@jaen

jaen commented Oct 8, 2022

@tonyhutter would it be a big problem to open a PR for that branch so it's easier to track the status?

@almereyda

I have opened an issue for discussion of 2.1.7, while we can only diff the branches and no PR for discussion is present, as the conversation here is beyond its EOL.

@awehrfritz
Contributor

Fedora just started shipping the 6.0 kernel. Is there any estimate on when the 2.1.7 PR will be opened and the few remaining commits (ad09676 and 871d66d) back-ported to the 2.1 release series?

@ryao
Contributor

ryao commented Oct 31, 2022

Good question. @tonyhutter would you be up for doing 2.1.7 soon? :)

grahamperrin referenced this pull request in freebsd/freebsd-src Nov 21, 2022
OpenZFS release 2.1.6

Notable upstream pull request merges:
  #11733 ICP: Add missing stack frame info to SHA asm files
  #12274 Optimize txg_kick() process
  #12284 Add Module Parameter Regarding Log Size Limit
  #12285 Introduce a tunable to exclude special class buffers from L2ARC
  #12287 Remove refcount from spa_config_*()
  #12425 Avoid small buffer copying on write
  #12516 Fix NFS and large reads on older kernels
  #12678 spa.c: Replace VERIFY(nvlist_*(...) == 0) with fnvlist_*
  #12789 Improve log spacemap load time
  #13022 Add more control/visibility and speedup spa_load_verify()
  #13106 add physical device size to SIZE column in 'zpool list -v'
  #13388 Improve mg_aliquot math
  #13405 Revert "Reduce dbuf_find() lock contention"
  #13452 More speculative prefetcher improvements
  #13476 Refactor Log Size Limit
  #13540 AVL: Remove obsolete branching optimizations
  #13553 Reduce ZIO io_lock contention on sorted scrub
  #13555 Scrub mirror children without BPs
  #13563 FreeBSD: Improve crypto_dispatch() handling
  #13576 Several sorted scrub optimizations
  #13579 Fix and disable blocks statistics during scrub
  #13582 Several B-tree optimizations
  #13591 Avoid two 64-bit divisions per scanned block
  #13606 Avoid memory copies during mirror scrub
  #13613 Avoid memory copy when verifying raidz/draid parity
  #13643 Fix scrub resume from newly created hole
  #13756 FreeBSD: Mark ZFS_MODULE_PARAM_CALL as MPSAFE
  #13767 arcstat: fix -p option
  #13781 Importing from cachefile can trip assertion
  #13794 Apply arc_shrink_shift to ARC above arc_c_min
  #13798 Improve too large physical ashift handling
  #13811 Fix column width in 'zpool iostat -v' and 'zpool list -v'
  #13842 make DMU_OT_IS_METADATA and DMU_OT_IS_ENCRYPTED return B_TRUE
         or B_FALSE
  #13855 zfs recv hangs if max recordsize is less than received
         recordsize
  #13861 Fix use-after-free in btree code
  #13865 vdev_draid_lookup_map() should not iterate outside draid_maps
  #13878 Delay ZFS_PROP_SHARESMB property to handle it for encrypted
         raw receive
  #13882 FreeBSD: Fix integer conversion for vnlru_free{,_vfsops}()
  #13885 Fix incorrect size given to bqueue_enqueue() call in dmu_redact.c
  #13908 FreeBSD: stop passing LK_INTERLOCK to VOP_LOCK
  #13930 zpool: Don't print "repairing" on force faulted drives
  #13954 Fix bad free in skein code

Obtained from:	OpenZFS
OpenZFS tag:	zfs-2.1.6
OpenZFS commit:	6a6bd49
Relnotes:	yes
Labels: Status: Code Review Needed (Ready for review and testing)