NAS-122348 / None / Merge zfs-2.1.12 #137
Commits on Apr 18, 2023
-
contrib: dracut: fix race with root=zfs:dset when necessities required
This had always worked in my testing, but a user on hardware reported that this happens 100% of the time, and I reproduced it once with cold VM host caches.

dracut-zfs-generator runs as a systemd generator, i.e. at Some Relatively Early Time; if root= is a fixed dataset, it tries to "solve [necessities] statically at generation time". If by that point zfs-import.target hasn't popped (because the import is taking a non-negligible amount of time for whatever reason), it'll see no children for the root dataset, and as such generate no mounts. This has never had any right to work. No-one caught this earlier because it's just that much more convenient to have root=zfs:AUTO, which orders itself properly.

To fix this, always run zfs-nonroot-necessities.service; this additionally simplifies the implementation by:
* making BOOTFS from zfs-env-bootfs.service be the real, canonical, root dataset name, not just "whatever the first bootfs is", and only set it if we're ZFS-booting
* zfs-{rollback,snapshot}-bootfs.service can use this instead of re-implementing it
* having zfs-env-bootfs.service also set BOOTFSFLAGS
* this means the sysroot.mount drop-in can be fixed text
* zfs-nonroot-necessities.service can also be constant and always enabled, because it's conditioned on BOOTFS being set

There is no longer any code generated at run-time (the sysroot.mount drop-in is an unavoidable gratuitous cp). The flow of BOOTFS{,FLAGS} from zfs-env-bootfs.service to sysroot.mount is not noted explicitly in dracut.zfs(7), because (a) at some point it's just visual noise and (b) it's already ordered via d-p-m.s from z-i.t.

Backport-of: 3399a30
Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Commit 18edf7a
-
Revert "ZFS_IOC_COUNT_FILLED does unnecessary txg_wait_synced()"
This reverts commit 4b3133e. Users identified this commit as a possible source of data corruption: openzfs#14753 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Issue openzfs#14753 Closes openzfs#14761
Commit a969b1b
-
Values printed by zpool-iostat(8) should be right-aligned
This inappropriate left-alignment was introduced in 7bb7b1f. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: WHR <msl0000023508@gmail.com> Closes openzfs#14751
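To illustrate what right-alignment means for a column of values, here is a minimal Python sketch. It is not the zpool-iostat C code; the function name `format_row`, the column names, and the width of 8 are assumptions made only for this example.

```python
def format_row(values, width=8):
    """Right-align each value in a fixed-width column (hypothetical helper)."""
    return "  ".join(f"{v:>{width}}" for v in values)


# Made-up header and values, only to show the layout.
header = format_row(["alloc", "free", "read", "write"])
row = format_row(["1.2G", "98.8G", "12", "340"])
print(header)
print(row)
```

With right-alignment the last character of every value lines up under the last character of its header, which keeps digits of the same magnitude in the same position from row to row.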
Commit 4e49d8e
-
META file and changelog updated. Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Commit e25f913
Commits on Apr 21, 2023
-
Fix buffered/direct/mmap I/O race
When a page is faulted in for memory mapped I/O the page lock may be dropped before it has been read and marked up to date. If a buffered read encounters such a page in mappedread() it must wait until the page has been updated. Failure to do so will result in a panic on debug builds and incorrect data on production builds.

The critical part of this change is in mappedread() where pages which are not up to date are now handled. Additionally, it includes the following simplifications.
- zfs_getpage() and zfs_fillpage() could be passed an array of pages. This could be more efficient if it was used but in practice only a single page was ever provided. These interfaces were simplified to acknowledge that.
- update_pages() was modified to correctly set the PG_error bit on a page when it cannot be read by dmu_read().
- Setting PG_error and PG_uptodate was moved to zfs_fillpage() from zpl_readpage_common(). This is consistent with the handling in update_pages() and mappedread().
- Minor additional refactoring to comments and variable declarations to improve readability.
- Add a test case to exercise concurrent buffered, direct, and mmap IO to the same file.
- Reduce the mmap_sync test case default run time.

Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#13608
Closes openzfs#14498
Commit c7db374
-
Linux: zfs_fillpage() should handle partial pages from end of file
After 89cd219 was merged, Clang's static analyzer began complaining about a dead assignment in `zfs_fillpage()`. Upon inspection, I noticed that the dead assignment was because we are not using the calculated io_len that we should use to avoid asking the DMU to read past the end of a file. This should result in `dmu_buf_hold_array_by_dnode()` calling `zfs_panic_recover()`. This issue predates 89cd219, but its simplification of zfs_fillpage() eliminated the only use of the assignment to io_len, which made Clang's static analyzer complain about the issue. Also, as a precaution, we add an assertion that io_offset < i_size. If this ever fails, bad things will happen. Otherwise, we are blindly trusting the kernel not to give us invalid offsets. We continue to blindly trust it on non-debug kernels. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Closes openzfs#14534
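The length clamp described above can be sketched in a few lines of Python. This is a simplified model, not the kernel code: `fill_len` is a hypothetical name, and the 4096-byte page size is an assumption for the example.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def fill_len(io_offset, i_size, page_size=PAGE_SIZE):
    """Bytes to read for the page at io_offset without reading past EOF."""
    # Mirrors the precautionary assertion: the offset must lie inside the file.
    assert io_offset < i_size, "io_offset must be smaller than i_size"
    return min(page_size, i_size - io_offset)


print(fill_len(0, 10000))     # a full page
print(fill_len(8192, 10000))  # the partial page at the end of the file
```

For a 10000-byte file the page starting at offset 8192 holds only 1808 valid bytes, so the read is limited to that length rather than a full page.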
Commit 4a5950a
Commits on Apr 24, 2023
-
Fix "Detach spare vdev in case if resilvering does not happen"
A spare vdev should detach from the pool when a disk is reinserted. However, spare detachment depends on the completion of resilvering, and if a resilver is not scheduled, the spare vdev stays attached to the pool until the next resilver. When a zfs pool contains several disks (25+ mirror), resilvering does not always happen when a disk is reinserted. In this patch, the spare vdev is manually detached from the pool when resilvering does not occur; it has been tested on both Linux and FreeBSD. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes openzfs#14722
Commit a68dfdb
-
When resilvering, the estimated time remaining is calculated using the average issue rate over the current pass, where the current pass starts when a scan was started, or restarted if the pool was exported/imported. For dRAID pools in particular this can result in wildly optimistic estimates, since the issue rate will be very high while non-degraded regions of the pool are being scanned. Once repair I/O starts being issued, performance drops to a realistic number but the estimated performance is still significantly skewed. To address this we redefine a pass such that it starts after a scanning phase completes, so the issue rate is more reflective of recent performance. Additionally, the zfs_scan_report_txgs module option can be set to reset the pass statistics more often. Reviewed-by: Akash B <akash-b@hpe.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#14410
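A small worked example shows why the start of the pass matters for the estimate. This is a simplified model with made-up numbers, not the ZFS implementation; `eta_seconds` and its parameters are hypothetical names.

```python
def eta_seconds(total, issued, pass_issued, pass_elapsed_s):
    """Remaining work divided by the average issue rate over the pass."""
    rate = pass_issued / pass_elapsed_s
    return (total - issued) / rate


# Pass measured from the start of the scan: the fast scanning phase
# inflates the average rate (400 units in 100 s).
print(eta_seconds(1000, 400, pass_issued=400, pass_elapsed_s=100))

# Pass measured from the end of the scanning phase: only the slower
# repair I/O is counted (100 units in 50 s).
print(eta_seconds(1000, 400, pass_issued=100, pass_elapsed_s=50))
```

With the same amount of remaining work, the first estimate is 150 seconds and the second is 300 seconds, because the second rate reflects only recent performance.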
Commit 9fe3da9
-
Increase default zfs_scan_vdev_limit to 16MB
For HDD based pools the default zfs_scan_vdev_limit of 4M per-vdev can significantly limit the maximum scrub performance. Increasing the default to 16M can double the scrub speed from 80 MB/s per disk to 160 MB/s per disk. This does increase the memory footprint during scrub/resilver but given the performance win this is a reasonable trade off. Memory usage is capped at 1/4 of arc_c_max. Note that the number of outstanding I/Os has not changed and is still limited by zfs_vdev_scrub_max_active. Reviewed-by: Akash B <akash-b@hpe.com> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#14428
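The interaction between the per-vdev limit and the memory cap can be sketched as follows. This is a rough model under stated assumptions, not the ZFS code: it assumes the per-vdev limit is simply multiplied by the number of vdevs before the cap is applied, and `scan_inflight_cap` is a hypothetical name.

```python
MiB = 1 << 20
GiB = 1 << 30


def scan_inflight_cap(n_vdevs, arc_c_max, vdev_limit=16 * MiB):
    """In-flight scan bytes: per-vdev limit times vdevs, capped at arc_c_max / 4."""
    # Assumption: the limit scales linearly with the vdev count.
    return min(n_vdevs * vdev_limit, arc_c_max // 4)


print(scan_inflight_cap(10, 4 * GiB) // MiB)   # below the cap
print(scan_inflight_cap(100, 4 * GiB) // MiB)  # clamped to arc_c_max / 4
```

With a 4 GiB arc_c_max, ten vdevs stay below the cap at 160 MiB, while one hundred vdevs would ask for 1600 MiB and are clamped to 1024 MiB.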
Commit fa28e26
-
Increase default zfs_rebuild_vdev_limit to 64MB
When testing distributed rebuild performance with more capable hardware it was observed that increasing the zfs_rebuild_vdev_limit to 64M reduced the rebuild time by 17%. Beyond 64MB there was some improvement (~2%) but it was not significant when weighed against the increased memory usage. Memory usage is capped at 1/4 of arc_c_max. Additionally, vr_bytes_inflight_max has been moved so it's updated per-metaslab to allow the size to be adjusted while a rebuild is running. Reviewed-by: Akash B <akash-b@hpe.com> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#14428
Commit cdbe1d6
-
Allow MMP to bypass waiting for other threads
At our site we have seen cases when multi-modifier protection is enabled (multihost=on) on our pool and the pool gets suspended due to a single disk that is failing and responding very slowly. Our pools have 90 disks in them and we expect disks to fail. The current version of MMP requires that we wait for other writers before moving on. When a disk is responding very slowly, we observed that waiting here was bad enough to cause the pool to suspend. This change allows the MMP thread to bypass waiting for other threads and reduces the chances the pool gets suspended. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Herb Wartens <hawartens@gmail.com> Closes openzfs#14659
Commit 33075e4
Commits on May 5, 2023
-
zpool import -m also removing spare and cache when log device is missing
spa_import() relies on a pool config fetched by spa_tryimport() for spare/cache devices. Import flags are not passed to spa_tryimport(), which makes it return early due to a missing log device, so it never retrieves the cache and spare devices. Passing ZFS_IMPORT_MISSING_LOG to spa_tryimport() makes it fetch the correct configuration regardless of the missing log device. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes openzfs#14794
Commit 75ec145
Commits on May 9, 2023
-
Wait for txg sync if the last DRR_FREEOBJECTS might result in a hole
If we receive a DRR_FREEOBJECTS as the first entry in an object range, this might end up producing a hole if the freed objects were the only existing objects in the block. If the txg starts syncing before we've processed any following DRR_OBJECT records, this leads to a possible race where the backing arc_buf_t gets its psize set to 0 in the arc_write_ready() callback while still being referenced from a dirty record in the open txg. To prevent this, we insert a txg_wait_synced call if the first record in the range was a DRR_FREEOBJECTS that actually resulted in one or more freed objects. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: David Hedberg <david.hedberg@findity.com> Sponsored by: Findity AB Closes openzfs#11893 Closes openzfs#14358
Commit 9b17d5a
Commits on May 10, 2023
-
Backport two minor ZTS test case fixes from 63652e1 to resolve a few spurious failures. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Commit ecaf3ea
Commits on May 11, 2023
-
pam: Fix "buffer overflow" in pam ZTS tests on F38
The pam ZTS tests were reporting a buffer overflow on F38, possibly due to F38 now setting _FORTIFY_SOURCE=3 by default. gdb and valgrind narrowed this down to a snprintf() buffer overflow in zfs_key_config_modify_session_counter(). I'm not clear why this particular snprintf() was being flagged as an overflow, but when I replaced it with an asprintf(), the test passed reliably. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes openzfs#14802 Closes openzfs#14842
Commit 7c555fe
-
Add dmu_tx_hold_append() interface
Provides an interface which callers can use to declare a write when the exact starting offset is not yet known. Since the full range being updated is not available, only the first L0 block at the provided offset will be prefetched. Reviewed-by: Olaf Faaland <faaland1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#14819
Commit 133faca
-
When using zdb to output the value of an xattr only interpret it as printable characters if the entire byte array is printable. Additionally, if the --parseable option is set always output the buffer contents as octal for easy parsing. Reviewed-by: Olaf Faaland <faaland1@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#14830
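The printing rule can be modelled in Python as below. This is a sketch of the rule as described, not the zdb source: `render_xattr` is a hypothetical name, and the exact octal escape format and the choice of which characters count as printable are assumptions.

```python
import string

# Printable ASCII, excluding control whitespace such as tab and newline
# (an assumption made for this sketch).
PRINTABLE = set(string.printable.encode()) - set(b"\t\n\r\x0b\x0c")


def render_xattr(value: bytes, parseable: bool = False) -> str:
    """Return text only if every byte is printable; otherwise octal escapes."""
    if not parseable and value and all(b in PRINTABLE for b in value):
        return value.decode("ascii")
    # One backslash-prefixed, three-digit octal escape per byte.
    return "".join(f"\\{b:03o}" for b in value)


print(render_xattr(b"user"))
print(render_xattr(b"ab\x00"))
print(render_xattr(b"hi", parseable=True))
```

A single non-printable byte switches the whole value to octal, so a reader never sees a mix of text and escapes for one value.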
Commit b17e472
Commits on May 26, 2023
-
Fix concurrent resilvers initiated at same time
For draid vdevs it was possible to initiate both the sequential and healing resilver at the same time. This fixes the following two scenarios.
1) There's a window where a sequential rebuild can be started via ZED even if a healing resilver has been scheduled.
   - This is fixed by adding an additional check in spa_vdev_attach() for any scheduled resilver and returning an appropriate error code when a resilver is already in progress.
2) It was possible for zpool clear to start a healing resilver when it wasn't needed at all. This occurs because during a vdev_open() the device is presumed to be healthy until it is validated by vdev_validate() and set unavailable. However, by this point an async resilver will have already been requested if the DTL isn't empty.
   - This is fixed by cancelling the SPA_ASYNC_RESILVER request immediately at the end of vdev_reopen() when a resilver is unneeded.
Finally, added a testcase in ZTS for verification.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
Signed-off-by: Akash B <akash-b@hpe.com>
Closes openzfs#14881
Closes openzfs#14892
Commit c2f0aae
-
Probe vdevs before marking removed
Before allowing the ZED to mark a vdev as REMOVED due to a hotplug event, confirm that it is non-responsive with a probe. Any device which can be successfully probed should be left ONLINE to prevent a healthy pool from being incorrectly SUSPENDED. This may occur for at least the following two scenarios.
1) Drive expansion (zpool online -e) in VMware environments. If, during the partition resize operation, a partition is removed and re-created then udev will send a removed event.
2) Re-scanning the namespaces of an NVMe device (nvme ns-rescan) may result in a udev remove and add event being delivered.
Finally, update the ZED to only kick in a spare when the removal was successful.
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue openzfs#14859
Closes openzfs#14861
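The decision described in this commit can be summarized with a small Python model. It is only a sketch of the described behavior; `handle_remove_event` and the returned dictionary are hypothetical and do not correspond to ZED or kernel interfaces.

```python
def handle_remove_event(probe_ok: bool) -> dict:
    """Decide what to do with a vdev after a udev 'remove' event."""
    if probe_ok:
        # The device still answers a probe: treat the event as spurious.
        return {"state": "ONLINE", "activate_spare": False}
    # The device is really gone: mark it removed and allow a spare to kick in.
    return {"state": "REMOVED", "activate_spare": True}


print(handle_remove_event(probe_ok=True))
print(handle_remove_event(probe_ok=False))
```

The point of the ordering is that a spare is only activated after the removal has been confirmed, never on the event alone.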
Commit e2176f1
-
Add the ability to uninitialize
zpool initialize functions well for touching every free byte...once. But if we want to do it again, we're currently out of luck. So let's add zpool initialize -u to clear it. Co-authored-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes openzfs#12451 Closes openzfs#14873
Commit e97637d
-
Use vmem_zalloc to silence allocation warning
The kmem allocation in zfs_prune_aliases() will trigger a large allocation warning on systems with 64K pages. Resolve this by switching to vmem_alloc() which internally uses kvmalloc() so the right allocator will be used based on the allocation size. Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#8491 Closes openzfs#14694
Commit 6ec3abc
-
Storage device expansion "silently" fails on degraded vdev
When a vdev is degraded or faulted, we refuse to expand it when doing online -e. However, we also don't actually cause the online command to fail, even though the disk didn't expand. This is confusing and misleading, and can result in violated expectations. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Matthew Ahrens <mahrens@delphix.com> Reviewed-by: Mark Maybee <mark.maybee@delphix.com> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes 14145
Commit e2a96aa
Commits on May 28, 2023
-
We use block_device_wait to wait for the zvol block device to actually appear, and we log the result of the dd calls by using an intermediate file. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: John Wren Kennedy <john.kennedy@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Paul Dagnelie <pcd@delphix.com> Closes openzfs#14767
Commit e1b3ab5
-
ZTS: add snapshot/snapshot_002_pos exception
Add snapshot_002_pos to the known list of occasional failures for FreeBSD until it can be made entirely reliable. Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue openzfs#14831 Closes openzfs#14832
Commit c6f6958
-
ZTS: Annotate additional flaky test cases
Update several flaky test cases in zts-report.py.in until they can be made entirely reliable. Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#14392
Commit 848c4b2
-
ZTS: Add auto_replace_001_pos to exceptions
The auto_replace_001_pos test case does not reliably pass on Fedora 37 and newer. Until the test case can be updated to make it reliable add it to the list of "maybe" exceptions on Linux. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Issue openzfs#14851 Closes openzfs#14852
Commit 4e24df0
-
ZTS: Add zpool_resilver_concurrent exception
The zpool_resilver_concurrent test case requires the ZED, which is not used on FreeBSD. Add this test to the known list of skipped tests for FreeBSD. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#14904
Commit c094b9a
-
Refine special_small_blocks property validation
When the special_small_blocks property is being set during a pool create it enforces a limit of 128KiB even if the pool's record size is larger. If the recordsize property is being set during a pool create, then use that value instead of the default SPA_OLD_MAXBLOCKSIZE value. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Don Brady <dev.fs.zfs@gmail.com> Closes openzfs#13815 Closes openzfs#14811
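The refined check can be sketched as follows. This is a simplified Python model of the described rule, not the libzfs code; `special_small_blocks_ok` is a hypothetical name and other validation the real property may perform is left out.

```python
KiB = 1024
SPA_OLD_MAXBLOCKSIZE = 128 * KiB  # the default limit named in the commit


def special_small_blocks_ok(value, recordsize=None):
    """Accept value if it does not exceed the applicable block size limit."""
    # Use the recordsize given at pool creation when there is one,
    # otherwise fall back to the 128 KiB default.
    limit = recordsize if recordsize is not None else SPA_OLD_MAXBLOCKSIZE
    return value <= limit


print(special_small_blocks_ok(256 * KiB))                         # rejected
print(special_small_blocks_ok(256 * KiB, recordsize=1024 * KiB))  # accepted
```

Before the change, the second case would have been rejected even though the requested record size is larger than the special_small_blocks value.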
Commit 30dcdda
Commits on May 30, 2023
-
FreeBSD: make zfs_vfs_held() definition consistent with declaration
Noticed while attempting to change FreeBSD's boolean_t into an actual bool: in include/sys/zfs_ioctl_impl.h, zfs_vfs_held() is declared to return a boolean_t, but in module/os/freebsd/zfs/zfs_ioctl_os.c it is defined to return an int. Make the definition match the declaration. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Dimitry Andric <dimitry@andric.com> Closes openzfs#14776
Commit d1e05c6
-
FreeBSD: fix up EINVAL from getdirentries on .zfs
Without the change:
/.zfs
/.zfs/snapshot
find: /.zfs: Invalid argument
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Closes openzfs#14774
Commit aef1324
-
FreeBSD: add missing vn state transition for .zfs
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes openzfs#14774
Commit 092021b
-
Resolve a missed checkstyle warning. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Mateusz Guzik <mjguzik@gmail.com> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#14799
Commit 45c4b3e
-
FreeBSD: don't verify recycled vnode for zfs control directory
Under certain loads, the following panic is hit:

panic: page fault
KDB: stack backtrace:
#0 0xffffffff805db025 at kdb_backtrace+0x65
#1 0xffffffff8058e86f at vpanic+0x17f
#2 0xffffffff8058e6e3 at panic+0x43
#3 0xffffffff808adc15 at trap_fatal+0x385
#4 0xffffffff808adc6f at trap_pfault+0x4f
#5 0xffffffff80886da8 at calltrap+0x8
#6 0xffffffff80669186 at vgonel+0x186
#7 0xffffffff80669841 at vgone+0x31
#8 0xffffffff8065806d at vfs_hash_insert+0x26d
#9 0xffffffff81a39069 at sfs_vgetx+0x149
#10 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
#11 0xffffffff8065a28c at lookup+0x45c
#12 0xffffffff806594b9 at namei+0x259
#13 0xffffffff80676a33 at kern_statat+0xf3
#14 0xffffffff8067712f at sys_fstatat+0x2f
#15 0xffffffff808ae50c at amd64_syscall+0x10c
#16 0xffffffff808876bb at fast_syscall_common+0xf8

The page fault occurs because vgonel() will call VOP_CLOSE() for active vnodes. For this reason, define vop_close for zfsctl_ops_snapshot. While here, define vop_open for consistency.

After adding the necessary vop, the bug progresses to the following panic:

panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1)
cpuid = 17
KDB: stack backtrace:
#0 0xffffffff805e29c5 at kdb_backtrace+0x65
#1 0xffffffff8059620f at vpanic+0x17f
#2 0xffffffff81a27f4a at spl_panic+0x3a
#3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40
#4 0xffffffff8066fdee at vinactivef+0xde
#5 0xffffffff80670b8a at vgonel+0x1ea
#6 0xffffffff806711e1 at vgone+0x31
#7 0xffffffff8065fa0d at vfs_hash_insert+0x26d
#8 0xffffffff81a39069 at sfs_vgetx+0x149
#9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
#10 0xffffffff80661c2c at lookup+0x45c
#11 0xffffffff80660e59 at namei+0x259
#12 0xffffffff8067e3d3 at kern_statat+0xf3
#13 0xffffffff8067eacf at sys_fstatat+0x2f
#14 0xffffffff808b5ecc at amd64_syscall+0x10c
#15 0xffffffff8088f07b at fast_syscall_common+0xf8

This is caused by a race condition that can occur when allocating a new vnode and adding that vnode to the vfs hash. If the newly created vnode loses the race when being inserted into the vfs hash, it will not be recycled as its usecount is greater than zero, hitting the above assertion. Fix this by dropping the assertion.

FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700
Reviewed-by: Andriy Gapon <avg@FreeBSD.org>
Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
Reviewed-by: Alek Pinchuk <apinchuk@axcient.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Rob Wing <rob.wing@klarasystems.com>
Co-authored-by: Rob Wing <rob.wing@klarasystems.com>
Submitted-by: Klara, Inc.
Sponsored-by: rsync.net
Closes openzfs#14501
Commit f786232
-
FreeBSD: add missing vop_fplookup assignments
It became illegal to not have them as of 5f6df177758b9dff88e4b6069aeb2359e8b0c493 ("vfs: validate that vop vectors provide all or none fplookup vops") upstream. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Closes openzfs#14788
Commit 07a2ba5
-
CLOCK_MONOTONIC_RAW is only a thing on Linux and macOS. I'm not actually sure why the previous hardcoding of a constant didn't error out, but when we removed it, it sure does now. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes openzfs#12995
Commit 435407e
-
Correct exception path used in zts-report.py.in. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Commit a836cc6
Commits on Jun 1, 2023
-
Fix NULL pointer dereference when doing concurrent 'send' operations
A NULL pointer dereference will occur when doing a 'zfs send -S' on a dataset that is still being received. The problem is that the new 'send' will rightfully fail to own the datasets (i.e. dsl_dataset_own_force() will fail), but then dmu_send() will still do the dsl_dataset_disown(). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Luís Henriques <henrix@camandro.org> Closes openzfs#14903 Closes openzfs#14890
Commit 671b1af
-
Revert "initramfs: use `mount.zfs` instead of `mount`"
This broke mounting of snapshots on / for users. See openzfs#9461 (comment) for more context. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes openzfs#14908
Commit 93a99c6
-
Move zap_attribute_t to the heap in dsl_deadlist_merge
In a regular compilation, the compiler raises a warning that the stack size of the dsl_deadlist_merge function is too large. In a debug build this can generate an error. Move large structures to the heap. Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Mariusz Zaborski <mariusz.zaborski@klarasystems.com> Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Closes openzfs#14524
Commit 7d26967
Commits on Jun 2, 2023
-
Fix positive ABD size assertion in abd_verify().
Gang ABDs without children are legal, and they do have zero size. For other ABD types a zero size doesn't make much sense and likely does not work correctly now. Reviewed-by: Igor Kozhukhov <igor@dilos.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes openzfs#14795
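The relaxed size check can be expressed as a short Python model. This is an illustration of the rule stated in the commit message, not the abd_verify() source; `abd_size_ok` is a hypothetical name.

```python
def abd_size_ok(is_gang: bool, size: int) -> bool:
    """Gang ABDs may be empty; every other ABD type needs a positive size."""
    if is_gang:
        return size >= 0
    return size > 0


print(abd_size_ok(is_gang=True, size=0))   # empty gang is legal
print(abd_size_ok(is_gang=False, size=0))  # zero-sized non-gang ABD is not
```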
Commit e271cd7
-
Mark TX_COMMIT transaction with TXG_NOTHROTTLE.
TX_COMMIT has no on-disk representation and does not produce any more dirty data. It should not wait for anything, and even just skipping the checks when not waiting gives an improvement noticeable in a profiler. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Prakash Surya <prakash.surya@delphix.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes openzfs#14798
Commit c1b9dc7
-
Fix two abd_gang_add_gang() issues.
- There is no reason to assert that the added gang is not empty. It may be weird to add an empty gang, but it is legal.
- When moving the chain list from the added gang, clear its size, or it will trigger an assertion in abd_verify() when that gang is freed.
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Alexander Motin <mav@FreeBSD.org>
Sponsored by: iXsystems, Inc.
Closes openzfs#14816
Commit b2ede77
-
Remove single parent assertion from zio_nowait().
We only need to know if ZIO has any parent there. We do not care if it has more than one, but use of zio_unique_parent() == NULL asserts that. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes openzfs#14823
Commit a727848
-
zil: Don't expect zio_shrink() to succeed.
At least for RAIDZ, zio_shrink() does not reduce the zio size, but a reduced wsz in that case likely results in writing uninitialized memory. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes openzfs#14853
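A minimal sketch of the hazard, under stated assumptions: `toy_zio`, `toy_zio_shrink()` and `toy_prepare_write()` are hypothetical names, and the "shrink may be refused" behavior is modeled by a simple flag. The point is only that the caller reads back the size that will really be written instead of trusting the size it asked for, and zeroes the buffer tail up to that size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy zio: a write size and whether shrinking is honored. */
struct toy_zio {
	size_t io_size;
	bool can_shrink;
};

/* A shrink request that may silently do nothing. */
void
toy_zio_shrink(struct toy_zio *z, size_t size)
{
	if (z->can_shrink && size < z->io_size)
		z->io_size = size;
}

/*
 * Ask for a smaller write of wsz bytes, then zero the unused tail up to
 * the size that will actually be written. Returns that actual size.
 * Assumes used <= wsz <= z->io_size on entry.
 */
size_t
toy_prepare_write(struct toy_zio *z, unsigned char *buf, size_t used,
    size_t wsz)
{
	toy_zio_shrink(z, wsz);
	memset(buf + used, 0, z->io_size - used);
	return (z->io_size);
}
```

Had the tail been zeroed only up to `wsz` while the write still covered the original size, the bytes between `wsz` and the original size would go out uninitialized.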
Commit b01a8cc
Commits on Jun 3, 2023
-
ZIL: Allow to replay blocks of any size.
There seems to be no reason for ZIL blocks to be limited to 128KB other than that the replay code is written that way. This change does not increase the limit yet; it just removes the artificial limitation. The avoided extra memcpy() may save us a second during replay. Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc.
Commit 8a315a3
Commits on Jun 5, 2023
-
Speed up WB_SYNC_NONE when a WB_SYNC_ALL occurs simultaneously
Page writebacks with WB_SYNC_NONE can take several seconds to complete since they wait for the transaction group to close before being committed. This is usually not a problem since the caller does not need to wait. However, if we're simultaneously doing a writeback with WB_SYNC_ALL (e.g. via msync), the latter can block for several seconds (up to zfs_txg_timeout) due to the active WB_SYNC_NONE writeback since it needs to wait for the transaction to complete and the PG_writeback bit to be cleared. This commit deals with 2 cases:
- No page writeback is active. A WB_SYNC_ALL page writeback starts and even completes. But when it's about to check if the PG_writeback bit has been cleared, another writeback with WB_SYNC_NONE starts. The sync page writeback ends up waiting for the non-sync page writeback to complete.
- A page writeback with WB_SYNC_NONE is already active when a WB_SYNC_ALL writeback starts. The WB_SYNC_ALL writeback ends up waiting for the WB_SYNC_NONE writeback.
The fix works by carefully keeping track of active sync/non-sync writebacks and committing when beneficial. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Shaan Nobee <sniper111@gmail.com> Closes openzfs#12662 Closes openzfs#12790
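One way to picture "keeping track of active sync/non-sync writebacks and committing when beneficial" is a pair of counters per file. The sketch below is an assumption about the general shape, not the actual patch: `toy_file`, `toy_file_init()`, `toy_writeback_begin()`, `toy_writeback_end()` and `toy_commit()` are hypothetical names, and the "commit" is just a counter bump standing in for forcing the pending data out early.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-file state. */
struct toy_file {
	atomic_int sync_writes;		/* WB_SYNC_ALL-style writebacks in flight */
	atomic_int async_writes;	/* WB_SYNC_NONE-style writebacks in flight */
	int commits;			/* early commits forced so far */
};

void
toy_file_init(struct toy_file *f)
{
	atomic_init(&f->sync_writes, 0);
	atomic_init(&f->async_writes, 0);
	f->commits = 0;
}

/* Stand-in for committing now instead of waiting for the txg to close. */
void
toy_commit(struct toy_file *f)
{
	f->commits++;
}

/* Start a writeback. Returns true if an early commit was forced. */
bool
toy_writeback_begin(struct toy_file *f, bool for_sync)
{
	if (for_sync) {
		atomic_fetch_add(&f->sync_writes, 1);
		/* A non-sync writeback is already active: do not wait on it. */
		if (atomic_load(&f->async_writes) > 0) {
			toy_commit(f);
			return (true);
		}
		return (false);
	}
	atomic_fetch_add(&f->async_writes, 1);
	/* A sync writer is around and would end up waiting on us. */
	if (atomic_load(&f->sync_writes) > 0) {
		toy_commit(f);
		return (true);
	}
	return (false);
}

void
toy_writeback_end(struct toy_file *f, bool for_sync)
{
	if (for_sync)
		atomic_fetch_sub(&f->sync_writes, 1);
	else
		atomic_fetch_sub(&f->async_writes, 1);
}
```

The two branches correspond to the two cases listed above: a non-sync writeback arriving while a sync one is in flight, and a sync writeback arriving while a non-sync one is already active.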
Commit 9e5a297
-
Linux: use filemap_range_has_page()
As of the 4.13 kernel, filemap_range_has_page() can be used to check if there is a page mapped in a given file range. When available, this interface should be used, which eliminates the need for the zp->z_is_mapped boolean. Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes openzfs#14493
Commit 3ad6c16
-
Workaround for Linux PowerPC GPL-only cpu_has_feature()
Since 4.7, Linux makes the 'cpu_has_feature' interface use jump labels on powerpc if CONFIG_JUMP_LABEL_FEATURE_CHECKS is enabled; in this case, however, the inline function references the GPL-only symbol 'cpu_feature_keys'. ZFS currently uses 'cpu_has_feature' either directly or indirectly from several places; while it is unknown how this issue didn't break ZFS on 64-bit little-endian powerpc, it is known to break ZFS with many Linux versions on both 32-bit and 64-bit big-endian powerpc. Until this issue is fixed in Linux, we have to work around it by overriding the affected inline functions without depending on 'cpu_feature_keys'. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: WHR <msl0000023508@gmail.com> Closes openzfs#14590
Commit 35d43ba
-
Linux 6.3 compat: writepage_t first arg struct folio*
The typedef of writepage_t in kernel 6.3 was changed to take struct folio* as the first argument. We need to detect this change and pass the correct function to write_cache_pages(). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Youzhong Yang <yyang@mathworks.com> Closes openzfs#14699
Commit 04305bb
-
Linux 6.3 compat: idmapped mount API changes
Linux kernel 6.3 changed a number of APIs to use the dedicated idmap type for mounts (struct mnt_idmap); we need to detect these changes and make zfs work with the new APIs. NOTE: This backport only includes the configure checks to detect the 6.3 idmap API changes. It does not include support for idmap. When provided, the idmap variable is ignored in most cases, in the same way the user_ns argument was ignored. This change is solely to provide compatibility with the new interfaces. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Youzhong Yang <yyang@mathworks.com> Closes openzfs#14682
Commit f0aca5f
-
Linux 6.3 compat: Fix memcpy "detected field-spanning write" error
Add a new flexible array union member to dnode_phys_t and use it in the macro so we can silence the memcpy() fortify error. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Youzhong Yang <yyang@mathworks.com> Closes openzfs#14737
Commit d7fb413
-
Linux 6.4 compat: reclaimed_slab renamed to reclaimed
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Youzhong Yang <yyang@mathworks.com> Closes openzfs#14891
Commit 5f125e9
-
Silence clang warning of flexible array not at end
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Jorgen Lundman <lundman@lundman.net> Signed-off-by: Youzhong Yang <yyang@mathworks.com> Closes openzfs#14764
Commit 79f8e62
-
Linux 6.3 compat: META (openzfs#14930)
Update the META file to reflect compatibility with the 6.3 kernel. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Commit 1322f07
Commits on Jun 6, 2023
-
Fix Clang 15 compilation errors
- Clang 15 doesn't support `-fno-ipa-sra` anymore. Do a separate check for `-fno-ipa-sra` support by $KERNEL_CC.
- Don't enable `-mgeneral-regs-only` for certain module files. Fix openzfs#13260
- Scope `GCC diagnostic ignored` statements to GCC only. Clang doesn't need them to compile the code.
Porting notes:
- Moved the stanzas removing -mgeneral-regs-only to Makefile.in since they wouldn't readily work in Kbuild.in and that did.
Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: szubersk <szuberskidamian@gmail.com> Closes openzfs#13260 Closes openzfs#14150 Closes openzfs#14624 Ported-by: Rich Ercolani <rincebrain@gmail.com> Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
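Scoping a `GCC diagnostic ignored` pragma to GCC alone is commonly done with a preprocessor guard, since Clang also defines `__GNUC__`. The sketch below shows that general pattern only; the function `toy_callback()` and the choice of `-Wunused-parameter` are illustrative assumptions and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Clang defines __GNUC__ too, so exclude it explicitly. */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-parameter"
#endif

/* Hypothetical callback that deliberately ignores its context argument. */
int
toy_callback(int value, void *unused_ctx)
{
	return (value * 2);
}

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC diagnostic pop
#endif
```

The push/pop pair keeps the suppression local to the one function, and compilers other than GCC never see the pragmas at all.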
Commit dbbc2f9
-
META file and changelog updated. Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Commit 86783d7
Commits on Jun 14, 2023
-
ZFS Version 2.1.12 Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Commit fa0045a
-
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Commit fa8019f