NAS-122348 / None / Merge zfs-2.1.12 #137

Merged 58 commits on Jun 14, 2023

Commits on Apr 18, 2023

  1. contrib: dracut: fix race with root=zfs:dset when necessities required

    This had always worked in my testing, but a user on hardware reported
    this to happen 100%, and I reproduced it once with cold VM host caches.
    
    dracut-zfs-generator runs as a systemd generator, i.e. at Some
    Relatively Early Time; if root= is a fixed dataset, it tries to
    "solve [necessities] statically at generation time".
    
    If by that point zfs-import.target hasn't popped (because the import is
    taking a non-negligible amount of time for whatever reason), it'll see
    no children for the root dataset, and as such generate no mounts.
    
    This has never had any right to work. No-one caught this earlier because
    it's just that much more convenient to have root=zfs:AUTO, which orders
    itself properly.
    
    To fix this, always run zfs-nonroot-necessities.service;
    this additionally simplifies the implementation by:
      * making BOOTFS from zfs-env-bootfs.service be the real, canonical,
        root dataset name, not just "whatever the first bootfs is",
        and only set it if we're ZFS-booting
      * zfs-{rollback,snapshot}-bootfs.service can use this instead of
        re-implementing it
      * having zfs-env-bootfs.service also set BOOTFSFLAGS
      * this means the sysroot.mount drop-in can be fixed text
      * zfs-nonroot-necessities.service can also be constant and always
        enabled, because it's conditioned on BOOTFS being set
    
    There is no longer any code generated at run-time
    (the sysroot.mount drop-in is an unavoidable gratuitous cp).
    
    The flow of BOOTFS{,FLAGS} from zfs-env-bootfs.service to sysroot.mount
    is not noted explicitly in dracut.zfs(7), because (a) at some point it's
    just visual noise and (b) it's already ordered via d-p-m.s from z-i.t.
    
    Backport-of: 3399a30
    Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
    nabijaczleweli authored and tonyhutter committed Apr 18, 2023
    Commit 18edf7a
  2. Revert "ZFS_IOC_COUNT_FILLED does unnecessary txg_wait_synced()"

    This reverts commit 4b3133e.
    
    Users identified this commit as a possible source of data
    corruption:
    openzfs#14753
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Tony Hutter <hutter2@llnl.gov>
    Issue openzfs#14753 
    Closes openzfs#14761
    tonyhutter committed Apr 18, 2023
    Commit a969b1b
  3. Values printed by zpool-iostat(8) should be right-aligned

    This inappropriate left-alignment was introduced in 7bb7b1f.
    
    Reviewed-by: Tony Hutter <hutter2@llnl.gov>
    Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
    Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
    Signed-off-by: WHR <msl0000023508@gmail.com>
    Closes openzfs#14751
    Low-power authored and tonyhutter committed Apr 18, 2023
    Commit 4e49d8e
  4. Tag zfs-2.1.11

    META file and changelog updated.
    
    Signed-off-by: Tony Hutter <hutter2@llnl.gov>
    tonyhutter committed Apr 18, 2023
    Commit e25f913

Commits on Apr 21, 2023

  1. Fix buffered/direct/mmap I/O race

    When a page is faulted in for memory mapped I/O the page lock
    may be dropped before it has been read and marked up to date.
    If a buffered read encounters such a page in mappedread() it
    must wait until the page has been updated. Failure to do so
    will result in a panic on debug builds and incorrect data on
    production builds.
    
    The critical part of this change is in mappedread() where pages
    which are not up to date are now handled. Additionally, it
    includes the following simplifications.
    
    - zfs_getpage() and zfs_fillpage() could be passed an array of
      pages. This could be more efficient if it was used but in
      practice only a single page was ever provided. These
      interfaces were simplified to acknowledge that.
    
    - update_pages() was modified to correctly set the PG_error bit
      on a page when it cannot be read by dmu_read().
    
    - Setting PG_error and PG_uptodate was moved to zfs_fillpage()
      from zpl_readpage_common(). This is consistent with the
      handling in update_pages() and mappedread().
    
    - Minor additional refactoring to comments and variable
      declarations to improve readability.
    
    - Add a test case to exercise concurrent buffered, direct,
      and mmap IO to the same file.
    
    - Reduce the mmap_sync test case default run time.
    
    Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
    Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Closes openzfs#13608
    Closes openzfs#14498
    behlendorf committed Apr 21, 2023
    Commit c7db374
  2. Linux: zfs_fillpage() should handle partial pages from end of file

    After 89cd219 was merged, Clang's
    static analyzer began complaining about a dead assignment in
    `zfs_fillpage()`. Upon inspection, I noticed that the dead assignment
    was because we are not using the calculated io_len that we should use to
    avoid asking the DMU to read past the end of a file. This should result
    in `dmu_buf_hold_array_by_dnode()` calling `zfs_panic_recover()`.
    
    This issue predates 89cd219, but its
    simplification of zfs_fillpage() eliminated the only use of the
    assignment to io_len, which made Clang's static analyzer complain about
    the issue.
    
    Also, as a precaution, we add an assertion that io_offset < i_size. If
    this ever fails, bad things will happen. Otherwise, we are blindly
    trusting the kernel not to give us invalid offsets. We continue to
    blindly trust it on non-debug kernels.
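
The length clamp described above can be sketched as follows. This is a minimal illustration in Python, not the kernel code: the function name `fillpage_io_len` and the 4096-byte page size are assumptions made for the example, while `io_offset`, `io_len`, and `i_size` follow the names used in the commit message.

```python
PAGE_SIZE = 4096  # assumed page size, for illustration only


def fillpage_io_len(io_offset: int, i_size: int, page_size: int = PAGE_SIZE) -> int:
    """Return how many bytes to read for the page that starts at io_offset."""
    # Mirrors the precautionary assertion from the commit message:
    # the offset must lie inside the file.
    assert io_offset < i_size, "offset is at or past end of file"
    # Never ask for more bytes than remain before end of file.
    return min(page_size, i_size - io_offset)
```

For a 10000-byte file, the page at offset 8192 would be read with a length of 1808 bytes rather than a full page.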
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
    Signed-off-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
    Closes openzfs#14534
    ryao authored and behlendorf committed Apr 21, 2023
    Commit 4a5950a

Commits on Apr 24, 2023

  1. Fix "Detach spare vdev in case if resilvering does not happen"

    Spare vdev should detach from the pool when a disk is reinserted.
    However, spare detachment depends on the completion of resilvering,
    and if no resilver is scheduled, the spare vdev stays attached to
    the pool until the next resilvering. When a zfs pool contains
    several disks (25+ mirror), resilvering does not always happen when
    a disk is reinserted. In this patch, spare vdev is manually detached
    from the pool when resilvering does not occur and it has been tested
    on both Linux and FreeBSD.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Reviewed-by: Alexander Motin <mav@FreeBSD.org>
    Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
    Closes openzfs#14722
    ixhamza authored and behlendorf committed Apr 24, 2023
    Commit a68dfdb
  2. Improve resilver ETAs

    When resilvering, the estimated time remaining is calculated using
    the average issue rate over the current pass, where the current
    pass starts when a scan is started, or restarted if the pool
    was exported/imported.
    
    For dRAID pools in particular this can result in wildly optimistic
    estimates since the issue rate will be very high while scanning
    when non-degraded regions of the pool are scanned.  Once repair
    I/O starts being issued performance drops to a realistic number
    but the estimated performance is still significantly skewed.
    
    To address this we redefine a pass such that it starts after a
    scanning phase completes so the issue rate is more reflective of
    recent performance.  Additionally, the zfs_scan_report_txgs
    module option can be set to reset the pass statistics more often.
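
A rough sketch of why the pass boundary matters, assuming the estimate is simply the remaining bytes divided by the average issue rate of the current pass. The function name `eta_seconds` and all the numbers are hypothetical and are not taken from the ZFS source.

```python
def eta_seconds(bytes_remaining, bytes_issued_in_pass, pass_elapsed_seconds):
    """Estimate time remaining from the average issue rate of the current pass."""
    if bytes_issued_in_pass <= 0 or pass_elapsed_seconds <= 0:
        return None  # no rate available yet
    rate = bytes_issued_in_pass / pass_elapsed_seconds
    return bytes_remaining / rate


# Hypothetical numbers: a fast scanning phase (300 units in 100 s) followed by
# slower repair I/O (100 units in 100 s), with 400 units still to go.
old_eta = eta_seconds(400, 300 + 100, 100 + 100)  # pass includes the scanning phase
new_eta = eta_seconds(400, 100, 100)              # pass starts after scanning completes
```

With these numbers the old definition reports 200 seconds, while the redefined pass reports 400 seconds, which matches the repair rate actually being observed.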
    
    Reviewed-by: Akash B <akash-b@hpe.com>
    Reviewed-by: Tony Hutter <hutter2@llnl.gov>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Closes openzfs#14410
    behlendorf committed Apr 24, 2023
    Commit 9fe3da9
  3. Increase default zfs_scan_vdev_limit to 16MB

    For HDD based pools the default zfs_scan_vdev_limit of 4M
    per-vdev can significantly limit the maximum scrub performance.
    Increasing the default to 16M can double the scrub speed from
    80 MB/s per disk to 160 MB/s per disk.
    
    This does increase the memory footprint during scrub/resilver
    but given the performance win this is a reasonable trade off.
    Memory usage is capped at 1/4 of arc_c_max.  Note that number
    of outstanding I/Os has not changed and is still limited by
    zfs_vdev_scrub_max_active.
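
The memory cap can be illustrated with a small sketch. It assumes the in-flight budget is the per-vdev limit multiplied by the number of vdevs and then clamped to a quarter of `arc_c_max`; the multiplication and the name `scan_inflight_cap` are assumptions for illustration, and only the 1/4 cap and the 16M default come from the commit message.

```python
MIB = 1 << 20


def scan_inflight_cap(vdev_limit: int, n_vdevs: int, arc_c_max: int) -> int:
    """Return the scan in-flight byte budget, capped at 1/4 of arc_c_max."""
    return min(vdev_limit * n_vdevs, arc_c_max // 4)
```

With a 1 GiB `arc_c_max`, four vdevs at 16M each stay under the cap (64 MiB), whereas 32 vdevs would be clamped to 256 MiB.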
    
    Reviewed-by: Akash B <akash-b@hpe.com>
    Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
    Reviewed-by: Alexander Motin <mav@FreeBSD.org>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Closes openzfs#14428
    behlendorf committed Apr 24, 2023
    Commit fa28e26
  4. Increase default zfs_rebuild_vdev_limit to 64MB

    When testing distributed rebuild performance with more capable
    hardware it was observed that increasing the zfs_rebuild_vdev_limit
    to 64M reduced the rebuild time by 17%.  Beyond 64MB there was
    some improvement (~2%) but it was not significant when weighed
    against the increased memory usage. Memory usage is capped at 1/4
    of arc_c_max.
    
    Additionally, vr_bytes_inflight_max has been moved so it's updated
    per-metaslab to allow the size to be adjusted while a rebuild is
    running.
    
    Reviewed-by: Akash B <akash-b@hpe.com>
    Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com>
    Reviewed-by: Alexander Motin <mav@FreeBSD.org>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Closes openzfs#14428
    behlendorf committed Apr 24, 2023
    Commit cdbe1d6
  5. Allow MMP to bypass waiting for other threads

    At our site we have seen cases when multi-modifier protection is enabled
    (multihost=on) on our pool and the pool gets suspended due to a single
    disk that is failing and responding very slowly. Our pools have 90 disks
    in them and we expect disks to fail. The current version of MMP requires
    that we wait for other writers before moving on. When a disk is
    responding very slowly, we observed that waiting here was bad enough to
    cause the pool to suspend. This change allows the MMP thread to bypass
    waiting for other threads and reduces the chances the pool gets
    suspended.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Herb Wartens <hawartens@gmail.com>
    Closes openzfs#14659
    hawartens authored and behlendorf committed Apr 24, 2023
    Commit 33075e4

Commits on May 5, 2023

  1. zpool import -m also removing spare and cache when log device is missing

    spa_import() relies on a pool config fetched by spa_tryimport() for
    spare/cache devices. Import flags are not passed to spa_tryimport(),
    which makes it return early due to a missing log device, so the
    cache and spare devices are never retrieved. Passing
    ZFS_IMPORT_MISSING_LOG to spa_tryimport() makes it fetch the correct
    configuration regardless of the missing log device.
    
    Reviewed-by: Alexander Motin <mav@FreeBSD.org>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
    Closes openzfs#14794
    ixhamza authored and behlendorf committed May 5, 2023
    Commit 75ec145

Commits on May 9, 2023

  1. Wait for txg sync if the last DRR_FREEOBJECTS might result in a hole

    If we receive a DRR_FREEOBJECTS as the first entry in an object range,
    this might end up producing a hole if the freed objects were the
    only existing objects in the block.
    
    If the txg starts syncing before we've processed any following
    DRR_OBJECT records, this leads to a possible race where the backing
    arc_buf_t gets its psize set to 0 in the arc_write_ready() callback
    while still being referenced from a dirty record in the open txg.
    
    To prevent this, we insert a txg_wait_synced call if the first
    record in the range was a DRR_FREEOBJECTS that actually
    resulted in one or more freed objects.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: David Hedberg <david.hedberg@findity.com>
    Sponsored by: Findity AB
    Closes openzfs#11893
    Closes openzfs#14358
    dhedberg authored and behlendorf committed May 9, 2023
    Commit 9b17d5a

Commits on May 10, 2023

  1. ZTS: Minor fixes

    Backport two minor ZTS test case fixes from 63652e1 to resolve
    a few spurious failures.
    
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    behlendorf committed May 10, 2023
    Commit ecaf3ea

Commits on May 11, 2023

  1. pam: Fix "buffer overflow" in pam ZTS tests on F38

    The pam ZTS tests were reporting a buffer overflow on F38, possibly
    due to F38 now setting _FORTIFY_SOURCE=3 by default.  gdb and
    valgrind narrowed this down to a snprintf() buffer overflow in
    zfs_key_config_modify_session_counter().  I'm not clear why this
    particular snprintf() was being flagged as an overflow, but when
    I replaced it with an asprintf(), the test passed reliably.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Tony Hutter <hutter2@llnl.gov>
    Closes openzfs#14802 
    Closes openzfs#14842
    tonyhutter authored and behlendorf committed May 11, 2023
    Commit 7c555fe
  2. Add dmu_tx_hold_append() interface

    Provides an interface which callers can use to declare a write when
    the exact starting offset is not yet known.  Since the full range
    being updated is not available only the first L0 block at the
    provided offset will be prefetched.
    
    Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Closes openzfs#14819
    behlendorf committed May 11, 2023
    Commit 133faca
  3. zdb: consistent xattr output

    When using zdb to output the value of an xattr only interpret it
    as printable characters if the entire byte array is printable.
    Additionally, if the --parseable option is set always output the
    buffer contents as octal for easy parsing.
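
The output rule might look roughly like this. The sketch is in Python rather than zdb's C, the function name `format_xattr_value` is hypothetical, and "printable" is assumed here to mean ASCII bytes 0x20 through 0x7E, which may differ from the check zdb actually uses.

```python
def format_xattr_value(value: bytes, parseable: bool = False) -> str:
    """Render an xattr value as text only if every byte is printable."""
    all_printable = all(0x20 <= b <= 0x7E for b in value)
    if all_printable and not parseable:
        return value.decode("ascii")
    # Otherwise (or when parseable output is requested) emit every byte
    # as a three-digit octal escape.
    return "".join(f"\\{b:03o}" for b in value)
```

A value such as `b"a\x00"` is rendered entirely in octal instead of mixing text and escapes, so a reader or a script never has to guess which form it is looking at.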
    
    Reviewed-by: Olaf Faaland <faaland1@llnl.gov>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Closes openzfs#14830
    behlendorf committed May 11, 2023
    Commit b17e472

Commits on May 26, 2023

  1. Fix concurrent resilvers initiated at same time

    For draid vdevs it was possible to initiate both the
    sequential and healing resilver at the same time.
    
    This fixes the following two scenarios.
    
    1) There's a window where a sequential rebuild can be started
       via ZED even if a healing resilver has been scheduled.
       This is fixed by adding an additional check in
       spa_vdev_attach() for any scheduled resilver and returning
       an appropriate error code when a resilver is already in
       progress.
    
    2) It was possible for zpool clear to start a healing
       resilver when it wasn't needed at all. This occurs because
       during a vdev_open() the device is presumed to be healthy
       until it is validated by vdev_validate() and set
       unavailable. However, by this point an async resilver will
       have already been requested if the DTL isn't empty.
       This is fixed by cancelling the SPA_ASYNC_RESILVER
       request immediately at the end of vdev_reopen() when a
       resilver is unneeded.
    
    Finally, added a testcase in ZTS for verification.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Reviewed-by: Dipak Ghosh <dipak.ghosh@hpe.com>
    Signed-off-by: Akash B <akash-b@hpe.com>
    Closes openzfs#14881
    Closes openzfs#14892
    akashb-22 authored and behlendorf committed May 26, 2023
    Commit c2f0aae
  2. Probe vdevs before marking removed

    Before allowing the ZED to mark a vdev as REMOVED due to a
    hotplug event, confirm that it is non-responsive with a probe.
    Any device which can be successfully probed should be left
    ONLINE to prevent a healthy pool from being incorrectly
    SUSPENDED.  This may occur for at least the following two
    scenarios.
    
    1) Drive expansion (zpool online -e) in VMware environments.
       If, during the partition resize operation, a partition is
       removed and re-created then udev will send a removed event.
    
    2) Re-scanning the namespaces of an NVMe device (nvme ns-rescan)
       may result in a udev remove and add event being delivered.
    
    Finally, update the ZED to only kick in a spare when the
    removal was successful.
    
    Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
    Reviewed-by: Tony Hutter <hutter2@llnl.gov>
    Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Issue openzfs#14859
    Closes openzfs#14861
    behlendorf committed May 26, 2023
    Commit e2176f1
  3. Add the ability to uninitialize

    zpool initialize functions well for touching every free byte...once.
    But if we want to do it again, we're currently out of luck.
    
    So let's add zpool initialize -u to clear it.
    
    Co-authored-by: Rich Ercolani <rincebrain@gmail.com>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
    Closes openzfs#12451
    Closes openzfs#14873
    behlendorf committed May 26, 2023
    Commit e97637d
  4. Use vmem_zalloc to silence allocation warning

    The kmem allocation in zfs_prune_aliases() will trigger a large
    allocation warning on systems with 64K pages.  Resolve this by
    switching to vmem_zalloc() which internally uses kvmalloc() so the
    right allocator will be used based on the allocation size.
    
    Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
    Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
    Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Closes openzfs#8491
    Closes openzfs#14694
    behlendorf committed May 26, 2023
    Commit 6ec3abc
  5. Storage device expansion "silently" fails on degraded vdev

    When a vdev is degraded or faulted, we refuse to expand it when doing
    online -e. However, we also don't actually cause the online command
    to fail, even though the disk didn't expand. This is confusing and
    misleading, and can result in violated expectations.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Reviewed-by: Matthew Ahrens <mahrens@delphix.com>
    Reviewed-by: Mark Maybee <mark.maybee@delphix.com>
    Signed-off-by: Paul Dagnelie <pcd@delphix.com>
    Closes 14145
    pcd1193182 authored and behlendorf committed May 26, 2023
    Commit e2a96aa

Commits on May 28, 2023

  1. ZTS: send-c_volume is flaky

    We use block_device_wait to wait for the zvol block device to 
    actually appear, and we log the result of the dd calls by using 
    an intermediate file.
    
    Reviewed-by: George Melikov <mail@gmelikov.ru>
    Reviewed-by: John Wren Kennedy <john.kennedy@delphix.com>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Paul Dagnelie <pcd@delphix.com>
    Closes openzfs#14767
    pcd1193182 authored and behlendorf committed May 28, 2023
    Commit e1b3ab5
  2. ZTS: add snapshot/snapshot_002_pos exception

    Add snapshot_002_pos to the known list of occasional failures
    for FreeBSD until it can be made entirely reliable.
    
    Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Issue openzfs#14831
    Closes openzfs#14832
    behlendorf committed May 28, 2023
    Commit c6f6958
  3. ZTS: Annotate additional flaky test cases

    Update several flaky test cases in zts-report.py.in until they
    can be made entirely reliable.
    
    Reviewed-by: George Melikov <mail@gmelikov.ru>
    Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Closes openzfs#14392
    behlendorf committed May 28, 2023
    Commit 848c4b2
  4. ZTS: Add auto_replace_001_pos to exceptions

    The auto_replace_001_pos test case does not reliably pass on
    Fedora 37 and newer.  Until the test case can be updated to make
    it reliable add it to the list of "maybe" exceptions on Linux.
    
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Issue openzfs#14851
    Closes openzfs#14852
    behlendorf committed May 28, 2023
    Commit 4e24df0
  5. ZTS: Add zpool_resilver_concurrent exception

    The zpool_resilver_concurrent test case requires the ZED which is not used
    on FreeBSD.  Add this test to the known list of skipped tests for FreeBSD.
    
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Closes openzfs#14904
    behlendorf committed May 28, 2023
    Commit c094b9a
  6. Refine special_small_blocks property validation

    When the special_small_blocks property is being set during a pool
    create it enforces a limit of 128KiB even if the pool's record size
    is larger.
    
    If the recordsize property is being set during a pool create, then
    use that value instead of the default SPA_OLD_MAXBLOCKSIZE value.
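
The refined limit check can be sketched like this. Only the upper bound is modelled, and any other validation the property undergoes is left out. The function name `special_small_blocks_within_limit` is hypothetical, and `SPA_OLD_MAXBLOCKSIZE` is set to the 128KiB figure given in the commit message.

```python
SPA_OLD_MAXBLOCKSIZE = 128 * 1024  # 128 KiB default limit named in the commit


def special_small_blocks_within_limit(value: int, recordsize=None) -> bool:
    """Check special_small_blocks against recordsize when one is being set."""
    # Use the recordsize supplied at pool creation if present,
    # otherwise fall back to the old default maximum block size.
    limit = recordsize if recordsize is not None else SPA_OLD_MAXBLOCKSIZE
    return 0 <= value <= limit
```

Under this sketch a 256 KiB value is rejected on its own but accepted when a 1 MiB recordsize is set in the same pool create.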
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Don Brady <dev.fs.zfs@gmail.com>
    Closes openzfs#13815
    Closes openzfs#14811
    don-brady authored and behlendorf committed May 28, 2023
    Commit 30dcdda

Commits on May 30, 2023

  1. FreeBSD: make zfs_vfs_held() definition consistent with declaration

    Noticed while attempting to change FreeBSD's boolean_t into an actual
    bool: in include/sys/zfs_ioctl_impl.h, zfs_vfs_held() is declared to
    return a boolean_t, but in module/os/freebsd/zfs/zfs_ioctl_os.c it is
    defined to return an int. Make the definition match the declaration.
    
    Reviewed-by: Alexander Motin <mav@FreeBSD.org>
    Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
    Signed-off-by: Dimitry Andric <dimitry@andric.com>
    Closes openzfs#14776
    DimitryAndric authored and behlendorf committed May 30, 2023
    Commit d1e05c6
  2. FreeBSD: fix up EINVAL from getdirentries on .zfs

    Without the change:
    /.zfs
    /.zfs/snapshot
    find: /.zfs: Invalid argument
    
    Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
    Closes openzfs#14774
    mjguzik authored and behlendorf committed May 30, 2023
    Commit aef1324
  3. FreeBSD: add missing vn state transition for .zfs

    Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
    Closes openzfs#14774
    mjguzik authored and behlendorf committed May 30, 2023
    Commit 092021b
  4. Fix checkstyle warning

    Resolve a missed checkstyle warning.
    
    Reviewed-by: Alexander Motin <mav@FreeBSD.org>
    Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
    Reviewed-by: George Melikov <mail@gmelikov.ru>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Closes openzfs#14799
    behlendorf committed May 30, 2023
    Commit 45c4b3e
  5. FreeBSD: don't verify recycled vnode for zfs control directory

    Under certain loads, the following panic is hit:
    
        panic: page fault
        KDB: stack backtrace:
        #0 0xffffffff805db025 at kdb_backtrace+0x65
        #1 0xffffffff8058e86f at vpanic+0x17f
        #2 0xffffffff8058e6e3 at panic+0x43
        #3 0xffffffff808adc15 at trap_fatal+0x385
        #4 0xffffffff808adc6f at trap_pfault+0x4f
        #5 0xffffffff80886da8 at calltrap+0x8
        #6 0xffffffff80669186 at vgonel+0x186
        #7 0xffffffff80669841 at vgone+0x31
        #8 0xffffffff8065806d at vfs_hash_insert+0x26d
        #9 0xffffffff81a39069 at sfs_vgetx+0x149
        #10 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
        #11 0xffffffff8065a28c at lookup+0x45c
        #12 0xffffffff806594b9 at namei+0x259
        #13 0xffffffff80676a33 at kern_statat+0xf3
        #14 0xffffffff8067712f at sys_fstatat+0x2f
        #15 0xffffffff808ae50c at amd64_syscall+0x10c
        #16 0xffffffff808876bb at fast_syscall_common+0xf8
    
    The page fault occurs because vgonel() will call VOP_CLOSE() for active
    vnodes. For this reason, define vop_close for zfsctl_ops_snapshot. While
    here, define vop_open for consistency.
    
    After adding the necessary vop, the bug progresses to the following
    panic:
    
        panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1)
        cpuid = 17
        KDB: stack backtrace:
        #0 0xffffffff805e29c5 at kdb_backtrace+0x65
        #1 0xffffffff8059620f at vpanic+0x17f
        #2 0xffffffff81a27f4a at spl_panic+0x3a
        #3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40
        #4 0xffffffff8066fdee at vinactivef+0xde
        #5 0xffffffff80670b8a at vgonel+0x1ea
        #6 0xffffffff806711e1 at vgone+0x31
        #7 0xffffffff8065fa0d at vfs_hash_insert+0x26d
        #8 0xffffffff81a39069 at sfs_vgetx+0x149
        #9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
        #10 0xffffffff80661c2c at lookup+0x45c
        #11 0xffffffff80660e59 at namei+0x259
        #12 0xffffffff8067e3d3 at kern_statat+0xf3
        #13 0xffffffff8067eacf at sys_fstatat+0x2f
        #14 0xffffffff808b5ecc at amd64_syscall+0x10c
        #15 0xffffffff8088f07b at fast_syscall_common+0xf8
    
    This is caused by a race condition that can occur when allocating a new
    vnode and adding that vnode to the vfs hash. If the newly created vnode
    loses the race when being inserted into the vfs hash, it will not be
    recycled as its usecount is greater than zero, hitting the above
    assertion.
    
    Fix this by dropping the assertion.
    
    FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700
    Reviewed-by: Andriy Gapon <avg@FreeBSD.org>
    Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
    Reviewed-by: Alek Pinchuk <apinchuk@axcient.com>
    Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
    Signed-off-by: Rob Wing <rob.wing@klarasystems.com>
    Co-authored-by: Rob Wing <rob.wing@klarasystems.com>
    Submitted-by: Klara, Inc.
    Sponsored-by: rsync.net
    Closes openzfs#14501
    rob-wing authored and behlendorf committed May 30, 2023
    Commit f786232
  6. FreeBSD: add missing vop_fplookup assignments

    It became illegal to not have them as of
    5f6df177758b9dff88e4b6069aeb2359e8b0c493 ("vfs: validate that vop
    vectors provide all or none fplookup vops") upstream.
    
    Reviewed-by: Alexander Motin <mav@FreeBSD.org>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
    Closes openzfs#14788
    mjguzik authored and behlendorf committed May 30, 2023
    Commit 07a2ba5
  7. Fix test-runner on FreeBSD

    CLOCK_MONOTONIC_RAW is only a thing on Linux and macOS. I'm not
    actually sure why the previous hardcoding of a constant didn't
    error out, but when we removed it, it sure does now.
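
One portable way to handle a clock that exists only on some platforms is to probe for it and fall back. The sketch below is not the change made to the test runner. It only illustrates the idea with Python's standard `time` module, and the helper name `monotonic_raw` is hypothetical.

```python
import time


def monotonic_raw() -> float:
    """Read CLOCK_MONOTONIC_RAW where the platform offers it, else fall back."""
    clock_id = getattr(time, "CLOCK_MONOTONIC_RAW", None)
    if clock_id is None:
        # Platforms without the raw clock (FreeBSD, per the commit message)
        # use the ordinary monotonic clock instead.
        return time.monotonic()
    return time.clock_gettime(clock_id)
```

Probing with `getattr` avoids hardcoding a numeric clock ID whose meaning differs between operating systems.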
    
    Reviewed-by: Alexander Motin <mav@FreeBSD.org>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Co-authored-by: Rich Ercolani <rincebrain@gmail.com>
    Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
    Closes openzfs#12995
    nabijaczleweli authored and behlendorf committed May 30, 2023
    Commit 435407e
  8. ZTS: threadsappend_001_pos

    Correct exception path used in zts-report.py.in.
    
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    behlendorf committed May 30, 2023
    Commit a836cc6

Commits on Jun 1, 2023

  1. Fix NULL pointer dereference when doing concurrent 'send' operations

    A NULL pointer dereference occurs when doing a 'zfs send -S' on a dataset that
    is still being received.  The problem is that the new 'send' will
    rightfully fail to own the datasets (i.e. dsl_dataset_own_force() will
    fail), but then dmu_send() will still do the dsl_dataset_disown().
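
The failure mode described here is a cleanup path that releases an ownership the function never obtained. A minimal sketch of the guarded pattern in plain C follows. `dataset_t`, `dataset_own()`, `dataset_disown()`, and `send_stream()` are hypothetical stand-ins, not the OpenZFS functions, and the actual patch may be structured differently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a dataset handle; not the real dsl_dataset_t. */
typedef struct dataset {
	void *owner;	/* NULL while nobody owns the dataset */
} dataset_t;

/* Try to take ownership; fail with EBUSY if someone else holds it. */
static int
dataset_own(dataset_t *ds, void *tag)
{
	if (ds->owner != NULL)
		return (EBUSY);
	ds->owner = tag;
	return (0);
}

/* Release ownership; only valid for the current owner. */
static void
dataset_disown(dataset_t *ds, void *tag)
{
	assert(ds->owner == tag);
	ds->owner = NULL;
}

/* Hypothetical send path: disown only when the own call succeeded. */
static int
send_stream(dataset_t *ds, void *tag)
{
	int err = dataset_own(ds, tag);

	if (err != 0)
		return (err);	/* nothing was acquired, nothing to release */

	/* ... the stream would be generated here ... */

	dataset_disown(ds, tag);
	return (0);
}
```

With this shape, a send attempted while a receive still owns the dataset returns the error and leaves the existing owner untouched.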
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Luís Henriques <henrix@camandro.org>
    Closes openzfs#14903 
    Closes openzfs#14890
    lumigch authored and behlendorf committed Jun 1, 2023
    671b1af
  2. Revert "initramfs: use mount.zfs instead of mount"

    This broke mounting of snapshots on / for users.
    
    See openzfs#9461 (comment) for more context.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
    Closes openzfs#14908
    rincebrain authored and behlendorf committed Jun 1, 2023
    93a99c6
  3. Move zap_attribute_t to the heap in dsl_deadlist_merge

    In a regular build, the compiler warns that the stack
    frame of the dsl_deadlist_merge function is too large.
    In a debug build this can become an error.
    
    Move large structures to heap.
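
A minimal userspace sketch of the same technique is shown next. `big_attr_t` and `sum_entries()` are hypothetical, the 2048-byte buffer is an arbitrary illustrative size, and malloc()/free() stand in for whatever allocator the real code uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical large record standing in for a big on-stack structure. */
typedef struct big_attr {
	uint64_t value;
	char name[2048];
} big_attr_t;

/*
 * Sum `count` values. The scratch record lives on the heap, so this
 * function's stack frame stays small however large big_attr_t grows.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int
sum_entries(const uint64_t *values, size_t count, uint64_t *total)
{
	big_attr_t *attr = malloc(sizeof (*attr));

	if (attr == NULL)
		return (-1);

	*total = 0;
	for (size_t i = 0; i < count; i++) {
		memset(attr, 0, sizeof (*attr));
		attr->value = values[i];
		*total += attr->value;
	}

	free(attr);
	return (0);
}
```

The trade-off is one allocation per call and an extra failure path, in exchange for a stack frame that no longer depends on the size of the structure.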
    
    Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Mariusz Zaborski <mariusz.zaborski@klarasystems.com>
    Sponsored-by: Klara, Inc.
    Sponsored-by: Wasabi Technology, Inc.
    Closes openzfs#14524
    oshogbo authored and behlendorf committed Jun 1, 2023
    7d26967

Commits on Jun 2, 2023

  1. Fix positive ABD size assertion in abd_verify().

    Gang ABDs without children are legal, and they do have zero size.
    For other ABD types a zero size does not make much sense and likely
    does not work correctly now.
    
    Reviewed-by: Igor Kozhukhov <igor@dilos.org>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#14795
    amotin authored and behlendorf committed Jun 2, 2023
    e271cd7
  2. Mark TX_COMMIT transaction with TXG_NOTHROTTLE.

    TX_COMMIT has no on-disk representation and does not produce any more
    dirty data.  It should not wait for anything, and even just skipping
    the checks when not waiting gives an improvement noticeable in a profiler.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Reviewed-by: Prakash Surya <prakash.surya@delphix.com>
    Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#14798
    amotin authored and behlendorf committed Jun 2, 2023
    c1b9dc7
  3. Fix two abd_gang_add_gang() issues.

    - There is no reason to assert that the added gang is not empty.  It
      may be weird to add an empty gang, but it is legal.
    - When moving the chain list from the added gang, clear its size, or it
      will trigger an assertion in abd_verify() when that gang is freed.
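
Both points can be sketched with a simplified model. The types and functions below (`child_t`, `gang_t`, `gang_move_children()`, `gang_is_consistent()`) are hypothetical and much simpler than the real ABD code. They only illustrate accepting an empty donor and zeroing the donor's size once its children have moved.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified gang: a chain of children plus a total size. */
typedef struct child {
	size_t size;
	struct child *next;
} child_t;

typedef struct gang {
	child_t *head;
	child_t *tail;
	size_t size;	/* sum of child sizes; 0 for an empty gang */
} gang_t;

/* Append one child and account for its size. */
static void
gang_add_child(gang_t *g, child_t *c)
{
	c->next = NULL;
	if (g->tail != NULL)
		g->tail->next = c;
	else
		g->head = c;
	g->tail = c;
	g->size += c->size;
}

/* Move every child of `from` to the end of `to`; empty `from` is fine. */
static void
gang_move_children(gang_t *to, gang_t *from)
{
	if (from->head != NULL) {
		if (to->tail != NULL)
			to->tail->next = from->head;
		else
			to->head = from->head;
		to->tail = from->tail;
		to->size += from->size;
	}
	/* The donor has no children left, so its size must drop to zero. */
	from->head = from->tail = NULL;
	from->size = 0;
}

/* Consistency check: the recorded size must equal the sum of children. */
static int
gang_is_consistent(const gang_t *g)
{
	size_t sum = 0;

	for (const child_t *c = g->head; c != NULL; c = c->next)
		sum += c->size;
	return (sum == g->size);
}
```

If the donor kept its old size after the move, the consistency check would fail on it later, which mirrors the assertion failure the second point describes.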
    
    Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
    Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#14816
    amotin authored and behlendorf committed Jun 2, 2023
    b2ede77
  4. Remove single parent assertion from zio_nowait().

    We only need to know if ZIO has any parent there.  We do not care if
    it has more than one, but use of zio_unique_parent() == NULL asserts
    that.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
    Sponsored by:	iXsystems, Inc.
    Closes openzfs#14823
    amotin authored and behlendorf committed Jun 2, 2023
    a727848
  5. zil: Don't expect zio_shrink() to succeed.

    At least for RAIDZ zio_shrink() does not reduce zio size, but reduced
    wsz in that case likely results in writing uninitialized memory.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by:  Alexander Motin <mav@FreeBSD.org>
    Sponsored by:   iXsystems, Inc.
    Closes openzfs#14853
    amotin authored and behlendorf committed Jun 2, 2023
    b01a8cc

Commits on Jun 3, 2023

  1. ZIL: Allow to replay blocks of any size.

    There seems to be no reason for ZIL blocks to be limited to 128KB
    other than that the replay code is written that way.  This change does
    not increase the limit yet, it just removes the artificial limitation.
    
    Avoiding the extra memcpy() may save us a second during replay.
    
    Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
    Sponsored by:	iXsystems, Inc.
    amotin authored and behlendorf committed Jun 3, 2023
    8a315a3

Commits on Jun 5, 2023

  1. Speed up WB_SYNC_NONE when a WB_SYNC_ALL occurs simultaneously

    Page writebacks with WB_SYNC_NONE can take several seconds to complete
    since they wait for the transaction group to close before being
    committed. This is usually not a problem since the caller does not
    need to wait. However, if we're simultaneously doing a writeback
    with WB_SYNC_ALL (e.g. via msync), the latter can block for several
    seconds (up to zfs_txg_timeout) due to the active WB_SYNC_NONE
    writeback since it needs to wait for the transaction to complete
    and the PG_writeback bit to be cleared.
    
    This commit deals with 2 cases:
    
    - No page writeback is active. A WB_SYNC_ALL page writeback starts
      and even completes. But when it's about to check if the PG_writeback
      bit has been cleared, another writeback with WB_SYNC_NONE starts.
      The sync page writeback ends up waiting for the non-sync page
      writeback to complete.
    
    - A page writeback with WB_SYNC_NONE is already active when a
      WB_SYNC_ALL writeback starts. The WB_SYNC_ALL writeback ends up
      waiting for the WB_SYNC_NONE writeback.
    
    The fix works by carefully keeping track of active sync/non-sync
    writebacks and committing when beneficial.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Shaan Nobee <sniper111@gmail.com>
    Closes openzfs#12662
    Closes openzfs#12790
    shaan1337 authored and tonyhutter committed Jun 5, 2023
    9e5a297
  2. Linux: use filemap_range_has_page()

    As of the 4.13 kernel filemap_range_has_page() can be used to
    check if there is a page mapped in a given file range.  When
    available this interface should be used which eliminates the
    need for the zp->z_is_mapped boolean.
    
    Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Closes openzfs#14493
    behlendorf authored and tonyhutter committed Jun 5, 2023
    3ad6c16
  3. Workaround for Linux PowerPC GPL-only cpu_has_feature()

    Since 4.7, Linux makes the 'cpu_has_feature' interface use jump labels on
    powerpc if CONFIG_JUMP_LABEL_FEATURE_CHECKS is enabled.  In that case,
    however, the inline function references the GPL-only symbol
    'cpu_feature_keys'.
    
    ZFS currently uses 'cpu_has_feature' either directly or indirectly from
    several places; while it is unknown how this issue didn't break ZFS on
    64-bit little-endian powerpc, it is known to break ZFS with many Linux
    versions on both 32-bit and 64-bit big-endian powerpc.
    
    Until this issue is fixed in Linux, we have to work around it by
    overriding the affected inline functions without depending on
    'cpu_feature_keys'.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: WHR <msl0000023508@gmail.com>
    Closes openzfs#14590
    Low-power authored and tonyhutter committed Jun 5, 2023
    35d43ba
  4. Linux 6.3 compat: writepage_t first arg struct folio*

    The typedef of writepage_t changed in kernel 6.3 to take
    struct folio* as the first argument. We need to detect this
    change and pass the correct function to write_cache_pages().
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
    Signed-off-by: Youzhong Yang <yyang@mathworks.com>
    Closes openzfs#14699
    youzhongyang authored and tonyhutter committed Jun 5, 2023
    04305bb
  5. Linux 6.3 compat: idmapped mount API changes

    Linux kernel 6.3 changed a bunch of APIs to use the dedicated idmap
    type for mounts (struct mnt_idmap).  We need to detect these changes
    and make zfs work with the new APIs.
    
    NOTE: This backport only includes the configure checks to detect
    the 6.3 idmap API changes.  It does not include support for idmap.
    When provided the idmap variable is ignored in most cases in the
    same way the user_ns argument was ignored.  This change is solely
    to provide compatibility with the new interfaces.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Youzhong Yang <yyang@mathworks.com>
    Closes openzfs#14682
    youzhongyang authored and tonyhutter committed Jun 5, 2023
    f0aca5f
  6. Linux 6.3 compat: Fix memcpy "detected field-spanning write" error

    Add a new flexible array union member to dnode_phys_t and use
    it in the macro so we can silence the memcpy() fortify error.
    
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Youzhong Yang <yyang@mathworks.com>
    Closes openzfs#14737
    youzhongyang authored and tonyhutter committed Jun 5, 2023
    d7fb413
  7. Linux 6.4 compat: reclaimed_slab renamed to reclaimed

    Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
    Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: Youzhong Yang <yyang@mathworks.com>
    Closes openzfs#14891
    youzhongyang authored and tonyhutter committed Jun 5, 2023
    5f125e9
  8. Silence clang warning of flexible array not at end

    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Reviewed-by: Jorgen Lundman <lundman@lundman.net>
    Signed-off-by: Youzhong Yang <yyang@mathworks.com>
    Closes openzfs#14764
    youzhongyang authored and tonyhutter committed Jun 5, 2023
    79f8e62
  9. Linux 6.3 compat: META (openzfs#14930)

    Update the META file to reflect compatibility with the 6.3 kernel.
    
    Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Reviewed-by: Tony Hutter <hutter2@llnl.gov>
    behlendorf authored and tonyhutter committed Jun 5, 2023
    1322f07

Commits on Jun 6, 2023

  1. Fix Clang 15 compilation errors

    - Clang 15 doesn't support `-fno-ipa-sra` anymore. Do a separate
      check for `-fno-ipa-sra` support by $KERNEL_CC.
    
    - Don't enable `-mgeneral-regs-only` for certain module files.
      Fix openzfs#13260
    
    - Scope `GCC diagnostic ignored` statements to GCC only. Clang
      doesn't need them to compile the code.
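
The third item can be sketched as follows. Clang also defines `__GNUC__`, so the guard has to exclude it explicitly. The warning name `-Wstringop-truncation` and the helper `copy_name()` are illustrative choices only, not taken from the patch.

```c
#include <assert.h>
#include <string.h>

/*
 * Apply the diagnostic pragmas only for real GCC. Clang defines
 * __GNUC__ too, so it is excluded with the __clang__ check.
 */
#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wstringop-truncation"
#endif

/* Copy src into dst, truncating if needed and always NUL-terminating. */
static void
copy_name(char *dst, size_t dstlen, const char *src)
{
	strncpy(dst, src, dstlen - 1);
	dst[dstlen - 1] = '\0';
}

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC diagnostic pop
#endif
```

Pairing push and pop under the same guard keeps the suppression limited to the one function and avoids unknown-warning complaints from compilers that do not recognize the option.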
    
    Porting notes:
    - Moved the stanzas removing -mgeneral-regs-only to Makefile.in
      since they wouldn't readily work in Kbuild.in, but did work there.
    
    Reviewed-by: Richard Yao <richard.yao@alumni.stonybrook.edu>
    Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
    Signed-off-by: szubersk <szuberskidamian@gmail.com>
    Closes openzfs#13260
    Closes openzfs#14150
    
    Closes openzfs#14624
    Ported-by: Rich Ercolani <rincebrain@gmail.com>
    Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
    szubersk authored and tonyhutter committed Jun 6, 2023
    dbbc2f9
  2. Tag zfs-2.1.12

    META file and changelog updated.
    
    Signed-off-by: Tony Hutter <hutter2@llnl.gov>
    tonyhutter committed Jun 6, 2023
    86783d7

Commits on Jun 14, 2023

  1. Merge tag 'zfs-2.1.12'

    ZFS Version 2.1.12
    
    Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
    ixhamza committed Jun 14, 2023
    fa0045a
  2. Bump changelog for 2.1.12

    Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
    ixhamza committed Jun 14, 2023
    fa8019f