Commits

Commits on Apr 27, 2023

  1. tests/dm: add a regression test

    Verify that reloading a dm table that maps to the dm device itself will fail.
    
    Signed-off-by: Yu Kuai <yukuai3@huawei.com>
    Yu Kuai authored and intel-lab-lkp committed Apr 27, 2023
    Commit 0fe7f3e

Commits on Apr 19, 2023

  1. dm: don't lock fs when the map is NULL in process of resume

    Commit fa24708 ("dm: requeue IO if mapping table not yet available")
    added detection of whether the mapping table is available in the IO
    submission process. If the mapping table is unavailable, it returns
    BLK_STS_RESOURCE and requeues the IO.
    This can lead to the following deadlock:
    
    dm create                                      mount
    ioctl(DM_DEV_CREATE_CMD)
    ioctl(DM_TABLE_LOAD_CMD)
                                   do_mount
                                    vfs_get_tree
                                     ext4_get_tree
                                      get_tree_bdev
                                       sget_fc
                                        alloc_super
                                         // got &s->s_umount
                                         down_write_nested(&s->s_umount, ...);
                                       ext4_fill_super
                                        ext4_load_super
                                         ext4_read_bh
                                          submit_bio
                                          // submit and wait io end
    ioctl(DM_DEV_SUSPEND_CMD)
    dev_suspend
     do_resume
      dm_suspend
       __dm_suspend
        lock_fs
         freeze_bdev
          get_active_super
           grab_super
            // wait for &s->s_umount
            down_write(&s->s_umount);
      dm_swap_table
       __bind
        // set md->map(can't get here)
    
    IO will be continuously requeued while the lock is held, since the
    mapping table is NULL; at the same time, the mapping table can't be
    set because the lock is unavailable.
    Like request-based DM, bio-based DM has the same problem.

    It's not proper to just abort IO if the mapping table is not
    available. So skip lock_fs (clear DM_SUSPEND_LOCKFS_FLAG) when the
    mapping table is NULL; this allows the DM table to be loaded and
    the IO submitted upon resume.
    
    Fixes: fa24708 ("dm: requeue IO if mapping table not yet available")
    Cc: stable@vger.kernel.org
    Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Li Lingfeng authored and Mike Snitzer committed Apr 19, 2023
    Commit 38d11da
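The decision the fix makes can be modeled in a few lines of standalone C (the flag names mirror DM's, but the values and this helper are illustrative, not the actual kernel code):

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model of the fix: skip lock_fs when userspace asked to,
 * or when there is no mapping table yet (the deadlock case above).
 * Flag names mirror DM's; values and this helper are illustrative. */
#define DM_SKIP_LOCKFS_FLAG    (1u << 0) /* from the ioctl parameters */
#define DM_SUSPEND_LOCKFS_FLAG (1u << 1) /* internal suspend flag */

static unsigned int resume_suspend_flags(unsigned int param_flags,
                                         bool map_is_null)
{
    unsigned int suspend_flags = DM_SUSPEND_LOCKFS_FLAG;

    /*
     * If there is no mapping table, freeze_bdev() would wait on IO
     * that can only complete after the table is swapped in, so don't
     * take s_umount via lock_fs at all.
     */
    if ((param_flags & DM_SKIP_LOCKFS_FLAG) || map_is_null)
        suspend_flags &= ~DM_SUSPEND_LOCKFS_FLAG;
    return suspend_flags;
}
```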
  2. dm flakey: add an "error_reads" option

    dm-flakey returns error on reads if no other argument is specified.
    This commit simplifies associated logic while formalizing an
    "error_reads" argument and an ERROR_READS flag.
    
    If no argument is specified, the ERROR_READS flag is set so that
    the target behaves just as it did before this commit.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mikulas Patocka authored and Mike Snitzer committed Apr 19, 2023
    Commit aa7d7bc
  3. dm flakey: remove trailing space in the table line

    Don't return a trailing space in the output of STATUSTYPE_TABLE.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mikulas Patocka authored and Mike Snitzer committed Apr 19, 2023
    Commit e3675dc
  4. dm flakey: fix a crash with invalid table line

    This command will crash with a NULL pointer dereference:
     dmsetup create flakey --table \
      "0 `blockdev --getsize /dev/ram0` flakey /dev/ram0 0 0 1 2 corrupt_bio_byte 512"
    
    Fix the crash by checking if arg_name is non-NULL before comparing it.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mikulas Patocka authored and Mike Snitzer committed Apr 19, 2023
    Commit 98dba02
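The defensive pattern behind the fix can be sketched in standalone C (the argument cursor and parser below are illustrative stand-ins for dm_shift_arg() and the flakey constructor logic, not the actual driver code):

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>

/* An argument cursor that returns NULL once the arguments run out,
 * and a parser that checks for NULL before strcasecmp() instead of
 * crashing on a truncated feature list. Illustrative names only. */
struct arg_set { char **argv; unsigned int argc; };

static char *shift_arg(struct arg_set *as)
{
    if (!as->argc)
        return NULL;
    as->argc--;
    return *as->argv++;
}

/* Returns 0 if "corrupt_bio_byte" was found with a value,
 * -22 (EINVAL) on a truncated or invalid feature list. */
static int parse_corrupt_bio_byte(struct arg_set *as)
{
    while (as->argc) {
        char *arg_name = shift_arg(as);

        if (!arg_name)          /* the fix: check before comparing */
            return -22;
        if (!strcasecmp(arg_name, "corrupt_bio_byte")) {
            if (!shift_arg(as)) /* value missing */
                return -22;
            return 0;
        }
    }
    return -22;
}
```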

Commits on Apr 17, 2023

  1. dm ioctl: fix nested locking in table_clear() to remove deadlock concern

    syzkaller found the following problematic rwsem locking (with write
    lock already held):
    
     down_read+0x9d/0x450 kernel/locking/rwsem.c:1509
     dm_get_inactive_table+0x2b/0xc0 drivers/md/dm-ioctl.c:773
     __dev_status+0x4fd/0x7c0 drivers/md/dm-ioctl.c:844
     table_clear+0x197/0x280 drivers/md/dm-ioctl.c:1537
    
    In table_clear(), a write lock is first acquired:
    https://elixir.bootlin.com/linux/v6.2/source/drivers/md/dm-ioctl.c#L1520
    down_write(&_hash_lock);

    Then, before the lock is released at L1539, the path shown above is
    taken:
    table_clear -> __dev_status -> dm_get_inactive_table -> down_read
    https://elixir.bootlin.com/linux/v6.2/source/drivers/md/dm-ioctl.c#L773
    down_read(&_hash_lock);
    
    It then tries to take a read lock on the same rwsem that is already
    held for write, resulting in deadlock.
    
    Fix this by moving table_clear()'s __dev_status() call to after its
    up_write(&_hash_lock);
    
    Cc: stable@vger.kernel.org
    Reported-by: Zheng Zhang <zheng.zhang@email.ucr.edu>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mike Snitzer committed Apr 17, 2023
    Commit 3d32aaa
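The locking bug and its fix can be made mechanical with a toy non-blocking rwsem model (this is a sketch, not the kernel's rwsem implementation; where the model returns false, the real rwsem would block forever):

```c
#include <assert.h>
#include <stdbool.h>

/* Toy rwsem: a read acquisition can never succeed while the write
 * side is held, so __dev_status() must run after up_write(). */
struct rwsem { bool write_held; int readers; };

static void down_write(struct rwsem *s) { s->write_held = true; }
static void up_write(struct rwsem *s)   { s->write_held = false; }

static bool down_read(struct rwsem *s)
{
    if (s->write_held)
        return false;          /* in the kernel: deadlock here */
    s->readers++;
    return true;
}

static void up_read(struct rwsem *s) { s->readers--; }

/* Buggy ordering: status (needs the read lock) inside the write lock. */
static bool table_clear_buggy(struct rwsem *hash_lock)
{
    bool ok;

    down_write(hash_lock);
    ok = down_read(hash_lock);  /* __dev_status(): can never succeed */
    if (ok)
        up_read(hash_lock);
    up_write(hash_lock);
    return ok;
}

/* Fixed ordering: __dev_status() after up_write(&_hash_lock). */
static bool table_clear_fixed(struct rwsem *hash_lock)
{
    bool ok;

    down_write(hash_lock);
    up_write(hash_lock);
    ok = down_read(hash_lock);
    if (ok)
        up_read(hash_lock);
    return ok;
}
```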

Commits on Apr 14, 2023

  1. dm: unexport dm_get_queue_limits()

    There are no dm_get_queue_limits() callers outside of DM core and
    there shouldn't be.
    
    Also, remove its BUG_ON(!atomic_read(&md->holders)) to micro-optimize
    __process_abnormal_io().
    
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mike Snitzer committed Apr 14, 2023
    Commit f799508
  2. dm: allow targets to require splitting WRITE_ZEROES and SECURE_ERASE

    Introduce max_write_zeroes_granularity and
    max_secure_erase_granularity flags in the dm_target struct.
    
    If a target sets these then DM core will split IO of these operation
    types accordingly (in terms of max_write_zeroes_sectors and
    max_secure_erase_sectors respectively).
    
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mike Snitzer committed Apr 14, 2023
    Commit 13f6fac

Commits on Apr 11, 2023

  1. dm: add helper macro for simple DM target module init and exit

    Eliminate duplicate boilerplate code for simple modules that contain
    a single DM target driver without any additional setup code.
    
    Add a new module_dm() macro, which replaces the module_init() and
    module_exit() with template functions that call dm_register_target()
    and dm_unregister_target() respectively.
    
    Signed-off-by: Yangtao Li <frank.li@vivo.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Yangtao Li authored and Mike Snitzer committed Apr 11, 2023
    Commit 3664ff8
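The shape of such a macro can be sketched in userspace C (dm_register_target()/dm_unregister_target() are stubbed here, and the real kernel macro also wires the generated functions into module_init()/module_exit(); everything below is illustrative):

```c
#include <assert.h>

/* Stub of the DM registration API for illustration. */
struct target_type { const char *name; };

static int registered;

static int dm_register_target(struct target_type *tt)
{
    (void)tt;
    registered = 1;
    return 0;
}

static void dm_unregister_target(struct target_type *tt)
{
    (void)tt;
    registered = 0;
}

/* The boilerplate-elimination idea: one macro expands to the
 * init/exit pair that every simple single-target module repeats. */
#define module_dm(name)                               \
static int name##_init(void)                          \
{                                                     \
    return dm_register_target(&name##_target);        \
}                                                     \
static void name##_exit(void)                         \
{                                                     \
    dm_unregister_target(&name##_target);             \
}

static struct target_type linear_target = { .name = "linear" };
module_dm(linear)
```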
  2. dm raid: remove unused d variable

    clang with W=1 reports
    drivers/md/dm-raid.c:2212:15: error: variable
      'd' set but not used [-Werror,-Wunused-but-set-variable]
            unsigned int d;
                         ^
    This variable is not used so remove it.
    
    Signed-off-by: Tom Rix <trix@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    trixirt authored and Mike Snitzer committed Apr 11, 2023
    Commit 306fbc2
  3. dm: remove unnecessary (void*) conversions

    Pointer variables of void * type do not require a type cast.
    
    Signed-off-by: Yu Zhe <yuzhe@nfschina.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    yuzhenfschina authored and Mike Snitzer committed Apr 11, 2023
    Commit 26cb62a
  4. dm mirror: add DMERR message if alloc_workqueue fails

    Signed-off-by: Yangtao Li <frank.li@vivo.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Yangtao Li authored and Mike Snitzer committed Apr 11, 2023
    Commit 990f61e
  5. dm: push error reporting down to dm_register_target()

    Simplifies each DM target's init method by making dm_register_target()
    responsible for its error reporting (on behalf of targets).
    
    Signed-off-by: Yangtao Li <frank.li@vivo.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Yangtao Li authored and Mike Snitzer committed Apr 11, 2023
    Commit b362c73

Commits on Apr 4, 2023

  1. dm integrity: call kmem_cache_destroy() in dm_integrity_init() error path
    
    Otherwise the journal_io_cache will leak if dm_register_target() fails.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mike Snitzer committed Apr 4, 2023
    Commit 6b79a42
  2. dm clone: call kmem_cache_destroy() in dm_clone_init() error path

    Otherwise the _hydration_cache will leak if dm_register_target() fails.
    
    Cc: stable@vger.kernel.org
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mike Snitzer committed Apr 4, 2023
    Commit 6827af4
  3. dm error: add discard support

    Add io_err_io_hints() and set discard limits so that the error
    target advertises support for discards.
    
    The error target will return -EIO for discards.
    
    This is useful when the user combines dm-error with other
    discard-supporting targets in the same table; without dm-error
    support, discards would be disabled for the whole combined device.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Tested-by: Milan Broz <gmazyland@gmail.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mikulas Patocka authored and Mike Snitzer committed Apr 4, 2023
    Commit b6bcb84
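What an ->io_hints hook of this kind does can be sketched with an illustrative queue_limits struct (the field names mirror the block layer's, but the struct and the specific values below are assumptions for illustration, not the actual dm-error settings):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for the block layer's queue limits. */
struct queue_limits {
    uint32_t discard_granularity;   /* bytes */
    uint32_t max_discard_sectors;   /* 512-byte sectors */
};

/* Sketch of an io_hints hook: fill in discard limits so the stacked
 * device advertises discard support. Actual discard IO submitted to
 * the error target still fails with -EIO, as the commit describes. */
static void io_err_io_hints(struct queue_limits *limits)
{
    limits->discard_granularity = 512;          /* assumed value */
    limits->max_discard_sectors = UINT32_MAX >> 9;
}
```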
  4. dm zero: add discard support

    Add zero_io_hints() and set discard limits so that the zero target
    advertises support for discards.
    
    The zero target will ignore discards.
    
    This is useful when the user combines dm-zero with other
    discard-supporting targets in the same table; without dm-zero support,
    discards would be disabled for the whole combined device.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Tested-by: Milan Broz <gmazyland@gmail.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mikulas Patocka authored and Mike Snitzer committed Apr 4, 2023
    Commit 00065f9
  5. dm table: allow targets without devices to set ->io_hints

    In dm_calculate_queue_limits(), add a call to the ->io_hints hook
    if the target doesn't provide ->iterate_devices.

    This is needed so the "error" and "zero" targets may support
    discards. The two following commits will add their respective
    discard support.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Tested-by: Milan Broz <gmazyland@gmail.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mikulas Patocka authored and Mike Snitzer committed Apr 4, 2023
    Commit 85c938e
  6. dm verity: emit audit events on verification failure and more

    dm-verity signals integrity violations by returning I/O errors
    to user space. To allow a controlling instance to identify
    integrity violations, the kernel audit subsystem can be used to
    emit audit events to user space. Analogous to dm-integrity, use
    the dm-audit submodule to emit audit events on verification
    failures of metadata and data blocks, as well as when the maximum
    number of corrupted errors is reached.
    
    The construction and destruction of verity device mappings are
    also relevant for auditing a system. Thus, those events are also
    logged as audit events.
    
    Tested by starting a container with the container manager (cmld) of
    GyroidOS, which uses a dm-verity protected rootfs image root.img
    mapped to /dev/mapper/<uuid>-root. One block was manipulated in the
    underlying image file, and repeated reads of the verity device were
    performed until the maximum number of corrupted errors was reached,
    e.g.:

      dd if=/dev/urandom of=root.img bs=512 count=1 seek=1000
      for i in {1..101}; do \
        dd if=/dev/mapper/<uuid>-root of=/dev/null bs=4096 \
           count=1 skip=1000; \
      done
    
    The resulting audit log looks as follows:
    
      type=DM_CTRL msg=audit(1677618791.876:962):
        module=verity op=ctr ppid=4876 pid=29102 auid=0 uid=0 gid=0
        euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=44
        comm="cmld" exe="/usr/sbin/cml/cmld" subj=unconfined
        dev=254:3 error_msg='success' res=1
    
      type=DM_EVENT msg=audit(1677619463.786:1074): module=verity
        op=verify-data dev=7:0 sector=1000 res=0
      ...
      type=DM_EVENT msg=audit(1677619596.727:1162): module=verity
        op=verify-data dev=7:0 sector=1000 res=0
    
      type=DM_EVENT msg=audit(1677619596.731:1163): module=verity
        op=max-corrupted-errors dev=254:3 sector=? res=0
    
    Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
    Acked-by: Paul Moore <paul@paul-moore.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    quitschbo authored and Mike Snitzer committed Apr 4, 2023
    Commit 074c446
  7. dm verity: fix error handling for check_at_most_once on FEC

    In verity_end_io(), if bi_status is not BLK_STS_OK, it can be
    returned directly. But if FEC is configured, it is desirable to
    correct the data page through verity_verify_io(). The return value
    will then be converted to a blk_status and passed to
    verity_finish_io().

    Additionally, when a bit is set in v->validated_blocks,
    verity_verify_io() skips verification regardless of an I/O error
    for the corresponding bio. In this case, the I/O error is not
    returned properly, and as a result, abnormal data could be read
    for the corresponding block.

    To fix this problem, do not skip verification when an I/O error
    occurs, even if the corresponding bit is set in
    v->validated_blocks.
    
    Fixes: 843f38d ("dm verity: add 'check_at_most_once' option to only validate hashes once")
    Cc: stable@vger.kernel.org
    Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
    Signed-off-by: Yeongjin Gil <youngjin.gil@samsung.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    YeongjinGil authored and Mike Snitzer committed Apr 4, 2023
    Commit e8c5d45

Commits on Mar 30, 2023

  1. dm: improve hash_locks sizing and hash function

    Both bufio and bio-prison-v1 use the identical model for splitting
    their respective locks and rbtrees. Improve dm_num_hash_locks() to
    distribute across more rbtrees to improve overall performance -- but
    the maximum number of locks/rbtrees is still 64.
    
    Also factor out a common hash function named dm_hash_locks_index();
    the magic numbers used were determined to be best using this
    program:
     https://gist.github.com/jthornber/e05c47daa7b500c56dc339269c5467fc
    
    Signed-off-by: Joe Thornber <ejt@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    jthornber authored and Mike Snitzer committed Mar 30, 2023
    Commit 363b7fd
  2. dm bio prison v1: intelligently size dm_bio_prison's prison_regions

    Size the dm_bio_prison's number of prison_region structs using
    dm_num_hash_locks().
    
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mike Snitzer committed Mar 30, 2023
    Commit b6279f8
  3. dm bio prison v1: prepare to intelligently size dm_bio_prison's prison_regions
    
    Add num_locks member to dm_bio_prison struct and use it rather than
    the NR_LOCKS magic value (64).
    
    Next commit will size the dm_bio_prison's prison_regions according to
    dm_num_hash_locks().
    
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mike Snitzer committed Mar 30, 2023
    Commit c627341
  4. dm bufio: intelligently size dm_buffer_cache's buffer_trees

    Size the dm_buffer_cache's number of buffer_tree structs using
    dm_num_hash_locks().
    
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mike Snitzer committed Mar 30, 2023
    Commit 1e84c4b
  5. dm bufio: prepare to intelligently size dm_buffer_cache's buffer_trees

    Add num_locks member to dm_buffer_cache struct and use it rather than
    the NR_LOCKS magic value (64).
    
    Next commit will size the dm_buffer_cache's buffer_trees according to
    dm_num_hash_locks().
    
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mike Snitzer committed Mar 30, 2023
    Commit 36c18b8
  6. dm: add dm_num_hash_locks()

    Simple helper to use when DM core code needs to appropriately size,
    based on num_online_cpus(), its data structures that split locks.
    
    dm_num_hash_locks() rounds up num_online_cpus() to the next power
    of 2 but caps the return value at DM_HASH_LOCKS_MAX (64).
    
    This heuristic may evolve as warranted, but as-is it will serve as a
    more informed basis for sizing the sharded lock structs in dm-bufio's
    dm_buffer_cache (buffer_trees) and dm-bio-prison-v1's dm_bio_prison
    (prison_regions).
    
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mike Snitzer committed Mar 30, 2023
    Commit 0bac3f2
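The described heuristic can be modeled as standalone C (DM_HASH_LOCKS_MAX is restated from the text above; the loop is a sketch of the stated behavior, not the kernel implementation, and the CPU count is passed in rather than read from the system):

```c
#include <assert.h>

#define DM_HASH_LOCKS_MAX 64

/* Round num_online_cpus up to the next power of 2,
 * capped at DM_HASH_LOCKS_MAX. */
static unsigned int dm_num_hash_locks(unsigned int num_online_cpus)
{
    unsigned int num_locks = 1;

    while (num_locks < num_online_cpus && num_locks < DM_HASH_LOCKS_MAX)
        num_locks <<= 1;
    return num_locks;
}
```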
  7. dm bio prison v1: add dm_cell_key_has_valid_range

    Don't have bio_detain() BUG_ON if a dm_cell_key is beyond
    BIO_PRISON_MAX_RANGE or spans a boundary.
    
    Update dm-thin.c:build_key() to use dm_cell_key_has_valid_range() which
    will do this checking without using BUG_ON. Also update
    process_discard_bio() to check the discard bio that DM core passes in
    (having first imposed max_discard_granularity based splitting).
    
    dm_cell_key_has_valid_range() will merely WARN_ON_ONCE if it
    returns false, because a false return indicates a programmer error
    that should be caught with proper testing. So relax the BUG_ONs to
    WARN_ON_ONCE.
    
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mike Snitzer committed Mar 30, 2023
    Commit 3f8d3f5
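The validity check described above can be sketched as standalone C (the constants mirror the kernel's BIO_PRISON_MAX_RANGE naming, but treat the values and the struct layout as illustrative):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIO_PRISON_MAX_RANGE       1024
#define BIO_PRISON_MAX_RANGE_SHIFT 10

struct dm_cell_key { uint64_t block_begin, block_end; };

/* A key is valid if its range does not exceed BIO_PRISON_MAX_RANGE
 * and does not span a BIO_PRISON_MAX_RANGE-aligned boundary. */
static bool dm_cell_key_has_valid_range(const struct dm_cell_key *key)
{
    if (key->block_end - key->block_begin > BIO_PRISON_MAX_RANGE)
        return false;
    if ((key->block_begin >> BIO_PRISON_MAX_RANGE_SHIFT) !=
        ((key->block_end - 1) >> BIO_PRISON_MAX_RANGE_SHIFT))
        return false;
    return true;
}
```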
  8. dm bio prison v1: improve concurrent IO performance

    Split the bio prison into multiple regions, with a separate rbtree and
    associated lock for each region.
    
    To get fast bio prison locking and not damage the performance of
    discards too much the bio-prison now stipulates that discards should
    not cross a BIO_PRISON_MAX_RANGE boundary.
    
    Because the range of a key (block_end - block_begin) must not
    exceed BIO_PRISON_MAX_RANGE, break_up_discard_bio() now ensures the
    data range reflected in the PHYSICAL key doesn't exceed
    BIO_PRISON_MAX_RANGE. Splitting the thin target's discards (handled
    with a VIRTUAL key) is achieved by updating dm-thin.c to set
    limits->max_discard_sectors in terms of BIO_PRISON_MAX_RANGE _and_
    setting the thin and thin-pool targets' max_discard_granularity to
    true.
    
    Signed-off-by: Joe Thornber <ejt@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    jthornber authored and Mike Snitzer committed Mar 30, 2023
    Commit e2dd8ac
  9. dm: split discards further if target sets max_discard_granularity

    The block core (bio_split_discard) will already split discards based
    on the 'discard_granularity' and 'max_discard_sectors' queue_limits.
    But the DM thin target also needs to ensure that it doesn't receive a
    discard that spans a 'max_discard_sectors' boundary.
    
    Introduce a dm_target 'max_discard_granularity' flag that if set will
    cause DM core to split discard bios relative to 'max_discard_sectors'.
    This treats 'discard_granularity' as a "min_discard_granularity" and
    'max_discard_sectors' as a "max_discard_granularity".
    
    Requested-by: Joe Thornber <ejt@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mike Snitzer committed Mar 30, 2023
    Commit 06961c4
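The splitting rule the flag enables can be sketched as a standalone helper that computes the largest chunk not crossing a max_discard_sectors-aligned boundary (names and the helper itself are illustrative, not DM core's code):

```c
#include <assert.h>
#include <stdint.h>

/* Given a discard covering [sector, sector + nr_sectors), return the
 * size of the largest leading chunk that does not cross the next
 * max_discard_sectors-aligned boundary. */
static uint32_t discard_chunk_sectors(uint64_t sector,
                                      uint32_t nr_sectors,
                                      uint32_t max_discard_sectors)
{
    /* distance to the next aligned boundary */
    uint32_t to_boundary = max_discard_sectors -
                           (uint32_t)(sector % max_discard_sectors);

    return nr_sectors < to_boundary ? nr_sectors : to_boundary;
}
```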
  10. dm thin: speed up cell_defer_no_holder()

    Reduce the time that a spinlock is held in cell_defer_no_holder().
    
    Signed-off-by: Joe Thornber <ejt@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    jthornber authored and Mike Snitzer committed Mar 30, 2023
    Commit bb46c56
  11. dm bufio: use multi-page bio vector

    The kernel supports multi page bio vector entries, so we can use them
    in dm-bufio as an optimization.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mikulas Patocka authored and Mike Snitzer committed Mar 30, 2023
    Commit 56c5de4
  12. dm bufio: use waitqueue_active in __free_buffer_wake

    Save one spinlock acquisition by using waitqueue_active(). We hold
    the bufio lock at this place, so no one can add entries to the
    waitqueue at this point.
    
    Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mikulas Patocka authored and Mike Snitzer committed Mar 30, 2023
    Commit f5f9354
  13. dm bufio: move dm_bufio_client members to avoid spanning cachelines

    Movement also consolidates holes in dm_bufio_client struct. But the
    overall size of the struct isn't changed.
    
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    Mike Snitzer committed Mar 30, 2023
    Commit 530f683
  14. dm bufio: add lock_history optimization for cache iterators

    Sometimes it is beneficial to repeatedly get and drop locks as part
    of an iteration. Introduce a lock_history struct to help avoid
    redundant drops and gets of the same lock.

    This optimizes cache_iterate, cache_mark_many and cache_evict.
    
    Signed-off-by: Joe Thornber <ejt@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    jthornber authored and Mike Snitzer committed Mar 30, 2023
    Commit 7911880
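The idea can be modeled with a toy lock_history that skips redundant drop/get pairs when consecutive accesses land on the same lock shard (a counter stands in for real lock traffic; all names are illustrative, not dm-bufio's):

```c
#include <assert.h>

#define NUM_LOCKS 4

static unsigned int acquisitions; /* counts (re)acquisitions */

struct lock_history { int held; /* lock index, -1 if none */ };

static void lh_lock(struct lock_history *lh, unsigned int index)
{
    index %= NUM_LOCKS;
    if (lh->held == (int)index)
        return;             /* same shard: skip the drop/get pair */
    /* would drop locks[lh->held] (if any), then take locks[index] */
    acquisitions++;
    lh->held = (int)index;
}

static void lh_exit(struct lock_history *lh)
{
    lh->held = -1;          /* drop whatever is still held */
}
```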
  15. dm bufio: improve concurrent IO performance

    When multiple threads perform IO to a thin device, the underlying
    dm_bufio object can become a bottleneck, slowing down access to the
    btree nodes that store the thin metadata. Prior to this commit, each
    bufio instance had a single mutex that was taken for every bufio
    operation.
    
    This commit concentrates on improving the common case where: a user of
    dm_bufio wishes to access, but not modify, a buffer which is already
    within the dm_bufio cache.
    
    Implementation:
    
      The code has been refactored; pulling out an 'lru' abstraction and a
      'buffer cache' abstraction (see 2 previous commits). This commit
      updates higher level bufio code (that performs allocation of buffers,
      IO and eviction/cache sizing) to leverage both abstractions. It also
      deals with the delicate locking requirements of both abstractions to
      provide finer grained locking. The result is significantly better
      concurrent IO performance.
    
      Before this commit, bufio had a global lru list it used to evict
      the oldest, clean buffers from _all_ clients. With the new locking
      we don't want different ways to access the same buffer, so instead
      do_global_cleanup() loops around the clients asking them to free
      buffers older than a certain time.
    
      This commit also converts many old BUG_ONs to WARN_ON_ONCE, see the
      lru_evict and cache_evict code in particular.  They will return
      ER_DONT_EVICT if a given buffer somehow meets the invariants that
      should _never_ happen. [Aside from revising this commit's header and
      fixing coding style and whitespace nits: this switching to
      WARN_ON_ONCE is Mike Snitzer's lone contribution to this commit]
    
    Testing:
    
      Some of the low level functions have been unit tested using dm-unit:
        https://github.com/jthornber/dm-unit/blob/main/src/tests/bufio.rs
    
      Higher level concurrency and IO is tested via a test only target
      found here:
        https://github.com/jthornber/linux/blob/2023-03-24-thin-concurrency-9/drivers/md/dm-bufio-test.c
    
      The associated userland side of these tests is here:
        https://github.com/jthornber/dmtest-python/blob/main/src/dmtest/bufio/bufio_tests.py
    
      In addition the full dmtest suite of tests (dm-thin, dm-cache, etc)
      has been run (~450 tests).
    
    Performance:
    
      Most bufio operations have unchanged performance. But if multiple
      threads are attempting to get buffers concurrently, and these
      buffers are already in the cache then there's a big speed up. Eg,
      one test has 16 'hotspot' threads simulating btree lookups while
      another thread dirties the whole device. In this case the hotspot
      threads acquire the buffers about 25 times faster.
    
    Signed-off-by: Joe Thornber <ejt@redhat.com>
    Signed-off-by: Mike Snitzer <snitzer@kernel.org>
    jthornber authored and Mike Snitzer committed Mar 30, 2023
    Commit 450e8de