
Commits on Mar 19, 2021

  1. RDMA/rv: Integrate the file operations into the rv module

    Integrate the file operations into the module_init and module_exit
    functions so that user applications can access the rv module.
    
    Signed-off-by: Todd Rimmer <todd.rimmer@intel.com>
    Signed-off-by: Kaike Wan <kaike.wan@intel.com>
    kwan-intc authored and intel-lab-lkp committed Mar 19, 2021
  2. RDMA/rv: Add functions for file operations

    A process communicates with the rv module through the file interface:
    - The process opens the /dev/rv device file.
    - The process sends an Attach request to bind to an RDMA device.
    - The process registers a number of user/kernel memory regions (MR).
    - The process sends a Create Conn request to create a connection between
      two nodes. If the local node is the server, it will start to listen
      to IB CM for any incoming connection request.
    - The process sends a Connect request to start the connection. For a
      server node, it does nothing. However, for a client node, it will send
      the IB CM connection request.
    - The process will wait for all connections to be established by
      polling the rv module. Receiving buffers will be posted.
    - The process mmaps the event ring into user space.
    - The process starts an RDMA transaction by sending RDMA write with
      immediate requests to the rv module. Send completion events will be
      posted to the event ring.
    - On the responder side, RDMA write with immediate requests will be
      received and receive completion events will be posted to the event
      ring buffer.
    - The process will poll the event ring for completion events.
    - When RDMA transactions are done, the process deregisters the memory
      regions.
    - The process closes the file. In this process, any connection will
      be torn down, and the RDMA device will be detached if there are no
      more users. Explicit detach and disconnection are not required.
    
    Technically, the MR registration, RDMA transactions, and MR
    deregistration occur for every application IO.
    
    This patch adds the functions for the file operations and integrates
    the functions with memory region registration, connection management,
    and RDMA transactions.
    
    Signed-off-by: Todd Rimmer <todd.rimmer@intel.com>
    Signed-off-by: Kaike Wan <kaike.wan@intel.com>
    kwan-intc authored and intel-lab-lkp committed Mar 19, 2021
  3. RDMA/rv: Add functions for RDMA transactions

    The only RDMA request used by this module is the RDMA WRITE WITH
    IMMEDIATE request. Part of the immediate data is used as a tag to
    encode the intended receiving rv_user in the rv_conn object, and
    remaining bits are reserved for the user application (e.g., to associate
    the inbound completion with a specific outstanding rendezvous IO).
    
    This patch adds the following functions:
    - Send RDMA write with immediate request.
    - Handle the send completion event.
    - Receive the RDMA write with immediate request.
    - Post events to the event ring.
    
    Signed-off-by: Todd Rimmer <todd.rimmer@intel.com>
    Signed-off-by: Kaike Wan <kaike.wan@intel.com>
    kwan-intc authored and intel-lab-lkp committed Mar 19, 2021
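
The split of the immediate data described in "RDMA/rv: Add functions for RDMA transactions" above can be sketched in plain C. This is only an illustration: the commit message does not say how many bits the tag occupies, so the 8/24 split and every name below (`EX_IMM_TAG_BITS`, `ex_imm_pack()`, and so on) are assumptions, not the rv module's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: the top 8 bits carry the receiver tag and the low
 * 24 bits are left to the application. The real split is not stated
 * in the commit message. */
#define EX_IMM_TAG_BITS 8u
#define EX_IMM_APP_BITS (32u - EX_IMM_TAG_BITS)
#define EX_IMM_APP_MASK ((UINT32_C(1) << EX_IMM_APP_BITS) - 1u)

/* Combine a receiver tag and application data into one 32-bit immediate. */
static inline uint32_t ex_imm_pack(uint32_t tag, uint32_t app_data)
{
    assert(tag < (UINT32_C(1) << EX_IMM_TAG_BITS));
    assert(app_data <= EX_IMM_APP_MASK);
    return (tag << EX_IMM_APP_BITS) | app_data;
}

/* Recover the receiver tag on the responder side. */
static inline uint32_t ex_imm_tag(uint32_t imm)
{
    return imm >> EX_IMM_APP_BITS;
}

/* Recover the application-owned bits (e.g. an outstanding IO index). */
static inline uint32_t ex_imm_app(uint32_t imm)
{
    return imm & EX_IMM_APP_MASK;
}
```

With this layout the responder can route a completion using `ex_imm_tag()` alone and hand `ex_imm_app()` back to the application untouched.
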
  4. RDMA/rv: Add connection management functions

    To improve the scalability of RDMA transactions, there will be only one
    connection between any two nodes. Within each node pair, one node will
    be the client and the other node will be the server, depending on the
    lids/gids of the two nodes. However, to make best use of the bandwidth,
    each connection could have multiple RC QPs to share among all of the
    processes within a job. Connection is established through the IB CM
    interface.
    
    This patch adds the following functions:
    - Listener functions to wait for any IB CM requests from the same
      job.
    - Client functions to send IB CM requests.
    - Functions to manage the lifetime of the connection object.
    - Functions to manage RC QPs.
    - IB CM event handlers for client and server.
    
    Signed-off-by: Todd Rimmer <todd.rimmer@intel.com>
    Signed-off-by: Kaike Wan <kaike.wan@intel.com>
    kwan-intc authored and intel-lab-lkp committed Mar 19, 2021
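
The commit above says the client and server roles within a node pair depend on the lids/gids of the two nodes, but not which side wins. A minimal sketch, assuming the node with the numerically lower 16-byte GID becomes the client; that rule, the tie handling, and all names here are hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum ex_role { EX_ROLE_CLIENT, EX_ROLE_SERVER };

/* A GID is 128 bits; it is treated here as an opaque byte string. */
struct ex_gid {
    uint8_t raw[16];
};

/* Both nodes run the same comparison on the same pair of GIDs, so they
 * reach complementary answers without exchanging any extra messages.
 * Assumption: lower GID acts as client. Equal GIDs (loopback) are
 * simply reported as server in this sketch. */
static inline enum ex_role ex_pick_role(const struct ex_gid *local,
                                        const struct ex_gid *remote)
{
    assert(local != NULL && remote != NULL);
    return memcmp(local->raw, remote->raw, sizeof(local->raw)) < 0
               ? EX_ROLE_CLIENT
               : EX_ROLE_SERVER;
}
```

The point of a deterministic rule like this is that exactly one side of each pair issues the IB CM request while the other listens.
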
  5. RDMA/rv: Add function to register/deregister memory region

    This patch adds functions for user application to register/deregister
    memory region (mr) for user buffers. Two types of mrs are supported:
    user mrs and kernel mrs.
    
    User mrs are used solely by the user application. The reason that
    the user mrs are cached in the rv module instead of in the user
    application is that a middleware application may not know when a user
    buffer (allocated by an upper-layer application) is freed in order to
    free any stale nodes. On the other hand, the rv module can register an
    MMU notifier callback so that it can promptly remove any stale cache
    nodes.
    
    Kernel mrs are used by the rv module for any RDMA transactions between
    nodes.
    
    A user mr is registered in a way similar to that in the verbs
    interface. A kernel mr is registered similar to that in
    ib_reg_user_mr() for on-demand paging. RDMA hardware may have to
    be qualified for this mechanism.
    
    Signed-off-by: Todd Rimmer <todd.rimmer@intel.com>
    Signed-off-by: Kaike Wan <kaike.wan@intel.com>
    kwan-intc authored and intel-lab-lkp committed Mar 19, 2021
  6. RDMA/rv: Add functions for memory region cache

    The MR cache is implemented through an rb tree. Each node is indexed
    by a simple (address, length, access_flags) tuple, without any
    consideration of buffer overlapping. When a node's refcount goes
    down to 0, it is not removed from the cache. Instead, it is put into
    an LRU list that could be evicted if the cache memory limit is reached.
    However, if the user buffer for the memory region is freed, the node
    will be removed when the MMU notification is received.
    
    Signed-off-by: Todd Rimmer <todd.rimmer@intel.com>
    Signed-off-by: Kaike Wan <kaike.wan@intel.com>
    kwan-intc authored and intel-lab-lkp committed Mar 19, 2021
  7. RDMA/rv: Add the rv module

    Add the rv module, the Makefile, and Kconfig file.
    
    Also add the functions to manage IB devices.
    
    Signed-off-by: Todd Rimmer <todd.rimmer@intel.com>
    Signed-off-by: Kaike Wan <kaike.wan@intel.com>
    kwan-intc authored and intel-lab-lkp committed Mar 19, 2021
  8. RDMA/rv: Add the internal header files

    The two header files include the defines, structures, MACROs,
    function prototypes, and functions used in the module.
    
    Signed-off-by: Todd Rimmer <todd.rimmer@intel.com>
    Signed-off-by: Kaike Wan <kaike.wan@intel.com>
    kwan-intc authored and intel-lab-lkp committed Mar 19, 2021
  9. RDMA/rv: Public interface for the RDMA Rendezvous module

    The RDMA Rendezvous (rv) module provides an interface for HPC
    middlewares to improve performance by caching memory region
    registration, and improve the scalability of RDMA transactions
    through connection management between nodes. This mechanism
    is implemented through the following ioctl requests:
    - ATTACH: to attach to an RDMA device.
    - REG_MEM: to register a user/kernel memory region.
    - DEREG_MEM: to release application use of MR, allowing it to
                 remain in cache.
    - GET_CACHE_STATS: to get cache statistics.
    - CONN_CREATE: to create an RC connection.
    - CONN_CONNECT: to start the connection.
    - CONN_GET_CONN_COUNT: to use as part of error recovery from lost
                           messages in the application.
    - CONN_GET_STATS: to get connection statistics.
    - GET_EVENT_STATS: to get the RDMA event statistics.
    - POST_RDMA_WR_IMMED: to post an RDMA WRITE WITH IMMED request.
    
    Signed-off-by: Todd Rimmer <todd.rimmer@intel.com>
    Signed-off-by: Kaike Wan <kaike.wan@intel.com>
    kwan-intc authored and intel-lab-lkp committed Mar 19, 2021

Commits on Mar 12, 2021

  1. RDMA/mlx5: Allow larger pages in DevX umem

    The umem DMA list calculation was locked at 4k pages due to confusion
    around how this API works and is used when larger pages are present.
    
    The conclusion is:
    
     - umem's cannot extend past what is mapped into the process, so creating
       a large page size and referring to a sub-range is not allowed
    
     - umem's must always have a page offset of zero, except for sub PAGE_SIZE
       umems
    
     - The feature of umem_offset to create multiple objects inside a umem
       is buggy and isn't used anyplace. Thus we can assume all users of the
       current API have umem_offset == 0 as well
    
    Provide a new page size calculator that limits the DMA list to the VA
    range and enforces umem_offset == 0.
    
    Allow user space to specify the page sizes which it can accept. This
    bitmap must be derived from the intended use of the umem, based on
    per-usage HW limitations.
    
    Link: https://lore.kernel.org/r/20210304130501.1102577-4-leon@kernel.org
    Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    jgunthorpe committed Mar 12, 2021
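
The interaction between a page-size bitmap and the "page offset of zero" rule in "RDMA/mlx5: Allow larger pages in DevX umem" above can be illustrated with a deliberately simplified sketch. It only checks the alignment of the starting virtual address; it ignores physical contiguity, the VA range limit, and the sub-PAGE_SIZE exception, and `ex_best_pgsz()` is a hypothetical name, not the kernel's calculator:

```c
#include <stdint.h>

/* Each set bit in pgsz_bitmap is a page size the caller accepts
 * (bit 12 = 4 KiB, bit 21 = 2 MiB, ...). Return the largest accepted
 * page size for which 'va' has a zero offset within the page, or 0
 * when no accepted size fits. */
static inline uint64_t ex_best_pgsz(uint64_t pgsz_bitmap, uint64_t va)
{
    for (int bit = 63; bit >= 0; bit--) {
        uint64_t pgsz = UINT64_C(1) << bit;

        if ((pgsz_bitmap & pgsz) && (va & (pgsz - 1)) == 0)
            return pgsz;
    }
    return 0;
}
```

An empty bitmap falls through the loop and yields 0, so no separate zero-bitmap check is needed in this sketch.
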
  2. IB/core: Split uverbs_get_const/default to consider target type

    Change uverbs_get_const/uverbs_get_const_default to work properly with
    both signed/unsigned parameters.
    
    The current APIs mix s64 and u64, which leads to an incorrect check
    when a u64 value is supplied with its upper bit set. In that case the
    uverbs_get_const() / uverbs_get_const_default() lower bound check may
    fail unexpectedly: the target is unsigned (lower bound is 0), but the
    value becomes negative because of the s64 usage.
    
    Split to have two different APIs, no change to callers as the required
    API will be called internally according to the target type.
    
    Link: https://lore.kernel.org/r/20210304130501.1102577-3-leon@kernel.org
    Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    Yishai Hadas authored and jgunthorpe committed Mar 12, 2021
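
The failure mode described in "IB/core: Split uverbs_get_const/default to consider target type" above can be reproduced outside the kernel. The helpers below are hypothetical stand-ins written for this illustration; they are not the uverbs functions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Range check that carries the value as a signed 64-bit integer. */
static inline bool ex_in_range_s64(int64_t v, int64_t lo, int64_t hi)
{
    return v >= lo && v <= hi;
}

/* Range check that keeps the value unsigned throughout. */
static inline bool ex_in_range_u64(uint64_t v, uint64_t lo, uint64_t hi)
{
    return v >= lo && v <= hi;
}

/* Mimics the mixed path: an unsigned input is funnelled through s64.
 * On the usual two's-complement targets a value with the top bit set
 * turns negative here and fails the "lower bound is 0" test. */
static inline bool ex_check_via_s64(uint64_t user_value)
{
    return ex_in_range_s64((int64_t)user_value, 0, INT64_MAX);
}
```

Choosing the signed or unsigned check according to the type of the destination variable avoids the spurious failure, which is the idea behind splitting the API.
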
  3. IB/core: Drop WARN_ON() from ib_umem_find_best_pgsz()

    The WARN_ON() issued as part of ib_umem_find_best_pgsz() blocked cases
    when only page sizes larger than PAGE_SIZE were set, drop it to enable
    those cases.
    
    In addition, there is no need to have a specific check for zero
    pgsz_bitmap, the function will do its job and return 0 at the end if
    no match is found.
    
    Link: https://lore.kernel.org/r/20210304130501.1102577-2-leon@kernel.org
    Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    Yishai Hadas authored and jgunthorpe committed Mar 12, 2021
  4. RDMA/mlx5: Fix mlx5 rates to IB rates map

    Correct the map between mlx5 rates and corresponding ib rates, as they
    don't always have a fixed offset between them.
    
    Fixes: e126ba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
    Link: https://lore.kernel.org/r/20210304124517.1100608-4-leon@kernel.org
    Signed-off-by: Mark Zhang <markzhang@nvidia.com>
    Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    Mark Zhang authored and jgunthorpe committed Mar 12, 2021
  5. RDMA/mlx5: Fix query RoCE port

    mlx5_is_roce_enabled returns the devlink RoCE init value, therefore it
    should be used only when driver is loaded. Instead we just need to read
    the roce_en field.
    
    In addition, rename mlx5_is_roce_enabled to mlx5_is_roce_init_enabled.
    
    Fixes: 7a58779 ("IB/mlx5: Improve query port for representor port")
    Link: https://lore.kernel.org/r/20210304124517.1100608-2-leon@kernel.org
    Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    maorgottlieb authored and jgunthorpe committed Mar 12, 2021
  6. RDMA/mlx5: Rename mlx5_mr_cache_invalidate() to revoke_mr()

    Now that this is only used in a few places in mr.c give it a sensible
    name. It has nothing to do with the cache and can be invoked on any
    MR. DMA is stopped and the user cannot touch the MR any further once it
    completes.
    
    Link: https://lore.kernel.org/r/20210304120745.1090751-5-leon@kernel.org
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    jgunthorpe committed Mar 12, 2021
  7. RDMA/mlx5: Consolidate MR destruction to mlx5_ib_dereg_mr()

    Now that the SRCU stuff has been removed the entire MR destroy logic can
    be made a lot simpler. Currently there are many different ways to destroy a
    MR and it makes it really hard to do this task correctly. Route all
    destruction through mlx5_ib_dereg_mr() and make it work for all
    situations.
    
    Since it turns out all the different MR types do basically the same thing
    this removes a lot of knowledge of MR internals from ODP and leaves ODP
    just exporting an operation to clean up children.
    
    This fixes a few weird corner case bugs and firmly uses the correct
    ordering of the MR destruction:
     - Stop parallel access to the mkey via the ODP xarray
     - Stop DMA
     - Release the umem
     - Clean up ODP children
     - Free/Recycle the MR
    
    Link: https://lore.kernel.org/r/20210304120745.1090751-4-leon@kernel.org
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    jgunthorpe committed Mar 12, 2021
  8. RDMA/mlx5: Use a union inside mlx5_ib_mr

    The struct mlx5_ib_mr can be used for three different things, but only one
    at a time:
    
     - In the user MR cache
     - As a kernel MR
     - As a user MR
    
    Overlay the three things into a single union with the following rules:
    
     - If the mr is found on the cache_ent->head list then it is a cache MR
       and umem == NULL. The entire union is zero after the MR is removed from
       the cache.
    
     - If umem != NULL or type == IB_MR_TYPE_USER then it is a user MR.
    
     - If umem == NULL then it is a kernel MR
    
    This reduces the size of struct mlx5_ib_mr to 552 bytes from 702.
    
    The only place the three flows overlap in the code is during dereg, so add
    a few extra checks along there.
    
    Link: https://lore.kernel.org/r/20210304120745.1090751-3-leon@kernel.org
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    jgunthorpe committed Mar 12, 2021
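
The size saving from overlaying mutually exclusive state, as described in "RDMA/mlx5: Use a union inside mlx5_ib_mr" above, can be seen with a toy structure. The field names and sizes below are invented for illustration and do not reflect the real struct mlx5_ib_mr:

```c
#include <assert.h>
#include <stddef.h>

/* Three groups of fields, only one of which is live at any time. */
struct ex_cache_part  { void *cache_ent; void *list_prev, *list_next; };
struct ex_kernel_part { void *descs; size_t desc_size; int ndescs; };
struct ex_user_part   { void *umem; void *parent; unsigned long stats[4]; };

/* Layout with every group stored side by side. */
struct ex_mr_separate {
    int type;
    struct ex_cache_part cache;
    struct ex_kernel_part kernel;
    struct ex_user_part user;
};

/* Layout with the groups overlaid in an anonymous union (C11). */
struct ex_mr_overlaid {
    int type;
    union {
        struct ex_cache_part cache;
        struct ex_kernel_part kernel;
        struct ex_user_part user;
    };
};

/* The union costs only as much as its largest member. */
static_assert(sizeof(struct ex_mr_overlaid) < sizeof(struct ex_mr_separate),
              "overlaying the three roles should shrink the struct");
```

The trade-off is that code must know which role the object currently has before touching a union member, which is why the commit adds extra checks on the dereg path where the flows overlap.
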
  9. RDMA/mlx5: Zero out ODP related items in the mlx5_ib_mr

    All of the ODP code assumes when it calls mlx5_mr_cache_alloc() the ODP
    related fields are zero'd. This is true if the MR was just allocated, but
    if the MR is recycled through the cache then the values are never zero'd.
    
    This causes a bug in the odp_stats, they don't reset when the MR is
    reallocated, also is_odp_implicit is never 0'd.
    
    So that memset can be used on a block of the mlx5_ib_mr, reorganize the
    structure to put all the data that can be zero'd by the cache at the end.
    
    It is organized as an anonymous struct because the next patch will make
    this a union.
    
    Delete the unused smr_info. Don't set the kernel only desc_size on the
    user path. No longer any need to zero mr->parent before freeing it, the
    memset() will get it now.
    
    Fixes: a3de94e ("IB/mlx5: Introduce ODP diagnostic counters")
    Link: https://lore.kernel.org/r/20210304120745.1090751-2-leon@kernel.org
    Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    jgunthorpe committed Mar 12, 2021
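
The "memset on a block at the end of the structure" approach from "RDMA/mlx5: Zero out ODP related items in the mlx5_ib_mr" above can be sketched as follows. The structure is a made-up miniature with hypothetical field names; only the technique (offsetof plus memset over the tail) mirrors the commit message:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct ex_mr {
    /* Fields that must survive a trip through the cache. */
    unsigned int mkey;
    void *cache_ent;

    /* Everything from here to the end is cleared when the object is
     * handed out again. Grouped in an anonymous struct (C11). */
    struct {
        void *umem;
        unsigned long odp_faults;
        int is_odp_implicit;
    };
};

/* Zero the resettable tail in one call, leaving the head untouched. */
static inline void ex_mr_reset_tail(struct ex_mr *mr)
{
    size_t start = offsetof(struct ex_mr, umem);

    assert(mr != NULL);
    memset((char *)mr + start, 0, sizeof(*mr) - start);
}
```

Keeping the resettable fields contiguous at the end means a newly added field is cleared automatically as long as it is placed inside that block.
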

Commits on Mar 11, 2021

  1. RDMA/hns: Add support for XRC on HIP09

    The HIP09 supports the XRC transport service, which greatly reduces the
    number of QPs required to connect all processes in a large cluster.
    
    Link: https://lore.kernel.org/r/1614826558-35423-1-git-send-email-liweihang@huawei.com
    Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
    Signed-off-by: Weihang Li <liweihang@huawei.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    Wenpeng Liang authored and jgunthorpe committed Mar 11, 2021
  2. RDMA/rtrs-clt: Use rdma_event_msg in log

    It's easier to understand a string instead of enum.
    
    Link: https://lore.kernel.org/r/20210222141551.54345-2-jinpu.wang@cloud.ionos.com
    Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
    Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    Jack Wang authored and jgunthorpe committed Mar 11, 2021
  3. RDMA/rtrs: Use new shared CQ mechanism

    Have the driver use shared CQs which provides a ~10%-20% improvement during
    testing.
    
    Instead of opening a CQ for each QP per connection, a CQ for each QP will
    be provided by the RDMA core driver that will be shared between the QPs on
    that core reducing interrupt overhead.
    
    Link: https://lore.kernel.org/r/20210222141551.54345-1-jinpu.wang@cloud.ionos.com
    Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
    Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    Jack Wang authored and jgunthorpe committed Mar 11, 2021
  4. RDMA/core: Remove unused req_ncomp_notif device operation

    The request_ncomp_notif device operation and function are unused, remove
    them.
    
    Link: https://lore.kernel.org/r/20210311150921.23726-1-galpress@amazon.com
    Signed-off-by: Gal Pressman <galpress@amazon.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    galpress authored and jgunthorpe committed Mar 11, 2021

Commits on Mar 10, 2021

  1. RDMA/iwcm: Allow AFONLY binding for IPv6 addresses

    Binding IPv6 address/port to AF_INET6 domain only is provided via
    rdma_set_afonly(), but was not signalled to the provider.  Applications
    like NFS/RDMA bind the same port to both IPv4 and IPv6 addresses
    simultaneously and thus rely on it working correctly.
    
    Link: https://lore.kernel.org/r/20210219143441.1068-1-bmt@zurich.ibm.com
    Tested-by: Chuck Lever <chuck.lever@oracle.com>
    Tested-by: Benjamin Coddington <bcodding@redhat.com>
    Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    BernardMetzler authored and jgunthorpe committed Mar 10, 2021
  2. RDMA/hns: Use new SQ doorbell register for HIP09

    HIP09 uses new address space to map SQ doorbell registers, the doorbell of
    each QP is isolated based on the size of 64KB, which can improve the
    performance in concurrency scenarios.
    
    Link: https://lore.kernel.org/r/1614082833-23130-1-git-send-email-liweihang@huawei.com
    Signed-off-by: Lang Cheng <chenglang@huawei.com>
    Signed-off-by: Weihang Li <liweihang@huawei.com>
    Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    larrch authored and jgunthorpe committed Mar 10, 2021

Commits on Mar 6, 2021

  1. Linux 5.12-rc2

    torvalds committed Mar 6, 2021
  2. Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/gi…

    …t/rdma/rdma
    
    Pull rdma fixes from Jason Gunthorpe:
     "Nothing special here, though Bob's regression fixes for rxe would have
      made it before the rc cycle had there not been such strong winter
      weather!
    
       - Fix corner cases in the rxe reference counting cleanup that are
         causing regressions in blktests for SRP
    
       - Two kdoc fixes so W=1 is clean
    
       - Missing error return in error unwind for mlx5
    
       - Wrong lock type nesting in IB CM"
    
    * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
      RDMA/rxe: Fix errant WARN_ONCE in rxe_completer()
      RDMA/rxe: Fix extra deref in rxe_rcv_mcast_pkt()
      RDMA/rxe: Fix missed IB reference counting in loopback
      RDMA/uverbs: Fix kernel-doc warning of _uverbs_alloc
      RDMA/mlx5: Set correct kernel-doc identifier
      IB/mlx5: Add missing error code
      RDMA/rxe: Fix missing kconfig dependency on CRYPTO
      RDMA/cm: Fix IRQ restore in ib_send_cm_sidr_rep
    torvalds committed Mar 6, 2021
  3. Merge tag 'gcc-plugins-v5.12-rc2' of git://git.kernel.org/pub/scm/lin…

    …ux/kernel/git/kees/linux
    
    Pull gcc-plugins fixes from Kees Cook:
     "Tiny gcc-plugin fixes for v5.12-rc2. These issues are small but have
      been reported a couple times now by static analyzers, so best to get
      them fixed to reduce the noise. :)
    
       - Fix coding style issues (Jason Yan)"
    
    * tag 'gcc-plugins-v5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
      gcc-plugins: latent_entropy: remove unneeded semicolon
      gcc-plugins: structleak: remove unneeded variable 'ret'
    torvalds committed Mar 6, 2021
  4. Merge tag 'pstore-v5.12-rc2' of git://git.kernel.org/pub/scm/linux/ke…

    …rnel/git/kees/linux
    
    Pull pstore fixes from Kees Cook:
    
     - Rate-limit ECC warnings (Dmitry Osipenko)
    
     - Fix error path check for NULL (Tetsuo Handa)
    
    * tag 'pstore-v5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
      pstore/ram: Rate-limit "uncorrectable error in header" message
      pstore: Fix warning in pstore_kill_sb()
    torvalds committed Mar 6, 2021

Commits on Mar 5, 2021

  1. Merge tag 'for-5.12/dm-fixes' of git://git.kernel.org/pub/scm/linux/k…

    …ernel/git/device-mapper/linux-dm
    
    Pull device mapper fixes from Mike Snitzer:
     "Fix DM verity target's optional Forward Error Correction (FEC) for
      Reed-Solomon roots that are unaligned to block size"
    
    * tag 'for-5.12/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
      dm verity: fix FEC for RS roots unaligned to block size
      dm bufio: subtract the number of initial sectors in dm_bufio_get_device_size
    torvalds committed Mar 5, 2021
  2. Merge tag 'block-5.12-2021-03-05' of git://git.kernel.dk/linux-block

    Pull block fixes from Jens Axboe:
    
     - NVMe fixes:
          - more device quirks (Julian Einwag, Zoltán Böszörményi, Pascal
            Terjan)
          - fix a hwmon error return (Daniel Wagner)
          - fix the keep alive timeout initialization (Martin George)
          - ensure the model_number can't be changed on a used subsystem
            (Max Gurtovoy)
    
     - rsxx missing -EFAULT on copy_to_user() failure (Dan)
    
     - rsxx remove unused linux.h include (Tian)
    
     - kill unused RQF_SORTED (Jean)
    
     - updated outdated BFQ comments (Joseph)
    
     - revert work-around commit for bd_size_lock, since we removed the
       offending user in this merge window (Damien)
    
    * tag 'block-5.12-2021-03-05' of git://git.kernel.dk/linux-block:
      nvmet: model_number must be immutable once set
      nvme-fabrics: fix kato initialization
      nvme-hwmon: Return error code when registration fails
      nvme-pci: add quirks for Lexar 256GB SSD
      nvme-pci: mark Kingston SKC2000 as not supporting the deepest power state
      nvme-pci: mark Seagate Nytro XM1440 as QUIRK_NO_NS_DESC_LIST.
      rsxx: Return -EFAULT if copy_to_user() fails
      block/bfq: update comments and default value in docs for fifo_expire
      rsxx: remove unused including <linux/version.h>
      block: Drop leftover references to RQF_SORTED
      block: revert "block: fix bd_size_lock use"
    torvalds committed Mar 5, 2021
  3. Merge tag 'io_uring-5.12-2021-03-05' of git://git.kernel.dk/linux-block

    Pull io_uring fixes from Jens Axboe:
     "A bit of a mix between fallout from the worker change, cleanups and
      reductions now possible from that change, and fixes in general. In
      detail:
    
       - Fully serialize manager and worker creation, fixing races due to
         that.
    
       - Clean up some naming that had gone stale.
    
       - SQPOLL fixes.
    
       - Fix race condition around task_work rework that went into this
         merge window.
    
       - Implement unshare. Used for when the original task does unshare(2)
         or setuid/seteuid and friends, drops the original workers and forks
         new ones.
    
       - Drop the only remaining piece of state shuffling we had left, which
         was cred. Move it into issue instead, and we can drop all of that
         code too.
    
       - Kill f_op->flush() usage. That was such a nasty hack that we had
         out of necessity, we no longer need it.
    
       - Following from ->flush() removal, we can also drop various bits of
         ctx state related to SQPOLL and cancelations.
    
       - Fix an issue with IOPOLL retry, which originally was fallout from a
         filemap change (removing iov_iter_revert()), but uncovered an issue
         with iovec re-import too late.
    
       - Fix an issue with system suspend.
    
       - Use xchg() for fallback work, instead of cmpxchg().
    
       - Properly destroy io-wq on exec.
    
       - Add create_io_thread() core helper, and use that in io-wq and
         io_uring. This allows us to remove various silly completion events
         related to thread setup.
    
       - A few error handling fixes.
    
      This should be the grunt of fixes necessary for the new workers, next
      week should be quieter. We've got a pending series from Pavel on
      cancelations, and how tasks and rings are indexed. Outside of that,
      should just be minor fixes. Even with these fixes, we're still killing
      a net ~80 lines"
    
    * tag 'io_uring-5.12-2021-03-05' of git://git.kernel.dk/linux-block: (41 commits)
      io_uring: don't restrict issue_flags for io_openat
      io_uring: make SQPOLL thread parking saner
      io-wq: kill hashed waitqueue before manager exits
      io_uring: clear IOCB_WAITQ for non -EIOCBQUEUED return
      io_uring: don't keep looping for more events if we can't flush overflow
      io_uring: move to using create_io_thread()
      kernel: provide create_io_thread() helper
      io_uring: reliably cancel linked timeouts
      io_uring: cancel-match based on flags
      io-wq: ensure all pending work is canceled on exit
      io_uring: ensure that threads freeze on suspend
      io_uring: remove extra in_idle wake up
      io_uring: inline __io_queue_async_work()
      io_uring: inline io_req_clean_work()
      io_uring: choose right tctx->io_wq for try cancel
      io_uring: fix -EAGAIN retry with IOPOLL
      io-wq: fix error path leak of buffered write hash map
      io_uring: remove sqo_task
      io_uring: kill sqo_dead and sqo submission halting
      io_uring: ignore double poll add on the same waitqueue head
      ...
    torvalds committed Mar 5, 2021
  4. Merge tag 'pm-5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/…

    …git/rafael/linux-pm
    
    Pull power management fixes from Rafael Wysocki:
     "These fix the usage of device links in the runtime PM core code and
      update the DTPM (Dynamic Thermal Power Management) feature added
      recently.
    
      Specifics:
    
       - Make the runtime PM core code avoid attempting to suspend supplier
         devices before updating the PM-runtime status of a consumer to
         'suspended' (Rafael Wysocki).
    
       - Fix DTPM (Dynamic Thermal Power Management) root node
         initialization and label that feature as EXPERIMENTAL in Kconfig
         (Daniel Lezcano)"
    
    * tag 'pm-5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
      powercap/drivers/dtpm: Add the experimental label to the option description
      powercap/drivers/dtpm: Fix root node initialization
      PM: runtime: Update device status before letting suppliers suspend
    torvalds committed Mar 5, 2021
  5. Merge tag 'acpi-5.12-rc2' of git://git.kernel.org/pub/scm/linux/kerne…

    …l/git/rafael/linux-pm
    
    Pull ACPI fix from Rafael Wysocki:
     "Make the empty stubs of some helper functions used when CONFIG_ACPI is
      not set actually match those functions (Andy Shevchenko)"
    
    * tag 'acpi-5.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
      ACPI: bus: Constify is_acpi_node() and friends (part 2)
    torvalds committed Mar 5, 2021
  6. Merge tag 'iommu-fixes-v5.12-rc1' of git://git.kernel.org/pub/scm/lin…

    …ux/kernel/git/joro/iommu
    
    Pull iommu fixes from Joerg Roedel:
    
     - Fix a sleeping-while-atomic issue in the AMD IOMMU code
    
     - Disable lazy IOTLB flush for untrusted devices in the Intel VT-d
       driver
    
     - Fix status code definitions for Intel VT-d
    
     - Fix IO Page Fault issue in Tegra IOMMU driver
    
    * tag 'iommu-fixes-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
      iommu/vt-d: Fix status code for Allocate/Free PASID command
      iommu: Don't use lazy flush for untrusted device
      iommu/tegra-smmu: Fix mc errors on tegra124-nyan
      iommu/amd: Fix sleeping in atomic in increase_address_space()
    torvalds committed Mar 5, 2021
  7. Merge tag 'for-5.12-rc1-tag' of git://git.kernel.org/pub/scm/linux/ke…

    …rnel/git/kdave/linux
    
    Pull btrfs fixes from David Sterba:
     "More regression fixes and stabilization.
    
      Regressions:
    
       - zoned mode
          - count zone sizes in wider int types
          - fix space accounting for read-only block groups
    
       - subpage: fix page tail zeroing
    
      Fixes:
    
       - fix spurious warning when remounting with free space tree
    
       - fix warning when creating a directory with smack enabled
    
       - ioctl checks for qgroup inheritance when creating a snapshot
    
       - qgroup
          - fix missing unlock on error path in zero range
          - fix amount of released reservation on error
          - fix flushing from unsafe context with open transaction,
            potentially deadlocking
    
       - minor build warning fixes"
    
    * tag 'for-5.12-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
      btrfs: zoned: do not account freed region of read-only block group as zone_unusable
      btrfs: zoned: use sector_t for zone sectors
      btrfs: subpage: fix the false data csum mismatch error
      btrfs: fix warning when creating a directory with smack enabled
      btrfs: don't flush from btrfs_delayed_inode_reserve_metadata
      btrfs: export and rename qgroup_reserve_meta
      btrfs: free correct amount of space in btrfs_delayed_inode_reserve_metadata
      btrfs: fix spurious free_space_tree remount warning
      btrfs: validate qgroup inherit for SNAP_CREATE_V2 ioctl
      btrfs: unlock extents in btrfs_zero_range in case of quota reservation errors
      btrfs: ref-verify: use 'inline void' keyword ordering
    torvalds committed Mar 5, 2021