Commits on Jan 25, 2022

  1. crypto: arm64/polyval: Add PMULL accelerated implementation of POLYVAL

    Add hardware accelerated version of POLYVAL for ARM64 CPUs with
    Crypto Extension support.
    
    This implementation is accelerated using PMULL instructions to perform
    the finite field computations.  For added efficiency, 8 blocks of the
    plaintext are processed simultaneously by precomputing the first 8
    powers of the key.
    
    Karatsuba multiplication is used instead of Schoolbook multiplication
    because it was found to be slightly faster on ARM64 CPUs.  Montgomery
    reduction must be used instead of Barrett reduction due to the
    difference in modulus between POLYVAL's field and other finite fields.
    
    Signed-off-by: Nathan Huckleberry <nhuck@google.com>
    nhukc authored and intel-lab-lkp committed Jan 25, 2022
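
The Karatsuba versus schoolbook trade-off mentioned in this commit can be sketched in portable C. This is only an illustration, not the kernel's assembly: `clmul64` is a hypothetical software stand-in for a single PMULL instruction, the type and function names are made up for the example, and the Montgomery reduction step is left out entirely.

```c
#include <stdint.h>

/* Polynomials over GF(2), least significant 64-bit word first. */
typedef struct { uint64_t lo, hi; } poly128;
typedef struct { uint64_t w[4]; } poly256;

/* Hypothetical software stand-in for one PMULL: 64x64 -> 128-bit
 * carryless multiply. */
static poly128 clmul64(uint64_t a, uint64_t b)
{
    poly128 r = {0, 0};

    for (int i = 0; i < 64; i++) {
        if ((b >> i) & 1) {
            r.lo ^= a << i;
            if (i)
                r.hi ^= a >> (64 - i);
        }
    }
    return r;
}

/* Schoolbook: four 64-bit carryless multiplies. */
static poly256 mul_schoolbook(poly128 a, poly128 b)
{
    poly128 ll = clmul64(a.lo, b.lo);
    poly128 hh = clmul64(a.hi, b.hi);
    poly128 lh = clmul64(a.lo, b.hi);
    poly128 hl = clmul64(a.hi, b.lo);
    poly256 r;

    r.w[0] = ll.lo;
    r.w[1] = ll.hi ^ lh.lo ^ hl.lo;
    r.w[2] = hh.lo ^ lh.hi ^ hl.hi;
    r.w[3] = hh.hi;
    return r;
}

/* Karatsuba: three multiplies; the middle term is recovered with XORs. */
static poly256 mul_karatsuba(poly128 a, poly128 b)
{
    poly128 ll = clmul64(a.lo, b.lo);
    poly128 hh = clmul64(a.hi, b.hi);
    poly128 mid = clmul64(a.lo ^ a.hi, b.lo ^ b.hi);
    poly256 r;

    /* (a.lo + a.hi)(b.lo + b.hi) - ll - hh = a.lo*b.hi + a.hi*b.lo */
    mid.lo ^= ll.lo ^ hh.lo;
    mid.hi ^= ll.hi ^ hh.hi;

    r.w[0] = ll.lo;
    r.w[1] = ll.hi ^ mid.lo;
    r.w[2] = hh.lo ^ mid.hi;
    r.w[3] = hh.hi;
    return r;
}
```

Both functions return the same unreduced 256-bit product. Karatsuba trades one multiply for a few extra XORs, so which variant wins depends on the relative cost of those instructions on a given CPU.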
  2. crypto: x86/polyval: Add PCLMULQDQ accelerated implementation of POLYVAL

    Add hardware accelerated version of POLYVAL for x86-64 CPUs with
    PCLMULQDQ support.
    
    This implementation is accelerated using PCLMULQDQ instructions to
    perform the finite field computations.  For added efficiency, 8 blocks
    of the plaintext are processed simultaneously by precomputing the first
    8 powers of the key.
    
    Schoolbook multiplication is used instead of Karatsuba multiplication
    because it was found to be slightly faster on x86-64 machines.
    Montgomery reduction must be used instead of Barrett reduction due to
    the difference in modulus between POLYVAL's field and other finite
    fields.
    
    More information on POLYVAL can be found in the HCTR2 paper:
    Length-preserving encryption with HCTR2:
    https://eprint.iacr.org/2021/1441.pdf
    
    Signed-off-by: Nathan Huckleberry <nhuck@google.com>
    nhukc authored and intel-lab-lkp committed Jan 25, 2022
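
The idea of processing 8 blocks at once with precomputed key powers can be sketched as follows. To keep the example short it uses the toy field GF(2^8) with one-byte "blocks", which is an assumption made purely for illustration: POLYVAL itself works on 128-bit blocks in a different field, and all function names here are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define STRIDE 8

/* Toy field: GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (NOT POLYVAL's field). */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t r = 0;

    while (b) {
        if (b & 1)
            r ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1b : 0));
        b >>= 1;
    }
    return r;
}

/* Reference: one multiply per block, each depending on the previous one. */
static uint8_t hash_serial(uint8_t h, const uint8_t *m, size_t n)
{
    uint8_t acc = 0;

    for (size_t i = 0; i < n; i++)
        acc = gf_mul(acc ^ m[i], h);
    return acc;
}

/* Strided: 8 independent multiplies per step, using h^8 down to h^1. */
static uint8_t hash_strided(uint8_t h, const uint8_t *m, size_t n)
{
    uint8_t pow[STRIDE + 1];    /* pow[k] = h^k */
    uint8_t acc = 0;
    size_t i = 0;

    pow[0] = 1;
    for (int k = 1; k <= STRIDE; k++)
        pow[k] = gf_mul(pow[k - 1], h);

    for (; i + STRIDE <= n; i += STRIDE) {
        /* The running accumulator is folded into the first block. */
        uint8_t sum = gf_mul(acc ^ m[i], pow[STRIDE]);

        for (int j = 1; j < STRIDE; j++)
            sum ^= gf_mul(m[i + j], pow[STRIDE - j]);
        acc = sum;
    }
    for (; i < n; i++)          /* leftover blocks use the serial step */
        acc = gf_mul(acc ^ m[i], h);
    return acc;
}
```

The two functions compute the same value. The strided form removes the dependency chain between the multiplies inside a group of 8, which is what lets wide hardware multipliers be kept busy.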
  3. crypto: arm64/aes-xctr: Add accelerated implementation of XCTR

    Add hardware accelerated version of XCTR for ARM64 CPUs with ARMv8
    Crypto Extension support.  This XCTR implementation is based on the CTR
    implementation in aes-modes.S.
    
    More information on XCTR can be found in
    the HCTR2 paper: Length-preserving encryption with HCTR2:
    https://eprint.iacr.org/2021/1441.pdf
    
    Signed-off-by: Nathan Huckleberry <nhuck@google.com>
    nhukc authored and intel-lab-lkp committed Jan 25, 2022
  4. crypto: x86/aesni-xctr: Add accelerated implementation of XCTR

    Add hardware accelerated versions of XCTR for x86-64 CPUs with AESNI
    support.  These implementations are modified versions of the CTR
    implementations found in aesni-intel_asm.S and aes_ctrby8_avx-x86_64.S.
    
    More information on XCTR can be found in the HCTR2 paper:
    Length-preserving encryption with HCTR2:
    https://eprint.iacr.org/2021/1441.pdf
    
    Signed-off-by: Nathan Huckleberry <nhuck@google.com>
    nhukc authored and intel-lab-lkp committed Jan 25, 2022
  5. crypto: hctr2 - Add HCTR2 support

    Add support for HCTR2 as a template.  HCTR2 is a length-preserving
    encryption mode that is efficient on processors with instructions to
    accelerate AES and carryless multiplication, e.g. x86 processors with
    AES-NI and CLMUL, and ARM processors with the ARMv8 Crypto Extensions.
    
    As a length-preserving encryption mode, HCTR2 is suitable for
    applications such as storage encryption where ciphertext expansion is
    not possible, and thus authenticated encryption cannot be used.
    Currently, such applications usually use XTS, or in some cases Adiantum.
    XTS has the disadvantage that it is a narrow-block mode: a bitflip will
    only change 16 bytes in the resulting ciphertext or plaintext.  This
    reveals more information to an attacker than necessary.
    
    HCTR2 is a wide-block mode, so it provides a stronger security property:
    a bitflip will change the entire message.  HCTR2 is somewhat similar to
    Adiantum, which is also a wide-block mode.  However, HCTR2 is designed
    to take advantage of existing crypto instructions, while Adiantum
    targets devices without such hardware support.  Adiantum is also
    designed with longer messages in mind, while HCTR2 is designed to be
    efficient even on short messages.
    
    HCTR2 requires POLYVAL and XCTR as components.  More information on
    HCTR2 can be found here: Length-preserving encryption with HCTR2:
    https://eprint.iacr.org/2021/1441.pdf
    
    Signed-off-by: Nathan Huckleberry <nhuck@google.com>
    nhukc authored and intel-lab-lkp committed Jan 25, 2022
  6. crypto: polyval - Add POLYVAL support

    Add support for POLYVAL, an ε-universal hash function similar to GHASH.
    POLYVAL is used as a component to implement HCTR2 mode.
    
    POLYVAL is implemented as an shash algorithm.  The implementation is
    modified from ghash-generic.c.
    
    More information on POLYVAL can be found in the HCTR2 paper:
    https://eprint.iacr.org/2021/1441.pdf
    
    Signed-off-by: Nathan Huckleberry <nhuck@google.com>
    nhukc authored and intel-lab-lkp committed Jan 25, 2022
  7. crypto: xctr - Add XCTR support

    Add a generic implementation of XCTR mode as a template.  XCTR is a
    blockcipher mode similar to CTR mode.  XCTR uses XORs and little-endian
    addition rather than big-endian arithmetic which makes it slightly
    faster on little-endian CPUs.  It is used as a component to implement
    HCTR2.
    
    More information on XCTR mode can be found in the HCTR2 paper:
    https://eprint.iacr.org/2021/1441.pdf
    
    Signed-off-by: Nathan Huckleberry <nhuck@google.com>
    nhukc authored and intel-lab-lkp committed Jan 25, 2022
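
A minimal sketch of the XCTR keystream construction described in this commit, under stated assumptions: `toy_encrypt` is a hypothetical, insecure stand-in for the block cipher, the block counter is taken to start at 1 (as the HCTR2 paper appears to specify), and none of the names match the kernel template.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16

/* Hypothetical stand-in for the block cipher's encryption direction.
 * NOT AES and not secure; it only makes the example self-contained. */
static void toy_encrypt(const uint8_t key[BLK], const uint8_t in[BLK],
                        uint8_t out[BLK])
{
    uint8_t s = 0x5a;

    for (int i = 0; i < BLK; i++) {
        s = (uint8_t)((s ^ in[i] ^ key[i]) * 167u + 13u);
        out[i] = s;
    }
}

/* Counter block i (assumed 1-based): IV XOR the little-endian encoding
 * of i.  No big-endian increment with carry propagation is needed. */
static void xctr_block(const uint8_t iv[BLK], uint64_t i, uint8_t out[BLK])
{
    memcpy(out, iv, BLK);
    for (int b = 0; b < 8; b++)
        out[b] ^= (uint8_t)(i >> (8 * b));
}

/* Encrypts or decrypts len bytes; the operation is its own inverse. */
static void xctr_crypt(const uint8_t key[BLK], const uint8_t iv[BLK],
                       const uint8_t *src, uint8_t *dst, size_t len)
{
    uint8_t ctr[BLK], ks[BLK];

    for (size_t off = 0; off < len; off += BLK) {
        size_t n = len - off < BLK ? len - off : BLK;

        xctr_block(iv, off / BLK + 1, ctr);
        toy_encrypt(key, ctr, ks);
        for (size_t j = 0; j < n; j++)
            dst[off + j] = src[off + j] ^ ks[j];
    }
}
```

Because each counter block is derived from the IV and the block index alone, blocks can be generated independently and in any order.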

Commits on Jan 7, 2022

  1. crypto: af_alg - rewrite NULL pointer check

    alloc_page() can fail and return a NULL pointer.  The existing NULL
    check sits below the call to sg_assign_page(), so move it before
    sg_assign_page(), which is the more logical order.
    
    Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    JiangJias authored and herbertx committed Jan 7, 2022
  2. lib/mpi: Add the return value check of kcalloc()

    Add the return value check of kcalloc() to avoid potential
    NULL ptr dereference.
    
    Fixes: a8ea8bd ("lib/mpi: Extend the MPI library")
    Signed-off-by: Zizhuang Deng <sunsetdzz@gmail.com>
    Reviewed-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    enderdzz authored and herbertx committed Jan 7, 2022

Commits on Dec 31, 2021

  1. crypto: qat - fix definition of ring reset results

    The ring reset result values are defined starting from 0x1 instead of 0.
    This causes out-of-tree drivers that support this message to understand
    that a ring reset failed even if the operation was successful.
    
    Fix by starting the definition of ring reset result values from 0.
    
    Fixes: 0bba03c ("crypto: qat - add PFVF support to enable the reset of ring pairs")
    Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
    Reported-by: Adam Guerin <adam.guerin@intel.com>
    Reviewed-by: Marco Chiappero <marco.chiappero@intel.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    gcabiddu authored and herbertx committed Dec 31, 2021
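
The effect of the fix can be illustrated with a small enum. The enumerator names below are hypothetical and may not match the QAT PFVF headers; the point is only that the first value is pinned to 0 so that a peer treating 0 as success interprets the result correctly.

```c
/* Hypothetical result codes; the real names in the driver may differ. */
enum ring_reset_result {
    RING_RESET_SUCCESS = 0,     /* first enumerator pinned to 0 */
    RING_RESET_ERROR,           /* 1 */
    RING_RESET_UNSUPPORTED,     /* 2 */
};

/* A peer that treats any nonzero result as a failed reset. */
static int ring_reset_failed(unsigned int result)
{
    return result != RING_RESET_SUCCESS;
}
```

Had the enumeration started at 0x1, a successful reset would have been reported as 1 and such a peer would have read it as a failure.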
  2. crypto: hisilicon - cleanup warning in qm_get_qos_value()

    Building with clang static analysis returns this warning:
    
    qm.c:4382:11: warning: The left operand of '==' is a garbage value
            if (*val == 0 || *val > QM_QOS_MAX_VAL || ret) {
                ~~~~ ^
    
    The call to qm_qos_value_init() can return an error without setting
    *val.  So check ret before checking *val.
    
    Fixes: 72b010d ("crypto: hisilicon/qm - supports writing QoS int the host")
    Signed-off-by: Tom Rix <trix@redhat.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    trixirt authored and herbertx committed Dec 31, 2021
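
The reordered condition can be sketched in user-space C as below. The parser, the limit, and the function names are hypothetical stand-ins for the driver code; what matters is that `ret` is tested first, so short-circuit evaluation guarantees `*val` is only read after it has been written.

```c
#define QOS_MAX_VAL 1000UL      /* hypothetical limit */

/* Hypothetical parser: on failure it returns nonzero and leaves *val
 * untouched, like the helper described in the commit message. */
static int qos_value_init(const char *buf, unsigned long *val)
{
    unsigned long v = 0;

    if (!buf || !*buf)
        return -1;
    for (; *buf; buf++) {
        if (*buf < '0' || *buf > '9')
            return -1;
        v = v * 10 + (unsigned long)(*buf - '0');
    }
    *val = v;
    return 0;
}

static int get_qos_value(const char *buf, unsigned long *val)
{
    int ret = qos_value_init(buf, val);

    /* ret comes first: if parsing failed, *val is never dereferenced. */
    if (ret || *val == 0 || *val > QOS_MAX_VAL)
        return -1;
    return 0;
}
```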
  3. crypto: kdf - select SHA-256 required for self-test

    The self test of the KDF is based on SHA-256. Thus, this algorithm must
    be present as otherwise a warning is issued.
    
    Reported-by: kernel test robot <oliver.sang@intel.com>
    Signed-off-by: Stephan Mueller <smueller@chronox.de>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    smuellerDD authored and herbertx committed Dec 31, 2021
  4. crypto: x86/aesni - don't require alignment of data

    x86 AES-NI routines can deal with unaligned data. The crypto context
    (key, iv etc.) has to be aligned, but we take care of that separately
    by copying it onto the stack. We were feeding unaligned data into
    crypto routines up until commit 83c83e6 ("crypto: aesni -
    refactor scatterlist processing") switched to use the full
    skcipher API which uses cra_alignmask to decide data alignment.
    
    This fixes a 21% performance regression in kTLS.
    
    Tested by booting with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y
    (and running thru various kTLS packets).
    
    CC: stable@vger.kernel.org # 5.15+
    Fixes: 83c83e6 ("crypto: aesni - refactor scatterlist processing")
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    Acked-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Jakub Kicinski authored and herbertx committed Dec 31, 2021
  5. crypto: ccp - remove unneeded semicolon

    Eliminate the following coccicheck warning:
    ./drivers/crypto/ccp/sev-dev.c:263:2-3: Unneeded semicolon
    
    Reported-by: Abaci Robot <abaci@linux.alibaba.com>
    Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Yang Li authored and herbertx committed Dec 31, 2021
  6. crypto: stm32/crc32 - Fix kernel BUG triggered in probe()

    The include/linux/crypto.h struct crypto_alg field cra_driver_name description
    states "Unique name of the transformation provider. " ... " this contains the
    name of the chip or provider and the name of the transformation algorithm."
    
    In case of the stm32-crc driver, field cra_driver_name is identical for all
    registered transformation providers and set to the name of the driver itself,
    which is incorrect. This patch fixes it by assigning a unique cra_driver_name
    to each registered transformation provider.
    
    The kernel crash is triggered when the driver calls crypto_register_shashes()
    which calls crypto_register_shash(), which calls crypto_register_alg(), which
    calls __crypto_register_alg(), which returns -EEXIST, which is propagated
    back through this call chain. Upon -EEXIST from crypto_register_shash(), the
    crypto_register_shashes() starts unregistering the providers back, and calls
    crypto_unregister_shash(), which calls crypto_unregister_alg(), and this is
    where the BUG() triggers due to incorrect cra_refcnt.
    
    Fixes: b51dbe9 ("crypto: stm32 - Support for STM32 CRC32 crypto module")
    Signed-off-by: Marek Vasut <marex@denx.de>
    Cc: <stable@vger.kernel.org> # 4.12+
    Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
    Cc: Fabien Dessenne <fabien.dessenne@st.com>
    Cc: Herbert Xu <herbert@gondor.apana.org.au>
    Cc: Lionel Debieve <lionel.debieve@st.com>
    Cc: Nicolas Toromanoff <nicolas.toromanoff@st.com>
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-stm32@st-md-mailman.stormreply.com
    To: linux-crypto@vger.kernel.org
    Acked-by: Nicolas Toromanoff <nicolas.toromanoff@foss.st.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Marek Vasut authored and herbertx committed Dec 31, 2021
  7. crypto: s390/sha512 - Use macros instead of direct IV numbers

    In the init functions of sha512 and sha384, use macros instead of
    literal numbers for the initial hash values.
    
    Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    uudiin authored and herbertx committed Dec 31, 2021
  8. crypto: sparc/sha - remove duplicate hash init function

    The sha*_base_init() family of functions already implements the
    initialization of the hash context, so this commit uses sha*_base_init()
    to replace the repeated implementations.
    
    Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    uudiin authored and herbertx committed Dec 31, 2021
  9. crypto: powerpc/sha - remove duplicate hash init function

    The sha*_base_init() family of functions already implements the
    initialization of the hash context, so this commit uses sha*_base_init()
    to replace the repeated implementations.
    
    Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    uudiin authored and herbertx committed Dec 31, 2021
  10. crypto: mips/sha - remove duplicate hash init function

    The sha*_base_init() family of functions already implements the
    initialization of the hash context, so this commit uses sha*_base_init()
    to replace the repeated implementations.
    
    Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    uudiin authored and herbertx committed Dec 31, 2021
  11. crypto: sha256 - remove duplicate generic hash init function

    crypto_sha256_init() and sha256_base_init() are duplicate
    implementations, so remove crypto_sha256_init() from the generic
    implementation.  The same applies to sha224.
    
    Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    uudiin authored and herbertx committed Dec 31, 2021
  12. crypto: jitter - add oversampling of noise source

    The output n bits can receive more than n bits of min entropy, of course,
    but the fixed output of the conditioning function can only asymptotically
    approach the output size bits of min entropy, not attain that bound.
    Random maps will tend to have output collisions, which reduces the
    creditable output entropy (that is what SP 800-90B Section 3.1.5.1.2
    attempts to bound).
    
    The value "64" is justified in Appendix A.4 of the current 90C draft,
    and aligns with NIST's "epsilon" definition in this document, which is
    that a string can be considered "full entropy" if you can bound the min
    entropy in each bit of output to at least 1-epsilon, where epsilon is
    required to be <= 2^(-32).
    
    Note, this patch causes the Jitter RNG to cut its performance in half in
    FIPS mode because the conditioning function of the LFSR produces 64 bits
    of entropy in one block. The oversampling requires that additionally 64
    bits of entropy are sampled from the noise source. If the conditioner is
    changed, such as using SHA-256, the impact of the oversampling is only
    one fourth, because for the 256 bit block of the conditioner, only 64
    additional bits from the noise source must be sampled.
    
    This patch is derived from the user space jitterentropy-library.
    
    Signed-off-by: Stephan Mueller <smueller@chronox.de>
    Reviewed-by: Simo Sorce <simo@redhat.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    smuellerDD authored and herbertx committed Dec 31, 2021
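
The performance figures in this commit follow from simple arithmetic, sketched below. The function name is hypothetical; it models the sampling cost per conditioner output block as the block size plus the fixed 64 oversampled bits, relative to the block size alone.

```c
/* Relative sampling cost per output block:
 * (block bits + oversampled bits) / block bits. */
static double oversampling_cost(unsigned int block_bits,
                                unsigned int extra_bits)
{
    return (double)(block_bits + extra_bits) / (double)block_bits;
}
```

For the 64-bit LFSR block, `oversampling_cost(64, 64)` is 2.0, meaning twice the samples and hence half the throughput. For a 256-bit conditioner such as SHA-256, `oversampling_cost(256, 64)` is 1.25, meaning one fourth more samples.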
  13. MAINTAINERS: update SEC2 driver maintainers list

    Adding Kai Ye as SEC2 maintainer.
    
    Signed-off-by: Kai Ye <yekai13@huawei.com>
    Signed-off-by: Zaibo Xu <xuzaibo@huawei.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    yekai123123 authored and herbertx committed Dec 31, 2021

Commits on Dec 24, 2021

  1. crypto: ux500 - Use platform_get_irq() to get the interrupt

    platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static
    allocation of IRQ resources in DT core code.  This causes an issue
    when using hierarchical interrupt domains with the "interrupts" property
    in the node, as it bypasses the hierarchical setup and messes up the
    irq chaining.
    
    In preparation for removal of static setup of IRQ resource from DT core
    code use platform_get_irq() so that interrupt mapping is created on demand.
    
    While at it also store the IRQ number in struct cryp_device_data so that
    we don't have to call platform_get_irq() frequently.
    
    Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    prabhakarlad authored and herbertx committed Dec 24, 2021
  2. crypto: hisilicon/qm - disable qm clock-gating

    For Kunpeng930, if qm clock-gating is enabled, rate limiter
    will be inaccurate. Therefore, disable clock-gating before running tasks.
    
    Signed-off-by: Weili Qian <qianweili@huawei.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    Weili Qian authored and herbertx committed Dec 24, 2021
  3. crypto: omap-aes - Fix broken pm_runtime_and_get() usage

    This fix is basically the same as 3d6b661 ("crypto: stm32 -
    Revert broken pm_runtime_resume_and_get changes"), just for the omap
    driver. If the return value isn't used, then pm_runtime_get_sync()
    has to be used for ensuring that the usage count is balanced.
    
    Fixes: 1f34cc4 ("crypto: omap-aes - Fix PM reference leak on omap-aes.c")
    Cc: stable@vger.kernel.org
    Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    hkallweit authored and herbertx committed Dec 24, 2021
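
Why an ignored return value calls for pm_runtime_get_sync() can be shown with a toy model of the usage counter. This is not the kernel API: the struct and the `toy_*` functions are hypothetical and only mimic the documented behaviour that pm_runtime_get_sync() leaves the count raised on failure while pm_runtime_resume_and_get() drops it again.

```c
/* Toy model of a device's runtime-PM usage counter (not the kernel API). */
struct toy_dev {
    int usage_count;
    int resume_fails;   /* nonzero: simulate a failing resume */
};

/* Models pm_runtime_get_sync(): the count stays raised even on failure. */
static int toy_get_sync(struct toy_dev *d)
{
    d->usage_count++;
    return d->resume_fails ? -1 : 0;
}

/* Models pm_runtime_resume_and_get(): the count is dropped on failure. */
static int toy_resume_and_get(struct toy_dev *d)
{
    int ret = toy_get_sync(d);

    if (ret < 0)
        d->usage_count--;
    return ret;
}

/* Models the unconditional put on the caller's exit path. */
static void toy_put(struct toy_dev *d)
{
    d->usage_count--;
}
```

A caller that ignores the return value and always calls the put ends up balanced with the get_sync variant, but drives the counter negative with the resume_and_get variant whenever the resume fails.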
  4. MAINTAINERS: update caam crypto driver maintainers list

    Adding Gaurav as caam maintainer.
    
    Signed-off-by: Pankaj Gupta <pankaj.gupta@nxp.com>
    Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    pangupta authored and herbertx committed Dec 24, 2021
  5. crypto: octeontx2 - prevent underflow in get_cores_bmap()

    If we're going to cap "eng_grp->g->engs_num" upper bounds then we should
    cap the lower bounds as well.
    
    Fixes: 43ac0b8 ("crypto: octeontx2 - load microcode and create engine groups")
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    error27 authored and herbertx committed Dec 24, 2021
  6. crypto: octeontx2 - out of bounds access in otx2_cpt_dl_custom_egrp_d…

    …elete()
    
    If "egrp" is negative then it causes an out of bounds access in
    eng_grps->grp[].
    
    Fixes: d9d7749 ("crypto: octeontx2 - add apis for custom engine groups")
    Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    error27 authored and herbertx committed Dec 24, 2021
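
The kind of check this fix adds can be sketched as follows. The table size, struct layout, and function name are hypothetical; the sketch only shows a signed index being validated against both bounds before it is used to index the array.

```c
#define MAX_ENG_GRPS 8          /* hypothetical table size */

struct eng_grp {
    int in_use;
};

struct eng_grps {
    struct eng_grp grp[MAX_ENG_GRPS];
};

/* Returns 0 on success, -1 if egrp is outside [0, MAX_ENG_GRPS). */
static int delete_custom_egrp(struct eng_grps *grps, int egrp)
{
    /* Checking only the upper bound would let a negative index through. */
    if (egrp < 0 || egrp >= MAX_ENG_GRPS)
        return -1;

    grps->grp[egrp].in_use = 0;
    return 0;
}
```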
  7. crypto: qat - add support for compression for 4xxx

    Add the logic required to enable the compression service for 4xxx devices.
    This allows loading the compression firmware image and reporting
    the appropriate compression capabilities.
    
    The firmware image selection for a given device is based on the
    'ServicesEnabled' key stored in the internal configuration, which is
    added statically at the probe of the device according to the following
    rule, by default:
    - odd numbered devices assigned to compression services
    - even numbered devices assigned to crypto services
    
    In addition, restore the 'ServicesEnabled' key, if present, when SRIOV
    is enabled on the device.
    
    Signed-off-by: Tomasz Kowalik <tomaszx.kowalik@intel.com>
    Co-developed-by: Mateuszx Potrola <mateuszx.potrola@intel.com>
    Signed-off-by: Mateuszx Potrola <mateuszx.potrola@intel.com>
    Co-developed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
    Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
    Signed-off-by: Marco Chiappero <marco.chiappero@intel.com>
    Reviewed-by: Fiona Trahe <fiona.trahe@intel.com>
    Reviewed-by: Marco Chiappero <marco.chiappero@intel.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    tkowalix authored and herbertx committed Dec 24, 2021
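
The default assignment rule listed in this commit can be sketched as a parity check. The enum and function names are hypothetical, and the sketch assumes that "numbered" refers to a zero-based device index; the actual strings stored under 'ServicesEnabled' are not shown in the commit message and are not modelled here.

```c
/* Hypothetical service identifiers for the illustration. */
enum toy_service {
    SVC_CRYPTO,
    SVC_COMPRESSION,
};

/* Default rule: odd-numbered devices get compression, even-numbered
 * devices get crypto. */
static enum toy_service default_service(unsigned int dev_id)
{
    return (dev_id & 1) ? SVC_COMPRESSION : SVC_CRYPTO;
}
```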
  8. crypto: qat - allow detection of dc capabilities for 4xxx

    Add logic to allow the detection of data compression capabilities for
    4xxx devices.
    The capability detection logic has been refactored to separate the
    crypto capabilities from the compression ones.
    
    This patch is not updating the returned capability mask as, up to now,
    4xxx devices are configured only to handle crypto operations.
    
    Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
    Signed-off-by: Marco Chiappero <marco.chiappero@intel.com>
    Reviewed-by: Fiona Trahe <fiona.trahe@intel.com>
    Reviewed-by: Marco Chiappero <marco.chiappero@intel.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    gcabiddu authored and herbertx committed Dec 24, 2021
  9. crypto: qat - add PFVF support to enable the reset of ring pairs

    Extend support for resetting ring pairs on the device to VFs. Such
    reset happens by sending a request to the PF over the PFVF protocol.
    
    This patch defines two new PFVF messages and adds the PFVF logic for
    handling the request on PF, triggering the reset, and VFs, accepting the
    'success'/'error' response.
    
    This feature is GEN4 specific.
    
    This patch is based on earlier work done by Zelin Deng.
    
    Signed-off-by: Marco Chiappero <marco.chiappero@intel.com>
    Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
    Reviewed-by: Fiona Trahe <fiona.trahe@intel.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    intmc authored and herbertx committed Dec 24, 2021
  10. crypto: qat - add PFVF support to the GEN4 host driver

    So far PFVF support for GEN4 devices has been kept effectively disabled
    due to lack of support. This patch adds all the GEN4 specific logic to
    make PFVF fully functional on PF.
    
    Signed-off-by: Marco Chiappero <marco.chiappero@intel.com>
    Co-developed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
    Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
    Reviewed-by: Fiona Trahe <fiona.trahe@intel.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    intmc authored and herbertx committed Dec 24, 2021
  11. crypto: qat - config VFs based on ring-to-svc mapping

    Change the configuration logic for the VF driver to leverage the
    ring-to-service mappings now received via PFVF.
    
    While the driver config logic is not yet capable of supporting
    configurations other than the default mapping, make sure that both VF
    and PF share the same default configuration in order to work properly.
    
    Signed-off-by: Marco Chiappero <marco.chiappero@intel.com>
    Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
    Reviewed-by: Fiona Trahe <fiona.trahe@intel.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    intmc authored and herbertx committed Dec 24, 2021
  12. crypto: qat - exchange ring-to-service mappings over PFVF

    In addition to retrieving the device capabilities, a VF may also need to
    retrieve the mapping of its ring pairs to crypto and/or compression
    services in order to work properly.
    
    Make the VF receive the ring-to-service mappings from the PF by means of a
    new REQ_RING_SVC_MAP Block Message and add the request and response
    logic on VF and PF respectively. This change requires bumping the PFVF
    protocol to version 4.
    
    Signed-off-by: Marco Chiappero <marco.chiappero@intel.com>
    Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
    Reviewed-by: Fiona Trahe <fiona.trahe@intel.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    intmc authored and herbertx committed Dec 24, 2021
  13. crypto: qat - support fast ACKs in the PFVF protocol

    The original design and current implementation of the PFVF protocol
    expects the sender to both acquire and relinquish the ownership of the
    shared CSR by setting and clearing the "in use" pattern on the remote
    half of the register when sending a message. This happens regardless of
    the acknowledgment of the reception, to guarantee changes, including
    collisions, are surely detected.
    
    However, in the case of a request that requires a response, collisions
    can also be detected by the lack of a reply. This can be exploited to
    speed up and simplify the above behaviour, letting the receiver both
    acknowledge the message and release the CSR in a single transaction:
    
    1) the sender can return as soon as the message has been acknowledged
    2) the receiver doesn't have to wait long before acquiring ownership
    of the CSR for the response message, greatly improving the overall
    throughput.
    
    However, this improvement cannot be leveraged for fire-and-forget
    notifications, as it would be impossible for the sender to clearly
    distinguish between a collision and an ack immediately followed by a new
    message.
    
    This patch implements this optimization in a new version of the protocol
    (v3), which applies the fast-ack logic only whenever possible and
    guarantees backward compatibility with older versions. For requests, a
    new retry loop guarantees a correct behaviour.
    
    Signed-off-by: Marco Chiappero <marco.chiappero@intel.com>
    Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
    Reviewed-by: Fiona Trahe <fiona.trahe@intel.com>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
    intmc authored and herbertx committed Dec 24, 2021