Nathan-Huckleb…
Commits on Jan 25, 2022
-
crypto: arm64/polyval: Add PMULL accelerated implementation of POLYVAL
Add hardware accelerated version of POLYVAL for ARM64 CPUs with Crypto Extension support. This implementation is accelerated using PMULL instructions to perform the finite field computations. For added efficiency, 8 blocks of the plaintext are processed simultaneously by precomputing the first 8 powers of the key. Karatsuba multiplication is used instead of Schoolbook multiplication because it was found to be slightly faster on ARM64 CPUs. Montgomery reduction must be used instead of Barrett reduction due to the difference in modulus between POLYVAL's field and other finite fields. Signed-off-by: Nathan Huckleberry <nhuck@google.com>
-
crypto: x86/polyval: Add PCLMULQDQ accelerated implementation of POLYVAL
Add hardware accelerated version of POLYVAL for x86-64 CPUs with PCLMULQDQ support. This implementation is accelerated using PCLMULQDQ instructions to perform the finite field computations. For added efficiency, 8 blocks of the plaintext are processed simultaneously by precomputing the first 8 powers of the key. Schoolbook multiplication is used instead of Karatsuba multiplication because it was found to be slightly faster on x86-64 machines. Montgomery reduction must be used instead of Barrett reduction due to the difference in modulus between POLYVAL's field and other finite fields. More information on POLYVAL can be found in the HCTR2 paper: Length-preserving encryption with HCTR2: https://eprint.iacr.org/2021/1441.pdf Signed-off-by: Nathan Huckleberry <nhuck@google.com>
-
crypto: arm64/aes-xctr: Add accelerated implementation of XCTR
Add hardware accelerated version of XCTR for ARM64 CPUs with ARMv8 Crypto Extension support. This XCTR implementation is based on the CTR implementation in aes-modes.S. More information on XCTR can be found in the HCTR2 paper: Length-preserving encryption with HCTR2: https://eprint.iacr.org/2021/1441.pdf Signed-off-by: Nathan Huckleberry <nhuck@google.com>
-
crypto: x86/aesni-xctr: Add accelerated implementation of XCTR
Add hardware accelerated versions of XCTR for x86-64 CPUs with AESNI support. These implementations are modified versions of the CTR implementations found in aesni-intel_asm.S and aes_ctrby8_avx-x86_64.S. More information on XCTR can be found in the HCTR2 paper: Length-preserving encryption with HCTR2: https://enterprint.iacr.org/2021/1441.pdf Signed-off-by: Nathan Huckleberry <nhuck@google.com>
-
crypto: hctr2 - Add HCTR2 support
Add support for HCTR2 as a template. HCTR2 is a length-preserving encryption mode that is efficient on processors with instructions to accelerate AES and carryless multiplication, e.g. x86 processors with AES-NI and CLMUL, and ARM processors with the ARMv8 Crypto Extensions. As a length-preserving encryption mode, HCTR2 is suitable for applications such as storage encryption where ciphertext expansion is not possible, and thus authenticated encryption cannot be used. Currently, such applications usually use XTS, or in some cases Adiantum. XTS has the disadvantage that it is a narrow-block mode: a bitflip will only change 16 bytes in the resulting ciphertext or plaintext. This reveals more information to an attacker than necessary. HCTR2 is a wide-block mode, so it provides a stronger security property: a bitflip will change the entire message. HCTR2 is somewhat similar to Adiantum, which is also a wide-block mode. However, HCTR2 is designed to take advantage of existing crypto instructions, while Adiantum targets devices without such hardware support. Adiantum is also designed with longer messages in mind, while HCTR2 is designed to be efficient even on short messages. HCTR2 requires POLYVAL and XCTR as components. More information on HCTR2 can be found here: Length-preserving encryption with HCTR2: https://eprint.iacr.org/2021/1441.pdf Signed-off-by: Nathan Huckleberry <nhuck@google.com>
-
crypto: polyval - Add POLYVAL support
Add support for POLYVAL, an ε-universal hash function similar to GHASH. POLYVAL is used as a component to implement HCTR2 mode. POLYVAL is implemented as an shash algorithm. The implementation is modified from ghash-generic.c. More information on POLYVAL can be found in the HCTR2 paper: https://eprint.iacr.org/2021/1441.pdf Signed-off-by: Nathan Huckleberry <nhuck@google.com>
-
crypto: xctr - Add XCTR support
Add a generic implementation of XCTR mode as a template. XCTR is a blockcipher mode similar to CTR mode. XCTR uses XORs and little-endian addition rather than big-endian arithmetic which makes it slightly faster on little-endian CPUs. It is used as a component to implement HCTR2. More information on XCTR mode can be found in the HCTR2 paper: https://eprint.iacr.org/2021/1441.pdf Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Commits on Jan 7, 2022
-
crypto: af_alg - rewrite NULL pointer check
Because of the possible alloc failure of the alloc_page(), it could return NULL pointer. And there is a check below the sg_assign_page(). But it will be more logical to move the NULL check before the sg_assign_page(). Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
lib/mpi: Add the return value check of kcalloc()
Add the return value check of kcalloc() to avoid potential NULL ptr dereference. Fixes: a8ea8bd ("lib/mpi: Extend the MPI library") Signed-off-by: Zizhuang Deng <sunsetdzz@gmail.com> Reviewed-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Commits on Dec 31, 2021
-
crypto: qat - fix definition of ring reset results
The ring reset result values are defined starting from 0x1 instead of 0. This causes out-of-tree drivers that support this message to understand that a ring reset failed even if the operation was successful. Fix by starting the definition of ring reset result values from 0. Fixes: 0bba03c ("crypto: qat - add PFVF support to enable the reset of ring pairs") Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reported-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Marco Chiappero <marco.chiappero@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: hisilicon - cleanup warning in qm_get_qos_value()
Building with clang static analysis returns this warning: qm.c:4382:11: warning: The left operand of '==' is a garbage value if (*val == 0 || *val > QM_QOS_MAX_VAL || ret) { ~~~~ ^ The call to qm_qos_value_init() can return an error without setting *val. So check ret before checking *val. Fixes: 72b010d ("crypto: hisilicon/qm - supports writing QoS int the host") Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> -
crypto: kdf - select SHA-256 required for self-test
The self test of the KDF is based on SHA-256. Thus, this algorithm must be present as otherwise a warning is issued. Reported-by: kernel test robot <oliver.sang@intel.com> Signed-off-by: Stephan Mueller <smueller@chronox.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: x86/aesni - don't require alignment of data
x86 AES-NI routines can deal with unaligned data. Crypto context (key, iv etc.) have to be aligned but we take care of that separately by copying it onto the stack. We were feeding unaligned data into crypto routines up until commit 83c83e6 ("crypto: aesni - refactor scatterlist processing") switched to use the full skcipher API which uses cra_alignmask to decide data alignment. This fixes 21% performance regression in kTLS. Tested by booting with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y (and running thru various kTLS packets). CC: stable@vger.kernel.org # 5.15+ Fixes: 83c83e6 ("crypto: aesni - refactor scatterlist processing") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: ccp - remove unneeded semicolon
Eliminate the following coccicheck warning: ./drivers/crypto/ccp/sev-dev.c:263:2-3: Unneeded semicolon Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: stm32/crc32 - Fix kernel BUG triggered in probe()
The include/linux/crypto.h struct crypto_alg field cra_driver_name description states "Unique name of the transformation provider. " ... " this contains the name of the chip or provider and the name of the transformation algorithm." In case of the stm32-crc driver, field cra_driver_name is identical for all registered transformation providers and set to the name of the driver itself, which is incorrect. This patch fixes it by assigning a unique cra_driver_name to each registered transformation provider. The kernel crash is triggered when the driver calls crypto_register_shashes() which calls crypto_register_shash(), which calls crypto_register_alg(), which calls __crypto_register_alg(), which returns -EEXIST, which is propagated back through this call chain. Upon -EEXIST from crypto_register_shash(), the crypto_register_shashes() starts unregistering the providers back, and calls crypto_unregister_shash(), which calls crypto_unregister_alg(), and this is where the BUG() triggers due to incorrect cra_refcnt. Fixes: b51dbe9 ("crypto: stm32 - Support for STM32 CRC32 crypto module") Signed-off-by: Marek Vasut <marex@denx.de> Cc: <stable@vger.kernel.org> # 4.12+ Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Fabien Dessenne <fabien.dessenne@st.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Lionel Debieve <lionel.debieve@st.com> Cc: Nicolas Toromanoff <nicolas.toromanoff@st.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-stm32@st-md-mailman.stormreply.com To: linux-crypto@vger.kernel.org Acked-by: Nicolas Toromanoff <nicolas.toromanoff@foss.st.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: s390/sha512 - Use macros instead of direct IV numbers
In the init functions of sha512 and sha384, the initial hash value use macros instead of numbers. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: sparc/sha - remove duplicate hash init function
sha*_base_init() series functions has implemented the initialization of the hash context, this commit use sha*_base_init() function to replace repeated implementations. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: powerpc/sha - remove duplicate hash init function
sha*_base_init() series functions has implemented the initialization of the hash context, this commit use sha*_base_init() function to replace repeated implementations. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: mips/sha - remove duplicate hash init function
sha*_base_init() series functions has implemented the initialization of the hash context, this commit use sha*_base_init() function to replace repeated implementations. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: sha256 - remove duplicate generic hash init function
crypto_sha256_init() and sha256_base_init() are the same repeated implementations, remove the crypto_sha256_init() in generic implementation, sha224 is the same process. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: jitter - add oversampling of noise source
The output n bits can receive more than n bits of min entropy, of course, but the fixed output of the conditioning function can only asymptotically approach the output size bits of min entropy, not attain that bound. Random maps will tend to have output collisions, which reduces the creditable output entropy (that is what SP 800-90B Section 3.1.5.1.2 attempts to bound). The value "64" is justified in Appendix A.4 of the current 90C draft, and aligns with NIST's in "epsilon" definition in this document, which is that a string can be considered "full entropy" if you can bound the min entropy in each bit of output to at least 1-epsilon, where epsilon is required to be <= 2^(-32). Note, this patch causes the Jitter RNG to cut its performance in half in FIPS mode because the conditioning function of the LFSR produces 64 bits of entropy in one block. The oversampling requires that additionally 64 bits of entropy are sampled from the noise source. If the conditioner is changed, such as using SHA-256, the impact of the oversampling is only one fourth, because for the 256 bit block of the conditioner, only 64 additional bits from the noise source must be sampled. This patch is derived from the user space jitterentropy-library. Signed-off-by: Stephan Mueller <smueller@chronox.de> Reviewed-by: Simo Sorce <simo@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
MAINTAINERS: update SEC2 driver maintainers list
Adding Kai Ye as SEC2 maintainer. Signed-off-by: Kai Ye <yekai13@huawei.com> Signed-off-by: Zaibo Xu <xuzaibo@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Commits on Dec 24, 2021
-
crypto: ux500 - Use platform_get_irq() to get the interrupt
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining. In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq() so that interrupt mapping is created on demand. While at it also store the IRQ number in struct cryp_device_data so that we don't have to call platform_get_irq() frequently. Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: hisilicon/qm - disable qm clock-gating
For Kunpeng930, if qm clock-gating is enabled, rate limiter will be inaccurate. Therefore, disable clock-gating before doing task. Signed-off-by: Weili Qian <qianweili@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: omap-aes - Fix broken pm_runtime_and_get() usage
This fix is basically the same as 3d6b661 ("crypto: stm32 - Revert broken pm_runtime_resume_and_get changes"), just for the omap driver. If the return value isn't used, then pm_runtime_get_sync() has to be used for ensuring that the usage count is balanced. Fixes: 1f34cc4 ("crypto: omap-aes - Fix PM reference leak on omap-aes.c") Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
MAINTAINERS: update caam crypto driver maintainers list
Adding Gaurav as caam maintainer. Signed-off-by: Pankaj Gupta <pankaj.gupta@nxp.com> Reviewed-by: Horia Geantă <horia.geanta@nxp.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: octeontx2 - prevent underflow in get_cores_bmap()
If we're going to cap "eng_grp->g->engs_num" upper bounds then we should cap the lower bounds as well. Fixes: 43ac0b8 ("crypto: octeontx2 - load microcode and create engine groups") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: octeontx2 - out of bounds access in otx2_cpt_dl_custom_egrp_d…
…elete() If "egrp" is negative then it is causes an out of bounds access in eng_grps->grp[]. Fixes: d9d7749 ("crypto: octeontx2 - add apis for custom engine groups") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: qat - add support for compression for 4xxx
Add the logic required to enable the compression service for 4xxx devices. This allows to load the compression firmware image and report the appropriate compression capabilities. The firmware image selection for a given device is based on the 'ServicesEnabled' key stored in the internal configuration, which is added statically at the probe of the device according to the following rule, by default: - odd numbered devices assigned to compression services - even numbered devices assigned to crypto services In addition, restore the 'ServicesEnabled' key, if present, when SRIOV is enabled on the device. Signed-off-by: Tomasz Kowalik <tomaszx.kowalik@intel.com> Co-developed-by: Mateuszx Potrola <mateuszx.potrola@intel.com> Signed-off-by: Mateuszx Potrola <mateuszx.potrola@intel.com> Co-developed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Marco Chiappero <marco.chiappero@intel.com> Reviewed-by: Fiona Trahe <fiona.trahe@intel.com> Reviewed-by: Marco Chiappero <marco.chiappero@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: qat - allow detection of dc capabilities for 4xxx
Add logic to allow the detection of data compression capabilities for 4xxx devices. The capability detection logic has been refactored to separate the crypto capabilities from the compression ones. This patch is not updating the returned capability mask as, up to now, 4xxx devices are configured only to handle crypto operations. Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Marco Chiappero <marco.chiappero@intel.com> Reviewed-by: Fiona Trahe <fiona.trahe@intel.com> Reviewed-by: Marco Chiappero <marco.chiappero@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: qat - add PFVF support to enable the reset of ring pairs
Extend support for resetting ring pairs on the device to VFs. Such reset happens by sending a request to the PF over the PFVF protocol. This patch defines two new PFVF messages and adds the PFVF logic for handling the request on PF, triggering the reset, and VFs, accepting the 'success'/'error' response. This feature is GEN4 specific. This patch is based on earlier work done by Zelin Deng. Signed-off-by: Marco Chiappero <marco.chiappero@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Fiona Trahe <fiona.trahe@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: qat - add PFVF support to the GEN4 host driver
So far PFVF support for GEN4 devices has been kept effectively disabled due to lack of support. This patch adds all the GEN4 specific logic to make PFVF fully functional on PF. Signed-off-by: Marco Chiappero <marco.chiappero@intel.com> Co-developed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Fiona Trahe <fiona.trahe@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: qat - config VFs based on ring-to-svc mapping
Change the configuration logic for the VF driver to leverage the ring-to-service mappings now received via PFVF. While the driver config logic is not yet capable of supporting configurations other than the default mapping, make sure that both VF and PF share the same default configuration in order to work properly. Signed-off-by: Marco Chiappero <marco.chiappero@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Fiona Trahe <fiona.trahe@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: qat - exchange ring-to-service mappings over PFVF
In addition to retrieving the device capabilities, a VF may also need to retrieve the mapping of its ring pairs to crypto and or compression services in order to work properly. Make the VF receive the ring-to-service mappings from the PF by means of a new REQ_RING_SVC_MAP Block Message and add the request and response logic on VF and PF respectively. This change requires to bump the PFVF protocol to version 4. Signed-off-by: Marco Chiappero <marco.chiappero@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Fiona Trahe <fiona.trahe@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
crypto: qat - support fast ACKs in the PFVF protocol
The original design and current implementation of the PFVF protocol expects the sender to both acquire and relinquish the ownership of the shared CSR by setting and clearing the "in use" pattern on the remote half of the register when sending a message. This happens regardless of the acknowledgment of the reception, to guarantee changes, including collisions, are surely detected. However, in the case of a request that requires a response, collisions can also be detected by the lack of a reply. This can be exploited to speed up and simplify the above behaviour, letting the receiver both acknowledge the message and release the CSR in a single transaction: 1) the sender can return as soon as the message has been acknowledged 2) the receiver doesn't have to wait long before acquiring ownership of the CSR for the response message, greatly improving the overall throughput. Howerver, this improvement cannot be leveraged for fire-and-forget notifications, as it would be impossible for the sender to clearly distinguish between a collision and an ack immediately followed by a new message. This patch implements this optimization in a new version of the protocol (v3), which applies the fast-ack logic only whenever possible and guarantees backward compatibility with older versions. For requests, a new retry loop guarantees a correct behaviour. Signed-off-by: Marco Chiappero <marco.chiappero@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Fiona Trahe <fiona.trahe@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>