v0.43.0 (2026-05-19)

@emmansun emmansun released this 19 May 07:53
b3b6fe1

v0.43.0

This release delivers major performance improvements across ML-KEM (arm64/amd64), ML-DSA (arm64/amd64), SM9 pairing, ZUC, and SM4, alongside two new packages (rand and tls13), an enhanced DRBG strategy mode, and internal API refinements.

Highlights

  • New rand package: cryptographically secure random number generator backed by GM/T 0105-2021 Hash-DRBG, with multi-source entropy hardening (OS, CPU jitter, and hash loop noise) and on-startup self-test
  • New tls13 package: TLS 1.3 key exchange primitives (including SM2/ECDH/X25519/Hybrid ECDH + ML-KEM support)
  • SM9 pairing speedup: G2 precomputation reduces Miller loop cost by ~27% and full pairing cost by ~15% when the G2 point (private/public key) is fixed
  • ML-KEM arm64 NEON optimizations: compress/encode (4/5/10/11-bit), decompress/decode, rejUniform, sampleNTT, ringCompressAndEncode1
  • ML-KEM amd64 AVX2 optimizations: compress/encode (10/11-bit), sampleNTT with precomputed twiddles
  • ML-DSA arm64 NEON optimizations: bitUnpack (signed 2^17/2^19), vectorMakeHint, nttMatRowVecMul
  • ML-DSA amd64 AVX2 optimizations: batch 2 (second wave of functions)
  • DRBG strategy mode (DrbgMode interface): separates GM/T 0105-2021 from NIST SP 800-90A behaviour without modifying core DRBG logic
  • DRBG API refinement: Generate now returns (reseedRequired bool, err error) instead of conflating a control-flow signal with an error value
  • SM4 ppc64 fixes: corrected GCM test cases on big-endian ppc64
  • ZUC asm improvements: amd64/arm64 LFSR restore optimized for readability and performance
  • s390x bigmod: vector addMulVVWy implementation

New Packages

rand

A drop-in replacement for crypto/rand backed by a per-CPU GM/T 0105-2021 Hash-DRBG pool. Key properties:

  • Entropy hardening: combines OS, CPU jitter, and hash loop noise entropy sources
  • On-startup DRBG known-answer self-test (GM/T 0105-2021 test vectors)
  • Automatic reseed on counter/time interval expiry
  • rand.Reader and rand.Read as the primary API surface
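
A minimal sketch of what "drop-in" means in practice, assuming the new package's rand.Reader satisfies io.Reader as crypto/rand.Reader does. The helper names newToken and defaultToken are hypothetical, and the sketch uses crypto/rand so that it runs with the standard library alone; code written against io.Reader like this should only need the reader swapped.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"io"
)

// newToken reads n random bytes from r and returns them hex-encoded.
// Accepting an io.Reader lets the caller pass any CSPRNG source.
func newToken(r io.Reader, n int) (string, error) {
	buf := make([]byte, n)
	if _, err := io.ReadFull(r, buf); err != nil {
		return "", fmt.Errorf("read random bytes: %w", err)
	}
	return hex.EncodeToString(buf), nil
}

// defaultToken uses crypto/rand.Reader. Assumption: the release's
// rand.Reader could be passed to newToken in exactly the same way.
func defaultToken(n int) (string, error) {
	return newToken(rand.Reader, n)
}

func main() {
	token, err := defaultToken(16)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(token)) // 16 bytes encode to 32 hex characters
}
```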

tls13

Key exchange primitives for TLS 1.3, including SM2, ECDH (P-256/P-384/P-521), X25519 and Hybrid ECDH + ML-KEM.
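
The tls13 package API is not shown in these notes, so the following is only an illustration of the kind of exchange such primitives perform. It uses crypto/ecdh from the Go standard library for the X25519 case; the function names exchange and secretsMatch are hypothetical.

```go
package main

import (
	"bytes"
	"crypto/ecdh"
	"crypto/rand"
	"fmt"
)

// exchange runs one X25519 key agreement and returns both sides' secrets.
func exchange() (clientSecret, serverSecret []byte, err error) {
	curve := ecdh.X25519()
	clientKey, err := curve.GenerateKey(rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	serverKey, err := curve.GenerateKey(rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	// Each side would send its public key bytes in a key share;
	// the peer parses those bytes back into a public key.
	serverPub, err := curve.NewPublicKey(serverKey.PublicKey().Bytes())
	if err != nil {
		return nil, nil, err
	}
	clientPub, err := curve.NewPublicKey(clientKey.PublicKey().Bytes())
	if err != nil {
		return nil, nil, err
	}
	if clientSecret, err = clientKey.ECDH(serverPub); err != nil {
		return nil, nil, err
	}
	if serverSecret, err = serverKey.ECDH(clientPub); err != nil {
		return nil, nil, err
	}
	return clientSecret, serverSecret, nil
}

// secretsMatch reports whether both sides derived the same shared secret.
func secretsMatch() (bool, error) {
	c, s, err := exchange()
	if err != nil {
		return false, err
	}
	return len(c) == 32 && bytes.Equal(c, s), nil
}

func main() {
	ok, err := secretsMatch()
	if err != nil {
		panic(err)
	}
	fmt.Println(ok)
}
```

In a hybrid exchange, a secret like this would be combined with an ML-KEM shared secret before entering the TLS 1.3 key schedule; that combination step is not shown here.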

Performance

SM9 (internal/sm9/bn256)

G2 precomputation (PrecomputeG2 / PairPrecomp) caches all 77 line evaluation coefficients for a fixed G2 twist point, eliminating G2 point arithmetic from the Miller loop at pairing time.

| Benchmark | Before | After | Δ |
|---|---|---|---|
| BenchmarkMiller | 158,340 ns | 115,918 ns | -27% |
| BenchmarkPairing (full) | 300,079 ns | 254,992 ns | -15% |
| PrecomputeG2 | - | 46,131 ns | one-time cost |

Applied automatically to EncryptPrivateKey (lazy-init on first use via sync.Once) and gen2Precomp (package-level precomputed Gen2).
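
The lazy-init pattern mentioned above can be sketched as follows. All types and names here (lineTable, privateKey, expensivePrecompute) are hypothetical stand-ins, not the library's internals; the point is only that sync.Once runs the one-time precomputation on first use, even under concurrent callers.

```go
package main

import (
	"fmt"
	"sync"
)

// lineTable stands in for the cached line coefficients (hypothetical type).
type lineTable struct{ coeffs []uint64 }

// precomputeCalls counts how often the expensive step runs (demo only).
var precomputeCalls int

// expensivePrecompute is a placeholder for the one-time precomputation.
func expensivePrecompute(seed uint64) *lineTable {
	precomputeCalls++
	t := &lineTable{coeffs: make([]uint64, 77)}
	for i := range t.coeffs {
		t.coeffs[i] = seed*uint64(i+1) + 1 // dummy values
	}
	return t
}

// privateKey is a hypothetical key type that caches its table lazily.
type privateKey struct {
	seed  uint64
	once  sync.Once
	table *lineTable
}

// precomputed builds the table on first call and reuses it afterwards.
func (k *privateKey) precomputed() *lineTable {
	k.once.Do(func() { k.table = expensivePrecompute(k.seed) })
	return k.table
}

func main() {
	k := &privateKey{seed: 7}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = k.precomputed()
		}()
	}
	wg.Wait()
	fmt.Println(precomputeCalls, len(k.precomputed().coeffs)) // prints "1 77"
}
```

Keys that are never used for pairing pay nothing, and keys that are used pay the one-time cost exactly once.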

GT.ScalarMult / GT.ScalarBaseMult now delegate to ScalarMultGT (4-bit window + Cyclo6Squares), replacing the previous binary gfP12.Exp with general squaring.
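
To illustrate the difference between binary and 4-bit windowed exponentiation, here is a sketch over small modular integers rather than GT elements. It uses ordinary modular squaring, not the cyclotomic squaring the release refers to, and the function names are hypothetical. Both routines perform the same number of squarings, but the windowed version does one multiplication per 4 exponent bits (plus 15 to build its table) instead of up to one per bit.

```go
package main

import "fmt"

// mulMod multiplies modulo m; m must be below 2^32 so the product fits in uint64.
func mulMod(a, b, m uint64) uint64 { return (a % m) * (b % m) % m }

// binaryExp is plain square-and-multiply, one exponent bit per step.
func binaryExp(base, exp, m uint64) uint64 {
	result := 1 % m
	for i := 63; i >= 0; i-- {
		result = mulMod(result, result, m)
		if (exp>>uint(i))&1 == 1 {
			result = mulMod(result, base, m)
		}
	}
	return result
}

// windowExp consumes the exponent 4 bits at a time using a 16-entry table
// holding base^0 .. base^15.
func windowExp(base, exp, m uint64) uint64 {
	var table [16]uint64
	table[0] = 1 % m
	for i := 1; i < 16; i++ {
		table[i] = mulMod(table[i-1], base, m)
	}
	result := 1 % m
	for shift := 60; shift >= 0; shift -= 4 {
		for j := 0; j < 4; j++ {
			result = mulMod(result, result, m) // four squarings per window
		}
		result = mulMod(result, table[(exp>>uint(shift))&0xF], m)
	}
	return result
}

func main() {
	const m = 1000003
	fmt.Println(binaryExp(3, 1000, m) == windowExp(3, 1000, m)) // prints true
}
```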

ML-KEM arm64 NEON (internal/mlkem)

Extensive NEON vectorization of the polynomial compress/encode and decompress/decode paths, plus the sampling and rejection-sampling functions. See PR #479 for details.

ML-KEM amd64 AVX2 (internal/mlkem)

AVX2 optimizations for compress/encode (10/11-bit) and for sampleNTT with precomputed twiddle factors (PR #478).

ML-DSA arm64 NEON (internal/mldsa)

NEON implementations of bitUnpackSignedTwoPower17, bitUnpackSignedTwoPower19, vectorMakeHint, and nttMatRowVecMul (PR #481).

ML-DSA amd64 AVX2 (internal/mldsa)

Second wave of AVX2 functions (PR #480), with qMinusZetasMontgomeryAVX2 reordered to avoid VPERMQ.

ZUC Assembly

  • arm64: LFSR restore (RESTORE_LFSR) optimized
  • amd64: LFSR restore optimized, improved code readability

s390x Bigmod

Vector implementation of addMulVVWy (PR #430).

API Changes

drbg — Breaking Change

DRBG.Generate signature changed:

// Before (v0.42.x)
Generate(b, additional []byte) error  // returned ErrReseedRequired as sentinel

// After (v0.43.0)
Generate(b, additional []byte) (reseedRequired bool, err error)

ErrReseedRequired is deprecated and retained only for source compatibility; it is no longer returned by any Generate implementation. Check the bool return value instead:

// Migration
reseedRequired, err := drbg.Generate(buf, nil)
if err != nil { /* handle real error */ }
if reseedRequired { /* call Reseed */ }
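
A fuller sketch of the migration, written against a hypothetical generator interface so that it runs on its own. Only the Generate signature comes from these notes; the Reseed() shape, the fakeDRBG stand-in, and the fill helper are assumptions for illustration and may not match the library.

```go
package main

import (
	"errors"
	"fmt"
)

// generator mirrors the v0.43.0 Generate signature quoted above.
// The Reseed method's shape is an assumption made for this sketch.
type generator interface {
	Generate(b, additional []byte) (reseedRequired bool, err error)
	Reseed() error
}

// fakeDRBG is a stand-in that requests a reseed after `limit` calls.
type fakeDRBG struct{ calls, limit, reseeds int }

func (d *fakeDRBG) Generate(b, additional []byte) (bool, error) {
	if d.calls >= d.limit {
		return true, nil // control-flow signal, not an error
	}
	d.calls++
	for i := range b {
		b[i] = byte(d.calls) // placeholder output, not random
	}
	return false, nil
}

func (d *fakeDRBG) Reseed() error { d.calls = 0; d.reseeds++; return nil }

// fill generates into b, reseeding once and retrying if asked to.
func fill(g generator, b []byte) error {
	reseedRequired, err := g.Generate(b, nil)
	if err != nil {
		return err // a real failure
	}
	if !reseedRequired {
		return nil
	}
	if err := g.Reseed(); err != nil {
		return err
	}
	reseedRequired, err = g.Generate(b, nil)
	if err != nil {
		return err
	}
	if reseedRequired {
		return errors.New("drbg still requires reseed after reseeding")
	}
	return nil
}

func main() {
	d := &fakeDRBG{limit: 2}
	buf := make([]byte, 4)
	for i := 0; i < 5; i++ {
		if err := fill(d, buf); err != nil {
			panic(err)
		}
	}
	fmt.Println(d.reseeds) // prints 2
}
```

Code that previously compared the returned error against ErrReseedRequired will silently stop reseeding after the upgrade, so that comparison is the spot to audit.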

drbg — Strategy Mode (DrbgMode)

The new DrbgMode interface encapsulates all behavioural differences between GM/T 0105-2021 and NIST SP 800-90A (entropy length constraints, time-based reseed, output size limits). Two predefined singletons are provided: drbg.GMMode and drbg.NISTMode.
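
The strategy idea can be sketched as below. The real DrbgMode methods are not listed in these notes, so the mode interface, the gmLike and nistLike types, and every numeric limit here are hypothetical placeholders. The sketch assumes only that one mode adds a time-based reseed condition while the other relies on the request counter alone.

```go
package main

import (
	"fmt"
	"time"
)

// mode is a hypothetical strategy interface, not the library's DrbgMode.
type mode interface {
	Name() string
	ReseedDue(counter uint64, age time.Duration) bool
}

// gmLike reseeds on either a counter limit or a seed-age limit.
type gmLike struct {
	maxCounter uint64
	maxAge     time.Duration
}

func (m gmLike) Name() string { return "gm-like" }
func (m gmLike) ReseedDue(counter uint64, age time.Duration) bool {
	return counter >= m.maxCounter || age >= m.maxAge
}

// nistLike reseeds on the counter limit only and ignores seed age.
type nistLike struct{ maxCounter uint64 }

func (m nistLike) Name() string { return "nist-like" }
func (m nistLike) ReseedDue(counter uint64, _ time.Duration) bool {
	return counter >= m.maxCounter
}

// Constructors take plain numbers; the limits are placeholders, not spec values.
func newGMLike(maxCounter uint64, maxAgeSeconds int) mode {
	return gmLike{maxCounter, time.Duration(maxAgeSeconds) * time.Second}
}

func newNISTLike(maxCounter uint64) mode { return nistLike{maxCounter} }

// reseedDue is what shared core logic would call; it never branches on
// which standard is in use.
func reseedDue(m mode, counter uint64, ageSeconds int) bool {
	return m.ReseedDue(counter, time.Duration(ageSeconds)*time.Second)
}

func main() {
	for _, m := range []mode{newGMLike(1000, 60), newNISTLike(1000)} {
		fmt.Println(m.Name(), reseedDue(m, 5, 120))
	}
}
```

With this shape, supporting another standard means adding a new mode value rather than touching the shared generate and reseed code.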

Bug Fixes

  • SM4 ppc64be: corrected GCM test cases on big-endian ppc64

Internal / Documentation

  • internal/sm9/bn256/README.md comprehensively documents all optimizations, tower structure, algorithm references (eprint links), and remaining improvement opportunities
  • drbg.setZero renamed to drbg.zeroize and simplified to clear(data) followed by runtime.KeepAlive(data); a comment explains the Go-specific limits on memory erasure and why the historical 0xFF multi-pass pattern is unnecessary for RAM
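
The zeroize shape described above can be sketched as a standalone function (requires Go 1.21 or later for the clear builtin). This reproduces only what the note states, not the library's actual source.

```go
package main

import (
	"fmt"
	"runtime"
)

// zeroize overwrites data with zeros, then keeps the slice alive so the
// stores are less likely to be treated as dead by the compiler.
// Go offers no hard guarantee that earlier copies of the data, made by
// the runtime or the caller, are also erased.
func zeroize(data []byte) {
	clear(data)
	runtime.KeepAlive(data)
}

func main() {
	secret := []byte{0xde, 0xad, 0xbe, 0xef}
	zeroize(secret)
	fmt.Println(secret) // prints [0 0 0 0]
}
```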

Dependencies and CI

  • github/codeql-action bumped through 4.35.5
  • step-security/harden-runner bumped through 2.19.3
  • CI: added ppc64be testing; re-enabled all platforms

Full Changelog

Compare: v0.42.0...v0.43.0