v0.43.0
This release delivers major performance improvements across ML-KEM (arm64/amd64), ML-DSA (arm64/amd64), SM9 pairing, ZUC, and SM4, alongside two new packages (rand and tls13), an enhanced DRBG strategy mode, and internal API refinements.
Highlights
- New rand package: cryptographically secure random number generator backed by GM/T 0105-2021 Hash-DRBG, with multi-source entropy hardening (OS, CPU jitter, and hash loop noise) and on-startup self-test
- New tls13 package: TLS 1.3 key exchange primitives (including SM2/ECDH/X25519/Hybrid ECDH + ML-KEM support)
- SM9 pairing speedup: G2 precomputation reduces Miller loop cost by ~27% and full pairing cost by ~15% when the G2 point (private/public key) is fixed
- ML-KEM arm64 NEON optimizations: compress/encode (4/5/10/11-bit), decompress/decode, rejUniform, sampleNTT, ringCompressAndEncode1
- ML-KEM amd64 AVX2 optimizations: compress/encode (10/11-bit), sampleNTT with precomputed twiddles
- ML-DSA arm64 NEON optimizations: bitUnpack (signed 2^17/2^19), vectorMakeHint, nttMatRowVecMul
- ML-DSA amd64 AVX2 optimizations: batch 2 (second wave of functions)
- DRBG strategy mode (DrbgMode interface): separates GM/T 0105-2021 from NIST SP 800-90A behaviour without modifying core DRBG logic
- DRBG API refinement: Generate now returns (reseedRequired bool, err error) instead of conflating a control-flow signal with an error value
- SM4 ppc64 fixes: test case correctness fixes for big-endian ppc64 GCM
- ZUC asm improvements: amd64/arm64 LFSR restore optimized for readability and performance
- s390x bigmod: vector addMulVVWy implementation
New Packages
rand
A drop-in replacement for crypto/rand backed by a per-CPU GM/T 0105-2021 Hash-DRBG pool. Key properties:
- Entropy hardening: OS, CPU jitter, and hash loop noise entropy sources
- On-startup DRBG known-answer self-test (GM/T 0105-2021 test vectors)
- Automatic reseed on counter/time interval expiry
- rand.Reader and rand.Read as the primary API surface
tls13
Key exchange primitives for TLS 1.3, including SM2, ECDH (P-256/P-384/P-521), X25519 and Hybrid ECDH + ML-KEM.
Performance
SM9 (internal/sm9/bn256)
G2 precomputation (PrecomputeG2 / PairPrecomp) caches all 77 line evaluation coefficients for a fixed G2 twist point, eliminating G2 point arithmetic from the Miller loop at pairing time.
| Benchmark | Before | After | Δ |
|---|---|---|---|
| BenchmarkMiller | 158,340 ns | 115,918 ns | -27% |
| BenchmarkPairing (full) | 300,079 ns | 254,992 ns | -15% |
| PrecomputeG2 | n/a | 46,131 ns | one-time cost |
Applied automatically to EncryptPrivateKey (lazy-init on first use via sync.Once) and gen2Precomp (package-level precomputed Gen2).
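The lazy-init pattern mentioned above can be sketched as follows. This is an illustration of sync.Once caching only: lineTable, key, and precomputed are hypothetical names, and the arithmetic is a placeholder rather than real pairing code.

```go
package main

import (
	"fmt"
	"sync"
)

// lineTable stands in for the cached line-evaluation coefficients.
type lineTable struct{ coeffs []int }

// key is a hypothetical key type that owns a lazily built table.
type key struct {
	point  int
	once   sync.Once
	table  *lineTable
	builds int // how often the expensive step ran (demo only)
}

// precomputed builds the table on first use and returns the cached copy
// on every later call, including calls from other goroutines.
func (k *key) precomputed() *lineTable {
	k.once.Do(func() {
		k.builds++
		t := &lineTable{coeffs: make([]int, 77)}
		for i := range t.coeffs {
			t.coeffs[i] = k.point * (i + 1) // placeholder arithmetic
		}
		k.table = t
	})
	return k.table
}

func main() {
	k := &key{point: 3}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = k.precomputed()
		}()
	}
	wg.Wait()
	fmt.Println("builds:", k.builds, "entries:", len(k.precomputed().coeffs))
}
```

With the figures in the table, each full pairing saves about 45,087 ns, so the one-time 46,131 ns precomputation is recovered by the second pairing against the same G2 point.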
GT.ScalarMult / GT.ScalarBaseMult now delegate to ScalarMultGT (4-bit window + Cyclo6Squares), replacing the previous binary gfP12.Exp with general squaring.
ML-KEM arm64 NEON (internal/mlkem)
Extensive NEON vectorization of polynomial compress/encode/decode paths, sample and rejection functions. See PR #479 for details.
ML-KEM amd64 AVX2 (internal/mlkem)
AVX2 optimizations for compress/encode (10/11-bit), sampleNTT with precomputed twiddle factors (PR #478).
ML-DSA arm64 NEON (internal/mldsa)
NEON implementations of bitUnpackSignedTwoPower17, bitUnpackSignedTwoPower19, vectorMakeHint, nttMatRowVecMul (PR #481).
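For orientation, a scalar reference for the signed 2^17 case might look like the sketch below. It assumes the function follows FIPS 204 BitUnpack with a = 2^17 - 1 and b = 2^17, so each coefficient occupies 18 little-endian bits and decodes as 2^17 minus the stored value. The function names are hypothetical and this is not the library's code.

```go
package main

import "fmt"

const (
	gamma1 = 1 << 17 // assumed upper bound b
	bits   = 18      // bit length of (gamma1 - 1) + gamma1
)

// bitUnpackSigned17 decodes n coefficients in [-(gamma1-1), gamma1]
// from an 18-bit-per-coefficient little-endian bit string.
func bitUnpackSigned17(buf []byte, n int) []int32 {
	out := make([]int32, n)
	for i := 0; i < n; i++ {
		var z uint32
		for b := 0; b < bits; b++ {
			pos := i*bits + b
			bit := (buf[pos/8] >> (pos % 8)) & 1
			z |= uint32(bit) << b
		}
		out[i] = gamma1 - int32(z)
	}
	return out
}

// bitPackSigned17 is the inverse, included so the round trip can be checked.
func bitPackSigned17(coeffs []int32) []byte {
	buf := make([]byte, (len(coeffs)*bits+7)/8)
	for i, c := range coeffs {
		z := uint32(gamma1 - c)
		for b := 0; b < bits; b++ {
			pos := i*bits + b
			buf[pos/8] |= byte((z>>b)&1) << (pos % 8)
		}
	}
	return buf
}

func main() {
	in := []int32{0, 1, -1, gamma1, -(gamma1 - 1), 12345}
	packed := bitPackSigned17(in)
	out := bitUnpackSigned17(packed, len(in))
	fmt.Println(len(packed), out)
}
```

A vectorized version would decode several coefficients per instruction instead of one bit per loop iteration, which is where the NEON speedup would come from.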
ML-DSA amd64 AVX2 (internal/mldsa)
Second wave of AVX2 functions (PR #480), with qMinusZetasMontgomeryAVX2 reordered to avoid VPERMQ.
ZUC Assembly
- arm64: LFSR restore (RESTORE_LFSR) optimized
- amd64: LFSR restore optimized, improved code readability
s390x Bigmod
Vector implementation of addMulVVWy (PR #430).
API Changes
drbg - Breaking Change
The DRBG.Generate signature changed:

    // Before (v0.42.x)
    Generate(b, additional []byte) error // returned ErrReseedRequired as sentinel

    // After (v0.43.0)
    Generate(b, additional []byte) (reseedRequired bool, err error)

ErrReseedRequired is deprecated and retained only for source compatibility; it is no longer returned by any Generate implementation. Check the bool return value instead:

    // Migration
    reseedRequired, err := drbg.Generate(buf, nil)
    if err != nil { /* handle real error */ }
    if reseedRequired { /* call Reseed */ }

drbg - Strategy Mode (DrbgMode)
New DrbgMode interface cleanly encapsulates all behavioural differences between GM/T 0105-2021 and NIST SP 800-90A (entropy length constraints, time-based reseed, output size limits). Two pre-defined singletons: drbg.GMMode and drbg.NISTMode.
Bug Fixes
- SM4 ppc64be: Test case correctness fixes for GCM on big-endian ppc64
Internal / Documentation
- internal/sm9/bn256/README.md comprehensively documents all optimizations, tower structure, algorithm references (eprint links), and remaining improvement opportunities
- drbg.setZero renamed to drbg.zeroize, simplified to clear(data); runtime.KeepAlive(data), with a comment explaining the Go-specific memory-erasure limitations and why the historical 0xFF multi-pass pattern is unnecessary for RAM
Dependencies and CI
- github/codeql-action bumped through 4.35.5
- step-security/harden-runner bumped through 2.19.3
- CI: added ppc64be testing; re-enabled all platforms
Full Changelog
Compare: v0.42.0...v0.43.0