Research: low-precision acceleration for projected Hamiltonian construction and eigensolve #7

@thc1006

Background

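Mixed-precision eigenvalue decomposition is constrained by the forward error of a backward-stable eigensolver, which scales as |λ_computed - λ_true| ≲ ε_machine × κ(λ) × ||H||, where κ(λ) is the eigenvalue condition number (κ = 1 for Hermitian H). For chemical accuracy (1.6 mHa), FP16 (ε ≈ 1e-3) and below are therefore fundamentally insufficient for direct eigensolve when ||H|| ≈ O(1-100) Ha: the expected error is roughly 1e-3 to 1e-1 Ha, which leaves no margin against the 1.6e-3 Ha target.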
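A minimal sketch of how storage precision alone perturbs the lowest eigenvalue. It rounds a symmetric matrix to FP16 or FP32, solves in FP64, and compares the shift against Weyl's bound |Δλ| ≤ ||E||_2, where E is the rounding perturbation. The random test matrix, its size, and the helper names are assumptions for illustration and are not taken from the repository.

```python
import numpy as np

CHEM_ACC_HA = 1.6e-3  # chemical accuracy in Hartree


def lowest_eig_error(H, dtype):
    """Round H to `dtype`, solve in FP64, return (eigenvalue shift, ||E||_2)."""
    H_rounded = H.astype(dtype).astype(np.float64)
    E = H_rounded - H  # perturbation introduced by storage rounding only
    exact = np.linalg.eigvalsh(H)[0]
    perturbed = np.linalg.eigvalsh(H_rounded)[0]
    return abs(perturbed - exact), np.linalg.norm(E, 2)


def make_test_hamiltonian(n=200, scale=50.0, seed=0):
    """Hypothetical symmetric matrix rescaled so that ||H||_2 == scale (Ha)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    H = (A + A.T) / 2.0
    return H * (scale / np.linalg.norm(H, 2))


if __name__ == "__main__":
    H = make_test_hamiltonian()
    for dtype in (np.float16, np.float32):
        err, bound = lowest_eig_error(H, dtype)
        print(f"{np.dtype(dtype).name}: |dlambda| = {err:.2e} Ha, "
              f"||E||_2 = {bound:.2e} Ha, target = {CHEM_ACC_HA:.1e} Ha")
```

This only models rounding of the stored matrix. A solver that also runs its arithmetic in low precision would add further error on top of it.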

However, low precision CAN accelerate specific stages of the computation while preserving FP64 final accuracy.

Research directions

P1: FP32 projected Hamiltonian construction

Build H_proj in FP32 instead of FP64 (50% memory reduction) and upcast to FP64 only immediately before the eigensolve. Validated by CIM-QS(H)CI (arXiv:2603.13160, March 2026).

Target: matrix_elements_fast() in hamiltonians/molecular/hamiltonian.py
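The P1 workflow can be sketched as follows, assuming the projected Hamiltonian is formed as V^T H V from a dense H and an orthonormal basis V. `project_hamiltonian` is a hypothetical stand-in and does not reflect the actual signature or internals of `matrix_elements_fast()`.

```python
import numpy as np


def project_hamiltonian(H, V, dtype=np.float32):
    """Form V^T H V in `dtype` (hypothetical stand-in for the real builder)."""
    H_lo = H.astype(dtype)
    V_lo = V.astype(dtype)
    H_proj = V_lo.T @ (H_lo @ V_lo)
    return (H_proj + H_proj.T) / 2  # re-symmetrize after low-precision rounding


def solve_projected(H_proj):
    """Upcast to FP64 only for the small dense eigensolve."""
    return np.linalg.eigvalsh(H_proj.astype(np.float64))


def demo(n=300, k=20, seed=0):
    """Compare FP32-built and FP64-built projected spectra on random data."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    H = (A + A.T) / 2.0
    V, _ = np.linalg.qr(rng.standard_normal((n, k)))
    H32 = project_hamiltonian(H, V, np.float32)
    H64 = project_hamiltonian(H, V, np.float64)
    err = np.max(np.abs(solve_projected(H32) - solve_projected(H64)))
    return H32, H64, err


if __name__ == "__main__":
    H32, H64, err = demo()
    print(f"FP32 storage: {H32.nbytes} B, FP64 storage: {H64.nbytes} B")
    print(f"max eigenvalue difference: {err:.2e} Ha")
```

Whether the FP32 error stays below chemical accuracy for real molecular Hamiltonians depends on ||H|| and on the basis size, so this would need checking against the FP64 path on actual systems.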

P2: Chebyshev filtered subspace iteration with TF32/BF16

Replace scipy.sparse.linalg.expm_multiply in SKQD with Chebyshev polynomial filtering using TF32 tensor cores for matvec. R-ChFSI (arXiv:2503.22652) demonstrated 2.1x speedup with BF16 communication and TF32 filtering, with tolerance to inexact matvec.
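A rough sketch of Chebyshev filtered subspace iteration with a low-precision filter, under several assumptions. NumPy has no TF32 type, so FP32 stands in for the tensor-core matvec. The upper spectral bound is taken from an exact norm instead of a cheap estimate, and the function names are hypothetical. This is plain ChFSI, not the residual-based R-ChFSI variant from the cited paper, and it is not the SKQD code path.

```python
import numpy as np


def chebyshev_filter(H_lo, X, degree, a, b):
    """Apply T_degree(L) to X with L = (H - c I) / e, mapping [a, b] to [-1, 1].

    Components with eigenvalues below `a` are amplified. All products run in
    the dtype of H_lo (FP32 here, as a stand-in for TF32).
    """
    e = float(b - a) / 2.0
    c = float(b + a) / 2.0
    X = X.astype(H_lo.dtype)
    Y = (H_lo @ X - c * X) / e
    for _ in range(2, degree + 1):
        Y_next = 2.0 * (H_lo @ Y - c * Y) / e - X  # three-term recurrence
        X, Y = Y, Y_next
    return Y


def chfsi_lowest(H, k, block=None, degree=10, iters=15, seed=0):
    """Lowest k eigenvalues: FP32 filtering, FP64 Rayleigh-Ritz."""
    n = H.shape[0]
    block = block or 2 * k  # extra guard vectors speed up convergence
    rng = np.random.default_rng(seed)
    H_lo = H.astype(np.float32)
    b = 1.01 * np.linalg.norm(H, 2)  # assumption: exact norm as upper bound
    X, _ = np.linalg.qr(rng.standard_normal((n, block)))
    for _ in range(iters):
        theta, W = np.linalg.eigh(X.T @ H @ X)  # Rayleigh-Ritz in FP64
        X = X @ W                               # rotate to Ritz vectors
        Y = chebyshev_filter(H_lo, X, degree, theta[-1], b)
        X, _ = np.linalg.qr(Y.astype(np.float64))
    theta = np.linalg.eigvalsh(X.T @ H @ X)
    return theta[:k]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    Q, _ = np.linalg.qr(rng.standard_normal((200, 200)))
    evals = np.linspace(-10.0, 10.0, 200)
    H = (Q * evals) @ Q.T
    H = (H + H.T) / 2.0
    print(chfsi_lowest(H, k=4))
```

The filter only has to enrich the subspace, so rounding noise in the matvec affects how good the basis is while the Ritz values are still computed from the FP64 operator. That is the sense in which the method tolerates inexact matvec.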

P3: Randomized basis selection with FP16 projection

Use FP16 random projection for initial basis selection (noise-tolerant by construction), then FP64 for the small dense eigenvalue problem. Theoretical foundation in arXiv:2601.19250 (Jan 2026).
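One possible reading of P3, sketched as a randomized range finder: the random projection Y = HΩ is computed in FP16, the basis is orthonormalized in FP64, and the small projected problem is solved in FP64. The function names and the low-rank test matrix are assumptions, and the basis-selection scheme in the cited paper may differ.

```python
import numpy as np


def fp16_sketch_basis(H, k, seed=0):
    """Orthonormal basis from an FP16 random sketch Y = H @ Omega."""
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((H.shape[0], k)).astype(np.float16)
    Y = H.astype(np.float16) @ omega            # noisy low-precision projection
    Q, _ = np.linalg.qr(Y.astype(np.float64))   # orthonormalize in FP64
    return Q


def ritz_values(H, Q):
    """FP64 Rayleigh-Ritz on the small k x k projected problem."""
    H_small = Q.T @ H @ Q
    return np.linalg.eigvalsh((H_small + H_small.T) / 2.0)


def make_low_rank_hamiltonian(n=200, rank=10, seed=1):
    """Hypothetical rank-10 symmetric matrix with eigenvalues in [-10, -5]."""
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((n, rank)))
    d = np.linspace(-10.0, -5.0, rank)
    H = (U * d) @ U.T
    return (H + H.T) / 2.0


if __name__ == "__main__":
    H = make_low_rank_hamiltonian()
    Q = fp16_sketch_basis(H, k=20)
    print("lowest Ritz value:", ritz_values(H, Q)[0])
    print("lowest eigenvalue:", np.linalg.eigvalsh(H)[0])
```

Because the Ritz values come from the FP64 operator restricted to an orthonormal basis, the lowest one can never fall below the true lowest eigenvalue. FP16 noise in the sketch degrades how tight that bound is, not its validity.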

P4: cuSOLVER BF16x9 math mode

cuSOLVER 13.2 supports CUSOLVER_FP32_EMULATED_BF16X9_MATH for syevd. Internal GEMMs use 9x BF16 tensor core operations to emulate FP32 accuracy with higher throughput on Blackwell GPUs. Requires calling cuSOLVER directly (not exposed via PyTorch).
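The idea behind a "BF16x9" GEMM can be illustrated in NumPy: each FP32 operand is split into three BF16-representable pieces, and the product is assembled from the 3 × 3 = 9 piecewise products. This is a conceptual sketch under assumptions (truncation instead of hardware rounding, FP32 accumulation through NumPy, hypothetical helper names). It does not call cuSOLVER and is not NVIDIA's implementation.

```python
import numpy as np


def truncate_to_bf16(x):
    """Keep the top 16 bits of each FP32 value (sign, exponent, 7 mantissa bits)."""
    bits = np.ascontiguousarray(x, dtype=np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFF0000)).view(np.float32)


def split_bf16x3(x):
    """Split FP32 x into BF16-representable pieces with x == hi + mid + lo."""
    x = np.asarray(x, dtype=np.float32)
    hi = truncate_to_bf16(x)
    mid = truncate_to_bf16(x - hi)
    lo = truncate_to_bf16(x - hi - mid)
    return hi, mid, lo


def gemm_bf16x9(A, B):
    """Emulate an FP32 GEMM as 9 products of BF16-representable pieces."""
    C = np.zeros((A.shape[0], B.shape[1]), dtype=np.float32)
    # Accumulate the smallest contributions first (lo*lo ... hi*hi).
    for Ai in reversed(split_bf16x3(A)):
        for Bj in reversed(split_bf16x3(B)):
            C += Ai @ Bj
    return C


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((64, 64)).astype(np.float32)
    B = rng.standard_normal((64, 64)).astype(np.float32)
    ref = A.astype(np.float64) @ B.astype(np.float64)
    err1 = np.max(np.abs(truncate_to_bf16(A) @ truncate_to_bf16(B) - ref))
    err9 = np.max(np.abs(gemm_bf16x9(A, B) - ref))
    print(f"single BF16 product error: {err1:.2e}")
    print(f"9-product emulation error: {err9:.2e}")
```

Each piece carries 8 significant bits, so three pieces cover the 24-bit FP32 significand. That is why nine BF16 products can recover FP32-level accuracy.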

What's already done

References

  • Higham & Mary, "Mixed Precision Algorithms in Numerical Linear Algebra", Acta Numerica (2022)
  • CIM-QS(H)CI (arXiv:2603.13160) — FP32 Hamiltonian construction validated
  • R-ChFSI (arXiv:2503.22652) — TF32/BF16 Chebyshev filtering, 2.1x speedup
  • JCTC 2026 — BF16 preconditioning for DFT eigensolver on AI-focused GPUs
  • Xu et al. (arXiv:2601.19250) — Precision-adaptive randomized SVD
  • cuSOLVER 13.2 docs — BF16x9 emulated math mode
