TDDFT Non-Deterministic and Deviated on GPU #518

@sajagbe

import cupy as cp, numpy as np
from pyscf import gto, dft
from gpu4pyscf import dft as gdft

cp.random.seed(12345) 


mol = gto.Mole()
mol.atom = '''
O  0.000000   0.000000   0.000000
H  0.758602   0.000000   0.504284
H -0.758602   0.000000   0.504284
'''
mol.basis = 'sto-3g'
mol.build()

print("\n=== CPU SCF + TDDFT ===")
mf_cpu = dft.RKS(mol).set(xc='lda')
mf_cpu.kernel()

td_cpu = mf_cpu.TDDFT()
td_cpu.nstates = 3
td_cpu.kernel()
print("Excited state energies (CPU):", td_cpu.e)


print("\n=== GPU SCF + TDDFT ===")
mf_gpu = gdft.RKS(mol).set(xc='lda')
mf_gpu.kernel()


td_gpu = mf_gpu.TDDFT()
td_gpu.nstates = 3
td_gpu.kernel()
print("Excited state energies (GPU):", td_gpu.e)

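I used the script above to compare excited-state energies computed on GPU and on CPU. On every run, the GPU TDDFT energies deviate from the CPU values and also change from run to run. The ground-state SCF energy does not show this problem: it is reproducible, and the CPU and GPU values agree to about 1e-10 Hartree, as shown below.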

Run 1:

=== CPU SCF + TDDFT ===
converged SCF energy = -74.0345163178308
Excited State energies (eV)
[12.21307739 13.672496 15.5405123 ]
Excited state energies (CPU): [0.44882232 0.50245496 0.5711033 ]

=== GPU SCF + TDDFT ===
converged SCF energy = -74.0345163178773
Excited State energies (eV)
[12.63043364 26.57117295 32.13114449]
Excited state energies (GPU): [0.46415988 0.9764726 1.18079779]

Run 2:

=== CPU SCF + TDDFT ===
converged SCF energy = -74.0345163178308
Excited State energies (eV)
[12.21307739 13.672496 15.5405123 ]
Excited state energies (CPU): [0.44882232 0.50245496 0.5711033 ]

=== GPU SCF + TDDFT (with CPU guess) ===
converged SCF energy = -74.0345163178773
TD-SCF states [0, 1, 2] not converged.
Excited State energies (eV)
[14.4972545 15.50424422 16.52782828]
Excited state energies (GPU): [0.53276428 0.56977047 0.60738649]

This behavior does not depend on the xc functional, the basis set, or solvation. Note that in Run 2 the GPU solver also reports that the TD-SCF states did not converge. After some searching through the documentation, my guess is that the cause is a difference in how the TDDFT step is computed on GPU versus CPU, for example fused multiply-add (FMA) and perhaps other implementation differences.
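
To put a number on the discrepancy, here is a minimal sketch in plain Python that compares the printed excitation energies from the two runs above. The helper name `max_abs_deviation_ev` is hypothetical (it is not part of PySCF or gpu4pyscf), the energy lists are copied from the output above, and the Hartree-to-eV factor is the CODATA 2018 value.

```python
# Hartree -> eV conversion factor (CODATA 2018).
HARTREE_TO_EV = 27.211386245988


def max_abs_deviation_ev(e_ref, e_test):
    """Return the largest |e_test[i] - e_ref[i]| in eV.

    Both inputs are sequences of excitation energies in Hartree.
    Hypothetical helper for illustration only.
    """
    if len(e_ref) != len(e_test):
        raise ValueError("energy lists must have the same length")
    return max(abs(t - r) for r, t in zip(e_ref, e_test)) * HARTREE_TO_EV


# Values copied from the printed output of Run 1 and Run 2 (Hartree).
cpu = [0.44882232, 0.50245496, 0.5711033]
gpu_run1 = [0.46415988, 0.9764726, 1.18079779]
gpu_run2 = [0.53276428, 0.56977047, 0.60738649]

dev_run1 = max_abs_deviation_ev(cpu, gpu_run1)
dev_run2 = max_abs_deviation_ev(cpu, gpu_run2)

print(f"Run 1: max |GPU - CPU| = {dev_run1:.2f} eV")
print(f"Run 2: max |GPU - CPU| = {dev_run2:.2f} eV")
```

With these values the largest deviations are roughly 16.6 eV (Run 1) and 2.3 eV (Run 2), while the SCF energies differ by only about 5e-11 Hartree. That seems far too large to come from floating-point rounding such as FMA alone, although this comparison by itself does not identify the cause.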

I would really appreciate help understanding the cause, and any advice on how to fix it.

Thank you for your help.
