DM21 convergence problem for elongated bonds #316

soumitribedi · 2022-01-04T16:58:47Z

I am a PostDoc researcher with Prof. Paul Zimmerman at the University of Michigan. We are particularly interested in exact exchange-correlation potentials. So I became interested in your recent paper and was trying to reproduce some of the results using the DM21 functional. I found strangely, that for H2 dissociation as well as F2 dissociation, the convergence fails precisely 2.3 Angstroms onwards. This problem is invariant to the change of basis or grid size. However, DM21m and DM21mc do not have convergence issues, although the results are not very good. I am attaching the input, the corresponding output (h2_2.3.txt) and a plot of H2 dissociation at the cc-PVQZ basis.

I am wondering whether we are missing any crucial parameter in the input? How do we specify the fractional occupancy?

Below is the input:

import density_functional_approximation_dm21 as dm21
from pyscf import gto
from pyscf import dft

mol = gto.Mole()
mol.atom ='''
H 0.0 0.0 0.0
H 0.0 0.0 2.3
'''
mol.basis = 'cc-pVDZ'
mol.build()

mf = dft.RKS(mol)
mf.xc ='B3LYP'
mf.verbose=0
mf.run()
dm0 = mf.make_rdm1()
mf = dft.RKS(mol)
mf._numint = dm21.NeuralNumInt(dm21.Functional.DM21)
mf.grids.level=5
mf.conv_tol=1E-6
mf.conv_tol_grad=1E-3
mf.verbose=4

mf.kernel(dm0=dm0)

h2_2.3.txt

vsinha027 · 2022-01-05T02:46:19Z

I think the scf convergence related issue is related to the wavefunction that pyscf is construting.
I checked the S^2 values that B3LYP calculations output. It is ~ 0.
At a distance of 2.3 Angstroms, if we compute the system at the singlet spin manifold, a more stable solution should correspond to a Broken Symmetry wavefunction, which should have a S^2 close to 1.
Here are some values I computed using Orca:
Method, S^2, SCF Energy (a.u.)
b2plyp_BS, 0.927552,-1.000926020005
b2plyp_no_BS,0,-0.961166184309
b3lyp_BS, 0.873707,-1.003065337625
b3lyp_no_BS, 0,-0.965519308406
bp86_BS, 0.822530,-1.007237978513
bp86_no_BS, 0,-0.983152002363
uhf_BS, 0.960038,-1.000109372040
uhf_no_BS, 0.000000,-0.885485517561
_BS denotes if BrokenSymetry was used or not.
One can note, for example for unrestricted Hartree-Fock that a more stable (lower energy) solution is obtained for BrokenSymmetry wavefunction, and S^2 ~ 1.

I ran the DM21 calculations using the same settings that have been posted originally. I see that the SCF oscillates, and does not converge.
Adding a Level-Shit of 0.5 not only leads to convergence but also predicts the correct S^2 of 1.
Here is my input file:
import density_functional_approximation_dm21 as dm21
from pyscf import gto
from pyscf import dft, scf

mol = gto.Mole()
mol.atom ='''
H 0.0 0.0 0.0
H 0.0 0.0 2.3
'''
mol.basis = 'cc-pVDZ'
mol.build()
mol.spin = 0
mf = scf.UKS(mol)
mf.xc ='B3LYP'
mf.verbose=4
mf.run()
print("B3LYP run is over")
dm0 = mf.make_rdm1()
mf = dft.UKS(mol)
mf._numint = dm21.NeuralNumInt(dm21.Functional.DM21)
mf.grids.level=5
mf.conv_tol=1E-6
mf.conv_tol_grad=1E-3
mf.verbose=4
mf.damp = 0.5
mf.diis_start_cycle = 2
mf.level_shift = 0.5
#mf.kernel(dm0=dm0)
mf.kernel()

Output:
B3LYP: converged SCF energy = -0.96517814086697 <S^2> = 7.1453954e-13 2S+1 = 1
DM21: converged SCF energy = -0.911521338952581 <S^2> = 1 2S+1 = 2.236068

I believe that issue with SCF convergence might be solved by switching from DIIS to a second-order method but pyscf results in:

File "/home/vsinha/pyscf_tests/pyscf/pyscf/dft/numint.py", line 1889, in cache_xc_kernel
raise NotImplementedError('meta-GGA')
NotImplementedError: meta-GGA

when the newton() method is used with DM21.

I find it interesting that DM21 kind of beats DFT at predicting the wavefunction state for homolytic cleavage of H2. DFT without Broken-Symmetry formalism, predicts a dissociated H- and H+ (hydride and proton) pair, placing both electrons on one of the H atoms.

DM21m and DM21mc should be checked for the S2 they predict.

vsinha027 · 2022-01-05T09:08:05Z

I think the scf convergence related issue is related to the wavefunction that pyscf is construting. I checked the S^2 values that B3LYP calculations output. It is ~ 0. At a distance of 2.3 Angstroms, if we compute the system at the singlet spin manifold, a more stable solution should correspond to a Broken Symmetry wavefunction, which should have a S^2 close to 1. Here are some values I computed using Orca: Method, S^2, SCF Energy (a.u.) b2plyp_BS, 0.927552,-1.000926020005 b2plyp_no_BS,0,-0.961166184309 b3lyp_BS, 0.873707,-1.003065337625 b3lyp_no_BS, 0,-0.965519308406 bp86_BS, 0.822530,-1.007237978513 bp86_no_BS, 0,-0.983152002363 uhf_BS, 0.960038,-1.000109372040 uhf_no_BS, 0.000000,-0.885485517561 _BS denotes if BrokenSymetry was used or not. One can note, for example for unrestricted Hartree-Fock that a more stable (lower energy) solution is obtained for BrokenSymmetry wavefunction, and S^2 ~ 1.

I ran the DM21 calculations using the same settings that have been posted originally. I see that the SCF oscillates, and does not converge. Adding a Level-Shit of 0.5 not only leads to convergence but also predicts the correct S^2 of 1. Here is my input file: import density_functional_approximation_dm21 as dm21 from pyscf import gto from pyscf import dft, scf

mol = gto.Mole() mol.atom =''' H 0.0 0.0 0.0 H 0.0 0.0 2.3 ''' mol.basis = 'cc-pVDZ' mol.build() mol.spin = 0 mf = scf.UKS(mol) mf.xc ='B3LYP' mf.verbose=4 mf.run() print("B3LYP run is over") dm0 = mf.make_rdm1() mf = dft.UKS(mol) mf._numint = dm21.NeuralNumInt(dm21.Functional.DM21) mf.grids.level=5 mf.conv_tol=1E-6 mf.conv_tol_grad=1E-3 mf.verbose=4 mf.damp = 0.5 mf.diis_start_cycle = 2 mf.level_shift = 0.5 #mf.kernel(dm0=dm0) mf.kernel()

Output: B3LYP: converged SCF energy = -0.96517814086697 <S^2> = 7.1453954e-13 2S+1 = 1 DM21: converged SCF energy = -0.911521338952581 <S^2> = 1 2S+1 = 2.236068

I believe that issue with SCF convergence might be solved by switching from DIIS to a second-order method but pyscf results in:

File "/home/vsinha/pyscf_tests/pyscf/pyscf/dft/numint.py", line 1889, in cache_xc_kernel raise NotImplementedError('meta-GGA') NotImplementedError: meta-GGA

when the newton() method is used with DM21.

I find it interesting that DM21 kind of beats DFT at predicting the wavefunction state for homolytic cleavage of H2. DFT without Broken-Symmetry formalism, predicts a dissociated H- and H+ (hydride and proton) pair, placing both electrons on one of the H atoms.

DM21m and DM21mc should be checked for the S2 they predict.

I also noted that the energy predicted by DM21 is quite high compared to FCI, DM21m and D21mc

vsinha027 · 2022-01-05T10:42:47Z

I used the broken-symmetry method by flipping spin on one of the H atoms in pyscf. For B3LYP it works fine and gives energy and S^2 values similar to Orca.
However, when I use the 1e- RDM from broken-symmetry B3LYP as input for DM21. The SCF diverges. I tried tuning the scf damping and level_shift parameters but did not help.
My input:
import density_functional_approximation_dm21 as dm21
from pyscf import gto
from pyscf import dft, scf
def flip_spin(idx, atom, dma, dmb):
idx_h1 = mol.search_ao_label('%d %s' %(idx, atom))
dma_h1 = dma[idx_h1.reshape(-1,1),idx_h1].copy()
dmb_h1 = dmb[idx_h1.reshape(-1,1),idx_h1].copy()
dma[idx_h1.reshape(-1,1),idx_h1] = dmb_h1
dmb[idx_h1.reshape(-1,1),idx_h1] = dma_h1
dm = [dma, dmb]
return dm

mol = gto.Mole()
mol.atom ='''
H 0.0 0.0 0.0
H 0.0 0.0 2.3
'''
mol.basis = 'cc-pVDZ'
mol.build()
#mol.spin = 0

converge high-spin calculation first

mf = scf.UKS(mol)
mol.spin = 2
mf.xc ='B3LYP'
mf.verbose=4
mf.kernel()
dma, dmb = mf.make_rdm1()

run low spin calc now

mf = scf.UKS(mol)
mol.spin = 0
mf.xc ='B3LYP'
mf.verbose=4
mf.level_shift = 0.5
dm = flip_spin(0, 'H', dma, dmb)
mf.kernel(dm)
dm0 = mf.make_rdm1()
print("B3LYP run is over")

DM21

#dm0 = mf.make_rdm1()
mf = dft.UKS(mol)
mf._numint = dm21.NeuralNumInt(dm21.Functional.DM21)
mf.verbose=4
mf.level_shift = 1.0
mf.damp = 0.8
mf.diis_start_cycle = 2

mol.spin = 0
mf.kernel(dm0)

...
I will look into factional occupation now.

vsinha027 · 2022-01-05T10:46:05Z

I used the broken-symmetry method by flipping spin on one of the H atoms in pyscf. For B3LYP it works fine and gives energy and S^2 values similar to Orca. However, when I use the 1e- RDM from broken-symmetry B3LYP as input for DM21. The SCF diverges. I tried tuning the scf damping and level_shift parameters but did not help. My input: import density_functional_approximation_dm21 as dm21 from pyscf import gto from pyscf import dft, scf def flip_spin(idx, atom, dma, dmb): idx_h1 = mol.search_ao_label('%d %s' %(idx, atom)) dma_h1 = dma[idx_h1.reshape(-1,1),idx_h1].copy() dmb_h1 = dmb[idx_h1.reshape(-1,1),idx_h1].copy() dma[idx_h1.reshape(-1,1),idx_h1] = dmb_h1 dmb[idx_h1.reshape(-1,1),idx_h1] = dma_h1 dm = [dma, dmb] return dm

mol = gto.Mole() mol.atom =''' H 0.0 0.0 0.0 H 0.0 0.0 2.3 ''' mol.basis = 'cc-pVDZ' mol.build() #mol.spin = 0

converge high-spin calculation first

mf = scf.UKS(mol) mol.spin = 2 mf.xc ='B3LYP' mf.verbose=4 mf.kernel() dma, dmb = mf.make_rdm1()

run low spin calc now

mf = scf.UKS(mol) mol.spin = 0 mf.xc ='B3LYP' mf.verbose=4 mf.level_shift = 0.5 dm = flip_spin(0, 'H', dma, dmb) mf.kernel(dm) dm0 = mf.make_rdm1() print("B3LYP run is over")

DM21

#dm0 = mf.make_rdm1() mf = dft.UKS(mol) mf._numint = dm21.NeuralNumInt(dm21.Functional.DM21) mf.verbose=4 mf.level_shift = 1.0 mf.damp = 0.8 mf.diis_start_cycle = 2

mol.spin = 0 mf.kernel(dm0)

... I will look into factional occupation now.

btw I also tried to first converge triplet state using DM21 and then flip the spin and do a singlet calculation. But there perhaps the vaues inthe 1e- RDM are very small and the initial energy is NaN, and the pyscf run raises:
ValueError: array must not contain infs or NaNs

vsinha027 · 2022-01-05T10:53:08Z

Fraction occupation does not give the Broken-Symmetry slution for B3LYP. Without level-shifting fractional occupation calculation diverges for DM21.
Input
import density_functional_approximation_dm21 as dm21
from pyscf import gto
from pyscf import dft, scf

mol = gto.Mole()
mol.atom ='''
H 0.0 0.0 0.0
H 0.0 0.0 2.3
'''
mol.basis = 'cc-pVDZ'
mol.build()
mol.spin = 0
mf = scf.UKS(mol)
mf = scf.addons.frac_occ(mf)
mf.xc ='B3LYP'
mf.verbose=4
mf.run()
print("B3LYP run is over")

DM21

dm0 = mf.make_rdm1()
mf = dft.UKS(mol)
mf = scf.addons.frac_occ(mf)
mf._numint = dm21.NeuralNumInt(dm21.Functional.DM21)
mf.grids.level=5
mf.conv_tol=1E-6
mf.conv_tol_grad=1E-3
mf.verbose=4
#mf.damp = 0.5
#mf.diis_start_cycle = 2
#mf.level_shift = 0.5
#mf.kernel(dm0=dm0)
mf.kernel(dm0)

vsinha027 · 2022-01-05T11:53:53Z

with level shifting with DM21, I also noticed that the difference in SCF energy in the last SCF cycle, and in the extra scf cycle which is performed to check convergence is rather large.:

cycle= 19 E= -0.976695381817214 delta_E= 3.87e-09 |g|= 1.19e-06 |ddm|= 5.3e-06
alpha nocc = 1 HOMO = -0.23004889110091 LUMO = 0.272854601483517
beta nocc = 1 HOMO = -0.212185234553292 LUMO = 0.25759047728866
cycle= 20 E= -0.976695381153706 delta_E= 6.64e-10 |g|= 5.5e-07 |ddm|= 2.7e-06
alpha nocc = 1 HOMO = -0.230048512836549 LUMO = -0.22714498800089
beta nocc = 1 HOMO = -0.242409627414412 LUMO = -0.212185151183728
Extra cycle E= -0.911586460090507 delta_E= 0.0651 |g|= 0.0704 |ddm|= 0.868
converged SCF energy = -0.911586460090507 <S^2> = 1 2S+1 = 2.236068

jsspencer · 2022-01-05T15:31:41Z

Thanks for the interest.

H2 can be tricky to converge, especially for DM21 as its trained on fractional spin data. If you look at the orbital occupancies during SCF, they frequently switch between the A1g and A1u orbitals. This causes problems with DIIS, as @vivhitekro mentioned. One option is to converge using a quasi-Newton (e.g. scipy.optimize.minimize with L-BGFS-B) or Newton approach (not implemented in pyscf for functionals using meta-GGA and local hybrid features).

A simpler way, which gives stable convergence is to fix the orbital occupancies by symmetry, e.g. to match those from B3LYP or your intuition:

from pyscf import gto
from pyscf import dft
from pyscf import scf
from pyscf.scf import rhf_symm

mol = gto.Mole()
mol.atom ='''
H 0.0 0.0 0.0
H 0.0 0.0 2.3
'''
mol.basis = 'cc-pVDZ'
mol.symmetry = 'Dooh'
mol.build()

mf = dft.RKS(mol)
mf.xc ='B3LYP'
mf.verbose=0
mf.run()
dm0 = mf.make_rdm1()
irrep_nelec0 = mf.get_irrep_nelec()

mf = dft.RKS(mol)
mf._numint = dm21.NeuralNumInt(dm21.Functional.DM21)
mf.grids.level=5
mf.conv_tol=1E-6
mf.conv_tol_grad=1E-3
mf.irrep_nelec = irrep_nelec0
mf.verbose=5

mf.kernel(dm0=dm0)

This converges in around 6 SCF cycles for me and enables fast convergence across the binding curve. Note that the LUMO is lower in energy than the HOMO at this separation -- which is one reason SCF convergence is so problematic and indicates the energy can be further lowered by (partially) occupying the LUMO. This is what led us to investigate fractional occupancies, as described in our paper.

vsinha027 · 2022-01-05T21:27:57Z

Thanks for the pointer! Using symmetry adapted RKS indeed leads to stable convergence and the energy value is reasonable. The main issue with convergence, as pointed out by @jsspencer is the oscillation between A1g and A1u states.
When I used the level_shift parameter, the SCF energy converged to A1g state being doubly populated. However, the next (extra scf) cycle that pyscf does for level_shift went to the A1u state being doubly populated. see: #316 (comment)
I was also able to specifically converge to {A1g(0,1), A1u(1,0)} states, and so on.
I think we are limited by the single determinant approach of DFT.

That being said I am pondering over broken-symmetry calculations using DM21: flipping the spin from a triplet state B3LYP solution gets convegred to a symmetry broken singlet solution.
However, when I repeated this procedure with DM21, the initial energy is NaN. See: https://github.com/deepmind/deepmind-research/issues/316#issuecomment-1005815196

It seems like the 1e- RDM produced by DM21 has some issues.
Perhaps I should open a new issue?

soumitribedi · 2022-01-06T17:22:22Z

Thanks @jsspencer and @vivhitekro. I was able to see DM21 converge with integer occupancy taken from B3LYP as suggested by @jsspencer. However, I am also interested to know how you managed to converge fractional occupancy with DM21 for H2 as given in the paper? That would be very helpful. I see similar errors that @vivhitekro states.

jsspencer · 2022-01-06T18:00:21Z

@soumitribedi We used a custom optimization algorithm for fractional occupation. Please see section S3.2 of the supplementary material for details.

@vivhitekro A separate issue would be best.

chenruduan · 2022-05-16T14:32:39Z

Thanks for the interest.

H2 can be tricky to converge, especially for DM21 as its trained on fractional spin data. If you look at the orbital occupancies during SCF, they frequently switch between the A1g and A1u orbitals. This causes problems with DIIS, as @vivhitekro mentioned. One option is to converge using a quasi-Newton (e.g. scipy.optimize.minimize with L-BGFS-B) or Newton approach (not implemented in pyscf for functionals using meta-GGA and local hybrid features).

A simpler way, which gives stable convergence is to fix the orbital occupancies by symmetry, e.g. to match those from B3LYP or your intuition:
from pyscf import gto
from pyscf import dft
from pyscf import scf
from pyscf.scf import rhf_symm

mol = gto.Mole()
mol.atom ='''
H 0.0 0.0 0.0
H 0.0 0.0 2.3
'''
mol.basis = 'cc-pVDZ'
mol.symmetry = 'Dooh'
mol.build()

mf = dft.RKS(mol)
mf.xc ='B3LYP'
mf.verbose=0
mf.run()
dm0 = mf.make_rdm1()
irrep_nelec0 = mf.get_irrep_nelec()

mf = dft.RKS(mol)
mf._numint = dm21.NeuralNumInt(dm21.Functional.DM21)
mf.grids.level=5
mf.conv_tol=1E-6
mf.conv_tol_grad=1E-3
mf.irrep_nelec = irrep_nelec0
mf.verbose=5

mf.kernel(dm0=dm0)
This converges in around 6 SCF cycles for me and enables fast convergence across the binding curve. Note that the LUMO is lower in energy than the HOMO at this separation -- which is one reason SCF convergence is so problematic and indicates the energy can be further lowered by (partially) occupying the LUMO. This is what led us to investigate fractional occupancies, as described in our paper.

Hi @jsspencer, thanks for your explanation, which helps a lot since I just started playing with DM21! I tried your code above for the H2 dissociation curve but I still get a H2 energy of -0.955678 at 3A H-H distance (FCI should be close to -1). Is this observation expected because we need to use fractional occupation optimizations to reproduce the dissociation curve at Figure 2D? If so, is there a code example for doing this? Thanks!

Chenru

jsspencer · 2022-05-16T21:16:16Z

Yes, this is expected -- as Fig 2d (data in the Supplementary data files) shows, 3A is close to the maximum energy for DM21 with integer occupations. Fractional occupations improve the energy as described above and in the paper. The algorithm for doing this optimisation is detailed in the Supplementary material. Unfortunately a open source version is not available at this time.

chenruduan · 2022-05-16T21:32:41Z

@jsspencer Thanks so much! The data json is very helpful! I have a naive question then: Will the fractional occupations optimization algorithm improve the dissociation curve at long distance for other conventional functionals (e.g., B3LYP)?

jsspencer · 2022-05-17T10:45:36Z

I wouldn't expect an improvement for most conventional functions (cf comments in our paper and work by Becke on B13 functional)

diegolascasas assigned diegolascasas and jsspencer and unassigned diegolascasas Jan 4, 2022

jsspencer closed this as completed Feb 2, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DM21 convergence problem for elongated bonds #316

DM21 convergence problem for elongated bonds #316

soumitribedi commented Jan 4, 2022

vsinha027 commented Jan 5, 2022 •

edited

vsinha027 commented Jan 5, 2022

vsinha027 commented Jan 5, 2022

vsinha027 commented Jan 5, 2022 •

edited

converge high-spin calculation first

run low spin calc now

DM21

vsinha027 commented Jan 5, 2022

vsinha027 commented Jan 5, 2022

jsspencer commented Jan 5, 2022

vsinha027 commented Jan 5, 2022

soumitribedi commented Jan 6, 2022

jsspencer commented Jan 6, 2022

chenruduan commented May 16, 2022

jsspencer commented May 16, 2022

chenruduan commented May 16, 2022

jsspencer commented May 17, 2022

DM21 convergence problem for elongated bonds #316

DM21 convergence problem for elongated bonds #316

Comments

soumitribedi commented Jan 4, 2022

vsinha027 commented Jan 5, 2022 • edited

vsinha027 commented Jan 5, 2022

vsinha027 commented Jan 5, 2022

converge high-spin calculation first

run low spin calc now

DM21

vsinha027 commented Jan 5, 2022 • edited

converge high-spin calculation first

run low spin calc now

DM21

vsinha027 commented Jan 5, 2022

DM21

vsinha027 commented Jan 5, 2022

jsspencer commented Jan 5, 2022

vsinha027 commented Jan 5, 2022

soumitribedi commented Jan 6, 2022

jsspencer commented Jan 6, 2022

chenruduan commented May 16, 2022

jsspencer commented May 16, 2022

chenruduan commented May 16, 2022

jsspencer commented May 17, 2022

vsinha027 commented Jan 5, 2022 •

edited

vsinha027 commented Jan 5, 2022 •

edited