Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM21 convergence problem for elongated bonds #316

Closed
soumitribedi opened this issue Jan 4, 2022 · 14 comments
Closed

DM21 convergence problem for elongated bonds #316

soumitribedi opened this issue Jan 4, 2022 · 14 comments
Assignees

Comments

@soumitribedi
Copy link

I am a PostDoc researcher with Prof. Paul Zimmerman at the University of Michigan. We are particularly interested in exact exchange-correlation potentials. So I became interested in your recent paper and was trying to reproduce some of the results using the DM21 functional. I found strangely, that for H2 dissociation as well as F2 dissociation, the convergence fails precisely 2.3 Angstroms onwards. This problem is invariant to the change of basis or grid size. However, DM21m and DM21mc do not have convergence issues, although the results are not very good. I am attaching the input, the corresponding output (h2_2.3.txt) and a plot of H2 dissociation at the cc-PVQZ basis.

I am wondering whether we are missing any crucial parameter in the input? How do we specify the fractional occupancy?

Below is the input:

import density_functional_approximation_dm21 as dm21
from pyscf import gto
from pyscf import dft

mol = gto.Mole()
mol.atom ='''
H 0.0 0.0 0.0
H 0.0 0.0 2.3
'''
mol.basis = 'cc-pVDZ'
mol.build()

mf = dft.RKS(mol)
mf.xc ='B3LYP'
mf.verbose=0
mf.run()
dm0 = mf.make_rdm1()
mf = dft.RKS(mol)
mf._numint = dm21.NeuralNumInt(dm21.Functional.DM21)
mf.grids.level=5
mf.conv_tol=1E-6
mf.conv_tol_grad=1E-3
mf.verbose=4

mf.kernel(dm0=dm0)
h2_pes

h2_2.3.txt

@vsinha027
Copy link

vsinha027 commented Jan 5, 2022

I think the scf convergence related issue is related to the wavefunction that pyscf is construting.
I checked the S^2 values that B3LYP calculations output. It is ~ 0.
At a distance of 2.3 Angstroms, if we compute the system at the singlet spin manifold, a more stable solution should correspond to a Broken Symmetry wavefunction, which should have a S^2 close to 1.
Here are some values I computed using Orca:
Method, S^2, SCF Energy (a.u.)
b2plyp_BS, 0.927552,-1.000926020005
b2plyp_no_BS,0,-0.961166184309
b3lyp_BS, 0.873707,-1.003065337625
b3lyp_no_BS, 0,-0.965519308406
bp86_BS, 0.822530,-1.007237978513
bp86_no_BS, 0,-0.983152002363
uhf_BS, 0.960038,-1.000109372040
uhf_no_BS, 0.000000,-0.885485517561
_BS denotes if BrokenSymetry was used or not.
One can note, for example for unrestricted Hartree-Fock that a more stable (lower energy) solution is obtained for BrokenSymmetry wavefunction, and S^2 ~ 1.

I ran the DM21 calculations using the same settings that have been posted originally. I see that the SCF oscillates, and does not converge.
Adding a Level-Shit of 0.5 not only leads to convergence but also predicts the correct S^2 of 1.
Here is my input file:
import density_functional_approximation_dm21 as dm21
from pyscf import gto
from pyscf import dft, scf

mol = gto.Mole()
mol.atom ='''
H 0.0 0.0 0.0
H 0.0 0.0 2.3
'''
mol.basis = 'cc-pVDZ'
mol.build()
mol.spin = 0
mf = scf.UKS(mol)
mf.xc ='B3LYP'
mf.verbose=4
mf.run()
print("B3LYP run is over")
dm0 = mf.make_rdm1()
mf = dft.UKS(mol)
mf._numint = dm21.NeuralNumInt(dm21.Functional.DM21)
mf.grids.level=5
mf.conv_tol=1E-6
mf.conv_tol_grad=1E-3
mf.verbose=4
mf.damp = 0.5
mf.diis_start_cycle = 2
mf.level_shift = 0.5
#mf.kernel(dm0=dm0)
mf.kernel()

Output:
B3LYP: converged SCF energy = -0.96517814086697 <S^2> = 7.1453954e-13 2S+1 = 1
DM21: converged SCF energy = -0.911521338952581 <S^2> = 1 2S+1 = 2.236068

I believe that issue with SCF convergence might be solved by switching from DIIS to a second-order method but pyscf results in:

File "/home/vsinha/pyscf_tests/pyscf/pyscf/dft/numint.py", line 1889, in cache_xc_kernel
raise NotImplementedError('meta-GGA')
NotImplementedError: meta-GGA

when the newton() method is used with DM21.

I find it interesting that DM21 kind of beats DFT at predicting the wavefunction state for homolytic cleavage of H2. DFT without Broken-Symmetry formalism, predicts a dissociated H- and H+ (hydride and proton) pair, placing both electrons on one of the H atoms.

DM21m and DM21mc should be checked for the S2 they predict.

@vsinha027
Copy link

I think the scf convergence related issue is related to the wavefunction that pyscf is construting. I checked the S^2 values that B3LYP calculations output. It is ~ 0. At a distance of 2.3 Angstroms, if we compute the system at the singlet spin manifold, a more stable solution should correspond to a Broken Symmetry wavefunction, which should have a S^2 close to 1. Here are some values I computed using Orca: Method, S^2, SCF Energy (a.u.) b2plyp_BS, 0.927552,-1.000926020005 b2plyp_no_BS,0,-0.961166184309 b3lyp_BS, 0.873707,-1.003065337625 b3lyp_no_BS, 0,-0.965519308406 bp86_BS, 0.822530,-1.007237978513 bp86_no_BS, 0,-0.983152002363 uhf_BS, 0.960038,-1.000109372040 uhf_no_BS, 0.000000,-0.885485517561 _BS denotes if BrokenSymetry was used or not. One can note, for example for unrestricted Hartree-Fock that a more stable (lower energy) solution is obtained for BrokenSymmetry wavefunction, and S^2 ~ 1.

I ran the DM21 calculations using the same settings that have been posted originally. I see that the SCF oscillates, and does not converge. Adding a Level-Shit of 0.5 not only leads to convergence but also predicts the correct S^2 of 1. Here is my input file: import density_functional_approximation_dm21 as dm21 from pyscf import gto from pyscf import dft, scf

mol = gto.Mole() mol.atom =''' H 0.0 0.0 0.0 H 0.0 0.0 2.3 ''' mol.basis = 'cc-pVDZ' mol.build() mol.spin = 0 mf = scf.UKS(mol) mf.xc ='B3LYP' mf.verbose=4 mf.run() print("B3LYP run is over") dm0 = mf.make_rdm1() mf = dft.UKS(mol) mf._numint = dm21.NeuralNumInt(dm21.Functional.DM21) mf.grids.level=5 mf.conv_tol=1E-6 mf.conv_tol_grad=1E-3 mf.verbose=4 mf.damp = 0.5 mf.diis_start_cycle = 2 mf.level_shift = 0.5 #mf.kernel(dm0=dm0) mf.kernel()

Output: B3LYP: converged SCF energy = -0.96517814086697 <S^2> = 7.1453954e-13 2S+1 = 1 DM21: converged SCF energy = -0.911521338952581 <S^2> = 1 2S+1 = 2.236068

I believe that issue with SCF convergence might be solved by switching from DIIS to a second-order method but pyscf results in:

File "/home/vsinha/pyscf_tests/pyscf/pyscf/dft/numint.py", line 1889, in cache_xc_kernel raise NotImplementedError('meta-GGA') NotImplementedError: meta-GGA

when the newton() method is used with DM21.

I find it interesting that DM21 kind of beats DFT at predicting the wavefunction state for homolytic cleavage of H2. DFT without Broken-Symmetry formalism, predicts a dissociated H- and H+ (hydride and proton) pair, placing both electrons on one of the H atoms.

DM21m and DM21mc should be checked for the S2 they predict.

I also noted that the energy predicted by DM21 is quite high compared to FCI, DM21m and D21mc

@vsinha027
Copy link

I used the broken-symmetry method by flipping spin on one of the H atoms in pyscf. For B3LYP it works fine and gives energy and S^2 values similar to Orca.
However, when I use the 1e- RDM from broken-symmetry B3LYP as input for DM21. The SCF diverges. I tried tuning the scf damping and level_shift parameters but did not help.
My input:
import density_functional_approximation_dm21 as dm21
from pyscf import gto
from pyscf import dft, scf
def flip_spin(idx, atom, dma, dmb):
idx_h1 = mol.search_ao_label('%d %s' %(idx, atom))
dma_h1 = dma[idx_h1.reshape(-1,1),idx_h1].copy()
dmb_h1 = dmb[idx_h1.reshape(-1,1),idx_h1].copy()
dma[idx_h1.reshape(-1,1),idx_h1] = dmb_h1
dmb[idx_h1.reshape(-1,1),idx_h1] = dma_h1
dm = [dma, dmb]
return dm

mol = gto.Mole()
mol.atom ='''
H 0.0 0.0 0.0
H 0.0 0.0 2.3
'''
mol.basis = 'cc-pVDZ'
mol.build()
#mol.spin = 0

converge high-spin calculation first

mf = scf.UKS(mol)
mol.spin = 2
mf.xc ='B3LYP'
mf.verbose=4
mf.kernel()
dma, dmb = mf.make_rdm1()

run low spin calc now

mf = scf.UKS(mol)
mol.spin = 0
mf.xc ='B3LYP'
mf.verbose=4
mf.level_shift = 0.5
dm = flip_spin(0, 'H', dma, dmb)
mf.kernel(dm)
dm0 = mf.make_rdm1()
print("B3LYP run is over")

DM21

#dm0 = mf.make_rdm1()
mf = dft.UKS(mol)
mf._numint = dm21.NeuralNumInt(dm21.Functional.DM21)
mf.verbose=4
mf.level_shift = 1.0
mf.damp = 0.8
mf.diis_start_cycle = 2

mol.spin = 0
mf.kernel(dm0)

...
I will look into factional occupation now.

@vsinha027
Copy link

vsinha027 commented Jan 5, 2022

I used the broken-symmetry method by flipping spin on one of the H atoms in pyscf. For B3LYP it works fine and gives energy and S^2 values similar to Orca. However, when I use the 1e- RDM from broken-symmetry B3LYP as input for DM21. The SCF diverges. I tried tuning the scf damping and level_shift parameters but did not help. My input: import density_functional_approximation_dm21 as dm21 from pyscf import gto from pyscf import dft, scf def flip_spin(idx, atom, dma, dmb): idx_h1 = mol.search_ao_label('%d %s' %(idx, atom)) dma_h1 = dma[idx_h1.reshape(-1,1),idx_h1].copy() dmb_h1 = dmb[idx_h1.reshape(-1,1),idx_h1].copy() dma[idx_h1.reshape(-1,1),idx_h1] = dmb_h1 dmb[idx_h1.reshape(-1,1),idx_h1] = dma_h1 dm = [dma, dmb] return dm

mol = gto.Mole() mol.atom =''' H 0.0 0.0 0.0 H 0.0 0.0 2.3 ''' mol.basis = 'cc-pVDZ' mol.build() #mol.spin = 0

converge high-spin calculation first

mf = scf.UKS(mol) mol.spin = 2 mf.xc ='B3LYP' mf.verbose=4 mf.kernel() dma, dmb = mf.make_rdm1()

run low spin calc now

mf = scf.UKS(mol) mol.spin = 0 mf.xc ='B3LYP' mf.verbose=4 mf.level_shift = 0.5 dm = flip_spin(0, 'H', dma, dmb) mf.kernel(dm) dm0 = mf.make_rdm1() print("B3LYP run is over")

DM21

#dm0 = mf.make_rdm1() mf = dft.UKS(mol) mf._numint = dm21.NeuralNumInt(dm21.Functional.DM21) mf.verbose=4 mf.level_shift = 1.0 mf.damp = 0.8 mf.diis_start_cycle = 2

mol.spin = 0 mf.kernel(dm0)

... I will look into factional occupation now.

btw I also tried to first converge triplet state using DM21 and then flip the spin and do a singlet calculation. But there perhaps the vaues inthe 1e- RDM are very small and the initial energy is NaN, and the pyscf run raises:
ValueError: array must not contain infs or NaNs

@vsinha027
Copy link

Fraction occupation does not give the Broken-Symmetry slution for B3LYP. Without level-shifting fractional occupation calculation diverges for DM21.
Input
import density_functional_approximation_dm21 as dm21
from pyscf import gto
from pyscf import dft, scf

mol = gto.Mole()
mol.atom ='''
H 0.0 0.0 0.0
H 0.0 0.0 2.3
'''
mol.basis = 'cc-pVDZ'
mol.build()
mol.spin = 0
mf = scf.UKS(mol)
mf = scf.addons.frac_occ(mf)
mf.xc ='B3LYP'
mf.verbose=4
mf.run()
print("B3LYP run is over")

DM21

dm0 = mf.make_rdm1()
mf = dft.UKS(mol)
mf = scf.addons.frac_occ(mf)
mf._numint = dm21.NeuralNumInt(dm21.Functional.DM21)
mf.grids.level=5
mf.conv_tol=1E-6
mf.conv_tol_grad=1E-3
mf.verbose=4
#mf.damp = 0.5
#mf.diis_start_cycle = 2
#mf.level_shift = 0.5
#mf.kernel(dm0=dm0)
mf.kernel(dm0)

@vsinha027
Copy link

with level shifting with DM21, I also noticed that the difference in SCF energy in the last SCF cycle, and in the extra scf cycle which is performed to check convergence is rather large.:

cycle= 19 E= -0.976695381817214 delta_E= 3.87e-09 |g|= 1.19e-06 |ddm|= 5.3e-06
alpha nocc = 1 HOMO = -0.23004889110091 LUMO = 0.272854601483517
beta nocc = 1 HOMO = -0.212185234553292 LUMO = 0.25759047728866
cycle= 20 E= -0.976695381153706 delta_E= 6.64e-10 |g|= 5.5e-07 |ddm|= 2.7e-06
alpha nocc = 1 HOMO = -0.230048512836549 LUMO = -0.22714498800089
beta nocc = 1 HOMO = -0.242409627414412 LUMO = -0.212185151183728
Extra cycle E= -0.911586460090507 delta_E= 0.0651 |g|= 0.0704 |ddm|= 0.868
converged SCF energy = -0.911586460090507 <S^2> = 1 2S+1 = 2.236068

@jsspencer
Copy link
Collaborator

Thanks for the interest.

H2 can be tricky to converge, especially for DM21 as its trained on fractional spin data. If you look at the orbital occupancies during SCF, they frequently switch between the A1g and A1u orbitals. This causes problems with DIIS, as @vivhitekro mentioned. One option is to converge using a quasi-Newton (e.g. scipy.optimize.minimize with L-BGFS-B) or Newton approach (not implemented in pyscf for functionals using meta-GGA and local hybrid features).

A simpler way, which gives stable convergence is to fix the orbital occupancies by symmetry, e.g. to match those from B3LYP or your intuition:

from pyscf import gto
from pyscf import dft
from pyscf import scf
from pyscf.scf import rhf_symm

mol = gto.Mole()
mol.atom ='''
H 0.0 0.0 0.0
H 0.0 0.0 2.3
'''
mol.basis = 'cc-pVDZ'
mol.symmetry = 'Dooh'
mol.build()

mf = dft.RKS(mol)
mf.xc ='B3LYP'
mf.verbose=0
mf.run()
dm0 = mf.make_rdm1()
irrep_nelec0 = mf.get_irrep_nelec()

mf = dft.RKS(mol)
mf._numint = dm21.NeuralNumInt(dm21.Functional.DM21)
mf.grids.level=5
mf.conv_tol=1E-6
mf.conv_tol_grad=1E-3
mf.irrep_nelec = irrep_nelec0
mf.verbose=5

mf.kernel(dm0=dm0)

This converges in around 6 SCF cycles for me and enables fast convergence across the binding curve. Note that the LUMO is lower in energy than the HOMO at this separation -- which is one reason SCF convergence is so problematic and indicates the energy can be further lowered by (partially) occupying the LUMO. This is what led us to investigate fractional occupancies, as described in our paper.

@vsinha027
Copy link

Thanks for the pointer! Using symmetry adapted RKS indeed leads to stable convergence and the energy value is reasonable. The main issue with convergence, as pointed out by @jsspencer is the oscillation between A1g and A1u states.
When I used the level_shift parameter, the SCF energy converged to A1g state being doubly populated. However, the next (extra scf) cycle that pyscf does for level_shift went to the A1u state being doubly populated. see: #316 (comment)
I was also able to specifically converge to {A1g(0,1), A1u(1,0)} states, and so on.
I think we are limited by the single determinant approach of DFT.

That being said I am pondering over broken-symmetry calculations using DM21: flipping the spin from a triplet state B3LYP solution gets convegred to a symmetry broken singlet solution.
However, when I repeated this procedure with DM21, the initial energy is NaN. See: https://github.com/deepmind/deepmind-research/issues/316#issuecomment-1005815196

It seems like the 1e- RDM produced by DM21 has some issues.
Perhaps I should open a new issue?

@soumitribedi
Copy link
Author

Thanks @jsspencer and @vivhitekro. I was able to see DM21 converge with integer occupancy taken from B3LYP as suggested by @jsspencer. However, I am also interested to know how you managed to converge fractional occupancy with DM21 for H2 as given in the paper? That would be very helpful. I see similar errors that @vivhitekro states.

@jsspencer
Copy link
Collaborator

@soumitribedi We used a custom optimization algorithm for fractional occupation. Please see section S3.2 of the supplementary material for details.

@vivhitekro A separate issue would be best.

@chenruduan
Copy link

Thanks for the interest.

H2 can be tricky to converge, especially for DM21 as its trained on fractional spin data. If you look at the orbital occupancies during SCF, they frequently switch between the A1g and A1u orbitals. This causes problems with DIIS, as @vivhitekro mentioned. One option is to converge using a quasi-Newton (e.g. scipy.optimize.minimize with L-BGFS-B) or Newton approach (not implemented in pyscf for functionals using meta-GGA and local hybrid features).

A simpler way, which gives stable convergence is to fix the orbital occupancies by symmetry, e.g. to match those from B3LYP or your intuition:

from pyscf import gto
from pyscf import dft
from pyscf import scf
from pyscf.scf import rhf_symm

mol = gto.Mole()
mol.atom ='''
H 0.0 0.0 0.0
H 0.0 0.0 2.3
'''
mol.basis = 'cc-pVDZ'
mol.symmetry = 'Dooh'
mol.build()

mf = dft.RKS(mol)
mf.xc ='B3LYP'
mf.verbose=0
mf.run()
dm0 = mf.make_rdm1()
irrep_nelec0 = mf.get_irrep_nelec()

mf = dft.RKS(mol)
mf._numint = dm21.NeuralNumInt(dm21.Functional.DM21)
mf.grids.level=5
mf.conv_tol=1E-6
mf.conv_tol_grad=1E-3
mf.irrep_nelec = irrep_nelec0
mf.verbose=5

mf.kernel(dm0=dm0)

This converges in around 6 SCF cycles for me and enables fast convergence across the binding curve. Note that the LUMO is lower in energy than the HOMO at this separation -- which is one reason SCF convergence is so problematic and indicates the energy can be further lowered by (partially) occupying the LUMO. This is what led us to investigate fractional occupancies, as described in our paper.

Hi @jsspencer, thanks for your explanation, which helps a lot since I just started playing with DM21! I tried your code above for the H2 dissociation curve but I still get a H2 energy of -0.955678 at 3A H-H distance (FCI should be close to -1). Is this observation expected because we need to use fractional occupation optimizations to reproduce the dissociation curve at Figure 2D? If so, is there a code example for doing this? Thanks!

Chenru

@jsspencer
Copy link
Collaborator

Yes, this is expected -- as Fig 2d (data in the Supplementary data files) shows, 3A is close to the maximum energy for DM21 with integer occupations. Fractional occupations improve the energy as described above and in the paper. The algorithm for doing this optimisation is detailed in the Supplementary material. Unfortunately a open source version is not available at this time.

@chenruduan
Copy link

@jsspencer Thanks so much! The data json is very helpful! I have a naive question then: Will the fractional occupations optimization algorithm improve the dissociation curve at long distance for other conventional functionals (e.g., B3LYP)?

@jsspencer
Copy link
Collaborator

I wouldn't expect an improvement for most conventional functions (cf comments in our paper and work by Becke on B13 functional)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants