In [1]:
import qmctorch

INFO:QMCTorch|  ____    __  ______________             _
INFO:QMCTorch| / __ \  /  |/  / ___/_  __/__  ________/ /  
INFO:QMCTorch|/ /_/ / / /|_/ / /__  / / / _ \/ __/ __/ _ \ 
INFO:QMCTorch|\___\_\/_/  /_/\___/ /_/  \___/_/  \__/_//_/ 


# Jastrow Factor

The wave function of the molecular system is written as:

$$
\Psi(R) = J(R) \sum_n c_n \det(A_\uparrow(r_\uparrow)) \det(A_\downarrow(r_\downarrow))
$$

where $J(R)$ is the so-called Jastrow factor, and $A_\uparrow$ ($A_\downarrow$) is the matrix of molecular orbital values for the spin-up (spin-down) electrons.

The Jastrow factor is written as the exponential of a sum of kernel functions over all electron pairs:

$$
J(R) = \exp\left( \sum_{i<j}  f(r_{ij}) \right) 
$$

where $r_{ij}$ denotes the distance between electrons $i$ and $j$. The kernel function $f(r_{ij})$ can take different forms. Traditionally it is written as a Pade-Jastrow function:

$$
f(r_{ij}) = \frac{a r_{ij}}{1+ \omega r_{ij}}
$$

where $a$ is a fixed weight and $\omega$ a variational parameter.
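
To make the two formulas above concrete, here is a minimal pure-Python sketch that evaluates $J(R)$ by hand for a few electrons. It is not QMCTorch code: the function names, the default values `a=0.5` and `w=0.1`, and the coordinates are illustrative assumptions, and a single weight $a$ is used for every pair.

```python
import math
from itertools import combinations

def pade_kernel(r, a=0.5, w=0.1):
    """Pade-Jastrow kernel f(r) = a r / (1 + w r)."""
    return a * r / (1.0 + w * r)

def jastrow_value(positions, a=0.5, w=0.1):
    """J(R) = exp(sum_{i<j} f(r_ij)) for a list of (x, y, z) positions."""
    total = 0.0
    # loop over every unordered electron pair (i < j)
    for p, q in combinations(positions, 2):
        total += pade_kernel(math.dist(p, q), a, w)
    return math.exp(total)

# three electrons at arbitrary illustrative coordinates (bohr)
electrons = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
print(jastrow_value(electrons))
```

With no electron pair the sum is empty and $J(R)=1$; every additional pair multiplies $J$ by $\exp(f(r_{ij}))$.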

The electron-electron Jastrow factor class (`JastrowFactorElectronElectron`) orchestrates the calculation of the Jastrow factor and of its first and second derivatives. It accepts different kernel functions, for example the `PadeJastrowKernel`. The example below shows how to use this class.

In [8]:
import torch
from qmctorch.wavefunction.jastrows.elec_elec.jastrow_factor_electron_electron import JastrowFactorElectronElectron
from qmctorch.wavefunction.jastrows.elec_elec.kernels import PadeJastrowKernel

# number of spin up/down electrons
nup, ndown = 2, 2
nelec = nup + ndown

# define the jastrow factor
jastrow = JastrowFactorElectronElectron(
            nup, ndown,
            PadeJastrowKernel,
            kernel_kwargs={'w': 0.1})

# define random electronic positions
nbatch = 10
pos = torch.rand(nbatch, nelec * 3)

# compute the jastrow
jval = jastrow(pos)

# Neural Jastrows

The functional form $f(r_{ij}) = ar_{ij}(1+\omega r_{ij})^{-1}$ has a single variational parameter, $\omega$, and therefore offers little flexibility. It is however possible to replace that form by a simple fully connected neural network.

This network takes a single input value ($r_{ij}$) and outputs a single value, i.e. the value of the kernel. A simple 2-layer fully connected neural network Jastrow kernel has been implemented in the `FullyConnectedJastrowKernel` class, which can be used in the `JastrowFactorElectronElectron` as follows:


In [9]:
import torch
from qmctorch.wavefunction.jastrows.elec_elec.jastrow_factor_electron_electron import JastrowFactorElectronElectron
from qmctorch.wavefunction.jastrows.elec_elec.kernels import FullyConnectedJastrowKernel

# number of spin up/down electrons
nup, ndown = 2, 2
nelec = nup + ndown

# define the jastrow factor
jastrow = JastrowFactorElectronElectron(
            nup, ndown,
            FullyConnectedJastrowKernel,
            kernel_kwargs={'size1': 32, 'size2': 64})

# define random electronic positions
nbatch = 10
pos = torch.rand(nbatch, nelec * 3)

# compute the jastrow
jval = jastrow(pos)
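
For intuition, the idea behind a neural kernel can be sketched without any library: a small network maps one number, $r_{ij}$, to one number, $f(r_{ij})$. The block below is a rough illustration only and not the implementation of `FullyConnectedJastrowKernel`. The sigmoid activation, the random initialisation and all helper names are assumptions, and `size1`/`size2` are read here as the widths of two hidden layers.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (illustrative init)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(x, layer, activation=None):
    """Apply y = W x + b, optionally followed by an activation."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def make_kernel(size1=32, size2=64, seed=0):
    """Build a scalar-in, scalar-out network 1 -> size1 -> size2 -> 1."""
    rng = random.Random(seed)
    l1 = make_layer(1, size1, rng)
    l2 = make_layer(size1, size2, rng)
    l3 = make_layer(size2, 1, rng)

    def kernel(r):
        h = dense([r], l1, sigmoid)
        h = dense(h, l2, sigmoid)
        return dense(h, l3)[0]  # single output: the kernel value f(r)

    return kernel

kernel = make_kernel()
print(kernel(0.5), kernel(1.5))
```

In such a sketch every weight and bias would be a variational parameter, which is why a neural kernel carries far more parameters than the single $\omega$ of the Pade form.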

# SlaterJastrow Wave function

Both Jastrow kernels can be used to define a `SlaterJastrow` wave function. The examples below show how to do that for a `LiH` molecule.

In [11]:
import torch
from qmctorch.scf import Molecule
from qmctorch.wavefunction import SlaterJastrow
from qmctorch.wavefunction.jastrows.elec_elec.kernels import PadeJastrowKernel, FullyConnectedJastrowKernel

# define the molecule
mol = Molecule(
            atom='Li 0 0 0; H 0 0 3.14',
            unit='bohr',
            calculator='pyscf',
            basis='sto-3g')

# define the Slater Jastrow wavefunction
wf = SlaterJastrow(mol,
                   jastrow_kernel=PadeJastrowKernel,
                   jastrow_kernel_kwargs={'w': 0.1},
                   configs='single_double(2,2)')

# define random electronic positions
nbatch = 10
pos = torch.rand(nbatch, wf.nelec * 3)

# compute the value of the wave function
wfval = wf(pos)

INFO:QMCTorch|
INFO:QMCTorch| SCF Calculation
INFO:QMCTorch|  Running scf  calculation
converged SCF energy = -7.85928101642665
INFO:QMCTorch|  Molecule name       : HLi
INFO:QMCTorch|  Number of electrons : 4
INFO:QMCTorch|  SCF calculator      : pyscf
INFO:QMCTorch|  Basis set           : sto-3g
INFO:QMCTorch|  SCF                 : HF
INFO:QMCTorch|  Number of AOs       : 6
INFO:QMCTorch|  Number of MOs       : 6
INFO:QMCTorch|  SCF Energy          : -7.859 Hartree
INFO:QMCTorch|
INFO:QMCTorch| Wave Function
INFO:QMCTorch|  Jastrow factor      : True
INFO:QMCTorch|  Jastrow kernel      : PadeJastrowKernel
INFO:QMCTorch|  Highest MO included : 6
INFO:QMCTorch|  Configurations      : single_double(2,2)
INFO:QMCTorch|  Number of confs     : 4
INFO:QMCTorch|  Kinetic energy      : jacobi
INFO:QMCTorch|  Number var  param   : 65
INFO:QMCTorch|  Cuda support        : False


In [12]:
import torch
from qmctorch.scf import Molecule
from qmctorch.wavefunction import SlaterJastrow
from qmctorch.wavefunction.jastrows.elec_elec.kernels import PadeJastrowKernel, FullyConnectedJastrowKernel

# define the molecule
mol = Molecule(
            atom='Li 0 0 0; H 0 0 3.014',
            unit='bohr',
            calculator='pyscf',
            basis='sto-3g')

# define the Slater Jastrow wavefunction
wf = SlaterJastrow(mol,
                   jastrow_kernel=FullyConnectedJastrowKernel,
                   jastrow_kernel_kwargs={'size1': 32, 'size2': 64},
                   configs='single_double(2,2)')

# define random electronic positions
nbatch = 10
pos = torch.rand(nbatch, wf.nelec * 3)

# compute the value of the wave function
wfval = wf(pos)

INFO:QMCTorch|
INFO:QMCTorch| SCF Calculation
INFO:QMCTorch|  Reusing scf results from HLi_pyscf_sto-3g.hdf5
INFO:QMCTorch|
INFO:QMCTorch| Wave Function
INFO:QMCTorch|  Jastrow factor      : True
INFO:QMCTorch|  Jastrow kernel      : FullyConnectedJastrowKernel
INFO:QMCTorch|  Highest MO included : 6
INFO:QMCTorch|  Configurations      : single_double(2,2)
INFO:QMCTorch|  Number of confs     : 4
INFO:QMCTorch|  Kinetic energy      : jacobi
INFO:QMCTorch|  Number var  param   : 2210
INFO:QMCTorch|  Cuda support        : False


# TODO: Exploring different network architectures

It would be interesting to see how the accuracy of the total energy varies with the network architecture (i.e. the number of layers, the layer sizes, the activation function, etc.). We should compare the performance of the `PadeJastrowKernel` and the `FullyConnectedJastrowKernel` for a few different molecules. Some candidate molecules can be found in https://arxiv.org/pdf/1909.02487.pdf together with their optimum energies.
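
One simple way to score such a comparison is the deviation of the optimized total energy from the reference value. The helper below is only a sketch: the reference energies are the ones quoted in the comments of the ADF cell of this notebook, while the function name and the sample energy passed to it are hypothetical.

```python
# Reference total energies in hartree, as quoted in the ADF cell of this notebook
REFERENCE_ENERGY = {
    'H2': -1.169,
    'LiH': -8.0705,
    'Li2': -14.9954,
    'N2': -109.5423,
}

def energy_error_mha(molecule, energy):
    """Return E - E_ref in millihartree for a total energy given in hartree."""
    return 1000.0 * (energy - REFERENCE_ENERGY[molecule])

# hypothetical optimized energy, for illustration only (not a computed result)
print(energy_error_mha('LiH', -8.0000))
```

A variational energy lies above the exact one, so a smaller positive error means a better wave function.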


In [14]:
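from torch import optim
from qmctorch.scf import Molecule
from qmctorch.wavefunction import SlaterJastrow
from qmctorch.wavefunction.jastrows.elec_elec.kernels import PadeJastrowKernel, FullyConnectedJastrowKernel
from qmctorch.sampler import Metropolis
from qmctorch.solver import SolverSlaterJastrow

# the wave function below is built on the CPU, so the sampler runs there too
cuda = False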

# We should use ADF as calculator so this cell requires a valid ADF license

if 0:
    # H2 : Expected exact total energy : -1.169
    mol = Molecule(atom='H 0 0 -0.69; H 0. 0. 0.69', calculator='adf', basis='dzp', unit='bohr')

    # LiH : Expected exact total energy : -8.0705
    # mol = Molecule(atom='Li 0.0 0.0 0.0; H 0. 0. 3.015', calculator='adf', basis='dzp', unit='bohr')

    # Li2 : Expected exact total energy : -14.9954
    # mol = Molecule(atom='Li 0.0 0.0 0.0; Li 0. 0. 5.051', calculator='adf', basis='dzp', unit='bohr')

    # N2 : Expected exact total energy : -109.5423
    # mol = Molecule(atom='N 0.0 0.0 0.0; N 0. 0. 2.068', calculator='adf', basis='dzp', unit='bohr')

    # wavefunction
    wf = SlaterJastrow(mol,
                       jastrow_kernel=FullyConnectedJastrowKernel,
                       jastrow_kernel_kwargs={'size1': 32, 'size2': 64},
                       configs='single_double(4,12)')

    # sampler
    sampler = Metropolis(nwalkers=10000,
                         nstep=2000, step_size=0.05,
                         ntherm=-1, ndecor=100,
                         nelec=wf.nelec, init=mol.domain('atomic'),
                         move={'type': 'all-elec', 'proba': 'normal'},
                         cuda=cuda)

    # optimizer
    lr_dict = [{'params': wf.jastrow.parameters(), 'lr': 1E-2},
               {'params': wf.ao.parameters(), 'lr': 1E-2},
               {'params': wf.mo.parameters(), 'lr': 1E-2},
               {'params': wf.fc.parameters(), 'lr': 1E-2}]
    opt = optim.Adam(lr_dict, lr=1E-3)

    # scheduler
    scheduler = optim.lr_scheduler.StepLR(opt, step_size=100, gamma=0.90)

    # solver
    solver = SolverSlaterJastrow(wf=wf, sampler=sampler,
                                 optimizer=opt, scheduler=scheduler)

    # optimize the wave function
    solver.track_observable(['local_energy', 'parameters'])

    solver.configure_resampling(mode='update',
                                resample_every=1,
                                nstep_update=100)
    solver.ortho_mo = False

    obs = solver.run(500, batchsize=200)

Preliminary results have shown that the fully connected network does not yield much better results than the Pade-Jastrow kernel.

![Fully connected VS Pade Jastrow](./fcjastrow.png)

# Bonus: Electron-electron cusp conditions

In the simple Pade-Jastrow function the value of the $a$ parameter is given by the so-called electron-electron cusp conditions. These conditions ensure that the total energy remains finite at the coalescence point, i.e. when two electrons have the same position.

It can be shown that this is satisfied when

$$
\frac{1}{J(R)}{\frac{\partial J(R)}{\partial r_{ij}} |_{r_{ij}=0}} = \frac{1}{4} 
$$

for same-spin electrons and 

$$
\frac{1}{J(R)}{\frac{\partial J(R)}{\partial r_{ij}} |_{r_{ij}=0}} = \frac{1}{2} 
$$

for opposite-spin electrons.

For the Pade-Jastrow function this translates into $a=1/4$ ($a=1/2$) for same-spin (opposite-spin) electrons. How to enforce the same condition for the fully connected Jastrow kernel remains to be clarified.
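
The link between the cusp conditions and $a$ can be checked numerically. Since $J = \exp(\sum f)$, the left-hand side of the conditions reduces to $f'(r_{ij})$ at $r_{ij}=0$, and for the Pade form $f'(r) = a/(1+\omega r)^2$, so $f'(0) = a$ whatever the value of $\omega$. The sketch below verifies this with a finite difference; the function names and the value `w=0.1` are illustrative assumptions, and evaluating the kernel at a small negative $r$ is purely a numerical device.

```python
def pade_kernel(r, a, w=0.1):
    """Pade-Jastrow kernel f(r) = a r / (1 + w r)."""
    return a * r / (1.0 + w * r)

def slope_at_zero(f, h=1e-6):
    """Central finite-difference estimate of f'(0)."""
    return (f(h) - f(-h)) / (2.0 * h)

# the slope at coalescence should match the cusp value for each spin case
for label, a in [('same spin', 0.25), ('opposite spin', 0.5)]:
    slope = slope_at_zero(lambda r: pade_kernel(r, a))
    print(label, slope)
```

The same finite-difference check could in principle be applied to any other kernel to see how far its slope at $r_{ij}=0$ is from the required value.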
