In [1]:
from sympy import *
init_printing()
import scipy.linalg
import numpy as np
import warnings
from matplotlib import MatplotlibDeprecationWarning
warnings.filterwarnings("ignore",category=MatplotlibDeprecationWarning)

# Orbital Rotation

One method to optimize the single-particle orbitals in the Slater determinant is to mix occupied and unoccupied orbitals.  The orbitals are orthonormal, and any mixing transformation should preserve that.  A rotation matrix meets those conditions.  However, the entries in a rotation matrix are not independent, which makes them awkward to use as variational parameters.  A rotation matrix can instead be expressed as the exponential of a skew-symmetric matrix, and the entries above the diagonal of that matrix are independent.

See Chapter 3 of "Molecular Electronic Structure Theory" by Trygve Helgaker, Poul Jorgensen, and Jeppe Olsen

See also the Wikipedia page on rotation matrices https://en.wikipedia.org/wiki/Rotation_matrix

### Size 2 example
Write a 2x2 skew-symmetric matrix

In [2]:
kappa = Symbol('kappa',real=True)
m = Matrix([[0,kappa],[-kappa,0]])
m

⎡0   κ⎤
⎢     ⎥
⎣-κ  0⎦

Apply the matrix exponential and we get the 2x2 rotation matrix

In [3]:
exp(m)

⎡cos(κ)   sin(κ)⎤
⎢               ⎥
⎣-sin(κ)  cos(κ)⎦

We could do the 3x3 case, but the form looks different from how a 3x3 rotation matrix is usually expressed.
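
Even without the closed form, the 3x3 case can be checked numerically. The following is a minimal sketch, using `scipy.linalg.expm` and arbitrary illustrative parameter values (assumptions, not taken from the cells above), that checks the exponential of a skew-symmetric matrix is orthogonal with unit determinant.

```python
import numpy as np
import scipy.linalg

# Arbitrary illustrative rotation parameters (assumed values).
k1, k2, k3 = 0.3, -0.2, 0.5

# 3x3 skew-symmetric matrix, same sign convention as the 2x2 example: K.T == -K
K = np.array([[0.0,  k1,  k2],
              [-k1, 0.0,  k3],
              [-k2, -k3, 0.0]])

# Numerical matrix exponential
R = scipy.linalg.expm(K)

# A rotation matrix is orthogonal (R^T R = I) and has determinant +1.
orthogonality_error = np.max(np.abs(R.T @ R - np.eye(3)))
det_R = np.linalg.det(R)
print(orthogonality_error, det_R)
```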

# Combining rotation matrices
Note that if X and Y do not commute, then $\exp(X)\exp(Y) \ne \exp(X+Y)$.
We can demonstrate this with a 3x3 rotation matrix.  The symbolic form would get rather lengthy, so we will use an example with concrete values for the rotation parameters.

In [5]:
k1,k2,k3 = symbols('kappa_1 kappa_2 kappa_3',real=True)

In [25]:
# 3x3 skew-symmetric matrix.  Note the signs are different than is traditionally written.
m1 = Matrix([[0, k1, k2],[-k1, 0, k3], [-k2, -k3, 0]])
m1

⎡ 0   κ₁   κ₂⎤
⎢            ⎥
⎢-κ₁   0   κ₃⎥
⎢            ⎥
⎣-κ₂  -κ₃  0 ⎦

In [64]:
m2 = m1.subs({k1:0.1, k2:0.2, k3:0})
m3 = m1.subs({k1:0.15, k2:0.5, k3:0.0})
m2,m3

⎛⎡ 0    0.1  0.2⎤  ⎡  0    0.15  0.5⎤⎞
⎜⎢              ⎥  ⎢                ⎥⎟
⎜⎢-0.1   0    0 ⎥, ⎢-0.15   0     0 ⎥⎟
⎜⎢              ⎥  ⎢                ⎥⎟
⎝⎣-0.2   0    0 ⎦  ⎣-0.5    0     0 ⎦⎠

In [65]:
# m2 and m3 do not commute
m2*m3 - m3*m2

⎡0   0      0  ⎤
⎢              ⎥
⎢0   0    -0.02⎥
⎢              ⎥
⎣0  0.02    0  ⎦

In [66]:
# The product of exponentials
exp(m2) * exp(m3)

⎡ 0.73630111157537   0.230520546096726    0.636176823627002 ⎤
⎢                                                           ⎥
⎢-0.223771239225615  0.970234879766162   -0.0925781323022542⎥
⎢                                                           ⎥
⎣-0.638582105596693  -0.074192694467908   0.765968888728051 ⎦

In [67]:
# The exponential of the sum.  Not the same as the product of exponentials
exp(m2+m3)

⎡0.736237065559526    0.227606815284241    0.637299082795875 ⎤
⎢                                                            ⎥
⎢-0.227606815284241   0.970162563977322   -0.0835448208634987⎥
⎢                                                            ⎥
⎣-0.637299082795875  -0.0835448208634986   0.766074501582204 ⎦

In [68]:
m2 + m3

⎡  0    0.25  0.7⎤
⎢                ⎥
⎢-0.25   0     0 ⎥
⎢                ⎥
⎣-0.7    0     0 ⎦

In [69]:
# Take the matrix log of the product.  Sympy doesn't have the matrix log, must use scipy instead
# Read off the new values for k1, k2 and k3 from the skew-symmetric matrix form.
# The rotation parameters are small, so the values are close to m2+m3, but not quite the same.
m4 = exp(m2) * exp(m3)
m4p = np.array(m4).astype(np.float64)
scipy.linalg.logm(m4p)

array([[-3.46454773e-16,  2.49492270e-01,  7.00084196e-01],
       [-2.49492270e-01,  2.11772169e-16, -1.00970891e-02],
       [-7.00084196e-01,  1.00970891e-02, -2.34862926e-16]])

This is an issue when handling rotation parameters, because the code assumes that variational parameters can simply be summed.
When applying multiple rotations, this is no longer the case, and the rotation parameters require more careful handling.
To get an exact answer, we take the exponentials, multiply the matrices, then take the matrix log to recover the parameters.


As an aside, the Baker-Campbell-Hausdorff formula could be used to approximate the combined parameter matrix, $\log(\exp(X)\exp(Y))$, as the sum $X + Y$ plus corrections written in terms of nested commutators.
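
The leading terms of that series are $X + Y + \frac{1}{2}[X,Y] + \ldots$ with $[X,Y] = XY - YX$. The sketch below reuses the numerical values of `m2` and `m3` from above as plain NumPy arrays (the names `X`, `Y`, and `commutator` are introduced here only for illustration) and compares the plain sum and the first commutator correction against the exact matrix log.

```python
import numpy as np
import scipy.linalg

def commutator(a, b):
    """Matrix commutator [a, b] = ab - ba."""
    return a @ b - b @ a

# Same values as m2 and m3 above, as NumPy arrays.
X = np.array([[0.0, 0.1, 0.2], [-0.1, 0.0, 0.0], [-0.2, 0.0, 0.0]])
Y = np.array([[0.0, 0.15, 0.5], [-0.15, 0.0, 0.0], [-0.5, 0.0, 0.0]])

# Exact combined parameters: log(exp(X) exp(Y)).
# np.real drops any imaginary round-off (assumed negligible here).
Z_exact = np.real(scipy.linalg.logm(scipy.linalg.expm(X) @ scipy.linalg.expm(Y)))

# Truncations of the Baker-Campbell-Hausdorff series.
Z_first = X + Y                                # plain sum of parameters
Z_second = Z_first + 0.5 * commutator(X, Y)    # first commutator correction

err_first = np.max(np.abs(Z_exact - Z_first))
err_second = np.max(np.abs(Z_exact - Z_second))
print(err_first, err_second)
```

For these small parameters the commutator term accounts for most of the difference between the plain sum and the exact result; larger rotations would need more terms of the series.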

# Fill-in
This is an issue particular to QMC's usage of rotation matrices.  The only rotations of interest (that have non-zero derivatives) are between occupied and unoccupied orbitals.  The rotations between only occupied orbitals or only unoccupied orbitals do not change the energy.  However, when combining rotation matrices by the above procedure, the values corresponding to those rotations may become non-zero.  That means capturing the state of the rotation requires more than just the occupied-unoccupied rotational parameters.
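
A small numerical illustration of fill-in, assuming 2 occupied and 2 unoccupied orbitals and arbitrary parameter values. The helper `kappa_from_occ_virt` is hypothetical and exists only for this sketch. Both input matrices contain only occupied-unoccupied entries, yet the combined matrix picks up entries in the occupied-occupied block.

```python
import numpy as np
import scipy.linalg

n_occ, n_orb = 2, 4  # assumed sizes, for illustration only

def kappa_from_occ_virt(block):
    """Hypothetical helper: skew-symmetric matrix whose only non-zero
    entries are occupied-unoccupied rotation parameters."""
    kappa = np.zeros((n_orb, n_orb))
    kappa[:n_occ, n_occ:] = block
    kappa[n_occ:, :n_occ] = -block.T
    return kappa

# Two sets of occupied-unoccupied parameters (arbitrary values).
X = kappa_from_occ_virt(np.array([[0.1, 0.2], [0.0, 0.1]]))
Y = kappa_from_occ_virt(np.array([[0.15, 0.5], [0.2, 0.0]]))

# Combine exactly: exponentiate, multiply, take the matrix log.
# np.real drops any imaginary round-off (assumed negligible here).
Z = np.real(scipy.linalg.logm(scipy.linalg.expm(X) @ scipy.linalg.expm(Y)))

occ_occ = Z[:n_occ, :n_occ]      # zero in both X and Y
virt_virt = Z[n_occ:, n_occ:]    # zero in both X and Y
print(occ_occ)
print(virt_virt)
```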

## Derivatives

To compute parameter derivatives of a wavefunction that uses the rotation matrix, we can use Jacobi's formula (https://en.wikipedia.org/wiki/Jacobi%27s_formula)
$$
\frac{1}{\det(A)} \frac{d}{dt} \det(A) = \mathrm{tr} (A^{-1} \frac{dA}{dt})
$$
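
Jacobi's formula can be checked with a finite difference. This is a minimal sketch on an arbitrary 3x3 matrix family $A(t) = A_0 + t\,A_1$; the matrices and the step size `h` are assumptions chosen for illustration.

```python
import numpy as np

# Arbitrary invertible matrix and an arbitrary direction dA/dt.
A0 = np.array([[2.0, 1.0, 0.0],
               [1.0, 3.0, 1.0],
               [0.0, 1.0, 4.0]])
A1 = np.array([[0.5, -1.0, 2.0],
               [1.0, 0.0, -0.5],
               [0.3, 0.7, 1.0]])

def A(t):
    """Matrix family A(t) = A0 + t*A1, so dA/dt = A1."""
    return A0 + t * A1

# Left-hand side: (1/det A) d(det A)/dt at t = 0, by central difference.
h = 1e-6
lhs = (np.linalg.det(A(h)) - np.linalg.det(A(-h))) / (2 * h) / np.linalg.det(A0)

# Right-hand side: tr(A^{-1} dA/dt) at t = 0.
rhs = np.trace(np.linalg.inv(A0) @ A1)
print(lhs, rhs)
```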

The argument of the determinant needs more detail.  It is a matrix of single-particle orbitals (SPOs) evaluated at the different electron positions.  Without orbital rotation, the number of SPOs used equals the number of electrons, which yields a square matrix.  With the addition of a rotation matrix, more SPOs are used than electron positions, leaving us to deal with non-square matrices.
$$
A = E R \Phi
$$
where $\Phi$ is the matrix of SPOs, $R$ is the rotation matrix.
Let $N$ be the number of electrons and $M$ be the number of SPOs.   The matrix $\Phi$ is $M$ x $N$.  The rotation matrix, $R$, is $M$ x $M$.  The product $R \Phi$ is rectangular with dimensions $M$ x $N$.  The final matrix must be square, so formally we add a "selection" matrix, $E$, of dimensions $N$ x $M$.  The matrix $E$ is the identity in the square part, and zeros elsewhere.
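
The shapes can be made concrete with a short sketch. The sizes, the random placeholder values in `Phi`, and the single non-zero rotation parameter are all assumptions for illustration.

```python
import numpy as np
import scipy.linalg

N, M = 2, 4  # electrons and SPOs (assumed sizes)

# Placeholder SPO values: row = orbital, column = electron position.
rng = np.random.default_rng(1)
Phi = rng.normal(size=(M, N))

# Rotation mixing orbital 0 (occupied) with orbital 2 (unoccupied).
kappa = np.zeros((M, M))
kappa[0, 2] = 0.1
kappa[2, 0] = -0.1
R = scipy.linalg.expm(kappa)  # M x M

# Selection matrix: identity in the leading N x N block, zeros elsewhere.
E = np.hstack([np.eye(N), np.zeros((N, M - N))])  # N x M

A = E @ R @ Phi  # N x N
print(A.shape)
```

Multiplying by `E` keeps the first `N` rows of `R @ Phi`, that is, the rotated occupied orbitals.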

### Determinant of product

For square matrices, $\det(AB) = \det(A)\det(B)$.  For rectangular matrices, we need to use the Cauchy-Binet formula (https://en.wikipedia.org/wiki/Cauchy%E2%80%93Binet_formula).
The structure of $E$ implies the only nonzero contribution to the determinant comes from the "square" part.
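
To see this term by term, the sketch below evaluates the Cauchy-Binet sum for a small case. The matrix `B` stands in for $R \Phi$ and holds arbitrary values; the sizes are assumptions for illustration.

```python
import itertools
import numpy as np

N, M = 2, 4  # assumed sizes

# Selection matrix E (N x M) and a stand-in B for the M x N product R Phi.
E = np.hstack([np.eye(N), np.zeros((N, M - N))])
B = np.array([[1.0, 2.0],
              [0.5, -1.0],
              [3.0, 0.0],
              [-2.0, 1.5]])

# Cauchy-Binet: det(E B) = sum over N-element index sets S of
#   det(E[:, S]) * det(B[S, :])
terms = {}
for S in itertools.combinations(range(M), N):
    idx = list(S)
    terms[S] = np.linalg.det(E[:, idx]) * np.linalg.det(B[idx, :])

cauchy_binet = sum(terms.values())
direct = np.linalg.det(E @ B)

# Index sets that contribute a non-zero term.
nonzero_sets = [S for S, value in terms.items() if abs(value) > 1e-14]
print(nonzero_sets, cauchy_binet, direct)
```

Only the index set made of the first `N` rows contributes, because every other column selection of `E` contains a zero column.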

### Derivative of rotation matrix
We always evaluate the rotation matrix at zero angle.

Expand the matrix exponential $R = \exp(X)$ in the power series $1 + X + 1/2 X^2 + ...$

The derivative is then
$$
\frac{dR}{dp_i} = 0 + \frac{dX}{dp_i} + \frac{1}{2} \left( \frac{dX}{dp_i} X + X \frac{dX}{dp_i} \right) + \ldots
$$
When evaluating at zero angle, $X = 0$, so all higher order terms involving $X$ are zero.  We are left with a single $dX/dp_i$ term.  That matrix has two non-zero entries corresponding to the location of the parameter $p_i$.  One is 1 and the other is -1.
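
This can be confirmed numerically with a central difference around zero angle. The sketch assumes a single parameter `p` that rotates orbitals `i` and `j`; the function `kappa`, the indices, and the step size are illustrative choices, not part of any existing code.

```python
import numpy as np
import scipy.linalg

M = 3        # number of orbitals (assumed)
i, j = 0, 2  # the parameter p mixes orbitals i and j (assumed)

def kappa(p):
    """Skew-symmetric matrix with a single rotation parameter p."""
    X = np.zeros((M, M))
    X[i, j] = p
    X[j, i] = -p
    return X

# Analytic result at zero angle: dR/dp = dX/dp, with +1 at (i, j) and -1 at (j, i).
dX = kappa(1.0)

# Central finite difference of R(p) = exp(kappa(p)) around p = 0.
h = 1e-6
dR = (scipy.linalg.expm(kappa(h)) - scipy.linalg.expm(kappa(-h))) / (2 * h)
print(dR)
```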