<a href="https://colab.research.google.com/github/dnguyend/ManNullRange/blob/master/colab/pd_symbolic.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Demonstrating symbolic derivation of gradient and Hessian for the manifold of Positive-definite matrices

$\newcommand{\JJ}{\mathrm{J}}$
$\newcommand{\rN}{\mathrm{N}}$
$\newcommand{\rD}{\mathrm{D}}$
$\newcommand{\rK}{\mathrm{K}}$
$\newcommand{\sfg}{\mathsf{g}}$
$\newcommand{\cE}{\mathcal{E}}$
$\newcommand{\cM}{\mathcal{M}}$
$\newcommand{\cH}{\mathcal{H}}$
$\newcommand{\ft}{\mathfrak{t}}$
$\newcommand{\Null}{\mathsf{Null}}$
$\newcommand{\xtrace}{\mathsf{xtrace}}$
$\newcommand{\rhess}{\mathsf{rhess}}$
# Formulas used here:
* $\JJ(Y)\eta = \eta - \eta^{\ft}$. $\Null(\JJ(Y))$ is the tangent space $T_Y\cM$.
* $\sfg(Y)\eta = Y^{-1}\eta Y^{-1}$: the metric, given as a self-adjoint operator on the ambient space $\cE$
* $\Pi_{\sfg} = I - \sfg^{-1}\JJ^{\ft}(\JJ\sfg^{-1}\JJ^{\ft})^{-1}\JJ$ projection to $T_Y\cM$.
* Gradient is $\Pi_{\sfg}\sfg^{-1}f_Y$ 
* $\xtrace$: index raising operator for trace (Frobenius) inner product. Very simple for matrix expressions:
   * $\xtrace(AbC, b) = A^{\ft}C^{\ft}$
   * $\xtrace(Ab^{\ft}C, b) = CA$
* $\JJ^{\ft}$ is evaluated by $\xtrace$
* $\rK(\xi, \eta)$: Christoffel metric term $\frac{1}{2}((\rD_{\xi}\sfg)\eta + (\rD_{\eta}\sfg)\xi-\xtrace(\langle(\rD_\phi\sfg)\xi, \eta\rangle_{\cE}, \phi))$
* $\Gamma_{c}(\xi, \eta)$: Christoffel function $\Pi_{\sfg}\sfg^{-1}\rK(\xi, \eta)-(\rD_{\xi}\Pi_{\sfg})\eta$
* $\rhess^{02}(\xi, \eta) = f_{YY}(\xi, \eta) - \langle \Gamma_{c}(\xi, \eta), f_Y\rangle_{\cE}$
* $\rhess^{11}\xi = \xtrace(\rhess^{02}(\xi, \eta), \eta)$
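
The two $\xtrace$ rules above can be checked numerically. The following is a minimal NumPy sketch, not part of the ManNullRange package: the helper `frob` and the random test matrices are illustrative assumptions. It verifies that $\mathrm{Tr}(AdC) = \langle A^{\ft}C^{\ft}, d\rangle$ and $\mathrm{Tr}(Ad^{\ft}C) = \langle CA, d\rangle$ under the trace inner product.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
# Random square matrices standing in for A, C and the direction d (assumed data).
A, C, d = (rng.standard_normal((n, n)) for _ in range(3))


def frob(X, Z):
    """Trace (Frobenius) inner product <X, Z> = Tr(X Z^T). Hypothetical helper."""
    return np.trace(X @ Z.T)


# Rule 1: Tr(A d C) = <A^T C^T, d>
lhs1 = np.trace(A @ d @ C)
rhs1 = frob(A.T @ C.T, d)

# Rule 2: Tr(A d^T C) = <C A, d>
lhs2 = np.trace(A @ d.T @ C)
rhs2 = frob(C @ A, d)

# Both differences should be at the level of floating-point round-off.
print(abs(lhs1 - rhs1), abs(lhs2 - rhs2))
```

Because both expressions are linear in $d$, agreement on a random direction is a reasonable sanity check of the index raising rule.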


In [None]:
!git clone https://github.com/dnguyend/ManNullRange.git

Cloning into 'ManNullRange'...
remote: Enumerating objects: 144, done.
remote: Counting objects: 100% (144/144), done.
remote: Compressing objects: 100% (89/89), done.
remote: Total 144 (delta 95), reused 89 (delta 54), pack-reused 0
Receiving objects: 100% (144/144), 165.37 KiB | 5.01 MiB/s, done.
Resolving deltas: 100% (95/95), done.


In [None]:
!ls

ManNullRange  sample_data


We import the package's main functions for taking directional derivatives (`DDR`), performing index raising (`xtrace`), and pretty-printing results in LaTeX.

In [None]:
from collections import OrderedDict
from IPython.display import display, Math
from sympy import symbols, Integer
from ManNullRange.symbolic import SymMat as sm
from ManNullRange.symbolic.SymMat import (
    matrices, t, mat_spfy, xtrace, trace, DDR,
    latex_map, mat_latex, simplify_pd_tangent, inv)


def pprint(expr):
    display(Math(latex_map(mat_latex(expr), OrderedDict(
        [('fYY', r'f_{YY}'), ('fY', 'f_Y'), ('al', r'\alpha')]))))

The following cell defines the main symbols and the operators `J`, `g`, and `g_inv`. Note that `J_adj` does not have to be derived on paper: it is obtained from `J` through the index raising operator `xtrace`.

In [None]:
if True:
    """ For positive definite matrices
    Y is a manifold point, a positive definite matrix
    eta is an ambient point, the same size as Y, not necessarily
    symmetric or invertible
    b is a point in E_J; b is antisymmetric
    """
    # eta is an ambient (unconstrained) matrix symbol
    Y = sm.sym_symb('Y')
    eta = matrices('eta')
    b = sm.asym_symb('b')
    
    def J(Y, eta):
        return eta - t(eta)
    
    def J_adj(Y, a):
        dY = symbols('dY', commutative=False)
        return xtrace(trace(mat_spfy(J(Y, dY) * t(a))), dY)

    def g(Y, eta):
        return inv(Y)*eta*inv(Y)

    def g_inv(Y, eta):
        return Y*eta*Y
    
    J_g_inv_J_adj = J(Y, g_inv(Y, J_adj(Y, b)))
    print("this is J_g_inv_J_adj")
    pprint(J_g_inv_J_adj)

this is J_g_inv_J_adj


<IPython.core.display.Math object>
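
As a numerical cross-check of the operator $\JJ\sfg^{-1}\JJ^{\ft}$, the sketch below rebuilds `J`, `J_adj` and `g_inv` with NumPy arrays instead of symbols. It is an illustration under stated assumptions, not the package's code: `J_adj` is written by hand as $a \mapsto a - a^{\ft}$ (the adjoint of `J` under the trace inner product), and the closed form $4YbY$ it is compared against is derived here for symmetric $Y$ and antisymmetric $b$, not read from the rendered output.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n))
Y = M @ M.T + n * np.eye(n)       # symmetric positive definite test point
W = rng.standard_normal((n, n))
b = W - W.T                       # antisymmetric element of E_J


def J(eta):
    return eta - eta.T


def J_adj(a):
    # Hand-derived adjoint (assumption): <J eta, a> = <eta, a - a^T>.
    return a - a.T


def g_inv(Y, eta):
    return Y @ eta @ Y


def solve_JginvJadj(Y, a):
    # Same formula as the symbolic solve_JginvJadj: (1/4) Y^{-1} a Y^{-1}.
    Yinv = np.linalg.inv(Y)
    return 0.25 * Yinv @ a @ Yinv


lhs = J(g_inv(Y, J_adj(b)))       # J g^{-1} J^T applied to b
rhs = 4 * Y @ b @ Y               # expected closed form for antisymmetric b
recovered = solve_JginvJadj(Y, lhs)

print(np.max(np.abs(lhs - rhs)), np.max(np.abs(recovered - b)))
```

If both printed values are near machine precision, the factor $\frac{1}{4}Y^{-1}(\cdot)Y^{-1}$ is a valid inverse of $\JJ\sfg^{-1}\JJ^{\ft}$ on antisymmetric matrices.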

We define a function that inverts `J_g_inv_J_adj`; the projection is then just a composition of operators:

In [None]:
    def solve_JginvJadj(Y, a):
        return Integer(1)/Integer(4)*inv(Y)*a*inv(Y)

    def proj(Y, omg):
        jo = mat_spfy(J(Y, omg))
        cJinvjo = solve_JginvJadj(Y, jo)
        return mat_spfy(omg - mat_spfy(
            g_inv(Y, mat_spfy(J_adj(Y, cJinvjo)))))

    def r_gradient(Y, omg):
        return mat_spfy(
            proj(Y, mat_spfy(g_inv(Y, omg))))
    print("This is the projection")
    pprint(proj(Y, eta))
    print("This is the gradient")
    pprint(r_gradient(Y, eta))


This is the projection


<IPython.core.display.Math object>

This is the gradient


<IPython.core.display.Math object>
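
The projection and gradient formulas can also be exercised numerically. The following NumPy sketch mirrors the composition used in `proj` and `r_gradient`; the array-based helpers and the random test data are assumptions made for illustration. It checks that the projection is idempotent, lands in the symmetric matrices, and that the discarded component is orthogonal to tangent vectors under the metric $\sfg$.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
M = rng.standard_normal((n, n))
Y = M @ M.T + n * np.eye(n)            # symmetric positive definite test point
Yinv = np.linalg.inv(Y)
omg = rng.standard_normal((n, n))      # ambient matrix, not symmetric
S = rng.standard_normal((n, n))
xi = S + S.T                           # a tangent (symmetric) vector


def g(eta):
    return Yinv @ eta @ Yinv


def proj(omg):
    jo = omg - omg.T                   # J(Y, omg)
    c = 0.25 * Yinv @ jo @ Yinv        # (J g^{-1} J^T)^{-1} J omg
    return omg - Y @ (c - c.T) @ Y     # omg - g^{-1} J^T c


def r_gradient(fY):
    return proj(Y @ fY @ Y)            # Pi_g g^{-1} f_Y


p = proj(omg)
grad = r_gradient(omg)
# Metric inner product between the removed component and a tangent vector.
orth = np.trace(g(omg - p) @ xi.T)

print(np.max(np.abs(p - p.T)), np.max(np.abs(proj(p) - p)), abs(orth))
```

In this sketch the projection coincides with taking the symmetric part of `omg`, so the gradient of a function with ambient gradient `fY` is the symmetric part of `Y @ fY @ Y`.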

In [None]:
    xi, phi = matrices('xi phi')
    xcross = xtrace(mat_spfy(trace(DDR(g(Y, eta), Y, phi) * t(xi))), phi)
    K = (Integer(1)/Integer(2))*(
        DDR(g(Y, eta), Y, xi) + DDR(g(Y, xi), Y, eta) - xcross)

    def d_proj(Y, xi, omg):
        e = matrices('e')
        r = mat_spfy(proj(Y, e))
        expr = DDR(r, Y, xi)
        return expr.xreplace({e: omg})

    dp_xi_eta = d_proj(Y, xi, eta)
    prK = simplify_pd_tangent(proj(Y, mat_spfy(g_inv(Y, K))), Y, (xi, eta))
    Gamma = mat_spfy(
        simplify_pd_tangent(-dp_xi_eta+prK, Y, (xi, eta)))
    print("This is the Christoffel function:")
    pprint(Gamma)
    fY, fYY = matrices('fY fYY')
    rhess02 = trace(mat_spfy(t(eta)*fYY*xi-Gamma * t(fY)))
    rhess11_bf_gr = xtrace(rhess02, eta)
    print("This is the Riemannian Hessian Vector Product:")
    pprint(r_gradient(Y, rhess11_bf_gr))      

This is the Christoffel function:


<IPython.core.display.Math object>

This is the Riemannian Hessian Vector Product:


<IPython.core.display.Math object>
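
One way to sanity-check a Christoffel function is against a known geodesic. The sketch below assumes the standard closed forms for this metric, which are not read from the rendered output above: $\Gamma(\xi, \eta) = -\frac{1}{2}(\xi Y^{-1}\eta + \eta Y^{-1}\xi)$ and the geodesic $\gamma(t) = Y^{1/2}\exp(tY^{-1/2}\xi Y^{-1/2})Y^{1/2}$. It verifies by a central finite difference that $\ddot\gamma(0) + \Gamma(\xi, \xi) \approx 0$. The helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
M = rng.standard_normal((n, n))
Y = M @ M.T + n * np.eye(n)            # symmetric positive definite test point
X = rng.standard_normal((n, n))
xi = 0.5 * (X + X.T)                   # tangent (symmetric) direction


def christoffel(Y, xi, eta):
    # Assumed closed form for the metric g(Y) eta = Y^{-1} eta Y^{-1}.
    Yinv = np.linalg.inv(Y)
    return -0.5 * (xi @ Yinv @ eta + eta @ Yinv @ xi)


def geodesic(Y, xi, t):
    # gamma(t) = Y^{1/2} exp(t Y^{-1/2} xi Y^{-1/2}) Y^{1/2}, via eigendecompositions.
    w, V = np.linalg.eigh(Y)
    Yh = (V * np.sqrt(w)) @ V.T        # Y^{1/2}
    Yhi = (V / np.sqrt(w)) @ V.T       # Y^{-1/2}
    Sm = Yhi @ xi @ Yhi
    s, U = np.linalg.eigh(0.5 * (Sm + Sm.T))
    return Yh @ (U * np.exp(t * s)) @ U.T @ Yh


h = 1e-3
# Central second difference approximating the acceleration at t = 0.
acc = (geodesic(Y, xi, h) - 2 * geodesic(Y, xi, 0.0) + geodesic(Y, xi, -h)) / h**2
residual = np.max(np.abs(acc + christoffel(Y, xi, xi)))

print(residual)
```

The residual is limited by the finite-difference step, so a small but nonzero value is expected.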