Expand the function $f : \mathbb{R}^3 \rightarrow \mathbb{R}$ to second order in a neighborhood of the point $x$:

$$
\begin{aligned}
f(z) &\approx f(x) + f'(x)(z-x) + \dfrac{1}{2} (f''(x)(z-x), z-x) \\
&= f(x) - f'(x)\,x + \dfrac{1}{2} (f''(x)\,x, x) + f'(x)\,z - (f''(x)\,x, z) + \dfrac{1}{2} (f''(x)\,z, z) \\
&= \textbf{C} + \dfrac{1}{2} (\textbf{A}\,z, z) - 2\textbf{b}^T z
\end{aligned}
$$

The penultimate step holds because the Hessian is symmetric, so that $(f''(x)\,z, x) = (f''(x)\,x, z)$.

Here $\textbf{C} = f(x) - f'(x)\,x + \dfrac{1}{2} (f''(x)\,x, x)$ is a scalar constant that does not depend on the choice of $z$, $\textbf{A} = f''(x)$, and $\textbf{b} = \dfrac{1}{2} (f''(x)\,x - f'(x))$.
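The regrouping into $\textbf{C}$, $\textbf{A}$ and $\textbf{b}$ can be checked numerically. The sketch below is only an illustration: the helper `quadratic_model_coeffs` and the test function `f` (with its hand-written gradient and Hessian) are hypothetical and do not come from the notebook. It compares the Taylor form of the model with the compact form at an arbitrary point $z$.

```python
import numpy as np

def quadratic_model_coeffs(fx, grad, hess, x):
    """Return (C, A, b) so that the second-order model at x equals
    C + 0.5 * z^T A z - 2 * b^T z."""
    A = hess
    b = 0.5 * (hess @ x - grad)
    C = fx - grad @ x + 0.5 * x @ hess @ x
    return C, A, b

# Hypothetical smooth test function on R^3 with analytic derivatives.
def f(z):
    return np.exp(z[0]) + z[1] * z[2] ** 2

def grad_f(z):
    return np.array([np.exp(z[0]), z[2] ** 2, 2.0 * z[1] * z[2]])

def hess_f(z):
    return np.array([[np.exp(z[0]), 0.0, 0.0],
                     [0.0, 0.0, 2.0 * z[2]],
                     [0.0, 2.0 * z[2], 2.0 * z[1]]])

x = np.array([0.3, -0.5, 0.8])    # expansion point
z = np.array([0.35, -0.45, 0.7])  # nearby evaluation point

C, A, b = quadratic_model_coeffs(f(x), grad_f(x), hess_f(x), x)

# Taylor form and compact form of the same quadratic model.
taylor = f(x) + grad_f(x) @ (z - x) + 0.5 * (z - x) @ hess_f(x) @ (z - x)
compact = C + 0.5 * z @ A @ z - 2.0 * b @ z
```

The two values `taylor` and `compact` should agree up to rounding, and both should be close to `f(z)` as long as `z` stays near `x`.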

Dropping the constant $\textbf{C}$, we obtain a minimization problem of the form:

$ \text{minimize} \;\; \dfrac{1}{2} z^T \textbf{A} z - 2\textbf{b}^T z $ subject to $\|z\| = 1$

The paper "Minimizing a Quadratic over a Sphere" by William W. Hager describes an analytical solution to this problem.
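For intuition, here is a minimal sketch of how a sphere-constrained quadratic can be minimized through its Lagrange multiplier. It is not necessarily the method of the cited paper. It uses the standard optimality condition: for an objective $z^T M z - 2 b^T z$ with symmetric $M$, a global minimizer on $\|z\| = 1$ satisfies $(M - \mu I) z = b$ with $\mu \le \lambda_{\min}(M)$. Any constant factor on the quadratic term can be absorbed into $M$. The function name `minimize_on_unit_sphere` is hypothetical, and the sketch assumes the generic case where $b$ is not orthogonal to the eigenvector of the smallest eigenvalue.

```python
import numpy as np

def minimize_on_unit_sphere(M, b, tol=1e-13, max_iter=200):
    """Minimize z^T M z - 2 b^T z over ||z|| = 1 for a symmetric matrix M.

    Sketch for the generic case only: b must have a nonzero component
    along the eigenvector of the smallest eigenvalue of M.
    """
    lam, V = np.linalg.eigh(M)   # ascending eigenvalues, M = V diag(lam) V^T
    beta = V.T @ b               # coordinates of b in the eigenbasis
    if abs(beta[0]) < 1e-12:
        raise ValueError("hard case (b orthogonal to lowest eigenvector) not handled")

    def norm_sq(mu):
        # ||z(mu)||^2 for z(mu) = (M - mu I)^{-1} b, valid for mu < lam[0]
        return np.sum((beta / (lam - mu)) ** 2)

    # norm_sq is increasing on (-inf, lam[0]); it is <= 1 at lo and grows
    # without bound near lam[0], so bisection finds the unique root of norm_sq = 1.
    lo, hi = lam[0] - np.linalg.norm(b), lam[0]
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if norm_sq(mid) > 1.0:
            hi = mid
        else:
            lo = mid
        if hi - lo <= tol:
            break

    mu = 0.5 * (lo + hi)
    z = V @ (beta / (lam - mu))
    return z / np.linalg.norm(z), mu

# Small illustrative problem (values chosen arbitrarily).
M = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])
b = np.array([1.0, 0.5, -0.25])
z_star, mu_star = minimize_on_unit_sphere(M, b)
```

Bisection is used here only because it is easy to verify. A production solver would also need to treat the degenerate case that this sketch rejects.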

In [321]:
import numpy as np
import torch
from numpy.random import rand


def energy_count(points):
    """Coulomb-like energy: sum of 1 / ||p_i - p_j|| over all pairs i < j."""
    # Start from a one-element tensor so that the result has shape (1,).
    energy = torch.zeros(1)
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            energy = energy + 1 / torch.norm(points[i] - points[j])
    return energy


def init_points(kind, N):
    """Return N points on the unit sphere; 'polar' clusters them near the north pole."""
    np.random.seed(42)
    points = -1.0 + 2.0 * rand(N, 3)
    if kind == 'polar':
        for i in range(N):
            points[i][0] *= 0.1
            points[i][1] *= 0.1
            points[i][2] = points[i][0] * 0.1 + 1.0
    # Project every point onto the unit sphere.
    for i in range(N):
        points[i] = points[i] / np.linalg.norm(points[i])
    return points


N = 2

In [322]:
import torch 
import torch.autograd as autograd

In [323]:
points = init_points('polar', 2)

In [324]:
x = torch.Tensor(points)

In [325]:
x.size()[0]

2

In [326]:
x.requires_grad = True

In [327]:
y = energy_count(x)

In [328]:
y

tensor([6.0734], grad_fn=<AddBackward0>)

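In [304]:
# y is a non-leaf tensor that already requires grad (it was computed from x),
# so its requires_grad flag does not need to be set.
y.backward(retain_graph=True)  # retain the graph: y is differentiated again below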


In [329]:
g = autograd.grad(y, x, create_graph=True, retain_graph=True, allow_unused=True)

In [336]:
print(type(g[0]))

<class 'torch.Tensor'>


In [346]:
def eval_hessian(loss_grad, x):
    """Build the dense Hessian from first-order gradients computed with create_graph=True."""
    # Flatten the tuple of gradient tensors into a single vector.
    g_vector = torch.cat([g.contiguous().view(-1) for g in loss_grad])
    l = g_vector.size(0)
    hessian = torch.zeros(l, l)
    for idx in range(l):
        # Differentiating one gradient component gives one row of the Hessian.
        grad2 = autograd.grad(g_vector[idx], x, create_graph=True)
        hessian[idx] = torch.cat([g.contiguous().view(-1) for g in grad2])
    return hessian.detach().cpu().numpy()

In [347]:
h = eval_hessian(g,x)


In [348]:
h

array([[-174.51974  , -175.55014  ,    2.026874 ,  174.51974  ,
         175.55014  ,   -2.026874 ],
       [-175.55014  ,  398.46387  ,   -7.1871815,  175.55014  ,
        -398.46387  ,    7.1871815],
       [   2.0268738,   -7.187181 , -223.94408  ,   -2.0268738,
           7.187181 ,  223.94408  ],
       [ 174.51974  ,  175.55014  ,   -2.026874 , -174.51974  ,
        -175.55014  ,    2.026874 ],
       [ 175.55014  , -398.46387  ,    7.1871815, -175.55014  ,
         398.46387  ,   -7.1871815],
       [  -2.0268738,    7.187181 ,  223.94408  ,    2.0268738,
          -7.187181 , -223.94408  ]], dtype=float32)
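The matrix above has the block form $\begin{pmatrix} B & -B \\ -B & B \end{pmatrix}$ with $\operatorname{tr} B \approx 0$, which is what one expects for an energy that depends only on the difference of the two points. As a cross-check that does not rely on autograd, the Hessian of $\sum_{i<j} 1/\|p_i - p_j\|$ can also be written in closed form, since the Hessian of $1/r$ with respect to the difference vector $d$ is $(3 d d^T - r^2 I)/r^5$. The sketch below is an assumption-labelled illustration: the function names `coulomb_energy` and `coulomb_hessian` are hypothetical, and the sample points are not the ones used in the notebook.

```python
import numpy as np

def coulomb_energy(points):
    """Sum of 1 / ||p_i - p_j|| over all pairs i < j."""
    n = len(points)
    return sum(1.0 / np.linalg.norm(points[i] - points[j])
               for i in range(n) for j in range(i + 1, n))

def coulomb_hessian(points):
    """Closed-form Hessian of coulomb_energy w.r.t. the flattened coordinates."""
    n, d = points.shape
    H = np.zeros((n * d, n * d))
    for i in range(n):
        for j in range(i + 1, n):
            diff = points[i] - points[j]
            r = np.linalg.norm(diff)
            # Hessian of 1/r with respect to diff: (3 diff diff^T - r^2 I) / r^5
            B = (3.0 * np.outer(diff, diff) - r ** 2 * np.eye(d)) / r ** 5
            si = slice(i * d, (i + 1) * d)
            sj = slice(j * d, (j + 1) * d)
            # The pair term depends on p_i - p_j, hence the +B / -B pattern.
            H[si, si] += B
            H[sj, sj] += B
            H[si, sj] -= B
            H[sj, si] -= B
    return H

# Illustrative points (not the notebook's 'polar' initialization).
pts = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
H = coulomb_hessian(pts)
```

Passing the notebook's `points` array to `coulomb_hessian` should give a matrix comparable to `h`, up to float32 rounding in the autograd result.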

In [334]:
g[0].backward()

RuntimeError: grad can be implicitly created only for scalar outputs

In [99]:
z = x.grad
z.requires_grad = True

In [100]:
z.backward()

In [101]:
print(x.grad)

tensor([8.], requires_grad=True)


In [350]:
2*8.5/9.5

1.7894736842105263