
Question regarding Jacobian of inverse action #266

Closed · mnissov opened this issue Apr 25, 2023 · 4 comments

mnissov commented Apr 25, 2023

Maybe this is a little out of scope for this platform; if so, I understand.

The basic problem is understanding which of several derivations of the Jacobian of what amounts to the inverse action of SO(3) is correct.

Looking at the paper and cheatsheet one would conclude that

$$ J_{\mathcal{X}}^{\mathcal{X}^{-1} \cdot v} = (\mathcal{X}^{-1}\cdot v)^{\wedge} $$

for $\mathcal{X}\in SO(3)$ and $v\in \mathbb{R}^{3}$.
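Just to fix ideas, here is a minimal finite-difference check of this formula with manifpy (my own sketch; the step size and tolerance are arbitrary):

import numpy as np
from manifpy import SO3, SO3Tangent

def skew(x):
    return np.array([[0, -x[2], x[1]], [x[2], 0, -x[0]], [-x[1], x[0], 0]])

X = SO3.Random()
v = np.random.rand(3)

# claimed right Jacobian of the inverse action: (X^{-1} v)^
J = skew(X.inverse().act(v))

# finite-difference Jacobian, perturbing with right-plus
eps = 1e-6
J_num = np.zeros((3, 3))
for i in range(3):
    tau = np.zeros(3)
    tau[i] = eps
    J_num[:, i] = (X.rplus(SO3Tangent(tau)).inverse().act(v) - X.inverse().act(v)) / eps

print(np.allclose(J, J_num, atol=1e-5))  # expect True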

However, this is not always the result used or found by other sources with a similar equation. Looking at an alternative source, such as the GNSS/INS textbook by Paul Groves: in chapter 16 he discusses Doppler-aided INS systems, which inevitably have a similar inverse action to consider. In equation 16.69 he derives the Jacobian of the measurement function to be

$$ \frac{\partial C_{w}^{b} v^{w}}{\partial \delta \psi_{b}^{w}} = -C_{w}^{b} (v^{w})^{\wedge} $$

Note I've used his notation here and simplified the equation a bit. Here $C_{w}^{b}$ is the rotation from {w} to {b}, $v^{w}$ is the {w}-frame velocity, and $\delta \psi_{b}^{w}$ is the error in the orientation of {b} in {w}, since this is an error-state formulation. This is the inverse action because the rotation which directly corresponds to $\delta \psi_{b}^{w}$ should be $C_{b}^{w}$, and we're using its transpose here.
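For what it's worth, mapping his notation onto the one above (my reading, not notation from the book): with $\mathcal{X} = C_{b}^{w}$ and $v = v^{w}$, the measurement is

$$
h(\mathcal{X}) = \mathcal{X}^{-1}\cdot v = C_{w}^{b}\, v^{w}.
$$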

I also tried to quantify the difference between these two numerically, using a Python script to perturb a rotation and calculate the error as

$$
e = \left\lVert \underbrace{\left( \mathcal{X} \oplus \tau \right)^{-1}\cdot v}_{\text{true}} - \underbrace{\left( \mathcal{X}^{-1}\cdot v + J \tau \right)}_{\text{approximate}} \right\rVert_2
$$

where $\mathcal{X}\in SO(3)$ and $v, \tau \in \mathbb{R}^3$ are random and $J$ is each of the two aforementioned Jacobians. Note, I scale the perturbation with a factor $k\in [0, 1)$ to see the error of this first-order approximation grow.

What is strange is that the Lie-theory-derived Jacobian quite often performs much better, but sometimes does not, depending on the specific simulation. I don't quite understand this behavior.

Code for the Python analysis:

import numpy as np
import matplotlib.pyplot as plt
from manifpy import SO3, SO3Tangent

# pretty print
np.set_printoptions(precision=3, suppress=True)


def skew(x):
    return np.array([[0, -x[2], x[1]], [x[2], 0, -x[0]], [-x[1], x[0], 0]])


# jacobian following Groves 16.69
def jacobian1(R: SO3, v):
    # jacobian of inv action R^T * v
    return -R.rotation().transpose() @ skew(v)


# jacobian following Lie, derived by me
def jacobian2(R: SO3, v):
    # jacobian of inv action R^T * v
    return skew(R.rotation().transpose() @ v)


# jacobian following Lie, from manifpy
def jacobian3(R: SO3, v):
    # jacobian of inv action R^T * v
    J_Xinv_act_b__Xinv = np.zeros((3, 3))
    J_Xinv__X = np.zeros((3, 3))
    # use the function argument R, not the global X_b_w
    R.inverse(J_Xinv__X).act(v, J_Xinv_act_b__Xinv)
    # chain rule
    J_Xinv_act_b__X = J_Xinv_act_b__Xinv @ J_Xinv__X

    return J_Xinv_act_b__X


def generate_data(X0: SO3, v, N=100, perturb_scaling_max=1.0):
    J1 = jacobian1(X0, v)
    J2 = jacobian2(X0, v)
    J3 = jacobian3(X0, v)

    errors = []
    scalings = np.linspace(start=0, stop=perturb_scaling_max, num=N)
    perturbation = np.random.rand(3)
    perturbation = perturbation / np.linalg.norm(perturbation)
    for s in scalings:
        tau = s * perturbation
        X = X0.rplus(SO3Tangent(tau))

        v_true = X.inverse().act(v)
        v0 = X0.inverse().act(v)  # linearization point: the unperturbed value
        v_approx_1 = v0 + J1 @ tau
        v_approx_2 = v0 + J2 @ tau
        v_approx_3 = v0 + J3 @ tau

        error1 = np.linalg.norm(v_true - v_approx_1)
        error2 = np.linalg.norm(v_true - v_approx_2)
        error3 = np.linalg.norm(v_true - v_approx_3)

        errors.append([s, error1, error2, error3])

    return np.asarray(errors, dtype=np.float64)


if __name__ == "__main__":
    X_b_w = SO3.Random()
    v_w = np.random.rand(3)

    data = generate_data(X_b_w, v_w, N=1000, perturb_scaling_max=1e-1)

    fig, ax = plt.subplots()
    ax.plot(data[:, 0], data[:, 1], label="Groves Jacobian")
    ax.plot(data[:, 0], data[:, 2], label="Lie theory")
    # ax.plot(data[:, 0], data[:, 3], label="Manifpy")

    ax.set_xlabel("Perturbation 2-Norm")
    ax.set_ylabel("Error 2-Norm")
    ax.legend()

    plt.show()
joansola (Collaborator) commented Apr 25, 2023

Hello @mnissov Morten

Regarding our Jacobian: it is correct according to the definition of the right Jacobian. The proof goes through the chain rule and the formulas in the paper, and it is supported by extensive unit testing in manif (which tests ALL Jacobians for exactness using small-perturbation approximations similar to those you use above):

D(X.inv * v) / DX =
  = D(X.inv * v) / D(X.inv) * D(X.inv)/DX 
  = -X.tr * v_x * (-Ad_X) 
  = X.tr * v_x * X 
  = [X.tr * v]_x
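The last step uses Ad_X = X on SO(3), i.e. the identity X.tr * v_x * X = [X.tr * v]_x. A two-line numpy check of that identity (a sketch, with a skew helper like the one in your script):

import numpy as np
from manifpy import SO3

def skew(x):
    return np.array([[0, -x[2], x[1]], [x[2], 0, -x[0]], [-x[1], x[0], 0]])

R = SO3.Random().rotation()
v = np.random.rand(3)
print(np.allclose(R.T @ skew(v) @ R, skew(R.T @ v)))  # expect True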

This Jacobian needs to be interpreted as follows: when X is perturbed locally with tau, the action X.tr * v is also perturbed. The Jacobian is the limit of the quotient of the two perturbations as tau goes to zero. The important point here is that tau is defined in the tangent space local to X.

The second Jacobian that you present is probably different, although you do not provide details on how \psi participates in C. If you happen to use psi = log(C1 * C2.inv), then \psi is a vector tangent to SO(3) at the identity, and not local to C. In such a case, you get what we call the left Jacobian. I reckon this is the reason you observe this difference.

If this is the case, then both Jacobians are different.

Regarding your test, you should test the first one using right-plus and regular minus

  e = ((X (+) tau).inv * v) - (X.inv * v + J_1*tau)                (1)

and the second one using left-plus and regular minus

  e = ((tau (+) X).inv * v) - (X.inv * v + J_2*tau)                (2)

Since you are only evaluating with (1), you should find that our Jacobian performs well, and the other one does not. However, if you use random X, it may happen that on some occasions X is close to the identity, in which case both Jacobians are practically the same. Then it may happen that the second Jacobian performs better than the first, just by some random effect. The first Jacobian should, however, perform well in all cases under test (1).
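In code, test (1) could look like this (a sketch; the variable names are mine, and skew is the helper from your script). If the Jacobian is exact to first order, the error should shrink quadratically with the perturbation size:

import numpy as np
from manifpy import SO3, SO3Tangent

def skew(x):
    return np.array([[0, -x[2], x[1]], [x[2], 0, -x[0]], [-x[1], x[0], 0]])

X = SO3.Random()
v = np.random.rand(3)
v0 = X.inverse().act(v)
J1 = skew(v0)  # our right Jacobian

for s in [1e-1, 1e-2, 1e-3]:
    tau = s * np.ones(3) / np.sqrt(3)
    e = np.linalg.norm(X.rplus(SO3Tangent(tau)).inverse().act(v) - (v0 + J1 @ tau))
    print(s, e)  # e should scale like s**2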

Does this make sense?

mnissov (Author) commented Apr 25, 2023

Regarding your test, you should test the first one using right-plus and regular minus
and the second one using left-plus and regular minus

I realize now I wasn't consistent between text and code, in that I introduced the Lie-theory-derived Jacobian first but assigned it to the function jacobian2. Just to be sure: when you mention "first one" and "second one" here, you're referring to

  • "first one": the Jacobian derived via Lie theory, i.e. the first equation in the text above: $(\mathcal{X}^{-1}\cdot v)^{\wedge}$
  • "second one": the Jacobian inspired by the Groves textbook, i.e. the second equation in the text above: $-\mathcal{X}^{-1}\cdot v^{\wedge}$

The second Jacobian that you present is probably different, although you do not provide details on how \psi participates in C.

I went back to the book to find this and I think you may be right. Groves defines the attitude error as

$$
\begin{aligned}
\delta C_\beta^\alpha &= \hat{C}_\beta^\alpha C_\alpha^\beta \\
&= I_3 + \left[ \delta \psi_{\alpha\beta}^{\alpha}\times \right]
\end{aligned}
$$

for the error $\delta C_{\beta}^{\alpha}$, estimate $\hat{C}_{\beta}^{\alpha}$, and true value $C_{\beta}^{\alpha}$. Rearranging this for the perturbed estimate results in something more familiar:

$$
\hat{C}_{\beta}^{\alpha} = \left(I_3 + \left[ \delta\psi_{\alpha\beta}^{\alpha}\times \right]\right) C_{\beta}^{\alpha}
$$

so you're right, this corresponds to a global perturbation rather than a local one, I suppose.
The Jacobian should then be

$$
\begin{aligned}
J &= \lim_{\tau\rightarrow 0} \frac{(\tau\oplus R)^{-1} \cdot v - R^{-1}\cdot v}{\tau} \\
&= \lim_{\tau\rightarrow 0} \frac{\left(R^\top e^{-\tau^\wedge}\right) v - R^{\top} v}{\tau} \\
&= \lim_{\tau\rightarrow 0} \frac{R^\top \left( I_3 - \tau^\wedge \right) v - R^{\top} v}{\tau} \\
&= \lim_{\tau\rightarrow 0} \frac{R^\top v^\wedge \tau}{\tau} \qquad (\text{using } \tau^\wedge v = -v^\wedge \tau) \\
&= R^\top v^\wedge
\end{aligned}
$$
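A quick numeric check of this agrees (my sketch, assuming manifpy's lplus implements the left-plus $\tau \oplus \mathcal{X} = \mathrm{Exp}(\tau)\,\mathcal{X}$; skew as in my script above):

import numpy as np
from manifpy import SO3, SO3Tangent

def skew(x):
    return np.array([[0, -x[2], x[1]], [x[2], 0, -x[0]], [-x[1], x[0], 0]])

X = SO3.Random()
v = np.random.rand(3)
J_left = X.rotation().T @ skew(v)  # R^T v^, note: no minus sign

tau = 1e-3 * np.random.rand(3)
v_true = X.lplus(SO3Tangent(tau)).inverse().act(v)
v_approx = X.inverse().act(v) + J_left @ tau
print(np.linalg.norm(v_true - v_approx))  # should be second order in tau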

In hindsight, I think I made a typo in transcribing the Jacobian: his error function was written $e = \text{meas} - h(x)$, so I think that's where the minus sign comes from. This would be convenient, because then the plots make a lot of sense.

Note I tweaked the plot a bit to run N simulations of length L each. Otherwise it is the same:

[Figure 1: error 2-norm vs. perturbation 2-norm for the Groves and Lie-theory Jacobians, over N simulations]

joansola (Collaborator) commented Apr 25, 2023

I realize now I wasn't consistent between text and code, in that I introduced the Lie-theory-derived Jacobian first but assigned it to the function jacobian2. Just to be sure: when you mention "first one" and "second one" here, you're referring to

* "first one": the Jacobian derived via Lie theory, i.e. the first equation in the text above: $(\mathcal{X}^{-1}\cdot v)^{\wedge}$

* "second one": the Jacobian inspired by the Groves textbook, i.e. the second equation in the text above: $-\mathcal{X}^{-1}\cdot v^{\wedge}$

Correct, first one is Lie, second one is Groves


So manif uses right-Jacobians, therefore local perturbations, and Groves uses left-Jacobians, therefore global perturbations.
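In formulas (my summary of the two conventions):

$$
J_{\text{right}} = \left(\mathcal{X}^{-1} v\right)^{\wedge} \quad \text{for } \mathcal{X}\oplus\tau = \mathcal{X}\,\mathrm{Exp}(\tau), \qquad
J_{\text{left}} = \mathcal{X}^{-1} v^{\wedge} \quad \text{for } \tau\oplus\mathcal{X} = \mathrm{Exp}(\tau)\,\mathcal{X}.
$$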

It seems then it all fits perfectly!

mnissov closed this as completed Apr 26, 2023

mnissov (Author) commented Apr 26, 2023

yes! thanks so much for the help
