## Benchmarking ways to compute the angular acceleration from the inertia tensor and the torque

In [2]:
import torch

In [26]:
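a = torch.abs(torch.randn((128,3,3))) # batch of 128 random 3x3 matrices with non-negative entries
c = torch.bmm(a, a.transpose(1,2)) # c = a @ a^T: symmetric matrices standing in for a batch of inertia tensors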

In [15]:
t = torch.randn((128,3)) # batch of 128 random torque vectors
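
The setup above builds `c` as `a @ a^T`, which is symmetric and positive semi-definite by construction, the same structure an inertia tensor has. A minimal sketch for checking this on a freshly generated batch follows. The seed, the eigenvalue-based condition estimate, and the variable names `is_symmetric`, `eigvals` and `cond` are assumptions added for illustration, not part of the original notebook.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed so the check is repeatable

a = torch.abs(torch.randn(128, 3, 3))
c = torch.bmm(a, a.transpose(1, 2))  # same construction as in the notebook

# c should equal its own transpose in every batch entry
is_symmetric = torch.allclose(c, c.transpose(1, 2))

# eigvalsh returns the eigenvalues of symmetric matrices in ascending order
eigvals = torch.linalg.eigvalsh(c)  # shape (128, 3)

# rough condition number per matrix; the clamp guards against tiny or
# slightly negative smallest eigenvalues caused by float32 round-off
cond = eigvals[:, -1] / eigvals[:, 0].clamp_min(1e-12)

print("symmetric:", is_symmetric)
print("smallest eigenvalue in batch:", eigvals.min().item())
print("largest condition estimate:", cond.max().item())
```

A large condition estimate would suggest that the inverse-based and solve-based results may differ noticeably in float32, which is worth keeping in mind when comparing the methods below.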

## Baseline: plain BMM without an inverse or solve

In [31]:
%%timeit -o
torch.bmm(c, t.unsqueeze(-1)).squeeze(-1) # batched matrix-vector product only, no inversion

3.85 µs ± 13.8 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
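
The `%%timeit` magic only works inside IPython or Jupyter. As a rough sketch of how the same baseline could be timed from a plain script, `torch.utils.benchmark.Timer` can be used instead. The helper name `bmm_baseline`, the seed and the loop count of 1000 are assumptions chosen for illustration, and the numbers it prints are not expected to match the notebook's exactly.

```python
import torch
from torch.utils import benchmark

torch.manual_seed(0)  # assumption: fixed seed for repeatable inputs

a = torch.abs(torch.randn(128, 3, 3))
c = torch.bmm(a, a.transpose(1, 2))
t = torch.randn(128, 3)


def bmm_baseline(c, t):
    """Batched matrix-vector product, the same expression as the timed cell."""
    return torch.bmm(c, t.unsqueeze(-1)).squeeze(-1)


timer = benchmark.Timer(
    stmt="bmm_baseline(c, t)",
    globals={"bmm_baseline": bmm_baseline, "c": c, "t": t},
)

measurement = timer.timeit(1000)  # run the statement 1000 times
print(f"{measurement.mean * 1e6:.2f} µs per call")
```

The same `Timer` can be pointed at any of the other variants by swapping the statement, which keeps the measurement procedure identical across methods.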

## 1. Simple inverse and BMM

`linalg.inv`

In [17]:
%%timeit -o
I_inv = torch.linalg.inv(c) # inverse of a batch of 3x3 matrices
omega_d = torch.bmm(I_inv, t.unsqueeze(-1)).squeeze(-1) # batched matrix-vector product

43.3 µs ± 3.26 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

`linalg.inv_ex`

In [18]:
%%timeit -o
I_inv = torch.linalg.inv_ex(c)[0] # inverse of a batch of 3x3 matrices
omega_d = torch.bmm(I_inv, t.unsqueeze(-1)).squeeze(-1) # batched matrix-vector product

38.8 µs ± 4.42 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
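
The cell above keeps only element `[0]` of what `linalg.inv_ex` returns. The second element, `info`, reports per-matrix whether the inversion succeeded, and by default `inv_ex` does not raise on failure. The sketch below shows how that flag could be inspected. The small batch with one deliberately singular matrix and the names `mats` and `bad` are made up for illustration.

```python
import torch

# Hypothetical batch: four identity matrices, one overwritten with zeros
mats = torch.eye(3).repeat(4, 1, 1)
mats[2] = 0.0  # singular on purpose

# check_errors defaults to False, so no exception is raised here
inverse, info = torch.linalg.inv_ex(mats)

# info is 0 where the inversion succeeded and nonzero where it failed
bad = info != 0
print("failed entries:", bad.nonzero().flatten().tolist())
```

Since `info` has to be checked by hand, discarding it as in the benchmark is only safe when the matrices are known to be invertible. The two timings above also overlap within their reported spread, so any speed difference between `inv` and `inv_ex` should be read with caution.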

## 2. Solve the linear system

`linalg.solve`

In [19]:
%%timeit -o
omega_d = torch.linalg.solve(c, t) # batched linear solve

35.6 µs ± 3.45 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

`linalg.solve_ex`

In [20]:
%%timeit -o
omega_d = torch.linalg.solve_ex(c, t)[0] # batched linear solve

35.2 µs ± 5.13 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
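
The four variants are timed above but never compared for agreement. A minimal sketch of such a check follows. It deviates from the notebook in two ways, both assumptions made to keep the comparison well conditioned: it uses `float64` instead of the default `float32`, and it adds the identity to `a @ a^T`. The seed and the names `results`, `max_diff` and `residual` are likewise illustrative.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed for repeatable inputs
dtype = torch.float64  # assumption: the notebook uses the float32 default

a = torch.abs(torch.randn(128, 3, 3, dtype=dtype))
# assumption: the added identity keeps every matrix safely invertible
c = torch.bmm(a, a.transpose(1, 2)) + torch.eye(3, dtype=dtype)
t = torch.randn(128, 3, dtype=dtype)

# The four expressions benchmarked in the notebook
results = {
    "inv": torch.bmm(torch.linalg.inv(c), t.unsqueeze(-1)).squeeze(-1),
    "inv_ex": torch.bmm(torch.linalg.inv_ex(c)[0], t.unsqueeze(-1)).squeeze(-1),
    "solve": torch.linalg.solve(c, t),
    "solve_ex": torch.linalg.solve_ex(c, t)[0],
}

# Largest elementwise deviation of each variant from the plain solve
reference = results["solve"]
max_diff = {name: (x - reference).abs().max().item() for name, x in results.items()}

# Residual of the linear system c @ omega_d = t for the reference solution
residual = (torch.bmm(c, reference.unsqueeze(-1)).squeeze(-1) - t).abs().max().item()

for name, diff in max_diff.items():
    print(f"{name:9s} max |diff| vs solve: {diff:.2e}")
print(f"max residual: {residual:.2e}")
```

With the notebook's original `float32` inputs and no added identity, the deviations may be larger for poorly conditioned matrices, so the tolerance used to judge agreement would need to be loosened accordingly.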