## Quaternion PyTorch - Basic mechanisms

In [1]:
import torch
from htorch import quaternion

### 1 - Quaternion tensors

A quaternion number is represented by:

$$
x = a + bi + cj + dk
$$

where $a$, $b$, $c$, and $d$ are real values, and $i$, $j$, $k$ are the imaginary units, satisfying $i^2 = j^2 = k^2 = ijk = -1$. A `QuaternionTensor` extends the standard PyTorch `torch.Tensor` to handle quaternion values; the real and imaginary components are specified at initialization:

In [2]:
# Simple scalar quaternion
x = quaternion.QuaternionTensor([0.0, 0.3, 0.4, 0.5])
print(x)

real part: tensor([0.])
imaginary part (i): tensor([0.3000])
imaginary part (j): tensor([0.4000])
imaginary part (k): tensor([0.5000])


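More generally, a `QuaternionTensor` is initialized from a real-valued tensor in which one axis of size 4 holds the four components. In the examples below this is the second axis (dimension 1): a real tensor of shape $(2, 4)$ gives a mini-batch of two scalar quaternions, and one of shape $(4, 4, 2)$ gives four vectors of two quaternions each: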

In [3]:
# Mini-batch of two scalar quaternions
x = quaternion.QuaternionTensor(torch.rand(2, 4))
print(x)

real part: tensor([[0.0665],
        [0.0963]])
imaginary part (i): tensor([[0.6331],
        [0.3718]])
imaginary part (j): tensor([[0.5579],
        [0.7480]])
imaginary part (k): tensor([[0.7048],
        [0.2228]])
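
The component layout can be illustrated without htorch. The sketch below assumes, based only on the printed output above, that the four components lie along the second axis of the initializing data. `split_components` is a hypothetical helper written for this illustration, not part of the library.

```python
# Hypothetical helper (not an htorch function): split a (batch, 4) nested list
# of real values into four (batch, 1) component lists.
def split_components(rows):
    for row in rows:
        if len(row) != 4:
            raise ValueError("each row must hold exactly 4 real values")
    # Column k of the input becomes component k; a trailing axis of size 1 is kept.
    return tuple([[row[k]] for row in rows] for k in range(4))


# Values copied from the printed output above.
data = [
    [0.0665, 0.6331, 0.5579, 0.7048],
    [0.0963, 0.3718, 0.7480, 0.2228],
]
real, imag_i, imag_j, imag_k = split_components(data)
print("real part:", real)
print("imaginary part (k):", imag_k)
```

Each returned component has one entry per quaternion in the batch, which mirrors the `(2, 1)` shape of the printed parts.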


In [4]:
# A mini-batch of 4 vectors, each composed of 2 quaternions
y = quaternion.QuaternionTensor(torch.rand(4, 4, 2))
print(y)

real part: tensor([[[0.8456, 0.8158]],

        [[0.6617, 0.2509]],

        [[0.6872, 0.8456]],

        [[0.0933, 0.2293]]])
imaginary part (i): tensor([[[0.4966, 0.3823]],

        [[0.1375, 0.0292]],

        [[0.3926, 0.2288]],

        [[0.7794, 0.6444]]])
imaginary part (j): tensor([[[0.8498, 0.5153]],

        [[0.3284, 0.5204]],

        [[0.8245, 0.3627]],

        [[0.0672, 0.8573]]])
imaginary part (k): tensor([[[0.0649, 0.5840]],

        [[0.9681, 0.8281]],

        [[0.1878, 0.2021]],

        [[0.8931, 0.0886]]])


All standard quaternion operations can be applied to the tensor (see the `QuaternionTensor` class for the full list):

In [5]:
# Conjugation
print(x.conj())

real part: tensor([[0.0665],
        [0.0963]])
imaginary part (i): tensor([[-0.6331],
        [-0.3718]])
imaginary part (j): tensor([[-0.5579],
        [-0.7480]])
imaginary part (k): tensor([[-0.7048],
        [-0.2228]])


In [6]:
# Element-wise norm
print(x.norm())

tensor([[1.1015],
        [0.8699]])


In [7]:
# Element-wise angle
print(x.theta())

tensor([[1.5104],
        [1.4598]])
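
The three operations above can be reproduced by hand for a single quaternion. This is a minimal pure-Python sketch, not the htorch implementation. In particular, the angle formula $\arccos(a / \lVert q \rVert)$ is an assumption inferred from the printed values, not taken from the library source.

```python
import math


def conj(q):
    # Conjugation flips the sign of the three imaginary components.
    a, b, c, d = q
    return (a, -b, -c, -d)


def norm(q):
    # Euclidean norm over the four real components.
    return math.sqrt(sum(v * v for v in q))


def theta(q):
    # Assumption: the angle is arccos(a / |q|), inferred from the printed output.
    return math.acos(q[0] / norm(q))


# First quaternion of the mini-batch printed above (rounded to 4 decimals).
q = (0.0665, 0.6331, 0.5579, 0.7048)
print(conj(q))
print(round(norm(q), 3))
print(round(theta(q), 3))
```

With these rounded inputs, the norm and angle agree with the printed `1.1015` and `1.5104` to about three decimal places.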


In [8]:
# Quaternion multiplication (Hamilton product)
print(x * x)

real part: tensor([[-1.2044],
        [-0.7381]])
imaginary part (i): tensor([[0.0842],
        [0.0716]])
imaginary part (j): tensor([[0.0742],
        [0.1441]])
imaginary part (k): tensor([[0.0938],
        [0.0429]])
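
For reference, the Hamilton product of two quaternions expands as follows. The function below is a plain-Python sketch of the textbook formula for a single pair of quaternions. The name `hamilton` is chosen for this illustration and is not an htorch function.

```python
def hamilton(p, q):
    # Hamilton product p * q, each quaternion given as (a, b, c, d).
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
print(hamilton(i, j))  # i * j = k
print(hamilton(j, i))  # j * i = -k: the product is not commutative

# Squaring the first quaternion of the mini-batch printed above.
x0 = (0.0665, 0.6331, 0.5579, 0.7048)
print(hamilton(x0, x0))
```

Squaring `x0` gives a real part close to the printed `-1.2044`, and each imaginary component equals $2a$ times the corresponding component of `x0`.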


In [9]:
# Quaternion matrix multiplication
print(x.t() @ x)

tensor([[1.2132, 0.8162],
        [0.8162, 0.7567]])


Importantly, quaternion tensors and real-valued tensors are interoperable: real-valued tensors are cast to quaternion tensors with zero imaginary parts.

In [10]:
# Quaternion scalar multiplication
print(x * torch.rand(2))

real part: tensor([[0.0110],
        [0.0634]])
imaginary part (i): tensor([[0.1050],
        [0.2447]])
imaginary part (j): tensor([[0.0925],
        [0.4922]])
imaginary part (k): tensor([[0.1169],
        [0.1466]])


### 2 - Quaternion gradients

Gradients can be computed with the PyTorch autograd mechanisms:

In [11]:
x = quaternion.QuaternionTensor(torch.rand(2, 4))
x.requires_grad = True
y = x.norm().sum()
y.backward()

In [12]:
print(x.grad)

tensor([[0.3587, 0.7965, 0.2983, 0.3847],
        [0.1113, 0.5378, 0.5309, 0.6454]])
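
Each row of the printed gradient has unit Euclidean norm, which is consistent with the analytic gradient of the norm, $\partial \lVert q \rVert / \partial q = q / \lVert q \rVert$. The sketch below checks that formula with central finite differences in plain Python. It does not use autograd, and the helper names are chosen for this illustration.

```python
import math


def norm(q):
    return math.sqrt(sum(v * v for v in q))


def norm_grad(q):
    # Analytic gradient of |q| with respect to each real component: q / |q|.
    n = norm(q)
    return [v / n for v in q]


def numeric_grad(f, q, eps=1e-6):
    # Central finite differences, used only as a sanity check.
    grad = []
    for k in range(len(q)):
        hi = list(q)
        lo = list(q)
        hi[k] += eps
        lo[k] -= eps
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad


q = [0.2, 0.5, 0.1, 0.7]  # arbitrary example values
analytic = norm_grad(q)
numeric = numeric_grad(norm, q)
print(analytic)
print(numeric)
```

The two gradients should agree to within the finite-difference error, and the analytic gradient always has norm 1 for a nonzero quaternion.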


### 3 - Quaternion-valued layers

We also provide a number of quaternion-valued layers to implement quaternion neural networks:

In [13]:
from torch import nn
from htorch.layers import QLinear

In [14]:
# Simple model with two quaternion-valued dense layers and a split ReLU
# (ReLU applied to each component separately)
model = nn.Sequential(
    QLinear(4, 20, bias=True),
    nn.ReLU(),
    QLinear(20, 1)
)

In [15]:
x = quaternion.QuaternionTensor(torch.rand(2, 4, 4))
print(model(x))

real part: tensor([[-0.0082],
        [-0.0078]], grad_fn=<SliceBackward>)
imaginary part (i): tensor([[-0.2356],
        [-0.0615]], grad_fn=<SliceBackward>)
imaginary part (j): tensor([[0.0546],
        [0.2169]], grad_fn=<SliceBackward>)
imaginary part (k): tensor([[-0.1491],
        [-0.1471]], grad_fn=<SliceBackward>)
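
To give some intuition for what a quaternion-valued dense layer computes, the sketch below shows the standard identity that left-multiplying by a quaternion weight equals multiplying the four real components by a structured 4×4 real matrix, followed by a split ReLU. It illustrates the mathematics for one weight and one input only. It makes no claim about how `QLinear` is implemented internally, and all function names are hypothetical.

```python
def left_mult_matrix(w):
    # Real 4x4 matrix M such that M @ x equals the Hamilton product w * x.
    a, b, c, d = w
    return [
        [a, -b, -c, -d],
        [b, a, -d, c],
        [c, d, a, -b],
        [d, -c, b, a],
    ]


def matvec(m, v):
    # Plain matrix-vector product on nested lists.
    return [sum(m_rk * v_k for m_rk, v_k in zip(row, v)) for row in m]


def split_relu(q):
    # Split activation: ReLU applied to each component independently.
    return [max(0.0, v) for v in q]


w = [0.5, -1.0, 2.0, 0.25]  # one quaternion weight (arbitrary values)
x = [1.0, 2.0, -1.0, 0.5]   # one quaternion input (arbitrary values)
out = matvec(left_mult_matrix(w), x)
print(out)              # [4.375, 1.25, 2.5, -2.5]
print(split_relu(out))  # [4.375, 1.25, 2.5, 0.0]
```

The matrix has only four free parameters for sixteen entries, which is the weight sharing that distinguishes a quaternion layer from a real-valued layer of the same size.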


We also provide layers to easily integrate quaternion-valued and real-valued blocks:

In [16]:
from htorch.layers import QuaternionToReal
from torch.nn import Softmax

In [17]:
model = nn.Sequential(
    QLinear(4, 10),
    QuaternionToReal(10), # Take the absolute value of each output
    Softmax(dim=1)  # Normalize over the 10 outputs; an implicit dim is deprecated
)

In [18]:
model(x).shape

torch.Size([2, 10])
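
The quaternion-to-real step followed by a softmax can be sketched in plain Python. This assumes, following the code comment above, that `QuaternionToReal` maps each quaternion output to its absolute value (norm). The functions below are illustrative stand-ins, not the htorch layers.

```python
import math


def quaternion_to_real(qs):
    # Assumption: one real number per quaternion, namely its norm.
    return [math.sqrt(sum(v * v for v in q)) for q in qs]


def softmax(values):
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Three quaternion outputs for a single example (arbitrary values).
outputs = [(0.0, 3.0, 4.0, 0.0), (1.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0)]
norms = quaternion_to_real(outputs)
probs = softmax(norms)
print(norms)  # [5.0, 1.0, 0.0]
print(probs)
```

Because norms are non-negative, the resulting probabilities depend only on the magnitude of each quaternion output, not on its orientation.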