## Quaternion PyTorch - Basic mechanisms

In [1]:
import torch
from htorch import quaternion

### 1 - Quaternion tensors

A quaternion number is represented by:

$$
x = a + bi + cj + dk
$$

where $a$, $b$, $c$, and $d$ are real numbers, and $i$, $j$, $k$ are the imaginary units. A `QuaternionTensor` extends the standard PyTorch `Tensor` to handle quaternion values: the real and imaginary components are specified at initialization. You can pass the four components separately, or pass all values as a single real-valued tensor of shape (..., 4):

In [2]:
# Simple scalar quaternion
x = quaternion.QuaternionTensor([0.0, 0.3, 0.4, 0.5])
print(x)

real part: tensor([0.])
imaginary part (i): tensor([0.3000])
imaginary part (j): tensor([0.4000])
imaginary part (k): tensor([0.5000])


For a general quaternion tensor of shape $(a, b, \ldots)$, a `QuaternionTensor` is initialized from $(a, b, \ldots, 4)$ real values:

In [3]:
# Mini-batch of two scalar quaternions
x = quaternion.QuaternionTensor(torch.rand(2, 4))
print(x)

real part: tensor([[0.4353],
        [0.4580]])
imaginary part (i): tensor([[0.0437],
        [0.6015]])
imaginary part (j): tensor([[0.3951],
        [0.9785]])
imaginary part (k): tensor([[0.8840],
        [0.5028]])


In [4]:
# A mini-batch of 4 vectors, each composed of 2 quaternions
y = quaternion.QuaternionTensor(torch.rand(4, 4, 2))
print(y)

real part: tensor([[[0.9357, 0.0870]],

        [[0.1154, 0.5362]],

        [[0.2839, 0.2906]],

        [[0.9736, 0.0775]]])
imaginary part (i): tensor([[[0.1939, 0.3642]],

        [[0.7193, 0.5352]],

        [[0.7304, 0.7222]],

        [[0.3440, 0.8907]]])
imaginary part (j): tensor([[[0.7665, 0.8330]],

        [[0.2732, 0.4861]],

        [[0.6120, 0.2103]],

        [[0.5903, 0.4304]]])
imaginary part (k): tensor([[[0.5681, 0.0490]],

        [[0.9566, 0.9889]],

        [[0.9356, 0.5554]],

        [[0.9308, 0.9680]]])


All standard quaternion operations can be applied to the tensor (see the documentation of `QuaternionTensor` for a full list):

In [5]:
# Conjugation
print(x.conj())

real part: tensor([[0.4353],
        [0.4580]])
imaginary part (i): tensor([[-0.0437],
        [-0.6015]])
imaginary part (j): tensor([[-0.3951],
        [-0.9785]])
imaginary part (k): tensor([[-0.8840],
        [-0.5028]])


In [6]:
# Element-wise norm
print(x.norm())

tensor([[1.0625],
        [1.3349]])


In [7]:
# Element-wise angle
print(x.theta())

tensor([[1.1487],
        [1.2206]])
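
For readers who want to see what the conjugate, norm, and angle cells compute, here is a minimal sketch on plain 4-tuples using only the standard library. It does not use `htorch`; the helper names `conj`, `norm`, and `angle` are illustrative. The angle formula $\arccos(a / \lVert q \rVert)$ is an assumption inferred from the printed values, not taken from the library source.

```python
import math

def conj(q):
    """Negate the three imaginary components."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def norm(q):
    """Euclidean length of the four components."""
    return math.sqrt(sum(v * v for v in q))

def angle(q):
    """Assumed definition: angle between q and the real axis."""
    return math.acos(q[0] / norm(q))

# First quaternion of the mini-batch printed in the cells above.
q = (0.4353, 0.0437, 0.3951, 0.8840)
print(conj(q), norm(q), angle(q))
```

Up to rounding, the norm and angle should agree with the first row of the outputs printed above.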


In [8]:
# Quaternion multiplication (Hamilton product)
print(x * x)

real part: tensor([[-0.7500],
        [-1.3624]])
imaginary part (i): tensor([[0.0380],
        [0.5510]])
imaginary part (j): tensor([[0.3440],
        [0.8963]])
imaginary part (k): tensor([[0.7695],
        [0.4605]])
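
The Hamilton product follows from the rules $i^2 = j^2 = k^2 = ijk = -1$. As a rough illustration, the sketch below spells it out component by component on plain 4-tuples; the function name `hamilton` is illustrative and is not part of `htorch`.

```python
def hamilton(p, q):
    """Hamilton product p * q of two quaternions given as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k
    )

i = (0.0, 1.0, 0.0, 0.0)
j = (0.0, 0.0, 1.0, 0.0)
print(hamilton(i, j))  # i * j = k
print(hamilton(j, i))  # j * i = -k: the product is not commutative

# Squaring the first quaternion of the mini-batch used above.
q = (0.4353, 0.0437, 0.3951, 0.8840)
print(hamilton(q, q))
```

The squared quaternion should agree, up to rounding, with the first row of the `x * x` output above.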


In [9]:
# Quaternion matrix multiplication
print(x.t() @ x)

tensor([[1.1289, 1.0567],
        [1.0567, 1.7819]])


Importantly, quaternion tensors and real-valued tensors are interoperable (real-valued tensors are cast to quaternion tensors with zero imaginary parts):

In [10]:
# Quaternion scalar multiplication
print(x * torch.rand(2))

real part: tensor([[0.2659],
        [0.3755]])
imaginary part (i): tensor([[0.0267],
        [0.4932]])
imaginary part (j): tensor([[0.2414],
        [0.8024]])
imaginary part (k): tensor([[0.5401],
        [0.4123]])
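
To make the casting rule concrete, the sketch below embeds a real scalar $s$ as the quaternion $(s, 0, 0, 0)$ and checks that its Hamilton product with $q$ is the same as scaling every component of $q$ by $s$. This is standard-library Python on plain tuples, not the `htorch` implementation, and the value of `s` is a hypothetical choice.

```python
def hamilton(p, q):
    """Hamilton product p * q of two quaternions given as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

q = (0.4353, 0.0437, 0.3951, 0.8840)
s = 0.6108                       # hypothetical real scalar
embedded = (s, 0.0, 0.0, 0.0)    # real number cast to a quaternion

left = hamilton(embedded, q)     # s * q
right = hamilton(q, embedded)    # q * s
scaled = tuple(s * v for v in q) # component-wise scaling

print(left)
print(right)
print(scaled)
```

Real quaternions commute with every quaternion, which is why the left and right products coincide here.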


### 2 - Quaternion gradients

Gradients can be computed with the PyTorch autograd mechanisms:

In [11]:
x = quaternion.QuaternionTensor(torch.rand(2, 4))
x.requires_grad = True
y = x.norm().sum()
y.backward()

In [12]:
print(x.grad) # The gradient is also a (..., 4)-dimensional tensor

tensor([[0.0421, 0.6679, 0.5223, 0.5285],
        [0.2519, 0.5894, 0.7428, 0.1934]])
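
Each row of the printed gradient has unit length, which is consistent with the closed form $\partial \lVert q \rVert / \partial q = q / \lVert q \rVert$. The sketch below checks that formula against central finite differences in plain Python; it does not call autograd, and the helper names and the sample quaternion are illustrative.

```python
import math

def norm(q):
    """Euclidean length of the four components."""
    return math.sqrt(sum(v * v for v in q))

def analytic_grad(q):
    """Closed-form gradient of the norm: q / ||q||."""
    n = norm(q)
    return [v / n for v in q]

def numeric_grad(q, eps=1e-6):
    """Central finite differences, one component at a time."""
    grads = []
    for idx in range(4):
        hi = list(q)
        lo = list(q)
        hi[idx] += eps
        lo[idx] -= eps
        grads.append((norm(hi) - norm(lo)) / (2 * eps))
    return grads

q = (0.3, 0.4, 0.5, 0.6)  # arbitrary sample quaternion
print(analytic_grad(q))
print(numeric_grad(q))
```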


### 3 - Quaternion-valued layers

We also provide a number of quaternion-valued layers to implement quaternion neural networks:

In [13]:
from torch import nn
from htorch.layers import QLinear

In [14]:
# Simple model with two quaternion-valued dense layers, and a split ReLU (ReLU applied on each component separately)
model = nn.Sequential(
    QLinear(4, 20, bias=True),
    nn.ReLU(),
    QLinear(20, 1)
)

In [15]:
x = quaternion.QuaternionTensor(torch.rand(2, 4, 4))
print(model(x))

real part: tensor([[-0.1047],
        [-0.0693]], grad_fn=<SliceBackward>)
imaginary part (i): tensor([[0.3090],
        [0.1464]], grad_fn=<SliceBackward>)
imaginary part (j): tensor([[0.1724],
        [0.1450]], grad_fn=<SliceBackward>)
imaginary part (k): tensor([[0.0721],
        [0.2345]], grad_fn=<SliceBackward>)
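
As a rough mental model of the network above, the sketch below implements a tiny quaternion dense layer followed by a split ReLU on plain tuples. It assumes the layer computes $\sum_i w_i \otimes x_i + b$ with the weight on the left of the Hamilton product; the ordering `QLinear` actually uses is not stated here, and the names `qlinear` and `split_relu` are hypothetical.

```python
def hamilton(p, q):
    """Hamilton product p * q of two quaternions given as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def qadd(p, q):
    """Component-wise quaternion addition."""
    return tuple(u + v for u, v in zip(p, q))

def qlinear(weights, bias, inputs):
    """Assumed form of a quaternion dense layer: one output per weight row."""
    outputs = []
    for row, b in zip(weights, bias):
        acc = b
        for w, x in zip(row, inputs):
            acc = qadd(acc, hamilton(w, x))  # weight on the left (assumption)
        outputs.append(acc)
    return outputs

def split_relu(q):
    """ReLU applied to each of the four components separately."""
    return tuple(max(0.0, v) for v in q)

inputs = [(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0)]     # the quaternions 1 and i
weights = [[(0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.0, 1.0)]]  # one output unit: j and k
bias = [(-0.5, 0.0, 0.0, 0.0)]

hidden = qlinear(weights, bias, inputs)
activated = [split_relu(h) for h in hidden]
print(hidden)
print(activated)
```

Since $j \cdot 1 = j$ and $k \cdot i = j$, the pre-activation is $-0.5 + 2j$, and the split ReLU zeroes only the negative real component.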


We also provide layers to easily integrate quaternion-valued and real-valued blocks:

In [16]:
from htorch.layers import QuaternionToReal
from torch.nn import Softmax

In [17]:
model = nn.Sequential(
    QLinear(4, 10),
    QuaternionToReal(10), # Take the absolute value of each output
    Softmax(dim=1)  # Normalize over the 10 output features
)

In [18]:
model(x).shape


torch.Size([2, 10])
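
The quaternion-to-real step can be pictured as follows, assuming that "absolute value" means the quaternion norm: each quaternion output is reduced to one real number, and a softmax then turns the resulting vector into probabilities. This is a standard-library sketch with hypothetical helper names and sample values, not the `QuaternionToReal` implementation.

```python
import math

def quaternion_to_real(quaternions):
    """Map each quaternion (a, b, c, d) to its norm (assumed behaviour)."""
    return [math.sqrt(sum(v * v for v in q)) for q in quaternions]

def softmax(values):
    """Numerically stable softmax over a list of real numbers."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Three hypothetical quaternion outputs for a single sample.
outputs = [(0.1, 0.3, 0.2, 0.1), (-0.4, 0.0, 0.5, 0.2), (0.0, 0.0, 0.0, 1.0)]
real_values = quaternion_to_real(outputs)
probs = softmax(real_values)
print(real_values)
print(probs)
```

Because the norm is non-negative and discards the direction of the quaternion, two outputs with the same magnitude receive the same probability under this assumption.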

See [notebooks/training.ipynb] for a full training example.