## Quaternion PyTorch - Basic mechanisms

In [1]:
import torch
from htorch import quaternion

### 1 - Quaternion tensors

A quaternion number is represented by:

$$
x = a + bi + cj + dk
$$

where $a$, $b$, $c$, and $d$ are real values, and $i$, $j$, $k$ are the imaginary units. A `QuaternionTensor` extends the standard PyTorch `Tensor` to handle quaternion values: you specify the real and imaginary components at initialization, either by passing the 4 components separately or by passing all values as a single (..., 4)-dimensional tensor of real values:

In [2]:
# Simple scalar quaternion
x = quaternion.QuaternionTensor([0.0, 0.3, 0.4, 0.5])
print(x)

real part: tensor([0.])
imaginary part (i): tensor([0.3000])
imaginary part (j): tensor([0.4000])
imaginary part (k): tensor([0.5000])


To represent a quaternion tensor of shape $(n_1, n_2, \ldots)$, a `QuaternionTensor` is initialized from $(n_1, n_2, \ldots, 4)$ real values:

In [3]:
# Mini-batch of two scalar quaternions
x = quaternion.QuaternionTensor(torch.rand(2, 4))
print(x)

real part: tensor([[0.2617],
        [0.6703]])
imaginary part (i): tensor([[0.8477],
        [0.4650]])
imaginary part (j): tensor([[0.9490],
        [0.0939]])
imaginary part (k): tensor([[0.0979],
        [0.8587]])


In [4]:
# A mini-batch of 4 vectors, each composed of 2 quaternions
y = quaternion.QuaternionTensor(torch.rand(4, 4, 2))
print(y)

real part: tensor([[[0.8456, 0.8158]],

        [[0.6617, 0.2509]],

        [[0.6872, 0.8456]],

        [[0.0933, 0.2293]]])
imaginary part (i): tensor([[[0.4966, 0.3823]],

        [[0.1375, 0.0292]],

        [[0.3926, 0.2288]],

        [[0.7794, 0.6444]]])
imaginary part (j): tensor([[[0.8498, 0.5153]],

        [[0.3284, 0.5204]],

        [[0.8245, 0.3627]],

        [[0.0672, 0.8573]]])
imaginary part (k): tensor([[[0.0649, 0.5840]],

        [[0.9681, 0.8281]],

        [[0.1878, 0.2021]],

        [[0.8931, 0.0886]]])
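
The shapes printed above suggest that the four components occupy equal quarters of axis 1: a `(2, 4)` input yields components of shape `(2, 1)`, and a `(4, 4, 2)` input yields components of shape `(4, 1, 2)`. The sketch below reproduces that split with plain PyTorch. It is an illustration inferred from the printed output, not htorch's actual implementation, and `split_components` is a hypothetical helper name.

```python
import torch

def split_components(t: torch.Tensor):
    """Split a real tensor into four equal chunks along axis 1 (assumed layout)."""
    if t.shape[1] % 4 != 0:
        raise ValueError("axis 1 must be divisible by 4")
    return torch.chunk(t, 4, dim=1)

# Same input shapes as the two cells above
a, b, c, d = split_components(torch.rand(2, 4))
print(a.shape)   # torch.Size([2, 1])

a, b, c, d = split_components(torch.rand(4, 4, 2))
print(a.shape)   # torch.Size([4, 1, 2])
```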


All standard quaternion operations can be applied to the tensor (see the documentation of `QuaternionTensor` for a full list):

In [4]:
# Conjugation
print(x.conj())

real part: tensor([[0.2617],
        [0.6703]])
imaginary part (i): tensor([[-0.8477],
        [-0.4650]])
imaginary part (j): tensor([[-0.9490],
        [-0.0939]])
imaginary part (k): tensor([[-0.0979],
        [-0.8587]])


In [5]:
# Element-wise norm
print(x.norm())

tensor([[1.3028],
        [1.1881]])


In [6]:
# Element-wise angle
print(x.theta())

tensor([[1.3686],
        [0.9714]])
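
As a cross-check, the norm and angle printed above can be recomputed from the raw components with plain PyTorch. The sketch assumes the usual definitions $|x| = \sqrt{a^2 + b^2 + c^2 + d^2}$ and $\theta = \arccos(a / |x|)$; the printed values are consistent with these, but the formula used by `theta()` is not stated in this notebook. The component values are copied from the mini-batch printed earlier, so the results agree only to about three decimals.

```python
import torch

# Components [a, b, c, d] copied from the printed mini-batch (rounded to 4 decimals)
q = torch.tensor([[0.2617, 0.8477, 0.9490, 0.0979],
                  [0.6703, 0.4650, 0.0939, 0.8587]])

norm = q.pow(2).sum(dim=1, keepdim=True).sqrt()   # |x| for each quaternion
theta = torch.acos(q[:, :1] / norm)               # assumed definition: arccos(a / |x|)

print(norm)    # close to the output of x.norm() above
print(theta)   # close to the output of x.theta() above
```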


In [7]:
# Quaternion multiplication (Hamilton product)
print(x * x)

real part: tensor([[-1.5604],
        [-0.5130]])
imaginary part (i): tensor([[0.4436],
        [0.6233]])
imaginary part (j): tensor([[0.4966],
        [0.1258]])
imaginary part (k): tensor([[0.0512],
        [1.1510]])
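
For reference, the Hamilton product can be written out component by component. The sketch below is a plain-PyTorch illustration on tensors whose last axis holds `[a, b, c, d]`; `hamilton_product` is a hypothetical helper, not part of htorch. It also shows that the product is not commutative.

```python
import torch

def hamilton_product(p: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    """Hamilton product of quaternions stored as (..., 4) real tensors [a, b, c, d]."""
    a1, b1, c1, d1 = p.unbind(dim=-1)
    a2, b2, c2, d2 = q.unbind(dim=-1)
    return torch.stack([
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,   # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,   # i part
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,   # j part
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,   # k part
    ], dim=-1)

i = torch.tensor([0.0, 1.0, 0.0, 0.0])
j = torch.tensor([0.0, 0.0, 1.0, 0.0])
print(hamilton_product(i, j))   # equals k  = (0, 0, 0, 1)
print(hamilton_product(j, i))   # equals -k = (0, 0, 0, -1)
```

Applying `hamilton_product(q, q)` to the first quaternion of the mini-batch, `(0.2617, 0.8477, 0.9490, 0.0979)`, gives approximately `(-1.5604, 0.4436, 0.4966, 0.0512)`, in line with the output of `x * x` above.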


In [8]:
# Quaternion matrix multiplication
print(x.t() @ x)

tensor([[1.6973, 0.7427],
        [0.7427, 1.4115]])


Importantly, quaternion tensors and real-valued tensors are interoperable (real-valued tensors are cast to quaternion tensors with zero imaginary parts):

In [9]:
# Quaternion scalar multiplication
print(x * torch.rand(2))

real part: tensor([[0.0905],
        [0.0814]])
imaginary part (i): tensor([[0.2931],
        [0.0564]])
imaginary part (j): tensor([[0.3281],
        [0.0114]])
imaginary part (k): tensor([[0.0339],
        [0.1042]])


### 2 - Quaternion gradients

Gradients can be computed with the PyTorch autograd mechanisms:

In [10]:
x = quaternion.QuaternionTensor(torch.rand(2, 4))
x.requires_grad = True
y = x.norm().sum()
y.backward()

In [11]:
print(x.grad) # The gradient is also a (..., 4)-dimensional tensor

tensor([[4.4383e-01, 7.6543e-01, 3.8440e-01, 2.6336e-01],
        [3.1063e-01, 8.9373e-04, 1.7649e-01, 9.3400e-01]])
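
The gradient of the norm has a simple closed form: $\partial |x| / \partial x = x / |x|$, so each row of the gradient is a unit-length vector (the rows printed above do have norm 1). The sketch below checks this with autograd on a plain real tensor standing in for the four quaternion components; it does not use `QuaternionTensor`.

```python
import torch

torch.manual_seed(0)
q = torch.rand(2, 4, requires_grad=True)   # two quaternions as rows [a, b, c, d]
q.norm(dim=1).sum().backward()             # same loss as above: sum of the norms

expected = (q / q.norm(dim=1, keepdim=True)).detach()   # closed form: q / |q|
print(torch.allclose(q.grad, expected, atol=1e-6))      # True
print(q.grad.norm(dim=1))                               # each row has unit length
```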


### 3 - Quaternion-valued layers

We also provide a number of quaternion-valued layers to implement quaternion neural networks:

In [12]:
from torch import nn
from htorch.layers import QLinear

In [13]:
# Simple model with two quaternion-valued dense layers, and a split ReLU (ReLU applied on each component separately)
model = nn.Sequential(
    QLinear(4, 20, bias=True),
    nn.ReLU(),
    QLinear(20, 1)
)

In [14]:
x = quaternion.QuaternionTensor(torch.rand(2, 4, 4))
print(model(x))

real part: tensor([[0.1155],
        [0.1084]], grad_fn=<SliceBackward>)
imaginary part (i): tensor([[-0.0812],
        [-0.0506]], grad_fn=<SliceBackward>)
imaginary part (j): tensor([[-0.1114],
        [-0.4740]], grad_fn=<SliceBackward>)
imaginary part (k): tensor([[ 0.1293],
        [-0.0704]], grad_fn=<SliceBackward>)
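
A common way to implement a quaternion-valued linear map is to expand each quaternion weight into a 4x4 real matrix, so that the Hamilton product becomes an ordinary matrix-vector product. The sketch below shows this for a single weight and a single input, followed by a split ReLU. It is an assumption-laden illustration: `left_mult_matrix` is a hypothetical helper, and this notebook does not say whether `QLinear` builds its weights this way.

```python
import torch

def left_mult_matrix(w: torch.Tensor) -> torch.Tensor:
    """4x4 real matrix M such that M @ x equals the Hamilton product w * x."""
    r, i, j, k = w.tolist()
    return torch.tensor([[r, -i, -j, -k],
                         [i,  r, -k,  j],
                         [j,  k,  r, -i],
                         [k, -j,  i,  r]])

w = torch.tensor([0.5, -1.0, 2.0, 0.25])   # weight quaternion (r, i, j, k)
x = torch.tensor([1.0, 0.3, -0.4, 0.5])    # input quaternion (a, b, c, d)

y = left_mult_matrix(w) @ x   # Hamilton product w * x as a real matrix-vector product
y_act = torch.relu(y)         # split ReLU: each component is rectified independently
print(y)
print(y_act)
```

For a layer with several input and output quaternions, the same 4x4 pattern would be repeated block-wise over a larger real weight matrix.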


We also provide layers to easily integrate quaternion-valued and real-valued blocks:

In [15]:
from htorch.layers import QuaternionToReal
from torch.nn import Softmax

In [16]:
model = nn.Sequential(
    QLinear(4, 10),
    QuaternionToReal(10), # Take the absolute value of each output
    Softmax(dim=-1)  # Normalize over the 10 output features
)

In [17]:
model(x).shape

torch.Size([2, 10])
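
The quaternion-to-real step can be pictured as taking the norm of each output quaternion and then feeding the resulting real values to the softmax. The sketch below mimics this with plain PyTorch under an assumed layout of `(batch, 4 components, features)`; the internal layout used by `QuaternionToReal` is not described in this notebook, so treat the axis choice as hypothetical.

```python
import torch

torch.manual_seed(0)
# Assumed layout for illustration only: (batch, 4 components, 10 features)
components = torch.randn(2, 4, 10)

magnitudes = components.norm(dim=1)          # one non-negative real per quaternion -> (2, 10)
probs = torch.softmax(magnitudes, dim=-1)    # normalize over the 10 features

print(probs.shape)         # torch.Size([2, 10])
print(probs.sum(dim=-1))   # each row sums to 1
```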

See [notebooks/training.ipynb] for a complete training example.