## Quaternion PyTorch - Basic mechanisms

In [1]:
import torch
from htorch import quaternion

### 1 - Quaternion tensors

A quaternion number is represented by:

$$
x = a + bi + cj + dk
$$

where $a$, $b$, $c$, and $d$ are real values, and $i$, $j$, $k$ are the imaginary units. A `QuaternionTensor` extends the standard PyTorch `Tensor` to handle quaternion values; the real and imaginary components are specified at initialization:

In [3]:
# Simple scalar quaternion
x = quaternion.QuaternionTensor([0.0, 0.3, 0.4, 0.5])
print(x)

real part: tensor([0.])
imaginary part (i): tensor([0.3000])
imaginary part (j): tensor([0.4000])
imaginary part (k): tensor([0.5000])


More generally, a `QuaternionTensor` holding quaternions arranged in shape $(n_1, n_2, \ldots)$ is initialized from real values of shape $(n_1, n_2, \ldots, 4)$, where the last axis holds the four components of each quaternion:

In [4]:
# Mini-batch of two scalar quaternions
x = quaternion.QuaternionTensor(torch.rand(2, 4))
print(x)

real part: tensor([[0.4140],
        [0.2306]])
imaginary part (i): tensor([[0.8014],
        [0.8326]])
imaginary part (j): tensor([[0.7207],
        [0.5159]])
imaginary part (k): tensor([[0.8181],
        [0.8833]])
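
The layout convention above can be sketched without the library: the last axis of the real-valued input has size 4 and holds the real, $i$, $j$, and $k$ components of each quaternion. The helper `split_components` below is hypothetical (it is not part of `htorch`) and works on nested Python lists, using the values printed above.

```python
# Hypothetical helper (not part of htorch): split a batch whose last axis
# has size 4 into its real, i, j and k components.
def split_components(batch):
    """batch: list of rows, each row a list [a, b, c, d]."""
    for row in batch:
        if len(row) != 4:
            raise ValueError("last axis must have size 4")
    names = ("real", "i", "j", "k")
    # Column n of the batch holds component n of every quaternion.
    return {name: [[row[n]] for row in batch] for n, name in enumerate(names)}


# The two quaternions from the printed output above.
batch = [
    [0.4140, 0.8014, 0.7207, 0.8181],
    [0.2306, 0.8326, 0.5159, 0.8833],
]
parts = split_components(batch)
print(parts["real"])
print(parts["k"])
```

Each component keeps the batch dimension and has a trailing axis of size 1, which mirrors the shapes in the printed output.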


In [7]:
# A mini-batch of 4 vectors, each composed of 2 quaternions
y = quaternion.QuaternionTensor(torch.rand(4, 2, 4))
print(y)

real part: tensor([], size=(4, 0, 4))
imaginary part (i): tensor([], size=(4, 0, 4))
imaginary part (j): tensor([], size=(4, 0, 4))
imaginary part (k): tensor([[[0.3233, 0.7890, 0.0872, 0.8504],
         [0.2605, 0.0813, 0.6138, 0.0358]],

        [[0.7395, 0.1709, 0.2896, 0.4620],
         [0.7586, 0.8167, 0.8610, 0.8657]],

        [[0.7434, 0.3925, 0.4897, 0.5305],
         [0.6975, 0.6188, 0.6194, 0.6862]],

        [[0.0345, 0.3563, 0.0390, 0.0777],
         [0.8871, 0.2431, 0.0140, 0.0987]]])


Standard quaternion operations can be applied to the tensor (see the `QuaternionTensor` class for the full list):

In [8]:
# Conjugation
print(x.conj())

real part: tensor([[0.4140],
        [0.2306]])
imaginary part (i): tensor([[-0.8014],
        [-0.8326]])
imaginary part (j): tensor([[-0.7207],
        [-0.5159]])
imaginary part (k): tensor([[-0.8181],
        [-0.8833]])


In [9]:
# Element-wise norm
print(x.norm())

tensor([[1.4150],
        [1.3389]])
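
For reference, the norm of $x = a + bi + cj + dk$ is $\sqrt{a^2 + b^2 + c^2 + d^2}$. The short sketch below recomputes the two values printed above in plain Python; `quaternion_norm` is an illustrative helper, not a library function.

```python
import math


def quaternion_norm(q):
    """Euclidean norm of a quaternion given as (a, b, c, d)."""
    return math.sqrt(sum(component * component for component in q))


# The two quaternions from the mini-batch printed earlier.
batch = [
    (0.4140, 0.8014, 0.7207, 0.8181),
    (0.2306, 0.8326, 0.5159, 0.8833),
]
norms = [quaternion_norm(q) for q in batch]
print(norms)
```

The results agree with the printed output up to the four-decimal rounding of the inputs.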


In [10]:
# Element-wise angle
print(x.theta())

tensor([[1.2739],
        [1.3977]])
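
The printed angles are consistent with the polar-form definition $\theta = \arccos(a / \lVert x \rVert)$, where $a$ is the real part. That this is the exact formula behind `theta()` is an assumption inferred from the numbers above, not something confirmed from the library source. A minimal check in plain Python:

```python
import math


def quaternion_angle(q):
    """Angle of a quaternion (a, b, c, d), assuming theta = arccos(a / |q|)."""
    norm = math.sqrt(sum(component * component for component in q))
    if norm == 0.0:
        raise ValueError("the angle is undefined for the zero quaternion")
    return math.acos(q[0] / norm)


# The two quaternions from the mini-batch printed earlier.
batch = [
    (0.4140, 0.8014, 0.7207, 0.8181),
    (0.2306, 0.8326, 0.5159, 0.8833),
]
angles = [quaternion_angle(q) for q in batch]
print(angles)
```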


In [11]:
# Quaternion multiplication (Hamilton product)
print(x * x)

real part: tensor([[-1.6595],
        [-1.6864]])
imaginary part (i): tensor([[0.6635],
        [0.3840]])
imaginary part (j): tensor([[0.5966],
        [0.2379]])
imaginary part (k): tensor([[0.6773],
        [0.4073]])
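
The Hamilton product follows from the rules $i^2 = j^2 = k^2 = ijk = -1$. The sketch below writes it out component by component in plain Python and recomputes the square of the first quaternion in the batch; `hamilton_product` is an illustrative helper, not the library's implementation.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k
    )


# The product is not commutative: i * j = k, but j * i = -k.
i_unit, j_unit = (0, 1, 0, 0), (0, 0, 1, 0)
print(hamilton_product(i_unit, j_unit))  # (0, 0, 0, 1)
print(hamilton_product(j_unit, i_unit))  # (0, 0, 0, -1)

# Square of the first quaternion from the mini-batch printed earlier.
x0 = (0.4140, 0.8014, 0.7207, 0.8181)
squared = hamilton_product(x0, x0)
print(squared)
```

The squared value matches the first row of the printed result up to rounding of the inputs.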


In [12]:
# Quaternion matrix multiplication
print(x.t() @ x)

tensor([[2.0022, 1.8571],
        [1.8571, 1.7928]])


Importantly, quaternion tensors and real-valued tensors are interoperable (real-valued tensors are cast to quaternion tensors with zero imaginary parts):

In [13]:
# Quaternion scalar multiplication
print(x * torch.rand(2))

real part: tensor([[0.3731],
        [0.1474]])
imaginary part (i): tensor([[0.7223],
        [0.5322]])
imaginary part (j): tensor([[0.6496],
        [0.3298]])
imaginary part (k): tensor([[0.7374],
        [0.5646]])
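
To see why the cast gives the expected result, note that the Hamilton product with $r + 0i + 0j + 0k$ simply scales every component by $r$. The sketch below checks this in plain Python. The helpers and the value `r = 0.5` are illustrative assumptions; the random factors used in the cell above were not printed.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def real_to_quaternion(r):
    """Cast a real number to a quaternion with zero imaginary parts."""
    return (r, 0.0, 0.0, 0.0)


x0 = (0.4140, 0.8014, 0.7207, 0.8181)
r = 0.5  # arbitrary illustrative scale factor

product = hamilton_product(x0, real_to_quaternion(r))
scaled = tuple(r * component for component in x0)
print(product)
print(scaled)
```

Unlike the general Hamilton product, multiplication by a real quaternion commutes, so the order of the operands does not matter here.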


### 2 - Quaternion gradients

Gradients can be computed with the PyTorch autograd mechanisms:

In [14]:
x = quaternion.QuaternionTensor(torch.rand(2, 4))
x.requires_grad = True
y = x.norm().sum()
y.backward()

In [15]:
print(x.grad)

tensor([[0.2686, 0.4748, 0.7928, 0.2718],
        [0.7006, 0.6206, 0.3216, 0.1437]])
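
As a sanity check on the result: for $x \neq 0$, the gradient of $\lVert x \rVert$ with respect to $(a, b, c, d)$ is $x / \lVert x \rVert$, so each row of the gradient should have unit norm, and the printed rows are consistent with this. The sketch below verifies the formula with central finite differences in plain Python. It does not use autograd, and the sample point `q` is arbitrary.

```python
import math


def norm(q):
    return math.sqrt(sum(component * component for component in q))


def analytic_grad(q):
    """Gradient of the norm with respect to (a, b, c, d): q / |q|."""
    n = norm(q)
    return [component / n for component in q]


def numeric_grad(q, eps=1e-6):
    """Central finite-difference approximation of the same gradient."""
    grad = []
    for idx in range(len(q)):
        plus, minus = list(q), list(q)
        plus[idx] += eps
        minus[idx] -= eps
        grad.append((norm(plus) - norm(minus)) / (2 * eps))
    return grad


q = [0.3, 0.4, 0.5, 0.6]  # arbitrary illustrative values
print(analytic_grad(q))
print(numeric_grad(q))

# The rows printed by the notebook have (approximately) unit norm.
printed_rows = [
    [0.2686, 0.4748, 0.7928, 0.2718],
    [0.7006, 0.6206, 0.3216, 0.1437],
]
print([norm(row) for row in printed_rows])
```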


### 3 - Quaternion-valued layers

We also provide a number of quaternion-valued layers to implement quaternion neural networks:

In [17]:
from torch import nn
from htorch.layers import QLinear

In [18]:
# Simple model with two quaternion-valued dense layers and a split ReLU
# (ReLU applied to each component separately)
model = nn.Sequential(
    QLinear(4, 20),
    nn.ReLU(),
    QLinear(20, 1)
)

In [20]:
x = quaternion.QuaternionTensor(torch.rand(2, 16))
print(model(x))

real part: tensor([[-0.2704],
        [-0.2998]], grad_fn=<SliceBackward>)
imaginary part (i): tensor([[-0.0561],
        [-0.2713]], grad_fn=<SliceBackward>)
imaginary part (j): tensor([[0.2671],
        [0.1454]], grad_fn=<SliceBackward>)
imaginary part (k): tensor([[-0.2477],
        [-0.6685]], grad_fn=<SliceBackward>)


In [15]:
model(x).shape

torch.Size([2, 4])
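
The shapes follow from counting quaternions rather than reals: each sample holds 4 quaternions (16 real values), and the final layer emits 1 quaternion (4 real values). The plain-Python sketch below mimics the model for a single sample using one common formulation of a quaternion dense layer, in which each output is a sum of Hamilton products between weights and inputs. Whether `QLinear` uses exactly this convention (product order, bias, initialization) is an assumption, and all helper names are hypothetical.

```python
import random


def hamilton_product(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def qlinear(weights, inputs):
    """weights: n_out rows of n_in quaternions; inputs: n_in quaternions."""
    outputs = []
    for row in weights:
        acc = [0.0, 0.0, 0.0, 0.0]
        for w, x in zip(row, inputs):
            acc = [s + p for s, p in zip(acc, hamilton_product(w, x))]
        outputs.append(tuple(acc))
    return outputs


def split_relu(quaternions):
    """ReLU applied to each of the four components separately."""
    return [tuple(max(0.0, c) for c in q) for q in quaternions]


def random_weights(n_out, n_in, rng):
    return [[tuple(rng.uniform(-1, 1) for _ in range(4)) for _ in range(n_in)]
            for _ in range(n_out)]


rng = random.Random(0)
inputs = [tuple(rng.random() for _ in range(4)) for _ in range(4)]  # 16 reals
hidden = split_relu(qlinear(random_weights(20, 4, rng), inputs))
output = qlinear(random_weights(1, 20, rng), hidden)
print(len(hidden), len(output), len(output[0]))  # 20 1 4
```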

We also provide layers to easily integrate quaternion-valued and real-valued blocks:

In [22]:
from htorch.layers import QuaternionToReal
from torch.nn import Softmax

In [23]:
model = nn.Sequential(
    QLinear(4, 10),
    QuaternionToReal(10), # Take the absolute value of each output
    Softmax(dim=1)  # Normalize over the output features of each sample
)

In [24]:
model(x).shape

torch.Size([2, 10])
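
The conversion step can be sketched in plain Python for a single sample. Following the comment in the model definition, the sketch assumes that "absolute value" means the quaternion norm, so 10 output quaternions become 10 non-negative reals before the softmax. The helper names and sample values below are hypothetical.

```python
import math


def quaternion_to_real(quaternions):
    """Map each quaternion (a, b, c, d) to its norm (assumed behaviour)."""
    return [math.sqrt(sum(c * c for c in q)) for q in quaternions]


def softmax(values):
    """Numerically stable softmax over a list of reals."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Ten illustrative output quaternions for a single sample.
outputs = [(0.1 * n, 0.0, 0.2, -0.1) for n in range(10)]
magnitudes = quaternion_to_real(outputs)
probs = softmax(magnitudes)
print(len(probs), sum(probs))
```

Applied to a mini-batch of two samples, this gives the `[2, 10]` shape shown above.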