## Quaternion PyTorch - Basic mechanisms

In [1]:
import torch
import sys
sys.path.append("..")
from htorch import quaternion

### 1 - Quaternion tensors

A quaternion number is represented by:

$$
x = a + bi + cj + dk
$$

where $a$, $b$, $c$, and $d$ are real values, and $i$, $j$, $k$ are the imaginary units. A `QuaternionTensor` extends the standard PyTorch `Tensor` to handle quaternion values; the real and imaginary components are specified at initialization:

In [2]:
# Simple scalar quaternion
x = quaternion.QuaternionTensor([0.0, 0.3, 0.4, 0.5])
print(x)

real part: tensor([0.])
imaginary part (i): tensor([0.3000])
imaginary part (j): tensor([0.4000])
imaginary part (k): tensor([0.5000])


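For a general tensor of shape $(a, b, \ldots)$, a `QuaternionTensor` is initialized from four times as many real values, with the four components stacked along one axis. In the examples below, the printed shapes show that axis to be dimension 1: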

In [3]:
# Mini-batch of two scalar quaternions
x = quaternion.QuaternionTensor(torch.rand(2, 4))
print(x)

real part: tensor([[0.7301],
        [0.2135]])
imaginary part (i): tensor([[0.8435],
        [0.9440]])
imaginary part (j): tensor([[0.4001],
        [0.7803]])
imaginary part (k): tensor([[0.8135],
        [0.2562]])


In [4]:
# A mini-batch of 4 vectors, each composed of 2 quaternions
y = quaternion.QuaternionTensor(torch.rand(4, 4, 2))
print(y)

real part: tensor([[[0.3416, 0.3066]],

        [[0.8734, 0.1232]],

        [[0.1527, 0.8140]],

        [[0.9689, 0.4045]]])
imaginary part (i): tensor([[[0.6831, 0.0924]],

        [[0.3733, 0.6633]],

        [[0.6624, 0.7842]],

        [[0.1450, 0.8890]]])
imaginary part (j): tensor([[[0.4835, 0.0360]],

        [[0.6728, 0.7311]],

        [[0.5018, 0.8945]],

        [[0.9411, 0.6756]]])
imaginary part (k): tensor([[[0.0753, 0.6949]],

        [[0.9017, 0.0058]],

        [[0.0956, 0.3003]],

        [[0.0816, 0.1053]]])
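
The component shapes printed above are consistent with splitting the input tensor into four equal chunks along dimension 1. The sketch below reproduces those shapes in plain PyTorch; `split_components` is a hypothetical helper written for illustration, and the layout is inferred from the printed output rather than from the library internals.

```python
import torch

def split_components(t: torch.Tensor):
    # Hypothetical helper (not part of htorch): split a real tensor into
    # the four quaternion components (a, b, c, d) along dimension 1.
    return torch.chunk(t, 4, dim=1)

# Mini-batch of two scalar quaternions
a, b, c, d = split_components(torch.rand(2, 4))
shape_2d = tuple(a.shape)
print(shape_2d)

# Input of shape (4, 4, 2), as in the cell above
a, b, c, d = split_components(torch.rand(4, 4, 2))
shape_3d = tuple(a.shape)
print(shape_3d)
```

Each component keeps the original number of dimensions, with dimension 1 reduced by a factor of four.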


All standard quaternion operations can be applied to the tensor (see `QuaternionTensor` for a full list):

In [5]:
# Conjugation
print(x.conj())

real part: tensor([[0.7301],
        [0.2135]])
imaginary part (i): tensor([[-0.8435],
        [-0.9440]])
imaginary part (j): tensor([[-0.4001],
        [-0.7803]])
imaginary part (k): tensor([[-0.8135],
        [-0.2562]])


In [6]:
# Element-wise norm
print(x.norm())

tensor([[1.4375],
        [1.2693]])
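
As a cross-check, the conjugate and the norm can be computed by hand on a plain real tensor. This is a minimal sketch that assumes one quaternion per row, stored as `(a, b, c, d)`, and uses the values printed for `x` above (rounded to four decimals); it does not call the library.

```python
import torch

# Components (a, b, c, d) of the two quaternions printed above
q = torch.tensor([[0.7301, 0.8435, 0.4001, 0.8135],
                  [0.2135, 0.9440, 0.7803, 0.2562]])

# Conjugation keeps the real part and negates the three imaginary parts
sign = torch.tensor([1.0, -1.0, -1.0, -1.0])
q_conj = q * sign

# Norm: sqrt(a^2 + b^2 + c^2 + d^2), one value per quaternion
norm = q.pow(2).sum(dim=1, keepdim=True).sqrt()

print(q_conj)
print(norm)
```

Up to rounding of the inputs, the result agrees with the output of `x.conj()` and `x.norm()`.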


In [7]:
# Element-wise angle
print(x.theta())

tensor([[1.0381],
        [1.4018]])
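
The angles printed above are consistent with the usual polar-form definition $\theta = \arccos(a / \lVert x \rVert)$. The sketch below checks this on a plain real tensor holding the rounded components of `x`; that `theta()` uses exactly this formula is an inference from the output, not something stated by the library.

```python
import torch

# Components (a, b, c, d) of the two quaternions printed earlier
q = torch.tensor([[0.7301, 0.8435, 0.4001, 0.8135],
                  [0.2135, 0.9440, 0.7803, 0.2562]])

norm = q.pow(2).sum(dim=1, keepdim=True).sqrt()

# Angle between the quaternion and the real axis
theta = torch.acos(q[:, :1] / norm)
print(theta)
```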


In [8]:
# Quaternion multiplication (Hamilton product)
print(x * x)

real part: tensor([[-1.0005],
        [-1.5200]])
imaginary part (i): tensor([[1.2316],
        [0.4031]])
imaginary part (j): tensor([[0.5842],
        [0.3332]])
imaginary part (k): tensor([[1.1879],
        [0.1094]])
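
For reference, the Hamilton product can be written out component by component. The function below is a minimal sketch in plain PyTorch, assuming the four components are stored along the last dimension; `hamilton` is an illustrative name, not a library function.

```python
import torch

def hamilton(p: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    # Illustrative helper: Hamilton product of quaternions stored as
    # (a, b, c, d) along the last dimension.
    a1, b1, c1, d1 = p.unbind(dim=-1)
    a2, b2, c2, d2 = q.unbind(dim=-1)
    return torch.stack([
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k
    ], dim=-1)

# Rounded components of the two quaternions printed earlier
x_real = torch.tensor([[0.7301, 0.8435, 0.4001, 0.8135],
                       [0.2135, 0.9440, 0.7803, 0.2562]])
square = hamilton(x_real, x_real)
print(square)

# The product is not commutative: i * j = k, but j * i = -k
i = torch.tensor([0.0, 1.0, 0.0, 0.0])
j = torch.tensor([0.0, 0.0, 1.0, 0.0])
print(hamilton(i, j), hamilton(j, i))
```

Up to rounding of the inputs, `square` matches the four components printed for `x * x`.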


In [9]:
# Quaternion matrix multiplication
print(x.t() @ x)

tensor([[2.0664, 1.4728],
        [1.4728, 1.6112]])


Importantly, quaternion tensors and real-valued tensors are interoperable (real-valued tensors are cast to quaternion tensors with zero imaginary parts):

In [10]:
# Quaternion scalar multiplication
print(x * torch.rand(2))

real part: tensor([[0.5179],
        [0.2067]])
imaginary part (i): tensor([[0.5984],
        [0.9140]])
imaginary part (j): tensor([[0.2838],
        [0.7555]])
imaginary part (k): tensor([[0.5771],
        [0.2481]])


### 2 - Quaternion gradients

Gradients can be computed with the standard PyTorch autograd mechanism. The gradient is returned as a real-valued tensor with one entry per quaternion component:

In [11]:
x = quaternion.QuaternionTensor(torch.rand(2, 4))
x.requires_grad = True
y = x.norm().sum()
y.backward()

In [12]:
print(x.grad)

tensor([[0.7388, 0.4370, 0.1474, 0.4913],
        [0.6217, 0.5621, 0.5326, 0.1176]])
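
Each row of the gradient above has unit length, as expected: the gradient of the norm with respect to the four components is the quaternion divided by its norm. The sketch below verifies this with autograd on a plain real tensor, assuming one quaternion per row; it does not use the library.

```python
import torch

torch.manual_seed(0)
q = torch.rand(2, 4, requires_grad=True)

# Sum of the per-quaternion norms, as in the cell above
loss = q.pow(2).sum(dim=1).sqrt().sum()
loss.backward()

# Analytical gradient: each quaternion divided by its own norm
q_data = q.detach()
expected = q_data / q_data.pow(2).sum(dim=1, keepdim=True).sqrt()

matches = torch.allclose(q.grad, expected)
print(matches)
```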


### 3 - Quaternion-valued layers

We also provide a number of quaternion-valued layers to implement quaternion neural networks:

In [13]:
from torch import nn
from htorch.layers import QLinear

In [15]:
# Simple model with two quaternion-valued dense layers, and a split ReLU (ReLU applied on each component separately)
model = nn.Sequential(
    QLinear(4, 20, bias=True),
    nn.ReLU(),
    QLinear(20, 1)
)

torch.Size([4, 80])
torch.Size([20, 4])
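
A split ReLU treats the four components independently, so on a real tensor that stores the components side by side it reduces to an ordinary element-wise ReLU. A minimal sketch, assuming one quaternion per row stored as `(a, b, c, d)`:

```python
import torch
from torch import nn

# One quaternion per row: (a, b, c, d)
q = torch.tensor([[0.5, -0.3, 0.2, -0.1],
                  [-1.0, 0.4, -0.6, 0.8]])

# Apply ReLU to each component separately, then reassemble
a, b, c, d = q.unbind(dim=1)
split = torch.stack([torch.relu(a), torch.relu(b),
                     torch.relu(c), torch.relu(d)], dim=1)

# Same result as a plain element-wise ReLU on the stored values
same = torch.equal(split, nn.ReLU()(q))
print(split)
print(same)
```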


In [16]:
x = quaternion.QuaternionTensor(torch.rand(2, 4, 4))
print(model(x))

real part: tensor([[-0.0505],
        [-0.2567]], grad_fn=<SliceBackward>)
imaginary part (i): tensor([[-0.1491],
        [-0.1509]], grad_fn=<SliceBackward>)
imaginary part (j): tensor([[-0.1504],
        [-0.1336]], grad_fn=<SliceBackward>)
imaginary part (k): tensor([[-0.2740],
        [-0.4539]], grad_fn=<SliceBackward>)
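
To give an idea of what a quaternion-valued dense layer computes, the sketch below implements $y = W \otimes x$ with four real weight matrices, one per component, following the Hamilton product rules. It is an illustration under stated assumptions: `quaternion_linear` is a hypothetical function, and the operand order, bias handling, and memory layout used by `QLinear` may differ.

```python
import torch

def quaternion_linear(x, w):
    # Hypothetical sketch of a bias-free quaternion dense layer.
    # x: four tensors of shape (batch, n_in), the components of the input
    # w: four tensors of shape (n_in, n_out), the components of the weights
    xr, xi, xj, xk = x
    wr, wi, wj, wk = w
    yr = xr @ wr - xi @ wi - xj @ wj - xk @ wk
    yi = xi @ wr + xr @ wi + xk @ wj - xj @ wk
    yj = xj @ wr - xk @ wi + xr @ wj + xi @ wk
    yk = xk @ wr + xj @ wi - xi @ wj + xr @ wk
    return yr, yi, yj, yk

torch.manual_seed(0)
x = tuple(torch.rand(2, 4) for _ in range(4))   # batch of 2, 4 quaternion features
w = tuple(torch.rand(4, 20) for _ in range(4))  # 4 -> 20 quaternion features
y = quaternion_linear(x, w)
print([tuple(part.shape) for part in y])

# Sanity check with a single weight W = i and input x = j: i * j = k
one, zero = torch.ones(1, 1), torch.zeros(1, 1)
unit = quaternion_linear((zero, zero, one, zero), (zero, one, zero, zero))
print([part.item() for part in unit])
```

The four weight matrices are shared across the output components, so a layer with `n_in` inputs and `n_out` outputs has `4 * n_in * n_out` weights instead of the `16 * n_in * n_out` of a real-valued layer acting on the same number of values.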


We also provide layers to easily integrate quaternion-valued and real-valued blocks:

In [17]:
from htorch.layers import QuaternionToReal
from torch.nn import Softmax

In [18]:
model = nn.Sequential(
    QLinear(4, 10),
    QuaternionToReal(10), # Take the absolute value of each output
    Softmax(dim=1)
)

torch.Size([4, 40])


In [20]:
model(x).shape

torch.Size([2, 10])
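
The absolute value of a quaternion is its norm, so the conversion to real values followed by a softmax can be sketched in plain PyTorch as below. The component tensors are random stand-ins for the output of a quaternion layer, and whether `QuaternionToReal` computes exactly this modulus is an assumption based on the comment in the cell above.

```python
import torch

torch.manual_seed(0)

# Stand-in for the output of a quaternion layer: four component
# tensors, each of shape (batch=2, features=10)
a, b, c, d = torch.randn(4, 2, 10).unbind(0)

# Modulus of each quaternion output: one real value per feature
modulus = torch.sqrt(a**2 + b**2 + c**2 + d**2)

# Softmax over the feature dimension
probs = torch.softmax(modulus, dim=1)
print(probs.shape)
print(probs.sum(dim=1))
```

The result has one real value per quaternion output, and each row sums to one.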