## Quaternion PyTorch - Basic mechanisms

In [1]:
import torch
from htorch import quaternion


### 1 - Quaternion tensors

A quaternion number is represented by:

$$
x = a + bi + cj + dk
$$

where $a$, $b$, $c$, and $d$ are real values, and $i$, $j$, $k$ are the imaginary units. A `QuaternionTensor` extends the standard PyTorch `Tensor` to handle quaternion values; the real and imaginary components are specified at initialization.

The simplest way to initialize a `QuaternionTensor` is to pass all $(a, b, c, d)$ values as a single real-valued tensor of shape (..., 4):

In [2]:
# Simple scalar quaternion
x = quaternion.QuaternionTensor([0.0, 0.3, 0.4, 0.5])
print(x)

real part: tensor([0.])
imaginary part (i): tensor([0.3000])
imaginary part (j): tensor([0.4000])
imaginary part (k): tensor([0.5000])


More generally, a `QuaternionTensor` of shape $(n_1, n_2, \ldots)$ is initialized from a real-valued tensor of shape $(n_1, n_2, \ldots, 4)$:

In [10]:
# Mini-batch of two scalar quaternions
x = quaternion.QuaternionTensor(torch.rand(2, 4))
print(x)

real part: tensor([[0.6549],
        [0.4490]])
imaginary part (i): tensor([[0.3214],
        [0.6709]])
imaginary part (j): tensor([[0.5050],
        [0.0625]])
imaginary part (k): tensor([[0.4133],
        [0.0206]])
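
The printed output above groups the values by component rather than by quaternion. As a rough illustration of that layout only, the following plain-Python sketch splits rows of $(a, b, c, d)$ values into four columns. The helper `split_components` is hypothetical and is not part of `htorch`; the numbers are copied from the output above.

```python
def split_components(batch):
    """Split rows of (a, b, c, d) into four column lists, one per component.

    Hypothetical helper for illustration; it mimics the trailing axis of
    size 1 seen in the printed output, not the library's internals.
    """
    a, b, c, d = ([[row[k]] for row in batch] for k in range(4))
    return a, b, c, d

# The two quaternions printed above, one per row
batch = [
    [0.6549, 0.3214, 0.5050, 0.4133],
    [0.4490, 0.6709, 0.0625, 0.0206],
]

a, b, c, d = split_components(batch)
print("real part:", a)
print("imaginary part (i):", b)
```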


In [12]:
# A mini-batch of 4 vectors, each composed of 2 quaternions
# (the component axis comes last, following the (..., 4) convention above)
y = quaternion.QuaternionTensor(torch.rand(4, 2, 4))

All standard quaternion operations can be applied to the tensor (see the documentation of `QuaternionTensor` for the full list):

In [13]:
# Get the a/b/c/d components
print(x.a)

tensor([[0.6549],
        [0.4490]])


In [17]:
# Shape (always ends with a 4)
x.qshape

torch.Size([2, 4])

In [18]:
# Conjugation
print(x.conj())

real part: tensor([[0.6549],
        [0.4490]])
imaginary part (i): tensor([[-0.3214],
        [-0.6709]])
imaginary part (j): tensor([[-0.5050],
        [-0.0625]])
imaginary part (k): tensor([[-0.4133],
        [-0.0206]])


In [19]:
# Element-wise norm
print(x.norm())

tensor([[0.9788],
        [0.8100]])


In [20]:
# Element-wise angle
print(x.theta())

tensor([[0.8378],
        [0.9833]])
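
The conjugate, norm, and angle shown above can be checked by hand on a single quaternion. The sketch below uses plain Python and is independent of `htorch`. The angle formula $\theta = \arccos(a / |x|)$ is an assumption inferred from the printed values, not taken from the library documentation.

```python
import math

def conj(q):
    """Conjugate: keep the real part, negate the three imaginary parts."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def norm(q):
    """Euclidean norm over the four components."""
    return math.sqrt(sum(v * v for v in q))

def angle(q):
    # Assumption: theta = arccos(a / |q|), inferred from the outputs above
    return math.acos(q[0] / norm(q))

# First quaternion of x, copied from the printed output (rounded to 4 digits)
x0 = (0.6549, 0.3214, 0.5050, 0.4133)

print(conj(x0))
print(norm(x0))
print(angle(x0))
```

Because the inputs are rounded, the results agree with the printed 0.9788 and 0.8378 only to about three decimal places.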


In [21]:
# Quaternion multiplication (element-wise Hamilton product)
print(x * x)

real part: tensor([[-0.1003],
        [-0.2529]])
imaginary part (i): tensor([[0.4210],
        [0.6025]])
imaginary part (j): tensor([[0.6615],
        [0.0562]])
imaginary part (k): tensor([[0.5413],
        [0.0185]])
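
For reference, the Hamilton product follows from the rules $i^2 = j^2 = k^2 = ijk = -1$. The sketch below spells it out for a single pair of quaternions in plain Python. The function name `hamilton` is illustrative and says nothing about how `htorch` implements the operation.

```python
def hamilton(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k
    )

i = (0, 1, 0, 0)
j = (0, 0, 1, 0)

# The product is not commutative: i * j = k, but j * i = -k
print(hamilton(i, j))
print(hamilton(j, i))

# Squaring the first quaternion of x (values copied from the output above)
x0 = (0.6549, 0.3214, 0.5050, 0.4133)
print(hamilton(x0, x0))
```

The last line should land close to the first row of the printed `x * x`, up to the rounding of the inputs.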


In [22]:
# Quaternion matrix multiplication
print(x.t() @ x)

tensor([[0.9580, 0.5498],
        [0.5498, 0.6561]])


Importantly, quaternion tensors and real-valued tensors are interoperable (real-valued tensors are cast to quaternion tensors with zero imaginary parts):

In [30]:
# Quaternion scalar multiplication
print(x * torch.rand(2))

real part: tensor([[0.3419],
        [0.1778]])
imaginary part (i): tensor([[0.1678],
        [0.2658]])
imaginary part (j): tensor([[0.2637],
        [0.0248]])
imaginary part (k): tensor([[0.2158],
        [0.0082]])
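
If a real value $r$ is treated as the quaternion $r + 0i + 0j + 0k$, multiplying by it scales all four components by the same factor. The short check below, in plain Python, compares the first row of the output above with the first quaternion of `x` printed earlier. It is only a consistency check on the rounded, printed numbers.

```python
# First quaternion of x and the first row of the product, copied from the outputs
x0 = (0.6549, 0.3214, 0.5050, 0.4133)
out0 = (0.3419, 0.1678, 0.2637, 0.2158)

# If the real factor acts as r + 0i + 0j + 0k, every component is scaled by r,
# so the four component-wise ratios should coincide
ratios = [o / v for o, v in zip(out0, x0)]
spread = max(ratios) - min(ratios)

print(ratios)
print(spread)
```

The four ratios agree to within rounding, which is consistent with a single real factor of roughly 0.522 for that row.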


### 2 - Quaternion gradients

Gradients can be computed with the PyTorch autograd mechanisms:

In [31]:
x = quaternion.QuaternionTensor(torch.rand(2, 4))
x.requires_grad = True
y = x.norm().sum()
y.backward()

In [32]:
print(x.grad) # The gradient is also a (..., 4)-dimensional tensor

tensor([[0.6003, 0.2822, 0.6857, 0.2998],
        [0.2326, 0.5723, 0.5796, 0.5314]])
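
For this particular loss the gradient has a closed form: the derivative of $|x|$ with respect to $(a, b, c, d)$ is $x / |x|$, so each row of the gradient should be a unit vector. The sketch below verifies that formula against central finite differences in plain Python. It does not use autograd or `htorch`, and the helper names are illustrative.

```python
import math

def norm(q):
    """Euclidean norm over the four components."""
    return math.sqrt(sum(v * v for v in q))

def norm_grad(q):
    """Analytic gradient of |q| with respect to (a, b, c, d): q / |q|."""
    n = norm(q)
    return [v / n for v in q]

def numeric_grad(f, q, eps=1e-6):
    """Central finite differences, used only as an independent check."""
    grad = []
    for k in range(len(q)):
        hi = list(q)
        lo = list(q)
        hi[k] += eps
        lo[k] -= eps
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad

q = [0.3, -1.2, 0.5, 2.0]  # arbitrary test quaternion
print(norm_grad(q))
print(numeric_grad(norm, q))

# The first gradient row printed above should have (approximately) unit length
printed_row = [0.6003, 0.2822, 0.6857, 0.2998]
print(norm(printed_row))
```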


### 3 - Quaternion-valued layers

We also provide a number of quaternion-valued layers to implement quaternion neural networks:

In [33]:
from torch import nn
from htorch.layers import QLinear

In [34]:
# Simple model with two quaternion-valued dense layers, and a split ReLU (ReLU applied on each component separately)
model = nn.Sequential(
    QLinear(4, 20, bias=True),
    nn.ReLU(),
    QLinear(20, 1)
)

In [35]:
x = quaternion.QuaternionTensor(torch.rand(2, 4, 4))
print(model(x))

real part: tensor([[-0.4290],
        [-0.2291]], grad_fn=<SliceBackward>)
imaginary part (i): tensor([[-0.1713],
        [-0.2433]], grad_fn=<SliceBackward>)
imaginary part (j): tensor([[-0.2839],
        [-0.2001]], grad_fn=<SliceBackward>)
imaginary part (k): tensor([[-0.0769],
        [ 0.0332]], grad_fn=<SliceBackward>)
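
The split ReLU used in the model above treats the four components as independent real values. A minimal plain-Python sketch for a single quaternion, with `split_relu` as an illustrative name:

```python
def split_relu(q):
    """Apply ReLU to each of the four components separately."""
    return tuple(max(0.0, v) for v in q)

# Arbitrary example values with mixed signs
q = (-0.4290, 0.5, -0.1, 0.0332)
print(split_relu(q))
```

Note that the result depends on the sign of each component individually, not on the norm or angle of the quaternion.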


We also provide layers to easily integrate quaternion-valued and real-valued blocks:

In [36]:
from htorch.layers import QuaternionToReal
from torch.nn import Softmax

In [39]:
model = nn.Sequential(
    QLinear(4, 10),
    QuaternionToReal(10), # Take the absolute value of each output
    Softmax(1)
)

In [40]:
model(x).shape

torch.Size([2, 10])
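
To make the shape change concrete, the sketch below maps each quaternion output to one real number and then applies a softmax. It assumes that "absolute value" means the quaternion norm, which is an interpretation of the comment above and not a confirmed description of `QuaternionToReal`. The helpers are plain Python and hypothetical.

```python
import math

def quaternion_to_real(qs):
    """Map each quaternion (a, b, c, d) to its norm (assumed meaning of 'absolute value')."""
    return [math.sqrt(sum(v * v for v in q)) for q in qs]

def softmax(xs):
    """Numerically stable softmax over a list of real values."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Three quaternion outputs for one sample (arbitrary example values)
outputs = [(1.0, 0.0, 0.0, 0.0), (0.0, 3.0, 4.0, 0.0), (1.0, 1.0, 1.0, 1.0)]

moduli = quaternion_to_real(outputs)  # one real value per quaternion
probs = softmax(moduli)
print(moduli)
print(probs)
```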

See [notebooks/training.ipynb] for a complete training example.