# Simple Tests

A few simple checks verify the behavior and the performance of the `EUNN` and `EURNN` units.

## Imports

In [1]:
import torch
import numpy as np
from tqdm import trange
import sys; sys.path.append('..')
from torch_eunn import EUNN

np.random.seed(42)
np.set_printoptions(precision=2, suppress=True)

## Test Unitarity

The action of an `EUNN` layer should always be unitary.

In [2]:
%%time

# dimensionality of the cell
num_hidden = 50

# create new cell
cell = EUNN(num_hidden)

# get result of action of cell on identity matrix:
x = torch.stack([torch.eye(num_hidden, num_hidden), torch.zeros(num_hidden, num_hidden)], -1)
y = cell(x)
y = y[...,0].detach().numpy() + 1j*y[...,1].detach().numpy()

# check unitarity of result
print(np.abs(y@y.T.conj()))

[[1. 0. 0. ... 0. 0. 0.]
 [0. 1. 0. ... 0. 0. 0.]
 [0. 0. 1. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 1. 0. 0.]
 [0. 0. 0. ... 0. 1. 0.]
 [0. 0. 0. ... 0. 0. 1.]]
CPU times: user 43.9 ms, sys: 8.9 ms, total: 52.8 ms
Wall time: 14.1 ms


We see that the action of an `EUNN` is unitary to within the printed precision: `y @ y.T.conj()` is the identity matrix.
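Because the printed matrix is rounded to two decimals, a single number can make the check more precise. The following is a minimal sketch, assuming only NumPy: the helper `unitarity_error` is a hypothetical name (not part of `torch_eunn`), and a random unitary matrix from a QR decomposition stands in for the cell output `y`.

```python
import numpy as np

def unitarity_error(m):
    """Largest absolute entry of m @ m^H - I (close to 0 for a unitary m)."""
    m = np.asarray(m)
    identity = np.eye(m.shape[0])
    return float(np.max(np.abs(m @ m.conj().T - identity)))

# Stand-in for the cell output: the Q factor of a complex QR decomposition
# is unitary, while the random matrix it was built from is not.
rng = np.random.default_rng(0)
a = rng.standard_normal((50, 50)) + 1j * rng.standard_normal((50, 50))
q, _ = np.linalg.qr(a)

print("unitary matrix error: {:.2e}".format(unitarity_error(q)))
print("random matrix error:  {:.2e}".format(unitarity_error(a)))
```

Applied to the `y` computed in the cell above, `unitarity_error(y)` would be expected to sit near single-precision rounding error, since the cell works in `float32`.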

## Test Universality

Next, we check whether a full-capacity cell can approximate an arbitrary unitary matrix.

In [3]:
%%time

# dimensionality of the cell
num_hidden = 14

# create new cell
cell = EUNN(num_hidden, num_hidden)

# create unitary matrix to approximate
U, _, _ = np.linalg.svd(np.random.randn(num_hidden,num_hidden) + 1j*np.random.randn(num_hidden,num_hidden))
U_torch = torch.stack([
    torch.tensor(np.real(U.T.conj()), dtype=torch.float32),
    torch.tensor(np.imag(U.T.conj()), dtype=torch.float32),
], -1)

# create the target
# the cell needs to be trained such that action of the cell on U.T.conj() yields the identity
I_torch = torch.stack([
    torch.eye(num_hidden),
    torch.zeros((num_hidden,num_hidden)),
], -1)

# criterion & optimizer
lossfunc = torch.nn.MSELoss()
optimizer = torch.optim.Adam(cell.parameters(), lr=0.05)

# training
for _ in range(1500):
    optimizer.zero_grad()
    I_approx = cell(U_torch)
    loss = lossfunc(I_approx, I_torch)
    loss.backward()
    optimizer.step()

result = I_approx[...,0].detach().numpy() + 1j*I_approx[...,1].detach().numpy()

print(abs(result)**2)
print("Final loss: {:.2e}".format(loss.item()))

[[1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]]
Final loss: 7.90e-05
CPU times: user 7.29 s, sys: 50.9 ms, total: 7.34 s
Wall time: 6.76 s


We see that the cell successfully approximates the matrix `U`: its action on `U.T.conj()` reproduces the identity with a final loss of 7.90e-05.
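The training target relies on two facts: the conjugate transpose of a unitary matrix is its inverse, and complex matrices are passed to the cell as real arrays whose last axis holds the real and imaginary parts. The sketch below illustrates both with NumPy only. The helpers `to_stacked` and `from_stacked` are hypothetical names introduced here for illustration; they mirror the `torch.stack([...], -1)` and `[...,0] + 1j*[...,1]` conversions used in the cells above.

```python
import numpy as np

def to_stacked(z):
    """Complex array -> float32 array with real/imag parts on a new last axis."""
    return np.stack([z.real, z.imag], axis=-1).astype(np.float32)

def from_stacked(s):
    """Inverse of to_stacked: last axis of size 2 -> complex array."""
    return s[..., 0] + 1j * s[..., 1]

# Random unitary matrix: the left singular vectors of a complex Gaussian matrix.
rng = np.random.default_rng(42)
n = 14
U, _, _ = np.linalg.svd(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
U_inv = U.conj().T  # for a unitary U, the conjugate transpose is the inverse

stacked = to_stacked(U_inv)        # shape (n, n, 2), the layout fed to the cell
roundtrip = from_stacked(stacked)  # back to a complex array

left = U @ U_inv   # identity up to rounding
right = U_inv @ U  # identity up to rounding
print(stacked.shape)
print(np.allclose(left, np.eye(n)), np.allclose(right, np.eye(n)))
```

Since `U` multiplied by `U.T.conj()` gives the identity on either side, a cell that maps `U.T.conj()` to the identity acts like `U` itself, which is why the identity serves as the training target.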

## Test Speed

We can compare the execution speed of the `EUNN`, which acts like a unitary matrix, with that of an ordinary complex matrix layer.

In [4]:
from torch_eunn import cmm
class ComplexLayer(torch.nn.Module):
    def __init__(self, hidden_size):
        super(ComplexLayer, self).__init__()
        self.hidden_size = hidden_size
        self.W = torch.nn.Parameter(torch.randn(1, hidden_size, 2))
    def forward(self, x):
        return cmm(x, self.W)

In [5]:
batch_size = 30
hidden_size = 30

# create layers
complex_layer = ComplexLayer(hidden_size)
unitary_layer = EUNN(hidden_size)
unitary_layer_cap2 = EUNN(hidden_size, capacity=2)

# create input vector
x = torch.randn(batch_size, hidden_size, 2)

# time speeds
%time y = complex_layer(x)
%time y = unitary_layer(x)
%time y = unitary_layer_cap2(x)

CPU times: user 172 µs, sys: 2 µs, total: 174 µs
Wall time: 179 µs
CPU times: user 23.1 ms, sys: 0 ns, total: 23.1 ms
Wall time: 6.57 ms
CPU times: user 3.05 ms, sys: 49 µs, total: 3.1 ms
Wall time: 767 µs


We see that the unitary `EUNN` implementation is still considerably slower than an ordinary complex multiplication: 6.57 ms versus 179 µs wall time in the run above. For capacity 2 networks (which are recommended for use in recurrent neural networks), the gap is much smaller (767 µs wall time), while the potential benefits (no vanishing or exploding gradients) can be huge.
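A single `%time` call measures one run only, so the figures above may include one-off overhead and can vary between runs. The sketch below shows one way to obtain a steadier number with the standard-library `timeit` module. The helper `best_time` is a hypothetical name, and a plain NumPy complex matrix product stands in for the layers, which are not imported here.

```python
import timeit
import numpy as np

def best_time(fn, repeat=5, number=100):
    """Best per-call time in seconds over `repeat` runs of `number` calls each."""
    totals = timeit.repeat(fn, repeat=repeat, number=number)
    return min(totals) / number

# Stand-in workload: a 30x30 complex matrix product.
rng = np.random.default_rng(0)
x = rng.standard_normal((30, 30)) + 1j * rng.standard_normal((30, 30))
w = rng.standard_normal((30, 30)) + 1j * rng.standard_normal((30, 30))

t = best_time(lambda: x @ w)
print("complex matmul: {:.1f} µs per call".format(t * 1e6))
```

In the notebook, the same helper could be called with `lambda: unitary_layer(x)` or `lambda: complex_layer(x)` to compare the layers on repeated runs; taking the minimum reduces the influence of interference from other processes.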