# Inversion of 2 by 2 matrices using an operator recurrent neural network

We use a simplified version of the network architecture proposed in the preprint

> Maarten V. de Hoop, Matti Lassas, Christopher A. Wong. _Deep learning architectures for nonlinear operator functions and nonlinear inverse problems_. [arXiv:1912.11090](https://arxiv.org/abs/1912.11090)

and teach it to invert matrices $X$ of the form $X = R D R^T$ where

$$
R = \begin{pmatrix}
c & -s
\\
s & c
\end{pmatrix},
\quad
D = \begin{pmatrix}
\lambda_1 & 0
\\
0 & \lambda_2
\end{pmatrix},
$$
$c = \cos(\alpha)$ and $s = \sin(\alpha)$ for some $\alpha \in (0,2\pi)$,
and $\lambda_j \in (1/2, 3/2)$, $j=1,2$.
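
Because $R$ is orthogonal ($R^T R = I$), the inverse that the network has to learn can be written in closed form. The following is a short worked derivation from the definitions above, not a formula taken from the preprint:

$$
X^{-1} = R D^{-1} R^T
= \begin{pmatrix}
c^2/\lambda_1 + s^2/\lambda_2 & cs\,(1/\lambda_1 - 1/\lambda_2)
\\
cs\,(1/\lambda_1 - 1/\lambda_2) & s^2/\lambda_1 + c^2/\lambda_2
\end{pmatrix}.
$$

In particular, $X^{-1}$ is symmetric with eigenvalues $1/\lambda_j \in (2/3, 2)$, so the inversion problem is well conditioned on this family of matrices.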

We use the notation of version 3 of the preprint (revised 3 Jan 2022). The notation differs in earlier versions.

In the code, variables have the same meaning as in the PyTorch [Quickstart](https://pytorch.org/tutorials/beginner/basics/quickstart_tutorial.html) guide.

# Initialization

The operator recurrent architecture is implemented in the `opnet` module, and
the generation of training data in `simple_inversion_data`.

The file at `PATH` is used to save the parameters of the network.

In [149]:
import numpy as np
import torch

import opnet
from simple_inversion_data import generate_data, save_data, load_data

PATH = './simple_inversion_net2.pth'
# PATH = './simple_inversion_netReLU.pth'


Specify the network model and the loss function.

In [150]:

dim = 2 # use 2 x 2 matrices
num_layers = 5
# create a new neural network
model = opnet.OperatorNet(dim, 2*num_layers, useReLU=False) 
loss_fn = torch.nn.MSELoss()

# Generation of training data

Training data consists of pairs $(X,y)$ where $X$ is an invertible $2 \times 2$ matrix and $y = X^{-1} v$
where $v = (1,1) \in \mathbb{R}^2$.
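
A minimal sketch of how such pairs could be generated with NumPy is given below. The function name `sample_pairs`, the uniform sampling of $\alpha$ and $\lambda_j$, and the `seed` parameter are assumptions made for illustration; the actual `generate_data` in `simple_inversion_data` may be implemented differently.

```python
import numpy as np

def sample_pairs(n, seed=0):
    """Draw n pairs (X, y) with X = R D R^T and y = X^{-1} v, v = (1, 1).

    Hypothetical helper, not the notebook's generate_data.
    """
    rng = np.random.default_rng(seed)
    # Assumed sampling: alpha uniform on (0, 2*pi), lambda_j uniform on (1/2, 3/2)
    alpha = rng.uniform(0.0, 2.0 * np.pi, size=n)
    lam = rng.uniform(0.5, 1.5, size=(n, 2))
    c, s = np.cos(alpha), np.sin(alpha)
    # Rotation matrices R = [[c, -s], [s, c]], stacked to shape (n, 2, 2)
    R = np.stack([np.stack([c, -s], axis=-1),
                  np.stack([s, c], axis=-1)], axis=-2)
    # Scaling column j of R by lambda_j equals R @ D; then multiply by R^T
    X = (R * lam[:, None, :]) @ R.transpose(0, 2, 1)
    # Solve X y = v for each matrix instead of forming X^{-1} explicitly
    v = np.ones((n, 2, 1))
    y = np.linalg.solve(X, v)
    return X, y
```

Calling `X, y = sample_pairs(4)` returns arrays of shape `(4, 2, 2)` and `(4, 2, 1)`; the column-vector shape of `y` matches the shape of the tensors printed in the training output further down.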

In [151]:
# save_data(*generate_data(60000), "simple_inversion_train_data.npz")
# save_data(*generate_data(10000), "simple_inversion_test_data.npz")

# Training

In [152]:
import training_and_testing

# pick up any changes made to training_and_testing.py
from importlib import reload 
reload(training_and_testing)

lr=1e-1
# load the previously trained network:
model.load_state_dict(torch.load(PATH)) 
training_and_testing.training_and_testing(model, loss_fn, lr)

lr=  0.1
kierros  1
True: 
tensor([[[1.3861],
         [1.5859]],

        [[0.9675],
         [1.4768]]])
Prediction: 
tensor([[[1.3843],
         [1.5955]],

        [[0.9753],
         [1.4741]]])
Avg loss: 0.000030
kierros  2
True: 
tensor([[[1.3861],
         [1.5859]],

        [[0.9675],
         [1.4768]]])
Prediction: 
tensor([[[1.3842],
         [1.5953]],

        [[0.9752],
         [1.4744]]])
Avg loss: 0.000029
kierros  3
True: 
tensor([[[1.3861],
         [1.5859]],

        [[0.9675],
         [1.4768]]])
Prediction: 
tensor([[[1.3842],
         [1.5951]],

        [[0.9751],
         [1.4747]]])
Avg loss: 0.000028
kierros  4
True: 
tensor([[[1.3861],
         [1.5859]],

        [[0.9675],
         [1.4768]]])
Prediction: 
tensor([[[1.3841],
         [1.5950]],

        [[0.9750],
         [1.4749]]])
Avg loss: 0.000027
kierros  5
True: 
tensor([[[1.3861],
         [1.5859]],

        [[0.9675],
         [1.4768]]])
Prediction: 
tensor([[[1.3841],
         [1.5949]],



In [153]:
# Load the training data
# train_loader = torch.utils.data.DataLoader(
#     load_data("simple_inversion_train_data.npz"), 
#     batch_size=64)

Choose the optimization method.

In [154]:
# Learning rate parameter is from the quickstart guide 
# optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

Loop over the training data multiple times (epochs) and 
save the optimized parameters. 

In [155]:
# for epoch in range(2): 
#     print(f"Epoch {epoch+1}\n-------------------------------")
#     for batch, (X, y) in enumerate(train_loader):
#         # Compute prediction error
#         pred = model(X)
#         loss = loss_fn(pred, y)
#         # Backpropagation
#         optimizer.zero_grad()
#         loss.backward()
#         optimizer.step()
#         # Print statistics
#         if batch % 100 == 0:
#             n, N = (batch + 1) * len(X), len(train_loader.dataset)
#             print(f"loss: {loss.item():>7f}  [{n:>5d}/{N:>5d}]")

torch.save(model.state_dict(), PATH)

# Testing

If we have already trained the network, we can just load its parameters. (Note that we still need to run the initialization.)

In [156]:
## Load trained variables
# model.load_state_dict(torch.load(PATH))

In [157]:
# Load the testing data
# test_loader = torch.utils.data.DataLoader(
#     load_data("simple_inversion_test_data.npz"), 
#     batch_size=64)

Compute a couple of samples.

In [158]:
# dataiter = iter(test_loader)
# X, y = next(dataiter)
# with torch.no_grad():
#     pred = model(X)
# print("True: ")
# print(y[:2])
# print("Prediction: ")
# print(pred[:2])

In [159]:
# num_batches = len(test_loader)
# test_loss = 0
# with torch.no_grad():
#     for X, y in test_loader:
#         pred = model(X)
#         test_loss += loss_fn(pred, y).item()
# test_loss /= num_batches
# print(f"Avg loss: {test_loss:>8f} \n")