# Preconditioner Series: Sparse Approximate Inverse

The premise of sparse approximate inverse (SAI) preconditioners is to explicitly compute a sparse matrix $M$ that approximates the inverse, $M\approx A^{-1}$.

Below, I implement the following approximate inverse methods:
1. Frobenius norm minimization
    1. SPAI algorithm
    2. MR algorithm
    3. Self-preconditioned MR algorithm
2. Factorized sparse approximate inverses
    1. Biconjugation algorithm
3. Inverse ILU techniques

## Imports

In [1]:
import numpy as np
import torch
import torch.cuda as tc
from torch.autograd import Variable

np.set_printoptions(suppress=True)

%load_ext autoreload
%autoreload 2

## Helper methods

In [2]:
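def sparsity(M):
    # Fraction of entries of M that are exactly zero
    return 1 - np.count_nonzero(M.numpy()) / torch.numel(M)

def report(A, M):
    # Print the Frobenius norm of the residual I - AM and the sparsity of M
    n = A.shape[0]  # take the size from A instead of relying on a global n
    print((torch.eye(n) - torch.mm(A, M)).norm(2).item(), sparsity(M))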

## Frobenius norm minimization

This class of methods computes a sparse matrix $M\approx A^{-1}$ by solving the following constrained minimization problem:

$$ \min_{M\in S} \lVert I-AM\rVert_F$$

where $S$ is a set of sparse matrices. This formulation yields a right approximate inverse; a left approximate inverse is obtained by minimizing $\lVert I-MA\rVert_F$ instead.
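
The Frobenius objective splits column by column, since $\lVert I-AM\rVert_F^2=\sum_{j=1}^n\lVert e_j-Am_j\rVert_2^2$, so each column of $M$ can be found independently from a small least-squares problem. Below is a minimal sketch of that idea, assuming the sparsity pattern is fixed in advance and passed as a boolean mask. The function name `frobenius_fixed_pattern`, the mask format, and the tridiagonal test matrix are illustrative assumptions, not part of the referenced papers.

```python
import numpy as np

def frobenius_fixed_pattern(A, pattern):
    """Minimize ||I - A M||_F over matrices M whose nonzeros lie in `pattern`.

    `pattern` is a boolean n-by-n mask (a hypothetical input format).
    """
    n = A.shape[0]
    M = np.zeros((n, n))
    for j in range(n):
        J = np.flatnonzero(pattern[:, j])   # rows allowed to be nonzero in column j
        if J.size == 0:
            continue
        e_j = np.zeros(n)
        e_j[j] = 1.0
        # Least-squares solve of A[:, J] @ m = e_j for the allowed entries only
        m_J, *_ = np.linalg.lstsq(A[:, J], e_j, rcond=None)
        M[J, j] = m_J
    return M

rng = np.random.default_rng(0)
n = 8
# Diagonally dominant tridiagonal test matrix (an arbitrary example)
A = 4 * np.eye(n) + np.diag(rng.random(n - 1), 1) + np.diag(rng.random(n - 1), -1)

M = frobenius_fixed_pattern(A, A != 0)      # reuse the pattern of A for M
res_spai = np.linalg.norm(np.eye(n) - A @ M, 'fro')
res_jacobi = np.linalg.norm(np.eye(n) - A @ np.diag(1 / np.diag(A)), 'fro')
print(res_spai, res_jacobi)
```

Because the diagonal belongs to the chosen pattern, the result can be no worse in the Frobenius sense than the plain diagonal (Jacobi) approximate inverse it is compared against.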

### SPAI algorithm

One of the most successful algorithms proposed in this class is known as the SPAI preconditioner. The algorithm is described [here](https://arxiv.org/abs/1503.04500) and [here](http://www.mathcs.emory.edu/~benzi/Web_papers/comp.pdf) and is as follows:

For every column $m_j$ of $M$:

1. Choose an initial sparsity pattern $J$
2. 

### MR algorithm

The [MR algorithm](https://www.cc.gatech.edu/~echow/pubs/newapinv.pdf) improves each column of $M$ with a few steps of minimal residual (MR) iteration.

For an n-by-n matrix $A$:

1. Choose an initial guess $M=M_0=[m_1, m_2,\dots,m_n]$
2. For each column $m_j$, $j=1,2,\dots,n$
    1. For $i$ in $1,2,\dots,n_i$
        1. $r_j=e_j-Am_j$
        2. $\alpha_j=r_j^TAr_j/((Ar_j)^T(Ar_j))$
        3. $m_j=m_j+\alpha_jr_j$
        4. Numerical dropping on $m_j$
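
The step length $\alpha_j$ above is the value that minimizes the residual after the update. As a short derivation: after $m_j\leftarrow m_j+\alpha r_j$ the new residual is $e_j-A(m_j+\alpha r_j)=r_j-\alpha Ar_j$, and setting the derivative of its squared norm to zero gives

$$ \frac{d}{d\alpha}\lVert r_j-\alpha Ar_j\rVert_2^2=-2\,r_j^TAr_j+2\alpha\,(Ar_j)^T(Ar_j)=0 \quad\Longrightarrow\quad \alpha_j=\frac{r_j^TAr_j}{(Ar_j)^T(Ar_j)} $$

so, before dropping, a step never increases $\lVert r_j\rVert_2$.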
        
**Numerical dropping?**
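
Numerical dropping means discarding small entries of $m_j$ after each update so that the column stays sparse. Two common rules are sketched below: drop everything under a magnitude threshold, or keep only the `k` largest-magnitude entries. The names `drop_threshold` and `drop_keep_largest` and the sample vector are illustrative assumptions, not taken from the linked paper.

```python
import numpy as np

def drop_threshold(m, tol):
    """Return a copy of m with entries of magnitude below tol set to zero."""
    out = m.copy()
    out[np.abs(out) < tol] = 0.0
    return out

def drop_keep_largest(m, k):
    """Return a copy of m that keeps only its k largest-magnitude entries."""
    out = np.zeros_like(m)
    if k > 0:
        keep = np.argsort(np.abs(m))[-k:]   # indices of the k largest |m_i|
        out[keep] = m[keep]
    return out

m = np.array([0.5, -0.001, 0.02, -0.3, 0.0004])
by_threshold = drop_threshold(m, 1e-2)      # zeros the two entries below 0.01
by_count = drop_keep_largest(m, 2)          # keeps 0.5 and -0.3 only
print(by_threshold)
print(by_count)
```

The threshold rule adapts the number of nonzeros to the data, while the keep-`k` rule bounds the storage per column up front.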


In [94]:
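def drop(m, tol=1e-3):
    # Numerical dropping, assumed here to mean: zero every entry whose
    # magnitude is below tol. m is modified in place.
    m[np.abs(m) < tol] = 0

def mr(A, M, ni=10, tol=1e-3):
    # Minimal residual iteration on each column of M (M is updated in place)
    n = A.shape[0]
    e = np.eye(n)

    for j in range(n):
        for i in range(ni):
            r = e[j] - A @ M[:, j]      # residual of column j
            Ar = A @ r
            denom = Ar @ Ar
            if denom == 0:              # A r is zero, so no step can be taken
                break
            a = (r @ Ar) / denom        # step length minimizing ||r - a * A r||_2
            M[:, j] += a * r
            drop(M[:, j], tol)          # M[:, j] is a view, so M itself changes

    return M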

In [5]:
n = 10
A = np.random.rand(n, n)
M0 = np.eye(n)

# mr(A, M0)
M0

array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]])

### Self-preconditioned MR algorithm

In [None]:
def self_mr(A, M0, ni, tol):
    raise NotImplementedError  # placeholder: this cell was left unfinished
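
A minimal sketch of what a self-preconditioned sweep could look like, assuming the only change from plain MR is that the search direction is the residual preconditioned by the current approximate inverse, $z=Mr_j$, with the step length chosen to minimize $\lVert r_j-\alpha Az\rVert_2$. The name `self_mr_sketch`, the in-place use of the current $M$, the threshold dropping rule, and the test matrix are all assumptions made for illustration.

```python
import numpy as np

def self_mr_sketch(A, M0, ni=5, tol=1e-3):
    """Hypothetical self-preconditioned MR sweep (a sketch, not the author's code)."""
    n = A.shape[0]
    M = M0.astype(float).copy()
    for j in range(n):
        e_j = np.zeros(n)
        e_j[j] = 1.0
        for _ in range(ni):
            r = e_j - A @ M[:, j]        # residual of column j
            z = M @ r                    # direction preconditioned by the current M
            q = A @ z
            qq = q @ q
            if qq == 0:                  # no usable direction, stop this column
                break
            alpha = (r @ q) / qq         # minimizes ||r - alpha * A z||_2
            M[:, j] += alpha * z
            col = M[:, j]                # view into M
            col[np.abs(col) < tol] = 0   # threshold dropping (an assumed rule)
    return M

rng = np.random.default_rng(0)
n = 10
A = 4 * np.eye(n) + 0.5 * rng.random((n, n))   # arbitrary well-conditioned test matrix
M0 = np.eye(n) / 4                             # scaled identity as the initial guess

M = self_mr_sketch(A, M0)
before = np.linalg.norm(np.eye(n) - A @ M0, 'fro')
after = np.linalg.norm(np.eye(n) - A @ M, 'fro')
print(before, after)
```

The initial guess matters more here than in plain MR, because a poor $M_0$ also produces poor search directions.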

## Factorized sparse approximate inverses

### Biconjugation algorithm

## Inverse ILU techniques

## Plain old L1-regularization

Having learned about L1-regularization in machine learning, I wondered how such a simple method would do when calculating a sparse approximate inverse.

The minimization problem here is:

$$ \min_{M} \lVert I-AM\rVert_F + \alpha\lVert M\rVert_1$$

In [87]:
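def loss(A, M, n, alpha):
    # Frobenius norm of the residual plus an entrywise L1 penalty on M
    return torch.norm(torch.mm(A, M) - torch.eye(n), p='fro') + alpha * torch.norm(M, p=1)

def plain_l1(A, lr=1e-3, alpha=1, eps=1e-3, rtol=1e-5, max_iter=10000):
    # rtol is accepted but currently unused
    n = A.shape[0]
    pA = torch.FloatTensor(A)                     # fixed matrix, no gradient needed
    pM = torch.randn(n, n, requires_grad=True)    # approximate inverse being learned

    opt = torch.optim.SGD([pM], lr)

    for i in range(max_iter):
        opt.zero_grad()
        # NOTE: this zeroes small entries of A, not of M, so it does not sparsify M
        pA[torch.abs(pA) < eps] = 0
        l = loss(pA, pM, n, alpha)
        l.backward()
        opt.step()

        # Report the residual norm and the sparsity of M every 1000 iterations
        if i % 1000 == 999:
            report(pA.detach(), pM.detach())

    return pM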

In [91]:
n = 10
A = np.random.rand(n, n)
M = plain_l1(A, lr=1e-3, alpha=1, eps=1e-3)

M.data.numpy()

4.145950794219971 0.0
3.189674139022827 0.0
3.157994508743286 0.0
3.158231258392334 0.0
3.159546375274658 0.0
3.157010793685913 0.0
3.1600568294525146 0.0
3.1574387550354004 0.0
3.158766508102417 0.0
3.1583542823791504 0.0


array([[ 0.00044699,  0.00048033, -0.00070255,  0.0004818 ,  0.00092256,
         0.00029915,  0.00062901,  0.00100516,  0.00095123,  0.00067867],
       [ 0.00027501,  0.00001562, -0.00013192,  0.00004996,  0.00107651,
        -0.00058413,  0.00033288,  0.00116887, -0.0001119 , -0.00003155],
       [ 0.00082322,  0.0000967 ,  0.00074049, -0.0002428 , -0.00017854,
        -0.00038929,  0.00034475, -0.00016088,  0.00019941,  0.00083907],
       [-0.00005981, -0.0007041 ,  0.00091283,  0.00013724,  0.00055957,
        -0.00050004, -0.00022909,  0.00089508,  0.00026577,  0.00085568],
       [-0.0003328 , -0.00034273,  0.00115568,  0.00106347, -0.0007028 ,
         0.0001009 , -0.00011733,  0.00043115, -0.00061869,  0.00048567],
       [ 0.00084954, -0.00000967, -0.00008444, -0.00017176, -0.0001022 ,
         0.00047752, -0.0000748 , -0.00052132,  0.00052364, -0.00071944],
       [ 0.00035677,  0.00062591, -0.0002821 ,  0.00017212, -0.00067891,
         0.00001098, -0.00044304,  0.00078664
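
Plain gradient steps on an L1 penalty rarely land on exact zeros, which is one reason a run like the one above can report zero sparsity. A standard alternative is proximal gradient descent (ISTA), where each gradient step on the smooth term is followed by soft-thresholding. The sketch below swaps in that technique and also uses the squared loss $\tfrac12\lVert I-AM\rVert_F^2+\alpha\lVert M\rVert_1$, so it is not the method used in the cell above. The names `soft_threshold` and `ista_inverse`, the value of `alpha`, and the test matrix are assumptions.

```python
import numpy as np

def soft_threshold(X, t):
    """Proximal operator of t * ||X||_1: shrink every entry toward zero by t."""
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)

def ista_inverse(A, alpha=0.5, max_iter=2000):
    """Proximal-gradient sketch for 0.5 * ||I - A M||_F^2 + alpha * ||M||_1."""
    n = A.shape[0]
    I = np.eye(n)
    M = np.zeros((n, n))
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # 1 / Lipschitz constant of the gradient
    for _ in range(max_iter):
        grad = A.T @ (A @ M - I)               # gradient of the smooth term
        M = soft_threshold(M - step * grad, step * alpha)
    return M

rng = np.random.default_rng(0)
n = 10
A = 4 * np.eye(n) + 0.5 * rng.random((n, n))   # arbitrary well-conditioned test matrix

M = ista_inverse(A)
residual = np.linalg.norm(np.eye(n) - A @ M, 'fro')
zero_fraction = 1 - np.count_nonzero(M) / M.size
print(residual, zero_fraction)
```

Larger values of `alpha` trade a bigger residual for more exact zeros in $M$.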

## Resources

[A comparative study of sparse approximate inverse preconditioners](http://www.mathcs.emory.edu/~benzi/Web_papers/comp.pdf)

[A Residual Based Sparse Approximate Inverse Preconditioning Procedure for Large Sparse Linear Systems](https://arxiv.org/abs/1503.04500)