<center>
    <h1> ILI285 - Computación Científica I  / INF285 - Computación Científica </h1>
    <h2> Sylvester Equation with GMRes </h2>
    <h2> <a href="#acknowledgements"> [S]cientific [C]omputing [T]eam </a> </h2>
    <h2> Version: 1.00</h2>
</center>

## Table of Contents
* [Introduction](#intro)
* [Short reminder about Least Squares](#LS)
* [GMRes](#GMR)
* [Theoretical Problems](#TP)
* [Practical Problems](#PP)
* [Acknowledgements](#acknowledgements)

In [1]:
import numpy as np
import scipy as sp
from scipy import linalg as la
import scipy.sparse.linalg as spla
import matplotlib.pyplot as plt
%matplotlib inline
import matplotlib as mpl
mpl.rcParams['font.size'] = 14
mpl.rcParams['axes.labelsize'] = 20
mpl.rcParams['xtick.labelsize'] = 14
mpl.rcParams['ytick.labelsize'] = 14

<div id='intro' />

## Sylvester Equation

See [Sylvester equation (Wikipedia)](https://en.wikipedia.org/wiki/Sylvester_equation).

$$A\,X+X\,B=C$$
where $A\in\mathbb{R}^{n\times n}$, $B\in\mathbb{R}^{n\times n}$, $C\in\mathbb{R}^{n\times n}$ and $X\in\mathbb{R}^{n\times n}$.
$A$, $B$ and $C$ are given; the problem is to find $X$.
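
The Sylvester equation is linear in the entries of $X$, so it can be rewritten as an ordinary linear system of size $n^2\times n^2$. With NumPy's row-major `flatten`, $\text{vec}(A\,X+X\,B)=(A\otimes I+I\otimes B^T)\,\text{vec}(X)$. The following is a minimal sketch on a small, hypothetical, well-conditioned instance (not the instance used later in this notebook); it compares the Kronecker solve against `scipy.linalg.solve_sylvester`. Building the Kronecker matrix explicitly is only reasonable for small $n$, which is one motivation for a matrix-free method such as GMRes.

```python
import numpy as np
from scipy import linalg as la

# Hypothetical small test instance (assumption: chosen to be well conditioned).
rng = np.random.default_rng(0)
n = 4
A = rng.random((n, n)) + n * np.eye(n)
B = rng.random((n, n)) + n * np.eye(n)
C = rng.random((n, n))

# Row-major vectorization: vec(AX + XB) = (A kron I + I kron B^T) vec(X).
I = np.eye(n)
K = np.kron(A, I) + np.kron(I, B.T)

# Solve the n^2 x n^2 system and reshape back into an n x n matrix.
X_kron = np.linalg.solve(K, C.flatten()).reshape(n, n)

# Reference solution from SciPy's dense Sylvester solver (solves AX + XB = C).
X_ref = la.solve_sylvester(A, B, C)

print(np.linalg.norm(A @ X_kron + X_kron @ B - C))  # residual of the Kronecker solve
print(np.linalg.norm(X_kron - X_ref))               # difference between both solutions
```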

<div id='GMR' />

## GMRes

GMRes is a member of the family of Krylov methods for solving $A\,\mathbf{x}=\mathbf{b}$. It finds an approximation of $\mathbf{x}$ whose correction to the initial guess $\mathbf{x}_0$ is restricted to _live_ on the Krylov subspace $\mathcal{K}_k$, where $\mathcal{K}_k=\text{span}\{\mathbf{r}_0, A\,\mathbf{r}_0, A^2\,\mathbf{r}_0, \cdots, A^{k-1}\,\mathbf{r}_0\}$ and $\mathbf{r}_0 = \mathbf{b} - A\,\mathbf{x}_0$ is the residual vector of the initial guess.

The idea behind this method is to look for improvements to the initial guess $\mathbf{x}_0$ in the Krylov space. At the $k$-th iteration, we enlarge the Krylov space by adding $A^k\,\mathbf{r}_0$, reorthogonalize the basis, and then use least squares to find the best improvement to add to $\mathbf{x}_0$.
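
To see why the basis needs to be orthogonalized, the sketch below builds the raw Krylov vectors $\mathbf{r}_0, A\,\mathbf{r}_0, \dots, A^{k-1}\,\mathbf{r}_0$ for a hypothetical random matrix and compares the condition number of that basis with an orthonormal basis of the same subspace. The matrix, sizes, and variable names are assumptions made for illustration. A single QR factorization is used here only for brevity; GMRes itself orthogonalizes one new vector per iteration.

```python
import numpy as np

# Hypothetical example data (not taken from the notebook's Sylvester instance).
rng = np.random.default_rng(0)
n, k = 50, 12
A = rng.random((n, n)) + 2 * np.eye(n)
r0 = rng.random(n)

# Raw Krylov basis. Each column is rescaled to unit length to avoid overflow;
# rescaling a column does not change the subspace that the columns span.
K = np.empty((n, k))
K[:, 0] = r0 / np.linalg.norm(r0)
for j in range(1, k):
    v = A @ K[:, j - 1]
    K[:, j] = v / np.linalg.norm(v)

# Orthonormal basis of the same subspace.
Q, _ = np.linalg.qr(K)

cond_K = np.linalg.cond(K)  # typically very large: columns become nearly parallel
cond_Q = np.linalg.cond(Q)  # close to 1 for an orthonormal basis
print(cond_K, cond_Q)
```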

The algorithm is as follows:

`Generalized Minimum Residual Method`

$\mathbf{x}_0$ `= initial guess`<br>
$\mathbf{r}$ `=` $\mathbf{b} - A\,\mathbf{x}_0$ `=` $\mathbf{b} - $<span style="color:blue">afun</span>$(\mathbf{x}_0)$<br>
$\mathbf{q}_1$ `=` $\mathbf{r} / \|\mathbf{r}\|_2$<br>
`for` $k = 1, ..., m$<br>
$\qquad \ \ \mathbf{y} = A\,\mathbf{q}_k$ = <span style="color:blue">afun</span>$(\mathbf{q}_k)$ <br>
$\qquad$ `for` $j = 1,2,...,k$ <br>
$\qquad \qquad$ $h_{jk} = \mathbf{q}_j^*\,\mathbf{y}$<br>
$\qquad \qquad$ $\mathbf{y} = \mathbf{y} - h_{jk}\, \mathbf{q}_j$<br>
$\qquad$ `end`<br>
$\qquad \ h_{k+1,k} = \|\mathbf{y}\|_2 \qquad$ `(If ` $h_{k+1,k} = 0$ `, skip next line and terminate at bottom.)` <br>
$\qquad \ \mathbf{q}_{k+1} = \mathbf{y}/h_{k+1,k}$ <br>
$\qquad$ `Minimize` $\left\|\widehat{H}_k\, \mathbf{c}_k - [\|\mathbf{r}\|_2 \ 0 \ 0 \ ... \ 0]^T \right\|_2$ `for` $\mathbf{c}_k$ <br>
$\qquad$ $\mathbf{x}_k = Q_k \, \mathbf{c}_k + \mathbf{x}_0$ <br>
`end`
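
The pseudocode above can be sketched in NumPy so that $A$ is only touched through a function `afun`, which is what makes the method usable for the Sylvester equation. This is a minimal sketch under stated assumptions, not a reference implementation: the name `gmres_matrix_free`, its parameters `m` and `threshold`, the breakdown tolerance, and the small test instance are all hypothetical choices, and the least-squares problem is solved from scratch at each iteration instead of being updated incrementally.

```python
import numpy as np

def gmres_matrix_free(afun, b, x0=None, m=50, threshold=1e-10):
    """Sketch of GMRes where A is only available through afun(v) = A v."""
    n = b.shape[0]
    x0 = np.zeros(n) if x0 is None else x0
    r0 = b - afun(x0)
    beta = np.linalg.norm(r0)
    if beta == 0.0:                      # x0 already solves the system
        return x0, np.array([0.0])
    m = min(m, n)
    Q = np.zeros((n, m + 1))             # orthonormal Krylov basis
    H = np.zeros((m + 1, m))             # upper Hessenberg matrix
    Q[:, 0] = r0 / beta
    residuals = [beta]
    x = x0
    for k in range(m):
        y = afun(Q[:, k])
        for j in range(k + 1):           # orthogonalize against previous q_j
            H[j, k] = Q[:, j] @ y
            y = y - H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(y)
        breakdown = H[k + 1, k] < 1e-14  # assumed tolerance for h_{k+1,k} = 0
        if not breakdown:
            Q[:, k + 1] = y / H[k + 1, k]
        # Small least-squares problem: min || H_k c - ||r0|| e_1 ||_2
        e1 = np.zeros(k + 2)
        e1[0] = beta
        Hk = H[:k + 2, :k + 1]
        ck = np.linalg.lstsq(Hk, e1, rcond=None)[0]
        x = x0 + Q[:, :k + 1] @ ck
        residuals.append(np.linalg.norm(Hk @ ck - e1))
        if breakdown or residuals[-1] < threshold:
            break
    return x, np.array(residuals)

# Usage on a hypothetical, well-conditioned Sylvester instance A X + X B = C.
rng = np.random.default_rng(0)
n = 4
A = rng.random((n, n)) + n * np.eye(n)
B = rng.random((n, n)) + n * np.eye(n)
C = rng.random((n, n))

def sylvester_op(x):
    X = x.reshape(n, n)
    return (A @ X + X @ B).flatten()

x, res = gmres_matrix_free(sylvester_op, C.flatten(), m=n * n)
X = x.reshape(n, n)
print(np.linalg.norm(A @ X + X @ B - C))
```

Passing a function instead of a matrix means the $n^2\times n^2$ operator is never formed; each iteration costs two $n\times n$ matrix products.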

## Generating an instance of the Sylvester Equation

In [3]:
n = 10
np.random.seed(0)
A = np.random.rand(n,n)+2*np.eye(n)
#print(np.linalg.eigvals(A))
B = np.random.rand(n,n)
#print(np.linalg.eigvals(B))
C = np.random.rand(n,n)

### Using a Jacobi/Gauss-Seidel iterative solver

In [4]:
def solve_JGS_iterative_Sylvester(A,B,C,m,alg=1):
    if alg==1:
        # Algorithm 1
        # AX+XB=C
        # X=A^{-1}(C-XB)
        # X^{(i+1)}=A^{-1}(C-X^{(i)} B)
        X0 = np.zeros_like(A)
        X1 = np.zeros_like(A)
        for i in range(m):
            X1=np.linalg.solve(A,C-np.dot(X0,B))
            X0=X1
            print(np.linalg.norm(np.dot(A,X1)+np.dot(X1,B)-C))
        return X1
#    elif alg==2: # TO DO 1!!!!!!!!!!!!!
        # Algorithm 2
        # AX+XB=C
        # X = (C-AX)B^{-1}
        # X^{(i+1)}=(C-A X^{(i)})B^{-1}
        # How do we implement this? Hint: You only need to use np.linalg.solve in a convenient way.

In [5]:
X_JGS=solve_JGS_iterative_Sylvester(A,B,C,10)

8.06674588003015
34.574062476753454
168.7035372976298
819.8919726957432
3949.6601997495272
18860.43712486795
89325.87084866337
419868.5695000339
1959858.326207409
9089702.028396802
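
The residual norms printed above grow at every iteration, so Algorithm 1 diverges for this instance. One way to understand this: the error $E^{(i)}=X^{(i)}-X$ satisfies $E^{(i+1)}=-A^{-1}\,E^{(i)}\,B$, and the eigenvalues of the linear map $E\mapsto -A^{-1}\,E\,B$ are $-\mu_j/\lambda_i$, where $\lambda_i$ and $\mu_j$ are the eigenvalues of $A$ and $B$. The iteration converges for every initial guess only if the spectral radius $\max_j|\mu_j|/\min_i|\lambda_i|$ is below 1. The sketch below computes that quantity; the helper name `jgs_spectral_radius` is hypothetical. A value above 1 for the instance generated earlier would be consistent with the growth seen in the output.

```python
import numpy as np

def jgs_spectral_radius(A, B):
    """Spectral radius of the map E -> -A^{-1} E B (Algorithm 1 error propagation)."""
    lam = np.linalg.eigvals(A)  # eigenvalues of A
    mu = np.linalg.eigvals(B)   # eigenvalues of B
    return np.max(np.abs(mu)) / np.min(np.abs(lam))

# Same construction as the instance generated earlier in this notebook.
n = 10
np.random.seed(0)
A = np.random.rand(n, n) + 2 * np.eye(n)
B = np.random.rand(n, n)

rho = jgs_spectral_radius(A, B)
print(rho)  # the iteration can only converge in general if this is below 1
```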


In [6]:
print(X_JGS)

[[  75200.97487677   65634.05984605   65998.54659038   80340.85290919
    90925.59282883   84199.35838266   59829.01081381   85410.37801577
    82946.88506162   63244.18574112]
 [ 128633.7885535   112268.76431551  112892.59243868  137425.55044263
   155531.33214488  144025.84608515  102339.21298884  146096.83219882
   141882.95005125  108180.63127189]
 [  49787.61259846   43453.60578573   43694.85805072   53190.61746407
    60198.27573483   55745.23310695   39610.27894313   56546.48369473
    54915.85788812   41871.20787801]
 [ 123623.82492401  107896.42890395  108495.88720754  132073.26368602
   149473.98087708  138416.59632654   98353.63778848  140407.11863359
   136357.19314882  103967.61172201]
 [  11527.56089777   10060.80519377   10116.78678399   12315.04595271
    13937.96546518   12906.76654795    9171.23034486   13092.3297926
    12714.6142232     9694.54903326]
 [ -42186.05121531  -36818.77362958  -37023.3952593   -45069.08936689
   -51006.99200954  -47234.04135621  -33562.41

### Using GMRes of SciPy

In [7]:
def compute_matrix_vector_product(x,A,B,n):
    X = np.reshape(x,(n,n))
    out = np.dot(A,X)+np.dot(X,B)
    return out.flatten()
Ax = lambda x: compute_matrix_vector_product(x,A,B,n)
afun = spla.LinearOperator((n**2, n**2), matvec=Ax)
# Note: recent SciPy releases name the relative tolerance `rtol`; older releases call it `tol`.
x, exitCode = spla.gmres(afun, C.flatten(), rtol=1e-10)
X_GMRes = np.reshape(x,(n,n))
print(X_GMRes)

[[ 0.00933016  0.08684042 -0.01125225 -0.13670696 -0.14126436 -0.08920281
   0.06787225  0.07854398  0.14141093  0.24525254]
 [ 0.26933517 -0.12182861  0.07301594 -0.13434814  0.12407477  0.15335455
   0.00363918 -0.06085898  0.04648421 -0.03495096]
 [ 0.1255255   0.03479638 -0.0896578   0.130781    0.06218381  0.21909669
  -0.01948567 -0.18669482  0.20815108 -0.02104573]
 [ 0.02574737 -0.07080452  0.04197927 -0.28438946  0.04063     0.08167592
   0.09786173  0.06858471  0.01453835  0.11473811]
 [ 0.18789107 -0.06656593  0.02870919 -0.22879644  0.18107615  0.0382124
   0.19738009 -0.00718783 -0.09009293  0.03936773]
 [-0.16259341  0.26735377  0.16881013  0.22591956  0.10410711 -0.22623425
   0.14304677  0.1028314   0.02887488 -0.11768034]
 [-0.03125821 -0.13399008  0.01144478  0.21488824  0.18362581  0.16883234
  -0.08415118  0.09236312 -0.08581249  0.08713543]
 [ 0.07626468  0.20380177  0.03588479  0.4734714  -0.18812106  0.10684246
   0.01704592  0.15817514 -0.08554415  0.1091209 ]
 

In [8]:
print(np.linalg.norm(X_JGS-X_GMRes))

1416184.1547198202


## Computing the relative residuals

In [9]:
Ax_JGS = Ax(X_JGS.flatten())
Ax_GMRes = Ax(X_GMRes.flatten())
c =  C.flatten()
print(np.linalg.norm(Ax_JGS-c)/np.linalg.norm(c))
print(np.linalg.norm(Ax_GMRes-c)/np.linalg.norm(c))

1534827.0668660458
9.010731898377531e-11


## To Do 2: What do we need to change in our implementation of GMRes to be able to use the lambda function "Ax"?

In [10]:
# This is a very instructive implementation of GMRes.
def GMRes_Ax(A, b, x0=np.array([0.0]), m=10, flag_display=True, threshold=1e-12):
    n = len(b)
    if len(x0)==1:
        x0=np.zeros(n)
    r0 = b - np.dot(A, x0)
    nr0=np.linalg.norm(r0)
    out_res=np.array(nr0)
    Q = np.zeros((n,n))
    H = np.zeros((n,n))
    Q[:,0] = r0 / nr0
    flag_break=False
    for k in np.arange(np.min((m,n))):
        y = np.dot(A, Q[:,k])
        if flag_display:
            print('||y||=',np.linalg.norm(y))
        for j in np.arange(k+1):
            H[j][k] = np.dot(Q[:,j], y)
            if flag_display:
                print('H[',j,'][',k,']=',H[j][k])
            y = y - np.dot(H[j][k],Q[:,j])
            if flag_display:
                print('||y||=',np.linalg.norm(y))
        # All but the last equation are treated equally. Why?
        if k+1<n:
            H[k+1][k] = np.linalg.norm(y)
            if flag_display:
                print('H[',k+1,'][',k,']=',H[k+1][k])
            if (np.abs(H[k+1][k]) > 1e-16):
                Q[:,k+1] = y/H[k+1][k]
            else:
                print('flag_break has been activated')
                flag_break=True
            # Do you remember e_1? The canonical vector.
            e1 = np.zeros((k+1)+1)        
            e1[0]=1
            H_tilde=H[0:(k+1)+1,0:k+1]
        else:
            H_tilde=H[0:k+1,0:k+1]
        # Solving the 'SMALL' least-squares problem.
        # This could be improved with Givens rotations!
        ck = np.linalg.lstsq(H_tilde, nr0*e1, rcond=None)[0]
        if k+1<n:
            x = x0 + np.dot(Q[:,0:(k+1)], ck)
        else:
            x = x0 + np.dot(Q, ck)
        # Why is 'norm_small' equal to 'norm_full'?
        norm_small=np.linalg.norm(np.dot(H_tilde,ck)-nr0*e1)
        out_res = np.append(out_res,norm_small)
        if flag_display:
            norm_full=np.linalg.norm(b-np.dot(A,x))
            print('..........||b-A x_k||=',norm_full)
            print('..........||H_k c_k-nr0*e1||=',norm_small)
        if flag_break:
            if flag_display: 
                print('EXIT: flag_break=True')
            break
        if norm_small<threshold:
            if flag_display:
                print('EXIT: norm_small<threshold')
            break
    return x,out_res

<div id='acknowledgements' />

# Acknowledgements
* _Material created by professor Claudio Torres_ (`ctorres@inf.utfsm.cl`). _July 2020._