# SciPy Tutorial 4 - Linear Algebra (<font color=#0099ff>scipy.linalg</font>)
When SciPy is built using the optimized ATLAS LAPACK and BLAS libraries, it has very fast linear algebra capabilities. If you dig deep enough, all of the raw LAPACK and BLAS libraries are available for your use for even more speed.

All of these linear algebra routines expect an object that can be converted into a 2-D array. The output of these routines is also a 2-D array.

## scipy.linalg vs numpy.linalg

<font color=#0099ff>scipy.linalg</font> contains all the functions in <font color=#0099ff>numpy.linalg</font>, plus some more advanced ones not contained in `numpy.linalg`.

Another advantage of using `scipy.linalg` over `numpy.linalg` is that it is always compiled with BLAS/LAPACK support, while for numpy this is optional. Therefore, the scipy version might be faster depending on how numpy was installed.

Unless you want to avoid adding `scipy` as a dependency of your `numpy` program, prefer `scipy.linalg` over `numpy.linalg`.
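
As a rough illustration of the "more advanced" routines mentioned above, the sketch below checks a few function names against both modules. The three names are examples chosen for this sketch, not an exhaustive list.

```python
import numpy as np
from scipy import linalg

# Example routine names picked for illustration: matrix exponential,
# LU decomposition, and Schur decomposition.
scipy_only = ["expm", "lu", "schur"]

for name in scipy_only:
    in_scipy = hasattr(linalg, name)
    in_numpy = hasattr(np.linalg, name)
    print(f"{name}: scipy.linalg={in_scipy}, numpy.linalg={in_numpy}")
```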

## numpy.matrix vs 2-D numpy.ndarray

`numpy.matrix` is a matrix class that has a more convenient interface than `numpy.ndarray` for matrix operations. This class supports, for example, MATLAB-like creation syntax via the semicolon, has matrix multiplication as default for the `*` operator, and contains `I` and `T` members that serve as shortcuts for inverse and transpose:

In [None]:
import numpy as np
A = np.mat('[1 2; 3 4]')
A

In [None]:
A.I

In [None]:
b = np.mat('[5 6]')
b

In [None]:
b.T

In [None]:
A*b.T

Despite its convenience, the use of the `numpy.matrix` class is discouraged, since it adds nothing that cannot be accomplished with 2-D `numpy.ndarray` objects, and it can cause confusion about which class is in use. For example, the above code can be rewritten as:

In [None]:
from scipy import linalg
A = np.array([[1,2],[3,4]])
A

In [None]:
linalg.inv(A)

In [None]:
b = np.array([[5,6]]) #2D array
b

In [None]:
b.T

In [None]:
A*b #not matrix multiplication!

In [None]:
A.dot(b.T) #matrix multiplication

In [None]:
b = np.array([5,6]) #1D array
b

In [None]:
b.T  #not matrix transpose!

In [None]:
A.dot(b)  #does not matter for multiplication

`scipy.linalg` operations can be applied equally to `numpy.matrix` or to 2D `numpy.ndarray` objects.
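
On Python 3.5 and later, the `@` operator offers another way to write matrix products with plain `numpy.ndarray` objects. The sketch below, which reuses the small arrays from the cells above, compares it with `dot` and with the elementwise `*`.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])        # 2-D row vector, shape (1, 2)

prod_dot = A.dot(b.T)         # matrix product via the dot method
prod_at = A @ b.T             # the same matrix product via the @ operator
elementwise = A * b           # broadcasting, NOT a matrix product

print(prod_at)
print(np.array_equal(prod_dot, prod_at))
print(elementwise)
```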

## Basic routines

### Finding the inverse

The inverse of a matrix $\mathbf{A}$ is the matrix $\mathbf{B}$ such that $\mathbf{AB}=\mathbf{I}$, where $\mathbf{I}$ is the identity matrix consisting of ones down the main diagonal. Usually, $\mathbf{B}$ is denoted $\mathbf{B}=\mathbf{A}^{-1}$. In SciPy, the inverse of the NumPy array `A` is obtained using <font color=#0099ff>linalg.inv</font>`(A)`, or using `A.I` if `A` is a `numpy.matrix`. For example, let

$$
\mathbf{A}=\begin{bmatrix}
1 & 3 & 5\\
2 & 5 & 1\\
2 & 3 & 8
\end{bmatrix},
$$

then

$$
A^{-1}=\frac{1}{25}\begin{bmatrix}
-37 & 9 & 22\\
14 & 2 & -9\\
4 & -3 & 1
\end{bmatrix}\ =\ \begin{bmatrix}
-1.48 & 0.36 & 0.88\\
0.56 & 0.08 & -0.36\\
0.16 & -0.12 & 0.04
\end{bmatrix}.
$$

In [None]:
A = np.array([[1,3,5],[2,5,1],[2,3,8]])
A

In [None]:
linalg.inv(A)

In [None]:
A.dot(linalg.inv(A)) #double check
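
The closed-form inverse quoted above can also be checked numerically. This is a small sketch that compares `linalg.inv` with the $\frac{1}{25}$ matrix from the text, using `np.allclose` to absorb floating-point round-off.

```python
import numpy as np
from scipy import linalg

A = np.array([[1, 3, 5], [2, 5, 1], [2, 3, 8]])
A_inv = linalg.inv(A)

# Closed-form inverse quoted in the text.
expected = np.array([[-37, 9, 22], [14, 2, -9], [4, -3, 1]]) / 25.0

print(np.allclose(A_inv, expected))        # matches the hand-computed inverse
print(np.allclose(A @ A_inv, np.eye(3)))   # A times its inverse is the identity
```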

### Solving a linear system
Solving linear systems of equations is straightforward using the scipy command <font color=#0099ff>linalg.solve</font>. This command expects an input matrix and a right-hand side vector. The solution vector is then computed. An option for entering a symmetric matrix is offered, which can speed up the processing when applicable. As an example, suppose it is desired to solve the following simultaneous equations:

$$\begin{align}
x+3y+5z&=10\\
2x+5y+z&=8\\
2x+3y+8z&=3
\end{align}$$

We could find the solution vector using a matrix inverse:

$$
\begin{bmatrix}
x\\
y\\
z
\end{bmatrix}\ = \ 
\begin{bmatrix}
1 & 3 & 5\\
2 & 5 & 1\\
2 & 3 & 8
\end{bmatrix}^{\ -1}
\begin{bmatrix}
10\\
8\\
3
\end{bmatrix}\ = \ \frac{1}{25}\ 
\begin{bmatrix}
-232\\
129\\
19
\end{bmatrix}\ = \ 
\begin{bmatrix}
-9.28\\
5.16\\
0.76
\end{bmatrix}.
$$

However, it is better to use the `linalg.solve` command, which can be faster and more numerically stable than forming the inverse explicitly. Both approaches give the same answer, as the following cells show for a smaller $2\times2$ system:

In [None]:
A = np.array([[1, 2], [3, 4]])
A

In [None]:
b = np.array([[5], [6]])
b

In [None]:
linalg.inv(A).dot(b)  # slow

In [None]:
A.dot(linalg.inv(A).dot(b)) - b  # check

In [None]:
linalg.solve(A, b)  # fast

In [None]:
A.dot(linalg.solve(A, b)) - b  # check
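
For the three-equation system written out above, a minimal sketch looks like this. The second half illustrates the option for structured matrices through the `assume_a` keyword of `linalg.solve`; the symmetric positive-definite matrix `S` is an assumption built here only for the demonstration.

```python
import numpy as np
from scipy import linalg

A = np.array([[1, 3, 5], [2, 5, 1], [2, 3, 8]])
b = np.array([10, 8, 3])

x = linalg.solve(A, b)
print(x)
print(np.allclose(x, [-9.28, 5.16, 0.76]))   # agrees with the inverse-based result
print(np.allclose(A @ x, b))                 # residual check

# Hypothetical symmetric positive-definite system, for illustration only.
S = A.T @ A
x_pos = linalg.solve(S, b, assume_a="pos")   # tells solve that S is SPD
print(np.allclose(S @ x_pos, b))
```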

### Finding the determinant
The determinant of a square matrix $\mathbf{A}$ is often denoted $|\mathbf{A}|$ and is a quantity often used in linear algebra. Suppose $a_{i,j}$ are the elements of the matrix $\mathbf{A}$ and let $M_{i,j}=|\mathbf{A}_{i,j}|$ be the determinant of the matrix left by removing the $i^{th}$ row and $j^{th}$ column from $\mathbf{A}$. Then, for any row $i$,

$$|\mathbf{A}|=\sum_j(-1)^{i+j}a_{i,j}M_{i,j}.$$

This is a recursive way to define the determinant, where the base case is defined by accepting that the determinant of a $1\times1$ matrix is the only matrix element. In SciPy the determinant can be calculated with <font color=#0099ff>linalg.det</font>. For example, the determinant of

$$\mathbf{A}\ = \ \begin{bmatrix}
1 & 3 & 5\\
2 & 5 & 1\\
2 & 3 & 8
\end{bmatrix}
$$

is

$$\begin{align}
|\mathbf{A}|\ &= \ 1\begin{vmatrix}
5 & 1\\
3 & 8
\end{vmatrix}\ -3\begin{vmatrix}
2 & 1\\
2 & 8
\end{vmatrix}\ +5\begin{vmatrix}
2 & 5\\
2 & 3
\end{vmatrix}\\
&=\ 1\ (5\cdot8-3\cdot1)\ -\ 3\ (2\cdot8-2\cdot1)\ +\ 5\ (2\cdot3-2\cdot5)\ =-25.
\end{align}
$$

In [None]:
A = np.array([[1,2],[3,4]])
A

In [None]:
linalg.det(A)
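
The recursive definition can be turned into code almost literally. The helper `det_cofactor` below is a hypothetical function written for this sketch; it is far slower than `linalg.det` and is meant only to check the worked $3\times3$ example.

```python
import numpy as np
from scipy import linalg

def det_cofactor(M, i=0):
    """Determinant by cofactor expansion along row i (illustrative only)."""
    M = np.asarray(M, dtype=float)
    n = M.shape[0]
    if n == 1:
        return M[0, 0]                      # base case: 1x1 matrix
    total = 0.0
    for j in range(n):
        # Minor: remove row i and column j.
        minor = np.delete(np.delete(M, i, axis=0), j, axis=1)
        total += (-1) ** (i + j) * M[i, j] * det_cofactor(minor)
    return total

A = np.array([[1, 3, 5], [2, 5, 1], [2, 3, 8]])
print(det_cofactor(A))        # expansion along the first row
print(det_cofactor(A, i=2))   # any row gives the same value
print(linalg.det(A))
```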

### Computing norms
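Matrix and vector norms are computed with <font color=#0099ff>linalg.norm</font>, whose order argument `ord` selects the norm. For a vector $\mathbf{x}$, `ord` can be any real number, including `inf` or `-inf`. The computed norm is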

$$
\|\mathbf{x}\|\ =\ \begin{cases}
\max|x_i| & \text{ord }=\infty\\
\min|x_i| & \text{ord }=-\infty\\
\Big(\sum_i|x_i|^{\text{ord}}\Big)^{1/\text{ord}} & |\text{ord}|<\infty.
\end{cases}
$$

For a matrix $\mathbf{A}$, the valid values for `ord` include $\pm2,\pm1,\pm\infty$, and `'fro'`. Thus,

$$
\|\mathbf{A}\|\ =\ \begin{cases}
\max_i\sum_j|a_{i,j}| & \text{ord }=\infty\\
\min_i\sum_j|a_{i,j}| & \text{ord }=-\infty\\
\max_j\sum_i|a_{i,j}| & \text{ord }=1\\
\min_j\sum_i|a_{i,j}| & \text{ord }=-1\\
\max\sigma_i          & \text{ord }=2\\
\min\sigma_i          & \text{ord }=-2\\
\sqrt{\text{trace}(\mathbf{A}^H\mathbf{A})} & \text{ord }=\text{'fro'}
\end{cases}
$$

where $\sigma_i$ are the singular values of $\mathbf{A}$.

In [None]:
A=np.array([[1,2],[3,4]])
A

In [None]:
linalg.norm(A)

In [None]:
linalg.norm(A,'fro') # Frobenius norm is the default

In [None]:
linalg.norm(A,1) # L1 norm (max column sum)

In [None]:
linalg.norm(A,-1)

In [None]:
linalg.norm(A,np.inf) # L inf norm (max row sum)
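
To connect the formulas with the function calls, the sketch below evaluates each matrix norm twice, once with `linalg.norm` and once from row sums, column sums, or singular values. The dictionary layout and the test vector `x` are choices made for this example.

```python
import numpy as np
from scipy import linalg

A = np.array([[1, 2], [3, 4]])
abs_A = np.abs(A)
sigma = linalg.svdvals(A)     # singular values of A

# Each entry is (value from linalg.norm, value from the formula).
matrix_checks = {
    "inf":  (linalg.norm(A, np.inf),  abs_A.sum(axis=1).max()),   # max row sum
    "-inf": (linalg.norm(A, -np.inf), abs_A.sum(axis=1).min()),   # min row sum
    "1":    (linalg.norm(A, 1),       abs_A.sum(axis=0).max()),   # max column sum
    "-1":   (linalg.norm(A, -1),      abs_A.sum(axis=0).min()),   # min column sum
    "2":    (linalg.norm(A, 2),       sigma.max()),
    "-2":   (linalg.norm(A, -2),      sigma.min()),
    "fro":  (linalg.norm(A, "fro"),   np.sqrt(np.trace(A.conj().T @ A))),
}
for name, (computed, by_formula) in matrix_checks.items():
    print(name, computed, np.isclose(computed, by_formula))

# Vector norms for a small test vector.
x = np.array([3.0, -4.0])
print(linalg.norm(x), linalg.norm(x, 1), linalg.norm(x, np.inf), linalg.norm(x, -np.inf))
```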

### Solving linear least-squares problems and pseudo-inverses
Linear least-squares problems occur in many branches of applied mathematics. In this problem, a set of linear scaling coefficients is sought that allows a model to fit the data. In particular, it is assumed that data 
$y_i$  is related to data $\mathbf{x}_i$ through a set of coefficients $c_j$ and model functions $f_j(\mathbf{x}_i)$ via the model
$$y_i=\sum_jc_jf_j(\mathbf{x}_i)+\epsilon_i,$$

where $\epsilon_i$ represents uncertainty in the data. The strategy of least squares is to pick the coefficients 
$c_j$ to minimize
$$J(\mathbf{c})=\sum_i\bigg|y_i-\sum_jc_jf_j(x_i)\bigg|^2.$$

Theoretically, a global minimum will occur when

$$\frac{\partial J}{\partial c_n^*}=0=\sum_i\bigg(y_i-\sum_jc_jf_j\big(x_i\big)\bigg)\big(-f_n^*\big(x_i\big)\big)$$

or

$$\begin{align}
\sum_jc_j\sum_if_j(x_i)f_n^*(x_i)&=\sum_iy_if_n^*(x_i)\\
\mathbf{A}^H\mathbf{Ac}&=\mathbf{A}^H\mathbf{y},
\end{align}$$

where

$$\big\{\mathbf{A}\big\}_{i,j}=f_j\big(x_i\big).$$

When $\mathbf{A}^H\mathbf{A}$ is invertible, then

$$\mathbf{c}=\big(\mathbf{A}^H\mathbf{A}\big)^{-1}\mathbf{A}^H\mathbf{y}=\mathbf{A}^\dagger\mathbf{y},$$

where $\mathbf{A}^\dagger$ is called the pseudo-inverse of $\mathbf{A}$. Notice that using this definition of 
$\mathbf{A}$ the model can be written

$$\mathbf{y}=\mathbf{Ac}+\mathbf{\epsilon}.$$

The command <font color=#0099ff>linalg.lstsq</font> solves the linear least-squares problem for $\mathbf{c}$ given $\mathbf{A}$ and $\mathbf{y}$. In addition, <font color=#0099ff>linalg.pinv</font> finds $\mathbf{A}^\dagger$ given $\mathbf{A}$. Older SciPy releases also provided <font color=#0099ff>linalg.pinv2</font>, a variant based on singular value decomposition, which has been removed from recent releases.

The following example and figure demonstrate the use of <font color=#0099ff>linalg.lstsq</font> for solving a data-fitting problem. The data shown below were generated using the model:
$$y_i=c_1e^{-x_i}+c_2x_i,$$

where $x_i=0.1i$ for $i=1,\dots,10$, $c_1=5$, and $c_2=2$. Noise is added to $y_i$ and the coefficients $c_1$ and $c_2$ are estimated using linear least squares.

In [None]:
import matplotlib.pyplot as plt

c1, c2 = 5.0, 2.0
i = np.r_[1:11]
xi = 0.1*i
yi = c1*np.exp(-xi) + c2*xi
zi = yi + 0.05 * np.max(yi) * np.random.randn(len(yi))

A = np.c_[np.exp(-xi)[:, np.newaxis], xi[:, np.newaxis]]
c, resid, rank, sigma = linalg.lstsq(A, zi)

xi2 = np.r_[0.1:1.0:100j]
yi2 = c[0]*np.exp(-xi2) + c[1]*xi2

plt.plot(xi,zi,'x',xi2,yi2)
plt.axis([0,1.1,3.0,5.5])
plt.xlabel('$x_i$')
plt.title('Data fitting with linalg.lstsq')
plt.show()
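
The three routes to $\mathbf{c}$ described in this section (`linalg.lstsq`, the pseudo-inverse, and the normal equations) should agree when $\mathbf{A}$ has full column rank. The sketch below checks this on data from the same model; the fixed random seed and the noise level are assumptions made so the run is repeatable.

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)              # fixed seed, chosen for repeatability
xi = 0.1 * np.arange(1, 11)
A = np.column_stack([np.exp(-xi), xi])      # columns are f_1(x_i), f_2(x_i)
zi = 5.0 * np.exp(-xi) + 2.0 * xi + 0.05 * rng.standard_normal(xi.size)

c_lstsq, resid, rank, sigma = linalg.lstsq(A, zi)
c_pinv = linalg.pinv(A) @ zi                     # c = A^+ y
c_normal = linalg.solve(A.T @ A, A.T @ zi)       # solve A^H A c = A^H y

print(c_lstsq)
print(np.allclose(c_lstsq, c_pinv), np.allclose(c_lstsq, c_normal))
```

Forming $\mathbf{A}^H\mathbf{A}$ explicitly squares the condition number, so the normal-equation route is shown here only to mirror the derivation.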

## Decompositions

### Eigenvalues and eigenvectors
The eigenvalue-eigenvector problem is one of the most commonly employed linear algebra operations. In one popular form, the eigenvalue-eigenvector problem is to find for some square matrix $\mathbf{A}$ scalars $\lambda$ and corresponding vectors $\mathbf{v}$, such that

$$\mathbf{Av}=\lambda\mathbf{v}.$$

For an $N\times N$ matrix, there are $N$ (not necessarily distinct) eigenvalues, which are the roots of the characteristic polynomial

$$|\mathbf{A}-\lambda\mathbf{I}|=0.$$

The eigenvectors, $\mathbf{v}$, are also sometimes called right eigenvectors to distinguish them from another set of left eigenvectors that satisfy

$$\mathbf{v}_L^H\mathbf{A}=\lambda\mathbf{v}_L^H$$

or

$$\mathbf{A}^H\mathbf{v}_L=\lambda^*\mathbf{v}_L.$$

With its default optional arguments, the command <font color=#0099ff>linalg.eig</font> returns $\lambda$ and $\mathbf{v}$. Through its optional arguments it can also return $\mathbf{v}_L$, or only $\lambda$ (<font color=#0099ff>linalg.eigvals</font> also returns only $\lambda$).

In addition, <font color=#0099ff>linalg.eig</font> can also solve the more general eigenvalue problem

$$\begin{align}
\mathbf{Av}&=\lambda\mathbf{Bv}\\
\mathbf{A}^H\mathbf{v}_L&=\lambda^*\mathbf{B}^H\mathbf{v}_L
\end{align}$$

for square matrices $\mathbf{A}$ and $\mathbf{B}$. The standard eigenvalue problem is the special case of the general eigenvalue problem with $\mathbf{B}=\mathbf{I}$. When a generalized eigenvalue problem can be solved, it provides a decomposition of $\mathbf{A}$ as

$$\mathbf{A}=\mathbf{BV\Lambda V}^{-1},$$

where $\mathbf{V}$ is the collection of eigenvectors into columns and $\Lambda$ is a diagonal matrix of eigenvalues.
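
A minimal sketch of the generalized problem follows. The matrices `A` and `B` are small examples chosen for this illustration, with `B` invertible so that all eigenvalues are finite.

```python
import numpy as np
from scipy import linalg

# Example matrices chosen for illustration only.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[2.0, 0.0], [0.0, 1.0]])

w, V = linalg.eig(A, B)      # solves A v = lambda B v
Lam = np.diag(w)             # diagonal matrix of eigenvalues

print(w)
print(np.allclose(A @ V, B @ V @ Lam))               # A V = B V Lambda
print(np.allclose(A, B @ V @ Lam @ linalg.inv(V)))   # A = B V Lambda V^{-1}
```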

Eigenvectors are defined only up to a constant scale factor. In SciPy, the scaling factor for the eigenvectors is chosen so that $\|\mathbf{v}\|^2=\sum_i|v_i|^2=1$.

As an example, consider finding the eigenvalues and eigenvectors of the matrix

$$
A = \begin{bmatrix}
1 & 5 & 2\\
2 & 4 & 1\\
3 & 6 & 2
\end{bmatrix}.
$$

The characteristic polynomial is

$$\begin{align}
\big|\mathbf{A}-\lambda\mathbf{I}\big|&=\big(1-\lambda\big)\big[\big(4-\lambda\big)\big(2-\lambda\big)-6\big]-\\
                              &\quad5\big[2\big(2-\lambda\big)-3\big]+2\big[12-3\big(4-\lambda\big)\big]\\
                              &=-\lambda^3+7\lambda^2+8\lambda-3.
\end{align}$$

The roots of this polynomial are the eigenvalues of $\mathbf{A}$:

$$\begin{align}
\lambda_1 &= 7.9579\\
\lambda_2 &= -1.2577\\
\lambda_3 &= 0.2997.
\end{align}$$

In [None]:
A = np.array([[1, 2], [3, 4]])
la, v = linalg.eig(A)
l1, l2 = la
print(l1, l2)   # eigenvalues

In [None]:
print(v[:, 0])   # first eigenvector

In [None]:
print(v[:, 1])   # second eigenvector

In [None]:
print(np.sum(abs(v**2), axis=0))  # each eigenvector has unit norm

In [None]:
v1 = v[:, 0]  # first eigenvector as a 1-D array
print(linalg.norm(A.dot(v1) - l1*v1))  # check the computation
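
The $3\times3$ matrix from the worked example can be checked in the same way. This sketch compares the output of `linalg.eig` with the roots of the characteristic polynomial found with `np.roots`, and also requests the left eigenvectors through the `left=True` argument.

```python
import numpy as np
from scipy import linalg

A = np.array([[1, 5, 2], [2, 4, 1], [3, 6, 2]])

# Eigenvalues, left eigenvectors (columns of vl), right eigenvectors (columns of vr).
w, vl, vr = linalg.eig(A, left=True, right=True)

# Roots of -lambda^3 + 7 lambda^2 + 8 lambda - 3 from the text.
roots = np.roots([-1, 7, 8, -3])

print(np.sort(w.real))
print(np.allclose(np.sort(w.real), np.sort(roots.real)))
print(np.allclose(A @ vr, vr * w))                            # A v = lambda v
print(np.allclose(vl.conj().T @ A, w[:, None] * vl.conj().T)) # v_L^H A = lambda v_L^H
print(linalg.norm(vr, axis=0))                                # unit-norm columns
```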

### Singular value decomposition
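Singular value decomposition (SVD) can be thought of as an extension of the eigenvalue problem to matrices that are not square. It factors an $M\times N$ matrix $\mathbf{A}$ as $\mathbf{A}=\mathbf{U\Sigma V}^H$, where $\mathbf{U}$ and $\mathbf{V}$ are unitary and $\mathbf{\Sigma}$ is an $M\times N$ matrix with the singular values on its main diagonal and zeros elsewhere. The command <font color=#0099ff>linalg.svd</font> returns $\mathbf{U}$, the singular values as a 1-D array, and $\mathbf{V}^H$; <font color=#0099ff>linalg.diagsvd</font> builds $\mathbf{\Sigma}$ from that array.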

In [None]:
A = np.array([[1,2,3],[4,5,6]])
A

In [None]:
M, N = A.shape
U, s, Vh = linalg.svd(A)       # s holds the singular values as a 1-D array
Sig = linalg.diagsvd(s, M, N)  # embed s in an M x N matrix

In [None]:
U

In [None]:
Sig

In [None]:
Vh

In [None]:
U.dot(Sig.dot(Vh)) #check computation
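
One way to see the link to the eigenvalue problem is that the squared singular values of $\mathbf{A}$ are the eigenvalues of $\mathbf{AA}^H$. The sketch below checks this for the same $2\times3$ matrix, along with the orthonormality of the returned factors; `linalg.eigvalsh` is used because $\mathbf{AA}^H$ is symmetric.

```python
import numpy as np
from scipy import linalg

A = np.array([[1, 2, 3], [4, 5, 6]])
U, s, Vh = linalg.svd(A)

# Eigenvalues of the symmetric matrix A A^H, returned in ascending order.
evals = linalg.eigvalsh(A @ A.T)

print(s**2, evals)
print(np.allclose(np.sort(s**2), evals))     # sigma_i^2 are eigenvalues of A A^H
print(np.allclose(U.T @ U, np.eye(2)))       # columns of U are orthonormal
print(np.allclose(Vh @ Vh.T, np.eye(3)))     # rows of Vh are orthonormal
```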

### LU decomposition
The LU decomposition finds a representation for the $M\times N$ matrix $\mathbf{A}$ as

$$\mathbf{A}=\mathbf{PLU},$$

where $\mathbf{P}$ is an $M\times M$ permutation matrix (a permutation of the rows of the identity matrix), $\mathbf{L}$ is an $M\times K$ lower triangular or trapezoidal matrix ($K=\min\big(M,N\big)$) with unit diagonal, and $\mathbf{U}$ is an upper triangular or trapezoidal matrix. The SciPy command for this decomposition is <font color=#0099ff>linalg.lu</font>.

Such a decomposition is often useful for solving many simultaneous equations where the left-hand side does not change but the right-hand side does. For example, suppose we are going to solve

$$\mathbf{Ax}_i=\mathbf{b}_i$$

for many different $\mathbf{b}_i$. The LU decomposition allows this to be written as

$$\mathbf{PLUx}_i=\mathbf{b}_i.$$

Because $\mathbf{L}$ is lower-triangular, the equation can be solved for $\mathbf{Ux}_i$ and, finally, $\mathbf{x}_i$ very rapidly using forward- and back-substitution. An initial time spent factoring $\mathbf{A}$ allows for very rapid solution of similar systems of equations in the future.
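
The idea of factoring once and solving many times can be sketched as follows. The example uses `linalg.lu` to show the factors, `linalg.lu_factor` with `linalg.lu_solve` for repeated solves, and `linalg.solve_triangular` to spell out the forward- and back-substitution steps. The right-hand sides are arbitrary vectors chosen for this illustration.

```python
import numpy as np
from scipy import linalg

A = np.array([[1, 3, 5], [2, 5, 1], [2, 3, 8]], dtype=float)

# Explicit factors: A = P L U.
P, L, U = linalg.lu(A)
print(np.allclose(A, P @ L @ U))

# Factor once, then reuse the factorization for several right-hand sides.
lu_piv = linalg.lu_factor(A)
rhs = [np.array([10.0, 8.0, 3.0]), np.array([1.0, 0.0, 0.0])]  # example vectors
xs = [linalg.lu_solve(lu_piv, b) for b in rhs]
for b, x in zip(rhs, xs):
    print(x, np.allclose(A @ x, b))

# The same solve written out by hand for the first right-hand side:
# L (U x) = P^T b, solved by forward- then back-substitution.
y = linalg.solve_triangular(L, P.T @ rhs[0], lower=True, unit_diagonal=True)
x_manual = linalg.solve_triangular(U, y)
print(np.allclose(x_manual, xs[0]))
```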