# 2.2 Transpose, Inverse and Trace of matrices

In this section, we continue studying basic operations on matrices, focusing on the transpose of a matrix and on the inverse and trace of a square matrix.


## Transpose of a matrix

Given an $m\times n$ matrix $A$, the transpose of $A$ is the $n\times m$ matrix denoted by $A^T$, whose columns are the corresponding rows of $A$.

__Example 1:__

If $M = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}$, then $M^T = \begin{bmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{bmatrix}$.

In Python, we can use either `ndarray.T` or `ndarray.transpose()` to find the transpose of a matrix stored as a NumPy array.

In [2]:
import numpy as np

M = np.array([[1,2],[3, 4],[5,6]])

# Compute the transpose of M
M_T = M.T

print("M = ", '\n', M, '\n\n', "M_T = ",'\n', M_T)

M =  
 [[1 2]
 [3 4]
 [5 6]] 

 M_T =  
 [[1 3 5]
 [2 4 6]]


In [3]:
# Compute the transpose of M
M_transpose = M.transpose()

print("M = ", '\n', M, '\n\n', "M_T = ",'\n', M_transpose)

M =  
 [[1 2]
 [3 4]
 [5 6]] 

 M_T =  
 [[1 3 5]
 [2 4 6]]
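
The definition can also be read entry by entry: the $(i, j)$ entry of $M^T$ is the $(j, i)$ entry of $M$. The following short check is only a sketch using the matrix from Example 1; the variable name `entries_match` is introduced here for illustration.

```python
import numpy as np

M = np.array([[1, 2], [3, 4], [5, 6]])
M_T = M.T

# A 3x2 matrix becomes a 2x3 matrix
print(M.shape, M_T.shape)

# Entry (i, j) of the transpose equals entry (j, i) of the original
entries_match = all(
    M_T[i, j] == M[j, i]
    for i in range(M_T.shape[0])
    for j in range(M_T.shape[1])
)
print(entries_match)

# Transposing twice gives back the original matrix
print(np.array_equal(M_T.T, M))
```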


Note that `transpose` and `T` transpose a 2D array (built from a list of lists) but leave a 1D array (built from a plain list) unchanged:

In [4]:
# transpose of a list of lists

np.array([[1,2,3]]).transpose()

array([[1],
       [2],
       [3]])

In [5]:
# has no effect on a 1D array
np.array([1,2,3]).transpose()

array([1, 2, 3])
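
If a 1D array is meant to represent a row or column vector, one possible workaround is to reshape it into a 2D array first. The sketch below uses `reshape`; the names `v`, `row`, and `col` are chosen here for illustration.

```python
import numpy as np

v = np.array([1, 2, 3])   # 1D array, shape (3,)

row = v.reshape(1, -1)    # 2D row vector, shape (1, 3)
col = v.reshape(-1, 1)    # 2D column vector, shape (3, 1)

# Transposing the 1D array leaves its shape unchanged
print(v.T.shape)

# Transposing the 2D row vector gives the column vector
print(row.T.shape)
print(np.array_equal(row.T, col))
```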

## Inverse of a square matrix


An $n\times n$ matrix $A$ is called invertible if there exists an $n\times n$ matrix $B$ such that $AB = BA = I_n$. In this case, $B$ is referred to as the inverse of $A$, denoted as $A^{-1}$. It is important to note that not all matrices are invertible.

__Theorem 3:__

Let $A= \begin{bmatrix} a & b \\ c & d\\ \end{bmatrix}$. If $ad-bc \neq 0$, then $A$ is invertible, and its inverse is given by:

$$
A^{-1}= \frac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a\\ \end{bmatrix}
$$

$ad-bc$ is called the __determinant__ of matrix A. The determinant plays a crucial role in determining the invertibility of a square matrix. If the determinant is zero, i.e., $ad-bc = 0$, then matrix A is not invertible. Determinants of matrices will be discussed in the next section, where we will explore their properties and computation methods.

__Example 6:__

1. Find the inverse of $M= \begin{bmatrix} 1 & 2 \\ 3 & 4\\ \end{bmatrix}$.
2. Find the inverse of $N= \begin{bmatrix} 1 & 2 \\ 2 & 4\\ \end{bmatrix}$.

__Solution:__

Let's write a Python function that computes the inverse of a $2\times 2$ matrix using the formula above:

In [6]:
M = np.array([[1, 2], [3, 4]])
N = np.array([[1, 2], [2, 4]])

# Compute the inverse of a 2x2 matrix using the formula above
def inverse_matrix(matrix):
    a, b = matrix[0]
    c, d = matrix[1]

    # determinant ad - bc
    det = a * d - b * c

    if det == 0:
        # no inverse exists; report it (the function then returns None)
        print(matrix, "is not invertible")
    else:
        D = 1 / det
        inverse = np.zeros((2, 2))
        inverse[0, 0] = D * d
        inverse[0, 1] = -D * b
        inverse[1, 0] = -D * c
        inverse[1, 1] = D * a

        return inverse

In [7]:
inverse_matrix(M)

array([[-2. ,  1. ],
       [ 1.5, -0.5]])

In [8]:
inverse_matrix(N)

[[1 2]
 [2 4]] is not invertible
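
As a sanity check, the computed inverse of $M$ should satisfy the definition $MM^{-1} = M^{-1}M = I_2$. The sketch below copies the inverse from the output above into a variable named `M_inv` (a name introduced here) and compares the products with the identity using `np.allclose`, which tolerates floating-point rounding.

```python
import numpy as np

M = np.array([[1, 2], [3, 4]])

# Inverse of M, copied from the output of inverse_matrix(M)
M_inv = np.array([[-2.0, 1.0], [1.5, -0.5]])

I2 = np.eye(2)

# Both products should equal the 2x2 identity matrix
print(np.allclose(M @ M_inv, I2))
print(np.allclose(M_inv @ M, I2))
```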


__Theorem 3__ (Properties of the inverse matrix)

Let $A$ and $B$ be two invertible $n\times n$ matrices.

1. $(A^{-1})^{-1} = A$
2. $(AB)^{-1} = B^{-1}A^{-1}$
3. $(A^{T})^{-1} = (A^{-1})^{T}$
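
These properties can be checked numerically on a concrete pair of matrices. The sketch below is an illustration, not a proof: the matrices `A` and `B` are arbitrary invertible examples chosen here, and `np.linalg.inv` is NumPy's built-in inverse routine. The last line shows that the order of the factors in property 2 matters.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[2.0, 1.0], [1.0, 1.0]])
inv = np.linalg.inv

# 1. The inverse of the inverse is the original matrix
prop1 = np.allclose(inv(inv(A)), A)

# 2. The inverse of a product reverses the order of the factors
prop2 = np.allclose(inv(A @ B), inv(B) @ inv(A))

# 3. Inverse and transpose can be taken in either order
prop3 = np.allclose(inv(A.T), inv(A).T)

print(prop1, prop2, prop3)

# Keeping the original order fails for this pair of matrices
wrong_order = np.allclose(inv(A @ B), inv(A) @ inv(B))
print(wrong_order)
```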


### Computing the inverse of a matrix

__Theorem 4__

An $n\times n$ matrix $A$ is invertible if and only if it is row equivalent to the identity matrix $I_n$. 


We can use this theorem to find the inverse of a matrix using the row reduction method: set up the augmented matrix $[A|I_n]$ and perform row operations until the left side of the augmented matrix becomes the identity matrix $I_n$. If this is achieved, then the right side of the augmented matrix will be the inverse of $A$. Otherwise, if the left side does not become $I_n$, then the matrix $A$ is not invertible.


__Example 7__ 

Find the inverse of $M= \begin{bmatrix} 0 & 1 & 2\\ 1 & 0 & 3\\ 4 & -3 & 8 \end{bmatrix}$.

__Solution__ Let's set up the augmented matrix and find its RREF:


$$[M|I_3]= \begin{bmatrix} 0 & 1 & 2 && 1 & 0 & 0\\ 1 & 0 & 3 &&  0 & 1 & 0 \\ 4 & -3 & 8 && 0 & 0 & 1\end{bmatrix}$$

In [9]:
# matrix M

M = np.array([[0,1,2], [1,0,3], [4, -3, 8]])
M

array([[ 0,  1,  2],
       [ 1,  0,  3],
       [ 4, -3,  8]])

In [10]:
# identity matrix I_3

I3 = np.eye(3)
I3

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [11]:
# Augmented matrix [M | I_3], stored in A

A = np.concatenate((M, I3), axis = 1)
A

array([[ 0.,  1.,  2.,  1.,  0.,  0.],
       [ 1.,  0.,  3.,  0.,  1.,  0.],
       [ 4., -3.,  8.,  0.,  0.,  1.]])

Now let's define the elementary row operations and apply them to the augmented matrix:

In [12]:
# Elementary row operations:

# Swap two rows

def swap(matrix, row1, row2):
    
    copy_matrix=np.copy(matrix).astype('float64') 
  
    copy_matrix[row1,:] = matrix[row2,:]
    copy_matrix[row2,:] = matrix[row1,:]
    
    return copy_matrix


# Multiply all entries in a row by a nonzero number


def scale(matrix, row, scalar):
    copy_matrix=np.copy(matrix).astype('float64') 
    copy_matrix[row,:] = scalar*matrix[row,:]  
    return copy_matrix

# Replace a row by the sum of itself and a multiple of another row

def replace(matrix, row1, row2, scalar):
    copy_matrix=np.copy(matrix).astype('float64')
    copy_matrix[row1] = matrix[row1]+ scalar * matrix[row2] 
    return copy_matrix

In [13]:
A_1 = swap(A, 0, 1)
A_1

array([[ 1.,  0.,  3.,  0.,  1.,  0.],
       [ 0.,  1.,  2.,  1.,  0.,  0.],
       [ 4., -3.,  8.,  0.,  0.,  1.]])

In [14]:
A_2 = replace(A_1, 2, 0, -4)
A_2

array([[ 1.,  0.,  3.,  0.,  1.,  0.],
       [ 0.,  1.,  2.,  1.,  0.,  0.],
       [ 0., -3., -4.,  0., -4.,  1.]])

In [15]:
A_3 = replace(A_2, 2, 1, 3)
A_3

array([[ 1.,  0.,  3.,  0.,  1.,  0.],
       [ 0.,  1.,  2.,  1.,  0.,  0.],
       [ 0.,  0.,  2.,  3., -4.,  1.]])

In [16]:
A_4 = scale(A_3, 2, 1/2)
A_4

array([[ 1. ,  0. ,  3. ,  0. ,  1. ,  0. ],
       [ 0. ,  1. ,  2. ,  1. ,  0. ,  0. ],
       [ 0. ,  0. ,  1. ,  1.5, -2. ,  0.5]])

In [17]:
A_5 = replace(A_4, 1, 2, -2)
A_5

array([[ 1. ,  0. ,  3. ,  0. ,  1. ,  0. ],
       [ 0. ,  1. ,  0. , -2. ,  4. , -1. ],
       [ 0. ,  0. ,  1. ,  1.5, -2. ,  0.5]])

In [18]:
A_6 = replace(A_5, 0, 2, -3)
A_6

array([[ 1. ,  0. ,  0. , -4.5,  7. , -1.5],
       [ 0. ,  1. ,  0. , -2. ,  4. , -1. ],
       [ 0. ,  0. ,  1. ,  1.5, -2. ,  0.5]])

We can see that the resulting matrix has the form $[I_3|B]$. Thus, $B= \begin{bmatrix} -4.5 & 7 & -1.5\\ -2 &  4 & -1 \\ 1.5 & -2 & 0.5 \\ \end{bmatrix}$ is the inverse of $M$.

__Remark__ We can also use the `np.linalg.inv()` function, from NumPy's linear algebra submodule, to find the inverse of a matrix:

In [19]:
np.linalg.inv(M)

array([[-4.5,  7. , -1.5],
       [-2. ,  4. , -1. ],
       [ 1.5, -2. ,  0.5]])
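
The row reduction carried out by hand in Example 7 can also be automated. The following is a minimal sketch, not a replacement for `np.linalg.inv`: the function name `gauss_jordan_inverse` and the tolerance `tol` are assumptions introduced here. It picks the largest available entry in each column as the pivot, so the sequence of row operations may differ from the one used above, but the result should be the same.

```python
import numpy as np

def gauss_jordan_inverse(matrix, tol=1e-12):
    """Row-reduce [A | I_n]; return the inverse, or None if A is not invertible."""
    A = np.array(matrix, dtype=float)
    n = A.shape[0]
    aug = np.concatenate((A, np.eye(n)), axis=1)

    for col in range(n):
        # Choose the row at or below the diagonal with the largest entry in this column
        pivot = col + np.argmax(np.abs(aug[col:, col]))
        if abs(aug[pivot, col]) < tol:
            return None  # no pivot available: A is not row equivalent to I_n

        # Swap the pivot row into position
        aug[[col, pivot]] = aug[[pivot, col]]

        # Scale the pivot row so that the pivot equals 1
        aug[col] = aug[col] / aug[col, col]

        # Replace every other row to clear the rest of the column
        for row in range(n):
            if row != col:
                aug[row] = aug[row] - aug[row, col] * aug[col]

    # The left block is now I_n, so the right block is the inverse
    return aug[:, n:]

M = np.array([[0, 1, 2], [1, 0, 3], [4, -3, 8]])
M_inv = gauss_jordan_inverse(M)
print(M_inv)
print(np.allclose(M_inv, np.linalg.inv(M)))

# A matrix that is not invertible (matrix N from Example 6)
N = np.array([[1, 2], [2, 4]])
print(gauss_jordan_inverse(N))
```

Comparing against a small tolerance instead of testing for exact zero is a design choice made here to cope with floating-point rounding.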

__Theorem 4__ (The Invertible Matrix Theorem)

Let $A$ be an $n\times n$ matrix. The following statements are equivalent:

1. $A$ is an invertible matrix.
2. $A$ is row equivalent to $I_n$.
3. The equation $A\vec{x}=0$ has only the trivial solution ($\vec{x} = \mathbf{0}$).
4. The columns of $A$ form a linearly independent set.
5. For every $\vec{b} \in \mathbb{R}^n$, the equation $A\vec{x}=\vec{b}$ has a unique solution ($\vec{x} = A^{-1}\vec{b}$).
6. The columns of $A$ span $\mathbb{R}^n$.
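
One way to test statement 4 numerically is through the rank of a matrix, that is, the number of linearly independent columns. Rank is not defined in this section, so treat it as an outside notion here. The sketch below uses `np.linalg.matrix_rank` on the matrices from Examples 6 and 7; the names `rank_M` and `rank_N` are introduced for illustration. An $n\times n$ matrix has linearly independent columns exactly when its rank is $n$.

```python
import numpy as np

M = np.array([[0, 1, 2], [1, 0, 3], [4, -3, 8]])  # invertible (Example 7)
N = np.array([[1, 2], [2, 4]])                     # not invertible (Example 6)

rank_M = np.linalg.matrix_rank(M)
rank_N = np.linalg.matrix_rank(N)

# Full rank (rank equal to n) means the columns are linearly independent
print("M: rank", rank_M, "full rank:", rank_M == M.shape[0])
print("N: rank", rank_N, "full rank:", rank_N == N.shape[0])
```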

__Example 8__

Find the solution set of the following linear systems:

1. $M\vec{x}=0$, where $M$ is the matrix from Example 7.

__Solution:__ Since $M$ is invertible, by the Invertible Matrix Theorem part (3), the only solution to $M\vec{x}=0$ is the zero vector $\vec{x}=\mathbf{0}$.

2. $M\vec{x}=\vec{b}$, where $M$ is the matrix from Example 7 and $\vec{b}=\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$.

__Solution:__ By IMT part (5), the only solution to $M\vec{x}=\vec{b}$ is $\vec{x}=M^{-1}\vec{b}$. In Example 7, we computed the inverse of $M$:

$M^{-1} = \begin{bmatrix} -4.5 & 7 & -1.5 \\ -2 & 4 & -1 \\ 1.5 & -2 & 0.5 \end{bmatrix}$

In the following cell, we use $B$ to represent $M^{-1}$ to simplify notation:

In [21]:
B = np.array([[-4.5, 7, -1.5], [-2, 4, -1], [1.5, -2, 0.5]])

b = np.array ([[1], [2], [3]])

#compute Bb

x = B @ b
x

array([[ 5.],
       [ 3.],
       [-1.]])

In [22]:
M = np.array([[0,1,2], [1,0,3], [4,-3,8]])

#compute Mx

Mx= M @ x
Mx

array([[1.],
       [2.],
       [3.]])

which is $\vec{b}$.
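
When only the solution of $M\vec{x}=\vec{b}$ is needed, NumPy's `np.linalg.solve` can be used directly, without forming $M^{-1}$ first. The sketch below repeats part 2 of Example 8 this way and should give the same solution as above.

```python
import numpy as np

M = np.array([[0, 1, 2], [1, 0, 3], [4, -3, 8]])
b = np.array([[1], [2], [3]])

# Solve M x = b without computing the inverse explicitly
x = np.linalg.solve(M, b)
print(x)

# Check that the solution satisfies the system
print(np.allclose(M @ x, b))
```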

## Trace of Square Matrices

The trace of an $n\times n$ matrix $A$, denoted by $Tr(A)$ or $tr(A)$, is defined as the sum of the elements on its main diagonal:

$$
Tr(A) = a_{11} + a_{22} + \dots + a_{nn} = \sum^{n}_{i=1} a_{ii}
$$

__Theorem 5__ (properties of the trace)

Let $A$ and $B$ be two $n\times n$ matrices, and $c \in \mathbb{R}$. Then we have the following:

1. __Trace of sum:__ $Tr(A + B) = Tr(A) + Tr(B)$

2. __Trace of scalar product:__ $Tr(cA) = c \cdot Tr(A)$

3. __Trace of identity matrix:__ $Tr(I_n) = n$

4. __Trace of product:__ $Tr(AB) = Tr(BA)$

The trace can also be seen as a function from the set of $n\times n$ matrices to real numbers. It can be shown that there is only one function that satisfies the above conditions.
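
The four properties in Theorem 5 can be checked numerically on random matrices. This is an illustrative sketch rather than a proof: the size `n`, the scalar `c`, and the seed passed to `np.random.default_rng` are arbitrary choices made here, and `np.isclose` is used to allow for floating-point rounding.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is reproducible
n = 4
A = rng.random((n, n))
B = rng.random((n, n))
c = 2.5

# 1. Trace of sum
sum_ok = np.isclose(np.trace(A + B), np.trace(A) + np.trace(B))

# 2. Trace of scalar product
scalar_ok = np.isclose(np.trace(c * A), c * np.trace(A))

# 3. Trace of identity matrix
identity_ok = np.isclose(np.trace(np.eye(n)), n)

# 4. Trace of product
product_ok = np.isclose(np.trace(A @ B), np.trace(B @ A))

print(sum_ok, scalar_ok, identity_ok, product_ok)
```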

__Example 9__

Let $M = \begin{bmatrix} 0 & 1 & 2\\ 1 & 0 & 3\\ 4 & -3 & 8 \end{bmatrix}$. Then:

$$
Tr(M) = 0 + 0 + 8 = 8.
$$

For large matrices, we can use `numpy.trace()` to compute the trace. The following code generates a random $5\times 5$ matrix and computes its trace:

In [23]:
# generate a 5x5 matrix
A = np.random.rand(5, 5)
print(A)


#compute the trace
Tr_A = np.trace(A)
print('\n Tr(A) = ', Tr_A)

[[0.01885491 0.94056143 0.25729183 0.32348535 0.31284015]
 [0.63032471 0.75771773 0.17459726 0.05278947 0.58316163]
 [0.74847121 0.18847521 0.28978758 0.20410923 0.22665367]
 [0.20345886 0.23354335 0.50383353 0.78272183 0.77366004]
 [0.33258835 0.02049531 0.97957254 0.09030899 0.78366617]]

 Tr(A) =  2.6327482252992613
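
To connect `np.trace` back to the definition, the sketch below computes the trace of the matrix $M$ from Example 9 in three ways: by summing the diagonal entries in a loop, with `np.trace`, and by summing the diagonal extracted with `np.diag`. The variable names are chosen here for illustration.

```python
import numpy as np

M = np.array([[0, 1, 2], [1, 0, 3], [4, -3, 8]])

# Sum of the diagonal entries, following the definition of the trace
diag_sum = sum(M[i, i] for i in range(M.shape[0]))

# Built-in trace, and the sum of the extracted diagonal
trace_np = np.trace(M)
trace_diag = np.diag(M).sum()

print(diag_sum, trace_np, trace_diag)
```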


## Exercises

1. Suppose $A= \begin{bmatrix}
1 & 2 \\
0 & -1 \\
3 & 0
\end{bmatrix}$

    (a) Compute $A^T$.
    
    (b) Is $AA^T$ invertible? 
    
    (c) Is $A^TA$ invertible?
    
    

2. The trace is invariant under cyclic permutations, i.e., for three matrices we have $Tr(ABC) = Tr(BCA) = Tr(CAB)$, where $A$ is an $n\times p$ matrix, $B$ is a $p\times m$ matrix, and $C$ is an $m \times n$ matrix. Using `np.random.rand` as in the code following Example 9, generate three matrices $A$, $B$, and $C$ such that the product $ABC$ is defined, and verify that:

$$Tr(ABC) = Tr(BCA) = Tr(CAB)$$
    