In [None]:
'''
 * Copyright (c) 2016 Radhamadhab Dalai
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
'''

As a memory aid for the product terms in Sarrus’ rule, try tracing the elements of each triple product through the matrix.

We call a square matrix $T$ an **upper-triangular matrix** if $T_{ij} = 0$ for $i > j$, i.e., the matrix is zero below its diagonal. Analogously, we define a **lower-triangular matrix** as a matrix with zeros above its diagonal. For a triangular matrix $T \in \mathbb{R}^{n \times n}$, the determinant is the product of the diagonal matrix elements, i.e.,
$$\det(T) = \prod_{i=1}^{n} T_{ii} \quad (4.8)$$

**Example 4.2 (Determinants as Measures of Volume)**
The notion of a determinant is natural when we consider it as a mapping from a set of $n$ vectors spanning an object in $\mathbb{R}^n$. It turns out that the determinant $\det(A)$ is the **signed volume of an n-dimensional parallelepiped** formed by the columns of the matrix $A$.

For $n=2$, the columns of the matrix form a parallelogram; see Figure 4.2. As the angle between vectors gets smaller, the area of a parallelogram shrinks, too. Consider two vectors $\mathbf{b}, \mathbf{g}$ that form the columns of a matrix $A = [\mathbf{b}, \mathbf{g}]$. Then, the absolute value of the determinant of $A$ is the area of the parallelogram with vertices $\mathbf{0}, \mathbf{b}, \mathbf{g}, \mathbf{b} + \mathbf{g}$. In particular, if $\mathbf{b}, \mathbf{g}$ are linearly dependent so that $\mathbf{b} = \lambda\mathbf{g}$ for some $\lambda \in \mathbb{R}$, they no longer form a two-dimensional parallelogram. Therefore, the corresponding area is $0$. On the contrary, if $\mathbf{b}, \mathbf{g}$ are linearly independent and are multiples of the canonical basis vectors $\mathbf{e}_1, \mathbf{e}_2$ then they can be written as $\mathbf{b} = \begin{pmatrix} b \\ 0 \end{pmatrix}$ and $\mathbf{g} = \begin{pmatrix} 0 \\ g \end{pmatrix}$, and the determinant is $\begin{vmatrix} b & 0 \\ 0 & g \end{vmatrix} = bg - 0 = bg$. This becomes the familiar formula: area = height $\times$ length.

The sign of the determinant indicates the orientation of the spanning vectors $\mathbf{b}, \mathbf{g}$ with respect to the standard basis ($\mathbf{e}_1, \mathbf{e}_2$). In our figure, flipping the order to $\mathbf{g}, \mathbf{b}$ swaps the columns of $A$ and reverses the orientation of the shaded area.
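The three observations above (area as $|\det|$, zero area for dependent vectors, sign flip on swapping columns) can be checked with a minimal sketch. The helper name `det2` and the sample vectors are illustrative assumptions, not part of the original text.

```python
def det2(b, g):
    """Determinant of the 2x2 matrix whose columns are b and g."""
    return b[0] * g[1] - g[0] * b[1]

b, g = [3, 1], [1, 2]                     # sample vectors (assumed for illustration)
area = abs(det2(b, g))                    # unsigned area of the parallelogram
swapped = det2(g, b)                      # swapping the columns flips the sign
dependent = det2(b, [2 * x for x in b])   # g = 2b collapses the parallelogram

print(det2(b, g), swapped, area, dependent)
```

The sign carries the orientation, while the absolute value carries the area.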

This intuition extends to higher dimensions. In $\mathbb{R}^3$, we consider three vectors $\mathbf{r}, \mathbf{b}, \mathbf{g} \in \mathbb{R}^3$ spanning the edges of a parallelepiped, i.e., a solid with faces that are parallel parallelograms (see Figure 4.3). The absolute value of the determinant of the $3 \times 3$ matrix $[\mathbf{r}, \mathbf{b}, \mathbf{g}]$ is the volume of the solid. Thus, the determinant acts as a function that measures the signed volume formed by column vectors composed in a matrix.

Consider the three linearly independent vectors $\mathbf{r}, \mathbf{g}, \mathbf{b} \in \mathbb{R}^3$ given as
$$ \mathbf{r} = \begin{pmatrix} 2 \\ 0 \\ -8 \end{pmatrix}, \quad \mathbf{g} = \begin{pmatrix} 6 \\ 1 \\ 0 \end{pmatrix}, \quad \mathbf{b} = \begin{pmatrix} 1 \\ 4 \\ -1 \end{pmatrix} \quad (4.9) $$

![image.png](attachment:image.png)

**Figure 4.2** The area of the parallelogram (shaded region) spanned by the vectors $\mathbf{b}$ and $\mathbf{g}$ is $|\det([\mathbf{b}, \mathbf{g}])|$.

![image-2.png](attachment:image-2.png)

**Figure 4.3** The volume of the parallelepiped (shaded volume) spanned by the vectors $\mathbf{r}, \mathbf{b}, \mathbf{g}$ is $|\det([\mathbf{r}, \mathbf{b}, \mathbf{g}])|$. The sign of the determinant indicates the orientation of the spanning vectors.

In [3]:
import math

# --- Helper functions for the determinant (re-used from previous implementations) ---

def get_minor(matrix, i, j):
    """
    Helper: Computes the minor of a matrix by removing row i and column j.
    Used for cofactor expansion.
    """
    return [row[:j] + row[j+1:] for k, row in enumerate(matrix) if k != i]

def determinant_nxn(matrix):
    """
    Computes the determinant of an n x n matrix using cofactor expansion.
    WARNING: This method is highly inefficient (O(n!)) for large matrices.
    For practical applications, use NumPy (e.g., numpy.linalg.det).
    """
    n = len(matrix)
    
    if not matrix or not all(len(row) == n for row in matrix):
        raise ValueError("Input matrix must be a non-empty square matrix.")

    if n == 1:
        return matrix[0][0]
    elif n == 2:
        return matrix[0][0] * matrix[1][1] - matrix[0][1] * matrix[1][0]
    else:
        det = 0
        for j in range(n): # Expand along the first row
            minor = get_minor(matrix, 0, j)
            cofactor = matrix[0][j] * determinant_nxn(minor)
            if (0 + j) % 2 == 1: # Check for (-1)^(i+j) sign
                det -= cofactor
            else:
                det += cofactor
        return det

def determinant_3x3_sarrus(matrix):
    """
    Computes the determinant of a 3x3 matrix using Sarrus' Rule.
    Formula: a11*a22*a33 + a12*a23*a31 + a13*a21*a32
             - a31*a22*a13 - a32*a23*a11 - a33*a21*a12
    """
    if not isinstance(matrix, list) or len(matrix) != 3 or \
       not all(isinstance(row, list) and len(row) == 3 for row in matrix):
        raise ValueError("Input must be a 3x3 matrix for Sarrus' rule.")
    
    a11, a12, a13 = matrix[0][0], matrix[0][1], matrix[0][2]
    a21, a22, a23 = matrix[1][0], matrix[1][1], matrix[1][2]
    a31, a32, a33 = matrix[2][0], matrix[2][1], matrix[2][2]
    
    # Positive diagonal products
    term1 = a11 * a22 * a33
    term2 = a12 * a23 * a31
    term3 = a13 * a21 * a32
    
    # Negative diagonal products
    term4 = a31 * a22 * a13
    term5 = a32 * a23 * a11
    term6 = a33 * a21 * a12
    
    return term1 + term2 + term3 - term4 - term5 - term6


# --- Determinant of Triangular Matrices ---

def is_square_matrix(matrix):
    """Checks if a given list of lists represents a square matrix."""
    if not matrix or not isinstance(matrix, list):
        return False
    n = len(matrix)
    return all(isinstance(row, list) and len(row) == n for row in matrix)

def is_upper_triangular(matrix):
    """
    Checks if a square matrix is upper-triangular (zeros below the diagonal).
    """
    if not is_square_matrix(matrix):
        raise ValueError("Matrix must be square to check for triangularity.")
    
    n = len(matrix)
    for i in range(1, n): # Start from second row
        for j in range(i): # Check elements below diagonal
            if matrix[i][j] != 0:
                return False
    return True

def is_lower_triangular(matrix):
    """
    Checks if a square matrix is lower-triangular (zeros above the diagonal).
    """
    if not is_square_matrix(matrix):
        raise ValueError("Matrix must be square to check for triangularity.")

    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n): # Check elements above diagonal
            if matrix[i][j] != 0:
                return False
    return True

def determinant_triangular(matrix):
    """
    Computes the determinant of a triangular (upper or lower) matrix.
    The determinant is the product of its diagonal elements.
    """
    if not is_square_matrix(matrix):
        raise ValueError("Matrix must be square to compute its determinant.")
    
    if not (is_upper_triangular(matrix) or is_lower_triangular(matrix)):
        # The diagonal product equals the determinant only for triangular matrices.
        # For non-triangular matrices, use the general determinant_nxn.
        raise ValueError("Matrix is not triangular (upper or lower). This function is for triangular matrices only.")
    
    product_of_diagonals = 1
    for i in range(len(matrix)):
        product_of_diagonals *= matrix[i][i]
    return product_of_diagonals

# --- Geometric Interpretation of Determinant ---

def vectors_to_matrix_columns(vectors):
    """
    Converts a list of vectors into a matrix where each vector is a column.
    Assumes all vectors have the same dimension.
    """
    if not vectors:
        raise ValueError("List of vectors cannot be empty.")
    
    dim = len(vectors[0])
    num_vectors = len(vectors)
    
    if not all(len(v) == dim for v in vectors):
        raise ValueError("All vectors must have the same dimension.")
    
    # Create an empty matrix of appropriate size (rows = dim, cols = num_vectors)
    matrix = [[0 for _ in range(num_vectors)] for _ in range(dim)]
    
    for col_idx in range(num_vectors):
        for row_idx in range(dim):
            matrix[row_idx][col_idx] = vectors[col_idx][row_idx]
            
    return matrix


# --- Example Usage ---

print("--- Determinant of Triangular Matrices ---")

# Upper-triangular matrix
upper_tri_matrix = [
    [1, 2, 3],
    [0, 4, 5],
    [0, 0, 6]
]
print(f"Matrix: {upper_tri_matrix}")
print(f"Is upper-triangular? {is_upper_triangular(upper_tri_matrix)}")
print(f"Determinant (triangular formula): {determinant_triangular(upper_tri_matrix)}") # Expected: 1*4*6 = 24
print(f"Determinant (general formula): {determinant_nxn(upper_tri_matrix)}") # Expected: 24

# Lower-triangular matrix
lower_tri_matrix = [
    [7, 0, 0],
    [8, 9, 0],
    [1, 2, 10]
]
print(f"\nMatrix: {lower_tri_matrix}")
print(f"Is lower-triangular? {is_lower_triangular(lower_tri_matrix)}")
print(f"Determinant (triangular formula): {determinant_triangular(lower_tri_matrix)}") # Expected: 7*9*10 = 630
print(f"Determinant (general formula): {determinant_nxn(lower_tri_matrix)}") # Expected: 630

# Non-triangular matrix
non_tri_matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]
print(f"\nMatrix: {non_tri_matrix}")
print(f"Is upper-triangular? {is_upper_triangular(non_tri_matrix)}")
print(f"Is lower-triangular? {is_lower_triangular(non_tri_matrix)}")
try:
    determinant_triangular(non_tri_matrix)
except ValueError as e:
    print(f"Determinant (triangular formula): Error: {e}")
print(f"Determinant (general formula - Sarrus): {determinant_3x3_sarrus(non_tri_matrix)}") # Expected: 0


print("\n--- Geometric Interpretation of Determinant ---")

# 2D Example: Area of a parallelogram
# Vectors b = [2, 0], g = [0, 3] => forms a rectangle with area 6
b_vec_2d = [2, 0]
g_vec_2d = [0, 3]
matrix_2d_columns = vectors_to_matrix_columns([b_vec_2d, g_vec_2d])
det_2d = determinant_nxn(matrix_2d_columns)
print(f"\nVectors for 2D parallelogram: b={b_vec_2d}, g={g_vec_2d}")
print(f"Matrix from columns: {matrix_2d_columns}")
print(f"Determinant: {det_2d}")
print(f"Area of parallelogram: {abs(det_2d)}") # Expected: 6

# Vectors b = [3, 1], g = [1, 2]
b_vec_2d_skew = [3, 1]
g_vec_2d_skew = [1, 2]
matrix_2d_skew_columns = vectors_to_matrix_columns([b_vec_2d_skew, g_vec_2d_skew])
det_2d_skew = determinant_nxn(matrix_2d_skew_columns)
print(f"\nVectors for skewed 2D parallelogram: b={b_vec_2d_skew}, g={g_vec_2d_skew}")
print(f"Matrix from columns: {matrix_2d_skew_columns}")
print(f"Determinant: {det_2d_skew}") # Expected: (3*2) - (1*1) = 5
print(f"Area of parallelogram: {abs(det_2d_skew)}")


# 3D Example: Volume of a parallelepiped (from text, Equation 4.9)
r_vec_3d = [2, 0, -8]
g_vec_3d = [6, 1, 0]
b_vec_3d = [1, 4, -1]

matrix_3d_columns = vectors_to_matrix_columns([r_vec_3d, g_vec_3d, b_vec_3d])

# Using the determinant_3x3_sarrus function as it's efficient for 3x3
det_3d = determinant_3x3_sarrus(matrix_3d_columns)

print(f"\nVectors for 3D parallelepiped (from Equation 4.9):")
print(f"  r = {r_vec_3d}")
print(f"  g = {g_vec_3d}")
print(f"  b = {b_vec_3d}")
print(f"Matrix from columns: {matrix_3d_columns}")
print(f"Determinant of matrix [r, g, b]: {det_3d}")
print(f"Volume of parallelepiped: {abs(det_3d)}")

# Verify with manual calculation for [r, g, b]:
# det = 2*(1*-1 - 0*4) - 6*(0*-1 - (-8)*4) + 1*(0*0 - (-8)*1)
#     = 2*(-1) - 6*(0 + 32) + 1*(0 + 8)
#     = -2 - 6*32 + 8
#     = -2 - 192 + 8
#     = -186 (This is the expected value)

print("\n--- End of Demonstrations ---")
print("Remember: For real-world linear algebra, use libraries like NumPy for performance.")

--- Determinant of Triangular Matrices ---
Matrix: [[1, 2, 3], [0, 4, 5], [0, 0, 6]]
Is upper-triangular? True
Determinant (triangular formula): 24
Determinant (general formula): 24

Matrix: [[7, 0, 0], [8, 9, 0], [1, 2, 10]]
Is lower-triangular? True
Determinant (triangular formula): 630
Determinant (general formula): 630

Matrix: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
Is upper-triangular? False
Is lower-triangular? False
Determinant (triangular formula): Error: Matrix is not triangular (upper or lower). This function is for triangular matrices only.
Determinant (general formula - Sarrus): 0

--- Geometric Interpretation of Determinant ---

Vectors for 2D parallelogram: b=[2, 0], g=[0, 3]
Matrix from columns: [[2, 0], [0, 3]]
Determinant: 6
Area of parallelogram: 6

Vectors for skewed 2D parallelogram: b=[3, 1], g=[1, 2]
Matrix from columns: [[3, 1], [1, 2]]
Determinant: 5
Area of parallelogram: 5

Vectors for 3D parallelepiped (from Equation 4.9):
  r = [2, 0, -8]
  g = [6, 1, 0]
  b = [1

Writing these vectors as the columns of a matrix
$$ A = [\mathbf{r}, \mathbf{g}, \mathbf{b}] = \begin{pmatrix} 2 & 6 & 1 \\ 0 & 1 & 4 \\ -8 & 0 & -1 \end{pmatrix} \quad (4.10) $$
allows us to compute the desired volume as
$$ V = |\det(A)| = 186. \quad (4.11) $$
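As an independent cross-check of the value 186, one can use the scalar triple product identity $\det([\mathbf{r}, \mathbf{g}, \mathbf{b}]) = \mathbf{r} \cdot (\mathbf{g} \times \mathbf{b})$. This identity is a standard fact that the text does not state, and the helper names `cross` and `dot` are assumptions made for this sketch.

```python
def cross(u, v):
    """Cross product of two 3D vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

# Vectors from (4.9)
r, g, b = [2, 0, -8], [6, 1, 0], [1, 4, -1]

signed_volume = dot(r, cross(g, b))   # equals det([r, g, b])
print(signed_volume, abs(signed_volume))
```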

Computing the determinant of an $n \times n$ matrix requires a general algorithm to solve the cases for $n > 3$, which we are going to explore in the following. Theorem 4.2 below reduces the problem of computing the determinant of an $n \times n$ matrix to computing the determinant of $(n-1) \times (n-1)$ matrices. By recursively applying the Laplace expansion (Theorem 4.2), we can therefore compute determinants of $n \times n$ matrices by ultimately computing determinants of $2 \times 2$ matrices.

**Theorem 4.2 (Laplace Expansion).** Consider a matrix $A \in \mathbb{R}^{n \times n}$. Then, for all $j = 1, \dots, n$:
1.  **Expansion along column $j$**:
    $$ \det(A) = \sum_{k=1}^{n} (-1)^{k+j} a_{kj} \det(A_{k,j}) \quad (4.12) $$
2.  **Expansion along row $j$**:
    $$ \det(A) = \sum_{k=1}^{n} (-1)^{j+k} a_{jk} \det(A_{j,k}) \quad (4.13) $$
Here $A_{k,j} \in \mathbb{R}^{(n-1) \times (n-1)}$ is the submatrix of $A$ that we obtain when deleting row $k$ and column $j$.
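Theorem 4.2 says the result does not depend on which row or column is chosen. The following sketch, under the assumption of 0-based indices and with hypothetical helper names (`submatrix`, `det_along_column`, `det_along_row`), expands along every column and every row of a small matrix and compares the results.

```python
def submatrix(A, k, j):
    """Delete row k and column j (0-based) from A."""
    return [row[:j] + row[j + 1:] for i, row in enumerate(A) if i != k]

def det_along_column(A, j):
    """Laplace expansion along column j (0-based), as in (4.12)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    # (-1)^(k+j) has the same parity for 0-based and 1-based indices.
    return sum((-1) ** (k + j) * A[k][j] * det_along_column(submatrix(A, k, j), 0)
               for k in range(n))

def det_along_row(A, j):
    """Laplace expansion along row j (0-based), as in (4.13)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** (j + k) * A[j][k] * det_along_row(submatrix(A, j, k), 0)
               for k in range(n))

A = [[1, 2, 3], [3, 1, 2], [0, 0, 1]]
results = ([det_along_column(A, j) for j in range(3)]
           + [det_along_row(A, j) for j in range(3)])
print(results)   # all six expansions should agree
```

In practice one would pick the row or column with the most zeros, since each zero entry removes a recursive call.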

**Example 4.3 (Laplace Expansion)**
Let us compute the determinant of
$$ A = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix} \quad (4.14) $$
using the Laplace expansion along the first row. Applying (4.13) yields
$$ \begin{vmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \\ 0 & 0 & 1 \end{vmatrix} = (-1)^{1+1} \cdot 1 \begin{vmatrix} 1 & 2 \\ 0 & 1 \end{vmatrix} + (-1)^{1+2} \cdot 2 \begin{vmatrix} 3 & 2 \\ 0 & 1 \end{vmatrix} + (-1)^{1+3} \cdot 3 \begin{vmatrix} 3 & 1 \\ 0 & 0 \end{vmatrix} \quad (4.15) $$

In [4]:
# --- Helper functions (re-used from previous implementations) ---

def get_minor(matrix, row_to_delete, col_to_delete):
    """
    Computes the minor (submatrix) by removing a specified row and column.
    Used in Laplace Expansion.
    """
    return [row[:col_to_delete] + row[col_to_delete+1:] 
            for r_idx, row in enumerate(matrix) if r_idx != row_to_delete]

def determinant_2x2(matrix):
    """
    Computes the determinant of a 2x2 matrix.
    Base case for recursive Laplace Expansion.
    """
    if len(matrix) != 2 or len(matrix[0]) != 2 or len(matrix[1]) != 2:
        raise ValueError("Input matrix must be a 2x2 matrix.")
    return matrix[0][0] * matrix[1][1] - matrix[0][1] * matrix[1][0]

def determinant_1x1(matrix):
    """
    Computes the determinant of a 1x1 matrix.
    Base case for recursive Laplace Expansion.
    """
    if len(matrix) != 1 or len(matrix[0]) != 1:
        raise ValueError("Input must be a 1x1 matrix (e.g., [[val]]).")
    return matrix[0][0]


# --- Implementation of Laplace Expansion (Theorem 4.2) ---

def determinant_laplace_expansion(matrix):
    """
    Computes the determinant of a square matrix using Laplace Expansion (recursive cofactor expansion).
    This implementation expands along the first row (j=1 in Theorem 4.2, part 2, with k iterating through columns).

    WARNING: This method is highly inefficient (O(n!)) for matrices larger than 4x4.
    For practical applications, use optimized libraries like NumPy (numpy.linalg.det).
    """
    n = len(matrix)
    
    if not matrix or not all(len(row) == n for row in matrix):
        raise ValueError("Input matrix must be a non-empty square matrix.")

    if n == 1:
        return determinant_1x1(matrix)
    elif n == 2:
        return determinant_2x2(matrix)
    else:
        det = 0
        # Expand along the first row (fixed row_idx = 0)
        row_idx = 0 
        for col_idx in range(n):
            # Ak,j in the theorem corresponds to minor = get_minor(matrix, row_idx, col_idx)
            minor = get_minor(matrix, row_idx, col_idx)
            
            # (-1)^(j+k) in Theorem 4.2, here it's (-1)^(row_idx + col_idx)
            sign = (-1)**(row_idx + col_idx)
            
            # a_jk in Theorem 4.2 is matrix[row_idx][col_idx]
            # det(A_jk) is determinant_laplace_expansion(minor)
            cofactor_term = sign * matrix[row_idx][col_idx] * determinant_laplace_expansion(minor)
            
            det += cofactor_term
        return det

# --- Example 4.3: Laplace Expansion ---

print("--- Example 4.3: Laplace Expansion ---")

A = [
    [1, 2, 3],
    [3, 1, 2],
    [0, 0, 1]
]
print(f"Matrix A (Equation 4.14):\n{A[0]}\n{A[1]}\n{A[2]}\n")

# Applying Equation (4.15) step-by-step
print("Applying Laplace Expansion along the first row (Equation 4.15):")

# Term 1: (-1)^(1+1) * a11 * det(A1,1)
a11 = A[0][0] # 1
minor_A11 = get_minor(A, 0, 0) # Submatrix after deleting row 0, col 0
det_minor_A11 = determinant_2x2(minor_A11)
term1 = ((-1)**(0+0)) * a11 * det_minor_A11
print(f"Term 1: (-1)^(1+1) * {a11} * det({minor_A11}) = 1 * {a11} * ({det_minor_A11}) = {term1}")

# Term 2: (-1)^(1+2) * a12 * det(A1,2)
a12 = A[0][1] # 2
minor_A12 = get_minor(A, 0, 1) # Submatrix after deleting row 0, col 1
det_minor_A12 = determinant_2x2(minor_A12)
term2 = ((-1)**(0+1)) * a12 * det_minor_A12
print(f"Term 2: (-1)^(1+2) * {a12} * det({minor_A12}) = -1 * {a12} * ({det_minor_A12}) = {term2}")

# Term 3: (-1)^(1+3) * a13 * det(A1,3)
a13 = A[0][2] # 3
minor_A13 = get_minor(A, 0, 2) # Submatrix after deleting row 0, col 2
det_minor_A13 = determinant_2x2(minor_A13)
term3 = ((-1)**(0+2)) * a13 * det_minor_A13
print(f"Term 3: (-1)^(1+3) * {a13} * det({minor_A13}) = 1 * {a13} * ({det_minor_A13}) = {term3}")

calculated_det_example = term1 + term2 + term3
print(f"\nSum of terms = {calculated_det_example}")

# Verify using the general determinant_laplace_expansion function
det_A_function = determinant_laplace_expansion(A)
print(f"Determinant of A computed by function: {det_A_function}")

# Expected value, following the calculation in the text (Equation 4.15):
# det(A) = 1 * det([[1,2],[0,1]]) - 2 * det([[3,2],[0,1]]) + 3 * det([[3,1],[0,0]])
# det([[1,2],[0,1]]) = 1*1 - 2*0 = 1
# det([[3,2],[0,1]]) = 3*1 - 2*0 = 3
# det([[3,1],[0,0]]) = 3*0 - 1*0 = 0
# So, det(A) = 1 * 1 - 2 * 3 + 3 * 0 = 1 - 6 + 0 = -5.

print("\n--- Important Note ---")
print("The Laplace Expansion (cofactor expansion) is computationally intensive (O(n!)).")
print("For practical determinant calculations, especially for larger matrices,")
print("always use optimized libraries like NumPy, which employ more efficient algorithms.")

--- Example 4.3: Laplace Expansion ---
Matrix A (Equation 4.14):
[1, 2, 3]
[3, 1, 2]
[0, 0, 1]

Applying Laplace Expansion along the first row (Equation 4.15):
Term 1: (-1)^(1+1) * 1 * det([[1, 2], [0, 1]]) = 1 * 1 * (1) = 1
Term 2: (-1)^(1+2) * 2 * det([[3, 2], [0, 1]]) = -1 * 2 * (3) = -6
Term 3: (-1)^(1+3) * 3 * det([[3, 1], [0, 0]]) = 1 * 3 * (0) = 0

Sum of terms = -5
Determinant of A computed by function: -5

--- Important Note ---
The Laplace Expansion (cofactor expansion) is computationally intensive (O(n!)).
For practical determinant calculations, especially for larger matrices,
always use optimized libraries like NumPy, which employ more efficient algorithms.


We use (4.6) to compute the determinants of all $ 2 \times 2 $ matrices and obtain

$$
\det(A) = 1(1 - 0) - 2(3 - 0) + 3(0 - 0) = -5. \tag{4.16}
$$

For completeness, we can compare this result to computing the determinant using Sarrus’ rule (4.7):

$$
\det(A) = 1 \cdot 1 \cdot 1 + 3 \cdot 0 \cdot 3 + 0 \cdot 2 \cdot 2 - 0 \cdot 1 \cdot 3 - 1 \cdot 0 \cdot 2 - 3 \cdot 2 \cdot 1 = 1 - 6 = -5. \tag{4.17}
$$

For $ A \in \mathbb{R}^{n \times n} $, the determinant exhibits the following properties:

- The determinant of a matrix product is the product of the corresponding determinants, $ \det(AB) = \det(A) \det(B) $.
- Determinants are invariant to transposition, i.e., $ \det(A) = \det(A^\top) $.
- If $ A $ is regular (invertible), then $ \det(A^{-1}) = \frac{1}{\det(A)} $.
- Similar matrices (Definition 2.22) possess the same determinant. Therefore, for a linear mapping $ \Phi : V \to V $, all transformation matrices $ A_\Phi $ of $ \Phi $ have the same determinant. Thus, the determinant is invariant to the choice of basis of a linear mapping.
- Adding a multiple of a column/row to another one does not change $ \det(A) $.
- Multiplication of a column/row with $ \lambda \in \mathbb{R} $ scales $ \det(A) $ by $ \lambda $. In particular, $ \det(\lambda A) = \lambda^n \det(A) $.
- Swapping two rows/columns changes the sign of $ \det(A) $.

Because of the last three properties, we can use Gaussian elimination (see Section 2.1) to compute $ \det(A) $ by bringing $ A $ into row-echelon form. We can stop Gaussian elimination when we have $ A $ in a triangular form where the elements below the diagonal are all 0. Recall from (4.8) that the determinant of a triangular matrix is the product of the diagonal elements.
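The procedure just described can be sketched as follows. This is a minimal illustration, not a production routine: it assumes exact rational arithmetic via `fractions.Fraction` to avoid rounding issues, and the function name `det_by_elimination` is hypothetical.

```python
from fractions import Fraction

def det_by_elimination(A):
    """Determinant via row reduction to upper-triangular form."""
    M = [[Fraction(x) for x in row] for row in A]   # exact copy of A
    n = len(M)
    sign = 1
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)          # no pivot: the matrix is singular
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign                # a row swap flips the sign of det
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            # Adding a multiple of one row to another leaves det unchanged.
            M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    prod = Fraction(sign)
    for i in range(n):
        prod *= M[i][i]                 # product of the diagonal, as in (4.8)
    return prod

print(det_by_elimination([[1, 2, 3], [3, 1, 2], [0, 0, 1]]))    # matrix from (4.14)
print(det_by_elimination([[2, 6, 1], [0, 1, 4], [-8, 0, -1]]))  # matrix from (4.10)
```

Unlike the recursive Laplace expansion, this approach needs on the order of $n^3$ arithmetic operations.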

**Theorem 4.3.** A square matrix $ A \in \mathbb{R}^{n \times n} $ has $ \det(A) \neq 0 $ if and only if $ \text{rk}(A) = n $. In other words, $ A $ is invertible if and only if it is full rank.

When mathematics was mainly performed by hand, the determinant calculation was considered an essential way to analyze matrix invertibility. However, contemporary approaches in machine learning use direct numerical methods that superseded the explicit calculation of the determinant. For example, in Chapter 2, we learned that inverse matrices can be computed by Gaussian elimination. Gaussian elimination can thus be used to compute the determinant of a matrix.

Determinants will play an important theoretical role for the following sections, especially when we learn about eigenvalues and eigenvectors (Section 4.2) through the characteristic polynomial.

**Definition 4.4.** The trace of a square matrix $ A \in \mathbb{R}^{n \times n} $ is defined as

$$
\text{tr}(A) := \sum_{i=1}^n a_{ii}, \tag{4.18}
$$

i.e., the trace is the sum of the diagonal elements of $ A $.

The trace satisfies the following properties:

- $ \text{tr}(A + B) = \text{tr}(A) + \text{tr}(B) $ for $ A, B \in \mathbb{R}^{n \times n} $
- $ \text{tr}(\alpha A) = \alpha \text{tr}(A) $, $ \alpha \in \mathbb{R} $ for $ A \in \mathbb{R}^{n \times n} $
- $ \text{tr}(I_n) = n $
- $ \text{tr}(AB) = \text{tr}(BA) $ for $ A \in \mathbb{R}^{n \times k} $, $ B \in \mathbb{R}^{k \times n} $

It can be shown that only one function satisfies these four properties together – the trace (Gohberg et al., 2012).
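Note that the last property allows non-square factors: $AB$ is $n \times n$ while $BA$ is $k \times k$. A small sketch with assumed sample matrices (and hypothetical helpers `matmul` and `tr`) illustrates this for $n = 2$, $k = 3$.

```python
def matmul(A, B):
    """Product of A (n x k) and B (k x p) as nested lists."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def tr(M):
    """Trace of a square matrix: sum of the diagonal elements."""
    return sum(M[i][i] for i in range(len(M)))

A = [[1, 2, 3], [4, 5, 6]]        # 2 x 3 (sample values)
B = [[1, 0], [0, 1], [2, -1]]     # 3 x 2 (sample values)

AB = matmul(A, B)                 # 2 x 2
BA = matmul(B, A)                 # 3 x 3
print(tr(AB), tr(BA))             # the two traces should agree
```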

The properties of the trace of matrix products are more general. Specifically, the trace is invariant under cyclic permutations, i.e.,

$$
\text{tr}(AKL) = \text{tr}(KLA) \tag{4.19}
$$

for matrices $ A \in \mathbb{R}^{a \times k} $, $ K \in \mathbb{R}^{k \times l} $, $ L \in \mathbb{R}^{l \times a} $. This property generalizes to products of an arbitrary number of matrices.

As a special case of (4.19), it follows that for two vectors $ x, y \in \mathbb{R}^n $

$$
\text{tr}(x y^\top) = \text{tr}(y^\top x) = y^\top x \in \mathbb{R}. \tag{4.20}
$$
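A quick numerical illustration of (4.20), with sample vectors chosen only for this sketch: the trace of the outer product $x y^\top$ equals the inner product $y^\top x$.

```python
x = [1, 2, 3]
y = [4, -1, 2]

outer = [[xi * yj for yj in y] for xi in x]             # x y^T, a 3 x 3 matrix
trace_outer = sum(outer[i][i] for i in range(len(x)))   # tr(x y^T)
inner = sum(xi * yi for xi, yi in zip(x, y))            # y^T x, a scalar

print(trace_outer, inner)
```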

Given a linear mapping $ \Phi : V \to V $, where $ V $ is a vector space, we define the trace of this map by using the trace of the matrix representation of $ \Phi $. For a given basis of $ V $, we can describe $ \Phi $ by means of the transformation matrix $ A $. Then the trace of $ \Phi $ is the trace of $ A $. For a different basis of $ V $, it holds that the corresponding transformation matrix $ B $ of $ \Phi $ can be obtained by a basis change of the form $ S^{-1} A S $ for suitable $ S $ (see Section 2.7.2). For the corresponding trace of $ \Phi $, this means

$$
\text{tr}(B) = \text{tr}(S^{-1} A S) = \text{tr}(A S S^{-1}) = \text{tr}(A). \tag{4.21}
$$

Hence, while matrix representations of linear mappings are basis-dependent, the trace of a linear mapping $ \Phi $ is independent of the basis.
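The basis independence in (4.21) can be checked on a small example. The sketch below assumes a $2 \times 2$ matrix $A$ and an invertible $S$ picked for illustration, uses `fractions.Fraction` for exact arithmetic, and introduces the hypothetical helpers `matmul` and `inverse_2x2`. It also confirms that the determinant is unchanged, as stated for similar matrices earlier.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def inverse_2x2(S):
    """Exact inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = S
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("S must be invertible.")
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, 2], [3, 4]]   # sample transformation matrix
S = [[2, 1], [1, 3]]   # sample basis-change matrix, det(S) = 5

B = matmul(inverse_2x2(S), matmul(A, S))   # B = S^{-1} A S

trace_A, trace_B = A[0][0] + A[1][1], B[0][0] + B[1][1]
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
det_B = B[0][0] * B[1][1] - B[0][1] * B[1][0]
print(trace_A, trace_B, det_A, det_B)
```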

In this section, we covered determinants and traces as functions characterizing a square matrix. Taking together our understanding of determinants and traces, we can now define an important equation describing a matrix $ A $ in terms of a polynomial, which we will use extensively in the following sections.

**Definition 4.5 (Characteristic Polynomial).** For $ \lambda \in \mathbb{R} $ and a square matrix $ A \in \mathbb{R}^{n \times n} $

$$
p_A(\lambda) := \det(A - \lambda I) \tag{4.22a}
$$

$$
= c_0 + c_1 \lambda + c_2 \lambda^2 + \cdots + c_{n-1} \lambda^{n-1} + (-1)^n \lambda^n, \tag{4.22b}
$$

$ c_0, \ldots, c_{n-1} \in \mathbb{R} $, is the characteristic polynomial of $ A $. In particular,

$$
c_0 = \det(A), \tag{4.23}
$$

$$
c_{n-1} = (-1)^{n-1} \text{tr}(A). \tag{4.24}
$$

The characteristic polynomial (4.22a) will allow us to compute eigenvalues and eigenvectors, covered in the next section.
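For $n = 3$, the coefficients in (4.22b) can be checked against a direct evaluation of (4.22a). The sketch below uses the matrix from (4.14). The formula for the middle coefficient $c_1$ (minus the sum of the principal $2 \times 2$ minors) is a standard fact not stated in the text, and all function names are assumptions made for illustration.

```python
def det3(M):
    """3x3 determinant by Sarrus' rule."""
    return (M[0][0] * M[1][1] * M[2][2] + M[0][1] * M[1][2] * M[2][0]
            + M[0][2] * M[1][0] * M[2][1] - M[0][2] * M[1][1] * M[2][0]
            - M[0][0] * M[1][2] * M[2][1] - M[0][1] * M[1][0] * M[2][2])

def char_poly_value(A, lam):
    """Evaluate p_A(lam) = det(A - lam * I) directly, as in (4.22a)."""
    return det3([[A[i][j] - (lam if i == j else 0) for j in range(3)]
                 for i in range(3)])

def char_poly_coeffs_3x3(A):
    """Coefficients [c0, c1, c2, c3] of (4.22b) for n = 3."""
    trace = A[0][0] + A[1][1] + A[2][2]
    # Sum of the three principal 2x2 minors (assumed standard formula for c1).
    minors = (A[0][0] * A[1][1] - A[0][1] * A[1][0]
              + A[0][0] * A[2][2] - A[0][2] * A[2][0]
              + A[1][1] * A[2][2] - A[1][2] * A[2][1])
    # c0 = det(A) by (4.23); c2 = (-1)^2 tr(A) by (4.24); c3 = (-1)^3.
    return [det3(A), -minors, trace, -1]

A = [[1, 2, 3], [3, 1, 2], [0, 0, 1]]
c = char_poly_coeffs_3x3(A)

# Compare the polynomial with the direct determinant at a few sample points.
agree = all(char_poly_value(A, lam) == sum(ck * lam ** k for k, ck in enumerate(c))
            for lam in range(-2, 3))
print(c, agree)
```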

In [5]:
# --- Matrix-Matrix Multiplication ---
def matrix_multiply(A, B):
    """
    Multiply matrices A (n x m) and B (m x p). Returns an (n x p) matrix.
    """
    n, m = len(A), len(A[0])
    p = len(B[0])
    result = [[0.0 for _ in range(p)] for _ in range(n)]
    for i in range(n):
        for j in range(p):
            result[i][j] = sum(A[i][k] * B[k][j] for k in range(m))
    return result

# --- Transpose of a Matrix ---
def transpose(A):
    """
    Compute the transpose of matrix A.
    """
    n, m = len(A), len(A[0])
    return [[A[j][i] for j in range(n)] for i in range(m)]

# --- Determinant of a Matrix ---
def determinant(A):
    """
    Compute the determinant of a square matrix A.
    For 2x2 and 3x3 matrices, use direct formulas.
    """
    n = len(A)
    if n != len(A[0]):
        raise ValueError("Matrix must be square")

    if n == 2:
        # For 2x2: det(A) = ad - bc
        return A[0][0] * A[1][1] - A[0][1] * A[1][0]
    
    if n == 3:
        # Sarrus' rule for 3x3 (Equation 4.7)
        pos = (A[0][0] * A[1][1] * A[2][2] +
               A[0][1] * A[1][2] * A[2][0] +
               A[0][2] * A[1][0] * A[2][1])
        neg = (A[0][2] * A[1][1] * A[2][0] +
               A[0][0] * A[1][2] * A[2][1] +
               A[0][1] * A[1][0] * A[2][2])
        return pos - neg
    
    raise ValueError("This implementation only supports 2x2 and 3x3 matrices")

# --- Trace of a Matrix (Equation 4.18) ---
def trace(A):
    """
    Compute the trace of a square matrix A: sum of diagonal elements.
    """
    n = len(A)
    if n != len(A[0]):
        raise ValueError("Matrix must be square")
    return sum(A[i][i] for i in range(n))

# --- Characteristic Polynomial Coefficients (Equations 4.22a–4.24) ---
def characteristic_polynomial(A):
    """
    Compute the characteristic polynomial p_A(lambda) = det(A - lambda I).
    For simplicity, compute coefficients for 2x2 matrices.
    Returns coefficients [c0, c1, c2] for p_A(lambda) = c0 + c1*lambda + c2*lambda^2.
    """
    n = len(A)
    if n != len(A[0]) or n != 2:
        raise ValueError("This implementation only supports 2x2 matrices")

    # A - lambda I for a 2x2 matrix
    # det(A - lambda I) = (a11 - lambda)(a22 - lambda) - a12*a21
    a11, a12 = A[0][0], A[0][1]
    a21, a22 = A[1][0], A[1][1]
    
    # Expand: det = (a11 - lambda)(a22 - lambda) - a12*a21
    # = a11*a22 - a11*lambda - a22*lambda + lambda^2 - a12*a21
    # = lambda^2 - (a11 + a22)*lambda + (a11*a22 - a12*a21)
    c0 = a11 * a22 - a12 * a21  # Constant term = det(A)
    c1 = -(a11 + a22)           # Coefficient of lambda
    c2 = 1.0                    # Coefficient of lambda^2
    
    return [c0, c1, c2]

# --- Run the Implementation ---
# Determinant Calculation (Equations 4.16–4.17)
print("Determinant Calculation:")
A_3x3 = [[1, 3, 0], [2, 1, 0], [0, 2, 1]]  # Note: differs from A in Equation 4.14, but its determinant is also -5
det_A = determinant(A_3x3)
print(f"Matrix A (3x3):")
for row in A_3x3:
    print(row)
print(f"det(A) using Sarrus' rule = {det_A} (Equation 4.17)")

# Verify determinant properties
A_2x2 = [[1, 2], [3, 4]]
B_2x2 = [[2, 0], [1, 3]]
print(f"\nMatrix A (2x2):")
for row in A_2x2:
    print(row)
print(f"Matrix B (2x2):")
for row in B_2x2:
    print(row)

# det(AB) = det(A) det(B)
AB = matrix_multiply(A_2x2, B_2x2)
det_A = determinant(A_2x2)
det_B = determinant(B_2x2)
det_AB = determinant(AB)
print(f"\ndet(A) = {det_A}, det(B) = {det_B}")
print(f"det(AB) = {det_AB}")
print(f"det(A) * det(B) = {det_A * det_B}")
print(f"Property det(AB) = det(A) det(B): {abs(det_AB - det_A * det_B) < 1e-10}")

# det(A) = det(A^T)
A_T = transpose(A_2x2)
det_A_T = determinant(A_T)
print(f"\ndet(A^T) = {det_A_T}")
print(f"Property det(A) = det(A^T): {abs(det_A - det_A_T) < 1e-10}\n")

# Trace Calculation (Equation 4.18)
print("Trace Calculation:")
tr_A = trace(A_2x2)
print(f"Trace of A (2x2) = {tr_A} (Equation 4.18)")

# Verify trace properties
# tr(AB) = tr(BA)
BA = matrix_multiply(B_2x2, A_2x2)
tr_AB = trace(AB)
tr_BA = trace(BA)
print(f"\ntr(AB) = {tr_AB}, tr(BA) = {tr_BA}")
print(f"Property tr(AB) = tr(BA): {abs(tr_AB - tr_BA) < 1e-10}")

# Cyclic permutation: tr(AKL) = tr(KLA) (Equation 4.19)
# For simplicity, use 2x2 matrices A, K, and L
K = [[0, 1], [1, 0]]
L = [[1, 1], [0, 1]]
AK = matrix_multiply(A_2x2, K)
AKL = matrix_multiply(AK, L)
KLA = matrix_multiply(K, matrix_multiply(L, A_2x2))
tr_AKL = trace(AKL)
tr_KLA = trace(KLA)
print(f"\ntr(AKL) = {tr_AKL}, tr(KLA) = {tr_KLA}")
print(f"Property tr(AKL) = tr(KLA): {abs(tr_AKL - tr_KLA) < 1e-10}\n")

# Characteristic Polynomial (Equations 4.22a–4.24)
print("Characteristic Polynomial (2x2 Matrix):")
coeffs = characteristic_polynomial(A_2x2)
print(f"Characteristic polynomial coefficients [c0, c1, c2]: {coeffs}")
# Verify c0 = det(A)
c0 = coeffs[0]
print(f"c0 = {c0}, det(A) = {det_A}")
print(f"Property c0 = det(A): {abs(c0 - det_A) < 1e-10}")
# Verify c_{n-1} = (-1)^(n-1) tr(A), here n = 2, so c1 = -tr(A)
c1 = coeffs[1]
tr_A = trace(A_2x2)
expected_c1 = -tr_A
print(f"c1 = {c1}, (-1)^(n-1) tr(A) = {expected_c1}")
print(f"Property c1 = (-1)^(n-1) tr(A): {abs(c1 - expected_c1) < 1e-10}")

Determinant Calculation:
Matrix A (3x3):
[1, 3, 0]
[2, 1, 0]
[0, 2, 1]
det(A) using Sarrus' rule = -5 (Equation 4.17)

Matrix A (2x2):
[1, 2]
[3, 4]
Matrix B (2x2):
[2, 0]
[1, 3]

det(A) = -2, det(B) = 6
det(AB) = -12
det(A) * det(B) = -12
Property det(AB) = det(A) det(B): True

det(A^T) = -2
Property det(A) = det(A^T): True

Trace Calculation:
Trace of A (2x2) = 5 (Equation 4.18)

tr(AB) = 16, tr(BA) = 16
Property tr(AB) = tr(BA): True

tr(AKL) = 9, tr(KLA) = 9
Property tr(AKL) = tr(KLA): True

Characteristic Polynomial (2x2 Matrix):
Characteristic polynomial coefficients [c0, c1, c2]: [-2, -5, 1.0]
c0 = -2, det(A) = -2
Property c0 = det(A): True
c1 = -5, (-1)^(n-1) tr(A) = -5
Property c1 = (-1)^(n-1) tr(A): True
