## Determinants using SVD

Recall the singular value decomposition (check the SVD notebook in the unit 3 dir if you need a refresher).

$A_{m \times n} = U \Sigma V^T$

Every matrix can be decomposed with the SVD, and the properties of $U$, $\Sigma$, and $V^T$ are often exploited to accomplish other operations or tasks. Although an SVD can be computed for any $A_{m \times n}$, the determinant is only defined for square matrices. For non-square matrices a product of singular values can still be computed, but the result is best regarded as a not particularly useful pseudo-determinant-like quantity, without it being the actual pseudo-determinant (WARNING: this is a deep rabbit hole to go down). To stay focused on the numerical methods, we'll only consider computing determinants for square matrices.

In the case of the determinant we can see the following:

$\det{(A)} = \det{(U \Sigma V^T)}$

$\det{(A)} = \det{(U)} \det{(\Sigma)} \det{(V^T)}$

$U$ and $V$ are orthogonal so their determinants will be $\pm 1$. Knowing if the determinant is positive or negative is not always important so we can just simplify using the absolute value.

$|\det{(A)}| = \det{(\Sigma)}$

Since $\Sigma$ is a square, diagonal matrix with the singular values running down the diagonal in descending order, computing $\det{(\Sigma)}$ just becomes the following:


$|\det{(A)}| = \displaystyle \prod_i{diag(\Sigma)_i} = \displaystyle \prod_{i=1}^n \sigma_i$

Where $n$ is the total number of singular values.

So this won't quite give us the actual determinant, but it will give us the absolute value of the determinant, which in many cases is useful enough in and of itself.

The following code will test this method against `np.linalg.det()` which uses a different backend method for computing the determinant.

In [6]:
import numpy as np


A = np.random.randint(-1000, 1001, size=(8, 8))  # Random 8x8 integer matrix

# Compute the SVD of the matrix
U_A, s_A, Vh_A = np.linalg.svd(A)
det_A_svd = np.prod(s_A)  # Compute determinant from singular values
det_A_np = np.linalg.det(A)   # Compare with numpy determinant function

print(f"A:\n {A}\n")
print(f"Singular Value Decomposition Determinant:\n {det_A_svd}\n")
print(f"np.linalg.det() function result: \n {det_A_np}\n")

A:
 [[-755  400  -58  -64   30 -629 -626 -780]
 [-113 -579 -372  326  214  243 -709  405]
 [ 207  409 -820 -575 -462  251  118  948]
 [-764  404 -421 -722  703 -825  963  152]
 [-107  861 -832  264 -395  -17  613 -277]
 [-774  541 -967 -734 -926  400  316  955]
 [ 380  -87  948 -772 -589 -688 -993  164]
 [ 328  726 -869 -218   35 -811 -221 -974]]

Singular Value Decomposition Determinant:
 1.2150377580054407e+24

np.linalg.det() function result: 
 -1.2150377580054364e+24



From this we can also see that the values of $|\det{(A)}|$ from the two methods are not perfectly identical. This is due to the difference in the methods used and the rounding error inherent to numerical mathematics. Nevertheless, the two values agree to about 14 significant figures, despite the value being of such a high order of magnitude.
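As a minimal sketch of the same idea on a matrix small enough to check by hand, the helpers below compute $|\det(A)|$ from the singular values, and also its logarithm, which avoids overflow when the product gets very large. The helper names `abs_det_svd` and `log_abs_det_svd` are made up for this illustration; `np.linalg.slogdet` is NumPy's own function returning the sign and the log of the absolute determinant.

```python
import numpy as np


def abs_det_svd(M):
    """|det(M)| as the product of the singular values of M."""
    s = np.linalg.svd(M, compute_uv=False)  # singular values only
    return np.prod(s)


def log_abs_det_svd(M):
    """log|det(M)| as the sum of the logs of the singular values.

    Assumes M is non-singular (a zero singular value would give -inf).
    """
    s = np.linalg.svd(M, compute_uv=False)
    return np.sum(np.log(s))


M = np.array([[2.0, 1.0],
              [1.0, 3.0]])  # det(M) = 2*3 - 1*1 = 5

sign, logabsdet = np.linalg.slogdet(M)
print(abs_det_svd(M))                  # close to 5
print(log_abs_det_svd(M), logabsdet)   # both close to log(5)
```

Working in log space is only a convenience here. It gives the same information as the plain product, just in a form that stays representable for large matrices.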

## Determinants with Eigenvalues (kinda)

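Let's suppose that we want to use eigenvalues instead of singular values. After all, a similar line of reasoning should work, as seen in the following using the eigendecomposition $A = Q \Lambda Q^{-1}$ (which becomes $A = Q \Lambda Q^T$ with an orthogonal $Q$ when $A$ is symmetric):

$\det{(A)} = \det{(Q \Lambda Q^{-1})}$

$\det{(A)} = \det{(Q)} \det{(\Lambda)} \det{(Q^{-1})}$

Since $\det{(Q^{-1})} = 1 / \det{(Q)}$, the two outer factors cancel. In the symmetric case $Q$ is orthogonal, so $\det{(Q)} = \pm 1$ and $\det{(Q)} \det{(Q^T)} = 1$.
Also recall that we can have negative eigenvalues, so this time the sign of the determinant is kept.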

$\det{(A)} = \det{(\Lambda)}$

$\det{(A)} = \displaystyle \prod_i{diag(\Lambda)_i} = \displaystyle \prod_{i=1}^n \lambda_i$

Since we can only compute determinants for square matrices then this should work, right?

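As can be seen in the following code block, eigenvalues can be complex, which leads to a complex value for the computed product. Remember that not all square matrices have strictly real eigenvalues. For a real matrix the complex eigenvalues come in conjugate pairs, so the exact product is real, but rounding error can leave a small imaginary part behind. This, among other considerations regarding the computation of eigenvalues, makes using eigenvalues for computing determinants less than ideal.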

In [7]:
eigvals, eigvecs = np.linalg.eig(A)
eig_det = np.prod(eigvals)
print(f"Product of eigenvalues: {eig_det}")

Product of eigenvalues: (-1.2150377580054393e+24+64939918.32303821j)


Do notice, though, that the real part of the product of eigenvalues is in reasonable agreement with the previous $\det{(A)}$ computations using SVD and `np.linalg.det()`. However, there is still the problem of significant figures and error (as is the case with all differing numerical methods), and also the problem of the leftover imaginary part (about $6.5 \times 10^{7}$ here, tiny next to $10^{24}$ but not zero), so the value as printed is not the determinant until that part is discarded.
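A small sketch with a matrix that can be checked by hand shows where the complex values come from. A $2 \times 2$ rotation by 90 degrees is real, its eigenvalues are the conjugate pair $\pm i$, and their product is the real determinant $1$. The variable names are just for illustration.

```python
import numpy as np

# Rotation by 90 degrees: a real matrix with purely imaginary eigenvalues
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])

eigvals_R = np.linalg.eigvals(R)   # conjugate pair, +1j and -1j
eig_prod = np.prod(eigvals_R)      # complex dtype, imaginary part ~ 0

print(f"Eigenvalues: {eigvals_R}")
print(f"Product of eigenvalues: {eig_prod}")
print(f"Real part vs np.linalg.det(): {eig_prod.real} vs {np.linalg.det(R)}")
```

For a real input matrix, taking `.real` of the product is a reasonable way to drop the rounding residue, assuming the imaginary part is small relative to the real part.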

## Determinants with LU Decomposition

More commonly, the $LU$ decomposition is used to compute determinants, since the decomposition is simpler to compute than the $\sigma_i$'s or $\lambda_i$'s. Computing the determinant with $LU$ involves just computing the products of $diag(L)$ and $diag(U)$, plus it gives the full $\det(A)$ instead of $|\det(A)|$ as is the case with $\sigma_i$'s.

$\det(A) = \det(LU) = \det(L) \det(U)$

$\det(A) = \displaystyle \prod_{i} l_{ii} \displaystyle \prod_{i} u_{ii}$

If you need to convince yourself of this just take a $3 \times 3$ upper and lower triangular matrix and compute the determinant for them, and you'll find that everything off the diagonal at one point or another ends up getting multiplied by zero except for the product of the values along the diagonal.
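A quick numerical version of that check, as a sketch: the $3 \times 3$ matrices `U3` and `L3` below are arbitrary examples (not from the notes above), and their determinants are compared with the product of their diagonals.

```python
import numpy as np

# Arbitrary upper triangular example; its transpose is lower triangular
U3 = np.array([[2.0, 5.0,  7.0],
               [0.0, 3.0, 11.0],
               [0.0, 0.0,  4.0]])
L3 = U3.T

diag_product = np.prod(np.diag(U3))  # 2 * 3 * 4 = 24

print(f"Product of the diagonal: {diag_product}")
print(f"det(U3): {np.linalg.det(U3)}")
print(f"det(L3): {np.linalg.det(L3)}")
```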

Computation of the determinant with the diagonals of $LU$ for the same $A$ matrix as before can be seen in the following code block using `scipy.linalg.lu()`:

In [9]:
import scipy.linalg  # import the submodule explicitly; plain `import scipy` may not load it on older versions

p, L, U = scipy.linalg.lu(A.copy())
det = np.prod(L.diagonal()) * np.prod(U.diagonal())
print(f"Product of Values from diagonal of L and U: {det}")

Product of Values from diagonal of L and U: -1.2150377580054401e+24


So the above result agrees very well with the previous $\det(A)$ value obtained with `np.linalg.det()`. Plus, the $LU$ decomposition has $O(n^3)$ time complexity, the same order as standard matrix multiplication, so while it isn't the fastest method, it is relatively respectable. Matrix multiplication can be sped up with the *Strassen Algorithm*, which computes the same product with fewer scalar multiplications and has roughly $O(n^{2.81})$ time complexity (the exponent is $\log_2 7$). The point is just to say that Strassen's algorithm for matrix multiplication is asymptotically faster. I don't have any notes on the *Strassen Algorithm* yet, so I may need to create those in the future.

Also note that `scipy.linalg.lu()` uses a permutation matrix, hence `p, L, U = scipy.linalg.lu(A.copy())`, so the factorization is $A = PLU$. This differs slightly from my notes on the $LU$ decomposition, but the same idea applies in both cases. When using a permutation matrix, like in this case with the scipy function, we pick up an extra factor of $\det(P) = \pm 1$. Since the code above ignores `p`, we are only guaranteed $|\det (A)|$, like in the case of using SVD.
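One way to recover the sign, sketched below, is to multiply by the determinant of the permutation matrix that `scipy.linalg.lu()` returns. The helper name `det_lu` is made up for this illustration, and the small test matrix is chosen so that a row swap is forced by the zero in the top-left corner.

```python
import numpy as np
import scipy.linalg


def det_lu(M):
    """Determinant from a pivoted LU factorization M = P @ L @ U."""
    p, L, U = scipy.linalg.lu(M)
    # det(P) is exactly +1 or -1; round away any floating point noise
    sign = np.round(np.linalg.det(p))
    return sign * np.prod(L.diagonal()) * np.prod(U.diagonal())


M = np.array([[0.0, 2.0],
              [3.0, 4.0]])  # det(M) = 0*4 - 2*3 = -6

print(f"Signed determinant from LU: {det_lu(M)}")
print(f"np.linalg.det() result:     {np.linalg.det(M)}")
```

Computing `np.linalg.det(p)` is wasteful for a permutation matrix. It is used here only because it maps directly onto the $\det(P) = \pm 1$ factor described above.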

## Bareiss Algorithm

The last method I want to examine is the *Bareiss Algorithm*, which can be used to compute integer determinants for matrices that contain only integer values. What's meant by the term "integer determinants" is that no floats are used, and all arithmetic operations are carried out on integer values. It's named after Erwin Bareiss, who formally proposed the algorithm in 1968 in a paper called *Sylvester’s identity and multistep integer-preserving Gaussian elimination*.

This is a bit of a weird and special case, but a cool algorithm that I thought would be interesting to explore with the topic of numerical methods for computing determinants. 

Practically speaking, the Bareiss Algorithm appears to be marginally better (from the sources I've read through) than using $LU$ with Gaussian elimination for integer matrices. Gaussian elimination divides by the pivots, which generally produces non-integer values and therefore requires floating point arithmetic. Bareiss only performs divisions that are exact, so everything stays in an integer data type and the result carries no rounding error. In theory, sticking strictly to integer arithmetic could also allow the Bareiss Algorithm to be more performant than $LU$ with Gaussian elimination, although that is not tested here.

The steps for performing the Bareiss Algorithm are not super straightforward, but the following is my best description of them:

1. Start with a sign factor $=1$.

2. Iterate through columns: for each column index $k = 0, 1, ..., n-2$, check for a zero on the diagonal. If $A[k][k] = 0$, search for a non-zero element in the same column below the current row, say in row $l$. Interchange rows $k$ and $l$ to remove the zero, and multiply the sign factor by $-1$ for every such swap. If no non-zero element exists, the matrix is singular and the determinant is $0$.
$$\begin{pmatrix} A[k] \\ A[l] \end{pmatrix} \mapsto \begin{pmatrix} A[l] \\ A[k] \end{pmatrix}.$$

3. Eliminate entries below the diagonal: for each row index $i > k$ and each column index $j \geq k+1$, compute:
$$a_{ij} = a_{kk} \cdot a_{ij} - a_{ik} \cdot a_{kj}$$

4. Divide by the previous pivot: if $k > 0$, divide each updated element $a_{ij}$ from step 3 by the previous pivot $a_{k-1,k-1}$. This division is always exact, so the values stay integers.

5. Repeat steps 2 through 4 for the next $k$ until the bottom-right corner is reached.

6. Multiply the sign factor by the last diagonal element $a_{n-1,n-1}$ to get the determinant.

Don't take this description as 100% accurate. Again, this process is still not entirely clear to me, but I wanted to provide a description to the best of my current understanding as of writing.
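To make the update rule concrete, here is a small worked example on an arbitrary $3 \times 3$ matrix chosen for illustration, using 0-based indices and dividing by the previous pivot $a_{k-1,k-1}$ once $k > 0$. No row swaps are needed, so the sign factor stays $1$.

$$A = \begin{pmatrix} 2 & 1 & 3 \\ 4 & -1 & 3 \\ -2 & 5 & 5 \end{pmatrix}$$

For $k = 0$ the pivot is $a_{00} = 2$ and there is no division yet:

$$a_{11} = 2 \cdot (-1) - 4 \cdot 1 = -6, \quad a_{12} = 2 \cdot 3 - 4 \cdot 3 = -6$$

$$a_{21} = 2 \cdot 5 - (-2) \cdot 1 = 12, \quad a_{22} = 2 \cdot 5 - (-2) \cdot 3 = 16$$

For $k = 1$ the pivot is $a_{11} = -6$ and the previous pivot is $a_{00} = 2$:

$$a_{22} = \frac{(-6) \cdot 16 - 12 \cdot (-6)}{2} = \frac{-24}{2} = -12$$

The last diagonal element gives $\det(A) = -12$, which matches the cofactor expansion along the first row:

$$\det(A) = 2(-5 - 15) - 1(20 + 6) + 3(20 - 2) = -40 - 26 + 54 = -12$$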

In [5]:
def bareiss(A: np.ndarray):
    # Get the dimensions of the matrix
    m, n = np.shape(A)

    # Check if the matrix is square; raise an error if not
    if m != n:
        raise ValueError("Matrix is not square")

    # Work on a copy so the caller's matrix is not modified
    A = A.copy()
    # Initialize the sign factor to 1 (flipped on every row swap)
    sign = 1

    # Iterate through columns
    for k in range(n - 1):
        if A[k, k] == 0:
            # A zero is on the diagonal: look below it for a non-zero entry
            for l in range(k + 1, n):
                if A[l, k] != 0:
                    # Swap the two rows; a tuple swap of NumPy row views
                    # would overwrite one row, so use fancy indexing
                    A[[k, l]] = A[[l, k]]
                    # Update the sign factor (since we swapped a row)
                    sign = -sign
                    break
            else:
                # No row could be swapped in, so the matrix is singular
                # and the determinant of the matrix is zero
                return 0
        # Eliminate the entries below the diagonal in column k
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                A[i, j] = A[k, k] * A[i, j] - A[i, k] * A[k, j]
                # After the first column, divide by the previous pivot
                # (the division is exact, so integer division is safe)
                if k != 0:
                    A[i, j] = A[i, j] // A[k - 1, k - 1]
    # The determinant is the sign factor times the last diagonal element
    return sign * A[n - 1, n - 1]

B = np.random.randint(-20, 21, size=(8, 8), dtype=np.int64)
npdet = np.linalg.det(B)
print(f"Determinant from np.linalg.det(): {npdet}")
bareiss_det = bareiss(B)
print(f"Determinant from Bareiss Algorithm: {bareiss_det}")

Determinant from np.linalg.det(): -3934305320.0000124
Determinant from Bareiss Algorithm: -3934305320


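From the code block above using the Bareiss Algorithm we strictly get an integer value, compared to `np.linalg.det()` which still returns a float. Also, do note I had to generate a different matrix $B$ with smaller entries for this example, since I kept running into overflow warnings with $A$. The intermediate values in the Bareiss Algorithm are themselves determinants of submatrices, and for $A$ they grow toward $|\det(A)| \approx 1.2 \times 10^{24}$, well past the `int64` limit of about $9.2 \times 10^{18}$. So the above implementation may not be the best for use in real world applications.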
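One possible way around the overflow, sketched here under the assumption that speed is not a concern, is to run the same recurrence on plain Python integers, which have arbitrary precision. The function name `bareiss_exact` is made up for this illustration, and it works on nested lists rather than NumPy arrays (a NumPy integer matrix can be passed in via `B.tolist()`).

```python
def bareiss_exact(rows):
    """Exact determinant of a square integer matrix given as nested lists.

    Uses Python's arbitrary-precision ints, so intermediate values
    cannot overflow the way fixed-width int64 values can.
    """
    M = [[int(x) for x in row] for row in rows]  # copy and convert to int
    n = len(M)
    if any(len(row) != n for row in M):
        raise ValueError("Matrix is not square")

    sign = 1
    prev_pivot = 1  # divisor for the first column is 1 (no division)
    for k in range(n - 1):
        if M[k][k] == 0:
            # Look below the diagonal for a row to swap in
            for l in range(k + 1, n):
                if M[l][k] != 0:
                    M[k], M[l] = M[l], M[k]  # safe for lists of rows
                    sign = -sign
                    break
            else:
                return 0  # whole column is zero: singular matrix
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                # Bareiss update; the division by the previous pivot is exact
                M[i][j] = (M[k][k] * M[i][j] - M[i][k] * M[k][j]) // prev_pivot
        prev_pivot = M[k][k]
    return sign * M[n - 1][n - 1]


# A determinant far beyond the int64 range still comes out exact
big = [[10**10, 0, 0],
       [0, 10**10, 0],
       [0, 0, 10**10]]
print(bareiss_exact(big) == 10**30)
```

The trade-off is that Python-level loops over big integers are much slower than vectorized NumPy arithmetic, so this is more of a correctness reference than a practical implementation.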

## Citations

Bareiss, Erwin H. (1968), "Sylvester's Identity and multistep integer-preserving Gaussian elimination" (PDF), Mathematics of Computation, 22 (103): 565–578, https://doi.org/10.1090/S0025-5718-1968-0226829-0, JSTOR 2004533

## Reference Links

https://math.stackexchange.com/questions/3222706/singular-value-decomposition-determinant

MIT 18.06 Linear Algebra, Spring 2005, By Gilbert Strang, Lecture 18:
https://youtu.be/srxexLishgY?si=W7L50FNFTUrN8m7k

MIT 18.06 Linear Algebra, Spring 2005, By Gilbert Strang, Lecture 19:
https://youtu.be/23LLB9mNJvc?si=2ebB6UGuUosCOjRG

The Bareiss paper can also be found here (but I think it'll take you to the same place as if you were to use the DOI):
https://www.ams.org/journals/mcom/1968-22-103/S0025-5718-1968-0226829-0/

https://en.wikipedia.org/wiki/LU_decomposition

https://stackoverflow.com/questions/27003062/fastest-algorithm-for-computing-the-determinant-of-a-matrix

https://en.wikipedia.org/wiki/Bareiss_algorithm

https://cs.stackexchange.com/questions/124759/determinant-calculation-bareiss-vs-gauss-algorithm