# Determinants

One of the first things that most students learn about in linear algebra is the [determinant](https://en.wikipedia.org/wiki/Determinant) of a matrix.  Lots of useful formulas for 2×2 and 3×3 matrices can be expressed in terms of determinants, and determinants played a central role in linear algebra 100 years ago when most matrices were tiny.

Nowadays, determinants are much less useful as a practical tool, although they still occasionally show up. Determinant-related formulas are also useful in proving theorems in linear algebra.   The basic computational problem, however, is that the determinant formulas don't scale — for a big matrix, there is almost always a better way of computing something than using explicit determinants, cofactors, [Cramer's rule](https://en.wikipedia.org/wiki/Cramer's_rule), and other tricks useful for small matrices.

Still, it is important to know what determinants are, and their basic properties.  In 18.06, we mainly use determinants as a *conceptual* tool to help us understand eigenvalues via the [characteristic polynomial](https://en.wikipedia.org/wiki/Characteristic_polynomial) — although, again, this is not a practical *computational* tool for eigenvalues, which are nowadays computed by very different methods.

## Explicit formulas, high-school version

In high school, you may have learned some explicit formulas for determinants of small matrices.

These formulas quickly become computationally useless for larger matrices (although there is still a theoretical formula that we'll give at the end), but it is nice to see a few of them.

The computer is much better at writing them down as the matrices get larger.   We'll use the [Symbolics.jl package](https://github.com/JuliaSymbolics/Symbolics.jl) for symbolic algebra in Julia to write out some determinant formulas with symbols, not numbers:

In [1]:
using Symbolics, LinearAlgebra

We can easily define a $3 \times 3$ matrix of symbolic variables in $a_{i,j}$ format and take its determinant:

In [2]:
@variables a[1:3, 1:3]
A = collect(a)

3×3 Matrix{Num}:
 a[1, 1]  a[1, 2]  a[1, 3]
 a[2, 1]  a[2, 2]  a[2, 3]
 a[3, 1]  a[3, 2]  a[3, 3]

In [3]:
expand(det(A))

a[1, 2]*a[2, 3]*a[3, 1] + a[1, 1]*a[2, 2]*a[3, 3] + a[1, 3]*a[2, 1]*a[3, 2] - a[1, 3]*a[2, 2]*a[3, 1] - a[1, 2]*a[2, 1]*a[3, 3] - a[1, 1]*a[2, 3]*a[3, 2]

… but these $a_{i,j}$ variables can be a little hard to read.  Let's define some more colorful symbols:

In [4]:
@variables a b c d 🍎 🍌 🍪 🥟 α β γ δ ♣ ♡ ♠ ♢

A = [a  b
     🍎 🍌]

2×2 Matrix{Num}:
 a  b
 🍎  🍌

Here is the determinant of a $2\times 2$ matrix:

In [5]:
det(A)

a*🍌 - b*🍎

and here is $3 \times 3$:

In [6]:
A = [a  b  c
     🍎 🍌 🍪
     α  β  γ]

3×3 Matrix{Num}:
 a  b  c
 🍎  🍌  🍪
 α  β  γ

In [7]:
expand(det(A))

a*γ*🍌 + c*β*🍎 + b*α*🍪 - a*β*🍪 - b*γ*🍎 - c*α*🍌

Notice that the terms in the determinant contain **exactly one value from each row and one from each column**.

This pattern continues for $4\times 4$, which is rapidly getting messier:

In [8]:
A = [ a  b  c  d
      🍎 🍌 🍪  🥟
      α  β  γ  δ
      ♣  ♡  ♠  ♢]

4×4 Matrix{Num}:
 a  b  c  d
 🍎  🍌  🍪  🥟
 α  β  γ  δ
 ♣  ♡  ♠  ♢

In [9]:
expand(det(A))

d*α*♠*🍌 + c*α*♡*🥟 + a*γ*♢*🍌 + a*β*♠*🥟 + b*α*♢*🍪 + c*β*♢*🍎 + d*β*♣*🍪 + b*γ*♣*🥟 + b*δ*♠*🍎 + d*γ*♡*🍎 + c*δ*♣*🍌 + a*δ*♡*🍪 - c*δ*♡*🍎 - d*γ*♣*🍌 - b*α*♠*🥟 - b*δ*♣*🍪 - d*α*♡*🍪 - d*β*♠*🍎 - a*β*♢*🍪 - a*γ*♡*🥟 - b*γ*♢*🍎 - a*δ*♠*🍌 - c*α*♢*🍌 - c*β*♣*🥟

By $n=5$ these formulas have gotten ridiculous:

In [10]:
@variables a[1:5,1:5]
A = collect(a)

5×5 Matrix{Num}:
 a[1, 1]  a[1, 2]  a[1, 3]  a[1, 4]  a[1, 5]
 a[2, 1]  a[2, 2]  a[2, 3]  a[2, 4]  a[2, 5]
 a[3, 1]  a[3, 2]  a[3, 3]  a[3, 4]  a[3, 5]
 a[4, 1]  a[4, 2]  a[4, 3]  a[4, 4]  a[4, 5]
 a[5, 1]  a[5, 2]  a[5, 3]  a[5, 4]  a[5, 5]

In [11]:
expand(det(A))

a[1, 3]*a[2, 1]*a[3, 2]*a[4, 4]*a[5, 5] + a[1, 2]*a[2, 4]*a[3, 5]*a[4, 3]*a[5, 1] + a[1, 4]*a[2, 1]*a[3, 5]*a[4, 3]*a[5, 2] + a[1, 3]*a[2, 4]*a[3, 2]*a[4, 5]*a[5, 1] + a[1, 4]*a[2, 5]*a[3, 2]*a[4, 3]*a[5, 1] + a[1, 5]*a[2, 4]*a[3, 1]*a[4, 3]*a[5, 2] + a[1, 1]*a[2, 5]*a[3, 4]*a[4, 3]*a[5, 2] + a[1, 1]*a[2, 2]*a[3, 4]*a[4, 5]*a[5, 3] + a[1, 4]*a[2, 2]*a[3, 1]*a[4, 3]*a[5, 5] + a[1, 5]*a[2, 2]*a[3, 3]*a[4, 1]*a[5, 4] + a[1, 4]*a[2, 2]*a[3, 5]*a[4, 1]*a[5, 3] + a[1, 1]*a[2, 2]*a[3, 3]*a[4, 4]*a[5, 5] + a[1, 3]*a[2, 1]*a[3, 4]*a[4, 5]*a[5, 2] + a[1, 2]*a[2, 3]*a[3, 5]*a[4, 1]*a[5, 4] + a[1, 5]*a[2, 1]*a[3, 2]*a[4, 3]*a[5, 4] + a[1, 2]*a[2, 5]*a[3, 1]*a[4, 3]*a[5, 4] + a[1, 5]*a[2, 1]*a[3, 4]*a[4, 2]*a[5, 3] + a[1, 5]*a[2, 1]*a[3, 3]*a[4, 4]*a[5, 2] + a[1, 2]*a[2, 1]*a[3, 3]*a[4, 5]*a[5, 4] + a[1, 2]*a[2, 1]*a[3, 5]*a[4, 4]*a[5, 3] + a[1, 5]*a[2, 3]*a[3, 1]*a[4, 2]*a[5, 4] + a[1, 1]*a[2, 4]*a[3, 3]*a[4, 5]*a[5, 2] + a[1, 3]*a[2, 2]*a[3, 1]*a[4, 5]*a[5, 4] + a[1, 4]*a[2, 3]*a[3, 5]*a[4, 2]*a[

In fact, the number of terms in these formulas **increases faster than exponentially** with $n$, as we shall see.  If we want to use determinants at all, we want a better way to think about them (and compute them) if possible.
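To get a feel for this growth, here is a small sketch (not part of the original notebook) that tabulates $n!$, the number of terms in the expanded $n \times n$ formula (one per permutation, as explained at the end of these notes), next to the exponential $2^n$ for comparison. The helper name `nterms` is made up for illustration:

```julia
# Number of terms in the expanded n×n determinant formula (n!, one per permutation),
# compared with the exponential 2^n.
nterms(n) = factorial(n)   # `nterms` is just an illustrative helper name

for n in 2:10
    println("n = ", n, ":  ", nterms(n), " terms,  2^n = ", 2^n)
end
```

For $n = 3$ and $n = 4$ this gives 6 and 24 terms, matching the expansions printed above.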

In [12]:
@variables a # redefine a as an ordinary scalar variable again

1-element Vector{Num}:
 a

# Expectation: Singular = Zero determinant

The property that most students learn about determinants of 2×2 and 3×3 matrices is this: **given a square matrix A, the determinant det(A) is some number that is zero if and only if the matrix is singular**.

For example, the following matrix is not singular, and its determinant (`det(A)` in Julia) is nonzero:

In [13]:
A = [1 3
     2 4]
det(A)

-2.0

(You may even remember the formula for the 2×2 determinant: $1 \times 4 - 3 \times 2 = -2$.)

But this matrix is singular (the second column is twice the first), and so its determinant is zero:

In [14]:
A = [1 2
     2 4]
det(A)

0.0

* By the way, many authors, including Strang's book, use the abbreviated notation $|A| = \det A$.  I won't use this notation here, mainly because I don't think the determinant is important enough anymore to deserve its own punctuation. Anyway, $|A|$ looks too much like an absolute value, even though the determinant can have any sign.  

## A lucky guess for the determinant

In 18.06, we now have another way to check whether a matrix is singular: perform Gaussian elimination, and then **check whether any pivots (diagonal entries of U) are zero**.

But this gives us an obvious way to construct a single determinant-like number: **just multiply the pivots together**, and the result will be zero if and only if the matrix is singular.

In fact, this intuition turns out to be *almost* exactly the right guess:

* The **determinant is ± the product of the pivots**, with a minus sign if elimination involved an *odd* number of row swaps and a plus sign if there were an *even* number of swaps (including zero swaps).

We can check it for a random matrix:

In [15]:
A = randn(5,5)
det(A)

4.225997606151304

In [16]:
L,U = lu(A, NoPivot()) # LU without row swaps
U

5×5 Matrix{Float64}:
 -0.767285  -0.229022   0.124838   0.427701   0.35248
  0.0       -0.749302   1.04241    2.72733    0.929833
  0.0        0.0       -0.567867  -2.57839   -1.36299
  0.0        0.0        0.0        7.0929     7.13283
  0.0        0.0        0.0        0.0       -1.82493

In [17]:
prod(diag(U)) # the product of the diagonal elements of U

4.225997606151305

Note that this matches `det(A)` (up to roundoff errors in the last few digits).

This immediately gives you a hint of why the determinant is not such a useful computational tool as you might have thought:

* The most efficient way to compute a determinant, in general, is to do Gaussian elimination and then multiply the pivots together.

* Once you have done elimination, you *already* know whether the matrix is singular and you can *already* solve $Ax=b$ efficiently, so the determinant is mostly superfluous.

We'll discuss some actual determinant applications later.

Although we *could* use the "product of the pivots" as the definition of the determinant (at least for matrices), it is more typical to **build up the definition of the determinant from more basic properties**, and to get the product of the pivots as a *consequence*.  We will do that now.

# Defining properties of the determinant

The following three properties are actually sufficient to *uniquely define* the determinant of any matrix, and are taken from [Strang's Introduction to Linear Algebra](http://math.mit.edu/~gs/linearalgebra/), section 5.1.

Therefore, we don't *derive* these properties: they are [axioms](https://en.wikipedia.org/wiki/Axiom) that serve to define the determinant operation.

## 1. det(I) = 1

It is clear that the identity matrix $I$ is not singular, and all its pivots are 1.  A reasonable starting point for defining determinants, therefore, is to require:

* $\det I = 1$ for any $m \times m$ identity matrix I (any $m$).

For example:

In [18]:
I₅ = I(5) * 1

5×5 Diagonal{Int64, Vector{Int64}}:
 1  ⋅  ⋅  ⋅  ⋅
 ⋅  1  ⋅  ⋅  ⋅
 ⋅  ⋅  1  ⋅  ⋅
 ⋅  ⋅  ⋅  1  ⋅
 ⋅  ⋅  ⋅  ⋅  1

In [19]:
det(I₅)

1

## 2. Sign flips under row exchange

The second key property is:

* If you **swap two rows** in a matrix, the **determinant flips sign**.

It's easy to see this for the high-school $2\times 2$ matrix formula:

In [20]:
det([a  b
     🍎 🍌])

a*🍌 - b*🍎

In [21]:
det([🍎 🍌
     a  b])

b*🍎 - a*🍌

Or for the identity matrix, where swapping the first two rows gives a determinant $-1$:

In [22]:
I₅_swapped = I₅[ [2,1,3,4,5], : ]

5×5 Matrix{Int64}:
 0  1  0  0  0
 1  0  0  0  0
 0  0  1  0  0
 0  0  0  1  0
 0  0  0  0  1

In [23]:
det(I₅_swapped)

-1.0

As another example, let's try it with a random $5 \times 5$ matrix $A$:

In [24]:
A = rand(-3:3, 5,5)

5×5 Matrix{Int64}:
  1  -2  -1   1  -1
  1   1   1   3   3
 -1   2  -1   3   3
  2  -2  -1   2   1
 -2   0  -2  -2   0

Swapping the first two rows gives the matrix $B$:

In [25]:
B = A[ [2,1,3,4,5], : ]

5×5 Matrix{Int64}:
  1   1   1   3   3
  1  -2  -1   1  -1
 -1   2  -1   3   3
  2  -2  -1   2   1
 -2   0  -2  -2   0

Hence the determinants are equal and opposite:

In [26]:
det(A), det(B)

(24.0, -24.0)

(Up to roundoff errors, of course.)

## 3. Linearity in any individual row

The determinant will *not* be a linear operation on the whole matrix: in general, $\det(A+B) \ne \det A + \det B$!  (For example, for the $2\times 2$ identity, $\det(I+I) = 4$ but $\det I + \det I = 2$.)  But we *would* like it to be linear with respect to **operations on individual rows**.

This means two things:

### Scaling rows

* If we **multiply a row by a scalar α**, then the **determinant multiplies by α**.

This axiom actually makes a lot of sense if you think about the example of the identity matrix.  Multiplying the first row of $I$ by $\alpha$ leads to the matrix:

$$
\begin{pmatrix}
\alpha & 0 & 0 & 0 & \cdots \\
     0 & 1 & 0 & 0 & \cdots \\
     0 & 0 & 1 & 0 & \cdots \\
     0 & 0 & 0 & 1 & \cdots \\
     \vdots & \vdots & \vdots & \vdots & \ddots \\
\end{pmatrix}
$$

The determinant of this matrix is exactly $\alpha$!  As $\alpha \to 0$, this matrix becomes singular, and the determinant goes to zero at the same rate.  It is also consistent with our "product of the pivots" intuitive guess above, because the pivots here are $(\alpha, 1, 1, \cdots)$.


We can also try this with our random matrix $A$ from above.  Let's multiply the second row by 2:

In [27]:
C = copy(A)
C[2,:] = 2*A[2,:]
C

5×5 Matrix{Int64}:
  1  -2  -1   1  -1
  2   2   2   6   6
 -1   2  -1   3   3
  2  -2  -1   2   1
 -2   0  -2  -2   0

In [28]:
det(A), det(C)

(24.0, 48.0)

As expected, the determinant doubles.

As a consequence of this, if we multiply an *entire* $m\times m$ matrix $A$ by $\alpha$, we obtain:

* $\det(\alpha A) = \alpha^m \det A$

This is *not* an axiom, it is a *consequence* of the axiom above: we pick up a factor of $\alpha$ for each row that we scale.

For our $5 \times 5$ matrix $A$, this means that $\det(2A) = 2^5 \det A = 32 \det A$:

In [29]:
det(2A) / det(A)

32.0

If we think back to our high-school formulas, this is consistent: the determinant terms each had exactly **one factor from each row**.  So, if we multiply any row by $\alpha$, then the determinant is multiplied by $\alpha$, and if we multiply *all* of the rows by $\alpha$ then we get a factor of $\alpha^m$.

### Adding a row vector to a row

There is a second property of linearity, corresponding to vector addition:

* If we **add a row vector $r$** to a row of $A$, then the determinant becomes $\det(A) + \det(A')$, where $A'$ is the matrix with that row **replaced by** $r$ (with **other rows unchanged**).

This is easier to explain with an example:

$$
\det \begin{pmatrix} a + a' & b + b' \\ c & d \end{pmatrix} =
\det \begin{pmatrix} a  & b  \\ c & d \end{pmatrix} +
\det \begin{pmatrix} a' & b' \\ c & d \end{pmatrix} \; .
$$

Or, in terms of our matrix $A$ from above, let's add $(1,2,3,4,5)$ to the first row:

In [30]:
A

5×5 Matrix{Int64}:
  1  -2  -1   1  -1
  1   1   1   3   3
 -1   2  -1   3   3
  2  -2  -1   2   1
 -2   0  -2  -2   0

In [31]:
[1,0,0,0,0] * [1 2 3 4 5] # = column * row = outer product

5×5 Matrix{Int64}:
 1  2  3  4  5
 0  0  0  0  0
 0  0  0  0  0
 0  0  0  0  0
 0  0  0  0  0

In [32]:
det(A + [1,0,0,0,0] * [1 2 3 4 5])

54.0

This should be the same as $\det A$ plus the determinant of $A$ with the first row replaced by $(1,2,3,4,5)$:

In [33]:
A′ = copy(A)
A′[1,:] = [1,2,3,4,5] # replace first row
A′

5×5 Matrix{Int64}:
  1   2   3   4  5
  1   1   1   3  3
 -1   2  -1   3  3
  2  -2  -1   2  1
 -2   0  -2  -2  0

In [34]:
det(A) + det(A′)

54.0

Yup, it matches (up to roundoff errors, of course).

# Additional properties of determinants

The following properties can be **derived from the above 3**, and are quite useful to know.  Again, the numbering follows Strang, section 5.1:

## 4. If two rows are equal, det = 0

It's easy to see why this **follows from property 2**: if we swap two equal rows, the matrix doesn't change, but the determinant must flip sign.  But this means:

$$\det A = -\det A \implies \det A = 0$$

For example:

In [35]:
det([ 1 2 3 
      4 5 6
      1 2 3 ])

0.0

This property also makes sense if our expectation is that the determinant is zero for singular matrices: if two rows are equal, the matrix is singular.

## 5. Subtracting a multiple of one row from another doesn’t change det

Suppose we take a matrix $A$, and subtract (or add) a multiple of one row from another.  For example:

$$
\det \begin{pmatrix} a & b \\ c - \alpha a & d - \alpha b \end{pmatrix} =
\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} -
\alpha \det \begin{pmatrix} a & b \\ a & b \end{pmatrix} =
\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} + 0
$$

Here, we applied axiom 3 (linearity), and then property 4 (repeated rows).

The same thing happens for *any* size of matrix.

But this is *precisely* the kind of operation that we perform during Gaussian elimination.  It has the crucial implications:

* **Elimination operations** on rows **don't change the determinant**.

* **Gaussian elimination without row swaps doesn't change the determinant**.

And, by axiom 2:

* **Gaussian elimination with row swaps** gives the **same determinant** but with **flipped sign for each row swap**.

For example:

In [36]:
L, U = lu(A, NoPivot()) # elimination without row swaps
U

5×5 Matrix{Float64}:
 1.0  -2.0  -1.0   1.0  -1.0
 0.0   3.0   2.0   2.0   4.0
 0.0   0.0  -2.0   4.0   2.0
 0.0   0.0   0.0  -2.0   2.22045e-16
 0.0   0.0   0.0   0.0   2.0

In [37]:
det(A), det(U)

(24.0, 23.999999999999993)
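The example above avoided row swaps. As a rough sketch of the row-swap case (the helper name `permutation_sign` and the small test matrix `M` are made up for illustration), we can use Julia's default pivoted `lu`, count the swaps encoded in its permutation vector `F.p`, and flip the sign of the product of the pivots accordingly:

```julia
using LinearAlgebra

# Sign of a permutation vector p of 1:n: sort p by swapping entries into place,
# flipping the sign once per swap.  (Hypothetical helper, not part of LinearAlgebra.)
function permutation_sign(p)
    p = copy(p)
    s = 1
    for i in eachindex(p)
        while p[i] != i
            j = p[i]
            p[i], p[j] = p[j], p[i]   # puts the value j into position j
            s = -s
        end
    end
    return s
end

M = [0.0 2.0 1.0
     1.0 1.0 0.0
     2.0 0.0 3.0]      # zero in the (1,1) position, so elimination needs a row swap

F = lu(M)              # PM = LU with partial pivoting; F.p is the row permutation
d = permutation_sign(F.p) * prod(diag(F.U))

println((d, det(M)))   # the two values should agree up to roundoff
```

Computing the sign from the permutation vector, rather than counting swaps as elimination proceeds, is just a convenience here: any sequence of swaps that produces the same permutation has the same parity.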

## 6. A matrix with a row of zeros has det = 0

This is easy to see from axiom 3 (linearity): if we multiply the row of zeros by zero, it doesn't change the matrix but multiplies the determinant by zero, hence:

$$
0 \times \det A = \det A \implies \det A = 0
$$

For example:

In [38]:
det([1 2 3
     4 5 6
     0 0 0])

0.0

## 7. If A is triangular then det(A) is the product of the diagonal entries

This is another incredibly useful property.  To see this, suppose we have an upper-triangular matrix $U$.  Then:

1. Eliminate "upward" above the pivots to get a diagonal matrix $D$.  This doesn't change the determinant by property 5.

2. Pull out each diagonal element by axiom 3 (linearity) until you get the identity matrix $I$ whose determinant is 1 by axiom 1:
$$
\det \begin{pmatrix} \alpha_1 & & & \\ & \alpha_2 & & \\ & & \alpha_3 & \\ & & & \ddots \end{pmatrix} =
\alpha_1 \det \begin{pmatrix} 1 & & & \\ & \alpha_2 & & \\ & & \alpha_3 & \\ & & & \ddots \end{pmatrix} = \cdots = \alpha_1 \alpha_2 \alpha_3 \cdots \det I = \alpha_1 \alpha_2 \alpha_3 \cdots
$$
which is precisely the product of the diagonals.

If we have a zero diagonal entry, we can't eliminate upward above it (we can't divide by the diagonal "pivot").  But in that case we end up with a row of zeros after eliminating above the *other* diagonals, and by property 6 we get a zero determinant.  So it still matches the product of the diagonal entries.

Similarly for a lower triangular matrix, except that we eliminate "downward".

We already saw an example of this earlier, but let's do it again.  We got our $U$ matrix from elimination on $A$:

In [39]:
U

5×5 Matrix{Float64}:
 1.0  -2.0  -1.0   1.0  -1.0
 0.0   3.0   2.0   2.0   4.0
 0.0   0.0  -2.0   4.0   2.0
 0.0   0.0   0.0  -2.0   2.22045e-16
 0.0   0.0   0.0   0.0   2.0

Its diagonal entries are:

In [40]:
diag(U)

5-element Vector{Float64}:
  1.0
  3.0
 -2.0
 -1.9999999999999998
  1.9999999999999996

The product of these is:

In [41]:
prod(diag(U))

23.999999999999993

which matches $\det U$ (and $\det A$):

In [42]:
det(U), det(A)

(23.999999999999993, 24.0)

If we *do* need to compute the determinant, this gives us a very practical way to do it: **compute det(A) by taking the product of the pivots after elimination, with a sign flip for every row swap**, i.e.

$$
\boxed{\det A = (-1)^\mbox{# row swaps} \times \mbox{(product of pivots)} } \, .
$$

This is, in fact, *exactly* what the Julia `det` function does, as you can check by looking at the source code:

In [43]:
@which det(A)

In [44]:
@which det(UpperTriangular(U))

From the source code, this calls `det(lu(A))`, which calls:

In [45]:
@which det(lu(A))

## 8. det(A) = 0 if and only if A is singular

This follows from property 7.  Since the determinant is ± the product of the pivots, we get zero if and only if there is a zero pivot, corresponding to a singular matrix.

## 9. det(AB) = det(A) det(B)

This is an amazing property of determinants, and probably the least obvious.

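A nice way to show this (from Strang's book) is simply to check that $\det(AB)/\det(B)$ **satisfies axioms 1,2,3 for A** (assuming $\det B \ne 0$; if $B$ is singular then so is $AB$, and both sides of the identity are zero).  If it does, then it must be $\det A$, and we are done! Let's check: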

1. Identity: If $A=I$, then $\det(AB)/\det(B) = \det(B)/\det(B) = 1$.  ✓
2. Swaps: If we swap two rows of $A$, we also swap the *same* two rows of $AB$, hence $\det(AB)/\det(B)$ flips sign.  ✓
3. Linearity:
  - Scaling a row of $A$ by $\alpha$ scales a row of $AB$ by $\alpha$, which scales $\det(AB)/\det(B)$ by $\alpha$. ✓
  - Adding a row of $A$ to a row of $A'$ (with other rows the same) adds the same rows of $AB$ and $A'B$, so it adds $\det(AB)/\det(B)$ and $\det(A'B)/\det(B)$. ✓

Let's try it:

In [46]:
B = rand(-3:3, 5,5)

5×5 Matrix{Int64}:
  3  2  -2  -2   3
 -3  3   2  -1  -3
 -1  0  -3  -1  -3
 -2  2   1   1   3
  0  3   0   1   1

In [47]:
det(A), det(B)

(24.0, 700.0)

In [48]:
det(A*B), det(A)*det(B)

(16799.999999999996, 16800.0)

### Matrix inverses

This rule has important consequences for matrix inverses.  First:

$$\det (A^{-1}) = 1 / \det(A)$$

Proof: $1 = \det(I) = \det(A A^{-1}) = \det(A) \det(A^{-1})$.

For example:

In [49]:
det(inv(A)), 1/det(A)

(0.04166666666666666, 0.041666666666666664)

Recall from last lecture that $X A X^{-1}$ corresponds simply to a **change of basis** from $A$.  (We will later call this a **similar matrix** to $A$).  Now we know:

$$
\det(X A X^{-1}) = \det(X) \det(A) \det(X^{-1}) = \det(A) \; .
$$

That is, a **change of basis doesn't change the determinant**.
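As a quick numerical sanity check of this invariance, here is a sketch with small made-up matrices `M` and `X`, chosen so that `X` is invertible:

```julia
using LinearAlgebra

M = [ 1 -2 -1
      1  1  1
     -1  2 -1]

X = [2 1 0
     1 1 0
     0 0 1]            # det(X) = 1, so X is invertible

# X*M*inv(X) represents the same linear map in a different basis
println((det(M), det(X * M * inv(X))))   # should agree up to roundoff
```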

## 10. det(Aᵀ) = det(A)

This is another non-obvious, but very important, property of determinants.  It is relatively easy to see from properties 7 and 9, however.

In particular, factorize $PA = LU$, or $A = P^T L U \implies A^T = U^T L^T P$.  Then, from property 9:

$$
\det(A^T) = \det(U^T)  \det(L^T) \det(P)  = \det(U^T) \det(P) = \det(P) \times \mbox{(product of pivots)} \; ,
$$

where we have used the fact that $\det L^T = 1$ since $L^T$ is *upper* triangular and the diagonal entries of $L^T$ are all 1's, while $\det U^T$ is the determinant of a *lower* triangular matrix with the pivots on the diagonal.

But we also have that the permutation $P$ is formed by taking the identity matrix $I$ and swapping rows (once for each row swap during elimination), so

$$
\det P = (-1)^\mbox{# row swaps}
$$

So:

$$
\det(A^T) = (-1)^\mbox{# row swaps} \times \mbox{(product of pivots)}
$$

which is exactly the same as $\det A$ from earlier.

In [50]:
det(A), det(A')

(24.0, 23.99999999999999)


# Useful applications of determinants

Ignoring formulas (e.g. Cramer's rule, a formula for $A^{-1}$ — see Strang, section 5.3) that are mainly useful for tiny matrices, here are some examples of real usages of determinants **even for large matrices**:

* Understanding **eigenvalues**: determinants will turn eigenvalues into polynomial roots, and since we know about polynomial roots, that tells us a lot about eigenvalues.  (This is *not* how eigenvalues are *computed* in practice, however!)

 - There is also something called a [nonlinear eigenproblem](https://en.wikipedia.org/wiki/Nonlinear_eigenproblem), arising in many science and engineering problems, in which the determinant plays a basic conceptual role.  Again, however, computational methods typically avoid computing determinants explicitly except for tiny matrices.

* Proofs: Determinants show up in a lot of proofs in matrix theory, because they reduce matrices to numbers that have nice properties and are easy to reason about.  One also often sees things like the [adjugate matrix](https://en.wikipedia.org/wiki/Adjugate_matrix) and the [Cayley–Hamilton theorem](https://en.wikipedia.org/wiki/Cayley%E2%80%93Hamilton_theorem), both related to determinants.

 - That is, we often use determinants to help us understand and derive things in linear algebra, even if the final result doesn't require us to *compute* the determinant for any practical purpose.

* [Jacobian factors](https://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant): in multivariable calculus, a factor of $|\det J|$ arises when you perform a *change of variables* in integration, where $J$ is a Jacobian matrix.

 - The reason a determinant arises here is that, more generally, **|det(A)| is the volume of a parallelepiped** ("box") whose edges are given by the columns of $A$ (the sign of det(A) records the orientation of those edges).
 
 - Integration may sound like something that only happens in a few dimensions (= tiny matrices J), but extremely high dimensional (even infinite-dimensional) integrals appear in statistics, quantum field theory, bioinformatics, and other fields.

* High-dimensional [Gaussian integrals](https://en.wikipedia.org/wiki/Gaussian_integral) often arise in **statistics** and related areas of science (e.g. [quantum field theory](https://en.wikipedia.org/wiki/Common_integrals_in_quantum_field_theory)), and the inverse of the square root of a determinant appears in the answer.  Often, one wants the logarithm of the result, in which case what arises is the **log determinant** $\log \det A$, an important matrix function.

This is no doubt an incomplete list.  Nevertheless, although determinants are a much more marginal topic in modern linear algebra than they were in the 19th century, they have hardly disappeared.
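Two of these items are easy to illustrate numerically. The following sketch (with made-up example matrices `P` and `D`) first checks that $|\det|$ gives the area of a parallelogram in 2d, and then shows one reason people often work with the log determinant for large matrices: the determinant itself can overflow floating-point arithmetic even when its logarithm is a perfectly ordinary number. It uses the `logdet` function from Julia's `LinearAlgebra` standard library.

```julia
using LinearAlgebra

# Parallelogram with edge vectors (3,0) and (1,2), stored as the columns of P:
# base 3, height 2, so the area is 6.
P = [3 1
     0 2]
println(abs(det(P)))

# The 400×400 matrix 10I has determinant 10^400, which exceeds the largest Float64
# (about 1.8e308), but its log determinant is just 400*log(10).
D = Matrix(10.0I, 400, 400)
println(det(D))                      # overflows to Inf
println((logdet(D), 400 * log(10)))  # these should agree up to roundoff
```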

# A “Simple” But Horrible Formula

You probably learned a neat formula for the determinant of a $2\times2$ matrix at some point:

$$
\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc \; .
$$

You might have even learned a formula for $3\times3$ matrices.  You might be hoping, therefore, that there would be an extension of this "nice" formula (which seems a lot easier than doing elimination to get pivots) to arbitrary matrices.  There is!

Here it is (see Strang, section 5.2):

$$
\det A = \sum_{\mbox{permutations }p} \operatorname{sign}(p) \times (\mbox{product of diagonals of }A\mbox{ with columns permuted by }p)
$$

The important thing to know is that you have to consider **all permutations (re-orderings)** of $(1,2,3,\ldots,n)$.  (The [sign of the permutation](https://en.wikipedia.org/wiki/Parity_of_a_permutation) is $+1$ if the permutation can be built from an even number of swaps and $-1$ if it takes an odd number.) There are $n! = n (n-1)(n-2)\cdots 1$ (*n factorial*) re-orderings.

That means that this formula requires $\sim n \times n!$ scalar operations, which is **worse than exponential** in $n$.  This is **far more expensive than elimination** ($\sim n^3$), making this formula **computationally useless** for $n > 3$.
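To make the formula concrete, here is a deliberately naive sketch of it in Julia. The helper names (`allperms`, `permsign`, `permdet`) and the test matrix `N` are made up for illustration; the sign is computed by counting inversions, which has the same parity as the number of swaps. This is only sensible for tiny matrices, since it visits all $n!$ permutations:

```julia
using LinearAlgebra

# All permutations of the vector v, built recursively (there are n! of them).
function allperms(v)
    length(v) <= 1 && return [copy(v)]
    result = Vector{typeof(v)}()
    for i in eachindex(v)
        rest = vcat(v[1:i-1], v[i+1:end])    # v with its i-th entry removed
        for q in allperms(rest)
            push!(result, vcat(v[i], q))
        end
    end
    return result
end

# Sign of a permutation p of 1:n: +1 for an even number of inversions, -1 for odd.
function permsign(p)
    n = length(p)
    inversions = count(p[i] > p[j] for i in 1:n for j in i+1:n)
    return iseven(inversions) ? 1 : -1
end

# Permutation formula: sum over p of sign(p) * A[1,p[1]] * A[2,p[2]] * ... * A[n,p[n]].
function permdet(A)
    n = size(A, 1)
    return sum(permsign(p) * prod(A[i, p[i]] for i in 1:n)
               for p in allperms(collect(1:n)))
end

println(permdet([1 3; 2 4]))        # 1*4 - 3*2 = -2

N = [2 0 1
     1 3 2
     1 1 4]
println((permdet(N), det(N)))       # should agree up to roundoff
```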

(There is also *another* computationally useless formula involving [minors and cofactors](https://en.wikipedia.org/wiki/Minor_(linear_algebra)); see Strang, section 5.2.)

The permutation formula is still sometimes useful *conceptually*, however.