<h1>LU Factorization</h1>

<p>Return to the task of solving a system of equations:</p>

&#36;~
\begin&#123;align&#125;
a_&#123;11&#125; x_1 &#43; a_&#123;12&#125;x_2 &#43; \cdots &#43; a_&#123;1n&#125; x_n &amp;&#61; b_1\\
a_&#123;21&#125; x_1 &#43; a_&#123;22&#125;x_2 &#43; \cdots &#43; a_&#123;2n&#125; x_n &amp;&#61; b_2\\
&amp;\vdots\\
a_&#123;m1&#125; x_1 &#43; a_&#123;m2&#125;x_2 &#43; \cdots &#43; a_&#123;mn&#125; x_n &amp;&#61; b_m
\end&#123;align&#125;
~&#36;

<p>We wrote this as &#36;Ax&#61;b&#36;.</p>

<h2>Easy to solve systems</h2>

<p>If &#36;A&#36; is a diagonal matrix, the system reduces to:</p>

&#36;~
\begin&#123;align&#125;
a_&#123;11&#125;x_1 &amp;&#61; b_1\\
a_&#123;22&#125;x_2 &amp;&#61; b_2\\
&amp; \vdots\\
a_&#123;nn&#125;x_n &amp;&#61; b_n
\end&#123;align&#125;
~&#36;

<p>This has <em>easy</em> solutions: if &#36;a_&#123;ii&#125; \neq 0&#36;, then &#36;x_i &#61; b_i/a_&#123;ii&#125;&#36;.</p>

<p>If an &#36;a_&#123;ii&#125; &#61; 0&#36;, then the determinant of &#36;A&#36; is &#36;0&#36;, and there is not a unique solution.</p>
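
<p>A minimal Julia sketch of the diagonal case, assuming Julia 1.x with the standard <code>LinearAlgebra</code> library. The matrix and right-hand side are made-up illustration values:</p>

```julia
using LinearAlgebra

A = [2 0 0; 0 4 0; 0 0 5]      # a diagonal matrix (illustrative values)
b = [2, 8, 15]

d = diag(A)                     # the diagonal entries a_ii
all(d .!= 0) || error("a zero on the diagonal: no unique solution")
x = b ./ d                      # x_i = b_i / a_ii
```

<p>Each unknown is found independently of the others, which is what makes this case easy.</p>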

<h3>Lower triangular matrices</h3>

<p>If &#36;A&#36; is <em>lower triangular</em>, that is, &#36;a_&#123;ij&#125; &#61; 0&#36; if &#36;j &gt; i&#36;, or:</p>

&#36;~
A &#61; \left&#40;
\begin&#123;array&#125;&#123;cccc&#125;
a_&#123;11&#125; &amp; 0 &amp; \cdots &amp; 0\\
a_&#123;21&#125; &amp; a_&#123;22&#125; &amp; \cdots &amp; 0\\
 &amp; \vdots &amp;  &amp; \\
a_&#123;n1&#125; &amp; a_&#123;n2&#125; &amp; \cdots &amp; a_&#123;nn&#125;\\
\end&#123;array&#125;
\right&#41;
~&#36;

<p>Then we can solve recursively:</p>

<ul>
<li>First, we solve &#36;a_&#123;11&#125; x_1 &#61; b_1&#36; with &#36;x_1 &#61; b_1 / a_&#123;11&#125;&#36;.</li>
<li>Next we solve &#36;a_&#123;21&#125;x_1 &#43; a_&#123;22&#125;x_2 &#61; b_2&#36; by first substituting in our just-solved &#36;x_1&#36;, and then solving:</li>
</ul>

&#36;~
a_&#123;21&#125;x_1 &#43; a_&#123;22&#125;x_2 &#61; b_2
~&#36;

&#36;~
a_&#123;21&#125; &#40; b_1 / a_&#123;11&#125;&#41;&#43; a_&#123;22&#125;x_2 &#61; b_2
~&#36;

&#36;~
x_2 &#61; &#40;b_2 - a_&#123;21&#125;&#40;b_1/a_&#123;11&#125;&#41;&#41; / a_&#123;22&#125;
~&#36;

<ul>
<li>Repeat. In general we have:</li>
</ul>

&#36;~
x_i &#61; &#40;b_i - \sum_&#123;j&#61;1&#125;^&#123;i-1&#125; a_&#123;ij&#125; x_j&#41; \cdot \frac&#123;1&#125;&#123;a_&#123;ii&#125;&#125;
~&#36;

<p>It is important that &#36;a_&#123;ii&#125; \neq 0&#36;, as otherwise we would be dividing by zero. But this will be the case if &#36;\det&#40;A&#41; \neq 0&#36;. &#40;Why?&#41;</p>

<p>This method is called <em>forward substitution</em>.</p>
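
<p>The general formula can be sketched as a short Julia function. This is an illustration only: <code>forward_sub</code> is a hypothetical helper name, not a library function, and the test system is made up so that the solution is &#36;&#91;1, 2, 3&#93;&#36;.</p>

```julia
# Forward substitution for a lower triangular system L*x = b.
# `forward_sub` is a hypothetical helper, written for illustration.
function forward_sub(L, b)
    n = length(b)
    x = zeros(Float64, n)
    for i in 1:n
        L[i, i] == 0 && error("zero on the diagonal at row $i")
        s = 0.0
        for j in 1:i-1
            s += L[i, j] * x[j]      # uses the already-solved x_1, ..., x_{i-1}
        end
        x[i] = (b[i] - s) / L[i, i]
    end
    return x
end

L = [2 0 0; 1 3 0; 4 5 6]            # illustrative lower triangular matrix
b = [2, 7, 32]
x = forward_sub(L, b)
```

<p>Row &#36;i&#36; costs about &#36;i&#36; multiplications, so the whole solve takes on the order of &#36;n^2&#36; operations.</p>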

<h3>Upper triangular matrices</h3>

<p>A matrix &#36;U&#36; is <em>upper triangular</em> if &#36;u_&#123;ij&#125; &#61; 0&#36; whenever &#36;i &gt; j&#36;. For example:</p>

&#36;~
A &#61; \left&#40;
\begin&#123;array&#125;&#123;ccccc&#125;
a_&#123;11&#125; &amp; a_&#123;12&#125; &amp; \cdots &amp; a_&#123;1, n-1&#125; &amp; a_&#123;1n&#125;\\
0 &amp; a_&#123;22&#125; &amp; \cdots &amp; a_&#123;2,n-1&#125; &amp;a_&#123;2n&#125;\\
&amp;&amp;\vdots&amp;&amp;\\
0 &amp; 0 &amp;\cdots &amp; a_&#123;n-1,n-1&#125; &amp; a_&#123;n-1,n&#125;\\
0 &amp; 0 &amp;\cdots &amp; 0 &amp; a_&#123;nn&#125;\\
\end&#123;array&#125;
\right&#41;
~&#36;

<p>Then solving &#36;Ax&#61;b&#36; can be done by working <em>backwards</em>:</p>

&#36;~
x_n &#61; b_n / a_&#123;nn&#125;
~&#36;

<p>and from here:</p>

&#36;~
x_&#123;n-1&#125; &#61; &#40;b_&#123;n-1&#125; - a_&#123;n-1, n&#125;x_n&#41;/a_&#123;n-1, n-1&#125; &#61; &#40;b_&#123;n-1&#125; - a_&#123;n-1, n&#125; \cdot b_n / a_&#123;nn&#125;&#41;/a_&#123;n-1, n-1&#125;
~&#36;

<p>And in general</p>

&#36;~
x_i &#61; &#40;b_i - \sum_&#123;j&#61;i&#43;1&#125;^n a_&#123;ij&#125; x_j&#41; / a_&#123;ii&#125;.
~&#36;

<p>Again, we need &#36;a_&#123;ii&#125; \neq 0&#36;.</p>
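
<p>Back substitution mirrors the forward version, running from the last row up. A sketch in Julia, where <code>back_sub</code> is a hypothetical helper name and the test system is made up so that the solution is &#36;&#91;1, 2, 3&#93;&#36;:</p>

```julia
# Back substitution for an upper triangular system U*x = b.
# `back_sub` is a hypothetical helper, written for illustration.
function back_sub(U, b)
    n = length(b)
    x = zeros(Float64, n)
    for i in n:-1:1
        U[i, i] == 0 && error("zero on the diagonal at row $i")
        s = 0.0
        for j in i+1:n
            s += U[i, j] * x[j]      # uses the already-solved x_{i+1}, ..., x_n
        end
        x[i] = (b[i] - s) / U[i, i]
    end
    return x
end

U = [2 1 1; 0 3 2; 0 0 4]            # illustrative upper triangular matrix
b = [7, 12, 12]
x = back_sub(U, b)
```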

<h2>Permuted L or U matrices</h2>

<p>Consider</p>

&#36;~
A &#61; \left&#40;
\begin&#123;array&#125;&#123;ccc&#125;
1 &amp; 2 &amp; 3\\
0 &amp; 0 &amp; 4\\
0 &amp; 5 &amp; 6
\end&#123;array&#125;
\right&#41;
~&#36;

<p>Clearly, if we swapped rows 2 and 3 this would be upper triangular, so we could solve easily: first solve row 2 for &#36;x_3&#36;, then use that to solve row 3 for &#36;x_2&#36;, and finally solve row 1 for &#36;x_1&#36;.</p>

<p>Define &#36;p &#61; &#91;p_1, p_2, \cdots, p_n&#93;&#36; to be a permutation vector if the mapping &#36;i \rightarrow p_i&#36; maps the set &#36;\&#123;1, \dots, n\&#125;&#36; to itself in a bijective manner <em>and</em> the matrix &#36;&#40;\alpha_&#123;ij&#125;&#41; &#61; &#40;a_&#123;p_i j&#125;&#41;&#36; is either upper or lower triangular.</p>

<p>&#40;In the above we would have &#36;p_1 &#61; 1, p_2 &#61; 3&#36;, and &#36;p_3 &#61; 2&#36;.&#41;</p>

<p>Then clearly we could solve the permuted system of equations. For example, in the case that we end up lower triangular, so that forward substitution works, we would have:</p>

&#36;~
x_i &#61; &#40;b_&#123;p_i&#125; -  \sum_&#123;j&#61;1&#125;^&#123;i-1&#125; a_&#123;p_i j&#125; x_j&#41; / a_&#123;p_i i&#125;.
~&#36;
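
<p>For the &#36;3 \times 3&#36; example, indexing with the permutation vector reorders the rows, after which a triangular solve applies. A sketch assuming Julia 1.x and its <code>LinearAlgebra</code> library; the right-hand side is an illustrative choice that makes the solution &#36;&#91;1, 1, 1&#93;&#36;:</p>

```julia
using LinearAlgebra

A = [1 2 3; 0 0 4; 0 5 6]
p = [1, 3, 2]                     # permutation vector from the text
b = [6, 4, 11]                    # illustrative right-hand side

Ap = A[p, :]                      # rows reordered: [1 2 3; 0 5 6; 0 0 4]
istriu(Ap)                        # true: the permuted matrix is upper triangular

# The right-hand side must be permuted the same way as the rows.
x = UpperTriangular(float(Ap)) \ float(b[p])
```

<p>No data needs to move in practice: the formula above reads row &#36;p_i&#36; of the original matrix directly.</p>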

<h2>Why bother?</h2>

<p>Suppose we knew that &#36;A&#61;LU&#36;. Then we can solve &#36;Ax &#61; b&#36; easily:</p>

<ul>
<li>Solve the equation &#36;Ly &#61; b&#36; for &#36;y&#36;.</li>
<li>But &#36;y &#61; Ux&#36;, so we solve &#36;Ux &#61; y&#36; for &#36;x&#36;.</li>
</ul>

<p>So if we can <em>factorize</em> &#36;A &#61; LU&#36;, we can easily solve &#36;Ax &#61; b&#36;.</p>
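
<p>A sketch of the two-step solve, assuming Julia 1.x and its <code>LinearAlgebra</code> library. The factors and right-hand side are illustrative values chosen so that the solution is &#36;&#91;1, 2, 3&#93;&#36;:</p>

```julia
using LinearAlgebra

L = [1.0 0 0; 1 1 0; 1 1 1]       # illustrative lower triangular factor
U = [1.0 1 1; 0 1 1; 0 0 1]       # illustrative upper triangular factor
A = L * U                         # equals [1 1 1; 1 2 2; 1 2 3]
b = [6.0, 11.0, 14.0]

y = LowerTriangular(L) \ b        # step 1: forward substitution, L*y = b
x = UpperTriangular(U) \ y        # step 2: back substitution, U*x = y
```

<p>Once the factors are known, each new right-hand side costs only two triangular solves.</p>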

<h3>Example</h3>

<p>In Julia &#40;and MATLAB&#41; there is a built-in solver for these problems:</p>

In [None]:
U = [1 2; 0 1]

2x2 Array{Int64,2}:
 1  2
 0  1

In [None]:
b = [1, 3]
x = U \ b

2-element Array{Float64,1}:
 -5.0
  3.0

In [None]:
U*x - b

2-element Array{Float64,1}:
 0.0
 0.0

<p>In fact, there are many different methods, chosen according to the type and structure of the matrix. For example, with rationals:</p>

In [None]:
U = [1//1 2; 0 1]
U \ b

2-element Array{Rational{Int64},1}:
 -5//1
  3//1

<p>There are special methods for many others...</p>

<h2>Can we find LU for a given A?</h2>

<p>Suppose &#36;A &#61; LU&#36;, then we have:</p>

&#36;~
a_&#123;ij&#125; &#61; l_&#123;i1&#125; u_&#123;1j&#125; &#43; l_&#123;i2&#125; u_&#123;2j&#125; &#43; \cdots &#43; l_&#123;in&#125; u_&#123;nj&#125;
~&#36;

<p>But:</p>

<ul>
<li>lower triangular means &#36;l_&#123;ij&#125; &#61; 0&#36; if &#36;j &gt; i&#36;</li>
</ul>

<ul>
<li>upper triangular means &#36;u_&#123;ij&#125; &#61; 0&#36; if &#36;j &lt; i&#36;.</li>
</ul>

<p>So</p>

&#36;~
a_&#123;ij&#125; &#61; \sum_&#123;s &#61; 1&#125;^&#123;\min&#40;i, j&#41;&#125; l_&#123;is&#125; u_&#123;sj&#125;.
~&#36;

<p>Now to show that we can find &#36;LU &#61; A&#36;, we proceed by induction. Suppose we know the first &#36;k-1&#36; columns of &#36;L&#36; and the first &#36;k-1&#36; rows of &#36;U&#36;. We then have:</p>

&#36;~
a_&#123;kk&#125; &#61; \sum_&#123;s&#61;1&#125;^&#123;\min&#40;k,k&#41;&#125; l_&#123;ks&#125;u_&#123;sk&#125; &#61; \sum_&#123;s&#61;1&#125;^&#123;k-1&#125; l_&#123;ks&#125;u_&#123;sk&#125; &#43; l_&#123;kk&#125; u_&#123;kk&#125;.
~&#36;

<p>The first part of the right-hand sum involves columns of &#36;L&#36; for which &#36;s &lt; k&#36; and rows of &#36;U&#36; for which &#36;s &lt; k&#36;, so all of its values are known by our assumption. So if &#36;l_&#123;kk&#125;&#36; is known &#40;say, assumed to be 1 or some other non-zero value&#41; we can solve for &#36;u_&#123;kk&#125;&#36; in terms of known values. To be explicit:</p>

&#36;~
u_&#123;kk&#125; &#61; &#40;a_&#123;kk&#125; - \sum_&#123;s&#61;1&#125;^&#123;k-1&#125; l_&#123;ks&#125;u_&#123;sk&#125; &#41; / l_&#123;kk&#125;.
~&#36;

<p>Then to fill out the &#36;k&#36;th row of &#36;U&#36;, we consider, for &#36;j &gt; k&#36; &#40;for which &#36;\min&#40;j,k&#41; &#61; k&#36;&#41;:</p>

&#36;~
a_&#123;kj&#125; &#61; \sum_&#123;s&#61;1&#125;^&#123;k-1&#125; l_&#123;ks&#125; u_&#123;sj&#125; &#43; l_&#123;kk&#125; u_&#123;kj&#125;
~&#36;

<p>The sum is of known values and &#36;l_&#123;kk&#125;&#36; is known, so for each &#36;j&#36;, as specified, we can solve for &#36;u_&#123;kj&#125;&#36;.</p>

<p>Similarly, for the &#36;k&#36;th column of &#36;L&#36;, we consider, for &#36;j &gt; k&#36;:</p>

&#36;~
a_&#123;jk&#125; &#61; \sum_&#123;s&#61;1&#125;^&#123;k-1&#125; l_&#123;js&#125; u_&#123;sk&#125; &#43; l_&#123;jk&#125; u_&#123;kk&#125;
~&#36;

<p>As before, the sum is known, and here so is &#36;u_&#123;kk&#125;&#36;, so provided &#36;u_&#123;kk&#125; \neq 0&#36; we can solve for &#36;l_&#123;jk&#125;&#36; when &#36;j &gt; k&#36;. That is, we can fill in the &#36;k&#36;th column.</p>
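
<p>The fill-in order of the induction can be sketched as a Julia function. This is an illustration under the choice &#36;l_&#123;kk&#125; &#61; 1&#36;, with no row swaps, so it stops when some &#36;u_&#123;kk&#125; &#61; 0&#36;. The name <code>doolittle_lu</code> is hypothetical and the test matrix is made up.</p>

```julia
# LU factorization following the induction above, with l_kk = 1.
# `doolittle_lu` is a hypothetical helper; it does no pivoting.
function doolittle_lu(A)
    n = size(A, 1)
    L = zeros(n, n)
    U = zeros(n, n)
    for k in 1:n
        L[k, k] = 1.0                      # the chosen diagonal value of L
        for j in k:n                       # row k of U (includes u_kk)
            U[k, j] = (A[k, j] - sum(L[k, 1:k-1] .* U[1:k-1, j])) / L[k, k]
        end
        U[k, k] == 0 && error("zero pivot at step $k: no LU without row swaps")
        for j in k+1:n                     # column k of L, below the diagonal
            L[j, k] = (A[j, k] - sum(L[j, 1:k-1] .* U[1:k-1, k])) / U[k, k]
        end
    end
    return L, U
end

A = [2 1 1; 4 3 3; 8 7 9]                  # illustrative matrix
L, U = doolittle_lu(A)
```

<p>For this matrix the factors come out as &#36;L &#61; &#91;1\ 0\ 0;\ 2\ 1\ 0;\ 4\ 3\ 1&#93;&#36; and &#36;U &#61; &#91;2\ 1\ 1;\ 0\ 1\ 1;\ 0\ 0\ 2&#93;&#36;.</p>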

<h3>Special cases</h3>

<ul>
<li>If we always were to take &#36;l_&#123;ii&#125; &#61; 1&#36; we get Doolittle&#39;s factorization</li>
<li>If we always were to take &#36;u_&#123;ii&#125; &#61; 1&#36; we get Crout&#39;s factorization</li>
<li>If we take &#36;u_&#123;ii&#125; &#61; l_&#123;ii&#125;&#36; we get Cholesky&#39;s factorization.</li>
</ul>

<h3>Example</h3>

<p>Let&#39;s look at this matrix</p>

In [None]:
A = [1 1 1; 1 2 2; 1 2 3]

3x3 Array{Int64,2}:
 1  1  1
 1  2  2
 1  2  3

<p>We need to fill in &#36;U&#36; and &#36;L&#36;. We start with zeros:</p>

In [None]:
L = zero(A)
U = zero(A)

3x3 Array{Int64,2}:
 0  0  0
 0  0  0
 0  0  0

<p>Now we fill in: we have &#36;1 &#61; a_&#123;11&#125; &#61; l_&#123;11&#125; u_&#123;11&#125;&#36; so we can take each to be 1:</p>

In [None]:
L[1,1] = 1
U[1,1] = 1

1

<p>Then for &#36;U&#36; we need to fill in &#36;u_&#123;12&#125;&#36; and &#36;u_&#123;13&#125;&#36;. For these we have</p>

&#36;~
a_&#123;12&#125; &#61; 1 &#61; &#40;0&#41; &#43;  l_&#123;11&#125; u_&#123;12&#125; &#61; 1 u_&#123;12&#125;, \quad
a_&#123;13&#125; &#61; 1 &#61; &#40;0&#41; &#43; l_&#123;11&#125; u_&#123;13&#125; &#61; 1 u_&#123;13&#125;
~&#36;

<p>So both are 1:</p>

In [None]:
U[1,2] = 1
U[1,3] = 1

1

<p>And for the first column of &#36;L&#36; we have:</p>

&#36;~
a_&#123;21&#125; &#61; 1 &#61; &#40;0&#41; &#43; l_&#123;21&#125;u_&#123;11&#125; &#61; l_&#123;21&#125;, \quad
a_&#123;31&#125; &#61; 1 &#61; &#40;0&#41; &#43; l_&#123;31&#125;u_&#123;11&#125; &#61; l_&#123;31&#125;
~&#36;

<p>So ditto:</p>

In [None]:
L[2,1] = 1
L[3,1] = 1

1

<p>Moving on to &#36;k&#61;2&#36; gives first the diagonal terms:</p>

&#36;~
a_&#123;22&#125; &#61; &#40;l_&#123;21&#125; u_&#123;12&#125;&#41; &#43; l_&#123;22&#125; u_&#123;22&#125;, \quad\text&#123;or&#125;\quad
2 &#61; &#40;1 \cdot 1&#41; &#43; l_&#123;22&#125; u_&#123;22&#125;
~&#36;

<p>We can take both to be &#36;1&#36;:</p>

In [None]:
L[2,2] = 1
U[2,2] = 1

1

<p>And to fill in for &#36;j &gt; 2&#36;:</p>

&#36;~
a_&#123;23&#125; &#61; 2 &#61; &#40;l_&#123;21&#125; u_&#123;13&#125;&#41; &#43; l_&#123;22&#125;u_&#123;23&#125; &#61; &#40;1\cdot 1&#41; &#43; 1 u_&#123;23&#125;,
~&#36;

<p>So &#36;u_&#123;23&#125; &#61; 1&#36;</p>

In [None]:
U[2,3] = 1

1

<p>And from</p>

&#36;~
a_&#123;32&#125; &#61; 2 &#61; &#40;l_&#123;31&#125; u_&#123;12&#125;&#41; &#43; l_&#123;32&#125; u_&#123;22&#125; &#61; &#40;1 \cdot 1&#41; &#43; l_&#123;32&#125; \cdot 1
~&#36;

<p>So &#36;l_&#123;32&#125; &#61; 1&#36;:</p>

In [None]:
L[3,2] = 1

1

<p>Finally, we need to find the last diagonal terms:</p>

&#36;~
a_&#123;33&#125; &#61; 3 &#61; &#40;l_&#123;31&#125;u_&#123;13&#125; &#43; l_&#123;32&#125;u_&#123;23&#125;&#41; &#43; l_&#123;33&#125; u_&#123;33&#125; &#61; &#40;1\cdot 1 &#43; 1\cdot 1&#41; &#43;  l_&#123;33&#125; u_&#123;33&#125;
~&#36;

<p>So we can take each to be &#36;1&#36;:</p>

In [None]:
L[3,3] = 1
U[3,3] = 1

1

In [None]:
L

3x3 Array{Int64,2}:
 1  0  0
 1  1  0
 1  1  1

<p>and</p>

In [None]:
U

3x3 Array{Int64,2}:
 1  1  1
 0  1  1
 0  0  1

<p>And we verify:</p>

In [None]:
A - L*U

3x3 Array{Int64,2}:
 0  0  0
 0  0  0
 0  0  0

<h3>Optimized versions</h3>

<p>There are built-in functions for these:</p>

In [None]:
L, U, p =  lu(A)

(
3x3 Array{Float64,2}:
 1.0  0.0  0.0
 1.0  1.0  0.0
 1.0  1.0  1.0,

3x3 Array{Float64,2}:
 1.0  1.0  1.0
 0.0  1.0  1.0
 0.0  0.0  1.0,

[1,2,3])

<p>We already verified that &#36;LU &#61; A&#36; for this &#36;A&#36;. The <code>p</code> is a permutation vector. In general, we should verify:</p>

In [None]:
A[p,:]  -  L * U

3x3 Array{Float64,2}:
 0.0  0.0  0.0
 0.0  0.0  0.0
 0.0  0.0  0.0
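
<p>A matrix that actually needs a row swap shows what <code>p</code> is for. The sketch below assumes Julia 1.x, where <code>lu</code> returns a factorization object with fields <code>L</code>, <code>U</code> and <code>p</code>; the matrix <code>B</code> is a made-up example.</p>

```julia
using LinearAlgebra

B = [0.0 1.0; 1.0 1.0]     # b_11 = 0, so a row swap is needed
F = lu(B)                  # LU with partial (row) pivoting
F.p                        # the permutation vector, here [2, 1]
B[F.p, :] ≈ F.L * F.U      # the permuted rows factor as L*U
```

<p>Here the rows are exchanged, so it is &#36;B&#36; with its rows reordered by <code>p</code> that equals &#36;LU&#36;, not &#36;B&#36; itself.</p>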

<h2>When do we know this will work?</h2>

<p>Define the &#36;k&#36;th leading principal minor of &#36;A&#36; to be the submatrix &#36;a_&#123;ij&#125;&#36; for &#36;1 \leq i,j \leq k&#36;. Call this &#36;A_k&#36;.</p>

<blockquote>
<p>Thm: If &#36;A&#36; is &#36;n \times n&#36; and all &#36;n&#36; leading principal minors are non-singular, then &#36;A&#36; has an LU decomposition.</p>
</blockquote>

<p>Proof. Suppose by induction this is true for step &#36;k-1&#36;. Then &#36;A_&#123;k-1&#125;&#36; can be factored: &#36;A_&#123;k-1&#125; &#61; L_&#123;k-1&#125; U_&#123;k-1&#125;&#36;.</p>

<p>We wish to find &#36;L_k&#36; and &#36;U_k&#36; which are extensions and satisfy &#36;A_k &#61; L_k U_k&#36;.</p>

<p>Consider the case &#36;1 \leq i \leq k-1&#36; and</p>

&#36;~
a_&#123;ik&#125; &#61; \sum_&#123;s &#61; 1&#125;^&#123;k-1&#125; l_&#123;is&#125;u_&#123;sk&#125;
~&#36;

<p>We know the &#36;a_&#123;ik&#125;&#36;, and the &#36;l&#36;&#39;s involved are from &#36;L_&#123;k-1&#125;&#36;. The &#36;u&#36;&#39;s we don&#39;t know &#40;yet&#41;, but this is just a system of the form &#36;b &#61; L_&#123;k-1&#125; x&#36;, and as &#36;L_&#123;k-1&#125;&#36; is non-singular we can solve for the &#36;u&#36;&#39;s. So we can fill out the new column of &#36;U_k&#36; above the diagonal.</p>

<p>Similarly, for &#36;1 \leq j \leq k-1&#36; we have:</p>

&#36;~
a_&#123;kj&#125; &#61; \sum_&#123;s&#61;1&#125;^&#123;k-1&#125; l_&#123;ks&#125; u_&#123;sj&#125;
~&#36;

<p>This is of the form &#36;b &#61; U_&#123;k-1&#125;^T x&#36;, with &#36;U_&#123;k-1&#125;&#36; non-singular, so it can be solved to fill out the new row of &#36;L_k&#36; left of the diagonal.</p>

<p>Finally, we need to solve for &#36;l_&#123;kk&#125;&#36; and &#36;u_&#123;kk&#125;&#36;, but we have:</p>

&#36;~
a_&#123;kk&#125; &#61; \sum_&#123;s&#61;1&#125;^&#123;k-1&#125; l_&#123;ks&#125; u_&#123;sk&#125; &#43; l_&#123;kk&#125;u_&#123;kk&#125;
~&#36;

<p>If we take &#36;l_&#123;kk&#125; &#61; 1&#36;, then we can solve for &#36;u_&#123;kk&#125;&#36;, as the sum is now known. That is, we can fill out &#36;L_k&#36; and &#36;U_k&#36; with &#36;A_k &#61; L_k U_k&#36;, as desired.</p>
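
<p>The hypothesis of the theorem can be checked numerically by computing the determinant of each leading block. A sketch assuming Julia 1.x; <code>leading_minors</code> is a hypothetical helper name and <code>B</code> is a made-up counterexample:</p>

```julia
using LinearAlgebra

# Determinants of the leading principal submatrices A_1, ..., A_n.
# `leading_minors` is a hypothetical helper, written for illustration.
leading_minors(A) = [det(A[1:k, 1:k]) for k in 1:size(A, 1)]

A = [1 1 1; 1 2 2; 1 2 3]
B = [0 1; 1 1]

leading_minors(A)    # all non-zero, so the theorem applies to A
leading_minors(B)    # the first entry is 0, so the theorem does not apply to B
```

<p>The matrix &#36;B&#36; is non-singular, yet it has no LU factorization without a row swap: &#36;b_&#123;11&#125; &#61; l_&#123;11&#125;u_&#123;11&#125; &#61; 0&#36; would force &#36;L&#36; or &#36;U&#36; to be singular.</p>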

<h2>Cholesky factorization</h2>

<p>We know the transpose of a lower triangular matrix is upper and vice versa. This gives hope to a factorization of the form &#36;A &#61; L L^T&#36;, known as the Cholesky factorization. When is this possible?</p>

<blockquote>
<p>Thm: If &#36;A&#36; is real, symmetric and positive definite then it has a unique factorization &#36;A&#61;LL^T&#36; in which &#36;L&#36; has a positive diagonal.</p>
</blockquote>

<p>Pf: The equation &#36;Ax&#61;0&#36; has only the solution &#36;x&#61;0&#36;, as positive definite means &#36;x^T A x &gt; 0&#36; for non-zero &#36;x&#36;. By considering vectors of the form &#36;x &#61; &#91;x_1 x_2 \cdots x_k 0 0 \cdots 0&#93;&#36; we can see that each &#36;A_k&#36; is also positive definite, hence non-singular.</p>

<p>So by the last theorem &#36;A&#61; LU&#36; for some &#36;L&#36; and &#36;U&#36;. But &#36;A^T &#61; A&#36; so &#36;LU &#61; &#40;LU&#41;^T &#61; U^T L^T&#36;. Multiplying on the left by &#36;L^&#123;-1&#125;&#36; and on the right by &#36;&#40;L^T&#41;^&#123;-1&#125;&#36; gives</p>

&#36;~
\begin&#123;align&#125;
L^&#123;-1&#125; &#40;L U&#41; &#40;L^T&#41;^&#123;-1&#125; &amp;&#61;L^&#123;-1&#125; U^T L^T &#40;L^T&#41;^&#123;-1&#125;\\
U &#40;L^T&#41;^&#123;-1&#125; &amp;&#61; L^&#123;-1&#125; U^T.
\end&#123;align&#125;
~&#36;

<p>The left side is upper triangular and the right side lower triangular, hence both must equal a diagonal matrix &#36;D&#36;: &#36;D &#61; U &#40;L^T&#41;^&#123;-1&#125;&#36; and so &#36;L D &#61; &#40;LU&#41;&#40;L^T&#41;^&#123;-1&#125; &#61; A&#40;L^T&#41;^&#123;-1&#125;&#36;, giving &#36;A &#61; L D L^T&#36;.</p>

<p>If we can show that &#36;D&#36; has all positive diagonal terms, then we can define &#36;D^&#123;1/2&#125;&#36; by &#36;&#40;\sqrt&#123;d_&#123;ii&#125;&#125;&#41;&#36; and express &#36;A&#36; as &#36;&#40;LD^&#123;1/2&#125;&#41; &#40;LD^&#123;1/2&#125;&#41;^T&#36; which is what we want.</p>

<p>So, why do we know &#36;D&#36; has all positive diagonal terms? Because &#36;D&#36; is positive definite:</p>

<p>Take a non-zero &#36;x&#36;; then:</p>

&#36;~
\begin&#123;align&#125;
x^T D x &amp;&#61; x^T &#40;L^&#123;-1&#125;&#41; A &#40;L^T&#41;^&#123;-1&#125; x\\
&amp;&#61; &#40;x^T L^&#123;-1&#125;&#41; A &#40;&#40;L^T&#41;^&#123;-1&#125; x&#41;\\
&amp;&#61; &#40;&#40;L^&#123;-1&#125;&#41;^Tx&#41;^T A &#40;&#40;L^T&#41;^&#123;-1&#125;x&#41;\\
&amp;&#61; &#40;&#40;L^&#123;-1&#125;&#41;^Tx&#41;^T A &#40;&#40;L^&#123;-1&#125;&#41;^&#123;T&#125;x&#41;\\
&amp;&gt; 0.
\end&#123;align&#125;
~&#36;

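<p>The last line holds because &#36;A&#36; is positive definite and &#36;&#40;L^&#123;-1&#125;&#41;^Tx&#36; is non-zero. Taking &#36;x&#36; to be the &#36;i&#36;th standard basis vector then gives &#36;d_&#123;ii&#125; &gt; 0&#36;. The fact that we can swap the inverse and transpose of a matrix is something to prove.</p>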
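
<p>The steps of this proof can be traced numerically on a small case. In the sketch below the matrix and its LU factors &#40;with unit diagonal in &#36;L&#36;&#41; are illustrative values worked out by hand, and Julia 1.x with <code>LinearAlgebra</code> is assumed:</p>

```julia
using LinearAlgebra

A = [4.0 2.0; 2.0 3.0]               # symmetric positive definite (illustrative)
L = [1.0 0.0; 0.5 1.0]               # hand-computed LU factors of A
U = [4.0 2.0; 0.0 2.0]

D = U * inv(Matrix(L'))              # U (L^T)^{-1}, which should be diagonal
C = L * Diagonal(sqrt.(diag(D)))     # L D^{1/2}, the Cholesky factor

C * C' ≈ A                           # A = (L D^{1/2}) (L D^{1/2})^T
C ≈ Matrix(cholesky(A).L)            # agrees with the built-in factorization
```

<p>Here &#36;D&#36; has diagonal entries 4 and 2, both positive, as the argument predicts.</p>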

<h3>Proof take 2</h3>

<p>Here is an alternative <a href="http://www.math.iit.edu/~fass/477577_Chapter_7.pdf">proof</a>, perhaps more instructive. It requires a few facts about matrices which are symmetric and positive definite:</p>

<ul>
<li>If &#36;A&#36; is symmetric and positive definite, then &#36;a_&#123;11&#125; &gt; 0&#36;.</li>
<li>If &#36;A&#36; is, then any submatrix formed by removing row &#36;i&#36; and column &#36;i&#36; is too.</li>
<li>If &#36;A&#36; is and &#36;L&#36; has full rank, then &#36;LAL^T&#36; is as well.</li>
</ul>

<p>Suppose we have &#36;A&#36; as assumed. Then we can write &#36;A&#36; in the following way where &#36;a_&#123;11&#125; &gt; 0&#36;.</p>

&#36;~
A &#61; \left&#91;
\begin&#123;array&#125;&#123;cc&#125;
a_&#123;11&#125; &amp; w^T\\
w &amp; K
\end&#123;array&#125;
\right&#93;
~&#36;

<p>By the second fact above, &#36;K&#36; is symmetric and positive definite.</p>

<p>Now consider the following lower triangular matrix:</p>

&#36;~
L_1 &#61; \left&#91;
\begin&#123;array&#125;&#123;cc&#125;
\sqrt&#123;a_&#123;11&#125;&#125; &amp; 0\\
\frac&#123;w&#125;&#123;\sqrt&#123;a_&#123;11&#125;&#125;&#125; &amp; I
\end&#123;array&#125;
\right&#93;
~&#36;

<p>Then using block matrix multiplication we get this decomposition: &#36;A &#61; L_1 B_1 L_1^T&#36; where</p>

&#36;~
B_1&#61; \left&#91;
\begin&#123;array&#125;&#123;cc&#125;
I &amp; 0^T\\
0 &amp; K - \frac&#123;ww^T&#125;&#123;a_&#123;11&#125;&#125;
\end&#123;array&#125;
\right&#93;.
~&#36;

<hr />

<p>To see:</p>

In [None]:
using SymPy
w, K= symbols("w,  K", real=true)
w = [w]	 # a vector
a_11 = symbols("a_11", positive=true)
A = [a_11 w'; w K]
L = [sqrt(a_11) 0; w/sqrt(a_11) 1]
B_1 = [1 0;0 (K-w*w'/a_11)]

L*B_1*L'

2x2 Array{Any,2}:
 a_11  w
    w  K

<p>&#40;This isn&#39;t perfect, as <code>w</code> messes up the automatic conversion to a matrix of symbolic objects.&#41;</p>

<hr />

<p>Let &#36;A_1 &#61;  K - ww^T / a_&#123;11&#125;&#36;. Since &#36;A&#36; is positive definite and &#36;L_1&#36; has full rank &#40;why?&#41; it must be that &#36;B_1&#36; is positive definite. Hence, &#36;A_1&#36; is too. But both &#36;K&#36; and &#36;ww^T&#36; are symmetric, so &#36;A_1&#36; is also symmetric and positive definite.</p>

<p>So we can find &#36;M_2&#36;, &#36;B_2&#36;, such that &#36;A_1 &#61; M_2 B_2 M_2^T&#36; and &#36;B_2&#36; will have a symmetric, positive definite submatrix &#36;A_2&#36;. As written, &#36;M_2&#36; is &#36;&#40;n-1&#41; \times &#40;n-1&#41;&#36;. We embed this into</p>

&#36;~
L_2 &#61;  \left&#91;
\begin&#123;array&#125;&#123;cc&#125;
I &amp; 0^T\\
0 &amp; M_2
\end&#123;array&#125;
\right&#93;.
~&#36;

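<p>And then &#36;A &#61; L_1 L_2 \tilde&#123;B&#125;_2 L_2^T L_1^T&#36; where &#36;\tilde&#123;B&#125;_2&#36; is &#36;B_2&#36; embedded in the same way:</p>

&#36;~
\tilde&#123;B&#125;_2 &#61; \left&#91;
\begin&#123;array&#125;&#123;cc&#125;
I_2 &amp; 0^T\\
0   &amp; A_2
\end&#123;array&#125;
\right&#93;.
~&#36;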

<p>We see that, were this repeated, we would eventually get:</p>

&#36;~
A &#61; L_1 L_2 \cdots L_n \cdot I \cdot  L_n^T \cdots L_2^T L_1^T.
~&#36;

<p>Letting &#36;L &#61; L_1 L_2 \cdots L_n&#36; yields the result, &#36;A&#61;LL^T&#36;, after noting that the product of lower triangular matrices is lower triangular.</p>
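
<p>This second proof is constructive, and its steps translate directly into an algorithm: peel off &#36;\sqrt&#123;a_&#123;11&#125;&#125;&#36; and &#36;w/\sqrt&#123;a_&#123;11&#125;&#125;&#36;, replace &#36;K&#36; by &#36;K - ww^T/a_&#123;11&#125;&#36;, and repeat on the smaller block. A sketch assuming Julia 1.x; <code>chol_outer</code> is a hypothetical helper name and the test matrix is made up:</p>

```julia
using LinearAlgebra

# Cholesky factor built one column at a time, as in the proof above.
# `chol_outer` is a hypothetical helper, written for illustration.
function chol_outer(A)
    n = size(A, 1)
    S = Matrix{Float64}(A)               # working copy; S[k:n, k:n] is the current block
    L = zeros(n, n)
    for k in 1:n
        S[k, k] > 0 || error("matrix is not positive definite")
        L[k, k] = sqrt(S[k, k])          # sqrt(a_11) of the current block
        w = S[k+1:n, k]
        L[k+1:n, k] = w / L[k, k]        # the column w / sqrt(a_11)
        S[k+1:n, k+1:n] -= w * w' / S[k, k]   # K - w w^T / a_11
    end
    return L
end

A = [4.0 2 2; 2 5 3; 2 3 6]              # illustrative symmetric positive definite matrix
L = chol_outer(A)
```

<p>For this matrix the factor is &#36;L &#61; &#91;2\ 0\ 0;\ 1\ 2\ 0;\ 1\ 1\ 2&#93;&#36;.</p>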

<h2>Example</h2>

<p>This comes from statistics. Consider the <em>overdetermined</em> system:</p>

In [None]:
A = [1 2; 3 5; 4 7; 1 8]

4x2 Array{Int64,2}:
 1  2
 3  5
 4  7
 1  8

In [None]:
b = [1,2,3,4]

4-element Array{Int64,1}:
 1
 2
 3
 4

<p>The system &#36;Ax&#61;b&#36; has no solutions. However, this system will:</p>

&#36;~
&#40;A^T A&#41; x &#61; A^T b
~&#36;

<p>&#40;Assuming &#36;A^TA&#36; is non-singular, it is symmetric and positive definite.&#41;</p>

In [None]:
M = A' *A

2x2 Array{Int64,2}:
 27   53
 53  142

<p>So we can take the Cholesky decomposition:</p>

In [None]:
U = chol(M)   # chol returns the upper triangular factor, M = U'U (Julia 1.x: cholesky(M).U)
L = U'        # the lower triangular factor, M = L*L'

2x2 LowerTriangular{Float64,Array{Float64,2}}:
  5.19615  0.0    
 10.1999   6.16141

<p>So we can solve &#36;LL^Tx &#61; A^T b&#36;. First we solve for &#36;y&#36; in &#36;Ly&#61;A^Tb&#36; with:</p>

In [None]:
y = L \ (A'*b)

2-element Array{Float64,1}:
 4.42635
 3.22197

<p>And then solve &#36;L^Tx &#61; y&#36;:</p>

In [None]:
x = L' \ y

2-element Array{Float64,1}:
 -0.174634
  0.522927

<p>This &#36;x&#36; does not solve &#36;Ax&#61;b&#36; exactly &#40;no such solution exists&#41;, so the residual is not zero:</p>

In [None]:
A*x - b

4-element Array{Float64,1}:
 -0.12878   
  0.0907317 
 -0.0380488 
  0.00878049

<p>However, it has a property: it is the <code>x</code> that makes the norm of this residual smallest:</p>

In [None]:
round(norm(A*x - b), 4)

0.1623

In [None]:
sort([norm(A*rand(2) - b) for _ in 1:10])

10-element Array{Any,1}:
 2.06059
 2.44154
 2.69575
 2.89747
 3.06312
 3.28859
 3.8127 
 5.31242
 6.78086
 6.83102

<h3>Why?</h3>

<p>&#40;P279&#41; Suppose &#36;Ax&#61;b&#36; has &#36;A&#36; being &#36;m \times n&#36; with &#36;m &gt; n&#36; and &#36;rank&#40;A&#41; &#61; n&#36;. Then, this will typically have no solutions. In that case, what is sought is a best solution in the sense of minimizing &#36;\| b - Ax \|_2&#36;.</p>

<p>Now suppose &#36;x&#36; solves &#36;A^TAx&#61;A^Tb&#36;, and &#36;y&#36; is some other value. Then</p>

&#36;~
\begin&#123;align&#125;
\|b - Ay\|_2^2 &amp;&#61; \|b - Ax &#43; A&#40;x-y&#41;\|_2^2\\
&amp;&#61; &#40;b - Ax &#43; A&#40;x-y&#41;&#41;^T \cdot  &#40;b - Ax &#43; A&#40;x-y&#41;&#41;\\
&amp;&#61; &#40;b - Ax&#41;^T\cdot &#40;b-Ax&#41; &#43; &#40;b-Ax&#41;^T\cdot &#40;A&#40;x-y&#41;&#41; &#43; &#40;A&#40;x-y&#41;&#41;^T \cdot &#40;b-Ax&#41; &#43; &#40;A&#40;x-y&#41;&#41;^T \cdot &#40;A&#40;x-y&#41;&#41;\\
&amp;&#61; \| b - Ax \|_2^2 &#43; 0 &#43; 0 &#43; \|A&#40;x-y&#41;\|_2^2\\
&amp;\geq  \| b - Ax \|_2^2
\end&#123;align&#125;
~&#36;

<p>The cross terms vanish because &#36;Ax-b&#36; has a &#36;0&#36; dot product with vectors in the column space of &#36;A&#36; &#40;as &#36;A^T&#40;Ax-b&#41;&#61;0&#36;&#41;, and &#36;A&#40;x-y&#41;&#36; is in the column space of &#36;A&#36;. &#40;Any &#36;Az &#61; &#91;A_&#123;\cdot 1&#125; ; A_&#123;\cdot 2&#125; ; \cdots ; A_&#123;\cdot n&#125;&#93; \cdot z &#61; z_1A_&#123;\cdot 1&#125; &#43; z_2A_&#123;\cdot 2&#125; &#43; \cdots &#43; z_n A_&#123;\cdot n&#125;&#36;.&#41; So the cross terms are 0 and the result holds.</p>
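
<p>Both facts used in this argument can be checked numerically for the overdetermined system from the example. A sketch assuming Julia 1.x with <code>LinearAlgebra</code>; the perturbation <code>dx</code> is an arbitrary illustrative choice:</p>

```julia
using LinearAlgebra

A = [1 2; 3 5; 4 7; 1 8]
b = [1, 2, 3, 4]

x = (A' * A) \ (A' * b)          # solve the normal equations
r = A * x - b                    # residual at the least-squares solution

norm(A' * r)                     # essentially 0: r is orthogonal to the columns of A
x ≈ A \ b                        # backslash on a tall matrix also returns the least-squares solution

dx = [0.1, -0.1]                 # any other candidate y = x + dx does worse
norm(A * (x + dx) - b) > norm(r)
```

<p>In exact arithmetic the solution is &#36;x &#61; &#91;-179/1025,\ 536/1025&#93;&#36;, and the squared residual norm is &#36;27/1025&#36;.</p>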