# Elimination

+ This notebook is part of lecture 2 *Elimination with matrices* in the OCW MIT course 18.06 [1]
+ Created by me, Dr Juan H Klopper
    + Head of Acute Care Surgery
    + Groote Schuur Hospital
    + University Cape Town
    + <a href="mailto:juan.klopper@uct.ac.za">Email me with your thoughts, comments, suggestions and corrections</a> 
<a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/"><img alt="Creative Commons Licence" style="border-width:0" src="https://i.creativecommons.org/l/by-nc/4.0/88x31.png" /></a><br /><span xmlns:dct="http://purl.org/dc/terms/" href="http://purl.org/dc/dcmitype/InteractiveResource" property="dct:title" rel="dct:type">Linear Algebra OCW MIT18.06</span> <span xmlns:cc="http://creativecommons.org/ns#" property="cc:attributionName">IPython notebook [2] study notes by Dr Juan H Klopper</span> is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/">Creative Commons Attribution-NonCommercial 4.0 International License</a>.

+ [1] <a href="http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm">OCW MIT 18.06</a>
+ [2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June 2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org

In [1]:
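from IPython.display import HTML, Image  # public import path for the display helpers
css_file = 'style.css'  # stylesheet expected next to the notebook
with open(css_file, 'r') as f:  # close the file once it has been read
    styles = f.read()
HTML(styles)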

In [2]:
from sympy import init_printing, Matrix, symbols, eye, Rational
from warnings import filterwarnings

In [3]:
init_printing(use_latex = 'mathjax')
filterwarnings('ignore')

# Elimination

## A system of linear equations

+ Linear refers to the fact that each variable appears on its own, to the power 1, and not inside a transcendental function (such as a sine or an exponential)
+ A solution satisfies all of the equations at once
+ Consider the following linear set
$$ 1x+2y+1z=2 \\ 3x + 8y + 1z = 12 \\ 0x + 4y + 1z = 2 $$
+ A solution for *x*, *y*, and *z* could be as follows
$$ 1\left(2\right)+2\left(1\right)+1\left(-2\right)=2 \\ 3\left(2\right)+8\left(1\right)+1\left(-2\right)=12 \\ 0\left(2\right)+4\left(1\right)+1\left(-2\right)=2 $$
+ Since this is a set (of three) equations that share a solution for the variables, all left- and all right-hand sides can be manipulated in certain ways without changing that solution
    + We could simply exchange the order of the equations (here equations 2 and 3 have been exchanged; row exchange)
    $$ 1x+2y+1z=2 \\ 0x + 4y + 1z = 2 \\ 3x + 8y + 1z = 12 $$
    + We could multiply both the left- and right-hand side of one of the equations with a scalar (here I multiply the first equation by 2)
    $$  2x+4y+2z=4 \\ 3x + 8y + 1z = 12 \\ 0x + 4y + 1z = 2 $$
    + Lastly, we can subtract a constant multiple of one equation from another
        + This serves an excellent purpose, as I can eliminate one (or more) of the variables (give it a coefficient of 0)
        + Remember that we are trying to solve for all three equations and have three unknowns
        + We can most definitely struggle by doing this problem algebraically by substitution, but linear algebra makes it much easier
        + Here I have multiplied the first equation by 3 (both sides, so that we maintain the integrity of the equation) and subtracted the result from equation 2: the new left-hand side from the left-hand side of equation 2 and the new right-hand side from the right-hand side of equation 2
        + This is quite legitimate, as the left- and right-hand sides are equal (it is an equation after all) and so, when subtracting from equation 2, we are still doing the same thing to the left-hand side as to the right-hand side
        $$ 1x+2y+1z=2 \\ 0x + 2y - 2z = 6 \\ 0x + 4y + 1z = 2 $$
+ This has introduced a nice zero for me in the second equation
+ Let's go further and multiply equation 2 by 2 and subtract that from equation 3
$$ 1x+2y+1z=2 \\ 0x + 2y - 2z = 6 \\ 0x + 0y + 5z = -10 $$
+ Now the last equation is easy to solve for *z*
$$ z=-2 $$
+ Knowing this I can go back up to equation 2 and solve for *y*
$$ 2y-2(-2)=6 \\ y=1 $$
+ Finally up to equation 1
$$ x+2(1)+1(-2)=2 \\ x=2 $$
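
+ As a quick check, the values found by back substitution can be substituted into all three equations at once
+ The cell below is a minimal sketch using SymPy's `Eq` and `subs`; the names `equations`, `candidate`, `checks` and `all_satisfied` are illustrative only and not part of the lecture

```python
from sympy import symbols, Eq

x, y, z = symbols('x y z')

# The three original equations of the linear system
equations = [
    Eq(1*x + 2*y + 1*z, 2),
    Eq(3*x + 8*y + 1*z, 12),
    Eq(0*x + 4*y + 1*z, 2),
]

# The values found by back substitution (illustrative name)
candidate = {x: 2, y: 1, z: -2}

# Substituting numbers turns each equation into a true/false statement
checks = [eq.subs(candidate) for eq in equations]
all_satisfied = all(checks)
print(checks, all_satisfied)
```

+ A solution must satisfy every equation, which is why the check uses `all`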

+ We need not have gone straight for substitution; indeed, we could have tried to get zeros above all our leading (non-zero) coefficients
+ Let's just clean up equation three by multiplying out by &frac15;
$$ 1x+2y+1z=2 \\ 0x + 2y - 2z = 6 \\ 0x + 0y + 1z = -2 $$
+ Now we have to get rid of the -2*z* in equation 2, which we can do by multiplying equation 3 by -2 and subtracting the result from equation 2
$$ 1x+2y+1z=2 \\ 0x + 2y - 0z = 2 \\ 0x + 0y + 1z = -2 $$
+ Multiplying equation 2 by &frac12; gives us the following
$$ 1x+2y+1z=2 \\ 0x + 1y + 0z = 1 \\ 0x + 0y + 1z = -2 $$
+ Now we can do the same to get rid of the 1*z* in equation 1 (multiplying equation 3 by 1 and subtracting it from equation 1)
$$ 1x+2y+0z=4 \\ 0x + 1y + 0z = 1 \\ 0x + 0y + 1z = -2 $$
+ Now to get rid of the 2*y* in equation 1, which is above our leading 1*y* in equation 2
+ Simple enough, we multiply equation 2 by 2 and subtract that from equation 1
$$ 1x+0y+0z=2 \\ 0x + 1y + 0z = 1 \\ 0x + 0y + 1z = -2 $$
+ The solution is now clear for *x*, *y*, and *z*

+ We need not rewrite all of the variables all the time
+ We can simply write the coefficients
$$ \begin{bmatrix} 1&2&1&2\\3&8&1&12\\0&4&1&2 \end{bmatrix} $$
+ This is called the augmented matrix (right-hand side is added)
    + A matrix has rows and columns (attached in position to our algebraic equations above; we simply omit the variables)
+ The upper-left entry is called the first pivot
+ Our aim is to get everything below it to be a zero (as we did with the algebra)
+ We do exactly the same as we did above, which is multiply row 1 by 3 and subtract these new values from row 2
$$ \begin{bmatrix} 1&2&1&2\\0&2&-2&6\\0&4&1&2 \end{bmatrix} $$
+ Now 2 times row 2 subtracted from row 3
$$ \begin{bmatrix} 1&2&1&2\\0&2&-2&6\\0&0&5&-10 \end{bmatrix} $$
+ Multiply the last row with &frac15;
$$ \begin{bmatrix} 1&2&1&2\\0&2&-2&6\\0&0&1&-2 \end{bmatrix} $$
+ This shows 1*z* to equal -2
+ With this small matrix, it's easy to do back substitution as we did algebraically above
+ The first non-zero number in each row is the pivot (just like the upper-left entry)
+ The steps we have taken up to this point are called Gauss elimination and the form we end up with is row-echelon form
+ We could carry on and do the same sort of thing to get rid of all the non-zero entries above each pivot
+ This is called Gauss-Jordan elimination and the result is reduced row-echelon form (see the computer code below)
+ All of these steps are called elementary row operations
+ The only one we didn't do is row exchange
    + We reserve this so as not to have leading (in the pivot position) zeros
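
+ The row operations described above can be replayed one at a time on the augmented matrix
+ This is only a sketch: it assumes SymPy's slice assignment on a mutable `Matrix`, and the names `M` and `row_echelon` are illustrative

```python
from sympy import Matrix, Rational

# Augmented matrix of the system (illustrative name M)
M = Matrix([[1, 2, 1, 2], [3, 8, 1, 12], [0, 4, 1, 2]])

# Gauss elimination: create zeros below the pivots
M[1, :] = M[1, :] - 3 * M[0, :]   # row 2 minus 3 times row 1
M[2, :] = M[2, :] - 2 * M[1, :]   # row 3 minus 2 times row 2
row_echelon = M.copy()            # keep a copy of the row-echelon form

# Gauss-Jordan elimination: scale the pivots to 1 and clear above them
M[2, :] = Rational(1, 5) * M[2, :]  # row 3 times 1/5
M[1, :] = M[1, :] + 2 * M[2, :]     # remove the -2 above the last pivot
M[1, :] = Rational(1, 2) * M[1, :]  # row 2 times 1/2
M[0, :] = M[0, :] - M[2, :]         # remove the 1 above the last pivot
M[0, :] = M[0, :] - 2 * M[1, :]     # remove the 2 above the second pivot

print(row_echelon)
print(M)
```

+ Each line is one elementary row operation, in the same order as the algebraic steps earlier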

In [4]:
A_augmented = Matrix([[1, 2, 1, 2], [3, 8, 1, 12], [0, 4, 1, 2]])
A_augmented

⎡1  2  1  2 ⎤
⎢           ⎥
⎢3  8  1  12⎥
⎢           ⎥
⎣0  4  1  2 ⎦

+ We can ask python™ to simply get the augmented matrix in reduced row-echelon form and read off the solutions

In [5]:
A_augmented.rref() # The rref() method returns the reduced row-echelon form

⎛⎡1  0  0  2 ⎤           ⎞
⎜⎢           ⎥           ⎟
⎜⎢0  1  0  1 ⎥, (0, 1, 2)⎟
⎜⎢           ⎥           ⎟
⎝⎣0  0  1  -2⎦           ⎠

+ So row one reads as follows
$$ 1x + 0y + 0z = 2 \\ x=2 $$
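
+ Since `rref()` returns a pair (the reduced matrix and the pivot column indices), the solution can be read off programmatically from the last column
+ A minimal sketch, with `reduced`, `pivot_columns` and `solution` as illustrative names

```python
from sympy import Matrix

A_augmented = Matrix([[1, 2, 1, 2], [3, 8, 1, 12], [0, 4, 1, 2]])

# rref() returns the reduced row-echelon form and a tuple of pivot columns
reduced, pivot_columns = A_augmented.rref()

# The last column of the reduced augmented matrix holds x, y and z
solution = reduced[:, -1]
print(pivot_columns)
print(solution.T)
```

+ Reading the last column like this only makes sense when every variable column contains a pivot, as it does here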

## Elimination matrices

+ Two matrices can only be multiplied if, in order, the number of columns of the first equals the number of rows of the second
+ The number of rows is usually called *m* and the number of columns *n*
+ So, our augmented matrix above will be *m* &times; *n* = 3 &times; 4
+ Let's look at how matrices are multiplied by looking at two small matrices
$$ \begin{bmatrix} {a}_{11}&{a}_{12}\\{a}_{21}&{a}_{22} \end{bmatrix}\begin{bmatrix} {b}_{11}&{b}_{12}\\{b}_{21}&{b}_{22} \end{bmatrix} $$
+ The subscripts refer to row and column position, i.e. 21 means row 2 column 1
+ We see that we have a 2 &times; 2 matrix times a 2 &times; 2 matrix
    + The *inner* two values are the same (2 and 2), so this multiplication is allowed
    + The resultant matrix will have the size given by the *outer* two values (rows of the first matrix and columns of the second); here also 2 and 2
+ So let's look at position 11 (row 1 and column 1)
    + To get this we take the entries in row 1 of the first matrix and multiply them by the entries in the first column of the second matrix
    + We do this element by element and add the products together
    + The python code below shows you exactly how this is done

In [6]:
a11, a12, a21, a22, b11, b12, b21, b22 = symbols('a11 a12 a21 a22 b11 b12 b21 b22')

In [7]:
A = Matrix([[a11, a12], [a21, a22]])
B = Matrix([[b11, b12], [b21, b22]])
A, B

⎛⎡a₁₁  a₁₂⎤  ⎡b₁₁  b₁₂⎤⎞
⎜⎢        ⎥, ⎢        ⎥⎟
⎝⎣a₂₁  a₂₂⎦  ⎣b₂₁  b₂₂⎦⎠

In [8]:
A * B

⎡a₁₁⋅b₁₁ + a₁₂⋅b₂₁  a₁₁⋅b₁₂ + a₁₂⋅b₂₂⎤
⎢                                    ⎥
⎣a₂₁⋅b₁₁ + a₂₂⋅b₂₁  a₂₁⋅b₁₂ + a₂₂⋅b₂₂⎦
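
+ The row-times-column rule can also be written out by hand and compared with the built-in product
+ This is a sketch only; the helper `entry` and the names `left`, `right`, `by_hand` and `difference` are hypothetical and chosen for illustration

```python
from sympy import Matrix, symbols, ones

a11, a12, a21, a22, b11, b12, b21, b22 = symbols('a11 a12 a21 a22 b11 b12 b21 b22')
left = Matrix([[a11, a12], [a21, a22]])
right = Matrix([[b11, b12], [b21, b22]])

def entry(first, second, i, j):
    """Row i of `first` times column j of `second`, added element by element."""
    return sum(first[i, k] * second[k, j] for k in range(first.cols))

# Build every position of the product from the rule above
by_hand = Matrix([[entry(left, right, i, j) for j in range(2)] for i in range(2)])

# Subtracting the built-in product should leave only zeros
difference = (by_hand - left * right).expand()
print(by_hand)
print(difference)

# Sizes: (3 x 4) times (4 x 2) has equal inner values, so the result is 3 x 2
print((ones(3, 4) * ones(4, 2)).shape)
```

+ Indices in the code are zero-based, so position 11 in the text is `entry(left, right, 0, 0)`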

+ Let's constrain ourselves to the matrix of coefficients (this discards the right-hand side from the augmented matrix above)

In [9]:
A = Matrix([[1, 2, 1], [3, 8, 1], [0, 4, 1]]) # I reuse the variable name A from above,
# which overwrites its value in the computer memory
A # A 3 by 3 matrix, which we call square

⎡1  2  1⎤
⎢       ⎥
⎢3  8  1⎥
⎢       ⎥
⎣0  4  1⎦

+ The identity matrix is akin to the number 1, i.e. multiplying by it leaves everything unchanged
+ It has 1<sup>'s</sup> along what is called the main diagonal and 0<sup>'s</sup> everywhere else

In [10]:
I = eye(3) # Identity matrices are always square and the argument
# here is 3, so it is a 3 by 3 matrix
I # Note what the main diagonal is

⎡1  0  0⎤
⎢       ⎥
⎢0  1  0⎥
⎢       ⎥
⎣0  0  1⎦

+ Let's multiply this by A

In [11]:
I * A # Nothing will change

⎡1  2  1⎤
⎢       ⎥
⎢3  8  1⎥
⎢       ⎥
⎣0  4  1⎦

+ To get rid of the leading 3 in row 2 (because we want a zero under the pivot 1 in row 1), we multiplied row 1 by 3 and subtracted that from row 2
+ Interestingly enough, we can do something to the identity matrix so that, when it is multiplied by A, the result is the first step we have above
+ Since we are required to subtract 3 times the first row from the second (it's all about that 3 in row 2, column 1), we can do the following

In [12]:
E21 = Matrix([[1, 0, 0], [-3, 1, 0], [0, 0, 1]])
E21 # 21 because we are working on row 2, column 1

⎡1   0  0⎤
⎢        ⎥
⎢-3  1  0⎥
⎢        ⎥
⎣0   0  1⎦

+ That gives us the required 3 times row 1 and the negative shows that we subtract (add the negative)
+ It's a thing of beauty

In [13]:
E21 * A

⎡1  2  1 ⎤
⎢        ⎥
⎢0  2  -2⎥
⎢        ⎥
⎣0  4  1 ⎦

+ Just what we wanted
+ E21 is called the first elimination matrix
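
+ The same construction works for any position: start with the identity matrix and place the negative of the multiplier at the row and column to be eliminated
+ The helper `elimination_matrix` below is a hypothetical convenience function, not something SymPy provides; note that its indices are zero-based

```python
from sympy import Matrix, eye

def elimination_matrix(size, row, col, multiplier):
    """Identity matrix with -multiplier placed at (row, col), zero-based.

    Multiplying on the left by this matrix subtracts `multiplier`
    times row `col` from row `row`.
    """
    E = eye(size)
    E[row, col] = -multiplier
    return E

A = Matrix([[1, 2, 1], [3, 8, 1], [0, 4, 1]])

# Row 2, column 1 in the text is index (1, 0) here
E21_built = elimination_matrix(3, 1, 0, 3)
print(E21_built)
print(E21_built * A)
```

+ The multiplier is the entry to be eliminated divided by the pivot above it, here 3 divided by 1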

+ Let's do something to the identity matrix to get rid of the 4 in row 3, column 2
+ It would require 2 times row 2 subtracted from row 3
+ Look carefully at the positions

In [14]:
E32 = Matrix([[1, 0, 0], [0, 1, 0], [0, -2, 1]])
E32

⎡1  0   0⎤
⎢        ⎥
⎢0  1   0⎥
⎢        ⎥
⎣0  -2  1⎦

In [15]:
E32 * (E21 * A)

⎡1  2  1 ⎤
⎢        ⎥
⎢0  2  -2⎥
⎢        ⎥
⎣0  0  5 ⎦

+ Spot on!
+ We now have nice pivots (leading non-zeros), with nothing under them
+ As a tip, try not to get fractions involved
+ As far as the other two row operations are concerned, we can either exchange rows in the identity matrix or multiply the required row by a scalar constant
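
+ A short sketch of the other two row operations written as matrices
+ The names `P23` (rows 2 and 3 of the identity exchanged) and `D3` (row 3 of the identity multiplied by one fifth) are illustrative only

```python
from sympy import Matrix, Rational

A = Matrix([[1, 2, 1], [3, 8, 1], [0, 4, 1]])
U = Matrix([[1, 2, 1], [0, 2, -2], [0, 0, 5]])  # result of the elimination above

# Row exchange: the identity matrix with rows 2 and 3 exchanged
P23 = Matrix([[1, 0, 0], [0, 0, 1], [0, 1, 0]])
exchanged = P23 * A

# Scaling: the identity matrix with row 3 multiplied by 1/5
D3 = Matrix([[1, 0, 0], [0, 1, 0], [0, 0, Rational(1, 5)]])
scaled = D3 * U

print(exchanged)
print(scaled)
```

+ Multiplying on the left by `P23` exchanges rows 2 and 3 of A, and multiplying on the left by `D3` turns the last pivot into a 1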

+ Look at what happens when we multiply E32 and E21

In [16]:
L_inv = E32 * E21
L_inv

⎡1   0   0⎤
⎢         ⎥
⎢-3  1   0⎥
⎢         ⎥
⎣6   -2  1⎦

+ Later we'll call this matrix the inverse of L
+ It is in triangular form, in this case lower triangular (note all the zeros above the main diagonal)
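
+ The order of the factors matters, since matrix multiplication is in general not commutative
+ The sketch below compares the two orders; `first_order` and `second_order` are illustrative names

```python
from sympy import Matrix

E21 = Matrix([[1, 0, 0], [-3, 1, 0], [0, 0, 1]])
E32 = Matrix([[1, 0, 0], [0, 1, 0], [0, -2, 1]])

first_order = E32 * E21   # the order used in the elimination (E21 acts first)
second_order = E21 * E32  # the reversed order

print(first_order)
print(second_order)
print(first_order == second_order)

# Both products are lower triangular
print(first_order.is_lower, second_order.is_lower)
```

+ Only the first order contains the 6 in row 3, column 1; in the reversed order the multipliers -3 and -2 simply sit in their own positions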

In [17]:
L_inv * A # Later we'll call this result the matrix U

⎡1  2  1 ⎤
⎢        ⎥
⎢0  2  -2⎥
⎢        ⎥
⎣0  0  5 ⎦

+ We now have the following
$$ {L}^{-1}{A}={U} $$

+ If we multiply both sides on the left by L (the inverse of the inverse of L), we'll have the following
$$ {L}{L}^{-1}{A}={L}{U} $$
+ An invertible square matrix multiplied by its inverse gives the identity matrix
$$ {I}{A}={L}{U} \\ {A}={L}{U} $$

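+ We can construct L from E32 and E21 above, starting from E32 E21 A = U and multiplying both sides on the left by the inverses
$$ {E}_{21}^{-1}{E}_{32}^{-1}{E}_{32}{E}_{21}{A}={E}_{21}^{-1}{E}_{32}^{-1}{U} \\ {A}={E}_{21}^{-1}{E}_{32}^{-1}{U} \\ \therefore {E}_{21}^{-1}{E}_{32}^{-1}={L} $$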

In [18]:
E21.inv() # The inverse is easy to understand in words
# We just want to add 3 instead of subtracting 3

⎡1  0  0⎤
⎢       ⎥
⎢3  1  0⎥
⎢       ⎥
⎣0  0  1⎦

In [19]:
E32.inv()

⎡1  0  0⎤
⎢       ⎥
⎢0  1  0⎥
⎢       ⎥
⎣0  2  1⎦

In [20]:
E21.inv() * E32.inv()

⎡1  0  0⎤
⎢       ⎥
⎢3  1  0⎥
⎢       ⎥
⎣0  2  1⎦

+ This is exactly the inverse of our inverse of L above

In [21]:
L_inv.inv()

⎡1  0  0⎤
⎢       ⎥
⎢3  1  0⎥
⎢       ⎥
⎣0  2  1⎦

+ This is called LU-decomposition of A
+ More about this two chapters from now (I_05_LU_decomposition)
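
+ As a sanity check, multiplying L by U should return the original matrix of coefficients
+ A minimal sketch that rebuilds the pieces from the elimination matrices used above; `L` and `U` are plain variable names here, nothing more

```python
from sympy import Matrix

A = Matrix([[1, 2, 1], [3, 8, 1], [0, 4, 1]])
E21 = Matrix([[1, 0, 0], [-3, 1, 0], [0, 0, 1]])
E32 = Matrix([[1, 0, 0], [0, 1, 0], [0, -2, 1]])

U = E32 * E21 * A            # upper triangular result of the elimination
L = E21.inv() * E32.inv()    # inverses multiplied in the reverse order

print(L)
print(U)
print(L * U == A)
```

+ Note how the multipliers 3 and 2 appear unchanged in L, whereas the product E32 E21 contains the extra 6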

+ As an aside, we can also do elementary column operations, but then we have to multiply on the right of A and not on the left as above
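
+ A small sketch of a column operation: subtracting 2 times column 1 from column 2
+ The matrix `C12` is an illustrative name; it is the identity matrix with a -2 placed in row 1, column 2

```python
from sympy import Matrix

A = Matrix([[1, 2, 1], [3, 8, 1], [0, 4, 1]])

# Identity matrix with -2 in row 1, column 2
C12 = Matrix([[1, -2, 0], [0, 1, 0], [0, 0, 1]])

on_the_right = A * C12  # column operation: column 2 minus 2 times column 1
on_the_left = C12 * A   # row operation instead: row 1 minus 2 times row 2

print(on_the_right)
print(on_the_left)
```

+ The same matrix therefore acts on columns when placed on the right of A and on rows when placed on the left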

## Example problems

### Example problem 1

+ Solve the following linear set (set of linear equations)
$$ x-y-z+u=0 \\ 2x+2z=8 \\ -y-2z=-8 \\ 3x-3y-2z+4u=7 $$

#### Solution

In [22]:
A_augm = Matrix([[1, -1, -1, 1, 0], [2, 0, 2, 0, 8], [0, -1, -2, 0, -8], [3, -3, -2, 4, 7]])
A_augm

⎡1  -1  -1  1  0 ⎤
⎢                ⎥
⎢2  0   2   0  8 ⎥
⎢                ⎥
⎢0  -1  -2  0  -8⎥
⎢                ⎥
⎣3  -3  -2  4  7 ⎦

In [23]:
A_augm.rref()

⎛⎡1  0  0  0  1⎤              ⎞
⎜⎢             ⎥              ⎟
⎜⎢0  1  0  0  2⎥              ⎟
⎜⎢             ⎥, (0, 1, 2, 3)⎟
⎜⎢0  0  1  0  3⎥              ⎟
⎜⎢             ⎥              ⎟
⎝⎣0  0  0  1  4⎦              ⎠

+ Whoa! That was easy!
+ Let's take it a notch down and do some elementary matrices
+ First off, we want the matrix of coefficients

In [24]:
A = Matrix([[1, -1, -1, 1], [2, 0, 2, 0], [0, -1, -2, 0], [3, -3, -2, 4]])
A

⎡1  -1  -1  1⎤
⎢            ⎥
⎢2  0   2   0⎥
⎢            ⎥
⎢0  -1  -2  0⎥
⎢            ⎥
⎣3  -3  -2  4⎦

+ Now we need to get rid of the 2 in position row 2, column 1
+ We start by numbering the elementary matrix by this position and modifying the identity matrix

In [25]:
E21 = Matrix([[1, 0, 0, 0], [-2, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
E21 * A

⎡1  -1  -1  1 ⎤
⎢             ⎥
⎢0  2   4   -2⎥
⎢             ⎥
⎢0  -1  -2  0 ⎥
⎢             ⎥
⎣3  -3  -2  4 ⎦

+ Now for position row 3, column 2
+ We have to use row 2 to do this
+ If we used row 1, we would introduce a non-zero into position row 3, column 1

In [26]:
E32 = Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, Rational(1, 2), 1, 0], [0, 0, 0, 1]])
E32 * (E21 * A)

⎡1  -1  -1  1 ⎤
⎢             ⎥
⎢0  2   4   -2⎥
⎢             ⎥
⎢0  0   0   -1⎥
⎢             ⎥
⎣3  -3  -2  4 ⎦

+ Now for the 3 in position row 4, column 1

In [27]:
E41 = Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [-3, 0, 0, 1]])
E41 * (E32 * E21 * A)

⎡1  -1  -1  1 ⎤
⎢             ⎥
⎢0  2   4   -2⎥
⎢             ⎥
⎢0  0   0   -1⎥
⎢             ⎥
⎣0  0   1   1 ⎦

+ Let's exchange rows 3 and 4

In [28]:
Ee34 = Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])
Ee34 * E41 * E32 * E21 * A

⎡1  -1  -1  1 ⎤
⎢             ⎥
⎢0  2   4   -2⎥
⎢             ⎥
⎢0  0   1   1 ⎥
⎢             ⎥
⎣0  0   0   -1⎦

+ Let's see where that leaves **b**, after all, what we do to the left, we must do to the right
$$ {Ee}_{34}\times{E}_{41}\times{E}_{32}\times{E}_{21}{A}{x}={Ee}_{34}\times{E}_{41}\times{E}_{32}\times{E}_{21}{b} $$
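
+ Applying the same matrices to both sides is equivalent to applying them once to the augmented matrix
+ The sketch below assumes SymPy's `row_join` to attach **b** as an extra column; `E_total` and `augmented` are illustrative names

```python
from sympy import Matrix, Rational

A = Matrix([[1, -1, -1, 1], [2, 0, 2, 0], [0, -1, -2, 0], [3, -3, -2, 4]])
b_vect = Matrix([[0], [8], [-8], [7]])

E21 = Matrix([[1, 0, 0, 0], [-2, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
E32 = Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, Rational(1, 2), 1, 0], [0, 0, 0, 1]])
E41 = Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [-3, 0, 0, 1]])
Ee34 = Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])

# All row operations combined into a single matrix
E_total = Ee34 * E41 * E32 * E21

# Attach b as a fifth column and apply the operations once
augmented = A.row_join(b_vect)
print(E_total * augmented)

# Same result as treating the two sides separately
print(E_total * augmented == (E_total * A).row_join(E_total * b_vect))
```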

In [29]:
b_vect = Matrix([[0], [8], [-8], [7]])
b_vect

⎡0 ⎤
⎢  ⎥
⎢8 ⎥
⎢  ⎥
⎢-8⎥
⎢  ⎥
⎣7 ⎦

In [30]:
Ee34 * E41 * E32 * E21 * b_vect

⎡0 ⎤
⎢  ⎥
⎢8 ⎥
⎢  ⎥
⎢7 ⎥
⎢  ⎥
⎣-4⎦

+ Let's print them next to each other on the screen

In [31]:
Ee34 * E41 * E32 * E21 * A, Ee34 * E41 * E32 * E21 * b_vect

⎛⎡1  -1  -1  1 ⎤  ⎡0 ⎤⎞
⎜⎢             ⎥  ⎢  ⎥⎟
⎜⎢0  2   4   -2⎥  ⎢8 ⎥⎟
⎜⎢             ⎥, ⎢  ⎥⎟
⎜⎢0  0   1   1 ⎥  ⎢7 ⎥⎟
⎜⎢             ⎥  ⎢  ⎥⎟
⎝⎣0  0   0   -1⎦  ⎣-4⎦⎠

+ So we can simply do back substitution
+ We note that -1*u* = -4 and thus *u* = 4
+ From here, we work our way back up
$$ -1(u)=-4 \quad \therefore \quad u=4 \\ 1(z)+1(4) = 7 \quad \therefore \quad z=3 \\  2(y) + 4(3) - 2(4) = 8 \quad \therefore \quad y=2 \\ 1(x)-1(2)-1(3)+1(4)=0 \quad \therefore x=1 $$
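
+ Back substitution follows the same pattern for any upper triangular system, so it can be sketched as a short loop
+ The function `back_substitute` is a hypothetical helper written for illustration; it assumes a square upper triangular matrix with non-zero pivots on the main diagonal

```python
from sympy import Matrix, zeros

def back_substitute(upper, rhs):
    """Solve upper * x = rhs, working from the last row upwards."""
    n = upper.rows
    x = zeros(n, 1)
    for i in reversed(range(n)):
        # Contribution of the unknowns that have already been solved
        known = sum(upper[i, j] * x[j] for j in range(i + 1, n))
        x[i] = (rhs[i] - known) / upper[i, i]
    return x

U_final = Matrix([[1, -1, -1, 1], [0, 2, 4, -2], [0, 0, 1, 1], [0, 0, 0, -1]])
c_final = Matrix([0, 8, 7, -4])
x_solution = back_substitute(U_final, c_final)
print(x_solution.T)

# Check against the original system
A_original = Matrix([[1, -1, -1, 1], [2, 0, 2, 0], [0, -1, -2, 0], [3, -3, -2, 4]])
b_original = Matrix([0, 8, -8, 7])
print(A_original * x_solution == b_original)
```

+ The entries of the result are *x*, *y*, *z* and *u* in that order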