# Matrix notation and conventions
M C M Wright, ISVR, University of Southampton

In [1]:
%pylab inline

Populating the interactive namespace from numpy and matplotlib


## Series overview

## Notebook overview
This notebook describes the mathematical notations and conventions for matrix algebra and its representation in Python/NumPy. There are actually two ways to represent matrices in NumPy, and each has its own syntax. We'll describe both but only use one in subsequent notebooks.

## Simultaneous equations
Consider a set of three linear equations in three unknowns:

$$
\begin{align}
3x+5y+2z &= 25.9, \\
4x+2y+2z &= 19.4, \\
3x+y &=10.5. 
\end{align}
$$

Here $x$, $y$ and $z$ might represent the unit costs of three different drinks, and the right-hand sides (RHSs) are the total costs of three rounds of drinks. We can write this system of equations like this:

$$
\begin{pmatrix} 3 & 5 & 2 \\
4 & 2 & 2 \\
3 & 1 & 0 \end{pmatrix}
\begin{pmatrix} x\\ y \\ z\end{pmatrix} = \begin{pmatrix} 25.9\\ 19.4 \\ 10.5\end{pmatrix}.
$$

The first array of numbers in parentheses is a *matrix*, specifically a $3\times 3$ matrix, which contains $9$ *elements*. The second array in parentheses is a *vector*, as is the RHS. Writing the matrix followed by the vector implies that they are multiplied. Each *row* (horizontal line) of the matrix contains the numbers of each drink that went into the corresponding round. Each element on the RHS of our equation is formed by taking the **scalar product** between a row of the matrix and the vector of unknowns. 
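
As a worked instance of those scalar products, using only the numbers already given above, the first and third elements of the RHS are formed like this:

$$
\begin{align}
\begin{pmatrix} 3 & 5 & 2 \end{pmatrix} \begin{pmatrix} x\\ y \\ z\end{pmatrix} &= 3x + 5y + 2z = 25.9, \\
\begin{pmatrix} 3 & 1 & 0 \end{pmatrix} \begin{pmatrix} x\\ y \\ z\end{pmatrix} &= 3x + y + 0z = 10.5.
\end{align}
$$

The zero in the third row of the matrix is why $z$ does not appear in the third equation.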

Matrices with the same number of rows as columns are *square matrices*. Matrices don't have to be square; when used to represent systems of simultaneous equations like this, square matrices correspond to systems with as many equations as unknowns. When writing the dimensions of a matrix the height (number of rows) comes first, so a $4\times 3$ matrix is taller than it is wide.

In the same way that we use letters to name variables with single values, e.g., $x = 3$, we'll also use letters to represent matrices and vectors and it's common practice to set them bold and upright, so we can write our equation in this way:

$$
\mathbf{Ax} = \mathbf{b}.
$$

We'll use lower-case letters for vectors and capitals for matrices. Neither of these conventions is universally followed. When writing by hand it's usual to underline the letters that represent vectors, and some people find it helpful to double underline the letters that represent matrices (in the days of handwritten manuscripts underlining a word or phrase was an instruction to the compositor to emphasise it, by setting it either bold or italic).

## NumPy representation 1

**[If you're familiar with MATLAB watch out for the differences]**

We can represent matrices and vectors with NumPy arrays, so we can form $\mathbf{A}$ and $\mathbf{b}$ like this:

In [2]:
A = array([[3, 5, 2], [4, 2, 2], [3, 1, 0]])
b = array([25.9, 19.4, 10.5])

We use a nested list (list of lists) to construct the matrix, with the inner lists corresponding to the rows.
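
A minimal sketch of how the nested list maps onto rows and columns. It imports NumPy explicitly as `np` so that it runs on its own, whereas the cells in this notebook rely on the names that `%pylab inline` provides; the variable names `n_rows` and `n_cols` are illustrative only.

```python
import numpy as np

# Same matrix as in the text; each inner list is one row.
A = np.array([[3, 5, 2], [4, 2, 2], [3, 1, 0]])

# shape reports (number of rows, number of columns), height first.
n_rows, n_cols = A.shape
print(n_rows, n_cols)   # 3 3

# Iterating over a 2-D array yields its rows one at a time.
for row in A:
    print(row)
```

Each printed row holds the numbers of each drink in the corresponding round.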

We can't form $\mathbf{x}$ until we know the values of the unknowns $x$, $y$ and $z$ which are its elements, but we could guess some values and see if they're right.

In [9]:
x_guess = array([2.5, 3, 1])

To see if our guess is right we need to perform the matrix-vector multiplication as defined above. We could write some code to do it with nested `for` loops, but NumPy has a built-in function to do it.
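
For reference, the nested-loop approach might look like the sketch below. The helper name `matvec_loops` and the small $2\times 2$ example are made up for illustration (deliberately not the drinks system), and NumPy is imported explicitly as `np` so the block runs on its own.

```python
import numpy as np

def matvec_loops(M, v):
    """Multiply matrix M by vector v with explicit loops (hypothetical helper)."""
    n_rows, n_cols = M.shape
    result = np.zeros(n_rows)
    for i in range(n_rows):        # one output element per row of M
        for j in range(n_cols):    # accumulate the scalar product of row i with v
            result[i] += M[i, j] * v[j]
    return result

# Small made-up example: elements should be 1*10 + 2*20 and 3*10 + 4*20.
M = np.array([[1, 2], [3, 4]])
v = np.array([10, 20])
print(matvec_loops(M, v))
```

The outer loop walks down the rows and the inner loop forms one scalar product, which is exactly the process described in the text.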

We observed above that the matrix-vector multiplication was a process of finding a series of scalar products between rows of the matrix and the vector. Scalar products are sometimes called *dot products*, and scalar products between two vectors are often written $\mathbf{p{.}q}$. NumPy's `dot()` function evaluates such dot products:

In [10]:
print(dot(array([1, 2, 3]), array([4, 5, 6])))

32


and even though it's much less common to write matrix-vector multiplication as $\mathbf{A{.}x}$ the `dot()` function will perform the correct multiplication:

In [11]:
print(dot(A, x_guess))

[ 24.5  18.   10.5]


...even though `x_guess` doesn't satisfy our system of equations, because this result isn't the same as $\mathbf{b}$.
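
Rather than comparing the printed numbers by eye, the check can be done in code. This is a sketch only: it imports NumPy explicitly as `np` so it runs on its own, and the names `residual` and `guess_is_solution` are illustrative. `allclose()` is used instead of `==` because floating-point results rarely match exactly.

```python
import numpy as np

A = np.array([[3, 5, 2], [4, 2, 2], [3, 1, 0]])
b = np.array([25.9, 19.4, 10.5])
x_guess = np.array([2.5, 3, 1])

# Difference between the target RHS and what the guess produces.
residual = b - np.dot(A, x_guess)
print(residual)

# allclose compares element by element within a floating-point tolerance.
guess_is_solution = np.allclose(np.dot(A, x_guess), b)
print(guess_is_solution)   # False
```

The third element of the residual is zero, so the guess does satisfy the third equation, which is the one that doesn't involve $z$.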

Python has printed the result horizontally whereas we wrote the vector as a column, but since vectors are one-dimensional arrays their orientation doesn't matter. If, for some reason, we constructed a $3\times 1$ matrix its orientation *would* matter and Python would print it vertically.

In [12]:
print(array([[1], [2], [3]]))

[[1]
 [2]
 [3]]
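
The distinction shows up in the `shape` attribute of each array. A short sketch, with NumPy imported explicitly as `np` and the variable names `v`, `col` and `row` chosen for illustration:

```python
import numpy as np

v = np.array([1, 2, 3])            # one-dimensional: no orientation
col = np.array([[1], [2], [3]])    # two-dimensional: 3 rows, 1 column
row = np.array([[1, 2, 3]])        # two-dimensional: 1 row, 3 columns

print(v.shape)     # (3,)
print(col.shape)   # (3, 1)
print(row.shape)   # (1, 3)
```

Only the two-dimensional arrays have a height and a width; the one-dimensional array has a single length.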


#### Exercise
Confirm that the matrix-vector multiplication performed by the `dot()` function is correct. You can use Python to do the multiplication and addition of single numbers, but not `dot()`.

## NumPy representation 2

**[Simulates MATLAB behaviour but is not widely used]**

As well as arrays, NumPy has objects called `matrix` that can be constructed in a similar way, but can be multiplied with `*` using the rules of matrix multiplication. We'll give the Python variables a `_m` suffix to remind us that they're different objects. However, `matrix()` only constructs matrices, so column vectors have to be represented by $N\times 1$ matrices.

In [18]:
A_m = matrix([[3, 5, 2], [4, 2, 2], [3, 1, 0]])
x_guess_m = matrix([[2.5], [3], [1]])
A_m*x_guess_m

matrix([[ 24.5],
        [ 18. ],
        [ 10.5]])

The advantage of these matrix objects is that you can write products of several matrices such as $\mathbf{ABCx}$ as `A*B*C*x` rather than `dot(A,dot(B,dot(C,x)))`. Nonetheless, we'll stick with the `array` representation for the rest of this notebook, and this series of notebooks.
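
As an aside, a similarly compact notation is available for ordinary arrays through the `@` operator (Python 3.5 and later), although this notebook uses `dot()` throughout. The sketch below uses made-up $2\times 2$ matrices purely for illustration and imports NumPy explicitly as `np`.

```python
import numpy as np

# Made-up matrices and vector, purely for illustration.
A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [0, 2]])
x = np.array([1, 1])

nested = np.dot(A, np.dot(B, np.dot(C, x)))
chained = A @ B @ C @ x     # @ performs matrix multiplication on arrays

print(np.array_equal(nested, chained))   # True
```

The chained form is evaluated left to right while the nested form works from the inside out, but matrix multiplication is associative so both give the same vector.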

##  Addressing matrix elements
In mathematical notation matrix elements are referred to by using two suffixes, usually attached to the italic lowercase version of the letter whose bold uppercase form symbolizes the whole matrix.