<div style="text-align: right"> CS824 - Lab 6c (2022) </div>

# Working with matrices as well as vectors

We will be using 2-D arrays to represent matrices, as the NumPy developers have noted that the `matrix` data type is likely to be deprecated soon. As of Python version 3.5 there is a built-in operator (`@`) for matrix (as opposed to element-wise) multiplication. Previously the `*` operator was interpreted as 'matrix multiplication' when applied to `matrix` objects, and that was about the only reason you would have used that specific data type before the `@` operator became available. However, you do need to remember that applying the `*` operator to arrays of any dimension (including the 2-D arrays that represent classic matrices) will always be interpreted as a request to carry out element-wise multiplication.
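As a quick illustration of the difference between the two operators, here is a minimal sketch using two small arrays (`P` and `Q` are made-up names and values, chosen only for this example):

```python
import numpy as np

# Two small 2-D arrays standing in for matrices (illustrative values only)
P = np.array([[1, 2],
              [3, 4]])
Q = np.array([[0, 1],
              [1, 0]])

elementwise = P * Q   # multiplies matching entries
matrix_prod = P @ Q   # row-by-column matrix multiplication

print(elementwise)
print(matrix_prod)
```

The two results differ, which is why it matters which operator you reach for.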

We have already seen some non 1-D arrays, so strictly speaking we have already seen some matrix objects. However, here we will be a bit more 'formal' and stick to creating `vectors` that will be one dimensional and matrices which will be `2-D arrays`. 


In [None]:
# We have already seen that vectors can be created as row- or column-based arrays

import numpy as np

# A vector as a single row
row_vec = np.array([3, 4, 7])

# Or as a single column
col_vec = np.array([[3],
                    [4],
                    [7]])

# You can also make a column vector by `re-shaping` a row vector
col_vec2 = row_vec.reshape(3, 1)

print(row_vec)
print(col_vec)
print(col_vec2)
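
It can help to check the `shape` of each array to see how NumPy stores it. The sketch below uses the hypothetical names `v` and `v_col` so that it does not overwrite the variables above:

```python
import numpy as np

v = np.array([3, 4, 7])
v_col = v.reshape(-1, 1)   # -1 asks NumPy to infer the number of rows

print(v.shape)       # a 1-D array with 3 entries
print(v_col.shape)   # a 2-D array with 3 rows and 1 column
```

Note that the 'column vector' is really a 2-D array that happens to have a single column.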


Strictly, you should **not** use the transpose (the `T` attribute) to achieve this row-to-column transformation, as a vector is just a collection of values (a 'tuple') and therefore does not have row/column indices that can be switched. In NumPy, taking `.T` of a 1-D array simply returns the array unchanged. However, you will sometimes see code that uses a transpose here, perhaps because people think of it as the 1-D equivalent of the matrix transformations we will explore below.
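A short sketch of what NumPy does in practice (the names `v` and `v_row` are only for illustration): the `T` attribute leaves a 1-D array as it is, and only has a visible effect once the array is 2-D.

```python
import numpy as np

v = np.array([3, 4, 7])       # 1-D array, shape (3,)
print(v.T.shape)              # still (3,): transposing a 1-D array changes nothing

v_row = v[np.newaxis, :]      # explicit 2-D row, shape (1, 3)
print(v_row.T.shape)          # now the transpose is a column, shape (3, 1)
```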


In [None]:
# Generate a single column vector by taking the transpose of a row vector. 
# Note the second set of [] that is required to make this happen.

col_vec3 = np.array([[3, 4, 7]]).T

print(col_vec3)


## Create an NxN matrix and generate its transpose

Let's create an NxN (here 4x4) matrix and use the `T` attribute to get its transpose.


In [None]:
# A matrix with 4 rows and 4 columns
mat_A = np.array([[1, 5,  9, 13],
                  [2, 6, 10, 14],
                  [3, 7, 11, 15],
                  [4, 8, 12, 16]])

print(mat_A)
print("-----------------and its transpose")
print(mat_A.T)


## Inverting a matrix

This should NOT be confused with a transposition. A couple of points:
 - you can *only* invert **square** matrices
 - some matrices cannot be inverted

The definition of the inverse of a square matrix `[A]` is a second matrix `[A-1]`, such that:
> `[A]` `[A-1]` = `I`    (the identity matrix)


In [None]:
# Let's see what happens with our matrix above...
# It should generate a 'Singular matrix' error - in other words this particular matrix cannot be inverted.

from numpy import linalg as LA 

mat_A_invert = LA.inv(mat_A)


In [None]:
# Let's start with a simpler 2x2 matrix
mat_B = np.array([[1, 4],
                  [2, 5]])

mat_B_invert = LA.inv(mat_B)
print(mat_B)
print(mat_B_invert)
# An 'inversion' was generated, but it is not easy to see what it might 'mean'!

In [None]:
# Let's confirm that each of these is indeed the inverse of the other.
# We can use the `@` operator to carry out matrix multiplication.

print(mat_B @ mat_B_invert)

# Due to rounding errors this may not quite be the 'perfect' identity matrix we were looking for, i.e.
#
#    [[1, 0],
#     [0, 1]]
#
# But it's close enough!
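
Rather than judging 'close enough' by eye, one option is to compare the product with an identity matrix using a tolerance. This is a minimal sketch that rebuilds the same 2x2 values under the hypothetical names `M` and `M_inv`:

```python
import numpy as np

M = np.array([[1, 4],
              [2, 5]])
M_inv = np.linalg.inv(M)

# np.allclose compares element by element within a small tolerance
print(np.allclose(M @ M_inv, np.eye(2)))
print(np.allclose(M_inv @ M, np.eye(2)))
```

Checking the product in both orders reflects the fact that a matrix and its inverse commute.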


## Exercise 6c1

The `transpose` operation can also be applied to non-square matrices.  
 - Generate a non-square matrix and see what its transpose looks like.

As we saw, some matrices cannot be inverted. In fact, the **determinant** of a matrix lets us know whether inversion can be attempted or not: if the `det` of a matrix is zero then it CANNOT be inverted.

 - Check the determinant values for the two matrices that we were working with above (**A** and **B**)
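
As a hint, here is a sketch of a determinant check on two made-up matrices (deliberately not **A** and **B**). The names `invertible` and `singular` are hypothetical, and the comparison uses a tolerance because floating-point determinants are not always exactly zero:

```python
import numpy as np

# Hypothetical example matrices (not the A and B from the exercise)
invertible = np.array([[2, 1],
                       [1, 3]])
singular = np.array([[1, 2],
                     [2, 4]])   # second row is twice the first

for name, m in [("invertible", invertible), ("singular", singular)]:
    d = np.linalg.det(m)
    # Compare with a tolerance rather than testing d == 0 directly
    verdict = "cannot be inverted" if np.isclose(d, 0) else "can be inverted"
    print(name, d, verdict)
```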


## Diagonals and the 'Trace' of a matrix

We are often interested in the 'main diagonal' of a matrix; irrespective of the dimensions of the matrix this will always be a vector.


In [None]:
# Let's remind ourselves of the contents of mat_A and then get its diagonal

print("Here is a matrix:")
print(mat_A)

diag_A = mat_A.diagonal()

print("--------------------------------------------------")
print("And these are the members of its leading diagonal:")
print(diag_A)


While Python will not generate an error if you apply the `diagonal` method to a non-square matrix, you may have to be careful about how you interpret these 'diagonal' values...
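
For instance, here is a small sketch with a 2x3 array (`wide` is a made-up name): the 'diagonal' stops as soon as it runs out of rows or columns.

```python
import numpy as np

wide = np.array([[0, 1, 2],
                 [3, 4, 5]])     # 2 rows, 3 columns

print(wide.diagonal())           # entries at positions (0, 0) and (1, 1)
print(wide.T.diagonal())         # the transpose has the same main diagonal
```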

You can also use an `offset` parameter to return values at some distance away from the 'main diagonal'...


In [None]:
print("These values lie along the diagonal one 'below' the leading diagonal:")
print(mat_A.diagonal(offset=-1))

print("--------------------------------------------------")
print("While these are the values lying in the diagonal one 'above' the leading diagonal:")
print(mat_A.diagonal(offset=1))



**NB** There is also something referred to as a *'diagonal matrix'*, which is (most often) a square matrix that has values on its main diagonal and zeros everywhere else. The **_identity matrix_** is a special case of a *diagonal matrix* where all of these non-zero entries are equal to 1.
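
NumPy can build both kinds of matrix directly. A minimal sketch, with `D` and `I3` as illustrative names:

```python
import numpy as np

D = np.diag([2, 5, 9])   # square matrix with these values on the main diagonal
I3 = np.eye(3)           # 3x3 identity matrix (floating-point entries)

print(D)
print(I3)

# The identity is the diagonal matrix whose diagonal entries are all 1
print(np.array_equal(np.diag(np.ones(3)), I3))
```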


### The 'Trace' of a matrix

This is simply the sum of the values that lie along the leading diagonal...


In [None]:
print("The 'trace' of matrix mat_A is:")
print(mat_A.trace())

You could of course use the `sum` and `diagonal` array operations to find this value pretty easily...


In [None]:
trace2 = np.sum(mat_A.diagonal())
print("Working out the 'trace' using basic operators, the value we get is (surprise surprise!):")
print(trace2)

## Exercise 6c2  (SUBMIT)

You may remember that a common way to assess the performance of a classification algorithm is to report its **accuracy**, which is often taken to be the sum of the entries on the leading diagonal of a 'mis-classification matrix' (i.e. all of the cases that it got correct) as a proportion of the total number of cases.

 - write a function called `my_accuracy` that takes a square matrix and returns an accuracy score as a percentage value

Once written you can try it out on the following cases:

 - A 2x2 mis-classification table that you might get from a clinical diagnostic test:

| | +ve Cases | -ve Cases |
|---|---|---|
| +ve Tests | 71 | 3 |
| -ve Tests | 7 | 23 |

> I get a value of accuracy of just over 90% 

 - Now you have a multi-class problem with 5 possible outcome classes (a bit like the various animal diseases in *VetAfrica*). A more complex mis-classification matrix is shown below, but once again your function should easily be able to estimate the accuracy score.

&nbsp;(35,  2,  1,  0,  3,
<br> &nbsp;&nbsp; 4, 24,  2,  1,  5,
<br> &nbsp;&nbsp; 3,  0, 51,  2,  7,
<br> &nbsp;&nbsp; 0,  1,  6, 33, 12,
<br> &nbsp;&nbsp; 1,  3,  1,  4,  45)

> I get a value of accuracy of just over 76% 


## The Rank of a matrix

The result of this calculation is the number of linearly independent rows/columns in a matrix.

The maximum number of linearly independent rows in **A** is called the 'row rank', while the maximum number of linearly independent columns is called the 'column rank'. In an m-by-n matrix, the row rank must be <= `m` and the column rank must be <= `n`. However, there is no real reason to make a distinction between rows and columns when thinking about a matrix: the row rank and the column rank are always equal, so transposing and looking for linear independence in the 'alternate' direction gives the same value. Therefore it is normal to talk only of the overall **rank** of a matrix, and for the case where **A** is an m-by-n matrix: 

> rank(**A**) <= min(m, n)

We can use the NumPy function `matrix_rank` (in `numpy.linalg`, imported above as `LA`) to find this value...


In [None]:
mat_Z = np.array([[1, -2, 0, 4],
                  [3,  1, 1, 0],
                  [-1,-5,-1, 8],
                  [ 3, 8, 2, -12]])

print("The matrix shown below:")
print(mat_Z)
print("has an overall rank of: {0:}".format(LA.matrix_rank(mat_Z)))
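
The bound rank(**A**) <= min(m, n), and the fact that rows and columns give the same rank, can both be checked numerically. This is a sketch on a made-up 4x3 array (`rect` is a hypothetical name) whose third row is the sum of the first two and whose fourth row is twice the first:

```python
import numpy as np

rect = np.array([[1, 0, 2],
                 [0, 1, 1],
                 [1, 1, 3],
                 [2, 0, 4]])     # 4 rows, 3 columns

r = np.linalg.matrix_rank(rect)      # rank computed from the matrix
r_t = np.linalg.matrix_rank(rect.T)  # rank computed from its transpose

print(r, r_t, min(rect.shape))
```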


### Short explanation of *Rank*

In the example above, think of the four rows as four linear expressions.
<br> e.g. &nbsp;&nbsp; r1 = a - 2b + 4d
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; r2 = 3a + b + c

For each of these rows to be linearly independent they should **not** be able to be expressed in terms of any of the others (and the same holds for the columns).

However, hopefully you can see that in this particular example:
 - r3 = 2*r1 - r2
<br>and also that
 - r4 = -3*r1 + 2*r2
 
Neither r3 nor r4 is independent of r1/r2 (i.e. each can be expressed as a linear combination of them), and as such the maximum number of independent rows (i.e. the **rank**) in this example is 2 (just r1 and r2).
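
The two relationships above can be confirmed directly in NumPy. The sketch below re-creates the same values under the hypothetical name `Z` and unpacks its rows:

```python
import numpy as np

Z = np.array([[ 1, -2,  0,   4],
              [ 3,  1,  1,   0],
              [-1, -5, -1,   8],
              [ 3,  8,  2, -12]])
r1, r2, r3, r4 = Z   # unpack the four rows as 1-D arrays

# Each comparison checks one of the linear combinations entry by entry
print(np.array_equal(r3, 2 * r1 - r2))
print(np.array_equal(r4, -3 * r1 + 2 * r2))
```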


## Exercise 6c3

Consider the two matrices shown below and find their `rank`...  see whether this makes sense for:
 - a non-square 4x3 matrix (**V**)
 - the 3x3 'checker-board' matrix (**W**)
 - try a much larger 'checker-board', say 6x6


In [None]:
V = np.array([[2,-1, 5],
              [1, 0, 1],
              [0, 2,-1],
              [1, 1, 4]])

W = np.array([[ 1,-1, 1],
              [-1, 1,-1],
              [ 1,-1, 1]])
