CSU DSCI 369 Lab 5
Instructor: Emily J. King
Spring 2024
Goals: Understand parallelizability of matrix multiplication. Show non-commutativity. See what the identity matrix does. Tease invertibility and non-invertibility.

We're going to play around with the following five matrices:


In [1]:
import numpy as np

In [2]:
A=np.array([[3, 0, 0], [0, -1, 0], [0, 0, 0.5]])
print('A is\n',A)
B=np.array([[1, 0, 2], [0, 1, 0], [0, 0, 1]])
print('B is\n',B)
C=np.ones([3,3])
print('C is\n',C)
I=np.eye(3)
print('I is\n',I)
R=np.random.rand(3,3)
print('R is\n',R)

A is
 [[ 3.   0.   0. ]
 [ 0.  -1.   0. ]
 [ 0.   0.   0.5]]
B is
 [[1 0 2]
 [0 1 0]
 [0 0 1]]
C is
 [[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
I is
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
R is
 [[0.23668397 0.08776614 0.07275722]
 [0.28134398 0.11664886 0.50606825]
 [0.35560434 0.54321585 0.64114684]]


The command np.eye(n) used above makes an nxn matrix with ones where i=j (that is, on the main diagonal) and zeroes everywhere else.  It's called "eye" because such matrices are typically denoted with a capital letter 'I'.
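As a quick check of that description, here is a minimal sketch that builds the same matrix by hand and compares it with np.eye.  The names n and manual are ours, introduced only for illustration.

```python
import numpy as np

n = 3
# Build the n x n matrix entry by entry: 1 where i == j, 0 elsewhere.
manual = np.zeros((n, n))
for i in range(n):
    manual[i, i] = 1.0

# Compare with the built-in command.
same = np.array_equal(manual, np.eye(n))
print(same)
```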

Parallelizability

Let's compute BR.  REMEMBER: Matrix multiplication in Python is with @.  The * is a completely different operation (the entrywise, or Hadamard, product).

In [3]:
B@R

array([[0.94789265, 1.17419784, 1.3550509 ],
       [0.28134398, 0.11664886, 0.50606825],
       [0.35560434, 0.54321585, 0.64114684]])
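To see the difference between @ and * concretely, here is a small sketch on two 2x2 matrices.  M and N are hypothetical matrices chosen only for illustration; they are not among the five lab matrices.

```python
import numpy as np

# Two small hypothetical matrices, chosen only for illustration.
M = np.array([[1, 2], [3, 4]])
N = np.array([[0, 1], [1, 0]])

matrix_product = M @ N      # rows of M dotted with columns of N
entrywise_product = M * N   # entry (i, j) is M[i, j] * N[i, j]

print(matrix_product)       # [[2 1], [4 3]]
print(entrywise_product)    # [[0 2], [3 0]]
```

The two results have the same shape but different entries, so mixing up the operators will not raise an error; it will just silently give the wrong answer.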

Now let's compute B times the first column of R. Note that Python will present this as a 1D array displayed horizontally, but you should read it as a column vector.

In [4]:
B@R[:,0]

array([0.94789265, 0.28134398, 0.35560434])

It is possible to force Python to show the output as a column vector, but we must add a second axis (dimension) to the array:

In [5]:
(B@R[:,0])[:, np.newaxis]

array([[0.94789265],
       [0.28134398],
       [0.35560434]])
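For reference, here is a short sketch of what adding the axis does to the shape of an array, along with reshape as one equivalent alternative.  The vector v is a hypothetical example.

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])    # hypothetical 1D array with shape (3,)

col_newaxis = v[:, np.newaxis]   # shape (3, 1)
col_reshape = v.reshape(-1, 1)   # shape (3, 1); -1 lets NumPy infer the length

print(v.shape, col_newaxis.shape, col_reshape.shape)
print(np.array_equal(col_newaxis, col_reshape))
```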

To aid with visualization, we will also display the following products as columns.  Now B times the second column of R.


In [6]:
(B@R[:,1])[:, np.newaxis]

array([[1.17419784],
       [0.11664886],
       [0.54321585]])

And B times the third column of R.

In [7]:
(B@R[:,2])[:, np.newaxis]

array([[1.3550509 ],
       [0.50606825],
       [0.64114684]])

What do you notice?  Why does this match what we've seen in lecture?
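One way to test your observation is to glue the three column products back together and compare with the full product.  This is only a sketch: it draws a fresh random matrix with an arbitrarily chosen seed, so its R is not the R printed above.

```python
import numpy as np

rng = np.random.default_rng(0)   # seed chosen arbitrarily for reproducibility
B = np.array([[1, 0, 2], [0, 1, 0], [0, 0, 1]])
R = rng.random((3, 3))           # a fresh random matrix, not the R printed above

# Multiply B by each column of R separately, then place the results side by side.
columns = [B @ R[:, j] for j in range(R.shape[1])]
assembled = np.column_stack(columns)

columns_match = np.allclose(assembled, B @ R)
print(columns_match)
```

Each entry of the list columns is computed without reference to the others, so the three products could be handed to three different workers.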

Similarly, let's compute the first row of B times R.  Since the mathematical output is a row and Python displays 1D arrays as rows by default, we won't need to manipulate the output to aid in visualization.

In [8]:
B[0,:]@R

array([0.94789265, 1.17419784, 1.3550509 ])

And the second row of B times R.

In [9]:
B[1,:]@R

array([0.28134398, 0.11664886, 0.50606825])

And the third.


In [10]:
B[2,:]@R

array([0.35560434, 0.54321585, 0.64114684])

What do you notice?  Why does this match what we've seen in lecture?
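The row version can be checked the same way, this time stacking the results on top of each other.  As before, this is a sketch with a freshly generated random matrix and an arbitrary seed, not the R printed above.

```python
import numpy as np

rng = np.random.default_rng(1)   # seed chosen arbitrarily for reproducibility
B = np.array([[1, 0, 2], [0, 1, 0], [0, 0, 1]])
R = rng.random((3, 3))           # a fresh random matrix, not the R printed above

# Multiply each row of B by R separately, then stack the results vertically.
rows = [B[i, :] @ R for i in range(B.shape[0])]
stacked = np.vstack(rows)

rows_match = np.allclose(stacked, B @ R)
print(rows_match)
```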

Identity

Let's compute AI and IA.  

In [12]:
A@I

array([[ 3. ,  0. ,  0. ],
       [ 0. , -1. ,  0. ],
       [ 0. ,  0. ,  0.5]])

In [11]:
I@A

array([[ 3. ,  0. ,  0. ],
       [ 0. , -1. ,  0. ],
       [ 0. ,  0. ,  0.5]])

Now, let's compute BI and IB.  


In [13]:
B@I

array([[1., 0., 2.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [14]:
I@B

array([[1., 0., 2.],
       [0., 1., 0.],
       [0., 0., 1.]])

Finally, let's compute RI and IR.


In [15]:
R@I

array([[0.23668397, 0.08776614, 0.07275722],
       [0.28134398, 0.11664886, 0.50606825],
       [0.35560434, 0.54321585, 0.64114684]])

In [16]:
I@R

array([[0.23668397, 0.08776614, 0.07275722],
       [0.28134398, 0.11664886, 0.50606825],
       [0.35560434, 0.54321585, 0.64114684]])

What seems to be happening each time?  Discuss.

Multiplying by I (the identity matrix), on either side, does not change the original matrix.
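The three pairs above were all 3x3.  A quick sketch suggests the same thing happens at another size; the 4x4 matrix M below is a hypothetical random example with an arbitrary seed.

```python
import numpy as np

rng = np.random.default_rng(2)   # seed chosen arbitrarily for reproducibility
M = rng.random((4, 4))           # hypothetical 4x4 random matrix
I4 = np.eye(4)

right_ok = np.allclose(M @ I4, M)   # multiplying by I on the right
left_ok = np.allclose(I4 @ M, M)    # multiplying by I on the left

print(right_ok, left_ok)
```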

Non-commutativity

Changing the order of multiplication when one of the matrices was I above didn't change the output.  But let's try some other pairs.  

In Lab 2, we used np.allclose(vec1,vec2) to test for approximate equality taking into account floating point arithmetic.  We will do the same with matrices. 
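As a reminder of why we test approximate rather than exact equality, here is a minimal sketch using the classic 0.1 + 0.2 example.  The arrays a and b are ours, chosen only for illustration.

```python
import numpy as np

a = np.array([0.1 + 0.2])
b = np.array([0.3])

exact = np.array_equal(a, b)   # exact comparison, sensitive to rounding error
approx = np.allclose(a, b)     # comparison within a small tolerance

print(exact, approx)           # False True
```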

We'll look at the matrices first since they are small enough to see all entries at once.

In [17]:
A@B

array([[ 3. ,  0. ,  6. ],
       [ 0. , -1. ,  0. ],
       [ 0. ,  0. ,  0.5]])

In [18]:
B@A

array([[ 3. ,  0. ,  1. ],
       [ 0. , -1. ,  0. ],
       [ 0. ,  0. ,  0.5]])

And here is the test.

In [19]:
np.allclose(A@B,B@A)

False

And a different pair.

In [20]:
C@R

array([[0.87363229, 0.74763085, 1.21997231],
       [0.87363229, 0.74763085, 1.21997231],
       [0.87363229, 0.74763085, 1.21997231]])

In [21]:
R@C

array([[0.39720733, 0.39720733, 0.39720733],
       [0.90406109, 0.90406109, 0.90406109],
       [1.53996703, 1.53996703, 1.53996703]])

In [22]:
np.allclose(C@R,R@C)

False

What happened?  Can we always switch the order of multiplication?

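No.  Matrix multiplication is not commutative in general: A@B and B@A differ in the top-right entry (6 versus 1), and C@R and R@C are different matrices.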
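"Not always" does not mean "never".  The sketch below compares A with a second diagonal matrix D, which is a hypothetical matrix introduced only for this illustration, and then repeats the A, B comparison from above.

```python
import numpy as np

A = np.array([[3, 0, 0], [0, -1, 0], [0, 0, 0.5]])
B = np.array([[1, 0, 2], [0, 1, 0], [0, 0, 1]])
D = np.diag([2.0, 5.0, 7.0])    # hypothetical second diagonal matrix

diag_commute = np.allclose(A @ D, D @ A)   # this particular pair commutes
ab_commute = np.allclose(A @ B, B @ A)     # this pair does not, as seen above

print(diag_commute, ab_commute)            # True False
```

So the order can be switched for some special pairs, such as I with anything or these two diagonal matrices, but we cannot count on it.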


Non-invertibility

CR and RC above were really structured.  Let's play around with that some more.

We'll multiply C times a number of vectors.  This time we won't restructure them to visualize them as column vectors.

In [23]:
C@np.array([1,0,0])

array([1., 1., 1.])

In [24]:
C@np.array([0,1,0])

array([1., 1., 1.])

In [25]:
C@np.array([0,0,1])

array([1., 1., 1.])

In [26]:
C@np.array([1/3,1/3,1/3])

array([1., 1., 1.])

Say I had a function f:R^3 -> R^3 that multiplied C on the right by column vectors with 3 entries, i.e. f(x) = Cx.  If I know that f(y) is the all-ones vector for some y, do I know what y was?  Can I say anything about y?  Discuss.
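To help the discussion along, here is one experiment you could try.  It takes one of the inputs used above and adds a vector whose entries sum to zero; the vector n is a hypothetical choice made only for illustration.

```python
import numpy as np

C = np.ones((3, 3))
y = np.array([1.0, 0.0, 0.0])    # one input with C @ y equal to the all-ones vector
n = np.array([1.0, -1.0, 0.0])   # hypothetical vector whose entries sum to zero

out_n = C @ n                    # the zero vector
out_shifted = C @ (y + n)        # same output as C @ y

print(out_n)
print(out_shifted)
```

Since y and y + n are different inputs with the same output, the output alone cannot tell us which one we started with.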

Exercises

1. a. Multiply A defined above on the right by three different non-zero vectors.  

In [27]:
print(A@np.array([1,3,2]))
print(A@np.array([3,3,1]))
print(A@np.array([6,2,4]))

[ 3. -3.  1.]
[ 9.  -3.   0.5]
[18. -2.  2.]


b. Compare the input and output each time.  What does multiplying by A seem to do to the vectors?

Multiplying a vector by A appears to return a new vector that has had operations performed on its entries. The first entry is multiplied by 3, the second entry is multiplied by -1, and the third entry is halved.
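This observation can be checked directly: a sketch that pulls out the diagonal of A with np.diag and scales the entries of the first test vector one at a time.  The names scales and entrywise are ours.

```python
import numpy as np

A = np.array([[3, 0, 0], [0, -1, 0], [0, 0, 0.5]])
v = np.array([1, 3, 2])

scales = np.diag(A)        # diagonal entries of A: 3, -1, 0.5
entrywise = scales * v     # scale each entry of v separately

scaling_matches = np.allclose(A @ v, entrywise)
print(scaling_matches)
```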

2. a. Define two vectors, y and z (which are not one of the four already used above) such that Cy and Cz are the all-ones vector. 

In [36]:
y=np.array([-1,-3,5])
z=np.array([2,-1,0])

b. Verify that Cy and Cz are indeed the all-ones vector.


In [35]:
print(C@y)
print(C@z)

[1. 1. 1.]
[1. 1. 1.]


c. Explain why you chose those vectors.

I noticed that C multiplied by a given vector returns a new vector where all entries are the sum of the original vector's entries, so I defined y and z so that their entries add up to 1.
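The explanation above can be written as a one-line check: C times a vector should equal the sum of the vector's entries times the all-ones vector.  A minimal sketch using the y defined in part a:

```python
import numpy as np

C = np.ones((3, 3))
y = np.array([-1, -3, 5])

expected = y.sum() * np.ones(3)   # the sum of the entries, repeated three times
sum_matches = np.allclose(C @ y, expected)

print(y.sum(), sum_matches)
```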

3. a. Multiply C on the right by three different random 3x1 vectors.


In [41]:
print(C@np.random.rand(3, 1))
print(C@np.random.rand(3, 1))
print(C@np.random.rand(3, 1))

[[1.16215977]
 [1.16215977]
 [1.16215977]]
[[1.10280875]
 [1.10280875]
 [1.10280875]]
[[1.5412121]
 [1.5412121]
 [1.5412121]]


e. Do you think [2; 1; 1] is an output of the function f(x)=Cx?  Explain your answer.

No, because f(x)=Cx returns a vector where all entries are equal to the sum of x's entries. [2; 1; 1] could not be an output because not all of its entries are equal.
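A numerical experiment can support this answer, though it is evidence rather than a proof.  The sketch below checks that several random outputs have all entries equal while the target does not; the seed and the number of trials are arbitrary choices of ours.

```python
import numpy as np

rng = np.random.default_rng(3)   # seed chosen arbitrarily for reproducibility
C = np.ones((3, 3))
target = np.array([2.0, 1.0, 1.0])

# Every output of f(x) = Cx we try should have three equal entries.
outputs = [C @ rng.random(3) for _ in range(5)]
outputs_constant = all(np.allclose(out, out[0]) for out in outputs)

# The target does not have that property.
target_constant = np.allclose(target, target[0])

print(outputs_constant, target_constant)
```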
