# Representing and operating on vectors and matrices

## $ \S 1 $ Basic operations on vectors

We saw in the previous notebook that a vector such as $ (1, 2, 3) $
can be represented in NumPy as a $ 1D $ array:

In [1]:
import numpy as np

v = np.array([1, 2, 3])
w = np.array([4, 5, 6])

We can also use NumPy to conveniently perform all of the vector operations
that we learned in Linear Algebra.

Given vectors $ \mathbf v = (v_1, v_2, \cdots, v_n) $ and $ \mathbf w = (w_1,
w_2, \cdots, w_n) $ with the same number of coordinates, their __sum__ and
__difference__ $ \mathbf v \pm \mathbf w $ are computed element-wise:
$$
\mathbf v \pm \mathbf w = (v_1 \pm w_1,\,v_2 \pm w_2,\, \cdots,\, v_n \pm w_n)\,.
$$
NumPy uses the same notation:

In [17]:
s = v + w
print(s, type(s))

d = v - w
print(d, type(d))

[5 7 9] <class 'numpy.ndarray'>
[-3 -3 -3] <class 'numpy.ndarray'>


__Scalar multiplication__ of a vector by a factor $ c \in \mathbb{R} $ is also defined
element-wise:
$$
c\, \mathbf v = (c\,v_1, c\,v_2, \cdots, c\,v_n)\,.
$$

In [19]:
print(2 * v)
print(-3.14 * v)
print(0 * v)

[2 4 6]
[-3.14 -6.28 -9.42]
[0 0 0]


Naturally, we may also write $ -\mathbf v $ instead of $ (-1)\mathbf v $. Try it in the code cell below:

If we operate on an array whose datatype is `int` and any floating-point number is
involved in the operation, then the result will be of datatype `float`.  A similar
observation applies to any other type coercion.

In [2]:
# `v.dtype` yields the datatype of the elements of v.
# We will study `dtype` in more detail later.
v = np.array([1, 2, 3])
print(v, v.dtype)

u = 1.0 * v
print(u, u.dtype)

[1 2 3] int64
[1. 2. 3.] float64


__Exercise:__ Can you explain the output of the following cell?

In [3]:
x = np.array([-1, 0, 1, 3])
b = np.array([True, False, True, False])

x_plus_b = x + b
print(x_plus_b, x_plus_b.dtype)


[0 0 2 3] int64


The __dot product__ $ \mathbf v \cdot \mathbf w $ of two vectors $ \mathbf v =
(v_1, v_2, \cdots, v_n) $ and $ \mathbf w  = (w_1, w_2, \cdots, w_n) $ of the
same shape is the sum of the products of their corresponding coordinates:
$$
\boxed{\ \mathbf v \cdot \mathbf w = v_1w_1 + v_2w_2 + \cdots + v_nw_n\ } 
$$

In [14]:
v = np.array([1, 2, 3])
w = np.array([4, 5, 6])
dot_product = np.dot(v, w)
print(dot_product)

32


Equivalently, we can also use the `@` operator to compute dot products:

In [16]:
alternative_dot_product = v @ w
print(alternative_dot_product)

32


It is easy to verify directly from the definition that the dot product is both:
* symmetric, i.e.,
    $$ \mathbf v \cdot \mathbf w = \mathbf w \cdot \mathbf v \qquad (\mathbf v,\, \mathbf w \in \mathbb R^n)\,; \qquad \text{and} $$
* bilinear, meaning that 
\begin{alignat*}{9}
    (a\, \mathbf u + b\,\mathbf v) \cdot \mathbf w
    &= a\, (\mathbf u \cdot \mathbf w) + b\, (\mathbf v \cdot \mathbf w) \\
    \mathbf u \cdot (a\,\mathbf v + b\,\mathbf w) 
    &= a\, (\mathbf u \cdot \mathbf v) + b\, (\mathbf u \cdot \mathbf w)\qquad && (\mathbf u,\,\mathbf v,\, \mathbf w \in \mathbb R^n,\ a,\,b \in \mathbb R)\,.
\end{alignat*}
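
Both properties can also be spot-checked numerically. The following is only a minimal sketch for one arbitrary choice of vectors and scalars (the values of `u`, `a` and `b` are illustrative assumptions, not taken from the text); a numerical check of this kind does not replace the proof from the definition.

```python
import numpy as np

# Arbitrary integer vectors and scalars, chosen only for illustration.
u = np.array([1, -2, 0])
v = np.array([1, 2, 3])
w = np.array([4, 5, 6])
a, b = 2, -3

# Symmetry: v . w equals w . v
symmetric = np.dot(v, w) == np.dot(w, v)

# Linearity in the first argument
lhs_first = np.dot(a * u + b * v, w)
rhs_first = a * np.dot(u, w) + b * np.dot(v, w)

# Linearity in the second argument
lhs_second = np.dot(u, a * v + b * w)
rhs_second = a * np.dot(u, v) + b * np.dot(u, w)

print(symmetric)
print(lhs_first, rhs_first)
print(lhs_second, rhs_second)
```

Integer arrays are used here so that the two sides can be compared exactly; with floating-point entries a tolerance-based comparison such as `np.isclose` would be more appropriate.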

The __norm__ or __length__ of a vector
$ \mathbf v = (v_1, v_2, \cdots, v_n) \in \mathbb R^n $ is defined by
$$
\boxed{\ \Vert \mathbf v \Vert = \sqrt{\mathbf v \cdot \mathbf v} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}\ } $$
In dimension $ 2 $, this definition of "length" matches our intuitive notion and
can be justified by a simple use of Pythagoras' theorem, as illustrated in the
figure below. In higher dimensions, the formula follows by induction on $ n $,
applying Pythagoras' theorem once for each additional coordinate.

For example, the norm (length) of the vector $ \mathbf v = (1, -2, 3) \in
\mathbb R^3 $ is $ \sqrt{1^2 + (-2)^2 + 3^2} = \sqrt{14} $, while the norm of
$ \big(\frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2} \big) \in \mathbb R^4 $
is $ 1 $.

![Vector](vector.png)


In NumPy, the norm of a vector can be computed as follows:

In [4]:
v = np.array([3, 4])

# We can invoke the function `norm` from the `linalg` submodule:
print(np.linalg.norm(v))

# Alternatively, we can take the square root of the dot product:
print(np.sqrt(np.dot(v, v)))

5.0
5.0
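
The two worked examples given earlier, $ (1, -2, 3) $ and $ \big(\frac{1}{2}, \frac{1}{2}, \frac{1}{2}, \frac{1}{2}\big) $, can be checked in the same way. This is a small sketch; the names `p` and `q` are just placeholders.

```python
import numpy as np

p = np.array([1, -2, 3])
q = np.array([0.5, 0.5, 0.5, 0.5])

# The norm of p should agree with sqrt(14), up to rounding:
print(np.linalg.norm(p), np.sqrt(14))

# The norm of q should be 1:
print(np.linalg.norm(q))
```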


Recall that two vectors are __orthogonal__ (or __perpendicular__) if and only if
their dot product vanishes.  As an example, try to decide whether the two
vectors below are orthogonal using Python:

In [19]:
a = np.array([-3, 4, 7, 3, -6])
b = np.array([2, 5, -2, 4, 2])

More generally, recall from Linear Algebra the following relationship between the dot product and
the smallest angle $ \theta \in [0, \pi] $ between two vectors:
$$
\boxed{\ \mathbf v \cdot \mathbf w = \Vert \mathbf v \Vert \,\Vert \mathbf w \Vert \cos \theta\ }
$$
                                                                                                    

__Exercise:__ Compute the angle between the vectors $ \mathbf v = (2, 0) $ and $ \mathbf w = (3, 3) $ in degrees. _Hint:_ See
the illustration below. Use `np.arccos` to compute the arccosine and
`np.degrees` to transform the result to degrees.


![Angle](vectors_and_angle.png)

__Exercise:__ Consider the three vectors $ \mathbf a $, $ \mathbf b $ and $ \mathbf c $ in the code cell below.

(a) Compute $ \mathbf d = 3\mathbf a + 2\mathbf b - \mathbf c $.

(b) Project $ \mathbf d $ onto the line spanned by $ \mathbf b $ to get $ \mathbf e $. That is, compute 
$$ \mathbf e = \frac{\mathbf d \cdot \mathbf b}{\mathbf b \cdot \mathbf b} \mathbf b \,.$$

In [None]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = np.array([7, 8, 9])

__Exercise:__ The __canonical basis__ in $ \mathbb R^3 $ consists of the three vectors
$$ \mathbf e_1 = (1, 0, 0)\,, \quad \mathbf e_2 = (0, 1, 0)\,, \quad \text{and} \quad \mathbf e_3 = (0, 0, 1) \,,$$
which have norm $ 1 $ and point in the same direction as the positive $ x $-, $ y $- and $ z $-axis, respectively.
Using Python, compute and print all possible dot products $ \mathbf e_i \cdot \mathbf e_j $. _Hint:_ Store the
vectors in a list and use two for loops.

The __cross product__ $ \mathbf v \times \mathbf w \in \mathbb R^3 $ of two vectors in
three-dimensional space results in a vector _orthogonal to both $ \mathbf v $ and $ \mathbf w $
whose length is given by_
$$
\boxed{\ \Vert{\mathbf v \times \mathbf w}\Vert = \Vert{\mathbf v}\Vert\,\Vert{\mathbf w}\Vert\,\sin \theta\ }
$$
where again $ \theta \in [0, \pi] $ denotes the angle between $ \mathbf v $ and
$ \mathbf w $. The cross product is uniquely determined by these two properties
together with the fact that the basis $ \big(\mathbf v,\, \mathbf w,\, \mathbf v
\times \mathbf w \big) $ is _positively oriented_ (i.e., this trio of vectors,
in this order, satisfies the "right-hand rule"). Like the dot product, the cross
product $ \times $ is also bilinear, but it is antisymmetric instead of
symmetric:
$$ \mathbf w \times \mathbf v = -\mathbf v \times \mathbf w \quad (\mathbf v,\, \mathbf w \in \mathbb R^3)\,.
$$
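
The properties listed above can be checked numerically for a particular pair of vectors. The sketch below uses two arbitrary example vectors and recovers $ \theta $ from the dot product formula; it illustrates the statements for one case only and is not a general verification.

```python
import numpy as np

v = np.array([1, 2, 3])
w = np.array([4, 5, 6])
vxw = np.cross(v, w)
print(vxw)

# Orthogonality: both dot products should vanish.
print(np.dot(vxw, v), np.dot(vxw, w))

# Antisymmetry: swapping the factors flips the sign.
print(np.cross(w, v))

# Length: compare ||v x w|| with ||v|| ||w|| sin(theta),
# where theta is obtained from v . w = ||v|| ||w|| cos(theta).
norm_v = np.linalg.norm(v)
norm_w = np.linalg.norm(w)
theta = np.arccos(np.dot(v, w) / (norm_v * norm_w))
print(np.linalg.norm(vxw), norm_v * norm_w * np.sin(theta))
```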

__Exercise:__ Compute all possible cross products of the canonical basis vectors $ \mathbf e_i $ in $ \mathbb R^3 $
using the function `cross`. _Hint:_ Use for loops.

In [32]:
e1 = np.array([1, 0, 0])
e2 = np.array([0, 1, 0])
cross_product = np.cross(e1, e2)
print(cross_product)

[0 0 1]


A __unit vector__ is a vector of length $ 1 $. To get a unit vector $ \mathbf u $ having the same
direction as a given nonzero vector $ \mathbf v $, we can simply divide the latter by its norm:
$$
\mathbf u = \frac{\mathbf v}{\Vert \mathbf v \Vert}\,.
$$
Indeed, using the bilinearity of the dot product and the definition of the norm, we can check directly that
$$
\mathbf u \cdot \mathbf u = \bigg(\frac{\mathbf v}{\Vert \mathbf v \Vert}\bigg) \cdot \bigg(\frac{\mathbf v}{\Vert \mathbf v \Vert}\bigg)
= \frac{1}{\Vert \mathbf v \Vert^2}\big({\mathbf v \cdot \mathbf v}\big) = \frac{\Vert \mathbf v \Vert^2}{\Vert \mathbf v \Vert^2} = 1\,.
$$
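
In NumPy, the normalization formula amounts to a single division. The following minimal sketch uses the example vector $ (1, 2, 2) $, which has norm $ 3 $; the choice of vector is only for illustration.

```python
import numpy as np

v = np.array([1, 2, 2])         # example vector of norm 3
u = v / np.linalg.norm(v)       # unit vector in the same direction as v

print(u)
print(np.linalg.norm(u))        # should be 1, up to rounding
```

Note that `u` has datatype `float` even though `v` has datatype `int`, since the norm is a floating-point number.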

__Exercise:__ How many _unit_ vectors in $ \mathbb{R}^3 $ are parallel to $ \mathbf v = (3, -4, 12) $ (i.e., lie on the same line through the origin as $ \mathbf v $)? Compute all of them using NumPy.

In [23]:
v = np.array([3, -4, 12])

__Exercise:__ Recall that an $ n \times n $ matrix $ A $ is
an __orthogonal matrix__ if and only if its $ n $ column vectors (or,
equivalently, row vectors) $ \mathbf v_1, \cdots, \mathbf v_n $ form an
_orthonormal basis_ of $ \mathbb R^n $, that is,
$$
\mathbf v_i \cdot \mathbf v_j = 
\begin{cases}
1 & \text{if $ i = j $} \\
0 & \text{otherwise}
\end{cases}
\qquad \text{for each $ i,\,j = 1, \cdots, n\,. $}
$$

(a) Write a Python function `is_orthogonal` that determines whether a
given $ n \times n $ square matrix $ A $ is orthogonal. 
_Hint:_ Use the slice `A[:, i]` to extract the $ i $-th column vector of $ A $.

(b) Can you see any potential problems with your approach when $ A $
consists of floating-point numbers? How could these problems be controlled?

## $ \S 2 $ Basic operations involving matrices

Just as for vectors, we can __add__ and __subtract__ two matrices, or more
generally any two arrays _having the same shape_, using
`+` and `-` respectively:

In [24]:
A = np.array([[1, 2, 3],
              [1, 2, 3]])

B = np.array([[4, 4, 4],
              [5, 5, 5]])

print("Matrix A:\n", A, '\n')
print("Matrix B:\n", B, '\n')
print("Sum:\n", A + B, '\n')
print("Difference:\n", A - B, '\n')

Matrix A:
 [[1 2 3]
 [1 2 3]] 

Matrix B:
 [[4 4 4]
 [5 5 5]] 

Sum:
 [[5 6 7]
 [6 7 8]] 

Difference:
 [[-3 -2 -1]
 [-4 -3 -2]] 



Similarly, to __scale__ every element of a matrix (or, more generally, $ n
$-dimensional array) $ A $ by a scalar $ c $, we may use either `c * A` or `A * c`:

In [12]:
c = 2
print("c * A:\n", c * A, '\n')
print("A * c:\n", A * c, '\n')

c * A:
 [[2 4 6]
 [2 4 6]] 

A * c:
 [[2 4 6]
 [2 4 6]] 



For __multiplication__ of 2D arrays, i.e., matrices, NumPy uses the `np.matmul` function or
the `@` operator. Note that we are referring here to matrix multiplication,
which is different from element-wise multiplication. In particular, for the
product to make sense, the number of columns in the first matrix must match the
number of rows in the second matrix: the product of an $ m \times n $ matrix by
an $ n \times p $ matrix has shape $ m \times p $.

In [31]:
# Creating a 2 x 3 matrix A:
A = np.array([[1, 2, 3],
              [4, 5, 6]])

# Creating a 3 x 4 matrix B:
B = np.array([[7, 8, 9, 10],
              [11, 12, 13, 14],
              [15, 16, 17, 18]])

# Multiplying A and B:
C = np.matmul(A, B)
# Alternatively:
D = A @ B
print(C, C.shape)
print(D, D.shape)

[[ 74  80  86  92]
 [173 188 203 218]] (2, 4)
[[ 74  80  86  92]
 [173 188 203 218]] (2, 4)


📝 `np.matmul` and `@` are completely equivalent in their output and
performance. The choice between them is a matter of preference and code
readability.
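
To see the shape requirement in action, the sketch below multiplies the same two matrices in both orders. It assumes nothing beyond the matrices `A` and `B` defined above; the flag `mismatch_detected` is a name introduced here for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])            # shape (2, 3)
B = np.array([[7, 8, 9, 10],
              [11, 12, 13, 14],
              [15, 16, 17, 18]])     # shape (3, 4)

# (2, 3) times (3, 4): the inner dimensions match.
print((A @ B).shape)

# (3, 4) times (2, 3): the inner dimensions 4 and 2 differ,
# so NumPy raises a ValueError instead of returning a result.
try:
    B @ A
    mismatch_detected = False
except ValueError as error:
    mismatch_detected = True
    print("ValueError:", error)
```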

__Exercise:__ Compute `C @ C`, `C * C`, `C**2` and `C**(-1)` for the matrix $ C $ below. Can you explain these results? We will return to these operations in another notebook.

In [10]:
C = np.array([[-2.5, -1.7, -0.5],
              [ 2.4, -6.5,  3.3],
              [-0.5,  5.0,  1.0]])

To instantiate a copy of the identity matrix of shape $ n \times n $,
we can use the function `np.identity` as follows:

In [5]:
n = 4
I = np.identity(n)  # Create an n x n identity matrix
print(I)
print(I.dtype)

[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
float64


A more flexible version of `np.identity` allowing the creation of non-square matrices is `np.eye`:

In [27]:
E = np.eye(3, 4)
print(E)

[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]]


The third (optional) parameter of `np.eye` specifies an offset to the diagonal:

In [31]:
I = np.eye(4, 4, 0)   # An offset of 0 corresponds to the main diagonal
U = np.eye(4, 4, 1)   # An offset of 1 corresponds to the diagonal immediately above the main one
L = np.eye(4, 4, -2)  # A negative offset refers to a diagonal below the main one

print(I, '\n')
print(U, '\n')
print(L, '\n')

[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]] 

[[0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]
 [0. 0. 0. 0.]] 

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [1. 0. 0. 0.]
 [0. 1. 0. 0.]] 



__Exercise:__ Compute the linear combination $ M^2 - 3 M + 2I $, for $ M $ the matrix below:

In [36]:
M = np.array([[ 0, -2],
              [ 1,  3]])

The diagonal elements of a square matrix $ A $ can be extracted to a $ 1D $ array using the function `np.diag(A)`.
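
As a short illustration, here is `np.diag` applied to an example matrix `S` (a name and matrix chosen only for this sketch). When given a $ 1D $ array instead, `np.diag` goes in the opposite direction and builds a square matrix with that array on its diagonal.

```python
import numpy as np

S = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Extract the main diagonal of S as a 1D array:
d = np.diag(S)
print(d, d.shape)

# Applied to a 1D array, np.diag builds a diagonal matrix:
print(np.diag(d))
```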

__Exercise:__ Extract the diagonal elements of the matrix $ C $ below into a vector and
then compute its length and the angle it makes with the vector $ (7, -2, 1) $:

In [11]:
C = np.array([[0, -4, 2],
              [3, 1, -5],
              [-3, 0, 2]])

When multiplying a $ 2D $ array (matrix) by a $ 1D $ array (vector), the vector
is temporarily viewed as a column matrix and the operation is then treated as a
matrix multiplication.  Thus, matrix-vector multiplication can also be handled
by `@`, or equivalently `np.matmul`:

In [32]:
A = np.array([[1, 2, 3],
              [4, 5, 6]])
v = np.array([-1, 0, 1])

prod1 = A @ v
prod2 = np.matmul(A, v)
print(prod1, prod1.shape)
print(prod2, prod2.shape)

[2 2] (2,)
[2 2] (2,)
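
The "temporary column matrix" interpretation can be made explicit by reshaping the vector by hand. In the sketch below, `column`, `as_matrix` and `as_vector` are illustrative names; the point is that the entries agree and only the shape of the result differs.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
v = np.array([-1, 0, 1])

column = v.reshape(-1, 1)    # explicit 3 x 1 column matrix
as_matrix = A @ column       # matrix product, a 2 x 1 matrix
as_vector = A @ v            # matrix-vector product, a 1D array of length 2

print(as_matrix, as_matrix.shape)
print(as_vector, as_vector.shape)
```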


Recall that the __trace__ of a square matrix is by definition the sum of all of
its diagonal entries. To compute the __trace__, __determinant__ and the
__inverse__ of a _square_ matrix, we can use the `np.trace`, `np.linalg.det` and
the `np.linalg.inv` functions, respectively. 

In [5]:
X = np.array([[1, 2],
              [3, 4]])
print("Matrix X:\n", X)
print(f"Trace of X: {np.trace(X):.2f}")
print(f"Determinant of X: {np.linalg.det(X):.2f}")

Matrix X:
 [[1 2]
 [3 4]]
Trace of X: 5.00
Determinant of X: -2.00
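
The determinant above is printed with two decimal digits because `np.linalg.det` works in floating-point arithmetic, so its result may differ from the exact value by a tiny rounding error. The following sketch compares it with the formula $ ad - bc $ for a $ 2 \times 2 $ matrix, and the trace with the sum of the diagonal entries; the variable names are illustrative.

```python
import numpy as np

X = np.array([[1, 2],
              [3, 4]])

# Exact integer determinant of a 2 x 2 matrix: ad - bc
exact_det = X[0, 0] * X[1, 1] - X[0, 1] * X[1, 0]
computed_det = np.linalg.det(X)

print(exact_det, computed_det)
print(np.isclose(computed_det, exact_det))   # equal up to rounding

# The trace coincides with the sum of the diagonal entries:
print(np.trace(X), np.diag(X).sum())
```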


In [3]:
A = np.array([[0, 1],
              [-1, 0]])
A_inverse = np.linalg.inv(A)
print("Inverse of A:\n", A_inverse)
print("Product of A and its inverse:\n", A @ A_inverse)

Inverse of A:
 [[-0. -1.]
 [ 1.  0.]]
Product of A and its inverse:
 [[1. 0.]
 [0. 1.]]


__Exercise:__ Find the area of the parallelogram spanned by the vectors 
$ (3, 5) $ and $ (2, 4) $ in $ \mathbb{R}^2 $.  Recall that this area can be
computed as the absolute value of the determinant of the matrix formed by these
vectors. _Hint:_ The absolute value function in NumPy is denoted by `np.abs`.

__Exercise:__ Given two square matrices $ C $ and $ D $ of the same size, recall
that the determinant of their product is the product of their determinants:
$$
\boxed{\ \det(CD) = \det(C) \cdot \det(D)\ }
$$
Verify this identity in the particular example where
$$
C = \begin{bmatrix}
1 & 2 \\
3 & 4 \\
\end{bmatrix} \quad \text{and} \quad
D = \begin{bmatrix}
2 & 3 \\
1 & 4 \\
\end{bmatrix}\,.
$$

__Exercise:__ Solve the linear system of equations given by $ A\mathbf{x} = \mathbf{b} $, where
$$
A = \begin{bmatrix}
1 & 2 & 3 \\
0 & 1 & 4 \\
5 & 6 & 0 \\
\end{bmatrix} \quad \text{and} \quad \mathbf b = \begin{bmatrix}
3 \\
7 \\
8 \\
\end{bmatrix}\,.
$$
Verify your answer by multiplying $ A $ by $ \mathbf x $.
_Hint:_ Use the inverse of $ A $ to find $ \mathbf{x} = A^{-1}\mathbf{b} $.

__Exercise:__ Recall the function `is_orthogonal` from the last exercise of $ \S
1 $ that determines whether a given $ n \times n $ square matrix $ A $ is
orthogonal by verifying whether its column vectors are orthonormal. An
equivalent condition for the orthogonality of $ A $ is that it satisfy
$$
A^TA = I_n = AA^T\,,
$$
where $ A^T $ is the transpose of $ A $ and $ I_n $ is the $ n \times n $ identity matrix.
(Actually, any one of these equations by itself already suffices for orthogonality.)
 
Write another version of `is_orthogonal` that makes use of this criterion. When comparing
a computed product $ B $ (such as $ A^TA $) to the identity, you may want to use
`np.round(B, 3)` to round all entries of $ B $ to three decimal digits to avoid false negatives.