<div style="text-align:center;">
    <img style="text-align:center;" src="http://www.cs.wm.edu/~rml/images/wm_horizontal_single_line_full_color.png">
</div>
<h1 style="text-align:center;">CSCI 416-01/516-01: Fundamentals of AI and ML, Fall 2025</h1>
<h1 style="text-align:center;">A review of vectors and matrices</h1>

# Vectors and matrices in <code>numpy</code> 
$\newcommand{\R}{\mathbb{R}}$
$\newcommand{\C}{\mathbb{C}}$
$\newcommand{\Rm}{\R^{m}}$
$\newcommand{\Rn}{\R^{n}}$
$\newcommand{\Rmn}{\R^{m \times n}}$
$\newcommand{\Rnn}{\R^{n \times n}}$
$\newcommand{\Rnm}{\R^{n \times m}}$
$\newcommand{\Rmm}{\R^{m \times m}}$

The set of real numbers is denoted by $\R$.  The space of all $n$-dimensional real vectors is denoted by $\Rn$.

Given vectors $v_{1}, \ldots, v_{r}$, their <span style="color: #ff0000">span</span> is the set of vectors of the form
$$
\alpha_{1}v_{1} + \cdots + \alpha_{r}v_{r}
$$
for $\alpha_{1}, \ldots, \alpha_{r} \in \R$.  Vectors of the form $\alpha_{1}v_{1} + \cdots + \alpha_{r}v_{r}$ are called <span style="color: #ff0000">linear combinations</span> of $v_{1}, \ldots, v_{r}$.

A set of vectors $v_{1}, \ldots, v_{r}$ is called <span style="color: #ff0000">linearly independent</span> if 
$$
\alpha_{1}v_{1} + \cdots + \alpha_{r}v_{r} = 0
$$
implies $\alpha_{1} = \cdots = \alpha_{r} = 0$.

The <span style="color: #ff0000">standard basis</span> for $\Rn$ is the set of vectors $e_{i}$ with $1$ in the $i$-th position and zero otherwise.  These span $\Rn$.
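One way to test linear independence numerically is sketched below. The example vectors are hypothetical, and the test relies on the numerical rank computed by <code>np.linalg.matrix_rank</code> (rank is defined later in these notes): vectors stacked as columns are independent exactly when the rank equals the number of vectors.

```python
import numpy as np

# Hypothetical example vectors; v3 = v1 + v2, so {v1, v2, v3} is dependent.
v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = v1 + v2

# Stack the vectors as columns and compare the rank with the number of vectors.
M_indep = np.column_stack([v1, v2])
M_dep = np.column_stack([v1, v2, v3])
rank_indep = np.linalg.matrix_rank(M_indep)
rank_dep = np.linalg.matrix_rank(M_dep)
print(f"{rank_indep = } for {M_indep.shape[1]} vectors")  # independent
print(f"{rank_dep = } for {M_dep.shape[1]} vectors")      # dependent

# The rows (and columns) of the identity matrix are the standard basis vectors.
E = np.eye(3)
print(E[0])  # e_1
```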

In [None]:
import numpy as np

In [None]:
x = np.array([1.,2,3]) 
print(x)

In [None]:
print(f"{x.ndim = }")   # Number of dimensions
print(f"{x.shape = }")  # Dimensions
print(f"{x.size = }")   # Number of elements
print(f"{x.dtype = }")  # Type

Vectors in $\Rn$ are matrices.  <span style="color: #ff0000">Row vectors</span> are $1 \times n$ matrices; <span style="color: #ff0000">column vectors</span> are $n \times 1$ matrices.  When we are talking about linear algebra you should assume that vectors are column vectors.

Unfortunately, <code>numpy</code> treats vectors as a special case: a vector is a one-dimensional array of shape <code>(n,)</code>, rather than an $n \times 1$ or $1 \times n$ matrix.  This is WRONG!

In [None]:
y = np.array([4,5,6])

In [None]:
print(f"{y.ndim = }")   # Number of dimensions
print(f"{y.shape = }")  # Dimensions
print(f"{y.size = }")   # Number of elements
print(f"{y.dtype = }")  # Type
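If genuine row or column vectors are needed, one option is to reshape the one-dimensional array into a two-dimensional one. The sketch below uses illustrative variable names (<code>col</code>, <code>row</code>) and shows how the shapes then behave like $n \times 1$ and $1 \times n$ matrices.

```python
import numpy as np

v = np.array([4, 5, 6])   # 1-D array, shape (3,)
col = v.reshape(-1, 1)    # explicit 3 x 1 column vector
row = v.reshape(1, -1)    # explicit 1 x 3 row vector

print(f"{col.shape = }")
print(f"{row.shape = }")
print(f"{(row @ col).shape = }")  # 1 x 1 matrix holding the inner product
print(f"{(col @ row).shape = }")  # 3 x 3 outer product
```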

You can slice vectors in the same way as you slice lists in Python.  Indices start at 0, and a slice <code>x[a:b]</code> stops at index <code>b-1</code>, one short of the ostensible upper bound, unlike Matlab and Fortran, where the upper bound is included.

In [None]:
print(x[0:2])
print(x[:1])
print(x[:])
print(x[1:])

# Operations on vectors

In mathematics, vector addition and subtraction, as well as scalar multiplication, are all defined, being element-wise operations.

In [None]:
print(x + y)
print(x - y)
print(2 * x)

In computation, element-wise multiplication and division are also defined.

In <code>numpy</code>, for vectors and matrices <code>*</code> represents element-wise (term-by-term) multiplication.  This is like Matlab's <code>.&ast;</code> operator and <code>&ast;</code> in Fortran.

The mathematical terminology for element-wise multiplication is the  <span style="color: #ff0000">Hadamard product</span>.

In [None]:
print(x * y)  # * is element-wise multiplication

In [None]:
print(x / y)   # / is element-wise division
print(x // y)  # // is element-wise floor division

The <span style="color: #ff0000">Euclidean inner product</span> of two vectors $x, y \in \Rn$ is defined to be
$$
x \cdot y = x^{T}y = \sum_{i=1}^{n} x_{i}y_{i} = x_{1}y_{1} + \cdots + x_{n}y_{n}.
$$


In [None]:
print(x @ y)  # For vectors, @ is the inner product.

The <span style="color: #ff0000">Euclidean norm</span> of $x \in \Rn$ is
$$
\| x \| = \sqrt{x \cdot x} = \sqrt{x^{T}x} = \left( x_{1}^{2} + \cdots + x_{n}^{2} \right)^{1/2}.
$$

The <span style="color: #ff0000">Cauchy-Schwarz</span> inequality is
$$
| x^{T}y | \leq \| x \|\ \| y \|.
$$
If $\theta$ is the angle between $x$ and $y$ in the plane they determine, then
$$
x^{T}y = \| x \|\ \| y \|\ \cos\theta.
$$
Thus, if $x \neq 0, y \neq 0$, then $x^{T}y = 0$ if and only if $\theta = 90^{\circ}$.  In this case we say $x$ and $y$ are <span style="color: #ff0000">orthogonal</span> or <span style="color: #ff0000">perpendicular</span>.

In [None]:
print(f"{x @ y = }")

In [None]:
print(f"||x|| = {np.linalg.norm(x, ord=2)}")
print(f"||y|| = {np.linalg.norm(y, ord=2)}")
print(f"||x|| * ||y|| = {np.linalg.norm(x, ord=2) * np.linalg.norm(y, ord=2)}")
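The angle formula can be turned into a small computation. This is a sketch that redefines <code>x</code> and <code>y</code> locally so it runs on its own; the clipping step is a precaution against round-off, not part of the mathematical definition.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# cos(theta) = x^T y / (||x|| ||y||); Cauchy-Schwarz keeps this in [-1, 1].
cos_theta = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
# Clip guards against round-off pushing the value slightly outside [-1, 1].
theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
print(f"{cos_theta = }")
print(f"theta = {np.degrees(theta)} degrees")

# Orthogonal vectors have zero inner product.
u = np.array([1.0, 0.0])
w = np.array([0.0, 1.0])
print(f"{u @ w = }")
```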

The Euclidean norm satisfies the <span style="color: red">triangle inequality</span>
$$
\| x + y \| \leq \| x \| + \| y \|.
$$

In [None]:
print(f"||x + y||     = {np.linalg.norm(x+y, ord=2)}")
print(f"||x|| + ||y|| = {np.linalg.norm(x, ord=2) + np.linalg.norm(y, ord=2)}")

Mixing vectors with incompatible dimensions in <code>numpy</code> results in an error:

In [None]:
z = np.array([1,2,3,4])
print(x + z)

# Matrices



In [None]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6]
])
print(A)

In [None]:
print(f"{A.shape = }")  # Dimensions
print(f"{A.size = }")   # Number of elements
print(f"{A.dtype = }")  # Type

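Array access in <code>numpy</code> uses row and column slices much as in Matlab and Fortran, except that <code>numpy</code> uses square brackets, indices start at 0, and the upper bound of a slice is excluded: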

In [None]:
print(A[:,2])    # All rows of column 2.  This should be a column vector, only numpy is WRONG!
print(A[1:3,:])  # All columns of rows 1 to 2.  A has only rows 0 and 1, so the slice is silently truncated to row 1.
print(A[:2,:2])  # Upper left-hand 2 x 2 block.

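As with vectors, <code>&ast;</code> represents element-wise multiplication.  When a matrix is multiplied by a vector in this way, the vector is applied to each row of the matrix in turn: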

In [None]:
A*x
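To make the behavior of <code>A*x</code> concrete, the sketch below (with <code>A</code> and <code>x</code> redefined locally) checks that each row of the result is the element-wise product of the corresponding row of <code>A</code> with <code>x</code>. numpy calls this mechanism broadcasting.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
x = np.array([1.0, 2.0, 3.0])

P = A * x  # x is broadcast across the rows of A; the result is 2 x 3.
print(P)

# Row i of P equals the element-wise product of row i of A with x.
for i in range(A.shape[0]):
    print(np.array_equal(P[i], A[i] * x))
```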

Matrix multiplication, like the inner product of vectors, is denoted by <code>@</code>:

In [None]:
A@x

In [None]:
B = np.array([
    [1, 1],
    [2, 2]
])
print(B)

In [None]:
B@A

Attempting to apply operations to matrices of incompatible dimensions results in an error:

In [None]:
A@B

# Tensors

Vectors and matrices are special cases of tensors.  The simplest way to think of a tensor is as a multiply indexed array.  The number of indices is called the rank of the tensor, which is confusing since the rank of a matrix means something completely different.

In [None]:
# A 2 x 3 x 4 tensor.
T = np.zeros((2,3,4))  # A rank-3 tensor of zeros.
print(T)

When we print <code>T</code>, it is printed plane by plane; each plane is a $3 \times 4$ matrix.

In [None]:
print(f"{T.shape = }")  # Dimensions
print(f"{T.size = }")   # Number of elements
print(f"{T.dtype = }")  # Type

# Back to matrices&hellip;

Given $A \in \Rmn$, the row rank of $A$ is the maximum number of linearly independent rows, while the column rank is the maximum number of linearly independent columns.  The row rank is the same as the column rank; this common value is called the <span style="color: red">rank</span> of $A$.

A matrix all of whose rows are linearly independent is said to have full row rank; a matrix all of whose columns are linearly independent is said to have full column rank.
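The rank can be estimated numerically with <code>np.linalg.matrix_rank</code>. The matrices below are illustrative: <code>A</code> has full row rank, while the hypothetical matrix <code>C</code> has a second column that is a multiple of the first and so does not have full column rank.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # 2 x 3; the two rows are independent
C = np.array([[1, 2],
              [2, 4],
              [3, 6]])      # 3 x 2; second column = 2 * first column

rank_A = np.linalg.matrix_rank(A)
rank_At = np.linalg.matrix_rank(A.T)  # row rank equals column rank
rank_C = np.linalg.matrix_rank(C)

print(f"{rank_A = }, full row rank: {rank_A == A.shape[0]}")
print(f"{rank_At = }")
print(f"{rank_C = }, full column rank: {rank_C == C.shape[1]}")
```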

A matrix with full column rank defines a map $A: x \in \Rn \mapsto Ax \in \Rm$ that is 1-1.

A matrix with full row rank defines a map $A: x \in \Rn \mapsto Ax \in \Rm$ that is onto $\Rm$.

Suppose $c_{1}, c_{2}, \ldots, c_{n}$ are the columns of $A$ and that $x = (x_{1}, x_{2}, \ldots, x_{n}) \in \Rn$.  Then
$$
  Ax = x_{1} c_{1} + x_{2} c_{2} + \cdots + x_{n} c_{n}.
$$
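The column interpretation of $Ax$ can be checked directly. This sketch redefines <code>A</code> and <code>x</code> locally and forms the linear combination of columns explicitly.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
x = np.array([1.0, 2.0, 3.0])

# x_1 c_1 + x_2 c_2 + x_3 c_3, where c_j = A[:, j] is the j-th column of A.
combo = sum(x[j] * A[:, j] for j in range(A.shape[1]))

print(A @ x)
print(combo)
```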

Given an $m \times n$ matrix $A = (a_{ij})$, its transpose $A^{T}$ is the $n \times m$ matrix whose $(i,j)$ entry is $a_{ji}$:

In [None]:
print(A, end="\n\n")
print(A.T)

A matrix is symmetric if $A = A^{T}$.  Note that symmetric matrices are necessarily square.

In [None]:
B = np.array([
    [1,2,3],
    [4,5,6],
    [7,8,9]
])
B = B + B.T  # A symmetric matrix.
print(B)

A matrix $U \in \Rnn$ is orthogonal if $U^{T}U = I$:

In [None]:
U,S,Vh = np.linalg.svd(A)  # U and V are orthogonal
print(Vh @ Vh.T)

Orthogonal matrices preserve the Euclidean norm:

In [None]:
print(f"||x||    = {np.linalg.norm(x, ord=2)}")
print(f"||Vh x|| = {np.linalg.norm(Vh@x, ord=2)}")  # Vh is the 3 x 3 orthogonal factor from the SVD above.

A matrix $A \in \Rnn$ is called positive definite if $x^{T}Ax > 0$ for all $x \neq 0$.

A matrix $A \in \Rnn$ is called positive semidefinite if $x^{T}Ax \geq 0$ for all $x$.

In [None]:
A = np.array([  # A is symmetric positive definite.
    [2,1],
    [1,2]
])

rng = np.random.default_rng()
for _ in range(42):
    u = 2*rng.random((2,)) - 1  # Random vector with entries in [-1, 1).
    print(u@A@u)
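For contrast, here is a sketch of a matrix that is positive semidefinite but not positive definite. The matrix <code>A_psd</code> and the seed are illustrative choices; for this matrix $x^{T}Ax = (x_{1} + x_{2})^{2}$, which is nonnegative but vanishes at $x = (1, -1)$.

```python
import numpy as np

A_psd = np.array([[1.0, 1.0],
                  [1.0, 1.0]])  # x^T A x = (x_1 + x_2)^2

# A nonzero vector with x^T A x = 0, so A_psd is not positive definite.
u0 = np.array([1.0, -1.0])
print(u0 @ A_psd @ u0)

# Random vectors never give a (meaningfully) negative value.
rng = np.random.default_rng(0)
samples = 2 * rng.random((100, 2)) - 1
vals = np.array([v @ A_psd @ v for v in samples])
print(vals.min() >= -1e-12)  # small tolerance for round-off
```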

# Eigenvalues and eigenvectors

Let $A$ be an $n \times n$ matrix.  An <span style="color: red">eigenvalue</span> of $A$ is a number $\lambda \in \C$ such that there exists a vector $v \neq 0$ for which
$$
Av = \lambda v.
$$
Such a $v$ is called an <span style="color: red">eigenvector</span>.
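The defining relation can be verified numerically. In this sketch the matrix is an illustrative choice; <code>np.linalg.eig</code> returns the eigenvalues and a matrix whose columns are the corresponding eigenvectors.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, V = np.linalg.eig(A)  # Column V[:, k] is an eigenvector for lam[k].
for k in range(len(lam)):
    v = V[:, k]
    # Check A v = lambda v up to round-off.
    print(lam[k], np.allclose(A @ v, lam[k] * v))
```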

If $A$ is real it may still have complex eigenvalues:

In [None]:
A = np.array([
    [1,-1],
    [1, 1]
])
print(np.linalg.eigvals(A))

However, if $A \in \Rnn$ is symmetric, then all of its eigenvalues are real, and there exists an orthogonal set of eigenvectors that spans $\Rn$:

In [None]:
A = np.array([
    [1, 2, 3],
    [2, 4, 5],
    [3, 5, 4]
])

D, V = np.linalg.eigh(A)  # eigh is meant for symmetric matrices; returns eigenvalues, orthonormal eigenvectors.
print("eigenvalues: ", D)
print("eigenvectors: ", V)
print(V@V.T)  # Check that V is orthogonal: this should be (nearly) the identity.

A symmetric matrix is positive definite if and only if all of its eigenvalues are positive.

In [None]:
A = np.array([  # A is symmetric positive definite.
    [3,1,1],
    [1,3,1],
    [1,1,3]
])

print(np.linalg.eigvals(A))

# The singular value decomposition

If $A \in \Rmn$, then there exist orthogonal matrices
\begin{align*}
    U &= \left[u_{1}, \cdots, u_{m}\right] \in \Rmm \\
    V &= \left[v_{1}, \cdots, v_{n}\right] \in \Rnn,
\end{align*}
and $\Sigma \in \Rmn$, which is zero apart from its leading $p \times p$ diagonal block
$$
  \begin{pmatrix}
    \sigma_{1} & 0 & 0 & \cdots & 0 \\
    0 & \sigma_{2} & 0 & \cdots & 0 \\
    \vdots & \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & 0 & \cdots & \sigma_{p}
\end{pmatrix},
$$
where $p = \min(m,n)$ and $\sigma_{1} \geq \sigma_{2} \geq \ldots \geq \sigma_{p} \geq 0$, such that 
$$
  A = U \Sigma V^{T}.
$$
This is called the singular value decomposition (SVD).

This means
$$
A = \sum_{k=1}^{p} \sigma_{k} u_{k}v_{k}^{T}.
$$

In [None]:
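A = np.array([[1, 2, 3], [4, 5, 6]])  # Reuse the 2 x 3 matrix from the Matrices section.
U,S,Vh = np.linalg.svd(A)  # S holds the singular values; Vh is V^T.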
print(f"{U = }")
print(f"{S = }")
print(f"{Vh = }")

In [None]:
S = np.c_[ np.diag(S), np.zeros(2) ]  # Add a column of zeros.
print(S)

In [None]:
print(f"{A - U@S@Vh = }")
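The sum-of-rank-one-terms form of the SVD can also be checked. This sketch redefines a $2 \times 3$ matrix locally and uses the fact that the rows of <code>Vh</code> are the vectors $v_{k}^{T}$; the name <code>A_sum</code> is illustrative.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
U, s, Vh = np.linalg.svd(A)
p = min(A.shape)

# A = sum_k sigma_k u_k v_k^T, with u_k = U[:, k] and v_k^T = Vh[k, :].
A_sum = sum(s[k] * np.outer(U[:, k], Vh[k, :]) for k in range(p))

print(A_sum)
print(np.allclose(A, A_sum))
```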

# Norms and inner products

A norm on $\Rn$ is a map $\| \cdot \|: \Rn \rightarrow \R$ with the following properties:
1. $\| x \| = 0$ if and only if $x = 0$.
2. $\| \alpha x \| = |\alpha|\ \| x \|$ for all $\alpha \in \R$ and $x \in \Rn$.
3. $\| x + y \| \leq \| x \| + \| y \|$ for all $x,y \in \Rn$.

The last property is called the triangle inequality.

If $\| \cdot \|$ and $\|| \cdot |\|$ are norms on $\Rn$ then there exist $c, C > 0$ such that for all $x \in \Rn$,
$$
c \|| x |\| \leq \| x \| \leq C \|| x |\|.
$$

For $1 \leq p < \infty$ the $p$-norm is defined to be 
$$
\| x \|_{p} = \left( \sum_{i=1}^{n} | x_{i} |^{p} \right)^{1/p}.
$$
The $\infty$-norm is
$$
\| x \|_{\infty} = \max_{1 \leq i \leq n} | x_{i} |.
$$
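These norms are available through the <code>ord</code> argument of <code>np.linalg.norm</code>. The sketch below uses an illustrative vector and a hypothetical helper <code>p_norm</code> that applies the formula directly; it also checks one standard instance of norm equivalence, $\| x \|_{\infty} \leq \| x \|_{2} \leq \| x \|_{1} \leq n \| x \|_{\infty}$.

```python
import numpy as np

def p_norm(x, p):
    """Hypothetical helper: the p-norm computed straight from the definition."""
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

x = np.array([3.0, -4.0, 0.0])
n = x.size

n1 = np.linalg.norm(x, ord=1)         # sum of absolute values
n2 = np.linalg.norm(x, ord=2)         # Euclidean norm
ninf = np.linalg.norm(x, ord=np.inf)  # largest absolute value

print(f"{n1 = }, {n2 = }, {ninf = }")
print(np.isclose(p_norm(x, 1), n1), np.isclose(p_norm(x, 2), n2))
print(ninf <= n2 <= n1 <= n * ninf)
```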

$\newcommand{\ip}[2]{\langle #1, #2 \rangle}$
An inner product on $\Rn$ is a map $\ip{\cdot}{\cdot}: \Rn \times \Rn \rightarrow \R$ with the following properties:
1. $\ip{x}{y} = \ip{y}{x}$ for all $x,y \in \Rn$.
2. $\ip{\alpha x}{y} = \alpha \ip{x}{y}$ for all $\alpha \in \R$ and $x,y \in \Rn$.
3. $\ip{x + y}{z} = \ip{x}{z} + \ip{y}{z}$ for all $x,y,z\in \Rn$.
4. $\ip{x}{x} > 0$ for all $x \neq 0$.

For any inner product on $\Rn$ there exists a symmetric positive definite matrix $A$ such that
$$
\ip{x}{y} = x^{T}Ay.
$$
An inner product induces a norm via
$$
\| x \| = \sqrt{\ip{x}{x}}.
$$
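As an illustration, the sketch below builds an inner product from a symmetric positive definite matrix and the norm it induces. The matrix and the helper names <code>ip</code> and <code>norm_A</code> are assumptions made for this example.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # symmetric positive definite

def ip(x, y):
    """Inner product <x, y> = x^T A y induced by A."""
    return x @ A @ y

def norm_A(x):
    """Norm induced by the inner product: sqrt(<x, x>)."""
    return np.sqrt(ip(x, x))

x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])

print(ip(x, y), ip(y, x))                      # symmetry
print(norm_A(x))                               # differs from the Euclidean norm of x
print(norm_A(x + y) <= norm_A(x) + norm_A(y))  # triangle inequality
```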