In [None]:
import numpy as np

# Analytic Geometry

We've already seen multiple views of vectors:

* as an array of numbers (a computer science “data structure” view);
* as an arrow with a direction and magnitude (a physics view); and
* as an object that obeys addition and scaling (a mathematics view).

We're going to take a slightly more abstract approach to the physics view and add some geometric interpretation and intuition to vectors, vector spaces, and linear mappings. This will yield useful tools for machine learning topics such as regression, matrix decomposition, and dimensionality reduction.

## Norms

Taking the square root of the inner product of a vector with itself gives a scalar, $\sqrt{\mathbf{x}^{\top} \mathbf{x}}$, known as the Euclidean *norm*. It measures the length of the vector $\mathbf{x}$ and is written $\|\mathbf{x}\|_2$. Equivalently, it is the square root of the sum of the squares of the vector's elements.

Let's look at a few different ways to compute the Euclidean norm. We use the `%%time` cell magic to time the execution of a cell. For more control over the measurement (e.g. repeated runs), you can use `%%timeit`. For more information on profiling and timing code, check out [Jake VanderPlas' chapter on the subject](https://jakevdp.github.io/PythonDataScienceHandbook/01.07-timing-and-profiling.html).

In [None]:
x = np.array([1, 2, 3])
print(x)

Explicitly implementing the dot product ourselves:

In [None]:
%%time
np.sqrt(np.sum(x * x))

Making use of `numpy.dot`, which looks a bit more like the algebraic expression:

In [None]:
%%time
np.sqrt(np.dot(x.T, x))

Note that we didn't actually need the transpose: when both inputs are 1-D arrays, `numpy.dot` computes the inner product directly, and transposing a 1-D array has no effect anyway.

In [None]:
%%time
np.sqrt(np.dot(x, x))

Finally, we use a more powerful function for computing general norms:

In [None]:
%%time
np.linalg.norm(x)
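
Outside a notebook the `%%time` magic is not available, but the standard-library `timeit` module gives comparable measurements. The following is a minimal sketch, assuming a larger random vector so that differences become visible (the size `100_000`, the seed, and the name `candidates` are illustrative choices, not part of the original). It first confirms that the approaches above agree and then times each one.

```python
import timeit

import numpy as np

# Illustrative input: a larger vector makes timing differences visible.
rng = np.random.default_rng(0)
v = rng.standard_normal(100_000)

# The approaches shown above, wrapped as zero-argument callables.
candidates = {
    "sum of squares": lambda: np.sqrt(np.sum(v * v)),
    "np.dot": lambda: np.sqrt(np.dot(v, v)),
    "np.linalg.norm": lambda: np.linalg.norm(v),
}

# All approaches should agree up to floating-point tolerance.
results = {name: fn() for name, fn in candidates.items()}
reference = results["np.linalg.norm"]
assert all(np.isclose(value, reference) for value in results.values())

# Best of 5 repeats, 100 calls each; absolute numbers depend on the machine.
timings = {
    name: min(timeit.repeat(fn, number=100, repeat=5))
    for name, fn in candidates.items()
}
for name, seconds in timings.items():
    print(f"{name:>16}: {seconds / 100 * 1e6:.1f} us per call")
```

Taking the minimum over several repeats reduces the influence of unrelated activity on the machine; the ranking of the three approaches may still vary with vector size and NumPy build.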

If you're feeling really geeky, you can even call raw BLAS functions directly from SciPy:

In [None]:
import scipy.linalg
# Look up the BLAS nrm2 routine suited to x, then call it on x.
nrm2, = scipy.linalg.get_blas_funcs(('nrm2',), (x,))
%time print(nrm2(x))

What is the difference between `numpy.linalg` and `scipy.linalg`? Well, according to the [SciPy docs](https://docs.scipy.org/doc/scipy/reference/tutorial/linalg.html):

> scipy.linalg contains all the functions that are in numpy.linalg. Additionally, scipy.linalg also has some other advanced functions that are not in numpy.linalg. Another advantage of using scipy.linalg over numpy.linalg is that it is always compiled with BLAS/LAPACK support, while for NumPy this is optional. Therefore, the SciPy version might be faster depending on how NumPy was installed.

It is recommended therefore to use `scipy.linalg` instead of `numpy.linalg` unless you don't want to add `scipy` as a dependency to your `numpy` program.

### Manhattan ($\ell_1$) norm

The $\ell_1$ norm of $\mathbf{x} \in \mathbb{R}^n$, $\|\mathbf{x}\|_1 = \sum_{i=1}^n |x_i|$, is frequently encountered in machine learning. You can compute it with `numpy.linalg.norm` by passing `1` as the second argument.

In [None]:
np.linalg.norm(x, 1)

NumPy provides the `numpy.isclose` and `numpy.allclose` functions to test array-like objects for equality up to a desired tolerance.

In [None]:
np.allclose(np.linalg.norm(x, 1), np.sum(np.abs(x)))
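
As a sketch of how the two functions differ, assuming an illustrative vector `v` with a negative entry (not from the original): `numpy.isclose` compares elementwise and returns one boolean per entry, while `numpy.allclose` reduces the comparison to a single boolean. The snippet also checks `numpy.linalg.norm` against the defining formulas for the $\ell_1$ and $\ell_2$ norms.

```python
import numpy as np

v = np.array([1.0, -2.0, 3.0])  # illustrative vector

l1 = np.linalg.norm(v, 1)  # sum of absolute values
l2 = np.linalg.norm(v)     # Euclidean norm (the default)

# Compare against the defining formulas.
manual = np.array([np.sum(np.abs(v)), np.sqrt(np.sum(v**2))])
computed = np.array([l1, l2])

print(np.isclose(computed, manual))   # elementwise: one bool per entry
print(np.allclose(computed, manual))  # single bool: are all entries close?

# Exact equality can fail where a tolerance-based test succeeds.
print(0.1 + 0.2 == 0.3)            # False
print(np.isclose(0.1 + 0.2, 0.3))  # True
```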

## Tests for symmetry and positive definiteness

Since symmetric, positive definite matrices play an important role in machine learning, it is important to be able to test for these properties.

Let's take the matrices from Example 3.4:

$$\mathbf{A}_1 = 
\begin{bmatrix} 9 & 6\\ 6 & 5\\\end{bmatrix}, \quad
\mathbf{A}_2 = 
\begin{bmatrix} 9 & 6\\ 6 & 3\\\end{bmatrix}
$$

In [None]:
A1 = np.array([[9, 6], [6, 5]], dtype='float')
A2 = np.array([[9, 6], [6, 3]], dtype='float')

### Exercise

Before unfolding these blocks, why don't you see if you can write a function to test whether a matrix is symmetric?

In [None]:
#@title
from numpy import diag_indices_from, empty_like, finfo, sqrt, asanyarray
from numpy.linalg import LinAlgError, cholesky

In [None]:
#@title
# Source: https://numpy-sugar.readthedocs.io/en/stable/_modules/numpy_sugar/linalg/property.html#check_symmetry
def check_symmetry(A):
    """Check if ``A`` is a symmetric matrix.

    Args:
        A (array_like): Matrix.

    Returns:
        bool: ``True`` if ``A`` is symmetric; ``False`` otherwise.
    """
    A = asanyarray(A)
    if A.ndim != 2:
        raise ValueError("Checks symmetry only for bi-dimensional arrays.")

    if A.shape[0] != A.shape[1]:
        return False

    return abs(A - A.T).max() < sqrt(finfo(float).eps)

In [None]:
print(check_symmetry(A1))
print(check_symmetry(A2))
print(check_symmetry([[1, 2], [3, 4]]))
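
A shorter variant compares the matrix with its transpose directly. This is a sketch, assuming the default tolerances of `numpy.allclose` are acceptable; the function name `is_symmetric` is hypothetical and not part of NumPy or numpy-sugar.

```python
import numpy as np


def is_symmetric(A):
    """Return True if A is a square 2-D array equal to its transpose (within tolerance)."""
    A = np.asarray(A, dtype=float)
    if A.ndim != 2 or A.shape[0] != A.shape[1]:
        return False
    return bool(np.allclose(A, A.T))


print(is_symmetric([[9, 6], [6, 5]]))        # True
print(is_symmetric([[1, 2], [3, 4]]))        # False
print(is_symmetric([[1, 2, 3], [4, 5, 6]]))  # False: not square
```

Unlike `check_symmetry`, this version returns `False` instead of raising for inputs that are not two-dimensional, and `numpy.allclose` combines a relative and an absolute tolerance rather than a single absolute threshold.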

With positive definiteness, it's a little more nuanced. An efficient way to test for it is to attempt a particular kind of matrix decomposition, called the Cholesky decomposition, using NumPy's implementation. We'll explore it in detail in the next unit, so for now let's just treat it as a black box. `numpy.linalg.cholesky` raises a `LinAlgError` if its argument is not positive definite.

In [None]:
# Source: https://numpy-sugar.readthedocs.io/en/stable/_modules/numpy_sugar/linalg/property.html#check_definite_positiveness
def check_definite_positiveness(A):
    """Check if ``A`` is a definite positive matrix.

    Args:
        A (array_like): Matrix.

    Returns:
        bool: ``True`` if ``A`` is definite positive; ``False`` otherwise.
    """
    try:
        cholesky(A)
    except LinAlgError:
        return False
    return True

In [None]:
print(check_definite_positiveness(A1))
print(check_definite_positiveness(A2))
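
One caveat: the Cholesky-based test only makes sense for symmetric input, since `numpy.linalg.cholesky` does not verify symmetry itself. As a cross-check, a symmetric matrix is positive definite exactly when all of its eigenvalues are positive. The sketch below uses `numpy.linalg.eigvalsh` for this; the helper name `is_positive_definite_eig` is hypothetical.

```python
import numpy as np


def is_positive_definite_eig(A):
    """Eigenvalue-based test: symmetric and all eigenvalues strictly positive."""
    A = np.asarray(A, dtype=float)
    if A.ndim != 2 or A.shape[0] != A.shape[1] or not np.allclose(A, A.T):
        return False
    # eigvalsh is meant for symmetric matrices; eigenvalues come back in ascending order.
    return bool(np.linalg.eigvalsh(A)[0] > 0)


A1 = np.array([[9, 6], [6, 5]], dtype=float)
A2 = np.array([[9, 6], [6, 3]], dtype=float)

print(np.linalg.eigvalsh(A1))        # both eigenvalues positive
print(np.linalg.eigvalsh(A2))        # one eigenvalue negative
print(is_positive_definite_eig(A1))  # True
print(is_positive_definite_eig(A2))  # False
```

Computing eigenvalues is typically more expensive than attempting a Cholesky factorization, so this is better suited to checking results than to routine use.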

### Exercise

Before unfolding the next block, see if you can modify the function above to test for positive semidefiniteness. This is tricky, so don't worry too much if you're stumped.

In [None]:
#@title
# Source: https://numpy-sugar.readthedocs.io/en/stable/_modules/numpy_sugar/linalg/property.html#check_semidefinite_positiveness
def check_semidefinite_positiveness(A):
    """Check if ``A`` is a positive semi-definite matrix.

    Args:
        A (array_like): Matrix.

    Returns:
        bool: ``True`` if ``A`` is positive semidefinite; ``False`` otherwise.
    """
    B = empty_like(A)
    B[:] = A
    B[diag_indices_from(B)] += sqrt(finfo(float).eps)
    try:
        cholesky(B)
    except LinAlgError:
        return False
    return True

In [None]:
print(check_semidefinite_positiveness(A1))
print(check_semidefinite_positiveness(A2))
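
Neither example matrix separates the two notions: $\mathbf{A}_1$ is positive definite and $\mathbf{A}_2$ has a negative eigenvalue. A matrix that is positive semidefinite but not positive definite, such as the rank-one matrix below, shows the difference. This sketch tests semidefiniteness via eigenvalues with a small tolerance; the matrix `S`, the helper name, and the tolerance `tol` are illustrative assumptions.

```python
import numpy as np


def is_positive_semidefinite_eig(A, tol=1e-10):
    """Symmetric and no eigenvalue below -tol (the tolerance absorbs round-off)."""
    A = np.asarray(A, dtype=float)
    if A.ndim != 2 or A.shape[0] != A.shape[1] or not np.allclose(A, A.T):
        return False
    return bool(np.linalg.eigvalsh(A)[0] >= -tol)


S = np.array([[1.0, 1.0], [1.0, 1.0]])   # eigenvalues 0 and 2
A2 = np.array([[9.0, 6.0], [6.0, 3.0]])  # has a negative eigenvalue

print(is_positive_semidefinite_eig(S))   # True
print(is_positive_semidefinite_eig(A2))  # False

# S is not positive definite: its smallest eigenvalue is (numerically) zero.
print(bool(np.all(np.linalg.eigvalsh(S) > 1e-10)))  # False
```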

## Orthogonality

Let's consider the two vectors in Example 3.7, $\mathbf{x} = [1, 1]^{\top}$, $\mathbf{y} = [-1, 1]^{\top} \in \mathbb{R}^2$. 

Using the dot product as the inner product yields $\mathbf{x} \perp \mathbf{y}$:


In [None]:
x = np.array([1, 1])
y = np.array([-1, 1])

np.dot(x, y)

However, if we choose the inner product

$$ 
\langle \mathbf{x} , \mathbf{y} \rangle = \mathbf{x}^{\top}
\begin{bmatrix}
2 & 0\\
0 & 1\\
\end{bmatrix}
\mathbf{y},
$$

we get that the cosine of the angle $\omega$ between $\mathbf{x}$ and $\mathbf{y}$ given by

$$
\cos \omega = \frac{\langle \mathbf{x} , \mathbf{y} \rangle}{\|\mathbf{x}\|\|\mathbf{y}\|}
$$

is:

In [None]:
A = np.array([[2, 0], [0, 1]])

# define our inner product
def innerprod(x, y, A):
    return x.dot(A).dot(y)

# norm is based on our new inner product
def norm(x, A):
    return np.sqrt(innerprod(x, x, A))
  
cos_omega = innerprod(x, y, A) / (norm(x, A) * norm(y, A))
print(cos_omega)
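
The printed value can be checked by hand. With $\mathbf{x} = [1, 1]^{\top}$ and $\mathbf{y} = [-1, 1]^{\top}$, the inner product above weights the first coordinate by 2 and the second by 1:

$$
\langle \mathbf{x} , \mathbf{y} \rangle = 2 \cdot 1 \cdot (-1) + 1 \cdot 1 \cdot 1 = -1, \qquad
\|\mathbf{x}\| = \sqrt{2 \cdot 1^2 + 1 \cdot 1^2} = \sqrt{3}, \qquad
\|\mathbf{y}\| = \sqrt{2 \cdot (-1)^2 + 1 \cdot 1^2} = \sqrt{3},
$$

so that

$$
\cos \omega = \frac{-1}{\sqrt{3}\,\sqrt{3}} = -\frac{1}{3}.
$$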

To get $\omega$ we take the inverse cosine:

In [None]:
omega = np.arccos(cos_omega)
print(omega)  # in radians
print(np.rad2deg(omega))  # in degrees

So we see that vectors that are orthogonal with respect to one inner product are not necessarily orthogonal with respect to a different inner product.

### Orthogonal matrix

A square matrix is orthogonal if and only if its columns are orthonormal.

Let's consider the matrix

$$\mathbf{A} = 
\begin{bmatrix} \cos(0.5) & -\sin(0.5)\\ \sin(0.5) & \cos(0.5) \\\end{bmatrix}
$$

In [None]:
A = np.array([[np.cos(0.5), -np.sin(0.5)], [np.sin(0.5), np.cos(0.5)]])


First, we'll check whether its columns are orthogonal and have unit norm:

In [None]:
print(np.dot(A[:, 0], A[:, 1]))
print(np.linalg.norm(A[:, 0]))
print(np.linalg.norm(A[:, 1]))

For an orthogonal matrix $\mathbf{A} \in \mathbb{R}^{n \times n}$, $\mathbf{A}\mathbf{A}^{\top} = \mathbf{I}$.

In [None]:
np.allclose(np.dot(A, A.T), np.eye(2))

The above property also implies that $\mathbf{A}^{-1} = \mathbf{A}^{\top}$.

In [None]:
np.allclose(A.T, np.linalg.inv(A))
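
The checks above can be bundled into a small helper. This is a sketch; the name `is_orthogonal`, the comparison matrix `M`, and the test vector `v` are illustrative. It tests $\mathbf{A}^{\top}\mathbf{A} = \mathbf{I}$, which for a square matrix is equivalent to the identity stated above, and it illustrates one consequence: multiplying by an orthogonal matrix leaves Euclidean lengths unchanged, since $\|\mathbf{A}\mathbf{x}\|^2 = \mathbf{x}^{\top}\mathbf{A}^{\top}\mathbf{A}\mathbf{x} = \mathbf{x}^{\top}\mathbf{x}$.

```python
import numpy as np


def is_orthogonal(Q):
    """Return True if Q is square and Q^T Q equals the identity (within tolerance)."""
    Q = np.asarray(Q, dtype=float)
    if Q.ndim != 2 or Q.shape[0] != Q.shape[1]:
        return False
    return bool(np.allclose(Q.T @ Q, np.eye(Q.shape[0])))


theta = 0.5
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])  # the rotation matrix from above
M = np.array([[2.0, 0.0], [0.0, 1.0]])          # stretches the first axis

print(is_orthogonal(R))  # True
print(is_orthogonal(M))  # False

# Lengths are preserved by R but not by M.
v = np.array([3.0, 4.0])
print(np.linalg.norm(v), np.linalg.norm(R @ v), np.linalg.norm(M @ v))
```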

### Exercise

We saw that the columns of an orthogonal matrix are orthonormal. That is, each column has length one and the columns are mutually perpendicular. What can we say about the rows of an orthogonal matrix? Why?