# Arrays and Vectors

Arrays are far more general objects than vectors or matrices, yet arrays can be (and are) used to represent vectors and matrices.  Here we will explore some of the implications of this and some of the care that must be taken when using a general object (an array) to represent specific objects (vectors and matrices).

## Initialization

In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import scipy.linalg as la
# Globally fix plot styling
import matplotlib as mpl
mpl.rc('xtick', direction='in', top=True)
mpl.rc('ytick', direction='in', right=True)
mpl.rc('xtick.minor', visible=True)
mpl.rc('ytick.minor', visible=True)

### A brief aside on documentation

One way to look up documentation on a function is to append or prepend a question mark to its name. For example, we can access the documentation of the `np.min` function:

In [2]:
np.min?

In [4]:
?np.min

Many commonly used modules in Python have excellent documentation. However, be careful! Community-contributed modules may not be well documented and may have inconsistent conventions. One example I have encountered recently is the `pywigxjpf` module, which is useful for computing Wigner symbols and Clebsch–Gordan coefficients. The source version, Conda version, and PyPI version all accepted similar but slightly different argument formats, and none of these differences were documented.

You can define your own documentation too! If I define a function, I can similarly access its documentation.

In [5]:
def mysquare(x):
    """
    Return the square of the argument.

    Parameters
    ----------
    x : Input value.

    Returns
    -------
    out : value, squared.
    """
    return x**2

mysquare?
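
The `?` syntax is a feature of IPython and Jupyter.  In a plain Python script the same docstring can be reached through the `__doc__` attribute, through `inspect.getdoc`, or through the built-in `help()` function.  A minimal sketch, redefining `mysquare` with a shortened docstring so that the block is self-contained:

```python
import inspect

def mysquare(x):
    """
    Return the square of the argument.
    """
    return x**2

# The raw docstring is stored on the function object itself.
raw_doc = mysquare.__doc__

# inspect.getdoc returns the same text with the indentation cleaned up.
clean_doc = inspect.getdoc(mysquare)
print(clean_doc)

# Interactively, help(mysquare) displays the same text.
```

The same approach works for library functions, for example `inspect.getdoc(np.min)` after importing NumPy.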

## Lists versus Arrays

The basic data structures in Python itself are lists and tuples.  We can think of a tuple as a read-only list.  The basic data structure in NumPy is an array.  Arrays and lists are not the same thing, even though most NumPy and SciPy functions will treat them as the same.  In fact, many of us have "gotten away with" using a list where we meant an array, since the functions we used silently converted lists to arrays.

To see that there is a difference, we first construct the two different types of objects.

In [10]:
a_list = [ 1., 2, 3 ]
# We can convert the list to an array
a_arr = np.array(a_list)
print(f"""List : {a_list}
Array: {a_arr}""")

List : [1.0, 2, 3]
Array: [1. 2. 3.]


Notice that these are different!  The list preserves the types of the objects we put into it: the first element is a float and the last two are integers.  The array, on the other hand, converts everything to a single common type that can represent every element, so here everything becomes a float.  Python lists can contain elements of different types.  All elements of a NumPy array must have the same type.
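
The type conversion can be checked directly.  The sketch below inspects the element types of the list and the `dtype` of the array; the all-integer array `int_arr` is an extra illustration that is not part of the cells above.

```python
import numpy as np

a_list = [1., 2, 3]
a_arr = np.array(a_list)

# Each list element keeps its own Python type.
list_types = [type(x).__name__ for x in a_list]
print(list_types)        # ['float', 'int', 'int']

# The array stores one dtype shared by all elements.
print(a_arr.dtype)       # float64

# A list containing only integers produces an integer array instead
# (the exact integer width depends on the platform).
int_arr = np.array([1, 2, 3])
print(int_arr.dtype)
```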

Next, if we try to add a float to a list or an array the addition works fine for the array but gives an error for the list.

In [11]:
print("Array sum :", a_arr+1.1)
print("List sum  :", a_list+1.1)

Array sum : [2.1 3.1 4.1]


TypeError: can only concatenate list (not "float") to list

Notice that Python did try to interpret the addition in some way: for lists, `+` means concatenation, and a float cannot be concatenated to a list.

On the other hand, if we multiply by an integer, both operations succeed, but they do very different things!

In [12]:
print("Array mult :", a_arr*2)
print("List mult  :", a_list*2)

Array mult : [2. 4. 6.]
List mult  : [1.0, 2, 3, 1.0, 2, 3]


The summary of this is that we must be careful with how we represent our information.  If we mean to use arrays then we should use arrays.  Even if the functions we are using silently "fix" this for us, we always should use the correct data structure.  If we do not, then we will run into a problem, eventually.
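
One way to follow this advice is to convert inputs explicitly at the start of a function rather than relying on silent conversion later.  The helper `shift_and_scale` below is hypothetical and only illustrates the pattern with `np.asarray`.

```python
import numpy as np

def shift_and_scale(values, shift, scale):
    """Hypothetical helper: return (values + shift) * scale element-wise."""
    # np.asarray turns lists and tuples into arrays and leaves existing
    # float arrays as they are, so the arithmetic below is always
    # array arithmetic and never list concatenation or repetition.
    values = np.asarray(values, dtype=float)
    return (values + shift) * scale

result = shift_and_scale([1., 2, 3], 1.0, 2)
print(result)   # [4. 6. 8.]
```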

### Broadcasting

Another extremely powerful feature of arrays is broadcasting.  When dealing with matrices and vectors an expression such as $\mathsf{A} + \vec v$ makes no sense.  It is complete nonsense!  However, when dealing with arrays we can define what this means.  Similarly, we know that the multiplication of a matrix and a vector, $\mathsf{A}\vec v$, does make sense and produces a vector.  So what does the multiplication of a two dimensional array and a one dimensional array produce?

The answer to all questions relating to the combination of arrays with different dimensions is broadcasting.  There are many powerful things that can be done with broadcasting; here we will just cover the basics.  Though we will list some rules, the best approach, as always, is to **test your operations**.  There is no need to guess at how things will behave when we can easily check!

**Basic Rule**

For an array `A` with one more dimension than an array `v`, broadcasting will "repeat" `v` along the missing dimension of `A`.  For the two dimensional case this means that if `A` is an $M\times N$ array, then `v` must be of length $N$, and broadcasting will "repeat" it $M$ times and operate on each row of `A` using the same `v`.

This can be phrased in more sophisticated ways, but it is easiest to see in an example.  Here we create an array with 3 columns (notice the use of `reshape`).

In [13]:
A = np.arange(6).reshape((2,3))
v = np.arange(3) + 1
print("A =", A)
print("v =", v)

A = [[0 1 2]
 [3 4 5]]
v = [1 2 3]


We can add and multiply these two.  Again notice that the operations are performed on each component of `A` using the same components of `v` for each row.  More explicitly, for the addition case
$$ (A+v)_{ij} = A_{ij} + v_j. $$
Similarly for multiplication (and it would also be the same for subtraction, division, and other operators).

In [14]:
print("A+v =", A+v)
print("A*v =", A*v)

A+v = [[1 3 5]
 [4 6 8]]
A*v = [[ 0  2  6]
 [ 3  8 15]]
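
As a check on the formula $(A+v)_{ij} = A_{ij} + v_j$, the sketch below builds the sum element by element and compares it with the broadcast result.  It also uses `np.broadcast_to` to display the "repeated" version of `v`; the names `explicit` and `v_repeated` are introduced here only for illustration.

```python
import numpy as np

A = np.arange(6).reshape((2, 3))
v = np.arange(3) + 1

# Build A + v element by element, following (A+v)_ij = A_ij + v_j.
explicit = np.empty_like(A)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        explicit[i, j] = A[i, j] + v[j]

# np.broadcast_to shows v repeated along the rows of A.
v_repeated = np.broadcast_to(v, A.shape)
print(v_repeated)

print(np.array_equal(explicit, A + v))   # True
```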


The key to broadcasting is that the shape of `v` must match the shape of `A` in all of its final (trailing) dimensions.  In the two dimensional case this means that if `A` is $N\times M$ and `v` is of length $N$ then **broadcasting will not work** (at least not for $N\ne M$).

We can see that here.

In [15]:
A = np.arange(6).reshape((3,2))
v = np.arange(3) + 1
print("A =", A)
print("v =", v)
print("A+v =", A+v)

A = [[0 1]
 [2 3]
 [4 5]]
v = [1 2 3]


ValueError: operands could not be broadcast together with shapes (3,2) (3,) 

Notice the error even tells us the broadcasting fails, and why.  How can we fix this?  Well, what we wanted to happen was to have `v` added to the *columns* of `A`, not the rows.  This can easily be accomplished by taking the transpose of `A`.  Since taking the transpose is so common there is shorthand for doing this, `A.T`.  Here, if we take the transpose of `A` we can add it to `v` using the usual broadcasting.  This is great, except the answer we get will have the rows and columns swapped.  How do we fix this?  Just take the transpose of the result!

In [16]:
print("Corrected A+v =", (A.T + v).T)

Corrected A+v = [[1 2]
 [4 5]
 [7 8]]
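
An alternative to the double transpose is to give `v` an explicit second axis of length one.  NumPy's broadcasting rules also repeat an array along any axis of length one, which goes slightly beyond the basic rule stated above.  A minimal sketch, where `v_col` and `col_sum` are names chosen for this example:

```python
import numpy as np

A = np.arange(6).reshape((3, 2))
v = np.arange(3) + 1

# v has shape (3,).  v[:, np.newaxis] has shape (3, 1), which broadcasts
# against the (3, 2) array by repeating v along the columns.
v_col = v[:, np.newaxis]
col_sum = A + v_col

print(col_sum)
print(np.array_equal(col_sum, (A.T + v).T))   # True
```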


*Technical details:* The way this, and most slicing, is done is by using what are called *strides*.  Strides just change the way the array is accessed.  No changes are made to the actual order of data in the array.  It is not necessary to know this to use broadcasting, but it does mean that it is not computationally expensive to use.  It is implemented in an efficient manner.
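
The strides can be inspected directly.  The sketch below shows that a transpose shares memory with the original array and simply swaps the strides, and that a broadcast view created with `np.broadcast_to` repeats its data by using a stride of zero.  The actual stride values depend on the integer size of the platform, so none are quoted here.

```python
import numpy as np

A = np.arange(6).reshape((3, 2))

# The transpose is a view: same memory, with the strides reversed.
print(A.strides, A.T.strides)
print(np.shares_memory(A, A.T))   # True

# A broadcast view "repeats" v without copying: the stride along the
# repeated axis is 0, so every row reads the same memory.
v = np.arange(2) + 1
v_view = np.broadcast_to(v, (3, 2))
print(v_view.strides)
```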

## Matrix operations

We have some familiarity with arrays since we have already been using them.  For linear algebra, which we will discuss further later in the class, the main routines are contained in `scipy.linalg`.  Behind the scenes many calculations are done using `BLAS` and `LAPACK` libraries.  Depending on how your version was compiled these can be highly optimized and even parallelized.

*Notes:*
1. Some functions *also* have implementations in NumPy.  In general we are going to prefer the versions from `scipy.linalg`.  In practice, both versions should give the same results except in extreme cases, such as dealing with (numerically) singular matrices.
2. Since we will represent both vectors and matrices using NumPy arrays we will need to be careful.  For example, we know what multiplying a matrix and a vector means, whereas multiplying a two dimensional array and a one dimensional array is not uniquely defined.  There *is* a matrix object in NumPy, `np.matrix`, but it is not well developed nor well supported.  In fact, the NumPy documentation no longer recommends it and notes that it may be removed in the future.  It should never be used!  Because we are using arrays instead of matrices, we will need to use a special function, `np.dot`, or a special operator, `@`, when we multiply a matrix and a vector. (We will discuss the `np.dot` function more below and the `@` operator in the future. The distinction between an array and a vector is an important detail!)
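
To make the second note concrete, the sketch below multiplies a small two dimensional array and a one dimensional array in both ways.  The arrays `M` and `x` are made up for this example.

```python
import numpy as np

M = np.array([[1, 2],
              [3, 4]])
x = np.array([1, 10])

# Matrix-vector product: (M x)_i = sum_j M_ij x_j
dot_result = np.dot(M, x)
at_result = M @ x
print(dot_result)   # [21 43]
print(at_result)    # [21 43]

# The * operator instead broadcasts: (M * x)_ij = M_ij x_j
star_result = M * x
print(star_result)
```

The two products do not even have the same shape: the matrix-vector product is one dimensional, while the broadcast product is two dimensional.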

As always, begin by looking at the documentation.  From `scipy.linalg` we see there are many routines.  Here we will focus on some of those from "Basics".  We will encounter some of the other functions in future weeks.

In [17]:
la?

As usual there are many things defined in this module.  For our purposes here we are interested in the basic functionality, in particular calculating inverses using `la.inv()`.  There are many other specialized functions, many matrix decompositions, *etc*., all of which have their particular uses.

In [18]:
la.inv?

Notice that this calculates the inverse for a **square matrix**.  There are options for optimizing the function that are common to many of the functions in `scipy.linalg`.  These are particularly useful when dealing with large matrices when we want to avoid making copies.  By default these functions do make a copy and do not overwrite their inputs.  This is a good default, but can lead to slower, more memory intensive code.  For all of our uses this will not be important so we will just use the defaults.
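
A short sketch of these points, using a made-up $2\times 2$ matrix: by default the input is left untouched, and according to the `la.inv` documentation a non-square input raises a `ValueError` while a singular matrix raises a `LinAlgError`.

```python
import numpy as np
import scipy.linalg as la

A = np.array([[2., 1.],
              [1., 3.]])
A_before = A.copy()

Ainv = la.inv(A)   # default overwrite_a=False: works on a copy

# The input array is unchanged by the call.
input_unchanged = np.array_equal(A, A_before)
print(input_unchanged)   # True

# A non-square array is rejected.
try:
    la.inv(np.ones((2, 3)))
    non_square_rejected = False
except ValueError as err:
    non_square_rejected = True
    print("ValueError:", err)

# A singular matrix (second row is twice the first) has no inverse.
try:
    la.inv(np.array([[1., 2.], [2., 4.]]))
    singular_rejected = False
except la.LinAlgError as err:
    singular_rejected = True
    print("LinAlgError:", err)
```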

As one test we can verify that `la.inv()` really does calculate an inverse.  Recall that for a square matrix with $\det(\mathsf{A})\ne 0$ there exists an inverse, $\mathsf{A}^{-1}$ that satisfies
$$ \mathsf{A} \mathsf{A}^{-1} = \mathsf{A}^{-1} \mathsf{A} = \mathsf{1}. $$

In [19]:
# Construct a random matrix
A = np.random.random(size=(3,3))
# Calculate its inverse
Ainv = la.inv(A)
# Print some results
print(f"""
det(A) =\n {la.det(A)}
A Ainv =\n {np.dot(A, Ainv)}
Ainv A =\n {np.dot(Ainv, A)}""")


det(A) =
 0.14320419636627074
A Ainv =
 [[ 1.00000000e+00  1.30204909e-16 -1.79675558e-16]
 [ 1.08928249e-16  1.00000000e+00 -1.88761513e-16]
 [ 1.18775378e-16  2.96863822e-16  1.00000000e+00]]
Ainv A =
 [[ 1.00000000e+00 -3.85441056e-16  2.49385471e-16]
 [-7.10269450e-17  1.00000000e+00  1.67169121e-16]
 [ 7.40317575e-17  1.27822265e-16  1.00000000e+00]]


There are a few things to notice here.  First, we used `np.dot()` to multiply the matrices; we **did not use** `A*Ainv`.  Why not?

Let us compare the two options.

In [20]:
print(f"""
np.dot(A,Ainv) =\n {np.dot(A,Ainv)}
A*Ainv =\n {A*Ainv}""")


np.dot(A,Ainv) =
 [[ 1.00000000e+00  1.30204909e-16 -1.79675558e-16]
 [ 1.08928249e-16  1.00000000e+00 -1.88761513e-16]
 [ 1.18775378e-16  2.96863822e-16  1.00000000e+00]]
A*Ainv =
 [[-0.30085366  0.72776594  1.36432709]
 [ 0.17336493 -0.45239308  0.85824376]
 [ 0.65300737  1.39831779 -1.04027362]]


The first case gives the identity matrix (or something close to it) but what do we get in the second case?  The function `np.dot()` does *matrix multiplication* :
$$ (\mathsf{A} \mathsf{A}^{-1})_{ij} = \sum_{k} A_{ik} (A^{-1})_{kj}. $$

The usual multiplication, `A*Ainv`, multiplies the components of the two arrays
$$ (\mathsf{A} * \mathsf{A}^{-1})_{ij} = A_{ij} * (A^{-1})_{ij}. $$

In both cases we end up with a two dimensional array, but they are very different arrays!
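
The two index formulas can be written out literally with `np.einsum`, whose subscript strings mirror the indices in the equations.  This is only an illustration with small made-up integer arrays `B` and `C`, not a recommendation to replace `np.dot`.

```python
import numpy as np

B = np.array([[1, 2],
              [3, 4]])
C = np.array([[0, 1],
              [1, 0]])

# Matrix product: (B C)_ij = sum_k B_ik C_kj
matrix_product = np.einsum("ik,kj->ij", B, C)
print(np.array_equal(matrix_product, np.dot(B, C)))   # True

# Element-wise product: (B * C)_ij = B_ij C_ij
elementwise = np.einsum("ij,ij->ij", B, C)
print(np.array_equal(elementwise, B * C))             # True

print(matrix_product)
print(elementwise)
```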

The next thing to notice is that we are supposed to get the identity matrix.  We can construct this in NumPy in a few ways; one is to use `np.eye()`.

In [21]:
np.eye(A.shape[0])

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

Do your matrix multiplications actually return the identity matrix?  We see that the result is not exactly the identity matrix, but we do not expect it to be.  (Why not?)  For small arrays like the ones we have here we can look at all the components and see that the diagonal entries are one and the off-diagonal entries are close enough to zero, as expected.  For larger arrays we can once again use `np.allclose`.

In [22]:
np.allclose?

As the documentation describes, this tests whether each pair of corresponding components of our arrays is "close" in both an absolute and a relative sense.  In other words, there are two tolerances.  According to the documentation, components $a$ and $b$ of the two arrays, respectively, are considered close if
$$ |a - b| \le (a_{\mathrm{tol}} + r_{\mathrm{tol}} |b|), $$
where $a_{\mathrm{tol}}$ is the absolute tolerance and $r_{\mathrm{tol}}$ is the relative tolerance.  As the documentation also notes, this definition is asymmetric, $a$ and $b$ are treated differently.  This is an unfortunate choice and something that may be corrected in the future.
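
The asymmetry is easy to demonstrate.  In the sketch below the values and tolerances are chosen purely for illustration: the absolute tolerance is set to zero so that only the relative term $r_{\mathrm{tol}} |b|$ matters.

```python
import numpy as np

a, b = 100.0, 112.0
rtol, atol = 0.11, 0.0

# |a - b| = 12.  With b = 112 as the reference the bound is
# 0.11 * 112 = 12.32, so the values count as close.
forward = np.allclose(a, b, rtol=rtol, atol=atol)
print(forward)    # True

# Swapping the arguments makes 100 the reference: the bound becomes
# 0.11 * 100 = 11, so the same two values are no longer close.
swapped = np.allclose(b, a, rtol=rtol, atol=atol)
print(swapped)    # False
```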

For our purposes we can use this to verify the inverse found above.

In [24]:
print(f"""
A Ainv = 1? {np.allclose(np.eye(A.shape[0]),np.dot(A, Ainv))}
Ainv A = 1? {np.allclose(np.eye(A.shape[0]),np.dot(Ainv, A))}
""")


A Ainv = 1? True
Ainv A = 1? True

