# numpy Tutorial

## Introduction

numpy is a library for building multidimensional arrays and performing linear algebra operations. numpy calls the dimensions of an array _axes_ and the number of axes _rank_ (exposed as the `ndim` attribute).

## Initializing Arrays

We begin by importing the numpy module.

In [1]:
import numpy as np

The simplest way to create a numpy array is to convert a Python list. Note that the type of a numpy array is different from that of a Python list and provides significantly more functionality. The element type is deduced from the values passed in through the list; the default integer type is platform dependent, so the `int32` shown below may appear as `int64` on other systems.

In [2]:
np.array([1, 2, 3, 4])

array([1, 2, 3, 4])

In [3]:
a = np.array([1, 2, 3, 4])
a.dtype

dtype('int32')

In [4]:
b = np.array([1.2, 3.4, 5.6])
b.dtype

dtype('float64')
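The type deduction described above can also be overridden. The sketch below uses illustrative variable names that are not part of the original notebook; it shows mixed integer and floating-point values being promoted to a floating-point type, and the `dtype` argument being used to request a type explicitly.

```python
import numpy as np

# Mixed ints and floats are promoted to a common floating-point type.
mixed = np.array([1, 2.5, 3])

# The dtype argument requests a specific element type.
explicit = np.array([1, 2, 3, 4], dtype=np.float64)

print(mixed.dtype)     # float64
print(explicit.dtype)  # float64
```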

Another simple way to create a numpy array of a given size is to initialize it by dimension directly. This can be achieved using `zeros`, `ones`, or another size-based initialization function, each of which accepts a tuple argument specifying the dimensions. `zeros` fills the array with zeros and `ones` fills it with ones. `empty` allocates the array without initializing it, so its contents are arbitrary (in the output below it happens to hold leftover ones). There are also ways to initialize with random numbers, an identity matrix, etc.

In [5]:
np.zeros((3, 4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [6]:
np.ones((3, 4))

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [7]:
np.empty((3, 4))

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])
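The other initializers mentioned above might look like the following sketch. It assumes `np.eye` for the identity matrix, `np.full` for a constant fill, and the `np.random.default_rng` generator (available in NumPy 1.17 and later) for random values. The seed, shapes, and variable names are arbitrary choices for illustration.

```python
import numpy as np

identity = np.eye(3)            # 3x3 identity matrix
sevens = np.full((2, 3), 7.0)   # 2x3 array filled with 7.0

rng = np.random.default_rng(0)  # seeded generator; the seed is arbitrary
uniform = rng.random((2, 3))    # uniform samples in [0, 1)

print(identity)
print(sevens)
print(uniform.shape)  # (2, 3)
```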

Another way to initialize a numpy array is to use the `arange` or `linspace` functions. `arange` accepts the start and end of an interval as well as a step size (generally used with integers). `linspace` accepts the start and end of an interval as well as the number of elements (generally used with floating point numbers). `arange` includes the start value and excludes the end value of the interval. `linspace` includes both ends of the interval by default, but can be set to exclude the end.

In [8]:
np.arange(10, 30, 5)

array([10, 15, 20, 25])

In [9]:
np.linspace(0, 2, 9)

array([0.  , 0.25, 0.5 , 0.75, 1.  , 1.25, 1.5 , 1.75, 2.  ])
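To illustrate the endpoint behavior described above, the following sketch passes `endpoint=False` to `linspace`. The variable names are illustrative only.

```python
import numpy as np

closed = np.linspace(0, 2, 9)                     # includes 2.0
half_open = np.linspace(0, 2, 8, endpoint=False)  # stops before 2.0

print(closed[-1])     # 2.0
print(half_open[-1])  # 1.75

# With endpoint=False the spacing matches arange with a step of 0.25.
print(np.allclose(half_open, np.arange(0, 2, 0.25)))  # True
```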

## Operating on arrays

Common operations, such as addition, subtraction, exponentiation, etc., are executed element by element on arrays. numpy also provides some functions, such as `sin`, which operate on arrays elementwise.

In [10]:
a = np.array([20, 30, 40, 50])
b = np.arange(4)  # implicitly starts at 0
a - b

array([20, 29, 38, 47])

In [11]:
b ** 2

array([0, 1, 4, 9], dtype=int32)

In [12]:
a = np.linspace(0, 2 * np.pi, 8)
np.sin(a)

array([ 0.00000000e+00,  7.81831482e-01,  9.74927912e-01,  4.33883739e-01,
       -4.33883739e-01, -9.74927912e-01, -7.81831482e-01, -2.44929360e-16])

Naturally, numpy also provides operations on matrices. The `*` operator performs elementwise multiplication of two matrices, and the `dot` function performs matrix multiplication (each entry of the result is the dot product of a row of the first matrix with a column of the second).

In [13]:
a = np.array([[1, 1],
              [0, 1]])
b = np.array([[2, 0],
              [3, 4]])
a * b

array([[2, 0],
       [0, 4]])

In [14]:
np.dot(a, b)

array([[5, 4],
       [3, 4]])
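For 2D arrays, the `@` operator (Python 3.5 and later) gives the same matrix product as `np.dot`. A minimal sketch reusing the two matrices above, with `product` as an illustrative name:

```python
import numpy as np

a = np.array([[1, 1],
              [0, 1]])
b = np.array([[2, 0],
              [3, 4]])

product = a @ b  # matrix product, same result as np.dot(a, b) for 2-D arrays

print(product)
print(np.array_equal(product, np.dot(a, b)))  # True
```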

There are also functions that reduce over the whole array by default but accept an `axis` parameter to change the behavior for multidimensional arrays. For example, applying `sum` along the 0th axis adds the rows together and yields one total per column, producing a 1D array instead of a scalar value.

In [15]:
a = np.array([[0, 1,  2,  3],
              [4, 5,  6,  7],
              [8, 9, 10, 11]])
np.sum(a)

66

In [16]:
np.sum(a, axis=0)

array([12, 15, 18, 21])

In [17]:
np.sum(a, axis=1)

array([ 6, 22, 38])

In [18]:
np.mean(a, axis=1)

array([1.5, 5.5, 9.5])
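One way to keep track of what `axis` does is to look at the shape of the result: the named axis is the one that disappears. The sketch below rebuilds the same 3x4 array and also shows the `keepdims` argument, which retains the reduced axis with length 1. The variable names are illustrative.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

col_sums = np.sum(a, axis=0)  # collapses axis 0: one value per column
row_sums = np.sum(a, axis=1)  # collapses axis 1: one value per row
row_sums_2d = np.sum(a, axis=1, keepdims=True)  # keeps axis 1 with length 1

print(col_sums.shape)     # (4,)
print(row_sums.shape)     # (3,)
print(row_sums_2d.shape)  # (3, 1)
```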

## Shape manipulation

numpy supports slicing similar to Python's native slicing, extended to multiple dimensions. First, let's look at slicing in a single dimension.

In [19]:
a = np.arange(10) ** 2

In [20]:
a

array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81], dtype=int32)

In [21]:
a[2]

4

In [22]:
a[2:5]

array([ 4,  9, 16], dtype=int32)

In [23]:
a[::-1]

array([81, 64, 49, 36, 25, 16,  9,  4,  1,  0], dtype=int32)

The same sort of slicing can be used for multidimensional arrays, where the slices for each dimension are separated by commas.

In [24]:
b = np.fromfunction(lambda r, c: r + 2 * c, (5, 4), dtype=int)

In [25]:
b

array([[ 0,  2,  4,  6],
       [ 1,  3,  5,  7],
       [ 2,  4,  6,  8],
       [ 3,  5,  7,  9],
       [ 4,  6,  8, 10]])

In [26]:
b[2,3]

8

In [27]:
b[:,1]  # pull out the entirety of each row, and then from each row pull out the 2nd column

array([2, 3, 4, 5, 6])

In [28]:
b[1:3,1] # pull out the entirety of the 2nd and 3rd row, and then from each of those pull out the 2nd column

array([3, 4])

In [29]:
b[-2:,-2:]  # pull out the bottom right 2x2 matrix

array([[ 7,  9],
       [ 8, 10]])
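Step values and negative steps work per dimension as well, and any trailing index that is left out is treated as a full slice. A small sketch on the same array, with illustrative variable names:

```python
import numpy as np

b = np.fromfunction(lambda r, c: r + 2 * c, (5, 4), dtype=int)

every_other_row_reversed = b[::2, ::-1]  # rows 0, 2, 4 with columns reversed
second_row = b[1]                        # missing trailing index acts as ':'

print(every_other_row_reversed)
print(np.array_equal(second_row, b[1, :]))  # True
```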

Additionally, numpy can reshape arrays. Reshaping refers to restructuring the same data into different dimensions. We first create a 3x4 array filled with random whole-number values between 0 and 9, inclusive (stored as floats). The `shape` attribute holds the dimensions of the array.

In [30]:
a = np.floor(10 * np.random.random((3, 4)))
a.shape

(3, 4)

The `ravel` method can be used to flatten the array into a single dimension. Note that `ravel` does not change the shape of the original array; it returns a flattened array, which is a view of the original data when possible and a copy otherwise.

In [31]:
a.ravel()

array([7., 6., 1., 6., 2., 3., 4., 7., 8., 4., 9., 7.])

We can also change the dimensions using the `reshape` method. The `reshape` method does not change the shape of the original array either; it returns a new array that is a view of the same data when possible and a copy otherwise.

In [32]:
a.reshape(6, 2)

array([[7., 6.],
       [1., 6.],
       [2., 3.],
       [4., 7.],
       [8., 4.],
       [9., 7.]])
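Whether `ravel` and `reshape` return a view or a copy matters once the result is modified. For a contiguous array they typically return views that share memory with the original, whereas `flatten` always copies. The sketch below checks this with `np.shares_memory`. It uses a deterministic array rather than the random one above, and the variable names are illustrative.

```python
import numpy as np

a = np.arange(12.0).reshape(3, 4)

flat_view = a.ravel()        # a view here, because a is contiguous
flat_copy = a.flatten()      # always a new copy
reshaped = a.reshape(6, -1)  # -1 asks numpy to infer the size (here 2)

print(np.shares_memory(a, flat_view))  # True
print(np.shares_memory(a, flat_copy))  # False
print(reshaped.shape)                  # (6, 2)

flat_view[0] = 99.0  # writes through to a because the memory is shared
print(a[0, 0])       # 99.0
```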

Since transposing is a common operation, numpy provides an attribute `T` that contains the transposed array.

In [33]:
a.reshape(6, 2).T

array([[7., 1., 2., 4., 8., 9.],
       [6., 6., 3., 7., 4., 7.]])

Arrays can also be stacked, provided their dimensions are compatible. Stacking places one array either on top of or beside another array.

In [34]:
a = np.floor(10 * np.random.random((2, 2)))
b = np.floor(10 * np.random.random((2, 2)))

In [35]:
a

array([[1., 0.],
       [1., 7.]])

In [36]:
b

array([[1., 4.],
       [5., 8.]])

In [37]:
np.vstack((a, b))

array([[1., 0.],
       [1., 7.],
       [1., 4.],
       [5., 8.]])

In [38]:
np.hstack((a, b))

array([[1., 0., 1., 4.],
       [1., 7., 5., 8.]])
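For 2D arrays, both kinds of stacking can also be written with `np.concatenate` and an explicit `axis`. A minimal sketch using the values printed above, with illustrative result names:

```python
import numpy as np

a = np.array([[1., 0.],
              [1., 7.]])
b = np.array([[1., 4.],
              [5., 8.]])

vertical = np.concatenate((a, b), axis=0)    # same result as np.vstack((a, b))
horizontal = np.concatenate((a, b), axis=1)  # same result as np.hstack((a, b))

print(vertical.shape)    # (4, 2)
print(horizontal.shape)  # (2, 4)
```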

## Linear algebra

Working with arrays is useful and convenient with numpy, but much of the power comes from using its linear algebra library (and the fact that many analytic libraries support numpy types).

First we need to import the linear algebra library.

In [39]:
import numpy.linalg as la

To start off simple, you can use numpy to compute inner (dot) and cross products of vectors. These functions are in the top-level numpy namespace.

In [40]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
np.inner(a, b)

32

In [41]:
a = np.array([1, 0, 0])
b = np.array([0, 1, 0])
np.cross(a, b)

array([0, 0, 1])
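The two products are related: the cross product of two vectors is orthogonal to both, so its inner product with either one is zero. A quick check, with `c` as an illustrative name:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

c = np.cross(a, b)     # vector orthogonal to both a and b
print(c)               # [-3  6 -3]
print(np.inner(a, c))  # 0
print(np.inner(b, c))  # 0
```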

numpy can also compute the norm of a vector or matrix. The `norm` function is in the linear algebra library.

In [42]:
a = np.array([3, 4])
la.norm(a)

5.0
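By default `norm` returns the Euclidean norm of a vector and the Frobenius norm of a matrix. Other norms can be selected with the `ord` argument, as in this sketch (the names `v`, `m`, and `frobenius` are illustrative):

```python
import numpy as np
import numpy.linalg as la

v = np.array([3, 4])
print(la.norm(v))              # 5.0 (Euclidean norm, the default)
print(la.norm(v, ord=1))       # 7.0 (sum of absolute values)
print(la.norm(v, ord=np.inf))  # 4.0 (largest absolute value)

m = np.array([[1, 2],
              [3, 4]])
frobenius = la.norm(m)  # Frobenius norm by default for a 2-D array
print(np.isclose(frobenius, np.sqrt(30)))  # True
```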

numpy can also be used to compute eigenvalues and eigenvectors using the `eig` function, which returns the eigenvalues and a matrix whose columns are the corresponding eigenvectors.

In [43]:
a = np.diag((1, 2, 3))
w, v = la.eig(a)

In [44]:
w

array([1., 2., 3.])

In [45]:
v

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
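A diagonal matrix makes the result easy to read but hides how the two return values relate. The sketch below uses a small symmetric matrix, chosen only for illustration, and checks that each column of `v` is an eigenvector for the matching entry of `w`.

```python
import numpy as np
import numpy.linalg as la

a = np.array([[2.0, 1.0],
              [1.0, 2.0]])
w, v = la.eig(a)

# Column v[:, i] is the eigenvector paired with eigenvalue w[i],
# so a @ v should equal v with each column scaled by its eigenvalue.
print(np.allclose(a @ v, v * w))  # True
print(np.sort(w))                 # eigenvalues 1 and 3, up to rounding
```

The order of the eigenvalues is not guaranteed, which is why they are sorted before printing.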

Finally, numpy can solve linear matrix equations. For example, we can solve for $x$ and $y$ in the system of equations $3x+y=9$ and $x+2y=8$ using numpy.

In [46]:
a = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])
la.solve(a, b)

array([2., 3.])
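A solution can be checked by substituting it back into the system. A minimal sketch, with `x` as an illustrative name for the solution vector:

```python
import numpy as np
import numpy.linalg as la

a = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])
x = la.solve(a, b)

# Multiplying the coefficient matrix by the solution should reproduce b.
print(np.allclose(a @ x, b))  # True
```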

## References

* [numpy documentation](https://docs.scipy.org/doc/numpy-1.14.0/reference/)