## **Put your name and student ID here**

# **NumPy**

*NumPy is the fundamental library for scientific computing with Python. NumPy is centered around a powerful N-dimensional array object, and it also contains useful linear algebra, Fourier transform, and random number functions.*

## 2D

## arange()

## reshape() indexing

## mean() max() min() sum()

## linspace()

# Creating Arrays

Now let's import `numpy`. Most people import it as `np`:

## `np.zeros`

The `zeros` function creates an array containing any number of zeros:

It's just as easy to create a 2D array (i.e. a matrix) by providing a tuple with the desired number of rows and columns. For example, here's a 3x4 matrix:
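As a minimal sketch of the import and the two `zeros` calls described above (the variable names `zeros_1d` and `zeros_2d` are chosen here for illustration only):

```python
import numpy as np  # conventional alias

zeros_1d = np.zeros(5)        # 1D array holding five zeros
zeros_2d = np.zeros((3, 4))   # 3x4 matrix: the shape is passed as a tuple

print(zeros_1d)
print(zeros_2d)
```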

## Some vocabulary

* In NumPy, each dimension is called an **axis**.
* The number of axes is called the **rank**.
    * For example, the above 3x4 matrix is an array of rank 2 (it is 2-dimensional).
    * The first axis has length 3, the second has length 4.
* An array's list of axis lengths is called the **shape** of the array.
    * For example, the above matrix's shape is `(3, 4)`.
    * The rank is equal to the shape's length.
* The **size** of an array is the total number of elements, which is the product of all axis lengths (e.g. 3*4=12)
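The vocabulary above maps directly onto `ndarray` attributes. A short sketch, assuming a 3x4 array named `a` (a name picked for this example):

```python
import numpy as np

a = np.zeros((3, 4))

print(a.ndim)    # number of axes (the rank): 2
print(a.shape)   # axis lengths: (3, 4)
print(a.size)    # total number of elements: 12
```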

## N-dimensional arrays
You can also create an N-dimensional array of arbitrary rank. For example, here's a 3D array (rank=3), with shape `(2,3,4)`:

## Array type
NumPy arrays have the type `ndarray`:

## `np.ones`
Many other NumPy functions create `ndarray`s.

Here's a 3x4 matrix full of ones:

## `np.full`
Creates an array of the given shape initialized with the given value. Here's a 3x4 matrix full of `π`.

## `np.empty`
An uninitialized 2x3 array (its content is not predictable, as it is whatever is in memory at that point):
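The three constructors above could be exercised as follows. This is only a sketch; the variable names are illustrative, and the contents of the `np.empty` result should not be relied upon.

```python
import numpy as np

ones = np.ones((3, 4))          # 3x4 matrix filled with 1.0
pis = np.full((3, 4), np.pi)    # 3x4 matrix where every element is π
scratch = np.empty((2, 3))      # uninitialized: only the shape is meaningful

print(ones)
print(pis)
print(scratch.shape)            # (2, 3)
```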

## `np.array`
Of course, you can initialize an `ndarray` from a regular Python list (or a list of lists for a 2D array). Just call the `array` function:

## `np.arange`
You can create an `ndarray` using NumPy's `arange` function, which is similar to Python's built-in `range` function:

It also works with floats:

Of course, you can provide a step parameter:
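A brief sketch of the three `arange` variants just mentioned (integer bounds, float bounds, and an explicit step); the variable names are assumptions made for this example:

```python
import numpy as np

ints = np.arange(1, 5)           # [1 2 3 4]: the stop value is excluded
floats = np.arange(1.0, 5.0)     # [1. 2. 3. 4.]
stepped = np.arange(1, 5, 0.5)   # [1.  1.5 2.  2.5 3.  3.5 4.  4.5]

print(ints)
print(floats)
print(stepped)
```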

However, when dealing with floats, the exact number of elements in the array is not always predictable. For example, consider this:

In [1]:
# import numpy as np
# print(np.arange(0, 5/3, 1/3)) # depending on floating point errors, the max value is 4/3 or 5/3.
# print(np.arange(0, 5/3, 0.333333333))
# print(np.arange(0, 5/3, 0.333333334))


## `np.linspace`
For this reason, it is generally preferable to use the `linspace` function instead of `arange` when working with floats. The `linspace` function returns an array containing a specific number of points evenly distributed between two values (note that the maximum value is *included*, contrary to `arange`):
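For instance, the following sketch asks for exactly six points between 0 and 5/3 (the bounds mirror the `arange` example above; the name `points` is illustrative):

```python
import numpy as np

points = np.linspace(0, 5/3, 6)  # 6 evenly spaced values, both ends included

print(points)
print(len(points))               # 6, regardless of floating point rounding
```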

## `np.random.rand` and `np.random.randn`
A number of functions are available in NumPy's `random` module to create `ndarray`s initialized with random values.
For example, here is a 3x4 matrix initialized with random floats between 0 and 1 (uniform distribution):

Here's a 3x4 matrix containing random floats sampled from a univariate [normal distribution](https://en.wikipedia.org/wiki/Normal_distribution) (Gaussian distribution) of mean 0 and variance 1:
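A minimal sketch of both calls, assuming the legacy `np.random.rand` and `np.random.randn` functions named in the heading (the values differ on every run, so only the shapes and the uniform range are predictable):

```python
import numpy as np

uniform = np.random.rand(3, 4)    # floats drawn uniformly from [0, 1)
normal = np.random.randn(3, 4)    # floats drawn from a normal distribution (mean 0, variance 1)

print(uniform)
print(normal)
```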

## `np.fromfunction`
You can also initialize an `ndarray` using a function:

NumPy first creates three `ndarray`s (one per dimension), each of shape `(3, 2, 10)`. Each array has values equal to the coordinate along a specific axis. For example, all elements in the `z` array are equal to their z-coordinate:

    [[[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
      [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]]
    
     [[ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]
      [ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]]
    
     [[ 2.  2.  2.  2.  2.  2.  2.  2.  2.  2.]
      [ 2.  2.  2.  2.  2.  2.  2.  2.  2.  2.]]]

So the terms `x`, `y` and `z` in the expression `x + 10 * y + 100 * z` above are in fact `ndarray`s (we will discuss arithmetic operations on arrays below).  The point is that the function `my_function` is only called *once*, instead of once per element. This makes initialization very efficient.
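The call discussed in this section might look like the sketch below. The argument order `(z, y, x)` is an assumption, chosen so that `z` varies along the first axis as in the printed array above.

```python
import numpy as np

def my_function(z, y, x):
    # z, y and x are whole coordinate arrays of shape (3, 2, 10), not scalars
    return x + 10 * y + 100 * z

result = np.fromfunction(my_function, (3, 2, 10))

print(result.shape)      # (3, 2, 10)
print(result[2, 1, 5])   # 215.0, i.e. 5 + 10*1 + 100*2
```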

# Array data
## `dtype`
NumPy's `ndarray`s are also efficient in part because all their elements must have the same type (usually numbers).
You can check what the data type is by looking at the `dtype` attribute:

Instead of letting NumPy guess what data type to use, you can set it explicitly when creating an array by setting the `dtype` parameter:

Available data types include signed `int8`, `int16`, `int32`, `int64`, unsigned `uint8`|`16`|`32`|`64`, `float16`|`32`|`64` and `complex64`|`128`. Check out the documentation for the [basic types](https://numpy.org/doc/stable/user/basics.types.html) and [sized aliases](https://numpy.org/doc/stable/reference/arrays.scalars.html#sized-aliases) for the full list.

## `itemsize`
The `itemsize` attribute returns the size (in bytes) of each item:
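The sketch below contrasts an inferred `dtype` with an explicit one and shows the matching `itemsize`; the array names are illustrative only:

```python
import numpy as np

inferred = np.array([1.0, 2.5, 4.0])                  # NumPy infers float64
explicit = np.array([1, 2, 3, 4], dtype=np.float32)   # dtype set explicitly

print(inferred.dtype, inferred.itemsize)   # float64 8
print(explicit.dtype, explicit.itemsize)   # float32 4
```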

# Reshaping an array
## In place
Changing the shape of an `ndarray` is as simple as setting its `shape` attribute. However, the array's size must remain the same.

## `reshape`
The `reshape` function returns a new `ndarray` object pointing at the *same* data. This means that modifying one array will also modify the other.

Set item at row 1, col 2 to 999 (more about indexing below).

The corresponding element in `g` has been modified.
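A small sketch of this behavior, assuming `g` is a 24-element 1D array and `g2` (a hypothetical name) is its reshaped view:

```python
import numpy as np

g = np.arange(24)
g2 = g.reshape(4, 6)   # new ndarray object sharing g's data

g2[1, 2] = 999         # row 1, col 2 is flat index 1*6 + 2 = 8

print(g[8])            # 999: the change is visible through g as well
```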

## `ravel`
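Finally, the `ravel` function returns a new one-dimensional `ndarray` that also points to the same data whenever possible (NumPy only makes a copy when the elements cannot be viewed as a flat array without one):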

# Arithmetic operations
All the usual arithmetic operators (`+`, `-`, `*`, `/`, `//`, `**`, etc.) can be used with `ndarray`s. They apply *elementwise*:

Note that the multiplication is *not* a matrix multiplication. We will discuss matrix operations below.
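As an illustration, here are a few elementwise operations on two small arrays (the values are arbitrary and chosen for this example):

```python
import numpy as np

a = np.array([14, 23, 32, 41])
b = np.array([5, 4, 3, 2])

print(a + b)    # [19 27 35 43]
print(a * b)    # [70 92 96 82]  (elementwise, not a matrix product)
print(a // b)   # [ 2  5 10 20]
```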

The arrays must have the same shape. If they do not, NumPy will apply the *broadcasting rules*.

# Broadcasting

In general, when NumPy expects arrays of the same shape but finds that this is not the case, it applies the so-called *broadcasting* rules:

## First rule
*If the arrays do not have the same rank, then a 1 will be prepended to the shapes of the lower-rank arrays until their ranks match.*

Now let's try to add a 1D array of shape `(5,)` to this 3D array of shape `(1,1,5)`. NumPy applies the first rule of broadcasting and treats the 1D array as if its shape were `(1,1,5)`:

## Second rule
*Arrays with a 1 along a particular dimension act as if they had the size of the array with the largest shape along that dimension. The value of the array element is repeated along that dimension.*

Let's try to add a 2D array of shape `(2,1)` to this 2D `ndarray` of shape `(2, 3)`. NumPy will apply the second rule of broadcasting:

Combining rules 1 & 2, we can do this:

And also, very simply:

## Third rule
*After rules 1 & 2, the sizes of all arrays must match.*

Broadcasting rules are used in many NumPy operations, not just arithmetic operations, as we will see below.
For more details about broadcasting, check out [the documentation](https://numpy.org/doc/stable/user/basics.broadcasting.html).
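The three rules can be sketched in one place as follows. The array names and values are assumptions made for this example, not taken from the original cells.

```python
import numpy as np

h = np.arange(5).reshape(1, 1, 5)
r1 = h + np.array([10, 20, 30, 40, 50])   # rule 1: (5,) is treated as (1, 1, 5)

k = np.arange(6).reshape(2, 3)
r2 = k + np.array([[100], [200]])         # rule 2: (2, 1) is stretched to (2, 3)
r3 = k + np.array([100, 200, 300])        # rules 1 & 2: (3,) -> (1, 3) -> (2, 3)
r4 = k + 1000                             # a scalar is broadcast to every element

try:
    k + np.array([33, 44])                # rule 3: (2,) cannot match (2, 3)
    mismatch_raised = False
except ValueError as error:
    mismatch_raised = True
    print(error)

print(r1)
print(r2)
print(r3)
print(r4)
```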

# Conditional operators

The conditional operators also apply elementwise:

And using broadcasting:

This is most useful in conjunction with boolean indexing (discussed below).
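A short sketch of elementwise comparison, comparison against a broadcast scalar, and the resulting boolean mask used as an index (the array `m` and its values are illustrative):

```python
import numpy as np

m = np.array([20, -5, 30, 40])

print(m < np.array([15, 16, 35, 36]))   # [False  True  True False]
print(m < 25)                           # [ True  True False False]
print(m[m < 25])                        # [20 -5]
```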

# Mathematical and statistical functions

Many mathematical and statistical functions are available for `ndarray`s.

## `ndarray` methods
Some functions are simply `ndarray` methods, for example:

Note that this computes the mean of all elements in the `ndarray`, regardless of its shape.

Here are a few more useful `ndarray` methods:

These functions accept an optional argument `axis` which lets you ask for the operation to be performed on elements along the given axis. For example:

You can also sum over multiple axes:
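The methods and the `axis` argument described in this section can be tried on a small 3D array. This is a sketch with an assumed array `c` of shape `(2, 3, 4)`:

```python
import numpy as np

c = np.arange(24).reshape(2, 3, 4)

print(c.mean())              # 11.5, computed over all 24 elements
print(c.min(), c.max())      # 0 23
print(c.sum(axis=0))         # sum across the 2 matrices, result has shape (3, 4)
print(c.sum(axis=(0, 2)))    # sum across matrices and columns: [ 60  92 124]
```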

## Universal functions
NumPy also provides fast elementwise functions called *universal functions*, or **ufunc**. They are vectorized wrappers of simple functions. For example `square` returns a new `ndarray` which is a copy of the original `ndarray` except that each element is squared:

Here are a few more useful unary ufuncs: `np.abs`, `np.sqrt`, `np.exp`, `np.log`, `np.sign`, `np.ceil`, `np.modf`, `np.isnan`, `np.cos`.

In [2]:
# a = np.array([[-2.5, 3.1, 7], [10, 11, 12]])


The two warnings are due to the fact that `sqrt()` and `log()` are undefined for negative numbers, which is why there is a `np.nan` value in the first cell of the output of these two functions.

## Binary ufuncs
There are also many binary ufuncs, that apply elementwise on two `ndarray`s.  Broadcasting rules are applied if the arrays do not have the same shape:

In [3]:
# a = np.array([1, -2, 3, 4])
# b = np.array([2, 8, -1, 7])
# np.add(a, b)  # equivalent to a + b

In [4]:
# np.greater(a, b)  # equivalent to a > b

In [5]:
# np.maximum(a, b)

In [6]:
# np.copysign(a, b)

# Array indexing
## One-dimensional arrays
One-dimensional NumPy arrays can be accessed more or less like regular Python lists:

In [7]:
# a = np.array([1, 5, 3, 19, 13, 7, 3])


Of course, you can modify elements:

You can also modify an `ndarray` slice:

## Differences with regular Python lists
Contrary to regular Python lists, if you assign a single value to an `ndarray` slice, it is copied across the whole slice, thanks to the broadcasting rules discussed above.

Also, you cannot grow or shrink `ndarray`s this way:

Last but not least, `ndarray` **slices are actually *views*** on the same data buffer. This means that if you create a slice and modify it, you are actually going to modify the original `ndarray` as well!

If you want a copy of the data, you need to use the `copy` method:
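The differences listed in this section can be sketched as follows; `view` and `independent` are names introduced for this example:

```python
import numpy as np

a = np.array([1, 5, 3, 19, 13, 7, 3])

a[2:5] = -1                    # a single value is broadcast across the slice
view = a[2:6]                  # a view on a's data, not a copy
view[1] = 1000                 # also changes a[3]

independent = a[2:6].copy()    # a real copy with its own data
independent[0] = 0             # leaves a untouched

print(a)                       # [   1    5   -1 1000   -1    7    3]
```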

## Multidimensional arrays
Multidimensional arrays can be accessed in a similar way by providing an index or slice for each axis, separated by commas:

In [8]:
# b = np.arange(48).reshape(4, 12)
# b

In [9]:
 # row 1, col 2

In [10]:
 # row 1, all columns

In [11]:
  # all rows, column 1

**Caution**: note the subtle difference between these two expressions: 

The first expression returns row 1 as a 1D array of shape `(12,)`, while the second returns that same row as a 2D array of shape `(1, 12)`.
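Assuming the two expressions in question are `b[1, :]` and `b[1:2, :]`, the difference shows up in the shapes. A sketch using the same `b` as above:

```python
import numpy as np

b = np.arange(48).reshape(4, 12)

print(b[1, 2])             # 14: row 1, col 2
print(b[:, 1])             # [ 1 13 25 37]: all rows, column 1

row_1d = b[1, :]           # integer index drops the axis
row_2d = b[1:2, :]         # slice keeps the axis

print(row_1d.shape)        # (12,)
print(row_2d.shape)        # (1, 12)
```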

## Fancy indexing
You may also specify a list of indices that you are interested in. This is referred to as *fancy indexing*.

In [12]:
  # rows 0 and 2, columns 2 to 4 (5-1)

In [13]:
  # all rows, columns -1 (last), 2 and -1 (again, and in this order)

If you provide multiple index arrays, you get a 1D `ndarray` containing the values of the elements at the specified coordinates.
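A sketch of the fancy-indexing cases described above, written with lists of indices and the same `b` array:

```python
import numpy as np

b = np.arange(48).reshape(4, 12)

print(b[[0, 2], 2:5])                    # rows 0 and 2, columns 2 to 4
print(b[:, [-1, 2, -1]])                 # all rows, columns -1, 2 and -1 again
print(b[[-1, 2, -1, 2], [5, 9, 1, 9]])   # b[-1, 5], b[2, 9], b[-1, 1], b[2, 9]
```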

## Higher dimensions
Everything works just as well with higher dimensional arrays, but it's useful to look at a few examples:

In [14]:
# c = b.reshape(4,2,6)
# c

In [15]:
  # matrix 2, row 1, col 4

In [16]:
  # matrix 2, all rows, col 3

If you omit coordinates for some axes, then all elements in these axes are returned:

In [17]:
  # Return matrix 2, row 1, all columns.  This is equivalent to c[2, 1, :]

## Ellipsis (`...`)
You may also write an ellipsis (`...`) to ask that all non-specified axes be entirely included.

In [18]:
# c[2, ...]  #  matrix 2, all rows, all columns.  This is equivalent to c[2, :, :]

In [19]:
# c[2, 1, ...]  # matrix 2, row 1, all columns.  This is equivalent to c[2, 1, :]

In [20]:
# c[2, ..., 3]  # matrix 2, all rows, column 3.  This is equivalent to c[2, :, 3]

In [21]:
# c[..., 3]  # all matrices, all rows, column 3.  This is equivalent to c[:, :, 3]

## Boolean indexing
You can also provide an `ndarray` of boolean values on one axis to specify the indices that you want to access.

In [22]:
# b = np.arange(48).reshape(4, 12)
# b

In [23]:
 # Rows 0 and 2, all columns. Equivalent to b[(0, 2), :]

In [24]:
 # All rows, columns 1, 4, 7 and 10

## `np.ix_`
You cannot use boolean indexing this way on multiple axes, but you can work around this by using the `ix_` function:

If you use a boolean array that has the same shape as the `ndarray`, then you get in return a 1D array containing all the values that have `True` at their coordinate. This is generally used along with conditional operators:
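The boolean-indexing variants in this section might be written as in the sketch below; the mask names `rows_on` and `cols_on` are assumptions for this example:

```python
import numpy as np

b = np.arange(48).reshape(4, 12)

rows_on = np.array([True, False, True, False])   # selects rows 0 and 2
cols_on = np.array([False, True, False] * 4)     # selects columns 1, 4, 7 and 10

print(b[rows_on, :])                   # rows 0 and 2, all columns
print(b[:, cols_on])                   # all rows, columns 1, 4, 7 and 10
print(b[np.ix_(rows_on, cols_on)])     # both masks at once, via np.ix_
print(b[b % 3 == 1])                   # 1D array of the elements where the mask is True
```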

# Stacking arrays
It is often useful to stack together different arrays. NumPy offers several functions to do just that. Let's start by creating a few arrays.

In [25]:
# q1 = np.full((3,4), 1.0)
# q1

In [26]:
# q2 = np.full((4,4), 2.0)
# q2

In [27]:
# q3 = np.full((3,4), 3.0)
# q3

## `vstack`
Now let's stack them vertically using `vstack`:

This was possible because `q1`, `q2` and `q3` all have the same number of columns. Their number of rows differs, but that's fine since we are stacking along the vertical axis.

## `hstack`
We can also stack arrays horizontally using `hstack`:
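Both stacking calls can be sketched with the `q1`, `q2` and `q3` arrays defined above. The result names, and the choice of `q1` and `q3` for the horizontal stack, are assumptions; `hstack` needs arrays with the same number of rows.

```python
import numpy as np

q1 = np.full((3, 4), 1.0)
q2 = np.full((4, 4), 2.0)
q3 = np.full((3, 4), 3.0)

stacked_v = np.vstack((q1, q2, q3))   # 3 + 4 + 3 rows, 4 columns
stacked_h = np.hstack((q1, q3))       # 3 rows, 4 + 4 columns

print(stacked_v.shape)   # (10, 4)
print(stacked_h.shape)   # (3, 8)
```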

# Splitting arrays
Splitting is the opposite of stacking. For example, let's use the `vsplit` function to split a matrix vertically.

First let's create a 6x4 matrix:

In [28]:
# r = np.arange(24).reshape(6,4)
# r

Now let's split it in three equal parts, vertically:

In [29]:
# r1, r2, r3 = np.vsplit(r, 3)
# r1

In [30]:
# r2

In [31]:
# r3

There is also a `split` function which splits an array along any given axis. Calling `vsplit` is equivalent to calling `split` with `axis=0`. There is also an `hsplit` function, equivalent to calling `split` with `axis=1`:

In [32]:
# r4, r5 = np.hsplit(r, 2)
# r4

In [33]:
# r5

# Linear algebra
NumPy 2D arrays can be used to represent matrices efficiently in Python. We will just quickly go through some of the main matrix operations available.
## Matrix transpose
The `T` attribute is equivalent to calling `transpose()` when the rank is ≥2:

In [34]:
# m1 = np.arange(10).reshape(2,5)
# m1

The `T` attribute has no effect on rank 0 (scalar) or rank 1 arrays:

In [35]:
# m2 = np.arange(5)
# m2

We can get the desired transposition by first reshaping the 1D array to a single-row matrix (2D):

In [36]:
# m2r = m2.reshape(1,5)
# m2r

## Matrix multiplication
Let's create two matrices and execute a [matrix multiplication](https://en.wikipedia.org/wiki/Matrix_multiplication) using the `dot()` method.

In [37]:
# n1 = np.arange(10).reshape(2, 5)
# n1

In [38]:
# n2 = np.arange(15).reshape(5,3)
# n2

**Caution**: as mentioned previously, `n1*n2` is *not* a matrix multiplication, it is an elementwise product (also called a [Hadamard product](https://en.wikipedia.org/wiki/Hadamard_product_(matrices))).
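A sketch of the matrix product of `n1` and `n2` using `dot()` and the equivalent `@` operator, plus a check that `n1 * n2` is rejected for these shapes (the name `product` and the flag are illustrative):

```python
import numpy as np

n1 = np.arange(10).reshape(2, 5)
n2 = np.arange(15).reshape(5, 3)

product = n1.dot(n2)        # (2, 5) times (5, 3) gives a (2, 3) matrix
print(product)              # [[ 90 100 110]
                            #  [240 275 310]]
print(np.array_equal(product, n1 @ n2))   # True: @ is the matrix product operator

try:
    n1 * n2                 # elementwise: (2, 5) and (5, 3) cannot be broadcast
    elementwise_failed = False
except ValueError:
    elementwise_failed = True
```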

## Matrix inverse and pseudo-inverse
Many of the linear algebra functions are available in the `numpy.linalg` module, in particular the `inv` function to compute a square matrix's inverse:

In [39]:
# import numpy.linalg as linalg
#
# m3 = np.array([[1,2,3],[5,7,11],[21,29,31]])
# m3

In [40]:
# inverse


You can also compute the [pseudoinverse](https://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_pseudoinverse) using `pinv`:

In [41]:
# linalg.pinv(m3)

## Identity matrix
The product of a matrix by its inverse returns the identity matrix (with small floating point errors):

In [42]:
# m3.dot(linalg.inv(m3))

You can create an identity matrix of size NxN by calling the `eye(N)` function:

In [43]:
# np.eye(3)

## QR decomposition
The `qr` function computes the [QR decomposition](https://en.wikipedia.org/wiki/QR_decomposition) of a matrix:

In [44]:
# q, r = linalg.qr(m3)
# q

In [45]:
# r

In [46]:
# q.dot(r)  # q.r equals m3

## Determinant
The `det` function computes the [matrix determinant](https://en.wikipedia.org/wiki/Determinant):

In [47]:
# linalg.det(m3)  # Computes the matrix determinant

## Eigenvalues and eigenvectors
The `eig` function computes the [eigenvalues and eigenvectors](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors) of a square matrix:

In [48]:
# eigenvalues, eigenvectors = linalg.eig(m3)
# eigenvalues # λ

In [49]:
# eigenvectors # v

In [50]:
# m3.dot(eigenvectors) - eigenvalues * eigenvectors  # m3.v - λ*v = 0

## Singular Value Decomposition (not our focus)
The `svd` function takes a matrix and returns its [singular value decomposition](https://en.wikipedia.org/wiki/Singular_value_decomposition):

In [51]:
# m4 = np.array([[1,0,0,0,2], [0,0,3,0,0], [0,0,0,0,0], [0,2,0,0,0]])
# m4

In [52]:
# U, S_diag, V = linalg.svd(m4)
# U

In [53]:
# S_diag

The `svd` function just returns the values in the diagonal of Σ, but we want the full Σ matrix, so let's create it:

In [54]:
# S = np.zeros((4, 5))
# S[np.diag_indices(4)] = S_diag
# S  # Σ

In [55]:
# V

In [56]:
# U.dot(S).dot(V) # U.Σ.V == m4

## Diagonal and trace (not our focus)

In [57]:
# np.diag(m3)  # the values in the diagonal of m3 (top left to bottom right)

In [58]:
# m3.diagonal()

In [59]:
# m3.trace()  # equivalent to np.diag(m3).sum()

# Meshgrid

    x_coords = np.arange(0, 3)  # [0, 1, 2]
    y_coords = np.arange(0, 5)  # [0, 1, 2, 3, 4]
    X, Y = np.meshgrid(x_coords, y_coords)
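To see what `meshgrid` returns, the sketch below inspects the two coordinate matrices, relying on NumPy's default `indexing="xy"` behavior:

```python
import numpy as np

x_coords = np.arange(0, 3)   # [0, 1, 2]
y_coords = np.arange(0, 5)   # [0, 1, 2, 3, 4]
X, Y = np.meshgrid(x_coords, y_coords)

print(X.shape, Y.shape)      # (5, 3) (5, 3): one row per y value, one column per x value
print(X[0])                  # [0 1 2]: x varies along each row
print(Y[:, 0])               # [0 1 2 3 4]: y varies along each column
```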

# Saving and loading
NumPy makes it easy to save and load `ndarray`s in text format.


Let's create a random array and save it.

## Text format
Let's try saving the array in text format:

By default, `savetxt` separates the values with a single space rather than a comma. You can set a different delimiter:

    np.savetxt("my_array.csv", a, delimiter=",")

To load this file, just use `loadtxt`:

    np.loadtxt("my_array.csv", delimiter=",")
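A self-contained round trip might look like the sketch below. It writes to a temporary directory so that no file is left behind, which is a choice made for this example rather than something the section requires.

```python
import os
import tempfile

import numpy as np

a = np.random.rand(2, 3)   # random array to save

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "my_array.csv")
    np.savetxt(path, a, delimiter=",")          # write comma-separated text
    loaded = np.loadtxt(path, delimiter=",")    # read it back with the same delimiter

print(np.allclose(a, loaded))   # True
```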