# Tutorial 2: Introducing NumPy

> COM2004/COM3004


*Copyright &copy; 2022 University of Sheffield. All rights reserved*.

## What is NumPy?

NumPy is a core Python package for scientific computing. It offers:

* a powerful N-dimensional array object
* highly optimised linear algebra tools
* tight integration with C/C++ and Fortran code
* a BSD license, i.e., it is freely reusable

## Importing NumPy

NumPy is conventionally imported using:

In [None]:
import numpy as np

The Python keyword `as` allows us to use `np` as a shorthand for the `numpy` module.

## Generating a NumPy ndarray

`numpy.ndarray` is a type for representing N-Dimensional arrays.

Arrays can be generated in three ways:

* from Python lists containing numeric data
* using NumPy array generating functions
* reading data from a file

## Generating arrays from lists

In [None]:
my_list = [1, 2, 3, 4, 5]
my_array = np.array(my_list)  # create a simple 1-D array
print(my_list)
print(my_array)
print(type(my_list))
type(my_array)

In [None]:
my_2d_array = np.array([[1., 2, 3], [4, 5, 6], [7, 8, 9]])
print(my_2d_array)
type(my_2d_array)

## The ndarray object's properties

The `ndarray` has various properties that we can access:

In [None]:
my_2d_array.shape

In [None]:
my_2d_array.size

In [None]:
my_2d_array.dtype
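
As a small sketch of what these properties report, the cell below uses a made-up array `a` (not part of the tutorial data) and also shows `ndim`, the number of dimensions:

```python
import numpy as np

# A hypothetical 2x3 array of floats, used only for illustration
a = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

print(a.shape)  # (rows, columns) -> (2, 3)
print(a.size)   # total number of elements -> 6
print(a.ndim)   # number of dimensions -> 2
print(a.dtype)  # element type -> float64
```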

## N-dimensional arrays

Note, NumPy generalises arrays to be N-dimensional.

In [None]:
x2 = np.array([[1, 2], [3, 4]])  # a matrix
x3 = np.array([x2, x2, x2, x2])  # stacking 4 matrices
x3.shape

In [None]:
x4 = np.array([x3, x3, x3, x3, x3])  # stacking 5 3-D structures
x4.shape

In [None]:
x5 = np.array([x4, x4])  # stacking 2 4-D structures!
x5.shape

But in COM2004/3004 we will only be using N=1 (vectors) and N=2 (matrices).

## Array generating functions

In [None]:
np.arange(10)

In [None]:
np.arange(100, 110, 2)  # start, stop, step

In [None]:
np.linspace(10, 20, 5)  # start, stop, n-points

In [None]:
np.zeros( (3, 3, 3) )  # Note, argument is a tuple

In [None]:
np.ones((2, 5))
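
One detail that is easy to miss: `arange` excludes its stop value, whereas `linspace` includes it by default. A minimal sketch reusing the arguments from the cells above:

```python
import numpy as np

# arange excludes the stop value, like Python's range
a = np.arange(100, 110, 2)
print(a)  # [100 102 104 106 108]

# linspace includes both end points by default
b = np.linspace(10, 20, 5)
print(b)  # [10.  12.5 15.  17.5 20. ]
```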

## More array generating functions

In [None]:
np.diag((4, 5, 3))

In [None]:
np.diag((2, 2), k=3)

In [None]:
np.diag((1, 1, 1)) + np.diag((2, 2), k=1) + np.diag((2, 2), k=-1)

In [None]:
np.eye(10)

## Arrays initialised with random numbers

In [None]:
np.random.rand(2, 4)  # uniform distribution between 0 and 1

In [None]:
np.random.randn(2, 4)  # standard normal distribution
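
If repeatable results are needed (e.g., when debugging), one option is NumPy's `Generator` interface with a fixed seed. This is only a sketch; the seed value 42 is an arbitrary choice and the names `rng`, `u` and `n` are made up for illustration:

```python
import numpy as np

# default_rng creates a Generator; the seed value 42 is an arbitrary choice
rng = np.random.default_rng(42)
u = rng.random((2, 4))           # uniform on [0, 1)
n = rng.standard_normal((2, 4))  # standard normal

# Re-creating the generator with the same seed reproduces the same numbers
rng2 = np.random.default_rng(42)
print(np.array_equal(u, rng2.random((2, 4))))  # True
```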

## Reading arrays from files

* `genfromtxt` and `savetxt` for reading and writing to text files.
* `load` and `save` for reading and writing in NumPy's native format.

In [None]:
%more data/liver_data_20.txt

In [None]:
x = np.genfromtxt("data/liver_data_20.txt", delimiter=",")  # for reading a csv file
#x = np.loadtxt("data/liver_data_20.txt", delimiter=",", dtype=float)
print(x)

In [None]:
np.savetxt("data/matrix.csv", x, delimiter="_", fmt="%.5f")

In [None]:
%more data/matrix.csv
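
The cells above only use the text format. As a sketch of how the two formats compare, the cell below writes a small made-up matrix `m` to a temporary directory (so no course data files are touched) and reads it back with both pairs of functions:

```python
import os
import tempfile

import numpy as np

m = np.array([[1.5, 2.0], [3.25, 4.0]])  # made-up data for illustration

with tempfile.TemporaryDirectory() as tmp:
    # Text format: human-readable, but limited by the precision in fmt
    txt_path = os.path.join(tmp, "m.csv")
    np.savetxt(txt_path, m, delimiter=",", fmt="%.5f")
    m_txt = np.genfromtxt(txt_path, delimiter=",")

    # Native binary format: stores dtype and shape exactly
    npy_path = os.path.join(tmp, "m.npy")
    np.save(npy_path, m)
    m_npy = np.load(npy_path)

print(np.allclose(m, m_txt))     # True
print(np.array_equal(m, m_npy))  # True
```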

## Array manipulation

Indexing is similar to Python lists.

In [None]:
x = np.array([1, 2, 3, 4, 5, 6, 7])
print(x[0])
print(x[2:5])
print(x[:4])
print(x[4:])

But it generalises to N dimensions.

In [None]:
x = np.random.rand(5, 5)
print(x[2:4, :2])
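
Because the array above is random, the result of the slice is hard to check by eye. The same slice on an array with known contents (an illustrative array, not from the tutorial data) makes the row and column ranges easier to see:

```python
import numpy as np

x = np.arange(25).reshape(5, 5)  # values 0..24 laid out row by row

# rows 2 and 3, columns 0 and 1
sub = x[2:4, :2]
print(sub)
# [[10 11]
#  [15 16]]
```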

## Extracting a row or column vector from a matrix

In [None]:
A = np.genfromtxt("data/test_matrix.txt")
print(A)

In [None]:
print(A[2, :])  # extract row 2  (can also write as A[2])
print(A[2, :].shape)

In [None]:
print(A[:, 2])  # extract column 2
print(A[:, 2].shape)

## Data processing

The NumPy ndarray object has many methods.

e.g., `min`, `max`, `sum`, `prod`, `mean`, `var`

In [None]:
x = np.array([1, 2, 3, 4, 5, 6])

In [None]:
x.min(), x.max()

In [None]:
x.sum(), x.prod()

In [None]:
x.mean(), x.var()

## Data processing on matrices

In [None]:
A = np.genfromtxt("data/test_matrix.txt")
print(A)

In [None]:
A.min(), A.max()

In [None]:
A.mean(axis=0).shape  # axis=0 gives the mean of each column
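
The `axis` argument is a common source of confusion, so here is a minimal sketch on a small made-up matrix `m` (the contents of `data/test_matrix.txt` are not assumed):

```python
import numpy as np

m = np.array([[1, 2, 3], [4, 5, 6]])  # hypothetical 2x3 matrix

print(m.mean())        # 3.5, mean of all elements
print(m.mean(axis=0))  # [2.5 3.5 4.5], one value per column
print(m.mean(axis=1))  # [2. 5.], one value per row
```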

## Reshaping and resizing

It's sometimes necessary to wrap a vector into a matrix or unwrap a matrix into a vector.

In [None]:
M = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9]).reshape(3, 3)
print(M)

In [None]:
v = M.reshape(9)
print(v)

In [None]:
# The following line will generate an error
# because reshape cannot change the number of elements

#  v = M.reshape(8)
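
To see the error without stopping the notebook, the call can be wrapped in a `try`/`except` block. This is only a sketch; the flag `failed` is a made-up name used for illustration:

```python
import numpy as np

M = np.arange(1, 10).reshape(3, 3)

try:
    M.reshape(8)  # 9 elements cannot fit into 8 slots
    failed = False
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```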

## Adding a new dimension

We can easily turn 1-D vectors into 2-D matrices.

In [None]:
v = np.array([1, 2, 3, 4, 5])
print(v)
print(v.shape)

In [None]:
v_row = v[np.newaxis, :]  # turn a vector into a 1-row matrix
print(v_row)
print(v_row.shape)

In [None]:
v_col = v[:, np.newaxis]  # turn a vector into a 1-column matrix
print(v_col)
print(v_col.shape)

## Stacking arrays

Arrays with compatible dimensions can be joined horizontally or vertically.

In [None]:
x = np.ones((2, 3))
y = np.zeros((2, 2))
z = np.hstack((x, y, x))  # note, arrays passed as a tuple
print(x)
print(y)
print(z)
print(z.shape)

In [None]:
x = np.ones((2, 2))
y = np.zeros((1, 2))
z = np.vstack((x, y))
print(z)
print(z.shape)

## Tiling and repeating

In [None]:
x = np.array([[1, 2], [3, 4]])
np.tile(x, 3)

In [None]:
np.tile(x, (2, 4))

In [None]:
np.repeat(x, 4)

In [None]:
np.repeat(x, 4, axis=0)

## Copying

Arrays are handled by reference.

When you do `A = B` you are just copying a reference, not the data itself.

In [None]:
A = np.array([1, 2, 3, 4, 5, 6])
B = A
B[0] = 10
print(A)

Note, this is also true for Python lists and objects in general.

In [None]:
A = [1, 2, 3, 4]
B = A
B[0] = 10
print(A)

So how do we make a real copy?

## Deep Copy

To actually copy the data stored in the array, we use the NumPy `copy` method:

In [None]:
A = np.array([1, 2, 3, 4])
B = A.copy()  # can also write, B = np.copy(A)
B[0] = 10
print(A)
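
One way to check whether two names refer to the same data is sketched below, using Python's `is` operator and `np.shares_memory`. The variable `C` is introduced here only for illustration:

```python
import numpy as np

A = np.array([1, 2, 3, 4])
B = A         # second name for the same array
C = A.copy()  # independent copy of the data

print(B is A)                  # True
print(C is A)                  # False
print(np.shares_memory(A, C))  # False

C[0] = 10
print(A)  # [1 2 3 4], unchanged
```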

Note, to make a deep copy of a Python *list*, we first need to import the `copy` module:



In [None]:
import copy

A = [1, 2, 3, 4]
B = copy.deepcopy(A)

(Don't confuse NumPy ndarrays and Python lists...)
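
For a flat list of numbers a shallow copy would behave the same as a deep copy; the difference only shows up with nested lists. A minimal sketch, using made-up list names:

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)   # new outer list, same inner lists
deep = copy.deepcopy(nested)  # inner lists are copied too

shallow[0][0] = 10
print(nested[0][0])  # 10, the inner list is shared with the shallow copy

deep[1][1] = 99
print(nested[1][1])  # 4, the deep copy is fully independent
```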

## Matrix operations

NumPy implements all common array operations:

* addition, subtraction
* transpose
* multiplication
* inverse

## Array addition and subtraction

In [None]:
X = np.array([[1, 2, 3], [4, 5, 6]])
Y = np.ones((2, 3))
print(X)
print(Y)

In [None]:
print(X + Y)

In [None]:
X - 2 * Y  # note, scalar multiplication

In [None]:
# The following line will generate an error because shapes (2, 3) and (2, 2) do not match
# X + np.array([[2, 2], [2, 2]])

## Broadcasting

During an operation, NumPy will try to repeat an array along a missing or length-1 dimension to make the dimensions fit. This is called 'broadcasting'. It's convenient but it can be confusing.

In [None]:
X = np.array([[1, 2, 3], [4, 5, 6]])
row = np.array([1, 1, 1])
print(X)
print(row)

In [None]:
print(X + row)

In [None]:
col = np.array([1, 2])
print(col)

In [None]:
print(X + col[:, np.newaxis])
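
To see why the `np.newaxis` is needed, the sketch below tries the addition without it. Shapes are compared from the last dimension backwards, so `(2, 3)` and `(2,)` do not fit, whereas `(2, 3)` and `(2, 1)` do. The flag `ok` is a made-up name for illustration:

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])  # shape (2, 3)
col = np.array([1, 2])                # shape (2,)

try:
    X + col  # trailing dimensions 3 and 2 do not match
    ok = True
except ValueError:
    ok = False
print(ok)  # False

# With a new axis, col has shape (2, 1) and is repeated across the 3 columns
result = X + col[:, np.newaxis]
print(result)
# [[2 3 4]
#  [6 7 8]]
```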

## Transpose

Transposing a matrix swaps the rows and columns.

In [None]:
A = np.array([[1, 2, 3], [4, 5, 6]])
print(A)

In [None]:
print(A.T)

In [None]:
A.shape, A.T.shape

In [None]:
v = np.array([1, 2, 3, 4, 5])
print(v)

In [None]:
print(v.T)  # Vectors only have one dimension. Transpose does nothing.

In [None]:
v.shape, v.T.shape

## Vectors versus 'skinny' matrices

When using NumPy, a vector is **not** the same as a matrix with one column.

In [None]:
v = np.array([1, 2, 3, 4, 5])  # A vector - has 1 dimension
print(v)
print(v.T)
print(v.shape)

In [None]:
M_row = np.array([[1, 2, 3, 4, 5]])  # A matrix - has 2 dimensions
print(M_row)
print(M_row.shape)

In [None]:
M_col = np.array([[1, 2, 3, 4, 5]]).T  # A matrix can be transposed
print(M_col)
print(M_col.shape)
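
The distinction matters as soon as products are taken. As a rough sketch using `np.dot`, with `M_col` built here via `np.newaxis` rather than a transpose:

```python
import numpy as np

v = np.array([1, 2, 3, 4, 5])  # shape (5,)
M_col = v[:, np.newaxis]       # shape (5, 1)

print(np.dot(v, v))                  # 55, a scalar
print(np.dot(M_col.T, M_col))        # [[55]], a 1x1 matrix
print(np.dot(M_col, M_col.T).shape)  # (5, 5), the outer product
```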

## Multiplication

The '*' operator performs 'elementwise' multiplication

In [None]:
A = np.array([[1, 2, 3], [4, 5, 6]])
A * A

Standard 'matrix multiplication' is performed using the `dot` function.

In [None]:
np.dot(A, A.T)  # Multiply 2x3 matrix A and 3x2 matrix A.T (AA')

In [None]:
v = np.array([1, 2, 3])
np.dot(A, v)  # Multiply 2x3 matrix A and 3-element vector (Av)
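
Python also provides the `@` operator for matrix multiplication, which for 2-D arrays gives the same result as `np.dot`. A short sketch (the name `P` is made up for illustration):

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])

P = A @ A.T  # same result as np.dot(A, A.T) for 2-D arrays
print(P)
# [[14 32]
#  [32 77]]
print(np.array_equal(P, np.dot(A, A.T)))  # True
```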

## Matrix inverse

Matrix determinant and inverse functions are provided by the `linalg` submodule of NumPy.

In [None]:
A = np.array([[2, 1], [3, 2]])
print(A)

In [None]:
np.linalg.det(A)

In [None]:
np.linalg.inv(A)
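
A quick sanity check is to multiply the matrix by its inverse and compare with the identity. Since the results are floating point, the sketch below compares with `np.allclose` rather than `==`:

```python
import numpy as np

A = np.array([[2, 1], [3, 2]])
A_inv = np.linalg.inv(A)

# det(A) = 2*2 - 1*3 = 1, so the inverse is [[2, -1], [-3, 2]]
print(np.allclose(A_inv, [[2, -1], [-3, 2]]))  # True

# Floating-point results are compared with allclose rather than ==
print(np.allclose(A @ A_inv, np.eye(2)))  # True
```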

## Summary

* NumPy provides tools for numeric computing.
* With NumPy, Python becomes a usable alternative to MATLAB.
* NumPy's basic type is the `ndarray` -- it can represent vectors, matrices, etc.
* Lots of tools for vector and matrix manipulation.
    - This lecture has only reviewed the most commonly used.
* For the full documentation see [http://docs.scipy.org/doc/numpy/](http://docs.scipy.org/doc/numpy/)