# The Numpy Package

### Kasey Martin, MIS

## Outline
- What is Numpy
- Simple Benchmarks
- Basic Operations
- Matrix Manipulations
- Numpy's `random` module

## What is Numpy?
- Package for scientific computing
- High performance N-dimensional array object
- Very useful for:
    - Linear Algebra
    - Advanced Random Number Generation 

## How to Add to Our Projects?

- Already comes preinstalled with the Anaconda computing distribution 
- Use `import` to bring it into a project:

In [1]:
import numpy as np

## High performance N-dimensional array object
- To create an NxM matrix purely in Python, we create a list of lists
- The outer list holds N inner lists (the rows)
- Each inner list holds M elements (the columns)
- Let's try a 3x4 matrix (3 rows, 4 columns)

In [2]:
matrix = [
    [1,2,3,4],
    [5,6,7,8],
    [9,10,11,12]
]

## High performance N-dimensional array object
- To create an NxM matrix with numpy, we can use its `np.array` function
- It accepts a list (or a list of lists) as input

In [3]:
np_matrix = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
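
A quick way to confirm what `np.array` built is to inspect the array's attributes. The sketch below rebuilds the same matrix as the cell above and prints its `shape`, `ndim` and `dtype`; the exact integer `dtype` depends on the platform.

```python
import numpy as np

# Same matrix as in the cell above
np_matrix = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

print(np_matrix.shape)  # (rows, columns)
print(np_matrix.ndim)   # number of dimensions
print(np_matrix.dtype)  # element type, inferred from the input values
```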

## Simple benchmark 1
- Try multiplying the whole sample 3x4 matrix by 2
- Use Jupyter Notebook's `%%time` cell magic to measure execution time

In [4]:
matrix = [
    [1,2,3,4],
    [5,6,7,8],
    [9,10,11,12]
]
np_matrix = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])

In [5]:
%%time
for x , row in enumerate(matrix):
    for y,item in enumerate(row):
        matrix[x][y] = item * 2

CPU times: user 11 µs, sys: 0 ns, total: 11 µs
Wall time: 15 µs


In [6]:
%%time
np_matrix = np_matrix*2

CPU times: user 24 µs, sys: 5 µs, total: 29 µs
Wall time: 32.2 µs


## Simple benchmark 2
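- On a tiny matrix, numpy's per-call overhead dominates, so the plain Python loop can come out ahead
- Let's try multiplying a 1 million element list by 2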

In [7]:
vector = list(range(1000000))
np_vector = np.arange(1000000)
print(len(vector))
print(len(np_vector))

1000000
1000000


In [8]:
%%time
for i,x in enumerate(vector):
    vector[i] = x*2

CPU times: user 142 ms, sys: 1.8 ms, total: 143 ms
Wall time: 184 ms


In [9]:
%%time
np_vector = np_vector*2

CPU times: user 2.63 ms, sys: 3.03 ms, total: 5.65 ms
Wall time: 5.38 ms


## Simple benchmark 3
- Let's try multiplying a 1000x1000 matrix by 2

In [10]:
matrix = []
for row in range(1000):
    matrix.append([])
    for column in range(1000):
        value = (row*1000)+column
        matrix[row].append(value)
        
np_matrix = np.arange(1000000)
np_matrix = np_matrix.reshape(1000,1000)

In [11]:
%%time
for x , row in enumerate(matrix):
    for y,item in enumerate(row):
        matrix[x][y] = item * 2


CPU times: user 169 ms, sys: 3.1 ms, total: 172 ms
Wall time: 171 ms


In [12]:
%%time
np_matrix = np_matrix*2

CPU times: user 2.65 ms, sys: 3.23 ms, total: 5.88 ms
Wall time: 4.9 ms
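
`%%time` measures a single run, so the numbers can be noisy. As a rough cross-check outside Jupyter, the standard-library `timeit` module can repeat each variant several times. This is only a sketch: the size `n` and the repeat counts are arbitrary choices, and the timings will differ from machine to machine.

```python
import timeit

import numpy as np

n = 100_000  # arbitrary size chosen for this sketch
vector = list(range(n))
np_vector = np.arange(n)

def double_list():
    # pure-Python doubling, building a new list
    return [x * 2 for x in vector]

def double_numpy():
    # vectorized doubling, executed in compiled code
    return np_vector * 2

# best of 5 repetitions, 10 calls each
list_time = min(timeit.repeat(double_list, number=10, repeat=5))
numpy_time = min(timeit.repeat(double_numpy, number=10, repeat=5))
print(f"list:  {list_time:.4f} s")
print(f"numpy: {numpy_time:.4f} s")
```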


## Matrix Initialization
- `np.array([[list1],[list2],...,[listn]])` - creates an N-dimensional array from a list of lists

In [13]:
np.array([
    [1,2,3],
    [4,5,6],
    [7,8,9]
])

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

## Matrix Initialization
- `np.zeros([x,y,z,..])` - creates an N-dimensional array of zeros
- `np.ones([x,y,z,..])` - creates an N-dimensional array of ones

In [14]:
print("array of zeros:")
print(np.zeros((3,4)))
print("array of ones:")
print(np.ones((3,4)))


array of zeros:
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
array of ones:
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]


## Matrix Initialization
- `np.full([x,y,z,..],value)` - creates an N-dimensional array filled with a constant value
- `np.random.random([x,y,z,..])` - creates an N-dimensional array of random values drawn uniformly from [0, 1)

In [15]:
print("array of constants:")
print(np.full([3,4],16))
print("array of random values:")
print(np.random.random([3,4]))

array of constants:
[[16 16 16 16]
 [16 16 16 16]
 [16 16 16 16]]
array of random values:
[[0.93236417 0.34040285 0.45188013 0.6264704 ]
 [0.91257783 0.72286607 0.21967525 0.93131759]
 [0.9488106  0.49443911 0.97786523 0.23043897]]


## Matrix Initialization

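- `np.empty([x,y,z,..])` - creates an N-dimensional array without initializing its entries; the contents are whatever was already in memory (in the output below they happen to match the random matrix from the previous cell)
- `np.eye(N)` - creates an N x N identity matrix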

In [16]:
print("empty array:")
print(np.empty([3,4]))
print("identity matrix:")
print(np.eye(3))

empty array:
[[0.93236417 0.34040285 0.45188013 0.6264704 ]
 [0.91257783 0.72286607 0.21967525 0.93131759]
 [0.9488106  0.49443911 0.97786523 0.23043897]]
identity matrix:
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
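
Because `np.empty` leaves its contents undefined, it is typically used as a preallocated buffer that gets fully overwritten before anything is read from it. A minimal sketch of that pattern follows; the fill rule (row index times 10 plus column index) is made up for illustration.

```python
import numpy as np

rows, cols = 3, 4
buffer = np.empty((rows, cols), dtype=int)  # contents are arbitrary at this point

# Overwrite every entry before reading anything back
for r in range(rows):
    for c in range(cols):
        buffer[r, c] = r * 10 + c

print(buffer)
```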


## Matrix Initialization
- `np.arange([start,] stop[, step])` - creates a 1D array from `start` (default 0) up to, but not including, `stop`, with a step size of `step` (default 1)
- `np.linspace(start, stop, num=50)` - creates a 1D array of `num` evenly spaced values from `start` to `stop`, both included

In [17]:
print("arange:")
print(np.arange(10,15,0.1))
print("linspace:")
print(np.linspace(10,15,10))

arange:
[10.  10.1 10.2 10.3 10.4 10.5 10.6 10.7 10.8 10.9 11.  11.1 11.2 11.3
 11.4 11.5 11.6 11.7 11.8 11.9 12.  12.1 12.2 12.3 12.4 12.5 12.6 12.7
 12.8 12.9 13.  13.1 13.2 13.3 13.4 13.5 13.6 13.7 13.8 13.9 14.  14.1
 14.2 14.3 14.4 14.5 14.6 14.7 14.8 14.9]
linspace:
[10.         10.55555556 11.11111111 11.66666667 12.22222222 12.77777778
 13.33333333 13.88888889 14.44444444 15.        ]
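
The main practical difference between the two functions is how the end point is treated. A small sketch with values chosen for illustration:

```python
import numpy as np

stepped = np.arange(0, 5, 1)   # the stop value 5 is excluded
spaced = np.linspace(0, 1, 5)  # the stop value 1 is included

print(stepped)
print(spaced)
```

When the step is a non-integer such as 0.1, `np.linspace` is usually the safer choice, since the number of elements is stated explicitly instead of depending on floating-point rounding.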


## Linear Algebra: Operations
- Numpy has efficient and readable operations for Linear Algebra

<table>
<tr>
    <th>Operator</th>
    <th>Numpy Method Format</th>
    <th>Operation</th>
</tr>
<tr>
    <td> A + B </td>
    <td>np.add(A, B)</td>
    <td>matrix addition</td>
</tr>
<tr>
    <td> A - B </td>
    <td>np.subtract(A,B)</td>
    <td>matrix subtraction</td>
</tr>
<tr>
    <td> A / B </td>
    <td>np.divide(A, B)</td>
    <td>element-wise division</td>
</tr>
<tr>
    <td> A * B </td>
    <td>np.multiply(A,B)</td>
    <td>element-wise multiplication</td>
</tr>
<tr>
    <td> A @ B </td>
    <td>np.matmul(A, B)</td>
    <td>matrix product (same as np.dot(A, B) for 2-D arrays)</td>
</tr>
</table>

In [18]:
# create matrices A and B
A = np.arange(9)
A = A.reshape(3,3)
print('A:\n',A)
B = np.random.randint(1,100,size=(3,3))
print('B:\n',B)
b = np.array([2,4,6])
print('b:\n',b)
scalar_cons = 2
print('scalar_cons:\n',scalar_cons)

A:
 [[0 1 2]
 [3 4 5]
 [6 7 8]]
B:
 [[52  3 21]
 [37 10 97]
 [41 65  4]]
b:
 [2 4 6]
scalar_cons:
 2


## Scalar Addition and Subtraction
- Addition:

In [19]:
print('A:\n',A)
print('scalar_cons:\n',scalar_cons)
C = A + scalar_cons
print('Scalar Addition:\n',C)

A:
 [[0 1 2]
 [3 4 5]
 [6 7 8]]
scalar_cons:
 2
Scalar Addition:
 [[ 2  3  4]
 [ 5  6  7]
 [ 8  9 10]]


## Scalar Addition and Subtraction
- Subtraction:

In [20]:
print('A:\n',A)
print('scalar_cons:\n',scalar_cons)
C = A - scalar_cons
print('Scalar Subtraction:\n',C)

A:
 [[0 1 2]
 [3 4 5]
 [6 7 8]]
scalar_cons:
 2
Scalar Subtraction:
 [[-2 -1  0]
 [ 1  2  3]
 [ 4  5  6]]


## Vector Addition and Subtraction
- Addition:

In [21]:
print('A:\n',A)
print('b:\n',b)
C = A + b
print('Vector Addition:\n',C)

A:
 [[0 1 2]
 [3 4 5]
 [6 7 8]]
b:
 [2 4 6]
Vector Addition:
 [[ 2  5  8]
 [ 5  8 11]
 [ 8 11 14]]


## Vector Addition and Subtraction
- Subtraction:

In [22]:
print('A:\n',A)
print('b:\n',b)
C = A - b
print('Vector Subtraction:\n',C)

A:
 [[0 1 2]
 [3 4 5]
 [6 7 8]]
b:
 [2 4 6]
Vector Subtraction:
 [[-2 -3 -4]
 [ 1  0 -1]
 [ 4  3  2]]
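
The vector operations above rely on NumPy's broadcasting: the 1D vector `b` is treated as if it were repeated for every row of `A`. The sketch below makes that explicit with `np.tile`. It also shows one way to apply `b` down the columns instead, by reshaping it into a column vector; that second variant is an addition for illustration and is not part of the original cells.

```python
import numpy as np

A = np.arange(9).reshape(3, 3)
b = np.array([2, 4, 6])

row_wise = A + b                   # b is added to every row
explicit = A + np.tile(b, (3, 1))  # same result with b stacked 3 times
col_wise = A + b[:, np.newaxis]    # b as a 3x1 column: added to every column

print(row_wise)
print(col_wise)
```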


## Matrix Addition and Subtraction
- Addition:

In [23]:
print('A:\n',A)
print('B:\n',B)
C = A + B
print('Matrix Addition:\n',C)

A:
 [[0 1 2]
 [3 4 5]
 [6 7 8]]
B:
 [[52  3 21]
 [37 10 97]
 [41 65  4]]
Matrix Addition:
 [[ 52   4  23]
 [ 40  14 102]
 [ 47  72  12]]


## Matrix Addition and Subtraction
- Subtraction:

In [24]:
print('A:\n',A)
print('B:\n',B)
C = A - B
print('Matrix Subtraction:\n',C)

A:
 [[0 1 2]
 [3 4 5]
 [6 7 8]]
B:
 [[52  3 21]
 [37 10 97]
 [41 65  4]]
Matrix Subtraction:
 [[-52  -2 -19]
 [-34  -6 -92]
 [-35 -58   4]]


## Matrix Multiplication
- element-wise (Hadamard) multiplication, not the matrix product
$$C=A \odot B \Leftrightarrow c_{ij} = a_{ij}b_{ij}$$

In [25]:
print('A:\n',A)
print('B:\n',B)
C = A * B
print('Matrix Multiplication:\n',C)

A:
 [[0 1 2]
 [3 4 5]
 [6 7 8]]
B:
 [[52  3 21]
 [37 10 97]
 [41 65  4]]
Matrix Multiplication:
 [[  0   3  42]
 [111  40 485]
 [246 455  32]]


## Matrix Dot Multiplication
- Let **A** be an _n x m_ matrix
- Let **B** be an _m x p_ matrix
$$C=A\cdot B \Leftrightarrow c_{ij} = \sum_{k=1}^{m}a_{ik}b_{kj}$$
- Where _i = 1, ..., n_ and _j = 1, ..., p_.

In [26]:
print('A:\n',A)
print('B:\n',B)
C = A @ B
print('Matrix Dot Multiplication:\n',C)

A:
 [[0 1 2]
 [3 4 5]
 [6 7 8]]
B:
 [[52  3 21]
 [37 10 97]
 [41 65  4]]
Matrix Dot Multiplication:
 [[119 140 105]
 [509 374 471]
 [899 608 837]]
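
To connect the formula to the `@` operator, the sum can be written out as explicit loops, where `k` runs over the dimension shared by the columns of `A` and the rows of `B`. This is a sketch for understanding only; `B` is copied from the sample output above, and in practice `A @ B` is the version to use.

```python
import numpy as np

A = np.arange(9).reshape(3, 3)
# B copied from the sample output above
B = np.array([[52, 3, 21], [37, 10, 97], [41, 65, 4]])

n, m = A.shape
p = B.shape[1]
C = np.zeros((n, p), dtype=A.dtype)
for i in range(n):
    for j in range(p):
        for k in range(m):
            # accumulate a_ik * b_kj over the shared dimension
            C[i, j] += A[i, k] * B[k, j]

print(C)
print(np.array_equal(C, A @ B))
```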


## Matrix Scalar Division

In [27]:
print('A:\n',A)
print('scalar_cons:\n',scalar_cons)
C = A / scalar_cons
print('Scalar division:\n',C)

A:
 [[0 1 2]
 [3 4 5]
 [6 7 8]]
scalar_cons:
 2
Scalar division:
 [[0.  0.5 1. ]
 [1.5 2.  2.5]
 [3.  3.5 4. ]]


## Matrix Vector Division

In [28]:
print('A:\n',A)
print('b:\n',b)
C = A / b
print('Vector division:\n',C)

A:
 [[0 1 2]
 [3 4 5]
 [6 7 8]]
b:
 [2 4 6]
Vector division:
 [[0.         0.25       0.33333333]
 [1.5        1.         0.83333333]
 [3.         1.75       1.33333333]]


## Matrix Division
- element-wise division
$$ C = A \oslash B \Leftrightarrow c_{ij} = \frac{a_{ij}}{b_{ij}}$$

In [29]:
print('A:\n',A)
print('B:\n',B)
C =  A / B
print('Matrix element-wise Division(C = A / B):\n',C)

A:
 [[0 1 2]
 [3 4 5]
 [6 7 8]]
B:
 [[52  3 21]
 [37 10 97]
 [41 65  4]]
Matrix element-wise Division(C = A / B):
 [[0.         0.33333333 0.0952381 ]
 [0.08108108 0.4        0.05154639]
 [0.14634146 0.10769231 2.        ]]


## Matrix Division
- True matrix division **(valid only if B has an inverse)**
$$ \frac{A}{B} = AB^{-1}$$

In [30]:
print('A:\n',A)
print('B:\n',B)
C =  A / B
print('Matrix Division(C = A / B):\n',C)
C =  A@np.linalg.inv(B)
print('Matrix Division(C = A @ B^-1):\n',C)
# validate that C is correct
print(C@B)

A:
 [[0 1 2]
 [3 4 5]
 [6 7 8]]
B:
 [[52  3 21]
 [37 10 97]
 [41 65  4]]
Matrix Division(C = A / B):
 [[0.         0.33333333 0.0952381 ]
 [0.08108108 0.4        0.05154639]
 [0.14634146 0.10769231 2.        ]]
Matrix Division(C = A @ B^-1):
 [[-0.02870432  0.02631077  0.01266162]
 [-0.02384746  0.05447177  0.05425884]
 [-0.01899059  0.08263277  0.09585606]]
[[-5.89805982e-17  1.00000000e+00  2.00000000e+00]
 [ 3.00000000e+00  4.00000000e+00  5.00000000e+00]
 [ 6.00000000e+00  7.00000000e+00  8.00000000e+00]]
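
Computing the inverse explicitly is not the only way to obtain $AB^{-1}$. As a hedged alternative that the original cells do not use, `np.linalg.solve` can solve the equivalent linear system: since $CB = A$ is the same as $B^T C^T = A^T$, we can solve for $C^T$ and transpose back. `B` is copied from the sample output above.

```python
import numpy as np

A = np.arange(9).reshape(3, 3)
# B copied from the sample output above
B = np.array([[52, 3, 21], [37, 10, 97], [41, 65, 4]])

C_inv = A @ np.linalg.inv(B)           # approach used in the cell above
C_solve = np.linalg.solve(B.T, A.T).T  # solve B^T C^T = A^T, then transpose

print(np.allclose(C_inv, C_solve))
print(np.allclose(C_solve @ B, A))
```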


In [31]:
# Sample operations
print('A + B:\n',A + B)
print('A - B:\n',A - B)
print('A / B:\n',A / B)
print('A * B:\n',A * B)
print('A @ B:\n',A @ B)

A + B:
 [[ 52   4  23]
 [ 40  14 102]
 [ 47  72  12]]
A - B:
 [[-52  -2 -19]
 [-34  -6 -92]
 [-35 -58   4]]
A / B:
 [[0.         0.33333333 0.0952381 ]
 [0.08108108 0.4        0.05154639]
 [0.14634146 0.10769231 2.        ]]
A * B:
 [[  0   3  42]
 [111  40 485]
 [246 455  32]]
A @ B:
 [[119 140 105]
 [509 374 471]
 [899 608 837]]


## Matrix Manipulations
- As seen earlier, we can change the dimensions of an existing matrix with the `reshape` method

In [32]:
A = np.arange(4)
print(A)
print(A.reshape([2,2])) 

[0 1 2 3]
[[0 1]
 [2 3]]


## Matrix Manipulations
- To flatten a matrix, we have two methods:
    - `ravel` returns a flattened _view_ of the matrix when possible, so changes to it also change the original
    - `flatten` always returns a flattened _copy_ of the matrix

In [33]:
# ravel sample
B = np.array([[1,2,3],[4,5,6]])
C = B.ravel()
print('C:',C)
C[-1] = 7
print('C:',C)
print('B:',B)

C: [1 2 3 4 5 6]
C: [1 2 3 4 5 7]
B: [[1 2 3]
 [4 5 7]]


In [34]:
# flatten sample
B = np.array([[1,2,3],[4,5,6]])
C = B.flatten()
print('C:',C)
C[-1] = 7
print('C:',C)
print('B:',B)

C: [1 2 3 4 5 6]
C: [1 2 3 4 5 7]
B: [[1 2 3]
 [4 5 6]]
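
Whether two arrays share the same underlying data can be checked directly with `np.shares_memory`. The sketch below uses it to show the view/copy difference, including a case (a transposed array, chosen for illustration) where `ravel` cannot return a view and has to copy.

```python
import numpy as np

B = np.array([[1, 2, 3], [4, 5, 6]])

print(np.shares_memory(B, B.ravel()))    # ravel of a contiguous array is a view
print(np.shares_memory(B, B.flatten()))  # flatten always copies
print(np.shares_memory(B, B.T.ravel()))  # the transpose is not C-contiguous, so ravel copies
```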


## Matrix Manipulations
- Just like lists, numpy arrays can have elements appended, using the `np.append` function; unlike `list.append`, it returns a new array and leaves the original unchanged

In [35]:
A = np.array([1,2,3,4,5])
print('A:',A)
print(np.append(A,1))
print(np.append(A,[2,3,4]))

A: [1 2 3 4 5]
[1 2 3 4 5 1]
[1 2 3 4 5 2 3 4]
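
Two behaviours of `np.append` are easy to miss: it never modifies its input, and without an `axis` argument it flattens everything first. A short sketch, with the 2x2 matrix `M` made up for illustration:

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5])
extended = np.append(A, [6, 7])
print(A)         # the original array is unchanged
print(extended)  # a new, longer array

M = np.array([[1, 2], [3, 4]])
flat = np.append(M, [[5, 6]])             # no axis: inputs are flattened first
stacked = np.append(M, [[5, 6]], axis=0)  # axis=0: appended as a new row
print(flat)
print(stacked)
```

Since every call allocates a new array, appending inside a long loop can get slow; collecting values in a Python list and converting once with `np.array` is a common alternative.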


## Matrix Manipulations
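- Since Numpy excels at linear algebra, it also provides very useful operations such as transposing a matrix and taking its inverse
    - To transpose a matrix `A`: `A.T`
    - To compute the inverse of matrix `A`: `np.linalg.inv(A)`
- The inverse exists only for non-singular matrices; the sample `A` below is singular (its rows are linearly dependent), so the huge values printed are floating-point artifacts rather than a real inverse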

In [36]:
A = np.array([
    [2,4,6],
    [8,10,12],
    [14,16,18]
])
print('Transpose of A:')
print(A.T)
print('Inverse of A:')
print(np.linalg.inv(A))

Transpose of A:
[[ 2  8 14]
 [ 4 10 16]
 [ 6 12 18]]
Inverse of A:
[[ 1.57625987e+15 -3.15251974e+15  1.57625987e+15]
 [-3.15251974e+15  6.30503948e+15 -3.15251974e+15]
 [ 1.57625987e+15 -3.15251974e+15  1.57625987e+15]]
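
Before trusting the output of `np.linalg.inv`, it can help to check whether the matrix is actually invertible. The sketch below does this for the sample `A` using its rank and condition number, then inverts a small matrix `M` (made up for this example) that is known to be invertible.

```python
import numpy as np

A = np.array([[2, 4, 6], [8, 10, 12], [14, 16, 18]])
print(np.linalg.matrix_rank(A))  # a rank below 3 means A is singular
print(np.linalg.cond(A))         # a very large condition number is a warning sign

# A small invertible matrix chosen for this sketch
M = np.array([[2.0, 1.0], [1.0, 1.0]])
M_inv = np.linalg.inv(M)
print(M_inv)
print(np.allclose(M @ M_inv, np.eye(2)))
```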


## Random: Simple random data
- numpy provides many functions for generating simple random data. Two of them are:
    - `np.random.rand(d0, d1, …, dn)` - generates a matrix of random numbers drawn uniformly from [0, 1)
    - `np.random.randn(d0, d1, …, dn)` - generates a matrix of random numbers drawn from the standard normal distribution

In [37]:
# random 3x2 matrix with uniform distribution
np.random.rand(3,2)

array([[0.51021144, 0.44186442],
       [0.92831141, 0.36522915],
       [0.48553739, 0.30924215]])

In [38]:
# random 3x2 matrix with standard normal distribution
np.random.randn(3,2)

array([[-0.31024414, -0.85432039],
       [ 0.26757357, -0.88134475],
       [ 0.27360596, -0.36603366]])

## Random: Permutations
- Shuffling and permutations are also included in numpy via:
    - `np.random.shuffle(x)` - permutes a sequence in-place and returns `None`
    - `np.random.permutation(x)` - returns a permuted copy of the sequence, leaving the original unchanged


In [39]:
# shuffle
A = np.arange(10)
print('A:',A)
print('Shuffle:',np.random.shuffle(A)) # returns None, but check value of A below
print('A:',A)

A: [0 1 2 3 4 5 6 7 8 9]
Shuffle: None
A: [1 4 2 6 9 5 3 8 0 7]


In [40]:
# permutation
A = np.arange(10)
print('A:',A)
print('Permutation:',np.random.permutation(A)) # returns a permuted copy; A itself is unchanged
print('A:',A)

A: [0 1 2 3 4 5 6 7 8 9]
Permutation: [2 5 0 3 8 9 6 4 1 7]
A: [0 1 2 3 4 5 6 7 8 9]
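
The results of shuffling change on every run. If reproducibility matters, one option (assuming NumPy 1.17 or newer) is the `Generator` API created by `np.random.default_rng` with a fixed seed. The seed value below is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)  # the seed value 42 is arbitrary

A = np.arange(10)
shuffled_copy = rng.permutation(A)  # returns a new array, A is untouched
print(A)
print(shuffled_copy)

rng.shuffle(A)  # reorders A in place and returns None
print(A)
```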


## Random: Distributions
- numpy can also be used to draw samples from various distributions
- numpy implements 36 types of distributions

## Random: Normal Distributions
- `np.random.normal(loc=0.0, scale=1.0, size=None)`
    - loc - center of distribution
    - scale - standard deviation of distribution
    - size - output shape

In [41]:
loc = 0
scale = 0.1
normal = np.random.normal(loc, scale, 30)
print(np.sort(normal))

[-0.16090775 -0.15507172 -0.14988266 -0.11760981 -0.08957677 -0.06825945
 -0.06424099 -0.06265832 -0.05623899 -0.04534928 -0.04322283 -0.02292111
 -0.01879064 -0.00962895 -0.00381058  0.00126567  0.0219826   0.04189316
  0.05553079  0.07334335  0.07977418  0.08131094  0.09893341  0.10519903
  0.11534444  0.12467043  0.16621186  0.19540889  0.20201479  0.21132592]
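
With only 30 samples it is hard to see that `loc` and `scale` really are the mean and standard deviation. A rough check, assuming a much larger sample and a seeded `np.random.default_rng` generator (both choices made for this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for repeatability
loc, scale = 0, 0.1
samples = rng.normal(loc, scale, 100_000)

print(samples.mean())  # should be close to loc
print(samples.std())   # should be close to scale
```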


## Random: Uniform Distributions
- `np.random.uniform(low=0.0, high=1.0, size=None)`
    - low - lower boundary of interval (inclusive)
    - high - upper boundary of interval (exclusive)
    - size - output shape

In [42]:
low = 10
high = 20
uniform = np.random.uniform(low,high,30)
print(np.sort(uniform))

[10.0635118  10.80629176 11.44481832 12.25892569 12.26213275 12.90268294
 13.21116594 13.59915431 13.62427497 13.75661932 14.64448945 15.2279316
 15.51367435 15.70238723 15.75531763 15.98414262 16.09267964 16.11398484
 17.11075273 17.29780686 17.42955146 17.60095108 17.63455282 17.69933056
 17.89027268 18.43209985 18.56890752 18.69938386 19.01562756 19.24993363]
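
The interval boundaries can be checked empirically in the same way. This sketch again assumes a large sample and a seeded generator, neither of which comes from the original cell:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed for repeatability
low, high = 10, 20
samples = rng.uniform(low, high, 100_000)

print(samples.min() >= low)  # no sample falls below low
print(samples.max() < high)  # no sample reaches high
print(samples.mean())        # should be close to the midpoint (low + high) / 2
```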


## Further Reading:
- A more in-depth intro: https://docs.scipy.org/doc/numpy/user/quickstart.html
- More advanced linear algebra methods: https://docs.scipy.org/doc/numpy/reference/routines.linalg.html
- The full power of the random sampling module: https://docs.scipy.org/doc/numpy/reference/routines.random.html

# FIN