# Introduction to the Numpy Library

In this lecture we will look at Numpy, the numerical Python library.

## What is Numpy?

Numpy is a core Python package for scientific computing that

 * provides a powerful N-dimensional array object
 * offers highly optimised linear algebra tools
 * integrates closely with C/C++ and Fortran code
 * is licensed under a BSD license (free to use)
 
 
## Importing Numpy

The most common convention is



In [2]:
import numpy as np   # just hit shift+enter to run a cell in notebook!

In the above statement, the Python keyword 'as' lets us use 'np' as a shorthand for the 'numpy' module.

In [3]:
# You will also frequently see the following statement

import matplotlib.pyplot as plt

# this makes the plotting functions in matplotlib.pyplot accessible through the alias 'plt'

## Generating a numpy array

There are several ways to generate a Numpy array:
 * from Python lists containing numeric data
 * using Numpy's array-generating functions
 * by reading numeric data from a file
 
Numpy represents an N-dimensional array as the type 'numpy.ndarray'.

## Creating arrays from lists

In [4]:
data_list = [1,2,3,4,5]
array_1D = np.array(data_list)

print(array_1D)            # [1 2 3 4 5]
print(type(array_1D))      # <class 'numpy.ndarray'>

[1 2 3 4 5]
<class 'numpy.ndarray'>


In [5]:
my_2d_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(my_2d_array)
print(type(my_2d_array))

[[1 2 3]
 [4 5 6]
 [7 8 9]]
<class 'numpy.ndarray'>


## The ndarray object’s properties

The ndarray has various properties that we can access. For example, using my_2d_array from above:
 

In [6]:
print(my_2d_array.shape)   # the shape of a numpy array 

(3, 3)


In [7]:
print(my_2d_array.size)    # the size of a numpy array
                           # returns total number of elements

9


In [8]:
print(my_2d_array.dtype)   # Print the type of data the array holds 
                           # (eg: int64, float64 etc)

int64
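
The dtype shown above was inferred from the input data. As a small sketch (the variable names here are illustrative, not from the lecture), the type can also be requested explicitly or converted afterwards:

```python
import numpy as np

# dtype is inferred from the input data unless it is given explicitly
ints = np.array([1, 2, 3])
floats = np.array([1, 2, 3], dtype=np.float64)   # force floating point
mixed = np.array([1, 2.5, 3])                    # one float promotes the whole array

print(ints.dtype)     # an integer type; the exact width depends on the platform
print(floats.dtype)   # float64
print(mixed.dtype)    # float64

# astype returns a new array converted to the requested type
as_int = mixed.astype(np.int64)   # the fractional part is dropped: 2.5 -> 2
print(as_int)                     # [1 2 3]
```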


## N-dimensional arrays

Note, numpy generalises arrays to be N-dimensional


In [9]:
x2 = np.array([[1,2], [3, 4]])       # a matrix 
print(x2.shape)                      # (2, 2)

x3 = np.array([x2, x2])              # stacking two matrices
print(x3.shape)                      # (2, 2, 2)

(2, 2)
(2, 2, 2)


In [10]:
x4 = np.array([x3, x3, x3, x3, x3])  # stacking 5 3-D structures
print(x4.shape)                      # (5, 2, 2, 2)
#print(x4)
   
x5 = np.array([x4, x4])              # stacking 2 4-D structures!
print(x5.shape)                      # (2, 5, 2, 2, 2)

(5, 2, 2, 2)
(2, 5, 2, 2, 2)


## Array generating functions

### 1. arange function

In [11]:
print(np.arange(10))     # generates 10 numbers starting from 0

[0 1 2 3 4 5 6 7 8 9]


In [12]:
print(np.arange(10, 20))   # generates numbers from 10 up to, but not including, 20

[10 11 12 13 14 15 16 17 18 19]


In [13]:
print(np.arange(10, 20, 2))   # from 10 up to, but not including, 20, in steps of 2

[10 12 14 16 18]


### 2.  linspace

In [14]:
print(np.linspace(10, 20, 5))  # start, stop, n-points

[10.  12.5 15.  17.5 20. ]
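
One difference between the two functions is easy to miss: arange excludes the stop value, while linspace includes it by default. A minimal comparison, with values chosen only for illustration:

```python
import numpy as np

a = np.arange(0, 1, 0.25)                  # stop value 1 is excluded
b = np.linspace(0, 1, 5)                   # stop value 1 is included by default
c = np.linspace(0, 1, 4, endpoint=False)   # endpoint=False excludes the stop value

print(a)   # [0.   0.25 0.5  0.75]
print(b)   # [0.   0.25 0.5  0.75 1.  ]
print(c)   # [0.   0.25 0.5  0.75]
```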


### 3. ones and zeros functions

In [15]:
print(np.zeros((3, 3)))   # the argument is a tuple !

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


In [16]:
print(np.ones((3, 3)))   # the argument is a tuple !

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]


### 4. diagonal and identity matrix

In [17]:
print(np.diag((4,5,3)))  # generate diagonal matrix

[[4 0 0]
 [0 5 0]
 [0 0 3]]


In [18]:
print(np.diag((2,2), k=1))    # k=1 places the values on the first diagonal above the main one

[[0 2 0]
 [0 0 2]
 [0 0 0]]


In [19]:
print(np.eye(3)) # Identity matrix

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


### 5. Random numbers

#### Initialise array with random numbers from uniform distribution between 0 and 1


In [20]:
print(np.random.rand(2,4)) # Create an array of the given shape and populate it with random samples 
                           # from a uniform distribution over [0, 1)

[[0.50242699 0.47454204 0.01740772 0.32686565]
 [0.53512545 0.00087277 0.00816952 0.84037682]]


#### Initialise array with random numbers from standard normal distribution

In [21]:
print(np.random.standard_normal((2,4)))  #Draw samples from a standard Normal distribution (mean=0, stdev=1)

[[ 0.67151915  0.51954706 -0.01560633  0.84285068]
 [ 0.50219375  0.75089655  0.55199581  1.0536527 ]]


#### Initialise array with random integer numbers in a given range

In [22]:
x = np.random.randint(low=1, high=100, size=10)  # 10 integers picked at random with
print(x)                                         # low <= value < high (high is excluded)
                                   # Every time you run this cell you will get a different answer

[99 51 63 56 32 46 91 49 34 18]
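
The random cells above give different numbers on every run. When reproducible results are needed, one option is a seeded generator created with np.random.default_rng. This is a sketch rather than part of the lecture's cells, and the seed value 42 is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(seed=42)         # 42 is an arbitrary seed

u = rng.random((2, 4))                       # uniform samples over [0, 1)
n = rng.standard_normal((2, 4))              # standard normal samples
k = rng.integers(low=1, high=100, size=10)   # integers with 1 <= value < 100

# Re-creating the generator with the same seed reproduces the same numbers
rng2 = np.random.default_rng(seed=42)
same = np.array_equal(u, rng2.random((2, 4)))
print(same)   # True
```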


## Reading arrays from files

 * genfromtxt for reading from a text file
 * savetxt for writing to a text file
 * load and save for reading and writing Numpy's binary .npy format
 * savez for bundling several arrays into a single .npz archive

###  reading a csv file

In [23]:
data = np.genfromtxt('testInput.txt', delimiter=',')
print(data)

[[1. 2. 3. 4. 5.]
 [6. 7. 8. 9. 1.]
 [3. 2. 4. 6. 7.]
 [7. 3. 6. 9. 1.]]


### saving a text file

In [24]:
np.savetxt('output.csv', data, delimiter=',', fmt='%.5f')

### Saving in numpy .npy binary format

In [25]:
np.save('output1', data)   # just pass the output file name; the .npy extension is appended automatically

### Loading numpy .npy binary file

In [26]:
read_data = np.load('output1.npy')  # must supply file extension .npy

In [27]:
data2 = data.copy()  # copy the array data into data2 (plain 'data2 = data' would only create a second reference)

### Saving numpy arrays in archive form with .npz extension

In [28]:
np.savez('output2', data=data, data2=data2)

### Loading .npz archive file

In [29]:
read_data = np.load('output2.npz')

In [30]:
print(read_data.files)    # to view the files in zipped data

['data', 'data2']


In [31]:
print(read_data['data'])   # accessing the first part of data

[[1. 2. 3. 4. 5.]
 [6. 7. 8. 9. 1.]
 [3. 2. 4. 6. 7.]
 [7. 3. 6. 9. 1.]]


In [32]:
print(read_data['data2'])   # accessing the second file

[[1. 2. 3. 4. 5.]
 [6. 7. 8. 9. 1.]
 [3. 2. 4. 6. 7.]
 [7. 3. 6. 9. 1.]]
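
The read and write functions above can be checked end to end with a round trip. The sketch below assumes nothing about testInput.txt: it uses a small stand-in array, hypothetical file names (demo.csv, demo.npy, demo.npz) and a temporary directory so that no files are left behind.

```python
import os
import tempfile

import numpy as np

data = np.array([[1., 2., 3.], [4., 5., 6.]])   # stand-in for data read from a file

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "demo.csv")
    npy_path = os.path.join(tmp_dir, "demo.npy")
    npz_path = os.path.join(tmp_dir, "demo.npz")

    # text format: write, then read back
    np.savetxt(csv_path, data, delimiter=",", fmt="%.5f")
    from_csv = np.genfromtxt(csv_path, delimiter=",")

    # binary .npy format: one array per file
    np.save(npy_path, data)
    from_npy = np.load(npy_path)

    # .npz archive: several named arrays in one file
    np.savez(npz_path, first=data, second=2 * data)
    with np.load(npz_path) as archive:     # the archive file is closed on exit
        names = sorted(archive.files)
        second = archive["second"]

print(np.allclose(from_csv, data))      # True
print(np.array_equal(from_npy, data))   # True
print(names)                            # ['first', 'second']
```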


## Array manipulation

In [33]:
x = np.array([1,2,3,4,5,6,7])
print (x[0])
print (x[2:5])
print (x[:4])
print (x[4:])

#Array indexing is similar to lists but generalised to N-dimensions

1
[3 4 5]
[1 2 3 4]
[5 6 7]


In [34]:
x = np.ones((5,5))
print (x[1:3,1:5])

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]]


### Extracting a row or column vector from a matrix

In [35]:
data = np.genfromtxt('testInput.txt', delimiter=',')

In [36]:
print (x[2,:])            # extract row 2 (can also write as x[2])
print (x[2,:].shape)     

[1. 1. 1. 1. 1.]
(5,)


In [37]:
print (x[:,2])            # extract column 2
print (x[:,2].shape)

[1. 1. 1. 1. 1.]
(5,)
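
With a matrix of ones it is hard to see which elements were picked. The sketch below repeats the same indexing on a small matrix with distinct values (M is an illustrative name) and also shows that a slice, unlike an integer index, keeps the result two-dimensional:

```python
import numpy as np

M = np.arange(12).reshape(3, 4)   # 3 rows, 4 columns, values 0..11

row = M[1, :]        # an integer index drops that axis -> shape (4,)
col = M[:, 2]        # shape (3,)
col_2d = M[:, 2:3]   # a slice keeps the axis -> shape (3, 1)

print(row)            # [4 5 6 7]
print(col)            # [ 2  6 10]
print(col_2d.shape)   # (3, 1)
```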


## Data processing

There exist many data processing methods, such as min, max, sum, prod, mean, etc.

In [38]:
import numpy as np
x = np.array([1,2,3,4,5,6])

In [39]:
print(x.min())         # minimum in x, prints 1
print(x.max())         # maximum in x, prints 6
print(x.sum())         # sum of all elements in x 
print(x.prod())        # product of all elements in x 
print(x.mean())        # mean of x
print(x.var())         # variance of x

1
6
21
720
3.5
2.9166666666666665


Another way of calling these functions would be to use
 * np.max(array_name)
 * np.mean(array_name)

## On matrices

In [40]:
x = np.array([[1,2,3,4,5,6],[3,4,5,6,7,8]])     # 2d Matrix

In [41]:
print(x.min())           # minimum over all elements, returns 1
print(x.min(axis=0))     # minimum of each column, returns [1 2 3 4 5 6]
                         # axis=0 reduces along the rows, leaving one value per column
    
print(x.sum())           # sum of all elements in x
print(x.sum(axis=0))     # sum of each column

print(x.mean())                # mean of all elements in x
print(x.mean(axis=0))          # mean of each column

1
[1 2 3 4 5 6]
54
[ 4  6  8 10 12 14]
4.5
[2. 3. 4. 5. 6. 7.]
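
For comparison, the same reductions can be taken with axis=1, which gives one value per row. A short sketch using the same matrix (the keepdims line is an extra illustration, not from the lecture):

```python
import numpy as np

x = np.array([[1, 2, 3, 4, 5, 6], [3, 4, 5, 6, 7, 8]])

row_min = x.min(axis=1)     # minimum of each row
row_sum = x.sum(axis=1)     # sum of each row
row_mean = x.mean(axis=1)   # mean of each row

print(row_min)    # [1 3]
print(row_sum)    # [21 33]
print(row_mean)   # [3.5 5.5]

# keepdims=True keeps the reduced axis with length 1
row_sum_2d = x.sum(axis=1, keepdims=True)
print(row_sum_2d.shape)   # (2, 1)
```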


## Reshaping and resizing

Converting a matrix to vector or vice versa

In [42]:
M = np.array([1,2,3,4,5,6,7,8,9]).reshape(3,3)
print(M)

[[1 2 3]
 [4 5 6]
 [7 8 9]]


In [43]:
v = M.reshape(9)
print(v)

[1 2 3 4 5 6 7 8 9]


In [45]:
v = M.reshape(8) # Raises an error: reshaping cannot change the number of elements

ValueError: cannot reshape array of size 9 into shape (8,)
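
Since the total number of elements is fixed, one dimension can always be worked out from the others. Passing -1 for that dimension asks Numpy to compute it. A minimal sketch:

```python
import numpy as np

M = np.arange(1, 10).reshape(3, 3)

flat = M.reshape(-1)          # -1 means "whatever length fits": here 9
print(flat.shape)             # (9,)

v = np.arange(12)
two_rows = v.reshape(2, -1)   # 2 rows, the column count is inferred as 6
print(two_rows.shape)         # (2, 6)
```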

## Adding a new dimension

In [46]:
v = np.array([1,2,3,4,5])
print (v)                         # [1 2 3 4 5]
print (v.shape)                   # (5,)
v_row = v[np.newaxis, :]         
print (v_row)                # [[1 2 3 4 5]]  
print (v_row.shape)          # turn a vector into a 1-row matrix. Prints (1,5)

v_col = v[:, np.newaxis]
print(v_col)                     
print (v_col.shape)         # turn a vector into a 1-column matrix. Prints (5,1)

[1 2 3 4 5]
(5,)
[[1 2 3 4 5]]
(1, 5)
[[1]
 [2]
 [3]
 [4]
 [5]]
(5, 1)


### Stacking arrays

Arrays can be stacked horizontally or vertically. Make sure the dimensions are compatible.


In [47]:
x = np.ones((2,3))
y = np.zeros((2,2))

z = np.hstack((x, y, x))       # Horizontally stacking the arrays.
                               # Arrays are passed as a tuple
print (z)
print (z.shape)


[[1. 1. 1. 0. 0. 1. 1. 1.]
 [1. 1. 1. 0. 0. 1. 1. 1.]]
(2, 8)


In [48]:
x = np.ones((2,2))
y = np.zeros((1,2))

z = np.vstack((x,y))     # Vertically stacking the arrays
print (z)
print (z.shape)

[[1. 1.]
 [1. 1.]
 [0. 0.]]
(3, 2)
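
For 2d arrays, hstack and vstack behave like np.concatenate along axis 1 and axis 0 respectively. The sketch below checks this and shows what happens when the dimensions are not compatible (the names top, bottom and stack_failed are illustrative):

```python
import numpy as np

x = np.ones((2, 3))
y = np.zeros((2, 2))

h1 = np.hstack((x, y))
h2 = np.concatenate((x, y), axis=1)   # join side by side; row counts must match
print(np.array_equal(h1, h2))         # True

top = np.ones((2, 2))
bottom = np.zeros((1, 2))
v1 = np.vstack((top, bottom))
v2 = np.concatenate((top, bottom), axis=0)   # join top to bottom; column counts must match
print(np.array_equal(v1, v2))                # True

stack_failed = False
try:
    np.vstack((x, y))                 # column counts differ: 3 vs 2
except ValueError as err:
    stack_failed = True
    print("cannot stack:", err)
```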


## Duplicating array using tile and repeat

In [52]:
x = np.array([[1,2],[3,4]])

print('**************')
print('Array x is: ', x)
print('**************')

y = np.tile(x, 3)         # repeat x three times along the last axis (side by side)
print(y)

z=np.tile(x, (2,4))       # repeat x 2 times vertically and 4 times horizontally
print(z)

**************
Array x is:  [[1 2]
 [3 4]]
**************
[[1 2 1 2 1 2]
 [3 4 3 4 3 4]]
[[1 2 1 2 1 2 1 2]
 [3 4 3 4 3 4 3 4]
 [1 2 1 2 1 2 1 2]
 [3 4 3 4 3 4 3 4]]


In [53]:
a=np.repeat(x, 4)            # repeat each element 4 times. Does not preserve the shape/dimension
print(a)
print('*************')

b=np.repeat(x, 4, axis=1)    # repeat each element 4 times preserving dimension 
print(b)

[1 1 1 1 2 2 2 2 3 3 3 3 4 4 4 4]
*************
[[1 1 1 1 2 2 2 2]
 [3 3 3 3 4 4 4 4]]


## Copying in python

By default, arrays in Python are handled by reference. This means that when you write B = A you are just copying a reference, not the data itself. Therefore any change you make through B is also visible through A.


In [54]:
A = np.array([1,2,3,4,5,6])
B = A
B[0] = 10
print(A)

[10  2  3  4  5  6]


Note, this is also true for Python lists and objects in general

In [55]:
A = [1,2,3,4]
B = A
B[0] = 10
print(A)

[10, 2, 3, 4]


## Deep copy using the copy() method

To actually copy the data stored in the array we use the numpy copy method,

In [56]:
A = np.array([1, 2, 3, 4])
B = A.copy()                     # can also write, B = np.copy(A)

B[0] = 10
print(A)

[1 2 3 4]
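
The same distinction applies to slices: basic slicing of a Numpy array returns a view onto the original data, not a copy. A short sketch (the names view and independent are illustrative) that uses np.shares_memory to check:

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5, 6])

view = A[:3]            # basic slicing returns a view onto A's data
view[0] = 10            # this also changes A
print(A)                # [10  2  3  4  5  6]

independent = A[:3].copy()   # an explicit copy owns its own data
independent[1] = 99          # A is not affected
print(A)                     # [10  2  3  4  5  6]

print(np.shares_memory(A, view))          # True
print(np.shares_memory(A, independent))   # False
```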


To deep-copy a Python list we can use the copy module,

In [57]:
import copy
A = [1,2,3,4]
B = copy.deepcopy(A)
print(B)

# It's easy to confuse Numpy arrays with lists. Careful!

[1, 2, 3, 4]
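
For a flat list of numbers a shallow copy would be enough. The difference only shows up with nested lists, as this small sketch illustrates (the variable names are illustrative):

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = copy.copy(nested)     # new outer list, but the inner lists are shared
deep = copy.deepcopy(nested)    # inner lists are copied as well

nested[0][0] = 10

print(shallow[0][0])   # 10, the shallow copy sees the change
print(deep[0][0])      # 1, the deep copy is unaffected
```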


## Matrix operations

Numpy supports different array operations such as 
 * addition, subtraction, multiplication, division
 * transpose, inverse etc

### Array addition and subtraction

In [58]:
X = np.array([[1, 2, 3], [4, 5, 6]])
Y = np.ones((2,3))

print(X+Y)                     # element-wise addition of X and Y

print(X - 2 * Y)               # scalar multiplication: multiply each element of Y by 2,
                               # then subtract the result element-wise from X

print(X + np.array([[2,2,2],[2,2,2]]))  # element-wise addition with an explicit array

[[2. 3. 4.]
 [5. 6. 7.]]
[[-1.  0.  1.]
 [ 2.  3.  4.]]
[[3 4 5]
 [6 7 8]]


## Broadcasting 

Sometimes the shapes of two arrays are not equal, and one array would have to be copied several times to make the shapes compatible before an operation can be carried out. Numpy provides a way to do this automatically through broadcasting.

In [59]:
X = np.array([[1,2,3],[4,5,6]])     # X is a 2d array
row = np.array([1,1,1])             # row is a 1d array

print(X)
print(row)
print (X + row)      # Shapes differ: (2, 3) and (3,). Numpy broadcasts row across
                     # every row of X, as if row had been copied into a 2d array

[[1 2 3]
 [4 5 6]]
[1 1 1]
[[2 3 4]
 [5 6 7]]


In [60]:
col = np.array([1,1])            # 1d array with one entry per row of X
print(col)
print (X + col[:, np.newaxis])   # reshape col to (2, 1) so it broadcasts across the columns

[1 1]
[[2 3 4]
 [5 6 7]]
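
Broadcasting compares shapes starting from the last dimension: two dimensions are compatible when they are equal or when one of them is 1. The sketch below shows one case that works directly, one that needs np.newaxis, and one that fails (the flag broadcast_failed is only there for illustration):

```python
import numpy as np

X = np.ones((2, 3))
row = np.array([1, 2, 3])    # shape (3,)  matches the last dimension of X
col = np.array([10, 20])     # shape (2,)  does not match the last dimension of X

print((X + row).shape)                  # (2, 3)
print((X + col[:, np.newaxis]).shape)   # (2, 3), since (2, 1) broadcasts against (2, 3)

broadcast_failed = False
try:
    X + col                  # (2, 3) with (2,): last dimensions 3 and 2 are incompatible
except ValueError as err:
    broadcast_failed = True
    print("broadcast failed:", err)
```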


## Matrix multiplication

### Element wise multiplication

In [61]:
A = np.array([[1,2,3],[4,5,6]])
B = np.array([[3,3,3],[4,4,4]])

In [62]:
C = A * B           # element wise multiplication

In [63]:
print(C)

[[ 3  6  9]
 [16 20 24]]


### Standard matrix multiplication

In [64]:
D = np.dot(A, B.T)  # A has shape (2,3), B.T has shape (3,2)
                    # Performs standard matrix multiplication
                    # The number of columns of A must equal the number of rows of B.T

In [65]:
print(D)

[[18 24]
 [45 60]]
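
For 2d arrays the same product can also be written with the @ operator or with np.matmul. A short sketch comparing the three forms on the matrices used above:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([[3, 3, 3], [4, 4, 4]])

D_dot = np.dot(A, B.T)
D_at = A @ B.T                 # matrix multiplication operator
D_matmul = np.matmul(A, B.T)

print(D_at)
# [[18 24]
#  [45 60]]
print(np.array_equal(D_dot, D_at) and np.array_equal(D_dot, D_matmul))   # True
```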


## Transpose

The transpose operation swaps the rows and columns of a matrix.

In [66]:
A = np.array([[1,2,3],[4,5,6]])
print(A)

print(A.T)        # Take the transpose of A. Just use .T
                  # Rows of A now become columns

print(A.shape)     # prints (2,3)
print(A.T.shape)   # (3,2)

v = np.array([1,2,3,4,5])
print(v)

print(v.T)

# Vectors only have one dimension. Transpose does nothing.
print(v.shape)     # prints (5,)
print(v.T.shape)   # prints (5,)

[[1 2 3]
 [4 5 6]]
[[1 4]
 [2 5]
 [3 6]]
(2, 3)
(3, 2)
[1 2 3 4 5]
[1 2 3 4 5]
(5,)
(5,)


## Inverse of a matrix

The linalg submodule of Numpy provides functions for computing the determinant and inverse of a matrix.

In [67]:
A = np.array([[2,1],[3,2]])
print(A)

det_A = np.linalg.det(A)
print(det_A)                 # the exact value is 1; the tiny difference is floating-point round-off

inv_A = np.linalg.inv(A)
print(inv_A)

[[2 1]
 [3 2]]
0.9999999999999998
[[ 2. -1.]
 [-3.  2.]]
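
Because these results carry floating-point round-off, it is safer to check them with np.allclose than with exact equality. The sketch below verifies the inverse and, as an aside, solves a linear system with np.linalg.solve instead of forming the inverse (the right-hand side b is made up for illustration):

```python
import numpy as np

A = np.array([[2, 1], [3, 2]])
inv_A = np.linalg.inv(A)

# A times its inverse should be the identity, up to round-off
is_identity = np.allclose(A @ inv_A, np.eye(2))
print(is_identity)              # True

# Solve A x = b directly
b = np.array([1, 2])            # illustrative right-hand side
x = np.linalg.solve(A, b)
print(np.allclose(A @ x, b))    # True
```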


## Summary

 * Numpy provides tools for numeric computing
 * With Numpy, Python becomes a usable alternative to MATLAB
 * Basic type is the ndarray - can represent vectors, matrices etc 
 * Lots of tools for vector and matrix manipulation
   * This lecture has only reviewed the most commonly used.
 * For full documentation see http://docs.scipy.org/doc/numpy/