## Numpy

In [1]:
import numpy as np

### Arrays
A numpy array is analogous to a Python list, but all elements of the array must be of the same type.

In [4]:
a = np.array([1, 2, 3])
a

array([1, 2, 3])
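
As a small sketch of the same-type rule (the variable names `mixed` and `as_float` are illustrative and not part of the original notebook): when a list mixes integers and floats, numpy converts every element to one common type.

```python
import numpy as np

# Mixed ints and floats are upcast to a single common dtype
mixed = np.array([1, 2.5, 3])
print(mixed)
print(mixed.dtype)  # float64

# The dtype can also be requested explicitly at creation time
as_float = np.array([1, 2, 3], dtype=np.float64)
print(as_float.dtype)  # float64
```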

The type of the array is `numpy.ndarray` (numpy n-dimensional array).

In [5]:
type(a)

numpy.ndarray

We associate numpy arrays with two properties: 1) shape, the size of the array along each dimension, and 2) rank, the number of dimensions (`ndim`).

`a` is a one-dimensional array with 3 elements.

In [7]:
print("Dimension of a: ", a.ndim)
print("Shape of a: ", a.shape)
print("Total number of elements in the array: ", a.size)
print("Data type of the elements of a:", a.dtype)

Dimension of a:  1
Shape of a:  (3,)
Total number of elements in the array:  3
Data type of the elements of a: int32


In [8]:
print(a[0], a[1], a[2]) # Prints "1 2 3"
a[0] = 5 # Change an element of the array
print(a)

1 2 3
[5 2 3]


In [9]:
b=np.array([[1., 2., 3.], [ 4., 5., 6.]])

In [11]:
print("Dimension of b: ", b.ndim)
print("Shape of b: ", b.shape)
print("Total number of elements in b: ", b.size)
print("Data type of the elements of b:", b.dtype)

Dimension of b:  2
Shape of b:  (2, 3)
Total number of elements in b:  6
Data type of the elements of b: float64


In [12]:
b

array([[1., 2., 3.],
       [4., 5., 6.]])

In [13]:
# accessing elements
print(b[0, 0], b[0, 1], b[1, 0]) # Prints "1.0 2.0 4.0" because b holds floats

1.0 2.0 4.0


In [14]:
# The same attributes work for arrays of any number of dimensions
x1 = np.random.randn(3)  # One-dimensional array
x2 = np.random.randn(2,3)  # Two-dimensional array
x3 = np.random.randn(2,3,4)  # Three-dimensional array
x4 = np.random.randn(2,3,4,5)  # Four-dimensional array

In [34]:
print("x4 ndim: ", x4.ndim)
print("x4 shape:", x4.shape)
print("x4 size: ", x4.size)
print("dtype:", x4.dtype)

x4 ndim:  4
x4 shape: (2, 3, 4, 5)
x4 size:  120
dtype: float64


In [40]:
# Data sample example
name = ['Alice', 'Bob', 'Cathy', 'Doug']
age = [25, 45, 37, 19]
weight = [55.0, 85.5, 68.0, 61.5]

In [43]:
# Use a compound data type to create an empty structured array as a container
data = np.zeros(4, dtype={'names':('name', 'age', 'weight'),
                          'formats':('U10', 'i4', 'f8')})
print(data.dtype)

[('name', '<U10'), ('age', '<i4'), ('weight', '<f8')]


In [44]:
# filling the array with our lists of values
data['name'] = name
data['age'] = age
data['weight'] = weight
print(data)

[('Alice', 25, 55. ) ('Bob', 45, 85.5) ('Cathy', 37, 68. )
 ('Doug', 19, 61.5)]


In [45]:
print(data[0])

('Alice', 25, 55.)


In [46]:
# Get names where age is under 30 - Operations similar to pandas df
data[data['age'] < 30]['name']

array(['Alice', 'Doug'], dtype='<U10')

### Array Creation
Numpy provides lots of ways to create a numpy array.

In [47]:
a = np.zeros((3,3)) # Create an array of all zeros
print(a) 

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


In [48]:
a = np.ones((2,5)) # Create an array of all ones
print(a) 

[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]


In [49]:
c = np.zeros_like(a) # Create an array of all zeros like a's shape
print(c)

[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]


In [50]:
c = np.ones_like(b) # Create an array of all ones like b's shape
print(c)

[[1. 1. 1.]
 [1. 1. 1.]]


In [53]:
#numpy full
d = np.full((2,5), 10) # Create a constant array
print(d)

[[10 10 10 10 10]
 [10 10 10 10 10]]


In [56]:
#numpy eye (Identity)
e = np.eye(3) # Create a 3x3 identity matrix
print(e)

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


In [70]:
#numpy randoms
np.random.seed(12)  # seed for reproducibility

f = np.random.random((3,3)) # Create an array filled with random values
print(f)

[[0.15416284 0.7400497  0.26331502]
 [0.53373939 0.01457496 0.91874701]
 [0.90071485 0.03342143 0.95694934]]


In [72]:
#numpy randoms
np.random.seed(12)  # seed for reproducibility

f = np.random.random_sample((3,3)) # Create an array filled with random values
print(f)

[[0.15416284 0.7400497  0.26331502]
 [0.53373939 0.01457496 0.91874701]
 [0.90071485 0.03342143 0.95694934]]


In [73]:
#numpy randoms
f = np.random.randint(10,20,(3,6)) # Create an array filled with random values
print(f)

[[16 10 15 18 12 19]
 [13 14 13 11 17 10]
 [12 16 12 10 14 16]]


In [79]:
#numpy arange
np.arange(0, 10, 10) # arguments: start, stop, step (stop is excluded, so a step of 10 yields only 0)

array([0])

In [78]:
#numpy linspace
np.linspace(0, 10, 10)

array([ 0.        ,  1.11111111,  2.22222222,  3.33333333,  4.44444444,
        5.55555556,  6.66666667,  7.77777778,  8.88888889, 10.        ])
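
To make the difference between the two functions concrete, here is a minimal sketch (the names `r` and `s` are illustrative): `np.arange` takes a step size and excludes the stop value, while `np.linspace` takes the number of points and includes the stop value by default.

```python
import numpy as np

# arange: the step size is given, the stop value is excluded
r = np.arange(0, 10, 2)
print(r)  # values 0, 2, 4, 6, 8

# linspace: the number of points is given, the stop value is included
s = np.linspace(0, 10, 5)
print(s)  # values 0, 2.5, 5, 7.5, 10
```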

## Array Indexing
Numpy offers several ways to index into arrays.
### Slicing
One-dimensional arrays can be indexed, sliced and iterated over, much like lists and other Python sequences.

In [80]:
a = np.linspace(0, 500, 6)
print(a)

[  0. 100. 200. 300. 400. 500.]


In [81]:
# Elements at the middle of the array
a[2:4]

array([200., 300.])

In [82]:
# Last element
a[-1]

500.0

In [83]:
# Last two elements
a[-2:]

array([400., 500.])

Multidimensional arrays can have one index per axis. These indices are given as a tuple, separated by
commas. To select a sub-block of a multidimensional array, we specify an index or a slice for each dimension of the array.

In [84]:
a =np.array([
 np.linspace(1, 3, 3),
 np.linspace(4, 6, 3),
 np.linspace(7, 9, 3)
])
print(a)

[[1. 2. 3.]
 [4. 5. 6.]
 [7. 8. 9.]]


In [85]:
print("First element: ", a[0,0])

First element:  1.0


When fewer indices are provided than the number of axes, the missing indices are treated as complete
slices.

In [86]:
print("First row:\n", a[0])

First row:
 [1. 2. 3.]
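
The rule about missing indices can be sketched as follows. The array is rebuilt here with `np.arange` and `reshape` only to keep the snippet self-contained; the variable names are illustrative.

```python
import numpy as np

# Same 3x3 values as above: 1.0 ... 9.0
a = np.arange(1, 10, dtype=np.float64).reshape(3, 3)

first_row = a[0]      # the missing column index is treated as ":"
same_row = a[0, :]    # explicit full slice over the columns
first_col = a[:, 0]   # full slice over the rows, column 0

print(first_row)
print(same_row)
print(first_col)
```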


In [87]:
print("Upper Right 2x2 matrix:\n", a[0:2, 1:])
print("Lower Right 2x2 matrix:\n", a[1:, 1:])

Upper Right 2x2 matrix:
 [[2. 3.]
 [5. 6.]]
Lower Right 2x2 matrix:
 [[5. 6.]
 [8. 9.]]


In [88]:
print("Upper Left 2x2 matrix:\n", a[:2, :2])
print("Lower Left 2x2 matrix:\n", a[1:, :2])

Upper Left 2x2 matrix:
 [[1. 2.]
 [4. 5.]]
Lower Left 2x2 matrix:
 [[4. 5.]
 [7. 8.]]


In [92]:
## Boolean array indexing
a = np.array([[1,2], [3, 4], [5, 1]])
bool_idx = (a > 2)
print(bool_idx)

[[False False]
 [ True  True]
 [ True False]]


In [93]:
print(a[bool_idx])

[3 4 5]


In [94]:
# We can do the above operation in single line:
print(a[a > 2])

[3 4 5]
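
A boolean mask can also appear on the left-hand side of an assignment. The following is a minimal sketch; the name `clipped` and the replacement value 0 are arbitrary choices for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 1]])

clipped = a.copy()          # work on a copy so that a stays unchanged
clipped[clipped > 2] = 0    # set every element greater than 2 to zero
print(clipped)
```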


## Other functions to subset an array
### where
Convert a boolean condition into the position indices (row and column) of the elements that satisfy it.

In [95]:
print(a, "\n")
m, n = np.where(a > 2)

[[1 2]
 [3 4]
 [5 1]] 



In [96]:
a[m,n]

array([3, 4, 5])
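
`np.where` also has a three-argument form that picks between two values elementwise. A short sketch, assuming -1 as an arbitrary placeholder for elements that fail the condition:

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 1]])

# One-argument form: index arrays of the elements where the condition holds
rows, cols = np.where(a > 2)
print(rows, cols)

# Three-argument form: keep a where the condition holds, otherwise use -1
replaced = np.where(a > 2, a, -1)
print(replaced)
```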

In [98]:
# diagonal
A = np.array([[n+m*10 for n in range(5)] for m in range(5)])
print(A)
np.diag(A)

[[ 0  1  2  3  4]
 [10 11 12 13 14]
 [20 21 22 23 24]
 [30 31 32 33 34]
 [40 41 42 43 44]]


array([ 0, 11, 22, 33, 44])

In [100]:
# reverse the column order (its diagonal is the anti-diagonal of A)
A[:, ::-1]

array([[ 4,  3,  2,  1,  0],
       [14, 13, 12, 11, 10],
       [24, 23, 22, 21, 20],
       [34, 33, 32, 31, 30],
       [44, 43, 42, 41, 40]])

In [104]:
np.diag(A[:, ::-1])

array([ 4, 13, 22, 31, 40])
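
An equivalent way to get the anti-diagonal, shown here as an alternative sketch rather than the notebook's own method, is `np.fliplr`, which reverses the column order. `np.diag` also accepts an offset `k` to read diagonals above or below the main one.

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])

# fliplr reverses the columns, so its main diagonal is A's anti-diagonal
anti_diag = np.diag(np.fliplr(A))
print(anti_diag)

# k=1 selects the first diagonal above the main diagonal
upper_diag = np.diag(A, k=1)
print(upper_diag)
```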

### Take

In [105]:
v = np.arange(-5,5)
v

array([-5, -4, -3, -2, -1,  0,  1,  2,  3,  4])

### indexing via a list

A Python list of positions can be used to index a numpy array.

In [106]:
row_indices = [1, 3, 5]
v[row_indices]

array([-4, -2,  0])

In [107]:
# The same indexing does not work on a plain Python list
[-5, -4, -3, -2, -1,  0,  1,  2,  3,  4][row_indices]

TypeError: list indices must be integers or slices, not list

`np.take` accepts a plain list and works like a charm!

In [108]:
np.take([-5, -4, -3, -2, -1,  0,  1,  2,  3,  4], row_indices)

array([-4, -2,  0])

## Linear Algebra
### Elementwise-array operations
Arithmetic operators on arrays apply elementwise. A new array is created and filled with the result.
Elementwise addition: the operator and the function form both produce the same array.

In [109]:
x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)
print(x + y, "\n")
print(np.add(x, y))

[[ 6.  8.]
 [10. 12.]] 

[[ 6.  8.]
 [10. 12.]]


In [110]:
#Elementwise difference; both produce the array
print(x - y, "\n")
print(np.subtract(x, y))

[[-4. -4.]
 [-4. -4.]] 

[[-4. -4.]
 [-4. -4.]]


In [111]:
# Elementwise product; both produce the array
print(x * y, "\n")
print(np.multiply(x, y))

[[ 5. 12.]
 [21. 32.]] 

[[ 5. 12.]
 [21. 32.]]


In [112]:
#Elementwise division; both produce the array
print(x / y, "\n")
print(np.divide(x, y))

[[0.2        0.33333333]
 [0.42857143 0.5       ]] 

[[0.2        0.33333333]
 [0.42857143 0.5       ]]


In [113]:
#Other Useful Elementwise Operations
print("a: \n", a)
print("\na**2: \n", a**2)
print("\nnp.square(a): \n", np.square(a))

a: 
 [[1 2]
 [3 4]
 [5 1]]

a**2: 
 [[ 1  4]
 [ 9 16]
 [25  1]]

np.square(a): 
 [[ 1  4]
 [ 9 16]
 [25  1]]


In [117]:
# Same operation on python list raises an error
list_a = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
list_a**2

TypeError: unsupported operand type(s) for ** or pow(): 'list' and 'int'

In [115]:
print("exp(a):\n", np.exp(a))

exp(a):
 [[  2.71828183   7.3890561 ]
 [ 20.08553692  54.59815003]
 [148.4131591    2.71828183]]


In [119]:
print(np.power(a, 2))

[[ 1  4]
 [ 9 16]
 [25  1]]


In [None]:
print("Natural logarithm: \n", np.log(a))
print("\nBase10 logarithm: \n", np.log10(a))
print("\nBase2 logarithm: \n", np.log2(a))

## Vector Operations
We can use the usual arithmetic operators to multiply, add, subtract, and divide vectors with scalar
numbers.

In [120]:
v1 = np.arange(0, 5)
v1

array([0, 1, 2, 3, 4])

In [122]:
print(v1 * 2)
print(v1 / 2)
print(v1 ** 2)
print(v1 ** v1)  # elementwise power (0**0 evaluates to 1)

[0 2 4 6 8]
[0.  0.5 1.  1.5 2. ]
[ 0  1  4  9 16]
[  1   1   4  27 256]


Inner Product

In [128]:
v2 = np.arange(5, 10)
v2

array([5, 6, 7, 8, 9])

In [124]:

v1 = [0, 1, 2, 3, 4]
v2 = [5, 6, 7, 8, 9]
np.dot(v1, v2)  #0 * 5 + 1 * 6 + 2 * 7 + 3 * 8 + 4 * 9

80

Squared Vector Magnitude (self inner product)

In [125]:
total = 0  # avoid the name `sum`, which would shadow the built-in function
for element in v1:
    total += element * element
print(total)

30


In [126]:
np.sum([element*element for element in v1])

30

In [130]:
v2

array([5, 6, 7, 8, 9])

In [131]:
# The @ operator works for numpy arrays, not for lists (v1 is a list here)
print(v1 @ v1)

TypeError: unsupported operand type(s) for @: 'list' and 'list'
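
With numpy arrays instead of lists the `@` operator works. A minimal sketch, which also uses `np.linalg.norm` to get the magnitude itself (the square root of the self inner product):

```python
import numpy as np

v1 = np.arange(0, 5)    # array([0, 1, 2, 3, 4])
v2 = np.arange(5, 10)   # array([5, 6, 7, 8, 9])

inner = v1 @ v2         # same value as np.dot(v1, v2)
self_inner = v1 @ v1    # squared magnitude of v1
norm = np.linalg.norm(v1)

print(inner)       # 80
print(self_inner)  # 30
print(norm)        # square root of 30
```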

## Matrix Algebra

In [132]:
A = np.array([[n+m*10 for n in range(5)] for m in range(5)])
A

array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])

Transpose

In [133]:
A.T

array([[ 0, 10, 20, 30, 40],
       [ 1, 11, 21, 31, 41],
       [ 2, 12, 22, 32, 42],
       [ 3, 13, 23, 33, 43],
       [ 4, 14, 24, 34, 44]])

## Matrix-Vector Multiplication

In [134]:
v1

[0, 1, 2, 3, 4]

In [135]:
A

array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])

With the `*` operator, each row of A is multiplied elementwise by v1. Note that this is not the matrix-vector product.

In [136]:
A * v1

array([[  0,   1,   4,   9,  16],
       [  0,  11,  24,  39,  56],
       [  0,  21,  44,  69,  96],
       [  0,  31,  64,  99, 136],
       [  0,  41,  84, 129, 176]])
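
For comparison, the true matrix-vector product on plain arrays can be written with the `@` operator. The sketch below rebuilds `A` and `v1` so that it is self-contained; the names `elementwise` and `product` are illustrative.

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])
v1 = np.arange(0, 5)

elementwise = A * v1   # scales each column of A; the result is still 5x5
product = A @ v1       # matrix-vector product; the result has shape (5,)

print(product)
# Summing each row of the elementwise result gives the same values
print(elementwise.sum(axis=1))
```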

In [137]:
# Elementwise Matrix Multiplication
A * A

array([[   0,    1,    4,    9,   16],
       [ 100,  121,  144,  169,  196],
       [ 400,  441,  484,  529,  576],
       [ 900,  961, 1024, 1089, 1156],
       [1600, 1681, 1764, 1849, 1936]])

Alternatively, we can cast the array to the `matrix` type, which lets the normal arithmetic operators perform
matrix algebra.

In [138]:
A_mat = np.matrix(A)
v = np.matrix(A).T # transpose of A; note that this is a 5x5 matrix, not a column vector
print("Matrix A:\n", A_mat)
print("\nVector v:\n",v)

Matrix A:
 [[ 0  1  2  3  4]
 [10 11 12 13 14]
 [20 21 22 23 24]
 [30 31 32 33 34]
 [40 41 42 43 44]]

Vector v:
 [[ 0 10 20 30 40]
 [ 1 11 21 31 41]
 [ 2 12 22 32 42]
 [ 3 13 23 33 43]
 [ 4 14 24 34 44]]


In [141]:
type(A_mat)

numpy.matrix

In [142]:
A_mat * A_mat

matrix([[ 300,  310,  320,  330,  340],
        [1300, 1360, 1420, 1480, 1540],
        [2300, 2410, 2520, 2630, 2740],
        [3300, 3460, 3620, 3780, 3940],
        [4300, 4510, 4720, 4930, 5140]])

In [144]:
v.T * A_mat

matrix([[ 300,  310,  320,  330,  340],
        [1300, 1360, 1420, 1480, 1540],
        [2300, 2410, 2520, 2630, 2740],
        [3300, 3460, 3620, 3780, 3940],
        [4300, 4510, 4720, 4930, 5140]])

In [145]:
A_mat * v.T

matrix([[ 300,  310,  320,  330,  340],
        [1300, 1360, 1420, 1480, 1540],
        [2300, 2410, 2520, 2630, 2740],
        [3300, 3460, 3620, 3780, 3940],
        [4300, 4510, 4720, 4930, 5140]])

In [146]:
# If we try to add, subtract or multiply objects with incompatible shapes we get an error:
v = np.matrix([1,2,3,4]).T
A_mat.shape, v.shape

((5, 5), (4, 1))

In [147]:
# Raises an error: the inner dimensions (5 and 4) do not match
A_mat * v

ValueError: shapes (5,5) and (4,1) not aligned: 5 (dim 1) != 4 (dim 0)

## Other Useful Functions
Sum

In [148]:
A

array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])

In [149]:
A.sum()

550

In [150]:
# Column-wise sum or Reduce by row
A.sum(axis=0)

array([100, 105, 110, 115, 120])

In [151]:
# Row-wise sum or Reduce by column
A.sum(axis=1)

array([ 10,  60, 110, 160, 210])
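
A related option is `keepdims=True`, which keeps the reduced axis with length 1. The sketch below uses it to divide every row by its own sum; the names `row_sums` and `fractions` are illustrative.

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])

# Shape (5, 1) instead of (5,), so it lines up with the rows of A
row_sums = A.sum(axis=1, keepdims=True)
print(row_sums.shape)

# Each row of A divided by its own sum; every row now sums to 1
fractions = A / row_sums
print(fractions.sum(axis=1))
```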

### Statistics

Mean

In [153]:
A

array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])

In [152]:
print("Mean of A:", A.mean())
print("\nColumn-wise mean of A:", A.mean(axis=0))
print("\nRow-wise mean of A:", A.mean(axis=1))

Mean of A: 22.0

Column-wise mean of A: [20. 21. 22. 23. 24.]

Row-wise mean of A: [ 2. 12. 22. 32. 42.]


In [154]:
# Variance
print("Variance of A:", A.var())
print("\nColumn-wise variance of A:", A.var(axis=0))
print("\nRow-wise variance of A:", A.var(axis=1))

Variance of A: 202.0

Column-wise variance of A: [200. 200. 200. 200. 200.]

Row-wise variance of A: [2. 2. 2. 2. 2.]


In [155]:
# Standard deviation
print("Standard Deviation of A:", A.std())
print("\nColumn-wise Standard Deviation of A:", A.std(axis=0))
print("\nRow-wise Standard Deviation of A:", A.std(axis=1))

Standard Deviation of A: 14.212670403551895

Column-wise Standard Deviation of A: [14.14213562 14.14213562 14.14213562 14.14213562 14.14213562]

Row-wise Standard Deviation of A: [1.41421356 1.41421356 1.41421356 1.41421356 1.41421356]


In [156]:
#Min and Max
print("Minimum of A:", A.min())
print("\nColumn-wise Minimum of A:", A.min(axis=0))
print("\nRow-wise Minimum of A:", A.min(axis=1))

Minimum of A: 0

Column-wise Minimum of A: [0 1 2 3 4]

Row-wise Minimum of A: [ 0 10 20 30 40]


In [157]:
print("Maximum of A:", A.max())
print("\nColumn-wise Maximum of A:", A.max(axis=0))
print("\nRow-wise Maximum of A:", A.max(axis=1))

Maximum of A: 44

Column-wise Maximum of A: [40 41 42 43 44]

Row-wise Maximum of A: [ 4 14 24 34 44]


## Broadcasting

Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when
performing arithmetic operations.

For example, suppose that we want to add a constant vector to each row of a matrix. We could do it like this:

In [158]:
X = np.array([[1,2],[3, 4] ,[5,6], [7,8]])
v = np.array([10, 20])
print("X: \n", X)
print("\nv:\n", v)

X: 
 [[1 2]
 [3 4]
 [5 6]
 [7 8]]

v:
 [10 20]


We can add the vector v to each row of the matrix X, storing the result in the matrix Y.

In [159]:
Y = np.zeros_like(X)
# Add the vector v to each row of the matrix X with an explicit loop
for i in range(X.shape[0]):
    Y[i, :] = X[i, :] + v
# Numpy broadcasting performs the same computation in one expression,
# without actually creating multiple copies of v.
Y = X + v
print(Y)

[[11 22]
 [13 24]
 [15 26]
 [17 28]]
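
Broadcasting also works along the other axis. As a sketch, assume a hypothetical column vector `w` of shape (4, 1) that holds one value per row of X. Shapes that cannot be matched raise a `ValueError`.

```python
import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])

# Shape (4, 1): each value is stretched across the columns of its row
w = np.array([[100], [200], [300], [400]])
Z = X + w
print(Z)

# Shapes (4, 2) and (3,) cannot be broadcast together
try:
    X + np.array([1, 2, 3])
    failed = False
except ValueError as err:
    failed = True
    print("Broadcast error:", err)
```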


## Array Reshape, Concatenation, Stacking and Copy
### Reshape

In [160]:
A = np.array([[n+m*10 for n in range(5)] for m in range(5)])
A

array([[ 0,  1,  2,  3,  4],
       [10, 11, 12, 13, 14],
       [20, 21, 22, 23, 24],
       [30, 31, 32, 33, 34],
       [40, 41, 42, 43, 44]])

In [166]:
m, n = A.shape
B = A.reshape((1, m*n))
print(B.shape)
print(B)

(1, 25)
[[ 0  1  2  3  4 10 11 12 13 14 20 21 22 23 24 30 31 32 33 34 40 41 42 43
  44]]


In [167]:
B = A.flatten()
print(B.shape)
print(B)

(25,)
[ 0  1  2  3  4 10 11 12 13 14 20 21 22 23 24 30 31 32 33 34 40 41 42 43
 44]
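
Two details are worth a short sketch. A dimension given as -1 is inferred by `reshape`. And `flatten` always returns a copy, whereas `ravel` returns a view when it can, which is the case for a contiguous array like the one built here. The variable names are illustrative.

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])

# -1 lets numpy infer the size of that dimension (25 here)
column = A.reshape(-1, 1)
print(column.shape)

flat_view = A.ravel()     # view on A's data for this contiguous array
flat_copy = A.flatten()   # always an independent copy

print(np.shares_memory(A, flat_view))
print(np.shares_memory(A, flat_copy))
```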


### Concatenation and Stacking
Join a sequence of arrays along an existing axis.

In [170]:
# Note: + concatenates Python strings and lists, but on numpy arrays it is elementwise addition
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
np.concatenate((a, b), axis=0)

array([[1, 2],
       [3, 4],
       [5, 6]])

In [172]:
np.concatenate((a, b.T), axis=1)

array([[1, 2, 5],
       [3, 4, 6]])

In [171]:
np.vstack((a,b))

array([[1, 2],
       [3, 4],
       [5, 6]])

In [173]:
np.hstack((a,b.T))

array([[1, 2, 5],
       [3, 4, 6]])

### Copy
To achieve high performance, assignments in Python usually do not copy the underlying objects.

Technically, an assignment only creates a new reference to the same object.

In [174]:
A = np.array([[1, 2], [3, 4]])
A

array([[1, 2],
       [3, 4]])

In [175]:
# now B is referring to the same array data as A
B = A
# changing B affects A
B[0,0] = 10
B

array([[10,  2],
       [ 3,  4]])

In [176]:
A

array([[10,  2],
       [ 3,  4]])

To avoid this behavior and get a new, completely independent object B copied from A, we make what is
technically called a `deep copy`, using the function copy.

In [177]:
B = A.copy()
# now, if we modify B, A is not affected
B[0,0] = -5
B

array([[-5,  2],
       [ 3,  4]])

In [178]:
A

array([[10,  2],
       [ 3,  4]])
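
The same distinction applies to slices: basic slicing returns a view on the original data, so `copy` is needed there as well when an independent array is wanted. A minimal sketch with illustrative names:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

row_view = A[0]          # basic indexing returns a view on A's data
row_view[0] = 99         # this also changes A[0, 0]

row_copy = A[1].copy()   # independent copy of the second row
row_copy[0] = -1         # A[1, 0] is not affected

print(A)
print(np.shares_memory(A, row_view), np.shares_memory(A, row_copy))
```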

References:
- http://numpy.scipy.org
- http://scipy.org/Tentative_NumPy_Tutorial
- https://www.datalorelabs.com