## **Numpy Basics**
1. NumPy is the core library for scientific computing in Python. It makes computations on multi-dimensional data simple and efficient.
1. NumPy provides a data structure called the array (`ndarray`), which supports efficient vector and matrix operations as well as many linear algebra routines.

**Convert a list into an array using numpy**

In [1]:
import numpy as np #Import numpy package

a_list = [1,2,3,4]
a = np.array(a_list) #Convert list to numpy array
a

array([1, 2, 3, 4])

**Scalar, Vector, Matrix (Tensor)**

In [2]:
a = np.array(1)  # define scalar
print('a\n', a.__repr__())
print('shape of a: ', a.shape)
print('dimension of a: ', a.ndim)

b = np.array([1, 2, 3, 4, 5]) # define vector
print('b\n', b.__repr__())
print('shape of b: ', b.shape)
print('dimension of b: ', b.ndim)

c = np.array([[1, 2, 3], [4, 5, 6]]) # define matrix
print('c\n',c.__repr__())
print('shape of c: ', c.shape)
print('dimension of c: ', c.ndim)

a
 array(1)
shape of a:  ()
dimension of a:  0
b
 array([1, 2, 3, 4, 5])
shape of b:  (5,)
dimension of b:  1
c
 array([[1, 2, 3],
       [4, 5, 6]])
shape of c:  (2, 3)
dimension of c:  2


**Create array objects using arange(), zeros(), ones(), linspace() functions**

In [3]:
#Numpy also provides many methods to create arrays.

a = np.arange(0,10,1)
print('created from .arange() method: ', a)

a = np.zeros(10)
print('created from .zeros() method: ', a)

a = np.ones(10)
print('created from .ones() method: ', a)

a = np.linspace(0,2,9)
print('created from .linspace() method: ', a)

created from .arange() method:  [0 1 2 3 4 5 6 7 8 9]
created from .zeros() method:  [0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
created from .ones() method:  [1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
created from .linspace() method:  [0.   0.25 0.5  0.75 1.   1.25 1.5  1.75 2.  ]
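
The `arange()` and `linspace()` calls above look similar but are parameterized differently: `arange()` takes a step and excludes the stop value, while `linspace()` takes a number of points and includes the stop value by default. A minimal sketch of the difference (the variable names are only illustrative):

```python
import numpy as np

# arange: half-open interval [start, stop) with a fixed step
by_step = np.arange(0, 2, 0.25)

# linspace: closed interval [start, stop] with a fixed number of points
by_count = np.linspace(0, 2, 9)

print(by_step.size, by_step[-1])    # 8 points, last value 1.75
print(by_count.size, by_count[-1])  # 9 points, last value 2.0

# endpoint=False makes linspace exclude the stop value, like arange
half_open = np.linspace(0, 2, 8, endpoint=False)
print(np.allclose(by_step, half_open))  # True
```

When the step is a non-integer, `linspace()` is usually the safer choice because the number of points is fixed up front.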


In [4]:
# Convert datatype of elements
a = np.arange(10).reshape(2, 5) # create (2, 5) shape of array
print(a.dtype)   # original datatype
print(a.astype(np.float64).dtype)  # convert to float (the np.float alias was removed in NumPy 1.24)

int64
float64
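
Two details of `astype()` are easy to miss: it returns a new array rather than modifying the original, and converting floats to integers drops the fractional part instead of rounding. A small sketch with made-up values:

```python
import numpy as np

x = np.array([1.7, -1.7, 2.0])

# float -> int conversion truncates toward zero; it does not round
as_int = x.astype(np.int64)
print(as_int)    # [ 1 -1  2]

# astype returned a new array, so x keeps its original dtype
print(x.dtype)   # float64
```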


In [5]:
# We can also make multidimensional arrays.

np.random.seed(0) # random seed for reproducibility
a = np.arange(16).reshape(2,8) #reshape function gives a new shape to an array without changing its data
b = np.ones((2,8))
c = np.random.random((2,8)) #Create an array of uniform random values in [0, 1)

print('created from .reshape() method: \n', a.__repr__())
print()
print('created from .ones() method: \n', b.__repr__())
print()
print('created from .random.random() method: \n', c.__repr__())

created from .reshape() method: 
 array([[ 0,  1,  2,  3,  4,  5,  6,  7],
       [ 8,  9, 10, 11, 12, 13, 14, 15]])

created from .ones() method: 
 array([[1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1.]])

created from .random.random() method: 
 array([[0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 ,
        0.64589411, 0.43758721, 0.891773  ],
       [0.96366276, 0.38344152, 0.79172504, 0.52889492, 0.56804456,
        0.92559664, 0.07103606, 0.0871293 ]])


In [6]:
#You can check the dimension or size of arrays
a = np.random.random((2, 4, 5))  # 3-dimensional array of shape (2, 4, 5)
a

array([[[0.0202184 , 0.83261985, 0.77815675, 0.87001215, 0.97861834],
        [0.79915856, 0.46147936, 0.78052918, 0.11827443, 0.63992102],
        [0.14335329, 0.94466892, 0.52184832, 0.41466194, 0.26455561],
        [0.77423369, 0.45615033, 0.56843395, 0.0187898 , 0.6176355 ]],

       [[0.61209572, 0.616934  , 0.94374808, 0.6818203 , 0.3595079 ],
        [0.43703195, 0.6976312 , 0.06022547, 0.66676672, 0.67063787],
        [0.21038256, 0.1289263 , 0.31542835, 0.36371077, 0.57019677],
        [0.43860151, 0.98837384, 0.10204481, 0.20887676, 0.16130952]]])

In [7]:
print('dimension of array: ', a.ndim)
print('shape of array', a.shape) 
print('size of axis 0: ', a.shape[0])
print('size of axis 1: ', a.shape[1])
print('total number of elements in array: ', a.size)
print('data type of elements: ', a.dtype)

dimension of array:  3
shape of array (2, 4, 5)
size of axis 0:  2
size of axis 1:  4
total number of elements in array:  40
data type of elements:  float64


In [8]:
# another 3-dimensional array, this time of shape (3, 4, 5)
c = np.random.random((3, 4, 5))
print('c:\n', c.__repr__())

c:
 array([[[0.65310833, 0.2532916 , 0.46631077, 0.24442559, 0.15896958],
        [0.11037514, 0.65632959, 0.13818295, 0.19658236, 0.36872517],
        [0.82099323, 0.09710128, 0.83794491, 0.09609841, 0.97645947],
        [0.4686512 , 0.97676109, 0.60484552, 0.73926358, 0.03918779]],

       [[0.28280696, 0.12019656, 0.2961402 , 0.11872772, 0.31798318],
        [0.41426299, 0.0641475 , 0.69247212, 0.56660145, 0.26538949],
        [0.52324805, 0.09394051, 0.5759465 , 0.9292962 , 0.31856895],
        [0.66741038, 0.13179786, 0.7163272 , 0.28940609, 0.18319136]],

       [[0.58651293, 0.02010755, 0.82894003, 0.00469548, 0.67781654],
        [0.27000797, 0.73519402, 0.96218855, 0.24875314, 0.57615733],
        [0.59204193, 0.57225191, 0.22308163, 0.95274901, 0.44712538],
        [0.84640867, 0.69947928, 0.29743695, 0.81379782, 0.39650574]]])


**Indexing & Slicing**

In [9]:
#In a similar way to Python lists, numpy arrays can be sliced.
a = np.ones(5)
a[0] = 6
a[4] = 2
a

array([6., 1., 1., 1., 2.])

In [10]:
a[-1]  # -1 indicates last element

2.0

In [11]:
a[0:-1] # from index 0 up to, but not including, the last element

array([6., 1., 1., 1.])

In [12]:
# Create an array of shape (3, 4)
a = np.arange(1, 13).reshape(3, 4)
a

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [13]:
a[0] # indexing the first row 

array([1, 2, 3, 4])

In [14]:
a[0, 1] # indexing second element of the first row

2

In [15]:
a[0, 1:3] # slicing the first row from 1 to 3 (exclusive)

array([2, 3])

In [16]:
a[0:3:2, 1:4:2] # slicing along both axes: rows 0 and 2, columns 1 and 3 (step 2)

array([[ 2,  4],
       [10, 12]])

In [17]:
# slicing with multi-dimensional array
a = np.arange(30).reshape(2, 3, 5)
a

array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]],

       [[15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29]]])

In [18]:
a[0] # first matrix (index 0 along the first axis)

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [19]:
a[0, 1, :] # second row of first matrix element

array([5, 6, 7, 8, 9])

In [20]:
a[0, :, 2] # third column of first matrix element

array([ 2,  7, 12])

In [21]:
a[1, 0:3:2, 1:4:2] # in the second matrix: rows 0 and 2, columns 1 and 3 (step 2)

array([[16, 18],
       [26, 28]])

In [22]:
# add an additional dimension
a[None].shape  # add extra dimension in first dim

(1, 2, 3, 5)

In [23]:
a[:, None].shape # add in second dim

(2, 1, 3, 5)

In [None]:
a[..., None, :].shape  # add a new axis just before the last dim -> (2, 3, 1, 5)
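
Indexing with `None` inserts an axis of length 1 at that position. As a brief sketch, the same thing can be written with `np.newaxis` (an alias for `None`) or with `np.expand_dims()`; the array below repeats the `(2, 3, 5)` example from the cells above.

```python
import numpy as np

a = np.arange(30).reshape(2, 3, 5)

# np.newaxis is simply another name for None
print(np.newaxis is None)             # True

print(a[np.newaxis].shape)            # (1, 2, 3, 5): new first axis
print(a[:, np.newaxis].shape)         # (2, 1, 3, 5): new second axis
print(a[..., np.newaxis, :].shape)    # (2, 3, 1, 5): new axis before the last one
print(a[..., np.newaxis].shape)       # (2, 3, 5, 1): new last axis

# expand_dims names the position of the new axis explicitly
print(np.expand_dims(a, axis=-1).shape)  # (2, 3, 5, 1)
```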

**Numpy Operations**

In [25]:
# Basic mathematical functions in the numpy module operate elementwise on arrays.
# Arrays also support all basic arithmetic operators, such as +, -, *, /, **.
a = np.arange(0,3,0.5)
b = np.arange(1,4,0.5)
print('a:\n', a.__repr__())
print('b:\n', b.__repr__(), end='\n\n')

print('array a:\n ', a)
print('a + 5 =\n', a + 5)
print('a^2 =\n', a ** 2)
print('cos(a) = \n', np.cos(a))
print('logical operation of a < 1: \n', a<1)

a:
 array([0. , 0.5, 1. , 1.5, 2. , 2.5])
b:
 array([1. , 1.5, 2. , 2.5, 3. , 3.5])

array a:
  [0.  0.5 1.  1.5 2.  2.5]
a + 5 =
 [5.  5.5 6.  6.5 7.  7.5]
a^2 =
 [0.   0.25 1.   2.25 4.   6.25]
cos(a) = 
 [ 1.          0.87758256  0.54030231  0.0707372  -0.41614684 -0.80114362]
logical operation of a < 1: 
 [ True  True False False False False]


In [26]:
#Unlike MATLAB, operator * is not matrix multiplication but elementwise multiplication.
print('array a: ', a)
print('array b: ', b)
a * b

array a:  [0.  0.5 1.  1.5 2.  2.5]
array b:  [1.  1.5 2.  2.5 3.  3.5]


array([0.  , 0.75, 2.  , 3.75, 6.  , 8.75])

In [27]:
# Instead, we use the dot function to compute inner products of vectors, 
# to multiply a vector by a matrix and to multiply matrices.
np.dot(a, b)

21.25
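
The comment above mentions multiplying a vector by a matrix. A small sketch of how that behaves with a 1-D vector, using illustrative values `M` and `v` that are not part of the original notebook: the vector is treated as a column when it is on the right and as a row when it is on the left.

```python
import numpy as np

M = np.array([[2, 5],
              [1, 2]])
v = np.array([1, 2])

# matrix times vector: each row of M dotted with v
print(np.dot(M, v))   # [12  5]

# vector times matrix: v dotted with each column of M
print(np.dot(v, M))   # [4 9]

# vector times vector: the inner product, a scalar
print(np.dot(v, v))   # 5
```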

In [28]:
#You can perform matrix operations

a = np.array([[2,5],[1,2]])
b = np.array([[2,1],[5,7]])

np.matmul(a,b) #Matrix Product

array([[29, 37],
       [12, 15]])

In [29]:
# you can also use the np.dot() function or the @ operator (@ requires Python 3.5 or later)
np.dot(a, b)

array([[29, 37],
       [12, 15]])

In [30]:
a@b

array([[29, 37],
       [12, 15]])

In [31]:
a * b #Elementwise Product

array([[ 4,  5],
       [ 5, 14]])

In [32]:
a.transpose() #transpose the array

array([[2, 1],
       [5, 2]])

**Numpy methods for various operations**

Most of these operations are available both as a function, ```np.method(array)```, and as an array method, ```array.method()```.

In [33]:
c = np.arange(18).reshape(3,6)
c

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17]])

In [34]:
# using np.ndarray.method()
print('max value of each column: ', c.max(axis = 0)) # max of each column
print('min value of each row:', c.min(axis = 1)) # min of each row
print('sums of each row:', c.sum(axis = 1)) # sum of each row
print('sums of all elements:', c.sum()) # sum of all elements
print('max value of array (matrix) c:', c.max()) # max of c

max value of each column:  [12 13 14 15 16 17]
min value of each row: [ 0  6 12]
sums of each row: [15 51 87]
sums of all elements: 153
max value of array (matrix) c: 17


In [35]:
# you can also use np.method()
np.sum(c, axis=0)

array([18, 21, 24, 27, 30, 33])

In [36]:
np.max(c)

17

In [37]:
# reduce along an axis while keeping the number of dimensions (keepdims=True)
np.sum(c, axis=0, keepdims=True)

array([[18, 21, 24, 27, 30, 33]])

In [38]:
np.sum(c, axis=1, keepdims=True)

array([[15],
       [51],
       [87]])
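
One common reason to use `keepdims=True` is that the result still lines up with the original array in elementwise operations. The sketch below divides each row of `c` by its own sum; it relies on broadcasting, which is covered in the next section.

```python
import numpy as np

c = np.arange(18).reshape(3, 6)

# keepdims=True gives shape (3, 1), one sum per row, stored as a column
row_sums = c.sum(axis=1, keepdims=True)
print(row_sums.shape)                 # (3, 1)

# (3, 6) / (3, 1): every element is divided by the sum of its own row
row_fractions = c / row_sums
print(row_fractions.sum(axis=1))      # every row now sums to 1 (up to rounding)

# Without keepdims the sums have shape (3,), and c / c.sum(axis=1)
# would raise a ValueError because (3, 6) and (3,) do not line up.
```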

**Broadcasting**

In [39]:
#Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when computing mathematical operations.

a = np.arange(18).reshape((3,6))
a # (3, 6) shape

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17]])

In [40]:
a * 5  # multiply with a scalar -> the scalar is broadcast to the (3, 6) shape

array([[ 0,  5, 10, 15, 20, 25],
       [30, 35, 40, 45, 50, 55],
       [60, 65, 70, 75, 80, 85]])

In [41]:
a * np.arange(6) # multiply with a (6,) array -> it is also broadcast to the (3, 6) shape

array([[ 0,  1,  4,  9, 16, 25],
       [ 0,  7, 16, 27, 40, 55],
       [ 0, 13, 28, 45, 64, 85]])

In [42]:
b = np.arange(6)
print('b\n', b.__repr__())  # (6,) shape
c = np.arange(3).reshape(3, 1)
print('c\n', c.__repr__())  # (3, 1) shape

b
 array([0, 1, 2, 3, 4, 5])
c
 array([[0],
       [1],
       [2]])


In [43]:
b * c

array([[ 0,  0,  0,  0,  0,  0],
       [ 0,  1,  2,  3,  4,  5],
       [ 0,  2,  4,  6,  8, 10]])

In [44]:
# Broadcasting with multidimensional arrays
a = np.arange(15).reshape(5, 1, 3, 1)
b = np.arange(8).reshape(2, 1, 4)
print('shape of a: ', a.shape)
print('shape of b: ', b.shape)

shape of a:  (5, 1, 3, 1)
shape of b:  (2, 1, 4)


In [45]:
# check the shape of a * b
(a * b).shape

(5, 2, 3, 4)

In [46]:
# Question: how does the broadcasting rule work?
# Does it work for arrays of shape (5, 1, 3, 2) and (2, 1, 4)? -> No (why?)
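
One way to answer the question above is to apply the rule by hand: NumPy compares the two shapes from the last axis backwards, and each pair of sizes must either be equal or contain a 1; missing leading axes are treated as size 1. The helper `can_broadcast` below is a hypothetical function written for this illustration, not part of NumPy.

```python
import numpy as np

def can_broadcast(shape_a, shape_b):
    """Hypothetical helper: check the broadcasting rule by hand."""
    # Compare sizes from the trailing axis backwards. zip stops at the
    # shorter shape; the leftover leading axes are always compatible.
    for dim_a, dim_b in zip(reversed(shape_a), reversed(shape_b)):
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return False
    return True

print(can_broadcast((5, 1, 3, 1), (2, 1, 4)))  # True
print(can_broadcast((5, 1, 3, 2), (2, 1, 4)))  # False: trailing sizes 2 and 4 clash

# NumPy itself rejects the second pair with a ValueError
try:
    np.ones((5, 1, 3, 2)) * np.ones((2, 1, 4))
except ValueError as err:
    print('ValueError:', err)
```

In the first pair the sizes line up as (1, 4), (3, 1), (1, 2), (5, missing), so the result has shape (5, 2, 3, 4). In the second pair the last axes have sizes 2 and 4, which are neither equal nor 1.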

**Stacking Arrays**

In [47]:
#You can stack arrays horizontally or vertically

a = np.arange(12).reshape(3, 4)
print('a: \n', a.__repr__())
b = np.ones((2, 4))
print('b: \n', b.__repr__())
c = np.ones((3, 2))
print('c: \n', c.__repr__())

a: 
 array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
b: 
 array([[1., 1., 1., 1.],
       [1., 1., 1., 1.]])
c: 
 array([[1., 1.],
       [1., 1.],
       [1., 1.]])


In [48]:
np.vstack((a,b)) # stack array vertically

array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.],
       [ 1.,  1.,  1.,  1.],
       [ 1.,  1.,  1.,  1.]])

In [49]:
np.hstack((a,c)) # stack array horizontally

array([[ 0.,  1.,  2.,  3.,  1.,  1.],
       [ 4.,  5.,  6.,  7.,  1.,  1.],
       [ 8.,  9., 10., 11.,  1.,  1.]])
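
For 2-D arrays, `np.vstack()` and `np.hstack()` give the same result as `np.concatenate()` along axis 0 and axis 1 respectively. A brief sketch using the same `a`, `b`, and `c` as above; it also shows why the stacked results print as floats, since the integer array is promoted to match the float arrays.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)   # integer array
b = np.ones((2, 4))               # float array
c = np.ones((3, 2))               # float array

v = np.concatenate((a, b), axis=0)   # rows appended, like np.vstack((a, b))
h = np.concatenate((a, c), axis=1)   # columns appended, like np.hstack((a, c))

print(v.shape, h.shape)   # (5, 4) (3, 6)
print(a.dtype, v.dtype)   # integer input, float64 result
```

All shapes must match except along the axis being joined: `a` and `b` share 4 columns, while `a` and `c` share 3 rows.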

**Boolean Array Indexing (Masking)**

In [50]:
a = np.arange(1, 10).reshape(3, 3)
print('a: \n', a.__repr__())

a: 
 array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])


In [51]:
# Comparison operators work elementwise and return a boolean array (True or False)
even = a % 2 == 0
print(even.__repr__())

array([[False,  True, False],
       [ True, False,  True],
       [False,  True, False]])


In [52]:
# indexing with a boolean array selects the elements where the mask is True and returns them as a rank-1 array
a[even]

array([2, 4, 6, 8])
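
A boolean mask can also be used on the left-hand side of an assignment, and several conditions can be combined. The following is a minimal sketch on the same 3x3 array; the name `masked` is only illustrative.

```python
import numpy as np

a = np.arange(1, 10).reshape(3, 3)
even = a % 2 == 0

# Assign through the mask (on a copy, so that a itself stays intact)
masked = a.copy()
masked[even] = 0
print(masked)

# np.where chooses elementwise between two alternatives and returns a new array
print(np.where(even, -a, a))

# Combine conditions with & (and) or | (or); each comparison needs parentheses
print(a[(a > 2) & (a < 7)])   # [3 4 5 6]
```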

**Copy in numpy**

In [53]:
#There are 3 cases of copying ndarray in numpy

#Case 1

a = np.zeros((2,2))
b = a # No copy at all: a and b are the same object, sharing both the data and properties (e.g., shape)
print('b: \n', b.__repr__())

b[1,1] = 1
print('b: \n', b.__repr__())
print('a: \n', a.__repr__()) # a is also changed


b.shape = (1,4)
print('shape of a: ', a.shape) #The shape of a is also changed

b: 
 array([[0., 0.],
       [0., 0.]])
b: 
 array([[0., 0.],
       [0., 1.]])
a: 
 array([[0., 0.],
       [0., 1.]])
shape of a:  (1, 4)


In [54]:
#Case 2 : Shallow copy

a = np.zeros((2,2))
b = a.view() # Shallow copy (view): shares the data but not properties (e.g., shape)
print('b: \n', b.__repr__())

b[1,1] = 1
print('b: \n', b.__repr__())
print('a: \n', a.__repr__()) # a is also changed!!

b.shape = (1,4)
print('shape of a: ', a.shape) #The shape of a is not changed

b: 
 array([[0., 0.],
       [0., 0.]])
b: 
 array([[0., 0.],
       [0., 1.]])
a: 
 array([[0., 0.],
       [0., 1.]])
shape of a:  (2, 2)


In [55]:
#Case 3 : Deep copy

a=np.zeros((2,2))
c = a.copy() # Deep copy: an independent array that shares neither the data nor the properties
print('c: \n', c.__repr__())

c[1,1] =1
print('c: \n', c.__repr__())
print('a: \n', a.__repr__()) # a is not changed

c: 
 array([[0., 0.],
       [0., 0.]])
c: 
 array([[0., 0.],
       [0., 1.]])
a: 
 array([[0., 0.],
       [0., 0.]])
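
The three cases can be told apart programmatically. The sketch below uses `is` to test object identity and `np.shares_memory()` to test whether two arrays use the same data buffer. It also shows that basic indexing and slicing return views, which is a common source of accidental modification.

```python
import numpy as np

a = np.zeros((2, 2))
alias = a          # case 1: the same object under a second name
view = a.view()    # case 2: a new array object on the same data
deep = a.copy()    # case 3: a new array object with its own data
row = a[0]         # basic indexing/slicing also returns a view

print(alias is a)                   # True
print(view is a, view.base is a)    # False True
print(np.shares_memory(a, view))    # True
print(np.shares_memory(a, row))     # True
print(np.shares_memory(a, deep))    # False

row[0] = 7         # writing through the view changes a as well
print(a[0, 0])     # 7.0
print(deep[0, 0])  # 0.0, the deep copy is unaffected
```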


### References

https://cs231n.github.io/python-numpy-tutorial/#numpy

http://aikorea.org/cs231n/python-numpy-tutorial/

https://nbviewer.jupyter.org/gist/FinanceData/274d1a051b8ef10379b35b3fa72dd931