## NumPy

- NumPy is a Python library that provides a simple yet powerful data structure: the **n-dimensional array**.
- This is the foundation on which almost all of the power of Python's data science toolkit is built.
- The official documentation of NumPy: [numpy.org](https://numpy.org/)

Install `numpy` in your virtual environment using `pip install numpy`.

----

### Benefits of NumPy

* **More speed:** NumPy runs its array operations in compiled C code, so they are typically much faster than the equivalent pure-Python loops.
* **Fewer loops:** NumPy helps you to reduce loops and keep from getting tangled up in iteration indices.
* **Clearer code:** Without loops, your code will look more like the equations you're trying to calculate.
* **Open source:** Distributed under a liberal BSD license, NumPy is developed and maintained publicly on GitHub by a vibrant, responsive, and diverse community. Hence NumPy is stable, mature and friendly.
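
The "fewer loops" and "clearer code" points can be sketched with a small comparison. This is only an illustration on made-up data (`values`, `squares_loop` and `squares_vec` are hypothetical names), not a benchmark:

```python
import numpy as np

# Hypothetical data: the integers 0..9 as a list and as an array
values = list(range(10))
arr = np.array(values)

# Pure-Python approach: an explicit loop over the elements
squares_loop = []
for v in values:
    squares_loop.append(v * v)

# NumPy approach: one vectorized expression, no explicit loop
squares_vec = arr * arr

print(squares_loop)
print(squares_vec)
```

Both versions compute the same numbers; the NumPy version reads like the formula it implements.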

In [1]:
# import numpy

import numpy as np

### Creating a NumPy array from a list

In [2]:
list_1 = [2, 4, 3, 6, 8, 7]

x = np.array(list_1)

In [3]:
print(type(list_1))

<class 'list'>


In [4]:
print(type(x))

<class 'numpy.ndarray'>


In [5]:
x

array([2, 4, 3, 6, 8, 7])

In [11]:
list_2 = [[1, 2, 3], [4, 5, 6]]  # 2D list (list of lists)

In [12]:
y = np.array(list_2) # 2D array

In [8]:
y

array([[1, 2, 3],
       [4, 5, 6]])

In [13]:
print(type(y))

<class 'numpy.ndarray'>


In [14]:
list_3d = [[[1, 2, 3], [4, 5, 6]], [[-1, 2, -3], [4, -5, 6]], [[2, 4, 5], [3, 4, 5]]] # 3D list (list of lists of lists)

list_3d

[[[1, 2, 3], [4, 5, 6]], [[-1, 2, -3], [4, -5, 6]], [[2, 4, 5], [3, 4, 5]]]

In [15]:
z = np.array(list_3d)  # 3D array

print(type(z)) 

<class 'numpy.ndarray'>


In [16]:
z

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[-1,  2, -3],
        [ 4, -5,  6]],

       [[ 2,  4,  5],
        [ 3,  4,  5]]])

In [17]:
x.shape  # returns a tuple with the size of the array along each dimension

(6,)

In [18]:
y.shape

(2, 3)

In [19]:
z.shape

(3, 2, 3)

### Create NumPy array from linearly spaced data

We will use the `numpy.linspace` function for this.

In [20]:
a = np.linspace(start=0, stop=100, num=21)   # both start and stop are included

a

array([  0.,   5.,  10.,  15.,  20.,  25.,  30.,  35.,  40.,  45.,  50.,
        55.,  60.,  65.,  70.,  75.,  80.,  85.,  90.,  95., 100.])

In [21]:
b = np.linspace(start=20, stop=500, num=30)   # both start and stop are included

b

array([ 20.        ,  36.55172414,  53.10344828,  69.65517241,
        86.20689655, 102.75862069, 119.31034483, 135.86206897,
       152.4137931 , 168.96551724, 185.51724138, 202.06896552,
       218.62068966, 235.17241379, 251.72413793, 268.27586207,
       284.82758621, 301.37931034, 317.93103448, 334.48275862,
       351.03448276, 367.5862069 , 384.13793103, 400.68965517,
       417.24137931, 433.79310345, 450.34482759, 466.89655172,
       483.44827586, 500.        ])

In [22]:
b = np.linspace(start=200, stop=-100, num=50)

b

array([ 200.        ,  193.87755102,  187.75510204,  181.63265306,
        175.51020408,  169.3877551 ,  163.26530612,  157.14285714,
        151.02040816,  144.89795918,  138.7755102 ,  132.65306122,
        126.53061224,  120.40816327,  114.28571429,  108.16326531,
        102.04081633,   95.91836735,   89.79591837,   83.67346939,
         77.55102041,   71.42857143,   65.30612245,   59.18367347,
         53.06122449,   46.93877551,   40.81632653,   34.69387755,
         28.57142857,   22.44897959,   16.32653061,   10.20408163,
          4.08163265,   -2.04081633,   -8.16326531,  -14.28571429,
        -20.40816327,  -26.53061224,  -32.65306122,  -38.7755102 ,
        -44.89795918,  -51.02040816,  -57.14285714,  -63.26530612,
        -69.3877551 ,  -75.51020408,  -81.63265306,  -87.75510204,
        -93.87755102, -100.        ])

### NumPy data type

`numpy.ndarray` is the fundamental data type of NumPy.

An ndarray differs from Python's native list in the following ways:

- An ndarray holds homogeneous data: every element has the same data type.
- The element type must be one of the dtypes provided by NumPy.

A few popular dtypes are:

- float64
- int64
- int32
- bool
- datetime64

and many more.
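
As a rough sketch of what "homogeneous" means in practice, the example below mixes integers and a float in one list and then converts between dtypes. The variable names (`mixed`, `as_int`, `dates`) and the sample dates are made up for illustration:

```python
import numpy as np

# Integers and a float in one list: NumPy picks one dtype that can hold them all
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)

# Explicit conversion to an integer dtype drops the fractional part
as_int = mixed.astype('int32')
print(as_int)

# Dates are stored with the datetime64 dtype (here with day resolution)
dates = np.array(['2024-01-01', '2024-01-03'], dtype='datetime64[D]')
print(dates[1] - dates[0])   # difference is a timedelta64 value
```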

In [23]:
x

array([2, 4, 3, 6, 8, 7])

In [24]:
type(x[0])

numpy.int64

In [25]:
z = np.array([1,2,3], dtype='int32')

In [26]:
type(z[0])

numpy.int32

In [27]:
y = np.array([3,4,7], dtype='float32')

y

array([3., 4., 7.], dtype=float32)

In [28]:
type(y[0])

numpy.float32

In [29]:
c = np.array([0, 1, 1, 0, 0], dtype='bool')

In [30]:
c

array([False,  True,  True, False, False])

In [31]:
type(c[0])

numpy.bool

In [34]:
d = np.array([5.5, -1, 1, 0, 0], dtype='bool')

In [35]:
d

array([ True,  True,  True, False, False])

### Creating arrays filled with zeros

In [36]:
np.zeros((4,), dtype='int') # array initialized with zero

array([0, 0, 0, 0])

In [37]:
np.zeros((3,2))

array([[0., 0.],
       [0., 0.],
       [0., 0.]])

In [38]:
np.empty((5,)) # uninitialized: the values are whatever happened to be in memory

array([0.00000000e+000, 1.59793495e-311, 1.59793495e-311, 1.59793495e-311,
       0.00000000e+000])

In [39]:
v = np.array([[1,2,3], [4,5,6], [7,8,9]])

v

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [40]:
v.shape

(3, 3)

In [41]:
u = np.zeros_like(v)

u

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])

### Creating arrays with all ones

In [46]:
np.ones((7,))

array([1., 1., 1., 1., 1., 1., 1.])

In [47]:
np.ones_like(v)

array([[1, 1, 1],
       [1, 1, 1],
       [1, 1, 1]])

### Identity matrix

In [48]:
np.identity(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [49]:
np.eye(5)  # same result as np.identity(5); np.eye can also build non-square matrices and offset diagonals

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])
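
A short sketch of where `np.eye` goes beyond `np.identity`: it accepts a separate column count and a diagonal offset `k`. The names `rect` and `shifted` are only illustrative:

```python
import numpy as np

# 2 rows and 3 columns, ones on the main diagonal
rect = np.eye(2, 3)

# 3x3 matrix with ones on the first diagonal above the main one
shifted = np.eye(3, k=1)

print(rect)
print(shifted)
```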

### numpy.arange

In [52]:
np.arange(start=0, stop=12, step=3)  # stop is excluded, start is included. 

array([0, 3, 6, 9])
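
To contrast `np.arange` (fixed step, stop excluded) with `np.linspace` (fixed number of points), here is a minimal sketch. The values are arbitrary, and `endpoint=False` is used so that `linspace` also excludes the stop value:

```python
import numpy as np

# arange: choose the step; the stop value is not included
by_step = np.arange(0, 1, 0.25)

# linspace: choose the number of points; endpoint=False excludes the stop value
by_count = np.linspace(0, 1, 4, endpoint=False)

print(by_step)
print(by_count)

# With a single argument, arange counts from 0 up to (but not including) it
print(np.arange(5))
```

With non-integer steps, `np.linspace` is usually the safer choice because the number of elements does not depend on floating-point rounding.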

### Shape and axis

In [53]:
x2d = np.array([[2, 3, 4], [9, 8, 7]])

In [54]:
x2d

array([[2, 3, 4],
       [9, 8, 7]])

In [55]:
x2d.shape

(2, 3)

In [56]:
x2d.reshape((3,2))

array([[2, 3],
       [4, 9],
       [8, 7]])

In [57]:
v = np.linspace(0, 100, 21)

v

array([  0.,   5.,  10.,  15.,  20.,  25.,  30.,  35.,  40.,  45.,  50.,
        55.,  60.,  65.,  70.,  75.,  80.,  85.,  90.,  95., 100.])

In [58]:
v.shape

(21,)

In [59]:
v.reshape((3,7))

array([[  0.,   5.,  10.,  15.,  20.,  25.,  30.],
       [ 35.,  40.,  45.,  50.,  55.,  60.,  65.],
       [ 70.,  75.,  80.,  85.,  90.,  95., 100.]])

In [None]:
v.reshape(-1,7)   # -1 means automatically infer the dimension

array([[  0.,   5.,  10.,  15.,  20.,  25.,  30.],
       [ 35.,  40.,  45.,  50.,  55.,  60.,  65.],
       [ 70.,  75.,  80.,  85.,  90.,  95., 100.]])

In [94]:
v.reshape(7,-1)

array([[  0.,   5.,  10.],
       [ 15.,  20.,  25.],
       [ 30.,  35.,  40.],
       [ 45.,  50.,  55.],
       [ 60.,  65.,  70.],
       [ 75.,  80.,  85.],
       [ 90.,  95., 100.]])
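
The `-1` placeholder only works when the remaining dimensions divide the total number of elements. A small sketch, assuming the same 21-element array as above (the flag `reshape_failed` is a hypothetical name):

```python
import numpy as np

v = np.linspace(0, 100, 21)          # 21 elements

# 21 / 7 = 3, so the missing dimension is inferred as 3
inferred_shape = v.reshape(-1, 7).shape
print(inferred_shape)

# 21 is not divisible by 4, so NumPy cannot infer the missing dimension
try:
    v.reshape(4, -1)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```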

In [60]:
x2d

array([[2, 3, 4],
       [9, 8, 7]])

In [61]:
x2d.flatten()

array([2, 3, 4, 9, 8, 7])

In [62]:
x3d = np.array([[[1,2],[2,3]], [[3,4],[4,3]]])

x3d

array([[[1, 2],
        [2, 3]],

       [[3, 4],
        [4, 3]]])

In [63]:
x3d.shape

(2, 2, 2)

In [64]:
x3d.flatten()

array([1, 2, 2, 3, 3, 4, 4, 3])

In [65]:
x2d = np.array([[3.5, 6.4, 2.7], [5.9, 6.1, 7.3]])

In [66]:
np.sum(x2d) # sum of all the elements

np.float64(31.900000000000002)

In [67]:
x2d

array([[3.5, 6.4, 2.7],
       [5.9, 6.1, 7.3]])

In [68]:
x2d.shape

(2, 3)

In [69]:
np.sum(x2d, axis=0) # column wise sum

array([ 9.4, 12.5, 10. ])

In [70]:
np.sum(x2d, axis=1) # row wise sum

array([12.6, 19.3])

In [71]:
x = np.array([[[1, 2, 3, 4], [5, 6, 7, 8]], [[5, 7, 9, 11], [2, 4, 6, 8]], [[-1, -2, 4, 9], [2, -7, 5, 0]]])

In [72]:
x

array([[[ 1,  2,  3,  4],
        [ 5,  6,  7,  8]],

       [[ 5,  7,  9, 11],
        [ 2,  4,  6,  8]],

       [[-1, -2,  4,  9],
        [ 2, -7,  5,  0]]])

In [73]:
x.shape

(3, 2, 4)

In [74]:
np.sum(x, axis=0)  # result dimension is (2,4)

array([[ 5,  7, 16, 24],
       [ 9,  3, 18, 16]])

In [75]:
np.sum(x, axis=1) # result dimension is (3,4)

array([[ 6,  8, 10, 12],
       [ 7, 11, 15, 19],
       [ 1, -9,  9,  9]])

In [76]:
np.sum(x, axis=2) # result dimension is (3,2)

array([[10, 26],
       [32, 20],
       [10,  0]])
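
One way to remember the rule: summing over an axis removes that axis from the shape. The sketch below checks this on a made-up array of shape `(3, 2, 4)` and also shows `keepdims=True`, which keeps the summed axis with length 1:

```python
import numpy as np

# Hypothetical array with the same shape as x above
x = np.arange(24).reshape(3, 2, 4)

# The summed axis disappears from the result's shape
shapes = [np.sum(x, axis=axis).shape for axis in range(x.ndim)]
print(shapes)

# keepdims=True keeps the summed axis as a dimension of length 1
kept = np.sum(x, axis=1, keepdims=True)
print(kept.shape)
```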

### Indexing and Slicing

In [None]:
y = np.array([1,2,3,4,5])  # index starts from 0

y[0]

np.int64(1)

In [78]:
y[4]

np.int64(5)

In [79]:
y[-1]

np.int64(5)

In [80]:
y[2:4]

array([3, 4])

In [81]:
y[:3]

array([1, 2, 3])

In [82]:
y[2:]

array([3, 4, 5])

In [83]:
z = np.linspace(1,24,24).reshape(4,6)  # 2D array of shape (4,6)

In [84]:
z

array([[ 1.,  2.,  3.,  4.,  5.,  6.],
       [ 7.,  8.,  9., 10., 11., 12.],
       [13., 14., 15., 16., 17., 18.],
       [19., 20., 21., 22., 23., 24.]])

In [85]:
z[0]

array([1., 2., 3., 4., 5., 6.])

In [86]:
z[0,4]

np.float64(5.0)

In [87]:
z[1,3]

np.float64(10.0)

In [88]:
z[1,:]  

array([ 7.,  8.,  9., 10., 11., 12.])

In [None]:
z[:,3].reshape(-1,1)

array([[ 4.],
       [10.],
       [16.],
       [22.]])

In [95]:
z

array([[ 1.,  2.,  3.,  4.,  5.,  6.],
       [ 7.,  8.,  9., 10., 11., 12.],
       [13., 14., 15., 16., 17., 18.],
       [19., 20., 21., 22., 23., 24.]])

In [96]:
z[1:3, 2:5]

array([[ 9., 10., 11.],
       [15., 16., 17.]])

In [None]:
z[1:4, 1:6:2]

array([[ 8., 10., 12.],
       [14., 16., 18.],
       [20., 22., 24.]])

### Boolean indexing

In [101]:
v = np.array([-1,4,2,7,10,9,3,5])

In [102]:
v.shape

(8,)

In [103]:
bool_index = np.array([1, 1, 0, 0, 0, 1, 0, 0], dtype='bool')

In [104]:
bool_index

array([ True,  True, False, False, False,  True, False, False])

In [105]:
v[bool_index]

array([-1,  4,  9])

In [107]:
~bool_index

array([False, False,  True,  True,  True, False,  True,  True])

In [106]:
v[~bool_index]

array([ 2,  7, 10,  3,  5])

In [109]:
v

array([-1,  4,  2,  7, 10,  9,  3,  5])

In [108]:
v > 2   # boolean array with the same shape as v: True where the element is > 2, False otherwise

array([False,  True, False,  True,  True,  True,  True,  True])

In [110]:
v[v > 2]   # returns all the elements of v that are > 2

array([ 4,  7, 10,  9,  3,  5])

In [111]:
(v > 2) & (v < 9)   # & -> and operation

array([False,  True, False,  True, False, False,  True,  True])

In [112]:
v[(v > 2) & (v < 9)]

array([4, 7, 3, 5])

In [113]:
(v < 5) | (v == 10) # | -> or operation

array([ True,  True,  True, False,  True, False,  True, False])

In [114]:
v[(v < 5) | (v == 10)]

array([-1,  4,  2, 10,  3])
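
Boolean masks can also be used to count, to assign, and to choose between two values. A minimal sketch on the same `v` (the names `clipped`, `labels` and `count` are made up for this example):

```python
import numpy as np

v = np.array([-1, 4, 2, 7, 10, 9, 3, 5])

# Counting: True counts as 1 when summed
count = int(np.sum(v > 2))

# Assignment: overwrite only the positions where the mask is True
clipped = v.copy()
clipped[clipped > 5] = 5

# Selection between two values, element by element
labels = np.where(v > 2, 'big', 'small')

print(count)
print(clipped)
print(labels)
```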

### Array methods

In [115]:
x = np.array([3, -1, 4, 8, 0, 2])

In [116]:
x.sort()  # in-place sorting 

In [117]:
x

array([-1,  0,  2,  3,  4,  8])

In [118]:
x.sort()
x = x[::-1]  # reverse the ascending result to get descending order

x

array([ 8,  4,  3,  2,  0, -1])

In [None]:
y = np.array([3, -1, 4, 8, 0, 2])

y.argsort()  # indices of the elements, ordered from smallest to largest value

array([1, 4, 5, 0, 2, 3])

In [120]:
y.argsort()[-1]   # index of the largest element

np.int64(3)

In [121]:
y.min()  # minimum value of the array

np.int64(-1)

In [122]:
y.argmin()  # return the index of minimum value

np.int64(1)

In [None]:
y.max()  # maximum value of the array

np.int64(8)

In [124]:
y.argmax()  # index of the maximum value

np.int64(3)
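
A common use of `argsort` is to reorder a second array by the values of the first. The sketch below assumes a hypothetical `names` array that runs parallel to the scores:

```python
import numpy as np

scores = np.array([3, -1, 4, 8, 0, 2])
names = np.array(['a', 'b', 'c', 'd', 'e', 'f'])   # hypothetical labels

# Indices that would sort scores in ascending order
order = np.argsort(scores)

# Use the same indices to reorder the parallel array
by_score = names[order]

# Reverse the order and keep the first three for the top 3 scores
top3 = names[order[::-1][:3]]

print(order)
print(by_score)
print(top3)
```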

In [125]:
x = np.array([1, 2, -1, 4, 0, 5, 7, 8])

In [None]:
np.cumsum(x)  # cumulative summation

# y = cumsum(x)
# y[0] = x[0]
# y[1] = x[0] + x[1] = y[0] + x[1]
# y[2] = x[0] + x[1] + x[2] = y[1] + x[2]
# ....
# y[n] = y[n-1] + x[n] 

array([ 1,  3,  2,  6,  6, 11, 18, 26])

In [None]:
z = np.cumprod(x)   # cumulative product

# z = cumprod(x)
# z[0] = x[0]
# z[1] = x[0] * x[1] = z[0] * x[1]
# z[2] = x[0] * x[1] * x[2] = z[1] * x[2]
# ...
# z[n] = z[n-1] * x[n]

In [128]:
z

array([ 1,  2, -2, -8,  0,  0,  0,  0])
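
The recurrences in the comments above can be checked against `np.cumsum` and `np.cumprod` with an explicit loop. This is just a verification sketch; `running_sum` and `running_prod` are hypothetical names:

```python
import numpy as np

x = np.array([1, 2, -1, 4, 0, 5, 7, 8])

# Build the cumulative sum and product by hand from the recurrences
running_sum = np.empty_like(x)
running_prod = np.empty_like(x)
running_sum[0] = x[0]
running_prod[0] = x[0]
for n in range(1, len(x)):
    running_sum[n] = running_sum[n - 1] + x[n]
    running_prod[n] = running_prod[n - 1] * x[n]

print(np.array_equal(running_sum, np.cumsum(x)))
print(np.array_equal(running_prod, np.cumprod(x)))
```

Once a zero appears in `x`, every later entry of the cumulative product is zero, which is why the output above ends in a run of zeros.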

### Mutability and copying of NumPy arrays

In [129]:
a = [1, 3, 4]

b = a  # pointing to the same memory location

In [130]:
b.append(5)

b

[1, 3, 4, 5]

In [131]:
a

[1, 3, 4, 5]

In [132]:
a[0] = 56

In [133]:
b

[56, 3, 4, 5]

In [134]:
a

[56, 3, 4, 5]

In [135]:
b = a.copy()  # copies the same elements to a different memory location

In [136]:
b.append(16)

In [137]:
b

[56, 3, 4, 5, 16]

In [138]:
a

[56, 3, 4, 5]

In [149]:
x = np.array((4, 5, 6, 10))

In [150]:
y = x.copy()

In [151]:
x[2] = 23

In [152]:
x

array([ 4,  5, 23, 10])

In [153]:
y

array([ 4,  5,  6, 10])
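
One point where arrays differ from lists: slicing a list makes a new list, while basic slicing of an array returns a view on the same data. A small sketch (the names `view` and `independent` are illustrative):

```python
import numpy as np

x = np.array([4, 5, 6, 10])

# Basic slicing returns a view: it shares memory with x
view = x[1:3]
view[0] = 99            # this also changes x[1]

# An explicit copy is independent of x
independent = x[1:3].copy()
independent[1] = -1     # x is not affected

print(x)
print(np.shares_memory(x, view))
print(np.shares_memory(x, independent))
```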

### Stacking NumPy arrays

In [154]:
x1 = np.array([[1,2],[3,4],[5,6]])
x2 = np.array([[7,8,-1],[0,-1,1]])
x3 = np.array([[-1,1],[-2,2],[-3,3]])

x1.shape, x2.shape, x3.shape

((3, 2), (2, 3), (3, 2))

In [155]:
x1

array([[1, 2],
       [3, 4],
       [5, 6]])

In [156]:
x3

array([[-1,  1],
       [-2,  2],
       [-3,  3]])

### stack()

![](https://www.w3resource.com/w3r_images/numpy-manipulation-stack-function-image-1.png)

In [157]:
x1

array([[1, 2],
       [3, 4],
       [5, 6]])

In [158]:
x3

array([[-1,  1],
       [-2,  2],
       [-3,  3]])

In [159]:
v1 = np.stack((x1,x3), axis=0) # stacking along a new 0th axis

v1

array([[[ 1,  2],
        [ 3,  4],
        [ 5,  6]],

       [[-1,  1],
        [-2,  2],
        [-3,  3]]])

In [160]:
v1.shape

(2, 3, 2)

In [161]:
v2 = np.stack((x1,x3), axis=1) # stacking along axis-1

v2

array([[[ 1,  2],
        [-1,  1]],

       [[ 3,  4],
        [-2,  2]],

       [[ 5,  6],
        [-3,  3]]])

In [162]:
v2.shape

(3, 2, 2)

### hstack()

![](https://www.w3resource.com/w3r_images/numpy-manipulation-hstack-function-image-a.png)

In [163]:
x1

array([[1, 2],
       [3, 4],
       [5, 6]])

In [165]:
x3

array([[-1,  1],
       [-2,  2],
       [-3,  3]])

In [166]:
np.hstack((x1,x3))  # stacking horizontally: the arrays are placed side by side (along axis 1)

array([[ 1,  2, -1,  1],
       [ 3,  4, -2,  2],
       [ 5,  6, -3,  3]])

### vstack()

![](https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQ6jO_LjYNvZEFQHGWmTRkCRsIFT1KS5pSOCw&s)

In [167]:
np.vstack((x1,x3))  # stacking vertically: the arrays are placed one on top of the other (along axis 0)

array([[ 1,  2],
       [ 3,  4],
       [ 5,  6],
       [-1,  1],
       [-2,  2],
       [-3,  3]])
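
For 2D arrays, `hstack` and `vstack` give the same result as `np.concatenate` along axis 1 and axis 0. The sketch below also shows why `x2`, with shape `(2, 3)`, cannot be stacked next to `x1` unless it is transposed first (the flag `mismatch_raised` is a hypothetical name):

```python
import numpy as np

x1 = np.array([[1, 2], [3, 4], [5, 6]])        # shape (3, 2)
x2 = np.array([[7, 8, -1], [0, -1, 1]])        # shape (2, 3)
x3 = np.array([[-1, 1], [-2, 2], [-3, 3]])     # shape (3, 2)

# Joining along an existing axis
side_by_side = np.concatenate((x1, x3), axis=1)   # like np.hstack
on_top = np.concatenate((x1, x3), axis=0)         # like np.vstack

# Row counts differ (3 vs 2), so horizontal stacking fails
try:
    np.hstack((x1, x2))
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# After transposing, x2 has shape (3, 2) and can be stacked
fixed = np.hstack((x1, x2.T))
print(side_by_side.shape, on_top.shape, fixed.shape)
```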

### Arithmetic operations on NumPy arrays

#### Addition

In [168]:
x = np.array([[7, 8, -1],[0, -1, 1]])
x

array([[ 7,  8, -1],
       [ 0, -1,  1]])

In [169]:
x + 2    # broadcasting (add value 2 to all the elements of the array)

array([[ 9, 10,  1],
       [ 2,  1,  3]])

In [170]:
y = np.array([[3, 4, 5], [9, 7, 2]])

y

array([[3, 4, 5],
       [9, 7, 2]])

In [171]:
x + y  # element-wise addition (the shapes of x and y must be the same or broadcast-compatible)

array([[10, 12,  4],
       [ 9,  6,  3]])

In [172]:
z = np.array([[2, 4, 5]])

z

array([[2, 4, 5]])

In [173]:
y + z   # broadcasting (add the single row array z to each row of y)

array([[ 5,  8, 10],
       [11, 11,  7]])

In [174]:
y + np.array([2, 5])

ValueError: operands could not be broadcast together with shapes (2,3) (2,) 

In [175]:
y + np.array([[2],[5]])

array([[ 5,  6,  7],
       [14, 12,  7]])

In [176]:
# transposing

x

array([[ 7,  8, -1],
       [ 0, -1,  1]])

In [None]:
x.T   # .T will transpose

array([[ 7,  0],
       [ 8, -1],
       [-1,  1]])

In [178]:
y + np.array([[2, 5]]).T  # broadcasting: add the single-column array to each column of y

# array([[3, 4, 5],    +  array([[2],
#        [9, 7, 2]])             [5]])


array([[ 5,  6,  7],
       [14, 12,  7]])
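
The broadcasting examples above follow one rule: shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal or one of them is 1. A hedged sketch using `np.broadcast_shapes` (available in recent NumPy versions) and `np.newaxis` as an alternative to `.T`; the names `col`, `result` and `incompatible` are made up here:

```python
import numpy as np

y = np.array([[3, 4, 5], [9, 7, 2]])   # shape (2, 3)
col = np.array([2, 5])                 # shape (2,)

# Compatible pairs: each dimension is equal or one of them is 1
print(np.broadcast_shapes((2, 3), (1, 3)))
print(np.broadcast_shapes((2, 3), (2, 1)))

# (2, 3) with (2,) compares 3 against 2, which is not compatible
try:
    np.broadcast_shapes((2, 3), (2,))
    incompatible = False
except ValueError:
    incompatible = True

# Adding a trailing axis turns shape (2,) into (2, 1), which broadcasts
result = y + col[:, np.newaxis]
print(result)
```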

#### Subtraction

In [179]:
x

array([[ 7,  8, -1],
       [ 0, -1,  1]])

In [180]:
x - 3.5 # broadcasting : subtract 3.5 from each element

array([[ 3.5,  4.5, -4.5],
       [-3.5, -4.5, -2.5]])

#### Multiplication and division

In [181]:
x * 3  # broadcasting : multiply each element by 3

array([[21, 24, -3],
       [ 0, -3,  3]])

In [182]:
x

array([[ 7,  8, -1],
       [ 0, -1,  1]])

In [183]:
y

array([[3, 4, 5],
       [9, 7, 2]])

In [184]:
x * y   # element wise product -> Hadamard product

array([[21, 32, -5],
       [ 0, -7,  2]])

In [185]:
x / 4 # broadcasting : divide each element by 4

array([[ 1.75,  2.  , -0.25],
       [ 0.  , -0.25,  0.25]])

In [186]:
x / y # element wise division

array([[ 2.33333333,  2.        , -0.2       ],
       [ 0.        , -0.14285714,  0.5       ]])

### Vectorized function

In [187]:
def return_max_value(x, y):
    if x > y:
        return x
    else:
        return y

In [188]:
return_max_value(3, 6)

6

In [189]:
array1 = np.array([2, 3, 4, 1])
array2 = np.array([3, 4, 2, 6])

In [190]:
return_max_value(array1, array2)

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

In [191]:
return_max_value_vect = np.vectorize(return_max_value)  # vectorizing the function

In [192]:
return_max_value_vect

<numpy.vectorize at 0x2f12c8ec6b0>

In [193]:
return_max_value_vect(array1, array2)

# equivalent to np.array([return_max_value(2, 3), return_max_value(3, 4), ...])

array([3, 4, 4, 6])

In [194]:
from math import factorial

In [195]:
factorial(6)

720

In [196]:
factorial(array1)

TypeError: only integer scalar arrays can be converted to a scalar index

In [197]:
np.vectorize(factorial)(array1)

array([ 2,  6, 24,  1])
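
According to the NumPy documentation, `np.vectorize` is provided mainly for convenience and is essentially a Python loop underneath. When a built-in array function exists, it is usually the better choice. A sketch for the maximum example above, using `np.maximum` and `np.where` in place of the hand-written function:

```python
import numpy as np

array1 = np.array([2, 3, 4, 1])
array2 = np.array([3, 4, 2, 6])

# Built-in element-wise maximum of two arrays
elementwise_max = np.maximum(array1, array2)

# The same if/else logic written with np.where
with_where = np.where(array1 > array2, array1, array2)

print(elementwise_max)
print(with_where)
```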

### Popular vectorized functions

In [198]:
array1

array([2, 3, 4, 1])

In [199]:
array2

array([3, 4, 2, 6])

In [200]:
np.sin(array1)

array([ 0.90929743,  0.14112001, -0.7568025 ,  0.84147098])

In [201]:
np.cos(array2)

array([-0.9899925 , -0.65364362, -0.41614684,  0.96017029])

In [None]:
np.log(array1)  # natural logarithm

array([0.69314718, 1.09861229, 1.38629436, 0.        ])

In [203]:
from math import pi

np.sin(pi/2)

np.float64(1.0)