### NumPy - Numerical Python 
* Developed in 2006, NumPy is used in many domains that rely on high-dimensional arrays, such as scientific computing, deep learning and financial data. Data reaches Python from various sources and in various formats (image data, text data, etc.), so it is heterogeneous in nature. To process it we treat it as an array of numbers; an 8-bit image, for example, is an array of numbers ranging from 0 to 255.
* Python has only a handful of basic built-in data types, and none of them is designed for storing and manipulating large numerical arrays. That is the reason NumPy came into the picture.
* NumPy stores n-d arrays compactly in contiguous memory and operates on them in vectorised form, which makes efficient use of memory. NumPy can also save n-d arrays to files and load them back in the same form. Because NumPy arrays are homogeneous, NumPy avoids costly per-element type-checking.
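The storage points above can be illustrated with a minimal sketch. It assumes a temporary directory and a hypothetical file name `grid.npy`; `np.save` and `np.load` are the standard NumPy calls for the binary `.npy` format.

```python
import os
import tempfile

import numpy as np

# A small 2-D array; every element shares one dtype (homogeneous storage).
grid = np.arange(12, dtype=np.int64).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "grid.npy")  # hypothetical file name
    np.save(path, grid)        # write the array in NumPy's binary .npy format
    restored = np.load(path)   # read it back with shape and dtype intact

print(restored.shape, restored.dtype)
print(grid.itemsize, grid.nbytes)  # bytes per element, total bytes of data
```

Because every element has the same size, the total size is simply the element count times `itemsize`, which is part of why no per-element type information needs to be stored.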

#### Why NumPy 

In [1]:
# Initialize the N
N = 10000000

In [6]:
%%time
# Build a list of N values, square each element in place, and time the whole process
l1 = list(range(N))
for i in range(N):
    l1[i] = l1[i]*l1[i]


Wall time: 2.39 s


In [7]:
%%time
l1 = list(range(N))
l1 = [i*i for i in l1] # same task with list comprehension

Wall time: 1.02 s


In [8]:
%%time
l1 = list(range(N))
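# Using map with a lambda function. Note: in Python 3 map() returns a lazy iterator,
# so nothing is squared here and the time below mostly covers building the list;
# wrap the call in list(...) to force the computation.
l1 = map(lambda x: x*x, l1)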

Wall time: 257 ms


In [10]:
# NumPy comes pre-installed with the Anaconda distribution, so we only need to import it
import numpy as np
np.__version__

'1.20.3'

In [11]:
%%time
array_1 = np.arange(N) #using NumPy arrays for the same task
array_1 = array_1*array_1

Wall time: 26.9 ms


From the above example we can see that using Python lists as arrays is expensive: type-checking every element and inefficient use of memory cost time and power. Changing how the loop is written (for example, a list comprehension instead of an explicit loop) improves the speed, but that is not sufficient for high-dimensional arrays. With a NumPy array the time drops significantly. NumPy thus brings array-oriented programming to Python.
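The comparison above can be repeated outside a notebook with a small timing helper. This is only a sketch: `timed` is a hypothetical helper, `n` is deliberately smaller than the `N` used above, and the absolute timings will differ from machine to machine. One caveat when timing `map`: in Python 3 it returns a lazy iterator, so it has to be wrapped in `list(...)` for the squaring to actually run.

```python
import time

import numpy as np

n = 100_000  # smaller than the N used above so the sketch runs quickly


def timed(label, func):
    """Run func once and report the elapsed wall-clock time."""
    start = time.perf_counter()
    result = func()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed * 1000:.2f} ms")
    return result


squares_loop = timed("list comprehension", lambda: [i * i for i in range(n)])
# map() is lazy in Python 3, so list() is needed to make it do the work.
squares_map = timed("map + list", lambda: list(map(lambda x: x * x, range(n))))
squares_np = timed("numpy", lambda: np.arange(n, dtype=np.int64) ** 2)
```

All three approaches produce the same values; only the time taken differs.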

#### How Numpy

##### Creating Arrays  

In [12]:
# Creating one dimensional numpy Arrays from list
a1 = np.array([1,2,3,4,5])
a1

array([1, 2, 3, 4, 5])

##### Numpy arrays are homogeneous 

In [13]:
# Remember: Python lists are heterogeneous, but NumPy arrays are homogeneous. If the elements do not
# share one data type, NumPy upcasts them to a common data type (here int -> float)
a2 = np.array([1,2,3,4,5.])
a2

array([1., 2., 3., 4., 5.])

In [14]:
# Or we can predefine the data type
a3 = np.array([10,15,20,35,40,60.0],dtype = "float32")
a3

array([10., 15., 20., 35., 40., 60.], dtype=float32)
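The upcasting behaviour described above can be inspected through the `dtype` attribute. The following is a minimal sketch; the variable names are illustrative only.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2, 3.5])        # one float upcasts every element to float
with_text = np.array([1, 2, "3"])    # one string turns every element into a string

# dtype.kind is a one-letter code: 'i' integer, 'f' float, 'U' unicode string
print(ints.dtype.kind, mixed.dtype.kind, with_text.dtype.kind)

# astype returns a converted copy; float -> int truncates toward zero
truncated = mixed.astype("int64")
print(truncated)
```

Note that `astype` does not modify `mixed`; it returns a new array with the requested type.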

In [15]:
# Numpy arrays are N-Dimensional arrays here n = 1 i.e. One dimensional 
type(a3)

numpy.ndarray

In [16]:
# shape of the array
a3.shape

(6,)

In [17]:
# 2D-Array
a4 = np.array([[0,1,2],[3,4,5]]) # two levels of nested brackets, hence a 2D array (a quick intuition)
a4

array([[0, 1, 2],
       [3, 4, 5]])

In [18]:
# it has 2 rows that has 3 elements each
a4.shape 

(2, 3)

In [19]:
# 3D- Array
a5 = np.array([[[0,1,2],[3,4,5]], [[5,6,7],[8,9,10]]])
a5

array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[ 5,  6,  7],
        [ 8,  9, 10]]])

In [20]:
# it has 3 dimensions where on the first plane we have [[0,1,2],[3,4,5]] and on the other plane we have [[5,6,7],[8,9,10]]
a5.shape

(2, 2, 3)

##### Notes

* One-dimensional arrays are called vectors.
* Two-dimensional arrays are called matrices.
* N-dimensional arrays are called tensors.
* In array terminology the dimensions (axes) are numbered starting from the outermost one. For our 3-D array of shape (2, 2, 3) we have:<br>
1) 2 as our dimension 0, which contains plane-1 and plane-2 (the plane behind plane-1).<br>
2) 2 as our dimension 1, which contains the 2 rows on each plane.<br>
3) 3 as our dimension 2, which contains the 3 elements of each row.
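As a quick check of the terminology above, the sketch below rebuilds the same 3-D array (named `cube` here for illustration) and reads its axes back through `ndim`, `shape` and `size`.

```python
import numpy as np

cube = np.array([[[0, 1, 2], [3, 4, 5]],
                 [[5, 6, 7], [8, 9, 10]]])

print(cube.ndim)    # number of axes
print(cube.shape)   # length of each axis, outermost first
print(cube.size)    # total number of elements

planes, rows, cols = cube.shape
# Negative axis numbers count from the end, so axis -1 is the innermost axis.
print(cube.shape[-1] == cols)
```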
 

##### Creating Arrays  with Inbuilt Functions.

In [21]:
# we can create numpy arrays filled with zeroes
np.zeros(10,dtype = "int")

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [22]:
# we can create 2-D arrays all filled with ones
np.ones((3,5),dtype="int")

array([[1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1]])

In [23]:
# we can create 3-D arrays filled with some values
np.full((3,4,5),21)

array([[[21, 21, 21, 21, 21],
        [21, 21, 21, 21, 21],
        [21, 21, 21, 21, 21],
        [21, 21, 21, 21, 21]],

       [[21, 21, 21, 21, 21],
        [21, 21, 21, 21, 21],
        [21, 21, 21, 21, 21],
        [21, 21, 21, 21, 21]],

       [[21, 21, 21, 21, 21],
        [21, 21, 21, 21, 21],
        [21, 21, 21, 21, 21],
        [21, 21, 21, 21, 21]]])

In [24]:
# arange creates evenly spaced values over a half-open range: it starts at `start` and stops before `stop`
# (the stop value itself is excluded); the step size can be a decimal too.
np.arange(0,40,1.2)

array([ 0. ,  1.2,  2.4,  3.6,  4.8,  6. ,  7.2,  8.4,  9.6, 10.8, 12. ,
       13.2, 14.4, 15.6, 16.8, 18. , 19.2, 20.4, 21.6, 22.8, 24. , 25.2,
       26.4, 27.6, 28.8, 30. , 31.2, 32.4, 33.6, 34.8, 36. , 37.2, 38.4,
       39.6])

In [25]:
# linspace creates the requested number of evenly spaced values from start to stop, both included;
# here 40 values from 0 to 40, so the spacing between neighbours is 40/39
np.linspace(0,40,40)

array([ 0.        ,  1.02564103,  2.05128205,  3.07692308,  4.1025641 ,
        5.12820513,  6.15384615,  7.17948718,  8.20512821,  9.23076923,
       10.25641026, 11.28205128, 12.30769231, 13.33333333, 14.35897436,
       15.38461538, 16.41025641, 17.43589744, 18.46153846, 19.48717949,
       20.51282051, 21.53846154, 22.56410256, 23.58974359, 24.61538462,
       25.64102564, 26.66666667, 27.69230769, 28.71794872, 29.74358974,
       30.76923077, 31.79487179, 32.82051282, 33.84615385, 34.87179487,
       35.8974359 , 36.92307692, 37.94871795, 38.97435897, 40.        ])
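The difference between `arange` (fixed step, stop excluded) and `linspace` (fixed number of points, stop included) is easier to see on a small range. A minimal sketch, using the documented `retstep` and `endpoint` parameters of `np.linspace`:

```python
import numpy as np

# arange: fixed step, stop value excluded
stepped = np.arange(0, 40, 10)

# linspace: fixed number of points, stop value included by default;
# retstep=True also returns the spacing that was used
points, step = np.linspace(0, 40, 5, retstep=True)

# endpoint=False drops the stop value, giving the same values as arange above
half_open = np.linspace(0, 40, 4, endpoint=False)

print(stepped)
print(points, step)
print(half_open)
```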

In [26]:
# we can also create an identity matrix
np.identity(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [27]:
# Generates random numbers in the interval [0, 1)
# Random values are generally used to simulate an event or to produce sample data to experiment with.
np.random.random((2,3))

array([[0.09385841, 0.50217454, 0.52971145],
       [0.86779741, 0.48556615, 0.1200499 ]])

In [28]:
# generates random integers in the given range (here 0 to 9: the upper bound 10 is excluded)
a6= np.random.randint(0,10,(3,3))
a6

array([[7, 1, 7],
       [0, 5, 5],
       [3, 1, 3]])

In [29]:
np.random.randn(4,5) # numbers drawn from the standard normal distribution (mean = 0, variance = 1)

array([[ 1.47252549, -0.7760884 , -0.22621421, -0.01754808, -1.72558332],
       [ 0.49538172, -1.31817645, -0.1720082 ,  0.04800515,  1.81916895],
       [ 1.80525076, -0.20005571, -0.39811051,  0.21103552, -0.20370036],
       [ 1.19429692,  0.276467  ,  1.73292745,  0.20826862, -0.80917563]])

In [30]:
np.random.rand(3,4) # Uniformly sampled numbers from 0-1. 

array([[0.67865581, 0.22409827, 0.2391382 , 0.58158776],
       [0.28764747, 0.5154386 , 0.3899557 , 0.54886044],
       [0.17234211, 0.39508373, 0.22776309, 0.80476116]])
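The random outputs above change on every run. When reproducible numbers are needed, one option is NumPy's `Generator` interface, created with `np.random.default_rng` (available since NumPy 1.17). The sketch below is an alternative to the `np.random.*` functions used above, not a replacement the notebook itself relies on, and the seed value is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # the seed value is arbitrary

uniform = rng.random((2, 3))                 # floats in [0, 1)
integers = rng.integers(0, 10, size=(3, 3))  # ints in [0, 10): 10 is excluded
normal = rng.standard_normal((4, 5))         # mean 0, variance 1

# Re-creating the generator with the same seed reproduces the same numbers.
again = np.random.default_rng(seed=42).random((2, 3))
print(np.array_equal(uniform, again))
```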

##### Retrieve data from Arrays

In [31]:
a5

array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[ 5,  6,  7],
        [ 8,  9, 10]]])

In [32]:
# we can access elements with square brackets, one index per axis ([plane, row, column] for this 3-D array), much like indexing a Python list
a5[0,0,0]

0

In [33]:
# the whole array; inside the brackets we can slice each axis to get other elements
a5[:,:]

array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[ 5,  6,  7],
        [ 8,  9, 10]]])

In [34]:
# index along each axis in turn to reach the value 2:
# plane 0 is [[0, 1, 2], [3, 4, 5]], row 0 is [0, 1, 2], and index 2 is its last element
a5[0,0,2]

2

##### Slicing 

In [35]:
# Get the matrix [[6,7],[9,10]] - Slice the data
a5[1, 0:3, 1:3]

array([[ 6,  7],
       [ 9, 10]])

In [36]:
# get the 2nd row of each plane
a5[:,1,:]

array([[ 3,  4,  5],
       [ 8,  9, 10]])

In [37]:
# reverse the order of the planes and of the rows within each plane (the columns stay as they are)
a5[::-1,::-1]

array([[[ 8,  9, 10],
        [ 5,  6,  7]],

       [[ 3,  4,  5],
        [ 0,  1,  2]]])

In [38]:
# boolean mask: keep only the elements that satisfy the condition (here, the even values)
a5[a5%2==0]

array([ 0,  2,  4,  6,  8, 10])
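The condition inside the brackets is itself an array. The sketch below splits the selection into its steps, using a fresh copy of the 3-D array (named `cube` for illustration), and adds `np.where` for the case where the original shape should be kept.

```python
import numpy as np

cube = np.array([[[0, 1, 2], [3, 4, 5]],
                 [[5, 6, 7], [8, 9, 10]]])

mask = cube % 2 == 0            # boolean array with the same shape as cube
evens = cube[mask]              # 1-D array of the selected values
count = np.count_nonzero(mask)  # how many elements satisfy the condition

# np.where keeps the original shape and substitutes -1 where the mask is False
evens_or_minus_one = np.where(mask, cube, -1)
print(mask.shape, evens, count)
print(evens_or_minus_one)
```

Indexing with a mask always returns a flat 1-D result, because the selected elements generally do not form a rectangular block.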

##### Copy Function

 Whenever we slice an array, NumPy creates a view of it, much like a view in SQL; the data is not copied in memory. Modifying the view therefore updates the original array, so when we need independent data we use the copy function.

In [39]:
arr_1 = a5

In [40]:
arr_2 = arr_1[:,1,:]
arr_2

array([[ 3,  4,  5],
       [ 8,  9, 10]])

In [41]:
np.shares_memory(arr_1,arr_2)

True

In [42]:
# here we can see that both share the same memory, so anything we update in arr_2 is also updated in arr_1.
arr_2[:1] = 10
arr_2

array([[10, 10, 10],
       [ 8,  9, 10]])

In [43]:
arr_1

array([[[ 0,  1,  2],
        [10, 10, 10]],

       [[ 5,  6,  7],
        [ 8,  9, 10]]])

In [44]:
# To avoid this we use the copy function, which copies the data into a new array.
arr_3 = arr_1[:,1,:].copy()  # deep copy

In [45]:
np.shares_memory(arr_1,arr_3)

False
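The three cases that matter here (plain assignment, slicing, and an explicit copy) can be put side by side. A minimal sketch with illustrative names:

```python
import numpy as np

original = np.arange(12).reshape(2, 2, 3)

alias = original                        # plain assignment: same object, no new array
view = original[:, 1, :]                # basic slicing: new array object, shared data
independent = original[:, 1, :].copy()  # copy: its own data buffer

print(alias is original)
print(np.shares_memory(original, view), np.shares_memory(original, independent))

view[0, 0] = -1          # visible through `original`
independent[0, 1] = -2   # not visible through `original`
print(original[0, 1])
```

Note that a plain assignment such as `alias = original` does not create a second array at all, so changes made through either name affect the same data.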

#### Aggregations & other Mathematical Functions

In [46]:
# we can do aggregation on the values in array
np.sum(a5)

78

In [47]:
np.min(a5)

0

In [48]:
np.max(a5)

10

In [49]:
# for multidimensional aggregation we can pass an axis
# axis = 0 sums across the planes of this 3-D array (for a 2-D array it would sum down each column)
np.sum(a5,axis=0)

array([[ 5,  7,  9],
       [18, 19, 20]])

In [50]:
# axis = 1 sums over the rows within each plane, giving one total per column of each plane
np.sum(a5,axis = 1)

array([[10, 11, 12],
       [13, 15, 17]])
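A useful way to read the `axis` argument is that the axis being reduced disappears from the shape of the result. The sketch below uses a separate array with three different axis lengths (an assumption made only so the shapes are easy to tell apart).

```python
import numpy as np

cube = np.arange(24).reshape(2, 3, 4)   # 2 planes, 3 rows, 4 columns

# The axis that is reduced disappears from the result's shape.
print(cube.sum(axis=0).shape)   # planes collapsed
print(cube.sum(axis=1).shape)   # rows collapsed
print(cube.sum(axis=2).shape)   # columns collapsed

# keepdims=True keeps the reduced axis with length 1
print(cube.sum(axis=1, keepdims=True).shape)

# A tuple of axes reduces over several axes at once: one total per plane
per_plane = cube.sum(axis=(1, 2))
print(per_plane)
```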

In [51]:
np.mean(a5)

6.5

In [52]:
a5 + 2

array([[[ 2,  3,  4],
        [12, 12, 12]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [53]:
a5-2

array([[[-2, -1,  0],
        [ 8,  8,  8]],

       [[ 3,  4,  5],
        [ 6,  7,  8]]])

In [54]:
a5*8

array([[[ 0,  8, 16],
        [80, 80, 80]],

       [[40, 48, 56],
        [64, 72, 80]]])

In [55]:
a5/2

array([[[0. , 0.5, 1. ],
        [5. , 5. , 5. ]],

       [[2.5, 3. , 3.5],
        [4. , 4.5, 5. ]]])

##### Broadcasting

Broadcasting replicates the values of one or more operands along other dimensions so that a mathematical operation can be performed on arrays of different shapes.

In [56]:
a6 = np.array([[[1,2,3,4],[5,6,7,8],[9,10,11,12]],[[13,14,15,16],[17,18,19,20],[21,22,23,24]]])
a6

array([[[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]],

       [[13, 14, 15, 16],
        [17, 18, 19, 20],
        [21, 22, 23, 24]]])

In [57]:
a7 = np.array([1,2,3]).reshape(3,1)
a7

array([[1],
       [2],
       [3]])

In [58]:
# To broadcast a6 and a7 we compare their shapes from the last dimension backward. Two dimensions are
# compatible if they are equal or if one of them is 1.
# a6 is (2,3,4) and a7 is (3,1): from the back, 4 vs 1 is allowed and 3 = 3
a6+a7

array([[[ 2,  3,  4,  5],
        [ 7,  8,  9, 10],
        [12, 13, 14, 15]],

       [[14, 15, 16, 17],
        [19, 20, 21, 22],
        [24, 25, 26, 27]]])
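The shape check described above can be made explicit. In this sketch `np.broadcast` reports the resulting shape and `np.broadcast_to` shows the stretched operand; the array names are illustrative, and the final case is an assumed counter-example of shapes that do not match.

```python
import numpy as np

big = np.arange(1, 25).reshape(2, 3, 4)
column = np.array([1, 2, 3]).reshape(3, 1)

# Shapes are aligned from the right: (2, 3, 4) against (3, 1)
#   4 vs 1  -> compatible, the 1 is stretched to 4
#   3 vs 3  -> compatible, equal
#   2 vs -  -> compatible, the missing axis is treated as 1
print(np.broadcast(big, column).shape)

# broadcast_to shows the stretched operand explicitly (as a read-only view)
stretched = np.broadcast_to(column, big.shape)
print(np.array_equal(big + column, big + stretched))

# A pair of lengths that is neither equal nor 1 raises ValueError
try:
    big + np.array([1, 2, 3])      # 4 vs 3 in the last axis
except ValueError as err:
    print("cannot broadcast:", err)
```

No data is actually replicated during broadcasting; NumPy only behaves as if the smaller operand had been repeated.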

In [61]:
a = np.array([1,9,5])
b = np.array([2,5,90])

In [62]:
a+b

array([ 3, 14, 95])

In [63]:
# we can add numbers to arrays
c = a+100
c

array([101, 109, 105])

In [64]:
# we can check conditions on the numpy array
c[c>105]

array([109])

In [65]:
Arr = np.random.randint(50, 500, size = (4, 6))

In [66]:
Arr

array([[404,  56, 176, 346, 487, 342],
       [ 94, 412, 413, 381, 323, 145],
       [261, 413, 267, 468, 115, 494],
       [446, 387, 217, 355, 122,  56]])

In [67]:
# logical Condition checking
Arr[(Arr>=350) & (Arr<=400)]

array([381, 387, 355])
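Conditions on arrays are combined with the element-wise operators `&` (and), `|` (or) and `~` (not), and each comparison needs its own parentheses because these operators bind more tightly than `>=` or `<=`. A small sketch on fixed values (chosen here so the result does not depend on a random draw):

```python
import numpy as np

values = np.array([404, 56, 176, 346, 487, 342])

in_range = (values >= 300) & (values <= 450)   # element-wise AND
outside = ~in_range                            # element-wise NOT
extreme = (values < 100) | (values > 450)      # element-wise OR

print(values[in_range], values[outside], values[extreme])
```

Python's `and` / `or` keywords do not work here, since they try to reduce a whole array to a single truth value.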

#### Functions related to shape 

In [68]:
d = np.array([1,3,5,12,6,7,8,9,6,4,4,14])

In [69]:
# we can reshape into any shape whose dimensions multiply to the number of elements, e.g. 12 = 2*6, 3*4, 4*3
d.reshape(2,6)

array([[ 1,  3,  5, 12,  6,  7],
       [ 8,  9,  6,  4,  4, 14]])

In [70]:
d.reshape(3,4)

array([[ 1,  3,  5, 12],
       [ 6,  7,  8,  9],
       [ 6,  4,  4, 14]])

In [71]:
# we can transpose the array so that rows become columns and vice versa
d.reshape(3,4).T

array([[ 1,  6,  6],
       [ 3,  7,  4],
       [ 5,  8,  4],
       [12,  9, 14]])
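Two details of `reshape` are worth a short sketch: a dimension given as `-1` is inferred from the element count, and a shape that does not match the element count raises an error. The array `d` is the same one used above.

```python
import numpy as np

d = np.array([1, 3, 5, 12, 6, 7, 8, 9, 6, 4, 4, 14])

auto_cols = d.reshape(3, -1)    # -1 lets NumPy infer the missing length (4 here)
transposed = auto_cols.T        # axes swapped: (3, 4) becomes (4, 3)
flat = transposed.ravel()       # back to 1-D, reading the transposed rows in order

print(auto_cols.shape, transposed.shape, flat.shape)

try:
    d.reshape(5, 3)             # 12 elements cannot fill 15 slots
except ValueError as err:
    print("reshape failed:", err)
```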

#### Combining Data 

In [72]:
J = np.random.randint( 10, size = (6,9))
K = np.random.randint( 100, size = (6,2))
L = np.random.randint( 800, size = (4,9))

In [73]:
J

array([[0, 2, 8, 8, 0, 9, 0, 7, 2],
       [0, 0, 8, 2, 7, 4, 5, 8, 6],
       [7, 2, 5, 1, 8, 7, 2, 1, 1],
       [1, 1, 9, 9, 9, 3, 9, 3, 0],
       [6, 4, 1, 9, 7, 4, 7, 7, 4],
       [9, 4, 5, 9, 3, 5, 2, 7, 1]])

In [74]:
K

array([[52, 13],
       [22, 50],
       [27, 98],
       [54,  2],
       [88, 62],
       [52, 93]])

In [75]:
L

array([[ 24, 465,  92, 672, 167, 330, 260, 157, 558],
       [214, 213, 758, 587, 377, 249, 268, 517, 660],
       [ 98, 176, 349,  14, 624, 658,  23, 535, 515],
       [654, 530, 799, 492, 292, 313, 432, 760, 342]])

In [76]:
# Horizontal combining: the arrays must have the same number of rows
np.hstack([J,K])

array([[ 0,  2,  8,  8,  0,  9,  0,  7,  2, 52, 13],
       [ 0,  0,  8,  2,  7,  4,  5,  8,  6, 22, 50],
       [ 7,  2,  5,  1,  8,  7,  2,  1,  1, 27, 98],
       [ 1,  1,  9,  9,  9,  3,  9,  3,  0, 54,  2],
       [ 6,  4,  1,  9,  7,  4,  7,  7,  4, 88, 62],
       [ 9,  4,  5,  9,  3,  5,  2,  7,  1, 52, 93]])

In [77]:
# Vertical combining: the arrays must have the same number of columns
np.vstack([J,L])

array([[  0,   2,   8,   8,   0,   9,   0,   7,   2],
       [  0,   0,   8,   2,   7,   4,   5,   8,   6],
       [  7,   2,   5,   1,   8,   7,   2,   1,   1],
       [  1,   1,   9,   9,   9,   3,   9,   3,   0],
       [  6,   4,   1,   9,   7,   4,   7,   7,   4],
       [  9,   4,   5,   9,   3,   5,   2,   7,   1],
       [ 24, 465,  92, 672, 167, 330, 260, 157, 558],
       [214, 213, 758, 587, 377, 249, 268, 517, 660],
       [ 98, 176, 349,  14, 624, 658,  23, 535, 515],
       [654, 530, 799, 492, 292, 313, 432, 760, 342]])
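For 2-D arrays, `hstack` and `vstack` correspond to `np.concatenate` along axis 1 and axis 0. The sketch below uses constant-filled arrays with the same shapes as `J`, `K` and `L` above (an assumption made so the results are deterministic) and shows what happens when the shapes do not line up.

```python
import numpy as np

J = np.zeros((6, 9), dtype=int)
K = np.ones((6, 2), dtype=int)
L = np.full((4, 9), 2)

side_by_side = np.concatenate([J, K], axis=1)   # same result as np.hstack([J, K])
stacked = np.concatenate([J, L], axis=0)        # same result as np.vstack([J, L])
print(side_by_side.shape, stacked.shape)

try:
    np.hstack([J, L])    # 6 rows vs 4 rows: the row counts differ
except ValueError as err:
    print("hstack failed:", err)
```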