
**What** **is** **NumPy** **?**

NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multi-dimensional array object, various derived objects (such as masked arrays and matrices) and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.
At the core of the NumPy package is the ndarray object, which encapsulates n-dimensional arrays of homogeneous data types.

**NumPy** **Arrays** **VS** **Python** **Sequences**

1. NumPy arrays have a fixed size at creation, unlike Python lists (which can grow dynamically). Changing the size of an ndarray will create a new array and delete the original.
2. The elements in a NumPy array are all required to be of the same data type and thus will be the same size in memory.
3. NumPy arrays facilitate advanced mathematical and other types of operations on large numbers of data. Typically, such operations are executed more efficiently and with less code than is possible using Python's built-in sequences.
4. A growing plethora of scientific and mathematical Python-based packages are using NumPy arrays; though these typically support Python-sequence input, they convert such input to NumPy arrays prior to processing and they often output NumPy arrays.
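
The first three points above can be illustrated with a minimal sketch. The variable names are illustrative only and do not come from the original notebook.

```python
import numpy as np

# A Python list can grow in place.
py_list = [1, 2, 3]
py_list.append(4)

# A NumPy array has a single dtype; mixed int/float input is upcast to float.
arr = np.array([1, 2, 3.5])
print(arr.dtype)

# np.append does not grow the array in place; it returns a new array.
bigger = np.append(arr, 4.0)
print(arr.size, bigger.size)

# Element-wise arithmetic needs no explicit loop.
doubled_list = [x * 2 for x in py_list]
doubled_arr = np.array(py_list) * 2
print(doubled_list, doubled_arr)
```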

**Creating** **NumPy** **Arrays**

In [1]:
# np.array
import numpy as np

a=np.array([1,2,3])
print(a)

[1 2 3]


In [6]:
# 2D and 3D
b=np.array([[1,2,3],[4,5,6]])
print(b)

[[1 2 3]
 [4 5 6]]


In [7]:
c=np.array([[[1,2],[3,4]],[[5,6],[7,8]]])
print(c)

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]


In [8]:
# dtype
np.array([1,2,3],dtype=float)

array([1., 2., 3.])

In [15]:
# np.arange (like Python's range(), but returns an array)
np.arange(1,11)
np.arange(1,12,2)

array([ 1,  3,  5,  7,  9, 11])

In [17]:
# with reshape
np.arange(16).reshape(2,2,2,2)

array([[[[ 0,  1],
         [ 2,  3]],

        [[ 4,  5],
         [ 6,  7]]],


       [[[ 8,  9],
         [10, 11]],

        [[12, 13],
         [14, 15]]]])

In NumPy, the 'reshape()' method changes the shape of an array without changing its data. It returns an array with the specified shape, which is a view of the original data whenever possible rather than a copy; the new shape must hold the same total number of elements. This is useful for rearranging data into different dimensions or for preparing data for operations that require a specific shape. For example, one can reshape a 1D array into a 2D array or vice versa.
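
A small sketch of how reshape behaves, assuming a freshly created contiguous array (the names `flat` and `grid` are illustrative):

```python
import numpy as np

flat = np.arange(12)

# -1 asks NumPy to infer that dimension from the total size (12 / 3 = 4).
grid = flat.reshape(3, -1)
print(grid.shape)

# For a contiguous array like this one, reshape returns a view,
# so writing through the reshaped array changes the original too.
grid[0, 0] = 99
print(flat[0])

# The new shape must hold exactly the same number of elements.
try:
    flat.reshape(5, 3)
except ValueError as err:
    print("reshape failed:", err)
```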

In [21]:
# np.ones() and np.zeros()
np.ones((3,4))

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [22]:
np.zeros((3,4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

"np.ones()" and "np.zeros()" are NumPy functions used to create arrays filled with ones and zeros, respectively.

1. np.ones(shape,dtype=None,order='C'): This function creates a new array of the specified shape and fills it with ones. The 'dtype' argument specifies the data type of the array elements, such as 'int', 'float', etc. If not specified, it defaults to 'float64'. The 'order' argument specifies whether the array is stored in row-major ('C') or column-major ('F') order.

2. np.zeros(shape,dtype=float,order='C'): This function creates a new array of the specified shape and fills it with zeros. It takes the same 'dtype' and 'order' arguments.
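
As a brief illustration of the 'dtype' argument, the sketch below creates integer and boolean arrays. `np.full` is not covered in this notebook; it is shown only as a related constructor for an arbitrary fill value.

```python
import numpy as np

int_ones = np.ones((2, 3), dtype=int)    # integers instead of the float64 default
bool_zeros = np.zeros(4, dtype=bool)     # zeros become False
print(int_ones)
print(bool_zeros)

# np.full fills a new array with any constant value.
sevens = np.full((2, 2), 7)
print(sevens)
```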

In [23]:
# np.random.random()
np.random.random((3,4))


array([[0.24225469, 0.54299108, 0.99941536, 0.10234544],
       [0.73486207, 0.28672808, 0.80212028, 0.98487669],
       [0.66454971, 0.37985053, 0.26652714, 0.70781316]])

In NumPy, "np.random.random()" generates random numbers drawn from a uniform distribution over the half-open interval [0.0, 1.0), so 0 can occur but 1 cannot.
Here, "np.random.random((3,4))" creates a 2D array with 3 rows and 4 columns in which each element is such a random number. The values differ on every run unless a seed is set.
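
For reproducible results, one option is NumPy's Generator interface, sketched below. This is an alternative to the "np.random.random()" call used in this notebook, and the seed value 42 is an arbitrary assumption.

```python
import numpy as np

# Assumption: the seed 42 is arbitrary and chosen only for illustration.
rng = np.random.default_rng(42)
sample = rng.random((3, 4))
print(sample.shape)

# Every value lies in the half-open interval [0.0, 1.0).
print(sample.min() >= 0.0, sample.max() < 1.0)

# Re-creating the generator with the same seed reproduces the same numbers.
again = np.random.default_rng(42).random((3, 4))
print(np.array_equal(sample, again))
```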

In [24]:
# np.linspace()
np.linspace(-10,10,10,dtype=int)

array([-10,  -8,  -6,  -4,  -2,   1,   3,   5,   7,  10])

In [26]:
np.linspace(-30,30,10,dtype=float)

array([-30.        , -23.33333333, -16.66666667, -10.        ,
        -3.33333333,   3.33333333,  10.        ,  16.66666667,
        23.33333333,  30.        ])

In NumPy, "np.linspace()" is a function used to create an array of evenly spaced numbers over a specified interval. The syntax is 'np.linspace(start,stop,num=50,endpoint=True,retstep=False,dtype=None,axis=0)'. Here's what each parameter means:

1. 'start': The starting value of the sequence.
2. 'stop': The end value of the sequence.
3. 'num': The number of evenly spaced samples to generate. Default is 50.
4. 'endpoint': If True (default), 'stop' is the last value in the range. If False, it is not included.
5. 'retstep': If True, return a tuple of the samples and the step size between them.
6. 'dtype': The data type of the output array. If not specified, it is inferred from the input arguments. With an integer dtype, as in the first example above, the samples are rounded down to integers, which is why the spacing there looks uneven.
7. 'axis': The axis in the result along which the 'linspace' samples are stored. The default is 0. It only matters when 'start' or 'stop' are array-like.
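
The effect of 'endpoint' and 'retstep' can be seen in a short sketch (the variable names are illustrative):

```python
import numpy as np

# endpoint=True (default): 5 samples from 0 to 1, both ends included.
closed = np.linspace(0, 1, 5)
print(closed)

# endpoint=False: 'stop' is excluded, so the step shrinks from 0.25 to 0.2.
half_open = np.linspace(0, 1, 5, endpoint=False)
print(half_open)

# retstep=True returns a (samples, step) tuple.
samples, step = np.linspace(0, 1, 5, retstep=True)
print(step)
```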


In [27]:
# np.identity()
np.identity(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [28]:
np.identity(2)

array([[1., 0.],
       [0., 1.]])

In NumPy, "np.identity(n)" creates a square identity matrix of size n×n. An identity matrix is a square matrix where all elements on the main diagonal are 1 and all other elements are 0. It is denoted by "I" and is analogous to the number 1 in scalar algebra. This function is useful in linear algebra operations and matrix computations.
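
The analogy with the number 1 can be checked directly: multiplying a matrix by the identity leaves it unchanged. The matrix `A` below is an arbitrary example.

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3)
I = np.identity(3)

# Matrix-multiplying by the identity on either side gives back A.
print(np.array_equal(A @ I, A))
print(np.array_equal(I @ A, A))
```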

**Array** **Attributes**

In [30]:
a1=np.arange(10,dtype=np.int32)
a1

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)

In [31]:
a2=np.arange(12,dtype=float).reshape(3,4)
a2

array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.]])

In [33]:
a3=np.arange(8).reshape(2,2,2)
a3

array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])

In [34]:
# ndim
a3.ndim

3

In NumPy, the "ndim" attribute of an ndarray (NumPy array) returns the number of dimensions of the array. For example, "ndim" is 1 for a 1D array, 2 for a 2D array, and so on. It is useful for checking the dimensionality of an array programmatically.

In [35]:
# shape
a3.shape

(2, 2, 2)

In NumPy, the "shape" attribute of an array provides information about the dimensions of the array. It returns a tuple representing the size of each dimension of the array. For a 1D array, the shape tuple contains only one element indicating the number of elements in that array. For multi-dimensional arrays, the shape tuple contains the size of each dimension in the order they appear in the array.
For example:
1. A 1D array with 5 elements will have a shape of (5,)
2. A 2D array with 3 rows and 4 columns will have a shape of (3,4)
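
A quick sketch tying "ndim" and "shape" together, using arrays like the ones above (the dictionary is only a convenience for the loop):

```python
import numpy as np

arrays = {
    "1D": np.arange(5),
    "2D": np.arange(12).reshape(3, 4),
    "3D": np.arange(8).reshape(2, 2, 2),
}

for name, arr in arrays.items():
    # ndim always equals the length of the shape tuple.
    print(name, arr.ndim, arr.shape)
```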

In [36]:
# size
a2.size

12

In NumPy, the "size" attribute is used to get the number of elements in an array. It returns an integer representing the total number of elements across all dimensions of the array, which is the product of the entries in "shape". This can be useful for operations and calculations where one needs to know the total number of elements in an array.

In [37]:
# itemsize
a3.itemsize

8

In NumPy, the "itemsize" attribute of an array returns the size of each element in bytes. This can be useful when working with memory allocation or when one needs to know the memory footprint of an array.
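
The memory footprint of the data follows from "size" and "itemsize": the "nbytes" attribute equals their product. A small sketch comparing three dtypes for the same shape:

```python
import numpy as np

shape = (3, 4)
footprints = {}
for dtype in (np.int8, np.int32, np.float64):
    arr = np.zeros(shape, dtype=dtype)
    # nbytes is size * itemsize: the bytes used by the array's elements.
    footprints[arr.dtype.name] = (arr.size, arr.itemsize, arr.nbytes)

print(footprints)
```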

In [38]:
# dtype
print(a1.dtype)
print(a2.dtype)
print(a3.dtype)

int32
float64
int64


In NumPy, the "dtype" attribute reports the data type of the elements in an ndarray (NumPy array). The same name is also accepted as a parameter by array-creation functions such as "numpy.array()" or "numpy.zeros()", where it lets one explicitly define the type of data to store, such as integers, floats or custom data types. Choosing an appropriate data type can reduce memory usage and avoid unintended conversions.

**Changing** **Datatype**

In [39]:
# astype
a3.astype(np.int32)

array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]], dtype=int32)

In NumPy, the "astype" method casts an array to a specified data type and returns the result as a new array, leaving the original unchanged. This is useful when one wants to change the data type of an array to perform specific operations or to ensure compatibility with other functions or libraries.
For example, one can use "astype" to convert an array of integers to floating point numbers or to change the precision of the numbers in the array.
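
A short sketch of what casting does to the values (the sample numbers are arbitrary):

```python
import numpy as np

floats = np.array([1.9, -1.9, 2.5])

# Casting float to int drops the fractional part (truncates toward zero).
ints = floats.astype(np.int64)
print(ints)

# astype returns a new array; the original keeps its dtype and values.
print(floats.dtype, ints.dtype)

# Lower precision: float16 uses 2 bytes per element instead of 8.
half = floats.astype(np.float16)
print(half.itemsize)
```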

**Array** **Operations**

In [41]:
a1=np.arange(12).reshape(3,4)
a1

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [42]:
a2=np.arange(12,24).reshape(3,4)
a2

array([[12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])

In [43]:
# scalar operations
# arithmetic
a1 * 2

array([[ 0,  2,  4,  6],
       [ 8, 10, 12, 14],
       [16, 18, 20, 22]])

In [44]:
a1 ** 2

array([[  0,   1,   4,   9],
       [ 16,  25,  36,  49],
       [ 64,  81, 100, 121]])

In [45]:
# relational
a2==15

array([[False, False, False,  True],
       [False, False, False, False],
       [False, False, False, False]])

In [46]:
a2>15

array([[False, False, False, False],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]])

In [47]:
a2<15

array([[ True,  True,  True, False],
       [False, False, False, False],
       [False, False, False, False]])

In [48]:
a2>=15

array([[False, False, False,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]])

In [49]:
a2<=15

array([[ True,  True,  True,  True],
       [False, False, False, False],
       [False, False, False, False]])

In [50]:
a2!=15

array([[ True,  True,  True, False],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]])

In [51]:
# vector operations
# arithmetic
a1 * a2

array([[  0,  13,  28,  45],
       [ 64,  85, 108, 133],
       [160, 189, 220, 253]])

In [52]:
a1 ** a2

array([[                   0,                    1,                16384,
                    14348907],
       [          4294967296,         762939453125,      101559956668416,
           11398895185373143],
       [ 1152921504606846976, -1261475310744950487,  1864712049423024128,
         6839173302027254275]])
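
The negative entry in the last row of the output above is a sign of integer overflow: 9 raised to the power 21 does not fit in a signed 64-bit integer, so the result wraps around without raising an error. A minimal sketch, assuming an int64 array:

```python
import numpy as np

base = np.array([9], dtype=np.int64)

# 9**21 exceeds the int64 range, so the array result wraps around.
wrapped = base ** 21
exact = 9 ** 21            # Python ints have arbitrary precision
print(wrapped[0], exact)

# Floating point avoids the wrap-around at the cost of exactness.
approx = base.astype(np.float64) ** 21
print(approx[0])
```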

**Array** **Functions**

In [53]:
a1=np.random.random((3,3))
a1

array([[0.96480695, 0.36418794, 0.92597947],
       [0.96457649, 0.93214459, 0.57808031],
       [0.46179747, 0.32385045, 0.06119029]])

In [54]:
a1=np.round(a1*100)
a1

array([[96., 36., 93.],
       [96., 93., 58.],
       [46., 32.,  6.]])

In [None]:
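# axis=0 -> reduce down the rows, giving one result per column
# axis=1 -> reduce across the columns, giving one result per row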
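
The axis convention is easiest to see on a small array with known sums. The array `m` below is an illustrative example, not part of the original notebook.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

col_sums = np.sum(m, axis=0)   # collapse the rows: one value per column
row_sums = np.sum(m, axis=1)   # collapse the columns: one value per row
total = np.sum(m)              # no axis: reduce over every element
print(col_sums, row_sums, total)
```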

In [55]:
# max
np.max(a1,axis=1)

array([96., 96., 46.])

In [56]:
# min
np.min(a1,axis=0)

array([46., 32.,  6.])

In [57]:
# sum
np.sum(a1,axis=1)

array([225., 247.,  84.])

In [58]:
# prod
np.prod(a1,axis=0)

array([423936., 107136.,  32364.])

In [59]:
# mean
np.mean(a1,axis=1)

array([75.        , 82.33333333, 28.        ])

In [60]:
# median
np.median(a1,axis=0)

array([96., 36., 58.])

In [61]:
# std
np.std(a1,axis=1)

array([27.60434748, 17.24979871, 16.57307053])

In [62]:
# var
np.var(a1,axis=0)

array([ 555.55555556,  776.22222222, 1277.55555556])

In [63]:
# trigonometric functions
np.sin(a1)

array([[ 0.98358775, -0.99177885, -0.94828214],
       [ 0.98358775, -0.94828214,  0.99287265],
       [ 0.90178835,  0.55142668, -0.2794155 ]])

In [64]:
np.cos(a1)

array([[-0.18043045, -0.12796369,  0.3174287 ],
       [-0.18043045,  0.3174287 ,  0.11918014],
       [-0.43217794,  0.83422336,  0.96017029]])

In [65]:
np.tan(a1)

array([[-5.45134011,  7.75047091, -2.98738626],
       [-5.45134011, -2.98738626,  8.33085685],
       [-2.08661353,  0.66100604, -0.29100619]])

In [67]:
# dot product
a2=np.arange(12).reshape(3,4)
a3=np.arange(12,24).reshape(4,3)

np.dot(a2,a3)

array([[114, 120, 126],
       [378, 400, 422],
       [642, 680, 718]])

In [68]:
# exponential (np.log works element-wise in the same way)
np.exp(a1)

array([[4.92345829e+41, 4.31123155e+15, 2.45124554e+40],
       [4.92345829e+41, 2.45124554e+40, 1.54553894e+25],
       [9.49611942e+19, 7.89629602e+13, 4.03428793e+02]])

In [69]:
# round
np.round(np.random.random((3,4))*100)

array([[49., 55., 52., 85.],
       [ 5., 89., 22., 75.],
       [96., 32.,  4., 22.]])

In [72]:
arr1=np.array([1.2,2.7,3.5,4.9])
rounded_int=np.round(arr1)
print(rounded_int)

[1. 3. 4. 5.]


In [73]:
arr2=np.array([1.234,2.567,3.891])
rounded_decimals=np.round(arr2,decimals=2)
print(rounded_decimals)

[1.23 2.57 3.89]


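In NumPy, the "round" function rounds the elements of an array to the nearest integer or to a specified number of decimals. Values exactly halfway between two candidates are rounded to the nearest even value, which is why 3.5 becomes 4 above.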
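
Halfway cases in "np.round" go to the nearest even value rather than always up, as this small sketch shows. The negative 'decimals' value is an extra illustration of rounding to the left of the decimal point.

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])

# Halfway cases go to the nearest even value: 0, 2, 2, 4.
rounded_halves = np.round(halves)
print(rounded_halves)

# A negative 'decimals' rounds to tens, hundreds, and so on.
hundreds = np.round(np.array([1234.0]), decimals=-2)
print(hundreds)
```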

In [70]:
# floor
np.floor(np.random.random((2,3))*100)

array([[33., 65., 88.],
       [46., 75., 59.]])

In [74]:
arr3=np.array([1.5,2.7,3.1,4.9])
floor_arr=np.floor(arr3)
print(floor_arr)

[1. 2. 3. 4.]


In NumPy, the "floor" function is used to round down each element of an array to the nearest integer. It returns the largest integer not greater than the input value.

In [71]:
# ceil
np.ceil(np.random.random((3,4))*100)

array([[74., 82., 23., 20.],
       [18., 65., 57., 72.],
       [18., 52., 33., 45.]])

In NumPy, the "ceil" function is used to compute the ceiling of each element in an array. The ceiling of a number is the smallest integer greater than or equal to that number.
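
The difference between "floor" and "ceil" is clearest on negative numbers. The sketch below also uses `np.trunc`, which is not covered in this notebook, for comparison: it simply drops the fractional part.

```python
import numpy as np

vals = np.array([-1.5, -0.2, 0.2, 1.5])

floored = np.floor(vals)     # always rounds toward negative infinity
ceiled = np.ceil(vals)       # always rounds toward positive infinity
truncated = np.trunc(vals)   # rounds toward zero
print(floored, ceiled, truncated)
```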

**Indexing** **and** **Slicing**

In [4]:
import numpy as np
a1=np.arange(10)
a1

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [12]:
a1[2:7:4]

array([2, 6])

In [5]:
a2=np.arange(12).reshape(3,4)
a2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [26]:
a2[1:,1:3]

array([[ 5,  6],
       [ 9, 10]])

In [22]:
a2[0,:]

array([0, 1, 2, 3])

In [23]:
a2[1,:]

array([4, 5, 6, 7])

In [24]:
a2[2,:]

array([ 8,  9, 10, 11])

In [25]:
a2[:,2]

array([ 2,  6, 10])

In [7]:
a2[1,0]

4

In [14]:
a2[0:2,1::2]

array([[1, 3],
       [5, 7]])

In [15]:
a2[::2,1::2]

array([[ 1,  3],
       [ 9, 11]])

In [19]:
a2[1,::3]

array([4, 7])

In [6]:
a3=np.arange(8).reshape(2,2,2)
a3

array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])

In [8]:
a3[1,0,1]

5

In [9]:
a3[1,1,0]

6

In [27]:
a3=np.arange(27).reshape(3,3,3)
a3

array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]],

       [[18, 19, 20],
        [21, 22, 23],
        [24, 25, 26]]])

In [30]:
a3[0,1,:]

array([3, 4, 5])

In [28]:
a3[::2,0,0::2]

array([[ 0,  2],
       [18, 20]])

In [29]:
a3[2,1:,1:]

array([[22, 23],
       [25, 26]])

**Iterating**

In [31]:
a1
for i in a1:
  print(i)

0
1
2
3
4
5
6
7
8
9


In [32]:
a2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [34]:
for i in a2:
  print(i)

[0 1 2 3]
[4 5 6 7]
[ 8  9 10 11]


In [35]:
a3

array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]],

       [[18, 19, 20],
        [21, 22, 23],
        [24, 25, 26]]])

In [36]:
for i in a3:
  print(i)

[[0 1 2]
 [3 4 5]
 [6 7 8]]
[[ 9 10 11]
 [12 13 14]
 [15 16 17]]
[[18 19 20]
 [21 22 23]
 [24 25 26]]


In [37]:
for i in np.nditer(a3):
  print(i)

0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26


In [38]:
for i in np.nditer(a2):
  print(i)

0
1
2
3
4
5
6
7
8
9
10
11


**Reshaping**

In [40]:
# transpose()
np.transpose(a2)

array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])

In [41]:
a2.T

array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])

In [42]:
# ravel()
a3.ravel()

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26])

In NumPy, the "ravel()" function flattens a multi-dimensional array into a 1D array. It concatenates all the elements in row-major order by default and returns a view of the original data whenever possible.
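
A sketch comparing "ravel()" with the related "flatten()" method, which is not used in this notebook and always returns a copy. The array `grid` is an illustrative example.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)

raveled = grid.ravel()      # a view here, because grid is contiguous
flattened = grid.flatten()  # always an independent copy

raveled[0] = 100            # visible in grid
flattened[1] = 200          # not visible in grid
print(grid)
```

When the flattened result will be modified independently of the source array, "flatten()" or an explicit ".copy()" is the safer choice.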

**Stacking**

In [44]:
# horizontal stacking
a4=np.arange(12).reshape(3,4)
a4

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [45]:
a5=np.arange(12,24).reshape(3,4)
a5

array([[12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])

In [47]:
np.hstack((a4,a5))

array([[ 0,  1,  2,  3, 12, 13, 14, 15],
       [ 4,  5,  6,  7, 16, 17, 18, 19],
       [ 8,  9, 10, 11, 20, 21, 22, 23]])

In [48]:
np.hstack((a4,a5,a4,a5))

array([[ 0,  1,  2,  3, 12, 13, 14, 15,  0,  1,  2,  3, 12, 13, 14, 15],
       [ 4,  5,  6,  7, 16, 17, 18, 19,  4,  5,  6,  7, 16, 17, 18, 19],
       [ 8,  9, 10, 11, 20, 21, 22, 23,  8,  9, 10, 11, 20, 21, 22, 23]])

In [49]:
# vertical stacking
np.vstack((a4,a5))

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])

**Splitting**

In [50]:
# horizontal splitting
a4

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [51]:
np.hsplit(a4,2)

[array([[0, 1],
        [4, 5],
        [8, 9]]),
 array([[ 2,  3],
        [ 6,  7],
        [10, 11]])]

In [52]:
# vertical splitting
a5

array([[12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])

In [53]:
np.vsplit(a5,3)

[array([[12, 13, 14, 15]]),
 array([[16, 17, 18, 19]]),
 array([[20, 21, 22, 23]])]