
### What is NumPy?

NumPy is the fundamental package for <u>scientific computing</u> in Python. It is a Python library that provides a <u>multidimensional array object</u>, various derived objects (such as masked arrays and matrices), and an assortment of routines for <u>fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more.</u>


At the core of the NumPy package is the <u>ndarray object</u>. It encapsulates n-dimensional arrays of homogeneous data types.

### NumPy Arrays vs Python Sequences

- NumPy arrays have a fixed size at creation, unlike Python lists (which can grow dynamically). <u>Changing the size of an ndarray will create a new array and delete the original.</u>

- The elements in a NumPy array are all required to be of the same data type, and thus will be the same size in memory.

- NumPy arrays facilitate advanced mathematical and other types of operations on large numbers of data. Typically, such operations are executed more efficiently and with less code than is possible using Python’s built-in sequences.

- A growing plethora of scientific and mathematical Python-based packages are using NumPy arrays; though these typically support Python-sequence input, they convert such input to NumPy arrays prior to processing, and they often output NumPy arrays.
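
The points above can be seen in a short sketch. The variable names are illustrative only, and `np.append` is used here just to show that "growing" an array returns a new object rather than resizing the original.

```python
import numpy as np

py_list = [1, 2, 3]
arr = np.array([1, 2, 3])

# A list grows in place; np.append returns a new array instead.
py_list.append(4)
bigger = np.append(arr, 4)
print(len(py_list), arr.size, bigger.size)   # 4 3 4

# All elements share one dtype; mixed input is upcast to a common type.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)                           # float64

# Arithmetic applies element-wise without an explicit loop.
doubled_list = [x * 2 for x in py_list]
doubled_arr = arr * 2
print(doubled_list, doubled_arr)             # [2, 4, 6, 8] [2 4 6]
```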

### Creating NumPy Arrays

In [3]:
# np.array(list): build an array from a Python list
import numpy as np

a = np.array([1,2,3])
print(a)
print(type(a))

[1 2 3]
<class 'numpy.ndarray'>


In [6]:
# 2D and 3D / matrix and tensor
b = np.array([[1,2,3],[4,5,6]])
print(b)

[[1 2 3]
 [4 5 6]]


In [7]:
c = np.array([[[1,2],[3,4]],[[5,6],[7,8]]])
print(c)

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]


In [8]:
# dtype: parameter
np.array([1,2,3],dtype=float)

array([1., 2., 3.])

In [34]:
# np.arange(stop): integers from 0 up to stop (excluded)
# np.arange(start, stop, step): stop is excluded
np.arange(1,11,2)

array([1, 3, 5, 7, 9])

In [11]:
# with reshape(): Changing shape
print(np.arange(16).reshape(4, 4))        # 2D array with 4 rows and 4 cols
print(np.arange(16).reshape(2,2,2,2))     # 4D array of shape (2, 2, 2, 2)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
[[[[ 0  1]
   [ 2  3]]

  [[ 4  5]
   [ 6  7]]]


 [[[ 8  9]
   [10 11]]

  [[12 13]
   [14 15]]]]


In [12]:
# np.ones((rows, cols)) and np.zeros((rows, cols)): arrays filled with 1s or 0s, e.g. to initialize weights in a neural network
np.ones((3,4))

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [13]:
np.zeros((3,4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [14]:
# np.random.random((rows, cols)): uniform random floats in [0, 1)
np.random.random((3,4))

array([[0.93790714, 0.28932192, 0.66500717, 0.24581127],
       [0.05582401, 0.75781961, 0.3994496 , 0.51570715],
       [0.25237594, 0.25495549, 0.23466563, 0.08003344]])

In [17]:
# np.linspace(start, stop, num): num evenly spaced values with stop included; handy for plot axes in ML
print(np.linspace(-10,10,10))
print(np.linspace(-10,10,10,dtype=int))

[-10.          -7.77777778  -5.55555556  -3.33333333  -1.11111111
   1.11111111   3.33333333   5.55555556   7.77777778  10.        ]
[-10  -8  -6  -4  -2   1   3   5   7  10]
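
As a rough comparison of the two range helpers shown so far, the sketch below contrasts `np.arange` (fixed step, stop excluded) with `np.linspace` (fixed count, stop included). The `endpoint=False` argument is a standard `np.linspace` parameter; the variable names are made up for illustration.

```python
import numpy as np

# arange: fixed step, stop value excluded.
by_step = np.arange(0, 1, 0.25)
print(by_step)                 # [0.   0.25 0.5  0.75]

# linspace: fixed number of points, stop value included by default.
by_count = np.linspace(0, 1, 5)
print(by_count)                # [0.   0.25 0.5  0.75 1.  ]

# endpoint=False drops the stop value, matching the arange result above.
no_end = np.linspace(0, 1, 4, endpoint=False)
print(np.allclose(by_step, no_end))   # True
```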


In [18]:
# np.identity(n): create an n x n identity matrix
np.identity(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [19]:
# np.diag(array, k): extract the k-th diagonal of a 2D array
# k = 0 is the main diagonal, k > 0 is above it, k < 0 is below it

# matrix creation from a nested list
a = np.array([[1, 21, 30],
              [63, 434, 3],
              [54, 54, 56]])

print("Main Diagonal elements : \n", np.diag(a), "\n")

print("Diagonal above main diagonal : \n", np.diag(a, 1), "\n")

print("Diagonal below main diagonal : \n", np.diag(a, -1))

Main Diagonal elements : 
 [  1 434  56] 

Diagonal above main diagonal : 
 [21  3] 

Diagonal below main diagonal : 
 [63 54]
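
`np.diag` works in both directions, which the cell above does not show: given a 2D array it extracts a diagonal, and given a 1D array it builds a diagonal matrix. A minimal sketch with a made-up 3x3 array:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

d = np.diag(m)           # 2D input: extract the main diagonal
print(d)                 # [1 5 9]

rebuilt = np.diag(d)     # 1D input: build a diagonal matrix
print(rebuilt)
# [[1 0 0]
#  [0 5 0]
#  [0 0 9]]

print(np.diag(m, k=1))   # [2 6]: first diagonal above the main one
```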


### Array Attributes

In [20]:
a1 = np.arange(10,dtype=np.int32)
a2 = np.arange(12,dtype=float).reshape(3,4)
a3 = np.arange(8).reshape(2,2,2)

print(a1)
print(a2)
print(a3)

[0 1 2 3 4 5 6 7 8 9]
[[ 0.  1.  2.  3.]
 [ 4.  5.  6.  7.]
 [ 8.  9. 10. 11.]]
[[[0 1]
  [2 3]]

 [[4 5]
  [6 7]]]


In [22]:
# ndim: returns the number of dimensions of the array
print(a3.ndim)
print(a2.ndim)
print(a1.ndim)

3
2
1


In [30]:
# shape: returns the size along each dimension, e.g. (rows, cols) for a 2D array
print(a1.shape)
print(a2.shape)
print(a3.shape)

# for 3D -> (2, 2, 2) means (number of 2D blocks, rows, cols)

(10,)
(3, 4)
(2, 2, 2)


In [25]:
# size: returns total number of items
print(a1.size)
print(a2.size)
print(a3.size)

10
12
8


In [28]:
# itemsize: returns the number of bytes each element occupies in memory
print(a3.itemsize)
print(a2.itemsize)
print(a1.itemsize)

8
8
4


In [29]:
# dtype: returns datatype of elements
print(a1.dtype)
print(a2.dtype)
print(a3.dtype)

int32
float64
int64


### Changing Datatype

In [32]:
# astype: returns a copy with a new dtype, e.g. to reduce the memory used by dataset features
a3.astype(np.int32)

array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]], dtype=int32)
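
To make the memory saving concrete, the sketch below compares `itemsize` and `nbytes` before and after `astype`. The array length of 1000 is arbitrary, and the conversion is only safe here because every value fits in `int32`.

```python
import numpy as np

big = np.arange(1000, dtype=np.int64)
small = big.astype(np.int32)          # astype returns a new array

print(big.itemsize, big.nbytes)       # 8 8000
print(small.itemsize, small.nbytes)   # 4 4000
print(np.array_equal(big, small))     # True: all values fit in int32
```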

### Array Operations

In [35]:
a1 = np.arange(12).reshape(3,4)
a2 = np.arange(12,24).reshape(3,4)

print(a2)
print(a1)

[[12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]]
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]


In [36]:
# scalar operations

# arithmetic
a1 ** 2

array([[  0,   1,   4,   9],
       [ 16,  25,  36,  49],
       [ 64,  81, 100, 121]])

In [38]:
# relational
a2 >= 15

array([[False, False, False,  True],
       [ True,  True,  True,  True],
       [ True,  True,  True,  True]])

In [39]:
print(a1)
print(a2)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[[12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]]


In [43]:
# vector (element-wise) operations: here both arrays have the same shape
# arithmetic
a1 * a2

array([[  0,  13,  28,  45],
       [ 64,  85, 108, 133],
       [160, 189, 220, 253]])
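
Note that `*` between two arrays multiplies element by element; it is not a matrix product. A small sketch with made-up 2x3 arrays, including what happens when the shapes are incompatible:

```python
import numpy as np

x = np.arange(6).reshape(2, 3)
y = np.arange(6, 12).reshape(2, 3)

print(x + y)    # element-wise sum
print(x * y)    # element-wise product, not a matrix product

# Shapes (2, 3) and (3, 2) cannot be combined element by element.
try:
    x * np.arange(6).reshape(3, 2)
except ValueError as err:
    print("shape mismatch:", err)
```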

### Array Functions

In [44]:
a1 = np.random.random((3,3))
a1 = np.round(a1*100)
a1

array([[66., 46., 30.],
       [92., 52., 64.],
       [15., 73., 84.]])

In [47]:
# max/min/sum/prod
print(np.max(a1))
print(np.min(a1, axis = 1))
print(np.sum(a1))
print(np.sum(a1, axis = 0))
print(np.prod(a1))
# axis=0 -> one result per column, axis=1 -> one result per row
np.prod(a1,axis=0)

92.0
[30. 52. 15.]
522.0
[173. 171. 178.]
2565001197158400.0


array([ 91080., 174616., 161280.])
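
Because the matrix above is random, the effect of `axis` is easier to check on a small fixed array. A minimal sketch (the array `m` is made up for illustration):

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(np.sum(m))           # 21: all elements
print(np.sum(m, axis=0))   # [5 7 9]: one value per column
print(np.sum(m, axis=1))   # [ 6 15]: one value per row
print(np.max(m, axis=0))   # [4 5 6]
```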

In [49]:
# mean/median/std/var
print(np.mean(a1,axis=1))
print(np.median(a1,axis=1))
print(np.std(a1))
print(np.var(a1,axis=1))

[47.33333333 69.33333333 57.33333333]
[46. 64. 73.]
23.49940897601942
[216.88888889 280.88888889 916.22222222]


In [50]:
# trigonometric functions (less commonly needed)
np.sin(a1)

array([[-0.02655115,  0.90178835, -0.98803162],
       [-0.77946607,  0.98662759,  0.92002604],
       [ 0.65028784, -0.67677196,  0.73319032]])

# Dot Product Rules

For vectors **a** and **b** in ℝⁿ:

**Definition:**
- Algebraic: **a · b** = Σ (aᵢ bᵢ)

---

## Properties

1. **Commutative**
   - a · b = b · a

2. **Distributive**
   - a · (b + c) = a · b + a · c

3. **Scalar Multiplication**
   - (k·a) · b = k (a · b)

4. **Self Dot Product**
   - a · a = |a|²

5. **Zero Vector**
   - 0 · a = 0

6. **Angle Relation** (for nonzero a and b)
   - a · b > 0 ⇒ Acute angle  
   - a · b = 0 ⇒ Right angle  
   - a · b < 0 ⇒ Obtuse angle  

7. **Not Associative**
   - a · (b · c) ❌ (not defined: b · c is a scalar, not a vector)
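
The first four properties can be spot-checked numerically. The vectors and the scalar `k` below are arbitrary values chosen for illustration; `np.isclose` is used instead of `==` to allow for floating-point rounding.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, -5.0, 6.0])
c = np.array([-2.0, 0.5, 1.0])
k = 3.0

print(np.dot(a, b))                                               # 12.0 (4 - 10 + 18)
print(np.isclose(np.dot(a, b), np.dot(b, a)))                     # commutative
print(np.isclose(np.dot(a, b + c), np.dot(a, b) + np.dot(a, c)))  # distributive
print(np.isclose(np.dot(k * a, b), k * np.dot(a, b)))             # scalar multiplication
print(np.isclose(np.dot(a, a), np.linalg.norm(a) ** 2))           # a · a = |a|²
```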


# Example: Matrix Product (2×3) · (3×2) = (2×2)

### Step 1: Define the matrices
A =  
⎡ a   b   c ⎤  
⎣ d   e   f ⎦   (2×3)

B =  
⎡ x   y ⎤  
⎢ z   w ⎥  
⎣ m   n ⎦   (3×2)

---

### Step 2: Compute the product
C = A · B = (2×2 matrix)

C =  
[ a·x + b·z + c·m     a·y + b·w + c·n ]  
[ d·x + e·z + f·m     d·y + e·w + f·n ]

---

### Step 3: Numeric Example
Let  

A =  
 [ 1   2   3 ]  
 [ 4   5   6 ]  

B =  
 [  7    8 ]  
 [  9   10 ]  
 [ 11   12 ]  

Then  

A · B =  
[ (1·7 + 2·9 + 3·11)    (1·8 + 2·10 + 3·12) ]  
[ (4·7 + 5·9 + 6·11)    (4·8 + 5·10 + 6·12) ]  
=  
[ 58    64 ]  
[139   154 ]
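
The hand calculation above can be confirmed with `np.dot`. This is just a check of the numeric example, using the same A and B:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.array([[7, 8],
              [9, 10],
              [11, 12]])

C = np.dot(A, B)     # same result as A @ B for 2D arrays
print(C.shape)       # (2, 2)
print(C)
# [[ 58  64]
#  [139 154]]
```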


In [51]:
# np.dot: matrix product of two 2D arrays, (3×4) · (4×3) -> (3×3)
a2 = np.arange(12).reshape(3,4)
a3 = np.arange(12,24).reshape(4,3)

np.dot(a2,a3)

array([[114, 120, 126],
       [378, 400, 422],
       [642, 680, 718]])

In [52]:
# log and exponents
print(np.log(a1))
print(np.exp(a1))

[[4.18965474 3.8286414  3.40119738]
 [4.52178858 3.95124372 4.15888308]
 [2.7080502  4.29045944 4.4308168 ]]
[[4.60718663e+28 9.49611942e+19 1.06864746e+13]
 [9.01762841e+39 3.83100800e+22 6.23514908e+27]
 [3.26901737e+06 5.05239363e+31 3.02507732e+36]]


In [53]:
# round: rounds to the nearest integer
# floor: rounds down
# ceil: rounds up
print(np.random.random((2, 3)) * 100)
print(np.ceil(np.random.random((2,3))*100))

[[49.59951086 60.09422511 84.19998054]
 [22.03918803 39.71576836 69.42338957]]
[[ 90.  30.  70.]
 [ 69.  79. 100.]]
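
Since the cell above only exercises `np.ceil` on random data, here is a small sketch applying all three functions to the same fixed values (chosen arbitrarily). One detail worth noticing is that `np.round` sends exact halves to the nearest even value.

```python
import numpy as np

vals = np.array([1.2, 1.5, 2.5, -1.7])

print(np.round(vals))   # [ 1.  2.  2. -2.]  halves go to the nearest even value
print(np.floor(vals))   # [ 1.  1.  2. -2.]
print(np.ceil(vals))    # [ 2.  2.  3. -1.]
```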


### Indexing and Slicing

In [54]:
a1 = np.arange(10)
a2 = np.arange(12).reshape(3,4)
a3 = np.arange(8).reshape(2,2,2)

print(a1)
print(a2)
print(a3)

[0 1 2 3 4 5 6 7 8 9]
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[[[0 1]
  [2 3]]

 [[4 5]
  [6 7]]]


In [68]:
# Indexing

In [56]:
a1

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [57]:
print(a1[-1])

9


In [58]:
a2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [60]:
# a2[row, col]
print(a2[1,2])

6


In [61]:
a3

array([[[0, 1],
        [2, 3]],

       [[4, 5],
        [6, 7]]])

In [63]:
# a3[which 2d group, row, col]
print(a3[1,0,1])

5


In [67]:
print(a3[1,1,0])
print(a3[0,0,0])
print(a3[1,1,1])

6
0
7


In [69]:
# Slicing

In [70]:
a1

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [73]:
# a1[start:end:step]
print(a1[2:5:2])
print(a1[2:5])

[2 4]
[2 3 4]


In [72]:
a2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [90]:
# a2[row_start:row_stop:step, col_start:col_stop:step]
print(a2[0:2,1:4:2])
print()
print(a2[0:2,1::2])
print()
print(a2[::2,::3])
print()
print(a2[::2,1::2])
print()
print(a2[1,::3])
# a2[0:2,1::2]

[[1 3]
 [5 7]]

[[1 3]
 [5 7]]

[[ 0  3]
 [ 8 11]]

[[ 1  3]
 [ 9 11]]

[4 7]


In [92]:
a3 = np.arange(27).reshape(3,3,3)
a3

array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]],

       [[18, 19, 20],
        [21, 22, 23],
        [24, 25, 26]]])

In [111]:
print(a3[1])
print()
print()
# print(a3[::2,:,:])
print(a3[::2])
print()
print()
# print(a3[:1, 1:2,:])
# print(a3[0,1])
print(a3[0,1,:])
print()
print()
print(a3[1, :, 1])
print()
print()
print(a3[2, 1:, 1:])
print()
print()
# print(a3[::2, :1:2, ::2])
print(a3[::2, 0, ::2])

[[ 9 10 11]
 [12 13 14]
 [15 16 17]]


[[[ 0  1  2]
  [ 3  4  5]
  [ 6  7  8]]

 [[18 19 20]
  [21 22 23]
  [24 25 26]]]


[3 4 5]


[10 13 16]


[[22 23]
 [25 26]]


[[ 0  2]
 [18 20]]


In [112]:
a3[::2,0,::2]

array([[ 0,  2],
       [18, 20]])

In [113]:
a3[2,1:,1:]

array([[22, 23],
       [25, 26]])

In [114]:
a3[0,1,:]

array([3, 4, 5])
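
One behaviour of slicing that the cells above do not show is that a basic slice is a view of the original data, not a copy. The sketch below illustrates this with a made-up `grid` array; `.copy()` is the usual way to get an independent array.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)

row = grid[1, :]          # second row
col = grid[:, 2]          # third column
corner = grid[:2, :2]     # top-left 2x2 block

print(row)                # [4 5 6 7]
print(col)                # [ 2  6 10]

# Basic slices are views: writing through one changes the original.
corner[0, 0] = 99
print(grid[0, 0])         # 99

# .copy() gives an independent array.
safe = grid[:2, :2].copy()
safe[0, 0] = -1
print(grid[0, 0])         # still 99
```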

### Iterating

In [115]:
a1

for i in a1:
  print(i)

0
1
2
3
4
5
6
7
8
9


In [116]:
a2

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [117]:
for i in a2:
  print(i)

[0 1 2 3]
[4 5 6 7]
[ 8  9 10 11]


In [118]:
a3

array([[[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8]],

       [[ 9, 10, 11],
        [12, 13, 14],
        [15, 16, 17]],

       [[18, 19, 20],
        [21, 22, 23],
        [24, 25, 26]]])

In [119]:
for i in a3:
  print(i)

[[0 1 2]
 [3 4 5]
 [6 7 8]]
[[ 9 10 11]
 [12 13 14]
 [15 16 17]]
[[18 19 20]
 [21 22 23]
 [24 25 26]]


In [120]:
# np.nditer(matrix): iterate on EACH element of matrix
for i in np.nditer(a3):
  print(i)

0
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
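
For comparison, the sketch below puts the iteration styles side by side on a small made-up array `t`. `ndarray.flat` and `np.ndenumerate` are standard NumPy features that, like `np.nditer`, visit every element; `np.ndenumerate` also yields the index.

```python
import numpy as np

t = np.arange(8).reshape(2, 2, 2)

# Looping over the array itself yields sub-arrays along the first axis.
first_axis_shapes = [block.shape for block in t]
print(first_axis_shapes)          # [(2, 2), (2, 2)]

# .flat visits every scalar element in order, like np.nditer.
flat_vals = [int(v) for v in t.flat]
print(flat_vals)                  # [0, 1, 2, 3, 4, 5, 6, 7]

# np.ndenumerate also reports the index of each element.
found = None
for idx, v in np.ndenumerate(t):
    if v == 5:
        found = idx
print(found)                      # (1, 0, 1)
```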


### Reshaping

In [121]:
# reshape

In [123]:
# transpose: rows become columns and columns become rows (np.transpose(a) or a.T)
print(np.transpose(a2))
a2.T

[[ 0  4  8]
 [ 1  5  9]
 [ 2  6 10]
 [ 3  7 11]]


array([[ 0,  4,  8],
       [ 1,  5,  9],
       [ 2,  6, 10],
       [ 3,  7, 11]])

In [125]:
# ravel(): flattens an array of any dimension into a 1D array
print(a3.ravel())

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26]
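
The three reshaping tools in this section can be tied together in one short sketch. The array is made up for illustration; passing `-1` to `reshape` asks NumPy to infer that dimension from the total size.

```python
import numpy as np

flat = np.arange(12)

m = flat.reshape(3, 4)
print(m.shape)                          # (3, 4)

# -1 lets NumPy infer one dimension from the total size.
print(flat.reshape(2, -1).shape)        # (2, 6)

print(m.T.shape)                        # (4, 3)
print(np.array_equal(m.ravel(), flat))  # True

# The total size must match, otherwise reshape raises ValueError.
try:
    flat.reshape(5, 3)
except ValueError as err:
    print("cannot reshape:", err)
```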


### Stacking: Attach multiple datasets

In [128]:
a4 = np.arange(12).reshape(3,4)
a5 = np.arange(12,24).reshape(3,4)
print(a4)
print(a5)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[[12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]]


In [135]:
# horizontal stacking: arrays must have the same number of rows
# np.hstack((matrix1, matrix2,....matrixn))
# print(np.hstack((a4,a5)))
print(np.hstack((a4,a5,a5,a4)))

[[ 0  1  2  3 12 13 14 15 12 13 14 15  0  1  2  3]
 [ 4  5  6  7 16 17 18 19 16 17 18 19  4  5  6  7]
 [ 8  9 10 11 20 21 22 23 20 21 22 23  8  9 10 11]]


In [133]:
# vertical stacking: arrays must have the same number of columns
np.vstack((a4,a5))

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])
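
For 2D arrays, `np.hstack` and `np.vstack` give the same result as `np.concatenate` along axis 1 and axis 0 respectively. A minimal sketch with made-up arrays `top` and `bottom`:

```python
import numpy as np

top = np.arange(6).reshape(2, 3)
bottom = np.arange(6, 12).reshape(2, 3)

h = np.hstack((top, bottom))
v = np.vstack((top, bottom))
print(h.shape, v.shape)           # (2, 6) (4, 3)

# For 2D inputs these match np.concatenate along axis 1 and axis 0.
print(np.array_equal(h, np.concatenate((top, bottom), axis=1)))  # True
print(np.array_equal(v, np.concatenate((top, bottom), axis=0)))  # True
```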

### Splitting

In [134]:
# horizontal splitting
a4

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

In [140]:
# np.hsplit(array, n_parts): split into n_parts equal column blocks
# np.hsplit(a4,2)
np.hsplit(a4,4)

[array([[0],
        [4],
        [8]]),
 array([[1],
        [5],
        [9]]),
 array([[ 2],
        [ 6],
        [10]]),
 array([[ 3],
        [ 7],
        [11]])]

In [141]:
# vertical splitting

In [142]:
a5

array([[12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])

In [143]:
np.vsplit(a5,3)

[array([[12, 13, 14, 15]]),
 array([[16, 17, 18, 19]]),
 array([[20, 21, 22, 23]])]
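
Splitting and stacking are inverse operations, which the sketch below checks on a made-up 3x4 array. It also shows that asking for a number of parts that does not divide the axis evenly raises a `ValueError`.

```python
import numpy as np

data = np.arange(12).reshape(3, 4)

left, right = np.hsplit(data, 2)     # two (3, 2) blocks
print(left.shape, right.shape)       # (3, 2) (3, 2)

rows = np.vsplit(data, 3)            # three (1, 4) blocks
print(len(rows), rows[0].shape)      # 3 (1, 4)

# Stacking the pieces back restores the original array.
print(np.array_equal(np.hstack((left, right)), data))  # True
print(np.array_equal(np.vstack(rows), data))           # True

# 4 columns cannot be split into 3 equal parts.
try:
    np.hsplit(data, 3)
except ValueError as err:
    print("cannot split:", err)
```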