<a href="https://colab.research.google.com/github/Sanghita-C/mle-python-stack/blob/main/Numpy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# NumPy Review

## Core Basics:

- creating arrays
- shapes
- dimensions
- dtype

In [None]:
import numpy as np

In [None]:
a = np.array([1,2,3])
print("arr is : ",a)
print("shape is: ",a.shape)
print("dim is: ",a.ndim)
print("dtype is: ",a.dtype)

arr is :  [1 2 3]
shape is:  (3,)
dim is:  1
dtype is:  int64


In [None]:
#2d
b = np.array([[1,2],[2,3],[3,4]])
print("arr is : ",b)
print("shape is: ",b.shape)
print("dim is: ",b.ndim)
print("dtype is: ",b.dtype)



arr is :  [[1 2]
 [2 3]
 [3 4]]
shape is:  (3, 2)
dim is:  2
dtype is:  int64


shape is the size of the array along each axis. For a 1D array it is simply the number of elements, e.g. (3,) in the first example.

A 2D array has two axes: in the example above, the shape (3, 2) means 3 entries along the first axis (rows) and 2 along the second axis (columns).
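The same idea extends to more axes. The sketch below uses a made-up 3D array (the name `c` and its shape are chosen only for illustration) to show how `shape`, `ndim`, `size` and `len()` relate:

```python
import numpy as np

# Hypothetical example array: 2 blocks, each with 3 rows and 4 columns
c = np.zeros((2, 3, 4))

print(c.shape)   # (2, 3, 4): size along axis 0, axis 1, axis 2
print(c.ndim)    # 3: the number of axes, i.e. len(c.shape)
print(c.size)    # 24: total number of elements, 2 * 3 * 4
print(len(c))    # 2: len() reports only the size of the first axis
```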

In [None]:
#dtype

x = np.array([1,2,3,4])
print(x.dtype)

y = np.array([1,2,3,4], dtype = np.float32)
print(y.dtype)

int64
float32


NumPy infers the dtype from the input when none is given; for Python integers this is the platform default integer, usually int64. In ML code we often switch to a float dtype, for reasons such as:
- division behaviour
- numerical stability
- compatibility with ML formulas
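A small sketch of why the dtype matters in practice. The array names are illustrative only; the point is that an integer array silently truncates float values written into it, while `astype` gives a float copy that keeps them:

```python
import numpy as np

ints = np.array([1, 2, 3])

# Writing a float into an integer array drops the fractional part
ints[0] = 2.7
print(ints)                  # [2 2 3]

# astype returns a converted copy; the original array keeps its dtype
floats = ints.astype(np.float32)
floats[0] = 2.7
print(floats.dtype)          # float32

# True division returns floats; floor division keeps an integer dtype
print((ints / 2).dtype)      # float64
print((ints // 2).dtype)
```

Converting once up front with `astype` (or passing `dtype=` at creation) avoids this kind of silent truncation later in a computation.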


In [None]:
# some famous arrays

z = np.zeros((2,3)) # passing the dimension as parameter (or shape)
one_arr = np.ones((2,3))

print(z)
print(one_arr)

# you can pass dtype here as well; by default these functions create float64 arrays

[[0. 0. 0.]
 [0. 0. 0.]]
[[1. 1. 1.]
 [1. 1. 1.]]
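As a possible extension of the cell above, the sketch below passes an explicit `dtype` and uses two related constructors, `np.full` and `np.ones_like`. The variable names are made up for illustration:

```python
import numpy as np

# Integer zeros instead of the default float64
counts = np.zeros((2, 3), dtype=np.int64)
print(counts.dtype)                  # int64

# np.full fills the array with any constant value
sevens = np.full((2, 3), 7.0)
print(sevens)

# *_like constructors copy the shape and dtype of an existing array
template = np.array([[1, 2], [3, 4]], dtype=np.float32)
like = np.ones_like(template)
print(like.shape, like.dtype)        # (2, 2) float32
```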


In [None]:
#IDENTITY MATRIX
I = np.eye(3)

print(I)

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


In [None]:
# working with ranges

r = np.arange(0,10,2)
print(r)

#endpoint is excluded and last param is step size

[0 2 4 6 8]


In [None]:
# linspace - exact number of points between two values - used for plots etc

t = np.linspace(0,1,5) # the endpoint is included in this case

print(t)

[0.   0.25 0.5  0.75 1.  ]
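To contrast the two range helpers: with `arange` you choose the step, while with `linspace` you choose the number of points and the step is derived. A minimal sketch using the `retstep` and `endpoint` parameters of `np.linspace`:

```python
import numpy as np

# retstep=True also returns the spacing that linspace computed
pts, step = np.linspace(0, 1, 5, retstep=True)
print(step)          # 0.25

# endpoint=False leaves out the stop value, like arange's half-open range
half_open = np.linspace(0, 1, 5, endpoint=False)
print(half_open)     # [0.  0.2 0.4 0.6 0.8]
```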


In [None]:
#Array Reshaping

v = np.array([1,2,3])

# -1 tells NumPy to infer that dimension from the array's size.
# We could pass 3 here, but the length is not always known in advance,
# so we only fix the number of columns at 1.
col_vector = v.reshape(-1,1)

row_vector = v.reshape(1,-1)

print(col_vector)
print(col_vector.shape)
print(row_vector)
print(row_vector.shape)

[[1]
 [2]
 [3]]
(3, 1)
[[1 2 3]]
(1, 3)
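The `-1` placeholder works for any shape, as long as the remaining dimensions divide the total number of elements. A short sketch with a 12-element array (the names `m` and `failed` are illustrative):

```python
import numpy as np

m = np.arange(12)                 # 12 elements: 0, 1, ..., 11

print(m.reshape(3, -1).shape)     # (3, 4): inferred 12 / 3 = 4 columns
print(m.reshape(-1, 6).shape)     # (2, 6): inferred 12 / 6 = 2 rows

# 12 is not divisible by 5, so this reshape raises ValueError
failed = False
try:
    m.reshape(5, -1)
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```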


## Indexing, Slicing and Boolean Masking

In [None]:
#Indexing basics

a = np.array([10,20,30,40])

print("first element: ", a[0])
print("last element ",a[-1])
print("Range from 2nd to 3rd element: ",a[1:3]) #last index not included

first element:  10
last element  40
Range from 2nd to 3rd element:  [20 30]


In [None]:
#2D indexing

X= np.array([[1,2,3],[4,5,6],[7,8,9]])

print("first row: ",X[0])
print("first col: ", X[:, 0])
print("row 1 , col 2: ", X[0,1])

first row:  [1 2 3]
first col:  [1 4 7]
row 1 , col 2:  2


In [None]:
#slicing

print("row 2 onwards and all columns: ", X[1:, :])
print("all rows and first two columns: ", X[:, :2])
print("submatrix: ",X[1:2, 1:])

row 2 onwards and all columns:  [[4 5 6]
 [7 8 9]]
all rows and first two columns:  [[1 2]
 [4 5]
 [7 8]]
submatrix:  [[5 6]]
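One detail worth knowing about slices like the ones above: basic slicing returns a view that shares memory with the original array, so writing through the slice changes the original. The sketch below uses a separate array `M` (an illustrative name) so the notebook's `X` is left alone:

```python
import numpy as np

M = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

view = M[1:, :]            # basic slice: a view onto M's data
view[0, 0] = 99
print(M[1, 0])             # 99: the change is visible through M

safe = M[:, :2].copy()     # explicit copy: independent of M
safe[0, 0] = -1
print(M[0, 0])             # 1: M is unchanged

print(np.shares_memory(M, view), np.shares_memory(M, safe))  # True False
```

Calling `.copy()` is the usual way to get an independent array when the slice will be modified.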


In [None]:
#Boolean Masking

greater_than_5 = X > 5

print(greater_than_5)

print(X[X>5]) # elements selected by the boolean mask, returned as a 1D array

[[False False False]
 [False False  True]
 [ True  True  True]]
[6 7 8 9]
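Masks can also be combined. A brief sketch, assuming the same values as `X` but stored under a separate illustrative name `data`: use `&`, `|` and `~` rather than Python's `and`, `or` and `not`, and wrap each comparison in parentheses.

```python
import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

between = data[(data > 2) & (data < 8)]      # both conditions hold
print(between)                               # [3 4 5 6 7]

outside = data[(data <= 2) | (data >= 8)]    # either condition holds
print(outside)                               # [1 2 8 9]

not_big = data[~(data > 5)]                  # negated mask
print(not_big)                               # [1 2 3 4 5]

print(np.count_nonzero(data > 5))            # 4
```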


In [None]:
#Row-wise filtering

print(X[X[:, 0] > 2])

#Intuition: X[:, 0] selects column 0 (feature 1) across all rows (data points)
#mask = X[:, 0] > 2 is a 1D boolean array with one entry per row
#X[mask] then returns only the rows where the feature 1 value is greater than 2

[[4 5 6]
 [7 8 9]]
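The row-wise filter can be broken into explicit steps. In this sketch `data` and `mask` are illustrative names; the last line shows one way to combine conditions on two different columns:

```python
import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

mask = data[:, 0] > 2        # one boolean per row
print(mask)                  # [False  True  True]
print(mask.shape)            # (3,)

rows = data[mask]            # keeps the rows where mask is True
print(rows.shape)            # (2, 3)

# Rows where column 0 is above 2 and column 2 is below 9
both = data[(data[:, 0] > 2) & (data[:, 2] < 9)]
print(both)                  # [[4 5 6]]
```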


In [None]:
# in-place modification using masks

X[X < 5] = 0

print(X)

[[0 0 0]
 [0 5 6]
 [7 8 9]]
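Because masked assignment overwrites the array itself, the original values are gone afterwards. If they are still needed, one option is to apply the mask to a copy, as in this sketch (names are illustrative):

```python
import numpy as np

original = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

cleaned = original.copy()    # work on a copy to keep the original data
cleaned[cleaned < 5] = 0

print(original[0])           # [1 2 3]: untouched
print(cleaned[0])            # [0 0 0]
```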


In [None]:
#Where

Y = np.array([[1,2,3],[4,5,6]])

A = np.where(Y>5, Y, 0)

print(Y)
print(A) # np.where returns a new array; Y is unchanged

[[1 2 3]
 [4 5 6]]
[[0 0 0]
 [0 0 6]]
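`np.where` has two calling forms. The three-argument form used above picks values element by element; the one-argument form returns the indices where the condition holds. A short sketch, using an illustrative array `vals` with the same values as `Y`:

```python
import numpy as np

vals = np.array([[1, 2, 3], [4, 5, 6]])

# Three arguments: take the second where True, otherwise the third
labels = np.where(vals % 2 == 0, "even", "odd")
print(labels)

# One argument: a tuple of index arrays, one per axis
rows, cols = np.where(vals > 4)
print(rows, cols)            # [1 1] [1 2]
print(vals[rows, cols])      # [5 6]
```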


## Broadcasting