#  **Introduction to Machine Learning Lab 1**

This is the first lab where we will explore the basics of Google Colab and write our first Python
code!

## Why NumPy?
NumPy is the foundation of most ML workflows. It provides a fast, memory-efficient array type and vectorized operations that run in compiled code instead of Python loops.



*   **Example**

This simple vectorized operation gives us the BMI of all individuals in one go, demonstrating how NumPy makes code both faster and more readable.


In [9]:
import numpy as np
heights = np.array([1.75, 1.80, 1.65]) # in meters
weights = np.array([65, 78, 50]) # in kg
bmi = weights / (heights ** 2)
print("bmi:")
print(bmi)

bmi:
[21.2244898  24.07407407 18.36547291]
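To make the "one go" claim concrete, here is a minimal sketch that computes the same BMI values with an explicit Python loop and with the vectorized expression, then checks that they agree. The names `bmi_loop` and `bmi_vectorized` are illustrative and not part of the lab.

```python
import numpy as np

heights = np.array([1.75, 1.80, 1.65])  # in meters
weights = np.array([65, 78, 50])        # in kg

# Loop version: one BMI at a time
bmi_loop = [w / (h ** 2) for w, h in zip(weights, heights)]

# Vectorized version: the whole array in one expression
bmi_vectorized = weights / (heights ** 2)

# Both approaches give the same numbers
print(np.allclose(bmi_loop, bmi_vectorized))
```

The vectorized form has no explicit loop, and on large arrays it is usually much faster because the iteration happens inside NumPy.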




*   **Data Preprocessing**

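Before training ML models, data needs to be cleaned, transformed, and
normalized. The cell here applies z-score standardization: subtract the mean, then divide by the standard deviation.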



In [3]:
data = np.array([4.0, 5.0, 6.0, 8.0, 10.0])
normalized_data = (data - np.mean(data)) / np.std(data)
print("Normalized Data:")
print(normalized_data)

Normalized Data:
[-1.2070197  -0.74278135 -0.27854301  0.64993368  1.57841037]
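A quick sanity check on this kind of normalization is that the result should have a mean of about 0 and a standard deviation of about 1. The sketch below repeats the calculation and verifies both. The variable name `standardized` is illustrative.

```python
import numpy as np

data = np.array([4.0, 5.0, 6.0, 8.0, 10.0])

# np.std uses the population standard deviation (ddof=0) by default
standardized = (data - np.mean(data)) / np.std(data)

# After standardization the mean is ~0 and the standard deviation is ~1
print(np.isclose(standardized.mean(), 0.0))
print(np.isclose(standardized.std(), 1.0))
```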




*   **Matrix Operations**

Many ML models, such as linear regression and neural
networks, rely on matrix multiplication, a task that NumPy handles with ease.



In [8]:
#Dot Product
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
result = np.dot(A, B)
print("A:")
print(A)
print("B:")
print(B)
print("Result(A.B):")
print(result)
#Linear regression
X = np.array([[1, 2], [3, 4], [5, 6]])
beta = np.array([0.5, 1.5])
b = 0.1
y_pred = np.dot(X, beta) + b
print("X:")
print(X)
print("beta:")
print(beta)
print("b:")
print(b)
print("Predicted value(X.beta + b):")
print(y_pred)

A:
[[1 2]
 [3 4]]
B:
[[5 6]
 [7 8]]
Result(A.B):
[[19 22]
 [43 50]]
X:
[[1 2]
 [3 4]
 [5 6]]
beta:
[0.5 1.5]
b:
0.1
Predicted value(X.beta + b):
[ 3.6  7.6 11.6]
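For 2D arrays, Python's `@` operator gives the same result as `np.dot` and is often easier to read. The sketch below also shows that `*` is element-wise multiplication, which is a different operation. The names `matrix_product` and `elementwise_product` are illustrative.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

matrix_product = A @ B        # matrix multiplication, same as np.dot(A, B)
elementwise_product = A * B   # multiplies matching entries only

print(np.array_equal(matrix_product, np.dot(A, B)))
print(np.array_equal(matrix_product, elementwise_product))

# The linear regression prediction written with @
X = np.array([[1, 2], [3, 4], [5, 6]])
beta = np.array([0.5, 1.5])
b = 0.1
y_pred = X @ beta + b
print(y_pred)
```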


## Key NumPy Concepts



*   **Arrays vs Lists**

Python lists are versatile, but they’re not optimized for heavy numerical computations, which is why NumPy arrays are a better choice for ML tasks.


In [13]:
#Lists

list_a = [1, 2, 3]
list_b = [4, 5, 6]
result = list_a + list_b
print("List Concatenation(Simple Addition):")
print(result)
list_sum = [a + b for a,b in zip(list_a, list_b)]
print("List Sum(Addition using loops):")
print(list_sum)

#Arrays

import numpy as np
array_a = np.array([1, 2, 3])
array_b = np.array([4, 5, 6])
result = array_a + array_b
print("Arrays Sum(Simple Addition):")
print(result)

List Concatenation(Simple Addition):
[1, 2, 3, 4, 5, 6]
List Sum(Addition using loops):
[5, 7, 9]
Arrays Sum(Simple Addition):
[5 7 9]
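The same difference shows up with multiplication: `*` repeats a list but scales an array. A small sketch, with illustrative variable names:

```python
import numpy as np

values_list = [1, 2, 3]
values_array = np.array([1, 2, 3])

repeated = values_list * 2   # list repetition: the list is duplicated
scaled = values_array * 2    # element-wise arithmetic: each value is doubled

print(repeated)
print(scaled)
```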



*   **Creating NumPy Arrays**

1.   From Lists: The simplest way to create an array is by converting a Python list using np.array()
2.   Using np.zeros() and np.ones(): These functions create arrays filled with zeros or ones, respectively.
3.   Using np.random: Often, you'll want arrays filled with random numbers. NumPy's random module lets you create arrays of random values, which ML uses for tasks like initializing neural network weights or shuffling data before a split.







In [15]:
# From Lists
a = np.array([1, 2, 3, 4])
print("a:")
print(a)
# Using np.zeros() and np.ones()
b = np.zeros((3, 3)) # A 3x3 matrix of zeros
print("b:")
print(b)
c = np.ones((2, 4)) # A 2x4 matrix of ones
print("c:")
print(c)
# Using np.random.random(): uniform values in [0, 1)
d = np.random.random((2, 3))
print("d:")
print(d)

a:
[1 2 3 4]
b:
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
c:
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]]
d:
[[0.90801676 0.56971275 0.5082838 ]
 [0.11303518 0.12568421 0.07195756]]
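Random arrays change on every run, which makes results hard to reproduce. One option, not used in this lab, is NumPy's `default_rng` generator with a fixed seed. The seed value 42 below is an arbitrary choice for illustration.

```python
import numpy as np

# Two generators created with the same seed produce the same numbers
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

first = rng_a.random((2, 3))   # uniform values in [0, 1)
second = rng_b.random((2, 3))

print(np.array_equal(first, second))
```

Fixing the seed is useful when you want a random initialization or data split to be repeatable between runs.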




*   **Basic Array Operations**



1.   Addition, Subtraction, Multiplication - Operations are performed element-wise, which means they are applied to each element of the arrays.
2.   Broadcasting - When two arrays have different but compatible shapes, NumPy automatically stretches the smaller one so that it matches the dimensions
of the larger one.





In [17]:
#Addition, Subtraction, Multiplication
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print("a:")
print(a)
print("b:")
print(b)
print("a + b:")
print(a + b)
print("a * b:")
print(a * b)
#BroadCasting
a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([1, 2, 3])
print("a:")
print(a)
print("b:")
print(b)
print("a + b:")
print(a + b)

a:
[1 2 3]
b:
[4 5 6]
a + b:
[5 7 9]
a * b:
[ 4 10 18]
a:
[[1 2 3]
 [4 5 6]]
b:
[1 2 3]
a + b:
[[2 4 6]
 [5 7 9]]
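Broadcasting also works down the rows when the smaller array is a column, and it fails when the shapes are not compatible. The sketch below shows both cases. The names `col`, `row_shifted`, and `bad` are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)

# A column of shape (2, 1) is stretched across the 3 columns of a
col = np.array([[10], [20]])
row_shifted = a + col
print(row_shifted)

# Shape (2,) cannot be matched against the last axis of length 3
bad = np.array([1, 2])
try:
    a + bad
except ValueError as err:
    print("Shapes do not broadcast:", err)
```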



*   **Reshaping Arrays**




1.   Reshaping - You can reshape an array into any shape, as long as the total number of elements stays the same.
2.   Flattening - In contrast, sometimes you need to flatten multi-dimensional arrays back into a single-dimensional form for processing.



In [19]:
#Reshaping
a = np.array([1, 2, 3, 4, 5, 6])
print("a:")
print(a)
a_reshaped = a.reshape(2, 3) # 2 rows, 3 columns
print("Reshaped a:")
print(a_reshaped)
#Flattening
a_flat = a_reshaped.flatten()
print("Flattened a:")
print(a_flat) # Output: [1 2 3 4 5 6]

a:
[1 2 3 4 5 6]
Reshaped a:
[[1 2 3]
 [4 5 6]]
Flattened a:
[1 2 3 4 5 6]
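Two details are worth knowing here. Passing `-1` as one dimension lets NumPy infer it from the total number of elements, and `flatten()` returns a copy, so editing the flattened array leaves the original untouched. A minimal sketch with illustrative names:

```python
import numpy as np

a = np.arange(1, 7)            # [1 2 3 4 5 6]

# -1 asks NumPy to work out the row count: 6 elements / 2 columns = 3 rows
inferred = a.reshape(-1, 2)
print(inferred.shape)

# flatten() copies the data, so this change does not reach `inferred`
flat = inferred.flatten()
flat[0] = 99
print(inferred[0, 0])
```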


## Hands-On Coding



*   Creating and Manipulating Arrays
*   Generating a Random Matrix and Performing Operations
*   Reshaping and Slicing






In [20]:
#Creating a 1D Array from a List
import numpy as np
array_1d = np.array([10, 20, 30, 40])
print(array_1d)

[10 20 30 40]


In [21]:
#Creating a 2D Array (Matrix)
array_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(array_2d)

[[1 2 3]
 [4 5 6]]


In [22]:
#Performing Operations
array_sum = array_1d + 5 # Add 5 to every element
print(array_sum)

[15 25 35 45]


In [26]:
#Generating a Random Matrix
random_matrix = np.random.random((3, 3))
print(random_matrix)

[[0.71524001 0.93497014 0.14633754]
 [0.08507332 0.15560506 0.0822161 ]
 [0.41443847 0.78831516 0.42726447]]


In [28]:
#Matrix Operations
random_matrix_scaled = random_matrix * 100 # Scale all values by 100
print(random_matrix_scaled)

[[71.5240009  93.497014   14.63375432]
 [ 8.50733189 15.56050563  8.22160966]
 [41.44384706 78.83151614 42.72644661]]


In [29]:
#Reshaping Arrays
reshaped_array = np.array([1, 2, 3, 4, 5, 6]).reshape(2, 3) # 2 rows, 3 columns
print(reshaped_array)

[[1 2 3]
 [4 5 6]]


In [41]:
import numpy as np

# Initialize a random 10x10 array
a = np.random.random((10, 10))

# 1. Slice the first two rows and first two columns of the 10x10 random array
# This extracts the top-left 2x2 section of the array
sliced_array = a[:2, :2]
print("Sliced first two rows and first two columns:\n", sliced_array)

# 2. Repeat the same slice; the result matches step 1 because `a` has not changed
sliced_array1 = a[:2, :2]
print("Random array: first two rows and first two columns:\n", sliced_array1)

#Checking Operations by myself.
print("Checking Operations by myself:")

# 3. Slice the first five rows and all columns of the array
sliced_array1 = a[:5, :]
print("First five rows and all columns:\n", sliced_array1)

# 4. Slice all 10 rows but only the first two columns of the array
sliced_array1 = a[:10, :2]
print("All rows and first two columns:\n", sliced_array1)

# 5. Slice the first two rows and all columns of the array
sliced_array1 = a[:2, :]
print("First two rows and all columns:\n", sliced_array1)

# 6. Slice all but the last two rows and all but the last two columns of the array
sliced_array1 = a[:-2, :-2]
print("All but the last two rows and columns:\n", sliced_array1)


Sliced first two rows and first two columns:
 [[0.49772925 0.99325616]
 [0.8009884  0.36944866]]
Random array: first two rows and first two columns:
 [[0.49772925 0.99325616]
 [0.8009884  0.36944866]]
Checking Operations by myself:
First five rows and all columns:
 [[0.49772925 0.99325616 0.91293688 0.19604905 0.5430586  0.86099027
  0.57858541 0.28204061 0.63233131 0.59647284]
 [0.8009884  0.36944866 0.29899703 0.27307606 0.52776987 0.45620944
  0.11884622 0.23322397 0.70378128 0.1504109 ]
 [0.47834408 0.47209556 0.87711095 0.42235703 0.32206072 0.55398484
  0.34607823 0.89927002 0.01145921 0.16761387]
 [0.25555613 0.13141404 0.83819521 0.55573105 0.70456404 0.98189668
  0.2886531  0.38387191 0.46957168 0.20477434]
 [0.06068014 0.82759297 0.32309044 0.55871752 0.08939073 0.10767365
  0.40365176 0.81791433 0.31753991 0.46206635]]
All rows and first two columns:
 [[0.49772925 0.99325616]
 [0.8009884  0.36944866]
 [0.47834408 0.47209556]
 [0.25555613 0.13141404]
 [0.06068014 0.82759297]
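One behavior of slicing that is easy to miss: a basic slice is a view into the original array, not a copy. Writing to the slice changes the original unless you call `.copy()` first. The sketch below uses a small 4x4 array instead of the random 10x10 one so the effect is easy to see. All names are illustrative.

```python
import numpy as np

grid = np.arange(16).reshape(4, 4)

# A basic slice shares memory with the original array
view = grid[:2, :2]
view[0, 0] = -1
print(grid[0, 0])   # the original changed too

# .copy() gives an independent array
independent = grid[:2, :2].copy()
independent[0, 0] = 100
print(grid[0, 0])   # the original is unaffected this time
```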


## Mini Challenge

In [47]:
def normalization_performing_function(data):
  # Calculate min and max
  data_min = np.min(data)
  data_max = np.max(data)
  # Apply Min-Max normalization
  normalized_data = (data - data_min) / (data_max - data_min)
  return normalized_data

w = np.random.random((5,5))*2000
print("w:")
print(w)
print("Normalized w:")
print(normalization_performing_function(w))

w:
[[ 477.98324348 1762.94455581 1134.53507411  140.07425079 1515.91591876]
 [1821.99482099 1220.99805677  216.55215038 1945.43714802  701.4272267 ]
 [ 423.75095515 1904.22195631  315.08038395 1959.96927411  325.82854892]
 [1517.33574291 1487.22749356 1226.04171374  326.00375965  571.14740022]
 [ 941.21218174  476.46377033 1672.81372889  714.92004486  602.79173951]]
Normalized w:
[[0.18567499 0.89173842 0.54643856 0.         0.75600057]
 [0.92418549 0.59394844 0.04202325 0.99201485 0.30845349]
 [0.15587531 0.96936784 0.09616276 1.         0.10206869]
 [0.75678073 0.74023679 0.59671984 0.10216496 0.23686704]
 [0.44021107 0.18484007 0.84221313 0.31586756 0.25425504]]


##  **Explanation:**

I applied **Min-Max Scaling** to normalize the dataset, scaling all values into the range **0 to 1**. The reasons for using this technique are:

- **Uniform Scale**: By scaling all features to a common range (typically 0 to 1), the model can treat each feature more equally, preventing any one feature from dominating the learning process due to its scale.
  
- **Prevents Bias**: Unscaled features with larger ranges can dominate learning, for example by producing much larger gradients than other features. Min-Max scaling prevents this by equalizing the range of all features.
  
- **Improves Model Efficiency**: Models trained with **Gradient Descent**, including **Neural Networks**, often converge faster when all features share a similar scale, such as 0 to 1.

- **Balanced Distance Calculations**: For algorithms like **K-Nearest Neighbors (KNN)** and **Support Vector Machines (SVMs)**, which use distance metrics, scaling ensures that all features contribute equally to the distance calculations, improving model predictions.

- **Better Weight Handling**: Models using weights (e.g., neural networks) benefit from Min-Max scaling, as it keeps the learned weights on comparable scales across features.

- **Facilitates Feature Comparison**: Normalized features between 0 and 1 make it easier to compare the importance of different features and interpret relationships in the dataset.
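
The function in the Mini Challenge uses a single minimum and maximum for the whole array. When each column is a separate feature, the points above apply per feature, so the minimum and maximum would be taken column by column. The sketch below is one way to do that, assuming rows are samples and columns are features. The function name `min_max_scale_columns` and the rule that a constant column maps to 0 are assumptions made for illustration.

```python
import numpy as np

def min_max_scale_columns(data):
    """Scale each column (feature) of a 2D array to the range [0, 1]."""
    data = np.asarray(data, dtype=float)
    col_min = data.min(axis=0)
    col_range = data.max(axis=0) - col_min
    # Assumption: a constant column maps to 0 instead of dividing by zero
    safe_range = np.where(col_range == 0, 1.0, col_range)
    return (data - col_min) / safe_range

# Three samples, three features on very different scales
samples = np.array([[1.0, 200.0, 5.0],
                    [2.0, 400.0, 5.0],
                    [3.0, 1000.0, 5.0]])

scaled = min_max_scale_columns(samples)
print(scaled)
```

Each column now runs from 0 to 1 independently, so the second feature no longer outweighs the first just because its raw values are larger.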
