Setup

In [1]:
import numpy as np
import time

Section 1 — ndarray Fundamentals

Task 1.1: Array Creation & Shapes

In [3]:
# TODO:
# Create a 1D array with values 0 to 99 (no loops)

# HINT:
# - Use np.arange

arr_1d = np.arange(0,100)
arr_1d


array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
       51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
       68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
       85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])

In [4]:
# TODO:
# Reshape arr_1d into a (10, 10) array

# HINT:
# - reshape returns a view when possible (no data copy for a contiguous array)

arr_2d = arr_1d.reshape(10,10)
arr_2d


array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

In [7]:
# TODO:
# Create a 3D array of shape (4, 5, 3)

# HINT:
# - Total elements must match

arr_3d = np.arange(4 * 5 * 3).reshape(4, 5, 3)
arr_3d.shape


(4, 5, 3)

**Explain:**
- What does `.shape` represent?
- Why does contiguous memory matter?
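One way to explore both questions is sketched below. The names `grid` and `transposed` are hypothetical and not part of the exercise. `.shape` is the tuple of axis lengths, and the `flags` attribute reports whether the elements sit in one row-major (C-order) block, which is what lets NumPy walk memory sequentially.

```python
import numpy as np

# Hypothetical example array: 3 rows, 4 columns.
grid = np.arange(12, dtype=np.int64).reshape(3, 4)

print(grid.shape)                  # (3, 4): length of each axis
print(grid.ndim, grid.size)        # 2 12: number of axes, total elements
print(grid.flags["C_CONTIGUOUS"])  # True: rows are stored back to back

# Transposing only swaps the axis bookkeeping; the data is not moved,
# so the result is no longer laid out in row-major order.
transposed = grid.T
print(transposed.shape)                    # (4, 3)
print(transposed.flags["C_CONTIGUOUS"])    # False
print(np.shares_memory(grid, transposed))  # True
```

Operations that scan a contiguous array touch neighbouring memory addresses, which tends to be friendlier to CPU caches than jumping around in memory.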


Section 1.2 — dtype & Memory

In [14]:
# TODO:
# Create two arrays with same values but different dtypes

arr_int = np.arange(10, dtype=np.int64)
arr_float = np.arange(10, dtype=np.float32)


In [15]:
# TODO:
# Compare memory usage

# HINT:
# - Use .nbytes
print(arr_int.nbytes)
print(arr_float.nbytes)


80
40


**Interview Question:**  
Why does dtype selection matter in large ML pipelines?
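A rough illustration of the memory side of this question, assuming a hypothetical feature matrix of 1,000 rows by 512 columns (these sizes are made up for the sketch):

```python
import numpy as np

# Hypothetical feature matrix; the sizes are illustrative only.
n_rows, n_cols = 1_000, 512
features64 = np.ones((n_rows, n_cols), dtype=np.float64)
features32 = features64.astype(np.float32)

print(features64.nbytes)  # 4096000 bytes (8 bytes per element)
print(features32.nbytes)  # 2048000 bytes (4 bytes per element)

# The trade-off: float32 has a much coarser machine epsilon than float64.
print(np.finfo(np.float32).eps)
print(np.finfo(np.float64).eps)
```

Halving the element size halves the memory footprint and the bytes moved per operation, at the cost of precision, so the right choice depends on the numerical needs of the pipeline.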


Section 2 — Indexing, Views & Copies

Task 2.1: Views vs Copies

In [32]:
# TODO:
# Create a 2D array and slice every alternate row

# HINT:
# - Use slicing, not fancy indexing

A = np.arange(0,16).reshape(4,4)
A_slice = A[::2]
print(A)
print(A_slice)
#print(A[1:3, 1:3])


[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
[[ 0  1  2  3]
 [ 8  9 10 11]]


In [33]:
# TODO:
# Modify A_slice and observe A

A_slice[0,0]=99
A

array([[99,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])

Explain:
- Why did the original array change (or not)?


A basic slice is a view, not a copy. The slice and the original array refer to the same data in memory, so writing through A_slice also changes A.
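The difference can be checked directly with `np.shares_memory`. The sketch below uses hypothetical names (`base`, `row_view`, `row_copy`) and contrasts basic slicing with fancy (integer-list) indexing, which returns a copy.

```python
import numpy as np

base = np.arange(16).reshape(4, 4)

row_view = base[::2]     # basic slicing: a view onto the same buffer
row_copy = base[[0, 2]]  # fancy indexing: a new, independent array

print(np.shares_memory(base, row_view))  # True
print(np.shares_memory(base, row_copy))  # False

row_copy[0, 0] = -1      # does not touch base
print(base[0, 0])        # 0

row_view[0, 0] = 99      # writes through to base
print(base[0, 0])        # 99
```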

Section 2.2 — Boolean Masking

In [52]:
# TODO:
# Create random array of size 1000

rng = np.random.default_rng(0)
X = rng.standard_normal(1000)


In [53]:
# TODO:
# Extract values greater than mean

# HINT:
# - Mean first
# - Boolean mask
m = X.mean()
above_mean = X[X > m]




In [55]:
# TODO:
# Replace negative values with 0 (no loops)

X_copy = X.copy()
X_copy[X_copy < 0] = 0
print(X_copy.size)
print(X_copy[X_copy > 0].size)

1000
466


Section 3 — Broadcasting

Task 3.1: Broadcasting Rules

In [58]:
# TODO:
# Create A (1000, 50) and b (50,)
rng = np.random.default_rng(1)
A = rng.standard_normal((1000,50))
b = rng.standard_normal(50)


In [65]:
# TODO:
# Add b to each row of A

# HINT:
# - No reshape required
A_New = A + b
A_New.shape


(1000, 50)

In [67]:
# TODO:
# Normalize each row of A

# HINT:
# - Axis matters
# - Keep dimensions in mind
row_mean = A.mean(axis=1, keepdims=True)
row_std = A.std(axis=1, keepdims=True)

A_normalized = (A-row_mean)/row_std



Explain broadcasting step-by-step.

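1. NumPy first compares the shapes from right to left. A is (1000, 50) and b is (50,), so the trailing dimensions (50 and 50) match.
2. Then a virtual stretch happens. b is treated as shape (1, 50) and repeated across all 1000 rows to match A, without actually copying the data.
3. The requested operation is then applied element by element, for example:
   new_array[0,0] = A[0,0]+b[0]
   new_array[1,2] = A[1,2]+b[2]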
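The steps above can be sketched with NumPy's own helpers. This assumes NumPy 1.20 or newer, where `np.broadcast_shapes` is available. The small arrays `b_small` and `stretched` are hypothetical stand-ins for b and its stretched form.

```python
import numpy as np

# Step 1: shape check, right to left.
print(np.broadcast_shapes((1000, 50), (50,)))  # (1000, 50)

# Step 2: the "virtual stretch". broadcast_to returns a read-only view
# whose stride along the new axis is 0, so no data is duplicated.
b_small = np.arange(3.0)
stretched = np.broadcast_to(b_small, (4, 3))
print(stretched.shape)    # (4, 3)
print(stretched.strides)  # (0, 8): every row points at the same 3 values
```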


Section 3.2 — Broadcasting Trap

In [73]:
# TODO:
# Intentionally trigger a broadcasting error
# Then fix it

new_col = np.arange(1000)
try:
    A_broadcast = A-new_col
except ValueError as err:
    print(f"broadcasting error: {err}")

new_col_reshape = new_col.reshape(-1,1)
A_broadcast_fixed = A-new_col_reshape
A_broadcast_fixed.shape

broadcasting error: operands could not be broadcast together with shapes (1000,50) (1000,) 


(1000, 50)

What was wrong with the original shapes?
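Shapes are aligned from the right, so (1000, 50) against (1000,) compares 50 with 1000. The two differ and neither is 1, which is why the subtraction fails. Reshaping to (1000, 1) puts the 1000 under the row axis. A minimal pure-Python sketch of the compatibility rule follows; `can_broadcast` is a hypothetical helper, not a NumPy function.

```python
def can_broadcast(shape_a, shape_b):
    """Hypothetical helper: True if two shapes are broadcast-compatible."""
    # Walk both shapes from the last axis backwards; missing leading
    # axes are treated as compatible, just as NumPy pads them with 1.
    for dim_a, dim_b in zip(reversed(shape_a), reversed(shape_b)):
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            return False
    return True

print(can_broadcast((1000, 50), (1000,)))    # False: 50 vs 1000
print(can_broadcast((1000, 50), (1000, 1)))  # True: 50 vs 1, 1000 vs 1000
```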


Section 4 — Vectorization vs Loops

Task 4.1: Loop vs Vectorized

In [3]:
# TODO:
# Create large array X of size 1,000,000
rng = np.random.default_rng(2)
X = rng.standard_normal(1_000_000)


In [4]:
# TODO:
# Normalize using Python loop

# HINT:
# - Time it

v_mean = X.mean()
v_std = X.std()
t1 = time.time()
new_X = np.empty_like(X)
for i, value in enumerate(X):
    new_X[i] = (value - v_mean) / v_std
print(f"time elapsed in for loop: {time.time()-t1}")


time elapsed in for loop: 0.1924445629119873


In [5]:
# TODO:
# Normalize using vectorization

v_t1 = time.time()
v_norm = (X-v_mean)/v_std
print(f"time elapsed in vectorization: {time.time()-v_t1}")

time elapsed in vectorization: 0.0051119327545166016


Why is vectorization faster?
1. It performs the operation on the entire array in compiled C code, rather than processing each number one by one in an interpreted Python loop.
2. NumPy stores data in contiguous blocks (one number right after the other). This lets the CPU prefetch data into its high-speed cache before it is needed.
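A possible variation on the timing cells, using `time.perf_counter`, which is generally better suited to short intervals than `time.time`. The array size and names here are assumptions for the sketch, and the measured times will vary by machine, so none are shown.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
values = rng.standard_normal(100_000)  # smaller than the exercise, for speed
mean, std = values.mean(), values.std()

# Interpreted loop: one Python-level iteration per element.
start = time.perf_counter()
loop_result = np.empty_like(values)
for i, value in enumerate(values):
    loop_result[i] = (value - mean) / std
loop_seconds = time.perf_counter() - start

# Vectorized: the same arithmetic runs inside NumPy's compiled loops.
start = time.perf_counter()
vectorized_result = (values - mean) / std
vectorized_seconds = time.perf_counter() - start

print(f"loop: {loop_seconds:.4f}s  vectorized: {vectorized_seconds:.4f}s")
print(np.allclose(loop_result, vectorized_result))  # both give the same numbers
```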


Task 4.2: Pairwise Distance (FAANG Classic)

In [26]:
# TODO:
# Compute pairwise Euclidean distance matrix without loops

# HINT:
# - Use (x - y)^2 expansion
# - Broadcasting is key

def pairwise_distance(X):
    ...
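One possible way to fill in the stub, following the hint's expansion ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x·y. This is a sketch rather than a reference solution; the function name `pairwise_distance_sketch` and the clipping of small negative values are choices made here, not part of the exercise.

```python
import numpy as np

def pairwise_distance_sketch(X):
    """Euclidean distance between every pair of rows of X, shape (n, n)."""
    sq_norms = np.sum(X * X, axis=1)  # ||x_i||^2 for each row, shape (n,)
    # Broadcasting (n, 1) + (1, n) builds the full grid of norm sums.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    # Rounding can leave tiny negative values; clip before the sqrt.
    np.maximum(sq_dists, 0.0, out=sq_dists)
    return np.sqrt(sq_dists)

points = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
print(pairwise_distance_sketch(points))
# [[ 0.  5. 10.]
#  [ 5.  0.  5.]
#  [10.  5.  0.]]
```

The expansion avoids building an (n, n, d) intermediate array, but it can lose precision for points that are very close together.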


Section 5 — Numerical Stability

Task 5.1: Softmax

In [7]:
# TODO:
# Implement naive softmax

def softmax_naive(X):
    exp_value = np.exp(X)
    return exp_value / exp_value.sum(axis=1, keepdims=True)


In [8]:
# TODO:
# Fix numerical instability

# HINT:
# - Subtract max per row

def softmax_stable(X):
    shifted_X = X - X.max(axis=1, keepdims=True)
    shifted_exp = np.exp(shifted_X)
    return shifted_exp / shifted_exp.sum(axis=1, keepdims=True)

X_huge = np.array([[1000.0, 1001.0, 1002.0]])
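# NumPy only emits a RuntimeWarning on overflow by default, so the except
# branch below is not reached and the naive result prints as nan.
# Wrapping the call in np.errstate(over='raise') would raise FloatingPointError.
try:
    print('naive:', softmax_naive(X_huge))
except FloatingPointError as e:
    print('overflow:', e)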

print(f"softmax_stable: {softmax_stable(X_huge)}")
print(f"softmax_stable_sum: {softmax_stable(X_huge).sum(axis=1)}")


naive: [[nan nan nan]]
softmax_stable: [[0.09003057 0.24472847 0.66524096]]
softmax_stable_sum: [1.]


(RuntimeWarning output from softmax_naive: overflow in np.exp(X), then an invalid value in the division)


Why does subtracting max work?
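Softmax is unchanged by subtracting a constant c from every entry of a row: exp(x_i - c) / sum_j exp(x_j - c) equals exp(x_i) / sum_j exp(x_j), because the common factor exp(-c) cancels. Choosing c as the row maximum makes the largest exponent 0, so `np.exp` never sees a large positive input. A small check of that invariance, with the hypothetical name `softmax_rows`:

```python
import numpy as np

def softmax_rows(scores):
    """Row-wise softmax with the max-shift applied (illustrative name)."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)

small = np.array([[1.0, 2.0, 3.0]])
huge = small + 999.0  # [[1000., 1001., 1002.]], the same row shifted by 999

# Same probabilities for both rows, since only differences matter.
print(np.allclose(softmax_rows(small), softmax_rows(huge)))  # True

# After the shift the largest exponentiated value is exp(0) = 1.
print(np.exp(huge - huge.max()).max())  # 1.0
```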


Section 6 — Linear Algebra

Task 6.1: Matrix Multiplication

In [13]:
# TODO:
# Try valid and invalid matrix multiplications

array1 = np.random.default_rng(4).standard_normal((5, 3))
array2 = np.random.default_rng(5).standard_normal((3, 4))
array3 = array1 @ array2
print(array3)

try:
    in_mult = array1 @ array1
except ValueError as err:
    print(f"Invalid Multiplication Error: {err}")

[[ 1.56991114  3.56386564  0.71224891 -2.18884628]
 [-2.39719092 -1.06152625  0.74198752  1.57168922]
 [-0.53529631 -1.78704248 -0.36595976  1.60464863]
 [ 1.25326254  2.28143675  0.23965209 -2.02633524]
 [-0.79188763 -2.80426328 -0.76807032  1.57396851]]
Invalid Multiplication Error: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 5 is different from 3)


Explain difference between dot, @, and matmul.
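A short sketch of the practical difference, using hypothetical arrays. `@` is the operator form of `np.matmul`. For 2D inputs all three agree; for stacked (3D or higher) inputs, `matmul` multiplies matching matrices in the stack, while `np.dot` combines every matrix of the first stack with every matrix of the second.

```python
import numpy as np

rng = np.random.default_rng(0)
left = rng.standard_normal((5, 3))
right = rng.standard_normal((3, 4))

# 2D case: all three produce the same (5, 4) product.
print(np.allclose(left @ right, np.matmul(left, right)))  # True
print(np.allclose(left @ right, np.dot(left, right)))     # True

# Batched case: the result shapes diverge.
batch_left = rng.standard_normal((2, 5, 3))
batch_right = rng.standard_normal((2, 3, 4))
print(np.matmul(batch_left, batch_right).shape)  # (2, 5, 4)
print(np.dot(batch_left, batch_right).shape)     # (2, 5, 2, 4)
```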


Task 6.2: Solving Linear Systems

In [17]:
# TODO:
# Solve Ax = b and verify solution

rng = np.random.default_rng(6)
A = rng.standard_normal((4,4))
b = rng.standard_normal(4)
x = np.linalg.solve(A,b)
residual_value = np.linalg.norm(A @ x - b)
print(residual_value)


2.6558837364083264e-16
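A residual near machine precision, as above, indicates that x satisfies the system. An alternative check, sketched here with hypothetical variable names, compares A @ x with b using `np.allclose` and also prints the condition number, which hints at how much rounding error to expect.

```python
import numpy as np

rng = np.random.default_rng(6)
coeffs = rng.standard_normal((4, 4))
rhs = rng.standard_normal(4)

solution = np.linalg.solve(coeffs, rhs)

# Tolerance-based check instead of eyeballing the residual norm.
print(np.allclose(coeffs @ solution, rhs))

# A large condition number would mean the solution is sensitive to
# small perturbations in the inputs.
print(np.linalg.cond(coeffs))
```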


Section 7 — Performance & Memory

Task 7.1: In-Place Operations

In [18]:
# TODO:
# Compare in-place vs out-of-place operations
in_array = np.random.default_rng(7).standard_normal(1_000_000)
in_array1 = in_array.copy()
out_time = time.time()
out_array = in_array + 1.0
out_time1 = time.time()

in_time = time.time()
in_array1 += 1.0
in_time1 = time.time()

print(f"out-of-place: {out_time1-out_time}")
print(f"in-place: {in_time1-in_time}")


out-of-place: 0.0015408992767333984
in-place: 0.0005476474761962891
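The timing gap comes mostly from allocation: the out-of-place version has to create a new one-million-element array for the result, while `+=` writes into the existing buffer. A single `time.time()` measurement is noisy, so the sketch below checks the memory behaviour directly instead. The variable names are hypothetical.

```python
import numpy as np

data = np.zeros(5)
alias = data  # second name for the same array object

result = data + 1.0  # out-of-place: allocates a new array
print(np.shares_memory(result, data))  # False

data += 1.0          # in-place: reuses the existing buffer
print(alias is data)  # True
print(alias)          # [1. 1. 1. 1. 1.]: the alias sees the update

# Ufuncs accept out= for the same effect with explicit control.
np.multiply(data, 2.0, out=data)
print(data)           # [2. 2. 2. 2. 2.]
```

The flip side is that in-place updates are visible through every view or alias of the array, which can be surprising when a function mutates its argument.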


Task 7.2: Strides

In [24]:
# TODO:
# Inspect array strides and explain

stride_array = np.arange(35).reshape(5,7)
print(f"stride_array shape: {stride_array.shape}")
print(f"stride_array strides: {stride_array.strides}")

stride_array_slice = stride_array[:, ::4]
print(f"stride_array_slice shape: {stride_array_slice.shape}")
print(f"stride_array_slice strides: {stride_array_slice.strides}")

stride_array shape: (5, 7)
stride_array strides: (56, 8)
stride_array_slice shape: (5, 2)
stride_array_slice strides: (56, 32)
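Strides are the number of bytes to step in memory to move one position along each axis. With 8-byte integers and 7 columns, moving down one row skips 7 * 8 = 56 bytes and moving right one column skips 8 bytes. Taking every 4th column keeps the row stride and multiplies the column stride by 4. The sketch below recomputes those numbers; it sets `dtype=np.int64` explicitly, which is an assumption added here so the result does not depend on the platform's default integer size.

```python
import numpy as np

table = np.arange(35, dtype=np.int64).reshape(5, 7)
itemsize = table.itemsize  # 8 bytes per int64

# Row-major layout: (bytes per row, bytes per element).
expected = (7 * itemsize, itemsize)
print(table.strides == expected)  # True, i.e. (56, 8)

# Slicing with a step changes only the stride bookkeeping.
every_fourth = table[:, ::4]
print(every_fourth.strides)                # (56, 32)
print(every_fourth.flags["C_CONTIGUOUS"])  # False: gaps between kept columns
print(np.shares_memory(table, every_fourth))  # True
```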


Section 8 — Mini Case Study

In [27]:
# TODO:
# Given X (10000, 100):
# - Normalize features
# - Compute covariance
# - Extract top-k eigenvectors

random_generator = np.random.default_rng(8)
feature_matrix = random_generator.standard_normal((10_000, 100))

feature_means = feature_matrix.mean(axis=0, keepdims=True)
feature_stds = feature_matrix.std(axis=0, keepdims=True, ddof=1)
standardized_features = (feature_matrix - feature_means) / feature_stds

num_samples = standardized_features.shape[0]
covariance_matrix = (standardized_features.T @ standardized_features) / (num_samples - 1)

eigenvalues, eigenvectors = np.linalg.eigh(covariance_matrix)

num_components = 5

sorted_indices = np.argsort(eigenvalues)[::-1]
top_indices = sorted_indices[:num_components]
projection_matrix = eigenvectors[:, top_indices]

reduced_data = standardized_features @ projection_matrix


print(f"Original shape:  {feature_matrix.shape}")
print(f"Reduced shape:   {reduced_data.shape}")
print(f"Covariance size: {covariance_matrix.shape}")

Original shape:  (10000, 100)
Reduced shape:   (10000, 5)
Covariance size: (100, 100)


Explain each step and its ML relevance.


1. Where did NumPy save memory?
2. Where did it avoid Python overhead?
3. Which operation would break at scale?
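As one angle on the scale question: the covariance matrix grows with the square of the feature count, and a full eigendecomposition grows roughly with its cube. The sketch below, on a smaller hypothetical matrix, shows that the same eigenvalues can be recovered from the singular values of the standardized data without forming the covariance matrix. It is an illustration of the relationship, not a recommendation for any particular dataset.

```python
import numpy as np

rng = np.random.default_rng(8)
samples = rng.standard_normal((500, 20))  # smaller than the case study
standardized = (samples - samples.mean(axis=0)) / samples.std(axis=0, ddof=1)
n = standardized.shape[0]

# Route 1: covariance matrix, then eigenvalues (sorted descending).
covariance = standardized.T @ standardized / (n - 1)
eigenvalues = np.linalg.eigh(covariance)[0][::-1]

# Route 2: singular values of the data; s**2 / (n - 1) gives the same spectrum.
singular_values = np.linalg.svd(standardized, compute_uv=False)
eigenvalues_from_svd = singular_values**2 / (n - 1)

print(np.allclose(eigenvalues, eigenvalues_from_svd))  # True

# Share of total variance captured by the top 5 components.
explained = eigenvalues[:5].sum() / eigenvalues.sum()
print(f"top-5 explained variance ratio: {explained:.3f}")
```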
