# NumPy Essentials for Machine Learning (Beginner-friendly)

**Learning Objectives:**
- Learn how to create and work with NumPy arrays
- Use basic array math, broadcasting, and reshaping
- Apply simple linear algebra and statistics used in ML workflows

**Prerequisites:** Basic Python and a working NumPy installation (for example, `pip install numpy`)

**Estimated Time:** ~30 minutes

---

NumPy is a foundational library for numerical computing in Python. This notebook focuses on hands-on examples and short explanations aimed at beginners. Where advanced topics are shown, they are marked as optional.

In [1]:
import numpy as np

# Set random seed for reproducibility
np.random.seed(42)

print(f"NumPy version: {np.__version__}")

NumPy version: 2.3.3
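
`np.random.seed` configures NumPy's legacy global random state. As a minimal sketch, the newer Generator API keeps the seed local to one object instead. The names `rng`, `first`, and `second` are illustrative and do not appear elsewhere in this notebook, and the values drawn this way differ from those produced by the legacy functions used in the other cells.

```python
import numpy as np

# Two generators created with the same seed produce identical streams.
rng = np.random.default_rng(42)
first = rng.standard_normal((2, 3))

rng = np.random.default_rng(42)
second = rng.standard_normal((2, 3))

print(first.shape)                    # (2, 3)
print(np.array_equal(first, second))  # True
```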


## 1. Array Creation and Basic Properties

Understanding how to create and inspect arrays is fundamental to ML workflows.

In [2]:
# Different ways to create arrays (common in ML)

# From lists (loading data)
data_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
arr_from_list = np.array(data_list)
print("From list:")
print(arr_from_list)
print(f"Shape: {arr_from_list.shape}, Dtype: {arr_from_list.dtype}\n")

# Zeros (typical for biases; neural-network weights are usually initialized randomly)
weights = np.zeros((3, 4))
print("Zeros (weight initialization):")
print(weights)
print(f"Shape: {weights.shape}\n")

# Random arrays (data generation, weight initialization)
random_data = np.random.randn(100, 5)  # 100 samples, 5 features
print("Random data (first 5 rows):")
print(random_data[:5])
print(f"Shape: {random_data.shape}\n")

# Identity matrix (useful for regularization)
identity = np.eye(3)
print("Identity matrix:")
print(identity)

From list:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Shape: (3, 3), Dtype: int64

Zeros (weight initialization):
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
Shape: (3, 4)

Random data (first 5 rows):
[[ 0.49671415 -0.1382643   0.64768854  1.52302986 -0.23415337]
 [-0.23413696  1.57921282  0.76743473 -0.46947439  0.54256004]
 [-0.46341769 -0.46572975  0.24196227 -1.91328024 -1.72491783]
 [-0.56228753 -1.01283112  0.31424733 -0.90802408 -1.4123037 ]
 [ 1.46564877 -0.2257763   0.0675282  -1.42474819 -0.54438272]]
Shape: (100, 5)

Identity matrix:
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


In [3]:
# Array properties essential for ML
sample_array = np.random.randn(32, 10, 8)  # Batch size 32, sequence length 10, features 8

print("Array Properties (typical ML batch):")
print(f"Shape: {sample_array.shape} (batch_size, seq_len, features)")
print(f"Number of dimensions: {sample_array.ndim}")
print(f"Total elements: {sample_array.size}")
print(f"Data type: {sample_array.dtype}")
print(f"Memory usage: {sample_array.nbytes} bytes")
print(f"Memory usage: {sample_array.nbytes / 1024:.2f} KB")

Array Properties (typical ML batch):
Shape: (32, 10, 8) (batch_size, seq_len, features)
Number of dimensions: 3
Total elements: 2560
Data type: float64
Memory usage: 20480 bytes
Memory usage: 20.00 KB
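
The memory figure above follows directly from the dtype: each float64 element takes 8 bytes. A small sketch, using hypothetical names `batch64` and `batch32`, of how converting the same batch shape to float32 halves the footprint:

```python
import numpy as np

batch64 = np.zeros((32, 10, 8), dtype=np.float64)
batch32 = batch64.astype(np.float32)  # astype returns a converted copy

print(batch64.itemsize, batch64.nbytes)  # 8 20480
print(batch32.itemsize, batch32.nbytes)  # 4 10240
```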


## 2. Array Indexing and Slicing

Critical for data manipulation, batch processing, and feature selection.

In [4]:
# Create sample data representing a batch of images
# Shape: (batch_size, height, width, channels)
batch_images = np.random.randint(0, 256, size=(8, 28, 28, 3))

print("Batch of images shape:", batch_images.shape)
print("\nIndexing and Slicing Examples:")

# Get first image
first_image = batch_images[0]
print(f"First image shape: {first_image.shape}")

# Get first 4 images (mini-batch)
mini_batch = batch_images[:4]
print(f"Mini-batch shape: {mini_batch.shape}")

# Get red channel from all images
red_channel = batch_images[:, :, :, 0]
print(f"Red channel shape: {red_channel.shape}")

# Get center crop (common preprocessing)
center_crop = batch_images[:, 7:21, 7:21, :]
print(f"Center crop shape: {center_crop.shape}")

Batch of images shape: (8, 28, 28, 3)

Indexing and Slicing Examples:
First image shape: (28, 28, 3)
Mini-batch shape: (4, 28, 28, 3)
Red channel shape: (8, 28, 28)
Center crop shape: (8, 14, 14, 3)
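
Slices select contiguous ranges, but batches are often drawn in shuffled order. The sketch below assumes a hypothetical `labels` array with one entry per image and uses one shared permutation as an integer index so that images and labels stay aligned.

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(8, 28, 28, 3))
labels = np.arange(8)  # hypothetical labels, one per image

# One shared permutation keeps images and labels aligned.
order = rng.permutation(len(images))
shuffled_images = images[order]
shuffled_labels = labels[order]

print(shuffled_images.shape)  # (8, 28, 28, 3)
print(np.array_equal(shuffled_images[0], images[order[0]]))  # True
```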


In [5]:
# Boolean indexing (filtering data)
scores = np.array([85, 92, 78, 96, 88, 73, 91, 82])
names = np.array(['Alice', 'Bob', 'Charlie', 'Diana', 'Eve', 'Frank', 'Grace', 'Henry'])

print("Original scores:", scores)
print("Names:", names)

# Filter high performers (score > 85)
high_performers = scores > 85
print(f"\nHigh performers mask: {high_performers}")
print(f"High performer scores: {scores[high_performers]}")
print(f"High performer names: {names[high_performers]}")

# Multiple conditions
good_range = (scores >= 80) & (scores <= 90)
print(f"\nScores in 80-90 range: {scores[good_range]}")

Original scores: [85 92 78 96 88 73 91 82]
Names: ['Alice' 'Bob' 'Charlie' 'Diana' 'Eve' 'Frank' 'Grace' 'Henry']

High performers mask: [False  True False  True  True False  True False]
High performer scores: [92 96 88 91]
High performer names: ['Bob' 'Diana' 'Eve' 'Grace']

Scores in 80-90 range: [85 88 82]
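
A boolean mask can do more than filter. As a brief sketch on the same scores (the names `grades` and `capped` are illustrative), it can choose between two values, drive an in-place assignment, or be turned into positions:

```python
import numpy as np

scores = np.array([85, 92, 78, 96, 88, 73, 91, 82])

# np.where picks element-wise between two alternatives.
grades = np.where(scores > 85, "high", "other")
print(grades)

# Assigning through a mask changes only the selected elements.
capped = scores.copy()
capped[capped > 90] = 90
print(capped)  # [85 90 78 90 88 73 90 82]

# np.flatnonzero returns the positions where the mask is True.
print(np.flatnonzero(scores > 85))  # [1 3 4 6]
```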


## 3. Array Operations and Broadcasting

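Broadcasting lets NumPy apply element-wise operations to arrays whose shapes differ but are compatible, without explicit loops or manual tiling. The same convention is used by many other numerical computing libraries and tools.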

In [6]:
# Element-wise operations (fundamental to neural networks)
a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([[2, 2, 2], [3, 3, 3]])

print("Array a:")
print(a)
print("\nArray b:")
print(b)

print("\nElement-wise operations:")
print("Addition (a + b):")
print(a + b)

print("\nMultiplication (a * b):")
print(a * b)

print("\nSquare (a**2):")
print(a**2)

# Activation functions
print("\nCommon activation functions:")
x = np.array([-2, -1, 0, 1, 2])
print(f"Input: {x}")
print(f"ReLU (max(0, x)): {np.maximum(0, x)}")
print(f"Sigmoid: {1 / (1 + np.exp(-x))}")
print(f"Tanh: {np.tanh(x)}")

Array a:
[[1 2 3]
 [4 5 6]]

Array b:
[[2 2 2]
 [3 3 3]]

Element-wise operations:
Addition (a + b):
[[3 4 5]
 [7 8 9]]

Multiplication (a * b):
[[ 2  4  6]
 [12 15 18]]

Square (a**2):
[[ 1  4  9]
 [16 25 36]]

Common activation functions:
Input: [-2 -1  0  1  2]
ReLU (max(0, x)): [0 0 0 1 2]
Sigmoid: [0.11920292 0.26894142 0.5        0.73105858 0.88079708]
Tanh: [-0.96402758 -0.76159416  0.          0.76159416  0.96402758]


In [7]:
# Broadcasting examples (very important for ML)
print("Broadcasting Examples:")

# Example 1: Adding bias to all samples
features = np.random.randn(100, 5)  # 100 samples, 5 features
bias = np.array([0.1, -0.2, 0.3, -0.1, 0.2])  # bias for each feature

print(f"Features shape: {features.shape}")
print(f"Bias shape: {bias.shape}")

# Broadcasting adds bias to each sample
features_with_bias = features + bias
print(f"Result shape: {features_with_bias.shape}")
print(f"First sample before: {features[0]}")
print(f"First sample after: {features_with_bias[0]}")

print("\n" + "="*50)

# Example 2: Standardizing features (zero mean, unit variance per column)
data = np.random.randn(1000, 3) * 10 + 5  # Scale by 10, then shift by 5
print(f"\nOriginal data shape: {data.shape}")
print(f"Original means: {np.mean(data, axis=0)}")
print(f"Original stds: {np.std(data, axis=0)}")

# Standardize each column (broadcasting)
mean = np.mean(data, axis=0)  # Shape: (3,)
std = np.std(data, axis=0)    # Shape: (3,)
normalized_data = (data - mean) / std  # (1000, 3) with (3,) broadcasts across rows

print(f"\nNormalized means: {np.mean(normalized_data, axis=0)}")
print(f"Normalized stds: {np.std(normalized_data, axis=0)}")

Broadcasting Examples:
Features shape: (100, 5)
Bias shape: (5,)
Result shape: (100, 5)
First sample before: [-1.33767419  0.61217162  0.56928932  0.07166855 -0.24237546]
First sample after: [-1.23767419  0.41217162  0.86928932 -0.02833145 -0.04237546]

==================================================

Original data shape: (1000, 3)
Original means: [4.3688771 5.4340328 5.8957205]
Original stds: [ 9.97976302  9.78954254 10.3014967 ]

Normalized means: [-1.52045043e-16 -2.39586129e-16 -4.27546887e-16]
Normalized stds: [1. 1. 1.]
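
The shape alignment behind the bias and standardization examples can be checked without allocating any data. The sketch below relies on `np.broadcast_shapes` (available since NumPy 1.20) and also shows a shape pair that fails, together with one common fix. The name `per_row` is illustrative.

```python
import numpy as np

# np.broadcast_shapes applies the broadcasting rule to shapes only.
print(np.broadcast_shapes((100, 5), (5,)))     # (100, 5)
print(np.broadcast_shapes((2, 1, 4), (3, 4)))  # (2, 3, 4)

# Trailing dimensions 3 and 4 are neither equal nor 1, so this fails.
try:
    np.broadcast_shapes((4, 3), (4,))
except ValueError as err:
    print("Incompatible:", err)

# Adding a length-1 axis makes the per-row vector compatible.
per_row = np.arange(4)[:, np.newaxis]  # shape (4, 1)
result = np.ones((4, 3)) + per_row
print(result.shape)  # (4, 3)
```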


In [8]:
# Broadcasting rules visualization
print("Broadcasting Rules Examples:")

# Rule: shapes are compared from the rightmost dimension; two dimensions are compatible if they are equal or one of them is 1
examples = [
    ((3, 4), (4,)),      # (3,4) + (4,) -> (3,4)
    ((2, 3, 4), (4,)),   # (2,3,4) + (4,) -> (2,3,4)
    ((2, 3, 4), (3, 4)), # (2,3,4) + (3,4) -> (2,3,4)
    ((2, 1, 4), (3, 4)), # (2,1,4) + (3,4) -> (2,3,4)
]

for shape1, shape2 in examples:
    a = np.ones(shape1)
    b = np.ones(shape2)
    result = a + b
    print(f"\n{shape1} + {shape2} -> {result.shape}")
    print("\nArray A:")
    print(a)
    print("\nArray B:")
    print(b)
    print("\nResult:")
    print(result)
    print("-"*50)

Broadcasting Rules Examples:

(3, 4) + (4,) -> (3, 4)

Array A:
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]

Array B:
[1. 1. 1. 1.]

Result:
[[2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]]
--------------------------------------------------

(2, 3, 4) + (4,) -> (2, 3, 4)

Array A:
[[[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]

 [[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]]

Array B:
[1. 1. 1. 1.]

Result:
[[[2. 2. 2. 2.]
  [2. 2. 2. 2.]
  [2. 2. 2. 2.]]

 [[2. 2. 2. 2.]
  [2. 2. 2. 2.]
  [2. 2. 2. 2.]]]
--------------------------------------------------

(2, 3, 4) + (3, 4) -> (2, 3, 4)

Array A:
[[[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]

 [[1. 1. 1. 1.]
  [1. 1. 1. 1.]
  [1. 1. 1. 1.]]]

Array B:
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]

Result:
[[[2. 2. 2. 2.]
  [2. 2. 2. 2.]
  [2. 2. 2. 2.]]

 [[2. 2. 2. 2.]
  [2. 2. 2. 2.]
  [2. 2. 2. 2.]]]
--------------------------------------------------

(2, 1, 4) + (3, 4) -> (2, 3, 4)

Array A:
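[[[1. 1. 1. 1.]]

 [[1. 1. 1. 1.]]]

Array B:
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]

Result:
[[[2. 2. 2. 2.]
  [2. 2. 2. 2.]
  [2. 2. 2. 2.]]

 [[2. 2. 2. 2.]
  [2. 2. 2. 2.]
  [2. 2. 2. 2.]]]
--------------------------------------------------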

## 4. Linear Algebra Operations

Essential for understanding neural network computations, matrix multiplications, and transformations.

In [9]:
# Matrix multiplication (core of neural networks)
print("Matrix Multiplication Examples:")

# Simulate a simple neural network layer
# Input: batch_size=32, input_features=10
# Layer: input_features=10, output_features=5
batch_size, input_features, output_features = 32, 10, 5

X = np.random.randn(batch_size, input_features)  # Input data
W = np.random.randn(input_features, output_features)  # Weights
b = np.random.randn(output_features)  # Bias

print(f"Input X shape: {X.shape}")
print(f"Weights W shape: {W.shape}")
print(f"Bias b shape: {b.shape}")

# Forward pass: Y = XW + b
Y = np.dot(X, W) + b  # or X @ W + b
print(f"Output Y shape: {Y.shape}")

print(f"\nFirst sample input: {X[0][:5]}...")  # Show first 5 features
print(f"First sample output: {Y[0]}")

Matrix Multiplication Examples:
Input X shape: (32, 10)
Weights W shape: (10, 5)
Bias b shape: (5,)
Output Y shape: (32, 5)

First sample input: [ 0.40265698 -0.07046034  0.42698368 -0.65456494  0.11033543]...
First sample output: [ 1.32621282  1.70173192 -0.74400113 -1.45980749 -1.53939884]
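
To make the forward pass less of a black box, the sketch below recomputes a single output entry by hand on a smaller, hypothetical layer (4 samples, 3 inputs, 2 outputs) and compares it with the `@` result.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))   # 4 samples, 3 input features
W = rng.standard_normal((3, 2))   # 3 inputs -> 2 outputs
b = rng.standard_normal(2)

Y = X @ W + b

# Entry (i, j) is the dot product of row i of X with column j of W, plus b[j].
i, j = 1, 0
manual = sum(X[i, k] * W[k, j] for k in range(3)) + b[j]
print(np.isclose(Y[i, j], manual))  # True

# A ReLU applied element-wise keeps the (4, 2) shape.
hidden = np.maximum(0, Y)
print(hidden.shape)  # (4, 2)
```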


In [10]:
# Different matrix operations
A = np.random.randn(3, 4)
B = np.random.randn(4, 2)

print("Matrix Operations:")
print(f"A shape: {A.shape}")
print(f"B shape: {B.shape}")

# Matrix multiplication
C = A @ B  # Same as np.dot(A, B)
print(f"A @ B shape: {C.shape}")

# Transpose (very common in ML)
A_T = A.T
print(f"A transpose shape: {A_T.shape}")

# Element-wise vs matrix multiplication
square_matrix = np.random.randn(3, 3)
print(f"\nSquare matrix shape: {square_matrix.shape}")
print(f"Element-wise square: {(square_matrix * square_matrix).shape}")
print(f"Matrix multiplication: {(square_matrix @ square_matrix).shape}")

Matrix Operations:
A shape: (3, 4)
B shape: (4, 2)
A @ B shape: (3, 2)
A transpose shape: (4, 3)

Square matrix shape: (3, 3)
Element-wise square: (3, 3)
Matrix multiplication: (3, 3)


In [11]:
# Advanced linear algebra (useful for understanding ML algorithms)
print("Advanced Linear Algebra:")

# Create a symmetric matrix (common in optimization)
A = np.random.randn(4, 4)
symmetric_A = A + A.T

print(f"Matrix A shape: {symmetric_A.shape}")

# Eigenvalues and eigenvectors (PCA, optimization)
# Note: np.linalg.eigh is the specialized routine for symmetric matrices
eigenvalues, eigenvectors = np.linalg.eig(symmetric_A)
print(f"Eigenvalues: {eigenvalues}")
print(f"Eigenvectors shape: {eigenvectors.shape}")

# Matrix norms (regularization), computed on the original non-symmetric A
print("\nMatrix norms:")
print(f"Frobenius norm: {np.linalg.norm(A, 'fro'):.4f}")
print(f"L2 norm: {np.linalg.norm(A, 2):.4f}")  # spectral norm (largest singular value)

# Determinant and inverse
det_A = np.linalg.det(symmetric_A)
print(f"\nDeterminant: {det_A:.4f}")

if abs(det_A) > 1e-10:  # Rough invertibility check; np.linalg.cond is a more reliable guide
    inv_A = np.linalg.inv(symmetric_A)
    print(f"Inverse exists, shape: {inv_A.shape}")
    # Verify: A @ A^(-1) should be identity
    identity_check = symmetric_A @ inv_A
    print(f"A @ A^(-1) close to identity: {np.allclose(identity_check, np.eye(4))}")
else:
    print("Matrix is singular (not invertible)")

Advanced Linear Algebra:
Matrix A shape: (4, 4)
Eigenvalues: [ 4.3549979   1.80615905 -2.23164492 -2.91563732]
Eigenvectors shape: (4, 4)

Matrix norms:
Frobenius norm: 3.1701
L2 norm: 2.4230

Determinant: 51.1803
Inverse exists, shape: (4, 4)
A @ A^(-1) close to identity: True
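
Two alternatives are worth knowing alongside the cell above: `np.linalg.eigh` for symmetric matrices and `np.linalg.solve` for linear systems. The sketch below is an illustration on a hypothetical symmetric positive definite matrix `S`, not a replacement for the cell above.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
S = M @ M.T + np.eye(4)  # symmetric positive definite by construction

# eigh assumes a symmetric (Hermitian) input and returns real eigenvalues
# in ascending order, with orthonormal eigenvectors as columns.
eigenvalues, eigenvectors = np.linalg.eigh(S)
reconstructed = eigenvectors @ np.diag(eigenvalues) @ eigenvectors.T
print(np.allclose(reconstructed, S))  # True

# Solve S x = rhs directly instead of forming the inverse.
rhs = rng.standard_normal(4)
x = np.linalg.solve(S, rhs)
print(np.allclose(S @ x, rhs))  # True
```

Solving the system directly avoids computing an inverse that is only needed once.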


## 5. Statistical Operations and Aggregations

Critical for data analysis, loss computation, and model evaluation.

In [12]:
# Statistical operations along different axes
# Simulate prediction scores for classification
# Shape: (batch_size, num_classes)
predictions = np.random.randn(100, 5)  # 100 samples, 5 classes

print(f"Predictions shape: {predictions.shape}")
print("\nStatistical Operations:")

# Overall statistics
print(f"Overall mean: {np.mean(predictions):.4f}")
print(f"Overall std: {np.std(predictions):.4f}")
print(f"Min value: {np.min(predictions):.4f}")
print(f"Max value: {np.max(predictions):.4f}")

# Statistics along axes
print(f"\nMean per class (axis=0): {np.mean(predictions, axis=0)}")
print(f"Mean per sample (axis=1) shape: {np.mean(predictions, axis=1).shape}")

# Useful for softmax and classification
max_per_sample = np.max(predictions, axis=1, keepdims=True)
print(f"\nMax per sample shape (keepdims=True): {max_per_sample.shape}")
print(f"Max per sample shape (keepdims=False): {np.max(predictions, axis=1).shape}")

Predictions shape: (100, 5)

Statistical Operations:
Overall mean: 0.0075
Overall std: 0.9992
Min value: -3.3295
Max value: 2.4184

Mean per class (axis=0): [ 0.0430433  -0.01403904 -0.1310505   0.07978231  0.05988469]
Mean per sample (axis=1) shape: (100,)

Max per sample shape (keepdims=True): (100, 1)
Max per sample shape (keepdims=False): (100,)


In [13]:
# Practical ML examples
print("Practical ML Statistical Operations:")

# 1. Softmax implementation
def softmax(x):
    """Numerically stable softmax"""
    exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)

logits = np.random.randn(5, 3)  # 5 samples, 3 classes
probabilities = softmax(logits)

print(f"Logits shape: {logits.shape}")
print(f"Probabilities shape: {probabilities.shape}")
print(f"Probabilities sum per sample: {np.sum(probabilities, axis=1)}")
print(f"First sample probabilities: {probabilities[0]}")

# 2. Accuracy calculation
true_labels = np.array([0, 1, 2, 1, 0])
predicted_labels = np.argmax(probabilities, axis=1)

accuracy = np.mean(true_labels == predicted_labels)
print(f"\nTrue labels: {true_labels}")
print(f"Predicted labels: {predicted_labels}")
print(f"Accuracy: {accuracy:.2f}")

Practical ML Statistical Operations:
Logits shape: (5, 3)
Probabilities shape: (5, 3)
Probabilities sum per sample: [1. 1. 1. 1. 1.]
First sample probabilities: [0.37378617 0.48353105 0.14268279]

True labels: [0 1 2 1 0]
Predicted labels: [1 0 0 1 0]
Accuracy: 0.40
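
The softmax above subtracts the row maximum before exponentiating. A short sketch of why that matters, using hypothetical helpers `naive_softmax` and `stable_softmax` on deliberately large logits:

```python
import numpy as np

def naive_softmax(x):
    exp_x = np.exp(x)
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)

def stable_softmax(x):
    shifted = x - np.max(x, axis=1, keepdims=True)
    exp_x = np.exp(shifted)
    return exp_x / np.sum(exp_x, axis=1, keepdims=True)

big_logits = np.array([[1000.0, 1001.0, 1002.0]])

# exp(1000) overflows to inf in float64, and inf / inf is nan.
with np.errstate(over="ignore", invalid="ignore"):
    naive = naive_softmax(big_logits)
stable = stable_softmax(big_logits)

print(np.isnan(naive).all())  # True
print(stable)                 # approximately [[0.090 0.245 0.665]]

# Shifting all logits by a constant does not change the result.
print(np.allclose(stable, stable_softmax(big_logits - 1000.0)))  # True
```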


In [14]:
# Loss function implementations
print("Common Loss Functions:")

# Mean Squared Error (regression)
y_true = np.array([1.5, 2.3, 3.1, 4.2, 5.0])
y_pred = np.array([1.2, 2.1, 3.4, 4.0, 5.2])

mse = np.mean((y_true - y_pred)**2)
rmse = np.sqrt(mse)
mae = np.mean(np.abs(y_true - y_pred))

print(f"True values: {y_true}")
print(f"Predicted values: {y_pred}")
print(f"MSE: {mse:.4f}")
print(f"RMSE: {rmse:.4f}")
print(f"MAE: {mae:.4f}")

# Cross-entropy loss (classification)
def cross_entropy_loss(y_true_labels, y_pred_probs):
    """Cross-entropy loss for classification"""
    # Convert integer class labels to one-hot rows
    n_classes = y_pred_probs.shape[1]
    y_true_onehot = np.eye(n_classes)[y_true_labels]

    # Clip predictions to avoid log(0)
    y_pred_clipped = np.clip(y_pred_probs, 1e-15, 1 - 1e-15)

    # Calculate cross-entropy per sample and average
    per_sample = np.sum(y_true_onehot * np.log(y_pred_clipped), axis=1)
    loss = np.mean(np.negative(per_sample))
    return loss

ce_loss = cross_entropy_loss(true_labels, probabilities)
print(f"\nCross-entropy loss: {ce_loss:.4f}")

Common Loss Functions:
True values: [1.5 2.3 3.1 4.2 5. ]
Predicted values: [1.2 2.1 3.4 4.  5.2]
MSE: 0.0600
RMSE: 0.2449
MAE: 0.2400

Cross-entropy loss: 1.0208
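
The one-hot product in `cross_entropy_loss` only ever keeps one probability per row. As a sketch under that observation, integer-array indexing can pick those probabilities directly. The arrays `probs` and `labels` below are made-up illustration data.

```python
import numpy as np

probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4]])
labels = np.array([0, 1, 2])

# One-hot version, as in the cell above.
onehot = np.eye(probs.shape[1])[labels]
loss_onehot = -np.mean(np.sum(onehot * np.log(probs), axis=1))

# Indexing version: pick probs[i, labels[i]] for every row i.
picked = probs[np.arange(len(labels)), labels]
loss_indexed = -np.mean(np.log(picked))

print(picked)  # [0.7 0.8 0.4]
print(np.isclose(loss_onehot, loss_indexed))  # True
```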


## 6. Array Reshaping and Manipulation

Essential for preparing data for neural networks and handling different tensor shapes.

In [15]:
# Reshaping operations (very common in deep learning)
print("Array Reshaping:")

# Original data: flattened images
flattened_images = np.random.randint(0, 256, size=(100, 784))  # 100 images, 28x28 pixels
print(f"Flattened images shape: {flattened_images.shape}")

# Reshape to image format
images = flattened_images.reshape(100, 28, 28)
print(f"Reshaped to images: {images.shape}")

# Add channel dimension (for CNN)
images_with_channel = images.reshape(100, 28, 28, 1)
print(f"With channel dimension: {images_with_channel.shape}")

# Or using -1 for automatic calculation
auto_reshape = flattened_images.reshape(100, 28, 28, -1)
print(f"Auto reshape (-1): {auto_reshape.shape}")

# Flatten back
flattened_again = images_with_channel.reshape(100, -1)
print(f"Flattened again: {flattened_again.shape}")

Array Reshaping:
Flattened images shape: (100, 784)
Reshaped to images: (100, 28, 28)
With channel dimension: (100, 28, 28, 1)
Auto reshape (-1): (100, 28, 28, 1)
Flattened again: (100, 784)


In [16]:
# Axis manipulation
print("Axis Manipulation:")

# Sample data: batch of sequences
sequences = np.random.randn(32, 50, 128)  # batch_size, seq_len, features
print(f"Original shape: {sequences.shape}")

# Transpose (swap axes)
transposed = sequences.transpose(1, 0, 2)  # seq_len, batch_size, features
print(f"Transposed: {transposed.shape}")

# Add new axis
with_new_axis = sequences[:, :, :, np.newaxis]
print(f"With new axis: {with_new_axis.shape}")

# Squeeze (remove dimensions of size 1)
squeezed = np.squeeze(with_new_axis)
print(f"Squeezed: {squeezed.shape}")

# Expand dimensions
expanded = np.expand_dims(sequences, axis=0)
print(f"Expanded (axis=0): {expanded.shape}")

expanded_last = np.expand_dims(sequences, axis=-1)
print(f"Expanded (axis=-1): {expanded_last.shape}")

Axis Manipulation:
Original shape: (32, 50, 128)
Transposed: (50, 32, 128)
With new axis: (32, 50, 128, 1)
Squeezed: (32, 50, 128)
Expanded (axis=0): (1, 32, 50, 128)
Expanded (axis=-1): (32, 50, 128, 1)
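
Because `np.squeeze` without arguments drops every size-1 axis, it can also remove a batch axis when the batch happens to contain one item. A minimal sketch with a hypothetical single-sequence batch named `single`:

```python
import numpy as np

single = np.zeros((1, 50, 1))  # batch of one sequence with a trailing size-1 axis

print(np.squeeze(single).shape)           # (50,)  the batch axis is gone too
print(np.squeeze(single, axis=-1).shape)  # (1, 50)

# Naming an axis whose length is not 1 raises an error instead of silently passing.
try:
    np.squeeze(single, axis=1)
except ValueError as err:
    print("Cannot squeeze:", err)
```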


In [17]:
# Concatenation and stacking (combining data)
print("Concatenation and Stacking:")

# Create sample batches
batch1 = np.random.randn(16, 10)  # 16 samples, 10 features
batch2 = np.random.randn(16, 10)  # 16 samples, 10 features
batch3 = np.random.randn(16, 10)  # 16 samples, 10 features

print(f"Batch 1 shape: {batch1.shape}")
print(f"Batch 2 shape: {batch2.shape}")
print(f"Batch 3 shape: {batch3.shape}")

# Concatenate along batch dimension
combined_batches = np.concatenate([batch1, batch2, batch3], axis=0)
print(f"Combined batches: {combined_batches.shape}")

# Stack (creates new dimension)
stacked_batches = np.stack([batch1, batch2, batch3], axis=0)
print(f"Stacked batches: {stacked_batches.shape}")

# Horizontal stack (features)
features1 = np.random.randn(100, 5)
features2 = np.random.randn(100, 3)
combined_features = np.hstack([features1, features2])
print(f"\nFeatures 1: {features1.shape}")
print(f"Features 2: {features2.shape}")
print(f"Combined features: {combined_features.shape}")

Concatenation and Stacking:
Batch 1 shape: (16, 10)
Batch 2 shape: (16, 10)
Batch 3 shape: (16, 10)
Combined batches: (48, 10)
Stacked batches: (3, 16, 10)

Features 1: (100, 5)
Features 2: (100, 3)
Combined features: (100, 8)
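
Concatenation has an inverse that is handy for building mini-batches. The sketch below uses `np.array_split`, which tolerates sizes that do not divide evenly, on a small illustrative `data` array.

```python
import numpy as np

data = np.arange(50).reshape(10, 5)  # 10 samples, 5 features

# array_split allows uneven pieces: 10 rows into 3 batches gives sizes 4, 3, 3.
batches = np.array_split(data, 3, axis=0)
print([batch.shape for batch in batches])  # [(4, 5), (3, 5), (3, 5)]

# Concatenating the pieces restores the original array.
restored = np.concatenate(batches, axis=0)
print(np.array_equal(restored, data))  # True
```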


## 7. Performance and Memory Considerations

Understanding NumPy performance is crucial for efficient ML workflows.

In [18]:
import time

# Vectorization vs loops
print("Performance Comparison: Vectorization vs Loops")

# Create large arrays
size = 1000000
a = np.random.randn(size)
b = np.random.randn(size)

# Method 1: Pure Python loop (slow)
# time.perf_counter() is a high-resolution clock intended for timing short intervals
start_time = time.perf_counter()
result_loop = []
for i in range(min(10000, size)):  # Only do 10k for speed
    result_loop.append(a[i] * b[i])
loop_time = time.perf_counter() - start_time

# Method 2: NumPy vectorization (fast)
start_time = time.perf_counter()
result_vectorized = a * b
vectorized_time = time.perf_counter() - start_time

print(f"Loop time (10k elements): {loop_time:.6f} seconds")
print(f"Vectorized time ({size} elements): {vectorized_time:.6f} seconds")
# The loop covered only 10k elements, so scale its time up to the full array size
print(f"Speedup factor: {loop_time / vectorized_time * (size/10000):.1f}x")

# Memory usage
print("\nMemory usage:")
print(f"Array 'a' memory: {a.nbytes / 1024 / 1024:.2f} MB")
print(f"Array 'b' memory: {b.nbytes / 1024 / 1024:.2f} MB")
print(f"Result memory: {result_vectorized.nbytes / 1024 / 1024:.2f} MB")

Performance Comparison: Vectorization vs Loops
Loop time (10k elements): 0.002123 seconds
Vectorized time (1000000 elements): 0.002605 seconds
Speedup factor: 81.5x

Memory usage:
Array 'a' memory: 7.63 MB
Array 'b' memory: 7.63 MB
Result memory: 7.63 MB
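
A single timed run is noisy. As a hedged alternative, the standard-library `timeit` module repeats each call. The helper names `loop_multiply` and `vectorized_multiply` are illustrative, and the timings depend on the machine, so none are shown here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(10_000)
b = rng.standard_normal(10_000)

def loop_multiply():
    return [a[i] * b[i] for i in range(len(a))]

def vectorized_multiply():
    return a * b

# Both approaches compute the same values.
print(np.allclose(loop_multiply(), vectorized_multiply()))  # True

# timeit reports the total time taken by `number` repeated calls.
loop_seconds = timeit.timeit(loop_multiply, number=20)
vec_seconds = timeit.timeit(vectorized_multiply, number=20)
print(f"Loop: {loop_seconds:.4f} s, vectorized: {vec_seconds:.4f} s")
```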


In [19]:
# Memory layout and views vs copies
print("Memory Layout and Views:")

# Original array
original = np.random.randn(1000, 1000)
print(f"Original array memory: {original.nbytes / 1024 / 1024:.2f} MB")

# View (shares memory)
view = original[::2, ::2]  # Every other row and every other column
print(f"View shares memory: {view.base is original}")
print(f"View shape: {view.shape}")

# Copy (new memory)
copy = original.copy()
print(f"Copy shares memory: {copy.base is original}")
print(f"Copy memory: {copy.nbytes / 1024 / 1024:.2f} MB")

# Demonstrate view behavior
original[0, 0] = 999
print("\nAfter modifying original[0,0] = 999:")
print(f"View[0,0] = {view[0, 0]} (should be 999 if it's a view)")
print(f"Copy[0,0] = {copy[0, 0]} (should be original value)")

Memory Layout and Views:
Original array memory: 7.63 MB
View shares memory: True
View shape: (500, 500)
Copy shares memory: False
Copy memory: 7.63 MB

After modifying original[0,0] = 999:
View[0,0] = 999.0 (should be 999 if it's a view)
Copy[0,0] = 0.8779508901505022 (should be original value)
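
Whether an indexing expression yields a view or a copy depends on the kind of index. The sketch below uses `np.shares_memory` on a small illustrative array to contrast basic slicing with integer-array and boolean indexing.

```python
import numpy as np

original = np.arange(12).reshape(3, 4)

sliced = original[:, ::2]        # basic slicing -> view
picked = original[:, [0, 2]]     # integer-array indexing -> copy
masked = original[original > 5]  # boolean indexing -> copy

print(np.shares_memory(original, sliced))  # True
print(np.shares_memory(original, picked))  # False
print(np.shares_memory(original, masked))  # False

# Writing through the view changes the original array.
sliced[0, 0] = -1
print(original[0, 0])  # -1
print(picked[0, 0])    # 0
```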


## 8. Working with other ML libraries (brief, framework-neutral)

NumPy arrays are the common numerical representation used across many machine learning tools. The key things to check when preparing NumPy arrays for use with other tools are:

- dtype: many tools prefer float32 for inputs and integer labels for classification targets
- shape: confirm batch and feature dimensions (e.g., (batch_size, num_features))
- no NaNs or infinite values

Example checks and simple conversion:

In [20]:
print("Using NumPy arrays with other tools (framework-neutral):")

# Create sample data in NumPy
numpy_data = np.random.randn(32, 10)
numpy_labels = np.random.randint(0, 3, size=(32,))

print(f"NumPy data shape: {numpy_data.shape}")
print(f"NumPy data dtype before: {numpy_data.dtype}")

# Convert to a common dtype for ML (float32 inputs, int64 labels)
numpy_data = numpy_data.astype(np.float32)
numpy_labels = numpy_labels.astype(np.int64)

print(f"NumPy data dtype after: {numpy_data.dtype}")
print(f"NumPy labels dtype after: {numpy_labels.dtype}")

print("\nChecklist before handing arrays to another tool:")
print(" - Are shapes as expected? (batch, features)")
print(" - Are dtypes appropriate? (e.g. float32 for inputs)")
print(" - Are there NaNs or infinite values?")

Using NumPy arrays with other tools (framework-neutral):
NumPy data shape: (32, 10)
NumPy data dtype before: float64
NumPy data dtype after: float32
NumPy labels dtype after: int64

Checklist before handing arrays to another tool:
 - Are shapes as expected? (batch, features)
 - Are dtypes appropriate? (e.g. float32 for inputs)
 - Are there NaNs or infinite values?
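
The checklist can also be turned into executable checks. The helper below, `check_features`, is a hypothetical function written for this illustration, and it assumes float32 inputs with a `(batch, features)` layout. Adjust both assumptions to whatever the receiving tool expects.

```python
import numpy as np

def check_features(x, n_features):
    """Return a list of problems found in a feature array (empty if none)."""
    problems = []
    if x.ndim != 2 or x.shape[1] != n_features:
        problems.append(f"expected shape (batch, {n_features}), got {x.shape}")
    if x.dtype != np.float32:
        problems.append(f"expected float32, got {x.dtype}")
    if not np.isfinite(x).all():
        problems.append("contains NaN or infinite values")
    return problems

good = np.zeros((32, 10), dtype=np.float32)
bad = np.ones((32, 10))  # float64 by default
bad[0, 0] = np.nan

print(check_features(good, 10))  # []
print(check_features(bad, 10))
```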


## Summary and Key Takeaways

**What we've learned:**

1. **Array Creation & Properties**: Understanding shapes, dtypes, and memory usage
2. **Indexing & Slicing**: Essential for data manipulation and batch processing
3. **Broadcasting**: Enables efficient operations without explicit loops
4. **Linear Algebra**: Matrix operations that form the core of many ML models
5. **Statistical Operations**: Computing metrics, losses, and aggregations
6. **Reshaping**: Preparing data for different network architectures
7. **Performance**: Vectorization and memory considerations
8. **Interoperability**: NumPy arrays are the common numerical format accepted by many ML tools

**Key Patterns for ML:**
- Use vectorized operations instead of loops
- Understand broadcasting for efficient computations
- Master axis-based operations for batch processing
- Know when operations create views vs copies
- Check shapes, dtypes, and NaN or infinite values in NumPy before passing arrays to other tools

**Next Steps:**
- Learn Pandas for structured data manipulation
- Understand how these concepts translate to other numerical libraries
- See how higher-level ML libraries mirror NumPy patterns

NumPy is a foundational numerical library. Mastering these concepts will make it easier to learn higher-level tools that build on the same numerical principles.
