# Lab 01: NumPy Fundamentals

**ING3513 - Introduction to Artificial Intelligence and Machine Learning**

NumPy (Numerical Python) is the foundation of scientific computing in Python. Most ML libraries either build directly on NumPy arrays (scikit-learn) or provide tensor types that convert to and from them (TensorFlow, PyTorch).

**What you'll learn:**

- Creating and manipulating arrays
- Indexing and slicing
- Vectorized operations (and why they matter)
- Common mathematical functions


In [None]:
import numpy as np

print(f"NumPy version: {np.__version__}")

## 1. Creating Arrays

NumPy arrays hold elements of a single data type, which makes them more efficient than Python lists for numerical operations. Let's explore different ways to create them.


In [None]:
# From a Python list
arr1 = np.array([1, 2, 3, 4, 5])
print("From list:", arr1)

# 2D array (matrix)
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
print("\n2D array:\n", arr2d)

# Array of zeros
zeros = np.zeros((3, 4))
print("\nZeros (3x4):\n", zeros)

# Array of ones
ones = np.ones((2, 3))
print("\nOnes (2x3):\n", ones)

# Range of values
range_arr = np.arange(0, 10, 2)  # start, stop (exclusive), step
print("\nRange (0 up to but not including 10, step 2):", range_arr)

# Evenly spaced values
linspace_arr = np.linspace(0, 1, 5)  # start, stop (inclusive), num_points
print("\nLinspace (5 points from 0 to 1):", linspace_arr)
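
The two range helpers above differ in how they treat the stop value, and every array carries a data type that NumPy infers unless told otherwise. The following is a minimal sketch of both points; the variable names are made up for illustration.

```python
import numpy as np

# np.arange excludes the stop value; np.linspace includes it by default.
half_open = np.arange(0, 1, 0.25)
closed = np.linspace(0, 1, 5)
print("arange:  ", half_open)
print("linspace:", closed)

# The dtype is inferred from the input unless given explicitly.
ints = np.array([1, 2, 3])
floats = np.array([1, 2, 3], dtype=np.float64)
print("Inferred dtype:", ints.dtype)
print("Explicit dtype:", floats.dtype)

# np.zeros and np.ones produce floats by default.
print("Zeros dtype:", np.zeros(3).dtype)
```

The inferred integer dtype can differ between platforms, so pass `dtype` explicitly when the exact type matters.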

## 2. Array Properties

Understanding array properties is essential for debugging and working with ML libraries.


In [None]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

print("Array:\n", arr)
print("\nShape:", arr.shape)  # (rows, columns)
print("Number of dimensions:", arr.ndim)
print("Data type:", arr.dtype)
print("Total elements:", arr.size)

# Reshaping arrays
reshaped = arr.reshape(3, 2)
print("\nReshaped to (3, 2):\n", reshaped)

# Flattening to 1D (flatten() always returns a copy)
flat = arr.flatten()
print("\nFlattened:", flat)
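
Two details of reshaping are worth seeing in action: `-1` lets NumPy infer one dimension, and `flatten()` copies the data while `reshape()` returns a view when it can. A short sketch, assuming the same 2x3 array as above:

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# -1 asks NumPy to infer that dimension from the total size (6 / 2 = 3).
inferred = arr.reshape(-1, 2)
print("Inferred shape:", inferred.shape)

# flatten() always returns a copy, so editing it leaves arr untouched.
flat_copy = arr.flatten()
flat_copy[0] = 99
print("arr[0, 0] after editing the copy:", arr[0, 0])

# reshape() returns a view when it can; np.shares_memory checks this.
view = arr.reshape(6)
print("reshape shares memory with arr:", np.shares_memory(arr, view))
print("flatten shares memory with arr:", np.shares_memory(arr, flat_copy))
```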

## 3. Indexing and Slicing

Accessing elements in NumPy arrays is similar to Python lists, but more powerful for multi-dimensional arrays.


In [None]:
# 1D array indexing
arr1d = np.array([10, 20, 30, 40, 50])
print("Array:", arr1d)
print("First element:", arr1d[0])
print("Last element:", arr1d[-1])
print("Elements at indices 1-3:", arr1d[1:4])

# 2D array indexing
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("\n2D Array:\n", arr2d)
print("\nElement at row 1, col 2 (zero-based):", arr2d[1, 2])
print("First row:", arr2d[0])
print("First column:", arr2d[:, 0])
print("Subarray (first 2 rows, last 2 cols):\n", arr2d[:2, 1:])
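
One way NumPy indexing differs from list slicing is that a basic slice is a view onto the original data, not a copy. The sketch below illustrates this with the same 3x3 array; `first_row` and `safe_row` are names chosen for illustration.

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# A basic index or slice is a view: writing to it changes the original.
first_row = arr2d[0]
first_row[0] = 100
print("Original after editing the view:\n", arr2d)

# .copy() gives an independent array.
safe_row = arr2d[1].copy()
safe_row[0] = -1
print("Row 1 of the original is unchanged:", arr2d[1])

# Negative indices and steps work as they do with lists.
print("Last column:", arr2d[:, -1])
print("Every other row:\n", arr2d[::2])
```

Call `.copy()` whenever you plan to modify a slice without affecting the source array.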

## 4. Vectorized Operations

This is where NumPy shines! Operations apply to entire arrays without explicit Python loops. The looping happens in compiled code, which makes programs both cleaner and **much faster**.


In [None]:
arr = np.array([1, 2, 3, 4, 5])

# Element-wise operations
print("Original:", arr)
print("Add 10:", arr + 10)
print("Multiply by 2:", arr * 2)
print("Square:", arr**2)

# Operations between arrays
arr2 = np.array([10, 20, 30, 40, 50])
print("\nArray 1:", arr)
print("Array 2:", arr2)
print("Sum:", arr + arr2)
print("Product:", arr * arr2)
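
To see how element-wise operations replace a loop in a typical ML calculation, here is a sketch that computes a mean squared error both ways. The target and prediction values are hypothetical, made up for illustration.

```python
import numpy as np

# Hypothetical targets and predictions (illustrative values only).
y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_pred = np.array([2.5, 5.0, 3.0, 8.0])

# Loop version: accumulate squared errors one element at a time.
total = 0.0
for t, p in zip(y_true, y_pred):
    total += (t - p) ** 2
mse_loop = total / len(y_true)

# Vectorized version: subtract, square, and average whole arrays.
mse_vectorized = np.mean((y_true - y_pred) ** 2)

print("MSE (loop):", mse_loop)
print("MSE (vectorized):", mse_vectorized)
```

Both versions give the same result; the vectorized one reads almost like the formula itself.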

### Why Vectorization Matters

Let's compare the speed of a loop vs. vectorized operations:


In [None]:
import time

# Create a large array
np.random.seed(42)  # For reproducibility
large_arr = np.random.rand(1_000_000)

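# Using a Python loop (list comprehension)
# time.perf_counter() is a high-resolution clock intended for timing code
start = time.perf_counter()
result_loop = [x * 2 for x in large_arr]
loop_time = time.perf_counter() - start

# Using NumPy vectorization
start = time.perf_counter()
result_numpy = large_arr * 2
numpy_time = time.perf_counter() - start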

print(f"Python loop: {loop_time:.4f} seconds")
print(f"NumPy vectorized: {numpy_time:.4f} seconds")
print(f"NumPy is {loop_time / numpy_time:.1f}x faster!")
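
A single timed run can be noisy. As an alternative sketch, the standard library's `timeit` module repeats the measurement several times. The array size and repeat counts below are arbitrary choices for illustration, and `np.random.default_rng` is used here in place of `np.random.seed`.

```python
import timeit

import numpy as np

rng = np.random.default_rng(42)
values = rng.random(100_000)  # smaller array so the repeats stay quick

# timeit.repeat runs each callable `number` times per trial; the minimum
# over trials is usually the least noisy estimate.
loop_best = min(timeit.repeat(lambda: [x * 2 for x in values], number=5, repeat=3))
vec_best = min(timeit.repeat(lambda: values * 2, number=5, repeat=3))

print(f"Loop (best of 3 trials, 5 runs each): {loop_best:.4f} s")
print(f"Vectorized (best of 3 trials, 5 runs each): {vec_best:.4f} s")
```

The exact numbers depend on your machine, so treat them as a rough comparison only.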

## 5. Boolean Indexing

Boolean indexing filters an array using a condition, which is extremely useful for data processing.


In [None]:
arr = np.array([1, 5, 3, 8, 2, 9, 4, 7])

# Create a boolean mask
mask = arr > 5
print("Array:", arr)
print("Mask (arr > 5):", mask)

# Use the mask to filter
print("Values > 5:", arr[mask])
print("Values <= 5:", arr[arr <= 5])

# Combining conditions (use & for "and"; wrap each condition in parentheses)
print("Values between 3 and 7:", arr[(arr >= 3) & (arr <= 7)])
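
Masks can do more than filter. The sketch below, using the same array, shows the "or" and "not" operators, assignment through a mask, `np.where`, and counting matches. The `"high"`/`"low"` labels are illustrative.

```python
import numpy as np

arr = np.array([1, 5, 3, 8, 2, 9, 4, 7])

# | is element-wise OR and ~ is element-wise NOT.
print("Below 3 or above 7:", arr[(arr < 3) | (arr > 7)])
print("Not greater than 5:", arr[~(arr > 5)])

# Assigning through a mask changes only the selected elements.
capped = arr.copy()
capped[capped > 5] = 5
print("Capped at 5:", capped)

# np.where picks between two values element by element.
labels = np.where(arr > 5, "high", "low")
print("Labels:", labels)

# True counts as 1, so summing a mask counts the matches.
count_above_5 = np.sum(arr > 5)
print("Count of values > 5:", count_above_5)
```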

## 6. Mathematical Functions

NumPy provides many built-in mathematical and statistical functions.


In [None]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Statistical functions
print("Array:", arr)
print("Sum:", np.sum(arr))
print("Mean:", np.mean(arr))
print("Standard deviation:", np.std(arr))  # population std (ddof=0) by default
print("Min:", np.min(arr))
print("Max:", np.max(arr))

# Finding indices
print("\nIndex of max:", np.argmax(arr))
print("Index of min:", np.argmin(arr))

# Mathematical functions
angles = np.array([0, np.pi / 4, np.pi / 2, np.pi])
print("\nAngles (radians):", angles)
print("Sine:", np.sin(angles).round(4))
print("Cosine:", np.cos(angles).round(4))
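
On 2D arrays, most of these functions accept an `axis` argument that controls which dimension is collapsed. The sketch below also shows the `ddof` parameter of `np.std`; the sample values are made up for illustration.

```python
import numpy as np

scores = np.array([[1, 2, 3], [4, 5, 6]])  # 2 rows, 3 columns

# axis=0 collapses the rows, giving one result per column.
print("Column sums:", scores.sum(axis=0))
# axis=1 collapses the columns, giving one result per row.
print("Row means:", scores.mean(axis=1))

# np.std defaults to the population formula (ddof=0);
# ddof=1 gives the sample standard deviation.
values = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print("Population std:", np.std(values))
print("Sample std:", np.std(values, ddof=1))
```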

## 7. Broadcasting

Broadcasting lets NumPy perform arithmetic on arrays of different shapes, as long as the shapes are compatible: the smaller array is treated as if it were repeated to match the larger one.


In [None]:
# Scalar + array (scalar is "broadcast" to match array shape)
arr = np.array([1, 2, 3])
print("Array + 10:", arr + 10)

# 2D array + 1D array
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
row = np.array([10, 20, 30])

print("\nMatrix:\n", matrix)
print("Row:", row)
print("\nMatrix + Row (row added to each row):\n", matrix + row)

# Practical example: normalize data (subtract mean, divide by std)
data = np.array([[10, 200, 3000], [20, 400, 6000], [30, 600, 9000]])
means = data.mean(axis=0)  # mean of each column
stds = data.std(axis=0)  # std of each column

normalized = (data - means) / stds
print("\nOriginal data:\n", data)
print("\nNormalized data:\n", normalized.round(2))
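
Broadcasting compares shapes from the right, and each pair of sizes must either match or include a 1. The following sketch shows how to add a column instead of a row, and what happens when shapes are incompatible. The names `col` and `col_vector` are chosen for illustration.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
col = np.array([10, 20, 30])

# A 1D array of shape (3,) is treated as a row. Adding a new axis
# turns it into a column of shape (3, 1).
col_vector = col[:, np.newaxis]
print("Column vector shape:", col_vector.shape)

by_column = matrix + col_vector
print("Column added to each column:\n", by_column)

# Incompatible shapes, here (3, 3) and (2,), raise a ValueError.
try:
    matrix + np.array([1, 2])
    raised = False
except ValueError as err:
    raised = True
    print("Broadcast error:", err)
```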