# NumPy Crash Course for Data Science Assessments

**Date:** 20 January 2026

This notebook reviews NumPy fundamentals and common interview/assessment topics: array creation, manipulation, and mathematical operations, followed by practice questions with solutions.

---

## Table of Contents

1. [Introduction to NumPy](#1-introduction-to-numpy)
2. [Array Creation](#2-array-creation)
3. [Array Indexing and Slicing](#3-array-indexing-and-slicing)
4. [Array Reshaping](#4-array-reshaping)
5. [Broadcasting](#5-broadcasting)
6. [Mathematical Operations](#6-mathematical-operations)
7. [Statistical Functions](#7-statistical-functions)
8. [Boolean and Fancy Indexing](#8-boolean-and-fancy-indexing)
9. [Stacking and Splitting Arrays](#9-stacking-and-splitting-arrays)
10. [Linear Algebra Basics](#10-linear-algebra-basics)
11. [Practice Questions](#11-practice-questions)

---

## 1. Introduction to NumPy

NumPy (Numerical Python) is a fundamental library for scientific computing in Python. It provides support for large, multi-dimensional arrays and matrices, along with a vast collection of high-level mathematical functions.

### Key Advantages of NumPy over Python Lists:

- **Performance:** NumPy arrays are stored in contiguous memory blocks, enabling vectorised operations that are significantly faster than Python loops
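- **Memory Efficiency:** NumPy arrays store raw values in a single typed buffer, so they typically use far less memory than a Python list of the same numbers, which holds a pointer to a separate object for each element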
- **Homogeneous Data Types:** All elements in a NumPy array must be of the same type, which enables optimised computations
- **Broadcasting:** Allows arithmetic operations on arrays of different shapes without explicit loops
- **Vectorisation:** Perform operations on entire arrays rather than individual elements
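
The memory and vectorisation points above can be checked with a rough sketch like the one below. The element count `n` and the variable names are illustrative assumptions, and the list figure is only an approximation because `sys.getsizeof` reports per-object sizes that vary between Python builds.

```python
import sys
import numpy as np

n = 100_000  # illustrative size, not taken from the notebook
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# A list stores pointers to separate int objects; an array stores raw 8-byte values.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
arr_bytes = np_arr.nbytes  # size of the data buffer only

# One array expression replaces an explicit Python loop.
squares_loop = [x * x for x in py_list]
squares_vec = np_arr * np_arr

print(f"List (approx.): {list_bytes} bytes")
print(f"Array buffer:   {arr_bytes} bytes")
print(f"Same result: {np.array_equal(squares_vec, squares_loop)}")
```

Exact timings depend on the machine, so the sketch compares results and memory only; `%timeit` in a notebook is the usual way to measure the speed difference.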

In [5]:
import numpy as np

print(f"NumPy version: {np.__version__}")

NumPy version: 2.4.1


---

## 2. Array Creation

NumPy provides numerous ways to create arrays. Understanding these methods is essential for data manipulation and preprocessing.

### 2.1 Creating Arrays from Python Lists

In [6]:
# 1D array from a list
arr_1d = np.array([1, 2, 3, 4, 5])
print(f"1D Array: {arr_1d}")
print(f"Shape: {arr_1d.shape}")
print(f"Data type: {arr_1d.dtype}")
print()

# 2D array from nested lists
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(f"2D Array:\n{arr_2d}")
print(f"Shape: {arr_2d.shape}")
print()

# Specifying data type explicitly
arr_float = np.array([1, 2, 3], dtype=np.float64)
print(f"Float array: {arr_float}")
print(f"Data type: {arr_float.dtype}")

1D Array: [1 2 3 4 5]
Shape: (5,)
Data type: int64

2D Array:
[[1 2 3]
 [4 5 6]]
Shape: (2, 3)

Float array: [1. 2. 3.]
Data type: float64


### 2.2 Arrays of Zeros, Ones, and Empty Arrays

In [7]:
# Array of zeros
zeros_1d = np.zeros(5)
print(f"1D Zeros: {zeros_1d}")

zeros_2d = np.zeros((3, 4))
print(f"2D Zeros (3x4):\n{zeros_2d}")
print()

# Array of ones
ones_1d = np.ones(5)
print(f"1D Ones: {ones_1d}")

ones_2d = np.ones((2, 3), dtype=np.int32)
print(f"2D Ones (int32):\n{ones_2d}")
print()

# Array filled with a specific value
full_arr = np.full((2, 3), fill_value=7)
print(f"Array filled with 7:\n{full_arr}")
print()

# Identity matrix
identity = np.eye(4)
print(f"4x4 Identity matrix:\n{identity}")

1D Zeros: [0. 0. 0. 0. 0.]
2D Zeros (3x4):
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]

1D Ones: [1. 1. 1. 1. 1.]
2D Ones (int32):
[[1 1 1]
 [1 1 1]]

Array filled with 7:
[[7 7 7]
 [7 7 7]]

4x4 Identity matrix:
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]


### 2.3 arange and linspace

In [8]:
# arange: similar to Python's range but returns an array
# Syntax: np.arange(start, stop, step)
arr_arange = np.arange(0, 10, 2)
print(f"arange(0, 10, 2): {arr_arange}")

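# Caution: with a non-integer step, floating-point rounding can change the number of elements; prefer linspace when the count matters
arr_arange_float = np.arange(0, 1, 0.1)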
print(f"arange(0, 1, 0.1): {arr_arange_float}")
print()

# linspace: creates evenly spaced numbers over a specified interval
# Syntax: np.linspace(start, stop, num)
arr_linspace = np.linspace(0, 1, 5)
print(f"linspace(0, 1, 5): {arr_linspace}")

arr_linspace_10 = np.linspace(0, 100, 11)
print(f"linspace(0, 100, 11): {arr_linspace_10}")

arange(0, 10, 2): [0 2 4 6 8]
arange(0, 1, 0.1): [0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9]

linspace(0, 1, 5): [0.   0.25 0.5  0.75 1.  ]
linspace(0, 100, 11): [  0.  10.  20.  30.  40.  50.  60.  70.  80.  90. 100.]
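
When the number of points matters more than the step, `linspace` can reproduce a half-open `arange`-style grid. The sketch below assumes the same 0 to 1 interval as the cell above and uses the `endpoint` and `retstep` parameters of `np.linspace`.

```python
import numpy as np

# endpoint=False excludes the stop value, mirroring arange's half-open interval.
# retstep=True also returns the spacing that linspace computed.
points, step = np.linspace(0, 1, 10, endpoint=False, retstep=True)

print(f"Points: {points}")
print(f"Step: {step}")
print(f"Matches arange(0, 1, 0.1): {np.allclose(points, np.arange(0, 1, 0.1))}")
```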


### 2.4 Random Arrays

In [9]:
# Set seed for reproducibility
np.random.seed(42)

# Random values between 0 and 1 (uniform distribution)
rand_uniform = np.random.rand(3, 4)
print(f"Random uniform (3x4):\n{rand_uniform}")
print()

# Random integers in [1, 100): the upper bound is exclusive, so values range from 1 to 99
rand_int = np.random.randint(1, 100, size=(3, 3))
print(f"Random integers 1-100 (3x3):\n{rand_int}")
print()

# Standard normal distribution (mean=0, std=1)
rand_normal = np.random.randn(2, 3)
print(f"Random normal (2x3):\n{rand_normal}")
print()

# Normal distribution with custom mean and standard deviation
rand_custom_normal = np.random.normal(loc=50, scale=10, size=5)
print(f"Normal (mean=50, std=10): {rand_custom_normal}")
print()

# Random choice from an array
choices = np.random.choice([10, 20, 30, 40, 50], size=3, replace=False)
print(f"Random choice without replacement: {choices}")

Random uniform (3x4):
[[0.37454012 0.95071431 0.73199394 0.59865848]
 [0.15601864 0.15599452 0.05808361 0.86617615]
 [0.60111501 0.70807258 0.02058449 0.96990985]]

Random integers 1-100 (3x3):
[[30 38  2]
 [64 60 21]
 [33 76 58]]

Random normal (2x3):
[[-2.61254901  0.95036968  0.81644508]
 [-1.523876   -0.42804606 -0.74240684]]

Normal (mean=50, std=10): [42.96656198 28.60379344 43.70525039 55.97720467 75.59488031]

Random choice without replacement: [20 40 50]
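
The cell above uses the legacy global-state functions. A possible equivalent with the newer `Generator` API is sketched below; the variable names are illustrative, and the generated values will differ from the legacy outputs shown above even with the same seed.

```python
import numpy as np

# A Generator keeps its own state instead of relying on the global seed.
rng = np.random.default_rng(42)

uniform = rng.random((3, 4))                            # uniform on [0, 1)
integers = rng.integers(1, 100, size=(3, 3))            # upper bound exclusive
normal = rng.standard_normal((2, 3))                    # mean 0, std 1
custom = rng.normal(loc=50, scale=10, size=5)           # mean 50, std 10
choices = rng.choice([10, 20, 30, 40, 50], size=3, replace=False)

# Two generators created with the same seed produce the same stream.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
same_stream = np.array_equal(rng_a.random(5), rng_b.random(5))
print(f"Reproducible: {same_stream}")
```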


---

## 3. Array Indexing and Slicing

NumPy provides powerful indexing and slicing capabilities for accessing and modifying array elements.

### 3.1 Basic Indexing

In [10]:
arr = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
print(f"Original array: {arr}")
print()

# Positive indexing (starts from 0)
print(f"First element (arr[0]): {arr[0]}")
print(f"Fifth element (arr[4]): {arr[4]}")

# Negative indexing (starts from -1 for last element)
print(f"Last element (arr[-1]): {arr[-1]}")
print(f"Second to last (arr[-2]): {arr[-2]}")

Original array: [ 10  20  30  40  50  60  70  80  90 100]

First element (arr[0]): 10
Fifth element (arr[4]): 50
Last element (arr[-1]): 100
Second to last (arr[-2]): 90


### 3.2 Slicing 1D Arrays

In [11]:
arr = np.arange(10)
print(f"Original array: {arr}")
print()

# Syntax: arr[start:stop:step]
print(f"arr[2:7]: {arr[2:7]}")
print(f"arr[:5]: {arr[:5]}")
print(f"arr[5:]: {arr[5:]}")
print(f"arr[::2] (every second): {arr[::2]}")
print(f"arr[::-1] (reversed): {arr[::-1]}")
print(f"arr[1:8:2]: {arr[1:8:2]}")

Original array: [0 1 2 3 4 5 6 7 8 9]

arr[2:7]: [2 3 4 5 6]
arr[:5]: [0 1 2 3 4]
arr[5:]: [5 6 7 8 9]
arr[::2] (every second): [0 2 4 6 8]
arr[::-1] (reversed): [9 8 7 6 5 4 3 2 1 0]
arr[1:8:2]: [1 3 5 7]


### 3.3 Indexing and Slicing 2D Arrays

In [12]:
arr_2d = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12]])
print(f"2D Array:\n{arr_2d}")
print(f"Shape: {arr_2d.shape}")
print()

# Accessing single elements: arr[row, col]
print(f"Element at (0, 0): {arr_2d[0, 0]}")
print(f"Element at (1, 2): {arr_2d[1, 2]}")
print(f"Element at (2, -1): {arr_2d[2, -1]}")
print()

# Slicing rows and columns
print(f"First row: {arr_2d[0]}")
print(f"First row (explicit): {arr_2d[0, :]}")
print(f"First column: {arr_2d[:, 0]}")
print(f"Last column: {arr_2d[:, -1]}")
print()

# Subarray slicing
print(f"Rows 0-1, Cols 1-2:\n{arr_2d[0:2, 1:3]}")
print(f"Every other row:\n{arr_2d[::2, :]}")

2D Array:
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
Shape: (3, 4)

Element at (0, 0): 1
Element at (1, 2): 7
Element at (2, -1): 12

First row: [1 2 3 4]
First row (explicit): [1 2 3 4]
First column: [1 5 9]
Last column: [ 4  8 12]

Rows 0-1, Cols 1-2:
[[2 3]
 [6 7]]
Every other row:
[[ 1  2  3  4]
 [ 9 10 11 12]]


### 3.4 Important: Views vs Copies

Basic slices return **views** of the original array, not copies, so modifying a view also modifies the original array. Boolean and fancy indexing (Section 8) return copies instead.

In [13]:
original = np.array([1, 2, 3, 4, 5])
print(f"Original: {original}")

# Creating a view (slice)
view_slice = original[1:4]
print(f"View (slice): {view_slice}")

# Modifying the view affects the original
view_slice[0] = 999
print(f"After modifying view: {original}")
print()

# To avoid this, create an explicit copy
original = np.array([1, 2, 3, 4, 5])
copy_arr = original[1:4].copy()
copy_arr[0] = 888
print(f"Original after modifying copy: {original}")
print(f"Copy: {copy_arr}")

Original: [1 2 3 4 5]
View (slice): [2 3 4]
After modifying view: [  1 999   3   4   5]

Original after modifying copy: [1 2 3 4 5]
Copy: [888   3   4]
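
To check whether a result is a view or a copy without modifying anything, `np.shares_memory` and the `.base` attribute can help. This is a small sketch with illustrative variable names.

```python
import numpy as np

original = np.array([1, 2, 3, 4, 5])

basic_slice = original[1:4]          # basic slicing: view
fancy_pick = original[[1, 2, 3]]     # fancy indexing: copy
bool_pick = original[original > 1]   # boolean indexing: copy

print(f"Slice shares memory: {np.shares_memory(original, basic_slice)}")
print(f"Slice base is original: {basic_slice.base is original}")
print(f"Fancy pick shares memory: {np.shares_memory(original, fancy_pick)}")
print(f"Boolean pick shares memory: {np.shares_memory(original, bool_pick)}")
```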


---

## 4. Array Reshaping

Reshaping is crucial for data preprocessing and adapting data to various algorithm input requirements.

### 4.1 reshape()

In [14]:
arr = np.arange(12)
print(f"Original 1D array: {arr}")
print(f"Shape: {arr.shape}")
print()

# Reshape to 2D (3 rows, 4 columns)
arr_3x4 = arr.reshape(3, 4)
print(f"Reshaped to (3, 4):\n{arr_3x4}")
print()

# Reshape to 2D (4 rows, 3 columns)
arr_4x3 = arr.reshape(4, 3)
print(f"Reshaped to (4, 3):\n{arr_4x3}")
print()

# Reshape to 3D
arr_3d = arr.reshape(2, 2, 3)
print(f"Reshaped to (2, 2, 3):\n{arr_3d}")
print()

# Using -1 to automatically infer one dimension
arr_auto = arr.reshape(-1, 4)
print(f"Reshaped with -1 (auto-infer rows): {arr_auto.shape}")

arr_auto_2 = arr.reshape(3, -1)
print(f"Reshaped with -1 (auto-infer cols): {arr_auto_2.shape}")

Original 1D array: [ 0  1  2  3  4  5  6  7  8  9 10 11]
Shape: (12,)

Reshaped to (3, 4):
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

Reshaped to (4, 3):
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]

Reshaped to (2, 2, 3):
[[[ 0  1  2]
  [ 3  4  5]]

 [[ 6  7  8]
  [ 9 10 11]]]

Reshaped with -1 (auto-infer rows): (3, 4)
Reshaped with -1 (auto-infer cols): (3, 4)


### 4.2 flatten() and ravel()

In [15]:
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(f"Original 2D array:\n{arr_2d}")
print()

# flatten() returns a COPY
flattened = arr_2d.flatten()
print(f"Flattened (copy): {flattened}")
flattened[0] = 999
print(f"Original after modifying flattened: {arr_2d[0, 0]}")
print()

# ravel() returns a VIEW (when possible)
ravelled = arr_2d.ravel()
print(f"Ravelled (view): {ravelled}")
ravelled[0] = 888
print(f"Original after modifying ravelled: {arr_2d[0, 0]}")
print()

# Flatten with different orders
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print(f"Row-major (C-style, default): {arr_2d.flatten('C')}")
print(f"Column-major (Fortran-style): {arr_2d.flatten('F')}")

Original 2D array:
[[1 2 3]
 [4 5 6]]

Flattened (copy): [1 2 3 4 5 6]
Original after modifying flattened: 1

Ravelled (view): [1 2 3 4 5 6]
Original after modifying ravelled: 888

Row-major (C-style, default): [1 2 3 4 5 6]
Column-major (Fortran-style): [1 4 2 5 3 6]
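
The "when possible" caveat for `ravel()` matters in practice. As a sketch, ravelling a transposed array in the default C order cannot reuse the existing memory layout, so NumPy returns a copy instead of a view.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6]])

contiguous_ravel = arr_2d.ravel()     # C-contiguous source: view
transposed_ravel = arr_2d.T.ravel()   # transposed source: copy

print(f"ravel() shares memory: {np.shares_memory(arr_2d, contiguous_ravel)}")
print(f"T.ravel() shares memory: {np.shares_memory(arr_2d, transposed_ravel)}")

# Writing to the copy leaves the original untouched.
transposed_ravel[0] = 999
print(f"Original [0, 0] after write: {arr_2d[0, 0]}")
```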


### 4.3 transpose()

In [16]:
arr = np.array([[1, 2, 3],
                [4, 5, 6]])
print(f"Original (2x3):\n{arr}")
print()

# Transpose swaps rows and columns
transposed = arr.T
print(f"Transposed (3x2):\n{transposed}")
print()

# Alternative: np.transpose()
transposed_2 = np.transpose(arr)
print(f"Using np.transpose():\n{transposed_2}")
print()

# Transpose for 3D arrays (specify axes order)
arr_3d = np.arange(24).reshape(2, 3, 4)
print(f"3D array shape: {arr_3d.shape}")
print(f"Transposed (default) shape: {arr_3d.T.shape}")
print(f"Custom transpose (1, 0, 2) shape: {np.transpose(arr_3d, (1, 0, 2)).shape}")

Original (2x3):
[[1 2 3]
 [4 5 6]]

Transposed (3x2):
[[1 4]
 [2 5]
 [3 6]]

Using np.transpose():
[[1 4]
 [2 5]
 [3 6]]

3D array shape: (2, 3, 4)
Transposed (default) shape: (4, 3, 2)
Custom transpose (1, 0, 2) shape: (3, 2, 4)


### 4.4 Adding and Removing Dimensions

In [17]:
arr = np.array([1, 2, 3, 4, 5])
print(f"Original shape: {arr.shape}")
print()

# Adding a new axis using np.newaxis
row_vec = arr[np.newaxis, :]
print(f"Row vector shape (1, 5): {row_vec.shape}")
print(f"Row vector: {row_vec}")

col_vec = arr[:, np.newaxis]
print(f"Column vector shape (5, 1): {col_vec.shape}")
print(f"Column vector:\n{col_vec}")
print()

# Using np.expand_dims()
expanded = np.expand_dims(arr, axis=0)
print(f"expand_dims(axis=0): {expanded.shape}")

expanded_2 = np.expand_dims(arr, axis=1)
print(f"expand_dims(axis=1): {expanded_2.shape}")
print()

# Removing single-element dimensions with squeeze()
arr_with_extra = np.array([[[1, 2, 3]]])
print(f"Before squeeze: {arr_with_extra.shape}")
print(f"After squeeze: {np.squeeze(arr_with_extra).shape}")

Original shape: (5,)

Row vector shape (1, 5): (1, 5)
Row vector: [[1 2 3 4 5]]
Column vector shape (5, 1): (5, 1)
Column vector:
[[1]
 [2]
 [3]
 [4]
 [5]]

expand_dims(axis=0): (1, 5)
expand_dims(axis=1): (5, 1)

Before squeeze: (1, 1, 3)
After squeeze: (3,)


---

## 5. Broadcasting

Broadcasting allows NumPy to perform element-wise operations on arrays of different shapes without explicitly copying data.

### 5.1 Broadcasting Rules

Broadcasting follows these rules when comparing array shapes dimension by dimension (from right to left):

1. **Rule 1:** If the arrays have different numbers of dimensions, the shape of the array with fewer dimensions is padded with ones on its **left** side
2. **Rule 2:** If the shapes differ in a dimension, the array whose size is 1 in that dimension is stretched (virtually, without copying data) to match the other
3. **Rule 3:** If the sizes differ in a dimension and neither is 1, a `ValueError` is raised
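
The rules above can be tried out on shapes alone with `np.broadcast_shapes`, which applies them without allocating any arrays. The shapes below are illustrative choices.

```python
import numpy as np

# Rule 1 pads (3,) to (1, 3); Rule 2 stretches it to (3, 3).
shape_a = np.broadcast_shapes((3, 3), (3,))

# (4,) is padded to (1, 4); both size-1 dimensions are stretched.
shape_b = np.broadcast_shapes((3, 1), (4,))

# (4, 1) is padded to (1, 4, 1) before stretching.
shape_c = np.broadcast_shapes((2, 1, 5), (4, 1))

print(shape_a, shape_b, shape_c)

# Rule 3: sizes 3 and 4 differ and neither is 1.
try:
    np.broadcast_shapes((3,), (4,))
    compatible = True
except ValueError as exc:
    compatible = False
    print(f"Incompatible shapes: {exc}")
```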

### 5.2 Scalar Broadcasting

In [18]:
arr = np.array([1, 2, 3, 4, 5])
print(f"Original: {arr}")
print()

# Scalar is broadcast to all elements
print(f"arr + 10: {arr + 10}")
print(f"arr * 2: {arr * 2}")
print(f"arr ** 2: {arr ** 2}")

Original: [1 2 3 4 5]

arr + 10: [11 12 13 14 15]
arr * 2: [ 2  4  6  8 10]
arr ** 2: [ 1  4  9 16 25]


### 5.3 1D Array with 2D Array

In [19]:
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
arr_1d = np.array([10, 20, 30])

print(f"2D array (3x3):\n{arr_2d}")
print(f"1D array (3,): {arr_1d}")
print()

# The 1D array is broadcast across each row
# Shape comparison: (3, 3) vs (3,) -> (3, 3) vs (1, 3) -> broadcast!
result = arr_2d + arr_1d
print(f"2D + 1D (broadcast across rows):\n{result}")

2D array (3x3):
[[1 2 3]
 [4 5 6]
 [7 8 9]]
1D array (3,): [10 20 30]

2D + 1D (broadcast across rows):
[[11 22 33]
 [14 25 36]
 [17 28 39]]


### 5.4 Column Vector Broadcasting

In [20]:
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
col_vec = np.array([[100], [200], [300]])

print(f"2D array (3x3):\n{arr_2d}")
print(f"Column vector (3x1):\n{col_vec}")
print()

# The column vector is broadcast across each column
# Shape comparison: (3, 3) vs (3, 1) -> broadcast!
result = arr_2d + col_vec
print(f"2D + column vector (broadcast across columns):\n{result}")

2D array (3x3):
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Column vector (3x1):
[[100]
 [200]
 [300]]

2D + column vector (broadcast across columns):
[[101 102 103]
 [204 205 206]
 [307 308 309]]


### 5.5 Outer Product via Broadcasting

In [21]:
x = np.array([1, 2, 3])
y = np.array([10, 20, 30, 40])

# Reshape x to column vector (3, 1) and keep y as row (4,)
# Broadcasting: (3, 1) * (4,) -> (3, 1) * (1, 4) -> (3, 4)
outer_product = x[:, np.newaxis] * y
print(f"x (column): {x[:, np.newaxis].shape}")
print(f"y (row): {y.shape}")
print(f"Outer product (3x4):\n{outer_product}")

x (column): (3, 1)
y (row): (4,)
Outer product (3x4):
[[ 10  20  30  40]
 [ 20  40  60  80]
 [ 30  60  90 120]]
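
For comparison, `np.outer` gives the same result as the broadcasting expression, and the same column-versus-row trick works for other operations. A short sketch using the `x` and `y` values from the cell above:

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20, 30, 40])

via_broadcasting = x[:, np.newaxis] * y
via_outer = np.outer(x, y)
print(f"Same result: {np.array_equal(via_broadcasting, via_outer)}")

# The same pattern builds a table of pairwise differences.
pairwise_diff = x[:, np.newaxis] - y
print(f"Pairwise differences shape: {pairwise_diff.shape}")
```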


### 5.6 Practical Example: Normalising Data

In [22]:
# Sample data: 5 samples, 3 features
data = np.array([[1, 100, 1000],
                 [2, 200, 2000],
                 [3, 300, 3000],
                 [4, 400, 4000],
                 [5, 500, 5000]], dtype=np.float64)

print(f"Original data:\n{data}")
print()

# Calculate mean and std for each feature (column); std() uses ddof=0 (population std) by default
mean_per_feature = data.mean(axis=0)
std_per_feature = data.std(axis=0)

print(f"Mean per feature: {mean_per_feature}")
print(f"Std per feature: {std_per_feature}")
print()

# Standardise (z-score normalisation) using broadcasting
normalised = (data - mean_per_feature) / std_per_feature
print(f"Normalised data:\n{normalised}")
print(f"Normalised mean: {normalised.mean(axis=0)}")
print(f"Normalised std: {normalised.std(axis=0)}")

Original data:
[[1.e+00 1.e+02 1.e+03]
 [2.e+00 2.e+02 2.e+03]
 [3.e+00 3.e+02 3.e+03]
 [4.e+00 4.e+02 4.e+03]
 [5.e+00 5.e+02 5.e+03]]

Mean per feature: [   3.  300. 3000.]
Std per feature: [   1.41421356  141.42135624 1414.21356237]

Normalised data:
[[-1.41421356 -1.41421356 -1.41421356]
 [-0.70710678 -0.70710678 -0.70710678]
 [ 0.          0.          0.        ]
 [ 0.70710678  0.70710678  0.70710678]
 [ 1.41421356  1.41421356  1.41421356]]
Normalised mean: [0. 0. 0.]
Normalised std: [1. 1. 1.]
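
Column-wise statistics broadcast directly because their shape `(3,)` lines up with the last axis. Row-wise statistics need `keepdims=True` so that the result has shape `(5, 1)`. The sketch below uses made-up data with the same `(5, 3)` shape.

```python
import numpy as np

data = np.arange(15, dtype=np.float64).reshape(5, 3)

# keepdims=True keeps the reduced axis as size 1, giving shape (5, 1).
row_means = data.mean(axis=1, keepdims=True)
centred = data - row_means
print(f"Row means shape: {row_means.shape}")
print(f"Row means after centring: {centred.mean(axis=1)}")

# Without keepdims the shape is (5,), which cannot broadcast against (5, 3).
try:
    data - data.mean(axis=1)
    failed = False
except ValueError:
    failed = True
print(f"Subtraction without keepdims failed: {failed}")
```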


---

## 6. Mathematical Operations

NumPy provides extensive support for element-wise and matrix operations.

### 6.1 Element-wise Operations

In [23]:
a = np.array([1, 2, 3, 4, 5])
b = np.array([10, 20, 30, 40, 50])

print(f"a: {a}")
print(f"b: {b}")
print()

# Basic arithmetic (element-wise)
print(f"a + b: {a + b}")
print(f"a - b: {a - b}")
print(f"a * b: {a * b}")
print(f"b / a: {b / a}")
print(f"b // a (floor): {b // a}")
print(f"b % a (modulo): {b % a}")
print(f"a ** 2: {a ** 2}")

a: [1 2 3 4 5]
b: [10 20 30 40 50]

a + b: [11 22 33 44 55]
a - b: [ -9 -18 -27 -36 -45]
a * b: [ 10  40  90 160 250]
b / a: [10. 10. 10. 10. 10.]
b // a (floor): [10 10 10 10 10]
b % a (modulo): [0 0 0 0 0]
a ** 2: [ 1  4  9 16 25]


### 6.2 Universal Functions (ufuncs)

In [24]:
arr = np.array([0, np.pi/6, np.pi/4, np.pi/3, np.pi/2])
print(f"Angles (radians): {arr}")
print()

# Trigonometric functions
print(f"sin: {np.sin(arr)}")
print(f"cos: {np.cos(arr)}")
print()

# Exponential and logarithmic
arr2 = np.array([1, 2, 3, 4, 5])
print(f"exp: {np.exp(arr2)}")
print(f"log (natural): {np.log(arr2)}")
print(f"log10: {np.log10(arr2)}")
print(f"log2: {np.log2(arr2)}")
print()

# Square root and power
print(f"sqrt: {np.sqrt(arr2)}")
print(f"power(arr, 3): {np.power(arr2, 3)}")

Angles (radians): [0.         0.52359878 0.78539816 1.04719755 1.57079633]

sin: [0.         0.5        0.70710678 0.8660254  1.        ]
cos: [1.00000000e+00 8.66025404e-01 7.07106781e-01 5.00000000e-01
 6.12323400e-17]

exp: [  2.71828183   7.3890561   20.08553692  54.59815003 148.4131591 ]
log (natural): [0.         0.69314718 1.09861229 1.38629436 1.60943791]
log10: [0.         0.30103    0.47712125 0.60205999 0.69897   ]
log2: [0.         1.         1.5849625  2.         2.32192809]

sqrt: [1.         1.41421356 1.73205081 2.         2.23606798]
power(arr, 3): [  1   8  27  64 125]


### 6.3 Dot Product and Matrix Multiplication

In [25]:
# Dot product of 1D arrays (scalar result)
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

dot_product = np.dot(a, b)
print(f"a: {a}")
print(f"b: {b}")
print(f"Dot product (np.dot): {dot_product}")
print(f"Dot product (@ operator): {a @ b}")
print(f"Manual calculation: 1*4 + 2*5 + 3*6 = {1*4 + 2*5 + 3*6}")
print()

# Matrix multiplication
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

print(f"Matrix A:\n{A}")
print(f"Matrix B:\n{B}")
print()

# Matrix multiplication (not element-wise!)
matmul_result = np.matmul(A, B)
print(f"Matrix multiplication (np.matmul):\n{matmul_result}")
print(f"Matrix multiplication (@ operator):\n{A @ B}")
print(f"Matrix multiplication (np.dot):\n{np.dot(A, B)}")
print()

# Note: Element-wise multiplication is different!
print(f"Element-wise multiplication (A * B):\n{A * B}")

a: [1 2 3]
b: [4 5 6]
Dot product (np.dot): 32
Dot product (@ operator): 32
Manual calculation: 1*4 + 2*5 + 3*6 = 32

Matrix A:
[[1 2]
 [3 4]]
Matrix B:
[[5 6]
 [7 8]]

Matrix multiplication (np.matmul):
[[19 22]
 [43 50]]
Matrix multiplication (@ operator):
[[19 22]
 [43 50]]
Matrix multiplication (np.dot):
[[19 22]
 [43 50]]

Element-wise multiplication (A * B):
[[ 5 12]
 [21 32]]
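
For 1D and 2D inputs `np.dot` and `@` agree, as shown above, but they differ for stacks of matrices. A small sketch with arbitrary all-ones arrays shows the shape difference:

```python
import numpy as np

batch_a = np.ones((2, 3, 4))
batch_b = np.ones((2, 4, 5))

# @ / np.matmul treats the leading axis as a batch of matrices.
matmul_shape = (batch_a @ batch_b).shape

# np.dot combines every matrix in batch_a with every matrix in batch_b.
dot_shape = np.dot(batch_a, batch_b).shape

print(f"matmul shape: {matmul_shape}")
print(f"dot shape: {dot_shape}")
```

For batched data, `@` is usually the operation intended.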


---

## 7. Statistical Functions

NumPy provides comprehensive statistical functions that can operate along specified axes.

### 7.1 Basic Statistics

In [26]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(f"Array: {arr}")
print()

print(f"Sum: {np.sum(arr)}")
print(f"Mean: {np.mean(arr)}")
print(f"Median: {np.median(arr)}")
print(f"Standard deviation: {np.std(arr)}")
print(f"Variance: {np.var(arr)}")
print(f"Min: {np.min(arr)}")
print(f"Max: {np.max(arr)}")
print(f"Range (ptp): {np.ptp(arr)}")
print(f"Cumulative sum: {np.cumsum(arr)}")
print(f"Cumulative product: {np.cumprod(arr)}")

Array: [ 1  2  3  4  5  6  7  8  9 10]

Sum: 55
Mean: 5.5
Median: 5.5
Standard deviation: 2.8722813232690143
Variance: 8.25
Min: 1
Max: 10
Range (ptp): 9
Cumulative sum: [ 1  3  6 10 15 21 28 36 45 55]
Cumulative product: [      1       2       6      24     120     720    5040   40320  362880
 3628800]
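
`np.std` and `np.var` divide by `n` by default (population statistics). The `ddof` parameter switches to the sample versions, which divide by `n - 1`. A brief sketch on the same 1 to 10 array:

```python
import numpy as np

arr = np.arange(1, 11)

pop_std = np.std(arr)             # divides by n
sample_std = np.std(arr, ddof=1)  # divides by n - 1

print(f"Population std: {pop_std:.4f}")
print(f"Sample std:     {sample_std:.4f}")
```

This default differs from pandas, whose `std()` uses `ddof=1`, which is a common source of mismatched results.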


### 7.2 Operations Along Axes

In [27]:
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

print(f"2D Array:\n{arr_2d}")
print()

# axis=None (default): operation on flattened array
print(f"Sum (all elements): {np.sum(arr_2d)}")

# axis=0: reduce down the rows (the row axis collapses, leaving one value per column)
print(f"Sum along axis=0 (column sums): {np.sum(arr_2d, axis=0)}")
print(f"Mean along axis=0: {np.mean(arr_2d, axis=0)}")

# axis=1: reduce across the columns (the column axis collapses, leaving one value per row)
print(f"Sum along axis=1 (row sums): {np.sum(arr_2d, axis=1)}")
print(f"Mean along axis=1: {np.mean(arr_2d, axis=1)}")
print()

# Keep dimensions with keepdims=True
print(f"Mean axis=1 without keepdims: {np.mean(arr_2d, axis=1).shape}")
print(f"Mean axis=1 with keepdims: {np.mean(arr_2d, axis=1, keepdims=True).shape}")

2D Array:
[[1 2 3]
 [4 5 6]
 [7 8 9]]

Sum (all elements): 45
Sum along axis=0 (column sums): [12 15 18]
Mean along axis=0: [4. 5. 6.]
Sum along axis=1 (row sums): [ 6 15 24]
Mean along axis=1: [2. 5. 8.]

Mean axis=1 without keepdims: (3,)
Mean axis=1 with keepdims: (3, 1)


### 7.3 Finding Indices of Min/Max

In [28]:
arr = np.array([3, 1, 4, 1, 5, 9, 2, 6])
print(f"Array: {arr}")
print()

# Index of min/max values
print(f"Index of min (argmin): {np.argmin(arr)}")
print(f"Index of max (argmax): {np.argmax(arr)}")
print(f"Min value: {arr[np.argmin(arr)]}")
print(f"Max value: {arr[np.argmax(arr)]}")
print()

# For 2D arrays
arr_2d = np.array([[3, 7, 2],
                   [5, 1, 9]])
print(f"2D Array:\n{arr_2d}")
print(f"Flat index of max: {np.argmax(arr_2d)}")
print(f"Argmax along axis=0: {np.argmax(arr_2d, axis=0)}")
print(f"Argmax along axis=1: {np.argmax(arr_2d, axis=1)}")

Array: [3 1 4 1 5 9 2 6]

Index of min (argmin): 1
Index of max (argmax): 5
Min value: 1
Max value: 9

2D Array:
[[3 7 2]
 [5 1 9]]
Flat index of max: 5
Argmax along axis=0: [1 0 1]
Argmax along axis=1: [1 2]
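
The flat index returned by `argmax` on a 2D array can be converted back to a `(row, column)` position with `np.unravel_index`. A sketch using the same array:

```python
import numpy as np

arr_2d = np.array([[3, 7, 2],
                   [5, 1, 9]])

flat_idx = np.argmax(arr_2d)
row, col = np.unravel_index(flat_idx, arr_2d.shape)

print(f"Flat index: {flat_idx}")
print(f"Row, column: ({row}, {col})")
print(f"Value there: {arr_2d[row, col]}")
```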


### 7.4 Percentiles and Quantiles

In [29]:
np.random.seed(42)
data = np.random.randn(1000)

print(f"Data shape: {data.shape}")
print(f"Mean: {np.mean(data):.4f}")
print(f"Std: {np.std(data):.4f}")
print()

# Percentiles
print(f"25th percentile (Q1): {np.percentile(data, 25):.4f}")
print(f"50th percentile (median): {np.percentile(data, 50):.4f}")
print(f"75th percentile (Q3): {np.percentile(data, 75):.4f}")
print()

# Multiple percentiles at once
quartiles = np.percentile(data, [25, 50, 75])
print(f"Quartiles: {quartiles}")

# Quantiles take q in [0, 1] instead of [0, 100]: quantile(data, q) == percentile(data, 100 * q)
print(f"0.5 quantile: {np.quantile(data, 0.5):.4f}")

Data shape: (1000,)
Mean: 0.0193
Std: 0.9787

25th percentile (Q1): -0.6476
50th percentile (median): 0.0253
75th percentile (Q3): 0.6479

Quartiles: [-0.64759031  0.02530061  0.64794388]
0.5 quantile: 0.0253
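
A common use of quartiles in assessments is the 1.5 × IQR rule for flagging outliers. The sketch below uses a small made-up array (not the random data above) so the fences are easy to verify by hand.

```python
import numpy as np

# Illustrative data with one obvious outlier.
values = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])

q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = values[(values < lower_fence) | (values > upper_fence)]

print(f"Q1: {q1}, Q3: {q3}, IQR: {iqr}")
print(f"Fences: [{lower_fence}, {upper_fence}]")
print(f"Outliers: {outliers}")
```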


---

## 8. Boolean and Fancy Indexing

NumPy supports advanced indexing techniques for selecting elements based on conditions or arbitrary indices.

### 8.1 Boolean Indexing

In [30]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(f"Array: {arr}")
print()

# Create boolean mask
mask = arr > 5
print(f"Mask (arr > 5): {mask}")
print(f"Elements where mask is True: {arr[mask]}")
print()

# Direct boolean indexing (combine conditions with & and |, wrapping each condition in parentheses)
print(f"Elements > 5: {arr[arr > 5]}")
print(f"Even elements: {arr[arr % 2 == 0]}")
print(f"Elements between 3 and 8: {arr[(arr >= 3) & (arr <= 8)]}")
print()

# Count elements matching condition
print(f"Count of elements > 5: {np.sum(arr > 5)}")
print(f"Any element > 5? {np.any(arr > 5)}")
print(f"All elements > 0? {np.all(arr > 0)}")

Array: [ 1  2  3  4  5  6  7  8  9 10]

Mask (arr > 5): [False False False False False  True  True  True  True  True]
Elements where mask is True: [ 6  7  8  9 10]

Elements > 5: [ 6  7  8  9 10]
Even elements: [ 2  4  6  8 10]
Elements between 3 and 8: [3 4 5 6 7 8]

Count of elements > 5: 5
Any element > 5? True
All elements > 0? True
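
Boolean masks also support OR (`|`), NOT (`~`), and assignment. A brief sketch on the same 1 to 10 array, with illustrative variable names:

```python
import numpy as np

arr = np.arange(1, 11)

# OR: elements below 3 or above 8
outside = arr[(arr < 3) | (arr > 8)]

# NOT: invert a mask with ~
not_even = arr[~(arr % 2 == 0)]

# Assignment through a mask modifies the selected elements in place.
capped = arr.copy()
capped[capped > 5] = 5

print(f"Outside 3..8: {outside}")
print(f"Not even: {not_even}")
print(f"Capped at 5: {capped}")
```

Python's `and`, `or`, and `not` keywords do not work element-wise on arrays, which is why the operators `&`, `|`, and `~` are used.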


### 8.2 np.where()

In [31]:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(f"Array: {arr}")
print()

# np.where with a single argument returns a tuple of index arrays (one per dimension) where the condition is True
indices = np.where(arr > 5)
print(f"Indices where arr > 5: {indices[0]}")
print()

# np.where with three arguments: conditional selection
# np.where(condition, value_if_true, value_if_false)
result = np.where(arr > 5, 1, 0)
print(f"Replace >5 with 1, else 0: {result}")

result2 = np.where(arr % 2 == 0, arr * 2, arr)
print(f"Double even numbers: {result2}")

Array: [ 1  2  3  4  5  6  7  8  9 10]

Indices where arr > 5: [5 6 7 8 9]

Replace >5 with 1, else 0: [0 0 0 0 0 1 1 1 1 1]
Double even numbers: [ 1  4  3  8  5 12  7 16  9 20]
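
`np.where` handles one condition. For several mutually exclusive bands, `np.select` is one option. The thresholds and integer band codes below are illustrative assumptions.

```python
import numpy as np

arr = np.arange(1, 11)

# Conditions are checked in order; the first match wins.
conditions = [arr < 4, arr < 8]
codes = [0, 1]  # 0 = low, 1 = mid
bands = np.select(conditions, codes, default=2)  # 2 = high

print(f"Array: {arr}")
print(f"Bands: {bands}")
```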


### 8.3 Fancy Indexing

In [32]:
arr = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100])
print(f"Array: {arr}")
print()

# Select elements at specific indices
indices = [0, 2, 5, 9]
print(f"Indices: {indices}")
print(f"Selected elements: {arr[indices]}")
print()

# Using numpy arrays as indices
idx_arr = np.array([1, 3, 5, 7])
print(f"Elements at indices [1,3,5,7]: {arr[idx_arr]}")
print()

# Fancy indexing for 2D arrays
arr_2d = np.arange(12).reshape(3, 4)
print(f"2D Array:\n{arr_2d}")
print()

# Select specific elements by row and column indices
rows = np.array([0, 1, 2])
cols = np.array([0, 2, 3])
print(f"Elements at (0,0), (1,2), (2,3): {arr_2d[rows, cols]}")

# Select entire rows
print(f"Rows 0 and 2:\n{arr_2d[[0, 2], :]}")

Array: [ 10  20  30  40  50  60  70  80  90 100]

Indices: [0, 2, 5, 9]
Selected elements: [ 10  30  60 100]

Elements at indices [1,3,5,7]: [20 40 60 80]

2D Array:
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

Elements at (0,0), (1,2), (2,3): [ 0  6 11]
Rows 0 and 2:
[[ 0  1  2  3]
 [ 8  9 10 11]]
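
Passing two index arrays pairs them up element by element, as in the `(0,0), (1,2), (2,3)` example above. To select every combination of the given rows and columns instead, `np.ix_` builds an open mesh. A sketch on the same array:

```python
import numpy as np

arr_2d = np.arange(12).reshape(3, 4)

# Paired indexing: elements at (0, 1) and (2, 3)
paired = arr_2d[[0, 2], [1, 3]]

# Mesh indexing: rows 0 and 2 crossed with columns 1 and 3
grid = arr_2d[np.ix_([0, 2], [1, 3])]

print(f"Paired: {paired}")
print(f"Grid:\n{grid}")
```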


### 8.4 np.take() and np.put()

In [33]:
arr = np.array([10, 20, 30, 40, 50])
print(f"Original: {arr}")
print()

# np.take: take elements at indices
taken = np.take(arr, [0, 2, 4])
print(f"np.take([0, 2, 4]): {taken}")
print()

# np.put: replace elements at indices (modifies in place)
arr_copy = arr.copy()
np.put(arr_copy, [1, 3], [999, 888])
print(f"After np.put at [1, 3]: {arr_copy}")

Original: [10 20 30 40 50]

np.take([0, 2, 4]): [10 30 50]

After np.put at [1, 3]: [ 10 999  30 888  50]


---

## 9. Stacking and Splitting Arrays

NumPy provides functions to combine and split arrays in various ways.

### 9.1 Stacking Arrays

In [34]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
c = np.array([7, 8, 9])

print(f"a: {a}")
print(f"b: {b}")
print(f"c: {c}")
print()

# Vertical stack (row-wise)
vstacked = np.vstack([a, b, c])
print(f"Vertical stack (vstack):\n{vstacked}")
print(f"Shape: {vstacked.shape}")
print()

# Horizontal stack (1D arrays are joined end to end)
hstacked = np.hstack([a, b, c])
print(f"Horizontal stack (hstack): {hstacked}")
print(f"Shape: {hstacked.shape}")
print()

# Column stack (treats 1D as columns)
col_stacked = np.column_stack([a, b, c])
print(f"Column stack:\n{col_stacked}")
print(f"Shape: {col_stacked.shape}")

a: [1 2 3]
b: [4 5 6]
c: [7 8 9]

Vertical stack (vstack):
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Shape: (3, 3)

Horizontal stack (hstack): [1 2 3 4 5 6 7 8 9]
Shape: (9,)

Column stack:
[[1 4 7]
 [2 5 8]
 [3 6 9]]
Shape: (3, 3)


### 9.2 Stacking 2D Arrays

In [35]:
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

print(f"Array 1:\n{arr1}")
print(f"Array 2:\n{arr2}")
print()

# Vertical stack (stack along axis 0)
v_result = np.vstack([arr1, arr2])
print(f"Vertical stack:\n{v_result}")
print(f"Shape: {v_result.shape}")
print()

# Horizontal stack (stack along axis 1)
h_result = np.hstack([arr1, arr2])
print(f"Horizontal stack:\n{h_result}")
print(f"Shape: {h_result.shape}")
print()

# Depth stack (stack along axis 2, creates 3D array)
d_result = np.dstack([arr1, arr2])
print(f"Depth stack shape: {d_result.shape}")
print(f"Depth stack:\n{d_result}")

Array 1:
[[1 2]
 [3 4]]
Array 2:
[[5 6]
 [7 8]]

Vertical stack:
[[1 2]
 [3 4]
 [5 6]
 [7 8]]
Shape: (4, 2)

Horizontal stack:
[[1 2 5 6]
 [3 4 7 8]]
Shape: (2, 4)

Depth stack shape: (2, 2, 2)
Depth stack:
[[[1 5]
  [2 6]]

 [[3 7]
  [4 8]]]


### 9.3 np.concatenate() and np.stack()

In [36]:
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

# concatenate: joins along existing axis
concat_0 = np.concatenate([arr1, arr2], axis=0)
concat_1 = np.concatenate([arr1, arr2], axis=1)

print(f"Concatenate axis=0:\n{concat_0}")
print(f"Shape: {concat_0.shape}")
print()
print(f"Concatenate axis=1:\n{concat_1}")
print(f"Shape: {concat_1.shape}")
print()

# stack: joins along NEW axis
stacked_0 = np.stack([arr1, arr2], axis=0)
stacked_2 = np.stack([arr1, arr2], axis=2)

print(f"Stack axis=0 shape: {stacked_0.shape}")
print(f"Stack axis=2 shape: {stacked_2.shape}")

Concatenate axis=0:
[[1 2]
 [3 4]
 [5 6]
 [7 8]]
Shape: (4, 2)

Concatenate axis=1:
[[1 2 5 6]
 [3 4 7 8]]
Shape: (2, 4)

Stack axis=0 shape: (2, 2, 2)
Stack axis=2 shape: (2, 2, 2)
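
Both stacked results above have shape `(2, 2, 2)`, but their contents are arranged differently. This sketch shows where each input ends up:

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])

stacked_0 = np.stack([arr1, arr2], axis=0)
stacked_2 = np.stack([arr1, arr2], axis=2)

# axis=0: the inputs are indexed by the first axis.
print(f"stacked_0[0] is arr1: {np.array_equal(stacked_0[0], arr1)}")

# axis=2: the inputs are indexed by the last axis.
print(f"stacked_2[:, :, 0] is arr1: {np.array_equal(stacked_2[:, :, 0], arr1)}")

# For 2D inputs, stacking on axis=2 matches np.dstack.
print(f"Matches dstack: {np.array_equal(stacked_2, np.dstack([arr1, arr2]))}")
```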


### 9.4 Splitting Arrays

In [37]:
arr = np.arange(12).reshape(3, 4)
print(f"Original array:\n{arr}")
print()

# Horizontal split (split along axis 1)
h_splits = np.hsplit(arr, 2)
print("Horizontal split into 2:")
for i, split in enumerate(h_splits):
    print(f"  Part {i}:\n{split}")
print()

# Vertical split (split along axis 0)
v_splits = np.vsplit(arr, 3)
print("Vertical split into 3:")
for i, split in enumerate(v_splits):
    print(f"  Part {i}: {split}")
print()

# Split at specific indices
arr_1d = np.arange(10)
splits = np.split(arr_1d, [3, 5, 7])
print(f"Original 1D: {arr_1d}")
print(f"Split at [3, 5, 7]:")
for i, split in enumerate(splits):
    print(f"  Part {i}: {split}")

Original array:
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

Horizontal split into 2:
  Part 0:
[[0 1]
 [4 5]
 [8 9]]
  Part 1:
[[ 2  3]
 [ 6  7]
 [10 11]]

Vertical split into 3:
  Part 0: [[0 1 2 3]]
  Part 1: [[4 5 6 7]]
  Part 2: [[ 8  9 10 11]]

Original 1D: [0 1 2 3 4 5 6 7 8 9]
Split at [3, 5, 7]:
  Part 0: [0 1 2]
  Part 1: [3 4]
  Part 2: [5 6]
  Part 3: [7 8 9]
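
`np.split` with an integer requires the array to divide evenly. For uneven splits, `np.array_split` can be used instead, as sketched below.

```python
import numpy as np

arr_1d = np.arange(10)

# np.split raises an error: 10 elements cannot be divided into 3 equal parts.
try:
    np.split(arr_1d, 3)
    even_split_ok = True
except ValueError:
    even_split_ok = False
print(f"np.split into 3 succeeded: {even_split_ok}")

# np.array_split allows parts of unequal size.
parts = np.array_split(arr_1d, 3)
for i, part in enumerate(parts):
    print(f"  Part {i}: {part}")
```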


---

## 10. Linear Algebra Basics

NumPy's `linalg` module provides essential linear algebra operations.

### 10.1 Matrix Properties

In [38]:
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 10]])

print(f"Matrix A:\n{A}")
print()

# Determinant
det = np.linalg.det(A)
print(f"Determinant: {det:.4f}")

# Rank
rank = np.linalg.matrix_rank(A)
print(f"Rank: {rank}")

# Trace (sum of diagonal elements)
trace = np.trace(A)
print(f"Trace: {trace}")

# Diagonal elements
diag = np.diag(A)
print(f"Diagonal: {diag}")

Matrix A:
[[ 1  2  3]
 [ 4  5  6]
 [ 7  8 10]]

Determinant: -3.0000
Rank: 3
Trace: 16
Diagonal: [ 1  5 10]


### 10.2 Matrix Inverse

In [39]:
A = np.array([[1, 2],
              [3, 4]])

print(f"Matrix A:\n{A}")
print()

# Compute inverse
A_inv = np.linalg.inv(A)
print(f"Inverse of A:\n{A_inv}")
print()

# Verify: A @ A_inv should be identity matrix
identity = A @ A_inv
print(f"A @ A_inv (should be identity):\n{np.round(identity, 10)}")

Matrix A:
[[1 2]
 [3 4]]

Inverse of A:
[[-2.   1. ]
 [ 1.5 -0.5]]

A @ A_inv (should be identity):
[[1. 0.]
 [0. 1.]]
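
Not every square matrix has an inverse. A minimal sketch with an illustrative singular matrix (its second row is twice the first) shows the error that `np.linalg.inv` raises:

```python
import numpy as np

# Illustrative singular matrix: row 2 = 2 * row 1
singular = np.array([[1.0, 2.0],
                     [2.0, 4.0]])

print(f"Rank: {np.linalg.matrix_rank(singular)}")

try:
    np.linalg.inv(singular)
    inverted = True
except np.linalg.LinAlgError as exc:
    inverted = False
    print(f"Cannot invert: {exc}")
```

Checking the rank or the condition number (`np.linalg.cond`) beforehand is one way to anticipate this.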


### 10.3 Solving Linear Systems

In [40]:
# Solve the system: Ax = b
# 2x + 3y = 8
# 3x + 4y = 11

A = np.array([[2, 3],
              [3, 4]])
b = np.array([8, 11])

print(f"Coefficient matrix A:\n{A}")
print(f"Constants b: {b}")
print()

# Solve using np.linalg.solve (faster and numerically more stable than computing the inverse explicitly)
x = np.linalg.solve(A, b)
print(f"Solution x: {x}")
print(f"Verification A @ x: {A @ x}")

Coefficient matrix A:
[[2 3]
 [3 4]]
Constants b: [ 8 11]

Solution x: [1. 2.]
Verification A @ x: [ 8. 11.]
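
As a quick check on the comment in the cell above, the following sketch solves the same system both ways and confirms the answers agree. It also shows that `np.linalg.solve` accepts several right-hand sides at once when they are stacked as columns. The second right-hand side `(5, 7)` is made up for illustration:

```python
import numpy as np

A = np.array([[2, 3],
              [3, 4]])
b = np.array([8, 11])

# Direct solve versus explicit inverse: same answer, but solve skips forming A^-1
x_solve = np.linalg.solve(A, b)
x_inv = np.linalg.inv(A) @ b
print(f"Solutions agree: {np.allclose(x_solve, x_inv)}")

# Several right-hand sides, one per column (second column is illustrative)
B = np.array([[8, 5],
              [11, 7]])
X = np.linalg.solve(A, B)
print(f"Solutions, one per column:\n{X}")
```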


### 10.4 Eigenvalues and Eigenvectors

In [41]:
A = np.array([[4, 2],
              [1, 3]])

print(f"Matrix A:\n{A}")
print()

# Compute eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

print(f"Eigenvalues: {eigenvalues}")
print(f"Eigenvectors (as columns):\n{eigenvectors}")
print()

# Verify: A @ v = lambda * v
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    lam = eigenvalues[i]
    print(f"A @ v{i}: {A @ v}")
    print(f"lambda{i} * v{i}: {lam * v}")
    print()

Matrix A:
[[4 2]
 [1 3]]

Eigenvalues: [5. 2.]
Eigenvectors (as columns):
[[ 0.89442719 -0.70710678]
 [ 0.4472136   0.70710678]]

A @ v0: [4.47213595 2.23606798]
lambda0 * v0: [4.47213595 2.23606798]

A @ v1: [-1.41421356  1.41421356]
lambda1 * v1: [-1.41421356  1.41421356]
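
When the matrix is symmetric (covariance matrices are a common case), `np.linalg.eigh` is usually the better choice: it returns real eigenvalues in ascending order and orthonormal eigenvectors. A minimal sketch with a hypothetical symmetric matrix `S`:

```python
import numpy as np

# Hypothetical symmetric matrix
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh assumes symmetry and returns eigenvalues sorted in ascending order
w, V = np.linalg.eigh(S)
print(f"Eigenvalues (ascending): {w}")

# With orthonormal eigenvectors, S = V @ diag(w) @ V^T
S_rebuilt = V @ np.diag(w) @ V.T
print(f"Reconstruction matches: {np.allclose(S, S_rebuilt)}")
```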



### 10.5 Norms

In [42]:
v = np.array([3, 4])
print(f"Vector v: {v}")
print()

# L2 norm (Euclidean norm) - default
l2_norm = np.linalg.norm(v)
print(f"L2 norm (Euclidean): {l2_norm}")
print(f"Manual calculation: sqrt(3^2 + 4^2) = {np.sqrt(3**2 + 4**2)}")

# L1 norm (Manhattan norm)
l1_norm = np.linalg.norm(v, ord=1)
print(f"L1 norm (Manhattan): {l1_norm}")

# Infinity norm (max absolute value)
inf_norm = np.linalg.norm(v, ord=np.inf)
print(f"Infinity norm: {inf_norm}")
print()

# Matrix norms
A = np.array([[1, 2], [3, 4]])
print(f"Matrix A:\n{A}")
print(f"Frobenius norm: {np.linalg.norm(A, 'fro'):.4f}")

Vector v: [3 4]

L2 norm (Euclidean): 5.0
Manual calculation: sqrt(3^2 + 4^2) = 5.0
L1 norm (Manhattan): 7.0
Infinity norm: 4.0

Matrix A:
[[1 2]
 [3 4]]
Frobenius norm: 5.4772
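
Two common uses of norms in practice are scaling a vector to unit length and computing one norm per row of a matrix. The sketch below assumes a small made-up `points` array:

```python
import numpy as np

v = np.array([3, 4])

# Divide by the L2 norm to obtain a unit vector
unit_v = v / np.linalg.norm(v)
print(f"Unit vector: {unit_v}")

# axis=1 computes one L2 norm per row (illustrative data)
points = np.array([[3, 4],
                   [6, 8],
                   [0, 5]])
row_norms = np.linalg.norm(points, axis=1)
print(f"Row norms: {row_norms}")
```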


### 10.6 Singular Value Decomposition (SVD)

In [43]:
A = np.array([[1, 2, 3],
              [4, 5, 6]])

print(f"Matrix A ({A.shape}):\n{A}")
print()

# Compute SVD: A = U @ S @ V^T
U, s, Vt = np.linalg.svd(A)

print(f"U ({U.shape}):\n{U}")
print(f"Singular values: {s}")
print(f"V^T ({Vt.shape}):\n{Vt}")
print()

# Reconstruct the original matrix
S = np.zeros(A.shape)
S[:len(s), :len(s)] = np.diag(s)
A_reconstructed = U @ S @ Vt
print(f"Reconstructed A:\n{A_reconstructed}")

Matrix A ((2, 3)):
[[1 2 3]
 [4 5 6]]

U ((2, 2)):
[[-0.3863177  -0.92236578]
 [-0.92236578  0.3863177 ]]
Singular values: [9.508032   0.77286964]
V^T ((3, 3)):
[[-0.42866713 -0.56630692 -0.7039467 ]
 [ 0.80596391  0.11238241 -0.58119908]
 [ 0.40824829 -0.81649658  0.40824829]]

Reconstructed A:
[[1. 2. 3.]
 [4. 5. 6.]]
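
Padding `S` with zeros is only needed because the full SVD was requested. As a sketch of an alternative, passing `full_matrices=False` returns the reduced factors, which multiply back together directly. The same factors also give a rank-1 approximation whose Frobenius error equals the discarded singular value:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])

# Reduced SVD: U_r is (2, 2), s_r is (2,), Vt_r is (2, 3)
U_r, s_r, Vt_r = np.linalg.svd(A, full_matrices=False)

# Scaling the columns of U_r by s_r replaces the explicit diagonal matrix
A_rebuilt = (U_r * s_r) @ Vt_r
print(f"Reconstruction matches: {np.allclose(A, A_rebuilt)}")

# Rank-1 approximation built from the largest singular value only
A_rank1 = s_r[0] * np.outer(U_r[:, 0], Vt_r[0])
rank1_error = np.linalg.norm(A - A_rank1, 'fro')
print(f"Rank-1 error: {rank1_error:.4f} (second singular value: {s_r[1]:.4f})")
```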


---

## 11. Practice Questions

Test your NumPy knowledge with these practice questions. Each question has a hidden solution; try to solve it yourself first!

### Question 1: Array Creation

Create a 5x5 identity matrix and then set all elements in the first and last rows to 2.

In [None]:
# Your solution here

<details>
<summary>Click to reveal answer</summary>

```python
# Create 5x5 identity matrix
arr = np.eye(5)
print(f"Identity matrix:\n{arr}")

# Set first and last rows to 2
arr[0, :] = 2
arr[-1, :] = 2
print(f"\nModified matrix:\n{arr}")
```

</details>

### Question 2: Array Manipulation

Create an array of integers from 1 to 20, reshape it to a 4x5 matrix, and then reverse the order of elements in each row.

In [None]:
# Your solution here

<details>
<summary>Click to reveal answer</summary>

```python
# Create array and reshape
arr = np.arange(1, 21).reshape(4, 5)
print(f"Original matrix:\n{arr}")

# Reverse each row using slicing
reversed_arr = arr[:, ::-1]
print(f"\nReversed rows:\n{reversed_arr}")
```

</details>

### Question 3: Broadcasting

Given a 3x4 matrix of random integers between 1 and 10, subtract the mean of each column from all elements in that column (centre the data column-wise).

In [None]:
# Your solution here
np.random.seed(42)

<details>
<summary>Click to reveal answer</summary>

```python
np.random.seed(42)
arr = np.random.randint(1, 11, size=(3, 4))
print(f"Original matrix:\n{arr}")

# Calculate column means
col_means = arr.mean(axis=0)
print(f"\nColumn means: {col_means}")

# Subtract column means using broadcasting
centred = arr - col_means
print(f"\nCentred matrix:\n{centred}")
print(f"\nVerification (new column means): {centred.mean(axis=0)}")
```
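
A related variation, sketched here as an optional extension: centring row-wise instead of column-wise. The row means have shape `(3,)`, which does not broadcast against a `(3, 4)` array, so `keepdims=True` is used to keep them as a `(3, 1)` column. This sketch uses `np.random.default_rng` rather than `np.random.seed`, so the numbers differ from the solution above:

```python
import numpy as np

rng = np.random.default_rng(42)
arr = rng.integers(1, 11, size=(3, 4))

# keepdims=True keeps the reduced axis, giving shape (3, 1) instead of (3,)
row_means = arr.mean(axis=1, keepdims=True)
centred_rows = arr - row_means

print(f"Row means shape: {row_means.shape}")
print(f"New row means: {centred_rows.mean(axis=1)}")
```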

</details>

### Question 4: Boolean Indexing

Create a 10x10 array of random integers from 1 to 100. Replace all values greater than 50 with -1 and all values less than or equal to 50 with 1.

In [None]:
# Your solution here
np.random.seed(123)

<details>
<summary>Click to reveal answer</summary>

```python
np.random.seed(123)
arr = np.random.randint(1, 101, size=(10, 10))
print(f"Original array:\n{arr}")

# Method 1: Using np.where
result = np.where(arr > 50, -1, 1)
print(f"\nTransformed array (using np.where):\n{result}")

# Method 2: Using boolean indexing (compute the mask once, before any assignment)
arr_copy = arr.copy()
mask = arr_copy > 50
arr_copy[mask] = -1
arr_copy[~mask] = 1
print(f"\nTransformed array (using boolean indexing):\n{arr_copy}")
```

</details>

### Question 5: Statistical Operations

Given a 2D array representing exam scores (rows = students, columns = subjects), find:
1. The average score for each student
2. The average score for each subject
3. The student with the highest overall average
4. The subject with the lowest average score

In [None]:
# Your solution here
scores = np.array([[85, 90, 78, 92],
                   [88, 76, 95, 89],
                   [72, 85, 88, 91],
                   [90, 92, 85, 78],
                   [78, 88, 90, 95]])

<details>
<summary>Click to reveal answer</summary>

```python
scores = np.array([[85, 90, 78, 92],
                   [88, 76, 95, 89],
                   [72, 85, 88, 91],
                   [90, 92, 85, 78],
                   [78, 88, 90, 95]])

print(f"Scores matrix (students x subjects):\n{scores}")
print()

# 1. Average score for each student (mean across columns)
student_avg = scores.mean(axis=1)
print(f"Average per student: {student_avg}")

# 2. Average score for each subject (mean across rows)
subject_avg = scores.mean(axis=0)
print(f"Average per subject: {subject_avg}")

# 3. Student with highest overall average
best_student = np.argmax(student_avg)
print(f"Best student: Student {best_student} with average {student_avg[best_student]:.2f}")

# 4. Subject with lowest average score
lowest_subject = np.argmin(subject_avg)
print(f"Lowest scoring subject: Subject {lowest_subject} with average {subject_avg[lowest_subject]:.2f}")
```

</details>

### Question 6: Stacking and Reshaping

Create three 1D arrays of length 4 each. Stack them to create:
1. A 3x4 matrix (vertical stack)
2. A 4x3 matrix (treating each array as a column)
3. A 12-element 1D array

In [None]:
# Your solution here
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])
c = np.array([9, 10, 11, 12])

<details>
<summary>Click to reveal answer</summary>

```python
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])
c = np.array([9, 10, 11, 12])

print(f"Arrays: a={a}, b={b}, c={c}")
print()

# 1. Vertical stack (3x4)
vertical = np.vstack([a, b, c])
print(f"Vertical stack (3x4):\n{vertical}")
print()

# 2. Column stack (4x3)
columns = np.column_stack([a, b, c])
print(f"Column stack (4x3):\n{columns}")
print()

# 3. Horizontal stack to 1D (12 elements)
flat = np.hstack([a, b, c])
print(f"Horizontal stack (12 elements): {flat}")
```

</details>

### Question 7: Linear Algebra

Solve the following system of linear equations:
```
2x + y - z = 8
-3x - y + 2z = -11
-2x + y + 2z = -3
```

In [None]:
# Your solution here

<details>
<summary>Click to reveal answer</summary>

```python
# Coefficient matrix
A = np.array([[2, 1, -1],
              [-3, -1, 2],
              [-2, 1, 2]])

# Constants vector
b = np.array([8, -11, -3])

print(f"Coefficient matrix A:\n{A}")
print(f"Constants b: {b}")
print()

# Solve the system
solution = np.linalg.solve(A, b)
print(f"Solution: x={solution[0]:.4f}, y={solution[1]:.4f}, z={solution[2]:.4f}")

# Verify
print(f"\nVerification (A @ x): {A @ solution}")
print(f"Should equal b: {b}")
```

</details>

### Question 8: Fancy Indexing

Create a 6x6 checkerboard pattern using NumPy (alternating 0s and 1s).

In [None]:
# Your solution here

<details>
<summary>Click to reveal answer</summary>

```python
# Method 1: Using slicing
checkerboard = np.zeros((6, 6), dtype=int)
checkerboard[::2, 1::2] = 1  # Even rows, odd columns
checkerboard[1::2, ::2] = 1  # Odd rows, even columns
print(f"Checkerboard (Method 1):\n{checkerboard}")
print()

# Method 2: Using indices and modulo
i, j = np.indices((6, 6))
checkerboard2 = (i + j) % 2
print(f"Checkerboard (Method 2):\n{checkerboard2}")
print()

# Method 3: Using tile
pattern = np.array([[0, 1], [1, 0]])
checkerboard3 = np.tile(pattern, (3, 3))
print(f"Checkerboard (Method 3):\n{checkerboard3}")
```

</details>

### Question 9: Data Normalisation

Write a function that performs min-max normalisation on a 2D array (scales values to the range [0, 1]) for each column independently. Test it with a sample array.

In [None]:
# Your solution here

<details>
<summary>Click to reveal answer</summary>

```python
def min_max_normalise(arr: np.ndarray) -> np.ndarray:
    """Perform min-max normalisation on each column of a 2D array.
    
    Args:
        arr: 2D NumPy array to normalise.
        
    Returns:
        Normalised array with values scaled to [0, 1] per column.
    """
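    col_min = arr.min(axis=0)
    col_range = arr.max(axis=0) - col_min
    # Assumption: a constant column (range 0) maps to all zeros instead of NaN
    col_range = np.where(col_range == 0, 1, col_range)
    return (arr - col_min) / col_range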


# Test data
data = np.array([[10, 200, 3000],
                 [20, 400, 1000],
                 [30, 100, 2000],
                 [40, 300, 4000]])

print(f"Original data:\n{data}")
print()

normalised = min_max_normalise(data)
print(f"Normalised data:\n{normalised}")
print()

print(f"Column min: {normalised.min(axis=0)}")
print(f"Column max: {normalised.max(axis=0)}")
```

</details>

### Question 10: Matrix Operations

Given a matrix, compute:
1. Its transpose
2. The product of the matrix with its transpose
3. The eigenvalues of this product
4. Verify that the eigenvalues are all non-negative (property of positive semi-definite matrices)

In [None]:
# Your solution here
M = np.array([[1, 2, 3],
              [4, 5, 6]])

<details>
<summary>Click to reveal answer</summary>

```python
M = np.array([[1, 2, 3],
              [4, 5, 6]])

print(f"Original matrix M ({M.shape}):\n{M}")
print()

# 1. Transpose
M_T = M.T
print(f"Transpose M^T ({M_T.shape}):\n{M_T}")
print()

# 2. Product M @ M^T (this produces a symmetric matrix)
product = M @ M_T
print(f"M @ M^T ({product.shape}):\n{product}")
print()

# 3. Eigenvalues (eigvalsh is designed for symmetric matrices and returns real values)
eigenvalues = np.linalg.eigvalsh(product)
print(f"Eigenvalues: {eigenvalues}")
print()

# 4. Verify non-negative (positive semi-definite)
all_non_negative = np.all(eigenvalues >= -1e-10)  # Small tolerance for numerical precision
print(f"All eigenvalues non-negative: {all_non_negative}")
```

</details>

### Question 11: Vectorised Distance Calculation

Write a function that computes the Euclidean distance between each pair of points in two sets of points (useful for k-NN, clustering, etc.). Given two arrays of shape (n, d) and (m, d), return a distance matrix of shape (n, m).

In [None]:
# Your solution here

<details>
<summary>Click to reveal answer</summary>

```python
def euclidean_distance_matrix(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Compute pairwise Euclidean distances between two sets of points.
    
    Args:
        X: Array of shape (n, d) containing n points in d dimensions.
        Y: Array of shape (m, d) containing m points in d dimensions.
        
    Returns:
        Distance matrix of shape (n, m) where element [i, j] is the
        Euclidean distance between X[i] and Y[j].
    """
    # Using broadcasting: (n, 1, d) - (1, m, d) -> (n, m, d)
    diff = X[:, np.newaxis, :] - Y[np.newaxis, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=2))


# Test with sample points
X = np.array([[0, 0], [1, 0], [0, 1]])
Y = np.array([[1, 1], [2, 2]])

print(f"Points X:\n{X}")
print(f"Points Y:\n{Y}")
print()

distances = euclidean_distance_matrix(X, Y)
print(f"Distance matrix:\n{distances}")
print()

# Verify one distance manually
manual_dist = np.sqrt((0-1)**2 + (0-1)**2)
print(f"Distance from (0,0) to (1,1): {distances[0, 0]:.4f} (manual: {manual_dist:.4f})")
```
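
The broadcasting solution allocates an intermediate array of shape `(n, m, d)`, which can become large. As an alternative sketch (the function name `euclidean_distance_matrix_gram` is made up here), the identity ‖x − y‖² = ‖x‖² + ‖y‖² − 2x·y needs only `(n, m)` intermediates. Rounding can push tiny squared distances slightly below zero, so they are clipped before the square root:

```python
import numpy as np


def euclidean_distance_matrix_gram(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Pairwise Euclidean distances via the squared-norm expansion."""
    x_sq = np.sum(X ** 2, axis=1)[:, np.newaxis]  # shape (n, 1)
    y_sq = np.sum(Y ** 2, axis=1)[np.newaxis, :]  # shape (1, m)
    sq_dist = x_sq + y_sq - 2 * (X @ Y.T)         # shape (n, m)
    # Clip small negative values caused by floating-point rounding
    return np.sqrt(np.maximum(sq_dist, 0))


X = np.array([[0, 0], [1, 0], [0, 1]], dtype=float)
Y = np.array([[1, 1], [2, 2]], dtype=float)

gram_distances = euclidean_distance_matrix_gram(X, Y)
print(f"Distance matrix:\n{gram_distances}")
```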

</details>

### Question 12: Finding Unique Rows

Given a 2D array, find all unique rows and count how many times each unique row appears.

In [None]:
# Your solution here
data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [1, 2, 3],
                 [7, 8, 9],
                 [4, 5, 6],
                 [1, 2, 3]])

<details>
<summary>Click to reveal answer</summary>

```python
data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [1, 2, 3],
                 [7, 8, 9],
                 [4, 5, 6],
                 [1, 2, 3]])

print(f"Original data:\n{data}")
print()

# Find unique rows and their counts
unique_rows, counts = np.unique(data, axis=0, return_counts=True)

print("Unique rows and counts:")
for row, count in zip(unique_rows, counts):
    print(f"  {row} appears {count} time(s)")
```
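
A natural follow-up, sketched here as an optional extension, is to pick out the most frequent row. Because `np.unique` returns the counts aligned with the unique rows, `np.argmax` on the counts gives the index directly:

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [1, 2, 3],
                 [7, 8, 9],
                 [4, 5, 6],
                 [1, 2, 3]])

unique_rows, counts = np.unique(data, axis=0, return_counts=True)

# The index of the largest count selects the most frequent row
most_common = unique_rows[np.argmax(counts)]
print(f"Most frequent row: {most_common} ({counts.max()} occurrences)")
```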

</details>

---

## Summary

This notebook covered the essential NumPy concepts commonly tested in data science assessments:

1. **Array Creation:** Multiple methods including `zeros`, `ones`, `arange`, `linspace`, and random arrays
2. **Indexing and Slicing:** Basic indexing, negative indices, multi-dimensional slicing, views vs copies
3. **Reshaping:** `reshape`, `flatten`, `ravel`, `transpose`, adding/removing dimensions
4. **Broadcasting:** Rules and practical examples for operations on arrays of different shapes
5. **Mathematical Operations:** Element-wise operations, ufuncs, dot product, matrix multiplication
6. **Statistical Functions:** Mean, std, var, min, max, sum along axes, percentiles
7. **Boolean and Fancy Indexing:** Conditional selection, `np.where`, index arrays
8. **Stacking and Splitting:** `vstack`, `hstack`, `column_stack`, `concatenate`, `split`
9. **Linear Algebra:** Determinant, inverse, solving systems, eigenvalues, SVD, norms

### Key Tips for Assessments:

- **Vectorisation:** Always prefer NumPy operations over Python loops for better performance
- **Broadcasting:** Understand the rules to efficiently operate on arrays of different shapes
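- **Axis Parameter:** Remember that `axis=0` collapses the rows and returns one result per column, while `axis=1` collapses the columns and returns one result per row
- **Views vs Copies:** Basic slicing returns a view that shares memory with the original, whereas fancy and boolean indexing return copies; use `.copy()` when you need an independent array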
- **Memory Efficiency:** Use appropriate data types and consider `np.memmap()` for large datasets
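
The axis, view, and memory tips above can be checked in a few lines. This is a minimal sketch with made-up variable names, using `np.shares_memory` to tell views from copies and `nbytes` to compare data types:

```python
import numpy as np

m = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

# axis=0 gives one result per column; axis=1 gives one result per row
col_sums = m.sum(axis=0)
row_sums = m.sum(axis=1)
print(f"Column sums: {col_sums}, row sums: {row_sums}")

# Basic slicing returns a view; fancy indexing returns a copy
basic = m[:, :2]
fancy = m[:, [0, 1]]
print(f"Slice shares memory: {np.shares_memory(m, basic)}")
print(f"Fancy index shares memory: {np.shares_memory(m, fancy)}")

# Writing through the view changes the original array
basic[0, 0] = 99
print(f"m[0, 0] after writing to the view: {m[0, 0]}")

# Halving the item size halves the memory footprint
bytes_64 = np.zeros(1000, dtype=np.float64).nbytes
bytes_32 = np.zeros(1000, dtype=np.float32).nbytes
print(f"float64: {bytes_64} bytes, float32: {bytes_32} bytes")
```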

---

**Sources consulted for this notebook:**
- [NumPy Official Documentation - Broadcasting](https://numpy.org/doc/stable/user/basics.broadcasting.html)
- [GeeksforGeeks - NumPy Array Broadcasting](https://www.geeksforgeeks.org/numpy/numpy-array-broadcasting/)
- [DataCamp - NumPy Interview Questions](https://www.datacamp.com/blog/numpy-interview-questions)
- [InterviewBit - NumPy Interview Questions](https://www.interviewbit.com/numpy-interview-questions/)
- [GitHub - numpy-100 Exercises](https://github.com/rougier/numpy-100)
- [PYnative - NumPy Exercises](https://pynative.com/python-numpy-exercise/)