# Week 5: Vector Spaces

**Course:** Mathematics for Data Science II (BSMA1003)  
**Week:** 5

## Learning Objectives
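- Define vector spaces and subspaces, and apply the subspace test
- Compute column spaces, null spaces, and spans, and test linear independence
- Implement these computations using NumPy and SciPy
- Relate vector spaces to feature spaces in machine learning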


In [None]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import linalg

np.random.seed(42)
plt.style.use('seaborn-v0_8-whitegrid')
%matplotlib inline

print('✓ Libraries loaded')

## 1. Vector Space Definition

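A **vector space** $V$ over the field $\mathbb{R}$ is a set equipped with two operations:
- **Addition:** $u + v \in V$ for all $u, v \in V$
- **Scalar multiplication:** $c \cdot v \in V$ for all $c \in \mathbb{R}$ and $v \in V$

These operations must satisfy 10 axioms: the two closure properties above, commutativity and associativity of addition, an additive identity $\mathbf{0}$, additive inverses, associativity of scalar multiplication, the scalar identity $1 \cdot v = v$, and the two distributive laws.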

### Examples
- $\mathbb{R}^n$: $n$-dimensional Euclidean space
- $\mathbb{R}^{m \times n}$: all $m \times n$ real matrices
- $P_n$: polynomials of degree at most $n$
- $C[a,b]$: continuous real-valued functions on $[a,b]$
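
To see why $P_n$ qualifies, note that a polynomial of degree at most $n$ can be stored as its $n+1$ coefficients, so adding and scaling polynomials reduces to the same operations on coefficient arrays. The sketch below illustrates this for $P_2$ using `numpy.polynomial.Polynomial`; the polynomials `p` and `q` are made up for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical polynomials in P_2, coefficients ordered [constant, x, x^2]
p = Polynomial([1, 2, 3])    # 1 + 2x + 3x^2
q = Polynomial([0, -1, 4])   # -x + 4x^2

s = p + q        # addition stays in P_2
r = 2.5 * p      # scalar multiplication stays in P_2

print('p + q coefficients:', s.coef)
print('2.5 * p coefficients:', r.coef)

# The operations agree with adding/scaling the coefficient vectors directly
print('Matches coefficient arithmetic:',
      np.allclose(s.coef, p.coef + q.coef) and np.allclose(r.coef, 2.5 * p.coef))
```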


In [None]:
# Vector space operations in R³
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
c = 2.5

print('Vectors in R³:')
print(f'v1 = {v1}')
print(f'v2 = {v2}')
print(f'\nAddition: v1 + v2 = {v1 + v2}')
print(f'Scalar mult: {c}·v1 = {c * v1}')
print(f'Linear combo: 2v1 + 3v2 = {2*v1 + 3*v2}')
print('\n✓ All results stay in R³ (closure)')

## 2. Subspaces

A **subspace** $W \subseteq V$ is a subset that is itself a vector space under the operations of $V$.

### Subspace Test
$W$ is a subspace if and only if:
1. **Zero vector:** $\mathbf{0} \in W$
2. **Closed under addition:** $u, v \in W \implies u+v \in W$
3. **Closed under scalar multiplication:** $v \in W, c \in \mathbb{R} \implies cv \in W$
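
The three conditions can be spot-checked numerically. The sketch below uses two hypothetical subsets of $\mathbb{R}^3$ chosen for illustration: the plane $x + y + z = 0$, which passes, and the shifted plane $x + y + z = 1$, which fails because it does not contain the zero vector. A random check like this can expose a failure, but it does not prove that a set is a subspace.

```python
import numpy as np

rng = np.random.default_rng(0)

def in_plane(v, offset):
    """Membership test for the set {v in R^3 : x + y + z = offset}."""
    return bool(np.isclose(v.sum(), offset))

def sample_plane(offset):
    """Draw a random point on the plane x + y + z = offset."""
    x, y = rng.standard_normal(2)
    return np.array([x, y, offset - x - y])

def passes_subspace_test(offset, trials=100):
    """Spot-check the three subspace conditions on random samples."""
    if not in_plane(np.zeros(3), offset):          # 1. zero vector
        return False
    for _ in range(trials):
        u, v = sample_plane(offset), sample_plane(offset)
        c = rng.standard_normal()
        if not in_plane(u + v, offset):            # 2. closed under addition
            return False
        if not in_plane(c * v, offset):            # 3. closed under scalar mult
            return False
    return True

print('x + y + z = 0 passes:', passes_subspace_test(0.0))
print('x + y + z = 1 passes:', passes_subspace_test(1.0))
```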

### Important Subspaces
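For a matrix $A \in \mathbb{R}^{m \times n}$:
- **Column space** $\text{Col}(A)$: span of the columns, a subspace of $\mathbb{R}^m$
- **Null space** $\text{Nul}(A)$: all solutions $x \in \mathbb{R}^n$ of $Ax = 0$
- **Row space** $\text{Row}(A)$: span of the rows, a subspace of $\mathbb{R}^n$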


In [None]:
# Column space and null space
A = np.array([[1, 2, 3], [2, 4, 6], [1, 1, 2]])
print('Matrix A:')
print(A)

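# The dimension of the column space equals the rank
rank = np.linalg.matrix_rank(A)
print(f'\nRank(A) = {rank}')
print(f'Column space dimension = {rank}')

# Null space: np.linalg.svd returns V transposed (Vt). The rows of Vt past
# the rank correspond to zero singular values, so they span Nul(A).
_, _, Vt = np.linalg.svd(A)
null_space = Vt[rank:].T  # columns are the basis vectors
print('\nNull space basis vectors:')
for i, v in enumerate(null_space.T, 1):
    print(f'v{i} = {v}')
    print(f'Av{i} = {A @ v} (≈ zero)')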
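
As a cross-check on the SVD computation above, SciPy provides `scipy.linalg.null_space` and `scipy.linalg.orth`, which return orthonormal bases for the null space and the column space. The sketch below applies them to the same matrix `A` and checks the rank-nullity relation $\text{rank}(A) + \dim \text{Nul}(A) = n$. It obtains the row space as $\text{Col}(A^T)$, which is a standard identity rather than something specific to this notebook.

```python
import numpy as np
from scipy.linalg import null_space, orth

A = np.array([[1, 2, 3], [2, 4, 6], [1, 1, 2]])

col_basis = orth(A)          # orthonormal basis for Col(A), one vector per column
nul_basis = null_space(A)    # orthonormal basis for Nul(A), one vector per column
row_basis = orth(A.T)        # Row(A) = Col(A^T)

print('dim Col(A) =', col_basis.shape[1])
print('dim Nul(A) =', nul_basis.shape[1])
print('dim Row(A) =', row_basis.shape[1])

# Every null-space basis vector is sent to (approximately) zero by A
print('A @ nul_basis ≈ 0:', np.allclose(A @ nul_basis, 0))

# Rank-nullity: dim Col(A) + dim Nul(A) equals the number of columns
print('rank + nullity =', col_basis.shape[1] + nul_basis.shape[1])
```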

## 3. Span and Linear Combinations

**Span** of vectors $\{v_1, ..., v_k\}$:
$$\text{Span}(v_1, ..., v_k) = \{c_1v_1 + ... + c_kv_k : c_i \in \mathbb{R}\}$$

In other words, the span is the set of all possible linear combinations of the given vectors.

### Properties
- The span is always a subspace
- $\text{Span}(v_1, ..., v_k) = \text{Col}([v_1 | ... | v_k])$
- It is the smallest subspace containing $v_1, ..., v_k$
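
Because the span equals a column space, asking whether a vector $b$ lies in $\text{Span}(v_1, ..., v_k)$ amounts to asking whether $Ac = b$ has a solution for $A = [v_1 | ... | v_k]$. One way to check this, sketched below with made-up vectors, is to solve the least-squares problem and see whether the residual vanishes. The helper name `in_span` and its tolerance are assumptions for illustration.

```python
import numpy as np

def in_span(vectors, b, tol=1e-10):
    """Return (is_member, coefficients) for b against Span(vectors)."""
    A = np.column_stack(vectors)                     # vectors become columns
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)   # best-fit coefficients
    residual = np.linalg.norm(A @ coeffs - b)        # zero iff b is in the span
    return bool(residual < tol), coeffs

v1 = np.array([1.0, 0.0, 1.0])
v2 = np.array([0.0, 1.0, 1.0])

inside, c = in_span([v1, v2], np.array([2.0, 3.0, 5.0]))   # equals 2*v1 + 3*v2
outside, _ = in_span([v1, v2], np.array([1.0, 1.0, 0.0]))  # not on the plane

print('b1 in span:', inside, 'with coefficients', c)
print('b2 in span:', outside)
```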


In [None]:
# Visualize span in 2D
v1 = np.array([1, 0])
v2 = np.array([1, 2])

fig, axes = plt.subplots(1, 2, figsize=(14, 6))

# Span of single vector
t = np.linspace(-2, 2, 100)
span_v1 = np.outer(v1, t)
axes[0].plot(span_v1[0], span_v1[1], 'b-', linewidth=2, label='Span(v1)')
axes[0].arrow(0, 0, v1[0], v1[1], head_width=0.15, head_length=0.1, fc='red', ec='red', linewidth=2)
axes[0].set_title('Span of One Vector (Line)', fontsize=14, fontweight='bold')
axes[0].grid(True)
axes[0].axis('equal')
axes[0].set_xlim(-2, 2)
axes[0].set_ylim(-2, 2)
axes[0].legend()

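# Span of two vectors: sample a 20x20 grid of coefficients (c1, c2)
c1, c2 = np.meshgrid(np.linspace(-2, 2, 20), np.linspace(-2, 2, 20))
points = c1[..., None] * v1 + c2[..., None] * v2  # shape (20, 20, 2)
axes[1].plot(points[..., 0].ravel(), points[..., 1].ravel(), 'b.', markersize=2)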

axes[1].arrow(0, 0, v1[0], v1[1], head_width=0.15, head_length=0.1, fc='red', ec='red', linewidth=2, label='v1')
axes[1].arrow(0, 0, v2[0], v2[1], head_width=0.15, head_length=0.1, fc='green', ec='green', linewidth=2, label='v2')
axes[1].set_title('Span of Two Vectors (Plane = R²)', fontsize=14, fontweight='bold')
axes[1].grid(True)
axes[1].axis('equal')
axes[1].set_xlim(-4, 4)
axes[1].set_ylim(-4, 4)
axes[1].legend()

plt.tight_layout()
plt.show()

## 4. Linear Independence

Vectors $\{v_1, ..., v_k\}$ are **linearly independent** if:
$$c_1v_1 + ... + c_kv_k = 0 \implies c_1 = ... = c_k = 0$$

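That is, only the trivial combination gives the zero vector.

### Tests
Let $A = [v_1 | ... | v_k]$ be the matrix whose columns are the vectors.
- **Determinant** (square case, $k = n$): independent if and only if $\det(A) \neq 0$
- **Homogeneous system:** independent if and only if $Ax = 0$ has only the trivial solution $x = 0$
- **Rank:** independent if and only if $\text{rank}(A) = k$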


In [None]:
# Test linear independence
vectors_indep = [np.array([1, 0, 0]), np.array([0, 1, 0]), np.array([0, 0, 1])]
vectors_dep = [np.array([1, 2, 3]), np.array([2, 4, 6]), np.array([1, 1, 2])]

def check_independence(vectors, name):
    """Report whether the vectors are linearly independent using the rank test."""
    A = np.column_stack(vectors)  # vectors become the columns of A
    rank = np.linalg.matrix_rank(A)
    n_vectors = len(vectors)
    print(f'{name}:')
    print(f'  Rank = {rank}, # vectors = {n_vectors}')
    if rank == n_vectors:
        print('  ✓ Linearly INDEPENDENT')
    else:
        print('  ✗ Linearly DEPENDENT')
    # The determinant test only applies when A is square
    if A.shape[0] == A.shape[1]:
        det = np.linalg.det(A)
        print(f'  Determinant = {det:.6f}')
    print()

check_independence(vectors_indep, 'Standard basis')
check_independence(vectors_dep, 'Dependent vectors')
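
When the rank test reports dependence, the homogeneous system $Ax = 0$ also shows which combination vanishes: any nonzero null-space vector gives coefficients $c_i$ with $c_1v_1 + ... + c_kv_k = 0$. The sketch below recovers such a relation for the dependent set from the cell above using `scipy.linalg.null_space`. The rescaling step is only there for readability.

```python
import numpy as np
from scipy.linalg import null_space

vectors_dep = [np.array([1, 2, 3]), np.array([2, 4, 6]), np.array([1, 1, 2])]
A = np.column_stack(vectors_dep)

# Each column of N is a nontrivial solution of A c = 0
N = null_space(A)
c = N[:, 0]
c = c / c[np.argmax(np.abs(c))]   # rescale so the largest entry is 1

print('Dependency coefficients:', np.round(c, 6))
print('c1*v1 + c2*v2 + c3*v3 =', np.round(A @ c, 10))
```

Here the second vector is twice the first, so the recovered coefficients involve only $v_1$ and $v_2$.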

## 5. Application: Feature Spaces in ML

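In machine learning:
- **Feature vectors:** each sample with $n$ numeric features is a vector in $\mathbb{R}^n$
- **Linear models:** predictions are built from linear combinations of the features
- **Dimensionality reduction:** finds a lower-dimensional subspace that captures most of the data's structure
- **Kernel methods:** implicitly work in high-dimensional feature spaces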


In [None]:
# Generate sample data in feature space
np.random.seed(42)
n_samples = 100

# 2D features
X = np.random.randn(n_samples, 2)
y = (X[:, 0] + X[:, 1] > 0).astype(int)

plt.figure(figsize=(10, 6))
plt.scatter(X[y==0, 0], X[y==0, 1], c='blue', label='Class 0', alpha=0.6, s=50)
plt.scatter(X[y==1, 0], X[y==1, 1], c='red', label='Class 1', alpha=0.6, s=50)
plt.axhline(0, color='gray', linestyle='--', linewidth=0.5)
plt.axvline(0, color='gray', linestyle='--', linewidth=0.5)
plt.xlabel('Feature 1', fontsize=12)
plt.ylabel('Feature 2', fontsize=12)
plt.title('Data in 2D Feature Space (Vector Space R²)', fontsize=14, fontweight='bold')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print('Data lives in R²')
print(f'Each point is a vector: [{X[0,0]:.2f}, {X[0,1]:.2f}], etc.')
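
The dimensionality-reduction point can be made concrete with a small synthetic example. In the sketch below, a hypothetical third feature is defined as the sum of the first two, so the data in $\mathbb{R}^3$ actually lies in a 2D subspace. The rank exposes the redundancy, and the SVD supplies coordinates within that subspace. The data and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
X2 = rng.standard_normal((100, 2))

# Redundant third feature: a linear combination of the first two
X3 = np.column_stack([X2, X2[:, 0] + X2[:, 1]])

rank = np.linalg.matrix_rank(X3)
print(f'Features: {X3.shape[1]}, rank: {rank}')

# Rows of Vt are orthonormal directions in feature space;
# the first `rank` of them span the subspace containing the data
_, s, Vt = np.linalg.svd(X3, full_matrices=False)
Z = X3 @ Vt[:rank].T           # coordinates within the 2D subspace
X_rec = Z @ Vt[:rank]          # map the coordinates back to R^3

print('Singular values:', np.round(s, 4))
print('Reconstruction matches original:', np.allclose(X3, X_rec))
```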

## Summary

### Key Concepts
1. **Vector Spaces:** Sets with addition and scalar multiplication
2. **Subspaces:** Vector spaces within vector spaces
3. **Span:** All linear combinations
4. **Linear Independence:** No redundancy

### Important Subspaces
- Column space: $\text{Col}(A)$
- Null space: $\text{Nul}(A)$
- Row space: $\text{Row}(A)$

### ML Connection
- Feature spaces are vector spaces
- Linear models operate on linear combinations of features in these spaces
- Understanding this structure is crucial for designing and analyzing algorithms

**Next:** Week 6 - Basis and Dimension
