# Module 12: Linear Algebra for Data Science

## Topics Covered
1. Vectors and Vector Operations
2. Matrices and Matrix Operations
3. Matrix Multiplication and Transpose
4. Identity Matrix and Inverse
5. Dot Product and Cross Product
6. Eigenvalues and Eigenvectors
7. Linear Transformations
8. Applications in ML (Feature transformations, PCA preview)
9. NumPy for Linear Algebra

## Learning Objectives

By the end of this module, you will be able to:
- Understand and perform vector and matrix operations
- Apply linear algebra concepts using NumPy
- Recognize how linear algebra is used in machine learning
- Compute eigenvalues and eigenvectors
- Perform linear transformations on data

---
# Section 1: Vectors and Vector Operations
---

## What is a Vector?

A **vector** is an ordered list of numbers. Geometrically, think of it as an arrow starting at the origin, with both a direction and a magnitude.

In data science, vectors are everywhere:
- A single row in a dataset (one observation with multiple features)
- A single feature column across all observations
- Model weights and parameters

### Why This Matters in Data Science

Every data point in your dataset can be represented as a vector. For example, if you're predicting house prices, each house might be represented as:
- Vector: [square_feet, bedrooms, bathrooms, age]
- Example: [1500, 3, 2, 10]

Understanding vector operations helps you manipulate, transform, and analyze data efficiently.

## Syntax

```python
import numpy as np

# Create a vector (1D array)
vector = np.array([1, 2, 3, 4])
```

**Parameters:**
- List of numbers inside `np.array()`

**Returns:** A NumPy array (vector)

In [None]:
# Example: Creating vectors
import numpy as np

# A vector representing a house: [square_feet, bedrooms, bathrooms]
house_1 = np.array([1500, 3, 2])
house_2 = np.array([2000, 4, 3])

print("House 1:", house_1)
print("House 2:", house_2)
print("\nShape of house_1:", house_1.shape)  # (3,) means 3 elements

### Vector Addition and Subtraction

Adding or subtracting vectors is done element-wise (matching positions are added/subtracted).

In [None]:
# Example: Vector addition
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])

# Add vectors element-wise
v_sum = v1 + v2
print("v1 + v2 =", v_sum)

# Subtract vectors element-wise
v_diff = v2 - v1
print("v2 - v1 =", v_diff)

### Scalar Multiplication

Multiplying a vector by a single number (scalar) multiplies each element by that number.

In [None]:
# Example: Scalar multiplication
v = np.array([1, 2, 3])
scalar = 5

result = scalar * v
print(f"{scalar} * {v} = {result}")

### Vector Magnitude (Length)

The magnitude of a vector is its "length" - how far it reaches from the origin.

In [None]:
# Example: Vector magnitude
v = np.array([3, 4])

# Calculate magnitude using np.linalg.norm()
magnitude = np.linalg.norm(v)
print(f"Magnitude of {v} = {magnitude}")

# Verify: For [3, 4], magnitude = sqrt(3^2 + 4^2) = sqrt(9 + 16) = 5
manual_calc = np.sqrt(3**2 + 4**2)
print(f"Manual calculation: {manual_calc}")
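
A common follow-up to computing a magnitude is rescaling a vector to length 1 (a unit vector), which keeps its direction but discards its size. The sketch below is only illustrative; the name `unit_v` is an assumption and does not appear elsewhere in this module.

```python
import numpy as np

v = np.array([3.0, 4.0])

# Divide by the magnitude to get a unit vector (same direction, length 1)
unit_v = v / np.linalg.norm(v)

print("Unit vector:", unit_v)
print("Its magnitude:", np.linalg.norm(unit_v))
```

This only works for nonzero vectors, since a zero vector has magnitude 0 and no direction.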

## Practice Exercise 1.1

**Task:** Create two vectors representing customer spending: `customer_a = [100, 200, 150]` and `customer_b = [80, 220, 130]`. Calculate:
1. Total combined spending
2. The difference in spending
3. If customer_a doubles their spending, what would the new vector be?

**Expected Output:**
```
Combined spending: [180 420 280]
Spending difference: [ 20 -20  20]
Doubled spending: [200 400 300]
```

In [None]:
# Your code here


In [None]:
# Solution 1.1

customer_a = np.array([100, 200, 150])
customer_b = np.array([80, 220, 130])

# 1. Combined spending
combined = customer_a + customer_b
print("Combined spending:", combined)

# 2. Spending difference
difference = customer_a - customer_b
print("Spending difference:", difference)

# 3. Doubled spending
doubled = 2 * customer_a
print("Doubled spending:", doubled)

---
# Section 2: Matrices and Matrix Operations
---

## What is a Matrix?

A **matrix** is a 2D array of numbers arranged in rows and columns. Think of it as a spreadsheet or table.

In data science:
- Your entire dataset is usually a matrix (rows = observations, columns = features)
- Model parameters can be represented as matrices
- Images are matrices (pixels arranged in rows and columns)

### Why This Matters in Data Science

When you load a numeric dataset into pandas, the resulting DataFrame behaves much like a matrix. Understanding matrix operations allows you to efficiently transform and analyze entire datasets at once.

## Syntax

```python
# Create a matrix (2D array)
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
```

**Parameters:**
- Nested list (list of lists) where each inner list is a row

**Returns:** A 2D NumPy array (matrix)

In [None]:
# Example: Creating matrices

# A dataset with 3 houses and 3 features
# Rows = houses, Columns = [square_feet, bedrooms, bathrooms]
houses = np.array([
    [1500, 3, 2],
    [2000, 4, 3],
    [1200, 2, 1]
])

print("Houses matrix:")
print(houses)
print(f"\nShape: {houses.shape}")  # (3, 3) = 3 rows, 3 columns
print(f"Number of houses: {houses.shape[0]}")
print(f"Number of features: {houses.shape[1]}")

### Matrix Addition and Subtraction

Like vectors, matrices are added/subtracted element-wise.

In [None]:
# Example: Matrix addition
matrix_a = np.array([[1, 2],
                     [3, 4]])

matrix_b = np.array([[5, 6],
                     [7, 8]])

result = matrix_a + matrix_b
print("Matrix A + Matrix B:")
print(result)

### Scalar Multiplication on Matrices

Multiply every element in the matrix by the scalar.

In [None]:
# Example: Scalar multiplication on matrix
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

# Multiply all values by 10
result = 10 * matrix
print("Original matrix * 10:")
print(result)

### Element-wise Multiplication

This is NOT matrix multiplication - it multiplies corresponding elements.

In [None]:
# Example: Element-wise multiplication
A = np.array([[1, 2],
              [3, 4]])

B = np.array([[2, 0],
              [1, 3]])

# Element-wise multiplication
elementwise = A * B
print("Element-wise multiplication:")
print(elementwise)
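
To make the difference concrete, here is a small side-by-side sketch using the same `A` and `B`. The `@` operator is matrix multiplication, covered in Section 3; the variable name `matmul` is just a label chosen for this example.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[2, 0],
              [1, 3]])

elementwise = A * B   # multiplies matching positions
matmul = A @ B        # rows of A combined with columns of B

print("A * B (element-wise):")
print(elementwise)
print("A @ B (matrix multiplication):")
print(matmul)
```

The two results differ, which is why mixing up `*` and `@` is a common source of silent bugs.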

## Practice Exercise 2.1

**Task:** Create a matrix representing sales for 3 products across 3 months:
```
       Jan  Feb  Mar
Prod A  100  120  110
Prod B  200  190  210
Prod C  150  160  155
```
Then:
1. If all sales increase by 10%, create the new matrix
2. Calculate the total sales per product (sum across months)

**Expected Output:**
```
10% increase:
[[110.  132.  121. ]
 [220.  209.  231. ]
 [165.  176.  170.5]]
Total per product: [330 600 465]
```

In [None]:
# Your code here


In [None]:
# Solution 2.1

sales = np.array([
    [100, 120, 110],  # Product A
    [200, 190, 210],  # Product B
    [150, 160, 155]   # Product C
])

# 1. 10% increase
increased_sales = sales * 1.10
print("10% increase:")
print(increased_sales)

# 2. Total per product (sum across columns, axis=1)
total_per_product = np.sum(sales, axis=1)
print("\nTotal per product:", total_per_product)

---
# Section 3: Matrix Multiplication and Transpose
---

## Matrix Multiplication

Matrix multiplication is different from element-wise multiplication. It's a fundamental operation in machine learning, used in:
- Making predictions with linear models
- Neural network computations
- Transforming data

### Key Rules:
- To multiply A × B, the number of columns in A must equal the number of rows in B
- If A is (m × n) and B is (n × p), the result is (m × p)
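
The shape rules above can be checked directly. This is a minimal sketch with placeholder matrices of ones; the flag `mismatch_raised` is a hypothetical name used only to record whether the incompatible product failed.

```python
import numpy as np

A = np.ones((2, 3))   # 2 rows, 3 columns
B = np.ones((3, 4))   # 3 rows, 4 columns

# Inner dimensions match (3 and 3), so the result is 2x4
product = A @ B
print("A @ B shape:", product.shape)

# Reversing the order gives inner dimensions 4 and 2, which do not match
try:
    B @ A
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("B @ A failed:", err)
```

Note that order matters: even when both `A @ B` and `B @ A` are defined, they are generally not equal.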

### Why This Matters in Data Science

In linear regression, predictions are made using matrix multiplication:
**Predictions = Data × Weights**

In [None]:
# Example: Matrix multiplication

# Matrix A: 2x3
A = np.array([[1, 2, 3],
              [4, 5, 6]])

# Matrix B: 3x2
B = np.array([[7, 8],
              [9, 10],
              [11, 12]])

# Matrix multiplication using @ or np.dot()
result = A @ B  # or np.dot(A, B)
print("A @ B (matrix multiplication):")
print(result)
print(f"Shape: {result.shape}")  # (2, 2)

In [None]:
# Example: Real-world application - predicting house prices

# 3 houses with 2 features: [square_feet, bedrooms]
houses = np.array([
    [1500, 3],
    [2000, 4],
    [1200, 2]
])

# Model weights: [weight_for_sqft, weight_for_bedrooms]
weights = np.array([100, 10000])  # $100 per sqft, $10,000 per bedroom

# Predictions = houses @ weights
predictions = houses @ weights
print("Predicted house prices:")
print(predictions)
print("\nBreakdown for house 1:")
print(f"1500 sqft × $100 + 3 bedrooms × $10,000 = ${predictions[0]:,.0f}")

## Matrix Transpose

Transposing a matrix flips it over its diagonal - rows become columns and columns become rows.

In [None]:
# Example: Matrix transpose

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

print("Original matrix (2x3):")
print(matrix)
print(f"Shape: {matrix.shape}")

# Transpose using .T
transposed = matrix.T
print("\nTransposed matrix (3x2):")
print(transposed)
print(f"Shape: {transposed.shape}")
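
One transpose property that comes up often in ML derivations, though it is not derived in this module, is that the transpose of a product reverses the order of the factors: (A @ B).T equals B.T @ A.T. A quick numerical check, with `left` and `right` as illustrative names:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])       # 2x3
B = np.array([[7, 8],
              [9, 10],
              [11, 12]])        # 3x2

left = (A @ B).T     # transpose of the product
right = B.T @ A.T    # product of the transposes, in reverse order

print(left)
print("Equal?", np.array_equal(left, right))
```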

## Practice Exercise 3.1

**Task:** You have 4 students and their scores in 3 subjects:
```
students = [[85, 90, 88],  # Student 1: Math, Science, English
            [78, 82, 80],  # Student 2
            [92, 95, 90],  # Student 3
            [70, 75, 72]]  # Student 4
```
Subject weights: Math=0.4, Science=0.3, English=0.3

Calculate weighted final scores for all students using matrix multiplication.

**Expected Output:**
```
Weighted scores: [87.4 79.8 92.3 72.1]
```

In [None]:
# Your code here


In [None]:
# Solution 3.1

students = np.array([
    [85, 90, 88],
    [78, 82, 80],
    [92, 95, 90],
    [70, 75, 72]
])

weights = np.array([0.4, 0.3, 0.3])  # Math, Science, English

# Calculate weighted scores using matrix multiplication
weighted_scores = students @ weights
print("Weighted scores:", weighted_scores)

---
# Section 4: Identity Matrix and Inverse
---

## Identity Matrix

The identity matrix is like the number 1 in matrix math. Multiplying any matrix by an identity matrix of compatible size gives you the original matrix back.

Properties:
- Square matrix (same rows and columns)
- 1s on the diagonal, 0s everywhere else
- Denoted as **I**

In [None]:
# Example: Identity matrix

# Create a 3x3 identity matrix
I = np.eye(3)
print("3x3 Identity matrix:")
print(I)

# Verify property: A @ I = A
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

result = A @ I
print("\nA @ I equals A:")
print(result)
print("\nAre they equal?", np.array_equal(A, result))

## Matrix Inverse

The inverse of a square matrix A (denoted A⁻¹) plays the role of division in matrix math. When you multiply a matrix by its inverse, you get the identity matrix:

**A × A⁻¹ = I**

### Why This Matters in Data Science

Matrix inverse is used in:
- Solving systems of equations
- Computing linear regression coefficients
- Statistical calculations (covariance matrices)

In [None]:
# Example: Matrix inverse

A = np.array([[4, 7],
              [2, 6]])

print("Original matrix A:")
print(A)

# Calculate inverse
A_inv = np.linalg.inv(A)
print("\nInverse of A:")
print(A_inv)

# Verify: A @ A_inv should equal identity matrix
result = A @ A_inv
print("\nA @ A_inv (should be identity):")
print(np.round(result, 10))  # Round to handle floating point errors
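
The regression use case listed above can be sketched with the normal equation, w = (XᵀX)⁻¹Xᵀy. The data here is made up and noise-free purely for illustration, and the names `X`, `y`, and `w` are assumptions rather than anything defined earlier in the module.

```python
import numpy as np

# Hypothetical data generated from y = 1 + 2x (no noise)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x

# Design matrix: a column of ones (intercept) next to the feature
X = np.column_stack([np.ones_like(x), x])

# Normal equation, written with the explicit inverse for clarity
w = np.linalg.inv(X.T @ X) @ X.T @ y
print("Intercept and slope:", w)
```

In practice, library routines such as `np.linalg.solve` or `np.linalg.lstsq` are usually preferred over forming the inverse explicitly.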

### Important Note: Singular Matrices

Not all matrices have an inverse. A matrix without an inverse is called **singular** or **non-invertible**.

In [None]:
# Example: Singular matrix (no inverse)

# This matrix is singular (second row = 2 × first row)
singular = np.array([[1, 2],
                     [2, 4]])

print("Singular matrix:")
print(singular)

# Try to compute the inverse (np.linalg.inv raises LinAlgError for a singular matrix)
try:
    inv = np.linalg.inv(singular)
    print("Inverse:", inv)
except np.linalg.LinAlgError:
    print("\nError: This matrix is singular and has no inverse!")
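
Instead of waiting for an error, you can test for singularity up front. The sketch below uses two standard checks, the determinant and the rank; the flag name `is_singular` is an assumption. Because of floating-point rounding, the determinant is compared to zero with a tolerance rather than with `==`.

```python
import numpy as np

singular = np.array([[1, 2],
                     [2, 4]])

# A singular matrix has determinant 0 (up to rounding)
det = np.linalg.det(singular)
is_singular = np.isclose(det, 0.0)

# Its rank is lower than its size: the rows are not independent
rank = np.linalg.matrix_rank(singular)

print("Determinant:", det)
print("Rank:", rank, "of", singular.shape[0])
print("Singular?", is_singular)
```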

## Practice Exercise 4.1

**Task:** Create a 2x2 matrix and:
1. Calculate its inverse
2. Verify that matrix × inverse = identity

Use this matrix:
```
[[3, 1],
 [5, 2]]
```

In [None]:
# Your code here


In [None]:
# Solution 4.1

A = np.array([[3, 1],
              [5, 2]])

print("Matrix A:")
print(A)

# 1. Calculate inverse
A_inv = np.linalg.inv(A)
print("\nInverse of A:")
print(A_inv)

# 2. Verify A @ A_inv = I
result = A @ A_inv
print("\nA @ A_inv:")
print(np.round(result, 10))

# Compare with identity
I = np.eye(2)
print("\nIs it equal to identity?", np.allclose(result, I))

---
# Section 5: Dot Product and Cross Product
---

## Dot Product

The dot product takes two vectors of the same length and produces a single number (scalar). It is larger when the vectors point in similar directions, but it also grows with their magnitudes.

Formula: **a · b = a₁b₁ + a₂b₂ + a₃b₃ + ...**

### Why This Matters in Data Science

Dot products are used in:
- Calculating similarity between data points
- Computing predictions in machine learning
- Measuring angles between vectors

In [None]:
# Example: Dot product

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# Calculate dot product
dot_product = np.dot(a, b)
print(f"a · b = {dot_product}")

# Manual calculation
manual = 1*4 + 2*5 + 3*6
print(f"Manual: 1×4 + 2×5 + 3×6 = {manual}")

In [None]:
# Example: Real-world application - similarity

# User ratings for 3 movies
user_a = np.array([5, 4, 1])  # Loves action, likes comedy, dislikes drama
user_b = np.array([5, 3, 2])  # Loves action, neutral comedy, neutral drama
user_c = np.array([1, 2, 5])  # Dislikes action, neutral comedy, loves drama

# Calculate similarity using dot product
similarity_ab = np.dot(user_a, user_b)
similarity_ac = np.dot(user_a, user_c)

print("Similarity between User A and User B:", similarity_ab)
print("Similarity between User A and User C:", similarity_ac)
print("\nUser A is more similar to User B (higher dot product)")
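
Because a raw dot product also grows with vector length, a user who simply gives high ratings everywhere can look "similar" to everyone. A common remedy is cosine similarity, which divides the dot product by both magnitudes. The helper `cosine_similarity` below is a hypothetical function written for this sketch, not a NumPy built-in.

```python
import numpy as np

def cosine_similarity(u, v):
    """Dot product divided by the product of the magnitudes."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

user_a = np.array([5, 4, 1])
user_b = np.array([5, 3, 2])
user_c = np.array([1, 2, 5])

cos_ab = cosine_similarity(user_a, user_b)
cos_ac = cosine_similarity(user_a, user_c)

print(f"Cosine similarity A-B: {cos_ab:.3f}")
print(f"Cosine similarity A-C: {cos_ac:.3f}")
```

Scaling either vector by a positive constant leaves the cosine similarity unchanged, so it compares direction only.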

## Cross Product

The cross product of two 3D vectors produces another vector that is perpendicular to both input vectors. It's mainly used in 3D graphics and physics.

Note: The cross product described here is defined only for 3D vectors.

In [None]:
# Example: Cross product

a = np.array([1, 0, 0])  # Vector pointing along x-axis
b = np.array([0, 1, 0])  # Vector pointing along y-axis

# Calculate cross product
cross = np.cross(a, b)
print("a × b =", cross)
print("\nThis vector points along the z-axis (perpendicular to both a and b)")
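
The perpendicularity claim can be checked with the dot product: two vectors are perpendicular exactly when their dot product is 0. A small sketch with less trivial inputs, where `u`, `w`, and `perp` are illustrative names:

```python
import numpy as np

u = np.array([1, 2, 3])
w = np.array([4, 5, 6])

perp = np.cross(u, w)
print("u × w =", perp)

# Both dot products should be 0 if perp is perpendicular to u and w
print("perp · u =", np.dot(perp, u))
print("perp · w =", np.dot(perp, w))
```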

## Practice Exercise 5.1

**Task:** Calculate the dot product between two customer preference vectors:
- Customer 1 preferences: [3, 5, 2, 4] (ratings for 4 product categories)
- Customer 2 preferences: [4, 4, 3, 5]

Also calculate the magnitude of each preference vector.

In [None]:
# Your code here


In [None]:
# Solution 5.1

customer_1 = np.array([3, 5, 2, 4])
customer_2 = np.array([4, 4, 3, 5])

# Dot product
similarity = np.dot(customer_1, customer_2)
print("Dot product (similarity):", similarity)

# Magnitudes
mag_1 = np.linalg.norm(customer_1)
mag_2 = np.linalg.norm(customer_2)
print(f"\nMagnitude of customer 1: {mag_1:.2f}")
print(f"Magnitude of customer 2: {mag_2:.2f}")

---
# Section 6: Eigenvalues and Eigenvectors
---

## What are Eigenvalues and Eigenvectors?

When you multiply a matrix by one of its special nonzero vectors (an eigenvector), the result is the same vector scaled by a number (the eigenvalue):

**A × v = λ × v**

Where:
- A = matrix
- v = eigenvector
- λ (lambda) = eigenvalue

### Why This Matters in Data Science

Eigenvalues and eigenvectors are fundamental to:
- **Principal Component Analysis (PCA)** - finding the directions along which the data varies most
- Understanding data variance
- Dimensionality reduction
- Recommender systems

In [None]:
# Example: Computing eigenvalues and eigenvectors

# A simple 2x2 matrix
A = np.array([[4, 2],
              [1, 3]])

print("Matrix A:")
print(A)

# Calculate eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

print("\nEigenvalues:")
print(eigenvalues)
print("\nEigenvectors (columns):")
print(eigenvectors)

In [None]:
# Example: Verify the eigenvector equation A × v = λ × v

# Take first eigenvector and eigenvalue
v1 = eigenvectors[:, 0]  # First column
lambda1 = eigenvalues[0]

print("First eigenvector:", v1)
print("First eigenvalue:", lambda1)

# Compute A @ v1
result1 = A @ v1
print("\nA @ v1 =", result1)

# Compute lambda1 * v1
result2 = lambda1 * v1
print("λ1 × v1 =", result2)

print("\nAre they equal?", np.allclose(result1, result2))
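
Two further sanity checks are available for any square matrix: the eigenvalues sum to the trace and multiply to the determinant. These are standard facts that this module does not derive; the sketch reuses the same matrix `A` as above.

```python
import numpy as np

A = np.array([[4, 2],
              [1, 3]])

eigenvalues, _ = np.linalg.eig(A)

# Sum of eigenvalues should match the trace (sum of the diagonal)
print("Sum of eigenvalues:", eigenvalues.sum(), "| trace:", np.trace(A))

# Product of eigenvalues should match the determinant
print("Product of eigenvalues:", eigenvalues.prod(), "| determinant:", np.linalg.det(A))
```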

## Practice Exercise 6.1

**Task:** Compute eigenvalues and eigenvectors for this matrix:
```
[[6, 2],
 [2, 3]]
```
Then identify the largest eigenvalue and its eigenvector. If this matrix were a covariance matrix, that eigenvector would point in the direction of maximum variance.

In [None]:
# Your code here


In [None]:
# Solution 6.1

A = np.array([[6, 2],
              [2, 3]])

print("Matrix:")
print(A)

# Compute eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

print("\nEigenvalues:", eigenvalues)
print("\nEigenvectors:")
print(eigenvectors)

# Find largest eigenvalue
max_idx = np.argmax(eigenvalues)
print(f"\nLargest eigenvalue: {eigenvalues[max_idx]:.2f}")
print(f"Corresponding eigenvector: {eigenvectors[:, max_idx]}")

---
# Section 7: Linear Transformations
---

## What is a Linear Transformation?

A linear transformation is a function that takes vectors as input and produces vectors as output, following these rules:
1. Lines remain lines
2. The origin stays fixed
3. Grid lines remain parallel and evenly spaced

In practice, applying a matrix to a vector transforms (moves, rotates, scales) that vector.
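
Algebraically, the rules above amount to one property: transforming a weighted combination of vectors gives the same result as combining the transformed vectors. The sketch below checks this numerically for one arbitrary matrix; `M`, `u`, `v`, `a`, and `b` are hypothetical values chosen for illustration.

```python
import numpy as np

M = np.array([[2, 1],
              [0, 3]])    # hypothetical transformation matrix
u = np.array([1, 2])
v = np.array([3, -1])
a, b = 2, -4

left = M @ (a * u + b * v)          # transform the combination
right = a * (M @ u) + b * (M @ v)   # combine the transformed vectors

print("Transform of combination:", left)
print("Combination of transforms:", right)
print("Origin stays fixed:", M @ np.zeros(2))
```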

### Why This Matters in Data Science

Linear transformations are used to:
- Scale features (normalization)
- Rotate data (PCA)
- Transform features in neural networks
- Change coordinate systems

In [None]:
# Example: Scaling transformation

# Scaling matrix (doubles x, triples y)
scale_matrix = np.array([[2, 0],
                         [0, 3]])

# Original point
point = np.array([3, 2])
print("Original point:", point)

# Apply transformation
transformed = scale_matrix @ point
print("Transformed point:", transformed)
print("\nThe x-coordinate doubled (3→6) and y-coordinate tripled (2→6)")

In [None]:
# Example: Rotation transformation (90 degrees counterclockwise)

# Rotation matrix for 90 degrees
rotation_90 = np.array([[0, -1],
                        [1, 0]])

point = np.array([3, 0])  # Point on positive x-axis
print("Original point:", point)

rotated = rotation_90 @ point
print("After 90° rotation:", rotated)
print("\nThe point moved from x-axis to y-axis")
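
The 90° matrix is one case of the general 2D rotation matrix [[cos θ, -sin θ], [sin θ, cos θ]]. Below is a minimal sketch; `rotation_matrix` is a hypothetical helper written for this example, and it takes the angle in degrees.

```python
import numpy as np

def rotation_matrix(theta_degrees):
    """2D counterclockwise rotation matrix for an angle given in degrees."""
    theta = np.radians(theta_degrees)
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

point = np.array([3, 0])
rotated_45 = rotation_matrix(45) @ point
rotated_90 = rotation_matrix(90) @ point

print("After 45° rotation:", np.round(rotated_45, 3))
print("After 90° rotation:", np.round(rotated_90, 3))
```

Rotations change direction but not length, so the magnitude of the point stays 3 in both cases.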

In [None]:
# Example: Feature scaling (normalization) - common in ML

# Dataset: [height_cm, weight_kg] for 3 people
data = np.array([
    [170, 65],
    [180, 75],
    [160, 55]
])

print("Original data:")
print(data)

# Create scaling transformation (divide height by 100, weight by 10)
scale_transform = np.array([[1/100, 0],
                            [0, 1/10]])

# Apply to each person
scaled_data = (scale_transform @ data.T).T
print("\nScaled data:")
print(scaled_data)
print("\nNow features are on similar scales (important for ML!)")

## Practice Exercise 7.1

**Task:** Apply a transformation that:
1. Scales x-coordinates by 0.5
2. Scales y-coordinates by 2

Apply this to the points: [[4, 3], [2, 1], [6, 5]]

In [None]:
# Your code here


In [None]:
# Solution 7.1

# Transformation matrix
transform = np.array([[0.5, 0],
                      [0, 2]])

# Points
points = np.array([
    [4, 3],
    [2, 1],
    [6, 5]
])

print("Original points:")
print(points)

# Apply transformation (need to transpose for matrix multiplication)
transformed_points = (transform @ points.T).T
print("\nTransformed points:")
print(transformed_points)

---
# Section 8: Applications in Machine Learning
---

## How Linear Algebra Powers Machine Learning

Let's see practical examples of how linear algebra concepts are used in real ML workflows.

### Application 1: Feature Scaling (Standardization)

Standardization centers data around 0 with standard deviation of 1.

In [None]:
# Example: Standardizing features for ML

# Sample data: [age, income]
data = np.array([
    [25, 30000],
    [35, 50000],
    [45, 70000],
    [55, 90000],
    [30, 40000]
])

print("Original data:")
print(data)

# Calculate mean and standard deviation
mean = np.mean(data, axis=0)
std = np.std(data, axis=0)

print(f"\nMean: {mean}")
print(f"Std: {std}")

# Standardize: (x - mean) / std
standardized = (data - mean) / std
print("\nStandardized data:")
print(standardized)

# Verify: mean ≈ 0, std ≈ 1
print("\nNew mean:", np.mean(standardized, axis=0))
print("New std:", np.std(standardized, axis=0))

### Application 2: Computing Distance Between Points

Euclidean distance is the basis of the K-Nearest Neighbors (KNN) algorithm, which classifies a point by looking at the points closest to it.

In [None]:
# Example: Euclidean distance between data points

# Two customers with features: [age, annual_purchases, avg_purchase_value]
customer_1 = np.array([35, 20, 50])
customer_2 = np.array([40, 25, 60])

# Calculate Euclidean distance
difference = customer_2 - customer_1
distance = np.linalg.norm(difference)

print("Customer 1:", customer_1)
print("Customer 2:", customer_2)
print(f"\nDistance between customers: {distance:.2f}")
print("\nSmaller distance = more similar customers")
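
The same idea extends to finding the nearest neighbor among many points at once. This is only a sketch of the distance step, not a full KNN implementation; the `customers` rows and the `query` customer are made-up values.

```python
import numpy as np

# Hypothetical customers: [age, annual_purchases, avg_purchase_value]
customers = np.array([
    [35, 20, 50],
    [40, 25, 60],
    [22, 5, 30],
    [36, 22, 48],
])
query = np.array([34, 21, 52])  # hypothetical new customer

# Distance from the query to every row at once (broadcasting, then a norm per row)
distances = np.linalg.norm(customers - query, axis=1)
nearest = np.argmin(distances)

print("Distances:", np.round(distances, 2))
print("Nearest customer index:", nearest)
```

In real data, features on larger scales dominate the distance, which is one reason standardization usually comes first.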

### Application 3: Covariance Matrix

The covariance matrix summarizes how pairs of features vary together, and it is the foundation for PCA.

In [None]:
# Example: Computing covariance matrix

# Dataset: [hours_studied, exam_score] for 6 students
data = np.array([
    [2, 65],
    [3, 70],
    [4, 75],
    [5, 82],
    [6, 88],
    [7, 92]
])

print("Data (hours_studied, exam_score):")
print(data)

# Calculate covariance matrix
cov_matrix = np.cov(data.T)  # Transpose so features are rows
print("\nCovariance matrix:")
print(cov_matrix)
print("\nDiagonal: variance of each feature")
print("Off-diagonal: covariance between features")
print(f"\nPositive covariance ({cov_matrix[0,1]:.2f}) means: more study hours → higher scores")
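
Under the hood, the sample covariance matrix is itself a matrix product: center the data, multiply the transposed data by the data, and divide by n - 1. The sketch below reproduces `np.cov` this way; `centered` and `manual_cov` are illustrative names.

```python
import numpy as np

data = np.array([[2, 65], [3, 70], [4, 75], [5, 82], [6, 88], [7, 92]])

# Subtract each column's mean so every feature is centered at 0
centered = data - data.mean(axis=0)
n = data.shape[0]

# (features x samples) @ (samples x features) -> (features x features)
manual_cov = (centered.T @ centered) / (n - 1)

print(manual_cov)
print("Matches np.cov?", np.allclose(manual_cov, np.cov(data.T)))
```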

### Application 4: Preview of PCA (Principal Component Analysis)

PCA uses eigenvalues/eigenvectors to find the most important directions in data.

In [None]:
# Example: Simple PCA concept demonstration

# Sample data: [feature1, feature2]
data = np.array([
    [2.5, 2.4],
    [0.5, 0.7],
    [2.2, 2.9],
    [1.9, 2.2],
    [3.1, 3.0],
    [2.3, 2.7],
    [2.0, 1.6],
    [1.0, 1.1],
    [1.5, 1.6],
    [1.1, 0.9]
])

print("Original data shape:", data.shape)

# Center the data (subtract mean)
mean = np.mean(data, axis=0)
centered_data = data - mean

# Compute covariance matrix
cov_matrix = np.cov(centered_data.T)
print("\nCovariance matrix:")
print(cov_matrix)

# Find principal components (eigenvectors)
eigenvalues, eigenvectors = np.linalg.eig(cov_matrix)

print("\nEigenvalues (variance explained):")
print(eigenvalues)
print("\nPrincipal components (eigenvectors):")
print(eigenvectors)

# The eigenvector with largest eigenvalue is the direction of maximum variance
max_idx = np.argmax(eigenvalues)
print(f"\nFirst principal component explains {eigenvalues[max_idx]/np.sum(eigenvalues)*100:.1f}% of variance")
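
To actually reduce dimensions, each centered point is projected onto the first principal component, turning 2D data into 1D. The sketch below is a simplified illustration rather than a full PCA implementation. It swaps in `np.linalg.eigh`, which is intended for symmetric matrices such as a covariance matrix and returns eigenvalues in ascending order; `first_pc` and `projected` are illustrative names.

```python
import numpy as np

data = np.array([
    [2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
    [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]
])

centered = data - data.mean(axis=0)
cov_matrix = np.cov(centered.T)

# eigh sorts eigenvalues in ascending order, so the last column is the top component
eigenvalues, eigenvectors = np.linalg.eigh(cov_matrix)
first_pc = eigenvectors[:, -1]

# Project each 2D point onto the first principal component (2D -> 1D)
projected = centered @ first_pc

print("Projected shape:", projected.shape)
print("Variance along first PC:", projected.var(ddof=1))
print("Largest eigenvalue:", eigenvalues[-1])
```

The variance of the projected data equals the largest eigenvalue, which is why eigenvalues are read as "variance explained".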

## Practice Exercise 8.1

**Task:** Given customer data with features [age, income, years_customer]:
```
[[25, 30000, 2],
 [40, 55000, 5],
 [35, 45000, 3],
 [50, 75000, 8]]
```

1. Standardize the data
2. Calculate the covariance matrix of the standardized data
3. Find which features are most correlated

In [None]:
# Your code here


In [None]:
# Solution 8.1

data = np.array([
    [25, 30000, 2],
    [40, 55000, 5],
    [35, 45000, 3],
    [50, 75000, 8]
])

print("Original data:")
print(data)

# 1. Standardize
mean = np.mean(data, axis=0)
std = np.std(data, axis=0)
standardized = (data - mean) / std

print("\nStandardized data:")
print(standardized)

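# 2. Covariance matrix (ddof=0 matches np.std above, so the diagonal is 1
#    and the off-diagonal entries are correlations between -1 and 1)
cov_matrix = np.cov(standardized.T, ddof=0)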
print("\nCovariance matrix:")
print(cov_matrix)

# 3. Find correlations (off-diagonal elements)
print("\nCorrelations:")
print(f"Age vs Income: {cov_matrix[0,1]:.3f}")
print(f"Age vs Years: {cov_matrix[0,2]:.3f}")
print(f"Income vs Years: {cov_matrix[1,2]:.3f}")
print("\nHighest correlation is between Age and Income (makes sense!)")

---
# Section 9: NumPy Linear Algebra Toolkit
---

## Essential NumPy Linear Algebra Functions

Here's a quick reference to the NumPy functions you'll use most for linear algebra in data science.

In [None]:
# Reference: Key NumPy linear algebra functions
import numpy as np

# Sample data
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
v = np.array([1, 2])

print("=== Basic Operations ===")
print("Matrix addition: A + B")
print(A + B)

print("\nMatrix multiplication: A @ B or np.dot(A, B)")
print(A @ B)

print("\nElement-wise multiplication: A * B")
print(A * B)

print("\n=== Matrix Properties ===")
print("Transpose: A.T")
print(A.T)

print("\nDeterminant: np.linalg.det(A)")
print(np.linalg.det(A))

print("\nTrace (sum of diagonal): np.trace(A)")
print(np.trace(A))

print("\n=== Advanced Operations ===")
print("Inverse: np.linalg.inv(A)")
print(np.linalg.inv(A))

print("\nEigenvalues and eigenvectors:")
eigenvalues, eigenvectors = np.linalg.eig(A)
print("Eigenvalues:", eigenvalues)
print("Eigenvectors:\n", eigenvectors)

print("\n=== Vector Operations ===")
print("Dot product: np.dot(v, v)")
print(np.dot(v, v))

print("\nVector norm (magnitude): np.linalg.norm(v)")
print(np.linalg.norm(v))

## Quick Reference Table

| Operation | NumPy Function | Description |
|-----------|----------------|----------|
| Create vector | `np.array([1,2,3])` | 1D array |
| Create matrix | `np.array([[1,2],[3,4]])` | 2D array |
| Identity matrix | `np.eye(n)` | n×n identity |
| Zeros matrix | `np.zeros((m,n))` | m×n zeros |
| Ones matrix | `np.ones((m,n))` | m×n ones |
| Matrix multiply | `A @ B` or `np.dot(A,B)` | Standard multiplication |
| Element-wise multiply | `A * B` | Element-wise |
| Transpose | `A.T` | Flip rows/columns |
| Inverse | `np.linalg.inv(A)` | A⁻¹ |
| Determinant | `np.linalg.det(A)` | Scalar value |
| Eigenvalues | `np.linalg.eig(A)` | Returns values & vectors |
| Dot product | `np.dot(v1, v2)` | Scalar result |
| Cross product | `np.cross(v1, v2)` | Vector result (3D) |
| Vector norm | `np.linalg.norm(v)` | Magnitude |
| Matrix rank | `np.linalg.matrix_rank(A)` | Number of independent rows/cols |

## Practice Exercise 9.1

**Task:** Use NumPy to solve this system of linear equations:
```
2x + 3y = 8
4x + y = 10
```

Hint: This can be written as Ax = b, where:
- A = [[2, 3], [4, 1]]
- b = [8, 10]
- Solve using x = A⁻¹ × b

In [None]:
# Your code here


In [None]:
# Solution 9.1

# Coefficient matrix
A = np.array([[2, 3],
              [4, 1]])

# Constants vector
b = np.array([8, 10])

# Method 1: Using inverse matrix
x = np.linalg.inv(A) @ b
print("Solution using inverse:")
print(f"x = {x[0]}, y = {x[1]}")

# Method 2: Using np.linalg.solve (more efficient and numerically stable than forming the inverse)
x_solve = np.linalg.solve(A, b)
print("\nSolution using np.linalg.solve:")
print(f"x = {x_solve[0]}, y = {x_solve[1]}")

# Verify the solution
print("\nVerification:")
print(f"2({x[0]}) + 3({x[1]}) = {2*x[0] + 3*x[1]}")
print(f"4({x[0]}) + 1({x[1]}) = {4*x[0] + 1*x[1]}")

---
# Module Summary

## Key Takeaways

- **Vectors** represent data points, features, or model parameters
- **Matrices** represent entire datasets, transformations, or relationships
- **Matrix multiplication** is fundamental to machine learning predictions
- **Transpose** flips rows and columns - used in many ML operations
- **Identity matrix** is the multiplicative identity (like the number 1)
- **Matrix inverse** is used to solve equations and compute ML parameters
- **Dot product** measures similarity between vectors
- **Eigenvalues/eigenvectors** find principal directions in data (PCA)
- **Linear transformations** scale, rotate, and manipulate data
- **NumPy** provides efficient implementations of all linear algebra operations

## Real-World Applications

1. **Machine Learning Models**: Use matrix multiplication for predictions
2. **Feature Scaling**: Linear transformations normalize data
3. **Dimensionality Reduction**: PCA uses eigenvalues/eigenvectors
4. **Similarity Measures**: Dot products compute similarity
5. **Recommender Systems**: Matrix factorization finds patterns
6. **Image Processing**: Images are matrices that can be transformed

## Next Module

In the next module, we'll cover **Calculus for Machine Learning**, where you'll learn about derivatives, gradients, and optimization - the mathematical foundation for training ML models.

## Additional Practice

For extra practice, try these challenges:
1. Create a function that standardizes any dataset using matrix operations
2. Implement a simple similarity search using dot products
3. Calculate PCA manually on a 2D dataset and visualize the principal components
4. Solve a system of 3 equations with 3 unknowns using matrix inverse
5. Create rotation matrices for different angles and apply them to points

---