---
__About Section:__

- __Author name:__ UBAIDULLAH

- __Email:__ [ai.bussiness.student0@gmail.com](mailto:ai.bussiness.student0@gmail.com)

- __GitHub:__ [github.com/ubaid-X/](https://github.com/ubaid-X/)

- __LinkedIn Profile:__ [linkedin.com/in/ubaid-ullah-634563373/](https://www.linkedin.com/in/ubaid-ullah-634563373/)

- __Kaggle:__ [kaggle.com/ubaidullah01](https://www.kaggle.com/ubaidullah01)

---

## 5. Gaussian Elimination

### Definition
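A step-by-step method that uses row operations to reduce a system's augmented matrix to row echelon form (a "staircase" with zeros below the diagonal), after which the unknowns are found by back substitution.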

### How It Works
1. Write equations in an augmented matrix
2. Use row operations to make zeros below the diagonal
3. Solve from bottom to top (back substitution)

### Example
Let's solve:
- `x + y + z = 6`
- `2x + y + 3z = 14`
- `x + 2y + z = 8`

**Step 1:** Make augmented matrix:
```
[1  1  1 | 6]
[2  1  3 |14]
[1  2  1 | 8]
```

**Step 2:** Row operations:
- R2 = R2 - 2×R1
- R3 = R3 - R1

New matrix:
```
[1  1  1 | 6]
[0 -1  1 | 2]
[0  1  0 | 2]
```

**Step 3:** More operations:
- R3 = R3 + R2

New matrix:
```
[1  1  1 | 6]
[0 -1  1 | 2]
[0  0  1 | 4]
```

**Step 4:** Back substitution:
- From R3: z = 4
- From R2: -y + 4 = 2 → -y = -2 → y = 2
- From R1: x + 2 + 4 = 6 → x + 6 = 6 → x = 0

**Answer:** x = 0, y = 2, z = 4
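
The elimination and back-substitution steps above can be sketched in Python. This is a minimal illustration, not a production solver: the function name `gaussian_elimination` is hypothetical, and the sketch assumes no zero pivot appears (it does no row swapping).

```python
import numpy as np

def gaussian_elimination(A, b):
    """Solve A x = b by forward elimination and back substitution (no pivoting)."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    n = len(b)
    M = np.hstack([A, b.reshape(-1, 1)])  # augmented matrix [A | b]

    # Forward elimination: create zeros below each diagonal entry
    for k in range(n - 1):
        if M[k, k] == 0:
            raise ZeroDivisionError("zero pivot; a row swap would be needed")
        for i in range(k + 1, n):
            factor = M[i, k] / M[k, k]
            M[i, k:] -= factor * M[k, k:]

    # Back substitution: solve from the last row upward
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (M[i, -1] - M[i, i + 1:n] @ x[i + 1:]) / M[i, i]
    return x

x = gaussian_elimination([[1, 1, 1], [2, 1, 3], [1, 2, 1]], [6, 14, 8])
print(x)  # x = 0, y = 2, z = 4
```

The inner loop performs exactly the operations written by hand above (`R2 = R2 - 2×R1`, and so on), only with the multiplier computed automatically.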

### Benefits
- Works for any linear system, and shows when a system has no solution or infinitely many
- Very systematic
- Good for computers

### Limitations
- Many steps
- Easy to make small mistakes
- Can get messy with fractions

### When to Use
- For systems with many equations
- When other methods don't work
- When working without a calculator

### Tips
- Take it one step at a time
- Double-check your arithmetic
- Practice with simple systems first

---

## 6. Gauss-Jordan Elimination

### Definition
An extension of Gaussian elimination that makes the matrix even simpler, so you can read the answers directly.

### How It Works
1. Do Gaussian elimination first
2. Continue with row operations to get ones on diagonal
3. Make zeros above the diagonal too
4. Read answers directly from the matrix

### Example
Let's solve:
- `2x + y = 5`
- `x - y = 1`

**Step 1:** Augmented matrix:
```
[2  1 | 5]
[1 -1 | 1]
```

**Step 2:** Row operations:
- Swap R1 and R2:
```
[1 -1 | 1]
[2  1 | 5]
```
- R2 = R2 - 2×R1:
```
[1 -1 | 1]
[0  3 | 3]
```
- R2 = R2 ÷ 3:
```
[1 -1 | 1]
[0  1 | 1]
```
- R1 = R1 + R2:
```
[1  0 | 2]
[0  1 | 1]
```

**Answer:** x = 2, y = 1 (directly from the matrix!)
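
A possible Python sketch of the same procedure is shown below. The function name `gauss_jordan` is hypothetical, and the sketch picks the largest entry in each column as the pivot (partial pivoting), so its intermediate rows differ from the hand calculation above even though the final answer is the same.

```python
import numpy as np

def gauss_jordan(A, b):
    """Reduce [A | b] to reduced row echelon form and return the solution column."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    n = len(b)
    M = np.hstack([A, b.reshape(-1, 1)])

    for k in range(n):
        # Partial pivoting: bring the largest entry in column k up to row k
        p = k + np.argmax(np.abs(M[k:, k]))
        if np.isclose(M[p, k], 0.0):
            raise ValueError("matrix is singular (no unique solution)")
        M[[k, p]] = M[[p, k]]
        M[k] /= M[k, k]                 # make the pivot equal to 1
        for i in range(n):
            if i != k:
                M[i] -= M[i, k] * M[k]  # zero out the rest of column k
    return M[:, -1]

x = gauss_jordan([[2, 1], [1, -1]], [5, 1])
print(x)  # x = 2, y = 1
```

Because every column is cleared both above and below the pivot, the last column of the reduced matrix is the solution and no back substitution is needed.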

### Benefits
- Gives answers directly
- Very elegant
- No back substitution needed

### Limitations
- Even more steps than Gaussian
- More calculations
- Can be time-consuming

### When to Use
- When you want the most reduced form
- For theoretical work
- When you need the inverse (reduce `[A | I]` to `[I | A⁻¹]`)

### Tips
- Master Gaussian elimination first
- Be patient with the steps
- Check your final matrix carefully

---

# 7. LU Decomposition Method

## 1. Simple Definition

LU Decomposition (Lower-Upper Decomposition) is a matrix factorization method that decomposes a square matrix into two triangular matrices:
- **L** (Lower triangular matrix) with diagonal elements = 1
- **U** (Upper triangular matrix)

Such that: **A = L × U**

---

## 2. Explanation

LU Decomposition is a fundamental technique in numerical linear algebra that:
- Factorizes a square matrix into lower and upper triangular matrices
- Provides an efficient way to solve systems of linear equations
- Serves as the foundation for matrix inversion and determinant calculation
- Is more computationally efficient than Gaussian elimination when solving multiple equations with the same coefficient matrix

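Without row exchanges, the factorization A = L × U exists for a nonsingular matrix exactly when all of its leading principal minors are nonzero. With row exchanges, every square matrix admits a factorization PA = LU, where P is a permutation matrix.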

---

## 3. Example: Solve for x, y, z

**System of Equations:**
```
2x + y + z = 5
4x - 6y + 0z = -2
-2x + 7y + 2z = 9
```

**Matrix form: A × X = B**
```
A = [[2, 1, 1],
     [4, -6, 0],
     [-2, 7, 2]]
     
B = [[5],
     [-2],
     [9]]
```

### Step 1: LU Decomposition of Matrix A

**Initialize L and U:**
```
L = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]

U = [[2, 1, 1],
     [4, -6, 0],
     [-2, 7, 2]]
```

**Step 1.1: Eliminate first column below pivot (2)**
- Multiplier for row2: 4/2 = 2
- Multiplier for row3: -2/2 = -1

Update:
```
U = [[2,  1,  1],
     [0, -8, -2],  # Row2 - 2×Row1
     [0,  8,  3]]  # Row3 - (-1)×Row1

L = [[1, 0, 0],
     [2, 1, 0],    # Store multiplier in L[1,0]
     [-1, 0, 1]]   # Store multiplier in L[2,0]
```

**Step 1.2: Eliminate second column below pivot (-8)**
- Multiplier for row3: 8/-8 = -1

Update:
```
U = [[2,  1,  1],
     [0, -8, -2],
     [0,  0,  1]]  # Row3 - (-1)×Row2

L = [[1,  0, 0],
     [2,  1, 0],
     [-1, -1, 1]]  # Store multiplier in L[2,1]
```

**Verification: L × U = A**
```
L × U = [[1,0,0][2,1,0][-1,-1,1]] × [[2,1,1][0,-8,-2][0,0,1]]
       = [[2,1,1][4,-6,0][-2,7,2]] = A ✓
```
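
The elimination with stored multipliers can be sketched as a short function. This is an illustrative Doolittle-style sketch without pivoting; the name `lu_doolittle` is hypothetical and the code assumes every pivot is nonzero.

```python
import numpy as np

def lu_doolittle(A):
    """Return (L, U) with A = L @ U and ones on the diagonal of L. No pivoting."""
    U = np.array(A, dtype=float)
    n = U.shape[0]
    L = np.eye(n)
    for k in range(n - 1):
        if U[k, k] == 0:
            raise ZeroDivisionError("zero pivot; this matrix needs row swaps (PA = LU)")
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]      # multiplier, stored in L
            U[i, k:] -= L[i, k] * U[k, k:]   # eliminate the entry below the pivot
    return L, U

A = np.array([[2, 1, 1], [4, -6, 0], [-2, 7, 2]])
L, U = lu_doolittle(A)
print(L)
print(U)
print(np.allclose(L @ U, A))  # True
```

The multipliers 2, -1, and -1 end up below the diagonal of `L`, matching the hand calculation.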

### Step 2: Solve L × Y = B for Y using forward substitution

**System:**
```
1y₁ + 0y₂ + 0y₃ = 5
2y₁ + 1y₂ + 0y₃ = -2
-1y₁ -1y₂ + 1y₃ = 9
```

**Forward substitution:**
- y₁ = 5/1 = 5
- y₂ = (-2 - 2×5)/1 = -12
- y₃ = (9 - (-1×5) - (-1×-12))/1 = (9 + 5 - 12) = 2

**Y = [5, -12, 2]ᵀ**

### Step 3: Solve U × X = Y for X using backward substitution

**System:**
```
2x + 1y + 1z = 5
0x - 8y - 2z = -12
0x + 0y + 1z = 2
```

**Backward substitution:**
- z = 2/1 = 2
- y = (-12 - (-2×2))/(-8) = (-12 + 4)/(-8) = (-8)/(-8) = 1
- x = (5 - 1×1 - 1×2)/2 = (5-1-2)/2 = 2/2 = 1

**Solution: x = 1, y = 1, z = 2**
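
The two triangular solves can be sketched as follows. The helper names `forward_substitution` and `back_substitution` are hypothetical, and the factors `L` and `U` are typed in directly from the worked example rather than computed.

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L y = b for lower triangular L, working from the top row down."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

def back_substitution(U, y):
    """Solve U x = y for upper triangular U, working from the bottom row up."""
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

L = np.array([[1.0, 0, 0], [2, 1, 0], [-1, -1, 1]])
U = np.array([[2.0, 1, 1], [0, -8, -2], [0, 0, 1]])
b = np.array([5.0, -2, 9])

y = forward_substitution(L, b)   # Y = [5, -12, 2]
x = back_substitution(U, y)      # X = [1, 1, 2]
print(y, x)
```

Each solve touches every entry of a triangular matrix once, which is where the O(n²) cost per right-hand side comes from.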

---

## 4. Use Cases in Data Science

1. **Linear Regression**: Efficiently solve normal equations (XᵀXβ = Xᵀy) for multiple right-hand sides
2. **Matrix Inversion**: Foundation for computational methods to find A⁻¹
3. **Determinant Calculation**: det(A) = det(L) × det(U) = product of U's diagonal elements (with pivoting, multiply by -1 for each row swap)
4. **Eigenvalue Algorithms**: Used as a building block in numerical methods
5. **Time Series Analysis**: Solving systems in ARIMA and state-space models
6. **Image Processing**: Solving large systems in convolutional operations and deblurring
7. **Optimization Problems**: Solving KKT conditions in constrained optimization

---

## 5. Benefits and Limitations

### Benefits:
- **Efficiency**: Once decomposed, solving with different B vectors is O(n²) instead of O(n³)
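- **Numerical Stability**: With partial pivoting it is as stable as Gaussian elimination with partial pivoting (the two perform the same arithmetic); without pivoting it can be unstable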
- **Memory Efficient**: Stores decomposition in place of original matrix
- **Multi-purpose**: Useful for determinant, inverse, and solving equations
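
The efficiency benefit can be illustrated with SciPy's `lu_factor` and `lu_solve`, which factor once and then reuse the factors. This is a small sketch; the second right-hand side `b2` is made up for the illustration.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2.0, 1, 1], [4, -6, 0], [-2, 7, 2]])

# Factor once: this is the O(n^3) part
lu_piv = lu_factor(A)

# Reuse the factors for each right-hand side: O(n^2) per solve
b1 = np.array([5.0, -2, 9])
b2 = np.array([1.0, 0, 0])     # a second, made-up right-hand side
x1 = lu_solve(lu_piv, b1)
x2 = lu_solve(lu_piv, b2)

print(x1)                        # x = 1, y = 1, z = 2
print(np.allclose(A @ x2, b2))   # True
```

For a single right-hand side there is no saving over plain elimination; the gain appears only when the same matrix is solved against many vectors.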

### Limitations:
- **Only for Square Matrices**: Works only with square coefficient matrices
- **Pivoting Required**: For numerical stability, especially for nearly singular matrices
- **Not for All Matrices**: Some matrices require permutation (PA = LU)
- **Dense Matrices**: For very large sparse systems, other methods may be better

---

## 6. When to Use LU Decomposition vs Other Methods

| Method | Best For | When to Choose Over LU |
|--------|----------|------------------------|
| **LU Decomposition** | Multiple systems with same A | Default choice for medium-sized systems |
| **Gaussian Elimination** | Single systems, teaching concepts | When you only need to solve one system |
| **Cholesky Decomposition** | Symmetric positive definite matrices | When A is symmetric positive definite (faster) |
| **QR Decomposition** | Least squares problems | For overdetermined systems (AX = B, A not square) |
| **SVD** | Rank-deficient matrices | When A is singular or nearly singular |
| **Iterative Methods** | Very large sparse systems | When A is too large for direct methods |

---

## 7. Implementation Tips

1. **Always Use Pivoting**: Implement partial pivoting (PA = LU) for numerical stability
2. **Check Condition Number**: Verify matrix is well-conditioned before decomposition
3. **Sparse Matrices**: Use specialized algorithms (like sparse LU) for sparse systems
4. **Verification**: Always check that L × U ≈ original matrix
5. **Memory Management**: For large matrices, use in-place computation to save memory
6. **Parallelization**: Consider parallel LU implementations for very large systems
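
A few of these tips can be sketched with NumPy and SciPy. Note the assumption about convention: `scipy.linalg.lu` returns factors with `A = P @ L @ U`, so the check below multiplies in that order.

```python
import numpy as np
from scipy.linalg import lu

A = np.array([[2.0, 1, 1], [4, -6, 0], [-2, 7, 2]])

# Tip 1: factor with partial pivoting (SciPy convention: A = P @ L @ U)
P, L, U = lu(A)

# Tip 4: verify the factorization reproduces the original matrix
reconstruction_ok = np.allclose(P @ L @ U, A)

# Tip 2: a large condition number warns that the solution may be inaccurate
cond = np.linalg.cond(A)

# Determinant from the factors: det(P) is +1 or -1, det(L) is 1
det_from_lu = np.linalg.det(P) * np.prod(np.diag(U))

print(reconstruction_ok, cond, det_from_lu)
```

What counts as "too large" a condition number depends on the accuracy needed; as a rough rule, about log10(cond) decimal digits can be lost in the solve.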


---
## 8. Python Implementation


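```python
# Python implementation example
import numpy as np
from scipy.linalg import lu, solve_triangular

# Create matrices
A = np.array([[2, 1, 1], [4, -6, 0], [-2, 7, 2]], dtype=float)
B = np.array([5, -2, 9], dtype=float)

# LU decomposition with partial pivoting; SciPy returns A = P @ L @ U
P, L, U = lu(A)

# Solve A x = B, i.e. L U x = P.T @ B (note the transpose of P)
y = solve_triangular(L, P.T @ B, lower=True)   # forward substitution
x = solve_triangular(U, y)                     # back substitution

print("Solution:", x)
# Output: [1. 1. 2.]
```

```
Solution: [1. 1. 2.]
```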


---

# 8. Singular Value Decomposition (SVD)

## 1. Simple Definition

Singular Value Decomposition (SVD) is a matrix factorization method that decomposes any matrix (square or rectangular) into three matrices:
- **U**: Left singular vectors (orthogonal matrix)
- **Σ**: Diagonal matrix of singular values (non-negative, in descending order)
- **Vᵀ**: Right singular vectors (orthogonal matrix)

Such that: **A = U × Σ × Vᵀ**

---

## 2. Explanation

SVD is one of the most important matrix decompositions in linear algebra with wide applications in data science and machine learning:

- Works for any matrix (square, rectangular, real, or complex)
- Reveals the fundamental geometric structure of a matrix
- Provides optimal low-rank approximations of matrices
- Forms the mathematical foundation for many dimensionality reduction techniques
- Is numerically stable and robust to computational errors

The decomposition exists for any m×n matrix A, where:
- U is an m×m orthogonal matrix (UᵀU = I)
- Σ is an m×n diagonal matrix with non-negative entries
- V is an n×n orthogonal matrix (VᵀV = I)
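
The shapes and orthogonality conditions can be checked numerically. The sketch below uses a made-up 2×3 matrix and `np.linalg.svd`, which returns the singular values as a 1-D array, so Σ has to be rebuilt before multiplying the factors back together.

```python
import numpy as np

A = np.array([[3.0, 1, 1], [-1, 3, 1]])   # made-up 2x3 example (m = 2, n = 3)

U, s, Vt = np.linalg.svd(A)               # full_matrices=True by default

# Rebuild the m x n matrix Sigma from the 1-D array of singular values
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

print(U.shape, Sigma.shape, Vt.shape)     # (2, 2) (2, 3) (3, 3)
print(np.allclose(U @ Sigma @ Vt, A))     # True: A = U Σ Vᵀ
print(np.allclose(U.T @ U, np.eye(2)))    # True: U is orthogonal
print(np.allclose(Vt @ Vt.T, np.eye(3)))  # True: V is orthogonal
```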

---

## 3. Example: Solving a System Using SVD (Mathematical Approach)

**System of Equations:**
```
x + 2y + 3z = 14
4x + 5y + 6z = 32
7x + 8y + 10z = 53
```

**Matrix form: A × X = B**
```
A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 10]]
     
B = [[14],
     [32],
     [53]]
```

### Step 1: Compute AᵀA and AAᵀ

First, we compute the products:
```
AᵀA = [[1,4,7],    [[1,2,3],    [[66,  78,  97],
        [2,5,8],  ×  [4,5,6],  =  [78,  93, 116],
        [3,6,10]]    [7,8,10]]    [97, 116, 145]]
        
AAᵀ = [[1,2,3],    [[1,4,7],    [[14,  32,  53],
        [4,5,6],  ×  [2,5,8],  =  [32,  77, 128],
        [7,8,10]]    [3,6,10]]    [53, 128, 213]]
```

### Step 2: Find Eigenvalues and Eigenvectors of AᵀA

Solve det(AᵀA - λI) = 0:
```
det([[66-λ, 78, 97],
     [78, 93-λ, 116],
     [97, 116, 145-λ]]) = 0
```

After calculation (done numerically), we get the eigenvalues:

λ₁ ≈ 303.195, λ₂ ≈ 0.766, λ₃ ≈ 0.0388

As a sanity check, they sum to the trace of AᵀA (66 + 93 + 145 = 304) and their product is det(AᵀA) = det(A)² = 9.

Singular values: σ₁ = √λ₁ ≈ 17.41, σ₂ = √λ₂ ≈ 0.875, σ₃ = √λ₃ ≈ 0.197

Eigenvectors (normalized; each is determined only up to sign):
- v₁ ≈ [0.46, 0.55, 0.69]ᵀ
- v₂ ≈ [0.83, -0.01, -0.55]ᵀ
- v₃ ≈ [0.30, -0.83, 0.47]ᵀ

Thus, V = [v₁, v₂, v₃]

### Step 3: Find U Matrix

Compute U columns using uᵢ = (1/σᵢ)Avᵢ (values below are computed from the unrounded vᵢ):
```
u₁ = (1/17.41)A×v₁ ≈ [0.21, 0.50, 0.84]ᵀ
u₂ = (1/0.875)A×v₂ ≈ [-0.96, -0.04, 0.26]ᵀ
u₃ = (1/0.197)A×v₃ ≈ [0.16, -0.86, 0.48]ᵀ
```

Thus, U = [u₁, u₂, u₃]

### Step 4: Compute Pseudoinverse A⁺

A⁺ = VΣ⁺Uᵀ, where Σ⁺ is the pseudoinverse of Σ (reciprocals of non-zero singular values). Because A is invertible here, A⁺ equals A⁻¹:
```
Σ⁺ = [[1/17.41, 0, 0],
      [0, 1/0.875, 0],
      [0, 0, 1/0.197]]
```

### Step 5: Solve for X

X = A⁺B = VΣ⁺UᵀB

After matrix multiplication:
```
X ≈ [[1],
     [2],
     [3]]
```

**Solution: x = 1, y = 2, z = 3**

Verification:
```
1(1) + 2(2) + 3(3) = 1 + 4 + 9 = 14 ✓
4(1) + 5(2) + 6(3) = 4 + 10 + 18 = 32 ✓
7(1) + 8(2) + 10(3) = 7 + 16 + 30 = 53 ✓
```
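
The pseudoinverse route can be cross-checked with NumPy's built-in routines. This sketch uses `np.linalg.pinv` and `np.linalg.lstsq`, both of which are SVD-based, in place of the hand calculation.

```python
import numpy as np

A = np.array([[1.0, 2, 3], [4, 5, 6], [7, 8, 10]])
B = np.array([14.0, 32, 53])

# Pseudoinverse solution: X = A⁺ B
x_pinv = np.linalg.pinv(A) @ B

# Least-squares solver; also reports the numerical rank of A
x_lstsq, residuals, rank, singular_values = np.linalg.lstsq(A, B, rcond=None)

print(x_pinv)    # x = 1, y = 2, z = 3
print(x_lstsq)   # same solution
print(rank)      # 3, so A has full rank and the solution is unique
```

For a square, full-rank matrix like this one, both calls agree with an ordinary solve; they become more useful when A is rectangular or rank-deficient.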

---

## 4. Use Cases in Data Science

1. **Dimensionality Reduction (PCA)**: SVD is the computational foundation for Principal Component Analysis
2. **Recommendation Systems**: Collaborative filtering using matrix factorization
3. **Image Compression**: Representing images with fewer components
4. **Natural Language Processing**: Latent Semantic Analysis (LSA) for document retrieval
5. **Data Denoising**: Removing noise by truncating small singular values
6. **Inverse Problems**: Solving ill-conditioned systems in physics and engineering
7. **Computer Vision**: Structure from motion and facial recognition
8. **Signal Processing**: Separating mixed signals (blind source separation)

---

## 5. Benefits and Limitations

### Benefits:
- **Universal Applicability**: Works for any matrix (square, rectangular, rank-deficient)
- **Numerical Stability**: More stable than other decompositions for ill-conditioned matrices
- **Optimal Approximations**: Provides the best low-rank approximation of a matrix (Eckart-Young theorem)
- **Reveals Structure**: Shows the intrinsic geometry of the data
- **Robust to Noise**: Truncating small singular values can remove noise
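
The low-rank approximation property can be illustrated on a small random matrix. This is a sketch under stated assumptions: the data are made up, and `k = 2` is an arbitrary choice of how many singular values to keep.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 5))                      # made-up data matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2                                            # arbitrary target rank
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]      # keep the k largest singular values

# Eckart-Young: the approximation error is given by the discarded singular values
spectral_error = np.linalg.norm(A - A_k, 2)       # equals s[k]
frobenius_error = np.linalg.norm(A - A_k, "fro")  # equals sqrt(sum of s[k:]**2)

print(np.isclose(spectral_error, s[k]))                           # True
print(np.isclose(frobenius_error, np.sqrt(np.sum(s[k:] ** 2))))   # True
```

No other rank-2 matrix gets closer to `A` in either norm, which is why truncating small singular values is used for compression and denoising.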

### Limitations:
- **Computational Cost**: O(mn²) for m×n matrix where m ≥ n
- **Memory Intensive**: For very large matrices
- **Interpretation**: Singular vectors may not have direct interpretation in original domain
- **Implementation Complexity**: More complex to implement than simpler decompositions

---

## 6. Python Implementation


```python
import numpy as np

# Create the matrix and vector
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 10]])
B = np.array([14, 32, 53])

# Compute SVD
U, S, Vt = np.linalg.svd(A)

# Create Sigma matrix with proper dimensions (shown for reference; not needed for the solve)
Sigma = np.zeros(A.shape)
Sigma[:len(S), :len(S)] = np.diag(S)

# Compute pseudoinverse
Sigma_plus = np.zeros(A.shape).T
Sigma_plus[:len(S), :len(S)] = np.diag(1/S)
A_plus = Vt.T @ Sigma_plus @ U.T

# Solve the system
X = A_plus @ B

print("Solution:", X)
# Output: [1. 2. 3.]
```

```
Solution: [1. 2. 3.]
```


### For large matrices, use truncated SVD

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Create a large matrix (example)
large_matrix = np.random.rand(1000, 500)

# Apply truncated SVD
svd = TruncatedSVD(n_components=50)
reduced_data = svd.fit_transform(large_matrix)

print("Original shape:", large_matrix.shape)
print("Reduced shape:", reduced_data.shape)
print("Explained variance ratio:", svd.explained_variance_ratio_.sum())
```

Sample output (the explained variance value changes from run to run because the matrix is random):

```
Original shape: (1000, 500)
Reduced shape: (1000, 50)
Explained variance ratio: 0.23303124749975593
```


---

## 7. When to Use SVD vs Other Methods

| Method | Best For | When to Choose Over SVD |
|--------|----------|------------------------|
| **SVD** | General-purpose, rank-deficient matrices, dimensionality reduction | Default for robust solutions and dimensionality reduction |
| **LU Decomposition** | Square systems with unique solutions | When A is square and well-conditioned (faster) |
| **QR Decomposition** | Least squares problems | When A has full column rank (more efficient) |
| **Cholesky Decomposition** | Symmetric positive definite matrices | When A is symmetric positive definite (much faster) |
| **Eigen Decomposition** | Square symmetric matrices | When A is symmetric and you need eigenvalues |

**Choose SVD when:**
- The matrix is rectangular or rank-deficient
- You need the most numerically stable solution
- You want to understand the fundamental structure of the data
- You need low-rank approximations for compression or denoising

---

## 8. Implementation Tips

1. **Use Established Libraries**: Prefer numpy.linalg.svd or scipy.linalg.svd over custom implementations
2. **Truncation for Efficiency**: Use truncated SVD (e.g., sklearn.decomposition.TruncatedSVD) for large matrices
3. **Randomized SVD**: For very large matrices, consider randomized SVD algorithms
4. **Condition Number**: Check the condition number (σ_max/σ_min) to understand stability
5. **Regularization**: For ill-conditioned problems, add regularization (Tikhonov regularization)
6. **Memory Management**: For massive matrices, use iterative methods or out-of-core computation
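
Tips 4 and 5 can be sketched together. The function name `tikhonov_solve` and the value `alpha=1e-3` are illustrative assumptions, not recommended settings; the idea is that each `1/σ` in the pseudoinverse is replaced by `σ/(σ² + α)`, which damps the contribution of small singular values.

```python
import numpy as np

def tikhonov_solve(A, b, alpha):
    """Minimize ||A x - b||^2 + alpha * ||x||^2 using the SVD of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    filt = s / (s**2 + alpha)          # replaces 1/s; damps small singular values
    return Vt.T @ (filt * (U.T @ b))

A = np.array([[1.0, 2, 3], [4, 5, 6], [7, 8, 10]])
b = np.array([14.0, 32, 53])

# Tip 4: condition number as the ratio of extreme singular values
s = np.linalg.svd(A, compute_uv=False)
cond = s[0] / s[-1]                    # same value as np.linalg.cond(A)

# Tip 5: alpha = 0 reproduces the plain SVD solve; alpha > 0 regularizes
x_plain = tikhonov_solve(A, b, alpha=0.0)
x_reg = tikhonov_solve(A, b, alpha=1e-3)

print(cond)
print(x_plain)   # x = 1, y = 2, z = 3
print(x_reg)     # slightly shrunk toward zero
```

Regularization trades a small bias for reduced sensitivity to noise, so the regularized answer no longer satisfies the equations exactly.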

## 9. Conclusion

SVD is a fundamental tool in the data scientist's toolkit, offering unparalleled versatility for matrix analysis and decomposition. Its ability to handle any matrix type, provide optimal low-rank approximations, and reveal the intrinsic structure of data makes it invaluable for tasks ranging from dimensionality reduction to solving ill-conditioned systems.

While computationally more expensive than some alternatives, its numerical stability and theoretical foundations make it the preferred choice for many advanced data science applications. Understanding when and how to apply SVD—including its truncated variants for large-scale problems—is essential for modern data analysis and machine learning.

# 9. Iterative Methods for Solving Linear Systems
---

## 1. Introduction
Iterative methods are algorithms used to solve systems of linear equations by repeatedly improving approximations to the solution. Unlike direct methods (e.g., Gaussian elimination), which provide an exact solution (up to rounding error) in a finite number of steps, iterative methods start with an initial guess and refine it until convergence. They are particularly useful for large, sparse systems where direct methods are computationally expensive.

---

## 2. Simple Definition
An iterative method solves \( A\mathbf{x} = \mathbf{b} \) by generating a sequence of approximations \( \mathbf{x}^{(0)}, \mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \dots \) that converge to the exact solution. Common techniques include the **Jacobi method** and **Gauss-Seidel method**.

---

## 3. Explanation

### - General Idea:
1. Start with an initial guess \( \mathbf{x}^{(0)} \).
2. Update each variable using the current values of other variables.
3. Repeat until the change between iterations is below a tolerance threshold.

### - Jacobi Method:
- Each variable is updated using values from the *previous* iteration.
- Formula for the \( k \)-th iteration:
  \[
  x_i^{(k)} = \frac{1}{a_{ii}} \left( b_i - \sum_{j \neq i} a_{ij} x_j^{(k-1)} \right)
  \]

### - Gauss-Seidel Method:
- Uses the *latest* updated values in the same iteration.
- Typically converges faster than Jacobi.
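
A minimal Gauss-Seidel sketch is shown below for comparison with the Jacobi formula. The function name `gauss_seidel` and the test system are made up for illustration; the matrix is chosen to be strictly diagonally dominant so that convergence is assured.

```python
import numpy as np

def gauss_seidel(A, b, tol=1e-10, max_iterations=500):
    """Solve A x = b iteratively, reusing updated components within each sweep."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b)
    for _ in range(max_iterations):
        x_old = x.copy()
        for i in range(len(b)):
            # x[:i] already holds this sweep's values; x[i+1:] is from the previous sweep
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (b[i] - s) / A[i, i]
        if np.linalg.norm(x - x_old, ord=np.inf) < tol:
            break
    return x

A = [[10, 2, 1], [1, 5, 1], [2, 3, 10]]   # made-up strictly diagonally dominant matrix
b = [7, -8, 6]
x = gauss_seidel(A, b)
print(x)
print(np.allclose(np.array(A) @ x, b))    # True
```

The only difference from Jacobi is that `x` is updated in place, so later rows in a sweep already see the new values of earlier variables.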

---

## 4. Example: Solving a 3×3 System
### Problem:
Solve the system:
\[
\begin{align*}
4x + y + z &= 7 \\
x + 3y + z &= 5 \\
x + y + 5z &= 3 \\
\end{align*}
\]

### Step-by-Step Solution using Jacobi Method:
1. **Rearrange equations to isolate variables:**
   \[
   x = \frac{7 - y - z}{4}, \quad y = \frac{5 - x - z}{3}, \quad z = \frac{3 - x - y}{5}
   \]

2. **Initial guess:** \( x^{(0)} = 0, y^{(0)} = 0, z^{(0)} = 0 \).

3. **Iteration 1:**
   \[
   x^{(1)} = \frac{7 - 0 - 0}{4} = 1.75 \\
   y^{(1)} = \frac{5 - 0 - 0}{3} \approx 1.6667 \\
   z^{(1)} = \frac{3 - 0 - 0}{5} = 0.6
   \]

4. **Iteration 2:**
   \[
   x^{(2)} = \frac{7 - 1.6667 - 0.6}{4} \approx 1.1833 \\
   y^{(2)} = \frac{5 - 1.75 - 0.6}{3} \approx 0.8833 \\
   z^{(2)} = \frac{3 - 1.75 - 1.6667}{5} \approx -0.0833
   \]

5. **Iteration 3:**
   \[
   x^{(3)} = \frac{7 - 0.8833 - (-0.0833)}{4} \approx 1.55 \\
   y^{(3)} = \frac{5 - 1.1833 - (-0.0833)}{3} \approx 1.3 \\
   z^{(3)} = \frac{3 - 1.1833 - 0.8833}{5} \approx 0.1867
   \]

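6. **Continue until convergence.** The iterates oscillate around the solution and, after roughly 10 iterations, agree with it to within about 0.01:
   \[
   x \approx 1.44, \quad y \approx 1.16, \quad z \approx 0.08
   \]

**Exact solution:** \( x = 1.44, y = 1.16, z = 0.08 \) (that is, \( x = 36/25, y = 29/25, z = 2/25 \)). Check with the first equation: \( 4(1.44) + 1.16 + 0.08 = 7 \).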

---

## 5. Use Cases in Data Science
1. **Large-Scale Linear Systems:** Solving normal equations in linear regression for big datasets.
2. **PageRank Algorithm:** Used by Google to rank web pages (solving a massive linear system iteratively).
3. **Image Reconstruction:** In tomography and deblurring.
4. **Machine Learning:** Training linear models (e.g., SGD for optimization).
5. **Finite Element Methods:** Solving partial differential equations numerically.

---

## 6. Benefits and Limitations
### - Benefits:
- **Efficiency for Large Systems:** Reduced memory and computation time for sparse matrices.
- **Parallelization:** Jacobi method can be parallelized easily.
- **Simplicity:** Easy to implement and understand.

### - Limitations:
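- **Convergence Not Guaranteed:** Convergence depends on the matrix. Strict diagonal dominance is sufficient for both Jacobi and Gauss-Seidel; symmetric positive definiteness is sufficient for Gauss-Seidel.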
- **Slow Convergence:** May need many iterations for high accuracy.
- **Accuracy:** Provides approximate solutions.

---

## 7. Python Implementation
### Jacobi Method Code:

```python
import numpy as np

def jacobi(A, b, tol=1e-10, max_iterations=100):
    # Work in floating point: integer arrays would silently truncate each update
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(b)
    x = np.zeros(n)
    for k in range(max_iterations):
        x_new = np.zeros(n)
        for i in range(n):
            # Sum of a_ij * x_j over j != i, using only the previous iterate
            s = np.dot(A[i, :], x) - A[i, i] * x[i]
            x_new[i] = (b[i] - s) / A[i, i]
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Example system:
A = np.array([[4, 1, 1], [1, 3, 1], [1, 1, 5]])
b = np.array([7, 5, 3])
x = jacobi(A, b)
print("Solution:", x)
# Output: [1.44 1.16 0.08]
```

```
Solution: [1.44 1.16 0.08]
```


---

## 8. When to Use Iterative Methods
1. **Large Sparse Matrices:** When \( A \) has mostly zeros (e.g., network data).
2. **Memory Constraints:** Direct methods require storing entire matrices, while iterative methods use less memory.
3. **Approximate Solutions Needed:** When an approximate solution is acceptable (e.g., machine learning).

**Avoid iterative methods for:**
- Small or dense systems.
- Ill-conditioned matrices (unless preconditioned).

---

## 9. Tips
1. **Check Diagonal Dominance:** Strict diagonal dominance, \( |a_{ii}| > \sum_{j \neq i} |a_{ij}| \) in every row, is a sufficient (not necessary) condition for convergence.
2. **Preconditioning:** Use preconditioners (e.g., Jacobi preconditioner) to speed up convergence.
3. **Initial Guess:** Choose a good initial guess (e.g., from a similar problem) to reduce iterations.
4. **Stop Early:** In ML, early stopping can prevent overfitting and save time.
5. **Hybrid Approaches:** Combine direct and iterative methods (e.g., use direct method for preconditioning).
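
The diagonal dominance check in tip 1 can be automated. The sketch below also computes the spectral radius of the Jacobi iteration matrix, since Jacobi converges for every starting guess exactly when that radius is below 1. Both function names are hypothetical.

```python
import numpy as np

def is_strictly_diagonally_dominant(A):
    """True if |a_ii| > sum of |a_ij| over j != i, for every row i."""
    A = np.abs(np.asarray(A, dtype=float))
    diag = np.diag(A)
    off_diag = A.sum(axis=1) - diag
    return bool(np.all(diag > off_diag))

def jacobi_spectral_radius(A):
    """Largest |eigenvalue| of the Jacobi iteration matrix -D^{-1}(A - D)."""
    A = np.asarray(A, dtype=float)
    D = np.diag(np.diag(A))
    T = -np.linalg.solve(D, A - D)
    return float(np.max(np.abs(np.linalg.eigvals(T))))

A = np.array([[4, 1, 1], [1, 3, 1], [1, 1, 5]])
print(is_strictly_diagonally_dominant(A))   # True
print(jacobi_spectral_radius(A))            # below 1, so Jacobi converges
```

The spectral radius also indicates speed: the error shrinks by roughly that factor per iteration, so values close to 1 mean slow convergence.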

---

## 10. References
- Golub, G. H., & Van Loan, C. F. (2013). *Matrix Computations*.
- Saad, Y. (2003). *Iterative Methods for Sparse Linear Systems*.

**Note:** Always validate results with direct methods for critical applications.

# 10. Cramer's Rule: Matrix Solving Method

## 1. Definition
Cramer's Rule is a method for solving systems of linear equations using determinants. It expresses the solution in terms of the ratio of determinants of matrices derived from the coefficient matrix and the constant terms.

---

## 2. Explanation
For a system of linear equations with n variables, Cramer's Rule states that:
- If the coefficient matrix has a non-zero determinant, the system has a unique solution
- Each variable can be found by replacing the corresponding column in the coefficient matrix with the column of constants and computing the ratio of determinants

The formula for each variable is:
```
xᵢ = Det(Aᵢ)/Det(A)
```
where:
- A is the coefficient matrix
- Aᵢ is the matrix formed by replacing the ith column of A with the constants column
- Det() represents the determinant

---

## 3. Example: Solving a 3×3 System

Consider the system:
```
2x + y + z = 8
x - 3y + z = -2
4x + y - 2z = 4
```

### Step 1: Write in matrix form
```
A = [[2,  1,  1],
     [1, -3,  1],
     [4,  1, -2]]

b = [8, -2, 4]
```

### Step 2: Calculate Det(A)
```
Det(A) = 2·(-3)·(-2) + 1·1·4 + 1·1·1 - 1·(-3)·4 - 2·1·1 - 1·1·(-2)
       = 12 + 4 + 1 - (-12) - 2 - (-2)
       = 12 + 4 + 1 + 12 - 2 + 2
       = 29
```

### Step 3: Calculate Det(A₁)
```
A₁ = [[ 8,  1,  1],
      [-2, -3,  1],
      [ 4,  1, -2]]

Det(A₁) = 8·(-3)·(-2) + 1·1·4 + 1·(-2)·1 - 1·(-3)·4 - 8·1·1 - 1·(-2)·(-2)
        = 48 + 4 - 2 - (-12) - 8 - 4
        = 48 + 4 - 2 + 12 - 8 - 4
        = 50
```

### Step 4: Calculate Det(A₂)
```
A₂ = [[2,  8,  1],
      [1, -2,  1],
      [4,  4, -2]]

Det(A₂) = 2·(-2)·(-2) + 8·1·4 + 1·1·4 - 1·(-2)·4 - 2·1·4 - 8·1·(-2)
        = 8 + 32 + 4 - (-8) - 8 - (-16)
        = 8 + 32 + 4 + 8 - 8 + 16
        = 60
```

### Step 5: Calculate Det(A₃)
```
A₃ = [[2,  1,  8],
      [1, -3, -2],
      [4,  1,  4]]

Det(A₃) = 2·(-3)·4 + 1·(-2)·4 + 8·1·1 - 8·(-3)·4 - 2·(-2)·1 - 1·1·4
        = -24 - 8 + 8 - (-96) - (-4) - 4
        = -24 - 8 + 8 + 96 + 4 - 4
        = 72
```

### Step 6: Calculate the solutions
```
x = Det(A₁)/Det(A) = 50/29 ≈ 1.72
y = Det(A₂)/Det(A) = 60/29 ≈ 2.07
z = Det(A₃)/Det(A) = 72/29 ≈ 2.48
```

Therefore, x ≈ 1.72, y ≈ 2.07, z ≈ 2.48.

Check with the first equation: 2(50/29) + 60/29 + 72/29 = 232/29 = 8 ✓

---

## 4. Use Cases in Data Science
1. **Solving regression equations** when working with small to medium datasets
2. **Feature extraction** when solving systems of equations in statistical models
3. **Computer graphics** for transformations and calculations
4. **Network analysis** when solving flow equations
5. **Finance models** for portfolio optimization equations

---

## 5. Benefits
1. **Direct formula**: Provides an explicit formula for each variable
2. **Theoretical value**: Helps understand the structure of linear systems
3. **No need for elimination steps**: Unlike Gaussian elimination, variables are computed directly
4. **Good for small systems**: Very clear and straightforward for 2×2 and 3×3 systems

---

## 6. Limitations
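1. **Computational complexity**: Evaluating the n + 1 determinants by cofactor expansion takes on the order of n·n! operations; even with LU-based determinants the cost is O(n⁴), versus O(n³) for Gaussian elimination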
2. **Numerical instability**: Can suffer from round-off errors
3. **Cannot handle singular matrices**: If Det(A) = 0, Cramer's Rule cannot be applied
4. **Not suitable for sparse matrices**: Doesn't take advantage of sparsity

---

## 7. When to Use Cramer's Rule vs. Other Methods
- **Use Cramer's Rule when**:
  - Working with small systems (2×2, 3×3)
  - Teaching or explaining linear systems
  - Theoretical proofs
  - Checking solutions obtained by other methods

- **Use other methods when**:
  - Working with large systems (use Gaussian elimination)
  - Dealing with sparse matrices (use specialized sparse solvers)
  - Need higher numerical stability (use LU decomposition)
  - Iterative solutions required (use Jacobi or Gauss-Seidel)

---

## 8. Python Implementation



```python
import numpy as np

def cramers_rule(A, b):
    """
    Solve a system of linear equations using Cramer's Rule.
    
    Parameters:
    A (numpy.ndarray): Coefficient matrix
    b (numpy.ndarray): Constants vector
    
    Returns:
    numpy.ndarray: Solution vector
    """
    # Work in floating point so a non-integer b is not truncated when
    # it is written into a column of an integer matrix
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    
    # Get the number of variables/equations
    n = len(b)
    
    # Calculate the determinant of the coefficient matrix
    det_A = np.linalg.det(A)
    
    # Check if the determinant is zero
    if abs(det_A) < 1e-10:
        raise ValueError("The coefficient matrix is singular, Cramer's Rule cannot be applied.")
    
    # Initialize solution vector
    x = np.zeros(n)
    
    # Apply Cramer's Rule for each variable
    for i in range(n):
        # Create a copy of A
        A_i = A.copy()
        # Replace the i-th column with the constants
        A_i[:, i] = b
        # Calculate the determinant
        det_A_i = np.linalg.det(A_i)
        # Calculate the value of the i-th variable
        x[i] = det_A_i / det_A
    
    return x

# Example usage
A = np.array([[2, 1, 1], 
              [1, -3, 1], 
              [4, 1, -2]])
b = np.array([8, -2, 4])

try:
    solution = cramers_rule(A, b)
    print("Solution using Cramer's Rule:")
    print(f"x = {solution[0]}")
    print(f"y = {solution[1]}")
    print(f"z = {solution[2]}")
    
    # Verify the solution
    print("\nVerification:")
    print(f"Ax = {np.dot(A, solution)}")
    print(f"b = {b}")
    
except ValueError as e:
    print(f"Error: {e}")
```

```
Solution using Cramer's Rule:
x = 1.7241379310344815
y = 2.0689655172413786
z = 2.482758620689655

Verification:
Ax = [ 8. -2.  4.]
b = [ 8 -2  4]
```


---

## 9. Tips for Using Cramer's Rule

1. **Check the determinant first**: If Det(A) is zero or very close to zero, don't use Cramer's Rule.

2. **Use it as a teaching tool**: It's excellent for understanding the relationship between determinants and linear systems.

3. **Consider computational efficiency**: For systems larger than 3×3, other methods are typically more efficient.

4. **Symbolic computation**: Cramer's Rule works well with symbolic math libraries when exact solutions are needed.

5. **Double-check your work**: Verify solutions by substituting back into the original equations.

6. **Benchmark against other methods**: Compare solutions with NumPy's built-in solver for accuracy validation.

7. **Don't reinvent the wheel**: For practical applications in data science, use optimized libraries like NumPy's `linalg.solve()` instead of implementing Cramer's Rule manually.

8. **Remember its applications**: Though not always computationally efficient, Cramer's Rule has important theoretical applications in linear algebra, differential equations, and circuit analysis.
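
Tip 4 can be illustrated without a full symbolic library by using Python's built-in `fractions.Fraction` for exact rational arithmetic. This is a sketch limited to 3×3 systems; the helper names `det3` and `cramer_3x3_exact` are hypothetical.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cramer_3x3_exact(A, b):
    """Solve a 3x3 system with Cramer's Rule using exact fractions."""
    A = [[Fraction(v) for v in row] for row in A]
    b = [Fraction(v) for v in b]
    det_A = det3(A)
    if det_A == 0:
        raise ValueError("Det(A) = 0: Cramer's Rule does not apply")
    solution = []
    for col in range(3):
        # Replace column `col` of A with the constants vector b
        A_col = [row[:col] + [b[r]] + row[col + 1:] for r, row in enumerate(A)]
        solution.append(det3(A_col) / det_A)
    return solution

x, y, z = cramer_3x3_exact([[2, 1, 1], [1, -3, 1], [4, 1, -2]], [8, -2, 4])
print(x, y, z)  # 50/29 60/29 72/29
```

Exact fractions avoid round-off entirely, which makes this form convenient for checking hand calculations on small systems.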

---