# **Problem Statement**  
## **1. Implement linear regression from scratch using NumPy.**

Implement linear regression from scratch using NumPy, without any machine learning libraries (such as scikit-learn).

Given a dataset with input features X and target values y, the goal is to learn the weights (w) and bias (b) such that the predicted values

$$\hat{y} = Xw + b$$

minimize the Mean Squared Error (MSE).

### Constraints & Example Inputs/Outputs

- Input features are numeric.
- Dataset size: small to medium.
- No external ML libraries allowed (NumPy only).
- Gradient Descent used for optimization.

**Example Input**
```python
X = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
```

**Expected Output**
```python
Weight ≈ 2
Bias ≈ 0
Predictions ≈ [2, 4, 6, 8, 10]
```

### Solution Approach

#### 1. Model 

**Linear Regression Model:**

$$\hat{y} = wx + b$$

#### 2. Loss Function

**Mean Squared Error:**

$$J(w, b) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
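As a small illustration of the loss, the sketch below computes the MSE for the example data under two candidate parameter settings. The helper name `mean_squared_error` is an assumption made for this example and is not part of the original solution.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared residuals (y - y_hat)^2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 6, 8, 10], dtype=float)

# Perfect fit (w=2, b=0): every residual is zero
print(mean_squared_error(y, 2.0 * x + 0.0))  # 0.0

# Underfit (w=1, b=0): residuals are 1..5, so the loss is 55 / 5
print(mean_squared_error(y, 1.0 * x + 0.0))  # 11.0
```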

#### 3. Optimization

Use **Gradient Descent:**
- Update the weights and bias iteratively
- At each step, move the parameters against the gradient of the loss, scaled by the learning rate

#### 4. Gradients 

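$$\frac{\partial J}{\partial w} = -\frac{2}{n} \sum_{i=1}^{n} x_i (y_i - \hat{y}_i)$$

$$\frac{\partial J}{\partial b} = -\frac{2}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)$$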
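One way to sanity-check these formulas is to compare them with central finite differences of the loss, then apply a single update by hand. The sketch below does this on the example data, starting from w = 0 and b = 0. The helper names `mse`, `analytic_gradients`, and `numerical_gradients` are assumptions made for this example.

```python
import numpy as np

def mse(w, b, x, y):
    """Mean squared error of the line w*x + b."""
    return np.mean((y - (w * x + b)) ** 2)

def analytic_gradients(w, b, x, y):
    """Gradients of the MSE from the closed-form expressions."""
    n = len(y)
    residual = y - (w * x + b)
    dw = (-2.0 / n) * np.sum(x * residual)
    db = (-2.0 / n) * np.sum(residual)
    return dw, db

def numerical_gradients(w, b, x, y, eps=1e-6):
    """Central finite-difference approximation of the same gradients."""
    dw = (mse(w + eps, b, x, y) - mse(w - eps, b, x, y)) / (2 * eps)
    db = (mse(w, b + eps, x, y) - mse(w, b - eps, x, y)) / (2 * eps)
    return dw, db

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 6, 8, 10], dtype=float)

# At w=0, b=0 the residuals equal y, so dw = -2/5 * 110 and db = -2/5 * 30
dw, db = analytic_gradients(0.0, 0.0, x, y)
print(dw, db)  # about -44.0 and -12.0
print(numerical_gradients(0.0, 0.0, x, y))  # should agree closely

# One gradient descent step with learning rate 0.01
lr = 0.01
w_new = 0.0 - lr * dw  # about 0.44
b_new = 0.0 - lr * db  # about 0.12
```

Both gradients are negative at the starting point, so the update increases w and b, which is the direction of the true solution (w = 2, b = 0).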

### Solution Code

```python
# Approach 1: Brute Force Approach (Normal Equation)
# Closed-form solution, no iteration
import numpy as np

def linear_regression_bruteforce(X, y):
    X = np.array(X, dtype=float)
    y = np.array(y, dtype=float)

    # Prepend a column of ones so the bias is learned as theta[0]
    X_b = np.c_[np.ones(len(X)), X]

    # Normal Equation: theta = (X_b^T X_b)^(-1) X_b^T y
    # np.linalg.inv raises LinAlgError if X_b^T X_b is exactly singular
    theta = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y

    bias = theta[0]
    weights = theta[1:]

    return weights, bias
```


### Alternative Solution

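```python
# Approach 2: Optimized Approach (Gradient Descent)
def linear_regression_gradient_descent(X, y, lr=0.01, epochs=1000):
    # Single-feature model: X is a 1-D sequence, w and b are scalars
    X = np.array(X, dtype=float)
    y = np.array(y, dtype=float)
    n = len(y)

    w = 0.0
    b = 0.0

    for _ in range(epochs):
        y_pred = w * X + b

        # Gradients of the MSE with respect to w and b
        dw = (-2/n) * np.sum(X * (y - y_pred))
        db = (-2/n) * np.sum(y - y_pred)

        # Step against the gradient
        w -= lr * dw
        b -= lr * db

    return w, b
```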
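The gradient descent function keeps `w` as a scalar, so it covers the single-feature case, while the problem statement is written for a feature matrix X. A possible extension to several features is sketched below. The function name `linear_regression_gd_multi` is an assumption made for this example, and the default learning rate may need tuning when features have larger scales.

```python
import numpy as np

def linear_regression_gd_multi(X, y, lr=0.01, epochs=1000):
    """Batch gradient descent for y_hat = X @ w + b with d features."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        # Treat a flat sequence as a single feature column
        X = X.reshape(-1, 1)
    y = np.asarray(y, dtype=float)
    n, d = X.shape

    w = np.zeros(d)
    b = 0.0

    for _ in range(epochs):
        residual = y - (X @ w + b)

        # Same gradients as the scalar case, with the sum over x_i
        # replaced by a matrix-vector product
        dw = (-2.0 / n) * (X.T @ residual)
        db = (-2.0 / n) * np.sum(residual)

        w -= lr * dw
        b -= lr * db

    return w, b

# On the single-feature example this should land near w = [2.], b = 0
w_multi, b_multi = linear_regression_gd_multi([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print("w:", w_multi, "b:", b_multi)
```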

### Alternative Approaches

#### 1. Normal Equation (Brute Force)
- No iterations required
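- Time: O(nd² + d³) for n samples and d features (forming XᵀX, then inverting it)
- Fails if XᵀX is non-invertible, for example when features are duplicated or linearly dependent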
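When the matrix is singular, a least-squares solver can still return a usable fit. The sketch below swaps the explicit inverse for `np.linalg.lstsq`, which returns the minimum-norm solution for rank-deficient inputs. This is an alternative technique, not the approach used in the solution code, and the function name `linear_regression_lstsq` is an assumption made for this example.

```python
import numpy as np

def linear_regression_lstsq(X, y):
    """Closed-form fit via least squares instead of an explicit inverse."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X.reshape(-1, 1)
    y = np.asarray(y, dtype=float)

    # Prepend a column of ones so the bias is learned as theta[0]
    X_b = np.hstack([np.ones((X.shape[0], 1)), X])

    # lstsq returns (solution, residuals, rank, singular values)
    theta, *_ = np.linalg.lstsq(X_b, y, rcond=None)
    return theta[1:], theta[0]

# The second feature duplicates the first, so X_b.T @ X_b is singular
X_dup = np.array([[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]], dtype=float)
y = np.array([2, 4, 6, 8, 10], dtype=float)

weights, bias = linear_regression_lstsq(X_dup, y)
print("weights:", weights, "bias:", bias)
print("predictions:", X_dup @ weights + bias)
```

Because the two columns are identical, only the sum of the two weights is determined by the data. The solver splits it across both columns, and the predictions still match the targets.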

#### 2. Gradient Descent (Optimized)
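- Scales to large datasets, since each epoch is a single pass over the data
- Time: O(n × d × epochs), which is O(n × epochs) for a single feature
- Preferred when the number of features is large or XᵀX is ill-conditioned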

#### 3. Stochastic / Mini-Batch Gradient Descent
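- Each update uses one sample or a small batch, so the parameters start improving before a full pass over the data
- The usual choice for training on large datasets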
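A minimal mini-batch variant of the single-feature solution might look like the sketch below. The function name `linear_regression_minibatch` and the `batch_size` and `seed` parameters are assumptions made for this example. Setting `batch_size=1` gives stochastic gradient descent.

```python
import numpy as np

def linear_regression_minibatch(x, y, lr=0.01, epochs=1000, batch_size=2, seed=0):
    """Mini-batch gradient descent for the single-feature model w*x + b."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    rng = np.random.default_rng(seed)

    w, b = 0.0, 0.0

    for _ in range(epochs):
        # Visit the samples in a fresh random order each epoch
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            xb, yb = x[idx], y[idx]

            # Gradients computed on the current batch only
            residual = yb - (w * xb + b)
            dw = (-2.0 / len(idx)) * np.sum(xb * residual)
            db = (-2.0 / len(idx)) * np.sum(residual)

            w -= lr * dw
            b -= lr * db

    return w, b

w_mb, b_mb = linear_regression_minibatch([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print("w:", w_mb, "b:", b_mb)
```

With five samples and a batch size of two, each epoch performs three updates instead of one, which is where the extra progress per pass comes from.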

### Test Case

```python
# Test Case 1: Perfect Linear Data

X = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

w_bf, b_bf = linear_regression_bruteforce(X, y)
w_gd, b_gd = linear_regression_gradient_descent(X, y)

print("Brute Force -> w:", w_bf, "b:", b_bf)
print("Gradient Descent -> w:", w_gd, "b:", b_gd)
```

```text
Brute Force -> w: [2.] b: 5.662137425588298e-15
Gradient Descent -> w: 1.9951803506719779 b: 0.017400463340610635
```


```python
# Test Case 2: With Noise

X = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

w, b = linear_regression_gradient_descent(X, y, lr=0.01, epochs=2000)
print("w:", w, "b:", b)
```

```text
w: 1.9898525247159426 b: 0.05053243256910116
```


```python
# Test Case 3: Predictions Check
# Reuses w and b fitted in Test Case 2

X_test = np.array([6, 7, 8])
y_pred = w * X_test + b
print("Predictions:", y_pred)
```

```text
Predictions: [11.98964758 13.97950011 15.96935263]
```


### Complexity Analysis

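#### Normal Equation
- Time: O(nd² + d³), where n is the number of samples and d the number of features
- Space: O(nd + d²)

#### Gradient Descent
- Time: O(n × d × epochs), which is O(n × epochs) for a single feature
- Space: O(n) for the predictions and residuals, plus O(d) for the parameters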

#### Thank You!!