# Linear Regression — Introduction
Author: ChatGPT

This notebook provides a basic introduction to simple linear regression, including model formulation, cost function, gradient descent, and visualization.

## 1. Model Formulation
We consider the simple linear regression model:
$$h_\theta(x) = \theta_0 + \theta_1 x$$
- $\theta_0$ is the intercept.
- $\theta_1$ is the slope (weight).

Given $m$ training examples $(x^{(i)}, y^{(i)})$, $i = 1, \dots, m$, we seek the parameters $\theta = (\theta_0, \theta_1)$ that minimize the cost function $J(\theta)$ defined in Section 4.

## 2. Imports and Data Setup

In [None]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

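# Example dataset (synthetic)
np.random.seed(0)  # fix the seed so the noise, and every result below, is reproducible
X = np.linspace(0, 10, 50)
# True parameters: [intercept, slope]
theta_true = [2.5, 1.3]
Y = theta_true[0] + theta_true[1] * X + np.random.normal(scale=1.0, size=X.shape)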

# Visualize data
plt.scatter(X, Y, label='Training Data')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.title('Synthetic Training Data')


## 3. Hypothesis (Prediction)

In [None]:
def predict(X, theta):
    """Compute predictions for input X and parameters theta."""
    return theta[0] + theta[1] * X

# Test prediction: 2.5 + 1.3 * x at x = 0, 5, 10
print(predict(np.array([0, 5, 10]), theta_true))  # expected values: 2.5, 9.0, 15.5


## 4. Cost Function
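We use half the mean squared error as the cost, where $m$ is the number of training examples. The factor $\frac{1}{2}$ cancels the 2 that appears when the square is differentiated: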
$$J(\theta) = \frac{1}{2m} \sum_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)})^2$$
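
To make the formula concrete, here is a small worked example on a three-point dataset that can be checked by hand. The arrays and the names `x_small`, `y_small`, `theta_small`, and `J_small` are illustrative assumptions and are not part of the notebook's data.

```python
import numpy as np

# Tiny hand-checkable dataset (hypothetical values, not the notebook's data)
x_small = np.array([1.0, 2.0, 3.0])
y_small = np.array([2.0, 4.0, 6.0])
theta_small = [0.0, 1.0]  # intercept 0, slope 1

# Predictions h_theta(x) = theta_0 + theta_1 * x  ->  1, 2, 3
h = theta_small[0] + theta_small[1] * x_small

# Residuals h - y -> -1, -2, -3; squared -> 1, 4, 9; sum -> 14
residuals = h - y_small
m_small = len(y_small)

# J = 14 / (2 * 3) = 14 / 6, about 2.33
J_small = np.sum(residuals ** 2) / (2 * m_small)
print(J_small)
```

A perfect fit (here `theta_small = [0.0, 2.0]`) would give a cost of exactly zero, which is a quick sanity check for any implementation of $J$.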

In [None]:
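def compute_cost(X, Y, theta):
    """Return J(theta): the sum of squared prediction errors divided by 2m."""
    m = len(Y)  # number of training examples
    predictions = predict(X, theta)
    sq_errors = (predictions - Y) ** 2
    return (1 / (2 * m)) * np.sum(sq_errors)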

# Compute initial cost
theta_test = [0.0, 0.0]
print("Initial cost:", compute_cost(X, Y, theta_test))


## 5. Gradient Descent
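Starting from an initial guess, we repeatedly update both parameters, with the learning rate $\alpha$ setting the step size:
$$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)}) x^{(i)}_j$$
where $x_0^{(i)} = 1$ (for the intercept, $j=0$) and $x_1^{(i)} = x^{(i)}$ (for the slope, $j=1$). The two parameters are updated simultaneously: both sums are evaluated with the current $\theta$ before either parameter changes.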
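
The sum in the update rule is the partial derivative of the cost $J(\theta)$ from Section 4. As a short derivation, using the convention $x_0^{(i)} = 1$ and $x_1^{(i)} = x^{(i)}$, the chain rule gives:

$$\frac{\partial J}{\partial \theta_j} = \frac{1}{2m} \sum_{i=1}^m 2\,(h_\theta(x^{(i)}) - y^{(i)})\,\frac{\partial h_\theta(x^{(i)})}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)})\,x_j^{(i)}$$

because $\frac{\partial h_\theta(x)}{\partial \theta_0} = 1$ and $\frac{\partial h_\theta(x)}{\partial \theta_1} = x$. Written out for the two parameters:

$$\frac{\partial J}{\partial \theta_0} = \frac{1}{m} \sum_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)}), \qquad \frac{\partial J}{\partial \theta_1} = \frac{1}{m} \sum_{i=1}^m (h_\theta(x^{(i)}) - y^{(i)})\,x^{(i)}$$

Each update therefore moves $\theta_j$ a distance $\alpha \, \partial J / \partial \theta_j$ in the direction that decreases the cost.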

In [None]:
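def gradient_descent(X, Y, theta, alpha, num_iters):
    """Run batch gradient descent; return the learned theta and the cost after each step."""
    m = len(Y)
    theta = np.array(theta, dtype=float)  # work on a copy so the caller's initial guess is not mutated
    J_history = []
    for _ in range(num_iters):
        predictions = predict(X, theta)
        error = predictions - Y
        # Compute gradients
        grad0 = (1/m) * np.sum(error)
        grad1 = (1/m) * np.dot(error, X)
        # Update both parameters using gradients from the same predictions
        theta[0] -= alpha * grad0
        theta[1] -= alpha * grad1
        J_history.append(compute_cost(X, Y, theta))
    return theta, J_history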

# Run gradient descent
theta_init = [0.0, 0.0]
alpha = 0.01
iters = 1000
theta_learned, J_hist = gradient_descent(X, Y, theta_init, alpha, iters)
print("Learned theta:", theta_learned)
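
One way to judge the learned parameters is to compare them with the closed-form least-squares fit, which minimizes the same cost exactly. The sketch below is self-contained: it regenerates similar synthetic data with a fixed seed (an assumption made here for reproducibility) and uses `np.polyfit` as the reference. This is a swapped-in check rather than part of the notebook's method, and the names `X_demo`, `Y_demo`, `slope_ls`, and `intercept_ls` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, assumed here for reproducibility
X_demo = np.linspace(0, 10, 50)
Y_demo = 2.5 + 1.3 * X_demo + rng.normal(scale=1.0, size=X_demo.shape)

# Closed-form least-squares reference; for deg=1, np.polyfit returns [slope, intercept]
slope_ls, intercept_ls = np.polyfit(X_demo, Y_demo, deg=1)

# The same batch gradient descent as in this section, written inline
theta0, theta1 = 0.0, 0.0
alpha_demo, m_demo = 0.01, len(Y_demo)
for _ in range(1000):
    error = theta0 + theta1 * X_demo - Y_demo
    grad0 = error.sum() / m_demo
    grad1 = np.dot(error, X_demo) / m_demo
    # Both gradients were computed before either parameter is changed
    theta0 -= alpha_demo * grad0
    theta1 -= alpha_demo * grad1

print("gradient descent (intercept, slope):", theta0, theta1)
print("least squares    (intercept, slope):", intercept_ls, slope_ls)
```

With `alpha = 0.01` and 1000 iterations, the slope is usually close to the reference while the intercept may still lag somewhat, because the cost surface is much flatter along one direction than the other. Running more iterations should narrow the gap.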


## 6. Cost Function Convergence

In [None]:
plt.plot(range(len(J_hist)), J_hist, '-b')
plt.xlabel('Iteration')
plt.ylabel('Cost J')
plt.title('Convergence of Gradient Descent')
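
The shape of the convergence curve depends strongly on the learning rate. The sketch below compares three step sizes on noise-free data (an assumption that keeps the demo deterministic). The helper `final_cost` and the names `X_demo` and `Y_demo` are hypothetical and exist only for this illustration.

```python
import numpy as np

X_demo = np.linspace(0, 10, 50)
Y_demo = 2.5 + 1.3 * X_demo  # noise-free targets, assumed to keep the demo deterministic

def final_cost(alpha, num_iters=50):
    """Run num_iters gradient descent steps from theta = (0, 0) and return the final cost J."""
    theta0, theta1 = 0.0, 0.0
    m = len(Y_demo)
    for _ in range(num_iters):
        error = theta0 + theta1 * X_demo - Y_demo
        grad0 = error.sum() / m
        grad1 = np.dot(error, X_demo) / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    error = theta0 + theta1 * X_demo - Y_demo
    return np.sum(error ** 2) / (2 * m)

for alpha in (0.001, 0.01, 0.1):
    print(f"alpha={alpha}: cost after 50 steps = {final_cost(alpha):.4g}")
```

On this data the smallest step size makes slow progress, `alpha = 0.01` lowers the cost much faster, and `alpha = 0.1` overshoots on every step so the cost grows instead of shrinking. A cost curve that rises is a sign that the learning rate should be reduced.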


## 7. Visualizing the Regression Line

In [None]:
plt.scatter(X, Y, label='Data')
plt.plot(X, predict(X, theta_learned), 'r-', label='Linear fit')
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.title('Linear Regression Fit')


## 8. Conclusion
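We have implemented simple linear regression from scratch:
- Defined the model and the cost function
- Minimized the cost with batch gradient descent
- Visualized the data, the convergence of the cost, and the fitted line

This foundational example extends naturally to multivariate regression, where $\theta$ and each $x^{(i)}$ become vectors and the same update rule applies to every component $\theta_j$.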
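
As a pointer toward the multivariate case, here is a minimal sketch of the same batch gradient descent written in vectorized form. It assumes a design matrix whose first column is all ones, so that `theta[0]` acts as the intercept. The function `gradient_descent_multi`, the two-feature dataset, and the coefficients in `theta_known` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def gradient_descent_multi(X_design, y, alpha=0.1, num_iters=2000):
    """Batch gradient descent for the model h_theta(x) = X_design @ theta.

    X_design is an (m, n + 1) matrix whose first column is all ones,
    so theta[0] plays the role of the intercept.
    """
    m, n_params = X_design.shape
    theta = np.zeros(n_params)
    for _ in range(num_iters):
        error = X_design @ theta - y       # prediction errors, shape (m,)
        gradient = X_design.T @ error / m  # all partial derivatives at once
        theta = theta - alpha * gradient   # simultaneous update of every theta_j
    return theta

# Hypothetical two-feature dataset with known coefficients and no noise
rng = np.random.default_rng(0)
features = rng.uniform(-1, 1, size=(200, 2))
X_design = np.column_stack([np.ones(len(features)), features])
theta_known = np.array([1.0, 2.0, -3.0])
y = X_design @ theta_known

theta_fit = gradient_descent_multi(X_design, y)
print(theta_fit)  # should be very close to theta_known
```

The matrix product `X_design.T @ error` computes the same per-parameter sums as `grad0` and `grad1` in Section 5, just for all parameters in one step.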