# **IS-4100: Linear Regression from Scratch**

### **Objective**
In this assignment, you will implement a simple linear regression model from scratch using NumPy. You will compute the slope ($m$) and intercept ($b$) of the best-fit line using the least squares method and evaluate your implementation with test data.

---

### **Background**
Linear regression is used to find the relationship between a dependent variable ($y$) and an independent variable ($x$). The formula for the regression line is:

$$
y = mx + b
$$

Where:
- **$m$ (slope)** is calculated as:
$$
m = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}
$$

- **$b$ (intercept)** is calculated as:
$$
b = \bar{y} - m \bar{x}
$$
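
To make the two formulas concrete, here is a minimal sketch that applies them to a tiny toy dataset. The data and the names `x_toy`, `y_toy`, `m_toy`, and `b_toy` are illustrative assumptions, not part of the assignment. The points lie exactly on $y = 2x + 1$, so the expected slope and intercept are easy to check by hand.

```python
import numpy as np

# Toy data for illustration only (not the assignment dataset).
# The points lie exactly on the line y = 2x + 1.
x_toy = np.array([0.0, 1.0, 2.0, 3.0])
y_toy = np.array([1.0, 3.0, 5.0, 7.0])

x_mean = x_toy.mean()  # 1.5
y_mean = y_toy.mean()  # 4.0

# Slope: sum of products of deviations over sum of squared x deviations.
m_toy = np.sum((x_toy - x_mean) * (y_toy - y_mean)) / np.sum((x_toy - x_mean) ** 2)

# Intercept: the fitted line passes through the point (x_mean, y_mean).
b_toy = y_mean - m_toy * x_mean

print(m_toy, b_toy)  # 2.0 1.0
```

Working with deviations from the mean keeps the code a direct transcription of the formulas, which makes it easier to check term by term.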

The model predicts values of the dependent variable $y$ from the independent variable $x$. The **goodness of fit** can be measured with the coefficient of determination, $R^2$, which is the fraction of the variability in $y$ that the model explains.
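
As a rough illustration of what $R^2$ measures, the sketch below assumes the usual definition (one minus the ratio of the residual sum of squares to the total sum of squares) and evaluates it for two extreme cases. The helper `r2_sketch` and the toy arrays are hypothetical names introduced only for this example.

```python
import numpy as np

def r2_sketch(y_true, y_pred):
    """Illustrative R^2: 1 - (residual sum of squares / total sum of squares)."""
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    return 1.0 - ss_res / ss_tot

# Toy observations for illustration only.
y_obs = np.array([1.0, 2.0, 3.0, 4.0])

# Case 1: predictions match the observations exactly.
y_perfect = y_obs.copy()

# Case 2: predictions ignore x and always return the mean of y.
y_baseline = np.full_like(y_obs, y_obs.mean())

print(r2_sketch(y_obs, y_perfect))   # 1.0
print(r2_sketch(y_obs, y_baseline))  # 0.0
```

A value near 1 means the predictions track the data closely, while a value near 0 means the model does no better than always predicting the mean.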

---

### **Task**

1. **Implement the Linear Regression Formula**
   - Write a function `linear_regression(x, y)` that takes two NumPy arrays, $x$ (independent variable) and $y$ (dependent variable), as inputs.
   - Compute the slope ($m$) and intercept ($b$) using the least squares method.
   - Return the slope and intercept as a tuple.

2. **Predict Using the Model**
   - Write a function `predict(x, m, b)` that takes $x$, $m$, and $b$ as inputs and returns the predicted $y$ values using the regression line formula.

3. **Evaluate the Model**
   - Write a function `r_squared(y_true, y_pred)` to compute the $R^2$ value, which measures how well the model fits the data:
   $$
   R^2 = 1 - \frac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2}
   $$

4. **Test Your Implementation**
   - Use the provided dataset to test your functions.
   - Compare your slope, intercept, and $R^2$ results with those produced by `sklearn.linear_model.LinearRegression`.
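
For the comparison step, one possible way to obtain reference values from scikit-learn is sketched below. It assumes scikit-learn is installed and uses a toy dataset with hypothetical names (`x_demo`, `y_demo`, `reference`). Note that `LinearRegression.fit` expects a 2-D feature array, so the 1-D `x` array is reshaped into a single column first.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data for illustration only: points on the line y = 2x + 1.
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = np.array([1.0, 3.0, 5.0, 7.0])

# scikit-learn expects features with shape (n_samples, n_features).
X_demo = x_demo.reshape(-1, 1)

reference = LinearRegression().fit(X_demo, y_demo)

ref_slope = reference.coef_[0]            # slope for the single feature
ref_intercept = reference.intercept_      # intercept term
ref_r2 = reference.score(X_demo, y_demo)  # R^2 on the same data

print(ref_slope, ref_intercept, ref_r2)
```

When comparing against your own results, a tolerance-based check such as `np.isclose(your_value, ref_slope)` is usually safer than `==`, since the two implementations can differ by floating-point rounding.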

---

### **Dataset**

You can use the following dataset to test your implementation:

```python
import numpy as np

# Independent variable (e.g., hours studied)
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Dependent variable (e.g., test scores)
y = np.array([50, 55, 61, 66, 70, 74, 79, 83, 88, 92])
