# Gradient Descent for Simple Linear Regression

In this notebook, we'll implement simple linear regression using the Gradient Descent (GD) algorithm on a dataset of study hours and exam scores.

## Dataset

Here's the dataset we'll be using:

| Hours of Study (x) | Exam Score (y) |
|--------------------|----------------|
| 1                  | 2              |
| 2                  | 4              |
| 3                  | 6              |
| 4                  | 8              |
| 5                  | 10             |
| 6                  | 12             |
| 7                  | 14             |
| 8                  | 15             |
| 9                  | 16             |
| 10                 | 20             |

## Steps

1. **Load the dataset**: Load the data into a pandas DataFrame.
2. **Split the dataset**: Divide the dataset into three parts:
   - Training set
   - Validation set
   - Testing set
3. **Implement the Gradient Descent (GD) algorithm**: Run GD on the training set to estimate the intercept and slope of the linear regression model.
4. **Evaluate the model on the validation set**: After training, measure the model's error on the validation set and adjust the hyperparameters (learning rate, number of iterations) if necessary.
5. **Find the final value of the parameters**: After the validation phase, fix the final intercept and slope.
6. **Test your model on the testing data**: Evaluate the model's performance on the testing set to see how well it generalizes to new, unseen data.
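
For reference, the update that step 3 relies on can be written out as follows. This is a sketch that assumes the half mean squared error cost, which matches the implementation later in this notebook; $m$ is the number of training rows and $\alpha$ is the learning rate.

```latex
J(\beta_0, \beta_1) = \frac{1}{2m} \sum_{i=1}^{m} \bigl(y_i - (\beta_0 + \beta_1 x_i)\bigr)^2

\frac{\partial J}{\partial \beta_0} = -\frac{1}{m} \sum_{i=1}^{m} \bigl(y_i - (\beta_0 + \beta_1 x_i)\bigr)

\frac{\partial J}{\partial \beta_1} = -\frac{1}{m} \sum_{i=1}^{m} \bigl(y_i - (\beta_0 + \beta_1 x_i)\bigr)\, x_i

\beta_0 \leftarrow \beta_0 - \alpha \frac{\partial J}{\partial \beta_0}, \qquad
\beta_1 \leftarrow \beta_1 - \alpha \frac{\partial J}{\partial \beta_1}
```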

### Validation Phase

During the validation phase, evaluate the model on the validation set and, if the error is high or training is unstable, adjust the learning rate or the number of gradient descent iterations.
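
A minimal sketch of how candidate settings can be compared by validation error. The helper name `mean_squared_error` and the two candidate parameter pairs are hypothetical and chosen only for illustration; they are not values fitted in this notebook.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between actual and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Validation rows from the dataset (x = 7 and x = 8).
X_val = [7, 8]
y_val = [14, 15]

# Hypothetical (beta_0, beta_1) pairs standing in for models trained with
# different learning rates; these are NOT fitted values.
candidates = {"setting A": (0.0, 2.0), "setting B": (1.0, 1.5)}

# Score each candidate on the validation set.
scores = {
    name: mean_squared_error(y_val, [b0 + b1 * x for x in X_val])
    for name, (b0, b1) in candidates.items()
}

# Keep the setting with the lowest validation error.
best = min(scores, key=scores.get)
print(scores, best)
```

The setting with the smallest validation error is kept; the test set is not consulted at this stage.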

### Testing Phase

In the testing phase, use the finalized parameters to predict exam scores from hours of study, then compare the predictions with the actual scores to assess accuracy.

## Conclusion

This exercise will help you understand the basics of linear regression and how gradient descent optimizes model parameters.


In [1]:
dictionary = {'x': [1,2,3,4,5,6,7,8,9,10], 'y': [2,4,6,8,10,12,14,15,16,20]}

In [2]:
import pandas as pd
df = pd.DataFrame(dictionary)
df.head()

|   | x | y  |
|---|---|----|
| 0 | 1 | 2  |
| 1 | 2 | 4  |
| 2 | 3 | 6  |
| 3 | 4 | 8  |
| 4 | 5 | 10 |


In [3]:
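# Split the 10 rows in order: 6 for training, 2 for validation, 2 for testing.
# The rows are not shuffled, so validation and test hold only the largest x values.
X_train = df['x'][0:6].tolist()
y_train = df['y'][0:6].tolist()
X_val = df['x'][6:8].tolist()
y_val = df['y'][6:8].tolist()
X_test = df['x'][8:10].tolist()
y_test = df['y'][8:10].tolist()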
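
The slices above take the rows in order, so every validation and test row has a larger x than any training row. A shuffled split is one possible alternative. The sketch below assumes a fixed seed of 0 for reproducibility and uses hypothetical `_s` suffixed names so that it does not overwrite the variables used in the rest of this notebook.

```python
import random

xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2, 4, 6, 8, 10, 12, 14, 15, 16, 20]

# Shuffle the row indices with a fixed seed (an assumption, for reproducibility).
indices = list(range(len(xs)))
random.Random(0).shuffle(indices)

# Keep the same 6 / 2 / 2 proportions as the ordered split.
train_idx, val_idx, test_idx = indices[:6], indices[6:8], indices[8:]

X_train_s = [xs[i] for i in train_idx]
y_train_s = [ys[i] for i in train_idx]
X_val_s = [xs[i] for i in val_idx]
y_val_s = [ys[i] for i in val_idx]
X_test_s = [xs[i] for i in test_idx]
y_test_s = [ys[i] for i in test_idx]
```

With only 10 rows, any single split is noisy, so results from either approach should be read with caution.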

In [4]:
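def gradient_descent(X_train, y_train, alpha, iterations):
    """Fit y = beta_0 + beta_1 * x by batch gradient descent.

    alpha is the learning rate; iterations caps the number of epochs.
    Returns the tuple (beta_0, beta_1).
    """
    beta_0 = 1  # initial intercept
    beta_1 = 1  # initial slope
    m = len(X_train)
    prev_mse = float('inf')

    for epoch in range(iterations):
        grad_beta_0 = 0
        grad_beta_1 = 0
        mse = 0  # half mean squared error: (1 / (2m)) * sum(error^2)

        # Accumulate the cost and both partial derivatives over the batch.
        for i in range(m):
            y_pred = beta_0 + beta_1 * X_train[i]
            error = y_train[i] - y_pred
            mse += (1 / (2 * m)) * (error ** 2)
            grad_beta_0 += -(1 / m) * error
            grad_beta_1 += -(1 / m) * error * X_train[i]

        # Step both parameters against the gradient.
        beta_0 -= alpha * grad_beta_0
        beta_1 -= alpha * grad_beta_1

        # Stop early once the cost changes by less than 0.01 between epochs.
        # Note that mse was measured before this epoch's update.
        if abs(prev_mse - mse) < 0.01:
            break
        prev_mse = mse

    return beta_0, beta_1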


In [5]:
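# Candidate learning rates: 0.01, 0.02, ..., 1.00
alpha_range = [i * 0.01 for i in range(1, 101)]
# One (val_mse, alpha, beta_0, beta_1) tuple per candidate
validation_results = []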

In [6]:
for alpha in alpha_range:
    # Fit on the training set with this learning rate (at most 100 epochs).
    beta_0, beta_1 = gradient_descent(X_train, y_train, alpha, 100)

    # Predict on the validation set and score with mean squared error.
    y_val_pred = [beta_0 + beta_1 * x for x in X_val]
    val_mse = sum((y - p) ** 2 for y, p in zip(y_val, y_val_pred)) / len(y_val)

    validation_results.append((val_mse, alpha, beta_0, beta_1))

# Tuples compare element by element, so min() picks the lowest validation MSE.
best_val_mse, best_alpha, best_beta_0, best_beta_1 = min(validation_results)



In [7]:
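# Retrain on training + validation rows with the selected learning rate.
# Caution: best_alpha was tuned on the 6 training rows only. Adding rows with
# larger x changes the curvature of the cost, so the same step size is not
# guaranteed to converge here; inspect the resulting parameters before use.
X_train_val = X_train + X_val
y_train_val = y_train + y_val
final_beta_0, final_beta_1 = gradient_descent(X_train_val, y_train_val, best_alpha, 100)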

In [8]:
# Predict on the held-out test rows with the final parameters.
y_test_pred = [final_beta_0 + final_beta_1 * x for x in X_test]

# Mean squared error on the test set.
test_mse = sum((y - p) ** 2 for y, p in zip(y_test, y_test_pred)) / len(y_test)

print(f"Best Validation MSE: {best_val_mse}")
print(f"Final Parameters: beta_0 = {final_beta_0}, beta_1 = {final_beta_1}, alpha = {best_alpha}")
print(f"Testing MSE: {test_mse}")

Best Validation MSE: 0.16449220612423854
Final Parameters: beta_0 = -7.022613067965027e+26, beta_1 = -3.9483289163636893e+27, alpha = 0.11
Testing MSE: 1.464007346285242e+57
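
The final parameters above have magnitudes near 1e27, which indicates that gradient descent diverged during the retraining on the combined training and validation rows. One way to see why, sketched under the assumption that the cost is the half-MSE used in `gradient_descent`: for a quadratic cost, the iteration is stable only when alpha is below 2 divided by the largest eigenvalue of the Hessian. The helper name `max_stable_alpha` is hypothetical.

```python
import math

def max_stable_alpha(xs):
    """Upper bound on the step size for the half-MSE cost.

    The Hessian of (1 / (2m)) * sum((y - b0 - b1 * x)^2) is
    [[1, mean(x)], [mean(x), mean(x^2)]]. Gradient descent on this
    quadratic converges only when alpha < 2 / (largest eigenvalue).
    """
    m = len(xs)
    mean_x = sum(xs) / m
    mean_x2 = sum(x * x for x in xs) / m
    # Closed-form largest eigenvalue of a symmetric 2x2 matrix.
    trace = 1 + mean_x2
    det = mean_x2 - mean_x ** 2
    lam_max = (trace + math.sqrt(trace ** 2 - 4 * det)) / 2
    return 2 / lam_max

bound_train = max_stable_alpha([1, 2, 3, 4, 5, 6])
bound_train_val = max_stable_alpha([1, 2, 3, 4, 5, 6, 7, 8])
print(bound_train, bound_train_val)
```

For x = 1..6 the bound is about 0.125, so alpha = 0.11 converges on the training set. For x = 1..8 it is about 0.076, so the same alpha overshoots when the model is retrained on the larger set. Choosing alpha below the bound for the combined data, or rescaling x before training, are possible remedies.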
