<h1 align="center">Linear Regression</h1>

This code creates synthetic data for a simple linear regression problem. It generates 50 evenly spaced input values between 0 and 10 and computes the corresponding outputs from a linear relationship with slope 2.5 and intercept 4. Standard normal noise is added to the outputs to simulate real-world data, where observations are not perfectly linear. This synthetic dataset can be used to demonstrate and test linear regression algorithms.
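
Because the noise is drawn without a fixed seed, each run produces a slightly different dataset. A minimal sketch of a reproducible variant follows. The seed value `42`, the `rng` generator, and the names `n_samples` and `noise` are illustrative assumptions and are not part of the original code.

```python
import numpy as np

# Assumption: a fixed seed (42 is arbitrary) so repeated runs draw the same noise.
rng = np.random.default_rng(42)

n_samples = 50
true_w, true_b = 2.5, 4.0

x = np.linspace(0, 10, n_samples)        # evenly spaced inputs in [0, 10]
noise = rng.standard_normal(n_samples)   # zero-mean, unit-variance Gaussian noise
y = true_w * x + true_b + noise          # noisy linear targets
```

Keeping `noise` as a separate array makes it easy to inspect how far each target sits from the noiseless line.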



In [1]:
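import numpy as np

x = np.linspace(0, 10, 50)  # 50 evenly spaced inputs in [0, 10]
true_w = 2.5                # ground-truth slope
true_b = 4                  # ground-truth intercept
# Linear signal plus standard normal noise (unseeded, so values differ between runs)
y = true_w * x + true_b + np.random.randn(50)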



This function computes the **cost** of a linear regression model using the **Mean Squared Error (MSE)** criterion. The cost measures how far the model's predictions are from the actual target values.

For a linear model defined as:

$$
\hat{y} = wx + b
$$

the cost function is given by:

$$
J(w, b) = \frac{1}{2m} \sum_{i=1}^{m} (\hat{y}^{(i)} - y^{(i)})^2
$$

where:
- $m$ is the number of training examples  
- $\hat{y}^{(i)}$ is the predicted value for the $i$-th example  
- $y^{(i)}$ is the actual value for the $i$-th example  

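The function iterates over all data points, computes the prediction using the current weight and bias, accumulates the squared error of each example, and divides the total by $2m$. The factor of $\frac{1}{2}$ is a convention that cancels the 2 produced when the squared term is differentiated, so the cost is half the mean squared error. A lower cost value indicates a better fit of the model to the data.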
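
The same quantity can also be computed without an explicit loop by using NumPy array arithmetic. The sketch below is one possible vectorized form. The name `compute_cost_vectorized` and the three-point demo data are hypothetical and introduced only for illustration.

```python
import numpy as np

def compute_cost_vectorized(x, y, w, b):
    """Half the mean squared error of the line w*x + b (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    errors = w * x + b - y                     # residual for every example at once
    return np.sum(errors ** 2) / (2 * len(x))

# Tiny demo set that lies exactly on y = 2x + 1
x_demo = np.array([0.0, 1.0, 2.0])
y_demo = np.array([1.0, 3.0, 5.0])

print(compute_cost_vectorized(x_demo, y_demo, 2.0, 1.0))  # perfect fit, cost is 0.0
print(compute_cost_vectorized(x_demo, y_demo, 0.0, 0.0))  # (1 + 9 + 25) / 6, about 5.83
```

On the demo data, the exact parameters give zero cost, while the all-zero parameters give $35/6$, which matches the formula evaluated by hand.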


In [2]:
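# Cost function: half the mean squared error over all examples
def compute_cost(x, y, w, b):
    m = len(x)  # number of training examples
    total_error = 0
    for i in range(m):
        prediction = w * x[i] + b    # model output for example i
        error = prediction - y[i]    # signed residual
        total_error += error ** 2    # accumulate squared error
    cost = total_error / (2 * m)
    return cost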




This function computes the **gradients** of the cost function with respect to the model parameters **weight** and **bias**. These gradients indicate how the parameters should be adjusted to reduce the prediction error during gradient descent.

The linear regression model is:

$$
\hat{y} = wx + b
$$

The cost function is:

$$
J(w, b) = \frac{1}{2m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)^2
$$

The gradients of the cost function are:

$$
\frac{\partial J}{\partial w} = \frac{1}{m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right) x^{(i)}
$$

$$
\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)
$$

The function iterates over all training examples, computes the prediction error for each example, accumulates the contributions to the gradients, and then averages them. These gradient values are used by gradient descent to update the model parameters.
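
One way to gain confidence in these formulas is to compare them against central finite differences of the cost. The sketch below does that on a small made-up dataset. The helper names `cost`, `analytic_grad`, and `numerical_grad`, the step size `eps`, and the demo values are all assumptions chosen for illustration.

```python
import numpy as np

def cost(x, y, w, b):
    """Half the mean squared error, as in J(w, b)."""
    return np.sum((w * x + b - y) ** 2) / (2 * len(x))

def analytic_grad(x, y, w, b):
    """Gradients from the closed-form expressions for dJ/dw and dJ/db."""
    errors = w * x + b - y
    return np.mean(errors * x), np.mean(errors)

def numerical_grad(x, y, w, b, eps=1e-6):
    """Central finite-difference approximation of the same gradients."""
    dw = (cost(x, y, w + eps, b) - cost(x, y, w - eps, b)) / (2 * eps)
    db = (cost(x, y, w, b + eps) - cost(x, y, w, b - eps)) / (2 * eps)
    return dw, db

# Made-up demo data and an arbitrary parameter point
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = np.array([1.0, 2.5, 5.5, 7.0])

print("analytic: ", analytic_grad(x_demo, y_demo, 0.5, 0.0))
print("numerical:", numerical_grad(x_demo, y_demo, 0.5, 0.0))
```

Because the cost is quadratic in $w$ and $b$, the two results should agree up to floating-point rounding.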


In [3]:
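def compute_grad(x, y, w, b):
    m = len(x)  # number of training examples
    dw = 0      # accumulator for dJ/dw
    db = 0      # accumulator for dJ/db
    for i in range(m):
        prediction = w * x[i] + b
        error = prediction - y[i]
        dw += error * x[i]
        db += error
    dw = dw / m
    db = db / m
    return dw, db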




This function implements **gradient descent** to learn the optimal values of the weight and bias for a linear regression model. Gradient descent is an iterative optimization algorithm that minimizes the cost function by updating parameters step by step.

The linear regression model is:

$$
\hat{y} = wx + b
$$

At each iteration, the parameters are updated using the gradient descent update rules:

$$
w := w - \alpha \frac{\partial J}{\partial w}
$$

$$
b := b - \alpha \frac{\partial J}{\partial b}
$$

where:
- $\alpha$ is the learning rate  
- $\frac{\partial J}{\partial w}$ and $\frac{\partial J}{\partial b}$ are the gradients of the cost function  

The algorithm starts with the weight and bias initialized to zero. In each iteration, it computes the gradients at the current parameters and updates both parameters, which gradually reduces the cost when the learning rate is small enough. The cost is printed every 100 iterations to monitor convergence. After completing all iterations, the function returns the learned weight and bias values.
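
To make the update rules concrete, the sketch below works through a single iteration by hand on a two-point dataset. The demo data, the learning rate `alpha = 0.1`, and all variable names are assumptions chosen so the arithmetic is easy to follow.

```python
import numpy as np

# Two points that lie exactly on y = 2x + 1
x_demo = np.array([1.0, 2.0])
y_demo = np.array([3.0, 5.0])

w, b = 0.0, 0.0   # start from zero
alpha = 0.1       # illustrative learning rate

errors = w * x_demo + b - y_demo   # predictions minus targets: [-3, -5]
dw = np.mean(errors * x_demo)      # (-3*1 + -5*2) / 2 = -6.5
db = np.mean(errors)               # (-3 + -5) / 2 = -4.0

w_new = w - alpha * dw             # about 0.65
b_new = b - alpha * db             # about 0.4

m = len(x_demo)
cost_before = np.sum(errors ** 2) / (2 * m)                           # (9 + 25) / 4 = 8.5
cost_after = np.sum((w_new * x_demo + b_new - y_demo) ** 2) / (2 * m) # about 3.67

print(f"w: {w} -> {w_new:.2f}, b: {b} -> {b_new:.2f}")
print(f"cost: {cost_before:.4f} -> {cost_after:.4f}")
```

Both gradients are negative at the starting point, so subtracting them moves $w$ and $b$ upward, toward the values that generated the data, and the cost drops after one step.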


In [4]:
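# Note: the parameter name `iter` shadows Python's built-in iter() inside this function
def gradient_descent(x, y, learning_rate, iter):
    w = 0
    b = 0
    for i in range(iter):
        dw, db = compute_grad(x, y, w, b)
        w = w - learning_rate * dw
        b = b - learning_rate * db
        if i % 100 == 0:  # log progress every 100 iterations
            cost = compute_cost(x, y, w, b)
            print(f"iteration:{i} Cost={cost:.4f},w={w:.4f},b={b:.4f}")
    return w, b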



This step runs the gradient descent algorithm on the given dataset to learn the optimal values of the weight and bias for the linear regression model. The learning rate controls the step size of each update, while the number of iterations determines how many times the model parameters are adjusted.

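After training is complete, the final learned values of the weight and bias are printed. These values approximate the parameters of the best-fit line, the one that minimizes the cost function for the given data. Because the targets contain random noise, the learned values are close to, but not exactly equal to, the true slope 2.5 and intercept 4.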
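
The effect of the learning rate can be seen by running a short experiment with several values. The sketch below uses a compact vectorized loop on seeded data of the same shape as the dataset above. The helper `final_cost`, the seed `0`, the candidate rates, and the 50-iteration budget are assumptions for illustration. For inputs spanning 0 to 10, a rate of 0.1 is large enough that each update overshoots and the cost grows instead of shrinking.

```python
import numpy as np

def final_cost(x, y, learning_rate, n_iters):
    """Run vectorized gradient descent from w = b = 0 and return the final cost."""
    w, b = 0.0, 0.0
    for _ in range(n_iters):
        errors = w * x + b - y                     # residuals at the current parameters
        w -= learning_rate * np.mean(errors * x)   # step along -dJ/dw
        b -= learning_rate * np.mean(errors)       # step along -dJ/db
    return np.sum((w * x + b - y) ** 2) / (2 * len(x))

# Assumption: seeded data with the same slope, intercept, and size as above
rng = np.random.default_rng(0)
x_demo = np.linspace(0, 10, 50)
y_demo = 2.5 * x_demo + 4 + rng.standard_normal(50)

initial_cost = np.sum(y_demo ** 2) / (2 * len(x_demo))  # cost at w = b = 0
print(f"initial cost = {initial_cost:.4g}")
for lr in (0.001, 0.01, 0.1):
    print(f"learning_rate={lr}: cost after 50 iterations = "
          f"{final_cost(x_demo, y_demo, lr, 50):.4g}")
```

The two smaller rates lower the cost relative to its starting value, while the largest one makes it grow rapidly. The exact threshold depends on the scale of the inputs.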


In [5]:
final_w, final_b = gradient_descent(x, y, learning_rate=0.01, iter=2000)
print("final learned parameters:")
print("w =", final_w)
print("b = ", final_b)

iteration:0 Cost=68.7726,w=1.0264,b=0.1608
iteration:100 Cost=0.9520,w=2.8906,b=1.0768
iteration:200 Cost=0.7114,w=2.8182,b=1.5610
iteration:300 Cost=0.5661,w=2.7619,b=1.9372
iteration:400 Cost=0.4783,w=2.7182,b=2.2296
iteration:500 Cost=0.4254,w=2.6842,b=2.4567
iteration:600 Cost=0.3934,w=2.6578,b=2.6332
iteration:700 Cost=0.3741,w=2.6373,b=2.7704
iteration:800 Cost=0.3624,w=2.6213,b=2.8769
iteration:900 Cost=0.3554,w=2.6089,b=2.9597
iteration:1000 Cost=0.3511,w=2.5993,b=3.0241
iteration:1100 Cost=0.3486,w=2.5918,b=3.0741
iteration:1200 Cost=0.3470,w=2.5860,b=3.1129
iteration:1300 Cost=0.3461,w=2.5815,b=3.1431
iteration:1400 Cost=0.3455,w=2.5780,b=3.1666
iteration:1500 Cost=0.3452,w=2.5753,b=3.1848
iteration:1600 Cost=0.3450,w=2.5732,b=3.1989
iteration:1700 Cost=0.3448,w=2.5715,b=3.2099
iteration:1800 Cost=0.3448,w=2.5702,b=3.2185
iteration:1900 Cost=0.3447,w=2.5692,b=3.2251
final learned parameters:
w = 2.5684692026438594
b =  3.230242045849406
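
A learned line can be sanity-checked against the closed-form least-squares fit, for example with `np.polyfit`. The sketch below does this on freshly generated, seeded data rather than on the run recorded above, so its numbers will not match that output. The seed `0`, the 20000-iteration budget, and the variable names are assumptions for illustration.

```python
import numpy as np

# Assumption: seeded data with the same slope, intercept, and size as the notebook's dataset
rng = np.random.default_rng(0)
x_demo = np.linspace(0, 10, 50)
y_demo = 2.5 * x_demo + 4 + rng.standard_normal(50)

# Closed-form least-squares fit of a degree-1 polynomial: returns [slope, intercept]
w_ls, b_ls = np.polyfit(x_demo, y_demo, 1)

# Vectorized gradient descent with the same update rules, run long enough to converge here
w, b = 0.0, 0.0
for _ in range(20000):
    errors = w * x_demo + b - y_demo
    w -= 0.01 * np.mean(errors * x_demo)
    b -= 0.01 * np.mean(errors)

print(f"least squares:    w={w_ls:.4f}, b={b_ls:.4f}")
print(f"gradient descent: w={w:.4f}, b={b:.4f}")
```

When the two pairs agree, gradient descent has effectively reached the minimizer of the cost. Any remaining gap between those values and the true slope and intercept comes from the noise in the data, not from the optimizer.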
