### Linear Regression

• Linear regression is a statistical approach for modelling the relationship between a dependent variable and a given set of independent variables.

• Note: In this article, we refer to dependent variables as responses and independent variables as features for simplicity.

#### Simple Linear Regression

• Simple linear regression is an approach for predicting a response using a single feature.

• It is assumed that the two variables are linearly related. Hence, we try to find a linear function of the feature, or independent variable (x), that predicts the response value (y) as accurately as possible.

• Let us consider a dataset where we have a response value y for every feature value x:

In [3]:
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

y = [1, 3, 2, 5, 7, 8, 8, 9, 10, 12]
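
• As a sketch of how such a line is usually fitted, the standard least-squares estimates for a single feature can be written as follows. Here b_0 is the intercept, b_1 the slope, and SS_xy and SS_xx the cross-deviation and the deviation about x. The bar notation for means and the hat for predicted values are assumptions introduced for illustration:

```latex
\begin{aligned}
\hat{y}_i &= b_0 + b_1 x_i \\
SS_{xy} &= \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y} \\
SS_{xx} &= \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2 \\
b_1 &= \frac{SS_{xy}}{SS_{xx}}, \qquad b_0 = \bar{y} - b_1\,\bar{x}
\end{aligned}
```

• For the ten points above, the mean of x is 4.5 and the mean of y is 6.5, the sum of x_i y_i is 389 and the sum of x_i^2 is 285. This gives SS_xy = 389 - 10(4.5)(6.5) = 96.5 and SS_xx = 285 - 10(4.5)^2 = 82.5, so b_1 = 96.5 / 82.5 ≈ 1.170 and b_0 = 6.5 - 1.170 × 4.5 ≈ 1.236.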

• Given below is a Python implementation of simple linear regression on our small dataset:

In [None]:
import numpy as np
import matplotlib.pyplot as plt

def estimate_coef(x, y):
    # Accept plain Python lists as well as NumPy arrays
    x, y = np.asarray(x), np.asarray(y)

    # Number of observations/points
    n = np.size(x)

    # Mean of x and y vector
    m_x, m_y = np.mean(x), np.mean(y)

    # Calculating cross-deviation & deviation about x
    SS_xy = np.sum(x*y) - n*m_y*m_x
    SS_xx = np.sum(x*x) - n*m_x*m_x

    # Calculating regression coefficients
    b_1 = SS_xy / SS_xx
    b_0 = m_y - b_1*m_x

    return b_0, b_1

def plot_regression_line(x, y, b):
    # Plotting the actual points as scatter plot
    plt.scatter(x, y, color='m', marker='o', s=30)

    # Predicted response vector (np.asarray lets x be a plain list)
    y_predict = b[0] + b[1]*np.asarray(x)
    
    # Plotting the regression line
    plt.plot(x, y_predict, color='g')
    
    # Putting 