# Linear Regression Implementation for a Single Variable

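Linear regression is a supervised machine learning technique. The objective is to find the straight line that fits the data as closely as possible. For single-variable regression, you can visualise the model as the line y = mx + c, where the slope and the intercept are the two parameters to learn. (In the code below, `m` instead denotes the number of training examples, and the intercept and slope are stored together in `theta`.)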

In [None]:
# Importing the necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [None]:
# Reading the Data
df = pd.read_csv('salary.csv')
print(df.head(10))

In [None]:
# Loading the data into X (first column) and y (second column)
X = np.array(df.iloc[:, 0])
y = np.array(df.iloc[:, 1])
m = len(X)  # number of training examples

In [None]:
# Visualizing the Data
plt.scatter(X,y)
plt.xlabel("Experience")
plt.ylabel("Salary")

In [None]:
print(np.shape(X))
print(np.shape(y))

At this point the data is ready. However, we want X to have shape m×2 (a column of ones for the intercept, followed by the feature column) and y to have shape m×1. We then need one function to calculate the cost and another to perform gradient descent. X, y and theta are treated as matrices because the calculations can then be written as simple matrix operations.

In [None]:
# Parameter Initialisation and reshaping matrices
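theta = np.zeros((2, 1))   # [intercept, slope], initialised to zero
alpha = 0.01               # learning rate
X = np.c_[np.ones(m), X]   # prepend a column of ones for the intercept term
y = np.reshape(y, (m, 1))  # column vector of shape (m, 1)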
print(np.shape(X))
print(np.shape(y))
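
The column of ones is what lets a single matrix product compute the intercept and slope terms together. A minimal sketch on made-up numbers follows; `x_toy`, `X_toy` and `theta_toy` are illustrative names and values, not taken from `salary.csv`.

```python
import numpy as np

# Hypothetical toy data, not from salary.csv
x_toy = np.array([1.0, 2.0, 3.0])
theta_toy = np.array([[5.0], [2.0]])  # intercept 5, slope 2

# Prepend a column of ones so the shape becomes (3, 2)
X_toy = np.c_[np.ones(len(x_toy)), x_toy]

# Matrix form of the hypothesis: each row gives 1*intercept + x*slope
hx_matrix = X_toy @ theta_toy

# The same line written element by element
hx_line = theta_toy[0, 0] + theta_toy[1, 0] * x_toy

print(X_toy.shape)                              # (3, 2)
print(np.allclose(hx_matrix.ravel(), hx_line))  # True
```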

<img src="1.png">

In [None]:
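# Function to calculate cost
def costfunc(X, y, theta):
    m = len(y)             # number of training examples
    hx = np.dot(X, theta)  # predictions, shape (m, 1)
    cost = np.sum(np.power(hx - y, 2)) / (2 * m)  # half the mean squared error
    return cost

J = costfunc(X, y, theta)  # Cost before gradient descent
print(J)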
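
The cost computed above is the sum of squared errors divided by 2m. As a sanity check, here is a sketch on a toy dataset small enough to verify by hand. The values are hypothetical, and `cost_sketch` is a standalone copy of the same formula so that the block runs on its own.

```python
import numpy as np

def cost_sketch(X, y, theta):
    """Half the mean squared error, mirroring costfunc above."""
    m = len(y)
    residual = X @ theta - y
    return float(np.sum(residual ** 2) / (2 * m))

# Hypothetical toy data lying exactly on y = 2x
X_toy = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y_toy = np.array([[2.0], [4.0], [6.0]])

# With theta = 0 the residuals are -2, -4, -6, so the cost is (4 + 16 + 36) / 6
cost_zero = cost_sketch(X_toy, y_toy, np.zeros((2, 1)))

# With intercept 0 and slope 2 the fit is perfect
cost_exact = cost_sketch(X_toy, y_toy, np.array([[0.0], [2.0]]))

print(cost_zero)   # approximately 9.333
print(cost_exact)  # 0.0
```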

<img src = "2.png">

In [None]:
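# Function for gradient descent
def grad(X, y, alpha, theta, iterations=1000):
    m = len(y)  # number of training examples
    for i in range(iterations):
        error = np.matmul(X, theta) - y       # prediction error, shape (m, 1)
        gradient = np.matmul(X.T, error) / m  # gradient of the cost w.r.t. theta
        theta = theta - alpha * gradient      # step against the gradient
    return theta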

theta = grad(X, y, alpha, theta)
print(theta)
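
A fixed number of iterations gives no direct evidence that the descent has converged. One possible check, sketched here on synthetic data, is to record the cost after every update and confirm that it keeps falling. The function `grad_with_history` and the synthetic data are assumptions made for illustration and are not part of the original notebook.

```python
import numpy as np

def grad_with_history(X, y, alpha, theta, iterations=1000):
    """Batch gradient descent that also records the cost after each update."""
    m = len(y)
    history = []
    for _ in range(iterations):
        error = X @ theta - y
        theta = theta - (alpha / m) * (X.T @ error)
        history.append(float(np.sum((X @ theta - y) ** 2) / (2 * m)))
    return theta, history

# Synthetic data: y = 3 + 2x with no noise (illustrative only)
x_syn = np.linspace(0.0, 5.0, 20)
X_syn = np.c_[np.ones(len(x_syn)), x_syn]
y_syn = (3.0 + 2.0 * x_syn).reshape(-1, 1)

theta_syn, history = grad_with_history(X_syn, y_syn, 0.01, np.zeros((2, 1)))
print(history[0], history[-1])  # the cost should fall as iterations proceed
```

If the recorded cost rises instead of falling, the learning rate `alpha` is probably too large for the scale of the data.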

In [None]:
# Cost after gradient descent
J = costfunc(X, y, theta)
print(J)
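
One way to judge the final parameters is to compare them with the closed-form least-squares solution. The sketch below does this with `np.linalg.lstsq` on synthetic data (the data and variable names are hypothetical). Substituting the notebook's `X` and `y` should work the same way, though that is untested here.

```python
import numpy as np

# Synthetic data: y = 3 + 2x with no noise (illustrative only)
x_syn = np.linspace(0.0, 5.0, 20)
X_syn = np.c_[np.ones(len(x_syn)), x_syn]
y_syn = (3.0 + 2.0 * x_syn).reshape(-1, 1)

# Closed-form least-squares fit, used as a reference
theta_ref, *_ = np.linalg.lstsq(X_syn, y_syn, rcond=None)

# Gradient descent with the same update rule, run long enough to converge
theta_gd = np.zeros((2, 1))
m_syn = len(y_syn)
for _ in range(20000):
    theta_gd = theta_gd - (0.01 / m_syn) * (X_syn.T @ (X_syn @ theta_gd - y_syn))

print(theta_ref.ravel())  # reference intercept and slope
print(theta_gd.ravel())   # gradient-descent estimate, should be close to the reference
```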

In [None]:
# Visualising the best-fit line
plt.scatter(X[:,1],y)
plt.plot(X[:,1], np.dot(X,theta), color="red")
plt.xlabel("Experience")
plt.ylabel("Salary")
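
Once `theta` has been fitted, a prediction for a new input is the same matrix product with a leading 1. A minimal sketch follows, in which the helper `predict` and the parameter values in `theta_example` are hypothetical stand-ins rather than the values fitted from `salary.csv`.

```python
import numpy as np

def predict(theta, experience):
    """Return theta0 + theta1 * experience for scalar or array input."""
    experience = np.atleast_1d(np.asarray(experience, dtype=float))
    X_new = np.c_[np.ones(len(experience)), experience]
    return (X_new @ theta).ravel()

# Hypothetical parameters, NOT the values fitted from salary.csv
theta_example = np.array([[25000.0], [9000.0]])

pred_single = predict(theta_example, 5)
pred_many = predict(theta_example, [1, 2.5])

print(pred_single)  # 25000 + 9000 * 5 = 70000
print(pred_many)    # 34000 and 47500
```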