<a href="https://colab.research.google.com/github/manjunathrgithub/Simple-LR-Model-with-Numpy/blob/master/LR_Using_Numpy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
import numpy as np


In [0]:
# Data Generation
np.random.seed(42)
x = np.random.rand(100, 1)

# Targets follow y = a + b*x + Gaussian noise, with true values a = 1 and b = 2
y = 1 + 2 * x + .1 * np.random.randn(100, 1)

# Shuffles the indices
idx = np.arange(100)
np.random.shuffle(idx)

# Uses the first 80 shuffled indices for training
train_idx = idx[:80]
# Uses the remaining indices for validation
val_idx = idx[80:]

# Generates train and validation sets
x_train, y_train = x[train_idx], y[train_idx]
x_val, y_val = x[val_idx], y[val_idx]



In [18]:
x_train[:10]


array([[0.77127035],
       [0.06355835],
       [0.86310343],
       [0.02541913],
       [0.73199394],
       [0.07404465],
       [0.19871568],
       [0.31098232],
       [0.47221493],
       [0.96958463]])

In [19]:
type(x_train)

numpy.ndarray

In [20]:
y_train[:10]



array([[2.47453822],
       [1.19277206],
       [2.9127843 ],
       [1.07850733],
       [2.47316396],
       [1.17131467],
       [1.2653857 ],
       [1.52449648],
       [1.98570794],
       [2.84011562]])

In [21]:
type(y_train)

numpy.ndarray

In [22]:
x_val[:10]


array([[0.30461377],
       [0.15599452],
       [0.66252228],
       [0.10789143],
       [0.9093204 ],
       [0.30424224],
       [0.54671028],
       [0.77096718],
       [0.96563203],
       [0.59865848]])

In [23]:
type(x_val)

numpy.ndarray

In [24]:
y_val[:10]

array([[1.61525056],
       [1.3477003 ],
       [2.23410582],
       [1.29850118],
       [2.89383411],
       [1.53827918],
       [2.15210627],
       [2.45298292],
       [2.73938694],
       [1.99856008]])

In [25]:
type(y_val)

numpy.ndarray

In [26]:
# Initializes parameters "a" and "b" randomly
np.random.seed(42)
a = np.random.randn(1)
b = np.random.randn(1)

print(a, b)


[0.49671415] [-0.1382643]


In [0]:
# Sets learning rate. With lr = 2e-1 and 1000 epochs, the result matches sklearn's LinearRegression up to 8 decimal places
lr = 2e-1
# Defines number of epochs
n_epochs = 1000

In [28]:



for epoch in range(n_epochs):
    # Computes our model's predicted output
    yhat = a + b * x_train
    
    # How wrong is our model? That's the error! 
    error = (y_train - yhat)
    # It is a regression, so it computes mean squared error (MSE)
    loss = (error ** 2).mean()
    
    # Computes gradients for both "a" and "b" parameters
    a_grad = -2 * error.mean()
    b_grad = -2 * (x_train * error).mean()
    
    # Updates parameters using gradients and the learning rate
    a = a - lr * a_grad
    b = b - lr * b_grad
    
print(a, b)


[1.02354075] [1.96896447]
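
The gradient expressions in the loop above come from differentiating the MSE with respect to a and b. One way to sanity-check them is to compare against a central finite-difference estimate. The sketch below is illustrative only: the helper `mse`, the step size `eps`, and the test point (a, b) = (0.5, -0.1) are assumptions that do not appear in the original notebook, and it uses the full dataset rather than the training split for brevity.

```python
import numpy as np

# Same synthetic data as in the data generation cell
np.random.seed(42)
x = np.random.rand(100, 1)
y = 1 + 2 * x + .1 * np.random.randn(100, 1)

def mse(a, b):
    # Hypothetical helper: mean squared error of the line a + b*x
    return ((y - (a + b * x)) ** 2).mean()

# Arbitrary test point (an assumption, chosen only for illustration)
a, b = 0.5, -0.1

# Analytic gradients, written exactly as in the training loop
error = y - (a + b * x)
a_grad = -2 * error.mean()
b_grad = -2 * (x * error).mean()

# Central finite differences as an independent estimate
eps = 1e-6
a_num = (mse(a + eps, b) - mse(a - eps, b)) / (2 * eps)
b_num = (mse(a, b + eps) - mse(a, b - eps)) / (2 * eps)

print(a_grad, a_num)
print(b_grad, b_num)
```

Because the MSE is quadratic in a and b, the central difference should agree with the analytic gradient up to floating-point rounding.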


In [29]:

# Sanity check: does sklearn's LinearRegression give the same results as our gradient descent?
from sklearn.linear_model import LinearRegression
linr = LinearRegression()
linr.fit(x_train, y_train)
print(linr.intercept_, linr.coef_[0])


[1.02354075] [1.96896447]
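
The same check can likely be done without sklearn, since ordinary least squares has a closed-form solution. The sketch below is one possible way to do it with `np.linalg.lstsq`; the names `X`, `theta`, `a_ls`, and `b_ls` are introduced here for illustration and are not part of the original notebook.

```python
import numpy as np

# Recreate the data and the 80/20 split used above
np.random.seed(42)
x = np.random.rand(100, 1)
y = 1 + 2 * x + .1 * np.random.randn(100, 1)
idx = np.arange(100)
np.random.shuffle(idx)
train_idx = idx[:80]
x_train, y_train = x[train_idx], y[train_idx]

# Design matrix: a column of ones for the intercept "a", then x for the slope "b"
X = np.hstack([np.ones_like(x_train), x_train])

# Least-squares solution of X @ theta ~= y_train
theta, *_ = np.linalg.lstsq(X, y_train, rcond=None)
a_ls, b_ls = theta[0], theta[1]

print(a_ls, b_ls)
```

If gradient descent has converged, the printed intercept and slope should agree with the values above to within a small tolerance.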


The outputs of LR with NumPy and LR with sklearn agree to all 8 decimal places printed.

Both estimates are close to the true values used to generate the data, a = 1 and b = 2.
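
The validation split created earlier is never used in the cells above. A minimal sketch of how it could be used is shown below, assuming the training loop is wrapped in a hypothetical helper `fit_gd` (not part of the original notebook) and that MSE on `x_val`, `y_val` is the metric of interest.

```python
import numpy as np

# Recreate the data and the 80/20 split used above
np.random.seed(42)
x = np.random.rand(100, 1)
y = 1 + 2 * x + .1 * np.random.randn(100, 1)
idx = np.arange(100)
np.random.shuffle(idx)
train_idx, val_idx = idx[:80], idx[80:]
x_train, y_train = x[train_idx], y[train_idx]
x_val, y_val = x[val_idx], y[val_idx]

def fit_gd(x, y, lr=2e-1, n_epochs=1000):
    """Hypothetical helper: the gradient descent loop above, as a function."""
    np.random.seed(42)
    a = np.random.randn(1)
    b = np.random.randn(1)
    for _ in range(n_epochs):
        error = y - (a + b * x)
        a = a - lr * (-2 * error.mean())
        b = b - lr * (-2 * (x * error).mean())
    return a, b

a, b = fit_gd(x_train, y_train)

# MSE on the data used for fitting and on the held-out data
train_loss = ((y_train - (a + b * x_train)) ** 2).mean()
val_loss = ((y_val - (a + b * x_val)) ** 2).mean()

print(train_loss, val_loss)
```

The noise added to y has standard deviation 0.1, so its variance is 0.01. A validation loss of roughly that size would suggest the fitted line generalizes about as well as the noise allows.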