# Exercise 4 Solution - Linear Regression

### Task
Implement a linear regression model with the provided class structure. 
Write the following member functions:
- the forward prediction
- the cost function computation
- the gradient computation
- the training algorithm 

### Learning goals
- Understand the foundational steps of machine learning by implementing each of the components
- Compare the gradient descent result with the closed-form solution from the Normal Equations

In [None]:
import numpy as np
import matplotlib.pyplot as plt

Generate noisy training and test data with an 80/20 split (80 training and 20 test points) from the line $y = 2x + 3$ with standard normal noise.

In [None]:
np.random.seed(765)  # deterministic random seed
xTrain = np.random.randn(80)
yTrain = 2 * xTrain + 3 + np.random.randn(80)

xTest = np.random.randn(20)
yTest = 2 * xTest + 3 + np.random.randn(20)

Model definition in `forward`: $\hat{y}=wx+b$

Cost function (mean squared error over the $m_\mathcal{D}$ data points $(\tilde{x}_i,\tilde{y}_i)$): $C(w,b)=\frac{1}{m_\mathcal{D}}\sum_{i=1}^{m_\mathcal{D}}\left(\tilde{y}_i-(w\tilde{x}_i+b)\right)^2$

Gradient with respect to the weight: $\frac{\partial C}{\partial w} =\frac{1}{m_{\mathcal{D}}}\sum_{i=1}^{m_{\mathcal{D}}}-2\tilde{x}_{i}\left(\tilde{y}_{i}-(w\tilde{x}_{i}+b)\right)$

Gradient with respect to the bias: $\frac{\partial C}{\partial b} =\frac{1}{m_{\mathcal{D}}}\sum_{i=1}^{m_{\mathcal{D}}}-2\left(\tilde{y}_{i}-(w\tilde{x}_{i}+b)\right)$

Training update steps (with learning rate $\alpha$, called `lr` in the code):
- $w\leftarrow w-\alpha\frac{\partial C}{\partial w}$
- $b\leftarrow b-\alpha\frac{\partial C}{\partial b}$
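
One way to sanity-check the gradient formulas above is to compare them with central finite differences of the cost. The following is a minimal standalone sketch: the function names `cost`, `analytic_gradient`, and `numeric_gradient`, the step size `eps`, the evaluation point, and the synthetic data are all assumptions made for this illustration and are not part of the exercise's class.

```python
import numpy as np


def cost(w, b, x, y):
    # Mean squared error, as in the cost function above
    return np.mean((y - (w * x + b)) ** 2)


def analytic_gradient(w, b, x, y):
    # Direct transcription of the two gradient formulas above
    residual = y - (w * x + b)
    grad_w = np.mean(-2 * x * residual)
    grad_b = np.mean(-2 * residual)
    return grad_w, grad_b


def numeric_gradient(w, b, x, y, eps=1e-6):
    # Central finite differences of the cost in w and in b
    grad_w = (cost(w + eps, b, x, y) - cost(w - eps, b, x, y)) / (2 * eps)
    grad_b = (cost(w, b + eps, x, y) - cost(w, b - eps, x, y)) / (2 * eps)
    return grad_w, grad_b


# Hypothetical data for the check (not the notebook's xTrain / yTrain)
rng = np.random.default_rng(0)
x_check = rng.normal(size=50)
y_check = 2 * x_check + 3 + rng.normal(size=50)

grad_analytic = analytic_gradient(0.5, -1.0, x_check, y_check)
grad_numeric = numeric_gradient(0.5, -1.0, x_check, y_check)
print("analytic:", grad_analytic)
print("numeric: ", grad_numeric)
```

Because the cost is quadratic in $w$ and $b$, the two results should agree up to floating-point rounding.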

In [None]:
class LinearRegression:
    def __init__(self):
        self.weight = 0
        self.bias = 0

    def forward(self, x):
        y = self.weight * x + self.bias
        return y

    def costFunction(self, x, y):
        cost = np.mean((self.forward(x) - y) ** 2)
        return cost

    def gradient(self, x, y):
        # (forward(x) - y) is the negative of (y - forward(x)), so this matches the formulas above
        error = self.forward(x) - y
        gradientWeight = np.mean(2 * error * x)
        gradientBias = np.mean(2 * error)
        return gradientWeight, gradientBias

    def train(self, epochs, lr, xTrain, yTrain, xTest, yTest):
        for epoch in range(epochs):
            costTrain = self.costFunction(xTrain, yTrain)
            costTest = self.costFunction(xTest, yTest)

            # Update step
            gradientWeight, gradientBias = self.gradient(xTrain, yTrain)
            self.weight -= lr * gradientWeight
            self.bias -= lr * gradientBias

            if epoch % 10 == 0:
                string = "Epoch: {}/{}\t\tTraining cost = {:.2e}\t\tTest cost = {:.2e}"
                print(string.format(epoch, epochs, costTrain, costTest))

Train the model with gradient descent

In [None]:
lr = 5e-2
epochs = 101

model = LinearRegression()
model.train(epochs, lr, xTrain, yTrain, xTest, yTest)
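
To inspect convergence beyond the printed lines, the cost can be recorded at every epoch. The sketch below is a standalone, function-based variant of the same update rule. The name `fit_with_history`, its return values, and the synthetic data are assumptions for illustration; it does not modify the `LinearRegression` class.

```python
import numpy as np


def fit_with_history(x, y, lr=5e-2, epochs=101):
    """Gradient descent on the MSE cost; returns (w, b, list of per-epoch costs)."""
    w, b = 0.0, 0.0
    history = []
    for _ in range(epochs):
        error = (w * x + b) - y
        history.append(np.mean(error ** 2))  # cost before this epoch's update
        w -= lr * np.mean(2 * error * x)
        b -= lr * np.mean(2 * error)
    return w, b, history


# Hypothetical data drawn the same way as in the notebook (different generator)
rng = np.random.default_rng(765)
x_demo = rng.normal(size=80)
y_demo = 2 * x_demo + 3 + rng.normal(size=80)

w_fit, b_fit, history = fit_with_history(x_demo, y_demo)
print(f"w = {w_fit:.3f}, b = {b_fit:.3f}")
print(f"cost at first epoch = {history[0]:.3f}, at last epoch = {history[-1]:.3f}")
```

The returned `history` list can be passed to `plt.plot` or `plt.semilogy` to draw a learning curve.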

Visualize the prediction

In [None]:
yTrainPred = model.forward(xTrain)  # not visualized
yTestPred = model.forward(xTest)  # not visualized

# Draw the fitted line between the min and max x values of the test set
x = np.linspace(np.min(xTest), np.max(xTest), 100)
yPred = model.forward(x)

fig, ax = plt.subplots(figsize=(12, 6))
ax.scatter(xTest, yTest, color="r", label="testing data")
ax.scatter(xTrain, yTrain, color="k", label="training data")
ax.plot(x, yPred, "b", label="prediction")
ax.legend()
plt.show()

In [None]:
# Parameters learned by gradient descent
print('Model Bias, b = ', model.bias)
print('Model Weight, w = ', model.weight)

# Compare with the Normal equations approach
x = xTrain.reshape(-1, 1)  # column vectors
y = yTrain.reshape(-1, 1)
X = np.hstack([np.ones((x.shape[0], 1)), x])  # augment with 1s

# @ is matrix multiplication for plain NumPy arrays
theta = np.linalg.inv(X.T @ X) @ X.T @ y

print("\nCompare with Normal equation weights (bias and slope):\n", theta)

## Normal equations 
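
As a brief sketch of where the closed-form solution comes from, assume the inputs are stacked into a matrix $X$ whose first column is all ones, the targets into a vector $\tilde{y}$, and the parameters into $\theta=(b,\,w)^\mathsf{T}$. These symbols are notation chosen here to match the code in the following cells. Setting the gradient of the mean squared error to zero gives:

$$C(\theta)=\frac{1}{m_\mathcal{D}}\lVert \tilde{y}-X\theta\rVert^2,\qquad \nabla_\theta C=-\frac{2}{m_\mathcal{D}}X^\mathsf{T}(\tilde{y}-X\theta)=0 \;\Longrightarrow\; X^\mathsf{T}X\,\theta=X^\mathsf{T}\tilde{y} \;\Longrightarrow\; \theta=(X^\mathsf{T}X)^{-1}X^\mathsf{T}\tilde{y}$$

The last step assumes that $X^\mathsf{T}X$ is invertible.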

In [None]:
x = np.arange(4).reshape(-1, 1)  # x and y are column vectors by convention
y = 2 * x + 3
print('x \n', x)
print('y \n', y)

# For multidimensional regression problems, X is (m, n + 1):
#  m rows for the data points, n columns for the features, plus a leading column of 1s
X = np.hstack([np.ones((x.shape[0], 1)), x])  # augment with 1s

X_transpose_X = X.T @ X  # @ is matrix multiplication for plain NumPy arrays
X_transpose_y = X.T @ y
theta = np.linalg.inv(X_transpose_X) @ X_transpose_y

print("X.T * X:\n", X_transpose_X)
print("X.T * y:\n", X_transpose_y)

# Print the weights
print("Weights (bias and slope):\n", theta)
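
For this small example the arithmetic can be followed by hand. With $x=(0,1,2,3)^\mathsf{T}$ and $y=2x+3=(3,5,7,9)^\mathsf{T}$, the sums are $\sum x_i=6$, $\sum x_i^2=14$, $\sum y_i=24$ and $\sum x_i y_i=46$, so a worked version of the computation reads:

$$X^\mathsf{T}X=\begin{pmatrix}4&6\\6&14\end{pmatrix},\quad X^\mathsf{T}y=\begin{pmatrix}24\\46\end{pmatrix},\quad \theta=\frac{1}{20}\begin{pmatrix}14&-6\\-6&4\end{pmatrix}\begin{pmatrix}24\\46\end{pmatrix}=\begin{pmatrix}3\\2\end{pmatrix}$$

Since the data lie exactly on the line, the bias $3$ and slope $2$ are recovered without error.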

In [None]:
x = xTrain.reshape(-1, 1)  # same computation on the noisy training data
y = yTrain.reshape(-1, 1)
X = np.hstack([np.ones((x.shape[0], 1)), x])  # augment with 1s

X_transpose_X = X.T @ X  # @ is matrix multiplication for plain NumPy arrays
X_transpose_y = X.T @ y
theta = np.linalg.inv(X_transpose_X) @ X_transpose_y

# Print the weights
print("Weights (bias and slope):\n", theta)
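
Forming the explicit inverse is fine for a 2x2 system, but in practice the same least-squares solution is often obtained with a linear solver or with `np.linalg.lstsq`, which avoids the inverse. The sketch below compares the three routes on synthetic data. The variable names and the data are assumptions for illustration, not the notebook's arrays.

```python
import numpy as np

# Hypothetical data drawn the same way as in the notebook
rng = np.random.default_rng(0)
x_demo = rng.normal(size=80)
y_demo = 2 * x_demo + 3 + rng.normal(size=80)

# Design matrix with a leading column of ones: columns are (1, x)
X_demo = np.column_stack([np.ones_like(x_demo), x_demo])

# 1) Explicit inverse, as in the cells above
theta_inv = np.linalg.inv(X_demo.T @ X_demo) @ X_demo.T @ y_demo

# 2) Solve the normal equations X^T X theta = X^T y without forming the inverse
theta_solve = np.linalg.solve(X_demo.T @ X_demo, X_demo.T @ y_demo)

# 3) Least-squares solver applied directly to X and y
theta_lstsq, *_ = np.linalg.lstsq(X_demo, y_demo, rcond=None)

print("inverse:", theta_inv)
print("solve:  ", theta_solve)
print("lstsq:  ", theta_lstsq)
```

All three return the parameters in the order (bias, slope) and should agree to within rounding on a well-conditioned problem like this one.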