# Stochastic Gradient Descent - Linear Regression

Implement stochastic gradient descent (SGD) and use it to update the weights of a linear regression model.

## Data for the linear regression model

Use the following data to train and evaluate your model.

In [5]:
import numpy as np

In [6]:
# Data points
data_amount = 15
max_num = 10
X = np.random.randint(max_num, size=(data_amount, 3))

# We generate them by "knowing" the output weights for this example (this is not the case for real data!)
final_weights = np.random.rand(X.shape[1])
final_weights = final_weights / np.sum(final_weights)

final_bias = 0.2

# Corresponding labels
# np.random.rand draws from [0, 1); dividing by 7.5 keeps the noise small
random_noise = np.random.rand(X.shape[0]) / 7.5
y = np.dot(final_weights, X.T) + final_bias + random_noise

print('data set X\n', X)
print('labels y\n', y)

data set X
 [[6 0 8]
 [9 7 7]
 [9 9 5]
 [8 8 1]
 [6 9 7]
 [1 9 0]
 [5 3 1]
 [4 7 0]
 [5 2 7]
 [6 1 6]
 [4 2 8]
 [7 5 4]
 [1 9 6]
 [6 8 3]
 [2 4 8]]
labels y
 [5.24157644 7.85982604 7.6674491  5.47547762 7.53827703 3.36874649
 2.98525093 3.62689474 5.19432495 4.69982074 5.21118191 5.49054437
 5.74550285 5.68681945 5.33058001]
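
The cell above draws from NumPy's global random state without a seed, so each run produces a different `X` and `y`, and the values printed here will not reappear. A minimal sketch of a reproducible variant, assuming NumPy's `Generator` API; the helper name `make_data` and the seed `0` are illustrative choices, not part of the original notebook:

```python
import numpy as np

def make_data(seed=0, data_amount=15, max_num=10, n_features=3):
    """Generate a small synthetic regression data set (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Integer features in [0, max_num)
    X = rng.integers(max_num, size=(data_amount, n_features))
    # "True" weights, normalised to sum to 1 as in the cell above
    true_weights = rng.random(n_features)
    true_weights = true_weights / true_weights.sum()
    true_bias = 0.2
    # Uniform noise in [0, 1/7.5)
    noise = rng.random(data_amount) / 7.5
    y = X @ true_weights + true_bias + noise
    return X, y, true_weights, true_bias

X_demo, y_demo, w_demo, b_demo = make_data(seed=0)
X_again, y_again, _, _ = make_data(seed=0)

# The same seed yields the same data on every call
print(np.array_equal(X_demo, X_again), np.allclose(y_demo, y_again))
```

Passing a different `seed` gives a different but equally repeatable data set.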


## Training and test data

In [7]:
train_len = int(data_amount * 0.75)

# We train with the following data
X_train = X[:train_len]
y_train = y[:train_len]

# We test / evaluate with the following data
X_test = X[train_len:]
y_test = y[train_len:]

## Information about the model

In [8]:
# We set the initial weights randomly
weights = np.random.rand(X.shape[1])

# The bias value is set to 1 initially (float dtype, so that in-place updates with float gradients work)
bias = np.array([1.0])

### Some more information

We know the regression equation:

$y_{pred}= w_1x_1 + w_2x_2 + \ldots + w_nx_n + b$
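
To make the equation concrete, the sketch below evaluates it once term by term and once with `np.dot`, the vectorised form used in the rest of the notebook. The sample `x`, the weights `w`, and the bias `b` are made-up values for illustration only:

```python
import numpy as np

x = np.array([6.0, 0.0, 8.0])    # one sample with n = 3 features (illustrative)
w = np.array([0.5, 0.25, 0.25])  # illustrative weights
b = 0.2                          # illustrative bias

# Term by term: w_1*x_1 + w_2*x_2 + w_3*x_3 + b
y_loop = sum(w_k * x_k for w_k, x_k in zip(w, x)) + b

# Vectorised equivalent
y_dot = np.dot(w, x) + b

print(y_loop, y_dot)  # both evaluate to 5.2
```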

In [9]:
# What are the current results of the untrained model?
y_untrained = np.dot(weights, X_test.T) + bias
print('Outputs for our untrained model:', y_untrained)

# What are the results of the final model (the target we want to approach by updating the weights with stochastic gradient descent)?
y_final = np.dot(final_weights, X_test.T) + final_bias
print('Outputs for the final model:', y_final)

Outputs for our untrained model: [8.33427394 8.02859111 8.0945906  8.33655977]
Outputs for the final model: [5.37346671 5.69992782 5.62121443 5.22919272]


### Loss function

We use the mean squared error to calculate the loss of the model outputs. It is defined as follows:

$$MSE = \frac{1}{n}\sum_{i=1}^n (y_i-y_{i_{pred}})^2$$

In [10]:
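# Note: np.sum already reduces to a scalar, so np.mean is a no-op here.
# This returns the sum of squared errors (n * MSE), which is the scale of the losses printed in this notebook.
mse = lambda y, y_pred: np.mean(np.sum((y-y_pred)**2))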
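
The formula above averages over the $n$ samples. A minimal sketch of a helper that applies `np.mean` directly to the squared errors, so that the result is divided by $n$; the name `mse_mean` and the sample values are illustrative, and the helper is not used for the losses printed in this notebook:

```python
import numpy as np

def mse_mean(y, y_pred):
    """Mean of the squared errors, matching the formula above."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y - y_pred) ** 2)

y_true_demo = np.array([1.0, 2.0, 3.0, 4.0])  # illustrative targets
y_pred_demo = np.array([1.5, 2.0, 2.0, 4.0])  # illustrative predictions

# Squared errors: 0.25, 0, 1, 0 -> sum 1.25, mean 0.3125
print(mse_mean(y_true_demo, y_pred_demo))
```

Because the mean does not grow with the number of samples, losses computed this way can be compared across data sets of different sizes.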

In [11]:
# In our example the loss for our untrained model is:
loss_untrained = mse(y_test, y_untrained)
print('The loss of the untrained model is:', loss_untrained)

# Loss for the final model
loss_final = mse(y_test, y_final)
print('The loss of the final model is:', loss_final)

The loss of the untrained model is: 28.132566035326086
The loss of the final model is: 0.03036766416783028


## Your stochastic gradient descent implementation to optimize the weights of your model

In [12]:
# Summary on what we know so far:

# We know the loss function: variable 'mse' (mean squared error)
# We know the initial weights that we want to optimize: variable 'weights'
# We know the initial bias value: variable 'bias'

In [13]:
# Use the training data to optimize the weights of the linear regression model

# use these variables for your sgd implementation
learning_rate = 0.005
iterations = 1000

In [14]:
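# YOUR CODE FOR THE STOCHASTIC GRADIENT DESCENT IMPLEMENTATION

def sgd(X, y, weights, bias, learning_rate, iterations):
    """Fit weights and bias with per-sample gradient updates.

    Each of the `iterations` passes visits every training sample once, in order.
    """
    # Work on a float copy so the caller's weight array is not modified
    weights = np.array(weights, dtype=float)

    for _ in range(iterations):
        for j in range(len(y)):
            # Select one data point
            x_i = X[j, :]
            y_i = y[j]

            # Compute prediction
            y_pred = np.dot(weights, x_i) + bias

            # Compute error
            error = y_pred - y_i

            # Update weights and bias (step against the gradient of the squared error)
            weights -= learning_rate * error * x_i
            bias = bias - learning_rate * error

    return weights, bias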
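
The update in the inner loop of `sgd` follows from the gradient of the squared error of a single sample. Write $e_i = y_{i_{pred}} - y_i$ for the error and $\eta$ for `learning_rate` (both symbols are introduced here for the derivation), and let $x_{ik}$ be feature $k$ of sample $i$:

$$L_i = (y_i - y_{i_{pred}})^2, \qquad \frac{\partial L_i}{\partial w_k} = 2\,e_i\,x_{ik}, \qquad \frac{\partial L_i}{\partial b} = 2\,e_i$$

$$w_k \leftarrow w_k - \eta\,e_i\,x_{ik}, \qquad b \leftarrow b - \eta\,e_i$$

The code drops the constant factor 2. This is equivalent to minimising $\frac{1}{2}L_i$, or to folding the 2 into the learning rate.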

## Compare the results with the test data

In [16]:
# Reset the bias to its initial value (as a float)
bias = 1.0

# Train the model using SGD
weights, bias = sgd(X_train, y_train, weights, bias, learning_rate, iterations)

# Evaluate the trained model
y_trained = np.dot(weights, X_test.T) + bias
loss_trained = mse(y_test, y_trained)
print('Outputs for trained model:', y_trained)
print('The loss of the trained model is:', loss_trained)

Outputs for trained model: [5.41987761 5.73080406 5.65599295 5.28015835]
The loss of the trained model is: 0.008702463347969444
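
The `sgd` function above visits the training samples in the same order in every pass. SGD is commonly run with the samples reshuffled each epoch. A hedged sketch of that variant follows. The name `sgd_shuffled`, the `seed` parameter, and the small noise-free data set are assumptions made for this example and are not part of the exercise, so the results will not match those printed above.

```python
import numpy as np

def sgd_shuffled(X, y, weights, bias, learning_rate, iterations, seed=0):
    """Per-sample SGD that reshuffles the sample order every epoch (sketch)."""
    rng = np.random.default_rng(seed)
    weights = np.array(weights, dtype=float)  # copy; the caller's array is untouched
    bias = float(bias)
    for _ in range(iterations):
        # Visit every sample once per epoch, in a fresh random order
        for j in rng.permutation(len(y)):
            error = np.dot(weights, X[j]) + bias - y[j]
            weights -= learning_rate * error * X[j]
            bias -= learning_rate * error
    return weights, bias

# Noise-free synthetic data with known parameters (illustrative values)
rng = np.random.default_rng(1)
X_demo = rng.integers(10, size=(40, 3)).astype(float)
w_true = np.array([0.5, 0.3, 0.2])
b_true = 0.2
y_demo = X_demo @ w_true + b_true

w_fit, b_fit = sgd_shuffled(X_demo, y_demo, np.zeros(3), 1.0,
                            learning_rate=0.005, iterations=1000)
print(w_fit, b_fit)
```

On noise-free data the fitted parameters should approach `w_true` and `b_true`, which gives a quick check that the update rule is wired up correctly.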
