Design a linear neuron to perform the following mapping:

$x = (x_1, x_2, x_3)$ | $y$
-|-
$(0.09, -0.44, -0.15)$ | $-2.57$
$(0.69, -0.99, -0.76)$ | $-2.97$
$(0.34, 0.65, -0.73)$ | $0.96$
$(0.15, 0.78, -0.58)$ | $1.04$
$(-0.63, -0.78, -0.56)$ | $-3.21$
$(0.96, 0.62, -0.66)$ | $1.05$
$(0.63, -0.45, -0.14)$ | $-2.39$
$(0.88, 0.64, -0.33)$ | $0.66$

In [4]:
import random

import matplotlib.pyplot as plt
import tensorflow as tf


training_data = {
    (0.09, -0.44, -0.15): -2.57,
    (0.69, -0.99, -0.76): -2.97,
    (0.34, 0.65, -0.73): 0.96,
    (0.15, 0.78, -0.58): 1.04,
    (-0.63, -0.78, -0.56): -3.21,
    (0.96, 0.62, -0.66): 1.05,
    (0.63, -0.45, -0.14): -2.39,
    (0.88, 0.64, -0.33): 0.66,
}


class LinearNeuron:
    def __init__(self, weights: tf.Variable, bias: tf.Variable):
        '''
        Constructs a linear neuron

        :params:
        - weights (tf.Variable): initial weights
        - bias (tf.Variable): initial bias
        '''
        self.weights = weights
        self.bias = bias

    def __call__(self, neuron_input):
        '''
        Get output of this neuron, from given input

        :params:
        - neuron_input: input to this neuron
        '''
        return tf.tensordot(neuron_input, self.weights, axes=1) + self.bias

    def update_params(
        self,
        train_inputs,
        train_outputs,
        method: str = 'sgd',
        learning_rate: float = 0.01,
    ) -> float:
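        '''
        One epoch of training (a single pass over all training patterns)

        :params:
        - train_inputs: sequence of input vectors, one tensor per training pattern
        - train_outputs: sequence of target outputs, aligned with train_inputs
        - method (str): method to use for training the neuron - 'sgd' or 'gd'
          (default: 'sgd'; only 'sgd' is implemented in this cell)
        - learning_rate (float): how much to learn at this step (default: 0.01)

        :return:
        - loss: mean over the patterns of the half squared error
          0.5 * (target - output)**2, each measured before that pattern's update
        '''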
        if method == 'sgd':
            train_data = list(zip(train_inputs, train_outputs))
            random.shuffle(train_data)

            sse = 0  # sum squared error

            for train_input, train_output in train_data:
                output = self(train_input)
                learning_amount = learning_rate * (train_output - output)

                self.weights = self.weights + learning_amount * train_input
                self.bias = self.bias + learning_amount

                squared_error = 0.5 * (train_output - output)**2
                sse += squared_error

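            return sse / len(train_data)

        # 'gd' is documented above but has no branch in this cell; fail loudly
        # instead of silently returning None.
        raise NotImplementedError(f"method '{method}' is not implemented")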

Show one iteration of learning of the neuron with  
(a) Stochastic gradient descent learning  
(b) Gradient descent learning

Initialize the weights to $\begin{pmatrix} 0.77 \\ 0.02 \\ 0.63 \end{pmatrix}$ and the bias to $0.0$, and use a learning factor $\alpha = 0.01$.
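
For reference, the two learning rules can be written out for a linear neuron $y = w^\top x + b$ with per-pattern cost $\frac{1}{2}(d - y)^2$, which is the cost the SGD code above uses. The summed (rather than averaged) form for (b) is an assumption; some texts divide by the number of patterns.

$$
\begin{aligned}
\text{(a) SGD, for each pattern } p:\quad & w \leftarrow w + \alpha\,(d_p - y_p)\,x_p, \qquad b \leftarrow b + \alpha\,(d_p - y_p) \\
\text{(b) GD, over all patterns:}\quad & w \leftarrow w + \alpha \sum_p (d_p - y_p)\,x_p, \qquad b \leftarrow b + \alpha \sum_p (d_p - y_p)
\end{aligned}
$$

As an illustration only (SGD visits the patterns in random order), suppose the first pattern visited were $x = (0.09, -0.44, -0.15)$ with $d = -2.57$:

$$
\begin{aligned}
y &= 0.77(0.09) + 0.02(-0.44) + 0.63(-0.15) + 0.0 = -0.034 \\
\alpha\,(d - y) &= 0.01\,(-2.57 + 0.034) = -0.02536 \\
w &\leftarrow (0.77,\ 0.02,\ 0.63) - 0.02536\,(0.09,\ -0.44,\ -0.15) \approx (0.7677,\ 0.0312,\ 0.6338) \\
b &\leftarrow 0.0 - 0.02536 \approx -0.0254
\end{aligned}
$$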

In [15]:
initial_weights = tf.Variable([0.77, 0.02, 0.63])
initial_bias = tf.Variable(0.0)

train_inputs = [
    tf.Variable(train_input)
    for train_input in training_data.keys()
]
train_outputs = [
    tf.Variable(train_output)
    for train_output in training_data.values()
]

# (a)
neuron1 = LinearNeuron(initial_weights, initial_bias)

# Use print (stdout) here as well; tf.print writes to stderr by default.
print(f'before sgd - weights: {neuron1.weights.numpy()}, bias: {neuron1.bias.numpy()}')

loss = neuron1.update_params(train_inputs, train_outputs)

print(f'after sgd - weights: {neuron1.weights.numpy()}, bias: {neuron1.bias.numpy()}')
print(f'loss: {loss}')

before sgd - weights: [0.77 0.02 0.63], bias: 0.0
after sgd - weights: [0.7588135  0.11362498 0.65190977], bias: -0.07202943414449692
loss: 1.9617705345153809
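
Part (b) has no corresponding cell above. As a minimal sketch, assuming the batch cost $J = \frac{1}{2}\sum_p (d_p - y_p)^2$ (so the per-pattern SGD updates are summed, not averaged), one gradient-descent iteration could look like the following. It uses NumPy instead of the `LinearNeuron` class, and `gd_step`, `X`, and `d` are illustrative names that are not part of the original notebook.

```python
import numpy as np

# Training patterns copied from the table at the top of the notebook.
X = np.array([
    [0.09, -0.44, -0.15],
    [0.69, -0.99, -0.76],
    [0.34, 0.65, -0.73],
    [0.15, 0.78, -0.58],
    [-0.63, -0.78, -0.56],
    [0.96, 0.62, -0.66],
    [0.63, -0.45, -0.14],
    [0.88, 0.64, -0.33],
])
d = np.array([-2.57, -2.97, 0.96, 1.04, -3.21, 1.05, -2.39, 0.66])


def gd_step(w, b, X, d, learning_rate=0.01):
    """One batch gradient-descent step for the linear neuron y = X @ w + b.

    Assumption: cost J = 0.5 * sum((d - y)**2), so the update is the sum of
    the per-pattern SGD updates, all evaluated at the same (old) parameters.
    """
    error = d - (X @ w + b)                     # one error per pattern
    w_new = w + learning_rate * (X.T @ error)   # sum over patterns of error * x
    b_new = b + learning_rate * error.sum()
    loss = 0.5 * np.mean(error ** 2)            # same loss definition as the SGD cell
    return w_new, b_new, loss


w1, b1, loss0 = gd_step(np.array([0.77, 0.02, 0.63]), 0.0, X, d)
print(f'after gd - weights: {w1}, bias: {b1}')
print(f'loss: {loss0}')
```

Unlike the SGD epoch, every pattern's error here is computed with the initial weights, so the reported loss is the loss of the untrained neuron.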


Plot the learning curves (mean square error vs. epochs) until convergence.
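
One way to produce such a curve is sketched below for batch gradient descent. It is written in NumPy rather than with the `LinearNeuron` class, and it assumes a simple stopping rule (loss change below `tol`, or `max_epochs` reached). The names `train_gd` and `plot_learning_curve` and both stopping parameters are hypothetical choices, not taken from the exercise.

```python
import numpy as np

# Training patterns copied from the table at the top of the notebook.
X = np.array([
    [0.09, -0.44, -0.15],
    [0.69, -0.99, -0.76],
    [0.34, 0.65, -0.73],
    [0.15, 0.78, -0.58],
    [-0.63, -0.78, -0.56],
    [0.96, 0.62, -0.66],
    [0.63, -0.45, -0.14],
    [0.88, 0.64, -0.33],
])
d = np.array([-2.57, -2.97, 0.96, 1.04, -3.21, 1.05, -2.39, 0.66])


def train_gd(X, d, w, b, learning_rate=0.01, tol=1e-10, max_epochs=50_000):
    """Batch gradient descent until the loss stops changing (assumed criterion).

    Returns the final weights, the final bias, and the per-epoch loss history.
    """
    history = []
    for _ in range(max_epochs):
        error = d - (X @ w + b)
        loss = 0.5 * np.mean(error ** 2)
        converged = bool(history) and abs(history[-1] - loss) < tol
        history.append(loss)
        if converged:
            break
        # Summed-gradient update, matching the per-pattern SGD rule.
        w = w + learning_rate * (X.T @ error)
        b = b + learning_rate * error.sum()
    return w, b, history


def plot_learning_curve(history):
    """Plot loss against epoch number."""
    # Imported here so the training loop also runs where matplotlib is absent.
    import matplotlib.pyplot as plt

    plt.plot(range(len(history)), history)
    plt.xlabel('epoch')
    plt.ylabel('half mean squared error')
    plt.title('Learning curve (batch gradient descent)')
    plt.show()


w, b, history = train_gd(X, d, np.array([0.77, 0.02, 0.63]), 0.0)
print(f'epochs run: {len(history)}')
print(f'learned weights: {w}, bias: {b}')
print(f'predicted y: {X @ w + b}')
```

Calling `plot_learning_curve(history)` in a notebook cell draws the curve. An SGD curve can be collected the same way by appending the value returned by `update_params` after each epoch.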

Determine the learned weights, biases, and the predicted values of y by the neuron.
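
As a sanity check on whatever values training converges to, the least-squares solution can be computed directly, since the squared-error cost of a linear neuron is convex. This sketch is a cross-check under that assumption, not the method the exercise asks for; `A`, `theta`, `w_ls`, and `b_ls` are illustrative names.

```python
import numpy as np

# Training patterns copied from the table at the top of the notebook.
X = np.array([
    [0.09, -0.44, -0.15],
    [0.69, -0.99, -0.76],
    [0.34, 0.65, -0.73],
    [0.15, 0.78, -0.58],
    [-0.63, -0.78, -0.56],
    [0.96, 0.62, -0.66],
    [0.63, -0.45, -0.14],
    [0.88, 0.64, -0.33],
])
d = np.array([-2.57, -2.97, 0.96, 1.04, -3.21, 1.05, -2.39, 0.66])

# Append a column of ones so the bias is estimated as a fourth coefficient.
A = np.hstack([X, np.ones((len(X), 1))])
theta, *_ = np.linalg.lstsq(A, d, rcond=None)
w_ls, b_ls = theta[:3], theta[3]
y_pred = A @ theta

print(f'least-squares weights: {w_ls}, bias: {b_ls}')
for target, prediction in zip(d, y_pred):
    print(f'target: {target:6.2f}  predicted: {prediction:6.2f}')
```

Weights and bias obtained by running gradient descent to convergence should approach these values, with small differences depending on the stopping rule.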