### Gradient Descent

This notebook demonstrates how gradient descent works, using the same example as in the slides: learning the price per drink from two observations.

In [1]:
import numpy as np
from matplotlib import pyplot as plt

# Training data
x = np.array([1, 2])     # Number of drinks
y = np.array([6, 12])    # Price in euros

#### Definition of Neuron

In [2]:
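class Neuron:
    def __init__(self, weight=4):
        self.weight = weight  # Single trainable parameter (initial guess: 4)

    def __call__(self, x):
        # Linear neuron without bias: prediction = weight * input
        return self.weight * x

    def __repr__(self):
        return f'Neuron weight: {self.weight:.5f}'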

#### Definition of Model

We also need to define the neural network (or model) itself. In our simple case, the model consists of a single neuron.

Moreover, in order to run the optimization, we define the loss function and its derivative with respect to **all** of the network parameters (the so-called gradient). Here there is only one parameter, the weight of the neuron.
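
For reference, the loss and gradient used in this notebook can be written out explicitly. With a single weight $w$ and training pairs $(x_i, y_i)$ (the symbols are notation introduced here, not taken from the slides), the sum-of-squares loss and its derivative are:

$$
L(w) = \sum_i (w x_i - y_i)^2, \qquad \frac{dL}{dw} = \sum_i 2\,(w x_i - y_i)\,x_i
$$

For the initial weight $w = 4$ and the training data above, this gives $L = (4 - 6)^2 + (8 - 12)^2 = 20$ and $\frac{dL}{dw} = 2(-2)(1) + 2(-4)(2) = -20$.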

In [3]:
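class Network:
    def __init__(self):
        self.neuron = Neuron()
        self.grad = None  # Set as a side effect of loss()

    def loss(self, x, y):
        # Residuals between the predictions and the targets
        error = self.neuron(x) - y
        # Derivative of the loss with respect to the weight
        self.grad = np.sum(2 * error * x)
        # Sum-of-squares loss
        return np.sum(error**2)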
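
A quick way to gain confidence in a hand-derived gradient is to compare it with a finite-difference estimate. The sketch below does this for the loss used here; the helper names `loss`, `analytic_grad` and `numeric_grad` are hypothetical and only serve this illustration.

```python
import numpy as np

# Same training data as above
x = np.array([1, 2])
y = np.array([6, 12])

def loss(w):
    # Sum-of-squares loss for a single weight w
    return np.sum((w * x - y) ** 2)

def analytic_grad(w):
    # Hand-derived derivative of the loss with respect to w
    return np.sum(2 * (w * x - y) * x)

def numeric_grad(w, eps=1e-6):
    # Central finite-difference approximation of the derivative
    return (loss(w + eps) - loss(w - eps)) / (2 * eps)

w = 4.0
print(analytic_grad(w), numeric_grad(w))
```

The two values should agree up to rounding error; a large gap would point to a mistake in the derivative.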

#### Training

We will now apply the gradient descent algorithm to find the parameters that minimize the loss we defined earlier.
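
As a worked derivation (with $\eta$ denoting the learning rate and $w^*$ the minimizer, both notation introduced here), each gradient descent step updates the weight as

$$
w_{t+1} = w_t - \eta \, \frac{dL}{dw}\bigg|_{w = w_t}
$$

Because the loss is quadratic in $w$, the gradient can be rewritten as $\frac{dL}{dw} = 2 \left(\sum_i x_i^2\right)(w - w^*)$ with $w^* = \frac{\sum_i x_i y_i}{\sum_i x_i^2} = \frac{30}{5} = 6$, so that

$$
w_{t+1} - w^* = \left(1 - 2 \eta \sum_i x_i^2\right)(w_t - w^*)
$$

With $\eta = 0.05$ and $\sum_i x_i^2 = 5$, the factor is $1 - 0.5 = 0.5$: the distance to $w^* = 6$ halves at every step, starting from $w_0 = 4$.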

In [4]:
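model = Network()

lr = 0.05  # Learning rate (step size)
for _ in range(1000):
    # Compute the loss; this also stores the gradient in model.grad
    loss = model.loss(x, y)
    # Step against the gradient
    model.neuron.weight = model.neuron.weight - lr * model.grad
    print(model.neuron, model.grad, loss)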

Neuron weight: 5.00000 -20 20
Neuron weight: 5.50000 -10.0 5.0
Neuron weight: 5.75000 -5.0 1.25
Neuron weight: 5.87500 -2.5 0.3125
Neuron weight: 5.93750 -1.25 0.078125
Neuron weight: 5.96875 -0.625 0.01953125
Neuron weight: 5.98438 -0.3125 0.0048828125
Neuron weight: 5.99219 -0.15625 0.001220703125
Neuron weight: 5.99609 -0.078125 0.00030517578125
Neuron weight: 5.99805 -0.0390625 7.62939453125e-05
Neuron weight: 5.99902 -0.01953125 1.9073486328125e-05
Neuron weight: 5.99951 -0.009765625 4.76837158203125e-06
Neuron weight: 5.99976 -0.0048828125 1.1920928955078125e-06
Neuron weight: 5.99988 -0.00244140625 2.980232238769531e-07
Neuron weight: 5.99994 -0.001220703125 7.450580596923828e-08
Neuron weight: 5.99997 -0.0006103515625 1.862645149230957e-08
Neuron weight: 5.99998 -0.00030517578125 4.6566128730773926e-09
Neuron weight: 5.99999 -0.000152587890625 1.1641532182693481e-09
Neuron weight: 6.00000 -7.62939453125e-05 2.9103830456733704e-10
Neuron weight: 6.00000 -3.814697265625e-05 7.275
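
The output shows that the updates become negligible after about 20 iterations, even though the loop runs 1000 times. One common alternative is to stop once the gradient is small enough. The following is a minimal sketch of that idea, not part of the original example; the function name `gradient_descent` and the parameters `tol` and `max_steps` are assumptions made for illustration.

```python
import numpy as np

def gradient_descent(x, y, w0=4.0, lr=0.05, tol=1e-6, max_steps=1000):
    """Minimise sum((w * x - y)**2) over w; stop once |gradient| < tol."""
    w = w0
    for step in range(max_steps):
        grad = np.sum(2 * (w * x - y) * x)
        if abs(grad) < tol:
            # Converged: return the weight and the number of updates made
            return w, step
        w = w - lr * grad
    return w, max_steps

x = np.array([1, 2])
y = np.array([6, 12])
w, steps = gradient_descent(x, y)
print(f"weight = {w:.6f} after {steps} updates")
```

The tolerance trades accuracy against the number of iterations; `max_steps` only acts as a safety net in case the learning rate is too large for the iteration to converge.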

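### TensorFlow Implementation

Let's now implement the same neural network model using TensorFlow. Note that the `Dense` layer below is created with `use_bias=True`, so unlike our `Neuron` it also learns a bias term.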

In [3]:
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Input

In [6]:
inputs = Input(shape=(1,))
outputs = Dense(1, activation="linear", use_bias=True)(inputs)
model = Model(inputs, outputs)

model.summary()

#### Model Training

In [8]:
model.compile(optimizer='sgd', loss='mean_squared_error')
model.fit(x, y, epochs=2000, batch_size=1)

Epoch 1/2000
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 7.0891e-06
Epoch 2/2000
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 5.1280e-06
Epoch 3/2000
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 5.1167e-06
Epoch 4/2000
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 5.1029e-06
Epoch 5/2000
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 5.0868e-06
Epoch 6/2000
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 6.8288e-06
Epoch 7/2000
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 6.8050e-06
Epoch 8/2000
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step - loss: 6.7801e-06
Epoch 9/2000
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 4.9321e-06
Epoch 10/2000
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s

<keras.src.callbacks.history.History at 0x13fa9b4a0>

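After the training, we can check the parameters that the network has learnt. This cell has to run in the same session as the training cell above; otherwise `model` is not defined.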

In [1]:
model.layers[1].weights[0].numpy()

NameError: name 'model' is not defined
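
To get a feeling for what `model.fit` does with `optimizer='sgd'`, `loss='mean_squared_error'` and `batch_size=1`, the training can be approximated in plain NumPy. This is only a rough sketch under stated assumptions: it uses the Keras defaults for SGD (learning rate 0.01) and reshuffles the samples every epoch, but it starts from `w = 0`, `b = 0` rather than from Keras' random kernel initialisation, so the exact numbers will differ from the Keras run.

```python
import numpy as np

x = np.array([1.0, 2.0])
y = np.array([6.0, 12.0])

w, b = 0.0, 0.0   # Assumed starting point (Keras initialises the kernel randomly)
lr = 0.01         # Default learning rate of the Keras SGD optimizer
rng = np.random.default_rng(0)

for epoch in range(2000):
    # batch_size=1: one parameter update per training sample
    for i in rng.permutation(len(x)):
        err = w * x[i] + b - y[i]   # Prediction error for this sample
        w -= lr * 2 * err * x[i]    # d(err**2)/dw
        b -= lr * 2 * err           # d(err**2)/db

print(f"w = {w:.3f}, b = {b:.3f}")
```

Since the data lie exactly on the line `y = 6 * x`, the weight should end up close to 6 and the bias close to 0.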