<hr style="border-top: 3px solid black"></hr>
<center>
<h1><span style="color:black">Coding an Autoassociator from Scratch</span></h1>
</center>
<hr style="border-top: 3px solid black"></hr>

# 1°) With the logistic as activation function (binary inputs)

**Activation function and its derivative:**
*Logistic*

$ S(x) = \frac{1}{1 + e^{-x}} $

$ S'(x) = S(x) \cdot (1 - S(x)) $
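
As a quick sanity check on the two formulas above, the sketch below compares the analytic derivative with a central finite difference. The sample points and the step size `h` are illustrative assumptions, not values used elsewhere in this notebook.

```python
import numpy as np

def logistic(x):
    """Logistic function S(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def logistic_prime(x):
    """Analytic derivative S'(x) = S(x) * (1 - S(x))."""
    s = logistic(x)
    return s * (1.0 - s)

# Compare the analytic derivative with a central finite difference.
x = np.linspace(-4.0, 4.0, 9)   # illustrative sample points
h = 1e-5                        # illustrative step size
numeric = (logistic(x + h) - logistic(x - h)) / (2.0 * h)
max_gap = np.max(np.abs(numeric - logistic_prime(x)))
print("largest gap between analytic and numeric derivative:", max_gap)
```

The derivative peaks at $x = 0$, where $S(0) = 0.5$ and therefore $S'(0) = 0.25$.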

**Error function:** 
*The Mean Squared Error*

$\text{MSE}(y_{\text{true}}, y_{\text{pred}}) = \frac{1}{n} \sum_{i=1}^{n} (y_{\text{pred},i} - y_{\text{true},i})^2$
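
To make the formula concrete, here is a small worked example. The two vectors are made-up values chosen so that the arithmetic is easy to follow by hand.

```python
import numpy as np

def mse_loss(y_true, y_pred):
    """Mean of the squared element-wise differences."""
    return np.mean((y_pred - y_true) ** 2)

# Illustrative values (not taken from the training data below).
y_true = np.array([0.0, 1.0, 1.0])
y_pred = np.array([0.1, 0.8, 0.6])

# Squared errors are 0.01, 0.04 and 0.16, so the mean is 0.21 / 3 = 0.07.
loss = mse_loss(y_true, y_pred)
print("MSE:", loss)
```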

| Phase             | Formulas |
|-------------------|----------|
| **Forward pass:** | *Weighted Sum:* $z_1 = p \cdot w_1 + b_1$ <br> $a_1 = \text{S}(z_1)$ |
| **Backward pass:** | $\delta z_1 = (a_1 - t) \cdot \text{S'}(z_1)$ <br> $\frac{\partial \text{loss}}{\partial w_1} = \frac{p^T \cdot \delta z_1}{\text{len}(p)}$ <br> $\frac{\partial \text{loss}}{\partial b_1} = \frac{\text{sum}(\delta z_1)}{\text{len}(p)}$ |
| **Delta Rule:**   | $w_1 = w_1 - \text{lr} \cdot \frac{\partial \text{loss}}{\partial w_1}$ <br> $b_1 = b_1 - \text{lr} \cdot \frac{\partial \text{loss}}{\partial b_1}$ |
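
The three phases in the table can be sketched as a single update on one pattern. This is a minimal illustration, not the training loop used below: the seed, the weight scale and the learning rate `lr = 0.5` are assumptions chosen so that the effect of one step is visible.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(p, w1, b1):
    """Forward pass: weighted sum followed by the logistic."""
    z1 = p @ w1 + b1
    return z1, logistic(z1)

rng = np.random.default_rng(0)            # illustrative seed
p = np.array([0.0, 1.0, 1.0])             # input pattern
t = p                                     # an autoassociator reproduces its input
w1 = rng.normal(scale=0.1, size=(3, 3))   # illustrative initial weights
b1 = np.zeros((1, 3))
lr = 0.5                                  # illustrative learning rate

# Forward pass
z1, a1 = forward(p, w1, b1)
loss_before = np.mean((a1 - t) ** 2)

# Backward pass (a1 * (1 - a1) is S'(z1), since a1 = S(z1))
dz1 = (a1 - t) * a1 * (1.0 - a1)
dw1 = p.reshape(-1, 1) @ dz1 / len(p)
db1 = np.sum(dz1, axis=0, keepdims=True) / len(p)

# Delta rule
w1 = w1 - lr * dw1
b1 = b1 - lr * db1

_, a1_new = forward(p, w1, b1)
loss_after = np.mean((a1_new - t) ** 2)
print("loss before:", loss_before, "loss after:", loss_after)
```

After one step the loss on this pattern should be slightly lower than before, which is all a single delta-rule update is expected to achieve.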


In the *backward pass*, the input pattern $p$ is reshaped into a column vector so that it can be multiplied with the gradient $\delta z_1$. In matrix multiplication, the number of columns of the first matrix must equal the number of rows of the second matrix. So $p$, which is first defined as a row vector of shape $(1, \text{elements})$, has to become a column vector of shape $(\text{elements}, 1)$.
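
The shape bookkeeping described above can be checked directly in NumPy. The values in `dz1` are made up for illustration; only the shapes matter here.

```python
import numpy as np

p = np.array([0.0, 1.0, 1.0])           # pattern, shape (3,)
dz1 = np.array([[0.2, -0.1, 0.05]])     # illustrative gradient, shape (1, 3)

p_col = p.reshape(-1, 1)                # column vector, shape (3, 1)
dw1 = np.dot(p_col, dz1)                # (3, 1) times (1, 3) gives (3, 3)

print("p_col shape:", p_col.shape)
print("dw1 shape:", dw1.shape)
print(dw1)
```

The result is the outer product of $p$ and $\delta z_1$: entry $(i, j)$ is $p_i \cdot \delta z_{1,j}$, so a row of `dw1` is zero wherever the corresponding input element is zero.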
<br><br>
A dot product takes two sequences of the same length and outputs a single number: the sum of the products of the corresponding elements of the two vectors. Here, with $p = (0, 1)$ and $w = (0.5, 0.2)$, we compute $z = p \cdot w = 0 \times 0.5 + 1 \times 0.2 = 0.2$.
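
The same computation in NumPy, next to an explicit element-by-element version, using the values of $p$ and $w$ from the paragraph above:

```python
import numpy as np

p = np.array([0, 1])
w = np.array([0.5, 0.2])

# Built-in dot product
z = np.dot(p, w)

# The same thing written out: multiply matching elements, then sum
z_manual = sum(p_i * w_i for p_i, w_i in zip(p, w))

print("np.dot:", z, "manual:", z_manual)
```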
<br><br>
We divide by $\text{len}(p)$ to average the gradients. This normalizes the gradients with respect to the number of elements in the input vector.
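
One way to read this normalization: in an autoassociator the output has as many elements as the input, so $\text{len}(p)$ equals the $n$ in the MSE. The formula above then matches the true gradient of the MSE up to the constant factor 2 that comes from differentiating the square. The sketch below checks this numerically; the seed, initial weights and step size `h` are illustrative assumptions.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def loss_fn(w1, b1, p, t):
    """MSE between the network output and the target."""
    a1 = logistic(p @ w1 + b1)
    return np.mean((a1 - t) ** 2)

rng = np.random.default_rng(1)            # illustrative seed
p = np.array([0.0, 1.0, 1.0])
t = p                                     # autoassociator: target equals input
w1 = rng.normal(scale=0.1, size=(3, 3))
b1 = np.zeros((1, 3))

# Gradient from the backward-pass formula
a1 = logistic(p @ w1 + b1)
dz1 = (a1 - t) * a1 * (1.0 - a1)
dw1 = p.reshape(-1, 1) @ dz1 / len(p)

# Gradient of the MSE estimated by central finite differences
h = 1e-6
numeric = np.zeros_like(w1)
for i in range(w1.shape[0]):
    for j in range(w1.shape[1]):
        w_plus, w_minus = w1.copy(), w1.copy()
        w_plus[i, j] += h
        w_minus[i, j] -= h
        numeric[i, j] = (loss_fn(w_plus, b1, p, t) - loss_fn(w_minus, b1, p, t)) / (2.0 * h)

matches = np.allclose(numeric, 2.0 * dw1, atol=1e-7)
print("numeric gradient equals 2 * dw1:", matches)
```

Because the missing factor 2 is a constant, it only rescales the step and can be absorbed into the learning rate.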

In [1]:
import numpy as np
import matplotlib.pyplot as plt

In [2]:
#define the patterns:
p1 = [0,0]
p2 = [0,1]
p3 = [1,0]
p4 = [1,1]
p5 = [0.5,0.5]
p6 = [0.1,0.5]
p_list = [p1,p2,p3,p4,p5,p6]
patterns = np.asarray(p_list)

t1 = 0.3
t2 = 0.3
t3 = 0.1
t4 = 0.3
t5 = 0.7
t6 = 0.4
targets = np.asarray([t1,t2,t3,t4,t5,t6])

input_shape = 2
output_shape= 1
epochs = 500
learning_rate = 0.01
lr = learning_rate

In [3]:
# binary inputs
training_patterns = np.array([
    [0, 1, 1],
    [1, 1, 0],
    [0, 0, 0]])

test_patterns = np.array([
    [0, 0.5, 1],
    [1, 0.5, 0],
    [0, 1, 0]])

input_shape = 3
output_shape = 3
epochs = 90000000000
learning_rate = 0.001

def logistic(x):
    return 1/(1 + np.exp(-x))
def logisticprime(x):
    return logistic(x)*(1-logistic(x))


def initialize_parameters(input_shape, output_shape):
    # small random weights, then min-max rescaled to the range [0, 1]
    wlayer1 = np.random.randn(input_shape, output_shape) * 0.1
    wlayer1 = (wlayer1 - np.min(wlayer1)) / (np.max(wlayer1) - np.min(wlayer1))
    # one bias per output unit
    blayer1 = np.zeros((1, output_shape))
    return {'w1': wlayer1, 'b1':blayer1}

def mse_loss(y_true, y_pred):
    return np.mean((y_pred - y_true) ** 2)
    
def fpass(p, parameters, activation_function):
    z1 = np.dot(p, parameters['w1']) + parameters['b1']
    activation1 = activation_function(z1)
    return {'z1': z1, 'a1': activation1}
    
def bpass(p, t, parameters, cache, activation_prime):
    dz1 = (cache['a1'] - t) * activation_prime(cache['z1'])
    p_reshaped = p.reshape(-1, 1)  # p must be a column vector
    dw1 = np.dot(p_reshaped, dz1) / len(p)
    db1 = np.sum(dz1, axis=0, keepdims=True) / len(p)
    return {'dw1': dw1, 'db1': db1}

def delta_rule(parameters, grads, lr=learning_rate):
    parameters['w1'] -= lr * grads['dw1']
    parameters['b1'] -= lr * grads['db1']
    return parameters

def train(activation_function, activation_prime, inputs=training_patterns, num_epochs=epochs, lr=learning_rate, input_shape=input_shape, output_shape=output_shape):
    parameters = initialize_parameters(input_shape, output_shape)
    for epoch in range(num_epochs):
        total_loss = 0
        list_loss = []
        for p in inputs:
            cache = fpass(p, parameters, activation_function=activation_function)
            loss = mse_loss(p, cache['a1'])
            total_loss += loss
            list_loss.append(loss)
            grads = bpass(p, p, parameters, cache=cache, activation_prime=activation_prime)
            parameters = delta_rule(parameters, grads=grads, lr=lr)
        average_loss = total_loss / len(inputs)
        if epoch % 200 == 0:
            print("Epoch number:", epoch, "MSE_loss:", average_loss)
        # #stopping criterion 1: 
        # if average_loss < 0.001:
        #     print("Epoch number:", epoch, "MSE_loss:", average_loss)
        #     break
        #stopping criterion 2: PREFERABLE
        if np.max(list_loss) < 0.015:
            print("Epoch number:", epoch, "MSE_loss:", average_loss, "Max_loss:", np.max(list_loss))
            break
    return parameters

trained_parameters = train(activation_function=logistic, activation_prime=logisticprime, inputs=training_patterns)

Epoch number: 0 MSE_loss: 0.27411583891362784
Epoch number: 200 MSE_loss: 0.27184121330768657
Epoch number: 400 MSE_loss: 0.2695666135988211
Epoch number: 600 MSE_loss: 0.2672917534273627
Epoch number: 800 MSE_loss: 0.26501644696760257
Epoch number: 1000 MSE_loss: 0.26274060822615736
Epoch number: 1200 MSE_loss: 0.2604642501337386
Epoch number: 1400 MSE_loss: 0.25818748338338865
Epoch number: 1600 MSE_loss: 0.25591051496621653
Epoch number: 1800 MSE_loss: 0.253633646355183
Epoch number: 2000 MSE_loss: 0.25135727128864443
Epoch number: 2200 MSE_loss: 0.24908187310822497
Epoch number: 2400 MSE_loss: 0.2468080216101963
Epoch number: 2600 MSE_loss: 0.24453636937589682
Epoch number: 2800 MSE_loss: 0.2422676475547703
Epoch number: 3000 MSE_loss: 0.24000266108325197
Epoch number: 3200 MSE_loss: 0.23774228333379535
Epoch number: 3400 MSE_loss: 0.23548745020059056
Epoch number: 3600 MSE_loss: 0.23323915364166328
Epoch number: 3800 MSE_loss: 0.2309984347107001
Epoch number: 4000 MSE_loss: 0.2287

In [4]:
# testing on input patterns, and reinjecting on the third to denoise it
test_patterns = np.array([
    [-1, -1, 1],
    [1, 1, -1],
    [-0.7, -0.8, 0.6]])
attractor = fpass(test_patterns[0], trained_parameters, activation_function=logistic)
print('SUPPOSED ATTRACTOR: ',np.round(attractor['a1'],4))
p = test_patterns[2]
cache = fpass(p, trained_parameters, activation_function=logistic)
print('STILL NOT REINJECTED: ',np.round(cache['a1'],4))
print('#########################################################')
cache = fpass(p, trained_parameters, activation_function=logistic)
for reinjection_number in range(1, 11):
    # feed the previous output back in as the next input
    cache = fpass(cache['a1'], trained_parameters, activation_function=logistic)
    mse_error = np.mean((cache['a1'] - attractor['a1']) ** 2)
    # note: this is the element-wise absolute difference, not a true Euclidean norm
    euclidian_distance = np.absolute(np.subtract(cache['a1'], attractor['a1']))
    mean_distance = np.mean(euclidian_distance)
    print('Reinjection n°:', reinjection_number, ' Output: ',np.round(cache['a1'],4),
          'MSError:', np.round(mse_error,6))
    print('Euclidian distance:', np.round(euclidian_distance,4), 'Mean distance:',np.round(mean_distance,4))
    print('#########################################################')

SUPPOSED ATTRACTOR:  [[4.000e-04 1.210e-02 8.323e-01]]
STILL NOT REINJECTED:  [[0.0025 0.017  0.5164]]
#########################################################
Reinjection n°: 1  Output:  [[0.0574 0.3001 0.4047]] MSError: 0.089669
Euclidian distance: [[0.057  0.288  0.4276]] Mean distance: 0.2575
#########################################################
Reinjection n°: 2  Output:  [[0.1033 0.4487 0.3653]] MSError: 0.139752
Euclidian distance: [[0.1028 0.4365 0.467 ]] Mean distance: 0.3355
#########################################################
Reinjection n°: 3  Output:  [[0.1416 0.5481 0.3542]] MSError: 0.178596
Euclidian distance: [[0.1411 0.536  0.4781]] Mean distance: 0.3851
#########################################################
Reinjection n°: 4  Output:  [[0.1736 0.6225 0.3559]] MSError: 0.209815
Euclidian distance: [[0.1731 0.6103 0.4764]] Mean distance: 0.42
#########################################################
Reinjection n°: 5  Output:  [[0.2003 0.6799 0.3637]] MSEr