#### The following implementation of the perceptron uses the numerical definition of the derivative for the gradient descent algorithm. A small value of epsilon was used to iterate between the W weights, since the Ws are the values we want to optimize in the cost function:

$$
 \frac{f(x+\varepsilon )-f(x)}{\varepsilon }
$$

#### W's are updated using the following formula:

$$
W_{i+1}=W_i - \eta \frac{\partial J}{\partial W_i} 
$$

#### The J-cost function is determined by:

$$
(y-\hat{y})^2
$$

#### where $y$ is the actual value and $ \hat{y} $ is the predicted value.

In [1]:
import numpy as np

### We are going to create 2 functions
1.- The **outputFunction**: which is performed by the dot product between the **X** inputs and the **W** weights

2.- The **transferFunction**: which will use the logistics function


In [2]:
def outputFunction(weights, inputs):
    return np.dot(weights, inputs)


def transferFunction(activation):
    return 1.0 / (1.0 + np.exp(-activation))

In [3]:
input_dim = 2 #Number of neurons
learning_rate = 0.1 #eta learning rate for the gradient descent
numIte=2000 # number of iterations for learning, epoch
eps = 1e-4 #epsilon for the numerical derivative

#### We assign the weights for the AND gate

In [4]:
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]]) #Our input X
y = np.array([0, 0, 0, 1]) #Our output Y


#### We integrate the bias as one more input to the set of X with a value of 1 and add a W for the additional bias

In [5]:
W = np.random.rand(input_dim + 1) #we generate random values for the weights
bias = np.ones((np.size(X,0),1)) 

X = np.concatenate((X, bias), axis=1)
training_count = np.size(X,0)
X


array([[0., 0., 1.],
       [0., 1., 1.],
       [1., 0., 1.],
       [1., 1., 1.]])

In [6]:

W_Delta = np.array(W)
NewW = np.array(W)


In [7]:
print(W) #First Weights
print('__________________')
print(transferFunction(outputFunction(X, W))) #First Output

[0.785307   0.27876964 0.65134538]
__________________
[0.65731358 0.71709862 0.80793572 0.84753822]


In [8]:
for i in range(0, numIte):

    for j in range(0, training_count):
        pred = transferFunction(outputFunction(X[j, :], W))
        error=(y[j]-pred)**2
        #print(f'{j} {error}')
        for k in range(0, np.size(W)):
            W_Delta = np.array(W)
            W_Delta[k] = W_Delta[k] + eps #The epsilon is iterated between the W
            
            #The output, yHat & yHatDelta is calculated using the W and the W+epsilon
            yHat = transferFunction(outputFunction(X[j, :], W))
            yHatDelta = transferFunction(outputFunction(X[j, :], W_Delta)) 

            errorDerivado_delta = (y[j] - yHatDelta)**2
            errorDerivado = (y[j] - yHat)**2
            NewW[k] = W[k] - learning_rate * (((errorDerivado_delta) - (errorDerivado)) / eps) 
        #print(f'**********')
        W = NewW
    #print('__________________')



In [9]:
print(W) #Final Weights
print('__________________')
print(transferFunction(outputFunction(X, W))) #Final Output

[ 4.38623618  4.38069416 -6.67184142]
__________________
[0.00126446 0.0918588  0.09232217 0.89042493]
