1. Implement a simple perceptron that learns the logical $AND$ and $OR$ functions with $2$ and $4$ inputs. Show how the error evolves during training. For the $2$-dimensional case, plot the discriminating line together with all of the network's input vectors.

![](img/perceptrón-simple1.png)

$AND$ with $2$ inputs:

| $x_1$ | $0$ | $0$ | $1$ | $1$ |
|-------|-----|-----|-----|-----|
| $x_2$ | $0$ | $1$ | $0$ | $1$ |
| $y$   | $0$ | $0$ | $0$ | $1$ |

In [1]:
import numpy as np
from matplotlib import pyplot as plt
np.random.seed(2002)

In [2]:
def AND(X):
    # 1 only when every input is 1
    return int(all(X))

def OR(X):
    # 1 when at least one input is 1
    return int(any(X))

In [3]:
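class PerceptronSimple:
    def __init__(self):
        # Two input weights plus one threshold weight, drawn uniformly from [0, 1)
        self.W = np.random.sample(3)
    def train(self, X, Y, alpha, iter_):
        for _ in range(iter_):
            for n in range(len(X)):
                a = self.predict(X[n])
                # Perceptron rule: update only when the sample is misclassified
                if a != Y[n]:
                    self.W[0] += alpha * (Y[n] - a) * X[n][0]
                    self.W[1] += alpha * (Y[n] - a) * X[n][1]
                    self.W[2] += alpha * (Y[n] - a) * (-1)
    def predict(self, x):
        # A constant -1 input turns the last weight into the threshold
        h = np.dot(np.append(x, -1), self.W)
        return 0 if h < 0 else 1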
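
The threshold in `predict` also gives the discriminating line asked for in the 2-input case. Writing $w_1$, $w_2$, $w_3$ for `self.W[0]`, `self.W[1]`, `self.W[2]` (names introduced here only for illustration), the unit outputs $1$ exactly when $h = w_1 x_1 + w_2 x_2 - w_3 \ge 0$, so the boundary is the set of points with $h = 0$:
$$
w_1 x_1 + w_2 x_2 - w_3 = 0 \quad\Longleftrightarrow\quad x_2 = \frac{w_3 - w_1 x_1}{w_2}, \qquad w_2 \neq 0.
$$
For instance, the hand-picked weights $w_1 = w_2 = 1$, $w_3 = 1.5$ (not values produced by training) give $x_2 = 1.5 - x_1$, which leaves $(1, 1)$ on the positive side and the other three inputs on the negative side, as $AND$ requires.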

In [4]:
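# All four input combinations of the 2-input truth table
X_train = [[x1, x2] for x1 in [0, 1] for x2 in [0, 1]]
Y_train = [AND(x) for x in X_train]

perceptron = PerceptronSimple()
# The slice 1:3 passes only [0, 1] and [1, 0] (both with target 0) to train
perceptron.train(X_train[1:3], Y_train[1:3], 0.01, 100)

# Evaluate on the full truth table
for x in X_train:
    print(x, perceptron.predict(x))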

[0, 0] 0
[0, 1] 0
[1, 0] 0
[1, 1] 1
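
The exercise also asks for the $OR$ function and for the evolution of the error, so the following is a minimal sketch of one way to cover both for the 2-input case. It is not the notebook's own implementation. The function name `train_perceptron`, the learning rate `0.1`, the cap of `1000` epochs, and the use of `np.random.default_rng` are all assumptions made for this example. It trains on the whole truth table and records the number of misclassified samples in each epoch.

```python
import numpy as np

def train_perceptron(X, Y, alpha=0.1, max_epochs=1000, seed=2002):
    """Perceptron rule on all samples; returns the weights and the mistakes per epoch."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Append a constant -1 input so the last weight acts as the threshold
    Xb = np.hstack([X, -np.ones((len(X), 1))])
    W = rng.random(Xb.shape[1])
    errors_per_epoch = []
    for _ in range(max_epochs):
        errors = 0
        for x, y in zip(Xb, Y):
            a = 0 if np.dot(x, W) < 0 else 1
            if a != y:
                W += alpha * (y - a) * x
                errors += 1
        errors_per_epoch.append(errors)
        if errors == 0:  # a full pass without mistakes: training has converged
            break
    return W, errors_per_epoch

X_train = [[x1, x2] for x1 in [0, 1] for x2 in [0, 1]]
results = {}
for name, gate in [("AND", all), ("OR", any)]:
    Y_train = [int(gate(x)) for x in X_train]
    W, errors = train_perceptron(X_train, Y_train)
    preds = [0 if np.dot(np.append(x, -1), W) < 0 else 1 for x in X_train]
    results[name] = (preds, errors)
    print(name, preds, "mistakes per epoch:", errors)
```

The list of mistakes per epoch can be passed to `plt.plot` to draw the error curve. Both functions are linearly separable, so the loop should stop well before the epoch cap.
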


$AND$ with $4$ inputs:

| $x_1$ | $0$ | $0$ | $0$ | $0$ | $0$ | $0$ | $0$ | $0$ | $1$ | $1$ | $1$ | $1$ | $1$ | $1$ | $1$ | $1$ |
|-------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| $x_2$ | $0$ | $0$ | $0$ | $0$ | $1$ | $1$ | $1$ | $1$ | $0$ | $0$ | $0$ | $0$ | $1$ | $1$ | $1$ | $1$ |
| $x_3$ | $0$ | $0$ | $1$ | $1$ | $0$ | $0$ | $1$ | $1$ | $0$ | $0$ | $1$ | $1$ | $0$ | $0$ | $1$ | $1$ |
| $x_4$ | $0$ | $1$ | $0$ | $1$ | $0$ | $1$ | $0$ | $1$ | $0$ | $1$ | $0$ | $1$ | $0$ | $1$ | $0$ | $1$ |
| $y$   | $0$ | $0$ | $0$ | $0$ | $0$ | $0$ | $0$ | $0$ | $0$ | $0$ | $0$ | $0$ | $0$ | $0$ | $0$ | $1$ |

In [5]:
class PerceptronSimple:
    def __init__(self):
        # Four input weights plus one threshold weight, drawn uniformly from [0, 1)
        self.W = np.random.sample(5)
    def train(self, X, Y, alpha, iter_):
        for _ in range(iter_):
            for n in range(len(X)):
                a = self.predict(X[n])
                # Perceptron rule: update only when the sample is misclassified
                if a != Y[n]:
                    self.W[0] += alpha * (Y[n] - a) * X[n][0]
                    self.W[1] += alpha * (Y[n] - a) * X[n][1]
                    self.W[2] += alpha * (Y[n] - a) * X[n][2]
                    self.W[3] += alpha * (Y[n] - a) * X[n][3]
                    self.W[4] += alpha * (Y[n] - a) * (-1)
    def predict(self, x):
        # A constant -1 input turns the last weight into the threshold
        h = np.dot(np.append(x, -1), self.W)
        return 0 if h < 0 else 1

In [6]:
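# All sixteen input combinations of the 4-input truth table
X_train = [[x1, x2, x3, x4] for x1 in [0, 1] for x2 in [0, 1] for x3 in [0, 1] for x4 in [0, 1]]
Y_train = [AND(x) for x in X_train]

perceptron = PerceptronSimple()
# The slice 7:15 covers indices 7 to 14, all with target 0;
# [1, 1, 1, 1] (index 15) is never passed to train
perceptron.train(X_train[7:15], Y_train[7:15], 0.01, 100)

# Evaluate on the full truth table
for x in X_train:
    print(x, perceptron.predict(x))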

[0, 0, 0, 0] 0
[0, 0, 0, 1] 0
[0, 0, 1, 0] 0
[0, 0, 1, 1] 0
[0, 1, 0, 0] 0
[0, 1, 0, 1] 0
[0, 1, 1, 0] 0
[0, 1, 1, 1] 0
[1, 0, 0, 0] 0
[1, 0, 0, 1] 0
[1, 0, 1, 0] 0
[1, 0, 1, 1] 0
[1, 1, 0, 0] 0
[1, 1, 0, 1] 0
[1, 1, 1, 0] 0
[1, 1, 1, 1] 0
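
The last row above reports $0$ for `[1, 1, 1, 1]`, which is not the $AND$ value. A likely cause is that the training slice `7:15` stops at index $14$, so the only positive sample is never seen. The sketch below checks that and trains on all sixteen samples instead. `PerceptronN` is a hypothetical n-input variant written for this example; the learning rate `0.1`, the cap of `1000` epochs and the seeded `np.random.default_rng` are assumptions, not the notebook's settings.

```python
import numpy as np

class PerceptronN:
    """Hypothetical n-input variant of PerceptronSimple (illustrative name)."""
    def __init__(self, n_inputs, seed=2002):
        rng = np.random.default_rng(seed)
        self.W = rng.random(n_inputs + 1)  # the last weight is the threshold

    def predict(self, x):
        h = np.dot(np.append(x, -1), self.W)
        return 0 if h < 0 else 1

    def train(self, X, Y, alpha, max_epochs):
        for _ in range(max_epochs):
            errors = 0
            for x, y in zip(X, Y):
                a = self.predict(x)
                if a != y:
                    # One vectorized update replaces the per-weight lines
                    self.W += alpha * (y - a) * np.append(x, -1)
                    errors += 1
            if errors == 0:  # no mistakes in a full pass: stop early
                break

X_train = [[x1, x2, x3, x4] for x1 in [0, 1] for x2 in [0, 1]
           for x3 in [0, 1] for x4 in [0, 1]]
Y_train = [int(all(x)) for x in X_train]

# Targets seen by a call that trains on the slice 7:15
print(Y_train[7:15])

p = PerceptronN(4)
p.train(X_train, Y_train, alpha=0.1, max_epochs=1000)
preds = [p.predict(x) for x in X_train]
print(preds == Y_train)
```

Taking the number of inputs as a constructor argument also avoids redefining the class for each dimension.
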


2. Implement a multilayer perceptron that learns the logical $XOR$ function with $2$ and $4$ inputs (using the Backpropagation algorithm with batch updates). Show how the error evolves during training.

![](img/perceptrón-multicapa1.png)

$XOR$ with $2$ inputs:

| $x_1$ | $0$ | $0$ | $1$ | $1$ |
|-------|-----|-----|-----|-----|
| $x_2$ | $0$ | $1$ | $0$ | $1$ |
| $y$   | $0$ | $1$ | $1$ | $0$ |

Training by gradient descent consists of computing the gradient of the cost function with respect to the weight matrix, averaged over all elements of the training set, and then moving in the opposite direction, since the gradient of a function points in the direction in which the function increases fastest.  
The error for a single sample $X_k$ of the training set $X$, with target $Y_k$ and network output $\hat Y_k$, is
$$
\text{e}_k = C_k(W) = (Y_k - \hat Y_k)^2.
$$
The error over the whole training set (all $N$ samples) is then
$$
\text{e} = C(W) = \frac{1}{N} \sum^{N}_{k=1}(Y_k - \hat Y_k)^2.
$$
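
As a small numerical illustration of the cost above, the following sketch evaluates $C(W)$ for the four $XOR$ targets. The helper name `mse` and the values in `Y_hat` are made up for this example; they are not outputs of any trained network.

```python
import numpy as np

def mse(Y, Y_hat):
    """Mean squared error over all N samples."""
    Y = np.asarray(Y, dtype=float)
    Y_hat = np.asarray(Y_hat, dtype=float)
    return np.mean((Y - Y_hat) ** 2)

Y = [0, 1, 1, 0]               # XOR targets
Y_hat = [0.1, 0.8, 0.6, 0.3]   # made-up outputs, for illustration only
print(mse(Y, Y_hat))           # approximately 0.075
```

The squared errors are $0.01$, $0.04$, $0.16$ and $0.09$, whose mean is $0.075$.
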
The gradient of the error with respect to the weight matrix is
$$
\nabla_W e = \nabla_W C =
\begin{bmatrix}
\frac{\partial C}{\partial w^{(1)}_{11}} & \frac{\partial C}{\partial w^{(2)}_{11}}&\frac{\partial C}{\partial w^{(3)}_{11}}\\\\
\frac{\partial C}{\partial w^{(1)}_{12}} & \frac{\partial C}{\partial w^{(2)}_{12}}&\frac{\partial C}{\partial w^{(3)}_{12}}\\\\
\frac{\partial C}{\partial w^{(1)}_{21}} & \frac{\partial C}{\partial w^{(2)}_{21}}&\\\\
\frac{\partial C}{\partial w^{(1)}_{22}} & \frac{\partial C}{\partial w^{(2)}_{22}}&
\end{bmatrix}
$$
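
The ten partial derivatives above can be computed with backpropagation and checked against finite differences. The sketch below does this under several assumptions that the text does not confirm: a 2-2-2-1 layout read off the superscripts $(1)$, $(2)$, $(3)$, sigmoid activations, and no bias terms. The helper names `sigmoid`, `cost` and `gradients` are hypothetical.

```python
import numpy as np

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

def cost(Ws, X, Y):
    """Mean squared error of the network output over the whole batch."""
    A = X
    for W in Ws:
        A = sigmoid(A @ W)
    return np.mean((Y - A) ** 2)

def gradients(Ws, X, Y):
    """Batch backpropagation; returns one gradient array per weight matrix."""
    A = [X]
    for W in Ws:
        A.append(sigmoid(A[-1] @ W))
    # dC/dh at the output layer (the sigmoid derivative is a * (1 - a))
    delta = 2.0 * (A[-1] - Y) / len(X) * A[-1] * (1.0 - A[-1])
    grads = [None] * len(Ws)
    for k in reversed(range(len(Ws))):
        grads[k] = A[k].T @ delta
        if k > 0:
            # Propagate the error signal to the previous layer
            delta = (delta @ Ws[k].T) * A[k] * (1.0 - A[k])
    return grads

rng = np.random.default_rng(2002)
# Assumed shapes: 2x2, 2x2 and 2x1, ten weights in total
Ws = [rng.random((2, 2)), rng.random((2, 2)), rng.random((2, 1))]
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

grads = gradients(Ws, X, Y)

# Central finite differences, one weight at a time, for comparison
eps = 1e-6
max_diff = 0.0
for k, W in enumerate(Ws):
    for idx in np.ndindex(W.shape):
        old = W[idx]
        W[idx] = old + eps
        c_plus = cost(Ws, X, Y)
        W[idx] = old - eps
        c_minus = cost(Ws, X, Y)
        W[idx] = old
        numeric = (c_plus - c_minus) / (2 * eps)
        max_diff = max(max_diff, abs(numeric - grads[k][idx]))
print(max_diff)
```

A very small `max_diff` suggests that the analytic gradient matches the definition of $\nabla_W C$. A batch update would then subtract `alpha * grads[k]` from each `Ws[k]`.
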

In [None]:
# calcular_parcial_e_w(bla, bla, bla, bla=1, bla=1) to generalize?
# recursion? probably...

In [None]:
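class PerceptronMulticapa:
    def __init__(self):
        # Assumed layout: 2-2-2-1 without biases, i.e. the 10 weights of the gradient above
        self.W = [np.random.sample((2, 2)), np.random.sample((2, 2)), np.random.sample((2, 1))]
    @staticmethod
    def sigmoid(h):
        # Assumed activation: the logistic sigmoid
        return 1 / (1 + np.exp(-h))
    def forward(self, X):
        # Activations of every layer, starting with the input itself
        A = [np.asarray(X, dtype=float)]
        for W in self.W:
            A.append(self.sigmoid(A[-1] @ W))
        return A
    def train(self, X, Y, alpha, tol, max_iter=10000):
        Y = np.asarray(Y, dtype=float).reshape(-1, 1)
        errors = []
        # max_iter bounds the loop because reaching tol is not guaranteed
        for _ in range(max_iter):
            A = self.forward(X)
            errors.append(np.mean((Y - A[-1]) ** 2))
            if errors[-1] <= tol:
                break
            # Backpropagation in batch: output delta first, then each earlier layer
            delta = 2 * (A[-1] - Y) / len(Y) * A[-1] * (1 - A[-1])
            for k in reversed(range(len(self.W))):
                grad = A[k].T @ delta
                delta = (delta @ self.W[k].T) * A[k] * (1 - A[k])
                self.W[k] -= alpha * grad
        return errors  # evolution of the error during training
    def predict(self, x):
        # Threshold the sigmoid output at 0.5 to obtain a binary answer
        return int(self.forward([x])[-1][0, 0] >= 0.5)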