### Neural networks with TensorFlow
#### By Francisco Serradilla

#### Tasks

* Train the supplied multilayer perceptron.
* Extend it to support more layers.
* Extend it to use ReLU activation.
* Extend it to evaluate on a test set.
* Train on 3 of the supplied problems and discuss the results. If more than 3 are trained, the rest count as optional activities.
* (Optional) Try other optimizers.
* (Optional) Add dropout. Does it improve generalization on any of the supplied problems?
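
Dropout appears in the task list only as an optional item and is not implemented in this notebook. As a minimal sketch of what such a layer computes, here is inverted dropout written with NumPy rather than TensorFlow. The function name `dropout` and its parameters are illustrative assumptions, not part of the supplied code.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of the units and rescale the rest."""
    if not training or rate == 0.0:
        return activations  # at test time the layer is the identity
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob  # True for the units that are kept
    return activations * mask / keep_prob  # rescale so the expected value is unchanged

rng = np.random.default_rng(0)
hidden = np.ones((4, 5), dtype=np.float32)  # hypothetical hidden-layer activations
dropped = dropout(hidden, rate=0.4, rng=rng)
print(dropped)
```

In a network such as the ones below, this would be applied to the hidden activations inside `forward` during training only, with `training=False` when evaluating on the test set.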

In [1]:
print('Loading tensorflow...')
import tensorflow as tf
print('Loaded')

# Create a Constant op that produces a 1x2 matrix.  The op is
# added as a node to the default graph.
# The value returned by the constructor represents the output of the Constant op.
matrix1 = tf.constant([[3., 3.]])

# Create another Constant that produces a 2x1 matrix.
matrix2 = tf.constant([[2.],[2.]])

# Create a Matmul op that takes 'matrix1' and 'matrix2' as inputs.
# The returned value, 'product', represents the result of the matrix multiplication.
product = tf.matmul(matrix1, matrix2)

print(product)

Loading tensorflow...
Loaded
tf.Tensor([[12.]], shape=(1, 1), dtype=float32)


In [2]:
# Variables

# Create a Variable, that will be initialized to the scalar value 0.
state = tf.Variable(0, name="counter")

# Create an Op to add one to `state`.
one = tf.constant(1)

print(state) # Print the initial value of 'state'
for _ in range(3):
  # tf.add returns a new Tensor, so 'state' is rebound to a Tensor here;
  # state.assign_add(one) would update the Variable in place instead.
  state = tf.add(state, one)
  print(state)


<tf.Variable 'counter:0' shape=() dtype=int32, numpy=0>
tf.Tensor(1, shape=(), dtype=int32)
tf.Tensor(2, shape=(), dtype=int32)
tf.Tensor(3, shape=(), dtype=int32)


In [208]:
# Network definition
# Defines a multilayer perceptron with 2 hidden layers, using the mean squared error (MSE) as the loss and backpropagation as the training algorithm

#import tensorflow as tf

class Multilayer:
    
    def __init__(self, ninput, nhidden1, nhidden2, noutput, activation):
        init_range = 0.1 # renamed from 'range' to avoid shadowing the built-in
        self.W1 = tf.Variable(tf.random.uniform([ninput, nhidden1], -init_range, init_range, dtype=tf.float32))
        self.b1 = tf.Variable(tf.random.uniform([nhidden1], -init_range, init_range, dtype=tf.float32))
        self.W2 = tf.Variable(tf.random.uniform([nhidden1, nhidden2], -init_range, init_range, dtype=tf.float32))
        self.b2 = tf.Variable(tf.random.uniform([nhidden2], -init_range, init_range, dtype=tf.float32))
        self.W3 = tf.Variable(tf.random.uniform([nhidden2, noutput], -init_range, init_range, dtype=tf.float32))
        self.b3 = tf.Variable(tf.random.uniform([noutput], -init_range, init_range, dtype=tf.float32))
        self.activation = activation
        
        self.trainable = [self.W1, self.b1, self.W2, self.b2, self.W3, self.b3]

    def forward (self, e):
        
        if self.activation == "sigmoid":     
            s2 = tf.nn.sigmoid(tf.matmul(e,self.W1)+self.b1)
            s1 = tf.nn.sigmoid(tf.matmul(s2,self.W2)+self.b2)
            s = tf.nn.sigmoid(tf.matmul(s1,self.W3)+self.b3)
        elif self.activation == "relu":
            s2 = tf.nn.relu(tf.matmul(e,self.W1)+self.b1)
            s1 = tf.nn.relu(tf.matmul(s2,self.W2)+self.b2)
            s = tf.nn.relu(tf.matmul(s1,self.W3)+self.b3)
        return s
    
    def loss (self, predicted, labels):
        return tf.reduce_mean(tf.square(predicted - labels))

    def accuracy (self, predicted, labels):
        aciertos = tf.math.equal(tf.argmax(labels,1),tf.argmax(predicted,1))
        return tf.reduce_mean(tf.cast(aciertos, dtype=tf.float32))
    
    def train(self, X, D, optimizer, epochs, trace):
        #optimizer = tf.keras.optimizers.Adam(learning_rate=1e-2)
        # run the requested number of epochs
        for i in range(1,epochs+1):
            with tf.GradientTape() as tape: # GradientTape records the operations so that the gradient can be computed
                predicted = self.forward(X)
                current_loss = self.loss(predicted, D)

            grads = tape.gradient(current_loss, self.trainable) # compute the gradients
            optimizer.apply_gradients(zip(grads, self.trainable)) # apply the descent step

            if i%trace == 0:
                print('Epoch %d; MSE: %.4f; Acc: %.4f' % (i,current_loss.numpy(),self.accuracy(predicted,D).numpy()))


In [118]:
# Definition of Multilayer with n hidden layers
#import tensorflow as tf
import numpy as np

class MultilayerN:
    
    def __init__(self, nPerLayer, activation): # nPerLayer is a list with the number of neurons per layer (including the input and output layers)
        range1 = 0.1
        self.W=[]
        self.b=[]
        self.testPercentage = 10 # Percentage of the set used for testing (trainTest replaces it with the number of test samples)
        self.Xt=[]
        self.dt=[]       
        self.nlayers = len(nPerLayer)
        for element in range(0,self.nlayers-1):   
            self.W.append(tf.Variable(tf.random.uniform([int(nPerLayer[element]), int(nPerLayer[element+1])], -range1, range1, dtype=tf.float32)))
            self.b.append(tf.Variable(tf.random.uniform([int(nPerLayer[element+1])], -range1, range1, dtype=tf.float32)))
        
        # Collect every weight matrix and bias vector as trainable parameters
        
        self.activation = activation
        toTrain = []
        for j in range(0, self.nlayers-1):
            toTrain.append(self.W[j])
            toTrain.append(self.b[j])
        self.trainable = toTrain
    
    
    def forward (self, e):
        
        if self.activation == "sigmoid":
            s = tf.nn.sigmoid(tf.matmul(e,self.W[0])+self.b[0])
            for i in range(1, self.nlayers-1):
                s = tf.nn.sigmoid(tf.matmul(s,self.W[i])+self.b[i])                
                
        elif self.activation == "relu":    
            s = tf.nn.relu(tf.matmul(e,self.W[0])+self.b[0])
            for i in range(1, self.nlayers-1):
                s = tf.nn.relu(tf.matmul(s,self.W[i])+self.b[i])  
        return s
    
    def loss (self, predicted, labels):
        return tf.reduce_mean(tf.square(predicted - labels))

    def accuracy (self, predicted, labels):
        aciertos = tf.math.equal(tf.argmax(labels,1),tf.argmax(predicted,1))
        return tf.reduce_mean(tf.cast(aciertos, dtype=tf.float32))
    
    def train(self, X, D, optimizer, epochs, trace):
        for i in range(1,epochs+1):
            with tf.GradientTape() as tape: # GradientTape records the operations so that the gradient can be computed
                predicted = self.forward(X)
                current_loss = self.loss(predicted, D)

            grads = tape.gradient(current_loss, self.trainable) # compute the gradients
            optimizer.apply_gradients(zip(grads, self.trainable)) # apply the descent step

            if trace and i%trace == 0: # trace=0 (the trainTest default) disables logging instead of dividing by zero
                print('Epoch %d; MSE: %.4f; Acc: %.4f' % (i,current_loss.numpy(),self.accuracy(predicted,D).numpy()))
    
    def trainTest (self, X, D, optimizer, epochs, trace=0, percentageTest=10): # percentageTest is the percentage of the samples held out for testing
        """
        Shuffle the inputs and the desired outputs before splitting them. They must not be
        shuffled separately: they are paired, so both must be reordered with the same
        permutation. unison_shuffled_copies does this.
        """
        # number of test samples (truncated, not rounded)
        self.testPercentage=(int)(len(X)*0.01*percentageTest)

        if len(self.Xt)==0 and len(self.dt)==0:
            reordenados=unison_shuffled_copies(X,D)
            self.Xt=reordenados[0]
            self.dt=reordenados[1]
            
        self.train(self.Xt[:len(X)-self.testPercentage],self.dt[:len(D)-self.testPercentage], optimizer, epochs, trace)
        
        self.testError(self.Xt, self.dt)
        
                
                
    def testError (self, X, D):
        if len(self.Xt)==0 and len(self.dt)==0:
            print("The test set is not defined. Call trainTest first.")
        else:
            print("Error de Test:")
            self.info(X[-self.testPercentage:],D[-self.testPercentage:])
#             predicted = self.forward(X[-self.testPercentage:])
#             current_loss = self.loss(predicted, D[-self.testPercentage:])    
#             print("Sobre el conjunto de test:")
#             print('MSE: %.4f; Acc: %.4f' % (current_loss.numpy(),self.accuracy(predicted,D[-self.testPercentage:]).numpy()))
            
    def info (self, X, D):
        predicted = self.forward(X)
        current_loss = self.loss(predicted, D)
        print('MSE: %.4f; Acc: %.4f' % (current_loss.numpy(),self.accuracy(predicted,D).numpy()))
    
def unison_shuffled_copies(a, b):
    assert len(a) == len(b)
    p = np.random.permutation(len(a))
    c = a.numpy()[p]
    a = tf.convert_to_tensor(c, dtype=tf.float32)
    c = b.numpy()[p] 
    b = tf.convert_to_tensor(c, dtype=tf.float32)
    return a, b
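
The hold-out logic in `trainTest` and `unison_shuffled_copies` can be sketched on its own with plain NumPy arrays. This is an illustrative variant, not the notebook's code: the name `holdout_split` and the `seed` parameter are assumptions. It slices with an explicit split index, because negative slicing such as `X[-k:]` returns the whole array when `k` is 0.

```python
import numpy as np

def holdout_split(X, D, test_percentage=10, seed=0):
    """Shuffle X and D with the same permutation and hold out the last rows for testing."""
    assert len(X) == len(D)
    n_test = int(len(X) * 0.01 * test_percentage)  # same truncation as trainTest
    p = np.random.default_rng(seed).permutation(len(X))
    X, D = X[p], D[p]  # one permutation for both arrays keeps the pairs aligned
    split = len(X) - n_test
    return (X[:split], D[:split]), (X[split:], D[split:])

# Toy data in which row i of X is [2i, 2i+1] and row i of D is [i]
X = np.arange(20, dtype=np.float32).reshape(10, 2)
D = np.arange(10, dtype=np.float32).reshape(10, 1)
(train_X, train_D), (test_X, test_D) = holdout_split(X, D, test_percentage=30)
print(len(train_X), len(test_X))
```
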

### Circle example

In [3]:
# Load the examples from circulo.txt

#import tensorflow as tf
import numpy as np

# load the training data
d = np.loadtxt("samples/circulo.txt")

#print(d)
np.random.shuffle(d) # careful: do not use Python's random.shuffle here, it can duplicate rows of 2D NumPy arrays
#print(d)
inputs = d[:,:2]
labels = d[:,2:]

# convert to TensorFlow tensors
inputs = tf.cast(inputs, dtype=tf.float32)
labels = tf.cast(labels, dtype=tf.float32)

In [4]:
red1=MultilayerN([2,7,3], "sigmoid")
optimizer = tf.keras.optimizers.RMSprop(learning_rate=1e-2)
red1.train(inputs, labels, optimizer, 1000,100)

Epoch 100; MSE: 0.2091; Acc: 0.4800
Epoch 200; MSE: 0.2091; Acc: 0.4800
Epoch 300; MSE: 0.2073; Acc: 0.4800
Epoch 400; MSE: 0.1865; Acc: 0.5600
Epoch 500; MSE: 0.1377; Acc: 0.7200
Epoch 600; MSE: 0.1119; Acc: 0.8000
Epoch 700; MSE: 0.0958; Acc: 0.8400
Epoch 800; MSE: 0.0852; Acc: 0.9200
Epoch 900; MSE: 0.0762; Acc: 0.9600
Epoch 1000; MSE: 0.0689; Acc: 0.9600


In [5]:
# Training

net = MultilayerN([2,7,3], "sigmoid") # create the network

optimizer = tf.keras.optimizers.RMSprop(learning_rate=1e-2)
#optimizer = tf.keras.optimizers.Adam(learning_rate=1e-2)

# run the requested number of epochs
epochs = 2000
trace = 100
net.trainTest(inputs, labels, optimizer, epochs, trace)

Epoch 100; MSE: 0.1965; Acc: 0.5217
Epoch 200; MSE: 0.1447; Acc: 0.7391
Epoch 300; MSE: 0.1103; Acc: 0.7826
Epoch 400; MSE: 0.0888; Acc: 0.9130
Epoch 500; MSE: 0.0739; Acc: 1.0000
Epoch 600; MSE: 0.0598; Acc: 1.0000
Epoch 700; MSE: 0.0479; Acc: 1.0000
Epoch 800; MSE: 0.0387; Acc: 1.0000
Epoch 900; MSE: 0.0319; Acc: 1.0000
Epoch 1000; MSE: 0.0268; Acc: 1.0000
Epoch 1100; MSE: 0.0229; Acc: 1.0000
Epoch 1200; MSE: 0.0199; Acc: 1.0000
Epoch 1300; MSE: 0.0176; Acc: 1.0000
Epoch 1400; MSE: 0.0156; Acc: 1.0000
Epoch 1500; MSE: 0.0140; Acc: 1.0000
Epoch 1600; MSE: 0.0126; Acc: 1.0000
Epoch 1700; MSE: 0.0114; Acc: 1.0000
Epoch 1800; MSE: 0.0104; Acc: 1.0000
Epoch 1900; MSE: 0.0095; Acc: 1.0000
Epoch 2000; MSE: 0.0087; Acc: 1.0000
MSE: 0.3512; Acc: 0.5000
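
The final line of the log is the test result, and it should be read with the test-set size in mind. The training accuracies logged in this section (0.4800 = 12/25 on the full set, 0.5217 = 12/23 after the split) suggest that `circulo.txt` holds 25 samples. That figure is inferred from the logs, not read from the file. Under that assumption the default 10 % split leaves only 2 test samples, so the test accuracy can take just three values. The helper name `holdout_size` is hypothetical.

```python
def holdout_size(n_samples, percentage_test=10):
    # Same computation as trainTest: the result is truncated, not rounded
    return int(n_samples * 0.01 * percentage_test)

n_samples = 25  # assumption inferred from the logged accuracies
n_test = holdout_size(n_samples)
possible_accuracies = [k / n_test for k in range(n_test + 1)]
print(n_test, possible_accuracies)
```

With so few test samples, a test accuracy of 0.5 means one of two points was misclassified, which says little about generalization either way.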


In [7]:
# Learned region

import numpy as np

step = 0.1
vmin = -3
vmax = 3
values = np.arange(vmin, vmax, step)

%matplotlib notebook

import matplotlib.pyplot as plt
import matplotlib.patches as patches

fig = plt.figure()
ax = fig.add_subplot(111)
ylim = ax.set_ylim(vmin,vmax)
xlim = ax.set_xlim(vmin,vmax)

for x in values:
    for y in values:
        e = np.array([[x,y]])
        s = net.forward(tf.cast(e, dtype=tf.float32))
        o = tf.argmax(s,1)
        if o==0:
            c = 'r'
        elif o==1:
            c = 'g'
        else: c = 'b'
        ax.add_patch(patches.Rectangle((x, y), step, step, fill=True, color=c))



In [11]:
# Keras

# 1) create model
model = tf.keras.models.Sequential()

# 2) add model layers
model.add(tf.keras.layers.Dense(7, activation='sigmoid', input_shape=(2,)))
model.add(tf.keras.layers.Dense(3, activation='sigmoid'))

# 3) compile model
model.compile(optimizer='RMSprop', loss='categorical_crossentropy', metrics=['mean_absolute_error', 'mean_squared_error', 'accuracy'])

# 4) train
print('training')
h = model.fit(inputs, labels, epochs=5000, verbose=0)
print('trained')

results = model.evaluate(inputs, labels)
print('loss, mae, mse, acc:', results)


training
trained
loss, mae, mse, acc: [0.21130609512329102, 0.30806035, 0.2838355, 1.0]


In [77]:
# Keras

# 1) create model
model = tf.keras.models.Sequential()

# 2) add model layers
model.add(tf.keras.layers.Dense(4, activation='relu', input_shape=(2,)))
model.add(tf.keras.layers.Dense(3, activation='relu'))
model.add(tf.keras.layers.Dense(3, activation='softmax'))

# 3) compile model
model.compile(optimizer='RMSprop', loss='categorical_crossentropy', metrics=['mean_absolute_error', 'mean_squared_error', 'accuracy'])

# 4) train
print('training')
h = model.fit(inputs, labels, epochs=8000, verbose=0)
print('trained')

results = model.evaluate(inputs, labels)
print('loss, mae, mse, acc:', results) # evaluated on the training data, so these are not test metrics

training
trained
loss, mae, mse, acc: [7.772432013553043e-07, 5.1670605e-07, 1.6691059e-12, 1.0]


In [78]:
%matplotlib notebook

import matplotlib.pyplot as plt

fig = plt.figure()
plt.plot(h.history['loss'])
plt.plot(h.history['accuracy'])


[<matplotlib.lines.Line2D at 0x130e83910>]

In [79]:
# Learned region

import numpy as np

step = 0.1
vmin = -3
vmax = 3
values = np.arange(vmin, vmax, step)

%matplotlib notebook

import matplotlib.pyplot as plt
import matplotlib.patches as patches

fig = plt.figure()
ax = fig.add_subplot(111)
ylim = ax.set_ylim(vmin,vmax)
xlim = ax.set_xlim(vmin,vmax)

for x in values:
    for y in values:
        e = np.array([[x,y]])
        s = model(tf.cast(e, dtype=tf.float32))
        o = tf.argmax(s,1)
        if o==0:
            c = 'r'
        elif o==1:
            c = 'g'
        else: c = 'b'
        ax.add_patch(patches.Rectangle((x, y), step, step, fill=True, color=c))




### Nonlinear regions example

In [88]:
# nonlinear regions
import matplotlib.pyplot as plt
def one_hot (d): # one-hot encoding
    num_classes = len(set(d))
    rows = d.shape[0]
    labels = np.zeros((rows, num_classes), dtype='float32')
    labels[np.arange(rows),d.T] = 1
    return labels
X = np.loadtxt('samples/data_3classes_nonlinear_2D.txt')

d = X[:,-1].astype('int')
X = X[:,:-1]
plt.figure()
plt.xlim(0,1)
plt.ylim(0,1)
plt.plot(X[d==0,0],X[d==0,1], 'ro')
plt.plot(X[d==1,0],X[d==1,1], 'go')
plt.plot(X[d==2,0],X[d==2,1], 'bo')
plt.show()

no = len(set(d))
ni = X.shape[1]

d = one_hot(d)

# find the minimal architecture that learns this problem, for data_2classes_nonlinear_2D.txt and for data_3classes_nonlinear_2D.txt



In [90]:
p = MultilayerN([ni,5,no],'sigmoid')
X = tf.convert_to_tensor(X, dtype=tf.float32)
d = tf.convert_to_tensor(d, dtype=tf.float32)
optimizer = tf.keras.optimizers.RMSprop(learning_rate=1e-2)
p.trainTest(X,d,optimizer,2500,100)

Epoch 100; MSE: 0.1859; Acc: 0.6402
Epoch 200; MSE: 0.1343; Acc: 0.7090
Epoch 300; MSE: 0.1162; Acc: 0.7831
Epoch 400; MSE: 0.1051; Acc: 0.7937
Epoch 500; MSE: 0.0947; Acc: 0.7989
Epoch 600; MSE: 0.0857; Acc: 0.8254
Epoch 700; MSE: 0.0791; Acc: 0.8307
Epoch 800; MSE: 0.0720; Acc: 0.8307
Epoch 900; MSE: 0.0610; Acc: 0.8466
Epoch 1000; MSE: 0.0461; Acc: 0.9259
Epoch 1100; MSE: 0.0306; Acc: 0.9683
Epoch 1200; MSE: 0.0181; Acc: 0.9947
Epoch 1300; MSE: 0.0105; Acc: 1.0000
Epoch 1400; MSE: 0.0067; Acc: 1.0000
Epoch 1500; MSE: 0.0044; Acc: 1.0000
Epoch 1600; MSE: 0.0029; Acc: 1.0000
Epoch 1700; MSE: 0.0019; Acc: 1.0000
Epoch 1800; MSE: 0.0012; Acc: 1.0000
Epoch 1900; MSE: 0.0008; Acc: 1.0000
Epoch 2000; MSE: 0.0005; Acc: 1.0000
Epoch 2100; MSE: 0.0003; Acc: 1.0000
Epoch 2200; MSE: 0.0002; Acc: 1.0000
Epoch 2300; MSE: 0.0001; Acc: 1.0000
Epoch 2400; MSE: 0.0001; Acc: 1.0000
Epoch 2500; MSE: 0.0001; Acc: 1.0000
MSE: 0.0006; Acc: 1.0000


The network learns the problem perfectly and also passes the test. Compared with the previous assignment, execution is faster, the code is more convenient to run, and we can configure parameters such as the activation function and the optimizer.

### XOR example

In [10]:
# define XOR
inputs = tf.constant([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
labels = tf.constant([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]])

In [85]:
# define XOR with a single output column (this overwrites the one-hot labels defined above)
inputs = tf.constant([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
labels = tf.constant([[1.0], [0.0], [0.0], [1.0]])

In [87]:
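# Note: this network has 2 outputs, so it expects the two-column one-hot labels (In [10]).
# With the single-column labels (In [85], the last ones executed) the loss broadcasts over
# both outputs and the argmax-based accuracy below is not meaningful.
red1=MultilayerN([2,3,2], "sigmoid")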
optimizer = tf.keras.optimizers.RMSprop(learning_rate=1e-1)
red1.train(inputs, labels, optimizer, 100,10)

Epoch 10; MSE: 0.2460; Acc: 0.7500
Epoch 20; MSE: 0.1973; Acc: 0.2500
Epoch 30; MSE: 0.1797; Acc: 0.2500
Epoch 40; MSE: 0.1739; Acc: 1.0000
Epoch 50; MSE: 0.1710; Acc: 1.0000
Epoch 60; MSE: 0.1694; Acc: 0.2500
Epoch 70; MSE: 0.1685; Acc: 0.2500
Epoch 80; MSE: 0.1679; Acc: 1.0000
Epoch 90; MSE: 0.1675; Acc: 0.2500
Epoch 100; MSE: 0.1672; Acc: 0.2500
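
The accuracy jumping between 0.25 and 1.0 while the MSE falls steadily is consistent with a shape mismatch between the single-column labels of In [85] and the two-output network. The NumPy sketch below shows the effect in isolation; TensorFlow follows the same broadcasting rules. The prediction values are hypothetical, and `accuracy` is a NumPy rewrite of `MultilayerN.accuracy`.

```python
import numpy as np

predicted = np.array([[0.6, 0.4], [0.3, 0.7], [0.2, 0.8], [0.9, 0.1]])  # hypothetical 2-output predictions
labels_one_col = np.array([[1.0], [0.0], [0.0], [1.0]])  # labels as defined in In [85]
labels_one_hot = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]])  # labels as defined in In [10]

def accuracy(predicted, labels):
    return float(np.mean(np.argmax(labels, 1) == np.argmax(predicted, 1)))

# The (4, 1) labels broadcast silently against the (4, 2) predictions
print((predicted - labels_one_col).shape)
# argmax over a single column is always 0, whatever the label value
print(np.argmax(labels_one_col, 1))
print(accuracy(predicted, labels_one_col), accuracy(predicted, labels_one_hot))
```

With the single-column labels, the metric only measures how often output 0 is the larger one, so it can swing freely while the loss decreases.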


### Orchids example

In [12]:
# Orchids

X = np.loadtxt('samples/iris.csv', dtype = 'float64', usecols = [0,1,2,3])
L = np.loadtxt('samples/iris.csv', dtype = str, usecols = [4]) 

# convert the class names to integer indices
d = []
options = ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']
for e in L:
    d.append(options.index(e))

d = np.array(d)
X = np.array(X)

d = one_hot(d)

d = tf.cast(d, dtype=tf.float32)
X = tf.cast(X, dtype=tf.float32)

ni = X.shape[1]
no = len(options)


# find the minimal architecture that learns this problem

In [13]:
redOrq = MultilayerN([ni,40,no],'sigmoid')
optimizer = tf.keras.optimizers.RMSprop(learning_rate=1e-2)
redOrq.trainTest(X, d, optimizer, 1000, 100)

Epoch 100; MSE: 0.0602; Acc: 0.9630
Epoch 200; MSE: 0.0247; Acc: 0.9630
Epoch 300; MSE: 0.0173; Acc: 0.9630
Epoch 400; MSE: 0.0150; Acc: 0.9630
Epoch 500; MSE: 0.0140; Acc: 0.9630
Epoch 600; MSE: 0.0133; Acc: 0.9704
Epoch 700; MSE: 0.0129; Acc: 0.9704
Epoch 800; MSE: 0.0126; Acc: 0.9778
Epoch 900; MSE: 0.0122; Acc: 0.9778
Epoch 1000; MSE: 0.0120; Acc: 0.9778
MSE: 0.0049; Acc: 1.0000


On this problem the training accuracy does not reach 1 (it stays at 0.9778); even so, the result is quite good in both training and test (test accuracy 1.0).

### Linear regions

We did not manage to get it working on them.

### Pass/fail example (aprobados)

In [46]:
# Turn the desired output into an array of single-element arrays, shape (n, 1)
def transformList (d):
    rows=[] # renamed from 'list' to avoid shadowing the built-in
    for i in d:
        rows.append(np.array([i]))
    return np.array(rows)

In [105]:
# Turn the desired output into an array of arrays with two outputs, each the complement of the other
def transformListTwo (d): # two-column encoding [i, 1 - i]
    rows=[] # renamed from 'list' to avoid shadowing the built-in
    for i in d:
        rows.append(np.array([i,1 - i]))
    return np.array(rows)

In [16]:
X = np.loadtxt('samples/aprobado-ent.txt')

d = X[:,-1].astype('int')
X = X[:,:-1]

d=transformListTwo(d)
d = tf.cast(d, dtype=tf.float32)
X = tf.cast(X, dtype=tf.float32)

In [17]:
redAprob = MultilayerN([3,1],'sigmoid')
optimizer = tf.keras.optimizers.RMSprop(learning_rate=1e-2)
redAprob.train(X, d, optimizer, 100, 1)

Epoch 1; MSE: 0.2554; Acc: 0.4694
Epoch 2; MSE: 0.2603; Acc: 0.4694
Epoch 3; MSE: 0.2536; Acc: 0.4694
Epoch 4; MSE: 0.2524; Acc: 0.4694
Epoch 5; MSE: 0.2518; Acc: 0.4694
Epoch 6; MSE: 0.2513; Acc: 0.4694
Epoch 7; MSE: 0.2510; Acc: 0.4694
Epoch 8; MSE: 0.2508; Acc: 0.4694
Epoch 9; MSE: 0.2506; Acc: 0.4694
Epoch 10; MSE: 0.2504; Acc: 0.4694
Epoch 11; MSE: 0.2503; Acc: 0.4694
Epoch 12; MSE: 0.2502; Acc: 0.4694
Epoch 13; MSE: 0.2502; Acc: 0.4694
Epoch 14; MSE: 0.2501; Acc: 0.4694
Epoch 15; MSE: 0.2501; Acc: 0.4694
Epoch 16; MSE: 0.2501; Acc: 0.4694
Epoch 17; MSE: 0.2501; Acc: 0.4694
Epoch 18; MSE: 0.2501; Acc: 0.4694
Epoch 19; MSE: 0.2504; Acc: 0.4694
Epoch 20; MSE: 0.2512; Acc: 0.4694
Epoch 21; MSE: 0.2522; Acc: 0.4694
Epoch 22; MSE: 0.2517; Acc: 0.4694
Epoch 23; MSE: 0.2508; Acc: 0.4694
Epoch 24; MSE: 0.2504; Acc: 0.4694
Epoch 25; MSE: 0.2502; Acc: 0.4694
Epoch 26; MSE: 0.2502; Acc: 0.4694
Epoch 27; MSE: 0.2501; Acc: 0.4694
Epoch 28; MSE: 0.2501; Acc: 0.4694
Epoch 29; MSE: 0.2501; Acc: 0

#### Test:

In [100]:
Xt = np.loadtxt('samples/aprobado-tst.txt')

dt = Xt[:,-1].astype('int')
Xt = Xt[:,:-1]
# print(d)

dt=transformList(dt)
dt = tf.cast(dt, dtype=tf.float32)
Xt = tf.cast(Xt, dtype=tf.float32)
redAprob.info(Xt, dt)

MSE: 0.2153; Acc: 1.0000


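This example is linear, which lets us check that ordinary (single-layer) perceptrons can also be defined. Note that with a single output unit the argmax over the predictions is always 0, so the reported Acc is not informative here; the MSE is the figure to look at.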

### Defaulters example (morosos)

In [18]:
X = np.loadtxt('samples/morosos-ent.txt')

d = X[:,-1].astype('int')
X = X[:,:-1]

d=transformList(d)

d = tf.cast(d, dtype=tf.float32)
X = tf.cast(X, dtype=tf.float32)

In [19]:
redMor = MultilayerN([9,1],'relu')
optimizer = tf.keras.optimizers.RMSprop(learning_rate=1e-2)
redMor.train(X, d, optimizer, 10, 1)

Epoch 1; MSE: 0.5231; Acc: 1.0000
Epoch 2; MSE: 0.4236; Acc: 1.0000
Epoch 3; MSE: 0.3212; Acc: 1.0000
Epoch 4; MSE: 0.2696; Acc: 1.0000
Epoch 5; MSE: 0.2420; Acc: 1.0000
Epoch 6; MSE: 0.2264; Acc: 1.0000
Epoch 7; MSE: 0.2172; Acc: 1.0000
Epoch 8; MSE: 0.2114; Acc: 1.0000
Epoch 9; MSE: 0.2076; Acc: 1.0000
Epoch 10; MSE: 0.2049; Acc: 1.0000


In [130]:
Xt = np.loadtxt('samples/morosos-tst.txt')

dt = Xt[:,-1].astype('int')
Xt = Xt[:,:-1]
# print(d)

dt=transformList(dt)
dt = tf.cast(dt, dtype=tf.float32)
Xt = tf.cast(Xt, dtype=tf.float32)
redMor.info(Xt, dt)

MSE: 0.1751; Acc: 1.0000


Here we solved it with an ordinary (single-layer) perceptron.
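
For single-output networks such as `MultilayerN([9,1], ...)`, the argmax-based accuracy compares two columns of zeros and therefore reports 1.0 regardless of the predictions. A thresholded accuracy is a common alternative. The sketch below uses NumPy with hypothetical values; `threshold_accuracy` and its `threshold` parameter are assumptions, not part of the notebook.

```python
import numpy as np

def argmax_accuracy(predicted, labels):
    # Metric used by MultilayerN.accuracy, rewritten with NumPy
    return float(np.mean(np.argmax(labels, 1) == np.argmax(predicted, 1)))

def threshold_accuracy(predicted, labels, threshold=0.5):
    # Treat the single output as the positive class when it exceeds the threshold
    return float(np.mean((predicted > threshold) == (labels > 0.5)))

labels = np.array([[1.0], [0.0], [1.0], [0.0]])
predicted = np.array([[0.2], [0.3], [0.4], [0.1]])  # hypothetical outputs: every sample predicted negative
print(argmax_accuracy(predicted, labels), threshold_accuracy(predicted, labels))
```

Here the argmax metric reports a perfect score even though half of the samples are misclassified, which is why the Acc column of the single-output runs should be ignored in favor of the MSE.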

## Quinielas (football pools)

In [20]:
def ultimosTres(array):
    # Return the last three columns of every row (the desired outputs)
    lista = []
    for i in range(0,len(array)):
        lista.append(array[i][-3:])
    return np.array(lista)

In [83]:
X = np.loadtxt('samples/quinielas60-3-trn.txt')
d=ultimosTres(X)
X = X[:,:-3]
d = tf.cast(d, dtype=tf.float32)
X = tf.cast(X, dtype=tf.float32)

In [84]:
quiniela=MultilayerN([60,20,3],'sigmoid')
optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.1)
quiniela.train(X, d, optimizer, 10000, 1000)

Epoch 1000; MSE: 0.1440; Acc: 0.7098
Epoch 2000; MSE: 0.1266; Acc: 0.7468
Epoch 3000; MSE: 0.1328; Acc: 0.7273
Epoch 4000; MSE: 0.1256; Acc: 0.7495
Epoch 5000; MSE: 0.1046; Acc: 0.7852
Epoch 6000; MSE: 0.1150; Acc: 0.7684
Epoch 7000; MSE: 0.1150; Acc: 0.7697
Epoch 8000; MSE: 0.1032; Acc: 0.7859
Epoch 9000; MSE: 0.1023; Acc: 0.7906
Epoch 10000; MSE: 0.0993; Acc: 0.7946


With 20 hidden neurons and 10000 epochs: MSE: 0.0949; Acc: 0.7919 
With 50 hidden neurons and 10000 epochs: MSE: 0.0406; Acc: 0.9192

### Test
We run it with 50 hidden neurons

In [28]:
Xt = np.loadtxt('samples/quinielas60-3-tst.txt')
# print(X)
dt=ultimosTres(Xt)
Xt = Xt[:,:-3]
dt = tf.cast(dt, dtype=tf.float32)
Xt = tf.cast(Xt, dtype=tf.float32)
quiniela.info(Xt, dt)

MSE: 0.2513; Acc: 0.5333


With 50 hidden neurons and the training MSE reported above, the test results are: MSE: 0.3535; Acc: 0.3879
With 20 hidden neurons and the training MSE reported above, the test results are: MSE: 0.2513; Acc: 0.5333

This means the network generalizes better with fewer neurons; with 50 it memorizes the training cases.
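
One way to make the difference in capacity concrete is to count the trainable parameters of both architectures. The helper below is a small sketch; the name `count_parameters` is hypothetical, and it takes the same layer-size list that `MultilayerN` receives.

```python
def count_parameters(n_per_layer):
    """Number of weights and biases in a fully connected network with the given layer sizes."""
    total = 0
    for n_in, n_out in zip(n_per_layer[:-1], n_per_layer[1:]):
        total += n_in * n_out + n_out  # weight matrix plus bias vector
    return total

for layers in ([60, 20, 3], [60, 50, 3]):
    print(layers, count_parameters(layers))
```

The 50-neuron network has about two and a half times as many parameters to fit to the same training set, which is in line with the memorization observed above.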

## Encoder example

In [40]:
X = np.loadtxt('samples/encoder.txt')
X = X[:,:8]

X = tf.cast(X, dtype=tf.float32)
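d = X # the encoder must reproduce its own input (the original assigned Xt, the test set of a previous example)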

In [45]:
enco=MultilayerN([8,3,8],'sigmoid')
optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.1)
enco.train(X, d, optimizer, 100, 10)

Epoch 10; MSE: 0.1088; Acc: 0.2500
Epoch 20; MSE: 0.1040; Acc: 0.3750
Epoch 30; MSE: 0.0879; Acc: 0.5000
Epoch 40; MSE: 0.0670; Acc: 0.7500
Epoch 50; MSE: 0.0493; Acc: 0.8750
Epoch 60; MSE: 0.0335; Acc: 1.0000
Epoch 70; MSE: 0.0206; Acc: 1.0000
Epoch 80; MSE: 0.0136; Acc: 1.0000
Epoch 90; MSE: 0.0095; Acc: 1.0000
Epoch 100; MSE: 0.0068; Acc: 1.0000


It learns this without any problems.

### Diabetes example

In [119]:
from numpy import genfromtxt
X = genfromtxt('samples/pima-diabetes.csv', delimiter=',')
d = X[:,-1].astype('int')
X = X[:,:-1]

d=transformListTwo(d)

d = tf.cast(d, dtype=tf.float32)
X = tf.cast(X, dtype=tf.float32)
print(d)

tf.Tensor(
[[1. 0.]
 [0. 1.]
 [1. 0.]
 ...
 [0. 1.]
 [1. 0.]
 [0. 1.]], shape=(768, 2), dtype=float32)


In [151]:
diab = MultilayerN([8,10, 2],'relu')
optimizer = tf.keras.optimizers.RMSprop(learning_rate=1e-2)
diab.trainTest(X, d, optimizer, 10000, 1000, 30) # 30 is the test percentage recommended in the previous assignment.

Epoch 1000; MSE: 0.2665; Acc: 0.6543
Epoch 2000; MSE: 0.2543; Acc: 0.6561
Epoch 3000; MSE: 0.2493; Acc: 0.6599
Epoch 4000; MSE: 0.2468; Acc: 0.6673
Epoch 5000; MSE: 0.2447; Acc: 0.6673
Epoch 6000; MSE: 0.2443; Acc: 0.6691
Epoch 7000; MSE: 0.2427; Acc: 0.6710
Epoch 8000; MSE: 0.2430; Acc: 0.6729
Epoch 9000; MSE: 0.2421; Acc: 0.6766
Epoch 10000; MSE: 0.2415; Acc: 0.6766
Error de Test:
MSE: 0.2663; Acc: 0.6478


All runs use the same learning_rate=1e-2

With one hidden layer of 10 neurons:

Epoch 10000; MSE: 0.1641; Acc: 0.7751
Error de Test:
MSE: 0.1883; Acc: 0.7304

With two hidden layers, one of 30 neurons and one of 20:

Epoch 10000; MSE: 0.0916; Acc: 0.8941
Error de Test:
MSE: 0.2346; Acc: 0.7130

With 3 hidden layers, of 30, 25 and 20 neurons: 

Epoch 10000; MSE: 0.0383; Acc: 0.9591
Error de Test:
MSE: 0.2922; Acc: 0.6652

As we can see, training accuracy improves as we increase the number of neurons. However, this comes at the cost of generalization: the network memorizes the training examples, as the falling test accuracy shows.
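
The trade-off can be summarized as the gap between training and test accuracy for each architecture. The sketch below only rearranges the figures reported above. The layer lists used as keys assume 8 inputs and 2 outputs, as in the `MultilayerN([8,10, 2],'relu')` call.

```python
# (train accuracy, test accuracy) copied from the runs reported above
results = {
    "[8, 10, 2]": (0.7751, 0.7304),
    "[8, 30, 20, 2]": (0.8941, 0.7130),
    "[8, 30, 25, 20, 2]": (0.9591, 0.6652),
}

# Generalization gap: how much accuracy is lost when moving from training to test data
gaps = {name: round(train - test, 4) for name, (train, test) in results.items()}
# Pick the architecture by its held-out accuracy, not by its training accuracy
best = max(results, key=lambda name: results[name][1])

for name, gap in gaps.items():
    print(name, gap)
print("best on test:", best)
```

Strictly speaking, choosing an architecture by its test score calls for a separate validation set, so this selection is indicative only.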

## Trying other optimizers

For the diabetes example

In [157]:
diab = MultilayerN([8, 10, 2],'relu')
# optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
# optimizer = tf.keras.optimizers.Adamax(learning_rate=1e-3)
# optimizer = tf.keras.optimizers.SGD(learning_rate=1e-3)
optimizer = tf.keras.optimizers.Ftrl(learning_rate=1e-3)

diab.trainTest(X, d, optimizer, 10000, 1000, 30) # 30 is the test percentage recommended in the previous assignment.

Epoch 1000; MSE: 0.2373; Acc: 0.6599
Epoch 2000; MSE: 0.2265; Acc: 0.6803
Epoch 3000; MSE: 0.2161; Acc: 0.6859
Epoch 4000; MSE: 0.2092; Acc: 0.6803
Epoch 5000; MSE: 0.2039; Acc: 0.6989
Epoch 6000; MSE: 0.2003; Acc: 0.7193
Epoch 7000; MSE: 0.1975; Acc: 0.7230
Epoch 8000; MSE: 0.1953; Acc: 0.7156
Epoch 9000; MSE: 0.1938; Acc: 0.7100
Epoch 10000; MSE: 0.1926; Acc: 0.7100
Error de Test:
MSE: 0.1980; Acc: 0.7130


### Results

Sometimes training gets stuck and we do not know why (our theory is that neurons may be "dying"): the network does not learn and stays at a constant accuracy of about 0.30. When it does work, on the diabetes example we obtain:
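
The "dying neuron" theory can be illustrated in isolation. This is one possible mechanism, not a diagnosis of these particular runs: a ReLU unit whose pre-activation is negative for every training sample outputs zero and receives zero gradient, so gradient descent can no longer change it. The inputs, weights, and bias below are hypothetical values chosen to trigger that state.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def relu_grad(z):
    return (z > 0).astype(z.dtype)  # derivative of ReLU: 1 where z > 0, else 0

X = np.array([[0.5, 1.0], [2.0, 0.1], [1.5, 3.0]])  # hypothetical non-negative inputs
w = np.array([0.1, -0.2])
b = -10.0  # a large negative bias, e.g. after one oversized update

z = X @ w + b  # pre-activation is negative for every sample
print(relu(z))
print(relu_grad(z))  # all zeros: no gradient reaches w or b through this unit
```

Since `MultilayerN` applies ReLU to the output layer as well, a run in which the output units end up in this state would stay at a constant loss and accuracy.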

#### Adam

learning_rate=1e-3


Epoch 10000; MSE: 0.1293; Acc: 0.8253


Error de Test:

MSE: 0.1646; Acc: 0.7870

A slight improvement over RMSprop

#### Adamax

learning_rate=1e-3


Epoch 10000; MSE: 0.1338; Acc: 0.8216


Error de Test:

MSE: 0.1695; Acc: 0.7783

A slight improvement over RMSprop

#### SGD

It took us several runs to get this one to work

learning_rate=1e-3


Epoch 10000; MSE: 0.1801; Acc: 0.7546


Error de Test:

MSE: 0.1857; Acc: 0.7478

This one does not improve on RMSprop

#### Ftrl

learning_rate=1e-3


Epoch 10000; MSE: 0.1926; Acc: 0.7100


Error de Test:

MSE: 0.1980; Acc: 0.7130

It does not improve on RMSprop

#### Conclusions:

Adam does look like a good choice; as we saw in class, it is one of the recommended optimizers, together with RMSprop.
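
For reference, the Adam update rule can be written in a few lines. The sketch below minimizes a one-dimensional quadratic in pure Python with the usual default decay rates. It is an illustration of the rule, not the Keras implementation, and the function name `adam_minimize` is hypothetical.

```python
import math

def adam_minimize(grad_fn, x, learning_rate=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=200):
    """Minimize a scalar function with the Adam update rule; returns the list of iterates."""
    m = v = 0.0
    history = [x]
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g        # running mean of the gradient
        v = beta2 * v + (1 - beta2) * g * g    # running mean of the squared gradient
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        x = x - learning_rate * m_hat / (math.sqrt(v_hat) + eps)
        history.append(x)
    return history

history = adam_minimize(lambda x: 2 * x, x=1.0)  # gradient of f(x) = x**2
print(history[1], history[-1])
```

Because the step is normalized by the running magnitude of the gradient, the first update moves by roughly the learning rate whatever the gradient's scale, which is part of why Adam and RMSprop are less sensitive to the learning rate than plain SGD.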