<img src="mioti.png" style="height: 100px">
<center style="color:#888">Data Science in IoT Module<br/>Deep Learning Course</center>
# Challenge S2: "Fashion MNIST" with Neural Networks in TensorFlow (DNNs)

## Objectives

The goal of this challenge is to build a neural network in TensorFlow that can distinguish between items of clothing from the Fashion MNIST dataset, including normalization of the input data and a stopping criterion based on performance on a subset of the data.

## Starting point

The starting point is the code we saw in the worksheet:

In [1]:
import tensorflow as tf

# Import Fashion MNIST data
fashion_mnist = tf.keras.datasets.fashion_mnist.load_data()
(train_images, train_labels), (test_images, test_labels) = fashion_mnist

# Parameters
learning_rate = 0.001
batch_size = 128
display_step = 100
max_epochs = 5
n_batches = len(train_labels)/batch_size

# Network Parameters
n_input = 784   # Fashion MNIST data input (img shape: 28*28)
n_classes = 10  # Fashion MNIST total classes (0-9 types of clothes)

# tf Graph input
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])

input_layer = tf.reshape(x, [-1, 28*28])
dense_layer = tf.layers.dense(inputs=input_layer, units=512, activation=tf.nn.tanh)
output = tf.layers.dense(inputs=dense_layer, units=n_classes, activation=tf.nn.softmax)

# Construct model
pred = output

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    epoch = 0
    while epoch < max_epochs:
        print("Epoch " + str(epoch) + ":")        
        step = 0
        while step < n_batches:
            batch_x = train_images[step*batch_size:(step+1)*batch_size].reshape([-1,28*28])
            batch_y = tf.keras.utils.to_categorical(train_labels[step*batch_size:(step+1)*batch_size],n_classes)
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
            if step % display_step == 0:
                loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x,
                                                                  y: batch_y,
                                                                  })
                print("Iter " + str(step*batch_size) + ", Minibatch Loss= " + \
                      "{:.6f}".format(loss) + ", Training Accuracy= " + \
                      "{:.5f}".format(acc))
            step += 1
        epoch +=1
            
    print("Optimization Finished!")

    # Calculate accuracy for the fashion mnist test images
    print("Testing Accuracy:", \
        sess.run(accuracy, feed_dict={x: test_images.reshape([-1,28*28]),
                                      y: tf.keras.utils.to_categorical(test_labels,n_classes),
                                      }))


Instructions for updating:
Use keras.layers.dense instead.
Instructions for updating:
Colocations handled automatically by placer.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See `tf.nn.softmax_cross_entropy_with_logits_v2`.

Epoch 0:
Iter 0, Minibatch Loss= 2.173581, Training Accuracy= 0.31250
Iter 12800, Minibatch Loss= 1.786840, Training Accuracy= 0.70312
Iter 25600, Minibatch Loss= 1.782801, Training Accuracy= 0.72656
Iter 38400, Minibatch Loss= 1.740467, Training Accuracy= 0.75000
Iter 51200, Minibatch Loss= 1.795145, Training Accuracy= 0.68750
Epoch 1:
Iter 0, Minibatch Loss= 1.749093, Training Accuracy= 0.73438
Iter 12800, Minibatch Loss= 1.750044, Training Accuracy= 0.71094
Iter 25600, Minibatch Loss= 1.791924, Training Accuracy= 0.71875
Iter 38400, Minibatch Loss= 1.727432, Training Accuracy= 0.75781
Iter 51200, Minibatch Loss= 1.807412, Training Accuracy= 0.70312
Epoch 2:
Iter 0, M
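
In the log above, the loss stays above roughly 1.7 even as accuracy rises. One likely cause, offered as a reading of the starting code rather than something stated in the worksheet: the output layer already applies `tf.nn.softmax`, and `tf.nn.softmax_cross_entropy_with_logits` applies softmax again to whatever it receives as `logits`. The NumPy sketch below computes the lowest loss reachable under that setup for 10 classes. The helper names `softmax` and `cross_entropy_from_logits` are illustrative and not part of the worksheet.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy_from_logits(logits, onehot):
    """Mean cross-entropy, applying softmax to the inputs first."""
    log_probs = np.log(softmax(logits))
    return float(-(onehot * log_probs).sum(axis=1).mean())

n_classes = 10
labels = np.eye(n_classes)  # one example per class, one-hot encoded

# A "perfect" softmax output: probability 1.0 on the correct class.
perfect_probs = labels.copy()

# Passing probabilities where logits are expected applies softmax twice.
loss_double_softmax = cross_entropy_from_logits(perfect_probs, labels)

# Passing large raw logits for the correct class gives a loss close to 0.
loss_raw_logits = cross_entropy_from_logits(20.0 * labels, labels)

print("Loss floor with double softmax:", loss_double_softmax)
print("Loss with raw logits:", loss_raw_logits)
```

Under these assumptions the floor is ln(1 + 9/e), about 1.46, which is consistent with the plateau in the logs. The argmax used for accuracy does not change, since softmax preserves the ordering of the scores.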

## Tasks

We will start by normalizing the input data in three ways: scaling the input values to the range 0-1, centering them to an approximate mean of 0, and transforming them to approximately a normal distribution with mean 0 and unit standard deviation (N(0,1)).

Next, we will change the training stopping criterion from a maximum number of iterations (epochs) to ending training when certain conditions are met on a subset of the data or, optionally, on a validation set (independent of the training data).

### Normalization 1: scaling the values to the range (0, 1)

Starting from the code above, make the necessary changes so that the image values are scaled to the range (0, 1).
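
As a minimal sketch of this scaling step, assuming the images are `uint8` arrays with pixel values in 0..255 (the arrays below are synthetic stand-ins, and `scale_to_unit_range` is a hypothetical helper name), dividing by the largest possible pixel value maps every pixel into the closed interval from 0 to 1:

```python
import numpy as np

# Synthetic stand-ins for the image arrays (assumption: uint8 pixels in 0..255,
# shape (n_images, 28, 28)).
rng = np.random.default_rng(0)
fake_train = rng.integers(0, 256, size=(100, 28, 28), dtype=np.uint8)
fake_test = rng.integers(0, 256, size=(20, 28, 28), dtype=np.uint8)

def scale_to_unit_range(images, max_value=255.0):
    """Convert to float32 and divide by the largest possible pixel value."""
    return images.astype(np.float32) / max_value

# The same constant is used for train and test, so no test statistics are needed.
train_scaled = scale_to_unit_range(fake_train)
test_scaled = scale_to_unit_range(fake_test)

print(train_scaled.dtype, train_scaled.min(), train_scaled.max())
```

Dividing by the fixed constant 255 rather than by the observed maximum keeps the transformation identical for any subset of the data.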


In [2]:
#Norm 1
test_images_scaled = test_images/255
train_images_scaled= train_images/255

# Parameters
learning_rate = 0.001
batch_size = 128
display_step = 100
max_epochs = 5
n_batches = len(train_labels)/batch_size

# Network Parameters
n_input = 784   # Fashion MNIST data input (img shape: 28*28)
n_classes = 10  # Fashion MNIST total classes (0-9 types of clothes)

# tf Graph input
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])

input_layer = tf.reshape(x, [-1, 28*28])
dense_layer = tf.layers.dense(inputs=input_layer, units=512, activation=tf.nn.tanh)
output = tf.layers.dense(inputs=dense_layer, units=n_classes, activation=tf.nn.softmax)

# Construct model
pred = output

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    epoch = 0
    while epoch < max_epochs:
        print("Epoch " + str(epoch) + ":")        
        step = 0
        while step < n_batches:
            batch_x = train_images_scaled[step*batch_size:(step+1)*batch_size].reshape([-1,28*28])
            batch_y = tf.keras.utils.to_categorical(train_labels[step*batch_size:(step+1)*batch_size],n_classes)
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
            if step % display_step == 0:
                loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x,
                                                                  y: batch_y,
                                                                  })
                print("Iter " + str(step*batch_size) + ", Minibatch Loss= " + \
                      "{:.6f}".format(loss) + ", Training Accuracy= " + \
                      "{:.5f}".format(acc))
            step += 1
        epoch +=1
            
    print("Optimization Finished!")

    # Calculate accuracy for the fashion mnist test images
    print("Testing Accuracy:", \
        sess.run(accuracy, feed_dict={x: test_images_scaled.reshape([-1,28*28]),
                                      y: tf.keras.utils.to_categorical(test_labels,n_classes),
                                      }))


Epoch 0:
Iter 0, Minibatch Loss= 2.159032, Training Accuracy= 0.37500
Iter 12800, Minibatch Loss= 1.632601, Training Accuracy= 0.82812
Iter 25600, Minibatch Loss= 1.643293, Training Accuracy= 0.82812
Iter 38400, Minibatch Loss= 1.613135, Training Accuracy= 0.85938
Iter 51200, Minibatch Loss= 1.635592, Training Accuracy= 0.84375
Epoch 1:
Iter 0, Minibatch Loss= 1.580157, Training Accuracy= 0.89062
Iter 12800, Minibatch Loss= 1.587881, Training Accuracy= 0.87500
Iter 25600, Minibatch Loss= 1.604913, Training Accuracy= 0.85938
Iter 38400, Minibatch Loss= 1.591487, Training Accuracy= 0.89062
Iter 51200, Minibatch Loss= 1.612817, Training Accuracy= 0.85938
Epoch 2:
Iter 0, Minibatch Loss= 1.565567, Training Accuracy= 0.90625
Iter 12800, Minibatch Loss= 1.577341, Training Accuracy= 0.87500
Iter 25600, Minibatch Loss= 1.593363, Training Accuracy= 0.86719
Iter 38400, Minibatch Loss= 1.576361, Training Accuracy= 0.88281
Iter 51200, Minibatch Loss= 1.608650, Training Accuracy= 0.85156
Epoch 3:
I

### Normalization 2: centering to an approximate mean of 0

HINT: To center the values to an approximate mean of 0, you can compute the overall mean and subtract it from all the data. Remember that information from the evaluation (test) data must not be used, but the test data must go through the same processing as the data used to train the network.


In [3]:
#Norm 2
media = train_images.mean()
test_images_scaled = test_images - media
train_images_scaled= train_images - media

# Parameters
learning_rate = 0.001
batch_size = 128
display_step = 100
max_epochs = 5
n_batches = len(train_labels)/batch_size

# Network Parameters
n_input = 784   # Fashion MNIST data input (img shape: 28*28)
n_classes = 10  # Fashion MNIST total classes (0-9 types of clothes)

# tf Graph input
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])

input_layer = tf.reshape(x, [-1, 28*28])
dense_layer = tf.layers.dense(inputs=input_layer, units=512, activation=tf.nn.tanh)
output = tf.layers.dense(inputs=dense_layer, units=n_classes, activation=tf.nn.softmax)

# Construct model
pred = output

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    epoch = 0
    while epoch < max_epochs:
        print("Epoch " + str(epoch) + ":")        
        step = 0
        while step < n_batches:
            batch_x = train_images_scaled[step*batch_size:(step+1)*batch_size].reshape([-1,28*28])
            batch_y = tf.keras.utils.to_categorical(train_labels[step*batch_size:(step+1)*batch_size],n_classes)
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
            if step % display_step == 0:
                loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x,
                                                                  y: batch_y,
                                                                  })
                print("Iter " + str(step*batch_size) + ", Minibatch Loss= " + \
                      "{:.6f}".format(loss) + ", Training Accuracy= " + \
                      "{:.5f}".format(acc))
            step += 1
        epoch +=1
            
    print("Optimization Finished!")

    # Calculate accuracy for the fashion mnist test images
    print("Testing Accuracy:", \
        sess.run(accuracy, feed_dict={x: test_images_scaled.reshape([-1,28*28]),
                                      y: tf.keras.utils.to_categorical(test_labels,n_classes),
                                      }))


Epoch 0:
Iter 0, Minibatch Loss= 2.172581, Training Accuracy= 0.33594
Iter 12800, Minibatch Loss= 1.702028, Training Accuracy= 0.79688
Iter 25600, Minibatch Loss= 1.716622, Training Accuracy= 0.78906
Iter 38400, Minibatch Loss= 1.697324, Training Accuracy= 0.80469
Iter 51200, Minibatch Loss= 1.727158, Training Accuracy= 0.78906
Epoch 1:
Iter 0, Minibatch Loss= 1.656062, Training Accuracy= 0.83594
Iter 12800, Minibatch Loss= 1.677757, Training Accuracy= 0.80469
Iter 25600, Minibatch Loss= 1.695767, Training Accuracy= 0.80469
Iter 38400, Minibatch Loss= 1.697211, Training Accuracy= 0.79688
Iter 51200, Minibatch Loss= 1.713699, Training Accuracy= 0.79688
Epoch 2:
Iter 0, Minibatch Loss= 1.655661, Training Accuracy= 0.84375
Iter 12800, Minibatch Loss= 1.645783, Training Accuracy= 0.84375
Iter 25600, Minibatch Loss= 1.700110, Training Accuracy= 0.75000
Iter 38400, Minibatch Loss= 1.691915, Training Accuracy= 0.80469
Iter 51200, Minibatch Loss= 1.724837, Training Accuracy= 0.78906
Epoch 3:
I

### Normalization 3: normal distribution with mean 0 and standard deviation 1 (N(0,1) standardization)

HINT: To standardize the values to an approximately normal distribution N(0, 1), you can compute the overall mean and standard deviation and apply the normalization: x\_norm = (x - mean)/std.

Remember that information from the evaluation (test) data must not be used, but the test data must go through the same processing as the data used to train the network.
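
One way to keep the train-only rule explicit is to separate estimating the statistics from applying them. The sketch below assumes synthetic `uint8` images; `fit_standardizer`, `apply_standardizer`, and the `eps` guard are hypothetical names introduced for illustration.

```python
import numpy as np

def fit_standardizer(train_images):
    """Return the global mean and standard deviation of the training pixels."""
    mean = float(train_images.mean(dtype=np.float64))
    std = float(train_images.std(dtype=np.float64))
    return mean, std

def apply_standardizer(images, mean, std, eps=1e-8):
    """Apply (x - mean) / std; eps guards against a zero standard deviation."""
    return ((images - mean) / (std + eps)).astype(np.float32)

# Synthetic stand-ins for the real images (assumption: uint8 pixels in 0..255).
rng = np.random.default_rng(2)
fake_train = rng.integers(0, 256, size=(200, 28, 28), dtype=np.uint8)
fake_test = rng.integers(0, 256, size=(50, 28, 28), dtype=np.uint8)

mean, std = fit_standardizer(fake_train)           # statistics from train only
train_std = apply_standardizer(fake_train, mean, std)
test_std = apply_standardizer(fake_test, mean, std)

print("train:", train_std.mean(), train_std.std())
print("test:", test_std.mean(), test_std.std())
```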


In [4]:
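############## If the kernel hangs when running this cell,
############## use these lines to allow
############## duplicate libraries to be loaded
import os
os.environ['KMP_DUPLICATE_LIB_OK']='True'
##############
#Norm3
# Statistics come from the training set only; media is recomputed here
# so that this cell does not depend on the previous one having been run
media = train_images.mean()
desv = train_images.std()
test_images_scaled = (test_images - media)/desv
train_images_scaled= (train_images - media)/desv

# Parameters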
learning_rate = 0.001
batch_size = 128
display_step = 100
max_epochs = 5
n_batches = len(train_labels)/batch_size

# Network Parameters
n_input = 784   # Fashion MNIST data input (img shape: 28*28)
n_classes = 10  # Fashion MNIST total classes (0-9 types of clothes)

# tf Graph input
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])

input_layer = tf.reshape(x, [-1, 28*28])
dense_layer = tf.layers.dense(inputs=input_layer, units=512, activation=tf.nn.tanh)
output = tf.layers.dense(inputs=dense_layer, units=n_classes, activation=tf.nn.softmax)

# Construct model
pred = output

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    epoch = 0
    while epoch < max_epochs:
        print("Epoch " + str(epoch) + ":")        
        step = 0
        while step < n_batches:
            batch_x = train_images_scaled[step*batch_size:(step+1)*batch_size].reshape([-1,28*28])
            batch_y = tf.keras.utils.to_categorical(train_labels[step*batch_size:(step+1)*batch_size],n_classes)
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
            if step % display_step == 0:
                loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x,
                                                                  y: batch_y,
                                                                  })
                print("Iter " + str(step*batch_size) + ", Minibatch Loss= " + \
                      "{:.6f}".format(loss) + ", Training Accuracy= " + \
                      "{:.5f}".format(acc))
            step += 1
        epoch +=1
            
    print("Optimization Finished!")

    # Calculate accuracy for the fashion mnist test images
    print("Testing Accuracy:", \
        sess.run(accuracy, feed_dict={x: test_images_scaled.reshape([-1,28*28]),
                                      y: tf.keras.utils.to_categorical(test_labels,n_classes),
                                      }))



Epoch 0:
Iter 0, Minibatch Loss= 2.147420, Training Accuracy= 0.28906
Iter 12800, Minibatch Loss= 1.613519, Training Accuracy= 0.85156
Iter 25600, Minibatch Loss= 1.616001, Training Accuracy= 0.85938
Iter 38400, Minibatch Loss= 1.594514, Training Accuracy= 0.88281
Iter 51200, Minibatch Loss= 1.627428, Training Accuracy= 0.84375
Epoch 1:
Iter 0, Minibatch Loss= 1.570115, Training Accuracy= 0.89062
Iter 12800, Minibatch Loss= 1.571768, Training Accuracy= 0.88281
Iter 25600, Minibatch Loss= 1.598428, Training Accuracy= 0.87500
Iter 38400, Minibatch Loss= 1.582096, Training Accuracy= 0.88281
Iter 51200, Minibatch Loss= 1.586909, Training Accuracy= 0.88281
Epoch 2:
Iter 0, Minibatch Loss= 1.561244, Training Accuracy= 0.91406
Iter 12800, Minibatch Loss= 1.560225, Training Accuracy= 0.89844
Iter 25600, Minibatch Loss= 1.586061, Training Accuracy= 0.89844
Iter 38400, Minibatch Loss= 1.576350, Training Accuracy= 0.89844
Iter 51200, Minibatch Loss= 1.558934, Training Accuracy= 0.91406
Epoch 3:
I

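Did the results improve with these normalizations? Why? Which one gives the best result?

Yes, every normalization improves on the unnormalized baseline. Judging by the minibatch training accuracy logged above (ignoring the very first iteration), the baseline stays around 0.69 to 0.76, mean centering reaches about 0.75 to 0.84, scaling to (0, 1) about 0.83 to 0.91, and standardization about 0.84 to 0.91. A likely explanation is the scale of the inputs: raw pixels range from 0 to 255, which drives the pre-activations of the tanh layer far into its flat regions, where gradients are close to zero and learning is slow. Bringing the inputs to a small range around zero keeps tanh in its sensitive region. Centering alone helps less because the values still span roughly 255 units.


The best result is obtained with standardization to mean 0 and standard deviation 1.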
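
To make the effect of input scale on a tanh layer concrete, the sketch below measures, at initialization only, what fraction of 512 hidden units have a near-zero tanh derivative for each preprocessing variant. It rests on stated assumptions: uniformly random pixels instead of real Fashion MNIST images, and Glorot-style normal weights, which may differ from the initializer the worksheet code uses. `saturated_fraction` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_input, n_hidden = 784, 512

# Hypothetical batch of raw pixel rows and Glorot-style random weights.
raw = rng.integers(0, 256, size=(64, n_input)).astype(np.float64)
weights = rng.normal(0.0, np.sqrt(2.0 / (n_input + n_hidden)),
                     size=(n_input, n_hidden))

def saturated_fraction(inputs, weights, threshold=0.01):
    """Fraction of units whose tanh derivative 1 - tanh(z)^2 is below threshold."""
    z = inputs @ weights
    grad = 1.0 - np.tanh(z) ** 2
    return float((grad < threshold).mean())

variants = {
    "raw": raw,
    "scaled": raw / 255.0,
    "centered": raw - raw.mean(),
    "standardized": (raw - raw.mean()) / raw.std(),
}
fractions = {name: saturated_fraction(v, weights) for name, v in variants.items()}

for name, frac in fractions.items():
    print(name, round(frac, 3))
```

With these synthetic inputs, the variants that shrink the scale (scaling and standardization) leave almost no unit saturated, while the raw and centered-only inputs saturate most units. This only describes the starting point of training, not the full training dynamics.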

### Stopping criterion

Often, instead of estimating the maximum number of iterations or epochs, a stopping criterion is used: a control mechanism that stops training when certain conditions are met on a subset of the data.

When applying this kind of technique, the training data is usually split into two sets: a "train" set, used to train the model, and a validation set, used to make this kind of decision. To start with a simpler version of the exercise, you can use a simpler stopping criterion based on the network's performance on train (on the last training batch used).

Next, modify the initial code so that training stops when the performance (accuracy) measured on the training data has not improved in the last N iterations (a parameter we can set at the top of the code, "patience"):

OPTIONAL: You can optionally choose a subset of the training data (for example, 1000 images, balanced by class if possible), set it aside as the validation set, and apply the stopping criterion to it.
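
For the optional part, a class-balanced validation subset can be drawn by sampling the same number of indices from each class. This is a minimal sketch on synthetic labels; `balanced_validation_split` and its parameters are hypothetical names, and the labels are assumed to be integers from 0 to 9.

```python
import numpy as np

def balanced_validation_split(labels, per_class, n_classes=10, seed=0):
    """Return (train_idx, val_idx) with per_class validation examples per class."""
    rng = np.random.default_rng(seed)
    val_parts = []
    for c in range(n_classes):
        class_idx = np.flatnonzero(labels == c)
        # Sample without replacement so no image is picked twice.
        val_parts.append(rng.choice(class_idx, size=per_class, replace=False))
    val_idx = np.sort(np.concatenate(val_parts))
    # Everything not chosen for validation stays in the training set.
    train_idx = np.setdiff1d(np.arange(len(labels)), val_idx)
    return train_idx, val_idx

# Synthetic labels standing in for train_labels.
fake_labels = np.random.default_rng(42).integers(0, 10, size=5000)
train_idx, val_idx = balanced_validation_split(fake_labels, per_class=100)

print(len(train_idx), len(val_idx), np.bincount(fake_labels[val_idx]))
```

With the real data, the two index arrays would be used to slice both the images and the labels, for example `train_images[val_idx]` and `train_labels[val_idx]`.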

In [5]:
# Parameters
learning_rate = 0.001
batch_size = 128
display_step = 100
max_epochs = 5
n_batches = len(train_labels)/batch_size
patience = 0
old_acc = 0

# Network Parameters
n_input = 784   # Fashion MNIST data input (img shape: 28*28)
n_classes = 10  # Fashion MNIST total classes (0-9 types of clothes)

# tf Graph input
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])

input_layer = tf.reshape(x, [-1, 28*28])
dense_layer = tf.layers.dense(inputs=input_layer, units=512, activation=tf.nn.tanh)
output = tf.layers.dense(inputs=dense_layer, units=n_classes, activation=tf.nn.softmax)

# Construct model
pred = output

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    epoch = 0
    while epoch < max_epochs and patience <3:
        print("Epoch " + str(epoch) + ":")        
        step = 0
        while step < n_batches:
            batch_x = train_images[step*batch_size:(step+1)*batch_size].reshape([-1,28*28])
            batch_y = tf.keras.utils.to_categorical(train_labels[step*batch_size:(step+1)*batch_size],n_classes)
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
            if step % display_step == 0:
                loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x,
                                                                  y: batch_y,
                                                                  })
                print("Iter " + str(step*batch_size) + ", Minibatch Loss= " + \
                      "{:.6f}".format(loss) + ", Training Accuracy= " + \
                      "{:.5f}".format(acc))
                
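            ## Change: track the difference in accuracy between consecutive steps.
            ## Note: acc is only refreshed every display_step steps, so on the other
            ## steps diff is 0 and the counter below increases (once epoch > 0).
            ## Here patience is the counter; its limit (3) is checked once per epoch.
            
            diff = acc - old_acc
            if diff < 0.0001 and epoch > 0:
                patience += 1
            old_acc = acc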
            step += 1
        epoch +=1
            
    print("Optimization Finished!")

    # Calculate accuracy for the fashion mnist test images
    print("Testing Accuracy:", \
        sess.run(accuracy, feed_dict={x: test_images.reshape([-1,28*28]),
                                      y: tf.keras.utils.to_categorical(test_labels,n_classes),
                                      }))


Epoch 0:
Iter 0, Minibatch Loss= 2.208279, Training Accuracy= 0.26562
Iter 12800, Minibatch Loss= 1.744557, Training Accuracy= 0.78906
Iter 25600, Minibatch Loss= 1.742233, Training Accuracy= 0.75781
Iter 38400, Minibatch Loss= 1.742383, Training Accuracy= 0.78906
Iter 51200, Minibatch Loss= 1.766330, Training Accuracy= 0.75781
Epoch 1:
Iter 0, Minibatch Loss= 1.752274, Training Accuracy= 0.78125
Iter 12800, Minibatch Loss= 1.743943, Training Accuracy= 0.77344
Iter 25600, Minibatch Loss= 1.720491, Training Accuracy= 0.78125
Iter 38400, Minibatch Loss= 1.736371, Training Accuracy= 0.74219
Iter 51200, Minibatch Loss= 1.753955, Training Accuracy= 0.75000
Optimization Finished!
Testing Accuracy: 0.734
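
The patience rule described in this section can also be sketched independently of TensorFlow. The version below is one possible design, not the worksheet's reference solution: it compares each measurement with the best value seen so far (rather than with the previous measurement) and stops after `patience` consecutive checks without improvement. The class name `EarlyStopping`, the `min_delta` parameter, and the accuracy values are all made up for illustration.

```python
class EarlyStopping:
    """Signal a stop when accuracy has not improved for `patience` checks."""

    def __init__(self, patience, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.checks_without_improvement = 0

    def update(self, value):
        """Record a new measurement; return True when training should stop."""
        if value > self.best + self.min_delta:
            self.best = value
            self.checks_without_improvement = 0
        else:
            self.checks_without_improvement += 1
        return self.checks_without_improvement >= self.patience

# Replay of made-up accuracy measurements (not taken from any real run).
accuracies = [0.60, 0.70, 0.75, 0.74, 0.75, 0.73, 0.76, 0.80]
stopper = EarlyStopping(patience=3)
stopped_at = None
for check, acc in enumerate(accuracies):
    if stopper.update(acc):
        stopped_at = check
        break

print("stopped at check", stopped_at, "with best accuracy", stopper.best)
```

In a training loop, `update` would be called only when a fresh accuracy value has been computed (for example inside the `if step % display_step == 0:` block), and a `True` result would need to exit both the batch loop and the epoch loop.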
