<h1>CIFAR10 with a Convolutional Neural Network</h1>

<p>The networks in the previous exercises could usually be trained in a few seconds or a few minutes, and with the right network structure they reached accuracies above 90%. That changes in this exercise, because we will use the CIFAR10 dataset. It contains 50k training images and 10k test images of 32x32 RGB pixels each. As the name suggests, it consists of 10 categories. Although it covers only a few image classes, it is a hard dataset and is used in research to test new ideas: <a href="http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html#43494641522d3130" target="_blank">http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html#43494641522d3130</a></p>


<h3>Preparation</h3>

<p>The code in <a href="http://home.htw-berlin.de/~hezel/computervision/WS1718/uebung4/cvutils.py" target="_blank">cvutils.py</a> has been updated for the new dataset. Copy its contents or overwrite your current version with the online one. Then move the convolutional neural network from the last exercise into the new template and adjust the layer sizes if necessary. Train for a few iterations once and note the accuracy of this first attempt.</p>

<p>The complete notebook is available for&nbsp;<a href="http://home.htw-berlin.de/~hezel/computervision/WS1718/uebung4/Tensorflow_ConvNet_CIFAR10_Vorlage.ipynb" target="_blank">download</a>&nbsp;here.</p>

<h3>Task</h3>

<p>If you reuse last week's network, expect accuracies of just under 70%. After that, overfitting sets in. Look at the training error and the accuracy curves to see this. The goal of this exercise is to reach a good prediction accuracy on the new dataset in as little time as possible, using a neural network you wrote yourself. You are free to use any technique you like. Accuracies around 80% are a good target. Here are a few tips on how to get there:</p>

<ul>
	<li>
	<p><strong>Dropout</strong>: Dropout (<a href="https://www.tensorflow.org/versions/master/api_docs/python/tf/layers/dropout" target="_blank">tf.layers.dropout</a>) lets you reduce overfitting.</p>
	</li>
	<li>
	<p><strong>Weight Regularization</strong>: The kernel regularizers (of <a href="https://www.tensorflow.org/versions/master/api_docs/python/tf/layers/conv2d" target="_blank">tf.layers.conv2d</a>&nbsp;and <a href="https://www.tensorflow.org/versions/master/api_docs/python/tf/layers/dense" target="_blank">tf.layers.dense</a>) can keep the weights from taking extremely large values.</p>
	</li>
	<li>
	<p><strong>Data Augmentation</strong>: To get more variation in the training data, you can extend (augment) it. <a href="https://www.tensorflow.org/api_docs/python/tf/image" target="_blank">tf.image</a> offers many functions for applying small random changes to the images.</p>
	</li>
	<li>
	<p><strong>Batch normalization</strong>: When the network works with normalized data everywhere (<a href="https://www.tensorflow.org/versions/master/api_docs/python/tf/nn/batch_normalization" target="_blank">tf.nn.batch_normalization</a>), it can concentrate better on the essential task (classification).</p>
	</li>
	<li>
	<p><strong>Network structure</strong>: The number and size of the filter kernels in a ConvNet determine how many different complex image patterns the network can recognize. For the network structure, take inspiration from the paper <a href="https://arxiv.org/pdf/1412.6806.pdf" target="_blank">&quot;Striving for Simplicity: The All Convolutional Net&quot;</a>.</p>
	</li>
</ul>
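
<p>As an illustration of the data augmentation tip, here is a minimal NumPy sketch of two common augmentations: a random left-right flip and a random crop after zero-padding. It is a stand-in for what the tf.image functions would do inside the graph, not the template's method; the function name <code>augment_batch</code> and the padding of 4 pixels are assumptions made for this example.</p>

```python
import numpy as np

def augment_batch(images, rng, pad=4):
    """Randomly flip and crop a batch of images shaped (N, H, W, C).

    Hypothetical helper for illustration; not part of cvutils.py.
    """
    n, h, w, _ = images.shape
    # Flip roughly half of the images left-right.
    flip = rng.random(n) < 0.5
    out = np.where(flip[:, None, None, None], images[:, :, ::-1, :], images)
    # Zero-pad the borders, then cut a random HxW window out of each image.
    padded = np.pad(out, ((0, 0), (pad, pad), (pad, pad), (0, 0)), mode="constant")
    tops = rng.integers(0, 2 * pad + 1, size=n)
    lefts = rng.integers(0, 2 * pad + 1, size=n)
    crops = [padded[i, t:t + h, l:l + w, :]
             for i, (t, l) in enumerate(zip(tops, lefts))]
    return np.stack(crops)

rng = np.random.default_rng(0)
batch = rng.random((8, 32, 32, 3)).astype("float32")
augmented = augment_batch(batch, rng)
print(augmented.shape)  # (8, 32, 32, 3)
```

<p>Augmentation of this kind would only be applied to training batches; the test images stay unchanged so that the reported accuracy remains comparable.</p>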

<p>For your best network, note the accuracy and write it, together with the network architecture, into a PDF file. The architecture can be described in text or shown visually.</p>


In [1]:
import tensorflow as tf
import numpy as np
from cvutils import fetch_cifar10
import math
import time

In [2]:
# load CIFAR10 data set
cifar = fetch_cifar10()
x_train = cifar.train.data.astype('float32')
y_train = cifar.train.target.astype('int64')
x_test = cifar.test.data.astype('float32')
y_test = cifar.test.target.astype('int64')

print(x_train.shape)
print(x_test.shape)

Download MNIST to /home/s0540607/.keras/datasets
(50000, 32, 32, 3)
(10000, 32, 32, 3)


In [3]:
print(y_train.shape)

(50000, 1)
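
<p>The labels arrive as a column of integer class indices with shape (N, 1). The following small NumPy sketch shows the two label layouts that loss functions typically expect: flat class indices for sparse cross-entropy losses, and one-hot rows for losses that take a full probability distribution per example. The toy label values are made up for illustration.</p>

```python
import numpy as np

# Toy labels in the same layout as y_train: shape (N, 1), integer class ids.
y = np.array([[3], [0], [9], [3]], dtype="int64")

y_flat = y.reshape(-1)                           # shape (N,): class indices
y_one_hot = np.eye(10, dtype="float32")[y_flat]  # shape (N, 10): one-hot rows

print(y.shape, y_flat.shape, y_one_hot.shape)
```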


In [29]:
DROPOUT = True
DROPOUT_KEEP_PROB = 0.8
NORM = True 
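
<p>To make the meaning of DROPOUT_KEEP_PROB concrete, here is a minimal NumPy sketch of inverted dropout: each activation is kept with probability keep_prob and the kept values are scaled by 1/keep_prob, so the expected activation stays the same. The function and its <code>training</code> switch are assumptions for this illustration, not part of the notebook.</p>

```python
import numpy as np

def dropout(x, keep_prob, rng, training=True):
    """Inverted dropout on an array x (illustrative sketch)."""
    if not training:
        # At evaluation time the activations pass through unchanged.
        return x
    # Keep each element with probability keep_prob ...
    mask = rng.random(x.shape) < keep_prob
    # ... and rescale the survivors so the expected value is preserved.
    return x * mask / keep_prob

rng = np.random.default_rng(0)
activations = np.ones((1000, 100))
dropped = dropout(activations, keep_prob=0.8, rng=rng)
print(dropped.mean())                    # close to 1.0 on average
print((dropped == 0).mean())             # roughly 0.2 of the entries are zeroed
```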

In [34]:
# optional utils
def opt_drop(X, keep_prob=DROPOUT_KEEP_PROB, drop=DROPOUT):
    """
        Optionally applies dropout to the given layer output.
        Note: dropout is not tied to phase_train, so when enabled it is
        also active while evaluating on the test set.
    """
    if drop:
        print('WARNING: Applying dropout...')
        return tf.nn.dropout(X, keep_prob=keep_prob)
    return X

def opt_norm(X, name, norm=NORM):
    """
        Optionally applies local response normalization, as used in AlexNet.
        API: https://www.tensorflow.org/api_docs/python/tf/nn/local_response_normalization
    """
    if norm:
        return tf.nn.lrn(X, name=name)
    return X



def batch_norm(x, n_out, phase_train):
    """
    Batch normalization on convolutional maps.
    Ref.: http://stackoverflow.com/questions/33949786/how-could-i-use-batch-normalization-in-tensorflow
    Args:
        x:           Tensor, 4D BHWD input maps
        n_out:       integer, depth of input maps
        phase_train: boolean tf.Tensor (here a placeholder), True indicates training phase
    Return:
        normed:      batch-normalized maps
    """
    with tf.variable_scope('bn'):
        beta = tf.Variable(tf.constant(0.0, shape=[n_out]),
                                     name='beta', trainable=True)
        gamma = tf.Variable(tf.constant(1.0, shape=[n_out]),
                                      name='gamma', trainable=True)
        batch_mean, batch_var = tf.nn.moments(x, [0,1,2], name='moments')
        ema = tf.train.ExponentialMovingAverage(decay=0.5)

        def mean_var_with_update():
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)

        mean, var = tf.cond(phase_train,
                            mean_var_with_update,
                            lambda: (ema.average(batch_mean), ema.average(batch_var)))
        normed = tf.nn.batch_normalization(x, mean, var, beta, gamma, 1e-3)
    return normed
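
<p>The batch_norm helper above can be hard to follow because of the graph control flow. The following NumPy sketch shows the same idea under simplifying assumptions: during training, each channel is normalized with the statistics of the current batch and a moving average is updated; at evaluation time, the stored moving averages are used instead. The class name, the decay of 0.9 and the initial running statistics are choices made for this example (the notebook uses a decay of 0.5).</p>

```python
import numpy as np

class BatchNormSketch:
    """Per-channel batch normalization for arrays shaped (N, H, W, C)."""

    def __init__(self, depth, decay=0.9, eps=1e-3):
        self.gamma = np.ones(depth)         # learned scale (fixed here)
        self.beta = np.zeros(depth)         # learned offset (fixed here)
        self.running_mean = np.zeros(depth)
        self.running_var = np.ones(depth)
        self.decay = decay
        self.eps = eps

    def __call__(self, x, training):
        if training:
            # Batch statistics over the batch and both spatial axes.
            mean = x.mean(axis=(0, 1, 2))
            var = x.var(axis=(0, 1, 2))
            # Exponential moving averages for later use at evaluation time.
            self.running_mean = self.decay * self.running_mean + (1 - self.decay) * mean
            self.running_var = self.decay * self.running_var + (1 - self.decay) * var
        else:
            mean, var = self.running_mean, self.running_var
        return self.gamma * (x - mean) / np.sqrt(var + self.eps) + self.beta

rng = np.random.default_rng(0)
feature_maps = rng.normal(loc=3.0, scale=2.0, size=(16, 8, 8, 4))
bn = BatchNormSketch(depth=4)
normed = bn(feature_maps, training=True)
print(normed.mean(axis=(0, 1, 2)))  # approximately 0 per channel
print(normed.var(axis=(0, 1, 2)))   # approximately 1 per channel
```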


In [79]:
num_classes = 10
kernel_size = 3
num_channels = x_train.shape[3] #3
# Number of channels by layer
n_channels_01 = 64
n_channels_02 = 128
n_channels_03 = 256
n_channels_04 = 128
n_channels_05 = num_classes 

# learn rate
learning_rate = 0.01

batch_size = 128
num_iters = math.ceil(x_train.shape[0] / batch_size)
print('num_iters: ', num_iters)
num_epochs = 3
print('num_epochs: ', num_epochs)

# Seed
seed = 5
img_dim = x_train.shape[1] # 32


graphCNN = tf.Graph()
with graphCNN.as_default():
    #x_input = tf.placeholder(dtype=tf.float32, shape=[None, img_dim*img_dim], name='x')
    x_input = tf.placeholder(dtype=tf.float32, shape=[None,img_dim, img_dim, num_channels])
    y_input = tf.placeholder(tf.int64, shape=[None, y_train.shape[1]], name='y')
    phase_train = tf.placeholder(tf.bool, name='phase_train')
    
    #x_in = tf.reshape(x_input, shape=[tf.shape(x_input)[0], img_dim, img_dim, num_channels])
    
    # Network
    # Zero-pad the 32x32 input by 2 pixels per side -> 36x36
    # https://www.tensorflow.org/api_docs/python/tf/pad
    paddings = tf.constant([[0, 0,], [2, 2,], [2, 2], [0, 0,]])
    padded = tf.pad(x_input, paddings, 'CONSTANT')
    print("Padded input shape", padded.shape)

    # Layer 01 Conv: output feature maps: 64 filters X 36x36; https://www.tensorflow.org/api_docs/python/tf/nn/conv2d
#     W1 = tf.get_variable("W1", [kernel_size, kernel_size, num_channels, n_channels_01], 
#                          initializer=tf.contrib.layers.xavier_initializer(seed=seed))
#     Z1 = tf.nn.conv2d(padded, W1, strides=[1,1,1,1], padding='SAME', name='conv1'
#                       kernel_regularize)
    Z1 = tf.layers.conv2d(padded, n_channels_01, [kernel_size, kernel_size], padding='SAME', name='conv1',
                          kernel_regularizer=tf.contrib.layers.l2_regularizer(0.3), 
                          kernel_initializer=tf.contrib.layers.xavier_initializer(seed=seed))
    
    Z1 = batch_norm(Z1, n_channels_01, phase_train=phase_train)
    
    A1 = opt_drop(X=tf.nn.relu(Z1, name='relu1'))

    print("A1 shape:", A1.shape)
    P1 = opt_norm(X=tf.nn.max_pool(A1, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME', name='max_pool1'),
                 name='norm1')
    print("P1 shape:", P1.shape)

    # Layer 02 Conv: output feature maps: 128 filters X 18x18
#     W2 = tf.get_variable("W2", [kernel_size, kernel_size, n_channels_01, n_channels_02], 
#                          initializer=tf.contrib.layers.xavier_initializer(seed=seed))
#     Z2 = tf.nn.conv2d(P1, W2, strides=[1,1,1,1], padding='SAME', name='conv2')
    Z2 = tf.layers.conv2d(P1, n_channels_02, [kernel_size, kernel_size], padding='SAME', name='conv2',
                          kernel_regularizer=tf.contrib.layers.l2_regularizer(0.3), 
                          kernel_initializer=tf.contrib.layers.xavier_initializer(seed=seed))
    
    Z2 = batch_norm(Z2, n_channels_02, phase_train=phase_train)
    
    A2  = opt_drop(X=tf.nn.relu(Z2, name='relu2'))
    print("A2 shape:", A2.shape)
    P2 = opt_norm(X=tf.nn.max_pool(A2, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME', name='max_pool2'),
                 name='norm2')
    print("P2 shape:", P2.shape)
    
    
    # Layer 03 Conv: output feature maps: 256 filters X 9x9
#     W3 = tf.get_variable("W3", [kernel_size, kernel_size, n_channels_02, n_channels_03], 
#                          initializer=tf.contrib.layers.xavier_initializer(seed=seed))
#     Z3 = tf.nn.conv2d(P2, W3, strides=[1,1,1,1], padding='SAME', name='conv3')
    Z3 = tf.layers.conv2d(P2, n_channels_03, [kernel_size, kernel_size], padding='SAME', name='conv3',
                          kernel_regularizer=tf.contrib.layers.l2_regularizer(0.3), 
                          kernel_initializer=tf.contrib.layers.xavier_initializer(seed=seed))
    
    
    Z3 = batch_norm(Z3, n_channels_03, phase_train=phase_train)
    
    A3  = opt_norm(X=opt_drop(tf.nn.relu(Z3, name='relu3')),
                  name='norm3')
    print("A3 shape:", A3.shape)

    F = tf.contrib.layers.flatten(A3)
    #F = tf.contrib.layers.flatten(P2)
    print('A3 flatten shape', F.shape)
    
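    Z4 = tf.contrib.layers.fully_connected(F, n_channels_04, activation_fn=None)
    # use the phase_train flag so batch statistics are only used during training;
    # updates_collections=None updates the moving averages in place
    Z4 = tf.contrib.layers.batch_norm(Z4, center=True, scale=True, is_training=phase_train,
                                      updates_collections=None, scope='bn')
    #Z4 = batch_norm(Z4, n_channels_04, phase_train=phase_train)
    
    A4 = opt_drop(tf.nn.relu(Z4))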
    print("A4 shape:", A4.shape)

    Z5 = tf.contrib.layers.fully_connected(A4, n_channels_05, activation_fn=None)
    #Z5 = tf.contrib.layers.fully_connected(F, n_channels_05, activation_fn=None)
    print("A5 shape:", Z5.shape)

    prediction = Z5  # unnormalized logits, shape (?, 10)
    
    # training cost: cross-entropy on integer class labels
    # Note: the L2 kernel regularizers above only add terms to the
    # REGULARIZATION_LOSSES collection; they are not added to this cost.
    cost = tf.losses.sparse_softmax_cross_entropy(labels=y_input, logits=prediction)
    
    # learning-rate schedules driven by global_step
    # Note: global_step is not passed to minimize() below, so it stays at 0
    # and the effective learning rate remains starter_learning_rate.
    global_step = tf.Variable(0, trainable=False)
    starter_learning_rate = 1e-3
    end_learning_rate = 5e-3
    decay_steps = 10000

    learning_rate = tf.train.polynomial_decay(starter_learning_rate, global_step,
                                              decay_steps, end_learning_rate,
                                              power=0.5)

    # exponential-decay alternative (defined but not used by the optimizer)
    exp_learning_rate = tf.train.exponential_decay(starter_learning_rate, global_step,
                                                   100000, 0.96, staircase=True)
    
    # use the Adam optimizer to minimize the cost and update the weights
    optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)

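    # streaming accuracy: update_acc returns the running accuracy over everything
    # fed since the metric's local variables were last initialized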
    acc, update_acc = tf.metrics.accuracy(labels=y_input, predictions=tf.argmax(prediction, axis=-1))



num_iters:  391
num_epochs:  3
Padded input shape (?, 36, 36, 3)
A1 shape: (?, 36, 36, 64)
P1 shape: (?, 18, 18, 64)
A2 shape: (?, 18, 18, 128)
P2 shape: (?, 9, 9, 128)
A3 shape: (?, 9, 9, 256)
P3 shape: (?, 5, 5, 256)
A3_1 shape: (?, 5, 5, 128)
A3 flatten shape (?, 3200)
A4 shape: (?, 128)
A5 shape: (?, 10)
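
<p>The spatial sizes in the shape log follow from simple arithmetic: with 'SAME' padding, a convolution with stride 1 keeps the size and a pooling step with stride 2 yields ceil(size / 2). The sketch below traces this. Note that the logged P3 and A3_1 lines are not produced by the cell as shown; the sketch assumes they came from a variant with a third 2x2 max-pool followed by a 128-channel convolution, which is consistent with the logged flatten size of 3200.</p>

```python
import math

def same_out(size, stride):
    """Output size of a 'SAME'-padded conv or pool along one axis."""
    return math.ceil(size / stride)

size = 32 + 2 * 2        # tf.pad adds 2 pixels on each side
trace = [size]
for _ in range(3):       # assumed: three 2x2 max-pools with stride 2
    size = same_out(size, 2)
    trace.append(size)

print(trace)             # [36, 18, 9, 5]
print(size * size * 128) # 3200, the logged flatten size
print(9 * 9 * 256)       # 20736, the flatten size if A3 is flattened directly
```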


In [82]:
num_epochs = 15

# learn rate (no effect here: the optimizer in graphCNN was built with its own learning rate)
learning_rate = 0.03
batch_size = 64
num_iters = math.ceil(x_train.shape[0] / batch_size)

# session configuration
config = tf.ConfigProto()
config.gpu_options.visible_device_list = "0"
config.gpu_options.allow_growth = True

start = time.time()


train_costs = []
test_costs = []

# work on GPU if available
with tf.device("/gpu:0"):

    # start a new session
    with tf.Session(graph=graphCNN, config=config) as session:  
    
        # initialize weights and bias variables
        session.run(tf.group(tf.global_variables_initializer(), tf.local_variables_initializer()))     
        
        # which nodes to fetch from the computation graph
        fetch_train_nodes = {
            'cost' : cost,
            'optimizer' : optimizer 
        }
        
        for epoch in range(num_epochs):
            
            # shuffle data
            permutation = np.random.permutation(x_train.shape[0])
            x_train = x_train[permutation]
            y_train = y_train[permutation]
            
            for i in range(num_iters):
                X_batch = x_train[i * batch_size:(i + 1) * batch_size]
                y_batch = y_train[i * batch_size:(i + 1) * batch_size]
                
                output_batch = session.run(fetch_train_nodes, feed_dict={x_input: X_batch, y_input: y_batch,
                      phase_train: True})
                if(i % 100) == 0:
#                     print('Elapsed time (min.s): ', math.ceil((time.time() - start)/60))
                    train_costs.append(output_batch["cost"])
                
                if(i % 500) == 0:
                    print("Train error [{}/{}]".format(epoch, i), output_batch["cost"])
                    test_acc = session.run(update_acc, feed_dict={x_input: x_test, y_input: y_test, 
                      phase_train: False})
                    test_costs.append(test_acc)
                    print("Test accuracy ", test_acc)
            
        # check against test set
        print("Test accuracy ", session.run(update_acc, feed_dict={x_input: x_test, y_input: y_test,
                      phase_train: False}))

Train error [0/0] 2.92713
Test accuracy  0.194
Train error [0/500] 1.05036
Test accuracy  0.42515
Train error [1/0] 0.733456
Test accuracy  0.514533
Train error [1/500] 0.779862
Test accuracy  0.5681
Train error [2/0] 0.771092
Test accuracy  0.60218
Train error [2/500] 0.578959
Test accuracy  0.628867
Train error [3/0] 0.449598
Test accuracy  0.648129
Train error [3/500] 0.416569
Test accuracy  0.663787
Train error [4/0] 0.498659
Test accuracy  0.676133
Train error [4/500] 0.217038
Test accuracy  0.68604
Train error [5/0] 0.283774
Test accuracy  0.694109
Train error [5/500] 0.228783
Test accuracy  0.702175
Train error [6/0] 0.279478
Test accuracy  0.709154
Train error [6/500] 0.267573
Test accuracy  0.715514
Train error [7/0] 0.324472
Test accuracy  0.720833
Train error [7/500] 0.430671
Test accuracy  0.725469
Train error [8/0] 0.0978293
Test accuracy  0.729694
Train error [8/500] 0.180186
Test accuracy  0.732967
Train error [9/0] 0.0811786
Test accuracy  0.736442
Train error [9/500] 0
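
<p>A note on reading this log: tf.metrics.accuracy is a streaming metric, and its counters are initialized only once before training. Each printed "Test accuracy" is therefore the running accuracy over all evaluations so far, not the accuracy of the current model alone. Because every evaluation uses the same number of test images, the per-evaluation values can be recovered from the running averages. The sketch below does this for the first five logged values; it assumes the log was produced by the code as shown, and the function name is made up for this example.</p>

```python
import numpy as np

def per_evaluation_accuracy(cumulative):
    """Recover per-evaluation accuracies from running averages.

    Assumes every evaluation covers the same number of examples, so the
    k-th running average is the mean of the first k individual values.
    """
    cumulative = np.asarray(cumulative, dtype="float64")
    k = np.arange(1, len(cumulative) + 1)
    totals = cumulative * k            # sum of the first k individual values
    return np.diff(totals, prepend=0.0)

logged = [0.194, 0.42515, 0.514533, 0.5681, 0.60218]
recovered = per_evaluation_accuracy(logged)
print(np.round(recovered, 4))
```

<p>Under this assumption, the individual accuracies are noticeably higher than the printed running values. Applying the same formula to the last two logged values (0.732967 and 0.736442, evaluations 18 and 19) gives roughly 0.80 for the final logged evaluation.</p>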

<hr />
<h2>Submission</h2>

<p>The notebook you created and the PDF file must be sent by e-mail to&nbsp;<a href="mailto:hezel@htw-berlin.de" target="_blank">hezel@htw-berlin.de</a>&nbsp;no later than 21 January 2018 at 23:59 UTC+1 ;). Please use &quot;CV1718 &Uuml;bung4 &lt;NAME&gt;&quot; as the subject, &quot;CV1718_Ue4_Tensorflow_ConvNet_CIFAR_NAME.ipynb&quot; as the notebook name, and &quot;CV1718_Ue4_Tensorflow_ConvNet_CIFAR_NAME.pdf&quot; for the PDF. Before you send the mail, please remove all output generated by Python via &quot;Kernel&quot; -&gt; &quot;Restart and Clear Output&quot; and then save the notebook with &quot;File&quot; -&gt; &quot;Save and Checkpoint&quot;.</p>
