
# Problem Set 3


By Xide Xia, with help from Brian Kulis, Kate Saenko, Ali Siahkamari, and Kun He.


This assignment will introduce you to:
1. Building and training a convolutional network
2. Saving snapshots of your trained model
3. Reloading weights from a saved model
4. Fine-tuning a pre-trained network
5. Visualizations using Tensorboard

This code has been tested and should work for Python 3.5 and 2.7 with TensorFlow. You can update to a recent TensorFlow version with `pip install tensorflow`, or `pip install tensorflow-gpu` if you want to use a GPU. Note that the starter code relies on TensorFlow 1.x APIs such as `tf.placeholder` and `tf.contrib`.

**Note:** This notebook contains problem descriptions and demo/starter code. However, you're welcome to implement and submit .py files directly, if that's easier for you. Starter .py files are provided in the same `pset4/` directory.

**Warning:** The GPU queue on SCC may be long as the deadline approaches. Please start your homework early.

## Part 0: Tutorials

You will find these TensorFlow tutorials on CNNs useful:
 - [Deep MNIST for experts](https://www.tensorflow.org/get_started/mnist/pros)
 - [Convolutional Neural Networks](https://www.tensorflow.org/tutorials/deep_cnn)
 
Note that there are many ways to implement the same thing in TensorFlow. For example, both `tf.nn` and `tf.layers` provide convolutional layers, but with slightly different interfaces. You will need to read the documentation of the functions provided below to understand how they work.

Also, you can run your experiments on SCC if you want to use a GPU. You will find the [SCC tutorials](http://rcs.bu.edu/classes/DeepLearning/) helpful.

## Part 1: Building and Training a ConvNet on SVHN
(25 points)

First we provide demo code that trains a convolutional network on the [SVHN Dataset](http://ufldl.stanford.edu/housenumbers/).

You will need to download __Format 2__ from the link above.
- Create a directory named `svhn_mat/` in the working directory. Alternatively, create it anywhere you want and change the path in `svhn_dataset_generator` to match.
- Download `train_32x32.mat` and `test_32x32.mat` to this directory.
- `extra_32x32.mat` is NOT needed.
- You may find the `wget` command useful for downloading on Linux.



The following defines a generator for the SVHN Dataset, yielding the next batch each time `next` is invoked.

In [1]:
import copy
import os
import math
import numpy as np
import scipy
import scipy.io

from six.moves import range

import read_data

@read_data.restartable
def svhn_dataset_generator(dataset_name, batch_size):
    assert dataset_name in ['train', 'test']
    assert batch_size > 0 or batch_size == -1  # -1 for entire dataset
    
    path = './svhn_mat/' # path to the SVHN dataset you will download in Q1.1
    file_name = '%s_32x32.mat' % dataset_name
    file_dict = scipy.io.loadmat(os.path.join(path, file_name))
    X_all = file_dict['X'].transpose((3, 0, 1, 2))
    y_all = file_dict['y']
    data_len = X_all.shape[0]
    batch_size = batch_size if batch_size > 0 else data_len
    
    X_all_padded = np.concatenate([X_all, X_all[:batch_size]], axis=0)
    y_all_padded = np.concatenate([y_all, y_all[:batch_size]], axis=0)
    y_all_padded[y_all_padded == 10] = 0
    
    # float() keeps this a true division under Python 2.7 as well as Python 3
    for slice_i in range(int(math.ceil(data_len / float(batch_size)))):
        idx = slice_i * batch_size
        X_batch = X_all_padded[idx:idx + batch_size]
        y_batch = np.ravel(y_all_padded[idx:idx + batch_size])
        yield X_batch, y_batch
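
The wrap-around padding used by `svhn_dataset_generator` can be checked on a toy array. The sketch below is illustrative only: `toy_batches` is a hypothetical helper (not part of the starter code) that reproduces the same slicing logic on a small NumPy array instead of the SVHN `.mat` files.

```python
import math
import numpy as np


def toy_batches(data, batch_size):
    """Yield fixed-size batches, wrapping around to the start to fill the last one."""
    data_len = data.shape[0]
    # Append the first `batch_size` items so the final slice is always full.
    padded = np.concatenate([data, data[:batch_size]], axis=0)
    n_batches = int(math.ceil(data_len / float(batch_size)))
    for i in range(n_batches):
        start = i * batch_size
        yield padded[start:start + batch_size]


batches = list(toy_batches(np.arange(10), 4))
for batch in batches:
    print(batch)
```

With 10 items and a batch size of 4, the third batch is completed with the first two items again, so every batch has the same size. The same padding applies whenever the dataset size is not a multiple of the batch size, which means a few samples are counted twice when per-batch metrics are averaged.
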

In [2]:
import tensorflow as tf

# The following defines a simple ConvNet model.
def SVHN_net_v0(x_):
    conv1 = tf.layers.conv2d(
            inputs=x_,
            filters=32,  # number of filters
            kernel_size=[5, 5],
            padding="same",
            activation=tf.nn.relu)
    
    pool1 = tf.layers.max_pooling2d(inputs=conv1, 
                                    pool_size=[2, 2], 
                                    strides=2)  # pooling stride
    
    conv2 = tf.layers.conv2d(
            inputs=pool1,
            filters=32, # number of filters
            kernel_size=[5, 5],
            padding="same",
            activation=tf.nn.relu)
    
    pool2 = tf.layers.max_pooling2d(inputs=conv2, 
                                    pool_size=[2, 2], 
                                    strides=2)  # pooling stride
        
    pool_flat = tf.contrib.layers.flatten(pool2, scope='pool2flat')
    dense = tf.layers.dense(inputs=pool_flat, units=500, activation=tf.nn.relu)
    logits = tf.layers.dense(inputs=dense, units=10)
    return logits


def apply_classification_loss(model_function):
    with tf.Graph().as_default() as g:
        with tf.device("/gpu:0"):  # use gpu:0 if on GPU
            x_ = tf.placeholder(tf.float32, [None, 32, 32, 3])
            y_ = tf.placeholder(tf.int32, [None])
            y_logits = model_function(x_)
            
            y_dict = dict(labels=y_, logits=y_logits)
            losses = tf.nn.sparse_softmax_cross_entropy_with_logits(**y_dict)
            cross_entropy_loss = tf.reduce_mean(losses)
            trainer = tf.train.AdamOptimizer(learning_rate=0.001)
            train_op = trainer.minimize(cross_entropy_loss)
            
            y_pred = tf.argmax(tf.nn.softmax(y_logits), axis=1)
            correct_prediction = tf.equal(tf.cast(y_pred, tf.int32), y_)
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    
    model_dict = {'graph': g, 'inputs': [x_, y_], 'train_op': train_op,
                  'accuracy': accuracy, 'loss': cross_entropy_loss}
    
    return model_dict
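
To make the loss and accuracy computed in `apply_classification_loss` more concrete, here is a minimal NumPy sketch of the same quantities. It is an illustration under assumptions, not the TensorFlow implementation: the function name `sparse_softmax_cross_entropy` and the toy `logits` and `labels` are made up for this example.

```python
import numpy as np


def sparse_softmax_cross_entropy(logits, labels):
    """Per-example cross-entropy between softmax(logits) and integer labels."""
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability assigned to the true class of each example.
    return -log_probs[np.arange(len(labels)), labels]


# Toy batch: 2 examples, 3 classes (hypothetical values).
logits = np.array([[2.0, 0.5, -1.0],
                   [0.0, 0.0, 0.0]])
labels = np.array([0, 2])

losses = sparse_softmax_cross_entropy(logits, labels)
mean_loss = losses.mean()
accuracy = np.mean(np.argmax(logits, axis=1) == labels)

print(losses, mean_loss, accuracy)
print(np.log(10))  # loss of a uniform prediction over 10 classes
```

The second example has uniform logits, so its loss is ln(3). By the same argument, a 10-class model that predicts every digit with equal probability has a loss of ln(10), roughly 2.303, which is a useful reference point when reading the loss values printed early in training.
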

### Q1.1 Training SVHN Net
(2 points)

Now we train the `SVHN_net_v0` net on Format 2 of the SVHN Dataset.

**Note:** training will take a while, so you might want to use a GPU.

In [3]:
def train_model(model_dict, dataset_generators, epoch_n, print_every):
    with model_dict['graph'].as_default(), tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        
        for epoch_i in range(epoch_n):
            for iter_i, data_batch in enumerate(dataset_generators['train']):
                train_feed_dict = dict(zip(model_dict['inputs'], data_batch))
                sess.run(model_dict['train_op'], feed_dict=train_feed_dict)
                
                if iter_i % print_every == 0:
                    collect_arr = []
                    for test_batch in dataset_generators['test']:
                        test_feed_dict = dict(zip(model_dict['inputs'], test_batch))
                        to_compute = [model_dict['loss'], model_dict['accuracy']]
                        collect_arr.append(sess.run(to_compute, test_feed_dict))
                    averages = np.mean(collect_arr, axis=0)
                    fmt = (epoch_i, iter_i, ) + tuple(averages)
                    print('epoch {:d} iter {:d}, loss: {:.3f}, '
                          'accuracy: {:.3f}'.format(*fmt))

In [4]:
dataset_generators = {
        'train': svhn_dataset_generator('train', 256),
        'test': svhn_dataset_generator('test', 256)
}
    
model_dict = apply_classification_loss(SVHN_net_v0)
train_model(model_dict, dataset_generators, epoch_n=50, print_every=20)

epoch 0 iter 0, loss: 73.921, accuracy: 0.159
epoch 0 iter 20, loss: 2.253, accuracy: 0.193
epoch 0 iter 40, loss: 2.243, accuracy: 0.196
epoch 0 iter 60, loss: 2.240, accuracy: 0.194
epoch 0 iter 80, loss: 2.231, accuracy: 0.196
epoch 0 iter 100, loss: 2.229, accuracy: 0.202
epoch 0 iter 120, loss: 2.205, accuracy: 0.206
epoch 0 iter 140, loss: 2.196, accuracy: 0.214
epoch 0 iter 160, loss: 2.207, accuracy: 0.209
epoch 0 iter 180, loss: 2.177, accuracy: 0.219
epoch 0 iter 200, loss: 2.158, accuracy: 0.242
epoch 0 iter 220, loss: 2.128, accuracy: 0.246
epoch 0 iter 240, loss: 2.079, accuracy: 0.281
epoch 0 iter 260, loss: 2.025, accuracy: 0.302
epoch 0 iter 280, loss: 1.983, accuracy: 0.323
epoch 1 iter 0, loss: 2.007, accuracy: 0.317
epoch 1 iter 20, loss: 2.000, accuracy: 0.315
epoch 1 iter 40, loss: 1.950, accuracy: 0.328
epoch 1 iter 60, loss: 1.945, accuracy: 0.331
epoch 1 iter 80, loss: 1.922, accuracy: 0.354
epoch 1 iter 100, loss: 1.918, accuracy: 0.392
...

epoch 11 iter 220, loss: 1.173, accuracy: 0.814
epoch 11 iter 240, loss: 0.986, accuracy: 0.816
epoch 11 iter 260, loss: 0.869, accuracy: 0.827
epoch 11 iter 280, loss: 0.956, accuracy: 0.818
epoch 12 iter 0, loss: 1.303, accuracy: 0.798
epoch 12 iter 20, loss: 0.892, accuracy: 0.821
epoch 12 iter 40, loss: 1.014, accuracy: 0.817
epoch 12 iter 60, loss: 0.962, accuracy: 0.818
epoch 12 iter 80, loss: 1.011, accuracy: 0.815
epoch 12 iter 100, loss: 1.009, accuracy: 0.821
epoch 12 iter 120, loss: 0.944, accuracy: 0.807
epoch 12 iter 140, loss: 1.000, accuracy: 0.800
epoch 12 iter 160, loss: 1.212, accuracy: 0.794
epoch 12 iter 180, loss: 1.069, accuracy: 0.813
epoch 12 iter 200, loss: 0.959, accuracy: 0.822
epoch 12 iter 220, loss: 1.424, accuracy: 0.802
epoch 12 iter 240, loss: 1.262, accuracy: 0.800
epoch 12 iter 260, loss: 1.037, accuracy: 0.821
epoch 12 iter 280, loss: 0.996, accuracy: 0.821
epoch 13 iter 0, loss: 1.338, accuracy: 0.802
epoch 13 iter 20, loss: 1.018, accuracy: 0.827
...

epoch 23 iter 80, loss: 1.670, accuracy: 0.822
epoch 23 iter 100, loss: 1.631, accuracy: 0.828
epoch 23 iter 120, loss: 1.664, accuracy: 0.824
epoch 23 iter 140, loss: 1.503, accuracy: 0.831
epoch 23 iter 160, loss: 1.564, accuracy: 0.802
epoch 23 iter 180, loss: 1.542, accuracy: 0.828
epoch 23 iter 200, loss: 1.825, accuracy: 0.818
epoch 23 iter 220, loss: 1.901, accuracy: 0.827
epoch 23 iter 240, loss: 1.754, accuracy: 0.833
epoch 23 iter 260, loss: 1.910, accuracy: 0.827
epoch 23 iter 280, loss: 1.806, accuracy: 0.829
epoch 24 iter 0, loss: 1.811, accuracy: 0.830
epoch 24 iter 20, loss: 1.943, accuracy: 0.826
epoch 24 iter 40, loss: 1.981, accuracy: 0.814
epoch 24 iter 60, loss: 1.872, accuracy: 0.826
epoch 24 iter 80, loss: 1.758, accuracy: 0.824
epoch 24 iter 100, loss: 1.782, accuracy: 0.821
epoch 24 iter 120, loss: 1.763, accuracy: 0.813
epoch 24 iter 140, loss: 1.658, accuracy: 0.830
epoch 24 iter 160, loss: 1.634, accuracy: 0.804
epoch 24 iter 180, loss: 1.753, accuracy: 0.829

epoch 34 iter 240, loss: 2.549, accuracy: 0.836
epoch 34 iter 260, loss: 2.380, accuracy: 0.836
epoch 34 iter 280, loss: 2.569, accuracy: 0.837
epoch 35 iter 0, loss: 2.672, accuracy: 0.837
epoch 35 iter 20, loss: 2.591, accuracy: 0.832
epoch 35 iter 40, loss: 2.725, accuracy: 0.823
epoch 35 iter 60, loss: 2.273, accuracy: 0.828
epoch 35 iter 80, loss: 2.508, accuracy: 0.832
epoch 35 iter 100, loss: 2.361, accuracy: 0.832
epoch 35 iter 120, loss: 2.406, accuracy: 0.831
epoch 35 iter 140, loss: 2.303, accuracy: 0.827
epoch 35 iter 160, loss: 2.510, accuracy: 0.828
epoch 35 iter 180, loss: 2.553, accuracy: 0.826
epoch 35 iter 200, loss: 2.312, accuracy: 0.836
epoch 35 iter 220, loss: 2.609, accuracy: 0.833
epoch 35 iter 240, loss: 2.543, accuracy: 0.836
epoch 35 iter 260, loss: 2.730, accuracy: 0.840
epoch 35 iter 280, loss: 2.798, accuracy: 0.828
epoch 36 iter 0, loss: 2.694, accuracy: 0.828
epoch 36 iter 20, loss: 2.756, accuracy: 0.833
epoch 36 iter 40, loss: 3.128, accuracy: 0.819
...

epoch 46 iter 100, loss: 2.932, accuracy: 0.820
epoch 46 iter 120, loss: 3.117, accuracy: 0.830
epoch 46 iter 140, loss: 3.027, accuracy: 0.833
epoch 46 iter 160, loss: 2.928, accuracy: 0.836
epoch 46 iter 180, loss: 3.048, accuracy: 0.836
epoch 46 iter 200, loss: 2.951, accuracy: 0.839
epoch 46 iter 220, loss: 3.437, accuracy: 0.824
epoch 46 iter 240, loss: 3.493, accuracy: 0.820
epoch 46 iter 260, loss: 3.394, accuracy: 0.830
epoch 46 iter 280, loss: 3.112, accuracy: 0.832
epoch 47 iter 0, loss: 3.630, accuracy: 0.839
epoch 47 iter 20, loss: 3.569, accuracy: 0.840
epoch 47 iter 40, loss: 3.663, accuracy: 0.833
epoch 47 iter 60, loss: 3.591, accuracy: 0.837
epoch 47 iter 80, loss: 3.453, accuracy: 0.833
epoch 47 iter 100, loss: 3.059, accuracy: 0.817
epoch 47 iter 120, loss: 3.206, accuracy: 0.830
epoch 47 iter 140, loss: 3.224, accuracy: 0.833
epoch 47 iter 160, loss: 3.128, accuracy: 0.831
epoch 47 iter 180, loss: 3.308, accuracy: 0.833
epoch 47 iter 200, loss: 2.891, accuracy: 0.83

### Q1.2 Understanding the CNN Architecture
(7 points)

Define each of the following terms. What is the corresponding setting in our SVHN net? What other choices are available?

  - Stride
  - Padding
  - Non-linearity
  - Pooling
  - Optimizer
  - Learning rate
  - Loss function
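
As a hedged illustration of how stride, padding, and pooling interact, the sketch below traces the spatial size of the feature maps through `SVHN_net_v0`. The helper `output_size` is hypothetical (not part of the starter code). It assumes TensorFlow's `"same"` and `"valid"` padding rules, the default convolution stride of 1, and the default `"valid"` padding of `tf.layers.max_pooling2d`.

```python
import math


def output_size(input_size, kernel_size, stride, padding):
    """Spatial output size of a conv/pool layer under TensorFlow's padding rules."""
    if padding == "same":
        # Zero padding is added so that only the stride shrinks the output.
        return int(math.ceil(input_size / float(stride)))
    if padding == "valid":
        # No padding: the window must fit entirely inside the input.
        return int(math.ceil((input_size - kernel_size + 1) / float(stride)))
    raise ValueError("padding must be 'same' or 'valid'")


size = 32                                 # SVHN images are 32x32
size = output_size(size, 5, 1, "same")    # conv1: 5x5 kernel, stride 1
size = output_size(size, 2, 2, "valid")   # pool1: 2x2 window, stride 2
size = output_size(size, 5, 1, "same")    # conv2: 5x5 kernel, stride 1
size = output_size(size, 2, 2, "valid")   # pool2: 2x2 window, stride 2

flat_features = size * size * 32          # 32 filters in conv2
print(size, flat_features)
```

Under these assumptions the two pooling layers halve the resolution twice (32 to 16 to 8), so the flattened input to the first dense layer has 8 * 8 * 32 = 2048 features.
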

**[Double click here to add your answer]**

### Q1.3 SVHN Net Variations
(16 points)

Now we vary the structure of the network. To keep things simple, we still use two identical conv layers, but vary their parameters.

Report the final test accuracy for 3 different numbers of filters, 3 different kernel sizes, 3 different strides, and 3 different sizes of the final fully connected layer. Each time you vary one parameter, keep the others fixed at their original values. Explain the results.

|# of filters|Accuracy|
|--|-------------------------------|
| 36 | 0.825 |
| 50 | 0.822 |
| 60 | 0.828 |

|Kernel size|Accuracy|
|--|-------------------------------|
| [4,4] | 0.836 |
| [6,6] | 0.815 |
| [7,7] | 0.795 |

|Stride|Accuracy|
|--|-------------------------------|
| 3 | 0.815 |
| 4 | 0.819 |
| 5 | 0.811 |

|FC size|Accuracy|
|--|-------------------------------|
| 400 | 0.839 |
| 600 | 0.842 |
| 800 | 0.849 |

A template for one sample modification is given below. 

**Note:** you're welcome to decide how many training epochs to use, if that gets you the same results but faster.
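
One possible way to choose the number of epochs is to look at where the printed test accuracy stops improving. The sketch below is only an illustration: `LOG_PATTERN` and `best_accuracy` are hypothetical names, and the sample lines are copied from the Q1.1 training output above.

```python
import re

# Matches lines printed by train_model, e.g.
# "epoch 12 iter 200, loss: 0.959, accuracy: 0.822"
LOG_PATTERN = re.compile(
    r"epoch (\d+) iter (\d+), loss: ([\d.]+), accuracy: ([\d.]+)")


def best_accuracy(log_text):
    """Return (epoch, iter, loss, accuracy) of the line with the highest accuracy."""
    records = [
        (int(epoch), int(it), float(loss), float(acc))
        for epoch, it, loss, acc in LOG_PATTERN.findall(log_text)
    ]
    return max(records, key=lambda record: record[3])


sample_log = """
epoch 0 iter 280, loss: 1.983, accuracy: 0.323
epoch 12 iter 200, loss: 0.959, accuracy: 0.822
epoch 35 iter 260, loss: 2.730, accuracy: 0.840
epoch 47 iter 0, loss: 3.630, accuracy: 0.839
"""

best = best_accuracy(sample_log)
print(best)
```

In these sample lines the accuracy at epoch 12 is already within about 0.02 of the best value, while the test loss keeps rising afterwards, which suggests that a shorter run may give similar accuracy.
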

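## Explanation of results ##

I varied four hyperparameters of the CNN, one at a time; the results are shown in the tables above.

For the number of filters, accuracy barely changes (0.822 to 0.828 across 36, 50, and 60 filters), so no clear trend is visible.

For the kernel size, smaller kernels give better results: accuracy falls from 0.836 at [4,4] to 0.795 at [7,7].

For the stride, accuracy is lowest at the largest stride (0.811 at stride 5), but the differences among the three settings are small (0.811 to 0.819). Note that in the template below the stride is applied in the pooling layers.

Accuracy increases with the size of the fully connected layer, from 0.839 at 400 units to 0.849 at 800 units.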
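
One way to put the stride results in context is to count how many features and trainable parameters each setting leaves the network with. The sketch below is a rough estimate under stated assumptions: `pooled_size` and `count_parameters` are hypothetical helpers, the pooling layers use a 2x2 window with `"valid"` padding (the `tf.layers.max_pooling2d` default), the convolutions use `"same"` padding with stride 1, and every layer has a bias term.

```python
def pooled_size(size, stride, pool=2):
    """Output size of 'valid' max pooling with window `pool` and the given stride."""
    return (size - pool) // stride + 1


def count_parameters(filters, kernel, stride, fc_size,
                     image=32, channels=3, classes=10):
    """Return (final feature-map side, trainable parameters) for the two-conv net."""
    conv1 = kernel * kernel * channels * filters + filters
    conv2 = kernel * kernel * filters * filters + filters
    # 'same' convolutions keep the size; only the two pooling layers shrink it.
    side = pooled_size(pooled_size(image, stride), stride)
    flat = side * side * filters
    dense = flat * fc_size + fc_size
    logits = fc_size * classes + classes
    return side, conv1 + conv2 + dense + logits


for stride in (2, 3, 4, 5):
    side, total = count_parameters(32, 5, stride, 500)
    print('stride {}: final map {}x{}, {} parameters'.format(
        stride, side, side, total))
```

Under these assumptions, stride 3 reduces the final feature map to 4x4 and strides 4 and 5 both reduce it to 2x2, compared with 8x8 at the original stride of 2, so the dense layer sees far fewer features. This may partly explain the lower accuracies in the stride table, although the sketch alone does not establish that.
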

In [12]:
def my_SVHN_net(x_, filters, kernel_size, stride, FC_size):
    conv1 = tf.layers.conv2d(
            inputs=x_,
            filters=filters,  # number of filters
            kernel_size=kernel_size,
            padding="same",
            activation=tf.nn.relu)
    
    pool1 = tf.layers.max_pooling2d(inputs=conv1, 
                                    pool_size=[2, 2], 
                                    strides=stride)  # pooling stride
    
    conv2 = tf.layers.conv2d(
            inputs=pool1,
            filters=filters, # number of filters
            kernel_size=kernel_size,
            padding="same",
            activation=tf.nn.relu)
    
    pool2 = tf.layers.max_pooling2d(inputs=conv2, 
                                    pool_size=[2, 2], 
                                    strides=stride)  # pooling stride
        
    pool_flat = tf.contrib.layers.flatten(pool2, scope='pool2flat')
    dense = tf.layers.dense(inputs=pool_flat, units=FC_size, activation=tf.nn.relu)
    logits = tf.layers.dense(inputs=dense, units=10)
    return logits

def apply_classification_loss(model_function, para_input):
    with tf.Graph().as_default() as g:
        with tf.device("/gpu:0"):  # use gpu:0 if on GPU
            x_ = tf.placeholder(tf.float32, [None, 32, 32, 3])
            y_ = tf.placeholder(tf.int32, [None])
            y_logits = model_function(x_, *para_input)
            
            y_dict = dict(labels=y_, logits=y_logits)
            losses = tf.nn.sparse_softmax_cross_entropy_with_logits(**y_dict)
            cross_entropy_loss = tf.reduce_mean(losses)
            trainer = tf.train.AdamOptimizer(learning_rate=0.001)
            train_op = trainer.minimize(cross_entropy_loss)
            
            y_pred = tf.argmax(tf.nn.softmax(y_logits), axis=1)
            correct_prediction = tf.equal(tf.cast(y_pred, tf.int32), y_)
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    
    model_dict = {'graph': g, 'inputs': [x_, y_], 'train_op': train_op,
                  'accuracy': accuracy, 'loss': cross_entropy_loss}
    
    return model_dict


## Number of filters ##

In [12]:
print('Filter size = {}, kernel_size = {}, stride = {}, FC_size = {}'.format(36, [5, 5], 2, 500))
modified_model_dict = apply_classification_loss(my_SVHN_net, [36, [5, 5], 2, 500])
train_model(modified_model_dict, dataset_generators, epoch_n=40, print_every=10)

Filter size = 36, kernel_size = [5, 5], stride = 2, FC_size = 500
epoch 0 iter 0, loss: 192.477, accuracy: 0.196
epoch 0 iter 10, loss: 2.324, accuracy: 0.138
epoch 0 iter 20, loss: 2.248, accuracy: 0.198
epoch 0 iter 30, loss: 2.242, accuracy: 0.192
epoch 0 iter 40, loss: 2.232, accuracy: 0.199
epoch 0 iter 50, loss: 2.232, accuracy: 0.202
epoch 0 iter 60, loss: 2.232, accuracy: 0.203
epoch 0 iter 70, loss: 2.228, accuracy: 0.205
epoch 0 iter 80, loss: 2.223, accuracy: 0.205
epoch 0 iter 90, loss: 2.222, accuracy: 0.206
epoch 0 iter 100, loss: 2.225, accuracy: 0.206
epoch 0 iter 110, loss: 2.213, accuracy: 0.208
epoch 0 iter 120, loss: 2.212, accuracy: 0.210
epoch 0 iter 130, loss: 2.207, accuracy: 0.210
epoch 0 iter 140, loss: 2.211, accuracy: 0.213
epoch 0 iter 150, loss: 2.198, accuracy: 0.211
epoch 0 iter 160, loss: 2.227, accuracy: 0.194
epoch 0 iter 170, loss: 2.208, accuracy: 0.205
epoch 0 iter 180, loss: 2.200, accuracy: 0.216
epoch 0 iter 190, loss: 2.192, accuracy: 0.217
...

epoch 6 iter 10, loss: 0.684, accuracy: 0.816
epoch 6 iter 20, loss: 0.672, accuracy: 0.818
epoch 6 iter 30, loss: 0.676, accuracy: 0.816
epoch 6 iter 40, loss: 0.713, accuracy: 0.808
epoch 6 iter 50, loss: 0.681, accuracy: 0.811
epoch 6 iter 60, loss: 0.717, accuracy: 0.808
epoch 6 iter 70, loss: 0.692, accuracy: 0.814
epoch 6 iter 80, loss: 0.703, accuracy: 0.806
epoch 6 iter 90, loss: 0.704, accuracy: 0.814
epoch 6 iter 100, loss: 0.720, accuracy: 0.806
epoch 6 iter 110, loss: 0.658, accuracy: 0.816
epoch 6 iter 120, loss: 0.695, accuracy: 0.807
epoch 6 iter 130, loss: 0.725, accuracy: 0.802
epoch 6 iter 140, loss: 0.724, accuracy: 0.808
epoch 6 iter 150, loss: 0.761, accuracy: 0.803
epoch 6 iter 160, loss: 0.690, accuracy: 0.809
epoch 6 iter 170, loss: 0.711, accuracy: 0.814
epoch 6 iter 180, loss: 0.704, accuracy: 0.812
epoch 6 iter 190, loss: 0.721, accuracy: 0.809
epoch 6 iter 200, loss: 0.736, accuracy: 0.810
epoch 6 iter 210, loss: 0.687, accuracy: 0.815
...

epoch 12 iter 20, loss: 0.809, accuracy: 0.830
epoch 12 iter 30, loss: 0.881, accuracy: 0.804
epoch 12 iter 40, loss: 0.882, accuracy: 0.820
epoch 12 iter 50, loss: 0.930, accuracy: 0.812
epoch 12 iter 60, loss: 1.001, accuracy: 0.804
epoch 12 iter 70, loss: 0.890, accuracy: 0.808
epoch 12 iter 80, loss: 0.898, accuracy: 0.814
epoch 12 iter 90, loss: 0.903, accuracy: 0.814
epoch 12 iter 100, loss: 0.896, accuracy: 0.818
epoch 12 iter 110, loss: 0.847, accuracy: 0.814
epoch 12 iter 120, loss: 0.836, accuracy: 0.825
epoch 12 iter 130, loss: 0.847, accuracy: 0.826
epoch 12 iter 140, loss: 0.968, accuracy: 0.814
epoch 12 iter 150, loss: 0.954, accuracy: 0.819
epoch 12 iter 160, loss: 0.894, accuracy: 0.824
epoch 12 iter 170, loss: 0.857, accuracy: 0.822
epoch 12 iter 180, loss: 0.881, accuracy: 0.822
epoch 12 iter 190, loss: 0.941, accuracy: 0.814
epoch 12 iter 200, loss: 0.878, accuracy: 0.820
epoch 12 iter 210, loss: 0.955, accuracy: 0.813
epoch 12 iter 220, loss: 0.971, accuracy: 0.806


epoch 18 iter 0, loss: 1.026, accuracy: 0.832
epoch 18 iter 10, loss: 1.228, accuracy: 0.821
epoch 18 iter 20, loss: 1.138, accuracy: 0.824
epoch 18 iter 30, loss: 1.179, accuracy: 0.823
epoch 18 iter 40, loss: 1.145, accuracy: 0.818
epoch 18 iter 50, loss: 1.189, accuracy: 0.816
epoch 18 iter 60, loss: 1.131, accuracy: 0.825
epoch 18 iter 70, loss: 1.082, accuracy: 0.823
epoch 18 iter 80, loss: 1.212, accuracy: 0.817
epoch 18 iter 90, loss: 1.077, accuracy: 0.808
epoch 18 iter 100, loss: 1.079, accuracy: 0.818
epoch 18 iter 110, loss: 1.135, accuracy: 0.815
epoch 18 iter 120, loss: 1.133, accuracy: 0.822
epoch 18 iter 130, loss: 1.072, accuracy: 0.806
epoch 18 iter 140, loss: 1.235, accuracy: 0.816
epoch 18 iter 150, loss: 1.266, accuracy: 0.809
epoch 18 iter 160, loss: 1.256, accuracy: 0.815
epoch 18 iter 170, loss: 1.181, accuracy: 0.818
epoch 18 iter 180, loss: 1.128, accuracy: 0.823
epoch 18 iter 190, loss: 1.157, accuracy: 0.830
epoch 18 iter 200, loss: 1.215, accuracy: 0.815
...

epoch 23 iter 280, loss: 1.536, accuracy: 0.827
epoch 24 iter 0, loss: 1.481, accuracy: 0.820
epoch 24 iter 10, loss: 1.538, accuracy: 0.824
epoch 24 iter 20, loss: 1.506, accuracy: 0.822
epoch 24 iter 30, loss: 1.520, accuracy: 0.819
epoch 24 iter 40, loss: 1.527, accuracy: 0.803
epoch 24 iter 50, loss: 1.660, accuracy: 0.805
epoch 24 iter 60, loss: 1.491, accuracy: 0.818
epoch 24 iter 70, loss: 1.578, accuracy: 0.822
epoch 24 iter 80, loss: 1.458, accuracy: 0.818
epoch 24 iter 90, loss: 1.573, accuracy: 0.818
epoch 24 iter 100, loss: 1.512, accuracy: 0.813
epoch 24 iter 110, loss: 1.492, accuracy: 0.820
epoch 24 iter 120, loss: 1.502, accuracy: 0.820
epoch 24 iter 130, loss: 1.433, accuracy: 0.828
epoch 24 iter 140, loss: 1.368, accuracy: 0.816
epoch 24 iter 150, loss: 1.786, accuracy: 0.807
epoch 24 iter 160, loss: 1.434, accuracy: 0.807
epoch 24 iter 170, loss: 1.577, accuracy: 0.815
epoch 24 iter 180, loss: 1.453, accuracy: 0.822
epoch 24 iter 190, loss: 1.576, accuracy: 0.828
...

epoch 29 iter 270, loss: 2.046, accuracy: 0.821
epoch 29 iter 280, loss: 1.996, accuracy: 0.822
epoch 30 iter 0, loss: 1.952, accuracy: 0.825
epoch 30 iter 10, loss: 1.792, accuracy: 0.825
epoch 30 iter 20, loss: 1.889, accuracy: 0.829
epoch 30 iter 30, loss: 1.972, accuracy: 0.827
epoch 30 iter 40, loss: 1.974, accuracy: 0.811
epoch 30 iter 50, loss: 1.801, accuracy: 0.821
epoch 30 iter 60, loss: 1.702, accuracy: 0.813
epoch 30 iter 70, loss: 2.209, accuracy: 0.797
epoch 30 iter 80, loss: 1.733, accuracy: 0.829
epoch 30 iter 90, loss: 1.891, accuracy: 0.825
epoch 30 iter 100, loss: 1.943, accuracy: 0.817
epoch 30 iter 110, loss: 1.853, accuracy: 0.811
epoch 30 iter 120, loss: 1.702, accuracy: 0.820
epoch 30 iter 130, loss: 2.025, accuracy: 0.815
epoch 30 iter 140, loss: 1.803, accuracy: 0.819
epoch 30 iter 150, loss: 1.987, accuracy: 0.817
epoch 30 iter 160, loss: 1.765, accuracy: 0.813
epoch 30 iter 170, loss: 1.898, accuracy: 0.818
epoch 30 iter 180, loss: 1.982, accuracy: 0.821
...

epoch 35 iter 260, loss: 2.152, accuracy: 0.824
epoch 35 iter 270, loss: 1.944, accuracy: 0.832
epoch 35 iter 280, loss: 2.079, accuracy: 0.834
epoch 36 iter 0, loss: 2.102, accuracy: 0.826
epoch 36 iter 10, loss: 2.144, accuracy: 0.823
epoch 36 iter 20, loss: 2.100, accuracy: 0.832
epoch 36 iter 30, loss: 1.956, accuracy: 0.837
epoch 36 iter 40, loss: 2.048, accuracy: 0.836
epoch 36 iter 50, loss: 2.094, accuracy: 0.814
epoch 36 iter 60, loss: 2.254, accuracy: 0.814
epoch 36 iter 70, loss: 2.160, accuracy: 0.815
epoch 36 iter 80, loss: 2.512, accuracy: 0.804
epoch 36 iter 90, loss: 2.148, accuracy: 0.803
epoch 36 iter 100, loss: 2.196, accuracy: 0.816
epoch 36 iter 110, loss: 2.235, accuracy: 0.823
epoch 36 iter 120, loss: 2.000, accuracy: 0.823
epoch 36 iter 130, loss: 2.080, accuracy: 0.826
epoch 36 iter 140, loss: 2.273, accuracy: 0.816
epoch 36 iter 150, loss: 2.122, accuracy: 0.825
epoch 36 iter 160, loss: 2.625, accuracy: 0.813
epoch 36 iter 170, loss: 1.897, accuracy: 0.815
...

In [14]:
print('Filter size = {}, kernel_size = {}, stride = {}, FC_size = {}'.format(60, [5, 5], 2, 500))
modified_model_dict = apply_classification_loss(my_SVHN_net, [60, [5, 5], 2, 500])
train_model(modified_model_dict, dataset_generators, epoch_n=40, print_every=10)

Filter size = 60, kernel_size = [5, 5], stride = 2, FC_size = 500
epoch 0 iter 0, loss: 88.079, accuracy: 0.129
epoch 0 iter 10, loss: 2.331, accuracy: 0.146
epoch 0 iter 20, loss: 2.241, accuracy: 0.202
epoch 0 iter 30, loss: 2.139, accuracy: 0.247
epoch 0 iter 40, loss: 1.901, accuracy: 0.367
epoch 0 iter 50, loss: 1.793, accuracy: 0.414
epoch 0 iter 60, loss: 1.647, accuracy: 0.471
epoch 0 iter 70, loss: 1.526, accuracy: 0.512
epoch 0 iter 80, loss: 1.468, accuracy: 0.538
epoch 0 iter 90, loss: 1.444, accuracy: 0.537
epoch 0 iter 100, loss: 1.370, accuracy: 0.572
epoch 0 iter 110, loss: 1.376, accuracy: 0.566
epoch 0 iter 120, loss: 1.299, accuracy: 0.595
epoch 0 iter 130, loss: 1.277, accuracy: 0.603
epoch 0 iter 140, loss: 1.235, accuracy: 0.614
epoch 0 iter 150, loss: 1.271, accuracy: 0.605
epoch 0 iter 160, loss: 1.226, accuracy: 0.616
epoch 0 iter 170, loss: 1.207, accuracy: 0.630
epoch 0 iter 180, loss: 1.164, accuracy: 0.640
epoch 0 iter 190, loss: 1.197, accuracy: 0.633
...

epoch 6 iter 10, loss: 0.859, accuracy: 0.752
epoch 6 iter 20, loss: 0.809, accuracy: 0.766
epoch 6 iter 30, loss: 0.884, accuracy: 0.747
epoch 6 iter 40, loss: 0.827, accuracy: 0.756
epoch 6 iter 50, loss: 0.845, accuracy: 0.756
epoch 6 iter 60, loss: 0.896, accuracy: 0.744
epoch 6 iter 70, loss: 0.899, accuracy: 0.734
epoch 6 iter 80, loss: 0.899, accuracy: 0.742
epoch 6 iter 90, loss: 0.846, accuracy: 0.754
epoch 6 iter 100, loss: 0.865, accuracy: 0.749
epoch 6 iter 110, loss: 0.826, accuracy: 0.763
epoch 6 iter 120, loss: 0.862, accuracy: 0.758
epoch 6 iter 130, loss: 0.826, accuracy: 0.762
epoch 6 iter 140, loss: 0.915, accuracy: 0.748
epoch 6 iter 150, loss: 0.877, accuracy: 0.754
epoch 6 iter 160, loss: 0.830, accuracy: 0.761
epoch 6 iter 170, loss: 0.854, accuracy: 0.756
epoch 6 iter 180, loss: 0.838, accuracy: 0.760
epoch 6 iter 190, loss: 0.830, accuracy: 0.763
epoch 6 iter 200, loss: 0.873, accuracy: 0.756
epoch 6 iter 210, loss: 0.841, accuracy: 0.757
...

epoch 12 iter 20, loss: 0.925, accuracy: 0.763
epoch 12 iter 30, loss: 0.908, accuracy: 0.758
epoch 12 iter 40, loss: 0.904, accuracy: 0.768
epoch 12 iter 50, loss: 0.861, accuracy: 0.776
epoch 12 iter 60, loss: 0.947, accuracy: 0.756
epoch 12 iter 70, loss: 0.866, accuracy: 0.772
epoch 12 iter 80, loss: 0.935, accuracy: 0.778
epoch 12 iter 90, loss: 0.882, accuracy: 0.769
epoch 12 iter 100, loss: 0.956, accuracy: 0.768
epoch 12 iter 110, loss: 0.926, accuracy: 0.768
epoch 12 iter 120, loss: 0.984, accuracy: 0.766
epoch 12 iter 130, loss: 0.938, accuracy: 0.773
epoch 12 iter 140, loss: 0.890, accuracy: 0.780
epoch 12 iter 150, loss: 0.976, accuracy: 0.764
epoch 12 iter 160, loss: 0.935, accuracy: 0.773
epoch 12 iter 170, loss: 0.909, accuracy: 0.778
epoch 12 iter 180, loss: 0.937, accuracy: 0.760
epoch 12 iter 190, loss: 0.918, accuracy: 0.770
epoch 12 iter 200, loss: 0.959, accuracy: 0.776
epoch 12 iter 210, loss: 0.949, accuracy: 0.763
epoch 12 iter 220, loss: 0.907, accuracy: 0.777


epoch 18 iter 0, loss: 1.018, accuracy: 0.798
epoch 18 iter 10, loss: 1.024, accuracy: 0.801
epoch 18 iter 20, loss: 1.012, accuracy: 0.798
epoch 18 iter 30, loss: 0.962, accuracy: 0.799
epoch 18 iter 40, loss: 0.999, accuracy: 0.796
epoch 18 iter 50, loss: 1.031, accuracy: 0.804
epoch 18 iter 60, loss: 0.946, accuracy: 0.810
epoch 18 iter 70, loss: 0.978, accuracy: 0.804
epoch 18 iter 80, loss: 1.037, accuracy: 0.805
epoch 18 iter 90, loss: 0.940, accuracy: 0.801
epoch 18 iter 100, loss: 1.025, accuracy: 0.793
epoch 18 iter 110, loss: 0.958, accuracy: 0.802
epoch 18 iter 120, loss: 1.012, accuracy: 0.788
epoch 18 iter 130, loss: 1.022, accuracy: 0.788
epoch 18 iter 140, loss: 1.067, accuracy: 0.812
epoch 18 iter 150, loss: 1.036, accuracy: 0.803
epoch 18 iter 160, loss: 1.027, accuracy: 0.803
epoch 18 iter 170, loss: 1.145, accuracy: 0.778
epoch 18 iter 180, loss: 1.022, accuracy: 0.810
epoch 18 iter 190, loss: 1.056, accuracy: 0.796
epoch 18 iter 200, loss: 1.070, accuracy: 0.787
...

epoch 23 iter 280, loss: 1.297, accuracy: 0.798
epoch 24 iter 0, loss: 1.380, accuracy: 0.802
epoch 24 iter 10, loss: 1.296, accuracy: 0.806
epoch 24 iter 20, loss: 1.140, accuracy: 0.818
epoch 24 iter 30, loss: 1.145, accuracy: 0.821
epoch 24 iter 40, loss: 1.162, accuracy: 0.822
epoch 24 iter 50, loss: 1.374, accuracy: 0.809
epoch 24 iter 60, loss: 1.241, accuracy: 0.818
epoch 24 iter 70, loss: 1.250, accuracy: 0.802
epoch 24 iter 80, loss: 1.256, accuracy: 0.820
epoch 24 iter 90, loss: 1.289, accuracy: 0.821
epoch 24 iter 100, loss: 1.314, accuracy: 0.815
epoch 24 iter 110, loss: 1.351, accuracy: 0.817
epoch 24 iter 120, loss: 1.425, accuracy: 0.801
epoch 24 iter 130, loss: 1.242, accuracy: 0.825
epoch 24 iter 140, loss: 1.375, accuracy: 0.812
epoch 24 iter 150, loss: 1.282, accuracy: 0.825
epoch 24 iter 160, loss: 1.221, accuracy: 0.828
epoch 24 iter 170, loss: 1.193, accuracy: 0.828
epoch 24 iter 180, loss: 1.331, accuracy: 0.816
epoch 24 iter 190, loss: 1.243, accuracy: 0.833
...

epoch 29 iter 270, loss: 1.627, accuracy: 0.826
epoch 29 iter 280, loss: 1.515, accuracy: 0.822
epoch 30 iter 0, loss: 1.655, accuracy: 0.820
epoch 30 iter 10, loss: 1.635, accuracy: 0.806
epoch 30 iter 20, loss: 1.634, accuracy: 0.826
epoch 30 iter 30, loss: 1.559, accuracy: 0.812
epoch 30 iter 40, loss: 1.630, accuracy: 0.826
epoch 30 iter 50, loss: 1.671, accuracy: 0.820
epoch 30 iter 60, loss: 1.992, accuracy: 0.799
epoch 30 iter 70, loss: 1.771, accuracy: 0.804
epoch 30 iter 80, loss: 1.630, accuracy: 0.819
epoch 30 iter 90, loss: 1.655, accuracy: 0.817
epoch 30 iter 100, loss: 1.741, accuracy: 0.818
epoch 30 iter 110, loss: 1.873, accuracy: 0.803
epoch 30 iter 120, loss: 1.612, accuracy: 0.824
epoch 30 iter 130, loss: 1.699, accuracy: 0.816
epoch 30 iter 140, loss: 1.549, accuracy: 0.821
epoch 30 iter 150, loss: 1.591, accuracy: 0.817
epoch 30 iter 160, loss: 1.790, accuracy: 0.821
epoch 30 iter 170, loss: 1.667, accuracy: 0.824
epoch 30 iter 180, loss: 1.852, accuracy: 0.818
...

epoch 35 iter 260, loss: 1.888, accuracy: 0.827
epoch 35 iter 270, loss: 2.027, accuracy: 0.827
epoch 35 iter 280, loss: 1.854, accuracy: 0.827
epoch 36 iter 0, loss: 1.925, accuracy: 0.831
epoch 36 iter 10, loss: 1.954, accuracy: 0.827
epoch 36 iter 20, loss: 1.952, accuracy: 0.824
epoch 36 iter 30, loss: 2.083, accuracy: 0.817
epoch 36 iter 40, loss: 1.874, accuracy: 0.811
epoch 36 iter 50, loss: 2.006, accuracy: 0.826
epoch 36 iter 60, loss: 2.452, accuracy: 0.812
epoch 36 iter 70, loss: 2.110, accuracy: 0.805
epoch 36 iter 80, loss: 2.025, accuracy: 0.817
epoch 36 iter 90, loss: 2.106, accuracy: 0.814
epoch 36 iter 100, loss: 1.912, accuracy: 0.828
epoch 36 iter 110, loss: 2.167, accuracy: 0.823
epoch 36 iter 120, loss: 2.114, accuracy: 0.813
epoch 36 iter 130, loss: 2.001, accuracy: 0.818
epoch 36 iter 140, loss: 2.211, accuracy: 0.818
epoch 36 iter 150, loss: 2.288, accuracy: 0.796
epoch 36 iter 160, loss: 2.237, accuracy: 0.808
epoch 36 iter 170, loss: 2.346, accuracy: 0.810
...

In [9]:
print('Filter size = {}, kernel_size = {}, stride = {}, FC_size = {}'.format(50, [5, 5], 2, 500))
modified_model_dict = apply_classification_loss(my_SVHN_net, [50, [5, 5], 2, 500])
train_model(modified_model_dict, dataset_generators, epoch_n=40, print_every=10)

Filter size = 50, kernel_size = [5, 5], stride = 2, FC_size = 500
epoch 0 iter 0, loss: 137.431, accuracy: 0.191
epoch 0 iter 10, loss: 2.412, accuracy: 0.136
epoch 0 iter 20, loss: 2.255, accuracy: 0.198
epoch 0 iter 30, loss: 2.240, accuracy: 0.182
epoch 0 iter 40, loss: 2.237, accuracy: 0.196
epoch 0 iter 50, loss: 2.239, accuracy: 0.196
epoch 0 iter 60, loss: 2.228, accuracy: 0.183
epoch 0 iter 70, loss: 2.203, accuracy: 0.206
epoch 0 iter 80, loss: 2.239, accuracy: 0.203
epoch 0 iter 90, loss: 2.228, accuracy: 0.201
epoch 0 iter 100, loss: 2.224, accuracy: 0.204
epoch 0 iter 110, loss: 2.217, accuracy: 0.213
epoch 0 iter 120, loss: 2.221, accuracy: 0.204
epoch 0 iter 130, loss: 2.213, accuracy: 0.219
epoch 0 iter 140, loss: 2.220, accuracy: 0.211
epoch 0 iter 150, loss: 2.199, accuracy: 0.219
epoch 0 iter 160, loss: 2.207, accuracy: 0.215
epoch 0 iter 170, loss: 2.191, accuracy: 0.216
epoch 0 iter 180, loss: 2.190, accuracy: 0.214
epoch 0 iter 190, loss: 2.178, accuracy: 0.222
epo

epoch 6 iter 10, loss: 0.711, accuracy: 0.802
epoch 6 iter 20, loss: 0.704, accuracy: 0.802
epoch 6 iter 30, loss: 0.701, accuracy: 0.806
epoch 6 iter 40, loss: 0.710, accuracy: 0.801
epoch 6 iter 50, loss: 0.722, accuracy: 0.796
epoch 6 iter 60, loss: 0.673, accuracy: 0.814
epoch 6 iter 70, loss: 0.677, accuracy: 0.814
epoch 6 iter 80, loss: 0.719, accuracy: 0.798
epoch 6 iter 90, loss: 0.693, accuracy: 0.806
epoch 6 iter 100, loss: 0.661, accuracy: 0.817
epoch 6 iter 110, loss: 0.685, accuracy: 0.810
epoch 6 iter 120, loss: 0.706, accuracy: 0.800
epoch 6 iter 130, loss: 0.686, accuracy: 0.809
epoch 6 iter 140, loss: 0.670, accuracy: 0.811
epoch 6 iter 150, loss: 0.645, accuracy: 0.822
epoch 6 iter 160, loss: 0.660, accuracy: 0.814
epoch 6 iter 170, loss: 0.656, accuracy: 0.819
epoch 6 iter 180, loss: 0.665, accuracy: 0.818
epoch 6 iter 190, loss: 0.664, accuracy: 0.823
epoch 6 iter 200, loss: 0.670, accuracy: 0.819
epoch 6 iter 210, loss: 0.661, accuracy: 0.819
epoch 6 iter 220, loss

epoch 12 iter 20, loss: 0.842, accuracy: 0.814
epoch 12 iter 30, loss: 0.812, accuracy: 0.822
epoch 12 iter 40, loss: 0.848, accuracy: 0.814
epoch 12 iter 50, loss: 0.901, accuracy: 0.811
epoch 12 iter 60, loss: 0.824, accuracy: 0.816
epoch 12 iter 70, loss: 0.819, accuracy: 0.819
epoch 12 iter 80, loss: 0.863, accuracy: 0.810
epoch 12 iter 90, loss: 0.862, accuracy: 0.811
epoch 12 iter 100, loss: 0.806, accuracy: 0.813
epoch 12 iter 110, loss: 0.784, accuracy: 0.821
epoch 12 iter 120, loss: 0.813, accuracy: 0.820
epoch 12 iter 130, loss: 0.790, accuracy: 0.828
epoch 12 iter 140, loss: 0.806, accuracy: 0.823
epoch 12 iter 150, loss: 0.928, accuracy: 0.809
epoch 12 iter 160, loss: 0.847, accuracy: 0.814
epoch 12 iter 170, loss: 0.810, accuracy: 0.815
epoch 12 iter 180, loss: 0.866, accuracy: 0.812
epoch 12 iter 190, loss: 0.893, accuracy: 0.810
epoch 12 iter 200, loss: 0.872, accuracy: 0.822
epoch 12 iter 210, loss: 0.924, accuracy: 0.814
epoch 12 iter 220, loss: 0.823, accuracy: 0.826


epoch 18 iter 0, loss: 1.065, accuracy: 0.811
epoch 18 iter 10, loss: 1.106, accuracy: 0.815
epoch 18 iter 20, loss: 1.087, accuracy: 0.822
epoch 18 iter 30, loss: 1.096, accuracy: 0.823
epoch 18 iter 40, loss: 1.039, accuracy: 0.817
epoch 18 iter 50, loss: 1.057, accuracy: 0.828
epoch 18 iter 60, loss: 1.132, accuracy: 0.819
epoch 18 iter 70, loss: 1.222, accuracy: 0.814
epoch 18 iter 80, loss: 1.160, accuracy: 0.811
epoch 18 iter 90, loss: 1.457, accuracy: 0.797
epoch 18 iter 100, loss: 1.035, accuracy: 0.821
epoch 18 iter 110, loss: 0.973, accuracy: 0.831
epoch 18 iter 120, loss: 0.991, accuracy: 0.828
epoch 18 iter 130, loss: 1.007, accuracy: 0.825
epoch 18 iter 140, loss: 0.999, accuracy: 0.822
epoch 18 iter 150, loss: 1.046, accuracy: 0.827
epoch 18 iter 160, loss: 1.152, accuracy: 0.800
epoch 18 iter 170, loss: 1.077, accuracy: 0.817
epoch 18 iter 180, loss: 1.097, accuracy: 0.822
epoch 18 iter 190, loss: 1.067, accuracy: 0.828
epoch 18 iter 200, loss: 1.182, accuracy: 0.817
epo

epoch 23 iter 280, loss: 1.350, accuracy: 0.818
epoch 24 iter 0, loss: 1.287, accuracy: 0.826
epoch 24 iter 10, loss: 1.329, accuracy: 0.820
epoch 24 iter 20, loss: 1.372, accuracy: 0.822
epoch 24 iter 30, loss: 1.432, accuracy: 0.818
epoch 24 iter 40, loss: 1.334, accuracy: 0.820
epoch 24 iter 50, loss: 1.405, accuracy: 0.821
epoch 24 iter 60, loss: 1.398, accuracy: 0.809
epoch 24 iter 70, loss: 1.680, accuracy: 0.811
epoch 24 iter 80, loss: 1.798, accuracy: 0.799
epoch 24 iter 90, loss: 1.667, accuracy: 0.806
epoch 24 iter 100, loss: 1.479, accuracy: 0.809
epoch 24 iter 110, loss: 1.515, accuracy: 0.816
epoch 24 iter 120, loss: 1.424, accuracy: 0.821
epoch 24 iter 130, loss: 1.450, accuracy: 0.828
epoch 24 iter 140, loss: 1.374, accuracy: 0.825
epoch 24 iter 150, loss: 1.405, accuracy: 0.826
epoch 24 iter 160, loss: 1.314, accuracy: 0.825
epoch 24 iter 170, loss: 1.425, accuracy: 0.812
epoch 24 iter 180, loss: 1.260, accuracy: 0.823
epoch 24 iter 190, loss: 1.394, accuracy: 0.824
epo

epoch 29 iter 270, loss: 1.727, accuracy: 0.825
epoch 29 iter 280, loss: 1.902, accuracy: 0.815
epoch 30 iter 0, loss: 1.671, accuracy: 0.812
epoch 30 iter 10, loss: 1.768, accuracy: 0.826
epoch 30 iter 20, loss: 1.742, accuracy: 0.831
epoch 30 iter 30, loss: 1.798, accuracy: 0.824
epoch 30 iter 40, loss: 1.789, accuracy: 0.827
epoch 30 iter 50, loss: 1.840, accuracy: 0.827
epoch 30 iter 60, loss: 1.751, accuracy: 0.829
epoch 30 iter 70, loss: 2.224, accuracy: 0.802
epoch 30 iter 80, loss: 2.090, accuracy: 0.805
epoch 30 iter 90, loss: 1.846, accuracy: 0.812
epoch 30 iter 100, loss: 2.004, accuracy: 0.806
epoch 30 iter 110, loss: 1.898, accuracy: 0.821
epoch 30 iter 120, loss: 1.829, accuracy: 0.814
epoch 30 iter 130, loss: 1.744, accuracy: 0.826
epoch 30 iter 140, loss: 1.776, accuracy: 0.826
epoch 30 iter 150, loss: 1.873, accuracy: 0.821
epoch 30 iter 160, loss: 1.709, accuracy: 0.820
epoch 30 iter 170, loss: 1.580, accuracy: 0.823
epoch 30 iter 180, loss: 1.670, accuracy: 0.822
epo

epoch 35 iter 260, loss: 2.090, accuracy: 0.825
epoch 35 iter 270, loss: 1.936, accuracy: 0.836
epoch 35 iter 280, loss: 1.942, accuracy: 0.833
epoch 36 iter 0, loss: 2.065, accuracy: 0.830
epoch 36 iter 10, loss: 1.937, accuracy: 0.828
epoch 36 iter 20, loss: 2.045, accuracy: 0.831
epoch 36 iter 30, loss: 2.016, accuracy: 0.833
epoch 36 iter 40, loss: 2.021, accuracy: 0.822
epoch 36 iter 50, loss: 2.086, accuracy: 0.834
epoch 36 iter 60, loss: 2.022, accuracy: 0.830
epoch 36 iter 70, loss: 2.241, accuracy: 0.820
epoch 36 iter 80, loss: 2.191, accuracy: 0.824
epoch 36 iter 90, loss: 2.126, accuracy: 0.829
epoch 36 iter 100, loss: 2.127, accuracy: 0.827
epoch 36 iter 110, loss: 2.241, accuracy: 0.816
epoch 36 iter 120, loss: 2.168, accuracy: 0.825
epoch 36 iter 130, loss: 2.106, accuracy: 0.825
epoch 36 iter 140, loss: 1.995, accuracy: 0.837
epoch 36 iter 150, loss: 2.042, accuracy: 0.836
epoch 36 iter 160, loss: 2.116, accuracy: 0.824
epoch 36 iter 170, loss: 2.008, accuracy: 0.825
epo
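
The printed logs are long, so comparing runs by eye is error-prone. Below is a minimal sketch of one way to condense such output into per-epoch averages, assuming every complete line has the form `epoch E iter I, loss: L, accuracy: A` as shown above. The names `LOG_LINE`, `summarize_log`, and `sample` are hypothetical helpers introduced here and are not part of the starter code. Truncated lines such as `epo` are skipped.

```python
import re
from collections import defaultdict

# Matches complete lines such as "epoch 36 iter 10, loss: 1.954, accuracy: 0.827".
LOG_LINE = re.compile(
    r"^epoch (\d+) iter (\d+), loss: ([0-9.]+), accuracy: ([0-9.]+)$"
)

def summarize_log(text):
    """Return {epoch: (mean_loss, mean_accuracy)} over the complete log lines in text."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # loss sum, accuracy sum, line count
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip truncated or unrelated lines
        entry = sums[int(match.group(1))]
        entry[0] += float(match.group(3))
        entry[1] += float(match.group(4))
        entry[2] += 1
    return {epoch: (l / n, a / n) for epoch, (l, a, n) in sums.items()}

# A few lines copied from the output above, including a truncated one.
sample = """epoch 36 iter 150, loss: 2.042, accuracy: 0.836
epoch 36 iter 160, loss: 2.116, accuracy: 0.824
epoch 36 iter 170, loss: 2.008, accuracy: 0.825
epo"""
summary = summarize_log(sample)
for epoch in sorted(summary):
    mean_loss, mean_acc = summary[epoch]
    print('epoch {}: mean loss {:.3f}, mean accuracy {:.3f}'.format(epoch, mean_loss, mean_acc))
```

Running the same helper on the captured output of each configuration gives one short table per run, which makes the loss and accuracy trends across epochs easier to compare.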

## Kernel size ##

In [16]:
print('Filter size = {}, kernel_size = {}, stride = {}, FC_size = {}'.format(32, [4, 4], 2, 500))
modified_model_dict = apply_classification_loss(my_SVHN_net, [32, [4, 4], 2, 500])
train_model(modified_model_dict, dataset_generators, epoch_n=40, print_every=10)

Filter size = 32, kernel_size = [4, 4], stride = 2, FC_size = 500
epoch 0 iter 0, loss: 146.877, accuracy: 0.196
epoch 0 iter 10, loss: 3.707, accuracy: 0.158
epoch 0 iter 20, loss: 2.306, accuracy: 0.115
epoch 0 iter 30, loss: 2.261, accuracy: 0.170
epoch 0 iter 40, loss: 2.238, accuracy: 0.192
epoch 0 iter 50, loss: 2.236, accuracy: 0.195
epoch 0 iter 60, loss: 2.237, accuracy: 0.194
epoch 0 iter 70, loss: 2.234, accuracy: 0.195
epoch 0 iter 80, loss: 2.232, accuracy: 0.194
epoch 0 iter 90, loss: 2.231, accuracy: 0.196
epoch 0 iter 100, loss: 2.229, accuracy: 0.196
epoch 0 iter 110, loss: 2.230, accuracy: 0.196
epoch 0 iter 120, loss: 2.226, accuracy: 0.196
epoch 0 iter 130, loss: 2.225, accuracy: 0.196
epoch 0 iter 140, loss: 2.230, accuracy: 0.196
epoch 0 iter 150, loss: 2.225, accuracy: 0.196
epoch 0 iter 160, loss: 2.225, accuracy: 0.196
epoch 0 iter 170, loss: 2.225, accuracy: 0.196
epoch 0 iter 180, loss: 2.224, accuracy: 0.196
epoch 0 iter 190, loss: 2.225, accuracy: 0.196
epo

epoch 6 iter 10, loss: 0.890, accuracy: 0.750
epoch 6 iter 20, loss: 0.883, accuracy: 0.756
epoch 6 iter 30, loss: 0.914, accuracy: 0.747
epoch 6 iter 40, loss: 0.881, accuracy: 0.753
epoch 6 iter 50, loss: 0.850, accuracy: 0.760
epoch 6 iter 60, loss: 0.919, accuracy: 0.739
epoch 6 iter 70, loss: 0.833, accuracy: 0.766
epoch 6 iter 80, loss: 0.840, accuracy: 0.769
epoch 6 iter 90, loss: 0.857, accuracy: 0.765
epoch 6 iter 100, loss: 0.894, accuracy: 0.746
epoch 6 iter 110, loss: 0.931, accuracy: 0.733
epoch 6 iter 120, loss: 0.833, accuracy: 0.777
epoch 6 iter 130, loss: 0.827, accuracy: 0.777
epoch 6 iter 140, loss: 0.820, accuracy: 0.777
epoch 6 iter 150, loss: 0.801, accuracy: 0.780
epoch 6 iter 160, loss: 0.778, accuracy: 0.786
epoch 6 iter 170, loss: 0.852, accuracy: 0.765
epoch 6 iter 180, loss: 0.781, accuracy: 0.780
epoch 6 iter 190, loss: 0.796, accuracy: 0.784
epoch 6 iter 200, loss: 0.780, accuracy: 0.785
epoch 6 iter 210, loss: 0.774, accuracy: 0.789
epoch 6 iter 220, loss

epoch 12 iter 20, loss: 0.816, accuracy: 0.829
epoch 12 iter 30, loss: 0.796, accuracy: 0.828
epoch 12 iter 40, loss: 0.837, accuracy: 0.822
epoch 12 iter 50, loss: 0.808, accuracy: 0.830
epoch 12 iter 60, loss: 0.880, accuracy: 0.816
epoch 12 iter 70, loss: 0.824, accuracy: 0.821
epoch 12 iter 80, loss: 0.783, accuracy: 0.830
epoch 12 iter 90, loss: 0.797, accuracy: 0.829
epoch 12 iter 100, loss: 0.793, accuracy: 0.822
epoch 12 iter 110, loss: 0.880, accuracy: 0.821
epoch 12 iter 120, loss: 0.874, accuracy: 0.824
epoch 12 iter 130, loss: 0.828, accuracy: 0.831
epoch 12 iter 140, loss: 0.839, accuracy: 0.826
epoch 12 iter 150, loss: 0.902, accuracy: 0.822
epoch 12 iter 160, loss: 0.884, accuracy: 0.805
epoch 12 iter 170, loss: 0.786, accuracy: 0.834
epoch 12 iter 180, loss: 0.905, accuracy: 0.817
epoch 12 iter 190, loss: 0.956, accuracy: 0.819
epoch 12 iter 200, loss: 0.937, accuracy: 0.820
epoch 12 iter 210, loss: 0.873, accuracy: 0.823
epoch 12 iter 220, loss: 0.859, accuracy: 0.819


epoch 18 iter 0, loss: 1.283, accuracy: 0.814
epoch 18 iter 10, loss: 1.307, accuracy: 0.819
epoch 18 iter 20, loss: 1.104, accuracy: 0.824
epoch 18 iter 30, loss: 1.130, accuracy: 0.818
epoch 18 iter 40, loss: 1.261, accuracy: 0.820
epoch 18 iter 50, loss: 1.275, accuracy: 0.819
epoch 18 iter 60, loss: 1.135, accuracy: 0.828
epoch 18 iter 70, loss: 1.135, accuracy: 0.819
epoch 18 iter 80, loss: 1.151, accuracy: 0.832
epoch 18 iter 90, loss: 1.197, accuracy: 0.829
epoch 18 iter 100, loss: 1.139, accuracy: 0.825
epoch 18 iter 110, loss: 1.127, accuracy: 0.816
epoch 18 iter 120, loss: 1.358, accuracy: 0.819
epoch 18 iter 130, loss: 1.175, accuracy: 0.831
epoch 18 iter 140, loss: 1.205, accuracy: 0.826
epoch 18 iter 150, loss: 1.250, accuracy: 0.827
epoch 18 iter 160, loss: 1.212, accuracy: 0.801
epoch 18 iter 170, loss: 1.143, accuracy: 0.832
epoch 18 iter 180, loss: 1.142, accuracy: 0.832
epoch 18 iter 190, loss: 1.302, accuracy: 0.830
epoch 18 iter 200, loss: 1.171, accuracy: 0.829
epo

epoch 23 iter 280, loss: 2.047, accuracy: 0.795
epoch 24 iter 0, loss: 1.647, accuracy: 0.828
epoch 24 iter 10, loss: 1.657, accuracy: 0.828
epoch 24 iter 20, loss: 1.637, accuracy: 0.825
epoch 24 iter 30, loss: 1.507, accuracy: 0.842
epoch 24 iter 40, loss: 1.570, accuracy: 0.834
epoch 24 iter 50, loss: 1.676, accuracy: 0.824
epoch 24 iter 60, loss: 1.581, accuracy: 0.824
epoch 24 iter 70, loss: 1.549, accuracy: 0.832
epoch 24 iter 80, loss: 1.568, accuracy: 0.821
epoch 24 iter 90, loss: 1.561, accuracy: 0.819
epoch 24 iter 100, loss: 1.604, accuracy: 0.821
epoch 24 iter 110, loss: 1.339, accuracy: 0.824
epoch 24 iter 120, loss: 1.561, accuracy: 0.822
epoch 24 iter 130, loss: 1.622, accuracy: 0.831
epoch 24 iter 140, loss: 1.566, accuracy: 0.826
epoch 24 iter 150, loss: 1.681, accuracy: 0.819
epoch 24 iter 160, loss: 1.541, accuracy: 0.827
epoch 24 iter 170, loss: 1.424, accuracy: 0.829
epoch 24 iter 180, loss: 1.635, accuracy: 0.823
epoch 24 iter 190, loss: 1.582, accuracy: 0.830
epo

epoch 29 iter 270, loss: 1.868, accuracy: 0.831
epoch 29 iter 280, loss: 2.009, accuracy: 0.786
epoch 30 iter 0, loss: 1.987, accuracy: 0.792
epoch 30 iter 10, loss: 1.807, accuracy: 0.834
epoch 30 iter 20, loss: 1.863, accuracy: 0.833
epoch 30 iter 30, loss: 1.880, accuracy: 0.834
epoch 30 iter 40, loss: 1.871, accuracy: 0.836
epoch 30 iter 50, loss: 1.951, accuracy: 0.836
epoch 30 iter 60, loss: 1.784, accuracy: 0.835
epoch 30 iter 70, loss: 1.701, accuracy: 0.840
epoch 30 iter 80, loss: 1.829, accuracy: 0.835
epoch 30 iter 90, loss: 1.670, accuracy: 0.833
epoch 30 iter 100, loss: 1.847, accuracy: 0.826
epoch 30 iter 110, loss: 1.814, accuracy: 0.830
epoch 30 iter 120, loss: 1.702, accuracy: 0.837
epoch 30 iter 130, loss: 1.916, accuracy: 0.824
epoch 30 iter 140, loss: 1.757, accuracy: 0.836
epoch 30 iter 150, loss: 1.919, accuracy: 0.829
epoch 30 iter 160, loss: 2.063, accuracy: 0.826
epoch 30 iter 170, loss: 1.880, accuracy: 0.817
epoch 30 iter 180, loss: 1.708, accuracy: 0.834
epo

epoch 35 iter 260, loss: 2.201, accuracy: 0.826
epoch 35 iter 270, loss: 2.167, accuracy: 0.833
epoch 35 iter 280, loss: 2.062, accuracy: 0.844
epoch 36 iter 0, loss: 2.557, accuracy: 0.817
epoch 36 iter 10, loss: 2.159, accuracy: 0.821
epoch 36 iter 20, loss: 2.227, accuracy: 0.824
epoch 36 iter 30, loss: 2.507, accuracy: 0.822
epoch 36 iter 40, loss: 2.150, accuracy: 0.834
epoch 36 iter 50, loss: 2.176, accuracy: 0.836
epoch 36 iter 60, loss: 2.386, accuracy: 0.837
epoch 36 iter 70, loss: 2.098, accuracy: 0.840
epoch 36 iter 80, loss: 2.069, accuracy: 0.846
epoch 36 iter 90, loss: 2.053, accuracy: 0.840
epoch 36 iter 100, loss: 1.795, accuracy: 0.836
epoch 36 iter 110, loss: 2.023, accuracy: 0.843
epoch 36 iter 120, loss: 2.062, accuracy: 0.837
epoch 36 iter 130, loss: 2.049, accuracy: 0.839
epoch 36 iter 140, loss: 1.959, accuracy: 0.842
epoch 36 iter 150, loss: 2.131, accuracy: 0.843
epoch 36 iter 160, loss: 2.213, accuracy: 0.843
epoch 36 iter 170, loss: 1.988, accuracy: 0.841
epo

In [7]:
print('Filter size = {}, kernel_size = {}, stride = {}, FC_size = {}'.format(32, [6, 6], 2, 500))
modified_model_dict = apply_classification_loss(my_SVHN_net, [32, [6, 6], 2, 500])
train_model(modified_model_dict, dataset_generators, epoch_n=40, print_every=10)

Filter size = 32, kernel_size = [6, 6], stride = 2, FC_size = 500
epoch 0 iter 0, loss: 32.303, accuracy: 0.159
epoch 0 iter 10, loss: 2.454, accuracy: 0.090
epoch 0 iter 20, loss: 2.212, accuracy: 0.225
epoch 0 iter 30, loss: 2.125, accuracy: 0.259
epoch 0 iter 40, loss: 2.054, accuracy: 0.302
epoch 0 iter 50, loss: 1.959, accuracy: 0.343
epoch 0 iter 60, loss: 1.754, accuracy: 0.422
epoch 0 iter 70, loss: 1.690, accuracy: 0.439
epoch 0 iter 80, loss: 1.539, accuracy: 0.504
epoch 0 iter 90, loss: 1.515, accuracy: 0.512
epoch 0 iter 100, loss: 1.461, accuracy: 0.532
epoch 0 iter 110, loss: 1.446, accuracy: 0.539
epoch 0 iter 120, loss: 1.411, accuracy: 0.555
epoch 0 iter 130, loss: 1.361, accuracy: 0.565
epoch 0 iter 140, loss: 1.443, accuracy: 0.541
epoch 0 iter 150, loss: 1.333, accuracy: 0.579
epoch 0 iter 160, loss: 1.295, accuracy: 0.588
epoch 0 iter 170, loss: 1.308, accuracy: 0.591
epoch 0 iter 180, loss: 1.228, accuracy: 0.617
epoch 0 iter 190, loss: 1.206, accuracy: 0.630
epoc

epoch 6 iter 10, loss: 0.900, accuracy: 0.735
epoch 6 iter 20, loss: 0.830, accuracy: 0.767
epoch 6 iter 30, loss: 0.821, accuracy: 0.765
epoch 6 iter 40, loss: 0.834, accuracy: 0.760
epoch 6 iter 50, loss: 0.839, accuracy: 0.761
epoch 6 iter 60, loss: 0.815, accuracy: 0.766
epoch 6 iter 70, loss: 0.789, accuracy: 0.771
epoch 6 iter 80, loss: 0.779, accuracy: 0.780
epoch 6 iter 90, loss: 0.787, accuracy: 0.778
epoch 6 iter 100, loss: 0.799, accuracy: 0.774
epoch 6 iter 110, loss: 0.803, accuracy: 0.763
epoch 6 iter 120, loss: 0.834, accuracy: 0.762
epoch 6 iter 130, loss: 0.805, accuracy: 0.773
epoch 6 iter 140, loss: 0.810, accuracy: 0.770
epoch 6 iter 150, loss: 0.830, accuracy: 0.761
epoch 6 iter 160, loss: 0.812, accuracy: 0.766
epoch 6 iter 170, loss: 0.809, accuracy: 0.770
epoch 6 iter 180, loss: 0.798, accuracy: 0.775
epoch 6 iter 190, loss: 0.780, accuracy: 0.778
epoch 6 iter 200, loss: 0.781, accuracy: 0.781
epoch 6 iter 210, loss: 0.779, accuracy: 0.779
epoch 6 iter 220, loss

epoch 12 iter 20, loss: 0.863, accuracy: 0.797
epoch 12 iter 30, loss: 0.835, accuracy: 0.805
epoch 12 iter 40, loss: 0.797, accuracy: 0.815
epoch 12 iter 50, loss: 0.762, accuracy: 0.813
epoch 12 iter 60, loss: 0.827, accuracy: 0.810
epoch 12 iter 70, loss: 0.837, accuracy: 0.808
epoch 12 iter 80, loss: 0.892, accuracy: 0.799
epoch 12 iter 90, loss: 0.782, accuracy: 0.814
epoch 12 iter 100, loss: 0.808, accuracy: 0.814
epoch 12 iter 110, loss: 0.854, accuracy: 0.795
epoch 12 iter 120, loss: 0.809, accuracy: 0.811
epoch 12 iter 130, loss: 0.819, accuracy: 0.802
epoch 12 iter 140, loss: 0.834, accuracy: 0.817
epoch 12 iter 150, loss: 0.867, accuracy: 0.790
epoch 12 iter 160, loss: 0.825, accuracy: 0.805
epoch 12 iter 170, loss: 0.847, accuracy: 0.803
epoch 12 iter 180, loss: 0.873, accuracy: 0.806
epoch 12 iter 190, loss: 0.851, accuracy: 0.807
epoch 12 iter 200, loss: 0.824, accuracy: 0.815
epoch 12 iter 210, loss: 0.836, accuracy: 0.805
epoch 12 iter 220, loss: 0.876, accuracy: 0.790


epoch 18 iter 0, loss: 1.055, accuracy: 0.804
epoch 18 iter 10, loss: 1.189, accuracy: 0.803
epoch 18 iter 20, loss: 1.125, accuracy: 0.815
epoch 18 iter 30, loss: 1.078, accuracy: 0.815
epoch 18 iter 40, loss: 1.094, accuracy: 0.809
epoch 18 iter 50, loss: 1.061, accuracy: 0.816
epoch 18 iter 60, loss: 1.006, accuracy: 0.811
epoch 18 iter 70, loss: 1.092, accuracy: 0.818
epoch 18 iter 80, loss: 1.293, accuracy: 0.795
epoch 18 iter 90, loss: 1.162, accuracy: 0.802
epoch 18 iter 100, loss: 1.110, accuracy: 0.811
epoch 18 iter 110, loss: 1.004, accuracy: 0.825
epoch 18 iter 120, loss: 1.018, accuracy: 0.812
epoch 18 iter 130, loss: 0.984, accuracy: 0.821
epoch 18 iter 140, loss: 1.022, accuracy: 0.817
epoch 18 iter 150, loss: 1.063, accuracy: 0.822
epoch 18 iter 160, loss: 0.984, accuracy: 0.824
epoch 18 iter 170, loss: 1.014, accuracy: 0.808
epoch 18 iter 180, loss: 1.138, accuracy: 0.812
epoch 18 iter 190, loss: 1.034, accuracy: 0.813
epoch 18 iter 200, loss: 1.066, accuracy: 0.821
epo

epoch 23 iter 280, loss: 1.359, accuracy: 0.806
epoch 24 iter 0, loss: 1.363, accuracy: 0.811
epoch 24 iter 10, loss: 1.245, accuracy: 0.821
epoch 24 iter 20, loss: 1.325, accuracy: 0.818
epoch 24 iter 30, loss: 1.599, accuracy: 0.793
epoch 24 iter 40, loss: 1.304, accuracy: 0.813
epoch 24 iter 50, loss: 1.354, accuracy: 0.823
epoch 24 iter 60, loss: 1.332, accuracy: 0.822
epoch 24 iter 70, loss: 1.327, accuracy: 0.823
epoch 24 iter 80, loss: 1.346, accuracy: 0.822
epoch 24 iter 90, loss: 1.602, accuracy: 0.804
epoch 24 iter 100, loss: 1.565, accuracy: 0.792
epoch 24 iter 110, loss: 1.499, accuracy: 0.801
epoch 24 iter 120, loss: 1.429, accuracy: 0.809
epoch 24 iter 130, loss: 1.289, accuracy: 0.795
epoch 24 iter 140, loss: 1.399, accuracy: 0.809
epoch 24 iter 150, loss: 1.276, accuracy: 0.794
epoch 24 iter 160, loss: 1.404, accuracy: 0.811
epoch 24 iter 170, loss: 1.348, accuracy: 0.815
epoch 24 iter 180, loss: 1.409, accuracy: 0.805
epoch 24 iter 190, loss: 1.343, accuracy: 0.816
epo

epoch 29 iter 270, loss: 1.653, accuracy: 0.821
epoch 29 iter 280, loss: 1.746, accuracy: 0.817
epoch 30 iter 0, loss: 1.655, accuracy: 0.816
epoch 30 iter 10, loss: 1.735, accuracy: 0.821
epoch 30 iter 20, loss: 1.841, accuracy: 0.815
epoch 30 iter 30, loss: 1.746, accuracy: 0.801
epoch 30 iter 40, loss: 1.754, accuracy: 0.822
epoch 30 iter 50, loss: 1.785, accuracy: 0.812
epoch 30 iter 60, loss: 1.721, accuracy: 0.825
epoch 30 iter 70, loss: 1.779, accuracy: 0.809
epoch 30 iter 80, loss: 1.696, accuracy: 0.824
epoch 30 iter 90, loss: 1.951, accuracy: 0.809
epoch 30 iter 100, loss: 1.837, accuracy: 0.809
epoch 30 iter 110, loss: 1.872, accuracy: 0.817
epoch 30 iter 120, loss: 1.792, accuracy: 0.808
epoch 30 iter 130, loss: 1.828, accuracy: 0.803
epoch 30 iter 140, loss: 1.736, accuracy: 0.808
epoch 30 iter 150, loss: 1.628, accuracy: 0.813
epoch 30 iter 160, loss: 1.689, accuracy: 0.821
epoch 30 iter 170, loss: 1.697, accuracy: 0.806
epoch 30 iter 180, loss: 1.801, accuracy: 0.815
epo

epoch 35 iter 260, loss: 2.232, accuracy: 0.820
epoch 35 iter 270, loss: 2.082, accuracy: 0.823
epoch 35 iter 280, loss: 2.164, accuracy: 0.813
epoch 36 iter 0, loss: 2.171, accuracy: 0.821
epoch 36 iter 10, loss: 2.077, accuracy: 0.819
epoch 36 iter 20, loss: 2.039, accuracy: 0.826
epoch 36 iter 30, loss: 2.223, accuracy: 0.809
epoch 36 iter 40, loss: 2.117, accuracy: 0.814
epoch 36 iter 50, loss: 2.105, accuracy: 0.822
epoch 36 iter 60, loss: 2.194, accuracy: 0.823
epoch 36 iter 70, loss: 2.033, accuracy: 0.822
epoch 36 iter 80, loss: 2.110, accuracy: 0.815
epoch 36 iter 90, loss: 2.200, accuracy: 0.807
epoch 36 iter 100, loss: 2.349, accuracy: 0.814
epoch 36 iter 110, loss: 1.909, accuracy: 0.813
epoch 36 iter 120, loss: 2.004, accuracy: 0.812
epoch 36 iter 130, loss: 2.219, accuracy: 0.823
epoch 36 iter 140, loss: 2.084, accuracy: 0.816
epoch 36 iter 150, loss: 1.964, accuracy: 0.818
epoch 36 iter 160, loss: 1.995, accuracy: 0.828
epoch 36 iter 170, loss: 2.117, accuracy: 0.821
epo

In [13]:
print('Filter size = {}, kernel_size = {}, stride = {}, FC_size = {}'.format(32, [7, 7], 2, 500))
modified_model_dict = apply_classification_loss(my_SVHN_net, [32, [7, 7], 2, 500])
train_model(modified_model_dict, dataset_generators, epoch_n=40, print_every=10)

Filter size = 32, kernel_size = [7, 7], stride = 2, FC_size = 500
epoch 0 iter 0, loss: 98.806, accuracy: 0.197
epoch 0 iter 10, loss: 2.330, accuracy: 0.096
epoch 0 iter 20, loss: 2.268, accuracy: 0.168
epoch 0 iter 30, loss: 2.249, accuracy: 0.172
epoch 0 iter 40, loss: 2.254, accuracy: 0.199
epoch 0 iter 50, loss: 2.241, accuracy: 0.199
epoch 0 iter 60, loss: 2.244, accuracy: 0.165
epoch 0 iter 70, loss: 2.234, accuracy: 0.196
epoch 0 iter 80, loss: 2.238, accuracy: 0.196
epoch 0 iter 90, loss: 2.235, accuracy: 0.196
epoch 0 iter 100, loss: 2.233, accuracy: 0.196
epoch 0 iter 110, loss: 2.230, accuracy: 0.196
epoch 0 iter 120, loss: 2.232, accuracy: 0.196
epoch 0 iter 130, loss: 2.232, accuracy: 0.196
epoch 0 iter 140, loss: 2.240, accuracy: 0.196
epoch 0 iter 150, loss: 2.225, accuracy: 0.197
epoch 0 iter 160, loss: 2.226, accuracy: 0.197
epoch 0 iter 170, loss: 2.221, accuracy: 0.202
epoch 0 iter 180, loss: 2.217, accuracy: 0.209
epoch 0 iter 190, loss: 2.219, accuracy: 0.208
epoc

epoch 6 iter 10, loss: 0.760, accuracy: 0.784
epoch 6 iter 20, loss: 0.779, accuracy: 0.781
epoch 6 iter 30, loss: 0.782, accuracy: 0.775
epoch 6 iter 40, loss: 0.773, accuracy: 0.779
epoch 6 iter 50, loss: 0.767, accuracy: 0.783
epoch 6 iter 60, loss: 0.782, accuracy: 0.779
epoch 6 iter 70, loss: 0.784, accuracy: 0.776
epoch 6 iter 80, loss: 0.855, accuracy: 0.761
epoch 6 iter 90, loss: 0.776, accuracy: 0.783
epoch 6 iter 100, loss: 0.795, accuracy: 0.776
epoch 6 iter 110, loss: 0.795, accuracy: 0.775
epoch 6 iter 120, loss: 0.758, accuracy: 0.787
epoch 6 iter 130, loss: 0.752, accuracy: 0.784
epoch 6 iter 140, loss: 0.777, accuracy: 0.785
epoch 6 iter 150, loss: 0.762, accuracy: 0.792
epoch 6 iter 160, loss: 0.771, accuracy: 0.786
epoch 6 iter 170, loss: 0.764, accuracy: 0.785
epoch 6 iter 180, loss: 0.774, accuracy: 0.787
epoch 6 iter 190, loss: 0.757, accuracy: 0.789
epoch 6 iter 200, loss: 0.732, accuracy: 0.796
epoch 6 iter 210, loss: 0.762, accuracy: 0.787
epoch 6 iter 220, loss

epoch 12 iter 20, loss: 0.794, accuracy: 0.794
epoch 12 iter 30, loss: 0.855, accuracy: 0.785
epoch 12 iter 40, loss: 0.769, accuracy: 0.806
epoch 12 iter 50, loss: 0.790, accuracy: 0.802
epoch 12 iter 60, loss: 0.810, accuracy: 0.802
epoch 12 iter 70, loss: 0.813, accuracy: 0.792
epoch 12 iter 80, loss: 0.831, accuracy: 0.798
epoch 12 iter 90, loss: 0.790, accuracy: 0.806
epoch 12 iter 100, loss: 0.799, accuracy: 0.801
epoch 12 iter 110, loss: 0.817, accuracy: 0.801
epoch 12 iter 120, loss: 0.812, accuracy: 0.799
epoch 12 iter 130, loss: 0.790, accuracy: 0.804
epoch 12 iter 140, loss: 0.813, accuracy: 0.800
epoch 12 iter 150, loss: 0.799, accuracy: 0.797
epoch 12 iter 160, loss: 0.811, accuracy: 0.798
epoch 12 iter 170, loss: 0.790, accuracy: 0.803
epoch 12 iter 180, loss: 0.810, accuracy: 0.801
epoch 12 iter 190, loss: 0.785, accuracy: 0.808
epoch 12 iter 200, loss: 0.809, accuracy: 0.806
epoch 12 iter 210, loss: 0.805, accuracy: 0.799
epoch 12 iter 220, loss: 0.811, accuracy: 0.793


epoch 18 iter 0, loss: 0.915, accuracy: 0.802
epoch 18 iter 10, loss: 0.911, accuracy: 0.810
epoch 18 iter 20, loss: 0.994, accuracy: 0.796
epoch 18 iter 30, loss: 1.635, accuracy: 0.744
epoch 18 iter 40, loss: 1.018, accuracy: 0.791
epoch 18 iter 50, loss: 0.996, accuracy: 0.801
epoch 18 iter 60, loss: 1.042, accuracy: 0.785
epoch 18 iter 70, loss: 0.936, accuracy: 0.793
epoch 18 iter 80, loss: 0.986, accuracy: 0.792
epoch 18 iter 90, loss: 0.902, accuracy: 0.802
epoch 18 iter 100, loss: 0.970, accuracy: 0.793
epoch 18 iter 110, loss: 0.904, accuracy: 0.795
epoch 18 iter 120, loss: 0.886, accuracy: 0.809
epoch 18 iter 130, loss: 0.931, accuracy: 0.803
epoch 18 iter 140, loss: 0.903, accuracy: 0.810
epoch 18 iter 150, loss: 0.918, accuracy: 0.799
epoch 18 iter 160, loss: 0.948, accuracy: 0.804
epoch 18 iter 170, loss: 0.975, accuracy: 0.806
epoch 18 iter 180, loss: 0.942, accuracy: 0.795
epoch 18 iter 190, loss: 0.864, accuracy: 0.808
epoch 18 iter 200, loss: 0.949, accuracy: 0.810
epo

epoch 23 iter 280, loss: 1.247, accuracy: 0.791
epoch 24 iter 0, loss: 1.094, accuracy: 0.784
epoch 24 iter 10, loss: 1.150, accuracy: 0.810
epoch 24 iter 20, loss: 1.297, accuracy: 0.788
epoch 24 iter 30, loss: 1.251, accuracy: 0.793
epoch 24 iter 40, loss: 1.192, accuracy: 0.793
epoch 24 iter 50, loss: 1.243, accuracy: 0.790
epoch 24 iter 60, loss: 1.216, accuracy: 0.793
epoch 24 iter 70, loss: 1.233, accuracy: 0.799
epoch 24 iter 80, loss: 1.071, accuracy: 0.795
epoch 24 iter 90, loss: 1.148, accuracy: 0.807
epoch 24 iter 100, loss: 1.285, accuracy: 0.784
epoch 24 iter 110, loss: 1.213, accuracy: 0.803
epoch 24 iter 120, loss: 1.131, accuracy: 0.799
epoch 24 iter 130, loss: 1.112, accuracy: 0.802
epoch 24 iter 140, loss: 1.182, accuracy: 0.799
epoch 24 iter 150, loss: 1.223, accuracy: 0.805
epoch 24 iter 160, loss: 1.132, accuracy: 0.804
epoch 24 iter 170, loss: 1.125, accuracy: 0.801
epoch 24 iter 180, loss: 1.059, accuracy: 0.800
epoch 24 iter 190, loss: 1.236, accuracy: 0.792
epo

epoch 29 iter 270, loss: 1.457, accuracy: 0.803
epoch 29 iter 280, loss: 1.295, accuracy: 0.808
epoch 30 iter 0, loss: 1.538, accuracy: 0.792
epoch 30 iter 10, loss: 1.456, accuracy: 0.797
epoch 30 iter 20, loss: 1.485, accuracy: 0.804
epoch 30 iter 30, loss: 1.710, accuracy: 0.790
epoch 30 iter 40, loss: 1.503, accuracy: 0.791
epoch 30 iter 50, loss: 1.479, accuracy: 0.800
epoch 30 iter 60, loss: 1.376, accuracy: 0.793
epoch 30 iter 70, loss: 1.512, accuracy: 0.802
epoch 30 iter 80, loss: 1.295, accuracy: 0.798
epoch 30 iter 90, loss: 1.478, accuracy: 0.795
epoch 30 iter 100, loss: 1.445, accuracy: 0.802
epoch 30 iter 110, loss: 1.342, accuracy: 0.805
epoch 30 iter 120, loss: 1.356, accuracy: 0.782
epoch 30 iter 130, loss: 1.469, accuracy: 0.806
epoch 30 iter 140, loss: 1.642, accuracy: 0.784
epoch 30 iter 150, loss: 1.514, accuracy: 0.802
epoch 30 iter 160, loss: 1.325, accuracy: 0.791
epoch 30 iter 170, loss: 1.595, accuracy: 0.800
epoch 30 iter 180, loss: 1.385, accuracy: 0.803
epo

epoch 35 iter 260, loss: 1.612, accuracy: 0.800
epoch 35 iter 270, loss: 1.636, accuracy: 0.795
epoch 35 iter 280, loss: 1.749, accuracy: 0.802
epoch 36 iter 0, loss: 1.555, accuracy: 0.805
epoch 36 iter 10, loss: 1.509, accuracy: 0.811
epoch 36 iter 20, loss: 1.744, accuracy: 0.808
epoch 36 iter 30, loss: 1.866, accuracy: 0.802
epoch 36 iter 40, loss: 1.954, accuracy: 0.795
epoch 36 iter 50, loss: 1.806, accuracy: 0.808
epoch 36 iter 60, loss: 1.447, accuracy: 0.804
epoch 36 iter 70, loss: 1.692, accuracy: 0.814
epoch 36 iter 80, loss: 1.624, accuracy: 0.809
epoch 36 iter 90, loss: 1.707, accuracy: 0.792
epoch 36 iter 100, loss: 1.740, accuracy: 0.811
epoch 36 iter 110, loss: 1.949, accuracy: 0.768
epoch 36 iter 120, loss: 1.637, accuracy: 0.788
epoch 36 iter 130, loss: 1.455, accuracy: 0.795
epoch 36 iter 140, loss: 2.058, accuracy: 0.775
epoch 36 iter 150, loss: 1.867, accuracy: 0.794
epoch 36 iter 160, loss: 1.649, accuracy: 0.799
epoch 36 iter 170, loss: 1.748, accuracy: 0.800
epo

## Stride ##

In [10]:
print('Filter size = {}, kernel_size = {}, stride = {}, FC_size = {}'.format(32, [5, 5], 3, 500))
modified_model_dict = apply_classification_loss(my_SVHN_net, [32, [5, 5], 3, 500])
train_model(modified_model_dict, dataset_generators, epoch_n=40, print_every=10)

Filter size = 32, kernel_size = [5, 5], stride = 3, FC_size = 500
epoch 0 iter 0, loss: 29.529, accuracy: 0.080
epoch 0 iter 10, loss: 4.940, accuracy: 0.161
epoch 0 iter 20, loss: 2.386, accuracy: 0.138
epoch 0 iter 30, loss: 2.250, accuracy: 0.187
epoch 0 iter 40, loss: 2.232, accuracy: 0.200
epoch 0 iter 50, loss: 2.217, accuracy: 0.204
epoch 0 iter 60, loss: 2.192, accuracy: 0.222
epoch 0 iter 70, loss: 2.172, accuracy: 0.240
epoch 0 iter 80, loss: 2.119, accuracy: 0.263
epoch 0 iter 90, loss: 2.121, accuracy: 0.241
epoch 0 iter 100, loss: 2.018, accuracy: 0.306
epoch 0 iter 110, loss: 1.947, accuracy: 0.334
epoch 0 iter 120, loss: 1.873, accuracy: 0.367
epoch 0 iter 130, loss: 1.789, accuracy: 0.396
epoch 0 iter 140, loss: 1.699, accuracy: 0.434
epoch 0 iter 150, loss: 1.634, accuracy: 0.461
epoch 0 iter 160, loss: 1.646, accuracy: 0.463
epoch 0 iter 170, loss: 1.585, accuracy: 0.489
epoch 0 iter 180, loss: 1.530, accuracy: 0.509
epoch 0 iter 190, loss: 1.520, accuracy: 0.521
epoc

epoch 6 iter 10, loss: 0.723, accuracy: 0.804
epoch 6 iter 20, loss: 0.698, accuracy: 0.810
epoch 6 iter 30, loss: 0.693, accuracy: 0.811
epoch 6 iter 40, loss: 0.678, accuracy: 0.813
epoch 6 iter 50, loss: 0.667, accuracy: 0.818
epoch 6 iter 60, loss: 0.679, accuracy: 0.815
epoch 6 iter 70, loss: 0.678, accuracy: 0.819
epoch 6 iter 80, loss: 0.696, accuracy: 0.802
epoch 6 iter 90, loss: 0.661, accuracy: 0.818
epoch 6 iter 100, loss: 0.674, accuracy: 0.815
epoch 6 iter 110, loss: 0.684, accuracy: 0.814
epoch 6 iter 120, loss: 0.667, accuracy: 0.822
epoch 6 iter 130, loss: 0.677, accuracy: 0.815
epoch 6 iter 140, loss: 0.674, accuracy: 0.816
epoch 6 iter 150, loss: 0.693, accuracy: 0.814
epoch 6 iter 160, loss: 0.663, accuracy: 0.822
epoch 6 iter 170, loss: 0.653, accuracy: 0.821
epoch 6 iter 180, loss: 0.666, accuracy: 0.819
epoch 6 iter 190, loss: 0.672, accuracy: 0.818
epoch 6 iter 200, loss: 0.677, accuracy: 0.821
epoch 6 iter 210, loss: 0.690, accuracy: 0.817
epoch 6 iter 220, loss

epoch 12 iter 20, loss: 0.806, accuracy: 0.815
epoch 12 iter 30, loss: 0.930, accuracy: 0.785
epoch 12 iter 40, loss: 0.898, accuracy: 0.809
epoch 12 iter 50, loss: 0.829, accuracy: 0.820
epoch 12 iter 60, loss: 0.765, accuracy: 0.814
epoch 12 iter 70, loss: 0.774, accuracy: 0.819
epoch 12 iter 80, loss: 0.804, accuracy: 0.814
epoch 12 iter 90, loss: 0.786, accuracy: 0.821
epoch 12 iter 100, loss: 0.783, accuracy: 0.815
epoch 12 iter 110, loss: 0.778, accuracy: 0.824
epoch 12 iter 120, loss: 0.766, accuracy: 0.823
epoch 12 iter 130, loss: 0.809, accuracy: 0.824
epoch 12 iter 140, loss: 0.811, accuracy: 0.821
epoch 12 iter 150, loss: 0.804, accuracy: 0.823
epoch 12 iter 160, loss: 0.838, accuracy: 0.818
epoch 12 iter 170, loss: 0.802, accuracy: 0.826
epoch 12 iter 180, loss: 0.777, accuracy: 0.825
epoch 12 iter 190, loss: 0.871, accuracy: 0.818
epoch 12 iter 200, loss: 0.866, accuracy: 0.811
epoch 12 iter 210, loss: 0.896, accuracy: 0.810
epoch 12 iter 220, loss: 0.840, accuracy: 0.810


epoch 18 iter 0, loss: 1.066, accuracy: 0.813
epoch 18 iter 10, loss: 1.044, accuracy: 0.821
epoch 18 iter 20, loss: 1.031, accuracy: 0.819
epoch 18 iter 30, loss: 1.029, accuracy: 0.808
epoch 18 iter 40, loss: 1.061, accuracy: 0.809
epoch 18 iter 50, loss: 1.063, accuracy: 0.815
epoch 18 iter 60, loss: 1.000, accuracy: 0.816
epoch 18 iter 70, loss: 0.982, accuracy: 0.823
epoch 18 iter 80, loss: 0.980, accuracy: 0.818
epoch 18 iter 90, loss: 0.984, accuracy: 0.813
epoch 18 iter 100, loss: 0.971, accuracy: 0.820
epoch 18 iter 110, loss: 1.047, accuracy: 0.809
epoch 18 iter 120, loss: 1.001, accuracy: 0.812
epoch 18 iter 130, loss: 1.041, accuracy: 0.819
epoch 18 iter 140, loss: 1.113, accuracy: 0.811
epoch 18 iter 150, loss: 1.028, accuracy: 0.813
epoch 18 iter 160, loss: 1.038, accuracy: 0.813
epoch 18 iter 170, loss: 1.093, accuracy: 0.813
epoch 18 iter 180, loss: 1.090, accuracy: 0.816
epoch 18 iter 190, loss: 1.100, accuracy: 0.815
epoch 18 iter 200, loss: 1.053, accuracy: 0.812
...

epoch 23 iter 280, loss: 1.390, accuracy: 0.815
epoch 24 iter 0, loss: 1.374, accuracy: 0.810
epoch 24 iter 10, loss: 1.312, accuracy: 0.821
epoch 24 iter 20, loss: 1.227, accuracy: 0.828
epoch 24 iter 30, loss: 1.329, accuracy: 0.824
epoch 24 iter 40, loss: 1.229, accuracy: 0.823
epoch 24 iter 50, loss: 1.277, accuracy: 0.815
epoch 24 iter 60, loss: 1.342, accuracy: 0.819
epoch 24 iter 70, loss: 1.312, accuracy: 0.824
epoch 24 iter 80, loss: 1.273, accuracy: 0.818
epoch 24 iter 90, loss: 1.203, accuracy: 0.824
epoch 24 iter 100, loss: 1.236, accuracy: 0.821
epoch 24 iter 110, loss: 1.272, accuracy: 0.821
epoch 24 iter 120, loss: 1.272, accuracy: 0.814
epoch 24 iter 130, loss: 1.320, accuracy: 0.815
epoch 24 iter 140, loss: 1.461, accuracy: 0.808
epoch 24 iter 150, loss: 1.345, accuracy: 0.819
epoch 24 iter 160, loss: 1.247, accuracy: 0.822
epoch 24 iter 170, loss: 1.315, accuracy: 0.822
epoch 24 iter 180, loss: 1.282, accuracy: 0.821
epoch 24 iter 190, loss: 1.346, accuracy: 0.816
...

epoch 29 iter 270, loss: 1.704, accuracy: 0.806
epoch 29 iter 280, loss: 1.737, accuracy: 0.821
epoch 30 iter 0, loss: 1.751, accuracy: 0.817
epoch 30 iter 10, loss: 1.741, accuracy: 0.805
epoch 30 iter 20, loss: 1.610, accuracy: 0.819
epoch 30 iter 30, loss: 1.604, accuracy: 0.813
epoch 30 iter 40, loss: 1.704, accuracy: 0.818
epoch 30 iter 50, loss: 1.653, accuracy: 0.821
epoch 30 iter 60, loss: 1.640, accuracy: 0.816
epoch 30 iter 70, loss: 1.563, accuracy: 0.823
epoch 30 iter 80, loss: 1.646, accuracy: 0.814
epoch 30 iter 90, loss: 1.504, accuracy: 0.825
epoch 30 iter 100, loss: 1.540, accuracy: 0.821
epoch 30 iter 110, loss: 1.612, accuracy: 0.821
epoch 30 iter 120, loss: 1.586, accuracy: 0.807
epoch 30 iter 130, loss: 1.713, accuracy: 0.809
epoch 30 iter 140, loss: 1.521, accuracy: 0.815
epoch 30 iter 150, loss: 1.711, accuracy: 0.804
epoch 30 iter 160, loss: 1.600, accuracy: 0.811
epoch 30 iter 170, loss: 1.640, accuracy: 0.821
epoch 30 iter 180, loss: 1.562, accuracy: 0.819
...

epoch 35 iter 260, loss: 1.796, accuracy: 0.823
epoch 35 iter 270, loss: 1.904, accuracy: 0.811
epoch 35 iter 280, loss: 2.057, accuracy: 0.829
epoch 36 iter 0, loss: 2.141, accuracy: 0.823
epoch 36 iter 10, loss: 2.077, accuracy: 0.816
epoch 36 iter 20, loss: 1.965, accuracy: 0.820
epoch 36 iter 30, loss: 1.960, accuracy: 0.819
epoch 36 iter 40, loss: 1.905, accuracy: 0.821
epoch 36 iter 50, loss: 1.968, accuracy: 0.826
epoch 36 iter 60, loss: 1.910, accuracy: 0.826
epoch 36 iter 70, loss: 1.846, accuracy: 0.816
epoch 36 iter 80, loss: 1.862, accuracy: 0.819
epoch 36 iter 90, loss: 1.866, accuracy: 0.818
epoch 36 iter 100, loss: 1.696, accuracy: 0.823
epoch 36 iter 110, loss: 1.820, accuracy: 0.824
epoch 36 iter 120, loss: 1.905, accuracy: 0.816
epoch 36 iter 130, loss: 1.857, accuracy: 0.825
epoch 36 iter 140, loss: 2.008, accuracy: 0.823
epoch 36 iter 150, loss: 1.997, accuracy: 0.818
epoch 36 iter 160, loss: 1.740, accuracy: 0.816
epoch 36 iter 170, loss: 1.762, accuracy: 0.825
...

In [11]:
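# Hyperparameters in the order of the printed labels: [filter size, kernel_size, stride, FC_size].
params = [32, [5, 5], 4, 500]
print('Filter size = {}, kernel_size = {}, stride = {}, FC_size = {}'.format(*params))
modified_model_dict = apply_classification_loss(my_SVHN_net, params)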
train_model(modified_model_dict, dataset_generators, epoch_n=40, print_every=10)

Filter size = 32, kernel_size = [5, 5], stride = 4, FC_size = 500
epoch 0 iter 0, loss: 14.892, accuracy: 0.147
epoch 0 iter 10, loss: 2.848, accuracy: 0.120
epoch 0 iter 20, loss: 2.274, accuracy: 0.144
epoch 0 iter 30, loss: 2.247, accuracy: 0.189
epoch 0 iter 40, loss: 2.242, accuracy: 0.193
epoch 0 iter 50, loss: 2.244, accuracy: 0.194
epoch 0 iter 60, loss: 2.245, accuracy: 0.180
epoch 0 iter 70, loss: 2.235, accuracy: 0.194
epoch 0 iter 80, loss: 2.217, accuracy: 0.201
epoch 0 iter 90, loss: 2.183, accuracy: 0.219
epoch 0 iter 100, loss: 2.172, accuracy: 0.220
epoch 0 iter 110, loss: 2.179, accuracy: 0.226
epoch 0 iter 120, loss: 2.107, accuracy: 0.243
epoch 0 iter 130, loss: 2.149, accuracy: 0.224
epoch 0 iter 140, loss: 2.088, accuracy: 0.249
epoch 0 iter 150, loss: 2.081, accuracy: 0.259
epoch 0 iter 160, loss: 2.105, accuracy: 0.250
epoch 0 iter 170, loss: 2.075, accuracy: 0.257
epoch 0 iter 180, loss: 2.062, accuracy: 0.263
epoch 0 iter 190, loss: 2.050, accuracy: 0.272
...

epoch 6 iter 10, loss: 0.755, accuracy: 0.779
epoch 6 iter 20, loss: 0.763, accuracy: 0.778
epoch 6 iter 30, loss: 0.763, accuracy: 0.777
epoch 6 iter 40, loss: 0.753, accuracy: 0.777
epoch 6 iter 50, loss: 0.747, accuracy: 0.778
epoch 6 iter 60, loss: 0.793, accuracy: 0.772
epoch 6 iter 70, loss: 0.777, accuracy: 0.770
epoch 6 iter 80, loss: 0.777, accuracy: 0.774
epoch 6 iter 90, loss: 0.770, accuracy: 0.773
epoch 6 iter 100, loss: 0.746, accuracy: 0.783
epoch 6 iter 110, loss: 0.733, accuracy: 0.786
epoch 6 iter 120, loss: 0.735, accuracy: 0.784
epoch 6 iter 130, loss: 0.747, accuracy: 0.785
epoch 6 iter 140, loss: 0.757, accuracy: 0.778
epoch 6 iter 150, loss: 0.786, accuracy: 0.770
epoch 6 iter 160, loss: 0.759, accuracy: 0.779
epoch 6 iter 170, loss: 0.789, accuracy: 0.763
epoch 6 iter 180, loss: 0.759, accuracy: 0.779
epoch 6 iter 190, loss: 0.759, accuracy: 0.776
epoch 6 iter 200, loss: 0.751, accuracy: 0.780
epoch 6 iter 210, loss: 0.740, accuracy: 0.785
epoch 6 iter 220, loss: ...

epoch 12 iter 20, loss: 0.681, accuracy: 0.809
epoch 12 iter 30, loss: 0.673, accuracy: 0.810
epoch 12 iter 40, loss: 0.654, accuracy: 0.816
epoch 12 iter 50, loss: 0.638, accuracy: 0.815
epoch 12 iter 60, loss: 0.635, accuracy: 0.824
epoch 12 iter 70, loss: 0.669, accuracy: 0.811
epoch 12 iter 80, loss: 0.642, accuracy: 0.820
epoch 12 iter 90, loss: 0.625, accuracy: 0.820
epoch 12 iter 100, loss: 0.664, accuracy: 0.811
epoch 12 iter 110, loss: 0.645, accuracy: 0.817
epoch 12 iter 120, loss: 0.666, accuracy: 0.810
epoch 12 iter 130, loss: 0.641, accuracy: 0.818
epoch 12 iter 140, loss: 0.650, accuracy: 0.817
epoch 12 iter 150, loss: 0.645, accuracy: 0.820
epoch 12 iter 160, loss: 0.653, accuracy: 0.815
epoch 12 iter 170, loss: 0.656, accuracy: 0.813
epoch 12 iter 180, loss: 0.651, accuracy: 0.818
epoch 12 iter 190, loss: 0.669, accuracy: 0.811
epoch 12 iter 200, loss: 0.682, accuracy: 0.812
epoch 12 iter 210, loss: 0.644, accuracy: 0.816
epoch 12 iter 220, loss: 0.644, accuracy: 0.819


epoch 18 iter 0, loss: 0.718, accuracy: 0.814
epoch 18 iter 10, loss: 0.696, accuracy: 0.823
epoch 18 iter 20, loss: 0.680, accuracy: 0.823
epoch 18 iter 30, loss: 0.702, accuracy: 0.816
epoch 18 iter 40, loss: 0.786, accuracy: 0.807
epoch 18 iter 50, loss: 0.692, accuracy: 0.818
epoch 18 iter 60, loss: 0.672, accuracy: 0.822
epoch 18 iter 70, loss: 0.716, accuracy: 0.817
epoch 18 iter 80, loss: 0.684, accuracy: 0.814
epoch 18 iter 90, loss: 0.654, accuracy: 0.828
epoch 18 iter 100, loss: 0.713, accuracy: 0.810
epoch 18 iter 110, loss: 0.660, accuracy: 0.822
epoch 18 iter 120, loss: 0.692, accuracy: 0.821
epoch 18 iter 130, loss: 0.659, accuracy: 0.829
epoch 18 iter 140, loss: 0.660, accuracy: 0.826
epoch 18 iter 150, loss: 0.691, accuracy: 0.818
epoch 18 iter 160, loss: 0.677, accuracy: 0.827
epoch 18 iter 170, loss: 0.693, accuracy: 0.813
epoch 18 iter 180, loss: 0.729, accuracy: 0.807
epoch 18 iter 190, loss: 0.693, accuracy: 0.815
epoch 18 iter 200, loss: 0.679, accuracy: 0.825
...

epoch 23 iter 280, loss: 0.792, accuracy: 0.805
epoch 24 iter 0, loss: 0.763, accuracy: 0.825
epoch 24 iter 10, loss: 0.752, accuracy: 0.828
epoch 24 iter 20, loss: 0.739, accuracy: 0.828
epoch 24 iter 30, loss: 0.778, accuracy: 0.822
epoch 24 iter 40, loss: 0.833, accuracy: 0.820
epoch 24 iter 50, loss: 0.808, accuracy: 0.810
epoch 24 iter 60, loss: 0.757, accuracy: 0.825
epoch 24 iter 70, loss: 0.770, accuracy: 0.820
epoch 24 iter 80, loss: 0.744, accuracy: 0.821
epoch 24 iter 90, loss: 0.711, accuracy: 0.829
epoch 24 iter 100, loss: 0.699, accuracy: 0.831
epoch 24 iter 110, loss: 0.752, accuracy: 0.821
epoch 24 iter 120, loss: 0.738, accuracy: 0.825
epoch 24 iter 130, loss: 0.807, accuracy: 0.811
epoch 24 iter 140, loss: 0.783, accuracy: 0.824
epoch 24 iter 150, loss: 0.788, accuracy: 0.817
epoch 24 iter 160, loss: 0.741, accuracy: 0.825
epoch 24 iter 170, loss: 0.743, accuracy: 0.824
epoch 24 iter 180, loss: 0.698, accuracy: 0.832
epoch 24 iter 190, loss: 0.713, accuracy: 0.829
...

epoch 29 iter 270, loss: 0.904, accuracy: 0.815
epoch 29 iter 280, loss: 0.891, accuracy: 0.820
epoch 30 iter 0, loss: 0.878, accuracy: 0.821
epoch 30 iter 10, loss: 0.890, accuracy: 0.812
epoch 30 iter 20, loss: 0.811, accuracy: 0.828
epoch 30 iter 30, loss: 0.851, accuracy: 0.827
epoch 30 iter 40, loss: 0.867, accuracy: 0.830
epoch 30 iter 50, loss: 0.900, accuracy: 0.821
epoch 30 iter 60, loss: 0.891, accuracy: 0.825
epoch 30 iter 70, loss: 0.931, accuracy: 0.806
epoch 30 iter 80, loss: 0.898, accuracy: 0.806
epoch 30 iter 90, loss: 0.846, accuracy: 0.822
epoch 30 iter 100, loss: 0.862, accuracy: 0.808
epoch 30 iter 110, loss: 0.869, accuracy: 0.818
epoch 30 iter 120, loss: 0.870, accuracy: 0.819
epoch 30 iter 130, loss: 0.937, accuracy: 0.808
epoch 30 iter 140, loss: 0.948, accuracy: 0.806
epoch 30 iter 150, loss: 0.922, accuracy: 0.816
epoch 30 iter 160, loss: 0.881, accuracy: 0.827
epoch 30 iter 170, loss: 0.861, accuracy: 0.821
epoch 30 iter 180, loss: 0.850, accuracy: 0.824
...

epoch 35 iter 260, loss: 0.973, accuracy: 0.832
epoch 35 iter 270, loss: 0.926, accuracy: 0.830
epoch 35 iter 280, loss: 0.969, accuracy: 0.827
epoch 36 iter 0, loss: 0.931, accuracy: 0.835
epoch 36 iter 10, loss: 0.998, accuracy: 0.819
epoch 36 iter 20, loss: 0.897, accuracy: 0.834
epoch 36 iter 30, loss: 0.988, accuracy: 0.822
epoch 36 iter 40, loss: 1.008, accuracy: 0.822
epoch 36 iter 50, loss: 0.984, accuracy: 0.818
epoch 36 iter 60, loss: 0.940, accuracy: 0.826
epoch 36 iter 70, loss: 1.014, accuracy: 0.814
epoch 36 iter 80, loss: 1.044, accuracy: 0.812
epoch 36 iter 90, loss: 0.982, accuracy: 0.816
epoch 36 iter 100, loss: 0.882, accuracy: 0.833
epoch 36 iter 110, loss: 0.937, accuracy: 0.814
epoch 36 iter 120, loss: 0.953, accuracy: 0.825
epoch 36 iter 130, loss: 0.944, accuracy: 0.817
epoch 36 iter 140, loss: 0.903, accuracy: 0.829
epoch 36 iter 150, loss: 0.994, accuracy: 0.829
epoch 36 iter 160, loss: 0.940, accuracy: 0.820
epoch 36 iter 170, loss: 0.944, accuracy: 0.827
...

In [12]:
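# Hyperparameters in the order of the printed labels: [filter size, kernel_size, stride, FC_size].
params = [32, [5, 5], 5, 500]
print('Filter size = {}, kernel_size = {}, stride = {}, FC_size = {}'.format(*params))
modified_model_dict = apply_classification_loss(my_SVHN_net, params)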
train_model(modified_model_dict, dataset_generators, epoch_n=40, print_every=10)

Filter size = 32, kernel_size = [5, 5], stride = 5, FC_size = 500
epoch 0 iter 0, loss: 13.677, accuracy: 0.129
epoch 0 iter 10, loss: 5.742, accuracy: 0.104
epoch 0 iter 20, loss: 2.528, accuracy: 0.134
epoch 0 iter 30, loss: 2.298, accuracy: 0.152
epoch 0 iter 40, loss: 2.254, accuracy: 0.206
epoch 0 iter 50, loss: 2.224, accuracy: 0.216
epoch 0 iter 60, loss: 2.190, accuracy: 0.232
epoch 0 iter 70, loss: 2.168, accuracy: 0.242
epoch 0 iter 80, loss: 2.147, accuracy: 0.244
epoch 0 iter 90, loss: 2.114, accuracy: 0.261
epoch 0 iter 100, loss: 2.077, accuracy: 0.273
epoch 0 iter 110, loss: 2.047, accuracy: 0.293
epoch 0 iter 120, loss: 2.021, accuracy: 0.310
epoch 0 iter 130, loss: 1.923, accuracy: 0.341
epoch 0 iter 140, loss: 1.889, accuracy: 0.368
epoch 0 iter 150, loss: 1.835, accuracy: 0.379
epoch 0 iter 160, loss: 1.826, accuracy: 0.380
epoch 0 iter 170, loss: 1.761, accuracy: 0.411
epoch 0 iter 180, loss: 1.746, accuracy: 0.405
epoch 0 iter 190, loss: 1.749, accuracy: 0.411
...

epoch 6 iter 10, loss: 0.708, accuracy: 0.798
epoch 6 iter 20, loss: 0.724, accuracy: 0.797
epoch 6 iter 30, loss: 0.774, accuracy: 0.781
epoch 6 iter 40, loss: 0.726, accuracy: 0.794
epoch 6 iter 50, loss: 0.692, accuracy: 0.801
epoch 6 iter 60, loss: 0.698, accuracy: 0.803
epoch 6 iter 70, loss: 0.742, accuracy: 0.790
epoch 6 iter 80, loss: 0.715, accuracy: 0.796
epoch 6 iter 90, loss: 0.719, accuracy: 0.793
epoch 6 iter 100, loss: 0.711, accuracy: 0.798
epoch 6 iter 110, loss: 0.741, accuracy: 0.794
epoch 6 iter 120, loss: 0.703, accuracy: 0.802
epoch 6 iter 130, loss: 0.695, accuracy: 0.803
epoch 6 iter 140, loss: 0.711, accuracy: 0.797
epoch 6 iter 150, loss: 0.733, accuracy: 0.793
epoch 6 iter 160, loss: 0.727, accuracy: 0.793
epoch 6 iter 170, loss: 0.718, accuracy: 0.794
epoch 6 iter 180, loss: 0.740, accuracy: 0.795
epoch 6 iter 190, loss: 0.755, accuracy: 0.783
epoch 6 iter 200, loss: 0.725, accuracy: 0.798
epoch 6 iter 210, loss: 0.751, accuracy: 0.787
epoch 6 iter 220, loss: ...

epoch 12 iter 20, loss: 0.685, accuracy: 0.822
epoch 12 iter 30, loss: 0.760, accuracy: 0.803
epoch 12 iter 40, loss: 0.703, accuracy: 0.814
epoch 12 iter 50, loss: 0.708, accuracy: 0.810
epoch 12 iter 60, loss: 0.695, accuracy: 0.812
epoch 12 iter 70, loss: 0.724, accuracy: 0.804
epoch 12 iter 80, loss: 0.711, accuracy: 0.809
epoch 12 iter 90, loss: 0.703, accuracy: 0.812
epoch 12 iter 100, loss: 0.717, accuracy: 0.806
epoch 12 iter 110, loss: 0.726, accuracy: 0.817
epoch 12 iter 120, loss: 0.671, accuracy: 0.822
epoch 12 iter 130, loss: 0.709, accuracy: 0.814
epoch 12 iter 140, loss: 0.675, accuracy: 0.817
epoch 12 iter 150, loss: 0.705, accuracy: 0.813
epoch 12 iter 160, loss: 0.723, accuracy: 0.810
epoch 12 iter 170, loss: 0.716, accuracy: 0.813
epoch 12 iter 180, loss: 0.726, accuracy: 0.809
epoch 12 iter 190, loss: 0.708, accuracy: 0.812
epoch 12 iter 200, loss: 0.689, accuracy: 0.817
epoch 12 iter 210, loss: 0.704, accuracy: 0.816
epoch 12 iter 220, loss: 0.733, accuracy: 0.813


epoch 18 iter 0, loss: 0.776, accuracy: 0.819
epoch 18 iter 10, loss: 0.738, accuracy: 0.828
epoch 18 iter 20, loss: 0.747, accuracy: 0.825
epoch 18 iter 30, loss: 0.832, accuracy: 0.803
epoch 18 iter 40, loss: 0.749, accuracy: 0.812
epoch 18 iter 50, loss: 0.777, accuracy: 0.818
epoch 18 iter 60, loss: 0.829, accuracy: 0.802
epoch 18 iter 70, loss: 0.759, accuracy: 0.812
epoch 18 iter 80, loss: 0.762, accuracy: 0.816
epoch 18 iter 90, loss: 0.736, accuracy: 0.823
epoch 18 iter 100, loss: 0.812, accuracy: 0.801
epoch 18 iter 110, loss: 0.768, accuracy: 0.812
epoch 18 iter 120, loss: 0.787, accuracy: 0.810
epoch 18 iter 130, loss: 0.781, accuracy: 0.817
epoch 18 iter 140, loss: 0.758, accuracy: 0.821
epoch 18 iter 150, loss: 0.733, accuracy: 0.825
epoch 18 iter 160, loss: 0.802, accuracy: 0.804
epoch 18 iter 170, loss: 0.731, accuracy: 0.826
epoch 18 iter 180, loss: 0.852, accuracy: 0.793
epoch 18 iter 190, loss: 0.739, accuracy: 0.819
epoch 18 iter 200, loss: 0.725, accuracy: 0.827
...

epoch 23 iter 280, loss: 0.855, accuracy: 0.822
epoch 24 iter 0, loss: 0.835, accuracy: 0.822
epoch 24 iter 10, loss: 0.834, accuracy: 0.820
epoch 24 iter 20, loss: 0.860, accuracy: 0.817
epoch 24 iter 30, loss: 0.894, accuracy: 0.811
epoch 24 iter 40, loss: 0.931, accuracy: 0.805
epoch 24 iter 50, loss: 0.895, accuracy: 0.814
epoch 24 iter 60, loss: 0.859, accuracy: 0.818
epoch 24 iter 70, loss: 0.868, accuracy: 0.809
epoch 24 iter 80, loss: 0.896, accuracy: 0.808
epoch 24 iter 90, loss: 0.885, accuracy: 0.810
epoch 24 iter 100, loss: 0.877, accuracy: 0.806
epoch 24 iter 110, loss: 0.914, accuracy: 0.808
epoch 24 iter 120, loss: 0.846, accuracy: 0.819
epoch 24 iter 130, loss: 0.857, accuracy: 0.817
epoch 24 iter 140, loss: 0.860, accuracy: 0.819
epoch 24 iter 150, loss: 0.870, accuracy: 0.813
epoch 24 iter 160, loss: 0.846, accuracy: 0.815
epoch 24 iter 170, loss: 0.843, accuracy: 0.818
epoch 24 iter 180, loss: 0.832, accuracy: 0.825
epoch 24 iter 190, loss: 0.841, accuracy: 0.815
...

epoch 29 iter 270, loss: 0.947, accuracy: 0.820
epoch 29 iter 280, loss: 0.914, accuracy: 0.825
epoch 30 iter 0, loss: 0.927, accuracy: 0.827
epoch 30 iter 10, loss: 0.887, accuracy: 0.828
epoch 30 iter 20, loss: 0.969, accuracy: 0.819
epoch 30 iter 30, loss: 1.015, accuracy: 0.810
epoch 30 iter 40, loss: 0.978, accuracy: 0.815
epoch 30 iter 50, loss: 1.094, accuracy: 0.806
epoch 30 iter 60, loss: 0.983, accuracy: 0.817
epoch 30 iter 70, loss: 0.970, accuracy: 0.807
epoch 30 iter 80, loss: 0.985, accuracy: 0.816
epoch 30 iter 90, loss: 1.050, accuracy: 0.804
epoch 30 iter 100, loss: 0.971, accuracy: 0.812
epoch 30 iter 110, loss: 1.064, accuracy: 0.793
epoch 30 iter 120, loss: 0.950, accuracy: 0.814
epoch 30 iter 130, loss: 1.016, accuracy: 0.811
epoch 30 iter 140, loss: 0.934, accuracy: 0.818
epoch 30 iter 150, loss: 0.922, accuracy: 0.817
epoch 30 iter 160, loss: 0.912, accuracy: 0.823
epoch 30 iter 170, loss: 0.950, accuracy: 0.820
epoch 30 iter 180, loss: 0.920, accuracy: 0.831
...

epoch 35 iter 260, loss: 1.057, accuracy: 0.827
epoch 35 iter 270, loss: 1.134, accuracy: 0.811
epoch 35 iter 280, loss: 1.054, accuracy: 0.825
epoch 36 iter 0, loss: 1.096, accuracy: 0.818
epoch 36 iter 10, loss: 1.059, accuracy: 0.829
epoch 36 iter 20, loss: 1.095, accuracy: 0.822
epoch 36 iter 30, loss: 1.090, accuracy: 0.819
epoch 36 iter 40, loss: 1.065, accuracy: 0.823
epoch 36 iter 50, loss: 1.081, accuracy: 0.824
epoch 36 iter 60, loss: 1.095, accuracy: 0.821
epoch 36 iter 70, loss: 1.137, accuracy: 0.806
epoch 36 iter 80, loss: 1.106, accuracy: 0.810
epoch 36 iter 90, loss: 1.113, accuracy: 0.819
epoch 36 iter 100, loss: 1.143, accuracy: 0.802
epoch 36 iter 110, loss: 1.121, accuracy: 0.807
epoch 36 iter 120, loss: 1.073, accuracy: 0.803
epoch 36 iter 130, loss: 1.129, accuracy: 0.810
epoch 36 iter 140, loss: 1.136, accuracy: 0.800
epoch 36 iter 150, loss: 1.187, accuracy: 0.806
epoch 36 iter 160, loss: 1.116, accuracy: 0.811
epoch 36 iter 170, loss: 1.085, accuracy: 0.813
...
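
The runs above print one line every 10 iterations, which makes it hard to compare configurations by eye. A minimal sketch of a per-epoch summary follows, assuming every complete line has exactly the `epoch E iter I, loss: X, accuracy: Y` form shown in the output. The names `LOG_LINE` and `summarize_log` are hypothetical helpers, not part of the starter code, and the averages cover only the lines that were printed, so they are not true per-epoch means.

```python
import re
from collections import defaultdict

# Matches complete log lines such as
# "epoch 6 iter 180, loss: 0.666, accuracy: 0.819".
LOG_LINE = re.compile(
    r"^epoch (\d+) iter (\d+), loss: ([0-9.]+), accuracy: ([0-9.]+)$"
)


def summarize_log(text):
    """Return {epoch: (mean_loss, mean_accuracy, n_lines)} over complete lines."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip blank, truncated, or unrelated lines
        entry = sums[int(match.group(1))]
        entry[0] += float(match.group(3))  # running sum of loss
        entry[1] += float(match.group(4))  # running sum of accuracy
        entry[2] += 1                      # number of lines seen for this epoch
    return {
        epoch: (loss_sum / n, acc_sum / n, n)
        for epoch, (loss_sum, acc_sum, n) in sorted(sums.items())
    }


# A few lines copied from the stride = 5 run, plus one truncated fragment.
sample = """\
epoch 35 iter 280, loss: 1.054, accuracy: 0.825
epoch 36 iter 0, loss: 1.096, accuracy: 0.818
epoch 36 iter 10, loss: 1.059, accuracy: 0.829
epo
"""

summary = summarize_log(sample)
for epoch in sorted(summary):
    mean_loss, mean_acc, n = summary[epoch]
    print("epoch {}: mean loss {:.3f}, mean accuracy {:.3f} over {} lines".format(
        epoch, mean_loss, mean_acc, n))
```

Passing the captured output of each run through `summarize_log` gives one row per epoch, which makes it easier to see whether the loss keeps rising in later epochs while accuracy stays roughly flat. The log itself does not say whether the printed values are training or validation metrics, so read any trend with that in mind.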

## FC size ##

In [13]:
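# Hyperparameters in the order of the printed labels: [filter size, kernel_size, stride, FC_size].
params = [32, [5, 5], 2, 400]
print('Filter size = {}, kernel_size = {}, stride = {}, FC_size = {}'.format(*params))
modified_model_dict = apply_classification_loss(my_SVHN_net, params)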
train_model(modified_model_dict, dataset_generators, epoch_n=40, print_every=10)

Filter size = 32, kernel_size = [5, 5], stride = 2, FC_size = 400
epoch 0 iter 0, loss: 105.521, accuracy: 0.158
epoch 0 iter 10, loss: 2.403, accuracy: 0.149
epoch 0 iter 20, loss: 2.300, accuracy: 0.118
epoch 0 iter 30, loss: 2.259, accuracy: 0.175
epoch 0 iter 40, loss: 2.250, accuracy: 0.194
epoch 0 iter 50, loss: 2.224, accuracy: 0.203
epoch 0 iter 60, loss: 2.214, accuracy: 0.210
epoch 0 iter 70, loss: 2.201, accuracy: 0.216
epoch 0 iter 80, loss: 2.201, accuracy: 0.221
epoch 0 iter 90, loss: 2.177, accuracy: 0.226
epoch 0 iter 100, loss: 2.160, accuracy: 0.237
epoch 0 iter 110, loss: 2.144, accuracy: 0.247
epoch 0 iter 120, loss: 2.111, accuracy: 0.257
epoch 0 iter 130, loss: 2.121, accuracy: 0.252
epoch 0 iter 140, loss: 2.083, accuracy: 0.266
epoch 0 iter 150, loss: 2.063, accuracy: 0.274
epoch 0 iter 160, loss: 2.070, accuracy: 0.262
epoch 0 iter 170, loss: 2.045, accuracy: 0.282
epoch 0 iter 180, loss: 2.109, accuracy: 0.259
epoch 0 iter 190, loss: 1.974, accuracy: 0.327
...

epoch 6 iter 10, loss: 0.734, accuracy: 0.798
epoch 6 iter 20, loss: 0.707, accuracy: 0.808
epoch 6 iter 30, loss: 0.727, accuracy: 0.800
epoch 6 iter 40, loss: 0.729, accuracy: 0.801
epoch 6 iter 50, loss: 0.703, accuracy: 0.806
epoch 6 iter 60, loss: 0.717, accuracy: 0.804
epoch 6 iter 70, loss: 0.719, accuracy: 0.802
epoch 6 iter 80, loss: 0.801, accuracy: 0.768
epoch 6 iter 90, loss: 0.782, accuracy: 0.793
epoch 6 iter 100, loss: 0.768, accuracy: 0.791
epoch 6 iter 110, loss: 0.760, accuracy: 0.795
epoch 6 iter 120, loss: 0.793, accuracy: 0.789
epoch 6 iter 130, loss: 0.727, accuracy: 0.808
epoch 6 iter 140, loss: 0.726, accuracy: 0.800
epoch 6 iter 150, loss: 0.730, accuracy: 0.808
epoch 6 iter 160, loss: 0.747, accuracy: 0.802
epoch 6 iter 170, loss: 0.737, accuracy: 0.809
epoch 6 iter 180, loss: 0.741, accuracy: 0.803
epoch 6 iter 190, loss: 0.763, accuracy: 0.801
epoch 6 iter 200, loss: 0.788, accuracy: 0.804
epoch 6 iter 210, loss: 0.761, accuracy: 0.803
epoch 6 iter 220, loss: ...

epoch 12 iter 20, loss: 0.881, accuracy: 0.817
epoch 12 iter 30, loss: 0.953, accuracy: 0.800
epoch 12 iter 40, loss: 0.951, accuracy: 0.810
epoch 12 iter 50, loss: 0.927, accuracy: 0.820
epoch 12 iter 60, loss: 0.979, accuracy: 0.815
epoch 12 iter 70, loss: 0.927, accuracy: 0.816
epoch 12 iter 80, loss: 0.887, accuracy: 0.822
epoch 12 iter 90, loss: 0.841, accuracy: 0.835
epoch 12 iter 100, loss: 0.906, accuracy: 0.827
epoch 12 iter 110, loss: 0.871, accuracy: 0.815
epoch 12 iter 120, loss: 0.964, accuracy: 0.805
epoch 12 iter 130, loss: 0.881, accuracy: 0.817
epoch 12 iter 140, loss: 0.888, accuracy: 0.826
epoch 12 iter 150, loss: 0.914, accuracy: 0.825
epoch 12 iter 160, loss: 0.880, accuracy: 0.826
epoch 12 iter 170, loss: 0.904, accuracy: 0.819
epoch 12 iter 180, loss: 0.978, accuracy: 0.818
epoch 12 iter 190, loss: 0.936, accuracy: 0.824
epoch 12 iter 200, loss: 0.992, accuracy: 0.815
epoch 12 iter 210, loss: 1.001, accuracy: 0.820
epoch 12 iter 220, loss: 0.948, accuracy: 0.829


epoch 18 iter 0, loss: 1.201, accuracy: 0.833
epoch 18 iter 10, loss: 1.201, accuracy: 0.833
epoch 18 iter 20, loss: 1.075, accuracy: 0.841
epoch 18 iter 30, loss: 1.251, accuracy: 0.818
epoch 18 iter 40, loss: 1.413, accuracy: 0.807
epoch 18 iter 50, loss: 1.177, accuracy: 0.829
epoch 18 iter 60, loss: 1.123, accuracy: 0.818
epoch 18 iter 70, loss: 1.250, accuracy: 0.829
epoch 18 iter 80, loss: 1.162, accuracy: 0.837
epoch 18 iter 90, loss: 1.217, accuracy: 0.832
epoch 18 iter 100, loss: 1.299, accuracy: 0.824
epoch 18 iter 110, loss: 1.177, accuracy: 0.818
epoch 18 iter 120, loss: 1.238, accuracy: 0.823
epoch 18 iter 130, loss: 1.252, accuracy: 0.828
epoch 18 iter 140, loss: 1.122, accuracy: 0.835
epoch 18 iter 150, loss: 1.171, accuracy: 0.842
epoch 18 iter 160, loss: 1.092, accuracy: 0.828
epoch 18 iter 170, loss: 1.161, accuracy: 0.837
epoch 18 iter 180, loss: 1.193, accuracy: 0.834
epoch 18 iter 190, loss: 1.121, accuracy: 0.830
epoch 18 iter 200, loss: 1.221, accuracy: 0.840
...

epoch 23 iter 280, loss: 1.441, accuracy: 0.843
epoch 24 iter 0, loss: 1.465, accuracy: 0.842
epoch 24 iter 10, loss: 1.613, accuracy: 0.839
epoch 24 iter 20, loss: 1.449, accuracy: 0.838
epoch 24 iter 30, loss: 1.377, accuracy: 0.832
epoch 24 iter 40, loss: 1.441, accuracy: 0.823
epoch 24 iter 50, loss: 1.818, accuracy: 0.816
epoch 24 iter 60, loss: 1.568, accuracy: 0.835
epoch 24 iter 70, loss: 1.478, accuracy: 0.827
epoch 24 iter 80, loss: 1.618, accuracy: 0.828
epoch 24 iter 90, loss: 1.692, accuracy: 0.833
epoch 24 iter 100, loss: 1.714, accuracy: 0.836
epoch 24 iter 110, loss: 1.529, accuracy: 0.844
epoch 24 iter 120, loss: 1.497, accuracy: 0.832
epoch 24 iter 130, loss: 1.724, accuracy: 0.829
epoch 24 iter 140, loss: 1.533, accuracy: 0.834
epoch 24 iter 150, loss: 1.414, accuracy: 0.837
epoch 24 iter 160, loss: 1.338, accuracy: 0.837
epoch 24 iter 170, loss: 1.449, accuracy: 0.842
epoch 24 iter 180, loss: 1.665, accuracy: 0.829
epoch 24 iter 190, loss: 1.409, accuracy: 0.840
...

epoch 29 iter 270, loss: 1.959, accuracy: 0.839
epoch 29 iter 280, loss: 1.778, accuracy: 0.841
epoch 30 iter 0, loss: 1.735, accuracy: 0.842
epoch 30 iter 10, loss: 1.902, accuracy: 0.840
epoch 30 iter 20, loss: 1.852, accuracy: 0.850
epoch 30 iter 30, loss: 1.680, accuracy: 0.843
epoch 30 iter 40, loss: 1.799, accuracy: 0.846
epoch 30 iter 50, loss: 1.969, accuracy: 0.836
epoch 30 iter 60, loss: 1.955, accuracy: 0.837
epoch 30 iter 70, loss: 2.028, accuracy: 0.840
epoch 30 iter 80, loss: 1.932, accuracy: 0.818
epoch 30 iter 90, loss: 1.810, accuracy: 0.832
epoch 30 iter 100, loss: 1.949, accuracy: 0.845
epoch 30 iter 110, loss: 1.970, accuracy: 0.839
epoch 30 iter 120, loss: 1.868, accuracy: 0.837
epoch 30 iter 130, loss: 1.904, accuracy: 0.847
epoch 30 iter 140, loss: 1.814, accuracy: 0.848
epoch 30 iter 150, loss: 1.927, accuracy: 0.835
epoch 30 iter 160, loss: 1.845, accuracy: 0.846
epoch 30 iter 170, loss: 1.782, accuracy: 0.835
epoch 30 iter 180, loss: 1.984, accuracy: 0.818
...

epoch 35 iter 260, loss: 2.162, accuracy: 0.844
epoch 35 iter 270, loss: 2.133, accuracy: 0.835
epoch 35 iter 280, loss: 2.224, accuracy: 0.845
epoch 36 iter 0, loss: 2.360, accuracy: 0.848
epoch 36 iter 10, loss: 2.447, accuracy: 0.844
epoch 36 iter 20, loss: 2.084, accuracy: 0.844
epoch 36 iter 30, loss: 2.079, accuracy: 0.848
epoch 36 iter 40, loss: 2.219, accuracy: 0.843
epoch 36 iter 50, loss: 2.283, accuracy: 0.845
epoch 36 iter 60, loss: 2.348, accuracy: 0.838
epoch 36 iter 70, loss: 2.517, accuracy: 0.830
epoch 36 iter 80, loss: 2.353, accuracy: 0.835
epoch 36 iter 90, loss: 2.195, accuracy: 0.827
epoch 36 iter 100, loss: 2.339, accuracy: 0.837
epoch 36 iter 110, loss: 2.289, accuracy: 0.837
epoch 36 iter 120, loss: 2.341, accuracy: 0.830
epoch 36 iter 130, loss: 2.295, accuracy: 0.848
epoch 36 iter 140, loss: 2.336, accuracy: 0.848
epoch 36 iter 150, loss: 2.247, accuracy: 0.848
epoch 36 iter 160, loss: 2.318, accuracy: 0.835
epoch 36 iter 170, loss: 2.165, accuracy: 0.832
...

In [14]:
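# Hyperparameters in the order of the printed labels: [filter size, kernel_size, stride, FC_size].
params = [32, [5, 5], 2, 600]
print('Filter size = {}, kernel_size = {}, stride = {}, FC_size = {}'.format(*params))
modified_model_dict = apply_classification_loss(my_SVHN_net, params)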
train_model(modified_model_dict, dataset_generators, epoch_n=40, print_every=10)

Filter size = 32, kernel_size = [5, 5], stride = 2, FC_size = 600
epoch 0 iter 0, loss: 114.819, accuracy: 0.159
epoch 0 iter 10, loss: 3.245, accuracy: 0.135
epoch 0 iter 20, loss: 2.260, accuracy: 0.171
epoch 0 iter 30, loss: 2.236, accuracy: 0.194
epoch 0 iter 40, loss: 2.231, accuracy: 0.197
epoch 0 iter 50, loss: 2.231, accuracy: 0.196
epoch 0 iter 60, loss: 2.232, accuracy: 0.196
epoch 0 iter 70, loss: 2.238, accuracy: 0.196
epoch 0 iter 80, loss: 2.229, accuracy: 0.196
epoch 0 iter 90, loss: 2.224, accuracy: 0.196
epoch 0 iter 100, loss: 2.224, accuracy: 0.196
epoch 0 iter 110, loss: 2.223, accuracy: 0.196
epoch 0 iter 120, loss: 2.215, accuracy: 0.196
epoch 0 iter 130, loss: 2.218, accuracy: 0.196
epoch 0 iter 140, loss: 2.218, accuracy: 0.196
epoch 0 iter 150, loss: 2.212, accuracy: 0.196
epoch 0 iter 160, loss: 2.213, accuracy: 0.196
epoch 0 iter 170, loss: 2.214, accuracy: 0.196
epoch 0 iter 180, loss: 2.212, accuracy: 0.197
epoch 0 iter 190, loss: 2.214, accuracy: 0.197
...

epoch 6 iter 10, loss: 0.684, accuracy: 0.816
epoch 6 iter 20, loss: 0.648, accuracy: 0.825
epoch 6 iter 30, loss: 0.697, accuracy: 0.811
epoch 6 iter 40, loss: 0.683, accuracy: 0.819
epoch 6 iter 50, loss: 0.677, accuracy: 0.814
epoch 6 iter 60, loss: 0.705, accuracy: 0.807
epoch 6 iter 70, loss: 0.687, accuracy: 0.817
epoch 6 iter 80, loss: 0.680, accuracy: 0.816
epoch 6 iter 90, loss: 0.654, accuracy: 0.821
epoch 6 iter 100, loss: 0.654, accuracy: 0.822
epoch 6 iter 110, loss: 0.657, accuracy: 0.821
epoch 6 iter 120, loss: 0.682, accuracy: 0.812
epoch 6 iter 130, loss: 0.686, accuracy: 0.816
epoch 6 iter 140, loss: 0.669, accuracy: 0.819
epoch 6 iter 150, loss: 0.708, accuracy: 0.810
epoch 6 iter 160, loss: 0.719, accuracy: 0.808
epoch 6 iter 170, loss: 0.723, accuracy: 0.808
epoch 6 iter 180, loss: 0.666, accuracy: 0.822
epoch 6 iter 190, loss: 0.669, accuracy: 0.823
epoch 6 iter 200, loss: 0.647, accuracy: 0.829
epoch 6 iter 210, loss: 0.653, accuracy: 0.824
epoch 6 iter 220, loss: ...

epoch 12 iter 20, loss: 0.789, accuracy: 0.829
epoch 12 iter 30, loss: 0.806, accuracy: 0.838
epoch 12 iter 40, loss: 0.788, accuracy: 0.839
epoch 12 iter 50, loss: 0.813, accuracy: 0.831
epoch 12 iter 60, loss: 0.788, accuracy: 0.831
epoch 12 iter 70, loss: 0.813, accuracy: 0.826
epoch 12 iter 80, loss: 0.793, accuracy: 0.828
epoch 12 iter 90, loss: 0.797, accuracy: 0.834
epoch 12 iter 100, loss: 0.902, accuracy: 0.820
epoch 12 iter 110, loss: 0.844, accuracy: 0.819
epoch 12 iter 120, loss: 0.808, accuracy: 0.828
epoch 12 iter 130, loss: 0.761, accuracy: 0.841
epoch 12 iter 140, loss: 0.851, accuracy: 0.830
epoch 12 iter 150, loss: 0.795, accuracy: 0.828
epoch 12 iter 160, loss: 0.840, accuracy: 0.816
epoch 12 iter 170, loss: 0.931, accuracy: 0.812
epoch 12 iter 180, loss: 0.869, accuracy: 0.809
epoch 12 iter 190, loss: 0.835, accuracy: 0.832
epoch 12 iter 200, loss: 0.931, accuracy: 0.823
epoch 12 iter 210, loss: 0.817, accuracy: 0.835
epoch 12 iter 220, loss: 0.947, accuracy: 0.804


epoch 18 iter 0, loss: 1.117, accuracy: 0.817
epoch 18 iter 10, loss: 1.018, accuracy: 0.835
epoch 18 iter 20, loss: 1.126, accuracy: 0.835
epoch 18 iter 30, loss: 1.134, accuracy: 0.826
epoch 18 iter 40, loss: 1.120, accuracy: 0.827
epoch 18 iter 50, loss: 1.080, accuracy: 0.831
epoch 18 iter 60, loss: 1.104, accuracy: 0.831
epoch 18 iter 70, loss: 1.128, accuracy: 0.825
epoch 18 iter 80, loss: 1.139, accuracy: 0.808
epoch 18 iter 90, loss: 1.029, accuracy: 0.827
epoch 18 iter 100, loss: 1.102, accuracy: 0.824
epoch 18 iter 110, loss: 1.214, accuracy: 0.803
epoch 18 iter 120, loss: 1.087, accuracy: 0.829
epoch 18 iter 130, loss: 1.116, accuracy: 0.839
epoch 18 iter 140, loss: 1.139, accuracy: 0.833
epoch 18 iter 150, loss: 1.086, accuracy: 0.826
epoch 18 iter 160, loss: 1.017, accuracy: 0.838
epoch 18 iter 170, loss: 1.242, accuracy: 0.805
epoch 18 iter 180, loss: 1.140, accuracy: 0.811
epoch 18 iter 190, loss: 1.170, accuracy: 0.821
epoch 18 iter 200, loss: 1.210, accuracy: 0.826
...

epoch 23 iter 280, loss: 1.496, accuracy: 0.825
epoch 24 iter 0, loss: 1.451, accuracy: 0.831
epoch 24 iter 10, loss: 1.382, accuracy: 0.832
epoch 24 iter 20, loss: 1.417, accuracy: 0.821
epoch 24 iter 30, loss: 1.357, accuracy: 0.833
epoch 24 iter 40, loss: 1.552, accuracy: 0.814
epoch 24 iter 50, loss: 1.474, accuracy: 0.831
epoch 24 iter 60, loss: 1.440, accuracy: 0.818
epoch 24 iter 70, loss: 1.636, accuracy: 0.820
epoch 24 iter 80, loss: 1.631, accuracy: 0.819
epoch 24 iter 90, loss: 1.461, accuracy: 0.813
epoch 24 iter 100, loss: 1.531, accuracy: 0.823
epoch 24 iter 110, loss: 1.634, accuracy: 0.804
epoch 24 iter 120, loss: 1.288, accuracy: 0.836
epoch 24 iter 130, loss: 1.350, accuracy: 0.834
epoch 24 iter 140, loss: 1.476, accuracy: 0.819
epoch 24 iter 150, loss: 1.366, accuracy: 0.824
epoch 24 iter 160, loss: 1.415, accuracy: 0.830
epoch 24 iter 170, loss: 1.420, accuracy: 0.830
epoch 24 iter 180, loss: 1.439, accuracy: 0.829
epoch 24 iter 190, loss: 1.394, accuracy: 0.832
...

epoch 29 iter 270, loss: 1.754, accuracy: 0.835
epoch 29 iter 280, loss: 1.827, accuracy: 0.809
epoch 30 iter 0, loss: 1.840, accuracy: 0.828
epoch 30 iter 10, loss: 1.880, accuracy: 0.832
epoch 30 iter 20, loss: 1.637, accuracy: 0.828
epoch 30 iter 30, loss: 1.789, accuracy: 0.826
epoch 30 iter 40, loss: 1.903, accuracy: 0.829
epoch 30 iter 50, loss: 1.782, accuracy: 0.829
epoch 30 iter 60, loss: 1.677, accuracy: 0.814
epoch 30 iter 70, loss: 1.700, accuracy: 0.824
epoch 30 iter 80, loss: 2.069, accuracy: 0.815
epoch 30 iter 90, loss: 1.765, accuracy: 0.833
epoch 30 iter 100, loss: 1.791, accuracy: 0.826
epoch 30 iter 110, loss: 1.755, accuracy: 0.836
epoch 30 iter 120, loss: 1.706, accuracy: 0.837
epoch 30 iter 130, loss: 1.652, accuracy: 0.840
epoch 30 iter 140, loss: 1.699, accuracy: 0.840
epoch 30 iter 150, loss: 1.834, accuracy: 0.837
epoch 30 iter 160, loss: 1.766, accuracy: 0.811
epoch 30 iter 170, loss: 1.791, accuracy: 0.822
epoch 30 iter 180, loss: 1.809, accuracy: 0.824
...

epoch 35 iter 260, loss: 2.052, accuracy: 0.842
epoch 35 iter 270, loss: 2.006, accuracy: 0.837
epoch 35 iter 280, loss: 2.021, accuracy: 0.842
epoch 36 iter 0, loss: 2.132, accuracy: 0.828
epoch 36 iter 10, loss: 2.016, accuracy: 0.842
epoch 36 iter 20, loss: 2.195, accuracy: 0.841
epoch 36 iter 30, loss: 2.047, accuracy: 0.848
epoch 36 iter 40, loss: 2.011, accuracy: 0.843
epoch 36 iter 50, loss: 2.050, accuracy: 0.846
epoch 36 iter 60, loss: 1.979, accuracy: 0.842
epoch 36 iter 70, loss: 1.999, accuracy: 0.842
epoch 36 iter 80, loss: 1.950, accuracy: 0.846
epoch 36 iter 90, loss: 2.128, accuracy: 0.824
epoch 36 iter 100, loss: 1.885, accuracy: 0.836
epoch 36 iter 110, loss: 2.180, accuracy: 0.833
epoch 36 iter 120, loss: 2.028, accuracy: 0.837
epoch 36 iter 130, loss: 1.948, accuracy: 0.835
epoch 36 iter 140, loss: 1.903, accuracy: 0.839
epoch 36 iter 150, loss: 2.044, accuracy: 0.841
epoch 36 iter 160, loss: 2.047, accuracy: 0.839
epoch 36 iter 170, loss: 2.025, accuracy: 0.837
epo

In [15]:
print('Filter size = {}, kernel_size = {}, stride = {}, FC_size = {}'.format(32, [5, 5], 2, 800))
modified_model_dict = apply_classification_loss(my_SVHN_net, [32, [5, 5], 2, 800])
train_model(modified_model_dict, dataset_generators, epoch_n=40, print_every=10)

Filter size = 32, kernel_size = [5, 5], stride = 2, FC_size = 800
epoch 0 iter 0, loss: 200.610, accuracy: 0.098
epoch 0 iter 10, loss: 2.419, accuracy: 0.100
epoch 0 iter 20, loss: 2.248, accuracy: 0.191
epoch 0 iter 30, loss: 2.248, accuracy: 0.189
epoch 0 iter 40, loss: 2.234, accuracy: 0.207
epoch 0 iter 50, loss: 2.238, accuracy: 0.202
epoch 0 iter 60, loss: 2.226, accuracy: 0.215
epoch 0 iter 70, loss: 2.223, accuracy: 0.207
epoch 0 iter 80, loss: 2.214, accuracy: 0.215
epoch 0 iter 90, loss: 2.238, accuracy: 0.212
epoch 0 iter 100, loss: 2.216, accuracy: 0.226
epoch 0 iter 110, loss: 2.206, accuracy: 0.218
epoch 0 iter 120, loss: 2.193, accuracy: 0.227
epoch 0 iter 130, loss: 2.204, accuracy: 0.231
epoch 0 iter 140, loss: 2.205, accuracy: 0.220
epoch 0 iter 150, loss: 2.181, accuracy: 0.230
epoch 0 iter 160, loss: 2.178, accuracy: 0.232
epoch 0 iter 170, loss: 2.177, accuracy: 0.235
epoch 0 iter 180, loss: 2.161, accuracy: 0.243
epoch 0 iter 190, loss: 2.167, accuracy: 0.226
epo

epoch 6 iter 10, loss: 0.641, accuracy: 0.818
epoch 6 iter 20, loss: 0.661, accuracy: 0.810
epoch 6 iter 30, loss: 0.670, accuracy: 0.814
epoch 6 iter 40, loss: 0.658, accuracy: 0.812
epoch 6 iter 50, loss: 0.671, accuracy: 0.806
epoch 6 iter 60, loss: 0.650, accuracy: 0.818
epoch 6 iter 70, loss: 0.648, accuracy: 0.817
epoch 6 iter 80, loss: 0.709, accuracy: 0.803
epoch 6 iter 90, loss: 0.659, accuracy: 0.811
epoch 6 iter 100, loss: 0.661, accuracy: 0.810
epoch 6 iter 110, loss: 0.673, accuracy: 0.807
epoch 6 iter 120, loss: 0.688, accuracy: 0.809
epoch 6 iter 130, loss: 0.675, accuracy: 0.814
epoch 6 iter 140, loss: 0.689, accuracy: 0.804
epoch 6 iter 150, loss: 0.707, accuracy: 0.806
epoch 6 iter 160, loss: 0.657, accuracy: 0.817
epoch 6 iter 170, loss: 0.676, accuracy: 0.816
epoch 6 iter 180, loss: 0.665, accuracy: 0.814
epoch 6 iter 190, loss: 0.666, accuracy: 0.815
epoch 6 iter 200, loss: 0.735, accuracy: 0.804
epoch 6 iter 210, loss: 0.661, accuracy: 0.814
epoch 6 iter 220, loss

epoch 12 iter 20, loss: 0.746, accuracy: 0.828
epoch 12 iter 30, loss: 0.804, accuracy: 0.819
epoch 12 iter 40, loss: 0.764, accuracy: 0.824
epoch 12 iter 50, loss: 0.732, accuracy: 0.823
epoch 12 iter 60, loss: 0.827, accuracy: 0.803
epoch 12 iter 70, loss: 0.750, accuracy: 0.820
epoch 12 iter 80, loss: 0.832, accuracy: 0.814
epoch 12 iter 90, loss: 0.838, accuracy: 0.815
epoch 12 iter 100, loss: 0.775, accuracy: 0.817
epoch 12 iter 110, loss: 0.804, accuracy: 0.820
epoch 12 iter 120, loss: 0.951, accuracy: 0.803
epoch 12 iter 130, loss: 0.836, accuracy: 0.816
epoch 12 iter 140, loss: 0.826, accuracy: 0.810
epoch 12 iter 150, loss: 0.881, accuracy: 0.812
epoch 12 iter 160, loss: 0.827, accuracy: 0.818
epoch 12 iter 170, loss: 0.823, accuracy: 0.817
epoch 12 iter 180, loss: 0.773, accuracy: 0.826
epoch 12 iter 190, loss: 0.814, accuracy: 0.819
epoch 12 iter 200, loss: 0.905, accuracy: 0.813
epoch 12 iter 210, loss: 0.753, accuracy: 0.829
epoch 12 iter 220, loss: 0.809, accuracy: 0.826


epoch 18 iter 0, loss: 1.006, accuracy: 0.803
epoch 18 iter 10, loss: 1.104, accuracy: 0.819
epoch 18 iter 20, loss: 1.039, accuracy: 0.820
epoch 18 iter 30, loss: 0.957, accuracy: 0.812
epoch 18 iter 40, loss: 1.037, accuracy: 0.820
epoch 18 iter 50, loss: 0.974, accuracy: 0.837
epoch 18 iter 60, loss: 0.988, accuracy: 0.829
epoch 18 iter 70, loss: 1.019, accuracy: 0.822
epoch 18 iter 80, loss: 1.099, accuracy: 0.816
epoch 18 iter 90, loss: 1.038, accuracy: 0.797
epoch 18 iter 100, loss: 1.045, accuracy: 0.818
epoch 18 iter 110, loss: 1.078, accuracy: 0.826
epoch 18 iter 120, loss: 1.279, accuracy: 0.807
epoch 18 iter 130, loss: 1.075, accuracy: 0.818
epoch 18 iter 140, loss: 1.174, accuracy: 0.817
epoch 18 iter 150, loss: 0.992, accuracy: 0.835
epoch 18 iter 160, loss: 1.014, accuracy: 0.833
epoch 18 iter 170, loss: 1.031, accuracy: 0.830
epoch 18 iter 180, loss: 0.969, accuracy: 0.830
epoch 18 iter 190, loss: 1.117, accuracy: 0.829
epoch 18 iter 200, loss: 1.001, accuracy: 0.822
epo

epoch 23 iter 280, loss: 1.659, accuracy: 0.830
epoch 24 iter 0, loss: 1.410, accuracy: 0.832
epoch 24 iter 10, loss: 1.232, accuracy: 0.842
epoch 24 iter 20, loss: 1.296, accuracy: 0.842
epoch 24 iter 30, loss: 1.221, accuracy: 0.839
epoch 24 iter 40, loss: 1.300, accuracy: 0.836
epoch 24 iter 50, loss: 1.365, accuracy: 0.829
epoch 24 iter 60, loss: 1.360, accuracy: 0.826
epoch 24 iter 70, loss: 1.382, accuracy: 0.802
epoch 24 iter 80, loss: 1.551, accuracy: 0.782
epoch 24 iter 90, loss: 1.474, accuracy: 0.802
epoch 24 iter 100, loss: 1.409, accuracy: 0.799
epoch 24 iter 110, loss: 1.385, accuracy: 0.817
epoch 24 iter 120, loss: 1.425, accuracy: 0.819
epoch 24 iter 130, loss: 1.476, accuracy: 0.822
epoch 24 iter 140, loss: 1.574, accuracy: 0.810
epoch 24 iter 150, loss: 1.561, accuracy: 0.816
epoch 24 iter 160, loss: 1.310, accuracy: 0.829
epoch 24 iter 170, loss: 1.340, accuracy: 0.839
epoch 24 iter 180, loss: 1.415, accuracy: 0.832
epoch 24 iter 190, loss: 1.280, accuracy: 0.835
epo

epoch 29 iter 270, loss: 2.221, accuracy: 0.832
epoch 29 iter 280, loss: 1.880, accuracy: 0.842
epoch 30 iter 0, loss: 1.871, accuracy: 0.834
epoch 30 iter 10, loss: 1.756, accuracy: 0.835
epoch 30 iter 20, loss: 1.620, accuracy: 0.842
epoch 30 iter 30, loss: 1.665, accuracy: 0.842
epoch 30 iter 40, loss: 1.737, accuracy: 0.827
epoch 30 iter 50, loss: 1.744, accuracy: 0.840
epoch 30 iter 60, loss: 1.523, accuracy: 0.847
epoch 30 iter 70, loss: 1.497, accuracy: 0.849
epoch 30 iter 80, loss: 1.616, accuracy: 0.847
epoch 30 iter 90, loss: 1.537, accuracy: 0.844
epoch 30 iter 100, loss: 1.587, accuracy: 0.832
epoch 30 iter 110, loss: 1.765, accuracy: 0.830
epoch 30 iter 120, loss: 1.649, accuracy: 0.835
epoch 30 iter 130, loss: 1.824, accuracy: 0.831
epoch 30 iter 140, loss: 1.663, accuracy: 0.841
epoch 30 iter 150, loss: 1.638, accuracy: 0.850
epoch 30 iter 160, loss: 1.777, accuracy: 0.844
epoch 30 iter 170, loss: 1.580, accuracy: 0.847
epoch 30 iter 180, loss: 1.818, accuracy: 0.851
epo

epoch 35 iter 260, loss: 2.258, accuracy: 0.831
epoch 35 iter 270, loss: 2.658, accuracy: 0.841
epoch 35 iter 280, loss: 2.915, accuracy: 0.796
epoch 36 iter 0, loss: 2.379, accuracy: 0.802
epoch 36 iter 10, loss: 2.326, accuracy: 0.830
epoch 36 iter 20, loss: 2.007, accuracy: 0.837
epoch 36 iter 30, loss: 2.380, accuracy: 0.836
epoch 36 iter 40, loss: 2.096, accuracy: 0.844
epoch 36 iter 50, loss: 2.051, accuracy: 0.846
epoch 36 iter 60, loss: 2.172, accuracy: 0.826
epoch 36 iter 70, loss: 2.052, accuracy: 0.841
epoch 36 iter 80, loss: 1.859, accuracy: 0.839
epoch 36 iter 90, loss: 2.273, accuracy: 0.828
epoch 36 iter 100, loss: 1.859, accuracy: 0.833
epoch 36 iter 110, loss: 2.117, accuracy: 0.843
epoch 36 iter 120, loss: 2.321, accuracy: 0.836
epoch 36 iter 130, loss: 2.282, accuracy: 0.839
epoch 36 iter 140, loss: 2.085, accuracy: 0.845
epoch 36 iter 150, loss: 2.307, accuracy: 0.838
epoch 36 iter 160, loss: 2.141, accuracy: 0.848
epoch 36 iter 170, loss: 1.980, accuracy: 0.850
epo

**[Double click here to add your answer]**

## Part 2: Saving and Reloading Model Weights
(25 points)

In this section you will learn to save the weights of a trained model and to load the weights of a saved model. This is useful when we want to load an already trained model in order to continue training or to fine-tune it. We often save “snapshots” of the model as training progresses, in case training is interrupted or we want to fall back to an earlier model; this is called snapshot saving.


### Q2.1 Defining another network
(10 points)

Define a network with a slightly different structure in `def cnn_expanded(x_)` below. `cnn_expanded` is an expanded version of `cnn_model`. 
It should keep the convolutional and pooling layers of `cnn_model`, followed by: 
- one additional convolutional layer, and 
- one additional pooling layer.

The last fully-connected layer will stay the same.

In [7]:
# Define the new model 
def cnn_expanded(x_):
    conv1 = tf.layers.conv2d(
            inputs=x_,
            filters=32,  # number of filters
            kernel_size=[5, 5],
            padding="same",
            activation=tf.nn.relu)
    
    pool1 = tf.layers.max_pooling2d(inputs=conv1, 
                                    pool_size=[2, 2], 
                                    strides=2)  # pooling stride
    
    conv2 = tf.layers.conv2d(
            inputs=pool1,
            filters=32, # number of filters
            kernel_size=[5, 5],
            padding="same",
            activation=tf.nn.relu)
    
    pool2 = tf.layers.max_pooling2d(inputs=conv2, 
                                    pool_size=[2, 2], 
                                    strides=2)  # pooling stride
    
    conv3 = tf.layers.conv2d(
            inputs=pool2,
            filters=32, # number of filters
            kernel_size=[5, 5],
            padding="same",
            activation=tf.nn.relu)
    
    pool3 = tf.layers.max_pooling2d(inputs=conv3, 
                                    pool_size=[2, 2], 
                                    strides=2)  # pooling stride
        
    pool_flat = tf.contrib.layers.flatten(pool3, scope='pool2flat')
    dense = tf.layers.dense(inputs=pool_flat, units=500, activation=tf.nn.relu)
    logits = tf.layers.dense(inputs=dense, units=10)
    return logits
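
The feature-map sizes in `cnn_expanded` can be traced by hand, which also gives the input width of the first dense layer. The sketch below assumes the 32x32x3 input used by `apply_classification_loss`, `"same"` padding for the convolutions, and the default `"valid"` padding of `tf.layers.max_pooling2d`. `conv_same` and `pool_valid` are hypothetical helpers, not TensorFlow functions.

```python
def conv_same(size, stride=1):
    """Output size of a 'same'-padded convolution: ceil(size / stride)."""
    return -(-size // stride)

def pool_valid(size, pool=2, stride=2):
    """Output size of a 'valid' pooling layer."""
    return (size - pool) // stride + 1

size, filters = 32, 32
trace = []
for layer in range(3):           # conv1/pool1, conv2/pool2, conv3/pool3
    size = conv_same(size)       # a 5x5 'same' conv keeps the spatial size
    size = pool_valid(size)      # a 2x2 pool with stride 2 halves it
    trace.append(size)

flat_units = size * size * filters   # width of the flattened pool3 output
print(trace, flat_units)
```

Under these assumptions the spatial size shrinks 32 → 16 → 8 → 4, so the flatten layer feeds 4 * 4 * 32 values into the 500-unit dense layer.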


### Q2.2 Saving and Loading Weights
(15 points)

`new_train_model()` below takes two more parameters than the `train_model` defined previously: `save_model=False` and `load_model=False`. Modify `new_train_model()` so that it will 
- save weights after the training is complete if `save_model` is `True`, and
- load weights on start-up before training if `load_model` is `True`.

*Hint:* take a look at the docs for `tf.train.Saver()` here: https://www.tensorflow.org/api_docs/python/tf/train/Saver#__init__. You will probably need to specify its first argument, `var_list`, with the variables of `cnn_expanded` to answer this question.

**Note:** you're welcome to decide how many training epochs to use, if that gets you the same results but faster.
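
The save/restore behaviour asked for here can be pictured without TensorFlow. The sketch below is a toy stand-in, not `tf.train.Saver`: weights live in a plain dictionary, `save_snapshot` writes the selected names to a JSON file, and `restore_snapshot` overwrites only the names found in that file. The variable names and both helper functions are made up for illustration.

```python
import json
import os
import tempfile

def save_snapshot(weights, path, var_list=None):
    """Write the selected weights (all of them if var_list is None) to path."""
    names = list(weights) if var_list is None else var_list
    with open(path, "w") as f:
        json.dump({name: weights[name] for name in names}, f)

def restore_snapshot(weights, path):
    """Overwrite entries of weights with the values stored in the snapshot."""
    with open(path) as f:
        weights.update(json.load(f))

trained = {"conv1/kernel": [0.5, -0.2], "dense/kernel": [1.5]}
snapshot_path = os.path.join(tempfile.mkdtemp(), "weight_saver.json")
save_snapshot(trained, snapshot_path, var_list=["conv1/kernel"])

fresh = {"conv1/kernel": [0.0, 0.0], "dense/kernel": [0.0]}  # freshly initialized model
restore_snapshot(fresh, snapshot_path)
print(fresh)
```

In this toy version, any weight left out of the snapshot keeps its fresh initial value after restoring, which is the effect a restricted `var_list` is meant to have.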

In [6]:
#### Modify this:
def new_train_model(model_dict, dataset_generators, epoch_n, print_every,
                    save_model=False, load_model=False):
    with model_dict['graph'].as_default(), tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver = tf.train.Saver()
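        if load_model:
            # Restore weights from the checkpoint written by a previous run;
            # this overwrites the freshly initialized values from above.
            saver.restore(sess, 'checkpoints/weight_saver.ckpt')
            print('Model already loaded')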
            
        for epoch_i in range(epoch_n):
            for iter_i, data_batch in enumerate(dataset_generators['train']):
                train_feed_dict = dict(zip(model_dict['inputs'], data_batch))
                sess.run(model_dict['train_op'], feed_dict=train_feed_dict)
                
                if iter_i % print_every == 0:
                    collect_arr = []
                    for test_batch in dataset_generators['test']:
                        test_feed_dict = dict(zip(model_dict['inputs'], test_batch))
                        to_compute = [model_dict['loss'], model_dict['accuracy']]
                        collect_arr.append(sess.run(to_compute, test_feed_dict))
                    averages = np.mean(collect_arr, axis=0)
                    fmt = (epoch_i, iter_i, ) + tuple(averages)
                    print('iteration {:d} {:d}\t loss: {:.3f}, '
                          'accuracy: {:.3f}'.format(*fmt))
                    
        if save_model:
            save_path = saver.save(sess, 'checkpoints/weight_saver.ckpt')
            print("Model saved in path: %s" % save_path)
            
            
cnn_expanded_dict = apply_classification_loss(cnn_expanded)

In [9]:
### Hint: call the saver like this: tf.train.Saver(var_list)
### where var_list is a list of TF variables you want to save
new_train_model(cnn_expanded_dict, dataset_generators, epoch_n=40, print_every=100, save_model=True)

iteration 0 0	 loss: 30.848, accuracy: 0.159
iteration 0 100	 loss: 2.165, accuracy: 0.235
iteration 0 200	 loss: 1.404, accuracy: 0.563
iteration 1 0	 loss: 1.064, accuracy: 0.677
iteration 1 100	 loss: 0.918, accuracy: 0.723
iteration 1 200	 loss: 0.807, accuracy: 0.762
iteration 2 0	 loss: 0.726, accuracy: 0.789
iteration 2 100	 loss: 0.662, accuracy: 0.810
iteration 2 200	 loss: 0.694, accuracy: 0.797
iteration 3 0	 loss: 0.648, accuracy: 0.816
iteration 3 100	 loss: 0.579, accuracy: 0.833
iteration 3 200	 loss: 0.618, accuracy: 0.823
iteration 4 0	 loss: 0.582, accuracy: 0.837
iteration 4 100	 loss: 0.553, accuracy: 0.842
iteration 4 200	 loss: 0.581, accuracy: 0.835
iteration 5 0	 loss: 0.547, accuracy: 0.849
iteration 5 100	 loss: 0.553, accuracy: 0.845
iteration 5 200	 loss: 0.604, accuracy: 0.829
iteration 6 0	 loss: 0.606, accuracy: 0.834
iteration 6 100	 loss: 0.534, accuracy: 0.856
iteration 6 200	 loss: 0.572, accuracy: 0.842
iteration 7 0	 loss: 0.598, accuracy: 0.837
ite

In [10]:
### Hint: call the saver like this: tf.train.Saver(var_list)
### where var_list is a list of TF variables you want to load from the checkpoint 
new_train_model(cnn_expanded_dict, dataset_generators, epoch_n=40, print_every=100, load_model=True)

INFO:tensorflow:Restoring parameters from checkpoints/weight_saver.ckpt
Model already loaded
iteration 0 0	 loss: 1.610, accuracy: 0.843
iteration 0 100	 loss: 1.469, accuracy: 0.862
iteration 0 200	 loss: 1.588, accuracy: 0.865
iteration 1 0	 loss: 1.519, accuracy: 0.850
iteration 1 100	 loss: 1.473, accuracy: 0.859
iteration 1 200	 loss: 1.528, accuracy: 0.855
iteration 2 0	 loss: 1.546, accuracy: 0.851
iteration 2 100	 loss: 1.637, accuracy: 0.857
iteration 2 200	 loss: 1.572, accuracy: 0.856
iteration 3 0	 loss: 1.552, accuracy: 0.852
iteration 3 100	 loss: 1.577, accuracy: 0.858
iteration 3 200	 loss: 1.673, accuracy: 0.862
iteration 4 0	 loss: 1.507, accuracy: 0.862
iteration 4 100	 loss: 1.555, accuracy: 0.861
iteration 4 200	 loss: 1.695, accuracy: 0.858
iteration 5 0	 loss: 1.693, accuracy: 0.851
iteration 5 100	 loss: 1.639, accuracy: 0.859
iteration 5 200	 loss: 1.786, accuracy: 0.853
iteration 6 0	 loss: 1.706, accuracy: 0.857
iteration 6 100	 loss: 1.649, accuracy: 0.863
i

## Part 3: Fine-tuning a Pre-trained Network on CIFAR-10
(20 points)

[CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) is another popular benchmark for image classification.
We provide you with a modified version of the file cifar10.py from [https://github.com/Hvass-Labs/TensorFlow-Tutorials](https://github.com/Hvass-Labs/TensorFlow-Tutorials).


In [14]:
import read_cifar10 as cf10

We also provide a generator for the CIFAR-10 dataset, which yields the next batch every time `next` is invoked.

In [15]:
@read_data.restartable
def cifar10_dataset_generator(dataset_name, batch_size, restrict_size=1000):
    assert dataset_name in ['train', 'test']
    assert batch_size > 0 or batch_size == -1  # -1 for entire dataset
    
    X_all_unrestricted, y_all = (cf10.load_training_data() if dataset_name == 'train'
                                 else cf10.load_test_data())
    
    actual_restrict_size = restrict_size if dataset_name == 'train' else int(1e10)
    X_all = X_all_unrestricted[:actual_restrict_size]
    y_all = y_all[:actual_restrict_size]  # keep labels aligned with the restricted images
    data_len = X_all.shape[0]
    batch_size = batch_size if batch_size > 0 else data_len
    
    X_all_padded = np.concatenate([X_all, X_all[:batch_size]], axis=0)
    y_all_padded = np.concatenate([y_all, y_all[:batch_size]], axis=0)
    
    for slice_i in range(math.ceil(data_len / batch_size)):
        idx = slice_i * batch_size
        #X_batch = X_all_padded[idx:idx + batch_size]
        X_batch = X_all_padded[idx:idx + batch_size]*255  
        y_batch = np.ravel(y_all_padded[idx:idx + batch_size])
        yield X_batch.astype(np.uint8), y_batch.astype(np.uint8)

cifar10_dataset_generators = {
    'train': cifar10_dataset_generator('train', 1000),
    'test': cifar10_dataset_generator('test', -1)
}
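
The generator pads the data with its first `batch_size` examples so that every batch has the same size, including the last one. A minimal pure-Python sketch of that slicing logic, using a plain list in place of the NumPy arrays; `padded_batches` is a hypothetical name:

```python
import math

def padded_batches(items, batch_size):
    """Yield ceil(len / batch_size) full batches; the last one wraps to the start."""
    data_len = len(items)
    padded = items + items[:batch_size]   # same idea as the np.concatenate padding
    for slice_i in range(math.ceil(data_len / batch_size)):
        idx = slice_i * batch_size
        yield padded[idx:idx + batch_size]

batches = list(padded_batches([0, 1, 2, 3, 4], batch_size=2))
print(batches)
```

With five items and a batch size of two, the third batch is completed with the first item again, so a few examples can be seen twice per pass.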


### Q3.1 Fine-tuning
Let's fine-tune the SVHN net on **1000 examples** from CIFAR-10. 
Compare the test accuracies of the following scenarios: 
  - Training from scratch on the 1000 CIFAR-10 examples
  - Fine-tuning a pretrained SVHN net (trained on the SVHN dataset) on the 1000 examples from CIFAR-10. Use `new_train_model()` defined above to load the SVHN net weights, but train on the CIFAR-10 examples.
  
**Note:** you're welcome to decide how many training epochs to use, if that gets you the same results but faster.

**Important:** please do not change the `restrict_size=1000` parameter.

In [18]:
def apply_classification_loss(model_function):
    with tf.Graph().as_default() as g:
        with tf.device("/gpu:0"):  # use gpu:0 if on GPU
            x_ = tf.placeholder(tf.float32, [None, 32, 32, 3])
            y_ = tf.placeholder(tf.int32, [None])
            y_logits = model_function(x_)
            
            y_dict = dict(labels=y_, logits=y_logits)
            losses = tf.nn.sparse_softmax_cross_entropy_with_logits(**y_dict)
            cross_entropy_loss = tf.reduce_mean(losses)
            trainer = tf.train.AdamOptimizer(learning_rate=0.001)
            train_op = trainer.minimize(cross_entropy_loss)
            
            y_pred = tf.argmax(tf.nn.softmax(y_logits), axis=1)
            correct_prediction = tf.equal(tf.cast(y_pred, tf.int32), y_)
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    
    model_dict = {'graph': g, 'inputs': [x_, y_], 'train_op': train_op,
                  'accuracy': accuracy, 'loss': cross_entropy_loss}
    
    return model_dict

In [20]:
cnn_expanded_dict = apply_classification_loss(cnn_expanded)

## train a model from scratch
new_train_model(cnn_expanded_dict, cifar10_dataset_generators, epoch_n=100, 
                print_every=100)

iteration 0 0	 loss: 21.777, accuracy: 0.102
iteration 1 0	 loss: 16.845, accuracy: 0.099
iteration 2 0	 loss: 13.750, accuracy: 0.100
iteration 3 0	 loss: 11.064, accuracy: 0.097
iteration 4 0	 loss: 7.908, accuracy: 0.124
iteration 5 0	 loss: 5.642, accuracy: 0.146
iteration 6 0	 loss: 4.163, accuracy: 0.141
iteration 7 0	 loss: 3.524, accuracy: 0.104
iteration 8 0	 loss: 3.021, accuracy: 0.109
iteration 9 0	 loss: 2.719, accuracy: 0.120
iteration 10 0	 loss: 2.553, accuracy: 0.119
iteration 11 0	 loss: 2.452, accuracy: 0.120
iteration 12 0	 loss: 2.396, accuracy: 0.119
iteration 13 0	 loss: 2.366, accuracy: 0.123
iteration 14 0	 loss: 2.345, accuracy: 0.135
iteration 15 0	 loss: 2.329, accuracy: 0.132
iteration 16 0	 loss: 2.313, accuracy: 0.131
iteration 17 0	 loss: 2.299, accuracy: 0.133
iteration 18 0	 loss: 2.286, accuracy: 0.137
iteration 19 0	 loss: 2.274, accuracy: 0.145
iteration 20 0	 loss: 2.264, accuracy: 0.150
iteration 21 0	 loss: 2.254, accuracy: 0.157
iteration 22 0	 

In [21]:
## fine-tune the SVHN net on CIFAR-10, starting from the SVHN weights saved in Q2
new_train_model(cnn_expanded_dict, cifar10_dataset_generators, epoch_n=100, 
                print_every=100, load_model=True)

INFO:tensorflow:Restoring parameters from checkpoints/weight_saver.ckpt
Model already loaded
iteration 0 0	 loss: 16.713, accuracy: 0.103
iteration 1 0	 loss: 9.717, accuracy: 0.117
iteration 2 0	 loss: 5.365, accuracy: 0.139
iteration 3 0	 loss: 3.364, accuracy: 0.141
iteration 4 0	 loss: 2.766, accuracy: 0.131
iteration 5 0	 loss: 2.532, accuracy: 0.121
iteration 6 0	 loss: 2.409, accuracy: 0.110
iteration 7 0	 loss: 2.348, accuracy: 0.108
iteration 8 0	 loss: 2.323, accuracy: 0.104
iteration 9 0	 loss: 2.313, accuracy: 0.102
iteration 10 0	 loss: 2.310, accuracy: 0.103
iteration 11 0	 loss: 2.311, accuracy: 0.103
iteration 12 0	 loss: 2.312, accuracy: 0.103
iteration 13 0	 loss: 2.316, accuracy: 0.103
iteration 14 0	 loss: 2.335, accuracy: 0.101
iteration 15 0	 loss: 2.312, accuracy: 0.101
iteration 16 0	 loss: 2.313, accuracy: 0.102
iteration 17 0	 loss: 2.311, accuracy: 0.098
iteration 18 0	 loss: 2.314, accuracy: 0.097
iteration 19 0	 loss: 2.327, accuracy: 0.095
iteration 20 0	 

## Analysis of the result

Training from scratch on the 1000 CIFAR-10 examples gives low accuracy, but loading the SVHN net weights and then training on the CIFAR-10 examples does even worse. A likely reason is that SVHN and CIFAR-10 are quite different image datasets: the weights learned on SVHN digits do not suit the CIFAR-10 object classes, so they are a worse starting point than freshly initialized weights.

| Train from scratch | Fine-tuning using SVHN net weights |
|----------------|----------------|
| 0.350 | 0.166 |
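
One way to read the logs above is to compare them with chance level. As a worked check (not part of the assignment code): a classifier that assigns equal probability to all 10 classes has a cross-entropy of ln 10 and an expected accuracy of 0.1.

```python
import math

num_classes = 10
chance_loss = -math.log(1.0 / num_classes)   # cross-entropy of a uniform prediction
chance_accuracy = 1.0 / num_classes
print(round(chance_loss, 3), chance_accuracy)
```

In the epochs that are printed, the fine-tuning run sits near these values (loss around 2.31, accuracy around 0.10), which suggests that it starts out predicting close to uniformly.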

## Part 4: TensorBoard Visualization
(30 points)

[TensorBoard](https://www.tensorflow.org/get_started/summaries_and_tensorboard) is a very helpful tool for visualization of neural networks. 

Present at least one visualization for each of the following:
  - Filters
  - Loss
  - Accuracy
  - Feature map  

Modify the code you have written above to also use summary writers. To run TensorBoard, the command is `tensorboard --logdir=path/to/your/log/directory`.

Please note that running TensorBoard on SCC may be difficult, so you may want to run it locally.

In [16]:
def apply_classification_loss(model_function):
    with tf.Graph().as_default() as g:
        with tf.device("/gpu:0"):  # use gpu:0 if on GPU
            x_ = tf.placeholder(tf.float32, [None, 32, 32, 3])
            y_ = tf.placeholder(tf.int32, [None])
            y_logits = model_function(x_)
            
            y_dict = dict(labels=y_, logits=y_logits)
            losses = tf.nn.sparse_softmax_cross_entropy_with_logits(**y_dict)
            cross_entropy_loss = tf.reduce_mean(losses)
            trainer = tf.train.AdamOptimizer(learning_rate=0.001)
            train_op = trainer.minimize(cross_entropy_loss)
            
            y_pred = tf.argmax(tf.nn.softmax(y_logits), axis=1)
            correct_prediction = tf.equal(tf.cast(y_pred, tf.int32), y_)
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    
    model_dict = {'graph': g, 'inputs': [x_, y_], 'train_op': train_op,
                  'accuracy': accuracy, 'loss': cross_entropy_loss}
    
    return model_dict

def cnn_expanded(x_):
    conv1 = tf.layers.conv2d(
            inputs=x_,
            filters=32,  # number of filters
            kernel_size=[5, 5],
            padding="same",
            activation=tf.nn.relu,
            name = 'conv1')
    
    pool1 = tf.layers.max_pooling2d(inputs=conv1, 
                                    pool_size=[2, 2], 
                                    strides=2)  # pooling stride
    
    conv2 = tf.layers.conv2d(
            inputs=pool1,
            filters=32, # number of filters
            kernel_size=[5, 5],
            padding="same",
            activation=tf.nn.relu,
            name = 'conv2')
    
    pool2 = tf.layers.max_pooling2d(inputs=conv2, 
                                    pool_size=[2, 2], 
                                    strides=2)  # pooling stride
    
    conv3 = tf.layers.conv2d(
            inputs=pool2,
            filters=32, # number of filters
            kernel_size=[5, 5],
            padding="same",
            activation=tf.nn.relu,
            name = 'conv3')
    
    pool3 = tf.layers.max_pooling2d(inputs=conv3, 
                                    pool_size=[2, 2], 
                                    strides=2)  # pooling stride
        
    pool_flat = tf.contrib.layers.flatten(pool3, scope='pool2flat')
    dense = tf.layers.dense(inputs=pool_flat, units=500, activation=tf.nn.relu)
    logits = tf.layers.dense(inputs=dense, units=10)
    return logits

In [17]:
model_dict = apply_classification_loss(cnn_expanded)
def visualize(model_dict, dataset_generators, epoch_n, print_every=287):
    
    with model_dict['graph'].as_default(), tf.Session() as sess:
        
        tf.summary.scalar('accuracy', model_dict['accuracy'])
        tf.summary.scalar('loss', model_dict['loss'])

        writer = tf.summary.FileWriter('./logs', sess.graph)
  
        kernel = tf.get_default_graph().get_tensor_by_name('conv1/kernel:0')
        kernel_transposed = tf.transpose(kernel, [3, 0, 1, 2])
        tf.summary.image('filters/conv1', kernel_transposed, max_outputs=32)
  
        features = tf.get_default_graph().get_tensor_by_name('conv1/Relu:0')
        features, _ = tf.split(features, [1, -1], 0)
        features_transposed = tf.transpose(features, [3, 1, 2, 0])
        tf.summary.image('features/conv1', features_transposed, max_outputs=32)

        merged = tf.summary.merge_all()

        train_writer = tf.summary.FileWriter('./graph' + '/train',
                                      sess.graph)
        test_writer = tf.summary.FileWriter('./graph' + '/test')
        
        sess.run(tf.global_variables_initializer())
        global_step = 0  # counts training iterations; used as the TensorBoard x-axis
        for epoch_i in range(epoch_n):
            for iter_i, data_batch in enumerate(dataset_generators['train']):
                train_feed_dict = dict(zip(model_dict['inputs'], data_batch))
                sess.run([model_dict['train_op'], ], feed_dict=train_feed_dict)
                
                # Log one training summary per iteration, each at its own step.
                summary_train = sess.run(merged, feed_dict=train_feed_dict)
                train_writer.add_summary(summary_train, global_step)
                global_step += 1
                
                if iter_i % print_every == print_every-1:
                    collect_arr = []
                    for test_batch in dataset_generators['test']:
                        test_feed_dict = dict(zip(model_dict['inputs'], test_batch))
                        to_compute = [model_dict['loss'], model_dict['accuracy']]
                        collect_arr.append(sess.run(to_compute, test_feed_dict)) 
                        
                        # Test summaries share the x-axis of the training curve.
                        summary_test = sess.run(merged, feed_dict=test_feed_dict)
                        test_writer.add_summary(summary_test, global_step)
                        
                    averages = np.mean(collect_arr, axis=0)
                    fmt = (epoch_i+1, print_every) + tuple(averages)
                    print('epoch {:d}, iter: {:d}, loss: {:.3f}, '
                          'accuracy: {:.3f}'.format(*fmt))
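
`tf.summary.image` expects a 4-D tensor laid out as `[batch, height, width, channels]` with 1, 3, or 4 channels, which is why both the kernel and the feature map are transposed before logging. The shape bookkeeping can be checked with a small helper. `transposed_shape` is a hypothetical name, and the shapes assume the 5x5, 32-filter `conv1` applied to a 32x32x3 input as defined above.

```python
def transposed_shape(shape, perm):
    """Shape after tf.transpose(x, perm): output axis i is input axis perm[i]."""
    return tuple(shape[axis] for axis in perm)

kernel_shape = (5, 5, 3, 32)        # conv1 kernel: [height, width, in, out]
feature_shape = (1, 32, 32, 32)     # conv1 output for one example: [batch, h, w, ch]

filters_as_images = transposed_shape(kernel_shape, [3, 0, 1, 2])
features_as_images = transposed_shape(feature_shape, [3, 1, 2, 0])
print(filters_as_images, features_as_images)
```

Each of the 32 filters becomes one 5x5 RGB image, and each of the 32 feature channels becomes one 32x32 grayscale image.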

In [18]:
visualize(model_dict, dataset_generators, epoch_n=10)

epoch 1, iter: 287, loss: 0.768, accuracy: 0.773
epoch 2, iter: 287, loss: 0.572, accuracy: 0.836
epoch 3, iter: 287, loss: 0.529, accuracy: 0.850
epoch 4, iter: 287, loss: 0.522, accuracy: 0.858
epoch 5, iter: 287, loss: 0.525, accuracy: 0.858
epoch 6, iter: 287, loss: 0.538, accuracy: 0.864
epoch 7, iter: 287, loss: 0.593, accuracy: 0.853
epoch 8, iter: 287, loss: 0.629, accuracy: 0.852
epoch 9, iter: 287, loss: 0.590, accuracy: 0.860
epoch 10, iter: 287, loss: 0.618, accuracy: 0.872
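
The second argument of `add_summary` is the step that TensorBoard uses as the x-axis of the scalar plots. If a curve with one point per training iteration is wanted, a step index can be derived from the epoch and iteration counters. A minimal sketch, assuming 287 iterations per epoch (the `print_every` default in `visualize`, which prints once per epoch in the log above); `summary_step` is a hypothetical helper:

```python
def summary_step(epoch_i, iter_i, iters_per_epoch):
    """Global training-iteration index for a given epoch and iteration."""
    return epoch_i * iters_per_epoch + iter_i

iters_per_epoch = 287
# First and last iteration of the first two epochs.
steps = [summary_step(e, i, iters_per_epoch) for e in range(2) for i in (0, 286)]
print(steps)
```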


## Part 5: Bonus
(20 points)

### Q5.1 SVHN Net ++
Improve the accuracy of SVHN Net beyond that of the provided demo: SVHN Net ++. Report your result and explain why it is improved. (The best result will get the most bonus points!)

## My implementation of a simplified VGG16

To improve the prediction accuracy, I implemented a simplified version of VGG16, the network that took second place in ILSVRC 2014. With this CNN I reach an accuracy above 0.921 at a reasonable time cost. It approaches its peak accuracy quickly: by about epoch 5 it is already close to 0.92.

Considering the size of the dataset, I chose not to apply dropout to the network. This slightly improves the accuracy, and the time cost is acceptable. For that reason I call my implementation a simplified VGG16.

The network consists of five blocks. Each block has two or three convolutional layers and ends with a max pooling layer. After the fifth pooling layer, a flatten layer prepares the features for the fully-connected layers. Three fully-connected layers then produce the final prediction.

## Reason for the improved accuracy

The kernel size is the classic [3, 3], and the pooling size is the classic [2, 2]. Each block is more expressive than a single convolutional layer with one large kernel. For instance, three stacked 3x3 convolutional layers cover the same 7x7 receptive field as one 7x7 convolutional layer, but with more nonlinearities in between and fewer weights, so they extract features better. This also keeps the computation from taking a large amount of time.
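
The comparison between three stacked 3x3 layers and one 7x7 layer can be checked with a little arithmetic. The sketch below assumes stride-1 convolutions with the same number of input and output channels and ignores biases; the helper names are hypothetical.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field of num_layers stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel - 1)

def conv_weights(kernel, channels):
    """Weight count of one conv layer with equal in/out channels (no bias)."""
    return kernel * kernel * channels * channels

channels = 256
rf = stacked_receptive_field(3)             # three 3x3 layers
stacked = 3 * conv_weights(3, channels)     # 27 * C^2 weights
single = conv_weights(7, channels)          # 49 * C^2 weights
print(rf, stacked, single)
```

Under these assumptions the stack sees a 7x7 window while using 27/49 of the weights of a single 7x7 layer.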

There is another reason the accuracy improves: the greater depth of the neural network lets it extract and summarize the features in the pictures better.

Meanwhile, if we are not satisfied with the speed, we could even discard one or two fully-connected layers, which should not hurt our result.
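
To see what the fully-connected layers cost, their parameters can be counted directly. The sketch below assumes the 32x32 input and the layer widths used in `SVHN_plusplus` (4096, 4096, 1000), with 512 channels coming out of the fifth pooling layer; `dense_params` is a hypothetical helper.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully-connected layer."""
    return n_in * n_out + n_out

size = 32
for _ in range(5):              # five 2x2, stride-2 max-pooling layers
    size //= 2
flat_units = size * size * 512  # pool5 has 512 channels

layers = [(flat_units, 4096), (4096, 4096), (4096, 1000)]
counts = [dense_params(n_in, n_out) for n_in, n_out in layers]
print(flat_units, counts, sum(counts))
```

By this count the middle 4096-to-4096 layer alone holds about 16.8 million of the roughly 23 million fully-connected parameters, so it is the natural candidate to drop when speed matters.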

In [4]:
# Simplified VGG16
import tensorflow as tf
def SVHN_plusplus(x_):
    conv1_1 = tf.layers.conv2d(
            inputs=x_,
            filters=64,  # number of filters
            kernel_size=[3, 3],
            padding="same",
            activation=tf.nn.relu)

    conv1_2 = tf.layers.conv2d(
            inputs=conv1_1,
            filters=64,  # number of filters
            kernel_size=[3, 3],
            padding="same",
            activation=tf.nn.relu)
    
    pool1 = tf.layers.max_pooling2d(inputs=conv1_2, 
                                    pool_size=[2, 2], 
                                    strides=2)  # pooling stride
  
    conv2_1 = tf.layers.conv2d(
            inputs=pool1,
            filters=128,  # number of filters
            kernel_size=[3, 3],
            padding="same",
            activation=tf.nn.relu)

    conv2_2 = tf.layers.conv2d(
            inputs=conv2_1,
            filters=128,  # number of filters
            kernel_size=[3, 3],
            padding="same",
            activation=tf.nn.relu)
    
    pool2 = tf.layers.max_pooling2d(inputs=conv2_2, 
                                    pool_size=[2, 2], 
                                    strides=2) 
    
    conv3_1 = tf.layers.conv2d(
            inputs=pool2,
            filters=256,  # number of filters
            kernel_size=[3, 3],
            padding="same",
            activation=tf.nn.relu)

    conv3_2 = tf.layers.conv2d(
            inputs=conv3_1,
            filters=256,  # number of filters
            kernel_size=[3, 3],
            padding="same",
            activation=tf.nn.relu)
    
    conv3_3 = tf.layers.conv2d(
            inputs=conv3_2,
            filters=256,  # number of filters
            kernel_size=[3, 3],
            padding="same",
            activation=tf.nn.relu)
    
    pool3 = tf.layers.max_pooling2d(inputs=conv3_3, 
                                    pool_size=[2, 2], 
                                    strides=2) 
    
    conv4_1 = tf.layers.conv2d(
            inputs=pool3,
            filters=512,  # number of filters
            kernel_size=[3, 3],
            padding="same",
            activation=tf.nn.relu)

    conv4_2 = tf.layers.conv2d(
            inputs=conv4_1,
            filters=512,  # number of filters
            kernel_size=[3, 3],
            padding="same",
            activation=tf.nn.relu)
    
    conv4_3 = tf.layers.conv2d(
            inputs=conv4_2,
            filters=512,  # number of filters
            kernel_size=[3, 3],
            padding="same",
            activation=tf.nn.relu)
    
    pool4 = tf.layers.max_pooling2d(inputs=conv4_3, 
                                    pool_size=[2, 2], 
                                    strides=2) 
        
    conv5_1 = tf.layers.conv2d(
            inputs=pool4,
            filters=512,  # number of filters
            kernel_size=[3, 3],
            padding="same",
            activation=tf.nn.relu)

    conv5_2 = tf.layers.conv2d(
            inputs=conv5_1,
            filters=512,  # number of filters
            kernel_size=[3, 3],
            padding="same",
            activation=tf.nn.relu)
    
    conv5_3 = tf.layers.conv2d(
            inputs=conv5_2,
            filters=512,  # number of filters
            kernel_size=[3, 3],
            padding="same",
            activation=tf.nn.relu)
    
    pool5 = tf.layers.max_pooling2d(inputs=conv5_3, 
                                    pool_size=[2, 2], 
                                    strides=2) 
    
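    pool_flat = tf.contrib.layers.flatten(pool5, scope='pool2flat')
    dense = tf.layers.dense(inputs=pool_flat, units=4096, activation=tf.nn.relu)
    # Note: dense_2 has no activation, so dense_2 and logits compose into a
    # single linear map; pass activation=tf.nn.relu to add a nonlinearity.
    dense_2 = tf.layers.dense(inputs=dense, units=4096)
    # 1000 output units, as in VGG-16 for ImageNet. SVHN has only 10 digit
    # classes, so units=10 would be enough for this dataset.
    logits = tf.layers.dense(inputs=dense_2, units=1000)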
    return logits
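
The size of the flattened feature vector follows from the pooling arithmetic. The sketch below is a plain-Python check, not part of the starter code. It assumes the network applies five 2x2, stride-2 max-pooling layers in total (only `pool2` to `pool5` are visible here, so the first one is an assumption) and that the input is 32x32, as in the placeholder used for training. The helper names are made up for illustration.

```python
def pool_output_size(size, pool=2, stride=2):
    """Output width of a max pool without padding (the tf.layers default)."""
    return (size - pool) // stride + 1


def trace_spatial_sizes(size, n_pools):
    """Return the spatial size before and after each pooling layer."""
    sizes = [size]
    for _ in range(n_pools):
        size = pool_output_size(size)
        sizes.append(size)
    return sizes


def dense_param_count(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# 'same' 3x3 convolutions keep the spatial size, so only the pools shrink it.
sizes = trace_spatial_sizes(32, n_pools=5)   # assumed: five pools in total
flat_features = sizes[-1] * sizes[-1] * 512  # 512 filters in the last block

head_params = (dense_param_count(flat_features, 4096)
               + dense_param_count(4096, 4096)
               + dense_param_count(4096, 1000))

print(sizes, flat_features, head_params)
```

Under these assumptions the last pooling layer outputs a 1x1x512 map, so most of the parameters in the three dense layers come from the two 4096-unit layers rather than from the flattened features.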

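def apply_classification_loss(model_function):
    """Build a graph with the model, a softmax cross-entropy loss, an Adam
    training op, and an accuracy metric, and return them in a dict."""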
    with tf.Graph().as_default() as g:
        with tf.device("/gpu:0"):  # change to "/cpu:0" if no GPU is available
            x_ = tf.placeholder(tf.float32, [None, 32, 32, 3])
            y_ = tf.placeholder(tf.int32, [None])
            y_logits = model_function(x_)
            
            y_dict = dict(labels=y_, logits=y_logits)
            losses = tf.nn.sparse_softmax_cross_entropy_with_logits(**y_dict)
            cross_entropy_loss = tf.reduce_mean(losses)
            trainer = tf.train.AdamOptimizer(learning_rate=0.001)
            train_op = trainer.minimize(cross_entropy_loss)
            
            # softmax is monotonic, so the argmax of the logits is the same class
            y_pred = tf.argmax(y_logits, axis=1)
            correct_prediction = tf.equal(tf.cast(y_pred, tf.int32), y_)
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    
    model_dict = {'graph': g, 'inputs': [x_, y_], 'train_op': train_op,
                  'accuracy': accuracy, 'loss': cross_entropy_loss}
    
    return model_dict

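def train_model(model_dict, dataset_generators, epoch_n, print_every):
    """Train for epoch_n epochs; every print_every iterations, print the loss
    and accuracy averaged over the test batches."""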
    with model_dict['graph'].as_default(), tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        
        for epoch_i in range(epoch_n):
            for iter_i, data_batch in enumerate(dataset_generators['train']):
                train_feed_dict = dict(zip(model_dict['inputs'], data_batch))
                sess.run(model_dict['train_op'], feed_dict=train_feed_dict)
                
                if iter_i % print_every == 0:
                    collect_arr = []
                    for test_batch in dataset_generators['test']:
                        test_feed_dict = dict(zip(model_dict['inputs'], test_batch))
                        to_compute = [model_dict['loss'], model_dict['accuracy']]
                        collect_arr.append(sess.run(to_compute, test_feed_dict))
                    averages = np.mean(collect_arr, axis=0)
                    fmt = (epoch_i, iter_i, ) + tuple(averages)
                    print('epoch {:d} iter {:d}, loss: {:.3f}, '
                          'accuracy: {:.3f}'.format(*fmt))


dataset_generators = {
        'train': svhn_dataset_generator('train', 256),
        'test': svhn_dataset_generator('test', 256)
}
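
One detail of the evaluation loop in `train_model` is worth noting: it takes a plain mean of the per-batch losses and accuracies. If the last test batch is smaller than the others (whether it is depends on `svhn_dataset_generator`, which is not shown here), that batch gets the same weight as a full one. The sketch below illustrates the difference with a size-weighted mean. The numbers and the helper name are made up for illustration and are not results from this notebook.

```python
def weighted_average(batch_values, batch_sizes):
    """Average per-batch metrics, weighting each batch by its example count."""
    total = sum(batch_sizes)
    return sum(v * n for v, n in zip(batch_values, batch_sizes)) / total


# Hypothetical per-batch accuracies: two full batches and one short final batch.
batch_accuracies = [0.90, 0.90, 0.50]
batch_sizes = [256, 256, 8]

plain_mean = sum(batch_accuracies) / len(batch_accuracies)
weighted_mean = weighted_average(batch_accuracies, batch_sizes)

print('plain mean:    {:.3f}'.format(plain_mean))
print('weighted mean: {:.3f}'.format(weighted_mean))
```

With equal-sized batches the two averages coincide, so the plain mean used in `train_model` is exact in that case and only an approximation otherwise.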


In [5]:
model_dict = apply_classification_loss(SVHN_plusplus)
train_model(model_dict, dataset_generators, epoch_n=40, print_every=200)

epoch 0 iter 0, loss: 419.640, accuracy: 0.196
epoch 0 iter 200, loss: 2.227, accuracy: 0.196
epoch 1 iter 0, loss: 2.041, accuracy: 0.296
epoch 1 iter 200, loss: 0.835, accuracy: 0.743
epoch 2 iter 0, loss: 0.630, accuracy: 0.818
epoch 2 iter 200, loss: 0.386, accuracy: 0.891
epoch 3 iter 0, loss: 0.440, accuracy: 0.876
epoch 3 iter 200, loss: 0.332, accuracy: 0.906
epoch 4 iter 0, loss: 0.391, accuracy: 0.893
epoch 4 iter 200, loss: 0.307, accuracy: 0.916
epoch 5 iter 0, loss: 0.322, accuracy: 0.915
epoch 5 iter 200, loss: 0.333, accuracy: 0.915
epoch 6 iter 0, loss: 0.298, accuracy: 0.924
epoch 6 iter 200, loss: 0.323, accuracy: 0.918
epoch 7 iter 0, loss: 0.326, accuracy: 0.919
epoch 7 iter 200, loss: 0.350, accuracy: 0.915
epoch 8 iter 0, loss: 0.327, accuracy: 0.925
epoch 8 iter 200, loss: 0.356, accuracy: 0.918
epoch 9 iter 0, loss: 0.391, accuracy: 0.912
epoch 9 iter 200, loss: 0.360, accuracy: 0.913
epoch 10 iter 0, loss: 0.379, accuracy: 0.914
epoch 10 iter 200, loss: 0.358, 