**Cole Turner and Ethan Seal**

Fall 2019

CS343: Neural Networks

Project 3: Convolutional Neural Networks

In [1]:
import os
import random
import numpy as np
import matplotlib.pyplot as plt

plt.style.use(['seaborn-colorblind', 'seaborn-darkgrid'])
plt.rcParams.update({'font.size': 20})

np.set_printoptions(suppress=True, precision=7)

# Automatically reload your external source code
%load_ext autoreload
%autoreload 2

**Global note: Make sure any debug printouts do not appear if `verbose=False`!**

## Task 4) Implement weight optimizers for gradient descent

To change the weights during training, we need an optimization algorithm so that the loss decreases over epochs as the network learns the structure of the input patterns. Until now, we have used **stochastic gradient descent (SGD)**, the simplest such algorithm. We will implement 3 popular algorithms:

- `SGD` (stochastic gradient descent)
- `SGD_Momentum` (stochastic gradient descent with momentum)
- `Adam` (Adaptive Moment Estimation)

Implement each of these according to the update equations (in `optimizer.py::update_weights` in each subclass). Let's use $w_t$ in the math below to represent the weights in a layer at time step $t$, $dw$ to represent the gradient of the weights in a layer, and $\eta$ to represent the learning rate. We use vectorized notation below (each update applies to all weights element-wise). Then:

**SGD**: 

$w_{t} = w_{t-1} - \eta \times dw$

**SGD (momentum)**:

$v_{t} = m \times v_{t-1} - \eta \times dw$

$w_{t} = w_{t-1} + v_t$

where $v_t$ is called the `velocity` at time $t$. At the first time step (0), velocity should be set to all zeros and have the same shape as $w$. $m$ is a constant that determines how much of the velocity (the accumulated gradient updates) from the previous time step carries over into the weight update for the current time step.
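
The SGD and SGD (momentum) equations above can be sketched as two standalone functions. This is only a minimal illustration, not the `optimizer.py` class interface: the function names, the argument order, and the default values for `lr` and `m` are assumptions made for the example.

```python
import numpy as np


def sgd_step(w, dw, lr=0.1):
    """One plain SGD update: w_t = w_{t-1} - lr * dw (hypothetical helper)."""
    return w - lr * dw


def sgd_momentum_step(w, dw, velocity, lr=0.1, m=0.9):
    """One SGD-with-momentum update (hypothetical helper).

    v_t = m * v_{t-1} - lr * dw
    w_t = w_{t-1} + v_t
    Returns the new weights and the new velocity.
    """
    velocity = m * velocity - lr * dw
    return w + velocity, velocity


# Toy example: two weights, the same gradient applied on two steps.
w0 = np.array([1.0, -2.0])
dw = np.array([0.5, 0.5])

w_sgd = sgd_step(w0, dw, lr=0.1)

v0 = np.zeros_like(w0)  # velocity starts at zero, same shape as the weights
w1, v1 = sgd_momentum_step(w0, dw, v0, lr=0.1, m=0.6)
w2, v2 = sgd_momentum_step(w1, dw, v1, lr=0.1, m=0.6)
```

On the first step the velocity is zero, so momentum behaves exactly like plain SGD; from the second step on, the step grows because part of the previous velocity is carried over.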


**Adam**:

$m_{t} = \beta_1 \times m_{t-1} + (1 - \beta_1)\times dw$

$v_{t} = \beta_2 \times v_{t-1} + (1 - \beta_2)\times dw^2$

$n = m_{t} / \left (1-(\beta_1^t) \right )$

$u = v_{t} / \left (1-(\beta_2^t) \right )$

$w_{t} = w_{t-1} - \left ( \eta \times n \right ) / \left ( \sqrt{u} + \epsilon \right )$


Like SGD (momentum), Adam records momentum terms $m$ and $v$. At time step 0, you should initialize them to zeros in arrays with the same shape as the weights. $n$ and $u$ are the bias-corrected versions of $m$ and $v$, recomputed on each time step. The remaining quantities are constants. Note that $t$ keeps track of the integer time step and needs to be incremented on each update; it must equal 1 on the first update, because the denominators $1-\beta_1^t$ and $1-\beta_2^t$ are zero at $t=0$.
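
The Adam equations above can be sketched as a single standalone function. This is a minimal illustration under stated assumptions, not the `optimizer.py` class interface: the function name `adam_step`, the way state is passed in and returned, and the default values for `beta1`, `beta2`, and `eps` (the commonly used 0.9, 0.999, and 1e-8) are choices made for the example.

```python
import numpy as np


def adam_step(w, dw, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (hypothetical helper). Returns (w, m, v, t)."""
    t = t + 1                                # time step is 1 on the first update
    m = beta1 * m + (1 - beta1) * dw         # first-moment estimate
    v = beta2 * v + (1 - beta2) * dw**2      # second-moment estimate
    n = m / (1 - beta1**t)                   # bias-corrected first moment
    u = v / (1 - beta2**t)                   # bias-corrected second moment
    w = w - (lr * n) / (np.sqrt(u) + eps)
    return w, m, v, t


# Toy example: the same gradient applied on two consecutive steps.
w0 = np.array([1.0, -2.0])
dw = np.array([0.5, -0.25])
m0 = np.zeros_like(w0)
v0 = np.zeros_like(w0)

w1, m1, v1, t1 = adam_step(w0, dw, m0, v0, t=0, lr=0.1)
w2, m2, v2, t2 = adam_step(w1, dw, m1, v1, t=t1, lr=0.1)
```

With a constant gradient, the bias correction makes $n = dw$ and $u = dw^2$, so each weight moves by almost exactly `lr` in the direction opposite to the sign of its gradient. This matches the pattern in the expected Adam test output, where every weight changes by about 0.1 per iteration.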

In [2]:
from optimizer import *

####  Test SGD

In [3]:
np.random.seed(0)

wts = np.arange(-3, 3, dtype=np.float64)
d_wts = np.random.randn(len(wts))

optimizer = SGD()
optimizer.prepare(wts, d_wts)

new_wts_1 = optimizer.update_weights()
new_wts_2 = optimizer.update_weights()

print(f'SGD: Wts after 1 iter {new_wts_1}')
print(f'SGD: Wts after 2 iter {new_wts_2}')

AttributeError: 'SGD' object has no attribute 'velocity'

Output should be:

    SGD: Wts after 1 iter [-3.1764052 -2.0400157 -1.0978738 -0.2240893  0.8132442  2.0977278]
    SGD: Wts after 2 iter [-3.3528105 -2.0800314 -1.1957476 -0.4481786  0.6264884  2.1954556]

####  Test SGD_Momentum

In [4]:
np.random.seed(0)

wts = np.random.randn(3, 4)
d_wts = np.random.randn(3, 4)

optimizer = SGD_Momentum(lr=0.1, m=0.6)
optimizer.prepare(wts, d_wts)

new_wts_1 = optimizer.update_weights()
new_wts_2 = optimizer.update_weights()

print(f'SGD M: Wts after 1 iter\n{new_wts_1}')
print(f'SGD M: Wts after 2 iter\n{new_wts_2}')

SGD M: Wts after 1 iter
[[ 1.6879486  0.3879897  0.9343517  2.2075258]
 [ 1.7181501 -0.9567621  0.9187816 -0.0659476]
 [ 0.1520801  0.3452366  0.0576     1.52849  ]]
SGD M: Wts after 2 iter
[[ 1.5661825  0.3685217  0.8633335  2.1541379]
 [ 1.4790974 -0.9239367  0.8686908  0.0707077]
 [ 0.5605585  0.2406577 -0.0807098  1.6472364]]


Output should be:

    SGD M: Wts after 1 iter
    [[ 1.6879486  0.3879897  0.9343517  2.2075258]
     [ 1.7181501 -0.9567621  0.9187816 -0.0659476]
     [ 0.1520801  0.3452366  0.0576     1.52849  ]]
    SGD M: Wts after 2 iter
    [[ 1.5661825  0.3685217  0.8633335  2.1541379]
     [ 1.4790974 -0.9239367  0.8686908  0.0707077]
     [ 0.5605585  0.2406577 -0.0807098  1.6472364]]

####  Test Adam

In [5]:
np.random.seed(0)

wts = np.random.randn(3, 4)
d_wts = np.random.randn(3, 4)

optimizer = Adam(lr=0.1)
optimizer.prepare(wts, d_wts)

new_wts_1 = optimizer.update_weights()
new_wts_2 = optimizer.update_weights()
new_wts_3 = optimizer.update_weights()

print(f'Adam: Wts after 1 iter\n{new_wts_1}')
print(f'Adam: Wts after 2 iter\n{new_wts_2}')
print(f'Adam: Wts after 3 iter\n{new_wts_3}')

Adam: Wts after 1 iter
[[ 1.6640523  0.3001572  0.878738   2.1408932]
 [ 1.767558  -0.8772779  0.8500884 -0.0513572]
 [-0.0032189  0.3105985  0.0440436  1.5542735]]
Adam: Wts after 2 iter
[[ 1.5640523  0.2001572  0.778738   2.0408932]
 [ 1.667558  -0.7772779  0.7500884  0.0486428]
 [ 0.0967811  0.2105985 -0.0559564  1.6542735]]
Adam: Wts after 3 iter
[[ 1.4640523  0.1001572  0.678738   1.9408932]
 [ 1.567558  -0.6772779  0.6500884  0.1486428]
 [ 0.1967811  0.1105985 -0.1559564  1.7542735]]


Output should be:

    Adam: Wts after 1 iter
    [[ 1.6640523  0.3001572  0.878738   2.1408932]
     [ 1.767558  -0.8772779  0.8500884 -0.0513572]
     [-0.0032189  0.3105985  0.0440436  1.5542735]]
    Adam: Wts after 2 iter
    [[ 1.5640523  0.2001572  0.778738   2.0408932]
     [ 1.667558  -0.7772779  0.7500884  0.0486428]
     [ 0.0967811  0.2105985 -0.0559564  1.6542735]]
    Adam: Wts after 3 iter
    [[ 1.4640523  0.1001572  0.678738   1.9408932]
     [ 1.567558  -0.6772779  0.6500884  0.1486428]
     [ 0.1967811  0.1105985 -0.1559564  1.7542735]]  

## Task 5) Write network training methods

Implement methods in `network.py` to actually train the network, using all the building blocks that you have created. The methods to implement are:

- `predict`
- `fit`. Add an optional parameter `print_every=1` that controls how often (in iterations) the loss and iteration number are printed.
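
The bookkeeping that `fit` needs can be sketched independently of the network itself. The helpers below are hypothetical (they are not part of `network.py`), and sampling mini-batches with replacement is an assumption; adapt it to whatever sampling scheme your implementation uses.

```python
import numpy as np


def training_schedule(n_samps, mini_batch_sz, n_epochs):
    """Return (iterations per epoch, total iterations) for a training run."""
    iter_per_epoch = max(1, n_samps // mini_batch_sz)
    return iter_per_epoch, n_epochs * iter_per_epoch


def sample_mini_batch(x, y, mini_batch_sz, rng):
    """Draw one random mini-batch (with replacement) of inputs and labels."""
    idx = rng.choice(len(x), size=mini_batch_sz, replace=True)
    return x[idx], y[idx]


def should_print(iteration, print_every=1):
    """True on the iterations where the loss should be printed."""
    return iteration % print_every == 0


rng = np.random.default_rng(0)
x = np.zeros((50, 3, 16, 16))   # stand-in for a 50-sample dev set
y = np.arange(50) % 10          # stand-in integer class labels

iter_per_epoch, n_iter = training_schedule(len(x), mini_batch_sz=10, n_epochs=30)
x_mb, y_mb = sample_mini_batch(x, y, mini_batch_sz=10, rng=rng)
printed = [i for i in range(n_iter) if should_print(i, print_every=50)]
```

With 50 samples, a mini-batch size of 10, and 30 epochs, this gives 5 iterations per epoch and 150 iterations in total.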

## Task 6) Overfitting a convolutional neural network

Usually we try to prevent overfitting, but we can use it as a valuable debugging tool to test out a complex backprop-style neural network. Assuming everything is working, it is almost always the case that we should be able to overfit a tiny dataset with a huge model with tons of parameters (i.e. your CNN). You will use this strategy to verify that your network is working.

Let's use a small amount of real data from STL-10. If everything is working properly, the network should overfit and you should see a significant drop in the loss from its starting value of ~2.3.
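
The starting value of ~2.3 can be checked with a quick calculation. The sketch assumes the network ends in a softmax output with cross-entropy loss and that an untrained network assigns roughly equal probability to each of the 10 STL-10 classes; the variable names are only for illustration.

```python
import numpy as np

n_classes = 10                                       # STL-10 has 10 classes
uniform_probs = np.full(n_classes, 1.0 / n_classes)  # untrained net: ~equal probabilities
correct_class = 3                                    # arbitrary; every class gives the same loss here

# Cross-entropy loss for one sample is -log(probability of the correct class).
expected_initial_loss = -np.log(uniform_probs[correct_class])
print(f'{expected_initial_loss:.4f}')                # 2.3026, i.e. ln(10)
```

A first-iteration loss far from this value usually points to a problem with the weight scale or the loss computation rather than with the optimizer.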

### 6a) Move your `preprocess_data.py` from the MLP project

Make the following change:

- Re-arrange dimensions of `imgs` so that when it is returned, `shape=(Num imgs, RGB color chans, height, width)` (No longer flatten non-batch dimensions)
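
A minimal sketch of the dimension re-arrangement, assuming `imgs` arrives as `(Num imgs, height, width, RGB color chans)`, which is the layout printed by the loader. The helper name `to_channels_first` is hypothetical.

```python
import numpy as np


def to_channels_first(imgs):
    """Move the color channel axis from last to second: (N, H, W, C) -> (N, C, H, W)."""
    return np.transpose(imgs, (0, 3, 1, 2))


# Toy batch of two 16x16 RGB images with distinct values in every position.
imgs = np.arange(2 * 16 * 16 * 3).reshape(2, 16, 16, 3)
imgs_chw = to_channels_first(imgs)
print(imgs_chw.shape)  # (2, 3, 16, 16)
```

Note that `reshape` alone is not enough here: it changes the shape without moving any values, so pixels from different color channels would end up mixed within one channel. `transpose` keeps each pixel's value attached to the right (channel, row, column) position.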

In [13]:
import load_stl10_dataset
import preprocess_data
from network import ConvNet4
import optimizer

### 6b) Load in STL-10 at 16x16 resolution

If you don't want to wait for STL-10 to download from the internet and resize, copy over your data and numpy folders from your MLP project.

**Notes:**
- You will need to download the new version of `load_stl10_dataset`.
- The different train/test split here won't work if you hard coded the proportions in your `create_splits` implementation! *This isn't catastrophic, it just means that it will take longer to compute accuracy on the validation set.*

In [21]:
# Download the STL-10 dataset from the internet, convert it to Numpy ndarray, resize to 16x16
# cache it locally on your computer for faster loading next time.
load_stl10_dataset.purge_cached_dataset()
stl_imgs, stl_labels = load_stl10_dataset.load(scale_fact=6)
# preprocess
stl_imgs, stl_labels = preprocess_data.preprocess_stl(stl_imgs, stl_labels)
# create splits
x_train, y_train, x_test, y_test, x_val, y_val, x_dev, y_dev = preprocess_data.create_splits(
    stl_imgs, stl_labels, n_train_samps=4548, n_test_samps=400, n_valid_samps=2, n_dev_samps=50)

print('Train data shape: ', x_train.shape)
print('Train labels shape: ', y_train.shape)
print('Test data shape: ', x_test.shape)
print('Test labels shape: ', y_test.shape)
print('Validation data shape: ', x_val.shape)
print('Validation labels shape: ', y_val.shape)
print('dev data shape: ', x_dev.shape)
print('dev labels shape: ', y_dev.shape)

classes = np.loadtxt(os.path.join('data', 'stl10_binary', 'class_names.txt'), dtype=str)

Images are: (5000, 96, 96, 3)
Labels are: (5000,)
Resizing 5000 images to 16x16...Done!
Saving Numpy arrays the images and labels to ./numpy...Done!
imgs.shape (5000, 16, 16, 3)
data.shape (5000, 768)
Train data shape:  (4548, 768)
Train labels shape:  (4548,)
Test data shape:  (400, 768)
Test labels shape:  (400,)
Validation data shape:  (2, 768)
Validation labels shape:  (2,)
dev data shape:  (50, 768)
dev labels shape:  (50,)


### 6c) Train and overfit the network on a small STL-10 sample with each optimizer

**Goal:** If your network works, you should see the loss drop toward 0 over epochs.

In 3 separate cells below:

- Create 3 different `ConvNet4` networks.
- Compile each with a different optimizer (each net uses a different optimizer).
- Train each on the **dev** set and validate on the tiny validation set (we don't care about out-of-training-set performance here).

You will be making plots demonstrating the overfitting for each optimizer below. **You should train the nets with the same number of epochs such that at least 2/3 of them clearly show loss convergence to a small value; one optimizer may not converge yet, and that's ok**. Cut off the simulations based on the 2/3 that do converge.

Guidelines:

- Weight scales and learning rates of `1e-2` should work well.
- Start by testing the Adam optimizer.
- Remember that the input shape is (3, 16, 16). You need to specify this to the network constructor.
- The hyperparameters are up to you, though I wouldn't recommend a batch size that is too small (close to 1), otherwise it may be tricky to see whether the loss is actually decreasing on average.
- Decreasing `acc_freq` will make the `fit` function evaluate the training and validation accuracy more often. This is a computationally intensive process, so small values come with an increase in training time. On the other hand, checking the accuracy too infrequently means you won't know whether the network is trending toward overfitting the training data, which is what you're checking for.
- Each training session takes ~30 mins on my laptop.
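
Because mini-batch losses are noisy, a moving average can make it easier to judge whether the loss is decreasing on average. The helper below is a hypothetical sketch (it is not part of `network.py`); the window size is a free choice.

```python
import numpy as np


def moving_average(loss_history, window=5):
    """Smooth a loss history with a simple moving average of the given window size."""
    loss_history = np.asarray(loss_history, dtype=float)
    if len(loss_history) < window:
        # Not enough values to fill one window: return the history unchanged.
        return loss_history.copy()
    kernel = np.ones(window) / window
    # 'valid' keeps only positions where the window fully overlaps the data.
    return np.convolve(loss_history, kernel, mode='valid')


# Toy loss history: noisy, but trending downward.
toy_losses = [3.0, 2.0, 1.0, 2.0, 0.0]
smoothed = moving_average(toy_losses, window=3)
```

The smoothed curve has `len(loss_history) - window + 1` points, so plot it against a correspondingly shortened iteration axis.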

**Caveat emptor:** Training convolutional networks is notoriously computationally intensive. If you experiment with hyperparameters, each training session may take several hours. Use the loss/accuracy printouts to quickly gauge whether your hyperparameter choices are getting your network to decrease in loss. Monitor the printouts and interrupt the Jupyter kernel if things are not trending in the right direction. Consider using the Davis 102 iMacs if this is running too slowly on your laptop.

In [78]:
wt_scale = 1e-2
lr = 1e-2
input_shape = (3, 16, 16)
mini_batch_sz = 10

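# preprocess_stl returned flattened images (N, 768), but ConvNet4 expects (N, 3, 16, 16).
# Assumption: the flattened layout came from (N, 16, 16, 3) arrays (see the printout above),
# so restore that shape first, then move the color channel axis to the front.
n_chans, img_y, img_x = input_shape
x_dev = x_dev.reshape(x_dev.shape[0], img_y, img_x, n_chans).transpose(0, 3, 1, 2)
x_val = x_val.reshape(x_val.shape[0], img_y, img_x, n_chans).transpose(0, 3, 1, 2)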


In [76]:
# Adam

adam = ConvNet4(input_shape=input_shape, wt_scale=wt_scale, verbose=False)
adam.compile('adam')
adam.fit(x_dev, y_dev, x_val, y_val, mini_batch_sz=mini_batch_sz, n_epochs=30)

Starting to train...
500 iterations. 10 iter/epoch.
We are on iteration: 1 of 500. 
Time taken for iteration 0: 1.2095959186553955
Estimated time to complete: 604.7979593276978
We are on iteration: 2 of 500. 
We are on iteration: 3 of 500. 
We are on iteration: 4 of 500. 
We are on iteration: 5 of 500. 
We are on iteration: 6 of 500. 
We are on iteration: 7 of 500. 
We are on iteration: 8 of 500. 
We are on iteration: 9 of 500. 
Loss History: [2.3040361725149525, 2.2999972953588594, 2.2922609673044834, 2.2816561849013652, 2.294584076928961, 2.2909093952635566, 2.2785665652189713, 2.2428269260193754, 2.2518910369212772]
[1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 1 1]
[1 1 1 1 1]
[1 1]
[0 3]
  Train acc: 0.16, Val acc: 0.0
We are on iteration: 10 of 500. 
We are on iteration: 11 of 500. 
We are on iteration: 12 of 500. 
We are on iteration: 13 of 500. 
We are on iteration: 14 of 500. 
We are on iteration: 15 of 500. 
We are on i

[5 5 5 5 5]
[5 8 1 5 5]
[8 5 8 5 5]
[5 8 5 5 5]
[5 5 5 8 5]
[5 5 5 5 5]
[8 5 8 5 5]
[5 5 5 5 5]
[1 1 8 5 1]
[1 5 5 1 5]
[5 5]
[0 3]
  Train acc: 0.34, Val acc: 0.0
We are on iteration: 64 of 500. 
We are on iteration: 65 of 500. 
We are on iteration: 66 of 500. 
We are on iteration: 67 of 500. 
We are on iteration: 68 of 500. 
We are on iteration: 69 of 500. 
We are on iteration: 70 of 500. 
We are on iteration: 71 of 500. 
We are on iteration: 72 of 500. 
Loss History: [2.3040361725149525, 2.2999972953588594, 2.2922609673044834, 2.2816561849013652, 2.294584076928961, 2.2909093952635566, 2.2785665652189713, 2.2428269260193754, 2.2518910369212772, 2.2609841936493034, 2.222912603317906, 2.2219080421793165, 2.2144286882621205, 2.2529159665850518, 2.0094927693863087, 2.202083588482599, 1.9321646735171205, 2.2673616080582666, 2.064220866908327, 2.4721378351992627, 1.7689893844978943, 2.2128127258356143, 2.4118948273495673, 1.9995223724044722, 2.376006909356501, 2.3182796848516762, 2.2120419

[1 1 5 5 5]
[1 8 8 5 5]
[8 5 8 5 5]
[5 8 5 5 5]
[5 2 5 8 5]
[5 1 5 5 1]
[8 5 8 5 5]
[5 5 5 5 5]
[1 1 1 5 1]
[5 5 5 1 5]
[5 1]
[0 3]
  Train acc: 0.4, Val acc: 0.0
We are on iteration: 100 of 500. 
We are on iteration: 101 of 500. 
We are on iteration: 102 of 500. 
We are on iteration: 103 of 500. 
We are on iteration: 104 of 500. 
We are on iteration: 105 of 500. 
We are on iteration: 106 of 500. 
We are on iteration: 107 of 500. 
We are on iteration: 108 of 500. 

[1 1 5 5 5]
[1 8 0 5 5]
[0 1 8 5 5]
[2 0 1 0 5]
[5 2 5 8 1]
[2 1 7 5 1]
[8 5 8 1 5]
[5 5 5 5 5]
[1 2 9 5 1]
[1 5 1 1 5]
[5 1]
[0 3]
  Train acc: 0.6, Val acc: 0.0
We are on iteration: 127 of 500. 
We are on iteration: 128 of 500. 
We are on iteration: 129 of 500. 
We are on iteration: 130 of 500. 
We are on iteration: 131 of 500. 
We are on iteration: 132 of 500. 
We are on iteration: 133 of 500. 
We are on iteration: 134 of 500. 
We are on iteration: 135 of 500. 

[1 1 4 5 2]
[4 8 0 7 5]
[8 1 8 5 5]
[2 0 1 0 7]
[4 2 7 8 1]
[6 1 7 5 1]
[8 5 8 1 5]
[5 5 5 5 5]
[1 6 8 5 4]
[1 5 1 1 5]
[5 1]
[0 3]
  Train acc: 0.72, Val acc: 0.0
We are on iteration: 154 of 500. 
We are on iteration: 155 of 500. 
We are on iteration: 156 of 500. 
We are on iteration: 157 of 500. 
We are on iteration: 158 of 500. 
We are on iteration: 159 of 500. 
We are on iteration: 160 of 500. 
We are on iteration: 161 of 500. 
We are on iteration: 162 of 500. 

[1 1 4 5 2]
[4 8 0 7 4]
[0 1 8 5 5]
[2 0 3 0 7]
[4 2 7 8 1]
[6 1 7 5 1]
[8 4 8 3 7]
[5 5 5 4 5]
[1 6 9 3 4]
[1 5 1 1 5]
[4 1]
[0 3]
  Train acc: 0.84, Val acc: 0.0
We are on iteration: 181 of 500. 
We are on iteration: 182 of 500. 
We are on iteration: 183 of 500. 
We are on iteration: 184 of 500. 
We are on iteration: 185 of 500. 
We are on iteration: 186 of 500. 
We are on iteration: 187 of 500. 
We are on iteration: 188 of 500. 
We are on iteration: 189 of 500. 

[1 1 5 3 2]
[4 8 0 7 4]
[0 1 8 5 5]
[2 0 3 0 3]
[4 2 7 8 1]
[6 4 7 5 1]
[8 4 8 3 7]
[5 5 3 4 5]
[1 6 9 3 4]
[0 5 1 1 5]
[3 1]
[0 3]
  Train acc: 0.84, Val acc: 0.0
We are on iteration: 199 of 500. 
We are on iteration: 200 of 500. 
We are on iteration: 201 of 500. 
We are on iteration: 202 of 500. 
We are on iteration: 203 of 500. 
We are on iteration: 204 of 500. 
We are on iteration: 205 of 500. 
We are on iteration: 206 of 500. 
We are on iteration: 207 of 500. 

[1 1 5 5 2]
[4 8 0 7 5]
[1 1 8 5 5]
[2 0 3 0 6]
[4 2 7 8 1]
[6 6 7 5 1]
[8 4 8 3 7]
[5 5 5 5 5]
[1 6 9 3 4]
[6 5 1 1 5]
[5 1]
[0 3]
  Train acc: 0.88, Val acc: 0.0
We are on iteration: 217 of 500. 
We are on iteration: 218 of 500. 
We are on iteration: 219 of 500. 
We are on iteration: 220 of 500. 
We are on iteration: 221 of 500. 
We are on iteration: 222 of 500. 
We are on iteration: 223 of 500. 
We are on iteration: 224 of 500. 
We are on iteration: 225 of 500. 

[1 1 5 5 2]
[4 8 0 7 5]
[1 1 8 5 5]
[2 0 3 0 6]
[4 2 7 8 1]
[6 1 7 5 1]
[8 4 9 1 7]
[5 5 5 5 5]
[1 6 9 5 4]
[1 5 1 1 5]
[5 1]
[0 3]
  Train acc: 0.86, Val acc: 0.0
We are on iteration: 235 of 500. 
We are on iteration: 236 of 500. 
We are on iteration: 237 of 500. 
We are on iteration: 238 of 500. 
We are on iteration: 239 of 500. 
We are on iteration: 240 of 500. 
We are on iteration: 241 of 500. 
We are on iteration: 242 of 500. 
We are on iteration: 243 of 500. 

[1 1 5 5 5]
[4 8 0 7 5]
[1 1 8 5 5]
[2 0 3 0 6]
[4 2 7 8 1]
[6 6 7 5 1]
[8 4 9 3 7]
[5 5 3 5 5]
[1 6 9 3 4]
[1 5 1 7 5]
[5 1]
[0 3]
  Train acc: 0.94, Val acc: 0.0
We are on iteration: 253 of 500. 
We are on iteration: 254 of 500. 
We are on iteration: 255 of 500. 
We are on iteration: 256 of 500. 
We are on iteration: 257 of 500. 
We are on iteration: 258 of 500. 
We are on iteration: 259 of 500. 
We are on iteration: 260 of 500. 
We are on iteration: 261 of 500. 

[1 1 5 3 2]
[4 8 0 7 4]
[1 1 8 5 5]
[2 0 3 0 6]
[4 2 7 8 1]
[6 6 7 5 1]
[8 4 9 3 7]
[5 5 3 4 5]
[1 6 9 3 4]
[1 5 1 7 5]
[4 1]
[0 3]
  Train acc: 0.94, Val acc: 0.0
We are on iteration: 271 of 500. 
We are on iteration: 272 of 500. 
We are on iteration: 273 of 500. 
We are on iteration: 274 of 500. 
We are on iteration: 275 of 500. 
We are on iteration: 276 of 500. 
We are on iteration: 277 of 500. 
We are on iteration: 278 of 500. 
We are on iteration: 279 of 500. 

[1 1 5 5 2]
[4 8 0 7 5]
[7 1 8 5 5]
[2 0 3 0 6]
[4 2 7 8 1]
[6 6 7 5 1]
[8 4 9 3 7]
[5 5 3 4 5]
[1 6 9 3 4]
[1 5 1 7 5]
[5 1]
[0 3]
  Train acc: 1.0, Val acc: 0.0
We are on iteration: 289 of 500. 
We are on iteration: 290 of 500. 
We are on iteration: 291 of 500. 
We are on iteration: 292 of 500. 
We are on iteration: 293 of 500. 
We are on iteration: 294 of 500. 
We are on iteration: 295 of 500. 
We are on iteration: 296 of 500. 
We are on iteration: 297 of 500. 

[1 1 5 5 2]
[4 8 0 7 5]
[7 1 8 5 5]
[2 0 3 0 6]
[4 2 7 8 1]
[6 6 7 5 1]
[8 4 9 3 4]
[5 5 3 4 5]
[1 6 9 3 4]
[1 5 1 7 5]
[4 1]
[0 3]
  Train acc: 0.98, Val acc: 0.0
We are on iteration: 307 of 500. 
We are on iteration: 308 of 500. 
We are on iteration: 309 of 500. 
We are on iteration: 310 of 500. 
We are on iteration: 311 of 500. 
We are on iteration: 312 of 500. 
We are on iteration: 313 of 500. 
We are on iteration: 314 of 500. 
We are on iteration: 315 of 500. 

[1 1 5 5 2]
[4 8 0 7 5]
[7 1 8 5 5]
[2 0 3 0 6]
[4 2 7 8 1]
[6 6 7 5 1]
[8 4 9 3 7]
[5 5 3 4 5]
[1 6 9 3 4]
[1 5 1 7 5]
[3 1]
[0 3]
  Train acc: 1.0, Val acc: 0.0
We are on iteration: 325 of 500. 
We are on iteration: 326 of 500. 
We are on iteration: 327 of 500. 
We are on iteration: 328 of 500. 
We are on iteration: 329 of 500. 
We are on iteration: 330 of 500. 
We are on iteration: 331 of 500. 
We are on iteration: 332 of 500. 
We are on iteration: 333 of 500. 

KeyboardInterrupt: 

In [84]:
# SGD-M
sgd_m = ConvNet4(input_shape=input_shape, wt_scale=wt_scale, verbose=False)
sgd_m.compile('sgd_momentum')
sgd_m.fit(x_dev, y_dev, x_val, y_val, mini_batch_sz=mini_batch_sz, n_epochs=30)

Starting to train...
750 iterations. 5 iter/epoch.
We are on iteration: 1 of 750. 
Time taken for iteration 0: 2.3133342266082764
Estimated time to complete: 1735.0006699562073
We are on iteration: 2 of 750. 
We are on iteration: 3 of 750. 
We are on iteration: 4 of 750. 
We are on iteration: 5 of 750. 
We are on iteration: 6 of 750. 
We are on iteration: 7 of 750. 
We are on iteration: 8 of 750. 
We are on iteration: 9 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138]
[1 1 1 1 8 1 8 1 8 1]
[8 8 1 8 1 1 1 1 8 8]
[1 1 1 8 8 1 1 8 1 1]
[8 1 1 8 1 1 1 8 1 8]
[8 1 8 1 1 8 1 8 1 1]
[8 8]
[0 3]
  Train acc: 0.12, Val acc: 0.0
We are on iteration: 10 of 750. 
We are on iteration: 11 of 750. 
We are on iteration: 12 of 750. 
We are on iteration: 13 of 750. 
We are on iteration: 14 of 750. 
We are on iteration: 15 of 750. 
We are on iteration: 16

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 64 of 750. 
We are on iteration: 65 of 750. 
We are on iteration: 66 of 750. 
We are on iteration: 67 of 750. 
We are on iteration: 68 of 750. 
We are on iteration: 69 of 750. 
We are on iteration: 70 of 750. 
We are on iteration: 71 of 750. 
We are on iteration: 72 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.2997266279600455, 2.286882233119855, 2.2942

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 100 of 750. 
We are on iteration: 101 of 750. 
We are on iteration: 102 of 750. 
We are on iteration: 103 of 750. 
We are on iteration: 104 of 750. 
We are on iteration: 105 of 750. 
We are on iteration: 106 of 750. 
We are on iteration: 107 of 750. 
We are on iteration: 108 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 127 of 750. 
We are on iteration: 128 of 750. 
We are on iteration: 129 of 750. 
We are on iteration: 130 of 750. 
We are on iteration: 131 of 750. 
We are on iteration: 132 of 750. 
We are on iteration: 133 of 750. 
We are on iteration: 134 of 750. 
We are on iteration: 135 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 154 of 750. 
We are on iteration: 155 of 750. 
We are on iteration: 156 of 750. 
We are on iteration: 157 of 750. 
We are on iteration: 158 of 750. 
We are on iteration: 159 of 750. 
We are on iteration: 160 of 750. 
We are on iteration: 161 of 750. 
We are on iteration: 162 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 181 of 750. 
We are on iteration: 182 of 750. 
We are on iteration: 183 of 750. 
We are on iteration: 184 of 750. 
We are on iteration: 185 of 750. 
We are on iteration: 186 of 750. 
We are on iteration: 187 of 750. 
We are on iteration: 188 of 750. 
We are on iteration: 189 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 199 of 750. 
We are on iteration: 200 of 750. 
We are on iteration: 201 of 750. 
We are on iteration: 202 of 750. 
We are on iteration: 203 of 750. 
We are on iteration: 204 of 750. 
We are on iteration: 205 of 750. 
We are on iteration: 206 of 750. 
We are on iteration: 207 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 217 of 750. 
We are on iteration: 218 of 750. 
We are on iteration: 219 of 750. 
We are on iteration: 220 of 750. 
We are on iteration: 221 of 750. 
We are on iteration: 222 of 750. 
We are on iteration: 223 of 750. 
We are on iteration: 224 of 750. 
We are on iteration: 225 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 235 of 750. 
We are on iteration: 236 of 750. 
We are on iteration: 237 of 750. 
We are on iteration: 238 of 750. 
We are on iteration: 239 of 750. 
We are on iteration: 240 of 750. 
We are on iteration: 241 of 750. 
We are on iteration: 242 of 750. 
We are on iteration: 243 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 253 of 750. 
We are on iteration: 254 of 750. 
We are on iteration: 255 of 750. 
We are on iteration: 256 of 750. 
We are on iteration: 257 of 750. 
We are on iteration: 258 of 750. 
We are on iteration: 259 of 750. 
We are on iteration: 260 of 750. 
We are on iteration: 261 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 271 of 750. 
We are on iteration: 272 of 750. 
We are on iteration: 273 of 750. 
We are on iteration: 274 of 750. 
We are on iteration: 275 of 750. 
We are on iteration: 276 of 750. 
We are on iteration: 277 of 750. 
We are on iteration: 278 of 750. 
We are on iteration: 279 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 289 of 750. 
We are on iteration: 290 of 750. 
We are on iteration: 291 of 750. 
We are on iteration: 292 of 750. 
We are on iteration: 293 of 750. 
We are on iteration: 294 of 750. 
We are on iteration: 295 of 750. 
We are on iteration: 296 of 750. 
We are on iteration: 297 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 307 of 750. 
We are on iteration: 308 of 750. 
We are on iteration: 309 of 750. 
We are on iteration: 310 of 750. 
We are on iteration: 311 of 750. 
We are on iteration: 312 of 750. 
We are on iteration: 313 of 750. 
We are on iteration: 314 of 750. 
We are on iteration: 315 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 325 of 750. 
We are on iteration: 326 of 750. 
We are on iteration: 327 of 750. 
We are on iteration: 328 of 750. 
We are on iteration: 329 of 750. 
We are on iteration: 330 of 750. 
We are on iteration: 331 of 750. 
We are on iteration: 332 of 750. 
We are on iteration: 333 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 343 of 750. 
We are on iteration: 344 of 750. 
We are on iteration: 345 of 750. 
We are on iteration: 346 of 750. 
We are on iteration: 347 of 750. 
We are on iteration: 348 of 750. 
We are on iteration: 349 of 750. 
We are on iteration: 350 of 750. 
We are on iteration: 351 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 361 of 750. 
We are on iteration: 362 of 750. 
We are on iteration: 363 of 750. 
We are on iteration: 364 of 750. 
We are on iteration: 365 of 750. 
We are on iteration: 366 of 750. 
We are on iteration: 367 of 750. 
We are on iteration: 368 of 750. 
We are on iteration: 369 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 379 of 750. 
We are on iteration: 380 of 750. 
We are on iteration: 381 of 750. 
We are on iteration: 382 of 750. 
We are on iteration: 383 of 750. 
We are on iteration: 384 of 750. 
We are on iteration: 385 of 750. 
We are on iteration: 386 of 750. 
We are on iteration: 387 of 750. 

We are on iteration: 389 of 750. 
We are on iteration: 390 of 750. 
We are on iteration: 391 of 750. 
We are on iteration: 392 of 750. 
We are on iteration: 393 of 750. 
We are on iteration: 394 of 750. 
We are on iteration: 395 of 750. 
We are on iteration: 396 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.2997266279600455, 2.286882233119855, 2.294217883142991, 2.296719528924457, 2.294766642186307, 2.280884465042594, 2.2961998406834754, 2.2982131232042358, 2.2954205555831195, 2.2895903828343003, 2.29878

We are on iteration: 398 of 750. 
We are on iteration: 399 of 750. 
We are on iteration: 400 of 750. 
We are on iteration: 401 of 750. 
We are on iteration: 402 of 750. 
We are on iteration: 403 of 750. 
We are on iteration: 404 of 750. 
We are on iteration: 405 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 406 of 750. 
We are on iteration: 407 of 750. 
We are on iteration: 408 of 750. 
We are on iteration: 409 of 750. 
We are on iteration: 410 of 750. 
We are on iteration: 411 of 750. 
We are on iteration: 412 of 750. 
We are on iteration: 413 of 750. 
We are on iteration: 414 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.2997266279600455, 2.28688223311985

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 415 of 750. 
We are on iteration: 416 of 750. 
We are on iteration: 417 of 750. 
We are on iteration: 418 of 750. 
We are on iteration: 419 of 750. 
We are on iteration: 420 of 750. 
We are on iteration: 421 of 750. 
We are on iteration: 422 of 750. 
We are on iteration: 423 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 424 of 750. 
We are on iteration: 425 of 750. 
We are on iteration: 426 of 750. 
We are on iteration: 427 of 750. 
We are on iteration: 428 of 750. 
We are on iteration: 429 of 750. 
We are on iteration: 430 of 750. 
We are on iteration: 431 of 750. 
We are on iteration: 432 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 433 of 750. 
We are on iteration: 434 of 750. 
We are on iteration: 435 of 750. 
We are on iteration: 436 of 750. 
We are on iteration: 437 of 750. 
We are on iteration: 438 of 750. 
We are on iteration: 439 of 750. 
We are on iteration: 440 of 750. 
We are on iteration: 441 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 442 of 750. 
We are on iteration: 443 of 750. 
We are on iteration: 444 of 750. 
We are on iteration: 445 of 750. 
We are on iteration: 446 of 750. 
We are on iteration: 447 of 750. 
We are on iteration: 448 of 750. 
We are on iteration: 449 of 750. 
We are on iteration: 450 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 451 of 750. 
We are on iteration: 452 of 750. 
We are on iteration: 453 of 750. 
We are on iteration: 454 of 750. 
We are on iteration: 455 of 750. 
We are on iteration: 456 of 750. 
We are on iteration: 457 of 750. 
We are on iteration: 458 of 750. 
We are on iteration: 459 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 460 of 750. 
We are on iteration: 461 of 750. 
We are on iteration: 462 of 750. 
We are on iteration: 463 of 750. 
We are on iteration: 464 of 750. 
We are on iteration: 465 of 750. 
We are on iteration: 466 of 750. 
We are on iteration: 467 of 750. 
We are on iteration: 468 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 469 of 750. 
We are on iteration: 470 of 750. 
We are on iteration: 471 of 750. 
We are on iteration: 472 of 750. 
We are on iteration: 473 of 750. 
We are on iteration: 474 of 750. 
We are on iteration: 475 of 750. 
We are on iteration: 476 of 750. 
We are on iteration: 477 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 478 of 750. 
We are on iteration: 479 of 750. 
We are on iteration: 480 of 750. 
We are on iteration: 481 of 750. 
We are on iteration: 482 of 750. 
We are on iteration: 483 of 750. 
We are on iteration: 484 of 750. 
We are on iteration: 485 of 750. 
We are on iteration: 486 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 487 of 750. 
We are on iteration: 488 of 750. 
We are on iteration: 489 of 750. 
We are on iteration: 490 of 750. 
We are on iteration: 491 of 750. 
We are on iteration: 492 of 750. 
We are on iteration: 493 of 750. 
We are on iteration: 494 of 750. 
We are on iteration: 495 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 496 of 750. 
We are on iteration: 497 of 750. 
We are on iteration: 498 of 750. 
We are on iteration: 499 of 750. 
We are on iteration: 500 of 750. 
We are on iteration: 501 of 750. 
We are on iteration: 502 of 750. 
We are on iteration: 503 of 750. 
We are on iteration: 504 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 505 of 750. 
We are on iteration: 506 of 750. 
We are on iteration: 507 of 750. 
We are on iteration: 508 of 750. 
We are on iteration: 509 of 750. 
We are on iteration: 510 of 750. 
We are on iteration: 511 of 750. 
We are on iteration: 512 of 750. 
We are on iteration: 513 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 514 of 750. 
We are on iteration: 515 of 750. 
We are on iteration: 516 of 750. 
We are on iteration: 517 of 750. 
We are on iteration: 518 of 750. 
We are on iteration: 519 of 750. 
We are on iteration: 520 of 750. 
We are on iteration: 521 of 750. 
We are on iteration: 522 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 523 of 750. 
We are on iteration: 524 of 750. 
We are on iteration: 525 of 750. 
We are on iteration: 526 of 750. 
We are on iteration: 527 of 750. 
We are on iteration: 528 of 750. 
We are on iteration: 529 of 750. 
We are on iteration: 530 of 750. 
We are on iteration: 531 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 532 of 750. 
We are on iteration: 533 of 750. 
We are on iteration: 534 of 750. 
We are on iteration: 535 of 750. 
We are on iteration: 536 of 750. 
We are on iteration: 537 of 750. 
We are on iteration: 538 of 750. 
We are on iteration: 539 of 750. 
We are on iteration: 540 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 541 of 750. 
We are on iteration: 542 of 750. 
We are on iteration: 543 of 750. 
We are on iteration: 544 of 750. 
We are on iteration: 545 of 750. 
We are on iteration: 546 of 750. 
We are on iteration: 547 of 750. 
We are on iteration: 548 of 750. 
We are on iteration: 549 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 550 of 750. 
We are on iteration: 551 of 750. 
We are on iteration: 552 of 750. 
We are on iteration: 553 of 750. 
We are on iteration: 554 of 750. 
We are on iteration: 555 of 750. 
We are on iteration: 556 of 750. 
We are on iteration: 557 of 750. 
We are on iteration: 558 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 559 of 750. 
We are on iteration: 560 of 750. 
We are on iteration: 561 of 750. 
We are on iteration: 562 of 750. 
We are on iteration: 563 of 750. 
We are on iteration: 564 of 750. 
We are on iteration: 565 of 750. 
We are on iteration: 566 of 750. 
We are on iteration: 567 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 568 of 750. 
We are on iteration: 569 of 750. 
We are on iteration: 570 of 750. 
We are on iteration: 571 of 750. 
We are on iteration: 572 of 750. 
We are on iteration: 573 of 750. 
We are on iteration: 574 of 750. 
We are on iteration: 575 of 750. 
We are on iteration: 576 of 750. 

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 577 of 750. 
We are on iteration: 578 of 750. 
We are on iteration: 579 of 750. 
We are on iteration: 580 of 750. 
We are on iteration: 581 of 750. 
We are on iteration: 582 of 750. 
We are on iteration: 583 of 750. 
We are on iteration: 584 of 750. 
We are on iteration: 585 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 586 of 750. 
We are on iteration: 587 of 750. 
We are on iteration: 588 of 750. 
We are on iteration: 589 of 750. 
We are on iteration: 590 of 750. 
We are on iteration: 591 of 750. 
We are on iteration: 592 of 750. 
We are on iteration: 593 of 750. 
We are on iteration: 594 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 595 of 750. 
We are on iteration: 596 of 750. 
We are on iteration: 597 of 750. 
We are on iteration: 598 of 750. 
We are on iteration: 599 of 750. 
We are on iteration: 600 of 750. 
We are on iteration: 601 of 750. 
We are on iteration: 602 of 750. 
We are on iteration: 603 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 604 of 750. 
We are on iteration: 605 of 750. 
We are on iteration: 606 of 750. 
We are on iteration: 607 of 750. 
We are on iteration: 608 of 750. 
We are on iteration: 609 of 750. 
We are on iteration: 610 of 750. 
We are on iteration: 611 of 750. 
We are on iteration: 612 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 613 of 750. 
We are on iteration: 614 of 750. 
We are on iteration: 615 of 750. 
We are on iteration: 616 of 750. 
We are on iteration: 617 of 750. 
We are on iteration: 618 of 750. 
We are on iteration: 619 of 750. 
We are on iteration: 620 of 750. 
We are on iteration: 621 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 622 of 750. 
We are on iteration: 623 of 750. 
We are on iteration: 624 of 750. 
We are on iteration: 625 of 750. 
We are on iteration: 626 of 750. 
We are on iteration: 627 of 750. 
We are on iteration: 628 of 750. 
We are on iteration: 629 of 750. 
We are on iteration: 630 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 631 of 750. 
We are on iteration: 632 of 750. 
We are on iteration: 633 of 750. 
We are on iteration: 634 of 750. 
We are on iteration: 635 of 750. 
We are on iteration: 636 of 750. 
We are on iteration: 637 of 750. 
We are on iteration: 638 of 750. 
We are on iteration: 639 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 640 of 750. 
We are on iteration: 641 of 750. 
We are on iteration: 642 of 750. 
We are on iteration: 643 of 750. 
We are on iteration: 644 of 750. 
We are on iteration: 645 of 750. 
We are on iteration: 646 of 750. 
We are on iteration: 647 of 750. 
We are on iteration: 648 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 649 of 750. 
We are on iteration: 650 of 750. 
We are on iteration: 651 of 750. 
We are on iteration: 652 of 750. 
We are on iteration: 653 of 750. 
We are on iteration: 654 of 750. 
We are on iteration: 655 of 750. 
We are on iteration: 656 of 750. 
We are on iteration: 657 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 658 of 750. 
We are on iteration: 659 of 750. 
We are on iteration: 660 of 750. 
We are on iteration: 661 of 750. 
We are on iteration: 662 of 750. 
We are on iteration: 663 of 750. 
We are on iteration: 664 of 750. 
We are on iteration: 665 of 750. 
We are on iteration: 666 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 667 of 750. 
We are on iteration: 668 of 750. 
We are on iteration: 669 of 750. 
We are on iteration: 670 of 750. 
We are on iteration: 671 of 750. 
We are on iteration: 672 of 750. 
We are on iteration: 673 of 750. 
We are on iteration: 674 of 750. 
We are on iteration: 675 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 676 of 750. 
We are on iteration: 677 of 750. 
We are on iteration: 678 of 750. 
We are on iteration: 679 of 750. 
We are on iteration: 680 of 750. 
We are on iteration: 681 of 750. 
We are on iteration: 682 of 750. 
We are on iteration: 683 of 750. 
We are on iteration: 684 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 685 of 750. 
We are on iteration: 686 of 750. 
We are on iteration: 687 of 750. 
We are on iteration: 688 of 750. 
We are on iteration: 689 of 750. 
We are on iteration: 690 of 750. 
We are on iteration: 691 of 750. 
We are on iteration: 692 of 750. 
We are on iteration: 693 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 694 of 750. 
We are on iteration: 695 of 750. 
We are on iteration: 696 of 750. 
We are on iteration: 697 of 750. 
We are on iteration: 698 of 750. 
We are on iteration: 699 of 750. 
We are on iteration: 700 of 750. 
We are on iteration: 701 of 750. 
We are on iteration: 702 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 703 of 750. 
We are on iteration: 704 of 750. 
We are on iteration: 705 of 750. 
We are on iteration: 706 of 750. 
We are on iteration: 707 of 750. 
We are on iteration: 708 of 750. 
We are on iteration: 709 of 750. 
We are on iteration: 710 of 750. 
We are on iteration: 711 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 712 of 750. 
We are on iteration: 713 of 750. 
We are on iteration: 714 of 750. 
We are on iteration: 715 of 750. 
We are on iteration: 716 of 750. 
We are on iteration: 717 of 750. 
We are on iteration: 718 of 750. 
We are on iteration: 719 of 750. 
We are on iteration: 720 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 721 of 750. 
We are on iteration: 722 of 750. 
We are on iteration: 723 of 750. 
We are on iteration: 724 of 750. 
We are on iteration: 725 of 750. 
We are on iteration: 726 of 750. 
We are on iteration: 727 of 750. 
We are on iteration: 728 of 750. 
We are on iteration: 729 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 730 of 750. 
We are on iteration: 731 of 750. 
We are on iteration: 732 of 750. 
We are on iteration: 733 of 750. 
We are on iteration: 734 of 750. 
We are on iteration: 735 of 750. 
We are on iteration: 736 of 750. 
We are on iteration: 737 of 750. 
We are on iteration: 738 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 739 of 750. 
We are on iteration: 740 of 750. 
We are on iteration: 741 of 750. 
We are on iteration: 742 of 750. 
We are on iteration: 743 of 750. 
We are on iteration: 744 of 750. 
We are on iteration: 745 of 750. 
We are on iteration: 746 of 750. 
We are on iteration: 747 of 750. 
Loss History: [2.3035770659078723, 2.2964817762981578, 2.3037534723627178, 2.301452327338059, 2.3008701375723266, 2.301060789222587, 2.299561655766926, 2.3041011660385755, 2.3022084521788138, 2.3039958630344266, 2.299885583405895, 2.3051569825431137, 2.2988826042886217, 2.2986564486669847, 2.2952588477953855, 2.297918658795827, 2.296803073411219, 2.3006011135158957, 2.3005363853098233, 2.2941297368960414, 2.3035534094387993, 2.2967907602549196, 2.2979667882169657, 2.29894407160088, 2.3008375284366247, 2.295489267977041, 2.299726627960

[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5 5 5 5 5 5 5 5 5]
[5 5]
[0 3]
  Train acc: 0.22, Val acc: 0.0
We are on iteration: 748 of 750. 
We are on iteration: 749 of 750. 
We are on iteration: 750 of 750. 


In [None]:
# SGD
sgd = ConvNet4(input_shape=input_shape, wt_scale=wt_scale, verbose=False)
sgd.compile('sgd')
sgd.fit(x_dev, y_dev, x_val, y_val, mini_batch_sz=mini_batch_sz, n_epochs=30)

**Question 3**: Why does decreasing the mini-batch size make the loss print-outs more erratic?



### 6d) Evaluate the different optimizers

Make 2 "high quality" plots showing the following:

- Plot the accuracy (y axis) for the three optimizers as a function of training epoch (x axis).
- Plot the loss (y axis) for the three optimizers as a function of training iteration (x axis).

A high quality plot consists of:
- A useful title
- X and Y axis labels
- A legend
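
A minimal sketch of a helper that satisfies the checklist above. The function name `plot_curves` and the dictionary of histories are assumptions made for illustration, and the numbers are placeholders, not real results; substitute the histories recorded by your own networks.

```python
import matplotlib.pyplot as plt


def plot_curves(histories, xlabel, ylabel, title):
    """Plot one labeled curve per optimizer on a shared set of axes.

    histories: dict mapping a legend label (str) to a sequence of values.
    Hypothetical helper, not part of the project code.
    """
    fig, ax = plt.subplots(figsize=(8, 6))
    for name, values in histories.items():
        # x values start at 1 so the first recorded value is epoch/iteration 1
        ax.plot(range(1, len(values) + 1), values, label=name)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.set_title(title)
    ax.legend()
    return fig, ax


# Placeholder values only, to show the call; replace with your recorded histories.
example_histories = {
    'SGD': [0.10, 0.12, 0.15],
    'SGD_Momentum': [0.10, 0.20, 0.35],
    'Adam': [0.10, 0.40, 0.80],
}
fig, ax = plot_curves(example_histories, xlabel='Training epoch',
                      ylabel='Training accuracy',
                      title='Training accuracy by optimizer')
```

Calling the same helper a second time with the loss histories and `xlabel='Training iteration'` would produce the second plot.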

**Question 4**: Which optimizer works best and why do you think it is best?

**Question 5**: What is happening with the training set accuracy and why?

## Task 7: Training convolutional neural network on STL-10

### 7a) Load in STL-10 at 32x32 resolution

In [None]:
# Download the STL-10 dataset from the internet, convert it to Numpy ndarray, resize to 32x32
# cache it locally on your computer for faster loading next time.
load_stl10_dataset.purge_cached_dataset()
stl_imgs, stl_labels = load_stl10_dataset.load(scale_fact=3)
# preprocess
stl_imgs, stl_labels = preprocess_data.preprocess_stl(stl_imgs, stl_labels)
# create splits
x_train, y_train, x_test, y_test, x_val, y_val, x_dev, y_dev = preprocess_data.create_splits(
    stl_imgs, stl_labels, n_train_samps=4548, n_test_samps=400, n_valid_samps=2, n_dev_samps=50)

print('Train data shape: ', x_train.shape)
print('Train labels shape: ', y_train.shape)
print('Test data shape: ', x_test.shape)
print('Test labels shape: ', y_test.shape)
print('Validation data shape: ', x_val.shape)
print('Validation labels shape: ', y_val.shape)
print('dev data shape: ', x_dev.shape)
print('dev labels shape: ', y_dev.shape)

classes = np.loadtxt(os.path.join('data', 'stl10_binary', 'class_names.txt'), dtype=str)

### 7b) Set up accelerated convolution and max pooling layers

As you may have noticed, we had to downsize STL-10 to 16x16 resolution to train the network on the dev set (N=50) in a reasonable amount of time. The training set is N=4000; how will we ever manage to process that amount of data!?

On one hand, this is an unfortunate inevitable reality of working with large ("big") datasets: you can easily find a dataset that is too time consuming to process for any computer, despite how fast/many CPU/GPUs it has.

On the other hand, we can do better for this project and STL-10 :) If you were to time (profile) different parts of the training process, you'd notice that the largest bottleneck is the convolution and max pooling operations (both forward and backward). You implemented those operations intuitively, which does not always yield the best performance. **By swapping out forward/backward convolution and max pooling for implementations that use different algorithms (im2col, reshaping) and are compiled to C code, we will speed up training by several orders of magnitude**.
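
To illustrate the idea behind im2col, here is a pure-NumPy sketch. It is not the provided Cython implementation (whose internals are not shown here), and the function names are hypothetical. It assumes stride 1, no padding, and cross-correlation (no kernel flip). Every receptive field is unrolled into a column, so the whole convolution reduces to a single matrix multiplication.

```python
import numpy as np


def im2col(x, k):
    """Unroll all k x k patches of x (N, C, H, W) into columns.

    Returns an array of shape (C*k*k, N*out_h*out_w). Stride 1, no padding.
    """
    n, c, h, w = x.shape
    out_h, out_w = h - k + 1, w - k + 1
    cols = np.empty((c * k * k, n * out_h * out_w), dtype=x.dtype)
    row = 0
    for ch in range(c):
        for i in range(k):
            for j in range(k):
                # Entry (i, j) of channel ch for every patch in every image
                patch = x[:, ch, i:i + out_h, j:j + out_w]
                cols[row] = patch.reshape(-1)
                row += 1
    return cols


def conv2d_im2col(x, wts):
    """Cross-correlate x (N, C, H, W) with wts (K, C, k, k) via one matmul."""
    n, c, h, w = x.shape
    n_kers, _, k, _ = wts.shape
    out_h, out_w = h - k + 1, w - k + 1
    cols = im2col(x, k)                     # (C*k*k, N*out_h*out_w)
    out = wts.reshape(n_kers, -1) @ cols    # (K, N*out_h*out_w)
    # Reorder to the usual (N, K, out_h, out_w) layout
    return out.reshape(n_kers, n, out_h, out_w).transpose(1, 0, 2, 3)
```

The loops here run over kernel entries (`C*k*k` iterations) rather than over every output pixel, which is where most of the savings come from even before compiling to C.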

Follow these steps to substitute in the "accelerated" convolution and max pooling layers.

- Install the `cython` python package: `pip3 install cython` (or `pip3 install cython --user` if working in Davis 102)
- Download the files `im2col_cython.pyx`, `accelerated_layer.py`, and `setup.py` from the project website. Put them in your base project folder.
- Open a terminal and `cd` to the project directory.
- Compile the im2col functions: `python3 setup.py build_ext --inplace`. A `.c` and `.so` file should have appeared in your project folder.
- Restart the Jupyter Notebook kernel.
- Create a class called `ConvNet4Accel` in `network.py` by copy-pasting the contents of `ConvNet4`. Import `accelerated_layer` at the top and replace the `Conv2D` and `MaxPool2D` layers with `Conv2DAccel` and `MaxPool2DAccel`.

### 7c) Training convolutional neural network on STL-10

You are now ready to train on the entire training set.

- Create a `ConvNet4Accel` object with hyperparameters of your choice.
- Your goal is to achieve 45% accuracy on the test and/or validation set.

Notes:

- I suggest using your intuition about hyperparameters and over/underfitting to guide your choice, rather than a grid search. This should not be overly challenging.
- Use the best / most efficient optimizer based on your prior analysis.
- It should take on the order of 1 sec per training iteration. If that's way off, seek help, as something could be wrong with running the accelerated code.

In [None]:
from network import ConvNet4Accel

### 7d) Analysis of STL-10 training quality

Use your trained network that achieves 45%+ accuracy on the test set to make "high quality" plots showing the following:

- Plot the accuracy of the training and validation sets as a function of training epoch. You may have to convert iterations to epochs.
- Plot the loss as a function of training iteration.
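
One way to do the iteration-to-epoch conversion is sketched below. It assumes accuracy was recorded every `acc_freq` iterations and that one epoch is `ceil(n_samps / mini_batch_sz)` iterations. The helper name and these recording conventions are assumptions, so adjust them to match how your `fit` method stores its histories.

```python
import numpy as np


def acc_history_epochs(n_recorded, acc_freq, n_samps, mini_batch_sz):
    """Return the (fractional) epoch at which each accuracy value was recorded.

    Assumes the i-th value (i = 0, 1, ...) was recorded after iteration
    (i + 1) * acc_freq. Hypothetical helper, not part of the project code.
    """
    iters_per_epoch = int(np.ceil(n_samps / mini_batch_sz))
    iters = (np.arange(n_recorded) + 1) * acc_freq
    return iters / iters_per_epoch


# Example: 4000 samples in batches of 100 gives 40 iterations per epoch,
# so values recorded every 20 iterations fall at 0.5, 1.0, 1.5, 2.0 epochs.
epochs = acc_history_epochs(n_recorded=4, acc_freq=20, n_samps=4000, mini_batch_sz=100)
```

The returned array can be passed as the x values when plotting an accuracy history so the axis reads in epochs.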

In [None]:
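plt.plot(netT.validation_acc_history, label='Validation')
plt.plot(netT.train_acc_history, label='Training')
plt.ylabel('Accuracy')
plt.title('STL-10 accuracy during training')
plt.legend()
plt.show()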

In [None]:
plt.plot(netT.loss_history)
plt.xlabel('Training iteration')
plt.ylabel('Loss')
plt.show()

### 7f) Visualize layer weights

Run the following code and submit the inline image of the weight visualization of the 1st layer (convolutional layer) of the network.

**Note:**
- Setting the optional parameter `saveFig` to `True` will save a .PNG file of your weights in your project folder. I'd suggest setting it to `False` unless you have looked at your weights and they look like they are worth saving. You don't want a training run that produces undesirable weights to overwrite your good-looking results!

In [None]:
def plot_weights(wts, saveFig=True, filename='convWts_adam_overfit.png'):
    grid_sz = int(np.sqrt(len(wts)))
    plt.figure(figsize=(10,10))
    for x in range(grid_sz):
        for y in range(grid_sz):
            lin_ind = np.ravel_multi_index((x, y), dims=(grid_sz, grid_sz))
            plt.subplot(grid_sz, grid_sz, lin_ind+1)
            currImg = wts[lin_ind]
            low, high = np.min(currImg), np.max(currImg)
            currImg = 255*(currImg - low) / (high - low)
            currImg = currImg.astype('uint8')
            plt.imshow(currImg)
            plt.gca().axis('off')
    if saveFig:
        plt.savefig(filename)  # use the filename argument instead of a hard-coded path
    plt.show()

In [None]:
# Substitute your trained network below
# netT is my network's name
# You shouldn't see RGB noise
plot_weights(netT.layers[0].wts.transpose(0, 2, 3, 1), saveFig=False, filename='convWts_adam_train_20epoch.png')

**Question 6:** What do the learned filters look like? Does this make sense to you / is this what you expected? In which area of the brain do these filters resemble cell receptive fields?

Note: you should not see RGB "noise". If you do, and you pass the "overfit" test with the Adam optimizer, you probably need to increase the number of training epochs.

## Extensions

**General advice:** When making modifications for extensions, make small changes, then check to make sure you pass test code. Also, test out the network runtime on small examples before/after the changes. If you're not careful, the simulation time can become intractable really quickly!

**Remember:** One thorough extension usually is worth more than several "shallow" extensions.

### 0. Pedal to the metal: achieve high accuracy on STL-10

You can achieve higher (>50%) classification accuracy on the STL-10 test set. Find the hyperparameters to achieve this.

### 1. Experiment with different network architectures.

The design of the `Network` class is modular. As long as you're careful about shapes, adding/removing network layers (e.g. `Conv2D`, `Dense`, etc.) should be straightforward. Experiment with adding another sequence of `Conv2D` and `MaxPool2D` layers. Add another `Dense` hidden layer before the output layer. How do the changes affect classification accuracy and loss?

### 2. Experiment with different network hyperparameters.

Explore the effect that one or more of the changes below has on classification. Be careful about how the hyperparameters may affect the shape of network layers. A thorough analysis will get you more points than trying a few ad hoc values.

- Experiment with different numbers of hidden units in the Dense layers.
- Experiment with different max pooling window sizes and strides.
- Experiment with kernel sizes (not 7x7). Can you get away with smaller ones? Do they perform just as well? What is the change in runtime like? What is the impact on their visualized appearance?
- Experiment with the number of kernels in the convolutional layer. Is more/fewer better? What is the impact on their visualized appearance?

### 3. Add and test some training bells and whistles

Add features like early stopping, learning rate decay (the learning rate at the end of an epoch becomes some fraction of its former value), etc., and assess how they affect training loss convergence and accuracy.
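
Both features can be prototyped as small standalone functions before wiring them into the training loop. The sketch below is only one possible formulation: the function names, the multiplicative decay rule, and the patience-based stopping criterion on validation loss are assumptions, not part of the project code.

```python
def decayed_lr(lr0, decay, epoch):
    """Learning rate after `epoch` completed epochs of multiplicative decay.

    Each epoch multiplies the rate by `decay` (e.g. 0.9).
    """
    return lr0 * decay ** epoch


def should_stop(val_losses, patience):
    """Return True if validation loss has not improved in the last `patience` checks.

    val_losses: list of validation losses, one per check, oldest first.
    """
    if len(val_losses) <= patience:
        return False  # not enough history yet
    best_before = min(val_losses[:-patience])
    # Stop when none of the recent checks beat the earlier best
    return min(val_losses[-patience:]) >= best_before
```

In a training loop, one would call `should_stop` after each validation check and break out of the loop when it returns `True`, and set the optimizer's learning rate from `decayed_lr` at the end of each epoch.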

### 4. Additional optimizers

Research other optimizers used in backpropagation and implement one or more of them within the model structure. Compare their performance to the ones you have already implemented.
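
As one example of an optimizer not covered earlier, RMSProp keeps a running average of squared gradients, $c_t = \rho \times c_{t-1} + (1 - \rho) \times dw^2$, and scales the step by its square root: $w_t = w_{t-1} - \eta \times dw / (\sqrt{c_t} + \epsilon)$. The sketch below is standalone; the class name, attribute names, and default hyperparameters are assumptions and do not follow the interface of `optimizer.py`.

```python
import numpy as np


class RMSPropSketch:
    """Standalone RMSProp update rule (hypothetical class, for illustration)."""

    def __init__(self, lr=0.001, rho=0.9, eps=1e-8):
        self.lr, self.rho, self.eps = lr, rho, eps
        self.cache = None  # running average of squared gradients

    def update_weights(self, wts, d_wts):
        """Return the updated weights given the current weights and gradient."""
        if self.cache is None:
            self.cache = np.zeros_like(wts)
        self.cache = self.rho * self.cache + (1 - self.rho) * d_wts ** 2
        return wts - self.lr * d_wts / (np.sqrt(self.cache) + self.eps)


# Usage: minimize f(w) = w^2, whose gradient is 2w.
opt = RMSPropSketch(lr=0.01)
w = np.array([1.0])
for _ in range(500):
    w = opt.update_weights(w, 2 * w)
```

Because the step is divided by the gradient's running magnitude, the effective step size stays near the learning rate regardless of the gradient's scale.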

### 5. Optimize your algorithms

Find the main performance bottlenecks in the network and improve your code to reduce runtime (e.g. reduce explicit for loops, increase vectorization, etc.). Research faster algorithms for operations like convolution and implement them. Given the complexity of the network, I suggest focusing on one area at a time and making sure everything you change passes the test code before proceeding. Quantify and discuss your performance improvements.

### 6. Additional loss functions

Implement support for sigmoid or another activation function and its associated loss. Test it out and compare with softmax/cross entropy. Make sure any necessary changes to the layer's gradient are made.

### 7. Additional datasets

Do classification and analyze the results with an image dataset of your choice.

### 8. Performance analysis

Do a thorough comparative analysis of the non-accelerated and accelerated networks with respect to runtime.
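
A runtime comparison is more convincing when each measurement is repeated. The helper below is a minimal sketch using the standard library's `time.perf_counter`; the name `time_call` is hypothetical, and `np.sort` stands in for whichever network method you want to time (for example, a forward pass of each network on the same mini-batch).

```python
import time
import numpy as np


def time_call(fn, *args, n_repeats=5, **kwargs):
    """Return (mean, std) wall-clock seconds over n_repeats calls of fn."""
    durations = []
    for _ in range(n_repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        durations.append(time.perf_counter() - start)
    return float(np.mean(durations)), float(np.std(durations))


# Stand-in workload; replace with calls into your two networks.
data = np.random.default_rng(0).normal(size=100_000)
mean_s, std_s = time_call(np.sort, data, n_repeats=3)
print(f'{mean_s:.6f} s +/- {std_s:.6f} s')
```

Timing both networks on identical inputs and reporting the ratio of the means gives the speedup factor.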