The active/lazy training paper does not attempt to trigger the active or lazy regime via initialization. Instead, it applies a scaling factor alpha to the loss function.
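
Written out, the scaling described in the comment of the training cell (cross-entropy evaluated on alpha-scaled outputs, divided by alpha squared) takes roughly the following form. The notation is an assumption introduced here for illustration, not the paper's: $f_w(x)$ denotes the network's pre-softmax outputs, $\ell$ the cross-entropy, $J_w(x)$ the Jacobian of $f_w(x)$ with respect to the weights, and $e_y$ the one-hot vector of the label.

```latex
L_\alpha(w) = \frac{1}{\alpha^{2}}\,\ell\big(\alpha f_w(x),\, y\big),
\qquad
\nabla_w L_\alpha(w) = \frac{1}{\alpha}\, J_w(x)^{\top}
\big(\operatorname{softmax}(\alpha f_w(x)) - e_y\big)
```

Under these assumptions the gradient carries an explicit factor of $1/\alpha$, so for a fixed learning rate a larger alpha gives smaller weight updates.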

This approach proved more fruitful than my earlier attempts to observe active/lazy training by controlling the variance of the initialization values.
I applied the scaling method from the paper’s CIFAR10 CNN experiment to a single-hidden-layer neural network. The general observations remain the same:
1. The l2-norm of the weight movement away from the initialization values increases as the scaling factor alpha decreases, which is consistent with the theoretical assumptions (large alpha corresponds to the lazy regime).
2. There is a threshold value of alpha at which the weights begin to move. Scaling factors larger than this threshold produce little or no weight movement, whereas scaling factors below it produce increasing movement as alpha decreases further.
3. As alpha decreases with other factors held constant, model accuracy peaks at a point past the threshold at which the weights begin to move (not at the threshold itself). Beyond this peak, accuracy deteriorates to that of random selection (around 0.1 on both the training and dev sets for CIFAR10). The results also show that, as alpha continues to decrease, the weight movement from initialization drops temporarily after the performance peak before resuming a monotonic increase at a growing rate.
4. Learning rate and (likely) model size and architecture influence not only model performance but also where the active/lazy threshold occurs. The experimental results show that:
   - Larger models see a later performance peak (with respect to decreasing alpha), slower performance deterioration, and a slower increase in weight movement.
   - Holding other factors constant, decreasing the learning rate led to:
     - a smaller active/lazy threshold alpha (the scaling factor at which weights begin to move during training)
     - a slower increase in weight movement as alpha decreases
     - a later performance peak as alpha decreases
     - slower performance deterioration as alpha decreases
   - Holding other factors constant, increasing the hidden layer width led to:
     - unsurprisingly, better accuracy
     - a faster increase in weight movement as alpha decreases
5. While the paper’s code produced reasonable accuracies at large alpha values (the lazy regime), my two-layer network performs poorly under lazy training: both training and dev accuracy sit at the random-selection level of around 0.1. This difference may be caused by the major differences in model architecture (the paper implements a variation of the VGG ConvNet).

The next step is to integrate this scaling into the ConvNet architecture used in Sam’s double descent experiments.



In [3]:
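import time  # used by the epoch callback for run-time reporting

import tensorflow as tf
print(tf.__version__)
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras import Model
import matplotlib.pyplot as plt
from IPython.display import display  # used by the epoch callback for in-place status updates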

2.4.1


In [4]:
DATA = 'CIFAR10'
NORMALIZE = True

In [5]:
if DATA == 'MNIST':
  mnist = tf.keras.datasets.mnist
  (x_train, y_train), (x_test, y_test) = mnist.load_data()
if DATA == 'CIFAR10':
  cifar10 = tf.keras.datasets.cifar10
  (x_train, y_train), (x_test, y_test) = cifar10.load_data()
if NORMALIZE:
  x_train, x_test = x_train / 255.0, x_test / 255.0

In [6]:
class CustomCallback_epoch(tf.keras.callbacks.Callback):
    def __init__(self):
        super().__init__()
        self.dh = display('SAVING INITIAL WEIGHT VALUES', display_id=True)
        self.start_time = time.perf_counter()

    def on_train_begin(self, logs=None):
        global weight_history
        # print('SAVING INITIAL WEIGHT VALUES')
        weight_history.append(self.model.get_weights())

    def on_epoch_end(self, epoch, logs=None):
        global weight_history
        # list of weight tensors
        curr_weight = self.model.get_weights()
        if weight_history:
          weight_change = [curr_weight[i] - weight_history[0][i] for i in range(len(curr_weight))]
          norm_delta = [tf.norm(t, ord=2).numpy() for t in weight_change]
          # print('L2 NORM OF WEIGHT CHANGE RELATIVE TO INITIAL VALUES: ', norm_delta)
          weight_history.append(norm_delta)

        end_time = time.perf_counter()
        run_time = end_time - self.start_time
        hrs, mnts, secs = int(run_time // 60 // 60), int(run_time // 60 % 60), int(run_time % 60)

        template = 'Epoch: {:04}, Total Run Time: {:02}:{:02}:{:02}'
        template += ' - Loss: {:.4e}, Accuracy: {:.3f}, Test Loss: {:.4e}, Test Accuracy: {:.3f}'
        template += ' - L2 Norm of Weight Movement From Initialization: %s' %str(norm_delta)

        train_loss, train_accuracy = logs['loss'], logs['accuracy']
        test_loss, test_accuracy = logs['val_loss'], logs['val_accuracy']
        self.dh.update(template.format(epoch, hrs, mnts, secs, train_loss, train_accuracy, test_loss, test_accuracy))
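
The four numbers the callback reports per epoch correspond to the four arrays returned by `get_weights()` for two Dense layers (hidden kernel, hidden bias, output kernel, output bias). With no `axis` argument, `tf.norm` treats each tensor as one flattened vector, so each number is the Euclidean (Frobenius) norm of that tensor's change. A minimal pure-Python sketch of the same quantity, with hypothetical helper names `flatten` and `l2_movement` and toy weights chosen only for illustration:

```python
import math

def flatten(tensor):
    """Yield the scalar entries of an arbitrarily nested list."""
    for item in tensor:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item

def l2_movement(initial_weights, current_weights):
    """Per-tensor L2 norm of (current - initial), one value per weight tensor."""
    return [
        math.sqrt(sum((c - i) ** 2 for c, i in zip(flatten(cur), flatten(init))))
        for init, cur in zip(initial_weights, current_weights)
    ]

# Toy example: one 2x2 kernel and one bias vector of length 2.
initial = [[[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]]
current = [[[3.0, 0.0], [0.0, 4.0]], [0.0, 0.0]]
print(l2_movement(initial, current))  # [5.0, 0.0]
```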
  

In [7]:
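# some parameters in the active/lazy paper:
# optimizer: sgd with momentum=0.9
# initialization: kernel - xavier (glorot) normal, bias - zeros
#                 (the Dense layers below keep the Keras defaults: glorot_uniform kernel, zeros bias)
# loss: both (sparse) categorical crossentropy and mse are provided as options, using cce here
# scaling: implemented in the paper repo's train.py as loss = torch.nn.CrossEntropyLoss(alpha*outputs, targets)/alpha**2
#          implemented here with the scaled_custom_loss() loss function; note that it scales the softmax
#          outputs of the model (from_logits is left at its default of False), whereas torch's
#          CrossEntropyLoss expects unnormalized logits, so the two are not strictly equivalent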

def train(alpha, epoch, opt, lr, scaling=True, width=64):
    global weight_history
    weight_history = []

    def scaled_custom_loss(y_actual, y_pred):
        sce = tf.keras.losses.SparseCategoricalCrossentropy()
        scaled_sce = sce(y_actual, alpha*y_pred)/alpha**2
        return scaled_sce

    if DATA == 'CIFAR10':
      flatten = tf.keras.layers.Flatten(input_shape=(32, 32, 3))

    if DATA == 'MNIST':
      flatten = tf.keras.layers.Flatten(input_shape=(28, 28))

    model = tf.keras.models.Sequential([
      flatten,
      tf.keras.layers.Dense(width, activation='relu'),
      tf.keras.layers.Dense(10, activation='softmax')
    ])
    # model.summary()
  
    if opt=='adam':
      # default lr for Adam is 0.001
      opt = tf.keras.optimizers.Adam(learning_rate=lr)
    elif opt=='sgd':
      # default lr for sgd is 0.01
      opt = tf.keras.optimizers.SGD(learning_rate=lr, momentum=0.9)
    else:
      raise Exception('optimizer must be adam/sgd')

    model.compile(optimizer=opt,
                  loss=scaled_custom_loss if scaling else 'sparse_categorical_crossentropy',
                  metrics=['accuracy'])

    history = model.fit(x_train, y_train, epochs=epoch, validation_data=(x_test, y_test), callbacks=[CustomCallback_epoch()], verbose=0)
    
    print()
    print('max training accuracy', max(history.history['accuracy']))
    print('min training loss', min(history.history['loss']))
    print('max validation accuracy', max(history.history['val_accuracy']))
    print('min validation loss', min(history.history['val_loss']))

    # plot(history)
    # print(tf.math.confusion_matrix(y_test, tf.argmax(model.predict(x_test), axis=1)))
    print()
    print('l2-normed weight changes from initial values after last epoch:')
    print(weight_history[-1])

    return (weight_history[1:], history.history)
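
Because the model above ends in a softmax layer, `scaled_custom_loss` multiplies probabilities by alpha, while the paper's loss, as quoted in the cell comment, multiplies the outputs passed to a logit-based cross-entropy. The sketch below contrasts the two in plain Python. The probability path is only an assumed model of what Keras does when `from_logits=False` (clip to `[eps, 1 - eps]`, take the log, renormalise); this has not been checked against the TensorFlow 2.4.1 source. The function names and the example `probs` vector are hypothetical.

```python
import math

KERAS_EPS = 1e-7  # assumed value of tf.keras.backend.epsilon()

def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]

def scaled_ce_from_logits(logits, label, alpha):
    """CE(alpha * logits, label) / alpha**2, the form quoted for the paper's train.py."""
    return -log_softmax([alpha * z for z in logits])[label] / alpha ** 2

def scaled_ce_from_probs(probs, label, alpha, eps=KERAS_EPS):
    """Assumed model of scaling softmax outputs: alpha * p is clipped to
    [eps, 1 - eps], logged, and renormalised before the cross-entropy."""
    clipped = [min(max(alpha * p, eps), 1.0 - eps) for p in probs]
    return -log_softmax([math.log(c) for c in clipped])[label] / alpha ** 2

# Hypothetical 10-class prediction (sums to 1) and its matching logits.
probs = [0.30, 0.05, 0.05, 0.10, 0.05, 0.05, 0.20, 0.05, 0.10, 0.05]
logits = [math.log(p) for p in probs]

for alpha in (1.0, 10.0, 1e7):
    print(alpha,
          scaled_ce_from_logits(logits, 0, alpha),
          scaled_ce_from_probs(probs, 0, alpha))
```

Under this assumed clipping model, every scaled probability saturates at `1 - eps` once alpha is large, so the loss becomes `ln(10) / alpha**2` regardless of the prediction and carries no gradient. That would be one possible explanation to rule out before attributing chance-level accuracy at large alpha to the architecture alone.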

In [20]:
unscaled = train(alpha=None, epoch=10, opt='sgd', lr=0.01, scaling=False)

'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 1.6841e+00, Accuracy: 0.390, Test Loss: 1.7300e+00, Test Accuracy: 0.383 - L2 Norm of Weight Movement From Initialization: [13.935824, 3.1807551, 2.3657033, 2.7277188]'


max training accuracy 0.3919999897480011
min training loss 1.6840555667877197
max validation accuracy 0.40130001306533813
min validation loss 1.6713179349899292

l2-normed weight changes from initial values after last epoch:
[13.935824, 3.1807551, 2.3657033, 2.7277188]


In [13]:
lrs = [1.0, 0.1, 0.01, 0.001]
alphas = [10000000.0, 1000000.0, 100000.0, 10000.0, 1000.0, 100.0, 10.0, 5.0, 1.0, 0.5, 0.1, 0.01]
normed_weight_changes = {}
optimizer = 'sgd'
num_epochs = 10
for learning_rate in lrs:
  for alpha_val in alphas:
    print('='*80)
    print('opt = %s, lr = %f, alpha = %f' %(optimizer, learning_rate, alpha_val))
    print('='*80)
    normed_weight_changes[(optimizer, learning_rate, alpha_val)] = train(alpha=alpha_val, epoch=num_epochs, opt=optimizer, lr=learning_rate, scaling=True)

opt = sgd, lr = 1.000000, alpha = 10000000.000000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 2.3026e-14, Accuracy: 0.095, Test Loss: 2.3026e-14, Test Accuracy: 0.099 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09533999860286713
min training loss 2.3025665711129753e-14
max validation accuracy 0.09910000115633011
min validation loss 2.3025850364312254e-14

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 1.000000, alpha = 1000000.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 2.3026e-12, Accuracy: 0.106, Test Loss: 2.3026e-12, Test Accuracy: 0.107 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10605999827384949
min training loss 2.3025997341469262e-12
max validation accuracy 0.1071000024676323
min validation loss 2.302588675284767e-12

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 1.000000, alpha = 100000.000000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 2.3026e-10, Accuracy: 0.099, Test Loss: 2.3026e-10, Test Accuracy: 0.098 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09917999804019928
min training loss 2.3026328066499957e-10
max validation accuracy 0.09830000251531601
min validation loss 2.3025850670599368e-10

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 1.000000, alpha = 10000.000000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 2.3026e-08, Accuracy: 0.096, Test Loss: 2.3026e-08, Test Accuracy: 0.095 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09634000062942505
min training loss 2.3025949147381652e-08
max validation accuracy 0.09510000050067902
min validation loss 2.3025791051622946e-08

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 1.000000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:00:49 - Loss: 2.3026e-06, Accuracy: 0.081, Test Loss: 2.3026e-06, Test Accuracy: 0.076 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.08129999786615372
min training loss 2.302632083228673e-06
max validation accuracy 0.07580000162124634
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 1.000000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:00:51 - Loss: 2.1155e-04, Accuracy: 0.067, Test Loss: 2.1165e-04, Test Accuracy: 0.074 - L2 Norm of Weight Movement From Initialization: [1.1128045, 0.08482273, 0.41498986, 0.08937229]'


max training accuracy 0.09386000037193298
min training loss 0.00021154677961021662
max validation accuracy 0.09570000320672989
min validation loss 0.00021165492944419384

l2-normed weight changes from initial values after last epoch:
[1.1128045, 0.08482273, 0.41498986, 0.08937229]
opt = sgd, lr = 1.000000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:00:50 - Loss: 1.7628e-02, Accuracy: 0.354, Test Loss: 1.8057e-02, Test Accuracy: 0.342 - L2 Norm of Weight Movement From Initialization: [13.759092, 2.8705606, 2.5233994, 4.061022]'


max training accuracy 0.35429999232292175
min training loss 0.017628353089094162
max validation accuracy 0.3626999855041504
min validation loss 0.01752445288002491

l2-normed weight changes from initial values after last epoch:
[13.759092, 2.8705606, 2.5233994, 4.061022]
opt = sgd, lr = 1.000000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:00:49 - Loss: 9.2210e-02, Accuracy: 0.101, Test Loss: 9.2304e-02, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [7.808285, 0.594512, 2.9711738, 0.31939018]'


max training accuracy 0.1257999986410141
min training loss 0.09069974720478058
max validation accuracy 0.1225999966263771
min validation loss 0.09143058210611343

l2-normed weight changes from initial values after last epoch:
[7.808285, 0.594512, 2.9711738, 0.31939018]
opt = sgd, lr = 1.000000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 1.4506e+01, Accuracy: 0.100, Test Loss: 1.4506e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [48.145218, 1.4057947, 23.490149, 1.3475701]'


max training accuracy 0.10000000149011612
min training loss 14.499435424804688
max validation accuracy 0.10000000149011612
min validation loss 14.50627326965332

l2-normed weight changes from initial values after last epoch:
[48.145218, 1.4057947, 23.490149, 1.3475701]
opt = sgd, lr = 1.000000, alpha = 0.500000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 5.5530e+01, Accuracy: 0.100, Test Loss: 5.5530e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [493.2944, 15.856777, 65.584656, 14.882198]'


max training accuracy 0.10000000149011612
min training loss 55.50836944580078
max validation accuracy 0.10000000149011612
min validation loss 55.52981948852539

l2-normed weight changes from initial values after last epoch:
[493.2944, 15.856777, 65.584656, 14.882198]
opt = sgd, lr = 1.000000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 1.2434e+03, Accuracy: 0.100, Test Loss: 1.2434e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [8407.028, 260.91104, 1134.2736, 231.82428]'


max training accuracy 0.10000000149011612
min training loss 1242.8416748046875
max validation accuracy 0.10000000149011612
min validation loss 1243.396484375

l2-normed weight changes from initial values after last epoch:
[8407.028, 260.91104, 1134.2736, 231.82428]
opt = sgd, lr = 1.000000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [319761.8, 10169.653, 48352.297, 10697.181]'


max training accuracy 0.10000000149011612
min training loss 103569.46875
max validation accuracy 0.10000000149011612
min validation loss 103617.0703125

l2-normed weight changes from initial values after last epoch:
[319761.8, 10169.653, 48352.297, 10697.181]
opt = sgd, lr = 0.100000, alpha = 10000000.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 2.3026e-14, Accuracy: 0.101, Test Loss: 2.3026e-14, Test Accuracy: 0.101 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10051999986171722
min training loss 2.3025665711129753e-14
max validation accuracy 0.10140000283718109
min validation loss 2.3025850364312254e-14

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.100000, alpha = 1000000.000000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 2.3026e-12, Accuracy: 0.102, Test Loss: 2.3026e-12, Test Accuracy: 0.102 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.1020599976181984
min training loss 2.3025997341469262e-12
max validation accuracy 0.10170000046491623
min validation loss 2.302588675284767e-12

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.100000, alpha = 100000.000000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 2.3026e-10, Accuracy: 0.100, Test Loss: 2.3026e-10, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09963999688625336
min training loss 2.3026328066499957e-10
max validation accuracy 0.0997999981045723
min validation loss 2.3025850670599368e-10

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.100000, alpha = 10000.000000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 2.3026e-08, Accuracy: 0.106, Test Loss: 2.3026e-08, Test Accuracy: 0.106 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10638000071048737
min training loss 2.3025949147381652e-08
max validation accuracy 0.10589999705553055
min validation loss 2.3025791051622946e-08

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.100000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 2.3026e-06, Accuracy: 0.080, Test Loss: 2.3026e-06, Test Accuracy: 0.076 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.07953999936580658
min training loss 2.302632083228673e-06
max validation accuracy 0.07649999856948853
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.100000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 2.3026e-04, Accuracy: 0.099, Test Loss: 2.3026e-04, Test Accuracy: 0.096 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09901999682188034
min training loss 0.0002302551583852619
max validation accuracy 0.09629999846220016
min validation loss 0.0002302593638887629

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.100000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 1.6068e-02, Accuracy: 0.424, Test Loss: 1.6387e-02, Test Accuracy: 0.427 - L2 Norm of Weight Movement From Initialization: [6.025654, 0.7365643, 2.9212887, 0.60148376]'


max training accuracy 0.424019992351532
min training loss 0.01606808975338936
max validation accuracy 0.42669999599456787
min validation loss 0.016325337812304497

l2-normed weight changes from initial values after last epoch:
[6.025654, 0.7365643, 2.9212887, 0.60148376]
opt = sgd, lr = 0.100000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 6.2829e-02, Accuracy: 0.441, Test Loss: 6.4463e-02, Test Accuracy: 0.433 - L2 Norm of Weight Movement From Initialization: [10.955688, 1.6777126, 2.886862, 2.271945]'


max training accuracy 0.44102001190185547
min training loss 0.06282926350831985
max validation accuracy 0.44429999589920044
min validation loss 0.06360765546560287

l2-normed weight changes from initial values after last epoch:
[10.955688, 1.6777126, 2.886862, 2.271945]
opt = sgd, lr = 0.100000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 1.4506e+01, Accuracy: 0.100, Test Loss: 1.4506e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [21.721178, 0.8048843, 38.87243, 1.2128525]'


max training accuracy 0.10000000149011612
min training loss 14.489500999450684
max validation accuracy 0.10000000149011612
min validation loss 14.50627326965332

l2-normed weight changes from initial values after last epoch:
[21.721178, 0.8048843, 38.87243, 1.2128525]
opt = sgd, lr = 0.100000, alpha = 0.500000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 5.5530e+01, Accuracy: 0.100, Test Loss: 5.5530e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [30.639067, 1.3211002, 49.933365, 1.3894542]'


max training accuracy 0.10006000101566315
min training loss 55.49030685424805
max validation accuracy 0.10000000149011612
min validation loss 55.529842376708984

l2-normed weight changes from initial values after last epoch:
[30.639067, 1.3211002, 49.933365, 1.3894542]
opt = sgd, lr = 0.100000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 1.2434e+03, Accuracy: 0.100, Test Loss: 1.2434e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [581.12805, 21.489557, 101.42391, 23.787317]'


max training accuracy 0.10000000149011612
min training loss 1242.8460693359375
max validation accuracy 0.10000000149011612
min validation loss 1243.396728515625

l2-normed weight changes from initial values after last epoch:
[581.12805, 21.489557, 101.42391, 23.787317]
opt = sgd, lr = 0.100000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [78411.98, 2514.0466, 10091.065, 2961.561]'


max training accuracy 0.10000000149011612
min training loss 103578.46875
max validation accuracy 0.10000000149011612
min validation loss 103617.046875

l2-normed weight changes from initial values after last epoch:
[78411.98, 2514.0466, 10091.065, 2961.561]
opt = sgd, lr = 0.010000, alpha = 10000000.000000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 2.3026e-14, Accuracy: 0.100, Test Loss: 2.3026e-14, Test Accuracy: 0.101 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09978000074625015
min training loss 2.3025665711129753e-14
max validation accuracy 0.1006999984383583
min validation loss 2.3025850364312254e-14

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.010000, alpha = 1000000.000000


'Epoch: 0009, Total Run Time: 00:00:50 - Loss: 2.3026e-12, Accuracy: 0.103, Test Loss: 2.3026e-12, Test Accuracy: 0.108 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10344000160694122
min training loss 2.3025997341469262e-12
max validation accuracy 0.10809999704360962
min validation loss 2.302588675284767e-12

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.010000, alpha = 100000.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 2.3026e-10, Accuracy: 0.101, Test Loss: 2.3026e-10, Test Accuracy: 0.102 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10080000013113022
min training loss 2.3026328066499957e-10
max validation accuracy 0.10180000215768814
min validation loss 2.3025850670599368e-10

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.010000, alpha = 10000.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 2.3026e-08, Accuracy: 0.094, Test Loss: 2.3026e-08, Test Accuracy: 0.090 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.094480000436306
min training loss 2.3025949147381652e-08
max validation accuracy 0.09030000120401382
min validation loss 2.3025791051622946e-08

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.010000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 2.3026e-06, Accuracy: 0.098, Test Loss: 2.3026e-06, Test Accuracy: 0.097 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09839999675750732
min training loss 2.302632083228673e-06
max validation accuracy 0.09740000218153
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.010000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:00:44 - Loss: 2.3026e-04, Accuracy: 0.109, Test Loss: 2.3026e-04, Test Accuracy: 0.109 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.1092199981212616
min training loss 0.0002302551583852619
max validation accuracy 0.10909999907016754
min validation loss 0.0002302593638887629

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.010000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 1.9162e-02, Accuracy: 0.200, Test Loss: 1.9095e-02, Test Accuracy: 0.210 - L2 Norm of Weight Movement From Initialization: [1.69165, 0.14758249, 0.65157443, 0.14894329]'


max training accuracy 0.19957999885082245
min training loss 0.019161611795425415
max validation accuracy 0.2101999968290329
min validation loss 0.019095223397016525

l2-normed weight changes from initial values after last epoch:
[1.69165, 0.14758249, 0.65157443, 0.14894329]
opt = sgd, lr = 0.010000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 6.5288e-02, Accuracy: 0.420, Test Loss: 6.5596e-02, Test Accuracy: 0.421 - L2 Norm of Weight Movement From Initialization: [3.779057, 0.42760393, 1.8717364, 0.36189264]'


max training accuracy 0.4198800027370453
min training loss 0.0652879998087883
max validation accuracy 0.4214000105857849
min validation loss 0.06559550017118454

l2-normed weight changes from initial values after last epoch:
[3.779057, 0.42760393, 1.8717364, 0.36189264]
opt = sgd, lr = 0.010000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 1.7178e+00, Accuracy: 0.383, Test Loss: 1.7821e+00, Test Accuracy: 0.368 - L2 Norm of Weight Movement From Initialization: [13.247671, 2.9822862, 2.2227376, 3.1047251]'


max training accuracy 0.3833799958229065
min training loss 1.7177767753601074
max validation accuracy 0.3727000057697296
min validation loss 1.7324427366256714

l2-normed weight changes from initial values after last epoch:
[13.247671, 2.9822862, 2.2227376, 3.1047251]
opt = sgd, lr = 0.010000, alpha = 0.500000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 9.2223e+00, Accuracy: 0.099, Test Loss: 9.2176e+00, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [6.812647, 0.42942634, 2.6451435, 0.19117646]'


max training accuracy 0.10875999927520752
min training loss 9.167901039123535
max validation accuracy 0.12300000339746475
min validation loss 9.055254936218262

l2-normed weight changes from initial values after last epoch:
[6.812647, 0.42942634, 2.6451435, 0.19117646]
opt = sgd, lr = 0.010000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:00:49 - Loss: 1.2434e+03, Accuracy: 0.100, Test Loss: 1.2434e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [69.04085, 2.2613008, 7.2700424, 2.1453192]'


max training accuracy 0.10000000149011612
min training loss 1242.8170166015625
max validation accuracy 0.10000000149011612
min validation loss 1243.396728515625

l2-normed weight changes from initial values after last epoch:
[69.04085, 2.2613008, 7.2700424, 2.1453192]
opt = sgd, lr = 0.010000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [10157.753, 351.99518, 1353.956, 311.5324]'


max training accuracy 0.10000000149011612
min training loss 103575.953125
max validation accuracy 0.10000000149011612
min validation loss 103617.0703125

l2-normed weight changes from initial values after last epoch:
[10157.753, 351.99518, 1353.956, 311.5324]
opt = sgd, lr = 0.001000, alpha = 10000000.000000


'Epoch: 0009, Total Run Time: 00:00:44 - Loss: 2.3026e-14, Accuracy: 0.077, Test Loss: 2.3026e-14, Test Accuracy: 0.079 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.07699999958276749
min training loss 2.3025665711129753e-14
max validation accuracy 0.07909999787807465
min validation loss 2.3025850364312254e-14

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.001000, alpha = 1000000.000000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 2.3026e-12, Accuracy: 0.092, Test Loss: 2.3026e-12, Test Accuracy: 0.091 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09192000329494476
min training loss 2.3025997341469262e-12
max validation accuracy 0.09059999883174896
min validation loss 2.302588675284767e-12

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.001000, alpha = 100000.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 2.3026e-10, Accuracy: 0.085, Test Loss: 2.3026e-10, Test Accuracy: 0.086 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.0853400006890297
min training loss 2.3026328066499957e-10
max validation accuracy 0.08609999716281891
min validation loss 2.3025850670599368e-10

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.001000, alpha = 10000.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 2.3026e-08, Accuracy: 0.100, Test Loss: 2.3026e-08, Test Accuracy: 0.098 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.100040003657341
min training loss 2.3025949147381652e-08
max validation accuracy 0.09790000319480896
min validation loss 2.3025791051622946e-08

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.001000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 2.3026e-06, Accuracy: 0.095, Test Loss: 2.3026e-06, Test Accuracy: 0.096 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09474000334739685
min training loss 2.302632083228673e-06
max validation accuracy 0.09600000083446503
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.001000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 2.3026e-04, Accuracy: 0.114, Test Loss: 2.3026e-04, Test Accuracy: 0.117 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.11448000371456146
min training loss 0.0002302551583852619
max validation accuracy 0.11699999868869781
min validation loss 0.0002302593638887629

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.001000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 2.1392e-02, Accuracy: 0.087, Test Loss: 2.1353e-02, Test Accuracy: 0.090 - L2 Norm of Weight Movement From Initialization: [0.43189716, 0.016479632, 0.09886324, 0.02174516]'


max training accuracy 0.10029999911785126
min training loss 0.021392416208982468
max validation accuracy 0.10119999945163727
min validation loss 0.021353216841816902

l2-normed weight changes from initial values after last epoch:
[0.43189716, 0.016479632, 0.09886324, 0.02174516]
opt = sgd, lr = 0.001000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:00:49 - Loss: 7.5305e-02, Accuracy: 0.338, Test Loss: 7.5469e-02, Test Accuracy: 0.322 - L2 Norm of Weight Movement From Initialization: [1.2281338, 0.06536366, 0.4421587, 0.07780786]'


max training accuracy 0.3384999930858612
min training loss 0.07530481368303299
max validation accuracy 0.3253999948501587
min validation loss 0.0754685178399086

l2-normed weight changes from initial values after last epoch:
[1.2281338, 0.06536366, 0.4421587, 0.07780786]
opt = sgd, lr = 0.001000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 1.4605e+00, Accuracy: 0.488, Test Loss: 1.4940e+00, Test Accuracy: 0.468 - L2 Norm of Weight Movement From Initialization: [6.272486, 0.9100276, 2.6379566, 0.9102471]'


max training accuracy 0.48750001192092896
min training loss 1.4605473279953003
max validation accuracy 0.4677000045776367
min validation loss 1.4939736127853394

l2-normed weight changes from initial values after last epoch:
[6.272486, 0.9100276, 2.6379566, 0.9102471]
opt = sgd, lr = 0.001000, alpha = 0.500000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 6.1231e+00, Accuracy: 0.455, Test Loss: 6.2871e+00, Test Accuracy: 0.445 - L2 Norm of Weight Movement From Initialization: [10.945328, 1.7978578, 2.5358744, 2.3660274]'


max training accuracy 0.45451998710632324
min training loss 6.123098850250244
max validation accuracy 0.4447000026702881
min validation loss 6.286983966827393

l2-normed weight changes from initial values after last epoch:
[10.945328, 1.7978578, 2.5358744, 2.3660274]
opt = sgd, lr = 0.001000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 2.3104e+02, Accuracy: 0.099, Test Loss: 2.3058e+02, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [25.359215, 0.89406925, 17.915815, 0.25575876]'


max training accuracy 0.10255999863147736
min training loss 230.9173126220703
max validation accuracy 0.10000000149011612
min validation loss 230.57913208007812

l2-normed weight changes from initial values after last epoch:
[25.359215, 0.89406925, 17.915815, 0.25575876]
opt = sgd, lr = 0.001000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [524.8614, 22.775066, 73.9547, 19.422792]'


max training accuracy 0.10000000149011612
min training loss 103571.109375
max validation accuracy 0.10000000149011612
min validation loss 103617.0703125

l2-normed weight changes from initial values after last epoch:
[524.8614, 22.775066, 73.9547, 19.422792]
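
The per-layer norms logged above can be collapsed into one number per run, which makes the trend across alpha easier to read. The following is a minimal sketch, not the notebook's own code. It assumes that each logged list holds the L2 norm of the movement of one parameter tensor, so that the root of the sum of squares gives the L2 norm over all weights together. The names `total_movement`, `movement_by_alpha`, and `results` are hypothetical. The sample values are copied from the lr = 0.001 runs logged above, pairing each `opt = ..., alpha = ...` header with the output that follows it.

```python
import math

def total_movement(per_layer_norms):
    """Combine per-tensor L2 norms into the L2 norm over all parameters."""
    return math.sqrt(sum(n * n for n in per_layer_norms))

def movement_by_alpha(results, optimizer, lr):
    """Return (alpha, total movement) pairs for one optimizer and learning
    rate, ordered from the largest alpha to the smallest."""
    rows = [
        (alpha, total_movement(norms))
        for (opt, rate, alpha), norms in results.items()
        if opt == optimizer and rate == lr
    ]
    return sorted(rows, key=lambda row: row[0], reverse=True)

# Hypothetical container, keyed like (optimizer, lr, alpha).
# Values copied from the lr = 0.001 log lines above.
results = {
    ("sgd", 0.001, 5.0): [1.2281338, 0.06536366, 0.4421587, 0.07780786],
    ("sgd", 0.001, 1.0): [6.272486, 0.9100276, 2.6379566, 0.9102471],
    ("sgd", 0.001, 0.5): [10.945328, 1.7978578, 2.5358744, 2.3660274],
    ("sgd", 0.001, 0.1): [25.359215, 0.89406925, 17.915815, 0.25575876],
    ("sgd", 0.001, 0.01): [524.8614, 22.775066, 73.9547, 19.422792],
}

for alpha, moved in movement_by_alpha(results, "sgd", 0.001):
    print("alpha = %g, total movement = %.4f" % (alpha, moved))
```

For these five runs the combined movement grows at every step as alpha decreases, even though individual layers do not always move monotonically (the third entry drops slightly between alpha = 1 and alpha = 0.5).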


In [8]:
# Sweep every (learning rate, alpha) pair with plain SGD, 10 epochs per run.
lrs = [1.0, 0.1, 0.01, 0.001]
alphas = [20000.0, 1000.0, 100.0, 80.0, 60.0, 40.0, 30.0, 20.0, 10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0, 0.8, 0.6, 0.4, 0.2, 0.1, 0.07, 0.04, 0.01, 0.001]
# Maps (optimizer, learning rate, alpha) to the value returned by train().
normed_weight_changes = {}
optimizer = 'sgd'
num_epochs = 10
for learning_rate in lrs:
    for alpha_val in alphas:
        print('=' * 80)
        print('opt = %s, lr = %f, alpha = %f' % (optimizer, learning_rate, alpha_val))
        print('=' * 80)
        # train() is defined in an earlier cell of this notebook.
        normed_weight_changes[(optimizer, learning_rate, alpha_val)] = train(
            alpha=alpha_val, epoch=num_epochs, opt=optimizer,
            lr=learning_rate, scaling=True)

opt = sgd, lr = 1.000000, alpha = 20000.000000


'Epoch: 0009, Total Run Time: 00:01:01 - Loss: 5.7565e-09, Accuracy: 0.106, Test Loss: 5.7564e-09, Test Accuracy: 0.102 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10608000308275223
min training loss 5.756487286845413e-09
max validation accuracy 0.10189999639987946
min validation loss 5.7564477629057365e-09

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 1.000000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:01:03 - Loss: 2.3026e-06, Accuracy: 0.113, Test Loss: 2.3026e-06, Test Accuracy: 0.115 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.11259999871253967
min training loss 2.302632083228673e-06
max validation accuracy 0.1151999980211258
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 1.000000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:01:01 - Loss: 2.2452e-04, Accuracy: 0.131, Test Loss: 2.2451e-04, Test Accuracy: 0.140 - L2 Norm of Weight Movement From Initialization: [0.68474156, 0.06128573, 0.22278644, 0.060494028]'


max training accuracy 0.13106000423431396
min training loss 0.00022451694530900568
max validation accuracy 0.1396999955177307
min validation loss 0.00022451221593655646

l2-normed weight changes from initial values after last epoch:
[0.68474156, 0.06128573, 0.22278644, 0.060494028]
opt = sgd, lr = 1.000000, alpha = 80.000000


'Epoch: 0009, Total Run Time: 00:00:55 - Loss: 3.5977e-04, Accuracy: 0.103, Test Loss: 3.5978e-04, Test Accuracy: 0.101 - L2 Norm of Weight Movement From Initialization: [0.020813564, 0.00036608937, 0.002504848, 0.0002190831]'


max training accuracy 0.10345999896526337
min training loss 0.00035977360676042736
max validation accuracy 0.1014999970793724
min validation loss 0.0003597791073843837

l2-normed weight changes from initial values after last epoch:
[0.020813564, 0.00036608937, 0.002504848, 0.0002190831]
opt = sgd, lr = 1.000000, alpha = 60.000000


'Epoch: 0009, Total Run Time: 00:00:57 - Loss: 5.9156e-04, Accuracy: 0.070, Test Loss: 5.9200e-04, Test Accuracy: 0.070 - L2 Norm of Weight Movement From Initialization: [1.4477499, 0.18152007, 0.60584956, 0.17281352]'


max training accuracy 0.09888000041246414
min training loss 0.0005915614892728627
max validation accuracy 0.09700000286102295
min validation loss 0.0005919969989918172

l2-normed weight changes from initial values after last epoch:
[1.4477499, 0.18152007, 0.60584956, 0.17281352]
opt = sgd, lr = 1.000000, alpha = 40.000000


'Epoch: 0009, Total Run Time: 00:01:00 - Loss: 1.4016e-03, Accuracy: 0.092, Test Loss: 1.3739e-03, Test Accuracy: 0.084 - L2 Norm of Weight Movement From Initialization: [0.81815594, 0.0848096, 0.19500084, 0.08900814]'


max training accuracy 0.12008000165224075
min training loss 0.0014015964698046446
max validation accuracy 0.1324000060558319
min validation loss 0.0013739463174715638

l2-normed weight changes from initial values after last epoch:
[0.81815594, 0.0848096, 0.19500084, 0.08900814]
opt = sgd, lr = 1.000000, alpha = 30.000000


'Epoch: 0009, Total Run Time: 00:00:58 - Loss: 1.9624e-03, Accuracy: 0.224, Test Loss: 2.0118e-03, Test Accuracy: 0.259 - L2 Norm of Weight Movement From Initialization: [5.250973, 0.6217295, 2.7605786, 1.0026412]'


max training accuracy 0.22416000068187714
min training loss 0.0019623825792223215
max validation accuracy 0.2590999901294708
min validation loss 0.0019937672186642885

l2-normed weight changes from initial values after last epoch:
[5.250973, 0.6217295, 2.7605786, 1.0026412]
opt = sgd, lr = 1.000000, alpha = 20.000000


'Epoch: 0009, Total Run Time: 00:00:59 - Loss: 4.0943e-03, Accuracy: 0.423, Test Loss: 4.1619e-03, Test Accuracy: 0.423 - L2 Norm of Weight Movement From Initialization: [9.140864, 1.1357388, 3.683684, 0.96537554]'


max training accuracy 0.42309999465942383
min training loss 0.004094340838491917
max validation accuracy 0.42250001430511475
min validation loss 0.00416187196969986

l2-normed weight changes from initial values after last epoch:
[9.140864, 1.1357388, 3.683684, 0.96537554]
opt = sgd, lr = 1.000000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:00:57 - Loss: 1.8194e-02, Accuracy: 0.321, Test Loss: 1.8235e-02, Test Accuracy: 0.335 - L2 Norm of Weight Movement From Initialization: [11.933597, 2.267309, 2.408033, 3.4048178]'


max training accuracy 0.32133999466896057
min training loss 0.018193619325757027
max validation accuracy 0.3345000147819519
min validation loss 0.018234586343169212

l2-normed weight changes from initial values after last epoch:
[11.933597, 2.267309, 2.408033, 3.4048178]
opt = sgd, lr = 1.000000, alpha = 9.000000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 2.1922e-02, Accuracy: 0.350, Test Loss: 2.2702e-02, Test Accuracy: 0.311 - L2 Norm of Weight Movement From Initialization: [14.473484, 2.8185508, 2.8190112, 4.142594]'


max training accuracy 0.34964001178741455
min training loss 0.021921848878264427
max validation accuracy 0.3508000075817108
min validation loss 0.021948199719190598

l2-normed weight changes from initial values after last epoch:
[14.473484, 2.8185508, 2.8190112, 4.142594]
opt = sgd, lr = 1.000000, alpha = 8.000000


'Epoch: 0009, Total Run Time: 00:00:49 - Loss: 2.9038e-02, Accuracy: 0.300, Test Loss: 2.8680e-02, Test Accuracy: 0.324 - L2 Norm of Weight Movement From Initialization: [13.996995, 3.2468293, 2.584002, 2.650785]'


max training accuracy 0.3000600039958954
min training loss 0.029037686064839363
max validation accuracy 0.32429999113082886
min validation loss 0.028680093586444855

l2-normed weight changes from initial values after last epoch:
[13.996995, 3.2468293, 2.584002, 2.650785]
opt = sgd, lr = 1.000000, alpha = 7.000000


'Epoch: 0009, Total Run Time: 00:00:51 - Loss: 3.9344e-02, Accuracy: 0.248, Test Loss: 3.9371e-02, Test Accuracy: 0.253 - L2 Norm of Weight Movement From Initialization: [13.309275, 3.6144874, 1.8896875, 3.3773122]'


max training accuracy 0.24794000387191772
min training loss 0.03934365510940552
max validation accuracy 0.2581000030040741
min validation loss 0.039214398711919785

l2-normed weight changes from initial values after last epoch:
[13.309275, 3.6144874, 1.8896875, 3.3773122]
opt = sgd, lr = 1.000000, alpha = 6.000000


'Epoch: 0009, Total Run Time: 00:00:49 - Loss: 5.9076e-02, Accuracy: 0.175, Test Loss: 6.1692e-02, Test Accuracy: 0.179 - L2 Norm of Weight Movement From Initialization: [12.204843, 1.6011443, 2.6020687, 1.3189007]'


max training accuracy 0.19327999651432037
min training loss 0.05851173773407936
max validation accuracy 0.19740000367164612
min validation loss 0.056341566145420074

l2-normed weight changes from initial values after last epoch:
[12.204843, 1.6011443, 2.6020687, 1.3189007]
opt = sgd, lr = 1.000000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 9.2223e-02, Accuracy: 0.099, Test Loss: 9.2178e-02, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [5.362784, 0.19341625, 2.6222446, 0.19345966]'


max training accuracy 0.10496000200510025
min training loss 0.09189360588788986
max validation accuracy 0.10000000149011612
min validation loss 0.09215328097343445

l2-normed weight changes from initial values after last epoch:
[5.362784, 0.19341625, 2.6222446, 0.19345966]
opt = sgd, lr = 1.000000, alpha = 4.000000


'Epoch: 0009, Total Run Time: 00:00:52 - Loss: 1.4419e-01, Accuracy: 0.100, Test Loss: 1.4409e-01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [6.7772055, 0.22017655, 2.6528208, 0.23961352]'


max training accuracy 0.10481999814510345
min training loss 0.14395739138126373
max validation accuracy 0.10000000149011612
min validation loss 0.14400887489318848

l2-normed weight changes from initial values after last epoch:
[6.7772055, 0.22017655, 2.6528208, 0.23961352]
opt = sgd, lr = 1.000000, alpha = 3.000000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 1.6118e+00, Accuracy: 0.100, Test Loss: 1.6118e+00, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [16.435804, 0.5601467, 16.423609, 0.51729816]'


max training accuracy 0.100040003657341
min training loss 1.6092973947525024
max validation accuracy 0.10000000149011612
min validation loss 1.6118098497390747

l2-normed weight changes from initial values after last epoch:
[16.435804, 0.5601467, 16.423609, 0.51729816]
opt = sgd, lr = 1.000000, alpha = 2.000000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 3.6266e+00, Accuracy: 0.100, Test Loss: 3.6266e+00, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [29.530716, 1.1123738, 51.807373, 1.9005119]'


max training accuracy 0.10000000149011612
min training loss 3.622891664505005
max validation accuracy 0.10000000149011612
min validation loss 3.6265671253204346

l2-normed weight changes from initial values after last epoch:
[29.530716, 1.1123738, 51.807373, 1.9005119]
opt = sgd, lr = 1.000000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 1.4506e+01, Accuracy: 0.100, Test Loss: 1.4506e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [47.57411, 1.5923578, 22.187862, 1.6518625]'


max training accuracy 0.10000000149011612
min training loss 14.499715805053711
max validation accuracy 0.10000000149011612
min validation loss 14.50627326965332

l2-normed weight changes from initial values after last epoch:
[47.57411, 1.5923578, 22.187862, 1.6518625]
opt = sgd, lr = 1.000000, alpha = 0.800000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 2.2352e+01, Accuracy: 0.100, Test Loss: 2.2352e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [91.49532, 2.948711, 17.438267, 3.687691]'


max training accuracy 0.10000000149011612
min training loss 22.34185791015625
max validation accuracy 0.10000000149011612
min validation loss 22.352264404296875

l2-normed weight changes from initial values after last epoch:
[91.49532, 2.948711, 17.438267, 3.687691]
opt = sgd, lr = 1.000000, alpha = 0.600000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 3.9018e+01, Accuracy: 0.100, Test Loss: 3.9018e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [156.5725, 4.6003084, 17.242233, 4.8498206]'


max training accuracy 0.10000000149011612
min training loss 38.998226165771484
max validation accuracy 0.10000000149011612
min validation loss 39.018157958984375

l2-normed weight changes from initial values after last epoch:
[156.5725, 4.6003084, 17.242233, 4.8498206]
opt = sgd, lr = 1.000000, alpha = 0.400000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 8.5509e+01, Accuracy: 0.100, Test Loss: 8.5510e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [314.52945, 10.748122, 52.516026, 10.180938]'


max training accuracy 0.10000000149011612
min training loss 85.4654769897461
max validation accuracy 0.10000000149011612
min validation loss 85.51007843017578

l2-normed weight changes from initial values after last epoch:
[314.52945, 10.748122, 52.516026, 10.180938]
opt = sgd, lr = 1.000000, alpha = 0.200000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 3.2644e+02, Accuracy: 0.100, Test Loss: 3.2644e+02, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [1234.1603, 42.385532, 201.95984, 57.5798]'


max training accuracy 0.10000000149011612
min training loss 326.3004150390625
max validation accuracy 0.10000000149011612
min validation loss 326.44476318359375

l2-normed weight changes from initial values after last epoch:
[1234.1603, 42.385532, 201.95984, 57.5798]
opt = sgd, lr = 1.000000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:00:50 - Loss: 1.2434e+03, Accuracy: 0.100, Test Loss: 1.2434e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [7996.9116, 271.05765, 1219.1005, 290.3244]'


max training accuracy 0.10000000149011612
min training loss 1242.851318359375
max validation accuracy 0.10000000149011612
min validation loss 1243.396728515625

l2-normed weight changes from initial values after last epoch:
[7996.9116, 271.05765, 1219.1005, 290.3244]
opt = sgd, lr = 1.000000, alpha = 0.070000


'Epoch: 0009, Total Run Time: 00:00:53 - Loss: 2.4720e+03, Accuracy: 0.100, Test Loss: 2.4720e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [10543.988, 317.61487, 1507.5032, 291.19934]'


max training accuracy 0.10000000149011612
min training loss 2470.890869140625
max validation accuracy 0.10000000149011612
min validation loss 2472.03662109375

l2-normed weight changes from initial values after last epoch:
[10543.988, 317.61487, 1507.5032, 291.19934]
opt = sgd, lr = 1.000000, alpha = 0.040000


'Epoch: 0009, Total Run Time: 00:00:55 - Loss: 7.2558e+03, Accuracy: 0.100, Test Loss: 7.2558e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [47775.734, 1560.6101, 7844.4155, 1707.0941]'


max training accuracy 0.10000000149011612
min training loss 7252.43798828125
max validation accuracy 0.10000000149011612
min validation loss 7255.82421875

l2-normed weight changes from initial values after last epoch:
[47775.734, 1560.6101, 7844.4155, 1707.0941]
opt = sgd, lr = 1.000000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:00:53 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [773025.75, 27903.514, 99754.98, 31971.89]'


max training accuracy 0.10000000149011612
min training loss 103576.3203125
max validation accuracy 0.10000000149011612
min validation loss 103617.0546875

l2-normed weight changes from initial values after last epoch:
[773025.75, 27903.514, 99754.98, 31971.89]
opt = sgd, lr = 1.000000, alpha = 0.001000


'Epoch: 0009, Total Run Time: 00:00:56 - Loss: 8.3130e+06, Accuracy: 0.098, Test Loss: 8.3123e+06, Test Accuracy: 0.098 - L2 Norm of Weight Movement From Initialization: [49182830.0, 1741843.4, 7779259.5, 1541352.6]'


max training accuracy 0.09752000123262405
min training loss 8309390.0
max validation accuracy 0.09759999811649323
min validation loss 8312312.0

l2-normed weight changes from initial values after last epoch:
[49182830.0, 1741843.4, 7779259.5, 1541352.6]
opt = sgd, lr = 0.100000, alpha = 20000.000000


'Epoch: 0009, Total Run Time: 00:00:54 - Loss: 5.7565e-09, Accuracy: 0.100, Test Loss: 5.7564e-09, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09957999736070633
min training loss 5.756487286845413e-09
max validation accuracy 0.09960000216960907
min validation loss 5.7564477629057365e-09

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.100000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:00:50 - Loss: 2.3026e-06, Accuracy: 0.116, Test Loss: 2.3026e-06, Test Accuracy: 0.114 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.11587999761104584
min training loss 2.302632083228673e-06
max validation accuracy 0.11379999667406082
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.100000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:00:52 - Loss: 2.3001e-04, Accuracy: 0.075, Test Loss: 2.2996e-04, Test Accuracy: 0.074 - L2 Norm of Weight Movement From Initialization: [0.054268762, 0.0014107644, 0.006875383, 0.0013348376]'


max training accuracy 0.08851999789476395
min training loss 0.00023001179215498269
max validation accuracy 0.0877000018954277
min validation loss 0.0002299635234521702

l2-normed weight changes from initial values after last epoch:
[0.054268762, 0.0014107644, 0.006875383, 0.0013348376]
opt = sgd, lr = 0.100000, alpha = 80.000000


'Epoch: 0009, Total Run Time: 00:00:59 - Loss: 3.4853e-04, Accuracy: 0.112, Test Loss: 3.4779e-04, Test Accuracy: 0.114 - L2 Norm of Weight Movement From Initialization: [0.34200373, 0.0110071385, 0.06974908, 0.011733193]'


max training accuracy 0.11187999695539474
min training loss 0.000348534929798916
max validation accuracy 0.11540000140666962
min validation loss 0.0003477866994217038

l2-normed weight changes from initial values after last epoch:
[0.34200373, 0.0110071385, 0.06974908, 0.011733193]
opt = sgd, lr = 0.100000, alpha = 60.000000


'Epoch: 0009, Total Run Time: 00:01:08 - Loss: 6.3962e-04, Accuracy: 0.097, Test Loss: 6.3960e-04, Test Accuracy: 0.097 - L2 Norm of Weight Movement From Initialization: [0.004359057, 9.551497e-05, 0.0006109884, 8.822795e-05]'


max training accuracy 0.09724000096321106
min training loss 0.0006396151147782803
max validation accuracy 0.09650000184774399
min validation loss 0.0006396027165465057

l2-normed weight changes from initial values after last epoch:
[0.004359057, 9.551497e-05, 0.0006109884, 8.822795e-05]
opt = sgd, lr = 0.100000, alpha = 40.000000


'Epoch: 0009, Total Run Time: 00:01:01 - Loss: 1.4137e-03, Accuracy: 0.108, Test Loss: 1.4121e-03, Test Accuracy: 0.109 - L2 Norm of Weight Movement From Initialization: [0.34828386, 0.015198114, 0.056152724, 0.013733017]'


max training accuracy 0.120619997382164
min training loss 0.00141371157951653
max validation accuracy 0.12479999661445618
min validation loss 0.0014120617415755987

l2-normed weight changes from initial values after last epoch:
[0.34828386, 0.015198114, 0.056152724, 0.013733017]
opt = sgd, lr = 0.100000, alpha = 30.000000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 2.3478e-03, Accuracy: 0.062, Test Loss: 2.3460e-03, Test Accuracy: 0.057 - L2 Norm of Weight Movement From Initialization: [1.1805403, 0.07192715, 0.35428986, 0.06801064]'


max training accuracy 0.08771999925374985
min training loss 0.0023477887734770775
max validation accuracy 0.08839999884366989
min validation loss 0.002345960820093751

l2-normed weight changes from initial values after last epoch:
[1.1805403, 0.07192715, 0.35428986, 0.06801064]
opt = sgd, lr = 0.100000, alpha = 20.000000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 4.7079e-03, Accuracy: 0.117, Test Loss: 4.7079e-03, Test Accuracy: 0.123 - L2 Norm of Weight Movement From Initialization: [2.339544, 0.20176792, 0.98722535, 0.25597855]'


max training accuracy 0.11789999902248383
min training loss 0.004707875195890665
max validation accuracy 0.1251000016927719
min validation loss 0.004707877524197102

l2-normed weight changes from initial values after last epoch:
[2.339544, 0.20176792, 0.98722535, 0.25597855]
opt = sgd, lr = 0.100000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 1.6113e-02, Accuracy: 0.420, Test Loss: 1.6321e-02, Test Accuracy: 0.423 - L2 Norm of Weight Movement From Initialization: [5.9524803, 0.7210022, 2.9867098, 0.5624039]'


max training accuracy 0.41962000727653503
min training loss 0.016112886369228363
max validation accuracy 0.42320001125335693
min validation loss 0.016320642083883286

l2-normed weight changes from initial values after last epoch:
[5.9524803, 0.7210022, 2.9867098, 0.5624039]
opt = sgd, lr = 0.100000, alpha = 9.000000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 1.9711e-02, Accuracy: 0.433, Test Loss: 2.0047e-02, Test Accuracy: 0.423 - L2 Norm of Weight Movement From Initialization: [6.503714, 0.8349519, 3.259109, 0.7352887]'


max training accuracy 0.4325000047683716
min training loss 0.019711410626769066
max validation accuracy 0.4309999942779541
min validation loss 0.020025044679641724

l2-normed weight changes from initial values after last epoch:
[6.503714, 0.8349519, 3.259109, 0.7352887]
opt = sgd, lr = 0.100000, alpha = 8.000000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 2.4444e-02, Accuracy: 0.441, Test Loss: 2.4671e-02, Test Accuracy: 0.450 - L2 Norm of Weight Movement From Initialization: [7.451647, 1.0755042, 3.4698749, 0.90320677]'


max training accuracy 0.4410800039768219
min training loss 0.024443529546260834
max validation accuracy 0.44999998807907104
min validation loss 0.02467130869626999

l2-normed weight changes from initial values after last epoch:
[7.451647, 1.0755042, 3.4698749, 0.90320677]
opt = sgd, lr = 0.100000, alpha = 7.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 3.1843e-02, Accuracy: 0.444, Test Loss: 3.1788e-02, Test Accuracy: 0.454 - L2 Norm of Weight Movement From Initialization: [8.395297, 1.2555305, 3.4009585, 1.1722462]'


max training accuracy 0.44359999895095825
min training loss 0.03184295818209648
max validation accuracy 0.4537999927997589
min validation loss 0.03178844228386879

l2-normed weight changes from initial values after last epoch:
[8.395297, 1.2555305, 3.4009585, 1.1722462]
opt = sgd, lr = 0.100000, alpha = 6.000000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 4.3115e-02, Accuracy: 0.438, Test Loss: 4.3636e-02, Test Accuracy: 0.439 - L2 Norm of Weight Movement From Initialization: [9.788506, 1.5278726, 3.1762538, 1.6564387]'


max training accuracy 0.44064000248908997
min training loss 0.04311501607298851
max validation accuracy 0.43950000405311584
min validation loss 0.04363606125116348

l2-normed weight changes from initial values after last epoch:
[9.788506, 1.5278726, 3.1762538, 1.6564387]
opt = sgd, lr = 0.100000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 6.3252e-02, Accuracy: 0.432, Test Loss: 6.5580e-02, Test Accuracy: 0.413 - L2 Norm of Weight Movement From Initialization: [10.5476265, 1.63417, 2.9999561, 2.1166186]'


max training accuracy 0.4318400025367737
min training loss 0.06325214356184006
max validation accuracy 0.426800012588501
min validation loss 0.06499608606100082

l2-normed weight changes from initial values after last epoch:
[10.5476265, 1.63417, 2.9999561, 2.1166186]
opt = sgd, lr = 0.100000, alpha = 4.000000


'Epoch: 0009, Total Run Time: 00:00:50 - Loss: 9.9345e-02, Accuracy: 0.430, Test Loss: 9.9032e-02, Test Accuracy: 0.442 - L2 Norm of Weight Movement From Initialization: [12.597225, 2.0851648, 2.8402867, 2.8266504]'


max training accuracy 0.43007999658584595
min training loss 0.09934525191783905
max validation accuracy 0.44200000166893005
min validation loss 0.09903214126825333

l2-normed weight changes from initial values after last epoch:
[12.597225, 2.0851648, 2.8402867, 2.8266504]
opt = sgd, lr = 0.100000, alpha = 3.000000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 1.8939e-01, Accuracy: 0.388, Test Loss: 1.8677e-01, Test Accuracy: 0.391 - L2 Norm of Weight Movement From Initialization: [14.344613, 2.9397628, 2.5271456, 2.8185773]'


max training accuracy 0.38815999031066895
min training loss 0.1893947720527649
max validation accuracy 0.3912999927997589
min validation loss 0.18677058815956116

l2-normed weight changes from initial values after last epoch:
[14.344613, 2.9397628, 2.5271456, 2.8185773]
opt = sgd, lr = 0.100000, alpha = 2.000000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 5.5519e-01, Accuracy: 0.136, Test Loss: 5.7644e-01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [10.437254, 2.671324, 3.1384814, 0.2524314]'


max training accuracy 0.1729000061750412
min training loss 0.5374564528465271
max validation accuracy 0.21089999377727509
min validation loss 0.5227702260017395

l2-normed weight changes from initial values after last epoch:
[10.437254, 2.671324, 3.1384814, 0.2524314]
opt = sgd, lr = 0.100000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 1.4506e+01, Accuracy: 0.100, Test Loss: 1.4506e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [14.5854845, 0.55096865, 19.61719, 0.8292582]'


max training accuracy 0.10000000149011612
min training loss 14.487117767333984
max validation accuracy 0.10000000149011612
min validation loss 14.50627326965332

l2-normed weight changes from initial values after last epoch:
[14.5854845, 0.55096865, 19.61719, 0.8292582]
opt = sgd, lr = 0.100000, alpha = 0.800000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 2.2352e+01, Accuracy: 0.100, Test Loss: 2.2352e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [30.65274, 1.1685557, 47.97587, 1.4433278]'


max training accuracy 0.10001999884843826
min training loss 22.32388687133789
max validation accuracy 0.10000000149011612
min validation loss 22.35226821899414

l2-normed weight changes from initial values after last epoch:
[30.65274, 1.1685557, 47.97587, 1.4433278]
opt = sgd, lr = 0.100000, alpha = 0.600000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 3.9018e+01, Accuracy: 0.100, Test Loss: 3.9018e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [32.9481, 1.3245616, 46.448566, 1.5705059]'


max training accuracy 0.10006000101566315
min training loss 38.98286819458008
max validation accuracy 0.10000000149011612
min validation loss 39.01816177368164

l2-normed weight changes from initial values after last epoch:
[32.9481, 1.3245616, 46.448566, 1.5705059]
opt = sgd, lr = 0.100000, alpha = 0.400000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 8.5509e+01, Accuracy: 0.100, Test Loss: 8.5510e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [83.8756, 3.234147, 116.98285, 3.1039855]'


max training accuracy 0.10000000149011612
min training loss 85.44725036621094
max validation accuracy 0.10000000149011612
min validation loss 85.51007080078125

l2-normed weight changes from initial values after last epoch:
[83.8756, 3.234147, 116.98285, 3.1039855]
opt = sgd, lr = 0.100000, alpha = 0.200000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 3.2644e+02, Accuracy: 0.100, Test Loss: 3.2644e+02, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [154.58298, 5.807763, 20.92782, 6.0851865]'


max training accuracy 0.10000000149011612
min training loss 326.28839111328125
max validation accuracy 0.10000000149011612
min validation loss 326.44476318359375

l2-normed weight changes from initial values after last epoch:
[154.58298, 5.807763, 20.92782, 6.0851865]
opt = sgd, lr = 0.100000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 1.2434e+03, Accuracy: 0.100, Test Loss: 1.2434e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [763.7431, 26.46899, 82.06319, 26.584469]'


max training accuracy 0.10000000149011612
min training loss 1242.86962890625
max validation accuracy 0.10000000149011612
min validation loss 1243.396484375

l2-normed weight changes from initial values after last epoch:
[763.7431, 26.46899, 82.06319, 26.584469]
opt = sgd, lr = 0.100000, alpha = 0.070000


'Epoch: 0009, Total Run Time: 00:00:50 - Loss: 2.4720e+03, Accuracy: 0.100, Test Loss: 2.4720e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [2768.6328, 97.84261, 276.4753, 86.67657]'


max training accuracy 0.10000000149011612
min training loss 2471.16015625
max validation accuracy 0.10000000149011612
min validation loss 2472.03466796875

l2-normed weight changes from initial values after last epoch:
[2768.6328, 97.84261, 276.4753, 86.67657]
opt = sgd, lr = 0.100000, alpha = 0.040000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 7.2558e+03, Accuracy: 0.100, Test Loss: 7.2558e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [3858.7102, 128.98053, 738.0783, 150.73013]'


max training accuracy 0.10000000149011612
min training loss 7252.71044921875
max validation accuracy 0.10000000149011612
min validation loss 7255.8232421875

l2-normed weight changes from initial values after last epoch:
[3858.7102, 128.98053, 738.0783, 150.73013]
opt = sgd, lr = 0.100000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [65426.305, 2185.4602, 8354.539, 1994.0205]'


max training accuracy 0.10000000149011612
min training loss 103572.8828125
max validation accuracy 0.10000000149011612
min validation loss 103617.0703125

l2-normed weight changes from initial values after last epoch:
[65426.305, 2185.4602, 8354.539, 1994.0205]
opt = sgd, lr = 0.100000, alpha = 0.001000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 8.2902e+06, Accuracy: 0.100, Test Loss: 8.2902e+06, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [5576388.0, 218211.34, 1040679.5, 247779.11]'


max training accuracy 0.10000000149011612
min training loss 8286802.0
max validation accuracy 0.10000000149011612
min validation loss 8290206.5

l2-normed weight changes from initial values after last epoch:
[5576388.0, 218211.34, 1040679.5, 247779.11]
opt = sgd, lr = 0.010000, alpha = 20000.000000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 5.7565e-09, Accuracy: 0.096, Test Loss: 5.7564e-09, Test Accuracy: 0.097 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09625999629497528
min training loss 5.756487286845413e-09
max validation accuracy 0.09679999947547913
min validation loss 5.7564477629057365e-09

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.010000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 2.3026e-06, Accuracy: 0.116, Test Loss: 2.3026e-06, Test Accuracy: 0.117 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.11631999909877777
min training loss 2.302632083228673e-06
max validation accuracy 0.11710000038146973
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.010000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 2.3026e-04, Accuracy: 0.094, Test Loss: 2.3026e-04, Test Accuracy: 0.095 - L2 Norm of Weight Movement From Initialization: [0.00024980976, 5.7359007e-06, 4.5553305e-05, 6.672597e-06]'


max training accuracy 0.0940999984741211
min training loss 0.00023025574046187103
max validation accuracy 0.09570000320672989
min validation loss 0.0002302593638887629

l2-normed weight changes from initial values after last epoch:
[0.00024980976, 5.7359007e-06, 4.5553305e-05, 6.672597e-06]
opt = sgd, lr = 0.010000, alpha = 80.000000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 3.5978e-04, Accuracy: 0.113, Test Loss: 3.5980e-04, Test Accuracy: 0.117 - L2 Norm of Weight Movement From Initialization: [0.011108864, 0.0002145907, 0.0017369111, 0.00020889568]'


max training accuracy 0.11337999999523163
min training loss 0.00035978350206278265
max validation accuracy 0.11729999631643295
min validation loss 0.0003598020412027836

l2-normed weight changes from initial values after last epoch:
[0.011108864, 0.0002145907, 0.0017369111, 0.00020889568]
opt = sgd, lr = 0.010000, alpha = 60.000000


'Epoch: 0004, Total Run Time: 00:00:23 - Loss: 6.3922e-04, Accuracy: 0.091, Test Loss: 6.3930e-04, Test Accuracy: 0.092 - L2 Norm of Weight Movement From Initialization: [0.013705308, 0.0002762005, 0.0017851021, 0.00021769408]'

'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 6.3880e-04, Accuracy: 0.091, Test Loss: 6.3892e-04, Test Accuracy: 0.093 - L2 Norm of Weight Movement From Initialization: [0.030978173, 0.00063523056, 0.0040650168, 0.00046431154]'


max training accuracy 0.09126000106334686
min training loss 0.0006388036999851465
max validation accuracy 0.0934000015258789
min validation loss 0.000638919766061008

l2-normed weight changes from initial values after last epoch:
[0.030978173, 0.00063523056, 0.0040650168, 0.00046431154]
opt = sgd, lr = 0.010000, alpha = 40.000000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 1.4391e-03, Accuracy: 0.106, Test Loss: 1.4391e-03, Test Accuracy: 0.108 - L2 Norm of Weight Movement From Initialization: [0.01389715, 0.00033202482, 0.0017931663, 0.000353919]'


max training accuracy 0.10666000097990036
min training loss 0.00143910082988441
max validation accuracy 0.10809999704360962
min validation loss 0.0014390799915418029

l2-normed weight changes from initial values after last epoch:
[0.01389715, 0.00033202482, 0.0017931663, 0.000353919]
opt = sgd, lr = 0.010000, alpha = 30.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 2.5225e-03, Accuracy: 0.103, Test Loss: 2.5221e-03, Test Accuracy: 0.104 - L2 Norm of Weight Movement From Initialization: [0.19908425, 0.009972077, 0.03135142, 0.008934185]'


max training accuracy 0.1053600013256073
min training loss 0.0025225465651601553
max validation accuracy 0.10419999808073044
min validation loss 0.0025220809038728476

l2-normed weight changes from initial values after last epoch:
[0.19908425, 0.009972077, 0.03135142, 0.008934185]
opt = sgd, lr = 0.010000, alpha = 20.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 5.2643e-03, Accuracy: 0.124, Test Loss: 5.2352e-03, Test Accuracy: 0.119 - L2 Norm of Weight Movement From Initialization: [0.6655547, 0.021023288, 0.16571982, 0.019797683]'


max training accuracy 0.12625999748706818
min training loss 0.005264324136078358
max validation accuracy 0.12860000133514404
min validation loss 0.005235190968960524

l2-normed weight changes from initial values after last epoch:
[0.6655547, 0.021023288, 0.16571982, 0.019797683]
opt = sgd, lr = 0.010000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 1.8755e-02, Accuracy: 0.235, Test Loss: 1.8775e-02, Test Accuracy: 0.233 - L2 Norm of Weight Movement From Initialization: [1.7845676, 0.1257214, 0.68083364, 0.13456154]'


max training accuracy 0.23533999919891357
min training loss 0.018755441531538963
max validation accuracy 0.23330000042915344
min validation loss 0.018775252625346184

l2-normed weight changes from initial values after last epoch:
[1.7845676, 0.1257214, 0.68083364, 0.13456154]
opt = sgd, lr = 0.010000, alpha = 9.000000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 2.2956e-02, Accuracy: 0.255, Test Loss: 2.2887e-02, Test Accuracy: 0.255 - L2 Norm of Weight Movement From Initialization: [1.9066192, 0.13871357, 0.77168304, 0.13173616]'


max training accuracy 0.2551400065422058
min training loss 0.022956496104598045
max validation accuracy 0.2547999918460846
min validation loss 0.022887496277689934

l2-normed weight changes from initial values after last epoch:
[1.9066192, 0.13871357, 0.77168304, 0.13173616]
opt = sgd, lr = 0.010000, alpha = 8.000000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 2.8580e-02, Accuracy: 0.272, Test Loss: 2.8684e-02, Test Accuracy: 0.268 - L2 Norm of Weight Movement From Initialization: [2.1928542, 0.17457834, 0.85823333, 0.14218593]'


max training accuracy 0.27233999967575073
min training loss 0.028580017387866974
max validation accuracy 0.2711000144481659
min validation loss 0.02868424355983734

l2-normed weight changes from initial values after last epoch:
[2.1928542, 0.17457834, 0.85823333, 0.14218593]
opt = sgd, lr = 0.010000, alpha = 7.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 3.5915e-02, Accuracy: 0.328, Test Loss: 3.6030e-02, Test Accuracy: 0.341 - L2 Norm of Weight Movement From Initialization: [2.5709534, 0.21085525, 1.1475517, 0.18983097]'


max training accuracy 0.32839998602867126
min training loss 0.03591495752334595
max validation accuracy 0.34060001373291016
min validation loss 0.03603028133511543

l2-normed weight changes from initial values after last epoch:
[2.5709534, 0.21085525, 1.1475517, 0.18983097]
opt = sgd, lr = 0.010000, alpha = 6.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 4.7632e-02, Accuracy: 0.376, Test Loss: 4.7892e-02, Test Accuracy: 0.377 - L2 Norm of Weight Movement From Initialization: [3.043211, 0.2986869, 1.4687209, 0.27956274]'


max training accuracy 0.3756600022315979
min training loss 0.047631826251745224
max validation accuracy 0.37720000743865967
min validation loss 0.047889795154333115

l2-normed weight changes from initial values after last epoch:
[3.043211, 0.2986869, 1.4687209, 0.27956274]
opt = sgd, lr = 0.010000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 6.5731e-02, Accuracy: 0.421, Test Loss: 6.6318e-02, Test Accuracy: 0.423 - L2 Norm of Weight Movement From Initialization: [3.7500105, 0.39715508, 1.9572554, 0.32953978]'


max training accuracy 0.4205999970436096
min training loss 0.06573103368282318
max validation accuracy 0.4262000024318695
min validation loss 0.0663180947303772

l2-normed weight changes from initial values after last epoch:
[3.7500105, 0.39715508, 1.9572554, 0.32953978]
opt = sgd, lr = 0.010000, alpha = 4.000000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 9.7515e-02, Accuracy: 0.450, Test Loss: 9.7900e-02, Test Accuracy: 0.454 - L2 Norm of Weight Movement From Initialization: [4.777665, 0.6588268, 2.486111, 0.59155613]'


max training accuracy 0.45013999938964844
min training loss 0.09751484543085098
max validation accuracy 0.45350000262260437
min validation loss 0.09789978712797165

l2-normed weight changes from initial values after last epoch:
[4.777665, 0.6588268, 2.486111, 0.59155613]
opt = sgd, lr = 0.010000, alpha = 3.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 1.6551e-01, Accuracy: 0.476, Test Loss: 1.6851e-01, Test Accuracy: 0.467 - L2 Norm of Weight Movement From Initialization: [6.4416347, 1.0086722, 3.0421476, 0.96739495]'


max training accuracy 0.47609999775886536
min training loss 0.16551020741462708
max validation accuracy 0.46650001406669617
min validation loss 0.16850784420967102

l2-normed weight changes from initial values after last epoch:
[6.4416347, 1.0086722, 3.0421476, 0.96739495]
opt = sgd, lr = 0.010000, alpha = 2.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 3.7269e-01, Accuracy: 0.470, Test Loss: 3.8520e-01, Test Accuracy: 0.447 - L2 Norm of Weight Movement From Initialization: [9.057637, 1.4776163, 2.853147, 1.7801768]'


max training accuracy 0.47001999616622925
min training loss 0.37269386649131775
max validation accuracy 0.45579999685287476
min validation loss 0.38086310029029846

l2-normed weight changes from initial values after last epoch:
[9.057637, 1.4776163, 2.853147, 1.7801768]
opt = sgd, lr = 0.010000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 1.6992e+00, Accuracy: 0.389, Test Loss: 1.7130e+00, Test Accuracy: 0.380 - L2 Norm of Weight Movement From Initialization: [13.647683, 3.0959759, 2.103018, 2.959004]'


max training accuracy 0.3889800012111664
min training loss 1.6991910934448242
max validation accuracy 0.38600000739097595
min validation loss 1.7130123376846313

l2-normed weight changes from initial values after last epoch:
[13.647683, 3.0959759, 2.103018, 2.959004]
opt = sgd, lr = 0.010000, alpha = 0.800000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 3.0348e+00, Accuracy: 0.270, Test Loss: 2.9687e+00, Test Accuracy: 0.283 - L2 Norm of Weight Movement From Initialization: [10.341673, 2.6792932, 1.666092, 2.384451]'


max training accuracy 0.2701199948787689
min training loss 3.0233774185180664
max validation accuracy 0.2883000075817108
min validation loss 2.968705177307129

l2-normed weight changes from initial values after last epoch:
[10.341673, 2.6792932, 1.666092, 2.384451]
opt = sgd, lr = 0.010000, alpha = 0.600000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 6.4022e+00, Accuracy: 0.099, Test Loss: 6.3997e+00, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [6.4488835, 1.1765423, 2.363383, 0.1655872]'


max training accuracy 0.12064000219106674
min training loss 6.2844767570495605
max validation accuracy 0.19189999997615814
min validation loss 6.049861907958984

l2-normed weight changes from initial values after last epoch:
[6.4488835, 1.1765423, 2.363383, 0.1655872]
opt = sgd, lr = 0.010000, alpha = 0.400000


'Epoch: 0009, Total Run Time: 00:00:42 - Loss: 1.4421e+01, Accuracy: 0.098, Test Loss: 1.4415e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [10.31103, 0.386674, 5.643599, 0.27159783]'


max training accuracy 0.10118000209331512
min training loss 14.41639232635498
max validation accuracy 0.10000000149011612
min validation loss 14.407975196838379

l2-normed weight changes from initial values after last epoch:
[10.31103, 0.386674, 5.643599, 0.27159783]
opt = sgd, lr = 0.010000, alpha = 0.200000


'Epoch: 0009, Total Run Time: 00:00:43 - Loss: 3.2644e+02, Accuracy: 0.100, Test Loss: 3.2644e+02, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [25.243608, 0.9938675, 28.37464, 1.2337116]'


max training accuracy 0.10000000149011612
min training loss 326.26708984375
max validation accuracy 0.10000000149011612
min validation loss 326.44476318359375

l2-normed weight changes from initial values after last epoch:
[25.243608, 0.9938675, 28.37464, 1.2337116]
opt = sgd, lr = 0.010000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:00:43 - Loss: 1.2434e+03, Accuracy: 0.100, Test Loss: 1.2434e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [74.28205, 2.2687125, 12.155881, 2.3133523]'


max training accuracy 0.10000000149011612
min training loss 1242.786376953125
max validation accuracy 0.10000000149011612
min validation loss 1243.396484375

l2-normed weight changes from initial values after last epoch:
[74.28205, 2.2687125, 12.155881, 2.3133523]
opt = sgd, lr = 0.010000, alpha = 0.070000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 2.4720e+03, Accuracy: 0.100, Test Loss: 2.4720e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [163.71794, 5.243478, 57.15761, 5.2170506]'


max training accuracy 0.10006000101566315
min training loss 2470.6689453125
max validation accuracy 0.10000000149011612
min validation loss 2472.03564453125

l2-normed weight changes from initial values after last epoch:
[163.71794, 5.243478, 57.15761, 5.2170506]
opt = sgd, lr = 0.010000, alpha = 0.040000


'Epoch: 0009, Total Run Time: 00:00:44 - Loss: 7.2558e+03, Accuracy: 0.100, Test Loss: 7.2558e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [251.35077, 8.070101, 36.698364, 7.8012905]'


max training accuracy 0.10000000149011612
min training loss 7252.2080078125
max validation accuracy 0.10000000149011612
min validation loss 7255.82421875

l2-normed weight changes from initial values after last epoch:
[251.35077, 8.070101, 36.698364, 7.8012905]
opt = sgd, lr = 0.010000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [6337.586, 209.62503, 1028.7968, 219.9832]'


max training accuracy 0.10000000149011612
min training loss 103569.2421875
max validation accuracy 0.10000000149011612
min validation loss 103617.0625

l2-normed weight changes from initial values after last epoch:
[6337.586, 209.62503, 1028.7968, 219.9832]
opt = sgd, lr = 0.010000, alpha = 0.001000


'Epoch: 0009, Total Run Time: 00:00:44 - Loss: 8.2902e+06, Accuracy: 0.100, Test Loss: 8.2902e+06, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [1069550.5, 33252.152, 170467.14, 33361.004]'


max training accuracy 0.10000000149011612
min training loss 8287235.0
max validation accuracy 0.10000000149011612
min validation loss 8290205.0

l2-normed weight changes from initial values after last epoch:
[1069550.5, 33252.152, 170467.14, 33361.004]
opt = sgd, lr = 0.001000, alpha = 20000.000000


'Epoch: 0009, Total Run Time: 00:00:43 - Loss: 5.7565e-09, Accuracy: 0.094, Test Loss: 5.7564e-09, Test Accuracy: 0.093 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09415999799966812
min training loss 5.756487286845413e-09
max validation accuracy 0.0925000011920929
min validation loss 5.7564477629057365e-09

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.001000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:00:43 - Loss: 2.3026e-06, Accuracy: 0.089, Test Loss: 2.3026e-06, Test Accuracy: 0.088 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.088639996945858
min training loss 2.302632083228673e-06
max validation accuracy 0.08789999783039093
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.001000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:00:44 - Loss: 2.3030e-04, Accuracy: 0.101, Test Loss: 2.3032e-04, Test Accuracy: 0.102 - L2 Norm of Weight Movement From Initialization: [0.0009601165, 3.5921992e-05, 0.00024641884, 4.140436e-05]'


max training accuracy 0.10130000114440918
min training loss 0.000230304358410649
max validation accuracy 0.10170000046491623
min validation loss 0.000230320161790587

l2-normed weight changes from initial values after last epoch:
[0.0009601165, 3.5921992e-05, 0.00024641884, 4.140436e-05]
opt = sgd, lr = 0.001000, alpha = 80.000000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 3.5998e-04, Accuracy: 0.096, Test Loss: 3.5996e-04, Test Accuracy: 0.097 - L2 Norm of Weight Movement From Initialization: [0.0060916925, 0.00016185999, 0.001216271, 0.00019715105]'


max training accuracy 0.09582000225782394
min training loss 0.0003599772753659636
max validation accuracy 0.09870000183582306
min validation loss 0.0003599566116463393

l2-normed weight changes from initial values after last epoch:
[0.0060916925, 0.00016185999, 0.001216271, 0.00019715105]
opt = sgd, lr = 0.001000, alpha = 60.000000


'Epoch: 0009, Total Run Time: 00:00:44 - Loss: 6.3973e-04, Accuracy: 0.104, Test Loss: 6.3979e-04, Test Accuracy: 0.105 - L2 Norm of Weight Movement From Initialization: [0.006964406, 0.00017045543, 0.0010207827, 0.00013682344]'


max training accuracy 0.10409999638795853
min training loss 0.000639728328678757
max validation accuracy 0.10509999841451645
min validation loss 0.0006397857796400785

l2-normed weight changes from initial values after last epoch:
[0.006964406, 0.00017045543, 0.0010207827, 0.00013682344]
opt = sgd, lr = 0.001000, alpha = 40.000000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 1.4392e-03, Accuracy: 0.099, Test Loss: 1.4391e-03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [0.016033595, 0.0003930556, 0.002088645, 0.0003877037]'


max training accuracy 0.10106000304222107
min training loss 0.0014392241137102246
max validation accuracy 0.10119999945163727
min validation loss 0.0014391306322067976

l2-normed weight changes from initial values after last epoch:
[0.016033595, 0.0003930556, 0.002088645, 0.0003877037]
opt = sgd, lr = 0.001000, alpha = 30.000000


'Epoch: 0009, Total Run Time: 00:00:44 - Loss: 2.5495e-03, Accuracy: 0.087, Test Loss: 2.5492e-03, Test Accuracy: 0.089 - L2 Norm of Weight Movement From Initialization: [0.026556656, 0.00068038347, 0.0032935934, 0.00055029267]'


max training accuracy 0.08907999843358994
min training loss 0.0025495104491710663
max validation accuracy 0.09099999815225601
min validation loss 0.0025492191780358553

l2-normed weight changes from initial values after last epoch:
[0.026556656, 0.00068038347, 0.0032935934, 0.00055029267]
opt = sgd, lr = 0.001000, alpha = 20.000000


'Epoch: 0009, Total Run Time: 00:00:44 - Loss: 5.6828e-03, Accuracy: 0.114, Test Loss: 5.6839e-03, Test Accuracy: 0.117 - L2 Norm of Weight Movement From Initialization: [0.10137652, 0.0027214652, 0.018756162, 0.003754253]'


max training accuracy 0.11435999721288681
min training loss 0.005682753399014473
max validation accuracy 0.11729999631643295
min validation loss 0.005683880764991045

l2-normed weight changes from initial values after last epoch:
[0.10137652, 0.0027214652, 0.018756162, 0.003754253]
opt = sgd, lr = 0.001000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:00:44 - Loss: 2.0956e-02, Accuracy: 0.110, Test Loss: 2.0916e-02, Test Accuracy: 0.110 - L2 Norm of Weight Movement From Initialization: [0.48699778, 0.020415168, 0.11272783, 0.020781415]'


max training accuracy 0.11247999966144562
min training loss 0.020956311374902725
max validation accuracy 0.11010000109672546
min validation loss 0.02091616578400135

l2-normed weight changes from initial values after last epoch:
[0.48699778, 0.020415168, 0.11272783, 0.020781415]
opt = sgd, lr = 0.001000, alpha = 9.000000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 2.5785e-02, Accuracy: 0.139, Test Loss: 2.5735e-02, Test Accuracy: 0.138 - L2 Norm of Weight Movement From Initialization: [0.53473866, 0.018835802, 0.14457874, 0.032799132]'


max training accuracy 0.13882000744342804
min training loss 0.02578529343008995
max validation accuracy 0.13899999856948853
min validation loss 0.02573472075164318

l2-normed weight changes from initial values after last epoch:
[0.53473866, 0.018835802, 0.14457874, 0.032799132]
opt = sgd, lr = 0.001000, alpha = 8.000000


'Epoch: 0009, Total Run Time: 00:00:44 - Loss: 3.2312e-02, Accuracy: 0.122, Test Loss: 3.2207e-02, Test Accuracy: 0.123 - L2 Norm of Weight Movement From Initialization: [0.67128766, 0.0300431, 0.18600559, 0.04794811]'


max training accuracy 0.14377999305725098
min training loss 0.03231215476989746
max validation accuracy 0.14990000426769257
min validation loss 0.03220657631754875

l2-normed weight changes from initial values after last epoch:
[0.67128766, 0.0300431, 0.18600559, 0.04794811]
opt = sgd, lr = 0.001000, alpha = 7.000000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 4.0975e-02, Accuracy: 0.230, Test Loss: 4.0880e-02, Test Accuracy: 0.228 - L2 Norm of Weight Movement From Initialization: [0.8276962, 0.033146415, 0.2588749, 0.040917885]'


max training accuracy 0.24988000094890594
min training loss 0.040975283831357956
max validation accuracy 0.2459000051021576
min validation loss 0.040880464017391205

l2-normed weight changes from initial values after last epoch:
[0.8276962, 0.033146415, 0.2588749, 0.040917885]
opt = sgd, lr = 0.001000, alpha = 6.000000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 5.3704e-02, Accuracy: 0.305, Test Loss: 5.3663e-02, Test Accuracy: 0.311 - L2 Norm of Weight Movement From Initialization: [0.96660084, 0.049045946, 0.31003702, 0.04347367]'


max training accuracy 0.3050999939441681
min training loss 0.05370378494262695
max validation accuracy 0.31060001254081726
min validation loss 0.053662627935409546

l2-normed weight changes from initial values after last epoch:
[0.96660084, 0.049045946, 0.31003702, 0.04347367]
opt = sgd, lr = 0.001000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 7.5620e-02, Accuracy: 0.339, Test Loss: 7.5414e-02, Test Accuracy: 0.336 - L2 Norm of Weight Movement From Initialization: [1.1883656, 0.07267438, 0.37737468, 0.06799513]'


max training accuracy 0.3389799892902374
min training loss 0.07561992853879929
max validation accuracy 0.3361000120639801
min validation loss 0.07541428506374359

l2-normed weight changes from initial values after last epoch:
[1.1883656, 0.07267438, 0.37737468, 0.06799513]
opt = sgd, lr = 0.001000, alpha = 4.000000


'Epoch: 0009, Total Run Time: 00:00:49 - Loss: 1.1380e-01, Accuracy: 0.359, Test Loss: 1.1382e-01, Test Accuracy: 0.354 - L2 Norm of Weight Movement From Initialization: [1.5296563, 0.09872549, 0.5896142, 0.09529141]'


max training accuracy 0.3594200015068054
min training loss 0.11379600316286087
max validation accuracy 0.35429999232292175
min validation loss 0.11382072418928146

l2-normed weight changes from initial values after last epoch:
[1.5296563, 0.09872549, 0.5896142, 0.09529141]
opt = sgd, lr = 0.001000, alpha = 3.000000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 1.9240e-01, Accuracy: 0.400, Test Loss: 1.9245e-01, Test Accuracy: 0.399 - L2 Norm of Weight Movement From Initialization: [1.9734362, 0.17178439, 0.7063096, 0.16055235]'


max training accuracy 0.40011999011039734
min training loss 0.1924024373292923
max validation accuracy 0.39890000224113464
min validation loss 0.1924535185098648

l2-normed weight changes from initial values after last epoch:
[1.9734362, 0.17178439, 0.7063096, 0.16055235]
opt = sgd, lr = 0.001000, alpha = 2.000000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 4.0449e-01, Accuracy: 0.439, Test Loss: 4.0624e-01, Test Accuracy: 0.427 - L2 Norm of Weight Movement From Initialization: [2.9941375, 0.3114812, 1.353337, 0.29284403]'


max training accuracy 0.4385400116443634
min training loss 0.4044868052005768
max validation accuracy 0.43149998784065247
min validation loss 0.40624183416366577

l2-normed weight changes from initial values after last epoch:
[2.9941375, 0.3114812, 1.353337, 0.29284403]
opt = sgd, lr = 0.001000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 1.4849e+00, Accuracy: 0.478, Test Loss: 1.5371e+00, Test Accuracy: 0.452 - L2 Norm of Weight Movement From Initialization: [6.0999618, 0.9718548, 2.6749973, 0.9249537]'


max training accuracy 0.4784800112247467
min training loss 1.484932541847229
max validation accuracy 0.4625999927520752
min validation loss 1.5214533805847168

l2-normed weight changes from initial values after last epoch:
[6.0999618, 0.9718548, 2.6749973, 0.9249537]
opt = sgd, lr = 0.001000, alpha = 0.800000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 2.3273e+00, Accuracy: 0.474, Test Loss: 2.3418e+00, Test Accuracy: 0.468 - L2 Norm of Weight Movement From Initialization: [7.287984, 1.1451353, 3.065266, 1.2919358]'


max training accuracy 0.4737200140953064
min training loss 2.327312469482422
max validation accuracy 0.46790000796318054
min validation loss 2.341810703277588

l2-normed weight changes from initial values after last epoch:
[7.287984, 1.1451353, 3.065266, 1.2919358]
opt = sgd, lr = 0.001000, alpha = 0.600000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 4.1027e+00, Accuracy: 0.476, Test Loss: 4.2694e+00, Test Accuracy: 0.444 - L2 Norm of Weight Movement From Initialization: [9.882187, 1.4782689, 2.829713, 1.9745305]'


max training accuracy 0.475739985704422
min training loss 4.102673530578613
max validation accuracy 0.45879998803138733
min validation loss 4.191259860992432

l2-normed weight changes from initial values after last epoch:
[9.882187, 1.4782689, 2.829713, 1.9745305]
opt = sgd, lr = 0.001000, alpha = 0.400000


'Epoch: 0009, Total Run Time: 00:00:46 - Loss: 9.9545e+00, Accuracy: 0.431, Test Loss: 9.9244e+00, Test Accuracy: 0.441 - L2 Norm of Weight Movement From Initialization: [12.1502495, 2.117847, 2.554313, 2.5306857]'


max training accuracy 0.43116000294685364
min training loss 9.954527854919434
max validation accuracy 0.4406000077724457
min validation loss 9.924357414245605

l2-normed weight changes from initial values after last epoch:
[12.1502495, 2.117847, 2.554313, 2.5306857]
opt = sgd, lr = 0.001000, alpha = 0.200000


'Epoch: 0009, Total Run Time: 00:00:45 - Loss: 5.6882e+01, Accuracy: 0.114, Test Loss: 5.2688e+01, Test Accuracy: 0.186 - L2 Norm of Weight Movement From Initialization: [8.948306, 4.2214894, 2.777471, 1.6098415]'


max training accuracy 0.1646600067615509
min training loss 54.420833587646484
max validation accuracy 0.18639999628067017
min validation loss 52.688255310058594

l2-normed weight changes from initial values after last epoch:
[8.948306, 4.2214894, 2.777471, 1.6098415]
opt = sgd, lr = 0.001000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 2.3096e+02, Accuracy: 0.100, Test Loss: 2.3124e+02, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [13.761754, 0.5005625, 10.576102, 0.44711757]'


max training accuracy 0.10401999950408936
min training loss 230.9393768310547
max validation accuracy 0.10000000149011612
min validation loss 230.5560760498047

l2-normed weight changes from initial values after last epoch:
[13.761754, 0.5005625, 10.576102, 0.44711757]
opt = sgd, lr = 0.001000, alpha = 0.070000


'Epoch: 0009, Total Run Time: 00:00:49 - Loss: 2.4720e+03, Accuracy: 0.100, Test Loss: 2.4720e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [33.955746, 1.2559574, 44.06345, 1.8855948]'


max training accuracy 0.10000000149011612
min training loss 2469.98681640625
max validation accuracy 0.10000000149011612
min validation loss 2472.037109375

l2-normed weight changes from initial values after last epoch:
[33.955746, 1.2559574, 44.06345, 1.8855948]
opt = sgd, lr = 0.001000, alpha = 0.040000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 7.2558e+03, Accuracy: 0.100, Test Loss: 7.2558e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [29.170166, 1.0205396, 27.17494, 1.0541494]'


max training accuracy 0.10010000318288803
min training loss 7251.66259765625
max validation accuracy 0.10000000149011612
min validation loss 7255.8232421875

l2-normed weight changes from initial values after last epoch:
[29.170166, 1.0205396, 27.17494, 1.0541494]
opt = sgd, lr = 0.001000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [526.5488, 20.098198, 72.01266, 23.057837]'


max training accuracy 0.10000000149011612
min training loss 103571.25
max validation accuracy 0.10000000149011612
min validation loss 103617.0546875

l2-normed weight changes from initial values after last epoch:
[526.5488, 20.098198, 72.01266, 23.057837]
opt = sgd, lr = 0.001000, alpha = 0.001000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 8.2902e+06, Accuracy: 0.100, Test Loss: 8.2902e+06, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [83535.76, 2984.2903, 11637.054, 3040.1663]'


max training accuracy 0.10000000149011612
min training loss 8286862.0
max validation accuracy 0.10000000149011612
min validation loss 8290207.5

l2-normed weight changes from initial values after last epoch:
[83535.76, 2984.2903, 11637.054, 3040.1663]
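
Trends across alpha are hard to read from the raw log. The following is a minimal sketch of a parser that pairs each `opt = ..., lr = ..., alpha = ...` header with the epoch line that follows it. The function name `parse_runs` and the record keys are hypothetical, and the regular expressions assume exactly the line formats printed above.

```python
import re

HEADER_RE = re.compile(r"opt = (\w+), lr = ([\d.]+), alpha = ([\d.]+)")
EPOCH_RE = re.compile(
    r"Epoch: (\d+), .*? - Loss: ([\d.e+-]+), Accuracy: ([\d.]+), "
    r"Test Loss: ([\d.e+-]+), Test Accuracy: ([\d.]+) - "
    r"L2 Norm of Weight Movement From Initialization: \[([^\]]*)\]"
)


def parse_runs(lines):
    """Return one dict per header line, filled from the last epoch line after it."""
    runs, current = [], None
    for line in lines:
        header = HEADER_RE.search(line)
        if header:
            current = {
                "opt": header.group(1),
                "lr": float(header.group(2)),
                "alpha": float(header.group(3)),
            }
            runs.append(current)
            continue
        epoch = EPOCH_RE.search(line)
        if epoch and current is not None:
            # A later epoch line for the same run overwrites the earlier one.
            current.update(
                epoch=int(epoch.group(1)),
                loss=float(epoch.group(2)),
                accuracy=float(epoch.group(3)),
                test_loss=float(epoch.group(4)),
                test_accuracy=float(epoch.group(5)),
                movement=[float(x) for x in epoch.group(6).split(",") if x.strip()],
            )
    return runs


# Four lines copied verbatim from the lr = 0.01 sweep above.
sample = [
    "opt = sgd, lr = 0.010000, alpha = 100.000000",
    "'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 2.3026e-04, Accuracy: 0.094, Test Loss: 2.3026e-04, Test Accuracy: 0.095 - L2 Norm of Weight Movement From Initialization: [0.00024980976, 5.7359007e-06, 4.5553305e-05, 6.672597e-06]'",
    "opt = sgd, lr = 0.010000, alpha = 80.000000",
    "'Epoch: 0009, Total Run Time: 00:00:47 - Loss: 3.5978e-04, Accuracy: 0.113, Test Loss: 3.5980e-04, Test Accuracy: 0.117 - L2 Norm of Weight Movement From Initialization: [0.011108864, 0.0002145907, 0.0017369111, 0.00020889568]'",
]
runs = parse_runs(sample)
for run in runs:
    print(run["lr"], run["alpha"], run["test_accuracy"], run["movement"][0])
```

A header with no epoch line after it yields a record that holds only `opt`, `lr`, and `alpha`, so incomplete runs remain visible instead of being dropped.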


In [6]:
# Experiment with a wider network (width=256): sweep the learning rate and
# the scaling factor alpha, storing the weight-movement result of every run.
lrs = [1.0, 0.1, 0.01, 0.001]
alphas = [10000000.0, 1000000.0, 100000.0, 10000.0, 1000.0, 100.0, 10.0, 5.0, 1.0, 0.5, 0.1, 0.01]
normed_weight_changes_w256 = {}  # keyed by (optimizer, learning rate, alpha)
optimizer = 'sgd'
num_epochs = 10
for learning_rate in lrs:
    for alpha_val in alphas:
        print('=' * 80)
        print('opt = %s, lr = %f, alpha = %f' % (optimizer, learning_rate, alpha_val))
        print('=' * 80)
        # train() comes from an earlier cell; its return value is stored per run.
        normed_weight_changes_w256[(optimizer, learning_rate, alpha_val)] = train(
            alpha=alpha_val, epoch=num_epochs, opt=optimizer,
            lr=learning_rate, scaling=True, width=256)
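
`train` is defined outside this section, so how it produces the four-element movement lists is not visible here. The following is a minimal NumPy sketch of one plausible way to compute a per-tensor L2 norm of movement from initialization. The helper name `l2_weight_movement` and the four-tensor layout (two kernels and two biases for a single hidden layer) are assumptions for illustration, not the notebook's actual code.

```python
import numpy as np


def l2_weight_movement(initial_weights, current_weights):
    """Return the L2 norm of (current - initial) for each weight tensor."""
    return [
        float(np.linalg.norm((w - w0).ravel()))
        for w0, w in zip(initial_weights, current_weights)
    ]


# Toy shapes standing in for [kernel_1, bias_1, kernel_2, bias_2] (assumed layout).
rng = np.random.default_rng(0)
shapes = [(8, 4), (4,), (4, 3), (3,)]
init = [rng.normal(size=s).astype(np.float32) for s in shapes]

# Shift every entry by 0.5, so each norm should be close to 0.5 * sqrt(size).
moved = [w + np.float32(0.5) for w in init]

print(l2_weight_movement(init, init))   # no movement: [0.0, 0.0, 0.0, 0.0]
print(l2_weight_movement(init, moved))
```

Reporting one norm per tensor rather than a single global norm keeps kernels and biases separate, which matches the shape of the lists logged in the output.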

opt = sgd, lr = 1.000000, alpha = 10000000.000000


'Epoch: 0009, Total Run Time: 00:01:31 - Loss: 2.3026e-14, Accuracy: 0.094, Test Loss: 2.3026e-14, Test Accuracy: 0.093 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09350000321865082
min training loss 2.3025665711129753e-14
max validation accuracy 0.09279999881982803
min validation loss 2.3025850364312254e-14

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 1.000000, alpha = 1000000.000000


'Epoch: 0009, Total Run Time: 00:01:33 - Loss: 2.3026e-12, Accuracy: 0.086, Test Loss: 2.3026e-12, Test Accuracy: 0.087 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.08643999695777893
min training loss 2.3025997341469262e-12
max validation accuracy 0.08709999918937683
min validation loss 2.302588675284767e-12

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 1.000000, alpha = 100000.000000


'Epoch: 0009, Total Run Time: 00:01:38 - Loss: 2.3026e-10, Accuracy: 0.104, Test Loss: 2.3026e-10, Test Accuracy: 0.105 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10366000235080719
min training loss 2.3026328066499957e-10
max validation accuracy 0.10540000349283218
min validation loss 2.3025850670599368e-10

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 1.000000, alpha = 10000.000000


'Epoch: 0009, Total Run Time: 00:01:34 - Loss: 2.3026e-08, Accuracy: 0.091, Test Loss: 2.3026e-08, Test Accuracy: 0.094 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09102000296115875
min training loss 2.3025949147381652e-08
max validation accuracy 0.09449999779462814
min validation loss 2.3025791051622946e-08

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 1.000000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:01:30 - Loss: 2.3026e-06, Accuracy: 0.105, Test Loss: 2.3026e-06, Test Accuracy: 0.105 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10475999861955643
min training loss 2.302632083228673e-06
max validation accuracy 0.10530000180006027
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 1.000000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:01:31 - Loss: 2.3026e-04, Accuracy: 0.128, Test Loss: 2.3026e-04, Test Accuracy: 0.134 - L2 Norm of Weight Movement From Initialization: [0.0011157115, 2.9923829e-05, 0.00032210635, 3.0464722e-05]'


max training accuracy 0.1281999945640564
min training loss 0.00023025514383334666
max validation accuracy 0.1339000016450882
min validation loss 0.0002302593638887629

l2-normed weight changes from initial values after last epoch:
[0.0011157115, 2.9923829e-05, 0.00032210635, 3.0464722e-05]
opt = sgd, lr = 1.000000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:01:30 - Loss: 1.6597e-02, Accuracy: 0.406, Test Loss: 1.6601e-02, Test Accuracy: 0.409 - L2 Norm of Weight Movement From Initialization: [17.466265, 2.7950547, 3.642623, 3.560694]'


max training accuracy 0.4057599902153015
min training loss 0.01659747026860714
max validation accuracy 0.40869998931884766
min validation loss 0.016600918024778366

l2-normed weight changes from initial values after last epoch:
[17.466265, 2.7950547, 3.642623, 3.560694]
opt = sgd, lr = 1.000000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:01:30 - Loss: 9.0792e-02, Accuracy: 0.126, Test Loss: 9.2181e-02, Test Accuracy: 0.102 - L2 Norm of Weight Movement From Initialization: [14.745806, 1.4767742, 6.107231, 0.3025617]'


max training accuracy 0.13857999444007874
min training loss 0.08967292308807373
max validation accuracy 0.16349999606609344
min validation loss 0.08523838222026825

l2-normed weight changes from initial values after last epoch:
[14.745806, 1.4767742, 6.107231, 0.3025617]
opt = sgd, lr = 1.000000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:01:31 - Loss: 1.4506e+01, Accuracy: 0.100, Test Loss: 1.4506e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [64.387794, 2.1683698, 18.199444, 2.3300815]'


max training accuracy 0.10000000149011612
min training loss 14.499810218811035
max validation accuracy 0.10000000149011612
min validation loss 14.50627326965332

l2-normed weight changes from initial values after last epoch:
[64.387794, 2.1683698, 18.199444, 2.3300815]
opt = sgd, lr = 1.000000, alpha = 0.500000


'Epoch: 0009, Total Run Time: 00:01:30 - Loss: 5.5530e+01, Accuracy: 0.100, Test Loss: 5.5530e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [280.2445, 9.767775, 1319.8534, 10.442392]'


max training accuracy 0.10000000149011612
min training loss 55.50981903076172
max validation accuracy 0.10000000149011612
min validation loss 55.52980041503906

l2-normed weight changes from initial values after last epoch:
[280.2445, 9.767775, 1319.8534, 10.442392]
opt = sgd, lr = 1.000000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:01:30 - Loss: 1.2434e+03, Accuracy: 0.100, Test Loss: 1.2434e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [8128.575, 262.55762, 2542.33, 292.8512]'


max training accuracy 0.10000000149011612
min training loss 1242.8526611328125
max validation accuracy 0.10000000149011612
min validation loss 1243.396484375

l2-normed weight changes from initial values after last epoch:
[8128.575, 262.55762, 2542.33, 292.8512]
opt = sgd, lr = 1.000000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:01:29 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [596398.06, 24251.34, 175318.9, 25775.72]'


max training accuracy 0.10000000149011612
min training loss 103572.2421875
max validation accuracy 0.10000000149011612
min validation loss 103617.046875

l2-normed weight changes from initial values after last epoch:
[596398.06, 24251.34, 175318.9, 25775.72]
opt = sgd, lr = 0.100000, alpha = 10000000.000000


'Epoch: 0009, Total Run Time: 00:01:29 - Loss: 2.3026e-14, Accuracy: 0.086, Test Loss: 2.3026e-14, Test Accuracy: 0.089 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.08579999953508377
min training loss 2.3025665711129753e-14
max validation accuracy 0.0885000005364418
min validation loss 2.3025850364312254e-14

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.100000, alpha = 1000000.000000


'Epoch: 0009, Total Run Time: 00:01:29 - Loss: 2.3026e-12, Accuracy: 0.103, Test Loss: 2.3026e-12, Test Accuracy: 0.102 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10335999727249146
min training loss 2.3025997341469262e-12
max validation accuracy 0.10239999741315842
min validation loss 2.302588675284767e-12

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.100000, alpha = 100000.000000


'Epoch: 0009, Total Run Time: 00:01:30 - Loss: 2.3026e-10, Accuracy: 0.103, Test Loss: 2.3026e-10, Test Accuracy: 0.103 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10283999890089035
min training loss 2.3026328066499957e-10
max validation accuracy 0.10260000079870224
min validation loss 2.3025850670599368e-10

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.100000, alpha = 10000.000000


'Epoch: 0009, Total Run Time: 00:01:28 - Loss: 2.3026e-08, Accuracy: 0.076, Test Loss: 2.3026e-08, Test Accuracy: 0.075 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.07581999897956848
min training loss 2.3025949147381652e-08
max validation accuracy 0.07450000196695328
min validation loss 2.3025791051622946e-08

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.100000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:01:29 - Loss: 2.3026e-06, Accuracy: 0.090, Test Loss: 2.3026e-06, Test Accuracy: 0.087 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09021999686956406
min training loss 2.302632083228673e-06
max validation accuracy 0.08659999817609787
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.100000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 2.2995e-04, Accuracy: 0.104, Test Loss: 2.2985e-04, Test Accuracy: 0.105 - L2 Norm of Weight Movement From Initialization: [0.06319915, 0.0016370281, 0.017728776, 0.0015774898]'


max training accuracy 0.10688000172376633
min training loss 0.0002299540356034413
max validation accuracy 0.10980000346899033
min validation loss 0.0002298453327966854

l2-normed weight changes from initial values after last epoch:
[0.06319915, 0.0016370281, 0.017728776, 0.0015774898]
opt = sgd, lr = 0.100000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:01:30 - Loss: 1.5665e-02, Accuracy: 0.430, Test Loss: 1.6182e-02, Test Accuracy: 0.431 - L2 Norm of Weight Movement From Initialization: [6.284316, 0.71012884, 3.29916, 0.6048763]'


max training accuracy 0.4298799932003021
min training loss 0.015665274113416672
max validation accuracy 0.4309999942779541
min validation loss 0.01618211343884468

l2-normed weight changes from initial values after last epoch:
[6.284316, 0.71012884, 3.29916, 0.6048763]
opt = sgd, lr = 0.100000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:01:28 - Loss: 5.7926e-02, Accuracy: 0.482, Test Loss: 5.9141e-02, Test Accuracy: 0.484 - L2 Norm of Weight Movement From Initialization: [13.549016, 1.768065, 4.1108456, 2.1841493]'


max training accuracy 0.4815399944782257
min training loss 0.05792620778083801
max validation accuracy 0.48410001397132874
min validation loss 0.059141356498003006

l2-normed weight changes from initial values after last epoch:
[13.549016, 1.768065, 4.1108456, 2.1841493]
opt = sgd, lr = 0.100000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 2.3101e+00, Accuracy: 0.101, Test Loss: 2.3121e+00, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [25.048271, 1.0870311, 15.605741, 0.44250396]'


max training accuracy 0.10288000106811523
min training loss 2.3092122077941895
max validation accuracy 0.1111999973654747
min validation loss 2.287990093231201

l2-normed weight changes from initial values after last epoch:
[25.048271, 1.0870311, 15.605741, 0.44250396]
opt = sgd, lr = 0.100000, alpha = 0.500000


'Epoch: 0009, Total Run Time: 00:01:28 - Loss: 5.5530e+01, Accuracy: 0.100, Test Loss: 5.5530e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [36.523777, 1.2436191, 83.20046, 1.2521315]'


max training accuracy 0.10006000101566315
min training loss 55.49504089355469
max validation accuracy 0.10000000149011612
min validation loss 55.52981185913086

l2-normed weight changes from initial values after last epoch:
[36.523777, 1.2436191, 83.20046, 1.2521315]
opt = sgd, lr = 0.100000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:01:29 - Loss: 1.2434e+03, Accuracy: 0.100, Test Loss: 1.2434e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [1285.915, 46.89927, 397.59528, 50.01398]'


max training accuracy 0.10000000149011612
min training loss 1242.874755859375
max validation accuracy 0.10000000149011612
min validation loss 1243.396728515625

l2-normed weight changes from initial values after last epoch:
[1285.915, 46.89927, 397.59528, 50.01398]
opt = sgd, lr = 0.100000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:01:30 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [84417.42, 2820.396, 26056.498, 3015.306]'


max training accuracy 0.10000000149011612
min training loss 103577.5703125
max validation accuracy 0.10000000149011612
min validation loss 103617.0625

l2-normed weight changes from initial values after last epoch:
[84417.42, 2820.396, 26056.498, 3015.306]
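
A quick arithmetic check on the large-alpha runs above: the logged losses look like ln(10) / alpha^2, where ln(10) is the cross-entropy of a uniform guess over the 10 CIFAR10 classes. The sketch below compares three validation losses copied from the lr = 0.1 runs against that formula. The 1/alpha^2 form is inferred from the logged numbers only; the actual scaling lives inside `train`, which is not shown here, so treat it as an assumption.

```python
import math

NUM_CLASSES = 10  # CIFAR10

# alpha -> "min validation loss" copied from the lr = 0.1 runs in the log above
LOGGED_VAL_LOSS = {
    1e7: 2.3025850364312254e-14,
    1e5: 2.3025850670599368e-10,
    1e3: 2.3025800146569964e-06,
}


def predicted_loss(alpha, num_classes=NUM_CLASSES):
    """Cross-entropy of a uniform prediction, divided by alpha squared (assumed form)."""
    return math.log(num_classes) / alpha ** 2


def relative_error(alpha, logged):
    """Relative gap between a logged loss and the assumed ln(10) / alpha^2 value."""
    expected = predicted_loss(alpha)
    return abs(logged - expected) / expected


for alpha, logged in LOGGED_VAL_LOSS.items():
    print('alpha = %g, logged = %.6e, predicted = %.6e, rel. error = %.1e'
          % (alpha, logged, predicted_loss(alpha), relative_error(alpha, logged)))
```

If the assumption holds, a loss stuck at ln(10) / alpha^2 means the network is still predicting at chance, which matches the roughly 0.1 accuracy and zero weight movement in those runs.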
opt = sgd, lr = 0.010000, alpha = 10000000.000000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 2.3026e-14, Accuracy: 0.102, Test Loss: 2.3026e-14, Test Accuracy: 0.104 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10209999978542328
min training loss 2.3025665711129753e-14
max validation accuracy 0.10400000214576721
min validation loss 2.3025850364312254e-14

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.010000, alpha = 1000000.000000


'Epoch: 0009, Total Run Time: 00:01:28 - Loss: 2.3026e-12, Accuracy: 0.105, Test Loss: 2.3026e-12, Test Accuracy: 0.101 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10462000221014023
min training loss 2.3025997341469262e-12
max validation accuracy 0.10130000114440918
min validation loss 2.302588675284767e-12

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.010000, alpha = 100000.000000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 2.3026e-10, Accuracy: 0.106, Test Loss: 2.3026e-10, Test Accuracy: 0.102 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10576000064611435
min training loss 2.3026328066499957e-10
max validation accuracy 0.10199999809265137
min validation loss 2.3025850670599368e-10

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.010000, alpha = 10000.000000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 2.3026e-08, Accuracy: 0.099, Test Loss: 2.3026e-08, Test Accuracy: 0.097 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09867999702692032
min training loss 2.3025949147381652e-08
max validation accuracy 0.09700000286102295
min validation loss 2.3025791051622946e-08

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.010000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:01:32 - Loss: 2.3026e-06, Accuracy: 0.097, Test Loss: 2.3026e-06, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09685999900102615
min training loss 2.302632083228673e-06
max validation accuracy 0.10019999742507935
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.010000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:01:31 - Loss: 2.3026e-04, Accuracy: 0.095, Test Loss: 2.3026e-04, Test Accuracy: 0.093 - L2 Norm of Weight Movement From Initialization: [0.002840842, 5.1637737e-05, 0.0006502471, 4.5848632e-05]'


max training accuracy 0.09517999738454819
min training loss 0.00023025812697596848
max validation accuracy 0.09390000253915787
min validation loss 0.00023026196868158877

l2-normed weight changes from initial values after last epoch:
[0.002840842, 5.1637737e-05, 0.0006502471, 4.5848632e-05]
opt = sgd, lr = 0.010000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:01:29 - Loss: 1.8625e-02, Accuracy: 0.233, Test Loss: 1.8595e-02, Test Accuracy: 0.238 - L2 Norm of Weight Movement From Initialization: [1.7309929, 0.1449064, 0.74818456, 0.13955392]'


max training accuracy 0.23348000645637512
min training loss 0.018625035881996155
max validation accuracy 0.23770000040531158
min validation loss 0.01859469898045063

l2-normed weight changes from initial values after last epoch:
[1.7309929, 0.1449064, 0.74818456, 0.13955392]
opt = sgd, lr = 0.010000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 6.3853e-02, Accuracy: 0.432, Test Loss: 6.4097e-02, Test Accuracy: 0.437 - L2 Norm of Weight Movement From Initialization: [3.8946943, 0.454309, 2.0324702, 0.3691277]'


max training accuracy 0.4320000112056732
min training loss 0.06385335326194763
max validation accuracy 0.4372999966144562
min validation loss 0.06409714370965958

l2-normed weight changes from initial values after last epoch:
[3.8946943, 0.454309, 2.0324702, 0.3691277]
opt = sgd, lr = 0.010000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:01:30 - Loss: 1.5703e+00, Accuracy: 0.443, Test Loss: 1.7030e+00, Test Accuracy: 0.395 - L2 Norm of Weight Movement From Initialization: [17.231565, 2.93126, 3.2676954, 3.0236156]'


max training accuracy 0.4427799880504608
min training loss 1.5702654123306274
max validation accuracy 0.42309999465942383
min validation loss 1.6167796850204468

l2-normed weight changes from initial values after last epoch:
[17.231565, 2.93126, 3.2676954, 3.0236156]
opt = sgd, lr = 0.010000, alpha = 0.500000


'Epoch: 0009, Total Run Time: 00:01:29 - Loss: 9.2137e+00, Accuracy: 0.101, Test Loss: 9.2188e+00, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [11.790709, 0.6200805, 5.7813225, 0.20874594]'


max training accuracy 0.11535999923944473
min training loss 9.149772644042969
max validation accuracy 0.1386999934911728
min validation loss 8.910216331481934

l2-normed weight changes from initial values after last epoch:
[11.790709, 0.6200805, 5.7813225, 0.20874594]
opt = sgd, lr = 0.010000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:01:28 - Loss: 1.2434e+03, Accuracy: 0.100, Test Loss: 1.2434e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [64.52841, 2.339535, 17.914055, 2.2179146]'


max training accuracy 0.10000000149011612
min training loss 1242.8314208984375
max validation accuracy 0.10000000149011612
min validation loss 1243.396728515625

l2-normed weight changes from initial values after last epoch:
[64.52841, 2.339535, 17.914055, 2.2179146]
opt = sgd, lr = 0.010000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:01:28 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [9047.3955, 330.96738, 3029.1943, 352.78625]'


max training accuracy 0.10000000149011612
min training loss 103579.453125
max validation accuracy 0.10000000149011612
min validation loss 103617.046875

l2-normed weight changes from initial values after last epoch:
[9047.3955, 330.96738, 3029.1943, 352.78625]
opt = sgd, lr = 0.001000, alpha = 10000000.000000


'Epoch: 0009, Total Run Time: 00:01:29 - Loss: 2.3026e-14, Accuracy: 0.105, Test Loss: 2.3026e-14, Test Accuracy: 0.101 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10512000322341919
min training loss 2.3025665711129753e-14
max validation accuracy 0.10100000351667404
min validation loss 2.3025850364312254e-14

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.001000, alpha = 1000000.000000


'Epoch: 0009, Total Run Time: 00:01:30 - Loss: 2.3026e-12, Accuracy: 0.106, Test Loss: 2.3026e-12, Test Accuracy: 0.104 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10614000260829926
min training loss 2.3025997341469262e-12
max validation accuracy 0.10429999977350235
min validation loss 2.302588675284767e-12

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.001000, alpha = 100000.000000


'Epoch: 0009, Total Run Time: 00:01:28 - Loss: 2.3026e-10, Accuracy: 0.101, Test Loss: 2.3026e-10, Test Accuracy: 0.101 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10051999986171722
min training loss 2.3026328066499957e-10
max validation accuracy 0.10109999775886536
min validation loss 2.3025850670599368e-10

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.001000, alpha = 10000.000000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 2.3026e-08, Accuracy: 0.100, Test Loss: 2.3026e-08, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09982000291347504
min training loss 2.3025949147381652e-08
max validation accuracy 0.10029999911785126
min validation loss 2.3025791051622946e-08

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.001000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 2.3026e-06, Accuracy: 0.099, Test Loss: 2.3026e-06, Test Accuracy: 0.097 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09944000095129013
min training loss 2.302632083228673e-06
max validation accuracy 0.09709999710321426
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.001000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 2.3018e-04, Accuracy: 0.110, Test Loss: 2.3025e-04, Test Accuracy: 0.118 - L2 Norm of Weight Movement From Initialization: [0.00041355574, 1.3301002e-05, 0.00014269097, 7.3098813e-06]'


max training accuracy 0.11014000326395035
min training loss 0.00023018104548100382
max validation accuracy 0.11779999732971191
min validation loss 0.00023024642723612487

l2-normed weight changes from initial values after last epoch:
[0.00041355574, 1.3301002e-05, 0.00014269097, 7.3098813e-06]
opt = sgd, lr = 0.001000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 2.0872e-02, Accuracy: 0.114, Test Loss: 2.0834e-02, Test Accuracy: 0.115 - L2 Norm of Weight Movement From Initialization: [0.4857198, 0.018770743, 0.14852796, 0.022330372]'


max training accuracy 0.11386000365018845
min training loss 0.020872239023447037
max validation accuracy 0.1145000010728836
min validation loss 0.020834116265177727

l2-normed weight changes from initial values after last epoch:
[0.4857198, 0.018770743, 0.14852796, 0.022330372]
opt = sgd, lr = 0.001000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 7.4512e-02, Accuracy: 0.343, Test Loss: 7.4503e-02, Test Accuracy: 0.335 - L2 Norm of Weight Movement From Initialization: [1.1709511, 0.06414604, 0.46405303, 0.06708013]'


max training accuracy 0.34290000796318054
min training loss 0.07451184093952179
max validation accuracy 0.335999995470047
min validation loss 0.07450321316719055

l2-normed weight changes from initial values after last epoch:
[1.1709511, 0.06414604, 0.46405303, 0.06708013]
opt = sgd, lr = 0.001000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:01:33 - Loss: 1.3928e+00, Accuracy: 0.513, Test Loss: 1.4265e+00, Test Accuracy: 0.500 - L2 Norm of Weight Movement From Initialization: [6.7479134, 0.9367003, 3.166426, 0.94252664]'


max training accuracy 0.5126799941062927
min training loss 1.3928364515304565
max validation accuracy 0.5001999735832214
min validation loss 1.4264897108078003

l2-normed weight changes from initial values after last epoch:
[6.7479134, 0.9367003, 3.166426, 0.94252664]
opt = sgd, lr = 0.001000, alpha = 0.500000


'Epoch: 0009, Total Run Time: 00:01:33 - Loss: 5.5570e+00, Accuracy: 0.508, Test Loss: 5.8832e+00, Test Accuracy: 0.482 - L2 Norm of Weight Movement From Initialization: [13.602513, 1.6406225, 3.7420678, 2.4693606]'


max training accuracy 0.5079799890518188
min training loss 5.557001113891602
max validation accuracy 0.4821999967098236
min validation loss 5.8831682205200195

l2-normed weight changes from initial values after last epoch:
[13.602513, 1.6406225, 3.7420678, 2.4693606]
opt = sgd, lr = 0.001000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:01:30 - Loss: 1.2434e+03, Accuracy: 0.100, Test Loss: 1.2434e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [18.55185, 0.7219624, 19.908718, 0.798692]'


max training accuracy 0.10007999837398529
min training loss 1241.932373046875
max validation accuracy 0.10000000149011612
min validation loss 1243.396484375

l2-normed weight changes from initial values after last epoch:
[18.55185, 0.7219624, 19.908718, 0.798692]
opt = sgd, lr = 0.001000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:01:30 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [739.8659, 26.920643, 214.2995, 29.101757]'


max training accuracy 0.10000000149011612
min training loss 103573.1171875
max validation accuracy 0.10000000149011612
min validation loss 103617.0703125

l2-normed weight changes from initial values after last epoch:
[739.8659, 26.920643, 214.2995, 29.101757]
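
The lr = 0.001 sweep above can be summarized with a few lines of Python. This is a minimal sketch: the tuples are copied by hand from the log (first entry of each logged weight-movement list, and max validation accuracy rounded to four places), and the helper names `movement_threshold` and `best_alpha` are hypothetical, not part of the notebook.

```python
# (alpha, first entry of the logged weight-movement list, max validation accuracy),
# copied from the lr = 0.001 runs in the log above.
RUNS_LR_0_001 = [
    (1e7, 0.0, 0.1010),
    (1e6, 0.0, 0.1043),
    (1e5, 0.0, 0.1011),
    (1e4, 0.0, 0.1003),
    (1e3, 0.0, 0.0971),
    (100.0, 0.00041355574, 0.1178),
    (10.0, 0.4857198, 0.1145),
    (5.0, 1.1709511, 0.3360),
    (1.0, 6.7479134, 0.5002),
    (0.5, 13.602513, 0.4822),
    (0.1, 18.55185, 0.1000),
    (0.01, 739.8659, 0.1000),
]


def movement_threshold(runs):
    """Largest alpha in the grid whose logged weight movement is nonzero."""
    moved = [alpha for alpha, movement, _ in runs if movement > 0.0]
    return max(moved) if moved else None


def best_alpha(runs):
    """Alpha with the highest max validation accuracy."""
    return max(runs, key=lambda run: run[2])[0]


print('weights first move at alpha =', movement_threshold(RUNS_LR_0_001))
print('validation accuracy peaks at alpha =', best_alpha(RUNS_LR_0_001))
```

On this coarse grid the weights first move at alpha = 100 while validation accuracy peaks at alpha = 1, so the peak sits well past the movement threshold. The threshold is only resolved to the nearest grid point, which is presumably why the next cell uses a finer set of alphas.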


In [9]:
# Sweep every (learning rate, alpha) pair for the width-256 model and store
# whatever train() returns for each configuration.
lrs = [1.0, 0.1, 0.01, 0.001]
alphas = [20000.0, 1000.0, 100.0, 80.0, 60.0, 40.0, 30.0, 20.0, 10.0,
          9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0,
          0.8, 0.6, 0.4, 0.2, 0.1, 0.07, 0.04, 0.01, 0.001]
normed_weight_changes_w256 = {}  # keyed by (optimizer, learning rate, alpha)
optimizer = 'sgd'
num_epochs = 10
for learning_rate in lrs:
    for alpha_val in alphas:
        print('=' * 80)
        print('opt = %s, lr = %f, alpha = %f' % (optimizer, learning_rate, alpha_val))
        print('=' * 80)
        normed_weight_changes_w256[(optimizer, learning_rate, alpha_val)] = train(
            alpha=alpha_val, epoch=num_epochs, opt=optimizer, lr=learning_rate,
            scaling=True, width=256)

opt = sgd, lr = 1.000000, alpha = 20000.000000


'Epoch: 0009, Total Run Time: 00:01:29 - Loss: 5.7565e-09, Accuracy: 0.097, Test Loss: 5.7564e-09, Test Accuracy: 0.098 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09724000096321106
min training loss 5.756487286845413e-09
max validation accuracy 0.09769999980926514
min validation loss 5.7564477629057365e-09

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 1.000000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:01:30 - Loss: 2.3026e-06, Accuracy: 0.105, Test Loss: 2.3026e-06, Test Accuracy: 0.106 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10543999820947647
min training loss 2.302632083228673e-06
max validation accuracy 0.10599999874830246
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 1.000000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:01:24 - Loss: 2.2763e-04, Accuracy: 0.103, Test Loss: 2.2774e-04, Test Accuracy: 0.102 - L2 Norm of Weight Movement From Initialization: [0.41873896, 0.032245055, 0.12712727, 0.03279595]'


max training accuracy 0.10766000300645828
min training loss 0.00022762635489925742
max validation accuracy 0.10520000010728836
min validation loss 0.0002277366875205189

l2-normed weight changes from initial values after last epoch:
[0.41873896, 0.032245055, 0.12712727, 0.03279595]
opt = sgd, lr = 1.000000, alpha = 80.000000


'Epoch: 0009, Total Run Time: 00:01:24 - Loss: 3.1245e-04, Accuracy: 0.163, Test Loss: 3.1159e-04, Test Accuracy: 0.163 - L2 Norm of Weight Movement From Initialization: [1.5545715, 0.15130505, 0.6627439, 0.120959364]'


max training accuracy 0.16266000270843506
min training loss 0.0003124534559901804
max validation accuracy 0.1648000031709671
min validation loss 0.000311594019876793

l2-normed weight changes from initial values after last epoch:
[1.5545715, 0.15130505, 0.6627439, 0.120959364]
opt = sgd, lr = 1.000000, alpha = 60.000000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 5.9624e-04, Accuracy: 0.082, Test Loss: 5.9243e-04, Test Accuracy: 0.076 - L2 Norm of Weight Movement From Initialization: [1.2019433, 0.0922227, 0.48264205, 0.102145374]'


max training accuracy 0.08209999650716782
min training loss 0.0005962415016256273
max validation accuracy 0.0763000026345253
min validation loss 0.0005924340803176165

l2-normed weight changes from initial values after last epoch:
[1.2019433, 0.0922227, 0.48264205, 0.102145374]
opt = sgd, lr = 1.000000, alpha = 40.000000


'Epoch: 0009, Total Run Time: 00:01:23 - Loss: 1.1196e-03, Accuracy: 0.114, Test Loss: 1.1262e-03, Test Accuracy: 0.115 - L2 Norm of Weight Movement From Initialization: [3.8021646, 0.5926871, 1.8828449, 0.81488484]'


max training accuracy 0.12831999361515045
min training loss 0.0011196063132956624
max validation accuracy 0.14650000631809235
min validation loss 0.001126211485825479

l2-normed weight changes from initial values after last epoch:
[3.8021646, 0.5926871, 1.8828449, 0.81488484]
opt = sgd, lr = 1.000000, alpha = 30.000000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 1.8693e-03, Accuracy: 0.327, Test Loss: 1.9099e-03, Test Accuracy: 0.320 - L2 Norm of Weight Movement From Initialization: [5.943694, 0.89668405, 3.2208214, 0.98353493]'


max training accuracy 0.3273800015449524
min training loss 0.0018692879239097238
max validation accuracy 0.32010000944137573
min validation loss 0.0019098591292276978

l2-normed weight changes from initial values after last epoch:
[5.943694, 0.89668405, 3.2208214, 0.98353493]
opt = sgd, lr = 1.000000, alpha = 20.000000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 3.9256e-03, Accuracy: 0.407, Test Loss: 4.0110e-03, Test Accuracy: 0.416 - L2 Norm of Weight Movement From Initialization: [10.326107, 1.1340399, 4.304434, 1.3194847]'


max training accuracy 0.4069199860095978
min training loss 0.003925600089132786
max validation accuracy 0.41620001196861267
min validation loss 0.004010975360870361

l2-normed weight changes from initial values after last epoch:
[10.326107, 1.1340399, 4.304434, 1.3194847]
opt = sgd, lr = 1.000000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:01:23 - Loss: 1.6432e-02, Accuracy: 0.418, Test Loss: 1.6911e-02, Test Accuracy: 0.409 - L2 Norm of Weight Movement From Initialization: [17.473753, 2.765965, 3.9066038, 3.4901977]'


max training accuracy 0.417820006608963
min training loss 0.016432296484708786
max validation accuracy 0.4228000044822693
min validation loss 0.01678049936890602

l2-normed weight changes from initial values after last epoch:
[17.473753, 2.765965, 3.9066038, 3.4901977]
opt = sgd, lr = 1.000000, alpha = 9.000000


'Epoch: 0009, Total Run Time: 00:01:23 - Loss: 2.1042e-02, Accuracy: 0.383, Test Loss: 2.1145e-02, Test Accuracy: 0.378 - L2 Norm of Weight Movement From Initialization: [17.631014, 3.1739748, 3.7417345, 3.6165082]'


max training accuracy 0.3847599923610687
min training loss 0.021041516214609146
max validation accuracy 0.3776000142097473
min validation loss 0.021145455539226532

l2-normed weight changes from initial values after last epoch:
[17.631014, 3.1739748, 3.7417345, 3.6165082]
opt = sgd, lr = 1.000000, alpha = 8.000000


'Epoch: 0009, Total Run Time: 00:01:22 - Loss: 2.7284e-02, Accuracy: 0.361, Test Loss: 2.7110e-02, Test Accuracy: 0.368 - L2 Norm of Weight Movement From Initialization: [19.008566, 3.4754574, 3.8647196, 3.5674872]'


max training accuracy 0.36438000202178955
min training loss 0.027284396812319756
max validation accuracy 0.3675000071525574
min validation loss 0.027110259979963303

l2-normed weight changes from initial values after last epoch:
[19.008566, 3.4754574, 3.8647196, 3.5674872]
opt = sgd, lr = 1.000000, alpha = 7.000000


'Epoch: 0009, Total Run Time: 00:01:23 - Loss: 3.7201e-02, Accuracy: 0.331, Test Loss: 3.6845e-02, Test Accuracy: 0.324 - L2 Norm of Weight Movement From Initialization: [19.112148, 4.332064, 3.983906, 2.5680082]'


max training accuracy 0.3310000002384186
min training loss 0.037201158702373505
max validation accuracy 0.34779998660087585
min validation loss 0.03684544563293457

l2-normed weight changes from initial values after last epoch:
[19.112148, 4.332064, 3.983906, 2.5680082]
opt = sgd, lr = 1.000000, alpha = 6.000000


'Epoch: 0009, Total Run Time: 00:01:24 - Loss: 5.7896e-02, Accuracy: 0.204, Test Loss: 5.9301e-02, Test Accuracy: 0.204 - L2 Norm of Weight Movement From Initialization: [16.61676, 4.217694, 4.9302197, 0.9271555]'


max training accuracy 0.2125999927520752
min training loss 0.05755740404129028
max validation accuracy 0.20440000295639038
min validation loss 0.05807633697986603

l2-normed weight changes from initial values after last epoch:
[16.61676, 4.217694, 4.9302197, 0.9271555]
opt = sgd, lr = 1.000000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:01:22 - Loss: 9.2115e-02, Accuracy: 0.104, Test Loss: 9.2226e-02, Test Accuracy: 0.101 - L2 Norm of Weight Movement From Initialization: [15.5188675, 1.2368046, 5.7322874, 0.26405898]'


max training accuracy 0.13504000008106232
min training loss 0.08974801748991013
max validation accuracy 0.1745000034570694
min validation loss 0.08715963363647461

l2-normed weight changes from initial values after last epoch:
[15.5188675, 1.2368046, 5.7322874, 0.26405898]
opt = sgd, lr = 1.000000, alpha = 4.000000


'Epoch: 0009, Total Run Time: 00:01:22 - Loss: 1.4414e-01, Accuracy: 0.100, Test Loss: 1.4405e-01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [10.830339, 0.39428785, 5.150262, 0.2135633]'


max training accuracy 0.10226000100374222
min training loss 0.14404340088367462
max validation accuracy 0.10000000149011612
min validation loss 0.14393693208694458

l2-normed weight changes from initial values after last epoch:
[10.830339, 0.39428785, 5.150262, 0.2135633]
opt = sgd, lr = 1.000000, alpha = 3.000000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 2.5672e-01, Accuracy: 0.099, Test Loss: 2.5644e-01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [23.910095, 0.84055763, 15.963717, 0.32284486]'


max training accuracy 0.10165999829769135
min training loss 0.256670743227005
max validation accuracy 0.10010000318288803
min validation loss 0.25637125968933105

l2-normed weight changes from initial values after last epoch:
[23.910095, 0.84055763, 15.963717, 0.32284486]
opt = sgd, lr = 1.000000, alpha = 2.000000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 3.6266e+00, Accuracy: 0.100, Test Loss: 3.6266e+00, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [20.859085, 0.8590329, 29.069435, 1.0291991]'


max training accuracy 0.10006000101566315
min training loss 3.6243762969970703
max validation accuracy 0.10000000149011612
min validation loss 3.6265676021575928

l2-normed weight changes from initial values after last epoch:
[20.859085, 0.8590329, 29.069435, 1.0291991]
opt = sgd, lr = 1.000000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 1.4506e+01, Accuracy: 0.100, Test Loss: 1.4506e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [67.4401, 2.0009003, 132.09991, 2.234769]'


max training accuracy 0.10000000149011612
min training loss 14.499731063842773
max validation accuracy 0.10000000149011612
min validation loss 14.50627326965332

l2-normed weight changes from initial values after last epoch:
[67.4401, 2.0009003, 132.09991, 2.234769]
opt = sgd, lr = 1.000000, alpha = 0.800000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 2.2352e+01, Accuracy: 0.100, Test Loss: 2.2352e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [93.606674, 3.6548486, 27.213808, 3.84976]'


max training accuracy 0.10000000149011612
min training loss 22.34124755859375
max validation accuracy 0.10000000149011612
min validation loss 22.352266311645508

l2-normed weight changes from initial values after last epoch:
[93.606674, 3.6548486, 27.213808, 3.84976]
opt = sgd, lr = 1.000000, alpha = 0.600000


'Epoch: 0009, Total Run Time: 00:01:28 - Loss: 3.9018e+01, Accuracy: 0.100, Test Loss: 3.9018e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [159.94589, 5.943975, 47.845215, 5.387581]'


max training accuracy 0.10000000149011612
min training loss 38.99819564819336
max validation accuracy 0.10000000149011612
min validation loss 39.01817321777344

l2-normed weight changes from initial values after last epoch:
[159.94589, 5.943975, 47.845215, 5.387581]
opt = sgd, lr = 1.000000, alpha = 0.400000


'Epoch: 0009, Total Run Time: 00:01:24 - Loss: 8.5509e+01, Accuracy: 0.100, Test Loss: 8.5510e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [761.7858, 24.012634, 203.99283, 23.774912]'


max training accuracy 0.10000000149011612
min training loss 85.4697494506836
max validation accuracy 0.10000000149011612
min validation loss 85.51007843017578

l2-normed weight changes from initial values after last epoch:
[761.7858, 24.012634, 203.99283, 23.774912]
opt = sgd, lr = 1.000000, alpha = 0.200000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 3.2644e+02, Accuracy: 0.100, Test Loss: 3.2644e+02, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [1814.4906, 63.61089, 603.6043, 72.20668]'


max training accuracy 0.10000000149011612
min training loss 326.2973327636719
max validation accuracy 0.10000000149011612
min validation loss 326.4447937011719

l2-normed weight changes from initial values after last epoch:
[1814.4906, 63.61089, 603.6043, 72.20668]
opt = sgd, lr = 1.000000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 1.2434e+03, Accuracy: 0.100, Test Loss: 1.2434e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [6239.953, 190.3665, 1982.0292, 197.1899]'


max training accuracy 0.10001999884843826
min training loss 1242.8341064453125
max validation accuracy 0.10000000149011612
min validation loss 1243.396728515625

l2-normed weight changes from initial values after last epoch:
[6239.953, 190.3665, 1982.0292, 197.1899]
opt = sgd, lr = 1.000000, alpha = 0.070000


'Epoch: 0009, Total Run Time: 00:01:29 - Loss: 2.4720e+03, Accuracy: 0.100, Test Loss: 2.4720e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [10741.759, 336.96606, 2983.4543, 304.0279]'


max training accuracy 0.10001999884843826
min training loss 2470.754150390625
max validation accuracy 0.10000000149011612
min validation loss 2472.034912109375

l2-normed weight changes from initial values after last epoch:
[10741.759, 336.96606, 2983.4543, 304.0279]
opt = sgd, lr = 1.000000, alpha = 0.040000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 7.2558e+03, Accuracy: 0.100, Test Loss: 7.2558e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [38604.95, 1278.1039, 12363.168, 1506.7021]'


max training accuracy 0.10000000149011612
min training loss 7252.42236328125
max validation accuracy 0.10000000149011612
min validation loss 7255.82421875

l2-normed weight changes from initial values after last epoch:
[38604.95, 1278.1039, 12363.168, 1506.7021]
opt = sgd, lr = 1.000000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [650032.7, 23797.451, 205709.58, 26988.72]'


max training accuracy 0.10000000149011612
min training loss 103573.453125
max validation accuracy 0.10000000149011612
min validation loss 103617.0625

l2-normed weight changes from initial values after last epoch:
[650032.7, 23797.451, 205709.58, 26988.72]
opt = sgd, lr = 1.000000, alpha = 0.001000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 8.2902e+06, Accuracy: 0.100, Test Loss: 8.2902e+06, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [95755900.0, 3336957.8, 26708270.0, 3368106.2]'


max training accuracy 0.10000000149011612
min training loss 8286753.5
max validation accuracy 0.10000000149011612
min validation loss 8290204.0

l2-normed weight changes from initial values after last epoch:
[95755900.0, 3336957.8, 26708270.0, 3368106.2]
opt = sgd, lr = 0.100000, alpha = 20000.000000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 5.7565e-09, Accuracy: 0.103, Test Loss: 5.7564e-09, Test Accuracy: 0.103 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10279999673366547
min training loss 5.756487286845413e-09
max validation accuracy 0.10279999673366547
min validation loss 5.7564477629057365e-09

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.100000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:01:24 - Loss: 2.3026e-06, Accuracy: 0.100, Test Loss: 2.3026e-06, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09997999668121338
min training loss 2.302632083228673e-06
max validation accuracy 0.10000000149011612
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.100000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:01:24 - Loss: 2.3025e-04, Accuracy: 0.089, Test Loss: 2.3026e-04, Test Accuracy: 0.087 - L2 Norm of Weight Movement From Initialization: [0.007517429, 0.00016141681, 0.0024248788, 0.00017723913]'


max training accuracy 0.08962000161409378
min training loss 0.00023025488189887255
max validation accuracy 0.0885000005364418
min validation loss 0.00023025879636406898

l2-normed weight changes from initial values after last epoch:
[0.007517429, 0.00016141681, 0.0024248788, 0.00017723913]
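
Sanity check on the three runs above at lr = 0.1 (alpha = 20000, 1000, 100), where the weights barely move: their minimum training losses agree with ln(10) / alpha^2 to within rounding. That is what one would expect if the loss is the 10-class cross-entropy divided by alpha^2 and the scaled network output is still effectively uniform over the classes. Both of these are assumptions on my part; the log itself does not state the loss form. A minimal sketch of the check, with the loss values copied from the log and a hypothetical helper name:

```python
import math

# Minimum training losses copied from the lr = 0.1 runs above, keyed by alpha.
logged_min_train_loss = {
    20000.0: 5.756487286845413e-09,
    1000.0: 2.302632083228673e-06,
    100.0: 0.00023025488189887255,
}

def uniform_ce_over_alpha_sq(alpha, num_classes=10):
    """Cross-entropy of a uniform prediction divided by alpha^2 (assumed loss form)."""
    return math.log(num_classes) / alpha ** 2

# Relative gap between each logged loss and the assumed closed form.
relative_errors = {
    alpha: abs(loss - uniform_ce_over_alpha_sq(alpha)) / uniform_ce_over_alpha_sq(alpha)
    for alpha, loss in logged_min_train_loss.items()
}

for alpha, err in relative_errors.items():
    print(f"alpha = {alpha:g}: relative error = {err:.1e}")
```

If the assumption holds, a loss sitting at ln(10) / alpha^2 after the last epoch is another sign, alongside the near-zero weight movement, that the run never left its initial predictions.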
opt = sgd, lr = 0.100000, alpha = 80.000000


'Epoch: 0009, Total Run Time: 00:01:24 - Loss: 3.5977e-04, Accuracy: 0.102, Test Loss: 3.5978e-04, Test Accuracy: 0.101 - L2 Norm of Weight Movement From Initialization: [0.004362196, 9.041206e-05, 0.0011462331, 6.318793e-05]'


max training accuracy 0.10153999924659729
min training loss 0.00035977447987534106
max validation accuracy 0.10119999945163727
min validation loss 0.00035977564402855933

l2-normed weight changes from initial values after last epoch:
[0.004362196, 9.041206e-05, 0.0011462331, 6.318793e-05]
opt = sgd, lr = 0.100000, alpha = 60.000000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 6.2129e-04, Accuracy: 0.098, Test Loss: 6.2037e-04, Test Accuracy: 0.099 - L2 Norm of Weight Movement From Initialization: [0.3401735, 0.011369001, 0.099322915, 0.012552504]'


max training accuracy 0.09842000156641006
min training loss 0.0006212916341610253
max validation accuracy 0.09889999777078629
min validation loss 0.0006203747470863163

l2-normed weight changes from initial values after last epoch:
[0.3401735, 0.011369001, 0.099322915, 0.012552504]
opt = sgd, lr = 0.100000, alpha = 40.000000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 1.3549e-03, Accuracy: 0.078, Test Loss: 1.3556e-03, Test Accuracy: 0.086 - L2 Norm of Weight Movement From Initialization: [0.80778825, 0.0665578, 0.28679842, 0.06465688]'


max training accuracy 0.08696000277996063
min training loss 0.0013548950664699078
max validation accuracy 0.08640000224113464
min validation loss 0.0013556383782997727

l2-normed weight changes from initial values after last epoch:
[0.80778825, 0.0665578, 0.28679842, 0.06465688]
opt = sgd, lr = 0.100000, alpha = 30.000000


'Epoch: 0009, Total Run Time: 00:01:24 - Loss: 2.1620e-03, Accuracy: 0.106, Test Loss: 2.1630e-03, Test Accuracy: 0.106 - L2 Norm of Weight Movement From Initialization: [1.5236617, 0.15569778, 0.6171805, 0.16218957]'


max training accuracy 0.10824000090360641
min training loss 0.002162036020308733
max validation accuracy 0.1103999987244606
min validation loss 0.0021629633847624063

l2-normed weight changes from initial values after last epoch:
[1.5236617, 0.15569778, 0.6171805, 0.16218957]
opt = sgd, lr = 0.100000, alpha = 20.000000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 4.5832e-03, Accuracy: 0.110, Test Loss: 4.5891e-03, Test Accuracy: 0.109 - L2 Norm of Weight Movement From Initialization: [2.4744003, 0.3958256, 1.1558793, 0.41110328]'


max training accuracy 0.11574000120162964
min training loss 0.004583227448165417
max validation accuracy 0.11810000240802765
min validation loss 0.004589103162288666

l2-normed weight changes from initial values after last epoch:
[2.4744003, 0.3958256, 1.1558793, 0.41110328]
opt = sgd, lr = 0.100000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:01:23 - Loss: 1.5672e-02, Accuracy: 0.401, Test Loss: 1.5825e-02, Test Accuracy: 0.403 - L2 Norm of Weight Movement From Initialization: [6.2570877, 0.6631088, 3.2189782, 0.5710258]'


max training accuracy 0.40088000893592834
min training loss 0.015671873465180397
max validation accuracy 0.4025000035762787
min validation loss 0.01582496240735054

l2-normed weight changes from initial values after last epoch:
[6.2570877, 0.6631088, 3.2189782, 0.5710258]
opt = sgd, lr = 0.100000, alpha = 9.000000


'Epoch: 0009, Total Run Time: 00:01:21 - Loss: 1.8807e-02, Accuracy: 0.453, Test Loss: 1.9251e-02, Test Accuracy: 0.456 - L2 Norm of Weight Movement From Initialization: [7.208417, 0.8669182, 3.5801575, 0.71956044]'


max training accuracy 0.4527199864387512
min training loss 0.018806682899594307
max validation accuracy 0.45559999346733093
min validation loss 0.019251316785812378

l2-normed weight changes from initial values after last epoch:
[7.208417, 0.8669182, 3.5801575, 0.71956044]
opt = sgd, lr = 0.100000, alpha = 8.000000


'Epoch: 0009, Total Run Time: 00:01:21 - Loss: 2.3348e-02, Accuracy: 0.463, Test Loss: 2.3957e-02, Test Accuracy: 0.463 - L2 Norm of Weight Movement From Initialization: [8.281945, 1.0472871, 3.8444595, 0.93343055]'


max training accuracy 0.46303999423980713
min training loss 0.023348428308963776
max validation accuracy 0.4634000062942505
min validation loss 0.023956742137670517

l2-normed weight changes from initial values after last epoch:
[8.281945, 1.0472871, 3.8444595, 0.93343055]
opt = sgd, lr = 0.100000, alpha = 7.000000


'Epoch: 0009, Total Run Time: 00:01:21 - Loss: 3.0014e-02, Accuracy: 0.475, Test Loss: 3.0689e-02, Test Accuracy: 0.481 - L2 Norm of Weight Movement From Initialization: [9.650502, 1.2502702, 4.0684233, 1.2128412]'


max training accuracy 0.4754199981689453
min training loss 0.030014336109161377
max validation accuracy 0.48100000619888306
min validation loss 0.030688857659697533

l2-normed weight changes from initial values after last epoch:
[9.650502, 1.2502702, 4.0684233, 1.2128412]
opt = sgd, lr = 0.100000, alpha = 6.000000


'Epoch: 0009, Total Run Time: 00:01:22 - Loss: 4.0144e-02, Accuracy: 0.480, Test Loss: 4.1529e-02, Test Accuracy: 0.483 - L2 Norm of Weight Movement From Initialization: [11.38466, 1.5515461, 4.259508, 1.6951067]'


max training accuracy 0.4802600145339966
min training loss 0.040143899619579315
max validation accuracy 0.4848000109195709
min validation loss 0.041266895830631256

l2-normed weight changes from initial values after last epoch:
[11.38466, 1.5515461, 4.259508, 1.6951067]
opt = sgd, lr = 0.100000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:01:22 - Loss: 5.7941e-02, Accuracy: 0.482, Test Loss: 6.2240e-02, Test Accuracy: 0.455 - L2 Norm of Weight Movement From Initialization: [13.617822, 1.7301579, 4.289348, 2.2236338]'


max training accuracy 0.48194000124931335
min training loss 0.057941220700740814
max validation accuracy 0.4699000120162964
min validation loss 0.05979766324162483

l2-normed weight changes from initial values after last epoch:
[13.617822, 1.7301579, 4.289348, 2.2236338]
opt = sgd, lr = 0.100000, alpha = 4.000000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 9.2519e-02, Accuracy: 0.474, Test Loss: 9.4012e-02, Test Accuracy: 0.475 - L2 Norm of Weight Movement From Initialization: [15.782785, 2.1027782, 3.9755561, 2.782597]'


max training accuracy 0.47360000014305115
min training loss 0.09251885861158371
max validation accuracy 0.4749999940395355
min validation loss 0.09401188045740128

l2-normed weight changes from initial values after last epoch:
[15.782785, 2.1027782, 3.9755561, 2.782597]
opt = sgd, lr = 0.100000, alpha = 3.000000


'Epoch: 0009, Total Run Time: 00:01:24 - Loss: 1.7736e-01, Accuracy: 0.430, Test Loss: 1.7944e-01, Test Accuracy: 0.426 - L2 Norm of Weight Movement From Initialization: [17.77824, 3.0437975, 3.628206, 3.04608]'


max training accuracy 0.43022000789642334
min training loss 0.17735616862773895
max validation accuracy 0.42890000343322754
min validation loss 0.17943543195724487

l2-normed weight changes from initial values after last epoch:
[17.77824, 3.0437975, 3.628206, 3.04608]
opt = sgd, lr = 0.100000, alpha = 2.000000


'Epoch: 0009, Total Run Time: 00:01:24 - Loss: 5.0679e-01, Accuracy: 0.246, Test Loss: 5.2748e-01, Test Accuracy: 0.213 - L2 Norm of Weight Movement From Initialization: [15.946052, 3.8762436, 4.025885, 1.0893966]'


max training accuracy 0.25036001205444336
min training loss 0.5045549273490906
max validation accuracy 0.24789999425411224
min validation loss 0.4915575087070465

l2-normed weight changes from initial values after last epoch:
[15.946052, 3.8762436, 4.025885, 1.0893966]
opt = sgd, lr = 0.100000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 1.4506e+01, Accuracy: 0.100, Test Loss: 1.4506e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [18.89649, 0.73113817, 29.987486, 1.0240723]'


max training accuracy 0.10000000149011612
min training loss 14.486127853393555
max validation accuracy 0.10000000149011612
min validation loss 14.506270408630371

l2-normed weight changes from initial values after last epoch:
[18.89649, 0.73113817, 29.987486, 1.0240723]
opt = sgd, lr = 0.100000, alpha = 0.800000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 2.2352e+01, Accuracy: 0.100, Test Loss: 2.2352e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [24.830273, 0.9725812, 38.809475, 1.0708419]'


max training accuracy 0.10000000149011612
min training loss 22.331640243530273
max validation accuracy 0.10000000149011612
min validation loss 22.352264404296875

l2-normed weight changes from initial values after last epoch:
[24.830273, 0.9725812, 38.809475, 1.0708419]
opt = sgd, lr = 0.100000, alpha = 0.600000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 3.9018e+01, Accuracy: 0.100, Test Loss: 3.9018e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [32.449745, 1.2392254, 42.482456, 1.2580422]'


max training accuracy 0.10000000149011612
min training loss 38.99216079711914
max validation accuracy 0.10000000149011612
min validation loss 39.01816940307617

l2-normed weight changes from initial values after last epoch:
[32.449745, 1.2392254, 42.482456, 1.2580422]
opt = sgd, lr = 0.100000, alpha = 0.400000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 8.5509e+01, Accuracy: 0.100, Test Loss: 8.5510e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [45.224174, 1.5869079, 12.899495, 1.6774335]'


max training accuracy 0.10000000149011612
min training loss 85.47538757324219
max validation accuracy 0.10000000149011612
min validation loss 85.51007080078125

l2-normed weight changes from initial values after last epoch:
[45.224174, 1.5869079, 12.899495, 1.6774335]
opt = sgd, lr = 0.100000, alpha = 0.200000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 3.2644e+02, Accuracy: 0.100, Test Loss: 3.2644e+02, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [219.25743, 9.112642, 2785.4956, 9.624435]'


max training accuracy 0.10000000149011612
min training loss 326.2712707519531
max validation accuracy 0.10000000149011612
min validation loss 326.44476318359375

l2-normed weight changes from initial values after last epoch:
[219.25743, 9.112642, 2785.4956, 9.624435]
opt = sgd, lr = 0.100000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 1.2434e+03, Accuracy: 0.100, Test Loss: 1.2434e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [527.316, 19.020807, 149.0115, 21.669239]'


max training accuracy 0.10000000149011612
min training loss 1242.83203125
max validation accuracy 0.10000000149011612
min validation loss 1243.39697265625

l2-normed weight changes from initial values after last epoch:
[527.316, 19.020807, 149.0115, 21.669239]
opt = sgd, lr = 0.100000, alpha = 0.070000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 2.4720e+03, Accuracy: 0.100, Test Loss: 2.4720e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [977.0861, 32.618942, 269.48843, 31.131561]'


max training accuracy 0.10000000149011612
min training loss 2470.902587890625
max validation accuracy 0.10000000149011612
min validation loss 2472.03662109375

l2-normed weight changes from initial values after last epoch:
[977.0861, 32.618942, 269.48843, 31.131561]
opt = sgd, lr = 0.100000, alpha = 0.040000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 7.2558e+03, Accuracy: 0.100, Test Loss: 7.2558e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [3907.8943, 116.50701, 1005.58777, 127.47133]'


max training accuracy 0.10000000149011612
min training loss 7252.68994140625
max validation accuracy 0.10000000149011612
min validation loss 7255.82470703125

l2-normed weight changes from initial values after last epoch:
[3907.8943, 116.50701, 1005.58777, 127.47133]
opt = sgd, lr = 0.100000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [88266.64, 2630.1274, 26636.09, 2632.7737]'


max training accuracy 0.10000000149011612
min training loss 103576.3203125
max validation accuracy 0.10000000149011612
min validation loss 103617.046875

l2-normed weight changes from initial values after last epoch:
[88266.64, 2630.1274, 26636.09, 2632.7737]
opt = sgd, lr = 0.100000, alpha = 0.001000


'Epoch: 0009, Total Run Time: 00:01:23 - Loss: 8.2902e+06, Accuracy: 0.100, Test Loss: 8.2902e+06, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [5339789.0, 173438.56, 1516788.6, 174556.36]'


max training accuracy 0.10000000149011612
min training loss 8286585.5
max validation accuracy 0.10000000149011612
min validation loss 8290205.0

l2-normed weight changes from initial values after last epoch:
[5339789.0, 173438.56, 1516788.6, 174556.36]
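
To compare runs across sweeps it helps to turn this log into records. In the log, each "opt = ..." line is printed before the run it describes, so the epoch summary that follows it belongs to that setting. Below is a minimal sketch of a parser built on that reading. `parse_runs` and the two regular expressions are hypothetical helpers written for these notes, not part of the training script, and they assume the line formats shown here do not vary.

```python
import re

# Hypothetical helpers for reading this log; not part of the original script.
HEADER = re.compile(r"opt = (\w+), lr = ([\d.]+), alpha = ([\d.]+)")
EPOCH = re.compile(
    r"Test Accuracy: ([\d.]+) - "
    r"L2 Norm of Weight Movement From Initialization: \[([^\]]*)\]"
)

def parse_runs(text):
    """Pair each 'opt = ...' header with the epoch summary line that follows it."""
    runs, current = [], None
    for line in text.splitlines():
        header = HEADER.search(line)
        if header:
            current = {
                "opt": header.group(1),
                "lr": float(header.group(2)),
                "alpha": float(header.group(3)),
            }
            continue
        epoch = EPOCH.search(line)
        if epoch and current is not None:
            current["test_acc"] = float(epoch.group(1))
            # One movement norm per weight tensor, in the order the log prints them.
            current["movement"] = [float(x) for x in epoch.group(2).split(",")]
            runs.append(current)
            current = None
    return runs

# One header/summary pair copied from the lr = 0.1 sweep above.
sample = """opt = sgd, lr = 0.100000, alpha = 9.000000

'Epoch: 0009, Total Run Time: 00:01:21 - Loss: 1.8807e-02, Accuracy: 0.453, Test Loss: 1.9251e-02, Test Accuracy: 0.456 - L2 Norm of Weight Movement From Initialization: [7.208417, 0.8669182, 3.5801575, 0.71956044]'
"""
runs = parse_runs(sample)
print(runs[0]["alpha"], runs[0]["test_acc"], runs[0]["movement"])
```

An epoch line with no preceding header (such as the tail of a run cut off at the top of a log excerpt) is skipped instead of being attached to the wrong setting.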
opt = sgd, lr = 0.010000, alpha = 20000.000000


'Epoch: 0009, Total Run Time: 00:01:24 - Loss: 5.7565e-09, Accuracy: 0.094, Test Loss: 5.7564e-09, Test Accuracy: 0.097 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.094200000166893
min training loss 5.756487286845413e-09
max validation accuracy 0.09690000116825104
min validation loss 5.7564477629057365e-09

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.010000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 2.3026e-06, Accuracy: 0.097, Test Loss: 2.3026e-06, Test Accuracy: 0.096 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09706000238656998
min training loss 2.302632083228673e-06
max validation accuracy 0.09629999846220016
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.010000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 2.3026e-04, Accuracy: 0.090, Test Loss: 2.3026e-04, Test Accuracy: 0.090 - L2 Norm of Weight Movement From Initialization: [6.645706e-06, 6.8403097e-07, 2.976654e-06, 6.4590927e-07]'


max training accuracy 0.0899600014090538
min training loss 0.00023025507107377052
max validation accuracy 0.09049999713897705
min validation loss 0.0002302593638887629

l2-normed weight changes from initial values after last epoch:
[6.645706e-06, 6.8403097e-07, 2.976654e-06, 6.4590927e-07]
opt = sgd, lr = 0.010000, alpha = 80.000000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 3.5966e-04, Accuracy: 0.101, Test Loss: 3.5967e-04, Test Accuracy: 0.098 - L2 Norm of Weight Movement From Initialization: [0.011293031, 0.00027869613, 0.0031709375, 0.00029919803]'


max training accuracy 0.1020599976181984
min training loss 0.000359655125066638
max validation accuracy 0.10040000081062317
min validation loss 0.00035966504947282374

l2-normed weight changes from initial values after last epoch:
[0.011293031, 0.00027869613, 0.0031709375, 0.00029919803]
opt = sgd, lr = 0.010000, alpha = 60.000000


'Epoch: 0009, Total Run Time: 00:01:24 - Loss: 6.3835e-04, Accuracy: 0.080, Test Loss: 6.3807e-04, Test Accuracy: 0.078 - L2 Norm of Weight Movement From Initialization: [0.04280802, 0.0011335298, 0.01256985, 0.0011899628]'


max training accuracy 0.09205999970436096
min training loss 0.0006383509607985616
max validation accuracy 0.08399999886751175
min validation loss 0.0006380743579939008

l2-normed weight changes from initial values after last epoch:
[0.04280802, 0.0011335298, 0.01256985, 0.0011899628]
opt = sgd, lr = 0.010000, alpha = 40.000000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 1.4355e-03, Accuracy: 0.099, Test Loss: 1.4351e-03, Test Accuracy: 0.097 - L2 Norm of Weight Movement From Initialization: [0.07014451, 0.001679705, 0.015064642, 0.0012438617]'


max training accuracy 0.10109999775886536
min training loss 0.0014354625018313527
max validation accuracy 0.10040000081062317
min validation loss 0.0014351275749504566

l2-normed weight changes from initial values after last epoch:
[0.07014451, 0.001679705, 0.015064642, 0.0012438617]
opt = sgd, lr = 0.010000, alpha = 30.000000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 2.5357e-03, Accuracy: 0.104, Test Loss: 2.5315e-03, Test Accuracy: 0.103 - L2 Norm of Weight Movement From Initialization: [0.17618962, 0.0042397925, 0.044878293, 0.0032139388]'


max training accuracy 0.10434000194072723
min training loss 0.002535689389333129
max validation accuracy 0.10270000249147415
min validation loss 0.002531488426029682

l2-normed weight changes from initial values after last epoch:
[0.17618962, 0.0042397925, 0.044878293, 0.0032139388]
opt = sgd, lr = 0.010000, alpha = 20.000000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 5.3528e-03, Accuracy: 0.055, Test Loss: 5.3441e-03, Test Accuracy: 0.055 - L2 Norm of Weight Movement From Initialization: [0.60182524, 0.033921067, 0.20102407, 0.036085278]'


max training accuracy 0.09892000257968903
min training loss 0.005352758802473545
max validation accuracy 0.09839999675750732
min validation loss 0.005344097502529621

l2-normed weight changes from initial values after last epoch:
[0.60182524, 0.033921067, 0.20102407, 0.036085278]
opt = sgd, lr = 0.010000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 1.8387e-02, Accuracy: 0.247, Test Loss: 1.8371e-02, Test Accuracy: 0.247 - L2 Norm of Weight Movement From Initialization: [1.7997682, 0.12412166, 0.7494549, 0.117332935]'


max training accuracy 0.24743999540805817
min training loss 0.018386948853731155
max validation accuracy 0.24709999561309814
min validation loss 0.01837128773331642

l2-normed weight changes from initial values after last epoch:
[1.7997682, 0.12412166, 0.7494549, 0.117332935]
opt = sgd, lr = 0.010000, alpha = 9.000000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 2.2524e-02, Accuracy: 0.261, Test Loss: 2.2536e-02, Test Accuracy: 0.255 - L2 Norm of Weight Movement From Initialization: [1.9343381, 0.14746466, 0.84811574, 0.13418758]'


max training accuracy 0.26058000326156616
min training loss 0.022523554041981697
max validation accuracy 0.2572000026702881
min validation loss 0.02253580093383789

l2-normed weight changes from initial values after last epoch:
[1.9343381, 0.14746466, 0.84811574, 0.13418758]
opt = sgd, lr = 0.010000, alpha = 8.000000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 2.7953e-02, Accuracy: 0.281, Test Loss: 2.7886e-02, Test Accuracy: 0.281 - L2 Norm of Weight Movement From Initialization: [2.194505, 0.17997696, 1.0412112, 0.14957395]'


max training accuracy 0.2807599902153015
min training loss 0.027952829375863075
max validation accuracy 0.29019999504089355
min validation loss 0.02788560837507248

l2-normed weight changes from initial values after last epoch:
[2.194505, 0.17997696, 1.0412112, 0.14957395]
opt = sgd, lr = 0.010000, alpha = 7.000000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 3.5289e-02, Accuracy: 0.337, Test Loss: 3.5345e-02, Test Accuracy: 0.330 - L2 Norm of Weight Movement From Initialization: [2.62865, 0.21106501, 1.2445316, 0.18439136]'


max training accuracy 0.33748000860214233
min training loss 0.03528907150030136
max validation accuracy 0.3346000015735626
min validation loss 0.035345204174518585

l2-normed weight changes from initial values after last epoch:
[2.62865, 0.21106501, 1.2445316, 0.18439136]
opt = sgd, lr = 0.010000, alpha = 6.000000


'Epoch: 0009, Total Run Time: 00:01:29 - Loss: 4.6312e-02, Accuracy: 0.406, Test Loss: 4.6208e-02, Test Accuracy: 0.412 - L2 Norm of Weight Movement From Initialization: [3.1796649, 0.30980942, 1.6018678, 0.2605905]'


max training accuracy 0.40615999698638916
min training loss 0.04631239175796509
max validation accuracy 0.41200000047683716
min validation loss 0.04620841518044472

l2-normed weight changes from initial values after last epoch:
[3.1796649, 0.30980942, 1.6018678, 0.2605905]
opt = sgd, lr = 0.010000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 6.3891e-02, Accuracy: 0.436, Test Loss: 6.4382e-02, Test Accuracy: 0.444 - L2 Norm of Weight Movement From Initialization: [3.9001348, 0.4190906, 2.0434022, 0.36807054]'


max training accuracy 0.4357599914073944
min training loss 0.0638914629817009
max validation accuracy 0.44440001249313354
min validation loss 0.06438204646110535

l2-normed weight changes from initial values after last epoch:
[3.9001348, 0.4190906, 2.0434022, 0.36807054]
opt = sgd, lr = 0.010000, alpha = 4.000000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 9.4673e-02, Accuracy: 0.467, Test Loss: 9.5552e-02, Test Accuracy: 0.459 - L2 Norm of Weight Movement From Initialization: [4.9944043, 0.64595926, 2.6394873, 0.5594582]'


max training accuracy 0.46713998913764954
min training loss 0.09467285871505737
max validation accuracy 0.4609000086784363
min validation loss 0.09555154293775558

l2-normed weight changes from initial values after last epoch:
[4.9944043, 0.64595926, 2.6394873, 0.5594582]
opt = sgd, lr = 0.010000, alpha = 3.000000


'Epoch: 0009, Total Run Time: 00:01:28 - Loss: 1.5754e-01, Accuracy: 0.501, Test Loss: 1.6686e-01, Test Accuracy: 0.461 - L2 Norm of Weight Movement From Initialization: [7.0043864, 1.0327001, 3.366817, 0.97669125]'


max training accuracy 0.5008599758148193
min training loss 0.15754099190235138
max validation accuracy 0.4722000062465668
min validation loss 0.16685578227043152

l2-normed weight changes from initial values after last epoch:
[7.0043864, 1.0327001, 3.366817, 0.97669125]
opt = sgd, lr = 0.010000, alpha = 2.000000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 3.4133e-01, Accuracy: 0.517, Test Loss: 3.6343e-01, Test Accuracy: 0.486 - L2 Norm of Weight Movement From Initialization: [10.9976225, 1.5037423, 3.850562, 1.812466]'


max training accuracy 0.5168399810791016
min training loss 0.3413275480270386
max validation accuracy 0.4862000048160553
min validation loss 0.3634343147277832

l2-normed weight changes from initial values after last epoch:
[10.9976225, 1.5037423, 3.850562, 1.812466]
opt = sgd, lr = 0.010000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 1.5817e+00, Accuracy: 0.437, Test Loss: 1.6413e+00, Test Accuracy: 0.408 - L2 Norm of Weight Movement From Initialization: [16.72436, 2.925099, 3.4992113, 2.8001301]'


max training accuracy 0.4366599917411804
min training loss 1.5817354917526245
max validation accuracy 0.41929998993873596
min validation loss 1.6282109022140503

l2-normed weight changes from initial values after last epoch:
[16.72436, 2.925099, 3.4992113, 2.8001301]
opt = sgd, lr = 0.010000, alpha = 0.800000


'Epoch: 0009, Total Run Time: 00:01:24 - Loss: 2.7514e+00, Accuracy: 0.363, Test Loss: 2.8311e+00, Test Accuracy: 0.374 - L2 Norm of Weight Movement From Initialization: [18.085585, 3.803259, 3.4157438, 2.8382688]'


max training accuracy 0.36305999755859375
min training loss 2.751420736312866
max validation accuracy 0.37400001287460327
min validation loss 2.7544384002685547

l2-normed weight changes from initial values after last epoch:
[18.085585, 3.803259, 3.4157438, 2.8382688]
opt = sgd, lr = 0.010000, alpha = 0.600000


'Epoch: 0009, Total Run Time: 00:01:24 - Loss: 6.2750e+00, Accuracy: 0.127, Test Loss: 6.1494e+00, Test Accuracy: 0.193 - L2 Norm of Weight Movement From Initialization: [14.199989, 2.356098, 4.7895064, 0.19142488]'


max training accuracy 0.19144000113010406
min training loss 5.954569339752197
max validation accuracy 0.19329999387264252
min validation loss 5.8498029708862305

l2-normed weight changes from initial values after last epoch:
[14.199989, 2.356098, 4.7895064, 0.19142488]
opt = sgd, lr = 0.010000, alpha = 0.400000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 1.4401e+01, Accuracy: 0.103, Test Loss: 1.4398e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [14.406209, 0.5362958, 8.862862, 0.21812494]'


max training accuracy 0.10276000201702118
min training loss 14.399994850158691
max validation accuracy 0.10040000081062317
min validation loss 14.397696495056152

l2-normed weight changes from initial values after last epoch:
[14.406209, 0.5362958, 8.862862, 0.21812494]
opt = sgd, lr = 0.010000, alpha = 0.200000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 3.2644e+02, Accuracy: 0.100, Test Loss: 3.2644e+02, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [27.953476, 0.9378917, 36.425186, 1.1105372]'


max training accuracy 0.10000000149011612
min training loss 326.2479248046875
max validation accuracy 0.10000000149011612
min validation loss 326.44488525390625

l2-normed weight changes from initial values after last epoch:
[27.953476, 0.9378917, 36.425186, 1.1105372]
opt = sgd, lr = 0.010000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 1.2434e+03, Accuracy: 0.100, Test Loss: 1.2434e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [74.59898, 2.3453836, 206.24986, 2.7019885]'


max training accuracy 0.10000000149011612
min training loss 1242.781005859375
max validation accuracy 0.10000000149011612
min validation loss 1243.396484375

l2-normed weight changes from initial values after last epoch:
[74.59898, 2.3453836, 206.24986, 2.7019885]
opt = sgd, lr = 0.010000, alpha = 0.070000


'Epoch: 0009, Total Run Time: 00:01:29 - Loss: 2.4720e+03, Accuracy: 0.100, Test Loss: 2.4720e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [117.37245, 3.9583983, 202.28632, 3.7011113]'


max training accuracy 0.10010000318288803
min training loss 2470.5810546875
max validation accuracy 0.10000000149011612
min validation loss 2472.03466796875

l2-normed weight changes from initial values after last epoch:
[117.37245, 3.9583983, 202.28632, 3.7011113]
opt = sgd, lr = 0.010000, alpha = 0.040000


'Epoch: 0009, Total Run Time: 00:01:28 - Loss: 7.2558e+03, Accuracy: 0.100, Test Loss: 7.2558e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [487.18155, 17.862513, 143.8665, 18.163635]'


max training accuracy 0.10000000149011612
min training loss 7252.56201171875
max validation accuracy 0.10000000149011612
min validation loss 7255.8232421875

l2-normed weight changes from initial values after last epoch:
[487.18155, 17.862513, 143.8665, 18.163635]
opt = sgd, lr = 0.010000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [10892.2, 345.46066, 3323.4277, 396.69052]'


max training accuracy 0.10000000149011612
min training loss 103578.9921875
max validation accuracy 0.10000000149011612
min validation loss 103617.0625

l2-normed weight changes from initial values after last epoch:
[10892.2, 345.46066, 3323.4277, 396.69052]
opt = sgd, lr = 0.010000, alpha = 0.001000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 8.2902e+06, Accuracy: 0.100, Test Loss: 8.2902e+06, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [707663.06, 19453.342, 190744.69, 18600.143]'


max training accuracy 0.10000000149011612
min training loss 8287275.0
max validation accuracy 0.10000000149011612
min validation loss 8290205.5

l2-normed weight changes from initial values after last epoch:
[707663.06, 19453.342, 190744.69, 18600.143]
opt = sgd, lr = 0.001000, alpha = 20000.000000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 5.7565e-09, Accuracy: 0.106, Test Loss: 5.7564e-09, Test Accuracy: 0.105 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10582000017166138
min training loss 5.756487286845413e-09
max validation accuracy 0.1054999977350235
min validation loss 5.7564477629057365e-09

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.001000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 2.3026e-06, Accuracy: 0.095, Test Loss: 2.3026e-06, Test Accuracy: 0.096 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.0952799990773201
min training loss 2.302632083228673e-06
max validation accuracy 0.09600000083446503
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = sgd, lr = 0.001000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:01:26 - Loss: 2.3020e-04, Accuracy: 0.112, Test Loss: 2.3020e-04, Test Accuracy: 0.112 - L2 Norm of Weight Movement From Initialization: [0.0004414353, 1.8225932e-05, 0.00014523198, 1.5440864e-05]'


max training accuracy 0.11159999668598175
min training loss 0.000230196223128587
max validation accuracy 0.11240000277757645
min validation loss 0.00023019561194814742

l2-normed weight changes from initial values after last epoch:
[0.0004414353, 1.8225932e-05, 0.00014523198, 1.5440864e-05]
opt = sgd, lr = 0.001000, alpha = 80.000000


'Epoch: 0009, Total Run Time: 00:01:25 - Loss: 3.5977e-04, Accuracy: 0.106, Test Loss: 3.5978e-04, Test Accuracy: 0.102 - L2 Norm of Weight Movement From Initialization: [4.638704e-07, 8.298799e-08, 1.6841194e-07, 8.808561e-08]'


max training accuracy 0.10555999726057053
min training loss 0.00035977488732896745
max validation accuracy 0.1023000031709671
min validation loss 0.00035977951483801007

l2-normed weight changes from initial values after last epoch:
[4.638704e-07, 8.298799e-08, 1.6841194e-07, 8.808561e-08]
opt = sgd, lr = 0.001000, alpha = 60.000000


'Epoch: 0009, Total Run Time: 00:01:24 - Loss: 6.3949e-04, Accuracy: 0.139, Test Loss: 6.3969e-04, Test Accuracy: 0.142 - L2 Norm of Weight Movement From Initialization: [0.0019424473, 5.986363e-05, 0.0007087543, 5.9432663e-05]'


max training accuracy 0.13944000005722046
min training loss 0.0006394927622750401
max validation accuracy 0.14309999346733093
min validation loss 0.0006396878161467612

l2-normed weight changes from initial values after last epoch:
[0.0019424473, 5.986363e-05, 0.0007087543, 5.9432663e-05]
opt = sgd, lr = 0.001000, alpha = 40.000000


'Epoch: 0009, Total Run Time: 00:01:23 - Loss: 1.4389e-03, Accuracy: 0.095, Test Loss: 1.4384e-03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [0.009590569, 0.00022698878, 0.0024083976, 0.00020657327]'


max training accuracy 0.09520000219345093
min training loss 0.0014389035059139132
max validation accuracy 0.09950000047683716
min validation loss 0.0014384380774572492

l2-normed weight changes from initial values after last epoch:
[0.009590569, 0.00022698878, 0.0024083976, 0.00020657327]
opt = sgd, lr = 0.001000, alpha = 30.000000


'Epoch: 0009, Total Run Time: 00:01:28 - Loss: 2.5573e-03, Accuracy: 0.106, Test Loss: 2.5569e-03, Test Accuracy: 0.106 - L2 Norm of Weight Movement From Initialization: [0.012362701, 0.00034933985, 0.003690732, 0.0003970499]'


max training accuracy 0.11060000211000443
min training loss 0.002557338448241353
max validation accuracy 0.11219999939203262
min validation loss 0.0025568893179297447

l2-normed weight changes from initial values after last epoch:
[0.012362701, 0.00034933985, 0.003690732, 0.0003970499]
opt = sgd, lr = 0.001000, alpha = 20.000000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 5.7328e-03, Accuracy: 0.102, Test Loss: 5.7299e-03, Test Accuracy: 0.102 - L2 Norm of Weight Movement From Initialization: [0.0666156, 0.0016989089, 0.017955618, 0.0013477497]'


max training accuracy 0.10413999855518341
min training loss 0.005732804071158171
max validation accuracy 0.10270000249147415
min validation loss 0.0057298592291772366

l2-normed weight changes from initial values after last epoch:
[0.0666156, 0.0016989089, 0.017955618, 0.0013477497]
opt = sgd, lr = 0.001000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:01:29 - Loss: 2.1284e-02, Accuracy: 0.119, Test Loss: 2.1292e-02, Test Accuracy: 0.117 - L2 Norm of Weight Movement From Initialization: [0.46795195, 0.019259224, 0.14152645, 0.020883422]'


max training accuracy 0.11919999867677689
min training loss 0.021283915266394615
max validation accuracy 0.11720000207424164
min validation loss 0.02129177190363407

l2-normed weight changes from initial values after last epoch:
[0.46795195, 0.019259224, 0.14152645, 0.020883422]
opt = sgd, lr = 0.001000, alpha = 9.000000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 2.5273e-02, Accuracy: 0.123, Test Loss: 2.5279e-02, Test Accuracy: 0.121 - L2 Norm of Weight Movement From Initialization: [0.60499585, 0.017679585, 0.1918085, 0.01782667]'


max training accuracy 0.12371999770402908
min training loss 0.02527325414121151
max validation accuracy 0.12359999865293503
min validation loss 0.025278691202402115

l2-normed weight changes from initial values after last epoch:
[0.60499585, 0.017679585, 0.1918085, 0.01782667]
opt = sgd, lr = 0.001000, alpha = 8.000000


'Epoch: 0009, Total Run Time: 00:01:30 - Loss: 3.2029e-02, Accuracy: 0.112, Test Loss: 3.2055e-02, Test Accuracy: 0.110 - L2 Norm of Weight Movement From Initialization: [0.67420095, 0.030948484, 0.2197782, 0.03239953]'


max training accuracy 0.11648000031709671
min training loss 0.032028984278440475
max validation accuracy 0.12549999356269836
min validation loss 0.03205468878149986

l2-normed weight changes from initial values after last epoch:
[0.67420095, 0.030948484, 0.2197782, 0.03239953]
opt = sgd, lr = 0.001000, alpha = 7.000000


'Epoch: 0003, Total Run Time: 00:00:36 - Loss: 4.2264e-02, Accuracy: 0.260, Test Loss: 4.2082e-02, Test Accuracy: 0.264 - L2 Norm of Weight Movement From Initialization: [0.4906083, 0.0187785, 0.14965616, 0.021187915]'

'Epoch: 0009, Total Run Time: 00:01:31 - Loss: 4.0025e-02, Accuracy: 0.230, Test Loss: 4.0032e-02, Test Accuracy: 0.228 - L2 Norm of Weight Movement From Initialization: [0.8417337, 0.037356228, 0.29112762, 0.039383706]'


max training accuracy 0.2597599923610687
min training loss 0.04002485051751137
max validation accuracy 0.2639000117778778
min validation loss 0.04003211483359337

l2-normed weight changes from initial values after last epoch:
[0.8417337, 0.037356228, 0.29112762, 0.039383706]
opt = sgd, lr = 0.001000, alpha = 6.000000


'Epoch: 0009, Total Run Time: 00:01:31 - Loss: 5.3062e-02, Accuracy: 0.311, Test Loss: 5.2978e-02, Test Accuracy: 0.304 - L2 Norm of Weight Movement From Initialization: [0.98428047, 0.046784334, 0.35245427, 0.047969274]'


max training accuracy 0.3110800087451935
min training loss 0.0530615858733654
max validation accuracy 0.30720001459121704
min validation loss 0.052977848798036575

l2-normed weight changes from initial values after last epoch:
[0.98428047, 0.046784334, 0.35245427, 0.047969274]
opt = sgd, lr = 0.001000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:01:30 - Loss: 7.4461e-02, Accuracy: 0.345, Test Loss: 7.4518e-02, Test Accuracy: 0.341 - L2 Norm of Weight Movement From Initialization: [1.1675524, 0.06623198, 0.47329208, 0.06457589]'


max training accuracy 0.34455999732017517
min training loss 0.07446112483739853
max validation accuracy 0.3409000039100647
min validation loss 0.07451844215393066

l2-normed weight changes from initial values after last epoch:
[1.1675524, 0.06623198, 0.47329208, 0.06457589]
opt = sgd, lr = 0.001000, alpha = 4.000000


'Epoch: 0009, Total Run Time: 00:01:28 - Loss: 1.1188e-01, Accuracy: 0.377, Test Loss: 1.1196e-01, Test Accuracy: 0.372 - L2 Norm of Weight Movement From Initialization: [1.4814717, 0.111842334, 0.569923, 0.099283315]'


max training accuracy 0.3768399953842163
min training loss 0.11188321560621262
max validation accuracy 0.37229999899864197
min validation loss 0.11196362227201462

l2-normed weight changes from initial values after last epoch:
[1.4814717, 0.111842334, 0.569923, 0.099283315]
opt = sgd, lr = 0.001000, alpha = 3.000000


'Epoch: 0009, Total Run Time: 00:01:28 - Loss: 1.8924e-01, Accuracy: 0.414, Test Loss: 1.8972e-01, Test Accuracy: 0.413 - L2 Norm of Weight Movement From Initialization: [2.0238266, 0.18006457, 0.87524384, 0.1653121]'


max training accuracy 0.4140399992465973
min training loss 0.1892358362674713
max validation accuracy 0.4133000075817108
min validation loss 0.1897185742855072

l2-normed weight changes from initial values after last epoch:
[2.0238266, 0.18006457, 0.87524384, 0.1653121]
opt = sgd, lr = 0.001000, alpha = 2.000000


'Epoch: 0009, Total Run Time: 00:01:28 - Loss: 3.9529e-01, Accuracy: 0.454, Test Loss: 3.9634e-01, Test Accuracy: 0.450 - L2 Norm of Weight Movement From Initialization: [3.1108382, 0.3497785, 1.5085452, 0.30650535]'


max training accuracy 0.454039990901947
min training loss 0.39528870582580566
max validation accuracy 0.4496999979019165
min validation loss 0.39633825421333313

l2-normed weight changes from initial values after last epoch:
[3.1108382, 0.3497785, 1.5085452, 0.30650535]
opt = sgd, lr = 0.001000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:01:30 - Loss: 1.3895e+00, Accuracy: 0.514, Test Loss: 1.4708e+00, Test Accuracy: 0.475 - L2 Norm of Weight Movement From Initialization: [6.7917376, 0.9397748, 3.1316378, 0.93745565]'


max training accuracy 0.513759970664978
min training loss 1.389522671699524
max validation accuracy 0.49149999022483826
min validation loss 1.4448493719100952

l2-normed weight changes from initial values after last epoch:
[6.7917376, 0.9397748, 3.1316378, 0.93745565]
opt = sgd, lr = 0.001000, alpha = 0.800000


'Epoch: 0009, Total Run Time: 00:01:29 - Loss: 2.1238e+00, Accuracy: 0.522, Test Loss: 2.2293e+00, Test Accuracy: 0.492 - L2 Norm of Weight Movement From Initialization: [8.739545, 1.241354, 3.6091905, 1.3263038]'


max training accuracy 0.5223199725151062
min training loss 2.123778820037842
max validation accuracy 0.4925999939441681
min validation loss 2.2292532920837402

l2-normed weight changes from initial values after last epoch:
[8.739545, 1.241354, 3.6091905, 1.3263038]
opt = sgd, lr = 0.001000, alpha = 0.600000


'Epoch: 0009, Total Run Time: 00:01:31 - Loss: 3.7889e+00, Accuracy: 0.516, Test Loss: 3.9760e+00, Test Accuracy: 0.494 - L2 Norm of Weight Movement From Initialization: [11.69835, 1.5312567, 3.8914366, 1.9396666]'


max training accuracy 0.5158200263977051
min training loss 3.7888741493225098
max validation accuracy 0.4943000078201294
min validation loss 3.976044178009033

l2-normed weight changes from initial values after last epoch:
[11.69835, 1.5312567, 3.8914366, 1.9396666]
opt = sgd, lr = 0.001000, alpha = 0.400000


'Epoch: 0009, Total Run Time: 00:01:31 - Loss: 9.1741e+00, Accuracy: 0.479, Test Loss: 9.3700e+00, Test Accuracy: 0.467 - L2 Norm of Weight Movement From Initialization: [15.663797, 1.9789902, 3.3971956, 3.007679]'


max training accuracy 0.4789600074291229
min training loss 9.174113273620605
max validation accuracy 0.4666000008583069
min validation loss 9.369959831237793

l2-normed weight changes from initial values after last epoch:
[15.663797, 1.9789902, 3.3971956, 3.007679]
opt = sgd, lr = 0.001000, alpha = 0.200000


'Epoch: 0009, Total Run Time: 00:01:29 - Loss: 5.0597e+01, Accuracy: 0.247, Test Loss: 5.0525e+01, Test Accuracy: 0.244 - L2 Norm of Weight Movement From Initialization: [15.003761, 3.257604, 4.467832, 1.4268014]'


max training accuracy 0.24741999804973602
min training loss 50.59748458862305
max validation accuracy 0.2581000030040741
min validation loss 48.94134521484375

l2-normed weight changes from initial values after last epoch:
[15.003761, 3.257604, 4.467832, 1.4268014]
opt = sgd, lr = 0.001000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:01:28 - Loss: 1.2434e+03, Accuracy: 0.100, Test Loss: 1.2434e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [17.506886, 0.63307124, 19.356989, 0.8078182]'


max training accuracy 0.10000000149011612
min training loss 1241.9998779296875
max validation accuracy 0.10000000149011612
min validation loss 1243.39697265625

l2-normed weight changes from initial values after last epoch:
[17.506886, 0.63307124, 19.356989, 0.8078182]
opt = sgd, lr = 0.001000, alpha = 0.070000


'Epoch: 0009, Total Run Time: 00:01:28 - Loss: 2.4720e+03, Accuracy: 0.100, Test Loss: 2.4720e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [16.54108, 0.63718486, 21.228315, 0.7860911]'


max training accuracy 0.10001999884843826
min training loss 2469.63525390625
max validation accuracy 0.10000000149011612
min validation loss 2472.03564453125

l2-normed weight changes from initial values after last epoch:
[16.54108, 0.63718486, 21.228315, 0.7860911]
opt = sgd, lr = 0.001000, alpha = 0.040000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 7.2558e+03, Accuracy: 0.100, Test Loss: 7.2558e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [38.611435, 1.405106, 56.614784, 1.6976434]'


max training accuracy 0.10000000149011612
min training loss 7252.4599609375
max validation accuracy 0.10000000149011612
min validation loss 7255.8232421875

l2-normed weight changes from initial values after last epoch:
[38.611435, 1.405106, 56.614784, 1.6976434]
opt = sgd, lr = 0.001000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:01:27 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [548.04144, 12.98345, 144.25642, 12.287204]'


max training accuracy 0.10000000149011612
min training loss 103574.078125
max validation accuracy 0.10000000149011612
min validation loss 103617.0703125

l2-normed weight changes from initial values after last epoch:
[548.04144, 12.98345, 144.25642, 12.287204]
opt = sgd, lr = 0.001000, alpha = 0.001000


'Epoch: 0009, Total Run Time: 00:01:29 - Loss: 8.2902e+06, Accuracy: 0.100, Test Loss: 8.2902e+06, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [58834.17, 2367.6875, 19716.24, 2814.171]'


max training accuracy 0.10000000149011612
min training loss 8287156.5
max validation accuracy 0.10000000149011612
min validation loss 8290206.5

l2-normed weight changes from initial values after last epoch:
[58834.17, 2367.6875, 19716.24, 2814.171]
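
The per-epoch summary strings printed above follow a fixed layout, so they can be read back into numbers for plotting or comparison across alpha values. The following is a minimal sketch, not part of the original notebook: the names `EPOCH_LINE` and `parse_epoch_line` are hypothetical, and the pattern assumes the exact wording of the log lines shown in this output.

```python
import ast
import re

# Hypothetical helper: matches the per-epoch summary strings shown in the
# output above. The field wording is copied from those log lines.
EPOCH_LINE = re.compile(
    r"Epoch: (?P<epoch>\d+), .*?"
    r"Loss: (?P<loss>[-+.\deE]+), Accuracy: (?P<acc>[\d.]+), "
    r"Test Loss: (?P<test_loss>[-+.\deE]+), Test Accuracy: (?P<test_acc>[\d.]+) - "
    r"L2 Norm of Weight Movement From Initialization: (?P<norms>\[.*?\])"
)


def parse_epoch_line(line):
    """Return the fields of one epoch summary line as a dict, or None if it does not match."""
    match = EPOCH_LINE.search(line)
    if match is None:
        return None
    return {
        "epoch": int(match.group("epoch")),
        "loss": float(match.group("loss")),
        "accuracy": float(match.group("acc")),
        "test_loss": float(match.group("test_loss")),
        "test_accuracy": float(match.group("test_acc")),
        # The bracketed list holds one L2 norm per weight tensor.
        "weight_movement": ast.literal_eval(match.group("norms")),
    }
```

Lines that are not epoch summaries (for example the `opt = ..., lr = ..., alpha = ...` headers) return `None`, so the helper can be applied to every line of a captured log and the `None` results filtered out.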


In [None]:
# Experiment: default network size (width=64), but with the Adam optimizer in place of SGD
lrs = [1.0, 0.1, 0.01, 0.001, 0.0001]
alphas = [20000.0, 1000.0, 100.0, 80.0, 60.0, 40.0, 30.0, 20.0, 10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0, 0.8, 0.6, 0.4, 0.2, 0.1, 0.07, 0.04, 0.01, 0.001]
normed_weight_changes_adam = {}
optimizer = 'adam'
num_epochs = 10
for learning_rate in lrs:
  for alpha_val in alphas:
    print('='*80)
    print('opt = %s, lr = %f, alpha = %f' %(optimizer, learning_rate, alpha_val))
    print('='*80)
    normed_weight_changes_adam[(optimizer, learning_rate, alpha_val)] = train(alpha=alpha_val, epoch=num_epochs, opt=optimizer, lr=learning_rate, scaling=True)

opt = adam, lr = 1.000000, alpha = 10000000.000000


'Epoch: 0009, Total Run Time: 00:00:48 - Loss: 2.3026e-14, Accuracy: 0.090, Test Loss: 2.3026e-14, Test Accuracy: 0.097 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09009999781847
min training loss 2.3025665711129753e-14
max validation accuracy 0.09709999710321426
min validation loss 2.3025850364312254e-14

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = adam, lr = 1.000000, alpha = 1000000.000000


'Epoch: 0009, Total Run Time: 00:00:53 - Loss: 2.3026e-12, Accuracy: 0.094, Test Loss: 2.3026e-12, Test Accuracy: 0.096 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.09443999826908112
min training loss 2.3025997341469262e-12
max validation accuracy 0.09600000083446503
min validation loss 2.302588675284767e-12

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = adam, lr = 1.000000, alpha = 100000.000000


'Epoch: 0009, Total Run Time: 00:01:07 - Loss: 2.3026e-10, Accuracy: 0.102, Test Loss: 2.3026e-10, Test Accuracy: 0.103 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.1023000031709671
min training loss 2.3026328066499957e-10
max validation accuracy 0.10289999842643738
min validation loss 2.3025850670599368e-10

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = adam, lr = 1.000000, alpha = 10000.000000


'Epoch: 0009, Total Run Time: 00:01:06 - Loss: 2.3026e-08, Accuracy: 0.096, Test Loss: 2.3026e-08, Test Accuracy: 0.097 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.095660001039505
min training loss 2.3025949147381652e-08
max validation accuracy 0.09669999778270721
min validation loss 2.3025791051622946e-08

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = adam, lr = 1.000000, alpha = 1000.000000


'Epoch: 0009, Total Run Time: 00:01:05 - Loss: 2.3026e-06, Accuracy: 0.102, Test Loss: 2.3026e-06, Test Accuracy: 0.101 - L2 Norm of Weight Movement From Initialization: [0.0, 0.0, 0.0, 0.0]'


max training accuracy 0.10211999714374542
min training loss 2.302632083228673e-06
max validation accuracy 0.10100000351667404
min validation loss 2.3025800146569964e-06

l2-normed weight changes from initial values after last epoch:
[0.0, 0.0, 0.0, 0.0]
opt = adam, lr = 1.000000, alpha = 100.000000


'Epoch: 0009, Total Run Time: 00:01:04 - Loss: 1.4506e-03, Accuracy: 0.100, Test Loss: 1.4506e-03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [111.27019, 9.391343, 41.12513, 6.255609]'


max training accuracy 0.10000000149011612
min training loss 0.0013913430739194155
max validation accuracy 0.10000000149011612
min validation loss 0.0014506278093904257

l2-normed weight changes from initial values after last epoch:
[111.27019, 9.391343, 41.12513, 6.255609]
opt = adam, lr = 1.000000, alpha = 10.000000


'Epoch: 0009, Total Run Time: 00:01:08 - Loss: 1.4506e-01, Accuracy: 0.100, Test Loss: 1.4506e-01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [2134.9937, 40.65172, 123.765274, 18.169634]'


max training accuracy 0.10000000149011612
min training loss 0.14499522745609283
max validation accuracy 0.10000000149011612
min validation loss 0.14506274461746216

l2-normed weight changes from initial values after last epoch:
[2134.9937, 40.65172, 123.765274, 18.169634]
opt = adam, lr = 1.000000, alpha = 5.000000


'Epoch: 0009, Total Run Time: 00:01:10 - Loss: 5.8026e-01, Accuracy: 0.100, Test Loss: 5.8025e-01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [2415.3723, 44.390163, 137.28018, 18.921852]'


max training accuracy 0.10000000149011612
min training loss 0.579979419708252
max validation accuracy 0.10000000149011612
min validation loss 0.5802512168884277

l2-normed weight changes from initial values after last epoch:
[2415.3723, 44.390163, 137.28018, 18.921852]
opt = adam, lr = 1.000000, alpha = 1.000000


'Epoch: 0009, Total Run Time: 00:01:08 - Loss: 1.4506e+01, Accuracy: 0.100, Test Loss: 1.4506e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [2434.259, 44.08051, 139.10837, 18.979357]'


max training accuracy 0.100040003657341
min training loss 14.49924087524414
max validation accuracy 0.10000000149011612
min validation loss 14.50627326965332

l2-normed weight changes from initial values after last epoch:
[2434.259, 44.08051, 139.10837, 18.979357]
opt = adam, lr = 1.000000, alpha = 0.500000


'Epoch: 0009, Total Run Time: 00:01:07 - Loss: 5.5530e+01, Accuracy: 0.100, Test Loss: 5.5530e+01, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [2533.3542, 45.72536, 144.58206, 18.99072]'


max training accuracy 0.10000000149011612
min training loss 55.5029411315918
max validation accuracy 0.10000000149011612
min validation loss 55.52981185913086

l2-normed weight changes from initial values after last epoch:
[2533.3542, 45.72536, 144.58206, 18.99072]
opt = adam, lr = 1.000000, alpha = 0.100000


'Epoch: 0009, Total Run Time: 00:01:01 - Loss: 1.2434e+03, Accuracy: 0.100, Test Loss: 1.2434e+03, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [2353.5745, 42.46457, 134.58029, 18.991018]'


max training accuracy 0.10000000149011612
min training loss 1243.297607421875
max validation accuracy 0.10000000149011612
min validation loss 1243.39697265625

l2-normed weight changes from initial values after last epoch:
[2353.5745, 42.46457, 134.58029, 18.991018]
opt = adam, lr = 1.000000, alpha = 0.010000


'Epoch: 0009, Total Run Time: 00:01:08 - Loss: 1.0362e+05, Accuracy: 0.100, Test Loss: 1.0362e+05, Test Accuracy: 0.100 - L2 Norm of Weight Movement From Initialization: [2400.2646, 43.306286, 136.94647, 18.991064]'


max training accuracy 0.10000000149011612
min training loss 103570.265625
max validation accuracy 0.10000000149011612
min validation loss 103617.0703125

l2-normed weight changes from initial values after last epoch:
[2400.2646, 43.306286, 136.94647, 18.991064]
opt = adam, lr = 0.100000, alpha = 10000000.000000


'SAVING INITIAL WEIGHT VALUES'

In [None]:
# Experiment: default network size (width=64) with SGD, but trained for 100 epochs instead of 10
lrs = [1.0, 0.1, 0.01, 0.001]
alphas = [20000.0, 1000.0, 100.0, 80.0, 60.0, 40.0, 30.0, 20.0, 10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0, 0.8, 0.6, 0.4, 0.2, 0.1, 0.07, 0.04, 0.01, 0.001]
normed_weight_changes_e100 = {}
optimizer = 'sgd'
num_epochs = 100
for learning_rate in lrs:
  for alpha_val in alphas:
    print('='*80)
    print('opt = %s, lr = %f, alpha = %f' %(optimizer, learning_rate, alpha_val))
    print('='*80)
    normed_weight_changes_e100[(optimizer, learning_rate, alpha_val)] = train(alpha=alpha_val, epoch=num_epochs, opt=optimizer, lr=learning_rate, scaling=True)
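
Once a sweep like the one above has filled its results dictionary, the entries can be grouped by learning rate to see how weight movement changes as alpha decreases. This is a minimal sketch under one assumption that the notebook does not confirm in this section: that `train(...)` returns the list of per-tensor L2 norms printed after the last epoch. The function name `summarize_by_lr` is hypothetical.

```python
def summarize_by_lr(results):
    """Group {(optimizer, lr, alpha): [per-tensor L2 norms]} by learning rate.

    Returns {lr: [(alpha, total_norm), ...]} with rows ordered from the
    largest alpha to the smallest, matching the order of the sweep.
    """
    table = {}
    for (_optimizer, lr, alpha), norms in results.items():
        # Assumption: each entry is the L2 norm of one weight tensor's
        # movement, so the norm over all tensors is the root of the sum of squares.
        total_norm = sum(n * n for n in norms) ** 0.5
        table.setdefault(lr, []).append((alpha, total_norm))
    for rows in table.values():
        rows.sort(key=lambda row: row[0], reverse=True)
    return table
```

For example, `summarize_by_lr(normed_weight_changes_e100)[0.001]` would give the `(alpha, total_norm)` pairs for the `lr = 0.001` runs, ready to plot on a log-scaled alpha axis.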

In [None]:
import time

# Keep the notebook session alive for up to 20 hours while the long runs finish.
for _ in range(20):
  time.sleep(3600)