Namespace(batch_size=50, data_name='SST-2', dropout=0.5, epochs=40, gpu=0, log_interval=30, lr=0.0001, model_mode='multichannel', save_prefix='sa-model')
Use gpu0
1614
53
Done! Tokenizing Time=4.41s, #Sentences=118038
Done! Tokenizing Time=0.80s, #Sentences=1745
SentimentNet(
  (embedding): Embedding(17814 -> 300, float32)
  (embedding_extend): Embedding(17814 -> 300, float32)
  (encoder): ConvolutionalEncoder(
    (_convs): HybridConcurrent(
      (0): HybridSequential(
        (0): Conv1D(600 -> 100, kernel_size=(3,), stride=(1,))
        (1): Activation(relu)
        (2): HybridLambda(<lambda>)
      )
      (1): HybridSequential(
        (0): Conv1D(600 -> 100, kernel_size=(4,), stride=(1,))
        (1): Activation(relu)
        (2): HybridLambda(<lambda>)
      )
      (2): HybridSequential(
        (0): Conv1D(600 -> 100, kernel_size=(5,), stride=(1,))
        (1): Activation(relu)
        (2): HybridLambda(<lambda>)
      )
    )
  )
  (output): HybridSequential(
    (0): Dropout(p = 0.5, axes=())
    (1): Dense(None -> 2, linear)
  )
)
[Epoch 0 Batch 30/2125] avg loss 0.0147516, throughput 2.56407K wps
[Epoch 0 Batch 60/2125] avg loss 0.0142936, throughput 4.02322K wps
[Epoch 0 Batch 90/2125] avg loss 0.0138636, throughput 4.02236K wps
[Epoch 0 Batch 120/2125] avg loss 0.0135436, throughput 4.01478K wps
[Epoch 0 Batch 150/2125] avg loss 0.0134229, throughput 4.01785K wps
[Epoch 0 Batch 180/2125] avg loss 0.0130767, throughput 4.01938K wps
[Epoch 0 Batch 210/2125] avg loss 0.0132462, throughput 4.01397K wps
[Epoch 0 Batch 240/2125] avg loss 0.012815, throughput 4.01308K wps
[Epoch 0 Batch 270/2125] avg loss 0.0123686, throughput 4.01757K wps
[Epoch 0 Batch 300/2125] avg loss 0.0124178, throughput 4.01749K wps
[Epoch 0 Batch 330/2125] avg loss 0.0121333, throughput 4.01016K wps
[Epoch 0 Batch 360/2125] avg loss 0.0115257, throughput 4.00946K wps
[Epoch 0 Batch 390/2125] avg loss 0.0113919, throughput 4.01288K wps
[Epoch 0 Batch 420/2125] avg loss 0.0111251, throughput 4.01304K wps
[Epoch 0 Batch 450/2125] avg loss 0.0112771, throughput 4.01867K wps
[Epoch 0 Batch 480/2125] avg loss 0.0108728, throughput 4.01583K wps
[Epoch 0 Batch 510/2125] avg loss 0.010625, throughput 4.01621K wps
[Epoch 0 Batch 540/2125] avg loss 0.0100855, throughput 4.01905K wps
[Epoch 0 Batch 570/2125] avg loss 0.0101509, throughput 4.01538K wps
[Epoch 0 Batch 600/2125] avg loss 0.00997682, throughput 4.01337K wps
[Epoch 0 Batch 630/2125] avg loss 0.00949247, throughput 4.01543K wps
[Epoch 0 Batch 660/2125] avg loss 0.00951337, throughput 4.01165K wps
[Epoch 0 Batch 690/2125] avg loss 0.00911888, throughput 4.01354K wps
[Epoch 0 Batch 720/2125] avg loss 0.00937186, throughput 4.01438K wps
[Epoch 0 Batch 750/2125] avg loss 0.00887978, throughput 3.99346K wps
[Epoch 0 Batch 780/2125] avg loss 0.0086955, throughput 3.99133K wps
[Epoch 0 Batch 810/2125] avg loss 0.00847885, throughput 4.01173K wps
[Epoch 0 Batch 840/2125] avg loss 0.00817178, throughput 4.00983K wps
[Epoch 0 Batch 870/2125] avg loss 0.00775457, throughput 4.00867K wps
[Epoch 0 Batch 900/2125] avg loss 0.00780742, throughput 4.00721K wps
[Epoch 0 Batch 930/2125] avg loss 0.00779968, throughput 4.01075K wps
[Epoch 0 Batch 960/2125] avg loss 0.00801744, throughput 4.00696K wps
[Epoch 0 Batch 990/2125] avg loss 0.00739751, throughput 4.00404K wps
[Epoch 0 Batch 1020/2125] avg loss 0.00721081, throughput 4.01038K wps
[Epoch 0 Batch 1050/2125] avg loss 0.0075239, throughput 4.00852K wps
[Epoch 0 Batch 1080/2125] avg loss 0.00713659, throughput 4.00842K wps
[Epoch 0 Batch 1110/2125] avg loss 0.00718793, throughput 4.00881K wps
[Epoch 0 Batch 1140/2125] avg loss 0.00707068, throughput 4.00812K wps
[Epoch 0 Batch 1170/2125] avg loss 0.00702073, throughput 4.00948K wps
[Epoch 0 Batch 1200/2125] avg loss 0.00687074, throughput 4.00639K wps
[Epoch 0 Batch 1230/2125] avg loss 0.0066359, throughput 4.00242K wps
[Epoch 0 Batch 1260/2125] avg loss 0.00681151, throughput 4.0071K wps
[Epoch 0 Batch 1290/2125] avg loss 0.00656432, throughput 4.00106K wps
[Epoch 0 Batch 1320/2125] avg loss 0.00684996, throughput 4.00742K wps
[Epoch 0 Batch 1350/2125] avg loss 0.00622697, throughput 4.00299K wps
[Epoch 0 Batch 1380/2125] avg loss 0.00652663, throughput 4.00453K wps
[Epoch 0 Batch 1410/2125] avg loss 0.00646962, throughput 4.00456K wps
[Epoch 0 Batch 1440/2125] avg loss 0.00648735, throughput 4.00695K wps
[Epoch 0 Batch 1470/2125] avg loss 0.00661196, throughput 4.0091K wps
[Epoch 0 Batch 1500/2125] avg loss 0.00634111, throughput 4.00373K wps
[Epoch 0 Batch 1530/2125] avg loss 0.00648579, throughput 4.00131K wps
[Epoch 0 Batch 1560/2125] avg loss 0.00671795, throughput 4.00447K wps
[Epoch 0 Batch 1590/2125] avg loss 0.00577287, throughput 4.00277K wps
[Epoch 0 Batch 1620/2125] avg loss 0.00629674, throughput 4.00556K wps
[Epoch 0 Batch 1650/2125] avg loss 0.00624858, throughput 4.0016K wps
[Epoch 0 Batch 1680/2125] avg loss 0.00627834, throughput 4.00326K wps
[Epoch 0 Batch 1710/2125] avg loss 0.00608684, throughput 4.00241K wps
[Epoch 0 Batch 1740/2125] avg loss 0.00591085, throughput 4.00248K wps
[Epoch 0 Batch 1770/2125] avg loss 0.00591208, throughput 4.00495K wps
[Epoch 0 Batch 1800/2125] avg loss 0.00619698, throughput 4.00061K wps
[Epoch 0 Batch 1830/2125] avg loss 0.0060902, throughput 4.00252K wps
[Epoch 0 Batch 1860/2125] avg loss 0.00596406, throughput 4.00577K wps
[Epoch 0 Batch 1890/2125] avg loss 0.00586057, throughput 3.99572K wps
[Epoch 0 Batch 1920/2125] avg loss 0.0055443, throughput 4.00382K wps
[Epoch 0 Batch 1950/2125] avg loss 0.00571695, throughput 4.00193K wps
[Epoch 0 Batch 1980/2125] avg loss 0.00622543, throughput 4.00254K wps
[Epoch 0 Batch 2010/2125] avg loss 0.00572866, throughput 3.99695K wps
[Epoch 0 Batch 2040/2125] avg loss 0.00574906, throughput 4.0022K wps
[Epoch 0 Batch 2070/2125] avg loss 0.00520557, throughput 3.99985K wps
[Epoch 0 Batch 2100/2125] avg loss 0.00583206, throughput 3.99851K wps
Begin Testing...
[Batch 30/237] elapsed 0.46 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 0] train avg loss 0.00838684, test acc 0.8910, test avg loss 0.276346, throughput 3.95853K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.43 s
[Epoch 1 Batch 30/2125] avg loss 0.00563384, throughput 4.09238K wps
[Epoch 1 Batch 60/2125] avg loss 0.00520742, throughput 3.99956K wps
[Epoch 1 Batch 90/2125] avg loss 0.00521086, throughput 3.99952K wps
[Epoch 1 Batch 120/2125] avg loss 0.00537803, throughput 4.0037K wps
[Epoch 1 Batch 150/2125] avg loss 0.00526046, throughput 3.99612K wps
[Epoch 1 Batch 180/2125] avg loss 0.00488797, throughput 4.00344K wps
[Epoch 1 Batch 210/2125] avg loss 0.0052571, throughput 4.00291K wps
[Epoch 1 Batch 240/2125] avg loss 0.00512038, throughput 3.99767K wps
[Epoch 1 Batch 270/2125] avg loss 0.00538388, throughput 4.00111K wps
[Epoch 1 Batch 300/2125] avg loss 0.00493089, throughput 3.99568K wps
[Epoch 1 Batch 330/2125] avg loss 0.00512095, throughput 3.99584K wps
[Epoch 1 Batch 360/2125] avg loss 0.00463241, throughput 3.99511K wps
[Epoch 1 Batch 390/2125] avg loss 0.00514738, throughput 3.99734K wps
[Epoch 1 Batch 420/2125] avg loss 0.00464598, throughput 3.99816K wps
[Epoch 1 Batch 450/2125] avg loss 0.00525882, throughput 4.00017K wps
[Epoch 1 Batch 480/2125] avg loss 0.00503221, throughput 4.00129K wps
[Epoch 1 Batch 510/2125] avg loss 0.00471628, throughput 4.00111K wps
[Epoch 1 Batch 540/2125] avg loss 0.0048286, throughput 3.99672K wps
[Epoch 1 Batch 570/2125] avg loss 0.00518976, throughput 3.9972K wps
[Epoch 1 Batch 600/2125] avg loss 0.00482811, throughput 3.99797K wps
[Epoch 1 Batch 630/2125] avg loss 0.00523104, throughput 3.99996K wps
[Epoch 1 Batch 660/2125] avg loss 0.00526784, throughput 3.99531K wps
[Epoch 1 Batch 690/2125] avg loss 0.0052043, throughput 4.00332K wps
[Epoch 1 Batch 720/2125] avg loss 0.00445159, throughput 4.00849K wps
[Epoch 1 Batch 750/2125] avg loss 0.00489004, throughput 4.00589K wps
[Epoch 1 Batch 780/2125] avg loss 0.0050508, throughput 3.9984K wps
[Epoch 1 Batch 810/2125] avg loss 0.00532554, throughput 4.00045K wps
[Epoch 1 Batch 840/2125] avg loss 0.00457348, throughput 3.99803K wps
[Epoch 1 Batch 870/2125] avg loss 0.00447419, throughput 3.99703K wps
[Epoch 1 Batch 900/2125] avg loss 0.00454232, throughput 3.99857K wps
[Epoch 1 Batch 930/2125] avg loss 0.0045241, throughput 3.99841K wps
[Epoch 1 Batch 960/2125] avg loss 0.00446654, throughput 3.99464K wps
[Epoch 1 Batch 990/2125] avg loss 0.00481538, throughput 3.99886K wps
[Epoch 1 Batch 1020/2125] avg loss 0.00493456, throughput 4.00295K wps
[Epoch 1 Batch 1050/2125] avg loss 0.00483817, throughput 3.99934K wps
[Epoch 1 Batch 1080/2125] avg loss 0.00496186, throughput 3.99858K wps
[Epoch 1 Batch 1110/2125] avg loss 0.00481145, throughput 3.9978K wps
[Epoch 1 Batch 1140/2125] avg loss 0.00481299, throughput 3.9983K wps
[Epoch 1 Batch 1170/2125] avg loss 0.00478651, throughput 3.999K wps
[Epoch 1 Batch 1200/2125] avg loss 0.00484942, throughput 3.99996K wps
[Epoch 1 Batch 1230/2125] avg loss 0.00425814, throughput 3.99893K wps
[Epoch 1 Batch 1260/2125] avg loss 0.00429702, throughput 3.99778K wps
[Epoch 1 Batch 1290/2125] avg loss 0.00455624, throughput 3.99912K wps
[Epoch 1 Batch 1320/2125] avg loss 0.00477553, throughput 4.0015K wps
[Epoch 1 Batch 1350/2125] avg loss 0.00486682, throughput 4.00088K wps
[Epoch 1 Batch 1380/2125] avg loss 0.00457054, throughput 4.00098K wps
[Epoch 1 Batch 1410/2125] avg loss 0.00447109, throughput 3.99609K wps
[Epoch 1 Batch 1440/2125] avg loss 0.00483, throughput 3.99917K wps
[Epoch 1 Batch 1470/2125] avg loss 0.00463008, throughput 3.99754K wps
[Epoch 1 Batch 1500/2125] avg loss 0.00478489, throughput 3.99667K wps
[Epoch 1 Batch 1530/2125] avg loss 0.00442297, throughput 3.99891K wps
[Epoch 1 Batch 1560/2125] avg loss 0.00465089, throughput 3.99869K wps
[Epoch 1 Batch 1590/2125] avg loss 0.00441596, throughput 4.00165K wps
[Epoch 1 Batch 1620/2125] avg loss 0.00482146, throughput 3.99899K wps
[Epoch 1 Batch 1650/2125] avg loss 0.00451658, throughput 4.00158K wps
[Epoch 1 Batch 1680/2125] avg loss 0.00471935, throughput 3.99911K wps
[Epoch 1 Batch 1710/2125] avg loss 0.00454052, throughput 3.99986K wps
[Epoch 1 Batch 1740/2125] avg loss 0.00462583, throughput 4.00117K wps
[Epoch 1 Batch 1770/2125] avg loss 0.00459405, throughput 4.00014K wps
[Epoch 1 Batch 1800/2125] avg loss 0.00472798, throughput 3.99881K wps
[Epoch 1 Batch 1830/2125] avg loss 0.00475623, throughput 3.99817K wps
[Epoch 1 Batch 1860/2125] avg loss 0.00406364, throughput 3.99465K wps
[Epoch 1 Batch 1890/2125] avg loss 0.00494156, throughput 3.99573K wps
[Epoch 1 Batch 1920/2125] avg loss 0.00450352, throughput 3.99914K wps
[Epoch 1 Batch 1950/2125] avg loss 0.00504218, throughput 4.00008K wps
[Epoch 1 Batch 1980/2125] avg loss 0.00509062, throughput 3.99711K wps
[Epoch 1 Batch 2010/2125] avg loss 0.00451889, throughput 3.99997K wps
[Epoch 1 Batch 2040/2125] avg loss 0.00473392, throughput 4.00022K wps
[Epoch 1 Batch 2070/2125] avg loss 0.00440976, throughput 3.99826K wps
[Epoch 1 Batch 2100/2125] avg loss 0.00495309, throughput 3.99512K wps
Begin Testing...
[Batch 30/237] elapsed 0.48 s
[Batch 60/237] elapsed 0.44 s
[Batch 90/237] elapsed 0.44 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 1] train avg loss 0.00481669, test acc 0.9076, test avg loss 0.24216, throughput 4.00031K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.43 s
[Epoch 2 Batch 30/2125] avg loss 0.00372768, throughput 4.08903K wps
[Epoch 2 Batch 60/2125] avg loss 0.00375142, throughput 3.99836K wps
[Epoch 2 Batch 90/2125] avg loss 0.00421321, throughput 4.00193K wps
[Epoch 2 Batch 120/2125] avg loss 0.00375105, throughput 3.99626K wps
[Epoch 2 Batch 150/2125] avg loss 0.00424413, throughput 4.00474K wps
[Epoch 2 Batch 180/2125] avg loss 0.00412048, throughput 4.00238K wps
[Epoch 2 Batch 210/2125] avg loss 0.00406056, throughput 4.00128K wps
[Epoch 2 Batch 240/2125] avg loss 0.00404164, throughput 3.99992K wps
[Epoch 2 Batch 270/2125] avg loss 0.00349389, throughput 4.00198K wps
[Epoch 2 Batch 300/2125] avg loss 0.00429312, throughput 4.00009K wps
[Epoch 2 Batch 330/2125] avg loss 0.00384681, throughput 3.99972K wps
[Epoch 2 Batch 360/2125] avg loss 0.00365196, throughput 4.00152K wps
[Epoch 2 Batch 390/2125] avg loss 0.00458708, throughput 3.99666K wps
[Epoch 2 Batch 420/2125] avg loss 0.00427266, throughput 3.9996K wps
[Epoch 2 Batch 450/2125] avg loss 0.00411935, throughput 4.0002K wps
[Epoch 2 Batch 480/2125] avg loss 0.00385362, throughput 4.00362K wps
[Epoch 2 Batch 510/2125] avg loss 0.00410252, throughput 3.99808K wps
[Epoch 2 Batch 540/2125] avg loss 0.00409695, throughput 4.00171K wps
[Epoch 2 Batch 570/2125] avg loss 0.00401114, throughput 3.99919K wps
[Epoch 2 Batch 600/2125] avg loss 0.0038817, throughput 4.00085K wps
[Epoch 2 Batch 630/2125] avg loss 0.00371004, throughput 4.00038K wps
[Epoch 2 Batch 660/2125] avg loss 0.0037653, throughput 3.99807K wps
[Epoch 2 Batch 690/2125] avg loss 0.00429721, throughput 4.0003K wps
[Epoch 2 Batch 720/2125] avg loss 0.00386738, throughput 4.00036K wps
[Epoch 2 Batch 750/2125] avg loss 0.00396381, throughput 3.99859K wps
[Epoch 2 Batch 780/2125] avg loss 0.0035891, throughput 3.9995K wps
[Epoch 2 Batch 810/2125] avg loss 0.00384002, throughput 4.0002K wps
[Epoch 2 Batch 840/2125] avg loss 0.00369433, throughput 4.00073K wps
[Epoch 2 Batch 870/2125] avg loss 0.00447307, throughput 3.99554K wps
[Epoch 2 Batch 900/2125] avg loss 0.00381584, throughput 4.00025K wps
[Epoch 2 Batch 930/2125] avg loss 0.00400714, throughput 3.99807K wps
[Epoch 2 Batch 960/2125] avg loss 0.00339915, throughput 3.99543K wps
[Epoch 2 Batch 990/2125] avg loss 0.00433354, throughput 3.99834K wps
[Epoch 2 Batch 1020/2125] avg loss 0.00373597, throughput 4.0005K wps
[Epoch 2 Batch 1050/2125] avg loss 0.0037145, throughput 3.99814K wps
[Epoch 2 Batch 1080/2125] avg loss 0.00416336, throughput 3.99888K wps
[Epoch 2 Batch 1110/2125] avg loss 0.00412111, throughput 4.00026K wps
[Epoch 2 Batch 1140/2125] avg loss 0.00431221, throughput 3.99573K wps
[Epoch 2 Batch 1170/2125] avg loss 0.00363961, throughput 3.99919K wps
[Epoch 2 Batch 1200/2125] avg loss 0.00429422, throughput 3.99704K wps
[Epoch 2 Batch 1230/2125] avg loss 0.00367753, throughput 3.99945K wps
[Epoch 2 Batch 1260/2125] avg loss 0.0039543, throughput 4.00349K wps
[Epoch 2 Batch 1290/2125] avg loss 0.00391486, throughput 3.99713K wps
[Epoch 2 Batch 1320/2125] avg loss 0.003756, throughput 3.99817K wps
[Epoch 2 Batch 1350/2125] avg loss 0.00426251, throughput 3.9992K wps
[Epoch 2 Batch 1380/2125] avg loss 0.00367834, throughput 4.00101K wps
[Epoch 2 Batch 1410/2125] avg loss 0.00445956, throughput 3.99988K wps
[Epoch 2 Batch 1440/2125] avg loss 0.0036721, throughput 4.00143K wps
[Epoch 2 Batch 1470/2125] avg loss 0.00411154, throughput 3.99852K wps
[Epoch 2 Batch 1500/2125] avg loss 0.00377809, throughput 4.0006K wps
[Epoch 2 Batch 1530/2125] avg loss 0.00376677, throughput 3.99998K wps
[Epoch 2 Batch 1560/2125] avg loss 0.00426462, throughput 3.99308K wps
[Epoch 2 Batch 1590/2125] avg loss 0.00394671, throughput 3.99943K wps
[Epoch 2 Batch 1620/2125] avg loss 0.00354415, throughput 3.9959K wps
[Epoch 2 Batch 1650/2125] avg loss 0.00412582, throughput 4.00173K wps
[Epoch 2 Batch 1680/2125] avg loss 0.00421245, throughput 3.99636K wps
[Epoch 2 Batch 1710/2125] avg loss 0.0038108, throughput 3.99586K wps
[Epoch 2 Batch 1740/2125] avg loss 0.00407157, throughput 3.99694K wps
[Epoch 2 Batch 1770/2125] avg loss 0.00368507, throughput 3.99741K wps
[Epoch 2 Batch 1800/2125] avg loss 0.00411122, throughput 3.99853K wps
[Epoch 2 Batch 1830/2125] avg loss 0.00341367, throughput 3.99659K wps
[Epoch 2 Batch 1860/2125] avg loss 0.00379787, throughput 3.99688K wps
[Epoch 2 Batch 1890/2125] avg loss 0.00373053, throughput 3.99825K wps
[Epoch 2 Batch 1920/2125] avg loss 0.00395751, throughput 3.99939K wps
[Epoch 2 Batch 1950/2125] avg loss 0.00377079, throughput 3.99675K wps
[Epoch 2 Batch 1980/2125] avg loss 0.00372288, throughput 3.99618K wps
[Epoch 2 Batch 2010/2125] avg loss 0.00387072, throughput 4.00164K wps
[Epoch 2 Batch 2040/2125] avg loss 0.00374262, throughput 3.9979K wps
[Epoch 2 Batch 2070/2125] avg loss 0.00371304, throughput 3.99782K wps
[Epoch 2 Batch 2100/2125] avg loss 0.00408101, throughput 4.00084K wps
Begin Testing...
[Batch 30/237] elapsed 0.46 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 2] train avg loss 0.00393602, test acc 0.9140, test avg loss 0.230929, throughput 4.0004K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.43 s
[Epoch 3 Batch 30/2125] avg loss 0.0033904, throughput 4.09276K wps
[Epoch 3 Batch 60/2125] avg loss 0.00363944, throughput 3.99807K wps
[Epoch 3 Batch 90/2125] avg loss 0.00346443, throughput 3.99326K wps
[Epoch 3 Batch 120/2125] avg loss 0.00354299, throughput 3.99663K wps
[Epoch 3 Batch 150/2125] avg loss 0.00330513, throughput 3.99947K wps
[Epoch 3 Batch 180/2125] avg loss 0.00386889, throughput 3.99914K wps
[Epoch 3 Batch 210/2125] avg loss 0.003229, throughput 3.99901K wps
[Epoch 3 Batch 240/2125] avg loss 0.00308446, throughput 3.99368K wps
[Epoch 3 Batch 270/2125] avg loss 0.0031067, throughput 3.99616K wps
[Epoch 3 Batch 300/2125] avg loss 0.00319471, throughput 3.99572K wps
[Epoch 3 Batch 330/2125] avg loss 0.0031411, throughput 3.99688K wps
[Epoch 3 Batch 360/2125] avg loss 0.00364092, throughput 3.99281K wps
[Epoch 3 Batch 390/2125] avg loss 0.00291069, throughput 3.99397K wps
[Epoch 3 Batch 420/2125] avg loss 0.00343475, throughput 3.99523K wps
[Epoch 3 Batch 450/2125] avg loss 0.00383715, throughput 4.00428K wps
[Epoch 3 Batch 480/2125] avg loss 0.00378076, throughput 4.005K wps
[Epoch 3 Batch 510/2125] avg loss 0.00386261, throughput 4.00318K wps
[Epoch 3 Batch 540/2125] avg loss 0.00384428, throughput 3.99722K wps
[Epoch 3 Batch 570/2125] avg loss 0.00335969, throughput 3.99745K wps
[Epoch 3 Batch 600/2125] avg loss 0.00360967, throughput 3.9984K wps
[Epoch 3 Batch 630/2125] avg loss 0.0031967, throughput 3.99796K wps
[Epoch 3 Batch 660/2125] avg loss 0.00339834, throughput 3.99808K wps
[Epoch 3 Batch 690/2125] avg loss 0.0030571, throughput 3.99524K wps
[Epoch 3 Batch 720/2125] avg loss 0.00367219, throughput 3.99747K wps
[Epoch 3 Batch 750/2125] avg loss 0.00337101, throughput 3.99773K wps
[Epoch 3 Batch 780/2125] avg loss 0.00314582, throughput 3.99382K wps
[Epoch 3 Batch 810/2125] avg loss 0.00379513, throughput 4.00088K wps
[Epoch 3 Batch 840/2125] avg loss 0.00343282, throughput 4.00088K wps
[Epoch 3 Batch 870/2125] avg loss 0.0033154, throughput 3.99781K wps
[Epoch 3 Batch 900/2125] avg loss 0.00337585, throughput 3.99328K wps
[Epoch 3 Batch 930/2125] avg loss 0.00346683, throughput 3.99658K wps
[Epoch 3 Batch 960/2125] avg loss 0.00318301, throughput 3.99935K wps
[Epoch 3 Batch 990/2125] avg loss 0.00326784, throughput 3.99995K wps
[Epoch 3 Batch 1020/2125] avg loss 0.00330093, throughput 3.9993K wps
[Epoch 3 Batch 1050/2125] avg loss 0.0035539, throughput 3.99703K wps
[Epoch 3 Batch 1080/2125] avg loss 0.00358457, throughput 3.99678K wps
[Epoch 3 Batch 1110/2125] avg loss 0.00334674, throughput 3.99963K wps
[Epoch 3 Batch 1140/2125] avg loss 0.00376865, throughput 4.00068K wps
[Epoch 3 Batch 1170/2125] avg loss 0.00345749, throughput 3.99935K wps
[Epoch 3 Batch 1200/2125] avg loss 0.00370953, throughput 3.99696K wps
[Epoch 3 Batch 1230/2125] avg loss 0.00337568, throughput 3.9979K wps
[Epoch 3 Batch 1260/2125] avg loss 0.00337902, throughput 4.00158K wps
[Epoch 3 Batch 1290/2125] avg loss 0.00346691, throughput 4.00419K wps
[Epoch 3 Batch 1320/2125] avg loss 0.00334159, throughput 3.98596K wps
[Epoch 3 Batch 1350/2125] avg loss 0.00336824, throughput 3.98322K wps
[Epoch 3 Batch 1380/2125] avg loss 0.00289602, throughput 4.00355K wps
[Epoch 3 Batch 1410/2125] avg loss 0.00309783, throughput 3.9991K wps
[Epoch 3 Batch 1440/2125] avg loss 0.00324806, throughput 4.00235K wps
[Epoch 3 Batch 1470/2125] avg loss 0.00317644, throughput 3.99644K wps
[Epoch 3 Batch 1500/2125] avg loss 0.00335172, throughput 3.9968K wps
[Epoch 3 Batch 1530/2125] avg loss 0.00307378, throughput 3.99854K wps
[Epoch 3 Batch 1560/2125] avg loss 0.00350744, throughput 4.00058K wps
[Epoch 3 Batch 1590/2125] avg loss 0.00340111, throughput 3.99818K wps
[Epoch 3 Batch 1620/2125] avg loss 0.00334096, throughput 3.99496K wps
[Epoch 3 Batch 1650/2125] avg loss 0.00394288, throughput 3.99416K wps
[Epoch 3 Batch 1680/2125] avg loss 0.00379437, throughput 3.99374K wps
[Epoch 3 Batch 1710/2125] avg loss 0.00352824, throughput 4.00517K wps
[Epoch 3 Batch 1740/2125] avg loss 0.00384453, throughput 3.99972K wps
[Epoch 3 Batch 1770/2125] avg loss 0.00368801, throughput 4.00482K wps
[Epoch 3 Batch 1800/2125] avg loss 0.0033688, throughput 4.00369K wps
[Epoch 3 Batch 1830/2125] avg loss 0.00365095, throughput 4.00813K wps
[Epoch 3 Batch 1860/2125] avg loss 0.00314778, throughput 4.00519K wps
[Epoch 3 Batch 1890/2125] avg loss 0.00344119, throughput 4.00323K wps
[Epoch 3 Batch 1920/2125] avg loss 0.00409361, throughput 4.00673K wps
[Epoch 3 Batch 1950/2125] avg loss 0.00384816, throughput 3.9972K wps
[Epoch 3 Batch 1980/2125] avg loss 0.00356137, throughput 4.0022K wps
[Epoch 3 Batch 2010/2125] avg loss 0.00351984, throughput 3.99787K wps
[Epoch 3 Batch 2040/2125] avg loss 0.00330064, throughput 4.00271K wps
[Epoch 3 Batch 2070/2125] avg loss 0.00372276, throughput 3.99924K wps
[Epoch 3 Batch 2100/2125] avg loss 0.00314503, throughput 3.9998K wps
Begin Testing...
[Batch 30/237] elapsed 0.46 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 3] train avg loss 0.00344526, test acc 0.9171, test avg loss 0.231881, throughput 3.99986K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.43 s
[Epoch 4 Batch 30/2125] avg loss 0.00275388, throughput 4.09485K wps
[Epoch 4 Batch 60/2125] avg loss 0.00279117, throughput 3.99671K wps
[Epoch 4 Batch 90/2125] avg loss 0.00307231, throughput 3.99559K wps
[Epoch 4 Batch 120/2125] avg loss 0.00312418, throughput 3.99535K wps
[Epoch 4 Batch 150/2125] avg loss 0.00340846, throughput 3.99539K wps
[Epoch 4 Batch 180/2125] avg loss 0.00251215, throughput 3.99665K wps
[Epoch 4 Batch 210/2125] avg loss 0.00296142, throughput 4.00003K wps
[Epoch 4 Batch 240/2125] avg loss 0.00309944, throughput 3.99956K wps
[Epoch 4 Batch 270/2125] avg loss 0.00305135, throughput 3.99804K wps
[Epoch 4 Batch 300/2125] avg loss 0.00278939, throughput 3.99771K wps
[Epoch 4 Batch 330/2125] avg loss 0.00307862, throughput 3.99755K wps
[Epoch 4 Batch 360/2125] avg loss 0.00295552, throughput 4.00047K wps
[Epoch 4 Batch 390/2125] avg loss 0.00298802, throughput 3.99751K wps
[Epoch 4 Batch 420/2125] avg loss 0.00290689, throughput 4.00038K wps
[Epoch 4 Batch 450/2125] avg loss 0.00291306, throughput 3.99849K wps
[Epoch 4 Batch 480/2125] avg loss 0.00295056, throughput 3.99739K wps
[Epoch 4 Batch 510/2125] avg loss 0.00282286, throughput 3.99817K wps
[Epoch 4 Batch 540/2125] avg loss 0.00277874, throughput 3.99842K wps
[Epoch 4 Batch 570/2125] avg loss 0.00344663, throughput 4.00122K wps
[Epoch 4 Batch 600/2125] avg loss 0.00316646, throughput 4.00003K wps
[Epoch 4 Batch 630/2125] avg loss 0.00300245, throughput 3.99967K wps
[Epoch 4 Batch 660/2125] avg loss 0.00300958, throughput 3.99948K wps
[Epoch 4 Batch 690/2125] avg loss 0.00293653, throughput 3.9974K wps
[Epoch 4 Batch 720/2125] avg loss 0.00286842, throughput 3.99788K wps
[Epoch 4 Batch 750/2125] avg loss 0.00277752, throughput 3.99875K wps
[Epoch 4 Batch 780/2125] avg loss 0.00270723, throughput 4.00128K wps
[Epoch 4 Batch 810/2125] avg loss 0.00339673, throughput 3.99962K wps
[Epoch 4 Batch 840/2125] avg loss 0.00347367, throughput 3.99809K wps
[Epoch 4 Batch 870/2125] avg loss 0.00309338, throughput 4.00137K wps
[Epoch 4 Batch 900/2125] avg loss 0.00288506, throughput 3.99747K wps
[Epoch 4 Batch 930/2125] avg loss 0.00312296, throughput 3.99729K wps
[Epoch 4 Batch 960/2125] avg loss 0.00325099, throughput 4.00118K wps
[Epoch 4 Batch 990/2125] avg loss 0.00302203, throughput 3.99933K wps
[Epoch 4 Batch 1020/2125] avg loss 0.00291513, throughput 3.99783K wps
[Epoch 4 Batch 1050/2125] avg loss 0.00332072, throughput 3.99681K wps
[Epoch 4 Batch 1080/2125] avg loss 0.00324292, throughput 4.00157K wps
[Epoch 4 Batch 1110/2125] avg loss 0.0030892, throughput 3.99716K wps
[Epoch 4 Batch 1140/2125] avg loss 0.002979, throughput 4.00198K wps
[Epoch 4 Batch 1170/2125] avg loss 0.00300227, throughput 4.00176K wps
[Epoch 4 Batch 1200/2125] avg loss 0.00309862, throughput 3.99897K wps
[Epoch 4 Batch 1230/2125] avg loss 0.00327812, throughput 3.99971K wps
[Epoch 4 Batch 1260/2125] avg loss 0.00296275, throughput 4.00055K wps
[Epoch 4 Batch 1290/2125] avg loss 0.00295405, throughput 3.99969K wps
[Epoch 4 Batch 1320/2125] avg loss 0.00360768, throughput 4.00235K wps
[Epoch 4 Batch 1350/2125] avg loss 0.00292452, throughput 3.98739K wps
[Epoch 4 Batch 1380/2125] avg loss 0.00288385, throughput 3.9972K wps
[Epoch 4 Batch 1410/2125] avg loss 0.00313842, throughput 3.99222K wps
[Epoch 4 Batch 1440/2125] avg loss 0.00317818, throughput 3.98766K wps
[Epoch 4 Batch 1470/2125] avg loss 0.00308852, throughput 3.99424K wps
[Epoch 4 Batch 1500/2125] avg loss 0.00320696, throughput 3.9957K wps
[Epoch 4 Batch 1530/2125] avg loss 0.00282071, throughput 3.99144K wps
[Epoch 4 Batch 1560/2125] avg loss 0.00301076, throughput 3.99821K wps
[Epoch 4 Batch 1590/2125] avg loss 0.00300079, throughput 3.99844K wps
[Epoch 4 Batch 1620/2125] avg loss 0.00283279, throughput 3.98945K wps
[Epoch 4 Batch 1650/2125] avg loss 0.00327742, throughput 3.99459K wps
[Epoch 4 Batch 1680/2125] avg loss 0.00262472, throughput 3.9953K wps
[Epoch 4 Batch 1710/2125] avg loss 0.00303903, throughput 3.99364K wps
[Epoch 4 Batch 1740/2125] avg loss 0.00305432, throughput 3.99123K wps
[Epoch 4 Batch 1770/2125] avg loss 0.00264639, throughput 3.99864K wps
[Epoch 4 Batch 1800/2125] avg loss 0.00319014, throughput 3.99453K wps
[Epoch 4 Batch 1830/2125] avg loss 0.00336713, throughput 3.99245K wps
[Epoch 4 Batch 1860/2125] avg loss 0.00308155, throughput 3.99563K wps
[Epoch 4 Batch 1890/2125] avg loss 0.00359417, throughput 3.99548K wps
[Epoch 4 Batch 1920/2125] avg loss 0.00308635, throughput 3.9988K wps
[Epoch 4 Batch 1950/2125] avg loss 0.00341302, throughput 3.99974K wps
[Epoch 4 Batch 1980/2125] avg loss 0.00324131, throughput 3.99628K wps
[Epoch 4 Batch 2010/2125] avg loss 0.00339585, throughput 4.00152K wps
[Epoch 4 Batch 2040/2125] avg loss 0.0028757, throughput 3.99672K wps
[Epoch 4 Batch 2070/2125] avg loss 0.00338727, throughput 3.99844K wps
[Epoch 4 Batch 2100/2125] avg loss 0.00329966, throughput 3.99514K wps
Begin Testing...
[Batch 30/237] elapsed 0.46 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 4] train avg loss 0.00306041, test acc 0.9193, test avg loss 0.232051, throughput 3.9988K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.43 s
[Epoch 5 Batch 30/2125] avg loss 0.0022825, throughput 4.09718K wps
[Epoch 5 Batch 60/2125] avg loss 0.00254244, throughput 4.00326K wps
[Epoch 5 Batch 90/2125] avg loss 0.00259583, throughput 4.00383K wps
[Epoch 5 Batch 120/2125] avg loss 0.00259168, throughput 3.99592K wps
[Epoch 5 Batch 150/2125] avg loss 0.00278133, throughput 3.99464K wps
[Epoch 5 Batch 180/2125] avg loss 0.00256978, throughput 3.99731K wps
[Epoch 5 Batch 210/2125] avg loss 0.0028489, throughput 4.00237K wps
[Epoch 5 Batch 240/2125] avg loss 0.00234232, throughput 4.0006K wps
[Epoch 5 Batch 270/2125] avg loss 0.00275093, throughput 3.99601K wps
[Epoch 5 Batch 300/2125] avg loss 0.00308546, throughput 3.99846K wps
[Epoch 5 Batch 330/2125] avg loss 0.00294743, throughput 3.99919K wps
[Epoch 5 Batch 360/2125] avg loss 0.00239277, throughput 3.99715K wps
[Epoch 5 Batch 390/2125] avg loss 0.00240688, throughput 3.99695K wps
[Epoch 5 Batch 420/2125] avg loss 0.00302727, throughput 3.99842K wps
[Epoch 5 Batch 450/2125] avg loss 0.00290547, throughput 4.00039K wps
[Epoch 5 Batch 480/2125] avg loss 0.00291568, throughput 4.00077K wps
[Epoch 5 Batch 510/2125] avg loss 0.0022932, throughput 3.95203K wps
[Epoch 5 Batch 540/2125] avg loss 0.00253938, throughput 3.91343K wps
[Epoch 5 Batch 570/2125] avg loss 0.00290635, throughput 3.9948K wps
[Epoch 5 Batch 600/2125] avg loss 0.00255613, throughput 3.99381K wps
[Epoch 5 Batch 630/2125] avg loss 0.0025742, throughput 3.99282K wps
[Epoch 5 Batch 660/2125] avg loss 0.00280975, throughput 3.99607K wps
[Epoch 5 Batch 690/2125] avg loss 0.00303493, throughput 3.99957K wps
[Epoch 5 Batch 720/2125] avg loss 0.00262448, throughput 3.9975K wps
[Epoch 5 Batch 750/2125] avg loss 0.00295974, throughput 3.9925K wps
[Epoch 5 Batch 780/2125] avg loss 0.00278173, throughput 3.99061K wps
[Epoch 5 Batch 810/2125] avg loss 0.00311076, throughput 3.98913K wps
[Epoch 5 Batch 840/2125] avg loss 0.00276734, throughput 3.97578K wps
[Epoch 5 Batch 870/2125] avg loss 0.00253087, throughput 3.9432K wps
[Epoch 5 Batch 900/2125] avg loss 0.00314625, throughput 3.98029K wps
[Epoch 5 Batch 930/2125] avg loss 0.00256389, throughput 3.9947K wps
[Epoch 5 Batch 960/2125] avg loss 0.00303426, throughput 3.99888K wps
[Epoch 5 Batch 990/2125] avg loss 0.00309291, throughput 4.00066K wps
[Epoch 5 Batch 1020/2125] avg loss 0.00349506, throughput 3.99828K wps
[Epoch 5 Batch 1050/2125] avg loss 0.0030527, throughput 3.99809K wps
[Epoch 5 Batch 1080/2125] avg loss 0.0031657, throughput 3.99843K wps
[Epoch 5 Batch 1110/2125] avg loss 0.00252705, throughput 3.99707K wps
[Epoch 5 Batch 1140/2125] avg loss 0.00271469, throughput 3.9968K wps
[Epoch 5 Batch 1170/2125] avg loss 0.00339467, throughput 3.99625K wps
[Epoch 5 Batch 1200/2125] avg loss 0.00267632, throughput 3.99773K wps
[Epoch 5 Batch 1230/2125] avg loss 0.00313533, throughput 3.99311K wps
[Epoch 5 Batch 1260/2125] avg loss 0.00282691, throughput 3.99371K wps
[Epoch 5 Batch 1290/2125] avg loss 0.00260989, throughput 3.99549K wps
[Epoch 5 Batch 1320/2125] avg loss 0.00274335, throughput 3.99704K wps
[Epoch 5 Batch 1350/2125] avg loss 0.00263737, throughput 3.99598K wps
[Epoch 5 Batch 1380/2125] avg loss 0.00268219, throughput 3.99997K wps
[Epoch 5 Batch 1410/2125] avg loss 0.00296077, throughput 3.99854K wps
[Epoch 5 Batch 1440/2125] avg loss 0.00314575, throughput 3.99998K wps
[Epoch 5 Batch 1470/2125] avg loss 0.00246876, throughput 4.002K wps
[Epoch 5 Batch 1500/2125] avg loss 0.0025774, throughput 3.99884K wps
[Epoch 5 Batch 1530/2125] avg loss 0.00252392, throughput 3.99729K wps
[Epoch 5 Batch 1560/2125] avg loss 0.00261165, throughput 4.00009K wps
[Epoch 5 Batch 1590/2125] avg loss 0.00310957, throughput 4.00064K wps
[Epoch 5 Batch 1620/2125] avg loss 0.00252723, throughput 4.00013K wps
[Epoch 5 Batch 1650/2125] avg loss 0.00263221, throughput 3.99356K wps
[Epoch 5 Batch 1680/2125] avg loss 0.00283974, throughput 3.99471K wps
[Epoch 5 Batch 1710/2125] avg loss 0.00248055, throughput 3.99706K wps
[Epoch 5 Batch 1740/2125] avg loss 0.00279796, throughput 3.99294K wps
[Epoch 5 Batch 1770/2125] avg loss 0.00272356, throughput 3.99453K wps
[Epoch 5 Batch 1800/2125] avg loss 0.00261568, throughput 3.99291K wps
[Epoch 5 Batch 1830/2125] avg loss 0.00229801, throughput 4.00706K wps
[Epoch 5 Batch 1860/2125] avg loss 0.00305227, throughput 4.00175K wps
[Epoch 5 Batch 1890/2125] avg loss 0.00305707, throughput 4.00184K wps
[Epoch 5 Batch 1920/2125] avg loss 0.00277188, throughput 4.00803K wps
[Epoch 5 Batch 1950/2125] avg loss 0.0030973, throughput 4.00178K wps
[Epoch 5 Batch 1980/2125] avg loss 0.00323209, throughput 4.00807K wps
[Epoch 5 Batch 2010/2125] avg loss 0.00313434, throughput 4.00589K wps
[Epoch 5 Batch 2040/2125] avg loss 0.00297003, throughput 3.9988K wps
[Epoch 5 Batch 2070/2125] avg loss 0.00287839, throughput 4.002K wps
[Epoch 5 Batch 2100/2125] avg loss 0.00237589, throughput 4.00351K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 5] train avg loss 0.00278417, test acc 0.9218, test avg loss 0.240415, throughput 3.99647K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.43 s
[Epoch 6 Batch 30/2125] avg loss 0.00198337, throughput 4.09125K wps
[Epoch 6 Batch 60/2125] avg loss 0.00250973, throughput 4.00059K wps
[Epoch 6 Batch 90/2125] avg loss 0.00224365, throughput 3.99929K wps
[Epoch 6 Batch 120/2125] avg loss 0.00244529, throughput 3.99571K wps
[Epoch 6 Batch 150/2125] avg loss 0.00233686, throughput 3.99807K wps
[Epoch 6 Batch 180/2125] avg loss 0.00228603, throughput 4.00047K wps
[Epoch 6 Batch 210/2125] avg loss 0.00247067, throughput 3.99854K wps
[Epoch 6 Batch 240/2125] avg loss 0.00192245, throughput 3.99817K wps
[Epoch 6 Batch 270/2125] avg loss 0.00264432, throughput 4.00229K wps
[Epoch 6 Batch 300/2125] avg loss 0.00225173, throughput 3.9957K wps
[Epoch 6 Batch 330/2125] avg loss 0.00246615, throughput 3.99887K wps
[Epoch 6 Batch 360/2125] avg loss 0.00195428, throughput 4.00334K wps
[Epoch 6 Batch 390/2125] avg loss 0.00263974, throughput 4.0072K wps
[Epoch 6 Batch 420/2125] avg loss 0.00336309, throughput 3.99809K wps
[Epoch 6 Batch 450/2125] avg loss 0.00251252, throughput 4.00382K wps
[Epoch 6 Batch 480/2125] avg loss 0.00273801, throughput 4.00413K wps
[Epoch 6 Batch 510/2125] avg loss 0.00235722, throughput 4.00315K wps
[Epoch 6 Batch 540/2125] avg loss 0.00213768, throughput 4.00051K wps
[Epoch 6 Batch 570/2125] avg loss 0.0025257, throughput 3.99737K wps
[Epoch 6 Batch 600/2125] avg loss 0.00233726, throughput 3.99771K wps
[Epoch 6 Batch 630/2125] avg loss 0.00260488, throughput 3.99993K wps
[Epoch 6 Batch 660/2125] avg loss 0.00251389, throughput 3.99632K wps
[Epoch 6 Batch 690/2125] avg loss 0.00265432, throughput 3.99869K wps
[Epoch 6 Batch 720/2125] avg loss 0.00274276, throughput 3.99677K wps
[Epoch 6 Batch 750/2125] avg loss 0.00234884, throughput 3.99436K wps
[Epoch 6 Batch 780/2125] avg loss 0.002565, throughput 3.9968K wps
[Epoch 6 Batch 810/2125] avg loss 0.00249897, throughput 3.99947K wps
[Epoch 6 Batch 840/2125] avg loss 0.0021836, throughput 4.0004K wps
[Epoch 6 Batch 870/2125] avg loss 0.00277331, throughput 4.00056K wps
[Epoch 6 Batch 900/2125] avg loss 0.00299542, throughput 3.99939K wps
[Epoch 6 Batch 930/2125] avg loss 0.00235707, throughput 3.99501K wps
[Epoch 6 Batch 960/2125] avg loss 0.00233503, throughput 3.99501K wps
[Epoch 6 Batch 990/2125] avg loss 0.0021442, throughput 3.99792K wps
[Epoch 6 Batch 1020/2125] avg loss 0.00238372, throughput 3.99819K wps
[Epoch 6 Batch 1050/2125] avg loss 0.00257481, throughput 3.9963K wps
[Epoch 6 Batch 1080/2125] avg loss 0.00239753, throughput 3.99762K wps
[Epoch 6 Batch 1110/2125] avg loss 0.00260934, throughput 3.99565K wps
[Epoch 6 Batch 1140/2125] avg loss 0.0020846, throughput 3.99698K wps
[Epoch 6 Batch 1170/2125] avg loss 0.00252877, throughput 3.99645K wps
[Epoch 6 Batch 1200/2125] avg loss 0.00241086, throughput 4.00087K wps
[Epoch 6 Batch 1230/2125] avg loss 0.00222946, throughput 3.99477K wps
[Epoch 6 Batch 1260/2125] avg loss 0.00228515, throughput 3.99639K wps
[Epoch 6 Batch 1290/2125] avg loss 0.00268957, throughput 3.99807K wps
[Epoch 6 Batch 1320/2125] avg loss 0.00296045, throughput 3.99524K wps
[Epoch 6 Batch 1350/2125] avg loss 0.00286201, throughput 3.99815K wps
[Epoch 6 Batch 1380/2125] avg loss 0.00224584, throughput 4.00231K wps
[Epoch 6 Batch 1410/2125] avg loss 0.0027368, throughput 3.99931K wps
[Epoch 6 Batch 1440/2125] avg loss 0.00278062, throughput 4.00013K wps
[Epoch 6 Batch 1470/2125] avg loss 0.00251595, throughput 3.9947K wps
[Epoch 6 Batch 1500/2125] avg loss 0.00262117, throughput 3.9967K wps
[Epoch 6 Batch 1530/2125] avg loss 0.00263141, throughput 3.99964K wps
[Epoch 6 Batch 1560/2125] avg loss 0.0022681, throughput 3.99425K wps
[Epoch 6 Batch 1590/2125] avg loss 0.00248618, throughput 3.99993K wps
[Epoch 6 Batch 1620/2125] avg loss 0.00291531, throughput 3.99706K wps
[Epoch 6 Batch 1650/2125] avg loss 0.00255724, throughput 3.99795K wps
[Epoch 6 Batch 1680/2125] avg loss 0.00259566, throughput 3.9947K wps
[Epoch 6 Batch 1710/2125] avg loss 0.00279095, throughput 4.00166K wps
[Epoch 6 Batch 1740/2125] avg loss 0.00261297, throughput 4.00298K wps
[Epoch 6 Batch 1770/2125] avg loss 0.00249802, throughput 4.00489K wps
[Epoch 6 Batch 1800/2125] avg loss 0.00282959, throughput 4.00854K wps
[Epoch 6 Batch 1830/2125] avg loss 0.00324875, throughput 4.00491K wps
[Epoch 6 Batch 1860/2125] avg loss 0.00265956, throughput 3.99918K wps
[Epoch 6 Batch 1890/2125] avg loss 0.00224365, throughput 3.99895K wps
[Epoch 6 Batch 1920/2125] avg loss 0.00303943, throughput 3.98367K wps
[Epoch 6 Batch 1950/2125] avg loss 0.00273962, throughput 3.99777K wps
[Epoch 6 Batch 1980/2125] avg loss 0.00305446, throughput 4.00023K wps
[Epoch 6 Batch 2010/2125] avg loss 0.00290733, throughput 3.9979K wps
[Epoch 6 Batch 2040/2125] avg loss 0.00230875, throughput 3.99444K wps
[Epoch 6 Batch 2070/2125] avg loss 0.00290851, throughput 4.001K wps
[Epoch 6 Batch 2100/2125] avg loss 0.00306109, throughput 3.99765K wps
Begin Testing...
[Batch 30/237] elapsed 0.46 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 6] train avg loss 0.00254322, test acc 0.9216, test avg loss 0.246337, throughput 3.99997K wps
[Epoch 7 Batch 30/2125] avg loss 0.00211947, throughput 4.08196K wps
[Epoch 7 Batch 60/2125] avg loss 0.00216626, throughput 4.0011K wps
[Epoch 7 Batch 90/2125] avg loss 0.00223903, throughput 4.00228K wps
[Epoch 7 Batch 120/2125] avg loss 0.00215377, throughput 4.00116K wps
[Epoch 7 Batch 150/2125] avg loss 0.00222163, throughput 4.00163K wps
[Epoch 7 Batch 180/2125] avg loss 0.0021045, throughput 4.00566K wps
[Epoch 7 Batch 210/2125] avg loss 0.00213662, throughput 4.00286K wps
[Epoch 7 Batch 240/2125] avg loss 0.00167227, throughput 3.99828K wps
[Epoch 7 Batch 270/2125] avg loss 0.00208361, throughput 4.00174K wps
[Epoch 7 Batch 300/2125] avg loss 0.00257552, throughput 3.99728K wps
[Epoch 7 Batch 330/2125] avg loss 0.00217625, throughput 4.00005K wps
[Epoch 7 Batch 360/2125] avg loss 0.0022611, throughput 4.00104K wps
[Epoch 7 Batch 390/2125] avg loss 0.00197589, throughput 4.00051K wps
[Epoch 7 Batch 420/2125] avg loss 0.00245764, throughput 4.00054K wps
[Epoch 7 Batch 450/2125] avg loss 0.00208687, throughput 3.99801K wps
[Epoch 7 Batch 480/2125] avg loss 0.00188943, throughput 3.99609K wps
[Epoch 7 Batch 510/2125] avg loss 0.00226759, throughput 3.99779K wps
[Epoch 7 Batch 540/2125] avg loss 0.00273505, throughput 3.99812K wps
[Epoch 7 Batch 570/2125] avg loss 0.00225724, throughput 3.9989K wps
[Epoch 7 Batch 600/2125] avg loss 0.00234034, throughput 4.00058K wps
[Epoch 7 Batch 630/2125] avg loss 0.0022215, throughput 3.99763K wps
[Epoch 7 Batch 660/2125] avg loss 0.00208174, throughput 3.99299K wps
[Epoch 7 Batch 690/2125] avg loss 0.00236086, throughput 4.00054K wps
[Epoch 7 Batch 720/2125] avg loss 0.00238492, throughput 3.99898K wps
[Epoch 7 Batch 750/2125] avg loss 0.00264821, throughput 3.99996K wps
[Epoch 7 Batch 780/2125] avg loss 0.00205827, throughput 3.99851K wps
[Epoch 7 Batch 810/2125] avg loss 0.00241725, throughput 3.99528K wps
[Epoch 7 Batch 840/2125] avg loss 0.00206261, throughput 3.99716K wps
[Epoch 7 Batch 870/2125] avg loss 0.0021823, throughput 3.99769K wps
[Epoch 7 Batch 900/2125] avg loss 0.00228839, throughput 3.99582K wps
[Epoch 7 Batch 930/2125] avg loss 0.00204104, throughput 3.99635K wps
[Epoch 7 Batch 960/2125] avg loss 0.00264661, throughput 3.99854K wps
[Epoch 7 Batch 990/2125] avg loss 0.00192559, throughput 3.99475K wps
[Epoch 7 Batch 1020/2125] avg loss 0.00219127, throughput 3.99536K wps
[Epoch 7 Batch 1050/2125] avg loss 0.00218674, throughput 3.99791K wps
[Epoch 7 Batch 1080/2125] avg loss 0.00231877, throughput 3.99704K wps
[Epoch 7 Batch 1110/2125] avg loss 0.00282527, throughput 4.00033K wps
[Epoch 7 Batch 1140/2125] avg loss 0.00266714, throughput 3.99473K wps
[Epoch 7 Batch 1170/2125] avg loss 0.00243537, throughput 3.9993K wps
[Epoch 7 Batch 1200/2125] avg loss 0.00225484, throughput 3.99617K wps
[Epoch 7 Batch 1230/2125] avg loss 0.00259499, throughput 3.99892K wps
[Epoch 7 Batch 1260/2125] avg loss 0.00240174, throughput 3.99977K wps
[Epoch 7 Batch 1290/2125] avg loss 0.00199789, throughput 4.0004K wps
[Epoch 7 Batch 1320/2125] avg loss 0.00256965, throughput 3.99705K wps
[Epoch 7 Batch 1350/2125] avg loss 0.00210882, throughput 3.99913K wps
[Epoch 7 Batch 1380/2125] avg loss 0.00242438, throughput 3.99631K wps
[Epoch 7 Batch 1410/2125] avg loss 0.00237462, throughput 3.99911K wps
[Epoch 7 Batch 1440/2125] avg loss 0.00222942, throughput 3.99948K wps
[Epoch 7 Batch 1470/2125] avg loss 0.00209812, throughput 3.99927K wps
[Epoch 7 Batch 1500/2125] avg loss 0.00198964, throughput 3.99764K wps
[Epoch 7 Batch 1530/2125] avg loss 0.00270763, throughput 3.99622K wps
[Epoch 7 Batch 1560/2125] avg loss 0.00235467, throughput 3.9983K wps
[Epoch 7 Batch 1590/2125] avg loss 0.00251543, throughput 3.99825K wps
[Epoch 7 Batch 1620/2125] avg loss 0.00260508, throughput 4K wps
[Epoch 7 Batch 1650/2125] avg loss 0.00292433, throughput 4.00132K wps
[Epoch 7 Batch 1680/2125] avg loss 0.00254386, throughput 4.00117K wps
[Epoch 7 Batch 1710/2125] avg loss 0.00236536, throughput 3.99931K wps
[Epoch 7 Batch 1740/2125] avg loss 0.00286541, throughput 3.99772K wps
[Epoch 7 Batch 1770/2125] avg loss 0.00227186, throughput 4.00194K wps
[Epoch 7 Batch 1800/2125] avg loss 0.00247652, throughput 4.00084K wps
[Epoch 7 Batch 1830/2125] avg loss 0.00285552, throughput 3.99837K wps
[Epoch 7 Batch 1860/2125] avg loss 0.00263854, throughput 4.00332K wps
[Epoch 7 Batch 1890/2125] avg loss 0.00210071, throughput 3.99905K wps
[Epoch 7 Batch 1920/2125] avg loss 0.00239953, throughput 3.99889K wps
[Epoch 7 Batch 1950/2125] avg loss 0.00243826, throughput 3.99726K wps
[Epoch 7 Batch 1980/2125] avg loss 0.00210164, throughput 4.00014K wps
[Epoch 7 Batch 2010/2125] avg loss 0.00249143, throughput 4.00154K wps
[Epoch 7 Batch 2040/2125] avg loss 0.00235105, throughput 3.99744K wps
[Epoch 7 Batch 2070/2125] avg loss 0.00244036, throughput 4.00064K wps
[Epoch 7 Batch 2100/2125] avg loss 0.00239664, throughput 3.99818K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 7] train avg loss 0.00232972, test acc 0.9245, test avg loss 0.253364, throughput 4.00008K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.43 s
[Epoch 8 Batch 30/2125] avg loss 0.0020827, throughput 4.08588K wps
[Epoch 8 Batch 60/2125] avg loss 0.00166811, throughput 3.99781K wps
[Epoch 8 Batch 90/2125] avg loss 0.00183865, throughput 4.00016K wps
[Epoch 8 Batch 120/2125] avg loss 0.00189961, throughput 3.99895K wps
[Epoch 8 Batch 150/2125] avg loss 0.00180559, throughput 4.00035K wps
[Epoch 8 Batch 180/2125] avg loss 0.0019114, throughput 3.99805K wps
[Epoch 8 Batch 210/2125] avg loss 0.00219012, throughput 4.00271K wps
[Epoch 8 Batch 240/2125] avg loss 0.00183277, throughput 3.99982K wps
[Epoch 8 Batch 270/2125] avg loss 0.00218164, throughput 4.00023K wps
[Epoch 8 Batch 300/2125] avg loss 0.00208982, throughput 4.0004K wps
[Epoch 8 Batch 330/2125] avg loss 0.00183288, throughput 4.00056K wps
[Epoch 8 Batch 360/2125] avg loss 0.0019262, throughput 3.99983K wps
[Epoch 8 Batch 390/2125] avg loss 0.00209054, throughput 3.99926K wps
[Epoch 8 Batch 420/2125] avg loss 0.00204555, throughput 4.00233K wps
[Epoch 8 Batch 450/2125] avg loss 0.00196361, throughput 4.00054K wps
[Epoch 8 Batch 480/2125] avg loss 0.00196424, throughput 4.00061K wps
[Epoch 8 Batch 510/2125] avg loss 0.00216374, throughput 4.0001K wps
[Epoch 8 Batch 540/2125] avg loss 0.00195603, throughput 3.99901K wps
[Epoch 8 Batch 570/2125] avg loss 0.0021751, throughput 3.99855K wps
[Epoch 8 Batch 600/2125] avg loss 0.0020346, throughput 4.0006K wps
[Epoch 8 Batch 630/2125] avg loss 0.00181399, throughput 3.99713K wps
[Epoch 8 Batch 660/2125] avg loss 0.00225244, throughput 3.99765K wps
[Epoch 8 Batch 690/2125] avg loss 0.00238105, throughput 3.99962K wps
[Epoch 8 Batch 720/2125] avg loss 0.0018422, throughput 4.00004K wps
[Epoch 8 Batch 750/2125] avg loss 0.00242821, throughput 3.99898K wps
[Epoch 8 Batch 780/2125] avg loss 0.00207147, throughput 3.99943K wps
[Epoch 8 Batch 810/2125] avg loss 0.00208339, throughput 4.00256K wps
[Epoch 8 Batch 840/2125] avg loss 0.00226943, throughput 3.99919K wps
[Epoch 8 Batch 870/2125] avg loss 0.0026114, throughput 3.99728K wps
[Epoch 8 Batch 900/2125] avg loss 0.00210617, throughput 3.99655K wps
[Epoch 8 Batch 930/2125] avg loss 0.00180289, throughput 3.99468K wps
[Epoch 8 Batch 960/2125] avg loss 0.00241649, throughput 3.99824K wps
[Epoch 8 Batch 990/2125] avg loss 0.00245444, throughput 3.99623K wps
[Epoch 8 Batch 1020/2125] avg loss 0.00218489, throughput 3.99707K wps
[Epoch 8 Batch 1050/2125] avg loss 0.00218768, throughput 4.00232K wps
[Epoch 8 Batch 1080/2125] avg loss 0.00207225, throughput 4.00212K wps
[Epoch 8 Batch 1110/2125] avg loss 0.00221815, throughput 3.9928K wps
[Epoch 8 Batch 1140/2125] avg loss 0.00209712, throughput 3.9893K wps
[Epoch 8 Batch 1170/2125] avg loss 0.00221913, throughput 4.00356K wps
[Epoch 8 Batch 1200/2125] avg loss 0.00226165, throughput 3.99796K wps
[Epoch 8 Batch 1230/2125] avg loss 0.00201832, throughput 4.00438K wps
[Epoch 8 Batch 1260/2125] avg loss 0.00200858, throughput 4.00192K wps
[Epoch 8 Batch 1290/2125] avg loss 0.00229307, throughput 3.99903K wps
[Epoch 8 Batch 1320/2125] avg loss 0.00231625, throughput 4.00374K wps
[Epoch 8 Batch 1350/2125] avg loss 0.00198324, throughput 4.00078K wps
[Epoch 8 Batch 1380/2125] avg loss 0.00246484, throughput 4.0005K wps
[Epoch 8 Batch 1410/2125] avg loss 0.00231798, throughput 4.00114K wps
[Epoch 8 Batch 1440/2125] avg loss 0.00201665, throughput 3.99913K wps
[Epoch 8 Batch 1470/2125] avg loss 0.002221, throughput 3.9986K wps
[Epoch 8 Batch 1500/2125] avg loss 0.00222868, throughput 4.00223K wps
[Epoch 8 Batch 1530/2125] avg loss 0.00242457, throughput 3.99674K wps
[Epoch 8 Batch 1560/2125] avg loss 0.00213446, throughput 3.99848K wps
[Epoch 8 Batch 1590/2125] avg loss 0.00216858, throughput 3.99487K wps
[Epoch 8 Batch 1620/2125] avg loss 0.00208444, throughput 4.00031K wps
[Epoch 8 Batch 1650/2125] avg loss 0.00221152, throughput 3.99825K wps
[Epoch 8 Batch 1680/2125] avg loss 0.0017208, throughput 4.00128K wps
[Epoch 8 Batch 1710/2125] avg loss 0.0021396, throughput 3.9925K wps
[Epoch 8 Batch 1740/2125] avg loss 0.00228274, throughput 3.99886K wps
[Epoch 8 Batch 1770/2125] avg loss 0.00281929, throughput 4.00089K wps
[Epoch 8 Batch 1800/2125] avg loss 0.00242054, throughput 4.00066K wps
[Epoch 8 Batch 1830/2125] avg loss 0.0023918, throughput 4.00082K wps
[Epoch 8 Batch 1860/2125] avg loss 0.00233443, throughput 3.99899K wps
[Epoch 8 Batch 1890/2125] avg loss 0.00201438, throughput 3.9994K wps
[Epoch 8 Batch 1920/2125] avg loss 0.00204531, throughput 3.99648K wps
[Epoch 8 Batch 1950/2125] avg loss 0.00209994, throughput 3.99477K wps
[Epoch 8 Batch 1980/2125] avg loss 0.00264662, throughput 3.99454K wps
[Epoch 8 Batch 2010/2125] avg loss 0.0022478, throughput 3.99845K wps
[Epoch 8 Batch 2040/2125] avg loss 0.00249431, throughput 4.0008K wps
[Epoch 8 Batch 2070/2125] avg loss 0.00247069, throughput 3.99903K wps
[Epoch 8 Batch 2100/2125] avg loss 0.0020063, throughput 3.99913K wps
Begin Testing...
[Batch 30/237] elapsed 0.46 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 8] train avg loss 0.00215639, test acc 0.9241, test avg loss 0.262648, throughput 4.00035K wps
[Epoch 9 Batch 30/2125] avg loss 0.00182424, throughput 4.09643K wps
[Epoch 9 Batch 60/2125] avg loss 0.00195425, throughput 4.00526K wps
[Epoch 9 Batch 90/2125] avg loss 0.001745, throughput 4.00641K wps
[Epoch 9 Batch 120/2125] avg loss 0.00179037, throughput 3.99767K wps
[Epoch 9 Batch 150/2125] avg loss 0.00182356, throughput 3.99757K wps
[Epoch 9 Batch 180/2125] avg loss 0.00149395, throughput 3.99629K wps
[Epoch 9 Batch 210/2125] avg loss 0.00149949, throughput 3.99575K wps
[Epoch 9 Batch 240/2125] avg loss 0.00191736, throughput 3.99876K wps
[Epoch 9 Batch 270/2125] avg loss 0.00200521, throughput 3.99996K wps
[Epoch 9 Batch 300/2125] avg loss 0.00176404, throughput 3.99886K wps
[Epoch 9 Batch 330/2125] avg loss 0.00158298, throughput 3.99654K wps
[Epoch 9 Batch 360/2125] avg loss 0.00206157, throughput 4.00124K wps
[Epoch 9 Batch 390/2125] avg loss 0.00171151, throughput 3.99779K wps
[Epoch 9 Batch 420/2125] avg loss 0.00217365, throughput 3.9971K wps
[Epoch 9 Batch 450/2125] avg loss 0.00203262, throughput 3.99628K wps
[Epoch 9 Batch 480/2125] avg loss 0.00203901, throughput 3.99772K wps
[Epoch 9 Batch 510/2125] avg loss 0.00200545, throughput 3.99308K wps
[Epoch 9 Batch 540/2125] avg loss 0.00198275, throughput 3.9994K wps
[Epoch 9 Batch 570/2125] avg loss 0.00166026, throughput 3.99679K wps
[Epoch 9 Batch 600/2125] avg loss 0.00190238, throughput 3.99661K wps
[Epoch 9 Batch 630/2125] avg loss 0.00194891, throughput 3.99943K wps
[Epoch 9 Batch 660/2125] avg loss 0.0016567, throughput 4.00027K wps
[Epoch 9 Batch 690/2125] avg loss 0.0019335, throughput 4.0002K wps
[Epoch 9 Batch 720/2125] avg loss 0.0018421, throughput 3.99595K wps
[Epoch 9 Batch 750/2125] avg loss 0.00193512, throughput 3.9977K wps
[Epoch 9 Batch 780/2125] avg loss 0.0017217, throughput 3.99664K wps
[Epoch 9 Batch 810/2125] avg loss 0.00193914, throughput 3.99872K wps
[Epoch 9 Batch 840/2125] avg loss 0.00176153, throughput 4.00374K wps
[Epoch 9 Batch 870/2125] avg loss 0.00180941, throughput 4.00074K wps
[Epoch 9 Batch 900/2125] avg loss 0.00183327, throughput 4.0013K wps
[Epoch 9 Batch 930/2125] avg loss 0.00184996, throughput 4.00096K wps
[Epoch 9 Batch 960/2125] avg loss 0.00206765, throughput 4.00277K wps
[Epoch 9 Batch 990/2125] avg loss 0.00191093, throughput 4.00622K wps
[Epoch 9 Batch 1020/2125] avg loss 0.00190371, throughput 4.00448K wps
[Epoch 9 Batch 1050/2125] avg loss 0.00189038, throughput 4.00244K wps
[Epoch 9 Batch 1080/2125] avg loss 0.00207938, throughput 3.99382K wps
[Epoch 9 Batch 1110/2125] avg loss 0.00179534, throughput 3.99816K wps
[Epoch 9 Batch 1140/2125] avg loss 0.00200375, throughput 3.99704K wps
[Epoch 9 Batch 1170/2125] avg loss 0.00196437, throughput 3.9998K wps
[Epoch 9 Batch 1200/2125] avg loss 0.00222712, throughput 3.9978K wps
[Epoch 9 Batch 1230/2125] avg loss 0.00201903, throughput 3.99927K wps
[Epoch 9 Batch 1260/2125] avg loss 0.00196267, throughput 3.99576K wps
[Epoch 9 Batch 1290/2125] avg loss 0.00202064, throughput 3.99884K wps
[Epoch 9 Batch 1320/2125] avg loss 0.00185087, throughput 3.99767K wps
[Epoch 9 Batch 1350/2125] avg loss 0.00196528, throughput 3.99957K wps
[Epoch 9 Batch 1380/2125] avg loss 0.00180156, throughput 3.99766K wps
[Epoch 9 Batch 1410/2125] avg loss 0.00218442, throughput 3.99556K wps
[Epoch 9 Batch 1440/2125] avg loss 0.00179648, throughput 4.00405K wps
[Epoch 9 Batch 1470/2125] avg loss 0.00235052, throughput 3.99494K wps
[Epoch 9 Batch 1500/2125] avg loss 0.0021911, throughput 3.99862K wps
[Epoch 9 Batch 1530/2125] avg loss 0.00237207, throughput 4.00154K wps
[Epoch 9 Batch 1560/2125] avg loss 0.00223092, throughput 3.99876K wps
[Epoch 9 Batch 1590/2125] avg loss 0.00202236, throughput 3.99741K wps
[Epoch 9 Batch 1620/2125] avg loss 0.00197483, throughput 4.00178K wps
[Epoch 9 Batch 1650/2125] avg loss 0.00181497, throughput 3.99542K wps
[Epoch 9 Batch 1680/2125] avg loss 0.00208414, throughput 4.00048K wps
[Epoch 9 Batch 1710/2125] avg loss 0.00222789, throughput 3.9945K wps
[Epoch 9 Batch 1740/2125] avg loss 0.00239479, throughput 4.00021K wps
[Epoch 9 Batch 1770/2125] avg loss 0.0022883, throughput 3.99964K wps
[Epoch 9 Batch 1800/2125] avg loss 0.00221379, throughput 3.99406K wps
[Epoch 9 Batch 1830/2125] avg loss 0.00223983, throughput 3.99378K wps
[Epoch 9 Batch 1860/2125] avg loss 0.00252473, throughput 3.99987K wps
[Epoch 9 Batch 1890/2125] avg loss 0.0021407, throughput 3.99839K wps
[Epoch 9 Batch 1920/2125] avg loss 0.0022464, throughput 3.99638K wps
[Epoch 9 Batch 1950/2125] avg loss 0.0023794, throughput 3.99852K wps
[Epoch 9 Batch 1980/2125] avg loss 0.00234652, throughput 3.99516K wps
[Epoch 9 Batch 2010/2125] avg loss 0.00191696, throughput 3.9941K wps
[Epoch 9 Batch 2040/2125] avg loss 0.00231407, throughput 3.99785K wps
[Epoch 9 Batch 2070/2125] avg loss 0.00237412, throughput 3.99953K wps
[Epoch 9 Batch 2100/2125] avg loss 0.00193697, throughput 4.00012K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 9] train avg loss 0.00198911, test acc 0.9251, test avg loss 0.271076, throughput 3.99999K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.43 s
[Epoch 10 Batch 30/2125] avg loss 0.00161246, throughput 4.09126K wps
[Epoch 10 Batch 60/2125] avg loss 0.00159329, throughput 3.99932K wps
[Epoch 10 Batch 90/2125] avg loss 0.0019143, throughput 3.99465K wps
[Epoch 10 Batch 120/2125] avg loss 0.0020381, throughput 4.00127K wps
[Epoch 10 Batch 150/2125] avg loss 0.00150598, throughput 3.999K wps
[Epoch 10 Batch 180/2125] avg loss 0.00174184, throughput 4.00071K wps
[Epoch 10 Batch 210/2125] avg loss 0.00168061, throughput 3.99671K wps
[Epoch 10 Batch 240/2125] avg loss 0.00181365, throughput 3.99878K wps
[Epoch 10 Batch 270/2125] avg loss 0.00179769, throughput 4.0012K wps
[Epoch 10 Batch 300/2125] avg loss 0.00188586, throughput 3.99768K wps
[Epoch 10 Batch 330/2125] avg loss 0.0014642, throughput 3.98215K wps
[Epoch 10 Batch 360/2125] avg loss 0.00166928, throughput 3.99509K wps
[Epoch 10 Batch 390/2125] avg loss 0.00191644, throughput 3.99794K wps
[Epoch 10 Batch 420/2125] avg loss 0.00183531, throughput 3.99619K wps
[Epoch 10 Batch 450/2125] avg loss 0.00216231, throughput 4.00189K wps
[Epoch 10 Batch 480/2125] avg loss 0.00190306, throughput 4.00107K wps
[Epoch 10 Batch 510/2125] avg loss 0.00163717, throughput 4.00417K wps
[Epoch 10 Batch 540/2125] avg loss 0.00160676, throughput 3.99857K wps
[Epoch 10 Batch 570/2125] avg loss 0.00162635, throughput 4.00301K wps
[Epoch 10 Batch 600/2125] avg loss 0.00152909, throughput 3.99942K wps
[Epoch 10 Batch 630/2125] avg loss 0.00185673, throughput 4.00266K wps
[Epoch 10 Batch 660/2125] avg loss 0.00191634, throughput 3.99901K wps
[Epoch 10 Batch 690/2125] avg loss 0.00177302, throughput 3.99902K wps
[Epoch 10 Batch 720/2125] avg loss 0.00211625, throughput 3.99884K wps
[Epoch 10 Batch 750/2125] avg loss 0.0018063, throughput 3.99773K wps
[Epoch 10 Batch 780/2125] avg loss 0.0016285, throughput 4.00008K wps
[Epoch 10 Batch 810/2125] avg loss 0.00169169, throughput 3.99621K wps
[Epoch 10 Batch 840/2125] avg loss 0.00197939, throughput 3.99379K wps
[Epoch 10 Batch 870/2125] avg loss 0.00179231, throughput 3.99566K wps
[Epoch 10 Batch 900/2125] avg loss 0.00184352, throughput 3.997K wps
[Epoch 10 Batch 930/2125] avg loss 0.00181249, throughput 3.99749K wps
[Epoch 10 Batch 960/2125] avg loss 0.00187933, throughput 4.00072K wps
[Epoch 10 Batch 990/2125] avg loss 0.00206006, throughput 3.99698K wps
[Epoch 10 Batch 1020/2125] avg loss 0.00187025, throughput 3.99508K wps
[Epoch 10 Batch 1050/2125] avg loss 0.00178138, throughput 3.99899K wps
[Epoch 10 Batch 1080/2125] avg loss 0.0020451, throughput 3.99974K wps
[Epoch 10 Batch 1110/2125] avg loss 0.0016697, throughput 4.0017K wps
[Epoch 10 Batch 1140/2125] avg loss 0.00182663, throughput 3.99757K wps
[Epoch 10 Batch 1170/2125] avg loss 0.00188183, throughput 3.99361K wps
[Epoch 10 Batch 1200/2125] avg loss 0.00170622, throughput 3.99972K wps
[Epoch 10 Batch 1230/2125] avg loss 0.00177944, throughput 3.99696K wps
[Epoch 10 Batch 1260/2125] avg loss 0.00182049, throughput 3.99608K wps
[Epoch 10 Batch 1290/2125] avg loss 0.00163063, throughput 3.99697K wps
[Epoch 10 Batch 1320/2125] avg loss 0.00191474, throughput 3.99931K wps
[Epoch 10 Batch 1350/2125] avg loss 0.00188029, throughput 3.99655K wps
[Epoch 10 Batch 1380/2125] avg loss 0.00179538, throughput 4.00111K wps
[Epoch 10 Batch 1410/2125] avg loss 0.00173436, throughput 3.99731K wps
[Epoch 10 Batch 1440/2125] avg loss 0.00164505, throughput 3.99958K wps
[Epoch 10 Batch 1470/2125] avg loss 0.00185766, throughput 3.99577K wps
[Epoch 10 Batch 1500/2125] avg loss 0.00220893, throughput 3.99594K wps
[Epoch 10 Batch 1530/2125] avg loss 0.00192623, throughput 3.99696K wps
[Epoch 10 Batch 1560/2125] avg loss 0.00181159, throughput 3.99943K wps
[Epoch 10 Batch 1590/2125] avg loss 0.00193049, throughput 3.99928K wps
[Epoch 10 Batch 1620/2125] avg loss 0.00211622, throughput 3.99835K wps
[Epoch 10 Batch 1650/2125] avg loss 0.00190381, throughput 3.998K wps
[Epoch 10 Batch 1680/2125] avg loss 0.00217869, throughput 3.99809K wps
[Epoch 10 Batch 1710/2125] avg loss 0.00187785, throughput 3.99869K wps
[Epoch 10 Batch 1740/2125] avg loss 0.00203472, throughput 3.99849K wps
[Epoch 10 Batch 1770/2125] avg loss 0.00190781, throughput 3.99754K wps
[Epoch 10 Batch 1800/2125] avg loss 0.00188542, throughput 3.99738K wps
[Epoch 10 Batch 1830/2125] avg loss 0.00204809, throughput 3.9966K wps
[Epoch 10 Batch 1860/2125] avg loss 0.001891, throughput 3.99548K wps
[Epoch 10 Batch 1890/2125] avg loss 0.00238344, throughput 3.9953K wps
[Epoch 10 Batch 1920/2125] avg loss 0.00178786, throughput 3.9976K wps
[Epoch 10 Batch 1950/2125] avg loss 0.00167753, throughput 3.99735K wps
[Epoch 10 Batch 1980/2125] avg loss 0.00193586, throughput 3.99321K wps
[Epoch 10 Batch 2010/2125] avg loss 0.00214882, throughput 3.99876K wps
[Epoch 10 Batch 2040/2125] avg loss 0.00230855, throughput 3.99868K wps
[Epoch 10 Batch 2070/2125] avg loss 0.00166984, throughput 3.99802K wps
[Epoch 10 Batch 2100/2125] avg loss 0.00148174, throughput 3.99952K wps
Begin Testing...
[Batch 30/237] elapsed 0.46 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 10] train avg loss 0.00184587, test acc 0.9241, test avg loss 0.284485, throughput 3.99923K wps
[Epoch 11 Batch 30/2125] avg loss 0.00148285, throughput 4.09668K wps
[Epoch 11 Batch 60/2125] avg loss 0.00188634, throughput 3.99774K wps
[Epoch 11 Batch 90/2125] avg loss 0.00162976, throughput 3.99675K wps
[Epoch 11 Batch 120/2125] avg loss 0.00157317, throughput 3.99944K wps
[Epoch 11 Batch 150/2125] avg loss 0.00147451, throughput 4.00497K wps
[Epoch 11 Batch 180/2125] avg loss 0.00166872, throughput 3.99442K wps
[Epoch 11 Batch 210/2125] avg loss 0.00151675, throughput 3.99885K wps
[Epoch 11 Batch 240/2125] avg loss 0.00131241, throughput 3.99976K wps
[Epoch 11 Batch 270/2125] avg loss 0.00131386, throughput 3.99652K wps
[Epoch 11 Batch 300/2125] avg loss 0.00155076, throughput 4.00128K wps
[Epoch 11 Batch 330/2125] avg loss 0.00168219, throughput 3.99642K wps
[Epoch 11 Batch 360/2125] avg loss 0.00151938, throughput 3.99718K wps
[Epoch 11 Batch 390/2125] avg loss 0.00139004, throughput 3.99945K wps
[Epoch 11 Batch 420/2125] avg loss 0.00125788, throughput 3.99748K wps
[Epoch 11 Batch 450/2125] avg loss 0.0018591, throughput 4.00055K wps
[Epoch 11 Batch 480/2125] avg loss 0.00192561, throughput 3.9961K wps
[Epoch 11 Batch 510/2125] avg loss 0.00155708, throughput 3.99642K wps
[Epoch 11 Batch 540/2125] avg loss 0.00166069, throughput 3.99789K wps
[Epoch 11 Batch 570/2125] avg loss 0.00190451, throughput 3.99819K wps
[Epoch 11 Batch 600/2125] avg loss 0.00142731, throughput 3.99835K wps
[Epoch 11 Batch 630/2125] avg loss 0.00186795, throughput 4.00037K wps
[Epoch 11 Batch 660/2125] avg loss 0.00165996, throughput 4.00036K wps
[Epoch 11 Batch 690/2125] avg loss 0.00155261, throughput 3.9992K wps
[Epoch 11 Batch 720/2125] avg loss 0.00156885, throughput 3.99483K wps
[Epoch 11 Batch 750/2125] avg loss 0.00214112, throughput 3.99616K wps
[Epoch 11 Batch 780/2125] avg loss 0.00172047, throughput 3.99405K wps
[Epoch 11 Batch 810/2125] avg loss 0.00152337, throughput 3.99626K wps
[Epoch 11 Batch 840/2125] avg loss 0.00197258, throughput 3.99745K wps
[Epoch 11 Batch 870/2125] avg loss 0.00158748, throughput 3.99385K wps
[Epoch 11 Batch 900/2125] avg loss 0.00183022, throughput 3.99837K wps
[Epoch 11 Batch 930/2125] avg loss 0.00178369, throughput 3.99804K wps
[Epoch 11 Batch 960/2125] avg loss 0.00182349, throughput 3.99499K wps
[Epoch 11 Batch 990/2125] avg loss 0.0019624, throughput 4.0023K wps
[Epoch 11 Batch 1020/2125] avg loss 0.00191406, throughput 3.99735K wps
[Epoch 11 Batch 1050/2125] avg loss 0.0017117, throughput 3.9998K wps
[Epoch 11 Batch 1080/2125] avg loss 0.00230425, throughput 3.99835K wps
[Epoch 11 Batch 1110/2125] avg loss 0.00200271, throughput 3.99991K wps
[Epoch 11 Batch 1140/2125] avg loss 0.00176632, throughput 3.99729K wps
[Epoch 11 Batch 1170/2125] avg loss 0.00205904, throughput 3.9978K wps
[Epoch 11 Batch 1200/2125] avg loss 0.00174823, throughput 3.99674K wps
[Epoch 11 Batch 1230/2125] avg loss 0.00173348, throughput 3.99545K wps
[Epoch 11 Batch 1260/2125] avg loss 0.00207612, throughput 3.99436K wps
[Epoch 11 Batch 1290/2125] avg loss 0.0014544, throughput 3.99719K wps
[Epoch 11 Batch 1320/2125] avg loss 0.00232286, throughput 3.99598K wps
[Epoch 11 Batch 1350/2125] avg loss 0.00191941, throughput 3.99804K wps
[Epoch 11 Batch 1380/2125] avg loss 0.00209322, throughput 3.99722K wps
[Epoch 11 Batch 1410/2125] avg loss 0.00159888, throughput 3.99799K wps
[Epoch 11 Batch 1440/2125] avg loss 0.00162708, throughput 3.99891K wps
[Epoch 11 Batch 1470/2125] avg loss 0.00194368, throughput 3.99486K wps
[Epoch 11 Batch 1500/2125] avg loss 0.00177019, throughput 3.99304K wps
[Epoch 11 Batch 1530/2125] avg loss 0.0017951, throughput 4K wps
[Epoch 11 Batch 1560/2125] avg loss 0.00181874, throughput 3.99595K wps
[Epoch 11 Batch 1590/2125] avg loss 0.00197475, throughput 3.9967K wps
[Epoch 11 Batch 1620/2125] avg loss 0.001995, throughput 4.00243K wps
[Epoch 11 Batch 1650/2125] avg loss 0.00191521, throughput 4.00083K wps
[Epoch 11 Batch 1680/2125] avg loss 0.00180007, throughput 3.99613K wps
[Epoch 11 Batch 1710/2125] avg loss 0.00187816, throughput 3.9999K wps
[Epoch 11 Batch 1740/2125] avg loss 0.00160779, throughput 3.97523K wps
[Epoch 11 Batch 1770/2125] avg loss 0.00215991, throughput 4.00307K wps
[Epoch 11 Batch 1800/2125] avg loss 0.00198507, throughput 4.00413K wps
[Epoch 11 Batch 1830/2125] avg loss 0.0018723, throughput 4.0059K wps
[Epoch 11 Batch 1860/2125] avg loss 0.00166865, throughput 4.00021K wps
[Epoch 11 Batch 1890/2125] avg loss 0.00177707, throughput 4.00568K wps
[Epoch 11 Batch 1920/2125] avg loss 0.00169021, throughput 3.99819K wps
[Epoch 11 Batch 1950/2125] avg loss 0.00183034, throughput 3.99815K wps
[Epoch 11 Batch 1980/2125] avg loss 0.00186279, throughput 3.99706K wps
[Epoch 11 Batch 2010/2125] avg loss 0.00158909, throughput 3.99632K wps
[Epoch 11 Batch 2040/2125] avg loss 0.00174095, throughput 3.99805K wps
[Epoch 11 Batch 2070/2125] avg loss 0.0019252, throughput 3.99746K wps
[Epoch 11 Batch 2100/2125] avg loss 0.0021274, throughput 3.99725K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 11] train avg loss 0.00176501, test acc 0.9275, test avg loss 0.29296, throughput 3.99927K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.43 s
[Epoch 12 Batch 30/2125] avg loss 0.00129393, throughput 4.08811K wps
[Epoch 12 Batch 60/2125] avg loss 0.00143885, throughput 4.00236K wps
[Epoch 12 Batch 90/2125] avg loss 0.00133644, throughput 3.99932K wps
[Epoch 12 Batch 120/2125] avg loss 0.00165476, throughput 4.00112K wps
[Epoch 12 Batch 150/2125] avg loss 0.00165715, throughput 4.00821K wps
[Epoch 12 Batch 180/2125] avg loss 0.00124659, throughput 3.99878K wps
[Epoch 12 Batch 210/2125] avg loss 0.00141917, throughput 3.99785K wps
[Epoch 12 Batch 240/2125] avg loss 0.00171562, throughput 4.00107K wps
[Epoch 12 Batch 270/2125] avg loss 0.00146289, throughput 3.9973K wps
[Epoch 12 Batch 300/2125] avg loss 0.00155682, throughput 4.00467K wps
[Epoch 12 Batch 330/2125] avg loss 0.00171591, throughput 3.99833K wps
[Epoch 12 Batch 360/2125] avg loss 0.00164373, throughput 4.00061K wps
[Epoch 12 Batch 390/2125] avg loss 0.00178253, throughput 4.00013K wps
[Epoch 12 Batch 420/2125] avg loss 0.00134182, throughput 3.99485K wps
[Epoch 12 Batch 450/2125] avg loss 0.00168795, throughput 4.00151K wps
[Epoch 12 Batch 480/2125] avg loss 0.00161439, throughput 3.9997K wps
[Epoch 12 Batch 510/2125] avg loss 0.00152796, throughput 3.99909K wps
[Epoch 12 Batch 540/2125] avg loss 0.00135535, throughput 3.99945K wps
[Epoch 12 Batch 570/2125] avg loss 0.00129757, throughput 3.99937K wps
[Epoch 12 Batch 600/2125] avg loss 0.00166433, throughput 3.99917K wps
[Epoch 12 Batch 630/2125] avg loss 0.00177641, throughput 3.99952K wps
[Epoch 12 Batch 660/2125] avg loss 0.00138355, throughput 3.99882K wps
[Epoch 12 Batch 690/2125] avg loss 0.00161383, throughput 3.9983K wps
[Epoch 12 Batch 720/2125] avg loss 0.00145325, throughput 3.99794K wps
[Epoch 12 Batch 750/2125] avg loss 0.00148319, throughput 3.99625K wps
[Epoch 12 Batch 780/2125] avg loss 0.0014777, throughput 3.99863K wps
[Epoch 12 Batch 810/2125] avg loss 0.00149331, throughput 3.99466K wps
[Epoch 12 Batch 840/2125] avg loss 0.00141492, throughput 3.99815K wps
[Epoch 12 Batch 870/2125] avg loss 0.00157201, throughput 3.99692K wps
[Epoch 12 Batch 900/2125] avg loss 0.00165, throughput 3.99972K wps
[Epoch 12 Batch 930/2125] avg loss 0.00148036, throughput 3.99831K wps
[Epoch 12 Batch 960/2125] avg loss 0.00182865, throughput 3.998K wps
[Epoch 12 Batch 990/2125] avg loss 0.00166504, throughput 3.9975K wps
[Epoch 12 Batch 1020/2125] avg loss 0.00135034, throughput 4.00056K wps
[Epoch 12 Batch 1050/2125] avg loss 0.00137402, throughput 3.99935K wps
[Epoch 12 Batch 1080/2125] avg loss 0.00225242, throughput 4.00171K wps
[Epoch 12 Batch 1110/2125] avg loss 0.00170478, throughput 3.99643K wps
[Epoch 12 Batch 1140/2125] avg loss 0.00175455, throughput 4.00118K wps
[Epoch 12 Batch 1170/2125] avg loss 0.00158357, throughput 3.99733K wps
[Epoch 12 Batch 1200/2125] avg loss 0.00168178, throughput 3.9964K wps
[Epoch 12 Batch 1230/2125] avg loss 0.00132912, throughput 3.99652K wps
[Epoch 12 Batch 1260/2125] avg loss 0.00150621, throughput 3.99843K wps
[Epoch 12 Batch 1290/2125] avg loss 0.0017541, throughput 4.00119K wps
[Epoch 12 Batch 1320/2125] avg loss 0.00145194, throughput 3.99823K wps
[Epoch 12 Batch 1350/2125] avg loss 0.00158289, throughput 4.0031K wps
[Epoch 12 Batch 1380/2125] avg loss 0.00208246, throughput 3.99537K wps
[Epoch 12 Batch 1410/2125] avg loss 0.00150327, throughput 3.99841K wps
[Epoch 12 Batch 1440/2125] avg loss 0.00199407, throughput 3.99788K wps
[Epoch 12 Batch 1470/2125] avg loss 0.00177998, throughput 3.99571K wps
[Epoch 12 Batch 1500/2125] avg loss 0.00151887, throughput 3.99785K wps
[Epoch 12 Batch 1530/2125] avg loss 0.0017708, throughput 3.99949K wps
[Epoch 12 Batch 1560/2125] avg loss 0.00160684, throughput 3.99939K wps
[Epoch 12 Batch 1590/2125] avg loss 0.00151553, throughput 3.99901K wps
[Epoch 12 Batch 1620/2125] avg loss 0.00155109, throughput 3.99828K wps
[Epoch 12 Batch 1650/2125] avg loss 0.00177758, throughput 3.99993K wps
[Epoch 12 Batch 1680/2125] avg loss 0.00172416, throughput 3.99865K wps
[Epoch 12 Batch 1710/2125] avg loss 0.00181669, throughput 3.99723K wps
[Epoch 12 Batch 1740/2125] avg loss 0.00162695, throughput 3.99854K wps
[Epoch 12 Batch 1770/2125] avg loss 0.0019472, throughput 4.00153K wps
[Epoch 12 Batch 1800/2125] avg loss 0.00175736, throughput 4.00272K wps
[Epoch 12 Batch 1830/2125] avg loss 0.00143236, throughput 3.99699K wps
[Epoch 12 Batch 1860/2125] avg loss 0.00169467, throughput 3.99866K wps
[Epoch 12 Batch 1890/2125] avg loss 0.00137086, throughput 4.00121K wps
[Epoch 12 Batch 1920/2125] avg loss 0.00198699, throughput 3.9957K wps
[Epoch 12 Batch 1950/2125] avg loss 0.00168446, throughput 3.99547K wps
[Epoch 12 Batch 1980/2125] avg loss 0.00192465, throughput 3.99385K wps
[Epoch 12 Batch 2010/2125] avg loss 0.00178117, throughput 3.99757K wps
[Epoch 12 Batch 2040/2125] avg loss 0.00198602, throughput 3.99753K wps
[Epoch 12 Batch 2070/2125] avg loss 0.00191581, throughput 3.99555K wps
[Epoch 12 Batch 2100/2125] avg loss 0.00211577, throughput 3.99971K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 12] train avg loss 0.0016296, test acc 0.9260, test avg loss 0.305619, throughput 4.00011K wps
[Epoch 13 Batch 30/2125] avg loss 0.00124588, throughput 4.09148K wps
[Epoch 13 Batch 60/2125] avg loss 0.00184935, throughput 3.99513K wps
[Epoch 13 Batch 90/2125] avg loss 0.00134027, throughput 4.00114K wps
[Epoch 13 Batch 120/2125] avg loss 0.00134966, throughput 3.9967K wps
[Epoch 13 Batch 150/2125] avg loss 0.00177033, throughput 3.99861K wps
[Epoch 13 Batch 180/2125] avg loss 0.00141572, throughput 3.99692K wps
[Epoch 13 Batch 210/2125] avg loss 0.00174889, throughput 3.99512K wps
[Epoch 13 Batch 240/2125] avg loss 0.00138283, throughput 3.997K wps
[Epoch 13 Batch 270/2125] avg loss 0.00120439, throughput 3.99976K wps
[Epoch 13 Batch 300/2125] avg loss 0.00135878, throughput 4.00132K wps
[Epoch 13 Batch 330/2125] avg loss 0.00139697, throughput 3.99973K wps
[Epoch 13 Batch 360/2125] avg loss 0.00142044, throughput 3.99895K wps
[Epoch 13 Batch 390/2125] avg loss 0.00180966, throughput 3.9985K wps
[Epoch 13 Batch 420/2125] avg loss 0.00144919, throughput 3.99693K wps
[Epoch 13 Batch 450/2125] avg loss 0.00128857, throughput 3.9999K wps
[Epoch 13 Batch 480/2125] avg loss 0.00136773, throughput 3.99633K wps
[Epoch 13 Batch 510/2125] avg loss 0.0015185, throughput 4.00092K wps
[Epoch 13 Batch 540/2125] avg loss 0.00157615, throughput 3.99515K wps
[Epoch 13 Batch 570/2125] avg loss 0.0015212, throughput 3.99964K wps
[Epoch 13 Batch 600/2125] avg loss 0.00134493, throughput 3.99941K wps
[Epoch 13 Batch 630/2125] avg loss 0.0013278, throughput 3.99788K wps
[Epoch 13 Batch 660/2125] avg loss 0.00126263, throughput 3.99891K wps
[Epoch 13 Batch 690/2125] avg loss 0.00164868, throughput 3.9986K wps
[Epoch 13 Batch 720/2125] avg loss 0.00139482, throughput 3.99944K wps
[Epoch 13 Batch 750/2125] avg loss 0.00150061, throughput 3.99651K wps
[Epoch 13 Batch 780/2125] avg loss 0.0013363, throughput 3.99391K wps
[Epoch 13 Batch 810/2125] avg loss 0.00179687, throughput 3.99526K wps
[Epoch 13 Batch 840/2125] avg loss 0.00155391, throughput 3.99746K wps
[Epoch 13 Batch 870/2125] avg loss 0.00151616, throughput 4.00105K wps
[Epoch 13 Batch 900/2125] avg loss 0.00152438, throughput 3.99968K wps
[Epoch 13 Batch 930/2125] avg loss 0.00184682, throughput 3.99175K wps
[Epoch 13 Batch 960/2125] avg loss 0.00168801, throughput 3.9856K wps
[Epoch 13 Batch 990/2125] avg loss 0.00132354, throughput 4.00001K wps
[Epoch 13 Batch 1020/2125] avg loss 0.00142143, throughput 3.99307K wps
[Epoch 13 Batch 1050/2125] avg loss 0.00153544, throughput 3.99903K wps
[Epoch 13 Batch 1080/2125] avg loss 0.0017637, throughput 4.00128K wps
[Epoch 13 Batch 1110/2125] avg loss 0.00152322, throughput 4.00001K wps
[Epoch 13 Batch 1140/2125] avg loss 0.00142272, throughput 3.99783K wps
[Epoch 13 Batch 1170/2125] avg loss 0.00139824, throughput 4.00152K wps
[Epoch 13 Batch 1200/2125] avg loss 0.00136482, throughput 3.99972K wps
[Epoch 13 Batch 1230/2125] avg loss 0.00142623, throughput 3.99496K wps
[Epoch 13 Batch 1260/2125] avg loss 0.0013282, throughput 3.99763K wps
[Epoch 13 Batch 1290/2125] avg loss 0.00126796, throughput 3.99593K wps
[Epoch 13 Batch 1320/2125] avg loss 0.0015553, throughput 3.99683K wps
[Epoch 13 Batch 1350/2125] avg loss 0.00157899, throughput 3.99556K wps
[Epoch 13 Batch 1380/2125] avg loss 0.0018387, throughput 3.99518K wps
[Epoch 13 Batch 1410/2125] avg loss 0.00132891, throughput 3.99743K wps
[Epoch 13 Batch 1440/2125] avg loss 0.00158892, throughput 3.99852K wps
[Epoch 13 Batch 1470/2125] avg loss 0.00173247, throughput 3.99939K wps
[Epoch 13 Batch 1500/2125] avg loss 0.00169469, throughput 3.9993K wps
[Epoch 13 Batch 1530/2125] avg loss 0.00164929, throughput 4.00071K wps
[Epoch 13 Batch 1560/2125] avg loss 0.00162759, throughput 3.99883K wps
[Epoch 13 Batch 1590/2125] avg loss 0.0015097, throughput 3.9956K wps
[Epoch 13 Batch 1620/2125] avg loss 0.00178358, throughput 3.99868K wps
[Epoch 13 Batch 1650/2125] avg loss 0.00164041, throughput 4.00066K wps
[Epoch 13 Batch 1680/2125] avg loss 0.00180463, throughput 3.99538K wps
[Epoch 13 Batch 1710/2125] avg loss 0.00146272, throughput 3.99853K wps
[Epoch 13 Batch 1740/2125] avg loss 0.00177004, throughput 3.9992K wps
[Epoch 13 Batch 1770/2125] avg loss 0.00176379, throughput 3.99993K wps
[Epoch 13 Batch 1800/2125] avg loss 0.00166394, throughput 3.99587K wps
[Epoch 13 Batch 1830/2125] avg loss 0.00154415, throughput 3.99494K wps
[Epoch 13 Batch 1860/2125] avg loss 0.00151395, throughput 3.99316K wps
[Epoch 13 Batch 1890/2125] avg loss 0.00157459, throughput 3.99728K wps
[Epoch 13 Batch 1920/2125] avg loss 0.00149692, throughput 3.99684K wps
[Epoch 13 Batch 1950/2125] avg loss 0.00184251, throughput 3.99371K wps
[Epoch 13 Batch 1980/2125] avg loss 0.0016275, throughput 3.99951K wps
[Epoch 13 Batch 2010/2125] avg loss 0.00173002, throughput 4.0001K wps
[Epoch 13 Batch 2040/2125] avg loss 0.00134406, throughput 3.99977K wps
[Epoch 13 Batch 2070/2125] avg loss 0.00133304, throughput 3.99696K wps
[Epoch 13 Batch 2100/2125] avg loss 0.00150532, throughput 4.00011K wps
Begin Testing...
[Batch 30/237] elapsed 0.46 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 13] train avg loss 0.00153014, test acc 0.9265, test avg loss 0.316382, throughput 3.99904K wps
[Epoch 14 Batch 30/2125] avg loss 0.00127034, throughput 4.0981K wps
[Epoch 14 Batch 60/2125] avg loss 0.00129608, throughput 4.00702K wps
[Epoch 14 Batch 90/2125] avg loss 0.00137308, throughput 4.00111K wps
[Epoch 14 Batch 120/2125] avg loss 0.00109025, throughput 4.00139K wps
[Epoch 14 Batch 150/2125] avg loss 0.00140893, throughput 3.99852K wps
[Epoch 14 Batch 180/2125] avg loss 0.00111953, throughput 3.99957K wps
[Epoch 14 Batch 210/2125] avg loss 0.00152972, throughput 4.00049K wps
[Epoch 14 Batch 240/2125] avg loss 0.00124218, throughput 4K wps
[Epoch 14 Batch 270/2125] avg loss 0.00149156, throughput 3.99877K wps
[Epoch 14 Batch 300/2125] avg loss 0.00150114, throughput 4.00054K wps
[Epoch 14 Batch 330/2125] avg loss 0.00143544, throughput 4.00403K wps
[Epoch 14 Batch 360/2125] avg loss 0.00118093, throughput 4.00205K wps
[Epoch 14 Batch 390/2125] avg loss 0.00129389, throughput 3.99545K wps
[Epoch 14 Batch 420/2125] avg loss 0.00132597, throughput 3.99879K wps
[Epoch 14 Batch 450/2125] avg loss 0.0013764, throughput 4.00287K wps
[Epoch 14 Batch 480/2125] avg loss 0.00122391, throughput 3.99774K wps
[Epoch 14 Batch 510/2125] avg loss 0.00128551, throughput 4.00117K wps
[Epoch 14 Batch 540/2125] avg loss 0.0013829, throughput 3.99788K wps
[Epoch 14 Batch 570/2125] avg loss 0.00133149, throughput 3.99837K wps
[Epoch 14 Batch 600/2125] avg loss 0.00142085, throughput 3.99577K wps
[Epoch 14 Batch 630/2125] avg loss 0.00162747, throughput 4.00197K wps
[Epoch 14 Batch 660/2125] avg loss 0.00158685, throughput 3.99855K wps
[Epoch 14 Batch 690/2125] avg loss 0.00171373, throughput 4.00193K wps
[Epoch 14 Batch 720/2125] avg loss 0.00116644, throughput 3.99633K wps
[Epoch 14 Batch 750/2125] avg loss 0.00124234, throughput 3.99925K wps
[Epoch 14 Batch 780/2125] avg loss 0.00176973, throughput 3.99542K wps
[Epoch 14 Batch 810/2125] avg loss 0.00158835, throughput 3.99651K wps
[Epoch 14 Batch 840/2125] avg loss 0.00140108, throughput 3.99507K wps
[Epoch 14 Batch 870/2125] avg loss 0.00139747, throughput 3.99936K wps
[Epoch 14 Batch 900/2125] avg loss 0.00140692, throughput 3.99803K wps
[Epoch 14 Batch 930/2125] avg loss 0.0014011, throughput 4.00201K wps
[Epoch 14 Batch 960/2125] avg loss 0.00121032, throughput 4.00572K wps
[Epoch 14 Batch 990/2125] avg loss 0.00140938, throughput 4.00189K wps
[Epoch 14 Batch 1020/2125] avg loss 0.00158668, throughput 3.99976K wps
[Epoch 14 Batch 1050/2125] avg loss 0.00125949, throughput 3.99702K wps
[Epoch 14 Batch 1080/2125] avg loss 0.00119996, throughput 3.99766K wps
[Epoch 14 Batch 1110/2125] avg loss 0.00143679, throughput 3.99724K wps
[Epoch 14 Batch 1140/2125] avg loss 0.00134426, throughput 3.99656K wps
[Epoch 14 Batch 1170/2125] avg loss 0.00166982, throughput 4.00149K wps
[Epoch 14 Batch 1200/2125] avg loss 0.00138771, throughput 4.00025K wps
[Epoch 14 Batch 1230/2125] avg loss 0.00137952, throughput 3.99394K wps
[Epoch 14 Batch 1260/2125] avg loss 0.00125742, throughput 4.00244K wps
[Epoch 14 Batch 1290/2125] avg loss 0.00165146, throughput 4.00255K wps
[Epoch 14 Batch 1320/2125] avg loss 0.00170784, throughput 3.99648K wps
[Epoch 14 Batch 1350/2125] avg loss 0.00137613, throughput 3.99648K wps
[Epoch 14 Batch 1380/2125] avg loss 0.00106743, throughput 3.99856K wps
[Epoch 14 Batch 1410/2125] avg loss 0.00145107, throughput 4.00055K wps
[Epoch 14 Batch 1440/2125] avg loss 0.00136935, throughput 3.99874K wps
[Epoch 14 Batch 1470/2125] avg loss 0.0017126, throughput 3.9972K wps
[Epoch 14 Batch 1500/2125] avg loss 0.00143943, throughput 4.00025K wps
[Epoch 14 Batch 1530/2125] avg loss 0.00178885, throughput 3.99577K wps
[Epoch 14 Batch 1560/2125] avg loss 0.00158401, throughput 4.00451K wps
[Epoch 14 Batch 1590/2125] avg loss 0.00148694, throughput 4.00147K wps
[Epoch 14 Batch 1620/2125] avg loss 0.00173551, throughput 3.99729K wps
[Epoch 14 Batch 1650/2125] avg loss 0.00166573, throughput 3.99875K wps
[Epoch 14 Batch 1680/2125] avg loss 0.00149793, throughput 3.99848K wps
[Epoch 14 Batch 1710/2125] avg loss 0.00148056, throughput 4.0001K wps
[Epoch 14 Batch 1740/2125] avg loss 0.00161629, throughput 4.00173K wps
[Epoch 14 Batch 1770/2125] avg loss 0.00150915, throughput 3.99783K wps
[Epoch 14 Batch 1800/2125] avg loss 0.0014465, throughput 3.99935K wps
[Epoch 14 Batch 1830/2125] avg loss 0.0015569, throughput 3.99918K wps
[Epoch 14 Batch 1860/2125] avg loss 0.00142889, throughput 4.00099K wps
[Epoch 14 Batch 1890/2125] avg loss 0.00171854, throughput 4.00214K wps
[Epoch 14 Batch 1920/2125] avg loss 0.00168761, throughput 3.99912K wps
[Epoch 14 Batch 1950/2125] avg loss 0.00146768, throughput 4.00078K wps
[Epoch 14 Batch 1980/2125] avg loss 0.00181348, throughput 3.9961K wps
[Epoch 14 Batch 2010/2125] avg loss 0.00139335, throughput 3.99627K wps
[Epoch 14 Batch 2040/2125] avg loss 0.00179484, throughput 3.99853K wps
[Epoch 14 Batch 2070/2125] avg loss 0.0016057, throughput 3.99919K wps
[Epoch 14 Batch 2100/2125] avg loss 0.00203369, throughput 4.00079K wps
Begin Testing...
[Batch 30/237] elapsed 0.46 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 14] train avg loss 0.00146141, test acc 0.9257, test avg loss 0.325402, throughput 4.00078K wps
[Epoch 15 Batch 30/2125] avg loss 0.00150994, throughput 4.08919K wps
[Epoch 15 Batch 60/2125] avg loss 0.00131382, throughput 3.99585K wps
[Epoch 15 Batch 90/2125] avg loss 0.00118598, throughput 3.99642K wps
[Epoch 15 Batch 120/2125] avg loss 0.00130593, throughput 3.9991K wps
[Epoch 15 Batch 150/2125] avg loss 0.00113085, throughput 3.99241K wps
[Epoch 15 Batch 180/2125] avg loss 0.00123516, throughput 3.98761K wps
[Epoch 15 Batch 210/2125] avg loss 0.00132778, throughput 3.99998K wps
[Epoch 15 Batch 240/2125] avg loss 0.00104262, throughput 3.99622K wps
[Epoch 15 Batch 270/2125] avg loss 0.00125496, throughput 3.99647K wps
[Epoch 15 Batch 300/2125] avg loss 0.00140322, throughput 3.99538K wps
[Epoch 15 Batch 330/2125] avg loss 0.00119728, throughput 3.99861K wps
[Epoch 15 Batch 360/2125] avg loss 0.0011488, throughput 3.99806K wps
[Epoch 15 Batch 390/2125] avg loss 0.00112253, throughput 3.99622K wps
[Epoch 15 Batch 420/2125] avg loss 0.00135466, throughput 3.99566K wps
[Epoch 15 Batch 450/2125] avg loss 0.0011625, throughput 3.99531K wps
[Epoch 15 Batch 480/2125] avg loss 0.00144903, throughput 3.99785K wps
[Epoch 15 Batch 510/2125] avg loss 0.00139536, throughput 3.99648K wps
[Epoch 15 Batch 540/2125] avg loss 0.00128713, throughput 3.99867K wps
[Epoch 15 Batch 570/2125] avg loss 0.00141581, throughput 3.99785K wps
[Epoch 15 Batch 600/2125] avg loss 0.00128177, throughput 3.99648K wps
[Epoch 15 Batch 630/2125] avg loss 0.00135103, throughput 3.99807K wps
[Epoch 15 Batch 660/2125] avg loss 0.00138576, throughput 3.99879K wps
[Epoch 15 Batch 690/2125] avg loss 0.0012571, throughput 3.99507K wps
[Epoch 15 Batch 720/2125] avg loss 0.00138627, throughput 3.99477K wps
[Epoch 15 Batch 750/2125] avg loss 0.00110761, throughput 3.99586K wps
[Epoch 15 Batch 780/2125] avg loss 0.00121602, throughput 4.00048K wps
[Epoch 15 Batch 810/2125] avg loss 0.00160001, throughput 3.9972K wps
[Epoch 15 Batch 840/2125] avg loss 0.00148598, throughput 3.99343K wps
[Epoch 15 Batch 870/2125] avg loss 0.00140106, throughput 3.9949K wps
[Epoch 15 Batch 900/2125] avg loss 0.00156522, throughput 3.99991K wps
[Epoch 15 Batch 930/2125] avg loss 0.00114561, throughput 3.99954K wps
[Epoch 15 Batch 960/2125] avg loss 0.00117359, throughput 3.99642K wps
[Epoch 15 Batch 990/2125] avg loss 0.00127141, throughput 3.99549K wps
[Epoch 15 Batch 1020/2125] avg loss 0.00123663, throughput 3.99829K wps
[Epoch 15 Batch 1050/2125] avg loss 0.00165317, throughput 3.99819K wps
[Epoch 15 Batch 1080/2125] avg loss 0.00140399, throughput 3.99976K wps
[Epoch 15 Batch 1110/2125] avg loss 0.00130847, throughput 3.99801K wps
[Epoch 15 Batch 1140/2125] avg loss 0.00108338, throughput 3.99682K wps
[Epoch 15 Batch 1170/2125] avg loss 0.00138158, throughput 3.99616K wps
[Epoch 15 Batch 1200/2125] avg loss 0.00135361, throughput 3.99617K wps
[Epoch 15 Batch 1230/2125] avg loss 0.00139186, throughput 3.99411K wps
[Epoch 15 Batch 1260/2125] avg loss 0.0013263, throughput 3.99875K wps
[Epoch 15 Batch 1290/2125] avg loss 0.00121903, throughput 3.99759K wps
[Epoch 15 Batch 1320/2125] avg loss 0.0013014, throughput 3.9952K wps
[Epoch 15 Batch 1350/2125] avg loss 0.00129941, throughput 4.00212K wps
[Epoch 15 Batch 1380/2125] avg loss 0.00154668, throughput 3.99917K wps
[Epoch 15 Batch 1410/2125] avg loss 0.00133899, throughput 3.99293K wps
[Epoch 15 Batch 1440/2125] avg loss 0.00124765, throughput 3.99658K wps
[Epoch 15 Batch 1470/2125] avg loss 0.00141983, throughput 3.99534K wps
[Epoch 15 Batch 1500/2125] avg loss 0.00164192, throughput 3.99772K wps
[Epoch 15 Batch 1530/2125] avg loss 0.00141968, throughput 3.99447K wps
[Epoch 15 Batch 1560/2125] avg loss 0.00171388, throughput 4.00658K wps
[Epoch 15 Batch 1590/2125] avg loss 0.00138425, throughput 4.00396K wps
[Epoch 15 Batch 1620/2125] avg loss 0.00150092, throughput 4.00363K wps
[Epoch 15 Batch 1650/2125] avg loss 0.00150656, throughput 4.00265K wps
[Epoch 15 Batch 1680/2125] avg loss 0.00136029, throughput 3.9988K wps
[Epoch 15 Batch 1710/2125] avg loss 0.00157346, throughput 4.00715K wps
[Epoch 15 Batch 1740/2125] avg loss 0.00119396, throughput 3.99736K wps
[Epoch 15 Batch 1770/2125] avg loss 0.00127263, throughput 3.99944K wps
[Epoch 15 Batch 1800/2125] avg loss 0.00130693, throughput 4.00051K wps
[Epoch 15 Batch 1830/2125] avg loss 0.00155926, throughput 3.99936K wps
[Epoch 15 Batch 1860/2125] avg loss 0.0016982, throughput 3.99673K wps
[Epoch 15 Batch 1890/2125] avg loss 0.00130211, throughput 3.99688K wps
[Epoch 15 Batch 1920/2125] avg loss 0.0017882, throughput 3.99931K wps
[Epoch 15 Batch 1950/2125] avg loss 0.00176059, throughput 4.00131K wps
[Epoch 15 Batch 1980/2125] avg loss 0.00159625, throughput 3.99887K wps
[Epoch 15 Batch 2010/2125] avg loss 0.00157769, throughput 3.99832K wps
[Epoch 15 Batch 2040/2125] avg loss 0.00111019, throughput 3.99794K wps
[Epoch 15 Batch 2070/2125] avg loss 0.00153764, throughput 3.99526K wps
[Epoch 15 Batch 2100/2125] avg loss 0.00184484, throughput 3.99947K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 15] train avg loss 0.00137414, test acc 0.9254, test avg loss 0.341153, throughput 3.99899K wps
[Epoch 16 Batch 30/2125] avg loss 0.00112453, throughput 4.09147K wps
[Epoch 16 Batch 60/2125] avg loss 0.00118554, throughput 3.99822K wps
[Epoch 16 Batch 90/2125] avg loss 0.00110662, throughput 3.99803K wps
[Epoch 16 Batch 120/2125] avg loss 0.00114275, throughput 3.99459K wps
[Epoch 16 Batch 150/2125] avg loss 0.00137315, throughput 3.99625K wps
[Epoch 16 Batch 180/2125] avg loss 0.00102226, throughput 3.99699K wps
[Epoch 16 Batch 210/2125] avg loss 0.00112245, throughput 3.996K wps
[Epoch 16 Batch 240/2125] avg loss 0.00114986, throughput 3.99718K wps
[Epoch 16 Batch 270/2125] avg loss 0.00144025, throughput 3.99873K wps
[Epoch 16 Batch 300/2125] avg loss 0.0011188, throughput 4.00249K wps
[Epoch 16 Batch 330/2125] avg loss 0.00133332, throughput 4.00011K wps
[Epoch 16 Batch 360/2125] avg loss 0.00140166, throughput 3.99909K wps
[Epoch 16 Batch 390/2125] avg loss 0.00136384, throughput 3.99664K wps
[Epoch 16 Batch 420/2125] avg loss 0.000906407, throughput 3.99805K wps
[Epoch 16 Batch 450/2125] avg loss 0.00110892, throughput 3.99935K wps
[Epoch 16 Batch 480/2125] avg loss 0.000987255, throughput 3.99315K wps
[Epoch 16 Batch 510/2125] avg loss 0.0010276, throughput 3.99912K wps
[Epoch 16 Batch 540/2125] avg loss 0.00120421, throughput 4.00017K wps
[Epoch 16 Batch 570/2125] avg loss 0.0010387, throughput 3.99959K wps
[Epoch 16 Batch 600/2125] avg loss 0.00138215, throughput 4.00251K wps
[Epoch 16 Batch 630/2125] avg loss 0.00102847, throughput 3.9989K wps
[Epoch 16 Batch 660/2125] avg loss 0.000957761, throughput 3.99476K wps
[Epoch 16 Batch 690/2125] avg loss 0.00107513, throughput 4.0016K wps
[Epoch 16 Batch 720/2125] avg loss 0.00133397, throughput 4.00095K wps
[Epoch 16 Batch 750/2125] avg loss 0.000987633, throughput 3.99522K wps
[Epoch 16 Batch 780/2125] avg loss 0.00129128, throughput 3.99705K wps
[Epoch 16 Batch 810/2125] avg loss 0.00147505, throughput 3.99939K wps
[Epoch 16 Batch 840/2125] avg loss 0.00135372, throughput 4.00081K wps
[Epoch 16 Batch 870/2125] avg loss 0.00109098, throughput 4.00583K wps
[Epoch 16 Batch 900/2125] avg loss 0.00119466, throughput 4.00589K wps
[Epoch 16 Batch 930/2125] avg loss 0.0013628, throughput 4.00065K wps
[Epoch 16 Batch 960/2125] avg loss 0.00102976, throughput 4.00461K wps
[Epoch 16 Batch 990/2125] avg loss 0.0015938, throughput 4.00377K wps
[Epoch 16 Batch 1020/2125] avg loss 0.00139306, throughput 4.00394K wps
[Epoch 16 Batch 1050/2125] avg loss 0.00169195, throughput 4.00004K wps
[Epoch 16 Batch 1080/2125] avg loss 0.00139558, throughput 3.99555K wps
[Epoch 16 Batch 1110/2125] avg loss 0.00129487, throughput 4.00151K wps
[Epoch 16 Batch 1140/2125] avg loss 0.0011956, throughput 3.99781K wps
[Epoch 16 Batch 1170/2125] avg loss 0.00138962, throughput 3.995K wps
[Epoch 16 Batch 1200/2125] avg loss 0.00121216, throughput 3.99762K wps
[Epoch 16 Batch 1230/2125] avg loss 0.00127225, throughput 3.99701K wps
[Epoch 16 Batch 1260/2125] avg loss 0.000962006, throughput 3.99813K wps
[Epoch 16 Batch 1290/2125] avg loss 0.0011438, throughput 3.99585K wps
[Epoch 16 Batch 1320/2125] avg loss 0.00120214, throughput 4.00237K wps
[Epoch 16 Batch 1350/2125] avg loss 0.00177332, throughput 3.99994K wps
[Epoch 16 Batch 1380/2125] avg loss 0.00141921, throughput 4.00108K wps
[Epoch 16 Batch 1410/2125] avg loss 0.00140636, throughput 3.99913K wps
[Epoch 16 Batch 1440/2125] avg loss 0.00139791, throughput 3.99978K wps
[Epoch 16 Batch 1470/2125] avg loss 0.00126551, throughput 3.99988K wps
[Epoch 16 Batch 1500/2125] avg loss 0.00138479, throughput 4.00338K wps
[Epoch 16 Batch 1530/2125] avg loss 0.00115164, throughput 3.99976K wps
[Epoch 16 Batch 1560/2125] avg loss 0.00122893, throughput 3.98616K wps
[Epoch 16 Batch 1590/2125] avg loss 0.001473, throughput 4.00289K wps
[Epoch 16 Batch 1620/2125] avg loss 0.00136932, throughput 4.01119K wps
[Epoch 16 Batch 1650/2125] avg loss 0.00158261, throughput 4.01053K wps
[Epoch 16 Batch 1680/2125] avg loss 0.00157877, throughput 4.01532K wps
[Epoch 16 Batch 1710/2125] avg loss 0.00135942, throughput 4.00838K wps
[Epoch 16 Batch 1740/2125] avg loss 0.0013832, throughput 4.00483K wps
[Epoch 16 Batch 1770/2125] avg loss 0.00158035, throughput 3.99804K wps
[Epoch 16 Batch 1800/2125] avg loss 0.00131155, throughput 3.9967K wps
[Epoch 16 Batch 1830/2125] avg loss 0.00121591, throughput 3.99788K wps
[Epoch 16 Batch 1860/2125] avg loss 0.0015982, throughput 4.00056K wps
[Epoch 16 Batch 1890/2125] avg loss 0.00150893, throughput 3.99619K wps
[Epoch 16 Batch 1920/2125] avg loss 0.00139205, throughput 3.99794K wps
[Epoch 16 Batch 1950/2125] avg loss 0.00161261, throughput 3.99914K wps
[Epoch 16 Batch 1980/2125] avg loss 0.00150901, throughput 3.99929K wps
[Epoch 16 Batch 2010/2125] avg loss 0.00138456, throughput 4.00326K wps
[Epoch 16 Batch 2040/2125] avg loss 0.00120564, throughput 3.9958K wps
[Epoch 16 Batch 2070/2125] avg loss 0.00138039, throughput 3.99894K wps
[Epoch 16 Batch 2100/2125] avg loss 0.00142338, throughput 3.9942K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 16] train avg loss 0.00129, test acc 0.9276, test avg loss 0.34692, throughput 4.00102K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.43 s
[Epoch 17 Batch 30/2125] avg loss 0.00118586, throughput 4.09147K wps
[Epoch 17 Batch 60/2125] avg loss 0.00101845, throughput 3.99625K wps
[Epoch 17 Batch 90/2125] avg loss 0.00111226, throughput 4.00113K wps
[Epoch 17 Batch 120/2125] avg loss 0.00110921, throughput 4.0005K wps
[Epoch 17 Batch 150/2125] avg loss 0.000919057, throughput 4.00002K wps
[Epoch 17 Batch 180/2125] avg loss 0.00127077, throughput 4.00259K wps
[Epoch 17 Batch 210/2125] avg loss 0.000983426, throughput 3.99784K wps
[Epoch 17 Batch 240/2125] avg loss 0.00115789, throughput 3.99663K wps
[Epoch 17 Batch 270/2125] avg loss 0.00126788, throughput 3.99881K wps
[Epoch 17 Batch 300/2125] avg loss 0.00110439, throughput 4.00172K wps
[Epoch 17 Batch 330/2125] avg loss 0.00112912, throughput 3.99765K wps
[Epoch 17 Batch 360/2125] avg loss 0.000948823, throughput 3.99906K wps
[Epoch 17 Batch 390/2125] avg loss 0.000946472, throughput 4.00206K wps
[Epoch 17 Batch 420/2125] avg loss 0.00122232, throughput 3.99346K wps
[Epoch 17 Batch 450/2125] avg loss 0.000866618, throughput 3.99795K wps
[Epoch 17 Batch 480/2125] avg loss 0.00110603, throughput 3.99975K wps
[Epoch 17 Batch 510/2125] avg loss 0.000997671, throughput 3.99747K wps
[Epoch 17 Batch 540/2125] avg loss 0.00122372, throughput 3.99893K wps
[Epoch 17 Batch 570/2125] avg loss 0.00113917, throughput 3.998K wps
[Epoch 17 Batch 600/2125] avg loss 0.00112712, throughput 3.9989K wps
[Epoch 17 Batch 630/2125] avg loss 0.00132763, throughput 3.9975K wps
[Epoch 17 Batch 660/2125] avg loss 0.00137705, throughput 4.00204K wps
[Epoch 17 Batch 690/2125] avg loss 0.00108251, throughput 4.00166K wps
[Epoch 17 Batch 720/2125] avg loss 0.00110706, throughput 4.0003K wps
[Epoch 17 Batch 750/2125] avg loss 0.0012244, throughput 4.00143K wps
[Epoch 17 Batch 780/2125] avg loss 0.00101607, throughput 3.99773K wps
[Epoch 17 Batch 810/2125] avg loss 0.000942903, throughput 4.00046K wps
[Epoch 17 Batch 840/2125] avg loss 0.00117001, throughput 4.00052K wps
[Epoch 17 Batch 870/2125] avg loss 0.00106685, throughput 4.00059K wps
[Epoch 17 Batch 900/2125] avg loss 0.00136993, throughput 3.9971K wps
[Epoch 17 Batch 930/2125] avg loss 0.00121528, throughput 4.00296K wps
[Epoch 17 Batch 960/2125] avg loss 0.00110242, throughput 3.99996K wps
[Epoch 17 Batch 990/2125] avg loss 0.00132116, throughput 3.99993K wps
[Epoch 17 Batch 1020/2125] avg loss 0.00136603, throughput 3.99979K wps
[Epoch 17 Batch 1050/2125] avg loss 0.00142085, throughput 4.00005K wps
[Epoch 17 Batch 1080/2125] avg loss 0.00128513, throughput 4.00172K wps
[Epoch 17 Batch 1110/2125] avg loss 0.0012591, throughput 3.99656K wps
[Epoch 17 Batch 1140/2125] avg loss 0.00131592, throughput 3.99948K wps
[Epoch 17 Batch 1170/2125] avg loss 0.00111377, throughput 3.99787K wps
[Epoch 17 Batch 1200/2125] avg loss 0.00120481, throughput 4.00156K wps
[Epoch 17 Batch 1230/2125] avg loss 0.00143985, throughput 3.99611K wps
[Epoch 17 Batch 1260/2125] avg loss 0.00110574, throughput 4.0007K wps
[Epoch 17 Batch 1290/2125] avg loss 0.000972418, throughput 3.99655K wps
[Epoch 17 Batch 1320/2125] avg loss 0.00123707, throughput 3.9977K wps
[Epoch 17 Batch 1350/2125] avg loss 0.00134861, throughput 3.99564K wps
[Epoch 17 Batch 1380/2125] avg loss 0.00130263, throughput 3.99499K wps
[Epoch 17 Batch 1410/2125] avg loss 0.00119141, throughput 3.99743K wps
[Epoch 17 Batch 1440/2125] avg loss 0.0011392, throughput 3.99961K wps
[Epoch 17 Batch 1470/2125] avg loss 0.00113437, throughput 4.00296K wps
[Epoch 17 Batch 1500/2125] avg loss 0.00171658, throughput 3.99586K wps
[Epoch 17 Batch 1530/2125] avg loss 0.00164132, throughput 4.00026K wps
[Epoch 17 Batch 1560/2125] avg loss 0.00123312, throughput 3.99741K wps
[Epoch 17 Batch 1590/2125] avg loss 0.00147875, throughput 3.99872K wps
[Epoch 17 Batch 1620/2125] avg loss 0.00144209, throughput 4.00019K wps
[Epoch 17 Batch 1650/2125] avg loss 0.00135991, throughput 3.9981K wps
[Epoch 17 Batch 1680/2125] avg loss 0.00160863, throughput 3.99862K wps
[Epoch 17 Batch 1710/2125] avg loss 0.00120756, throughput 4.00044K wps
[Epoch 17 Batch 1740/2125] avg loss 0.00131143, throughput 3.99841K wps
[Epoch 17 Batch 1770/2125] avg loss 0.00115452, throughput 4.00128K wps
[Epoch 17 Batch 1800/2125] avg loss 0.00156114, throughput 4.00172K wps
[Epoch 17 Batch 1830/2125] avg loss 0.00134182, throughput 4.00072K wps
[Epoch 17 Batch 1860/2125] avg loss 0.00149703, throughput 4.00268K wps
[Epoch 17 Batch 1890/2125] avg loss 0.00163139, throughput 3.99983K wps
[Epoch 17 Batch 1920/2125] avg loss 0.00133958, throughput 4.00044K wps
[Epoch 17 Batch 1950/2125] avg loss 0.00151947, throughput 3.99529K wps
[Epoch 17 Batch 1980/2125] avg loss 0.00134807, throughput 3.99849K wps
[Epoch 17 Batch 2010/2125] avg loss 0.00101955, throughput 3.99815K wps
[Epoch 17 Batch 2040/2125] avg loss 0.00123677, throughput 3.99744K wps
[Epoch 17 Batch 2070/2125] avg loss 0.00140476, throughput 3.9945K wps
[Epoch 17 Batch 2100/2125] avg loss 0.00139827, throughput 3.99884K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 17] train avg loss 0.00123432, test acc 0.9266, test avg loss 0.359808, throughput 4.00041K wps
[Epoch 18 Batch 30/2125] avg loss 0.00126372, throughput 4.08672K wps
[Epoch 18 Batch 60/2125] avg loss 0.00113849, throughput 3.99807K wps
[Epoch 18 Batch 90/2125] avg loss 0.000861538, throughput 3.99906K wps
[Epoch 18 Batch 120/2125] avg loss 0.00126198, throughput 4.00093K wps
[Epoch 18 Batch 150/2125] avg loss 0.00099162, throughput 4.00604K wps
[Epoch 18 Batch 180/2125] avg loss 0.000744702, throughput 4.00073K wps
[Epoch 18 Batch 210/2125] avg loss 0.000780721, throughput 3.99667K wps
[Epoch 18 Batch 240/2125] avg loss 0.000906803, throughput 4.00113K wps
[Epoch 18 Batch 270/2125] avg loss 0.00118999, throughput 4.0022K wps
[Epoch 18 Batch 300/2125] avg loss 0.000909894, throughput 3.9943K wps
[Epoch 18 Batch 330/2125] avg loss 0.00117073, throughput 4.00191K wps
[Epoch 18 Batch 360/2125] avg loss 0.00117666, throughput 4.00283K wps
[Epoch 18 Batch 390/2125] avg loss 0.00103272, throughput 3.99719K wps
[Epoch 18 Batch 420/2125] avg loss 0.0011592, throughput 3.99921K wps
[Epoch 18 Batch 450/2125] avg loss 0.00122382, throughput 3.9936K wps
[Epoch 18 Batch 480/2125] avg loss 0.00105607, throughput 3.99959K wps
[Epoch 18 Batch 510/2125] avg loss 0.00123494, throughput 3.99851K wps
[Epoch 18 Batch 540/2125] avg loss 0.000924003, throughput 4.0039K wps
[Epoch 18 Batch 570/2125] avg loss 0.000962914, throughput 3.99888K wps
[Epoch 18 Batch 600/2125] avg loss 0.00108266, throughput 3.99915K wps
[Epoch 18 Batch 630/2125] avg loss 0.00104977, throughput 3.99456K wps
[Epoch 18 Batch 660/2125] avg loss 0.00105278, throughput 3.99791K wps
[Epoch 18 Batch 690/2125] avg loss 0.00121342, throughput 4.00172K wps
[Epoch 18 Batch 720/2125] avg loss 0.00133052, throughput 3.9974K wps
[Epoch 18 Batch 750/2125] avg loss 0.00105954, throughput 3.99306K wps
[Epoch 18 Batch 780/2125] avg loss 0.00131088, throughput 3.99062K wps
[Epoch 18 Batch 810/2125] avg loss 0.00127225, throughput 3.99957K wps
[Epoch 18 Batch 840/2125] avg loss 0.00118661, throughput 3.99782K wps
[Epoch 18 Batch 870/2125] avg loss 0.00131487, throughput 3.99513K wps
[Epoch 18 Batch 900/2125] avg loss 0.00148881, throughput 3.99846K wps
[Epoch 18 Batch 930/2125] avg loss 0.00116998, throughput 3.99879K wps
[Epoch 18 Batch 960/2125] avg loss 0.00132672, throughput 3.99892K wps
[Epoch 18 Batch 990/2125] avg loss 0.00131441, throughput 3.99571K wps
[Epoch 18 Batch 1020/2125] avg loss 0.00111366, throughput 3.99827K wps
[Epoch 18 Batch 1050/2125] avg loss 0.00113077, throughput 3.99861K wps
[Epoch 18 Batch 1080/2125] avg loss 0.000981316, throughput 3.99807K wps
[Epoch 18 Batch 1110/2125] avg loss 0.00117109, throughput 3.99838K wps
[Epoch 18 Batch 1140/2125] avg loss 0.00117926, throughput 3.99415K wps
[Epoch 18 Batch 1170/2125] avg loss 0.00101528, throughput 3.99833K wps
[Epoch 18 Batch 1200/2125] avg loss 0.00111289, throughput 3.99946K wps
[Epoch 18 Batch 1230/2125] avg loss 0.00126613, throughput 3.99772K wps
[Epoch 18 Batch 1260/2125] avg loss 0.00113389, throughput 3.9981K wps
[Epoch 18 Batch 1290/2125] avg loss 0.00126156, throughput 3.9985K wps
[Epoch 18 Batch 1320/2125] avg loss 0.00123946, throughput 3.99828K wps
[Epoch 18 Batch 1350/2125] avg loss 0.00133541, throughput 3.99744K wps
[Epoch 18 Batch 1380/2125] avg loss 0.00140253, throughput 4.0006K wps
[Epoch 18 Batch 1410/2125] avg loss 0.00121444, throughput 3.99852K wps
[Epoch 18 Batch 1440/2125] avg loss 0.00135892, throughput 4.00279K wps
[Epoch 18 Batch 1470/2125] avg loss 0.0014543, throughput 3.99853K wps
[Epoch 18 Batch 1500/2125] avg loss 0.00147781, throughput 3.99851K wps
[Epoch 18 Batch 1530/2125] avg loss 0.0011702, throughput 3.9991K wps
[Epoch 18 Batch 1560/2125] avg loss 0.0012415, throughput 3.99449K wps
[Epoch 18 Batch 1590/2125] avg loss 0.00152404, throughput 3.99889K wps
[Epoch 18 Batch 1620/2125] avg loss 0.00115741, throughput 3.99715K wps
[Epoch 18 Batch 1650/2125] avg loss 0.00128976, throughput 4.00291K wps
[Epoch 18 Batch 1680/2125] avg loss 0.00117078, throughput 3.99745K wps
[Epoch 18 Batch 1710/2125] avg loss 0.00124009, throughput 3.99917K wps
[Epoch 18 Batch 1740/2125] avg loss 0.00146854, throughput 3.99772K wps
[Epoch 18 Batch 1770/2125] avg loss 0.00132173, throughput 3.99752K wps
[Epoch 18 Batch 1800/2125] avg loss 0.00112877, throughput 3.99999K wps
[Epoch 18 Batch 1830/2125] avg loss 0.00133183, throughput 3.99986K wps
[Epoch 18 Batch 1860/2125] avg loss 0.00117539, throughput 3.99834K wps
[Epoch 18 Batch 1890/2125] avg loss 0.00114614, throughput 3.99571K wps
[Epoch 18 Batch 1920/2125] avg loss 0.00162238, throughput 4.0016K wps
[Epoch 18 Batch 1950/2125] avg loss 0.00128382, throughput 3.99897K wps
[Epoch 18 Batch 1980/2125] avg loss 0.00117929, throughput 3.99784K wps
[Epoch 18 Batch 2010/2125] avg loss 0.00130802, throughput 3.9996K wps
[Epoch 18 Batch 2040/2125] avg loss 0.00138363, throughput 3.99868K wps
[Epoch 18 Batch 2070/2125] avg loss 0.00140398, throughput 3.99745K wps
[Epoch 18 Batch 2100/2125] avg loss 0.00141867, throughput 3.99669K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 18] train avg loss 0.00120213, test acc 0.9267, test avg loss 0.365369, throughput 3.99977K wps
[Epoch 19 Batch 30/2125] avg loss 0.000774495, throughput 4.09516K wps
[Epoch 19 Batch 60/2125] avg loss 0.000986809, throughput 4.00212K wps
[Epoch 19 Batch 90/2125] avg loss 0.000954099, throughput 4.00412K wps
[Epoch 19 Batch 120/2125] avg loss 0.000885045, throughput 4.00014K wps
[Epoch 19 Batch 150/2125] avg loss 0.000814815, throughput 4.0003K wps
[Epoch 19 Batch 180/2125] avg loss 0.00103525, throughput 3.99855K wps
[Epoch 19 Batch 210/2125] avg loss 0.00122207, throughput 3.99801K wps
[Epoch 19 Batch 240/2125] avg loss 0.00132357, throughput 3.99846K wps
[Epoch 19 Batch 270/2125] avg loss 0.00131384, throughput 3.99653K wps
[Epoch 19 Batch 300/2125] avg loss 0.00107465, throughput 3.99453K wps
[Epoch 19 Batch 330/2125] avg loss 0.000850544, throughput 4.00123K wps
[Epoch 19 Batch 360/2125] avg loss 0.000816712, throughput 3.99727K wps
[Epoch 19 Batch 390/2125] avg loss 0.000934226, throughput 3.99847K wps
[Epoch 19 Batch 420/2125] avg loss 0.00113803, throughput 3.99269K wps
[Epoch 19 Batch 450/2125] avg loss 0.000993471, throughput 3.99197K wps
[Epoch 19 Batch 480/2125] avg loss 0.00121695, throughput 3.99843K wps
[Epoch 19 Batch 510/2125] avg loss 0.00118495, throughput 3.99705K wps
[Epoch 19 Batch 540/2125] avg loss 0.000920851, throughput 3.99355K wps
[Epoch 19 Batch 570/2125] avg loss 0.000964465, throughput 3.99691K wps
[Epoch 19 Batch 600/2125] avg loss 0.000995782, throughput 3.99752K wps
[Epoch 19 Batch 630/2125] avg loss 0.00112885, throughput 3.99434K wps
[Epoch 19 Batch 660/2125] avg loss 0.00121099, throughput 3.99787K wps
[Epoch 19 Batch 690/2125] avg loss 0.00123994, throughput 4.00052K wps
[Epoch 19 Batch 720/2125] avg loss 0.00128731, throughput 3.99742K wps
[Epoch 19 Batch 750/2125] avg loss 0.00107223, throughput 3.99691K wps
[Epoch 19 Batch 780/2125] avg loss 0.000914004, throughput 3.99632K wps
[Epoch 19 Batch 810/2125] avg loss 0.000907214, throughput 3.99565K wps
[Epoch 19 Batch 840/2125] avg loss 0.00111882, throughput 3.99636K wps
[Epoch 19 Batch 870/2125] avg loss 0.00113621, throughput 3.99463K wps
[Epoch 19 Batch 900/2125] avg loss 0.00135686, throughput 3.9968K wps
[Epoch 19 Batch 930/2125] avg loss 0.000959874, throughput 3.99976K wps
[Epoch 19 Batch 960/2125] avg loss 0.00116758, throughput 4.00046K wps
[Epoch 19 Batch 990/2125] avg loss 0.00107923, throughput 3.99665K wps
[Epoch 19 Batch 1020/2125] avg loss 0.00127179, throughput 4.00039K wps
[Epoch 19 Batch 1050/2125] avg loss 0.000753401, throughput 3.99397K wps
[Epoch 19 Batch 1080/2125] avg loss 0.00117478, throughput 3.99252K wps
[Epoch 19 Batch 1110/2125] avg loss 0.00119651, throughput 3.99352K wps
[Epoch 19 Batch 1140/2125] avg loss 0.00108818, throughput 3.99459K wps
[Epoch 19 Batch 1170/2125] avg loss 0.00112318, throughput 3.99867K wps
[Epoch 19 Batch 1200/2125] avg loss 0.00112228, throughput 3.99307K wps
[Epoch 19 Batch 1230/2125] avg loss 0.00115108, throughput 3.9938K wps
[Epoch 19 Batch 1260/2125] avg loss 0.000930822, throughput 3.99991K wps
[Epoch 19 Batch 1290/2125] avg loss 0.00111015, throughput 3.99816K wps
[Epoch 19 Batch 1320/2125] avg loss 0.0011325, throughput 3.99648K wps
[Epoch 19 Batch 1350/2125] avg loss 0.000992642, throughput 3.9997K wps
[Epoch 19 Batch 1380/2125] avg loss 0.00109555, throughput 4.00276K wps
[Epoch 19 Batch 1410/2125] avg loss 0.00103748, throughput 3.99861K wps
[Epoch 19 Batch 1440/2125] avg loss 0.00129027, throughput 3.99492K wps
[Epoch 19 Batch 1470/2125] avg loss 0.00138727, throughput 3.99442K wps
[Epoch 19 Batch 1500/2125] avg loss 0.00105457, throughput 4.00028K wps
[Epoch 19 Batch 1530/2125] avg loss 0.00132759, throughput 3.99898K wps
[Epoch 19 Batch 1560/2125] avg loss 0.00124219, throughput 4.00015K wps
[Epoch 19 Batch 1590/2125] avg loss 0.0012568, throughput 3.99766K wps
[Epoch 19 Batch 1620/2125] avg loss 0.000991589, throughput 3.99978K wps
[Epoch 19 Batch 1650/2125] avg loss 0.00111485, throughput 4.00041K wps
[Epoch 19 Batch 1680/2125] avg loss 0.00135717, throughput 3.99684K wps
[Epoch 19 Batch 1710/2125] avg loss 0.00109061, throughput 3.99472K wps
[Epoch 19 Batch 1740/2125] avg loss 0.00105726, throughput 3.99637K wps
[Epoch 19 Batch 1770/2125] avg loss 0.00116961, throughput 3.99937K wps
[Epoch 19 Batch 1800/2125] avg loss 0.00122174, throughput 3.99458K wps
[Epoch 19 Batch 1830/2125] avg loss 0.00137793, throughput 3.99689K wps
[Epoch 19 Batch 1860/2125] avg loss 0.00127249, throughput 4.00117K wps
[Epoch 19 Batch 1890/2125] avg loss 0.00128334, throughput 4.00023K wps
[Epoch 19 Batch 1920/2125] avg loss 0.00104924, throughput 3.99608K wps
[Epoch 19 Batch 1950/2125] avg loss 0.00121113, throughput 3.99616K wps
[Epoch 19 Batch 1980/2125] avg loss 0.00126213, throughput 3.99881K wps
[Epoch 19 Batch 2010/2125] avg loss 0.00110849, throughput 3.99771K wps
[Epoch 19 Batch 2040/2125] avg loss 0.00125351, throughput 4.00053K wps
[Epoch 19 Batch 2070/2125] avg loss 0.00135166, throughput 3.99969K wps
[Epoch 19 Batch 2100/2125] avg loss 0.00124461, throughput 3.99884K wps
Begin Testing...
[Batch 30/237] elapsed 0.46 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.44 s
[Batch 120/237] elapsed 0.44 s
[Batch 150/237] elapsed 0.45 s
[Batch 180/237] elapsed 0.45 s
[Batch 210/237] elapsed 0.43 s
[Epoch 19] train avg loss 0.00111425, test acc 0.9275, test avg loss 0.376316, throughput 3.99895K wps
[Epoch 20 Batch 30/2125] avg loss 0.000866265, throughput 4.09389K wps
[Epoch 20 Batch 60/2125] avg loss 0.00114917, throughput 4.00545K wps
[Epoch 20 Batch 90/2125] avg loss 0.000968071, throughput 3.99862K wps
[Epoch 20 Batch 120/2125] avg loss 0.000991712, throughput 3.99888K wps
[Epoch 20 Batch 150/2125] avg loss 0.00083024, throughput 3.99864K wps
[Epoch 20 Batch 180/2125] avg loss 0.000842984, throughput 3.99866K wps
[Epoch 20 Batch 210/2125] avg loss 0.00100058, throughput 3.99714K wps
[Epoch 20 Batch 240/2125] avg loss 0.00114288, throughput 3.99449K wps
[Epoch 20 Batch 270/2125] avg loss 0.000836616, throughput 3.99926K wps
[Epoch 20 Batch 300/2125] avg loss 0.000880353, throughput 3.99607K wps
[Epoch 20 Batch 330/2125] avg loss 0.000918761, throughput 3.99944K wps
[Epoch 20 Batch 360/2125] avg loss 0.000841088, throughput 3.99557K wps
[Epoch 20 Batch 390/2125] avg loss 0.000954156, throughput 3.99719K wps
[Epoch 20 Batch 420/2125] avg loss 0.00112716, throughput 3.99608K wps
[Epoch 20 Batch 450/2125] avg loss 0.000960227, throughput 3.99768K wps
[Epoch 20 Batch 480/2125] avg loss 0.00103625, throughput 3.99959K wps
[Epoch 20 Batch 510/2125] avg loss 0.00100711, throughput 4.00237K wps
[Epoch 20 Batch 540/2125] avg loss 0.0011939, throughput 3.99706K wps
[Epoch 20 Batch 570/2125] avg loss 0.000737567, throughput 3.99498K wps
[Epoch 20 Batch 600/2125] avg loss 0.000704267, throughput 3.99794K wps
[Epoch 20 Batch 630/2125] avg loss 0.00132672, throughput 3.99618K wps
[Epoch 20 Batch 660/2125] avg loss 0.00103613, throughput 3.99963K wps
[Epoch 20 Batch 690/2125] avg loss 0.00122677, throughput 3.9977K wps
[Epoch 20 Batch 720/2125] avg loss 0.00126678, throughput 3.99872K wps
[Epoch 20 Batch 750/2125] avg loss 0.00120833, throughput 4.00167K wps
[Epoch 20 Batch 780/2125] avg loss 0.000863705, throughput 3.99772K wps
[Epoch 20 Batch 810/2125] avg loss 0.000888794, throughput 3.9962K wps
[Epoch 20 Batch 840/2125] avg loss 0.000891203, throughput 3.9961K wps
[Epoch 20 Batch 870/2125] avg loss 0.00123527, throughput 3.99883K wps
[Epoch 20 Batch 900/2125] avg loss 0.00119403, throughput 3.99747K wps
[Epoch 20 Batch 930/2125] avg loss 0.00112517, throughput 3.99852K wps
[Epoch 20 Batch 960/2125] avg loss 0.00102051, throughput 3.99797K wps
[Epoch 20 Batch 990/2125] avg loss 0.00094521, throughput 3.99623K wps
[Epoch 20 Batch 1020/2125] avg loss 0.000986757, throughput 3.99734K wps
[Epoch 20 Batch 1050/2125] avg loss 0.00094475, throughput 3.99597K wps
[Epoch 20 Batch 1080/2125] avg loss 0.00104829, throughput 3.99509K wps
[Epoch 20 Batch 1110/2125] avg loss 0.00105356, throughput 3.99493K wps
[Epoch 20 Batch 1140/2125] avg loss 0.00127481, throughput 3.99774K wps
[Epoch 20 Batch 1170/2125] avg loss 0.00116232, throughput 3.9955K wps
[Epoch 20 Batch 1200/2125] avg loss 0.00100249, throughput 3.99414K wps
[Epoch 20 Batch 1230/2125] avg loss 0.00118509, throughput 3.99443K wps
[Epoch 20 Batch 1260/2125] avg loss 0.0015302, throughput 3.99633K wps
[Epoch 20 Batch 1290/2125] avg loss 0.00119128, throughput 3.99461K wps
[Epoch 20 Batch 1320/2125] avg loss 0.00136853, throughput 3.99987K wps
[Epoch 20 Batch 1350/2125] avg loss 0.00110863, throughput 3.99596K wps
[Epoch 20 Batch 1380/2125] avg loss 0.00107283, throughput 3.99779K wps
[Epoch 20 Batch 1410/2125] avg loss 0.00124635, throughput 3.99772K wps
[Epoch 20 Batch 1440/2125] avg loss 0.00109658, throughput 3.99767K wps
[Epoch 20 Batch 1470/2125] avg loss 0.00091117, throughput 3.99788K wps
[Epoch 20 Batch 1500/2125] avg loss 0.00136131, throughput 3.99489K wps
[Epoch 20 Batch 1530/2125] avg loss 0.000967119, throughput 3.99832K wps
[Epoch 20 Batch 1560/2125] avg loss 0.00132909, throughput 3.99553K wps
[Epoch 20 Batch 1590/2125] avg loss 0.00113425, throughput 3.99621K wps
[Epoch 20 Batch 1620/2125] avg loss 0.00109843, throughput 3.99657K wps
[Epoch 20 Batch 1650/2125] avg loss 0.00110867, throughput 3.99669K wps
[Epoch 20 Batch 1680/2125] avg loss 0.00115529, throughput 4.00132K wps
[Epoch 20 Batch 1710/2125] avg loss 0.0011333, throughput 3.99451K wps
[Epoch 20 Batch 1740/2125] avg loss 0.00110843, throughput 3.9996K wps
[Epoch 20 Batch 1770/2125] avg loss 0.00118142, throughput 3.99886K wps
[Epoch 20 Batch 1800/2125] avg loss 0.0011083, throughput 3.9989K wps
[Epoch 20 Batch 1830/2125] avg loss 0.0011868, throughput 3.99684K wps
[Epoch 20 Batch 1860/2125] avg loss 0.00115931, throughput 3.99265K wps
[Epoch 20 Batch 1890/2125] avg loss 0.000838669, throughput 3.99761K wps
[Epoch 20 Batch 1920/2125] avg loss 0.00153225, throughput 4.0012K wps
[Epoch 20 Batch 1950/2125] avg loss 0.00105023, throughput 3.99587K wps
[Epoch 20 Batch 1980/2125] avg loss 0.00123389, throughput 3.99666K wps
[Epoch 20 Batch 2010/2125] avg loss 0.00102462, throughput 3.99825K wps
[Epoch 20 Batch 2040/2125] avg loss 0.00103159, throughput 4.00029K wps
[Epoch 20 Batch 2070/2125] avg loss 0.00108951, throughput 3.99857K wps
[Epoch 20 Batch 2100/2125] avg loss 0.00106071, throughput 4.0011K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 20] train avg loss 0.00107644, test acc 0.9271, test avg loss 0.388841, throughput 3.99894K wps
[Epoch 21 Batch 30/2125] avg loss 0.00091412, throughput 4.09397K wps
[Epoch 21 Batch 60/2125] avg loss 0.000798113, throughput 3.99962K wps
[Epoch 21 Batch 90/2125] avg loss 0.000960392, throughput 3.99875K wps
[Epoch 21 Batch 120/2125] avg loss 0.000866542, throughput 3.999K wps
[Epoch 21 Batch 150/2125] avg loss 0.00107995, throughput 3.99592K wps
[Epoch 21 Batch 180/2125] avg loss 0.00114421, throughput 3.99984K wps
[Epoch 21 Batch 210/2125] avg loss 0.000797729, throughput 4.00092K wps
[Epoch 21 Batch 240/2125] avg loss 0.000851337, throughput 4.00121K wps
[Epoch 21 Batch 270/2125] avg loss 0.000910356, throughput 3.9953K wps
[Epoch 21 Batch 300/2125] avg loss 0.000674664, throughput 3.99714K wps
[Epoch 21 Batch 330/2125] avg loss 0.000840443, throughput 3.99816K wps
[Epoch 21 Batch 360/2125] avg loss 0.000946698, throughput 3.99855K wps
[Epoch 21 Batch 390/2125] avg loss 0.00105322, throughput 4.00143K wps
[Epoch 21 Batch 420/2125] avg loss 0.000873807, throughput 3.99944K wps
[Epoch 21 Batch 450/2125] avg loss 0.000767509, throughput 3.99889K wps
[Epoch 21 Batch 480/2125] avg loss 0.00076697, throughput 3.99565K wps
[Epoch 21 Batch 510/2125] avg loss 0.00090533, throughput 3.99814K wps
[Epoch 21 Batch 540/2125] avg loss 0.00115096, throughput 3.99822K wps
[Epoch 21 Batch 570/2125] avg loss 0.000943762, throughput 4.00268K wps
[Epoch 21 Batch 600/2125] avg loss 0.000866997, throughput 4.00525K wps
[Epoch 21 Batch 630/2125] avg loss 0.000766748, throughput 4.00427K wps
[Epoch 21 Batch 660/2125] avg loss 0.00123675, throughput 4.00451K wps
[Epoch 21 Batch 690/2125] avg loss 0.000872665, throughput 4.00125K wps
[Epoch 21 Batch 720/2125] avg loss 0.000997606, throughput 3.99794K wps
[Epoch 21 Batch 750/2125] avg loss 0.000945652, throughput 4.00047K wps
[Epoch 21 Batch 780/2125] avg loss 0.00117174, throughput 3.99884K wps
[Epoch 21 Batch 810/2125] avg loss 0.000877935, throughput 3.99671K wps
[Epoch 21 Batch 840/2125] avg loss 0.000918201, throughput 3.99676K wps
[Epoch 21 Batch 870/2125] avg loss 0.000858303, throughput 3.99879K wps
[Epoch 21 Batch 900/2125] avg loss 0.00101847, throughput 3.99628K wps
[Epoch 21 Batch 930/2125] avg loss 0.0008742, throughput 3.99973K wps
[Epoch 21 Batch 960/2125] avg loss 0.0010815, throughput 3.99731K wps
[Epoch 21 Batch 990/2125] avg loss 0.000819034, throughput 3.99901K wps
[Epoch 21 Batch 1020/2125] avg loss 0.00108667, throughput 3.99764K wps
[Epoch 21 Batch 1050/2125] avg loss 0.000991049, throughput 3.99091K wps
[Epoch 21 Batch 1080/2125] avg loss 0.00121979, throughput 3.99766K wps
[Epoch 21 Batch 1110/2125] avg loss 0.000942958, throughput 4.00306K wps
[Epoch 21 Batch 1140/2125] avg loss 0.00084987, throughput 3.99761K wps
[Epoch 21 Batch 1170/2125] avg loss 0.000916371, throughput 4.00893K wps
[Epoch 21 Batch 1200/2125] avg loss 0.000996435, throughput 3.99893K wps
[Epoch 21 Batch 1230/2125] avg loss 0.00117777, throughput 3.99969K wps
[Epoch 21 Batch 1260/2125] avg loss 0.000700407, throughput 4.00327K wps
[Epoch 21 Batch 1290/2125] avg loss 0.00121849, throughput 3.99678K wps
[Epoch 21 Batch 1320/2125] avg loss 0.00134703, throughput 4.00008K wps
[Epoch 21 Batch 1350/2125] avg loss 0.00143443, throughput 4.00005K wps
[Epoch 21 Batch 1380/2125] avg loss 0.0010845, throughput 3.98956K wps
[Epoch 21 Batch 1410/2125] avg loss 0.00126432, throughput 3.99825K wps
[Epoch 21 Batch 1440/2125] avg loss 0.000994224, throughput 3.99925K wps
[Epoch 21 Batch 1470/2125] avg loss 0.00102958, throughput 4.00338K wps
[Epoch 21 Batch 1500/2125] avg loss 0.000992077, throughput 3.99863K wps
[Epoch 21 Batch 1530/2125] avg loss 0.00115002, throughput 3.99657K wps
[Epoch 21 Batch 1560/2125] avg loss 0.00107734, throughput 3.99614K wps
[Epoch 21 Batch 1590/2125] avg loss 0.00124941, throughput 3.99648K wps
[Epoch 21 Batch 1620/2125] avg loss 0.00130473, throughput 3.99581K wps
[Epoch 21 Batch 1650/2125] avg loss 0.000978652, throughput 3.99952K wps
[Epoch 21 Batch 1680/2125] avg loss 0.00120015, throughput 4.00007K wps
[Epoch 21 Batch 1710/2125] avg loss 0.00141494, throughput 4.00451K wps
[Epoch 21 Batch 1740/2125] avg loss 0.00111657, throughput 4.00758K wps
[Epoch 21 Batch 1770/2125] avg loss 0.00107663, throughput 4.00373K wps
[Epoch 21 Batch 1800/2125] avg loss 0.000910058, throughput 4.00408K wps
[Epoch 21 Batch 1830/2125] avg loss 0.00136789, throughput 4.00129K wps
[Epoch 21 Batch 1860/2125] avg loss 0.000860811, throughput 3.99904K wps
[Epoch 21 Batch 1890/2125] avg loss 0.00109339, throughput 4.00154K wps
[Epoch 21 Batch 1920/2125] avg loss 0.000749127, throughput 4.00148K wps
[Epoch 21 Batch 1950/2125] avg loss 0.0010634, throughput 4.00256K wps
[Epoch 21 Batch 1980/2125] avg loss 0.00134822, throughput 4.00157K wps
[Epoch 21 Batch 2010/2125] avg loss 0.00119174, throughput 3.99848K wps
[Epoch 21 Batch 2040/2125] avg loss 0.00153513, throughput 4.0002K wps
[Epoch 21 Batch 2070/2125] avg loss 0.00130099, throughput 3.99788K wps
[Epoch 21 Batch 2100/2125] avg loss 0.00102834, throughput 3.99858K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 21] train avg loss 0.00102829, test acc 0.9280, test avg loss 0.398645, throughput 4.00082K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.43 s
[Epoch 22 Batch 30/2125] avg loss 0.000573973, throughput 4.08938K wps
[Epoch 22 Batch 60/2125] avg loss 0.000628941, throughput 3.99413K wps
[Epoch 22 Batch 90/2125] avg loss 0.000783209, throughput 3.99805K wps
[Epoch 22 Batch 120/2125] avg loss 0.000747279, throughput 3.99796K wps
[Epoch 22 Batch 150/2125] avg loss 0.0008323, throughput 4.00034K wps
[Epoch 22 Batch 180/2125] avg loss 0.000862263, throughput 3.9967K wps
[Epoch 22 Batch 210/2125] avg loss 0.000768722, throughput 3.99536K wps
[Epoch 22 Batch 240/2125] avg loss 0.00114809, throughput 3.99909K wps
[Epoch 22 Batch 270/2125] avg loss 0.000805191, throughput 3.99987K wps
[Epoch 22 Batch 300/2125] avg loss 0.00101757, throughput 3.99662K wps
[Epoch 22 Batch 330/2125] avg loss 0.000905691, throughput 3.99386K wps
[Epoch 22 Batch 360/2125] avg loss 0.0010612, throughput 4.00012K wps
[Epoch 22 Batch 390/2125] avg loss 0.00104276, throughput 3.99765K wps
[Epoch 22 Batch 420/2125] avg loss 0.000897911, throughput 3.99942K wps
[Epoch 22 Batch 450/2125] avg loss 0.000932381, throughput 4.00561K wps
[Epoch 22 Batch 480/2125] avg loss 0.000909892, throughput 3.99637K wps
[Epoch 22 Batch 510/2125] avg loss 0.000961466, throughput 3.99826K wps
[Epoch 22 Batch 540/2125] avg loss 0.000761049, throughput 3.99866K wps
[Epoch 22 Batch 570/2125] avg loss 0.000997713, throughput 3.99622K wps
[Epoch 22 Batch 600/2125] avg loss 0.000914467, throughput 3.99702K wps
[Epoch 22 Batch 630/2125] avg loss 0.0013926, throughput 3.99908K wps
[Epoch 22 Batch 660/2125] avg loss 0.000886779, throughput 4.00119K wps
[Epoch 22 Batch 690/2125] avg loss 0.00104322, throughput 3.99971K wps
[Epoch 22 Batch 720/2125] avg loss 0.000845584, throughput 3.99784K wps
[Epoch 22 Batch 750/2125] avg loss 0.00107166, throughput 3.99336K wps
[Epoch 22 Batch 780/2125] avg loss 0.00110571, throughput 3.99709K wps
[Epoch 22 Batch 810/2125] avg loss 0.000829087, throughput 3.99981K wps
[Epoch 22 Batch 840/2125] avg loss 0.000806676, throughput 3.99818K wps
[Epoch 22 Batch 870/2125] avg loss 0.00100738, throughput 3.99984K wps
[Epoch 22 Batch 900/2125] avg loss 0.00101226, throughput 3.99423K wps
[Epoch 22 Batch 930/2125] avg loss 0.000991372, throughput 3.99477K wps
[Epoch 22 Batch 960/2125] avg loss 0.00111098, throughput 4.00139K wps
[Epoch 22 Batch 990/2125] avg loss 0.00118448, throughput 4.002K wps
[Epoch 22 Batch 1020/2125] avg loss 0.00095291, throughput 3.99819K wps
[Epoch 22 Batch 1050/2125] avg loss 0.00116657, throughput 4.00171K wps
[Epoch 22 Batch 1080/2125] avg loss 0.000799335, throughput 4.00045K wps
[Epoch 22 Batch 1110/2125] avg loss 0.00090187, throughput 4.00185K wps
[Epoch 22 Batch 1140/2125] avg loss 0.000856196, throughput 3.99795K wps
[Epoch 22 Batch 1170/2125] avg loss 0.00113276, throughput 4.00227K wps
[Epoch 22 Batch 1200/2125] avg loss 0.00106588, throughput 3.46597K wps
[Epoch 22 Batch 1230/2125] avg loss 0.00102615, throughput 4.00202K wps
[Epoch 22 Batch 1260/2125] avg loss 0.000941572, throughput 4.00259K wps
[Epoch 22 Batch 1290/2125] avg loss 0.00101515, throughput 3.99586K wps
[Epoch 22 Batch 1320/2125] avg loss 0.00101626, throughput 3.99803K wps
[Epoch 22 Batch 1350/2125] avg loss 0.000908578, throughput 4.00351K wps
[Epoch 22 Batch 1380/2125] avg loss 0.00125949, throughput 4.00063K wps
[Epoch 22 Batch 1410/2125] avg loss 0.00101889, throughput 3.99638K wps
[Epoch 22 Batch 1440/2125] avg loss 0.00108351, throughput 3.99449K wps
[Epoch 22 Batch 1470/2125] avg loss 0.00100813, throughput 4.00018K wps
[Epoch 22 Batch 1500/2125] avg loss 0.00103933, throughput 3.99707K wps
[Epoch 22 Batch 1530/2125] avg loss 0.00108292, throughput 3.99814K wps
[Epoch 22 Batch 1560/2125] avg loss 0.00123269, throughput 4.00059K wps
[Epoch 22 Batch 1590/2125] avg loss 0.001167, throughput 4.00044K wps
[Epoch 22 Batch 1620/2125] avg loss 0.00129808, throughput 4.00589K wps
[Epoch 22 Batch 1650/2125] avg loss 0.000979522, throughput 3.99962K wps
[Epoch 22 Batch 1680/2125] avg loss 0.00109215, throughput 3.99876K wps
[Epoch 22 Batch 1710/2125] avg loss 0.00154641, throughput 3.99413K wps
[Epoch 22 Batch 1740/2125] avg loss 0.00114566, throughput 3.99816K wps
[Epoch 22 Batch 1770/2125] avg loss 0.00138119, throughput 3.9991K wps
[Epoch 22 Batch 1800/2125] avg loss 0.000767476, throughput 3.99794K wps
[Epoch 22 Batch 1830/2125] avg loss 0.00109501, throughput 4.00079K wps
[Epoch 22 Batch 1860/2125] avg loss 0.00112389, throughput 3.99936K wps
[Epoch 22 Batch 1890/2125] avg loss 0.00103954, throughput 3.99536K wps
[Epoch 22 Batch 1920/2125] avg loss 0.00127528, throughput 3.99816K wps
[Epoch 22 Batch 1950/2125] avg loss 0.00105772, throughput 3.99904K wps
[Epoch 22 Batch 1980/2125] avg loss 0.00125009, throughput 3.99878K wps
[Epoch 22 Batch 2010/2125] avg loss 0.00126563, throughput 4.00151K wps
[Epoch 22 Batch 2040/2125] avg loss 0.000810132, throughput 3.99661K wps
[Epoch 22 Batch 2070/2125] avg loss 0.00126447, throughput 4.00366K wps
[Epoch 22 Batch 2100/2125] avg loss 0.0010862, throughput 3.99898K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 22] train avg loss 0.00101382, test acc 0.9281, test avg loss 0.406431, throughput 3.99136K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.43 s
[Epoch 23 Batch 30/2125] avg loss 0.000965095, throughput 4.08568K wps
[Epoch 23 Batch 60/2125] avg loss 0.000906007, throughput 4.00044K wps
[Epoch 23 Batch 90/2125] avg loss 0.000717685, throughput 3.99768K wps
[Epoch 23 Batch 120/2125] avg loss 0.000802848, throughput 3.99467K wps
[Epoch 23 Batch 150/2125] avg loss 0.000903851, throughput 3.99943K wps
[Epoch 23 Batch 180/2125] avg loss 0.000947468, throughput 3.99522K wps
[Epoch 23 Batch 210/2125] avg loss 0.000813685, throughput 3.99961K wps
[Epoch 23 Batch 240/2125] avg loss 0.00130544, throughput 3.99682K wps
[Epoch 23 Batch 270/2125] avg loss 0.00068095, throughput 3.99927K wps
[Epoch 23 Batch 300/2125] avg loss 0.000650091, throughput 4.00233K wps
[Epoch 23 Batch 330/2125] avg loss 0.000973076, throughput 3.99796K wps
[Epoch 23 Batch 360/2125] avg loss 0.000807176, throughput 3.99872K wps
[Epoch 23 Batch 390/2125] avg loss 0.000985109, throughput 3.99846K wps
[Epoch 23 Batch 420/2125] avg loss 0.000869727, throughput 3.99757K wps
[Epoch 23 Batch 450/2125] avg loss 0.00082561, throughput 3.99961K wps
[Epoch 23 Batch 480/2125] avg loss 0.000788764, throughput 3.99704K wps
[Epoch 23 Batch 510/2125] avg loss 0.000718422, throughput 3.99948K wps
[Epoch 23 Batch 540/2125] avg loss 0.000823188, throughput 3.99841K wps
[Epoch 23 Batch 570/2125] avg loss 0.000881613, throughput 3.97928K wps
[Epoch 23 Batch 600/2125] avg loss 0.0010073, throughput 3.99077K wps
[Epoch 23 Batch 630/2125] avg loss 0.00097048, throughput 3.99773K wps
[Epoch 23 Batch 660/2125] avg loss 0.0009027, throughput 4.00118K wps
[Epoch 23 Batch 690/2125] avg loss 0.000854212, throughput 3.99717K wps
[Epoch 23 Batch 720/2125] avg loss 0.000897682, throughput 3.9995K wps
[Epoch 23 Batch 750/2125] avg loss 0.00112234, throughput 3.99642K wps
[Epoch 23 Batch 780/2125] avg loss 0.00110559, throughput 3.9985K wps
[Epoch 23 Batch 810/2125] avg loss 0.000979687, throughput 3.99871K wps
[Epoch 23 Batch 840/2125] avg loss 0.000823226, throughput 4.00066K wps
[Epoch 23 Batch 870/2125] avg loss 0.00107551, throughput 4.00169K wps
[Epoch 23 Batch 900/2125] avg loss 0.00107284, throughput 4.00048K wps
[Epoch 23 Batch 930/2125] avg loss 0.000793685, throughput 3.99916K wps
[Epoch 23 Batch 960/2125] avg loss 0.000826126, throughput 4.00184K wps
[Epoch 23 Batch 990/2125] avg loss 0.000938135, throughput 4.00189K wps
[Epoch 23 Batch 1020/2125] avg loss 0.000802025, throughput 3.99907K wps
[Epoch 23 Batch 1050/2125] avg loss 0.00114581, throughput 4.00121K wps
[Epoch 23 Batch 1080/2125] avg loss 0.00105139, throughput 3.9988K wps
[Epoch 23 Batch 1110/2125] avg loss 0.000658475, throughput 4.00257K wps
[Epoch 23 Batch 1140/2125] avg loss 0.000921987, throughput 4.00371K wps
[Epoch 23 Batch 1170/2125] avg loss 0.000930614, throughput 3.99867K wps
[Epoch 23 Batch 1200/2125] avg loss 0.00094148, throughput 3.99734K wps
[Epoch 23 Batch 1230/2125] avg loss 0.00111428, throughput 4.00117K wps
[Epoch 23 Batch 1260/2125] avg loss 0.00118994, throughput 3.99877K wps
[Epoch 23 Batch 1290/2125] avg loss 0.00116206, throughput 3.99637K wps
[Epoch 23 Batch 1320/2125] avg loss 0.000990076, throughput 3.99878K wps
[Epoch 23 Batch 1350/2125] avg loss 0.00094303, throughput 4.00097K wps
[Epoch 23 Batch 1380/2125] avg loss 0.00101903, throughput 3.99831K wps
[Epoch 23 Batch 1410/2125] avg loss 0.000896234, throughput 4.00399K wps
[Epoch 23 Batch 1440/2125] avg loss 0.000962349, throughput 4.00124K wps
[Epoch 23 Batch 1470/2125] avg loss 0.00118886, throughput 3.99263K wps
[Epoch 23 Batch 1500/2125] avg loss 0.000842466, throughput 3.98948K wps
[Epoch 23 Batch 1530/2125] avg loss 0.000933186, throughput 4.00196K wps
[Epoch 23 Batch 1560/2125] avg loss 0.00112844, throughput 4.00131K wps
[Epoch 23 Batch 1590/2125] avg loss 0.00104191, throughput 4.00042K wps
[Epoch 23 Batch 1620/2125] avg loss 0.000897441, throughput 3.99727K wps
[Epoch 23 Batch 1650/2125] avg loss 0.00102691, throughput 3.99816K wps
[Epoch 23 Batch 1680/2125] avg loss 0.000856051, throughput 4.00245K wps
[Epoch 23 Batch 1710/2125] avg loss 0.000915185, throughput 4.0007K wps
[Epoch 23 Batch 1740/2125] avg loss 0.000725113, throughput 4.00325K wps
[Epoch 23 Batch 1770/2125] avg loss 0.00143779, throughput 3.99762K wps
[Epoch 23 Batch 1800/2125] avg loss 0.000691243, throughput 3.99635K wps
[Epoch 23 Batch 1830/2125] avg loss 0.000799892, throughput 3.99439K wps
[Epoch 23 Batch 1860/2125] avg loss 0.00108382, throughput 3.99743K wps
[Epoch 23 Batch 1890/2125] avg loss 0.00123987, throughput 3.99855K wps
[Epoch 23 Batch 1920/2125] avg loss 0.00109082, throughput 3.99762K wps
[Epoch 23 Batch 1950/2125] avg loss 0.000854456, throughput 3.99684K wps
[Epoch 23 Batch 1980/2125] avg loss 0.000787512, throughput 3.99497K wps
[Epoch 23 Batch 2010/2125] avg loss 0.00130669, throughput 3.9954K wps
[Epoch 23 Batch 2040/2125] avg loss 0.00104276, throughput 3.997K wps
[Epoch 23 Batch 2070/2125] avg loss 0.00113847, throughput 4.00402K wps
[Epoch 23 Batch 2100/2125] avg loss 0.00136508, throughput 4.00167K wps
Begin Testing...
[Batch 30/237] elapsed 0.46 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 23] train avg loss 0.000955478, test acc 0.9266, test avg loss 0.4163, throughput 3.99972K wps
[Epoch 24 Batch 30/2125] avg loss 0.00074355, throughput 4.0927K wps
[Epoch 24 Batch 60/2125] avg loss 0.000628039, throughput 4.00792K wps
[Epoch 24 Batch 90/2125] avg loss 0.00085109, throughput 4.00823K wps
[Epoch 24 Batch 120/2125] avg loss 0.000788524, throughput 4.00533K wps
[Epoch 24 Batch 150/2125] avg loss 0.000739635, throughput 3.99997K wps
[Epoch 24 Batch 180/2125] avg loss 0.000758911, throughput 3.99758K wps
[Epoch 24 Batch 210/2125] avg loss 0.000676737, throughput 3.99918K wps
[Epoch 24 Batch 240/2125] avg loss 0.000837779, throughput 4.00217K wps
[Epoch 24 Batch 270/2125] avg loss 0.000907735, throughput 3.99845K wps
[Epoch 24 Batch 300/2125] avg loss 0.000912691, throughput 3.99816K wps
[Epoch 24 Batch 330/2125] avg loss 0.000853717, throughput 3.99693K wps
[Epoch 24 Batch 360/2125] avg loss 0.000762975, throughput 4.00297K wps
[Epoch 24 Batch 390/2125] avg loss 0.000848826, throughput 3.99919K wps
[Epoch 24 Batch 420/2125] avg loss 0.000982617, throughput 3.99675K wps
[Epoch 24 Batch 450/2125] avg loss 0.00071553, throughput 3.99925K wps
[Epoch 24 Batch 480/2125] avg loss 0.000864627, throughput 3.99927K wps
[Epoch 24 Batch 510/2125] avg loss 0.000981059, throughput 3.99652K wps
[Epoch 24 Batch 540/2125] avg loss 0.0010103, throughput 3.99887K wps
[Epoch 24 Batch 570/2125] avg loss 0.000858149, throughput 4.00202K wps
[Epoch 24 Batch 600/2125] avg loss 0.00094538, throughput 3.99419K wps
[Epoch 24 Batch 630/2125] avg loss 0.000653286, throughput 3.9979K wps
[Epoch 24 Batch 660/2125] avg loss 0.000911175, throughput 3.99934K wps
[Epoch 24 Batch 690/2125] avg loss 0.00120244, throughput 3.99711K wps
[Epoch 24 Batch 720/2125] avg loss 0.000885383, throughput 4.0008K wps
[Epoch 24 Batch 750/2125] avg loss 0.000888566, throughput 3.9983K wps
[Epoch 24 Batch 780/2125] avg loss 0.000931858, throughput 3.99689K wps
[Epoch 24 Batch 810/2125] avg loss 0.000835434, throughput 3.99665K wps
[Epoch 24 Batch 840/2125] avg loss 0.000708333, throughput 3.99541K wps
[Epoch 24 Batch 870/2125] avg loss 0.000820899, throughput 4.00032K wps
[Epoch 24 Batch 900/2125] avg loss 0.000878584, throughput 3.99582K wps
[Epoch 24 Batch 930/2125] avg loss 0.00101253, throughput 4.00255K wps
[Epoch 24 Batch 960/2125] avg loss 0.000869647, throughput 3.99943K wps
[Epoch 24 Batch 990/2125] avg loss 0.000928781, throughput 3.99805K wps
[Epoch 24 Batch 1020/2125] avg loss 0.000788742, throughput 3.99796K wps
[Epoch 24 Batch 1050/2125] avg loss 0.000753087, throughput 4.00055K wps
[Epoch 24 Batch 1080/2125] avg loss 0.00128214, throughput 3.99928K wps
[Epoch 24 Batch 1110/2125] avg loss 0.00101769, throughput 3.9955K wps
[Epoch 24 Batch 1140/2125] avg loss 0.000958631, throughput 3.99699K wps
[Epoch 24 Batch 1170/2125] avg loss 0.000772871, throughput 4.00085K wps
[Epoch 24 Batch 1200/2125] avg loss 0.000876131, throughput 3.99913K wps
[Epoch 24 Batch 1230/2125] avg loss 0.000833211, throughput 3.99788K wps
[Epoch 24 Batch 1260/2125] avg loss 0.000912889, throughput 4.00199K wps
[Epoch 24 Batch 1290/2125] avg loss 0.00102116, throughput 4.00124K wps
[Epoch 24 Batch 1320/2125] avg loss 0.00108214, throughput 3.99817K wps
[Epoch 24 Batch 1350/2125] avg loss 0.00106421, throughput 3.99864K wps
[Epoch 24 Batch 1380/2125] avg loss 0.00101004, throughput 3.99765K wps
[Epoch 24 Batch 1410/2125] avg loss 0.000939563, throughput 3.99679K wps
[Epoch 24 Batch 1440/2125] avg loss 0.00115979, throughput 3.99274K wps
[Epoch 24 Batch 1470/2125] avg loss 0.000863252, throughput 3.9941K wps
[Epoch 24 Batch 1500/2125] avg loss 0.000902895, throughput 3.99827K wps
[Epoch 24 Batch 1530/2125] avg loss 0.000699795, throughput 3.99707K wps
[Epoch 24 Batch 1560/2125] avg loss 0.000777025, throughput 3.99844K wps
[Epoch 24 Batch 1590/2125] avg loss 0.00108344, throughput 3.99669K wps
[Epoch 24 Batch 1620/2125] avg loss 0.000974974, throughput 4.00104K wps
[Epoch 24 Batch 1650/2125] avg loss 0.00130631, throughput 4.00204K wps
[Epoch 24 Batch 1680/2125] avg loss 0.000967089, throughput 4.00143K wps
[Epoch 24 Batch 1710/2125] avg loss 0.000783587, throughput 3.9979K wps
[Epoch 24 Batch 1740/2125] avg loss 0.000949005, throughput 3.99355K wps
[Epoch 24 Batch 1770/2125] avg loss 0.000989698, throughput 3.99405K wps
[Epoch 24 Batch 1800/2125] avg loss 0.00113419, throughput 4.00028K wps
[Epoch 24 Batch 1830/2125] avg loss 0.000926153, throughput 4.00245K wps
[Epoch 24 Batch 1860/2125] avg loss 0.00103329, throughput 3.99858K wps
[Epoch 24 Batch 1890/2125] avg loss 0.00103103, throughput 3.99595K wps
[Epoch 24 Batch 1920/2125] avg loss 0.0010831, throughput 3.99807K wps
[Epoch 24 Batch 1950/2125] avg loss 0.00095452, throughput 4.0016K wps
[Epoch 24 Batch 1980/2125] avg loss 0.00121423, throughput 3.98786K wps
[Epoch 24 Batch 2010/2125] avg loss 0.00110914, throughput 3.99384K wps
[Epoch 24 Batch 2040/2125] avg loss 0.000728592, throughput 4.00014K wps
[Epoch 24 Batch 2070/2125] avg loss 0.000716931, throughput 3.99422K wps
[Epoch 24 Batch 2100/2125] avg loss 0.00152705, throughput 3.99013K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 24] train avg loss 0.000922662, test acc 0.9272, test avg loss 0.426275, throughput 3.99977K wps
[Epoch 25 Batch 30/2125] avg loss 0.000542693, throughput 4.09035K wps
[Epoch 25 Batch 60/2125] avg loss 0.000634554, throughput 4.0001K wps
[Epoch 25 Batch 90/2125] avg loss 0.000802628, throughput 3.99953K wps
[Epoch 25 Batch 120/2125] avg loss 0.000638718, throughput 3.99925K wps
[Epoch 25 Batch 150/2125] avg loss 0.00081036, throughput 3.99829K wps
[Epoch 25 Batch 180/2125] avg loss 0.000989756, throughput 3.99629K wps
[Epoch 25 Batch 210/2125] avg loss 0.000705169, throughput 3.99707K wps
[Epoch 25 Batch 240/2125] avg loss 0.000751232, throughput 4.0027K wps
[Epoch 25 Batch 270/2125] avg loss 0.00105092, throughput 3.99853K wps
[Epoch 25 Batch 300/2125] avg loss 0.000801791, throughput 3.99969K wps
[Epoch 25 Batch 330/2125] avg loss 0.000764922, throughput 4.0007K wps
[Epoch 25 Batch 360/2125] avg loss 0.000823676, throughput 4.00029K wps
[Epoch 25 Batch 390/2125] avg loss 0.000915287, throughput 3.99376K wps
[Epoch 25 Batch 420/2125] avg loss 0.000680376, throughput 3.99688K wps
[Epoch 25 Batch 450/2125] avg loss 0.000823017, throughput 3.99949K wps
[Epoch 25 Batch 480/2125] avg loss 0.00102207, throughput 3.99895K wps
[Epoch 25 Batch 510/2125] avg loss 0.00102967, throughput 4.00133K wps
[Epoch 25 Batch 540/2125] avg loss 0.00078496, throughput 4.00065K wps
[Epoch 25 Batch 570/2125] avg loss 0.0010984, throughput 4.00103K wps
[Epoch 25 Batch 600/2125] avg loss 0.000731567, throughput 3.9977K wps
[Epoch 25 Batch 630/2125] avg loss 0.00111563, throughput 3.99786K wps
[Epoch 25 Batch 660/2125] avg loss 0.000643189, throughput 3.99895K wps
[Epoch 25 Batch 690/2125] avg loss 0.000943304, throughput 4.0004K wps
[Epoch 25 Batch 720/2125] avg loss 0.000797849, throughput 3.99988K wps
[Epoch 25 Batch 750/2125] avg loss 0.000847502, throughput 3.99936K wps
[Epoch 25 Batch 780/2125] avg loss 0.00108424, throughput 3.99822K wps
[Epoch 25 Batch 810/2125] avg loss 0.00111283, throughput 3.99488K wps
[Epoch 25 Batch 840/2125] avg loss 0.000839548, throughput 4.0002K wps
[Epoch 25 Batch 870/2125] avg loss 0.000872638, throughput 3.99697K wps
[Epoch 25 Batch 900/2125] avg loss 0.000928779, throughput 4.00237K wps
[Epoch 25 Batch 930/2125] avg loss 0.000936099, throughput 4.00417K wps
[Epoch 25 Batch 960/2125] avg loss 0.000887796, throughput 4.00833K wps
[Epoch 25 Batch 990/2125] avg loss 0.000971314, throughput 4.01263K wps
[Epoch 25 Batch 1020/2125] avg loss 0.000749305, throughput 4.00056K wps
[Epoch 25 Batch 1050/2125] avg loss 0.000876454, throughput 4.002K wps
[Epoch 25 Batch 1080/2125] avg loss 0.00105051, throughput 4.00535K wps
[Epoch 25 Batch 1110/2125] avg loss 0.000878548, throughput 3.99713K wps
[Epoch 25 Batch 1140/2125] avg loss 0.000749631, throughput 3.99933K wps
[Epoch 25 Batch 1170/2125] avg loss 0.00102222, throughput 3.99485K wps
[Epoch 25 Batch 1200/2125] avg loss 0.00112624, throughput 3.99983K wps
[Epoch 25 Batch 1230/2125] avg loss 0.000858281, throughput 3.99945K wps
[Epoch 25 Batch 1260/2125] avg loss 0.000799096, throughput 4.00176K wps
[Epoch 25 Batch 1290/2125] avg loss 0.000739759, throughput 4.00182K wps
[Epoch 25 Batch 1320/2125] avg loss 0.000695089, throughput 3.99627K wps
[Epoch 25 Batch 1350/2125] avg loss 0.000996013, throughput 3.998K wps
[Epoch 25 Batch 1380/2125] avg loss 0.00100718, throughput 3.99302K wps
[Epoch 25 Batch 1410/2125] avg loss 0.0011011, throughput 4.00373K wps
[Epoch 25 Batch 1440/2125] avg loss 0.000834168, throughput 4.00019K wps
[Epoch 25 Batch 1470/2125] avg loss 0.00112079, throughput 4.00352K wps
[Epoch 25 Batch 1500/2125] avg loss 0.00103098, throughput 3.99683K wps
[Epoch 25 Batch 1530/2125] avg loss 0.00099876, throughput 3.99749K wps
[Epoch 25 Batch 1560/2125] avg loss 0.000760131, throughput 3.9967K wps
[Epoch 25 Batch 1590/2125] avg loss 0.000996555, throughput 3.99826K wps
[Epoch 25 Batch 1620/2125] avg loss 0.000814146, throughput 3.99669K wps
[Epoch 25 Batch 1650/2125] avg loss 0.00105125, throughput 3.99979K wps
[Epoch 25 Batch 1680/2125] avg loss 0.000905662, throughput 3.99872K wps
[Epoch 25 Batch 1710/2125] avg loss 0.000707415, throughput 3.99921K wps
[Epoch 25 Batch 1740/2125] avg loss 0.00115767, throughput 3.99348K wps
[Epoch 25 Batch 1770/2125] avg loss 0.000631925, throughput 3.99946K wps
[Epoch 25 Batch 1800/2125] avg loss 0.00106952, throughput 4.00118K wps
[Epoch 25 Batch 1830/2125] avg loss 0.00100007, throughput 3.99743K wps
[Epoch 25 Batch 1860/2125] avg loss 0.00101421, throughput 4.00102K wps
[Epoch 25 Batch 1890/2125] avg loss 0.00095709, throughput 4.0017K wps
[Epoch 25 Batch 1920/2125] avg loss 0.000911437, throughput 3.99451K wps
[Epoch 25 Batch 1950/2125] avg loss 0.000725144, throughput 3.99835K wps
[Epoch 25 Batch 1980/2125] avg loss 0.000811346, throughput 4.00194K wps
[Epoch 25 Batch 2010/2125] avg loss 0.00128751, throughput 3.99909K wps
[Epoch 25 Batch 2040/2125] avg loss 0.000919827, throughput 3.99366K wps
[Epoch 25 Batch 2070/2125] avg loss 0.00129537, throughput 3.99327K wps
[Epoch 25 Batch 2100/2125] avg loss 0.000944068, throughput 3.99353K wps
Begin Testing...
[Batch 30/237] elapsed 0.46 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 25] train avg loss 0.000902611, test acc 0.9263, test avg loss 0.434256, throughput 4.00047K wps
[Epoch 26 Batch 30/2125] avg loss 0.000498818, throughput 4.0869K wps
[Epoch 26 Batch 60/2125] avg loss 0.000420643, throughput 4.00087K wps
[Epoch 26 Batch 90/2125] avg loss 0.000630067, throughput 3.99238K wps
[Epoch 26 Batch 120/2125] avg loss 0.00068486, throughput 3.99662K wps
[Epoch 26 Batch 150/2125] avg loss 0.000815717, throughput 3.99419K wps
[Epoch 26 Batch 180/2125] avg loss 0.000786024, throughput 3.99795K wps
[Epoch 26 Batch 210/2125] avg loss 0.000723033, throughput 3.99614K wps
[Epoch 26 Batch 240/2125] avg loss 0.000781422, throughput 3.99491K wps
[Epoch 26 Batch 270/2125] avg loss 0.00100382, throughput 4.00006K wps
[Epoch 26 Batch 300/2125] avg loss 0.000654798, throughput 3.99799K wps
[Epoch 26 Batch 330/2125] avg loss 0.000793541, throughput 4.00099K wps
[Epoch 26 Batch 360/2125] avg loss 0.000755748, throughput 3.99736K wps
[Epoch 26 Batch 390/2125] avg loss 0.000685467, throughput 3.99623K wps
[Epoch 26 Batch 420/2125] avg loss 0.000699907, throughput 3.99971K wps
[Epoch 26 Batch 450/2125] avg loss 0.000805789, throughput 4.003K wps
[Epoch 26 Batch 480/2125] avg loss 0.00103566, throughput 3.99926K wps
[Epoch 26 Batch 510/2125] avg loss 0.000734148, throughput 4.00072K wps
[Epoch 26 Batch 540/2125] avg loss 0.000776538, throughput 3.99879K wps
[Epoch 26 Batch 570/2125] avg loss 0.000795585, throughput 4.00061K wps
[Epoch 26 Batch 600/2125] avg loss 0.000897024, throughput 3.99733K wps
[Epoch 26 Batch 630/2125] avg loss 0.000748029, throughput 4.00191K wps
[Epoch 26 Batch 660/2125] avg loss 0.000733924, throughput 3.99764K wps
[Epoch 26 Batch 690/2125] avg loss 0.00086095, throughput 3.99655K wps
[Epoch 26 Batch 720/2125] avg loss 0.000739644, throughput 4.00069K wps
[Epoch 26 Batch 750/2125] avg loss 0.000858789, throughput 4.00374K wps
[Epoch 26 Batch 780/2125] avg loss 0.000606775, throughput 4.00338K wps
[Epoch 26 Batch 810/2125] avg loss 0.000965441, throughput 4.00279K wps
[Epoch 26 Batch 840/2125] avg loss 0.000810623, throughput 4.00066K wps
[Epoch 26 Batch 870/2125] avg loss 0.000906753, throughput 3.99913K wps
[Epoch 26 Batch 900/2125] avg loss 0.000826334, throughput 3.99327K wps
[Epoch 26 Batch 930/2125] avg loss 0.000886264, throughput 3.99579K wps
[Epoch 26 Batch 960/2125] avg loss 0.000716234, throughput 3.99706K wps
[Epoch 26 Batch 990/2125] avg loss 0.000747726, throughput 3.99933K wps
[Epoch 26 Batch 1020/2125] avg loss 0.0009396, throughput 3.99813K wps
[Epoch 26 Batch 1050/2125] avg loss 0.000854807, throughput 4.00083K wps
[Epoch 26 Batch 1080/2125] avg loss 0.00102436, throughput 3.99597K wps
[Epoch 26 Batch 1110/2125] avg loss 0.000972614, throughput 4.00024K wps
[Epoch 26 Batch 1140/2125] avg loss 0.00114486, throughput 3.99443K wps
[Epoch 26 Batch 1170/2125] avg loss 0.000798882, throughput 3.99608K wps
[Epoch 26 Batch 1200/2125] avg loss 0.00087329, throughput 3.98397K wps
[Epoch 26 Batch 1230/2125] avg loss 0.000763931, throughput 3.99896K wps
[Epoch 26 Batch 1260/2125] avg loss 0.000817015, throughput 3.99805K wps
[Epoch 26 Batch 1290/2125] avg loss 0.000699429, throughput 3.99333K wps
[Epoch 26 Batch 1320/2125] avg loss 0.000949672, throughput 3.99653K wps
[Epoch 26 Batch 1350/2125] avg loss 0.000840443, throughput 3.99729K wps
[Epoch 26 Batch 1380/2125] avg loss 0.00130038, throughput 3.99842K wps
[Epoch 26 Batch 1410/2125] avg loss 0.000757843, throughput 3.99769K wps
[Epoch 26 Batch 1440/2125] avg loss 0.000851357, throughput 3.99738K wps
[Epoch 26 Batch 1470/2125] avg loss 0.000937213, throughput 3.99881K wps
[Epoch 26 Batch 1500/2125] avg loss 0.00106818, throughput 3.99957K wps
[Epoch 26 Batch 1530/2125] avg loss 0.00104096, throughput 3.99846K wps
[Epoch 26 Batch 1560/2125] avg loss 0.00102107, throughput 4.00005K wps
[Epoch 26 Batch 1590/2125] avg loss 0.00088301, throughput 4.0004K wps
[Epoch 26 Batch 1620/2125] avg loss 0.00108645, throughput 3.99498K wps
[Epoch 26 Batch 1650/2125] avg loss 0.000694435, throughput 3.99097K wps
[Epoch 26 Batch 1680/2125] avg loss 0.000767809, throughput 3.99842K wps
[Epoch 26 Batch 1710/2125] avg loss 0.00109181, throughput 3.99751K wps
[Epoch 26 Batch 1740/2125] avg loss 0.000742198, throughput 3.99786K wps
[Epoch 26 Batch 1770/2125] avg loss 0.000964551, throughput 3.99769K wps
[Epoch 26 Batch 1800/2125] avg loss 0.000947351, throughput 3.99745K wps
[Epoch 26 Batch 1830/2125] avg loss 0.00095134, throughput 4.00087K wps
[Epoch 26 Batch 1860/2125] avg loss 0.00113308, throughput 3.99866K wps
[Epoch 26 Batch 1890/2125] avg loss 0.000901305, throughput 4.00275K wps
[Epoch 26 Batch 1920/2125] avg loss 0.00120927, throughput 3.99586K wps
[Epoch 26 Batch 1950/2125] avg loss 0.000790672, throughput 3.9989K wps
[Epoch 26 Batch 1980/2125] avg loss 0.000907426, throughput 3.99853K wps
[Epoch 26 Batch 2010/2125] avg loss 0.000910196, throughput 3.99777K wps
[Epoch 26 Batch 2040/2125] avg loss 0.00101627, throughput 3.99965K wps
[Epoch 26 Batch 2070/2125] avg loss 0.0011931, throughput 3.99335K wps
[Epoch 26 Batch 2100/2125] avg loss 0.00103913, throughput 3.99742K wps
Begin Testing...
[Batch 30/237] elapsed 0.46 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 26] train avg loss 0.000866075, test acc 0.9259, test avg loss 0.435779, throughput 3.99923K wps
[Epoch 27 Batch 30/2125] avg loss 0.000635316, throughput 4.0913K wps
[Epoch 27 Batch 60/2125] avg loss 0.000550861, throughput 3.99655K wps
[Epoch 27 Batch 90/2125] avg loss 0.000456379, throughput 3.99999K wps
[Epoch 27 Batch 120/2125] avg loss 0.000631079, throughput 4.00051K wps
[Epoch 27 Batch 150/2125] avg loss 0.000700417, throughput 3.99837K wps
[Epoch 27 Batch 180/2125] avg loss 0.000561199, throughput 3.99924K wps
[Epoch 27 Batch 210/2125] avg loss 0.000598493, throughput 3.99786K wps
[Epoch 27 Batch 240/2125] avg loss 0.000608703, throughput 4.00371K wps
[Epoch 27 Batch 270/2125] avg loss 0.000831164, throughput 4.00151K wps
[Epoch 27 Batch 300/2125] avg loss 0.000676873, throughput 3.99692K wps
[Epoch 27 Batch 330/2125] avg loss 0.000732389, throughput 4.00317K wps
[Epoch 27 Batch 360/2125] avg loss 0.000794212, throughput 3.99879K wps
[Epoch 27 Batch 390/2125] avg loss 0.000794384, throughput 4.00016K wps
[Epoch 27 Batch 420/2125] avg loss 0.000915684, throughput 3.99885K wps
[Epoch 27 Batch 450/2125] avg loss 0.000706366, throughput 3.99965K wps
[Epoch 27 Batch 480/2125] avg loss 0.00065528, throughput 3.99457K wps
[Epoch 27 Batch 510/2125] avg loss 0.000780761, throughput 3.9983K wps
[Epoch 27 Batch 540/2125] avg loss 0.00077302, throughput 3.99594K wps
[Epoch 27 Batch 570/2125] avg loss 0.000796542, throughput 4.0003K wps
[Epoch 27 Batch 600/2125] avg loss 0.00086993, throughput 3.99609K wps
[Epoch 27 Batch 630/2125] avg loss 0.000803891, throughput 3.9969K wps
[Epoch 27 Batch 660/2125] avg loss 0.000668058, throughput 3.99901K wps
[Epoch 27 Batch 690/2125] avg loss 0.00101888, throughput 3.99509K wps
[Epoch 27 Batch 720/2125] avg loss 0.00055092, throughput 4.00216K wps
[Epoch 27 Batch 750/2125] avg loss 0.000617076, throughput 4.00729K wps
[Epoch 27 Batch 780/2125] avg loss 0.000870937, throughput 4.00667K wps
[Epoch 27 Batch 810/2125] avg loss 0.000872373, throughput 4.0029K wps
[Epoch 27 Batch 840/2125] avg loss 0.000871922, throughput 3.99956K wps
[Epoch 27 Batch 870/2125] avg loss 0.000944092, throughput 3.99798K wps
[Epoch 27 Batch 900/2125] avg loss 0.000875856, throughput 3.99923K wps
[Epoch 27 Batch 930/2125] avg loss 0.000800257, throughput 3.99825K wps
[Epoch 27 Batch 960/2125] avg loss 0.000838509, throughput 4.00052K wps
[Epoch 27 Batch 990/2125] avg loss 0.000661349, throughput 4.00115K wps
[Epoch 27 Batch 1020/2125] avg loss 0.000782845, throughput 3.99716K wps
[Epoch 27 Batch 1050/2125] avg loss 0.00097543, throughput 3.99681K wps
[Epoch 27 Batch 1080/2125] avg loss 0.00081591, throughput 3.99942K wps
[Epoch 27 Batch 1110/2125] avg loss 0.000951907, throughput 3.99978K wps
[Epoch 27 Batch 1140/2125] avg loss 0.000876733, throughput 3.99958K wps
[Epoch 27 Batch 1170/2125] avg loss 0.00108206, throughput 4.00203K wps
[Epoch 27 Batch 1200/2125] avg loss 0.000862121, throughput 3.99688K wps
[Epoch 27 Batch 1230/2125] avg loss 0.000866138, throughput 3.99555K wps
[Epoch 27 Batch 1260/2125] avg loss 0.000969637, throughput 3.99917K wps
[Epoch 27 Batch 1290/2125] avg loss 0.000798511, throughput 3.9983K wps
[Epoch 27 Batch 1320/2125] avg loss 0.000857802, throughput 3.99873K wps
[Epoch 27 Batch 1350/2125] avg loss 0.000816272, throughput 3.99899K wps
[Epoch 27 Batch 1380/2125] avg loss 0.00068316, throughput 3.99606K wps
[Epoch 27 Batch 1410/2125] avg loss 0.00110631, throughput 4.00344K wps
[Epoch 27 Batch 1440/2125] avg loss 0.000936967, throughput 3.99788K wps
[Epoch 27 Batch 1470/2125] avg loss 0.00101446, throughput 3.9972K wps
[Epoch 27 Batch 1500/2125] avg loss 0.000671273, throughput 4.00367K wps
[Epoch 27 Batch 1530/2125] avg loss 0.000752257, throughput 3.99805K wps
[Epoch 27 Batch 1560/2125] avg loss 0.000859808, throughput 3.9948K wps
[Epoch 27 Batch 1590/2125] avg loss 0.000928378, throughput 3.9989K wps
[Epoch 27 Batch 1620/2125] avg loss 0.00116471, throughput 3.99369K wps
[Epoch 27 Batch 1650/2125] avg loss 0.00082462, throughput 3.99848K wps
[Epoch 27 Batch 1680/2125] avg loss 0.000996676, throughput 3.99565K wps
[Epoch 27 Batch 1710/2125] avg loss 0.000928437, throughput 3.99883K wps
[Epoch 27 Batch 1740/2125] avg loss 0.000894818, throughput 3.99752K wps
[Epoch 27 Batch 1770/2125] avg loss 0.000949365, throughput 3.99732K wps
[Epoch 27 Batch 1800/2125] avg loss 0.000762819, throughput 3.99623K wps
[Epoch 27 Batch 1830/2125] avg loss 0.00110174, throughput 4.00038K wps
[Epoch 27 Batch 1860/2125] avg loss 0.000924775, throughput 3.99693K wps
[Epoch 27 Batch 1890/2125] avg loss 0.000825042, throughput 3.99657K wps
[Epoch 27 Batch 1920/2125] avg loss 0.000640435, throughput 4.00164K wps
[Epoch 27 Batch 1950/2125] avg loss 0.000901496, throughput 3.99867K wps
[Epoch 27 Batch 1980/2125] avg loss 0.000933259, throughput 3.99784K wps
[Epoch 27 Batch 2010/2125] avg loss 0.000921317, throughput 4.00121K wps
[Epoch 27 Batch 2040/2125] avg loss 0.000977747, throughput 3.99937K wps
[Epoch 27 Batch 2070/2125] avg loss 0.000859687, throughput 3.99847K wps
[Epoch 27 Batch 2100/2125] avg loss 0.000879006, throughput 3.99508K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 27] train avg loss 0.000822073, test acc 0.9265, test avg loss 0.446754, throughput 4.00021K wps
[Epoch 28 Batch 30/2125] avg loss 0.000737825, throughput 4.09029K wps
[Epoch 28 Batch 60/2125] avg loss 0.000647331, throughput 3.99795K wps
[Epoch 28 Batch 90/2125] avg loss 0.000600486, throughput 3.99836K wps
[Epoch 28 Batch 120/2125] avg loss 0.000649889, throughput 4.0003K wps
[Epoch 28 Batch 150/2125] avg loss 0.000734482, throughput 3.99859K wps
[Epoch 28 Batch 180/2125] avg loss 0.000754165, throughput 3.99746K wps
[Epoch 28 Batch 210/2125] avg loss 0.000633512, throughput 3.99798K wps
[Epoch 28 Batch 240/2125] avg loss 0.00060276, throughput 4.00064K wps
[Epoch 28 Batch 270/2125] avg loss 0.000522579, throughput 3.99948K wps
[Epoch 28 Batch 300/2125] avg loss 0.000880485, throughput 3.99586K wps
[Epoch 28 Batch 330/2125] avg loss 0.000617328, throughput 3.99913K wps
[Epoch 28 Batch 360/2125] avg loss 0.000823878, throughput 4.00084K wps
[Epoch 28 Batch 390/2125] avg loss 0.000729743, throughput 3.98881K wps
[Epoch 28 Batch 420/2125] avg loss 0.000698249, throughput 3.98328K wps
[Epoch 28 Batch 450/2125] avg loss 0.0009332, throughput 4.00776K wps
[Epoch 28 Batch 480/2125] avg loss 0.000533977, throughput 4.00246K wps
[Epoch 28 Batch 510/2125] avg loss 0.000725661, throughput 4.00173K wps
[Epoch 28 Batch 540/2125] avg loss 0.000751737, throughput 4.00518K wps
[Epoch 28 Batch 570/2125] avg loss 0.0007934, throughput 4.00878K wps
[Epoch 28 Batch 600/2125] avg loss 0.000928704, throughput 4.00473K wps
[Epoch 28 Batch 630/2125] avg loss 0.000738289, throughput 3.99785K wps
[Epoch 28 Batch 660/2125] avg loss 0.000683416, throughput 4.00843K wps
[Epoch 28 Batch 690/2125] avg loss 0.000713073, throughput 4.00364K wps
[Epoch 28 Batch 720/2125] avg loss 0.00058635, throughput 4.00282K wps
[Epoch 28 Batch 750/2125] avg loss 0.000674699, throughput 3.99703K wps
[Epoch 28 Batch 780/2125] avg loss 0.00103259, throughput 3.99531K wps
[Epoch 28 Batch 810/2125] avg loss 0.000642306, throughput 4.00029K wps
[Epoch 28 Batch 840/2125] avg loss 0.000989487, throughput 3.99734K wps
[Epoch 28 Batch 870/2125] avg loss 0.00073437, throughput 3.99655K wps
[Epoch 28 Batch 900/2125] avg loss 0.000857023, throughput 3.99778K wps
[Epoch 28 Batch 930/2125] avg loss 0.000798029, throughput 3.99691K wps
[Epoch 28 Batch 960/2125] avg loss 0.000629364, throughput 3.99897K wps
[Epoch 28 Batch 990/2125] avg loss 0.000785235, throughput 3.99488K wps
[Epoch 28 Batch 1020/2125] avg loss 0.00105952, throughput 3.99735K wps
[Epoch 28 Batch 1050/2125] avg loss 0.000798474, throughput 3.99246K wps
[Epoch 28 Batch 1080/2125] avg loss 0.000771222, throughput 3.99385K wps
[Epoch 28 Batch 1110/2125] avg loss 0.000811845, throughput 3.99958K wps
[Epoch 28 Batch 1140/2125] avg loss 0.000743731, throughput 3.99818K wps
[Epoch 28 Batch 1170/2125] avg loss 0.000922899, throughput 4.00068K wps
[Epoch 28 Batch 1200/2125] avg loss 0.00110057, throughput 3.99768K wps
[Epoch 28 Batch 1230/2125] avg loss 0.000995872, throughput 3.99705K wps
[Epoch 28 Batch 1260/2125] avg loss 0.000854191, throughput 3.99965K wps
[Epoch 28 Batch 1290/2125] avg loss 0.000936086, throughput 3.99832K wps
[Epoch 28 Batch 1320/2125] avg loss 0.000649409, throughput 4.00067K wps
[Epoch 28 Batch 1350/2125] avg loss 0.000762534, throughput 4.00164K wps
[Epoch 28 Batch 1380/2125] avg loss 0.000820072, throughput 4.00058K wps
[Epoch 28 Batch 1410/2125] avg loss 0.000745522, throughput 3.99977K wps
[Epoch 28 Batch 1440/2125] avg loss 0.000620482, throughput 3.99916K wps
[Epoch 28 Batch 1470/2125] avg loss 0.000953186, throughput 4.00015K wps
[Epoch 28 Batch 1500/2125] avg loss 0.000968806, throughput 4.00053K wps
[Epoch 28 Batch 1530/2125] avg loss 0.000864644, throughput 4.00495K wps
[Epoch 28 Batch 1560/2125] avg loss 0.000791578, throughput 4.00177K wps
[Epoch 28 Batch 1590/2125] avg loss 0.000652583, throughput 3.99778K wps
[Epoch 28 Batch 1620/2125] avg loss 0.000648645, throughput 3.99397K wps
[Epoch 28 Batch 1650/2125] avg loss 0.000803292, throughput 3.99699K wps
[Epoch 28 Batch 1680/2125] avg loss 0.000786271, throughput 3.99656K wps
[Epoch 28 Batch 1710/2125] avg loss 0.00089298, throughput 3.9995K wps
[Epoch 28 Batch 1740/2125] avg loss 0.000752158, throughput 3.99354K wps
[Epoch 28 Batch 1770/2125] avg loss 0.00109439, throughput 4.00355K wps
[Epoch 28 Batch 1800/2125] avg loss 0.000861575, throughput 3.99684K wps
[Epoch 28 Batch 1830/2125] avg loss 0.000908036, throughput 3.99885K wps
[Epoch 28 Batch 1860/2125] avg loss 0.00087281, throughput 4.00327K wps
[Epoch 28 Batch 1890/2125] avg loss 0.000998997, throughput 3.99706K wps
[Epoch 28 Batch 1920/2125] avg loss 0.00076708, throughput 4.00115K wps
[Epoch 28 Batch 1950/2125] avg loss 0.000989032, throughput 3.99984K wps
[Epoch 28 Batch 1980/2125] avg loss 0.00140115, throughput 4.00104K wps
[Epoch 28 Batch 2010/2125] avg loss 0.000910206, throughput 3.99949K wps
[Epoch 28 Batch 2040/2125] avg loss 0.000823205, throughput 4.0003K wps
[Epoch 28 Batch 2070/2125] avg loss 0.000787063, throughput 3.99838K wps
[Epoch 28 Batch 2100/2125] avg loss 0.000859805, throughput 4.00162K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 28] train avg loss 0.000804769, test acc 0.9259, test avg loss 0.457036, throughput 4.00039K wps
[Epoch 29 Batch 30/2125] avg loss 0.000929935, throughput 4.08873K wps
[Epoch 29 Batch 60/2125] avg loss 0.000683766, throughput 3.99688K wps
[Epoch 29 Batch 90/2125] avg loss 0.00056342, throughput 3.99599K wps
[Epoch 29 Batch 120/2125] avg loss 0.000736325, throughput 4.00037K wps
[Epoch 29 Batch 150/2125] avg loss 0.000658219, throughput 3.99718K wps
[Epoch 29 Batch 180/2125] avg loss 0.000841627, throughput 3.99818K wps
[Epoch 29 Batch 210/2125] avg loss 0.00088663, throughput 3.99989K wps
[Epoch 29 Batch 240/2125] avg loss 0.000635085, throughput 4.00049K wps
[Epoch 29 Batch 270/2125] avg loss 0.000713321, throughput 3.99628K wps
[Epoch 29 Batch 300/2125] avg loss 0.00060911, throughput 4.00069K wps
[Epoch 29 Batch 330/2125] avg loss 0.000584163, throughput 3.9967K wps
[Epoch 29 Batch 360/2125] avg loss 0.000702866, throughput 3.99779K wps
[Epoch 29 Batch 390/2125] avg loss 0.000865809, throughput 3.99846K wps
[Epoch 29 Batch 420/2125] avg loss 0.000673951, throughput 3.9996K wps
[Epoch 29 Batch 450/2125] avg loss 0.000842687, throughput 3.9954K wps
[Epoch 29 Batch 480/2125] avg loss 0.000906033, throughput 3.99786K wps
[Epoch 29 Batch 510/2125] avg loss 0.000645151, throughput 3.9998K wps
[Epoch 29 Batch 540/2125] avg loss 0.000537045, throughput 3.99439K wps
[Epoch 29 Batch 570/2125] avg loss 0.000708492, throughput 3.99609K wps
[Epoch 29 Batch 600/2125] avg loss 0.000697456, throughput 3.99894K wps
[Epoch 29 Batch 630/2125] avg loss 0.000696587, throughput 3.99717K wps
[Epoch 29 Batch 660/2125] avg loss 0.000712381, throughput 3.99733K wps
[Epoch 29 Batch 690/2125] avg loss 0.000620977, throughput 3.99774K wps
[Epoch 29 Batch 720/2125] avg loss 0.000925455, throughput 3.99672K wps
[Epoch 29 Batch 750/2125] avg loss 0.000836824, throughput 3.99937K wps
[Epoch 29 Batch 780/2125] avg loss 0.000576661, throughput 3.99348K wps
[Epoch 29 Batch 810/2125] avg loss 0.000668587, throughput 3.99492K wps
[Epoch 29 Batch 840/2125] avg loss 0.000983525, throughput 3.99656K wps
[Epoch 29 Batch 870/2125] avg loss 0.000770493, throughput 3.99853K wps
[Epoch 29 Batch 900/2125] avg loss 0.000775387, throughput 4.00133K wps
[Epoch 29 Batch 930/2125] avg loss 0.000743566, throughput 4.00023K wps
[Epoch 29 Batch 960/2125] avg loss 0.000820381, throughput 4.00125K wps
[Epoch 29 Batch 990/2125] avg loss 0.000715147, throughput 3.99633K wps
[Epoch 29 Batch 1020/2125] avg loss 0.000786707, throughput 3.99634K wps
[Epoch 29 Batch 1050/2125] avg loss 0.000742307, throughput 3.99641K wps
[Epoch 29 Batch 1080/2125] avg loss 0.000816046, throughput 3.99947K wps
[Epoch 29 Batch 1110/2125] avg loss 0.000684895, throughput 3.99175K wps
[Epoch 29 Batch 1140/2125] avg loss 0.00108338, throughput 3.99471K wps
[Epoch 29 Batch 1170/2125] avg loss 0.000786615, throughput 3.99864K wps
[Epoch 29 Batch 1200/2125] avg loss 0.000568282, throughput 3.99989K wps
[Epoch 29 Batch 1230/2125] avg loss 0.0008601, throughput 4.00122K wps
[Epoch 29 Batch 1260/2125] avg loss 0.000987709, throughput 3.99751K wps
[Epoch 29 Batch 1290/2125] avg loss 0.000880833, throughput 3.99777K wps
[Epoch 29 Batch 1320/2125] avg loss 0.000650443, throughput 3.99817K wps
[Epoch 29 Batch 1350/2125] avg loss 0.00065146, throughput 3.9998K wps
[Epoch 29 Batch 1380/2125] avg loss 0.000816061, throughput 3.99755K wps
[Epoch 29 Batch 1410/2125] avg loss 0.000754989, throughput 3.99895K wps
[Epoch 29 Batch 1440/2125] avg loss 0.000672763, throughput 3.99633K wps
[Epoch 29 Batch 1470/2125] avg loss 0.001001, throughput 4.00205K wps
[Epoch 29 Batch 1500/2125] avg loss 0.000906626, throughput 4.00221K wps
[Epoch 29 Batch 1530/2125] avg loss 0.000741029, throughput 4.00029K wps
[Epoch 29 Batch 1560/2125] avg loss 0.000970956, throughput 3.99766K wps
[Epoch 29 Batch 1590/2125] avg loss 0.00086996, throughput 3.99833K wps
[Epoch 29 Batch 1620/2125] avg loss 0.00087189, throughput 3.99994K wps
[Epoch 29 Batch 1650/2125] avg loss 0.000945509, throughput 3.99891K wps
[Epoch 29 Batch 1680/2125] avg loss 0.000969558, throughput 3.99782K wps
[Epoch 29 Batch 1710/2125] avg loss 0.00077986, throughput 3.99636K wps
[Epoch 29 Batch 1740/2125] avg loss 0.00100297, throughput 4.00094K wps
[Epoch 29 Batch 1770/2125] avg loss 0.000935416, throughput 3.99733K wps
[Epoch 29 Batch 1800/2125] avg loss 0.000770199, throughput 3.99893K wps
[Epoch 29 Batch 1830/2125] avg loss 0.000710103, throughput 3.97416K wps
[Epoch 29 Batch 1860/2125] avg loss 0.000731922, throughput 3.99568K wps
[Epoch 29 Batch 1890/2125] avg loss 0.000801361, throughput 4.00049K wps
[Epoch 29 Batch 1920/2125] avg loss 0.000975304, throughput 4.00001K wps
[Epoch 29 Batch 1950/2125] avg loss 0.000703561, throughput 3.99866K wps
[Epoch 29 Batch 1980/2125] avg loss 0.00086022, throughput 4.0022K wps
[Epoch 29 Batch 2010/2125] avg loss 0.000816924, throughput 3.99458K wps
[Epoch 29 Batch 2040/2125] avg loss 0.000968135, throughput 3.99619K wps
[Epoch 29 Batch 2070/2125] avg loss 0.00062645, throughput 3.99486K wps
[Epoch 29 Batch 2100/2125] avg loss 0.000855737, throughput 3.99514K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 29] train avg loss 0.000784884, test acc 0.9259, test avg loss 0.469774, throughput 3.9989K wps
[Epoch 30 Batch 30/2125] avg loss 0.000602965, throughput 4.09237K wps
[Epoch 30 Batch 60/2125] avg loss 0.000524209, throughput 3.99927K wps
[Epoch 30 Batch 90/2125] avg loss 0.000750805, throughput 3.99584K wps
[Epoch 30 Batch 120/2125] avg loss 0.00061639, throughput 3.99914K wps
[Epoch 30 Batch 150/2125] avg loss 0.000505856, throughput 3.99486K wps
[Epoch 30 Batch 180/2125] avg loss 0.000700347, throughput 4.00275K wps
[Epoch 30 Batch 210/2125] avg loss 0.000732188, throughput 4.00001K wps
[Epoch 30 Batch 240/2125] avg loss 0.000858117, throughput 3.99883K wps
[Epoch 30 Batch 270/2125] avg loss 0.000746414, throughput 3.99855K wps
[Epoch 30 Batch 300/2125] avg loss 0.000663594, throughput 4.00096K wps
[Epoch 30 Batch 330/2125] avg loss 0.000664492, throughput 3.99982K wps
[Epoch 30 Batch 360/2125] avg loss 0.000631212, throughput 4.00355K wps
[Epoch 30 Batch 390/2125] avg loss 0.000744493, throughput 4.00199K wps
[Epoch 30 Batch 420/2125] avg loss 0.000636179, throughput 3.99422K wps
[Epoch 30 Batch 450/2125] avg loss 0.000671849, throughput 4.00303K wps
[Epoch 30 Batch 480/2125] avg loss 0.000652879, throughput 3.99818K wps
[Epoch 30 Batch 510/2125] avg loss 0.000588253, throughput 3.99604K wps
[Epoch 30 Batch 540/2125] avg loss 0.000626489, throughput 3.99933K wps
[Epoch 30 Batch 570/2125] avg loss 0.00054316, throughput 3.99624K wps
[Epoch 30 Batch 600/2125] avg loss 0.000692588, throughput 4.00039K wps
[Epoch 30 Batch 630/2125] avg loss 0.000811866, throughput 3.99308K wps
[Epoch 30 Batch 660/2125] avg loss 0.000685939, throughput 3.99961K wps
[Epoch 30 Batch 690/2125] avg loss 0.000935545, throughput 4.00537K wps
[Epoch 30 Batch 720/2125] avg loss 0.000515659, throughput 4.0009K wps
[Epoch 30 Batch 750/2125] avg loss 0.000740801, throughput 4.00042K wps
[Epoch 30 Batch 780/2125] avg loss 0.000647098, throughput 4.00073K wps
[Epoch 30 Batch 810/2125] avg loss 0.000643846, throughput 3.99635K wps
[Epoch 30 Batch 840/2125] avg loss 0.000520487, throughput 3.99859K wps
[Epoch 30 Batch 870/2125] avg loss 0.000639696, throughput 3.99794K wps
[Epoch 30 Batch 900/2125] avg loss 0.000838838, throughput 3.99846K wps
[Epoch 30 Batch 930/2125] avg loss 0.000546262, throughput 3.99901K wps
[Epoch 30 Batch 960/2125] avg loss 0.000803195, throughput 3.99839K wps
[Epoch 30 Batch 990/2125] avg loss 0.000638765, throughput 3.9973K wps
[Epoch 30 Batch 1020/2125] avg loss 0.000663643, throughput 3.99693K wps
[Epoch 30 Batch 1050/2125] avg loss 0.000630729, throughput 3.9993K wps
[Epoch 30 Batch 1080/2125] avg loss 0.000797249, throughput 3.99913K wps
[Epoch 30 Batch 1110/2125] avg loss 0.00076418, throughput 3.9989K wps
[Epoch 30 Batch 1140/2125] avg loss 0.000674708, throughput 3.9964K wps
[Epoch 30 Batch 1170/2125] avg loss 0.00053373, throughput 3.99828K wps
[Epoch 30 Batch 1200/2125] avg loss 0.000858992, throughput 3.9967K wps
[Epoch 30 Batch 1230/2125] avg loss 0.000903863, throughput 3.99845K wps
[Epoch 30 Batch 1260/2125] avg loss 0.000666568, throughput 3.99753K wps
[Epoch 30 Batch 1290/2125] avg loss 0.0008097, throughput 3.99867K wps
[Epoch 30 Batch 1320/2125] avg loss 0.000937447, throughput 4.00201K wps
[Epoch 30 Batch 1350/2125] avg loss 0.000981574, throughput 3.99851K wps
[Epoch 30 Batch 1380/2125] avg loss 0.000803376, throughput 4.00356K wps
[Epoch 30 Batch 1410/2125] avg loss 0.000743747, throughput 3.99624K wps
[Epoch 30 Batch 1440/2125] avg loss 0.000905113, throughput 4.00548K wps
[Epoch 30 Batch 1470/2125] avg loss 0.000947276, throughput 4.00251K wps
[Epoch 30 Batch 1500/2125] avg loss 0.000614349, throughput 4.0054K wps
[Epoch 30 Batch 1530/2125] avg loss 0.000699731, throughput 3.99805K wps
[Epoch 30 Batch 1560/2125] avg loss 0.00100421, throughput 4.00104K wps
[Epoch 30 Batch 1590/2125] avg loss 0.00101241, throughput 3.9998K wps
[Epoch 30 Batch 1620/2125] avg loss 0.000645661, throughput 3.99941K wps
[Epoch 30 Batch 1650/2125] avg loss 0.000794009, throughput 4.00165K wps
[Epoch 30 Batch 1680/2125] avg loss 0.000834149, throughput 3.99896K wps
[Epoch 30 Batch 1710/2125] avg loss 0.000578249, throughput 3.99639K wps
[Epoch 30 Batch 1740/2125] avg loss 0.000853795, throughput 3.99996K wps
[Epoch 30 Batch 1770/2125] avg loss 0.000623496, throughput 3.99712K wps
[Epoch 30 Batch 1800/2125] avg loss 0.00103409, throughput 3.99493K wps
[Epoch 30 Batch 1830/2125] avg loss 0.000776601, throughput 3.99897K wps
[Epoch 30 Batch 1860/2125] avg loss 0.000968276, throughput 4.00274K wps
[Epoch 30 Batch 1890/2125] avg loss 0.000922933, throughput 3.99882K wps
[Epoch 30 Batch 1920/2125] avg loss 0.000745566, throughput 3.99439K wps
[Epoch 30 Batch 1950/2125] avg loss 0.000988853, throughput 3.99665K wps
[Epoch 30 Batch 1980/2125] avg loss 0.000718949, throughput 3.99735K wps
[Epoch 30 Batch 2010/2125] avg loss 0.000814874, throughput 3.99776K wps
[Epoch 30 Batch 2040/2125] avg loss 0.000741253, throughput 4.00027K wps
[Epoch 30 Batch 2070/2125] avg loss 0.00100422, throughput 4.00127K wps
[Epoch 30 Batch 2100/2125] avg loss 0.000873948, throughput 4.00904K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 30] train avg loss 0.000743403, test acc 0.9265, test avg loss 0.471367, throughput 4.00058K wps
[Epoch 31 Batch 30/2125] avg loss 0.000562114, throughput 4.08757K wps
[Epoch 31 Batch 60/2125] avg loss 0.000596844, throughput 3.99547K wps
[Epoch 31 Batch 90/2125] avg loss 0.000582605, throughput 3.99595K wps
[Epoch 31 Batch 120/2125] avg loss 0.000738184, throughput 4.00106K wps
[Epoch 31 Batch 150/2125] avg loss 0.000542339, throughput 3.99651K wps
[Epoch 31 Batch 180/2125] avg loss 0.000702906, throughput 3.99707K wps
[Epoch 31 Batch 210/2125] avg loss 0.000622419, throughput 3.99713K wps
[Epoch 31 Batch 240/2125] avg loss 0.000626121, throughput 3.99774K wps
[Epoch 31 Batch 270/2125] avg loss 0.000588582, throughput 3.99875K wps
[Epoch 31 Batch 300/2125] avg loss 0.000659028, throughput 3.99764K wps
[Epoch 31 Batch 330/2125] avg loss 0.000626745, throughput 3.99591K wps
[Epoch 31 Batch 360/2125] avg loss 0.000404509, throughput 3.99786K wps
[Epoch 31 Batch 390/2125] avg loss 0.000653009, throughput 3.99567K wps
[Epoch 31 Batch 420/2125] avg loss 0.00055823, throughput 3.99657K wps
[Epoch 31 Batch 450/2125] avg loss 0.000615714, throughput 4.00242K wps
[Epoch 31 Batch 480/2125] avg loss 0.000542521, throughput 3.99734K wps
[Epoch 31 Batch 510/2125] avg loss 0.000754268, throughput 3.99659K wps
[Epoch 31 Batch 540/2125] avg loss 0.000700408, throughput 3.99798K wps
[Epoch 31 Batch 570/2125] avg loss 0.00047011, throughput 4.0011K wps
[Epoch 31 Batch 600/2125] avg loss 0.00065937, throughput 3.99926K wps
[Epoch 31 Batch 630/2125] avg loss 0.000731632, throughput 3.99541K wps
[Epoch 31 Batch 660/2125] avg loss 0.00076307, throughput 3.99894K wps
[Epoch 31 Batch 690/2125] avg loss 0.00074155, throughput 3.99543K wps
[Epoch 31 Batch 720/2125] avg loss 0.000502855, throughput 3.99676K wps
[Epoch 31 Batch 750/2125] avg loss 0.000575837, throughput 3.99602K wps
[Epoch 31 Batch 780/2125] avg loss 0.000832985, throughput 3.99594K wps
[Epoch 31 Batch 810/2125] avg loss 0.000656416, throughput 3.99508K wps
[Epoch 31 Batch 840/2125] avg loss 0.00059712, throughput 3.9987K wps
[Epoch 31 Batch 870/2125] avg loss 0.000692417, throughput 4.00005K wps
[Epoch 31 Batch 900/2125] avg loss 0.000562758, throughput 3.99526K wps
[Epoch 31 Batch 930/2125] avg loss 0.000587624, throughput 3.99793K wps
[Epoch 31 Batch 960/2125] avg loss 0.000791173, throughput 3.99778K wps
[Epoch 31 Batch 990/2125] avg loss 0.000785071, throughput 3.99369K wps
[Epoch 31 Batch 1020/2125] avg loss 0.000778091, throughput 3.99028K wps
[Epoch 31 Batch 1050/2125] avg loss 0.000805277, throughput 3.99116K wps
[Epoch 31 Batch 1080/2125] avg loss 0.000844015, throughput 3.99827K wps
[Epoch 31 Batch 1110/2125] avg loss 0.000769042, throughput 3.99694K wps
[Epoch 31 Batch 1140/2125] avg loss 0.00056128, throughput 3.9989K wps
[Epoch 31 Batch 1170/2125] avg loss 0.000792887, throughput 3.99598K wps
[Epoch 31 Batch 1200/2125] avg loss 0.000702931, throughput 3.99766K wps
[Epoch 31 Batch 1230/2125] avg loss 0.000588298, throughput 4.00224K wps
[Epoch 31 Batch 1260/2125] avg loss 0.000688227, throughput 4.00193K wps
[Epoch 31 Batch 1290/2125] avg loss 0.000733043, throughput 3.99564K wps
[Epoch 31 Batch 1320/2125] avg loss 0.00107545, throughput 4.00049K wps
[Epoch 31 Batch 1350/2125] avg loss 0.000720277, throughput 3.99867K wps
[Epoch 31 Batch 1380/2125] avg loss 0.000783616, throughput 3.99655K wps
[Epoch 31 Batch 1410/2125] avg loss 0.000809735, throughput 4.00131K wps
[Epoch 31 Batch 1440/2125] avg loss 0.00099167, throughput 3.99527K wps
[Epoch 31 Batch 1470/2125] avg loss 0.00100378, throughput 4.00095K wps
[Epoch 31 Batch 1500/2125] avg loss 0.00096878, throughput 3.99978K wps
[Epoch 31 Batch 1530/2125] avg loss 0.000882958, throughput 3.9991K wps
[Epoch 31 Batch 1560/2125] avg loss 0.000693352, throughput 3.99657K wps
[Epoch 31 Batch 1590/2125] avg loss 0.000781832, throughput 3.9989K wps
[Epoch 31 Batch 1620/2125] avg loss 0.000847513, throughput 4.00115K wps
[Epoch 31 Batch 1650/2125] avg loss 0.000672331, throughput 3.99795K wps
[Epoch 31 Batch 1680/2125] avg loss 0.000662667, throughput 3.99563K wps
[Epoch 31 Batch 1710/2125] avg loss 0.000908484, throughput 3.9991K wps
[Epoch 31 Batch 1740/2125] avg loss 0.00115129, throughput 3.99859K wps
[Epoch 31 Batch 1770/2125] avg loss 0.000847584, throughput 3.99829K wps
[Epoch 31 Batch 1800/2125] avg loss 0.000730554, throughput 3.99971K wps
[Epoch 31 Batch 1830/2125] avg loss 0.000743838, throughput 4.00028K wps
[Epoch 31 Batch 1860/2125] avg loss 0.00084088, throughput 4.00396K wps
[Epoch 31 Batch 1890/2125] avg loss 0.0007661, throughput 4.00391K wps
[Epoch 31 Batch 1920/2125] avg loss 0.000605405, throughput 4.00542K wps
[Epoch 31 Batch 1950/2125] avg loss 0.000642492, throughput 3.99893K wps
[Epoch 31 Batch 1980/2125] avg loss 0.000864399, throughput 3.99498K wps
[Epoch 31 Batch 2010/2125] avg loss 0.00100363, throughput 4.00018K wps
[Epoch 31 Batch 2040/2125] avg loss 0.00104718, throughput 3.99988K wps
[Epoch 31 Batch 2070/2125] avg loss 0.00087473, throughput 3.99959K wps
[Epoch 31 Batch 2100/2125] avg loss 0.000701173, throughput 3.99875K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 31] train avg loss 0.000728408, test acc 0.9245, test avg loss 0.476435, throughput 3.99933K wps
[Epoch 32 Batch 30/2125] avg loss 0.000566543, throughput 4.08522K wps
[Epoch 32 Batch 60/2125] avg loss 0.00078045, throughput 3.99622K wps
[Epoch 32 Batch 90/2125] avg loss 0.000559478, throughput 4.00163K wps
[Epoch 32 Batch 120/2125] avg loss 0.000517989, throughput 3.99637K wps
[Epoch 32 Batch 150/2125] avg loss 0.000543092, throughput 3.99225K wps
[Epoch 32 Batch 180/2125] avg loss 0.000562026, throughput 3.99425K wps
[Epoch 32 Batch 210/2125] avg loss 0.000660242, throughput 3.99509K wps
[Epoch 32 Batch 240/2125] avg loss 0.000690587, throughput 3.98759K wps
[Epoch 32 Batch 270/2125] avg loss 0.000404988, throughput 3.99516K wps
[Epoch 32 Batch 300/2125] avg loss 0.000504552, throughput 3.99834K wps
[Epoch 32 Batch 330/2125] avg loss 0.000571348, throughput 3.99469K wps
[Epoch 32 Batch 360/2125] avg loss 0.000853378, throughput 3.99545K wps
[Epoch 32 Batch 390/2125] avg loss 0.000500844, throughput 4.0007K wps
[Epoch 32 Batch 420/2125] avg loss 0.000649048, throughput 3.99611K wps
[Epoch 32 Batch 450/2125] avg loss 0.000794052, throughput 4.0016K wps
[Epoch 32 Batch 480/2125] avg loss 0.000622439, throughput 3.99683K wps
[Epoch 32 Batch 510/2125] avg loss 0.000694158, throughput 3.99921K wps
[Epoch 32 Batch 540/2125] avg loss 0.000608309, throughput 3.99757K wps
[Epoch 32 Batch 570/2125] avg loss 0.000564789, throughput 4.00066K wps
[Epoch 32 Batch 600/2125] avg loss 0.000594967, throughput 4.00072K wps
[Epoch 32 Batch 630/2125] avg loss 0.000624351, throughput 3.99981K wps
[Epoch 32 Batch 660/2125] avg loss 0.000553642, throughput 3.995K wps
[Epoch 32 Batch 690/2125] avg loss 0.000632714, throughput 3.99912K wps
[Epoch 32 Batch 720/2125] avg loss 0.000739224, throughput 3.99902K wps
[Epoch 32 Batch 750/2125] avg loss 0.00069068, throughput 3.99426K wps
[Epoch 32 Batch 780/2125] avg loss 0.000590273, throughput 3.99668K wps
[Epoch 32 Batch 810/2125] avg loss 0.0005952, throughput 3.99815K wps
[Epoch 32 Batch 840/2125] avg loss 0.000774886, throughput 4.00668K wps
[Epoch 32 Batch 870/2125] avg loss 0.000566418, throughput 4.00536K wps
[Epoch 32 Batch 900/2125] avg loss 0.000580927, throughput 4.00236K wps
[Epoch 32 Batch 930/2125] avg loss 0.000847092, throughput 3.99488K wps
[Epoch 32 Batch 960/2125] avg loss 0.000560955, throughput 3.99536K wps
[Epoch 32 Batch 990/2125] avg loss 0.000629422, throughput 4.00151K wps
[Epoch 32 Batch 1020/2125] avg loss 0.000785722, throughput 3.99546K wps
[Epoch 32 Batch 1050/2125] avg loss 0.00068028, throughput 3.9981K wps
[Epoch 32 Batch 1080/2125] avg loss 0.000696045, throughput 3.99637K wps
[Epoch 32 Batch 1110/2125] avg loss 0.000865904, throughput 3.99693K wps
[Epoch 32 Batch 1140/2125] avg loss 0.000726607, throughput 3.99845K wps
[Epoch 32 Batch 1170/2125] avg loss 0.000758688, throughput 3.99696K wps
[Epoch 32 Batch 1200/2125] avg loss 0.000586353, throughput 3.9992K wps
[Epoch 32 Batch 1230/2125] avg loss 0.000872436, throughput 3.9979K wps
[Epoch 32 Batch 1260/2125] avg loss 0.000838061, throughput 3.9957K wps
[Epoch 32 Batch 1290/2125] avg loss 0.000888865, throughput 3.99964K wps
[Epoch 32 Batch 1320/2125] avg loss 0.00065767, throughput 4.00141K wps
[Epoch 32 Batch 1350/2125] avg loss 0.000509297, throughput 4.00035K wps
[Epoch 32 Batch 1380/2125] avg loss 0.000970279, throughput 3.99662K wps
[Epoch 32 Batch 1410/2125] avg loss 0.00101615, throughput 3.9966K wps
[Epoch 32 Batch 1440/2125] avg loss 0.000663084, throughput 4.0021K wps
[Epoch 32 Batch 1470/2125] avg loss 0.000732383, throughput 4.00138K wps
[Epoch 32 Batch 1500/2125] avg loss 0.000779812, throughput 3.99849K wps
[Epoch 32 Batch 1530/2125] avg loss 0.00111858, throughput 3.99789K wps
[Epoch 32 Batch 1560/2125] avg loss 0.000845754, throughput 4.00044K wps
[Epoch 32 Batch 1590/2125] avg loss 0.000992209, throughput 3.99701K wps
[Epoch 32 Batch 1620/2125] avg loss 0.000726248, throughput 4.00189K wps
[Epoch 32 Batch 1650/2125] avg loss 0.000531464, throughput 4.00036K wps
[Epoch 32 Batch 1680/2125] avg loss 0.000855222, throughput 4.00003K wps
[Epoch 32 Batch 1710/2125] avg loss 0.000743592, throughput 4.00087K wps
[Epoch 32 Batch 1740/2125] avg loss 0.000890789, throughput 3.99989K wps
[Epoch 32 Batch 1770/2125] avg loss 0.00067579, throughput 4.00116K wps
[Epoch 32 Batch 1800/2125] avg loss 0.000743043, throughput 3.99829K wps
[Epoch 32 Batch 1830/2125] avg loss 0.00073145, throughput 4.00078K wps
[Epoch 32 Batch 1860/2125] avg loss 0.000644021, throughput 4.00148K wps
[Epoch 32 Batch 1890/2125] avg loss 0.000922808, throughput 4.00105K wps
[Epoch 32 Batch 1920/2125] avg loss 0.00077047, throughput 3.99661K wps
[Epoch 32 Batch 1950/2125] avg loss 0.0008374, throughput 4.00376K wps
[Epoch 32 Batch 1980/2125] avg loss 0.00065675, throughput 3.99639K wps
[Epoch 32 Batch 2010/2125] avg loss 0.000595434, throughput 4.00013K wps
[Epoch 32 Batch 2040/2125] avg loss 0.00100549, throughput 3.99945K wps
[Epoch 32 Batch 2070/2125] avg loss 0.000893263, throughput 3.99691K wps
[Epoch 32 Batch 2100/2125] avg loss 0.000783547, throughput 3.99941K wps
Begin Testing...
[Batch 30/237] elapsed 0.46 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 32] train avg loss 0.000712937, test acc 0.9247, test avg loss 0.476527, throughput 3.99967K wps
[Epoch 33 Batch 30/2125] avg loss 0.000741799, throughput 4.0961K wps
[Epoch 33 Batch 60/2125] avg loss 0.000524465, throughput 4.00781K wps
[Epoch 33 Batch 90/2125] avg loss 0.000512383, throughput 3.99973K wps
[Epoch 33 Batch 120/2125] avg loss 0.000663212, throughput 3.99965K wps
[Epoch 33 Batch 150/2125] avg loss 0.00057798, throughput 3.99828K wps
[Epoch 33 Batch 180/2125] avg loss 0.000567367, throughput 3.99698K wps
[Epoch 33 Batch 210/2125] avg loss 0.000624383, throughput 4.00059K wps
[Epoch 33 Batch 240/2125] avg loss 0.000466776, throughput 3.98654K wps
[Epoch 33 Batch 270/2125] avg loss 0.000593926, throughput 3.99286K wps
[Epoch 33 Batch 300/2125] avg loss 0.000536811, throughput 4.00451K wps
[Epoch 33 Batch 330/2125] avg loss 0.000610953, throughput 4.00495K wps
[Epoch 33 Batch 360/2125] avg loss 0.000536142, throughput 3.99828K wps
[Epoch 33 Batch 390/2125] avg loss 0.00057203, throughput 3.9984K wps
[Epoch 33 Batch 420/2125] avg loss 0.000531544, throughput 3.99516K wps
[Epoch 33 Batch 450/2125] avg loss 0.000725259, throughput 3.99925K wps
[Epoch 33 Batch 480/2125] avg loss 0.000598801, throughput 3.99809K wps
[Epoch 33 Batch 510/2125] avg loss 0.000717601, throughput 4.00154K wps
[Epoch 33 Batch 540/2125] avg loss 0.000709467, throughput 3.99858K wps
[Epoch 33 Batch 570/2125] avg loss 0.000571049, throughput 4.00035K wps
[Epoch 33 Batch 600/2125] avg loss 0.000512535, throughput 4.00071K wps
[Epoch 33 Batch 630/2125] avg loss 0.000834342, throughput 3.99725K wps
[Epoch 33 Batch 660/2125] avg loss 0.000535817, throughput 3.99737K wps
[Epoch 33 Batch 690/2125] avg loss 0.000540229, throughput 4.00042K wps
[Epoch 33 Batch 720/2125] avg loss 0.000714691, throughput 3.99778K wps
[Epoch 33 Batch 750/2125] avg loss 0.000668927, throughput 3.9954K wps
[Epoch 33 Batch 780/2125] avg loss 0.000731852, throughput 4.00014K wps
[Epoch 33 Batch 810/2125] avg loss 0.00075728, throughput 3.99677K wps
[Epoch 33 Batch 840/2125] avg loss 0.000930594, throughput 3.99953K wps
[Epoch 33 Batch 870/2125] avg loss 0.000764511, throughput 3.99564K wps
[Epoch 33 Batch 900/2125] avg loss 0.000564065, throughput 4.00276K wps
[Epoch 33 Batch 930/2125] avg loss 0.000336129, throughput 3.99973K wps
[Epoch 33 Batch 960/2125] avg loss 0.000738087, throughput 3.998K wps
[Epoch 33 Batch 990/2125] avg loss 0.000752113, throughput 3.99877K wps
[Epoch 33 Batch 1020/2125] avg loss 0.00054674, throughput 4.00218K wps
[Epoch 33 Batch 1050/2125] avg loss 0.000763591, throughput 4.00164K wps
[Epoch 33 Batch 1080/2125] avg loss 0.000679502, throughput 3.99842K wps
[Epoch 33 Batch 1110/2125] avg loss 0.000913868, throughput 4.00084K wps
[Epoch 33 Batch 1140/2125] avg loss 0.000612898, throughput 3.99846K wps
[Epoch 33 Batch 1170/2125] avg loss 0.000891081, throughput 3.9997K wps
[Epoch 33 Batch 1200/2125] avg loss 0.000705084, throughput 3.99825K wps
[Epoch 33 Batch 1230/2125] avg loss 0.000761227, throughput 4.00257K wps
[Epoch 33 Batch 1260/2125] avg loss 0.000679416, throughput 3.99729K wps
[Epoch 33 Batch 1290/2125] avg loss 0.000633341, throughput 3.99716K wps
[Epoch 33 Batch 1320/2125] avg loss 0.00058984, throughput 4.00112K wps
[Epoch 33 Batch 1350/2125] avg loss 0.000933687, throughput 3.99367K wps
[Epoch 33 Batch 1380/2125] avg loss 0.00112575, throughput 3.99785K wps
[Epoch 33 Batch 1410/2125] avg loss 0.00072818, throughput 3.99783K wps
[Epoch 33 Batch 1440/2125] avg loss 0.000815996, throughput 3.99932K wps
[Epoch 33 Batch 1470/2125] avg loss 0.000718646, throughput 4.00074K wps
[Epoch 33 Batch 1500/2125] avg loss 0.000771352, throughput 3.99697K wps
[Epoch 33 Batch 1530/2125] avg loss 0.000792576, throughput 3.99557K wps
[Epoch 33 Batch 1560/2125] avg loss 0.000752157, throughput 3.99872K wps
[Epoch 33 Batch 1590/2125] avg loss 0.000737809, throughput 3.99873K wps
[Epoch 33 Batch 1620/2125] avg loss 0.000596318, throughput 3.99635K wps
[Epoch 33 Batch 1650/2125] avg loss 0.000622394, throughput 3.99501K wps
[Epoch 33 Batch 1680/2125] avg loss 0.000961721, throughput 4.00277K wps
[Epoch 33 Batch 1710/2125] avg loss 0.000673463, throughput 4.00106K wps
[Epoch 33 Batch 1740/2125] avg loss 0.000638156, throughput 4.00103K wps
[Epoch 33 Batch 1770/2125] avg loss 0.000804236, throughput 3.99874K wps
[Epoch 33 Batch 1800/2125] avg loss 0.000506351, throughput 3.99876K wps
[Epoch 33 Batch 1830/2125] avg loss 0.000865973, throughput 3.99993K wps
[Epoch 33 Batch 1860/2125] avg loss 0.000642712, throughput 3.99659K wps
[Epoch 33 Batch 1890/2125] avg loss 0.000530124, throughput 4.00181K wps
[Epoch 33 Batch 1920/2125] avg loss 0.000591402, throughput 3.99836K wps
[Epoch 33 Batch 1950/2125] avg loss 0.00102414, throughput 3.99782K wps
[Epoch 33 Batch 1980/2125] avg loss 0.00060695, throughput 4.00372K wps
[Epoch 33 Batch 2010/2125] avg loss 0.00110833, throughput 4.00448K wps
[Epoch 33 Batch 2040/2125] avg loss 0.000565368, throughput 4.00471K wps
[Epoch 33 Batch 2070/2125] avg loss 0.000833045, throughput 3.99754K wps
[Epoch 33 Batch 2100/2125] avg loss 0.000701266, throughput 3.99949K wps
Begin Testing...
[Batch 30/237] elapsed 0.46 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 33] train avg loss 0.000688343, test acc 0.9258, test avg loss 0.486577, throughput 4.00046K wps
[Epoch 34 Batch 30/2125] avg loss 0.000452605, throughput 4.0941K wps
[Epoch 34 Batch 60/2125] avg loss 0.00062591, throughput 3.99923K wps
[Epoch 34 Batch 90/2125] avg loss 0.000422064, throughput 4.00597K wps
[Epoch 34 Batch 120/2125] avg loss 0.00045391, throughput 4.00429K wps
[Epoch 34 Batch 150/2125] avg loss 0.0005394, throughput 4.00433K wps
[Epoch 34 Batch 180/2125] avg loss 0.000640173, throughput 4.00016K wps
[Epoch 34 Batch 210/2125] avg loss 0.000597867, throughput 3.99748K wps
[Epoch 34 Batch 240/2125] avg loss 0.000636002, throughput 3.99708K wps
[Epoch 34 Batch 270/2125] avg loss 0.000592117, throughput 3.99885K wps
[Epoch 34 Batch 300/2125] avg loss 0.000755573, throughput 3.99777K wps
[Epoch 34 Batch 330/2125] avg loss 0.000693169, throughput 4.00048K wps
[Epoch 34 Batch 360/2125] avg loss 0.000518609, throughput 3.9986K wps
[Epoch 34 Batch 390/2125] avg loss 0.000620868, throughput 3.99896K wps
[Epoch 34 Batch 420/2125] avg loss 0.000785244, throughput 3.99955K wps
[Epoch 34 Batch 450/2125] avg loss 0.000639306, throughput 4.00321K wps
[Epoch 34 Batch 480/2125] avg loss 0.000479328, throughput 3.99719K wps
[Epoch 34 Batch 510/2125] avg loss 0.000629679, throughput 3.99758K wps
[Epoch 34 Batch 540/2125] avg loss 0.000500492, throughput 4.00127K wps
[Epoch 34 Batch 570/2125] avg loss 0.000608785, throughput 4.00092K wps
[Epoch 34 Batch 600/2125] avg loss 0.000580605, throughput 3.99879K wps
[Epoch 34 Batch 630/2125] avg loss 0.000611208, throughput 3.99868K wps
[Epoch 34 Batch 660/2125] avg loss 0.000397887, throughput 4.00205K wps
[Epoch 34 Batch 690/2125] avg loss 0.000586791, throughput 3.99965K wps
[Epoch 34 Batch 720/2125] avg loss 0.000546589, throughput 4.00339K wps
[Epoch 34 Batch 750/2125] avg loss 0.000644049, throughput 4.00513K wps
[Epoch 34 Batch 780/2125] avg loss 0.000556173, throughput 4.00267K wps
[Epoch 34 Batch 810/2125] avg loss 0.000726711, throughput 4.00003K wps
[Epoch 34 Batch 840/2125] avg loss 0.00049288, throughput 3.99667K wps
[Epoch 34 Batch 870/2125] avg loss 0.000725554, throughput 3.99643K wps
[Epoch 34 Batch 900/2125] avg loss 0.000485747, throughput 4.00091K wps
[Epoch 34 Batch 930/2125] avg loss 0.000753506, throughput 4.00109K wps
[Epoch 34 Batch 960/2125] avg loss 0.000758386, throughput 4.00132K wps
[Epoch 34 Batch 990/2125] avg loss 0.000598284, throughput 4.00526K wps
[Epoch 34 Batch 1020/2125] avg loss 0.000734019, throughput 4.00409K wps
[Epoch 34 Batch 1050/2125] avg loss 0.000910577, throughput 4.00343K wps
[Epoch 34 Batch 1080/2125] avg loss 0.000530594, throughput 4.00078K wps
[Epoch 34 Batch 1110/2125] avg loss 0.000597591, throughput 4.00069K wps
[Epoch 34 Batch 1140/2125] avg loss 0.000767544, throughput 3.99882K wps
[Epoch 34 Batch 1170/2125] avg loss 0.000780567, throughput 3.99897K wps
[Epoch 34 Batch 1200/2125] avg loss 0.000794481, throughput 3.99503K wps
[Epoch 34 Batch 1230/2125] avg loss 0.000670238, throughput 3.99895K wps
[Epoch 34 Batch 1260/2125] avg loss 0.00067188, throughput 3.9998K wps
[Epoch 34 Batch 1290/2125] avg loss 0.000715262, throughput 3.99958K wps
[Epoch 34 Batch 1320/2125] avg loss 0.000827315, throughput 3.99853K wps
[Epoch 34 Batch 1350/2125] avg loss 0.000516449, throughput 3.99736K wps
[Epoch 34 Batch 1380/2125] avg loss 0.000527505, throughput 3.99787K wps
[Epoch 34 Batch 1410/2125] avg loss 0.000622352, throughput 3.9955K wps
[Epoch 34 Batch 1440/2125] avg loss 0.000762432, throughput 3.99926K wps
[Epoch 34 Batch 1470/2125] avg loss 0.000864129, throughput 3.99757K wps
[Epoch 34 Batch 1500/2125] avg loss 0.00106258, throughput 3.99895K wps
[Epoch 34 Batch 1530/2125] avg loss 0.000783908, throughput 3.9964K wps
[Epoch 34 Batch 1560/2125] avg loss 0.000781077, throughput 3.99624K wps
[Epoch 34 Batch 1590/2125] avg loss 0.000882333, throughput 3.99763K wps
[Epoch 34 Batch 1620/2125] avg loss 0.000753814, throughput 3.99926K wps
[Epoch 34 Batch 1650/2125] avg loss 0.000752073, throughput 3.98559K wps
[Epoch 34 Batch 1680/2125] avg loss 0.000703431, throughput 4.00046K wps
[Epoch 34 Batch 1710/2125] avg loss 0.000646851, throughput 4.00874K wps
[Epoch 34 Batch 1740/2125] avg loss 0.000815093, throughput 4.00308K wps
[Epoch 34 Batch 1770/2125] avg loss 0.000961444, throughput 4.00127K wps
[Epoch 34 Batch 1800/2125] avg loss 0.00058127, throughput 3.99876K wps
[Epoch 34 Batch 1830/2125] avg loss 0.000814104, throughput 3.99631K wps
[Epoch 34 Batch 1860/2125] avg loss 0.000830717, throughput 3.99501K wps
[Epoch 34 Batch 1890/2125] avg loss 0.000559434, throughput 3.99968K wps
[Epoch 34 Batch 1920/2125] avg loss 0.000927427, throughput 3.99657K wps
[Epoch 34 Batch 1950/2125] avg loss 0.000588326, throughput 3.99604K wps
[Epoch 34 Batch 1980/2125] avg loss 0.000578754, throughput 3.99864K wps
[Epoch 34 Batch 2010/2125] avg loss 0.000641508, throughput 3.9991K wps
[Epoch 34 Batch 2040/2125] avg loss 0.000742528, throughput 3.99303K wps
[Epoch 34 Batch 2070/2125] avg loss 0.000657367, throughput 4.00124K wps
[Epoch 34 Batch 2100/2125] avg loss 0.000716526, throughput 3.99746K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 34] train avg loss 0.000667978, test acc 0.9264, test avg loss 0.504697, throughput 4.00074K wps
[Epoch 35 Batch 30/2125] avg loss 0.000447106, throughput 4.09102K wps
[Epoch 35 Batch 60/2125] avg loss 0.000536364, throughput 3.99895K wps
[Epoch 35 Batch 90/2125] avg loss 0.000694358, throughput 3.99972K wps
[Epoch 35 Batch 120/2125] avg loss 0.00054274, throughput 3.9997K wps
[Epoch 35 Batch 150/2125] avg loss 0.000570825, throughput 3.99822K wps
[Epoch 35 Batch 180/2125] avg loss 0.000512974, throughput 3.99094K wps
[Epoch 35 Batch 210/2125] avg loss 0.000493516, throughput 3.99983K wps
[Epoch 35 Batch 240/2125] avg loss 0.000326099, throughput 3.99866K wps
[Epoch 35 Batch 270/2125] avg loss 0.000783189, throughput 3.99719K wps
[Epoch 35 Batch 300/2125] avg loss 0.000543812, throughput 3.99755K wps
[Epoch 35 Batch 330/2125] avg loss 0.000449742, throughput 3.99679K wps
[Epoch 35 Batch 360/2125] avg loss 0.000438698, throughput 3.99916K wps
[Epoch 35 Batch 390/2125] avg loss 0.000755718, throughput 3.99951K wps
[Epoch 35 Batch 420/2125] avg loss 0.000513864, throughput 3.99743K wps
[Epoch 35 Batch 450/2125] avg loss 0.000690165, throughput 3.99837K wps
[Epoch 35 Batch 480/2125] avg loss 0.000558821, throughput 3.9938K wps
[Epoch 35 Batch 510/2125] avg loss 0.000514694, throughput 3.99724K wps
[Epoch 35 Batch 540/2125] avg loss 0.000636741, throughput 3.9994K wps
[Epoch 35 Batch 570/2125] avg loss 0.00033338, throughput 3.99571K wps
[Epoch 35 Batch 600/2125] avg loss 0.000556613, throughput 3.99144K wps
[Epoch 35 Batch 630/2125] avg loss 0.000642543, throughput 3.99467K wps
[Epoch 35 Batch 660/2125] avg loss 0.000729844, throughput 3.99865K wps
[Epoch 35 Batch 690/2125] avg loss 0.000918392, throughput 3.99901K wps
[Epoch 35 Batch 720/2125] avg loss 0.000756423, throughput 3.99868K wps
[Epoch 35 Batch 750/2125] avg loss 0.000714865, throughput 3.99854K wps
[Epoch 35 Batch 780/2125] avg loss 0.000495457, throughput 3.99941K wps
[Epoch 35 Batch 810/2125] avg loss 0.000574018, throughput 3.99585K wps
[Epoch 35 Batch 840/2125] avg loss 0.000673645, throughput 3.99526K wps
[Epoch 35 Batch 870/2125] avg loss 0.000597183, throughput 3.9971K wps
[Epoch 35 Batch 900/2125] avg loss 0.000688386, throughput 3.99705K wps
[Epoch 35 Batch 930/2125] avg loss 0.000719584, throughput 3.9955K wps
[Epoch 35 Batch 960/2125] avg loss 0.000718391, throughput 3.99755K wps
[Epoch 35 Batch 990/2125] avg loss 0.000567337, throughput 3.99731K wps
[Epoch 35 Batch 1020/2125] avg loss 0.000583596, throughput 3.99961K wps
[Epoch 35 Batch 1050/2125] avg loss 0.000778093, throughput 3.99491K wps
[Epoch 35 Batch 1080/2125] avg loss 0.000653532, throughput 3.99696K wps
[Epoch 35 Batch 1110/2125] avg loss 0.000498266, throughput 4.00304K wps
[Epoch 35 Batch 1140/2125] avg loss 0.000596413, throughput 3.99978K wps
[Epoch 35 Batch 1170/2125] avg loss 0.000571021, throughput 3.99672K wps
[Epoch 35 Batch 1200/2125] avg loss 0.000605239, throughput 3.99876K wps
[Epoch 35 Batch 1230/2125] avg loss 0.000753937, throughput 3.99687K wps
[Epoch 35 Batch 1260/2125] avg loss 0.000639298, throughput 3.99742K wps
[Epoch 35 Batch 1290/2125] avg loss 0.000675465, throughput 4.00024K wps
[Epoch 35 Batch 1320/2125] avg loss 0.000699599, throughput 3.99757K wps
[Epoch 35 Batch 1350/2125] avg loss 0.000740905, throughput 4.00101K wps
[Epoch 35 Batch 1380/2125] avg loss 0.000601953, throughput 3.99661K wps
[Epoch 35 Batch 1410/2125] avg loss 0.000615455, throughput 3.99759K wps
[Epoch 35 Batch 1440/2125] avg loss 0.000574169, throughput 3.99582K wps
[Epoch 35 Batch 1470/2125] avg loss 0.000711218, throughput 4.00082K wps
[Epoch 35 Batch 1500/2125] avg loss 0.000614694, throughput 3.99972K wps
[Epoch 35 Batch 1530/2125] avg loss 0.000566401, throughput 3.99629K wps
[Epoch 35 Batch 1560/2125] avg loss 0.000707263, throughput 4.0013K wps
[Epoch 35 Batch 1590/2125] avg loss 0.000560952, throughput 4.00188K wps
[Epoch 35 Batch 1620/2125] avg loss 0.000830883, throughput 3.99886K wps
[Epoch 35 Batch 1650/2125] avg loss 0.000703042, throughput 4.00003K wps
[Epoch 35 Batch 1680/2125] avg loss 0.000805586, throughput 3.99708K wps
[Epoch 35 Batch 1710/2125] avg loss 0.00047678, throughput 4.00014K wps
[Epoch 35 Batch 1740/2125] avg loss 0.00104092, throughput 4.0005K wps
[Epoch 35 Batch 1770/2125] avg loss 0.000611379, throughput 3.99548K wps
[Epoch 35 Batch 1800/2125] avg loss 0.000505348, throughput 3.9938K wps
[Epoch 35 Batch 1830/2125] avg loss 0.000721301, throughput 3.99505K wps
[Epoch 35 Batch 1860/2125] avg loss 0.000575498, throughput 4.0017K wps
[Epoch 35 Batch 1890/2125] avg loss 0.00076176, throughput 3.99882K wps
[Epoch 35 Batch 1920/2125] avg loss 0.000801515, throughput 4.00014K wps
[Epoch 35 Batch 1950/2125] avg loss 0.000648938, throughput 3.99304K wps
[Epoch 35 Batch 1980/2125] avg loss 0.000831884, throughput 3.99784K wps
[Epoch 35 Batch 2010/2125] avg loss 0.000840848, throughput 3.99567K wps
[Epoch 35 Batch 2040/2125] avg loss 0.000865857, throughput 3.99723K wps
[Epoch 35 Batch 2070/2125] avg loss 0.000561674, throughput 3.9986K wps
[Epoch 35 Batch 2100/2125] avg loss 0.000914346, throughput 3.99397K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 35] train avg loss 0.000644781, test acc 0.9255, test avg loss 0.501557, throughput 3.99909K wps
[Epoch 36 Batch 30/2125] avg loss 0.000568021, throughput 4.09295K wps
[Epoch 36 Batch 60/2125] avg loss 0.000522178, throughput 4.00137K wps
[Epoch 36 Batch 90/2125] avg loss 0.000458351, throughput 3.99891K wps
[Epoch 36 Batch 120/2125] avg loss 0.000601583, throughput 4.00132K wps
[Epoch 36 Batch 150/2125] avg loss 0.000647641, throughput 3.9995K wps
[Epoch 36 Batch 180/2125] avg loss 0.000624277, throughput 3.99915K wps
[Epoch 36 Batch 210/2125] avg loss 0.000568933, throughput 3.99841K wps
[Epoch 36 Batch 240/2125] avg loss 0.000469592, throughput 3.99773K wps
[Epoch 36 Batch 270/2125] avg loss 0.000473659, throughput 4.00432K wps
[Epoch 36 Batch 300/2125] avg loss 0.000621363, throughput 3.99621K wps
[Epoch 36 Batch 330/2125] avg loss 0.000603391, throughput 4.00469K wps
[Epoch 36 Batch 360/2125] avg loss 0.000501094, throughput 4.00142K wps
[Epoch 36 Batch 390/2125] avg loss 0.000487282, throughput 4.00047K wps
[Epoch 36 Batch 420/2125] avg loss 0.000486226, throughput 4.00265K wps
[Epoch 36 Batch 450/2125] avg loss 0.000631894, throughput 3.99545K wps
[Epoch 36 Batch 480/2125] avg loss 0.000520547, throughput 3.99834K wps
[Epoch 36 Batch 510/2125] avg loss 0.000459551, throughput 3.99751K wps
[Epoch 36 Batch 540/2125] avg loss 0.000539961, throughput 3.998K wps
[Epoch 36 Batch 570/2125] avg loss 0.000719802, throughput 3.99776K wps
[Epoch 36 Batch 600/2125] avg loss 0.000602143, throughput 4.00103K wps
[Epoch 36 Batch 630/2125] avg loss 0.000594162, throughput 3.99945K wps
[Epoch 36 Batch 660/2125] avg loss 0.00050007, throughput 3.99684K wps
[Epoch 36 Batch 690/2125] avg loss 0.000516313, throughput 3.99969K wps
[Epoch 36 Batch 720/2125] avg loss 0.000657458, throughput 3.99654K wps
[Epoch 36 Batch 750/2125] avg loss 0.000758164, throughput 3.99729K wps
[Epoch 36 Batch 780/2125] avg loss 0.000532993, throughput 4.0004K wps
[Epoch 36 Batch 810/2125] avg loss 0.000625491, throughput 3.99878K wps
[Epoch 36 Batch 840/2125] avg loss 0.000726351, throughput 3.9953K wps
[Epoch 36 Batch 870/2125] avg loss 0.00053797, throughput 3.9802K wps
[Epoch 36 Batch 900/2125] avg loss 0.000618732, throughput 3.99961K wps
[Epoch 36 Batch 930/2125] avg loss 0.000672407, throughput 4.00592K wps
[Epoch 36 Batch 960/2125] avg loss 0.00082263, throughput 4.00321K wps
[Epoch 36 Batch 990/2125] avg loss 0.000800913, throughput 3.99877K wps
[Epoch 36 Batch 1020/2125] avg loss 0.000889533, throughput 4.00461K wps
[Epoch 36 Batch 1050/2125] avg loss 0.000853757, throughput 3.9989K wps
[Epoch 36 Batch 1080/2125] avg loss 0.000642823, throughput 4.00026K wps
[Epoch 36 Batch 1110/2125] avg loss 0.000495772, throughput 3.99924K wps
[Epoch 36 Batch 1140/2125] avg loss 0.000674923, throughput 4.00069K wps
[Epoch 36 Batch 1170/2125] avg loss 0.000542702, throughput 3.99499K wps
[Epoch 36 Batch 1200/2125] avg loss 0.000636929, throughput 3.99856K wps
[Epoch 36 Batch 1230/2125] avg loss 0.000396196, throughput 3.99887K wps
[Epoch 36 Batch 1260/2125] avg loss 0.000682661, throughput 4.00194K wps
[Epoch 36 Batch 1290/2125] avg loss 0.000517533, throughput 3.99927K wps
[Epoch 36 Batch 1320/2125] avg loss 0.000551599, throughput 3.9979K wps
[Epoch 36 Batch 1350/2125] avg loss 0.000705336, throughput 3.99838K wps
[Epoch 36 Batch 1380/2125] avg loss 0.000560397, throughput 3.99509K wps
[Epoch 36 Batch 1410/2125] avg loss 0.000501754, throughput 3.99944K wps
[Epoch 36 Batch 1440/2125] avg loss 0.000707981, throughput 3.99935K wps
[Epoch 36 Batch 1470/2125] avg loss 0.000565611, throughput 4.00212K wps
[Epoch 36 Batch 1500/2125] avg loss 0.000614669, throughput 4.00424K wps
[Epoch 36 Batch 1530/2125] avg loss 0.000854074, throughput 3.99829K wps
[Epoch 36 Batch 1560/2125] avg loss 0.000778095, throughput 4.0005K wps
[Epoch 36 Batch 1590/2125] avg loss 0.000799434, throughput 3.99969K wps
[Epoch 36 Batch 1620/2125] avg loss 0.000612099, throughput 3.99648K wps
[Epoch 36 Batch 1650/2125] avg loss 0.000703342, throughput 3.99977K wps
[Epoch 36 Batch 1680/2125] avg loss 0.000707363, throughput 4.00063K wps
[Epoch 36 Batch 1710/2125] avg loss 0.000635853, throughput 3.99488K wps
[Epoch 36 Batch 1740/2125] avg loss 0.000553167, throughput 3.99643K wps
[Epoch 36 Batch 1770/2125] avg loss 0.000741592, throughput 3.99967K wps
[Epoch 36 Batch 1800/2125] avg loss 0.000691627, throughput 4.00089K wps
[Epoch 36 Batch 1830/2125] avg loss 0.000685294, throughput 3.99782K wps
[Epoch 36 Batch 1860/2125] avg loss 0.000713333, throughput 4.00296K wps
[Epoch 36 Batch 1890/2125] avg loss 0.000725928, throughput 3.99754K wps
[Epoch 36 Batch 1920/2125] avg loss 0.000601362, throughput 3.9974K wps
[Epoch 36 Batch 1950/2125] avg loss 0.000640869, throughput 3.99382K wps
[Epoch 36 Batch 1980/2125] avg loss 0.000608051, throughput 3.99904K wps
[Epoch 36 Batch 2010/2125] avg loss 0.000618961, throughput 3.99635K wps
[Epoch 36 Batch 2040/2125] avg loss 0.000517726, throughput 3.99644K wps
[Epoch 36 Batch 2070/2125] avg loss 0.000734555, throughput 3.99606K wps
[Epoch 36 Batch 2100/2125] avg loss 0.000699936, throughput 4.00021K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 36] train avg loss 0.000623852, test acc 0.9259, test avg loss 0.513592, throughput 4.00023K wps
[Epoch 37 Batch 30/2125] avg loss 0.000489082, throughput 4.08841K wps
[Epoch 37 Batch 60/2125] avg loss 0.000541815, throughput 3.99997K wps
[Epoch 37 Batch 90/2125] avg loss 0.000548714, throughput 3.9978K wps
[Epoch 37 Batch 120/2125] avg loss 0.000390184, throughput 3.9959K wps
[Epoch 37 Batch 150/2125] avg loss 0.000611791, throughput 3.99746K wps
[Epoch 37 Batch 180/2125] avg loss 0.000494616, throughput 3.99923K wps
[Epoch 37 Batch 210/2125] avg loss 0.000509664, throughput 4.00122K wps
[Epoch 37 Batch 240/2125] avg loss 0.000569262, throughput 4.00127K wps
[Epoch 37 Batch 270/2125] avg loss 0.000503203, throughput 4.00049K wps
[Epoch 37 Batch 300/2125] avg loss 0.000360989, throughput 3.99631K wps
[Epoch 37 Batch 330/2125] avg loss 0.00049511, throughput 3.99575K wps
[Epoch 37 Batch 360/2125] avg loss 0.000606094, throughput 3.99942K wps
[Epoch 37 Batch 390/2125] avg loss 0.000696008, throughput 3.99346K wps
[Epoch 37 Batch 420/2125] avg loss 0.000738579, throughput 4.00012K wps
[Epoch 37 Batch 450/2125] avg loss 0.000378474, throughput 3.99742K wps
[Epoch 37 Batch 480/2125] avg loss 0.000372135, throughput 4.00205K wps
[Epoch 37 Batch 510/2125] avg loss 0.000507656, throughput 3.99842K wps
[Epoch 37 Batch 540/2125] avg loss 0.000847241, throughput 3.9974K wps
[Epoch 37 Batch 570/2125] avg loss 0.000738618, throughput 3.99334K wps
[Epoch 37 Batch 600/2125] avg loss 0.000633713, throughput 4.00021K wps
[Epoch 37 Batch 630/2125] avg loss 0.000462529, throughput 4.00005K wps
[Epoch 37 Batch 660/2125] avg loss 0.000380934, throughput 3.99997K wps
[Epoch 37 Batch 690/2125] avg loss 0.000582171, throughput 3.99609K wps
[Epoch 37 Batch 720/2125] avg loss 0.000835412, throughput 3.99761K wps
[Epoch 37 Batch 750/2125] avg loss 0.000515321, throughput 4.00156K wps
[Epoch 37 Batch 780/2125] avg loss 0.000472887, throughput 3.99682K wps
[Epoch 37 Batch 810/2125] avg loss 0.00055241, throughput 4.00083K wps
[Epoch 37 Batch 840/2125] avg loss 0.000757632, throughput 3.99877K wps
[Epoch 37 Batch 870/2125] avg loss 0.000489278, throughput 3.99781K wps
[Epoch 37 Batch 900/2125] avg loss 0.000559141, throughput 3.99704K wps
[Epoch 37 Batch 930/2125] avg loss 0.000629977, throughput 3.99997K wps
[Epoch 37 Batch 960/2125] avg loss 0.00072383, throughput 4.00089K wps
[Epoch 37 Batch 990/2125] avg loss 0.000595261, throughput 4.00004K wps
[Epoch 37 Batch 1020/2125] avg loss 0.000537218, throughput 3.99871K wps
[Epoch 37 Batch 1050/2125] avg loss 0.000576357, throughput 3.99841K wps
[Epoch 37 Batch 1080/2125] avg loss 0.000516834, throughput 3.99752K wps
[Epoch 37 Batch 1110/2125] avg loss 0.000812101, throughput 3.99813K wps
[Epoch 37 Batch 1140/2125] avg loss 0.000958291, throughput 3.99842K wps
[Epoch 37 Batch 1170/2125] avg loss 0.000391171, throughput 3.99757K wps
[Epoch 37 Batch 1200/2125] avg loss 0.000549015, throughput 3.99884K wps
[Epoch 37 Batch 1230/2125] avg loss 0.000537962, throughput 3.99868K wps
[Epoch 37 Batch 1260/2125] avg loss 0.000485991, throughput 3.9971K wps
[Epoch 37 Batch 1290/2125] avg loss 0.000675547, throughput 3.99922K wps
[Epoch 37 Batch 1320/2125] avg loss 0.000720562, throughput 3.9999K wps
[Epoch 37 Batch 1350/2125] avg loss 0.000596038, throughput 4.00007K wps
[Epoch 37 Batch 1380/2125] avg loss 0.000659151, throughput 3.99464K wps
[Epoch 37 Batch 1410/2125] avg loss 0.000601392, throughput 3.99068K wps
[Epoch 37 Batch 1440/2125] avg loss 0.00088501, throughput 3.99569K wps
[Epoch 37 Batch 1470/2125] avg loss 0.000577771, throughput 3.99758K wps
[Epoch 37 Batch 1500/2125] avg loss 0.000735934, throughput 3.99844K wps
[Epoch 37 Batch 1530/2125] avg loss 0.000808555, throughput 3.99951K wps
[Epoch 37 Batch 1560/2125] avg loss 0.000667988, throughput 3.99646K wps
[Epoch 37 Batch 1590/2125] avg loss 0.00079734, throughput 3.99664K wps
[Epoch 37 Batch 1620/2125] avg loss 0.000489713, throughput 3.9952K wps
[Epoch 37 Batch 1650/2125] avg loss 0.000576807, throughput 4.00293K wps
[Epoch 37 Batch 1680/2125] avg loss 0.000693747, throughput 4.00123K wps
[Epoch 37 Batch 1710/2125] avg loss 0.000635749, throughput 3.99981K wps
[Epoch 37 Batch 1740/2125] avg loss 0.000555127, throughput 4.0007K wps
[Epoch 37 Batch 1770/2125] avg loss 0.00038756, throughput 3.99986K wps
[Epoch 37 Batch 1800/2125] avg loss 0.000612641, throughput 3.9973K wps
[Epoch 37 Batch 1830/2125] avg loss 0.000800948, throughput 3.9989K wps
[Epoch 37 Batch 1860/2125] avg loss 0.000739848, throughput 3.998K wps
[Epoch 37 Batch 1890/2125] avg loss 0.0005672, throughput 3.9992K wps
[Epoch 37 Batch 1920/2125] avg loss 0.000504307, throughput 3.99286K wps
[Epoch 37 Batch 1950/2125] avg loss 0.000577537, throughput 3.99998K wps
[Epoch 37 Batch 1980/2125] avg loss 0.000630683, throughput 3.99918K wps
[Epoch 37 Batch 2010/2125] avg loss 0.00084321, throughput 3.99455K wps
[Epoch 37 Batch 2040/2125] avg loss 0.000757747, throughput 3.99796K wps
[Epoch 37 Batch 2070/2125] avg loss 0.000862609, throughput 3.99615K wps
[Epoch 37 Batch 2100/2125] avg loss 0.000615932, throughput 3.99527K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 37] train avg loss 0.000607516, test acc 0.9259, test avg loss 0.512129, throughput 3.9995K wps
[Epoch 38 Batch 30/2125] avg loss 0.000521831, throughput 4.09204K wps
[Epoch 38 Batch 60/2125] avg loss 0.000521988, throughput 3.99575K wps
[Epoch 38 Batch 90/2125] avg loss 0.000515111, throughput 3.99071K wps
[Epoch 38 Batch 120/2125] avg loss 0.000446293, throughput 4.00105K wps
[Epoch 38 Batch 150/2125] avg loss 0.000599612, throughput 4.00292K wps
[Epoch 38 Batch 180/2125] avg loss 0.000426712, throughput 3.99767K wps
[Epoch 38 Batch 210/2125] avg loss 0.000598407, throughput 3.99818K wps
[Epoch 38 Batch 240/2125] avg loss 0.000322995, throughput 4.0025K wps
[Epoch 38 Batch 270/2125] avg loss 0.000489682, throughput 4.00349K wps
[Epoch 38 Batch 300/2125] avg loss 0.000428039, throughput 3.99648K wps
[Epoch 38 Batch 330/2125] avg loss 0.000518001, throughput 4.00104K wps
[Epoch 38 Batch 360/2125] avg loss 0.000601656, throughput 4.00468K wps
[Epoch 38 Batch 390/2125] avg loss 0.00050944, throughput 4.00399K wps
[Epoch 38 Batch 420/2125] avg loss 0.000378299, throughput 4.00326K wps
[Epoch 38 Batch 450/2125] avg loss 0.000742986, throughput 4.00499K wps
[Epoch 38 Batch 480/2125] avg loss 0.000541298, throughput 4.00531K wps
[Epoch 38 Batch 510/2125] avg loss 0.000495234, throughput 4.00356K wps
[Epoch 38 Batch 540/2125] avg loss 0.000533019, throughput 3.99997K wps
[Epoch 38 Batch 570/2125] avg loss 0.000577985, throughput 3.99735K wps
[Epoch 38 Batch 600/2125] avg loss 0.000583915, throughput 3.99881K wps
[Epoch 38 Batch 630/2125] avg loss 0.000529152, throughput 4.00003K wps
[Epoch 38 Batch 660/2125] avg loss 0.000615381, throughput 3.99772K wps
[Epoch 38 Batch 690/2125] avg loss 0.000605221, throughput 3.99811K wps
[Epoch 38 Batch 720/2125] avg loss 0.000519116, throughput 3.99931K wps
[Epoch 38 Batch 750/2125] avg loss 0.000363482, throughput 3.99508K wps
[Epoch 38 Batch 780/2125] avg loss 0.000532206, throughput 4.00143K wps
[Epoch 38 Batch 810/2125] avg loss 0.000545921, throughput 3.9945K wps
[Epoch 38 Batch 840/2125] avg loss 0.000610733, throughput 3.99809K wps
[Epoch 38 Batch 870/2125] avg loss 0.000744715, throughput 3.99993K wps
[Epoch 38 Batch 900/2125] avg loss 0.000602536, throughput 3.99744K wps
[Epoch 38 Batch 930/2125] avg loss 0.000586342, throughput 3.99435K wps
[Epoch 38 Batch 960/2125] avg loss 0.000617014, throughput 3.9954K wps
[Epoch 38 Batch 990/2125] avg loss 0.0007145, throughput 3.99579K wps
[Epoch 38 Batch 1020/2125] avg loss 0.000747149, throughput 3.99782K wps
[Epoch 38 Batch 1050/2125] avg loss 0.000697401, throughput 3.9947K wps
[Epoch 38 Batch 1080/2125] avg loss 0.000498802, throughput 3.9958K wps
[Epoch 38 Batch 1110/2125] avg loss 0.000772636, throughput 3.99662K wps
[Epoch 38 Batch 1140/2125] avg loss 0.000553575, throughput 3.99778K wps
[Epoch 38 Batch 1170/2125] avg loss 0.000862908, throughput 3.99746K wps
[Epoch 38 Batch 1200/2125] avg loss 0.000651565, throughput 3.99645K wps
[Epoch 38 Batch 1230/2125] avg loss 0.000736963, throughput 3.99473K wps
[Epoch 38 Batch 1260/2125] avg loss 0.00062858, throughput 3.99957K wps
[Epoch 38 Batch 1290/2125] avg loss 0.000651288, throughput 3.99703K wps
[Epoch 38 Batch 1320/2125] avg loss 0.00044728, throughput 3.99528K wps
[Epoch 38 Batch 1350/2125] avg loss 0.000552284, throughput 4.0028K wps
[Epoch 38 Batch 1380/2125] avg loss 0.000768865, throughput 3.99979K wps
[Epoch 38 Batch 1410/2125] avg loss 0.000518313, throughput 4.00224K wps
[Epoch 38 Batch 1440/2125] avg loss 0.00071132, throughput 3.99782K wps
[Epoch 38 Batch 1470/2125] avg loss 0.00091477, throughput 3.99669K wps
[Epoch 38 Batch 1500/2125] avg loss 0.000679781, throughput 3.99454K wps
[Epoch 38 Batch 1530/2125] avg loss 0.000813706, throughput 3.99643K wps
[Epoch 38 Batch 1560/2125] avg loss 0.000610996, throughput 4.00033K wps
[Epoch 38 Batch 1590/2125] avg loss 0.000591285, throughput 4.00084K wps
[Epoch 38 Batch 1620/2125] avg loss 0.000788298, throughput 3.99534K wps
[Epoch 38 Batch 1650/2125] avg loss 0.000618669, throughput 3.99988K wps
[Epoch 38 Batch 1680/2125] avg loss 0.000880584, throughput 3.99737K wps
[Epoch 38 Batch 1710/2125] avg loss 0.000631044, throughput 3.99727K wps
[Epoch 38 Batch 1740/2125] avg loss 0.000609827, throughput 3.995K wps
[Epoch 38 Batch 1770/2125] avg loss 0.000512288, throughput 3.99735K wps
[Epoch 38 Batch 1800/2125] avg loss 0.000585393, throughput 3.99813K wps
[Epoch 38 Batch 1830/2125] avg loss 0.000770726, throughput 3.99918K wps
[Epoch 38 Batch 1860/2125] avg loss 0.000524059, throughput 3.99913K wps
[Epoch 38 Batch 1890/2125] avg loss 0.000756379, throughput 3.99981K wps
[Epoch 38 Batch 1920/2125] avg loss 0.000532172, throughput 3.99732K wps
[Epoch 38 Batch 1950/2125] avg loss 0.000558451, throughput 3.99726K wps
[Epoch 38 Batch 1980/2125] avg loss 0.000642183, throughput 3.99758K wps
[Epoch 38 Batch 2010/2125] avg loss 0.000843172, throughput 3.99398K wps
[Epoch 38 Batch 2040/2125] avg loss 0.000633781, throughput 3.99238K wps
[Epoch 38 Batch 2070/2125] avg loss 0.000548426, throughput 3.99698K wps
[Epoch 38 Batch 2100/2125] avg loss 0.000655619, throughput 3.99759K wps
Begin Testing...
[Batch 30/237] elapsed 0.45 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 38] train avg loss 0.000609106, test acc 0.9258, test avg loss 0.521572, throughput 3.99963K wps
[Epoch 39 Batch 30/2125] avg loss 0.000375195, throughput 4.0996K wps
[Epoch 39 Batch 60/2125] avg loss 0.000489124, throughput 4.00563K wps
[Epoch 39 Batch 90/2125] avg loss 0.000524336, throughput 3.99211K wps
[Epoch 39 Batch 120/2125] avg loss 0.000503548, throughput 4.0001K wps
[Epoch 39 Batch 150/2125] avg loss 0.000371884, throughput 3.99828K wps
[Epoch 39 Batch 180/2125] avg loss 0.000629955, throughput 3.99883K wps
[Epoch 39 Batch 210/2125] avg loss 0.000404404, throughput 3.99998K wps
[Epoch 39 Batch 240/2125] avg loss 0.000467708, throughput 3.99567K wps
[Epoch 39 Batch 270/2125] avg loss 0.000593987, throughput 3.99648K wps
[Epoch 39 Batch 300/2125] avg loss 0.000582054, throughput 3.99745K wps
[Epoch 39 Batch 330/2125] avg loss 0.000592823, throughput 4.00021K wps
[Epoch 39 Batch 360/2125] avg loss 0.000387964, throughput 3.99799K wps
[Epoch 39 Batch 390/2125] avg loss 0.00052848, throughput 3.9983K wps
[Epoch 39 Batch 420/2125] avg loss 0.000547845, throughput 3.99494K wps
[Epoch 39 Batch 450/2125] avg loss 0.000442266, throughput 3.99798K wps
[Epoch 39 Batch 480/2125] avg loss 0.000671473, throughput 3.9956K wps
[Epoch 39 Batch 510/2125] avg loss 0.000685477, throughput 3.99742K wps
[Epoch 39 Batch 540/2125] avg loss 0.000457115, throughput 3.99562K wps
[Epoch 39 Batch 570/2125] avg loss 0.000605103, throughput 3.99959K wps
[Epoch 39 Batch 600/2125] avg loss 0.000479822, throughput 3.99545K wps
[Epoch 39 Batch 630/2125] avg loss 0.000568171, throughput 3.99946K wps
[Epoch 39 Batch 660/2125] avg loss 0.000587787, throughput 3.9988K wps
[Epoch 39 Batch 690/2125] avg loss 0.000522814, throughput 3.99459K wps
[Epoch 39 Batch 720/2125] avg loss 0.000518276, throughput 4.00038K wps
[Epoch 39 Batch 750/2125] avg loss 0.000514193, throughput 3.99737K wps
[Epoch 39 Batch 780/2125] avg loss 0.000589798, throughput 3.99615K wps
[Epoch 39 Batch 810/2125] avg loss 0.000539598, throughput 3.99741K wps
[Epoch 39 Batch 840/2125] avg loss 0.000441267, throughput 3.99801K wps
[Epoch 39 Batch 870/2125] avg loss 0.0007351, throughput 3.99457K wps
[Epoch 39 Batch 900/2125] avg loss 0.000701716, throughput 3.99594K wps
[Epoch 39 Batch 930/2125] avg loss 0.0005804, throughput 4.00258K wps
[Epoch 39 Batch 960/2125] avg loss 0.000708314, throughput 4.00317K wps
[Epoch 39 Batch 990/2125] avg loss 0.000390968, throughput 4.00138K wps
[Epoch 39 Batch 1020/2125] avg loss 0.000475012, throughput 4.0029K wps
[Epoch 39 Batch 1050/2125] avg loss 0.00046389, throughput 4.00674K wps
[Epoch 39 Batch 1080/2125] avg loss 0.000754605, throughput 4.0021K wps
[Epoch 39 Batch 1110/2125] avg loss 0.000511727, throughput 4.00607K wps
[Epoch 39 Batch 1140/2125] avg loss 0.000447333, throughput 3.99574K wps
[Epoch 39 Batch 1170/2125] avg loss 0.000671267, throughput 3.99881K wps
[Epoch 39 Batch 1200/2125] avg loss 0.000365708, throughput 3.99852K wps
[Epoch 39 Batch 1230/2125] avg loss 0.000611478, throughput 3.99731K wps
[Epoch 39 Batch 1260/2125] avg loss 0.00057606, throughput 3.99662K wps
[Epoch 39 Batch 1290/2125] avg loss 0.000812743, throughput 3.99473K wps
[Epoch 39 Batch 1320/2125] avg loss 0.000672154, throughput 3.99342K wps
[Epoch 39 Batch 1350/2125] avg loss 0.000536591, throughput 3.9963K wps
[Epoch 39 Batch 1380/2125] avg loss 0.000763829, throughput 3.99727K wps
[Epoch 39 Batch 1410/2125] avg loss 0.000541017, throughput 4.00147K wps
[Epoch 39 Batch 1440/2125] avg loss 0.00063652, throughput 3.99769K wps
[Epoch 39 Batch 1470/2125] avg loss 0.000622694, throughput 3.99326K wps
[Epoch 39 Batch 1500/2125] avg loss 0.000707492, throughput 3.99059K wps
[Epoch 39 Batch 1530/2125] avg loss 0.000488732, throughput 4.00188K wps
[Epoch 39 Batch 1560/2125] avg loss 0.000585086, throughput 3.99837K wps
[Epoch 39 Batch 1590/2125] avg loss 0.000788465, throughput 3.99534K wps
[Epoch 39 Batch 1620/2125] avg loss 0.000558765, throughput 3.99884K wps
[Epoch 39 Batch 1650/2125] avg loss 0.000539162, throughput 4.00003K wps
[Epoch 39 Batch 1680/2125] avg loss 0.000624402, throughput 4.00058K wps
[Epoch 39 Batch 1710/2125] avg loss 0.000888819, throughput 3.99713K wps
[Epoch 39 Batch 1740/2125] avg loss 0.000655128, throughput 3.9974K wps
[Epoch 39 Batch 1770/2125] avg loss 0.00064443, throughput 3.99393K wps
[Epoch 39 Batch 1800/2125] avg loss 0.000512413, throughput 4.00041K wps
[Epoch 39 Batch 1830/2125] avg loss 0.000666991, throughput 3.99396K wps
[Epoch 39 Batch 1860/2125] avg loss 0.000483009, throughput 3.9939K wps
[Epoch 39 Batch 1890/2125] avg loss 0.000751327, throughput 3.99831K wps
[Epoch 39 Batch 1920/2125] avg loss 0.000557222, throughput 3.99979K wps
[Epoch 39 Batch 1950/2125] avg loss 0.000579086, throughput 4.00025K wps
[Epoch 39 Batch 1980/2125] avg loss 0.00067319, throughput 3.99728K wps
[Epoch 39 Batch 2010/2125] avg loss 0.000663073, throughput 4.00079K wps
[Epoch 39 Batch 2040/2125] avg loss 0.000449248, throughput 3.99696K wps
[Epoch 39 Batch 2070/2125] avg loss 0.000687132, throughput 3.99412K wps
[Epoch 39 Batch 2100/2125] avg loss 0.000644513, throughput 3.99682K wps
Begin Testing...
[Batch 30/237] elapsed 0.46 s
[Batch 60/237] elapsed 0.43 s
[Batch 90/237] elapsed 0.43 s
[Batch 120/237] elapsed 0.43 s
[Batch 150/237] elapsed 0.43 s
[Batch 180/237] elapsed 0.43 s
[Batch 210/237] elapsed 0.43 s
[Epoch 39] train avg loss 0.00057687, test acc 0.9254, test avg loss 0.530448, throughput 3.99941K wps
Test loss 0.320728, test acc 0.9393
Total time cost 4396.23s