Namespace(batch_size=50, data_name='SST-2', dropout=0.5, epochs=40, gpu=0, log_interval=30, lr=0.0001, model_mode='static', save_prefix='sa-model')
Use gpu0
1614
53
Done! Tokenizing Time=4.34s, #Sentences=118038
Done! Tokenizing Time=0.77s, #Sentences=1745
SentimentNet(
  (embedding): Embedding(17814 -> 300, float32)
  (encoder): ConvolutionalEncoder(
    (_convs): HybridConcurrent(
      (0): HybridSequential(
        (0): Conv1D(300 -> 100, kernel_size=(3,), stride=(1,))
        (1): Activation(relu)
        (2): HybridLambda(<lambda>)
      )
      (1): HybridSequential(
        (0): Conv1D(300 -> 100, kernel_size=(4,), stride=(1,))
        (1): Activation(relu)
        (2): HybridLambda(<lambda>)
      )
      (2): HybridSequential(
        (0): Conv1D(300 -> 100, kernel_size=(5,), stride=(1,))
        (1): Activation(relu)
        (2): HybridLambda(<lambda>)
      )
    )
  )
  (output): HybridSequential(
    (0): Dropout(p = 0.5, axes=())
    (1): Dense(None -> 2, linear)
  )
)
[Epoch 0 Batch 30/2125] avg loss 0.0146534, throughput 5.32638K wps
[Epoch 0 Batch 60/2125] avg loss 0.0145718, throughput 13.2611K wps
[Epoch 0 Batch 90/2125] avg loss 0.0141987, throughput 13.481K wps
[Epoch 0 Batch 120/2125] avg loss 0.0135289, throughput 13.4466K wps
[Epoch 0 Batch 150/2125] avg loss 0.0136993, throughput 13.4449K wps
[Epoch 0 Batch 180/2125] avg loss 0.013486, throughput 13.4345K wps
[Epoch 0 Batch 210/2125] avg loss 0.0136846, throughput 13.4504K wps
[Epoch 0 Batch 240/2125] avg loss 0.0134533, throughput 13.4591K wps
[Epoch 0 Batch 270/2125] avg loss 0.0131329, throughput 13.428K wps
[Epoch 0 Batch 300/2125] avg loss 0.0132444, throughput 13.4373K wps
[Epoch 0 Batch 330/2125] avg loss 0.0130399, throughput 13.4799K wps
[Epoch 0 Batch 360/2125] avg loss 0.0129664, throughput 13.4369K wps
[Epoch 0 Batch 390/2125] avg loss 0.0127753, throughput 13.4408K wps
[Epoch 0 Batch 420/2125] avg loss 0.0125482, throughput 13.4222K wps
[Epoch 0 Batch 450/2125] avg loss 0.0130593, throughput 13.4136K wps
[Epoch 0 Batch 480/2125] avg loss 0.0123206, throughput 13.4403K wps
[Epoch 0 Batch 510/2125] avg loss 0.0123928, throughput 13.4256K wps
[Epoch 0 Batch 540/2125] avg loss 0.0120423, throughput 13.4225K wps
[Epoch 0 Batch 570/2125] avg loss 0.0121525, throughput 13.4038K wps
[Epoch 0 Batch 600/2125] avg loss 0.0123051, throughput 13.4128K wps
[Epoch 0 Batch 630/2125] avg loss 0.012019, throughput 13.447K wps
[Epoch 0 Batch 660/2125] avg loss 0.0120158, throughput 13.433K wps
[Epoch 0 Batch 690/2125] avg loss 0.0116366, throughput 13.4184K wps
[Epoch 0 Batch 720/2125] avg loss 0.0118445, throughput 13.4244K wps
[Epoch 0 Batch 750/2125] avg loss 0.0115844, throughput 13.4472K wps
[Epoch 0 Batch 780/2125] avg loss 0.0116283, throughput 13.4503K wps
[Epoch 0 Batch 810/2125] avg loss 0.0112999, throughput 13.4335K wps
[Epoch 0 Batch 840/2125] avg loss 0.0111871, throughput 13.446K wps
[Epoch 0 Batch 870/2125] avg loss 0.0110903, throughput 13.4148K wps
[Epoch 0 Batch 900/2125] avg loss 0.0111122, throughput 13.4345K wps
[Epoch 0 Batch 930/2125] avg loss 0.0108446, throughput 13.4439K wps
[Epoch 0 Batch 960/2125] avg loss 0.0110299, throughput 13.4086K wps
[Epoch 0 Batch 990/2125] avg loss 0.0108426, throughput 13.4245K wps
[Epoch 0 Batch 1020/2125] avg loss 0.0104296, throughput 13.4346K wps
[Epoch 0 Batch 1050/2125] avg loss 0.0106169, throughput 13.451K wps
[Epoch 0 Batch 1080/2125] avg loss 0.0103461, throughput 13.4339K wps
[Epoch 0 Batch 1110/2125] avg loss 0.0104177, throughput 13.4285K wps
[Epoch 0 Batch 1140/2125] avg loss 0.0102509, throughput 13.4223K wps
[Epoch 0 Batch 1170/2125] avg loss 0.0101936, throughput 13.4468K wps
[Epoch 0 Batch 1200/2125] avg loss 0.0101648, throughput 13.4128K wps
[Epoch 0 Batch 1230/2125] avg loss 0.00964848, throughput 13.4113K wps
[Epoch 0 Batch 1260/2125] avg loss 0.00972363, throughput 13.454K wps
[Epoch 0 Batch 1290/2125] avg loss 0.00966511, throughput 13.4292K wps
[Epoch 0 Batch 1320/2125] avg loss 0.00982469, throughput 13.4221K wps
[Epoch 0 Batch 1350/2125] avg loss 0.00938853, throughput 13.4363K wps
[Epoch 0 Batch 1380/2125] avg loss 0.00953433, throughput 13.4382K wps
[Epoch 0 Batch 1410/2125] avg loss 0.00932853, throughput 13.4478K wps
[Epoch 0 Batch 1440/2125] avg loss 0.00926112, throughput 13.4177K wps
[Epoch 0 Batch 1470/2125] avg loss 0.00959436, throughput 13.381K wps
[Epoch 0 Batch 1500/2125] avg loss 0.00935453, throughput 13.4296K wps
[Epoch 0 Batch 1530/2125] avg loss 0.0091412, throughput 13.4469K wps
[Epoch 0 Batch 1560/2125] avg loss 0.00910388, throughput 13.3545K wps
[Epoch 0 Batch 1590/2125] avg loss 0.00887516, throughput 13.4277K wps
[Epoch 0 Batch 1620/2125] avg loss 0.00891047, throughput 13.3777K wps
[Epoch 0 Batch 1650/2125] avg loss 0.00884401, throughput 13.3567K wps
[Epoch 0 Batch 1680/2125] avg loss 0.00888073, throughput 13.4179K wps
[Epoch 0 Batch 1710/2125] avg loss 0.00898941, throughput 13.3764K wps
[Epoch 0 Batch 1740/2125] avg loss 0.00898894, throughput 13.2511K wps
[Epoch 0 Batch 1770/2125] avg loss 0.00841213, throughput 13.435K wps
[Epoch 0 Batch 1800/2125] avg loss 0.00878635, throughput 13.2924K wps
[Epoch 0 Batch 1830/2125] avg loss 0.00882273, throughput 13.4296K wps
[Epoch 0 Batch 1860/2125] avg loss 0.00840505, throughput 13.4229K wps
[Epoch 0 Batch 1890/2125] avg loss 0.00845803, throughput 13.4382K wps
[Epoch 0 Batch 1920/2125] avg loss 0.00810139, throughput 13.3192K wps
[Epoch 0 Batch 1950/2125] avg loss 0.00859184, throughput 13.3754K wps
[Epoch 0 Batch 1980/2125] avg loss 0.00858049, throughput 13.4038K wps
[Epoch 0 Batch 2010/2125] avg loss 0.00818034, throughput 13.3906K wps
[Epoch 0 Batch 2040/2125] avg loss 0.0084738, throughput 13.434K wps
[Epoch 0 Batch 2070/2125] avg loss 0.00818707, throughput 13.3411K wps
[Epoch 0 Batch 2100/2125] avg loss 0.0083282, throughput 13.3621K wps
Begin Testing...
[Batch 30/237] elapsed 0.29 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 0] train avg loss 0.0107207, test acc 0.8334, test avg loss 0.40004, throughput 12.9748K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 1 Batch 30/2125] avg loss 0.00831467, throughput 13.6498K wps
[Epoch 1 Batch 60/2125] avg loss 0.00805099, throughput 13.2698K wps
[Epoch 1 Batch 90/2125] avg loss 0.00822964, throughput 13.3992K wps
[Epoch 1 Batch 120/2125] avg loss 0.00825579, throughput 13.408K wps
[Epoch 1 Batch 150/2125] avg loss 0.00806523, throughput 13.3817K wps
[Epoch 1 Batch 180/2125] avg loss 0.00785, throughput 13.2822K wps
[Epoch 1 Batch 210/2125] avg loss 0.0080381, throughput 13.3381K wps
[Epoch 1 Batch 240/2125] avg loss 0.00807023, throughput 13.3446K wps
[Epoch 1 Batch 270/2125] avg loss 0.0080889, throughput 13.3556K wps
[Epoch 1 Batch 300/2125] avg loss 0.00759852, throughput 13.2342K wps
[Epoch 1 Batch 330/2125] avg loss 0.00786776, throughput 13.3712K wps
[Epoch 1 Batch 360/2125] avg loss 0.00777422, throughput 13.2676K wps
[Epoch 1 Batch 390/2125] avg loss 0.00805686, throughput 13.3395K wps
[Epoch 1 Batch 420/2125] avg loss 0.00725809, throughput 13.2458K wps
[Epoch 1 Batch 450/2125] avg loss 0.00790547, throughput 13.3119K wps
[Epoch 1 Batch 480/2125] avg loss 0.00782725, throughput 13.2427K wps
[Epoch 1 Batch 510/2125] avg loss 0.0077112, throughput 13.2662K wps
[Epoch 1 Batch 540/2125] avg loss 0.00764779, throughput 13.2416K wps
[Epoch 1 Batch 570/2125] avg loss 0.00781515, throughput 13.3361K wps
[Epoch 1 Batch 600/2125] avg loss 0.00757053, throughput 13.2599K wps
[Epoch 1 Batch 630/2125] avg loss 0.00761841, throughput 13.2477K wps
[Epoch 1 Batch 660/2125] avg loss 0.00779623, throughput 13.2639K wps
[Epoch 1 Batch 690/2125] avg loss 0.00796579, throughput 13.2599K wps
[Epoch 1 Batch 720/2125] avg loss 0.00759087, throughput 13.2667K wps
[Epoch 1 Batch 750/2125] avg loss 0.00780331, throughput 13.1748K wps
[Epoch 1 Batch 780/2125] avg loss 0.0076351, throughput 13.3235K wps
[Epoch 1 Batch 810/2125] avg loss 0.00750919, throughput 13.3469K wps
[Epoch 1 Batch 840/2125] avg loss 0.00749751, throughput 13.2394K wps
[Epoch 1 Batch 870/2125] avg loss 0.00711399, throughput 13.3108K wps
[Epoch 1 Batch 900/2125] avg loss 0.00739057, throughput 13.2163K wps
[Epoch 1 Batch 930/2125] avg loss 0.00724539, throughput 13.2243K wps
[Epoch 1 Batch 960/2125] avg loss 0.00707681, throughput 13.1823K wps
[Epoch 1 Batch 990/2125] avg loss 0.00749918, throughput 13.2497K wps
[Epoch 1 Batch 1020/2125] avg loss 0.00764632, throughput 13.3645K wps
[Epoch 1 Batch 1050/2125] avg loss 0.00727298, throughput 13.3651K wps
[Epoch 1 Batch 1080/2125] avg loss 0.00718971, throughput 13.2228K wps
[Epoch 1 Batch 1110/2125] avg loss 0.00746057, throughput 13.2172K wps
[Epoch 1 Batch 1140/2125] avg loss 0.00736938, throughput 13.146K wps
[Epoch 1 Batch 1170/2125] avg loss 0.00758748, throughput 13.3104K wps
[Epoch 1 Batch 1200/2125] avg loss 0.00747739, throughput 13.2234K wps
[Epoch 1 Batch 1230/2125] avg loss 0.00706741, throughput 13.2602K wps
[Epoch 1 Batch 1260/2125] avg loss 0.00709088, throughput 13.2699K wps
[Epoch 1 Batch 1290/2125] avg loss 0.00741011, throughput 13.3354K wps
[Epoch 1 Batch 1320/2125] avg loss 0.0074885, throughput 13.2102K wps
[Epoch 1 Batch 1350/2125] avg loss 0.00716928, throughput 13.2249K wps
[Epoch 1 Batch 1380/2125] avg loss 0.00727062, throughput 13.1891K wps
[Epoch 1 Batch 1410/2125] avg loss 0.00730444, throughput 13.3151K wps
[Epoch 1 Batch 1440/2125] avg loss 0.00741777, throughput 13.1876K wps
[Epoch 1 Batch 1470/2125] avg loss 0.00702204, throughput 13.212K wps
[Epoch 1 Batch 1500/2125] avg loss 0.00728929, throughput 13.1771K wps
[Epoch 1 Batch 1530/2125] avg loss 0.00678881, throughput 13.1919K wps
[Epoch 1 Batch 1560/2125] avg loss 0.00712185, throughput 13.21K wps
[Epoch 1 Batch 1590/2125] avg loss 0.00711323, throughput 13.2141K wps
[Epoch 1 Batch 1620/2125] avg loss 0.00708076, throughput 13.208K wps
[Epoch 1 Batch 1650/2125] avg loss 0.00715817, throughput 13.2197K wps
[Epoch 1 Batch 1680/2125] avg loss 0.00736282, throughput 13.1866K wps
[Epoch 1 Batch 1710/2125] avg loss 0.00732599, throughput 13.1801K wps
[Epoch 1 Batch 1740/2125] avg loss 0.007201, throughput 13.1649K wps
[Epoch 1 Batch 1770/2125] avg loss 0.00696781, throughput 13.1253K wps
[Epoch 1 Batch 1800/2125] avg loss 0.00725484, throughput 13.1656K wps
[Epoch 1 Batch 1830/2125] avg loss 0.00708344, throughput 13.0535K wps
[Epoch 1 Batch 1860/2125] avg loss 0.00682544, throughput 13.2788K wps
[Epoch 1 Batch 1890/2125] avg loss 0.00765838, throughput 13.3752K wps
[Epoch 1 Batch 1920/2125] avg loss 0.00682234, throughput 13.2689K wps
[Epoch 1 Batch 1950/2125] avg loss 0.00746441, throughput 13.2309K wps
[Epoch 1 Batch 1980/2125] avg loss 0.00731475, throughput 13.2308K wps
[Epoch 1 Batch 2010/2125] avg loss 0.00725325, throughput 13.2622K wps
[Epoch 1 Batch 2040/2125] avg loss 0.00735213, throughput 13.2743K wps
[Epoch 1 Batch 2070/2125] avg loss 0.00643221, throughput 13.2417K wps
[Epoch 1 Batch 2100/2125] avg loss 0.00722509, throughput 13.2286K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 1] train avg loss 0.00747038, test acc 0.8534, test avg loss 0.349388, throughput 13.262K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 2 Batch 30/2125] avg loss 0.00684883, throughput 13.4228K wps
[Epoch 2 Batch 60/2125] avg loss 0.00675612, throughput 13.1665K wps
[Epoch 2 Batch 90/2125] avg loss 0.00714641, throughput 13.207K wps
[Epoch 2 Batch 120/2125] avg loss 0.0069786, throughput 13.2405K wps
[Epoch 2 Batch 150/2125] avg loss 0.0070687, throughput 13.2229K wps
[Epoch 2 Batch 180/2125] avg loss 0.00719998, throughput 13.2447K wps
[Epoch 2 Batch 210/2125] avg loss 0.00694307, throughput 13.2268K wps
[Epoch 2 Batch 240/2125] avg loss 0.00681365, throughput 13.2082K wps
[Epoch 2 Batch 270/2125] avg loss 0.00663865, throughput 13.2161K wps
[Epoch 2 Batch 300/2125] avg loss 0.00736739, throughput 13.2024K wps
[Epoch 2 Batch 330/2125] avg loss 0.00662529, throughput 13.1816K wps
[Epoch 2 Batch 360/2125] avg loss 0.00674064, throughput 13.169K wps
[Epoch 2 Batch 390/2125] avg loss 0.00717306, throughput 13.1883K wps
[Epoch 2 Batch 420/2125] avg loss 0.00694572, throughput 13.1935K wps
[Epoch 2 Batch 450/2125] avg loss 0.00701094, throughput 13.2161K wps
[Epoch 2 Batch 480/2125] avg loss 0.00704228, throughput 13.1689K wps
[Epoch 2 Batch 510/2125] avg loss 0.0069788, throughput 13.2163K wps
[Epoch 2 Batch 540/2125] avg loss 0.00704758, throughput 13.2147K wps
[Epoch 2 Batch 570/2125] avg loss 0.0069595, throughput 13.223K wps
[Epoch 2 Batch 600/2125] avg loss 0.00694432, throughput 13.1895K wps
[Epoch 2 Batch 630/2125] avg loss 0.00661558, throughput 13.1837K wps
[Epoch 2 Batch 660/2125] avg loss 0.0066975, throughput 13.213K wps
[Epoch 2 Batch 690/2125] avg loss 0.00703434, throughput 13.1852K wps
[Epoch 2 Batch 720/2125] avg loss 0.00690671, throughput 13.2111K wps
[Epoch 2 Batch 750/2125] avg loss 0.00669105, throughput 13.2147K wps
[Epoch 2 Batch 780/2125] avg loss 0.00642979, throughput 13.1995K wps
[Epoch 2 Batch 810/2125] avg loss 0.00662498, throughput 13.2098K wps
[Epoch 2 Batch 840/2125] avg loss 0.00635656, throughput 13.2206K wps
[Epoch 2 Batch 870/2125] avg loss 0.0069886, throughput 13.2278K wps
[Epoch 2 Batch 900/2125] avg loss 0.0061332, throughput 13.1818K wps
[Epoch 2 Batch 930/2125] avg loss 0.0066373, throughput 13.2326K wps
[Epoch 2 Batch 960/2125] avg loss 0.0063128, throughput 13.1856K wps
[Epoch 2 Batch 990/2125] avg loss 0.00708121, throughput 13.1581K wps
[Epoch 2 Batch 1020/2125] avg loss 0.00683951, throughput 13.1772K wps
[Epoch 2 Batch 1050/2125] avg loss 0.00704309, throughput 13.1656K wps
[Epoch 2 Batch 1080/2125] avg loss 0.00678009, throughput 13.2388K wps
[Epoch 2 Batch 1110/2125] avg loss 0.00724449, throughput 13.24K wps
[Epoch 2 Batch 1140/2125] avg loss 0.00649344, throughput 13.2548K wps
[Epoch 2 Batch 1170/2125] avg loss 0.00623237, throughput 13.239K wps
[Epoch 2 Batch 1200/2125] avg loss 0.00681128, throughput 13.2329K wps
[Epoch 2 Batch 1230/2125] avg loss 0.0063335, throughput 13.2475K wps
[Epoch 2 Batch 1260/2125] avg loss 0.00667441, throughput 13.2322K wps
[Epoch 2 Batch 1290/2125] avg loss 0.00623227, throughput 13.2004K wps
[Epoch 2 Batch 1320/2125] avg loss 0.00659944, throughput 13.2325K wps
[Epoch 2 Batch 1350/2125] avg loss 0.00722577, throughput 13.2284K wps
[Epoch 2 Batch 1380/2125] avg loss 0.00660247, throughput 13.2396K wps
[Epoch 2 Batch 1410/2125] avg loss 0.00701938, throughput 13.2197K wps
[Epoch 2 Batch 1440/2125] avg loss 0.00645516, throughput 13.2156K wps
[Epoch 2 Batch 1470/2125] avg loss 0.00709327, throughput 13.2515K wps
[Epoch 2 Batch 1500/2125] avg loss 0.00681654, throughput 13.2308K wps
[Epoch 2 Batch 1530/2125] avg loss 0.00613883, throughput 13.2319K wps
[Epoch 2 Batch 1560/2125] avg loss 0.00654453, throughput 13.2354K wps
[Epoch 2 Batch 1590/2125] avg loss 0.00671055, throughput 13.2048K wps
[Epoch 2 Batch 1620/2125] avg loss 0.00611911, throughput 13.221K wps
[Epoch 2 Batch 1650/2125] avg loss 0.00670375, throughput 13.1829K wps
[Epoch 2 Batch 1680/2125] avg loss 0.00700342, throughput 13.2239K wps
[Epoch 2 Batch 1710/2125] avg loss 0.00677016, throughput 13.2176K wps
[Epoch 2 Batch 1740/2125] avg loss 0.00657458, throughput 13.2242K wps
[Epoch 2 Batch 1770/2125] avg loss 0.00654003, throughput 13.2338K wps
[Epoch 2 Batch 1800/2125] avg loss 0.00678511, throughput 13.232K wps
[Epoch 2 Batch 1830/2125] avg loss 0.00655117, throughput 13.2364K wps
[Epoch 2 Batch 1860/2125] avg loss 0.00635708, throughput 13.2392K wps
[Epoch 2 Batch 1890/2125] avg loss 0.00659348, throughput 13.2294K wps
[Epoch 2 Batch 1920/2125] avg loss 0.00669368, throughput 13.2343K wps
[Epoch 2 Batch 1950/2125] avg loss 0.00639563, throughput 13.2149K wps
[Epoch 2 Batch 1980/2125] avg loss 0.00652862, throughput 13.2355K wps
[Epoch 2 Batch 2010/2125] avg loss 0.00664764, throughput 13.2448K wps
[Epoch 2 Batch 2040/2125] avg loss 0.00668586, throughput 13.2251K wps
[Epoch 2 Batch 2070/2125] avg loss 0.00661059, throughput 13.2103K wps
[Epoch 2 Batch 2100/2125] avg loss 0.00673988, throughput 13.2261K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 2] train avg loss 0.00674327, test acc 0.8668, test avg loss 0.327952, throughput 13.2185K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 3 Batch 30/2125] avg loss 0.00647, throughput 13.5107K wps
[Epoch 3 Batch 60/2125] avg loss 0.0065113, throughput 13.1665K wps
[Epoch 3 Batch 90/2125] avg loss 0.0062733, throughput 13.2112K wps
[Epoch 3 Batch 120/2125] avg loss 0.00675723, throughput 13.2248K wps
[Epoch 3 Batch 150/2125] avg loss 0.00635352, throughput 13.1962K wps
[Epoch 3 Batch 180/2125] avg loss 0.0068886, throughput 13.249K wps
[Epoch 3 Batch 210/2125] avg loss 0.00668413, throughput 13.1978K wps
[Epoch 3 Batch 240/2125] avg loss 0.00646459, throughput 13.2164K wps
[Epoch 3 Batch 270/2125] avg loss 0.00653774, throughput 13.2041K wps
[Epoch 3 Batch 300/2125] avg loss 0.00623913, throughput 13.2406K wps
[Epoch 3 Batch 330/2125] avg loss 0.00603111, throughput 13.2001K wps
[Epoch 3 Batch 360/2125] avg loss 0.00619598, throughput 13.2085K wps
[Epoch 3 Batch 390/2125] avg loss 0.00602232, throughput 13.2333K wps
[Epoch 3 Batch 420/2125] avg loss 0.00634754, throughput 13.2386K wps
[Epoch 3 Batch 450/2125] avg loss 0.0065352, throughput 13.2357K wps
[Epoch 3 Batch 480/2125] avg loss 0.00697678, throughput 13.237K wps
[Epoch 3 Batch 510/2125] avg loss 0.00675154, throughput 13.2387K wps
[Epoch 3 Batch 540/2125] avg loss 0.00646675, throughput 13.2077K wps
[Epoch 3 Batch 570/2125] avg loss 0.00611559, throughput 13.2273K wps
[Epoch 3 Batch 600/2125] avg loss 0.00617743, throughput 13.1555K wps
[Epoch 3 Batch 630/2125] avg loss 0.00632218, throughput 13.1529K wps
[Epoch 3 Batch 660/2125] avg loss 0.00642298, throughput 13.2068K wps
[Epoch 3 Batch 690/2125] avg loss 0.00637397, throughput 13.1979K wps
[Epoch 3 Batch 720/2125] avg loss 0.00653949, throughput 13.2101K wps
[Epoch 3 Batch 750/2125] avg loss 0.00658596, throughput 13.1676K wps
[Epoch 3 Batch 780/2125] avg loss 0.0061653, throughput 13.1766K wps
[Epoch 3 Batch 810/2125] avg loss 0.00647233, throughput 13.161K wps
[Epoch 3 Batch 840/2125] avg loss 0.00685449, throughput 13.1454K wps
[Epoch 3 Batch 870/2125] avg loss 0.00624603, throughput 13.2403K wps
[Epoch 3 Batch 900/2125] avg loss 0.00626926, throughput 13.2102K wps
[Epoch 3 Batch 930/2125] avg loss 0.00612363, throughput 13.1787K wps
[Epoch 3 Batch 960/2125] avg loss 0.00603955, throughput 13.1852K wps
[Epoch 3 Batch 990/2125] avg loss 0.00629705, throughput 13.1972K wps
[Epoch 3 Batch 1020/2125] avg loss 0.00613809, throughput 13.2228K wps
[Epoch 3 Batch 1050/2125] avg loss 0.00622259, throughput 13.2131K wps
[Epoch 3 Batch 1080/2125] avg loss 0.00654415, throughput 13.216K wps
[Epoch 3 Batch 1110/2125] avg loss 0.00611599, throughput 13.2049K wps
[Epoch 3 Batch 1140/2125] avg loss 0.00671138, throughput 13.1959K wps
[Epoch 3 Batch 1170/2125] avg loss 0.00647355, throughput 13.2065K wps
[Epoch 3 Batch 1200/2125] avg loss 0.00680476, throughput 13.1564K wps
[Epoch 3 Batch 1230/2125] avg loss 0.00624785, throughput 13.251K wps
[Epoch 3 Batch 1260/2125] avg loss 0.00627035, throughput 13.1808K wps
[Epoch 3 Batch 1290/2125] avg loss 0.00685914, throughput 13.2017K wps
[Epoch 3 Batch 1320/2125] avg loss 0.00672644, throughput 13.2184K wps
[Epoch 3 Batch 1350/2125] avg loss 0.00591808, throughput 13.1746K wps
[Epoch 3 Batch 1380/2125] avg loss 0.00611181, throughput 13.1501K wps
[Epoch 3 Batch 1410/2125] avg loss 0.00612708, throughput 13.0687K wps
[Epoch 3 Batch 1440/2125] avg loss 0.00567642, throughput 13.2047K wps
[Epoch 3 Batch 1470/2125] avg loss 0.0057043, throughput 13.192K wps
[Epoch 3 Batch 1500/2125] avg loss 0.00609568, throughput 13.2461K wps
[Epoch 3 Batch 1530/2125] avg loss 0.00592292, throughput 13.175K wps
[Epoch 3 Batch 1560/2125] avg loss 0.00643153, throughput 13.1952K wps
[Epoch 3 Batch 1590/2125] avg loss 0.00595985, throughput 13.2225K wps
[Epoch 3 Batch 1620/2125] avg loss 0.00627467, throughput 13.1919K wps
[Epoch 3 Batch 1650/2125] avg loss 0.00646688, throughput 13.1066K wps
[Epoch 3 Batch 1680/2125] avg loss 0.00616185, throughput 13.2089K wps
[Epoch 3 Batch 1710/2125] avg loss 0.00657012, throughput 13.2284K wps
[Epoch 3 Batch 1740/2125] avg loss 0.00658108, throughput 13.216K wps
[Epoch 3 Batch 1770/2125] avg loss 0.00631713, throughput 13.1904K wps
[Epoch 3 Batch 1800/2125] avg loss 0.00601576, throughput 13.2155K wps
[Epoch 3 Batch 1830/2125] avg loss 0.0062461, throughput 13.1918K wps
[Epoch 3 Batch 1860/2125] avg loss 0.00581384, throughput 13.1052K wps
[Epoch 3 Batch 1890/2125] avg loss 0.00637057, throughput 13.1719K wps
[Epoch 3 Batch 1920/2125] avg loss 0.00649338, throughput 13.2072K wps
[Epoch 3 Batch 1950/2125] avg loss 0.00667244, throughput 13.1967K wps
[Epoch 3 Batch 1980/2125] avg loss 0.00625702, throughput 13.0595K wps
[Epoch 3 Batch 2010/2125] avg loss 0.00627446, throughput 13.2196K wps
[Epoch 3 Batch 2040/2125] avg loss 0.0059055, throughput 13.1694K wps
[Epoch 3 Batch 2070/2125] avg loss 0.00636481, throughput 13.1925K wps
[Epoch 3 Batch 2100/2125] avg loss 0.00569815, throughput 13.1753K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 3] train avg loss 0.00633021, test acc 0.8708, test avg loss 0.314354, throughput 13.1994K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 4 Batch 30/2125] avg loss 0.00585832, throughput 13.4795K wps
[Epoch 4 Batch 60/2125] avg loss 0.00572205, throughput 13.1083K wps
[Epoch 4 Batch 90/2125] avg loss 0.00652273, throughput 13.2138K wps
[Epoch 4 Batch 120/2125] avg loss 0.0062934, throughput 13.1432K wps
[Epoch 4 Batch 150/2125] avg loss 0.00665345, throughput 13.1624K wps
[Epoch 4 Batch 180/2125] avg loss 0.0059113, throughput 13.1766K wps
[Epoch 4 Batch 210/2125] avg loss 0.0064233, throughput 13.1173K wps
[Epoch 4 Batch 240/2125] avg loss 0.00619236, throughput 13.2072K wps
[Epoch 4 Batch 270/2125] avg loss 0.00584398, throughput 13.1003K wps
[Epoch 4 Batch 300/2125] avg loss 0.00580483, throughput 13.1103K wps
[Epoch 4 Batch 330/2125] avg loss 0.00629033, throughput 13.1311K wps
[Epoch 4 Batch 360/2125] avg loss 0.00608293, throughput 13.2192K wps
[Epoch 4 Batch 390/2125] avg loss 0.00574367, throughput 13.1879K wps
[Epoch 4 Batch 420/2125] avg loss 0.00587618, throughput 13.1671K wps
[Epoch 4 Batch 450/2125] avg loss 0.00641766, throughput 13.161K wps
[Epoch 4 Batch 480/2125] avg loss 0.00625374, throughput 13.1569K wps
[Epoch 4 Batch 510/2125] avg loss 0.00580617, throughput 13.1568K wps
[Epoch 4 Batch 540/2125] avg loss 0.00577172, throughput 13.0552K wps
[Epoch 4 Batch 570/2125] avg loss 0.00620569, throughput 13.1679K wps
[Epoch 4 Batch 600/2125] avg loss 0.00630589, throughput 13.1835K wps
[Epoch 4 Batch 630/2125] avg loss 0.00576278, throughput 13.1473K wps
[Epoch 4 Batch 660/2125] avg loss 0.00604007, throughput 13.1342K wps
[Epoch 4 Batch 690/2125] avg loss 0.00555951, throughput 13.1066K wps
[Epoch 4 Batch 720/2125] avg loss 0.00626695, throughput 13.1545K wps
[Epoch 4 Batch 750/2125] avg loss 0.0058206, throughput 13.0782K wps
[Epoch 4 Batch 780/2125] avg loss 0.00547201, throughput 13.1976K wps
[Epoch 4 Batch 810/2125] avg loss 0.00659553, throughput 13.1111K wps
[Epoch 4 Batch 840/2125] avg loss 0.00630413, throughput 13.1445K wps
[Epoch 4 Batch 870/2125] avg loss 0.00593383, throughput 13.1845K wps
[Epoch 4 Batch 900/2125] avg loss 0.00580809, throughput 13.1868K wps
[Epoch 4 Batch 930/2125] avg loss 0.00648236, throughput 13.0323K wps
[Epoch 4 Batch 960/2125] avg loss 0.00639276, throughput 13.1812K wps
[Epoch 4 Batch 990/2125] avg loss 0.00583455, throughput 13.0485K wps
[Epoch 4 Batch 1020/2125] avg loss 0.00563975, throughput 13.1623K wps
[Epoch 4 Batch 1050/2125] avg loss 0.00621978, throughput 13.1344K wps
[Epoch 4 Batch 1080/2125] avg loss 0.0062317, throughput 13.1441K wps
[Epoch 4 Batch 1110/2125] avg loss 0.0059694, throughput 13.1779K wps
[Epoch 4 Batch 1140/2125] avg loss 0.00592759, throughput 13.0678K wps
[Epoch 4 Batch 1170/2125] avg loss 0.00597553, throughput 13.1571K wps
[Epoch 4 Batch 1200/2125] avg loss 0.00579552, throughput 13.0277K wps
[Epoch 4 Batch 1230/2125] avg loss 0.00593854, throughput 13.211K wps
[Epoch 4 Batch 1260/2125] avg loss 0.00574343, throughput 13.1323K wps
[Epoch 4 Batch 1290/2125] avg loss 0.00578821, throughput 13.1305K wps
[Epoch 4 Batch 1320/2125] avg loss 0.00636347, throughput 13.2181K wps
[Epoch 4 Batch 1350/2125] avg loss 0.00554208, throughput 13.2166K wps
[Epoch 4 Batch 1380/2125] avg loss 0.00579053, throughput 13.0449K wps
[Epoch 4 Batch 1410/2125] avg loss 0.00607741, throughput 13.1911K wps
[Epoch 4 Batch 1440/2125] avg loss 0.00587338, throughput 13.1687K wps
[Epoch 4 Batch 1470/2125] avg loss 0.0057504, throughput 13.1065K wps
[Epoch 4 Batch 1500/2125] avg loss 0.00571794, throughput 13.1725K wps
[Epoch 4 Batch 1530/2125] avg loss 0.00596371, throughput 13.1266K wps
[Epoch 4 Batch 1560/2125] avg loss 0.00619462, throughput 13.1855K wps
[Epoch 4 Batch 1590/2125] avg loss 0.00602154, throughput 13.1543K wps
[Epoch 4 Batch 1620/2125] avg loss 0.00588569, throughput 13.1491K wps
[Epoch 4 Batch 1650/2125] avg loss 0.0061227, throughput 13.0452K wps
[Epoch 4 Batch 1680/2125] avg loss 0.00582573, throughput 13.1986K wps
[Epoch 4 Batch 1710/2125] avg loss 0.00608981, throughput 13.1947K wps
[Epoch 4 Batch 1740/2125] avg loss 0.00618156, throughput 13.2041K wps
[Epoch 4 Batch 1770/2125] avg loss 0.00555834, throughput 13.1968K wps
[Epoch 4 Batch 1800/2125] avg loss 0.00589645, throughput 13.0936K wps
[Epoch 4 Batch 1830/2125] avg loss 0.00618808, throughput 13.1978K wps
[Epoch 4 Batch 1860/2125] avg loss 0.00602533, throughput 13.1179K wps
[Epoch 4 Batch 1890/2125] avg loss 0.00608156, throughput 13.1423K wps
[Epoch 4 Batch 1920/2125] avg loss 0.00570912, throughput 13.1082K wps
[Epoch 4 Batch 1950/2125] avg loss 0.0056477, throughput 13.0824K wps
[Epoch 4 Batch 1980/2125] avg loss 0.00580865, throughput 13.0474K wps
[Epoch 4 Batch 2010/2125] avg loss 0.0060656, throughput 13.1806K wps
[Epoch 4 Batch 2040/2125] avg loss 0.00605649, throughput 13.0702K wps
[Epoch 4 Batch 2070/2125] avg loss 0.00633486, throughput 13.0511K wps
[Epoch 4 Batch 2100/2125] avg loss 0.00599849, throughput 13.0906K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 4] train avg loss 0.00600103, test acc 0.8792, test avg loss 0.300857, throughput 13.1448K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 5 Batch 30/2125] avg loss 0.00563732, throughput 13.3977K wps
[Epoch 5 Batch 60/2125] avg loss 0.0060528, throughput 13.0249K wps
[Epoch 5 Batch 90/2125] avg loss 0.00586119, throughput 13.0607K wps
[Epoch 5 Batch 120/2125] avg loss 0.00556103, throughput 13.1406K wps
[Epoch 5 Batch 150/2125] avg loss 0.00562748, throughput 13.0909K wps
[Epoch 5 Batch 180/2125] avg loss 0.00579206, throughput 13.0609K wps
[Epoch 5 Batch 210/2125] avg loss 0.00582652, throughput 13.0428K wps
[Epoch 5 Batch 240/2125] avg loss 0.00534942, throughput 13.0718K wps
[Epoch 5 Batch 270/2125] avg loss 0.00604375, throughput 13.0298K wps
[Epoch 5 Batch 300/2125] avg loss 0.00589181, throughput 13.0174K wps
[Epoch 5 Batch 330/2125] avg loss 0.00592598, throughput 13.0316K wps
[Epoch 5 Batch 360/2125] avg loss 0.00585637, throughput 13.0389K wps
[Epoch 5 Batch 390/2125] avg loss 0.0056172, throughput 13.0768K wps
[Epoch 5 Batch 420/2125] avg loss 0.00598009, throughput 13.0609K wps
[Epoch 5 Batch 450/2125] avg loss 0.00582877, throughput 13.0093K wps
[Epoch 5 Batch 480/2125] avg loss 0.00586841, throughput 13.1469K wps
[Epoch 5 Batch 510/2125] avg loss 0.00572491, throughput 13.1K wps
[Epoch 5 Batch 540/2125] avg loss 0.00582843, throughput 13.0141K wps
[Epoch 5 Batch 570/2125] avg loss 0.00607487, throughput 13.0298K wps
[Epoch 5 Batch 600/2125] avg loss 0.00539282, throughput 13.0432K wps
[Epoch 5 Batch 630/2125] avg loss 0.00560032, throughput 12.9965K wps
[Epoch 5 Batch 660/2125] avg loss 0.00583346, throughput 13.0548K wps
[Epoch 5 Batch 690/2125] avg loss 0.00589042, throughput 13.0007K wps
[Epoch 5 Batch 720/2125] avg loss 0.00574881, throughput 13.0145K wps
[Epoch 5 Batch 750/2125] avg loss 0.00587846, throughput 13.1085K wps
[Epoch 5 Batch 780/2125] avg loss 0.00562374, throughput 13.0711K wps
[Epoch 5 Batch 810/2125] avg loss 0.00581997, throughput 13.0404K wps
[Epoch 5 Batch 840/2125] avg loss 0.00597409, throughput 13.0405K wps
[Epoch 5 Batch 870/2125] avg loss 0.00540211, throughput 13.0274K wps
[Epoch 5 Batch 900/2125] avg loss 0.00603102, throughput 13.0309K wps
[Epoch 5 Batch 930/2125] avg loss 0.00530265, throughput 13.0507K wps
[Epoch 5 Batch 960/2125] avg loss 0.00629305, throughput 13.07K wps
[Epoch 5 Batch 990/2125] avg loss 0.00603561, throughput 13.0389K wps
[Epoch 5 Batch 1020/2125] avg loss 0.0065085, throughput 13.0709K wps
[Epoch 5 Batch 1050/2125] avg loss 0.0059079, throughput 13.036K wps
[Epoch 5 Batch 1080/2125] avg loss 0.00610543, throughput 13.0535K wps
[Epoch 5 Batch 1110/2125] avg loss 0.00546853, throughput 13.0446K wps
[Epoch 5 Batch 1140/2125] avg loss 0.00565315, throughput 13.0182K wps
[Epoch 5 Batch 1170/2125] avg loss 0.00617655, throughput 13.089K wps
[Epoch 5 Batch 1200/2125] avg loss 0.00549522, throughput 13.0613K wps
[Epoch 5 Batch 1230/2125] avg loss 0.00603817, throughput 13.0524K wps
[Epoch 5 Batch 1260/2125] avg loss 0.00572189, throughput 13.0591K wps
[Epoch 5 Batch 1290/2125] avg loss 0.00547878, throughput 13.0609K wps
[Epoch 5 Batch 1320/2125] avg loss 0.00573034, throughput 13.0717K wps
[Epoch 5 Batch 1350/2125] avg loss 0.00540832, throughput 13.0828K wps
[Epoch 5 Batch 1380/2125] avg loss 0.00544094, throughput 13.0591K wps
[Epoch 5 Batch 1410/2125] avg loss 0.00587786, throughput 13.0443K wps
[Epoch 5 Batch 1440/2125] avg loss 0.00615397, throughput 13.0696K wps
[Epoch 5 Batch 1470/2125] avg loss 0.00542762, throughput 13.0534K wps
[Epoch 5 Batch 1500/2125] avg loss 0.00559689, throughput 13.0426K wps
[Epoch 5 Batch 1530/2125] avg loss 0.00547372, throughput 13.0633K wps
[Epoch 5 Batch 1560/2125] avg loss 0.00553164, throughput 13.0764K wps
[Epoch 5 Batch 1590/2125] avg loss 0.00554978, throughput 13.056K wps
[Epoch 5 Batch 1620/2125] avg loss 0.00571743, throughput 13.0772K wps
[Epoch 5 Batch 1650/2125] avg loss 0.0054182, throughput 13.0571K wps
[Epoch 5 Batch 1680/2125] avg loss 0.00585806, throughput 13.0332K wps
[Epoch 5 Batch 1710/2125] avg loss 0.00565101, throughput 13.0312K wps
[Epoch 5 Batch 1740/2125] avg loss 0.00530467, throughput 13.0855K wps
[Epoch 5 Batch 1770/2125] avg loss 0.00557676, throughput 13.0947K wps
[Epoch 5 Batch 1800/2125] avg loss 0.00545278, throughput 13.0806K wps
[Epoch 5 Batch 1830/2125] avg loss 0.00525408, throughput 13.0628K wps
[Epoch 5 Batch 1860/2125] avg loss 0.00593944, throughput 13.0413K wps
[Epoch 5 Batch 1890/2125] avg loss 0.00608962, throughput 13.0608K wps
[Epoch 5 Batch 1920/2125] avg loss 0.00563606, throughput 13.0815K wps
[Epoch 5 Batch 1950/2125] avg loss 0.0058697, throughput 13.0601K wps
[Epoch 5 Batch 1980/2125] avg loss 0.0054139, throughput 13.0757K wps
[Epoch 5 Batch 2010/2125] avg loss 0.0056225, throughput 13.0356K wps
[Epoch 5 Batch 2040/2125] avg loss 0.0061017, throughput 13.0513K wps
[Epoch 5 Batch 2070/2125] avg loss 0.00535643, throughput 13.0801K wps
[Epoch 5 Batch 2100/2125] avg loss 0.00511572, throughput 13.0931K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 5] train avg loss 0.00573618, test acc 0.8835, test avg loss 0.292497, throughput 13.0612K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 6 Batch 30/2125] avg loss 0.00529602, throughput 13.3229K wps
[Epoch 6 Batch 60/2125] avg loss 0.00539823, throughput 13.041K wps
[Epoch 6 Batch 90/2125] avg loss 0.00554407, throughput 13.1228K wps
[Epoch 6 Batch 120/2125] avg loss 0.00556862, throughput 13.0617K wps
[Epoch 6 Batch 150/2125] avg loss 0.00512814, throughput 13.0698K wps
[Epoch 6 Batch 180/2125] avg loss 0.00522073, throughput 13.0419K wps
[Epoch 6 Batch 210/2125] avg loss 0.00586009, throughput 13.0537K wps
[Epoch 6 Batch 240/2125] avg loss 0.00471365, throughput 13.0401K wps
[Epoch 6 Batch 270/2125] avg loss 0.00554174, throughput 13.1368K wps
[Epoch 6 Batch 300/2125] avg loss 0.00521006, throughput 13.0368K wps
[Epoch 6 Batch 330/2125] avg loss 0.00516326, throughput 13.0566K wps
[Epoch 6 Batch 360/2125] avg loss 0.00516495, throughput 13.0581K wps
[Epoch 6 Batch 390/2125] avg loss 0.00606962, throughput 13.0473K wps
[Epoch 6 Batch 420/2125] avg loss 0.00624259, throughput 13.0886K wps
[Epoch 6 Batch 450/2125] avg loss 0.00534879, throughput 13.0385K wps
[Epoch 6 Batch 480/2125] avg loss 0.00613109, throughput 13.0265K wps
[Epoch 6 Batch 510/2125] avg loss 0.00550476, throughput 13.0747K wps
[Epoch 6 Batch 540/2125] avg loss 0.00540599, throughput 13.0165K wps
[Epoch 6 Batch 570/2125] avg loss 0.00580941, throughput 13.0603K wps
[Epoch 6 Batch 600/2125] avg loss 0.00533137, throughput 13.0351K wps
[Epoch 6 Batch 630/2125] avg loss 0.00561812, throughput 13.0404K wps
[Epoch 6 Batch 660/2125] avg loss 0.00572457, throughput 13.0317K wps
[Epoch 6 Batch 690/2125] avg loss 0.00582734, throughput 13.0387K wps
[Epoch 6 Batch 720/2125] avg loss 0.0057461, throughput 13.0381K wps
[Epoch 6 Batch 750/2125] avg loss 0.00513088, throughput 13.0387K wps
[Epoch 6 Batch 780/2125] avg loss 0.0056779, throughput 13.0158K wps
[Epoch 6 Batch 810/2125] avg loss 0.00487793, throughput 13.0288K wps
[Epoch 6 Batch 840/2125] avg loss 0.00530419, throughput 13.0419K wps
[Epoch 6 Batch 870/2125] avg loss 0.00570289, throughput 13.0228K wps
[Epoch 6 Batch 900/2125] avg loss 0.0059064, throughput 12.9527K wps
[Epoch 6 Batch 930/2125] avg loss 0.00589339, throughput 13.0257K wps
[Epoch 6 Batch 960/2125] avg loss 0.0052241, throughput 13.002K wps
[Epoch 6 Batch 990/2125] avg loss 0.00545026, throughput 13.074K wps
[Epoch 6 Batch 1020/2125] avg loss 0.00566594, throughput 13.033K wps
[Epoch 6 Batch 1050/2125] avg loss 0.0056143, throughput 13.0203K wps
[Epoch 6 Batch 1080/2125] avg loss 0.00541853, throughput 13.0447K wps
[Epoch 6 Batch 1110/2125] avg loss 0.00540749, throughput 12.9879K wps
[Epoch 6 Batch 1140/2125] avg loss 0.00474064, throughput 13.0075K wps
[Epoch 6 Batch 1170/2125] avg loss 0.00546262, throughput 13.0335K wps
[Epoch 6 Batch 1200/2125] avg loss 0.00536907, throughput 13.0137K wps
[Epoch 6 Batch 1230/2125] avg loss 0.00531813, throughput 13.0132K wps
[Epoch 6 Batch 1260/2125] avg loss 0.00538954, throughput 13.0275K wps
[Epoch 6 Batch 1290/2125] avg loss 0.00551746, throughput 13.0513K wps
[Epoch 6 Batch 1320/2125] avg loss 0.00577296, throughput 13.0397K wps
[Epoch 6 Batch 1350/2125] avg loss 0.00587341, throughput 13.0416K wps
[Epoch 6 Batch 1380/2125] avg loss 0.00523018, throughput 13.0963K wps
[Epoch 6 Batch 1410/2125] avg loss 0.00548502, throughput 13.0018K wps
[Epoch 6 Batch 1440/2125] avg loss 0.00582374, throughput 13.0412K wps
[Epoch 6 Batch 1470/2125] avg loss 0.00517117, throughput 13.0459K wps
[Epoch 6 Batch 1500/2125] avg loss 0.00573087, throughput 13.07K wps
[Epoch 6 Batch 1530/2125] avg loss 0.00567245, throughput 13.0511K wps
[Epoch 6 Batch 1560/2125] avg loss 0.00536937, throughput 13.0314K wps
[Epoch 6 Batch 1590/2125] avg loss 0.00599678, throughput 13.0826K wps
[Epoch 6 Batch 1620/2125] avg loss 0.00567642, throughput 13.0921K wps
[Epoch 6 Batch 1650/2125] avg loss 0.00531116, throughput 13.0678K wps
[Epoch 6 Batch 1680/2125] avg loss 0.00520122, throughput 13.0538K wps
[Epoch 6 Batch 1710/2125] avg loss 0.00521306, throughput 13.0654K wps
[Epoch 6 Batch 1740/2125] avg loss 0.00547113, throughput 13.0483K wps
[Epoch 6 Batch 1770/2125] avg loss 0.00548241, throughput 13.04K wps
[Epoch 6 Batch 1800/2125] avg loss 0.00539338, throughput 12.9957K wps
[Epoch 6 Batch 1830/2125] avg loss 0.00578748, throughput 13.0571K wps
[Epoch 6 Batch 1860/2125] avg loss 0.00527118, throughput 13.0539K wps
[Epoch 6 Batch 1890/2125] avg loss 0.00489659, throughput 13.0352K wps
[Epoch 6 Batch 1920/2125] avg loss 0.00571583, throughput 13.0386K wps
[Epoch 6 Batch 1950/2125] avg loss 0.0056383, throughput 13.0517K wps
[Epoch 6 Batch 1980/2125] avg loss 0.00563255, throughput 13.0127K wps
[Epoch 6 Batch 2010/2125] avg loss 0.00577851, throughput 13.0194K wps
[Epoch 6 Batch 2040/2125] avg loss 0.0056442, throughput 13.0305K wps
[Epoch 6 Batch 2070/2125] avg loss 0.00563103, throughput 13.0521K wps
[Epoch 6 Batch 2100/2125] avg loss 0.00581668, throughput 13.0365K wps
Begin Testing...
[Batch 30/237] elapsed 0.31 s
[Batch 60/237] elapsed 0.29 s
[Batch 90/237] elapsed 0.29 s
[Batch 120/237] elapsed 0.28 s
[Batch 150/237] elapsed 0.28 s
[Batch 180/237] elapsed 0.28 s
[Batch 210/237] elapsed 0.28 s
[Epoch 6] train avg loss 0.00550052, test acc 0.8855, test avg loss 0.284569, throughput 13.0464K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 7 Batch 30/2125] avg loss 0.00489727, throughput 13.3365K wps
[Epoch 7 Batch 60/2125] avg loss 0.00551526, throughput 13.0382K wps
[Epoch 7 Batch 90/2125] avg loss 0.00546297, throughput 13.0514K wps
[Epoch 7 Batch 120/2125] avg loss 0.00564331, throughput 13.0271K wps
[Epoch 7 Batch 150/2125] avg loss 0.00501519, throughput 13.0477K wps
[Epoch 7 Batch 180/2125] avg loss 0.0050059, throughput 13.0279K wps
[Epoch 7 Batch 210/2125] avg loss 0.00566174, throughput 12.9718K wps
[Epoch 7 Batch 240/2125] avg loss 0.00473596, throughput 13.018K wps
[Epoch 7 Batch 270/2125] avg loss 0.00531458, throughput 13.0039K wps
[Epoch 7 Batch 300/2125] avg loss 0.00557742, throughput 13.0258K wps
[Epoch 7 Batch 330/2125] avg loss 0.00522102, throughput 13.0298K wps
[Epoch 7 Batch 360/2125] avg loss 0.00533208, throughput 13.0124K wps
[Epoch 7 Batch 390/2125] avg loss 0.00513772, throughput 13.0342K wps
[Epoch 7 Batch 420/2125] avg loss 0.00545082, throughput 13.0022K wps
[Epoch 7 Batch 450/2125] avg loss 0.00514944, throughput 12.99K wps
[Epoch 7 Batch 480/2125] avg loss 0.00533184, throughput 13.0303K wps
[Epoch 7 Batch 510/2125] avg loss 0.0051945, throughput 13.0325K wps
[Epoch 7 Batch 540/2125] avg loss 0.00564715, throughput 13.0297K wps
[Epoch 7 Batch 570/2125] avg loss 0.00540348, throughput 13.0054K wps
[Epoch 7 Batch 600/2125] avg loss 0.00519981, throughput 13.0209K wps
[Epoch 7 Batch 630/2125] avg loss 0.00498804, throughput 13.0276K wps
[Epoch 7 Batch 660/2125] avg loss 0.005115, throughput 13.0196K wps
[Epoch 7 Batch 690/2125] avg loss 0.00529532, throughput 13.0663K wps
[Epoch 7 Batch 720/2125] avg loss 0.00498398, throughput 13.0689K wps
[Epoch 7 Batch 750/2125] avg loss 0.00554126, throughput 13.0609K wps
[Epoch 7 Batch 780/2125] avg loss 0.00483449, throughput 13.0431K wps
[Epoch 7 Batch 810/2125] avg loss 0.00539661, throughput 13.0464K wps
[Epoch 7 Batch 840/2125] avg loss 0.00521769, throughput 13.047K wps
[Epoch 7 Batch 870/2125] avg loss 0.00531091, throughput 13.0404K wps
[Epoch 7 Batch 900/2125] avg loss 0.00508302, throughput 13.0549K wps
[Epoch 7 Batch 930/2125] avg loss 0.00524593, throughput 13.0476K wps
[Epoch 7 Batch 960/2125] avg loss 0.00546597, throughput 13.0805K wps
[Epoch 7 Batch 990/2125] avg loss 0.0047233, throughput 13.0098K wps
[Epoch 7 Batch 1020/2125] avg loss 0.00486006, throughput 13.0705K wps
[Epoch 7 Batch 1050/2125] avg loss 0.00506945, throughput 13.0252K wps
[Epoch 7 Batch 1080/2125] avg loss 0.0051339, throughput 13.0405K wps
[Epoch 7 Batch 1110/2125] avg loss 0.00569618, throughput 13.0209K wps
[Epoch 7 Batch 1140/2125] avg loss 0.00511799, throughput 13.026K wps
[Epoch 7 Batch 1170/2125] avg loss 0.0057373, throughput 13.0518K wps
[Epoch 7 Batch 1200/2125] avg loss 0.0052473, throughput 13.0442K wps
[Epoch 7 Batch 1230/2125] avg loss 0.00555965, throughput 13.0638K wps
[Epoch 7 Batch 1260/2125] avg loss 0.00526781, throughput 13.0437K wps
[Epoch 7 Batch 1290/2125] avg loss 0.00490494, throughput 13.0418K wps
[Epoch 7 Batch 1320/2125] avg loss 0.00548809, throughput 13.0427K wps
[Epoch 7 Batch 1350/2125] avg loss 0.00514208, throughput 13.0483K wps
[Epoch 7 Batch 1380/2125] avg loss 0.00541068, throughput 13.0189K wps
[Epoch 7 Batch 1410/2125] avg loss 0.00539332, throughput 13.0004K wps
[Epoch 7 Batch 1440/2125] avg loss 0.0054288, throughput 13.0278K wps
[Epoch 7 Batch 1470/2125] avg loss 0.00495918, throughput 13.054K wps
[Epoch 7 Batch 1500/2125] avg loss 0.00501154, throughput 13.0393K wps
[Epoch 7 Batch 1530/2125] avg loss 0.0053927, throughput 13.0164K wps
[Epoch 7 Batch 1560/2125] avg loss 0.00509691, throughput 13.0425K wps
[Epoch 7 Batch 1590/2125] avg loss 0.0054726, throughput 13.0399K wps
[Epoch 7 Batch 1620/2125] avg loss 0.00515372, throughput 13.0627K wps
[Epoch 7 Batch 1650/2125] avg loss 0.00535286, throughput 13.0321K wps
[Epoch 7 Batch 1680/2125] avg loss 0.0054705, throughput 13.0482K wps
[Epoch 7 Batch 1710/2125] avg loss 0.00548655, throughput 13.057K wps
[Epoch 7 Batch 1740/2125] avg loss 0.00557564, throughput 13.0172K wps
[Epoch 7 Batch 1770/2125] avg loss 0.0056687, throughput 13.0659K wps
[Epoch 7 Batch 1800/2125] avg loss 0.00536957, throughput 13.0325K wps
[Epoch 7 Batch 1830/2125] avg loss 0.0056761, throughput 13.0221K wps
[Epoch 7 Batch 1860/2125] avg loss 0.00574388, throughput 13.0435K wps
[Epoch 7 Batch 1890/2125] avg loss 0.00510967, throughput 13.0689K wps
[Epoch 7 Batch 1920/2125] avg loss 0.00504467, throughput 13.019K wps
[Epoch 7 Batch 1950/2125] avg loss 0.00498785, throughput 13.0449K wps
[Epoch 7 Batch 1980/2125] avg loss 0.00523397, throughput 13.0545K wps
[Epoch 7 Batch 2010/2125] avg loss 0.00534987, throughput 13.0365K wps
[Epoch 7 Batch 2040/2125] avg loss 0.00521423, throughput 13.0599K wps
[Epoch 7 Batch 2070/2125] avg loss 0.00519595, throughput 13.0323K wps
[Epoch 7 Batch 2100/2125] avg loss 0.00515696, throughput 13.037K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 7] train avg loss 0.00527578, test acc 0.8936, test avg loss 0.273446, throughput 13.0408K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 8 Batch 30/2125] avg loss 0.0053289, throughput 13.3605K wps
[Epoch 8 Batch 60/2125] avg loss 0.00475094, throughput 12.9785K wps
[Epoch 8 Batch 90/2125] avg loss 0.00489613, throughput 13.0803K wps
[Epoch 8 Batch 120/2125] avg loss 0.00487771, throughput 13.0015K wps
[Epoch 8 Batch 150/2125] avg loss 0.00490254, throughput 13.0317K wps
[Epoch 8 Batch 180/2125] avg loss 0.00526954, throughput 13.0518K wps
[Epoch 8 Batch 210/2125] avg loss 0.00500878, throughput 13.0497K wps
[Epoch 8 Batch 240/2125] avg loss 0.00480471, throughput 13.0362K wps
[Epoch 8 Batch 270/2125] avg loss 0.00491234, throughput 13.0536K wps
[Epoch 8 Batch 300/2125] avg loss 0.00539597, throughput 13.0532K wps
[Epoch 8 Batch 330/2125] avg loss 0.0047547, throughput 13.0453K wps
[Epoch 8 Batch 360/2125] avg loss 0.00539877, throughput 13.0567K wps
[Epoch 8 Batch 390/2125] avg loss 0.00547524, throughput 13.0659K wps
[Epoch 8 Batch 420/2125] avg loss 0.00483638, throughput 13.0668K wps
[Epoch 8 Batch 450/2125] avg loss 0.00470114, throughput 13.0494K wps
[Epoch 8 Batch 480/2125] avg loss 0.00497318, throughput 13.0418K wps
[Epoch 8 Batch 510/2125] avg loss 0.0047149, throughput 13.0459K wps
[Epoch 8 Batch 540/2125] avg loss 0.00491911, throughput 13.0715K wps
[Epoch 8 Batch 570/2125] avg loss 0.00497349, throughput 13.0537K wps
[Epoch 8 Batch 600/2125] avg loss 0.00559422, throughput 13.0469K wps
[Epoch 8 Batch 630/2125] avg loss 0.00504926, throughput 13.0307K wps
[Epoch 8 Batch 660/2125] avg loss 0.00507823, throughput 13.0653K wps
[Epoch 8 Batch 690/2125] avg loss 0.00533474, throughput 13.0431K wps
[Epoch 8 Batch 720/2125] avg loss 0.00476913, throughput 13.0529K wps
[Epoch 8 Batch 750/2125] avg loss 0.00510215, throughput 13.0425K wps
[Epoch 8 Batch 780/2125] avg loss 0.00495003, throughput 13.0543K wps
[Epoch 8 Batch 810/2125] avg loss 0.00498764, throughput 13.0203K wps
[Epoch 8 Batch 840/2125] avg loss 0.004946, throughput 13.0451K wps
[Epoch 8 Batch 870/2125] avg loss 0.00536523, throughput 13.0795K wps
[Epoch 8 Batch 900/2125] avg loss 0.00518169, throughput 13.064K wps
[Epoch 8 Batch 930/2125] avg loss 0.00510354, throughput 13.061K wps
[Epoch 8 Batch 960/2125] avg loss 0.00513167, throughput 13.0349K wps
[Epoch 8 Batch 990/2125] avg loss 0.00514527, throughput 13.0424K wps
[Epoch 8 Batch 1020/2125] avg loss 0.00520494, throughput 13.02K wps
[Epoch 8 Batch 1050/2125] avg loss 0.00515524, throughput 13.0335K wps
[Epoch 8 Batch 1080/2125] avg loss 0.0047987, throughput 13.041K wps
[Epoch 8 Batch 1110/2125] avg loss 0.00502498, throughput 13.0399K wps
[Epoch 8 Batch 1140/2125] avg loss 0.00513901, throughput 13.0651K wps
[Epoch 8 Batch 1170/2125] avg loss 0.00492976, throughput 13.0354K wps
[Epoch 8 Batch 1200/2125] avg loss 0.00496094, throughput 13.0617K wps
[Epoch 8 Batch 1230/2125] avg loss 0.00502153, throughput 13.0045K wps
[Epoch 8 Batch 1260/2125] avg loss 0.005234, throughput 13.0436K wps
[Epoch 8 Batch 1290/2125] avg loss 0.00513653, throughput 13.0411K wps
[Epoch 8 Batch 1320/2125] avg loss 0.00543115, throughput 13.0635K wps
[Epoch 8 Batch 1350/2125] avg loss 0.00496344, throughput 13.0141K wps
[Epoch 8 Batch 1380/2125] avg loss 0.00535832, throughput 13.0331K wps
[Epoch 8 Batch 1410/2125] avg loss 0.00525344, throughput 13.022K wps
[Epoch 8 Batch 1440/2125] avg loss 0.00466618, throughput 13.0435K wps
[Epoch 8 Batch 1470/2125] avg loss 0.00511565, throughput 13.0704K wps
[Epoch 8 Batch 1500/2125] avg loss 0.00473033, throughput 13.0468K wps
[Epoch 8 Batch 1530/2125] avg loss 0.00513721, throughput 13.0492K wps
[Epoch 8 Batch 1560/2125] avg loss 0.00462622, throughput 13.0042K wps
[Epoch 8 Batch 1590/2125] avg loss 0.00490244, throughput 13.0139K wps
[Epoch 8 Batch 1620/2125] avg loss 0.00491856, throughput 13.0371K wps
[Epoch 8 Batch 1650/2125] avg loss 0.00478893, throughput 13.0422K wps
[Epoch 8 Batch 1680/2125] avg loss 0.00450924, throughput 13.0276K wps
[Epoch 8 Batch 1710/2125] avg loss 0.00532649, throughput 13.0481K wps
[Epoch 8 Batch 1740/2125] avg loss 0.0047705, throughput 13.0388K wps
[Epoch 8 Batch 1770/2125] avg loss 0.00567512, throughput 13.0324K wps
[Epoch 8 Batch 1800/2125] avg loss 0.00543517, throughput 13.0721K wps
[Epoch 8 Batch 1830/2125] avg loss 0.00509122, throughput 13.0558K wps
[Epoch 8 Batch 1860/2125] avg loss 0.00499477, throughput 13.0736K wps
[Epoch 8 Batch 1890/2125] avg loss 0.00509872, throughput 13.0364K wps
[Epoch 8 Batch 1920/2125] avg loss 0.00515377, throughput 13.0287K wps
[Epoch 8 Batch 1950/2125] avg loss 0.00485917, throughput 13.0631K wps
[Epoch 8 Batch 1980/2125] avg loss 0.00538904, throughput 13.0621K wps
[Epoch 8 Batch 2010/2125] avg loss 0.00503572, throughput 13.0785K wps
[Epoch 8 Batch 2040/2125] avg loss 0.00526071, throughput 12.9686K wps
[Epoch 8 Batch 2070/2125] avg loss 0.00520167, throughput 13.016K wps
[Epoch 8 Batch 2100/2125] avg loss 0.00461075, throughput 13.0215K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 8] train avg loss 0.0050561, test acc 0.8946, test avg loss 0.267582, throughput 13.0473K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 9 Batch 30/2125] avg loss 0.00487354, throughput 13.2547K wps
[Epoch 9 Batch 60/2125] avg loss 0.00514728, throughput 12.979K wps
[Epoch 9 Batch 90/2125] avg loss 0.00486835, throughput 13.0022K wps
[Epoch 9 Batch 120/2125] avg loss 0.00504409, throughput 13.0173K wps
[Epoch 9 Batch 150/2125] avg loss 0.00450541, throughput 13.0067K wps
[Epoch 9 Batch 180/2125] avg loss 0.00455999, throughput 13.0605K wps
[Epoch 9 Batch 210/2125] avg loss 0.00457952, throughput 13.0525K wps
[Epoch 9 Batch 240/2125] avg loss 0.0046758, throughput 13.0281K wps
[Epoch 9 Batch 270/2125] avg loss 0.00492716, throughput 13.0513K wps
[Epoch 9 Batch 300/2125] avg loss 0.00473147, throughput 13.0304K wps
[Epoch 9 Batch 330/2125] avg loss 0.00418271, throughput 13.0396K wps
[Epoch 9 Batch 360/2125] avg loss 0.0051796, throughput 13.0502K wps
[Epoch 9 Batch 390/2125] avg loss 0.00476682, throughput 12.9982K wps
[Epoch 9 Batch 420/2125] avg loss 0.00530145, throughput 13.0459K wps
[Epoch 9 Batch 450/2125] avg loss 0.00481711, throughput 13.0324K wps
[Epoch 9 Batch 480/2125] avg loss 0.00518156, throughput 13.0549K wps
[Epoch 9 Batch 510/2125] avg loss 0.00471717, throughput 13.0688K wps
[Epoch 9 Batch 540/2125] avg loss 0.00524673, throughput 13.0409K wps
[Epoch 9 Batch 570/2125] avg loss 0.00483889, throughput 12.9964K wps
[Epoch 9 Batch 600/2125] avg loss 0.00471748, throughput 13.0599K wps
[Epoch 9 Batch 630/2125] avg loss 0.00495033, throughput 13.0666K wps
[Epoch 9 Batch 660/2125] avg loss 0.00475963, throughput 13.053K wps
[Epoch 9 Batch 690/2125] avg loss 0.00503712, throughput 13.0472K wps
[Epoch 9 Batch 720/2125] avg loss 0.00456688, throughput 13.028K wps
[Epoch 9 Batch 750/2125] avg loss 0.00491152, throughput 13.0335K wps
[Epoch 9 Batch 780/2125] avg loss 0.00474354, throughput 13.0449K wps
[Epoch 9 Batch 810/2125] avg loss 0.00459055, throughput 13.0321K wps
[Epoch 9 Batch 840/2125] avg loss 0.00490235, throughput 13.0037K wps
[Epoch 9 Batch 870/2125] avg loss 0.00452326, throughput 13.0452K wps
[Epoch 9 Batch 900/2125] avg loss 0.00486225, throughput 13.0093K wps
[Epoch 9 Batch 930/2125] avg loss 0.0052025, throughput 13.0128K wps
[Epoch 9 Batch 960/2125] avg loss 0.00459851, throughput 13.0469K wps
[Epoch 9 Batch 990/2125] avg loss 0.0043809, throughput 13.0296K wps
[Epoch 9 Batch 1020/2125] avg loss 0.00488686, throughput 13.0458K wps
[Epoch 9 Batch 1050/2125] avg loss 0.00475795, throughput 13.0266K wps
[Epoch 9 Batch 1080/2125] avg loss 0.00486136, throughput 13.004K wps
[Epoch 9 Batch 1110/2125] avg loss 0.00450087, throughput 13.0671K wps
[Epoch 9 Batch 1140/2125] avg loss 0.00474866, throughput 13.0322K wps
[Epoch 9 Batch 1170/2125] avg loss 0.00528162, throughput 13.052K wps
[Epoch 9 Batch 1200/2125] avg loss 0.00534983, throughput 13.0719K wps
[Epoch 9 Batch 1230/2125] avg loss 0.0045652, throughput 13.0648K wps
[Epoch 9 Batch 1260/2125] avg loss 0.00514157, throughput 13.0227K wps
[Epoch 9 Batch 1290/2125] avg loss 0.00474281, throughput 13.0497K wps
[Epoch 9 Batch 1320/2125] avg loss 0.00499515, throughput 13.0608K wps
[Epoch 9 Batch 1350/2125] avg loss 0.00465628, throughput 13.0355K wps
[Epoch 9 Batch 1380/2125] avg loss 0.0049037, throughput 13.0328K wps
[Epoch 9 Batch 1410/2125] avg loss 0.00503872, throughput 13.0511K wps
[Epoch 9 Batch 1440/2125] avg loss 0.00501118, throughput 13.0395K wps
[Epoch 9 Batch 1470/2125] avg loss 0.00521993, throughput 13.0231K wps
[Epoch 9 Batch 1500/2125] avg loss 0.00497974, throughput 13.0322K wps
[Epoch 9 Batch 1530/2125] avg loss 0.00483901, throughput 13.0457K wps
[Epoch 9 Batch 1560/2125] avg loss 0.00547709, throughput 13.0463K wps
[Epoch 9 Batch 1590/2125] avg loss 0.00513875, throughput 13.0017K wps
[Epoch 9 Batch 1620/2125] avg loss 0.00487098, throughput 13.0332K wps
[Epoch 9 Batch 1650/2125] avg loss 0.00459922, throughput 13.0708K wps
[Epoch 9 Batch 1680/2125] avg loss 0.00478235, throughput 13.0189K wps
[Epoch 9 Batch 1710/2125] avg loss 0.00480334, throughput 13.0247K wps
[Epoch 9 Batch 1740/2125] avg loss 0.00511015, throughput 13.0151K wps
[Epoch 9 Batch 1770/2125] avg loss 0.00516096, throughput 12.9998K wps
[Epoch 9 Batch 1800/2125] avg loss 0.0048796, throughput 13.0204K wps
[Epoch 9 Batch 1830/2125] avg loss 0.0055006, throughput 13.0642K wps
[Epoch 9 Batch 1860/2125] avg loss 0.00498942, throughput 13.0295K wps
[Epoch 9 Batch 1890/2125] avg loss 0.00464604, throughput 13.0477K wps
[Epoch 9 Batch 1920/2125] avg loss 0.00546015, throughput 13.0192K wps
[Epoch 9 Batch 1950/2125] avg loss 0.00498948, throughput 13.0319K wps
[Epoch 9 Batch 1980/2125] avg loss 0.00561813, throughput 13.0293K wps
[Epoch 9 Batch 2010/2125] avg loss 0.00473792, throughput 13.0468K wps
[Epoch 9 Batch 2040/2125] avg loss 0.00499492, throughput 13.0236K wps
[Epoch 9 Batch 2070/2125] avg loss 0.00480289, throughput 13.015K wps
[Epoch 9 Batch 2100/2125] avg loss 0.0044921, throughput 13.0177K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 9] train avg loss 0.00488994, test acc 0.8992, test avg loss 0.258244, throughput 13.0378K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 10 Batch 30/2125] avg loss 0.00463219, throughput 13.2368K wps
[Epoch 10 Batch 60/2125] avg loss 0.00456259, throughput 12.9348K wps
[Epoch 10 Batch 90/2125] avg loss 0.00442893, throughput 13.0551K wps
[Epoch 10 Batch 120/2125] avg loss 0.00473723, throughput 13.0429K wps
[Epoch 10 Batch 150/2125] avg loss 0.00478148, throughput 12.9693K wps
[Epoch 10 Batch 180/2125] avg loss 0.00438599, throughput 12.9951K wps
[Epoch 10 Batch 210/2125] avg loss 0.00458489, throughput 13.0132K wps
[Epoch 10 Batch 240/2125] avg loss 0.0049337, throughput 12.9946K wps
[Epoch 10 Batch 270/2125] avg loss 0.00478409, throughput 12.9995K wps
[Epoch 10 Batch 300/2125] avg loss 0.00467541, throughput 13.0103K wps
[Epoch 10 Batch 330/2125] avg loss 0.00426637, throughput 12.9769K wps
[Epoch 10 Batch 360/2125] avg loss 0.00484949, throughput 12.9939K wps
[Epoch 10 Batch 390/2125] avg loss 0.0046239, throughput 12.9958K wps
[Epoch 10 Batch 420/2125] avg loss 0.00458509, throughput 13.0479K wps
[Epoch 10 Batch 450/2125] avg loss 0.00519226, throughput 13.0657K wps
[Epoch 10 Batch 480/2125] avg loss 0.00475123, throughput 13.0234K wps
[Epoch 10 Batch 510/2125] avg loss 0.00501899, throughput 13.0473K wps
[Epoch 10 Batch 540/2125] avg loss 0.00488884, throughput 13.0003K wps
[Epoch 10 Batch 570/2125] avg loss 0.00458709, throughput 12.9951K wps
[Epoch 10 Batch 600/2125] avg loss 0.00429609, throughput 13.0136K wps
[Epoch 10 Batch 630/2125] avg loss 0.00491455, throughput 13.0304K wps
[Epoch 10 Batch 660/2125] avg loss 0.00497493, throughput 13.0235K wps
[Epoch 10 Batch 690/2125] avg loss 0.00450033, throughput 12.9815K wps
[Epoch 10 Batch 720/2125] avg loss 0.00496708, throughput 13.0466K wps
[Epoch 10 Batch 750/2125] avg loss 0.00499454, throughput 13.0356K wps
[Epoch 10 Batch 780/2125] avg loss 0.00466061, throughput 13.0036K wps
[Epoch 10 Batch 810/2125] avg loss 0.00487541, throughput 13.0064K wps
[Epoch 10 Batch 840/2125] avg loss 0.00490537, throughput 12.9933K wps
[Epoch 10 Batch 870/2125] avg loss 0.00483937, throughput 13.0146K wps
[Epoch 10 Batch 900/2125] avg loss 0.00477051, throughput 13.0031K wps
[Epoch 10 Batch 930/2125] avg loss 0.00472212, throughput 13.0397K wps
[Epoch 10 Batch 960/2125] avg loss 0.00465306, throughput 13.0068K wps
[Epoch 10 Batch 990/2125] avg loss 0.00462936, throughput 13.0434K wps
[Epoch 10 Batch 1020/2125] avg loss 0.00483166, throughput 13.0089K wps
[Epoch 10 Batch 1050/2125] avg loss 0.00498593, throughput 13.0292K wps
[Epoch 10 Batch 1080/2125] avg loss 0.00460536, throughput 13.0121K wps
[Epoch 10 Batch 1110/2125] avg loss 0.00427885, throughput 13.012K wps
[Epoch 10 Batch 1140/2125] avg loss 0.00491303, throughput 12.9926K wps
[Epoch 10 Batch 1170/2125] avg loss 0.00449123, throughput 12.9964K wps
[Epoch 10 Batch 1200/2125] avg loss 0.00446943, throughput 13.0187K wps
[Epoch 10 Batch 1230/2125] avg loss 0.00473072, throughput 13.031K wps
[Epoch 10 Batch 1260/2125] avg loss 0.0050429, throughput 12.9883K wps
[Epoch 10 Batch 1290/2125] avg loss 0.00418762, throughput 13.019K wps
[Epoch 10 Batch 1320/2125] avg loss 0.00477549, throughput 13.0234K wps
[Epoch 10 Batch 1350/2125] avg loss 0.00434229, throughput 12.9797K wps
[Epoch 10 Batch 1380/2125] avg loss 0.004725, throughput 12.9743K wps
[Epoch 10 Batch 1410/2125] avg loss 0.00410397, throughput 12.9591K wps
[Epoch 10 Batch 1440/2125] avg loss 0.00436766, throughput 12.9962K wps
[Epoch 10 Batch 1470/2125] avg loss 0.00479595, throughput 13.0342K wps
[Epoch 10 Batch 1500/2125] avg loss 0.00504538, throughput 13.0437K wps
[Epoch 10 Batch 1530/2125] avg loss 0.00484045, throughput 13.0683K wps
[Epoch 10 Batch 1560/2125] avg loss 0.00457275, throughput 13.0131K wps
[Epoch 10 Batch 1590/2125] avg loss 0.00493911, throughput 12.9692K wps
[Epoch 10 Batch 1620/2125] avg loss 0.00477632, throughput 12.9998K wps
[Epoch 10 Batch 1650/2125] avg loss 0.00452896, throughput 13.0613K wps
[Epoch 10 Batch 1680/2125] avg loss 0.00476733, throughput 13.0008K wps
[Epoch 10 Batch 1710/2125] avg loss 0.00414784, throughput 13.0453K wps
[Epoch 10 Batch 1740/2125] avg loss 0.00479172, throughput 13.0121K wps
[Epoch 10 Batch 1770/2125] avg loss 0.00430776, throughput 13.0657K wps
[Epoch 10 Batch 1800/2125] avg loss 0.00449579, throughput 13.0024K wps
[Epoch 10 Batch 1830/2125] avg loss 0.00430686, throughput 13.0472K wps
[Epoch 10 Batch 1860/2125] avg loss 0.00470338, throughput 13.0219K wps
[Epoch 10 Batch 1890/2125] avg loss 0.00511728, throughput 13.0242K wps
[Epoch 10 Batch 1920/2125] avg loss 0.00493882, throughput 13.0447K wps
[Epoch 10 Batch 1950/2125] avg loss 0.00482965, throughput 13.0335K wps
[Epoch 10 Batch 1980/2125] avg loss 0.00465689, throughput 13.036K wps
[Epoch 10 Batch 2010/2125] avg loss 0.00492148, throughput 13.0295K wps
[Epoch 10 Batch 2040/2125] avg loss 0.00495581, throughput 13.0228K wps
[Epoch 10 Batch 2070/2125] avg loss 0.00463854, throughput 13.0313K wps
[Epoch 10 Batch 2100/2125] avg loss 0.00426113, throughput 13.0052K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 10] train avg loss 0.00469369, test acc 0.9022, test avg loss 0.256358, throughput 13.0185K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 11 Batch 30/2125] avg loss 0.00439612, throughput 13.2911K wps
[Epoch 11 Batch 60/2125] avg loss 0.00512342, throughput 12.9897K wps
[Epoch 11 Batch 90/2125] avg loss 0.00471012, throughput 13.018K wps
[Epoch 11 Batch 120/2125] avg loss 0.00449045, throughput 13.0165K wps
[Epoch 11 Batch 150/2125] avg loss 0.0043413, throughput 13.0267K wps
[Epoch 11 Batch 180/2125] avg loss 0.00461929, throughput 13.048K wps
[Epoch 11 Batch 210/2125] avg loss 0.00400917, throughput 13.0111K wps
[Epoch 11 Batch 240/2125] avg loss 0.00467413, throughput 13.0521K wps
[Epoch 11 Batch 270/2125] avg loss 0.00441925, throughput 13.0293K wps
[Epoch 11 Batch 300/2125] avg loss 0.00434765, throughput 13.0146K wps
[Epoch 11 Batch 330/2125] avg loss 0.00438609, throughput 13.0279K wps
[Epoch 11 Batch 360/2125] avg loss 0.00464899, throughput 13.0562K wps
[Epoch 11 Batch 390/2125] avg loss 0.00424968, throughput 13.0107K wps
[Epoch 11 Batch 420/2125] avg loss 0.00434739, throughput 13.0556K wps
[Epoch 11 Batch 450/2125] avg loss 0.00492268, throughput 13.0249K wps
[Epoch 11 Batch 480/2125] avg loss 0.00445621, throughput 13.0162K wps
[Epoch 11 Batch 510/2125] avg loss 0.00424498, throughput 13.0291K wps
[Epoch 11 Batch 540/2125] avg loss 0.00466372, throughput 13.0398K wps
[Epoch 11 Batch 570/2125] avg loss 0.0049172, throughput 13.0564K wps
[Epoch 11 Batch 600/2125] avg loss 0.00462921, throughput 13.0619K wps
[Epoch 11 Batch 630/2125] avg loss 0.00469985, throughput 13.0259K wps
[Epoch 11 Batch 660/2125] avg loss 0.00472906, throughput 13.0289K wps
[Epoch 11 Batch 690/2125] avg loss 0.00414245, throughput 13.0176K wps
[Epoch 11 Batch 720/2125] avg loss 0.00476775, throughput 13.0134K wps
[Epoch 11 Batch 750/2125] avg loss 0.00492387, throughput 13.0457K wps
[Epoch 11 Batch 780/2125] avg loss 0.00451052, throughput 13.0228K wps
[Epoch 11 Batch 810/2125] avg loss 0.00432217, throughput 13.0207K wps
[Epoch 11 Batch 840/2125] avg loss 0.00487958, throughput 13.0485K wps
[Epoch 11 Batch 870/2125] avg loss 0.00437792, throughput 13.0285K wps
[Epoch 11 Batch 900/2125] avg loss 0.0045119, throughput 13.0467K wps
[Epoch 11 Batch 930/2125] avg loss 0.00446216, throughput 12.9316K wps
[Epoch 11 Batch 960/2125] avg loss 0.00500506, throughput 13.0116K wps
[Epoch 11 Batch 990/2125] avg loss 0.00454064, throughput 12.9999K wps
[Epoch 11 Batch 1020/2125] avg loss 0.00466014, throughput 12.9937K wps
[Epoch 11 Batch 1050/2125] avg loss 0.00430362, throughput 13.0125K wps
[Epoch 11 Batch 1080/2125] avg loss 0.00492524, throughput 13.0399K wps
[Epoch 11 Batch 1110/2125] avg loss 0.00488086, throughput 13.029K wps
[Epoch 11 Batch 1140/2125] avg loss 0.00454233, throughput 12.9924K wps
[Epoch 11 Batch 1170/2125] avg loss 0.00471031, throughput 13.003K wps
[Epoch 11 Batch 1200/2125] avg loss 0.00410433, throughput 13.0004K wps
[Epoch 11 Batch 1230/2125] avg loss 0.00415804, throughput 13.0067K wps
[Epoch 11 Batch 1260/2125] avg loss 0.00489544, throughput 13.0179K wps
[Epoch 11 Batch 1290/2125] avg loss 0.00396588, throughput 13.0528K wps
[Epoch 11 Batch 1320/2125] avg loss 0.0045856, throughput 13.033K wps
[Epoch 11 Batch 1350/2125] avg loss 0.00455914, throughput 13.0325K wps
[Epoch 11 Batch 1380/2125] avg loss 0.00460889, throughput 13.0344K wps
[Epoch 11 Batch 1410/2125] avg loss 0.00409966, throughput 13.0117K wps
[Epoch 11 Batch 1440/2125] avg loss 0.00409682, throughput 13.0232K wps
[Epoch 11 Batch 1470/2125] avg loss 0.0041999, throughput 13.045K wps
[Epoch 11 Batch 1500/2125] avg loss 0.00450395, throughput 13.068K wps
[Epoch 11 Batch 1530/2125] avg loss 0.0042695, throughput 13.0452K wps
[Epoch 11 Batch 1560/2125] avg loss 0.00444548, throughput 13.027K wps
[Epoch 11 Batch 1590/2125] avg loss 0.00479238, throughput 13.021K wps
[Epoch 11 Batch 1620/2125] avg loss 0.00496977, throughput 13.0338K wps
[Epoch 11 Batch 1650/2125] avg loss 0.00491567, throughput 13.0126K wps
[Epoch 11 Batch 1680/2125] avg loss 0.00489227, throughput 13.0476K wps
[Epoch 11 Batch 1710/2125] avg loss 0.00440319, throughput 13.0159K wps
[Epoch 11 Batch 1740/2125] avg loss 0.00439514, throughput 13.0653K wps
[Epoch 11 Batch 1770/2125] avg loss 0.00453886, throughput 13.0327K wps
[Epoch 11 Batch 1800/2125] avg loss 0.00462189, throughput 13.0402K wps
[Epoch 11 Batch 1830/2125] avg loss 0.00457657, throughput 13.02K wps
[Epoch 11 Batch 1860/2125] avg loss 0.00406198, throughput 13.0334K wps
[Epoch 11 Batch 1890/2125] avg loss 0.00465675, throughput 13.0559K wps
[Epoch 11 Batch 1920/2125] avg loss 0.00422979, throughput 13.0488K wps
[Epoch 11 Batch 1950/2125] avg loss 0.00440141, throughput 13.0368K wps
[Epoch 11 Batch 1980/2125] avg loss 0.00442809, throughput 13.0155K wps
[Epoch 11 Batch 2010/2125] avg loss 0.00427759, throughput 13.0253K wps
[Epoch 11 Batch 2040/2125] avg loss 0.00415962, throughput 13.0413K wps
[Epoch 11 Batch 2070/2125] avg loss 0.00444463, throughput 13.0414K wps
[Epoch 11 Batch 2100/2125] avg loss 0.00442988, throughput 13.0312K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 11] train avg loss 0.00451254, test acc 0.9024, test avg loss 0.250863, throughput 13.0316K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 12 Batch 30/2125] avg loss 0.004432, throughput 13.2841K wps
[Epoch 12 Batch 60/2125] avg loss 0.00446594, throughput 12.9665K wps
[Epoch 12 Batch 90/2125] avg loss 0.00453141, throughput 13.063K wps
[Epoch 12 Batch 120/2125] avg loss 0.0045982, throughput 13.0393K wps
[Epoch 12 Batch 150/2125] avg loss 0.0046483, throughput 12.9611K wps
[Epoch 12 Batch 180/2125] avg loss 0.00429048, throughput 12.9318K wps
[Epoch 12 Batch 210/2125] avg loss 0.00426198, throughput 12.9452K wps
[Epoch 12 Batch 240/2125] avg loss 0.00413546, throughput 12.9567K wps
[Epoch 12 Batch 270/2125] avg loss 0.00410289, throughput 12.9418K wps
[Epoch 12 Batch 300/2125] avg loss 0.00440509, throughput 13.0687K wps
[Epoch 12 Batch 330/2125] avg loss 0.00438586, throughput 13.0402K wps
[Epoch 12 Batch 360/2125] avg loss 0.00450637, throughput 13.0449K wps
[Epoch 12 Batch 390/2125] avg loss 0.00459048, throughput 13.035K wps
[Epoch 12 Batch 420/2125] avg loss 0.00431939, throughput 13.0393K wps
[Epoch 12 Batch 450/2125] avg loss 0.00475438, throughput 13.0536K wps
[Epoch 12 Batch 480/2125] avg loss 0.00431199, throughput 13.048K wps
[Epoch 12 Batch 510/2125] avg loss 0.00460281, throughput 13.073K wps
[Epoch 12 Batch 540/2125] avg loss 0.00435745, throughput 13.019K wps
[Epoch 12 Batch 570/2125] avg loss 0.0040052, throughput 13.04K wps
[Epoch 12 Batch 600/2125] avg loss 0.00480842, throughput 13.0116K wps
[Epoch 12 Batch 630/2125] avg loss 0.00472182, throughput 13.059K wps
[Epoch 12 Batch 660/2125] avg loss 0.00424668, throughput 13.0272K wps
[Epoch 12 Batch 690/2125] avg loss 0.00410406, throughput 13.0401K wps
[Epoch 12 Batch 720/2125] avg loss 0.00429984, throughput 13.0275K wps
[Epoch 12 Batch 750/2125] avg loss 0.00423676, throughput 12.9965K wps
[Epoch 12 Batch 780/2125] avg loss 0.0040808, throughput 13.0154K wps
[Epoch 12 Batch 810/2125] avg loss 0.00414959, throughput 12.9903K wps
[Epoch 12 Batch 840/2125] avg loss 0.00440433, throughput 12.971K wps
[Epoch 12 Batch 870/2125] avg loss 0.00462165, throughput 13.0349K wps
[Epoch 12 Batch 900/2125] avg loss 0.00436974, throughput 13.0411K wps
[Epoch 12 Batch 930/2125] avg loss 0.00415744, throughput 13.03K wps
[Epoch 12 Batch 960/2125] avg loss 0.00450949, throughput 13.0499K wps
[Epoch 12 Batch 990/2125] avg loss 0.00455577, throughput 13.0404K wps
[Epoch 12 Batch 1020/2125] avg loss 0.00393174, throughput 13.0486K wps
[Epoch 12 Batch 1050/2125] avg loss 0.00432644, throughput 13.0371K wps
[Epoch 12 Batch 1080/2125] avg loss 0.00481055, throughput 13.0326K wps
[Epoch 12 Batch 1110/2125] avg loss 0.00497609, throughput 13.039K wps
[Epoch 12 Batch 1140/2125] avg loss 0.00438994, throughput 13.053K wps
[Epoch 12 Batch 1170/2125] avg loss 0.00408271, throughput 13.0379K wps
[Epoch 12 Batch 1200/2125] avg loss 0.00436413, throughput 13.035K wps
[Epoch 12 Batch 1230/2125] avg loss 0.0039559, throughput 13.0479K wps
[Epoch 12 Batch 1260/2125] avg loss 0.00460543, throughput 13.0506K wps
[Epoch 12 Batch 1290/2125] avg loss 0.00442131, throughput 13.0583K wps
[Epoch 12 Batch 1320/2125] avg loss 0.00423652, throughput 13.0731K wps
[Epoch 12 Batch 1350/2125] avg loss 0.00408657, throughput 13.0295K wps
[Epoch 12 Batch 1380/2125] avg loss 0.00455586, throughput 13.0487K wps
[Epoch 12 Batch 1410/2125] avg loss 0.00394801, throughput 13.0481K wps
[Epoch 12 Batch 1440/2125] avg loss 0.00436683, throughput 13.0338K wps
[Epoch 12 Batch 1470/2125] avg loss 0.00458529, throughput 13.0101K wps
[Epoch 12 Batch 1500/2125] avg loss 0.00453556, throughput 13.0469K wps
[Epoch 12 Batch 1530/2125] avg loss 0.00430159, throughput 13.0146K wps
[Epoch 12 Batch 1560/2125] avg loss 0.00400878, throughput 13.0104K wps
[Epoch 12 Batch 1590/2125] avg loss 0.00430305, throughput 13.0184K wps
[Epoch 12 Batch 1620/2125] avg loss 0.00442934, throughput 13.0173K wps
[Epoch 12 Batch 1650/2125] avg loss 0.00465626, throughput 13.0194K wps
[Epoch 12 Batch 1680/2125] avg loss 0.00442837, throughput 13.029K wps
[Epoch 12 Batch 1710/2125] avg loss 0.00424534, throughput 13.0293K wps
[Epoch 12 Batch 1740/2125] avg loss 0.00409261, throughput 13.0201K wps
[Epoch 12 Batch 1770/2125] avg loss 0.00453943, throughput 13.0204K wps
[Epoch 12 Batch 1800/2125] avg loss 0.00430907, throughput 13.0081K wps
[Epoch 12 Batch 1830/2125] avg loss 0.00423487, throughput 13.0536K wps
[Epoch 12 Batch 1860/2125] avg loss 0.00435848, throughput 13.0196K wps
[Epoch 12 Batch 1890/2125] avg loss 0.0039893, throughput 13.0404K wps
[Epoch 12 Batch 1920/2125] avg loss 0.0042889, throughput 13.064K wps
[Epoch 12 Batch 1950/2125] avg loss 0.004329, throughput 13.0278K wps
[Epoch 12 Batch 1980/2125] avg loss 0.00464105, throughput 12.9722K wps
[Epoch 12 Batch 2010/2125] avg loss 0.00430307, throughput 13.0253K wps
[Epoch 12 Batch 2040/2125] avg loss 0.00440488, throughput 12.9861K wps
[Epoch 12 Batch 2070/2125] avg loss 0.00417828, throughput 13.0273K wps
[Epoch 12 Batch 2100/2125] avg loss 0.00441316, throughput 12.9992K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 12] train avg loss 0.00436661, test acc 0.9051, test avg loss 0.243255, throughput 13.0287K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 13 Batch 30/2125] avg loss 0.00460449, throughput 13.2494K wps
[Epoch 13 Batch 60/2125] avg loss 0.00470978, throughput 13.0011K wps
[Epoch 13 Batch 90/2125] avg loss 0.00411841, throughput 13.0288K wps
[Epoch 13 Batch 120/2125] avg loss 0.00423631, throughput 13.0135K wps
[Epoch 13 Batch 150/2125] avg loss 0.00441583, throughput 12.9868K wps
[Epoch 13 Batch 180/2125] avg loss 0.00440673, throughput 13.0209K wps
[Epoch 13 Batch 210/2125] avg loss 0.00441126, throughput 13.0085K wps
[Epoch 13 Batch 240/2125] avg loss 0.00434363, throughput 13.0295K wps
[Epoch 13 Batch 270/2125] avg loss 0.00424654, throughput 13.0206K wps
[Epoch 13 Batch 300/2125] avg loss 0.00427065, throughput 13.0188K wps
[Epoch 13 Batch 330/2125] avg loss 0.00417862, throughput 13.0427K wps
[Epoch 13 Batch 360/2125] avg loss 0.0041734, throughput 13.0481K wps
[Epoch 13 Batch 390/2125] avg loss 0.00477934, throughput 12.9935K wps
[Epoch 13 Batch 420/2125] avg loss 0.00428013, throughput 12.9911K wps
[Epoch 13 Batch 450/2125] avg loss 0.00397696, throughput 13.0453K wps
[Epoch 13 Batch 480/2125] avg loss 0.00422973, throughput 13.0494K wps
[Epoch 13 Batch 510/2125] avg loss 0.00449374, throughput 13.0531K wps
[Epoch 13 Batch 540/2125] avg loss 0.00470054, throughput 13.0422K wps
[Epoch 13 Batch 570/2125] avg loss 0.00401116, throughput 13.0434K wps
[Epoch 13 Batch 600/2125] avg loss 0.00356173, throughput 13.034K wps
[Epoch 13 Batch 630/2125] avg loss 0.00418183, throughput 13.056K wps
[Epoch 13 Batch 660/2125] avg loss 0.00394343, throughput 13.0211K wps
[Epoch 13 Batch 690/2125] avg loss 0.00408041, throughput 13.005K wps
[Epoch 13 Batch 720/2125] avg loss 0.00374206, throughput 13.0524K wps
[Epoch 13 Batch 750/2125] avg loss 0.00430053, throughput 13.0188K wps
[Epoch 13 Batch 780/2125] avg loss 0.00417996, throughput 13.0388K wps
[Epoch 13 Batch 810/2125] avg loss 0.00426384, throughput 13.0594K wps
[Epoch 13 Batch 840/2125] avg loss 0.00380529, throughput 13.045K wps
[Epoch 13 Batch 870/2125] avg loss 0.00463473, throughput 13.0242K wps
[Epoch 13 Batch 900/2125] avg loss 0.00429358, throughput 13.0261K wps
[Epoch 13 Batch 930/2125] avg loss 0.00461155, throughput 13.0143K wps
[Epoch 13 Batch 960/2125] avg loss 0.00388917, throughput 13.0411K wps
[Epoch 13 Batch 990/2125] avg loss 0.00373124, throughput 13.0129K wps
[Epoch 13 Batch 1020/2125] avg loss 0.00409156, throughput 13.038K wps
[Epoch 13 Batch 1050/2125] avg loss 0.00427371, throughput 13.025K wps
[Epoch 13 Batch 1080/2125] avg loss 0.00429757, throughput 13.0567K wps
[Epoch 13 Batch 1110/2125] avg loss 0.00455252, throughput 13.038K wps
[Epoch 13 Batch 1140/2125] avg loss 0.00416289, throughput 13.0432K wps
[Epoch 13 Batch 1170/2125] avg loss 0.00447864, throughput 13.0103K wps
[Epoch 13 Batch 1200/2125] avg loss 0.00424998, throughput 13.0133K wps
[Epoch 13 Batch 1230/2125] avg loss 0.00409025, throughput 13.0256K wps
[Epoch 13 Batch 1260/2125] avg loss 0.00407764, throughput 13.03K wps
[Epoch 13 Batch 1290/2125] avg loss 0.00406332, throughput 13.0086K wps
[Epoch 13 Batch 1320/2125] avg loss 0.00439272, throughput 12.9849K wps
[Epoch 13 Batch 1350/2125] avg loss 0.00427864, throughput 13.0331K wps
[Epoch 13 Batch 1380/2125] avg loss 0.00474853, throughput 13.025K wps
[Epoch 13 Batch 1410/2125] avg loss 0.00374927, throughput 13.042K wps
[Epoch 13 Batch 1440/2125] avg loss 0.00432066, throughput 13.0442K wps
[Epoch 13 Batch 1470/2125] avg loss 0.00428813, throughput 13.048K wps
[Epoch 13 Batch 1500/2125] avg loss 0.00401085, throughput 13.0193K wps
[Epoch 13 Batch 1530/2125] avg loss 0.00417223, throughput 13.0399K wps
[Epoch 13 Batch 1560/2125] avg loss 0.00459627, throughput 13.0301K wps
[Epoch 13 Batch 1590/2125] avg loss 0.00414645, throughput 13.0252K wps
[Epoch 13 Batch 1620/2125] avg loss 0.00441551, throughput 13.0346K wps
[Epoch 13 Batch 1650/2125] avg loss 0.0041341, throughput 13.031K wps
[Epoch 13 Batch 1680/2125] avg loss 0.00446602, throughput 13.0457K wps
[Epoch 13 Batch 1710/2125] avg loss 0.00440814, throughput 13.0321K wps
[Epoch 13 Batch 1740/2125] avg loss 0.00435078, throughput 13.0389K wps
[Epoch 13 Batch 1770/2125] avg loss 0.00438745, throughput 13.0256K wps
[Epoch 13 Batch 1800/2125] avg loss 0.0040828, throughput 13.017K wps
[Epoch 13 Batch 1830/2125] avg loss 0.00393877, throughput 13.0415K wps
[Epoch 13 Batch 1860/2125] avg loss 0.00463056, throughput 13.0247K wps
[Epoch 13 Batch 1890/2125] avg loss 0.00412229, throughput 13.0279K wps
[Epoch 13 Batch 1920/2125] avg loss 0.00436618, throughput 13.0065K wps
[Epoch 13 Batch 1950/2125] avg loss 0.0040992, throughput 13.0211K wps
[Epoch 13 Batch 1980/2125] avg loss 0.00419972, throughput 13.0339K wps
[Epoch 13 Batch 2010/2125] avg loss 0.00406994, throughput 13.0776K wps
[Epoch 13 Batch 2040/2125] avg loss 0.00400746, throughput 13.0344K wps
[Epoch 13 Batch 2070/2125] avg loss 0.00382491, throughput 13.0291K wps
[Epoch 13 Batch 2100/2125] avg loss 0.00384657, throughput 13.0019K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 13] train avg loss 0.00423566, test acc 0.9065, test avg loss 0.239366, throughput 13.0317K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 14 Batch 30/2125] avg loss 0.00367983, throughput 13.3028K wps
[Epoch 14 Batch 60/2125] avg loss 0.00447138, throughput 12.9264K wps
[Epoch 14 Batch 90/2125] avg loss 0.00417891, throughput 12.9923K wps
[Epoch 14 Batch 120/2125] avg loss 0.00402629, throughput 13.0235K wps
[Epoch 14 Batch 150/2125] avg loss 0.00392608, throughput 13.0254K wps
[Epoch 14 Batch 180/2125] avg loss 0.00395683, throughput 13.0044K wps
[Epoch 14 Batch 210/2125] avg loss 0.004666, throughput 13.0162K wps
[Epoch 14 Batch 240/2125] avg loss 0.00406965, throughput 13.0122K wps
[Epoch 14 Batch 270/2125] avg loss 0.00445779, throughput 13.0194K wps
[Epoch 14 Batch 300/2125] avg loss 0.00418929, throughput 13.0345K wps
[Epoch 14 Batch 330/2125] avg loss 0.00384174, throughput 13.0231K wps
[Epoch 14 Batch 360/2125] avg loss 0.00404094, throughput 13.0162K wps
[Epoch 14 Batch 390/2125] avg loss 0.00356081, throughput 13.0255K wps
[Epoch 14 Batch 420/2125] avg loss 0.0038863, throughput 13.0412K wps
[Epoch 14 Batch 450/2125] avg loss 0.00419323, throughput 13.0286K wps
[Epoch 14 Batch 480/2125] avg loss 0.00374029, throughput 13.0392K wps
[Epoch 14 Batch 510/2125] avg loss 0.00385913, throughput 13.0353K wps
[Epoch 14 Batch 540/2125] avg loss 0.00402441, throughput 13.0005K wps
[Epoch 14 Batch 570/2125] avg loss 0.0039237, throughput 13.0437K wps
[Epoch 14 Batch 600/2125] avg loss 0.00410277, throughput 13.0155K wps
[Epoch 14 Batch 630/2125] avg loss 0.00426796, throughput 13.0336K wps
[Epoch 14 Batch 660/2125] avg loss 0.00422485, throughput 13.0021K wps
[Epoch 14 Batch 690/2125] avg loss 0.00417366, throughput 12.9975K wps
[Epoch 14 Batch 720/2125] avg loss 0.00378448, throughput 13.0459K wps
[Epoch 14 Batch 750/2125] avg loss 0.00393976, throughput 13.0512K wps
[Epoch 14 Batch 780/2125] avg loss 0.00425567, throughput 13.0262K wps
[Epoch 14 Batch 810/2125] avg loss 0.00421361, throughput 13.0215K wps
[Epoch 14 Batch 840/2125] avg loss 0.00401697, throughput 13.0016K wps
[Epoch 14 Batch 870/2125] avg loss 0.00433052, throughput 13.0437K wps
[Epoch 14 Batch 900/2125] avg loss 0.00426012, throughput 13.026K wps
[Epoch 14 Batch 930/2125] avg loss 0.00416908, throughput 13.0422K wps
[Epoch 14 Batch 960/2125] avg loss 0.00390944, throughput 13.0158K wps
[Epoch 14 Batch 990/2125] avg loss 0.00393775, throughput 13.0092K wps
[Epoch 14 Batch 1020/2125] avg loss 0.00400211, throughput 13.0123K wps
[Epoch 14 Batch 1050/2125] avg loss 0.00348282, throughput 13.0358K wps
[Epoch 14 Batch 1080/2125] avg loss 0.00344293, throughput 12.9928K wps
[Epoch 14 Batch 1110/2125] avg loss 0.00386038, throughput 13.0213K wps
[Epoch 14 Batch 1140/2125] avg loss 0.00409677, throughput 13.0145K wps
[Epoch 14 Batch 1170/2125] avg loss 0.00426101, throughput 13.0377K wps
[Epoch 14 Batch 1200/2125] avg loss 0.00365829, throughput 13.0443K wps
[Epoch 14 Batch 1230/2125] avg loss 0.00388916, throughput 13.0187K wps
[Epoch 14 Batch 1260/2125] avg loss 0.00358196, throughput 13.0402K wps
[Epoch 14 Batch 1290/2125] avg loss 0.00438942, throughput 13.0447K wps
[Epoch 14 Batch 1320/2125] avg loss 0.00392326, throughput 13.0288K wps
[Epoch 14 Batch 1350/2125] avg loss 0.00454918, throughput 13.0214K wps
[Epoch 14 Batch 1380/2125] avg loss 0.00374795, throughput 13.0103K wps
[Epoch 14 Batch 1410/2125] avg loss 0.00412156, throughput 12.9902K wps
[Epoch 14 Batch 1440/2125] avg loss 0.00373907, throughput 13.0353K wps
[Epoch 14 Batch 1470/2125] avg loss 0.0040285, throughput 13.0328K wps
[Epoch 14 Batch 1500/2125] avg loss 0.00427852, throughput 13.0113K wps
[Epoch 14 Batch 1530/2125] avg loss 0.00423327, throughput 13.0128K wps
[Epoch 14 Batch 1560/2125] avg loss 0.00420903, throughput 13.0129K wps
[Epoch 14 Batch 1590/2125] avg loss 0.00410675, throughput 13.062K wps
[Epoch 14 Batch 1620/2125] avg loss 0.00407414, throughput 13.0352K wps
[Epoch 14 Batch 1650/2125] avg loss 0.00419647, throughput 13.0275K wps
[Epoch 14 Batch 1680/2125] avg loss 0.00411973, throughput 13.0494K wps
[Epoch 14 Batch 1710/2125] avg loss 0.00431315, throughput 13.0361K wps
[Epoch 14 Batch 1740/2125] avg loss 0.00438475, throughput 13.0567K wps
[Epoch 14 Batch 1770/2125] avg loss 0.00403314, throughput 13.0361K wps
[Epoch 14 Batch 1800/2125] avg loss 0.00423233, throughput 13.0695K wps
[Epoch 14 Batch 1830/2125] avg loss 0.00434093, throughput 13.048K wps
[Epoch 14 Batch 1860/2125] avg loss 0.00439115, throughput 13.0675K wps
[Epoch 14 Batch 1890/2125] avg loss 0.00434374, throughput 13.0519K wps
[Epoch 14 Batch 1920/2125] avg loss 0.00437603, throughput 13.0511K wps
[Epoch 14 Batch 1950/2125] avg loss 0.00421272, throughput 13.0485K wps
[Epoch 14 Batch 1980/2125] avg loss 0.00435753, throughput 13.0174K wps
[Epoch 14 Batch 2010/2125] avg loss 0.0042752, throughput 13.0236K wps
[Epoch 14 Batch 2040/2125] avg loss 0.00455252, throughput 13.0198K wps
[Epoch 14 Batch 2070/2125] avg loss 0.00443396, throughput 13.0461K wps
[Epoch 14 Batch 2100/2125] avg loss 0.00433913, throughput 13.0335K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 14] train avg loss 0.00409843, test acc 0.9097, test avg loss 0.234511, throughput 13.0304K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 15 Batch 30/2125] avg loss 0.00392266, throughput 13.2945K wps
[Epoch 15 Batch 60/2125] avg loss 0.00384701, throughput 12.9616K wps
[Epoch 15 Batch 90/2125] avg loss 0.00372484, throughput 13.0459K wps
[Epoch 15 Batch 120/2125] avg loss 0.00414512, throughput 13.0354K wps
[Epoch 15 Batch 150/2125] avg loss 0.0036412, throughput 13.0113K wps
[Epoch 15 Batch 180/2125] avg loss 0.00371128, throughput 13.0082K wps
[Epoch 15 Batch 210/2125] avg loss 0.00384055, throughput 13.021K wps
[Epoch 15 Batch 240/2125] avg loss 0.00372327, throughput 13.0091K wps
[Epoch 15 Batch 270/2125] avg loss 0.0042703, throughput 13.033K wps
[Epoch 15 Batch 300/2125] avg loss 0.00415688, throughput 13.0451K wps
[Epoch 15 Batch 330/2125] avg loss 0.00407474, throughput 13.0012K wps
[Epoch 15 Batch 360/2125] avg loss 0.0040035, throughput 13.0441K wps
[Epoch 15 Batch 390/2125] avg loss 0.00372885, throughput 13.0004K wps
[Epoch 15 Batch 420/2125] avg loss 0.00439616, throughput 13.0313K wps
[Epoch 15 Batch 450/2125] avg loss 0.00368926, throughput 13.0421K wps
[Epoch 15 Batch 480/2125] avg loss 0.004246, throughput 13.0374K wps
[Epoch 15 Batch 510/2125] avg loss 0.00359273, throughput 13.0375K wps
[Epoch 15 Batch 540/2125] avg loss 0.00421585, throughput 13.0267K wps
[Epoch 15 Batch 570/2125] avg loss 0.00391381, throughput 13.0316K wps
[Epoch 15 Batch 600/2125] avg loss 0.00416672, throughput 13.0675K wps
[Epoch 15 Batch 630/2125] avg loss 0.00426655, throughput 13.0178K wps
[Epoch 15 Batch 660/2125] avg loss 0.00361569, throughput 13.0137K wps
[Epoch 15 Batch 690/2125] avg loss 0.00401387, throughput 13.0377K wps
[Epoch 15 Batch 720/2125] avg loss 0.00375065, throughput 13.0089K wps
[Epoch 15 Batch 750/2125] avg loss 0.00385859, throughput 13.0409K wps
[Epoch 15 Batch 780/2125] avg loss 0.00378184, throughput 13.0009K wps
[Epoch 15 Batch 810/2125] avg loss 0.00387312, throughput 13.0446K wps
[Epoch 15 Batch 840/2125] avg loss 0.00397715, throughput 13.0221K wps
[Epoch 15 Batch 870/2125] avg loss 0.0038148, throughput 13.0357K wps
[Epoch 15 Batch 900/2125] avg loss 0.0037526, throughput 13.0352K wps
[Epoch 15 Batch 930/2125] avg loss 0.0039986, throughput 13.027K wps
[Epoch 15 Batch 960/2125] avg loss 0.0041187, throughput 13.0129K wps
[Epoch 15 Batch 990/2125] avg loss 0.0038083, throughput 13.0502K wps
[Epoch 15 Batch 1020/2125] avg loss 0.00392418, throughput 13.0693K wps
[Epoch 15 Batch 1050/2125] avg loss 0.00422168, throughput 13.0296K wps
[Epoch 15 Batch 1080/2125] avg loss 0.00422378, throughput 13.0322K wps
[Epoch 15 Batch 1110/2125] avg loss 0.00400112, throughput 13.0389K wps
[Epoch 15 Batch 1140/2125] avg loss 0.00378117, throughput 13.0337K wps
[Epoch 15 Batch 1170/2125] avg loss 0.00382795, throughput 13.0236K wps
[Epoch 15 Batch 1200/2125] avg loss 0.00434121, throughput 13.0383K wps
[Epoch 15 Batch 1230/2125] avg loss 0.00419396, throughput 13.0272K wps
[Epoch 15 Batch 1260/2125] avg loss 0.0039982, throughput 13.0326K wps
[Epoch 15 Batch 1290/2125] avg loss 0.00391384, throughput 13.009K wps
[Epoch 15 Batch 1320/2125] avg loss 0.00419142, throughput 13.022K wps
[Epoch 15 Batch 1350/2125] avg loss 0.00384725, throughput 13.0346K wps
[Epoch 15 Batch 1380/2125] avg loss 0.0044773, throughput 13.0401K wps
[Epoch 15 Batch 1410/2125] avg loss 0.00388669, throughput 13.0678K wps
[Epoch 15 Batch 1440/2125] avg loss 0.00376397, throughput 13.0501K wps
[Epoch 15 Batch 1470/2125] avg loss 0.00388761, throughput 13.0498K wps
[Epoch 15 Batch 1500/2125] avg loss 0.00374548, throughput 13.0726K wps
[Epoch 15 Batch 1530/2125] avg loss 0.0044372, throughput 13.0517K wps
[Epoch 15 Batch 1560/2125] avg loss 0.00415714, throughput 13.0421K wps
[Epoch 15 Batch 1590/2125] avg loss 0.00418559, throughput 13.0461K wps
[Epoch 15 Batch 1620/2125] avg loss 0.00376983, throughput 13.0483K wps
[Epoch 15 Batch 1650/2125] avg loss 0.00398962, throughput 13.0438K wps
[Epoch 15 Batch 1680/2125] avg loss 0.0041679, throughput 13.0412K wps
[Epoch 15 Batch 1710/2125] avg loss 0.00422559, throughput 13.0309K wps
[Epoch 15 Batch 1740/2125] avg loss 0.00376126, throughput 13.0462K wps
[Epoch 15 Batch 1770/2125] avg loss 0.00398465, throughput 13.0053K wps
[Epoch 15 Batch 1800/2125] avg loss 0.00366801, throughput 13.05K wps
[Epoch 15 Batch 1830/2125] avg loss 0.0039749, throughput 13.0109K wps
[Epoch 15 Batch 1860/2125] avg loss 0.00402454, throughput 13.0352K wps
[Epoch 15 Batch 1890/2125] avg loss 0.00344344, throughput 13.0011K wps
[Epoch 15 Batch 1920/2125] avg loss 0.0042763, throughput 13.0362K wps
[Epoch 15 Batch 1950/2125] avg loss 0.00382176, throughput 13.038K wps
[Epoch 15 Batch 1980/2125] avg loss 0.00402336, throughput 13.0453K wps
[Epoch 15 Batch 2010/2125] avg loss 0.00376372, throughput 13.025K wps
[Epoch 15 Batch 2040/2125] avg loss 0.00333549, throughput 13.029K wps
[Epoch 15 Batch 2070/2125] avg loss 0.00397546, throughput 13.0408K wps
[Epoch 15 Batch 2100/2125] avg loss 0.00437819, throughput 13.0338K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 15] train avg loss 0.00396415, test acc 0.9083, test avg loss 0.234313, throughput 13.0356K wps
[Epoch 16 Batch 30/2125] avg loss 0.0040105, throughput 13.3377K wps
[Epoch 16 Batch 60/2125] avg loss 0.00364065, throughput 12.9788K wps
[Epoch 16 Batch 90/2125] avg loss 0.00375185, throughput 13.0245K wps
[Epoch 16 Batch 120/2125] avg loss 0.0041887, throughput 13.0504K wps
[Epoch 16 Batch 150/2125] avg loss 0.00454868, throughput 13.016K wps
[Epoch 16 Batch 180/2125] avg loss 0.00388421, throughput 13.0476K wps
[Epoch 16 Batch 210/2125] avg loss 0.00434047, throughput 12.9977K wps
[Epoch 16 Batch 240/2125] avg loss 0.00422487, throughput 13.0383K wps
[Epoch 16 Batch 270/2125] avg loss 0.00388613, throughput 13.0353K wps
[Epoch 16 Batch 300/2125] avg loss 0.00376504, throughput 13.0204K wps
[Epoch 16 Batch 330/2125] avg loss 0.00435938, throughput 12.9994K wps
[Epoch 16 Batch 360/2125] avg loss 0.00407995, throughput 13.0567K wps
[Epoch 16 Batch 390/2125] avg loss 0.00411306, throughput 13.0057K wps
[Epoch 16 Batch 420/2125] avg loss 0.00382999, throughput 13.0165K wps
[Epoch 16 Batch 450/2125] avg loss 0.00408007, throughput 13.0475K wps
[Epoch 16 Batch 480/2125] avg loss 0.00328257, throughput 13.0209K wps
[Epoch 16 Batch 510/2125] avg loss 0.00359095, throughput 13.0139K wps
[Epoch 16 Batch 540/2125] avg loss 0.00329728, throughput 13.0576K wps
[Epoch 16 Batch 570/2125] avg loss 0.00389053, throughput 13.0231K wps
[Epoch 16 Batch 600/2125] avg loss 0.00395923, throughput 13.0213K wps
[Epoch 16 Batch 630/2125] avg loss 0.00368593, throughput 13.0234K wps
[Epoch 16 Batch 660/2125] avg loss 0.00342329, throughput 13.0387K wps
[Epoch 16 Batch 690/2125] avg loss 0.00357226, throughput 13.0242K wps
[Epoch 16 Batch 720/2125] avg loss 0.00361954, throughput 13.0258K wps
[Epoch 16 Batch 750/2125] avg loss 0.00360877, throughput 13.0357K wps
[Epoch 16 Batch 780/2125] avg loss 0.00382116, throughput 12.9959K wps
[Epoch 16 Batch 810/2125] avg loss 0.00430665, throughput 12.9892K wps
[Epoch 16 Batch 840/2125] avg loss 0.00403294, throughput 12.9853K wps
[Epoch 16 Batch 870/2125] avg loss 0.00374107, throughput 13.0348K wps
[Epoch 16 Batch 900/2125] avg loss 0.00397517, throughput 13.0067K wps
[Epoch 16 Batch 930/2125] avg loss 0.00392918, throughput 13.0325K wps
[Epoch 16 Batch 960/2125] avg loss 0.00344152, throughput 13.0178K wps
[Epoch 16 Batch 990/2125] avg loss 0.00388759, throughput 12.9887K wps
[Epoch 16 Batch 1020/2125] avg loss 0.00390811, throughput 12.9676K wps
[Epoch 16 Batch 1050/2125] avg loss 0.00404407, throughput 13.0162K wps
[Epoch 16 Batch 1080/2125] avg loss 0.00361522, throughput 13.0237K wps
[Epoch 16 Batch 1110/2125] avg loss 0.0040172, throughput 12.9741K wps
[Epoch 16 Batch 1140/2125] avg loss 0.00398455, throughput 12.9983K wps
[Epoch 16 Batch 1170/2125] avg loss 0.00388881, throughput 13.0243K wps
[Epoch 16 Batch 1200/2125] avg loss 0.00355716, throughput 12.9943K wps
[Epoch 16 Batch 1230/2125] avg loss 0.00370882, throughput 12.9697K wps
[Epoch 16 Batch 1260/2125] avg loss 0.00319152, throughput 12.9308K wps
[Epoch 16 Batch 1290/2125] avg loss 0.00383574, throughput 13.0382K wps
[Epoch 16 Batch 1320/2125] avg loss 0.00365963, throughput 13.0696K wps
[Epoch 16 Batch 1350/2125] avg loss 0.00413337, throughput 13.0234K wps
[Epoch 16 Batch 1380/2125] avg loss 0.00401802, throughput 12.9862K wps
[Epoch 16 Batch 1410/2125] avg loss 0.00414913, throughput 12.9992K wps
[Epoch 16 Batch 1440/2125] avg loss 0.00368783, throughput 13.0659K wps
[Epoch 16 Batch 1470/2125] avg loss 0.0037652, throughput 13.0347K wps
[Epoch 16 Batch 1500/2125] avg loss 0.00432413, throughput 13.0131K wps
[Epoch 16 Batch 1530/2125] avg loss 0.00322032, throughput 12.9937K wps
[Epoch 16 Batch 1560/2125] avg loss 0.00378976, throughput 13.0212K wps
[Epoch 16 Batch 1590/2125] avg loss 0.00379524, throughput 13.0258K wps
[Epoch 16 Batch 1620/2125] avg loss 0.0042228, throughput 13.0409K wps
[Epoch 16 Batch 1650/2125] avg loss 0.00400566, throughput 13.0302K wps
[Epoch 16 Batch 1680/2125] avg loss 0.00424646, throughput 13.0377K wps
[Epoch 16 Batch 1710/2125] avg loss 0.0037202, throughput 13.0441K wps
[Epoch 16 Batch 1740/2125] avg loss 0.00368547, throughput 12.9991K wps
[Epoch 16 Batch 1770/2125] avg loss 0.00414498, throughput 13.0287K wps
[Epoch 16 Batch 1800/2125] avg loss 0.00354833, throughput 13.0068K wps
[Epoch 16 Batch 1830/2125] avg loss 0.00368223, throughput 12.9817K wps
[Epoch 16 Batch 1860/2125] avg loss 0.00409619, throughput 13.0371K wps
[Epoch 16 Batch 1890/2125] avg loss 0.00424413, throughput 13.02K wps
[Epoch 16 Batch 1920/2125] avg loss 0.00380615, throughput 13.0364K wps
[Epoch 16 Batch 1950/2125] avg loss 0.00411433, throughput 13.0434K wps
[Epoch 16 Batch 1980/2125] avg loss 0.00387477, throughput 13.0061K wps
[Epoch 16 Batch 2010/2125] avg loss 0.00406587, throughput 13.0227K wps
[Epoch 16 Batch 2040/2125] avg loss 0.00391458, throughput 13.0255K wps
[Epoch 16 Batch 2070/2125] avg loss 0.00341358, throughput 13.0184K wps
[Epoch 16 Batch 2100/2125] avg loss 0.00409376, throughput 13.0377K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 16] train avg loss 0.0038712, test acc 0.9135, test avg loss 0.226467, throughput 13.0231K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 17 Batch 30/2125] avg loss 0.00379517, throughput 13.3518K wps
[Epoch 17 Batch 60/2125] avg loss 0.00377721, throughput 12.9496K wps
[Epoch 17 Batch 90/2125] avg loss 0.00364753, throughput 12.9692K wps
[Epoch 17 Batch 120/2125] avg loss 0.00368579, throughput 13.019K wps
[Epoch 17 Batch 150/2125] avg loss 0.00323325, throughput 12.9799K wps
[Epoch 17 Batch 180/2125] avg loss 0.00381301, throughput 13.0404K wps
[Epoch 17 Batch 210/2125] avg loss 0.00407039, throughput 12.9997K wps
[Epoch 17 Batch 240/2125] avg loss 0.00377946, throughput 13.0276K wps
[Epoch 17 Batch 270/2125] avg loss 0.00415404, throughput 13.0358K wps
[Epoch 17 Batch 300/2125] avg loss 0.00380138, throughput 12.9896K wps
[Epoch 17 Batch 330/2125] avg loss 0.00436845, throughput 12.9914K wps
[Epoch 17 Batch 360/2125] avg loss 0.00353497, throughput 13.0186K wps
[Epoch 17 Batch 390/2125] avg loss 0.00350184, throughput 12.9906K wps
[Epoch 17 Batch 420/2125] avg loss 0.00376791, throughput 13.0177K wps
[Epoch 17 Batch 450/2125] avg loss 0.00386867, throughput 13.0079K wps
[Epoch 17 Batch 480/2125] avg loss 0.00347539, throughput 12.9235K wps
[Epoch 17 Batch 510/2125] avg loss 0.00372529, throughput 12.9004K wps
[Epoch 17 Batch 540/2125] avg loss 0.00368011, throughput 12.9571K wps
[Epoch 17 Batch 570/2125] avg loss 0.00396864, throughput 12.9057K wps
[Epoch 17 Batch 600/2125] avg loss 0.00380196, throughput 12.9509K wps
[Epoch 17 Batch 630/2125] avg loss 0.0036834, throughput 13.0475K wps
[Epoch 17 Batch 660/2125] avg loss 0.00429326, throughput 13.0389K wps
[Epoch 17 Batch 690/2125] avg loss 0.00352379, throughput 13.0648K wps
[Epoch 17 Batch 720/2125] avg loss 0.00358203, throughput 13.0566K wps
[Epoch 17 Batch 750/2125] avg loss 0.00370598, throughput 13.0417K wps
[Epoch 17 Batch 780/2125] avg loss 0.00400516, throughput 13.0488K wps
[Epoch 17 Batch 810/2125] avg loss 0.00335344, throughput 13.0504K wps
[Epoch 17 Batch 840/2125] avg loss 0.00401869, throughput 13.0346K wps
[Epoch 17 Batch 870/2125] avg loss 0.00368852, throughput 13.0578K wps
[Epoch 17 Batch 900/2125] avg loss 0.00359391, throughput 13.0434K wps
[Epoch 17 Batch 930/2125] avg loss 0.00358717, throughput 13.0455K wps
[Epoch 17 Batch 960/2125] avg loss 0.00394839, throughput 13.0414K wps
[Epoch 17 Batch 990/2125] avg loss 0.00354817, throughput 13.0407K wps
[Epoch 17 Batch 1020/2125] avg loss 0.00385378, throughput 13.0218K wps
[Epoch 17 Batch 1050/2125] avg loss 0.00390372, throughput 13.0537K wps
[Epoch 17 Batch 1080/2125] avg loss 0.00392813, throughput 13.0009K wps
[Epoch 17 Batch 1110/2125] avg loss 0.00382092, throughput 13.0129K wps
[Epoch 17 Batch 1140/2125] avg loss 0.00336209, throughput 13.0114K wps
[Epoch 17 Batch 1170/2125] avg loss 0.00392277, throughput 13.0191K wps
[Epoch 17 Batch 1200/2125] avg loss 0.00377152, throughput 13.0111K wps
[Epoch 17 Batch 1230/2125] avg loss 0.00382059, throughput 13.0381K wps
[Epoch 17 Batch 1260/2125] avg loss 0.00374385, throughput 13.0304K wps
[Epoch 17 Batch 1290/2125] avg loss 0.00384446, throughput 13.0253K wps
[Epoch 17 Batch 1320/2125] avg loss 0.00330637, throughput 13.0164K wps
[Epoch 17 Batch 1350/2125] avg loss 0.00364925, throughput 13.0377K wps
[Epoch 17 Batch 1380/2125] avg loss 0.00408684, throughput 13.0343K wps
[Epoch 17 Batch 1410/2125] avg loss 0.00366686, throughput 13.021K wps
[Epoch 17 Batch 1440/2125] avg loss 0.00380024, throughput 13.0405K wps
[Epoch 17 Batch 1470/2125] avg loss 0.00346509, throughput 13.0294K wps
[Epoch 17 Batch 1500/2125] avg loss 0.00370726, throughput 13.028K wps
[Epoch 17 Batch 1530/2125] avg loss 0.00388914, throughput 13.0391K wps
[Epoch 17 Batch 1560/2125] avg loss 0.00397772, throughput 13.0316K wps
[Epoch 17 Batch 1590/2125] avg loss 0.00379307, throughput 13.0357K wps
[Epoch 17 Batch 1620/2125] avg loss 0.00366674, throughput 13.0456K wps
[Epoch 17 Batch 1650/2125] avg loss 0.00400225, throughput 13.0358K wps
[Epoch 17 Batch 1680/2125] avg loss 0.0039912, throughput 13.0356K wps
[Epoch 17 Batch 1710/2125] avg loss 0.00394174, throughput 13.0016K wps
[Epoch 17 Batch 1740/2125] avg loss 0.00377986, throughput 13.0392K wps
[Epoch 17 Batch 1770/2125] avg loss 0.00341243, throughput 13.0415K wps
[Epoch 17 Batch 1800/2125] avg loss 0.00386588, throughput 13.0458K wps
[Epoch 17 Batch 1830/2125] avg loss 0.00371254, throughput 13.032K wps
[Epoch 17 Batch 1860/2125] avg loss 0.00379056, throughput 13.0183K wps
[Epoch 17 Batch 1890/2125] avg loss 0.00407398, throughput 13.0392K wps
[Epoch 17 Batch 1920/2125] avg loss 0.00400235, throughput 13.0254K wps
[Epoch 17 Batch 1950/2125] avg loss 0.00406403, throughput 13.0358K wps
[Epoch 17 Batch 1980/2125] avg loss 0.00373047, throughput 13.0398K wps
[Epoch 17 Batch 2010/2125] avg loss 0.00367087, throughput 12.9992K wps
[Epoch 17 Batch 2040/2125] avg loss 0.00363325, throughput 13.0195K wps
[Epoch 17 Batch 2070/2125] avg loss 0.00377753, throughput 13.0501K wps
[Epoch 17 Batch 2100/2125] avg loss 0.0034977, throughput 13.0339K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 17] train avg loss 0.00376819, test acc 0.9160, test avg loss 0.221875, throughput 13.0244K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 18 Batch 30/2125] avg loss 0.00380037, throughput 13.3159K wps
[Epoch 18 Batch 60/2125] avg loss 0.00357001, throughput 13.0363K wps
[Epoch 18 Batch 90/2125] avg loss 0.00363682, throughput 13.0628K wps
[Epoch 18 Batch 120/2125] avg loss 0.00394649, throughput 13.0385K wps
[Epoch 18 Batch 150/2125] avg loss 0.00383199, throughput 13.0407K wps
[Epoch 18 Batch 180/2125] avg loss 0.00328285, throughput 13.0436K wps
[Epoch 18 Batch 210/2125] avg loss 0.00329219, throughput 13.0237K wps
[Epoch 18 Batch 240/2125] avg loss 0.00345209, throughput 13.0489K wps
[Epoch 18 Batch 270/2125] avg loss 0.00347556, throughput 13.0514K wps
[Epoch 18 Batch 300/2125] avg loss 0.00348204, throughput 13.0446K wps
[Epoch 18 Batch 330/2125] avg loss 0.00360567, throughput 13.0552K wps
[Epoch 18 Batch 360/2125] avg loss 0.00406612, throughput 13.0444K wps
[Epoch 18 Batch 390/2125] avg loss 0.00328911, throughput 13.0352K wps
[Epoch 18 Batch 420/2125] avg loss 0.00357586, throughput 13.031K wps
[Epoch 18 Batch 450/2125] avg loss 0.00387403, throughput 13.0417K wps
[Epoch 18 Batch 480/2125] avg loss 0.00362561, throughput 13.0614K wps
[Epoch 18 Batch 510/2125] avg loss 0.00372555, throughput 13.0587K wps
[Epoch 18 Batch 540/2125] avg loss 0.00367479, throughput 13.0327K wps
[Epoch 18 Batch 570/2125] avg loss 0.00328914, throughput 13.0368K wps
[Epoch 18 Batch 600/2125] avg loss 0.00327707, throughput 13.0367K wps
[Epoch 18 Batch 630/2125] avg loss 0.00363672, throughput 13.0301K wps
[Epoch 18 Batch 660/2125] avg loss 0.00347647, throughput 13.0691K wps
[Epoch 18 Batch 690/2125] avg loss 0.00363193, throughput 13.0491K wps
[Epoch 18 Batch 720/2125] avg loss 0.0038453, throughput 13.0343K wps
[Epoch 18 Batch 750/2125] avg loss 0.00375554, throughput 13.0379K wps
[Epoch 18 Batch 780/2125] avg loss 0.00372045, throughput 13.021K wps
[Epoch 18 Batch 810/2125] avg loss 0.00386224, throughput 13.0335K wps
[Epoch 18 Batch 840/2125] avg loss 0.00397616, throughput 13.0352K wps
[Epoch 18 Batch 870/2125] avg loss 0.00353016, throughput 13.0496K wps
[Epoch 18 Batch 900/2125] avg loss 0.00375099, throughput 13.022K wps
[Epoch 18 Batch 930/2125] avg loss 0.00345263, throughput 13.027K wps
[Epoch 18 Batch 960/2125] avg loss 0.00353829, throughput 13.0371K wps
[Epoch 18 Batch 990/2125] avg loss 0.00329179, throughput 13.0526K wps
[Epoch 18 Batch 1020/2125] avg loss 0.00371043, throughput 13.0194K wps
[Epoch 18 Batch 1050/2125] avg loss 0.00358337, throughput 12.9961K wps
[Epoch 18 Batch 1080/2125] avg loss 0.00353572, throughput 13.0003K wps
[Epoch 18 Batch 1110/2125] avg loss 0.00350291, throughput 12.993K wps
[Epoch 18 Batch 1140/2125] avg loss 0.00414716, throughput 13.0241K wps
[Epoch 18 Batch 1170/2125] avg loss 0.00327178, throughput 12.9889K wps
[Epoch 18 Batch 1200/2125] avg loss 0.00339236, throughput 12.9798K wps
[Epoch 18 Batch 1230/2125] avg loss 0.00348411, throughput 12.9805K wps
[Epoch 18 Batch 1260/2125] avg loss 0.00334051, throughput 12.98K wps
[Epoch 18 Batch 1290/2125] avg loss 0.00343477, throughput 12.9806K wps
[Epoch 18 Batch 1320/2125] avg loss 0.00352003, throughput 13.0416K wps
[Epoch 18 Batch 1350/2125] avg loss 0.0040186, throughput 13.0327K wps
[Epoch 18 Batch 1380/2125] avg loss 0.00384736, throughput 13.0325K wps
[Epoch 18 Batch 1410/2125] avg loss 0.0036815, throughput 13.0458K wps
[Epoch 18 Batch 1440/2125] avg loss 0.00384073, throughput 13.0549K wps
[Epoch 18 Batch 1470/2125] avg loss 0.00411062, throughput 13.0043K wps
[Epoch 18 Batch 1500/2125] avg loss 0.00384613, throughput 13.0442K wps
[Epoch 18 Batch 1530/2125] avg loss 0.00359564, throughput 13.0374K wps
[Epoch 18 Batch 1560/2125] avg loss 0.00376017, throughput 13.0253K wps
[Epoch 18 Batch 1590/2125] avg loss 0.00356619, throughput 13.0186K wps
[Epoch 18 Batch 1620/2125] avg loss 0.00334339, throughput 13.0316K wps
[Epoch 18 Batch 1650/2125] avg loss 0.00372236, throughput 13.0249K wps
[Epoch 18 Batch 1680/2125] avg loss 0.00377962, throughput 13.0209K wps
[Epoch 18 Batch 1710/2125] avg loss 0.00351578, throughput 13.0169K wps
[Epoch 18 Batch 1740/2125] avg loss 0.00352864, throughput 13.0453K wps
[Epoch 18 Batch 1770/2125] avg loss 0.00375546, throughput 13.0144K wps
[Epoch 18 Batch 1800/2125] avg loss 0.00342293, throughput 13.0436K wps
[Epoch 18 Batch 1830/2125] avg loss 0.00361794, throughput 13.054K wps
[Epoch 18 Batch 1860/2125] avg loss 0.00415669, throughput 13.0179K wps
[Epoch 18 Batch 1890/2125] avg loss 0.00364101, throughput 13.0261K wps
[Epoch 18 Batch 1920/2125] avg loss 0.00367119, throughput 12.9876K wps
[Epoch 18 Batch 1950/2125] avg loss 0.00365678, throughput 13.01K wps
[Epoch 18 Batch 1980/2125] avg loss 0.00382094, throughput 13.0197K wps
[Epoch 18 Batch 2010/2125] avg loss 0.0038679, throughput 13.0138K wps
[Epoch 18 Batch 2040/2125] avg loss 0.00347497, throughput 13.0328K wps
[Epoch 18 Batch 2070/2125] avg loss 0.00364852, throughput 13.0684K wps
[Epoch 18 Batch 2100/2125] avg loss 0.00406773, throughput 13.0386K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 18] train avg loss 0.00364229, test acc 0.9160, test avg loss 0.221248, throughput 13.0342K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 19 Batch 30/2125] avg loss 0.00333032, throughput 13.2598K wps
[Epoch 19 Batch 60/2125] avg loss 0.003342, throughput 12.9597K wps
[Epoch 19 Batch 90/2125] avg loss 0.00324128, throughput 13.0417K wps
[Epoch 19 Batch 120/2125] avg loss 0.00342368, throughput 13.0353K wps
[Epoch 19 Batch 150/2125] avg loss 0.00323114, throughput 13.0269K wps
[Epoch 19 Batch 180/2125] avg loss 0.00348818, throughput 13.0342K wps
[Epoch 19 Batch 210/2125] avg loss 0.00371651, throughput 13.0189K wps
[Epoch 19 Batch 240/2125] avg loss 0.00380687, throughput 13.0325K wps
[Epoch 19 Batch 270/2125] avg loss 0.00359498, throughput 13.0386K wps
[Epoch 19 Batch 300/2125] avg loss 0.00337504, throughput 13.0288K wps
[Epoch 19 Batch 330/2125] avg loss 0.00372998, throughput 13.0446K wps
[Epoch 19 Batch 360/2125] avg loss 0.00325813, throughput 13.0365K wps
[Epoch 19 Batch 390/2125] avg loss 0.00336109, throughput 13.0497K wps
[Epoch 19 Batch 420/2125] avg loss 0.00364825, throughput 13.0513K wps
[Epoch 19 Batch 450/2125] avg loss 0.0036614, throughput 13.0585K wps
[Epoch 19 Batch 480/2125] avg loss 0.00383583, throughput 13.0262K wps
[Epoch 19 Batch 510/2125] avg loss 0.00381237, throughput 13.019K wps
[Epoch 19 Batch 540/2125] avg loss 0.00351115, throughput 13.0414K wps
[Epoch 19 Batch 570/2125] avg loss 0.00357023, throughput 13.0656K wps
[Epoch 19 Batch 600/2125] avg loss 0.00396972, throughput 13.0367K wps
[Epoch 19 Batch 630/2125] avg loss 0.00411243, throughput 13.0444K wps
[Epoch 19 Batch 660/2125] avg loss 0.00388361, throughput 13.0353K wps
[Epoch 19 Batch 690/2125] avg loss 0.00375108, throughput 13.0274K wps
[Epoch 19 Batch 720/2125] avg loss 0.00367564, throughput 13.0182K wps
[Epoch 19 Batch 750/2125] avg loss 0.00372109, throughput 13.0356K wps
[Epoch 19 Batch 780/2125] avg loss 0.00325143, throughput 13.0376K wps
[Epoch 19 Batch 810/2125] avg loss 0.00337119, throughput 13.0245K wps
[Epoch 19 Batch 840/2125] avg loss 0.00356101, throughput 13.0431K wps
[Epoch 19 Batch 870/2125] avg loss 0.00372445, throughput 13.0601K wps
[Epoch 19 Batch 900/2125] avg loss 0.00365581, throughput 13.1062K wps
[Epoch 19 Batch 930/2125] avg loss 0.00339489, throughput 13.0584K wps
[Epoch 19 Batch 960/2125] avg loss 0.0038211, throughput 13.0361K wps
[Epoch 19 Batch 990/2125] avg loss 0.00337443, throughput 13.0465K wps
[Epoch 19 Batch 1020/2125] avg loss 0.00371991, throughput 13.023K wps
[Epoch 19 Batch 1050/2125] avg loss 0.00321865, throughput 13.0379K wps
[Epoch 19 Batch 1080/2125] avg loss 0.00320431, throughput 13.0306K wps
[Epoch 19 Batch 1110/2125] avg loss 0.00382136, throughput 13.0221K wps
[Epoch 19 Batch 1140/2125] avg loss 0.00373115, throughput 13.0286K wps
[Epoch 19 Batch 1170/2125] avg loss 0.00318647, throughput 13.0222K wps
[Epoch 19 Batch 1200/2125] avg loss 0.00376787, throughput 13.0211K wps
[Epoch 19 Batch 1230/2125] avg loss 0.00319275, throughput 13.0592K wps
[Epoch 19 Batch 1260/2125] avg loss 0.00322838, throughput 13.0456K wps
[Epoch 19 Batch 1290/2125] avg loss 0.00375371, throughput 13.0283K wps
[Epoch 19 Batch 1320/2125] avg loss 0.00389086, throughput 13.022K wps
[Epoch 19 Batch 1350/2125] avg loss 0.00335745, throughput 13.0258K wps
[Epoch 19 Batch 1380/2125] avg loss 0.0037387, throughput 13.0563K wps
[Epoch 19 Batch 1410/2125] avg loss 0.00335793, throughput 13.0419K wps
[Epoch 19 Batch 1440/2125] avg loss 0.00335514, throughput 13.0353K wps
[Epoch 19 Batch 1470/2125] avg loss 0.00359117, throughput 13.0409K wps
[Epoch 19 Batch 1500/2125] avg loss 0.0037384, throughput 13.0575K wps
[Epoch 19 Batch 1530/2125] avg loss 0.00372485, throughput 13.0401K wps
[Epoch 19 Batch 1560/2125] avg loss 0.00363807, throughput 13.0425K wps
[Epoch 19 Batch 1590/2125] avg loss 0.00363723, throughput 13.0369K wps
[Epoch 19 Batch 1620/2125] avg loss 0.00332827, throughput 13.0262K wps
[Epoch 19 Batch 1650/2125] avg loss 0.00360477, throughput 13.0373K wps
[Epoch 19 Batch 1680/2125] avg loss 0.00358824, throughput 13.0326K wps
[Epoch 19 Batch 1710/2125] avg loss 0.00368528, throughput 13.0279K wps
[Epoch 19 Batch 1740/2125] avg loss 0.00329103, throughput 13.0271K wps
[Epoch 19 Batch 1770/2125] avg loss 0.00358826, throughput 13.0456K wps
[Epoch 19 Batch 1800/2125] avg loss 0.0036889, throughput 13.0329K wps
[Epoch 19 Batch 1830/2125] avg loss 0.00349941, throughput 13.0441K wps
[Epoch 19 Batch 1860/2125] avg loss 0.00365271, throughput 13.0542K wps
[Epoch 19 Batch 1890/2125] avg loss 0.00363691, throughput 13.0298K wps
[Epoch 19 Batch 1920/2125] avg loss 0.00348565, throughput 13.039K wps
[Epoch 19 Batch 1950/2125] avg loss 0.00356547, throughput 13.0211K wps
[Epoch 19 Batch 1980/2125] avg loss 0.00367103, throughput 13.031K wps
[Epoch 19 Batch 2010/2125] avg loss 0.00348293, throughput 13.0624K wps
[Epoch 19 Batch 2040/2125] avg loss 0.00376662, throughput 13.0166K wps
[Epoch 19 Batch 2070/2125] avg loss 0.00376964, throughput 13.0241K wps
[Epoch 19 Batch 2100/2125] avg loss 0.00316494, throughput 13.0344K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 19] train avg loss 0.00356107, test acc 0.9163, test avg loss 0.219648, throughput 13.0396K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 20 Batch 30/2125] avg loss 0.00325364, throughput 13.2655K wps
[Epoch 20 Batch 60/2125] avg loss 0.00379026, throughput 13.0186K wps
[Epoch 20 Batch 90/2125] avg loss 0.00315214, throughput 13.0256K wps
[Epoch 20 Batch 120/2125] avg loss 0.00399627, throughput 12.9779K wps
[Epoch 20 Batch 150/2125] avg loss 0.00361636, throughput 13.0191K wps
[Epoch 20 Batch 180/2125] avg loss 0.00330075, throughput 13.0155K wps
[Epoch 20 Batch 210/2125] avg loss 0.00319167, throughput 13.0262K wps
[Epoch 20 Batch 240/2125] avg loss 0.00340287, throughput 13.02K wps
[Epoch 20 Batch 270/2125] avg loss 0.00360868, throughput 13.0052K wps
[Epoch 20 Batch 300/2125] avg loss 0.00325528, throughput 13.0081K wps
[Epoch 20 Batch 330/2125] avg loss 0.00364653, throughput 13.0344K wps
[Epoch 20 Batch 360/2125] avg loss 0.00342333, throughput 13.0093K wps
[Epoch 20 Batch 390/2125] avg loss 0.00366197, throughput 13.0146K wps
[Epoch 20 Batch 420/2125] avg loss 0.00353612, throughput 13.0167K wps
[Epoch 20 Batch 450/2125] avg loss 0.00334337, throughput 13.0442K wps
[Epoch 20 Batch 480/2125] avg loss 0.00362846, throughput 13.0158K wps
[Epoch 20 Batch 510/2125] avg loss 0.00337423, throughput 13.0211K wps
[Epoch 20 Batch 540/2125] avg loss 0.00338118, throughput 13.0172K wps
[Epoch 20 Batch 570/2125] avg loss 0.00332546, throughput 13.024K wps
[Epoch 20 Batch 600/2125] avg loss 0.00282707, throughput 13.0363K wps
[Epoch 20 Batch 630/2125] avg loss 0.00364279, throughput 13.0074K wps
[Epoch 20 Batch 660/2125] avg loss 0.00375727, throughput 13.0384K wps
[Epoch 20 Batch 690/2125] avg loss 0.00350241, throughput 13.0635K wps
[Epoch 20 Batch 720/2125] avg loss 0.00359555, throughput 13.0307K wps
[Epoch 20 Batch 750/2125] avg loss 0.00376029, throughput 13.0329K wps
[Epoch 20 Batch 780/2125] avg loss 0.00308908, throughput 12.9988K wps
[Epoch 20 Batch 810/2125] avg loss 0.00291356, throughput 13.0382K wps
[Epoch 20 Batch 840/2125] avg loss 0.00351401, throughput 12.9962K wps
[Epoch 20 Batch 870/2125] avg loss 0.00332808, throughput 13.0391K wps
[Epoch 20 Batch 900/2125] avg loss 0.00340837, throughput 13.0279K wps
[Epoch 20 Batch 930/2125] avg loss 0.0037494, throughput 13.0332K wps
[Epoch 20 Batch 960/2125] avg loss 0.00365242, throughput 13.0459K wps
[Epoch 20 Batch 990/2125] avg loss 0.00319358, throughput 13.0289K wps
[Epoch 20 Batch 1020/2125] avg loss 0.00368465, throughput 13.0269K wps
[Epoch 20 Batch 1050/2125] avg loss 0.00332283, throughput 13.0262K wps
[Epoch 20 Batch 1080/2125] avg loss 0.0032852, throughput 13.0172K wps
[Epoch 20 Batch 1110/2125] avg loss 0.00334568, throughput 13.0368K wps
[Epoch 20 Batch 1140/2125] avg loss 0.00349287, throughput 13.0506K wps
[Epoch 20 Batch 1170/2125] avg loss 0.00365657, throughput 13.0292K wps
[Epoch 20 Batch 1200/2125] avg loss 0.00371424, throughput 13.0259K wps
[Epoch 20 Batch 1230/2125] avg loss 0.00354485, throughput 13.0646K wps
[Epoch 20 Batch 1260/2125] avg loss 0.00355154, throughput 13.0178K wps
[Epoch 20 Batch 1290/2125] avg loss 0.00357599, throughput 13.0281K wps
[Epoch 20 Batch 1320/2125] avg loss 0.00380203, throughput 13.0305K wps
[Epoch 20 Batch 1350/2125] avg loss 0.00374207, throughput 13.0403K wps
[Epoch 20 Batch 1380/2125] avg loss 0.0038371, throughput 13.0198K wps
[Epoch 20 Batch 1410/2125] avg loss 0.00297569, throughput 13.0243K wps
[Epoch 20 Batch 1440/2125] avg loss 0.00323038, throughput 13.0218K wps
[Epoch 20 Batch 1470/2125] avg loss 0.00305389, throughput 12.9926K wps
[Epoch 20 Batch 1500/2125] avg loss 0.00362442, throughput 13.0336K wps
[Epoch 20 Batch 1530/2125] avg loss 0.00349679, throughput 13.0296K wps
[Epoch 20 Batch 1560/2125] avg loss 0.00374167, throughput 13.0192K wps
[Epoch 20 Batch 1590/2125] avg loss 0.00334839, throughput 13.0356K wps
[Epoch 20 Batch 1620/2125] avg loss 0.00340319, throughput 13.0164K wps
[Epoch 20 Batch 1650/2125] avg loss 0.00355812, throughput 13.0315K wps
[Epoch 20 Batch 1680/2125] avg loss 0.00370905, throughput 13.0211K wps
[Epoch 20 Batch 1710/2125] avg loss 0.00313742, throughput 13.03K wps
[Epoch 20 Batch 1740/2125] avg loss 0.00335656, throughput 13.0191K wps
[Epoch 20 Batch 1770/2125] avg loss 0.00352875, throughput 13.0253K wps
[Epoch 20 Batch 1800/2125] avg loss 0.0033546, throughput 12.9902K wps
[Epoch 20 Batch 1830/2125] avg loss 0.00351648, throughput 13K wps
[Epoch 20 Batch 1860/2125] avg loss 0.0034119, throughput 13.0408K wps
[Epoch 20 Batch 1890/2125] avg loss 0.00351254, throughput 13.0294K wps
[Epoch 20 Batch 1920/2125] avg loss 0.00418875, throughput 13.0489K wps
[Epoch 20 Batch 1950/2125] avg loss 0.00329137, throughput 13.0409K wps
[Epoch 20 Batch 1980/2125] avg loss 0.00345598, throughput 12.9466K wps
[Epoch 20 Batch 2010/2125] avg loss 0.00352979, throughput 12.9914K wps
[Epoch 20 Batch 2040/2125] avg loss 0.0033633, throughput 12.9847K wps
[Epoch 20 Batch 2070/2125] avg loss 0.00354877, throughput 13.0223K wps
[Epoch 20 Batch 2100/2125] avg loss 0.00301909, throughput 13.0245K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 20] train avg loss 0.00346717, test acc 0.9195, test avg loss 0.214334, throughput 13.0259K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 21 Batch 30/2125] avg loss 0.00340804, throughput 13.2969K wps
[Epoch 21 Batch 60/2125] avg loss 0.00336072, throughput 13.044K wps
[Epoch 21 Batch 90/2125] avg loss 0.00348118, throughput 13.0187K wps
[Epoch 21 Batch 120/2125] avg loss 0.0035052, throughput 12.9911K wps
[Epoch 21 Batch 150/2125] avg loss 0.00376303, throughput 12.9896K wps
[Epoch 21 Batch 180/2125] avg loss 0.00350666, throughput 12.9931K wps
[Epoch 21 Batch 210/2125] avg loss 0.0035884, throughput 13.0193K wps
[Epoch 21 Batch 240/2125] avg loss 0.00328145, throughput 13.0097K wps
[Epoch 21 Batch 270/2125] avg loss 0.00329265, throughput 13.0162K wps
[Epoch 21 Batch 300/2125] avg loss 0.00353044, throughput 13.0347K wps
[Epoch 21 Batch 330/2125] avg loss 0.00321339, throughput 13.0463K wps
[Epoch 21 Batch 360/2125] avg loss 0.00340141, throughput 13.0524K wps
[Epoch 21 Batch 390/2125] avg loss 0.00341568, throughput 13.0181K wps
[Epoch 21 Batch 420/2125] avg loss 0.00317441, throughput 13.0019K wps
[Epoch 21 Batch 450/2125] avg loss 0.00317029, throughput 12.9935K wps
[Epoch 21 Batch 480/2125] avg loss 0.00317192, throughput 12.9988K wps
[Epoch 21 Batch 510/2125] avg loss 0.00355693, throughput 13.0029K wps
[Epoch 21 Batch 540/2125] avg loss 0.00356468, throughput 13.0387K wps
[Epoch 21 Batch 570/2125] avg loss 0.00317879, throughput 13.0432K wps
[Epoch 21 Batch 600/2125] avg loss 0.00337251, throughput 13.0577K wps
[Epoch 21 Batch 630/2125] avg loss 0.00312681, throughput 13.0619K wps
[Epoch 21 Batch 660/2125] avg loss 0.00366349, throughput 13.0371K wps
[Epoch 21 Batch 690/2125] avg loss 0.0031605, throughput 13.0552K wps
[Epoch 21 Batch 720/2125] avg loss 0.0035559, throughput 13.0192K wps
[Epoch 21 Batch 750/2125] avg loss 0.0036008, throughput 13.0241K wps
[Epoch 21 Batch 780/2125] avg loss 0.00319272, throughput 13.0295K wps
[Epoch 21 Batch 810/2125] avg loss 0.00337064, throughput 13.0398K wps
[Epoch 21 Batch 840/2125] avg loss 0.00328807, throughput 13.0527K wps
[Epoch 21 Batch 870/2125] avg loss 0.00308281, throughput 13.0322K wps
[Epoch 21 Batch 900/2125] avg loss 0.00352512, throughput 12.9781K wps
[Epoch 21 Batch 930/2125] avg loss 0.00411958, throughput 12.9952K wps
[Epoch 21 Batch 960/2125] avg loss 0.00348957, throughput 12.9843K wps
[Epoch 21 Batch 990/2125] avg loss 0.00316536, throughput 12.9996K wps
[Epoch 21 Batch 1020/2125] avg loss 0.00358585, throughput 13.0134K wps
[Epoch 21 Batch 1050/2125] avg loss 0.00311278, throughput 13.0488K wps
[Epoch 21 Batch 1080/2125] avg loss 0.00363, throughput 13.0154K wps
[Epoch 21 Batch 1110/2125] avg loss 0.00304965, throughput 13.0418K wps
[Epoch 21 Batch 1140/2125] avg loss 0.00310901, throughput 12.9995K wps
[Epoch 21 Batch 1170/2125] avg loss 0.00350101, throughput 12.9933K wps
[Epoch 21 Batch 1200/2125] avg loss 0.00332908, throughput 13.0197K wps
[Epoch 21 Batch 1230/2125] avg loss 0.00348729, throughput 12.978K wps
[Epoch 21 Batch 1260/2125] avg loss 0.00277682, throughput 13.0134K wps
[Epoch 21 Batch 1290/2125] avg loss 0.00361149, throughput 13.0157K wps
[Epoch 21 Batch 1320/2125] avg loss 0.00330757, throughput 12.9855K wps
[Epoch 21 Batch 1350/2125] avg loss 0.00397718, throughput 12.9654K wps
[Epoch 21 Batch 1380/2125] avg loss 0.00298495, throughput 12.9923K wps
[Epoch 21 Batch 1410/2125] avg loss 0.00351031, throughput 12.9967K wps
[Epoch 21 Batch 1440/2125] avg loss 0.00288164, throughput 13.0235K wps
[Epoch 21 Batch 1470/2125] avg loss 0.00340059, throughput 12.9959K wps
[Epoch 21 Batch 1500/2125] avg loss 0.00330231, throughput 12.9814K wps
[Epoch 21 Batch 1530/2125] avg loss 0.00345453, throughput 12.9993K wps
[Epoch 21 Batch 1560/2125] avg loss 0.00314334, throughput 12.9936K wps
[Epoch 21 Batch 1590/2125] avg loss 0.00351174, throughput 12.9914K wps
[Epoch 21 Batch 1620/2125] avg loss 0.00351746, throughput 13.0182K wps
[Epoch 21 Batch 1650/2125] avg loss 0.00322798, throughput 13.0048K wps
[Epoch 21 Batch 1680/2125] avg loss 0.00307843, throughput 13.0189K wps
[Epoch 21 Batch 1710/2125] avg loss 0.00338694, throughput 12.9955K wps
[Epoch 21 Batch 1740/2125] avg loss 0.00330563, throughput 12.9918K wps
[Epoch 21 Batch 1770/2125] avg loss 0.00327924, throughput 13.0565K wps
[Epoch 21 Batch 1800/2125] avg loss 0.00293007, throughput 13.0192K wps
[Epoch 21 Batch 1830/2125] avg loss 0.0034596, throughput 13.0222K wps
[Epoch 21 Batch 1860/2125] avg loss 0.00302376, throughput 12.9946K wps
[Epoch 21 Batch 1890/2125] avg loss 0.00296224, throughput 13.0242K wps
[Epoch 21 Batch 1920/2125] avg loss 0.00295917, throughput 13.0101K wps
[Epoch 21 Batch 1950/2125] avg loss 0.00339445, throughput 13.0338K wps
[Epoch 21 Batch 1980/2125] avg loss 0.00352154, throughput 13.0174K wps
[Epoch 21 Batch 2010/2125] avg loss 0.0035373, throughput 13.0245K wps
[Epoch 21 Batch 2040/2125] avg loss 0.00333003, throughput 13.0525K wps
[Epoch 21 Batch 2070/2125] avg loss 0.00353276, throughput 13.0368K wps
[Epoch 21 Batch 2100/2125] avg loss 0.00348206, throughput 13.0301K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 21] train avg loss 0.00335435, test acc 0.9198, test avg loss 0.212792, throughput 13.0198K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 22 Batch 30/2125] avg loss 0.00318668, throughput 13.2915K wps
[Epoch 22 Batch 60/2125] avg loss 0.00304544, throughput 12.9541K wps
[Epoch 22 Batch 90/2125] avg loss 0.00297212, throughput 13.06K wps
[Epoch 22 Batch 120/2125] avg loss 0.00314685, throughput 13.0575K wps
[Epoch 22 Batch 150/2125] avg loss 0.00322009, throughput 13.0076K wps
[Epoch 22 Batch 180/2125] avg loss 0.00328437, throughput 12.9814K wps
[Epoch 22 Batch 210/2125] avg loss 0.00316612, throughput 13.0069K wps
[Epoch 22 Batch 240/2125] avg loss 0.00351658, throughput 13.0273K wps
[Epoch 22 Batch 270/2125] avg loss 0.00312726, throughput 13.0414K wps
[Epoch 22 Batch 300/2125] avg loss 0.00309616, throughput 13.0052K wps
[Epoch 22 Batch 330/2125] avg loss 0.00344311, throughput 13.0102K wps
[Epoch 22 Batch 360/2125] avg loss 0.00332381, throughput 13.0075K wps
[Epoch 22 Batch 390/2125] avg loss 0.00385045, throughput 12.9978K wps
[Epoch 22 Batch 420/2125] avg loss 0.00340182, throughput 13.0281K wps
[Epoch 22 Batch 450/2125] avg loss 0.00351511, throughput 13.0332K wps
[Epoch 22 Batch 480/2125] avg loss 0.00324646, throughput 13.014K wps
[Epoch 22 Batch 510/2125] avg loss 0.00340652, throughput 13.0114K wps
[Epoch 22 Batch 540/2125] avg loss 0.00304091, throughput 13.0067K wps
[Epoch 22 Batch 570/2125] avg loss 0.00328353, throughput 13.0224K wps
[Epoch 22 Batch 600/2125] avg loss 0.00314292, throughput 13.0196K wps
[Epoch 22 Batch 630/2125] avg loss 0.00355252, throughput 13.0107K wps
[Epoch 22 Batch 660/2125] avg loss 0.00334283, throughput 13.0192K wps
[Epoch 22 Batch 690/2125] avg loss 0.00303291, throughput 12.998K wps
[Epoch 22 Batch 720/2125] avg loss 0.00308853, throughput 13.0114K wps
[Epoch 22 Batch 750/2125] avg loss 0.00380705, throughput 13.0347K wps
[Epoch 22 Batch 780/2125] avg loss 0.00300008, throughput 12.9199K wps
[Epoch 22 Batch 810/2125] avg loss 0.00316427, throughput 12.6961K wps
[Epoch 22 Batch 840/2125] avg loss 0.00277654, throughput 8.42493K wps
[Epoch 22 Batch 870/2125] avg loss 0.00326899, throughput 13.0371K wps
[Epoch 22 Batch 900/2125] avg loss 0.00288976, throughput 13.0279K wps
[Epoch 22 Batch 930/2125] avg loss 0.00323909, throughput 13.001K wps
[Epoch 22 Batch 960/2125] avg loss 0.00363276, throughput 13.0228K wps
[Epoch 22 Batch 990/2125] avg loss 0.00342493, throughput 12.9952K wps
[Epoch 22 Batch 1020/2125] avg loss 0.00323376, throughput 13.0169K wps
[Epoch 22 Batch 1050/2125] avg loss 0.00362549, throughput 13.0091K wps
[Epoch 22 Batch 1080/2125] avg loss 0.0030895, throughput 13.0193K wps
[Epoch 22 Batch 1110/2125] avg loss 0.00347705, throughput 13.0164K wps
[Epoch 22 Batch 1140/2125] avg loss 0.00327748, throughput 12.9768K wps
[Epoch 22 Batch 1170/2125] avg loss 0.00313504, throughput 12.967K wps
[Epoch 22 Batch 1200/2125] avg loss 0.00328433, throughput 12.9968K wps
[Epoch 22 Batch 1230/2125] avg loss 0.00331304, throughput 13.0197K wps
[Epoch 22 Batch 1260/2125] avg loss 0.00278453, throughput 13.0413K wps
[Epoch 22 Batch 1290/2125] avg loss 0.00351316, throughput 13.0165K wps
[Epoch 22 Batch 1320/2125] avg loss 0.00344273, throughput 13.0221K wps
[Epoch 22 Batch 1350/2125] avg loss 0.00319724, throughput 13.0037K wps
[Epoch 22 Batch 1380/2125] avg loss 0.00332809, throughput 13.033K wps
[Epoch 22 Batch 1410/2125] avg loss 0.00311263, throughput 13.0453K wps
[Epoch 22 Batch 1440/2125] avg loss 0.00352053, throughput 13.02K wps
[Epoch 22 Batch 1470/2125] avg loss 0.0029531, throughput 13.0578K wps
[Epoch 22 Batch 1500/2125] avg loss 0.00298112, throughput 13.0423K wps
[Epoch 22 Batch 1530/2125] avg loss 0.00321431, throughput 13.0317K wps
[Epoch 22 Batch 1560/2125] avg loss 0.0035819, throughput 13.0238K wps
[Epoch 22 Batch 1590/2125] avg loss 0.00347495, throughput 13.0104K wps
[Epoch 22 Batch 1620/2125] avg loss 0.00341578, throughput 13.0188K wps
[Epoch 22 Batch 1650/2125] avg loss 0.00357964, throughput 13.01K wps
[Epoch 22 Batch 1680/2125] avg loss 0.00330691, throughput 12.9975K wps
[Epoch 22 Batch 1710/2125] avg loss 0.0032361, throughput 13.0126K wps
[Epoch 22 Batch 1740/2125] avg loss 0.00312774, throughput 13.0054K wps
[Epoch 22 Batch 1770/2125] avg loss 0.00360743, throughput 13.0007K wps
[Epoch 22 Batch 1800/2125] avg loss 0.0031159, throughput 13.0144K wps
[Epoch 22 Batch 1830/2125] avg loss 0.00366774, throughput 12.9957K wps
[Epoch 22 Batch 1860/2125] avg loss 0.00317652, throughput 12.9983K wps
[Epoch 22 Batch 1890/2125] avg loss 0.00350644, throughput 13.0057K wps
[Epoch 22 Batch 1920/2125] avg loss 0.00332994, throughput 13.01K wps
[Epoch 22 Batch 1950/2125] avg loss 0.00344179, throughput 13.0313K wps
[Epoch 22 Batch 1980/2125] avg loss 0.00326501, throughput 13.0235K wps
[Epoch 22 Batch 2010/2125] avg loss 0.00323493, throughput 13.0128K wps
[Epoch 22 Batch 2040/2125] avg loss 0.00305001, throughput 12.9888K wps
[Epoch 22 Batch 2070/2125] avg loss 0.00304937, throughput 13.0207K wps
[Epoch 22 Batch 2100/2125] avg loss 0.00325244, throughput 13.0354K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 22] train avg loss 0.00327826, test acc 0.9193, test avg loss 0.211961, throughput 12.914K wps
[Epoch 23 Batch 30/2125] avg loss 0.00306055, throughput 13.3543K wps
[Epoch 23 Batch 60/2125] avg loss 0.00293111, throughput 12.9855K wps
[Epoch 23 Batch 90/2125] avg loss 0.0030889, throughput 13.0249K wps
[Epoch 23 Batch 120/2125] avg loss 0.00308024, throughput 13.0054K wps
[Epoch 23 Batch 150/2125] avg loss 0.00316672, throughput 13.0264K wps
[Epoch 23 Batch 180/2125] avg loss 0.00329708, throughput 13.0327K wps
[Epoch 23 Batch 210/2125] avg loss 0.00340299, throughput 13.0275K wps
[Epoch 23 Batch 240/2125] avg loss 0.00384433, throughput 13.0228K wps
[Epoch 23 Batch 270/2125] avg loss 0.00338777, throughput 13.0802K wps
[Epoch 23 Batch 300/2125] avg loss 0.00274949, throughput 13.05K wps
[Epoch 23 Batch 330/2125] avg loss 0.00274733, throughput 13.0564K wps
[Epoch 23 Batch 360/2125] avg loss 0.00266031, throughput 13.0419K wps
[Epoch 23 Batch 390/2125] avg loss 0.0033383, throughput 13.0374K wps
[Epoch 23 Batch 420/2125] avg loss 0.00303632, throughput 13.0509K wps
[Epoch 23 Batch 450/2125] avg loss 0.00328621, throughput 13.0098K wps
[Epoch 23 Batch 480/2125] avg loss 0.00308122, throughput 12.9676K wps
[Epoch 23 Batch 510/2125] avg loss 0.00292528, throughput 13.0271K wps
[Epoch 23 Batch 540/2125] avg loss 0.00336326, throughput 13.0305K wps
[Epoch 23 Batch 570/2125] avg loss 0.00319786, throughput 13.0084K wps
[Epoch 23 Batch 600/2125] avg loss 0.0030814, throughput 13.0297K wps
[Epoch 23 Batch 630/2125] avg loss 0.00332495, throughput 12.9816K wps
[Epoch 23 Batch 660/2125] avg loss 0.00301071, throughput 12.9991K wps
[Epoch 23 Batch 690/2125] avg loss 0.00316793, throughput 12.9871K wps
[Epoch 23 Batch 720/2125] avg loss 0.00320037, throughput 13.0068K wps
[Epoch 23 Batch 750/2125] avg loss 0.00320642, throughput 12.9771K wps
[Epoch 23 Batch 780/2125] avg loss 0.00350441, throughput 13.0149K wps
[Epoch 23 Batch 810/2125] avg loss 0.00322144, throughput 13.0315K wps
[Epoch 23 Batch 840/2125] avg loss 0.00302282, throughput 13.0402K wps
[Epoch 23 Batch 870/2125] avg loss 0.00337433, throughput 13.0426K wps
[Epoch 23 Batch 900/2125] avg loss 0.00340362, throughput 13.029K wps
[Epoch 23 Batch 930/2125] avg loss 0.00315156, throughput 13.0083K wps
[Epoch 23 Batch 960/2125] avg loss 0.00317939, throughput 13.032K wps
[Epoch 23 Batch 990/2125] avg loss 0.00325276, throughput 13.0308K wps
[Epoch 23 Batch 1020/2125] avg loss 0.00311891, throughput 13.0292K wps
[Epoch 23 Batch 1050/2125] avg loss 0.00307095, throughput 13.0274K wps
[Epoch 23 Batch 1080/2125] avg loss 0.00349585, throughput 13.0289K wps
[Epoch 23 Batch 1110/2125] avg loss 0.0029851, throughput 13.0359K wps
[Epoch 23 Batch 1140/2125] avg loss 0.0032356, throughput 13.0328K wps
[Epoch 23 Batch 1170/2125] avg loss 0.00322048, throughput 13.0185K wps
[Epoch 23 Batch 1200/2125] avg loss 0.00335645, throughput 13.0494K wps
[Epoch 23 Batch 1230/2125] avg loss 0.00319902, throughput 13.0547K wps
[Epoch 23 Batch 1260/2125] avg loss 0.00332523, throughput 13.0103K wps
[Epoch 23 Batch 1290/2125] avg loss 0.00328028, throughput 13.0447K wps
[Epoch 23 Batch 1320/2125] avg loss 0.00364756, throughput 13.0391K wps
[Epoch 23 Batch 1350/2125] avg loss 0.00342578, throughput 13.0056K wps
[Epoch 23 Batch 1380/2125] avg loss 0.00321361, throughput 13.0413K wps
[Epoch 23 Batch 1410/2125] avg loss 0.003254, throughput 13.0309K wps
[Epoch 23 Batch 1440/2125] avg loss 0.00332028, throughput 13.0412K wps
[Epoch 23 Batch 1470/2125] avg loss 0.00347713, throughput 13.0267K wps
[Epoch 23 Batch 1500/2125] avg loss 0.00324624, throughput 13.0503K wps
[Epoch 23 Batch 1530/2125] avg loss 0.00377973, throughput 13.0243K wps
[Epoch 23 Batch 1560/2125] avg loss 0.00311475, throughput 13.0509K wps
[Epoch 23 Batch 1590/2125] avg loss 0.00317187, throughput 13.0169K wps
[Epoch 23 Batch 1620/2125] avg loss 0.00318511, throughput 13.0353K wps
[Epoch 23 Batch 1650/2125] avg loss 0.00329183, throughput 13.0416K wps
[Epoch 23 Batch 1680/2125] avg loss 0.00331106, throughput 13.029K wps
[Epoch 23 Batch 1710/2125] avg loss 0.00320002, throughput 13.0378K wps
[Epoch 23 Batch 1740/2125] avg loss 0.00331277, throughput 13.0529K wps
[Epoch 23 Batch 1770/2125] avg loss 0.00329282, throughput 13.0528K wps
[Epoch 23 Batch 1800/2125] avg loss 0.00294953, throughput 13.0332K wps
[Epoch 23 Batch 1830/2125] avg loss 0.00314648, throughput 13.0442K wps
[Epoch 23 Batch 1860/2125] avg loss 0.00324519, throughput 13.0514K wps
[Epoch 23 Batch 1890/2125] avg loss 0.00349125, throughput 13.0214K wps
[Epoch 23 Batch 1920/2125] avg loss 0.00309204, throughput 13.0482K wps
[Epoch 23 Batch 1950/2125] avg loss 0.00318097, throughput 13.0356K wps
[Epoch 23 Batch 1980/2125] avg loss 0.00267071, throughput 13.0534K wps
[Epoch 23 Batch 2010/2125] avg loss 0.00324271, throughput 13.0643K wps
[Epoch 23 Batch 2040/2125] avg loss 0.00296586, throughput 13.0327K wps
[Epoch 23 Batch 2070/2125] avg loss 0.00359319, throughput 13.0287K wps
[Epoch 23 Batch 2100/2125] avg loss 0.00335765, throughput 13.0147K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 23] train avg loss 0.00321768, test acc 0.9227, test avg loss 0.207118, throughput 13.0343K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 24 Batch 30/2125] avg loss 0.00294067, throughput 13.2886K wps
[Epoch 24 Batch 60/2125] avg loss 0.00309955, throughput 12.9523K wps
[Epoch 24 Batch 90/2125] avg loss 0.00287012, throughput 12.9976K wps
[Epoch 24 Batch 120/2125] avg loss 0.00334498, throughput 13.0121K wps
[Epoch 24 Batch 150/2125] avg loss 0.00318319, throughput 13.0135K wps
[Epoch 24 Batch 180/2125] avg loss 0.00305166, throughput 13.0094K wps
[Epoch 24 Batch 210/2125] avg loss 0.0025984, throughput 12.9955K wps
[Epoch 24 Batch 240/2125] avg loss 0.00346175, throughput 12.9892K wps
[Epoch 24 Batch 270/2125] avg loss 0.00299901, throughput 12.9937K wps
[Epoch 24 Batch 300/2125] avg loss 0.00335715, throughput 12.9889K wps
[Epoch 24 Batch 330/2125] avg loss 0.00328475, throughput 13.0008K wps
[Epoch 24 Batch 360/2125] avg loss 0.00306112, throughput 13.0108K wps
[Epoch 24 Batch 390/2125] avg loss 0.00338318, throughput 13.0029K wps
[Epoch 24 Batch 420/2125] avg loss 0.002999, throughput 13.0424K wps
[Epoch 24 Batch 450/2125] avg loss 0.00283195, throughput 12.9822K wps
[Epoch 24 Batch 480/2125] avg loss 0.00307445, throughput 13.0595K wps
[Epoch 24 Batch 510/2125] avg loss 0.00324278, throughput 13.0458K wps
[Epoch 24 Batch 540/2125] avg loss 0.00346682, throughput 12.9661K wps
[Epoch 24 Batch 570/2125] avg loss 0.00321594, throughput 13.0235K wps
[Epoch 24 Batch 600/2125] avg loss 0.00274924, throughput 13.0544K wps
[Epoch 24 Batch 630/2125] avg loss 0.00299719, throughput 13.015K wps
[Epoch 24 Batch 660/2125] avg loss 0.00304961, throughput 13.0524K wps
[Epoch 24 Batch 690/2125] avg loss 0.00351129, throughput 13.0415K wps
[Epoch 24 Batch 720/2125] avg loss 0.00328309, throughput 13.038K wps
[Epoch 24 Batch 750/2125] avg loss 0.00301614, throughput 13.0298K wps
[Epoch 24 Batch 780/2125] avg loss 0.00296566, throughput 13.0366K wps
[Epoch 24 Batch 810/2125] avg loss 0.00323708, throughput 13.0638K wps
[Epoch 24 Batch 840/2125] avg loss 0.00300685, throughput 13.0286K wps
[Epoch 24 Batch 870/2125] avg loss 0.00304256, throughput 13.0278K wps
[Epoch 24 Batch 900/2125] avg loss 0.00339622, throughput 13.045K wps
[Epoch 24 Batch 930/2125] avg loss 0.0031182, throughput 13.0476K wps
[Epoch 24 Batch 960/2125] avg loss 0.00333787, throughput 13.0485K wps
[Epoch 24 Batch 990/2125] avg loss 0.00309191, throughput 13.0682K wps
[Epoch 24 Batch 1020/2125] avg loss 0.00291281, throughput 13.0399K wps
[Epoch 24 Batch 1050/2125] avg loss 0.00324282, throughput 13.0423K wps
[Epoch 24 Batch 1080/2125] avg loss 0.00324622, throughput 13.0391K wps
[Epoch 24 Batch 1110/2125] avg loss 0.00315665, throughput 13.0238K wps
[Epoch 24 Batch 1140/2125] avg loss 0.00291272, throughput 13.0261K wps
[Epoch 24 Batch 1170/2125] avg loss 0.00281253, throughput 13.0014K wps
[Epoch 24 Batch 1200/2125] avg loss 0.00282304, throughput 13.0647K wps
[Epoch 24 Batch 1230/2125] avg loss 0.00334127, throughput 13.0214K wps
[Epoch 24 Batch 1260/2125] avg loss 0.00336692, throughput 13.0262K wps
[Epoch 24 Batch 1290/2125] avg loss 0.00322606, throughput 13.0438K wps
[Epoch 24 Batch 1320/2125] avg loss 0.00377891, throughput 13.0267K wps
[Epoch 24 Batch 1350/2125] avg loss 0.00304466, throughput 13.0285K wps
[Epoch 24 Batch 1380/2125] avg loss 0.00302928, throughput 13.0143K wps
[Epoch 24 Batch 1410/2125] avg loss 0.00302931, throughput 13.051K wps
[Epoch 24 Batch 1440/2125] avg loss 0.00316829, throughput 13.032K wps
[Epoch 24 Batch 1470/2125] avg loss 0.00347236, throughput 13.0572K wps
[Epoch 24 Batch 1500/2125] avg loss 0.00303017, throughput 13.015K wps
[Epoch 24 Batch 1530/2125] avg loss 0.00281732, throughput 13.0428K wps
[Epoch 24 Batch 1560/2125] avg loss 0.00311397, throughput 13.0121K wps
[Epoch 24 Batch 1590/2125] avg loss 0.0031169, throughput 13.0062K wps
[Epoch 24 Batch 1620/2125] avg loss 0.0030749, throughput 13.0309K wps
[Epoch 24 Batch 1650/2125] avg loss 0.00331598, throughput 13.0458K wps
[Epoch 24 Batch 1680/2125] avg loss 0.00301142, throughput 13.0499K wps
[Epoch 24 Batch 1710/2125] avg loss 0.00294381, throughput 13.0458K wps
[Epoch 24 Batch 1740/2125] avg loss 0.00320512, throughput 13.0659K wps
[Epoch 24 Batch 1770/2125] avg loss 0.00333968, throughput 13.0274K wps
[Epoch 24 Batch 1800/2125] avg loss 0.0033911, throughput 13.0267K wps
[Epoch 24 Batch 1830/2125] avg loss 0.00258453, throughput 13.045K wps
[Epoch 24 Batch 1860/2125] avg loss 0.00312994, throughput 13.0048K wps
[Epoch 24 Batch 1890/2125] avg loss 0.0031295, throughput 13.0116K wps
[Epoch 24 Batch 1920/2125] avg loss 0.0031969, throughput 13.0241K wps
[Epoch 24 Batch 1950/2125] avg loss 0.00272014, throughput 13.0215K wps
[Epoch 24 Batch 1980/2125] avg loss 0.00306909, throughput 13.0403K wps
[Epoch 24 Batch 2010/2125] avg loss 0.00315642, throughput 13.0214K wps
[Epoch 24 Batch 2040/2125] avg loss 0.00291721, throughput 13.0294K wps
[Epoch 24 Batch 2070/2125] avg loss 0.00295224, throughput 13.0016K wps
[Epoch 24 Batch 2100/2125] avg loss 0.00342173, throughput 13.052K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 24] train avg loss 0.00312798, test acc 0.9228, test avg loss 0.206152, throughput 13.0307K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 25 Batch 30/2125] avg loss 0.00292068, throughput 13.3529K wps
[Epoch 25 Batch 60/2125] avg loss 0.00278864, throughput 12.9435K wps
[Epoch 25 Batch 90/2125] avg loss 0.00337375, throughput 13.0094K wps
[Epoch 25 Batch 120/2125] avg loss 0.00296434, throughput 13.0263K wps
[Epoch 25 Batch 150/2125] avg loss 0.00314952, throughput 13.023K wps
[Epoch 25 Batch 180/2125] avg loss 0.00318904, throughput 13.0082K wps
[Epoch 25 Batch 210/2125] avg loss 0.00302151, throughput 13.0352K wps
[Epoch 25 Batch 240/2125] avg loss 0.00307988, throughput 13.0159K wps
[Epoch 25 Batch 270/2125] avg loss 0.00296676, throughput 12.9874K wps
[Epoch 25 Batch 300/2125] avg loss 0.00286897, throughput 13.0332K wps
[Epoch 25 Batch 330/2125] avg loss 0.00329648, throughput 13.0287K wps
[Epoch 25 Batch 360/2125] avg loss 0.00300738, throughput 13.0053K wps
[Epoch 25 Batch 390/2125] avg loss 0.00352264, throughput 12.9791K wps
[Epoch 25 Batch 420/2125] avg loss 0.00319397, throughput 13.0251K wps
[Epoch 25 Batch 450/2125] avg loss 0.00303166, throughput 12.9982K wps
[Epoch 25 Batch 480/2125] avg loss 0.00311915, throughput 13.0127K wps
[Epoch 25 Batch 510/2125] avg loss 0.00315727, throughput 13.008K wps
[Epoch 25 Batch 540/2125] avg loss 0.00291827, throughput 13.0272K wps
[Epoch 25 Batch 570/2125] avg loss 0.00288903, throughput 13.0871K wps
[Epoch 25 Batch 600/2125] avg loss 0.00303731, throughput 13.0328K wps
[Epoch 25 Batch 630/2125] avg loss 0.0035898, throughput 13.0696K wps
[Epoch 25 Batch 660/2125] avg loss 0.00287993, throughput 13.0583K wps
[Epoch 25 Batch 690/2125] avg loss 0.00337025, throughput 13.0583K wps
[Epoch 25 Batch 720/2125] avg loss 0.00291224, throughput 13.0595K wps
[Epoch 25 Batch 750/2125] avg loss 0.00314566, throughput 13.0462K wps
[Epoch 25 Batch 780/2125] avg loss 0.00330971, throughput 13.0427K wps
[Epoch 25 Batch 810/2125] avg loss 0.00288207, throughput 13.0232K wps
[Epoch 25 Batch 840/2125] avg loss 0.0033177, throughput 13.0514K wps
[Epoch 25 Batch 870/2125] avg loss 0.00326009, throughput 13.0471K wps
[Epoch 25 Batch 900/2125] avg loss 0.00298231, throughput 13.0138K wps
[Epoch 25 Batch 930/2125] avg loss 0.00299829, throughput 13.023K wps
[Epoch 25 Batch 960/2125] avg loss 0.00294601, throughput 13.0204K wps
[Epoch 25 Batch 990/2125] avg loss 0.00295495, throughput 13.0472K wps
[Epoch 25 Batch 1020/2125] avg loss 0.00285826, throughput 13.0448K wps
[Epoch 25 Batch 1050/2125] avg loss 0.00311701, throughput 13.0418K wps
[Epoch 25 Batch 1080/2125] avg loss 0.00329676, throughput 13.0568K wps
[Epoch 25 Batch 1110/2125] avg loss 0.00329414, throughput 13.0443K wps
[Epoch 25 Batch 1140/2125] avg loss 0.00316096, throughput 13.0098K wps
[Epoch 25 Batch 1170/2125] avg loss 0.00300846, throughput 13.0058K wps
[Epoch 25 Batch 1200/2125] avg loss 0.00378204, throughput 12.9959K wps
[Epoch 25 Batch 1230/2125] avg loss 0.00307824, throughput 12.9638K wps
[Epoch 25 Batch 1260/2125] avg loss 0.00288608, throughput 13.0579K wps
[Epoch 25 Batch 1290/2125] avg loss 0.00318369, throughput 13.0092K wps
[Epoch 25 Batch 1320/2125] avg loss 0.00277051, throughput 12.9955K wps
[Epoch 25 Batch 1350/2125] avg loss 0.00318658, throughput 13.0082K wps
[Epoch 25 Batch 1380/2125] avg loss 0.00296936, throughput 13.0093K wps
[Epoch 25 Batch 1410/2125] avg loss 0.00331664, throughput 13.0095K wps
[Epoch 25 Batch 1440/2125] avg loss 0.0030627, throughput 13.0622K wps
[Epoch 25 Batch 1470/2125] avg loss 0.00320056, throughput 13.0415K wps
[Epoch 25 Batch 1500/2125] avg loss 0.00286351, throughput 13.0134K wps
[Epoch 25 Batch 1530/2125] avg loss 0.00322345, throughput 13.042K wps
[Epoch 25 Batch 1560/2125] avg loss 0.00295341, throughput 13.0346K wps
[Epoch 25 Batch 1590/2125] avg loss 0.00311607, throughput 13.0441K wps
[Epoch 25 Batch 1620/2125] avg loss 0.00269561, throughput 13.0529K wps
[Epoch 25 Batch 1650/2125] avg loss 0.00292716, throughput 13.0283K wps
[Epoch 25 Batch 1680/2125] avg loss 0.00333346, throughput 13.0662K wps
[Epoch 25 Batch 1710/2125] avg loss 0.00295182, throughput 13.0267K wps
[Epoch 25 Batch 1740/2125] avg loss 0.00302734, throughput 13.0299K wps
[Epoch 25 Batch 1770/2125] avg loss 0.00295506, throughput 13.0382K wps
[Epoch 25 Batch 1800/2125] avg loss 0.00334351, throughput 13.0501K wps
[Epoch 25 Batch 1830/2125] avg loss 0.00313268, throughput 12.9918K wps
[Epoch 25 Batch 1860/2125] avg loss 0.00282932, throughput 12.9063K wps
[Epoch 25 Batch 1890/2125] avg loss 0.00317964, throughput 12.9863K wps
[Epoch 25 Batch 1920/2125] avg loss 0.00305349, throughput 13.0021K wps
[Epoch 25 Batch 1950/2125] avg loss 0.00336013, throughput 12.9715K wps
[Epoch 25 Batch 1980/2125] avg loss 0.00308337, throughput 13.0578K wps
[Epoch 25 Batch 2010/2125] avg loss 0.00341878, throughput 13.0575K wps
[Epoch 25 Batch 2040/2125] avg loss 0.00317701, throughput 13.0208K wps
[Epoch 25 Batch 2070/2125] avg loss 0.00286192, throughput 13.0322K wps
[Epoch 25 Batch 2100/2125] avg loss 0.00307893, throughput 13.0311K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 25] train avg loss 0.00309488, test acc 0.9230, test avg loss 0.205644, throughput 13.0292K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 26 Batch 30/2125] avg loss 0.00309307, throughput 13.2949K wps
[Epoch 26 Batch 60/2125] avg loss 0.00257836, throughput 13.0084K wps
[Epoch 26 Batch 90/2125] avg loss 0.00273888, throughput 13.0429K wps
[Epoch 26 Batch 120/2125] avg loss 0.00252516, throughput 13.031K wps
[Epoch 26 Batch 150/2125] avg loss 0.00307221, throughput 13.0239K wps
[Epoch 26 Batch 180/2125] avg loss 0.00309371, throughput 12.987K wps
[Epoch 26 Batch 210/2125] avg loss 0.00279488, throughput 13.0022K wps
[Epoch 26 Batch 240/2125] avg loss 0.00307046, throughput 13.0235K wps
[Epoch 26 Batch 270/2125] avg loss 0.00348869, throughput 13.0159K wps
[Epoch 26 Batch 300/2125] avg loss 0.00271129, throughput 12.9957K wps
[Epoch 26 Batch 330/2125] avg loss 0.00333588, throughput 13.0412K wps
[Epoch 26 Batch 360/2125] avg loss 0.00320287, throughput 13.0234K wps
[Epoch 26 Batch 390/2125] avg loss 0.00257086, throughput 13.007K wps
[Epoch 26 Batch 420/2125] avg loss 0.00295498, throughput 13.0444K wps
[Epoch 26 Batch 450/2125] avg loss 0.00317472, throughput 13.0163K wps
[Epoch 26 Batch 480/2125] avg loss 0.00336029, throughput 13.0196K wps
[Epoch 26 Batch 510/2125] avg loss 0.00278362, throughput 12.9984K wps
[Epoch 26 Batch 540/2125] avg loss 0.00263381, throughput 13.0474K wps
[Epoch 26 Batch 570/2125] avg loss 0.0028381, throughput 13.0358K wps
[Epoch 26 Batch 600/2125] avg loss 0.00308725, throughput 13.0263K wps
[Epoch 26 Batch 630/2125] avg loss 0.00312486, throughput 13.0393K wps
[Epoch 26 Batch 660/2125] avg loss 0.00309178, throughput 13.0037K wps
[Epoch 26 Batch 690/2125] avg loss 0.00313275, throughput 13.0073K wps
[Epoch 26 Batch 720/2125] avg loss 0.00287197, throughput 13.0302K wps
[Epoch 26 Batch 750/2125] avg loss 0.00292382, throughput 13.046K wps
[Epoch 26 Batch 780/2125] avg loss 0.0024608, throughput 13.0348K wps
[Epoch 26 Batch 810/2125] avg loss 0.00285794, throughput 13.0201K wps
[Epoch 26 Batch 840/2125] avg loss 0.00279765, throughput 13.0332K wps
[Epoch 26 Batch 870/2125] avg loss 0.00293323, throughput 13.0217K wps
[Epoch 26 Batch 900/2125] avg loss 0.00308426, throughput 13.0272K wps
[Epoch 26 Batch 930/2125] avg loss 0.00290386, throughput 13.0393K wps
[Epoch 26 Batch 960/2125] avg loss 0.00300278, throughput 13.024K wps
[Epoch 26 Batch 990/2125] avg loss 0.00259192, throughput 13.0553K wps
[Epoch 26 Batch 1020/2125] avg loss 0.00293703, throughput 13.0098K wps
[Epoch 26 Batch 1050/2125] avg loss 0.00289736, throughput 13.0419K wps
[Epoch 26 Batch 1080/2125] avg loss 0.00320345, throughput 13.0147K wps
[Epoch 26 Batch 1110/2125] avg loss 0.00297505, throughput 13.049K wps
[Epoch 26 Batch 1140/2125] avg loss 0.00345878, throughput 13.028K wps
[Epoch 26 Batch 1170/2125] avg loss 0.00322019, throughput 13.0238K wps
[Epoch 26 Batch 1200/2125] avg loss 0.00305188, throughput 13.0474K wps
[Epoch 26 Batch 1230/2125] avg loss 0.00294006, throughput 12.9896K wps
[Epoch 26 Batch 1260/2125] avg loss 0.00304607, throughput 13.008K wps
[Epoch 26 Batch 1290/2125] avg loss 0.0027714, throughput 13.0152K wps
[Epoch 26 Batch 1320/2125] avg loss 0.00296277, throughput 13.0337K wps
[Epoch 26 Batch 1350/2125] avg loss 0.00324384, throughput 13.0269K wps
[Epoch 26 Batch 1380/2125] avg loss 0.00299103, throughput 13.004K wps
[Epoch 26 Batch 1410/2125] avg loss 0.00295821, throughput 13.0214K wps
[Epoch 26 Batch 1440/2125] avg loss 0.00326494, throughput 13.0072K wps
[Epoch 26 Batch 1470/2125] avg loss 0.00330849, throughput 13.0179K wps
[Epoch 26 Batch 1500/2125] avg loss 0.00310859, throughput 13.0165K wps
[Epoch 26 Batch 1530/2125] avg loss 0.00276463, throughput 13.0607K wps
[Epoch 26 Batch 1560/2125] avg loss 0.00307179, throughput 13.0429K wps
[Epoch 26 Batch 1590/2125] avg loss 0.00285371, throughput 13.0021K wps
[Epoch 26 Batch 1620/2125] avg loss 0.00321416, throughput 13.0228K wps
[Epoch 26 Batch 1650/2125] avg loss 0.00264973, throughput 13.0109K wps
[Epoch 26 Batch 1680/2125] avg loss 0.00315006, throughput 13.0395K wps
[Epoch 26 Batch 1710/2125] avg loss 0.00319111, throughput 13.0293K wps
[Epoch 26 Batch 1740/2125] avg loss 0.0030175, throughput 13.0254K wps
[Epoch 26 Batch 1770/2125] avg loss 0.00337297, throughput 12.993K wps
[Epoch 26 Batch 1800/2125] avg loss 0.00320539, throughput 13.0019K wps
[Epoch 26 Batch 1830/2125] avg loss 0.00352841, throughput 12.9941K wps
[Epoch 26 Batch 1860/2125] avg loss 0.0032199, throughput 13.0421K wps
[Epoch 26 Batch 1890/2125] avg loss 0.00290879, throughput 13.0479K wps
[Epoch 26 Batch 1920/2125] avg loss 0.00308589, throughput 13.0409K wps
[Epoch 26 Batch 1950/2125] avg loss 0.00292123, throughput 13.052K wps
[Epoch 26 Batch 1980/2125] avg loss 0.00341767, throughput 13.002K wps
[Epoch 26 Batch 2010/2125] avg loss 0.00310417, throughput 13.0168K wps
[Epoch 26 Batch 2040/2125] avg loss 0.00298741, throughput 13.0505K wps
[Epoch 26 Batch 2070/2125] avg loss 0.00327644, throughput 13.037K wps
[Epoch 26 Batch 2100/2125] avg loss 0.0031217, throughput 13.0452K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 26] train avg loss 0.00301997, test acc 0.9245, test avg loss 0.203586, throughput 13.0286K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 27 Batch 30/2125] avg loss 0.00319244, throughput 13.3212K wps
[Epoch 27 Batch 60/2125] avg loss 0.00285799, throughput 12.8773K wps
[Epoch 27 Batch 90/2125] avg loss 0.00264066, throughput 13.0011K wps
[Epoch 27 Batch 120/2125] avg loss 0.00282276, throughput 13.0091K wps
[Epoch 27 Batch 150/2125] avg loss 0.00268776, throughput 13.0314K wps
[Epoch 27 Batch 180/2125] avg loss 0.00276783, throughput 13.01K wps
[Epoch 27 Batch 210/2125] avg loss 0.00307882, throughput 13.0366K wps
[Epoch 27 Batch 240/2125] avg loss 0.00296978, throughput 13.0091K wps
[Epoch 27 Batch 270/2125] avg loss 0.00277039, throughput 12.993K wps
[Epoch 27 Batch 300/2125] avg loss 0.00295926, throughput 13.0044K wps
[Epoch 27 Batch 330/2125] avg loss 0.00275349, throughput 13.011K wps
[Epoch 27 Batch 360/2125] avg loss 0.00289722, throughput 13.0137K wps
[Epoch 27 Batch 390/2125] avg loss 0.00297479, throughput 13.0049K wps
[Epoch 27 Batch 420/2125] avg loss 0.0030401, throughput 13.0268K wps
[Epoch 27 Batch 450/2125] avg loss 0.00281136, throughput 13.0262K wps
[Epoch 27 Batch 480/2125] avg loss 0.00259404, throughput 13.0016K wps
[Epoch 27 Batch 510/2125] avg loss 0.00320533, throughput 12.9796K wps
[Epoch 27 Batch 540/2125] avg loss 0.00254449, throughput 12.9987K wps
[Epoch 27 Batch 570/2125] avg loss 0.00277248, throughput 12.9938K wps
[Epoch 27 Batch 600/2125] avg loss 0.00301628, throughput 12.9991K wps
[Epoch 27 Batch 630/2125] avg loss 0.003408, throughput 13.022K wps
[Epoch 27 Batch 660/2125] avg loss 0.00323789, throughput 12.9849K wps
[Epoch 27 Batch 690/2125] avg loss 0.00363754, throughput 12.9653K wps
[Epoch 27 Batch 720/2125] avg loss 0.00262299, throughput 12.9948K wps
[Epoch 27 Batch 750/2125] avg loss 0.00285409, throughput 12.9974K wps
[Epoch 27 Batch 780/2125] avg loss 0.00329771, throughput 13.0098K wps
[Epoch 27 Batch 810/2125] avg loss 0.00277051, throughput 13.0092K wps
[Epoch 27 Batch 840/2125] avg loss 0.0028322, throughput 12.9748K wps
[Epoch 27 Batch 870/2125] avg loss 0.00311069, throughput 12.9749K wps
[Epoch 27 Batch 900/2125] avg loss 0.00293332, throughput 13.0004K wps
[Epoch 27 Batch 930/2125] avg loss 0.00287628, throughput 13.0024K wps
[Epoch 27 Batch 960/2125] avg loss 0.00270789, throughput 13.0147K wps
[Epoch 27 Batch 990/2125] avg loss 0.00293458, throughput 12.9942K wps
[Epoch 27 Batch 1020/2125] avg loss 0.00303982, throughput 13.0105K wps
[Epoch 27 Batch 1050/2125] avg loss 0.00311624, throughput 13.0016K wps
[Epoch 27 Batch 1080/2125] avg loss 0.00280809, throughput 12.9918K wps
[Epoch 27 Batch 1110/2125] avg loss 0.00289151, throughput 12.832K wps
[Epoch 27 Batch 1140/2125] avg loss 0.00295967, throughput 12.9638K wps
[Epoch 27 Batch 1170/2125] avg loss 0.00324721, throughput 12.9822K wps
[Epoch 27 Batch 1200/2125] avg loss 0.00324302, throughput 12.9087K wps
[Epoch 27 Batch 1230/2125] avg loss 0.00305141, throughput 13.033K wps
[Epoch 27 Batch 1260/2125] avg loss 0.0031638, throughput 13.0208K wps
[Epoch 27 Batch 1290/2125] avg loss 0.00297849, throughput 13.0152K wps
[Epoch 27 Batch 1320/2125] avg loss 0.00282135, throughput 13.0335K wps
[Epoch 27 Batch 1350/2125] avg loss 0.00302895, throughput 13.0181K wps
[Epoch 27 Batch 1380/2125] avg loss 0.00273829, throughput 13.0291K wps
[Epoch 27 Batch 1410/2125] avg loss 0.00302266, throughput 13.043K wps
[Epoch 27 Batch 1440/2125] avg loss 0.00305696, throughput 13.024K wps
[Epoch 27 Batch 1470/2125] avg loss 0.00314967, throughput 13.0512K wps
[Epoch 27 Batch 1500/2125] avg loss 0.00300382, throughput 13.0596K wps
[Epoch 27 Batch 1530/2125] avg loss 0.00270575, throughput 13.0305K wps
[Epoch 27 Batch 1560/2125] avg loss 0.00283105, throughput 13.0272K wps
[Epoch 27 Batch 1590/2125] avg loss 0.0028308, throughput 13.04K wps
[Epoch 27 Batch 1620/2125] avg loss 0.00293227, throughput 13.0325K wps
[Epoch 27 Batch 1650/2125] avg loss 0.0028875, throughput 13.0352K wps
[Epoch 27 Batch 1680/2125] avg loss 0.00298829, throughput 13K wps
[Epoch 27 Batch 1710/2125] avg loss 0.00287406, throughput 13.0559K wps
[Epoch 27 Batch 1740/2125] avg loss 0.00329628, throughput 12.9953K wps
[Epoch 27 Batch 1770/2125] avg loss 0.00279459, throughput 13.0104K wps
[Epoch 27 Batch 1800/2125] avg loss 0.00278234, throughput 12.9918K wps
[Epoch 27 Batch 1830/2125] avg loss 0.00318185, throughput 13.0189K wps
[Epoch 27 Batch 1860/2125] avg loss 0.00276199, throughput 13.0538K wps
[Epoch 27 Batch 1890/2125] avg loss 0.00300837, throughput 13.0275K wps
[Epoch 27 Batch 1920/2125] avg loss 0.00284356, throughput 13.0047K wps
[Epoch 27 Batch 1950/2125] avg loss 0.00290827, throughput 12.9944K wps
[Epoch 27 Batch 1980/2125] avg loss 0.00280848, throughput 13.0379K wps
[Epoch 27 Batch 2010/2125] avg loss 0.00316284, throughput 13.0304K wps
[Epoch 27 Batch 2040/2125] avg loss 0.00277544, throughput 13.0269K wps
[Epoch 27 Batch 2070/2125] avg loss 0.00312495, throughput 13.0664K wps
[Epoch 27 Batch 2100/2125] avg loss 0.00301417, throughput 13.0172K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 27] train avg loss 0.00294998, test acc 0.9243, test avg loss 0.202228, throughput 13.0114K wps
[Epoch 28 Batch 30/2125] avg loss 0.00336237, throughput 13.3211K wps
[Epoch 28 Batch 60/2125] avg loss 0.00291456, throughput 12.9528K wps
[Epoch 28 Batch 90/2125] avg loss 0.00277753, throughput 13.0401K wps
[Epoch 28 Batch 120/2125] avg loss 0.00283257, throughput 13.0041K wps
[Epoch 28 Batch 150/2125] avg loss 0.00297559, throughput 13.0163K wps
[Epoch 28 Batch 180/2125] avg loss 0.00258766, throughput 13.0281K wps
[Epoch 28 Batch 210/2125] avg loss 0.00258651, throughput 13.0148K wps
[Epoch 28 Batch 240/2125] avg loss 0.00287242, throughput 13.0265K wps
[Epoch 28 Batch 270/2125] avg loss 0.00263855, throughput 13.0074K wps
[Epoch 28 Batch 300/2125] avg loss 0.00314527, throughput 12.9845K wps
[Epoch 28 Batch 330/2125] avg loss 0.00301611, throughput 13.0255K wps
[Epoch 28 Batch 360/2125] avg loss 0.00307036, throughput 13.0034K wps
[Epoch 28 Batch 390/2125] avg loss 0.00297649, throughput 12.9995K wps
[Epoch 28 Batch 420/2125] avg loss 0.00250895, throughput 13.0205K wps
[Epoch 28 Batch 450/2125] avg loss 0.00280949, throughput 12.9855K wps
[Epoch 28 Batch 480/2125] avg loss 0.00279183, throughput 12.999K wps
[Epoch 28 Batch 510/2125] avg loss 0.00272416, throughput 13.0203K wps
[Epoch 28 Batch 540/2125] avg loss 0.00285518, throughput 12.9772K wps
[Epoch 28 Batch 570/2125] avg loss 0.0027932, throughput 13.0171K wps
[Epoch 28 Batch 600/2125] avg loss 0.00274548, throughput 12.972K wps
[Epoch 28 Batch 630/2125] avg loss 0.0028952, throughput 13.0023K wps
[Epoch 28 Batch 660/2125] avg loss 0.00259371, throughput 13.0303K wps
[Epoch 28 Batch 690/2125] avg loss 0.00286326, throughput 12.9915K wps
[Epoch 28 Batch 720/2125] avg loss 0.00259887, throughput 12.9974K wps
[Epoch 28 Batch 750/2125] avg loss 0.00291254, throughput 12.9924K wps
[Epoch 28 Batch 780/2125] avg loss 0.00307075, throughput 13.0126K wps
[Epoch 28 Batch 810/2125] avg loss 0.00277041, throughput 13.0103K wps
[Epoch 28 Batch 840/2125] avg loss 0.0031052, throughput 13.0131K wps
[Epoch 28 Batch 870/2125] avg loss 0.00297204, throughput 13K wps
[Epoch 28 Batch 900/2125] avg loss 0.00293595, throughput 12.9881K wps
[Epoch 28 Batch 930/2125] avg loss 0.00316044, throughput 12.9979K wps
[Epoch 28 Batch 960/2125] avg loss 0.00278638, throughput 12.998K wps
[Epoch 28 Batch 990/2125] avg loss 0.00278444, throughput 13.0124K wps
[Epoch 28 Batch 1020/2125] avg loss 0.00306156, throughput 12.9949K wps
[Epoch 28 Batch 1050/2125] avg loss 0.00287778, throughput 12.9945K wps
[Epoch 28 Batch 1080/2125] avg loss 0.00311871, throughput 13.0292K wps
[Epoch 28 Batch 1110/2125] avg loss 0.00278357, throughput 12.9937K wps
[Epoch 28 Batch 1140/2125] avg loss 0.0024275, throughput 13.0085K wps
[Epoch 28 Batch 1170/2125] avg loss 0.00274181, throughput 13.0241K wps
[Epoch 28 Batch 1200/2125] avg loss 0.00277957, throughput 13.0262K wps
[Epoch 28 Batch 1230/2125] avg loss 0.00339099, throughput 12.9935K wps
[Epoch 28 Batch 1260/2125] avg loss 0.00245832, throughput 13.0074K wps
[Epoch 28 Batch 1290/2125] avg loss 0.0029978, throughput 13.0254K wps
[Epoch 28 Batch 1320/2125] avg loss 0.00273496, throughput 13.0397K wps
[Epoch 28 Batch 1350/2125] avg loss 0.00268716, throughput 13.021K wps
[Epoch 28 Batch 1380/2125] avg loss 0.00315445, throughput 12.9805K wps
[Epoch 28 Batch 1410/2125] avg loss 0.00325114, throughput 13.0068K wps
[Epoch 28 Batch 1440/2125] avg loss 0.00240512, throughput 12.9914K wps
[Epoch 28 Batch 1470/2125] avg loss 0.00313427, throughput 13.0069K wps
[Epoch 28 Batch 1500/2125] avg loss 0.00282848, throughput 13.0313K wps
[Epoch 28 Batch 1530/2125] avg loss 0.00284285, throughput 13.0816K wps
[Epoch 28 Batch 1560/2125] avg loss 0.00257567, throughput 13.0127K wps
[Epoch 28 Batch 1590/2125] avg loss 0.00275417, throughput 13.0259K wps
[Epoch 28 Batch 1620/2125] avg loss 0.00264211, throughput 13.0012K wps
[Epoch 28 Batch 1650/2125] avg loss 0.00290519, throughput 13.026K wps
[Epoch 28 Batch 1680/2125] avg loss 0.00253793, throughput 13.0102K wps
[Epoch 28 Batch 1710/2125] avg loss 0.00327842, throughput 12.9976K wps
[Epoch 28 Batch 1740/2125] avg loss 0.00254702, throughput 13.013K wps
[Epoch 28 Batch 1770/2125] avg loss 0.00319532, throughput 13.0001K wps
[Epoch 28 Batch 1800/2125] avg loss 0.0030645, throughput 13.0238K wps
[Epoch 28 Batch 1830/2125] avg loss 0.00281729, throughput 13.0288K wps
[Epoch 28 Batch 1860/2125] avg loss 0.00249438, throughput 13.0127K wps
[Epoch 28 Batch 1890/2125] avg loss 0.00315183, throughput 13.0115K wps
[Epoch 28 Batch 1920/2125] avg loss 0.00327628, throughput 13.0034K wps
[Epoch 28 Batch 1950/2125] avg loss 0.00294528, throughput 13K wps
[Epoch 28 Batch 1980/2125] avg loss 0.0035058, throughput 13.0033K wps
[Epoch 28 Batch 2010/2125] avg loss 0.00279053, throughput 13.0321K wps
[Epoch 28 Batch 2040/2125] avg loss 0.00290919, throughput 13.0183K wps
[Epoch 28 Batch 2070/2125] avg loss 0.00251492, throughput 13.0478K wps
[Epoch 28 Batch 2100/2125] avg loss 0.00292977, throughput 13.0086K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 28] train avg loss 0.00287539, test acc 0.9243, test avg loss 0.202864, throughput 13.0144K wps
[Epoch 29 Batch 30/2125] avg loss 0.00305574, throughput 13.3534K wps
[Epoch 29 Batch 60/2125] avg loss 0.00271966, throughput 13.0012K wps
[Epoch 29 Batch 90/2125] avg loss 0.00295136, throughput 12.9989K wps
[Epoch 29 Batch 120/2125] avg loss 0.00278037, throughput 12.9932K wps
[Epoch 29 Batch 150/2125] avg loss 0.00259505, throughput 13.0008K wps
[Epoch 29 Batch 180/2125] avg loss 0.00297848, throughput 12.9925K wps
[Epoch 29 Batch 210/2125] avg loss 0.00293489, throughput 12.9888K wps
[Epoch 29 Batch 240/2125] avg loss 0.00237541, throughput 13.011K wps
[Epoch 29 Batch 270/2125] avg loss 0.00284443, throughput 12.9966K wps
[Epoch 29 Batch 300/2125] avg loss 0.00263351, throughput 13.0011K wps
[Epoch 29 Batch 330/2125] avg loss 0.00282423, throughput 12.9841K wps
[Epoch 29 Batch 360/2125] avg loss 0.00272761, throughput 13.0609K wps
[Epoch 29 Batch 390/2125] avg loss 0.0029992, throughput 13.055K wps
[Epoch 29 Batch 420/2125] avg loss 0.00317551, throughput 13.058K wps
[Epoch 29 Batch 450/2125] avg loss 0.00301045, throughput 13.0527K wps
[Epoch 29 Batch 480/2125] avg loss 0.00283299, throughput 13.0462K wps
[Epoch 29 Batch 510/2125] avg loss 0.00302885, throughput 13.022K wps
[Epoch 29 Batch 540/2125] avg loss 0.0027779, throughput 13.0141K wps
[Epoch 29 Batch 570/2125] avg loss 0.00311266, throughput 13.0609K wps
[Epoch 29 Batch 600/2125] avg loss 0.00295852, throughput 13.0178K wps
[Epoch 29 Batch 630/2125] avg loss 0.0028018, throughput 13.0039K wps
[Epoch 29 Batch 660/2125] avg loss 0.00255468, throughput 13.0346K wps
[Epoch 29 Batch 690/2125] avg loss 0.00292847, throughput 13.0295K wps
[Epoch 29 Batch 720/2125] avg loss 0.00313821, throughput 13.0212K wps
[Epoch 29 Batch 750/2125] avg loss 0.00257754, throughput 13.0403K wps
[Epoch 29 Batch 780/2125] avg loss 0.0028968, throughput 13.028K wps
[Epoch 29 Batch 810/2125] avg loss 0.00296435, throughput 13.0429K wps
[Epoch 29 Batch 840/2125] avg loss 0.00264712, throughput 13.0168K wps
[Epoch 29 Batch 870/2125] avg loss 0.00300997, throughput 13.0168K wps
[Epoch 29 Batch 900/2125] avg loss 0.00248172, throughput 13.0316K wps
[Epoch 29 Batch 930/2125] avg loss 0.00286541, throughput 13.0261K wps
[Epoch 29 Batch 960/2125] avg loss 0.0027587, throughput 13.0177K wps
[Epoch 29 Batch 990/2125] avg loss 0.00261261, throughput 13.035K wps
[Epoch 29 Batch 1020/2125] avg loss 0.00264116, throughput 13.0365K wps
[Epoch 29 Batch 1050/2125] avg loss 0.00306445, throughput 13.0417K wps
[Epoch 29 Batch 1080/2125] avg loss 0.00296102, throughput 13.0207K wps
[Epoch 29 Batch 1110/2125] avg loss 0.00263544, throughput 13.0009K wps
[Epoch 29 Batch 1140/2125] avg loss 0.00293647, throughput 13.0252K wps
[Epoch 29 Batch 1170/2125] avg loss 0.0029169, throughput 13.0004K wps
[Epoch 29 Batch 1200/2125] avg loss 0.00275281, throughput 13.0303K wps
[Epoch 29 Batch 1230/2125] avg loss 0.0029503, throughput 13.0258K wps
[Epoch 29 Batch 1260/2125] avg loss 0.00294732, throughput 13.0187K wps
[Epoch 29 Batch 1290/2125] avg loss 0.00292207, throughput 13.0273K wps
[Epoch 29 Batch 1320/2125] avg loss 0.00274032, throughput 13.0199K wps
[Epoch 29 Batch 1350/2125] avg loss 0.0026251, throughput 13.039K wps
[Epoch 29 Batch 1380/2125] avg loss 0.00292937, throughput 13.0516K wps
[Epoch 29 Batch 1410/2125] avg loss 0.00271435, throughput 13.007K wps
[Epoch 29 Batch 1440/2125] avg loss 0.00274994, throughput 13.0407K wps
[Epoch 29 Batch 1470/2125] avg loss 0.00286386, throughput 13.0321K wps
[Epoch 29 Batch 1500/2125] avg loss 0.00288733, throughput 13.0395K wps
[Epoch 29 Batch 1530/2125] avg loss 0.00252576, throughput 13.0412K wps
[Epoch 29 Batch 1560/2125] avg loss 0.00307799, throughput 13.0305K wps
[Epoch 29 Batch 1590/2125] avg loss 0.00325709, throughput 13.0413K wps
[Epoch 29 Batch 1620/2125] avg loss 0.00284818, throughput 13.0256K wps
[Epoch 29 Batch 1650/2125] avg loss 0.0029414, throughput 13.033K wps
[Epoch 29 Batch 1680/2125] avg loss 0.00317533, throughput 13.014K wps
[Epoch 29 Batch 1710/2125] avg loss 0.00270608, throughput 13.039K wps
[Epoch 29 Batch 1740/2125] avg loss 0.00329757, throughput 13.0199K wps
[Epoch 29 Batch 1770/2125] avg loss 0.00302532, throughput 13.0278K wps
[Epoch 29 Batch 1800/2125] avg loss 0.00295814, throughput 13.0185K wps
[Epoch 29 Batch 1830/2125] avg loss 0.00245664, throughput 13.0598K wps
[Epoch 29 Batch 1860/2125] avg loss 0.002851, throughput 12.9845K wps
[Epoch 29 Batch 1890/2125] avg loss 0.00278669, throughput 13.0373K wps
[Epoch 29 Batch 1920/2125] avg loss 0.00263952, throughput 13.0592K wps
[Epoch 29 Batch 1950/2125] avg loss 0.00303859, throughput 13.005K wps
[Epoch 29 Batch 1980/2125] avg loss 0.00266955, throughput 13.0284K wps
[Epoch 29 Batch 2010/2125] avg loss 0.00310361, throughput 13.0307K wps
[Epoch 29 Batch 2040/2125] avg loss 0.00287177, throughput 13.0421K wps
[Epoch 29 Batch 2070/2125] avg loss 0.0028699, throughput 13.0321K wps
[Epoch 29 Batch 2100/2125] avg loss 0.0028723, throughput 13.0296K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 29] train avg loss 0.0028547, test acc 0.9251, test avg loss 0.199881, throughput 13.0299K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 30 Batch 30/2125] avg loss 0.00253268, throughput 13.2826K wps
[Epoch 30 Batch 60/2125] avg loss 0.00291157, throughput 12.9303K wps
[Epoch 30 Batch 90/2125] avg loss 0.00248083, throughput 13.0469K wps
[Epoch 30 Batch 120/2125] avg loss 0.00309381, throughput 12.9728K wps
[Epoch 30 Batch 150/2125] avg loss 0.00281809, throughput 13.0044K wps
[Epoch 30 Batch 180/2125] avg loss 0.00287772, throughput 12.9989K wps
[Epoch 30 Batch 210/2125] avg loss 0.00284828, throughput 13.0108K wps
[Epoch 30 Batch 240/2125] avg loss 0.00255149, throughput 12.993K wps
[Epoch 30 Batch 270/2125] avg loss 0.00255606, throughput 12.995K wps
[Epoch 30 Batch 300/2125] avg loss 0.00250608, throughput 12.9921K wps
[Epoch 30 Batch 330/2125] avg loss 0.00324263, throughput 12.9786K wps
[Epoch 30 Batch 360/2125] avg loss 0.00283295, throughput 13.016K wps
[Epoch 30 Batch 390/2125] avg loss 0.00278565, throughput 12.9859K wps
[Epoch 30 Batch 420/2125] avg loss 0.00272659, throughput 13.0135K wps
[Epoch 30 Batch 450/2125] avg loss 0.00265164, throughput 13.0225K wps
[Epoch 30 Batch 480/2125] avg loss 0.00247548, throughput 13.0071K wps
[Epoch 30 Batch 510/2125] avg loss 0.00261569, throughput 13.0198K wps
[Epoch 30 Batch 540/2125] avg loss 0.00245494, throughput 12.9845K wps
[Epoch 30 Batch 570/2125] avg loss 0.00313459, throughput 13.0178K wps
[Epoch 30 Batch 600/2125] avg loss 0.00281207, throughput 12.993K wps
[Epoch 30 Batch 630/2125] avg loss 0.00258357, throughput 13.0086K wps
[Epoch 30 Batch 660/2125] avg loss 0.00295541, throughput 12.9998K wps
[Epoch 30 Batch 690/2125] avg loss 0.00294229, throughput 12.9926K wps
[Epoch 30 Batch 720/2125] avg loss 0.00265324, throughput 12.9977K wps
[Epoch 30 Batch 750/2125] avg loss 0.00295993, throughput 13.026K wps
[Epoch 30 Batch 780/2125] avg loss 0.00284374, throughput 13.0109K wps
[Epoch 30 Batch 810/2125] avg loss 0.00292834, throughput 13.0371K wps
[Epoch 30 Batch 840/2125] avg loss 0.00243565, throughput 12.9954K wps
[Epoch 30 Batch 870/2125] avg loss 0.00281817, throughput 13.0219K wps
[Epoch 30 Batch 900/2125] avg loss 0.00318445, throughput 12.9781K wps
[Epoch 30 Batch 930/2125] avg loss 0.00306266, throughput 12.9834K wps
[Epoch 30 Batch 960/2125] avg loss 0.00253574, throughput 13.033K wps
[Epoch 30 Batch 990/2125] avg loss 0.00272494, throughput 13.0221K wps
[Epoch 30 Batch 1020/2125] avg loss 0.00256296, throughput 13.0108K wps
[Epoch 30 Batch 1050/2125] avg loss 0.00293006, throughput 13.0306K wps
[Epoch 30 Batch 1080/2125] avg loss 0.00279983, throughput 12.9803K wps
[Epoch 30 Batch 1110/2125] avg loss 0.00317411, throughput 13.0064K wps
[Epoch 30 Batch 1140/2125] avg loss 0.00271028, throughput 12.9797K wps
[Epoch 30 Batch 1170/2125] avg loss 0.00278624, throughput 12.9907K wps
[Epoch 30 Batch 1200/2125] avg loss 0.0027296, throughput 12.9954K wps
[Epoch 30 Batch 1230/2125] avg loss 0.00337784, throughput 13.0291K wps
[Epoch 30 Batch 1260/2125] avg loss 0.00253655, throughput 12.9988K wps
[Epoch 30 Batch 1290/2125] avg loss 0.00278213, throughput 13.0071K wps
[Epoch 30 Batch 1320/2125] avg loss 0.00277059, throughput 12.9985K wps
[Epoch 30 Batch 1350/2125] avg loss 0.00297125, throughput 13.0288K wps
[Epoch 30 Batch 1380/2125] avg loss 0.00279462, throughput 12.9952K wps
[Epoch 30 Batch 1410/2125] avg loss 0.00268619, throughput 12.9839K wps
[Epoch 30 Batch 1440/2125] avg loss 0.00322937, throughput 13.0137K wps
[Epoch 30 Batch 1470/2125] avg loss 0.00304167, throughput 13.0082K wps
[Epoch 30 Batch 1500/2125] avg loss 0.00257426, throughput 12.9809K wps
[Epoch 30 Batch 1530/2125] avg loss 0.0029759, throughput 12.9926K wps
[Epoch 30 Batch 1560/2125] avg loss 0.0028387, throughput 13.0001K wps
[Epoch 30 Batch 1590/2125] avg loss 0.00321981, throughput 13.0229K wps
[Epoch 30 Batch 1620/2125] avg loss 0.00262852, throughput 12.9833K wps
[Epoch 30 Batch 1650/2125] avg loss 0.00285669, throughput 13.0079K wps
[Epoch 30 Batch 1680/2125] avg loss 0.00236405, throughput 13.0141K wps
[Epoch 30 Batch 1710/2125] avg loss 0.00279562, throughput 12.9805K wps
[Epoch 30 Batch 1740/2125] avg loss 0.00271875, throughput 13.0324K wps
[Epoch 30 Batch 1770/2125] avg loss 0.00255318, throughput 12.9481K wps
[Epoch 30 Batch 1800/2125] avg loss 0.00256011, throughput 12.9966K wps
[Epoch 30 Batch 1830/2125] avg loss 0.00301445, throughput 13.0009K wps
[Epoch 30 Batch 1860/2125] avg loss 0.00273875, throughput 12.9974K wps
[Epoch 30 Batch 1890/2125] avg loss 0.00287614, throughput 12.9928K wps
[Epoch 30 Batch 1920/2125] avg loss 0.00259276, throughput 13.0291K wps
[Epoch 30 Batch 1950/2125] avg loss 0.00280674, throughput 12.994K wps
[Epoch 30 Batch 1980/2125] avg loss 0.00261433, throughput 13.0315K wps
[Epoch 30 Batch 2010/2125] avg loss 0.00298076, throughput 13.007K wps
[Epoch 30 Batch 2040/2125] avg loss 0.00272535, throughput 13.0268K wps
[Epoch 30 Batch 2070/2125] avg loss 0.00301058, throughput 13.0131K wps
[Epoch 30 Batch 2100/2125] avg loss 0.00306657, throughput 13.0373K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 30] train avg loss 0.0028019, test acc 0.9270, test avg loss 0.199773, throughput 13.008K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 31 Batch 30/2125] avg loss 0.00303747, throughput 13.3687K wps
[Epoch 31 Batch 60/2125] avg loss 0.00264512, throughput 12.9559K wps
[Epoch 31 Batch 90/2125] avg loss 0.00264855, throughput 13.0838K wps
[Epoch 31 Batch 120/2125] avg loss 0.00265558, throughput 13.0098K wps
[Epoch 31 Batch 150/2125] avg loss 0.00258036, throughput 13.0095K wps
[Epoch 31 Batch 180/2125] avg loss 0.00294654, throughput 13.0085K wps
[Epoch 31 Batch 210/2125] avg loss 0.00304457, throughput 13.0602K wps
[Epoch 31 Batch 240/2125] avg loss 0.00311108, throughput 13.0287K wps
[Epoch 31 Batch 270/2125] avg loss 0.00234782, throughput 13.0267K wps
[Epoch 31 Batch 300/2125] avg loss 0.00243276, throughput 13.0289K wps
[Epoch 31 Batch 330/2125] avg loss 0.00278956, throughput 13.058K wps
[Epoch 31 Batch 360/2125] avg loss 0.00214459, throughput 13.0429K wps
[Epoch 31 Batch 390/2125] avg loss 0.00285317, throughput 13.0413K wps
[Epoch 31 Batch 420/2125] avg loss 0.00282449, throughput 13.0721K wps
[Epoch 31 Batch 450/2125] avg loss 0.0027144, throughput 13.0006K wps
[Epoch 31 Batch 480/2125] avg loss 0.00245618, throughput 13.0374K wps
[Epoch 31 Batch 510/2125] avg loss 0.00271992, throughput 13.0334K wps
[Epoch 31 Batch 540/2125] avg loss 0.00280403, throughput 13.0477K wps
[Epoch 31 Batch 570/2125] avg loss 0.00255538, throughput 13.042K wps
[Epoch 31 Batch 600/2125] avg loss 0.00249887, throughput 13.0221K wps
[Epoch 31 Batch 630/2125] avg loss 0.0028978, throughput 13.0327K wps
[Epoch 31 Batch 660/2125] avg loss 0.00292209, throughput 13.0124K wps
[Epoch 31 Batch 690/2125] avg loss 0.00243167, throughput 13.0398K wps
[Epoch 31 Batch 720/2125] avg loss 0.00245674, throughput 13.0518K wps
[Epoch 31 Batch 750/2125] avg loss 0.00234301, throughput 13.0575K wps
[Epoch 31 Batch 780/2125] avg loss 0.00299553, throughput 13.0556K wps
[Epoch 31 Batch 810/2125] avg loss 0.0025515, throughput 13.028K wps
[Epoch 31 Batch 840/2125] avg loss 0.00297599, throughput 13.046K wps
[Epoch 31 Batch 870/2125] avg loss 0.00255244, throughput 13.037K wps
[Epoch 31 Batch 900/2125] avg loss 0.00276538, throughput 13.0088K wps
[Epoch 31 Batch 930/2125] avg loss 0.0026237, throughput 13.0221K wps
[Epoch 31 Batch 960/2125] avg loss 0.00289882, throughput 13.0266K wps
[Epoch 31 Batch 990/2125] avg loss 0.00257634, throughput 13.0058K wps
[Epoch 31 Batch 1020/2125] avg loss 0.00311506, throughput 13.0358K wps
[Epoch 31 Batch 1050/2125] avg loss 0.00271475, throughput 13.0064K wps
[Epoch 31 Batch 1080/2125] avg loss 0.00288252, throughput 13.0544K wps
[Epoch 31 Batch 1110/2125] avg loss 0.00285293, throughput 13.033K wps
[Epoch 31 Batch 1140/2125] avg loss 0.00266553, throughput 13.0268K wps
[Epoch 31 Batch 1170/2125] avg loss 0.00301651, throughput 13.05K wps
[Epoch 31 Batch 1200/2125] avg loss 0.00280176, throughput 12.9955K wps
[Epoch 31 Batch 1230/2125] avg loss 0.00265341, throughput 13.0353K wps
[Epoch 31 Batch 1260/2125] avg loss 0.00267892, throughput 13.0266K wps
[Epoch 31 Batch 1290/2125] avg loss 0.00264384, throughput 13.0301K wps
[Epoch 31 Batch 1320/2125] avg loss 0.00294033, throughput 13.0015K wps
[Epoch 31 Batch 1350/2125] avg loss 0.00292864, throughput 13.0361K wps
[Epoch 31 Batch 1380/2125] avg loss 0.00265896, throughput 13.0318K wps
[Epoch 31 Batch 1410/2125] avg loss 0.00286623, throughput 13.0438K wps
[Epoch 31 Batch 1440/2125] avg loss 0.00281992, throughput 13.0178K wps
[Epoch 31 Batch 1470/2125] avg loss 0.00299474, throughput 13.0363K wps
[Epoch 31 Batch 1500/2125] avg loss 0.00288999, throughput 13.0193K wps
[Epoch 31 Batch 1530/2125] avg loss 0.00293867, throughput 13.0558K wps
[Epoch 31 Batch 1560/2125] avg loss 0.00276228, throughput 13.0483K wps
[Epoch 31 Batch 1590/2125] avg loss 0.00297694, throughput 13.0551K wps
[Epoch 31 Batch 1620/2125] avg loss 0.00278429, throughput 13.035K wps
[Epoch 31 Batch 1650/2125] avg loss 0.00262548, throughput 13.0173K wps
[Epoch 31 Batch 1680/2125] avg loss 0.00274715, throughput 13.0269K wps
[Epoch 31 Batch 1710/2125] avg loss 0.00275529, throughput 13.0299K wps
[Epoch 31 Batch 1740/2125] avg loss 0.00291586, throughput 13.019K wps
[Epoch 31 Batch 1770/2125] avg loss 0.00260269, throughput 13.0256K wps
[Epoch 31 Batch 1800/2125] avg loss 0.00273644, throughput 13.0496K wps
[Epoch 31 Batch 1830/2125] avg loss 0.00261324, throughput 13.0355K wps
[Epoch 31 Batch 1860/2125] avg loss 0.00308047, throughput 13.0312K wps
[Epoch 31 Batch 1890/2125] avg loss 0.00278306, throughput 13.0406K wps
[Epoch 31 Batch 1920/2125] avg loss 0.00268109, throughput 13.01K wps
[Epoch 31 Batch 1950/2125] avg loss 0.00263999, throughput 13.0087K wps
[Epoch 31 Batch 1980/2125] avg loss 0.00272545, throughput 13.0353K wps
[Epoch 31 Batch 2010/2125] avg loss 0.00235274, throughput 13.0489K wps
[Epoch 31 Batch 2040/2125] avg loss 0.00279956, throughput 13.0132K wps
[Epoch 31 Batch 2070/2125] avg loss 0.00275674, throughput 12.9997K wps
[Epoch 31 Batch 2100/2125] avg loss 0.00253034, throughput 13.0351K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 31] train avg loss 0.00274457, test acc 0.9269, test avg loss 0.1982, throughput 13.0355K wps
[Epoch 32 Batch 30/2125] avg loss 0.00235259, throughput 13.2772K wps
[Epoch 32 Batch 60/2125] avg loss 0.00280896, throughput 12.9594K wps
[Epoch 32 Batch 90/2125] avg loss 0.00287247, throughput 13.0045K wps
[Epoch 32 Batch 120/2125] avg loss 0.00244387, throughput 13.0306K wps
[Epoch 32 Batch 150/2125] avg loss 0.00294207, throughput 13.0406K wps
[Epoch 32 Batch 180/2125] avg loss 0.00281504, throughput 13.0297K wps
[Epoch 32 Batch 210/2125] avg loss 0.00270698, throughput 13.0552K wps
[Epoch 32 Batch 240/2125] avg loss 0.0027128, throughput 13.0344K wps
[Epoch 32 Batch 270/2125] avg loss 0.00257044, throughput 13.0246K wps
[Epoch 32 Batch 300/2125] avg loss 0.0025551, throughput 13.0209K wps
[Epoch 32 Batch 330/2125] avg loss 0.00253055, throughput 13.0343K wps
[Epoch 32 Batch 360/2125] avg loss 0.00254543, throughput 13.0197K wps
[Epoch 32 Batch 390/2125] avg loss 0.00229606, throughput 13.0263K wps
[Epoch 32 Batch 420/2125] avg loss 0.00247856, throughput 13.0376K wps
[Epoch 32 Batch 450/2125] avg loss 0.00283902, throughput 13.0316K wps
[Epoch 32 Batch 480/2125] avg loss 0.00238255, throughput 13.0443K wps
[Epoch 32 Batch 510/2125] avg loss 0.00274446, throughput 13.0423K wps
[Epoch 32 Batch 540/2125] avg loss 0.00273461, throughput 13.0528K wps
[Epoch 32 Batch 570/2125] avg loss 0.00238871, throughput 13.0539K wps
[Epoch 32 Batch 600/2125] avg loss 0.00272229, throughput 13.0261K wps
[Epoch 32 Batch 630/2125] avg loss 0.00277221, throughput 13.0271K wps
[Epoch 32 Batch 660/2125] avg loss 0.00255844, throughput 13.0344K wps
[Epoch 32 Batch 690/2125] avg loss 0.00257655, throughput 13.0702K wps
[Epoch 32 Batch 720/2125] avg loss 0.00266564, throughput 13.04K wps
[Epoch 32 Batch 750/2125] avg loss 0.0028598, throughput 13.0422K wps
[Epoch 32 Batch 780/2125] avg loss 0.00280486, throughput 13.0458K wps
[Epoch 32 Batch 810/2125] avg loss 0.00254057, throughput 13.0244K wps
[Epoch 32 Batch 840/2125] avg loss 0.00303618, throughput 13.0528K wps
[Epoch 32 Batch 870/2125] avg loss 0.00264206, throughput 13.0585K wps
[Epoch 32 Batch 900/2125] avg loss 0.00290312, throughput 13.0488K wps
[Epoch 32 Batch 930/2125] avg loss 0.00293801, throughput 13.0795K wps
[Epoch 32 Batch 960/2125] avg loss 0.00266675, throughput 13.0456K wps
[Epoch 32 Batch 990/2125] avg loss 0.00258839, throughput 12.9993K wps
[Epoch 32 Batch 1020/2125] avg loss 0.00279993, throughput 13.0232K wps
[Epoch 32 Batch 1050/2125] avg loss 0.00301355, throughput 13.0554K wps
[Epoch 32 Batch 1080/2125] avg loss 0.00241562, throughput 13.0515K wps
[Epoch 32 Batch 1110/2125] avg loss 0.0025964, throughput 13.0574K wps
[Epoch 32 Batch 1140/2125] avg loss 0.00262903, throughput 13.0775K wps
[Epoch 32 Batch 1170/2125] avg loss 0.0027972, throughput 13.051K wps
[Epoch 32 Batch 1200/2125] avg loss 0.00301918, throughput 13.0528K wps
[Epoch 32 Batch 1230/2125] avg loss 0.0030347, throughput 13.063K wps
[Epoch 32 Batch 1260/2125] avg loss 0.00271806, throughput 13.0215K wps
[Epoch 32 Batch 1290/2125] avg loss 0.00300912, throughput 13.0313K wps
[Epoch 32 Batch 1320/2125] avg loss 0.00267273, throughput 13.0553K wps
[Epoch 32 Batch 1350/2125] avg loss 0.0024247, throughput 13.0061K wps
[Epoch 32 Batch 1380/2125] avg loss 0.00298268, throughput 13.062K wps
[Epoch 32 Batch 1410/2125] avg loss 0.00299267, throughput 13.0459K wps
[Epoch 32 Batch 1440/2125] avg loss 0.00250228, throughput 13.0301K wps
[Epoch 32 Batch 1470/2125] avg loss 0.00240672, throughput 12.9351K wps
[Epoch 32 Batch 1500/2125] avg loss 0.00297205, throughput 12.975K wps
[Epoch 32 Batch 1530/2125] avg loss 0.00268663, throughput 12.9395K wps
[Epoch 32 Batch 1560/2125] avg loss 0.00247242, throughput 12.7479K wps
[Epoch 32 Batch 1590/2125] avg loss 0.00270891, throughput 13.0431K wps
[Epoch 32 Batch 1620/2125] avg loss 0.00255964, throughput 13.0076K wps
[Epoch 32 Batch 1650/2125] avg loss 0.00250204, throughput 12.9984K wps
[Epoch 32 Batch 1680/2125] avg loss 0.00282104, throughput 13.0454K wps
[Epoch 32 Batch 1710/2125] avg loss 0.00253428, throughput 13.0473K wps
[Epoch 32 Batch 1740/2125] avg loss 0.00254575, throughput 13.0489K wps
[Epoch 32 Batch 1770/2125] avg loss 0.00259926, throughput 13.0291K wps
[Epoch 32 Batch 1800/2125] avg loss 0.00276404, throughput 13.0081K wps
[Epoch 32 Batch 1830/2125] avg loss 0.0025501, throughput 12.99K wps
[Epoch 32 Batch 1860/2125] avg loss 0.00263449, throughput 13.0018K wps
[Epoch 32 Batch 1890/2125] avg loss 0.00282573, throughput 12.992K wps
[Epoch 32 Batch 1920/2125] avg loss 0.00273653, throughput 13.004K wps
[Epoch 32 Batch 1950/2125] avg loss 0.00249798, throughput 12.9967K wps
[Epoch 32 Batch 1980/2125] avg loss 0.00278206, throughput 13.0237K wps
[Epoch 32 Batch 2010/2125] avg loss 0.00262019, throughput 13.0153K wps
[Epoch 32 Batch 2040/2125] avg loss 0.00332103, throughput 13.0257K wps
[Epoch 32 Batch 2070/2125] avg loss 0.00278105, throughput 13.0102K wps
[Epoch 32 Batch 2100/2125] avg loss 0.00277204, throughput 13.0261K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 32] train avg loss 0.00269481, test acc 0.9285, test avg loss 0.198745, throughput 13.0289K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 33 Batch 30/2125] avg loss 0.00278801, throughput 13.3167K wps
[Epoch 33 Batch 60/2125] avg loss 0.00247203, throughput 13.0137K wps
[Epoch 33 Batch 90/2125] avg loss 0.00288039, throughput 12.9836K wps
[Epoch 33 Batch 120/2125] avg loss 0.00244561, throughput 13.0076K wps
[Epoch 33 Batch 150/2125] avg loss 0.0026278, throughput 13.0007K wps
[Epoch 33 Batch 180/2125] avg loss 0.00280161, throughput 13.0145K wps
[Epoch 33 Batch 210/2125] avg loss 0.00297578, throughput 12.9972K wps
[Epoch 33 Batch 240/2125] avg loss 0.00265921, throughput 12.9707K wps
[Epoch 33 Batch 270/2125] avg loss 0.0027186, throughput 13.0521K wps
[Epoch 33 Batch 300/2125] avg loss 0.00281871, throughput 13.0447K wps
[Epoch 33 Batch 330/2125] avg loss 0.00267125, throughput 13.0603K wps
[Epoch 33 Batch 360/2125] avg loss 0.00257027, throughput 13.0499K wps
[Epoch 33 Batch 390/2125] avg loss 0.00255135, throughput 13.0174K wps
[Epoch 33 Batch 420/2125] avg loss 0.00233471, throughput 13.0117K wps
[Epoch 33 Batch 450/2125] avg loss 0.00263028, throughput 12.9887K wps
[Epoch 33 Batch 480/2125] avg loss 0.00276289, throughput 13.0322K wps
[Epoch 33 Batch 510/2125] avg loss 0.00235381, throughput 13.0409K wps
[Epoch 33 Batch 540/2125] avg loss 0.00309084, throughput 13.0428K wps
[Epoch 33 Batch 570/2125] avg loss 0.00250473, throughput 13.0298K wps
[Epoch 33 Batch 600/2125] avg loss 0.00271994, throughput 13.0492K wps
[Epoch 33 Batch 630/2125] avg loss 0.00245572, throughput 12.9954K wps
[Epoch 33 Batch 660/2125] avg loss 0.00251412, throughput 13.0361K wps
[Epoch 33 Batch 690/2125] avg loss 0.00248623, throughput 13.0575K wps
[Epoch 33 Batch 720/2125] avg loss 0.00279828, throughput 13.0296K wps
[Epoch 33 Batch 750/2125] avg loss 0.00254143, throughput 13.0113K wps
[Epoch 33 Batch 780/2125] avg loss 0.00302932, throughput 13.0074K wps
[Epoch 33 Batch 810/2125] avg loss 0.00282216, throughput 13.0036K wps
[Epoch 33 Batch 840/2125] avg loss 0.00293382, throughput 13.0057K wps
[Epoch 33 Batch 870/2125] avg loss 0.00294222, throughput 13.0072K wps
[Epoch 33 Batch 900/2125] avg loss 0.00242973, throughput 12.9904K wps
[Epoch 33 Batch 930/2125] avg loss 0.0023605, throughput 13.0456K wps
[Epoch 33 Batch 960/2125] avg loss 0.00261641, throughput 13.0109K wps
[Epoch 33 Batch 990/2125] avg loss 0.00266706, throughput 13.0015K wps
[Epoch 33 Batch 1020/2125] avg loss 0.00272797, throughput 13.0095K wps
[Epoch 33 Batch 1050/2125] avg loss 0.00232943, throughput 12.9986K wps
[Epoch 33 Batch 1080/2125] avg loss 0.00263771, throughput 13.0291K wps
[Epoch 33 Batch 1110/2125] avg loss 0.00270747, throughput 13.0625K wps
[Epoch 33 Batch 1140/2125] avg loss 0.00280533, throughput 13.0478K wps
[Epoch 33 Batch 1170/2125] avg loss 0.00274875, throughput 13.0384K wps
[Epoch 33 Batch 1200/2125] avg loss 0.00264215, throughput 13.0438K wps
[Epoch 33 Batch 1230/2125] avg loss 0.00276506, throughput 13.0688K wps
[Epoch 33 Batch 1260/2125] avg loss 0.00310766, throughput 12.9959K wps
[Epoch 33 Batch 1290/2125] avg loss 0.00259452, throughput 12.9995K wps
[Epoch 33 Batch 1320/2125] avg loss 0.00223478, throughput 13.02K wps
[Epoch 33 Batch 1350/2125] avg loss 0.0027067, throughput 12.9772K wps
[Epoch 33 Batch 1380/2125] avg loss 0.00312957, throughput 13.0374K wps
[Epoch 33 Batch 1410/2125] avg loss 0.00262505, throughput 13.0215K wps
[Epoch 33 Batch 1440/2125] avg loss 0.00282589, throughput 13.0384K wps
[Epoch 33 Batch 1470/2125] avg loss 0.00246154, throughput 13.0636K wps
[Epoch 33 Batch 1500/2125] avg loss 0.00285089, throughput 13.0339K wps
[Epoch 33 Batch 1530/2125] avg loss 0.00272326, throughput 12.9842K wps
[Epoch 33 Batch 1560/2125] avg loss 0.00288729, throughput 13.0008K wps
[Epoch 33 Batch 1590/2125] avg loss 0.00263579, throughput 13.0048K wps
[Epoch 33 Batch 1620/2125] avg loss 0.00278608, throughput 13.024K wps
[Epoch 33 Batch 1650/2125] avg loss 0.0025905, throughput 13.0215K wps
[Epoch 33 Batch 1680/2125] avg loss 0.00302047, throughput 13.0271K wps
[Epoch 33 Batch 1710/2125] avg loss 0.0024814, throughput 13.0062K wps
[Epoch 33 Batch 1740/2125] avg loss 0.00256285, throughput 13.0068K wps
[Epoch 33 Batch 1770/2125] avg loss 0.00263297, throughput 12.9863K wps
[Epoch 33 Batch 1800/2125] avg loss 0.00267965, throughput 12.9992K wps
[Epoch 33 Batch 1830/2125] avg loss 0.00288676, throughput 13.0263K wps
[Epoch 33 Batch 1860/2125] avg loss 0.00265203, throughput 13.005K wps
[Epoch 33 Batch 1890/2125] avg loss 0.00228222, throughput 12.9897K wps
[Epoch 33 Batch 1920/2125] avg loss 0.00252096, throughput 13.0344K wps
[Epoch 33 Batch 1950/2125] avg loss 0.00276379, throughput 13.0188K wps
[Epoch 33 Batch 1980/2125] avg loss 0.00259754, throughput 13.0043K wps
[Epoch 33 Batch 2010/2125] avg loss 0.00285324, throughput 13.0329K wps
[Epoch 33 Batch 2040/2125] avg loss 0.00256986, throughput 12.9919K wps
[Epoch 33 Batch 2070/2125] avg loss 0.00291417, throughput 13.0335K wps
[Epoch 33 Batch 2100/2125] avg loss 0.00258287, throughput 13.0187K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 33] train avg loss 0.00267477, test acc 0.9282, test avg loss 0.197443, throughput 13.0231K wps
[Epoch 34 Batch 30/2125] avg loss 0.00208439, throughput 13.3091K wps
[Epoch 34 Batch 60/2125] avg loss 0.00267503, throughput 12.9202K wps
[Epoch 34 Batch 90/2125] avg loss 0.00251118, throughput 13.0354K wps
[Epoch 34 Batch 120/2125] avg loss 0.00267765, throughput 13.0504K wps
[Epoch 34 Batch 150/2125] avg loss 0.0022963, throughput 13.0315K wps
[Epoch 34 Batch 180/2125] avg loss 0.00263456, throughput 13.0331K wps
[Epoch 34 Batch 210/2125] avg loss 0.00255273, throughput 13.0488K wps
[Epoch 34 Batch 240/2125] avg loss 0.00245222, throughput 13.0569K wps
[Epoch 34 Batch 270/2125] avg loss 0.00267552, throughput 12.9918K wps
[Epoch 34 Batch 300/2125] avg loss 0.0025341, throughput 13.0337K wps
[Epoch 34 Batch 330/2125] avg loss 0.00242502, throughput 13.0239K wps
[Epoch 34 Batch 360/2125] avg loss 0.00279085, throughput 12.996K wps
[Epoch 34 Batch 390/2125] avg loss 0.00243751, throughput 13.0169K wps
[Epoch 34 Batch 420/2125] avg loss 0.00240085, throughput 13.0344K wps
[Epoch 34 Batch 450/2125] avg loss 0.00269917, throughput 13.0121K wps
[Epoch 34 Batch 480/2125] avg loss 0.00229918, throughput 12.9815K wps
[Epoch 34 Batch 510/2125] avg loss 0.00262163, throughput 12.9797K wps
[Epoch 34 Batch 540/2125] avg loss 0.00252621, throughput 13.0114K wps
[Epoch 34 Batch 570/2125] avg loss 0.00259415, throughput 12.9912K wps
[Epoch 34 Batch 600/2125] avg loss 0.00250905, throughput 12.9817K wps
[Epoch 34 Batch 630/2125] avg loss 0.00280079, throughput 12.9939K wps
[Epoch 34 Batch 660/2125] avg loss 0.00234346, throughput 12.9767K wps
[Epoch 34 Batch 690/2125] avg loss 0.00244634, throughput 13.0072K wps
[Epoch 34 Batch 720/2125] avg loss 0.0022905, throughput 12.9362K wps
[Epoch 34 Batch 750/2125] avg loss 0.00247208, throughput 13.0142K wps
[Epoch 34 Batch 780/2125] avg loss 0.00249911, throughput 13.0163K wps
[Epoch 34 Batch 810/2125] avg loss 0.00261215, throughput 13.0005K wps
[Epoch 34 Batch 840/2125] avg loss 0.0028639, throughput 13.0093K wps
[Epoch 34 Batch 870/2125] avg loss 0.00274278, throughput 13.0028K wps
[Epoch 34 Batch 900/2125] avg loss 0.00214009, throughput 13.1289K wps
[Epoch 34 Batch 930/2125] avg loss 0.00275477, throughput 13.0493K wps
[Epoch 34 Batch 960/2125] avg loss 0.00293236, throughput 13.001K wps
[Epoch 34 Batch 990/2125] avg loss 0.00247522, throughput 13.0349K wps
[Epoch 34 Batch 1020/2125] avg loss 0.00265731, throughput 13.0142K wps
[Epoch 34 Batch 1050/2125] avg loss 0.00255408, throughput 13.0621K wps
[Epoch 34 Batch 1080/2125] avg loss 0.00244037, throughput 13.0686K wps
[Epoch 34 Batch 1110/2125] avg loss 0.00251986, throughput 13.0537K wps
[Epoch 34 Batch 1140/2125] avg loss 0.00228528, throughput 13.0509K wps
[Epoch 34 Batch 1170/2125] avg loss 0.00252525, throughput 13.0414K wps
[Epoch 34 Batch 1200/2125] avg loss 0.00291747, throughput 13.0446K wps
[Epoch 34 Batch 1230/2125] avg loss 0.00261406, throughput 13.042K wps
[Epoch 34 Batch 1260/2125] avg loss 0.00262672, throughput 13.0293K wps
[Epoch 34 Batch 1290/2125] avg loss 0.00272254, throughput 13.0482K wps
[Epoch 34 Batch 1320/2125] avg loss 0.00277727, throughput 13.0495K wps
[Epoch 34 Batch 1350/2125] avg loss 0.00247835, throughput 12.999K wps
[Epoch 34 Batch 1380/2125] avg loss 0.00226665, throughput 13.024K wps
[Epoch 34 Batch 1410/2125] avg loss 0.00260432, throughput 13.0446K wps
[Epoch 34 Batch 1440/2125] avg loss 0.00319152, throughput 13.0467K wps
[Epoch 34 Batch 1470/2125] avg loss 0.00260235, throughput 13.0369K wps
[Epoch 34 Batch 1500/2125] avg loss 0.00311106, throughput 13.0499K wps
[Epoch 34 Batch 1530/2125] avg loss 0.00267519, throughput 13.0085K wps
[Epoch 34 Batch 1560/2125] avg loss 0.00291032, throughput 13.0136K wps
[Epoch 34 Batch 1590/2125] avg loss 0.00243844, throughput 13.0192K wps
[Epoch 34 Batch 1620/2125] avg loss 0.00301436, throughput 13.0452K wps
[Epoch 34 Batch 1650/2125] avg loss 0.00245369, throughput 13.0424K wps
[Epoch 34 Batch 1680/2125] avg loss 0.00298983, throughput 13.0411K wps
[Epoch 34 Batch 1710/2125] avg loss 0.00228201, throughput 13.0515K wps
[Epoch 34 Batch 1740/2125] avg loss 0.00260694, throughput 13.0223K wps
[Epoch 34 Batch 1770/2125] avg loss 0.00328736, throughput 13.0224K wps
[Epoch 34 Batch 1800/2125] avg loss 0.00276567, throughput 13.0175K wps
[Epoch 34 Batch 1830/2125] avg loss 0.00278299, throughput 13.0114K wps
[Epoch 34 Batch 1860/2125] avg loss 0.00245918, throughput 13.0058K wps
[Epoch 34 Batch 1890/2125] avg loss 0.00251049, throughput 13.0372K wps
[Epoch 34 Batch 1920/2125] avg loss 0.00255099, throughput 13.0658K wps
[Epoch 34 Batch 1950/2125] avg loss 0.00239846, throughput 13.0336K wps
[Epoch 34 Batch 1980/2125] avg loss 0.00261758, throughput 13.0516K wps
[Epoch 34 Batch 2010/2125] avg loss 0.00271365, throughput 13.0603K wps
[Epoch 34 Batch 2040/2125] avg loss 0.00270844, throughput 13.0292K wps
[Epoch 34 Batch 2070/2125] avg loss 0.00264522, throughput 13.0407K wps
[Epoch 34 Batch 2100/2125] avg loss 0.00213149, throughput 13.0208K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 34] train avg loss 0.00259604, test acc 0.9266, test avg loss 0.198313, throughput 13.0296K wps
[Epoch 35 Batch 30/2125] avg loss 0.00228468, throughput 13.3333K wps
[Epoch 35 Batch 60/2125] avg loss 0.00285041, throughput 13.0104K wps
[Epoch 35 Batch 90/2125] avg loss 0.00257947, throughput 13.007K wps
[Epoch 35 Batch 120/2125] avg loss 0.00244668, throughput 13.004K wps
[Epoch 35 Batch 150/2125] avg loss 0.00249792, throughput 13.0051K wps
[Epoch 35 Batch 180/2125] avg loss 0.00260251, throughput 13.0032K wps
[Epoch 35 Batch 210/2125] avg loss 0.00255331, throughput 12.9979K wps
[Epoch 35 Batch 240/2125] avg loss 0.00211179, throughput 12.9847K wps
[Epoch 35 Batch 270/2125] avg loss 0.0027993, throughput 13.0135K wps
[Epoch 35 Batch 300/2125] avg loss 0.0020887, throughput 13.0059K wps
[Epoch 35 Batch 330/2125] avg loss 0.00250257, throughput 12.9802K wps
[Epoch 35 Batch 360/2125] avg loss 0.00227038, throughput 13.0108K wps
[Epoch 35 Batch 390/2125] avg loss 0.00276578, throughput 13.0057K wps
[Epoch 35 Batch 420/2125] avg loss 0.00244997, throughput 13.0145K wps
[Epoch 35 Batch 450/2125] avg loss 0.00239827, throughput 12.9771K wps
[Epoch 35 Batch 480/2125] avg loss 0.00272115, throughput 13.0141K wps
[Epoch 35 Batch 510/2125] avg loss 0.00246409, throughput 13.013K wps
[Epoch 35 Batch 540/2125] avg loss 0.00270824, throughput 12.9871K wps
[Epoch 35 Batch 570/2125] avg loss 0.0023499, throughput 13.0184K wps
[Epoch 35 Batch 600/2125] avg loss 0.00230692, throughput 12.9887K wps
[Epoch 35 Batch 630/2125] avg loss 0.00284974, throughput 13.0094K wps
[Epoch 35 Batch 660/2125] avg loss 0.00212807, throughput 13.0084K wps
[Epoch 35 Batch 690/2125] avg loss 0.00292264, throughput 13.0017K wps
[Epoch 35 Batch 720/2125] avg loss 0.00264211, throughput 13.0296K wps
[Epoch 35 Batch 750/2125] avg loss 0.00249027, throughput 13.0131K wps
[Epoch 35 Batch 780/2125] avg loss 0.00265836, throughput 12.9888K wps
[Epoch 35 Batch 810/2125] avg loss 0.00253538, throughput 13.0389K wps
[Epoch 35 Batch 840/2125] avg loss 0.00253754, throughput 13.0136K wps
[Epoch 35 Batch 870/2125] avg loss 0.00265737, throughput 12.9941K wps
[Epoch 35 Batch 900/2125] avg loss 0.00202477, throughput 12.9839K wps
[Epoch 35 Batch 930/2125] avg loss 0.00216021, throughput 12.9797K wps
[Epoch 35 Batch 960/2125] avg loss 0.00239709, throughput 13.0344K wps
[Epoch 35 Batch 990/2125] avg loss 0.00243497, throughput 13.0551K wps
[Epoch 35 Batch 1020/2125] avg loss 0.00247842, throughput 13.0418K wps
[Epoch 35 Batch 1050/2125] avg loss 0.00257918, throughput 13.0351K wps
[Epoch 35 Batch 1080/2125] avg loss 0.00261667, throughput 13.0359K wps
[Epoch 35 Batch 1110/2125] avg loss 0.00287969, throughput 13.0495K wps
[Epoch 35 Batch 1140/2125] avg loss 0.00259828, throughput 13.0366K wps
[Epoch 35 Batch 1170/2125] avg loss 0.00244801, throughput 13.061K wps
[Epoch 35 Batch 1200/2125] avg loss 0.00233395, throughput 13.0322K wps
[Epoch 35 Batch 1230/2125] avg loss 0.002487, throughput 13.0291K wps
[Epoch 35 Batch 1260/2125] avg loss 0.00224403, throughput 12.9941K wps
[Epoch 35 Batch 1290/2125] avg loss 0.00269557, throughput 13.0578K wps
[Epoch 35 Batch 1320/2125] avg loss 0.00258479, throughput 13.0669K wps
[Epoch 35 Batch 1350/2125] avg loss 0.00214803, throughput 13.0302K wps
[Epoch 35 Batch 1380/2125] avg loss 0.00247324, throughput 12.994K wps
[Epoch 35 Batch 1410/2125] avg loss 0.00260455, throughput 12.9957K wps
[Epoch 35 Batch 1440/2125] avg loss 0.00256249, throughput 12.957K wps
[Epoch 35 Batch 1470/2125] avg loss 0.0024506, throughput 13.0005K wps
[Epoch 35 Batch 1500/2125] avg loss 0.00278615, throughput 12.968K wps
[Epoch 35 Batch 1530/2125] avg loss 0.00273318, throughput 13.0617K wps
[Epoch 35 Batch 1560/2125] avg loss 0.00306933, throughput 13.0638K wps
[Epoch 35 Batch 1590/2125] avg loss 0.0026685, throughput 12.9683K wps
[Epoch 35 Batch 1620/2125] avg loss 0.00260017, throughput 12.9686K wps
[Epoch 35 Batch 1650/2125] avg loss 0.00264462, throughput 12.9932K wps
[Epoch 35 Batch 1680/2125] avg loss 0.00264769, throughput 13.0264K wps
[Epoch 35 Batch 1710/2125] avg loss 0.00242881, throughput 12.991K wps
[Epoch 35 Batch 1740/2125] avg loss 0.00304248, throughput 13.0483K wps
[Epoch 35 Batch 1770/2125] avg loss 0.00264118, throughput 12.9374K wps
[Epoch 35 Batch 1800/2125] avg loss 0.00260569, throughput 13.0273K wps
[Epoch 35 Batch 1830/2125] avg loss 0.00271927, throughput 13.0047K wps
[Epoch 35 Batch 1860/2125] avg loss 0.00263372, throughput 13.002K wps
[Epoch 35 Batch 1890/2125] avg loss 0.00260312, throughput 13.0511K wps
[Epoch 35 Batch 1920/2125] avg loss 0.0023772, throughput 13.0523K wps
[Epoch 35 Batch 1950/2125] avg loss 0.0025657, throughput 13.0189K wps
[Epoch 35 Batch 1980/2125] avg loss 0.00281726, throughput 13.03K wps
[Epoch 35 Batch 2010/2125] avg loss 0.0025779, throughput 13.035K wps
[Epoch 35 Batch 2040/2125] avg loss 0.00276819, throughput 13.0538K wps
[Epoch 35 Batch 2070/2125] avg loss 0.00253653, throughput 13.0617K wps
[Epoch 35 Batch 2100/2125] avg loss 0.0026898, throughput 13.0677K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 35] train avg loss 0.00255033, test acc 0.9277, test avg loss 0.195715, throughput 13.0197K wps
[Epoch 36 Batch 30/2125] avg loss 0.00218987, throughput 13.3054K wps
[Epoch 36 Batch 60/2125] avg loss 0.002722, throughput 12.9822K wps
[Epoch 36 Batch 90/2125] avg loss 0.0023166, throughput 13.0475K wps
[Epoch 36 Batch 120/2125] avg loss 0.0025669, throughput 13.034K wps
[Epoch 36 Batch 150/2125] avg loss 0.0022579, throughput 13.0193K wps
[Epoch 36 Batch 180/2125] avg loss 0.0029789, throughput 13.0491K wps
[Epoch 36 Batch 210/2125] avg loss 0.00247748, throughput 13.0037K wps
[Epoch 36 Batch 240/2125] avg loss 0.00212368, throughput 13.0453K wps
[Epoch 36 Batch 270/2125] avg loss 0.00267933, throughput 13.0181K wps
[Epoch 36 Batch 300/2125] avg loss 0.00234377, throughput 13.0103K wps
[Epoch 36 Batch 330/2125] avg loss 0.00264571, throughput 13.0408K wps
[Epoch 36 Batch 360/2125] avg loss 0.00213294, throughput 13.0251K wps
[Epoch 36 Batch 390/2125] avg loss 0.00259489, throughput 13.0328K wps
[Epoch 36 Batch 420/2125] avg loss 0.00210639, throughput 13.0352K wps
[Epoch 36 Batch 450/2125] avg loss 0.00267536, throughput 13.0612K wps
[Epoch 36 Batch 480/2125] avg loss 0.00241397, throughput 13.0479K wps
[Epoch 36 Batch 510/2125] avg loss 0.00223572, throughput 13.0628K wps
[Epoch 36 Batch 540/2125] avg loss 0.00269713, throughput 13.0379K wps
[Epoch 36 Batch 570/2125] avg loss 0.00237276, throughput 12.9773K wps
[Epoch 36 Batch 600/2125] avg loss 0.00217104, throughput 12.992K wps
[Epoch 36 Batch 630/2125] avg loss 0.00225278, throughput 12.9868K wps
[Epoch 36 Batch 660/2125] avg loss 0.00264273, throughput 13.0299K wps
[Epoch 36 Batch 690/2125] avg loss 0.00219961, throughput 13.0039K wps
[Epoch 36 Batch 720/2125] avg loss 0.00275725, throughput 12.9687K wps
[Epoch 36 Batch 750/2125] avg loss 0.00250543, throughput 13.0877K wps
[Epoch 36 Batch 780/2125] avg loss 0.00259252, throughput 13.001K wps
[Epoch 36 Batch 810/2125] avg loss 0.00268894, throughput 13.0007K wps
[Epoch 36 Batch 840/2125] avg loss 0.00263277, throughput 13.0149K wps
[Epoch 36 Batch 870/2125] avg loss 0.00260277, throughput 13.0033K wps
[Epoch 36 Batch 900/2125] avg loss 0.00255472, throughput 13.0062K wps
[Epoch 36 Batch 930/2125] avg loss 0.00256934, throughput 12.9857K wps
[Epoch 36 Batch 960/2125] avg loss 0.00268656, throughput 13.0247K wps
[Epoch 36 Batch 990/2125] avg loss 0.00291057, throughput 12.9839K wps
[Epoch 36 Batch 1020/2125] avg loss 0.0026496, throughput 13.0317K wps
[Epoch 36 Batch 1050/2125] avg loss 0.00267154, throughput 13.017K wps
[Epoch 36 Batch 1080/2125] avg loss 0.002693, throughput 13.0352K wps
[Epoch 36 Batch 1110/2125] avg loss 0.00265466, throughput 13.0232K wps
[Epoch 36 Batch 1140/2125] avg loss 0.00282528, throughput 13.0284K wps
[Epoch 36 Batch 1170/2125] avg loss 0.00232417, throughput 13.0223K wps
[Epoch 36 Batch 1200/2125] avg loss 0.00242418, throughput 12.9841K wps
[Epoch 36 Batch 1230/2125] avg loss 0.00258298, throughput 12.9807K wps
[Epoch 36 Batch 1260/2125] avg loss 0.00232396, throughput 13.0171K wps
[Epoch 36 Batch 1290/2125] avg loss 0.00234698, throughput 12.9935K wps
[Epoch 36 Batch 1320/2125] avg loss 0.00237616, throughput 12.9987K wps
[Epoch 36 Batch 1350/2125] avg loss 0.00264006, throughput 13.0128K wps
[Epoch 36 Batch 1380/2125] avg loss 0.00221186, throughput 13.0139K wps
[Epoch 36 Batch 1410/2125] avg loss 0.00281558, throughput 13.0082K wps
[Epoch 36 Batch 1440/2125] avg loss 0.00251985, throughput 13.0263K wps
[Epoch 36 Batch 1470/2125] avg loss 0.00266699, throughput 12.9818K wps
[Epoch 36 Batch 1500/2125] avg loss 0.00300227, throughput 13.0113K wps
[Epoch 36 Batch 1530/2125] avg loss 0.00264989, throughput 12.9981K wps
[Epoch 36 Batch 1560/2125] avg loss 0.00231277, throughput 13.0009K wps
[Epoch 36 Batch 1590/2125] avg loss 0.00284491, throughput 12.9948K wps
[Epoch 36 Batch 1620/2125] avg loss 0.0023724, throughput 13.0273K wps
[Epoch 36 Batch 1650/2125] avg loss 0.00240761, throughput 13.021K wps
[Epoch 36 Batch 1680/2125] avg loss 0.00260453, throughput 12.9992K wps
[Epoch 36 Batch 1710/2125] avg loss 0.00259779, throughput 13.0363K wps
[Epoch 36 Batch 1740/2125] avg loss 0.00270571, throughput 13.0193K wps
[Epoch 36 Batch 1770/2125] avg loss 0.00240746, throughput 13.0611K wps
[Epoch 36 Batch 1800/2125] avg loss 0.00268697, throughput 13.0579K wps
[Epoch 36 Batch 1830/2125] avg loss 0.00254159, throughput 13.0681K wps
[Epoch 36 Batch 1860/2125] avg loss 0.00259832, throughput 13.0108K wps
[Epoch 36 Batch 1890/2125] avg loss 0.00217661, throughput 13.0035K wps
[Epoch 36 Batch 1920/2125] avg loss 0.00243343, throughput 12.986K wps
[Epoch 36 Batch 1950/2125] avg loss 0.00222522, throughput 13.0377K wps
[Epoch 36 Batch 1980/2125] avg loss 0.0024861, throughput 12.9941K wps
[Epoch 36 Batch 2010/2125] avg loss 0.00255275, throughput 12.9853K wps
[Epoch 36 Batch 2040/2125] avg loss 0.00249316, throughput 12.9764K wps
[Epoch 36 Batch 2070/2125] avg loss 0.00277039, throughput 13.0361K wps
[Epoch 36 Batch 2100/2125] avg loss 0.00235898, throughput 13.0548K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 36] train avg loss 0.00251744, test acc 0.9289, test avg loss 0.194911, throughput 13.0216K wps
Observed Improvement.
Begin Testing...
[Batch 30/35] elapsed 0.27 s
[Epoch 37 Batch 30/2125] avg loss 0.00229, throughput 13.2995K wps
[Epoch 37 Batch 60/2125] avg loss 0.00265373, throughput 12.9578K wps
[Epoch 37 Batch 90/2125] avg loss 0.00251261, throughput 13.0411K wps
[Epoch 37 Batch 120/2125] avg loss 0.00228911, throughput 13.026K wps
[Epoch 37 Batch 150/2125] avg loss 0.00213667, throughput 13.0005K wps
[Epoch 37 Batch 180/2125] avg loss 0.00236017, throughput 13.038K wps
[Epoch 37 Batch 210/2125] avg loss 0.00224621, throughput 13.0234K wps
[Epoch 37 Batch 240/2125] avg loss 0.00251149, throughput 12.9798K wps
[Epoch 37 Batch 270/2125] avg loss 0.00251365, throughput 13.0454K wps
[Epoch 37 Batch 300/2125] avg loss 0.00218653, throughput 12.9848K wps
[Epoch 37 Batch 330/2125] avg loss 0.00230758, throughput 13.0109K wps
[Epoch 37 Batch 360/2125] avg loss 0.00260181, throughput 13.0412K wps
[Epoch 37 Batch 390/2125] avg loss 0.00254967, throughput 13.0442K wps
[Epoch 37 Batch 420/2125] avg loss 0.00239276, throughput 13.0487K wps
[Epoch 37 Batch 450/2125] avg loss 0.00220492, throughput 13.0364K wps
[Epoch 37 Batch 480/2125] avg loss 0.00254432, throughput 13.0623K wps
[Epoch 37 Batch 510/2125] avg loss 0.00251645, throughput 13.0392K wps
[Epoch 37 Batch 540/2125] avg loss 0.00281147, throughput 13.0543K wps
[Epoch 37 Batch 570/2125] avg loss 0.0025184, throughput 13.0358K wps
[Epoch 37 Batch 600/2125] avg loss 0.00244094, throughput 13.0024K wps
[Epoch 37 Batch 630/2125] avg loss 0.00218818, throughput 13.0163K wps
[Epoch 37 Batch 660/2125] avg loss 0.00270412, throughput 13.0085K wps
[Epoch 37 Batch 690/2125] avg loss 0.00252502, throughput 13.0013K wps
[Epoch 37 Batch 720/2125] avg loss 0.00229205, throughput 13.0195K wps
[Epoch 37 Batch 750/2125] avg loss 0.00245212, throughput 13.0196K wps
[Epoch 37 Batch 780/2125] avg loss 0.00232147, throughput 13.028K wps
[Epoch 37 Batch 810/2125] avg loss 0.00268725, throughput 13.0132K wps
[Epoch 37 Batch 840/2125] avg loss 0.00233877, throughput 13.0194K wps
[Epoch 37 Batch 870/2125] avg loss 0.00238977, throughput 13.0683K wps
[Epoch 37 Batch 900/2125] avg loss 0.0025691, throughput 12.997K wps
[Epoch 37 Batch 930/2125] avg loss 0.00256927, throughput 13.068K wps
[Epoch 37 Batch 960/2125] avg loss 0.00296125, throughput 12.9966K wps
[Epoch 37 Batch 990/2125] avg loss 0.00251727, throughput 13.0044K wps
[Epoch 37 Batch 1020/2125] avg loss 0.00221911, throughput 13.0251K wps
[Epoch 37 Batch 1050/2125] avg loss 0.00241764, throughput 13.0189K wps
[Epoch 37 Batch 1080/2125] avg loss 0.00243631, throughput 13.0162K wps
[Epoch 37 Batch 1110/2125] avg loss 0.00270235, throughput 13.0372K wps
[Epoch 37 Batch 1140/2125] avg loss 0.00306197, throughput 13.0225K wps
[Epoch 37 Batch 1170/2125] avg loss 0.00235197, throughput 12.9788K wps
[Epoch 37 Batch 1200/2125] avg loss 0.00241919, throughput 12.9927K wps
[Epoch 37 Batch 1230/2125] avg loss 0.00241915, throughput 13.0336K wps
[Epoch 37 Batch 1260/2125] avg loss 0.00233449, throughput 13.0148K wps
[Epoch 37 Batch 1290/2125] avg loss 0.00228946, throughput 13.0205K wps
[Epoch 37 Batch 1320/2125] avg loss 0.00273739, throughput 13.0432K wps
[Epoch 37 Batch 1350/2125] avg loss 0.00243641, throughput 13.0337K wps
[Epoch 37 Batch 1380/2125] avg loss 0.0027454, throughput 12.9711K wps
[Epoch 37 Batch 1410/2125] avg loss 0.00228273, throughput 13.0532K wps
[Epoch 37 Batch 1440/2125] avg loss 0.0025799, throughput 13.0329K wps
[Epoch 37 Batch 1470/2125] avg loss 0.00232017, throughput 13.0394K wps
[Epoch 37 Batch 1500/2125] avg loss 0.00244158, throughput 13.0572K wps
[Epoch 37 Batch 1530/2125] avg loss 0.00269878, throughput 13.0503K wps
[Epoch 37 Batch 1560/2125] avg loss 0.00294755, throughput 13.0393K wps
[Epoch 37 Batch 1590/2125] avg loss 0.00253718, throughput 13.0469K wps
[Epoch 37 Batch 1620/2125] avg loss 0.00232399, throughput 12.9983K wps
[Epoch 37 Batch 1650/2125] avg loss 0.00261202, throughput 13.0082K wps
[Epoch 37 Batch 1680/2125] avg loss 0.00272545, throughput 13.0129K wps
[Epoch 37 Batch 1710/2125] avg loss 0.00243082, throughput 13.0226K wps
[Epoch 37 Batch 1740/2125] avg loss 0.00266356, throughput 13.0353K wps
[Epoch 37 Batch 1770/2125] avg loss 0.00233294, throughput 13.0076K wps
[Epoch 37 Batch 1800/2125] avg loss 0.00253386, throughput 12.9634K wps
[Epoch 37 Batch 1830/2125] avg loss 0.00248127, throughput 12.8148K wps
[Epoch 37 Batch 1860/2125] avg loss 0.00269292, throughput 12.9282K wps
[Epoch 37 Batch 1890/2125] avg loss 0.00238513, throughput 13.0361K wps
[Epoch 37 Batch 1920/2125] avg loss 0.00252149, throughput 12.9428K wps
[Epoch 37 Batch 1950/2125] avg loss 0.00242578, throughput 13.0382K wps
[Epoch 37 Batch 1980/2125] avg loss 0.00260544, throughput 12.9865K wps
[Epoch 37 Batch 2010/2125] avg loss 0.00274526, throughput 13.0128K wps
[Epoch 37 Batch 2040/2125] avg loss 0.00241109, throughput 13.0577K wps
[Epoch 37 Batch 2070/2125] avg loss 0.00248704, throughput 13.003K wps
[Epoch 37 Batch 2100/2125] avg loss 0.00245384, throughput 13.0525K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 37] train avg loss 0.00249189, test acc 0.9282, test avg loss 0.198133, throughput 13.0213K wps
[Epoch 38 Batch 30/2125] avg loss 0.00271092, throughput 13.3341K wps
[Epoch 38 Batch 60/2125] avg loss 0.00234951, throughput 12.932K wps
[Epoch 38 Batch 90/2125] avg loss 0.00241453, throughput 13.0694K wps
[Epoch 38 Batch 120/2125] avg loss 0.00244932, throughput 12.9917K wps
[Epoch 38 Batch 150/2125] avg loss 0.00263708, throughput 13.0329K wps
[Epoch 38 Batch 180/2125] avg loss 0.00280159, throughput 12.9908K wps
[Epoch 38 Batch 210/2125] avg loss 0.00224591, throughput 13.0283K wps
[Epoch 38 Batch 240/2125] avg loss 0.00229038, throughput 12.9926K wps
[Epoch 38 Batch 270/2125] avg loss 0.00233852, throughput 13.0523K wps
[Epoch 38 Batch 300/2125] avg loss 0.00223738, throughput 13.0497K wps
[Epoch 38 Batch 330/2125] avg loss 0.00253746, throughput 13.0285K wps
[Epoch 38 Batch 360/2125] avg loss 0.00244323, throughput 13.0323K wps
[Epoch 38 Batch 390/2125] avg loss 0.00238587, throughput 13.0205K wps
[Epoch 38 Batch 420/2125] avg loss 0.00211504, throughput 13.0183K wps
[Epoch 38 Batch 450/2125] avg loss 0.0026049, throughput 13.0327K wps
[Epoch 38 Batch 480/2125] avg loss 0.00224181, throughput 13.0316K wps
[Epoch 38 Batch 510/2125] avg loss 0.00274652, throughput 13.039K wps
[Epoch 38 Batch 540/2125] avg loss 0.00199845, throughput 13.0059K wps
[Epoch 38 Batch 570/2125] avg loss 0.00244247, throughput 13.0325K wps
[Epoch 38 Batch 600/2125] avg loss 0.00224251, throughput 13.0351K wps
[Epoch 38 Batch 630/2125] avg loss 0.00235902, throughput 13.0265K wps
[Epoch 38 Batch 660/2125] avg loss 0.00197143, throughput 13.0428K wps
[Epoch 38 Batch 690/2125] avg loss 0.00215606, throughput 13.0178K wps
[Epoch 38 Batch 720/2125] avg loss 0.00219978, throughput 13.0456K wps
[Epoch 38 Batch 750/2125] avg loss 0.00246126, throughput 13.0263K wps
[Epoch 38 Batch 780/2125] avg loss 0.00256558, throughput 13.0423K wps
[Epoch 38 Batch 810/2125] avg loss 0.00221526, throughput 13.0359K wps
[Epoch 38 Batch 840/2125] avg loss 0.00270272, throughput 13.0226K wps
[Epoch 38 Batch 870/2125] avg loss 0.0025463, throughput 13.0514K wps
[Epoch 38 Batch 900/2125] avg loss 0.00234512, throughput 13.0448K wps
[Epoch 38 Batch 930/2125] avg loss 0.00205902, throughput 13.0394K wps
[Epoch 38 Batch 960/2125] avg loss 0.00233934, throughput 13.0741K wps
[Epoch 38 Batch 990/2125] avg loss 0.00268222, throughput 13.0368K wps
[Epoch 38 Batch 1020/2125] avg loss 0.00263182, throughput 13.0335K wps
[Epoch 38 Batch 1050/2125] avg loss 0.00224518, throughput 13.0221K wps
[Epoch 38 Batch 1080/2125] avg loss 0.00222984, throughput 13.0398K wps
[Epoch 38 Batch 1110/2125] avg loss 0.00228774, throughput 13.0361K wps
[Epoch 38 Batch 1140/2125] avg loss 0.00254151, throughput 13.028K wps
[Epoch 38 Batch 1170/2125] avg loss 0.00201996, throughput 13.0101K wps
[Epoch 38 Batch 1200/2125] avg loss 0.00216778, throughput 13.056K wps
[Epoch 38 Batch 1230/2125] avg loss 0.00242757, throughput 13.0417K wps
[Epoch 38 Batch 1260/2125] avg loss 0.00223751, throughput 13.0469K wps
[Epoch 38 Batch 1290/2125] avg loss 0.00236459, throughput 13.0119K wps
[Epoch 38 Batch 1320/2125] avg loss 0.00229838, throughput 13.033K wps
[Epoch 38 Batch 1350/2125] avg loss 0.00257909, throughput 13.0051K wps
[Epoch 38 Batch 1380/2125] avg loss 0.00298368, throughput 13.0592K wps
[Epoch 38 Batch 1410/2125] avg loss 0.00240695, throughput 13.0365K wps
[Epoch 38 Batch 1440/2125] avg loss 0.00302408, throughput 13.0478K wps
[Epoch 38 Batch 1470/2125] avg loss 0.00228788, throughput 13.0539K wps
[Epoch 38 Batch 1500/2125] avg loss 0.00253937, throughput 13.0202K wps
[Epoch 38 Batch 1530/2125] avg loss 0.00253785, throughput 13.0152K wps
[Epoch 38 Batch 1560/2125] avg loss 0.00210532, throughput 13.0205K wps
[Epoch 38 Batch 1590/2125] avg loss 0.00236827, throughput 13.0327K wps
[Epoch 38 Batch 1620/2125] avg loss 0.00270622, throughput 13.0525K wps
[Epoch 38 Batch 1650/2125] avg loss 0.00256775, throughput 13.0412K wps
[Epoch 38 Batch 1680/2125] avg loss 0.0026507, throughput 12.9935K wps
[Epoch 38 Batch 1710/2125] avg loss 0.00243715, throughput 13.0083K wps
[Epoch 38 Batch 1740/2125] avg loss 0.0024252, throughput 12.9864K wps
[Epoch 38 Batch 1770/2125] avg loss 0.00203929, throughput 13.0082K wps
[Epoch 38 Batch 1800/2125] avg loss 0.00266421, throughput 13.0172K wps
[Epoch 38 Batch 1830/2125] avg loss 0.00280609, throughput 13.0302K wps
[Epoch 38 Batch 1860/2125] avg loss 0.00233894, throughput 13.0063K wps
[Epoch 38 Batch 1890/2125] avg loss 0.00278019, throughput 13.0056K wps
[Epoch 38 Batch 1920/2125] avg loss 0.00235023, throughput 13.0148K wps
[Epoch 38 Batch 1950/2125] avg loss 0.00251013, throughput 13.0187K wps
[Epoch 38 Batch 1980/2125] avg loss 0.00231056, throughput 13.0426K wps
[Epoch 38 Batch 2010/2125] avg loss 0.00251767, throughput 13.0501K wps
[Epoch 38 Batch 2040/2125] avg loss 0.00259888, throughput 13.0338K wps
[Epoch 38 Batch 2070/2125] avg loss 0.00219474, throughput 13.0181K wps
[Epoch 38 Batch 2100/2125] avg loss 0.00249545, throughput 13.0525K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 38] train avg loss 0.00242652, test acc 0.9287, test avg loss 0.194549, throughput 13.0331K wps
[Epoch 39 Batch 30/2125] avg loss 0.00242062, throughput 13.2082K wps
[Epoch 39 Batch 60/2125] avg loss 0.00212379, throughput 12.9871K wps
[Epoch 39 Batch 90/2125] avg loss 0.00256552, throughput 13.0473K wps
[Epoch 39 Batch 120/2125] avg loss 0.00215252, throughput 13.0608K wps
[Epoch 39 Batch 150/2125] avg loss 0.00229174, throughput 13.05K wps
[Epoch 39 Batch 180/2125] avg loss 0.00248083, throughput 13.0269K wps
[Epoch 39 Batch 210/2125] avg loss 0.00264278, throughput 13.0111K wps
[Epoch 39 Batch 240/2125] avg loss 0.00221025, throughput 13.0085K wps
[Epoch 39 Batch 270/2125] avg loss 0.00267071, throughput 12.9645K wps
[Epoch 39 Batch 300/2125] avg loss 0.0023241, throughput 12.9916K wps
[Epoch 39 Batch 330/2125] avg loss 0.00225853, throughput 13.0057K wps
[Epoch 39 Batch 360/2125] avg loss 0.00250258, throughput 13.01K wps
[Epoch 39 Batch 390/2125] avg loss 0.00244643, throughput 12.9983K wps
[Epoch 39 Batch 420/2125] avg loss 0.00271636, throughput 13.0091K wps
[Epoch 39 Batch 450/2125] avg loss 0.00216782, throughput 13.0117K wps
[Epoch 39 Batch 480/2125] avg loss 0.00238005, throughput 13.0252K wps
[Epoch 39 Batch 510/2125] avg loss 0.00253889, throughput 13.0213K wps
[Epoch 39 Batch 540/2125] avg loss 0.00250407, throughput 13.0519K wps
[Epoch 39 Batch 570/2125] avg loss 0.00248544, throughput 12.9858K wps
[Epoch 39 Batch 600/2125] avg loss 0.00235354, throughput 13.0272K wps
[Epoch 39 Batch 630/2125] avg loss 0.00253421, throughput 12.9722K wps
[Epoch 39 Batch 660/2125] avg loss 0.00224827, throughput 13.0692K wps
[Epoch 39 Batch 690/2125] avg loss 0.00225612, throughput 13.0101K wps
[Epoch 39 Batch 720/2125] avg loss 0.0025729, throughput 13.0221K wps
[Epoch 39 Batch 750/2125] avg loss 0.00242042, throughput 13.0272K wps
[Epoch 39 Batch 780/2125] avg loss 0.00232569, throughput 13.0228K wps
[Epoch 39 Batch 810/2125] avg loss 0.00224189, throughput 13.0133K wps
[Epoch 39 Batch 840/2125] avg loss 0.00219711, throughput 13.0072K wps
[Epoch 39 Batch 870/2125] avg loss 0.00248814, throughput 13.0423K wps
[Epoch 39 Batch 900/2125] avg loss 0.00242297, throughput 13.0605K wps
[Epoch 39 Batch 930/2125] avg loss 0.00209282, throughput 13.0274K wps
[Epoch 39 Batch 960/2125] avg loss 0.00229862, throughput 13.0104K wps
[Epoch 39 Batch 990/2125] avg loss 0.0024645, throughput 13.0047K wps
[Epoch 39 Batch 1020/2125] avg loss 0.00245628, throughput 12.9924K wps
[Epoch 39 Batch 1050/2125] avg loss 0.0025751, throughput 13.0725K wps
[Epoch 39 Batch 1080/2125] avg loss 0.00292095, throughput 13.0537K wps
[Epoch 39 Batch 1110/2125] avg loss 0.00234483, throughput 13.0744K wps
[Epoch 39 Batch 1140/2125] avg loss 0.00216099, throughput 13.0467K wps
[Epoch 39 Batch 1170/2125] avg loss 0.00212039, throughput 13.0728K wps
[Epoch 39 Batch 1200/2125] avg loss 0.00232755, throughput 13.0413K wps
[Epoch 39 Batch 1230/2125] avg loss 0.00259924, throughput 13.0407K wps
[Epoch 39 Batch 1260/2125] avg loss 0.00239885, throughput 13.041K wps
[Epoch 39 Batch 1290/2125] avg loss 0.00259557, throughput 13.063K wps
[Epoch 39 Batch 1320/2125] avg loss 0.00243636, throughput 13.0385K wps
[Epoch 39 Batch 1350/2125] avg loss 0.00243117, throughput 13.0352K wps
[Epoch 39 Batch 1380/2125] avg loss 0.00235274, throughput 13.0358K wps
[Epoch 39 Batch 1410/2125] avg loss 0.00230236, throughput 13.033K wps
[Epoch 39 Batch 1440/2125] avg loss 0.00267889, throughput 13.058K wps
[Epoch 39 Batch 1470/2125] avg loss 0.00230252, throughput 13.03K wps
[Epoch 39 Batch 1500/2125] avg loss 0.00259099, throughput 13.0368K wps
[Epoch 39 Batch 1530/2125] avg loss 0.00226003, throughput 13.0153K wps
[Epoch 39 Batch 1560/2125] avg loss 0.00227092, throughput 13.0533K wps
[Epoch 39 Batch 1590/2125] avg loss 0.00284014, throughput 13.0312K wps
[Epoch 39 Batch 1620/2125] avg loss 0.00259848, throughput 13.0411K wps
[Epoch 39 Batch 1650/2125] avg loss 0.00269442, throughput 13.0376K wps
[Epoch 39 Batch 1680/2125] avg loss 0.00256223, throughput 13.0459K wps
[Epoch 39 Batch 1710/2125] avg loss 0.00243207, throughput 13.0467K wps
[Epoch 39 Batch 1740/2125] avg loss 0.00246747, throughput 13.0385K wps
[Epoch 39 Batch 1770/2125] avg loss 0.00246045, throughput 13.0312K wps
[Epoch 39 Batch 1800/2125] avg loss 0.00272619, throughput 13.0402K wps
[Epoch 39 Batch 1830/2125] avg loss 0.00269437, throughput 13.0234K wps
[Epoch 39 Batch 1860/2125] avg loss 0.00218431, throughput 13.0554K wps
[Epoch 39 Batch 1890/2125] avg loss 0.00280381, throughput 13.0322K wps
[Epoch 39 Batch 1920/2125] avg loss 0.00227626, throughput 13.0473K wps
[Epoch 39 Batch 1950/2125] avg loss 0.00254724, throughput 13.0509K wps
[Epoch 39 Batch 1980/2125] avg loss 0.00253452, throughput 13.0532K wps
[Epoch 39 Batch 2010/2125] avg loss 0.00224982, throughput 13.0303K wps
[Epoch 39 Batch 2040/2125] avg loss 0.00208572, throughput 13.0492K wps
[Epoch 39 Batch 2070/2125] avg loss 0.00247006, throughput 13.047K wps
[Epoch 39 Batch 2100/2125] avg loss 0.00254862, throughput 13.0079K wps
Begin Testing...
[Batch 30/237] elapsed 0.28 s
[Batch 60/237] elapsed 0.27 s
[Batch 90/237] elapsed 0.27 s
[Batch 120/237] elapsed 0.27 s
[Batch 150/237] elapsed 0.27 s
[Batch 180/237] elapsed 0.27 s
[Batch 210/237] elapsed 0.27 s
[Epoch 39] train avg loss 0.00243083, test acc 0.9286, test avg loss 0.193378, throughput 13.0338K wps
Test loss 0.150356, test acc 0.9450
Total time cost 1403.84s