CR_multichannel.log
Namespace(batch_size=50, data_name='CR', dropout=0.5, epochs=200, gpu=0, log_interval=30, model_mode='multichannel')
Use gpu0
maximum length (in tokens): 105
Done! Tokenizing Time=0.06s, #Sentences=3775
SentimentNet(
(embedding): Embedding(5343 -> 300, float32)
(embedding_extend): Embedding(5343 -> 300, float32)
(encoder): ConvolutionalEncoder(
(_convs): HybridConcurrent(
(0): HybridSequential(
(0): Conv1D(600 -> 100, kernel_size=(3,), stride=(1,))
(1): HybridLambda(<lambda>)
(2): Activation(relu)
)
(1): HybridSequential(
(0): Conv1D(600 -> 100, kernel_size=(4,), stride=(1,))
(1): HybridLambda(<lambda>)
(2): Activation(relu)
)
(2): HybridSequential(
(0): Conv1D(600 -> 100, kernel_size=(5,), stride=(1,))
(1): HybridLambda(<lambda>)
(2): Activation(relu)
)
)
)
(output): HybridSequential(
(0): Dropout(p = 0.5, axes=())
(1): Dense(None -> 2, linear)
)
)
[Epoch 0 Batch 30/62] avg loss 0.0134723, throughput 0.343674K wps
[Epoch 0 Batch 60/62] avg loss 0.0129089, throughput 3.26262K wps
Begin Testing...
[Epoch 0] train avg loss 0.0133812, dev acc 0.6372, dev avg loss 0.645867, throughput 0.339026K wps
Observed Improvement.
Begin Testing...
[Epoch 1 Batch 30/62] avg loss 0.0130102, throughput 3.34238K wps
[Epoch 1 Batch 60/62] avg loss 0.0129448, throughput 3.23616K wps
Begin Testing...
[Epoch 1] train avg loss 0.0131052, dev acc 0.6372, dev avg loss 0.640512, throughput 3.29626K wps
Observed Improvement.
Begin Testing...
[Epoch 2 Batch 30/62] avg loss 0.0126977, throughput 3.33366K wps
[Epoch 2 Batch 60/62] avg loss 0.0127733, throughput 3.25764K wps
Begin Testing...
[Epoch 2] train avg loss 0.0128615, dev acc 0.6372, dev avg loss 0.633058, throughput 3.30186K wps
Observed Improvement.
Begin Testing...
[Epoch 3 Batch 30/62] avg loss 0.0125133, throughput 3.33257K wps
[Epoch 3 Batch 60/62] avg loss 0.0125409, throughput 3.2568K wps
Begin Testing...
[Epoch 3] train avg loss 0.012671, dev acc 0.6372, dev avg loss 0.623487, throughput 3.30107K wps
Observed Improvement.
Begin Testing...
[Epoch 4 Batch 30/62] avg loss 0.0122701, throughput 3.33109K wps
[Epoch 4 Batch 60/62] avg loss 0.0124346, throughput 3.2536K wps
Begin Testing...
[Epoch 4] train avg loss 0.0125311, dev acc 0.6372, dev avg loss 0.61702, throughput 3.29828K wps
Observed Improvement.
Begin Testing...
[Epoch 5 Batch 30/62] avg loss 0.0122674, throughput 3.33839K wps
[Epoch 5 Batch 60/62] avg loss 0.0119561, throughput 3.25442K wps
Begin Testing...
[Epoch 5] train avg loss 0.0123007, dev acc 0.6431, dev avg loss 0.609875, throughput 3.30228K wps
Observed Improvement.
Begin Testing...
[Epoch 6 Batch 30/62] avg loss 0.0119415, throughput 3.33358K wps
[Epoch 6 Batch 60/62] avg loss 0.0119814, throughput 3.23653K wps
Begin Testing...
[Epoch 6] train avg loss 0.0121306, dev acc 0.6401, dev avg loss 0.602441, throughput 3.29088K wps
[Epoch 7 Batch 30/62] avg loss 0.0117209, throughput 3.1797K wps
[Epoch 7 Batch 60/62] avg loss 0.0119209, throughput 3.27397K wps
Begin Testing...
[Epoch 7] train avg loss 0.0119841, dev acc 0.6490, dev avg loss 0.593291, throughput 3.23459K wps
Observed Improvement.
Begin Testing...
[Epoch 8 Batch 30/62] avg loss 0.0115267, throughput 3.30825K wps
[Epoch 8 Batch 60/62] avg loss 0.0114703, throughput 3.23038K wps
Begin Testing...
[Epoch 8] train avg loss 0.0116767, dev acc 0.6608, dev avg loss 0.584986, throughput 3.27405K wps
Observed Improvement.
Begin Testing...
[Epoch 9 Batch 30/62] avg loss 0.0115753, throughput 3.32254K wps
[Epoch 9 Batch 60/62] avg loss 0.0112319, throughput 3.22645K wps
Begin Testing...
[Epoch 9] train avg loss 0.0115095, dev acc 0.6637, dev avg loss 0.576667, throughput 3.28022K wps
Observed Improvement.
Begin Testing...
[Epoch 10 Batch 30/62] avg loss 0.0111671, throughput 3.30046K wps
[Epoch 10 Batch 60/62] avg loss 0.0112022, throughput 3.22593K wps
Begin Testing...
[Epoch 10] train avg loss 0.0113404, dev acc 0.6873, dev avg loss 0.568127, throughput 3.26977K wps
Observed Improvement.
Begin Testing...
[Epoch 11 Batch 30/62] avg loss 0.0109008, throughput 3.30439K wps
[Epoch 11 Batch 60/62] avg loss 0.0109946, throughput 3.2263K wps
Begin Testing...
[Epoch 11] train avg loss 0.0110775, dev acc 0.6637, dev avg loss 0.561266, throughput 3.27122K wps
[Epoch 12 Batch 30/62] avg loss 0.0105575, throughput 3.31961K wps
[Epoch 12 Batch 60/62] avg loss 0.0107065, throughput 3.21559K wps
Begin Testing...
[Epoch 12] train avg loss 0.0108074, dev acc 0.7109, dev avg loss 0.550624, throughput 3.27443K wps
Observed Improvement.
Begin Testing...
[Epoch 13 Batch 30/62] avg loss 0.0107503, throughput 3.32673K wps
[Epoch 13 Batch 60/62] avg loss 0.0103472, throughput 3.2312K wps
Begin Testing...
[Epoch 13] train avg loss 0.0106261, dev acc 0.6873, dev avg loss 0.543977, throughput 3.28388K wps
[Epoch 14 Batch 30/62] avg loss 0.0103625, throughput 3.30368K wps
[Epoch 14 Batch 60/62] avg loss 0.0102037, throughput 3.24421K wps
Begin Testing...
[Epoch 14] train avg loss 0.0104303, dev acc 0.7286, dev avg loss 0.532857, throughput 3.28033K wps
Observed Improvement.
Begin Testing...
[Epoch 15 Batch 30/62] avg loss 0.0102559, throughput 3.34094K wps
[Epoch 15 Batch 60/62] avg loss 0.0100903, throughput 3.22256K wps
Begin Testing...
[Epoch 15] train avg loss 0.010348, dev acc 0.7404, dev avg loss 0.525199, throughput 3.28631K wps
Observed Improvement.
Begin Testing...
[Epoch 16 Batch 30/62] avg loss 0.00959528, throughput 3.31265K wps
[Epoch 16 Batch 60/62] avg loss 0.0102398, throughput 3.22636K wps
Begin Testing...
[Epoch 16] train avg loss 0.0100203, dev acc 0.7611, dev avg loss 0.517671, throughput 3.27555K wps
Observed Improvement.
Begin Testing...
[Epoch 17 Batch 30/62] avg loss 0.00966294, throughput 3.31638K wps
[Epoch 17 Batch 60/62] avg loss 0.00970601, throughput 3.24517K wps
Begin Testing...
[Epoch 17] train avg loss 0.00980017, dev acc 0.7463, dev avg loss 0.509622, throughput 3.28731K wps
[Epoch 18 Batch 30/62] avg loss 0.00939605, throughput 3.31666K wps
[Epoch 18 Batch 60/62] avg loss 0.00937983, throughput 3.22201K wps
Begin Testing...
[Epoch 18] train avg loss 0.00957299, dev acc 0.7611, dev avg loss 0.506679, throughput 3.27647K wps
Observed Improvement.
Begin Testing...
[Epoch 19 Batch 30/62] avg loss 0.00941913, throughput 3.31988K wps
[Epoch 19 Batch 60/62] avg loss 0.00922998, throughput 3.22734K wps
Begin Testing...
[Epoch 19] train avg loss 0.00941015, dev acc 0.7640, dev avg loss 0.495414, throughput 3.2798K wps
Observed Improvement.
Begin Testing...
[Epoch 20 Batch 30/62] avg loss 0.00904799, throughput 3.31764K wps
[Epoch 20 Batch 60/62] avg loss 0.00924885, throughput 3.23532K wps
Begin Testing...
[Epoch 20] train avg loss 0.00929458, dev acc 0.7611, dev avg loss 0.490562, throughput 3.28106K wps
[Epoch 21 Batch 30/62] avg loss 0.00896077, throughput 3.29325K wps
[Epoch 21 Batch 60/62] avg loss 0.00893734, throughput 3.22498K wps
Begin Testing...
[Epoch 21] train avg loss 0.00908357, dev acc 0.7817, dev avg loss 0.483413, throughput 3.26385K wps
Observed Improvement.
Begin Testing...
[Epoch 22 Batch 30/62] avg loss 0.00876101, throughput 3.29305K wps
[Epoch 22 Batch 60/62] avg loss 0.00869727, throughput 3.21492K wps
Begin Testing...
[Epoch 22] train avg loss 0.00886027, dev acc 0.7640, dev avg loss 0.48087, throughput 3.25976K wps
[Epoch 23 Batch 30/62] avg loss 0.00860429, throughput 3.31881K wps
[Epoch 23 Batch 60/62] avg loss 0.00838256, throughput 3.23879K wps
Begin Testing...
[Epoch 23] train avg loss 0.00865908, dev acc 0.7699, dev avg loss 0.474155, throughput 3.28369K wps
[Epoch 24 Batch 30/62] avg loss 0.0086947, throughput 3.2851K wps
[Epoch 24 Batch 60/62] avg loss 0.00827456, throughput 3.23034K wps
Begin Testing...
[Epoch 24] train avg loss 0.00858808, dev acc 0.7758, dev avg loss 0.469919, throughput 3.26461K wps
[Epoch 25 Batch 30/62] avg loss 0.00830087, throughput 3.30858K wps
[Epoch 25 Batch 60/62] avg loss 0.00799783, throughput 3.22975K wps
Begin Testing...
[Epoch 25] train avg loss 0.00827258, dev acc 0.7788, dev avg loss 0.46376, throughput 3.27575K wps
[Epoch 26 Batch 30/62] avg loss 0.00792655, throughput 3.29768K wps
[Epoch 26 Batch 60/62] avg loss 0.00838465, throughput 3.23694K wps
Begin Testing...
[Epoch 26] train avg loss 0.00824073, dev acc 0.7817, dev avg loss 0.459772, throughput 3.27171K wps
Observed Improvement.
Begin Testing...
[Epoch 27 Batch 30/62] avg loss 0.0081229, throughput 3.32102K wps
[Epoch 27 Batch 60/62] avg loss 0.00793227, throughput 3.23567K wps
Begin Testing...
[Epoch 27] train avg loss 0.00806027, dev acc 0.7817, dev avg loss 0.456041, throughput 3.28356K wps
Observed Improvement.
Begin Testing...
[Epoch 28 Batch 30/62] avg loss 0.00770145, throughput 3.29657K wps
[Epoch 28 Batch 60/62] avg loss 0.00784183, throughput 3.22498K wps
Begin Testing...
[Epoch 28] train avg loss 0.00783569, dev acc 0.7876, dev avg loss 0.451941, throughput 3.26522K wps
Observed Improvement.
Begin Testing...
[Epoch 29 Batch 30/62] avg loss 0.00780541, throughput 3.31888K wps
[Epoch 29 Batch 60/62] avg loss 0.00755988, throughput 3.24892K wps
Begin Testing...
[Epoch 29] train avg loss 0.00775539, dev acc 0.7758, dev avg loss 0.44961, throughput 3.29014K wps
[Epoch 30 Batch 30/62] avg loss 0.00767302, throughput 3.3019K wps
[Epoch 30 Batch 60/62] avg loss 0.00729676, throughput 3.23396K wps
Begin Testing...
[Epoch 30] train avg loss 0.0075499, dev acc 0.7876, dev avg loss 0.445713, throughput 3.27319K wps
Observed Improvement.
Begin Testing...
[Epoch 31 Batch 30/62] avg loss 0.00729512, throughput 3.30777K wps
[Epoch 31 Batch 60/62] avg loss 0.00731529, throughput 3.23202K wps
Begin Testing...
[Epoch 31] train avg loss 0.00741992, dev acc 0.7788, dev avg loss 0.445104, throughput 3.27476K wps
[Epoch 32 Batch 30/62] avg loss 0.00715808, throughput 3.28979K wps
[Epoch 32 Batch 60/62] avg loss 0.00740053, throughput 3.22262K wps
Begin Testing...
[Epoch 32] train avg loss 0.00739935, dev acc 0.7847, dev avg loss 0.443554, throughput 3.2617K wps
[Epoch 33 Batch 30/62] avg loss 0.0070608, throughput 3.29572K wps
[Epoch 33 Batch 60/62] avg loss 0.00714915, throughput 3.21816K wps
Begin Testing...
[Epoch 33] train avg loss 0.00720835, dev acc 0.7906, dev avg loss 0.437648, throughput 3.26273K wps
Observed Improvement.
Begin Testing...
[Epoch 34 Batch 30/62] avg loss 0.00700767, throughput 3.27862K wps
[Epoch 34 Batch 60/62] avg loss 0.00707173, throughput 3.23245K wps
Begin Testing...
[Epoch 34] train avg loss 0.00705619, dev acc 0.7906, dev avg loss 0.436573, throughput 3.26106K wps
Observed Improvement.
Begin Testing...
[Epoch 35 Batch 30/62] avg loss 0.00676388, throughput 3.29216K wps
[Epoch 35 Batch 60/62] avg loss 0.00698487, throughput 3.23153K wps
Begin Testing...
[Epoch 35] train avg loss 0.00705509, dev acc 0.7876, dev avg loss 0.433251, throughput 3.269K wps
[Epoch 36 Batch 30/62] avg loss 0.00663338, throughput 3.30999K wps
[Epoch 36 Batch 60/62] avg loss 0.00684657, throughput 3.21403K wps
Begin Testing...
[Epoch 36] train avg loss 0.00683581, dev acc 0.7817, dev avg loss 0.438491, throughput 3.26818K wps
[Epoch 37 Batch 30/62] avg loss 0.00666328, throughput 3.33245K wps
[Epoch 37 Batch 60/62] avg loss 0.00666124, throughput 3.23217K wps
Begin Testing...
[Epoch 37] train avg loss 0.00670551, dev acc 0.7788, dev avg loss 0.43255, throughput 3.28855K wps
[Epoch 38 Batch 30/62] avg loss 0.00645705, throughput 3.29445K wps
[Epoch 38 Batch 60/62] avg loss 0.00658762, throughput 3.22781K wps
Begin Testing...
[Epoch 38] train avg loss 0.00662022, dev acc 0.8201, dev avg loss 0.426898, throughput 3.26504K wps
Observed Improvement.
Begin Testing...
[Epoch 39 Batch 30/62] avg loss 0.00621295, throughput 3.30825K wps
[Epoch 39 Batch 60/62] avg loss 0.00640877, throughput 3.21856K wps
Begin Testing...
[Epoch 39] train avg loss 0.00647128, dev acc 0.8024, dev avg loss 0.423956, throughput 3.26855K wps
[Epoch 40 Batch 30/62] avg loss 0.00630716, throughput 3.31153K wps
[Epoch 40 Batch 60/62] avg loss 0.00626895, throughput 3.20577K wps
Begin Testing...
[Epoch 40] train avg loss 0.00635059, dev acc 0.8142, dev avg loss 0.422436, throughput 3.26418K wps
[Epoch 41 Batch 30/62] avg loss 0.00591307, throughput 3.30095K wps
[Epoch 41 Batch 60/62] avg loss 0.00633876, throughput 3.22163K wps
Begin Testing...
[Epoch 41] train avg loss 0.00615976, dev acc 0.7817, dev avg loss 0.429495, throughput 3.26752K wps
[Epoch 42 Batch 30/62] avg loss 0.00602473, throughput 3.31965K wps
[Epoch 42 Batch 60/62] avg loss 0.00625138, throughput 3.22533K wps
Begin Testing...
[Epoch 42] train avg loss 0.00620687, dev acc 0.8171, dev avg loss 0.419217, throughput 3.27828K wps
[Epoch 43 Batch 30/62] avg loss 0.00601633, throughput 3.29877K wps
[Epoch 43 Batch 60/62] avg loss 0.00602545, throughput 3.22344K wps
Begin Testing...
[Epoch 43] train avg loss 0.00607012, dev acc 0.8053, dev avg loss 0.420083, throughput 3.26671K wps
[Epoch 44 Batch 30/62] avg loss 0.00561471, throughput 3.32077K wps
[Epoch 44 Batch 60/62] avg loss 0.00606478, throughput 3.23325K wps
Begin Testing...
[Epoch 44] train avg loss 0.00587506, dev acc 0.8142, dev avg loss 0.416824, throughput 3.28334K wps
[Epoch 45 Batch 30/62] avg loss 0.00577342, throughput 3.30063K wps
[Epoch 45 Batch 60/62] avg loss 0.00558224, throughput 3.2328K wps
Begin Testing...
[Epoch 45] train avg loss 0.00570039, dev acc 0.8083, dev avg loss 0.415431, throughput 3.27265K wps
[Epoch 46 Batch 30/62] avg loss 0.00538927, throughput 3.29858K wps
[Epoch 46 Batch 60/62] avg loss 0.00571484, throughput 3.23673K wps
Begin Testing...
[Epoch 46] train avg loss 0.00562714, dev acc 0.8171, dev avg loss 0.413723, throughput 3.27451K wps
[Epoch 47 Batch 30/62] avg loss 0.0054065, throughput 3.32832K wps
[Epoch 47 Batch 60/62] avg loss 0.00558721, throughput 3.22322K wps
Begin Testing...
[Epoch 47] train avg loss 0.00553656, dev acc 0.8112, dev avg loss 0.411693, throughput 3.2798K wps
[Epoch 48 Batch 30/62] avg loss 0.00544149, throughput 3.30003K wps
[Epoch 48 Batch 60/62] avg loss 0.00541284, throughput 3.23414K wps
Begin Testing...
[Epoch 48] train avg loss 0.0055122, dev acc 0.8053, dev avg loss 0.414174, throughput 3.27228K wps
[Epoch 49 Batch 30/62] avg loss 0.00565469, throughput 3.29724K wps
[Epoch 49 Batch 60/62] avg loss 0.00513911, throughput 3.20441K wps
Begin Testing...
[Epoch 49] train avg loss 0.00546651, dev acc 0.8201, dev avg loss 0.408184, throughput 3.25307K wps
Observed Improvement.
Begin Testing...
[Epoch 50 Batch 30/62] avg loss 0.00536651, throughput 3.29486K wps
[Epoch 50 Batch 60/62] avg loss 0.00519544, throughput 3.22975K wps
Begin Testing...
[Epoch 50] train avg loss 0.00529447, dev acc 0.8112, dev avg loss 0.413166, throughput 3.26715K wps
[Epoch 51 Batch 30/62] avg loss 0.00519159, throughput 3.28311K wps
[Epoch 51 Batch 60/62] avg loss 0.00512819, throughput 3.21988K wps
Begin Testing...
[Epoch 51] train avg loss 0.00521718, dev acc 0.8201, dev avg loss 0.40608, throughput 3.25741K wps
Observed Improvement.
Begin Testing...
[Epoch 52 Batch 30/62] avg loss 0.00527688, throughput 3.30122K wps
[Epoch 52 Batch 60/62] avg loss 0.00485971, throughput 3.22841K wps
Begin Testing...
[Epoch 52] train avg loss 0.00511752, dev acc 0.8230, dev avg loss 0.405056, throughput 3.27042K wps
Observed Improvement.
Begin Testing...
[Epoch 53 Batch 30/62] avg loss 0.00514033, throughput 3.3005K wps
[Epoch 53 Batch 60/62] avg loss 0.00480066, throughput 3.23974K wps
Begin Testing...
[Epoch 53] train avg loss 0.00503389, dev acc 0.8112, dev avg loss 0.40768, throughput 3.27547K wps
[Epoch 54 Batch 30/62] avg loss 0.00481823, throughput 3.29592K wps
[Epoch 54 Batch 60/62] avg loss 0.00488368, throughput 3.23277K wps
Begin Testing...
[Epoch 54] train avg loss 0.00488105, dev acc 0.8171, dev avg loss 0.406379, throughput 3.26935K wps
[Epoch 55 Batch 30/62] avg loss 0.00467364, throughput 3.31428K wps
[Epoch 55 Batch 60/62] avg loss 0.0048564, throughput 3.19604K wps
Begin Testing...
[Epoch 55] train avg loss 0.00483617, dev acc 0.8230, dev avg loss 0.402401, throughput 3.2616K wps
Observed Improvement.
Begin Testing...
[Epoch 56 Batch 30/62] avg loss 0.00480682, throughput 3.28673K wps
[Epoch 56 Batch 60/62] avg loss 0.00461925, throughput 3.22143K wps
Begin Testing...
[Epoch 56] train avg loss 0.00479454, dev acc 0.8201, dev avg loss 0.405635, throughput 3.25941K wps
[Epoch 57 Batch 30/62] avg loss 0.00440682, throughput 3.3132K wps
[Epoch 57 Batch 60/62] avg loss 0.00475593, throughput 3.20848K wps
Begin Testing...
[Epoch 57] train avg loss 0.0045979, dev acc 0.8260, dev avg loss 0.401906, throughput 3.26637K wps
Observed Improvement.
Begin Testing...
[Epoch 58 Batch 30/62] avg loss 0.00439177, throughput 3.29761K wps
[Epoch 58 Batch 60/62] avg loss 0.00462431, throughput 3.21642K wps
Begin Testing...
[Epoch 58] train avg loss 0.00455091, dev acc 0.8319, dev avg loss 0.400016, throughput 3.262K wps
Observed Improvement.
Begin Testing...
[Epoch 59 Batch 30/62] avg loss 0.00437344, throughput 3.30008K wps
[Epoch 59 Batch 60/62] avg loss 0.00448443, throughput 3.22982K wps
Begin Testing...
[Epoch 59] train avg loss 0.00447979, dev acc 0.8407, dev avg loss 0.399509, throughput 3.27032K wps
Observed Improvement.
Begin Testing...
[Epoch 60 Batch 30/62] avg loss 0.0042501, throughput 3.31032K wps
[Epoch 60 Batch 60/62] avg loss 0.00455598, throughput 3.23222K wps
Begin Testing...
[Epoch 60] train avg loss 0.00443393, dev acc 0.8348, dev avg loss 0.401052, throughput 3.2708K wps
[Epoch 61 Batch 30/62] avg loss 0.00422981, throughput 3.29831K wps
[Epoch 61 Batch 60/62] avg loss 0.00422732, throughput 3.21102K wps
Begin Testing...
[Epoch 61] train avg loss 0.00425805, dev acc 0.8289, dev avg loss 0.396667, throughput 3.26158K wps
[Epoch 62 Batch 30/62] avg loss 0.00437636, throughput 3.3024K wps
[Epoch 62 Batch 60/62] avg loss 0.00405651, throughput 3.23519K wps
Begin Testing...
[Epoch 62] train avg loss 0.0043168, dev acc 0.8319, dev avg loss 0.396371, throughput 3.2754K wps
[Epoch 63 Batch 30/62] avg loss 0.00411264, throughput 3.29625K wps
[Epoch 63 Batch 60/62] avg loss 0.00404028, throughput 3.21693K wps
Begin Testing...
[Epoch 63] train avg loss 0.00416166, dev acc 0.8437, dev avg loss 0.397971, throughput 3.26325K wps
Observed Improvement.
Begin Testing...
[Epoch 64 Batch 30/62] avg loss 0.00402705, throughput 3.31298K wps
[Epoch 64 Batch 60/62] avg loss 0.00414228, throughput 3.21653K wps
Begin Testing...
[Epoch 64] train avg loss 0.00410033, dev acc 0.8348, dev avg loss 0.395691, throughput 3.27071K wps
[Epoch 65 Batch 30/62] avg loss 0.00397217, throughput 3.3061K wps
[Epoch 65 Batch 60/62] avg loss 0.00396839, throughput 3.22889K wps
Begin Testing...
[Epoch 65] train avg loss 0.00406161, dev acc 0.8260, dev avg loss 0.403318, throughput 3.27189K wps
[Epoch 66 Batch 30/62] avg loss 0.00385111, throughput 3.29607K wps
[Epoch 66 Batch 60/62] avg loss 0.00402661, throughput 3.22201K wps
Begin Testing...
[Epoch 66] train avg loss 0.00399262, dev acc 0.8407, dev avg loss 0.395094, throughput 3.26631K wps
[Epoch 67 Batch 30/62] avg loss 0.00385873, throughput 3.29535K wps
[Epoch 67 Batch 60/62] avg loss 0.00373285, throughput 3.204K wps
Begin Testing...
[Epoch 67] train avg loss 0.0038312, dev acc 0.8496, dev avg loss 0.395513, throughput 3.25459K wps
Observed Improvement.
Begin Testing...
[Epoch 68 Batch 30/62] avg loss 0.00352478, throughput 3.28579K wps
[Epoch 68 Batch 60/62] avg loss 0.00391364, throughput 3.2085K wps
Begin Testing...
[Epoch 68] train avg loss 0.00381432, dev acc 0.8230, dev avg loss 0.392567, throughput 3.25221K wps
[Epoch 69 Batch 30/62] avg loss 0.00372169, throughput 3.2866K wps
[Epoch 69 Batch 60/62] avg loss 0.0036612, throughput 3.21547K wps
Begin Testing...
[Epoch 69] train avg loss 0.00371983, dev acc 0.8319, dev avg loss 0.392922, throughput 3.2561K wps
[Epoch 70 Batch 30/62] avg loss 0.00347704, throughput 3.28018K wps
[Epoch 70 Batch 60/62] avg loss 0.00367201, throughput 3.2143K wps
Begin Testing...
[Epoch 70] train avg loss 0.00359678, dev acc 0.8525, dev avg loss 0.395358, throughput 3.25485K wps
Observed Improvement.
Begin Testing...
[Epoch 71 Batch 30/62] avg loss 0.00344919, throughput 3.29709K wps
[Epoch 71 Batch 60/62] avg loss 0.00355758, throughput 3.21736K wps
Begin Testing...
[Epoch 71] train avg loss 0.00353729, dev acc 0.8319, dev avg loss 0.391801, throughput 3.26274K wps
[Epoch 72 Batch 30/62] avg loss 0.00365617, throughput 3.30477K wps
[Epoch 72 Batch 60/62] avg loss 0.00334467, throughput 3.2184K wps
Begin Testing...
[Epoch 72] train avg loss 0.00355517, dev acc 0.8466, dev avg loss 0.394742, throughput 3.26841K wps
[Epoch 73 Batch 30/62] avg loss 0.00331663, throughput 3.29091K wps
[Epoch 73 Batch 60/62] avg loss 0.00331115, throughput 3.22724K wps
Begin Testing...
[Epoch 73] train avg loss 0.00337466, dev acc 0.8348, dev avg loss 0.390193, throughput 3.26476K wps
[Epoch 74 Batch 30/62] avg loss 0.00329994, throughput 3.29878K wps
[Epoch 74 Batch 60/62] avg loss 0.00337102, throughput 3.22008K wps
Begin Testing...
[Epoch 74] train avg loss 0.00334545, dev acc 0.8348, dev avg loss 0.390526, throughput 3.26467K wps
[Epoch 75 Batch 30/62] avg loss 0.00317221, throughput 3.31775K wps
[Epoch 75 Batch 60/62] avg loss 0.003445, throughput 3.23261K wps
Begin Testing...
[Epoch 75] train avg loss 0.00332648, dev acc 0.8525, dev avg loss 0.391598, throughput 3.28137K wps
Observed Improvement.
Begin Testing...
[Epoch 76 Batch 30/62] avg loss 0.0031992, throughput 3.31361K wps
[Epoch 76 Batch 60/62] avg loss 0.00347003, throughput 3.22705K wps
Begin Testing...
[Epoch 76] train avg loss 0.00335065, dev acc 0.8407, dev avg loss 0.389136, throughput 3.27604K wps
[Epoch 77 Batch 30/62] avg loss 0.00329104, throughput 3.29523K wps
[Epoch 77 Batch 60/62] avg loss 0.00304171, throughput 3.21723K wps
Begin Testing...
[Epoch 77] train avg loss 0.00324275, dev acc 0.8378, dev avg loss 0.388245, throughput 3.26066K wps
[Epoch 78 Batch 30/62] avg loss 0.00303973, throughput 3.28892K wps
[Epoch 78 Batch 60/62] avg loss 0.00313471, throughput 3.22585K wps
Begin Testing...
[Epoch 78] train avg loss 0.00310289, dev acc 0.8437, dev avg loss 0.389966, throughput 3.26276K wps
[Epoch 79 Batch 30/62] avg loss 0.00301366, throughput 3.30693K wps
[Epoch 79 Batch 60/62] avg loss 0.00304934, throughput 3.21177K wps
Begin Testing...
[Epoch 79] train avg loss 0.00305653, dev acc 0.8407, dev avg loss 0.388926, throughput 3.26426K wps
[Epoch 80 Batch 30/62] avg loss 0.00292107, throughput 3.28739K wps
[Epoch 80 Batch 60/62] avg loss 0.0029615, throughput 3.20525K wps
Begin Testing...
[Epoch 80] train avg loss 0.0030004, dev acc 0.8407, dev avg loss 0.389602, throughput 3.25222K wps
[Epoch 81 Batch 30/62] avg loss 0.00300573, throughput 3.29602K wps
[Epoch 81 Batch 60/62] avg loss 0.00289599, throughput 3.21929K wps
Begin Testing...
[Epoch 81] train avg loss 0.00294066, dev acc 0.8525, dev avg loss 0.392829, throughput 3.26372K wps
Observed Improvement.
Begin Testing...
[Epoch 82 Batch 30/62] avg loss 0.00279445, throughput 3.28863K wps
[Epoch 82 Batch 60/62] avg loss 0.00288153, throughput 3.23195K wps
Begin Testing...
[Epoch 82] train avg loss 0.00284179, dev acc 0.8437, dev avg loss 0.388285, throughput 3.26551K wps
[Epoch 83 Batch 30/62] avg loss 0.00293351, throughput 3.29329K wps
[Epoch 83 Batch 60/62] avg loss 0.00277965, throughput 3.23905K wps
Begin Testing...
[Epoch 83] train avg loss 0.00287721, dev acc 0.8437, dev avg loss 0.388527, throughput 3.27501K wps
[Epoch 84 Batch 30/62] avg loss 0.00268538, throughput 3.30941K wps
[Epoch 84 Batch 60/62] avg loss 0.0028424, throughput 3.22296K wps
Begin Testing...
[Epoch 84] train avg loss 0.0027851, dev acc 0.8407, dev avg loss 0.397724, throughput 3.27197K wps
[Epoch 85 Batch 30/62] avg loss 0.00278038, throughput 3.27828K wps
[Epoch 85 Batch 60/62] avg loss 0.0026297, throughput 3.21692K wps
Begin Testing...
[Epoch 85] train avg loss 0.00273534, dev acc 0.8289, dev avg loss 0.389478, throughput 3.25278K wps
[Epoch 86 Batch 30/62] avg loss 0.00276298, throughput 3.28136K wps
[Epoch 86 Batch 60/62] avg loss 0.00273945, throughput 3.21462K wps
Begin Testing...
[Epoch 86] train avg loss 0.00280084, dev acc 0.8348, dev avg loss 0.388193, throughput 3.25468K wps
[Epoch 87 Batch 30/62] avg loss 0.00254576, throughput 3.29895K wps
[Epoch 87 Batch 60/62] avg loss 0.00271004, throughput 3.21273K wps
Begin Testing...
[Epoch 87] train avg loss 0.00267001, dev acc 0.8525, dev avg loss 0.390259, throughput 3.26138K wps
Observed Improvement.
Begin Testing...
[Epoch 88 Batch 30/62] avg loss 0.00272699, throughput 3.28798K wps
[Epoch 88 Batch 60/62] avg loss 0.00248168, throughput 3.18734K wps
Begin Testing...
[Epoch 88] train avg loss 0.00262369, dev acc 0.8437, dev avg loss 0.38882, throughput 3.24523K wps
[Epoch 89 Batch 30/62] avg loss 0.00249105, throughput 3.2749K wps
[Epoch 89 Batch 60/62] avg loss 0.00255826, throughput 3.20457K wps
Begin Testing...
[Epoch 89] train avg loss 0.00258675, dev acc 0.8407, dev avg loss 0.38885, throughput 3.24594K wps
[Epoch 90 Batch 30/62] avg loss 0.0025422, throughput 3.30303K wps
[Epoch 90 Batch 60/62] avg loss 0.00255329, throughput 3.20295K wps
Begin Testing...
[Epoch 90] train avg loss 0.00261719, dev acc 0.8289, dev avg loss 0.390579, throughput 3.25905K wps
[Epoch 91 Batch 30/62] avg loss 0.00236048, throughput 3.31575K wps
[Epoch 91 Batch 60/62] avg loss 0.00257886, throughput 3.20389K wps
Begin Testing...
[Epoch 91] train avg loss 0.00248943, dev acc 0.8289, dev avg loss 0.390438, throughput 3.26667K wps
[Epoch 92 Batch 30/62] avg loss 0.00232691, throughput 3.30306K wps
[Epoch 92 Batch 60/62] avg loss 0.0024305, throughput 3.20762K wps
Begin Testing...
[Epoch 92] train avg loss 0.00242786, dev acc 0.8319, dev avg loss 0.388668, throughput 3.26243K wps
[Epoch 93 Batch 30/62] avg loss 0.00251963, throughput 3.31215K wps
[Epoch 93 Batch 60/62] avg loss 0.00224475, throughput 3.22292K wps
Begin Testing...
[Epoch 93] train avg loss 0.00239372, dev acc 0.8525, dev avg loss 0.390595, throughput 3.27298K wps
Observed Improvement.
Begin Testing...
[Epoch 94 Batch 30/62] avg loss 0.00221825, throughput 3.29441K wps
[Epoch 94 Batch 60/62] avg loss 0.00243126, throughput 3.2292K wps
Begin Testing...
[Epoch 94] train avg loss 0.0023344, dev acc 0.8525, dev avg loss 0.391429, throughput 3.26687K wps
Observed Improvement.
Begin Testing...
[Epoch 95 Batch 30/62] avg loss 0.00241627, throughput 3.29648K wps
[Epoch 95 Batch 60/62] avg loss 0.00222388, throughput 3.21014K wps
Begin Testing...
[Epoch 95] train avg loss 0.00233129, dev acc 0.8378, dev avg loss 0.38917, throughput 3.26002K wps
[Epoch 96 Batch 30/62] avg loss 0.00228826, throughput 3.28843K wps
[Epoch 96 Batch 60/62] avg loss 0.00246106, throughput 3.22754K wps
Begin Testing...
[Epoch 96] train avg loss 0.0023792, dev acc 0.8407, dev avg loss 0.389341, throughput 3.26457K wps
[Epoch 97 Batch 30/62] avg loss 0.00223325, throughput 3.27964K wps
[Epoch 97 Batch 60/62] avg loss 0.00223822, throughput 3.22679K wps
Begin Testing...
[Epoch 97] train avg loss 0.00229143, dev acc 0.8378, dev avg loss 0.389024, throughput 3.25916K wps
[Epoch 98 Batch 30/62] avg loss 0.00221346, throughput 3.27796K wps
[Epoch 98 Batch 60/62] avg loss 0.00227809, throughput 3.23167K wps
Begin Testing...
[Epoch 98] train avg loss 0.00231521, dev acc 0.8289, dev avg loss 0.38874, throughput 3.25917K wps
[Epoch 99 Batch 30/62] avg loss 0.00208261, throughput 3.29972K wps
[Epoch 99 Batch 60/62] avg loss 0.00204515, throughput 3.21821K wps
Begin Testing...
[Epoch 99] train avg loss 0.00210253, dev acc 0.8466, dev avg loss 0.387696, throughput 3.26405K wps
[Epoch 100 Batch 30/62] avg loss 0.0021161, throughput 3.29562K wps
[Epoch 100 Batch 60/62] avg loss 0.00209687, throughput 3.20877K wps
Begin Testing...
[Epoch 100] train avg loss 0.00214338, dev acc 0.8319, dev avg loss 0.388735, throughput 3.25817K wps
[Epoch 101 Batch 30/62] avg loss 0.00216147, throughput 3.27939K wps
[Epoch 101 Batch 60/62] avg loss 0.00205364, throughput 3.21589K wps
Begin Testing...
[Epoch 101] train avg loss 0.00213624, dev acc 0.8289, dev avg loss 0.391255, throughput 3.25327K wps
[Epoch 102 Batch 30/62] avg loss 0.00211164, throughput 3.27082K wps
[Epoch 102 Batch 60/62] avg loss 0.0019952, throughput 3.22466K wps
Begin Testing...
[Epoch 102] train avg loss 0.002083, dev acc 0.8466, dev avg loss 0.39249, throughput 3.25337K wps
[Epoch 103 Batch 30/62] avg loss 0.00200842, throughput 3.29248K wps
[Epoch 103 Batch 60/62] avg loss 0.00205018, throughput 3.22181K wps
Begin Testing...
[Epoch 103] train avg loss 0.00203013, dev acc 0.8437, dev avg loss 0.388835, throughput 3.26247K wps
[Epoch 104 Batch 30/62] avg loss 0.0020067, throughput 3.28145K wps
[Epoch 104 Batch 60/62] avg loss 0.00189851, throughput 3.22149K wps
Begin Testing...
[Epoch 104] train avg loss 0.00197464, dev acc 0.8525, dev avg loss 0.390005, throughput 3.25658K wps
Observed Improvement.
Begin Testing...
[Epoch 105 Batch 30/62] avg loss 0.00199213, throughput 3.2788K wps
[Epoch 105 Batch 60/62] avg loss 0.0020008, throughput 3.1782K wps
Begin Testing...
[Epoch 105] train avg loss 0.00205958, dev acc 0.8437, dev avg loss 0.389645, throughput 3.23448K wps
[Epoch 106 Batch 30/62] avg loss 0.00187829, throughput 3.29454K wps
[Epoch 106 Batch 60/62] avg loss 0.00190597, throughput 3.19596K wps
Begin Testing...
[Epoch 106] train avg loss 0.00190647, dev acc 0.8407, dev avg loss 0.390076, throughput 3.24988K wps
[Epoch 107 Batch 30/62] avg loss 0.00185693, throughput 3.31479K wps
[Epoch 107 Batch 60/62] avg loss 0.00189857, throughput 3.23387K wps
Begin Testing...
[Epoch 107] train avg loss 0.00191293, dev acc 0.8407, dev avg loss 0.390179, throughput 3.28082K wps
[Epoch 108 Batch 30/62] avg loss 0.00174523, throughput 3.309K wps
[Epoch 108 Batch 60/62] avg loss 0.00195723, throughput 3.22334K wps
Begin Testing...
[Epoch 108] train avg loss 0.00187826, dev acc 0.8407, dev avg loss 0.391125, throughput 3.27192K wps
[Epoch 109 Batch 30/62] avg loss 0.00175578, throughput 3.28018K wps
[Epoch 109 Batch 60/62] avg loss 0.0018422, throughput 3.21034K wps
Begin Testing...
[Epoch 109] train avg loss 0.00181805, dev acc 0.8555, dev avg loss 0.393443, throughput 3.25118K wps
Observed Improvement.
Begin Testing...
[Epoch 110 Batch 30/62] avg loss 0.00169076, throughput 3.29496K wps
[Epoch 110 Batch 60/62] avg loss 0.0018451, throughput 3.19842K wps
Begin Testing...
[Epoch 110] train avg loss 0.00181804, dev acc 0.8378, dev avg loss 0.392322, throughput 3.25407K wps
[Epoch 111 Batch 30/62] avg loss 0.00184038, throughput 3.29625K wps
[Epoch 111 Batch 60/62] avg loss 0.00178821, throughput 3.18867K wps
Begin Testing...
[Epoch 111] train avg loss 0.001864, dev acc 0.8437, dev avg loss 0.391509, throughput 3.24997K wps
[Epoch 112 Batch 30/62] avg loss 0.00172273, throughput 3.29424K wps
[Epoch 112 Batch 60/62] avg loss 0.00172912, throughput 3.18377K wps
Begin Testing...
[Epoch 112] train avg loss 0.00175813, dev acc 0.8525, dev avg loss 0.395375, throughput 3.24298K wps
[Epoch 113 Batch 30/62] avg loss 0.00162077, throughput 3.26473K wps
[Epoch 113 Batch 60/62] avg loss 0.00189236, throughput 3.21073K wps
Begin Testing...
[Epoch 113] train avg loss 0.00180109, dev acc 0.8437, dev avg loss 0.393, throughput 3.24343K wps
[Epoch 114 Batch 30/62] avg loss 0.00166752, throughput 3.2699K wps
[Epoch 114 Batch 60/62] avg loss 0.00175969, throughput 3.21173K wps
Begin Testing...
[Epoch 114] train avg loss 0.00172691, dev acc 0.8525, dev avg loss 0.395342, throughput 3.24659K wps
[Epoch 115 Batch 30/62] avg loss 0.00165015, throughput 3.29846K wps
[Epoch 115 Batch 60/62] avg loss 0.00162462, throughput 3.2352K wps
Begin Testing...
[Epoch 115] train avg loss 0.00165205, dev acc 0.8466, dev avg loss 0.393891, throughput 3.27301K wps
[Epoch 116 Batch 30/62] avg loss 0.00156695, throughput 3.29889K wps
[Epoch 116 Batch 60/62] avg loss 0.00171269, throughput 3.23348K wps
Begin Testing...
[Epoch 116] train avg loss 0.00166122, dev acc 0.8437, dev avg loss 0.394668, throughput 3.27234K wps
[Epoch 117 Batch 30/62] avg loss 0.00156918, throughput 3.31637K wps
[Epoch 117 Batch 60/62] avg loss 0.00166263, throughput 3.23099K wps
Begin Testing...
[Epoch 117] train avg loss 0.00162889, dev acc 0.8466, dev avg loss 0.396185, throughput 3.28021K wps
[Epoch 118 Batch 30/62] avg loss 0.00166891, throughput 3.26902K wps
[Epoch 118 Batch 60/62] avg loss 0.00160721, throughput 3.20859K wps
Begin Testing...
[Epoch 118] train avg loss 0.00163918, dev acc 0.8466, dev avg loss 0.398858, throughput 3.24596K wps
[Epoch 119 Batch 30/62] avg loss 0.00157578, throughput 3.26775K wps
[Epoch 119 Batch 60/62] avg loss 0.00150904, throughput 3.15171K wps
Begin Testing...
[Epoch 119] train avg loss 0.00158108, dev acc 0.8466, dev avg loss 0.399138, throughput 3.21564K wps
[Epoch 120 Batch 30/62] avg loss 0.00143529, throughput 3.25264K wps
[Epoch 120 Batch 60/62] avg loss 0.00157817, throughput 3.15017K wps
Begin Testing...
[Epoch 120] train avg loss 0.00151118, dev acc 0.8407, dev avg loss 0.39601, throughput 3.20674K wps
[Epoch 121 Batch 30/62] avg loss 0.00147313, throughput 3.25485K wps
[Epoch 121 Batch 60/62] avg loss 0.00150156, throughput 3.15262K wps
Begin Testing...
[Epoch 121] train avg loss 0.00153048, dev acc 0.8437, dev avg loss 0.400966, throughput 3.20918K wps
[Epoch 122 Batch 30/62] avg loss 0.00145238, throughput 3.24992K wps
[Epoch 122 Batch 60/62] avg loss 0.0015233, throughput 3.15663K wps
Begin Testing...
[Epoch 122] train avg loss 0.00150055, dev acc 0.8496, dev avg loss 0.398548, throughput 3.20964K wps
[Epoch 123 Batch 30/62] avg loss 0.00153649, throughput 3.26206K wps
[Epoch 123 Batch 60/62] avg loss 0.00144486, throughput 3.20947K wps
Begin Testing...
[Epoch 123] train avg loss 0.00152311, dev acc 0.8496, dev avg loss 0.397735, throughput 3.24192K wps
[Epoch 124 Batch 30/62] avg loss 0.00146153, throughput 3.28926K wps
[Epoch 124 Batch 60/62] avg loss 0.00137095, throughput 3.22121K wps
Begin Testing...
[Epoch 124] train avg loss 0.00146532, dev acc 0.8407, dev avg loss 0.396849, throughput 3.26132K wps
[Epoch 125 Batch 30/62] avg loss 0.001439, throughput 3.27489K wps
[Epoch 125 Batch 60/62] avg loss 0.00146776, throughput 3.19366K wps
Begin Testing...
[Epoch 125] train avg loss 0.00147532, dev acc 0.8437, dev avg loss 0.396767, throughput 3.24169K wps
[Epoch 126 Batch 30/62] avg loss 0.00146663, throughput 3.28177K wps
[Epoch 126 Batch 60/62] avg loss 0.00132044, throughput 3.1568K wps
Begin Testing...
[Epoch 126] train avg loss 0.0014118, dev acc 0.8496, dev avg loss 0.399621, throughput 3.22407K wps
[Epoch 127 Batch 30/62] avg loss 0.00137669, throughput 3.23611K wps
[Epoch 127 Batch 60/62] avg loss 0.00138536, throughput 3.15519K wps
Begin Testing...
[Epoch 127] train avg loss 0.00141245, dev acc 0.8496, dev avg loss 0.398625, throughput 3.20103K wps
[Epoch 128 Batch 30/62] avg loss 0.00141957, throughput 3.26182K wps
[Epoch 128 Batch 60/62] avg loss 0.00142737, throughput 3.1661K wps
Begin Testing...
[Epoch 128] train avg loss 0.00148748, dev acc 0.8319, dev avg loss 0.398191, throughput 3.22134K wps
[Epoch 129 Batch 30/62] avg loss 0.00136145, throughput 3.2844K wps
[Epoch 129 Batch 60/62] avg loss 0.00127108, throughput 3.19551K wps
Begin Testing...
[Epoch 129] train avg loss 0.00133248, dev acc 0.8437, dev avg loss 0.398243, throughput 3.24635K wps
[Epoch 130 Batch 30/62] avg loss 0.00131541, throughput 3.25239K wps
[Epoch 130 Batch 60/62] avg loss 0.00128013, throughput 3.16022K wps
Begin Testing...
[Epoch 130] train avg loss 0.0013367, dev acc 0.8407, dev avg loss 0.398031, throughput 3.21253K wps
[Epoch 131 Batch 30/62] avg loss 0.00147907, throughput 3.24936K wps
[Epoch 131 Batch 60/62] avg loss 0.00130702, throughput 3.21129K wps
Begin Testing...
[Epoch 131] train avg loss 0.00142462, dev acc 0.8378, dev avg loss 0.398569, throughput 3.2362K wps
[Epoch 132 Batch 30/62] avg loss 0.00145317, throughput 3.26555K wps
[Epoch 132 Batch 60/62] avg loss 0.00117774, throughput 3.16072K wps
Begin Testing...
[Epoch 132] train avg loss 0.00137631, dev acc 0.8378, dev avg loss 0.41748, throughput 3.21897K wps
[Epoch 133 Batch 30/62] avg loss 0.00127893, throughput 3.23708K wps
[Epoch 133 Batch 60/62] avg loss 0.00141459, throughput 3.15392K wps
Begin Testing...
[Epoch 133] train avg loss 0.0013522, dev acc 0.8437, dev avg loss 0.399488, throughput 3.20165K wps
[Epoch 134 Batch 30/62] avg loss 0.00130058, throughput 3.23931K wps
[Epoch 134 Batch 60/62] avg loss 0.00117367, throughput 3.13912K wps
Begin Testing...
[Epoch 134] train avg loss 0.00123426, dev acc 0.8496, dev avg loss 0.401637, throughput 3.19605K wps
[Epoch 135 Batch 30/62] avg loss 0.001158, throughput 3.2345K wps
[Epoch 135 Batch 60/62] avg loss 0.00123383, throughput 3.16245K wps
Begin Testing...
[Epoch 135] train avg loss 0.00120515, dev acc 0.8496, dev avg loss 0.403589, throughput 3.20434K wps
[Epoch 136 Batch 30/62] avg loss 0.00126212, throughput 3.26366K wps
[Epoch 136 Batch 60/62] avg loss 0.00118602, throughput 3.14379K wps
Begin Testing...
[Epoch 136] train avg loss 0.00131031, dev acc 0.8496, dev avg loss 0.405487, throughput 3.20872K wps
[Epoch 137 Batch 30/62] avg loss 0.00116614, throughput 3.24979K wps
[Epoch 137 Batch 60/62] avg loss 0.00130914, throughput 3.16653K wps
Begin Testing...
[Epoch 137] train avg loss 0.00123935, dev acc 0.8437, dev avg loss 0.400147, throughput 3.21577K wps
[Epoch 138 Batch 30/62] avg loss 0.00114876, throughput 3.24703K wps
[Epoch 138 Batch 60/62] avg loss 0.00122893, throughput 3.19313K wps
Begin Testing...
[Epoch 138] train avg loss 0.00120131, dev acc 0.8437, dev avg loss 0.406947, throughput 3.22604K wps
[Epoch 139 Batch 30/62] avg loss 0.0011528, throughput 3.29324K wps
[Epoch 139 Batch 60/62] avg loss 0.0011626, throughput 3.15172K wps
Begin Testing...
[Epoch 139] train avg loss 0.00117297, dev acc 0.8466, dev avg loss 0.401623, throughput 3.22666K wps
[Epoch 140 Batch 30/62] avg loss 0.00121011, throughput 3.23324K wps
[Epoch 140 Batch 60/62] avg loss 0.00115775, throughput 3.18156K wps
Begin Testing...
[Epoch 140] train avg loss 0.00120417, dev acc 0.8378, dev avg loss 0.401717, throughput 3.21581K wps
[Epoch 141 Batch 30/62] avg loss 0.0011073, throughput 3.2744K wps
[Epoch 141 Batch 60/62] avg loss 0.00117166, throughput 3.18508K wps
Begin Testing...
[Epoch 141] train avg loss 0.00114364, dev acc 0.8437, dev avg loss 0.402345, throughput 3.23459K wps
[Epoch 142 Batch 30/62] avg loss 0.00122524, throughput 3.24093K wps
[Epoch 142 Batch 60/62] avg loss 0.0010482, throughput 3.16989K wps
Begin Testing...
[Epoch 142] train avg loss 0.00114104, dev acc 0.8407, dev avg loss 0.401988, throughput 3.2114K wps
[Epoch 143 Batch 30/62] avg loss 0.00109619, throughput 3.23056K wps
[Epoch 143 Batch 60/62] avg loss 0.00106824, throughput 3.16917K wps
Begin Testing...
[Epoch 143] train avg loss 0.00110914, dev acc 0.8437, dev avg loss 0.402415, throughput 3.20723K wps
[Epoch 144 Batch 30/62] avg loss 0.0011102, throughput 3.25516K wps
[Epoch 144 Batch 60/62] avg loss 0.00107301, throughput 3.18672K wps
Begin Testing...
[Epoch 144] train avg loss 0.00111035, dev acc 0.8466, dev avg loss 0.407574, throughput 3.22758K wps
[Epoch 145 Batch 30/62] avg loss 0.00107743, throughput 3.2741K wps
[Epoch 145 Batch 60/62] avg loss 0.00110053, throughput 3.18113K wps
Begin Testing...
[Epoch 145] train avg loss 0.00110291, dev acc 0.8437, dev avg loss 0.404043, throughput 3.23428K wps
[Epoch 146 Batch 30/62] avg loss 0.0011441, throughput 3.24367K wps
[Epoch 146 Batch 60/62] avg loss 0.00103995, throughput 3.18511K wps
Begin Testing...
[Epoch 146] train avg loss 0.00110344, dev acc 0.8407, dev avg loss 0.404134, throughput 3.21965K wps
[Epoch 147 Batch 30/62] avg loss 0.00098683, throughput 3.26594K wps
[Epoch 147 Batch 60/62] avg loss 0.00108643, throughput 3.15675K wps
Begin Testing...
[Epoch 147] train avg loss 0.00104901, dev acc 0.8466, dev avg loss 0.4055, throughput 3.21648K wps
[Epoch 148 Batch 30/62] avg loss 0.00112259, throughput 3.25421K wps
[Epoch 148 Batch 60/62] avg loss 0.00105626, throughput 3.15004K wps
Begin Testing...
[Epoch 148] train avg loss 0.00110405, dev acc 0.8496, dev avg loss 0.409642, throughput 3.2073K wps
[Epoch 149 Batch 30/62] avg loss 0.000989519, throughput 3.24871K wps
[Epoch 149 Batch 60/62] avg loss 0.00107194, throughput 3.15076K wps
Begin Testing...
[Epoch 149] train avg loss 0.00103309, dev acc 0.8437, dev avg loss 0.409576, throughput 3.20555K wps
[Epoch 150 Batch 30/62] avg loss 0.000917148, throughput 3.2596K wps
[Epoch 150 Batch 60/62] avg loss 0.00104164, throughput 3.16694K wps
Begin Testing...
[Epoch 150] train avg loss 0.000985006, dev acc 0.8407, dev avg loss 0.407092, throughput 3.21986K wps
[Epoch 151 Batch 30/62] avg loss 0.00102093, throughput 3.25806K wps
[Epoch 151 Batch 60/62] avg loss 0.0010371, throughput 3.19089K wps
Begin Testing...
[Epoch 151] train avg loss 0.00105247, dev acc 0.8466, dev avg loss 0.413291, throughput 3.23024K wps
[Epoch 152 Batch 30/62] avg loss 0.00104704, throughput 3.25952K wps
[Epoch 152 Batch 60/62] avg loss 0.000967719, throughput 3.16422K wps
Begin Testing...
[Epoch 152] train avg loss 0.00102265, dev acc 0.8407, dev avg loss 0.406275, throughput 3.21705K wps
[Epoch 153 Batch 30/62] avg loss 0.000955485, throughput 3.24862K wps
[Epoch 153 Batch 60/62] avg loss 0.00103006, throughput 3.13185K wps
Begin Testing...
[Epoch 153] train avg loss 0.00100455, dev acc 0.8378, dev avg loss 0.406671, throughput 3.1951K wps
[Epoch 154 Batch 30/62] avg loss 0.000836776, throughput 3.24998K wps
[Epoch 154 Batch 60/62] avg loss 0.00100767, throughput 3.13571K wps
Begin Testing...
[Epoch 154] train avg loss 0.000935857, dev acc 0.8378, dev avg loss 0.407009, throughput 3.19422K wps
[Epoch 155 Batch 30/62] avg loss 0.000885451, throughput 3.2654K wps
[Epoch 155 Batch 60/62] avg loss 0.000964668, throughput 3.14981K wps
Begin Testing...
[Epoch 155] train avg loss 0.000924592, dev acc 0.8378, dev avg loss 0.408405, throughput 3.21263K wps
[Epoch 156 Batch 30/62] avg loss 0.00087178, throughput 3.24694K wps
[Epoch 156 Batch 60/62] avg loss 0.000953181, throughput 3.15303K wps
Begin Testing...
[Epoch 156] train avg loss 0.000921692, dev acc 0.8378, dev avg loss 0.408626, throughput 3.20456K wps
[Epoch 157 Batch 30/62] avg loss 0.000950652, throughput 3.27553K wps
[Epoch 157 Batch 60/62] avg loss 0.000978885, throughput 3.16088K wps
Begin Testing...
[Epoch 157] train avg loss 0.000988944, dev acc 0.8407, dev avg loss 0.409322, throughput 3.22324K wps
[Epoch 158 Batch 30/62] avg loss 0.000960652, throughput 3.23562K wps
[Epoch 158 Batch 60/62] avg loss 0.000923171, throughput 3.16347K wps
Begin Testing...
[Epoch 158] train avg loss 0.000952645, dev acc 0.8407, dev avg loss 0.409051, throughput 3.2059K wps
[Epoch 159 Batch 30/62] avg loss 0.000875838, throughput 3.25856K wps
[Epoch 159 Batch 60/62] avg loss 0.000920967, throughput 3.18236K wps
Begin Testing...
[Epoch 159] train avg loss 0.000908508, dev acc 0.8378, dev avg loss 0.408372, throughput 3.22629K wps
[Epoch 160 Batch 30/62] avg loss 0.000900512, throughput 3.27347K wps
[Epoch 160 Batch 60/62] avg loss 0.000901154, throughput 3.16393K wps
Begin Testing...
[Epoch 160] train avg loss 0.000914601, dev acc 0.8437, dev avg loss 0.413876, throughput 3.22348K wps
[Epoch 161 Batch 30/62] avg loss 0.00081682, throughput 3.24988K wps
[Epoch 161 Batch 60/62] avg loss 0.000921989, throughput 3.14469K wps
Begin Testing...
[Epoch 161] train avg loss 0.000894754, dev acc 0.8378, dev avg loss 0.410974, throughput 3.20317K wps
[Epoch 162 Batch 30/62] avg loss 0.000941156, throughput 3.24553K wps
[Epoch 162 Batch 60/62] avg loss 0.000802051, throughput 3.14844K wps
Begin Testing...
[Epoch 162] train avg loss 0.000886813, dev acc 0.8378, dev avg loss 0.411818, throughput 3.20274K wps
[Epoch 163 Batch 30/62] avg loss 0.000819615, throughput 3.25889K wps
[Epoch 163 Batch 60/62] avg loss 0.000866993, throughput 3.13349K wps
Begin Testing...
[Epoch 163] train avg loss 0.000886667, dev acc 0.8496, dev avg loss 0.417006, throughput 3.20103K wps
[Epoch 164 Batch 30/62] avg loss 0.000928237, throughput 3.25729K wps
[Epoch 164 Batch 60/62] avg loss 0.000897526, throughput 3.15221K wps
Begin Testing...
[Epoch 164] train avg loss 0.000909757, dev acc 0.8466, dev avg loss 0.413948, throughput 3.21002K wps
[Epoch 165 Batch 30/62] avg loss 0.000867886, throughput 3.24834K wps
[Epoch 165 Batch 60/62] avg loss 0.000871119, throughput 3.16348K wps
Begin Testing...
[Epoch 165] train avg loss 0.000872623, dev acc 0.8378, dev avg loss 0.412925, throughput 3.2114K wps
[Epoch 166 Batch 30/62] avg loss 0.000913893, throughput 3.24569K wps
[Epoch 166 Batch 60/62] avg loss 0.000869908, throughput 3.16556K wps
Begin Testing...
[Epoch 166] train avg loss 0.000896341, dev acc 0.8407, dev avg loss 0.413815, throughput 3.21167K wps
[Epoch 167 Batch 30/62] avg loss 0.000937027, throughput 3.21996K wps
[Epoch 167 Batch 60/62] avg loss 0.00086624, throughput 3.16658K wps
Begin Testing...
[Epoch 167] train avg loss 0.000903147, dev acc 0.8437, dev avg loss 0.415977, throughput 3.19907K wps
[Epoch 168 Batch 30/62] avg loss 0.00081504, throughput 3.2317K wps
[Epoch 168 Batch 60/62] avg loss 0.000825115, throughput 3.15748K wps
Begin Testing...
[Epoch 168] train avg loss 0.000816721, dev acc 0.8437, dev avg loss 0.414828, throughput 3.20054K wps
[Epoch 169 Batch 30/62] avg loss 0.000791054, throughput 3.23152K wps
[Epoch 169 Batch 60/62] avg loss 0.000879886, throughput 3.16558K wps
Begin Testing...
[Epoch 169] train avg loss 0.000838755, dev acc 0.8378, dev avg loss 0.414819, throughput 3.20455K wps
[Epoch 170 Batch 30/62] avg loss 0.000781487, throughput 3.24735K wps
[Epoch 170 Batch 60/62] avg loss 0.000873738, throughput 3.16372K wps
Begin Testing...
[Epoch 170] train avg loss 0.000830254, dev acc 0.8437, dev avg loss 0.415007, throughput 3.21206K wps
[Epoch 171 Batch 30/62] avg loss 0.000802799, throughput 3.22428K wps
[Epoch 171 Batch 60/62] avg loss 0.000815636, throughput 3.14616K wps
Begin Testing...
[Epoch 171] train avg loss 0.000824586, dev acc 0.8437, dev avg loss 0.415976, throughput 3.19095K wps
[Epoch 172 Batch 30/62] avg loss 0.000783063, throughput 3.23417K wps
[Epoch 172 Batch 60/62] avg loss 0.000767788, throughput 3.15906K wps
Begin Testing...
[Epoch 172] train avg loss 0.000786149, dev acc 0.8437, dev avg loss 0.416616, throughput 3.2029K wps
[Epoch 173 Batch 30/62] avg loss 0.000788148, throughput 3.24589K wps
[Epoch 173 Batch 60/62] avg loss 0.0008021, throughput 3.16691K wps
Begin Testing...
[Epoch 173] train avg loss 0.000803167, dev acc 0.8466, dev avg loss 0.423611, throughput 3.21237K wps
[Epoch 174 Batch 30/62] avg loss 0.00074542, throughput 3.22868K wps
[Epoch 174 Batch 60/62] avg loss 0.00078851, throughput 3.15956K wps
Begin Testing...
[Epoch 174] train avg loss 0.00076482, dev acc 0.8466, dev avg loss 0.419527, throughput 3.20162K wps
[Epoch 175 Batch 30/62] avg loss 0.00080713, throughput 3.24349K wps
[Epoch 175 Batch 60/62] avg loss 0.000775451, throughput 3.19714K wps
Begin Testing...
[Epoch 175] train avg loss 0.000794876, dev acc 0.8407, dev avg loss 0.416965, throughput 3.2257K wps
[Epoch 176 Batch 30/62] avg loss 0.000758438, throughput 3.25468K wps
[Epoch 176 Batch 60/62] avg loss 0.000798023, throughput 3.16705K wps
Begin Testing...
[Epoch 176] train avg loss 0.000785955, dev acc 0.8437, dev avg loss 0.42017, throughput 3.21597K wps
[Epoch 177 Batch 30/62] avg loss 0.000696878, throughput 3.25449K wps
[Epoch 177 Batch 60/62] avg loss 0.000789322, throughput 3.16129K wps
Begin Testing...
[Epoch 177] train avg loss 0.000776826, dev acc 0.8466, dev avg loss 0.4172, throughput 3.21371K wps
[Epoch 178 Batch 30/62] avg loss 0.000754291, throughput 3.24968K wps
[Epoch 178 Batch 60/62] avg loss 0.000699679, throughput 3.15357K wps
Begin Testing...
[Epoch 178] train avg loss 0.000727838, dev acc 0.8466, dev avg loss 0.419713, throughput 3.20714K wps
[Epoch 179 Batch 30/62] avg loss 0.000775183, throughput 3.24934K wps
[Epoch 179 Batch 60/62] avg loss 0.000739611, throughput 3.15185K wps
Begin Testing...
[Epoch 179] train avg loss 0.000770162, dev acc 0.8437, dev avg loss 0.419011, throughput 3.20564K wps
[Epoch 180 Batch 30/62] avg loss 0.000819214, throughput 3.23868K wps
[Epoch 180 Batch 60/62] avg loss 0.000665515, throughput 3.1493K wps
Begin Testing...
[Epoch 180] train avg loss 0.000753941, dev acc 0.8407, dev avg loss 0.418744, throughput 3.19894K wps
[Epoch 181 Batch 30/62] avg loss 0.000670374, throughput 3.24977K wps
[Epoch 181 Batch 60/62] avg loss 0.000801806, throughput 3.15296K wps
Begin Testing...
[Epoch 181] train avg loss 0.000738884, dev acc 0.8407, dev avg loss 0.41933, throughput 3.20529K wps
[Epoch 182 Batch 30/62] avg loss 0.000640986, throughput 3.25459K wps
[Epoch 182 Batch 60/62] avg loss 0.00075251, throughput 3.15949K wps
Begin Testing...
[Epoch 182] train avg loss 0.00069369, dev acc 0.8466, dev avg loss 0.421982, throughput 3.21269K wps
[Epoch 183 Batch 30/62] avg loss 0.000817545, throughput 3.25483K wps
[Epoch 183 Batch 60/62] avg loss 0.000731832, throughput 3.16371K wps
Begin Testing...
[Epoch 183] train avg loss 0.000777945, dev acc 0.8407, dev avg loss 0.421007, throughput 3.21695K wps
[Epoch 184 Batch 30/62] avg loss 0.000745812, throughput 3.23429K wps
[Epoch 184 Batch 60/62] avg loss 0.000758532, throughput 3.19677K wps
Begin Testing...
[Epoch 184] train avg loss 0.000754565, dev acc 0.8437, dev avg loss 0.423129, throughput 3.22255K wps
[Epoch 185 Batch 30/62] avg loss 0.000781601, throughput 3.26775K wps
[Epoch 185 Batch 60/62] avg loss 0.000662684, throughput 3.16472K wps
Begin Testing...
[Epoch 185] train avg loss 0.000730471, dev acc 0.8437, dev avg loss 0.422122, throughput 3.22158K wps
[Epoch 186 Batch 30/62] avg loss 0.000764231, throughput 3.25969K wps
[Epoch 186 Batch 60/62] avg loss 0.000725359, throughput 3.17502K wps
Begin Testing...
[Epoch 186] train avg loss 0.000745411, dev acc 0.8466, dev avg loss 0.423248, throughput 3.22482K wps
[Epoch 187 Batch 30/62] avg loss 0.000656999, throughput 3.27893K wps
[Epoch 187 Batch 60/62] avg loss 0.000786882, throughput 3.14309K wps
Begin Testing...
[Epoch 187] train avg loss 0.000723593, dev acc 0.8466, dev avg loss 0.423615, throughput 3.21681K wps
[Epoch 188 Batch 30/62] avg loss 0.000680193, throughput 3.2625K wps
[Epoch 188 Batch 60/62] avg loss 0.000650376, throughput 3.15546K wps
Begin Testing...
[Epoch 188] train avg loss 0.000680365, dev acc 0.8407, dev avg loss 0.422953, throughput 3.21327K wps
[Epoch 189 Batch 30/62] avg loss 0.000678876, throughput 3.26007K wps
[Epoch 189 Batch 60/62] avg loss 0.000693678, throughput 3.1385K wps
Begin Testing...
[Epoch 189] train avg loss 0.000728686, dev acc 0.8407, dev avg loss 0.430882, throughput 3.20432K wps
[Epoch 190 Batch 30/62] avg loss 0.000719715, throughput 3.24675K wps
[Epoch 190 Batch 60/62] avg loss 0.000650824, throughput 3.17041K wps
Begin Testing...
[Epoch 190] train avg loss 0.00068864, dev acc 0.8437, dev avg loss 0.423589, throughput 3.21395K wps
[Epoch 191 Batch 30/62] avg loss 0.000665074, throughput 3.24358K wps
[Epoch 191 Batch 60/62] avg loss 0.000667189, throughput 3.14327K wps
Begin Testing...
[Epoch 191] train avg loss 0.000666489, dev acc 0.8466, dev avg loss 0.425574, throughput 3.19886K wps
[Epoch 192 Batch 30/62] avg loss 0.000607768, throughput 3.23692K wps
[Epoch 192 Batch 60/62] avg loss 0.000663213, throughput 3.167K wps
Begin Testing...
[Epoch 192] train avg loss 0.000640783, dev acc 0.8437, dev avg loss 0.427018, throughput 3.20836K wps
[Epoch 193 Batch 30/62] avg loss 0.000641234, throughput 3.25018K wps
[Epoch 193 Batch 60/62] avg loss 0.000735396, throughput 3.19331K wps
Begin Testing...
[Epoch 193] train avg loss 0.000686551, dev acc 0.8437, dev avg loss 0.425355, throughput 3.22816K wps
[Epoch 194 Batch 30/62] avg loss 0.000602857, throughput 3.27102K wps
[Epoch 194 Batch 60/62] avg loss 0.000681736, throughput 3.15736K wps
Begin Testing...
[Epoch 194] train avg loss 0.000638873, dev acc 0.8466, dev avg loss 0.427264, throughput 3.21752K wps
[Epoch 195 Batch 30/62] avg loss 0.000690048, throughput 3.25811K wps
[Epoch 195 Batch 60/62] avg loss 0.000633498, throughput 3.13181K wps
Begin Testing...
[Epoch 195] train avg loss 0.000668724, dev acc 0.8407, dev avg loss 0.425668, throughput 3.20223K wps
[Epoch 196 Batch 30/62] avg loss 0.000633123, throughput 3.24878K wps
[Epoch 196 Batch 60/62] avg loss 0.000699721, throughput 3.15622K wps
Begin Testing...
[Epoch 196] train avg loss 0.000699741, dev acc 0.8466, dev avg loss 0.427989, throughput 3.20812K wps
[Epoch 197 Batch 30/62] avg loss 0.000622794, throughput 3.25063K wps
[Epoch 197 Batch 60/62] avg loss 0.000614688, throughput 3.15786K wps
Begin Testing...
[Epoch 197] train avg loss 0.000626001, dev acc 0.8466, dev avg loss 0.428052, throughput 3.20909K wps
[Epoch 198 Batch 30/62] avg loss 0.000616123, throughput 3.24976K wps
[Epoch 198 Batch 60/62] avg loss 0.000667874, throughput 3.15063K wps
Begin Testing...
[Epoch 198] train avg loss 0.000654691, dev acc 0.8466, dev avg loss 0.430411, throughput 3.20528K wps
[Epoch 199 Batch 30/62] avg loss 0.000640891, throughput 3.2579K wps
[Epoch 199 Batch 60/62] avg loss 0.00067586, throughput 3.20935K wps
Begin Testing...
[Epoch 199] train avg loss 0.00067198, dev acc 0.8437, dev avg loss 0.429373, throughput 3.24112K wps
Test loss 0.314019, test acc 0.8621
Total time cost 460.29s
[Epoch 0 Batch 30/62] avg loss 0.0135574, throughput 3.07866K wps
[Epoch 0 Batch 60/62] avg loss 0.012979, throughput 3.18491K wps
Begin Testing...
[Epoch 0] train avg loss 0.0133952, dev acc 0.6519, dev avg loss 0.641543, throughput 3.13922K wps
Observed Improvement.
Begin Testing...
[Epoch 1 Batch 30/62] avg loss 0.0130709, throughput 3.24171K wps
[Epoch 1 Batch 60/62] avg loss 0.0128744, throughput 3.16558K wps
Begin Testing...
[Epoch 1] train avg loss 0.0131227, dev acc 0.6519, dev avg loss 0.631503, throughput 3.20907K wps
Observed Improvement.
Begin Testing...
[Epoch 2 Batch 30/62] avg loss 0.0128439, throughput 3.27923K wps
[Epoch 2 Batch 60/62] avg loss 0.0128639, throughput 3.13792K wps
Begin Testing...
[Epoch 2] train avg loss 0.0130122, dev acc 0.6519, dev avg loss 0.625119, throughput 3.21116K wps
Observed Improvement.
Begin Testing...
[Epoch 3 Batch 30/62] avg loss 0.0127279, throughput 3.28489K wps
[Epoch 3 Batch 60/62] avg loss 0.0125027, throughput 3.1594K wps
Begin Testing...
[Epoch 3] train avg loss 0.0127523, dev acc 0.6519, dev avg loss 0.616692, throughput 3.22643K wps
Observed Improvement.
Begin Testing...
[Epoch 4 Batch 30/62] avg loss 0.0126343, throughput 3.22785K wps
[Epoch 4 Batch 60/62] avg loss 0.0121908, throughput 3.19766K wps
Begin Testing...
[Epoch 4] train avg loss 0.0126011, dev acc 0.6519, dev avg loss 0.609618, throughput 3.2194K wps
Observed Improvement.
Begin Testing...
[Epoch 5 Batch 30/62] avg loss 0.012367, throughput 3.28611K wps
[Epoch 5 Batch 60/62] avg loss 0.0121978, throughput 3.17065K wps
Begin Testing...
[Epoch 5] train avg loss 0.0124231, dev acc 0.6519, dev avg loss 0.601991, throughput 3.23314K wps
Observed Improvement.
Begin Testing...
[Epoch 6 Batch 30/62] avg loss 0.0121676, throughput 3.27088K wps
[Epoch 6 Batch 60/62] avg loss 0.01198, throughput 3.16811K wps
Begin Testing...
[Epoch 6] train avg loss 0.0122455, dev acc 0.6519, dev avg loss 0.594572, throughput 3.22396K wps
Observed Improvement.
Begin Testing...
[Epoch 7 Batch 30/62] avg loss 0.0119005, throughput 3.25237K wps
[Epoch 7 Batch 60/62] avg loss 0.0119368, throughput 3.1731K wps
Begin Testing...
[Epoch 7] train avg loss 0.0120854, dev acc 0.6549, dev avg loss 0.587292, throughput 3.21922K wps
Observed Improvement.
Begin Testing...
[Epoch 8 Batch 30/62] avg loss 0.0116713, throughput 3.24862K wps
[Epoch 8 Batch 60/62] avg loss 0.011845, throughput 3.15592K wps
Begin Testing...
[Epoch 8] train avg loss 0.0118977, dev acc 0.6519, dev avg loss 0.578652, throughput 3.20737K wps
[Epoch 9 Batch 30/62] avg loss 0.0115793, throughput 3.23409K wps
[Epoch 9 Batch 60/62] avg loss 0.0116747, throughput 3.15045K wps
Begin Testing...
[Epoch 9] train avg loss 0.0117178, dev acc 0.6667, dev avg loss 0.570167, throughput 3.19645K wps
Observed Improvement.
Begin Testing...
[Epoch 10 Batch 30/62] avg loss 0.0115629, throughput 3.2725K wps
[Epoch 10 Batch 60/62] avg loss 0.0111805, throughput 3.16966K wps
Begin Testing...
[Epoch 10] train avg loss 0.0114645, dev acc 0.6519, dev avg loss 0.564529, throughput 3.22574K wps
[Epoch 11 Batch 30/62] avg loss 0.011223, throughput 3.25131K wps
[Epoch 11 Batch 60/62] avg loss 0.0111048, throughput 3.14373K wps
Begin Testing...
[Epoch 11] train avg loss 0.0112874, dev acc 0.6726, dev avg loss 0.553088, throughput 3.20335K wps
Observed Improvement.
Begin Testing...
[Epoch 12 Batch 30/62] avg loss 0.0113727, throughput 3.25312K wps
[Epoch 12 Batch 60/62] avg loss 0.0106706, throughput 3.17616K wps
Begin Testing...
[Epoch 12] train avg loss 0.0111691, dev acc 0.6962, dev avg loss 0.542883, throughput 3.21861K wps
Observed Improvement.
Begin Testing...
[Epoch 13 Batch 30/62] avg loss 0.0107939, throughput 3.2613K wps
[Epoch 13 Batch 60/62] avg loss 0.0108432, throughput 3.17437K wps
Begin Testing...
[Epoch 13] train avg loss 0.0109506, dev acc 0.7080, dev avg loss 0.533148, throughput 3.22344K wps
Observed Improvement.
Begin Testing...
[Epoch 14 Batch 30/62] avg loss 0.0105937, throughput 3.26849K wps
[Epoch 14 Batch 60/62] avg loss 0.0106055, throughput 3.16112K wps
Begin Testing...
[Epoch 14] train avg loss 0.010696, dev acc 0.7139, dev avg loss 0.523547, throughput 3.21994K wps
Observed Improvement.