
the opl loss doesn't decrease #5

Closed
ChenJunzhi-buaa opened this issue May 3, 2022 · 2 comments

Comments

@ChenJunzhi-buaa

Thank you for your great work!

I printed the OPL loss during training, but I find that it doesn't decrease. How does the OPL loss take effect if it stays flat like this?
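For context on what I understand the OPL term to be doing, here is a minimal NumPy sketch of an orthogonal-projection-style loss (the function name and the exact weighting are my assumptions, not this repository's code): it pushes same-class features toward cosine similarity 1 and different-class features toward orthogonality, so its minimum is 0 when features are class-wise aligned and cross-class orthogonal.

```python
import numpy as np

def opl_loss(features, labels):
    """Orthogonal-projection-style loss on L2-normalized features.

    Attracts same-class feature pairs (cosine similarity -> 1) and
    pushes different-class pairs toward orthogonality (similarity -> 0).
    Returns (1 - s) + |d|, where s is the mean intra-class similarity
    and |d| the mean absolute inter-class similarity.
    """
    # Normalize each feature vector to unit length.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T  # pairwise cosine similarities

    same = labels[:, None] == labels[None, :]
    np.fill_diagonal(same, False)  # ignore self-similarity
    diff = labels[:, None] != labels[None, :]

    s = sim[same].mean() if same.any() else 1.0  # intra-class attraction
    d = np.abs(sim[diff]).mean() if diff.any() else 0.0  # inter-class term
    return (1.0 - s) + d
```

With perfectly separated features, e.g. two classes living on orthogonal axes, this sketch returns 0, which is why I expected the printed op_loss to trend downward rather than sit near 0.55.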

The printed losses are shown below:
############### print loss #########################################################
base_loss:4.280840873718262;op_loss:0.5367265939712524
Training Epoch: 2 [128/50000] Loss: 4.8176 LR: 0.100000
base_loss:4.107833385467529;op_loss:0.5339833498001099
Training Epoch: 2 [256/50000] Loss: 4.6418 LR: 0.100000
base_loss:4.187429904937744;op_loss:0.5304139852523804
Training Epoch: 2 [384/50000] Loss: 4.7178 LR: 0.100000
base_loss:4.127537727355957;op_loss:0.5252766013145447
Training Epoch: 2 [512/50000] Loss: 4.6528 LR: 0.100000
base_loss:4.084356307983398;op_loss:0.5540026426315308
Training Epoch: 2 [640/50000] Loss: 4.6384 LR: 0.100000
base_loss:4.1102681159973145;op_loss:0.5689339637756348
Training Epoch: 2 [768/50000] Loss: 4.6792 LR: 0.100000
base_loss:4.056018352508545;op_loss:0.5612157583236694
Training Epoch: 2 [896/50000] Loss: 4.6172 LR: 0.100000
base_loss:4.160975456237793;op_loss:0.5274627208709717
Training Epoch: 2 [1024/50000] Loss: 4.6884 LR: 0.100000
base_loss:4.068561553955078;op_loss:0.5268392562866211
Training Epoch: 2 [1152/50000] Loss: 4.5954 LR: 0.100000
base_loss:4.090016841888428;op_loss:0.5170165300369263
Training Epoch: 2 [1280/50000] Loss: 4.6070 LR: 0.100000
base_loss:4.1942949295043945;op_loss:0.5494662523269653
Training Epoch: 2 [1408/50000] Loss: 4.7438 LR: 0.100000
base_loss:4.133640289306641;op_loss:0.5417580604553223
Training Epoch: 2 [1536/50000] Loss: 4.6754 LR: 0.100000
base_loss:4.124314785003662;op_loss:0.5448318719863892
Training Epoch: 2 [1664/50000] Loss: 4.6691 LR: 0.100000
base_loss:4.153204917907715;op_loss:0.5473470687866211
Training Epoch: 2 [1792/50000] Loss: 4.7006 LR: 0.100000
base_loss:4.167095184326172;op_loss:0.569574236869812
Training Epoch: 2 [1920/50000] Loss: 4.7367 LR: 0.100000
base_loss:4.202641010284424;op_loss:0.5590764880180359
Training Epoch: 2 [2048/50000] Loss: 4.7617 LR: 0.100000
base_loss:4.2228193283081055;op_loss:0.5472540855407715
Training Epoch: 2 [2176/50000] Loss: 4.7701 LR: 0.100000
base_loss:4.1299238204956055;op_loss:0.5477103590965271
Training Epoch: 2 [2304/50000] Loss: 4.6776 LR: 0.100000
base_loss:4.061820983886719;op_loss:0.5515830516815186
Training Epoch: 2 [2432/50000] Loss: 4.6134 LR: 0.100000
base_loss:4.058617115020752;op_loss:0.5518112182617188
Training Epoch: 2 [2560/50000] Loss: 4.6104 LR: 0.100000
base_loss:4.192203521728516;op_loss:0.5860923528671265
Training Epoch: 2 [2688/50000] Loss: 4.7783 LR: 0.100000
base_loss:3.96213436126709;op_loss:0.5402579307556152
Training Epoch: 2 [2816/50000] Loss: 4.5024 LR: 0.100000
base_loss:4.114309310913086;op_loss:0.5352994799613953
Training Epoch: 2 [2944/50000] Loss: 4.6496 LR: 0.100000
base_loss:4.110159873962402;op_loss:0.5759779214859009
Training Epoch: 2 [3072/50000] Loss: 4.6861 LR: 0.100000
base_loss:4.094690322875977;op_loss:0.5454019904136658
Training Epoch: 2 [3200/50000] Loss: 4.6401 LR: 0.100000
base_loss:4.0557379722595215;op_loss:0.5510457754135132
Training Epoch: 2 [3328/50000] Loss: 4.6068 LR: 0.100000
base_loss:4.225389003753662;op_loss:0.609066367149353
Training Epoch: 2 [3456/50000] Loss: 4.8345 LR: 0.100000
base_loss:4.351457595825195;op_loss:0.5496102571487427
Training Epoch: 2 [3584/50000] Loss: 4.9011 LR: 0.100000
base_loss:4.223308086395264;op_loss:0.5360826849937439
Training Epoch: 2 [3712/50000] Loss: 4.7594 LR: 0.100000
base_loss:4.35025691986084;op_loss:0.5664458274841309
Training Epoch: 2 [3840/50000] Loss: 4.9167 LR: 0.100000
base_loss:4.255465030670166;op_loss:0.5395272374153137
Training Epoch: 2 [3968/50000] Loss: 4.7950 LR: 0.100000
base_loss:4.104780673980713;op_loss:0.5505534410476685
Training Epoch: 2 [4096/50000] Loss: 4.6553 LR: 0.100000
base_loss:4.003930568695068;op_loss:0.5199926495552063
Training Epoch: 2 [4224/50000] Loss: 4.5239 LR: 0.100000
base_loss:4.060220718383789;op_loss:0.5441393852233887
Training Epoch: 2 [4352/50000] Loss: 4.6044 LR: 0.100000
base_loss:4.145919322967529;op_loss:0.5306097269058228
Training Epoch: 2 [4480/50000] Loss: 4.6765 LR: 0.100000
base_loss:4.119339942932129;op_loss:0.5385489463806152
Training Epoch: 2 [4608/50000] Loss: 4.6579 LR: 0.100000
base_loss:4.11818265914917;op_loss:0.560119092464447
Training Epoch: 2 [4736/50000] Loss: 4.6783 LR: 0.100000
base_loss:4.3832106590271;op_loss:0.5470128059387207
Training Epoch: 2 [4864/50000] Loss: 4.9302 LR: 0.100000
base_loss:4.000164985656738;op_loss:0.528315544128418
Training Epoch: 2 [4992/50000] Loss: 4.5285 LR: 0.100000
base_loss:4.128720760345459;op_loss:0.5646262168884277
Training Epoch: 2 [5120/50000] Loss: 4.6933 LR: 0.100000
base_loss:4.194307804107666;op_loss:0.5486522912979126
Training Epoch: 2 [5248/50000] Loss: 4.7430 LR: 0.100000
base_loss:4.138046741485596;op_loss:0.5774385333061218
Training Epoch: 2 [5376/50000] Loss: 4.7155 LR: 0.100000
base_loss:4.057068824768066;op_loss:0.552772045135498
Training Epoch: 2 [5504/50000] Loss: 4.6098 LR: 0.100000
base_loss:4.105223655700684;op_loss:0.5938466787338257
Training Epoch: 2 [5632/50000] Loss: 4.6991 LR: 0.100000
base_loss:4.064273357391357;op_loss:0.5769248008728027
Training Epoch: 2 [5760/50000] Loss: 4.6412 LR: 0.100000
base_loss:4.177098274230957;op_loss:0.541038990020752
Training Epoch: 2 [5888/50000] Loss: 4.7181 LR: 0.100000
base_loss:4.053581714630127;op_loss:0.5154528021812439
Training Epoch: 2 [6016/50000] Loss: 4.5690 LR: 0.100000
base_loss:3.9969096183776855;op_loss:0.5439358949661255
Training Epoch: 2 [6144/50000] Loss: 4.5408 LR: 0.100000
base_loss:4.2833333015441895;op_loss:0.596412181854248
Training Epoch: 2 [6272/50000] Loss: 4.8797 LR: 0.100000
base_loss:4.224669933319092;op_loss:0.5579344034194946
Training Epoch: 2 [6400/50000] Loss: 4.7826 LR: 0.100000
base_loss:4.171744346618652;op_loss:0.539709210395813
Training Epoch: 2 [6528/50000] Loss: 4.7115 LR: 0.100000
base_loss:3.9602854251861572;op_loss:0.5358787775039673
Training Epoch: 2 [6656/50000] Loss: 4.4962 LR: 0.100000
base_loss:3.9493770599365234;op_loss:0.5226551294326782
Training Epoch: 2 [6784/50000] Loss: 4.4720 LR: 0.100000
base_loss:4.003721714019775;op_loss:0.5647306442260742
Training Epoch: 2 [6912/50000] Loss: 4.5685 LR: 0.100000
base_loss:3.979095458984375;op_loss:0.5380873084068298
Training Epoch: 2 [7040/50000] Loss: 4.5172 LR: 0.100000
base_loss:4.058328151702881;op_loss:0.5032943487167358
Training Epoch: 2 [7168/50000] Loss: 4.5616 LR: 0.100000
base_loss:4.172536849975586;op_loss:0.5657532215118408
Training Epoch: 2 [7296/50000] Loss: 4.7383 LR: 0.100000
base_loss:3.997509717941284;op_loss:0.540406346321106
Training Epoch: 2 [7424/50000] Loss: 4.5379 LR: 0.100000
base_loss:4.1355462074279785;op_loss:0.5518960356712341
Training Epoch: 2 [7552/50000] Loss: 4.6874 LR: 0.100000
base_loss:3.9606270790100098;op_loss:0.5324451923370361
Training Epoch: 2 [7680/50000] Loss: 4.4931 LR: 0.100000
base_loss:4.090785026550293;op_loss:0.5301138758659363
Training Epoch: 2 [7808/50000] Loss: 4.6209 LR: 0.100000
base_loss:4.014397144317627;op_loss:0.5153027176856995
Training Epoch: 2 [7936/50000] Loss: 4.5297 LR: 0.100000
base_loss:3.9166464805603027;op_loss:0.5357871055603027
Training Epoch: 2 [8064/50000] Loss: 4.4524 LR: 0.100000
base_loss:4.113657474517822;op_loss:0.5453531742095947
Training Epoch: 2 [8192/50000] Loss: 4.6590 LR: 0.100000
base_loss:4.08429479598999;op_loss:0.5551325082778931
Training Epoch: 2 [8320/50000] Loss: 4.6394 LR: 0.100000
base_loss:4.230902671813965;op_loss:0.5798734426498413
Training Epoch: 2 [8448/50000] Loss: 4.8108 LR: 0.100000
base_loss:3.9448187351226807;op_loss:0.5061675310134888
Training Epoch: 2 [8576/50000] Loss: 4.4510 LR: 0.100000
base_loss:4.013118743896484;op_loss:0.5448779463768005
Training Epoch: 2 [8704/50000] Loss: 4.5580 LR: 0.100000
base_loss:4.005880355834961;op_loss:0.5329939126968384
Training Epoch: 2 [8832/50000] Loss: 4.5389 LR: 0.100000
base_loss:4.086043834686279;op_loss:0.5604828596115112
Training Epoch: 2 [8960/50000] Loss: 4.6465 LR: 0.100000
base_loss:4.011306285858154;op_loss:0.5554200410842896
Training Epoch: 2 [9088/50000] Loss: 4.5667 LR: 0.100000
base_loss:4.126717567443848;op_loss:0.6163482666015625
Training Epoch: 2 [9216/50000] Loss: 4.7431 LR: 0.100000
base_loss:4.091407775878906;op_loss:0.5237964391708374
Training Epoch: 2 [9344/50000] Loss: 4.6152 LR: 0.100000
base_loss:4.059153079986572;op_loss:0.5792750120162964
Training Epoch: 2 [9472/50000] Loss: 4.6384 LR: 0.100000
base_loss:3.981687307357788;op_loss:0.5341448783874512
Training Epoch: 2 [9600/50000] Loss: 4.5158 LR: 0.100000
base_loss:4.207813262939453;op_loss:0.5748806595802307
Training Epoch: 2 [9728/50000] Loss: 4.7827 LR: 0.100000
base_loss:4.008450984954834;op_loss:0.5871504545211792
Training Epoch: 2 [9856/50000] Loss: 4.5956 LR: 0.100000
base_loss:4.080292701721191;op_loss:0.5415703058242798
Training Epoch: 2 [9984/50000] Loss: 4.6219 LR: 0.100000
base_loss:3.9237353801727295;op_loss:0.524863064289093
Training Epoch: 2 [10112/50000] Loss: 4.4486 LR: 0.100000
base_loss:4.118155002593994;op_loss:0.5795061588287354
Training Epoch: 2 [10240/50000] Loss: 4.6977 LR: 0.100000
base_loss:3.868434190750122;op_loss:0.5216013193130493
Training Epoch: 2 [10368/50000] Loss: 4.3900 LR: 0.100000
base_loss:4.077742576599121;op_loss:0.5491902828216553
Training Epoch: 2 [10496/50000] Loss: 4.6269 LR: 0.100000
base_loss:3.8872153759002686;op_loss:0.5644634962081909
Training Epoch: 2 [10624/50000] Loss: 4.4517 LR: 0.100000
base_loss:4.216022968292236;op_loss:0.5591652989387512
Training Epoch: 2 [10752/50000] Loss: 4.7752 LR: 0.100000
base_loss:3.9916350841522217;op_loss:0.5574780106544495
Training Epoch: 2 [10880/50000] Loss: 4.5491 LR: 0.100000
base_loss:3.9976184368133545;op_loss:0.5582561492919922
Training Epoch: 2 [11008/50000] Loss: 4.5559 LR: 0.100000
base_loss:3.9008119106292725;op_loss:0.5464879274368286
Training Epoch: 2 [11136/50000] Loss: 4.4473 LR: 0.100000
base_loss:3.9299840927124023;op_loss:0.5282109379768372
Training Epoch: 2 [11264/50000] Loss: 4.4582 LR: 0.100000
base_loss:4.037593364715576;op_loss:0.546193540096283
Training Epoch: 2 [11392/50000] Loss: 4.5838 LR: 0.100000
base_loss:3.9935431480407715;op_loss:0.5289690494537354
Training Epoch: 2 [11520/50000] Loss: 4.5225 LR: 0.100000
base_loss:4.07742977142334;op_loss:0.5728453397750854
Training Epoch: 2 [11648/50000] Loss: 4.6503 LR: 0.100000
base_loss:4.042142391204834;op_loss:0.547771155834198
Training Epoch: 2 [11776/50000] Loss: 4.5899 LR: 0.100000
base_loss:4.176750659942627;op_loss:0.557599663734436
Training Epoch: 2 [11904/50000] Loss: 4.7344 LR: 0.100000
base_loss:4.099972724914551;op_loss:0.5662820339202881
Training Epoch: 2 [12032/50000] Loss: 4.6663 LR: 0.100000
base_loss:4.098907947540283;op_loss:0.5655167102813721
Training Epoch: 2 [12160/50000] Loss: 4.6644 LR: 0.100000
base_loss:4.045915603637695;op_loss:0.5533789396286011
Training Epoch: 2 [12288/50000] Loss: 4.5993 LR: 0.100000
base_loss:4.0994486808776855;op_loss:0.5776498317718506
Training Epoch: 2 [12416/50000] Loss: 4.6771 LR: 0.100000
base_loss:3.9360766410827637;op_loss:0.5429057478904724
Training Epoch: 2 [12544/50000] Loss: 4.4790 LR: 0.100000
base_loss:3.9477016925811768;op_loss:0.538610577583313
Training Epoch: 2 [12672/50000] Loss: 4.4863 LR: 0.100000
base_loss:4.210412502288818;op_loss:0.5811970233917236
Training Epoch: 2 [12800/50000] Loss: 4.7916 LR: 0.100000
base_loss:3.9044268131256104;op_loss:0.5374801754951477
Training Epoch: 2 [12928/50000] Loss: 4.4419 LR: 0.100000
base_loss:4.1393961906433105;op_loss:0.6037479639053345
Training Epoch: 2 [13056/50000] Loss: 4.7431 LR: 0.100000
base_loss:3.940371513366699;op_loss:0.532346785068512
Training Epoch: 2 [13184/50000] Loss: 4.4727 LR: 0.100000
base_loss:3.9259440898895264;op_loss:0.550885021686554
Training Epoch: 2 [13312/50000] Loss: 4.4768 LR: 0.100000
base_loss:3.8317489624023438;op_loss:0.5320054292678833
Training Epoch: 2 [13440/50000] Loss: 4.3638 LR: 0.100000
base_loss:3.880911350250244;op_loss:0.5213427543640137
Training Epoch: 2 [13568/50000] Loss: 4.4023 LR: 0.100000
base_loss:4.073591232299805;op_loss:0.5471822023391724
Training Epoch: 2 [13696/50000] Loss: 4.6208 LR: 0.100000
base_loss:4.001219749450684;op_loss:0.5670425891876221
Training Epoch: 2 [13824/50000] Loss: 4.5683 LR: 0.100000
base_loss:3.9181933403015137;op_loss:0.5290877223014832
Training Epoch: 2 [13952/50000] Loss: 4.4473 LR: 0.100000
base_loss:3.7816948890686035;op_loss:0.5207613706588745
Training Epoch: 2 [14080/50000] Loss: 4.3025 LR: 0.100000
base_loss:3.9933664798736572;op_loss:0.560212254524231
Training Epoch: 2 [14208/50000] Loss: 4.5536 LR: 0.100000
base_loss:3.93005633354187;op_loss:0.5484799146652222
Training Epoch: 2 [14336/50000] Loss: 4.4785 LR: 0.100000
base_loss:3.836000919342041;op_loss:0.55451500415802
Training Epoch: 2 [14464/50000] Loss: 4.3905 LR: 0.100000
base_loss:4.100119590759277;op_loss:0.5685286521911621
Training Epoch: 2 [14592/50000] Loss: 4.6686 LR: 0.100000
base_loss:4.030941009521484;op_loss:0.5982041358947754
Training Epoch: 2 [14720/50000] Loss: 4.6291 LR: 0.100000
base_loss:3.987234592437744;op_loss:0.566533088684082
Training Epoch: 2 [14848/50000] Loss: 4.5538 LR: 0.100000
base_loss:4.078621864318848;op_loss:0.6090583205223083
Training Epoch: 2 [14976/50000] Loss: 4.6877 LR: 0.100000
base_loss:4.001223087310791;op_loss:0.5583722591400146
Training Epoch: 2 [15104/50000] Loss: 4.5596 LR: 0.100000
base_loss:3.9525036811828613;op_loss:0.5767878293991089
Training Epoch: 2 [15232/50000] Loss: 4.5293 LR: 0.100000
base_loss:4.11889123916626;op_loss:0.5561187863349915
Training Epoch: 2 [15360/50000] Loss: 4.6750 LR: 0.100000
base_loss:3.926288366317749;op_loss:0.5604448318481445
Training Epoch: 2 [15488/50000] Loss: 4.4867 LR: 0.100000
base_loss:4.002326011657715;op_loss:0.5613310933113098
Training Epoch: 2 [15616/50000] Loss: 4.5637 LR: 0.100000
base_loss:4.182285308837891;op_loss:0.6124283075332642
Training Epoch: 2 [15744/50000] Loss: 4.7947 LR: 0.100000
base_loss:3.996242046356201;op_loss:0.5417048335075378
Training Epoch: 2 [15872/50000] Loss: 4.5379 LR: 0.100000
base_loss:4.020390510559082;op_loss:0.5404340028762817
Training Epoch: 2 [16000/50000] Loss: 4.5608 LR: 0.100000
base_loss:3.944962739944458;op_loss:0.554619312286377
Training Epoch: 2 [16128/50000] Loss: 4.4996 LR: 0.100000
base_loss:3.8157522678375244;op_loss:0.5304190516471863
Training Epoch: 2 [16256/50000] Loss: 4.3462 LR: 0.100000
base_loss:3.8531455993652344;op_loss:0.5176503658294678
Training Epoch: 2 [16384/50000] Loss: 4.3708 LR: 0.100000
base_loss:3.969573974609375;op_loss:0.533740222454071
Training Epoch: 2 [16512/50000] Loss: 4.5033 LR: 0.100000
base_loss:4.1333112716674805;op_loss:0.5763177871704102
Training Epoch: 2 [16640/50000] Loss: 4.7096 LR: 0.100000
base_loss:4.007053375244141;op_loss:0.5581700801849365
Training Epoch: 2 [16768/50000] Loss: 4.5652 LR: 0.100000
base_loss:4.020654678344727;op_loss:0.5500863790512085
Training Epoch: 2 [16896/50000] Loss: 4.5707 LR: 0.100000
base_loss:3.8734445571899414;op_loss:0.5279120206832886
Training Epoch: 2 [17024/50000] Loss: 4.4014 LR: 0.100000
base_loss:3.935228109359741;op_loss:0.5523131489753723
Training Epoch: 2 [17152/50000] Loss: 4.4875 LR: 0.100000
base_loss:4.027062892913818;op_loss:0.5717694163322449
Training Epoch: 2 [17280/50000] Loss: 4.5988 LR: 0.100000
base_loss:4.000626087188721;op_loss:0.5528971552848816
Training Epoch: 2 [17408/50000] Loss: 4.5535 LR: 0.100000
base_loss:3.934094190597534;op_loss:0.5325213074684143
Training Epoch: 2 [17536/50000] Loss: 4.4666 LR: 0.100000
base_loss:3.9861996173858643;op_loss:0.5766400098800659
Training Epoch: 2 [17664/50000] Loss: 4.5628 LR: 0.100000
base_loss:4.126750946044922;op_loss:0.5524164438247681
Training Epoch: 2 [17792/50000] Loss: 4.6792 LR: 0.100000
base_loss:4.030887126922607;op_loss:0.5731723308563232
Training Epoch: 2 [17920/50000] Loss: 4.6041 LR: 0.100000
base_loss:3.911486864089966;op_loss:0.547804594039917
Training Epoch: 2 [18048/50000] Loss: 4.4593 LR: 0.100000
base_loss:3.832413673400879;op_loss:0.5421112775802612
Training Epoch: 2 [18176/50000] Loss: 4.3745 LR: 0.100000
base_loss:3.91367506980896;op_loss:0.5375984907150269
Training Epoch: 2 [18304/50000] Loss: 4.4513 LR: 0.100000
base_loss:3.917893409729004;op_loss:0.5319538116455078
Training Epoch: 2 [18432/50000] Loss: 4.4498 LR: 0.100000
base_loss:3.9348998069763184;op_loss:0.5671139359474182
Training Epoch: 2 [18560/50000] Loss: 4.5020 LR: 0.100000
base_loss:3.8741447925567627;op_loss:0.5473669767379761
Training Epoch: 2 [18688/50000] Loss: 4.4215 LR: 0.100000
base_loss:3.9020802974700928;op_loss:0.5262272357940674
Training Epoch: 2 [18816/50000] Loss: 4.4283 LR: 0.100000
base_loss:3.887955665588379;op_loss:0.5487526655197144
Training Epoch: 2 [18944/50000] Loss: 4.4367 LR: 0.100000
base_loss:3.846449136734009;op_loss:0.5653930902481079
Training Epoch: 2 [19072/50000] Loss: 4.4118 LR: 0.100000
base_loss:4.031688690185547;op_loss:0.5698837041854858
Training Epoch: 2 [19200/50000] Loss: 4.6016 LR: 0.100000
base_loss:3.9116621017456055;op_loss:0.5691394209861755
Training Epoch: 2 [19328/50000] Loss: 4.4808 LR: 0.100000
base_loss:3.9180822372436523;op_loss:0.5745803117752075
Training Epoch: 2 [19456/50000] Loss: 4.4927 LR: 0.100000
base_loss:3.887133836746216;op_loss:0.5276685953140259
Training Epoch: 2 [19584/50000] Loss: 4.4148 LR: 0.100000
base_loss:3.848511219024658;op_loss:0.5183668732643127
Training Epoch: 2 [19712/50000] Loss: 4.3669 LR: 0.100000
base_loss:3.9894495010375977;op_loss:0.5411995649337769
Training Epoch: 2 [19840/50000] Loss: 4.5306 LR: 0.100000
base_loss:3.8970415592193604;op_loss:0.5570001006126404
Training Epoch: 2 [19968/50000] Loss: 4.4540 LR: 0.100000
base_loss:4.106607913970947;op_loss:0.5823178291320801
Training Epoch: 2 [20096/50000] Loss: 4.6889 LR: 0.100000
base_loss:3.996297836303711;op_loss:0.5623360872268677
Training Epoch: 2 [20224/50000] Loss: 4.5586 LR: 0.100000
base_loss:3.9712746143341064;op_loss:0.5528634786605835
Training Epoch: 2 [20352/50000] Loss: 4.5241 LR: 0.100000
base_loss:4.122766971588135;op_loss:0.5336371660232544
Training Epoch: 2 [20480/50000] Loss: 4.6564 LR: 0.100000
base_loss:3.965708017349243;op_loss:0.5651992559432983
Training Epoch: 2 [20608/50000] Loss: 4.5309 LR: 0.100000
base_loss:3.8252618312835693;op_loss:0.5347216129302979
Training Epoch: 2 [20736/50000] Loss: 4.3600 LR: 0.100000
base_loss:4.087770938873291;op_loss:0.5342988967895508
Training Epoch: 2 [20864/50000] Loss: 4.6221 LR: 0.100000
base_loss:3.8877315521240234;op_loss:0.539771556854248
Training Epoch: 2 [20992/50000] Loss: 4.4275 LR: 0.100000
base_loss:3.938214063644409;op_loss:0.5663728713989258
Training Epoch: 2 [21120/50000] Loss: 4.5046 LR: 0.100000
base_loss:3.9643566608428955;op_loss:0.5357565879821777
Training Epoch: 2 [21248/50000] Loss: 4.5001 LR: 0.100000
base_loss:4.123490333557129;op_loss:0.6017665863037109
Training Epoch: 2 [21376/50000] Loss: 4.7253 LR: 0.100000
base_loss:3.916621685028076;op_loss:0.5306718349456787
Training Epoch: 2 [21504/50000] Loss: 4.4473 LR: 0.100000
base_loss:3.9351696968078613;op_loss:0.5486545562744141
Training Epoch: 2 [21632/50000] Loss: 4.4838 LR: 0.100000
base_loss:3.9070043563842773;op_loss:0.5428156852722168
Training Epoch: 2 [21760/50000] Loss: 4.4498 LR: 0.100000
base_loss:3.899783134460449;op_loss:0.5319163203239441
Training Epoch: 2 [21888/50000] Loss: 4.4317 LR: 0.100000
base_loss:3.9853343963623047;op_loss:0.5571230053901672
Training Epoch: 2 [22016/50000] Loss: 4.5425 LR: 0.100000
base_loss:3.9992835521698;op_loss:0.5658630728721619
Training Epoch: 2 [22144/50000] Loss: 4.5651 LR: 0.100000
base_loss:3.716024875640869;op_loss:0.5126596093177795
Training Epoch: 2 [22272/50000] Loss: 4.2287 LR: 0.100000
base_loss:4.106863021850586;op_loss:0.5616646409034729
Training Epoch: 2 [22400/50000] Loss: 4.6685 LR: 0.100000
base_loss:3.8402814865112305;op_loss:0.5228695869445801
Training Epoch: 2 [22528/50000] Loss: 4.3632 LR: 0.100000
base_loss:3.886693239212036;op_loss:0.5065106153488159
Training Epoch: 2 [22656/50000] Loss: 4.3932 LR: 0.100000
base_loss:4.062132358551025;op_loss:0.6078095436096191
Training Epoch: 2 [22784/50000] Loss: 4.6699 LR: 0.100000
base_loss:3.7729077339172363;op_loss:0.5214307308197021
Training Epoch: 2 [22912/50000] Loss: 4.2943 LR: 0.100000
base_loss:3.8444833755493164;op_loss:0.5350896120071411
Training Epoch: 2 [23040/50000] Loss: 4.3796 LR: 0.100000
base_loss:3.901613712310791;op_loss:0.5505872368812561
Training Epoch: 2 [23168/50000] Loss: 4.4522 LR: 0.100000
base_loss:4.030137538909912;op_loss:0.5822103023529053
Training Epoch: 2 [23296/50000] Loss: 4.6123 LR: 0.100000
base_loss:4.140347957611084;op_loss:0.5442243814468384
Training Epoch: 2 [23424/50000] Loss: 4.6846 LR: 0.100000
base_loss:4.010788917541504;op_loss:0.5474587678909302
Training Epoch: 2 [23552/50000] Loss: 4.5582 LR: 0.100000
base_loss:4.013339996337891;op_loss:0.568802535533905
Training Epoch: 2 [23680/50000] Loss: 4.5821 LR: 0.100000
base_loss:4.00309944152832;op_loss:0.5742208957672119
Training Epoch: 2 [23808/50000] Loss: 4.5773 LR: 0.100000
base_loss:3.830199956893921;op_loss:0.5263798236846924
Training Epoch: 2 [23936/50000] Loss: 4.3566 LR: 0.100000
base_loss:4.025938510894775;op_loss:0.5630514621734619
Training Epoch: 2 [24064/50000] Loss: 4.5890 LR: 0.100000
base_loss:3.835080862045288;op_loss:0.5731643438339233
Training Epoch: 2 [24192/50000] Loss: 4.4082 LR: 0.100000
base_loss:3.789602518081665;op_loss:0.5308221578598022
Training Epoch: 2 [24320/50000] Loss: 4.3204 LR: 0.100000
base_loss:3.897852659225464;op_loss:0.5399445295333862
Training Epoch: 2 [24448/50000] Loss: 4.4378 LR: 0.100000
base_loss:3.82706880569458;op_loss:0.5551329851150513
Training Epoch: 2 [24576/50000] Loss: 4.3822 LR: 0.100000
base_loss:3.956073045730591;op_loss:0.5589912533760071
Training Epoch: 2 [24704/50000] Loss: 4.5151 LR: 0.100000
base_loss:3.8474905490875244;op_loss:0.5719531178474426
Training Epoch: 2 [24832/50000] Loss: 4.4194 LR: 0.100000
base_loss:3.898164987564087;op_loss:0.562261700630188
Training Epoch: 2 [24960/50000] Loss: 4.4604 LR: 0.100000
base_loss:4.061901092529297;op_loss:0.5722566843032837
Training Epoch: 2 [25088/50000] Loss: 4.6342 LR: 0.100000
base_loss:3.7705395221710205;op_loss:0.5475386381149292
Training Epoch: 2 [25216/50000] Loss: 4.3181 LR: 0.100000
base_loss:3.700868606567383;op_loss:0.5162238478660583
Training Epoch: 2 [25344/50000] Loss: 4.2171 LR: 0.100000
base_loss:4.059940338134766;op_loss:0.6250487565994263
Training Epoch: 2 [25472/50000] Loss: 4.6850 LR: 0.100000
base_loss:3.941880702972412;op_loss:0.5762021541595459
Training Epoch: 2 [25600/50000] Loss: 4.5181 LR: 0.100000
base_loss:3.712475299835205;op_loss:0.5045211911201477
Training Epoch: 2 [25728/50000] Loss: 4.2170 LR: 0.100000
base_loss:3.9507226943969727;op_loss:0.5714401006698608
Training Epoch: 2 [25856/50000] Loss: 4.5222 LR: 0.100000
base_loss:3.8116304874420166;op_loss:0.5507208108901978
Training Epoch: 2 [25984/50000] Loss: 4.3624 LR: 0.100000
base_loss:3.8392250537872314;op_loss:0.5401781797409058
Training Epoch: 2 [26112/50000] Loss: 4.3794 LR: 0.100000
base_loss:4.095599174499512;op_loss:0.5746181011199951
Training Epoch: 2 [26240/50000] Loss: 4.6702 LR: 0.100000
base_loss:3.993638038635254;op_loss:0.5610768795013428
Training Epoch: 2 [26368/50000] Loss: 4.5547 LR: 0.100000
base_loss:3.8439080715179443;op_loss:0.5643384456634521
Training Epoch: 2 [26496/50000] Loss: 4.4082 LR: 0.100000
base_loss:3.7558159828186035;op_loss:0.5430795550346375
Training Epoch: 2 [26624/50000] Loss: 4.2989 LR: 0.100000
base_loss:3.8994176387786865;op_loss:0.5508720874786377
Training Epoch: 2 [26752/50000] Loss: 4.4503 LR: 0.100000
base_loss:3.832751750946045;op_loss:0.5519804954528809
Training Epoch: 2 [26880/50000] Loss: 4.3847 LR: 0.100000
base_loss:3.7795217037200928;op_loss:0.5214532613754272
Training Epoch: 2 [27008/50000] Loss: 4.3010 LR: 0.100000
base_loss:3.8258230686187744;op_loss:0.5615601539611816
Training Epoch: 2 [27136/50000] Loss: 4.3874 LR: 0.100000
base_loss:3.8854758739471436;op_loss:0.5499289631843567
Training Epoch: 2 [27264/50000] Loss: 4.4354 LR: 0.100000
base_loss:3.6913657188415527;op_loss:0.5313825011253357
Training Epoch: 2 [27392/50000] Loss: 4.2227 LR: 0.100000
base_loss:3.885385513305664;op_loss:0.5844192504882812
Training Epoch: 2 [27520/50000] Loss: 4.4698 LR: 0.100000
base_loss:3.8427481651306152;op_loss:0.5542815923690796
Training Epoch: 2 [27648/50000] Loss: 4.3970 LR: 0.100000
base_loss:3.936248779296875;op_loss:0.5676923990249634
Training Epoch: 2 [27776/50000] Loss: 4.5039 LR: 0.100000
base_loss:3.9283361434936523;op_loss:0.5396283864974976
Training Epoch: 2 [27904/50000] Loss: 4.4680 LR: 0.100000
base_loss:3.8383162021636963;op_loss:0.5481919050216675
Training Epoch: 2 [28032/50000] Loss: 4.3865 LR: 0.100000
base_loss:3.753767967224121;op_loss:0.5237563848495483
Training Epoch: 2 [28160/50000] Loss: 4.2775 LR: 0.100000
base_loss:3.7770819664001465;op_loss:0.5357043743133545
Training Epoch: 2 [28288/50000] Loss: 4.3128 LR: 0.100000
base_loss:3.775968074798584;op_loss:0.5484045743942261
Training Epoch: 2 [28416/50000] Loss: 4.3244 LR: 0.100000
base_loss:3.8435142040252686;op_loss:0.5724435448646545
Training Epoch: 2 [28544/50000] Loss: 4.4160 LR: 0.100000
base_loss:3.767258882522583;op_loss:0.511907696723938
Training Epoch: 2 [28672/50000] Loss: 4.2792 LR: 0.100000
base_loss:3.7392678260803223;op_loss:0.564441442489624
Training Epoch: 2 [28800/50000] Loss: 4.3037 LR: 0.100000
base_loss:3.997739553451538;op_loss:0.5390644669532776
Training Epoch: 2 [28928/50000] Loss: 4.5368 LR: 0.100000
base_loss:3.846623420715332;op_loss:0.5556312799453735
Training Epoch: 2 [29056/50000] Loss: 4.4023 LR: 0.100000
base_loss:3.877669334411621;op_loss:0.5602124929428101
Training Epoch: 2 [29184/50000] Loss: 4.4379 LR: 0.100000
base_loss:3.759528398513794;op_loss:0.5330902338027954
Training Epoch: 2 [29312/50000] Loss: 4.2926 LR: 0.100000
base_loss:3.962939500808716;op_loss:0.5954258441925049
Training Epoch: 2 [29440/50000] Loss: 4.5584 LR: 0.100000
base_loss:3.9653871059417725;op_loss:0.5620343685150146
Training Epoch: 2 [29568/50000] Loss: 4.5274 LR: 0.100000
base_loss:3.8975536823272705;op_loss:0.553621768951416
Training Epoch: 2 [29696/50000] Loss: 4.4512 LR: 0.100000
base_loss:3.9335806369781494;op_loss:0.5649428367614746
Training Epoch: 2 [29824/50000] Loss: 4.4985 LR: 0.100000
base_loss:3.818086862564087;op_loss:0.5452444553375244
Training Epoch: 2 [29952/50000] Loss: 4.3633 LR: 0.100000
base_loss:3.938337802886963;op_loss:0.571191132068634
Training Epoch: 2 [30080/50000] Loss: 4.5095 LR: 0.100000
base_loss:3.6936147212982178;op_loss:0.5652775764465332
Training Epoch: 2 [30208/50000] Loss: 4.2589 LR: 0.100000
base_loss:3.6759190559387207;op_loss:0.5365843772888184
Training Epoch: 2 [30336/50000] Loss: 4.2125 LR: 0.100000
base_loss:3.935365676879883;op_loss:0.5915765166282654
Training Epoch: 2 [30464/50000] Loss: 4.5269 LR: 0.100000
base_loss:3.7919299602508545;op_loss:0.5455601215362549
Training Epoch: 2 [30592/50000] Loss: 4.3375 LR: 0.100000
base_loss:3.8465006351470947;op_loss:0.5511833429336548
Training Epoch: 2 [30720/50000] Loss: 4.3977 LR: 0.100000
base_loss:3.6641602516174316;op_loss:0.5380809307098389
Training Epoch: 2 [30848/50000] Loss: 4.2022 LR: 0.100000
base_loss:3.7215418815612793;op_loss:0.5309674739837646
Training Epoch: 2 [30976/50000] Loss: 4.2525 LR: 0.100000
base_loss:3.7070746421813965;op_loss:0.5367683172225952
Training Epoch: 2 [31104/50000] Loss: 4.2438 LR: 0.100000
base_loss:3.83191180229187;op_loss:0.5683099031448364
Training Epoch: 2 [31232/50000] Loss: 4.4002 LR: 0.100000
base_loss:3.687145471572876;op_loss:0.5676800012588501
Training Epoch: 2 [31360/50000] Loss: 4.2548 LR: 0.100000
base_loss:3.8021647930145264;op_loss:0.5442438125610352
Training Epoch: 2 [31488/50000] Loss: 4.3464 LR: 0.100000
base_loss:3.9961986541748047;op_loss:0.5436975955963135
Training Epoch: 2 [31616/50000] Loss: 4.5399 LR: 0.100000
base_loss:3.748223066329956;op_loss:0.5534614324569702
Training Epoch: 2 [31744/50000] Loss: 4.3017 LR: 0.100000
base_loss:3.763866662979126;op_loss:0.5668948888778687
Training Epoch: 2 [31872/50000] Loss: 4.3308 LR: 0.100000
base_loss:3.8091788291931152;op_loss:0.5503913164138794
Training Epoch: 2 [32000/50000] Loss: 4.3596 LR: 0.100000
base_loss:3.8184268474578857;op_loss:0.554498553276062
Training Epoch: 2 [32128/50000] Loss: 4.3729 LR: 0.100000
base_loss:3.8836381435394287;op_loss:0.5511849522590637
Training Epoch: 2 [32256/50000] Loss: 4.4348 LR: 0.100000
base_loss:3.7631940841674805;op_loss:0.5355446338653564
Training Epoch: 2 [32384/50000] Loss: 4.2987 LR: 0.100000
base_loss:3.9454233646392822;op_loss:0.5683684945106506
Training Epoch: 2 [32512/50000] Loss: 4.5138 LR: 0.100000
base_loss:3.9010353088378906;op_loss:0.5177701711654663
Training Epoch: 2 [32640/50000] Loss: 4.4188 LR: 0.100000
base_loss:3.8644816875457764;op_loss:0.5810962915420532
Training Epoch: 2 [32768/50000] Loss: 4.4456 LR: 0.100000
base_loss:3.881481170654297;op_loss:0.5591676235198975
Training Epoch: 2 [32896/50000] Loss: 4.4406 LR: 0.100000
base_loss:3.805453062057495;op_loss:0.5270752906799316
Training Epoch: 2 [33024/50000] Loss: 4.3325 LR: 0.100000
base_loss:3.8777475357055664;op_loss:0.5881315469741821
Training Epoch: 2 [33152/50000] Loss: 4.4659 LR: 0.100000
base_loss:3.8695993423461914;op_loss:0.5827788710594177
Training Epoch: 2 [33280/50000] Loss: 4.4524 LR: 0.100000
[... similar lines omitted: from 33,408 to 44,928 of 50,000 images, base_loss drifts from roughly 3.88 down to roughly 3.53, while op_loss fluctuates between roughly 0.50 and 0.60 with no clear downward trend ...]
base_loss:3.5371782779693604;op_loss:0.5307868719100952
Training Epoch: 2 [44928/50000] Loss: 4.0680 LR: 0.100000
base_loss:3.611419200897217;op_loss:0.5348109006881714
############################################################

@kahnchana
Owner

Hey, thanks for your interest in OPL.

Which base loss are you using? Also, it looks like your base loss is not converging properly either, is it?

Ideally, your loss should decrease over time, similar to these train plots. The first diagram (left) shows the OPL value, and the other two (center, right) show the two components that compose OPL.

Do let us know how this goes and if you manage to get it working for your use case.
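For context, OPL encourages intra-class feature similarity and inter-class orthogonality in the normalized feature space. A minimal sketch of that idea is below; this is not the repo's exact implementation, and the `gamma` weighting and masking details are assumptions:

```python
import torch
import torch.nn.functional as F


def opl_loss(features: torch.Tensor, labels: torch.Tensor, gamma: float = 0.5) -> torch.Tensor:
    """Sketch of Orthogonal Projection Loss.

    Pulls same-class features together (cosine similarity -> 1) and pushes
    different-class features toward orthogonality (cosine similarity -> 0).
    """
    # Normalize rows so pairwise dot products are cosine similarities.
    features = F.normalize(features, p=2, dim=1)
    sim = features @ features.t()                        # (B, B) cosine similarities
    same = labels.unsqueeze(0) == labels.unsqueeze(1)    # (B, B) same-class mask
    eye = torch.eye(len(labels), dtype=torch.bool, device=features.device)
    pos_mask = same & ~eye                               # same class, self-pairs excluded
    neg_mask = ~same                                     # different-class pairs

    # s: mean intra-class similarity; d: mean absolute inter-class similarity.
    s = sim[pos_mask].mean() if pos_mask.any() else sim.new_tensor(1.0)
    d = sim[neg_mask].abs().mean() if neg_mask.any() else sim.new_tensor(0.0)
    return (1.0 - s) + gamma * d
```

With perfectly clustered, mutually orthogonal class features the loss is zero, so a healthy run should see it trend down as features organize.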

@ChenJunzhi-buaa
Author

Thank you for your reply!

I just ran the training script in https://github.com/kahnchana/opl/tree/master/classification/cifar .

The base loss is the standard cross-entropy (CE) loss.

Maybe it is not a problem after all: the OPL loss did decrease over a longer training run.
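Since the per-batch prints are noisy, a simple moving average over the logged op_loss values can expose a slow long-term decrease that is invisible batch to batch. This is a generic helper, not part of the repo:

```python
def moving_average(values, window=100):
    """Smooth a noisy per-batch loss series to expose the long-term trend.

    For each position, averages over the last `window` values seen so far
    (fewer at the start of the series).
    """
    out = []
    for i in range(len(values)):
        lo = max(0, i - window + 1)
        chunk = values[lo:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out
```

Plotting the smoothed series (e.g. with matplotlib) makes it much easier to judge whether op_loss is flat or slowly converging.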
