V2-refactor - Higher train errors with eval=val #50
Thinking more about it, it makes sense that train error could also be affected if we are modifying the params (LR, bias, etc.) adaptively based on the val error (which is really great!). Still, why is the train error on the first iteration so much higher with "eval = val" (0.76) than with "eval = train" (0.61)?
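The feedback loop described above can be sketched with a toy example (nothing here is cxxnet code; the model, the noise, and the decay rule are all stand-ins): if the learning-rate schedule reacts to whichever metric is being monitored, then switching that metric changes the parameter updates themselves, and therefore the train error too.

```python
import random

def train(monitor="val", epochs=8, seed=0):
    """Toy loop: LR is halved whenever the *monitored* error fails to
    improve by at least 10%. The point is only that the choice of
    monitored metric feeds back into the updates."""
    rng = random.Random(seed)
    w, lr = 0.0, 0.5
    best = float("inf")
    train_err = 1.0
    for _ in range(epochs):
        w += lr * (1.0 - w)          # stand-in gradient step toward w = 1
        train_err = abs(1.0 - w)
        val_err = train_err + 0.1 * rng.random()  # noisy held-out proxy
        watched = val_err if monitor == "val" else train_err
        if watched > 0.9 * best:     # watched metric didn't improve enough...
            lr *= 0.5                # ...so the adaptive schedule decays LR
        best = min(best, watched)
    return train_err
```

Monitoring the noisy validation proxy triggers earlier LR decays than monitoring the (monotonically improving) train error, so the two runs end at different train errors even though the data and model are identical.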
Do you use cuDNN? I find there is some unstable behavior in cuDNN pooling, which makes results unpredictable. I have now disabled cuDNN pooling.
No, I don't use cuDNN. I'm planning to run the exact same model in both master and v2-refactor and see if my submission scores in the competition are any different. Will update with what I find.
Ok, something's definitely amiss. I ran the exact same bowl.conf and pred.conf files (with just minor changes to be rev-compatible) in both master and v2-refactor. My train error / val error in master was 0.227 / 0.257, and in v2-refactor it was 0.342 / 0.296. When I submitted to Kaggle, the leaderboard score for master was 0.90 and for v2-refactor was 0.98. This was a single submission with no averaging of multiple outputs. Either something is off with v2-refactor, or I'm missing some detail. FYI, this is the older master from ~6 weeks back; I haven't updated it at all.
Thanks very much! I will check my configuration tomorrow. If possible, could you share your configuration? Just email me at antinucleon àt gmail.com so I can get a clearer picture of what happened. I used v2 for all competitions, and I haven't found anything abnormal.
I re-ran the experiment again. The result is: So I don't think it is CXXNET's problem.
I just switched to v2-refactor for the Kaggle bowl. My first goal was to match the CNN model with what I had been using in master and see if I get the same train error scores, just as a way to baseline.
I kept getting much higher train errors in v2-refactor (0.33 vs. 0.19 in master). I was able to root-cause this to using "eval = val" in bowl.conf. If I change it to "eval = train", then the train error matches what I see in master.
Just curious why this is the case? I assumed "eval = val" should only affect the validation error score, not the train error.
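For reference, the setting in question selects which data section the per-round evaluation runs on. A rough sketch of the relevant part of a cxxnet-style config (section layout and iterator details here are illustrative, not the actual bowl.conf):

```
# training data section (iterator settings elided)
data = train
iter = imgbin
  ...
iter = end

# evaluation section: "eval = val" scores each round on the validation
# split, while "eval = train" scores it on the training split
eval = val
iter = imgbin
  ...
iter = end
```

If the trainer only used this section for reporting, switching it should not move the train error; the observations above suggest it also influences training in v2-refactor.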