Question about the training loss and validation loss. #160
Comments
I have the same question. I have, however, gotten my Training Loss > Validation Loss by increasing the Dropout to > 0.8, though that did cause training to take about twice as long (2x the epochs) to reach the minimum Validation Loss I could get.

I also have a follow-up: what is a good Validation Loss to reach for decent generation of data? (I know this could be different for a given data set.) No matter what variables I change, I can't get my lowest Validation Loss < 0.5. Most of the time the Validation Loss will get close to 0.5 and then start going back up, which would suggest I'm overfitting, if I'm not mistaken.

About my data: the "best" run I've done is the following (lowest Validation Loss): All of my tests have been on the base data, meaning I have not been running it with the

It's hard to really tell whether any of my .t7 files are better than the rest, as so far they're fairly comparable. And it's not as though the "cards" it produces are really that "bad", but there are some patterns I would like the code to pick up on. For example, when cards reference themselves, the generated output never produces a card that references its own name (it puts some other random name instead). Or: cards with bullet points have "Choose one or both" before them, but none of the generated cards have this.
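The "validation loss bottoms out near 0.5 and then climbs" behavior described above is the classic early-stopping signal. A minimal sketch of that check (the function name and the `patience` threshold are my own illustration, not anything from this repo):

```python
def early_stopping(val_losses, patience=3):
    """Return True once the validation loss has failed to improve
    for `patience` consecutive evaluations after its best value.

    val_losses: list of validation losses, one per evaluation point.
    """
    best_idx = val_losses.index(min(val_losses))
    # Number of evaluations since the best validation loss was seen.
    return len(val_losses) - 1 - best_idx >= patience


# Example: loss bottoms out at 0.6, then rises for 3 checks -> stop.
history = [1.0, 0.8, 0.6, 0.65, 0.7, 0.72]
print(early_stopping(history, patience=3))
```

Saving a checkpoint at the best-validation-loss epoch (rather than the last one) is the usual companion to this check; that would be the `.t7` file worth keeping.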
I think this stackoverflow answer covers the confusion.
As you have said in the following:

The first part is quite clear. Regarding the second part, my question is: if `training loss << validation loss`, it is overfitting; if roughly `training loss = validation loss`, it is underfitting. Then, what is the balanced situation? Is it `training loss > validation loss`, or `training loss` lower but not much lower than `validation loss`? I do not think `training loss > validation loss` will happen, right?