Added forced_even to BatchIterator ensure equal batch size #11
Thanks!
Nice. I just wonder whether it could cause problems if the model silently ignores some samples (they would always be the same samples) without the user noticing. And the trouble with StratifiedKFold still remains, right?
@BenjaminBossan what is the issue with StratifiedKFold?
Pylearn2 addresses the issue with the missing samples: "Batches of size unequal to batch_size will be discarded. Those examples will never be visited." One way to deal with this is to randomize before each new iteration over the data set.
Yes, shuffling before each new epoch would fix that.
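To make the two points above concrete, here is a minimal sketch of a batch generator that drops the incomplete trailing batch and optionally reshuffles on every pass. It is not the code from this patch: the function name `iterate_batches` and the `shuffle` and `rng` parameters are assumptions made for illustration, and plain Python lists stand in for arrays.

```python
import random


def iterate_batches(samples, batch_size, forced_even=False, shuffle=False, rng=None):
    """Yield consecutive batches from ``samples`` (hypothetical helper).

    With forced_even=True, a trailing batch smaller than batch_size is
    dropped. With shuffle=True, the order is re-drawn on every call, so
    the dropped samples can differ from one epoch to the next.
    """
    indices = list(range(len(samples)))
    if shuffle:
        # Fall back to the module-level generator if no rng is given.
        (rng or random).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch_idx = indices[start:start + batch_size]
        if forced_even and len(batch_idx) < batch_size:
            break  # discard the incomplete last batch
        yield [samples[i] for i in batch_idx]


data = list(range(10))
batches = list(iterate_batches(data, 4, forced_even=True))
# Without shuffling, samples 8 and 9 are never visited.
```

Calling the generator once per epoch with `shuffle=True` means the samples that get dropped are no longer always the same ones. As noted in the thread, the shuffle has to stay within the training split so that held-out samples are not mixed in.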
The issue with StratifiedKFold is mentioned here: And yes, shuffling would fix the issue above, but one would have to make sure that the samples set aside for evaluation are not used for training.
@BenjaminBossan see also #12
@BenjaminBossan I don't think there's an issue with StratifiedKFold, since the batch iterator will only ever see those samples that the KFold yielded, and it will simply discard the last Shuffling is also not an issue. The batch iterator could simply shuffle @cancan101 I think I've nevertheless found a problem with this patch. It's that
@dnouri What do you mean about
Take a look at the code. If
It seems like that code will just raise an exception if forced_even is not
EDIT: Ah, if it's not used. Well, yes, but that's only a problem for some conv layer implementations, right?
Won't it hit the issue in #8? I can give it a test.
I certainly wouldn't expect it to return predictions for only some of the samples that I passed in, if I set
I suppose one option would be to pad the data out to the correct length (perhaps when
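A rough sketch of the padding idea, under stated assumptions: `predict_padded` and `predict_batch` are hypothetical names, not part of this patch or of the library. The last batch is padded to full size by repeating its final sample, and the predictions for the padding are cut off again, so the caller gets exactly one prediction per input sample.

```python
def predict_padded(predict_batch, samples, batch_size):
    """Predict on full-size batches only, padding the last one.

    ``predict_batch`` is a hypothetical callable that takes a list of
    exactly ``batch_size`` samples and returns one prediction per sample.
    """
    predictions = []
    for start in range(0, len(samples), batch_size):
        batch = list(samples[start:start + batch_size])
        n_real = len(batch)
        # Pad by repeating the last real sample until the batch is full.
        batch.extend([batch[-1]] * (batch_size - n_real))
        # Keep only the predictions that belong to real samples.
        predictions.extend(predict_batch(batch)[:n_real])
    return predictions
```

Repeating the last sample is only one possible choice of filler; padding is acceptable at prediction time because the padded outputs are thrown away, whereas during training it would weight the repeated sample more heavily.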
Closes #8
Closes #1
CC @dnouri
Naming inspired by https://github.com/lisa-lab/pylearn2/blob/0e26c340d2e607dc5190c8ee68a2dc471d45e1af/pylearn2/utils/iteration.py#L177