BaseDatasetIterator does not respect the number of examples given #6283

Closed
GuutBoy opened this Issue Aug 27, 2018 · 2 comments


GuutBoy commented Aug 27, 2018

The next() method of BaseDatasetIterator does not respect the numExamples parameter given at construction, and so does not follow the general Iterator contract. Specifically, it never checks whether it has already returned the number of examples the BaseDatasetIterator was asked to provide (in which case it should, strictly speaking, throw a NoSuchElementException).

This means that the iterator will return more examples than expected if either

  • the user does not check hasNext() on the iterator, or
  • the batch size does not divide the number of examples.

A small example of the issue is given below, using MnistDataSetIterator. It iterates over 100 examples in three different ways, two of which yield more than the expected number of examples:

import java.io.IOException;
import org.deeplearning4j.datasets.iterator.impl.MnistDataSetIterator;

public class DatasetIteratorTest {

  public static void main(String[] args) throws IOException {
    final int numExamples = 100;
    MnistDataSetIterator init = new MnistDataSetIterator(10, numExamples, false, true, true, 123);
    init.next();

    System.out.println("MnistDataSetIterator with " + numExamples + " examples and batchsize " + 10);
    System.out.println("Loop checking hasNext()");
    MnistDataSetIterator iter1 = new MnistDataSetIterator(10, numExamples, false, true, true, 123);
    int examples1 = 0;
    int itCount1 = 0;
    while (iter1.hasNext()) {
      itCount1++;
      examples1 += iter1.next().numExamples();
    }
    System.out.println("Number of examples " + examples1 + " in " + itCount1 + " iterations\n");
    System.out.println("MnistDataSetIterator with " + numExamples + " examples and batchsize " + 10);
    System.out.println("Loop NOT checking hasNext()");
    MnistDataSetIterator iter2 = new MnistDataSetIterator(10, numExamples, false, true, true, 123);
    int examples2 = 0;
    int itCount2 = 0;
    // 100 calls at batch size 10 request 1000 examples, ten times numExamples
    for (int i = 0; i < 100; i++) {
      itCount2++;
      examples2 += iter2.next().numExamples();
    }
    System.out.println("Number of examples " + examples2 + " in " + itCount2 + " iterations\n");
    System.out.println("MnistDataSetIterator with " + numExamples + " examples and batchsize " + 19);
    System.out.println("Loop checking hasNext() but with batch size not dividing number of examples");
    MnistDataSetIterator iter3 = new MnistDataSetIterator(19, numExamples, false, true, true, 123);
    int examples3 = 0;
    int itCount3 = 0;
    while (iter3.hasNext()) {
      itCount3++;
      examples3 += iter3.next().numExamples();
    }
    System.out.println("Number of examples " + examples3 + " in " + itCount3 + " iterations");
  }

}
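
For contrast, here is a minimal sketch of the behaviour this report expects. It is not DL4J code: BoundedBatchIterator is a hypothetical stand-in that represents each batch only by its size. It assumes that the last batch should be truncated when the batch size does not divide numExamples, which is one possible choice and not necessarily what the library does.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical illustration only; not part of DL4J.
// Each "batch" is represented by the number of examples it would contain.
class BoundedBatchIterator implements Iterator<Integer> {
  private final int batchSize;
  private final int numExamples;
  private int cursor = 0; // examples returned so far

  BoundedBatchIterator(int batchSize, int numExamples) {
    this.batchSize = batchSize;
    this.numExamples = numExamples;
  }

  @Override
  public boolean hasNext() {
    return cursor < numExamples;
  }

  @Override
  public Integer next() {
    // Enforce the Iterator contract even if the caller skipped hasNext().
    if (!hasNext()) {
      throw new NoSuchElementException("All " + numExamples + " examples have been returned");
    }
    // Assumption: truncate the final batch so the total never exceeds numExamples.
    int size = Math.min(batchSize, numExamples - cursor);
    cursor += size;
    return size;
  }

  public static void main(String[] args) {
    BoundedBatchIterator iter = new BoundedBatchIterator(19, 100);
    int examples = 0;
    int iterations = 0;
    while (iter.hasNext()) {
      iterations++;
      examples += iter.next();
    }
    // Five batches of 19 plus one truncated batch of 5.
    System.out.println("Number of examples " + examples + " in " + iterations + " iterations");
  }
}
```

With batch size 19 and 100 examples, this sketch yields 100 examples in 6 iterations, and any further call to next() throws a NoSuchElementException.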

@AlexDBlack AlexDBlack self-assigned this Aug 28, 2018

AlexDBlack added a commit that referenced this issue Aug 28, 2018


AlexDBlack commented Aug 28, 2018

Thanks for reporting, and for the code to reproduce this.
It has been fixed here; the fix will be merged soon:
#6295

AlexDBlack added a commit that referenced this issue Aug 29, 2018

Various Fixes (#6295)
* #6283 Fix BaseDatasetIterator (MnistDataSetIterator) with specified numExamples

* Test condition fix

* #6279 MultiLayerNetwork.doEvaluation output array workspace use

* #6279 ComputationGraph.doEvaluation output array workspace use

* One round of URL javadoc fixes

lock bot commented Sep 28, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Sep 28, 2018
