NullPointerException if UnknownWordHandling.UseUnknownVector #8121

Closed
cowwoc opened this issue Aug 21, 2019 · 1 comment · Fixed by SkymindIO/deeplearning4j#141

@cowwoc
commented Aug 21, 2019

Issue Description

If CnnSentenceDataSetIterator is configured with unknownWordHandling(UnknownWordHandling.UseUnknownVector) and encounters an unknown word, the following exception is thrown:

Exception in thread "AMDSI prefetch thread" java.lang.RuntimeException: java.lang.NullPointerException
    at org.deeplearning4j.datasets.iterator.AsyncMultiDataSetIterator$AsyncPrefetchThread.run(AsyncMultiDataSetIterator.java:396)
Caused by: java.lang.NullPointerException
    at org.nd4j.linalg.api.ndarray.BaseNDArray.assign(BaseNDArray.java:1398)
    at org.nd4j.linalg.api.ndarray.BaseNDArray.put(BaseNDArray.java:2378)
    at org.deeplearning4j.iterator.CnnSentenceDataSetIterator.next(CnnSentenceDataSetIterator.java:375)
    at org.deeplearning4j.iterator.CnnSentenceDataSetIterator.next(CnnSentenceDataSetIterator.java:261)
    at org.deeplearning4j.iterator.CnnSentenceDataSetIterator.next(CnnSentenceDataSetIterator.java:62)
    at org.deeplearning4j.datasets.iterator.impl.MultiDataSetIteratorAdapter.next(MultiDataSetIteratorAdapter.java:78)
    at org.deeplearning4j.datasets.iterator.impl.MultiDataSetIteratorAdapter.next(MultiDataSetIteratorAdapter.java:29)
    at org.deeplearning4j.datasets.iterator.AsyncMultiDataSetIterator$AsyncPrefetchThread.run(AsyncMultiDataSetIterator.java:368)

There are actually two bugs at play here:

  1. BaseNDArray.java:1398 invokes:
        Preconditions.checkState((this.isScalar() && arr.isScalar()) || (this.isVector() && arr.isVector()) || Shape.shapeEqualWithSqueeze(this.shape(), arr.shape()),
                "Cannot assign arrays: arrays must both be scalars, both vectors, or shapes must be equal other than size 1 dimensions. Attempting to do x.assign(y)" +
                        " with x.shape=%ndShape and y.shape=%ndShape", this, arr );

but if arr is null, evaluating this condition dereferences arr and throws the NullPointerException shown above. Please handle the possibility of arr being null.

  2. CnnSentenceDataSetIterator:186 invokes:
        if (unknownWordHandling == UnknownWordHandling.UseUnknownVector && word == UNKNOWN_WORD_SENTINEL) { //Yes, this *should* be using == for the sentinel String here
            vector = unknown;
        }

but the field unknown is never set, so its value is null. This means that if an unknown word is encountered, getVector() returns null and the aforementioned exception is thrown.
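
To illustrate the second point, here is a minimal sketch of a lookup in which the unknown vector is initialized once at construction time, so the sentinel branch can never yield null. It is not the actual DL4J fix: the class `UnknownVectorLookupSketch`, the method `tokenOrSentinel`, the use of plain `float[]` in place of INDArray, and the choice of an all-zero unknown vector are all assumptions made so the example runs without ND4J.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the iterator's word lookup; plain float[] arrays
// replace INDArray so the sketch runs without ND4J on the classpath.
final class UnknownVectorLookupSketch {
    // Sentinel compared by identity, mirroring the == check quoted in the issue.
    // new String(...) guarantees no real token can be the same object.
    static final String UNKNOWN_WORD_SENTINEL = new String("UNKNOWN_WORD_SENTINEL");

    private final Map<String, float[]> vectors;
    private final float[] unknown; // assigned in the constructor, never null

    UnknownVectorLookupSketch(Map<String, float[]> vectors, int vectorSize) {
        this.vectors = new HashMap<>(vectors);
        // Assumption: an all-zero vector is an acceptable "unknown" embedding.
        this.unknown = new float[vectorSize];
    }

    // Maps a token to the sentinel when the vocabulary does not contain it.
    String tokenOrSentinel(String token) {
        return vectors.containsKey(token) ? token : UNKNOWN_WORD_SENTINEL;
    }

    // Returns a copy of the word's vector; the sentinel yields the unknown vector.
    float[] getVector(String word) {
        if (word == UNKNOWN_WORD_SENTINEL) { // identity comparison is intentional
            return unknown.clone();
        }
        float[] vector = vectors.get(word);
        if (vector == null) {
            throw new IllegalArgumentException("No vector for word: " + word);
        }
        return vector.clone();
    }
}
```

With this shape, `getVector(tokenOrSentinel("someUnseenWord"))` returns a zero-filled array of length `vectorSize` rather than null. Whether zeros, a learned UNK embedding, or something else is the right unknown vector is a separate design decision.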

Expected behavior: getVector() should probably return something other than null for unknown words.
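
For the first bug, a guard placed before the shape check would turn the bare NullPointerException into a descriptive error. The sketch below shows the idea on a hypothetical `AssignGuardSketch` class backed by a `float[]`. It is not ND4J code, and the exception types and messages are assumptions chosen for illustration.

```java
// Hypothetical stand-in for an array type with an assign method.
final class AssignGuardSketch {
    private final float[] data;

    AssignGuardSketch(int length) {
        this.data = new float[length];
    }

    // Copies arr into this array after validating the argument.
    AssignGuardSketch assign(float[] arr) {
        // Check for null first, so the shape check below never dereferences null.
        if (arr == null) {
            throw new IllegalArgumentException(
                    "Cannot assign a null array: x.assign(y) was called with y == null");
        }
        if (arr.length != data.length) {
            throw new IllegalStateException(
                    "Cannot assign arrays: lengths differ, x.length=" + data.length
                            + " and y.length=" + arr.length);
        }
        System.arraycopy(arr, 0, data, 0, arr.length);
        return this;
    }

    float get(int index) {
        return data[index];
    }
}
```

Calling `assign(null)` on this class fails immediately with a message that names the problem, which is easier to trace from a prefetch thread than an NPE raised inside the precondition.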

Version Information

Please indicate relevant versions, including, if relevant:

  • Deeplearning4j version: 1.0.0-beta4
  • Platform information (OS, etc): Windows 10.0.18362.295
  • CUDA version, if used: N/A
  • NVIDIA driver version, if in use: N/A

@AlexDBlack AlexDBlack self-assigned this Aug 21, 2019

AlexDBlack added a commit to SkymindIO/deeplearning4j that referenced this issue Aug 21, 2019
eclipse#8121 CnnSentenceDataSetIterator fixes
Signed-off-by: AlexDBlack <blacka101@gmail.com>
AlexDBlack added a commit to SkymindIO/deeplearning4j that referenced this issue Aug 21, 2019
Various fixes (#141)
* eclipse#8121 CnnSentenceDataSetIterator fixes

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* eclipse#8120 CnnSentenceDataSetIterator.loadSingleSentence no words UX/exception improvement

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* eclipse#8122 AggregatingSentenceIterator builder - addSentencePreProcessor -> sentencePreProcessor

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* eclipse#8082 Arbiter - fix GridSearchCandidateGenerator search size issue

Signed-off-by: AlexDBlack <blacka101@gmail.com>
@AlexDBlack

Contributor

commented Aug 21, 2019

Thanks for reporting - fixed here, will be merged to eclipse/deeplearning4j master soon.
SkymindIO#141
