3D CNN throwing ND4JIllegalStateException #6366

Closed
HGuillemet opened this issue Sep 4, 2018 · 4 comments

@HGuillemet
commented Sep 4, 2018

Here is a standalone test class that throws:
ND4JIllegalStateException: X, Y and Z arguments should have the same length for PairwiseTransform. x: length 1024, shape [32, 32]; y: 28672, shape [28, 32, 32]; z: 1024, shape [32, 32]

gist

It uses a dummy custom RecordReader that returns 10 3D records filled with 0 and with label 0.

@AlexDBlack AlexDBlack self-assigned this Sep 4, 2018

AlexDBlack added a commit that referenced this issue Sep 5, 2018
@AlexDBlack AlexDBlack referenced this issue Sep 5, 2018
@AlexDBlack

Contributor

commented Sep 5, 2018

So, looking at this: there are a couple of issues here, one on your side and one on ours.

First, 3D CNN data/activations have 5 dimensions, not 3. See this for more details:
https://github.com/deeplearning4j/deeplearning4j/blob/0294e884659c05dc8cdbb9aef5cce0ff10e420e2/deeplearning4j/deeplearning4j-nn/src/main/java/org/deeplearning4j/nn/conf/layers/Convolution3D.java#L48-L56
For a single example, the leading (minibatch) dimension should be 1.
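
As a rough illustration of the shape point above, here is a minimal plain-Java sketch (it deliberately does not use ND4J). The class name `Ncdhw` and the helpers `length` and `toNcdhw` are hypothetical, and the single-channel assumption is mine. It wraps the [28, 32, 32] volume from the error message into a 5D [minibatch, channels, depth, height, width] shape and checks that the element count is unchanged.

```java
import java.util.Arrays;

public class Ncdhw {
    /** Number of elements implied by a shape (product of its dimensions). */
    static long length(long... shape) {
        long n = 1;
        for (long d : shape) {
            n *= d;
        }
        return n;
    }

    /**
     * Wraps a single-channel [depth, height, width] volume shape as a
     * one-example NCDHW shape: [1, 1, depth, height, width].
     * Assumption: exactly one example and one channel.
     */
    static long[] toNcdhw(long[] volumeShape) {
        if (volumeShape.length != 3) {
            throw new IllegalArgumentException(
                    "Expected a rank 3 shape, got rank " + volumeShape.length);
        }
        return new long[] {1, 1, volumeShape[0], volumeShape[1], volumeShape[2]};
    }

    public static void main(String[] args) {
        long[] volume = {28, 32, 32};      // shape reported for y in the exception
        long[] example = toNcdhw(volume);  // rank 5 shape for one example
        System.out.println(Arrays.toString(example));
        // Adding leading dimensions of size 1 does not change the element count.
        System.out.println(length(volume) + " == " + length(example));
    }
}
```

Because the added dimensions have size 1, this is a pure reshape: the underlying data stays the same and only the rank changes from 3 to 5.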

Second: RecordReaderMultiDataSetIterator hadn't been set up for 5D data via NDArrayWritables.
This PR adds support for that: #6370
After it has been merged, you can switch to snapshots to access the fix: https://deeplearning4j.org/docs/latest/deeplearning4j-config-snapshots
It usually takes a few hours after a pull request has been merged for the snapshots to become available.

AlexDBlack added a commit that referenced this issue Sep 5, 2018
AlexDBlack added a commit that referenced this issue Sep 5, 2018
@HGuillemet

Author

commented Sep 5, 2018

Thanks.
I added
.dataFormat(Convolution3D.DataFormat.NCDHW)
to my Convolution3D layer (first layer only; I guess that's enough. BTW, is there any reason why this setting is not available for the SubSampling3D layer?)
And changed the shape of my data brick to [1, 1, nZ, nY, nX].
Now I get:
java.lang.RuntimeException: Unexpected rank: 5
Waiting for your merge.
I'm trying to switch to the snapshots but am struggling with an UnsatisfiedLinkError: libjnind4jcpu.so is not finding libmkldnn.so...

@HGuillemet

Author

commented Sep 5, 2018

Found a workaround for the UnsatisfiedLinkError by manually creating a missing symlink to libmkldnn.so.0.
Jenkins cannot build your commit, apparently.

AlexDBlack added a commit that referenced this issue Sep 6, 2018
AlexDBlack added a commit that referenced this issue Sep 7, 2018
AlexDBlack added a commit that referenced this issue Sep 7, 2018
DL4J Issues (#6370)
* #6358 Fix issue with LayerWorkspaceMgr when used directly in conjunction with CuDNN

* 6326 RnnOutputLayer + CompGraph + Dropout fix

* Another test + improve RnnOutputLayer error

* #6368 Fix MultiLayerNetwork pretrainLayer iterator reset supported check

* #6352 fix use of assert keyword for argument validation in ND4J

* #6352 more assert keyword fixes

* #6366 RecordReaderMultiDataSetIterator: better validation/errors; 5D (3D CNN) NDArrayWritable support

* Cleanup

* Compilation fix

* More cleanup

* Removed method fix

* More fixes
saudet added a commit to bytedeco/javacpp-presets that referenced this issue Sep 10, 2018
@lock


commented Oct 7, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Oct 7, 2018
