
NPE when using mask arrays in a network with StackVertex #6490

Closed
timxyz opened this issue Sep 25, 2018 · 4 comments · Fixed by #6546


@timxyz

commented Sep 25, 2018

Using deeplearning4j v1.0.0-beta2, I get the following crash immediately upon trying to fit a ComputationGraph that has multiple inputs (each with a mask array) and contains a StackVertex.

java.lang.NullPointerException
	at org.deeplearning4j.nn.graph.vertex.impl.StackVertex.feedForwardMaskArrays(StackVertex.java:186)
	at org.deeplearning4j.nn.graph.ComputationGraph.setLayerMaskArrays(ComputationGraph.java:3669)
	at org.deeplearning4j.nn.graph.ComputationGraph.fitHelper(ComputationGraph.java:1116)
	at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1098)
	at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1006)
	at org.deeplearning4j.earlystopping.trainer.EarlyStoppingGraphTrainer.fit(EarlyStoppingGraphTrainer.java:78)
	at org.deeplearning4j.earlystopping.trainer.BaseEarlyStoppingTrainer.fit(BaseEarlyStoppingTrainer.java:122)

In particular, the failing statement is `long size1_ex0 = maskArrays[0].size(1);`, which throws because `maskArrays[0]` is null.

I've noticed that the implementations of feedForwardMaskArrays for other vertices (e.g. MergeVertex) explicitly expect some elements of the `INDArray[] maskArrays` parameter to be null, but I'm afraid I don't know enough about how mask arrays are implemented to say whether that is the cause of the issue in StackVertex.

The error seems to be independent of whether or not the mask arrays I specify in my MultiDataSets contain null entries; it occurs whenever I use masks at all. The network works fine without masks, but I'd like it to take variable-length time series into account.

Thanks for any insights!
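The null-tolerant handling I mean can be sketched roughly like this (a hypothetical simplification in plain Java, not the actual DL4J source; the `maskLength` helper and its fallback argument are invented for illustration):

```java
// Hypothetical simplification of per-input mask handling, not the actual
// DL4J source: MergeVertex-style code treats a null entry as "no mask"
// (i.e. all time steps present), while the beta2 StackVertex dereferenced
// maskArrays[0] unconditionally and threw an NPE.
public class MaskNullCheck {
    // Returns the mask length for one input, falling back to a default
    // sequence length when no mask was supplied for that input.
    static long maskLength(long[][] maskArrays, int input, long defaultLen) {
        if (maskArrays == null || maskArrays[input] == null) {
            return defaultLen; // no mask: treat every time step as present
        }
        return maskArrays[input].length;
    }

    public static void main(String[] args) {
        long[][] masks = {null, {1, 1, 0}}; // first input has no mask
        System.out.println(maskLength(masks, 0, 5)); // prints 5 (default)
        System.out.println(maskLength(masks, 1, 5)); // prints 3
    }
}
```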

@AlexDBlack

Contributor

commented Sep 25, 2018

Are you able to share your network architecture that's causing this?

@AlexDBlack

Contributor

commented Oct 1, 2018

I have tried a number of different network configurations using StackVertex, and I was unable to reproduce this issue.
@timxyz can you share your network configuration? If I can't reproduce it, I can't fix it.

@timxyz

Author

commented Oct 3, 2018

@AlexDBlack Thanks for nudging me on this, I'd forgotten to get back to you. I investigated a bit more and realised the problem is due to a second StackVertex appearing after an UnstackVertex. In my case the second stack turned out to be unnecessary, and I was able to reformulate my network to get it working again.

Here is the simplest reproducing example I could make:

	val nnConfig = NeuralNetConfiguration.Builder()
		.optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
		.updater(Adam(2E-2))
		.graphBuilder()
		.setInputTypes(
			InputType.recurrent(d.toLong()),
			InputType.recurrent(d.toLong())
		)
		.addInputs("m1", "m2")
		.addVertex("m", StackVertex(), "m1", "m2")
		.addLayer("M", LastTimeStep(LSTM.Builder().nIn(d).nOut(1).activation(Activation.TANH).build()), "m")
		.addVertex("p1", UnstackVertex(0, 2), "M")
		.addVertex("p2", UnstackVertex(1, 2), "M")
		.addVertex("p", StackVertex(), "p1", "p2")
		.addVertex("q1", UnstackVertex(0, 2), "p")
		.addVertex("q2", UnstackVertex(1, 2), "p")
		.addVertex("q", MergeVertex(), "q1", "q2")
		.addLayer("probability", OutputLayer.Builder().nIn(d * 2).nOut(6).lossFunction(LossFunctions.LossFunction.MEAN_ABSOLUTE_ERROR).build(), "q")
		.setOutputs("probability")
		.build()
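For anyone hitting this: my rough understanding, sketched in plain Java with no DL4J types, is that LastTimeStep collapses the time dimension, so no time mask survives past "M", and the masks reaching the second StackVertex are null. The `stackMasks` helper below is invented for illustration; it just shows the all-null case a stacking step has to tolerate.

```java
// Illustrative plain-Java model of mask propagation through the graph
// above, not DL4J code. A stacking step concatenates per-input masks
// along the example dimension, but must return null (no mask) when no
// input carries a mask -- the case that crashed the beta2 StackVertex.
public class MaskFlow {
    static long[] stackMasks(long[][] masks) {
        if (masks == null) return null;
        boolean anyPresent = false;
        int total = 0;
        for (long[] m : masks) {
            if (m != null) { anyPresent = true; total += m.length; }
        }
        if (!anyPresent) return null; // nothing to stack: propagate "no mask"
        long[] out = new long[total];
        int pos = 0;
        for (long[] m : masks) {
            if (m == null) continue; // simplification: skip absent masks
            System.arraycopy(m, 0, out, pos, m.length);
            pos += m.length;
        }
        return out;
    }

    public static void main(String[] args) {
        // After LastTimeStep, both unstacked branches carry no mask:
        System.out.println(stackMasks(new long[][]{null, null})); // prints "null"
    }
}
```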

@AlexDBlack AlexDBlack self-assigned this Oct 8, 2018

AlexDBlack added a commit that referenced this issue Oct 8, 2018
AlexDBlack added a commit that referenced this issue Oct 10, 2018
DL4J/ND4J Fixes (#6546)
* #6539 Handle 0 gradient case for gradient normalization

* #6521 Nd4j.gemm validation

* #6543 View/order checks for BaseNDArray.mmuli()

* #6542 mmuli shape validation

* #6545 Require scalars, vectors, or same shape for INDArray.assign()

* #6520 Fix setLearningRate(double) for the no updater state (SGD, etc) case

* #6490 Fix StackVertex NPE with some masking cases

* Cnn3DLossLayer. Typo in RecordReaderMultiDataSetIteratorTest.

* Small fix

* Cnn3DLossLayer gradient checks (not yet passing)

* Cnn3dLossLayer masking + test fixes

* Extra tests, CNN3D tweaks

* Fix Conv3d layer support for NDHWC data format

* Fix Cnn3DLossLayer

* Allow size 1 dimensions in assign shape check

* Minor test fixes

* Fix RollAxis; other tweaks
@lock



commented Nov 9, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Nov 9, 2018
