Cannot perform backprop: Dropout mask array is absent (already cleared?) in beta2 #6326

Closed
fmorbini opened this Issue Aug 31, 2018 · 5 comments

fmorbini commented Aug 31, 2018

I'm training the following network:

  ComputationGraphConfiguration conf = new NeuralNetConfiguration.Builder()
      .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
      .seed(1000)
      .gradientNormalization(GradientNormalization.RenormalizeL2PerParamType)
      .l2(1e-5)
      .dropOut(0.8)
      .updater(new Adam(3))
      .weightInit(WeightInit.XAVIER).graphBuilder().addInputs("vectors", "ontology")
      .addVertex("merge", new MergeVertex(), "vectors", "ontology")
      .addLayer(LSTM_LAYER,
          new Bidirectional(Bidirectional.Mode.CONCAT, new LSTM.Builder()
              .nIn(getW2vService().getVectorSize() + ontologySize).nOut(hiddenaLayerSize)
              .activation(Activation.TANH)
              .dropOut(new GaussianNoise(0.05))
              .build())
          ,"merge")
      .addLayer("intentOut",
          new RnnOutputLayer.Builder().activation(Activation.SOFTMAX)
              .lossFunction(LossFunctions.LossFunction.MCXENT).nIn(hiddenaLayerSize*2)
              .nOut(intentsDictionary.size()).build(),
          LSTM_LAYER)
      .addLayer("neOut",
          new RnnOutputLayer.Builder().activation(Activation.SOFTMAX)
              .lossFunction(LossFunctions.LossFunction.MCXENT).nIn(hiddenaLayerSize*2)
              .nOut(nesDictionary.size()).build(),
          LSTM_LAYER)
      .setOutputs("intentOut", "neOut").build();

on 1.0-alpha, and it works fine.

I gave 1.0-beta2 a try, and I get the following exception as soon as training starts:

Exception in thread "main" java.lang.IllegalStateException: Cannot perform backprop: Dropout mask array is absent (already cleared?)
at org.nd4j.base.Preconditions.throwStateEx(Preconditions.java:626)
at org.nd4j.base.Preconditions.checkState(Preconditions.java:253)
at org.deeplearning4j.nn.conf.dropout.Dropout.backprop(Dropout.java:154)
at org.deeplearning4j.nn.layers.AbstractLayer.backpropDropOutIfPresent(AbstractLayer.java:309)
at org.deeplearning4j.nn.layers.recurrent.RnnOutputLayer.backpropGradient(RnnOutputLayer.java:70)
at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.doBackward(LayerVertex.java:148)
at org.deeplearning4j.nn.graph.ComputationGraph.calcBackpropGradients(ComputationGraph.java:2592)
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1366)
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1326)
at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:160)
at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:51)
at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52)
at org.deeplearning4j.nn.graph.ComputationGraph.fitHelper(ComputationGraph.java:1149)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1098)
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1065)
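
The top frames show what is going wrong: Dropout saves a mask array during the forward pass and needs it again for backprop, but by the time backpropGradient runs on the output layer the mask has already been cleared. Roughly, the guard at Dropout.java:154 amounts to the following (a paraphrase of the precondition, not the verbatim source; mask stands in for the stored mask INDArray):

  // Paraphrased guard from org.deeplearning4j.nn.conf.dropout.Dropout#backprop:
  // the mask recorded during the forward pass must still be present.
  Preconditions.checkState(mask != null,
      "Cannot perform backprop: Dropout mask array is absent (already cleared?)");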

AlexDBlack commented Sep 1, 2018

I'll need more than just your network configuration - I'll need to see how you are training this too.

fmorbini commented Sep 4, 2018

@AlexDBlack here is a gist that reproduces the issue, at least on my machine: https://gist.github.com/fmorbini/936a7f1c9b0aa905e3ecf71dc096cb30

It's not the same issue, but it is related to #5571.
Unfortunately, I have little time to play around with new versions; it already takes some time to modify the code to follow the API changes. Sorry for not following up on all your messages.

Thank you.

AlexDBlack commented Sep 5, 2018

@fmorbini thanks, that's helpful. I'll post here once I've had a chance to run it.

AlexDBlack self-assigned this Sep 5, 2018

AlexDBlack referenced this issue Sep 5, 2018: DL4J Issues #6370 (merged)

AlexDBlack commented Sep 5, 2018

Thanks for reporting - fixed here: #6370
That should be merged in the next 24 hours or so.

FYI, the issue is specific to dropout on output layers in a ComputationGraph.
You can work around it by removing the dropout from the output layer, or by using an explicit dropout layer instead of dropout on the output layer.
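
A minimal sketch of the second workaround, adapted from the configuration in the original report (the vertex name "dropout" is illustrative; this assumes org.deeplearning4j.nn.conf.layers.DropoutLayer passes the RNN-format activations through unchanged, and that a per-layer dropOut(0) override disables the inherited global dropout):

  // Explicit dropout layer between the bidirectional LSTM and the output heads,
  // instead of relying on the inherited .dropOut(0.8) on the output layers:
  .addLayer("dropout", new DropoutLayer.Builder(0.8).build(), LSTM_LAYER)
  .addLayer("intentOut",
      new RnnOutputLayer.Builder().activation(Activation.SOFTMAX)
          .lossFunction(LossFunctions.LossFunction.MCXENT)
          .nIn(hiddenaLayerSize * 2).nOut(intentsDictionary.size())
          .dropOut(0.0)  // per-layer override: no dropout on the output layer itself
          .build(),
      "dropout")
  // ... and connect "neOut" to "dropout" the same way.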

lock bot commented Oct 7, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Oct 7, 2018
