
Crash when using VariationalAutoencoder with Bidirectional LSTM but not with unidirectional LSTM #5472

Closed
rexlaw412 opened this issue Jun 5, 2018 · 4 comments

@rexlaw412 commented Jun 5, 2018

Issue Description

The program crashes when using VariationalAutoencoder with a Bidirectional LSTM (or GravesLSTM), but it runs properly when using it with a unidirectional LSTM, or when using the LSTM on its own.

The following is the test code:

        ComputationGraphConfiguration.GraphBuilder builder = new NeuralNetConfiguration.Builder()
                .updater(new Adam(0.01))
                .activation(Activation.RELU)
                .graphBuilder()
                .addInputs("IN")
                .setInputTypes(InputType.recurrent(49))
                .addLayer("AUTOENCODER",
                        new VariationalAutoencoder.Builder()
                                .encoderLayerSizes(64)
                                .decoderLayerSizes(64)
                                .nOut(7)
                                .pzxActivationFunction(Activation.IDENTITY)
                                .reconstructionDistribution(new BernoulliReconstructionDistribution(Activation.SIGMOID.getActivationFunction())).build(),
                        "IN")
                //.addLayer("RNN", new GravesLSTM.Builder().nOut(128).build(), "AUTOENCODER")//you may comment out this line, or next line for testing
                .addLayer("RNN", new Bidirectional(Bidirectional.Mode.ADD, new GravesLSTM.Builder().nOut(128).build()), "AUTOENCODER")
                .addLayer("OUT", new RnnOutputLayer.Builder()
                        .nOut(2)
                        .activation(Activation.SOFTMAX)
                        .lossFunction(LossFunctions.LossFunction.MCXENT).build(), "RNN")
                .setOutputs("OUT")
                .pretrain(true)
                .backprop(true);

        ComputationGraphConfiguration config = builder.build();
        ComputationGraph net = new ComputationGraph(config);
        net.init();

When using the Bidirectional LSTM, the following exception is thrown:
org.nd4j.linalg.workspace.ND4JWorkspaceException: Cannot duplicate INDArray: Array outdated workspace pointer from workspace WS_LAYER_WORKING_MEM (array generation 846, current workspace generation 847)
All open workspaces: [WS_LAYER_WORKING_MEM, WS_RNN_LOOP_WORKING_MEM]
at org.nd4j.linalg.workspace.WorkspaceUtils.assertValidArray(WorkspaceUtils.java:94)
at org.nd4j.linalg.api.ndarray.BaseNDArray.dup(BaseNDArray.java:1743)
at org.nd4j.linalg.api.blas.params.GemmParams.copyIfNeccessary(GemmParams.java:139)
at org.nd4j.linalg.api.blas.params.GemmParams.&lt;init&gt;(GemmParams.java:101)
at org.nd4j.linalg.api.blas.impl.BaseLevel3.gemm(BaseLevel3.java:48)
at org.nd4j.linalg.api.ndarray.BaseNDArray.mmuli(BaseNDArray.java:3529)
at org.nd4j.linalg.api.ndarray.BaseNDArray.mmul(BaseNDArray.java:3222)
at org.deeplearning4j.nn.layers.recurrent.LSTMHelpers.activateHelper(LSTMHelpers.java:211)
at org.deeplearning4j.nn.layers.recurrent.GravesLSTM.activateHelper(GravesLSTM.java:139)
at org.deeplearning4j.nn.layers.recurrent.GravesLSTM.activate(GravesLSTM.java:115)
at org.deeplearning4j.nn.layers.recurrent.BidirectionalLayer.activate(BidirectionalLayer.java:173)
at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.doForward(LayerVertex.java:105)
at org.deeplearning4j.nn.graph.ComputationGraph.ffToLayerActivationsDetached(ComputationGraph.java:1803)
at org.deeplearning4j.nn.graph.ComputationGraph.score(ComputationGraph.java:2748)
at org.deeplearning4j.nn.graph.ComputationGraph.score(ComputationGraph.java:2713)
at org.deeplearning4j.earlystopping.scorecalc.DataSetLossCalculator.scoreMinibatch(DataSetLossCalculator.java:80)
at org.deeplearning4j.earlystopping.scorecalc.base.BaseScoreCalculator.calculateScore(BaseScoreCalculator.java:51)
at org.deeplearning4j.earlystopping.trainer.BaseEarlyStoppingTrainer.fit(BaseEarlyStoppingTrainer.java:196)
at Hawk.main(Hawk.java:27)

Version Information

  • Deeplearning4j version: 1.0.0-beta
  • Platform: Windows 10

AlexDBlack added this to the DL4J/Arbiter/DataVec Planning Sprint (01 June) milestone Jun 6, 2018

@AlexDBlack (Contributor) commented Jun 7, 2018

Thanks for the issue. So far I'm unable to reproduce this with the provided configuration (on either 1.0.0-beta or the current master version of DL4J).
Are you able to provide something I can run that reproduces it?

AlexDBlack self-assigned this Jun 7, 2018

@rexlaw412 (Author) commented Jun 9, 2018

Thank you very much for the help, and I'm really sorry for the late reply.

I isolated the code from my original project.
Here is my demo project:
https://github.com/rexlaw412/dl4j_issue_5472

Also, I found more information while creating this demo project:
the problem only occurs when using the early stopping trainer.
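
Roughly, the early stopping setup that triggers it looks like this (a sketch rather than the exact code from the demo project; trainIter and testIter stand in for my DataSetIterators, and the termination condition is just illustrative):

    import org.deeplearning4j.earlystopping.EarlyStoppingConfiguration;
    import org.deeplearning4j.earlystopping.saver.InMemoryModelSaver;
    import org.deeplearning4j.earlystopping.scorecalc.DataSetLossCalculator;
    import org.deeplearning4j.earlystopping.termination.MaxEpochsTerminationCondition;
    import org.deeplearning4j.earlystopping.trainer.EarlyStoppingGraphTrainer;

    // net is the ComputationGraph built from the configuration above;
    // trainIter and testIter are DataSetIterators over the training/test data.
    EarlyStoppingConfiguration<ComputationGraph> esConf =
            new EarlyStoppingConfiguration.Builder<ComputationGraph>()
                    .epochTerminationConditions(new MaxEpochsTerminationCondition(20))
                    // DataSetLossCalculator is the score calculator visible in the stack trace
                    .scoreCalculator(new DataSetLossCalculator(testIter, true))
                    .modelSaver(new InMemoryModelSaver<ComputationGraph>())
                    .build();

    EarlyStoppingGraphTrainer trainer = new EarlyStoppingGraphTrainer(esConf, net, trainIter);
    trainer.fit(); // the exception above is thrown while scoring inside this call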

I hope the project and this observation help in solving the problem. Thank you again for your help.

AlexDBlack added a commit that referenced this issue Jun 12, 2018

@AlexDBlack (Contributor) commented Jun 12, 2018

Thanks for reporting, and thanks for the repro project, which helped me reproduce the issue.
This was a bug in the workspace memory management for the bidirectional layer, which could surface only under a limited set of circumstances.
The issue has been fixed here: #5560 - I'll close this issue once that has been merged.

A few hours after that PR has been merged, you will be able to access the fix using snapshots, as described here: https://deeplearning4j.org/snapshots
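
In the meantime, a possible workaround (untested against this exact case, so treat it as a suggestion rather than a confirmed fix) is to disable workspaces, since the failure is a stale pointer inside the WS_LAYER_WORKING_MEM workspace; this trades some performance for safety:

    import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
    import org.deeplearning4j.nn.conf.WorkspaceMode;

    // With workspaces disabled, no workspace-managed arrays can hold stale
    // pointers, at the cost of more garbage collection pressure.
    NeuralNetConfiguration.Builder builder = new NeuralNetConfiguration.Builder()
            .trainingWorkspaceMode(WorkspaceMode.NONE)    // disable workspaces during fit()/score()
            .inferenceWorkspaceMode(WorkspaceMode.NONE);  // disable workspaces during inference
    // ... continue with .updater(...), .graphBuilder(), etc. as in the original report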

@lock (bot) commented Sep 21, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Sep 21, 2018
