getMemoryReport produces an exception on a valid ComputationGraph #4223

Closed
bixbyr opened this issue Oct 30, 2017 · 2 comments

bixbyr commented Oct 30, 2017

Calling getMemoryReport on a valid ComputationGraph that has a single input and two outputs produces an exception.

// Network definition
import org.deeplearning4j.nn.conf.{NeuralNetConfiguration, WorkspaceMode}
import org.deeplearning4j.nn.conf.ConvolutionMode.Same
import org.deeplearning4j.nn.conf.graph.ElementWiseVertex
import org.deeplearning4j.nn.conf.inputs.InputType
import org.deeplearning4j.nn.conf.layers.{ActivationLayer, BatchNormalization, ConvolutionLayer, DenseLayer, OutputLayer}
import org.deeplearning4j.nn.graph.ComputationGraph
import org.nd4j.linalg.activations.Activation.{RELU, TANH}
import org.nd4j.linalg.learning.config.Nesterovs
import org.nd4j.linalg.lossfunctions.LossFunctions.LossFunction.{MSE, NEGATIVELOGLIKELIHOOD}

class MultiTaskNetwork {

  val l2RegParam = 10e-4
  val momentumParam = 0.9

  private val networkBaseConf = new NeuralNetConfiguration.Builder()
    .trainingWorkspaceMode(WorkspaceMode.SEPARATE)
    .learningRate(0.01)
    .updater(new Nesterovs(momentumParam))
    .graphBuilder()
    .setInputTypes(InputType.convolutional(17, 19, 19))
    .addInputs("input")
    .addLayer("baseConvolution", new ConvolutionLayer.Builder(Array(3, 3), Array(1, 1)).convolutionMode(Same).nIn(17).nOut(256).l2Bias(l2RegParam).build, "input")
    .addLayer("baseBatchNorm", new BatchNormalization.Builder().build, "baseConvolution")
    .addLayer("baseReLu", new ActivationLayer.Builder().activation(RELU).build, "baseBatchNorm")

  private val (networkTrunkConf, lastBlockOutput) = (1 to 20).foldLeft((networkBaseConf, "baseReLu")) {
    case ((partialNetworkConf, inputName), blockNumber) =>
      /*
      Each residual block applies the following modules sequentially to its input:
      (1) A convolution of 256 filters of kernel size 3 × 3 with stride 1
      (2) Batch normalization
      (3) A rectifier nonlinearity
      (4) A convolution of 256 filters of kernel size 3 × 3 with stride 1
      (5) Batch normalization
      (6) A skip connection that adds the input to the block
      (7) A rectifier nonlinearity
      */
      val confWithNextBlock = partialNetworkConf
        .addLayer(s"block${blockNumber}Conv1", new ConvolutionLayer.Builder(Array(3, 3), Array(1, 1)).convolutionMode(Same).nOut(256).l2Bias(l2RegParam).build, inputName)
        // .addLayer(s"block${blockNumber}Conv1", new ConvolutionLayer.Builder(Array(3, 3), Array(1, 1)).convolutionMode(Same).nIn(17).nOut(256).l2Bias(l2RegParam).build, inputName)
        .addLayer(s"block${blockNumber}BatchNorm1", new BatchNormalization.Builder().build, s"block${blockNumber}Conv1")
        .addLayer(s"block${blockNumber}ReLu1", new ActivationLayer.Builder().activation(RELU).build, s"block${blockNumber}BatchNorm1")
        .addLayer(s"block${blockNumber}Conv2", new ConvolutionLayer.Builder(Array(3, 3), Array(1, 1)).convolutionMode(Same).nOut(256).l2Bias(l2RegParam).build, s"block${blockNumber}ReLu1")
        // .addLayer(s"block${blockNumber}Conv2", new ConvolutionLayer.Builder(Array(3, 3), Array(1, 1)).convolutionMode(Same).nIn(17).nOut(256).l2Bias(l2RegParam).build, s"block${blockNumber}ReLu1")
        .addLayer(s"block${blockNumber}BatchNorm2", new BatchNormalization.Builder().build, s"block${blockNumber}Conv2")
        .addVertex(s"block${blockNumber}SkipConnection", new ElementWiseVertex(ElementWiseVertex.Op.Add), inputName, s"block${blockNumber}BatchNorm2")
        .addLayer(s"block${blockNumber}Output", new ActivationLayer.Builder().activation(RELU).build, s"block${blockNumber}SkipConnection")
      (confWithNextBlock, s"block${blockNumber}Output")
  }

  val fullNetworkConf = networkTrunkConf
    /*
    Policy head
    (1) A convolution of 2 filters of kernel size 1 × 1 with stride 1
    (2) Batch normalization
    (3) A rectifier nonlinearity
    (4) A fully connected linear layer that outputs a vector of size 19 × 19 + 1 = 362, corresponding to logit probabilities for all intersections and the pass move
    */
    .addLayer("policyConv", new ConvolutionLayer.Builder(Array(1, 1), Array(1, 1)).convolutionMode(Same).nOut(2).l2Bias(l2RegParam).build, lastBlockOutput)
    .addLayer("policyBatchNorm", new BatchNormalization.Builder().activation(RELU).build, "policyConv")
    .addLayer("policyReLu", new ActivationLayer.Builder().activation(RELU).build, "policyBatchNorm")
    .addLayer("policyFullyConnected", new DenseLayer.Builder().nOut(362).l2Bias(l2RegParam).build, "policyReLu")
    .addLayer("policyOutput", new OutputLayer.Builder().lossFunction(MSE).nOut(362).build, "policyFullyConnected")
    // Should this be the probit? (one alternative is sketched after the class)
    /*
    Value head
    (1) A convolution of 1 filter of kernel size 1 × 1 with stride 1
    (2) Batch normalization
    (3) A rectifier nonlinearity
    (4) A fully connected linear layer to a hidden layer of size 256
    (5) A rectifier nonlinearity
    (6) A fully connected linear layer to a scalar
    (7) A tanh nonlinearity outputting a scalar in the range [−1, 1]
    */
    .addLayer("valueHeadConv", new ConvolutionLayer.Builder(Array(1, 1), Array(1, 1)).convolutionMode(Same).nOut(1).l2Bias(l2RegParam).build, lastBlockOutput)
    .addLayer("valueBatchNorm", new BatchNormalization.Builder().activation(RELU).build, "valueHeadConv")
    .addLayer("valueReLu", new ActivationLayer.Builder().activation(RELU).build, "valueBatchNorm")
    .addLayer("valueFullyConnectedReLu", new DenseLayer.Builder().nOut(256).activation(RELU).l2Bias(l2RegParam).build, "valueReLu")
    .addLayer("valueSingleNode", new DenseLayer.Builder().nIn(256).nOut(1).activation(TANH).l2Bias(l2RegParam).build, "valueFullyConnectedReLu")
    .addLayer("valueOutput", new OutputLayer.Builder().lossFunction(NEGATIVELOGLIKELIHOOD).nOut(1).build, "valueSingleNode")
    .setOutputs("valueOutput", "policyOutput")
    .build()

  val network = new ComputationGraph(fullNetworkConf)
}
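On the aside in the policy-head comment ("Should this be the probit?"): for a vector of logit probabilities over the 362 moves, the usual alternative to MSE would be a softmax output trained with multi-class cross-entropy. A sketch of that drop-in replacement for the policyOutput layer; this is a suggestion, not what the report above uses, and assumes the additional MCXENT and SOFTMAX imports:

import org.deeplearning4j.nn.conf.layers.OutputLayer
import org.nd4j.linalg.activations.Activation.SOFTMAX
import org.nd4j.linalg.lossfunctions.LossFunctions.LossFunction.MCXENT

// Softmax over the 362 move logits with multi-class cross-entropy,
// instead of MSE on the raw dense outputs.
val softmaxPolicyOutput = new OutputLayer.Builder()
  .lossFunction(MCXENT)
  .activation(SOFTMAX)
  .nOut(362)
  .build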

// Calling getMemoryReport on the resulting configuration produces the exception below
val multiTaskNetwork = new MultiTaskNetwork()
val memReport = multiTaskNetwork.fullNetworkConf.getMemoryReport(InputType.convolutional(17, 19, 19))

Exception in thread "main" java.lang.IllegalStateException: Invalid input type (layer index = -1, layer name="policyFullyConnected"): expected FeedForward input type. Got: InputTypeConvolutional(h=17,w=19,d=2)
at org.deeplearning4j.nn.conf.layers.FeedForwardLayer.getOutputType(FeedForwardLayer.java:36)
at org.deeplearning4j.nn.conf.layers.DenseLayer.getMemoryReport(DenseLayer.java:71)
at org.deeplearning4j.nn.conf.graph.LayerVertex.getMemoryReport(LayerVertex.java:133)
at org.deeplearning4j.nn.conf.ComputationGraphConfiguration.getMemoryReport(ComputationGraphConfiguration.java:495)
at com.starfruit.Main$.main(Main.scala:50)
at com.starfruit.Main.main(Main.scala)
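Reading the stack trace, LayerVertex.getMemoryReport appears to hand the raw convolutional activation type straight to DenseLayer.getMemoryReport, without first applying the CnnToFeedForwardPreProcessor that setInputTypes attaches to that vertex at build time (the graph itself builds and runs, so the preprocessor is there). If that reading is right, the 20 residual blocks are irrelevant and a much smaller graph should fail the same way. An untested sketch of such a minimal reproduction (layer names arbitrary):

import org.deeplearning4j.nn.conf.NeuralNetConfiguration
import org.deeplearning4j.nn.conf.inputs.InputType
import org.deeplearning4j.nn.conf.layers.{ConvolutionLayer, OutputLayer}
import org.nd4j.linalg.lossfunctions.LossFunctions.LossFunction.MSE

// One convolution feeding one dense output layer. setInputTypes adds the
// CNN-to-feed-forward preprocessor, so the graph itself is valid.
val conf = new NeuralNetConfiguration.Builder()
  .graphBuilder()
  .addInputs("input")
  .setInputTypes(InputType.convolutional(17, 19, 19))
  .addLayer("conv", new ConvolutionLayer.Builder(Array(3, 3), Array(1, 1)).nOut(2).build, "input")
  .addLayer("out", new OutputLayer.Builder().lossFunction(MSE).nOut(10).build, "conv")
  .setOutputs("out")
  .build()

// Expected to throw the same IllegalStateException if the preprocessor
// is skipped on the memory-report path.
val report = conf.getMemoryReport(InputType.convolutional(17, 19, 19))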

Version info:
scalaVersion := "2.11.11"

val nd4jVersion = "0.9.1"
libraryDependencies += "org.deeplearning4j" % "deeplearning4j-core" % nd4jVersion
libraryDependencies += "org.nd4j" % "nd4s_2.11" % nd4jVersion
libraryDependencies += "org.nd4j" % "nd4j-native-platform" % nd4jVersion

@AlexDBlack
Contributor

#4326

lock bot commented Sep 24, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Sep 24, 2018