getMemoryReport produces an exception on a valid ComputationGraph #4223

Closed
bixbyr opened this issue Oct 30, 2017 · 2 comments

bixbyr commented Oct 30, 2017

Calling getMemoryReport on a valid ComputationGraph that has a single input and two outputs produces an exception.

// Network definition
import org.deeplearning4j.nn.conf.{NeuralNetConfiguration, WorkspaceMode}
import org.deeplearning4j.nn.conf.ConvolutionMode.Same
import org.deeplearning4j.nn.conf.graph.ElementWiseVertex
import org.deeplearning4j.nn.conf.inputs.InputType
import org.deeplearning4j.nn.conf.layers.{ActivationLayer, BatchNormalization, ConvolutionLayer, DenseLayer, OutputLayer}
import org.deeplearning4j.nn.graph.ComputationGraph
import org.nd4j.linalg.activations.Activation.{RELU, TANH}
import org.nd4j.linalg.learning.config.Nesterovs
import org.nd4j.linalg.lossfunctions.LossFunctions.LossFunction.{MSE, NEGATIVELOGLIKELIHOOD}

class MultiTaskNetwork {

  val l2RegParam = 10e-4
  val momentumParam = 0.9

  private val networkBaseConf = new NeuralNetConfiguration.Builder()
    .trainingWorkspaceMode(WorkspaceMode.SEPARATE)
    .learningRate(0.01)
    .updater(new Nesterovs(momentumParam))
    .graphBuilder()
    .setInputTypes(InputType.convolutional(17, 19, 19))
    .addInputs("input")
    .addLayer("baseConvolution", new ConvolutionLayer.Builder(Array(3, 3), Array(1, 1)).convolutionMode(Same).nIn(17).nOut(256).l2Bias(l2RegParam).build, "input")
    .addLayer("baseBatchNorm", new BatchNormalization.Builder().build, "baseConvolution")
    .addLayer("baseReLu", new ActivationLayer.Builder().activation(RELU).build, "baseBatchNorm")

  private val (networkTrunkConf, lastBlockOutput) = (1 to 20).foldLeft((networkBaseConf, "baseReLu")) {
    case ((partialNetworkConf, inputName), blockNumber) =>
      /*
      Each residual block applies the following modules sequentially to its input:
      (1) A convolution of 256 filters of kernel size 3 × 3 with stride 1
      (2) Batch normalization
      (3) A rectifier nonlinearity
      (4) A convolution of 256 filters of kernel size 3 × 3 with stride 1
      (5) Batch normalization
      (6) A skip connection that adds the input to the block
      (7) A rectifier nonlinearity
      */
      val confWithNextBlock = partialNetworkConf
        .addLayer(s"block${blockNumber}Conv1", new ConvolutionLayer.Builder(Array(3, 3), Array(1, 1)).convolutionMode(Same).nOut(256).l2Bias(l2RegParam).build, inputName)
        // .addLayer(s"block${blockNumber}Conv1", new ConvolutionLayer.Builder(Array(3, 3), Array(1, 1)).convolutionMode(Same).nIn(17).nOut(256).l2Bias(l2RegParam).build, inputName)
        .addLayer(s"block${blockNumber}BatchNorm1", new BatchNormalization.Builder().build, s"block${blockNumber}Conv1")
        .addLayer(s"block${blockNumber}ReLu1", new ActivationLayer.Builder().activation(RELU).build, s"block${blockNumber}BatchNorm1")
        .addLayer(s"block${blockNumber}Conv2", new ConvolutionLayer.Builder(Array(3, 3), Array(1, 1)).convolutionMode(Same).nOut(256).l2Bias(l2RegParam).build, s"block${blockNumber}ReLu1")
        // .addLayer(s"block${blockNumber}Conv2", new ConvolutionLayer.Builder(Array(3, 3), Array(1, 1)).convolutionMode(Same).nIn(17).nOut(256).l2Bias(l2RegParam).build, s"block${blockNumber}ReLu1")
        .addLayer(s"block${blockNumber}BatchNorm2", new BatchNormalization.Builder().build, s"block${blockNumber}Conv2")
        .addVertex(s"block${blockNumber}SkipConnection", new ElementWiseVertex(ElementWiseVertex.Op.Add), inputName, s"block${blockNumber}BatchNorm2")
        .addLayer(s"block${blockNumber}Output", new ActivationLayer.Builder().activation(RELU).build, s"block${blockNumber}SkipConnection")
      (confWithNextBlock, s"block${blockNumber}Output")
  }

  val fullNetworkConf = networkTrunkConf
    /*
    Policy head
    (1) A convolution of 2 filters of kernel size 1 × 1 with stride 1
    (2) Batch normalization
    (3) A rectifier nonlinearity
    (4) A fully connected linear layer that outputs a vector of size 19 × 19 + 1 = 362, corresponding to logit probabilities for all intersections and the pass move
    */
    .addLayer("policyConv", new ConvolutionLayer.Builder(Array(1, 1), Array(1, 1)).convolutionMode(Same).nOut(2).l2Bias(l2RegParam).build, lastBlockOutput)
    .addLayer("policyBatchNorm", new BatchNormalization.Builder().activation(RELU).build, "policyConv")
    .addLayer("policyReLu", new ActivationLayer.Builder().activation(RELU).build, "policyBatchNorm")
    .addLayer("policyFullyConnected", new DenseLayer.Builder().nOut(362).l2Bias(l2RegParam).build, "policyReLu")
    .addLayer("policyOutput", new OutputLayer.Builder().lossFunction(MSE).nOut(362).build, "policyFullyConnected")
    // Should this be the probit? (one alternative is sketched after the class)
    /*
    Value head
    (1) A convolution of 1 filter of kernel size 1 × 1 with stride 1
    (2) Batch normalization
    (3) A rectifier nonlinearity
    (4) A fully connected linear layer to a hidden layer of size 256
    (5) A rectifier nonlinearity
    (6) A fully connected linear layer to a scalar
    (7) A tanh nonlinearity outputting a scalar in the range [−1, 1]
    */
    .addLayer("valueHeadConv", new ConvolutionLayer.Builder(Array(1, 1), Array(1, 1)).convolutionMode(Same).nOut(1).l2Bias(l2RegParam).build, lastBlockOutput)
    .addLayer("valueBatchNorm", new BatchNormalization.Builder().activation(RELU).build, "valueHeadConv")
    .addLayer("valueReLu", new ActivationLayer.Builder().activation(RELU).build, "valueBatchNorm")
    .addLayer("valueFullyConnectedReLu", new DenseLayer.Builder().nOut(256).activation(RELU).l2Bias(l2RegParam).build, "valueReLu")
    .addLayer("valueSingleNode", new DenseLayer.Builder().nIn(256).nOut(1).activation(TANH).l2Bias(l2RegParam).build, "valueFullyConnectedReLu")
    .addLayer("valueOutput", new OutputLayer.Builder().lossFunction(NEGATIVELOGLIKELIHOOD).nOut(1).build, "valueSingleNode")
    .setOutputs("valueOutput", "policyOutput")
    .build()

  val network = new ComputationGraph(fullNetworkConf)
}
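On the aside in the policy-head comment ("Should this be the probit?"): for a vector of logit probabilities over the 362 moves, the usual alternative to MSE would be a softmax output trained with multi-class cross-entropy. A sketch of that drop-in replacement for the policyOutput layer; this is a suggestion, not what the report above uses, and assumes the additional MCXENT and SOFTMAX imports:

import org.deeplearning4j.nn.conf.layers.OutputLayer
import org.nd4j.linalg.activations.Activation.SOFTMAX
import org.nd4j.linalg.lossfunctions.LossFunctions.LossFunction.MCXENT

// Softmax over the 362 move logits with multi-class cross-entropy,
// instead of MSE on the raw dense outputs.
val softmaxPolicyOutput = new OutputLayer.Builder()
  .lossFunction(MCXENT)
  .activation(SOFTMAX)
  .nOut(362)
  .build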

// Calling getMemoryReport on the resulting configuration produces the exception below
val multiTaskNetwork = new MultiTaskNetwork()
val memReport = multiTaskNetwork.fullNetworkConf.getMemoryReport(InputType.convolutional(17, 19, 19))

Exception in thread "main" java.lang.IllegalStateException: Invalid input type (layer index = -1, layer name="policyFullyConnected"): expected FeedForward input type. Got: InputTypeConvolutional(h=17,w=19,d=2)
at org.deeplearning4j.nn.conf.layers.FeedForwardLayer.getOutputType(FeedForwardLayer.java:36)
at org.deeplearning4j.nn.conf.layers.DenseLayer.getMemoryReport(DenseLayer.java:71)
at org.deeplearning4j.nn.conf.graph.LayerVertex.getMemoryReport(LayerVertex.java:133)
at org.deeplearning4j.nn.conf.ComputationGraphConfiguration.getMemoryReport(ComputationGraphConfiguration.java:495)
at com.starfruit.Main$.main(Main.scala:50)
at com.starfruit.Main.main(Main.scala)
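Reading the stack trace, LayerVertex.getMemoryReport appears to hand the raw convolutional activation type straight to DenseLayer.getMemoryReport, without first applying the CnnToFeedForwardPreProcessor that setInputTypes attaches to that vertex at build time (the graph itself builds and runs, so the preprocessor is there). If that reading is right, the 20 residual blocks are irrelevant and a much smaller graph should fail the same way. An untested sketch of such a minimal reproduction (layer names arbitrary):

import org.deeplearning4j.nn.conf.NeuralNetConfiguration
import org.deeplearning4j.nn.conf.inputs.InputType
import org.deeplearning4j.nn.conf.layers.{ConvolutionLayer, OutputLayer}
import org.nd4j.linalg.lossfunctions.LossFunctions.LossFunction.MSE

// One convolution feeding one dense output layer. setInputTypes adds the
// CNN-to-feed-forward preprocessor, so the graph itself is valid.
val conf = new NeuralNetConfiguration.Builder()
  .graphBuilder()
  .addInputs("input")
  .setInputTypes(InputType.convolutional(17, 19, 19))
  .addLayer("conv", new ConvolutionLayer.Builder(Array(3, 3), Array(1, 1)).nOut(2).build, "input")
  .addLayer("out", new OutputLayer.Builder().lossFunction(MSE).nOut(10).build, "conv")
  .setOutputs("out")
  .build()

// Expected to throw the same IllegalStateException if the preprocessor
// is skipped on the memory-report path.
val report = conf.getMemoryReport(InputType.convolutional(17, 19, 19))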

Version info:
scalaVersion := "2.11.11"

val nd4jVersion = "0.9.1"
libraryDependencies += "org.deeplearning4j" % "deeplearning4j-core" % nd4jVersion
libraryDependencies += "org.nd4j" % "nd4s_2.11" % nd4jVersion
libraryDependencies += "org.nd4j" % "nd4j-native-platform" % nd4jVersion

@AlexDBlack
Contributor

#4326

lock bot commented Sep 24, 2018

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Sep 24, 2018