This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[R] How to return the activations of hidden nodes in a CNN, given a single MNIST image #1152

Closed
Lodewic opened this issue Jan 4, 2016 · 4 comments

@Lodewic
Contributor

Lodewic commented Jan 4, 2016

I am trying to find how to return the activations of convolutional layers (or any layer), given one image from the MNIST dataset.

The model is similar to this example: http://mxnet.readthedocs.org/en/latest/R-package/mnistCompetition.html

@antinucleon
Contributor

Although I don't use R, I think it is similar to Python. You can use a Group symbol to build a network with multiple outputs, e.g. `out = mx.sym.Group([act2, softmax])`. Then the first output of the network is `act2` and the second output is `softmax`.

@thirdwing
Contributor

@hetong007

@thirdwing thirdwing added the R label Jan 4, 2016
@Lodewic
Contributor Author

Lodewic commented Jan 11, 2016

It has been a week since I asked this, but for future reference I'd still like to post the solution I ended up using.

The documentation isn't much help yet: the 'executor' described in the Python documentation doesn't seem to have a documented working example for R. So I ended up using part of issue #1008, which contains some code relevant to my problem and points to a good example here: https://github.com/dmlc/mxnet/blob/master/R-package/R/model.R#L472

The model is practically the same as the one in the MNIST competition example for R (http://mxnet.readthedocs.org/en/latest/R-package/mnistCompetition.html), so the code below should slot straight into that tutorial.
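The code below assumes the data objects from that tutorial (train.array, train_lab, test.array, test_lab) already exist. For completeness, here is a rough sketch of how they can be built; the file path and the labeled hold-out split are my assumptions, so adjust them to your own setup:

```r
# A sketch of the data prep, adapted from the linked tutorial.
# Assumes a Kaggle-style train.csv with the label in the first column.
train <- data.matrix(read.csv("data/train.csv", header = TRUE))
# Hold out the last 5000 rows as a labeled test set (an assumption;
# the tutorial's test.csv has no labels)
test  <- train[(nrow(train) - 4999):nrow(train), ]
train <- train[1:(nrow(train) - 5000), ]
train_lab <- train[, 1]
test_lab  <- test[, 1]
# Scale pixels to [0, 1], one column per image, then reshape to
# 28 x 28 x 1 x N arrays as mx.model.FeedForward.create expects
train.x <- t(train[, -1] / 255)
test.x  <- t(test[, -1] / 255)
train.array <- train.x
dim(train.array) <- c(28, 28, 1, ncol(train.x))
test.array <- test.x
dim(test.array) <- c(28, 28, 1, ncol(test.x))
```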

Set up the model configuration

As I noted, this is a convolutional net, since I wanted to get the outputs of the convolutional and pooling layers.

# input layer
data <- mx.symbol.Variable('data')
# first convolutional layer
convLayer1 <- mx.symbol.Convolution(data=data, kernel=c(5,5), num_filter=30)
convAct1 <- mx.symbol.Activation(data=convLayer1, act_type="tanh")
poolLayer1 <- mx.symbol.Pooling(data=convAct1, pool_type="max", kernel=c(2,2), stride=c(2,2))
# second convolutional layer
convLayer2 <- mx.symbol.Convolution(data=poolLayer1, kernel=c(5,5), num_filter=60)
convAct2 <- mx.symbol.Activation(data=convLayer2, act_type="tanh")
poolLayer2 <- mx.symbol.Pooling(data=convAct2, pool_type="max",
                           kernel=c(2,2), stride=c(2,2))
# big hidden layer
flattenData <- mx.symbol.Flatten(data=poolLayer2)
hiddenLayer <- mx.symbol.FullyConnected(flattenData, num_hidden=500)
hiddenAct <- mx.symbol.Activation(hiddenLayer, act_type="tanh")
# softmax output layer
outLayer <- mx.symbol.FullyConnected(hiddenAct, num_hidden=10)
LeNet1 <- mx.symbol.SoftmaxOutput(outLayer)

Group symbols and creating an executor

The executor is bound on the CPU rather than the GPU, since for the model above my GPU would run out of memory if I also ran the executor on it. In hindsight, it of course looks a lot easier than it seemed at the time.

# Group some output layers for visual analysis
out <- mx.symbol.Group(c(convAct1, poolLayer1, convAct2, poolLayer2, LeNet1))
# Create an executor
executor <- mx.simple.bind(symbol=out, data=dim(test.array), ctx=mx.cpu())

Train the model

# Prepare for training the model
mx.set.seed(0)
# Set a logger to keep track of callback data
logger <- mx.metric.logger$new()
# Set gpu device
devices <- mx.gpu(1)
# Train model
model <- mx.model.FeedForward.create(LeNet1, X=train.array, y=train_lab,
                                     eval.data=list(data=test.array, label=test_lab),
                                     ctx=devices, 
                                     num.round=1, 
                                     array.batch.size=100,
                                     learning.rate=0.05, 
                                     momentum=0.9, 
                                     wd=0.00001,
                                     eval.metric=mx.metric.accuracy,
                                     epoch.end.callback=mx.callback.log.train.metric(100, logger))

Update the 'executor' parameters and outputs

After training, copy the trained weights and biases into the executor.
In this case there were no auxiliary parameters, but I still updated them in the executor for reference.

# Update parameters
mx.exec.update.arg.arrays(executor, model$arg.params, match.name=TRUE)
mx.exec.update.aux.arrays(executor, model$aux.params, match.name=TRUE)
# Select data to use
mx.exec.update.arg.arrays(executor, list(data=mx.nd.array(test.array)), match.name=TRUE)
# Do a forward pass with the current parameters and data
mx.exec.forward(executor, is.train=FALSE)
names(executor$ref.outputs)

These are the names in the output:
## [1] "activation0_output" "pooling0_output" "activation1_output"
## [4] "pooling1_output" "softmaxoutput0_output"
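For reference, with 28x28 inputs, 5x5 kernels (no padding) and 2x2 pooling with stride 2 as defined above, the output shapes can be worked out by hand; the expected values in the comments below are my own arithmetic, and a dim() call is a quick way to confirm them:

```r
# Quick sanity check of the grouped output shapes
for (name in names(executor$ref.outputs)) {
  cat(name, ":", dim(as.array(executor$ref.outputs[[name]])), "\n")
}
# Expected, with N = number of test images:
#   activation0_output     24 x 24 x 30 x N   (28 - 5 + 1 = 24)
#   pooling0_output        12 x 12 x 30 x N   (2x2 pooling, stride 2)
#   activation1_output      8 x  8 x 60 x N   (12 - 5 + 1 = 8)
#   pooling1_output         4 x  4 x 60 x N
#   softmaxoutput0_output  10 x N
```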

Visualize the output

So, in order to plot them (remember, I am using 28x28 pixel images from the MNIST dataset), you could run something like the code below to plot the output of the first 16 filters:

# Plot the filters of the 7th test example
par(mfrow=c(4,4), mar=c(0.1,0.1,0.1,0.1))
for (i in 1:16) {
    outputData <- as.array(executor$ref.outputs$activation0_output)[,,i,7]
    image(outputData,
          xaxt='n', yaxt='n',
          col=gray(seq(1,0,-0.1)))
}

I hope this might help other R users some time 👍

@thirdwing
Contributor

@Lodewic Excellent!

If you can add some background info, I'd be glad to add it as an example, just like the ones we have in https://github.com/dmlc/mxnet/tree/master/doc/R-package

You can send a PR when you have time!

Thank you!
