This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[R] How to return the activations of hidden nodes in a CNN, given a single MNIST image #1152

Closed
Lodewic opened this issue Jan 4, 2016 · 4 comments

@Lodewic
Contributor

Lodewic commented Jan 4, 2016

I am trying to find how to return the activations of convolutional layers (or any layer), given one image from the MNIST dataset.

The model is similar to this example: http://mxnet.readthedocs.org/en/latest/R-package/mnistCompetition.html

@antinucleon
Contributor

Although I don't use R, I think it is similar to Python. You can use a Group symbol to build a network with multiple outputs, e.g. `out = mx.sym.Group([act2, softmax])`. Then the first output of the network is `act2` and the second output is `softmax`.

@thirdwing
Contributor

@hetong007

@thirdwing thirdwing added the R label Jan 4, 2016
@Lodewic
Contributor Author

Lodewic commented Jan 11, 2016

It has been a week since I asked this, but for future reference I'd still like to post the solution I ended up using.

The documentation isn't much help yet: the 'executor' described in the Python documentation doesn't seem to have a documented working example for R. So I ended up using part of issue #1008, which contains some code relevant to my problem and points to a good example here: https://github.com/dmlc/mxnet/blob/master/R-package/R/model.R#L472

The model is practically the same as the one in the MNIST competition example for R (http://mxnet.readthedocs.org/en/latest/R-package/mnistCompetition.html), so the code below should slot straight into that tutorial.
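The code below assumes the data objects from that tutorial (train.array, train_lab, test.array, test_lab) already exist. For completeness, here is a rough sketch of how they can be built; the file path and the labeled hold-out split are my assumptions, so adjust them to your own setup:

```r
# A sketch of the data prep, adapted from the linked tutorial.
# Assumes a Kaggle-style train.csv with the label in the first column.
train <- data.matrix(read.csv("data/train.csv", header = TRUE))
# Hold out the last 5000 rows as a labeled test set (an assumption;
# the tutorial's test.csv has no labels)
test  <- train[(nrow(train) - 4999):nrow(train), ]
train <- train[1:(nrow(train) - 5000), ]
train_lab <- train[, 1]
test_lab  <- test[, 1]
# Scale pixels to [0, 1], one column per image, then reshape to
# 28 x 28 x 1 x N arrays as mx.model.FeedForward.create expects
train.x <- t(train[, -1] / 255)
test.x  <- t(test[, -1] / 255)
train.array <- train.x
dim(train.array) <- c(28, 28, 1, ncol(train.x))
test.array <- test.x
dim(test.array) <- c(28, 28, 1, ncol(test.x))
```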

Set up the model configuration

As I noted, this is a convolutional net, since I wanted to get the outputs of the convolutional and pooling layers.

# input layer
data <- mx.symbol.Variable('data')
# first convolutional layer
convLayer1 <- mx.symbol.Convolution(data=data, kernel=c(5,5), num_filter=30)
convAct1 <- mx.symbol.Activation(data=convLayer1, act_type="tanh")
poolLayer1 <- mx.symbol.Pooling(data=convAct1, pool_type="max", kernel=c(2,2), stride=c(2,2))
# second convolutional layer
convLayer2 <- mx.symbol.Convolution(data=poolLayer1, kernel=c(5,5), num_filter=60)
convAct2 <- mx.symbol.Activation(data=convLayer2, act_type="tanh")
poolLayer2 <- mx.symbol.Pooling(data=convAct2, pool_type="max",
                           kernel=c(2,2), stride=c(2,2))
# big hidden layer
flattenData <- mx.symbol.Flatten(data=poolLayer2)
hiddenLayer <- mx.symbol.FullyConnected(flattenData, num_hidden=500)
hiddenAct <- mx.symbol.Activation(hiddenLayer, act_type="tanh")
# softmax output layer
outLayer <- mx.symbol.FullyConnected(hiddenAct, num_hidden=10)
LeNet1 <- mx.symbol.SoftmaxOutput(outLayer)

Group symbols and creating an executor

The executor is bound on the CPU rather than the GPU, since for the model above my GPU would run out of memory if I also ran the executor on it. In hindsight, it of course looks a lot easier than it seemed at the time.

# Group some output layers for visual analysis
out <- mx.symbol.Group(c(convAct1, poolLayer1, convAct2, poolLayer2, LeNet1))
# Create an executor
executor <- mx.simple.bind(symbol=out, data=dim(test.array), ctx=mx.cpu())

Train the model

# Prepare for training the model
mx.set.seed(0)
# Set a logger to keep track of callback data
logger <- mx.metric.logger$new()
# Set gpu device
devices <- mx.gpu(1)
# Train model
model <- mx.model.FeedForward.create(LeNet1, X=train.array, y=train_lab,
                                     eval.data=list(data=test.array, label=test_lab),
                                     ctx=devices, 
                                     num.round=1, 
                                     array.batch.size=100,
                                     learning.rate=0.05, 
                                     momentum=0.9, 
                                     wd=0.00001,
                                     eval.metric=mx.metric.accuracy,
                                     epoch.end.callback=mx.callback.log.train.metric(100, logger))

Update the 'executor' parameters and outputs

After training, copy the trained weights and biases into the executor.
In this case there were no auxiliary parameters, but I still updated them in the executor for reference.

# Update parameters
mx.exec.update.arg.arrays(executor, model$arg.params, match.name=TRUE)
mx.exec.update.aux.arrays(executor, model$aux.params, match.name=TRUE)
# Select data to use
mx.exec.update.arg.arrays(executor, list(data=mx.nd.array(test.array)), match.name=TRUE)
# Do a forward pass with the current parameters and data
mx.exec.forward(executor, is.train=FALSE)
names(executor$ref.outputs)

These are the names in the output:
## [1] "activation0_output" "pooling0_output" "activation1_output"
## [4] "pooling1_output" "softmaxoutput0_output"
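For reference, with 28x28 inputs, 5x5 kernels (no padding) and 2x2 pooling with stride 2 as defined above, the output shapes can be worked out by hand; the expected values in the comments below are my own arithmetic, and a dim() call is a quick way to confirm them:

```r
# Quick sanity check of the grouped output shapes
for (name in names(executor$ref.outputs)) {
  cat(name, ":", dim(as.array(executor$ref.outputs[[name]])), "\n")
}
# Expected, with N = number of test images:
#   activation0_output     24 x 24 x 30 x N   (28 - 5 + 1 = 24)
#   pooling0_output        12 x 12 x 30 x N   (2x2 pooling, stride 2)
#   activation1_output      8 x  8 x 60 x N   (12 - 5 + 1 = 8)
#   pooling1_output         4 x  4 x 60 x N
#   softmaxoutput0_output  10 x N
```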

Visualize the output

So, in order to plot them (remember, I am using 28x28 pixel images from the MNIST dataset), you could run something like the code below to plot the output of the first 16 filters:

# Plot the filters of the 7th test example
par(mfrow=c(4,4), mar=c(0.1,0.1,0.1,0.1))
for (i in 1:16) {
    outputData <- as.array(executor$ref.outputs$activation0_output)[,,i,7]
    image(outputData,
          xaxt='n', yaxt='n',
          col=gray(seq(1,0,-0.1)))
}

I hope this might help other R users some time 👍

@thirdwing
Contributor

@Lodewic Excellent!

If you can add some background info, I'd be glad to add it as an example, just like the ones we have in https://github.com/dmlc/mxnet/tree/master/doc/R-package

You can send a PR when you have time!

Thank you!
