Pool and uppool with switch #2022

Open
ChienliMa opened this Issue Aug 7, 2014 · 17 comments


@ChienliMa
Contributor

In Zeiler's paper "Visualizing and Understanding Convolutional Networks":
http://www.matthewzeiler.com/pubs/arxive2013/arxive2013.pdf

Operations like switch_pool() and switch_uppool() are needed to visualize a CNN. I did not find a similar function in Theano (or maybe I missed it).

Currently I have an implementation using NumPy. If other people need such an operation, I may try implementing it in Theano. Yet it seems a little difficult for me, since there is a lot of C code in the pooling functions.

Need help.

@nouiz
Member
nouiz commented Aug 7, 2014

The uppool seems to be the gradient of the max_pool operation. We have it in Theano. Normally people don't instantiate it manually, so it isn't as well documented.

Check the class DownsampleFactorMaxGrad in this file:

https://github.com/Theano/Theano/blob/master/theano/tensor/signal/downsample.py

I can't find switch_pool and switch_uppool in that paper. I think that switch_pool is just max_pool. If not, what is the difference?

I think it used Alex K.'s code. If so, that code is wrapped in Pylearn2, so if they didn't add extra GPU code, we should have that code available in Theano or Pylearn2.

Pylearn2 has some code for visualization. Maybe you can ask on its mailing list; you might get more information there.

Fred


@ChienliMa
Contributor

Thanks for your reply.

The difference is that pooling with switches generates a map recording the exact locations the max values come from, and this information is then used in the uppooling operation to generate the output. These operations are not introduced in this paper but in an earlier paper about the deconvnet; this paper uses the deconvnet to visualize the CNN.
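Roughly, here is a minimal NumPy sketch of the idea (a toy version with non-overlapping square windows only, just to illustrate; not my actual implementation):

```python
import numpy as np

def pool_with_switches(x, size=2):
    """Non-overlapping max pooling that also returns the 'switches':
    the flat index of the max inside each pooling window."""
    h, w = x.shape
    out = np.zeros((h // size, w // size), dtype=x.dtype)
    switches = np.zeros(out.shape, dtype=int)
    for i in range(h // size):
        for j in range(w // size):
            window = x[i * size:(i + 1) * size, j * size:(j + 1) * size]
            switches[i, j] = window.argmax()
            out[i, j] = window.max()
    return out, switches

def unpool_with_switches(pooled, switches, size=2):
    """Place each pooled value back at the position its switch recorded,
    leaving all other positions at zero."""
    h, w = pooled.shape
    out = np.zeros((h * size, w * size), dtype=pooled.dtype)
    for i in range(h):
        for j in range(w):
            di, dj = divmod(switches[i, j], size)
            out[i * size + di, j * size + dj] = pooled[i, j]
    return out
```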

I will check Pylearn2.

Thanks again.


@f0k
Contributor
f0k commented Aug 8, 2014

I'd suggest you look into the paper by Karen Simonyan et al.: http://arxiv.org/abs/1312.6034
They show (Sect. 4) that Zeiler's method is closely related to computing the gradient of the network output wrt. the input image, which is very easy to do in Theano: just define the forward pass, then use theano.grad(network_output[the_class_you_are_interested_in], wrt=network_input). The generated graph already contains the "uppooling" operation you mention.
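For example, a minimal sketch (here `build_network` is only a placeholder for however you define your model; it is assumed to return a matrix of class scores):

```python
import theano
import theano.tensor as T

input_var = T.tensor4('input')      # (batch, channels, rows, cols)
# build_network is a placeholder for your own model definition.
scores = build_network(input_var)   # (batch, n_classes) class scores

class_of_interest = 0               # index of the class you want to visualize
# theano.grad needs a scalar cost, so take the score of that class for image 0.
saliency = theano.grad(scores[0, class_of_interest], wrt=input_var)

compute_saliency = theano.function([input_var], saliency)
```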

PS: If you have a softmax output layer, you will probably want to compute the gradient of the input to this layer rather than the gradient of its output, as in Simonyan's paper. Otherwise you will not only highlight regions in the image that were most relevant to detect your class of interest, but also regions that were most relevant to not detect any of the other classes.

@nouiz
Member
nouiz commented Sep 15, 2014

I don't see anything left to do about this. Is that the case? Can we close this ticket?

@f0k
Contributor
f0k commented Oct 15, 2015

They show (Sect. 4) that Zeiler's method is closely related to computing the gradient of the network output wrt. the input image, which is very easy to do in Theano: [...]

There's example code for Zeiler's method, Simonyan's method and Springenberg's method here: https://github.com/Lasagne/Recipes/blob/master/examples/Saliency%20Maps%20and%20Guided%20Backpropagation.ipynb
It uses Lasagne for defining the network, but should be easy to adapt to whatever you're using.

@nouiz, this can definitely be closed, Theano has everything that's needed.

@nouiz
Member
nouiz commented Oct 15, 2015

Should we add an unpool function in Theano to help build that? It has come up a few times and it doesn't seem trivial for people to figure out how to do it.

This is in the same spirit as the function max_pool_2d_same_size

http://www.deeplearning.net/software/theano/library/tensor/signal/downsample.html#theano.tensor.signal.downsample.max_pool_2d_same_size

What do you think about adding such an unpool method?


@f0k
Contributor
f0k commented Oct 15, 2015

Should we add an unpool function in Theano to help build that?

To implement what we're discussing here (the DeconvNet of Zeiler), you actually have to use the gradient of an existing pooling Op, not just something that does unpooling (because you need to use the correct pooling switches). An unpool() function wouldn't help for that.
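For illustration, here is a minimal sketch of what I mean, using the downsample module this issue already points to (just one way to express it, not an official recipe):

```python
import theano
import theano.tensor as T
from theano.tensor.signal.downsample import max_pool_2d

x = T.tensor4('x')                      # activations entering the pooling layer
pooled = max_pool_2d(x, ds=(2, 2), ignore_border=True)

# Signal to send backwards through the pooling switches (for a deconvnet,
# the reconstruction arriving at the pooled layer).
signal = T.tensor4('signal')

# Lop(f, wrt, v) computes v * df/dwrt, i.e. it routes `signal` back to the
# positions the forward pass selected as maxima -- the pooling switches.
unpooled = T.Lop(pooled, x, signal)

route_back = theano.function([x, signal], unpooled)
```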

It might be nice to have an unpooling function for convolutional auto-encoders, though. In Lasagne, we've added an Upscale2DLayer, but so far it supports upscaling by repetition only, and it would be nice to have it support bed-of-nails -- that's something that could be moved to a Theano function as well.

@nouiz
Member
nouiz commented Oct 15, 2015

I see that it uses repeat. But what do you mean by supporting "bed-of-nails"? Google doesn't give me a related explanation. :)


@f0k
Contributor
f0k commented Oct 15, 2015

But what do you mean by supporting "bed-of-nails"?

When you uppool by always copying the input value into the top left corner of its corresponding output region, leaving everything else at zero. The resulting output will look like a bed of nails. Some papers do it like that when the pooling switches are unknown. (I wonder what happens for negative values, though; maybe that case is ruled out by using rectifiers.)
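A rough Theano sketch of that (my own naming, nothing that exists as an op today):

```python
import theano.tensor as T

def bed_of_nails_upscale_2d(x, factor):
    """Upscale the two trailing dimensions of a 4D tensor by `factor`,
    writing each value into the top-left corner of its output block and
    leaving everything else at zero."""
    shp = x.shape
    out = T.zeros((shp[0], shp[1], shp[2] * factor, shp[3] * factor),
                  dtype=x.dtype)
    return T.set_subtensor(out[:, :, ::factor, ::factor], x)
```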

@ChienliMa
Contributor

@nouiz you can close this issue, since what I asked about can be implemented using existing features of Theano.
Thanks @f0k for your tutorial. :)

@Sentient07
Contributor

Hi @f0k, I have a question regarding this.

When you uppool by always copying the input value into the top left corner of its corresponding output region, leaving everything else at zero. The resulting output will look like a bed of nails. Some papers do it like that when the pooling switches are unknown. (I wonder what happens for negative values, though; maybe that case is ruled out by using rectifiers.)

When you say copying the input, we should have the original locations (indices) of the max values stored somewhere, right? Or could we perform the unpooling operation by using the gradient of max pooling (using DownsampleFactorMaxGrad)? But even in that case, I think we need to have the indices of the pooled values stored somewhere, right?

@f0k
Contributor
f0k commented Nov 23, 2016

I think you're mixing up some things.

But what do you mean by supporting "bed-of-nails"?

When you uppool by always copying the input value into the top left corner of its corresponding output region, leaving everything else at zero.

When you say copying the input, we should have the original locations (indices) of the max values stored somewhere, right?

No, in that message, I was replying to Frédéric's question, which was about generic unpooling, not about the backward pass of max-pooling. For the backward pass of max-pooling, you need theano.grad of a max-pooling forward pass. This will take care of using the same pooling switches. (In Lasagne, you can use an InverseLayer to unpool a MaxPool2DLayer, this creates the correct theano.grad expression for you.)
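A minimal Lasagne sketch (toy shapes, just to show the wiring):

```python
import lasagne
from lasagne.layers import InputLayer, MaxPool2DLayer, InverseLayer

l_in = InputLayer((None, 1, 28, 28))
l_pool = MaxPool2DLayer(l_in, pool_size=(2, 2))
# First argument: the layer whose output to propagate back (here the pooled
# activations themselves); second argument: the layer to invert.
l_unpool = InverseLayer(l_pool, l_pool)

# The resulting expression is the theano.grad of the pooling forward pass.
unpooled = lasagne.layers.get_output(l_unpool)
```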

@Sentient07
Contributor

In Lasagne, you can use an InverseLayer to unpool a MaxPool2DLayer, this creates the correct theano.grad expression for you.

Yes! This was what I wanted to know. Sorry if my question is elementary, but from what I understand, the switches store the indices the pooled activations come from, right? Is there a way to store/access these indices in Theano?

@f0k
Contributor
f0k commented Nov 23, 2016

Sorry if my question is elementary, but from what I understand, the switches store the indices the pooled activations come from, right? Is there a way to store/access these indices in Theano?

No, they're just a concept to understand what's going on. In practice, they're not explicitly computed and kept for the backward pass -- instead, the backward pass finds where to propagate output gradients to by comparing the forward pass inputs to the forward pass outputs.
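Conceptually (a toy NumPy illustration of that comparison, not Theano's actual implementation):

```python
import numpy as np

window = np.array([[0.2, 0.9],
                   [0.5, 0.1]])
pooled = window.max()                    # forward output for this window: 0.9
grad_out = 1.0                           # gradient arriving at the pooled value
# The backward pass compares inputs against the output to find the max
# position on the fly; no switch array is kept between the two passes.
grad_in = (window == pooled) * grad_out
```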

@Sentient07
Contributor

Perfect! Thank you, Jan!

@fsalmasri

@f0k I built a function to create the switches and then feed them to the right place in the backward phase, but the implementation was very slow. If I understood you correctly, there is no need to implement these switches. Could you give me an example please?

@f0k
Contributor
f0k commented Feb 9, 2017

If I understood you correctly, there is no need to implement these switches.

Exactly. An efficient implementation already exists because it's needed for the max-pooling gradient.

Could you give me and example please?

What are you trying to achieve?
