
Max Unpooling Layer must bind with a Max Pooling Layer for reconstruction. #612

Open

kdexd opened this issue Mar 9, 2017 · 12 comments

@kdexd (Contributor) commented Mar 9, 2017

I came across this issue while working on #608.

The current implementation of max_unpooling_layer is broken, and we didn't notice because it was untested. The max unpooling layer belongs to the category of upsampling layers, which reconstruct the inputs they receive back into the input image space.

As described by Zeiler and Fergus, max unpooling is essentially the backprop of a max pooling layer. The image below makes this clearer:

[image: max pooling and max unpooling diagram]

So, if a neural network has both max pooling and max unpooling layers, the max pooling layer should precede the max unpooling layer, and the two should share a common worker_storage: the max pooling layer records these "max locations" during its forward pass, and the max unpooling layer reads them during the same forward pass.
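To make that binding concrete, here is a minimal standalone sketch (illustrative names, not tiny-dnn's actual types) of a 1-D max pool that records its max locations and a max unpool that reads them back during the same forward pass:

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Index of the winning input position for each pooled output value.
using max_locations = std::vector<size_t>;

// 1-D max pooling with stride == window, recording the "switch" locations.
std::vector<float> max_pool_forward(const std::vector<float> &in, size_t window,
                                    max_locations &locs) {
  const size_t out_size = in.size() / window;
  std::vector<float> out(out_size);
  locs.resize(out_size);
  for (size_t o = 0; o < out_size; ++o) {
    float best = -std::numeric_limits<float>::infinity();
    for (size_t k = 0; k < window; ++k) {
      const size_t i = o * window + k;
      if (in[i] > best) { best = in[i]; locs[o] = i; }
    }
    out[o] = best;
  }
  return out;
}

// Max unpooling: scatter each value back to the position recorded by the
// paired pooling layer; all other positions stay zero.
std::vector<float> max_unpool_forward(const std::vector<float> &in,
                                      size_t unpooled_size,
                                      const max_locations &locs) {
  std::vector<float> out(unpooled_size, 0.0f);
  for (size_t o = 0; o < in.size(); ++o) out[locs[o]] = in[o];
  return out;
}
```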

The same reasoning extends to the deconvolution layer: it should depend on a corresponding convolution layer and borrow its filters to carry out the forward pass. The current implementation of deconvolution layers can be kept alongside this, with a constructor argument deciding which behaviour to use.

Reference:

Visualizing and Understanding Convolutional Networks, by Zeiler and Fergus: https://arxiv.org/abs/1311.2901

@kdexd (Contributor, Author) commented Mar 9, 2017

/ping: @beru @bhack @edgarriba @nyanp @Randl

@pliptor (Contributor) commented Mar 12, 2017

That's a nice diagram you drew. Is there an analogous operation in average_unpooling_layer as well? I'm going to read your reference.

@pliptor (Contributor) commented Mar 13, 2017

There might be something to be learned here, since this thread spans almost a year:

https://github.com/tensorflow/tensorflow/issues/2169

@kdexd (Contributor, Author) commented Mar 13, 2017

@wangyida and I discussed the implementation, and we thought of two approaches:

  1. Say there's a network of three alternating conv and pool layers (six in total). If we add a max unpool layer, it will share max locations with the last pooling layer in the network; we both agreed we have seen examples where unpooling layers appear after pooling layers in a symmetric fashion (see the pairing sketch below).

  2. We will always have to use a graph network when we add upsampling layers (deconv or unpool), with a branch depicting a direct connection between a pool and an unpool layer, showing that they have a common worker storage for the max locations.

Something similar to the second approach was proposed in Caffe, and it looks like the TF issue brings it up at some point as well.
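To illustrate the symmetric pairing in point 1, here is a rough sketch (hypothetical layer tags, not tiny-dnn code) that matches each unpooling layer with the most recent unmatched pooling layer:

```cpp
#include <cstddef>
#include <stack>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical per-layer tag; indices refer to positions in the network.
enum class kind { pool, unpool, other };

// Pair each unpool layer with the most recent unmatched pool layer,
// mirroring the symmetric pool ... unpool structure described above.
std::vector<std::pair<size_t, size_t>> pair_pool_unpool(const std::vector<kind> &layers) {
  std::stack<size_t> open_pools;
  std::vector<std::pair<size_t, size_t>> pairs;  // (pool index, unpool index)
  for (size_t i = 0; i < layers.size(); ++i) {
    if (layers[i] == kind::pool) {
      open_pools.push(i);
    } else if (layers[i] == kind::unpool) {
      if (open_pools.empty()) throw std::runtime_error("unpool without a matching pool");
      pairs.emplace_back(open_pools.top(), i);
      open_pools.pop();
    }
  }
  return pairs;
}
```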

@pliptor (Contributor) commented Mar 13, 2017

From my partial understanding, yes, I agree the max unpool layer must bind to a corresponding max pool layer to obtain the "switch locations", or "max locations" as you mention in 1. I realize, though, that I have some gaps in my understanding and need to read their deconvnet paper first.

It seems to me the goal is to "probe" or "project" a given layer back to image space for a particular input image in a trained network. Is it crazy to think, then, that from a user's perspective it is almost like adding a qualifier to the layer we wish to probe, instead of specifying an unpooling layer, and having a deconv method in place for the network? But my doubt probably comes from the missing bit of (deconvnet) information above. More reading...

@pliptor (Contributor) commented Mar 14, 2017

I think I'm starting to get a better idea now. Can you tell if the current max_pool_layer (not the unpool) keeps track of the switches so they can be used during backpropagation? cs231n recommends it for efficiency. I thought max_unpool_layer could reuse that code, or vice versa.

@kdexd (Contributor, Author) commented Mar 14, 2017

@pliptor Yes, that was the basis of our discussion. You can look for max_pool_layer_worker_storage in tiny_dnn/core/backend.h (tiny or avx, I can't remember, but it's in one of them).
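For context, this is roughly how stored switches get reused in the backward pass, which is the efficiency point cs231n makes (free functions with illustrative names, not the actual backend code):

```cpp
#include <cstddef>
#include <vector>

// Backward pass of max pooling using the stored switch locations: the
// gradient flows only to the input position that won the max, so the
// pooling window never has to be re-scanned.
std::vector<float> max_pool_backward(const std::vector<float> &grad_out,
                                     size_t in_size,
                                     const std::vector<size_t> &locs) {
  std::vector<float> grad_in(in_size, 0.0f);
  for (size_t o = 0; o < grad_out.size(); ++o) grad_in[locs[o]] += grad_out[o];
  return grad_in;
}
```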

@pliptor (Contributor) commented Mar 14, 2017

It is in backend_avx.h but not in backend_tiny.h

@kdexd (Contributor, Author) commented Mar 14, 2017

@pliptor: So that stores max locations from a pooling layer. We must connect an unpooling layer to a pooling layer in a graph network and make this storage a shared reference between both layers. Something close to this should do the job. 😉
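A minimal sketch of what that shared reference could look like (hypothetical structs, not the real layer classes): both layers hold the same shared_ptr, so whatever the pooling layer writes during its forward pass is visible to the unpooling layer later in the same pass.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-ins for the real layer classes.
struct max_pool_like {
  std::shared_ptr<std::vector<size_t>> locations;  // written during forward()
};

struct max_unpool_like {
  std::shared_ptr<std::vector<size_t>> locations;  // read during forward()
};

// Bind a pool/unpool pair to one shared max-locations buffer.
void bind(max_pool_like &pool, max_unpool_like &unpool) {
  auto shared = std::make_shared<std::vector<size_t>>();
  pool.locations = shared;
  unpool.locations = shared;
}
```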

@pliptor (Contributor) commented Mar 14, 2017

@karandesai-96 That seems to account for part of the job required by the paper you referred to. If I'm reading this right, there needs to be an independent data path to propagate the feature maps across the layers, and then a way for the user to dump out the values for the layers of interest. Maybe the same data path used for backpropagation can be reused if time-shared. Is that how you see it?

@kdexd (Contributor, Author) commented Mar 14, 2017

I was thinking of direct linkages, but what you suggest is a class that inherits from layer and sits in between the max pool and unpool layers; since such linkages are common to conv-deconv and avg pool-avg unpool layer pairs, we could use the same in-between layer for all of them. Did I get you right?

If I did, then your approach introduces extra memory requirements but generalizes well. I will let others decide, because it has been barely a month since I stepped into the C++ world (I was a Python developer), and I might miss some considerations in making such decisions right now.

@pliptor (Contributor) commented Mar 14, 2017

Say you have a serial network like this.

<image>
<layer1>
<layer2>
<layer3>
<output>

My understanding of the paper is that to probe layer2 you'd need something like this

<image>
<layer1>
<layer2> => <inverse operation> => <image projection>
<layer3> => <inverse operation>
<output> => <copy output>

So technically there appears to be something running in parallel, with a data flow from the bottom up for the probe system, at least up to the last layer of interest (layer 2 in this example). It seems almost like forward propagation in reverse, but I could be wrong. I'm not sure, though, whether max_unpool_layer was introduced with such an intent in tiny-dnn. I was mostly interested in knowing what functionality the user gets from this layer rather than the implementation. It might have been intended for use with an autoencoder instead... and not for visualization. I guess that's it.
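For what it's worth, the probe you describe could be shaped roughly like this (illustrative pseudocode, not tiny-dnn's API): start from the target layer's activations and apply the recorded inverse operations in reverse order until you are back in image space.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using tensor = std::vector<float>;
// One inverse operation per layer (unpool with the recorded switches, deconv
// with the layer's own filters, rectification), captured during the forward
// pass of one image.
using inverse_op = std::function<tensor(const tensor &)>;

// Project the activations of `target_layer` back to image space by undoing
// layers target_layer, target_layer - 1, ..., 0 in that order.
tensor project_to_image_space(tensor activations,
                              const std::vector<inverse_op> &inverses,
                              size_t target_layer) {
  for (size_t i = target_layer + 1; i-- > 0;) {
    activations = inverses[i](activations);
  }
  return activations;
}
```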
