Implement mean-pooling neural network operation #106

Open · rsokl opened this issue Jan 22, 2019 · 2 comments

rsokl (Owner) commented Jan 22, 2019

Okay, so this might not exactly be a "good first issue": it is a little more advanced, but is still very much accessible to newcomers.

Similar to the mygrad.nnet.max_pool function, I would like there to be a mean-pooling layer. That is, a convolution-style window is strided over the input, and the mean is computed for each window. E.g. the following shows how mean-pooling should work on a shape-(3, 3) tensor, using a shape-(2, 2) pooling window strided with a step-size of 1 (both along the rows and the columns).

>>> import mygrad as mg
>>> x = mg.Tensor([[0., 1., 2.],
...                [3., 4., 5.],
...                [6., 7., 8.]])

# Forward Pass
>>> out = mean_pool(x, pool=(2, 2), stride=1)
>>> out
Tensor([[2., 3.],
        [5., 6.]])

# Backprop
>>> out.sum().backward()  # must backprop from a scalar, thus we sum `out`
>>> x.grad
array([[0.25, 0.5 , 0.25],
       [0.5 , 1.  , 0.5 ],
       [0.25, 0.5 , 0.25]])

Like max_pool, this function should accommodate N-dimensional tensors. mygrad.sliding_window_view makes short work of this. This function basically boils down to taking the appropriate sliding-window view of the underlying numpy array of the input tensor, and using numpy.mean to take the average over the trailing N dimensions that you want to pool over. This is much easier than max-pooling, since numpy.mean is able to accept multiple axes.
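
For concreteness, here is a minimal NumPy-only sketch of the 2D forward pass. It uses numpy.lib.stride_tricks.sliding_window_view in place of mygrad's version (it only produces stride-1 windows, so the window grid is subsampled to realize the stride), and mean_pool_forward is a placeholder name, not an existing function:

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def mean_pool_forward(x, pool, stride=1):
    # all stride-1 windows; shape: (H - ph + 1, W - pw + 1, ph, pw)
    windows = sliding_window_view(x, pool)
    # subsample the grid of windows to realize the requested stride
    windows = windows[::stride, ::stride]
    # average over the trailing (window) axes
    return windows.mean(axis=(-2, -1))

x = np.arange(9, dtype=float).reshape(3, 3)
print(mean_pool_forward(x, (2, 2), stride=1))
# [[2. 3.]
#  [5. 6.]]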

Try starting with the forward pass for the 1D and 2D cases only. I can help you generalize to N-dimensions if you get stuck. I am also happy to help derive the proper back-propagation for this.
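
As a hint for the backward pass: each window distributes its incoming gradient, scaled by 1/(window size), uniformly over the input elements it covers. A deliberately naive 2D sketch of that idea (mean_pool_backward is a hypothetical helper; a real implementation would vectorize the loops):

import numpy as np

def mean_pool_backward(grad_out, x_shape, pool, stride=1):
    # each window spreads grad / (window size) uniformly
    # over the input elements it covered
    dx = np.zeros(x_shape)
    ph, pw = pool
    scale = 1.0 / (ph * pw)
    for i in range(grad_out.shape[0]):
        for j in range(grad_out.shape[1]):
            r, c = i * stride, j * stride
            dx[r:r + ph, c:c + pw] += grad_out[i, j] * scale
    return dx

# reproduces `x.grad` from the doctest above
print(mean_pool_backward(np.ones((2, 2)), (3, 3), (2, 2), stride=1))
# [[0.25 0.5  0.25]
#  [0.5  1.   0.5 ]
#  [0.25 0.5  0.25]]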

davidmascharka (Collaborator) commented

Do you think it would be possible to provide a general-purpose pool function that can take an arbitrary reduction function? It would be nice, as you could then do:

y = mg.pool(x, pool=(2, 2), stride=2, fn=mg.mean)
z = mg.pool(x, pool=(2, 2), stride=2, fn=mg.max)
w = mg.pool(x, pool=(2, 2), stride=2, fn=mg.sum)

for example, and avoid a lot of bloat.
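
A hedged sketch of what the forward pass of such a generic pool might look like in plain NumPy, assuming fn is any NumPy-style reduction that accepts a tuple of axes (the name, signature, and use of np.mean/np.max in place of mg.mean/mg.max are all assumptions; this does not address backprop, which is the hard part discussed below):

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def pool(x, pool, stride, fn=np.mean):
    # works for any reduction that accepts a tuple `axis`
    # (np.mean, np.max, np.sum, ...)
    windows = sliding_window_view(x, pool)[::stride, ::stride]
    return fn(windows, axis=tuple(range(-len(pool), 0)))

x = np.arange(16, dtype=float).reshape(4, 4)
print(pool(x, (2, 2), stride=2, fn=np.max))
# [[ 5.  7.]
#  [13. 15.]]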

rsokl (Owner, Author) commented Dec 12, 2019

Warning: stream-of-consciousness ahead.

I have been thinking about this. And it all basically comes down to this line:

        np.add.at(dx, tuple(index), grad.reshape(*x.shape[:-num_pool], -1))

dx is the gradient to write to. index stores the locations at which to update the gradient (whose pooling axes have been flattened). And grad is the gradient being backpropped (also flattened to be commensurate with the contents of index).

The issue, then, is that max and min only accumulate one value per window, whereas sum/mean broadcast out to the entire window. It isn't super clear to me how to compute index in an a priori way that accommodates these various cases.
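
To illustrate why np.add.at is used here at all (rather than plain fancy indexing), a toy demo with made-up arrays: np.add.at is unbuffered, so repeated indices accumulate, which is exactly what overlapping windows need.

import numpy as np

dx = np.zeros(4)
index = np.array([0, 1, 1, 2])  # e.g. one argmax location per window
grad = np.array([1., 1., 1., 1.])

np.add.at(dx, index, grad)      # unbuffered: repeated indices accumulate
print(dx)                       # [1. 2. 1. 0.]

dx2 = np.zeros(4)
dx2[index] += grad              # buffered: the repeated index only counts once
print(dx2)                      # [1. 1. 1. 0.]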

Ultimately I think this comes down to: is there some useful, not-totally-inefficient way to fuse an operation with sliding-window-view that supports backprop? The totally naive approach would be: if you operate on N windows, then form a computational graph with N nodes and backprop through it. Clearly that is just too unwieldy.

It would be really neat to do some op-fusion with sliding-window-view that internally invokes the op's backprop machinery over the windows in a not-dumb way. This would be a super nice win, and mygrad would actually kind of be the best. I should really think about this.

@petarmhg you might be interested in this convo.
