
Forward/backward hooks for C++ torch::nn modules #25888

Open
yf225 opened this issue Sep 10, 2019 · 29 comments
Labels

  • module: autograd (Related to torch.autograd, and the autograd engine in general)
  • module: cpp (Related to C++ API)
  • triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments

yf225 (Contributor) commented Sep 10, 2019

🚀 Feature

As part of the Python/C++ API parity work, we would like to provide forward/backward hooks for C++ torch::nn modules:
(left: Python API, right: C++ API)

  • torch.nn.Module.register_forward_pre_hook -> torch::nn::Module::register_forward_pre_hook
  • torch.nn.Module.register_forward_hook -> torch::nn::Module::register_forward_hook
  • torch.nn.Module.register_backward_hook -> torch::nn::Module::register_backward_hook

Possible Steps

  • Implement torch::utils::hooks::RemovableHandle in the C++ API, mirroring torch.utils.hooks.RemovableHandle in the Python API (a rough sketch follows this list).
  • Implement register_forward_pre_hook, register_forward_hook and register_backward_hook methods for torch::nn::Module.
    • We might need to add _backward_hooks, _forward_pre_hooks and _forward_hooks to torch::nn::Module (torch/csrc/api/include/torch/nn/module.h).
  • Add appropriate C++ tests for the new API, similar to Python tests.
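
Below is a rough, hedged sketch of what the first two steps could look like. The hook signatures are simplified to Tensor-in / Tensor-out (the real design would have to cope with arbitrary forward() signatures; see the discussion further down), and apart from RemovableHandle every name here is hypothetical, not part of the actual implementation:

```cpp
// Sketch only, not PyTorch's actual implementation. Hook signatures are
// simplified to Tensor-in / Tensor-out for illustration.
#include <torch/torch.h>

#include <cstdint>
#include <functional>
#include <map>
#include <utility>

// Mirror of torch.utils.hooks.RemovableHandle: remove() unregisters the hook.
class RemovableHandle {
 public:
  explicit RemovableHandle(std::function<void()> remove_fn)
      : remove_fn_(std::move(remove_fn)) {}
  void remove() {
    if (remove_fn_) {
      remove_fn_();
      remove_fn_ = nullptr;
    }
  }

 private:
  std::function<void()> remove_fn_;
};

// Hypothetical module base holding the _forward_pre_hooks / _forward_hooks
// dictionaries that the steps above propose adding to torch::nn::Module.
class ModuleWithHooks : public torch::nn::Module {
 public:
  using ForwardPreHook = std::function<void(const torch::Tensor& input)>;
  using ForwardHook =
      std::function<void(const torch::Tensor& input, const torch::Tensor& output)>;

  RemovableHandle register_forward_pre_hook(ForwardPreHook hook) {
    const uint64_t id = next_id_++;
    forward_pre_hooks_.emplace(id, std::move(hook));
    return RemovableHandle([this, id] { forward_pre_hooks_.erase(id); });
  }

  RemovableHandle register_forward_hook(ForwardHook hook) {
    const uint64_t id = next_id_++;
    forward_hooks_.emplace(id, std::move(hook));
    return RemovableHandle([this, id] { forward_hooks_.erase(id); });
  }

 protected:
  // A derived module's forward() would call these around its own body.
  void run_forward_pre_hooks(const torch::Tensor& input) {
    for (auto& entry : forward_pre_hooks_) entry.second(input);
  }
  void run_forward_hooks(const torch::Tensor& input, const torch::Tensor& output) {
    for (auto& entry : forward_hooks_) entry.second(input, output);
  }

 private:
  // std::map keyed by an increasing id keeps registration order, similar to
  // the OrderedDict-based hooks on the Python side.
  uint64_t next_id_ = 0;
  std::map<uint64_t, ForwardPreHook> forward_pre_hooks_;
  std::map<uint64_t, ForwardHook> forward_hooks_;
};
```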

cc @ezyang @ssnl @albanD @zou3519 @yf225

yf225 added the module: cpp and module: autograd labels on Sep 10, 2019
yf225 (Contributor, Author) commented Sep 10, 2019

cc. @pbelevich

albanD (Collaborator) commented Sep 10, 2019

Hi,

You might want to wait for the backward hooks in the Python version to be fixed, and then only implement the equivalent of the fixed version in C++?

yf225 (Contributor, Author) commented Sep 10, 2019

@albanD That sounds great. Do you know if there is a tracking issue for fixing the backward hooks for the Python version?

ezyang (Contributor) commented Sep 10, 2019

Note that @malvika2147 implemented hooks on variables.

albanD (Collaborator) commented Sep 10, 2019

@yf225 I would say #598 is the base issue for backward hooks on nn.Modules.

fmassa added the triaged label on Sep 13, 2019
ssbotelh commented:

The authors of this paper (https://arxiv.org/abs/1706.02677) claim to achieve great scaling on ResNet50 training by, among other things, overlapping gradient computation and communication:

"as soon as the gradient for a layer is computed, it is aggregated across workers, while gradient computation for the next layer continues".

If I understand it correctly, this would only be possible using something like register_backward_hook on modules. Since it seems like it'll be a while before this feature is fixed in Python (#598 and #12331) and makes its way to libtorch, can anyone think of an alternative way of being notified when the gradient computation is done for a module? Thanks in advance.

albanD (Collaborator) commented Oct 21, 2019

Hi,

I don't really see why the hooks on Module could help you do that.
But both DataParallel and DistributedDataParallel already overlap computation and communication if I remember correctly.

ssbotelh commented Oct 21, 2019

Hi @albanD, thanks for your comment.

DataParallel only works on GPUs, which is not what I'm interested in. Besides, it still doesn't seem to work correctly on C++ libtorch, probably due to lack of NCCL integration -- see last comment in #18837.

I'm curious about your first statement. If a module hook can notify me that the gradient computation is done for layer N, I can dispatch the task of communicating that gradient among processes to a separate thread, while the main thread continues with the gradient for layer N+1. I assume that's what the authors of the above paper did in order to overlap computation and communication.

albanD (Collaborator) commented Oct 21, 2019

Hi,

The same thing can be done using hooks on Tensors with register_hook(). These work as expected.
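
For context, a minimal sketch of that approach in the C++ frontend, assuming the Tensor::register_hook API from #26568; aggregate_gradient and the hand-off to a communication thread are placeholders, not PyTorch APIs:

```cpp
// Sketch of the per-parameter gradient hook workaround. aggregate_gradient()
// stands in for whatever communication step (e.g. an allreduce on a separate
// thread) the training loop uses.
#include <torch/torch.h>

#include <string>

void aggregate_gradient(const std::string& name, const torch::Tensor& grad) {
  // Placeholder: enqueue `grad` for aggregation across workers here.
  (void)name;
  (void)grad;
}

void attach_grad_hooks(torch::nn::Module& model) {
  for (auto& param : model.named_parameters()) {
    if (!param.value().requires_grad()) {
      continue;
    }
    const std::string name = param.key();
    // The hook fires as soon as autograd has produced this parameter's
    // gradient, so the communication can overlap with the remainder of the
    // backward pass.
    param.value().register_hook([name](torch::Tensor grad) {
      aggregate_gradient(name, grad);
    });
  }
}
```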

ssbotelh commented:

Is this available in the C++ API? Could you please point me to the documentation? I haven't found any. Thanks!

albanD (Collaborator) commented Oct 21, 2019

The PR for it is here: #26568

yf225 (Contributor, Author) commented Nov 21, 2019

@albanD Curious: is register_forward_pre_hook also blocked by the need to fix backward_hooks, or are they independent of each other? Thanks!

albanD (Collaborator) commented Nov 21, 2019

Hi,

No, they are independent.
Both forward_hook and forward_pre_hook are easy to do and not problematic :)

ly1231cn commented:

@albanD Is register_forward_pre_hook available in the nightly build?

albanD (Collaborator) commented Dec 26, 2019

I don't think it has been added to the C++ API yet (it is only available in Python). cc @yf225 ?

nuka137 (Contributor) commented Jun 22, 2020

Has this feature been dropped?
I need register_forward_hook in order to output a model summary (number of parameters, tensor shapes, etc.).
It helps us debug models easily.
There is already a Python version which uses torch.nn.Module.register_forward_hook.
I think a similar API is also needed in C++.

albanD (Collaborator) commented Jun 22, 2020

It is not. We did not have time to get around to doing it.
We would be very happy to help prepare and accept a PR adding this feature though!

nuka137 (Contributor) commented Jun 22, 2020

@albanD

Thanks for your reply.
I'm interested in adding register_forward_hook to the C++ API.
BTW, register_backward_hook seems to be blocked by #25888 (comment).
So, is it acceptable if I implement only register_forward_hook?

albanD (Collaborator) commented Jun 22, 2020

Yes, the forward hook and forward pre-hook alone would be sufficient.
More work needs to be done for the backward hooks, I'm afraid.

nuka137 (Contributor) commented Jun 30, 2020

After working on this for a week, I found it is difficult to expose the hook function as simply as in Python.

I think we need to add the hook function call to this method.
But the inputs/outputs of the forward function are not consistent across modules, so we need to expose the API arguments explicitly (or use a template?).
(e.g. ASMoutput is the output of AdaptiveLogSoftmaxWithLoss, while Tensor is the output of DropoutImpl, and so on.)

This hook function should be similar to the Python hook function, but I don't have a good idea of how to realize this.
I would appreciate any advice on this.
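
To make the mismatch concrete, here is a small example with two existing torch::nn modules whose forward() types differ; the options and shapes below are only illustrative:

```cpp
#include <torch/torch.h>

#include <iostream>

int main() {
  torch::Tensor x = torch::randn({4, 8});

  // DropoutImpl::forward takes a Tensor and returns a Tensor.
  torch::nn::Dropout dropout(0.5);
  torch::Tensor y = dropout->forward(x);

  // AdaptiveLogSoftmaxWithLossImpl::forward takes (input, target) and
  // returns an ASMoutput struct rather than a plain Tensor.
  torch::nn::AdaptiveLogSoftmaxWithLoss head(
      torch::nn::AdaptiveLogSoftmaxWithLossOptions(
          /*in_features=*/8, /*n_classes=*/10, /*cutoffs=*/{4, 8}));
  torch::Tensor target = torch::randint(0, 10, {4}, torch::kLong);
  torch::nn::ASMoutput out = head->forward(x, target);

  // A single std::function-based hook signature cannot cover both cases,
  // which is why some form of templating is being discussed.
  std::cout << y.sizes() << " " << out.output.sizes() << std::endl;
  return 0;
}
```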

albanD (Collaborator) commented Jun 30, 2020

My guess is that some template magic will be needed here, yes.
The main features that are needed here are:

  • Call the user hooks before and after the forward call. One takes the Args... as input and the other one takes torch::detail::return_type_of_forward_t<Contained, Args...>.
  • Allow the user to change the values in these hooks. This one I'm not sure how to do.

cc @yf225 do you have an idea of how this can be done?
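
As a rough illustration of the first point (not a design proposal), the output type of the contained module's forward() can be deduced with the existing torch::detail::return_type_of_forward_t trait; forward_with_hooks below is a hypothetical helper, not part of the C++ API:

```cpp
// Rough illustration only. The trait deduces the per-module output type, so
// the post-hook sees a Tensor for Dropout but an ASMoutput for
// AdaptiveLogSoftmaxWithLoss.
#include <torch/torch.h>

#include <utility>

template <typename Contained, typename PreHook, typename PostHook, typename... Args>
auto forward_with_hooks(Contained& module,
                        PreHook&& pre_hook,
                        PostHook&& post_hook,
                        Args&&... args)
    -> torch::detail::return_type_of_forward_t<Contained, Args...> {
  // The pre-hook sees the inputs exactly as forward() will receive them.
  pre_hook(args...);
  auto output = module.forward(std::forward<Args>(args)...);
  // The post-hook sees the output. Allowing hooks to replace the output (as
  // the Python hooks can) would require the hook to return a value of this
  // same type, which is the open question above.
  post_hook(output);
  return output;
}

// Usage sketch:
//   torch::nn::Dropout dropout(0.5);
//   auto out = forward_with_hooks(
//       *dropout,
//       [](const torch::Tensor& in) { /* forward pre-hook */ },
//       [](const torch::Tensor& out) { /* forward hook */ },
//       torch::randn({2, 3}));
```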

yf225 (Contributor, Author) commented Jun 30, 2020

I had a draft PR that might be able to do it: #30484. I haven't gotten it to the finish line, but maybe it can serve as inspiration :) also cc @glaringlee

glaringlee (Contributor) commented Jun 30, 2020

@yf225 Thanks, I'll take a look at it and continue working on it.
@nuka137
I will prioritize this and hope we can finish the forward pre-hook later this year. We are currently prioritizing the work to add the transformer layers to the PyTorch C++ API.

nuka137 (Contributor) commented Jun 30, 2020

@glaringlee

Thanks! I hope this feature lands in the future.

FlyinTeller commented:

@glaringlee @yf225 Is there any plan to finish this functionality in the near future? I would very much like to use spectral normalization from the C++ API, but it depends on register_forward_pre_hook being available.

bjaeger1 commented:

Is there any progress on the implementation of register_forward_hook in the C++ API? I would need it to extract feature maps from certain layers of a trained model.

spiegelball commented:

I am also very curious about this.

ezyang (Contributor) commented Feb 10, 2023

There are currently no plans from the core team to add this. If someone can come up with a short PR that adds the feature, I can help review it, but it may not be so easy to play the same tricks in C++ that you can do in Python.

Tihaw commented Jun 28, 2023

Are backward hooks supported by TorchScript modules by now?
