Forward/backward hooks for C++ torch::nn modules #25888
Comments
cc. @pbelevich |
Hi, You might want to wait for the |
@albanD That sounds great. Do you know if there is a tracking issue for fixing the backward hooks for the Python version? |
Note that @malvika2147 implemented hooks on variables. |
The authors of this paper (https://arxiv.org/abs/1706.02677) claim to achieve great scaling on ResNet50 training by, among other things, overlapping gradient computation and communication:
If I understand it correctly, this would only be possible using something like |
Hi, I don't really see why the hooks on Module could help you do that. |
Hi @albanD, thanks for your comment. DataParallel only works on GPUs, which is not what I'm interested in. Besides, it still doesn't seem to work correctly in C++ libtorch, probably due to the lack of NCCL integration -- see the last comment in #18837. I'm curious about your first statement. If a module hook can notify me that the gradient computation is done for layer N, I can dispatch the task of communicating that gradient among processes to a separate thread, while the main thread continues with the gradient for layer N+1. I assume that is what the authors of the above paper must have done in order to overlap computation and communication. |
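To make the idea in the comment above concrete, here is a minimal sketch in plain standard C++ (no libtorch). Everything in it is an assumption made for illustration: `BackgroundWorker`, `backward_with_overlap` and the `on_grad_ready` callback are hypothetical names, the "gradient" is just a layer index, and the "communication" only records that index. It shows one way a per-layer notification could hand work to a separate thread while the main thread moves on to the next layer; it is not how the paper's authors or PyTorch implement it.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: runs submitted tasks in FIFO order on one background thread.
class BackgroundWorker {
 public:
  BackgroundWorker() : thread_([this] { run(); }) {}

  // Drains any queued tasks, then stops and joins the thread.
  ~BackgroundWorker() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      done_ = true;
    }
    cv_.notify_one();
    thread_.join();
  }

  void submit(std::function<void()> task) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push(std::move(task));
    }
    cv_.notify_one();
  }

 private:
  void run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return done_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // shutdown requested and nothing left to do
        task = std::move(tasks_.front());
        tasks_.pop();
      }
      task();  // run outside the lock so submit() is never blocked by a task
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  bool done_ = false;
  std::thread thread_;  // declared last so the members above exist before it starts
};

// Simulated backward pass over `num_layers` layers, visited last to first.
// Returns the order in which layer gradients were "communicated".
std::vector<int> backward_with_overlap(int num_layers) {
  std::vector<int> communicated;  // written only by the worker, read after it joins
  {
    BackgroundWorker comm;
    // Hypothetical hook: called as soon as one layer's gradient is ready.
    auto on_grad_ready = [&](int layer) {
      comm.submit([&communicated, layer] { communicated.push_back(layer); });
    };
    for (int layer = num_layers - 1; layer >= 0; --layer) {
      // ... the gradient for `layer` would be computed here ...
      on_grad_ready(layer);  // returns immediately; the main thread continues
    }
  }  // the worker's destructor waits for all pending communication
  return communicated;
}
```

Calling `backward_with_overlap(3)` queues one communication task per layer while the loop keeps going, and returns once the worker has finished all of them.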
Hi, The same thing can be done using hooks on Tensors with |
Is this available on the C++ API? Could you please point me to the documentation, I haven't found any. Thanks! |
The PR for it is here: #26568 |
@albanD Curious is |
Hi, No they are independent. |
@albanD Is register_forward_pre_hook available in the nightly build? |
I don't think it has been added to the cpp api yet (only available in python). cc @yf225 ? |
Is this feature discarded? |
It is not. We have not had time to get around to doing it. |
Thanks for your reply. |
Yes, only the forward hook and the forward pre-hook would be sufficient. |
After working on this for a week, I found it difficult to expose hook functions as simply as in Python. I think we need to add a hook function call to this method. The hook function should be similar to the Python hook function, but I could not come up with a good way to realize this. |
My guess is that some template magic will be needed here yes.
cc @yf225 do you have an idea of how this can be done? |
I had some draft PR that might be able to do it: #30484. Haven't got it to the finish line, but maybe can serve as inspiration :) also cc. @glaringlee |
Thanks! I hope this feature lands in the future. |
@glaringlee @yf225 Is there any plan to finish this functionality in the near future? I would very much like to use spectral normalization from the C++ API, but this depends on |
is there any progress on the implementation of |
I am also very curious about this. |
There are currently no plans to add this from core team. If someone can come up with a short PR that adds the feature I can help review it, but it may not be so easy to play the same tricks you can do in Python in C++ |
Are backward hooks supported by TorchScript modules by now? |
🚀 Feature
As part of the Python/C++ API parity work, we would like to provide forward/backward hooks for C++ torch::nn modules (left: Python API, right: C++ API):
torch.nn.Module.register_forward_pre_hook -> torch::nn::Module::register_forward_pre_hook
torch.nn.Module.register_forward_hook -> torch::nn::Module::register_forward_hook
torch.nn.Module.register_backward_hook -> torch::nn::Module::register_backward_hook
Possible Steps
torch::utils::hooks::RemovableHandle in C++ API, which mirrors torch.utils.hooks.RemovableHandle in Python API.
register_forward_pre_hook, register_forward_hook and register_backward_hook methods for torch::nn::Module.
_backward_hooks, _forward_pre_hooks and _forward_hooks to torch::nn::Module (torch/csrc/api/include/torch/nn/module.h).
cc @ezyang @ssnl @albanD @zou3519 @yf225
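The steps above could look roughly like the following sketch. It is written in plain standard C++ and does not use libtorch, so every name in it is an assumption for illustration: `Value` stands in for a tensor, `HookedModule` for `torch::nn::Module`, and `RemovableHandle` here is a hypothetical analogue of the Python class, not an existing C++ API. Backward hooks are left out, and unlike Python, hooks here modify their arguments by reference instead of returning replacements.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tensor; a real version would use torch::Tensor.
using Value = std::vector<float>;

using ForwardPreHook = std::function<void(Value& input)>;
using ForwardHook = std::function<void(const Value& input, Value& output)>;

// Hooks keyed by registration id, so they run in registration order.
template <typename Hook>
using HookTable = std::map<std::size_t, Hook>;

// Hypothetical analogue of torch.utils.hooks.RemovableHandle.
class RemovableHandle {
 public:
  explicit RemovableHandle(std::function<void()> remover)
      : remover_(std::move(remover)) {}

  // Unregisters the hook; calling it again does nothing.
  void remove() {
    if (remover_) {
      remover_();
      remover_ = nullptr;
    }
  }

 private:
  std::function<void()> remover_;
};

// Hypothetical module base class with forward pre-hooks and forward hooks.
class HookedModule {
 public:
  virtual ~HookedModule() = default;

  RemovableHandle register_forward_pre_hook(ForwardPreHook hook) {
    return add<ForwardPreHook>(pre_hooks_, std::move(hook));
  }

  RemovableHandle register_forward_hook(ForwardHook hook) {
    return add<ForwardHook>(post_hooks_, std::move(hook));
  }

  // Runs pre-hooks, then forward(), then forward hooks.
  Value operator()(Value input) {
    for (auto& entry : *pre_hooks_) entry.second(input);
    Value output = forward(input);
    for (auto& entry : *post_hooks_) entry.second(input, output);
    return output;
  }

 protected:
  virtual Value forward(const Value& input) = 0;

 private:
  template <typename Hook>
  RemovableHandle add(const std::shared_ptr<HookTable<Hook>>& table, Hook hook) {
    const std::size_t id = next_id_++;
    (*table)[id] = std::move(hook);
    // The handle holds a weak reference, so it stays safe to call
    // after the module has been destroyed.
    std::weak_ptr<HookTable<Hook>> weak = table;
    return RemovableHandle([weak, id] {
      if (auto live = weak.lock()) live->erase(id);
    });
  }

  std::size_t next_id_ = 0;
  std::shared_ptr<HookTable<ForwardPreHook>> pre_hooks_ =
      std::make_shared<HookTable<ForwardPreHook>>();
  std::shared_ptr<HookTable<ForwardHook>> post_hooks_ =
      std::make_shared<HookTable<ForwardHook>>();
};

// Example module for the sketch: multiplies every element by two.
struct Doubler : HookedModule {
 protected:
  Value forward(const Value& input) override {
    Value out = input;
    for (float& x : out) x *= 2.0f;
    return out;
  }
};
```

A caller would register a hook with `auto handle = module.register_forward_hook(...)`, invoke the module as `module(input)`, and later call `handle.remove()` to unregister it. The sketch does not handle a hook that removes itself while it is running.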