Port addcmul operator from the TH code to Aten #22797
Labels
better-engineering
Relatively self-contained tasks for better engineering contributors
module: cpu
CPU specific problem (e.g., perf, algorithm)
module: cuda
Related to torch.cuda, and CUDA support in general
module: porting
Issues related to porting TH/THNN legacy to ATen native
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Comments
VitalyFedyunin added the module: cpu, module: cuda, module: operators, module: porting, and triaged labels on Jul 12, 2019
VitalyFedyunin changed the title from "Port addcdiv operator from the TH code to Aten" to "Port addcmul operator from the TH code to Aten" on Jul 12, 2019
ezyang added the better-engineering label on Jul 12, 2019
I will make an example out of it.
Closed
petrex pushed a commit to petrex/pytorch that referenced this issue on Aug 1, 2019
Summary: Move CPU implementation of the `addcmul` operator to Aten (pytorch#22797)

### before

```python
In [11]: timeit x.addcmul(a, b)
1.31 ms ± 18.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

### after

```python
In [9]: timeit x.addcmul(a, b)
588 µs ± 22.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

Adding custom code for the case when `value == 1` doesn't provide a significant performance gain.

Pull Request resolved: pytorch#22874
Differential Revision: D16359348
Pulled By: VitalyFedyunin
fbshipit-source-id: 941ead835672fca78a1fcc762da052e64308b111
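The `timeit` figures in the summary come from IPython's `%timeit` magic. Outside IPython, a comparable per-loop measurement can be sketched with the standard-library `timeit` module. The sketch below uses NumPy in place of torch purely for illustration, and the array shapes are hypothetical (the summary does not state the tensor sizes that were benchmarked):

```python
import timeit
import numpy as np

# Hypothetical shapes: the issue's summary does not state the sizes benchmarked.
x = np.random.rand(500, 500)
a = np.random.rand(500, 500)
b = np.random.rand(500, 500)

def addcmul(inp, t1, t2, value=1.0):
    # The operation being ported: inp + value * t1 * t2, element-wise.
    return inp + value * t1 * t2

# Roughly mirror %timeit: run many loops, report the best mean time per call.
per_loop = min(timeit.repeat(lambda: addcmul(x, a, b), repeat=3, number=50)) / 50
print(f"{per_loop * 1e6:.1f} µs per loop")
```

Taking the minimum over repeats, as `timeit` itself recommends, filters out timing noise from other processes.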
zdevito pushed a commit to zdevito/ATen that referenced this issue on Aug 1, 2019
Summary: Move CPU implementation of the `addcmul` operator to Aten (pytorch/pytorch#22797)

### before

```python
In [11]: timeit x.addcmul(a, b)
1.31 ms ± 18.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

### after

```python
In [9]: timeit x.addcmul(a, b)
588 µs ± 22.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

Adding custom code for the case when `value == 1` doesn't provide a significant performance gain.

Pull Request resolved: pytorch/pytorch#22874
Differential Revision: D16359348
Pulled By: VitalyFedyunin
fbshipit-source-id: 941ead835672fca78a1fcc762da052e64308b111
salexspb pushed a commit to salexspb/pytorch that referenced this issue on Aug 8, 2019
Summary: pytorch#22797

Pull Request resolved: pytorch#23814
Differential Revision: D16712381
Pulled By: ifedan
fbshipit-source-id: aeca4fdb9b10143932f195900b1f424ef6d26c89
zdevito pushed a commit to zdevito/ATen that referenced this issue on Aug 8, 2019
Summary: pytorch/pytorch#22797

Pull Request resolved: pytorch/pytorch#23814
Differential Revision: D16712381
Pulled By: ifedan
fbshipit-source-id: aeca4fdb9b10143932f195900b1f424ef6d26c89
`addcmul` is a point-wise math operator, so porting it from the TH code to Aten (and TensorIterator) is expected to be easy. Such a migration will help clean up the code and simplify dispatch, as well as provide an immediate 2-3x operator performance gain.

Porting guide: https://github.com/pytorch/pytorch/wiki/TH-to-ATen-porting-guide
Example PR with porting of adaptive_avg_pool2d: #14714
How to use TensorIterator: https://github.com/pytorch/pytorch/wiki/How-to-use-TensorIterator
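For reference, `addcmul(input, tensor1, tensor2, value=v)` computes `input + v * tensor1 * tensor2` element-wise, which is why it maps so naturally onto TensorIterator. A minimal sketch of the semantics, using NumPy as a stand-in for torch purely for illustration:

```python
import numpy as np

def addcmul(inp, t1, t2, value=1.0):
    # input + value * tensor1 * tensor2, element-wise with broadcasting;
    # NumPy stands in for torch here only to illustrate the semantics.
    return inp + value * t1 * t2

inp = np.array([1.0, 2.0, 3.0])
t1 = np.array([2.0, 2.0, 2.0])
t2 = np.array([3.0, 4.0, 5.0])
print(addcmul(inp, t1, t2, value=0.5))  # → [4. 6. 8.]
```

Because the whole operation is a single fused element-wise expression, a TensorIterator-based kernel can apply it per element without any of the TH dispatch machinery.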