
Conversation

Kiyosora
Contributor

@dr-ci

dr-ci bot commented Sep 18, 2020

💊 CI failures summary and remediations

As of commit 453287e (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



@Kiyosora Kiyosora force-pushed the implement_float_power branch 5 times, most recently from 4c75c49 to 6f5552f on September 21, 2020 09:05
@Kiyosora Kiyosora changed the title from "[WIP] Implement NumPy-like function torch.float_power()" to "Implement NumPy-like function torch.float_power()" on Sep 21, 2020
@Kiyosora Kiyosora marked this pull request as ready for review September 21, 2020 15:00
@Kiyosora Kiyosora force-pushed the implement_float_power branch from 6f5552f to a98486e on September 22, 2020 01:56
@Kiyosora
Contributor Author

Hi, @mruberry! A new NumPy-like function, torch.float_power(), is implemented in this PR.
Please take a look at your convenience. 😃
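
For reference, here is a quick sketch of the intended semantics (mirroring numpy.float_power): the computation is always carried out in double precision, so the result is float64, or complex128 when any input is complex, even for integer inputs where plain pow would refuse a negative exponent:

>>> base = torch.tensor([2, 4, 8])
>>> torch.float_power(base, 2)
tensor([ 4., 16., 64.], dtype=torch.float64)
>>> torch.float_power(base, -3)
tensor([0.1250, 0.0156, 0.0020], dtype=torch.float64)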

@mruberry
Collaborator

This is awesome, @Kiyosora! Just an fyi that we're finishing the PyTorch 1.7 release this week, so this review may be delayed a few days. Sorry for not being more responsive.

@Kiyosora
Contributor Author

Thanks for the rapid reply, @mruberry!
I don't mind waiting, and I hope everything goes well for the PyTorch 1.7 release. 🎉

@ailzhang ailzhang requested a review from mruberry September 22, 2020 19:59
@ailzhang ailzhang added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) on Sep 22, 2020
@Kiyosora Kiyosora force-pushed the implement_float_power branch from a98486e to bd88b41 on September 25, 2020 09:37
@codecov

codecov bot commented Sep 25, 2020

Codecov Report

Merging #44937 (453287e) into master (18ae12a) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master   #44937   +/-   ##
=======================================
  Coverage   80.92%   80.93%           
=======================================
  Files        1855     1855           
  Lines      200156   200181   +25     
=======================================
+ Hits       161981   162011   +30     
+ Misses      38175    38170    -5     

Collaborator
@mruberry mruberry Oct 1, 2020

Use resize_output to resize instead of resize_as. (pow also needs this update, but it doesn't need to happen in this PR.)

Contributor Author

Being addressed in the separate PR #46830.

Collaborator

.copy_?

Collaborator

Actually, would you please fix this for pow, too? Sample code demonstrating the error:

>>> torch.tensor((1 + 1j, 2 + 2j)) ** complex(1, 0)
RuntimeError: fill_ only supports 0-dimension value tensor but got tensor with 1 dimensions.
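
For comparison, NumPy evaluates the equivalent expression without an error, so this is presumably the result the fixed pow should produce:

>>> import numpy as np
>>> np.array([1 + 1j, 2 + 2j]) ** complex(1, 0)
array([1.+1.j, 2.+2.j])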

Contributor Author

Being addressed in the separate PR #46830.

Collaborator

Do these need explicit dispatches? I think the system will work out which functions to call. Explicitly defining the dispatch will, perhaps surprisingly, prevent float_power from deriving its derivative if it's implemented as a composite operation (as suggested above).
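
To illustrate the idea, here is a rough Python analogue (not the actual ATen code): a composite built only from differentiable torch ops gets its derivative from autograd automatically, with no explicit formula in derivatives.yaml:

import torch

def float_power_composite(base, exp):
    # upcast to double, then defer to pow; autograd differentiates through both ops
    return torch.pow(base.to(torch.double), exp.to(torch.double))

base = torch.tensor([1.5, 2.0, 3.0], dtype=torch.double, requires_grad=True)
exp = torch.tensor([2.0, 3.0, 0.5], dtype=torch.double)
assert torch.autograd.gradcheck(float_power_composite, (base, exp))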

Collaborator

pow also has an inplace variant, pow_:

- func: pow_.Scalar(Tensor(a!) self, Scalar exponent) -> Tensor(a!)

Those entries should probably be moved next to the other pow entries, too.

Contributor Author

Once I remove the explicit dispatches, the following error appears:

AssertionError: There's a formula for float_power(or its functional variant) in derivatives.yaml. It's required to add a dispatch section for it with explicit supported backends e.g CPU/CUDA or DefaultBackend in native_functions.yaml. Please see https://github.com/pytorch/pytorch/tree/master/aten/src/ATen/native#choosing-the-right-dispatch-keyword for instructions to choose the right dispatch keyword.

So, I guess we need the explicit dispatches here for autograd.

In addition, the improvement for pow_ is being handled in the separate PR #46830.

Collaborator

The way dispatch happens has actually changed. I think the correct dispatch for these is now Math: instead of CPU, CUDA since float_power is implemented using pow.

Contributor Author

It seems that when using Math: as the dispatch, both the autograd test and the XLA test suffer from a loss of precision. Even if I call pow directly without any dtype casting, as below, the problem still exists.

Tensor float_power(const Tensor& base, const Tensor& exp) {
  return at::pow(base, exp);
}

Since using CPU, CUDA avoids the precision loss, maybe we should revert to it?
I am not very familiar with autograd yet, so maybe I have missed something... 😕

Collaborator
@mruberry mruberry Nov 12, 2020

Sorry, can you elaborate on the lack of precision you're seeing, and how changing the dispatch can help with it?

Contributor Author
@Kiyosora Kiyosora Nov 17, 2020

Sorry for the late reply, @mruberry.
When I use Math as the dispatch, an assertEqual error occurs in test_autograd, saying that the gradients calculated by the in-place variant are inconsistent with those of the out-of-place variant, like this:

>>> a=torch.tensor([1.,2.,3.], dtype=torch.double, requires_grad=True)
>>> b=torch.tensor([1.,2.,3.], dtype=torch.double, requires_grad=True)
>>> grad=torch.randn_like(a).double()
>>> a.float_power(2.).backward(grad)
>>> a.grad
tensor([-4.0256, -1.6108,  1.2338], dtype=torch.float64)
>>> b.float_power_(2.).backward(grad)
>>> b.grad
tensor([-6.0385, -2.0134,  1.4394], dtype=torch.float64)

But in fact, in-place variants usually do not allow calculating gradients; the original pow behaves the same way:

>>> a.pow_(2.).backward(grad)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.

And when I changed the dispatch from Math to CPU, CUDA and added a definition in tools/autograd/derivatives.yaml (as in the previous version), the abnormal behavior above was eliminated.
It seems that no in-place variant has used Math as its dispatch so far, so I suspect this may be related to the phenomenon...

Collaborator

Thank you for this explanation; this is extremely interesting. cc @ailzhang and @albanD to take a look, too.

Collaborator
@albanD albanD Nov 18, 2020

RuntimeError: a leaf Variable that requires grad is being used in an in-place operation.

This error is unrelated to the pow formula; it only happens because you modify your leaf in place. Doing a.clone().pow_(2.) should work just fine.

saying that the gradients calculated by in-place variant is inconsistent with the general

If you don't provide a formula directly in derivatives.yaml, you need to make sure that your generic ATen implementation only ever calls functions that do. In particular, always call the at:: version of ops, not the native:: version.

Collaborator

Instead of just continuing here, attempt to perform the operation and verify it throws an error using self.assertRaisesRegex.
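
Something along these lines, for example (a sketch only; the tensor values and the matched message are assumptions and should follow whatever float_power actually raises here):

import unittest
import torch

class TestFloatPowerErrors(unittest.TestCase):
    def test_out_dtype_mismatch_raises(self):
        # float_power's result should be float64 here, so a float32 out= tensor
        # is expected to be rejected; matching on "dtype" is an assumption
        base = torch.tensor([1.0, 2.0])
        out = torch.empty(2, dtype=torch.float32)
        with self.assertRaisesRegex(RuntimeError, "dtype"):
            torch.float_power(base, 2, out=out)

if __name__ == "__main__":
    unittest.main()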

Contributor Author

Addressed!

Collaborator

As above, verify this throws an error instead of continuing

Contributor Author

Addressed!

Collaborator

I would remove this paragraph. It's correct, but a little misleading since this PR also allows :attr:`input` to be a scalar. The type support can be mentioned below, too, instead of here.

Contributor Author

Addressed!

Collaborator

This mathematical portion is also correct but can be removed. The description above it is already very clear.

Contributor Author

Addressed!

Collaborator

"{input}" -> "input (Tensor or Number): the base value(s)"

Contributor Author

Addressed!

Collaborator

"value" -> "values"

Contributor Author

Addressed!

Collaborator
@mruberry mruberry left a comment

Hey @Kiyosora!

Sorry for the delay in reviewing this PR. This week is a holiday for the PyTorch team.

This PR is very good and close to being ready. I made a few small final suggestions. Looking forward to reviewing the final update!

@Kiyosora Kiyosora force-pushed the implement_float_power branch from edb2c7e to 453287e on November 26, 2020 08:45
@Kiyosora
Contributor Author

Kiyosora commented Nov 26, 2020

Happy Thanksgiving, @mruberry!

I really appreciate your consistent help!

I have updated this PR per your suggestions.
Please take a look at any convenient time after your vacation. 😃

@Kiyosora Kiyosora requested a review from mruberry November 27, 2020 01:51
Collaborator
@mruberry mruberry left a comment

Nice work! Thanks @Kiyosora!

Contributor
@facebook-github-bot facebook-github-bot left a comment

@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@mruberry merged this pull request in 272f4db.

@Kiyosora Kiyosora deleted the implement_float_power branch November 30, 2020 08:57
facebook-github-bot pushed a commit that referenced this pull request Jan 18, 2021
Summary:
- Related to #44937
- Use `resize_output` instead of `resize_as`
- Tidy up `native_functions.yaml`: move the in-place variant `pow_` next to the other `pow` entries

Pull Request resolved: #46830

Reviewed By: mrshenli

Differential Revision: D24567702

Pulled By: anjali411

fbshipit-source-id: a352422c9d4e356574dbfdf21fb57f7ca7c6075d

Labels

cla signed, Merged, open source, triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
