Clarify, make consistent, and test the behavior of logspace when dtype is integral #47647
Conversation
Hi @xuhdev! Thank you for your pull request. We require contributors to sign our Contributor License Agreement, and yours needs attention. You currently have a record in our system, but we do not have a signature on file. In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!
💊 CI failures summary and remediations

As of commit 878a465 (more details on the Dr. CI page):

🕵️ 1 new failure recognized by patterns. The following CI failure does not appear to be due to upstream breakages:
pytorch_linux_xenial_py3_clang5_asan_test1 (1/1) — Step: "Run tests" (full log | diagnosis details | 🔁 rerun)
8343c4d to cf01217 (Compare)
326b298 to 375a47b (Compare)
@@ -133,10 +133,10 @@ Tensor& logspace_cuda_out(Tensor& result, Scalar start, Scalar end, c10::optiona
        r.fill_(std::pow(base, start.to<double>()));
      } else if (isIntegralType(r.scalar_type(), 0)) {
        AT_DISPATCH_INTEGRAL_TYPES(r.scalar_type(), "logspace_cuda", [&]() {
-         float scalar_base = static_cast<float>(base); // Use float to avoid promotion to double
+         double scalar_base = static_cast<double>(base);
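The diff above swaps the base from single to double precision before the per-element pow. A minimal pure-Python sketch of why precision matters once the result is truncated into an integer dtype — the function name `logspace_integral` and the float32 round-trip via `struct` are illustrative assumptions, not PyTorch's actual kernel code:

```python
import struct

def to_f32(x):
    # Round an IEEE-754 double to single precision and back,
    # mimicking arithmetic carried out in `float`.
    return struct.unpack('f', struct.pack('f', x))[0]

def logspace_integral(start, end, steps, base, single_precision=False):
    # Illustrative sketch: compute base ** exponent per element,
    # then truncate toward zero, as storing into an integer tensor does.
    step = (end - start) / (steps - 1)
    out = []
    for i in range(steps):
        exponent = start + i * step
        if single_precision:
            value = to_f32(to_f32(base) ** to_f32(exponent))
        else:
            value = base ** exponent
        out.append(int(value))  # truncation into the integer dtype
    return out

# With double precision, exact integer powers come out exact.
print(logspace_integral(0.0, 10.0, 11, 2.0))  # [1, 2, 4, ..., 1024]
```

Because truncation toward zero amplifies any rounding that lands just below an exact integer, a single-precision pow can yield off-by-one integers where the double-precision path does not.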
Could you clarify your change here, please? Based on the test_tensor_creation_ops.py change, you are comparing within the device type instead of crossing CPU/GPU. Maybe adding a test that does the CPU/GPU cross comparison can help us review this change.
I have added the test. This is to make the implementation consistent with CPU, because the CPU implementation uses `double` instead of `float` here.
Does your added CPU/GPU comparison test break when you use the original `float` here?

I guess my point was: the CUDA impl used `float` for a reason, and according to the comment on the CPU side, it was "autopromoted" — i.e., we might not need that extra precision.
@walterddr Yes, it broke when I kept them as `float`. Actually, it broke the tests on CUDA alone. My guess is this is about log/exp, where low precision can easily cause trouble.

> i guess my point was, cuda impl used float for a reason, and according to the comment on the CPU side, it was "autopromoted". e.g. we might not need that extra precision

My speculation is that the CUDA implementation used `float`, perhaps for performance reasons? cc @colesbury

Also, I'm having a hard time understanding the current ASAN test failure, which doesn't show what is broken (it still failed for the same reason when I did not add anything to the CUDA implementation).
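A tiny worked example of the precision hazard discussed above: when pow() lands a hair below an exact integer, truncation into an integer dtype drops the result by a whole unit. The numbers here are illustrative, not taken from the failing test:

```python
# If pow(base, exponent) is computed exactly, 2**3 stored into an
# integer tensor gives 8.
exact = 2.0 ** 3
print(int(exact))  # 8

# But a low-precision computation that lands just below the true value
# truncates down to 7 — an off-by-one in the integer output.
slightly_low = exact - 1e-6  # stand-in for single-precision rounding error
print(int(slightly_low))  # 7
```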
looks good. I got one more comment and it should be good to go
test/test_tensor_creation_ops.py (Outdated)

@@ -2782,6 +2782,20 @@ def test_logspace(self, device, dtype):
         y = torch.logspace(0, 3, 4, base=2, device=device, dtype=dtype, out=x.narrow(1, 1, 2))
         self.assertEqual(x, torch.tensor(((0, 1, 2), (0, 4, 8)), device=device, dtype=dtype), atol=0, rtol=0)

+        # Integer test
+        if torch.testing.is_integral(dtype):
You can just create a separate test:

    @skipCPUIf(True, "compares with CPU")
    @dtypesIfCUDA(torch.uint8, torch.int8, torch.int16, torch.int32, torch.int64)
    def test_logspace_integral(self, device, dtype):
        ...

See how other specially-typed logspace tests were done before this one.
I added this. I chose not to skip all CPU tests because there are some parts that don't concern CPU.
torch.logspace doesn't seem to explain how integers are handled. Add some clarification and some tests for when dtype is integral.
@walterddr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Thanks for the change. LGTM now. I'll add @colesbury to see if there's any concern about the float->double type change on the CUDA side.
@walterddr merged this pull request in 0ae0fac.
This pull request has been reverted by 3df5f9c.
@walterddr Is there a reason for the reversion?
The change actually broke ASAN instead of being flaky.
Also, I should've checked this more carefully before, but it seems torch.logspace doesn't produce the same expected results as numpy.logspace. It's a matter of whether we should convert the start/stop into integers before or after the fact. I will look into it and iterate on top of this PR.
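The before-vs-after question raised above can be sketched in a few lines. Both function names and the sample endpoints are hypothetical, chosen only to show that the two conversion orders can disagree for an integral dtype:

```python
def logspace_cast_after(start, end, steps, base):
    # Compute every element in floating point from the original
    # (possibly fractional) endpoints, truncating only on store.
    step = (end - start) / (steps - 1)
    return [int(base ** (start + i * step)) for i in range(steps)]

def logspace_cast_before(start, end, steps, base):
    # Truncate the endpoints to integers first, then compute.
    return logspace_cast_after(int(start), int(end), steps, base)

# With fractional endpoints, the two orders disagree: the step size
# (and thus every exponent after the first) differs.
print(logspace_cast_after(0.0, 3.5, 4, 2.0))   # -> [1, 2, 5, 11]
print(logspace_cast_before(0.0, 3.5, 4, 2.0))  # -> [1, 2, 4, 8]
```

Which order is "right" is exactly the compatibility question with numpy.logspace: NumPy computes in floating point and casts to the requested dtype at the end, i.e. the first variant.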
…e is integral

torch.logspace doesn't seem to have explained how integers are handled. Add some clarification and some test when dtype is integral. The CUDA implementation is also updated to be consistent with CPU implementation. Following up pytorch#47647

@walterddr Thanks, please keep me posted.
@walterddr Do we have any plan to further resolve this issue?
torch.logspace doesn't seem to explain how integers are handled.
Add some clarification and some tests for when dtype is integral.
The CUDA implementation is also updated to be consistent with the CPU implementation.