
Alias for polygamma #59691

Closed · wants to merge 30 commits

Conversation

@krshrimali (Contributor) commented Jun 9, 2021

@krshrimali added the module: special label (Functions with no exact solutions, analogous to those in scipy.special) Jun 9, 2021
@krshrimali requested a review from ezyang as a code owner June 9, 2021 07:57
@facebook-github-bot added the oncall: jit (Add this issue/PR to JIT oncall triage queue) and cla signed labels Jun 9, 2021
@facebook-github-bot (Contributor) commented Jun 9, 2021

💊 CI failures summary and remediations

As of commit 6195f5a (more details on the Dr. CI page and at hud.pytorch.org/pr/59691):


  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_py3_clang5_asan_test1 (1/1)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Jul 09 06:15:46 SUMMARY: UndefinedBehaviorSanit.../jenkins/workspace/aten/src/ATen/Utils.cpp:20:3 in
Jul 09 06:15:46     #9 0x55f6a043d8f2 in PyEval_EvalCode /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/ceval.c:731
Jul 09 06:15:46     #10 0x55f6a04a5cd5 in run_mod /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/pythonrun.c:1025
Jul 09 06:15:46     #11 0x55f6a04a7d5d in PyRun_StringFlags /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/pythonrun.c:949
Jul 09 06:15:46     #12 0x55f6a04a7dbb in PyRun_SimpleStringFlags /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Python/pythonrun.c:445
Jul 09 06:15:46     #13 0x55f6a04a8926 in run_command /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Modules/main.c:301
Jul 09 06:15:46     #14 0x55f6a04a8926 in Py_Main /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Modules/main.c:749
Jul 09 06:15:46     #15 0x55f6a03e2196 in main /home/builder/ktietz/cos6/ci_cos6/python_1622833237666/work/Programs/python.c:69
Jul 09 06:15:46     #16 0x7fbaa4fe083f in __libc_start_main /build/glibc-S7Ft5T/glibc-2.23/csu/../csu/libc-start.c:291
Jul 09 06:15:46     #17 0x55f6a047233d in _start (/opt/conda/bin/python3.6+0x1a733d)
Jul 09 06:15:46 
Jul 09 06:15:46 SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /var/lib/jenkins/workspace/aten/src/ATen/Utils.cpp:20:3 in 
Jul 09 06:15:46 + retcode=1
Jul 09 06:15:46 + set -e
Jul 09 06:15:46 + return 1
Jul 09 06:15:46 + [[ pytorch-linux-xenial-py3-clang5-asan-test1 == *-NO_AVX-* ]]
Jul 09 06:15:46 + [[ pytorch-linux-xenial-py3-clang5-asan-test1 == *-NO_AVX2-* ]]
Jul 09 06:15:46 + '[' -n https://github.com/pytorch/pytorch/pull/59691 ']'
Jul 09 06:15:46 + [[ pytorch-linux-xenial-py3-clang5-asan-test1 != *coverage* ]]
Jul 09 06:15:46 ++ mktemp
Jul 09 06:15:46 + DETERMINE_FROM=/tmp/tmp.F57h7VqpJk
Jul 09 06:15:46 + file_diff_from_base /tmp/tmp.F57h7VqpJk


@krshrimali krshrimali removed the request for review from ezyang June 9, 2021 07:59
@krshrimali krshrimali changed the title Alias for polygamma [WIP] Alias for polygamma Jun 9, 2021
@krshrimali krshrimali marked this pull request as draft June 9, 2021 08:50
@krshrimali krshrimali marked this pull request as ready for review June 11, 2021 10:38
@krshrimali krshrimali changed the title [WIP] Alias for polygamma Alias for polygamma Jun 11, 2021
SkipInfo('TestCommon', 'test_variant_consistency_jit'),),
SkipInfo('TestCommon', 'test_variant_consistency_jit'),
SkipInfo('TestCommon', 'test_jit_alias_remapping'),
SkipInfo('TestCommon', 'test_variant_consistency_eager')),
@krshrimali (Contributor, Author) commented:
This test fails because of the argument ordering in polygamma/special_polygamma: the signature is (int, Tensor), but the variant-consistency tests expect the first input to be a Tensor.

======================================================================
ERROR: test_variant_consistency_eager_polygamma_polygamma_n_0_cpu_float32 (__main__.TestCommonCPU)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/krshrimali/git/krshrimali/pytorch/torch/testing/_internal/common_device_type.py", line 292, in instantiated_test
    result = test_fn(self, *args)
  File "/home/krshrimali/git/krshrimali/pytorch/torch/testing/_internal/common_device_type.py", line 266, in test_wrapper
    return test(*args, **kwargs)
  File "/home/krshrimali/git/krshrimali/pytorch/test/test_ops.py", line 345, in test_variant_consistency_eager
    _test_consistency_helper(samples, variants)
  File "/home/krshrimali/git/krshrimali/pytorch/test/test_ops.py", line 334, in _test_consistency_helper
    variant_forward = variant(cloned,
TypeError: special_polygamma(): argument 'n' (position 1) must be int, not Tensor
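A minimal sketch of the mismatch (not from the PR; the input tensor is arbitrary): the op takes the integer order first, while the test passes the sample's tensor as the first positional argument.

import torch

x = torch.rand(3)
torch.polygamma(0, x)  # the integer order n comes first, then the tensor

# The variant-consistency test effectively calls variant(cloned, ...), so the
# tensor lands in the first position; for the un-wrapped alias that means:
try:
    torch.special.polygamma(x, 0)
except TypeError as err:
    print(err)  # argument 'n' (position 1) must be int, not Tensor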

Collaborator:
This is not related to JIT; we use a lambda to reorder x and n while calling polygamma:

op=lambda x, n, **kwargs: torch.polygamma(n, x, **kwargs)

This is because consistency_eager only verifies gradients for input=Tensor (not for tensors passed in the other args).

This test now fails because we don't reorder the arguments for the alias (so it expects the first argument to be an int and not a Tensor).

To mitigate this, should we allow specifying lambda wrappers around aliases? Not sure; this might need investigation.

Relevant lines of test

  • Here we acquire the actual operator of the mentioned alias.

    pytorch/test/test_ops.py

    Lines 288 to 292 in cf38b20

    for a_op in op.aliases:
        variants.append(a_op.op)
        variants.append(a_op.method_variant)
        variants.append(a_op.inplace_variant)
        inplace_ops.append(a_op.inplace_variant)

  • Here we call the alias variant

variant_forward = variant(cloned,
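Roughly, the base entry's variant is the reordering lambda while the alias variant is the raw function resolved by name, so only the former survives a call with the tensor in the first position. A hedged sketch of what the helper effectively does (names are illustrative, not the real test code):

import torch

cloned = torch.rand(3)
variants = [
    lambda x, n, **kwargs: torch.polygamma(n, x, **kwargs),  # base OpInfo op: reorders (x, n) -> (n, x)
    torch.special.polygamma,                                 # alias resolved by name: no reordering
]
for variant in variants:
    try:
        variant(cloned, 0)  # mirrors variant_forward = variant(cloned, ...) in the test
        print("ok")
    except TypeError as err:
        print("fails:", err)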

Collaborator:
After an offline discussion with @krshrimali and thinking a bit more, we might as well just add a new OpInfo entry for special.polygamma and maybe add a handcoded alias test for polygamma.

Reason:

  • Even if we go through the hassle of making test_variant_consistency_eager work, we still don't get the alias_remapping test, due to the use of the lambda.
  • I don't think there are other operators that would need this op_aliases extension. (It is only because of polygamma's peculiar signature that we need it.)

@mruberry what is your opinion?
Thanks!

Collaborator:
I see the issue. Sure, implementing a new OpInfo for it (with a comment) sounds like a fine workaround.
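The handcoded alias test suggested above could be as simple as checking that the alias and the original agree for a few orders. A minimal sketch (hypothetical, not the test added by the PR):

import torch

x = torch.rand(10) + 0.5  # keep the inputs positive
for n in (0, 1, 2):
    assert torch.allclose(torch.special.polygamma(n, x), torch.polygamma(n, x))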

@krshrimali (Contributor, Author) commented Jun 13, 2021

The Windows CI (pytorch-win-vs2019-cpu-py3 / test) error is real (once all tests run, I'm expecting more tests to fail for the same reason). This was debugged for the logsumexp PR and will be fixed there (#58838). It shouldn't be a long blocker, though; the logsumexp PR should be ready for another review in a day. :)

@kshitij12345 (Collaborator) left a comment
Overall looks good. Thanks @krshrimali

Have one minor nit regarding the signature of special::polygamma_out.

However, I am concerned about disabling the consistency_eager test. I have put some pointers (in the inline comment above) on why the test is failing. It would be nice if we could find a workaround without disabling it. Can you please take a look at that?

Thanks!

return torch::special_polygamma(n, self);
}

inline Tensor& polygamma_out(int64_t n, Tensor& result, const Tensor& self) {
Collaborator:
We should stick to the signature of polygamma_out, which is Tensor& polygamma_out(Tensor& result, int64_t n, const Tensor& self).
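Whatever order the C++ functional API settles on, at the Python level the alias's out= variant should match the original. A quick hedged check (illustrative only, not part of the PR's tests):

import torch

x = torch.rand(5) + 0.5
out_a = torch.empty_like(x)
out_b = torch.empty_like(x)
torch.polygamma(1, x, out=out_a)
torch.special.polygamma(1, x, out=out_b)
assert torch.allclose(out_a, out_b)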


@mruberry (Collaborator) commented:
Hey @krshrimali! Just checking in here. Looks like some tests are failing -- everything going OK on this PR? Is there something I'm supposed to do for it?

@krshrimali (Contributor, Author) commented:

> Hey @krshrimali! Just checking in here. Looks like some tests are failing -- everything going OK on this PR? Is there something I'm supposed to do for it?

Thanks, @mruberry, for the question. The test_overrides failures are the only relevant test failures in this PR, and they have been fixed in #58838 (the logsumexp PR) (also mentioned here: #59691 (comment)).

It would help if you could take a look at logsumexp whenever possible; once that is in, this PR should be ready for final review/importing as well.

P.S.: If we think logsumexp is going to take significant time, I can move the relevant change from that PR into this one so that at least this isn't blocked.

@mruberry (Collaborator) commented:

> > Hey @krshrimali! Just checking in here. Looks like some tests are failing -- everything going OK on this PR? Is there something I'm supposed to do for it?
>
> Thanks, @mruberry, for the question. The test_overrides failures are the only relevant test failures in this PR, and they have been fixed in #58838 (the logsumexp PR) (also mentioned here: #59691 (comment)).
>
> It would help if you could take a look at logsumexp whenever possible; once that is in, this PR should be ready for final review/importing as well.
>
> P.S.: If we think logsumexp is going to take significant time, I can move the relevant change from that PR into this one so that at least this isn't blocked.

Got it. No, I don't think logsumexp will take too long. If the jit team doesn't get back to us soon, I'll ping them.

@kshitij12345 mentioned this pull request Jun 24, 2021
self.name = alias_name
self.op = _getattr_qual(torch, alias_name)
self.op = alias_op if alias_op else _getattr_qual(torch, alias_name)
Collaborator:
Nice extension

@krshrimali (Contributor, Author) commented Jul 9, 2021:
Apologies: earlier we decided to keep this to avoid an extra OpInfo entry for the special alias. Later, we decided to instead add an extra OpInfo entry, since this extension is only required for ops like polygamma that need re-ordering of the arguments (context: #59691 (comment)).

This has been removed now.

@mruberry (Collaborator) commented Jul 8, 2021

Unfortunately it looks like the jit test is still failing:

test_variant_consistency_jit_special_polygamma_special_polygamma_n_0_cpu_float32

@krshrimali (Contributor, Author) commented:
Thanks, @mruberry, for taking a look. The skip entry referenced the wrong test class, hence the errors. This should be fixed now, and the PR should be ready for review once all the tests pass. :) Thank you!

@krshrimali (Contributor, Author) commented:
Gentle ping, PTAL @mruberry - whenever you find time. The failing test seems unrelated to the PR.

@@ -7029,6 +7029,34 @@ def gradcheck_wrapper_triangular_input(op, input, *args, upper=False, **kwargs):
# ~~~~~~~~~~~~~~~ <--- HERE
SkipInfo('TestJit', 'test_variant_consistency_jit'),),
sample_kwargs=lambda device, dtype, input: ({'n': 0}, {'n': 0})),
# A separate OpInfo entry for special.polygamma is needed to reorder the arguments
Collaborator:
Just testing this for the n=0 case makes sense
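For reference, the n=0 case is the digamma function, so a sanity check along these lines is cheap (illustrative sketch, not from the PR):

import torch

x = torch.rand(10) + 0.5
assert torch.allclose(torch.special.polygamma(0, x), torch.digamma(x))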

@mruberry (Collaborator) left a comment
Cool! Thanks @krshrimali

@facebook-github-bot (Contributor) commented:
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor) commented:
@mruberry merged this pull request in 7e1f01d.
