[te] Fix clamp with uint8 args #49143
Conversation
Riddle me this, Batman: how could `torch.clamp(torch.tensor([0], dtype=torch.uint8), -10, 10)` equal `10`? The answer: the min/max args are first cast to the dtype of the input, giving min=246 and max=10. Then you have to apply Min and Max in the right order: `Min(Max(in, min), max)`. Differ in any way and you're doomed. Hooray. This PR makes TE match eager mode for this operator, plus fixes a major facepalm in the LLVM min/max codegen, where we were always generating signed comparisons. Differential Revision: [D25456366](https://our.internmc.facebook.com/intern/diff/D25456366/)
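To see why the application order matters, here is a small standalone C++ sketch (not code from the PR) that mirrors the uint8 wrap-around and compares the two orders:

```cpp
#include <algorithm>
#include <cstdint>
#include <iostream>

int main() {
  // The scalar bounds -10 and 10 are first cast to the input dtype (uint8),
  // so -10 wraps around to 246 under modular conversion.
  uint8_t in = 0;
  uint8_t lo = static_cast<uint8_t>(-10); // 246
  uint8_t hi = static_cast<uint8_t>(10);  // 10

  // Eager-mode order: Min(Max(in, lo), hi).
  uint8_t eager = std::min<uint8_t>(std::max<uint8_t>(in, lo), hi);
  // Swapped order: Max(Min(in, hi), lo). Same bounds, different answer.
  uint8_t swapped = std::max<uint8_t>(std::min<uint8_t>(in, hi), lo);

  std::cout << int(eager) << "\n";   // 10, matching eager torch.clamp
  std::cout << int(swapped) << "\n"; // 246, the wrong answer
  return 0;
}
```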
💊 CI failures summary and remediations. As of commit 6e611f2 (more details on the Dr. CI page): ✅ None of the CI failures appear to be your fault 💚
🚧 3 ongoing upstream failures: these were probably caused by upstream breakages that are not fixed yet.
🚧 2 fixed upstream failures: these were probably caused by upstream breakages that were already fixed.
Pull Request resolved: #49143. ghstack-source-id: 118276737. Differential Revision: [D25456366](https://our.internmc.facebook.com/intern/diff/D25456366/)
```diff
@@ -101,7 +101,8 @@ class TORCH_API TensorExprKernel {
       const torch::jit::Value* v,
       const std::function<
           ExprHandle(const ExprHandle&, const ExprHandle&, const ExprHandle&)>&
-          innerExpr);
+          innerExpr,
+      bool promote_inputs = true);
```
Minor: it doesn't feel like `promote_inputs = true/false` gives two flavors of the same function. My first impression is that `promote_inputs = true` is the public behavior and `promote_inputs = false` is an internal one. I would prefer to remove this argument from the public function and refactor the `= false` flavor into an `_internal`/`_impl` function.
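A minimal sketch of the refactor the reviewer is suggesting (signatures abridged; the `_impl` name and `ThreeOpFn` alias are hypothetical). The flag disappears from the public API and lives only on the internal overload:

```cpp
// Alias for the three-operand expression builder, for brevity.
using ThreeOpFn = std::function<
    ExprHandle(const ExprHandle&, const ExprHandle&, const ExprHandle&)>;

// Public flavor: always promotes inputs.
Tensor* computeThreeOperand(
    const torch::jit::Value* v,
    const ThreeOpFn& innerExpr) {
  return computeThreeOperandImpl(v, innerExpr, /*promote_inputs=*/true);
}

// Internal flavor: callers like clamp, which cast the bounds to the input
// dtype themselves, call the impl directly with promote_inputs = false.
Tensor* computeThreeOperandImpl(
    const torch::jit::Value* v,
    const ThreeOpFn& innerExpr,
    bool promote_inputs);
```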
```diff
@@ -9,6 +9,10 @@ namespace torch {
 namespace jit {
 namespace tensorexpr {
 
+static bool is_c10_type(const ScalarType& type) {
+  return type < ScalarType::Undefined;
```
Minor: this is fine, but it feels a bit dangerous to rely on enum value ordering. In other systems I've seen, when a list of enums is imported, the same macros are also used to define type traits on those enums; otherwise the enum values might keep changing, which complicates serialization compatibility. It is up to you.
Right, so the main reason I want to use ordering here is that we define the first N values of te::ScalarType in lockstep with c10::ScalarType (and we depend on that all over the place). Maybe something less brittle would be better. But I think I'd rather put the effort towards making te::ScalarType just go away in favor of c10::ScalarType, if possible. :)
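One less brittle option along these lines (a sketch of the idea, not something the PR adds; the specific enumerator names are assumed to mirror c10) is to pin the lockstep assumption with `static_assert`s, so a reordering of either enum fails to compile:

```cpp
// Assumes te::ScalarType mirrors the first N values of c10::ScalarType.
static_assert(
    static_cast<int>(tensorexpr::ScalarType::Byte) ==
        static_cast<int>(c10::ScalarType::Byte),
    "te::ScalarType must stay in lockstep with c10::ScalarType");
static_assert(
    static_cast<int>(tensorexpr::ScalarType::Float) ==
        static_cast<int>(c10::ScalarType::Float),
    "te::ScalarType must stay in lockstep with c10::ScalarType");
```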
```diff
@@ -38,6 +42,13 @@ bool is_floating_point(const ScalarType& type) {
   return false;
 }
 
+bool is_signed(const ScalarType& type) {
```
The generality of `is_signed` confused me a bit: given that it's only used on paths handling integrals anyway, I don't know if it's warranted. If we made it handle integrals only, it'd side-step the `is_c10_type` trick as well.
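A sketch of the integrals-only variant being floated (the function name is hypothetical): every signed integral case is enumerated explicitly, so no `is_c10_type` guard is needed.

```cpp
// Only meaningful for integral types; enumerating the cases avoids any
// reliance on enum value ordering.
bool is_signed_integral(const ScalarType& type) {
  switch (type) {
    case ScalarType::Char:  // int8
    case ScalarType::Short: // int16
    case ScalarType::Int:   // int32
    case ScalarType::Long:  // int64
      return true;
    default:
      return false;
  }
}
```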
```diff
@@ -720,7 +720,8 @@ void LLVMCodeGenImpl::visit(const Max* v) {
   auto rhs = this->value_;
 
   if (v->dtype().is_integral()) {
-    auto icmp = irb_.CreateICmpSGT(lhs, rhs);
+    auto icmp = v->dtype().is_signed() ? irb_.CreateICmpSGT(lhs, rhs)
```
You could use `CreateICmp` and `llvm_comparison_predicate` here and avoid the ternary; I had to do a similar fix for comparisons.
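Roughly the shape of that suggestion, sketched against the raw LLVM `IRBuilder` API rather than guessing at `llvm_comparison_predicate`'s exact signature: pick the predicate up front, then issue a single `CreateICmp` call.

```cpp
// Select signed vs. unsigned greater-than, then emit one comparison.
auto pred = v->dtype().is_signed() ? llvm::ICmpInst::ICMP_SGT
                                   : llvm::ICmpInst::ICMP_UGT;
auto icmp = irb_.CreateICmp(pred, lhs, rhs);
```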
```diff
@@ -688,7 +689,9 @@ Tensor* TensorExprKernel::computeThreeOperand(
       tensorOrConstant(n->inputs()[2], indices),
   };
 
-  promoteInputs(inputs);
+  if (promote_inputs) {
```
Should we still be demoting output with this flag?
Yes, if I understand correctly, the output type should still be unchanged.
Related: #49178
This pull request has been merged in ae88d25.