
feat: support int64 <=> int32 auto conversion #1407

Merged: 5 commits, merged Nov 14, 2022
Conversation

bowang007 (Collaborator)

Signed-off-by: Bo Wang bowa@nvidia.com

Description

Support int64 <=> int32 type conversion.

Fixes #1382

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that the relevant reviewers are notified

Signed-off-by: Bo Wang <bowa@nvidia.com>
@github-actions bot left a comment:

Code conforms to C++ style guidelines

@github-actions bot left a comment:

Code conforms to Python style guidelines


}
}
}
// TODO: This part might be necessary for some models, now checking to verify
Collaborator:

@bowang007 should this be uncommented?

bowang007 (Collaborator, Author):

Yes, this has been optimized and refactored; this part is now included.

@narendasan (Collaborator):

@inocsin Can you verify these changes on key models?

@narendasan (Collaborator):

Seems like the partitioning tests are failing?

@inocsin (Contributor) commented Oct 19, 2022:

> @inocsin Can you verify these changes on key models?

Sure, we have asked users to test this PR with their models.

Signed-off-by: Bo Wang <bowa@nvidia.com>

Signed-off-by: Bo Wang <bowa@nvidia.com>
@github-actions github-actions bot added the component: tests Issues re: Tests label Oct 20, 2022

@github-actions bot left a comment:

There are some changes that do not conform to C++ style guidelines:

diff --git a/home/runner/work/TensorRT/TensorRT/tests/core/partitioning/test_type_auto_conversion.cpp b/tmp/changes.txt
index ad68c0a..d7b7e23 100644
--- a/home/runner/work/TensorRT/TensorRT/tests/core/partitioning/test_type_auto_conversion.cpp
+++ b/tmp/changes.txt
@@ -12,8 +12,8 @@ bool checkInsertedCastNodeNumber(torch_tensorrt::core::partitioning::SegmentedBl
      cnt++;
    }
  }
-  std::cout << "Found count of " << cnt << " inserted aten::to nodes, (looking for " << target_count << " aten::to nodes)"
-            << std::endl;
+  std::cout << "Found count of " << cnt << " inserted aten::to nodes, (looking for " << target_count
+            << " aten::to nodes)" << std::endl;

  return target_count == cnt;
}
@@ -61,7 +61,6 @@ TEST(Partitioning, ExplicitNodeAutoConversionCorrectly) {
    LOG_DEBUG(seg_block << " cur seg block");
  }
  ASSERT_TRUE(checkInsertedCastNodeNumber(segmented_blocks[1], 2));
-
}

TEST(Partitioning, ImplicitAutoConversionCorrectly) {
@@ -105,5 +104,3 @@ TEST(Partitioning, ImplicitAutoConversionCorrectly) {
  }
  ASSERT_TRUE(checkInsertedCastNodeNumber(segmented_blocks[1], 2));
}
-
-
ERROR: Some files do not conform to style guidelines

@narendasan narendasan added the release: v1.3 Tagged to be included in v1.3 label Oct 25, 2022
@gs-olive (Collaborator):

When compiling BART (https://huggingface.co/facebook/bart-base) using Torch-TensorRT, this PR currently segfaults on my machine. I will add an additional comment with the line that is causing the issue.


gs-olive (Collaborator) left a review:

With the suggested edits, the PR is functioning for the BART model and successfully casts Long tensors to Int tensors. The suggestions relate to bugs arising from input vs. output checking and from the use of different aten::to schemas.

auto const_zero = g->insertConstant(0);
const_zero->setType(torch::jit::BoolType::get());
auto none_val = g->insertNode(g->createNone())->output();
cast_node = g->create(torch::jit::aten::to, {g->inputs()[index], const_type, const_zero, const_zero, none_val});
Collaborator:

Add an if/else here to use g->inputs() if is_input is true, otherwise use g->outputs().

}

torch::jit::Node* createCastNode(SegmentedBlock& seg_block, size_t index, bool is_input) {
torch::jit::Node* cast_node = getUpstreamCastNode(seg_block.raw_inputs()[index]);
Collaborator:

Add an if/else here to use raw_inputs() if is_input is true, otherwise use raw_outputs().

// if we can find an upstream aten::to node, we use its parameters for creating the new cast node
if (cast_node) {
std::unordered_map<torch::jit::Value*, torch::jit::Value*> value_map;
value_map.insert({cast_node->inputs()[0], g->inputs()[index]});
Collaborator:

May need an if/else here to check whether the insert should use g->inputs()[index] or g->outputs()[index].

Comment on lines 105 to 107
// auto cast_node = g->prependNode(g->create(torch::jit::aten::to, {g->inputs()[i], const_type, const_zero,
// const_zero, none_val})); seg_block.inputs()[i]->replaceAllUsesAfterNodeWith(cast_node,
// cast_node->outputs()[0]); LOG_DEBUG(seg_block << " in shape analysis");
Collaborator:

Consider removing the commented-out code if it is not needed.

if (!is_input) {
// if this value is output, we need to cast it to int32
auto const_val = g->insertConstant(3);
value_map.insert({cast_node->inputs()[1], const_val});
Collaborator:

This throws an error when the upstream aten::to node does not have dtype as its second argument. For example, the schema aten::to.prim_Device(Tensor(a) self, Device? device, int? dtype=None, bool non_blocking=False, bool copy=False) -> Tensor(b|a) has Device as its second value, and this insertion transforms it into an invalid schema. We need to differentiate between schemas to ensure the dtype is placed in the right position. Valid schemas for aten::to appear to have dtype as either the second or the third argument, or not at all. I believe there should be a check in getUpstreamCastNode to see whether dtype is among the arguments, and then a second check here to see whether it is the second or the third argument in the schema.

The check here could be something like an if/else on the type kind of the second argument, as in:

if (cast_node->inputs()[1]->node()->output()->type()->kind() == torch::jit::TypeKind::DeviceObjType) {
  value_map.insert({cast_node->inputs()[2], const_val});
} else {
  value_map.insert({cast_node->inputs()[1], const_val});
}

auto cur_val = q.front();
q.pop();
auto node = cur_val->node();
if (node->kind().toQualString() == std::string("aten::to")) {
Collaborator:

May need an additional check here to ensure that the aten::to schema is valid for dtype insertion, as some of these schemas do not take an integer dtype at all, for example:

  • aten::to(Tensor(a) self, bool non_blocking=False, bool copy=False) -> Tensor(b|a)
  • aten::to(Tensor(a) self, Device device, ScalarType dtype, bool non_blocking=False, bool copy=False, MemoryFormat? memory_format=None) -> Tensor(a)
  • aten::to(Tensor(a) self, Tensor other, bool non_blocking=False, bool copy=False, MemoryFormat? memory_format=None) -> Tensor(a)

A check could be something like an additional && with

(node->inputs()[1]->node()->output()->type()->kind() == torch::jit::TypeKind::IntType) ||
(node->inputs()[2]->node()->output()->type()->kind() == torch::jit::TypeKind::IntType)

bowang007 (Collaborator, Author):

Hey @gs-olive, do you have a reproducer for this?
What I'm not sure about: in the getUpstreamCastNode() function, when we pass in an int32 value, will the first cast node found be the one that casts this value to int64? If so, we don't need this check.
In other words, is it possible that the first cast node involving the passed value casts some other value? And if the first cast node is not the one that casts to int64, will the second cast node be what we want?

gs-olive (Collaborator) commented Nov 8, 2022:

Hi @bowang007 - as an update, while this is no longer throwing an error on my end, my thought was that we do need this check, but that it should perhaps be more stringent - something like:

    if ((node->kind().toQualString() == std::string("aten::to")) &&
        ((node->inputs()[1]->node()->output()->type()->kind() == torch::jit::TypeKind::IntType) ||
        (node->inputs()[2]->node()->output()->type()->kind() == torch::jit::TypeKind::IntType))) {

This is because, when the aten::to matches the second schema in my comment above, inserting a constant like 3 will cause the model to fail, since that schema requires a ScalarType rather than an int. I don't have a specific model that reproduces the error, and I do not think I encountered one while testing; I just think it is generally safer to be strict about the type of upstream cast node that is repurposed to recast to Int32. Specifically, if we are unsure whether a node has a valid schema for repurposing, we should choose the safer option and manually insert an Int32 cast node, as you do in createCastNode.

gs-olive (Collaborator) commented Nov 8, 2022:

@bowang007 Please let me know what you think about the comment in the thread above:
#1407 (comment)

bowang007 (Collaborator, Author):

@gs-olive I get your point now; let me update this part.

core/partitioning/shape_analysis.cpp (comment resolved)
@ncomly-nvidia (Contributor):

Fixes #1346

Signed-off-by: Bo Wang <bowa@nvidia.com>

@gs-olive gs-olive self-requested a review November 8, 2022 22:42

gs-olive (Collaborator) left a review:

One additional bugfix requested and one minor optional comment; with those, the PR succeeds for compilation + inference on BART.


torch::jit::Node* createCastNode(SegmentedBlock& seg_block, size_t index, bool is_input) {
auto cast_raw_value = is_input ? seg_block.raw_inputs()[index] : seg_block.raw_outputs()[index];
auto cast_subgraph_value = is_input ? seg_block.outputs()[index] : seg_block.outputs()[index];
Collaborator:

Please update this line to:

auto cast_subgraph_value = is_input ? seg_block.inputs()[index] : seg_block.outputs()[index];

Currently, it uses the outputs regardless of the truth value of is_input. With this change, the PR (used along with PR #1416) works for compilation + inference with the BART model.

auto cur_val = q.front();
q.pop();
auto node = cur_val->node();
if (node->kind().toQualString() == std::string("aten::to")) {
gs-olive (Collaborator) commented Nov 8, 2022:

@bowang007 Please let me know what you think about the comment in the thread above:
#1407 (comment)

Signed-off-by: Bo Wang <bowa@nvidia.com>
@github-actions bot left a comment:

Code conforms to C++ style guidelines

@github-actions bot left a comment:

Code conforms to Python style guidelines

@gs-olive (Collaborator) commented Nov 9, 2022:

Looks good! Fully functional now on BART!

Labels
cla signed, component: core (Issues re: The core compiler), component: partitioning, component: tests (Issues re: Tests), release: v1.3 (Tagged to be included in v1.3)
Development

Successfully merging this pull request may close these issues.

✨[Feature] Automatic conversion for int32<->int64 in fallback
6 participants