
feat: support int64 <=> int32 auto conversion #1407

Merged: 5 commits, merged Nov 14, 2022
Conversation

bowang007 (Collaborator)

Signed-off-by: Bo Wang bowa@nvidia.com

Description

Support int64 <=> int32 type conversion.

Fixes #1382

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that the relevant reviewers are notified

Signed-off-by: Bo Wang <bowa@nvidia.com>
@github-actions bot left a comment:

Code conforms to C++ style guidelines

@github-actions bot left a comment:

Code conforms to Python style guidelines


}
}
}
// TODO: This part might be necessary for some models, now checking to verify
Collaborator:

@bowang007 should this be uncommented?

bowang007 (Collaborator, Author):

Yes, this has been optimized and refactored; this part is now included.

@narendasan (Collaborator):

@inocsin Can you verify these changes on key models?

@narendasan (Collaborator):

Seems like the partitioning tests are failing?

@inocsin (Contributor) commented Oct 19, 2022:

> @inocsin Can you verify these changes on key models?

Sure, we have asked users to test this PR with their models.

Signed-off-by: Bo Wang <bowa@nvidia.com>

Signed-off-by: Bo Wang <bowa@nvidia.com>
@github-actions github-actions bot added the component: tests Issues re: Tests label Oct 20, 2022

@github-actions bot left a comment:

There are some changes that do not conform to C++ style guidelines:

diff --git a/home/runner/work/TensorRT/TensorRT/tests/core/partitioning/test_type_auto_conversion.cpp b/tmp/changes.txt
index ad68c0a..d7b7e23 100644
--- a/home/runner/work/TensorRT/TensorRT/tests/core/partitioning/test_type_auto_conversion.cpp
+++ b/tmp/changes.txt
@@ -12,8 +12,8 @@ bool checkInsertedCastNodeNumber(torch_tensorrt::core::partitioning::SegmentedBl
      cnt++;
    }
  }
-  std::cout << "Found count of " << cnt << " inserted aten::to nodes, (looking for " << target_count << " aten::to nodes)"
-            << std::endl;
+  std::cout << "Found count of " << cnt << " inserted aten::to nodes, (looking for " << target_count
+            << " aten::to nodes)" << std::endl;

  return target_count == cnt;
}
@@ -61,7 +61,6 @@ TEST(Partitioning, ExplicitNodeAutoConversionCorrectly) {
    LOG_DEBUG(seg_block << " cur seg block");
  }
  ASSERT_TRUE(checkInsertedCastNodeNumber(segmented_blocks[1], 2));
-
}

TEST(Partitioning, ImplicitAutoConversionCorrectly) {
@@ -105,5 +104,3 @@ TEST(Partitioning, ImplicitAutoConversionCorrectly) {
  }
  ASSERT_TRUE(checkInsertedCastNodeNumber(segmented_blocks[1], 2));
}
-
-
ERROR: Some files do not conform to style guidelines

@narendasan narendasan added the release: v1.3 Tagged to be included in v1.3 label Oct 25, 2022
@gs-olive (Collaborator):

When compiling BART (https://huggingface.co/facebook/bart-base) using Torch-TensorRT, this PR currently segfaults on my machine. I will add an additional comment with the line that is causing the issue.


gs-olive (Collaborator) left a review:

With the suggested edits, the PR is functioning for the BART model and successfully casts Long tensors to Int tensors. The suggestions relate to bugs arising from input vs. output checking and from the use of different aten::to schemas.

auto const_zero = g->insertConstant(0);
const_zero->setType(torch::jit::BoolType::get());
auto none_val = g->insertNode(g->createNone())->output();
cast_node = g->create(torch::jit::aten::to, {g->inputs()[index], const_type, const_zero, const_zero, none_val});
Collaborator:

Add an if/else here to use g->inputs() if is_input is true, otherwise use g->outputs().

}

torch::jit::Node* createCastNode(SegmentedBlock& seg_block, size_t index, bool is_input) {
torch::jit::Node* cast_node = getUpstreamCastNode(seg_block.raw_inputs()[index]);
Collaborator:

Add an if/else here to use raw_inputs() if is_input is true, otherwise use raw_outputs().

// if we can find an upstream aten::to node, we use its parameters for creating the new cast node
if (cast_node) {
std::unordered_map<torch::jit::Value*, torch::jit::Value*> value_map;
value_map.insert({cast_node->inputs()[0], g->inputs()[index]});
Collaborator:

May need an if/else here to check whether the insert should use g->inputs()[index] or g->outputs()[index].

Comment on lines 105 to 107
// auto cast_node = g->prependNode(g->create(torch::jit::aten::to, {g->inputs()[i], const_type, const_zero,
// const_zero, none_val})); seg_block.inputs()[i]->replaceAllUsesAfterNodeWith(cast_node,
// cast_node->outputs()[0]); LOG_DEBUG(seg_block << " in shape analysis");
Collaborator:

Consider removing the commented-out code if it is not needed.

if (!is_input) {
// if this value is output, we need to cast it to int32
auto const_val = g->insertConstant(3);
value_map.insert({cast_node->inputs()[1], const_val});
Collaborator:

This throws an error when the upstream aten::to node does not have dtype as its second argument. For example, the schema aten::to.prim_Device(Tensor(a) self, Device? device, int? dtype=None, bool non_blocking=False, bool copy=False) -> Tensor(b|a) has Device as its second value, and this insertion transforms it into an invalid schema. We need to differentiate between schemas to ensure the dtype is placed in the right position. Valid schemas for aten::to appear to have dtype as either the second or the third argument, or not at all. I believe there should be a check in getUpstreamCastNode to see whether dtype is among the arguments, and then a second check here to see whether it is the second or the third argument in the schema.

The check here could be something like an if/else on the type kind of the second argument, as in:

if (cast_node->inputs()[1]->node()->output()->type()->kind() == torch::jit::TypeKind::DeviceObjType) {
  value_map.insert({cast_node->inputs()[2], const_val});
} else {
  value_map.insert({cast_node->inputs()[1], const_val});
}

auto cur_val = q.front();
q.pop();
auto node = cur_val->node();
if (node->kind().toQualString() == std::string("aten::to")) {
Collaborator:

May need an additional check here to ensure that the aten::to schema is valid for dtype insertion, as some of these schemas do not take an integer dtype at all, for example:

  • aten::to(Tensor(a) self, bool non_blocking=False, bool copy=False) -> Tensor(b|a)
  • aten::to(Tensor(a) self, Device device, ScalarType dtype, bool non_blocking=False, bool copy=False, MemoryFormat? memory_format=None) -> Tensor(a)
  • aten::to(Tensor(a) self, Tensor other, bool non_blocking=False, bool copy=False, MemoryFormat? memory_format=None) -> Tensor(a)

A check could be something like an additional && with

(node->inputs()[1]->node()->output()->type()->kind() == torch::jit::TypeKind::IntType) ||
(node->inputs()[2]->node()->output()->type()->kind() == torch::jit::TypeKind::IntType)

bowang007 (Collaborator, Author):

Hey @gs-olive, do you have a reproducer for this?
What I'm not sure about: in the getUpstreamCastNode() function, when we pass in an int32 value, will the first cast node found be the one that casts this value to int64? If so, we don't need this check.
In other words, is it possible that the first cast node involving the passed value casts some other value? And if the first cast node is not the one that casts to int64, will the second cast node be what we want?

gs-olive (Collaborator) commented Nov 8, 2022:

Hi @bowang007 - as an update, while this is no longer throwing an error on my end, my thought was that we do need this check, but that it should perhaps be more stringent - something like:

    if ((node->kind().toQualString() == std::string("aten::to")) &&
        ((node->inputs()[1]->node()->output()->type()->kind() == torch::jit::TypeKind::IntType) ||
        (node->inputs()[2]->node()->output()->type()->kind() == torch::jit::TypeKind::IntType))) {

This is because, when the aten::to matches the second schema in my comment above, inserting a constant like 3 will cause the model to fail, since that schema requires a ScalarType rather than an int. I don't have a specific model that reproduces the error, and I do not think I encountered one while testing; I just think it is generally safer to be strict about the type of upstream cast node that is repurposed to recast to Int32. Specifically, if we are unsure whether a node has a valid schema for repurposing, we should choose the safer option and manually insert an Int32 cast node, as you do in createCastNode.

gs-olive (Collaborator) commented Nov 8, 2022:

@bowang007 Please let me know what you think about the comment in the thread above:
#1407 (comment)

bowang007 (Collaborator, Author):

@gs-olive I get your point now; let me update this part.

core/partitioning/shape_analysis.cpp (comment resolved)
@ncomly-nvidia (Contributor):

Fixes #1346

Signed-off-by: Bo Wang <bowa@nvidia.com>

@gs-olive gs-olive self-requested a review November 8, 2022 22:42

gs-olive (Collaborator) left a review:

One additional bugfix requested and one minor optional comment; with those, the PR succeeds for compilation + inference on BART.


torch::jit::Node* createCastNode(SegmentedBlock& seg_block, size_t index, bool is_input) {
auto cast_raw_value = is_input ? seg_block.raw_inputs()[index] : seg_block.raw_outputs()[index];
auto cast_subgraph_value = is_input ? seg_block.outputs()[index] : seg_block.outputs()[index];
Collaborator:

Please update this line to:

auto cast_subgraph_value = is_input ? seg_block.inputs()[index] : seg_block.outputs()[index];

Currently, it uses the outputs regardless of the truth value of is_input. With this change, the PR (used along with PR #1416) works for compilation + inference with the BART model.

auto cur_val = q.front();
q.pop();
auto node = cur_val->node();
if (node->kind().toQualString() == std::string("aten::to")) {
gs-olive (Collaborator) commented Nov 8, 2022:

@bowang007 Please let me know what you think about the comment in the thread above:
#1407 (comment)

Signed-off-by: Bo Wang <bowa@nvidia.com>
@github-actions bot left a comment:

Code conforms to C++ style guidelines

@github-actions bot left a comment:

Code conforms to Python style guidelines

@gs-olive (Collaborator) commented Nov 9, 2022:

Looks good! Fully functional now on BART!

Labels
cla signed, component: core (Issues re: The core compiler), component: partitioning, component: tests (Issues re: Tests), release: v1.3 (Tagged to be included in v1.3)
Development

Successfully merging this pull request may close these issues.

✨[Feature] Automatic conversion for int32<->int64 in fallback
6 participants