fix: Error with aten::div when using truncation with Int32 tensor inputs #1442

Merged
merged 1 commit on Nov 18, 2022

@gs-olive gs-olive commented Nov 4, 2022

Description

  • aten::div with truncation currently throws an error when both inputs are integer tensors, because the TRT unary operations for absolute value and floor do not accept Int32 or Bool types
  • For absolute value, this is a legitimate bug, since aten::abs works on integer types
  • For the floor operation, aten::floor does not explicitly support integer inputs, and torch.floor() does not work with Int32 inputs by default (on 1.13.0.dev20220921+cu116). However, torch.div(..., rounding_mode="trunc") with integer tensors does return an integer value, so the corresponding Torch-TRT converter should behave similarly
  • Moved the aten::abs converter logic into a utility (in a different file), as the operation is used in multiple locations
  • Added a regression test to ensure that truncation divide with two integer tensors works
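
To make the reasoning in the list above concrete, here is a minimal standalone sketch (plain C++, no TensorRT) of truncating division written as sign(q) * floor(|q|). The reviewed snippet shows the abs and floor steps; the multiplication by the sign is an assumption about the rest of the converter, and the function names `trunc_div_float` and `trunc_div_int` are hypothetical. For integer operands the quotient is already integral, so the floor step reduces to the identity.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Truncating division for floating-point values, written as sign(q) * floor(|q|).
// The sign multiplication is an assumption; only abs and floor appear in the PR excerpt.
double trunc_div_float(double a, double b) {
  assert(b != 0.0);
  const double q = a / b;
  const double sign = (q > 0.0) - (q < 0.0);
  return sign * std::floor(std::fabs(q));
}

// Same decomposition for 32-bit integers. The quotient is already integral,
// so floor(|q|) == |q| and no floor step is needed.
// Edge cases around INT32_MIN are ignored in this sketch.
std::int32_t trunc_div_int(std::int32_t a, std::int32_t b) {
  assert(b != 0);
  const std::int32_t q = a / b;  // C++ integer division truncates toward zero
  const std::int32_t sign = (q > 0) - (q < 0);
  const std::int32_t abs_q = q < 0 ? -q : q;
  return sign * abs_q;
}
```

Both functions round toward zero, which is the behavior `rounding_mode="trunc"` describes; the integer version simply never needs a floor.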

Note: The behavior of torch.floor() on Int32 types differs between 1.13.0.dev20220921+cu116 and 1.14.0.dev20221018+cu116: the former does not support this operation by default, while the latter does. This PR does not fix the general aten::floor operator for Int32 inputs; it fixes only the aten::div truncation operator.

Fixes #1441
Note: The issue was traced to a problem with aten::div with truncation enabled, and not aten::floor

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • [ x ] My code follows the style guidelines of this project (You can use the linters)
  • [ x ] I have performed a self-review of my own code
  • [ x ] I have commented my code, particularly in hard-to-understand areas and hacks
  • [ x ] I have made corresponding changes to the documentation
  • [ x ] I have added tests to verify my fix or my feature
  • [ x ] New and existing unit tests pass locally with my changes
  • [ x ] I have added the relevant labels to my PR so that relevant reviewers are notified

@github-actions github-actions bot added the component: conversion, component: converters, component: core, and component: tests labels Nov 4, 2022
Comment on lines 329 to 345
auto abs = add_absolute_value(ctx, n, tmp_div->getOutput(0), util::node_info(n) + "_absolute_val");

// In this case, we allow the floor unary on non-TRT Unary types, as it is needed for this
// specific function. Floor applied to non-float types equates to identity
nvinfer1::ILayer* floor;
if ((abs->getOutput(0)->getType() == nvinfer1::DataType::kINT32) ||
    (abs->getOutput(0)->getType() == nvinfer1::DataType::kBOOL)) {
  LOG_GRAPH(
      "Tensor is of unsupported type " << abs->getOutput(0)->getType()
                                       << " for IUnaryLayer::kFLOOR. Using identity instead.");
  floor = ctx->net->addIdentity(*abs->getOutput(0));
  TORCHTRT_CHECK(floor, "Unable to create identity layer from node: " << *n);
} else {
  floor = ctx->net->addUnary(*abs->getOutput(0), nvinfer1::UnaryOperation::kFLOOR);
  TORCHTRT_CHECK(floor, "Unable to create floor layer from node: " << *n);
}
floor->setName((util::node_info(n) + "_floor").c_str());
Collaborator Author

In this code block, both the abs and the floor operators raised errors when both inputs were integer types. The abs fix is a shared converter utility, whereas the floor fix is local to this block. The reason is that Torch support for torch.floor() on Int32 inputs differs between 1.13.0.dev20220921+cu116 and 1.14.0.dev20221018+cu116, so it is currently unclear whether the aten::floor converter should support Int32 inputs in general.

@gs-olive gs-olive added the release: v1.3 label Nov 9, 2022
@gs-olive gs-olive self-assigned this Nov 9, 2022
@@ -42,6 +42,12 @@ nvinfer1::ILayer* add_elementwise(
nvinfer1::ITensor* other,
const std::string& name);

nvinfer1::ILayer* add_absolute_value(
Collaborator

rename to add_abs

@@ -42,6 +42,12 @@ nvinfer1::ILayer* add_elementwise(
nvinfer1::ITensor* other,
const std::string& name);

nvinfer1::ILayer* add_absolute_value(
Collaborator

Return nvinfer1::ITensor*

nvinfer1::ILayer* floor;
if ((abs->getOutput(0)->getType() == nvinfer1::DataType::kINT32) ||
(abs->getOutput(0)->getType() == nvinfer1::DataType::kBOOL)) {
LOG_GRAPH(
Collaborator

LOG_DEBUG instead of LOG_GRAPH


// In this case, we allow the floor unary on non-TRT Unary types, as it is needed for this
// specific function. Floor applied to non-float types equates to identity
nvinfer1::ILayer* floor;
Collaborator

Work with ITensor* instead ideally

floor = ctx->net->addUnary(*abs->getOutput(0), nvinfer1::UnaryOperation::kFLOOR);
TORCHTRT_CHECK(floor, "Unable to create floor layer from node: " << *n);
}
floor->setName((util::node_info(n) + "_floor").c_str());
Collaborator

This needs to only be applied to the unary layer on line 342 so as not to overwrite the info from the abs on 329

TORCHTRT_CHECK(absolute_value_layer, "Unable to create max layer from node: " << *n);
}

return absolute_value_layer->getOutput(0);
Collaborator Author

Switched the function signature to return the output tensor of the absolute value layer rather than the layer itself
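
The "max layer" wording in the error message above suggests that the utility may compute the integer absolute value as an elementwise maximum of x and -x. That reading is an assumption and is not confirmed by the excerpt. A minimal standalone sketch of the idea, with the hypothetical name `abs_via_max` and plain vectors standing in for tensors:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Elementwise |x| for Int32 values, computed as max(x, -x).
// This avoids any dependence on a unary abs operation that rejects integer types.
std::vector<std::int32_t> abs_via_max(const std::vector<std::int32_t>& x) {
  std::vector<std::int32_t> out;
  out.reserve(x.size());
  for (const std::int32_t v : x) {
    // -v overflows for the minimum value, so this sketch excludes it.
    assert(v != std::numeric_limits<std::int32_t>::min());
    out.push_back(std::max(v, -v));
  }
  return out;
}
```

Returning the result values directly, rather than a handle to the operation that produced them, mirrors the signature change described in the comment above.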

LOG_DEBUG(
"Tensor is of unsupported type " << abs->getType()
<< " for IUnaryLayer::kFLOOR. Using identity instead.");
floor = abs;
Collaborator Author

Instead of using the identity function, the floor output tensor is just set to the absolute value result, to avoid unnecessary computation.
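
The pass-through described in this comment can be sketched without TensorRT. Everything below is a hypothetical stand-in (`DataType`, `Tensor`, `Network`, `floor_or_passthrough`), not the nvinfer1 API; it only illustrates that reusing the abs output for Int32 and Bool inputs adds no layer to the network, while float inputs still get a floor layer.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the TensorRT types used by the converter.
enum class DataType { kFLOAT, kHALF, kINT32, kBOOL };

struct Tensor {
  DataType type;
  std::string producer;  // name of the layer that produced this tensor
};

struct Network {
  int num_layers = 0;

  // Mock of adding a floor layer: counts the layer and names its output.
  Tensor add_floor(const Tensor& in, const std::string& name) {
    ++num_layers;
    return Tensor{in.type, name};
  }
};

// Floor is the identity on integral and boolean values, so the abs output
// is returned unchanged for those types and no layer is created.
Tensor floor_or_passthrough(Network& net, const Tensor& abs, const std::string& name) {
  if (abs.type == DataType::kINT32 || abs.type == DataType::kBOOL) {
    return abs;
  }
  return net.add_floor(abs, name);
}
```

In this sketch the name is assigned only when a floor layer is actually created, so the pass-through branch leaves the abs tensor's producer name untouched, which is the concern raised in the earlier review comment about `setName`.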

- `aten::div` with truncation on integer tensor inputs currently throws
an error if both inputs are integer type, as the TRT unary operations
for absolute value and floor do not apply to Int32 or Bool types
- For absolute value, this is a legitimate bug as `aten::abs` is
functional for integer types
- For the floor operation, `aten::floor` does not explicitly support
integer inputs, and `torch.floor()` does not work with Int32 inputs by
default. However, `torch.div(..., rounding_mode="trunc")` with integer
tensors does return an integer value, and so the corresponding Torch-TRT
converter should behave similarly
- Modified `aten::abs` converter logic to be a utility, as it is used in
multiple locations
- Added regression test to ensure truncation divide with two integer
tensors is functional

- Address comments on PR

  - Update utility name to add_abs for conciseness
  - Refactor absolute value utility to return ITensor*
  - Update logging level for certain debug messages
Collaborator

@narendasan narendasan left a comment

LGTM

@narendasan narendasan merged commit 3ee60b7 into pytorch:master Nov 18, 2022
@gs-olive gs-olive deleted the trunc_div_bugfix branch November 18, 2022 04:11
Labels
cla signed, component: conversion, component: converters, component: core, component: tests, release: v1.3
Development

Successfully merging this pull request may close these issues.

🐛 [Bug] Bug encountered when compiling Jasper10x5dr network
3 participants