Conversation

narendasan (Collaborator)

Description

When INT8 is enabled, we also need to enable FP16 kernels in case they provide a faster path. This explains the significant performance differences between FP16 and INT8 in cases where most layers are not mapped to INT8 kernels.

Partially addresses #93
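
Below is a minimal sketch of the idea against the TensorRT C++ builder API. It is illustrative only, not the actual patch; the `enableMixedPrecision` helper is a hypothetical name:

```cpp
#include "NvInfer.h"

// Sketch: when INT8 is requested, also set the FP16 builder flag so the
// kernel selector can fall back to FP16 kernels for layers that have no
// INT8 implementation, rather than falling all the way back to FP32.
void enableMixedPrecision(nvinfer1::IBuilderConfig* config, bool fp16, bool int8) {
  if (fp16 || int8) {
    config->setFlag(nvinfer1::BuilderFlag::kFP16);
  }
  if (int8) {
    config->setFlag(nvinfer1::BuilderFlag::kINT8);
    // INT8 additionally requires a calibrator or explicit dynamic ranges,
    // which is out of scope for this sketch.
  }
}
```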

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation and have regenerated the documentation (make html in docsrc)
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes

overridden

Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
Adds evaluators (see the sketch after this list) for:
- aten::eq
- aten::ne
- aten::lt
- aten::gt
- aten::le
- aten::ge
- aten::add
- aten::sub
- aten::mul
- aten::Bool
- aten::Float
- aten::__not__
- aten::__is__
- aten::__isnot__
- aten::numel
- aten::dim
- aten::div
- aten::floordiv
- aten::floor
- aten::warn
- prim::min
- prim::max
- prim::shape
- prim::unchecked_cast
- prim::Uninitialized
- prim::RaiseException
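
For context, a conversion-time evaluator folds such an op on statically known `IValue`s instead of emitting a TensorRT layer. The sketch below shows the concept for `aten::eq`; the function name and signature are hypothetical, not TRTorch's actual registration API:

```cpp
#include <torch/script.h>  // c10::IValue, c10::optional

// Hypothetical sketch of an aten::eq evaluator: if both inputs are
// statically known, compute the comparison at conversion time;
// otherwise report that the node cannot be evaluated.
c10::optional<c10::IValue> evalAtenEq(const c10::IValue& a, const c10::IValue& b) {
  if (a.isInt() && b.isInt()) {
    return c10::IValue(a.toInt() == b.toInt());
  }
  if (a.isDouble() && b.isDouble()) {
    return c10::IValue(a.toDouble() == b.toDouble());
  }
  return c10::nullopt;  // inputs not statically evaluable
}
```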

Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
int8: do not enable fp16 kernels

Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
github-actions bot added labels on Jun 11, 2020: component: conversion (Issues re: Conversion stage), component: converters (Issues re: Specific op converters), component: evaluators (Issues re: Specific op evaluators)
narendasan closed this on Jun 11, 2020
narendasan deleted the narendasan/int8_mixed_precision_fix branch on July 15, 2020
mfeliz-cruise added a commit to mfeliz-cruise/Torch-TensorRT that referenced this pull request on Jan 5, 2023:

Support int inputs to aten::max/min and aten::argmax/argmin by casting to float (pytorch#94)

* Support int inputs to aten::max/min and aten::argmax/argmin by casting to float
* correct layer name
* address nit, remove local variable
mfeliz-cruise added another commit with the same message to mfeliz-cruise/Torch-TensorRT that referenced this pull request on Jan 9, 2023 (see the sketch below).
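
The technique in those commits, as a hedged sketch: TensorRT layers such as ITopKLayer (used for aten::max/argmax) have not accepted integer inputs, so the converter can first cast an Int32 tensor to Float32 through an identity layer. The `castToFloat` helper name below is hypothetical:

```cpp
#include "NvInfer.h"

// Hypothetical helper: cast an Int32 ITensor to Float32 via an identity
// layer so it can feed layers that reject integer inputs.
nvinfer1::ITensor* castToFloat(nvinfer1::INetworkDefinition& net,
                               nvinfer1::ITensor* in) {
  auto* identity = net.addIdentity(*in);
  identity->setOutputType(0, nvinfer1::DataType::kFLOAT);
  return identity->getOutput(0);
}
```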