Conversation

narendasan (Collaborator)

Description

When INT8 is enabled, we also need to enable FP16 kernels in case they provide a faster path. This explains the significant performance differences between FP16 and INT8 in cases where most layers are not mapped to INT8 kernels.

Partially addresses #93
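
Below is a minimal sketch of the idea against the TensorRT C++ builder API. It is illustrative only, not the actual patch; the `enableMixedPrecision` helper is a hypothetical name:

```cpp
#include "NvInfer.h"

// Sketch: when INT8 is requested, also set the FP16 builder flag so the
// kernel selector can fall back to FP16 kernels for layers that have no
// INT8 implementation, rather than falling all the way back to FP32.
void enableMixedPrecision(nvinfer1::IBuilderConfig* config, bool fp16, bool int8) {
  if (fp16 || int8) {
    config->setFlag(nvinfer1::BuilderFlag::kFP16);
  }
  if (int8) {
    config->setFlag(nvinfer1::BuilderFlag::kINT8);
    // INT8 additionally requires a calibrator or explicit dynamic ranges,
    // which is out of scope for this sketch.
  }
}
```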

Type of change

Please delete options that are not relevant and/or add your own.

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation and have regenerated the documentation (make html in docsrc)
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes

overridden

Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
Adds evaluators (see the sketch after this list) for:
- aten::eq
- aten::ne
- aten::lt
- aten::gt
- aten::le
- aten::ge
- aten::add
- aten::sub
- aten::mul
- aten::Bool
- aten::Float
- aten::__not__
- aten::__is__
- aten::__isnot__
- aten::numel
- aten::dim
- aten::div
- aten::floordiv
- aten::floor
- aten::warn
- prim::min
- prim::max
- prim::shape
- prim::unchecked_cast
- prim::Uninitialized
- prim::RaiseException
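
For context, a conversion-time evaluator folds such an op on statically known `IValue`s instead of emitting a TensorRT layer. The sketch below shows the concept for `aten::eq`; the function name and signature are hypothetical, not TRTorch's actual registration API:

```cpp
#include <torch/script.h>  // c10::IValue, c10::optional

// Hypothetical sketch of an aten::eq evaluator: if both inputs are
// statically known, compute the comparison at conversion time;
// otherwise report that the node cannot be evaluated.
c10::optional<c10::IValue> evalAtenEq(const c10::IValue& a, const c10::IValue& b) {
  if (a.isInt() && b.isInt()) {
    return c10::IValue(a.toInt() == b.toInt());
  }
  if (a.isDouble() && b.isDouble()) {
    return c10::IValue(a.toDouble() == b.toDouble());
  }
  return c10::nullopt;  // inputs not statically evaluable
}
```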

Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
int8: do not enable fp16 kernels

Signed-off-by: Naren Dasan <naren@narendasan.com>
Signed-off-by: Naren Dasan <narens@nvidia.com>
github-actions bot added labels on Jun 11, 2020: component: conversion (Issues re: Conversion stage), component: converters (Issues re: Specific op converters), component: evaluators (Issues re: Specific op evaluators)
narendasan closed this on Jun 11, 2020
narendasan deleted the narendasan/int8_mixed_precision_fix branch on July 15, 2020
mfeliz-cruise added a commit to mfeliz-cruise/Torch-TensorRT that referenced this pull request on Jan 5, 2023:

Support int inputs to aten::max/min and aten::argmax/argmin by casting to float (pytorch#94)

* Support int inputs to aten::max/min and aten::argmax/argmin by casting to float
* correct layer name
* address nit, remove local variable
mfeliz-cruise added another commit with the same message to mfeliz-cruise/Torch-TensorRT that referenced this pull request on Jan 9, 2023 (see the sketch below).
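
The technique in those commits, as a hedged sketch: TensorRT layers such as ITopKLayer (used for aten::max/argmax) have not accepted integer inputs, so the converter can first cast an Int32 tensor to Float32 through an identity layer. The `castToFloat` helper name below is hypothetical:

```cpp
#include "NvInfer.h"

// Hypothetical helper: cast an Int32 ITensor to Float32 via an identity
// layer so it can feed layers that reject integer inputs.
nvinfer1::ITensor* castToFloat(nvinfer1::INetworkDefinition& net,
                               nvinfer1::ITensor* in) {
  auto* identity = net.addIdentity(*in);
  identity->setOutputType(0, nvinfer1::DataType::kFLOAT);
  return identity->getOutput(0);
}
```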