Detected subnormal FP16 values. --precisionConstraints --layerPrecisions didn't work #2600
Comments
@zerollzeng I don't understand what '||' means: "OR" or "AND"? Also, how should I handle the unnamed layers?
This is a warning; it might not affect the final accuracy. If you want to get rid of this warning entirely, I think the better solution is to retrain your model and restrict all the weights to within the FP16 range.
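As a sketch of what "restrict all the weights to within the FP16 range" could look like in practice (a numpy-only illustration, not the retraining itself; `clamp_to_fp16_range` and its behavior are my own naming and an assumption about the intent):

```python
import numpy as np

fp16 = np.finfo(np.float16)
MIN_NORMAL = fp16.tiny  # smallest positive normal FP16 value, 2**-14
MAX_FP16 = fp16.max     # largest finite FP16 value, 65504.0

def clamp_to_fp16_range(w: np.ndarray, zero_subnormals: bool = True) -> np.ndarray:
    """Adjust FP32 weights so casting to FP16 produces no subnormals or overflow."""
    w = w.astype(np.float32).copy()
    # Clip magnitudes that would overflow FP16.
    w = np.clip(w, -MAX_FP16, MAX_FP16)
    # Find nonzero magnitudes that would become FP16 subnormals.
    small = (np.abs(w) > 0) & (np.abs(w) < MIN_NORMAL)
    if zero_subnormals:
        w[small] = 0.0  # flush tiny weights to zero
    else:
        w[small] = np.sign(w[small]) * MIN_NORMAL  # round up to smallest normal
    return w

w = np.array([1e-9, 3e-5, 0.5, 7e4], dtype=np.float32)
print(clamp_to_fp16_range(w))
```

At training time the same effect could be approximated by applying such a clamp to the weights after each optimizer step, or by regularizing weight magnitudes so they stay well inside the FP16 range.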
It affects the accuracy.
You don't need to care about it.
Falling back many layers to FP32 may lead to a huge performance degradation; that's why I suggest retraining the model.
OK, thanks.
I am having a similar issue when using torch_tensorrt. How can I retrain my model within the FP16 range? I tried retraining with GradScaler as shown in the AMP training tutorials for PyTorch. Do you mean a different approach? If so, which one? Thank you.
I would suggest you use the TensorRT Python API for easy experiments, and read this similar issue: #2922
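A minimal sketch of the per-layer precision approach with the TensorRT Python API (assumptions: the layer names below are hypothetical placeholders; in a real model you must use the layer names from the verbose build log, which are not the same as weight names like Conv_240.weight):

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("encoder2.onnx", "rb") as f:
    parser.parse(f.read())

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)
# Make TensorRT honor the per-layer precision set below.
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)

keep_fp32 = {"Conv_240", "Conv_263", "Conv_286"}  # hypothetical layer names
for i in range(network.num_layers):
    layer = network.get_layer(i)
    if layer.name in keep_fp32:
        layer.precision = trt.float32
        layer.set_output_type(0, trt.float32)

engine_bytes = builder.build_serialized_network(network, config)
```

This makes it easy to iterate: inspect `network.num_layers` and each `layer.name` interactively, then pin only the layers whose weights trigger the warning.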
Closing inactive issues, thanks all!
Could you tell me how to restrict all the weights to within the FP16 range? It has confused me for a long time, thank you.
Description
I converted a very complex ONNX model to a TensorRT FP32 engine; the outputs of the ONNX model and the TRT FP32 engine are the same. The command is as follows:
trtexec --onnx=encoder2.onnx --fp16 --saveEngine=encoderfp16.trt --useCudaGraph --verbose --tacticSources=-cublasLt,+cublas --workspace=10240M --minShapes=src_tokens:1x1000 --optShapes=src_tokens:1x100000 --maxShapes=src_tokens:1x700000 --preview=+fasterDynamicShapes0805 >log.en
Everything is OK.
However, when I convert the ONNX model to a TensorRT FP16 engine, the output is very different and some weights are affected:
[01/13/2023-10:07:54] [W] [TRT] onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[01/13/2023-10:09:43] [W] [TRT] Using kFASTER_DYNAMIC_SHAPES_0805 preview feature.
[01/13/2023-10:20:45] [W] [TRT] TensorRT encountered issues when converting weights between types and that could affect accuracy.
[01/13/2023-10:20:45] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
[01/13/2023-10:20:45] [W] [TRT] Check verbose logs for the list of affected weights.
[01/13/2023-10:20:45] [W] [TRT] - 254 weights are affected by this issue: Detected subnormal FP16 values.
[01/13/2023-10:20:45] [W] [TRT] - 31 weights are affected by this issue: Detected values less than smallest positive FP16 subnormal value and converted them to the FP16 minimum subnormalized value.
Detailed log in log.en.
List of affected weights: Conv_240.weight, Conv_263.weight, Conv_286.weight, Conv_309.weight, Conv_332.weight, Conv_355.weight, Conv_378.weight, Conv_416.weight, Conv_5165.bias, Conv_5165.weight, Conv_5169.weight, Conv_5173.bias, Conv_5173.weight, Gemm_1046.bias, Gemm_1046.weight, Gemm_1239.weight, Gemm_1432.bias, Gemm_1432.weight, Gemm_1625.weight, Gemm_181.....
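For context on what these warnings mean, the relevant FP16 thresholds can be inspected with numpy: magnitudes below 2**-14 become subnormal when cast to FP16, and below about 2**-25 they round to zero entirely (a standalone illustration, not part of the original report):

```python
import numpy as np

fp16 = np.finfo(np.float16)
print(fp16.tiny)                # smallest positive normal FP16 value, 2**-14
print(fp16.smallest_subnormal)  # smallest positive subnormal value, 2**-24

# Casting FP32 weights below these thresholds loses precision exactly as
# the trtexec warnings describe.
w32 = np.array([1e-4, 1e-5, 1e-8], dtype=np.float32)
w16 = w32.astype(np.float16)
subnormal = (np.abs(w16) > 0) & (np.abs(w16) < fp16.tiny)
flushed = (w32 != 0) & (w16 == 0)
print(subnormal)  # [False  True False] -> 1e-5 becomes a subnormal
print(flushed)    # [False False  True] -> 1e-8 underflows to zero
```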
I want to use --precisionConstraints and --layerPrecisions to restrict some weights to FP32; the command is as follows:
trtexec --onnx=encoder2.onnx --fp16 --saveEngine=encoderfp16.trt --useCudaGraph --verbose --tacticSources=-cublasLt,+cublas --workspace=10240M --minShapes=src_tokens:1x1000 --optShapes=src_tokens:1x100000 --maxShapes=src_tokens:1x700000 --preview=+fasterDynamicShapes0805 --precisionConstraints=obey --layerPrecisions=Conv_240.weight:fp32, Conv_263.weight:fp32, Conv_286.weight:fp32 >log.en1
But the log output is the same:
[01/13/2023-10:42:38] [W] [TRT] onnx2trt_utils.cpp:377: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[01/13/2023-10:44:28] [W] [TRT] Using kFASTER_DYNAMIC_SHAPES_0805 preview feature.
[01/13/2023-10:55:24] [W] [TRT] TensorRT encountered issues when converting weights between types and that could affect accuracy.
[01/13/2023-10:55:24] [W] [TRT] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
[01/13/2023-10:55:24] [W] [TRT] Check verbose logs for the list of affected weights.
[01/13/2023-10:55:24] [W] [TRT] - 254 weights are affected by this issue: Detected subnormal FP16 values.
[01/13/2023-10:55:24] [W] [TRT] - 31 weights are affected by this issue: Detected values less than smallest positive FP16 subnormal value and converted them to the FP16 minimum subnormalized value.
[01/13/2023-10:55:26] [W] [TRT] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage. See CUDA_MODULE_LOADING in https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars
Detailed log in log.en1.
Environment
TensorRT Version: TensorRT-8.5.2.2.Linux.x86_64-gnu.cuda-11.8.cudnn8.6.tar.gz
GPU Type: Tesla V100-PCIE
CUDA Version: 11.6
CUDNN Version: cudnn-linux-x86_64-8.6.0.163_cuda11-archive
PyTorch Version (if applicable): 1.12.0
onnx: 1.12.0