Inquiry about Layer Performance of FP16 #3876
Comments
You can upload the onnx.
It's a vectorized format, and TRT will pad the tensor to the target format. You can refer to https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#data-format-desc
@zerollzeng Oh I see. However, the data format of each layer is auto-chosen for the best performance, right? When I convert on my Jetson Nano, the layers are converted to the datatype "Two wide channel vectorized row major FP16 format".
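(For concreteness, the padding described above can be sketched as simple round-up arithmetic. The snippet below is an illustration based on the format descriptions in the linked docs, not output from TensorRT itself.)

```python
# Illustration of vectorized-format channel padding, based on the
# data-format descriptions in the docs linked above (not TRT output).
def padded_channels(c: int, vector_width: int) -> int:
    """Round a channel count up to the next multiple of the vector width."""
    return ((c + vector_width - 1) // vector_width) * vector_width

c = 25  # channel dim of a [1, 25, 160, 160] tensor
print(padded_channels(c, 2))  # CHW2, "channel % 2 == 0" -> 26
print(padded_channels(c, 8))  # CHW8, "channel % 8 == 0" -> 32
```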
@lix19937 Here is my onnx v8s_pruned. This ONNX is exported from Ultralytics, so it has metadata. I use the Python script below to convert it:
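(The script itself did not survive in this thread. Below is a minimal sketch of a typical ONNX-to-engine conversion with the TensorRT Python API; it assumes TensorRT 8.x and the file names v8s_pruned.onnx / v8s_pruned.engine, and is not necessarily the author's exact script.)

```python
# Minimal sketch of an ONNX -> TensorRT FP16 engine build using the
# TensorRT Python API. Assumptions: TensorRT 8.x, input file
# "v8s_pruned.onnx", output file "v8s_pruned.engine".
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("v8s_pruned.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("ONNX parse failed")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # allow FP16 kernels/formats

serialized = builder.build_serialized_network(network, config)
with open("v8s_pruned.engine", "wb") as f:
    f.write(serialized)
```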
Description
Hi, I'm new to TensorRT and I'm trying to understand layer performance. I read the doc Optimizing for Tensor Cores and saw that with FP16 precision, the tensor dimensions should be multiples of 8 or 16.
So I converted an ONNX model to an engine, then printed the layer information. Here is part of it:
I see the descriptions "Format/Datatype": "Channel major FP16 format where channel % 8 == 0" and "Format/Datatype": "Channel major FP16 format where channel % 2 == 0". I don't know what these mean, because my channel count is not divisible by 8 ("Dimensions": [1,25,160,160]). Is my model optimized? Sorry for my bad English.
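(For anyone who wants to reproduce the quoted layer records: they can be dumped with trtexec --profilingVerbosity=detailed --dumpLayerInfo, or via the engine inspector API. Below is a sketch assuming TensorRT 8.4+ and a serialized engine file named v8s_pruned.engine.)

```python
# Sketch: dump per-layer records (including the "Format/Datatype"
# strings quoted above) from a built engine. Assumes TensorRT >= 8.4
# and an engine file named "v8s_pruned.engine". Full detail requires
# the engine to have been built with
# config.profiling_verbosity = trt.ProfilingVerbosity.DETAILED.
import tensorrt as trt

logger = trt.Logger(trt.Logger.INFO)
runtime = trt.Runtime(logger)
with open("v8s_pruned.engine", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

inspector = engine.create_engine_inspector()
print(inspector.get_engine_information(trt.LayerInformationFormat.JSON))
```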
Environment
TensorRT Version:
NVIDIA GPU:
NVIDIA Driver Version:
CUDA Version:
CUDNN Version:
Operating System:
Python Version (if applicable):
Tensorflow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if so, version):
Relevant Files
Model link:
Steps To Reproduce
Commands or scripts:
Have you tried the latest release?:
Can this model run on other frameworks? For example, run the ONNX model with ONNXRuntime (
polygraphy run <model.onnx> --onnxrt
):