
Calibration failure occurred with no scaling factors detected. #129

Closed

azuryl opened this issue Jan 28, 2020 · 17 comments

Comments

@azuryl

azuryl commented Jan 28, 2020

In docker:

1. retinanet export retinanet_rn50fpn.pth retinanet_rn50fpn.onnx

2. retinanet export retinanet_rn50fpn.pth retinanet_rn50fpn_int8_engine.pth --int8 --calibration-images /coco/images/val2017/
   This creates an INT8 calibration table file (Int8CalibrationTable_ResNet50FPN1280x1280_10) that can be used later on to create INT8 TensorRT engines for the same model without needing to redo calibration.

On Xavier:

1. ./export /home/nvidia/project/retinanet_rn50fpn.onnx int8engine.plan /home/nvidia/project/Int8CalibrationTable_ResNet50FPN1280x1280_10

Building engine...
Building INT8 core model...
Building accelerated plugins...
Applying optimizations and building TRT CUDA engine...
Calibration failure occurred with no scaling factors detected. This could be due to no int8 calibrator or insufficient custom scales for network layers. Please see int8 sample to setup calibration correctly.
Builder failed while configuring INT8 mode.
Segmentation fault (core dumped)

@jin-nvidia
Collaborator

jin-nvidia commented Jan 28, 2020

Hello.

Step 1. First we need to modify lines 58-60 in the https://github.com/NVIDIA/retinanet-examples/blob/master/extras/cppapi/export.cpp file:

	const vector<string> calibration_files;
	string model_name = "";
	string calibration_table = argc == 4 ? string(argv[3]) : "";

by 1) specifying the calibration_files vector to contain paths to each calibration image you want to use (the number of images should be at least twice your batch size):

	vector<string> calibration_files;
	calibration_files.push_back("path-to/your-image-1");
	calibration_files.push_back("path-to/your-image-2");
	...
	calibration_files.push_back("path-to/your-image-n");
2) setting model_name to one of the backbone names in https://github.com/NVIDIA/retinanet-examples/blob/master/retinanet/backbones/fpn.py, such as ResNet18FPN,

and 3) setting string calibration_table = "";

Don't forget to re-run make after modifying export.cpp.
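Put together, a minimal sketch of what the edited block in export.cpp could look like after these three changes (the image paths and backbone name below are placeholders, not values from the repo):

    // export.cpp, around lines 58-60, after the Step 1 edits.
    // Placeholder image paths; list at least 2x your batch size in total.
    vector<string> calibration_files;
    calibration_files.push_back("/coco/images/val2017/your-image-1.jpg");
    calibration_files.push_back("/coco/images/val2017/your-image-2.jpg");
    string model_name = "ResNet50FPN";  // a backbone name from retinanet/backbones/fpn.py
    string calibration_table = "";      // empty, so a fresh calibration is run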

Step 2: run ./export your.onnx engine.plan calibration-table-name.
If we're running it for the first time ever, the calibration-table-name doesn't matter, because we will end up with a generated calibration table with a name similar to
Int8CalibrationTable_ResNet34FPN512x864_20, depending on your model.

Step 3: for your export commands beyond the first export.

From the second export onward, you can use this generated table; remember to pass the table name on the ./export command line. But before the second export, revert the changes to export.cpp made in Step 1 and re-make the executable. The export command from the second time onward would look like

`./export model.onnx engine.plan Int8CalibrationTable_ResNet34FPN512x864_20`

@azuryl
Author

azuryl commented Jan 28, 2020

@jin-nvidia
What do you mean by command #2? My retinanet export retinanet_rn50fpn.pth retinanet_rn50fpn_int8_engine.pth --int8 --calibration-images /coco/images/val2017/ was run on an x86 machine.

Since retinanet can't be installed on Xavier, can you tell me how to install the full retinanet-examples on Xavier?

@jin-nvidia
Collaborator

Hello, please see my edited suggestions. To clarify, first let's get the onnx file from your x86, and then follow the above steps on your Xavier. No need to install retinanet on your Xavier.

@azuryl
Author

azuryl commented Jan 28, 2020

> by 1) specifying the calibration_files vector to contain paths to each calibration image you want to use (the number of images should be at least twice your batch size):
>
>     vector<string> calibration_files;
>     calibration_files.push_back("path-to/your-image-1");
>     calibration_files.push_back("path-to/your-image-2");
>     ...
>     calibration_files.push_back("path-to/your-image-n");

Thank you jin-nvidia.

If I need to calibrate on /coco/images/val2017/, there are 5000 pictures. So do I need

calibration_files.push_back("path-to/your-image-1");
calibration_files.push_back("path-to/your-image-2");
...
calibration_files.push_back("path-to/your-image-5000"); ?

@jin-nvidia
Collaborator

We don't necessarily need that many calibration images; we can use 2n images, where n is your batch size. However, if you do want to use many images, you could search for and use a C++ function that lists all filenames within a directory.
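For example, a minimal sketch using std::filesystem (this assumes a C++17 toolchain; the helper name and the usage line are mine, not part of the repo):

    #include <filesystem>
    #include <string>
    #include <vector>

    // Collect the path of every regular file in a directory.
    std::vector<std::string> list_images(const std::string &dir) {
        std::vector<std::string> files;
        for (const auto &entry : std::filesystem::directory_iterator(dir))
            if (entry.is_regular_file())
                files.push_back(entry.path().string());
        return files;
    }

    // In export.cpp, in place of the push_back lines:
    // vector<string> calibration_files = list_images("/coco/images/val2017");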

@azuryl
Author

azuryl commented Jan 29, 2020

Thank you jin-nvidia.

This morning, on Xavier, I tried the Int8CalibrationTable_ResNet50FPN1280x1280_10 that was generated on x86:
./export /home/nvidia/project/retinanet_rn50fpn.onnx engine_int8.plan /home/nvidia/project/Int8CalibrationTable_ResNet50FPN1280x1280_10
Building engine...
Building INT8 core model...
Building accelerated plugins...
Applying optimizations and building TRT CUDA engine...
Writing to engine_int8.plan...
then
./infer engine_int8.plan /home/nvidia/project/val2017/000000579307.jpg
Loading engine...
Preparing data...
Running inference...
Took 0.144444 seconds per inference.
Found box {698.271, 415.605, 1022.74, 1279} with score 0.868652 and class 0
Found box {470.931, 335.796, 724.456, 745.605} with score 0.708496 and class 0
Found box {407.258, 578.137, 934.345, 955.117} with score 0.515137 and class 33
Found box {414.968, 470.679, 1018.39, 1250.81} with score 0.330811 and class 0

It seems to produce output. Compare with FP16:

./infer engine_fp16.plan /home/nvidia/project/val2017/000000579307.jpg
Loading engine...
Preparing data...
Running inference...
Took 0.240416 seconds per inference.
Found box {697.321, 413.042, 1023.74, 1279} with score 0.870117 and class 0
Found box {471.379, 335.807, 724.159, 735.953} with score 0.725586 and class 0
Found box {409.038, 580.158, 933.746, 954.379} with score 0.530762 and class 33
Found box {414.299, 473.61, 1017.87, 1251.6} with score 0.323486 and class 0
Found box {514.193, 394.421, 715.604, 640.219} with score 0.313232 and class 0
Found box {533.986, 420.738, 693.752, 510.981} with score 0.305176 and class 24

The results are similar.

@james-nvidia
Contributor

It looks like you have produced a model at FP16 and INT8.
Your inference speed still looks quite slow. What size backbone and input resolution are you using?

@azuryl
Author

azuryl commented Jan 29, 2020

I use retinanet_rn50fpn (https://github.com/NVIDIA/retinanet-examples/releases/download/19.04/retinanet_rn50fpn.zip)

retinanet export retinanet_rn50fpn.pth retinanet_rn50fpn.onnx

@james-nvidia
Contributor

As you apply the repo to your own use case, you may be able to use a smaller backbone (e.g. RN34) or a smaller image size. Both of these will increase your inference speed.

Also, batching your images together will help enormously.

@azuryl
Author

azuryl commented Jan 29, 2020

Thank you, James.
Is there any way to set the batch size in the Xavier example, as in the docker workflow? The example at https://github.com/NVIDIA/retinanet-examples/tree/master/extras/cppapi only infers one picture at a time.

@james-nvidia
Contributor

The C++ API is just a quick demo. If you have a folder of images then you might consider using the DeepStream SDK to infer them.
Otherwise you'll need to extend the C++ API to cover your own use case.
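For what it's worth, batched inference with the implicit-batch TensorRT C++ API of that era looks roughly like the sketch below. This is an illustration only, not code from the repo's C++ API; the function name, binding order, and buffer layout are assumptions:

    #include <NvInfer.h>
    #include <cuda_runtime.h>
    #include <vector>

    // Assumes buffers[0] is the input binding, allocated on the device for at
    // least n images of inputSize bytes each, and n <= the engine's maxBatchSize.
    void inferBatch(nvinfer1::IExecutionContext *context, void **buffers,
                    const std::vector<const void *> &hostImages, size_t inputSize) {
        int n = static_cast<int>(hostImages.size());
        // Copy the preprocessed images back-to-back into the input buffer.
        for (int i = 0; i < n; i++)
            cudaMemcpy(static_cast<char *>(buffers[0]) + i * inputSize,
                       hostImages[i], inputSize, cudaMemcpyHostToDevice);
        // One engine launch processes all n images.
        context->execute(n, buffers);
    }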

@azuryl
Author

azuryl commented Apr 23, 2020

@james-nvidia
If calibration produces the failures below, but the calibration file is still generated, can I use it to export an INT8 plan?
Tensor 634 is uniformly zero; network calibration failed.
Tensor 649 is uniformly zero; network calibration failed.
Tensor 663 is uniformly zero; network calibration failed.
Tensor 677 is uniformly zero; network calibration failed.
Tensor 690 is uniformly zero; network calibration failed.
Tensor 705 is uniformly zero; network calibration failed.
Tensor 719 is uniformly zero; network calibration failed.
Tensor 733 is uniformly zero; network calibration failed.
Tensor 748 is uniformly zero; network calibration failed.
Tensor 762 is uniformly zero; network calibration failed.
Tensor 776 is uniformly zero; network calibration failed.
Tensor 791 is uniformly zero; network calibration failed.
Tensor 805 is uniformly zero; network calibration failed.
Tensor 819 is uniformly zero; network calibration failed.
Tensor 834 is uniformly zero; network calibration failed.
Tensor 848 is uniformly zero; network calibration failed.
Tensor 862 is uniformly zero; network calibration failed.
Tensor 877 is uniformly zero; network calibration failed.
Tensor 891 is uniformly zero; network calibration failed.
Tensor 905 is uniformly zero; network calibration failed.
Tensor 920 is uniformly zero; network calibration failed.
Tensor 934 is uniformly zero; network calibration failed.
Tensor 948 is uniformly zero; network calibration failed.
Tensor 961 is uniformly zero; network calibration failed.
Tensor 976 is uniformly zero; network calibration failed.
Tensor 990 is uniformly zero; network calibration failed.
Tensor scores is uniformly zero; network calibration failed.

Writing to MAL_r50fpn_int8.plan..

@jin-nvidia
Collaborator

Which branch are you using and in which container?

@azuryl
Author

azuryl commented Apr 29, 2020

> It looks like you have produced a model at FP16 and INT8.
> Your inference speed still looks quite slow. What size backbone and input resolution are you using?

@james-nvidia In fact I didn't use your method. I produced the Int8CalibrationTable on x86 by trying several times; it failed a few times and succeeded once.

@azuryl
Author

azuryl commented Apr 29, 2020

> Which branch are you using and in which container?

@james-nvidia
nvcr.io/nvidia/pytorch:19.10-py3

@azuryl
Author

azuryl commented Apr 29, 2020

Why, on a GeForce GTX 1080 Ti, do I get
Building INT8 core model...
Building accelerated plugins...
Applying optimizations and building TRT CUDA engine...
Half2 support requested on hardware without native FP16 support, performance will be negatively affected.
Calibration failure occurred with no scaling factors detected. This could be due to no int8 calibrator or insufficient custom scales for network layers. Please see int8 sample to setup calibration correctly.
Builder failed while configuring INT8 mode.
Segmentation fault (core dumped)

but on a 2080 Ti there is no such issue?

@XinyingZheng

> Hello.
>
> Step 1. First we need to modify lines 58-60 in the https://github.com/NVIDIA/retinanet-examples/blob/master/extras/cppapi/export.cpp file ...
>
> `./export model.onnx engine.plan Int8CalibrationTable_ResNet34FPN512x864_20`

I had the same problem. Details as follows:
Internal Error (Calibration failure occurred with no scaling factors detected. This could be due to no int8 calibrator or insufficient custom scales for network layers. Please see int8 sample to setup calibration correctly.)
How do I solve this problem? I provided the calibration file paths in a txt file and fetch the input data in the get_batch function.
