About Determinism of TensorRT #44

@hibestil

Description

Hi,
I'm using the TensorRT FP16 precision mode to optimize my deep learning model, which takes an image input and outputs a steering angle. While testing the model, I have observed that the FPS (frames per second) of TensorRT inference differs for the same input. For example, when I run the model at time t on input A, the inference FPS is X; at time t+1, the FPS for input A is Y.

Based on these results, my conclusion is that the TensorRT inference engine is non-deterministic when FP16 precision mode is used.

I started to suspect that the source of the non-determinism is floating-point operations when I saw this comment about CUDA:

"If your code uses floating-point atomics, results may differ from run to run because floating-point operations are generally not associative, and the order in which data enters a computation (e.g. a sum) is non-deterministic when atomics are used."

Does the precision mode (FP16, FP32, or INT8) affect the determinism of TensorRT?
Why do the FPS values vary even though the calculations are the same?

Do you have any thoughts?

Best regards.

NOTES:

  • Hardware: Jetson TX2
  • I expect the same FPS value on every execution.
  • I measure FPS like this:
clock_t beginExecuteEngine = clock();
context->execute(kBatchSize, bindings);
// clock() reports process CPU time, not elapsed wall-clock time
double deltaTimeExecuteEngine = double(clock() - beginExecuteEngine) / double(CLOCKS_PER_SEC);
