About Determinism of TensorRT #44

@hibestil

Description

Hi,
I'm using the TensorRT FP16 precision mode to optimize my deep learning model, which takes an image input and outputs a steering angle. While testing the model, I have observed that the FPS (frames per second) of TensorRT inference differs for the same input. For example, when I run the model at time t on input A, the inference FPS is X; at time t+1, the FPS for input A is Y.

Based on these results, my conclusion is that the TensorRT inference engine is non-deterministic when FP16 precision mode is used.

I started to suspect that the source of the non-determinism is floating-point operations when I saw this comment about CUDA:

"If your code uses floating-point atomics, results may differ from run to run because floating-point operations are generally not associative, and the order in which data enters a computation (e.g. a sum) is non-deterministic when atomics are used."

Does the precision mode (FP16, FP32, or INT8) affect the determinism of TensorRT?
Why do the FPS values vary even though the calculations are the same?

Do you have any thoughts?

Best regards.

NOTES:

  • Hardware: Jetson TX2
  • I expect the same FPS value on every execution.
  • I measure FPS like this:
clock_t beginExecuteEngine = clock();
context->execute(kBatchSize, bindings);
// clock() reports process CPU time, not elapsed wall-clock time
double deltaTimeExecuteEngine = double(clock() - beginExecuteEngine) / double(CLOCKS_PER_SEC);
