Investigate TF-probability performance issue #1075

jerryyin · 2020-08-07T20:25:46Z

This is related to TF-issue#954. There are two tasks here:

1. Understand why ROCm XLA produces the warning "Unexpectedly high number of iterations...". Under the hood, this seems to indicate that a pass changes the HLO on every run and never converges. I have recorded HloPassFix running on the following passes: simplification, algsimp, fusion. An improper implementation somewhere has presumably affected the ROCm platform. However, this should only affect compilation behavior, not runtime behavior (or possibly functional behavior, if the optimization is done incorrectly).
2. Investigate the performance issue, which the user narrowed down to NoUTurnSampler. Running test_nuts.py, I can confirm that the GPU (non-XLA) also runs slower than the CPU. That gives a baseline script to test with.
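
To make the first task more concrete, here is a minimal Python sketch of the run-to-fixed-point idea that HloPassFix appears to implement: a pass is re-applied until it reports no change, and a warning fires if it keeps reporting changes past some iteration limit. This is not XLA code. The names `run_to_fixed_point`, `drop_one_nop`, and `flip_operands`, the tuple-of-strings "module", and the default limit are all assumptions made for illustration.

```python
import warnings


def run_to_fixed_point(pass_fn, module, limit=25):
    """Re-apply pass_fn until it reports no change or the limit is reached.

    pass_fn takes a module and returns (new_module, changed).
    The limit value is an illustrative assumption, not taken from XLA.
    Returns the final module and the number of times the pass ran.
    """
    iterations = 0
    changed = True
    while changed:
        module, changed = pass_fn(module)
        iterations += 1
        if changed and iterations >= limit:
            # The pass is still rewriting the module after `limit` runs.
            warnings.warn("Unexpectedly high number of iterations...")
            break
    return module, iterations


def drop_one_nop(ops):
    """Well-behaved toy pass: removes one 'nop' per run, then stops changing."""
    if "nop" in ops:
        i = ops.index("nop")
        return ops[:i] + ops[i + 1:], True
    return ops, False


def flip_operands(ops):
    """Misbehaving toy pass: always rewrites and always reports a change."""
    return tuple(reversed(ops)), True
```

With `drop_one_nop`, the loop ends as soon as a run reports no change. With `flip_operands`, it only ends at the limit and emits the warning, which is the symptom described above: the pass never converges, but the cost is paid at compile time rather than at runtime.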


jerryyin closed this as completed Aug 7, 2020
