This is related to TF-issue#954. There are two tasks related to the issue:
1. Understand why ROCm XLA produces the warning "Unexpectedly high number of iterations...". Under the hood, this seems to indicate that a pass keeps changing the HLO code and never converges. I have recorded that `HloPassFix` has been running on the following passes: `simplification`, `algsimp`, `fusion`. Some improper implementation must have affected the ROCm platform. However, this should only affect compilation behavior, not runtime behavior (though it may affect functional behavior if the optimization is done wrong).
2. The performance issue, which the user narrowed down to `NoUTurnSampler`. By running `test_nuts.py` I can confirm that the GPU (non-XLA) also runs slower than the CPU. That gives a baseline script to test with.
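
For the first task, the non-convergence described above can be pictured with a small sketch. This is not XLA code: `run_to_fixed_point`, `fold_double_negation`, `swap_operands`, and the `ITERATION_LIMIT` value are all hypothetical names and numbers chosen for illustration, and the "module" is just a list of strings. It only shows the general idea of rerunning passes until none of them reports a change, and of giving up after a cap when a pass keeps rewriting its own output.

```python
from typing import Callable, List, Tuple

# Hypothetical cap for illustration only; not taken from the XLA source.
ITERATION_LIMIT = 25

# A "pass" takes a module and returns (new_module, changed).
Pass = Callable[[List[str]], Tuple[List[str], bool]]


def run_to_fixed_point(
    module: List[str], passes: List[Pass], limit: int = ITERATION_LIMIT
) -> Tuple[List[str], int, bool]:
    """Rerun all passes until none reports a change, or until `limit` rounds.

    Returns (module, rounds_run, converged).
    """
    for iteration in range(1, limit + 1):
        changed = False
        for p in passes:
            module, pass_changed = p(module)
            changed = changed or pass_changed
        if not changed:
            return module, iteration, True
    # The cap was hit: this is the situation a "too many iterations"
    # warning would be reporting.
    return module, limit, False


def fold_double_negation(module: List[str]) -> Tuple[List[str], bool]:
    """Well-behaved pass: cancels adjacent 'neg', 'neg' pairs."""
    out: List[str] = []
    changed = False
    for op in module:
        if out and op == "neg" and out[-1] == "neg":
            out.pop()
            changed = True
        else:
            out.append(op)
    return out, changed


def swap_operands(module: List[str]) -> Tuple[List[str], bool]:
    """Misbehaving pass: always swaps the first two entries, so each run
    undoes the previous one and 'changed' is reported forever."""
    if len(module) < 2:
        return module, False
    return [module[1], module[0]] + module[2:], True
```

With the well-behaved pass the loop stops as soon as a round makes no change; with `swap_operands` it only stops because of the cap, which would be consistent with the reading above that the warning points at a pass that never settles rather than at a runtime problem.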