TPU utilization should be close to 100%. I think your dashboard is showing something else.
My guess is that it shows the percentage of time the TPUs spend on FLOP-heavy operations, such as matrix multiplications. The rest goes to various data reshapes, weight synchronization, and so on. IIUC, it is hard to do substantially better than what we have now.
Training details are in #2
I think the TPU utilization is a bit lower than expected:
Is this expected?
I understand that other factors, such as network access, might contribute to this, but I wanted to check.