STS models running slower on GPUs #1395
Thanks for filing this! My impression is that STS models on GPU have essentially always been slow; I don't think it's anything recent. We haven't pushed hard on this since it hasn't been a blocker for our internal use cases, but I think there's almost certainly room for improvement.

Some general context: STS models tend to involve lots of little ops, which is a different performance profile from your average NN model, where the computation is dominated by a few big matmuls. Since there's some overhead every time an op is interpreted, using XLA compilation to fuse those little ops may help.

Without having done any profiling, one initial hypothesis might be that there's some op that (for some reason) is always executed on the CPU, so using the GPU for everything else just adds a data-transfer bottleneck at every step, negating any improvement from the GPU acceleration. The tools at https://www.tensorflow.org/guide/gpu_performance_analysis would probably be useful for investigating this.

I do think this is an important issue that we'll want to understand better, though I don't personally expect to be able to put a lot of time into it in the next few weeks.
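The "lots of little ops" point above can be illustrated without TensorFlow at all. The sketch below (pure Python/NumPy; the array size and timings are arbitrary, only the relative gap matters) compares performing one tiny operation at a time, where per-call dispatch overhead dominates, against a single fused/vectorized operation over the same data. This is the same shape of problem an op-by-op interpreter has with STS models, and the gap is what op fusion (e.g. via XLA) is meant to close.

```python
import time
import numpy as np

x = np.random.randn(10_000)

# "Many little ops": one dispatch per element, so per-op overhead
# (function-call and bookkeeping cost) dominates the arithmetic.
start = time.perf_counter()
acc = 0.0
for v in x:
    acc = acc + float(v)
t_many_little_ops = time.perf_counter() - start

# "One big op": a single vectorized reduction over the whole array,
# analogous to a fused kernel doing all the work in one dispatch.
start = time.perf_counter()
total = float(x.sum())
t_one_big_op = time.perf_counter() - start

# Same mathematical result, very different cost profile.
assert abs(acc - total) < 1e-6
print(f"little ops: {t_many_little_ops:.4f}s, big op: {t_one_big_op:.6f}s")
```

On a typical machine the element-by-element loop is orders of magnitude slower despite doing identical arithmetic, which is why a model made of thousands of small ops can run poorly even on fast hardware, and why an always-on-CPU op forcing a device transfer at every step would hurt even more.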
Thanks @davmre for the reply! I did have the impression it was running faster before, but maybe I made a mistake between setting up variational inference and HMC; I'm not sure now. Hoping full GPU support will eventually happen :) Best, Will
This simple example running on Colab shows this issue:
Running on CPU takes a few seconds, whereas running on GPU takes minutes. I also tested the same procedure using TFP's example notebook on Colab, and running on GPU again took longer (about 2-3x).
I tried testing with bigger datasets to see whether data volume was the issue, but when I increased it tenfold, the GPU run could no longer finish on Colab.
I also tested previous versions of TensorFlow and TensorFlow Probability, but the issue remained. Did something change that made GPUs slower?
Thanks in advance!