How to speed up text generation in TensorFlow reference example notebook? #39654
Comments
@zredlined: @Saduf2019 It is not an error; the code just does not efficiently leverage the GPU by default, and I'm hoping to find advice on speeding it up. You can run the colab here: The line I'm hoping to speed up is in the generate_text() function above.
@jvishnuvardhan It seems to me the challenge is getting parallelization on the GPU while maintaining the statefulness of the LSTM that predicts the next character in the sentence. Perhaps I can batch several lines into model.predict() and generate them at once, while maintaining an individual LSTM state per line in the batch? Or load multiple models as workers? Any suggestions or pseudocode would be much appreciated!
Any suggestions here? It would be acceptable to generate multiple texts simultaneously to use the GPU more effectively. Any insights would be appreciated.
@zredlined Try batching several lines into model.predict() to generate them at once, while maintaining an individual LSTM state per line in the batch, and let us know whether it speeds things up.
@gowthamkpr Thanks! I can't figure out how to maintain LSTM state per line in the batch. model.predict() appears to just update a single LSTM state after processing each line in the batch. Any suggestions on how to do this?
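One way to get per-line state, sketched below as a hedged illustration rather than the tutorial's actual code: a stateful Keras RNN already keeps a separate state vector per batch row (state shape is `[batch_size, rnn_units]`), so rebuilding the tutorial's generation model with `batch_size=N` instead of 1 generates N independent sequences in parallel, each row tracking its own state. The layer sizes below are placeholders, not the tutorial's trained values, and the model here is untrained.

```python
import tensorflow as tf

# Placeholder sizes; the tutorial's trained model would supply real values.
vocab_size, embedding_dim, rnn_units, batch_size = 65, 64, 128, 8

model = tf.keras.Sequential([
    # Fixed batch size is required for a stateful RNN.
    tf.keras.Input(batch_shape=(batch_size, None), dtype="int32"),
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    # stateful=True: each of the 8 batch rows keeps its own LSTM state
    # across calls, so 8 sequences advance independently in one pass.
    tf.keras.layers.LSTM(rnn_units, return_sequences=True, stateful=True),
    tf.keras.layers.Dense(vocab_size),
])

# One generation step for the whole batch: feed the last character of
# each of the 8 sequences and sample a next character for each row.
input_ids = tf.zeros([batch_size, 1], dtype=tf.int32)
logits = model(input_ids)                     # [batch_size, 1, vocab_size]
next_ids = tf.random.categorical(logits[:, -1, :], num_samples=1)
```

Each call to the model advances all 8 rows by one character while the stateful LSTM carries each row's state forward, so the per-character GPU work is batched instead of serialized.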
I'm working on this, for other reasons, but I'll try to fix this at the same time. |
@zredlined In this commit I fixed NMT-with-attention to tf.function the translate loop. That commit got rolled back because of 2.4/2.5 incompatibilities, but I'm planning to resubmit it as soon as TF 2.5 is released.
For TF2.5:
- Use the TextVectorization layer.
- Use the AdditiveAttention layer.
- tf.function the translate loop for text->text export.
- Add more inline explanations, and sanity checks.
- Add shape assertions throughout the code to make it easier to follow.

Fixes: tensorflow/tensorflow#38248
Fixes: tensorflow/tensorflow#39654
See also: tensorflow/tensorflow#49237
PiperOrigin-RevId: 370250185
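The "tf.function the translate loop" item above can be sketched as follows. This is a hedged illustration with a toy stand-in model, not the actual commit code: compiling the per-step prediction into a graph with tf.function avoids re-running Python op dispatch for every generated character, which is usually the dominant cost of eager character-by-character loops.

```python
import tensorflow as tf

vocab_size = 65  # toy stand-in model, untrained
model = tf.keras.Sequential([
    tf.keras.Input(batch_shape=(1, None), dtype="int32"),
    tf.keras.layers.Embedding(vocab_size, 16),
    tf.keras.layers.GRU(32, return_sequences=True, stateful=True),
    tf.keras.layers.Dense(vocab_size),
])

@tf.function
def generate_step(input_ids):
    # input_ids: [1, 1]; returns the sampled next id, shape [1, 1].
    # The whole forward pass + sampling runs as one compiled graph call.
    logits = model(input_ids)[:, -1, :]
    return tf.random.categorical(logits, num_samples=1)

ids = tf.zeros([1, 1], dtype=tf.int32)
for _ in range(5):
    # The Python loop remains, but each step is a single graph call.
    ids = tf.cast(generate_step(ids), tf.int32)
```

Because the input shape is fixed at [1, 1], the function traces once and every subsequent character reuses the compiled graph.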
The official TensorFlow example for text generation (https://github.com/tensorflow/docs/blob/master/site/en/tutorials/text/text_generation.ipynb) runs in a loop as defined below. Text generation feels slow and, according to NVTOP, uses only a fraction of the available GPU resources (15-20%).
Do you have any suggestions on how I can speed this up, or parallelize it by generating multiple examples at the same time? A quick look with cProfile shows that 90% of the time is spent on the single line predictions = model(input_eval), so this is where we'd most likely find a speedup. I'd appreciate any advice, and I'm happy to submit a PR if I'm able to speed it up!
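For reference, the loop in question looks roughly like the sketch below, paraphrased from the tutorial. The toy vocabulary and untrained stand-in model here replace the tutorial's char2idx/idx2char lookup tables and trained model; everything else follows the tutorial's structure.

```python
import tensorflow as tf

vocab = sorted(set("hello world"))            # toy stand-in vocabulary
char2idx = {c: i for i, c in enumerate(vocab)}
idx2char = list(vocab)

model = tf.keras.Sequential([                 # untrained stand-in model
    tf.keras.Input(batch_shape=(1, None), dtype="int32"),
    tf.keras.layers.Embedding(len(vocab), 8),
    tf.keras.layers.GRU(16, return_sequences=True, stateful=True),
    tf.keras.layers.Dense(len(vocab)),
])

def generate_text(model, start_string, num_generate=20, temperature=1.0):
    input_eval = tf.expand_dims([char2idx[s] for s in start_string], 0)
    text_generated = []
    if hasattr(model, "reset_states"):
        model.reset_states()                  # clear the stateful RNN state
    for _ in range(num_generate):
        # ~90% of runtime is this one single-character forward pass.
        predictions = tf.squeeze(model(input_eval), 0) / temperature
        predicted_id = int(
            tf.random.categorical(predictions, num_samples=1)[-1, 0])
        # Feed the sampled character back in as the next input.
        input_eval = tf.expand_dims([predicted_id], 0)
        text_generated.append(idx2char[predicted_id])
    return start_string + ''.join(text_generated)

out = generate_text(model, "he")
```

The eager Python loop issues one tiny (batch-of-1, length-1) model call per generated character, which is why GPU utilization stays low.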
System information
- TensorFlow version: v2.1.0-rc2-17
Describe the current behavior
Text generation works fine, but feels slow. NVTOP shows only 15% GPU utilization on average.
Describe the expected behavior
Hoping to speed up text generation by better leveraging the GPU.
Standalone code to reproduce the issue
This issue can be replicated by running the standard TensorFlow text generation tutorial on Google Colaboratory with a GPU runtime.
Other info / logs