
How to speed up text generation in TensorFlow reference example notebook? #39654

Closed
zredlined opened this issue May 18, 2020 · 9 comments · Fixed by tensorflow/text#626
Labels: comp:gpu (GPU related issues), TF 2.1 (for tracking issues in 2.1 release), type:performance (Performance Issue)

Comments

@zredlined

The tensorflow official example for text generation (https://github.com/tensorflow/docs/blob/master/site/en/tutorials/text/text_generation.ipynb) runs in a loop as defined below. The text generation feels slow, and according to NVTOP only uses a fraction of the available GPU resources (15-20%).

def generate_text(model, start_string):
  # Evaluation step (generating text using the learned model)

  # Number of characters to generate
  num_generate = 1000

  # Converting our start string to numbers (vectorizing)
  input_eval = [char2idx[s] for s in start_string]
  input_eval = tf.expand_dims(input_eval, 0)

  # Empty string to store our results
  text_generated = []

  # Low temperatures result in more predictable text.
  # Higher temperatures result in more surprising text.
  # Experiment to find the best setting.
  temperature = 1.0

  # Here batch size == 1
  model.reset_states()
  for i in range(num_generate):
      predictions = model(input_eval)
      # remove the batch dimension
      predictions = tf.squeeze(predictions, 0)

      # using a categorical distribution to predict the character returned by the model
      predictions = predictions / temperature
      predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()

      # We pass the predicted character as the next input to the model
      # along with the previous hidden state
      input_eval = tf.expand_dims([predicted_id], 0)

      text_generated.append(idx2char[predicted_id])

  return (start_string + ''.join(text_generated))

Do you have any suggestions on how I can speed this up? Or parallelize it by generating multiple examples at the same time? A quick look with cProfile shows that 90% of the time is spent on the single line predictions = model(input_eval), so this is where we'd most likely find a speedup. Would appreciate any advice, and happy to submit a PR if I'm able to speed it up!
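For reference, the per-step sampling that the tutorial's loop performs (temperature-scale the logits, then draw one index from the resulting categorical distribution, as tf.random.categorical does) can be sketched in plain Python. This is only an illustration of the math, not TensorFlow code; softmax and sample_next_id are hypothetical helper names:

```python
import math
import random

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_id(logits, temperature=1.0, rng=random.random):
    # Temperature-scale the logits, then draw one index from the
    # resulting categorical distribution. Lower temperature sharpens
    # the distribution toward the argmax; higher flattens it.
    probs = softmax([x / temperature for x in logits])
    r = rng()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1
```

Note that this per-character Python loop is inherently sequential, which is one reason the GPU sits mostly idle: each step launches a tiny amount of work.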

System information

Describe the current behavior
Text generation works fine, but feels slow. Using NVTOP it shows only 15% GPU utilization on average.

Describe the expected behavior
Hoping to speed up text generation by better leveraging the GPU

Standalone code to reproduce the issue
This issue can be replicated by running the standard TensorFlow text generation tutorial on Google Colaboratory with GPU

Other info / logs

[Screenshot: Screen Shot 2020-05-18 at 10:20:17 AM]

@zredlined zredlined added the type:performance Performance Issue label May 18, 2020
@Saduf2019 Saduf2019 added the TF 2.1 for tracking issues in 2.1 release label May 19, 2020
@Saduf2019
Contributor

@zredlined
Could you please share simple standalone code to replicate the issue, or if possible a Colab gist, so we can analyse the error?

@Saduf2019 Saduf2019 added the stat:awaiting response Status - Awaiting response from author label May 19, 2020
@zredlined
Author

zredlined commented May 19, 2020

@Saduf2019 It is not an error; the code just does not efficiently leverage the GPU by default, and I'm hoping to find some advice on speeding it up. You can run the Colab here:

https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/text/text_generation.ipynb

The line I'm hoping to speed up is in the generate_text() function above:
predictions = model(input_eval)

@Saduf2019 Saduf2019 added comp:gpu GPU related issues and removed stat:awaiting response Status - Awaiting response from author labels May 20, 2020
@zredlined
Author

@jvishnuvardhan It seems to me the challenge is getting parallelization for the GPU, while maintaining statefulness of the LSTM to predict the next character in the sentence.

Perhaps I can batch several lines to generate at once into the model.predict() while maintaining individual LSTM state per line in the batch? Or load multiple models as workers? Any suggestions or pseudocode would be much appreciated!

@zredlined
Author

Any suggestions here? It would be acceptable to generate multiple texts simultaneously to use the GPU more effectively. Any insights would be appreciated.

@gowthamkpr

@zredlined Try batching several lines to generate at once in model.predict() while maintaining individual LSTM state per line in the batch, and let us know whether it speeds things up.
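One way to see why per-line state can survive batching: in a recurrent layer the state tensor has one row per batch element, and each row is updated only from that row's own input. A toy pure-Python sketch (rnn_step and batched_step are hypothetical stand-ins, not Keras API):

```python
def rnn_step(state_row, token_id):
    # Hypothetical recurrent cell standing in for one LSTM step:
    # the new state depends only on this row's old state and input.
    return (state_row * 31 + token_id + 1) % 97

def batched_step(states, token_ids):
    # One batched forward pass. Row i of the state list is updated
    # from sequence i's input only, so every line in the batch keeps
    # its own independent recurrent state, just as a [batch, units]
    # state tensor does.
    return [rnn_step(s, t) for s, t in zip(states, token_ids)]
```

Under this model, running N sequences in one batch gives the same per-sequence states as running them one at a time, which is the property the batched-generation idea relies on.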

@gowthamkpr gowthamkpr added the stat:awaiting response Status - Awaiting response from author label May 31, 2020
@zredlined
Author

@gowthamkpr thanks! I can't figure out how to maintain LSTM state per line in the batch. The model.predict() appears to just update a single LSTM state after processing each line in the batch. Any suggestions on how to do this?

@tensorflowbutler tensorflowbutler removed the stat:awaiting response Status - Awaiting response from author label Jun 3, 2020
@gowthamkpr gowthamkpr assigned sanjoy and unassigned gowthamkpr Jun 8, 2020
@gowthamkpr gowthamkpr added the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Jun 8, 2020
@MarkDaoust MarkDaoust self-assigned this Jun 24, 2020
@MarkDaoust
Member

MarkDaoust commented Jun 24, 2020

I'm working on this, for other reasons, but I'll try to fix this at the same time.
It may take a little while to land, but wrapping that in a tf.function and batching the inputs should give a good speedup.

@tensorflowbutler tensorflowbutler removed the stat:awaiting tensorflower Status - Awaiting response from tensorflower label Jun 26, 2020
@Saduf2019
Contributor

@zredlined
Could you please check on tf 2.4.1 and let us know if you still face this issue.

@Saduf2019 Saduf2019 added the stat:awaiting response Status - Awaiting response from author label Apr 29, 2021
@MarkDaoust
Member

I got the tf.function implementation working in that tutorial.

The tf.function only runs one step at a time, so it's still not ideal.

In this commit I fixed the NMT-with-attention tutorial to tf.function-compile the whole loop, with batched inputs. That should be even faster.

tensorflow/docs@9e18593

That commit got rolled-back because of 2.4/2.5 incompatibilities, but I'm planning to resubmit it as soon as tf 2.5 is released.
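Compiling the whole loop with batched inputs amounts to something like the following. This is a toy pure-Python sketch of the control flow only, not the tutorial's TF implementation; generate_batch, step_fn, and toy_step are hypothetical names:

```python
def generate_batch(step_fn, start_ids, num_steps):
    # Generate num_steps tokens for every sequence in the batch inside
    # one loop. step_fn is a hypothetical stand-in for one model
    # forward pass: (state_row, token_id) -> (new_state_row, logits).
    states = [None] * len(start_ids)
    current = list(start_ids)
    generated = [[] for _ in start_ids]
    for _ in range(num_steps):
        for i, token in enumerate(current):
            states[i], logits = step_fn(states[i], token)
            # Greedy argmax for brevity; the tutorial samples from
            # tf.random.categorical instead.
            current[i] = max(range(len(logits)), key=logits.__getitem__)
            generated[i].append(current[i])
    return generated
```

In real TF code, the inner per-sequence loop disappears: one batched model call updates every row of the state tensor at once, and wrapping the outer loop in tf.function keeps the whole thing on the GPU instead of bouncing back to Python each step.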

@Saduf2019 Saduf2019 removed the stat:awaiting response Status - Awaiting response from author label Apr 30, 2021
tf-text-github-robot pushed a commit to tensorflow/text that referenced this issue May 25, 2021
For TF2.5

- Use the TextVectorization layer.
- Use the AdditiveAttention layer.
- tf.function the translate loop for text->text export.
- Add more inline explanations, and sanity checks.
- Add shape assertions throughout the code to make it easier to follow.

Fixes: tensorflow/tensorflow#38248
Fixes: tensorflow/tensorflow#39654
See also: tensorflow/tensorflow#49237
PiperOrigin-RevId: 370250185
7 participants