
Text Generation with Transformer Decoder example #914

Merged

Conversation

jessechancy (Contributor)

This is the Colab I wrote earlier to test out the Transformer decoder. I've formatted it to be one of the examples on the keras.io website.

@@ -0,0 +1,229 @@
"""
Title: Text Generation with Keras NLP TransformerDecoder
Member:

Let's add gpt in the title somewhere, it will be popular :)

Contributor Author:

added gpt to title

start_tokens.append(sample_token)
num_tokens_generated += 1
txt = self.tokenizer.detokenize(start_tokens)
print(f"generated text: \n{txt}\n")
Member:

maybe let's show greedy, top-k and random all together each epoch?

Member:

+1

Contributor Author:

Added the utility functions in the inference section. However, I changed it from printing every epoch, since it would get pretty messy with too many prints and a long callback class. Instead, I gave a short callback wrapper example at the end with the top-k utility.
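For illustration, a minimal sketch of the kind of callback wrapper described above, assuming the `keras_nlp.utils.top_k_search` utility from the keras_nlp release of that time; the names `token_logits_fn`, `prompt_tokens`, `NUM_TOKENS_TO_GENERATE`, and `tokenizer` stand for objects defined elsewhere in the example and are not taken from the diff shown here:

```python
import keras_nlp
from tensorflow import keras


class TopKTextGenerator(keras.callbacks.Callback):
    """Print text sampled with top-k search at the end of each epoch."""

    def __init__(self, k):
        self.k = k

    def on_epoch_end(self, epoch, logs=None):
        output_tokens = keras_nlp.utils.top_k_search(
            token_logits_fn,  # wrapper returning next-token logits
            prompt_tokens,  # unpadded prompt, e.g. a single [BOS] token
            max_length=NUM_TOKENS_TO_GENERATE,
            k=self.k,
            from_logits=True,
        )
        txt = tokenizer.detokenize(output_tokens)
        print(f"Top-K search generated text: \n{txt}\n")


# Usage sketch: model.fit(ds, epochs=EPOCHS, callbacks=[TopKTextGenerator(k=10)])
```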

Member @fchollet left a comment:

Thanks for the PR!

Description: Implementation of a small GPT-like model using the TransformerDecoder class.
"""
"""
# Download Library
Member:

Note that section titles should use ##

Contributor Author:

edited to ##

Date created: 2022/06/13
Last modified: 2022/06/13
Description: Implementation of a small GPT-like model using the TransformerDecoder class.
"""
Member:

Add an Introduction section explaining what the example is about, what dataset you will use, etc.

Contributor Author:

added introduction section, with a high level description of the components in the notebook

import tensorflow as tf
from tensorflow import keras
import numpy as np
from keras_nlp.layers.transformer_decoder import TransformerDecoder
Member:

Import keras_nlp only

Contributor Author:

fixed

model = keras.Model(inputs=inputs, outputs=outputs)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(
"adam", loss=loss_fn,
Member:

Add metrics and use a keyword argument for the optimizer

Contributor Author:

fixed
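For context, the requested change would look roughly like the sketch below. The `keras_nlp.metrics.Perplexity` metric is my assumption for "add metrics"; `model` and `loss_fn` come from the diff context above:

```python
import keras_nlp
from tensorflow import keras

loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
# Perplexity is a natural metric for language modeling; ignore padding (id 0).
perplexity = keras_nlp.metrics.Perplexity(from_logits=True, mask_token_id=0)

model.compile(
    optimizer="adam",  # keyword argument rather than positional
    loss=loss_fn,
    metrics=[perplexity],
)
```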

start_tokens.append(sample_token)
num_tokens_generated += 1
txt = self.tokenizer.detokenize(start_tokens)
print(f"generated text: \n{txt}\n")
Member:

+1


model = create_model()

model.fit(
Member:

Add a section for evaluation / inference and a conclusion section summarizing what was learned from the example.

Contributor Author:

added inference section and concluding paragraph

"""

# Download vocabulary data.
vocab_file = keras.utils.get_file(
Member:

should we use the word piece vocab learner utility here? that could also simplify the code explanation above.

Contributor Author:

edited
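A minimal sketch of the WordPiece vocabulary learner being suggested, assuming the `keras_nlp.tokenizers.compute_word_piece_vocabulary` utility; `raw_train_ds` (a `tf.data.Dataset` of raw text lines) and `VOCAB_SIZE` are illustrative names, not necessarily what the PR ended up using:

```python
import keras_nlp

VOCAB_SIZE = 5000  # illustrative value

# Learn a WordPiece vocabulary from the training text instead of downloading
# a precomputed vocabulary file.
vocab = keras_nlp.tokenizers.compute_word_piece_vocabulary(
    raw_train_ds,
    vocabulary_size=VOCAB_SIZE,
    lowercase=True,
    reserved_tokens=["[PAD]", "[UNK]", "[BOS]"],
)

tokenizer = keras_nlp.tokenizers.WordPieceTokenizer(
    vocabulary=vocab,
    lowercase=True,
)
```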

Member @mattdangerw left a comment:

This is really great! Left a few comments



def create_model():
inputs = keras.layers.Input(shape=(SEQ_LEN,), dtype=tf.int32)
Member:

given that this is only called once, why not just move this out of the function?

Contributor Author:

edited

x = embedding_layer(inputs)
# Transformer decoders.
for _ in range(NUM_LAYERS):
transformer_block = keras_nlp.layers.TransformerDecoder(
Member:

decoder_layer maybe to agree with embedding_layer naming

Contributor Author:

edited
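Putting the naming comments together, the decoder stack might read roughly as follows. This is a sketch only; `VOCAB_SIZE`, `EMBED_DIM`, `NUM_HEADS`, and `FEED_FORWARD_DIM` are assumed hyperparameter names:

```python
import keras_nlp
import tensorflow as tf
from tensorflow import keras

inputs = keras.layers.Input(shape=(SEQ_LEN,), dtype=tf.int32)

# Token and position embeddings, with masking of padded positions.
embedding_layer = keras_nlp.layers.TokenAndPositionEmbedding(
    vocabulary_size=VOCAB_SIZE,
    sequence_length=SEQ_LEN,
    embedding_dim=EMBED_DIM,
    mask_zero=True,
)
x = embedding_layer(inputs)

# Stack of decoder-only Transformer blocks.
for _ in range(NUM_LAYERS):
    decoder_layer = keras_nlp.layers.TransformerDecoder(
        num_heads=NUM_HEADS,
        intermediate_dim=FEED_FORWARD_DIM,
    )
    x = decoder_layer(x)  # A single argument skips cross-attention.

outputs = keras.layers.Dense(VOCAB_SIZE)(x)
model = keras.Model(inputs=inputs, outputs=outputs)
```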

model = create_model()

"""
Let's take a look at our model summary here! We can see that a large majority of the
Member:

remove the exclamation point here, so the one in the next sentence hits better.

Contributor Author:

edited

## Training

Now that we have our model, let's train it. We use a subset of the training data to save
on training time. It would also be beneficial to use a GPU to speed up the training
Member:

I'm not sure the GPU part bears mentioning here. Maybe just say at the top of the colab that, if you are running in a colab, you should enable the GPU runtime for faster training performance.

Contributor Author:

edited

# Training
LEARNING_RATE = 5e-4
EPOCHS = 12
NUM_TRAINING_BATCHES = 1000
Member:

why do we need this hyperparameter, can we just train over the full dataset?

Contributor Author:

removed hyperparameter


With our trained model, we can test it out to gauge it's performance. Since
this is a dataset of mostly fictional books, there is bound to be a hero, so let's use
"The hero" as our starting string! We run it through the tokenizer to get the input for
Member:

remove exclamation point

Contributor Author:

edited


def preprocess(inputs):
outputs = tokenizer(inputs)
features = outputs[:, :-1]
Member:

should we add bos tokens here? then you could sample without seed text, and I think we would be a little closer to actual gpt

Contributor Author:

edited
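A rough sketch of what adding a `[BOS]` token in preprocessing could look like, using `keras_nlp.layers.StartEndPacker` (this anticipates the packer discussion just below; the variable names are assumptions):

```python
import keras_nlp

# Prepend a [BOS] token and pad/truncate each sequence to SEQ_LEN.
start_packer = keras_nlp.layers.StartEndPacker(
    sequence_length=SEQ_LEN,
    start_value=tokenizer.token_to_id("[BOS]"),
)


def preprocess(inputs):
    outputs = tokenizer(inputs)
    features = start_packer(outputs)  # [BOS] + tokens, padded to SEQ_LEN
    labels = outputs  # next-token targets, without the [BOS]
    return features, labels
```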

MAX_PREDICT_LEN = 80
start_prompt = "The hero"
# Unpadded token sequence.
start_tokens = [tokenizer.token_to_id(_) for _ in start_prompt.lower().split()]
Member:

this feels awkward, we should actually use a tokenizer to tokenize the text. maybe let's just instantiate the tokenizer without a sequence length, and either use ragged.to_dense() or a packer layer to densify (especially if we decide to add start tokens)

Contributor Author:

edited to use packer layer for start tokens
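For illustration only, the two densifying options the reviewer mentions might look like this, assuming `tokenizer` is instantiated without a fixed sequence length so it returns a ragged tensor, and `start_packer` is the packer layer sketched earlier:

```python
prompt = "the hero"

# Option 1: tokenize, then densify the ragged output.
prompt_tokens = tokenizer([prompt]).to_tensor()

# Option 2: run the ragged tokens through the packer layer, which also
# prepends the [BOS] token and pads to a fixed length.
prompt_tokens = start_packer(tokenizer([prompt]))
```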

model.fit(ds.take(1), verbose=2, epochs=2, callbacks=[text_generation_callback])

"""
## Conclusion
Member:

i'm not sure this conclusion section adds too much. we should either replace it with a further reading/where to next section, or remove it entirely

model with few parameters.

This example combines concepts from [Text generation with a miniature GPT](https://keras.io/examples/generative/text_generation_with_miniature_gpt/)
with KerasNLP abstractions. We will demonstrate how KerasNLP tokenization, model, metrics, and
Member:

Maybe just:
We will demonstrate how KerasNLP tokenization, layers and metrics simplify the training
process, and then show how to generate output text using sampling utilities.

And then remove the whole next paragraph. Readers can read on to see the exact layers used.

Contributor Author:

edited

Member @fchollet left a comment:

Thanks for the updates! I did a round of review with a focus on copywriting.

@@ -0,0 +1,401 @@
"""
Title: Simple GPT Text Generation with KerasNLP transformers
Member:

Just "with KerasNLP" (no transformers)

Member:

Also don't capitalize all words unless they're proper nouns

Author: [Jesse Chan](https://github.com/jessechancy)
Date created: 2022/07/25
Last modified: 2022/07/25
Description: Using KerasNLP transformers to train a mini-GPT model for text generation.
Member:

Remove "transformers"

## Introduction

In this example, we will use KerasNLP layers to build a scaled down Generative
Pre-trained (GPT) model. GPT is a transformer based model that allows you to generate
Member:

Capitalize T

Member:

Also capitalize Transformer


In this example, we will use KerasNLP layers to build a scaled down Generative
Pre-trained (GPT) model. GPT is a transformer based model that allows you to generate
sophisticated text from a small input.
Member:

"from a prompt"

metrics simplify the training
process, and then show how to generate output text using sampling utilities.

Note: If you are running this on a colab make sure to enable GPU runtime for faster
Member:

"on Colab"

prompt_tokens = tf.convert_to_tensor([tokenizer.token_to_id("[BOS]")])

"""
We will use the `keras_nlp.utils` library for inference. Every text generation
Member:

Say "module" rather than library


"""
We will use the `keras_nlp.utils` library for inference. Every text generation
utility would require a `token_logits_fn()` wrapper around the model. This wrapper takes
Member:

"requires"

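The wrapper described in that quoted paragraph is small; a sketch of the idea, mirroring the convention that the model returns per-position logits of shape `(batch, sequence_length, vocab_size)`:

```python
def token_logits_fn(inputs):
    cur_len = inputs.shape[1]
    output = model(inputs)
    return output[:, cur_len - 1, :]  # logits for the next token only
```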
)

"""
## Train Tokenizer
Member:

In every section title, only capitalize the first word, not all words

"""
## Conclusion

Congrats, you made it through the example! To recap, in this example, we use KerasNLP
Member:

Drop "Congrats, you made it through the example!"

model, and perform inference with the text generation library.

If you would like to understand how transformers work, or learn more about training the
full GPT model, here are some further readings:
Member:

Add line break before list

Member @fchollet left a comment:

Thanks for the update! I pushed some copyedits. Please pull them first. I think we're ready to add the generated files now.

Member @fchollet left a comment:

Thank you for the great contribution! 👍 Merging now.

@jessechancy force-pushed the jesse-transformerdecoder-tutorial branch from 2332f50 to c79ee11 (August 5, 2022 20:43)
Member @mattdangerw left a comment:

LGTM! These examples are great! Really demonstrative of the different sampling.

Looks like we will need to cut a new release including the tokenizer vocab trainer function before we release this.

@fchollet merged commit 4bd2aa6 into keras-team:master on Aug 9, 2022