# TEXT GENERATION USING PRETRAINED TRANSFORMER-BASED LANGUAGE MODEL

_**Using a pretrained transformer-based language model with the Hugging Face Transformers library to generate new story-like text from a prompt such as "Once upon a time...".**_

This experiment uses GPT-2, a small generative pretrained transformer language model from OpenAI (https://huggingface.co/openai-community/gpt2). With just 124 million parameters, it is the smallest model in the GPT-2 family, which keeps inference tractable on commodity computers.
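
As a rough check on the 124 million figure, the parameter count can be tallied from the architecture. The sketch below assumes the commonly published GPT-2 small configuration (a vocabulary of 50,257 tokens, 1,024 positions, 768-dimensional embeddings, 12 layers, and a 4x feed-forward expansion), none of which is stated in this notebook. It also assumes the output head reuses the token embedding matrix, so it adds no parameters of its own.

```python
# Assumed GPT-2 small hyperparameters (not stated in this notebook).
VOCAB_SIZE = 50257
N_POSITIONS = 1024
N_EMBD = 768
N_LAYER = 12


def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a dense layer."""
    return n_in * n_out + n_out


def layer_norm_params(n: int) -> int:
    """Scale and shift vectors of a layer norm."""
    return 2 * n


def block_params(n_embd: int) -> int:
    """Parameters in one transformer block (attention + feed-forward + 2 norms)."""
    attention = linear_params(n_embd, 3 * n_embd) + linear_params(n_embd, n_embd)
    feed_forward = linear_params(n_embd, 4 * n_embd) + linear_params(4 * n_embd, n_embd)
    return attention + feed_forward + 2 * layer_norm_params(n_embd)


def gpt2_params() -> int:
    """Total parameters under the assumed configuration."""
    embeddings = VOCAB_SIZE * N_EMBD + N_POSITIONS * N_EMBD
    blocks = N_LAYER * block_params(N_EMBD)
    final_norm = layer_norm_params(N_EMBD)
    # The language modeling head is assumed tied to the token embeddings,
    # so it contributes no additional parameters.
    return embeddings + blocks + final_norm


total = gpt2_params()
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
```

Under these assumptions the token embeddings alone account for roughly a third of the total, which is one reason tying the output head to them keeps the model small.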

## Importing Packages

In [None]:
from transformers import pipeline, set_seed
from transformers import GPT2Tokenizer, TFGPT2LMHeadModel

In [None]:
# Set the prompt text that the model will continue when generating content
prompt = "..."

## Text Generation using Pipeline (Inference API)

In [None]:
# Sets seed in random, numpy, tf and/or torch (if installed)
set_seed(42)

# Initialize a pipeline instance by calling the `pipeline` function and passing
# "text-generation" to the `task` parameter and "gpt2" to the `model` parameter

generator_pipeline = # Write code
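
Seeding matters because sampled generation is random: without a fixed seed, each run can produce different text. The toy sketch below uses only Python's `random` module (one of the generators the comment above says `set_seed` covers) to show that reseeding reproduces the same draws. The vocabulary and the `draw_tokens` helper are hypothetical and merely stand in for a model's sampling step.

```python
import random

# Hypothetical vocabulary, used only for this illustration.
TOY_VOCAB = ["once", "upon", "a", "time", "there", "was"]


def draw_tokens(seed: int, count: int = 5) -> list[str]:
    """Seed the global generator, then draw `count` random tokens."""
    random.seed(seed)
    return [random.choice(TOY_VOCAB) for _ in range(count)]


first_run = draw_tokens(42)
second_run = draw_tokens(42)

# Identical seeds give identical draws.
print(first_run == second_run)  # prints True
```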

In [None]:
# Generate content (sequences) from the prompt by calling `generator_pipeline`
# and passing `prompt` to parameter `text_inputs`, 200 to `max_length` to cap
# the number of tokens in the generated text, `True` to `truncation` to enable
# truncation beyond that length, and 5 to `num_return_sequences` as the total
# number of generated sequences to return

# NOTE: The following step may take a few minutes to complete

generated_sequences = # Write code

In [None]:
# Print the pipeline-generated sequences

for idx, sequence in enumerate(generated_sequences):
    print(idx+1, "\t" + sequence['generated_text'])
    print('-' * 80, "\n")

## Text Generation using Model-Specific Classes

In [None]:
# Initialize the GPT-2 transformer model with a language modeling head
# (a linear layer with weights tied to the input embeddings) on top by calling
# method `from_pretrained` of class `TFGPT2LMHeadModel`, passing "gpt2" as the
# first argument (the name of the pretrained model) and `False` to parameter
# `use_safetensors` for Keras 2.x compatibility in a Keras 3.x installation

model = # Write code

All model checkpoint layers were used when initializing TFGPT2LMHeadModel.

All the layers of TFGPT2LMHeadModel were initialized from the model checkpoint at gpt2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.


In [None]:
# Initialize model-specific tokenizer by calling method `from_pretrained` of
# class `GPT2Tokenizer` and passing argument "gpt2" as its parameter

tokenizer = # Write code

In [None]:
# Tokenize and encode the prompt text by calling the `encode` method of the tokenizer
# instance created in the previous step, passing the prompt as the first argument,
# `False` to `add_special_tokens` so that special tokens such as 'SOS' and 'EOS' are not added automatically,
# and "tf" to `return_tensors` to return TensorFlow `tf.constant` objects

encoded_prompt = # Write code

# Prints the encoded prompt
print(encoded_prompt)
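
Encoding maps text to integer token IDs, and decoding reverses the mapping. The sketch below illustrates the round trip with a hypothetical whitespace tokenizer and a made-up five-entry vocabulary. GPT-2's actual tokenizer uses byte-pair encoding over a far larger vocabulary, so the IDs here bear no relation to what `encoded_prompt` will contain.

```python
# Made-up vocabulary for illustration; not GPT-2's vocabulary.
TOY_VOCAB = {"once": 0, "upon": 1, "a": 2, "time": 3, "<unk>": 4}
ID_TO_TOKEN = {idx: token for token, idx in TOY_VOCAB.items()}


def toy_encode(text: str) -> list[int]:
    """Lowercase, split on whitespace, and map each word to its ID."""
    unknown_id = TOY_VOCAB["<unk>"]
    return [TOY_VOCAB.get(word, unknown_id) for word in text.lower().split()]


def toy_decode(ids: list[int]) -> str:
    """Map IDs back to words and join them with spaces."""
    return " ".join(ID_TO_TOKEN[i] for i in ids)


ids = toy_encode("Once upon a time")
print(ids)              # [0, 1, 2, 3]
print(toy_decode(ids))  # once upon a time
```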

In [None]:
# Generate sequences by calling method `generate` on the model instance created
# earlier, passing the encoded prompt to parameter `input_ids` as the prompt for
# the generated sequences to follow, `True` to `do_sample` to sample the next
# token from the distribution over the vocabulary (instead of always choosing
# the most likely next token), 200 to `max_length` to restrict the length of
# each generated sequence, and 5 to `num_return_sequences` as the total number
# of sequences to generate

generated_sequences = # Write code
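
The difference between sampling and always taking the most likely token can be seen on a toy next-token distribution. The sketch below is not how `generate` is implemented; the probabilities, token names, and helper functions are invented for illustration.

```python
import random
from collections import Counter

# Hypothetical next-token probabilities after some prompt.
NEXT_TOKEN_PROBS = {"there": 0.5, "a": 0.3, "in": 0.15, "dragons": 0.05}


def greedy_choice(probs: dict[str, float]) -> str:
    """Always return the single most likely token."""
    return max(probs, key=probs.get)


def sampled_choice(probs: dict[str, float], rng: random.Random) -> str:
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(42)
greedy_picks = Counter(greedy_choice(NEXT_TOKEN_PROBS) for _ in range(1000))
sampled_picks = Counter(sampled_choice(NEXT_TOKEN_PROBS, rng) for _ in range(1000))

print("greedy: ", dict(greedy_picks))
print("sampled:", dict(sampled_picks))
```

Greedy selection is deterministic, so repeating it from the same prompt yields the same continuation, whereas sampling spreads its picks across the vocabulary roughly in proportion to the probabilities and can therefore produce varied sequences.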

In [None]:
# [OPTIONAL] Prints the encodings of one of the generated sequences [for reference]
generated_sequences[0]

In [None]:
# Decode all the generated sequences
for idx, sequence in enumerate(generated_sequences):
    text = # Write code to call `decode` method of the tokenizer instance passing
            # each sequence as first parameter, and `True` to parameter
            # `clean_up_tokenization_spaces` to remove space artifacts left by
            # tokenization, e.g., a stray space before punctuation as in "Hello , world !".

    print(idx+1, "\t" + text)
    print("-" * 80, "\n")
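
To make the effect of `clean_up_tokenization_spaces` concrete, the sketch below applies a handful of string replacements of the kind this cleanup performs, such as removing the space before punctuation and contractions. It is a standalone approximation written for illustration, and the `toy_clean_up` name is hypothetical; the tokenizer's own cleanup rules may differ.

```python
def toy_clean_up(text: str) -> str:
    """Remove simple spacing artifacts before punctuation and contractions."""
    replacements = [
        (" .", "."),
        (" ?", "?"),
        (" !", "!"),
        (" ,", ","),
        (" n't", "n't"),
        (" 's", "'s"),
        (" 're", "'re"),
    ]
    for old, new in replacements:
        text = text.replace(old, new)
    return text


raw = "Once upon a time , there was a king who did n't smile ."
print(toy_clean_up(raw))
```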

## Observations

- What type of pretrained model was used for this text generation task? What were its major characteristics?

- Which approaches were used for inference, and how do they differ in how much of the text-generation complexity the Transformers library abstracts away?

- Which approach was easier? Why was it easier than the other approach?

- Both approaches can return several generated sequences rather than just one. How did each approach support generating multiple sequences?