# Hugging Face GPT-2 Transformer with TensorFlow

## Setup

- Import the necessary libraries, including TensorFlow and the Hugging Face Transformers library.

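- Import your config file containing your Hugging Face API key (access token). The public `gpt2` checkpoint can be downloaded without one; a token is only needed for private or gated models.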

In [3]:
# Import the necessary libraries
import tensorflow as tf
from transformers import TFGPT2LMHeadModel, GPT2Tokenizer, GPT2Config
import config

## Model

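- Load the pre-trained GPT-2 tokenizer and model configuration from the Transformers library. The model itself is loaded in a later step, after the configuration has been updated.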

- A *tokenizer* is a tool used in natural language processing (NLP) to split text into smaller units, such as words or subwords, that can be more easily processed by a machine learning model. 
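
To make the idea concrete, here is a minimal sketch of subword tokenization using greedy longest-match against a tiny made-up vocabulary. It is not GPT-2's actual byte-level BPE algorithm; the vocabulary `TOY_VOCAB` and the helpers `toy_encode` and `toy_decode` are hypothetical names used only for illustration.

```python
# Toy subword tokenizer: greedy longest-match over a tiny, made-up vocabulary.
# This is NOT GPT-2's byte-level BPE; it only illustrates text -> IDs -> text.
TOY_VOCAB = ["I", " love", " to", " lov", "e", " ", "t", "o", "l", "v"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(TOY_VOCAB)}


def toy_encode(text):
    """Split text into the longest matching vocabulary pieces, left to right."""
    ids = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate first so " love" wins over " lov" + "e".
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in TOKEN_TO_ID:
                ids.append(TOKEN_TO_ID[piece])
                pos = end
                break
        else:
            raise ValueError(f"no vocabulary entry covers {text[pos]!r}")
    return ids


def toy_decode(ids):
    """Map IDs back to their string pieces and join them."""
    return "".join(TOY_VOCAB[i] for i in ids)


ids = toy_encode("I love to")
print(ids)              # [0, 1, 2]
print(toy_decode(ids))  # I love to
```

The real tokenizer plays the same two roles in this notebook: `encode` turns the prompt into integer IDs for the model, and `decode` turns generated IDs back into text.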


In [4]:
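# Load the pre-trained GPT-2 tokenizer and model configuration.
# The public 'gpt2' checkpoint needs no credentials, and `api_key` is not a
# documented `from_pretrained` argument (private or gated models take an access
# token via `token=` in recent Transformers versions).
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model_config = GPT2Config.from_pretrained('gpt2')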


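- Update the model config for text generation by setting the `max_new_tokens`, `do_sample`, `top_p`, and `top_k` attributes of the `model_config` object. Newer Transformers releases prefer setting these on a `GenerationConfig` or passing them directly to `generate`, and may warn about this approach.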

In [5]:
# Update the model config for text generation
model_config.max_new_tokens = 50
model_config.do_sample = True
model_config.top_p = 0.95
model_config.top_k = 50

- `max_new_tokens`: The maximum number of tokens the model may generate in a single `generate` call, not counting the tokens in the prompt. It is set to 50 here, so each call produces at most 50 new tokens. This caps the length of the output; generation can still stop earlier if the model emits the end-of-sequence token.

- `do_sample`: This option is a boolean flag that controls whether or not the model uses sampling during text generation. If do_sample is set to True, the model uses sampling to randomly select the next token from the predicted probability distribution at each generation step. If do_sample is set to False, the model always chooses the token with the highest probability at each generation step (i.e., greedy decoding).

- `top_p`: This option controls "nucleus" sampling. At each generation step the model keeps the smallest set of most likely tokens whose cumulative probability reaches `top_p`, and samples only from that set. For example, with `top_p` set to 0.95, the kept tokens together account for at least 95% of the probability mass, and the unlikely tail is discarded.

- `top_k`: This option controls the "top-k" sampling strategy during text generation. The top_k value specifies the number of most likely tokens to consider at each generation step. For example, if top_k is set to 50, the model will consider the 50 most likely tokens from the predicted probability distribution at each generation step.
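
The difference between greedy decoding and sampling described under `do_sample` can be sketched in plain Python. The probability table below is invented for illustration, and `pick_greedy` and `pick_sampled` are hypothetical helper names, not part of the Transformers API.

```python
import random

# Hypothetical next-token distribution over a four-token vocabulary.
probs = {" eat": 0.5, " read": 0.3, " run": 0.15, " zzz": 0.05}


def pick_greedy(probs):
    """do_sample=False: always take the single most likely token."""
    return max(probs, key=probs.get)


def pick_sampled(probs, rng):
    """do_sample=True: draw a token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so repeated runs draw the same tokens
print(pick_greedy(probs))                            # always " eat"
print([pick_sampled(probs, rng) for _ in range(5)])  # a mix of tokens
```

Greedy decoding returns the same continuation every time for a given prompt, while sampling can return a different one on each call, which is why the notebook's output varies between runs.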

The main difference between `top_k` and `top_p` is that `top_k` keeps a fixed number of candidate tokens, whereas `top_p` keeps a variable number that depends on the shape of the distribution: few tokens when the model is confident, many when the probability is spread out. Both take effect only when `do_sample` is True, and both trade diversity for quality by removing unlikely tokens from the candidate set. When both are set, as in this notebook, both filters are applied, so a token must survive each of them to be sampled.
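
A minimal sketch of the two filters, assuming a plain dictionary of token probabilities, shows how the candidate sets differ. Real libraries apply these filters to logit tensors; `top_k_filter` and `top_p_filter` are hypothetical names used only for this example.

```python
def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize their probabilities."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}


def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose mass reaches p."""
    kept = []
    cumulative = 0.0
    for tok, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= p:
            break  # the token that crosses the threshold is still kept
    total = sum(prob for _, prob in kept)
    return {tok: prob / total for tok, prob in kept}


# Invented distributions: one spread out, one sharply peaked.
flat = {" eat": 0.5, " read": 0.3, " run": 0.15, " zzz": 0.05}
peaked = {" eat": 0.97, " read": 0.02, " run": 0.01}

print(sorted(top_k_filter(flat, 2)))      # [' eat', ' read']
print(sorted(top_k_filter(peaked, 2)))    # [' eat', ' read']
print(sorted(top_p_filter(flat, 0.9)))    # [' eat', ' read', ' run']
print(sorted(top_p_filter(peaked, 0.9)))  # [' eat']
```

With `k=2` both distributions keep exactly two tokens, while `p=0.9` keeps three tokens from the flat distribution and only one from the peaked one.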

- Load the GPT-2 model with the updated config by calling the `TFGPT2LMHeadModel.from_pretrained` method with the gpt2 model name and the model_config object.


In [6]:
# Load the GPT-2 model with the updated config
model = TFGPT2LMHeadModel.from_pretrained('gpt2', config=model_config)

All model checkpoint layers were used when initializing TFGPT2LMHeadModel.

All the layers of TFGPT2LMHeadModel were initialized from the model checkpoint at gpt2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFGPT2LMHeadModel for predictions without further training.


- Define the input text to generate from.


In [7]:
# Define the input text to generate from
input_text = "I love to"

- Tokenize the input text using the encode method of the tokenizer.


In [8]:
# Tokenize the input text using the tokenizer
input_ids = tokenizer.encode(input_text, return_tensors='tf')

- Generate text using the pre-trained GPT-2 model by calling the `generate` method of the model with the input IDs, a padding token ID, and an attention mask that indicates which tokens to attend to. GPT-2 has no dedicated padding token, so the end-of-sequence (EOS) token ID is used for padding.


In [9]:
# Generate text using the pre-trained GPT-2 model
output_ids = model.generate(
    input_ids,                            # Input IDs for the model
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token, so reuse the EOS token ID
    attention_mask=tf.ones(input_ids.shape, dtype=tf.int32)  # All ones: every prompt token is real, none is padding
)
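
In the cell above there is a single prompt, so every position holds a real token and the attention mask is all ones. The mask matters once several prompts of different lengths are batched together. The sketch below shows the idea in plain Python: `pad_batch` is a hypothetical helper, the token IDs are invented, and left padding is an assumption based on the usual guidance for decoder-only models such as GPT-2.

```python
def pad_batch(sequences, pad_id):
    """Left-pad token ID lists to equal length and build attention masks.

    Mask value 1 marks a real token; 0 marks padding the model should ignore.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append([pad_id] * n_pad + seq)
        attention_mask.append([0] * n_pad + [1] * len(seq))
    return input_ids, attention_mask


# Invented token IDs; 50256 is GPT-2's EOS token ID, reused here as the pad ID.
ids, mask = pad_batch([[11, 22, 33], [44]], pad_id=50256)
print(ids)   # [[11, 22, 33], [50256, 50256, 44]]
print(mask)  # [[1, 1, 1], [0, 0, 1]]
```

In practice the tokenizer can build both lists for you when called on a batch with padding enabled; the manual version here only illustrates what the mask encodes.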



- Decode the output text from the output IDs using the decode method of the tokenizer.


In [10]:
# Decode the output text from the output IDs using the tokenizer
output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

- Print the generated output text.

In [11]:
# Print the generated output text
print(output_text)


I love to have these guys play with me now and they always make it work. You never know."

The Knicks (29-29) won their first 14 games of the season before losing to the Suns, 84-85. They are 7-8
