# Extract Logits and Text for use in Data Visualization
## Author: Joseph Jinn

Notes:

- https://github.com/dunovank/jupyter-themes
 - (Jupyter Notebook Themes)

- https://towardsdatascience.com/bringing-the-best-out-of-jupyter-notebooks-for-data-science-f0871519ca29
 - (useful additions for Jupyter Notebook)

- https://medium.com/@rbmsingh/making-jupyter-dark-mode-great-5adaedd814db
 - (Jupyter dark-mode settings - my eyes are no longer bleeding...)

- https://github.com/ipython-contrib/jupyter_contrib_nbextensions
 - (Jupyter extensions)

- https://pytorch.org/tutorials/intermediate/char_rnn_generation_tutorial.html
 - (PyTorch tutorial on character-level RNN)
 
<br>

Enter this in Terminal (for use with jupyter-themes):

jt -t monokai -f fira -fs 13 -nf ptsans -nfs 11 -N -kl -cursw 5 -cursc r -cellw 95% -T

<br>

Important files to reference:

- modeling_gpt2.py
 - The GPT2 model source code.
 
- tokenization_gpt2.py
 - The tokenizer class for the GPT2 model.
 
 <br>
 
Reference Material to understand the Theoretical Foundation of GPT2:

https://en.wikipedia.org/wiki/Language_model

http://karpathy.github.io/2015/05/21/rnn-effectiveness/

It would also be helpful to have some understanding of beam search. I'm not thrilled with what my Googling turned up, but these are a start:

https://en.wikipedia.org/wiki/Beam_search

https://machinelearningmastery.com/beam-search-decoder-natural-language-processing/
 
 <br>
 
Also maybe helpful but don’t get distracted:

The first 20 minutes or so of this video (everything after that covers training details; skip it):

https://www.youtube.com/watch?v=Keqep_PKrY8

https://medium.com/syncedreview/language-model-a-survey-of-the-state-of-the-art-technology-64d1a2e5a466

https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf

https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf

http://colah.github.io/posts/2015-08-Understanding-LSTMs/


##### Import required packages and libraries.

In [1]:
from tqdm import trange # Instantly make your loops show a smart progress meter

import torch # Pytorch.
import torch.nn.functional as F
import numpy as np # Numpy.
import pandas as pd # Pandas.

###############################################

# Hugging-face Transformers.
from transformers import GPT2Config, OpenAIGPTConfig, XLNetConfig, TransfoXLConfig, XLMConfig, CTRLConfig

from transformers import GPT2LMHeadModel, GPT2Tokenizer
from transformers import OpenAIGPTLMHeadModel, OpenAIGPTTokenizer
from transformers import XLNetLMHeadModel, XLNetTokenizer
from transformers import TransfoXLLMHeadModel, TransfoXLTokenizer
from transformers import CTRLLMHeadModel, CTRLTokenizer
from transformers import XLMWithLMHeadModel, XLMTokenizer

##### Load the GPT2-model.

In [2]:
model_class = GPT2LMHeadModel # Specifies the model to use.
tokenizer_class = GPT2Tokenizer # Specifies the tokenizer to use for the model.
tokenizer = tokenizer_class.from_pretrained('gpt2') # Load the pre-trained tokenizer.
model = model_class.from_pretrained('gpt2') # Load the pre-trained model.
model.to('cpu') # Specifies the device (CPU) to run the model on.
model.eval() # Puts the model in evaluation mode (disables dropout); NOT training mode.

GPT2LMHeadModel(
  (transformer): GPT2Model(
    (wte): Embedding(50257, 768)
    (wpe): Embedding(1024, 768)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0): Block(
        (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (attn): Attention(
          (c_attn): Conv1D()
          (c_proj): Conv1D()
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (mlp): MLP(
          (c_fc): Conv1D()
          (c_proj): Conv1D()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
      (1): Block(
        (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (attn): Attention(
          (c_attn): Conv1D()
          (c_proj): Conv1D()
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): Laye

The GPT2 tokenizer encodes a string into a list of token IDs, and decodes token IDs back into text.

In [3]:
encoded_string = tokenizer.encode('my favorite testcaseffjfdfgjghgkj')
print(f"Encoded string: {encoded_string}\n")

# Decodes token-by-token, instead of decoding the entire string all at once.
decoded_string = [tokenizer.decode(tok) for tok in encoded_string]
print(f"Decoded string: {decoded_string}\n")

This tokenizer does not make use of special tokens. Input is returned with no modification.


Encoded string: [1820, 4004, 1332, 7442, 487, 73, 69, 7568, 70, 73, 456, 70, 42421]

Decoded string: ['my', ' favorite', ' test', 'case', 'ff', 'j', 'f', 'df', 'g', 'j', 'gh', 'g', 'kj']
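
The encode/decode round trip above can be imitated with a toy tokenizer. This is only a sketch: it uses greedy longest-match against a hypothetical five-entry vocabulary, which is not GPT2's byte-pair encoding, and the names `TOY_VOCAB`, `toy_encode`, and `toy_decode` are made up for illustration.

```python
# Toy vocabulary (hypothetical); GPT2's real vocabulary has 50257 entries.
TOY_VOCAB = ["my", " favorite", " test", "case", " "]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(TOY_VOCAB)}


def toy_encode(text):
    """Greedy longest-match encoding against TOY_VOCAB (not real BPE)."""
    ids = []
    pos = 0
    while pos < len(text):
        # Pick the longest vocabulary entry that matches at this position.
        match = max(
            (tok for tok in TOY_VOCAB if text.startswith(tok, pos)),
            key=len,
            default=None,
        )
        if match is None:
            raise ValueError(f"no token matches at position {pos}")
        ids.append(TOKEN_TO_ID[match])
        pos += len(match)
    return ids


def toy_decode(ids):
    """Map IDs back to their text pieces and join them."""
    return "".join(TOY_VOCAB[i] for i in ids)


toy_ids = toy_encode("my favorite testcase")
toy_pieces = [toy_decode([i]) for i in toy_ids]  # decode token-by-token
print(toy_ids)
print(toy_pieces)
```

As in the real output, the leading space belongs to the token that follows it, which is why decoding token-by-token shows pieces such as `' favorite'`.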



##### Barebones GPT2-Model Text Prediction.

In [3]:
debug = True # Enable or disable debug print statements.
temperature = 1 # Default value.
length = 1 # Default value.
num_samples = 1 # Default value.

# Raw text string.
raw_text = "print('Hello"

# Encode raw text.
context_tokens = tokenizer.encode(raw_text, add_special_tokens=False)

context = context_tokens # Re-name.

# Convert the list of token ID's to a PyTorch Tensor object.
context = torch.tensor(context, dtype=torch.long, device='cpu')
if debug:
    print(f"Context shape: {context.shape}")
    print(f"Context converted to Pytorch Tensor object: {context}\n")

# unsqueeze(0) adds a leading batch dimension: shape [n] becomes [1, n].
# repeat(num_samples, 1) copies that row num_samples times: shape [num_samples, n].
context = context.unsqueeze(0).repeat(num_samples, 1)
if debug:
    print(f"Context shape after 'unsqueeze': {context.shape}")
    print(f"Context after 'unsqueeze': {context}\n")

generated = context # Re-name.

Context shape: torch.Size([3])
Context converted to Pytorch Tensor object: tensor([ 4798, 10786, 15496])

Context shape after 'unsqueeze': torch.Size([1, 3])
Context after 'unsqueeze': tensor([[ 4798, 10786, 15496]])
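
As a rough mental model, the shape change from `[3]` to `[1, 3]` can be mimicked with plain nested lists. The helpers below (`shape`, `unsqueeze0`, `repeat_rows`) are hypothetical stand-ins written for this note, not PyTorch functions.

```python
def shape(nested):
    """Return the shape of a regular, non-empty nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


def unsqueeze0(vector):
    """Wrap a 1-D list in an outer list: shape (n,) -> (1, n)."""
    return [list(vector)]


def repeat_rows(matrix, times):
    """Tile the rows of a 2-D list: shape (r, n) -> (r * times, n)."""
    return [list(row) for _ in range(times) for row in matrix]


context_ids = [4798, 10786, 15496]  # token IDs from the cell output above
batch = repeat_rows(unsqueeze0(context_ids), times=2)
print(shape(context_ids), shape(batch))
```

With `times=1`, as in the notebook where `num_samples = 1`, the result is a single row, matching the `[1, 3]` shape printed above.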



In [4]:
# Create a toy outputs data structure.
test_outputs = tuple(torch.tensor([[[[1, 2, 3, 4, 5]]]]))
if debug:
    print(f"Test outputs Tensor shape: {list(test_outputs)[0].shape}")
    print(f"Test outputs Tensor object:: {test_outputs}\n")

# Create a toy next token logits data structure.
test_next_token_logits = test_outputs[0][:, -1, :]
if debug:
    print(f"Test next token logits shape: {test_next_token_logits.shape}")
    print(f"Test next token logits: {test_next_token_logits}\n")


Test outputs Tensor shape: torch.Size([1, 1, 5])
Test outputs Tensor object:: (tensor([[[1, 2, 3, 4, 5]]]),)

Test next token logits shape: torch.Size([1, 5])
Test next token logits: tensor([[1, 2, 3, 4, 5]])
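
The `[:, -1, :]` slice can be read as "for every sequence in the batch, keep only the score row of the last position". A minimal sketch with nested lists, using made-up scores and a hypothetical helper `last_position`:

```python
def last_position(logits):
    """Nested-list analogue of logits[:, -1, :] for a (batch, seq, vocab) list."""
    return [sequence[-1] for sequence in logits]


# One sequence, three positions, a five-word toy vocabulary (values are made up).
toy_logits = [[
    [0.1, 0.2, 0.3, 0.4, 0.5],  # scores after seeing token 1
    [1.0, 0.0, 2.0, 0.0, 1.0],  # scores after seeing tokens 1-2
    [3.0, 1.0, 0.5, 2.5, 0.0],  # scores after seeing tokens 1-3
]]
toy_next = last_position(toy_logits)
print(toy_next)
```

The sequence dimension disappears: a `(1, 3, 5)` structure becomes `(1, 5)`, one score per vocabulary entry for each sequence in the batch.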



##### Adapted for use in Data Visualization.

Note: Command Mode - L (adds line numbers)

In [6]:
with torch.no_grad(): # Disables gradient tracking (autograd); only inference is needed here.
#     for _ in trange(length): 

    """
    Based on the shape, the first element of outputs is a 3-dimensional tensor: [batch, sequence, vocabulary].
    1st dimension: the batch (one entry per sample; here num_samples = 1).
    2nd dimension: one row per token position in the input (here 3 tokens).
    3rd dimension: the score assigned to every word (token) in the vocabulary (50257 entries).
    
    Important:
    
    The index along the 3rd dimension is the encoded ID value of the word.
    The element value is the score (logit) assigned to that word; a higher score means more likely to be the next word.
    """
    # Call to GPT2 model generates a Tensor object containing "scores" for the entire vocabulary.
    outputs = model(input_ids=generated)
    if debug:
        print(f"Outputs shape: {list(outputs)[0].shape}\n")
        print(f"Outputs: {list(outputs)[0]}\n") # Outputs is a tensor containing a lot of stuff...

        print(f"Outputs 3rd-level nested list within tensor object 1st-element shape: {list(outputs)[0][0][0].shape}\n")
        print(f"Outputs 3rd-level nested list within tensor object 1st-element: {list(outputs)[0][0][0]}\n")
        print(f"Outputs 3rd-level nested list within tensor object 2nd-element shape: {list(outputs)[0][0][1].shape}\n")
        print(f"Outputs 3rd-level nested list within tensor object 2nd-element: {list(outputs)[0][0][1]}\n")

    """
    [:, -1, :]
    : indexes everything in that dimension.
    -1 indexes the last element in that dimension.
    
    outputs[0] is the logits tensor of shape [batch, sequence, vocabulary].
    
    next_token_logits holds the vocabulary scores at the last token position,
    i.e. the scores for the token that would follow the input.
    """
    next_token_logits = outputs[0][:, -1, :] / (temperature if temperature > 0 else 1.)
    if debug:
        print(f"Next token logits shape: {next_token_logits.shape}\n")
        print(f"Next token logits head: {next_token_logits[0][0]}")
        print(f"Next token logits: {next_token_logits}\n")

    filtered_logits = next_token_logits # Set to default name from run_generation.py
    
    """
    Section for converting PyTorch Tensor to Pandas Dataframe via Numpy.
    """
    numpy_array = next_token_logits.numpy()
    df = pd.DataFrame(numpy_array)
    print(f"df shape: {df.shape}")
    df_tranposed = df.transpose()
    print(f"df_transposed shape: {df_tranposed.shape}")
    df_tranposed.columns = ["wordScore"]
        
    # Determine word ID's (also index values)
    wordID = list(range(len(next_token_logits[0])))
    print(f"wordID length: {len(wordID)}")
        
    # Assign word ID's to each word score.
    df_tranposed['wordID'] = wordID
    print(f"df_transposed head:\n {df_tranposed.head(10)}")
    
    # Test that we can decode a single wordID from the Pandas dataframe.
    print(f"Word ID: {df_tranposed['wordID'][10]}")
    print(tokenizer.decode(int(df_tranposed['wordID'][10])))
    
    # Debugging purposes.
    df_test = pd.DataFrame(df_tranposed.head(10))
    print(f"df_test shape: {df_test.shape}")
  
    def decode_word(row):
        """
        Decodes the word ID in a row so the decoded word can be added as an entry in a new column.
        
        params:
            row - a pandas Series (one row of the dataframe).
        return:
            row["decodedWord"] - the decoded word for that row.
        """
#         print(f"Type: {type(row['wordID'])}")
#         print(f"Type conversion: {type(int(row['wordID']))}")
        myID = row['wordID'].astype(int)
#         print(f"ID: {myID}")
        decoded_word = tokenizer.decode(int(myID))
#         print(f"Decoded word: {decoded_word}")
        row["decodedWord"] = decoded_word
        return row["decodedWord"]
    
    # Apply function to every row in dataframe.
    df_tranposed["decodedWord"] = df_tranposed.apply(decode_word, axis=1)
    print(f"df_transposed head:\n {df_tranposed.head(10)}")

    df_tranposed.to_csv('D:/Dropbox/GitHub-Pages/J-Jinn.github.io/huggingface_transformers/next_token_logits.csv',
                        index=False, header=True, encoding="utf-8")

            

Outputs shape: torch.Size([1, 3, 50257])

Outputs: tensor([[[-31.0651, -30.1248, -33.1206,  ..., -38.5258, -38.8258, -31.3107],
         [-64.8712, -63.1647, -61.1237,  ..., -75.9139, -74.8393, -66.0857],
         [-50.9404, -57.3028, -59.3610,  ..., -65.5572, -64.8027, -58.9236]]])

Outputs 3rd-level nested list within tensor object 1st-element shape: torch.Size([50257])

Outputs 3rd-level nested list within tensor object 1st-element: tensor([-31.0651, -30.1248, -33.1206,  ..., -38.5258, -38.8258, -31.3107])

Outputs 3rd-level nested list within tensor object 2nd-element shape: torch.Size([50257])

Outputs 3rd-level nested list within tensor object 2nd-element: tensor([-64.8712, -63.1647, -61.1237,  ..., -75.9139, -74.8393, -66.0857])

Next token logits shape: torch.Size([1, 50257])

Next token logits head: -50.940406799316406
Next token logits: tensor([[-50.9404, -57.3028, -59.3610,  ..., -65.5572, -64.8027, -58.9236]])

df shape: (1, 50257)
df_transposed shape: (50257, 1)
wordID len
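
The division by `temperature` in the cell above only matters once the scores are turned into probabilities. The sketch below uses a pure-Python softmax and made-up scores to show the effect; `apply_temperature` is a hypothetical helper that mirrors the `temperature if temperature > 0 else 1.` guard.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def apply_temperature(scores, temperature):
    """Divide scores by the temperature, falling back to 1.0 when it is not positive."""
    divisor = temperature if temperature > 0 else 1.0
    return [s / divisor for s in scores]


toy_scores = [2.0, 1.0, 0.0]  # made-up logits for a three-word vocabulary
for t in (0.5, 1.0, 2.0):
    probs = softmax(apply_temperature(toy_scores, t))
    print(t, [round(p, 3) for p in probs])
```

A temperature below 1 sharpens the distribution toward the top-scoring word, and a temperature above 1 flattens it; the ranking of the words does not change.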

In [29]:
"""
torch.topk does not convert scores the way Softmax does; it only selects.
    Use the words' "scores" to choose the top "k" most likely predicted words (tokens).

- torch.topk
 - Returns the :attr:`k` largest elements of the given :attr:`input` tensor along a given dimension.
 
This is not a statistical or probabilistic method, so results are deterministic (always the same).
"""
# Return the top "k" most likely (highest score value) words in sorted order.
my_topk = torch.topk(input=filtered_logits, k=30, dim=1, sorted=True)
print(f"My torch.topk object: {my_topk}\n")
print(f"torch.topk indices: {my_topk.indices}\n")
print(f"torch.topk values: {my_topk.values}\n")

# https://stackoverflow.com/questions/34750268/extracting-the-top-k-value-indices-from-a-1-d-tensor
# https://stackoverflow.com/questions/53903373/convert-pytorch-tensor-to-python-list

# Indices = encoded words, Values = scores.
print(f"\nDecoded torch.topk indices (one token each): {[tokenizer.decode(idx) for idx in my_topk.indices.squeeze().tolist()]}\n")
print(f"\nDecoded torch.topk indices (joined): {tokenizer.decode(my_topk.indices.squeeze().tolist())}\n")

"""
Note: Index values correspond to word ID's
"""
# for i in range(0, len(next_token_logits[0])):
#     print(tokenizer.decode(i))
#     if i > 100:
#         break
print(tokenizer.decode(2159))

My torch.topk object: torch.return_types.topk(
values=tensor([[-47.6363, -47.9428, -48.5424, -50.3235, -50.8331, -50.8338, -50.8381,
         -50.9404, -51.4934, -51.5334, -51.6169, -51.8034, -51.8455, -52.1599,
         -52.2399, -52.4960, -52.8470, -52.8655, -52.9795, -53.0174, -53.0947,
         -53.1981, -53.2480, -53.4509, -53.5328, -53.5375, -53.5451, -53.5861,
         -53.6293, -53.6468]]),
indices=tensor([[ 2159,    11,   995,   422, 11537,  3256, 10603,     0,     6, 13679,
           612,   705,    13,  4064, 24036,  4032,   314,    12,  1770,   198,
           616,    25,  2506,   290, 29564,  6894, 21168,   532, 28265,   262]]))

torch.topk indices: tensor([[ 2159,    11,   995,   422, 11537,  3256, 10603,     0,     6, 13679,
           612,   705,    13,  4064, 24036,  4032,   314,    12,  1770,   198,
           616,    25,  2506,   290, 29564,  6894, 21168,   532, 28265,   262]])

torch.topk values: tensor([[-47.6363, -47.9428, -48.5424, -50.3235, -50.8331, -50.8338, -
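
For intuition, top-k selection over a single row of scores can be sketched with the standard library. The scores are made up, and `top_k` is a hypothetical helper that only imitates the `(values, indices)` pair returned by `torch.topk`.

```python
import heapq


def top_k(scores, k):
    """Return (values, indices) of the k largest scores, highest first."""
    best = heapq.nlargest(k, enumerate(scores), key=lambda pair: pair[1])
    indices = [index for index, _ in best]
    values = [value for _, value in best]
    return values, indices


toy_scores = [-50.9, -47.6, -53.0, -48.5, -47.9]  # made-up scores, index = word ID
values, indices = top_k(toy_scores, k=3)
print(values, indices)
```

The indices are what get passed to the tokenizer for decoding; the values are the scores themselves.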

In [None]:
"""
This method simply returns the word with the highest "score" as the chosen next token.
(It is not a statistical or probabilistic method.)

The result is deterministic (always the same).
"""
next_token = torch.argmax(filtered_logits, dim=-1).unsqueeze(-1) # Greedy selection.

if debug:
    print(f"Next token shape: {next_token.shape}\n")
    print(f"Next token: {next_token}\n")
    print(f"Decoded next token(s): {tokenizer.decode(next_token.squeeze().tolist())}\n")
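
Greedy selection needs no softmax at all: because e^x is increasing, converting scores to probabilities never changes which word ranks first. A small pure-Python check with made-up scores (the `argmax` and `softmax` helpers are written here for illustration):

```python
import math


def argmax(values):
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def softmax(scores):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


toy_scores = [-50.9, -47.6, -53.0, -48.5]  # made-up logits
greedy_from_scores = argmax(toy_scores)
greedy_from_probs = argmax(softmax(toy_scores))
print(greedy_from_scores, greedy_from_probs)
```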

In [None]:
r"""
Softmax uses the monotonically (always) increasing function e^x.

Below is the LaTeX equation from the API documentation for Softmax:
    math:`\text{Softmax}(x_{i}) = \frac{exp(x_i)}{\sum_j exp(x_j)}`

Translation: 
    Each word's score is converted to a probability using: e^(x_i) / sum of e^(x_j) for all words.
    
The token is then drawn from the multinomial probability distribution defined by those probabilities.

The result is non-deterministic because it is based on random sampling.
"""
# Note: num_samples determines the number of tokens to choose.
next_token = torch.multinomial(
    F.softmax(filtered_logits, dim=-1),
    num_samples=1) # Random sampling, not greedy.

if debug:
    print(f"Next token shape: {next_token.shape}\n")
    print(f"Next token: {next_token}\n")
    print(f"Decoded next token(s): {tokenizer.decode(next_token.squeeze().tolist())}\n")
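
To see what "drawn from the distribution" means in practice, the sketch below samples many times from a softmax over three made-up scores and compares the observed frequencies with the probabilities. It uses `random.choices` from the standard library in place of `torch.multinomial`; `sample_index` is a hypothetical helper.

```python
import math
import random
from collections import Counter


def softmax(scores):
    """Convert raw scores to probabilities; subtracting the max avoids overflow."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(scores, rng):
    """Draw one index with probability given by softmax(scores)."""
    probs = softmax(scores)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


rng = random.Random(0)  # fixed seed only to make the demo repeatable
toy_scores = [3.0, -1.0, 2.0]
draws = Counter(sample_index(toy_scores, rng) for _ in range(10_000))
frequencies = [draws[i] / 10_000 for i in range(len(toy_scores))]
print([round(p, 3) for p in softmax(toy_scores)])
print(frequencies)
```

Unlike the greedy choice, lower-scoring words are still picked occasionally, which is why repeated runs without a fixed seed give different text.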

In [None]:
# Concatenate the chosen token (predicted word) to the end of the tokenized (encoded) string.
generated = torch.cat((generated, next_token), dim=1)
if debug:
    print(f"Generated shape: {generated.shape}")
    print(f"Generated: {generated}")
    print(f"Decoded 'generated' tokens: {tokenizer.decode(generated.squeeze().tolist())}\n")
            
out = generated # Re-name.
if debug:
    print(f"Contents of 'out': {out}")

# This line slices off the original (prompt) tokens, keeping only the generated tokens (one per iteration).
out = out[:, len(context_tokens):].tolist()
if debug:
    print(f"Contents of 'out' after .tolist(): {out}\n")
    print(f"Length of context tokens:{len(context_tokens)}\n")

# Outputs the result of the text modeling and prediction.
for o in out:
    # Decode - convert from token ID's back into English words.
    text = tokenizer.decode(o, clean_up_tokenization_spaces=True)
#     text = text[: text.find(args.stop_token) if args.stop_token else None]
    print(f"Content of text: ##{text}##\n")
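
The append-then-slice bookkeeping in this cell can be sketched with plain lists. The prompt IDs are the ones printed earlier; the two "sampled" IDs are simply borrowed from the top-k output as placeholders for whatever the sampler would pick.

```python
context_ids = [4798, 10786, 15496]  # prompt token IDs from the earlier cell
sampled_ids = [2159, 11537]         # placeholder IDs standing in for sampled tokens

generated_ids = list(context_ids)
for token_id in sampled_ids:
    # List analogue of torch.cat((generated, next_token), dim=1).
    generated_ids = generated_ids + [token_id]

# Drop the prompt and keep only the continuation.
new_ids = generated_ids[len(context_ids):]
print(new_ids)
```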
                

In [None]:
"""
X = X_1, X_2, ..., X_100
IID random sampling? - with replacement.
SRS - simple random sampling? - without replacement.

Binomial coefficients - n choose k (without replacement, order doesn't matter)
Permutations - n^k (with replacement, order matters)
Multiset coefficients - (n + k - 1) choose k (with replacement, order doesn't matter)

Note to self: Know multinomial probability distributions since basis for everything.
"""
# "." forces conversion to floats from integers.
# torch.multinomial accepts the input Tensor of probabilities, the number of random samples to take,
# and whether to sample with replacement.
torch.multinomial(F.softmax(torch.Tensor([3., -1, 2]), dim=0), 100, replacement=True)
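
The counting rules in the note above can be checked by brute force for small numbers. The sketch enumerates every selection with `itertools` and compares the counts with the closed-form formulas; the values `n = 4` and `k = 2` are arbitrary.

```python
import itertools
import math

n, k = 4, 2  # 4 distinct items, choose 2
items = range(n)

# Count selections by explicit enumeration.
counts = {
    "no replacement, order ignored": len(list(itertools.combinations(items, k))),
    "no replacement, order matters": len(list(itertools.permutations(items, k))),
    "replacement, order matters": len(list(itertools.product(items, repeat=k))),
    "replacement, order ignored": len(
        list(itertools.combinations_with_replacement(items, k))
    ),
}

# The matching closed-form formulas.
formulas = {
    "no replacement, order ignored": math.comb(n, k),
    "no replacement, order matters": math.perm(n, k),
    "replacement, order matters": n ** k,
    "replacement, order ignored": math.comb(n + k - 1, k),
}

for name in counts:
    print(name, counts[name], formulas[name])
```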

##### Same as above code cells, but refactored into a function in order to make multiple predictions for user-inputted text.

In [None]:
def make_prediction(context):
    """
    This function generates the predicted text and returns the text as tokens.

    Parameters: context - list of encoded token ID's for the input text.
    Return: generated - PyTorch Tensor containing tokens.
    """
    make_prediction_debug = False
    temperature = 1 # Default value.
    length = 20 # Default value.
    num_samples = 1 # Default value.
    
    # Convert the list of token ID's to a PyTorch Tensor object.
    context = torch.tensor(context, dtype=torch.long, device='cpu')
    if make_prediction_debug:
        print(f"Context shape: {context.shape}")
        print(f"Context converted to Pytorch Tensor object: {context}\n")
        
    # unsqueeze(0) adds a leading batch dimension: shape [n] becomes [1, n].
    # repeat(num_samples, 1) copies that row num_samples times: shape [num_samples, n].
    context = context.unsqueeze(0).repeat(num_samples, 1)
    if make_prediction_debug:
        print(f"Context shape after 'unsqueeze': {context.shape}")
        print(f"Context after 'unsqueeze': {context}\n")
    
    generated = context # Set to name as in run_generation.py
    
    ############################################################################################

    with torch.no_grad(): # Disables gradient tracking (autograd); only inference is needed here.
        for _ in trange(length): 
            
            # Call to GPT2 model generates a Tensor object containing "scores" for the entire vocabulary.
            outputs = model(input_ids=generated)
            if make_prediction_debug:
                print(f"Outputs shape: {list(outputs)[0].shape}\n")
                print(f"Outputs: {list(outputs)[0]}\n") # Outputs is a tensor containing a lot of stuff...

                print(f"Outputs 3rd-level nested list within tensor object 1st-element shape: {list(outputs)[0][0][0].shape}\n")
                print(f"Outputs 3rd-level nested list within tensor object 1st-element: {list(outputs)[0][0][0]}\n")
                print(f"Outputs 3rd-level nested list within tensor object 2nd-element shape: {list(outputs)[0][0][1].shape}\n")
                print(f"Outputs 3rd-level nested list within tensor object 2nd-element: {list(outputs)[0][0][1]}\n")

            next_token_logits = outputs[0][:, -1, :] / (temperature if temperature > 0 else 1.)
            if make_prediction_debug:
                print(f"Next token logits shape: {next_token_logits.shape}\n")
                print(f"Next token logits: {next_token_logits}\n")

            filtered_logits = next_token_logits # Set to default name from run_generation.py
            
            ############################################################################################
            
            """
            This method simply returns the word with the highest "score" as the chosen next token.
            (It is not a statistical or probabilistic method.)

            The result is deterministic (always the same).
            """
            next_token = torch.argmax(filtered_logits, dim=-1).unsqueeze(-1) # Greedy selection.

            if make_prediction_debug:
                print(f"Next token shape: {next_token.shape}\n")
                print(f"Next token: {next_token}\n")
                print(f"Decoded next token(s): {tokenizer.decode(next_token.squeeze().tolist())}\n")
    
    ############################################################################################
        
            r"""
            Softmax uses the monotonically (always) increasing function e^x.

            Below is the LaTeX equation from the API documentation for Softmax:
                math:`\text{Softmax}(x_{i}) = \frac{exp(x_i)}{\sum_j exp(x_j)}`

            Translation: 
                Each word's score is converted to a probability using: e^(x_i) / sum of e^(x_j) for all words.

            The token is then drawn from the multinomial probability distribution defined by those probabilities.

            The result is non-deterministic because it is based on random sampling.
            """
            # Note: num_samples determines the number of tokens to choose.
            # Note: this overwrites the greedy next_token chosen above, so the sampled token is the one appended.
            next_token = torch.multinomial(
                F.softmax(filtered_logits, dim=-1),
                num_samples=1) # Random sampling, not greedy.

            if make_prediction_debug:
                print(f"Next token shape: {next_token.shape}\n")
                print(f"Next token: {next_token}\n")
                print(f"Decoded next token(s): {tokenizer.decode(next_token.squeeze().tolist())}\n")
            
            ############################################################################################
            
            
            # Concatenate the chosen token (predicted word) to the end of the tokenized (encoded) string.
            generated = torch.cat((generated, next_token), dim=1)
            if make_prediction_debug:
                print(f"Generated shape: {generated.shape}")
                print(f"Generated: {generated}")
                print(f"Decoded 'generated' tokens: {tokenizer.decode(generated.squeeze().tolist())}\n")
                                       
    return generated
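
Stripped of the debug printing, the function above is a short autoregressive loop: score, pick a next token, append, repeat. The sketch below shows that loop with a hypothetical stand-in model (`toy_model`) that always prefers the ID one higher than the last token, and with greedy selection instead of sampling so the result is predictable.

```python
def toy_model(token_ids, vocab_size=5):
    """Hypothetical stand-in for the model: one score row per input position.

    Each row scores (token_id + 1) % vocab_size highest.
    """
    rows = []
    for token_id in token_ids:
        row = [0.0] * vocab_size
        row[(token_id + 1) % vocab_size] = 1.0
        rows.append(row)
    return rows


def toy_generate(context_ids, length):
    """Greedy autoregressive loop over the toy model."""
    generated = list(context_ids)
    for _ in range(length):
        next_scores = toy_model(generated)[-1]  # scores at the last position
        next_id = max(range(len(next_scores)), key=lambda i: next_scores[i])
        generated.append(next_id)               # feed the choice back in
    return generated


full = toy_generate([0, 1], length=4)
print(full, full[2:])
```

Each iteration feeds the whole sequence generated so far back into the model, which is also what `model(input_ids=generated)` does in the function above.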

##### Calls the make_prediction(context) function and outputs text prediction and generation results.

In [None]:
def output_prediction(num_predictions, context_tokens):
    """
    This function outputs the results of our generated predicted text.
    """
    output_prediction_debug = False
    # # Prompt user for input text.
    # raw_text = args.prompt if args.prompt else input("Model prompt >>> ")
    # context_tokens = tokenizer.encode(raw_text, add_special_tokens=False)

    # out = sample_sequence(
    #     model=model,
    #     context=context_tokens,
    #     num_samples=args.num_samples,
    #     length=args.length,
    #     temperature=args.temperature,
    #     top_k=args.top_k,
    #     top_p=args.top_p,
    #     repetition_penalty=args.repetition_penalty,
    #     is_xlnet=bool(args.model_type == "xlnet"),
    #     is_xlm_mlm=is_xlm_mlm,
    #     xlm_mask_token=xlm_mask_token,
    #     xlm_lang=xlm_lang,
    #     device=args.device,
    # )
    
    ############################################################################################
    
    for i in range(0, num_predictions):
            
        out = make_prediction(context_tokens) # Function returns "generated" - PyTorch Tensor containing encoded tokens.
        if output_prediction_debug:
            print(f"Contents of 'out': {out}")

        # This line slices off the original (prompt) tokens, keeping only the generated tokens (one per iteration).
        out = out[:, len(context_tokens):].tolist()
        if output_prediction_debug:
            print(f"Contents of 'out' after .tolist(): {out}\n")
            print(f"Length of context tokens:{len(context_tokens)}\n")

        # Outputs the result of the text modeling and prediction.
        for o in out:
            # Decode - convert from token ID's back into English words.
            text = tokenizer.decode(o, clean_up_tokenization_spaces=True)
        #     text = text[: text.find(args.stop_token) if args.stop_token else None]
            print(f"Content of text: ##text_start_marker##{text}##text_end_marker##\n")
                

##### The usual main function.

In [None]:
def main():
    """
    Main function.
    """
    main_debug = False
    num_predictions = 3 # Specify the number of predictions to make for input string.
    
    # Raw text string.
    raw_text = "this is a test string for trying to understand what the heck is happening in run_generation.py."
    
    # Encode raw text.
    context_tokens = tokenizer.encode(raw_text, add_special_tokens=False)
    # Generate and output text prediction results.
    output_prediction(num_predictions, context_tokens)
    
    if main_debug:
        print(f"Raw text: {raw_text}\n")
        print(f"Context tokens: {context_tokens}\n")
    

##### Execute to use the GPT2 model to generate text predictions.

In [None]:
if __name__ == '__main__':
    main()