<center><b>© Content is made available under the CC-BY-NC-ND 4.0 license.

---

Christian Lopez, lopezbec@lafayette.edu</b></center>

## This notebook demonstrates interactive text generation with the GPT-2 Large model. You can enter a prompt and then generate text one token at a time, or specify the number of tokens to generate through a simple form. Along the way, the notebook shows how the prompt is tokenized and which tokens the model considers most likely to come next.

# About GPT-2 Large:
## GPT-2 Large is one of the open-source language models developed by OpenAI, with 774 million parameters.


In [1]:
# @title 1) Installing some Python libraries and downloading the GPT-2 model
# Install the transformers library (if not already installed)
!pip install transformers

from IPython.display import clear_output
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Use GPT-2 Large as our model for English text generation
model_name = "gpt2-large"  # GPT-2 Large has ~774M parameters
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)
model.eval()  # Set the model to evaluation mode

# Check if GPU is available and move the model to GPU if possible
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)


clear_output()

In [4]:
# @title 2) Here you can type your prompt, to see how it becomes a list of numbers
# Get a prompt from the user
prompt ='Lafayette College is located ' # @param {type:"string", placeholder:"Enter a prompt"}

# Tokenize the prompt and print tokens and IDs
tokenized_prompt = tokenizer.tokenize(prompt)
input_ids = tokenizer.encode(prompt, return_tensors='pt').to(device)

print("\n--- Tokenization Details ---")
print("Tokenized prompt (tokens):", tokenized_prompt)
print("Tokenized prompt (token IDs):", input_ids.tolist()[0])

# Set temperature for randomness (lower value = more deterministic)
temperature = 0.1


--- Tokenization Details ---
Tokenized prompt (tokens): ['L', 'af', 'ayette', 'ĠCollege', 'Ġis', 'Ġlocated', 'Ġ']
Tokenized prompt (token IDs): [43, 1878, 27067, 5535, 318, 5140, 220]
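
The `Ġ` prefix in the tokens above is how GPT-2's byte-level BPE vocabulary displays a leading space, so `ĠCollege` stands for " College". The following is a minimal sketch, using only the token strings printed above (no model required), of how the pieces map back to the prompt. The helper name `tokens_to_text` is made up for illustration; with the real tokenizer, `tokenizer.convert_tokens_to_string(tokens)` plays this role.

```python
# Token strings copied from the tokenization output above.
tokens = ['L', 'af', 'ayette', 'ĠCollege', 'Ġis', 'Ġlocated', 'Ġ']

def tokens_to_text(pieces):
    """Join GPT-2 token strings and turn the 'Ġ' marker back into a space.

    Simplified sketch: it only handles the space marker, which is all
    this ASCII prompt needs.
    """
    return "".join(pieces).replace("Ġ", " ")

reconstructed = tokens_to_text(tokens)
print(repr(reconstructed))
```

Note that the trailing space in the prompt becomes a token of its own (`Ġ`, ID 220) rather than being attached to the next word.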


In [5]:
# @title 3) Predict the next tokens, one at a time



# Compute logits for the current prompt
with torch.no_grad():
    outputs = model(input_ids)
    logits = outputs.logits

# Adjust logits with temperature for randomness and compute softmax probabilities
next_token_logits = logits[:, -1, :] / temperature
probs = F.softmax(next_token_logits, dim=-1)

# Get the top 5 tokens with the highest probabilities
top5 = torch.topk(probs, 5)
top5_ids = top5.indices[0]
top5_probs = top5.values[0]
top5_tokens = [tokenizer.decode(token_id) for token_id in top5_ids]

print("\n--- Top 5 Tokens for the Next Prediction ---")
for token, prob in zip(top5_tokens, top5_probs):
    print(f"Token: '{token.strip()}', Probability: {prob.item():.4f}")

# Initialize generated text with the prompt
generated_text = prompt

# Define the maximum number of tokens to generate in the interactive loop
max_tokens = 50  # Adjust as needed

print("\n--- Interactive Generation Process ---\n")

# Interactive loop: generate tokens one at a time based on user confirmation
for i in range(max_tokens):
    with torch.no_grad():
        outputs = model(input_ids)
        logits = outputs.logits

    # Adjust logits with temperature for randomness
    next_token_logits = logits[:, -1, :] / temperature

    # Convert logits to probabilities
    probs = F.softmax(next_token_logits, dim=-1)

    # Sample the next token from the probability distribution
    next_token_id = torch.multinomial(probs, num_samples=1)
    next_token_prob = probs[0, next_token_id.item()]  # probability of the chosen token

    # Decode the predicted token to a string
    next_token = tokenizer.decode(next_token_id.squeeze())

    # Append the predicted token to the input_ids for the next iteration
    input_ids = torch.cat([input_ids, next_token_id], dim=1)
    # Append the predicted token to the generated text
    generated_text += next_token

    # Print the predicted token and its probability
    print(f"Step {i+1}: Predicted token: '{next_token.strip()}' with probability: {next_token_prob.item():.4f}")

    # Ask the user if they want to continue generating the next token
    user_input = input("Generate next token? (y/n): ")
    if user_input.lower() not in ["y", "yes"]:
        break

    # Print the updated prompt/text after the user confirms
    print("\nUpdated text:")
    print(generated_text)
    print("-" * 40)

print("\n--- Full Generated Response ---\n")
print(generated_text)



--- Top 5 Tokens for the Next Prediction ---
Token: '', Probability: 1.0000
Token: '________', Probability: 0.0000
Token: '________________', Probability: 0.0000
Token: '_____', Probability: 0.0000
Token: 'Â', Probability: 0.0000

--- Interactive Generation Process ---

Step 1: Predicted token: '' with probability: 1.0000
Generate next token? (y/n): y

Updated text:
Lafayette College is located  
----------------------------------------
Step 2: Predicted token: 'in' with probability: 1.0000
Generate next token? (y/n): y

Updated text:
Lafayette College is located  in
----------------------------------------
Step 3: Predicted token: 'Lafayette' with probability: 0.9975
Generate next token? (y/n): y

Updated text:
Lafayette College is located  in Lafayette
----------------------------------------
Step 4: Predicted token: ',' with probability: 1.0000
Generate next token? (y/n): y

Updated text:
Lafayette College is located  in Lafayette,
----------------------------------------
Step 5: P

KeyboardInterrupt: Interrupted by user
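
The interactive loop above picks each token with `torch.multinomial`, which means it samples from the probabilities instead of always taking the most likely token. Below is a minimal pure-Python sketch of that difference. The four-token distribution (`toy_tokens`, `toy_probs`) is made up for illustration and is not GPT-2 output, and the standard library's `random.choices` stands in for `torch.multinomial`.

```python
import random
from collections import Counter

# Hypothetical next-token distribution (illustrative numbers, not GPT-2 output).
toy_tokens = [" in", " on", " at", " near"]
toy_probs = [0.70, 0.15, 0.10, 0.05]

def greedy_pick(tokens, probs):
    """Always return the single most probable token."""
    best_index = max(range(len(probs)), key=lambda i: probs[i])
    return tokens[best_index]

def sample_pick(tokens, probs, rng):
    """Draw one token at random, weighted by its probability."""
    return rng.choices(tokens, weights=probs, k=1)[0]

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = Counter(sample_pick(toy_tokens, toy_probs, rng) for _ in range(1000))

print("Greedy choice:", repr(greedy_pick(toy_tokens, toy_probs)))
print("Sampled counts over 1000 draws:", dict(draws))
```

When one token holds nearly all of the probability, as in the recorded steps above, sampling and greedy selection almost always agree. The two diverge once the distribution is flatter.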

# 4) Generate all the tokens at once

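Try raising the value of the `temperature` hyperparameter to make the output more "creative": higher values flatten the probability distribution, so less likely tokens are sampled more often.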
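
To see what the temperature does numerically, here is a small sketch in plain Python that mirrors the notebook's `logits / temperature` followed by softmax. The function name `softmax_with_temperature` and the three logits are assumptions chosen for illustration, not values taken from GPT-2.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate tokens (not taken from GPT-2).
toy_logits = [2.0, 1.0, 0.5]

for t in (0.1, 0.5, 0.9):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temperature={t}: " + ", ".join(f"{p:.4f}" for p in probs))
```

With these toy logits, the lowest temperature puts almost all of the probability on the first token, which mirrors the 1.0000 probabilities recorded in this notebook's outputs at `temperature = 0.1`. Higher temperatures spread the probability more evenly.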

In [6]:
# @title

# Get the initial prompt from the user
prompt ='Lafayette College is located ' # @param {type:"string", placeholder:"Enter a prompt"}

# Tokenize the prompt and encode to tensor
input_ids = tokenizer.encode(prompt, return_tensors='pt').to(device)
generated_text = prompt

# Ask the user for the number of new tokens to generate
num_tokens = 25 # @param {type:"number"}

# Set temperature for randomness (lower value means more deterministic output)
temperature = 0.1 # @param {type:"slider", min:0.1, max:0.9, step:0.1}

print("\n--- Generating Text ---\n")
for i in range(num_tokens):
    with torch.no_grad():
        outputs = model(input_ids)
        logits = outputs.logits

    # Adjust logits with temperature for randomness and compute probabilities
    next_token_logits = logits[:, -1, :] / temperature
    probs = F.softmax(next_token_logits, dim=-1)

    # Sample the next token from the probability distribution
    next_token_id = torch.multinomial(probs, num_samples=1)
    next_token = tokenizer.decode(next_token_id.squeeze())

    # Append the predicted token to input_ids and generated text
    input_ids = torch.cat([input_ids, next_token_id], dim=1)
    generated_text += next_token

    # Optionally, print each step
    print(f"Step {i+1}: Generated token: '{next_token.strip()}'")

print("\n--- Final Generated Text ---\n")
print(generated_text)



--- Generating Text ---

Step 1: Generated token: ''
Step 2: Generated token: 'in'
Step 3: Generated token: 'Lafayette'
Step 4: Generated token: ','
Step 5: Generated token: 'Louisiana'
Step 6: Generated token: '.'
Step 7: Generated token: 'It'
Step 8: Generated token: 'is'
Step 9: Generated token: 'a'
Step 10: Generated token: 'private'
Step 11: Generated token: 'liberal'
Step 12: Generated token: 'arts'
Step 13: Generated token: 'college'
Step 14: Generated token: 'that'
Step 15: Generated token: 'offers'
Step 16: Generated token: 'a'
Step 17: Generated token: 'variety'
Step 18: Generated token: 'of'
Step 19: Generated token: 'programs'
Step 20: Generated token: 'including'
Step 21: Generated token: 'a'
Step 22: Generated token: 'Bachelor'
Step 23: Generated token: 'of'
Step 24: Generated token: 'Arts'
Step 25: Generated token: 'in'

--- Final Generated Text ---

Lafayette College is located  in Lafayette, Louisiana. It is a private liberal arts college that offers a variety of programs including a Bachelor of Arts in
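
Step 1 above prints as `''` because the loop calls `.strip()` on each token before printing it. `strip()` removes whitespace, so a token made only of whitespace looks empty. The sketch below uses made-up token strings (not the actual GPT-2 output) to show how `repr()` keeps such tokens visible.

```python
# Hypothetical decoded tokens; the second and third are whitespace-only.
example_tokens = [" in", " ", "\n", "Lafayette"]

for tok in example_tokens:
    stripped = tok.strip()
    # strip() drops leading/trailing whitespace, so whitespace-only tokens vanish;
    # repr() shows the exact characters, including spaces and newlines.
    print(f"strip(): '{stripped}'   repr(): {tok!r}")

# Tokens that would be printed as '' by the notebook's loop.
hidden = [tok for tok in example_tokens if tok.strip() == ""]
```

Printing `repr(next_token)` in place of `next_token.strip()` is one way to inspect exactly which characters the model generated at each step.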

## Additional Resources

If you are interested in learning more about the fundamentals of Large Language Models (LLMs) and how they work, here are some useful resources:

### Websites
- **[Hugging Face Transformers Documentation](https://huggingface.co/transformers/):**  
  A complete guide on how to use transformer-based models, including tutorials and code examples.
- **[OpenAI Blog](https://openai.com/blog/):**  
  Read about the latest research, applications, and discussions around advanced language models.
- **[The Illustrated Transformer](http://jalammar.github.io/illustrated-transformer/):**  
  A visual and intuitive explanation of transformer architectures and how they power modern language models.
- **[Stanford CS224n: Natural Language Processing with Deep Learning](https://web.stanford.edu/class/cs224n/):**  
  Explore lecture notes, slides, and video recordings on NLP and deep learning.

### YouTube Channels and Videos
- **Hugging Face – Transformers in Action:**  
  Look for practical tutorials and demonstrations on using transformers on the [Hugging Face YouTube Channel](https://www.youtube.com/c/HuggingFace).
- **Yannic Kilcher:**  
  In-depth analysis of research papers and AI models, including explanations of GPT-2, GPT-3, and other LLMs. Visit his [YouTube channel](https://www.youtube.com/c/YannicKilcher).
- **Two Minute Papers:**  
  Short and accessible summaries of recent AI research, including advances in language models. Visit the [Two Minute Papers YouTube Channel](https://www.youtube.com/user/keeroyz).
- **DeepLearning.AI:**  
  Educational content and discussions about current trends in deep learning and AI. Watch videos on the [DeepLearning.AI YouTube Channel](https://www.youtube.com/c/Deeplearningai).

These resources will help you deepen your understanding of LLMs and f
