# Text Generation Demo

This notebook demonstrates text generation capabilities using language models from the NLP toolkit.

In [None]:
# Setup path to allow importing from the src directory
import sys
import os
from pathlib import Path

# Add the project root (assumed to be the parent of the current working directory) to the Python path
project_root = str(Path().resolve().parent)
if project_root not in sys.path:
    sys.path.append(project_root)

# Import necessary modules
from transformers import pipeline
import torch
import matplotlib.pyplot as plt
import numpy as np

In [None]:
# Model configuration
MODEL_NAME = "gpt2"  # Small model for faster loading and generation
MAX_LENGTH = 50  # Maximum total length in tokens (prompt plus generated text)

print(f"Using model: {MODEL_NAME}")
print(f"Max generation length: {MAX_LENGTH}")

In [None]:
# Create text generation pipeline
generator = pipeline('text-generation', model=MODEL_NAME)

# Sample prompt
prompt = "In the future, artificial intelligence will"

# Generate text
generated_text = generator(
    prompt, 
    max_length=MAX_LENGTH, 
    num_return_sequences=1,
    pad_token_id=generator.tokenizer.eos_token_id
)

# Display results
print(f"Prompt: {prompt}")
print("\nGenerated text:")
print(generated_text[0]['generated_text'])

In [None]:
# Multiple prompts for comparison
prompts = [
    "The best way to learn programming is",
    "Climate change will impact the world by",
    "The future of natural language processing includes"
]

# Generate text for each prompt
for i, prompt in enumerate(prompts, start=1):
    print(f"\nPrompt {i}: {prompt}")

    # Use a shorter max_length to keep each example brief
    result = generator(
        prompt, 
        max_length=30, 
        num_return_sequences=1,
        pad_token_id=generator.tokenizer.eos_token_id
    )
    
    print("Generated text:")
    print(result[0]['generated_text'])
    print("-" * 50)

In [None]:
# Exploring generation parameters
prompt = "Artificial intelligence will revolutionize"

# Different temperature values
temperatures = [0.7, 1.0, 1.5]

for temp in temperatures:
    print(f"\nTemperature: {temp}")
    
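    # Temperature only affects sampling, so enable sampling explicitly
    # instead of relying on the model's default generation settings
    result = generator(
        prompt,
        max_length=40,
        do_sample=True,
        temperature=temp,
        num_return_sequences=1,
        pad_token_id=generator.tokenizer.eos_token_id
    )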
    
    print(result[0]['generated_text'])
    print("-" * 50)
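
To make the effect of temperature concrete, the following is a minimal NumPy sketch of temperature-scaled softmax. It is an illustration only, not the pipeline's internal code: the function name `softmax_with_temperature` and the four toy logits are assumptions made up for this example.

```python
import numpy as np

def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing them by the temperature."""
    scaled = np.asarray(logits, dtype=np.float64) / temperature
    scaled = scaled - scaled.max()  # subtract the max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

# Hypothetical logits for a four-token vocabulary (illustrative values only)
toy_logits = [2.0, 1.0, 0.5, -1.0]

for temp in [0.7, 1.0, 1.5]:
    probs = softmax_with_temperature(toy_logits, temp)
    print(f"temperature={temp}: {np.round(probs, 3)}")
```

Lower temperatures concentrate probability on the most likely token, while higher temperatures flatten the distribution so that unlikely tokens are sampled more often.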

## Conclusion

This notebook demonstrates basic text generation with Transformer-based language models. The character of the generated text can be controlled through parameters such as:

- **Temperature**: Rescales the token probabilities before sampling (higher = more random, lower = more predictable)
- **Top-k**: Restricts sampling at each step to the k most likely tokens
- **Top-p (nucleus sampling)**: Samples from the smallest set of tokens whose cumulative probability exceeds p
- **Repetition penalty**: Lowers the scores of tokens that have already appeared, which reduces repetitive text
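
The filtering parameters in the list above can be sketched on a toy distribution. This is a simplified NumPy illustration under stated assumptions, not the library's implementation: the helper names (`top_k_filter`, `top_p_filter`, `apply_repetition_penalty`) and the toy logits are hypothetical, and the repetition penalty shown (divide positive logits, multiply negative ones) is one common formulation.

```python
import numpy as np

def softmax(logits):
    """Turn logits into probabilities; -inf logits become probability 0."""
    shifted = logits - logits.max()
    exp = np.exp(shifted)
    return exp / exp.sum()

def top_k_filter(logits, k):
    """Keep the k largest logits and mask the rest (ties at the cutoff are kept)."""
    logits = np.asarray(logits, dtype=np.float64)
    if k >= logits.size:
        return logits.copy()
    cutoff = np.sort(logits)[-k]
    return np.where(logits >= cutoff, logits, -np.inf)

def top_p_filter(logits, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    logits = np.asarray(logits, dtype=np.float64)
    order = np.argsort(logits)[::-1]  # token indices, most likely first
    cumulative = np.cumsum(softmax(logits)[order])
    keep = int(np.searchsorted(cumulative, p)) + 1  # always keep at least one token
    filtered = np.full_like(logits, -np.inf)
    filtered[order[:keep]] = logits[order[:keep]]
    return filtered

def apply_repetition_penalty(logits, previous_token_ids, penalty):
    """Make already-generated tokens less likely (assumed formulation, see lead-in)."""
    logits = np.asarray(logits, dtype=np.float64).copy()
    for token_id in set(previous_token_ids):
        if logits[token_id] > 0:
            logits[token_id] /= penalty
        else:
            logits[token_id] *= penalty
    return logits

# Hypothetical logits for a five-token vocabulary (illustrative values only)
toy_logits = np.array([2.0, 1.0, 0.5, -1.0, -2.0])

print("original:  ", np.round(softmax(toy_logits), 3))
print("top-k (2): ", np.round(softmax(top_k_filter(toy_logits, 2)), 3))
print("top-p (.9):", np.round(softmax(top_p_filter(toy_logits, 0.9)), 3))
print("penalized: ", np.round(softmax(apply_repetition_penalty(toy_logits, [0], 1.2)), 3))
```

In each case the masked or penalized logits are renormalized with `softmax` before sampling, so the remaining tokens share the full probability mass.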

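Top-k, top-p, and repetition penalty are passed to the generator call as keyword arguments (`top_k`, `top_p`, `repetition_penalty`), in the same way as `temperature` above. For more advanced generation capabilities, the full API provides additional control options and model choices.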