In [1]:
from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch

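# Load pre-trained model and tokenizer.
# Other GPT-2 checkpoints such as "gpt2-medium" also work with these classes. Other architectures
# (e.g. "EleutherAI/gpt-neo-125M") should be loaded with AutoTokenizer / AutoModelForCausalLM instead.
model_name = "gpt2"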
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

# Move model to GPU if available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

# Prompt for generation. tokenizer.encode converts the prompt to a tensor of token IDs.
# return_tensors="pt": return a PyTorch tensor ("pt" stands for PyTorch).
# .to(device): move the input tensor to the same device as the model (GPU or CPU).
prompt = "Once upon a time"
input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)

# Generate text.
# input_ids: the tokenized prompt.
# max_length=100: maximum total length in tokens (prompt + generated tokens).
# temperature=0.7: controls randomness. Lower = more predictable, higher = more random.
# top_k=50: sample only from the 50 most likely next tokens.
# top_p=0.95: nucleus sampling. Sample only from the smallest set of tokens whose cumulative probability reaches 0.95.
# num_return_sequences=1: generate one sequence per prompt. Set it higher for multiple variations.
# do_sample=True: enables sampling (rather than greedy decoding, which always picks the most likely token).
# pad_token_id=tokenizer.eos_token_id: GPT-2 has no pad token by default, so the EOS token is used for padding.
with torch.no_grad():
    output_ids = model.generate(
        input_ids,
        max_length=100,
        temperature=0.7,
        top_k=50,
        top_p=0.95,
        num_return_sequences=1,
        do_sample=True,  # Enables sampling (not greedy)
        pad_token_id=tokenizer.eos_token_id  # Avoid warnings for GPT2
    )

# Decode and print
generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print("Generated Text:\n", generated_text)


The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
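The warning above appears because `tokenizer.encode` returns only token IDs, so `generate` receives no attention mask. One way to address it, sketched here under the assumption that the `model`, `tokenizer` and `device` objects from the cell above are available, is to call the tokenizer directly and forward everything it returns. The helper name `generate_with_mask` is hypothetical and not part of the original notebook.

```python
def generate_with_mask(model, tokenizer, prompt, device, max_length=100):
    """Hypothetical helper: generate text while passing an explicit attention mask."""
    # Calling the tokenizer (instead of tokenizer.encode) returns a mapping
    # holding both input_ids and attention_mask; .to(device) moves both.
    inputs = tokenizer(prompt, return_tensors="pt").to(device)

    output_ids = model.generate(
        **inputs,  # expands to input_ids=..., attention_mask=...
        max_length=max_length,
        temperature=0.7,
        top_k=50,
        top_p=0.95,
        num_return_sequences=1,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

In the notebook this could be called as `print(generate_with_mask(model, tokenizer, "Once upon a time", device))`, optionally inside the same `torch.no_grad()` block used above. Because sampling is enabled, the generated text will differ from run to run.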


Generated Text:
 Once upon a time I was a young man and a girl, we were together, we had a family and we loved each other. We were a very good couple, I loved her. I don't know if she's dead, but she was my sister, I love her and we didn't have any problems.

I was in middle school, I went to college. I got a job. I was like, 'What are you doing? You're working for a company that has
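The `temperature`, `top_k` and `top_p` settings passed to `generate` in the cell above can be illustrated on a toy distribution. The following is a minimal pure-Python sketch of the general idea, not the library's own implementation; the function names `filter_next_token_probs` and `sample_token` and the toy logits are made up for illustration.

```python
import math
import random

def filter_next_token_probs(logits, temperature=0.7, top_k=50, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature: divide the logits before softmax; values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Top-k: keep only the k highest-scoring tokens (indices sorted by score, descending).
    ranked = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    kept = ranked[:top_k]

    # Softmax over the surviving tokens (subtract the max for numerical stability).
    peak = scaled[kept[0]]
    weights = [math.exp(scaled[i] - peak) for i in kept]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    nucleus, cumulative = [], 0.0
    for idx, p in zip(kept, probs):
        nucleus.append((idx, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalise so the remaining probabilities sum to 1.
    norm = sum(p for _, p in nucleus)
    return {idx: p / norm for idx, p in nucleus}

def sample_token(logits, rng=random, **kwargs):
    """Draw one token index from the filtered distribution."""
    dist = filter_next_token_probs(logits, **kwargs)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]

# Toy example: five candidate tokens with made-up logits.
toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]
print(filter_next_token_probs(toy_logits, temperature=0.7, top_k=3, top_p=0.95))
print(sample_token(toy_logits, temperature=0.7, top_k=3, top_p=0.95))
```

With `top_k=3` the two least likely tokens can never be sampled, and lowering `top_p` shrinks the candidate set further. Greedy decoding corresponds to always taking the single highest-scoring token instead of sampling.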
