## Core Text Generation Parameters
Let’s use the **GPT-2 model** as an example. It is a small transformer model that runs without much computational power yet still generates fluent text. The following example generates text with GPT-2:

In [3]:
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# create model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# tokenize input prompt to sequence of ids
prompt = "Artificial intelligence is"
inputs = tokenizer(prompt, return_tensors="pt")
# generate output as a sequence of token ids
output = model.generate(
    **inputs,
    max_length=50,
    num_return_sequences=1,
    temperature=1.0,
    top_k=50,
    top_p=1.0,
    repetition_penalty=1.0,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
# convert token ids into text strings
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print(f"Prompt: {prompt}")
print("Generated Text:")
print(generated_text)

Prompt: Artificial intelligence is
Generated Text:
Artificial intelligence is a big deal to us. And I think to some degree it's part of the answer. Let's face it: If we can make machines do what we ask them to do, what we demand will be a different proposition than


### Experimenting with Temperature
Now that you know what each parameter does, let’s see how the output changes when you adjust some of them.

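The temperature parameter controls how random the generated text is. Low values make the model favor its most likely tokens, which gives safer but more repetitive text; high values flatten the token distribution and give more varied, less predictable text. You can see its effect with the following example: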

In [5]:
prompt = "The future of artificial intelligence is"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate text with different temperature values
temperatures = [0.2, 0.5, 1.0, 1.5]
print(f"Prompt: {prompt}")
for temp in temperatures:
    print()
    print(f"Temperature: {temp}")
    output = model.generate(
        **inputs,
        max_length=100,
        num_return_sequences=1,
        temperature=temp,
        top_k=50,
        top_p=1.0,
        repetition_penalty=1.0,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
    print("Generated Text:")
    print(generated_text)

Prompt: The future of artificial intelligence is

Temperature: 0.2
Generated Text:
The future of artificial intelligence is uncertain, but it is already in the works.

"The future of artificial intelligence is uncertain, but it is already in the works," said Richard Stallman, chief executive of the open-source software company, which is working on a new version of the AI called DeepMind. "It's not going to be a big leap forward. It's going to be a big leap forward."

The AI is already being tested in the lab of a company

Temperature: 0.5
Generated Text:
The future of artificial intelligence is going to be a very interesting time for AI.

AI is going to be a very interesting time for AI. The future of AI is going to be a very exciting time for AI.

That's the thing about AI. It's a very interesting time. It's a very interesting time.

I think the future of AI is going to be a very exciting time for AI.

It's a very interesting time. It's a very

Temperature: 1.0
Generated Text:
The fut
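
To see why low temperatures give more repetitive text, it can help to look at the arithmetic on a toy example. The sketch below is not taken from the transformers library; it assumes the usual definition in which the model's raw scores (logits) are divided by the temperature before the softmax. The function name and the four scores are made up for illustration.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum for numerical stability; the result is unchanged.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for four candidate next tokens.
toy_logits = [2.0, 1.0, 0.5, 0.1]

for temp in [0.2, 0.5, 1.0, 1.5]:
    probs = softmax_with_temperature(toy_logits, temp)
    print(temp, [round(p, 3) for p in probs])
```

At a temperature of 0.2 almost all of the probability lands on the top-scoring token, so sampling behaves much like always picking the favorite. At 1.5 the probabilities are spread more evenly, so less likely tokens are drawn more often.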

### Top-K and Top-P Sampling
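The top_k and top_p parameters both restrict the pool of tokens the model may sample from. top_k keeps only the k most probable tokens, while top_p (nucleus sampling) keeps the smallest set of tokens whose cumulative probability reaches p. Should you adjust the top_k parameter or the top_p parameter? Let’s see their effect in an example, where the second loop sets top_k=0 to switch off top-k filtering and isolate the effect of top_p: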

In [6]:
prompt = "The best way to learn programming is"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate text with different top_k values
top_k_values = [5, 20, 50]
print(f"Prompt: {prompt}")

for top_k in top_k_values:
    print()
    print(f"Top-K = {top_k}")
    output = model.generate(
        **inputs,
        max_length=100,
        num_return_sequences=1,
        temperature=1.0,
        top_k=top_k,
        top_p=1.0,
        repetition_penalty=1.0,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
    print("Generated Text:")
    print(generated_text)

# Generate text with different top_p values
top_p_values = [0.5, 0.7, 0.9]
for top_p in top_p_values:
    print()
    print(f"Top-P = {top_p}")
    output = model.generate(
        **inputs,
        max_length=100,
        num_return_sequences=1,
        temperature=1.0,
        top_k=0,
        top_p=top_p,
        repetition_penalty=1.0,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
    print("Generated Text:")
    print(generated_text)

Prompt: The best way to learn programming is

Top-K = 5
Generated Text:
The best way to learn programming is to get a good understanding of programming. If you are new to programming then there will not be much information on programming at all. You will need a good understanding of programming, but you won't have to go far. The most important thing is to learn what you can learn.

What are the basic rules for learning programming?

There are two major rules you must follow when learning programming. First, you need to learn the basics of programming, and

Top-K = 20
Generated Text:
The best way to learn programming is to try, and learn how to be productive of what you are doing. You need to find a good way. I'll give you three examples:


The "programming" method I'm talking about is how you can write the software as part of a group of people doing something that they love. This is what you do when you are working on a project that is interesting and exciting in a very, very particula
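
The difference between the two filters is easier to see on a small distribution. The following is a minimal sketch, assuming the common textbook definitions of top-k and nucleus (top-p) filtering. It works on a plain dictionary of probabilities rather than on model scores, and the function names and the toy distribution are hypothetical.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(p for _, p in ranked)
    return {token: p / total for token, p in ranked}

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}

# Hypothetical next-token distribution.
toy_probs = {"to": 0.5, "by": 0.2, "through": 0.15, "not": 0.1, "banana": 0.05}

print(top_k_filter(toy_probs, 2))    # always exactly two tokens survive
print(top_p_filter(toy_probs, 0.8))  # as many tokens as needed to cover 80%
```

The practical difference is that top-k always keeps a fixed number of candidates, whereas top-p keeps few candidates when the model is confident and many when it is not.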

### Controlling Repetition
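Repetition is a common issue in text generation. The repetition_penalty parameter helps address this by penalizing tokens that have already appeared in the generated text. A value of 1.0 applies no penalty, and values above 1.0 make repeated tokens progressively less likely. Let’s see how it works: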

In [23]:
prompt = "Once upon a time, there was a"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate text with different repetition penalties
penalties = [1.0, 1.2, 1.5, 2.0]
print(f"Prompt: {prompt}")
for penalty in penalties:
    print()
    print(f"Repetition penalty: {penalty}")
    output = model.generate(
        **inputs,
        max_length=100,
        num_return_sequences=1,
        temperature=0.3,
        top_k=50,
        top_p=1.0,
        repetition_penalty=penalty,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
    print("Generated Text:")
    print(generated_text)

Prompt: Once upon a time, there was a

Repetition penalty: 1.0
Generated Text:
Once upon a time, there was a great deal of confusion and confusion about the nature of the "new" world. The first thing to understand is that the world is not the same as it was before. The world is not the same as it was before. The world is not the same as it was before. The world is not the same as it was before.

The first thing to understand is that the world is not the same as it was before. The world is not the

Repetition penalty: 1.2
Generated Text:
Once upon a time, there was a great deal of confusion about what to do with the information.
A lot of people thought that it would be better if we just let them know when they are ready for their next round and then I can tell you how much more fun this is! It's been so long since my last tournament where everyone had played in one place but now everybody has got an opportunity to play against each other at some point during our run together as well!! T
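
The effect of the penalty on individual scores can be sketched in a few lines. This is an illustration, not the library's code. It assumes the commonly described rule in which the score of every token that has already appeared is divided by the penalty when positive and multiplied by it when negative, so in both cases the token becomes less likely. The function name and the numbers are hypothetical.

```python
def apply_repetition_penalty(logits, generated_ids, penalty):
    """Return a copy of logits with already-seen token ids made less likely."""
    adjusted = list(logits)
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        # Dividing a positive score or multiplying a negative one both lower it.
        adjusted[token_id] = score / penalty if score > 0 else score * penalty
    return adjusted

toy_logits = [3.0, 1.0, -0.5, 2.0]  # hypothetical scores for token ids 0..3
already_generated = [0, 2, 0]       # ids 0 and 2 have appeared so far

for penalty in [1.0, 1.2, 1.5, 2.0]:
    print(penalty, apply_repetition_penalty(toy_logits, already_generated, penalty))
```

With a penalty of 2.0, token 0 drops from the clear favorite (3.0) to below token 3 (2.0). That matches the behavior in the outputs above, where a higher penalty pushes the model away from phrases it has already produced.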

### Greedy Decoding and Sampling
The do_sample parameter controls whether the model uses sampling (probabilistic selection of tokens) or greedy decoding (always selecting the most probable token). Let’s compare these approaches:

In [24]:
prompt = "The secret to happiness is"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate text with greedy decoding vs. sampling
print(f"Prompt: {prompt}\n")
print("Greedy Decoding (do_sample=False):")
# Greedy decoding ignores temperature, top_k, and top_p, so they are omitted here
output = model.generate(
    **inputs,
    max_length=100,
    num_return_sequences=1,
    repetition_penalty=1.0,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print("Generated Text:")
print(generated_text)
print()
print("Sampling (do_sample=True):")
output = model.generate(
    **inputs,
    max_length=100,
    num_return_sequences=1,
    temperature=1.0,
    top_k=50,
    top_p=1.0,
    repetition_penalty=1.0,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print("Generated Text:")
print(generated_text)

Prompt: The secret to happiness is

Greedy Decoding (do_sample=False):
Generated Text:
The secret to happiness is to be happy.

The secret to happiness is to be happy.

The secret to happiness is to be happy.

The secret to happiness is to be happy.

The secret to happiness is to be happy.

The secret to happiness is to be happy.

The secret to happiness is to be happy.

The secret to happiness is to be happy.

The secret to happiness is to be happy.

The

Sampling (do_sample=True):
Generated Text:
The secret to happiness is to keep yourself grounded. "

Kirk's wife and son were both married.

While his parents would keep going, Kirk would go to the local library. As a young man, he would go to live and study at the College of William and Mary. His father and grandfather would stay in the family home. Sometimes, Kirk would find Kirk's home to study.

On April 23, 1820, Kirk sent his son to play on a library
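
The two strategies differ only in how a single token is chosen from the model's probabilities at each step. The sketch below illustrates that one step on a made-up distribution; the function names are hypothetical and no model is involved.

```python
import random

def greedy_pick(probs):
    """Return the single most probable token."""
    return max(probs, key=probs.get)

def sample_pick(probs, rng):
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Hypothetical next-token distribution.
toy_probs = {"to": 0.5, "a": 0.3, "not": 0.2}

print([greedy_pick(toy_probs) for _ in range(5)])  # the same token every time

rng = random.Random(0)  # fixed seed so the draws are repeatable
print([sample_pick(toy_probs, rng) for _ in range(5)])
```

Because greedy decoding makes the same choice whenever it sees the same context, it can fall into a loop, as in the output above. Sampling avoids the loop at the cost of run-to-run variation. With the real model, fixing the random seed before calling generate (for example with torch.manual_seed) should make sampled outputs repeatable on the same setup.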


### Learning Through Prompts

In [7]:
from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch

# Load GPT-2 small (about 124M parameters)
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')
model.eval()


GPT2LMHeadModel(
  (transformer): GPT2Model(
    (wte): Embedding(50257, 768)
    (wpe): Embedding(1024, 768)
    (drop): Dropout(p=0.1, inplace=False)
    (h): ModuleList(
      (0-11): 12 x GPT2Block(
        (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (attn): GPT2Attention(
          (c_attn): Conv1D()
          (c_proj): Conv1D()
          (attn_dropout): Dropout(p=0.1, inplace=False)
          (resid_dropout): Dropout(p=0.1, inplace=False)
        )
        (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
        (mlp): GPT2MLP(
          (c_fc): Conv1D()
          (c_proj): Conv1D()
          (act): NewGELUActivation()
          (dropout): Dropout(p=0.1, inplace=False)
        )
      )
    )
    (ln_f): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
  )
  (lm_head): Linear(in_features=768, out_features=50257, bias=False)
)
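
The printout lists the layers but not the total number of parameters. As a rough check, the count can be worked out from the printed sizes. The sketch below makes two assumptions that the printout does not show: the Conv1D layers have the standard GPT-2 shapes (a fused query/key/value projection of width 3 × 768 and a feed-forward layer of width 4 × 768), and lm_head shares its weights with the token embedding, so it is not counted separately.

```python
vocab_size, n_positions, d_model, n_layers = 50257, 1024, 768, 12

# Embeddings, read directly from the (wte) and (wpe) lines of the printout.
embeddings = vocab_size * d_model + n_positions * d_model

def dense(n_in, n_out):
    """Weights plus biases of one fully connected projection."""
    return n_in * n_out + n_out

def layer_norm(width):
    """One scale and one shift value per feature."""
    return 2 * width

# Assumed Conv1D shapes: fused query/key/value projection (3 * d_model)
# and a feed-forward layer four times wider than the model (4 * d_model).
attention = dense(d_model, 3 * d_model) + dense(d_model, d_model)
mlp = dense(d_model, 4 * d_model) + dense(4 * d_model, d_model)
block = layer_norm(d_model) + attention + layer_norm(d_model) + mlp

# lm_head is left out on the assumption that it shares its weights with wte.
total = embeddings + n_layers * block + layer_norm(d_model)
print(f"{total:,}")  # prints 124,439,808
```

Under these assumptions the smallest GPT-2 checkpoint has roughly 124 million parameters, about a third of which sit in the token embedding table.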

In [8]:
### Helper function: generate text from a prompt

In [9]:
def generate_text(prompt, max_length=50):
    # Encode the prompt; the tokenizer also returns the attention mask
    inputs = tokenizer(prompt, return_tensors='pt')

    # Generate output (max_length counts the prompt tokens as well)
    outputs = model.generate(**inputs, max_length=max_length,
                              do_sample=True,  # sample instead of always taking the top token
                              top_k=50,        # keep only the 50 most probable tokens
                              top_p=0.95,      # nucleus sampling threshold
                              temperature=0.7, # below 1.0: less random output
                              pad_token_id=tokenizer.eos_token_id)

    # Decode only the newly generated tokens, skipping the prompt tokens
    prompt_length = inputs['input_ids'].shape[1]
    text = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
    print(text)


**1. Zero-shot Example**

In [11]:
prompt = "Translate English to French:\nHello"
print("Zero-shot Output:")
generate_text(prompt)

Zero-shot Output:
, I'm John.

Thank you for visiting my blog. I know it's very difficult and hard to write for people who are not fluent in English. I've tried to write English for others,


**2. One-shot Example**

In [12]:
prompt = """Translate English to French:
Good morning -> Bonjour
Hello ->"""
print("One-shot Output:")
generate_text(prompt)

One-shot Output:
 Bonjour
Hello -> Bonjour
Hello -> Bonjour
Hello -> Bonjour
Hello -> Bonjour
Hello -> Bonjour
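
The one-shot output contains the answer on its first line and then keeps repeating the pattern until the length limit is reached. One simple way to clean this up is to keep only the first non-empty line of the continuation. The helper below is a hypothetical sketch that works on a plain string. It assumes you have the generated continuation as text, which would require generate_text to return the string instead of printing it.

```python
def first_answer(continuation):
    """Return the first non-empty line of a generated continuation."""
    for line in continuation.splitlines():
        if line.strip():
            return line.strip()
    return ""

# The start of the one-shot continuation shown above, with its repeated lines.
raw = " Bonjour\nHello -> Bonjour\nHello -> Bonjour"
print(first_answer(raw))  # prints "Bonjour"
```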



**3. Few-shot Example**

In [16]:
prompt = """Translate English to French:
Good morning -> Bonjour
Good night -> Bonne nuit
How are you -> Comment ça va
nice day ->"""
print("Few-shot Output:")
generate_text(prompt)


Few-shot Output:
 Béat
Good night -> Béat nice day -> Bonne


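GPT-2 is unreliable at zero-shot prompting because it is a plain language model that was never trained to follow instructions. It simply continues the text, as the blog-style zero-shot output above shows.

One-shot and few-shot examples help it pick up the expected "English -> French" format, although a model this small can still produce wrong translations, such as "Béat" above.

You may need to adjust max_length, temperature, top_k, and top_p for better outputs. Keep in mind that max_length counts the prompt tokens too, so a longer few-shot prompt leaves fewer tokens for the answer.

Because sampling is random, running the same prompt several times can sometimes give a better output.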
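
The one-shot and few-shot prompts above share the same layout: an instruction line, one "source -> target" line per example, and a final line that ends with the arrow. If you want to experiment with more examples, a small helper can build such prompts for you. This is a sketch; the function name build_prompt and its arguments are hypothetical and not part of the transformers library.

```python
def build_prompt(instruction, examples, query):
    """Lay out an instruction, worked examples, and an open query, one per line."""
    lines = [instruction]
    lines += [f"{source} -> {target}" for source, target in examples]
    lines.append(f"{query} ->")
    return "\n".join(lines)

examples = [
    ("Good morning", "Bonjour"),
    ("Good night", "Bonne nuit"),
    ("How are you", "Comment ça va"),
]

# Reproduces the few-shot prompt used above.
print(build_prompt("Translate English to French:", examples, "nice day"))
```

Passing examples[:1] with the query "Hello" reproduces the one-shot prompt. With an empty list the helper still appends the trailing arrow, which differs slightly from the zero-shot prompt used above.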