https://huggingface.co/docs/transformers/generation_strategies

In [3]:
from transformers import pipeline
from transformers import set_seed

set_seed(42)

In [5]:
tg_pipeline = pipeline("text-generation", model="distilbert/distilgpt2", device=0)

tg_pipeline

<transformers.pipelines.text_generation.TextGenerationPipeline at 0x780838eb3220>

Note that many input parameters are available to configure and control text generation:

https://huggingface.co/docs/transformers/main_classes/text_generation

In [4]:
text = "In a world where dreams become reality"

answer = tg_pipeline(text, pad_token_id=tg_pipeline.tokenizer.eos_token_id)

print(answer)

[{'generated_text': 'In a world where dreams become reality, the question that everyone who makes dreams are a natural reality is, why do dreams not matter?\n\nWhat causes all this?\nSome might think there is such a big shift in the way we perceive reality'}]


In [6]:
text = "In a universe where every idea takes shape"

answer = tg_pipeline(
    text,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    min_new_tokens=100,
    max_new_tokens=500
)

print(answer[0]['generated_text'])

In a universe where every idea takes shape.
The author of this essay, a bestseller, talks about his experiences with spaceflight, the origins of cosmonauts and the future of spaceflight. He writes about his experiences on his blog, ‏Spaceflight and the Spaceflight Hypersonic, and on his personal blog, Space-Spaceflight.com.
In this interview this year, NASA‏s James Webb Space Telescope (SETI) is preparing a new project called the Deep Space Odyssey, the fourth of the Hubble Space Telescope, that will be launched on July 1. It will orbit close to Jupiter on a flyby of the spacecraft‏s (nearly four times the diameter of NASA Earth‏s diameter and a third of its diameter) orbit around Jupiter.
The project is the largest spacecraft ever conceived that will orbit Jupiter through the orbit of Jupiter in the constellation of Jupiter. It will orbit the sun around Jupiter at around 7,140 kilometers. The astronauts are already aboard the EOS-23D (Galileo-class rocket), or ELIZABETH-5S (EOS-27M in

In [10]:
text = "In a universe where every idea takes shape"

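answer = tg_pipeline(
    text,
    # stop_strings works on decoded text, so generate() needs the tokenizer
    tokenizer=tg_pipeline.tokenizer,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    max_new_tokens=80,
    # Generation halts once the output ends with any of these strings
    stop_strings=["evolve", "universe", "it"]
)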

print(answer[0]['generated_text'])

In a universe where every idea takes shape and shape you just don't know anymore. So this is my journey of finding something new.

What do you think of your thoughts and answers to this issue? Let me know in the comments below or in the discussion on Reddit


In [11]:
text = "In a world sculpted by visionary ideas"

answers = tg_pipeline(
    text,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    max_new_tokens=80,
    num_return_sequences = 5
)

for answer in answers:
    print(answer['generated_text'])
    print('-'*80)

In a world sculpted by visionary ideas such as a new space station, these designs have been presented at the United Nations and are considered to be the best way to explore this amazing world in unprecedented detail, drawing extraordinary world-class artists and artists.



The creation of this incredible space station, the first and most impressive of its kind ever, will be a massive project of the United Nations Mission in New York City – and
--------------------------------------------------------------------------------
In a world sculpted by visionary ideas in the form of real-world design, sculpts work with artists from across the globe to share their creations to celebrate and celebrate their own creations within the city. We look forward to meeting you in the evening.


More from Global Design:
"The artist in Paris, Dijon, has commissioned some of the most striking sculpts in Paris including Luskin's '70
--------------------------------------------------------------------------

## Decoding Strategies in Text Generation

Decoding strategies in text generation models refer to the methods used to generate text from the output probabilities produced by models like GPT, BART, or other transformer-based architectures. These strategies determine how the next word (or token) is selected during the text generation process, which can significantly influence the quality, coherence, and creativity of the generated text.

#### Greedy Search

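How It Works: In greedy search, the model selects the token with the highest probability at each step. This approach is simple and deterministic, but it often yields suboptimal sequences because it never weighs how the current choice affects later tokens: a slightly less likely token now may open the way to a much more likely continuation. Note that the text-generation pipeline may enable sampling by default for some checkpoints, so passing do_sample=False together with num_beams=1 is the explicit way to request greedy decoding.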
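
To make the weakness of greedy search concrete, here is a minimal sketch on a toy next-token table. The table, its probabilities, and the `greedy_decode` helper are all hypothetical and chosen only for illustration; no real model is involved.

```python
# Toy next-token table (hypothetical numbers, not taken from a real model).
# It maps the previous token to a probability distribution over the next token.
NEXT_TOKEN_PROBS = {
    "<s>": {"the": 0.5, "a": 0.4, "<eos>": 0.1},
    "the": {"cat": 0.4, "dog": 0.35, "<eos>": 0.25},
    "a": {"cat": 0.9, "dog": 0.05, "<eos>": 0.05},
    "cat": {"<eos>": 1.0},
    "dog": {"<eos>": 1.0},
}


def greedy_decode(start="<s>", max_new_tokens=5):
    """Always append the single most probable next token."""
    tokens = [start]
    for _ in range(max_new_tokens):
        probs = NEXT_TOKEN_PROBS[tokens[-1]]
        next_token = max(probs, key=probs.get)
        tokens.append(next_token)
        if next_token == "<eos>":
            break
    return tokens


greedy_tokens = greedy_decode()
print(greedy_tokens)  # ['<s>', 'the', 'cat', '<eos>']
```

In this toy table the greedy path "the cat" has probability 0.5 * 0.4 = 0.2, while "a cat" has 0.4 * 0.9 = 0.36. Greedy search misses the better sequence because "a" loses the very first comparison.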



In [12]:
text = "I went to the office one day"

answer = tg_pipeline(
    text,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    max_new_tokens=100
)

print(answer[0]['generated_text'])

I went to the office one day and came up with a new plan: to bring the house out of the closet, and then walk away with a couple small items we have left. That‖‖ means just sitting down and eating, or sitting down and eating.  This morning at the restaurant I received an EMT, and so was my buddy, a guy from this small town that really cared about me.  My mom made my first trip to his hotel two weeks ago and gave me my first meal.


#### Beam Search
How It Works: Beam Search keeps track of multiple possible sequences (beams) at each step, rather than just the single best one. It explores a fixed number of top candidates (beam width) and expands them simultaneously, eventually selecting the sequence with the highest overall score.
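
The bookkeeping described above can be sketched in a few lines. This is a simplified illustration on a hypothetical toy table, assuming plain summed log-probabilities as the score; real implementations also apply details such as length penalties, which are left out here.

```python
import math

# Toy next-token table (hypothetical numbers, not taken from a real model).
NEXT_TOKEN_PROBS = {
    "<s>": {"the": 0.5, "a": 0.4, "<eos>": 0.1},
    "the": {"cat": 0.4, "dog": 0.35, "<eos>": 0.25},
    "a": {"cat": 0.9, "dog": 0.05, "<eos>": 0.05},
    "cat": {"<eos>": 1.0},
    "dog": {"<eos>": 1.0},
}


def beam_search(num_beams=2, max_new_tokens=5):
    """Keep the num_beams best partial sequences at every step."""
    # Each beam is a pair: (cumulative log-probability, token list).
    beams = [(0.0, ["<s>"])]
    for _ in range(max_new_tokens):
        candidates = []
        for score, tokens in beams:
            if tokens[-1] == "<eos>":
                # Finished beams are carried over unchanged.
                candidates.append((score, tokens))
                continue
            for token, p in NEXT_TOKEN_PROBS[tokens[-1]].items():
                candidates.append((score + math.log(p), tokens + [token]))
        # Prune to the best num_beams candidates.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:num_beams]
        if all(tokens[-1] == "<eos>" for _, tokens in beams):
            break
    return beams


best_score, best_tokens = beam_search(num_beams=2)[0]
print(best_tokens)  # ['<s>', 'a', 'cat', '<eos>']
```

With two beams the search keeps "a" alive after the first step and ends up with the sequence of probability 0.36. With `num_beams=1` the same function reduces to greedy search and returns the 0.2 sequence.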

In [13]:
text = "I went to the office one day"

answer = tg_pipeline(
    text,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    max_new_tokens=100,
    num_beams=4
)

print(answer[0]['generated_text'])

I went to the office one day and the next he said, 'I don't know what to do.' He said, 'I don't know what to do.' He said, 'I don't know what to do.' He said, 'I don't know what to do.' He said, 'I don't know what to do.' He said, 'I don't know what to do.' He said, 'I don't know what to do.' He said, 'I don't know what to do.' He


In [14]:
text = "I went to the office one day"

answer = tg_pipeline(
    text,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    max_new_tokens=100,
    num_beams=5,
    no_repeat_ngram_size=2
)

print(answer[0]['generated_text'])

I went to the office one day.

“I’m not going to tell you what happened,” he said. “But I don't know if it was because I didn't want to go to that office. I wanted to get out there and get a job, and I thought I would be able to do it.‡
The next day, he went back to his office and got a phone call from a friend. He said, ‪Hey, I‪m sorry.


In [16]:
text = "I went to the office one day"

answer = tg_pipeline(
    text,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    max_new_tokens=100,
    num_beams=5,
    no_repeat_ngram_size=4
)

print(answer[0]['generated_text'])

I went to the office one day and they said, 'We're not going to do this. We're going to do it. We have to do it.' I said, 'Well, I'm going to do that.' I said to them, 'You know, I'm not going to be doing this.' They said, 'No, I don't want to do this.' And then I said, "You know, we have to do this.'" And they said, "We have to do that."





In [21]:
text = "I went to the office one day"

answer = tg_pipeline(
    text,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    max_new_tokens=200,
    num_beams=5,
    repetition_penalty=2.0 # Default is 1.0 which means no penalty
)

print(answer[0]['generated_text'])

I went to the office one day and said, 'You've got to go back to work.' "

#### Multinomial Sampling
As opposed to greedy search that always chooses a token with the highest probability as the next token, multinomial sampling (also called ancestral sampling) randomly selects the next token based on the probability distribution over the entire vocabulary given by the model. Every token with a non-zero probability has a chance of being selected, thus reducing the risk of repetition.
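
The contrast with greedy search can be shown on a single decoding step. The sketch below uses a made-up three-token distribution and the standard library's `random.choices`; the `sample_token` helper is a hypothetical name, not part of transformers.

```python
import random
from collections import Counter

# Hypothetical next-token distribution for one decoding step.
probs = {"cat": 0.6, "dog": 0.3, "bird": 0.1}


def sample_token(probs, rng):
    """Draw one token in proportion to its probability."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the run is reproducible
counts = Counter(sample_token(probs, rng) for _ in range(10_000))

greedy_choice = max(probs, key=probs.get)  # what greedy search would pick
print("greedy:", greedy_choice)
print("sampled:", counts.most_common())
```

Greedy search returns "cat" every time, whereas sampling also returns "dog" and "bird" at roughly the rates their probabilities suggest.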

In [17]:
text = "I went to the office one day"

answer = tg_pipeline(
    text,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    max_new_tokens=100,
    do_sample=True,
    num_beams=1
)

print(answer[0]['generated_text'])

I went to the office one day so she didn't have to work long,” he said, before he walked back in and told her he was going to leave and he turned around, and he said: “I don't know and I don't think what goes wrong in this, but at least he'll be OK.”



The lawyer will be interviewed by ABC News later this week.

Originally published as Why the Liberals are so committed to changing the law


#### Diverse beam search decoding

The diverse beam search decoding strategy is an extension of the beam search strategy that allows for generating a more diverse set of beam sequences to choose from. This approach has three main parameters: num_beams, num_beam_groups, and diversity_penalty. The diversity penalty ensures the outputs are distinct across groups, and beam search is used within each group.
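
The effect of the diversity penalty can be illustrated with a simplified, single-step sketch. It assumes one beam per group (as when num_beams equals num_beam_groups) and a Hamming-style penalty that subtracts a fixed amount for every earlier group that already chose a token. The function name and the numbers are hypothetical, and the library's own implementation works on batched tensors and may differ in detail.

```python
import math


def pick_tokens_per_group(log_probs, num_groups, diversity_penalty):
    """One decoding step: groups choose in order, later groups are penalized
    for repeating tokens that earlier groups already selected."""
    chosen = []
    for _ in range(num_groups):
        penalized = {
            token: lp - diversity_penalty * chosen.count(token)
            for token, lp in log_probs.items()
        }
        chosen.append(max(penalized, key=penalized.get))
    return chosen


# Hypothetical next-token distribution shared by all groups at this step.
log_probs = {t: math.log(p) for t, p in {"and": 0.5, "to": 0.3, "when": 0.2}.items()}

no_penalty = pick_tokens_per_group(log_probs, num_groups=3, diversity_penalty=0.0)
with_penalty = pick_tokens_per_group(log_probs, num_groups=3, diversity_penalty=2.0)
print(no_penalty)    # ['and', 'and', 'and']
print(with_penalty)  # ['and', 'to', 'when']
```

Without a penalty every group makes the same choice; with a penalty of 2.0 each group is pushed toward a token the others have not used yet.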

In [26]:
text = "I went to the office one day"

answer = tg_pipeline(
    text,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    max_new_tokens=100,
    do_sample=False,
    num_beams=5,
    num_beam_groups=5,
    diversity_penalty=2.0
)

print(answer[0]['generated_text'])

I went to the office one day and I was like, 'Oh, I'm going to go to the office one day and I'm going to go to the office one day and I'm going to go to the office one day and I'm going to go to the office one day and I'm going to go to the office one day and I'm going to go to the office one day and I'm going to go to the office one day and I'm going to go to the office one day and I'm going to


#### Temperature:

A hyperparameter that controls the randomness of predictions by rescaling the model's logits before the softmax: values below 1.0 sharpen the distribution and make outputs more deterministic, while values above 1.0 flatten it and introduce more variability.
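
As a rough illustration of what temperature does to a distribution, the sketch below divides a set of made-up logits by the temperature before applying a softmax. The helper name and the logits are assumptions for the example only.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature, then normalize with a softmax."""
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three tokens
for t in (0.1, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a temperature of 0.1 almost all of the probability mass sits on the top token, which behaves much like greedy search; at 1.5 the mass spreads out and unlikely tokens are drawn more often.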

#### Top-K Sampling:

A decoding strategy that restricts sampling to the K most probable tokens, so that randomness is preserved but low-probability tokens can never be chosen. Setting top_k=0 disables this filter.
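
A minimal sketch of the filtering step, on a made-up distribution. The `top_k_filter` helper is hypothetical; treating `k <= 0` as "no filtering" is a convenience that mirrors how top_k=0 is used in the calls in this notebook.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    if k <= 0:
        return dict(probs)  # filter disabled
    top = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]
    total = sum(p for _, p in top)
    return {token: p / total for token, p in top}


# Hypothetical next-token distribution.
probs = {"art": 0.5, "design": 0.2, "ideas": 0.15, "stone": 0.1, "zebra": 0.05}

filtered = top_k_filter(probs, k=2)
print(filtered)
```

Only "art" and "design" survive with k=2, and their probabilities are rescaled to sum to 1 before a token is sampled.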

#### Top-p (Nucleus) Sampling:

A decoding strategy that selects tokens from the smallest set whose cumulative probability exceeds a threshold p, dynamically adjusting the number of considered tokens based on the context.
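
The "dynamic" part is easiest to see by applying the same threshold to a peaked and a flat distribution. The sketch below is an illustration with made-up numbers; the `top_p_filter` helper is hypothetical, and it assumes a token is kept as soon as the cumulative probability reaches p.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}


# Two hypothetical next-token distributions.
peaked = {"art": 0.9, "design": 0.05, "ideas": 0.03, "stone": 0.02}
flat = {"art": 0.3, "design": 0.3, "ideas": 0.2, "stone": 0.2}

peaked_kept = top_p_filter(peaked, p=0.7)
flat_kept = top_p_filter(flat, p=0.7)
print(len(peaked_kept), len(flat_kept))  # 1 3
```

When the model is confident, a single token already covers the threshold; when it is uncertain, several tokens are needed, so the candidate set grows with the model's uncertainty.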

In [28]:
text = "In a world sculpted by visionary ideas"

answer = tg_pipeline(
    text,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.1, # Default 1.0
    top_k=0
)

print(answer[0]['generated_text'])

In a world sculpted by visionary ideas, the world of the sculptor is now a world of art.




The world of the sculptor is now a world of art.
The world of the sculptor is now a world of art.
The world of the sculptor is now a world of art.
The world of the sculptor is now a world of art.
The world of the sculptor is now a world of art.
The world of the sculptor is now a world of


In [29]:
text = "In a world sculpted by visionary ideas"

answer = tg_pipeline(
    text,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    max_new_tokens=100,
    do_sample=True,
    temperature=1.0,
    top_k=0
)


print(answer[0]['generated_text'])

In a world sculpted by visionary ideas (‪ Fassey‪, ‪ Frondale‪, ‪ Moriarty‪, ‪ Amonesk), my boy is revealed to be a sculptor who is disguised as a soldier and dishonoring his soldier. UFO-2017-05-17

‪ Destrayal Aeronautics**
Why We created this! The art is Kake Ikaduro, an anthropomorphic crew of Space Company Dragonkids. Throughout no form


In [30]:
text = "In a world sculpted by visionary ideas"

answer = tg_pipeline(
    text,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    max_new_tokens=100,
    do_sample=True,
    temperature=1.5,
    top_k=0
)


print(answer[0]['generated_text'])

In a world sculpted by visionary ideas for curvitation, Baq hung his Surhaym in Maharashtra in whatever major contests tot …Besides in way, and or LOC Uttar Pradesh astronaut Persepramine Naher — Pozanne taken to Ethiopia on Monday nude and Beiator Afre Factors Hong pop quin stuck ill The poleAli extra sails Hot Quarter Round: Saddmen Say SheltersProject Nobel freakin powers Ex ingredient failing Mahmoud Hada recently cracked Hurricane Harriet Tie consulate swimming Invaining board hullflats Ship takes cailing iceberg


In [34]:
text = "In a world sculpted by visionary ideas"

answer = tg_pipeline(
    text,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    max_new_tokens=100,
    do_sample=True,
    top_k=10 # Default value is 50
)

print(answer[0]['generated_text'])

In a world sculpted by visionary ideas, the world of the artist is now a world of art.

In [36]:
text = "In a world sculpted by visionary ideas"

answer = tg_pipeline(
    text,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    max_new_tokens=100,
    do_sample=True,
    top_k=100
)

print(answer[0]['generated_text'])

In a world sculpted by visionary ideas and a young body of talented artmen, he took inspiration from those of Charles Darwin as he made great strides to realize where those artists were today. One of the challenges of exploring ideas and ideas in a creative school were the development of a larger space. Most of the efforts developed by artist-invented artists in the early 1900s were in the home of Frank S. S. Smith, Charles Johnson, and John Blake. The schools were small, yet also large, and they received some


In [37]:
text = "In a world sculpted by visionary ideas"

print(text)

answer = tg_pipeline(
    text,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    max_new_tokens=100,
    do_sample=True,
    top_p = 0.5 # Default value is 1
)

print(answer[0]['generated_text'])

In a world sculpted by visionary ideas
In a world sculpted by visionary ideas, the sculptor, artist and sculptor is an artist and sculptor who has been the architect of the world since the beginning of the 20th century. He has worked on many of the world's most popular sculptures and sculptors, including the first World War II-era World War II World War II-era World War II-era World War II-era World War II-era World War II-era World War II-era World War II-era World War II-era World


In [6]:
text = "In a world sculpted by visionary ideas"

print(text)

answer = tg_pipeline(
    text,
    pad_token_id=tg_pipeline.tokenizer.eos_token_id,
    max_new_tokens=100,
    do_sample=True,
    top_p = 0.80
)

print(answer[0]['generated_text'])

In a world sculpted by visionary ideas
In a world sculpted by visionary ideas, a new exhibition in the Centre for Art Studies at the University of Chicago is launching its first exhibition of the "Growth of Art: Art in a World".



The exhibition will take place at the University of Chicago's Center for Art Studies. The exhibition will be presented in conjunction with the American Library Association (ALA) and the International Centre for Art Studies, the American Institute of Art in New York, and the American Society of Art in London.
The exhibition was launched


### Streaming support

You can use the TextStreamer class to stream the output of generate() to your screen, one word at a time.
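
Conceptually, a streamer is an object that receives pieces of output through a put() method while generation is running and is told that generation is over through end(). The sketch below imitates that protocol in plain Python. `ListStreamer` and `fake_generate` are hypothetical stand-ins that pass ready-made strings, whereas a real streamer receives token ids and decodes them.

```python
class ListStreamer:
    """Hypothetical stand-in for a streamer: prints and records each piece."""

    def __init__(self):
        self.pieces = []
        self.finished = False

    def put(self, piece):
        # Called once per piece, as soon as it is available.
        print(piece, end="", flush=True)
        self.pieces.append(piece)

    def end(self):
        # Called once, after the last piece.
        print()
        self.finished = True


def fake_generate(prompt, continuation, streamer):
    """Imitates a generation loop that pushes its output to a streamer."""
    streamer.put(prompt)
    for piece in continuation:
        streamer.put(piece)
    streamer.end()


streamer = ListStreamer()
fake_generate("An increasing sequence: one,", [" two,", " three,", " four,"], streamer)
```

Because the consumer only has to implement put() and end(), the same loop could feed a console, a queue, or a web socket without changes.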

In [38]:
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

In [39]:
tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")

model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")

In [40]:
inputs = tokenizer(["An increasing sequence: one,"], return_tensors="pt")

streamer = TextStreamer(tokenizer)

In [43]:
_ = model.generate(**inputs, streamer=streamer, max_new_tokens=30)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


An increasing sequence: one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen,
