# Text generation

Now let’s see how to use a **`pipeline`** to generate some text. The main idea here is that you *provide a prompt* and the *model will auto-complete* it by generating the remaining text.

This is similar to the **predictive text feature** that is found on many phones. Text generation involves randomness, so it’s normal if you don’t get the same results as shown below.
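To make the randomness concrete, here is a toy sketch of sampling in pure Python. It is not what the pipeline does internally: the word list and probabilities in `NEXT_WORD_PROBS` are made up for illustration, and `sample_words` is a hypothetical helper. It only shows that sampling from a probability distribution gives different results on each run unless the random seed is fixed.

```python
import random

# Hypothetical next-word distribution. A real language model computes a new
# distribution over its whole vocabulary at every generation step.
NEXT_WORD_PROBS = {
    "the": 0.4,
    "a": 0.3,
    "my": 0.2,
    "this": 0.1,
}


def sample_words(n_words, seed=None):
    """Draw n_words words at random, weighted by NEXT_WORD_PROBS."""
    rng = random.Random(seed)  # seed=None means a different stream on each run
    words = list(NEXT_WORD_PROBS)
    weights = list(NEXT_WORD_PROBS.values())
    return rng.choices(words, weights=weights, k=n_words)


print(sample_words(5))           # unseeded: usually changes from run to run
print(sample_words(5, seed=0))   # seeded
print(sample_words(5, seed=0))   # same seed, so the same words as the line above
```

The `transformers` library offers `set_seed` for the same purpose: calling it before generating should make repeated runs on the same setup reproducible.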

In [1]:
from transformers import pipeline

In [2]:
generator = pipeline("text-generation")

No model was supplied, defaulted to openai-community/gpt2 and revision 607a30d (https://huggingface.co/openai-community/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.

Device set to use cpu


In [3]:
generator(["Amit, is a high impact leader", 'sharp in focus', "relentless in drive and always step ahead in AI reasearcher journey"])

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[[{'generated_text': 'Amit, is a high impact leader of the Islamic State group. He is one of the most dangerous fighters in the world. He is the leader of the Islamic State but he is also an expert in the Quran and the Sunnah. He is a member of Al-Qaeda in Iraq."\n\nThe video posted on the Islamic State website showed the group\'s leader, Abu Bakr al-Baghdadi, talking to a group of young men in a car and talking about how he had killed a woman.\n\nThe video depicts the group\'s leader, Abu Bakr al-Baghdadi, speaking to a group of young men in a car and talking about how he was killing a woman in Mosul.\n\nMr Abbas said: "The first thing a journalist in Syria must understand is that Islamic State is an evil force. The group takes part in the most heinous acts, the most heinous of them are the ones that are committed by the terrorists. This is the same thing as the people who killed American servicemen."\n\nThe leader of Abu Bakr al-Baghdadi, Abu Musab al-Zarqawi, also spoke to the media

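You can control how many different sequences are generated with the argument **`num_return_sequences`** and the length of the output with **`max_length`**, which counts tokens and includes the prompt. To limit only the newly generated tokens, use **`max_new_tokens`** instead. When both are set, `max_new_tokens` takes precedence: in the SmolLM2 example in the next section, the pipeline reports `max_new_tokens` (=256), so `max_length=30` has no effect there.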
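The following is a minimal sketch of how these arguments might be passed to the pipeline. The helper names `generate_variants` and `demo`, the prompt `"Hello, I am"`, and the seed value are assumptions made for illustration. The sketch uses `max_new_tokens` rather than `max_length` because the warning shown later in this document reports that `max_new_tokens` takes precedence when both are set.

```python
def generate_variants(generator, prompt, n=2, max_new_tokens=30):
    """Return n continuations of prompt as plain strings.

    generator is expected to behave like a text-generation pipeline:
    called with one prompt, it returns a list of dicts that each hold
    a 'generated_text' entry.
    """
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,  # cap on newly generated tokens only
        num_return_sequences=n,         # how many sequences to return
        do_sample=True,                 # sample instead of greedy decoding, so sequences can differ
    )
    return [item["generated_text"] for item in outputs]


def demo():
    """Run the helper against GPT-2 (downloads the model on first use)."""
    # Imported here so the helper above can be read and tested without the library.
    from transformers import pipeline, set_seed

    set_seed(42)  # arbitrary seed, fixed so repeated runs give the same samples
    generator = pipeline("text-generation", model="openai-community/gpt2")
    for text in generate_variants(generator, "Hello, I am", n=2, max_new_tokens=20):
        print(repr(text))
```

Calling `demo()` in a notebook cell should print two continuations of the prompt. Because the helper takes the pipeline as an argument, the same function works unchanged with any other text-generation model.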

# Using any model from the Hub in a pipeline

In the previous examples you used the default model for the task at hand, but you can also choose a particular model from the Hub to use in a **pipeline** for a specific task, such as **text generation**. Browse the Hugging Face [Model Hub](https://huggingface.co/models?pipeline_tag=text-generation), pick the model that best suits you, and pass its identifier as the `model` argument.

In [4]:
generator = pipeline("text-generation", model="HuggingFaceTB/SmolLM2-360M")

Device set to use cpu


In [6]:
generator(
    "Amit, is a high impact AI reasearcher, sharp in focus, relentless in drive and always step ahead.",
    max_length=30,
    num_return_sequences=1,
)

Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=30) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'Amit, is a high impact AI reasearcher, sharp in focus, relentless in drive and always step ahead. He is a 100% dedicated to his work in developing AI technologies and technologies of the future.\n\nIn this article, Amit has explained to us what exactly is AI and how AI is going to change the future of industries.\n\nWhat is artificial intelligence?\n\nArtificial Intelligence is a term that involves the development of systems that can show some intelligent behavior.\n\nThe term artificial intelligence was first coined by the American author John McCarthy, where the term was first used in 1956 in his paper “I, Robot”.\n\nThe term artificial intelligence was first coined by the American author John McCarthy, where the term was first used in his paper “I, Robot”.\n\nWhat is the difference between machine learning and artificial intelligence?\n\nMachine learning and artificial intelligence differ in the sense that machine learning involves implementing algorithms that c
