In the Hugging Face `transformers` library, a pipeline is a high-level API that lets you run a pretrained machine learning model in a few lines of code, without handling tokenization, model loading, or postprocessing yourself.

A pipeline wraps all the steps needed to perform a task like:
- Text classification (e.g. sentiment analysis)
- Named entity recognition (NER)
- Question answering
- Text generation
- Translation
- Summarization
- Image classification, speech recognition, and more

It handles:
- Preprocessing: Tokenizing or transforming input data
- Model inference: Running the model
- Postprocessing: Converting outputs into human-readable results

To do this, a pipeline bundles:

- A pretrained model
- Its tokenizer (or another preprocessor, such as an image processor, for non-text tasks)
- The task-specific logic (e.g., classification, translation, summarization)
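
The three stages (preprocessing, model inference, postprocessing) can be sketched by hand for the sentiment task. This is a simplified illustration under stated assumptions, not the library's internal implementation: it assumes PyTorch is installed and uses the `distilbert/distilbert-base-uncased-finetuned-sst-2-english` checkpoint; `postprocess` and `classify_manually` are hypothetical helper names.

```python
import math


def postprocess(logits, id2label):
    """Turn raw class logits into a pipeline-style result with a softmax score."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return [{"label": id2label[best], "score": probs[best]}]


def classify_manually(text, checkpoint="distilbert/distilbert-base-uncased-finetuned-sst-2-english"):
    """Run tokenization, inference, and postprocessing as three explicit steps."""
    # Imported here so that postprocess() can be used without loading the model stack.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

    inputs = tokenizer(text, return_tensors="pt")      # 1. preprocessing
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()    # 2. model inference
    return postprocess(logits, model.config.id2label)  # 3. postprocessing
```

Calling `classify_manually('That is a wonderful day!')` should return a list with one `label`/`score` dictionary, the same shape a sentiment pipeline produces.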

Why Use It?
- **Beginner-friendly**: No need to understand model internals
- **Fast prototyping**: Great for demos, experiments, and testing
- **Customizable**: You can plug in any compatible model from the Hugging Face Hub, or your own fine-tuned checkpoint

In [1]:
from transformers import pipeline

## Text classification

In [2]:
classifier = pipeline('sentiment-analysis')

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use mps:0
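
The warning above recommends naming the model and revision explicitly. A minimal sketch of doing so, reusing the model name and revision hash printed in the log; `PINNED` and `load_pinned_classifier` are hypothetical names introduced for illustration.

```python
# Model name and revision copied from the log message above.
PINNED = {
    "task": "sentiment-analysis",
    "model": "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "714eb0f",
}


def load_pinned_classifier():
    """Build the classifier from an explicit model and revision instead of the task default."""
    from transformers import pipeline  # imported lazily; downloads the model on first use

    return pipeline(**PINNED)
```

Pinning both values means the notebook keeps loading the same weights even if the task default changes in a later library release.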


In [4]:
classifier('That is a wonderful day!')

[{'label': 'POSITIVE', 'score': 0.9998867511749268}]

In [6]:
# Chinese input (a crude insult). The default model was fine-tuned on English text only.
classifier('你真是个恶心的臭婊子')

[{'label': 'POSITIVE', 'score': 0.5297845602035522}]

In [7]:
# Spanish input (a crude insult). The default model was fine-tuned on English text only.
classifier('eres una perra asquerosa.')

[{'label': 'NEGATIVE', 'score': 0.9278761148452759}]
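
The default checkpoint is fine-tuned on English (SST-2), so its output on Chinese or Spanish text should not be trusted; the 0.53 score above is barely better than a coin flip. One hedged way to surface such cases is to flag low-confidence predictions. The helper name and the 0.75 threshold are arbitrary assumptions, not part of the library.

```python
def flag_uncertain(results, threshold=0.75):
    """Add an 'uncertain' flag to each pipeline-style result whose score is below threshold."""
    return [{**r, "uncertain": r["score"] < threshold} for r in results]


# Labels and scores copied from the three outputs above.
outputs = [
    {"label": "POSITIVE", "score": 0.9998867511749268},
    {"label": "POSITIVE", "score": 0.5297845602035522},
    {"label": "NEGATIVE", "score": 0.9278761148452759},
]

for item in flag_uncertain(outputs):
    status = "uncertain" if item["uncertain"] else "confident"
    print(item["label"], round(item["score"], 3), status)
```

A high score does not show that the model understands the language; the threshold only catches the obvious cases, so a multilingual model is the more robust choice for non-English input.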

## Reading comprehension

In [8]:
question_answer = pipeline('question-answering')

No model was supplied, defaulted to distilbert/distilbert-base-cased-distilled-squad and revision 564e9b5 (https://huggingface.co/distilbert/distilbert-base-cased-distilled-squad).
Using a pipeline without specifying a model name and revision in production is not recommended.

Device set to use mps:0


In [31]:
context = """Napoleon P, born in June 3rd, 1765, was a shoe maker in Moscow. His parents sent him to school while he was 6 years old.
But he could not study well and were obssessed with mechanics brought from Lain America.
He traveled to China at age 32 and learnt how to cook Kun Pao Chicken.
When he returns to Europe, it was the revolution time and he devoted himself to the rebellion in Hugary."""

In [32]:
question_answer(
    question='When was Napoleon born?',
    context=context,
)

{'score': 0.9794514775276184,
 'start': 20,
 'end': 34,
 'answer': 'June 3rd, 1765'}

In [33]:
question_answer(
    question="Did Napolean study well at school?",
    context=context,
)

{'score': 0.28134778141975403,
 'start': 128,
 'end': 148,
 'answer': 'could not study well'}

In [34]:
question_answer(
    question="What did Napolean do in China?",
    context=context,
)

{'score': 0.5448819398880005,
 'start': 259,
 'end': 279,
 'answer': 'cook Kun Pao Chicken'}
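
The `start` and `end` fields are character offsets into `context`, so slicing the context with them reproduces `answer`. A quick check using the same context and the offsets reported above (no model is needed for this):

```python
context = """Napoleon P, born in June 3rd, 1765, was a shoe maker in Moscow. His parents sent him to school while he was 6 years old.
But he could not study well and were obssessed with mechanics brought from Lain America.
He traveled to China at age 32 and learnt how to cook Kun Pao Chicken.
When he returns to Europe, it was the revolution time and he devoted himself to the rebellion in Hugary."""

# (start, end, answer) triples copied from the three outputs above.
reported = [
    (20, 34, "June 3rd, 1765"),
    (128, 148, "could not study well"),
    (259, 279, "cook Kun Pao Chicken"),
]

for start, end, answer in reported:
    # `end` is exclusive, as in ordinary Python slicing.
    assert context[start:end] == answer
    print(f"context[{start}:{end}] -> {context[start:end]!r}")
```

This also shows that the model is extractive: it selects a span of the context rather than writing a new sentence, which is why the yes/no question gets a phrase as its answer.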

## Fill mask

In [35]:
unmasker = pipeline('fill-mask')

No model was supplied, defaulted to distilbert/distilroberta-base and revision fb53ab8 (https://huggingface.co/distilbert/distilroberta-base).
Using a pipeline without specifying a model name and revision in production is not recommended.


Some weights of the model checkpoint at distilbert/distilroberta-base were not used when initializing RobertaForMaskedLM: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Device set to use mps:0


In [36]:
sentence = 'We were supposed to leave in the <mask> before the sunrise so that it will not be too hot for hiking.'

In [38]:
unmasker(sentence)

[{'score': 0.32960236072540283,
  'token': 662,
  'token_str': ' morning',
  'sequence': 'We were supposed to leave in the morning before the sunrise so that it will not be too hot for hiking.'},
 {'score': 0.2961837649345398,
  'token': 1390,
  'token_str': ' afternoon',
  'sequence': 'We were supposed to leave in the afternoon before the sunrise so that it will not be too hot for hiking.'},
 {'score': 0.08541944622993469,
  'token': 1559,
  'token_str': ' evening',
  'sequence': 'We were supposed to leave in the evening before the sunrise so that it will not be too hot for hiking.'},
 {'score': 0.06957598030567169,
  'token': 13686,
  'token_str': ' shade',
  'sequence': 'We were supposed to leave in the shade before the sunrise so that it will not be too hot for hiking.'},
 {'score': 0.049729593098163605,
  'token': 2933,
  'token_str': ' dark',
  'sequence': 'We were supposed to leave in the dark before the sunrise so that it will not be too hot for hiking.'}]
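
The `<mask>` placeholder is specific to RoBERTa-style tokenizers; BERT-style models use `[MASK]` instead. A small sketch that builds the input from whichever mask token the loaded tokenizer reports and asks for fewer candidates with `top_k`. `fill_template` and `top_predictions` are hypothetical helper names.

```python
def fill_template(template, mask_token):
    """Insert the model-specific mask token into a template containing '{mask}'."""
    return template.format(mask=mask_token)


def top_predictions(template, k=3):
    """Return the k most likely fillers as (token, score) pairs."""
    from transformers import pipeline  # imported lazily; downloads the default model on first use

    unmasker = pipeline("fill-mask")
    sentence = fill_template(template, unmasker.tokenizer.mask_token)
    return [(p["token_str"].strip(), p["score"]) for p in unmasker(sentence, top_k=k)]


template = "We were supposed to leave in the {mask} before the sunrise."
print(fill_template(template, "<mask>"))  # RoBERTa-style mask token
print(fill_template(template, "[MASK]"))  # BERT-style mask token
```

Reading the mask token from the tokenizer keeps the same code working if the default model is swapped for a different architecture.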

## Text generation

In [40]:
text_generator = pipeline('text-generation')

No model was supplied, defaulted to openai-community/gpt2 and revision 607a30d (https://huggingface.co/openai-community/gpt2).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use mps:0


In [41]:
text_generator('Based on what you have descirbed, I should', max_length = 50, do_sample = False)

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=50) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'Based on what you have descirbed, I should say that the most important thing to remember is that you should not be afraid to use your own judgement.\n\nIf you are going to use your own judgement, you should not be afraid to use your own judgement.\n\nIf you are going to use your own judgement, you should not be afraid to use your own judgement.\n\nIf you are going to use your own judgement, you should not be afraid to use your own judgement.\n\nIf you are going to use your own judgement, you should not be afraid to use your own judgement.\n\nIf you are going to use your own judgement, you should not be afraid to use your own judgement.\n\nIf you are going to use your own judgement, you should not be afraid to use your own judgement.\n\nIf you are going to use your own judgement, you should not be afraid to use your own judgement.\n\nIf you are going to use your own judgement, you should not be afraid to use your own judgement.\n\nIf you are going to use your own ju

In [42]:
text_generator('As she has left, we cannot', max_length = 50, do_sample = False)

The following generation flags are not valid and may be ignored: ['temperature']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=50) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


[{'generated_text': 'As she has left, we cannot say that she is a victim of the same kind of violence that she has been subjected to.\n\n"I am not a victim of the same kind of violence that she has been subjected to. I am a victim of the same kind of violence that she has been subjected to. I am not a victim of the same kind of violence that she has been subjected to. I am not a victim of the same kind of violence that she has been subjected to. I am not a victim of the same kind of violence that she has been subjected to. I am not a victim of the same kind of violence that she has been subjected to. I am not a victim of the same kind of violence that she has been subjected to. I am not a victim of the same kind of violence that she has been subjected to. I am not a victim of the same kind of violence that she has been subjected to. I am not a victim of the same kind of violence that she has been subjected to. I am not a victim of the same kind of violence that she has been subjected t
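
Both outputs run far past 50 tokens because, as the log says, a default `max_new_tokens` (256 in this run) takes precedence over `max_length`. Greedy decoding (`do_sample=False`) is also prone to the kind of repetition loop visible above. A hedged sketch that sets `max_new_tokens` directly and adds `no_repeat_ngram_size`, a standard generation option that blocks repeated n-grams; the specific values are illustrative assumptions, and `generate` is a hypothetical helper.

```python
# Keyword arguments forwarded to generation; values are illustrative, not tuned.
GENERATION_KWARGS = {
    "max_new_tokens": 50,       # caps the continuation length; used instead of max_length
    "do_sample": False,         # keep decoding deterministic
    "no_repeat_ngram_size": 3,  # forbid any 3-gram from appearing twice
}


def generate(prompt):
    """Generate a short continuation of `prompt` with GPT-2 using the settings above."""
    from transformers import pipeline  # imported lazily; downloads the model on first use

    text_generator = pipeline("text-generation", model="openai-community/gpt2")
    return text_generator(prompt, **GENERATION_KWARGS)[0]["generated_text"]
```

Note that `max_new_tokens` counts only generated tokens, while `max_length` counts the prompt as well.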

## Named entity recognition

In [43]:
sequence = """Luis Migueal is a Mexican songwriter born in Shijiazhuang, China.
He studied chemical science at La Universidad de Lima in Peru and got a PhD degree 5 years later.
He has a dog called Kevin and feeds it with his own poop."""

In [None]:
ner = pipeline('ner')  # keep a reference so the pipeline can be called on `sequence`

No model was supplied, defaulted to dbmdz/bert-large-cased-finetuned-conll03-english and revision 4c53496 (https://huggingface.co/dbmdz/bert-large-cased-finetuned-conll03-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
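
A sketch of how the NER pipeline could be applied to `sequence`. With `aggregation_strategy="simple"` the pipeline merges word pieces into whole entities and returns dictionaries with `entity_group`, `word`, `score`, `start`, and `end` keys. The helper names are hypothetical, and the sample entities in the usage lines are hand-written for illustration, not real model output.

```python
from collections import defaultdict


def group_entities(entities):
    """Group pipeline-style entity dicts by their 'entity_group' label."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)


def extract_entities(text):
    """Run the default NER pipeline on `text` and group the detected entities."""
    from transformers import pipeline  # imported lazily; downloads a large model on first use

    ner = pipeline("ner", aggregation_strategy="simple")
    return group_entities(ner(text))


# Hand-written sample in the pipeline's output shape (not real model output).
sample = [
    {"entity_group": "LOC", "word": "Shijiazhuang"},
    {"entity_group": "LOC", "word": "China"},
    {"entity_group": "LOC", "word": "Peru"},
]
print(group_entities(sample))
```

Calling `extract_entities(sequence)` would run the real model; which entities it finds, and with what labels, depends on the checkpoint.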


## Text summarization

In [4]:
article = """
A marathon voting session on President Donald Trump’s sweeping domestic policy bill is underway in the Senate and has stretched overnight into the early hours of Tuesday morning after a weekend of negotiations and delays.
The vote-a-rama – an open-ended series of votes on amendments, some political, some substantive – started around 9:35 a.m. on Monday and is still going with no end in sight.
The extended voting session provides an opportunity for Republicans to make any eleventh-hour adjustments to the package and Democrats to push on GOP weak points in the bill and put their colleagues on the spot.
Those politically tough votes are likely to provide fodder for campaign ads down the line.
Senate Majority Leader John Thune told reporters around 1 a.m. on Tuesday that “progress is a very elusive term” when asked if lawmakers are making progress toward a final vote.
Trump’s multitrillion-dollar bill would lower federal taxes and infuse more money into the Pentagon and border security agencies, while downsizing government safety-net programs including Medicaid.
Democrats have zeroed in on Medicaid and other safety-net programs, such as food stamps, as they message against the president’s agenda.
The vote-a-rama comes after Senate Democrats employed a major delay tactic over the weekend that forced clerks to spend more than a dozen hours reading aloud the entire bill.
Lawmakers are up against an extremely tight timeline to pass the legislation. The president has demanded Congress deliver the bill to his desk by the Fourth of July, but the measure must still go back to the House if it passes the Senate.
In the House, Speaker Mike Johnson is confronting growing levels of consternation in his ranks about the final product, raising questions about that measure’s fate in his chamber.
Around 3:30 a.m. on Tuesday, the Senate adopted its first amendment to the bill during the vote-a-rama.
The amendment, offered by Republican Sen. Joni Ernst of Iowa, bars federal funds from being used for unemployment benefits to individuals whose wages are at least $1 million.
"""

In [2]:
summarizer = pipeline('summarization')

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use mps:0


In [4]:
summarizer(article, max_length = 200, min_length = 30, do_sample = False)

[{'summary_text': ' The vote-a-rama started around 9:35 a.m. on Monday and is still going with no end in sight . Democrats have zeroed in on Medicaid and other safety-net programs, such as food stamps . The bill would lower federal taxes and infuse more money into the Pentagon and border security agencies .'}]

In [5]:
summarizer(article, max_length = 200, min_length = 30, do_sample = True)

[{'summary_text': ' The vote-a-rama started around 9:35 a.m. Monday and is still going with no end in sight . It is an open-ended series of votes on amendments, some political, some substantive . Democrats have zeroed in on Medicaid and other safety-net programs, such as food stamps . The president has demanded Congress deliver the bill to his desk by the Fourth of July .'}]

#### `do_sample=False`: uses greedy decoding or beam search → deterministic summaries

#### `do_sample=True`: enables random sampling → more diverse or creative summaries (especially when paired with top_k, top_p, or temperature)
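
The difference between the two settings can be illustrated on a single decoding step. The sketch below is a toy model of the idea, not the library's implementation: the probability table is made up, and `greedy_pick` and `sample_pick` are hypothetical helpers.

```python
import random


def greedy_pick(token_probs):
    """Return the most probable token, as greedy decoding does at each step."""
    return max(token_probs, key=token_probs.get)


def sample_pick(token_probs, temperature=1.0, top_k=None, rng=random):
    """Draw one token at random after optional top-k filtering and temperature scaling."""
    items = sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True)
    if top_k is not None:
        items = items[:top_k]  # keep only the k most probable tokens
    tokens = [token for token, _ in items]
    # Raising p to 1/T is equivalent to dividing the logits by T before the softmax.
    weights = [p ** (1.0 / temperature) for _, p in items]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Made-up next-token distribution for illustration.
next_token_probs = {"morning": 0.5, "afternoon": 0.3, "evening": 0.2}

print(greedy_pick(next_token_probs))  # always the same token
rng = random.Random(0)
print([sample_pick(next_token_probs, temperature=1.5, rng=rng) for _ in range(5)])  # varies with the seed
```

A temperature below 1 sharpens the distribution toward the greedy choice, a temperature above 1 flattens it, and `top_k=1` reduces sampling to greedy decoding.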

## Translation

In [2]:
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-es")

Device set to use mps:0


In [5]:
translator(article)

[{'translation_text': 'En el Senado está en marcha una sesión de votación maratónica sobre el proyecto de ley de política interna del presidente Donald Trump, que se ha extendido hasta las primeras horas del martes por la mañana después de un fin de semana de negociaciones y retrasos. La sesión de votación prolongada ofrece a los republicanos la oportunidad de hacer ajustes de undécima hora al paquete y a los demócratas para impulsar puntos débiles de GOP en el proyecto de ley y poner a sus colegas en el acto. Es probable que esos votos políticamente duros proporcionen forraje para los anuncios de campaña en la línea. El líder de la mayoría del Senado, John Thune, dijo a los periodistas alrededor de 1 a.m. el martes que “el progreso es un término muy difícil” cuando se les preguntó si los legisladores están haciendo progresos hacia una votación final. La ley multimillonaria de Trump reduciría los impuestos federales y aumentaría el dinero en el penúltimo caso en las agencias de segurid
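
Translation models of this kind accept a limited number of input tokens (assumed here to be about 512 for this checkpoint), so a long article passed as one string may be truncated or translated only in part; the output above already skips the article's second sentence. One workaround is to translate paragraph by paragraph, since the pipeline accepts a list of strings. `split_paragraphs` and `translate_long` are hypothetical helper names.

```python
def split_paragraphs(text):
    """Split text on line breaks and drop empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def translate_long(text, model_name="Helsinki-NLP/opus-mt-en-es"):
    """Translate `text` one paragraph at a time and join the results."""
    from transformers import pipeline  # imported lazily; downloads the model on first use

    translator = pipeline("translation", model=model_name)
    outputs = translator(split_paragraphs(text))
    return "\n".join(item["translation_text"] for item in outputs)


sample = """
First paragraph.

Second paragraph.
"""
print(split_paragraphs(sample))
```

Calling `translate_long(article)` would then translate each line of the article separately, at the cost of losing some context between paragraphs.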

## Changing the model for a task

In [7]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

In [8]:
tokenizer = AutoTokenizer.from_pretrained('Helsinki-NLP/opus-mt-zh-en')
model = AutoModelForSeq2SeqLM.from_pretrained('Helsinki-NLP/opus-mt-zh-en')

In [30]:
article_cn = """人定胜天说的是在充分了解客观规律的情况下，发挥人的主观能动性，积极思考利用规则去达成一个对人类更有利的结果
中国足球的现状恰恰就是没有尊重足球发展的客观规律，没有认真一步步去做发展足球需要的基本功导致的结果。"""

In [31]:
translator2 = pipeline(task = 'translation_zh_to_en', model = model, tokenizer = tokenizer)

Device set to use mps:0


In [32]:
translator2(article_cn, max_length = 20000)

[{'translation_text': 'The point of the winning day is to use the subjective energy of the human person, with full knowledge of the principles of objectivity, and to actively think about the use of the rules to achieve a more human-friendly outcome. The reality of football in China is precisely the lack of respect for the objective rules of football development and the failure to take serious steps to achieve the results of the basic work needed to develop football.'}]

In [33]:
translator3 = pipeline("translation", model="Helsinki-NLP/opus-mt-zh-en")

Device set to use mps:0


In [34]:
translator3(article_cn, max_length = 20000)

[{'translation_text': 'The point of the winning day is to use the subjective energy of the human person, with full knowledge of the principles of objectivity, and to actively think about the use of the rules to achieve a more human-friendly outcome. The reality of football in China is precisely the lack of respect for the objective rules of football development and the failure to take serious steps to achieve the results of the basic work needed to develop football.'}]