The most basic object in the 🤗 Transformers library is the pipeline() function. It connects a model with its necessary preprocessing and postprocessing steps, allowing us to directly input any text and get an intelligible answer:



In [2]:
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier("I've been waiting for a HuggingFace course my whole life.")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'label': 'POSITIVE', 'score': 0.9598049521446228}]
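The log above warns against relying on the default checkpoint in production. One way to act on it is to pin both values explicitly. The sketch below is minimal and makes two assumptions: the model name and revision are copied from that log message, and `build_classifier` is a hypothetical helper name. The import is deferred so that defining the helper does not download anything.

```python
# Checkpoint and revision copied from the log message above.
MODEL_ID = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"
REVISION = "714eb0f"


def build_classifier():
    """Return a sentiment-analysis pipeline pinned to a fixed checkpoint.

    Hypothetical helper: the model weights are downloaded on the first call.
    """
    from transformers import pipeline  # imported lazily, only when needed

    return pipeline("sentiment-analysis", model=MODEL_ID, revision=REVISION)
```

Calling `classifier = build_classifier()` should then give an object that is used exactly like the `classifier` above.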

We can even pass several sentences!



In [4]:
classifier(
    ["I've been waiting for a HuggingFace course my whole life.", "I hate this so much!", "I wish I could kill you"]
)

[{'label': 'POSITIVE', 'score': 0.9598049521446228},
 {'label': 'NEGATIVE', 'score': 0.9994558691978455},
 {'label': 'NEGATIVE', 'score': 0.9984739422798157}]
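The pipeline returns one dictionary per input, in the same order as the inputs, so results can be paired back with their sentences. The sketch below reuses the output shown above as a literal so it runs without loading a model; the variable names are illustrative only.

```python
sentences = [
    "I've been waiting for a HuggingFace course my whole life.",
    "I hate this so much!",
    "I wish I could kill you",
]

# Output copied from the cell above; in a live session this would be
# `results = classifier(sentences)`.
results = [
    {"label": "POSITIVE", "score": 0.9598049521446228},
    {"label": "NEGATIVE", "score": 0.9994558691978455},
    {"label": "NEGATIVE", "score": 0.9984739422798157},
]

# Pair each input with its prediction and print a compact report.
for sentence, result in zip(sentences, results):
    print(f"{result['label']:<8} {result['score']:.3f}  {sentence}")

# Keep only the sentences classified as negative.
negatives = [s for s, r in zip(sentences, results) if r["label"] == "NEGATIVE"]
```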

There are three main steps involved when you pass some text to a pipeline:

1. The text is preprocessed into a format the model can understand.
2. The preprocessed inputs are passed to the model.
3. The predictions of the model are post-processed, so you can make sense of them.
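The three steps can be sketched with a toy example. This is not how the library implements them: the vocabulary, the word-counting "model", and all function names are hypothetical stand-ins chosen so the flow is visible in a few lines. The softmax in the last step reflects how classification scores are commonly turned into probabilities.

```python
import math

# Hypothetical toy vocabulary and label map, for illustration only.
VOCAB = {"[UNK]": 0, "i": 1, "hate": 2, "love": 3, "this": 4}
ID2LABEL = {0: "NEGATIVE", 1: "POSITIVE"}


def preprocess(text):
    """Step 1: turn raw text into integer token ids."""
    return [VOCAB.get(word, VOCAB["[UNK]"]) for word in text.lower().split()]


def toy_model(token_ids):
    """Step 2: map token ids to one raw score (logit) per label.

    Stand-in for a neural network: it only counts two hand-picked words.
    """
    negative = float(token_ids.count(VOCAB["hate"]))
    positive = float(token_ids.count(VOCAB["love"]))
    return [negative, positive]


def postprocess(logits):
    """Step 3: softmax the logits and report the most likely label."""
    exps = [math.exp(x - max(logits)) for x in logits]
    probs = [e / sum(exps) for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": ID2LABEL[best], "score": probs[best]}


def toy_pipeline(text):
    """Chain the three steps, mirroring what a pipeline call does."""
    return postprocess(toy_model(preprocess(text)))


print(toy_pipeline("I hate this"))
```

The returned dictionary has the same `label` and `score` keys as the real classifier output, which is the part of the design worth noticing: the caller never sees token ids or logits.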


In [5]:
feature_ext = pipeline("feature-extraction")
feature_ext("I hate this so much!")

No model was supplied, defaulted to distilbert/distilbert-base-cased and revision 6ea8117 (https://huggingface.co/distilbert/distilbert-base-cased).
Using a pipeline without specifying a model name and revision in production is not recommended.


[[[0.3332987129688263,
   0.16171419620513916,
   -0.1855306774377823,
   -0.25003910064697266,
   -0.1528613120317459,
   -0.2601052522659302,
   0.40522927045822144,
   -0.021569866687059402,
   0.06194540485739708,
   -1.1299914121627808,
   -0.1727878600358963,
   0.13131341338157654,
   -0.08541657030582428,
   -0.039503369480371475,
   -0.45247143507003784,
   0.15184728801250458,
   -0.05254404991865158,
   0.0446307547390461,
   -0.2146521359682083,
   -0.08890605717897415,
   0.19800518453121185,
   -0.11774303764104843,
   0.46642908453941345,
   -0.1593075543642044,
   0.23581114411354065,
   0.05387543514370918,
   0.3536788523197174,
   0.2558347284793854,
   -0.24764437973499298,
   0.27675676345825195,
   0.22154194116592407,
   0.1457994133234024,
   -0.07547339051961899,
   -0.008405503816902637,
   -0.13398191332817078,
   0.09458933025598526,
   -0.12310121953487396,
   -0.20131559669971466,
   -0.163631871342659,
   -0.4189380705356598,
   -0.5139951705932617,
   ... (output truncated)
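The feature-extraction output is a nested list: one entry per input text, one vector per token, and one float per hidden dimension. A common way to get a single vector per text is to average the token vectors (mean pooling). This is an assumption about how the features might be used, not something the pipeline does for you, and `mean_pool` is a hypothetical helper. A tiny stand-in input is used so the sketch runs without a model.

```python
def mean_pool(features):
    """Average token vectors into one vector per input text.

    `features` is nested as [batch][tokens][hidden], like the output above.
    """
    pooled = []
    for token_vectors in features:
        hidden_size = len(token_vectors[0])
        pooled.append(
            [
                sum(vec[i] for vec in token_vectors) / len(token_vectors)
                for i in range(hidden_size)
            ]
        )
    return pooled


# Hypothetical stand-in: 1 text, 3 tokens, hidden size 2.
features = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]
sentence_vectors = mean_pool(features)
print(sentence_vectors)  # [[3.0, 4.0]]
```

With real output, `mean_pool(feature_ext("I hate this so much!"))` would be expected to return a single vector whose length equals the model's hidden size.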

### Using pipeline for text generation

In [7]:
text_gen = pipeline("text-generation", model="distilgpt2")

text_gen(
    "Neutron stars are very fascinating",
    max_length=30,
    num_return_sequences=2,
)

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:None for open-end generation.


[{'generated_text': 'Neutron stars are very fascinating to researchers for years to come. But it is also hard to see the same results again. In the latest discovery'},
 {'generated_text': 'Neutron stars are very fascinating!\n\nThe most interesting thing about the stars are in color, because they have just become a normal part of'}]
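As the output shows, each `generated_text` starts with the prompt itself. If only the newly generated part is wanted, the prompt can be sliced off. The sketch below reuses the output above as a literal so it runs without a model; `continuation` is a hypothetical helper name.

```python
prompt = "Neutron stars are very fascinating"

# Output copied from the cell above; in a live session this would be the
# return value of `text_gen(prompt, ...)`.
outputs = [
    {"generated_text": "Neutron stars are very fascinating to researchers for years to come. But it is also hard to see the same results again. In the latest discovery"},
    {"generated_text": "Neutron stars are very fascinating!\n\nThe most interesting thing about the stars are in color, because they have just become a normal part of"},
]


def continuation(prompt, generated_text):
    """Return only the newly generated part, if the text starts with the prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    return generated_text


continuations = [continuation(prompt, o["generated_text"]) for o in outputs]
for text in continuations:
    print(repr(text))
```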

### Summarization
Summarization is the task of condensing a text into a shorter one while keeping all (or most) of the important points it makes. Here's an example:

In [9]:
summarizer = pipeline("summarization")

# Adjacent string literals are concatenated, so the article can span two lines.
article = (
    "Police in New York have released two photos of an unmasked individual wanted for questioning over the killing of a healthcare chief executive. UnitedHealthcare boss Brian Thompson, 50, was fatally shot in the back on Wednesday morning outside the Hilton hotel in Midtown Manhattan. The attacker fled the scene without taking any of Thompson's belongings. Police believe the victim was targeted in a pre-planned killing. Investigators are also using facial recognition technology and bullet casings with cryptic messages written on them to track down the suspect. They have yet to reveal a motive in the shooting. The shooting took place at about 06:45 EST (11:45 GMT) in a busy part of Manhattan close to Times Square and Central Park. Thompson had been scheduled to speak at an investor conference later in the day. According to police, the suspect - who was clad in a black face mask and light brown or cream-coloured jacket - appeared to be waiting for Thompson for five minutes outside the Hilton hotel where he was expected to speak. Thompson, who arrived on foot, was shot in the back and leg, and was pronounced dead about half an hour later at a local hospital. New York Police Department (NYPD) Chief of Detectives Joseph Kenny has revealed that the suspect's weapon appeared to jam, but that he was able to quickly fix it and keep shooting. CCTV footage appears to show the gunman had fitted a suppressor, also known as a silencer, to his pistol, BBC Verify has established. New York City Mayor Eric Adams - a veteran of the NYPD - told MSNBC that the use of a silencer was unprecedented in his career. 'I have never seen a silencer before,' he said. 'That was really something shocking to us all.' Investigators reportedly believe the firearm is a BT Station Six 9, a weapon which is marketed as tracing its roots back to pistols used by Second World War-era Allied special operations forces. "
    "Police have reportedly visited gun stores in Connecticut to try to determine where the weapon was purchased. After the shooting, video shows the suspect fleeing the scene on foot. Officials initially said the suspect used an electric Citi Bike owned by Lyft. But Lyft, which owns and operates Citi Bike, later said it had been told by the NYPD that one of its vehicles had not been used, according to the BBC's US partner, CBS News."
)
summarizer(article)

No model was supplied, defaulted to sshleifer/distilbart-cnn-12-6 and revision a4f8f3e (https://huggingface.co/sshleifer/distilbart-cnn-12-6).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'summary_text': ' Brian Thompson, 50, was shot in the back and leg outside the Hilton hotel in Midtown Manhattan . He had been scheduled to speak at an investor conference later in the day . Police believe the victim was targeted in a pre-planned killing . CCTV footage appears to show the gunman fitted a silencer to his pistol .'}]
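The summary above begins with a space and has a space before each period. If the text is going to be displayed, a small cleanup pass can help. The sketch below is one possible approach, with `tidy` as a hypothetical helper; it reuses the output above as a literal so it runs without a model.

```python
import re

# Output copied from the cell above.
result = [{"summary_text": " Brian Thompson, 50, was shot in the back and leg outside the Hilton hotel in Midtown Manhattan . He had been scheduled to speak at an investor conference later in the day . Police believe the victim was targeted in a pre-planned killing . CCTV footage appears to show the gunman fitted a silencer to his pistol ."}]


def tidy(summary):
    """Remove surrounding whitespace and any spaces before punctuation."""
    return re.sub(r"\s+([.,!?;:])", r"\1", summary).strip()


clean = tidy(result[0]["summary_text"])
print(clean)
```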