# `pipeline()`
The **pipeline()** function supports multiple modalities, so you can work with **text, images, audio,** and even **multimodal tasks**.

Here, we focus only on text tasks, but it’s useful to understand the transformer architecture’s potential, so we’ll briefly outline it.

In [1]:
from transformers import pipeline

In [2]:
classifier = pipeline("sentiment-analysis")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cpu


By default, this pipeline selects a particular *pretrained model* that has been fine-tuned for **sentiment analysis in English**.

The model is **downloaded** and **cached** (by default under `~/.cache/huggingface/hub`) when we create the **classifier** object. If you rerun the command, the **cached** model is reused and nothing is downloaded again.
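The log above warns against relying on the default model in production. One way to address that is to name the model and revision explicitly. The sketch below reuses the model ID and revision reported in the log; `PINNED` and `build_classifier` are hypothetical names chosen for illustration.

```python
# Values copied from the log message printed by pipeline("sentiment-analysis").
PINNED = {
    "task": "sentiment-analysis",
    "model": "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "714eb0f",
}


def build_classifier():
    """Create the pipeline with an explicit model and revision.

    The import sits inside the function (an illustrative choice) so that
    the settings above can be inspected without loading the library.
    """
    from transformers import pipeline

    return pipeline(**PINNED)
```

Calling `build_classifier()` should return the same kind of object as `pipeline("sentiment-analysis")`, but without depending on whichever default the library currently ships.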

### There are three main steps involved when you pass some text to a pipeline:
1. The text is **preprocessed** into a format the model can understand.
2. The **preprocessed** inputs are passed to the **model**.
3. The **predictions** of the **model** are **post-processed**, so you can make sense of them.
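The three steps above can be sketched by hand. This is an approximation of what the pipeline does internally, not its actual implementation: `classify` and `postprocess` are hypothetical helper names, and the sketch assumes PyTorch is installed alongside `transformers`.

```python
import math


def postprocess(logits, id2label):
    """Step 3: convert raw model scores into a {'label', 'score'} dict."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]  # softmax probabilities
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": id2label[best], "score": probs[best]}


def classify(text, model_id="distilbert/distilbert-base-uncased-finetuned-sst-2-english"):
    # Imported here so that postprocess() can be tried without these libraries.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)

    # Step 1: preprocess the text into token IDs and an attention mask.
    inputs = tokenizer(text, return_tensors="pt", truncation=True)

    # Step 2: pass the preprocessed inputs to the model.
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()

    # Step 3: post-process the raw scores into a readable prediction.
    return postprocess(logits, model.config.id2label)
```

For a single sentence, `classify("some text")` should return a dictionary with the same `label` and `score` keys that the pipeline produces.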

In [3]:
classifier(
    [
        "I'm very focused and intelligent leader",
        "I have built an organization which is world's best in technology",
        "My organization is worth of more than 200000000000 Billion $",
        "I hate lazy people, who always do the procastination",
        'I like the person who is silent but highly aggressive in work',
    ]
)

[{'label': 'POSITIVE', 'score': 0.999874472618103},
 {'label': 'POSITIVE', 'score': 0.9998679161071777},
 {'label': 'POSITIVE', 'score': 0.9956153631210327},
 {'label': 'NEGATIVE', 'score': 0.9935566782951355},
 {'label': 'POSITIVE', 'score': 0.9760797619819641}]
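The pipeline returns one dictionary per input, in the same order as the inputs, so results can be paired back with their sentences. A small sketch using the inputs and outputs shown above; `negatives` and its `threshold` parameter are hypothetical and only for illustration.

```python
# Inputs and outputs copied verbatim from the cell above.
texts = [
    "I'm very focused and intelligent leader",
    "I have built an organization which is world's best in technology",
    "My organization is worth of more than 200000000000 Billion $",
    "I hate lazy people, who always do the procastination",
    "I like the person who is silent but highly aggressive in work",
]
results = [
    {"label": "POSITIVE", "score": 0.999874472618103},
    {"label": "POSITIVE", "score": 0.9998679161071777},
    {"label": "POSITIVE", "score": 0.9956153631210327},
    {"label": "NEGATIVE", "score": 0.9935566782951355},
    {"label": "POSITIVE", "score": 0.9760797619819641},
]


def negatives(texts, results, threshold=0.9):
    """Return the sentences labelled NEGATIVE with at least `threshold` confidence."""
    return [
        text
        for text, result in zip(texts, results)
        if result["label"] == "NEGATIVE" and result["score"] >= threshold
    ]


flagged = negatives(texts, results)
print(flagged)
```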

`distilbert/distilbert-base-uncased-finetuned-sst-2-english` is a **DistilBERT** checkpoint that has been fine-tuned for **sentiment analysis on the SST-2 dataset**, while **revision 714eb0f** refers to a specific commit of the model's repository on the **Hugging Face Hub**.

The model classifies text as **positive or negative**. The **"uncased"** part indicates that it ignores capitalization, and **"distil"** indicates that it is a **smaller, distilled version** of the larger **BERT** model.

### Here's a breakdown of the components:
#### distilbert
: The name of the base language model, a distilled version of BERT, known for being smaller and faster while retaining much of BERT's performance.

#### base
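: Indicates the model size. "Base" is the smaller of BERT's two standard sizes (base and large); the DistilBERT base model has about 66 million parameters.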
#### uncased
: Means that all text is converted to lowercase before processing, so "Hello" and "hello" are treated as the same word.
#### finetuned
: Signifies that the model was pre-trained on a large corpus of text and then further trained (fine-tuned) on a specific dataset.
#### sst-2-english
: The fine-tuning task and dataset. SST-2 is the two-class version of the Stanford Sentiment Treebank, used for binary sentiment classification (positive or negative) of English text.
#### revision 714eb0f
: An identifier for a particular commit of the model's files on the Hugging Face Hub (here, an abbreviated Git commit hash). It lets users refer to an exact state of the model, which is useful for reproducible research and deployment.

# Text pipelines

* **text-generation:** Generate text from a prompt
* **text-classification:** Classify text into predefined categories
* **summarization:** Create a shorter version of a text while preserving key information
* **translation:** Translate text from one language to another
* **zero-shot-classification:** Classify text without prior training on specific labels
* **feature-extraction:** Extract vector representations of text
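The other text pipelines are created the same way, by passing the task name to `pipeline()`. As one example, here is a minimal sketch for **zero-shot-classification**. `zero_shot` and `top_label` are hypothetical helpers, the library chooses a default model because none is named, and the sketch assumes the result is a dictionary with `labels` and `scores` lists.

```python
def zero_shot(text, candidate_labels):
    """Classify `text` against labels the model was never fine-tuned on."""
    # Imported here so that top_label() can be tried without the library.
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification")
    return classifier(text, candidate_labels=candidate_labels)


def top_label(result):
    """Return the (label, score) pair with the highest score."""
    return max(zip(result["labels"], result["scores"]), key=lambda pair: pair[1])
```

For instance, `top_label(zero_shot("This is a course about transformers", ["education", "politics", "business"]))` should return the best-matching label with its score; the sentence and labels here are made up for illustration.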