# **🤗 Transformer Pipeline Inference**

## **What's Covered?**
1. Pipelines
    - What is pipeline()?
    - Behind the Scenes
    - Key Points
2. Pipeline Syntax 
    - Example Usage
    - Garbage Collection
3. Default Model List
    - Common Tasks Supported
    - Identify the Pipeline Supported Tasks

## **Pipelines**

### **What is pipeline()?**
The `pipeline()` function makes it simple to use any model from the `Hub` for inference on language, computer vision, speech, and multimodal tasks. Even if you don't have experience with a specific modality or aren't familiar with the underlying code behind the models, you can still use them for inference with `pipeline()`!

It is the easiest way to start using pre-trained Hugging Face models.

It's a high-level API that abstracts away the complexity of tokenization, model loading, and post-processing, allowing you to perform common tasks with just a few lines of code.

### **Behind the Scenes**
- Determines the framework (pt/tf)
- Loads the tokenizer
- Loads the model
- Chooses the device (MPS/CUDA/CPU)
- Handles pre- and post-processing
- Returns the results
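
The post-processing step can be illustrated without downloading a model. The sketch below assumes a text-classification model that returns one raw score (logit) per class. The `logits` values and the `id2label` mapping are made-up placeholders, and `postprocess` is a hypothetical helper, not the library's actual implementation.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def postprocess(logits, id2label):
    """Return the top class in the list-of-dicts shape used for classification results."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)  # index of the highest probability
    return [{"label": id2label[best], "score": probs[best]}]

# Placeholder values for illustration only (not real model output)
id2label = {0: "NEGATIVE", 1: "POSITIVE"}
logits = [4.2, -3.9]

result = postprocess(logits, id2label)
print(result)
```

The result has the same `label`/`score` structure that a text-classification pipeline reports, which is why the raw logits never need to be handled directly.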

### **Key Points**
- The first time you run a pipeline for a specific model, it will download the model weights (which can be several hundred MB to GBs).
- Subsequent runs will use the cached version.
- You can specify a particular model within the pipeline if you don't want the default.
- The output format of the pipeline varies depending on the task.
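
To check whether a model has already been cached, it can help to know where the files usually land. The sketch below assumes the default cache layout used by `huggingface_hub` (a `hub` folder under `~/.cache/huggingface`, overridable through the `HF_HOME` or `HF_HUB_CACHE` environment variables, with one `models--<org>--<name>` folder per model). It ignores other overrides such as `XDG_CACHE_HOME`, and both helper functions are hypothetical names.

```python
import os
from pathlib import Path

def hub_cache_dir(env=None):
    """Resolve the Hub cache directory, assuming the default huggingface_hub layout."""
    env = os.environ if env is None else env
    if "HF_HUB_CACHE" in env:  # most specific override wins
        return Path(env["HF_HUB_CACHE"]).expanduser()
    hf_home = env.get("HF_HOME", "~/.cache/huggingface")
    return Path(hf_home).expanduser() / "hub"

def cached_model_dir(model_id, env=None):
    """Map 'org/name' to the assumed cache folder name 'models--org--name'."""
    return hub_cache_dir(env) / ("models--" + model_id.replace("/", "--"))

model_dir = cached_model_dir("distilbert/distilbert-base-uncased-finetuned-sst-2-english")
status = "cached" if model_dir.exists() else "not downloaded yet"
print(f"{model_dir} ({status})")
```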

## **Pipeline Syntax**
1. Start by importing `pipeline` and `torch`.
```python
from transformers import pipeline
import torch
```
2. Specify the inference task, the model, and `torch_dtype`.
    - `torch_dtype` tells the pipeline which numeric precision to use when loading the model weights and running computations.
    - `torch_dtype=torch.bfloat16` loads the model parameters in `bfloat16` precision instead of the default (usually `float32`).
    - This lowers memory usage and can speed up inference on hardware with native `bfloat16` support.
```python
classifier = pipeline(
    task="text-classification", 
    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    torch_dtype=torch.bfloat16,
)
```
3. Pass the input text to the pipeline object (here, `classifier`).
```python
classifier(input_text)
```
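
The memory effect of `torch_dtype` can be estimated with simple arithmetic: weight memory is roughly the number of parameters times the bytes per parameter. The sketch below assumes a round figure of 67 million parameters for a DistilBERT-sized model and counts weights only (activations and framework overhead are ignored). `weight_memory_mb` is a hypothetical helper.

```python
# Storage size of one parameter for each precision
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}

def weight_memory_mb(num_params, dtype):
    """Approximate memory needed for the weights alone, in MiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**2

num_params = 67_000_000  # assumed round figure, not an exact count

for dtype in ("float32", "bfloat16"):
    print(f"{dtype}: ~{weight_memory_mb(num_params, dtype):.0f} MB")
```

Halving the bytes per parameter halves the weight footprint, which is the main reason to load in `bfloat16`.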


### **Example Usage**

In [1]:
# Import pipeline
from transformers import pipeline
import torch

# Specify the inference task
classifier = pipeline(
    task="text-classification", 
    model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    torch_dtype=torch.bfloat16,
)

classifier("It was a very bad movie.")

Device set to use mps:0


[{'label': 'NEGATIVE', 'score': 0.9997997879981995}]

In [2]:
classifier.device

device(type='mps', index=0)

**Important**
Transformers needs to decide:
- Should I load a PyTorch model? (pt)
- Should I load a TensorFlow model? (tf)
- Which AutoModel class is correct?
- If the user didn't specify a model, which model should I default to?
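
These decisions can be sketched as a lookup in a task registry. The example below is a simplified illustration, not the library's code: `TASKS` is a hand-written excerpt shaped like the library's `SUPPORTED_TASKS` dictionary (with class names stored as plain strings), `resolve` is a hypothetical helper, and preferring PyTorch when both frameworks are installed is an assumption.

```python
TASKS = {
    "text-classification": {
        "pt": ("AutoModelForSequenceClassification",),
        "tf": ("TFAutoModelForSequenceClassification",),
        "default": {"model": {
            "pt": ("distilbert/distilbert-base-uncased-finetuned-sst-2-english", "714eb0f"),
            "tf": ("distilbert/distilbert-base-uncased-finetuned-sst-2-english", "714eb0f"),
        }},
    },
    "audio-classification": {
        "pt": ("AutoModelForAudioClassification",),
        "tf": (),  # no TensorFlow class registered for this task
        "default": {"model": {"pt": ("superb/wav2vec2-base-superb-ks", "372e048")}},
    },
}

def resolve(task, model=None, installed=("pt",)):
    """Pick a framework, the matching auto classes, and a model for the task."""
    entry = TASKS[task]
    # Assumption: take the first installed framework that has a class for this task
    framework = next((fw for fw in ("pt", "tf") if fw in installed and entry[fw]), None)
    if framework is None:
        raise ValueError(f"No installed framework supports task {task!r}")
    revision = None
    if model is None:
        # Fall back to the task's default checkpoint and its pinned revision
        model, revision = entry["default"]["model"][framework]
    return {"framework": framework, "auto_classes": entry[framework],
            "model": model, "revision": revision}

choice = resolve("text-classification")
print(choice)
```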

### **Garbage Collection**

```python
del classifier
```
- This does not delete the object from memory directly.
- It only removes the name `classifier` from the current namespace.
- `classifier` is just a variable name that points to an object. `del` removes that reference, and the object itself is freed only once nothing else refers to it.

```python
import gc
gc.collect()
# Output: 10 (the exact count varies from run to run)
```
- This explicitly asks Python's garbage collector to find unreachable objects and free them.
- Output represents the number of unreachable objects which were found and collected. 
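
The difference between `del` and `gc.collect()` can be observed with a small experiment. The sketch below uses a hypothetical `FakePipeline` class that holds a reference to itself, so reference counting alone cannot free it. A weak reference lets us check whether the object still exists without keeping it alive. The behavior shown assumes CPython.

```python
import gc
import weakref

class FakePipeline:
    """Hypothetical stand-in for a large object such as a loaded pipeline."""
    def __init__(self):
        self.owner = self  # self-reference creates a reference cycle

was_enabled = gc.isenabled()
gc.disable()  # pause automatic collection so the steps below are deterministic
try:
    classifier = FakePipeline()
    probe = weakref.ref(classifier)  # observes the object without keeping it alive

    del classifier  # removes the name only; the cycle keeps the object alive
    alive_after_del = probe() is not None

    found = gc.collect()  # the cycle detector frees the unreachable object
    alive_after_collect = probe() is not None
finally:
    if was_enabled:
        gc.enable()

print("alive after del:", alive_after_del)
print("alive after gc.collect():", alive_after_collect)
print("unreachable objects found:", found)
```

Without the self-reference, CPython would free the object as soon as `del` drops the last reference, and `gc.collect()` would have nothing left to do for it.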

In [3]:
del classifier

import gc
gc.collect()

113

## **Default Model List**

### **Common Tasks Supported**
- text-classification
- text-generation
- ner (named entity recognition; an alias for token-classification)
- summarization
- translation
- question-answering
- fill-mask (predicting missing words)
- zero-shot-classification (classifying text without specific training examples)
- ... and many more!

Explore more on:  
https://huggingface.co/docs/transformers/main_classes/pipelines

### **Identify the Pipeline Supported Tasks**

In [4]:
from transformers.pipelines import SUPPORTED_TASKS
print(SUPPORTED_TASKS.keys())

dict_keys(['audio-classification', 'automatic-speech-recognition', 'text-to-audio', 'feature-extraction', 'text-classification', 'token-classification', 'question-answering', 'table-question-answering', 'visual-question-answering', 'document-question-answering', 'fill-mask', 'summarization', 'translation', 'text2text-generation', 'text-generation', 'zero-shot-classification', 'zero-shot-image-classification', 'zero-shot-audio-classification', 'image-classification', 'image-feature-extraction', 'image-segmentation', 'image-to-text', 'image-text-to-text', 'object-detection', 'zero-shot-object-detection', 'depth-estimation', 'video-classification', 'mask-generation', 'image-to-image'])


In [5]:
SUPPORTED_TASKS["audio-classification"]

{'impl': transformers.pipelines.audio_classification.AudioClassificationPipeline,
 'tf': (),
 'pt': (transformers.models.auto.modeling_auto.AutoModelForAudioClassification,),
 'default': {'model': {'pt': ('superb/wav2vec2-base-superb-ks', '372e048')}},
 'type': 'audio'}

In [6]:
SUPPORTED_TASKS["text-classification"]

{'impl': transformers.pipelines.text_classification.TextClassificationPipeline,
 'tf': (transformers.models.auto.modeling_tf_auto.TFAutoModelForSequenceClassification,),
 'pt': (transformers.models.auto.modeling_auto.AutoModelForSequenceClassification,),
 'default': {'model': {'pt': ('distilbert/distilbert-base-uncased-finetuned-sst-2-english',
    '714eb0f'),
   'tf': ('distilbert/distilbert-base-uncased-finetuned-sst-2-english',
    '714eb0f')}},
 'type': 'text'}

In [7]:
import pandas as pd

# Build one summary row per task from the SUPPORTED_TASKS registry
data = SUPPORTED_TASKS

rows = []

for task, info in data.items():
    impl = info.get("impl", None)

    # Extract PyTorch model classes
    pt_models = [m.__name__ for m in info.get("pt", [])]

    # Extract TensorFlow model classes
    tf_models = [m.__name__ for m in info.get("tf", [])]

    # Extract default model names
    default_entry = info.get("default", {})
    default_model_dict = default_entry.get("model", {})

    default_pt = None
    default_tf = None

    if "pt" in default_model_dict:
        default_pt = default_model_dict["pt"][0]   # (model_name, revision)
    if "tf" in default_model_dict:
        default_tf = default_model_dict["tf"][0]

    rows.append({
        "task": task,
        "type": info.get("type"),
        "impl": impl.__name__ if impl else None,
        "pt_models": pt_models,
        "tf_models": tf_models,
        "default_pt_model": default_pt,
        "default_tf_model": default_tf,
    })

df = pd.DataFrame(rows)
df

| | task | type | impl | pt_models | tf_models | default_pt_model | default_tf_model |
|---|---|---|---|---|---|---|---|
| 0 | audio-classification | audio | AudioClassificationPipeline | [AutoModelForAudioClassification] | [] | superb/wav2vec2-base-superb-ks | |
| 1 | automatic-speech-recognition | multimodal | AutomaticSpeechRecognitionPipeline | [AutoModelForCTC, AutoModelForSpeechSeq2Seq] | [] | facebook/wav2vec2-base-960h | |
| 2 | text-to-audio | text | TextToAudioPipeline | [AutoModelForTextToWaveform, AutoModelForTextT... | [] | suno/bark-small | |
| 3 | feature-extraction | multimodal | FeatureExtractionPipeline | [AutoModel] | [TFAutoModel] | distilbert/distilbert-base-cased | distilbert/distilbert-base-cased |
| 4 | text-classification | text | TextClassificationPipeline | [AutoModelForSequenceClassification] | [TFAutoModelForSequenceClassification] | distilbert/distilbert-base-uncased-finetuned-s... | distilbert/distilbert-base-uncased-finetuned-s... |
| 5 | token-classification | text | TokenClassificationPipeline | [AutoModelForTokenClassification] | [TFAutoModelForTokenClassification] | dbmdz/bert-large-cased-finetuned-conll03-english | dbmdz/bert-large-cased-finetuned-conll03-english |
| 6 | question-answering | text | QuestionAnsweringPipeline | [AutoModelForQuestionAnswering] | [TFAutoModelForQuestionAnswering] | distilbert/distilbert-base-cased-distilled-squad | distilbert/distilbert-base-cased-distilled-squad |
| 7 | table-question-answering | text | TableQuestionAnsweringPipeline | [AutoModelForTableQuestionAnswering] | [TFAutoModelForTableQuestionAnswering] | google/tapas-base-finetuned-wtq | google/tapas-base-finetuned-wtq |
| 8 | visual-question-answering | multimodal | VisualQuestionAnsweringPipeline | [AutoModelForVisualQuestionAnswering] | [] | dandelin/vilt-b32-finetuned-vqa | |
| 9 | document-question-answering | multimodal | DocumentQuestionAnsweringPipeline | [AutoModelForDocumentQuestionAnswering] | [] | impira/layoutlm-document-qa | |
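
One practical use of this summary is spotting tasks that have no TensorFlow model class. The sketch below works on plain Python data so it runs without `transformers` or `pandas`. The `rows` list is copied by hand from the `task` and `tf_models` columns of the ten rows shown above, and `pytorch_only` is a hypothetical helper.

```python
# (task, tf_models) pairs copied from the rows shown above
rows = [
    ("audio-classification", []),
    ("automatic-speech-recognition", []),
    ("text-to-audio", []),
    ("feature-extraction", ["TFAutoModel"]),
    ("text-classification", ["TFAutoModelForSequenceClassification"]),
    ("token-classification", ["TFAutoModelForTokenClassification"]),
    ("question-answering", ["TFAutoModelForQuestionAnswering"]),
    ("table-question-answering", ["TFAutoModelForTableQuestionAnswering"]),
    ("visual-question-answering", []),
    ("document-question-answering", []),
]

def pytorch_only(task_rows):
    """Return the tasks whose tf_models list is empty."""
    return [task for task, tf_models in task_rows if not tf_models]

pt_only_tasks = pytorch_only(rows)
print(pt_only_tasks)
```

With the DataFrame itself, a filter along the lines of `df[df["tf_models"].str.len() == 0]` should select the same rows.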
