# **Loading Models and Running Inference with Hugging Face Transformers**

## Setup

In [36]:
# @title Install Required Libraries

!pip install torch
!pip install transformers



In [37]:
# @title Importing required libraries

from transformers import pipeline
from transformers import DistilBertForSequenceClassification, DistilBertTokenizer
from transformers import GPT2LMHeadModel, GPT2Tokenizer

import torch

# You can also use this section to suppress warnings
def warn(*args, **kwargs):
    pass
import warnings
warnings.warn = warn
warnings.filterwarnings('ignore')

# Text classification with DistilBERT

In [3]:
# @title Load the tokenizer and model

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")


## Preprocess the input text
Tokenize the input text and convert it to a format suitable for the model.

In [12]:
# Sample text
text = "Led Zeppelin didn't just play music; they redefined the very landscape of rock, blues, and folk, forging a sound that remains unmatched."
#text ="One Direction music felt less like artistic expression and more like a carefully engineered product designed for maximum teen idol appeal."

# Tokenize the input text
inputs = tokenizer(text, return_tensors="pt")

print(inputs)

{'input_ids': tensor([[  101,  2419, 22116,  2134,  1005,  1056,  2074,  2377,  2189,  1025,
          2027,  2417, 28344,  1996,  2200,  5957,  1997,  2600,  1010,  5132,
          1010,  1998,  5154,  1010,  2005,  4726,  1037,  2614,  2008,  3464,
          4895, 18900,  7690,  1012,   102]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}
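
To make the two tensors above more concrete, here is a minimal pure-Python sketch of how `input_ids` and `attention_mask` relate when sequences of different lengths are batched. In the output above, `101` and `102` are the `[CLS]` and `[SEP]` special tokens the tokenizer adds around the text. The helper `pad_batch` and the two shortened sequences are illustrative assumptions, and the pad ID `0` is assumed to match the `[PAD]` token of the uncased BERT vocabulary; the real tokenizer handles all of this internally.

```python
def pad_batch(sequences, pad_id=0):
    """Pad lists of token IDs to a common length and build matching attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)             # pad on the right
        attention_mask.append([1] * len(seq) + [0] * n_pad)  # 1 = real token, 0 = padding
    return {"input_ids": input_ids, "attention_mask": attention_mask}

# Two shortened toy sequences built from IDs that appear in the output above
toy_batch = pad_batch([
    [101, 2419, 22116, 102],
    [101, 2600, 102],
])
print(toy_batch["input_ids"])
print(toy_batch["attention_mask"])
```

With a single input, as in the cell above, no padding is needed, which is why every entry of `attention_mask` is `1`.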


## Perform inference
The `torch.no_grad()` context manager disables gradient calculation.
This reduces memory consumption and speeds up computation, because gradients are not needed for inference (i.e. when you are not training the model). The `**inputs` syntax unpacks a dictionary into keyword arguments, so `model(**inputs)` passes `input_ids` and `attention_mask` to the model as named arguments.

In [13]:
# Perform inference

with torch.no_grad():
    outputs = model(**inputs)
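
The keyword unpacking used in `model(**inputs)` can be seen in isolation with a small sketch. The function `fake_model` is a hypothetical stand-in that only reports what it received; it is not part of `transformers`.

```python
def fake_model(input_ids, attention_mask=None):
    """Hypothetical stand-in for a model call; it only reports what it received."""
    return {"n_tokens": len(input_ids[0]), "has_mask": attention_mask is not None}

toy_inputs = {"input_ids": [[101, 2419, 102]], "attention_mask": [[1, 1, 1]]}

# These two calls are equivalent: ** spreads the dictionary into keyword arguments
unpacked = fake_model(**toy_inputs)
explicit = fake_model(
    input_ids=toy_inputs["input_ids"],
    attention_mask=toy_inputs["attention_mask"],
)
print(unpacked == explicit)
```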

In [17]:
# Get the logits

logits = outputs.logits

In [15]:
# Convert logits to probabilities
probs = torch.softmax(logits, dim=-1)

# Get the predicted class
predicted_class = torch.argmax(probs, dim=-1)

# Map the predicted class to the label
# (.item() converts the one-element tensor to a plain Python int for list indexing)
labels = ["NEGATIVE", "POSITIVE"]
predicted_label = labels[predicted_class.item()]

print(f"Predicted label: {predicted_label}")

Predicted label: POSITIVE
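
The logits-to-label steps above can be traced by hand with a pure-Python sketch. The logit values here are made up for illustration and are not the model's actual output; the label order follows the `labels` list used in the cell above.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 (numerically stable form)."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [-4.2, 4.5]  # made-up scores for [NEGATIVE, POSITIVE]
toy_probs = softmax(toy_logits)

# argmax: index of the largest probability
toy_class = max(range(len(toy_probs)), key=toy_probs.__getitem__)

id2label = {0: "NEGATIVE", 1: "POSITIVE"}
print(id2label[toy_class])
```

Because softmax preserves the ordering of its inputs, taking the argmax of the logits directly would select the same class; the softmax step matters when you want to report a probability alongside the label.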


# Text generation with GPT-2


In [18]:
# @title Load the tokenizer and pretrained model

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")


## Preprocess the input text

In [28]:
# Prompt
prompt = "Ooh, and she's buying a stairway to Heaven"

# Tokenize the input text
inputs = tokenizer(prompt, return_tensors="pt")
inputs

{'input_ids': tensor([[   46,  1219,    11,   290,   673,   338,  7067,   257, 18761,  1014,
           284, 11225]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}

## Perform inference
Generate text using the model. The arguments passed to `model.generate()` are:

- `inputs.input_ids`: Input token IDs from the tokenizer
- `attention_mask`: Mask indicating which tokens to attend to
- `pad_token_id`: Padding token ID, set here to the end-of-sequence token ID because GPT-2 defines no padding token
- `max_length`: Maximum length of the generated sequence, counting the prompt tokens
- `num_return_sequences`: Number of sequences to generate

In [29]:
# Generate text
output_ids = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    pad_token_id=tokenizer.eos_token_id,
    max_length=50,
    num_return_sequences=1
)

output_ids

tensor([[   46,  1219,    11,   290,   673,   338,  7067,   257, 18761,  1014,
           284, 11225,    13,   198,   198,     1,    40,  1101,  1016,   284,
           467,   284, 11225,   553,   673,  1139,    13,   366,    40,  1101,
          1016,   284,   467,   284, 11225,   526,   198,   198,     1,    40,
          1101,  1016,   284,   467,   284, 11225,   553,   673,  1139,    13]])
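
Note that the tensor above holds 50 IDs in total, and the first 12 are the prompt: `max_length` counts the prompt as well as the new tokens. The following toy sketch illustrates that bookkeeping with a greedy loop. The function `toy_next_token` is a hypothetical stand-in for the model, and the sketch ignores details of the real `generate()` method such as stopping early at an end-of-sequence token.

```python
def toy_next_token(token_ids):
    """Hypothetical stand-in for a language model: a fixed rule picks the next ID."""
    return (token_ids[-1] * 7 + 3) % 50

def greedy_generate(prompt_ids, max_length):
    """Append one token at a time until the total length reaches max_length."""
    token_ids = list(prompt_ids)
    while len(token_ids) < max_length:
        token_ids.append(toy_next_token(token_ids))
    return token_ids

toy_prompt_ids = [4, 8, 15]
toy_output_ids = greedy_generate(toy_prompt_ids, max_length=10)

# Total length is 10, so only 10 - 3 = 7 tokens are newly generated
print(len(toy_output_ids), len(toy_output_ids) - len(toy_prompt_ids))
```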

In [30]:
# Decode the generated text
generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(generated_text)

Ooh, and she's buying a stairway to Heaven.

"I'm going to go to Heaven," she says. "I'm going to go to Heaven."

"I'm going to go to Heaven," she says.


# Hugging Face `pipeline()` function

The `pipeline()` function from the Hugging Face `transformers` library is a high-level API designed to simplify the usage of pretrained models for various natural language processing (NLP) tasks. It abstracts the complexities of model loading, tokenization, inference, and post-processing, allowing users to perform complex NLP tasks with just a few lines of code.
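
As a rough mental model of what such a pipeline bundles together, the sketch below chains three stages: preprocessing, a model call, and post-processing. The class `ToySentimentPipeline` and its word lists are entirely hypothetical; a real pipeline uses a tokenizer and a neural network in place of the word counting shown here.

```python
class ToySentimentPipeline:
    """Hypothetical miniature of the preprocess -> model -> postprocess flow."""

    POSITIVE_WORDS = {"great", "unmatched", "innovative"}
    NEGATIVE_WORDS = {"generic", "formulaic", "devoid"}

    def preprocess(self, text):
        # Stand-in for tokenization: lowercase, drop basic punctuation, split on spaces
        return text.lower().replace(",", " ").replace(".", " ").split()

    def forward(self, tokens):
        # Stand-in for the model: a single score from word counts
        pos = sum(t in self.POSITIVE_WORDS for t in tokens)
        neg = sum(t in self.NEGATIVE_WORDS for t in tokens)
        return pos - neg

    def postprocess(self, score):
        # Stand-in for mapping raw model output to a readable label
        return [{"label": "POSITIVE" if score >= 0 else "NEGATIVE"}]

    def __call__(self, text):
        return self.postprocess(self.forward(self.preprocess(text)))

toy_pipe = ToySentimentPipeline()
print(toy_pipe("A generic, formulaic anthem."))
```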



## Details

### Definition

```python
transformers.pipeline(
    task: str,
    model: Optional = None,
    config: Optional = None,
    tokenizer: Optional = None,
    feature_extractor: Optional = None,
    framework: Optional = None,
    revision: str = 'main',
    use_fast: bool = True,
    model_kwargs: Dict[str, Any] = None,
    **kwargs
)
```

### Parameters

- **task**: `str`
  - The task to perform, such as "text-classification", "text-generation", "question-answering", etc.
  - Example: `"text-classification"`

- **model**: `Optional`
  - The model to use. This can be a string (model identifier from Hugging Face model hub), a path to a directory containing model files, or a pre-loaded model instance.
  - Example: `"distilbert-base-uncased-finetuned-sst-2-english"`

- **config**: `Optional`
  - The configuration to use. This can be a string, a path to a directory, or a pre-loaded config object.
  - Example: `"distilbert-base-uncased-finetuned-sst-2-english"`

- **tokenizer**: `Optional`
  - The tokenizer to use. This can be a string, a path to a directory, or a pre-loaded tokenizer instance.
  - Example: `"bert-base-uncased"`

- **feature_extractor**: `Optional`
  - The feature extractor to use for tasks that require it (e.g., image processing).
  - Example: `"google/vit-base-patch16-224"`

- **framework**: `Optional`
  - The framework to use, either `"pt"` for PyTorch or `"tf"` for TensorFlow. If not specified, it will be inferred.
  - Example: `"pt"`

- **revision**: `str`, default `'main'`
  - The specific model version to use (branch, tag, or commit hash).
  - Example: `"v1.0"`

- **use_fast**: `bool`, default `True`
  - Whether to use the fast version of the tokenizer if available.
  - Example: `True`

- **model_kwargs**: `Dict[str, Any]`, default `None`
  - Additional keyword arguments passed to the model's `from_pretrained()` call when the model is loaded.
  - Example: `{"output_hidden_states": True}`

- **kwargs**: `Any`
  - Additional keyword arguments passed to the pipeline components.
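
As a small sketch of how these parameters fit together, the helper below collects a few of them into a dictionary that can be unpacked into `pipeline()`. The helper `make_pipeline_kwargs` is a hypothetical convenience, not part of `transformers`, and it covers only `task`, `model`, and `revision`. The actual `pipeline()` call is left as a comment so the snippet runs without downloading a model.

```python
def make_pipeline_kwargs(task, model, revision="main"):
    """Collect a few pipeline() arguments in one dictionary (hypothetical helper)."""
    return {"task": task, "model": model, "revision": revision}

toy_kwargs = make_pipeline_kwargs(
    task="text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)
print(toy_kwargs)

# With transformers installed, the pipeline would then be created with:
# from transformers import pipeline
# classifier = pipeline(**toy_kwargs)
```

Passing `model` explicitly, rather than relying on the task's default model, keeps the notebook's behavior stable if the default changes.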

### Task types

The `pipeline()` function supports a wide range of NLP tasks. Here are some of the common tasks:

1. **Text Classification**: `text-classification`
   - **Purpose**: Classify text into predefined categories.
   - **Use Cases**: Sentiment analysis, spam detection, topic classification.

2. **Text Generation**: `text-generation`
   - **Purpose**: Generate coherent text based on a given prompt.
   - **Use Cases**: Creative writing, dialogue generation, story completion.

3. **Question Answering**: `question-answering`
   - **Purpose**: Answer questions based on a given context.
   - **Use Cases**: Building Q&A systems, information retrieval from documents.

4. **Named Entity Recognition (NER)**: `ner` (or `token-classification`)
   - **Purpose**: Identify and classify named entities (like people, organizations, locations) in text.
   - **Use Cases**: Extracting structured information from unstructured text.

5. **Summarization**: `summarization`
   - **Purpose**: Summarize long pieces of text into shorter, coherent summaries.
   - **Use Cases**: Document summarization, news summarization.

6. **Translation**: `translation_xx_to_yy` (e.g., `translation_en_to_fr`)
   - **Purpose**: Translate text from one language to another.
   - **Use Cases**: Language translation, multilingual applications.

7. **Fill-Mask**: `fill-mask`
   - **Purpose**: Predict masked words in a sentence (useful for masked language modeling).
   - **Use Cases**: Language modeling tasks, understanding model predictions.

8. **Zero-Shot Classification**: `zero-shot-classification`
   - **Purpose**: Classify text into categories without needing training data for those categories.
   - **Use Cases**: Flexible and adaptable classification tasks.

9. **Feature Extraction**: `feature-extraction`
   - **Purpose**: Extract hidden state features from text.
   - **Use Cases**: Downstream tasks requiring text representations, such as clustering, similarity, or further custom model training.


## Example 1: Text Classification using `pipeline()`

In [33]:
# @title Load a general text classification model
classifier = pipeline("text-classification", model="distilbert-base-uncased-finetuned-sst-2-english")

Device set to use cpu


In [34]:
# @title Classify a sample text
result = classifier("Their innovative blend of thunderous riffs, soaring vocals, and mystical lyricism established Led Zeppelin as pioneers, forever altering the trajectory of popular music.")
print(result)

[{'label': 'POSITIVE', 'score': 0.999832034111023}]


In [35]:
# @title Classify another sample text
result = classifier("Nickelback's music is a masterclass in generic, focus-grouped rock, delivering formulaic anthems devoid of genuine artistry.")
print(result)

[{'label': 'NEGATIVE', 'score': 0.9997269511222839}]
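
Each result is a list of dictionaries with a `label` and a `score`, so ordinary Python is enough to post-process it. The sketch below reuses the two results printed above; the helper `summarize` and its threshold of 0.9 are illustrative assumptions, not part of the pipeline API.

```python
# The two predictions printed above, copied into one list
toy_results = [
    {"label": "POSITIVE", "score": 0.999832034111023},
    {"label": "NEGATIVE", "score": 0.9997269511222839},
]

def summarize(result, threshold=0.9):
    """Format one prediction and flag scores below a chosen (hypothetical) threshold."""
    flag = "" if result["score"] >= threshold else " (low confidence)"
    return f"{result['label']} {result['score']:.2%}{flag}"

for r in toy_results:
    print(summarize(r))
```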


## Example 2: Language detection using `pipeline()`

In [38]:
# Load the language detection model
classifier = pipeline("text-classification", model="papluca/xlm-roberta-base-language-detection")

# Classify the sample text
result = classifier("Bonjour, comment ça va?")
print(result)

Device set to use cpu


[{'label': 'fr', 'score': 0.9934879541397095}]


## Example 3: Text Generation using `pipeline()`

In [39]:
# Initialize the text generation pipeline with GPT-2
generator = pipeline("text-generation", model="gpt2")

Device set to use cpu


In [45]:
# Generate text based on a given prompt
# (pad_token_id is set to GPT-2's end-of-sequence token ID, 50256)
prompt = "We came from the land of ice and snow"
result = generator(prompt, max_length=40, max_new_tokens=None, num_return_sequences=1, truncation=True, pad_token_id=generator.tokenizer.eos_token_id)

# Print the generated text
print(result[0]['generated_text'])

We came from the land of ice and snow. We were so much stronger than anyone we never had a chance to fight.

"To be honest, we were always fighting for each other.


## Example 4: Text-to-text generation (translation) using T5 with `pipeline()`

In [46]:
# Initialize the text-to-text generation pipeline with T5
generator = pipeline("text2text-generation", model="t5-small")

Device set to use cpu


In [48]:
# Generate text based on a given prompt
prompt = "translate English to French: How are you?"
result = generator(prompt, max_length=50, max_new_tokens=None, num_return_sequences=1)

# Print the generated text
print(result[0]['generated_text'])

Comment êtes-vous?
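
In the example above, the task is selected by the plain-text prefix at the start of the prompt rather than by a separate parameter. A minimal sketch of building such prompts follows; the helper `t5_translation_prompt` is a hypothetical convenience, and whether a given language pair works depends on what the chosen checkpoint was trained on.

```python
def t5_translation_prompt(text, source="English", target="French"):
    """Prepend the plain-text task prefix used in the prompt above."""
    return f"translate {source} to {target}: {text}"

toy_prompt = t5_translation_prompt("How are you?")
print(toy_prompt)
```

The resulting string is identical to the `prompt` used in the cell above and could be passed to `generator` in the same way.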
