# Hugging Face Transformer pipeline

## Introduction to Transformers
The Transformer is a deep learning architecture built on the attention mechanism, and it underlies most of the recent successful AI models.

<img src="images/transformers.png" width="300" height="300">

## <img src="images/hugging_face.png" width="40" height="40"> Introduction to Hugging Face
- Hugging Face is a company that focuses on natural language processing (NLP) and provides various tools and libraries to facilitate NLP tasks. One of their popular offerings is the Transformers library.
- [Hugging Face website](https://huggingface.co/)

### Transformers Library
- The Transformers library is a powerful and easy-to-use library for applying state-of-the-art NLP models to various tasks. 
- It provides pre-trained models for tasks like text classification, named entity recognition, question answering, and more.

### Hugging Face Hub

- Hugging Face Hub is a platform provided by Hugging Face where you can find a wide range of pre-trained models and datasets. 
- It allows you to easily access and use these models and datasets in your own projects.

In [1]:
# Install the transformers and gradio libraries using pip
! pip install transformers gradio



## Hugging Face Pipeline

The Hugging Face Pipeline is a high-level API that provides a simple and convenient way to use pre-trained models for various NLP tasks. It abstracts away the complexities of model loading, tokenization, and inference, allowing you to focus on the task at hand.

With the Hugging Face Pipeline, you can perform tasks such as text generation, sentiment analysis, named entity recognition, question answering, and more, with just a few lines of code.
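
The three stages the pipeline hides (tokenization, model inference, and turning raw scores into labels) can be sketched with a toy stand-in. Everything below is hypothetical: `ToySentimentPipeline`, its word lists, and its scoring rule are illustrative and are not part of the Transformers library. Only the preprocess/forward/postprocess split mirrors how the real pipelines are organized.

```python
import math


class ToySentimentPipeline:
    """Hypothetical stand-in showing the stages a pipeline wraps."""

    POSITIVE = {"happy", "great", "love"}
    NEGATIVE = {"sad", "bad", "hate"}

    def preprocess(self, text):
        # "Tokenization": lowercase and split on whitespace.
        return text.lower().split()

    def forward(self, tokens):
        # "Model inference": produce one raw score (logit) for the input.
        pos = sum(t in self.POSITIVE for t in tokens)
        neg = sum(t in self.NEGATIVE for t in tokens)
        return float(pos - neg)

    def postprocess(self, logit):
        # Map the logit to a probability with a sigmoid, then to a label.
        prob_pos = 1.0 / (1.0 + math.exp(-logit))
        if prob_pos >= 0.5:
            return {"label": "POSITIVE", "score": prob_pos}
        return {"label": "NEGATIVE", "score": 1.0 - prob_pos}

    def __call__(self, text):
        # Same output shape as a real pipeline: a list with one dict per input.
        return [self.postprocess(self.forward(self.preprocess(text)))]


toy = ToySentimentPipeline()
print(toy("We are very happy today"))
```

A real pipeline replaces each stage with a trained tokenizer and model, but the caller still sees a single callable that takes text and returns a list of dictionaries.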


<img src="images/pipeline.png">

## Text generation task pipeline

In [2]:
from transformers import pipeline

# No device is passed here, so the pipeline picks one automatically; pass device=0 to force the first GPU
pipe = pipeline("text-generation", model="HuggingFaceTB/SmolLM2-135M")

Device set to use cuda:0
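
The cell above does not pass `device`, and the log line shows that the pipeline chose `cuda:0` on its own. To make the choice explicit, a small helper can compute the value to pass as `device`. This is a sketch assuming PyTorch is the backend; `pick_device` is a hypothetical helper name. In the pipeline API, `device=0` selects the first GPU and `device=-1` selects the CPU.

```python
def pick_device():
    """Return 0 (first GPU) if CUDA is usable, else -1 (CPU)."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no CUDA check to make; fall back to CPU.
        return -1
    return 0 if torch.cuda.is_available() else -1


device = pick_device()
print(f"pipeline device argument: {device}")
# Assumed usage:
# pipeline("text-generation", model="HuggingFaceTB/SmolLM2-135M", device=device)
```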


In [3]:
output = pipe('''def print_prime(n):
   """
   Print all primes between 1 and n
   """''', max_length=100)

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.


In [4]:
print(output[0]['generated_text'])

def print_prime(n):
   """
   Print all primes between 1 and n
   """
   if n <= 1:
      return
   for i in range(2, int(n**0.5)+1):
      if n % i == 0:
         return False
   return True

def is_prime(n):
   """
   Returns True if n is a prime number
   """
   if n == 1:
      return False
   if n == 
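
As the printed result shows, `generated_text` starts with the prompt itself and stops mid-line once `max_length` tokens are reached. When only the continuation is needed, the prompt prefix can be sliced off. The helper name `strip_prompt` and the sample strings below are hypothetical; the text-generation pipeline also accepts `return_full_text=False`, which should have the same effect at the source.

```python
def strip_prompt(prompt, generated_text):
    """Return only the newly generated continuation."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):]
    # The output did not echo the prompt; leave it unchanged.
    return generated_text


prompt = "def add(a, b):"
# Shaped like the pipeline result: a list with one dict per generated sequence.
output = [{"generated_text": "def add(a, b):\n    return a + b"}]

print(strip_prompt(prompt, output[0]["generated_text"]))
```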


## Sentiment analysis task pipeline

In [5]:
classifier = pipeline("sentiment-analysis") # If you don't specify a model, it will use the default one
classifier("We are very happy to show you the 🤗 Transformers library.")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


Device set to use cuda:0


[{'label': 'POSITIVE', 'score': 0.9997795224189758}]
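
The classifier returns a list with one dictionary per input, each holding a `label` and a `score`. A common follow-up is to treat low-confidence predictions separately. The sketch below works on the literal result shown above; `summarize` and the 0.9 threshold are hypothetical choices for illustration.

```python
def summarize(results, threshold=0.9):
    """Format each prediction, flagging scores below the threshold."""
    lines = []
    for r in results:
        flag = "" if r["score"] >= threshold else " (low confidence)"
        lines.append(f"{r['label']}: {r['score']:.2%}{flag}")
    return lines


# The result printed by the classifier cell above.
results = [{"label": "POSITIVE", "score": 0.9997795224189758}]
for line in summarize(results):
    print(line)
```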

In [6]:
classifier.model.name_or_path

'distilbert/distilbert-base-uncased-finetuned-sst-2-english'
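
The warning printed when the classifier was created advises against relying on the default model in production. One way to follow that advice is to pass the model name and revision reported in that log explicitly. This is a sketch: `PINNED` and `build_classifier` are hypothetical names, and the function is defined but not called here, since calling it downloads the model.

```python
# Values copied from the log message printed by the default pipeline.
PINNED = {
    "task": "sentiment-analysis",
    "model": "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "714eb0f",
}


def build_classifier():
    """Create the classifier with an explicit model and revision."""
    # Imported lazily so this block runs even where transformers is not installed.
    from transformers import pipeline

    return pipeline(**PINNED)


print(PINNED["model"], "@", PINNED["revision"])
```

Pinning both values means a later change to the library's default model for the task does not silently change the classifier's behavior.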

## Image captioning task pipeline

In [8]:
img_to_text = pipeline(task="image-to-text", device=0, model="Salesforce/blip-image-captioning-base")
img_to_text("images/tom_jerry.png")

Device set to use cuda:0


[{'generated_text': 'tom and jerry'}]

In [9]:
import gradio as gr
gr.Interface.from_pipeline(img_to_text).launch(debug=False, share=True)

* Running on local URL:  http://127.0.0.1:7860

Could not create share link. Please check your internet connection or our status page: https://status.gradio.app.




