# Hugging Face Transformer pipeline

Coded and shared by Divya Patel, Microsoft.

## Introduction to Transformers
The Transformer is a deep learning architecture built on self-attention, and it underpins most recent AI models, including large language models.


## Introduction to Hugging Face
- Hugging Face is a company focused on natural language processing (NLP) that provides tools and libraries for NLP tasks. One of its most popular offerings is the Transformers library.
- [Hugging Face website](https://huggingface.co/)

### Transformers Library
- The Transformers library is a powerful and easy-to-use library for applying state-of-the-art NLP models to various tasks.
- It provides pre-trained models for tasks like text classification, named entity recognition, question answering, and more.

### Hugging Face Hub

- Hugging Face Hub is a platform provided by Hugging Face where you can find a wide range of pre-trained models and datasets.
- It allows you to easily access and use these models and datasets in your own projects.

In [1]:
# Install the transformers and gradio libraries with pip
!pip install transformers gradio

## Hugging Face Pipeline

The Hugging Face Pipeline is a high-level API that provides a simple and convenient way to use pre-trained models for various NLP tasks. It abstracts away the complexities of model loading, tokenization, and inference, allowing you to focus on the task at hand.

With the Hugging Face Pipeline, you can perform tasks such as text generation, sentiment analysis, named entity recognition, question answering, and more, with just a few lines of code.
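
To make the three stages concrete, here is a toy, pure-Python sketch of the preprocess, inference, and postprocess flow that a pipeline wraps. It is not the library's implementation: the class `ToySentimentPipeline`, its vocabulary, and its weights are invented for illustration, and a real pipeline uses a trained tokenizer and model instead.

```python
import math


class ToySentimentPipeline:
    """Toy stand-in that mirrors the three stages a pipeline hides."""

    # Hypothetical vocabulary with made-up weights (not from any real model).
    VOCAB = {"happy": 2.0, "great": 1.5, "sad": -2.0, "bad": -1.5}

    def preprocess(self, text):
        # "Tokenization": lowercase the text and split on whitespace.
        return text.lower().split()

    def forward(self, tokens):
        # "Inference": sum the token weights into a single logit.
        return sum(self.VOCAB.get(tok.strip(".,!?"), 0.0) for tok in tokens)

    def postprocess(self, logit):
        # Convert the logit into a label and a probability-like score.
        prob_positive = 1.0 / (1.0 + math.exp(-logit))
        if prob_positive >= 0.5:
            return [{"label": "POSITIVE", "score": prob_positive}]
        return [{"label": "NEGATIVE", "score": 1.0 - prob_positive}]

    def __call__(self, text):
        # A single call chains the three stages, as a pipeline call does.
        return self.postprocess(self.forward(self.preprocess(text)))


toy = ToySentimentPipeline()
result = toy("We are very happy today.")
print(result)
```

The caller only sees the final list of dictionaries, which is the same output shape the real sentiment pipeline returns later in this notebook.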


## Text generation task pipeline

In [2]:
from transformers import pipeline

pipe = pipeline("text-generation", model="microsoft/phi-1_5")  # pass device=0 to run on the first GPU

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


In [3]:
output = pipe('''def print_prime(n):
   """
   Print all primes between 1 and n
   """''', max_length=100)

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.


In [4]:
print(output[0]['generated_text'])

def print_prime(n):
   """
   Print all primes between 1 and n
   """
   primes = []
   for num in range(2, n+1):
       is_prime = True
       for i in range(2, int(math.sqrt(num))+1):
           if num % i == 0:
               is_prime = False
               break
       if is_prime:
           primes
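
The generated function stops at `primes`, most likely because `max_length=100` caps the prompt and the generated tokens together. A minimal sketch of one alternative, assuming you would rather budget only the newly generated tokens with `max_new_tokens`: the helper `complete` and the stand-in `fake_pipe` are hypothetical names introduced here so the example runs offline.

```python
def complete(pipe, prompt, max_new_tokens=200):
    """Return only the continuation produced by a text-generation pipeline."""
    # max_new_tokens budgets generated tokens only; max_length also counts the prompt.
    outputs = pipe(prompt, max_new_tokens=max_new_tokens)
    full_text = outputs[0]["generated_text"]
    # By default the pipeline echoes the prompt before the continuation.
    if full_text.startswith(prompt):
        return full_text[len(prompt):]
    return full_text


def fake_pipe(prompt, **generate_kwargs):
    """Hypothetical stand-in for a real pipeline so the sketch runs offline."""
    return [{"generated_text": prompt + " -> continuation"}]


continuation = complete(fake_pipe, "def add(a, b):")
print(continuation)
```

With the real model, the call would be `complete(pipe, prompt)`, where `pipe` is the text-generation pipeline created in the earlier cell.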


In [5]:
from transformers import pipeline

pipe = pipeline("text-generation", model="openai-community/gpt2")  # pass device=0 to run on the first GPU
output = pipe("once upon a time", max_length=100)
print(output[0]['generated_text'])

Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


once upon a time."

The report of the tribunal will take place in a city court here on December 22. It is expected to take place amid uproar over an official's alleged involvement with China's so-called "one wall" policy.

"In this context, the legal consequences of the alleged misconduct should be reviewed, as will whether the evidence is sufficiently significant to warrant a criminal or other charge against the complainant," it said.

This report is being welcomed by the US


## Sentiment analysis task pipeline

In [6]:
classifier = pipeline("sentiment-analysis") # If you don't specify a model, it will use the default one
classifier("We are very happy to show you the 🤗 Transformers library.")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'label': 'POSITIVE', 'score': 0.9997795224189758}]
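
The classifier returns a list with one dictionary per input, each holding a `label` and a `score`. As a small sketch of how that output might be consumed, the helper below keeps a label only when its score clears a threshold. The name `confident_labels`, the 0.9 threshold, and the second sample entry are assumptions made for illustration.

```python
def confident_labels(results, threshold=0.9):
    """Return each result's label, or 'UNCERTAIN' when the score is below threshold."""
    return [r["label"] if r["score"] >= threshold else "UNCERTAIN" for r in results]


results = [
    {"label": "POSITIVE", "score": 0.9997795224189758},  # output shown above
    {"label": "NEGATIVE", "score": 0.62},  # hypothetical low-confidence result
]
labels = confident_labels(results)
print(labels)
```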

In [7]:
classifier.model.name_or_path

'distilbert/distilbert-base-uncased-finetuned-sst-2-english'
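
The log message above advises against relying on the default model in production. One way to follow that advice, sketched here under the assumption that the model name and revision printed in the log are the ones you want, is to pass both explicitly. `PINNED_CLASSIFIER` and `build_classifier` are hypothetical names.

```python
# Model name and revision are copied from the log message printed by the notebook.
PINNED_CLASSIFIER = {
    "task": "sentiment-analysis",
    "model": "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "af0f99b",
}


def build_classifier():
    """Create the sentiment pipeline with an explicit model and revision."""
    # Imported lazily so the configuration can be inspected without transformers.
    from transformers import pipeline

    return pipeline(**PINNED_CLASSIFIER)


print(PINNED_CLASSIFIER["model"])
```

Calling `build_classifier()` downloads the pinned weights on first use and returns a pipeline that can be used exactly like `classifier` above.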

## Image captioning task pipeline

In [8]:
img_to_text = pipeline(task="image-to-text", model="Salesforce/blip-image-captioning-base")
img_to_text("https://www.maacindia.com/blog/wp-content/uploads/2016/08/Tom_and_Jerry_s_75th_anniversary_proves_cat_and_mouse_games_never_get_old-1.jpg")

[{'generated_text': 'tom and jerry'}]
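
The captioning pipeline returns a list of dictionaries with a `generated_text` key. A minimal sketch of captioning several images in a loop follows. The helper `caption_all`, the stand-in `fake_captioner`, and the file names are hypothetical and exist only so the example runs without downloading a model.

```python
def caption_all(captioner, sources):
    """Map each image source (URL or local path) to its generated caption."""
    captions = {}
    for source in sources:
        # Each call returns a list like [{'generated_text': '...'}].
        outputs = captioner(source)
        captions[source] = outputs[0]["generated_text"]
    return captions


def fake_captioner(source):
    """Hypothetical stand-in so the sketch runs without downloading a model."""
    return [{"generated_text": f"caption for {source}"}]


captions = caption_all(fake_captioner, ["cat.jpg", "dog.jpg"])
print(captions)
```

With the real model, `caption_all(img_to_text, [...])` would take a list of image URLs or paths in place of the placeholder file names.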

In [9]:
# Optional: uncomment to serve the captioning pipeline as a Gradio web demo
#import gradio as gr
#gr.Interface.from_pipeline(img_to_text).launch(debug=False, share=True)
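
As an alternative to `from_pipeline`, the interface can be assembled by hand, which makes it easier to post-process the caption before it is displayed. This is a sketch under stated assumptions: `make_caption_fn` and `build_demo` are hypothetical helper names, and the lambda is a stand-in captioner used so the wrapper can be tried offline.

```python
def make_caption_fn(captioner):
    """Wrap a captioning pipeline as a plain image -> string function."""
    def caption(image):
        # Keep only the caption text from the pipeline's list-of-dicts output.
        return captioner(image)[0]["generated_text"]
    return caption


def build_demo(captioner):
    """Build (but do not launch) a Gradio interface around the captioner."""
    import gradio as gr  # imported lazily; requires `pip install gradio`

    return gr.Interface(
        fn=make_caption_fn(captioner),
        inputs=gr.Image(type="pil"),  # hand the function a PIL image
        outputs="text",
    )


# Hypothetical stand-in captioner so the wrapper can be tried offline.
caption = make_caption_fn(lambda image: [{"generated_text": "tom and jerry"}])
print(caption(None))
```

In the notebook, `build_demo(img_to_text).launch()` would start the demo with the real captioning pipeline.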