# Working with the Transformers Library for Different Natural Language Processing Tasks


In this notebook we will use the **Transformers Library** by [Hugging Face](https://huggingface.co/). The library gives access to thousands of pre-trained models, many of which were trained on huge datasets for thousands of GPU hours. You can use them either directly for inference (as we will do in this lab session) or fine-tune them for your specific application. Using pre-trained models reduces your compute costs and carbon footprint and saves the time and resources required to develop a model from scratch.

To create user interfaces we will use Gradio, which you already know from the previous session on using ML APIs.

In [None]:
!pip install gradio
!pip install "transformers[sentencepiece]"
!pip install torch

The Hugging Face [ModelHub](https://huggingface.co/models) hosts pre-trained models for many different tasks, which can be downloaded and used easily with the Transformers Library.

##  Transformers for Translation Tasks

The easiest way to use a pre-trained model for inference is the **pipeline**. Pipelines work out of the box for many tasks across modalities (e.g., text and images). In this lab session we will look at a subset that includes translation, text classification, and text generation.

<table>
  <tr>
    <th>Task</th>
    <th>Description</th>
    <th>Pipeline identifier</th>
  </tr>
  <tr>
    <td>Translation</td>
    <td>translate text from one language into another</td>
    <td>pipeline(task="translation")</td>
  </tr>
  <tr>
    <td>Text classification</td>
    <td>assign a label to a given sequence of text</td>
    <td>pipeline(task="sentiment-analysis")</td>
  </tr>
  <tr>
    <td>Text generation</td>
    <td>generate text that follows a given prompt</td>
    <td>pipeline(task="text-generation")</td>
  </tr>
</table>

For a comprehensive overview, see the [pipeline section of the quick tour](https://huggingface.co/docs/transformers/main/en/quicktour#pipeline).


As a first example we will explore the pipeline for translating text from one language into another, which follows the task name pattern `pipeline("translation_xx_to_yy")`. For example, to translate from English to German you can use `pipeline("translation_en_to_de")`.

The pipeline downloads and caches a default pretrained model. You can then use it on your target text, e.g., `response = translator(text)`.  
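
The task name pattern can be captured in a small helper. This is only an illustrative sketch: `translation_task` is a hypothetical helper and not part of the Transformers library, and whether a default model is available for a given language pair depends on the library.

```python
def translation_task(source_lang: str, target_lang: str) -> str:
    """Build a pipeline task name such as 'translation_en_to_de' (hypothetical helper)."""
    return f"translation_{source_lang}_to_{target_lang}"


print(translation_task("en", "de"))  # translation_en_to_de
```

The resulting string is what gets passed to `pipeline(...)` as the task identifier.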

In [None]:
## Using the transformers library for a translation task
## Gradio is used for the user interface 

from transformers import pipeline
import gradio as grad

translator = pipeline("translation_en_to_de")

text = "Transformers provides APIs and tools to easily download and train state-of-the-art pretrained models. Using pretrained models can reduce your compute costs, carbon footprint, and save you the time and resources required to train a model from scratch. "
response = translator(text)
print(response)

Rather than relying on the default model, we will now use a **specific model** for the translation task by passing its name to the pipeline via the `model` parameter.

We will use `mdl_name = "Helsinki-NLP/opus-mt-en-de"`. You can find details concerning the model on its [model card](https://huggingface.co/Helsinki-NLP/opus-mt-en-de) in the ModelHub.

In [None]:
#Using the pipeline with a specific model
mdl_name = "Helsinki-NLP/opus-mt-en-de"
translator = pipeline("translation", model=mdl_name)

text = "Transformers provides APIs and tools to easily download and train state-of-the-art pretrained models. Using pretrained models can reduce your compute costs, carbon footprint, and save you the time and resources required to train a model from scratch. "
response = translator(text)
print(response)

[{'translation_text': 'Transformers bietet APIs und Tools zum einfachen Herunterladen und Trainieren modernster vortrainierter Modelle. Mit vortrainierten Modellen können Sie Ihre Rechenkosten, Ihren CO2-Fußabdruck reduzieren und die Zeit und Ressourcen sparen, die Sie benötigen, um ein Modell von Grund auf zu trainieren.'}]
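
As the output above shows, the pipeline returns a list with one dictionary per input text, and the translation is stored under the key `translation_text`. A minimal sketch of extracting the plain string, using a shortened copy of the output above instead of a real pipeline call:

```python
# Shortened copy of the pipeline output shown above (no model is called here)
response = [{"translation_text": "Transformers bietet APIs und Tools zum einfachen Herunterladen ..."}]

# One dict per input text; the translated string is stored under 'translation_text'
translations = [item["translation_text"] for item in response]
print(translations[0])
```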


Next, we will add a graphical user interface using Gradio, which lets you enter text in one textbox and shows the translation in another.

In [None]:
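# User interface
def translate(text):
  # The pipeline returns a list with one dict per input; return only the translated string
  response = translator(text)
  return response[0]["translation_text"]

grad.Interface(translate, inputs=["text",], outputs="text").launch(debug=True)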


### Exercise

Now try it yourself and get your hands dirty.

Modify the example to translate from English to Spanish. Search the [ModelHub](https://huggingface.co/models) for a model that translates from English to Spanish.

In [None]:
## Add your code to translate from English to Spanish here

from transformers import pipeline
import gradio as grad

mdl_name = "Helsinki-NLP/opus-mt-en-es"
opus_translator = pipeline("translation", model=mdl_name)

def translate(text):
  # Return only the translated string from the pipeline result
  response = opus_translator(text)
  return response[0]["translation_text"]

grad.Interface(translate, inputs=["text",], outputs="text").launch()




##  Transformers for Text Classification 

Next, we use the Transformers library for a text classification task (using `pipeline("sentiment-analysis")`).

We use a specific model, `mdl_name = "siebert/sentiment-roberta-large-english"`, which is a fine-tuned checkpoint of a RoBERTa large model. If you want to see the scores for all classes and not just the predicted class, pass `return_all_scores=True` (newer versions of Transformers deprecate this flag in favor of `top_k=None`).
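
When all scores are returned, the result for one input text is a list of `{'label': ..., 'score': ...}` dictionaries. The following sketch shows how such a result could be turned into a label-to-score mapping and how the top label can be picked. The scores are made up for illustration, the helper `to_label_scores` is hypothetical, and the label names are assumed to be `POSITIVE` and `NEGATIVE`.

```python
def to_label_scores(result):
    """Convert a list of {'label', 'score'} dicts into a {label: score} dict (hypothetical helper)."""
    # Depending on the library version, the scores for a single text may be
    # wrapped in an extra list; unwrap it if necessary.
    if result and isinstance(result[0], list):
        result = result[0]
    return {entry["label"]: entry["score"] for entry in result}


# Made-up scores for illustration, not real model output
example_result = [[{"label": "NEGATIVE", "score": 0.02}, {"label": "POSITIVE", "score": 0.98}]]

scores = to_label_scores(example_result)
top_label = max(scores, key=scores.get)
print(top_label, scores[top_label])
```

A mapping of this form can also be displayed with Gradio's `Label` output component instead of a plain textbox.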

In [None]:
## Using the transformers library for a sentiment-analysis task
## Gradio is used for the user interface 

from transformers import pipeline
import gradio as grad

mdl_name = "siebert/sentiment-roberta-large-english"
roberta_pipe = pipeline("sentiment-analysis", model=mdl_name, return_all_scores=True)

def sentiment(text):
  response = roberta_pipe(text)
  return response

txt=grad.Textbox(lines=1, label="Your input", placeholder="Put your text here")
grad.Interface(sentiment, inputs=txt, outputs="text").launch()
#grad.Interface(sentiment, inputs=txt, outputs="json").launch()





## Transformers for Text Generation

Next, we use the Transformers library for a text generation task (using `pipeline("text-generation")`).

We use a specific model (`model="gpt2"`). We configure the call to `generator` by limiting the length of the output in tokens, prompt included (using `max_length`), and by specifying the number of sequences to generate (using `num_return_sequences`).

In [15]:
from transformers import pipeline, set_seed
generator = pipeline('text-generation', model='gpt2')
set_seed(42)
generator("Paris is", max_length=30, num_return_sequences=5)


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'Paris is far more than a game; in fact, it is an exhibition of an art that could be seen as a showcase for the development of art'},
 {'generated_text': 'Paris is not an ordinary region. In fact, its population is more than twice as large as the total population of Rome: between three quarters of a'},
 {'generated_text': "Paris is still a city of many contradictions. The fact that I'm still going.\n\nBut there are ways to make peace!\n\nWith"},
 {'generated_text': 'Paris is a great place because of its rich soils, its natural environment and its great natural beauty.\n\nThe city has two main centers with beautiful'},
 {'generated_text': 'Paris is a pretty decent place to live today. Many of our cities have been destroyed, and it is now cheaper to buy a house and buy a'}]

## Combining Label Detection with Translation

We will now combine the Google Cloud Vision code you already know from the previous session with a translation task: text detected in an image is translated from German to English.

In [None]:
# Install the Google Cloud Vision library, which allows us to make requests to the Google Cloud Vision API
!pip install google-cloud-vision

In [None]:
# Import the libraries
from google.cloud import vision
import os

In [None]:
# Using the GOOGLE_APPLICATION_CREDENTIALS environment variable the location of a credential JSON file can be provided. 
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'My Project-543e6ed386ee.json'

In [None]:
# Instantiate the client (this only works if the credentials are set correctly)
client = vision.ImageAnnotatorClient()

In [None]:
#Import libraries
from PIL import Image as PillowImage
import io

In [None]:
# Adapted from https://stackoverflow.com/questions/33101935/convert-pil-image-to-byte-array
# Conversion of image into byte array
def image_to_byte_array(filePath) -> bytes:
  # Open the file as a PIL image
  pil = PillowImage.open(filePath, "r")
  # BytesIO is an in-memory file-like object
  imgByteArr = io.BytesIO()
  # Image.save expects a file-like object, so we pass the BytesIO object
  pil.save(imgByteArr, format=pil.format)
  # Read the contents of the BytesIO object as bytes
  imgByteArr = imgByteArr.getvalue()
  return imgByteArr
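
If the image does not need to be re-encoded, the raw bytes of the file can also be read directly. The following sketch uses only the Python standard library; `file_to_bytes` is a hypothetical alternative, and it assumes that the file is already stored in an image format the API accepts.

```python
from pathlib import Path


def file_to_bytes(file_path: str) -> bytes:
    """Return the raw content of a file as bytes (hypothetical alternative helper)."""
    return Path(file_path).read_bytes()
```

Unlike `image_to_byte_array`, this variant does not decode the image, so it cannot detect a corrupt or unsupported file before the request is sent.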

In [None]:
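from transformers import pipeline

# Load the German-to-English translation model once instead of on every request
mdl_name = "Helsinki-NLP/opus-mt-de-en"
de_en_translator = pipeline("translation", model=mdl_name)

def detect_and_translate_text(filePath):
  # Detect text in the image with the Google Cloud Vision API
  image = vision.Image()
  image.content = image_to_byte_array(filePath)
  response_text = client.text_detection(image=image)

  # The first annotation contains the full detected text
  if not response_text.text_annotations:
    return ""
  text = response_text.text_annotations[0].description

  # Translate the detected text and return only the translated string
  response = de_en_translator(text)
  return response[0]["translation_text"]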

In [None]:
import gradio as grad
demo = grad.Interface(
    detect_and_translate_text, 
    inputs=grad.Image(type="filepath"),
    outputs = grad.TextArea()
)
demo.launch(debug=True)







##  Transformers for Text Summarization

We will now try a text summarization task. As pre-trained model we will rely on `deep-learning-analytics/wikihow-t5-small`. For more info on the model you can visit the [ModelHub](https://huggingface.co/deep-learning-analytics/wikihow-t5-small).

**Create your own pipeline**

Since the default pipelines only support a limited set of scenarios, we now look at how to create our own pipeline.

In a text summarization task we take a piece of text as input and want to return a summary of it. However, we cannot feed the input text directly into the language model: the model expects a tensor of token IDs, i.e., integers that index into the model's vocabulary. For this we use a tokenizer, which (1) splits the text into words and sub-words (tokens) and (2) maps each token to an integer.
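
The two steps can be illustrated with a toy tokenizer. This is only a sketch to convey the idea: the vocabulary, the `##` marker for word continuations, and the greedy longest-match splitting are assumptions chosen for illustration and do not reproduce the SentencePiece tokenizer used by the model below.

```python
# Hypothetical toy vocabulary: token -> integer ID
VOCAB = {"<unk>": 0, "hello": 1, "my": 2, "name": 3, "is": 4, "bar": 5, "##bara": 6, ",": 7}


def split_word(word):
    """Step 1: split one word into sub-words using greedy longest-match."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            # Continuation pieces are marked with '##' in this toy vocabulary
            candidate = word[start:end] if start == 0 else "##" + word[start:end]
            if candidate in VOCAB:
                pieces.append(candidate)
                break
            end -= 1
        else:
            # No known sub-word starts at this position
            return ["<unk>"]
        start = end
    return pieces


def encode(text):
    """Split the text into tokens (step 1) and map each token to its ID (step 2)."""
    words = text.lower().replace(",", " , ").split()
    tokens = [piece for word in words for piece in split_word(word)]
    ids = [VOCAB[token] for token in tokens]
    return tokens, ids


tokens, ids = encode("Hello, my name is Barbara")
print(tokens)  # ['hello', ',', 'my', 'name', 'is', 'bar', '##bara']
print(ids)     # [1, 7, 2, 3, 4, 5, 6]
```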

On a more technical level we use a `Tokenizer`, more specifically an `AutoTokenizer`, which we can instantiate from the files stored with a model on the Hugging Face Hub (i.e., `tokenizer = AutoTokenizer.from_pretrained(model_name)`).

A `Tokenizer` can then be used to encode the input text (here `text_input_sequence`) and provide the encoding required by the language model (e.g., `encoded_text = tokenizer.encode(text_input_sequence, return_tensors="pt")`; `pt` stands for PyTorch).


In [2]:
from transformers import AutoTokenizer
import gradio as grad

model_name = "deep-learning-analytics/wikihow-t5-small" 
tokenizer = AutoTokenizer.from_pretrained(model_name)

text_input_sequence = "Hello, my name is Barbara"
#Longer text snippet for experimentation - taken from https://huggingface.co/deep-learning-analytics/wikihow-t5-small
#text_input_sequence = """Lack of fluids can lead to dry mouth, which is a leading cause of bad breath. Water can also dilute any chemicals in your mouth or gut that are causing bad breath., Studies show that eating 6 ounces of yogurt a day reduces the level of odor-causing compounds in the mouth. In particular, look for yogurt containing the active bacteria Streptococcus thermophilus or Lactobacillus bulgaricus., The abrasive nature of fibrous fruits and vegetables helps to clean teeth, while the vitamins, antioxidants, and acids they contain improve dental health.Foods that can be particularly helpful include:Apples — Apples contain vitamin C, which is necessary for health gums, as well as malic acid, which helps to whiten teeth.Carrots — Carrots are rich in vitamin A, which strengthens tooth enamel.Celery — Chewing celery produces a lot of saliva, which helps to neutralize bacteria that cause bad breath.Pineapples — Pineapples contain bromelain, an enzyme that cleans the mouth., These teas have been shown to kill the bacteria that cause bad breath and plaque., An upset stomach can lead to burping, which contributes to bad breath. Don’t eat foods that upset your stomach, or if you do, use antacids. If you are lactose intolerant, try lactase tablets., They can all cause bad breath. If you do eat them, bring sugar-free gum or a toothbrush and toothpaste to freshen your mouth afterwards., Diets low in carbohydrates lead to ketosis — a state in which the body burns primarily fat instead of carbohydrates for energy. This may be good for your waistline, but it also produces chemicals called ketones, which contribute to bad breath.To stop the problem, you must change your diet. Or, you can combat the smell in one of these ways:Drink lots of water to dilute the ketones.Chew sugarless gum or suck on sugarless mints.Chew mint leaves."""
encoded_text = tokenizer.encode(text_input_sequence,return_tensors="pt") #pt for pytorch and tf for tensorflow
print(encoded_text)

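#For testing purposes we use the method convert_ids_to_tokens to convert the IDs into tokens
tokens = tokenizer.convert_ids_to_tokens(encoded_text[0])
print(tokens)

#Decoding the IDs (without special tokens such as </s>) returns the original text
print(tokenizer.decode(encoded_text[0], skip_special_tokens=True))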


tensor([[ 8774,     6,    82,   564,    19, 13163,     1]])
['▁Hello', ',', '▁my', '▁name', '▁is', '▁Barbara', '</s>']
Hello, my name is Barbara


Now that we have looked into how a `Tokenizer` can be used, we will look into how we can use the language model to generate a summary. 

First, we download the pre-trained model and instantiate it, in the same way as we did for the tokenizer in the step above.

```
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
```
Then we pass the encoded text as a tensor to the language model and use the model to generate the summary (i.e., `summary_ids`). The following parameters are used:

* `tkn_text`: the tensor with the token IDs of the input text
* `max_length`: limits the length of the generated sequence (in tokens)
* `num_beams`: number of beams for beam search; the default of 1 means no beam search
* `repetition_penalty`: the parameter for the repetition penalty; the default of 1.0 means no penalty
* `length_penalty`: the parameter for the length penalty used with beam search; values > 0 promote longer sequences, values < 0 shorter sequences
* `early_stopping`: defaults to `False`; when `True`, beam search stops as soon as `num_beams` complete candidate sequences have been found

```
summary_ids = model.generate(
          tkn_text,
          max_length=250, 
          num_beams=5,
          repetition_penalty=2.5, 
          length_penalty=1.0, 
          early_stopping=True
)
```
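
To get an intuition for `length_penalty`, the following sketch mimics how beam search can rank finished candidates: the sum of the token log-probabilities is divided by the sequence length raised to the power of `length_penalty`. The log-probabilities are made up, and the function `beam_score` is a simplified illustration rather than the library's implementation.

```python
def beam_score(token_log_probs, length_penalty):
    """Length-normalised score of a candidate: sum(log p) / length ** length_penalty."""
    return sum(token_log_probs) / (len(token_log_probs) ** length_penalty)


# Made-up log-probabilities for a short and a long candidate summary
short_candidate = [-0.5, -0.5]               # 2 tokens
long_candidate = [-0.4, -0.4, -0.4, -0.4]    # 4 tokens

for penalty in (0.0, 1.0, 2.0):
    short = beam_score(short_candidate, penalty)
    long = beam_score(long_candidate, penalty)
    winner = "long" if long > short else "short"
    print(f"length_penalty={penalty}: short={short:.3f}, long={long:.3f} -> {winner}")
```

Because log-probabilities are negative, dividing by a larger power of the length moves the score of long candidates closer to zero, which is why larger values favor longer summaries.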

Finally, we use the `Tokenizer` to decode the result provided by the language model (i.e., the tensor with the IDs) into text, i.e., `summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)`.

In [None]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import gradio as grad

# Define the model repo
model_name = "deep-learning-analytics/wikihow-t5-small" 

# Download model
txt2txt_tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

def text2text_summary(paragraph):
    #Replace line breaks with spaces so that words are not glued together
    initial_txt = paragraph.strip().replace("\n"," ")
    #Encode initial_txt as tensor
    tkn_text = txt2txt_tokenizer.encode(initial_txt,return_tensors="pt") 
    #Generate summary using model
    summary_ids = model.generate(
            tkn_text,
            max_length=250, 
            num_beams=5,
            repetition_penalty=2.5, 
            length_penalty=1.0, 
            early_stopping=True
    )
    #print(summary_ids)
    #Decode the list of IDs 
    response = txt2txt_tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return response

#Gradio User Interface
para=grad.Textbox(lines=10, label="Paragraph", placeholder="Add your text here")
out=grad.Textbox(lines=1,label="Summary")
grad.Interface(text2text_summary, inputs=para, outputs=out).launch(debug=True)



## Creating a Chatbot/Dialog Bot

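In this section we build a simple chatbot with `microsoft/DialoGPT-large`, a causal language model for dialog. The function `converse` encodes the new user input, appends it to the chat history (a list of token IDs that Gradio passes between calls via the `state` input and output), lets the model generate a reply, and converts the decoded conversation into (user, bot) pairs for the `chatbot` output.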

In [11]:
#from transformers.models.blenderbot.modeling_blenderbot import BlenderbotForConditionalGeneration
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-large")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-large")


def converse(user_input, chat_history=[]):
    # encode the new user input, append the eos_token and return a PyTorch tensor
    user_input_ids = tokenizer.encode(user_input + tokenizer.eos_token, return_tensors='pt')

    # append the new user input tokens to the chat history
    bot_input_ids = torch.cat([torch.LongTensor(chat_history), user_input_ids], dim=-1)

    # generate a response while limiting the total chat history to 1000 tokens
    chat_history = model.generate(bot_input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id).tolist()

    # decode the whole conversation and split it into the individual turns
    response = tokenizer.decode(chat_history[0]).split("<|endoftext|>")

    # convert the turns into a list of (user, bot) tuples
    response = [(response[i], response[i+1]) for i in range(0, len(response)-1, 2)]

    return response, chat_history


import gradio as gr
gr.Interface(fn=converse,
             theme="default",
             inputs=[gr.Textbox(placeholder="Let's chat"), "state"],
             outputs=["chatbot", "state"]).launch(debug=False)
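
The conversion of the decoded conversation into (user, bot) pairs can be tried out without loading the model. In this sketch the decoded string is made up; only the splitting and pairing logic is taken from `converse`.

```python
# Made-up decoded chat history; each turn ends with the end-of-text token
decoded = (
    "Hi there<|endoftext|>Hello! How are you?<|endoftext|>"
    "Fine, thanks<|endoftext|>Glad to hear that<|endoftext|>"
)

# Splitting yields the individual turns plus one trailing empty string
turns = decoded.split("<|endoftext|>")

# Pair every user turn with the bot turn that follows it
pairs = [(turns[i], turns[i + 1]) for i in range(0, len(turns) - 1, 2)]
print(pairs)
```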





## Using a Pre-trained Model for Image Classification

In the previous lab we used the Google Cloud Vision API to perform different ML tasks on images. Now we will explore the usage of a pre-trained model for image classification, i.e., for label detection in an image.

In [12]:
import tensorflow as tf
inception_net = tf.keras.applications.MobileNetV2()



In [13]:
import requests

# Download human-readable labels for ImageNet.
response = requests.get("https://git.io/JJkYN")
labels = response.text.split("\n")

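def classify_image(inp):
  # Add a batch dimension: the model expects input of shape (batch, 224, 224, 3)
  inp = inp.reshape((-1, 224, 224, 3))
  # Scale the pixel values to the range expected by MobileNetV2
  inp = tf.keras.applications.mobilenet_v2.preprocess_input(inp)
  # Predict the probabilities of the 1000 ImageNet classes
  prediction = inception_net.predict(inp).flatten()
  # Map each label to its confidence, as expected by gr.Label
  confidences = {labels[i]: float(prediction[i]) for i in range(1000)}
  return confidences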

In [16]:
gr.Interface(fn=classify_image, 
             inputs=gr.Image(shape=(224, 224)),
             outputs=gr.Label(num_top_classes=10)).launch()




In [5]:
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import requests

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

preprocessor = AutoImageProcessor.from_pretrained("google/mobilenet_v2_1.0_224")
model = AutoModelForImageClassification.from_pretrained("google/mobilenet_v2_1.0_224")

inputs = preprocessor(images=image, return_tensors="pt")

outputs = model(**inputs)
logits = outputs.logits

print(outputs)

# model predicts one of the 1000 ImageNet classes
predicted_class_idx = logits.argmax(-1).item()
print("Predicted class:", model.config.id2label[predicted_class_idx])


ImageClassifierOutputWithNoAttention(loss=None, logits=tensor([[ 0.2445, -1.1993,  0.1905,  ..., -1.1717, -0.1934,  0.0726]],
       grad_fn=<AddmmBackward0>), hidden_states=None)
Predicted class: tabby, tabby cat
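
The logits returned by the model are unnormalised scores. A softmax turns them into probabilities, from which the most likely classes can be read off. The sketch below uses plain Python with made-up logits and hypothetical label names so that it stays independent of the model.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities that sum to 1."""
    # Subtract the maximum for numerical stability
    largest = max(logits)
    exps = [math.exp(value - largest) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def top_k(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Made-up logits and labels for illustration, not real model output
example_logits = [0.2, 3.1, -1.2, 1.5]
example_labels = {0: "dog", 1: "tabby cat", 2: "car", 3: "tiger cat"}

for label, prob in top_k(example_logits, example_labels, k=2):
    print(f"{label}: {prob:.3f}")
```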
