<a href="https://colab.research.google.com/github/ric4234/AI-Fridays/blob/main/Analisi%20Di%20Testi/02_Translation_Summarization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Translation and Summarization





The goal of this exercise is to use one model to translate text into other languages and a second model to summarize text.





#### 1 - Install dependencies and create utils functions

First, we install the required libraries.

In [None]:
!pip install transformers
!pip install torch
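
After installing, it can help to confirm which versions are actually available in the environment. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of the original notebook.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is missing from the current environment.
            versions[name] = None
    return versions


print(installed_versions(["transformers", "torch"]))
```

A `None` value means the corresponding `pip install` did not take effect in the running kernel.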

Next, we suppress the library's warning messages so that only errors are shown.

In [3]:
from transformers.utils import logging
logging.set_verbosity_error()

#### 2 - Build and use a translation pipeline

Next, we create a translation pipeline using the nllb-200-distilled-600M model from Facebook (https://huggingface.co/facebook/nllb-200-distilled-600M). We chose this model because it is relatively small (600M parameters) and because it can translate between roughly 200 languages.
You can also find many other models on the Hugging Face Hub by filtering for the Translation task (https://huggingface.co/models?pipeline_tag=translation&sort=trending).

In [None]:
from transformers import pipeline
import torch
translator = pipeline(task="translation",
                      model="facebook/nllb-200-distilled-600M",
                      torch_dtype=torch.bfloat16)  # Load the weights in bfloat16: about half the memory of float32, usually with little loss in quality
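
To see why the `torch_dtype=torch.bfloat16` argument above matters, here is a rough back-of-the-envelope sketch of the memory needed for the weights alone. It assumes the nominal 600M parameter count from the model name (the exact count may differ) and ignores activations and other runtime overhead; the helper `weight_memory_gb` is hypothetical.

```python
# Bytes used to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_memory_gb(num_params, dtype):
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


num_params = 600_000_000  # nominal size taken from the model name (assumption)
for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gb(num_params, dtype):.1f} GB")
```

Under these assumptions the weights take about 2.4 GB in float32 and about 1.2 GB in bfloat16. The saving comes from storing each number with fewer bits, so small numerical differences in the output are possible.
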

Now that the translator is loaded, let's define the text to translate.

In [8]:
text = """\
My puppy is adorable, \
Your kitten is cute.
Her panda is friendly.
His llama is thoughtful. \
We all have nice pets!"""

The call below translates from English (eng_Latn) into Ligurian (lij_Latn). To choose other languages, look up their codes on this page: https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200

In [None]:
text_translated = translator(text,
                             src_lang="eng_Latn",
                             tgt_lang="lij_Latn")
text_translated
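
The translation pipeline returns a list with one dictionary per input, and the translated string sits under the `translation_text` key. Building on that, the sketch below translates the same text into several target languages in a loop. The helper `translate_to_many` is hypothetical (not part of the notebook or of transformers), and it assumes `translator` accepts the same `src_lang` and `tgt_lang` arguments used above.

```python
def translate_to_many(translator, text, src_lang, tgt_langs):
    """Translate `text` into each target language and return {code: translation}."""
    translations = {}
    for tgt_lang in tgt_langs:
        result = translator(text, src_lang=src_lang, tgt_lang=tgt_lang)
        # The pipeline returns a list with one dict per input text.
        translations[tgt_lang] = result[0]["translation_text"]
    return translations
```

For example, `translate_to_many(translator, text, "eng_Latn", ["ita_Latn", "fra_Latn", "deu_Latn"])` would return a dictionary with one Italian, one French, and one German translation.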

#### 3 - Build and use a summarization pipeline

In the following code, we build a summarization pipeline using the bart-large-cnn model (https://huggingface.co/facebook/bart-large-cnn).

As before, you can find other models that perform this task in the Models section of the Hugging Face Hub: https://huggingface.co/models?pipeline_tag=summarization

In [None]:
summarizer = pipeline(task="summarization",
                      model="facebook/bart-large-cnn",
                      torch_dtype=torch.bfloat16)

In [22]:
text = """Paris is the capital and most populous city of France, with
          an estimated population of 2,175,601 residents as of 2018,
          in an area of more than 105 square kilometres (41 square
          miles). The City of Paris is the centre and seat of
          government of the region and province of Île-de-France, or
          Paris Region, which has an estimated population of
          12,174,880, or about 18 percent of the population of France
          as of 2017."""
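
Because the string above is indented inside triple quotes, it contains newlines and runs of spaces. As an optional step, the text can be flattened to single spaces before it is passed to the model. This is a minimal sketch; the helper `normalize_whitespace` is hypothetical, and whether it changes the summary for this particular model has not been checked here.

```python
def normalize_whitespace(raw_text):
    """Collapse runs of spaces, tabs, and newlines into single spaces."""
    return " ".join(raw_text.split())


sample = """Paris is the capital and most populous city of France, with
          an estimated population of 2,175,601 residents as of 2018."""
print(normalize_whitespace(sample))
```
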

In [24]:
summary = summarizer(text,
                     min_length=10,
                     max_length=100)
summary

[{'summary_text': 'Paris is the capital and most populous city of France, with an estimated population of 2,175,601 residents as of 2018. The City of Paris is the centre and seat of the government of the region and province of Île-de-France.'}]
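
As the output above shows, the summarizer returns a list with one dictionary per input, and the summary string sits under the `summary_text` key. The sketch below extracts that string and compares word counts to give a rough idea of how much the text was shortened. The helper `summary_report` is hypothetical and not part of the notebook.

```python
def summary_report(original_text, summarizer_output):
    """Extract the summary and compare its length to the original, in words."""
    summary_text = summarizer_output[0]["summary_text"]
    original_words = len(original_text.split())
    summary_words = len(summary_text.split())
    # Guard against an empty original text to avoid dividing by zero.
    ratio = summary_words / original_words if original_words else 0.0
    return {
        "summary": summary_text,
        "original_words": original_words,
        "summary_words": summary_words,
        "ratio": ratio,
    }
```

Calling `summary_report(text, summary)` with the variables defined above returns the summary together with both word counts. Note that `min_length` and `max_length` are measured in tokens rather than words, so the word counts reported here will not match those limits exactly.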