# Lesson 3: Translation and Summarization

- In the classroom, the libraries are already installed for you.
- If you would like to run this code on your own machine, you can install the following:

```
    !pip install transformers 
    !pip install torch
```

- Here is some code that suppresses warning messages.

In [None]:
from transformers.utils import logging
logging.set_verbosity_error()

### Build the `translation` pipeline using 🤗 Transformers Library

In [1]:
from transformers import pipeline 
import torch

In [11]:
translator = pipeline(task="translation",
                      model="facebook/nllb-200-distilled-600M",
                      torch_dtype=torch.bfloat16) 

NLLB: No Language Left Behind: ['nllb-200-distilled-600M'](https://huggingface.co/facebook/nllb-200-distilled-600M).



In [3]:
text = """\
My puppy is adorable, \
Your kitten is cute.
Her panda is friendly.
His llama is thoughtful. \
We all have nice pets!"""

In [4]:
text_translated = translator(text,
                             src_lang="eng_Latn",
                             tgt_lang="tel_Telu")  # Telugu; use "fra_Latn" for French

To translate to or from other languages, look up their codes on the page: [Languages in FLORES-200](https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200)

For example:
- Afrikaans: afr_Latn
- Chinese: zho_Hans
- Egyptian Arabic: arz_Arab
- French: fra_Latn
- German: deu_Latn
- Greek: ell_Grek
- Hindi: hin_Deva
- Indonesian: ind_Latn
- Italian: ita_Latn
- Japanese: jpn_Jpan
- Korean: kor_Hang
- Persian: pes_Arab
- Portuguese: por_Latn
- Russian: rus_Cyrl
- Spanish: spa_Latn
- Swahili: swh_Latn
- Thai: tha_Thai
- Turkish: tur_Latn
- Vietnamese: vie_Latn
- Zulu: zul_Latn
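
Typing the codes by hand is error-prone, so here is a minimal sketch of a lookup helper. `FLORES_CODES` and `translate` are hypothetical names introduced for illustration (they are not part of `transformers`), and the table holds only a few of the codes listed above. The helper assumes the pipeline accepts `src_lang` and `tgt_lang` at call time, as in the cell above.

```python
# A few FLORES-200 codes copied from the list above, plus the two used in this lesson.
FLORES_CODES = {
    "English": "eng_Latn",
    "Telugu": "tel_Telu",
    "French": "fra_Latn",
    "German": "deu_Latn",
    "Hindi": "hin_Deva",
    "Spanish": "spa_Latn",
}


def translate(translator, text, target, source="English"):
    """Translate `text` by language name instead of raw code.

    Hypothetical helper: `translator` is expected to behave like the
    translation pipeline built earlier in this lesson.
    """
    try:
        src_lang, tgt_lang = FLORES_CODES[source], FLORES_CODES[target]
    except KeyError as err:
        raise ValueError(f"No FLORES-200 code registered for {err.args[0]!r}") from None
    result = translator(text, src_lang=src_lang, tgt_lang=tgt_lang)
    # The pipeline returns a list with one dict per input text.
    return result[0]["translation_text"]
```

With the pipeline from this lesson loaded, a call such as `translate(translator, text, "French")` should return the translated string directly.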

In [5]:
text_translated

[{'translation_text': 'నా కుక్కపిల్ల మనోహరంగా ఉంది, మీ పిల్లి అందంగా ఉంది. ఆమె పాండా స్నేహపూర్వకంగా ఉంది. అతని లామా శ్రద్ధగలది. మనందరికీ మంచి పెంపుడు జంతువులు ఉన్నాయి!'}]

In [12]:
text1 = """My name is siva and I am from India,working as software engineer"""

In [13]:
text_translated = translator(text1,
                             src_lang="eng_Latn",
                             tgt_lang="tel_Telu")  # Telugu; use "fra_Latn" for French
text_translated

[{'translation_text': 'నా పేరు శివ నేను భారతదేశం నుండి సాఫ్ట్వేర్ ఇంజనీర్ గా పనిచేస్తున్నాను'}]

## Free up some memory before continuing
- Run the following cells to free enough memory on the machine for the rest of the code.

In [14]:
import gc

In [15]:
del translator

In [16]:
gc.collect()

423
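
The number printed by `gc.collect()` is the count of unreachable objects the collector found. As a toy illustration of why both `del` and `gc.collect()` are used, the sketch below swaps the real pipeline for a hypothetical `Blob` class that holds a reference to itself. It is an assumption-laden stand-in and makes no claim about how the pipeline object is structured internally.

```python
import gc
import weakref


class Blob:
    """Hypothetical stand-in for a large object such as a model."""

    def __init__(self):
        self.payload = bytearray(1_000_000)  # roughly 1 MB of data
        self.myself = self                   # reference cycle keeps the object alive


blob = Blob()
probe = weakref.ref(blob)  # lets us check later whether the object still exists

del blob                   # removes the name; the cycle still holds the object
found = gc.collect()       # the cycle collector frees it and returns a count

print("unreachable objects found:", found)
print("object freed:", probe() is None)
```

If a model had been moved to a GPU, `torch.cuda.empty_cache()` is often called after these steps as well; the cells in this lesson do not use a GPU explicitly, so that call is not needed here.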

### Build the `summarization` pipeline using 🤗 Transformers Library

In [17]:
summarizer = pipeline(task="summarization",
                      model="facebook/bart-large-cnn",
                      torch_dtype=torch.bfloat16)

Model info: ['bart-large-cnn'](https://huggingface.co/facebook/bart-large-cnn)

In [18]:
text = """Paris is the capital and most populous city of France, with
          an estimated population of 2,175,601 residents as of 2018,
          in an area of more than 105 square kilometres (41 square
          miles). The City of Paris is the centre and seat of
          government of the region and province of Île-de-France, or
          Paris Region, which has an estimated population of
          12,174,880, or about 18 percent of the population of France
          as of 2017."""

In [19]:
summary = summarizer(text,
                     min_length=10,
                     max_length=100)

In [20]:
summary

[{'summary_text': 'Paris is the capital and most populous city of France, with an estimated population of 2,175,601 residents as of 2018. The City of Paris is the centre and seat of the government of the region and province of Île-de-France.'}]
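
The result is a list with one dictionary per input, so the summary string sits at `summary[0]['summary_text']`. Below is a minimal sketch of a wrapper that returns the string together with a rough compression ratio. `summarize` is a hypothetical helper name, and collapsing the whitespace before summarizing is an added assumption, not something the cell above does. Note that `min_length` and `max_length` are counted in tokens by the underlying generation step, so the word-based ratio is only a rough proxy.

```python
def summarize(summarizer, text, min_length=10, max_length=100):
    """Return (summary string, summary words / original words).

    Hypothetical helper: `summarizer` is expected to behave like the
    summarization pipeline built earlier in this lesson.
    """
    # Collapse the indentation and newlines left by the triple-quoted string.
    cleaned = " ".join(text.split())
    result = summarizer(cleaned, min_length=min_length, max_length=max_length)
    summary_text = result[0]["summary_text"]
    # Word counts are a rough proxy; the length limits themselves are in tokens.
    ratio = len(summary_text.split()) / max(len(cleaned.split()), 1)
    return summary_text, ratio
```

For example, `summary_text, ratio = summarize(summarizer, text)` would give the summary as a plain string and a ratio below 1 whenever the summary has fewer words than the input.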

### Try it yourself! 
- Try this model with your own texts!

In [25]:
text2 = """Both classic one stage detection methods, like boosted detectors, DPM & more recent methods like SSD evaluate almost 10^4 to 10^5 candidate locations per image but only a few locations contain objects (i.e. Foreground) and rest are just background objects. This leads to the class imbalance problem.

This imbalance causes two problems –

Training is inefficient as most locations are easy negatives (meaning that they can be easily classified by the detector as background) that contribute no useful learning.
Since easy negatives (detections with high probabilities) account for a large portion of inputs. Although they result in small loss values individually but collectively, they can overwhelm the loss & computed gradients and can lead to degenerated models.
In simple words, Focal Loss (FL) is an improved version of Cross-Entropy Loss (CE)  that tries to handle the class imbalance problem by assigning more weights to hard or easily misclassified examples  (i.e. background with noisy texture or partial object or the object of our interest ) and to down-weight easy examples (i.e. Background objects).

So Focal Loss reduces the loss contribution from easy examples and increases the importance of correcting misclassified examples.)

So, let’s first understand what Cross-Entropy loss for binary classification.
."""

In [26]:
summary = summarizer(text2,
                     min_length=10,
                     max_length=100)
summary

[{'summary_text': 'Focal Loss is an improved version of Cross-Entropy Loss (CE) It tries to handle the class imbalance problem by assigning more weights to hard or easily misclassified examples and to down-weight easy examples.'}]