# Translation and Summarization

- In the classroom, the libraries are already installed for you.
- If you would like to run this code on your own machine, you can install the following:

```
    !pip install transformers
    !pip install torch
```

In [1]:
!pip install -q transformers torch

- The following code sets the Transformers logging level to errors only, which suppresses warning messages.

In [2]:
from transformers.utils import logging
logging.set_verbosity_error()

### Build the `translation` pipeline using the 🤗 Transformers library

In [3]:
from transformers import pipeline
import torch

In [6]:
translator = pipeline(task="translation",
                      model="facebook/nllb-200-distilled-600M",
                      torch_dtype=torch.bfloat16)

The model is NLLB (No Language Left Behind): [`nllb-200-distilled-600M`](https://huggingface.co/facebook/nllb-200-distilled-600M).



In [11]:
text = """Anti-money laundering (AML) is the term for the legal framework and the methods that aim to stop criminals from making their illegal money look like legal earnings. It includes finding and stopping money laundering actions that use financial systems to conceal where the money came from. Financial institutions have to follow AML rules, such as checking the identity of their customers (KYC), tracking their transactions, freezing dubious accounts, and informing the authorities about any suspicious activities."""

In [12]:
text_translated = translator(text,
                             src_lang="eng_Latn",
                             tgt_lang="ben_Beng")

To translate to or from other languages, look up their codes on the page [Languages in FLORES-200](https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200) and pass them as `src_lang` and `tgt_lang`.
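
The codes used in the call above (`eng_Latn`, `ben_Beng`) pair a three-letter language code with a four-letter script name. As a minimal sketch, a check like the following could catch typos before calling the translator. The helper name `looks_like_flores_code` is hypothetical, and it only validates the shape of a code, not whether the model actually supports that language.

```python
import re

# Shape of a FLORES-200 code: three lowercase letters, an underscore,
# then a four-letter script name with a leading capital (e.g. eng_Latn).
_FLORES_CODE = re.compile(r"[a-z]{3}_[A-Z][a-z]{3}")


def looks_like_flores_code(code: str) -> bool:
    """Return True if `code` has the language_Script shape (hypothetical helper)."""
    return _FLORES_CODE.fullmatch(code) is not None


# Two well-formed codes from this notebook and two malformed ones.
for candidate in ["eng_Latn", "ben_Beng", "en", "eng-Latn"]:
    print(candidate, looks_like_flores_code(candidate))
```

A code that passes this check can still be missing from the FLORES-200 list, so the linked page remains the reference.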

In [13]:
text_translated

[{'translation_text': 'এএমএল (অন্তি-মনি লন্ডারিং) হল আইনি কাঠামো এবং পদ্ধতিগুলির জন্য একটি শব্দ যা অপরাধীদের তাদের অবৈধ অর্থকে আইনী উপার্জনের মতো দেখানোর লক্ষ্যে বাধা দেয়। এটি অর্থ প্রয়োগের জন্য আর্থিক ব্যবস্থা ব্যবহার করে অর্থ প্রয়োগের কার্যক্রমগুলি খুঁজে বের করে এবং বন্ধ করে দেয়। আর্থিক প্রতিষ্ঠানগুলিকে এএমএল নিয়মগুলি অনুসরণ করতে হবে, যেমন তাদের গ্রাহকদের পরিচয় পরীক্ষা করা (কেওয়াইসি), তাদের লেনদেনগুলি ট্র্যাক করা, সন্দেহজনক অ্যাকাউন্টগুলি হিমায়িত করা এবং কোনও সন্দেহজনক ক্রিয়াকলাপ সম্পর্কে কর্তৃপক্ষকে অবহিত করা।'}]
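
The pipeline returns a list with one dictionary per input, and the translated string sits under the `translation_text` key, as the output above shows. A minimal sketch of pulling out the string follows. It uses a short stand-in with the same structure (the placeholder text is an assumption, not real model output) so that it runs without loading the model.

```python
# Stand-in with the same structure as the `text_translated` output above;
# the string is an illustrative placeholder, not real model output.
example_output = [{"translation_text": "placeholder translation"}]

# One dict per input text, so index 0 holds the result for a single input.
translated_string = example_output[0]["translation_text"]
print(translated_string)
```

In the notebook itself, the same indexing applies to the real result: `text_translated[0]['translation_text']`.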

## Free up some memory before continuing
- The translation model is still loaded. Run the following cells to delete it and release its memory before loading the summarization model.

In [14]:
import gc

In [15]:
del translator

In [16]:
gc.collect()

104
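
`del` removes the name `translator`, and `gc.collect()` asks Python to reclaim objects that are no longer reachable. The number it returns (here `104`) is the count of unreachable objects it found. The sketch below illustrates the same pattern with a hypothetical `FakeModel` class standing in for the pipeline, and uses a weak reference to confirm that the object is gone.

```python
import gc
import weakref


class FakeModel:
    """Hypothetical stand-in for a large pipeline object."""

    def __init__(self):
        self.weights = bytearray(1_000_000)  # pretend these are model weights
        self.me = self  # reference cycle, so only the garbage collector frees it


model = FakeModel()
probe = weakref.ref(model)  # observes the object without keeping it alive

del model     # drop the only strong reference held by our code
gc.collect()  # collect the now-unreachable cycle

print("freed:", probe() is None)
```

When a model runs on a GPU, PyTorch may also keep cached device memory; `torch.cuda.empty_cache()` releases that cache.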

### Build the `summarization` pipeline using the 🤗 Transformers library

In [18]:
summarizer = pipeline(task="summarization",
                      model="facebook/bart-large-cnn",
                      torch_dtype=torch.bfloat16)

Model info: [`bart-large-cnn`](https://huggingface.co/facebook/bart-large-cnn)

In [21]:
text = """Anti-money laundering (AML) is the term for the legal framework and the methods that aim to stop criminals from making their illegal money look like legal earnings. It includes finding and stopping money laundering actions that use financial systems to conceal where the money came from. Financial institutions have to follow AML rules, such as checking the identity of their customers (KYC), tracking their transactions, freezing dubious accounts, and informing the authorities about any suspicious activities."""

In [22]:
summary = summarizer(text,
                     min_length=10,
                     max_length=100)

In [23]:
summary

[{'summary_text': 'Anti-money laundering (AML) is the term for the legal framework and the methods that aim to stop criminals from making illegal money look like legal earnings. Financial institutions have to follow AML rules, such as checking the identity of their customers.'}]
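
`min_length` and `max_length` bound the length of the generated summary in tokens, not in words or characters. As a rough check of how much the text was condensed, the sketch below compares whitespace-separated word counts of the input and of the summary shown above. Word count is only a proxy assumed here for illustration; it is not the tokenizer's count. The variable `summary_example` is a copy of the output above so the snippet runs without the model.

```python
original = (
    "Anti-money laundering (AML) is the term for the legal framework and the "
    "methods that aim to stop criminals from making their illegal money look "
    "like legal earnings. It includes finding and stopping money laundering "
    "actions that use financial systems to conceal where the money came from. "
    "Financial institutions have to follow AML rules, such as checking the "
    "identity of their customers (KYC), tracking their transactions, freezing "
    "dubious accounts, and informing the authorities about any suspicious "
    "activities."
)

# Copy of the pipeline output shown above (one dict per input text).
summary_example = [{
    "summary_text": (
        "Anti-money laundering (AML) is the term for the legal framework and "
        "the methods that aim to stop criminals from making illegal money look "
        "like legal earnings. Financial institutions have to follow AML rules, "
        "such as checking the identity of their customers."
    )
}]

summary_text = summary_example[0]["summary_text"]

# Whitespace-separated words as a rough length proxy (not model tokens).
original_words = len(original.split())
summary_words = len(summary_text.split())
ratio = summary_words / original_words

print(f"{original_words} words -> {summary_words} words ({ratio:.0%} of the original)")
```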

### Try it yourself!
- Try this model with your own texts!