## Trying 🤗 HuggingFace Transformers

Make sure you install the dependencies from `requirements.txt` before executing cells in this notebook.

In [2]:
from transformers import pipeline

Define the generator pipeline. This example uses the `text2text-generation` task with the `t5-base` model, which selects the NLP task to perform from a text prefix in the input.

In [3]:
generator = pipeline("text2text-generation", model="t5-base")

For now, this behavior is kept to avoid breaking backwards compatibility when padding/encoding with `truncation is True`.
- Be aware that you SHOULD NOT rely on t5-base automatically truncating your input to 512 when padding/encoding.
- If you want to encode/pad to sequences longer than 512 you can either instantiate this tokenizer with `model_max_length` or pass `max_length` when encoding/padding.


In [4]:
# Summarize
# Adjacent string literals are joined by Python, so no indentation leaks into the prompt
generator(
    "summarize: Machine Learning in production environments "
    "is largely seen as the ultimate goal. "
    "Sometimes, deploying models can be difficult when automation is not part of the workflow. "
    "Creating a foundational process that is reliable and automated is complex "
    "and requires commitment from the team and the organization as a whole"
)

[{'generated_text': 'machine learning is a key to a successful production environment . a foundational process'}]
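
The pipeline returns a list with one dictionary per generated sequence, as the output above shows. Here is a minimal sketch for pulling out just the text, assuming that list-of-dicts shape. `first_text` is a hypothetical helper name, and the sample list is copied from the output above so the snippet runs without loading a model:

```python
def first_text(result):
    """Return the generated text of the first sequence in a pipeline result.

    Assumes the list-of-dicts shape shown in the output above.
    """
    return result[0]["generated_text"]


# Sample result copied from the summarization output above.
sample = [{'generated_text': 'machine learning is a key to a successful production environment . a foundational process'}]
print(first_text(sample))
```

With a live pipeline, the same helper would wrap the call directly, for example `first_text(generator("summarize: ..."))`.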

In [5]:
# Sentiment
generator("sst2 sentence: Automation takes hard work but allows you to have a solid deployment")

[{'generated_text': 'positive'}]

In [6]:
# Questions
generator("question: Is deploying models into production hard?")

[{'generated_text': 'not_entailment'}]
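
The `not_entailment` label suggests that the model read this input as an entailment-style classification task instead of answering the question. T5 relies on task prefixes, and the question-answering format in the T5 paper pairs the question with a passage (`question: ... context: ...`), so a bare question gives the model nothing to extract an answer from. The sketch below builds prompts in that style. `PREFIXES`, `build_prompt`, and `build_qa_prompt` are hypothetical names, the prefix table only covers the prefixes used in this notebook, and the context sentence is borrowed from the summarization example:

```python
# Task prefixes used in this notebook (not an exhaustive list for T5).
PREFIXES = {
    "summarize": "summarize: ",
    "sentiment": "sst2 sentence: ",
    "translate_en_fr": "translate English to French: ",
}


def build_prompt(task, text):
    """Prepend the task prefix that tells T5 which task to perform."""
    return PREFIXES[task] + text


def build_qa_prompt(question, context):
    """Build a question-answering prompt that includes the passage to answer from."""
    return f"question: {question} context: {context}"


prompt = build_qa_prompt(
    "Is deploying models into production hard?",
    "Deploying models can be difficult when automation is not part of the workflow.",
)
print(prompt)
# Pass the prompt to the pipeline defined earlier: generator(prompt)
```

Whether `t5-base` gives a useful answer for this prompt is not shown here; the point is only that the input should carry both the question and a context passage.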

In [7]:
# Translation
generator("translate English to French: Automation takes hard work but allows you to have a solid deployment")

[{'generated_text': "L'automatisation exige beaucoup de travail, mais vous permet d'avoir un dé"}]
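
The French output above stops mid-word, which is consistent with the pipeline hitting its default generation length limit. One way to get the full sentence is to pass `max_new_tokens`, the same argument used with GPT-2 later in this notebook. Here is a minimal sketch, assuming the `generator` pipeline defined earlier. `translate_to_french` is a hypothetical helper, and 64 is an arbitrary token budget:

```python
def translate_to_french(generator, text, max_new_tokens=64):
    """Translate English text with a T5 text2text pipeline.

    `generator` is any callable with the pipeline's call signature;
    `max_new_tokens` raises the output budget so the sentence is not cut off.
    """
    prompt = f"translate English to French: {text}"
    result = generator(prompt, max_new_tokens=max_new_tokens)
    return result[0]["generated_text"]
```

Usage with the pipeline from this notebook would look like `translate_to_french(generator, "Automation takes hard work but allows you to have a solid deployment")`.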

You can create other generators by loading different models. For example, this cell loads GPT-2 with the `text-generation` task.

In [8]:
gpt2_generator = pipeline("text-generation", model="gpt2")

In [9]:
gpt2_generator("some phrase here was thought to be", max_new_tokens=512)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': 'some phrase here was thought to be that the "right to kill" was never in question, that the threat of death might be invoked in some other way. Yet, at this point, the United Nations has no such option. "The only thing that the United States can do is get involved in the process, at least temporarily," a senior U.N. official told Human Rights Watch. "We don\'t have any options."\n\nThe United Nations is in charge of a group of 18 non-governmental institutions including NGOs, NGOs, and stateless groups like Amnesty International and the United Nations Security Council. This delegation is headed by former General Assembly member Rene Bourdieu—and he has worked at the United States Agency for International Development for about eight years. While Bourdieu is the country\'s foreign minister, he does not oversee the United Nations—not even the International Court of Justice.\n\nLast October, the U.N. Security Council adopted a resolution condemning the use of lethal for