## Trying 🤗 Hugging Face Transformers
Hugging Face is a platform that hosts pretrained models and maintains open-source machine learning libraries such as `transformers`.
Make sure you install the dependencies from `requirements.txt` before executing cells in this notebook.

In [3]:
import warnings
warnings.filterwarnings('ignore')
from transformers import pipeline

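Define the generator pipeline. In this case, use the `text2text-generation` task with the `t5-base` model, which handles several NLP tasks selected by a text prefix at the start of the input.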

In [4]:
generator = pipeline("text2text-generation", model="t5-base")



In [5]:
# Summarize
generator("summarize: Machine Learning in production environments is largely seen as the ultimate goal. Sometimes, deploying models can be difficult when automation is not part of the workflow. Creating a foundational process that is reliable and automated is complex and requires commitment from the team and the organization as a whole")

[{'generated_text': 'machine learning is a key to a successful production environment . a foundational process'}]
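The summary above is cut off mid-sentence, which suggests the default generation length is too short for this input. A minimal sketch of a wrapper that adds the `summarize: ` prefix and passes an explicit length limit follows. The helper name `summarize` and its default of 60 new tokens are assumptions for illustration, not part of `transformers`; the wrapper only relies on the pipeline being callable and returning a list of dictionaries with a `generated_text` key, as in the output above.

```python
from typing import Callable, Dict, List


# Hypothetical helper (not part of transformers): works with any callable that
# behaves like the text2text pipeline used in this notebook.
def summarize(
    generator: Callable[..., List[Dict[str, str]]],
    text: str,
    max_new_tokens: int = 60,  # assumed default; tune for your inputs
) -> str:
    """Add T5's summarization prefix and return only the summary string."""
    # Collapse newlines and repeated spaces so the prompt is a single line.
    prompt = "summarize: " + " ".join(text.split())
    result = generator(prompt, max_new_tokens=max_new_tokens)
    return result[0]["generated_text"]
```

With the pipeline defined earlier, this would be called as `summarize(generator, long_text, max_new_tokens=80)`, where `long_text` is any string you want condensed.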

In [6]:
# Sentiment
generator("sst2 sentence: Automation takes hard work but allows you to have a solid deployment")

[{'generated_text': 'positive'}]

In [7]:
# Sentiment2
generator("sst2 sentence: Automation is a pile of hard work that doesn't offer many benefits when the project is small or not repeatable.")

[{'generated_text': 'negative'}]
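Because the sentiment label comes back as free text, it can help to validate it before using it downstream. The sketch below assumes the model answers `positive` or `negative` for `sst2 sentence:` prompts, as in the two cells above; the helper name `classify_sentiment` and the `VALID_LABELS` set are hypothetical.

```python
from typing import Callable, Dict, List

# Labels observed in the outputs above; assumed to be the only valid answers.
VALID_LABELS = {"positive", "negative"}


# Hypothetical helper (not part of transformers).
def classify_sentiment(
    generator: Callable[..., List[Dict[str, str]]],
    sentence: str,
) -> str:
    """Run an SST-2 style prompt and return a validated sentiment label."""
    prompt = f"sst2 sentence: {sentence.strip()}"
    label = generator(prompt)[0]["generated_text"].strip().lower()
    if label not in VALID_LABELS:
        # The model generates text, so anything can come back; fail loudly.
        raise ValueError(f"unexpected label from model: {label!r}")
    return label
```

For example, `classify_sentiment(generator, "Automation takes hard work but allows you to have a solid deployment")` reuses the pipeline from this notebook.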

In [8]:
# Questions
generator("question: Is deploying models into production hard?")
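# This question is too open and has no context passage, so the model returns a
# classification label instead of an answer. T5 was trained on question answering
# with prompts of the form "question: ... context: ...", so it can only answer
# from the text it is given.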

[{'generated_text': 'not_entailment'}]
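The `not_entailment` output suggests the prompt did not match the question-answering format the model expects. A minimal sketch of building a prompt with a context passage follows, assuming the `question: ... context: ...` layout used for T5's SQuAD training data. The helper name `build_qa_prompt` is hypothetical.

```python
# Hypothetical helper (not part of transformers): builds a T5-style
# question-answering prompt from a question and a supporting passage.
def build_qa_prompt(question: str, context: str) -> str:
    """Return a prompt in the assumed 'question: ... context: ...' layout."""
    if not context.strip():
        # Without a passage the model has nothing to extract an answer from.
        raise ValueError("context must be a non-empty passage")
    return f"question: {question.strip()} context: {context.strip()}"
```

The passage from the summarization cell could serve as the context, for example `generator(build_qa_prompt("What makes deploying models difficult?", passage))`, where `passage` holds that text.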

In [16]:
# Translation
generator("translate English to German: Automation takes hard work but allows you to have a solid deployment")

# T5 was trained only on English-to-German, English-to-French, and
# English-to-Romanian translation, so other targets are unlikely to work.

[{'generated_text': 'Automatisierung erfordert harte Arbeit, aber ermöglicht Ihnen einen soliden Einsatz'}]
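The translation prefix can be wrapped in a small helper that rejects target languages the model was not trained on. This is a sketch: the name `build_translation_prompt` is hypothetical, and the list of supported targets is an assumption based on the translation tasks described for T5 (German, French, Romanian).

```python
# Assumed set of target languages for the original T5 checkpoints.
SUPPORTED_TARGETS = ("German", "French", "Romanian")


# Hypothetical helper (not part of transformers).
def build_translation_prompt(text: str, target: str) -> str:
    """Return a 'translate English to <target>: ...' prompt for T5."""
    if target not in SUPPORTED_TARGETS:
        raise ValueError(
            f"unsupported target {target!r}; expected one of {SUPPORTED_TARGETS}"
        )
    return f"translate English to {target}: {text.strip()}"
```

For example, `generator(build_translation_prompt("Automation takes hard work", "French"))` sends a French translation request to the pipeline defined earlier.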

You can create other pipelines by loading different models, for example GPT-2 for open-ended text generation.

In [17]:
gpt2_generator = pipeline("text-generation", model="gpt2")
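A text-generation pipeline returns the prompt followed by the continuation by default, and GPT-2 logs a notice about `pad_token_id` on each call. The sketch below shows one way to wrap such a pipeline so that it caps the output length, optionally passes `pad_token_id` explicitly, and returns only the newly generated text. The helper name `generate_text` and its defaults are assumptions for illustration.

```python
from typing import Callable, Dict, List, Optional


# Hypothetical helper (not part of transformers): works with any callable that
# behaves like a text-generation pipeline.
def generate_text(
    generator: Callable[..., List[Dict[str, str]]],
    prompt: str,
    max_new_tokens: int = 50,  # assumed default; keeps outputs short
    pad_token_id: Optional[int] = None,
) -> str:
    """Generate a continuation and strip the prompt from the front of it."""
    kwargs = {"max_new_tokens": max_new_tokens}
    if pad_token_id is not None:
        # Passing the pad token explicitly should avoid the logged notice.
        kwargs["pad_token_id"] = pad_token_id
    full_text = generator(prompt, **kwargs)[0]["generated_text"]
    # The pipeline echoes the prompt by default; return only the new text.
    if full_text.startswith(prompt):
        return full_text[len(prompt):]
    return full_text
```

With the pipeline above, a call could look like `generate_text(gpt2_generator, "some phrase here was thought to be", pad_token_id=gpt2_generator.tokenizer.eos_token_id)`. Sampling makes the output different on each run.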



In [18]:
gpt2_generator("some phrase here was thought to be", max_new_tokens=512)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


[{'generated_text': "some phrase here was thought to be 'a very high level of stupidity', but is there evidence that it can be true? Is there still an actual science behind it?\n\nThe scientific approach by which we evaluate a hypothesis is called 'biomarker hypothesis'. Our goal is for the hypotheses to describe something (like a structure) which, however 'true' may seem, has not yet been empirically confirmed.\n\nSo, for example, to compare a hypothesis to other hypotheses of a'model' of some kind: we'll have to look at the assumptions they make about the model. We'll have to know where the hypothesis places itself, or the relationships it creates.\n\nWe'll need to have some kind of predictive model to test for it:\n\nFor example, a model to measure the speed of light may need to be useful when our model may be false. If our model is a model that looks out over the world, there will have to be some type of relationship between the model and the data that could be used to estimate the