# **Hugging Face Inference**

In [None]:
!pip install torch
!pip install transformers

In [None]:
from transformers import pipeline
from transformers import DistilBertForSequenceClassification, DistilBertTokenizer
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# You can also use this section to suppress warnings generated by your code:
def warn(*args, **kwargs):
    pass
import warnings
warnings.warn = warn
warnings.filterwarnings('ignore')

## Text classification with **DistilBERT**

Load the model and tokenizer

In [None]:
tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")
model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

Preprocess the input text

In [None]:
text = "Congratulations! You've won a free ticket to the Bahamas. Reply WIN to claim."

inputs = tokenizer(text, return_tensors="pt")
print(inputs)

{'input_ids': tensor([[  101, 23156,   999,  2017,  1005,  2310,  2180,  1037,  2489,  7281,
          2000,  1996, 17094,  1012,  7514,  2663,  2000,  4366,  1012,   102]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}


Perform inference

In [None]:
with torch.no_grad():
    outputs = model(**inputs)
    # OR model(input_ids=inputs['input_ids'], attention_mask=inputs['attention_mask'])

logits = outputs.logits
logits.shape, logits

(torch.Size([1, 2]), tensor([[-3.9954,  4.3336]]))

Post-process the output

In [None]:
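probs = torch.softmax(logits, dim=-1)
# .item() converts the one-element tensor into a plain int for list indexing
predicted_class = torch.argmax(probs, dim=-1).item()

# For this SST-2 checkpoint, index 0 is NEGATIVE and index 1 is POSITIVE
labels = ["NEGATIVE", "POSITIVE"]
predicted_label = labels[predicted_class]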
print(f"Predicted label: {predicted_label}")

Predicted label: POSITIVE
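
The softmax step can be checked by hand. The sketch below is plain Python (no PyTorch) and uses the two logits printed above, rounded as shown; the helper name `softmax` is ours, not part of any library. The POSITIVE probability it produces should agree with the score that `pipeline()` reports for the same text later in this notebook.

```python
import math

def softmax(values):
    """Return softmax probabilities for a list of floats."""
    # Subtract the max before exponentiating for numerical stability
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Logits copied from the inference output above: [NEGATIVE, POSITIVE]
logits = [-3.9954, 4.3336]
probs = softmax(logits)

labels = ["NEGATIVE", "POSITIVE"]
best = max(range(len(probs)), key=probs.__getitem__)
print(labels[best], round(probs[best], 4))  # POSITIVE 0.9998
```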


## Text generation with **GPT-2**

Load the model and tokenizer

In [None]:
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

Preprocess the input text

In [None]:
# Prompt
prompt = "Once upon a time"

# Tokenize the input text
inputs = tokenizer(prompt, return_tensors="pt")
inputs

{'input_ids': tensor([[7454, 2402,  257,  640]]), 'attention_mask': tensor([[1, 1, 1, 1]])}

Perform inference

In [None]:
output_ids = model.generate(inputs.input_ids, attention_mask=inputs.attention_mask,
                            pad_token_id=tokenizer.eos_token_id, max_length=50, num_return_sequences=1)

output_ids

tensor([[7454, 2402,  257,  640,   11,  262,  995,  373,  257, 1295,  286, 1049,
         8737,  290, 1049, 3514,   13,  383,  995,  373,  257, 1295,  286, 1049,
         3514,   11,  290,  262,  995,  373,  257, 1295,  286, 1049, 3514,   13,
          383,  995,  373,  257, 1295,  286, 1049, 3514,   11,  290,  262,  995,
          373,  257]])

Post-process the output

In [None]:
generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(generated_text)

Once upon a time, the world was a place of great beauty and great danger. The world was a place of great danger, and the world was a place of great danger. The world was a place of great danger, and the world was a
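
The output above loops on the same phrase. With these arguments `generate` does not sample by default, so it picks the single most likely token at every step (greedy decoding), which tends to repeat. The toy sketch below contrasts a greedy choice with a sampled one over a made-up next-token distribution; the tokens and probabilities are illustrative assumptions, not values taken from GPT-2.

```python
import random

# Hypothetical next-token probabilities after some prefix (illustrative only)
next_token_probs = {"danger": 0.55, "beauty": 0.25, "wonder": 0.15, "mystery": 0.05}

def greedy_pick(probs):
    """Always return the most probable token."""
    return max(probs, key=probs.get)

def sample_pick(probs, rng):
    """Draw a token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so the run is repeatable
greedy_choices = [greedy_pick(next_token_probs) for _ in range(5)]
sampled_choices = [sample_pick(next_token_probs, rng) for _ in range(5)]

print("greedy: ", greedy_choices)   # the same token every time
print("sampled:", sampled_choices)  # depends on the random draws
```

In `generate`, sampling is enabled with `do_sample=True`, optionally combined with `top_k`, `top_p`, or `temperature`.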


## **Hugging Face `pipeline()` function**

Text classification using `pipeline()`

In [None]:
# Load a general text classification model
classifier = pipeline("text-classification", model="distilbert-base-uncased-finetuned-sst-2-english")

# Classify a sample text
result = classifier("Congratulations! You've won a free ticket to the Bahamas. Reply WIN to claim.")
print(result)

Device set to use cpu


[{'label': 'POSITIVE', 'score': 0.9997586607933044}]


Language detection using `pipeline()`

In [None]:
classifier = pipeline("text-classification", model="papluca/xlm-roberta-base-language-detection")
result = classifier("Bonjour, comment ça va?")
print(result)

Device set to use cpu


[{'label': 'fr', 'score': 0.9934879541397095}]


Text generation using GPT-2

In [None]:
generator = pipeline("text-generation", model="gpt2")
prompt = "Once upon a time"

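# num_return_sequences sets how many versions to generate. The pipeline's default
# max_new_tokens (256) takes precedence over max_length here, as the warning in the output notes.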
result = generator(prompt, max_length=50, num_return_sequences=1, truncation=True)

print(result[0]['generated_text'])

Device set to use cpu
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Both `max_new_tokens` (=256) and `max_length`(=50) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


Once upon a time, you would try to build your life up. That's what we did. At a certain point, you would try to maintain that level of commitment. But, in the end, you didn't have to do that. You could just focus on what you were passionate about.

We think it's the best way to build a life. If you're good at it, you're good at creating. If you're good at it, you're good at making people happy and happy for a while. That's what we do.

So, when you're in rehab, you're making a lot of changes and you're going through a lot of changes. You're getting better at making sure your body is working to improve. You're getting better at the way you're feeling and the way you're doing things.

And for the last eight years, we've been building a life for our kids. After they're 12 years old, we're still building them up. They're still getting better at what they're doing.

We're building our life for our kids.

That's what we're doing here today.

So, we're working on building a life. A life that 

Text-to-text generation (translation) using T5

In [None]:
generator = pipeline("text2text-generation", model="t5-small")

prompt = "translate English to French: How are you?"
result = generator(prompt, max_length=50, num_return_sequences=1)

print(result[0]['generated_text'])

Device set to use cpu
Both `max_new_tokens` (=256) and `max_length`(=50) seem to have been set. `max_new_tokens` will take precedence. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)


Comment êtes-vous?


Fill mask using BERT

In [None]:
generator = pipeline("fill-mask", model="bert-base-uncased")
prompt = "Rabbit eats delicious [MASK]."
result = generator(prompt)

# Print the candidate fills for [MASK], ranked by score
print(result)

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Device set to use cpu


[{'score': 0.5318308472633362, 'token': 2135, 'token_str': '##ly', 'sequence': 'rabbit eats deliciously.'}, {'score': 0.3313165009021759, 'token': 2833, 'token_str': 'food', 'sequence': 'rabbit eats delicious food.'}, {'score': 0.016017301008105278, 'token': 2477, 'token_str': 'things', 'sequence': 'rabbit eats delicious things.'}, {'score': 0.008221233263611794, 'token': 9440, 'token_str': 'foods', 'sequence': 'rabbit eats delicious foods.'}, {'score': 0.007391038350760937, 'token': 12278, 'token_str': 'meals', 'sequence': 'rabbit eats delicious meals.'}]
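
The top candidate here is `##ly`, a WordPiece continuation that attaches to the preceding word instead of filling the blank with a new word. If whole words are wanted, the result list can be filtered. A minimal sketch using the predictions printed above (scores rounded, `token` ids omitted); the helper name `whole_word_predictions` is hypothetical:

```python
# Predictions copied from the output above, with scores rounded
result = [
    {"score": 0.5318, "token_str": "##ly", "sequence": "rabbit eats deliciously."},
    {"score": 0.3313, "token_str": "food", "sequence": "rabbit eats delicious food."},
    {"score": 0.0160, "token_str": "things", "sequence": "rabbit eats delicious things."},
    {"score": 0.0082, "token_str": "foods", "sequence": "rabbit eats delicious foods."},
    {"score": 0.0074, "token_str": "meals", "sequence": "rabbit eats delicious meals."},
]

def whole_word_predictions(predictions):
    """Drop WordPiece continuation pieces, which start with '##'."""
    return [p for p in predictions if not p["token_str"].startswith("##")]

best = max(whole_word_predictions(result), key=lambda p: p["score"])
print(best["token_str"], "->", best["sequence"])  # food -> rabbit eats delicious food.
```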
