# Transformers, what can they do?

Install the Transformers, Datasets, and Evaluate libraries to run this notebook.

In [2]:
!pip install datasets evaluate transformers[sentencepiece]


In [6]:
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
classifier("I've been waiting for a HuggingFace course my whole life.")

No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.



[{'label': 'POSITIVE', 'score': 0.9598048329353333}]

In [None]:
classifier(
    ["The weather is terrible all are sulking but I am fine"]
)

In [None]:
from transformers import pipeline

classifier = pipeline("zero-shot-classification")
classifier(
    "This is a course about the Transformers library",
    candidate_labels=["education", "politics", "engineering", "business"],
)

In [None]:
from transformers import pipeline

generator = pipeline("text-generation")
generator("In this course, we will teach you how to")

In [None]:
from transformers import pipeline

generator = pipeline("text-generation", model="distilgpt2")
generator(
    "In this course, we will teach you how to",
    max_length=30,
    num_return_sequences=2,
)

In [None]:
from transformers import pipeline

unmasker = pipeline("fill-mask")
unmasker("This course will teach you all about <mask> models.", top_k=2)

In [None]:
from transformers import pipeline

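# aggregation_strategy="simple" merges sub-word tokens into whole entities
# (it supersedes the deprecated grouped_entities=True flag).
txt = "My name is Sylvain and I work at Hugging Face in Brooklyn."
ner = pipeline("ner", aggregation_strategy="simple")
print(ner(txt))
# Each entity reports character offsets; 11:18 is the span of "Sylvain".
print(txt[11:18])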
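
The slice `txt[11:18]` in the cell above works because an entity span can be described by half-open character offsets into the original string. The following is a minimal sketch in plain Python with no model involved; the `find_span` helper is hypothetical and only locates a substring that is already known.

```python
# Illustrative only: locate known substrings and express them as
# half-open character offsets (start inclusive, end exclusive).
txt = "My name is Sylvain and I work at Hugging Face in Brooklyn."


def find_span(text, phrase):
    """Return (start, end) character offsets of the first occurrence of phrase."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in text")
    return start, start + len(phrase)


for phrase in ["Sylvain", "Hugging Face", "Brooklyn"]:
    start, end = find_span(txt, phrase)
    # Slicing with the offsets recovers the original phrase.
    print(phrase, (start, end), txt[start:end])
```

Because the end offset is exclusive, `end - start` is always the length of the matched text.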


In [None]:
from transformers import pipeline

question_answerer = pipeline("question-answering")
question_answerer(
    question="Where do I work?",
    context="My name is Sylvain and I work at Hugging Face in Brooklyn",
)

In [None]:
from transformers import pipeline

summarizer = pipeline("summarization")
summarizer(
    """
    Would like to know the update from the team for this purpose. 
We have a team to discuss things with MC of phase 1&2  for  diverting flood waters of phase 1 &2 to the lake directly. 
This was a key point discussed in the flood mitigation meeting too..
Hope no one is talking to the EO and  suggesting that the flood waters from the north ( phase 1&2) 
can be channelled our through the Strom water drain we have made for egress of flood waters that we will directly process. 
Phase 1&2 having already made a cross drain along the north  of the first building as we enter into phase 3 ,
 they need to asked to raise the  south lip of this drain substantially for the waters from phase 1&2 to be 
  drained along its east wall behind this building where their  transformer is to be discharged into the lake from  
  the extreme south east corner of compound behind the open well . The water thus will be discharged into the lake from 
  about 10 feet above our south road level , 100 feet ahead of our Strom water egress drain opening, to flow on towards 
the SE end of the lake and go over the weir.
"""
)

In [None]:
from transformers import pipeline

translator = pipeline("translation", model="Helsinki-NLP/opus-mt-fr-en")
translator("Ce cours est produit par Hugging Face.")

In [None]:
print(pipeline.__doc__)

In [None]:
from transformers import pipeline
unmasker = pipeline("fill-mask", model="bert-base-uncased")
result = unmasker("This man works as a [MASK].")
print([r["token_str"] for r in result])

result = unmasker("This woman works as a [MASK].")
print([r["token_str"] for r in result])


In [None]:
# aggregation_strategy="simple" replaces the deprecated grouped_entities=True flag.
ner = pipeline("ner", aggregation_strategy="simple")
ner("My name is Sylvain and I work at Hugging Face in Brooklyn.")

In [1]:
from transformers import AutoTokenizer
checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
raw_inputs = [
    "I've been waiting for a HuggingFace course my whole life.",
    "I hate this so much!",
    "The world is a dangerous place without good people"
]
inputs = tokenizer(raw_inputs, padding=True, truncation=True, return_tensors="tf")
print(inputs.keys())
print(inputs.attention_mask)

dict_keys(['input_ids', 'attention_mask'])
tf.Tensor(
[[1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
 [1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0]
 [1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0]], shape=(3, 16), dtype=int32)
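
The attention mask printed above marks real tokens with 1 and padding with 0. A rough sketch of the idea in plain Python follows. The token IDs, the `pad_batch` helper, and the default `pad_id=0` are assumptions made for illustration; only the lengths 16, 8 and 11 are taken from the output above, and the real tokenizer does considerably more than this.

```python
def pad_batch(sequences, pad_id=0):
    """Right-pad each sequence to the longest length and build a matching mask."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


# Hypothetical token IDs; only the sequence lengths mirror the output above.
fake_batch = [list(range(1, 17)), list(range(1, 9)), list(range(1, 12))]
ids, mask = pad_batch(fake_batch)
for row in mask:
    print(row)
```

Every row ends up with the length of the longest sequence, which is why the mask has shape (3, 16).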


In [4]:
from transformers import TFAutoModel
checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
model = TFAutoModel.from_pretrained(checkpoint)

# This architecture contains only the base Transformer module: given some inputs,
# it outputs what we’ll call hidden states, also known as features.
# For each model input, we’ll retrieve a high-dimensional vector representing
# the contextual understanding of that input by the Transformer model.


Some layers from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english were not used when initializing TFDistilBertModel: ['pre_classifier', 'dropout_19', 'classifier']
- This IS expected if you are initializing TFDistilBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFDistilBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
All the layers of TFDistilBertModel were initialized from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertModel for predictions without further training.

In [7]:
outputs = model(inputs)
print(outputs.last_hidden_state.shape)
# The vector output by the Transformer module is usually large.
# It generally has three dimensions:
#
# - Batch size: the number of sequences processed at a time (3 in our example).
# - Sequence length: the length of the numerical representation of the sequence (16 in our example).
# - Hidden size: the vector dimension of each model input.
#
# It is said to be “high dimensional” because of the last value.
# The hidden size can be very large (768 is common for smaller models,
# and in larger models this can reach 3072 or more).

(3, 16, 768)
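
To make the three dimensions concrete, here is a small NumPy sketch that uses a random array as a stand-in for `last_hidden_state`. No model is run and the values are meaningless; the variable names are chosen for illustration. It shows how indexing along each axis selects a sequence, a token position, or a single feature vector.

```python
import numpy as np

batch_size, seq_len, hidden_size = 3, 16, 768

# Random stand-in for outputs.last_hidden_state; not real model output.
rng = np.random.default_rng(0)
hidden = rng.standard_normal((batch_size, seq_len, hidden_size))

first_sequence = hidden[0]       # all token vectors of sequence 0
first_token_vec = hidden[0, 0]   # the feature vector at one token position
first_tokens = hidden[:, 0, :]   # first-position vector of every sequence

print(hidden.shape)
print(first_sequence.shape)
print(first_token_vec.shape)
print(first_tokens.shape)
```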


In [61]:
from transformers import TFAutoModelForSequenceClassification
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)

outputs = model(inputs)
print(outputs, raw_inputs)

Some layers from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english were not used when initializing TFDistilBertForSequenceClassification: ['dropout_19']
- This IS expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFDistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some layers of TFDistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english and are newly initialized: ['dropout_59']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


TFSequenceClassifierOutput(loss=None, logits=<tf.Tensor: shape=(3, 2), dtype=float32, numpy=
array([[-1.5606955,  1.6122804],
       [ 4.1692314, -3.3464472],
       [ 3.7716544, -3.129022 ]], dtype=float32)>, hidden_states=None, attentions=None) ["I've been waiting for a HuggingFace course my whole life.", 'I hate this so much!', 'The world is a dangerous place without good people']


In [69]:
import tensorflow as tf
import numpy as np
predictions = tf.math.softmax(outputs.logits, axis=-1)
print(model.config.id2label)

{0: 'NEGATIVE', 1: 'POSITIVE'}
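
As a sanity check, the softmax step can be reproduced by hand. The sketch below uses only the standard library and copies the logits from the `TFSequenceClassifierOutput` printed earlier. The `softmax` helper is written here for illustration and is not the TensorFlow implementation.

```python
import math

# Logits copied from the TFSequenceClassifierOutput printed earlier.
logits = [
    [-1.5606955, 1.6122804],
    [4.1692314, -3.3464472],
    [3.7716544, -3.129022],
]
id2label = {0: "NEGATIVE", 1: "POSITIVE"}


def softmax(row):
    """Convert one row of logits into probabilities that sum to 1."""
    shift = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


for row in logits:
    probs = softmax(row)
    # The predicted class is the index with the highest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    print(id2label[best], round(probs[best], 4))
```

The first row should come out POSITIVE with a probability of roughly 0.96, in line with the score the sentiment pipeline reported for the same sentence at the top of the notebook.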


In [70]:
def print_results(preds, inp):
    """Print the predicted sentiment label for each input sentence.

    Expects preds and inp to have the same length, and assumes the
    label mapping {0: 'NEGATIVE', 1: 'POSITIVE'}.
    """
    labels = {0: "NEGATIVE", 1: "POSITIVE"}

    if len(preds) != len(inp):
        print("Error: predictions and inputs are not of equal length")
        return

    # Pick the highest-probability class per row; unlike rounding each
    # probability, argmax always yields exactly one label.
    best = np.argmax(preds, axis=-1)

    for sentence, idx in zip(inp, best):
        res = {"INPUT": sentence, "LABEL": labels[int(idx)]}
        print(res)


In [71]:
print_results(predictions, raw_inputs)

{'INPUT': "I've been waiting for a HuggingFace course my whole life.", 'LABEL': 'POSITIVE'}
{'INPUT': 'I hate this so much!', 'LABEL': 'NEGATIVE'}
{'INPUT': 'The world is a dangerous place without good people', 'LABEL': 'NEGATIVE'}
