# 🤖 Transformers Library by Hugging Face

The **Transformers** library by Hugging Face provides **pre-trained models** and tools for a wide range of AI tasks. Highlights include:

- **Natural Language Processing (NLP)**: Text classification, translation, question-answering, and more.
- **Support for Popular Models**: Includes BERT, GPT, LLaMA, and many others.
- **Framework Integration**: Works seamlessly with **PyTorch** and **TensorFlow**.

A powerful tool to accelerate and simplify your machine learning projects! 🚀

# First, install Transformers

- **!pip install transformers**

In [2]:
from transformers import pipeline

classifier = pipeline("sentiment-analysis", device="cuda")  # device="cuda" requires a CUDA-capable GPU; omit it to run on CPU
classifier("This book is not so good but it is not bad either")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


[{'label': 'POSITIVE', 'score': 0.9933708906173706}]
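
The warning above recommends naming the model and revision explicitly, and `device='cuda'` only works when a GPU is present. A minimal sketch of both ideas follows. The helper names `pick_device` and `build_classifier` are hypothetical, and the model id and revision are copied from the warning text, so treat them as assumptions about what you want to pin.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if PyTorch is installed and sees a GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


def build_classifier():
    """Build a sentiment pipeline with the model and revision pinned (hypothetical helper)."""
    # Imported lazily so this file can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline(
        "sentiment-analysis",
        # Model id and revision copied from the warning printed above.
        model="distilbert/distilbert-base-uncased-finetuned-sst-2-english",
        revision="714eb0f",
        device=pick_device(),
    )
```

Calling `build_classifier()("This book is not so good but it is not bad either")` should behave like the cell above, except that the GPU is used only when one is available and the first call downloads the pinned model.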

## **Using the model and tokenizer directly**

In [None]:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

txt = "Hello world!"

# return_tensors="pt" returns PyTorch tensors; "tf" returns TensorFlow tensors.
inp = tokenizer(txt, return_tensors="pt")
out = model(**inp)
print(out)
# Shape is (batch_size, sequence_length, hidden_size), where sequence_length
# is the number of tokens in the input text, including special tokens.
print("Last Hidden State Shape : ", out.last_hidden_state.shape)
print("Last Hidden State Size:", out.last_hidden_state.numel())  # total number of elements

# If pooler_output is available
if hasattr(out, "pooler_output"):
    print("Pooler Output Shape:", out.pooler_output.shape)
    print("Pooler Output Size:", out.pooler_output.numel())
# The outputs are the model's last hidden states and pooler output (embeddings of the input text).
# pooler_output is not token-level information but a summary representation of the entire input.

BaseModelOutputWithPoolingAndCrossAttentions(last_hidden_state=tensor([[[-0.1424,  0.1335, -0.1291,  ..., -0.3597, -0.0562,  0.3605],
         [-0.3506,  0.1042,  0.6244,  ..., -0.1761,  0.4834,  0.0644],
         [-0.2451, -0.1573,  0.6945,  ..., -0.5654, -0.0894, -0.1856],
         [-0.8248, -0.9119, -0.6561,  ...,  0.5074, -0.1939, -0.1659],
         [ 0.8767,  0.0352, -0.1233,  ...,  0.2720, -0.6369, -0.1585]]],
       grad_fn=<NativeLayerNormBackward0>), pooler_output=tensor([[-8.9756e-01, -3.3040e-01, -7.6942e-01,  7.5799e-01,  4.6678e-01,
         -1.2035e-01,  9.1835e-01,  1.8087e-01, -7.2716e-01, -9.9991e-01,
         -4.4723e-01,  8.9104e-01,  9.6621e-01,  5.4915e-01,  9.4344e-01,
         -7.6605e-01, -6.0469e-01, -6.1654e-01,  4.0572e-01, -7.4644e-01,
          6.1739e-01,  9.9974e-01,  3.2989e-02,  2.5414e-01,  4.3106e-01,
          9.7732e-01, -8.4328e-01,  9.2297e-01,  9.4871e-01,  6.3994e-01,
         -7.3620e-01,  9.0957e-02, -9.7607e-01, -1.9115e-01, -8.1717e-01,
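
The printed output is cut off before the shape lines appear, so here is a rough sketch of the arithmetic behind them. It assumes that "Hello world!" becomes 5 tokens for `bert-base-uncased` (consistent with the 5 rows in `last_hidden_state` above) and that the hidden size is 768, the standard value for BERT-base. It does not run the model.

```python
from math import prod

# Assumed values: [CLS], hello, world, !, [SEP] gives 5 tokens,
# and bert-base-uncased uses a hidden size of 768.
batch_size, sequence_length, hidden_size = 1, 5, 768

last_hidden_state_shape = (batch_size, sequence_length, hidden_size)
pooler_output_shape = (batch_size, hidden_size)

# numel() is the product of all dimensions of a tensor.
last_hidden_state_numel = prod(last_hidden_state_shape)
pooler_output_numel = prod(pooler_output_shape)

print(last_hidden_state_shape, last_hidden_state_numel)  # (1, 5, 768) 3840
print(pooler_output_shape, pooler_output_numel)          # (1, 768) 768
```

The pooler output has no `sequence_length` dimension, which is why it holds one vector per input rather than one per token.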

In [14]:
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Other models can be found on the Hugging Face Hub.
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
results = classifier(["we love Automns",
                      "i dont hate it but it is bad"])
for result in results:
    print(result)

Hardware accelerator e.g. GPU is available in the environment, but no `device` argument is passed to the `Pipeline` object. Model will be on CPU.


{'label': 'POSITIVE', 'score': 0.9997029900550842}
{'label': 'NEGATIVE', 'score': 0.9987803101539612}
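
The printed dictionaries do not say which sentence they belong to. Since the pipeline returns one result per input, in input order, a small sketch like the one below can pair them up. The inputs and scores are copied from the cell above rather than recomputed, and the `rows` name is only illustrative.

```python
# Inputs and outputs copied from the cell above (not recomputed here).
texts = ["we love Automns", "i dont hate it but it is bad"]
results = [
    {"label": "POSITIVE", "score": 0.9997029900550842},
    {"label": "NEGATIVE", "score": 0.9987803101539612},
]

# One result per input, in the same order, so zip() lines them up.
rows = [(text, r["label"], round(r["score"], 4)) for text, r in zip(texts, results)]

for text, label, score in rows:
    print(f"{label:<8} {score:.4f}  {text}")
```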


## **Check what the tokenizer does with the data**



In [None]:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "distilbert-base-uncased-finetuned-sst-2-english"
# Note that we use AutoModelForSequenceClassification instead of AutoModel.
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

sentence = "We are very happy to show you the Transformers library."
tokens = tokenizer.tokenize(sentence)
token_ids = tokenizer.convert_tokens_to_ids(tokens)
input_ids = tokenizer(sentence)

print(f' Tokens: {tokens}')
print("###" * 10)
# token_ids are the ids of the tokens alone; input_ids also include the
# special tokens [CLS] and [SEP] (and padding tokens when padding is applied).
print(f'Token IDs: {token_ids}')
print("###" * 10)
# attention_mask has the same length as input_ids: 1 marks a real token the
# model should attend to, 0 marks a padding token the model should ignore.
print(f'Input IDs: {input_ids}')

X_train = ["We are very happy to show you the Transformers library.",
           "We hope you don't hate it."]
# batch is a dict-like BatchEncoding holding input_ids and attention_mask
# (some tokenizers, such as BERT's, also return token_type_ids).
# padding=True pads shorter sequences to the longest one in the batch so they can be
# stacked into one tensor; truncation=True cuts sequences longer than max_length.
batch = tokenizer(X_train, padding=True, truncation=True,
                  max_length=512, return_tensors="pt")
print("###" * 10)
print(batch)
print("###" * 10)
print(f'batch shape: {batch["input_ids"].shape}')

 Tokens: ['we', 'are', 'very', 'happy', 'to', 'show', 'you', 'the', 'transformers', 'library', '.']
##############################
Token IDs: [2057, 2024, 2200, 3407, 2000, 2265, 2017, 1996, 19081, 3075, 1012]
##############################
Input IDs: {'input_ids': [101, 2057, 2024, 2200, 3407, 2000, 2265, 2017, 1996, 19081, 3075, 1012, 102], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}
##############################
{'input_ids': tensor([[  101,  2057,  2024,  2200,  3407,  2000,  2265,  2017,  1996, 19081,
          3075,  1012,   102],
        [  101,  2057,  3246,  2017,  2123,  1005,  1056,  5223,  2009,  1012,
           102,     0,     0]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0]])}
##############################
batch shape: torch.Size([2, 13])
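
To make the difference between token ids, input ids, and the attention mask concrete, the sketch below rebuilds the batch printed above in plain Python. The helpers `add_special_tokens` and `pad_batch` are hypothetical names, not part of the Transformers API, and the ids 101, 102, and 0 for `[CLS]`, `[SEP]`, and padding are read off the output above. The real tokenizer does considerably more than this.

```python
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # special-token ids seen in the output above


def add_special_tokens(token_ids):
    """Wrap plain token ids as [CLS] ... [SEP] (hypothetical helper)."""
    return [CLS_ID] + list(token_ids) + [SEP_ID]


def pad_batch(sequences, pad_id=PAD_ID):
    """Right-pad every sequence to the longest one and build the attention mask."""
    longest = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (longest - len(seq)) for seq in sequences]
    attention_mask = [[1] * len(seq) + [0] * (longest - len(seq)) for seq in sequences]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Token ids taken from the printed output, without special tokens.
first = [2057, 2024, 2200, 3407, 2000, 2265, 2017, 1996, 19081, 3075, 1012]
second = [2057, 3246, 2017, 2123, 1005, 1056, 5223, 2009, 1012]

batch = pad_batch([add_special_tokens(first), add_special_tokens(second)])
print(batch["input_ids"][1])       # second sentence, padded with two 0s
print(batch["attention_mask"][1])  # eleven 1s followed by two 0s
```

The shorter sentence gains two padding ids so that both rows have length 13, and the mask marks those two positions with 0 so the model ignores them.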


### **Using the tokenizer and model separately and running inference with PyTorch**



In [22]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import torch.nn.functional as F

model_name = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

X_train = ["We are very happy to show you the Transformers library.",
            "We hope you don't hate it."]
batch = tokenizer(X_train, padding=True, truncation=True,
                  max_length=512, return_tensors="pt")
print(batch)
print("###" * 10)

# torch.no_grad() disables gradient tracking, which is not needed for inference.
with torch.no_grad():
    outputs = model(**batch)  # outputs.logits holds raw, unnormalized scores
    print(outputs)
    print("###" * 10)
    predictions = F.softmax(outputs.logits, dim=1)  # logits -> class probabilities
    print(predictions)
    print("###" * 10)
    labels = torch.argmax(predictions, dim=1)  # index of the most probable class
    print(labels)
    print("###" * 10)
    labels = [model.config.id2label[label_id] for label_id in labels.tolist()]  # ids -> label names
    print(labels)

{'input_ids': tensor([[  101,  2057,  2024,  2200,  3407,  2000,  2265,  2017,  1996, 19081,
          3075,  1012,   102],
        [  101,  2057,  3246,  2017,  2123,  1005,  1056,  5223,  2009,  1012,
           102,     0,     0]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0]])}
##############################
SequenceClassifierOutput(loss=None, logits=tensor([[-4.1329,  4.3811],
        [ 0.0818, -0.0418]]), hidden_states=None, attentions=None)
##############################
tensor([[2.0060e-04, 9.9980e-01],
        [5.3086e-01, 4.6914e-01]])
##############################
tensor([1, 0])
##############################
['POSITIVE', 'NEGATIVE']
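
As a sanity check on the numbers above, the post-processing steps (softmax, argmax, label lookup) can be reproduced in plain Python. This is a sketch that starts from the printed logits, not from the model, and the `id2label` mapping is inferred from the output (`1` maps to `POSITIVE`, `0` to `NEGATIVE`).

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Logits copied from the output above; mapping inferred from the printed labels.
logits = [[-4.1329, 4.3811], [0.0818, -0.0418]]
id2label = {0: "NEGATIVE", 1: "POSITIVE"}

probabilities = [softmax(row) for row in logits]
# argmax: index of the largest probability in each row.
label_ids = [max(range(len(row)), key=row.__getitem__) for row in probabilities]
labels = [id2label[i] for i in label_ids]

print([[round(p, 4) for p in row] for row in probabilities])  # [[0.0002, 0.9998], [0.5309, 0.4691]]
print(labels)  # ['POSITIVE', 'NEGATIVE']
```

Note that the second sentence is labeled `NEGATIVE` with a probability of only about 0.53, so the model is close to undecided on it.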
