# Transformers

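Transformers are a deep learning architecture introduced in the paper "Attention Is All You Need" (Vaswani et al., 2017). They replaced recurrence with self-attention, which lets a model process all tokens of a sequence in parallel and relate each token to every other token in its context. This made Natural Language Processing (NLP) models both more efficient to train and more context-aware.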

In [2]:
# Transformers installation
! pip install transformers
# To install from source instead of the latest release, comment out the command above and uncomment the following one.
#! pip install git+https://github.com/huggingface/transformers.git



# Hugging Face Transformers: The Ultimate NLP Library
Hugging Face's Transformers library is one of the most widely used open-source libraries for Natural Language Processing (NLP). It provides pre-trained models for tasks such as:

- Text Classification (Sentiment Analysis, Spam Detection)
- Named Entity Recognition (NER)
- Question Answering
- Text Summarization
- Machine Translation
- Text Generation (GPT, T5)
- Speech Processing & Vision Transformers

## 1. Sentiment Analysis with a Pre-Trained Pipeline
- By default, `pipeline('sentiment-analysis')` loads `distilbert-base-uncased-finetuned-sst-2-english`, a DistilBERT model fine-tuned on the SST-2 dataset for binary sentiment classification (positive/negative).

In [1]:
from transformers import pipeline
classifier = pipeline('sentiment-analysis')

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Device set to use cuda:0


In [2]:
classifier('We are very happy to show you the 🤗 Transformers library.')

[{'label': 'POSITIVE', 'score': 0.9997795224189758}]

In [3]:
classifier('The pizza is not that great but the crust is awesome.')

[{'label': 'POSITIVE', 'score': 0.9998461008071899}]

In [4]:
results = classifier(["We are very happy to show you the 🤗 Transformers library.",
           "We hope you don't hate it."])
for result in results:
    print(f"label: {result['label']}, with score: {round(result['score'], 4)}")

label: POSITIVE, with score: 0.9998
label: NEGATIVE, with score: 0.5309


## 2. Multilingual Sentiment Analysis with a Pre-Trained BERT Model

This code loads a pre-trained multilingual sentiment analysis model (`nlptown/bert-base-multilingual-uncased-sentiment`) and classifies the Spanish sentence `"Esperamos que no lo odie."` ("We hope you don't hate it."). Instead of a positive/negative label, this model returns a rating from 1 to 5 stars together with a confidence score.

In [5]:
classifier = pipeline('sentiment-analysis', model="nlptown/bert-base-multilingual-uncased-sentiment")

Device set to use cuda:0


In [6]:
classifier("Esperamos que no lo odie.")

[{'label': '3 stars', 'score': 0.3368820548057556}]

## 3. Multilingual Sentiment Analysis by Converting a Pre-Trained PyTorch BERT Model into TensorFlow

- This code loads a pre-trained sentiment analysis model, converts it from PyTorch to TensorFlow, tokenizes text input, and predicts the sentiment.



In [7]:
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

In [8]:
model_name = "nlptown/bert-base-multilingual-uncased-sentiment"  # multilingual model
# Load the PyTorch checkpoint and convert its weights to TensorFlow with the `from_pt` flag.
model = TFAutoModelForSequenceClassification.from_pretrained(model_name, from_pt=True)
# The tokenizer splits text into tokens and maps them to the IDs the model expects.
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Build a sentiment-classification pipeline from the converted model and its tokenizer.
classifier = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)


All PyTorch model weights were used when initializing TFBertForSequenceClassification.

All the weights of TFBertForSequenceClassification were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertForSequenceClassification for predictions without further training.
Device set to use 0


In [9]:
classifier("I am a good girl")

[{'label': '4 stars', 'score': 0.4100622534751892}]

## 4. End-to-End Sentiment Analysis Using Pre-Trained DistilBERT in TensorFlow

- This code loads a pre-trained sentiment analysis model (`distilbert-base-uncased-finetuned-sst-2-english`), tokenizes the input text, runs it through the model in TensorFlow, and outputs sentiment predictions.

### Step 1: Loading the Pre-Trained Model and Tokenizer

In [1]:
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tf_model = TFAutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

All PyTorch model weights were used when initializing TFDistilBertForSequenceClassification.

All the weights of TFDistilBertForSequenceClassification were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForSequenceClassification for predictions without further training.


### Step 2: Tokenizing a Single Sentence
- Converts the text into token IDs, plus an attention mask, that the model understands.

In [3]:
inputs = tokenizer("We are very happy to show you the 🤗 Transformers library.")
print(inputs)

{'input_ids': [101, 2057, 2024, 2200, 3407, 2000, 2265, 2017, 1996, 100, 19081, 3075, 1012, 102], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}
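To make the printed IDs easier to read, the sketch below labels the special tokens in that output. It is an illustration only: the `SPECIAL_TOKENS` dictionary is a hand-written assumption listing the special-token IDs of the BERT uncased vocabulary, not something returned by the library. On the real tokenizer, `tokenizer.convert_ids_to_tokens(...)` gives the full mapping.

```python
# Special-token IDs in the BERT uncased vocabulary (hand-written for illustration).
SPECIAL_TOKENS = {0: "[PAD]", 100: "[UNK]", 101: "[CLS]", 102: "[SEP]"}

# input_ids copied from the tokenizer output above.
input_ids = [101, 2057, 2024, 2200, 3407, 2000, 2265, 2017, 1996, 100, 19081, 3075, 1012, 102]

# Replace special-token IDs with their names and leave ordinary word IDs unchanged.
annotated = [SPECIAL_TOKENS.get(token_id, token_id) for token_id in input_ids]
print(annotated)
```

The sequence starts with `[CLS]` and ends with `[SEP]`. The 🤗 emoji is not in the vocabulary, so it is mapped to `[UNK]` (ID 100).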


### Step 3: Tokenizing Multiple Sentences with Padding & Truncation
- Tokenizes multiple sentences in a single call.
- `padding=True`: pads shorter sequences to the length of the longest sequence in the batch using the padding token (ID 0 for this tokenizer); padded positions get a 0 in `attention_mask`.
- `truncation=True`: truncates any sequence longer than `max_length` (512 tokens here).
- `return_tensors="tf"`: returns the output as TensorFlow tensors.

In [5]:
tf_batch = tokenizer(
    ["We are very happy to show you the 🤗 Transformers library.", "We hope you don't hate it."],
    padding=True,
    truncation=True,
    max_length=512,
    return_tensors="tf"
)

### Step 4: Printing Tokenized Data

In [6]:
for key, value in tf_batch.items():
    print(f"{key}: {value.numpy().tolist()}")


input_ids: [[101, 2057, 2024, 2200, 3407, 2000, 2265, 2017, 1996, 100, 19081, 3075, 1012, 102], [101, 2057, 3246, 2017, 2123, 1005, 1056, 5223, 2009, 1012, 102, 0, 0, 0]]
attention_mask: [[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]]
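The padded `input_ids` and `attention_mask` above can be reproduced with a few lines of plain Python. This is a minimal sketch of the idea, not the tokenizer's actual implementation: `pad_batch` is a hypothetical helper, and the pad ID of 0 is read off the printed output.

```python
def pad_batch(sequences, pad_id=0):
    """Pad token-ID lists to the longest one and build matching attention masks."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)             # real tokens, then padding
        attention_mask.append([1] * len(seq) + [0] * n_pad)  # 1 = attend, 0 = ignore
    return {"input_ids": input_ids, "attention_mask": attention_mask}

# Unpadded token IDs copied from the tokenizer output above.
first = [101, 2057, 2024, 2200, 3407, 2000, 2265, 2017, 1996, 100, 19081, 3075, 1012, 102]
second = [101, 2057, 3246, 2017, 2123, 1005, 1056, 5223, 2009, 1012, 102]

batch = pad_batch([first, second])
print(batch["input_ids"][1])
print(batch["attention_mask"][1])
```

The second sentence has 11 tokens, so it receives three padding IDs and three zeros in its mask. The mask is what stops the model from attending to the padding.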


### Step 5: Running Sentiment Analysis Model on Tokenized Inputs

In [7]:
tf_outputs = tf_model(tf_batch)
print(tf_outputs)


TFSequenceClassifierOutput(loss=None, logits=<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[-4.083296  ,  4.336415  ],
       [ 0.08181123, -0.04178689]], dtype=float32)>, hidden_states=None, attentions=None)


### Step 6: Converting Model Outputs to Probabilities (Softmax)

In [8]:
import tensorflow as tf
tf_predictions = tf.nn.softmax(tf_outputs[0], axis=-1)
print(tf_predictions)


tf.Tensor(
[[2.2042994e-04 9.9977952e-01]
 [5.3086025e-01 4.6913972e-01]], shape=(2, 2), dtype=float32)
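As a cross-check, the same probabilities can be computed without TensorFlow. The sketch below applies a hand-written softmax to the logits printed in Step 5 and then picks the most probable class, which is roughly what the pipeline in section 1 reports as `label` and `score`. The `id2label` dictionary is written out by hand here and assumes this checkpoint's label order (index 0 is NEGATIVE, index 1 is POSITIVE).

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Logits copied from the model output in Step 5.
logits = [[-4.083296, 4.336415], [0.08181123, -0.04178689]]
# Assumed label order of this SST-2 checkpoint.
id2label = {0: "NEGATIVE", 1: "POSITIVE"}

predictions = []
for row in logits:
    probs = softmax(row)
    best = max(range(len(probs)), key=probs.__getitem__)  # index of the largest probability
    predictions.append((id2label[best], round(probs[best], 4)))

print(predictions)  # [('POSITIVE', 0.9998), ('NEGATIVE', 0.5309)]
```

The second sentence is close to a coin flip (0.53 vs. 0.47), so its NEGATIVE label carries little confidence.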


### Step 7: Running Model with Labels (For Loss Calculation)

In [9]:
tf_outputs = tf_model(tf_batch, labels=tf.constant([1, 0]))


In [10]:
tf_outputs

TFSequenceClassifierOutput(loss=<tf.Tensor: shape=(2,), dtype=float32, numpy=array([2.2051287e-04, 6.3325649e-01], dtype=float32)>, logits=<tf.Tensor: shape=(2, 2), dtype=float32, numpy=
array([[-4.083296  ,  4.336415  ],
       [ 0.08181123, -0.04178689]], dtype=float32)>, hidden_states=None, attentions=None)
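The `loss` tensor has shape `(2,)`: one value per example rather than a single averaged number. The values are consistent with standard cross-entropy, that is, the negative log of the softmax probability assigned to the true class. The sketch below recomputes them under that assumption; `cross_entropy` is a hypothetical helper, not a library function.

```python
import math

def cross_entropy(logits, label):
    """Per-example cross-entropy: -log(softmax(logits)[label])."""
    m = max(logits)
    # log-sum-exp with the maximum subtracted for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]

# Logits and labels copied from the cells above.
logits = [[-4.083296, 4.336415], [0.08181123, -0.04178689]]
labels = [1, 0]

losses = [cross_entropy(row, y) for row, y in zip(logits, labels)]
print(losses)  # roughly 0.00022 and 0.633, matching the printed loss up to float32 rounding
```

The first example is classified confidently and correctly, so its loss is near zero. The second is correct but uncertain, so its loss is close to `-log(0.53)`.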

### Step 8: Saving the Pre-Trained Model and Tokenizer

In [14]:
save_directory = "Downloads"
tokenizer.save_pretrained(save_directory)
tf_model.save_pretrained(save_directory)


### Step 9: Extracting Hidden States and Attention Weights

In [18]:
tf_outputs = tf_model(tf_batch, output_hidden_states=True, output_attentions=True)
all_hidden_states, all_attentions = tf_outputs[-2:]
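The cell above prints nothing, so it helps to know what to expect. The sketch below computes the shapes these two tuples should have for the batch from Step 4 (2 sentences, 14 tokens each). It assumes the usual DistilBERT-base settings of 6 layers, 12 attention heads and a hidden size of 768; those values are not printed anywhere in this notebook. `expected_output_shapes` is a hypothetical helper.

```python
def expected_output_shapes(batch_size, seq_len, n_layers=6, n_heads=12, dim=768):
    """Return the expected tensor shapes for hidden states and attention weights."""
    # One hidden state for the embedding output plus one per transformer layer.
    hidden = [(batch_size, seq_len, dim)] * (n_layers + 1)
    # One attention map per layer: every head scores every token pair.
    attentions = [(batch_size, n_heads, seq_len, seq_len)] * n_layers
    return hidden, attentions

hidden_shapes, attention_shapes = expected_output_shapes(batch_size=2, seq_len=14)
print(len(hidden_shapes), hidden_shapes[0])        # 7 (2, 14, 768)
print(len(attention_shapes), attention_shapes[0])  # 6 (2, 12, 14, 14)
```

Indexing with `tf_outputs[-2:]` relies on hidden states and attentions being the last two fields of the output. The named attributes `tf_outputs.hidden_states` and `tf_outputs.attentions` give the same tensors and are easier to read.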


## 5. Loading and Customizing DistilBERT for Sentiment Analysis and Multi-Class Classification

### Step 1: Loading a Pre-Trained DistilBERT Model

In [23]:
from transformers import DistilBertTokenizer, TFDistilBertForSequenceClassification

model_name = "distilbert-base-uncased-finetuned-sst-2-english"
model = TFDistilBertForSequenceClassification.from_pretrained(model_name)
tokenizer = DistilBertTokenizer.from_pretrained(model_name)



All PyTorch model weights were used when initializing TFDistilBertForSequenceClassification.

All the weights of TFDistilBertForSequenceClassification were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForSequenceClassification for predictions without further training.


### Step 2: Creating a Custom Configuration for DistilBERT

In [24]:
from transformers import DistilBertConfig, DistilBertTokenizer, TFDistilBertForSequenceClassification

config = DistilBertConfig(n_heads=8, dim=512, hidden_dim=4*512)
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased')
model = TFDistilBertForSequenceClassification(config)
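A model built directly from a config, as above, starts with randomly initialized weights; only `from_pretrained` loads trained ones. The chosen sizes also have to fit together: the hidden size is split evenly across the attention heads. The sketch below checks that for the values used here. `check_attention_config` is a hypothetical helper, and the divisibility rule is stated as an assumption about how multi-head attention is normally implemented, not as a quote from the library.

```python
def check_attention_config(n_heads, dim, hidden_dim):
    """Validate head/hidden sizes and report the derived per-head dimension."""
    if dim % n_heads != 0:
        raise ValueError(f"dim={dim} is not divisible by n_heads={n_heads}")
    return {
        "head_dim": dim // n_heads,         # size of each attention head
        "ffn_expansion": hidden_dim / dim,  # feed-forward width relative to dim
    }

# Values from the DistilBertConfig call above.
print(check_attention_config(n_heads=8, dim=512, hidden_dim=4 * 512))
# {'head_dim': 64, 'ffn_expansion': 4.0}
```

With 8 heads and `dim=512`, each head works on 64 dimensions, and the feed-forward layer is four times as wide as the hidden size.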



### Step 3: Loading DistilBERT with a Custom Number of Labels

In [25]:
from transformers import DistilBertConfig, DistilBertTokenizer, TFDistilBertForSequenceClassification

model_name = "distilbert-base-uncased"
model = TFDistilBertForSequenceClassification.from_pretrained(model_name, num_labels=10)
tokenizer = DistilBertTokenizer.from_pretrained(model_name)


Some weights of the PyTorch model were not used when initializing the TF 2.0 model TFDistilBertForSequenceClassification: ['vocab_transform.weight', 'vocab_transform.bias', 'vocab_projector.bias', 'vocab_layer_norm.bias', 'vocab_layer_norm.weight']
- This IS expected if you are initializing TFDistilBertForSequenceClassification from a PyTorch model trained on another task or with another architecture (e.g. initializing a TFBertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFDistilBertForSequenceClassification from a PyTorch model that you expect to be exactly identical (e.g. initializing a TFBertForSequenceClassification model from a BertForSequenceClassification model).
Some weights or buffers of the TF 2.0 model TFDistilBertForSequenceClassification were not initialized from the PyTorch model and are newly initialized: ['pre_classifier.weight', 'pre_classifier.bias', 'classifier.weight', 'classifier.bias']
You should 