<a href="https://colab.research.google.com/github/alikaiser12/AI/blob/main/Transformer_01.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using a Transformer for Text Classification
We'll work on a text classification task, where we classify a sentence into one of the predefined categories. We will use the BERT model (Bidirectional Encoder Representations from Transformers) for this task.

# Step 1: Install Required Libraries
If you don't have the libraries installed, you can install them by running the following commands:



In [1]:
!pip install transformers
!pip install torch



Libraries we are using:

transformers: This is the Hugging Face library that provides access to pre-trained Transformer models like BERT, GPT, T5, etc.

torch: This is PyTorch, the deep learning framework used for training models.

# Step 2: Import Libraries
Now, let’s import the necessary libraries.

In [2]:
import torch
from transformers import BertTokenizer, BertForSequenceClassification
from transformers import pipeline


BertTokenizer: This splits text into WordPiece subword tokens and maps each token to an integer ID, which is the input format the model expects.

BertForSequenceClassification: This is a BERT encoder with a classification layer on top, used for sequence classification.

pipeline: A high-level utility from Hugging Face to easily use pre-trained models.
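To make the tokenizer's job concrete, here is a minimal sketch of greedy longest-match-first subword splitting, the idea behind WordPiece. The vocabulary below is a hypothetical toy one and the function name `wordpiece` is only illustrative; the real BertTokenizer also handles lowercasing, punctuation, and special tokens.

```python
# Toy vocabulary (hypothetical); bert-base-uncased has about 30,000 entries.
VOCAB = {"[UNK]": 0, "trans": 1, "##form": 2, "##ers": 3, "are": 4, "amazing": 5}


def wordpiece(word, vocab):
    """Greedy longest-match-first split of one lowercase word into subword pieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word-internal piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return ["[UNK]"]  # no piece matched, so the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces


tokens = [p for w in "transformers are amazing".split() for p in wordpiece(w, VOCAB)]
token_ids = [VOCAB[t] for t in tokens]  # integer IDs are what the model consumes
print(tokens)
print(token_ids)
```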

# Step 3: Load Pre-trained BERT Model
We will use a pre-trained BERT model for classification tasks. Hugging Face provides various pre-trained models that we can fine-tune or use directly.

In [3]:
# Load a pre-trained BERT tokenizer and model for sequence classification
model_name = "bert-base-uncased"  # BERT model with lowercase English text

tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name, num_labels=2)  # num_labels=2 for binary classification



Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


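BertTokenizer: Loads the tokenizer that matches the BERT checkpoint. It converts text into token IDs (integers).

BertForSequenceClassification: This loads the pre-trained BERT encoder and adds a classification layer on top. The bert-base-uncased checkpoint is not fine-tuned for classification, so the classification layer (classifier.weight and classifier.bias) is randomly initialized, as the warning above says. The model's predictions are not meaningful until it is fine-tuned on labeled data.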

# Step 4: Prepare the Input Data
Now we will create some example sentences to classify.

In [4]:
# Example sentences for classification
sentences = ["I love machine learning", "I hate doing homework", "Transformers are amazing for NLP"]

# Convert the sentences into token IDs that the model understands
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")


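tokenizer: This converts the sentences into token IDs. We use padding=True to pad the sentences to the length of the longest one in the batch, truncation=True to truncate sentences longer than the model's maximum input length, and return_tensors="pt" to get PyTorch tensors.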
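The effect of padding and truncation can be sketched in plain Python. This is an illustration under stated assumptions, not the tokenizer's actual code: the helper `pad_and_truncate` and the middle token IDs are made up, while 101, 102, and 0 are the IDs bert-base-uncased uses for [CLS], [SEP], and [PAD].

```python
# Token-ID sequences of different lengths; the middle IDs are made up.
batch = [[101, 7, 8, 102], [101, 9, 102], [101, 4, 5, 6, 3, 102]]
PAD_ID = 0      # ID of [PAD] in bert-base-uncased
MAX_LENGTH = 5  # illustrative limit; BERT's real limit is 512 tokens


def pad_and_truncate(sequences, pad_id, max_length):
    """Cut sequences to max_length, then pad all of them to a common width."""
    # Simplification: the real tokenizer keeps the closing [SEP] when truncating.
    truncated = [seq[:max_length] for seq in sequences]
    width = max(len(seq) for seq in truncated)
    input_ids = [seq + [pad_id] * (width - len(seq)) for seq in truncated]
    # 1 marks a real token, 0 marks padding the model should ignore.
    attention_mask = [[1] * len(seq) + [0] * (width - len(seq)) for seq in truncated]
    return input_ids, attention_mask


input_ids, attention_mask = pad_and_truncate(batch, PAD_ID, MAX_LENGTH)
for ids, mask in zip(input_ids, attention_mask):
    print(ids, mask)
```

The attention mask is why padding does not change the result: it is returned alongside the token IDs so the model can skip the padded positions.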

# Step 5: Perform Classification with the Model
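We will now run the model to classify the sentences. Since we are performing binary classification (e.g., positive vs. negative sentiment), the model will output one raw score (logit) per class for each sentence, which we then convert into probabilities with softmax. Because the classification layer has not been trained, the probabilities are close to 0.5 here and do not yet reflect sentiment.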

In [5]:
# Perform classification
with torch.no_grad():  # Disable gradient calculations to save memory
    outputs = model(**inputs)  # Pass the tokenized sentences through the model
    logits = outputs.logits  # Get the raw logits (scores) for each class

# Convert logits to probabilities (use softmax)
probs = torch.nn.functional.softmax(logits, dim=-1)

# Print the probabilities
print(probs)


tensor([[0.5759, 0.4241],
        [0.6309, 0.3691],
        [0.5455, 0.4545]])


outputs.logits: The raw output of the model. These are the unnormalized scores for each class.

softmax: We apply the softmax function to convert the logits into probabilities.
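For reference, softmax maps each logit z_i to exp(z_i) / sum_j exp(z_j). The sketch below computes this by hand in plain Python; the logit values are hypothetical, chosen only for illustration, and are not the logits the notebook's model produced.

```python
import math


def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps exp() from overflowing; the result is unchanged.
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [0.2, -0.1]  # hypothetical scores for class 0 and class 1
probs = softmax(logits)
print(probs)
print(sum(probs))
```

Only the differences between logits matter: adding the same constant to every logit leaves the probabilities unchanged.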

# Step 6: Interpret the Output
Now that we have the probabilities, let’s see the predicted labels for the sentences.

In [6]:
# Get the predicted labels (class with the highest probability)
predictions = torch.argmax(probs, dim=-1)

# Print the predicted labels (0 or 1 for binary classification)
for i, sentence in enumerate(sentences):
    print(f"Sentence: {sentence}\nPredicted label: {predictions[i].item()}")


Sentence: I love machine learning
Predicted label: 0
Sentence: I hate doing homework
Predicted label: 0
Sentence: Transformers are amazing for NLP
Predicted label: 0


torch.argmax: This finds the index (label) with the highest probability for each sentence. In binary classification, this will give either 0 or 1.
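The argmax step, followed by turning class indices into readable names, can be sketched as follows. The probabilities are copied from the output in Step 5; the `ID2LABEL` mapping is an assumption for illustration, since an untrained classification layer gives the two indices no real meaning.

```python
# Probabilities printed in Step 5 (one row per sentence).
probs = [[0.5759, 0.4241], [0.6309, 0.3691], [0.5455, 0.4545]]

# Hypothetical mapping; a fine-tuned model would define its own label names.
ID2LABEL = {0: "NEGATIVE", 1: "POSITIVE"}


def argmax(row):
    """Return the index of the largest value in a list."""
    return max(range(len(row)), key=row.__getitem__)


predictions = [argmax(row) for row in probs]
labels = [ID2LABEL[i] for i in predictions]
print(predictions)
print(labels)
```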

# Step 7: (Optional) Using a Pipeline for Simpler Usage
The Hugging Face pipeline makes things easier by wrapping tokenization, the forward pass, and post-processing in a single call. Here’s how you can run the sentiment-analysis pipeline (binary classification) with the model and tokenizer loaded above:

In [7]:
# Use Hugging Face's pipeline for sentiment analysis
classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)

# Classify the example sentences
results = classifier(sentences)

# Print results
for result in results:
    print(f"Label: {result['label']}, Confidence: {result['score']}")


Device set to use cpu


Label: LABEL_0, Confidence: 0.5759455561637878
Label: LABEL_0, Confidence: 0.6308655142784119
Label: LABEL_0, Confidence: 0.5455143451690674


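pipeline: This wraps tokenization, inference, and softmax in one call. When only a task name is given, it loads a default model and tokenizer for that task; here we pass our own model and tokenizer, so the labels are the generic LABEL_0 and LABEL_1, and the scores match the probabilities from Step 5.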

results: This contains the predictions, including the label and confidence score.
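To get meaningful sentiment labels, the pipeline can be pointed at a checkpoint that has already been fine-tuned for sentiment analysis. The sketch below assumes the public `distilbert-base-uncased-finetuned-sst-2-english` checkpoint, which is not used in this notebook; the helper name `classify_sentences` is also illustrative.

```python
# Assumption: this checkpoint name is not part of the original notebook.
FINETUNED_MODEL = "distilbert-base-uncased-finetuned-sst-2-english"


def classify_sentences(sentences, model_name=FINETUNED_MODEL):
    """Return (label, score) pairs from a sentiment pipeline built on model_name."""
    # Imported here so that defining the function does not require transformers.
    from transformers import pipeline

    # Passing a model name lets the pipeline load the matching tokenizer as well.
    classifier = pipeline("sentiment-analysis", model=model_name)
    return [(result["label"], result["score"]) for result in classifier(sentences)]
```

Calling `classify_sentences(sentences)` downloads the checkpoint on first use, so it needs network access. Unlike the LABEL_0 output above, a fine-tuned checkpoint reports the label names stored in its own configuration.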

# Step 8: Full Example Code
Here’s the full code for text classification using a pre-trained Transformer model (BERT):

In [8]:
import torch
from transformers import BertTokenizer, BertForSequenceClassification
from transformers import pipeline

# Load pre-trained BERT model and tokenizer
model_name = "bert-base-uncased"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Example sentences for classification
sentences = ["I love machine learning", "I hate doing homework", "Transformers are amazing for NLP"]

# Convert the sentences into token IDs
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

# Perform classification
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits

# Convert logits to probabilities using softmax
probs = torch.nn.functional.softmax(logits, dim=-1)

# Get predicted labels
predictions = torch.argmax(probs, dim=-1)

# Display results
for i, sentence in enumerate(sentences):
    print(f"Sentence: {sentence}\nPredicted label: {predictions[i].item()}")


Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


Sentence: I love machine learning
Predicted label: 0
Sentence: I hate doing homework
Predicted label: 0
Sentence: Transformers are amazing for NLP
Predicted label: 0


# Explanation:
We load a pre-trained BERT model and its tokenizer.

We prepare input data (sentences), tokenize them, and pass them through the model.

The model outputs raw scores (logits), which we convert into probabilities.

Finally, we predict the class with the highest probability and print the results.

# Conclusion:
Transformers use self-attention to capture relationships between words in a sequence, regardless of how far apart the words are.

We used BERT (a Transformer model) for a simple text classification task where the model assigns each sentence to one of two classes. Because bert-base-uncased has no trained classification layer, the labels above are not real sentiment predictions; fine-tuning on labeled data is needed for the model to predict whether a sentence is positive or negative.

Transformers form the foundation of state-of-the-art NLP models such as GPT, BERT, and T5.
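The self-attention mentioned above can be sketched as scaled dot-product attention in plain Python. This is a simplified illustration, not BERT's implementation: the token vectors are hypothetical, and a real layer first applies learned projections to produce the queries, keys, and values and uses several attention heads.

```python
import math


def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """For each query, return a weighted average of the values."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this token to every token, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Mix the value vectors according to the attention weights.
        mixed = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Three hypothetical 2-dimensional token vectors used as queries, keys, and values.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(x, x, x)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of weights shows how much one token attends to every token in the sequence, and nothing in the computation depends on how far apart two tokens are.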