In [1]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [2]:
!nvidia-smi

Mon Aug 18 19:43:41 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   51C    P8             10W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

In [3]:
!pip install transformers



The `transformers` package installed above is Hugging Face's library for loading pre-trained models. The term "transformers" itself refers, in natural language processing (NLP) and machine learning, to the family of models built on the transformer architecture introduced in the paper "Attention Is All You Need" by Vaswani et al. (2017). Some key transformer models and variants developed since then:

1. BERT (Bidirectional Encoder Representations from Transformers)
Focuses on understanding context in both directions (left and right) using masked language modeling.
2. GPT (Generative Pre-trained Transformer)
Developed by OpenAI, GPT models (like GPT-2 and GPT-3) are autoregressive models primarily used for text generation.
3. T5 (Text-to-Text Transfer Transformer)
Treats every NLP task as a text-to-text problem, making it very flexible across various applications.
4. RoBERTa (A Robustly Optimized BERT Pretraining Approach)
A retrained BERT that uses more data, longer training, larger batches, and dynamic masking, and drops the next-sentence-prediction objective.
5. XLNet
An autoregressive model trained with permutation language modeling, which lets it use context from both directions without BERT's [MASK] tokens.
6. ALBERT (A Lite BERT)
A BERT variant that cuts the parameter count through cross-layer parameter sharing and a factorized embedding matrix while keeping comparable performance.
7. DistilBERT
A distilled version of BERT that is smaller and faster while retaining much of its performance.
8. ERNIE (Enhanced Representation through kNowledge Integration)
Developed by Baidu, it incorporates external knowledge to improve language understanding.
9. ELECTRA
Instead of predicting masked tokens like BERT, ELECTRA trains a discriminator to detect which tokens a small generator has replaced, leading to more efficient training.
10. DeBERTa (Decoding-enhanced BERT with Disentangled Attention)
Represents each token's content and position as separate vectors (disentangled attention) and adds an enhanced mask decoder, improving performance on various NLP tasks.
11. Vision Transformers (ViT)
Adapts the transformer architecture for image processing tasks, treating images as sequences of patches.
12. BART (Bidirectional and Auto-Regressive Transformers)
Combines BERT's bidirectional encoding and GPT's autoregressive decoding for tasks like summarization and translation.
13. LayoutLM
Designed for document understanding, incorporating layout information from scanned documents.
14. Swin Transformer
A hierarchical vision transformer that computes attention within shifted local windows and can be used for both image classification and detection tasks.
15. Transformer-XL
Introduces segment-level recurrence to the transformer architecture: hidden states from the previous segment are cached and reused, allowing it to handle longer sequences more effectively.
These are just some of the prominent transformer models and architectures. The field is evolving rapidly, and new variants and improvements keep appearing.
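The difference between BERT-style bidirectional context and GPT-style autoregressive context comes down to which positions each token is allowed to attend to. The sketch below is a minimal pure-Python illustration of the two attention patterns; the function names `bidirectional_mask` and `causal_mask` are made up for this example and are not part of any library. Real models apply the same idea to tensors inside their attention layers.

```python
def bidirectional_mask(n):
    """BERT-style: every position may attend to every other position."""
    return [[1] * n for _ in range(n)]


def causal_mask(n):
    """GPT-style: position i may attend only to positions 0..i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


# Illustrative tokens only; real tokenizers split text differently.
tokens = ["Hello", ",", "how", "are", "you", "?"]

for name, mask in [("bidirectional", bidirectional_mask(len(tokens))),
                   ("causal", causal_mask(len(tokens)))]:
    print(name)
    for tok, row in zip(tokens, mask):
        # Each row shows which positions (1 = visible) the token can attend to.
        print(f"  {tok:>5} {row}")
```

In the causal case the visible region is a lower triangle, which is why a model like GPT-2 can generate text one token at a time from left to right.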

In [4]:
from transformers import AutoTokenizer

# Load the tokenizer for a specific model (e.g., GPT-2)
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Tokenize some input text
text = "Hello, how are you?"
tokens = tokenizer(text, return_tensors='pt')
print(tokens)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


{'input_ids': tensor([[15496,    11,   703,   389,   345,    30]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1]])}
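In the output above, `input_ids` holds one integer per token and `attention_mask` is all ones because a single sequence needs no padding. The mask becomes useful once sequences of different lengths are batched together. The following is a rough sketch of that idea in plain Python, not the tokenizer's actual implementation: `pad_batch` is a hypothetical helper, and using 50256 (GPT-2's end-of-text id) as the padding id is an assumption, since GPT-2 defines no dedicated pad token.

```python
def pad_batch(sequences, pad_id):
    """Right-pad lists of token ids to a common length and build the matching mask."""
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(list(seq) + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# First sequence: the ids printed above for "Hello, how are you?".
# Second sequence: just its first two ids, to force some padding.
batch = pad_batch([[15496, 11, 703, 389, 345, 30], [15496, 11]], pad_id=50256)
print(batch["input_ids"])
print(batch["attention_mask"])
```

Right padding is used here only for simplicity; for batched generation with decoder-only models such as GPT-2, padding on the left is the more common choice.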


In [5]:
from transformers import AutoModelForCausalLM

# Load the pre-trained GPT-2 model
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Generate text with greedy decoding (the default here); max_length counts the prompt tokens too
input_ids = tokenizer.encode("indian cricket", return_tensors='pt')
output = model.generate(input_ids, max_length=50)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)

print(generated_text)

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


indian cricket team, which has been in the country for over a decade.

The team's captain, Ravi Shankar, has been in the country for over a decade.

The team's captain, Ravi Shankar,
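
The warnings above appear because `generate` received only `input_ids`, and the repeated sentences are typical of greedy decoding. One possible way to address both is sketched below. It assumes the `model` and `tokenizer` objects from the earlier cells; `generate_text` is a hypothetical helper name, and the value `no_repeat_ngram_size=2` is an arbitrary choice for illustration, not a recommended setting.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=40):
    """Generate a continuation while passing the attention mask and a pad token id."""
    # Calling the tokenizer (rather than .encode) returns both input_ids and attention_mask.
    encoded = tokenizer(prompt, return_tensors="pt")
    output = model.generate(
        **encoded,
        max_new_tokens=max_new_tokens,        # counts only newly generated tokens
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token, so reuse EOS
        no_repeat_ngram_size=2,               # forbid repeating any 2-token sequence
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

In this notebook it could be called as `print(generate_text(model, tokenizer, "indian cricket"))`. Passing `do_sample=True` together with `top_k` or `top_p` is another common way to reduce repetition, at the cost of non-deterministic output.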
