# BERT
- **Objective:**
  - Understand the architecture and use cases of BERT (Bidirectional Encoder Representations from Transformers).
- **Topics Covered:**
  - Pre-training objectives: **Masked Language Modeling (MLM)** and **Next Sentence Prediction (NSP)**.
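  - Key features:
    - Bidirectional context: every token attends to both its left and right neighbors.
    - Contextual embeddings for every token (used for tasks like NER), plus a pooled `[CLS]` representation (used for classification).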
  - Variants of BERT (e.g., DistilBERT, RoBERTa).
- **Practical Component:**
  - Load a pre-trained BERT model from HuggingFace.
  - Perform a simple text classification task.
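
The MLM objective listed above can be illustrated without loading a model. The following is a minimal sketch, assuming the commonly described BERT recipe: roughly 15% of tokens are selected, and of those 80% become `[MASK]`, 10% are swapped for a random token, and 10% are left unchanged. The function name `mask_tokens`, the toy vocabulary, and the fixed seed are illustrative assumptions, and special tokens such as `[CLS]` and `[SEP]` are ignored for simplicity.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Return (corrupted_tokens, labels) using an 80/10/10 corruption rule.

    labels[i] holds the original token at selected positions and None
    elsewhere, so only selected positions would contribute to the loss.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # the model must recover the original token
            roll = rng.random()
            if roll < 0.8:
                corrupted.append(MASK)               # 80%: replace with [MASK]
            elif roll < 0.9:
                corrupted.append(rng.choice(vocab))  # 10%: random token
            else:
                corrupted.append(tok)                # 10%: leave unchanged
        else:
            labels.append(None)  # position is not scored
            corrupted.append(tok)
    return corrupted, labels

# Hypothetical toy input; a real pipeline would operate on WordPiece token IDs.
tokens = "the movie was fantastic and the cast was great".split()
vocab = sorted(set(tokens))
corrupted, labels = mask_tokens(tokens, vocab)
print(corrupted)
print(labels)
```

Because the model sees the whole corrupted sequence at once, it can use context on both sides of each selected position, which is what makes the encoder bidirectional.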

In [None]:
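import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# The classification head is newly initialized (not fine-tuned), so the logits
# below are not meaningful predictions until the model is trained on labeled data.
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
model.eval()  # disable dropout for inference

inputs = tokenizer("The movie was fantastic!", return_tensors="pt")
with torch.no_grad():  # no gradients needed for a forward pass
    outputs = model(**inputs)
print(outputs.logits)  # shape (1, 2): one raw score per class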
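
The logits printed by the cell above are raw, unnormalized scores. As a minimal sketch of how such scores are usually turned into class probabilities, the snippet below applies a softmax in plain Python. The values in `example_logits` are made up for illustration and are not output from the model.

```python
import math

def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

example_logits = [0.3, -0.1]  # hypothetical two-class scores
probs = softmax(example_logits)
predicted_class = probs.index(max(probs))  # index of the most likely class
print(probs, predicted_class)
```

With real model output, the same step is typically done with `torch.softmax(outputs.logits, dim=-1)`.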

# GPT
- **Objective:**
  - Explore the GPT family of models and their generative capabilities.
- **Topics Covered:**
  - Pre-training objective: **Causal Language Modeling (CLM)**.
  - Differences from BERT:
    - Autoregressive nature.
    - Focus on text generation.
  - Applications: Chatbots, summarization, and text completion.
  - Variants: GPT, GPT-2, GPT-3, GPT-4.
- **Practical Component:**
  - Load a GPT model for text generation.
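
The autoregressive behavior listed above comes down to two ideas: each position may only see itself and earlier positions, and the training target at each step is the next token. The following is a minimal sketch in plain Python; the helper names `causal_mask`, `bidirectional_mask`, and `next_token_pairs` are illustrative assumptions, not library functions.

```python
def causal_mask(n):
    """Row i marks the positions token i may attend to (1 = visible)."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

def bidirectional_mask(n):
    """BERT-style mask for comparison: every token sees every position."""
    return [[1] * n for _ in range(n)]

def next_token_pairs(tokens):
    """Pair each prefix with the token the model must predict next."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

tokens = ["Once", "upon", "a", "time"]

print("Causal (GPT-style) mask:")
for row in causal_mask(len(tokens)):
    print(row)

print("Bidirectional (BERT-style) mask:")
for row in bidirectional_mask(len(tokens)):
    print(row)

print("Causal language modeling targets:")
for prefix, target in next_token_pairs(tokens):
    print(prefix, "->", target)
```

The lower-triangular mask is what lets the model generate text one token at a time: at inference there are no future tokens to look at, so training must not rely on them either.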

In [None]:
from transformers import GPT2Tokenizer, GPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

input_text = "Once upon a time"
inputs = tokenizer(input_text, return_tensors="pt")
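# Pass the attention mask along with the input IDs. GPT-2 has no pad token,
# so reuse the end-of-sequence token to avoid a warning during generation.
outputs = model.generate(
    **inputs,
    max_length=50,  # total length in tokens, prompt included
    num_return_sequences=1,
    pad_token_id=tokenizer.eos_token_id,
)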
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

- **BERT** is an encoder-only model whose bidirectional context suits understanding tasks such as classification and NER.
- **GPT** is a decoder-only, autoregressive model suited to text generation.

> **Note**: Training a language model from scratch requires significant computational resources but is a valuable learning experience.
