# BERT Basics

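BERT (Bidirectional Encoder Representations from Transformers) is a language representation model introduced by Google in 2018. It reads text in both directions at once, so each token's representation depends on its full left and right context.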

## Key Points:
- Based on the **Transformer encoder** architecture.
- Pre-trained with two objectives: **Masked Language Modeling (MLM)** and **Next Sentence Prediction (NSP)**.
- Produces **contextual word embeddings**: the same word gets a different vector depending on the words around it.
- Can be fine-tuned for many NLP tasks: classification, question answering, named entity recognition (NER), etc.
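
To make the MLM objective more concrete, here is a minimal sketch of the token-corruption step in plain Python. It assumes the recipe described in the BERT paper (select about 15% of tokens; of those, replace 80% with `[MASK]`, 10% with a random token, and leave 10% unchanged). The function name `mask_tokens`, the toy vocabulary, and the inflated `mask_prob=0.5` in the demo are illustrative choices, not part of any library.

```python
import random

MASK, CLS, SEP = "[MASK]", "[CLS]", "[SEP]"

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Return (corrupted_tokens, labels) following the 80/10/10 MLM recipe.

    labels[i] holds the original token at positions the model must predict,
    and None everywhere else.
    """
    rng = random.Random(seed)
    corrupted, labels = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        # Special tokens are never selected for prediction.
        if tok in (CLS, SEP) or rng.random() >= mask_prob:
            continue
        labels[i] = tok
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK               # 80%: replace with [MASK]
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return corrupted, labels

# Hypothetical toy sentence and vocabulary, for illustration only.
tokens = [CLS, "bert", "reads", "text", "in", "both", "directions", SEP]
mlm_vocab = ["bert", "reads", "text", "in", "both", "directions", "model"]
corrupted, labels = mask_tokens(tokens, mlm_vocab, mask_prob=0.5, seed=1)
print(corrupted)
print(labels)
```

During pre-training the model sees `corrupted` and is scored only on the positions where `labels` is not `None`.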

## 1. Installing Hugging Face Transformers

In [1]:
!pip install transformers torch --quiet

## 2. Loading Pretrained BERT Model & Tokenizer

In [2]:
from transformers import BertTokenizer, BertModel

# Load pre-trained BERT tokenizer and model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')

## 3. Tokenizing Text

In [3]:
text = "BERT is transforming NLP."
inputs = tokenizer(text, return_tensors='pt')
print(inputs)

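The tokenizer converts the text into tensors:
- `input_ids`: integer IDs of the WordPiece tokens, with `[CLS]` prepended and `[SEP]` appended
- `attention_mask`: 1 for real tokens, 0 for padding positions the model should ignore
- `token_type_ids`: segment IDs that tell the two sentences of a pair apart (all 0 for a single sentence)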
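
The layout of these three fields is easier to see on a padded sentence pair. The sketch below rebuilds it by hand in plain Python; `encode_pair` and the tiny vocabulary with made-up IDs are hypothetical stand-ins, not the real tokenizer or the real `bert-base-uncased` vocabulary.

```python
def encode_pair(tokens_a, tokens_b, vocab, max_len):
    """Build BERT-style inputs for a sentence pair, padded to max_len."""
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    if len(tokens) > max_len:
        raise ValueError("sequence longer than max_len; truncation not shown")
    # Segment 0 covers [CLS], sentence A and the first [SEP]; segment 1 the rest.
    token_type_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    input_ids = [vocab.get(t, vocab["[UNK]"]) for t in tokens]
    attention_mask = [1] * len(tokens)
    pad = max_len - len(tokens)
    return {
        "input_ids": input_ids + [vocab["[PAD]"]] * pad,
        "token_type_ids": token_type_ids + [0] * pad,
        "attention_mask": attention_mask + [0] * pad,
    }

# Toy vocabulary with invented IDs, for illustration only.
toy_ids = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
           "bert": 4, "is": 5, "fast": 6, "really": 7}
enc = encode_pair(["bert", "is", "fast"], ["really"], toy_ids, max_len=8)
for name, values in enc.items():
    print(f"{name:15s} {values}")
```

The last position is padding, so its `attention_mask` entry is 0 while every real token, including both `[SEP]` markers, gets 1.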

## 4. Getting Embeddings from BERT

In [4]:
import torch
with torch.no_grad():  # inference only, so skip gradient tracking
    outputs = model(**inputs)
last_hidden_states = outputs.last_hidden_state
print(last_hidden_states.shape)  # (batch_size, sequence_length, hidden_size)

Output shape example:
```
torch.Size([1, 6, 768])
```
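This means:
- 1 sequence in the batch
- 6 token positions, counting the special `[CLS]` and `[SEP]` tokens (the exact number depends on how WordPiece splits the input, so your run may print a different value)
- one 768-dimensional embedding per token, 768 being the hidden size of `bert-base-uncased`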
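
A downstream task usually needs one vector per sequence rather than one per token. Two common ways to reduce the token axis are taking the `[CLS]` position or averaging over the unmasked tokens. The sketch below shows both on plain Python lists standing in for a single sequence of `last_hidden_state`; the helper names and the toy numbers (hidden size 3 instead of 768) are assumptions made for illustration.

```python
def cls_vector(hidden_states):
    """Return the vector at position 0, which corresponds to [CLS]."""
    return hidden_states[0]

def mean_pool(hidden_states, attention_mask):
    """Average token vectors, skipping positions where the mask is 0."""
    kept = [vec for vec, m in zip(hidden_states, attention_mask) if m == 1]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    return [sum(col) / len(kept) for col in zip(*kept)]

# Toy stand-in for one sequence: 4 positions, hidden size 3.
hidden = [
    [1.0, 0.0, 2.0],   # [CLS]
    [3.0, 2.0, 0.0],   # token
    [2.0, 4.0, 1.0],   # [SEP]
    [9.0, 9.0, 9.0],   # padding, masked out below
]
mask = [1, 1, 1, 0]

print(cls_vector(hidden))       # [1.0, 0.0, 2.0]
print(mean_pool(hidden, mask))  # [2.0, 2.0, 1.0]
```

The mask matters once several sequences of different lengths are batched together, since padded positions would otherwise skew the average.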

## 5. Using BERT for Sentiment Classification
We'll use the Hugging Face `pipeline` API for a quick sentiment classification demo.

In [5]:
from transformers import pipeline

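# Load a sentiment pipeline; pin the default checkpoint (DistilBERT fine-tuned on SST-2)
classifier = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")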

result = classifier("I love learning BERT models!")[0]
print(result)

### Example Output:
```python
{'label': 'POSITIVE', 'score': 0.9994}
```

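The default model behind this pipeline is a DistilBERT checkpoint (a smaller, distilled variant of BERT) fine-tuned for sentiment classification. Here it labels the text as **positive** with a score close to 1.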
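
The `score` field is a probability, not a raw model output. For a two-label classifier like this one, the pipeline typically applies a softmax to the model's logits and reports the highest-probability label. A minimal sketch of that last step, with hypothetical logits and helper names that are not part of the library:

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def to_prediction(logits, labels):
    """Pick the highest-probability label, in the pipeline's output shape."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": labels[best], "score": probs[best]}

# Hypothetical logits for [NEGATIVE, POSITIVE]; not taken from a real model run.
pred = to_prediction([-3.5, 4.0], ["NEGATIVE", "POSITIVE"])
print(pred)
```

A large gap between the two logits is what pushes the score close to 1.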

## Summary
- BERT is a Transformer-based model that provides **context-aware embeddings**.
- The tokenizer converts text into input IDs and attention masks.
- Embeddings can be used for downstream NLP tasks.
- Hugging Face makes BERT easy to use for tasks like **sentiment analysis**.

👉 Next: Explore **fine-tuning BERT** on a custom dataset.