# 🌟 Day 19: Attention & Transformers with TensorFlow

Today we will cover:
- What is the attention mechanism?
- An overview of the Transformer architecture
- Building a text classifier using BERT


## 📌 1️⃣ Why Attention?

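RNNs compress everything they have read into a fixed-size hidden state, and CNNs only see a fixed-size window at each layer, so both struggle to relate distant tokens.
With attention, the model compares each input token against every other token and decides where to focus.

This works in both NLP and vision.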
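
To make the "compare every token with every other token" idea concrete, here is a minimal NumPy sketch of scaled dot-product attention, the form used in Transformers. The toy embeddings are random and the function names are illustrative, not part of any library.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

def scaled_dot_product_attention(queries, keys, values):
    """Return (context, weights) for a single sequence.

    queries, keys: (num_tokens, d_k); values: (num_tokens, d_v).
    """
    d_k = queries.shape[-1]
    # One similarity score for every (query token, key token) pair
    scores = queries @ keys.T / np.sqrt(d_k)
    # Each row becomes a probability distribution: where that token "focuses"
    weights = softmax(scores, axis=-1)
    # Each output vector is a weighted average of the value vectors
    context = weights @ values
    return context, weights

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))  # 4 toy tokens with 8-dim embeddings (random, for illustration)
context, weights = scaled_dot_product_attention(tokens, tokens, tokens)

print(weights.round(2))       # 4 x 4 matrix of attention weights
print(weights.sum(axis=-1))   # every row sums to 1
```

Row `i` of `weights` tells you how much token `i` draws from each token in the sequence, so no token is limited to a fixed window.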

## 🤖 2️⃣ Transformer Architecture

Transformer = Encoder + Decoder + Self-Attention

- Encoder: processes the input
- Decoder: generates the output
- Self-Attention: each token assigns weights to the tokens in its context that matter most to it

Here we use only the encoder, since BERT is an encoder-only model.
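
As a rough sketch of what a single encoder layer does, the NumPy code below chains self-attention and a feed-forward block, each followed by a residual connection and layer normalization. It is deliberately simplified and rests on several assumptions: one attention head, no biases, no learned layer-norm parameters, ReLU instead of the GELU that BERT uses, and random weights. All names are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    exp = np.exp(x - x.max(axis=axis, keepdims=True))
    return exp / exp.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-6):
    """Normalize each token vector to zero mean and (roughly) unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)

def encoder_layer(x, params):
    """One simplified encoder layer for a sequence x of shape (num_tokens, d_model)."""
    # Self-attention: queries, keys and values all come from the same input
    q, k, v = x @ params["w_q"], x @ params["w_k"], x @ params["w_v"]
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    x = layer_norm(x + weights @ v)              # residual connection + norm

    # Position-wise feed-forward block, applied to every token independently
    hidden = np.maximum(0.0, x @ params["w_1"])  # ReLU (assumption; BERT uses GELU)
    return layer_norm(x + hidden @ params["w_2"])

rng = np.random.default_rng(0)
d_model, d_ff, num_tokens = 8, 16, 5             # toy sizes chosen for illustration
params = {
    "w_q": rng.normal(size=(d_model, d_model)),
    "w_k": rng.normal(size=(d_model, d_model)),
    "w_v": rng.normal(size=(d_model, d_model)),
    "w_1": rng.normal(size=(d_model, d_ff)),
    "w_2": rng.normal(size=(d_ff, d_model)),
}
x = rng.normal(size=(num_tokens, d_model))
encoded = encoder_layer(x, params)
print(encoded.shape)  # (5, 8): same shape in, same shape out
```

Because the output shape matches the input shape, layers like this can be stacked; the `L-4` in the small BERT model name used later refers to four such layers.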

## 🔷 3️⃣ Install & Import Libraries


In [None]:
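# Install first if needed: pip install tensorflow tensorflow-hub tensorflow-text
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text as text  # not called directly; importing it registers the ops the BERT preprocessor needs
import numpy as np
import matplotlib.pyplot as plt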

print("TensorFlow:", tf.__version__)

## 📥 4️⃣ Load BERT Preprocessor & Encoder

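Here we use `small_bert`, a compact BERT variant (4 layers, hidden size 512, 8 attention heads) that runs faster than BERT-Base, usually at some cost in accuracy.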

In [None]:
bert_preprocess = hub.load("https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")
bert_encoder = hub.load("https://tfhub.dev/tensorflow/small_bert/bert_en_uncased_L-4_H-512_A-8/2")

print("✅ BERT Preprocessor & Encoder loaded")

## 📝 5️⃣ Sample Text & Preprocess


In [None]:
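sample_text = ["TensorFlow is amazing!", "I love learning about transformers."]

# The preprocessor tokenizes each sentence and pads it to a fixed length
text_inputs = tf.constant(sample_text)
encoder_inputs = bert_preprocess(text_inputs)

# Expect three keys: input_word_ids, input_mask and input_type_ids
print("Keys:", list(encoder_inputs.keys()))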
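
To show what those three keys mean, here is a toy pure-Python imitation of the preprocessing step. It is only a sketch: the vocabulary and its IDs are hypothetical, and it splits on whitespace, whereas the real preprocessor uses BERT's WordPiece vocabulary and a longer default sequence length.

```python
# Hypothetical toy vocabulary; the real preprocessor uses BERT's WordPiece vocab
TOY_VOCAB = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
             "tensorflow": 4, "is": 5, "amazing": 6, "!": 7}

def toy_preprocess(sentence, seq_length=8):
    """Mimic the three outputs of a BERT preprocessor for one sentence."""
    # Lowercase and split off "!" so it becomes its own token (crude stand-in for WordPiece)
    words = sentence.lower().replace("!", " !").split()
    # Wrap in the special start/end tokens, truncating if the sentence is too long
    tokens = ["[CLS]"] + words[: seq_length - 2] + ["[SEP]"]
    ids = [TOY_VOCAB.get(tok, TOY_VOCAB["[UNK]"]) for tok in tokens]
    padding = seq_length - len(ids)
    return {
        "input_word_ids": ids + [TOY_VOCAB["[PAD]"]] * padding,  # token IDs, padded
        "input_mask": [1] * len(ids) + [0] * padding,            # 1 = real token, 0 = padding
        "input_type_ids": [0] * seq_length,                      # all 0 for a single sentence
    }

features = toy_preprocess("TensorFlow is amazing!")
for name, value in features.items():
    print(name, value)
# input_word_ids [2, 4, 5, 6, 7, 3, 0, 0]
# input_mask [1, 1, 1, 1, 1, 1, 0, 0]
# input_type_ids [0, 0, 0, 0, 0, 0, 0, 0]
```

The mask is what lets the encoder ignore padding positions when it computes attention.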

## 🚀 6️⃣ Encode Text & See Outputs

In [None]:
outputs = bert_encoder(encoder_inputs)

# pooled_output: one vector per sentence, shape (batch_size, hidden_size) = (2, 512) here
pooled_output = outputs['pooled_output']

print("Pooled output shape:", pooled_output.shape)
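
For intuition about where `pooled_output` comes from: in BERT it is the hidden state of the first token (`[CLS]`) passed through a dense layer with a tanh activation. The sketch below reproduces that computation in NumPy, assuming random stand-in values for the encoder's per-token outputs and for the pooler weights; the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, seq_length, hidden_size = 2, 128, 512

# Random stand-ins for the encoder's per-token outputs and the pooler's weights
sequence_output = rng.normal(size=(batch_size, seq_length, hidden_size))
pooler_kernel = rng.normal(scale=0.02, size=(hidden_size, hidden_size))
pooler_bias = np.zeros(hidden_size)

# Take the hidden state of the first token ([CLS]) of every sentence
cls_vector = sequence_output[:, 0, :]                   # (batch_size, hidden_size)

# Dense layer + tanh gives one fixed-size vector per sentence
pooled = np.tanh(cls_vector @ pooler_kernel + pooler_bias)
print(pooled.shape)  # (2, 512)
```

However long the sentence is, the classifier always receives a vector of the same size.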

## 🔗 7️⃣ Build a Classification Model

We pass pooled_output through a Dropout layer and a Dense layer to classify the text.

In [None]:
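from tensorflow.keras import layers, Model

# Raw strings go in; preprocessing and encoding happen inside the model
text_input = layers.Input(shape=(), dtype=tf.string, name='text')
preprocessed_text = hub.KerasLayer(bert_preprocess)(text_input)
outputs = hub.KerasLayer(bert_encoder, trainable=True)(preprocessed_text)  # trainable=True fine-tunes BERT

pooled_output = outputs['pooled_output']
dropout = layers.Dropout(0.1)(pooled_output)
classifier = layers.Dense(1, activation='sigmoid')(dropout)  # probability of the positive class

model = Model(inputs=text_input, outputs=classifier)

# Fine-tuning BERT usually needs a much smaller learning rate than Adam's default of 1e-3;
# 3e-5 is a common starting point, not a tuned value
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=3e-5),
    loss='binary_crossentropy',
    metrics=['accuracy'],
)
model.summary()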
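
To see what the `Dense(1, activation='sigmoid')` head and the `binary_crossentropy` loss compute, here is a minimal NumPy sketch. The pooled vectors and the head's weights are random stand-ins, so the loss value itself is meaningless; only the arithmetic is the point.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into (0, 1) so it can be read as a probability."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; clipping avoids log(0)."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(losses))

rng = np.random.default_rng(0)
pooled = rng.normal(size=(4, 512))              # stand-in for BERT pooled outputs
kernel = rng.normal(scale=0.02, size=(512, 1))  # stand-in for the Dense layer's weights
bias = np.zeros(1)

probs = sigmoid(pooled @ kernel + bias)         # shape (4, 1), one probability per text
labels = np.array([[1.0], [0.0], [1.0], [0.0]])
loss = binary_cross_entropy(labels, probs)

print(probs.shape)  # (4, 1)
print(loss)
```

Training adjusts the head's weights (and, with `trainable=True`, BERT's weights too) to push this loss down.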

## 🧪 8️⃣ Train on Dummy Data (Demo)


In [None]:
# Dummy dataset: four examples only confirm that the pipeline runs, not that the model learns anything useful
texts = np.array(["Good movie", "Bad movie", "Great film", "Terrible film"])
labels = np.array([1, 0, 1, 0])

model.fit(texts, labels, epochs=2)
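
After training, `model.predict(...)` on a model like this returns sigmoid probabilities of shape `(n, 1)`. A small sketch of turning those into 0/1 labels follows; the helper `to_labels` is hypothetical, the 0.5 threshold is an assumption, and the example probabilities are invented purely to illustrate the shape.

```python
import numpy as np

def to_labels(probabilities, threshold=0.5):
    """Map sigmoid probabilities of shape (n, 1) to 0/1 labels of shape (n,)."""
    probabilities = np.asarray(probabilities).reshape(-1)
    return (probabilities >= threshold).astype(int)

# Invented values standing in for the output of model.predict(...)
example_probs = np.array([[0.91], [0.12], [0.50], [0.49]])
print(to_labels(example_probs))  # [1 0 1 0]
```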

## 🔗 Summary Table

| Step | Description |
|------|-------------|
| Load BERT | Preprocessor & Encoder from TF Hub |
| Preprocess | Convert text into tokens |
| Encode | Compute the BERT outputs |
| Classify | Predict through the Dense layer |

---
🎉 You have built a simple text classifier using attention and Transformers!