# Deep Learning (DL)
**Neural network architectures for sequence data (such as text) can capture context and meaning better than classical ML models.**

### 🧠 **Breakdown & Hints**

1. **R – RNN (Recurrent Neural Network)**
   🔁 *“Remembers previous words.”*
   → Think **R** for **Remember**.

2. **L – LSTM (Long Short-Term Memory)**
   🧩 *“Fixes RNN’s forgetting issue.”*
   → **L** for **Long memory** — remembers things longer.

3. **G – GRU (Gated Recurrent Unit)**
   ⚡ *“Simpler and faster LSTM.”*
   → **G** for **Gate** — only opens when needed (efficient).

4. **S – Seq2Seq (Sequence-to-Sequence)**
   🔄 *“For translation and summarization.”*
   → **S** for **Send and Summarize** — it converts one sequence to another.

5. **A – Attention Mechanism**
   🎯 *“Focuses on important words.”*
   → **A** for **Attention** — focuses like a spotlight.
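
The "remember" idea behind an RNN can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the Keras implementation: the sizes, the random weights, and the helper name `rnn_forward` are assumptions made for the example.

```python
import numpy as np

def rnn_forward(inputs, W_x, W_h, b):
    """Run a vanilla RNN over a sequence and return the final hidden state.

    inputs has shape (timesteps, input_dim).
    """
    h = np.zeros(W_h.shape[0])          # hidden state starts at zero
    for x_t in inputs:                  # one word vector per time step
        # The new state mixes the current input with the previous state.
        h = np.tanh(W_x @ x_t + W_h @ h + b)
    return h

rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 3            # hypothetical sizes
W_x = rng.normal(size=(hidden_dim, input_dim))
W_h = rng.normal(size=(hidden_dim, hidden_dim))
b = np.zeros(hidden_dim)

sequence = rng.normal(size=(5, input_dim))            # five "words"
h_forward = rnn_forward(sequence, W_x, W_h, b)
h_reversed = rnn_forward(sequence[::-1], W_x, W_h, b)  # same words, reversed order

print(h_forward.shape)                                 # (3,)
print(np.allclose(h_forward, h_reversed))              # compares the two final states
```

Because every step feeds the previous state back in, feeding the same words in a different order generally produces a different final state. LSTM and GRU keep this loop and add gates that control what is kept or overwritten.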

| Model                              | Purpose                                                                      |
| ---------------------------------- | ---------------------------------------------------------------------------- |
| **RNN (Recurrent Neural Network)** | Handles sequences; remembers previous words.                                 |
| **LSTM (Long Short-Term Memory)**  | Mitigates RNN’s “vanishing gradient” problem; retains long-term dependencies. |
| **GRU (Gated Recurrent Unit)**     | Simplified version of LSTM; faster to train.                                 |
| **Seq2Seq (Encoder-Decoder)**      | For tasks like translation, summarization.                                   |
| **Attention Mechanism**            | Lets model focus on important words in a sequence.                           |
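
The "spotlight" behaviour of attention can be sketched as a softmax over similarity scores. This is a minimal scaled dot-product sketch with made-up vectors; the function names and the toy `keys`, `values`, and `query` arrays are assumptions for illustration, not part of the project code.

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    exp = np.exp(scores - scores.max())   # subtract the max for numerical stability
    return exp / exp.sum()

def dot_product_attention(query, keys, values):
    """Weight each value by how well its key matches the query."""
    scores = keys @ query / np.sqrt(query.shape[0])
    weights = softmax(scores)
    context = weights @ values            # weighted average of the values
    return context, weights

# Three hypothetical "word" vectors acting as both keys and values.
keys = values = np.array([[1.0, 0.0],
                          [0.0, 1.0],
                          [1.0, 1.0]])
query = np.array([2.0, 2.0])

context, weights = dot_product_attention(query, keys, values)
print(weights)   # the third word gets the largest weight
print(context)
```

The third vector matches the query best, so it receives most of the weight and dominates the resulting context vector.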


We will reuse the previous project, *topic_classification_project*, with the same dataset to see the difference, this time using an LSTM-based text classification model.

- 🧠 Project Name: “News Topic Classification using LSTM”
- 📚 Dataset: 20 Newsgroups, 2-category subset ('comp.graphics', 'sci.med').
- 🧰 Tech Stack: Keras (TensorFlow), scikit-learn

# Step 1: Load the dataset

In [1]:
import numpy as np 
from sklearn.datasets import fetch_20newsgroups

categories = ['comp.graphics', 'sci.med']  # two newsgroups: computer graphics and medicine
data = fetch_20newsgroups(subset='all', categories=categories)
texts = data.data
labels = data.target


# Step 2: Label Encode

In [2]:
from sklearn.preprocessing import LabelEncoder

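# data.target already holds integer class ids, so this step leaves them unchanged;
# it is kept so the pipeline also works with string labels.
le = LabelEncoder()
labels_encoded = le.fit_transform(labels)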
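
What label encoding does can be sketched with NumPy alone: sort the distinct labels and replace each label with its position in that sorted list. The helper name `encode_labels` and the toy inputs are assumptions for illustration; this mirrors the idea, not scikit-learn's internals.

```python
import numpy as np

def encode_labels(labels):
    """Return (sorted distinct classes, integer code for each label)."""
    classes, encoded = np.unique(labels, return_inverse=True)
    return classes, encoded

# Integer labels that already run 0..K-1 come back unchanged.
classes_int, encoded_int = encode_labels(np.array([1, 0, 0, 1, 1]))
print(encoded_int.tolist())    # [1, 0, 0, 1, 1]

# String labels are mapped to their index in the sorted class list.
classes_str, encoded_str = encode_labels(
    np.array(["sci.med", "comp.graphics", "sci.med"])
)
print(classes_str.tolist())    # ['comp.graphics', 'sci.med']
print(encoded_str.tolist())    # [1, 0, 1]
```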


# Step 3: Tokenization and Embedding
In traditional machine learning (ML), models like Naïve Bayes, SVM, or Logistic Regression rely on manual feature extraction such as Bag of Words (BoW) or TF-IDF.
→ These features are sparse, high-dimensional, and do not capture semantic meaning between words.

🧿 Deep learning approach: models such as LSTM, GRU, and Transformers can learn features automatically from sequences of tokens.
→ No need for manual TF-IDF or BoW.

# Notes:

## 🤖👩🏻‍💻 Traditional ML (Naive Bayes, SVM, etc.):

- Tokenizer → splits text into words.

- TF-IDF / CountVectorizer → converts words into sparse numeric vectors (frequency/importance).

- Word order and context are ignored (unless n-gram features are added).

- Model only learns patterns based on word presence/importance.
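
A tiny pure-Python sketch shows why word order is lost: two sentences with opposite meanings produce the same bag of words. The helper `bag_of_words` and the example sentences are made up for illustration.

```python
from collections import Counter

def bag_of_words(text):
    """Count how often each lowercase word occurs, ignoring position."""
    return Counter(text.lower().split())

bow_a = bag_of_words("the dog bit the man")
bow_b = bag_of_words("the man bit the dog")

print(bow_a)            # word counts only, no positions
print(bow_a == bow_b)   # True
```

A classifier that only sees these counts cannot tell the two sentences apart.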

## 🧠🪐 Deep Learning (LSTM, GRU):

 - Keras Tokenizer → converts words into integer indices.

 - Embedding layer → turns indices into dense word vectors.

 - LSTM reads sequences → learns word order, context, and semantic meaning.

 - Model can capture long-term dependencies and relationships between words.
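
The first two steps of this pipeline can be sketched without Keras: map words to integer indices, then look up one dense vector per index. The vocabulary, the 4-dimensional random embedding matrix, and the helper `to_indices` are assumptions for the example; in the real model the embedding values are learned during training.

```python
import numpy as np

# Hypothetical vocabulary: 0 is reserved for padding, 1 for unknown words.
vocab = {"<PAD>": 0, "<OOV>": 1, "the": 2, "dog": 3, "bit": 4, "man": 5}

def to_indices(text):
    """Replace each word with its index, using <OOV> for unknown words."""
    return [vocab.get(word, vocab["<OOV>"]) for word in text.lower().split()]

rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(len(vocab), 4))   # one row per word index

ids_a = to_indices("the dog bit the man")
ids_b = to_indices("the man bit the dog")
vectors_a = embedding_matrix[ids_a]                   # shape (5, 4): one vector per word

print(ids_a)             # [2, 3, 4, 2, 5]
print(ids_b)             # [2, 5, 4, 2, 3]
print(vectors_a.shape)   # (5, 4)
```

Unlike word counts, the index sequences differ when the word order differs, and that ordered sequence of vectors is what the LSTM reads.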

In [3]:

from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

max_words = 5000       # Vocabulary cap: rarer words are mapped to <OOV>
max_len = 200          # Every document is padded or truncated to this many tokens

tokenizer = Tokenizer(num_words=max_words, oov_token="<OOV>")
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
padded_sequences = pad_sequences(sequences, maxlen=max_len, padding='post')
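
What `pad_sequences` does with these settings can be mimicked in plain Python. One detail is easy to miss: `padding='post'` adds zeros at the end, but the default `truncating='pre'` drops tokens from the start of sequences that are too long. The helper `pad_post` below is a hypothetical stand-in written for illustration, not the Keras function.

```python
def pad_post(sequence, maxlen, value=0):
    """Mimic pad_sequences(padding='post') with the default truncating='pre'."""
    truncated = sequence[-maxlen:]                        # keep the last maxlen tokens
    return truncated + [value] * (maxlen - len(truncated))

short_seq = pad_post([7, 8, 9], maxlen=5)
long_seq = pad_post([1, 2, 3, 4, 5, 6, 7], maxlen=5)

print(short_seq)   # [7, 8, 9, 0, 0]
print(long_seq)    # [3, 4, 5, 6, 7]
```

If the opening of a post carries more signal than its ending, passing `truncating='post'` to `pad_sequences` keeps the first `max_len` tokens instead.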


# Train and test split

In [4]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    padded_sequences, labels_encoded, test_size=0.2, random_state=42
)
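
The split itself amounts to shuffling the row indices with a fixed seed and cutting off a slice for testing. The sketch below illustrates that idea with NumPy; `shuffle_split` is a hypothetical helper, and it will not reproduce the exact rows that `train_test_split` selects.

```python
import math
import numpy as np

def shuffle_split(n_samples, test_size=0.2, seed=42):
    """Return (train_indices, test_indices) from a seeded shuffle."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = math.ceil(test_size * n_samples)   # round the test share up
    return order[n_test:], order[:n_test]

train_idx, test_idx = shuffle_split(10)
print(len(train_idx), len(test_idx))            # 8 2
```

Fixing the seed makes the split repeatable, so accuracy numbers from different runs are comparable.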


# Model Creation - LSTM model

In [13]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout

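model = Sequential([
    # Map each word index to a trainable 64-dimensional vector.
    # `input_length` is deprecated in recent Keras versions; model.build() below sets the shape.
    Embedding(input_dim=max_words, output_dim=64),
    LSTM(64, return_sequences=False),              # keep only the final hidden state
    Dropout(0.5),
    Dense(32, activation='relu'),
    Dense(len(categories), activation='softmax')   # one probability per class
])

# Build with a fixed sequence length so summary() can report shapes and parameter counts.
model.build(input_shape=(None, max_len))
model.summary()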
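
The parameter counts that `model.summary()` reports for this architecture can be checked by hand. The sketch below uses the standard formulas for Embedding, LSTM (four gates, each with input weights, recurrent weights, and a bias), and Dense layers; the helper names are made up for the example, and the class count of 2 assumes the two categories loaded in Step 1.

```python
def embedding_params(vocab_size, embed_dim):
    return vocab_size * embed_dim

def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights, and a bias vector.
    return 4 * ((input_dim + units) * units + units)

def dense_params(input_dim, units):
    return input_dim * units + units

max_words, embed_dim, lstm_units, num_classes = 5000, 64, 64, 2

counts = {
    "embedding": embedding_params(max_words, embed_dim),   # 320000
    "lstm": lstm_params(embed_dim, lstm_units),            # 33024
    "dense_32": dense_params(lstm_units, 32),              # 2080
    "dense_out": dense_params(32, num_classes),            # 66
}
total = sum(counts.values())
print(counts)
print(total)   # 355170
```

Dropout has no trainable parameters, so it does not appear in the total. Most of the weights sit in the embedding table, which is why `max_words` has the largest effect on model size.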




In [16]:
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Smoke test: one epoch on 10 samples to confirm the pipeline runs end to end.
model.fit(X_train[:10], y_train[:10], epochs=1, batch_size=2)
model.summary()


5/5 ━━━━━━━━━━━━━━━━━━━━ 1s 19ms/step - accuracy: 0.6000 - loss: 0.6959


# Train the model

In [17]:
history = model.fit(
    X_train, y_train,
    epochs=5,
    batch_size=64,
    validation_split=0.2
)


Epoch 1/5
20/20 ━━━━━━━━━━━━━━━━━━━━ 2s 71ms/step - accuracy: 0.5621 - loss: 0.6908 - val_accuracy: 0.5764 - val_loss: 0.6864
Epoch 2/5
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 64ms/step - accuracy: 0.5987 - loss: 0.6785 - val_accuracy: 0.5796 - val_loss: 0.6758
Epoch 3/5
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 63ms/step - accuracy: 0.6226 - loss: 0.6422 - val_accuracy: 0.6401 - val_loss: 0.6115
Epoch 4/5
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 63ms/step - accuracy: 0.7492 - loss: 0.5115 - val_accuracy: 0.8503 - val_loss: 0.4301
Epoch 5/5
20/20 ━━━━━━━━━━━━━━━━━━━━ 2s 99ms/step - accuracy: 0.8424 - loss: 0.4746 - val_accuracy: 0.8153 - val_loss: 0.4224
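
A quick calculation helps read these loss values. With cross-entropy loss, a model that guesses uniformly over K classes scores ln(K), so a first-epoch loss near 0.69 on two classes means the model starts at chance level. The helper names below are made up for the example.

```python
import math

def chance_level_loss(num_classes):
    """Cross-entropy of a uniform guess over num_classes classes."""
    return math.log(num_classes)

def typical_true_class_probability(loss):
    """Geometric-mean probability assigned to the correct class at a given loss."""
    return math.exp(-loss)

print(round(chance_level_loss(2), 4))       # 0.6931
p_final = typical_true_class_probability(0.4224)
print(p_final)                              # roughly 0.66 for the last val_loss above
```

Seen this way, the drop from about 0.69 to about 0.42 shows the model moving away from a coin flip, with room left to improve.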


# Evaluate & Predict

In [None]:
loss, accuracy = model.evaluate(X_test, y_test)
print("Test Accuracy:", accuracy)

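# Predict category for a sample text.
# Note: neither class covers cars, so this sentence is out of domain; the model must still pick one of the two classes.
sample_text = ["The car engine needs regular maintenance."]
# sample_text = ['Doctors are discovering new treatments for cancer']  # in-domain alternative (sci.med)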
seq = tokenizer.texts_to_sequences(sample_text)
pad_seq = pad_sequences(seq, maxlen=max_len, padding='post')
pred = model.predict(pad_seq)

predicted_category = le.inverse_transform([np.argmax(pred)])

print("Predicted Category:", data.target_names[predicted_category[0]])


13/13 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - accuracy: 0.8168 - loss: 0.4244
Test Accuracy: 0.8167939186096191
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 121ms/step
Predicted Category: comp.graphics
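
A softmax output always names one of the trained classes, even for text that belongs to neither. One way to make that visible is to report the winning probability alongside the label and flag low-confidence predictions. This is a sketch under assumptions: `decode_prediction`, the 0.7 threshold, and the probability vectors are hypothetical, and `target_names` stands in for `data.target_names`.

```python
import numpy as np

target_names = ["comp.graphics", "sci.med"]   # stand-in for data.target_names

def decode_prediction(probabilities, names, threshold=0.7):
    """Return (label, confidence); label is 'uncertain' below the threshold."""
    index = int(np.argmax(probabilities))
    confidence = float(probabilities[index])
    label = names[index] if confidence >= threshold else "uncertain"
    return label, confidence

# Hypothetical softmax outputs, not taken from the trained model.
print(decode_prediction(np.array([0.55, 0.45]), target_names))   # ('uncertain', 0.55)
print(decode_prediction(np.array([0.10, 0.90]), target_names))   # ('sci.med', 0.9)
```

The threshold is a judgment call; it trades fewer wrong labels for more "uncertain" answers.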
