# News Category Classification using Deep Learning (Reuters Dataset)


## 🎯 Objective

In this lecture, we will build a Deep Learning model that classifies news articles into different categories using the Reuters dataset (available in Keras).

This project helps students learn multi-class text classification, an important step after binary classification like sentiment analysis.

## 🧩 Theoretical Background
🔹 What is Text Classification?

Text classification automatically assigns categories or labels to text.
Examples:

* Classify news articles as politics, sports, or business
* Detect whether an email is spam
* Categorize customer support tickets

🔹 Dataset: Reuters Newswire Topics

Keras provides this dataset via:

```python
from tensorflow.keras.datasets import reuters
```

It contains 11,228 articles and 46 categories, already tokenized (each word is represented by an integer).

🔹 Why Deep Learning?

Deep learning models learn patterns directly from sequences of text, so no manual feature extraction is required.

```python
import tensorflow as tf
from tensorflow.keras.datasets import reuters
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, LSTM, Flatten
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical
```

🧾 Explanation:

* reuters: Loads the Reuters dataset
* Embedding: Converts word indices to dense vectors
* LSTM: Learns sequential dependencies
* Dense: Fully connected layer for classification
* pad_sequences: Makes all text inputs of equal length
* to_categorical: Converts labels to one-hot format




```python
(X_train, y_train), (X_test, y_test) = reuters.load_data(num_words=10000)

print("Training samples:", len(X_train))
print("Testing samples:", len(X_test))
print("Example article:", X_train[0][:10])
print("Label:", y_train[0])
```


```
Training samples: 8982
Testing samples: 2246
Example article: [1, 2, 2, 8, 43, 10, 447, 5, 25, 207]
Label: 3
```


🧾 Explanation:
We keep only the 10,000 most frequent words for efficiency; rarer words are replaced by a shared out-of-vocabulary index.
Each article is stored as a list of integers, one per word.
The labels are integers from 0 to 45, one for each of the 46 categories.
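To make the integer encoding concrete, here is a minimal sketch of how an article can be mapped back to words. It assumes the Keras defaults for this dataset (index 0 for padding, 1 for start-of-sequence, 2 for out-of-vocabulary, and real word indices shifted by 3). `toy_word_index` is a made-up stand-in for the dictionary returned by `reuters.get_word_index()`, so the block runs without downloading anything.

```python
# Hypothetical miniature vocabulary: word -> frequency rank (1 = most frequent).
# The real mapping would come from reuters.get_word_index().
toy_word_index = {"the": 1, "of": 2, "to": 3, "in": 4, "said": 5}

# Assumption: default load_data() settings, where indices 0, 1 and 2 are reserved
# and every real word index is shifted up by 3.
RESERVED = {0: "<pad>", 1: "<start>", 2: "<oov>"}
INDEX_OFFSET = 3

def decode_article(encoded, word_index):
    """Map a list of integer tokens back to a readable string."""
    reverse_index = {rank: word for word, rank in word_index.items()}
    words = []
    for token in encoded:
        if token in RESERVED:
            words.append(RESERVED[token])
        else:
            # Undo the offset; ranks missing from the toy vocabulary become "?".
            words.append(reverse_index.get(token - INDEX_OFFSET, "?"))
    return " ".join(words)

decoded = decode_article([1, 2, 4, 8, 5], toy_word_index)
print(decoded)
```

With the real dataset, the same function could be called as `decode_article(X_train[0], reuters.get_word_index())` before padding is applied.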

```python
max_len = 200
X_train = pad_sequences(X_train, maxlen=max_len)
X_test = pad_sequences(X_test, maxlen=max_len)

print("Shape of training data:", X_train.shape)
```

```
Shape of training data: (8982, 200)
```


🧾 Explanation:
Every article is padded or truncated to 200 tokens so that the LSTM receives inputs of a consistent shape.
By default, pad_sequences adds zeros at the start of short sequences and drops tokens from the start of long ones.
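The padding step can be sketched in plain Python. This is an illustration of the default `pad_sequences` behavior (padding and truncating at the front) for a single sequence, not the library's actual implementation; the helper name `pad_or_truncate` is made up for this example.

```python
def pad_or_truncate(sequence, maxlen, value=0):
    """Mimic the default pad_sequences behavior for one sequence:
    pad at the front and, if too long, drop tokens from the front."""
    if len(sequence) >= maxlen:
        return sequence[-maxlen:]  # keep only the last maxlen tokens
    return [value] * (maxlen - len(sequence)) + sequence

short_seq = pad_or_truncate([5, 6, 7], maxlen=5)
long_seq = pad_or_truncate([1, 2, 3, 4, 5, 6, 7], maxlen=5)
print(short_seq)  # [0, 0, 5, 6, 7]
print(long_seq)   # [3, 4, 5, 6, 7]
```

Padding at the front keeps the real words closest to the end of the sequence, which is the part the LSTM sees last.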

```python
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
print("Shape of labels:", y_train.shape)
```

```
Shape of labels: (8982, 46)
```


🧾 Explanation:
Because there are 46 output categories, we use one-hot encoding: each label becomes a vector of size 46 in which exactly one element is 1.
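As a small illustration of what one-hot encoding does to a single label, the sketch below builds the vector by hand. The helper `one_hot` is a hypothetical stand-in written for this example; in the notebook, `to_categorical` does the same job for the whole label array at once.

```python
def one_hot(label, num_classes):
    """Return a list of zeros with a single 1.0 at position `label`."""
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector

encoded_label = one_hot(3, num_classes=46)
print(len(encoded_label))        # 46
print(encoded_label.index(1.0))  # 3
print(sum(encoded_label))        # 1.0
```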

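```python
model = Sequential([
    tf.keras.Input(shape=(max_len,)),              # each article is a sequence of 200 word indices
    Embedding(10000, 64),                          # word index -> 64-dimensional vector
    LSTM(64, dropout=0.3, recurrent_dropout=0.3),  # reads the sequence, with dropout for regularization
    Dense(46, activation='softmax')                # output layer comes last: one probability per class
])

model.summary()
```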



🧾 Explanation:

* Embedding(10000, 64): Converts word indices to dense 64-dimensional vectors.
* LSTM(64): Learns relationships between words in sequence.
* Dense(46, activation='softmax'): Produces probabilities for 46 classes.
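To get a feel for where the model's capacity sits, the parameter counts can be worked out by hand. This is a rough sketch that assumes the layer sizes listed above (10,000-word vocabulary, 64-dimensional embeddings, 64 LSTM units, 46 classes) and the standard LSTM formulation with four gates; different layer sizes will give different numbers.

```python
vocab_size, embed_dim = 10000, 64
lstm_units, num_classes = 64, 46

# One 64-dimensional vector per vocabulary entry.
embedding_params = vocab_size * embed_dim

# Four gates, each with input weights, recurrent weights and a bias vector.
lstm_params = 4 * (embed_dim * lstm_units + lstm_units * lstm_units + lstm_units)

# Weight matrix from the LSTM output to the classes, plus one bias per class.
dense_params = lstm_units * num_classes + num_classes

total_params = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total_params)
# 640000 33024 2990 676014
```

Under these assumptions, almost all of the parameters belong to the embedding table rather than to the LSTM or the output layer.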

⚙️ Why Softmax Activation?

Softmax converts outputs into probabilities that sum to 1, making it ideal for multi-class classification.
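A minimal plain-Python sketch of the softmax computation follows. The input scores are arbitrary illustrative numbers, and subtracting the maximum before exponentiating is a common numerical-stability trick rather than something specific to this notebook.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])  # arbitrary example scores
print([round(p, 3) for p in probs])
print(sum(probs))
```

The largest score receives the largest probability, but every class keeps a non-zero share.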

```python
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```


🧾 Explanation:

* Optimizer: Adam, which adapts the learning rate for each parameter during training and often converges quickly.
* Loss: categorical_crossentropy, suited to problems with more than 2 classes and one-hot labels.
* Metric: accuracy, the fraction of articles classified correctly.

⚙️ Why Categorical Crossentropy Loss?

It measures how well the predicted probabilities match the true labels: the lower the probability assigned to the correct class, the larger the loss.
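The loss for a single example can be sketched as follows. The predicted probabilities are invented for illustration, and the small `eps` clamp is an assumption added here to avoid taking the log of zero.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = [0.0, 1.0, 0.0]  # the true class is index 1

confident = categorical_crossentropy(y_true, [0.05, 0.90, 0.05])
unsure = categorical_crossentropy(y_true, [0.40, 0.30, 0.30])

# Reference point: guessing uniformly over 46 classes.
uniform_46 = categorical_crossentropy([1.0] + [0.0] * 45, [1 / 46] * 46)

print(round(confident, 4), round(unsure, 4), round(uniform_46, 4))
```

A uniform guess over 46 classes scores about 3.83, which gives a reference point for the loss values printed during training.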

```python
history = model.fit(X_train, y_train,
                    epochs=5,
                    batch_size=128,
                    validation_data=(X_test, y_test))
```


```
Epoch 1/5
71/71 ━━━━━━━━━━━━━━━━━━━━ 34s 399ms/step - accuracy: 0.3318 - loss: 3.1385 - val_accuracy: 0.3620 - val_loss: 2.3635
Epoch 2/5
```


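🧾 Explanation:
We train the model for 5 epochs, meaning it goes through the entire training set 5 times.
Validation data lets us track overfitting and generalization during training. Here the test set doubles as validation data, which is fine for a demonstration but means the final test accuracy is not a fully independent estimate.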
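The relationship between dataset size, batch size, and the step counter shown in the training log can be checked with a little arithmetic. The numbers below are the ones used in this notebook (8,982 training articles, batch size 128, 5 epochs).

```python
import math

num_train_samples = 8982
batch_size = 128
epochs = 5

# The last batch of each epoch is smaller when the sizes do not divide evenly.
steps_per_epoch = math.ceil(num_train_samples / batch_size)
last_batch_size = num_train_samples - (steps_per_epoch - 1) * batch_size
total_weight_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch_size, total_weight_updates)  # 71 22 355
```

The 71 steps per epoch match the `71/71` counter in the training output.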

Class Task #1 (With Solution)

👉 Task: Train the model for 8 epochs instead of 5 and compare accuracy.

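```python
# Note: calling fit() on an already trained model continues from its current
# weights. For a fair comparison, rebuild and recompile the model before this cell.
history_8 = model.fit(X_train, y_train,
                      epochs=8,
                      batch_size=128,
                      validation_data=(X_test, y_test))
```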


💡 Observation:
As epochs increase, accuracy may improve slightly, but if validation accuracy stops improving while training accuracy keeps rising, the model is starting to overfit.
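One way to act on this observation is to stop once validation accuracy has not improved for a few epochs. The sketch below shows that logic in plain Python on an invented list of validation accuracies (these numbers are not results from this model). The function name and the `patience` value are assumptions for illustration; Keras offers the same idea through its `EarlyStopping` callback.

```python
def find_stopping_point(val_accuracy, patience=2):
    """Return (best_epoch, stop_epoch), both 1-based.

    Training stops once `patience` consecutive epochs fail to beat
    the best validation accuracy seen so far.
    """
    best_value = float("-inf")
    best_epoch = 0
    epochs_without_gain = 0
    for epoch, value in enumerate(val_accuracy, start=1):
        if value > best_value:
            best_value, best_epoch, epochs_without_gain = value, epoch, 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_accuracy)

# Made-up validation accuracies for 8 epochs, for illustration only.
made_up_val_accuracy = [0.36, 0.48, 0.55, 0.58, 0.57, 0.58, 0.56, 0.55]
result = find_stopping_point(made_up_val_accuracy, patience=2)
print(result)  # (4, 6)
```

With a real run, the list would come from `history.history['val_accuracy']`.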

```python
loss, accuracy = model.evaluate(X_test, y_test)
print(f"Test Accuracy: {accuracy:.2f}")
```


🧾 Explanation:
The test accuracy shows how well the model performs on unseen data.
Well-tuned models on this dataset commonly reach roughly 75-80% accuracy; a small LSTM trained for only a few epochs may land below that.

```python
predictions = model.predict(X_test[:5])
print("Predicted categories:", predictions.argmax(axis=1))
```


🧾 Explanation:
model.predict() gives probabilities for each class.
argmax(axis=1) returns the index (category) with the highest probability.
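The argmax step, and how it can be used to compare predictions against one-hot labels, can be sketched in plain Python. The probability rows and labels below are made up and use only 3 classes to stay readable; `argmax_rows` is a hypothetical helper that mirrors `argmax(axis=1)`.

```python
def argmax_rows(rows):
    """Index of the largest value in each row (first index wins on ties)."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]

# Made-up predicted probabilities for three articles and three classes.
made_up_probs = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
]
# Made-up one-hot ground truth for the same three articles.
made_up_labels = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
]

predicted = argmax_rows(made_up_probs)     # [1, 0, 2]
true_classes = argmax_rows(made_up_labels) # [1, 2, 2]
matches = sum(p == t for p, t in zip(predicted, true_classes))
print(predicted, true_classes, matches / len(predicted))
```

Because `y_test` is one-hot encoded in this notebook, the same argmax has to be applied to it before comparing it with the predicted categories.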

Class Tasks (Without Solutions)

1️⃣ Replace LSTM(64) with LSTM(128) and compare accuracy.

2️⃣ Add another hidden layer before the final output layer:

```
Dense(64, activation='relu')
```

3️⃣ Change optimizer from 'adam' to 'rmsprop' and observe how it affects training speed or accuracy.

🏠 Mini Project (Home Assignment)

🎯 Project: Topic Classification of News Headlines

Instructions:

Use the 20 Newsgroups dataset from scikit-learn:

```python
from sklearn.datasets import fetch_20newsgroups
```

* Tokenize and pad the text similar to this project.

* Build a deep learning model using Embedding + LSTM + Dense layers.

* Compare two models:

  * One with only Dense layers

  * One with LSTM layers
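Unlike the Reuters data, the 20 Newsgroups posts arrive as raw strings, so the tokenizing step has to be done explicitly. Below is a minimal sketch of the idea on three made-up sentences. The whitespace tokenizer, the reserved indices (0 for padding, 1 for unknown words), and the helper names are all assumptions chosen for this illustration; in practice a library tokenizer such as Keras's `TextVectorization` layer would normally be used.

```python
from collections import Counter

PAD, OOV = 0, 1  # reserved indices chosen for this sketch

def build_vocab(texts, num_words):
    """Map the num_words most frequent words to indices starting at 2."""
    counts = Counter(word for text in texts for word in text.lower().split())
    most_common = [word for word, _ in counts.most_common(num_words)]
    return {word: idx for idx, word in enumerate(most_common, start=2)}

def encode(text, vocab):
    """Replace each word by its index, or by OOV if it is not in the vocabulary."""
    return [vocab.get(word, OOV) for word in text.lower().split()]

# Made-up posts standing in for real newsgroup text.
toy_posts = ["the team won the game", "the market fell", "team news"]

vocab = build_vocab(toy_posts, num_words=2)
encoded_posts = [encode(post, vocab) for post in toy_posts]
print(vocab)          # {'the': 2, 'team': 3}
print(encoded_posts)  # [[2, 3, 1, 2, 1], [2, 1, 1], [3, 1]]
```

The encoded lists have different lengths, so they would still need padding before being fed to an Embedding layer.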

Submit:

Your notebook
