<a href="https://colab.research.google.com/github/Sumaira-Ashraf/deep-learning/blob/main/reuters_ch3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import tensorflow as tf
from tensorflow.keras.datasets import reuters
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

(x_train, y_train), (x_test, y_test) = reuters.load_data(num_words=10000, test_split=0.2)

# num_words=10000 keeps only the 10,000 most frequent words;
# rarer words are replaced by the out-of-vocabulary index.


Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/reuters.npz
2110848/2110848 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step


**Decode the News Articles**

In [None]:
print(len(x_train[0]), y_train[0])
print(len(x_train[1]), y_train[1])
print(len(x_train[2]), y_train[2])

87 3
56 4
139 3


In [None]:
max([max(sequence) for sequence in x_train])

9999

The following indices appear consecutively in x_train[0] and are decoded one at a time below: 67, 84, 22, 482, 26, 7, 48

In [None]:
x_train[0]

[1,
 2,
 2,
 8,
 43,
 10,
 447,
 5,
 25,
 207,
 270,
 5,
 3095,
 111,
 16,
 369,
 186,
 90,
 67,
 7,
 89,
 5,
 19,
 102,
 6,
 19,
 124,
 15,
 90,
 67,
 84,
 22,
 482,
 26,
 7,
 48,
 4,
 49,
 8,
 864,
 39,
 209,
 154,
 6,
 151,
 6,
 83,
 11,
 15,
 22,
 155,
 11,
 15,
 7,
 48,
 9,
 4579,
 1005,
 504,
 6,
 258,
 6,
 272,
 11,
 15,
 22,
 134,
 44,
 11,
 15,
 16,
 8,
 197,
 1245,
 90,
 67,
 52,
 29,
 209,
 30,
 32,
 132,
 6,
 109,
 15,
 17,
 12]

In [None]:
# Build the index-to-word lookup before using it.
word_index = reuters.get_word_index()
reverse_word_index = {index: word for word, index in word_index.items()}

# Encoded indices are shifted by 3 because 0, 1 and 2 are reserved
# for padding, start of sequence and out-of-vocabulary words.
for index in (67, 84, 22, 482, 26, 7, 48):
    print(reverse_word_index.get(index - 3, '?'))

share
up
from
70
cts
in
1986


In [None]:
" ".join([reverse_word_index.get(i-3,"?") for i in x_train[0]])

'? ? ? said as a result of its december acquisition of space co it expects earnings per share in 1987 of 1 15 to 1 30 dlrs per share up from 70 cts in 1986 the company said pretax net should rise to nine to 10 mln dlrs from six mln dlrs in 1986 and rental operation revenues to 19 to 22 mln dlrs from 12 5 mln dlrs it said cash flow per share this year should be 2 50 to three dlrs reuter 3'

In [None]:
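word_index = reuters.get_word_index()
# Invert the mapping: integer index -> word.
reverse_word_index = {index: word for word, index in word_index.items()}

def decode_newswire(sequence):
    # Subtract 3 because indices 0, 1 and 2 are reserved (padding, start, unknown).
    return ' '.join(reverse_word_index.get(i - 3, '?') for i in sequence)

print(decode_newswire(x_train[0]))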

? ? ? said as a result of its december acquisition of space co it expects earnings per share in 1987 of 1 15 to 1 30 dlrs per share up from 70 cts in 1986 the company said pretax net should rise to nine to 10 mln dlrs from six mln dlrs in 1986 and rental operation revenues to 19 to 22 mln dlrs from 12 5 mln dlrs it said cash flow per share this year should be 2 50 to three dlrs reuter 3
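
The offset of 3 is easier to see on a tiny example. The sketch below uses a made-up three-word vocabulary (`toy_word_index` and `decode_sequence` are hypothetical names, not part of Keras) and assumes the default Keras convention in which index 1 marks the start of a sequence and index 2 marks an out-of-vocabulary word.

```python
# Toy word index (hypothetical); the real one comes from reuters.get_word_index().
toy_word_index = {"the": 1, "of": 2, "said": 3}
toy_reverse_index = {index: word for word, index in toy_word_index.items()}


def decode_sequence(sequence, reverse_index, offset=3):
    """Map encoded integers back to words; reserved or unknown indices become '?'."""
    return " ".join(reverse_index.get(i - offset, "?") for i in sequence)


# 1 is the start marker and 2 the out-of-vocabulary marker, so both decode to '?'.
encoded = [1, 2, 4, 6, 5]
decoded = decode_sequence(encoded, toy_reverse_index)
print(decoded)  # ? ? the said of
```

This matches the decoded article above, which starts with three `?` because x_train[0] begins with the indices 1, 2, 2.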


**Prepare the Data for Training:**

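Tokenization:

Breaks text into tokens: The text is split into individual words, the basic units the model works with.

Assigns a numerical index to each token: Each unique token is mapped to a unique integer, which defines the vocabulary.

Converts text to numerical sequences: Each document is then represented as a sequence of these integer indices.

The Reuters dataset loaded above is already tokenized and indexed, so the Tokenizer in the next cell is not fitted on any text. It is used only for sequences_to_matrix, which with mode='binary' turns each sequence into a 10,000-dimensional vector holding 1 where a word index occurs in the article and 0 elsewhere.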

In [None]:
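# The tokenizer is not fitted; it only provides sequences_to_matrix.
tokenizer = Tokenizer(num_words=10000)
# mode='binary': one row per article, 1.0 in every column whose word index occurs.
X_train = tokenizer.sequences_to_matrix(x_train, mode='binary')
X_test = tokenizer.sequences_to_matrix(x_test, mode='binary')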
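
To make the binary encoding concrete, here is a minimal NumPy sketch of the same idea. The helper name `vectorize_sequences` is an assumption for illustration; it is roughly what `sequences_to_matrix(..., mode='binary')` produces, not the Keras implementation itself.

```python
import numpy as np


def vectorize_sequences(sequences, dimension=10000):
    """Turn lists of word indices into rows of a binary (multi-hot) matrix."""
    results = np.zeros((len(sequences), dimension), dtype="float32")
    for row, sequence in enumerate(sequences):
        # Set every column named in the sequence to 1; repeats have no extra effect.
        results[row, sequence] = 1.0
    return results


# Two toy "articles" over a 5-word vocabulary.
toy_sequences = [[1, 3, 3], [0, 4]]
matrix = vectorize_sequences(toy_sequences, dimension=5)
print(matrix)
```

Word order and word counts are discarded in this representation; only the presence of each word survives.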

**One-Hot Encoding:**

Creates a binary vector for each label: For every label, a vector of zeros with one position per category is created.

Sets a single element to 1: The element at the index of the label's category is set to 1.

Represents categorical data: This suits data whose categories are mutually exclusive, because exactly one position is active per example.

Handling categorical data: Here it is applied to the topic labels of the Reuters dataset, so every label becomes a 46-dimensional vector.




In [None]:

# y_train and y_test hold the raw integer topic labels (0 to 45).
# Passing num_classes keeps both arrays 46 columns wide even if a split lacks the highest label.
y_train_categorical = tf.keras.utils.to_categorical(y_train, num_classes=46)
y_test_categorical = tf.keras.utils.to_categorical(y_test, num_classes=46)

# y_train_categorical and y_test_categorical now hold the one-hot encoded labels
# that the categorical_crossentropy loss expects.
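
For intuition, one-hot encoding of integer labels can be sketched in a few lines of NumPy. The function `to_one_hot` is a hypothetical stand-in written for this illustration; the sample labels 3, 4, 3 are the first three training labels printed earlier.

```python
import numpy as np


def to_one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) matrix with a single 1.0 per row."""
    labels = np.asarray(labels, dtype="int64")
    one_hot = np.zeros((labels.shape[0], num_classes), dtype="float32")
    # Row i gets a 1.0 in the column given by labels[i].
    one_hot[np.arange(labels.shape[0]), labels] = 1.0
    return one_hot


sample_labels = [3, 4, 3]
encoded_labels = to_one_hot(sample_labels, num_classes=46)
print(encoded_labels.shape)           # (3, 46)
print(encoded_labels.argmax(axis=1))  # [3 4 3]
```

Taking the argmax of each row recovers the original integer label, which is also how a predicted class is read off the model's softmax output.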

**Understanding the Model Architecture**

Let's break down the Keras model defined in the code cell below, step by step:

**1. Sequential Model:**

This is a linear stack of layers. Each layer processes the output of the previous layer.

**2. Dense Layers:**

First Dense Layer:
Units: 64
Activation: ReLU (Rectified Linear Unit)
Input Shape: (10000,) - The input is a 1-dimensional array with 10,000 elements. It is the binary vector produced by sequences_to_matrix above, where each element marks the presence or absence of one vocabulary word in the article.

Second Dense Layer:
Units: 64
Activation: ReLU
Input Shape: (64,), the output of the previous layer.

Third Dense Layer:
Units: 46
Activation: Softmax - Outputs a probability distribution over 46 classes. For the Reuters dataset, these 46 classes are the news topics.
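
The layer sizes above determine the number of trainable parameters. As a small worked example, a dense layer with `inputs` incoming features and `units` outputs has `inputs * units` weights plus `units` biases (the helper `dense_params` is a hypothetical name used only here):

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


# (inputs, units) for the three dense layers described above.
layer_sizes = [(10000, 64), (64, 64), (64, 46)]
counts = [dense_params(inputs, units) for inputs, units in layer_sizes]
print(counts)       # [640064, 4160, 2990]
print(sum(counts))  # 647214
```

Almost all parameters sit in the first layer, which is a direct consequence of the 10,000-dimensional input. Dropout layers add no parameters.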

**3. Dropout Layers:**

Dropout(0.5): During training, this layer sets each of its input units to zero with probability 0.5 at every step; it is inactive at inference time. This helps prevent overfitting by reducing the model's reliance on any single feature.

Overall Functionality:

Input: The model takes a 10,000-dimensional binary vector that marks which vocabulary words occur in a news article.
Hidden Layers: The first two dense layers with ReLU activation extract features from the input data. The dropout layers regularize the model.
Output Layer: The final dense layer with softmax activation outputs a probability distribution over 46 classes. The class with the highest probability is predicted as the topic of the input article.
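
A minimal NumPy sketch of dropout may help here. It assumes the common "inverted dropout" formulation, in which the surviving units are scaled up by 1 / (1 - rate) during training so that the expected activation stays unchanged; the function `dropout` below is an illustration, not the Keras layer.

```python
import numpy as np


def dropout(activations, rate, rng, training=True):
    """Zero each unit with probability `rate` and rescale the survivors."""
    if not training or rate == 0.0:
        # At inference time the input passes through untouched.
        return activations
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob


rng = np.random.default_rng(0)
activations = np.ones((2, 8), dtype="float32")
dropped = dropout(activations, rate=0.5, rng=rng)
print(dropped)  # every entry is either 0.0 or 2.0
```

With a rate of 0.5, each kept unit is doubled, so the average over many random masks stays close to the original activation.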

**Why This Architecture for Reuters Dataset?**

Dense Layers: Every unit sees all 10,000 input features, so the layers can learn which combinations of words indicate a topic.
ReLU Activation: Introduces non-linearity, without which the stacked dense layers would collapse into a single linear map.
Dropout: Reduces overfitting by randomly dropping units during training.
Softmax Activation: Outputs a probability distribution over the classes, which fits single-label, multi-class classification.

This architecture suits text classification tasks like the Reuters dataset, where each news article is assigned to one of several topics.
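
The softmax output pairs with the categorical cross-entropy loss used when the model is compiled. The following NumPy sketch shows both under simple assumptions (the function names and the toy logits are illustrative, not the Keras internals):

```python
import math

import numpy as np


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1 along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def categorical_crossentropy(one_hot_targets, probabilities, eps=1e-7):
    """Negative log-probability assigned to the true class, per example."""
    clipped = np.clip(probabilities, eps, 1.0)
    return -(one_hot_targets * np.log(clipped)).sum(axis=-1)


# Toy example with 3 classes; the true class is class 0.
probs = softmax(np.array([[2.0, 1.0, 0.1]]))
loss = categorical_crossentropy(np.array([[1.0, 0.0, 0.0]]), probs)

# A model that guesses uniformly over 46 classes has loss ln(46), about 3.83.
uniform = np.full((1, 46), 1.0 / 46)
target = np.zeros((1, 46))
target[0, 3] = 1.0
uniform_loss = categorical_crossentropy(target, uniform)
print(probs, loss, uniform_loss)
```

The uniform-guess value gives a reference point for reading the training log: a loss well below ln(46) means the model has learned more than chance.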










In [None]:
model = tf.keras.Sequential([
    # 10,000 binary word-presence features in, 64 hidden units out.
    tf.keras.layers.Dense(64, activation='relu', input_shape=(10000,)),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    # One output per topic; softmax turns the scores into probabilities.
    tf.keras.layers.Dense(46, activation='softmax')
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Use the one-hot encoded labels for training and validation.
# Note: the test set doubles as the validation set here.
model.fit(X_train, y_train_categorical, epochs=20, batch_size=32, validation_data=(X_test, y_test_categorical))

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


Epoch 1/20
281/281 ━━━━━━━━━━━━━━━━━━━━ 6s 16ms/step - accuracy: 0.4245 - loss: 2.4660 - val_accuracy: 0.6892 - val_loss: 1.3502
Epoch 2/20
281/281 ━━━━━━━━━━━━━━━━━━━━ 3s 11ms/step - accuracy: 0.6726 - loss: 1.3734 - val_accuracy: 0.7128 - val_loss: 1.2087
Epoch 3/20
281/281 ━━━━━━━━━━━━━━━━━━━━ 5s 11ms/step - accuracy: 0.7270 - loss: 1.1119 - val_accuracy: 0.7342 - val_loss: 1.1269
Epoch 4/20
281/281 ━━━━━━━━━━━━━━━━━━━━ 6s 13ms/step - accuracy: 0.7563 - loss: 0.9736 - val_accuracy: 0.7627 - val_loss: 1.0628
Epoch 5/20
281/281 ━━━━━━━━━━━━━━━━━━━━ 4s 11ms/step - accuracy: 0.7736 - loss: 0.8722 - val_accuracy: 0.7694 - val_loss: 1.0495
Epoch 6/20
281/281 ━━━━━━━━━━━━━━━━━━━━ 3s 10ms/step - accuracy: 0.7962 - loss: 0.7857 - val_accuracy: 0.7720 - val_loss: 1.0641
Epoch 7/20
281/281

<keras.src.callbacks.history.History at 0x78a742294cd0>
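
One way to read the log is to look for the epoch with the lowest validation loss. The sketch below uses only the six complete epochs visible in the output above (the rest of the log is cut off), and `best_epoch` is a hypothetical helper written for this illustration.

```python
# val_loss values copied from epochs 1 to 6 of the training log.
val_loss = [1.3502, 1.2087, 1.1269, 1.0628, 1.0495, 1.0641]


def best_epoch(losses):
    """Return the 1-based epoch with the lowest loss, and that loss."""
    best_index = min(range(len(losses)), key=losses.__getitem__)
    return best_index + 1, losses[best_index]


epoch, loss = best_epoch(val_loss)
print(epoch, loss)  # 5 1.0495
```

Within these six epochs, validation loss bottoms out at epoch 5 and rises at epoch 6 while training loss keeps falling, which is an early hint of overfitting. Keras offers the `tf.keras.callbacks.EarlyStopping` callback to stop training on such a signal. Because the validation data here is the test set, choosing the stopping epoch this way would make the reported test accuracy slightly optimistic.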