<a href="https://colab.research.google.com/github/Shrey-Viradiya/HandsOnMachineLearning/blob/master/Natural_Language_Processing_with_RNNs_and_Attention.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!nvidia-smi

Thu Jun 25 02:07:21 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   72C    P8    36W / 149W |      1MiB / 11441MiB |      0%      Default |
|                               |                      |                 ERR! |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

# Natural Language Processing with RNNs and Attention

## Generating Shakespearean Text Using a Character RNN

### Creating the Training Dataset

In [2]:
import tensorflow as tf
import tensorflow.keras as keras
import numpy as np
import matplotlib.pyplot as plt

In [3]:
shakespeare_url = "https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt"
filepath = keras.utils.get_file("shakespeare.txt", shakespeare_url)
with open(filepath) as f:
    shakespeare_text = f.read()

In [4]:
tokenizer = keras.preprocessing.text.Tokenizer(char_level=True)
tokenizer.fit_on_texts(shakespeare_text)

In [5]:
tokenizer.texts_to_sequences(['First'])

[[20, 6, 9, 8, 3]]

In [6]:
tokenizer.sequences_to_texts([[20,6,9,8,3]])

['f i r s t']

In [7]:
max_id = len(tokenizer.word_index)

In [8]:
max_id

39

In [9]:
dataset_size = tokenizer.document_count

In [10]:
dataset_size

1115394

In [11]:
[encoded] = np.array(tokenizer.texts_to_sequences([shakespeare_text])) - 1

### How to Split a Sequential Dataset

Let’s take the first 90% of the text for the training set (keeping the rest for the validation set and the test set), and create a tf.data.Dataset that will return each character one by one from this set:

In [12]:
train_size = dataset_size * 90 // 100
dataset = tf.data.Dataset.from_tensor_slices(encoded[:train_size])
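
The split above keeps the text in its original order, so the training, validation, and test parts never overlap. The following is a minimal plain-Python sketch of that idea. The helper `split_sequential` and the 5%/5% validation/test shares are assumptions made for illustration; the notebook itself only fixes the 90% training share.

```python
def split_sequential(sequence, train_pct=90, valid_pct=5):
    """Split a sequence into contiguous train/valid/test parts, preserving order."""
    n = len(sequence)
    # Integer arithmetic, in the same style as `dataset_size * 90 // 100`.
    train_end = n * train_pct // 100
    valid_end = n * (train_pct + valid_pct) // 100
    return sequence[:train_end], sequence[train_end:valid_end], sequence[valid_end:]

toy_sequence = list(range(20))
train_part, valid_part, test_part = split_sequential(toy_sequence)
print(len(train_part), len(valid_part), len(test_part))
```

Because every part is a contiguous slice, no character of the later parts leaks into the training set.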

### Chopping the Sequential Dataset into Multiple Windows

The training set now consists of a single sequence of over a million characters, so we can’t just train the neural network directly on it: the RNN would be equivalent to a deep net with over a million layers, and we would have a single (very long) instance to train it on. Instead, we will use the dataset’s window() method to convert this long sequence of characters into many smaller windows of text. Every instance in the dataset will be a fairly short substring of the whole text, and the RNN will be unrolled only over the length of these substrings. This is called truncated backpropagation through time.

In [13]:
n_steps = 100
window_length = n_steps + 1

In [14]:
dataset = dataset.window(window_length, shift=1, drop_remainder=True)

The window() method creates a dataset that contains windows, each of which is also represented as a dataset. It’s a nested dataset, analogous to a list of lists. This is useful when you want to transform each window by calling its dataset methods (e.g., to shuffle them or batch them). However, we cannot use a nested dataset directly for training, as our model will expect tensors as input, not datasets. So, we must call the flat_map() method: it converts a nested dataset into a flat dataset (one that does not contain datasets).

In [15]:
dataset = dataset.flat_map(lambda window: window.batch(window_length))
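
To make the windowing concrete, here is a small plain-Python sketch of what `window(window_length, shift=1, drop_remainder=True)` followed by `flat_map()` yields on a toy sequence. The helper `sliding_windows` is hypothetical and only mimics the result; it is not how tf.data is implemented.

```python
def sliding_windows(sequence, window_length, shift=1):
    """Return every full window of `window_length` items, stepping by `shift`.

    Incomplete windows at the end are dropped, as with drop_remainder=True.
    """
    last_start = len(sequence) - window_length
    return [sequence[start:start + window_length]
            for start in range(0, last_start + 1, shift)]

toy_ids = [0, 1, 2, 3, 4, 5]
windows = sliding_windows(toy_ids, window_length=3, shift=1)
for window in windows:
    print(window)
```

With `shift=1`, consecutive windows overlap in all but one item, which is why a text of about a million characters produces about a million windows.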

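We need to shuffle these windows. Then we can batch the windows and separate the inputs (the first 100 characters of each window) from the targets (the last 100 characters, i.e., the same sequence shifted one character ahead, so the model predicts the next character at every time step):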

In [16]:
batch_size = 32
dataset = dataset.shuffle(10000).batch(batch_size)
dataset = dataset.map(lambda windows: (windows[:, :-1], windows[:, 1:]))
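
The slicing `windows[:, :-1]` and `windows[:, 1:]` pairs every input character with the character that follows it. A minimal sketch on a single toy window, using a hypothetical helper `split_inputs_targets`:

```python
def split_inputs_targets(window):
    """Inputs are all items but the last; targets are the same items shifted by one."""
    return window[:-1], window[1:]

window = list("hello")
inputs, targets = split_inputs_targets(window)
for x, y in zip(inputs, targets):
    print(f"after {x!r} predict {y!r}")
```

Inputs and targets have the same length, which matches a model that returns one prediction per time step (`return_sequences=True`).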

Categorical input features should generally be encoded, usually as one-hot vectors or as embeddings. Here, we will encode each character using a one-hot vector because there are fairly few distinct characters (only 39):

In [17]:
dataset = dataset.map(
    lambda X_batch, Y_batch: (tf.one_hot(X_batch, depth=max_id), Y_batch))
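
As an illustration of what one-hot encoding produces, here is a plain-Python sketch. The helper `one_hot` is hypothetical and assumes every ID lies in `[0, depth)`; `tf.one_hot` handles out-of-range IDs differently.

```python
def one_hot(char_ids, depth):
    """Encode each integer ID in [0, depth) as a list containing a single 1.0."""
    vectors = []
    for char_id in char_ids:
        vector = [0.0] * depth          # one slot per distinct character
        vector[char_id] = 1.0           # assumption: 0 <= char_id < depth
        vectors.append(vector)
    return vectors

encoded_ids = [2, 0, 3]
for vector in one_hot(encoded_ids, depth=4):
    print(vector)
```

This is also why the encoded IDs were shifted down by 1 earlier: with `depth=max_id`, valid IDs run from 0 to `max_id - 1`.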

In [18]:
dataset = dataset.prefetch(1)

### Building and Training the Char-RNN Model

In [19]:
import os

checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

checkpoint_callback=tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True)

In [20]:
model = keras.models.Sequential([
    keras.layers.GRU(128, return_sequences=True, input_shape=[None, max_id], dropout=0.2, recurrent_dropout=0.2),
    keras.layers.TimeDistributed(keras.layers.Dense(max_id, activation='softmax'))    
])

model.compile(loss = 'sparse_categorical_crossentropy', optimizer = 'adam')
steps_per_epoch = train_size // batch_size // n_steps

class NvidiaUtilizationCallback(keras.callbacks.Callback):
    def on_epoch_begin(self, epoch, logs):
        text = !nvidia-smi
        text = text[9][60:65] + ' GPU utilization'
        print(text)

history = model.fit(dataset, epochs = 40, steps_per_epoch=steps_per_epoch, callbacks=[checkpoint_callback, NvidiaUtilizationCallback()])

  0%  GPU utilization
Epoch 1/40
  0%  GPU utilization
Epoch 2/40
  0%  GPU utilization
Epoch 3/40
  0%  GPU utilization
Epoch 4/40
  0%  GPU utilization
Epoch 5/40
  0%  GPU utilization
Epoch 6/40
  0%  GPU utilization
Epoch 7/40
  0%  GPU utilization
Epoch 8/40
  0%  GPU utilization
Epoch 9/40
  0%  GPU utilization
Epoch 10/40
  0%  GPU utilization
Epoch 11/40
  0%  GPU utilization
Epoch 12/40
  0%  GPU utilization
Epoch 13/40
  0%  GPU utilization
Epoch 14/40
  0%  GPU utilization
Epoch 15/40
  0%  GPU utilization
Epoch 16/40
  0%  GPU utilization
Epoch 17/40
  0%  GPU utilization
Epoch 18/40
  0%  GPU utilization
Epoch 19/40
  0%  GPU utilization
Epoch 20/40
  0%  GPU utilization
Epoch 21/40
  0%  GPU utilization
Epoch 22/40
  0%  GPU utilization
Epoch 23/40
  0%  GPU utilization
Epoch 24/40
  0%  GPU utilization
Epoch 25/40
  0%  GPU utilization
Epoch 26/40
  0%  GPU utilization
Epoch 27/40
  0%  GPU utilization
Epoch 28/40
  0%  GPU utilization
Epoch 29/40
  0%  GPU utilization
E

### Using the Model to Generate Text

In [21]:
def preprocess(texts):
    X = np.array(tokenizer.texts_to_sequences(texts)) - 1
    return tf.one_hot(X, max_id)

In [22]:
X_new = preprocess(["How are yo"])

In [23]:
Y_pred = np.argmax(model.predict(X_new), axis=-1)

In [24]:
tokenizer.sequences_to_texts(Y_pred + 1)[0][-1]

'u'

### Generating Fake Shakespearean Text

In [25]:
def next_char(text, temperature=1):
    X_new = preprocess([text])
    y_proba = model.predict(X_new)[0, -1:, :]
    rescaled_logits = tf.math.log(y_proba) / temperature
    char_id = tf.random.categorical(rescaled_logits, num_samples=1) + 1
    return tokenizer.sequences_to_texts(char_id.numpy())[0]
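
The `temperature` argument in `next_char()` divides the log-probabilities before sampling. The sketch below, in plain Python, shows the effect on a toy distribution; `apply_temperature` is a hypothetical helper that makes explicit the softmax that categorical sampling from logits applies implicitly.

```python
import math

def apply_temperature(probabilities, temperature):
    """Rescale log-probabilities by 1/temperature and renormalize with a softmax."""
    rescaled = [math.log(p) / temperature for p in probabilities]
    max_logit = max(rescaled)  # subtract the max for numerical stability
    exps = [math.exp(logit - max_logit) for logit in rescaled]
    total = sum(exps)
    return [e / total for e in exps]

toy_probas = [0.6, 0.3, 0.1]
for temperature in (0.2, 1.0, 2.0):
    rescaled = apply_temperature(toy_probas, temperature)
    print(temperature, [round(p, 3) for p in rescaled])
```

A temperature below 1 concentrates probability on the most likely character, a temperature of 1 leaves the distribution unchanged, and a temperature above 1 flattens it, which is consistent with the increasingly erratic samples generated at higher temperatures.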

In [26]:
def complete_text(text, n_chars=50, temperature=1):
    for _ in range(n_chars):
        text += next_char(text, temperature)
    return text

In [27]:
text_1 = complete_text("t", temperature=0.2)



In [28]:
print(text_1)

the bolingbroke:
the bone it and the bolingbroke:
a


In [29]:
text_2 = complete_text("w", temperature=1)

In [30]:
print(text_2)

wall heard soul,
and old my wrongs, i'll, gittion o


In [31]:
text_3 = complete_text("i", temperature=2)

In [32]:
print(text_3)

ivy filk will hid,
ploug? nob,nhere hastyou gaod it


In [33]:
text_4 = complete_text("I shall love", temperature=0.5, n_chars=100)



In [34]:
print(text_4)

I shall love,
with heaven throw and see the breath of with and go.

duke of aumerle:
we do the bolingbroke:
and 


In [35]:
text_5 = complete_text("love", temperature=1, n_chars = 150)



In [36]:
print(text_5)

love.
by the time no more thee? what all his itngran
him marve misence the more be soul ressing and in the hand but are would again
be think their grict o


### Stateful RNN

First, note that a stateful RNN only makes sense if each input sequence in a batch starts exactly where the corresponding sequence in the previous batch left off. So the first thing we need to do to build a stateful RNN is to use sequential and nonoverlapping input sequences (rather than the shuffled and overlapping sequences we used to train stateless RNNs). When creating the Dataset, we must therefore use shift=n_steps (instead of shift=1) when calling the window() method. Moreover, we must obviously not call the shuffle() method.

In [37]:
tf.random.set_seed(42)

In [38]:
dataset = tf.data.Dataset.from_tensor_slices(encoded[:train_size])
dataset = dataset.window(window_length, shift=n_steps, drop_remainder=True)
dataset = dataset.flat_map(lambda window: window.batch(window_length))
dataset = dataset.batch(1)
dataset = dataset.map(lambda windows: (windows[:, :-1], windows[:, 1:]))
dataset = dataset.map(
    lambda X_batch, Y_batch: (tf.one_hot(X_batch, depth=max_id), Y_batch))
dataset = dataset.prefetch(1)

In [39]:
batch_size = 32
encoded_parts = np.array_split(encoded[:train_size], batch_size)
datasets = []
for encoded_part in encoded_parts:
    dataset = tf.data.Dataset.from_tensor_slices(encoded_part)
    dataset = dataset.window(window_length, shift=n_steps, drop_remainder=True)
    dataset = dataset.flat_map(lambda window: window.batch(window_length))
    datasets.append(dataset)
dataset = tf.data.Dataset.zip(tuple(datasets)).map(lambda *windows: tf.stack(windows))
dataset = dataset.repeat().map(lambda windows: (windows[:, :-1], windows[:, 1:]))
dataset = dataset.map(
    lambda X_batch, Y_batch: (tf.one_hot(X_batch, depth=max_id), Y_batch))
dataset = dataset.prefetch(1)
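
The batched pipeline above can be hard to picture, so here is a plain-Python sketch of the layout it aims for: the text is cut into `batch_size` contiguous parts, each part is windowed with `shift=n_steps`, and the k-th windows of all parts form the k-th batch. The helper `stateful_batches` is hypothetical, and unlike `np.array_split` it simply drops leftover items.

```python
def stateful_batches(sequence, batch_size, n_steps):
    """Build batches so that row i of each batch continues row i of the previous one."""
    window_length = n_steps + 1
    part_length = len(sequence) // batch_size  # assumption: drop any leftover items
    parts = [sequence[i * part_length:(i + 1) * part_length]
             for i in range(batch_size)]
    # Windows inside each part, stepping by n_steps so the inputs do not overlap.
    per_part_windows = [
        [part[start:start + window_length]
         for start in range(0, part_length - window_length + 1, n_steps)]
        for part in parts
    ]
    # zip(*...) pairs the k-th window of every part into the k-th batch.
    return [list(batch) for batch in zip(*per_part_windows)]

toy_sequence = list(range(14))
batches = stateful_batches(toy_sequence, batch_size=2, n_steps=3)
for batch in batches:
    print(batch)
```

In each row, the first item of a window equals the last item of the previous window in that row, so after dropping the final item to form the inputs, each input sequence starts exactly where the previous one ended.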

In [40]:
model = keras.models.Sequential([
    keras.layers.GRU(128, return_sequences=True, stateful=True,
                     dropout=0.2, recurrent_dropout=0.2,
                     batch_input_shape=[batch_size, None, max_id]),
    keras.layers.GRU(128, return_sequences=True, stateful=True,
                     dropout=0.2, recurrent_dropout=0.2),
    keras.layers.TimeDistributed(keras.layers.Dense(max_id,
                                                    activation="softmax"))
])

class ResetStatesCallback(keras.callbacks.Callback):
    def on_epoch_begin(self, epoch, logs):
        self.model.reset_states()



In [41]:
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
steps_per_epoch = train_size // batch_size // n_steps
history = model.fit(dataset, steps_per_epoch=steps_per_epoch, epochs=40,
                    callbacks=[ResetStatesCallback(), NvidiaUtilizationCallback()])

  0%  GPU utilization
Epoch 1/40
  0%  GPU utilization
Epoch 2/40
  0%  GPU utilization
Epoch 3/40
  0%  GPU utilization
Epoch 4/40
  0%  GPU utilization
Epoch 5/40
  0%  GPU utilization
Epoch 6/40
  0%  GPU utilization
Epoch 7/40
  0%  GPU utilization
Epoch 8/40
  0%  GPU utilization
Epoch 9/40
  0%  GPU utilization
Epoch 10/40
  0%  GPU utilization
Epoch 11/40
  0%  GPU utilization
Epoch 12/40
  0%  GPU utilization
Epoch 13/40
  0%  GPU utilization
Epoch 14/40
  0%  GPU utilization
Epoch 15/40
  0%  GPU utilization
Epoch 16/40
  0%  GPU utilization
Epoch 17/40
  0%  GPU utilization
Epoch 18/40
  0%  GPU utilization
Epoch 19/40
  0%  GPU utilization
Epoch 20/40
  0%  GPU utilization
Epoch 21/40
  0%  GPU utilization
Epoch 22/40
  0%  GPU utilization
Epoch 23/40
  0%  GPU utilization
Epoch 24/40
  0%  GPU utilization
Epoch 25/40
  0%  GPU utilization
Epoch 26/40
  0%  GPU utilization
Epoch 27/40
  0%  GPU utilization
Epoch 28/40
  0%  GPU utilization
Epoch 29/40
  0%  GPU utilization
E


To use the model with different batch sizes, we need to create a stateless copy. We can get rid of dropout since it is only used during training:

In [42]:
stateless_model = keras.models.Sequential([
    keras.layers.GRU(128, return_sequences=True, input_shape=[None, max_id]),
    keras.layers.GRU(128, return_sequences=True),
    keras.layers.TimeDistributed(keras.layers.Dense(max_id, activation="softmax"))
])


To set the weights, we first need to build the model (so the weights get created):

In [43]:
stateless_model.build(tf.TensorShape([None, None, max_id]))

In [44]:
stateless_model.set_weights(model.get_weights())
model = stateless_model

In [45]:
tf.random.set_seed(42)
print(complete_text("t"))

ty done!
i never ere a such so my. beliver wonds,
a


### Generating Fake Shakespearean Text

In [46]:
def next_char(text, temperature=1):
    X_new = preprocess([text])
    y_proba = stateless_model.predict(X_new)[0, -1:, :]
    rescaled_logits = tf.math.log(y_proba) / temperature
    char_id = tf.random.categorical(rescaled_logits, num_samples=1) + 1
    return tokenizer.sequences_to_texts(char_id.numpy())[0]

In [47]:
def complete_text(text, n_chars=50, temperature=1):
    for _ in range(n_chars):
        text += next_char(text, temperature)
    return text

In [48]:
stateless_model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")

In [49]:
stateless_model.predict(X_new)

array([[[4.59891595e-02, 1.74413472e-01, 1.98242627e-02, 9.89711061e-02,
         2.76178420e-01, 1.33323833e-01, 1.26086525e-03, 6.08527754e-03,
         2.59843897e-02, 1.13477726e-02, 2.20332872e-02, 4.34173178e-03,
         2.35646241e-03, 1.27784088e-02, 6.94315601e-03, 1.85317267e-02,
         6.71544112e-03, 2.25532167e-02, 1.36832881e-03, 5.82638476e-03,
         9.60989331e-04, 1.55420369e-02, 6.14485750e-03, 1.30741103e-02,
         5.91185344e-05, 1.35236280e-03, 1.06580453e-02, 8.86290334e-03,
         7.31884083e-03, 7.44957710e-03, 1.07008098e-02, 3.77355074e-03,
         1.69266993e-03, 2.26535159e-03, 5.61631704e-03, 2.66866153e-03,
         2.68419785e-03, 1.44288142e-03, 9.06004570e-04],
        [4.52118181e-03, 1.03911234e-03, 3.58953071e-03, 4.33750115e-02,
         2.06875196e-03, 1.81944191e-03, 9.36750293e-05, 1.51598640e-02,
         1.73121944e-01, 4.89112884e-02, 1.55169494e-03, 3.13514136e-02,
         2.97830673e-03, 3.72492403e-01, 7.35578164e-02, 1.9008436

In [50]:
text_1 = complete_text("t", temperature=0.2)



In [51]:
print(text_1)

tight of the country.

lady grey:
who will the stay


In [52]:
text_2 = complete_text("w", temperature=1)

In [53]:
print(text_2)

wpent me:
have my good high beat you him of you arm


In [54]:
text_3 = complete_text("i", temperature=2)

In [55]:
print(text_3)

iq'tr?
ame eneep, desind sasdocy to diviber?'
hid a


In [56]:
text_4 = complete_text("I shall love", temperature=0.5, n_chars=100)



In [57]:
print(text_4)

I shall love as the world,
when i shall in the grace to part to be them me the will
for you have i may my grace 


In [58]:
text_5 = complete_text("love", temperature=1, n_chars = 150)



In [59]:
print(text_5)

love:
and leave me: but our heaven, he was held in they
our were to but to the store friends as the counter;
will great so lies to high to thee rest a
kin


## Sentiment Analysis

In [60]:
tf.random.set_seed(42)

In [61]:
(X_train, y_train), (X_test, y_test) = keras.datasets.imdb.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz


In [62]:
X_train[0][:10]

[1, 14, 22, 16, 43, 530, 973, 1622, 1385, 65]

In [76]:
word_index = keras.datasets.imdb.get_word_index()
id_to_word = {id_ + 3: word for word, id_ in word_index.items()}
for id_, token in enumerate(("<pad>", "<sos>", "<unk>")):
    id_to_word[id_] = token
" ".join([id_to_word[id_] for id_ in X_train[0][:10]])

'<sos> this film was just brilliant casting location scenery story'

In [64]:
!pip install -U tensorflow_datasets

Collecting tensorflow_datasets
  Downloading https://files.pythonhosted.org/packages/bd/99/996b15ff5d11166c3516012838f569f78d57b71d4aac051caea826f6c7e0/tensorflow_datasets-3.1.0-py3-none-any.whl (3.3MB)
Installing collected packages: tensorflow-datasets
  Found existing installation: tensorflow-datasets 2.1.0
    Uninstalling tensorflow-datasets-2.1.0:
      Successfully uninstalled tensorflow-datasets-2.1.0
Successfully installed tensorflow-datasets-3.1.0


In [65]:
import tensorflow_datasets as tfds

datasets, info = tfds.load("imdb_reviews", as_supervised=True, with_info=True)

Downloading and preparing dataset imdb_reviews/plain_text/1.0.0 (download: 80.23 MiB, generated: Unknown size, total: 80.23 MiB) to /root/tensorflow_datasets/imdb_reviews/plain_text/1.0.0...



Shuffling and writing examples to /root/tensorflow_datasets/imdb_reviews/plain_text/1.0.0.incompleteYADTPC/imdb_reviews-train.tfrecord



Shuffling and writing examples to /root/tensorflow_datasets/imdb_reviews/plain_text/1.0.0.incompleteYADTPC/imdb_reviews-test.tfrecord



Shuffling and writing examples to /root/tensorflow_datasets/imdb_reviews/plain_text/1.0.0.incompleteYADTPC/imdb_reviews-unsupervised.tfrecord



Dataset imdb_reviews downloaded and prepared to /root/tensorflow_datasets/imdb_reviews/plain_text/1.0.0. Subsequent calls will reuse this data.


In [70]:
train_size = info.splits["train"].num_examples
test_size = info.splits["test"].num_examples

In [71]:
train_size, test_size

(25000, 25000)

In [69]:
datasets.keys()

dict_keys(['test', 'train', 'unsupervised'])

In [72]:
for X_batch, y_batch in datasets["train"].batch(2).take(1):
    for review, label in zip(X_batch.numpy(), y_batch.numpy()):
        print("Review:", review.decode("utf-8")[:200], "...")
        print("Label:", label, "= Positive" if label else "= Negative")
        print()

Review: This was an absolutely terrible movie. Don't be lured in by Christopher Walken or Michael Ironside. Both are great actors, but this must simply be their worst role in history. Even their great acting  ...
Label: 0 = Negative

Review: I have been known to fall asleep during films, but this is usually due to a combination of things including, really tired, being warm and comfortable on the sette and having just eaten a lot. However  ...
Label: 0 = Negative



In [77]:
def preprocess(X_batch, y_batch):
    X_batch = tf.strings.substr(X_batch, 0, 300)
    X_batch = tf.strings.regex_replace(X_batch, rb"<br\s*/?>", b" ")
    X_batch = tf.strings.regex_replace(X_batch, b"[^a-zA-Z']", b" ")
    X_batch = tf.strings.split(X_batch)
    return X_batch.to_tensor(default_value=b"<pad>"), y_batch
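
For readers who want to see what this preprocessing does to a single review, here is a rough plain-Python equivalent built on the `re` module. The helpers `preprocess_review` and `pad_batch` are hypothetical; note that this sketch truncates by characters, whereas `tf.strings.substr` works on bytes by default.

```python
import re

def preprocess_review(review, max_chars=300):
    """Truncate a review, drop <br> tags and non-letter characters, then split on whitespace."""
    review = review[:max_chars]
    review = re.sub(r"<br\s*/?>", " ", review)
    review = re.sub(r"[^a-zA-Z']", " ", review)
    return review.split()

def pad_batch(token_lists, pad_token="<pad>"):
    """Pad every token list to the length of the longest one in the batch."""
    longest = max(len(tokens) for tokens in token_lists)
    return [tokens + [pad_token] * (longest - len(tokens)) for tokens in token_lists]

sample = "Great film!<br />Don't miss it... 10/10"
print(preprocess_review(sample))
print(pad_batch([preprocess_review(sample), preprocess_review("Awful.")]))
```

Padding every review in a batch to the same length is what makes `<pad>` by far the most frequent token in the vocabulary count.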

In [78]:
from collections import Counter
vocabulary = Counter()
for X_batch, y_batch in datasets["train"].batch(32).map(preprocess):
    for review in X_batch:
        vocabulary.update(list(review.numpy()))

In [79]:
vocabulary.most_common()[:3]

[(b'<pad>', 214309), (b'the', 61137), (b'a', 38564)]

In [80]:
vocab_size = 10000
truncated_vocabulary = [
    word for word, count in vocabulary.most_common()[:vocab_size]]

In [81]:
truncated_vocabulary

[b'<pad>',
 b'the',
 b'a',
 b'of',
 b'and',
 b'to',
 b'I',
 b'is',
 b'in',
 b'this',
 b'it',
 b'was',
 b'movie',
 b'that',
 b'The',
 b'film',
 b'with',
 b'for',
 b'as',
 b'on',
 b'but',
 b'have',
 b'This',
 b'one',
 b'not',
 b'be',
 b'are',
 b'you',
 b'an',
 b'at',
 b'about',
 b'by',
 b'all',
 b'his',
 b'so',
 b'like',
 b'from',
 b'who',
 b'has',
 b'It',
 b'good',
 b'my',
 b'just',
 b'very',
 b'out',
 b'or',
 b'story',
 b'some',
 b'time',
 b'had',
 b'he',
 b'they',
 b'really',
 b'me',
 b'when',
 b'what',
 b'first',
 b'movies',
 b'bad',
 b'see',
 b'seen',
 b'up',
 b'only',
 b'were',
 b"it's",
 b'would',
 b'more',
 b'made',
 b'great',
 b'can',
 b'been',
 b'i',
 b'her',
 b'no',
 b'A',
 b'which',
 b'even',
 b'films',
 b'there',
 b'ever',
 b'people',
 b'much',
 b'because',
 b'most',
 b'plot',
 b'if',
 b'than',
 b'acting',
 b'get',
 b'their',
 b'well',
 b'into',
 b'how',
 b'best',
 b'think',
 b'other',
 b'its',
 b"It's",
 b'saw',
 b'could',
 b'watch',
 b'many',
 b"don't",
 b'do',
 b'will',
 

Now we need to add a preprocessing step to replace each word with its ID (i.e., its index in the vocabulary).

In [82]:
words = tf.constant(truncated_vocabulary)
word_ids = tf.range(len(truncated_vocabulary), dtype=tf.int64)
vocab_init = tf.lookup.KeyValueTensorInitializer(words, word_ids)
num_oov_buckets = 1000
table = tf.lookup.StaticVocabularyTable(vocab_init, num_oov_buckets)

In [83]:
table.lookup(tf.constant([b"This movie was faaaaaantastic".split()]))

<tf.Tensor: shape=(1, 4), dtype=int64, numpy=array([[   22,    12,    11, 10053]])>
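
The out-of-vocabulary behaviour seen above (an unknown word receiving an ID of 10000 or more) can be sketched in plain Python. The helper `build_lookup` is hypothetical, and CRC32 is used here only as a stand-in for TensorFlow's internal hash, so the bucket it picks will not match the one `StaticVocabularyTable` returns.

```python
import zlib

def build_lookup(vocabulary, num_oov_buckets):
    """Return a function mapping a word to its ID, hashing unknown words into OOV buckets."""
    word_to_id = {word: index for index, word in enumerate(vocabulary)}

    def lookup(word):
        if word in word_to_id:
            return word_to_id[word]
        # Assumption: CRC32 stands in for TensorFlow's hash function, so the
        # bucket chosen here differs from the one the real table would pick.
        bucket = zlib.crc32(word.encode("utf-8")) % num_oov_buckets
        return len(vocabulary) + bucket

    return lookup

toy_vocabulary = ["<pad>", "the", "movie", "was"]
lookup = build_lookup(toy_vocabulary, num_oov_buckets=5)
print([lookup(word) for word in "the movie was faaaaaantastic".split()])
```

Known words keep their vocabulary index, while each unknown word lands deterministically in one of the extra buckets. This is why the embedding layer needs `vocab_size + num_oov_buckets` rows.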

In [84]:
def encode_words(X_batch, y_batch):
    return table.lookup(X_batch), y_batch

In [85]:
train_set = datasets["train"].batch(32).map(preprocess)
train_set = train_set.map(encode_words).prefetch(1)

In [86]:
for X_batch, y_batch in train_set.take(1):
    print(X_batch)
    print(y_batch)

tf.Tensor(
[[  22   11   28 ...    0    0    0]
 [   6   21   70 ...    0    0    0]
 [4099 6881    1 ...    0    0    0]
 ...
 [  22   12  118 ...  331 1047    0]
 [1757 4101  451 ...    0    0    0]
 [3365 4392    6 ...    0    0    0]], shape=(32, 60), dtype=int64)
tf.Tensor([0 0 0 1 1 1 0 0 0 0 0 1 1 0 1 0 1 1 1 0 1 1 1 1 1 0 0 0 1 0 0 0], shape=(32,), dtype=int64)


In [100]:
test_set = datasets["test"].batch(32).map(preprocess)
test_set = test_set.map(encode_words).prefetch(1)

In [99]:
embed_size = 128
model = keras.models.Sequential([
    keras.layers.Embedding(vocab_size + num_oov_buckets, embed_size,
                           mask_zero=True, # not shown in the book
                           input_shape=[None]),
    keras.layers.GRU(128, return_sequences=True),
    keras.layers.GRU(128),
    keras.layers.Dense(1, activation="sigmoid")
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
history = model.fit(train_set, epochs=10, validation_data=test_set)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10

In [101]:
model.evaluate(test_set)



[1.3523379564285278, 0.7367200255393982]

In [103]:
K = keras.backend
embed_size = 128
inputs = keras.layers.Input(shape=[None])
mask = keras.layers.Lambda(lambda inputs: K.not_equal(inputs, 0))(inputs)
z = keras.layers.Embedding(vocab_size + num_oov_buckets, embed_size)(inputs)
z = keras.layers.GRU(128, return_sequences=True)(z, mask=mask)
z = keras.layers.GRU(128)(z, mask=mask)
outputs = keras.layers.Dense(1, activation="sigmoid")(z)
model = keras.models.Model(inputs=[inputs], outputs=[outputs])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
history = model.fit(train_set, epochs=10, validation_data = test_set)   

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [104]:
model.evaluate(test_set)



[1.1697889566421509, 0.7277200222015381]