<a href="https://colab.research.google.com/github/Shrey-Viradiya/HandsOnMachineLearning/blob/master/Natural_Language_Processing_with_RNNs_and_Attention.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!nvidia-smi

Fri Jun 26 17:46:08 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 450.36.06    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  GeForce 920MX       On   | 00000000:01:00.0 Off |                  N/A |
| N/A   59C    P8    N/A /  N/A |    162MiB /  2004MiB |     38%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

# Natural Language Processing with RNNs and Attention

## Generating Shakespearean Text Using a Character RNN

### Creating the Training Dataset

In [2]:
import tensorflow as tf
import tensorflow.keras as keras
import numpy as np
import matplotlib.pyplot as plt

In [3]:
shakespeare_url = "https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt"
filepath = keras.utils.get_file("shakespeare.txt", shakespeare_url)
with open(filepath) as f:
    shakespeare_text = f.read()

In [4]:
tokenizer = keras.preprocessing.text.Tokenizer(char_level=True)
tokenizer.fit_on_texts(shakespeare_text)

In [5]:
tokenizer.texts_to_sequences(['First'])

[[20, 6, 9, 8, 3]]

In [6]:
tokenizer.sequences_to_texts([[20,6,9,8,3]])

['f i r s t']

In [7]:
max_id = len(tokenizer.word_index)

In [8]:
max_id

39

In [9]:
dataset_size = tokenizer.document_count  # total number of characters

In [10]:
dataset_size

1115394

In [11]:
[encoded] = np.array(tokenizer.texts_to_sequences([shakespeare_text])) - 1

### How to Split a Sequential Dataset

Let’s take the first 90% of the text for the training set (keeping the rest for the validation set and the test set), and create a tf.data.Dataset that will return each character one by one from this set:

In [12]:
train_size = dataset_size * 90 // 100
dataset = tf.data.Dataset.from_tensor_slices(encoded[:train_size])

### Chopping the Sequential Dataset into Multiple Windows

The training set now consists of a single sequence of over a million characters, so we can’t just train the neural network directly on it: the RNN would be equivalent to a deep net with over a million layers, and we would have a single (very long) instance to train it. Instead, we will use the dataset’s window() method to convert this long sequence of characters into many smaller windows of text. Every instance in the dataset will be a fairly short substring of the whole text, and the RNN will be unrolled only over the length of these substrings. This is called truncated backpropagation through time. 

In [13]:
n_steps = 100
window_length = n_steps + 1

In [14]:
dataset = dataset.window(window_length, shift=1, drop_remainder=True)

The window() method creates a dataset that contains windows, each of which is also represented as a dataset. It’s a nested dataset, analogous to a list of lists. This is useful when you want to transform each window by calling its dataset methods (e.g., to shuffle them or batch them). However, we cannot use a nested dataset directly for training, as our model will expect tensors as input, not datasets. So, we must call the flat_map() method: it converts a nested dataset into a flat dataset (one that does not contain datasets).

In [15]:
dataset = dataset.flat_map(lambda window: window.batch(window_length))
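
To see what these two steps produce, here is a minimal pure-Python sketch on a toy sequence. The helper `sliding_windows` is hypothetical (it is not part of tf.data); it only mimics `window(..., shift=1, drop_remainder=True)` followed by `flat_map()`, assuming the whole sequence fits in a list:

```python
def sliding_windows(sequence, window_length, shift=1):
    """Return every full window of `window_length` items, moving `shift` items each time.

    Windows that would run past the end are skipped, like drop_remainder=True.
    """
    windows = []
    for start in range(0, len(sequence) - window_length + 1, shift):
        windows.append(sequence[start:start + window_length])
    return windows


toy_sequence = list(range(6))  # stands in for the encoded characters
toy_windows = sliding_windows(toy_sequence, window_length=4, shift=1)
for window in toy_windows:
    print(window)
```

Each window overlaps the previous one by all but one item, which is why the real dataset contains roughly one window per character.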

We need to shuffle these windows. Then we can batch them and separate the inputs (the first 100 characters of each window) from the targets (the last 100 characters, i.e., the same sequence shifted one character ahead):

In [16]:
batch_size = 32
dataset = dataset.shuffle(10000).batch(batch_size)
dataset = dataset.map(lambda windows: (windows[:, :-1], windows[:, 1:]))
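
The slicing in the `map()` call can be illustrated on a single short window. This is only a sketch using a plain Python string in place of a tensor of character IDs:

```python
window = "To be"       # one window of n_steps + 1 = 5 characters
inputs = window[:-1]   # everything except the last character
targets = window[1:]   # everything except the first character

# At every time step, the target is the character that follows the input.
for input_char, target_char in zip(inputs, targets):
    print(repr(input_char), "->", repr(target_char))
```

Because the model is trained with `return_sequences=True`, it makes a prediction at every time step, not just the last one.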

Categorical input features should generally be encoded, usually as one-hot vectors or as embeddings. Here, we will encode each character using a one-hot vector because there are fairly few distinct characters (only 39):

In [17]:
dataset = dataset.map(
    lambda X_batch, Y_batch: (tf.one_hot(X_batch, depth=max_id), Y_batch))
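
As a rough picture of what `tf.one_hot()` does to each character ID, here is a pure-Python sketch. The function `one_hot` below is a hypothetical stand-in operating on lists; it assumes IDs start at 0, which is why 1 was subtracted from the tokenizer's IDs earlier:

```python
def one_hot(char_ids, depth):
    """Map each ID to a vector of length `depth` with a single 1.0 at index ID."""
    return [[1.0 if index == char_id else 0.0 for index in range(depth)]
            for char_id in char_ids]


encoded_ids = [2, 0, 3]  # toy character IDs in the range 0..3
vectors = one_hot(encoded_ids, depth=4)
for char_id, vector in zip(encoded_ids, vectors):
    print(char_id, vector)
```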

In [18]:
dataset = dataset.prefetch(1)

### Building and Training the Char-RNN Model

In [19]:
import os

checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

checkpoint_callback=tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True)

In [20]:
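class NvidiaUtilizationCallback(keras.callbacks.Callback):
    """Print the GPU utilization reported by nvidia-smi at the start of each epoch."""
    def on_epoch_begin(self, epoch, logs=None):
        # `!command` is IPython shell-capture syntax, so this only runs in a notebook
        text = !nvidia-smi
        # Row 9, columns 60-64 hold GPU-Util for the driver version shown above
        text = text[9][60:65] + ' GPU utilization'
        print(text)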

In [21]:
model = keras.models.Sequential([
    keras.layers.GRU(128, return_sequences=True, input_shape=[None, max_id], dropout=0.2, recurrent_dropout=0.2),
    keras.layers.TimeDistributed(keras.layers.Dense(max_id, activation='softmax'))    
])

model.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
steps_per_epoch = train_size // batch_size // n_steps

history = model.fit(dataset, epochs=40, steps_per_epoch=steps_per_epoch,
                    callbacks=[checkpoint_callback, NvidiaUtilizationCallback()])

  4%  GPU utilization
Epoch 1/40
  2%  GPU utilization
Epoch 2/40
 26%  GPU utilization
Epoch 3/40
 27%  GPU utilization
Epoch 4/40
 25%  GPU utilization
Epoch 5/40
 29%  GPU utilization
Epoch 6/40
 43%  GPU utilization
Epoch 7/40
 57%  GPU utilization
Epoch 8/40
 50%  GPU utilization
Epoch 9/40
  5%  GPU utilization
Epoch 10/40
 61%  GPU utilization
Epoch 11/40
 38%  GPU utilization
Epoch 12/40
 56%  GPU utilization
Epoch 13/40
 34%  GPU utilization
Epoch 14/40
 49%  GPU utilization
Epoch 15/40
 54%  GPU utilization
Epoch 16/40
 52%  GPU utilization
Epoch 17/40
 45%  GPU utilization
Epoch 18/40
 48%  GPU utilization
Epoch 19/40
 52%  GPU utilization
Epoch 20/40
 35%  GPU utilization
Epoch 21/40
 49%  GPU utilization
Epoch 22/40
 33%  GPU utilization
Epoch 23/40
 37%  GPU utilization
Epoch 24/40
 35%  GPU utilization
Epoch 25/40
 39%  GPU utilization
Epoch 26/40
 34%  GPU utilization
Epoch 27/40
 36%  GPU utilization
Epoch 28/40
 34%  GPU utilization
Epoch 29/40
 32%  GPU utilization
E

### Using the Model to Generate Text

In [22]:
def preprocess(texts):
    X = np.array(tokenizer.texts_to_sequences(texts)) - 1
    return tf.one_hot(X, max_id)

In [23]:
X_new = preprocess(["How are yo"])

In [24]:
Y_pred = np.argmax(model.predict(X_new), axis=-1)

In [25]:
tokenizer.sequences_to_texts(Y_pred + 1)[0][-1]

'u'

### Generating Fake Shakespearean Text

In [26]:
def next_char(text, temperature=1):
    X_new = preprocess([text])
    y_proba = model.predict(X_new)[0, -1:, :]
    rescaled_logits = tf.math.log(y_proba) / temperature
    char_id = tf.random.categorical(rescaled_logits, num_samples=1) + 1
    return tokenizer.sequences_to_texts(char_id.numpy())[0]
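
To make the effect of `temperature` concrete, the sketch below reproduces the rescaling in pure Python. It assumes, as `tf.random.categorical()` does, that sampling applies a softmax to the rescaled logits; the helper name `apply_temperature` and the toy probabilities are illustrative only:

```python
import math


def apply_temperature(probabilities, temperature):
    """Divide the log-probabilities by `temperature`, then renormalize with a softmax."""
    rescaled = [math.log(p) / temperature for p in probabilities]
    largest = max(rescaled)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in rescaled]
    total = sum(exps)
    return [e / total for e in exps]


probs = [0.7, 0.2, 0.1]                # toy next-character probabilities
cold = apply_temperature(probs, 0.2)   # sharper: favors the most likely character
same = apply_temperature(probs, 1.0)   # unchanged distribution
hot = apply_temperature(probs, 2.0)    # flatter: rare characters become more likely
print(cold, same, hot, sep="\n")
```

A temperature close to 0 makes generation nearly deterministic, while a high temperature pushes the distribution toward uniform, which matches the increasingly garbled samples at `temperature=2`.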

In [27]:
def complete_text(text, n_chars=50, temperature=1):
    for _ in range(n_chars):
        text += next_char(text, temperature)
    return text

In [28]:
text_1 = complete_text("t", temperature=0.2)



In [29]:
print(text_1)

the true the breather,
and shall be so for the brea


In [30]:
text_2 = complete_text("w", temperature=1)

In [31]:
print(text_2)

who land the bills, and to this conforce it the fie


In [32]:
text_3 = complete_text("i", temperature=2)

In [33]:
print(text_3)

ith can caw, and lorornerild i? ghew?
shoky woe, el


In [34]:
text_4 = complete_text("I shall love", temperature=0.5, n_chars=100)



In [35]:
print(text_4)

I shall love
the right and the breath our himself christ thee speak
to the earth you speak the trumber conduct m


In [36]:
text_5 = complete_text("love", temperature=1, n_chars = 150)



In [37]:
print(text_5)

love, grace and littlesh depast rive,
in god my lord, and wrusty majesty!

henry bolingbroke:
what make it in fall be brigh death his love, shall to my li


### Stateful RNN

First, note that a stateful RNN only makes sense if each input sequence in a batch starts exactly where the corresponding sequence in the previous batch left off. So the first thing we need to do to build a stateful RNN is to use sequential and nonoverlapping input sequences (rather than the shuffled and overlapping sequences we used to train stateless RNNs). When creating the Dataset, we must therefore use shift=n_steps (instead of shift=1) when calling the window() method. Moreover, we must obviously not call the shuffle() method.

In [38]:
tf.random.set_seed(42)

In [39]:
dataset = tf.data.Dataset.from_tensor_slices(encoded[:train_size])
dataset = dataset.window(window_length, shift=n_steps, drop_remainder=True)
dataset = dataset.flat_map(lambda window: window.batch(window_length))
dataset = dataset.batch(1)
dataset = dataset.map(lambda windows: (windows[:, :-1], windows[:, 1:]))
dataset = dataset.map(
    lambda X_batch, Y_batch: (tf.one_hot(X_batch, depth=max_id), Y_batch))
dataset = dataset.prefetch(1)

In [40]:
batch_size = 32
encoded_parts = np.array_split(encoded[:train_size], batch_size)
datasets = []
for encoded_part in encoded_parts:
    dataset = tf.data.Dataset.from_tensor_slices(encoded_part)
    dataset = dataset.window(window_length, shift=n_steps, drop_remainder=True)
    dataset = dataset.flat_map(lambda window: window.batch(window_length))
    datasets.append(dataset)
dataset = tf.data.Dataset.zip(tuple(datasets)).map(lambda *windows: tf.stack(windows))
dataset = dataset.repeat().map(lambda windows: (windows[:, :-1], windows[:, 1:]))
dataset = dataset.map(
    lambda X_batch, Y_batch: (tf.one_hot(X_batch, depth=max_id), Y_batch))
dataset = dataset.prefetch(1)
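
The batching trick above can be checked on a toy sequence. The following pure-Python sketch is an approximation: `sliding_windows` is a hypothetical helper, and the text is split into equal parts with list slicing (whereas `np.array_split()` also handles lengths that do not divide evenly):

```python
def sliding_windows(sequence, window_length, shift):
    """Full windows only, like window(..., drop_remainder=True)."""
    return [sequence[start:start + window_length]
            for start in range(0, len(sequence) - window_length + 1, shift)]


n_steps = 3
window_length = n_steps + 1
batch_size = 2
text_ids = list(range(20))  # toy stand-in for the encoded text

# Split the text into batch_size contiguous parts, then window each part.
part_size = len(text_ids) // batch_size
parts = [text_ids[i * part_size:(i + 1) * part_size] for i in range(batch_size)]
per_part = [sliding_windows(part, window_length, shift=n_steps) for part in parts]

# Batch k holds the k-th window of every part (the role of Dataset.zip + tf.stack).
batches = [list(rows) for rows in zip(*per_part)]
for batch in batches:
    print(batch)
```

Row `i` of each batch starts with the last item of row `i` in the previous batch, so once the final item is sliced off as a target, the inputs continue exactly where the previous inputs stopped.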

In [41]:
model = keras.models.Sequential([
    keras.layers.GRU(128, return_sequences=True, stateful=True,
                     dropout=0.2, recurrent_dropout=0.2,
                     batch_input_shape=[batch_size, None, max_id]),
    keras.layers.GRU(128, return_sequences=True, stateful=True,
                     dropout=0.2, recurrent_dropout=0.2),
    keras.layers.TimeDistributed(keras.layers.Dense(max_id,
                                                    activation="softmax"))
])

class ResetStatesCallback(keras.callbacks.Callback):
    def on_epoch_begin(self, epoch, logs):
        self.model.reset_states()



In [42]:
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
steps_per_epoch = train_size // batch_size // n_steps
history = model.fit(dataset, steps_per_epoch=steps_per_epoch, epochs=40,
                    callbacks=[ResetStatesCallback(), NvidiaUtilizationCallback()])

 21%  GPU utilization
Epoch 1/40
 66%  GPU utilization
Epoch 2/40
 45%  GPU utilization
Epoch 3/40
 63%  GPU utilization
Epoch 4/40
 54%  GPU utilization
Epoch 5/40
 82%  GPU utilization
Epoch 6/40
 60%  GPU utilization
Epoch 7/40
 48%  GPU utilization
Epoch 8/40
 44%  GPU utilization
Epoch 9/40
 64%  GPU utilization
Epoch 10/40
 62%  GPU utilization
Epoch 11/40
 68%  GPU utilization
Epoch 12/40
 59%  GPU utilization
Epoch 13/40
 33%  GPU utilization
Epoch 14/40
 51%  GPU utilization
Epoch 15/40
 67%  GPU utilization
Epoch 16/40
 64%  GPU utilization
Epoch 17/40
 62%  GPU utilization
Epoch 18/40
 58%  GPU utilization
Epoch 19/40
 58%  GPU utilization
Epoch 20/40
 66%  GPU utilization
Epoch 21/40
 36%  GPU utilization
Epoch 22/40
 79%  GPU utilization
Epoch 23/40
 73%  GPU utilization
Epoch 24/40
 80%  GPU utilization
Epoch 25/40
 70%  GPU utilization
Epoch 26/40
 52%  GPU utilization
Epoch 27/40
 80%  GPU utilization
Epoch 28/40
 33%  GPU utilization
Epoch 29/40
 44%  GPU utilization
E


To use the model with different batch sizes, we need to create a stateless copy. We can get rid of dropout since it is only used during training:

In [43]:
stateless_model = keras.models.Sequential([
    keras.layers.GRU(128, return_sequences=True, input_shape=[None, max_id]),
    keras.layers.GRU(128, return_sequences=True),
    keras.layers.TimeDistributed(keras.layers.Dense(max_id, activation="softmax"))
])




To set the weights, we first need to build the model (so the weights get created):

In [44]:
stateless_model.build(tf.TensorShape([None, None, max_id]))

In [45]:
stateless_model.set_weights(model.get_weights())
model = stateless_model

In [46]:
text_1 = complete_text("t", temperature=0.2)



In [47]:
print(text_1)

ther command.

coriolanus:
what thou shall be the c


In [48]:
text_2 = complete_text("w", temperature=1)

In [49]:
print(text_2)

wick:
engrock of honry to-orning.

romeo:
this in h


In [50]:
text_3 = complete_text("i", temperature=2)

In [51]:
print(text_3)

iv!
wel?

juliet:
bevil! dear-brul'-jueema's dandem


In [52]:
text_4 = complete_text("I shall love", temperature=0.5, n_chars=100)



In [53]:
print(text_4)

I shall love his more and say her to did been me to the sovereign and shows as the voices as the
countence them 


In [54]:
text_5 = complete_text("love", temperature=1, n_chars = 150)



In [55]:
print(text_5)

love
the king well, out in requirt: no, no wincess'ed
to catcin of the nears, and when his counsoly were court.
to the slords the soffer hast his looking 


## Sentiment Analysis

In [56]:
tf.random.set_seed(42)

In [57]:
(X_train, y_train), (X_test, y_test) = keras.datasets.imdb.load_data()



In [58]:
X_train[0][:10]

[1, 14, 22, 16, 43, 530, 973, 1622, 1385, 65]

In [59]:
word_index = keras.datasets.imdb.get_word_index()
id_to_word = {id_ + 3: word for word, id_ in word_index.items()}
for id_, token in enumerate(("<pad>", "<sos>", "<unk>")):
    id_to_word[id_] = token
" ".join([id_to_word[id_] for id_ in X_train[0][:10]])

'<sos> this film was just brilliant casting location scenery story'
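
The `+ 3` offset in the decoding code reflects the default settings of `load_data()`, which reserve IDs 0, 1 and 2 for padding, start-of-sequence and unknown tokens. A toy version, with a made-up three-word `toy_word_index` standing in for the real `get_word_index()` result, shows the mapping:

```python
# Hypothetical miniature of the dictionary returned by get_word_index()
toy_word_index = {"the": 1, "film": 2, "was": 3}

# Shift every word ID by 3 to make room for the three special tokens.
id_to_word = {id_ + 3: word for word, id_ in toy_word_index.items()}
for id_, token in enumerate(("<pad>", "<sos>", "<unk>")):
    id_to_word[id_] = token

encoded_review = [1, 4, 5, 6, 2, 0]  # toy review, already encoded
decoded = " ".join(id_to_word[id_] for id_ in encoded_review)
print(decoded)
```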

In [60]:
import tensorflow_datasets as tfds

datasets, info = tfds.load("imdb_reviews", as_supervised=True, with_info=True)

In [61]:
train_size = info.splits["train"].num_examples
test_size = info.splits["test"].num_examples

In [62]:
train_size, test_size

(25000, 25000)

In [63]:
datasets.keys()

dict_keys(['test', 'train', 'unsupervised'])

In [64]:
for X_batch, y_batch in datasets["train"].batch(2).take(1):
    for review, label in zip(X_batch.numpy(), y_batch.numpy()):
        print("Review:", review.decode("utf-8")[:200], "...")
        print("Label:", label, "= Positive" if label else "= Negative")
        print()

Review: This was an absolutely terrible movie. Don't be lured in by Christopher Walken or Michael Ironside. Both are great actors, but this must simply be their worst role in history. Even their great acting  ...
Label: 0 = Negative

Review: I have been known to fall asleep during films, but this is usually due to a combination of things including, really tired, being warm and comfortable on the sette and having just eaten a lot. However  ...
Label: 0 = Negative



In [65]:
def preprocess(X_batch, y_batch):
    X_batch = tf.strings.substr(X_batch, 0, 300)
    X_batch = tf.strings.regex_replace(X_batch, rb"<br\s*/?>", b" ")
    X_batch = tf.strings.regex_replace(X_batch, b"[^a-zA-Z']", b" ")
    X_batch = tf.strings.split(X_batch)
    return X_batch.to_tensor(default_value=b"<pad>"), y_batch
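
For intuition, the same preprocessing steps can be sketched with Python's `re` module on ordinary strings. This is an approximation, not an exact equivalent: `preprocess_text` and `pad_batch` are hypothetical helpers, and Python slicing counts characters whereas `tf.strings.substr()` counts bytes by default:

```python
import re


def preprocess_text(review, max_chars=300):
    """Truncate, drop <br /> tags, keep only letters and apostrophes, then split."""
    review = review[:max_chars]
    review = re.sub(r"<br\s*/?>", " ", review)
    review = re.sub(r"[^a-zA-Z']", " ", review)
    return review.split()


def pad_batch(token_lists, pad_token="<pad>"):
    """Pad every token list to the length of the longest one in the batch."""
    longest = max(len(tokens) for tokens in token_lists)
    return [tokens + [pad_token] * (longest - len(tokens)) for tokens in token_lists]


reviews = ["Great film!<br />Don't miss it, 10/10.", "Awful."]
batch = pad_batch([preprocess_text(review) for review in reviews])
for tokens in batch:
    print(tokens)
```

Truncating to 300 characters keeps training fast, at the cost of ignoring everything a reviewer says after the opening sentences.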

In [66]:
from collections import Counter
vocabulary = Counter()
for X_batch, y_batch in datasets["train"].batch(32).map(preprocess):
    for review in X_batch:
        vocabulary.update(list(review.numpy()))

In [67]:
vocabulary.most_common()[:3]

[(b'<pad>', 214309), (b'the', 61137), (b'a', 38564)]

In [68]:
vocab_size = 10000
truncated_vocabulary = [
    word for word, count in vocabulary.most_common()[:vocab_size]]

In [69]:
truncated_vocabulary

[b'<pad>',
 b'the',
 b'a',
 b'of',
 b'and',
 b'to',
 b'I',
 b'is',
 b'in',
 b'this',
 b'it',
 b'was',
 b'movie',
 b'that',
 b'The',
 b'film',
 b'with',
 b'for',
 b'as',
 b'on',
 b'but',
 b'have',
 b'This',
 b'one',
 b'not',
 b'be',
 b'are',
 b'you',
 b'an',
 b'at',
 b'about',
 b'by',
 b'all',
 b'his',
 b'so',
 b'like',
 b'from',
 b'who',
 b'has',
 b'It',
 b'good',
 b'my',
 b'just',
 b'very',
 b'out',
 b'or',
 b'story',
 b'some',
 b'time',
 b'had',
 b'he',
 b'they',
 b'really',
 b'me',
 b'when',
 b'what',
 b'first',
 b'movies',
 b'bad',
 b'see',
 b'seen',
 b'up',
 b'only',
 b'were',
 b"it's",
 b'would',
 b'more',
 b'made',
 b'great',
 b'can',
 b'been',
 b'i',
 b'her',
 b'no',
 b'A',
 b'which',
 b'even',
 b'films',
 b'there',
 b'ever',
 b'people',
 b'much',
 b'because',
 b'most',
 b'plot',
 b'if',
 b'than',
 b'acting',
 b'get',
 b'their',
 b'well',
 b'into',
 b'how',
 b'best',
 b'think',
 b'other',
 b'its',
 b"It's",
 b'saw',
 b'could',
 b'watch',
 b'many',
 b"don't",
 b'do',
 b'will',
 

Now we need to add a preprocessing step to replace each word with its ID (i.e., its index in the vocabulary).

In [70]:
words = tf.constant(truncated_vocabulary)
word_ids = tf.range(len(truncated_vocabulary), dtype=tf.int64)
vocab_init = tf.lookup.KeyValueTensorInitializer(words, word_ids)
num_oov_buckets = 1000
table = tf.lookup.StaticVocabularyTable(vocab_init, num_oov_buckets)

In [71]:
table.lookup(tf.constant([b"This movie was awesome".split()]))

<tf.Tensor: shape=(1, 4), dtype=int64, numpy=array([[ 22,  12,  11, 902]])>

In [72]:
def encode_words(X_batch, y_batch):
    return table.lookup(X_batch), y_batch
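
The idea behind a vocabulary table with out-of-vocabulary (oov) buckets can be sketched in pure Python. Known words map to their index in the vocabulary, and unknown words are hashed into one of the extra buckets placed after the vocabulary. The helper `build_lookup` is hypothetical, and `zlib.crc32` is only a stand-in hash chosen for this sketch (TensorFlow uses its own hash function, so bucket assignments will differ):

```python
import zlib


def build_lookup(vocabulary, num_oov_buckets):
    """Return a function mapping a word to its vocabulary ID or to an oov bucket ID."""
    word_to_id = {word: index for index, word in enumerate(vocabulary)}
    vocab_size = len(vocabulary)

    def lookup(word):
        if word in word_to_id:
            return word_to_id[word]
        # Unknown word: pick a bucket deterministically from a hash of the word.
        bucket = zlib.crc32(word.encode("utf-8")) % num_oov_buckets
        return vocab_size + bucket

    return lookup


lookup = build_lookup(["<pad>", "the", "movie", "was"], num_oov_buckets=3)
ids = [lookup(word) for word in "the movie was awesome".split()]
print(ids)
```

With more buckets, fewer unknown words collide on the same ID, which is why the embedding layer later needs `vocab_size + num_oov_buckets` rows.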

In [73]:
train_set = datasets["train"].batch(32).map(preprocess)
train_set = train_set.map(encode_words).prefetch(1)

In [74]:
for X_batch, y_batch in train_set.take(1):
    print(X_batch)
    print(y_batch)

tf.Tensor(
[[  22   11   28 ...    0    0    0]
 [   6   21   70 ...    0    0    0]
 [4099 6881    1 ...    0    0    0]
 ...
 [  22   12  118 ...  331 1047    0]
 [1757 4101  451 ...    0    0    0]
 [3365 4392    6 ...    0    0    0]], shape=(32, 60), dtype=int64)
tf.Tensor([0 0 0 1 1 1 0 0 0 0 0 1 1 0 1 0 1 1 1 0 1 1 1 1 1 0 0 0 1 0 0 0], shape=(32,), dtype=int64)


In [75]:
test_set = datasets["test"].batch(32).map(preprocess)
test_set = test_set.map(encode_words).prefetch(1)

In [76]:
embed_size = 128
model = keras.models.Sequential([
    keras.layers.Embedding(vocab_size + num_oov_buckets, embed_size,
                           mask_zero=True, # not shown in the book
                           input_shape=[None]),
    keras.layers.GRU(128, return_sequences=True, dropout=0.1, recurrent_dropout=0.1),
    keras.layers.GRU(128, dropout=0.1, recurrent_dropout=0.1),
    keras.layers.Dense(1, activation="sigmoid")
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
history = model.fit(train_set, epochs=10, validation_data=test_set,
                    callbacks=[NvidiaUtilizationCallback()])









  0%  GPU utilization
Epoch 1/10
 17%  GPU utilization
Epoch 2/10
 22%  GPU utilization
Epoch 3/10
 22%  GPU utilization
Epoch 4/10
 24%  GPU utilization
Epoch 5/10
 42%  GPU utilization
Epoch 6/10
 28%  GPU utilization
Epoch 7/10
 19%  GPU utilization
Epoch 8/10
 21%  GPU utilization
Epoch 9/10
 27%  GPU utilization
Epoch 10/10


In [77]:
# Equivalent model built with the functional API and explicit masking (no dropout)
#------------------------------

# K = keras.backend
# embed_size = 128
# inputs = keras.layers.Input(shape=[None])
# mask = keras.layers.Lambda(lambda inputs: K.not_equal(inputs, 0))(inputs)
# z = keras.layers.Embedding(vocab_size + num_oov_buckets, embed_size)(inputs)
# z = keras.layers.GRU(128, return_sequences=True)(z, mask=mask)
# z = keras.layers.GRU(128)(z, mask=mask)
# outputs = keras.layers.Dense(1, activation="sigmoid")(z)
# model = keras.models.Model(inputs=[inputs], outputs=[outputs])
# model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
# history = model.fit(train_set, epochs=10, validation_data = test_set)   

In [78]:
def giveSentiment(text):
    return model.predict(table.lookup(tf.constant([text.split()])))

In [79]:
giveSentiment(b"Omelets are to die for!")

array([[0.9215863]], dtype=float32)

## Reusing Pretrained Embeddings

In [80]:
import os
TFHUB_CACHE_DIR = os.path.join(os.curdir, "my_tfhub_cache")
os.environ["TFHUB_CACHE_DIR"] = TFHUB_CACHE_DIR

In [81]:
import tensorflow_hub as hub

In [88]:
model = keras.Sequential([
    hub.KerasLayer("https://tfhub.dev/google/nnlm-en-dim128/2", input_shape=[], dtype=tf.string),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid")
])

In [83]:
for dirpath, dirnames, filenames in os.walk(TFHUB_CACHE_DIR):
    for filename in filenames:
        print(os.path.join(dirpath, filename))

./my_tfhub_cache/29abffb443cb0a0ca9c72e8e3863b76d85028490.descriptor.txt
./my_tfhub_cache/29abffb443cb0a0ca9c72e8e3863b76d85028490/saved_model.pb
./my_tfhub_cache/29abffb443cb0a0ca9c72e8e3863b76d85028490/assets/tokens.txt
./my_tfhub_cache/29abffb443cb0a0ca9c72e8e3863b76d85028490/variables/variables.data-00000-of-00001
./my_tfhub_cache/29abffb443cb0a0ca9c72e8e3863b76d85028490/variables/variables.index


In [None]:
datasets, info = tfds.load("imdb_reviews", as_supervised=True, with_info=True)
train_size = info.splits["train"].num_examples
batch_size = 32
train_set = datasets["train"].batch(batch_size).prefetch(1)
test_set = datasets["test"].batch(batch_size).prefetch(1)

In [93]:
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
history = model.fit(train_set, epochs=100, validation_data=test_set,
                    callbacks=[NvidiaUtilizationCallback()])

  4%  GPU utilization
Epoch 1/100
 13%  GPU utilization
Epoch 2/100
  6%  GPU utilization
Epoch 3/100
  2%  GPU utilization
Epoch 4/100
 10%  GPU utilization
Epoch 5/100
  2%  GPU utilization
Epoch 6/100
 32%  GPU utilization
Epoch 7/100
  8%  GPU utilization
Epoch 8/100
 32%  GPU utilization
Epoch 9/100
  5%  GPU utilization
Epoch 10/100
 21%  GPU utilization
Epoch 11/100
 25%  GPU utilization
Epoch 12/100
 24%  GPU utilization
Epoch 13/100
 13%  GPU utilization
Epoch 14/100
  7%  GPU utilization
Epoch 15/100
 13%  GPU utilization
Epoch 16/100
 18%  GPU utilization
Epoch 17/100
 19%  GPU utilization
Epoch 18/100
 14%  GPU utilization
Epoch 19/100
  7%  GPU utilization
Epoch 20/100
 21%  GPU utilization
Epoch 21/100
  6%  GPU utilization
Epoch 22/100
 29%  GPU utilization
Epoch 23/100
 16%  GPU utilization
Epoch 24/100
 16%  GPU utilization
Epoch 25/100
  9%  GPU utilization
Epoch 26/100
  4%  GPU utilization
Epoch 27/100
 21%  GPU utilization
Epoch 28/100
 21%  GPU utilization
Epoch 2

In [85]:
def giveSentiment(text):
    return model.predict([text])

In [94]:
giveSentiment("The movie was not good")

array([[0.07112711]], dtype=float32)

In [95]:
giveSentiment("The movies was to die for!")

array([[0.11201944]], dtype=float32)

## An Encoder–Decoder Network for Neural Machine Translation