In [1]:
!nvidia-smi

Mon Jun 29 17:03:41 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.36.06    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   70C    P8    36W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                 ERR! |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

# Natural Language Processing with RNNs and Attention

## Generating Shakespearean Text Using a Character RNN

### Creating the Training Dataset

In [2]:
import tensorflow as tf
import tensorflow.keras as keras
import numpy as np
import matplotlib.pyplot as plt

In [3]:
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
if gpus:  # skip the memory cap on machines without a GPU (gpus[0] would raise IndexError)
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0], [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])

In [4]:
shakespeare_url = "https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt"
filepath = keras.utils.get_file("shakespeare.txt", shakespeare_url)
with open(filepath) as f:
    shakespeare_text = f.read()

Downloading data from https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt


In [5]:
tokenizer = keras.preprocessing.text.Tokenizer(char_level=True)
tokenizer.fit_on_texts(shakespeare_text)

In [6]:
tokenizer.texts_to_sequences(['First'])

[[20, 6, 9, 8, 3]]

In [7]:
tokenizer.sequences_to_texts([[20,6,9,8,3]])

['f i r s t']

In [8]:
max_id = len(tokenizer.word_index)

In [9]:
max_id

39

In [10]:
dataset_size = tokenizer.document_count

In [11]:
dataset_size

1115394

In [12]:
[encoded] = np.array(tokenizer.texts_to_sequences([shakespeare_text])) - 1
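
To make the character encoding above more concrete, here is a minimal pure-Python sketch of what a character-level tokenizer does: lowercase the text, rank characters by frequency, and hand out 1-based IDs. The helper `build_char_index` and the toy string are illustrative assumptions, not part of Keras; the real `Tokenizer` has more options (filters, OOV tokens) that are ignored here.

```python
from collections import Counter

def build_char_index(text):
    """Map each lowercased character to an ID, most frequent first, starting at 1."""
    counts = Counter(text.lower())
    return {ch: i for i, (ch, _) in enumerate(counts.most_common(), start=1)}

toy_text = "First Citizen"
toy_char_index = build_char_index(toy_text)

# Subtract 1 so the IDs run from 0 to (number of distinct characters - 1),
# mirroring the "- 1" applied to the tokenizer output in the cell above.
toy_encoded = [toy_char_index[ch] - 1 for ch in toy_text.lower()]
print(toy_char_index)
print(toy_encoded)
```

The shift to 0-based IDs matters later: one-hot encoding with `depth=max_id` expects IDs in the range 0 to `max_id - 1`.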

### How to Split a Sequential Dataset

Let’s take the first 90% of the text for the training set (keeping the rest for the validation set and the test set), and create a tf.data.Dataset that will return each character one by one from this set:

In [13]:
train_size = dataset_size * 90 // 100
dataset = tf.data.Dataset.from_tensor_slices(encoded[:train_size])
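
The notebook only builds the training slice. As a hedged sketch, the remaining 10% could be cut into validation and test ranges as follows; the 5%/5% division and the helper name `split_bounds` are assumptions made for illustration. Because the text is sequential, the three sets are contiguous, non-overlapping ranges rather than random samples.

```python
def split_bounds(n, train_pct=90, valid_pct=5):
    """Return the end indices of the training and validation ranges."""
    train_end = n * train_pct // 100
    valid_end = n * (train_pct + valid_pct) // 100
    return train_end, valid_end

toy_dataset_size = 1115394  # number of characters reported by the notebook
train_end, valid_end = split_bounds(toy_dataset_size)

# Training: [0, train_end), validation: [train_end, valid_end), test: the rest.
print(train_end, valid_end, toy_dataset_size - valid_end)
```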

### Chopping the Sequential Dataset into Multiple Windows

The training set now consists of a single sequence of over a million characters, so we can’t just train the neural network directly on it: the RNN would be equivalent to a deep net with over a million layers, and we would have a single (very long) instance to train it. Instead, we will use the dataset’s window() method to convert this long sequence of characters into many smaller windows of text. Every instance in the dataset will be a fairly short substring of the whole text, and the RNN will be unrolled only over the length of these substrings. This is called truncated backpropagation through time. 

In [14]:
n_steps = 100
window_length = n_steps + 1

In [15]:
dataset = dataset.window(window_length, shift=1, drop_remainder=True)

The window() method creates a dataset that contains windows, each of which is also represented as a dataset. It’s a nested dataset, analogous to a list of lists. This is useful when you want to transform each window by calling its dataset methods (e.g., to shuffle them or batch them). However, we cannot use a nested dataset directly for training, as our model will expect tensors as input, not datasets. So, we must call the flat_map() method: it converts a nested dataset into a flat dataset (one that does not contain datasets).

In [16]:
dataset = dataset.flat_map(lambda window: window.batch(window_length))

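We need to shuffle these windows. Then we can batch them and separate the inputs (the first 100 characters of each window) from the targets (the last 100 characters, i.e., the same window shifted one step ahead, so that at each time step the target is the next character):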

In [17]:
batch_size = 32
dataset = dataset.shuffle(10000).batch(batch_size)
dataset = dataset.map(lambda windows: (windows[:, :-1], windows[:, 1:]))
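
The windowing and input/target split can be followed on a toy string without TensorFlow. The sketch below is only an illustration of the idea: `make_windows` is a hypothetical helper that mimics `window(..., shift=1, drop_remainder=True)` followed by `flat_map`, and the window length is shortened to 5 characters.

```python
def make_windows(sequence, window_length, shift=1):
    """Return every full window of `window_length` items, stepping by `shift`."""
    last_start = len(sequence) - window_length
    return [sequence[i:i + window_length] for i in range(0, last_start + 1, shift)]

toy_text = "to be or not"
toy_n_steps = 4
toy_windows = make_windows(toy_text, window_length=toy_n_steps + 1)

# Inputs drop the last character; targets drop the first one,
# so the target at each step is the character that follows the input.
toy_pairs = [(w[:-1], w[1:]) for w in toy_windows]
for x, y in toy_pairs[:3]:
    print(repr(x), "->", repr(y))
```

Each target is the input shifted by one position, which is why the model needs `return_sequences=True` and predicts a character at every time step.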

Categorical input features should generally be encoded, usually as one-hot vectors or as embeddings. Here, we will encode each character using a one-hot vector because there are fairly few distinct characters (only 39):

In [18]:
dataset = dataset.map(
    lambda X_batch, Y_batch: (tf.one_hot(X_batch, depth=max_id), Y_batch))
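
For readers who want to see what one-hot encoding produces, here is a small pure-Python sketch; `one_hot` is a stand-in written for illustration, not the TensorFlow function. Each ID becomes a vector of length `depth` with a single 1.0 at the position of the ID.

```python
def one_hot(ids, depth):
    """Encode each integer ID as a list of `depth` floats with a single 1.0."""
    return [[1.0 if j == i else 0.0 for j in range(depth)] for i in ids]

toy_ids = [2, 0, 1]
toy_vectors = one_hot(toy_ids, depth=3)
for i, vec in zip(toy_ids, toy_vectors):
    print(i, vec)
```

Applied to a batch of 32 windows of 100 characters with 39 distinct characters, the same idea gives an input tensor of shape `[32, 100, 39]`.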

In [19]:
dataset = dataset.prefetch(1)

### Building and Training the Char-RNN Model

In [20]:
import os

checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

checkpoint_callback=tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True)

In [21]:
class NvidiaUtilizationCallback(keras.callbacks.Callback):
    def on_epoch_begin(self, epoch, logs):
        text = !nvidia-smi
        text = text[9][60:65] + ' GPU utilization'
        print(text)

In [22]:
model = keras.models.Sequential([
    keras.layers.GRU(128, return_sequences=True, input_shape=[None, max_id]),
    keras.layers.TimeDistributed(keras.layers.Dense(max_id, activation='softmax'))    
])

model.compile(loss = 'sparse_categorical_crossentropy', optimizer = 'adam')
steps_per_epoch = train_size // batch_size // n_steps

history = model.fit(dataset, epochs = 40, steps_per_epoch=steps_per_epoch, callbacks=[checkpoint_callback, NvidiaUtilizationCallback()])

  0%  GPU utilization
Epoch 1/40
  0%  GPU utilization
Epoch 2/40
  0%  GPU utilization
Epoch 3/40
  0%  GPU utilization
Epoch 4/40
  0%  GPU utilization
Epoch 5/40
  0%  GPU utilization
Epoch 6/40
  0%  GPU utilization
Epoch 7/40
  0%  GPU utilization
Epoch 8/40
  0%  GPU utilization
Epoch 9/40
  0%  GPU utilization
Epoch 10/40
  0%  GPU utilization
Epoch 11/40
  0%  GPU utilization
Epoch 12/40
  0%  GPU utilization
Epoch 13/40
  0%  GPU utilization
Epoch 14/40
  0%  GPU utilization
Epoch 15/40
  0%  GPU utilization
Epoch 16/40
  0%  GPU utilization
Epoch 17/40
  0%  GPU utilization
Epoch 18/40
  0%  GPU utilization
Epoch 19/40
  0%  GPU utilization
Epoch 20/40
  0%  GPU utilization
Epoch 21/40
  0%  GPU utilization
Epoch 22/40
  0%  GPU utilization
Epoch 23/40
  0%  GPU utilization
Epoch 24/40
  0%  GPU utilization
Epoch 25/40
  0%  GPU utilization
Epoch 26/40
  0%  GPU utilization
Epoch 27/40
  0%  GPU utilization
Epoch 28/40
  0%  GPU utilization
Epoch 29/40
  0%  GPU utilization
E

### Using the Model to Generate Text

In [23]:
def preprocess(texts):
    X = np.array(tokenizer.texts_to_sequences(texts)) - 1
    return tf.one_hot(X, max_id)

In [24]:
X_new = preprocess(["How are yo"])

In [25]:
Y_pred = np.argmax(model.predict(X_new), axis=-1)

In [26]:
tokenizer.sequences_to_texts(Y_pred + 1)[0][-1]

'u'

### Generating Fake Shakespearean Text

In [27]:
def next_char(text, temperature=1):
    X_new = preprocess([text])
    y_proba = model.predict(X_new)[0, -1:, :]
    rescaled_logits = tf.math.log(y_proba) / temperature
    char_id = tf.random.categorical(rescaled_logits, num_samples=1) + 1
    return tokenizer.sequences_to_texts(char_id.numpy())[0]

In [28]:
def complete_text(text, n_chars=50, temperature=1):
    for _ in range(n_chars):
        text += next_char(text, temperature)
    return text
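
The effect of the `temperature` argument is easier to see on a small distribution. The sketch below is a pure-Python illustration under the assumption that sampling from `log(p) / temperature` is equivalent to sampling from the renormalized distribution it computes; `apply_temperature` and the toy probabilities are made up for the example.

```python
import math

def apply_temperature(probas, temperature):
    """Divide the log-probabilities by `temperature`, then renormalize (softmax)."""
    logits = [math.log(p) / temperature for p in probas]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

toy_probas = [0.6, 0.3, 0.1]
for t in (0.2, 1.0, 2.0):
    print(t, [round(p, 3) for p in apply_temperature(toy_probas, t)])
```

A temperature below 1 sharpens the distribution toward the most likely character, a temperature of 1 leaves it unchanged, and a temperature above 1 flattens it, which is consistent with the increasingly erratic samples produced at higher temperatures.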

In [29]:
text_1 = complete_text("t", temperature=0.2)



In [30]:
print(text_1)

the throw at all this present, and not this not thi


In [31]:
text_2 = complete_text("w", temperature=1)

In [32]:
print(text_2)

wall is must his rubose of bolingbroke
of say the l


In [33]:
text_3 = complete_text("i", temperature=2)

In [34]:
print(text_3)

id-sennd.
if whe not! that pissy, thinksperes
con,a


In [35]:
text_4 = complete_text("I shall love", temperature=0.5, n_chars=100)



In [36]:
print(text_4)

I shall love and tell all thou werk head?

duke of york:
seer would confood,
and lies,
that king of my trialter 


In [37]:
text_5 = complete_text("love", temperature=1, n_chars = 150)



In [38]:
print(text_5)

love,
or glard, mind;
or your in the liah handy;
what was comfort citilip, in joan our heads? who ere tenderen war dithersed to his noght may he secin; an


### Stateful RNN

First, note that a stateful RNN only makes sense if each input sequence in a batch starts exactly where the corresponding sequence in the previous batch left off. So the first thing we need to do to build a stateful RNN is to use sequential and nonoverlapping input sequences (rather than the shuffled and overlapping sequences we used to train stateless RNNs). When creating the Dataset, we must therefore use shift=n_steps (instead of shift=1) when calling the window() method. Moreover, we must obviously not call the shuffle() method.

In [39]:
tf.random.set_seed(42)

In [40]:
dataset = tf.data.Dataset.from_tensor_slices(encoded[:train_size])
dataset = dataset.window(window_length, shift=n_steps, drop_remainder=True)
dataset = dataset.flat_map(lambda window: window.batch(window_length))
dataset = dataset.batch(1)
dataset = dataset.map(lambda windows: (windows[:, :-1], windows[:, 1:]))
dataset = dataset.map(
    lambda X_batch, Y_batch: (tf.one_hot(X_batch, depth=max_id), Y_batch))
dataset = dataset.prefetch(1)

In [41]:
batch_size = 32
encoded_parts = np.array_split(encoded[:train_size], batch_size)
datasets = []
for encoded_part in encoded_parts:
    dataset = tf.data.Dataset.from_tensor_slices(encoded_part)
    dataset = dataset.window(window_length, shift=n_steps, drop_remainder=True)
    dataset = dataset.flat_map(lambda window: window.batch(window_length))
    datasets.append(dataset)
dataset = tf.data.Dataset.zip(tuple(datasets)).map(lambda *windows: tf.stack(windows))
dataset = dataset.repeat().map(lambda windows: (windows[:, :-1], windows[:, 1:]))
dataset = dataset.map(
    lambda X_batch, Y_batch: (tf.one_hot(X_batch, depth=max_id), Y_batch))
dataset = dataset.prefetch(1)
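
The batching trick above (split the text into `batch_size` contiguous parts, window each part, then stack one window from each part) can be checked on a toy sequence. This is a pure-Python sketch with hypothetical helpers; `split_evenly` is meant to mirror how `np.array_split` sizes its chunks.

```python
def sequential_windows(seq, window_length, shift):
    """Full windows of `window_length` items, stepping by `shift`."""
    return [seq[i:i + window_length]
            for i in range(0, len(seq) - window_length + 1, shift)]

def split_evenly(seq, parts):
    """Split `seq` into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(seq), parts)
    chunks, start = [], 0
    for k in range(parts):
        end = start + base + (1 if k < extra else 0)
        chunks.append(seq[start:end])
        start = end
    return chunks

toy_sequence = list(range(26))
toy_steps, toy_batch = 4, 2
toy_parts = split_evenly(toy_sequence, toy_batch)
per_part = [sequential_windows(p, toy_steps + 1, shift=toy_steps) for p in toy_parts]

# Batch k holds the k-th window of every part, one row per part.
toy_batches = list(zip(*per_part))
for k, batch in enumerate(toy_batches):
    print(k, [w[:-1] for w in batch])  # the inputs of each row
```

Row `i` of each batch continues exactly where row `i` of the previous batch stopped, which is the property a stateful RNN relies on when it carries its hidden state from one batch to the next.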

In [42]:
model = keras.models.Sequential([
    keras.layers.GRU(128, return_sequences=True, stateful=True,
                     dropout=0.2, recurrent_dropout=0.2,
                     batch_input_shape=[batch_size, None, max_id]),
    keras.layers.GRU(128, return_sequences=True, stateful=True,
                     dropout=0.2, recurrent_dropout=0.2),
    keras.layers.TimeDistributed(keras.layers.Dense(max_id,
                                                    activation="softmax"))
])

class ResetStatesCallback(keras.callbacks.Callback):
    def on_epoch_begin(self, epoch, logs):
        self.model.reset_states()



In [43]:
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
steps_per_epoch = train_size // batch_size // n_steps
history = model.fit(dataset, steps_per_epoch=steps_per_epoch, epochs=40,
                    callbacks=[ResetStatesCallback(), NvidiaUtilizationCallback()])

  0%  GPU utilization
Epoch 1/40
  0%  GPU utilization
Epoch 2/40
  0%  GPU utilization
Epoch 3/40
  0%  GPU utilization
Epoch 4/40
  0%  GPU utilization
Epoch 5/40
  0%  GPU utilization
Epoch 6/40
  0%  GPU utilization
Epoch 7/40
  0%  GPU utilization
Epoch 8/40
  0%  GPU utilization
Epoch 9/40
  0%  GPU utilization
Epoch 10/40
  0%  GPU utilization
Epoch 11/40
  0%  GPU utilization
Epoch 12/40
  0%  GPU utilization
Epoch 13/40
  0%  GPU utilization
Epoch 14/40
  0%  GPU utilization
Epoch 15/40
  0%  GPU utilization
Epoch 16/40
  0%  GPU utilization
Epoch 17/40
  0%  GPU utilization
Epoch 18/40
  0%  GPU utilization
Epoch 19/40
  0%  GPU utilization
Epoch 20/40
  0%  GPU utilization
Epoch 21/40
  0%  GPU utilization
Epoch 22/40
  0%  GPU utilization
Epoch 23/40
  0%  GPU utilization
Epoch 24/40
  0%  GPU utilization
Epoch 25/40
  0%  GPU utilization
Epoch 26/40
  0%  GPU utilization
Epoch 27/40
  0%  GPU utilization
Epoch 28/40
  0%  GPU utilization
Epoch 29/40
  0%  GPU utilization
E


To use the model with different batch sizes, we need to create a stateless copy. We can get rid of dropout since it is only used during training:

In [44]:
stateless_model = keras.models.Sequential([
    keras.layers.GRU(128, return_sequences=True, input_shape=[None, max_id]),
    keras.layers.GRU(128, return_sequences=True),
    keras.layers.TimeDistributed(keras.layers.Dense(max_id, activation="softmax"))
])




To set the weights, we first need to build the model (so the weights get created):

In [45]:
stateless_model.build(tf.TensorShape([None, None, max_id]))

In [46]:
stateless_model.set_weights(model.get_weights())
model = stateless_model

In [47]:
text_1 = complete_text("t", temperature=0.2)



In [48]:
print(text_1)

there is a bear
the provost to the duke of the coun


In [49]:
text_2 = complete_text("w", temperature=1)

In [50]:
print(text_2)

wick:
i crobje constrict joy their rastalpent me
th


In [51]:
text_3 = complete_text("i", temperature=2)

In [52]:
print(text_3)

iv!
well, judgice bevilg deai. ould-yaarm-'troan,
m


In [53]:
text_4 = complete_text("I shall love", temperature=0.5, n_chars=100)



In [54]:
print(text_4)

I shall love here to shall as it it.

leontes:
he therefore he heart their pastion and in the child
to the cause


In [55]:
text_5 = complete_text("love", temperature=1, n_chars = 150)



In [56]:
print(text_5)

love
the kindled mishake him of you nevense,
these ears me: bin our heaven, he had he
death, by our with courthrop's pervort friends as these have now lov


## Sentiment Analysis

In [57]:
tf.random.set_seed(42)

In [58]:
(X_train, y_train), (X_test, y_test) = keras.datasets.imdb.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz


In [59]:
X_train[0][:10]

[1, 14, 22, 16, 43, 530, 973, 1622, 1385, 65]

In [60]:
word_index = keras.datasets.imdb.get_word_index()
id_to_word = {id_ + 3: word for word, id_ in word_index.items()}
for id_, token in enumerate(("<pad>", "<sos>", "<unk>")):
    id_to_word[id_] = token
" ".join([id_to_word[id_] for id_ in X_train[0][:10]])

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb_word_index.json


'<sos> this film was just brilliant casting location scenery story'

In [61]:
import tensorflow_datasets as tfds

datasets, info = tfds.load("imdb_reviews", as_supervised=True, with_info=True)

Downloading and preparing dataset imdb_reviews/plain_text/1.0.0 (download: 80.23 MiB, generated: Unknown size, total: 80.23 MiB) to /root/tensorflow_datasets/imdb_reviews/plain_text/1.0.0...


Shuffling and writing examples to /root/tensorflow_datasets/imdb_reviews/plain_text/1.0.0.incompleteIN04ID/imdb_reviews-train.tfrecord


Shuffling and writing examples to /root/tensorflow_datasets/imdb_reviews/plain_text/1.0.0.incompleteIN04ID/imdb_reviews-test.tfrecord


Shuffling and writing examples to /root/tensorflow_datasets/imdb_reviews/plain_text/1.0.0.incompleteIN04ID/imdb_reviews-unsupervised.tfrecord


Dataset imdb_reviews downloaded and prepared to /root/tensorflow_datasets/imdb_reviews/plain_text/1.0.0. Subsequent calls will reuse this data.


In [62]:
train_size = info.splits["train"].num_examples
test_size = info.splits["test"].num_examples

In [63]:
train_size, test_size

(25000, 25000)

In [64]:
datasets.keys()

dict_keys(['test', 'train', 'unsupervised'])

In [65]:
for X_batch, y_batch in datasets["train"].batch(2).take(1):
    for review, label in zip(X_batch.numpy(), y_batch.numpy()):
        print("Review:", review.decode("utf-8")[:200], "...")
        print("Label:", label, "= Positive" if label else "= Negative")
        print()

Review: This was an absolutely terrible movie. Don't be lured in by Christopher Walken or Michael Ironside. Both are great actors, but this must simply be their worst role in history. Even their great acting  ...
Label: 0 = Negative

Review: I have been known to fall asleep during films, but this is usually due to a combination of things including, really tired, being warm and comfortable on the sette and having just eaten a lot. However  ...
Label: 0 = Negative



In [66]:
def preprocess(X_batch, y_batch):
    X_batch = tf.strings.substr(X_batch, 0, 300)
    X_batch = tf.strings.regex_replace(X_batch, rb"<br\s*/?>", b" ")
    X_batch = tf.strings.regex_replace(X_batch, b"[^a-zA-Z']", b" ")
    X_batch = tf.strings.split(X_batch)
    return X_batch.to_tensor(default_value=b"<pad>"), y_batch
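
The same cleaning steps can be tried on a single Python string to see what each one does. This is a sketch using the standard `re` module rather than `tf.strings`; `preprocess_review` and the sample text are invented for illustration, and it truncates by characters whereas the TensorFlow version above works on bytes. Padding is left out because it only matters when reviews are batched.

```python
import re

def preprocess_review(review, max_chars=300):
    """Truncate, strip <br> tags, keep only letters and apostrophes, then split on whitespace."""
    review = review[:max_chars]
    review = re.sub(r"<br\s*/?>", " ", review)
    review = re.sub(r"[^a-zA-Z']", " ", review)
    return review.split()

sample_review = "Great film!<br />Don't miss it... 10/10"
print(preprocess_review(sample_review))
```

Note that case is preserved, which is why both `b'the'` and `b'The'` can appear as separate vocabulary entries.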

In [67]:
from collections import Counter
vocabulary = Counter()
for X_batch, y_batch in datasets["train"].batch(32).map(preprocess):
    for review in X_batch:
        vocabulary.update(list(review.numpy()))

In [68]:
vocabulary.most_common()[:3]

[(b'<pad>', 214309), (b'the', 61137), (b'a', 38564)]

In [69]:
vocab_size = 10000
truncated_vocabulary = [
    word for word, count in vocabulary.most_common()[:vocab_size]]

In [70]:
truncated_vocabulary

[b'<pad>',
 b'the',
 b'a',
 b'of',
 b'and',
 b'to',
 b'I',
 b'is',
 b'in',
 b'this',
 b'it',
 b'was',
 b'movie',
 b'that',
 b'The',
 b'film',
 b'with',
 b'for',
 b'as',
 b'on',
 b'but',
 b'have',
 b'This',
 b'one',
 b'not',
 b'be',
 b'are',
 b'you',
 b'an',
 b'at',
 b'about',
 b'by',
 b'all',
 b'his',
 b'so',
 b'like',
 b'from',
 b'who',
 b'has',
 b'It',
 b'good',
 b'my',
 b'just',
 b'very',
 b'out',
 b'or',
 b'story',
 b'some',
 b'time',
 b'had',
 b'he',
 b'they',
 b'really',
 b'me',
 b'when',
 b'what',
 b'first',
 b'movies',
 b'bad',
 b'see',
 b'seen',
 b'up',
 b'only',
 b'were',
 b"it's",
 b'would',
 b'more',
 b'made',
 b'great',
 b'can',
 b'been',
 b'i',
 b'her',
 b'no',
 b'A',
 b'which',
 b'even',
 b'films',
 b'there',
 b'ever',
 b'people',
 b'much',
 b'because',
 b'most',
 b'plot',
 b'if',
 b'than',
 b'acting',
 b'get',
 b'their',
 b'well',
 b'into',
 b'how',
 b'best',
 b'think',
 b'other',
 b'its',
 b"It's",
 b'saw',
 b'could',
 b'watch',
 b'many',
 b"don't",
 b'do',
 b'will',
 

Now we need to add a preprocessing step to replace each word with its ID (i.e., its index in the vocabulary).

In [71]:
words = tf.constant(truncated_vocabulary)
word_ids = tf.range(len(truncated_vocabulary), dtype=tf.int64)
vocab_init = tf.lookup.KeyValueTensorInitializer(words, word_ids)
num_oov_buckets = 1000
table = tf.lookup.StaticVocabularyTable(vocab_init, num_oov_buckets)

In [72]:
table.lookup(tf.constant([b"This movie was awesome".split()]))

<tf.Tensor: shape=(1, 4), dtype=int64, numpy=array([[ 22,  12,  11, 902]])>
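
Conceptually, the lookup table maps known words to their index and hashes every other word into one of the out-of-vocabulary (OOV) buckets placed after the vocabulary. The pure-Python sketch below illustrates that idea only: `make_lookup` is hypothetical, and `zlib.crc32` is a stand-in for whatever hash TensorFlow uses internally, so the bucket numbers will not match the real table.

```python
import zlib

def make_lookup(vocabulary, num_oov_buckets):
    """Return a function mapping a word to its vocabulary index or to an OOV bucket ID."""
    index = {word: i for i, word in enumerate(vocabulary)}

    def lookup(word):
        if word in index:
            return index[word]
        # Unknown word: hash it into one of the buckets that follow the vocabulary.
        bucket = zlib.crc32(word) % num_oov_buckets
        return len(vocabulary) + bucket

    return lookup

toy_vocab = [b"<pad>", b"the", b"movie", b"was"]
toy_lookup = make_lookup(toy_vocab, num_oov_buckets=3)
toy_ids = [toy_lookup(w) for w in b"the movie was awesome".split()]
print(toy_ids)
```

Because unknown words share a limited number of buckets, the embedding layer needs `vocab_size + num_oov_buckets` rows, and distinct rare words may collide in the same bucket.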

In [73]:
def encode_words(X_batch, y_batch):
    return table.lookup(X_batch), y_batch

In [74]:
train_set = datasets["train"].batch(32).map(preprocess)
train_set = train_set.map(encode_words).prefetch(1)

In [75]:
for X_batch, y_batch in train_set.take(1):
    print(X_batch)
    print(y_batch)

tf.Tensor(
[[  22   11   28 ...    0    0    0]
 [   6   21   70 ...    0    0    0]
 [4099 6881    1 ...    0    0    0]
 ...
 [  22   12  118 ...  331 1047    0]
 [1757 4101  451 ...    0    0    0]
 [3365 4392    6 ...    0    0    0]], shape=(32, 60), dtype=int64)
tf.Tensor([0 0 0 1 1 1 0 0 0 0 0 1 1 0 1 0 1 1 1 0 1 1 1 1 1 0 0 0 1 0 0 0], shape=(32,), dtype=int64)


In [76]:
test_set = datasets["test"].batch(32).map(preprocess)
test_set = test_set.map(encode_words).prefetch(1)

In [77]:
embed_size = 128
model = keras.models.Sequential([
    keras.layers.Embedding(vocab_size + num_oov_buckets, embed_size,
                           mask_zero=True, # not shown in the book
                           input_shape=[None]),
    keras.layers.GRU(128, return_sequences=True),
    keras.layers.GRU(128),
    keras.layers.Dense(1, activation="sigmoid")
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
history = model.fit(train_set, epochs=10, validation_data = test_set, callbacks = [NvidiaUtilizationCallback()])

  0%  GPU utilization
Epoch 1/10
  0%  GPU utilization
Epoch 2/10
  0%  GPU utilization
Epoch 3/10
  0%  GPU utilization
Epoch 4/10
  0%  GPU utilization
Epoch 5/10
  0%  GPU utilization
Epoch 6/10
  0%  GPU utilization
Epoch 7/10
  0%  GPU utilization
Epoch 8/10
  0%  GPU utilization
Epoch 9/10
  0%  GPU utilization
Epoch 10/10


In [78]:
# Equivalent model using the functional API with an explicit mask (kept for reference)
#------------------------------

# K = keras.backend
# embed_size = 128
# inputs = keras.layers.Input(shape=[None])
# mask = keras.layers.Lambda(lambda inputs: K.not_equal(inputs, 0))(inputs)
# z = keras.layers.Embedding(vocab_size + num_oov_buckets, embed_size)(inputs)
# z = keras.layers.GRU(128, return_sequences=True)(z, mask=mask)
# z = keras.layers.GRU(128)(z, mask=mask)
# outputs = keras.layers.Dense(1, activation="sigmoid")(z)
# model = keras.models.Model(inputs=[inputs], outputs=[outputs])
# model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
# history = model.fit(train_set, epochs=10, validation_data = test_set)   

In [79]:
def giveSentiment(text):
    return model.predict(table.lookup(tf.constant([text.split()])))

In [80]:
giveSentiment(b"Omelets are to die for!")

array([[0.9387816]], dtype=float32)

## Reusing Pretrained Embeddings

In [81]:
import os
TFHUB_CACHE_DIR = os.path.join(os.curdir, "my_tfhub_cache")
os.environ["TFHUB_CACHE_DIR"] = TFHUB_CACHE_DIR

In [82]:
import tensorflow_hub as hub

In [97]:
with tf.device("/cpu:0"):
    model = keras.Sequential([
        hub.KerasLayer("https://tfhub.dev/google/nnlm-en-dim128/2", input_shape=[], dtype=tf.string),
        keras.layers.Dense(128, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid")
    ])

In [98]:
for dirpath, dirnames, filenames in os.walk(TFHUB_CACHE_DIR):
    for filename in filenames:
        print(os.path.join(dirpath, filename))

./my_tfhub_cache/29abffb443cb0a0ca9c72e8e3863b76d85028490.descriptor.txt
./my_tfhub_cache/29abffb443cb0a0ca9c72e8e3863b76d85028490/saved_model.pb
./my_tfhub_cache/29abffb443cb0a0ca9c72e8e3863b76d85028490/variables/variables.data-00000-of-00001
./my_tfhub_cache/29abffb443cb0a0ca9c72e8e3863b76d85028490/variables/variables.index
./my_tfhub_cache/29abffb443cb0a0ca9c72e8e3863b76d85028490/assets/tokens.txt


In [99]:
import tensorflow_datasets as tfds

datasets, info = tfds.load("imdb_reviews", as_supervised=True, with_info=True)
train_size = info.splits["train"].num_examples
batch_size = 32
train_set = datasets["train"].batch(batch_size).prefetch(1)
test_set = datasets["test"].batch(batch_size).prefetch(1)

In [100]:
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
history = model.fit(train_set, epochs=20, validation_data = test_set, callbacks = [NvidiaUtilizationCallback()])

  0%  GPU utilization
Epoch 1/20
  0%  GPU utilization
Epoch 2/20
  0%  GPU utilization
Epoch 3/20
  0%  GPU utilization
Epoch 4/20
  0%  GPU utilization
Epoch 5/20
  0%  GPU utilization
Epoch 6/20
  0%  GPU utilization
Epoch 7/20
  0%  GPU utilization
Epoch 8/20
  0%  GPU utilization
Epoch 9/20
  0%  GPU utilization
Epoch 10/20
  0%  GPU utilization
Epoch 11/20
  0%  GPU utilization
Epoch 12/20
  0%  GPU utilization
Epoch 13/20
  0%  GPU utilization
Epoch 14/20
  0%  GPU utilization
Epoch 15/20
  0%  GPU utilization
Epoch 16/20
  0%  GPU utilization
Epoch 17/20
  0%  GPU utilization
Epoch 18/20
  0%  GPU utilization
Epoch 19/20
  0%  GPU utilization
Epoch 20/20


In [101]:
def giveSentiment(text):
    return model.predict([text])

In [102]:
giveSentiment("The movie was not good")

array([[0.1780042]], dtype=float32)

In [103]:
giveSentiment("The movies are to die for!")

array([[0.32158712]], dtype=float32)

## An Encoder–Decoder Network for Neural Machine Translation

In [104]:
import tensorflow_addons as tfa

In [105]:
vocab_size = 10000
embed_size = 128

In [106]:
try:
    encoder_inputs = keras.layers.Input(shape=[None], dtype=np.int32)
    decoder_inputs = keras.layers.Input(shape=[None], dtype=np.int32)
    sequence_lengths = keras.layers.Input(shape=[], dtype=np.int32)

    embeddings = keras.layers.Embedding(vocab_size, embed_size)
    encoder_embeddings = embeddings(encoder_inputs)
    decoder_embeddings = embeddings(decoder_inputs)

    encoder = keras.layers.LSTM(512, return_state=True)
    encoder_outputs, state_h, state_c = encoder(encoder_embeddings)
    encoder_state = [state_h, state_c]

    sampler = tfa.seq2seq.sampler.TrainingSampler()

    decoder_cell = keras.layers.LSTMCell(512)
    output_layer = keras.layers.Dense(vocab_size)
    decoder = tfa.seq2seq.basic_decoder.BasicDecoder(decoder_cell, sampler = sampler,
                                                     output_layer=output_layer)
    final_outputs, final_state, final_sequence_lengths = decoder(
        decoder_embeddings, initial_state=encoder_state,
        sequence_length=sequence_lengths)
    Y_proba = tf.nn.softmax(final_outputs.rnn_output)

    model = keras.models.Model(
        inputs=[encoder_inputs, decoder_inputs, sequence_lengths],
        outputs=[Y_proba])

    model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")


    X = np.random.randint(100, size=10*1000).reshape(1000, 10)
    Y = np.random.randint(100, size=15*1000).reshape(1000, 15)
    X_decoder = np.c_[np.zeros((1000, 1)), Y[:, :-1]]
    seq_lengths = np.full([1000], 15)

    history = model.fit([X, X_decoder, seq_lengths], Y, epochs=2)
except TypeError as t:
    print(t)

Epoch 1/2
Epoch 2/2
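
The decoder inputs in the cell above are the targets shifted one step to the right, with a column of zeros in front. A minimal sketch of that shift for a single sequence follows; reading the zero as a start-of-sequence token is an interpretation, and `shift_right` and the toy IDs are hypothetical.

```python
def shift_right(target_ids, sos_id=0):
    """Build decoder inputs: a start token followed by all targets except the last."""
    return [sos_id] + target_ids[:-1]

toy_target = [12, 7, 45, 3]
toy_decoder_input = shift_right(toy_target)
print(toy_decoder_input)
```

At each step the decoder is fed the word it should have produced at the previous step, which is the usual teacher-forcing setup during training.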


## Bidirectional Recurrent Layers

In [107]:
model = keras.models.Sequential([
    keras.layers.GRU(10, return_sequences = True, input_shape=[None, 10]),
    keras.layers.Bidirectional(keras.layers.GRU(10, return_sequences=True))
])

model.summary()

Model: "sequential_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
gru_9 (GRU)                  (None, None, 10)          660       
_________________________________________________________________
bidirectional_1 (Bidirection (None, None, 20)          1320      
Total params: 1,980
Trainable params: 1,980
Non-trainable params: 0
_________________________________________________________________
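
The parameter counts in the summary can be reproduced by hand. The sketch below assumes the TF 2 default `reset_after=True`, under which a GRU layer has three gates, each with an input kernel, a recurrent kernel, and two bias vectors; `gru_param_count` is a helper written for this check.

```python
def gru_param_count(units, input_dim, reset_after=True):
    """Number of trainable parameters in one GRU layer."""
    biases = 2 * units if reset_after else units
    return 3 * (units * input_dim + units * units + biases)

first_gru = gru_param_count(units=10, input_dim=10)
# The bidirectional wrapper holds two independent GRUs (forward and backward),
# each reading the 10-dimensional output of the first layer.
bidirectional = 2 * gru_param_count(units=10, input_dim=10)
print(first_gru, bidirectional, first_gru + bidirectional)  # 660 1320 1980
```

The wrapper concatenates the forward and backward outputs at each time step, which is why its output has 20 features instead of 10.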
