## Enabling and testing the GPU

First, you'll need to enable GPUs for the notebook:

- Navigate to Edit→Notebook Settings
- Select GPU from the Hardware Accelerator drop-down

Next, we'll confirm that we can connect to the GPU with TensorFlow:

# TensorFlow with GPU

This notebook provides an introduction to computing on a [GPU](https://cloud.google.com/gpu) in Colab. In this notebook, you will connect to a GPU and then run some basic TensorFlow operations on both the CPU and the GPU, observing the speedup that the GPU provides.


In [None]:
%tensorflow_version 2.x
import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))

Found GPU at: /device:GPU:0


## Observe TensorFlow speedup on GPU relative to CPU

This example constructs a typical convolutional neural network layer over a
random image and manually places the resulting ops on either the CPU or the GPU
to compare execution speed.

In [None]:
%tensorflow_version 2.x
import tensorflow as tf
import timeit

device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  print(
      '\n\nThis error most likely means that this notebook is not '
      'configured to use a GPU.  Change this in Notebook Settings via the '
      'command palette (cmd/ctrl-shift-P) or the Edit menu.\n\n')
  raise SystemError('GPU device not found')

def cpu():
  with tf.device('/cpu:0'):
    random_image_cpu = tf.random.normal((100, 100, 100, 3))
    net_cpu = tf.keras.layers.Conv2D(32, 7)(random_image_cpu)
    return tf.math.reduce_sum(net_cpu)

def gpu():
  with tf.device('/device:GPU:0'):
    random_image_gpu = tf.random.normal((100, 100, 100, 3))
    net_gpu = tf.keras.layers.Conv2D(32, 7)(random_image_gpu)
    return tf.math.reduce_sum(net_gpu)
  
# We run each op once to warm up; see: https://stackoverflow.com/a/45067900
cpu()
gpu()

# Run the op several times.
print('Time (s) to convolve 32x7x7x3 filter over random 100x100x100x3 images '
      '(batch x height x width x channel). Sum of ten runs.')
print('CPU (s):')
cpu_time = timeit.timeit('cpu()', number=10, setup="from __main__ import cpu")
print(cpu_time)
print('GPU (s):')
gpu_time = timeit.timeit('gpu()', number=10, setup="from __main__ import gpu")
print(gpu_time)
print('GPU speedup over CPU: {}x'.format(int(cpu_time/gpu_time)))

Time (s) to convolve 32x7x7x3 filter over random 100x100x100x3 images (batch x height x width x channel). Sum of ten runs.
CPU (s):
3.6868959430000245
GPU (s):
0.05784499899999673
GPU speedup over CPU: 63x
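
The warm-up-then-measure pattern used above is not specific to TensorFlow. As a minimal sketch, assuming a hypothetical pure-Python workload `work()` in place of the convolution, `timeit.timeit` also accepts a callable directly, which avoids the `from __main__ import ...` setup string:

```python
import timeit


def work(n=10_000):
    # Hypothetical stand-in workload (not from the notebook): sum of squares.
    return sum(i * i for i in range(n))


# Warm-up call, mirroring the cpu()/gpu() warm-up above; its time is discarded.
work()

# timeit.timeit accepts a zero-argument callable. With number=10 it returns
# the total time for ten calls, matching "Sum of ten runs" above.
total = timeit.timeit(work, number=10)
per_call = total / 10
print('Total (s): {:.6f}, per call (s): {:.6f}'.format(total, per_call))
```

Because the reported figure is a sum over ten runs, dividing by `number` gives a per-call time that is easier to compare across experiments with different repeat counts.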


# Transcript To Text Keras Code

In [1]:
# Standard Data Science Libraries
import pickle
import math
import pandas as pd
import numpy as np
from numpy import array

# Neural Net Preprocessing
from sklearn.feature_extraction.text import CountVectorizer
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.preprocessing.sequence import pad_sequences
# Neural Net Layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import LSTM
from tensorflow.keras.layers import Embedding

# Neural Net Training
from tensorflow.keras.models import load_model
from tensorflow.keras.callbacks import ModelCheckpoint
# Import from tensorflow.keras (not standalone keras) so that all layers and
# callbacks come from the same package.
from tensorflow.keras.callbacks import EarlyStopping

from pickle import load



In [2]:
import pickle

# Load the pre-built training targets and inputs. The `with` block closes
# each file automatically, so no explicit close() call is needed.
with open("trainY.pkl", 'rb') as trainY_pickle_file:
    trainY = pickle.load(trainY_pickle_file)

with open("trainX.pkl", 'rb') as trainX_pickle_file:
    trainX = pickle.load(trainX_pickle_file)

pd.DataFrame(trainX), pd.DataFrame(trainY)

(          0     1     2     3     4     5   ...    13    14   15   16    17    18
 0         44   138    36    55     2    51  ...     2  1971    7    3   140    66
 1        138    36    55     2    51   132  ...  1971     7    3  140    66     4
 2         36    55     2    51   132    10  ...     7     3  140   66     4    12
 3         55     2    51   132    10    18  ...     3   140   66    4    12  2420
 4          2    51   132    10    18   967  ...   140    66    4   12  2420     3
 ...      ...   ...   ...   ...   ...   ...  ...   ...   ...  ...  ...   ...   ...
 106216   280     1   122  1393    66     7  ...    21   626   20    7   219   770
 106217     1   122  1393    66     7  1970  ...   626    20    7  219   770   245
 106218   122  1393    66     7  1970     2  ...    20     7  219  770   245    51
 106219  1393    66     7  1970     2   478  ...     7   219  770  245    51  3250
 106220    66     7  1970     2   478    77  ...   219   770  245   51  3250    63
 
 [

In [5]:
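# Load the author's text, tokenize it, and define the model
vocab_size = 7269  # placeholder; recomputed from the tokenizer below
author = "DAN"


# The `with` block closes the file automatically.
with open("author_values.pkl", 'rb') as author_values_file:
    author_values = pickle.load(author_values_file)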


max_words = 50000 # Max size of the dictionary
tokenizer = Tokenizer(num_words=max_words)
tokenizer.fit_on_texts(author_values)
sequences = tokenizer.texts_to_sequences(author_values)
print(sequences[:5])

# Flatten the list of lists resulting from the tokenization. This will reduce the list
# to one dimension, allowing us to apply the sliding window technique to predict the next word
text = [item for sublist in sequences for item in sublist]
vocab_size = len(tokenizer.word_index)

# Training on 19 words to predict the 20th
sentence_len = 20
pred_len = 1
train_len = sentence_len - pred_len
seq = []
# Sliding window to generate train data
for i in range(len(text)-sentence_len):
    seq.append(text[i:i+sentence_len])
# Reverse dictionary to decode tokenized sequences back to words
reverse_word_map = dict(map(reversed, tokenizer.word_index.items()))


model = Sequential([
    Embedding(vocab_size+1, 50, input_length=train_len),
    LSTM(150, return_sequences=True),
    LSTM(150),
    Dense(150, activation='relu'),
    Dense(vocab_size, activation='softmax')
])

[[44, 138, 36, 55, 2, 51, 132, 10, 18, 967, 356, 132, 10, 2, 1971, 7, 3, 140, 66, 4, 12, 2420, 3, 357, 80, 3, 24, 124, 357, 5, 75, 8, 139, 332, 3, 17, 5, 2421, 8, 967, 1044, 3, 140, 66, 2422, 2, 69, 292, 6, 562, 798, 10, 2423, 8, 64, 1, 63, 30, 1430, 268, 7, 3, 17, 2, 1972, 255, 612, 55, 2, 427, 141, 243, 235, 1148, 117, 562, 798, 1148, 41, 562, 798, 3, 17, 511, 255, 64, 86, 62, 372, 1149, 71, 5, 719, 639, 2424, 286, 159, 56, 640, 6, 1255, 6, 56, 1625, 5, 185, 528, 641, 5, 1626, 36, 71, 5, 512, 2425, 7, 1973, 581, 3, 61, 1045, 157, 172, 70, 148, 41, 562, 2426, 1046, 8, 358, 305, 18, 107, 51, 16, 51, 1431, 19, 450, 2427, 562, 6, 3, 54, 513, 10, 76, 6, 3, 54, 513, 10, 76, 35, 22, 513, 40, 172, 21, 613, 102, 662, 11, 91, 44, 3, 799, 4, 60, 7, 51, 26, 1627, 19, 109, 14, 10, 2, 2428, 373, 1, 17, 295, 112, 6, 3, 52, 3, 24, 1628, 27, 20, 613, 49, 2429, 71, 184, 71, 44, 125, 9, 38, 35, 17, 1256, 232, 44, 333, 14, 1627, 80, 32, 25, 43, 3, 53, 1974, 44, 54, 30, 513, 6, 19, 54, 30, 2430, 6, 19, 5
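
The sliding-window step in the cell above can be illustrated on a toy token stream. This is a minimal sketch with hypothetical data; the split into inputs and targets is presumably how `trainX` and `trainY` were derived, though the notebook does not show that step:

```python
# Hypothetical toy token stream standing in for the flattened `text` list.
text = [11, 12, 13, 14, 15, 16, 17]

sentence_len = 4  # the notebook uses 20
pred_len = 1
train_len = sentence_len - pred_len

# Same loop bound as in the cell above: one window per starting offset.
seq = [text[i:i + sentence_len] for i in range(len(text) - sentence_len)]

# Split each window into model input (first train_len tokens) and target
# (the final token). This split is an assumption, not shown in the notebook.
X = [window[:train_len] for window in seq]
y = [window[-1] for window in seq]

for inputs, target in zip(X, y):
    print(inputs, '->', target)

# Reverse lookup in the style of reverse_word_map; the vocabulary is made up.
word_index = {'the': 11, 'cat': 12, 'sat': 13, 'down': 14}
reverse_word_map = dict(map(reversed, word_index.items()))
print([reverse_word_map[token] for token in X[0]])
```

Note that `range(len(text) - sentence_len)` stops one short of the last complete window (here `[14, 15, 16, 17]`); using `len(text) - sentence_len + 1` would include it.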

In [6]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        (None, 19, 50)            363500    
_________________________________________________________________
lstm (LSTM)                  (None, 19, 150)           120600    
_________________________________________________________________
lstm_1 (LSTM)                (None, 150)               180600    
_________________________________________________________________
dense (Dense)                (None, 150)               22650     
_________________________________________________________________
dense_1 (Dense)              (None, 7269)              1097619   
Total params: 1,784,969
Trainable params: 1,784,969
Non-trainable params: 0
_________________________________________________________________
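
The parameter counts in the summary can be checked by hand. The sketch below uses the standard formulas for Embedding, LSTM, and Dense layers with bias terms; the helper names `lstm_params` and `dense_params` are hypothetical and exist only for this check:

```python
vocab_size = 7269  # width of the final Dense layer in the summary
embed_dim = 50
units = 150


def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * (units * (input_dim + units) + units)


def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit.
    return input_dim * units + units


counts = {
    'embedding': (vocab_size + 1) * embed_dim,  # +1 for the reserved index 0
    'lstm': lstm_params(embed_dim, units),
    'lstm_1': lstm_params(units, units),
    'dense': dense_params(units, units),
    'dense_1': dense_params(units, vocab_size),
}

for name, count in counts.items():
    print('{:<10} {:>10,}'.format(name, count))
total = sum(counts.values())
print('{:<10} {:>10,}'.format('total', total))
```

The final Dense layer accounts for well over half of the parameters, which is typical when the output is a softmax over the whole vocabulary.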


In [8]:
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
# fit model
model.fit(np.asarray(trainX),
          pd.get_dummies(np.asarray(trainY)),
          batch_size=128, epochs=100)


Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78

<tensorflow.python.keras.callbacks.History at 0x7faa84aeec88>
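
One detail of the `fit` call is worth spelling out: `pd.get_dummies` creates one column per distinct value it sees, so the width of the encoded targets depends on which tokens happen to occur in `trainY`, and it must match the width of the final Dense layer. As a minimal sketch of a fixed-width alternative, assuming a hypothetical helper `one_hot` and toy labels:

```python
import numpy as np


def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float32 one-hot matrix."""
    labels = np.asarray(labels, dtype=np.int64).ravel()
    if labels.min() < 0 or labels.max() >= num_classes:
        raise ValueError('label out of range for num_classes')
    encoded = np.zeros((labels.size, num_classes), dtype=np.float32)
    # Set exactly one entry per row: the column given by that row's label.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


labels = [3, 1, 3]  # hypothetical targets; classes 0, 2, and 4 never occur
encoded = one_hot(labels, num_classes=5)
print(encoded.shape)  # (3, 5)
```

With a fixed width, column `k` always means class `k`, regardless of which labels appear in a given batch of data. Keras `Tokenizer` indices start at 1, so encoding raw token ids this way would need `vocab_size + 1` columns and a matching output layer; that would be a change to the model above, not a description of it.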

In [12]:
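with open('tokenizer.pkl', 'wb') as tokenizer_file:  # closes the file on exit
    pickle.dump(tokenizer, tokenizer_file)
# Note: model.save() writes the full model, not only the weights.
model.save('model_weights.hdf5')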
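
Before relying on a saved pickle, a quick round trip can confirm that the object survives serialization. This is a minimal sketch that uses a plain dictionary as a hypothetical stand-in for the fitted tokenizer and writes to a temporary directory so that no notebook files are touched:

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for the fitted tokenizer; any picklable object works.
state = {'word_index': {'the': 1, 'cat': 2}, 'num_words': 50000}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'tokenizer.pkl')
    # Write and read inside `with` blocks so the file handles are closed.
    with open(path, 'wb') as pickle_file:
        pickle.dump(state, pickle_file)
    with open(path, 'rb') as pickle_file:
        restored = pickle.load(pickle_file)

print('Round trip preserved the object:', restored == state)
```

The same pattern applies to `tokenizer.pkl`: load it back with `pickle.load` and compare `word_index` against the in-memory tokenizer.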