I picked the Yelp polarity reviews dataset from the TensorFlow Datasets catalog, located at https://www.tensorflow.org/datasets/catalog/yelp_polarity_reviews.  I wanted to perform text classification on this dataset, since sequence models are well suited to machine learning problems with sequential input, and each review in this dataset is a sequence of words.  I chose to build a bidirectional RNN with TensorFlow because that is where I got my dataset, and I thought the two were the most likely to be compatible.

In [1]:
import numpy as np
import tensorflow_datasets as tfds
import tensorflow as tf
import matplotlib.pyplot as plt

2021-10-13 22:22:41.654593: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-10-13 22:22:41.654645: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.


In [2]:
dataset, info = tfds.load('yelp_polarity_reviews', with_info=True, as_supervised=True)
train_dataset = dataset['train']
test_dataset = dataset['test']

train_dataset.element_spec

2021-10-13 22:22:47.576335: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-10-13 22:22:47.576394: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2021-10-13 22:22:47.576437: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (DESKTOP-J2H6DT8): /proc/driver/nvidia/version does not exist
2021-10-13 22:22:47.576947: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


(TensorSpec(shape=(), dtype=tf.string, name=None),
 TensorSpec(shape=(), dtype=tf.int64, name=None))

In [3]:
for example, label in train_dataset.take(1):
  print('text: ', example.numpy())
  print('label: ', label.numpy())

text:  b"The Groovy P. and I ventured to his old stomping grounds for lunch today.  The '5 and Diner' on 16th St and Colter left me with little to ask for.  Before coming here I had a preconceived notion that 5 & Diners were dirty and nasty. Not the case at all.\\n\\nWe walk in and let the waitress know we want to sit outside (since it's so nice and they had misters).  We get two different servers bringing us stuff (talk about service) and I ask the one waitress for recommendations.  I didn't listen to her, of course, and ordered the Southwestern Burger w/ coleslaw and started with a nice stack of rings.\\n\\nThe Onion Rings were perfectly cooked.  They looked like they were prepackaged, but they were very crispy and I could actually bite through the onion without pulling the entire thing out (don't you hate that?!!!)\\n\\nThe Southwestern Burger was order Medium Rare and was cooked accordingly.  Soft, juicy, and pink with a nice crispy browned outer layer that can only be achieved on 

2021-10-13 22:22:48.552350: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)


In [4]:
BUFFER_SIZE = 10000
BATCH_SIZE = 64

In [5]:
train_dataset = train_dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
test_dataset = test_dataset.batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)

In [6]:
for example, label in train_dataset.take(1):
  print('texts: ', example.numpy()[:3])
  print()
  print('labels: ', label.numpy()[:3])

texts:  [b'RE: Food poisoning\\n\\nApril 26 @ 8pm, my girlfriends and I went for a quick dinner before we catch our Cirque show @ 9:30pm. I ordered the gnocchi lobster ragu. The first bite was already off as the seafood smell and taste was very strong (like bad unfresh seafood strong), but I brushed it off as the robust /deep flavour of the ragu. I offered my girlfriend (bride-to-be for this trip) took one bite. We both were feeling sick sitting through the show and went straight back to our hotel room. Within seconds to the porcelain throne,  I vomited violently for about 5 minutes and my gf was also for the remainder of the night. She was left in bed for over 24 hours in our hotel room. After I reported to the Front Desk Manager, he apologized but no follow up phone call was made to us.\\n\\nExtremely disappointed in the customer service and bad seafood we were served.'
 b"In Short...Decent Atmosphere, Average overall food Quality...WAAYYY Too Expensive.\\n\\nFor a place famous for i

In [7]:
VOCAB_SIZE = 1000
encoder = tf.keras.layers.experimental.preprocessing.TextVectorization(
    max_tokens=VOCAB_SIZE)
encoder.adapt(train_dataset.map(lambda text, label: text))

In [8]:
vocab = np.array(encoder.get_vocabulary())
vocab[:20]

array(['', '[UNK]', 'the', 'and', 'i', 'to', 'a', 'was', 'of', 'it',
       'for', 'in', 'is', 'that', 'my', 'we', 'this', 'with', 'but',
       'they'], dtype='<U13')

In [9]:
encoded_example = encoder(example)[:3].numpy()
encoded_example

array([[  1,  30,   1, ...,   0,   0,   0],
       [ 16,  12,  62, ...,   0,   0,   0],
       [ 11,   1, 355, ...,   0,   0,   0]])

In [10]:
for n in range(3):
  print("Original: ", example[n].numpy())
  print("\nRound-trip: ", " ".join(vocab[encoded_example[n]]))
  print()

Original:  b'RE: Food poisoning\\n\\nApril 26 @ 8pm, my girlfriends and I went for a quick dinner before we catch our Cirque show @ 9:30pm. I ordered the gnocchi lobster ragu. The first bite was already off as the seafood smell and taste was very strong (like bad unfresh seafood strong), but I brushed it off as the robust /deep flavour of the ragu. I offered my girlfriend (bride-to-be for this trip) took one bite. We both were feeling sick sitting through the show and went straight back to our hotel room. Within seconds to the porcelain throne,  I vomited violently for about 5 minutes and my gf was also for the remainder of the night. She was left in bed for over 24 hours in our hotel room. After I reported to the Front Desk Manager, he apologized but no follow up phone call was made to us.\\n\\nExtremely disappointed in the customer service and bad seafood we were served.'

Round-trip:  [UNK] food [UNK] [UNK] [UNK] my [UNK] and i went for a quick dinner before we [UNK] our [UNK] show 

In [11]:
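model = tf.keras.Sequential([
    encoder,  # raw text -> sequence of integer token indices
    tf.keras.layers.Embedding(
        input_dim=len(encoder.get_vocabulary()),
        output_dim=64,
        # Treat index 0 (padding) as masked so later layers ignore it
        mask_zero=True),
    # Run the LSTM forward and backward; the two 64-unit outputs are concatenated
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1)  # single logit (no activation)
])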

In [28]:
sample_text = ('This place has an awesome vegetarian/vegan menu option!!'
               'Everything was very tasty.')
predictions = model.predict(np.array([sample_text]))
print(predictions[0])

[4.067574]


In [14]:
model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              optimizer=tf.keras.optimizers.Adam(1e-4),
              metrics=['accuracy'])

In [16]:
history = model.fit(train_dataset, epochs=1,
                    validation_data=test_dataset,
                    validation_steps=30)



In [17]:
test_loss, test_acc = model.evaluate(test_dataset)

print('Test Loss:', test_loss)
print('Test Accuracy:', test_acc)

Test Loss: 0.21200621128082275
Test Accuracy: 0.9108684062957764


Task 1:
The RNN is a stack of layers.  It starts with an encoder (a TextVectorization layer) that converts each review into a sequence of token indices, followed by an embedding layer that converts those indices into trainable vectors.  Next is a bidirectional wrapper around an LSTM layer, which processes the sequence both forward and backward and combines the two results, so the output reflects context from both ends of the review.  Lastly, two dense layers reduce that output to a single value: the first uses the ReLU activation function, and the second produces one logit per review.  The metric I am using to measure performance is the built-in accuracy metric passed to the compile function.
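
To make the bidirectional step more concrete, here is a minimal NumPy sketch of the idea.  It is not the Keras implementation: a toy tanh RNN stands in for the LSTM, and the names `run_rnn`, `time_steps`, `features`, and `units` are made up for illustration.  It only shows that the sequence is read once in each direction, with separate weights, and that the two final states are concatenated.

```python
import numpy as np

def run_rnn(inputs, w_in, w_rec):
    """Run a toy tanh RNN over inputs of shape (time, features); return the final state."""
    hidden = np.zeros(w_rec.shape[0])
    for x_t in inputs:
        hidden = np.tanh(x_t @ w_in + hidden @ w_rec)
    return hidden

rng = np.random.default_rng(0)
time_steps, features, units = 5, 4, 3  # hypothetical sizes, chosen only for the demo
sequence = rng.normal(size=(time_steps, features))

# Each direction gets its own weights (assumption: random values instead of trained ones)
fwd_in, fwd_rec = rng.normal(size=(features, units)), rng.normal(size=(units, units))
bwd_in, bwd_rec = rng.normal(size=(features, units)), rng.normal(size=(units, units))

forward_state = run_rnn(sequence, fwd_in, fwd_rec)         # reads first step to last
backward_state = run_rnn(sequence[::-1], bwd_in, bwd_rec)  # reads last step to first

# Concatenating the two states doubles the feature size
bidirectional_output = np.concatenate([forward_state, backward_state])
print(bidirectional_output.shape)
```

The same doubling applies in the model above: with the default concatenation, `Bidirectional(LSTM(64))` passes 128 features on to the first dense layer.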

In [22]:
model = tf.keras.Sequential([
    encoder,
    tf.keras.layers.Embedding(len(encoder.get_vocabulary()), 64, mask_zero=True),
    # return_sequences=True passes the full output sequence to the next recurrent layer
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64,  return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.5),  # randomly zero half of the activations during training
    tf.keras.layers.Dense(1)
])

In [24]:
model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              optimizer=tf.keras.optimizers.Adam(1e-4),
              metrics=['accuracy'])

In [25]:
history = model.fit(train_dataset, epochs=1,
                    validation_data=test_dataset,
                    validation_steps=30)



In [26]:
test_loss, test_acc = model.evaluate(test_dataset)

print('Test Loss:', test_loss)
print('Test Accuracy:', test_acc)

Test Loss: 0.21233096718788147
Test Accuracy: 0.9087894558906555


In [29]:
sample_text = ('This place has an awesome vegetarian/vegan menu option!!'
               'Everything was very tasty.')
predictions = model.predict(np.array([sample_text]))
print(predictions)

[[4.067574]]
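
Because the last layer is `Dense(1)` with no activation and the loss uses `from_logits=True`, the printed value is a raw logit rather than a probability.  The sketch below applies a sigmoid to that value using only the standard library.  The helper name `logit_to_probability` is hypothetical, and reading label 1 as the positive class is an assumption about the dataset's label encoding.

```python
import math

def logit_to_probability(logit):
    """Map a raw logit to a probability with the sigmoid function."""
    return 1.0 / (1.0 + math.exp(-logit))

logit = 4.067574  # value printed by model.predict above
probability = logit_to_probability(logit)

# A probability of 0.5 corresponds to a logit of 0
predicted_label = int(probability >= 0.5)  # assumption: 1 = positive review

print(f"{probability:.3f}", predicted_label)  # prints "0.983 1"
```

Under that assumption, the sample review is classified as positive with high confidence.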


Task 2:
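I did not notice any major differences in my results between the two training runs: test accuracy was about 0.911 for the first model and about 0.909 for the second.  One major issue I ran into was that training was incredibly slow, so I was only able to finish one epoch for each model.  Each run took upwards of 3 hours for that single epoch.  The TensorFlow log output shows that no GPU was found, so training ran on the CPU, which likely contributed to the slow speed.  With so little training, the deeper model may not have had enough time to show any advantage, which could explain why my results were so similar.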
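
As a rough back-of-the-envelope check on the training time, the sketch below estimates the cost per batch and how many batches would fit into a shorter time budget.  It assumes the train split has 560,000 reviews (the size listed in the TFDS catalog) and treats "upwards of 3 hours" as exactly 3 hours, so the numbers are estimates only.  The helper `steps_within` is a made-up name.

```python
import math

TRAIN_EXAMPLES = 560_000  # assumption: train split size from the TFDS catalog
BATCH_SIZE = 64           # same value as in the notebook
EPOCH_SECONDS = 3 * 3600  # assumption: "upwards of 3 hours" treated as 3 hours

steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
seconds_per_step = EPOCH_SECONDS / steps_per_epoch

def steps_within(budget_minutes):
    """Number of whole training steps that fit into the given time budget."""
    return int(budget_minutes * 60 / seconds_per_step)

print(steps_per_epoch)             # 8750
print(round(seconds_per_step, 2))  # 1.23
print(steps_within(30))            # 1458
```

One possible use of this estimate: passing `steps_per_epoch=steps_within(30)` to `model.fit` would give several shorter epochs instead of one long one, which might make differences between the two models easier to see.  I have not tried this here.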