ELMo embeddings, developed at AllenNLP, are one of many pre-trained models available on TensorFlow Hub. They are learned from the internal states of a bidirectional LSTM and represent contextual features of the input text. They have been shown to outperform GloVe and Word2Vec embeddings on a wide variety of NLP tasks.

In [None]:
! pip install nltk
# NLTK is a standard Python library that provides a diverse set of algorithms for NLP.
# It is one of the most widely used libraries for NLP and computational linguistics.



In [None]:
# The Keras ELMo implementation requires an older version of TensorFlow. Run this cell, then restart your kernel.
# Do not run this in a local Jupyter notebook. Use Google Colab or a cloud notebook with a GPU.
!pip install tensorflow==1.15
!pip install "tensorflow_hub>=0.6.0"
# TF.Text is a TensorFlow library of text-related ops, modules, and subgraphs. It performs the
# preprocessing regularly required by text-based models and includes other features useful for
# sequence modeling that core TensorFlow does not provide.
!pip3 install tensorflow_text==1.15

Collecting tensorflow==1.15
  Downloading tensorflow-1.15.0-cp37-cp37m-manylinux2010_x86_64.whl (412.3 MB)
Collecting gast==0.2.2
  Downloading gast-0.2.2.tar.gz (10 kB)
Collecting keras-applications>=1.0.8
  Downloading Keras_Applications-1.0.8-py3-none-any.whl (50 kB)
Collecting tensorboard<1.16.0,>=1.15.0
  Downloading tensorboard-1.15.0-py3-none-any.whl (3.8 MB)
Collecting tensorflow-estimator==1.15.1
  Downloading tensorflow_estimator-1.15.1-py2.py3-none-any.whl (503 kB)
Building wheels for collected packages: gast
  Building wheel for gast (setup.py) ... done
  Created wheel for gast: filename=gast-0.2.2-py3-none-any.whl size=7554 sha256=50997e5008ad7d791c0a466ac826f2deea1a80e1cecaf57e3916007b4b368b3b
  Stored in

In [None]:
# Import the modules from the libraries installed in the previous cells.
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow.keras.backend as K
import numpy as np

In [None]:
tf.__version__, np.__version__ # Using this command we can check the versions of tensorflow and numpy.

('1.15.0', '1.19.5')
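
The layer defined in the next cell relies on TF 1.x-only APIs such as `hub.Module` and `tf.global_variables_initializer`, so it can help to fail early if the kernel was not restarted after the install. A minimal sketch of such a guard follows; the helper name `is_tf1` is hypothetical, and the version string is passed in explicitly so the snippet runs without TensorFlow.

```python
def is_tf1(version):
    """Return True when a version string such as '1.15.0' has major version 1."""
    major = version.split(".")[0]
    return major == "1"

# In the notebook this would be called as is_tf1(tf.__version__).
for candidate in ("1.15.0", "2.4.1"):
    status = "ok" if is_tf1(candidate) else "restart the kernel after installing tensorflow==1.15"
    print(candidate, "->", status)
```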

In [None]:
# Before we can use ELMo in a model, we need to download the ELMo module from TF Hub and wrap
# it in a class so that it can be used as a Keras layer.
# The latest version of the ELMo embeddings is at https://tfhub.dev/google/elmo/3
# For this demonstration we use version 2.

# Version 2: restricted the trainable variables to the 4 scalar weights described in the paper.
# Version 3: fixed the default output by correctly ignoring padding during mean pooling.
#            All other outputs and signatures are correct and unchanged.


# The module computes contextualized word representations using character-based word
# representations and bidirectional LSTMs.
# It accepts input either as raw text strings or as tokenized text strings.
# The outputs are fixed embeddings at each LSTM layer, a learnable aggregation of the 3 layers,
# and a fixed mean-pooled vector representation of the input.
# ElmoEmbeddingLayer is a subclass of tf.keras.layers.Layer.
class ElmoEmbeddingLayer(tf.keras.layers.Layer):
    """Original Author Credit --- Taken from:
    https://github.com/strongio/keras-elmo/blob/master/Elmo%20Keras.ipynb"""

    # __init__ only initializes the object's attributes.
    def __init__(self, **kwargs):
        # ELMo embeddings have 1024 dimensions, so each input is mapped to a 1024-dimensional vector.
        self.dimensions = 1024
        super(ElmoEmbeddingLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        # Download ELMo version 2 from TF Hub.
        # The module is trainable only if this layer was created with trainable=True.
        self.elmo = hub.Module(
            'https://tfhub.dev/google/elmo/2',
            trainable=self.trainable,
            name="{}_module".format(self.name)
        )
        if self.trainable:
            # tf.trainable_variables returns all variables created with trainable=True in the
            # given scope; register the module's variables as this layer's trainable weights.
            self._trainable_weights.extend(
                tf.trainable_variables(scope="^{}_module/.*".format(self.name))
            )
        # Changed assuming trainable weights might be set using
        super(ElmoEmbeddingLayer, self).build(input_shape)

    def call(self, x, mask=None):
        # K.squeeze removes the size-1 axis, turning input of shape (batch, 1) into (batch,).
        # The input is cast to strings before being passed to the ELMo module.
        result = self.elmo(
            K.squeeze(K.cast(x, tf.string), axis=1),
            as_dict=True,
            signature='default',
        )['default']
        return result

    def compute_mask(self, inputs, mask=None):
        # Element-wise inequality: returns a boolean tensor that is False wherever the input
        # equals the '--PAD--' padding token and True elsewhere.
        return K.not_equal(inputs, '--PAD--')

    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.dimensions)
# ---- ElmoEmbeddingLayer class ends here ----

def create_model(train_elmo=False):
  # Create a Sequential model
  model = tf.keras.Sequential([
      # The input layer must be included explicitly
      # so that Keras accepts string input.
      # Taken from:
      # https://gist.github.com/colinmorris/9183206284b4fe3179809098e809d009
      # InputLayer is the entry point into the network.
      tf.keras.layers.InputLayer(dtype='string', input_shape=(1,)),
      # The second layer is the ELMo layer defined by ElmoEmbeddingLayer.
      ElmoEmbeddingLayer(trainable=train_elmo),
      # The third layer is a dense layer with a sigmoid activation, so the output is a
      # probability, as the binary_crossentropy loss expects.
      tf.keras.layers.Dense(1, activation='sigmoid')
  ])

  # Needed to initialize the ELMo variables
  sess = K.get_session()  # Returns the TensorFlow session used by the Keras backend
  init = tf.global_variables_initializer()  # Op that initializes all global variables
  sess.run(init)

  # Compile model
  model.compile(
      optimizer="adam",
      loss="binary_crossentropy",
      metrics=["accuracy"]
  )
  return model




"""The output dictionary contains:

- word_emb: the character-based word representations with shape [batch_size, max_length, 512].
- lstm_outputs1: the first LSTM hidden state with shape [batch_size, max_length, 1024].
- lstm_outputs2: the second LSTM hidden state with shape [batch_size, max_length, 1024].
- elmo: the weighted sum of the 3 layers, where the weights are trainable. This tensor has shape
        [batch_size, max_length, 1024]
- default: a fixed mean-pooling of all contextualized word representations with shape [batch_size, 1024]

"""

In [None]:
model = create_model(train_elmo=True)
# Call create_model to build the model object, with the ELMo aggregation weights trainable.

INFO:tensorflow:Saver not created because there are no variables in the graph to restore




Instructions for updating:
If using Keras pass *_constraint arguments to layers.





In [None]:
import pandas as pd
# Loading our Product Review Dataset
data = pd.read_csv("https://raw.githubusercontent.com/joshivaibhav/AmazonCustomerReview/master/amazondata.csv")



In [None]:
data.head(10)

Unnamed: 0,Helpful Votes (bin),Number of Records,Star Rating (bin),Customer Id,Helpful Votes,Overall Votes,Product Id,Review Body,Review Year,Review Headline,Star Rating
0,0.0,1,0.0,,4.0,14.0,26009102,You will love this book. It is a hard long re...,03/17/2005 0:00,Best Book Ever,5.0
1,,1,,,,,7491727,This is the UK edition of Dr. Omit's book. Dr....,,researchers from John Hopkins School of Medici...,
2,0.0,1,0.0,,2.0,2.0,002782683X,This is a fun and entertaining book about lear...,06/25/2012 0:00,Michelle,5.0
3,0.0,1,0.0,,0.0,0.0,60187271,"Started a big slow, but once into it the autho...",06/09/2013 0:00,Loved the book,5.0
4,0.0,1,0.0,,14.0,20.0,60392452,Received this book as a Christmas present. I h...,08/05/2003 0:00,Challenges your assumptions,4.0
5,,1,,,,,60194480,-If you wonder \Where did that promise of a pe...,,then this book is your bible.This is not a his...,
6,0.0,1,0.0,,0.0,0.0,60569662,"Ugly Impostor, discernible face, self esteem r...",09/20/2013 0:00,Reflections,5.0
7,0.0,1,0.0,,3.0,4.0,2311216,I agree with those reviewers that believe that...,06/12/2004 0:00,Not a Thriller,2.0
8,0.0,1,0.0,,0.0,0.0,25853503,Then I watch American Masters (PBS) featuring ...,04/28/2013 0:00,First read GWTW when I was in high school. Sa...,5.0
9,,1,,,,,006053821X,I am a fan of all of Tepper's books. I had be...,,I think.Over all a great read,


Basic pre-processing:

Keep only reviews with a rating of 5 (positive) or 1 (negative).

Keep only the review text and the rating columns.

In [None]:
data_subset = data[(data['Star Rating']==5) | (data['Star Rating']==1)]
# The full dataset has 11 columns and 128,845 rows; keep only reviews with a star rating of 1 or 5.

In [None]:
data_subset = data_subset[['Review Body','Star Rating']]
# Keep only two columns: review body and star rating.


In [None]:
data_subset.shape
# Returns a tuple with the dimensions of the DataFrame: (rows, columns).

(78665, 2)

In [None]:
data_subset.head(10) # Display the first 10 entries

Unnamed: 0,Review Body,Star Rating
0,You will love this book. It is a hard long re...,5.0
2,This is a fun and entertaining book about lear...,5.0
3,"Started a big slow, but once into it the autho...",5.0
6,"Ugly Impostor, discernible face, self esteem r...",5.0
8,Then I watch American Masters (PBS) featuring ...,5.0
10,My granddaughter got me interested bin this se...,5.0
15,I discovered this book just a few days after 9...,5.0
16,This rewrite of an older story is a vast impro...,5.0
18,An incedibly well written book. His writing is...,5.0
19,Great book with lots of twists and turns. One...,5.0


In [None]:
data_sample = data_subset.sample(frac=0.01) # Draw a random 1% sample of the rows.

In [None]:
data_sample.shape

(787, 2)
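
The 787 rows follow from the sampling fraction: 1% of 78,665 is 786.65, which rounds to 787. Because the sample is random, every run picks different rows unless a seed is fixed (pandas exposes this through the `random_state` argument of `DataFrame.sample`). The sketch below uses only the standard library as a stand-in for the DataFrame, so the sampled row positions are illustrative.

```python
import random

n_rows = 78665          # rows in data_subset, from the shape printed earlier
frac = 0.01
sample_size = round(frac * n_rows)
print(sample_size)

# Seeded sampling of row positions: the same seed yields the same rows every time.
first = random.Random(42).sample(range(n_rows), sample_size)
second = random.Random(42).sample(range(n_rows), sample_size)
print(first == second)
```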

Converting ratings to class labels: 1 (positive) or 0 (negative)

In [None]:
def convert_to_label(rating):
  # A 5-star rating is labelled 1 (positive); anything else is labelled 0 (negative).
  if rating == 5.0:
    return 1
  else:
    return 0

data_sample['label'] = data_sample['Star Rating'].apply(convert_to_label)

In [None]:
data_sample.head(10)

Unnamed: 0,Review Body,Star Rating,label
83945,No real strory-line nor entertaining. My boys ...,1.0,0
72999,"I love these books! Relax, unwind and let your...",5.0,1
50112,Judge Judy Sheindlin writes another excellent ...,5.0,1
38090,"Dean Koontz books from Dragon Tears, to Mr. Mu...",1.0,0
10445,gift,5.0,1
21664,My daughter was so excited to get this book an...,5.0,1
55997,I was thrilled to find this novel at the secon...,5.0,1
124989,Thank you for this beautifully written book.,5.0,1
32669,Cowboy is a seemingly simple story of love and...,5.0,1
56423,Quickly shipped. Exactly as described. Thank you!,5.0,1
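
Because the subset contains only 1-star and 5-star reviews, the row-wise labelling function is equivalent to a single vectorised comparison. A small NumPy sketch on made-up ratings (the `ratings` array is illustrative, not taken from the dataset):

```python
import numpy as np

ratings = np.array([1.0, 5.0, 5.0, 1.0, 5.0])   # illustrative star ratings

# True where the rating is 5, then cast the booleans to 0/1 integer labels.
labels = (ratings == 5.0).astype(int)
print(labels)
```

In the notebook the same expression could be applied directly to the column, for example `(data_sample['Star Rating'] == 5.0).astype(int)`.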


In [None]:
# Reshape the data to the (n_samples, 1) shape expected by the ELMo input layer.
X = np.array(data_sample['Review Body'].tolist()).reshape(data_sample.shape[0], 1)  # training data
y = np.array(data_sample['label'].tolist()).reshape(data_sample.shape[0], 1)  # training labels
X.shape, y.shape
print('First Data',X[1])
print('First Label',y[1])

First Data ['I love these books! Relax, unwind and let your mind escape while you lay down the color in any fashion you choose. The patterns are enjoyable. The paper quality is pretty good. Pencils or gel pens recommended as the detail can be very small. I will be ordering more since I think a coloring book and a set of pencils will make a great gift or stocking stuffer.']
First Label [1]
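
The extra axis matters because the input layer is declared with `input_shape=(1,)`: each example is a one-element row holding the review string, and the ELMo layer squeezes that axis away before passing the strings to the module. Here is a NumPy sketch of the same round trip on made-up sentences. `np.squeeze` stands in for `K.squeeze`, and the element-wise comparison stands in for `K.not_equal`.

```python
import numpy as np

reviews = ["great read", "--PAD--", "not for me"]   # illustrative strings

X_demo = np.array(reviews).reshape(len(reviews), 1)  # shape (3, 1), as fed to Keras
flat = np.squeeze(X_demo, axis=1)                    # shape (3,), as seen by the hub module
mask = X_demo != "--PAD--"                           # False where the entry is the padding token

print(X_demo.shape, flat.shape)
print(mask.ravel())
```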


In [None]:
model.fit(X, y,batch_size=16,epochs=5)

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Train on 787 samples


<keras.src.callbacks.history.History at 0x7826e73f11b0>

In [None]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
elmo_embedding_layer (ElmoEm (None, 1024)              4         
_________________________________________________________________
dense (Dense)                (None, 1)                 1025      
Total params: 1,029
Trainable params: 1,029
Non-trainable params: 0
_________________________________________________________________
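
The parameter counts in the summary can be checked by hand. With `train_elmo=True`, the ELMo layer exposes only the three layer-mixing weights and one scaling factor, and the dense layer maps 1024 features to a single output:

```python
elmo_dim = 1024

elmo_trainable = 3 + 1             # aggregation weights (3,) + scaling scalar
dense_params = elmo_dim * 1 + 1    # kernel (1024, 1) + bias (1,)
total = elmo_trainable + dense_params

print(elmo_trainable, dense_params, total)
```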


In [None]:
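# X_test and y_test are not defined in the cells above; a held-out split with the same
# shapes as X and y must be created before running this cell.
test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=1)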
print(f"Test Loss: {test_loss:.4f} - Test Accuracy: {test_accuracy:.4f}")

Test Loss: 0.4513 - Test Accuracy: 0.9034
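
`X_test` and `y_test` are not created in the cells shown here. One way to obtain them, sketched below under that assumption, is to shuffle the row indices once and hold out a fraction before calling `model.fit`. The helper name `holdout_split`, the 20% test share and the seed are illustrative choices, not the notebook's.

```python
import numpy as np

def holdout_split(X, y, test_fraction=0.2, seed=0):
    """Shuffle rows once and split them into train and test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    n_test = int(round(test_fraction * len(X)))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Toy arrays shaped like the notebook's X (strings) and y (0/1 labels).
X_toy = np.array([f"review {i}" for i in range(10)]).reshape(10, 1)
y_toy = (np.arange(10) % 2).reshape(10, 1)

X_train_toy, X_test_toy, y_train_toy, y_test_toy = holdout_split(X_toy, y_toy)
print(X_train_toy.shape, X_test_toy.shape)
```

Evaluating on rows that were also used for fitting would overstate accuracy, so the split should happen before training.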


In [None]:
model.trainable_weights
# These are the weights that are updated during training on our data to minimize the loss.
# They can later be saved and used to productionize this model.

[<tf.Variable 'elmo_embedding_layer_module/aggregation/weights:0' shape=(3,) dtype=float32>,
 <tf.Variable 'elmo_embedding_layer_module/aggregation/scaling:0' shape=() dtype=float32>,
 <tf.Variable 'dense/kernel:0' shape=(1024, 1) dtype=float32>,
 <tf.Variable 'dense/bias:0' shape=(1,) dtype=float32>]

In [None]:
elmo = model.layers[0].elmo # Keep a reference to the hub module wrapped by the ELMo layer


In [None]:
elmo.variables
# These are the internal ELMo variables.
# You can see the power of transfer learning here: ELMo is already a trained model, and these are the variables learned during its initial training.
# Once trained, it captures basic patterns of the English language, which transfer to different data.
# Here we use this trained model to learn to separate positive and negative reviews.
# This lets you use the trained ELMo as a plug-and-play component instead of building your own deep learning layers.

[<tf.Variable 'elmo_embedding_layer_module/aggregation/scaling:0' shape=() dtype=float32>,
 <tf.Variable 'elmo_embedding_layer_module/aggregation/weights:0' shape=(3,) dtype=float32>,
 <tf.Variable 'elmo_embedding_layer_module/bilm/CNN/W_cnn_0:0' shape=(1, 1, 16, 32) dtype=float32>,
 <tf.Variable 'elmo_embedding_layer_module/bilm/CNN/W_cnn_1:0' shape=(1, 2, 16, 32) dtype=float32>,
 <tf.Variable 'elmo_embedding_layer_module/bilm/CNN/W_cnn_2:0' shape=(1, 3, 16, 64) dtype=float32>,
 <tf.Variable 'elmo_embedding_layer_module/bilm/CNN/W_cnn_3:0' shape=(1, 4, 16, 128) dtype=float32>,
 <tf.Variable 'elmo_embedding_layer_module/bilm/CNN/W_cnn_4:0' shape=(1, 5, 16, 256) dtype=float32>,
 <tf.Variable 'elmo_embedding_layer_module/bilm/CNN/W_cnn_5:0' shape=(1, 6, 16, 512) dtype=float32>,
 <tf.Variable 'elmo_embedding_layer_module/bilm/CNN/W_cnn_6:0' shape=(1, 7, 16, 1024) dtype=float32>,
 <tf.Variable 'elmo_embedding_layer_module/bilm/CNN/b_cnn_0:0' shape=(32,) dtype=float32>,
 <tf.Variable 'elmo_

In [None]:
model.layers[0].trainable_weights # Trainable weights for elmo embedding layer

[<tf.Variable 'elmo_embedding_layer_module/aggregation/weights:0' shape=(3,) dtype=float32>,
 <tf.Variable 'elmo_embedding_layer_module/aggregation/scaling:0' shape=() dtype=float32>]

In [None]:
model.predict([["I love this book and will recommend to everyone!!"]])
# You can use model.predict to get the sentiment score of a review.
# However, the results are not promising because we trained for only 5 epochs on a small sample. You can train for more epochs and check the results.

array([[0.8117869]], dtype=float32)
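
The model returns a single score per review rather than a class. Assuming the score is read as the probability of the positive (5-star) class, a threshold turns it into a label. The 0.5 cut-off and the helper name `to_label` are illustrative.

```python
import numpy as np

def to_label(scores, threshold=0.5):
    """Map scores to 1 (positive) when they reach the threshold, else 0 (negative)."""
    return (np.asarray(scores) >= threshold).astype(int)

# The first value is the prediction shown above; the second is made up for contrast.
scores = np.array([[0.8117869], [0.12]])
print(to_label(scores).ravel())
```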
