# **Word Embedding Layers for Deep Learning with Keras**

## **Keras Embedding Layer**

In this part, we'll see how to learn a word embedding while fitting a neural network on a text classification task.

We'll define a small problem with 10 text documents, each a short comment on a piece of work that a student submitted. Each document is labeled positive ("1") or negative ("0"). This is a simple sentiment analysis problem.

In [1]:
# importing libraries
import numpy as np
from tensorflow.keras.preprocessing.text import one_hot

### **Step 1**
First, we define the documents and their class labels.

In [2]:
corpus = ['Well done!','Good work','Great effort',
        'nice work','Excellent!','Weak','Poor effort!',
        'not good','poor work','Could have done better.']
# define class labels
labels = np.array([1,1,1,1,1,0,0,0,0,0])

In [3]:
corpus

['Well done!',
 'Good work',
 'Great effort',
 'nice work',
 'Excellent!',
 'Weak',
 'Poor effort!',
 'not good',
 'poor work',
 'Could have done better.']

### **Step 2**
Next, we integer-encode every document so that the embedding layer receives sequences of integers as input.

The **one_hot()** function in Keras hashes each word to an integer index, which gives a compact integer encoding without building a vocabulary first. Despite its name, it returns integer indices rather than one-hot vectors. We will assume a **vocabulary size** of 100, far more than the number of distinct words in the corpus, to lower the likelihood of hash collisions.
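To get a feel for how likely collisions remain at this vocabulary size, a rough birthday-problem estimate helps. The sketch below rests on two assumptions that are not taken from the Keras documentation: the hash spreads words uniformly, and indices 1 to n-1 are usable, which gives 99 buckets. `collision_probability` is a hypothetical helper, and the tokenization is simplified.

```python
# Rough estimate of the chance that at least two distinct words share an index.
# Assumptions: uniform hashing and n - 1 usable buckets (indices 1..n-1).

def collision_probability(num_words: int, num_buckets: int) -> float:
    """Probability that at least two of num_words land in the same bucket."""
    p_no_collision = 1.0
    for i in range(num_words):
        p_no_collision *= (num_buckets - i) / num_buckets
    return 1.0 - p_no_collision

corpus = ['Well done!', 'Good work', 'Great effort',
          'nice work', 'Excellent!', 'Weak', 'Poor effort!',
          'not good', 'poor work', 'Could have done better.']

# distinct words after lowercasing and stripping basic punctuation
words = {w.strip("!.,?").lower() for doc in corpus for w in doc.split()}

p = collision_probability(len(words), 100 - 1)
print(len(words), round(p, 2))
```

Under these assumptions the estimate comes out at roughly 0.6 for the 14 distinct words, so a vocabulary size of 100 reduces collisions but does not rule them out.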

In [4]:
# as discussed above, first define the vocabulary size
vocabSize = 100

#### Integer encoding documents using one_hot representation.
**one_hot** takes the text to convert and the vocabulary size, and returns a list of integer indices, one per word.
* **input_text**: Input text (string).
* **n**: int. Size of vocabulary.

In [5]:
# applying one_hot
oneHotRepresentation = [one_hot(input_text = document, n = vocabSize) for document in corpus]

# displaying representation
oneHotRepresentation

[[71, 70],
 [86, 55],
 [56, 77],
 [81, 55],
 [15],
 [62],
 [50, 77],
 [38, 86],
 [50, 55],
 [29, 77, 70, 81]]
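The encoded output above shows that collisions can still occur: 'effort' and 'have' both map to 77, and 'nice' and 'better' both map to 81. The following is a minimal sketch of hash-based integer encoding in plain Python, useful for spotting such collisions. It is not the Keras implementation: `tokenize`, `encode`, and `find_collisions` are hypothetical helpers, the tokenizer is simplified, and MD5 is used only because it is stable across runs, so the indices will differ from those shown above.

```python
import hashlib
from collections import defaultdict

def tokenize(text):
    """Lowercase and strip basic punctuation (simplified tokenizer)."""
    return [w.strip("!.,?").lower() for w in text.split()]

def encode(text, n):
    """Map each word to an index in 1..n-1 using a stable MD5 hash."""
    return [int(hashlib.md5(w.encode()).hexdigest(), 16) % (n - 1) + 1
            for w in tokenize(text)]

def find_collisions(corpus, n):
    """Return {index: sorted words} for indices shared by 2+ distinct words."""
    buckets = defaultdict(set)
    for doc in corpus:
        for word, idx in zip(tokenize(doc), encode(doc, n)):
            buckets[idx].add(word)
    return {i: sorted(ws) for i, ws in buckets.items() if len(ws) > 1}

corpus = ['Well done!', 'Good work', 'Great effort',
          'nice work', 'Excellent!', 'Weak', 'Poor effort!',
          'not good', 'poor work', 'Could have done better.']

encoded = [encode(doc, 100) for doc in corpus]
print(encoded)
print(find_collisions(corpus, 100))  # empty dict if no two words collide
```

Because colliding words share one embedding vector, a check like this can help decide whether the vocabulary size should be increased.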

## **Defining Embedding layer**

Before passing the sequences to the Keras embedding layer, note that their lengths vary, while Keras expects inputs of uniform length. Here we pad every input sequence to a **length of 10** using the **pad_sequences()** function that ships with Keras.

**Args:**
* **sequences**: List of sequences (each sequence is a list of integers).
* **maxlen**: Optional Int, maximum length of all sequences. If not provided,
        sequences will be padded to the length of the longest individual
        sequence.
* dtype: (Optional, defaults to int32). Type of the output sequences.
        To pad sequences with variable length strings, you can use `object`.
* **padding**: String, 'pre' or 'post' (optional, defaults to 'pre'):
        pad either before or after each sequence.
* truncating: String, 'pre' or 'post' (optional, defaults to 'pre'):
        remove values from sequences larger than
        `maxlen`, either at the beginning or at the end of the sequences.
* value: Float or String, padding value. (Optional, defaults to 0.)

**Returns**:
    Numpy array with shape `(len(sequences), maxlen)`
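The default 'pre' behaviour for both padding and truncating can be sketched in plain Python as follows. `pad_pre` is a hypothetical helper written for illustration, not the Keras function; it returns nested lists rather than a NumPy array and assumes `maxlen` is a positive integer.

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad (and left-truncate) each sequence to exactly maxlen items."""
    padded = []
    for seq in sequences:
        kept = list(seq)[-maxlen:]  # 'pre' truncation keeps the last maxlen items
        padded.append([value] * (maxlen - len(kept)) + kept)
    return padded

# shorter sequences are padded on the left
print(pad_pre([[71, 70], [15], [29, 77, 70, 81]], maxlen=4))
# [[0, 0, 71, 70], [0, 0, 0, 15], [29, 77, 70, 81]]

# longer sequences lose their first items
print(pad_pre([[1, 2, 3, 4, 5]], maxlen=3))
# [[3, 4, 5]]
```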

In [6]:
# importing pad sequences from keras
from tensorflow.keras.preprocessing.sequence import pad_sequences

In [7]:
sentLen = 10
paddedCorpus = pad_sequences(sequences = oneHotRepresentation, maxlen = sentLen, padding='pre')
print(paddedCorpus)

[[ 0  0  0  0  0  0  0  0 71 70]
 [ 0  0  0  0  0  0  0  0 86 55]
 [ 0  0  0  0  0  0  0  0 56 77]
 [ 0  0  0  0  0  0  0  0 81 55]
 [ 0  0  0  0  0  0  0  0  0 15]
 [ 0  0  0  0  0  0  0  0  0 62]
 [ 0  0  0  0  0  0  0  0 50 77]
 [ 0  0  0  0  0  0  0  0 38 86]
 [ 0  0  0  0  0  0  0  0 50 55]
 [ 0  0  0  0  0  0 29 77 70 81]]


We are now ready to define the embedding layer of our neural network model.

The Embedding layer has a **vocabulary of 100** and an **input length of 10**. We'll pick a small embedding space of only **8 dimensions**.

### **Embedding Layer Important Arguments**
* **input_dim**: Integer. Size of the vocabulary,
    i.e. maximum integer index + 1.
* **output_dim**: Integer. Dimension of the dense embedding.
* **input_length**: Length of input sequences, when it is constant.
    This argument is required if you are going to connect
    `Flatten` then `Dense` layers upstream
    (without it, the shape of the dense outputs cannot be computed).
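Conceptually, an embedding layer is a trainable lookup table with `input_dim` rows and `output_dim` columns. The plain-Python sketch below illustrates only the lookup step; the table values are random placeholders (an assumption for illustration), and `embed` is a hypothetical helper, not part of Keras.

```python
import random

vocab_size, embed_dim = 100, 8
random.seed(0)

# one row of embed_dim weights per vocabulary index;
# in a real layer these start random and are updated during training
table = [[random.uniform(-0.05, 0.05) for _ in range(embed_dim)]
         for _ in range(vocab_size)]

def embed(sequence):
    """Replace each integer index with its row from the lookup table."""
    return [table[idx] for idx in sequence]

padded_doc = [0, 0, 0, 0, 0, 0, 0, 0, 71, 70]  # 'Well done!' after padding
vectors = embed(padded_doc)

print(len(vectors), len(vectors[0]))  # 10 8
print(vectors[0] == vectors[1])       # True: both padding positions use row 0
```

Each padded document therefore becomes a 10 x 8 grid of numbers, which matches the `(None, 10, 8)` output shape of the embedding layer for sequences of length 10.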

In [8]:
# importing the Embedding, Flatten and Dense layers from Keras
from tensorflow.keras.layers import Embedding, Flatten, Dense
# importing the Sequential model
from tensorflow.keras.models import Sequential

In [9]:
# embedding dimension (number of features per word)
outDim = 8

# creating sequential model
model = Sequential()
# adding embedding layer
model.add(Embedding(input_dim = vocabSize, output_dim = outDim, input_length=sentLen))
# flattening the (10, 8) embedding output into a single 80-element vector
model.add(Flatten())
# adding a single sigmoid unit for binary classification
model.add(Dense(1, activation = "sigmoid"))

In [10]:
# displaying the summary of our model
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 10, 8)             800       
                                                                 
 flatten (Flatten)           (None, 80)                0         
                                                                 
 dense (Dense)               (None, 1)                 81        
                                                                 
Total params: 881
Trainable params: 881
Non-trainable params: 0
_________________________________________________________________
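The parameter counts in the summary can be checked by hand. The short calculation below uses only the sizes chosen earlier (vocabulary 100, embedding dimension 8, sequence length 10); the variable names are illustrative.

```python
vocab_size, embed_dim, seq_len = 100, 8, 10

embedding_params = vocab_size * embed_dim  # one 8-dimensional vector per index
flatten_units = seq_len * embed_dim        # 10 x 8 values flattened into one vector
dense_params = flatten_units * 1 + 1       # 80 weights plus 1 bias for the single unit
total_params = embedding_params + dense_params

print(embedding_params, flatten_units, dense_params, total_params)
# 800 80 81 881
```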


In [11]:
# compiling our model
model.compile(optimizer = "adam", loss="binary_crossentropy", metrics = ["accuracy"])

In [12]:
model.fit(x = paddedCorpus, y = labels, epochs = 50)

Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x7ffaf8120fa0>

## **Evaluating the model**

In [13]:
loss, accuracy = model.evaluate(x = paddedCorpus, y = labels)



In [14]:
print(f'Accuracy: {accuracy*100}')
print(f'Loss: {loss}')

Accuracy: 100.0
Loss: 0.6414653062820435
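An accuracy of 100% alongside a loss of about 0.64 is not contradictory: accuracy only checks which side of 0.5 each prediction falls on, while binary cross-entropy measures how confident the predictions are. Note also that the model is evaluated on the same 10 documents it was trained on. The sketch below uses made-up probabilities (an assumption, not the model's actual outputs) that all sit barely on the correct side of 0.5; `binary_crossentropy` and `accuracy` are hypothetical helpers written for illustration.

```python
import math

def binary_crossentropy(y_true, y_prob):
    """Mean binary cross-entropy over all examples."""
    losses = [-(y * math.log(p) + (1 - y) * math.log(1 - p))
              for y, p in zip(y_true, y_prob)]
    return sum(losses) / len(losses)

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    hits = [int(p > threshold) == y for y, p in zip(y_true, y_prob)]
    return sum(hits) / len(hits)

labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
# hypothetical probabilities: every prediction is correct, but only just
probs = [0.53] * 5 + [0.47] * 5

print(accuracy(labels, probs))                       # 1.0
print(round(binary_crossentropy(labels, probs), 3))  # 0.635
```

A loss this close to ln(2) ≈ 0.693 suggests predictions that are still near 0.5, so more training epochs or data would likely be needed before the model's confidence is meaningful.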
