# T81-558: Applications of Deep Neural Networks
**Module 11: Natural Language Processing and Speech Recognition**
* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), McKelvey School of Engineering, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)
* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/).

# Module 11 Material

* Part 11.1: Getting Started with Spacy in Python [[Video]](https://www.youtube.com/watch?v=bv_iVVrlfbU) [[Notebook]](t81_558_class_11_01_spacy.ipynb)
* Part 11.2: Word2Vec and Text Classification [[Video]](https://www.youtube.com/watch?v=qN9hHlZKIL4) [[Notebook]](t81_558_class_11_02_word2vec.ipynb)
* **Part 11.3: What are Embedding Layers in Keras** [[Video]](https://www.youtube.com/watch?v=Ae3GVw5nTYU) [[Notebook]](t81_558_class_11_03_embedding.ipynb)
* Part 11.4: Natural Language Processing with Spacy and Keras [[Video]](https://www.youtube.com/watch?v=Ae3GVw5nTYU) [[Notebook]](t81_558_class_11_04_text_nlp.ipynb)
* Part 11.5: Learning English from Scratch with Keras and TensorFlow [[Video]](https://www.youtube.com/watch?v=Ae3GVw5nTYU) [[Notebook]](t81_558_class_11_05_english_scratch.ipynb)

# Part 11.3: What are Embedding Layers in Keras

[Embedding Layers](https://keras.io/layers/embeddings/) are a Keras feature that lets a neural network automatically replace each integer index in its input with a vector. In the previous section you saw that Word2Vec can map each word to a 300-dimensional vector. An embedding layer lets you insert these 300-dimensional vectors in place of word indexes.

Embedding layers are often used with Natural Language Processing (NLP); however, you can use them anywhere you wish to insert a larger vector in place of an index value. In a sense, an embedding layer performs dimension expansion: a single integer becomes a vector of several numbers. The hope is that these additional dimensions give the model more information and lead to a better score.

### Simple Embedding Layer Example

* **input_dim** = How large is the vocabulary? How many categories are you encoding? This is the number of rows in your "lookup table".
* **output_dim** = How many numbers are in the vector that you wish to return for each index?
* **input_length** = How many items are in the input feature vector that you need to transform?

Now we create an embedding layer with a vocabulary size of 10, which maps each index between 0 and 9 to a vector of 4 numbers. Each incoming feature vector contains 2 such indexes. This neural network does nothing more than pass the embedding on to the output, but it lets us see what the embedding is doing.

In [1]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding
import numpy as np

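# A model made of a single embedding layer: 10 possible indexes,
# each mapped to a vector of 4 numbers, with 2 indexes per input row.
model = Sequential()
embedding_layer = Embedding(input_dim=10, output_dim=4, input_length=2)
model.add(embedding_layer)
# The model is never trained, so the optimizer and loss are placeholders.
model.compile('adam', 'mse')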

Now let's query the neural network with a single row that contains two indexes.

In [2]:
input_data = np.array([
    [1,2]
])

pred = model.predict(input_data)

print(input_data.shape)
print(pred)

(1, 2)
[[[ 0.04763632  0.03387379  0.02331975  0.03487139]
  [-0.02743584  0.00659242 -0.03050996  0.00233712]]]


In [3]:
embedding_layer.get_weights()

[array([[-0.0045323 ,  0.04901491,  0.02596814,  0.02962707],
        [ 0.04763632,  0.03387379,  0.02331975,  0.03487139],
        [-0.02743584,  0.00659242, -0.03050996,  0.00233712],
        [-0.04829236, -0.04555564, -0.0387257 , -0.011488  ],
        [-0.02604507, -0.01598718, -0.00531778, -0.04186999],
        [-0.00292976,  0.01803044,  0.03412081,  0.03287293],
        [ 0.0453856 , -0.01890322, -0.0041332 , -0.02499459],
        [-0.03981459,  0.02295792, -0.00151055, -0.04211504],
        [ 0.03971988, -0.02347859, -0.02527274,  0.02243959],
        [-0.04731938,  0.0447234 , -0.04105244, -0.02245835]],
       dtype=float32)]
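
Comparing the two outputs above, the prediction for the input `[1, 2]` is simply rows 1 and 2 of the weight matrix. The following is a minimal sketch of that lookup in plain Python, using the first three weight rows printed above. The helper name `embed` is hypothetical and is not part of Keras.

```python
# First three rows (indexes 0, 1, and 2) of the weight matrix printed above.
weights = [
    [-0.0045323, 0.04901491, 0.02596814, 0.02962707],
    [0.04763632, 0.03387379, 0.02331975, 0.03487139],
    [-0.02743584, 0.00659242, -0.03050996, 0.00233712],
]


def embed(batch, table):
    """Replace every index in every row of `batch` with its row from `table`."""
    return [[table[index] for index in row] for row in batch]


pred = embed([[1, 2]], weights)

# One input row, two indexes per row, four numbers per index: shape (1, 2, 4).
print(len(pred), len(pred[0]), len(pred[0][0]))
print(pred[0][0])  # same values as row 1 of the table
print(pred[0][1])  # same values as row 2 of the table
```

Because the layer has not been trained, these values are just the random initial weights; training would adjust the rows of the table, not the lookup itself.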

### Transferring An Embedding

Because the weights of an embedding layer are just a lookup table, you can set them directly instead of learning them. The following example loads a 3x3 identity matrix as the lookup table, so the layer returns each index as its one-hot encoding.

In [4]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding
import numpy as np

embedding_lookup = np.array([
    [1,0,0],
    [0,1,0],
    [0,0,1]
])

model = Sequential()
embedding_layer = Embedding(input_dim=3, output_dim=3, input_length=2)
model.add(embedding_layer)
model.compile('adam', 'mse')

embedding_layer.set_weights([embedding_lookup])

In [5]:
input_data = np.array([
    [0,1]
])

pred = model.predict(input_data)

print(input_data.shape)
print(pred)

(1, 2)
[[[1. 0. 0.]
  [0. 1. 0.]]]
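
One way to read this result is that an embedding lookup gives the same answer as multiplying a one-hot row vector by the weight matrix, without ever building the one-hot vector. The sketch below checks that equivalence in plain Python. The function names and the values in `table` are made up for illustration.

```python
def one_hot_vector(index, size):
    """Return a list of zeros with a single 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def lookup_by_matmul(index, table):
    """Multiply the one-hot row vector for `index` by `table`."""
    one_hot = one_hot_vector(index, len(table))
    n_cols = len(table[0])
    return [
        sum(one_hot[r] * table[r][c] for r in range(len(table)))
        for c in range(n_cols)
    ]


# The identity matrix used as the lookup table in the cell above.
identity = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# With identity weights, each index comes back as its own one-hot encoding.
encoded = [identity[i] for i in [0, 1]]
print(encoded)

# Hypothetical 3x2 table: direct row lookup and the matrix product agree.
table = [[0.5, -1.0], [2.0, 0.25], [-3.0, 4.0]]
for i in range(len(table)):
    print(i, table[i], lookup_by_matmul(i, table))
```

The direct lookup is preferable in practice because it avoids storing and multiplying vectors that are almost entirely zeros.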


### Training an Embedding

Instead of transferring weights, you can let the network learn the embedding as part of training. The following example trains an embedding while classifying short restaurant reviews as negative or positive.

In [6]:
from numpy import array
from tensorflow.keras.preprocessing.text import one_hot
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Embedding, Dense

In [7]:
# Define 10 restaurant reviews.
reviews = [
    'Never coming back!',
    'Horrible service',
    'Rude waitress',
    'Cold food.',
    'Horrible food!',
    'Awesome',
    'Awesome service!',
    'Rocks!',
    'poor work',
    'Couldn\'t have done better']

# Define labels (1=negative, 0=positive)
labels = array([1,1,1,1,1,0,0,0,0,0])

In [8]:
VOCAB_SIZE = 50
encoded_reviews = [one_hot(d, VOCAB_SIZE) for d in reviews]
print(f"Encoded reviews: {encoded_reviews}")

Encoded reviews: [[28, 25, 8], [23, 12], [34, 26], [37, 45], [23, 45], [33], [33, 12], [29], [33, 11], [20, 38, 22, 21]]
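
Despite its name, `one_hot` returns integer word indexes, not one-hot vectors. The Keras documentation describes it as a wrapper around a hashing function and does not guarantee that each word gets a unique index. The output above shows such a collision: "Awesome" and "poor" both map to 33. The sketch below illustrates the general idea with a hypothetical helper, `hashed_indexes`. It is not the Keras implementation; the use of MD5 (chosen because it gives the same result on every run) and the simple tokenizer are assumptions, so the indexes it prints will not match the output above.

```python
import hashlib
import re


def hashed_indexes(text, n):
    """Map each word of `text` to an integer in the range 1 to n - 1.

    Illustrative only: MD5 and the regular-expression tokenizer are
    assumptions of this sketch, not what Keras uses internally.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    indexes = []
    for word in words:
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        # Index 0 is left unused so that it can serve as the padding value.
        indexes.append(int(digest, 16) % (n - 1) + 1)
    return indexes


VOCAB_SIZE = 50
for review in ["Horrible service", "Awesome service!", "Horrible food!"]:
    print(review, "->", hashed_indexes(review, VOCAB_SIZE))

# 60 distinct words cannot fit into 49 indexes without sharing some of them.
many_words = " ".join(f"word{i}" for i in range(60))
distinct = set(hashed_indexes(many_words, VOCAB_SIZE))
print(f"60 words mapped to {len(distinct)} distinct indexes")
```

A larger vocabulary size makes collisions less likely at the cost of a larger embedding table.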


In [9]:
MAX_LENGTH = 4

padded_reviews = pad_sequences(encoded_reviews, maxlen=MAX_LENGTH, padding='post')
print(padded_reviews)

[[28 25  8  0]
 [23 12  0  0]
 [34 26  0  0]
 [37 45  0  0]
 [23 45  0  0]
 [33  0  0  0]
 [33 12  0  0]
 [29  0  0  0]
 [33 11  0  0]
 [20 38 22 21]]
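
The embedding layer expects every input row to have the same length, which is why the shorter reviews are filled out with zeros. Since the encoded words above all have indexes of 1 or greater, the padding value 0 does not clash with a real word. As a rough sketch of what `padding='post'` does, the hypothetical helper `pad_post` below reproduces the padding in plain Python. Its handling of over-long sequences (keeping the last `maxlen` items) is an assumption meant to mirror the Keras default of `truncating='pre'`.

```python
def pad_post(sequences, maxlen, value=0):
    """Right-pad each sequence with `value` until it has `maxlen` items.

    Sequences longer than `maxlen` keep only their last `maxlen` items
    (an assumption intended to mirror truncating='pre' in Keras).
    `maxlen` must be a positive integer.
    """
    padded = []
    for seq in sequences:
        kept = list(seq)[-maxlen:]
        padded.append(kept + [value] * (maxlen - len(kept)))
    return padded


# A few of the encoded reviews from the output above.
encoded = [[28, 25, 8], [23, 12], [33], [20, 38, 22, 21]]
for row in pad_post(encoded, maxlen=4):
    print(row)
```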


In [10]:
model = Sequential()
embedding_layer = Embedding(VOCAB_SIZE, 8, input_length=MAX_LENGTH)
model.add(embedding_layer)
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])

print(model.summary())

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_2 (Embedding)      (None, 4, 8)              400       
_________________________________________________________________
flatten (Flatten)            (None, 32)                0         
_________________________________________________________________
dense (Dense)                (None, 1)                 33        
Total params: 433
Trainable params: 433
Non-trainable params: 0
_________________________________________________________________
None
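
The parameter counts in the summary follow directly from the layer sizes. The short calculation below reproduces them; the constant name `EMBEDDING_DIM` is introduced here for readability and stands for the value 8 passed to `Embedding` above.

```python
VOCAB_SIZE = 50     # input_dim: number of rows in the lookup table
EMBEDDING_DIM = 8   # output_dim: numbers returned per index
MAX_LENGTH = 4      # input_length: indexes per review

# Embedding: one vector of EMBEDDING_DIM weights for each vocabulary index.
embedding_params = VOCAB_SIZE * EMBEDDING_DIM

# Flatten: the (4, 8) output becomes one vector of 32 values; nothing to learn.
flattened_size = MAX_LENGTH * EMBEDDING_DIM

# Dense(1): one weight per flattened input plus one bias.
dense_params = flattened_size * 1 + 1

total_params = embedding_params + dense_params
print(embedding_params, flattened_size, dense_params, total_params)
```

Note that the embedding table accounts for 400 of the 433 trainable parameters, even though only a handful of its 50 rows are ever looked up by these ten reviews.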


In [11]:
# fit the model
model.fit(padded_reviews, labels, epochs=100, verbose=0)

<tensorflow.python.keras.callbacks.History at 0xb2c785fd0>

In [12]:
print(embedding_layer.get_weights()[0].shape)
print(embedding_layer.get_weights())

(50, 8)
[array([[-1.42880306e-01,  7.46818930e-02,  9.94808897e-02,
         1.36440024e-01, -1.48050189e-01,  9.37938169e-02,
        -7.04777688e-02,  1.75872326e-01],
       [-3.65085602e-02,  3.53831686e-02, -3.70737202e-02,
        -3.20171490e-02, -5.22566959e-03, -5.51359728e-03,
         3.18039916e-02,  3.27227823e-02],
       [-2.28793267e-02,  2.16226839e-02, -1.94333792e-02,
        -2.59495024e-02,  2.41860189e-02,  4.36949022e-02,
        -1.31914727e-02,  5.95686585e-03],
       [-1.07009411e-02,  8.18883255e-03,  1.44555606e-02,
        -3.97174582e-02, -2.89748311e-02, -9.09373909e-03,
        -4.78551649e-02, -4.53313850e-02],
       [-6.10606745e-03,  4.13400196e-02, -3.92818451e-03,
        -7.62758404e-03, -2.78581865e-02,  4.35506739e-02,
        -8.15489143e-03,  2.98755057e-02],
       [ 2.77605541e-02, -2.05098037e-02,  4.89966162e-02,
        -1.58568136e-02,  1.38417743e-02,  2.83907726e-03,
         6.96331263e-03, -4.34809811e-02],
       [ 1.32830851e-02, 

In [13]:
loss, accuracy = model.evaluate(padded_reviews, labels, verbose=0)
print(f'Accuracy: {accuracy}')

Accuracy: 1.0
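
Keep in mind that this accuracy is measured on the same ten reviews that the model was trained on, so it shows that the network can fit them, not that it will generalize to new reviews. For a sigmoid output, accuracy is typically computed by thresholding each predicted probability at 0.5 and comparing the result with the label. The sketch below shows that calculation; the function name and the probabilities are hypothetical, not output from the model above.

```python
def binary_accuracy(probabilities, labels, threshold=0.5):
    """Fraction of predictions that match the labels after thresholding."""
    predictions = [1 if p > threshold else 0 for p in probabilities]
    correct = sum(1 for pred, label in zip(predictions, labels) if pred == label)
    return correct / len(labels)


# Labels from the example above (1=negative, 0=positive).
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

# Hypothetical sigmoid outputs, invented for this sketch.
probabilities = [0.91, 0.84, 0.77, 0.69, 0.88, 0.12, 0.25, 0.31, 0.40, 0.08]

print(f"Accuracy: {binary_accuracy(probabilities, labels)}")
```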
