## Embedding layer

An [embedding layer](https://keras.io/layers/embeddings/) turns integers into dense vectors of fixed size. 

In our case it will translate word IDs from the vocabulary into word vectors.

Usually the word vectors are initialized with small random numbers and learned during training. 

It is also possible to initialize an embedding layer with pre-trained word vectors. 

A common scenario is to pre-train word vectors on a large corpus such as Wikipedia and then reuse them in other models. The idea is to inject general language understanding into the training process.
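
As a rough sketch of how pre-trained vectors might be arranged into a weight matrix, the snippet below uses NumPy only. The `pretrained` dictionary and the `vocab` mapping are made-up placeholders, not real data. Words without a pre-trained vector keep a small random initialization.

```python
import numpy as np

# Hypothetical pre-trained vectors (word -> vector). Real ones would be
# loaded from a file produced by training on a large corpus.
pretrained = {
    "cat": np.array([0.1, 0.2, 0.3]),
    "dog": np.array([0.4, 0.5, 0.6]),
}

# Hypothetical vocabulary: word -> consecutive word ID starting at 0.
vocab = {"cat": 0, "dog": 1, "bird": 2}
embedding_dims = 3

# Start from small random numbers, as for vectors learned from scratch.
rng = np.random.default_rng(0)
weights = rng.uniform(-0.05, 0.05, size=(len(vocab), embedding_dims))

# Overwrite the rows of all words that have a pre-trained vector.
for word, word_id in vocab.items():
    if word in pretrained:
        weights[word_id] = pretrained[word]

print(weights.shape)
```

Row `i` of the resulting matrix holds the vector for word ID `i`, which is the layout an embedding layer expects for its weights.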

In [35]:
import numpy as np
from keras.layers import Input, Dense
from keras.layers import Embedding
from keras.models import Model

# Size of the vocabulary. Word IDs must
# start at 0 and be consecutive.
vocab_size = 3

# Size of word vectors
embedding_dims = 10

# Pre-define word vectors:
# word 0 => [0, 0,...]
# word 1 => [10, 10,...]
# word 2 => [20, 20,...]
word_vectors = []
for i in range(vocab_size):
    wvec = np.empty((embedding_dims,))
    wvec.fill(i*10)
    word_vectors.append(wvec)
weights = np.vstack(word_vectors)

# Size of word input
input_size = 5

inputs = Input(shape=(input_size,), dtype='int32') 
outputs = Embedding(vocab_size, embedding_dims, weights=[weights])(inputs)
model = Model([inputs], [outputs])
model.compile(optimizer='sgd', loss='mse')

word_ids = np.array([[0,1,1,1,2]])
word_vectors = model.predict(word_ids)

print('Word input:')
print(word_ids)
print('\nWord vectors:')
print(word_vectors)

Word input:
[[0 1 1 1 2]]

Word vectors:
[[[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
  [10. 10. 10. 10. 10. 10. 10. 10. 10. 10.]
  [10. 10. 10. 10. 10. 10. 10. 10. 10. 10.]
  [10. 10. 10. 10. 10. 10. 10. 10. 10. 10.]
  [20. 20. 20. 20. 20. 20. 20. 20. 20. 20.]]]
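
Conceptually, the lookup above amounts to indexing the rows of the weight matrix with the word IDs. The following NumPy sketch rebuilds the same weight matrix and reproduces the result without Keras. It illustrates the idea and is not how Keras implements the layer internally.

```python
import numpy as np

vocab_size, embedding_dims = 3, 10

# Same weights as in the example: row i is filled with the value i * 10.
weights = np.arange(vocab_size)[:, None] * 10.0 * np.ones((1, embedding_dims))

# One sequence of five word IDs (batch size 1).
word_ids = np.array([[0, 1, 1, 1, 2]])

# Integer-array indexing picks one row of `weights` per word ID.
looked_up = weights[word_ids]

print(looked_up.shape)  # (1, 5, 10)
```

The output has one extra axis compared to the input: each word ID is replaced by its vector of length `embedding_dims`.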


## Lambda layer

A [lambda layer](https://keras.io/layers/core/#lambda) is a simple way to add custom functionality to a model. 

It is best used for stateless functions. For stateful functions it is better to implement a separate layer.

In our case we'll use a lambda layer to slice the output from an embedding layer into individual word vectors.

In [36]:
import numpy as np
from keras.layers import Input, Lambda
from keras.models import Model

# define the input shape
inputs = Input(shape=(4,4), dtype='float32')

# output the first row of the input matrix
first_row = Lambda(lambda x: x[:,0,:])(inputs)

# output the first column of the input matrix
first_column = Lambda(lambda x: x[:,:,0])(inputs)

# output all values multiplied by 2
mul_2 = Lambda(lambda x: x*2.0)(inputs)

model = Model(inputs=inputs, outputs=[first_row, first_column, mul_2])
model.compile(optimizer='sgd', loss='mse')

x = np.array([
    [[1,1,1,1],
     [2,2,2,2],
     [3,3,3,3],
     [4,4,4,4]]
], dtype=np.float32)

x,y,z = model.predict(x)
print('First row:')
print(x)
print('\nFirst column:')
print(y)
print('\nMultiplied by 2:')
print(z)

First row:
[[1. 1. 1. 1.]]

First column:
[[1. 2. 3. 4.]]

Multiplied by 2:
[[[2. 2. 2. 2.]
  [4. 4. 4. 4.]
  [6. 6. 6. 6.]
  [8. 8. 8. 8.]]]
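
Note that the function passed to a lambda layer receives the whole batch, so axis 0 is the batch axis and the leading `:` in each slice keeps every sample. A small NumPy sketch with a made-up batch of two matrices shows the resulting shapes:

```python
import numpy as np

# Hypothetical batch of two 4x4 matrices holding the values 0..31.
batch = np.arange(2 * 4 * 4, dtype=np.float32).reshape(2, 4, 4)

# First row of every matrix in the batch.
first_row = batch[:, 0, :]

# First column of every matrix in the batch.
first_column = batch[:, :, 0]

print(first_row.shape)     # (2, 4)
print(first_column.shape)  # (2, 4)
```

Each slice drops one of the two matrix axes but keeps the batch axis, so the result has one vector per sample.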


## Merge layer

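[Merge layers](https://keras.io/layers/merge/) take a list of tensors as input, aggregate them with some operation, and return a single tensor. Element-wise merges such as add or average require inputs of the same shape and return a tensor of that shape. Concatenate joins the inputs along one axis, so its output is larger along that axis.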

Merge operations are:
 * concatenate
 * average
 * sum
 * min/max
 * etc.

In [41]:
import numpy as np
from keras.layers import Input, Concatenate, Lambda, Average, Dense, Add
from keras.models import Model

# define the input shape
inputs = Input(shape=(4,4), dtype='float32')

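# slice by row; `i=i` binds the current index to each lambda, so the
# slice stays correct even if Keras calls the function again later
word_vector_rows = [Lambda(lambda x, i=i: x[:,i,:])(inputs) for i in range(4)]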

# concatenate rows
concat_out = Concatenate()(word_vector_rows)

# average rows
avg_out = Average()(word_vector_rows)

# add rows
sum_out = Add()(word_vector_rows)

model = Model(inputs=inputs, outputs=[concat_out, avg_out, sum_out])
model.compile(optimizer='sgd', loss='mse')

x = np.array([
    [
    [1,1,1,1],
    [2,2,2,2],
    [3,3,3,3],
    [4,4,4,4],        
    ]
], dtype=np.float32)

concat_out_val, avg_out_val, sum_out_val = model.predict(x)

print('Concatenated rows:')
print(concat_out_val)
print('\nAveraged rows:')
print(avg_out_val)
print('\nSummed rows:')
print(sum_out_val)

Concatenated rows:
[[1. 1. 1. 1. 2. 2. 2. 2. 3. 3. 3. 3. 4. 4. 4. 4.]]

Averaged rows:
[[2.5 2.5 2.5 2.5]]

Summed rows:
[[10. 10. 10. 10.]]


**Note:** This is basically a **reduce** operation over a list of tensors, but a generic reduce is not available in Keras.
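
To make the reduce analogy in the note above concrete, here is a sketch outside Keras that uses NumPy and Python's `functools.reduce` on the same input matrix. It only illustrates the idea and is not a Keras feature.

```python
from functools import reduce

import numpy as np

x = np.array([
    [[1, 1, 1, 1],
     [2, 2, 2, 2],
     [3, 3, 3, 3],
     [4, 4, 4, 4]]
], dtype=np.float32)

# Slice the matrix into its four rows, each of shape (1, 4).
rows = [x[:, i, :] for i in range(4)]

# Sum: fold the list with element-wise addition.
sum_out = reduce(np.add, rows)

# Average: the sum divided by the number of inputs.
avg_out = sum_out / len(rows)

# Concatenate: fold the list by joining pairs along the last axis.
concat_out = reduce(lambda a, b: np.concatenate([a, b], axis=-1), rows)

print(sum_out)           # [[10. 10. 10. 10.]]
print(avg_out)           # [[2.5 2.5 2.5 2.5]]
print(concat_out.shape)  # (1, 16)
```

Each merge applies one binary operation repeatedly across the list, which is why the values match the outputs of `Add`, `Average`, and `Concatenate` shown above.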