In [1]:
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Model, Sequential
from tensorflow import math

In [10]:
vocab_size = 500
model_dimension = 128

LSTM = Sequential()
LSTM.add(layers.Embedding(input_dim = vocab_size, output_dim=model_dimension))
LSTM.add(layers.LSTM(units=model_dimension, return_sequences=True))
LSTM.add(layers.AveragePooling1D(pool_size = 2))
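# Normalize each 128-dim vector to unit length (axis=-1); with the default
# axis=None the norm would be taken over the whole tensor, batch included.
LSTM.add(layers.Lambda(lambda x: math.l2_normalize(x, axis=-1)))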

input1 = layers.Input(shape=(None,))
input2 = layers.Input(shape=(None,))

conc = layers.Concatenate(axis = 1)((LSTM(input1), LSTM(input2)))

Siamese = Model(inputs=(input1, input2), outputs=conc)

Siamese.summary()

1. Layer Dimensions
Here's a breakdown of the layers in your LSTM model:

a. Embedding Layer
Input dimension (input_dim): The size of the vocabulary, set as 500. This is the number of distinct token IDs the layer accepts, so every input ID must be an integer in the range 0 to 499.
Output dimension (output_dim): The dimensionality of the embeddings, set as 128. Each word in your vocabulary will be represented as a 128-dimensional vector.
The output of this layer for each word in the input sequence will be a vector of size 128, leading to an output shape of (batch_size, sequence_length, 128) for a batch of sequences.
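
As a quick check of that shape, here is a minimal sketch that runs a standalone Embedding layer on a made-up batch. The batch size of 4 and sequence length of 10 are assumptions chosen for illustration, not values from your code.

```python
import tensorflow as tf
from tensorflow.keras import layers

vocab_size = 500
model_dimension = 128

embedding = layers.Embedding(input_dim=vocab_size, output_dim=model_dimension)

# Hypothetical batch: 4 sequences of 10 token IDs, each ID in [0, vocab_size).
token_ids = tf.random.uniform((4, 10), maxval=vocab_size, dtype=tf.int32)

embedded = embedding(token_ids)
print(embedded.shape)  # (4, 10, 128)
```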

b. LSTM Layer
Units: Set to the same as model_dimension, which is 128. This parameter defines the dimensionality of the output space of the layer.
return_sequences=True: This ensures that the LSTM outputs the hidden state at each time step, maintaining the time dimension in the output. Thus, the output shape remains (batch_size, sequence_length, 128).
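
To make the effect of return_sequences concrete, the sketch below feeds the same random tensor through two standalone LSTM layers and compares the output shapes. The input tensor stands in for an embedded batch; its batch size and length are assumed for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Stand-in for an embedded batch: (batch_size=4, sequence_length=10, features=128).
x = tf.random.normal((4, 10, 128))

# return_sequences=True keeps one hidden state per time step.
per_step = layers.LSTM(units=128, return_sequences=True)(x)

# return_sequences=False (the default) keeps only the final hidden state.
last_only = layers.LSTM(units=128, return_sequences=False)(x)

print(per_step.shape)   # (4, 10, 128)
print(last_only.shape)  # (4, 128)
```

The pooling layer that follows in your model needs a time dimension to pool over, which is why the per-step output is the one used there.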

c. AveragePooling1D Layer
pool_size=2: This parameter specifies the size of the pooling window. Because strides defaults to pool_size, the layer averages each consecutive, non-overlapping pair of time steps, which downsamples the sequence length by a factor of 2. The output shape is (batch_size, floor(sequence_length/2), 128); with the default padding='valid', the leftover final time step of an odd-length sequence is dropped.

2. Understanding pool_size=2 in AveragePooling1D
The AveragePooling1D layer with pool_size=2 performs a downsampling operation by averaging over windows of 2 elements along the sequence's time dimension. With the default stride (equal to pool_size), the windows do not overlap. Here's what happens:

Function: It reduces the temporal resolution of the output from the previous LSTM layer. For instance, if the LSTM layer outputs a sequence of length 10, the pooling layer will output a sequence of length 5.

Use Case: This is commonly used to reduce the amount of computation and the model's sensitivity to the exact positions of features in the input sequence. It can also help in smoothing out the features over time.

Example

Consider a sequence processed through these layers:

Input sequence shape: (batch_size, 10, 128) from the LSTM layer.
After AveragePooling1D(pool_size=2), the sequence shape becomes (batch_size, 5, 128).
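
The same operation can be checked on a tiny tensor with hand-computable values. This is a minimal sketch: the 5-step, 2-feature input is made up for illustration, and its odd length shows what happens to a leftover time step under the default settings.

```python
import tensorflow as tf
from tensorflow.keras import layers

# One sequence with 5 time steps and 2 features:
# t0=[0,1], t1=[2,3], t2=[4,5], t3=[6,7], t4=[8,9]
x = tf.reshape(tf.range(10, dtype=tf.float32), (1, 5, 2))

# Defaults: strides=pool_size and padding="valid", so windows do not overlap.
pooled = layers.AveragePooling1D(pool_size=2)(x)

# Window (t0, t1) averages to [1, 2]; window (t2, t3) averages to [5, 6].
# t4 has no partner and is dropped, so 5 steps become floor(5/2) = 2.
print(pooled.shape)    # (1, 2, 2)
print(pooled.numpy())
```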
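
Additional Layers
After the pooling layer, your code applies a Lambda layer that L2-normalizes the output. The same LSTM sub-model (not two separate copies) is then applied to both inputs, so the two branches share weights; this is what makes the model Siamese. Finally, the two branch outputs are concatenated along axis 1, the time dimension.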
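
One detail worth checking in the normalization step is the axis argument of tf.math.l2_normalize. The sketch below, on a made-up 1x2x2 tensor, compares the default (axis=None, one norm over the whole tensor) with axis=-1 (one norm per feature vector).

```python
import tensorflow as tf

# One sequence, two time steps, two features.
x = tf.constant([[[3.0, 4.0], [6.0, 8.0]]])

# Default axis=None: the whole tensor is scaled to have L2 norm 1.
whole_tensor = tf.math.l2_normalize(x)

# axis=-1: each feature vector is scaled to unit length on its own.
per_vector = tf.math.l2_normalize(x, axis=-1)

print(tf.norm(whole_tensor).numpy())         # approximately 1.0
print(tf.norm(per_vector, axis=-1).numpy())  # approximately [[1. 1.]]
print(per_vector.numpy())                    # each row is approximately [0.6, 0.8]
```

For similarity comparisons, per-vector normalization is usually what is wanted, since the default makes each output depend on the other examples in the batch.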

Summary Generation
To see a detailed breakdown of each layer's output dimensions, you can execute Siamese.summary() in your environment with TensorFlow installed. This command prints the configuration and output shapes of each layer in your model, which can be particularly helpful for verifying layer connections and output sizes. Because both inputs are declared with shape=(None,), the batch and sequence-length dimensions are shown as None.
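
Since the summary reports None for the variable dimensions, running a dummy batch through the model is a complementary way to see concrete shapes. The sketch below rebuilds a model of the same structure under assumed names (encoder, siamese). It leaves out the Lambda normalization, which does not change shapes, and uses made-up sequence lengths of 10 and 6.

```python
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Model, Sequential

# Shared encoder, mirroring the Sequential model above without the Lambda layer.
encoder = Sequential([
    layers.Embedding(input_dim=500, output_dim=128),
    layers.LSTM(units=128, return_sequences=True),
    layers.AveragePooling1D(pool_size=2),
])

input1 = layers.Input(shape=(None,))
input2 = layers.Input(shape=(None,))
merged = layers.Concatenate(axis=1)([encoder(input1), encoder(input2)])
siamese = Model(inputs=[input1, input2], outputs=merged)

# Hypothetical batches of token IDs with different sequence lengths.
a = tf.random.uniform((4, 10), maxval=500, dtype=tf.int32)
b = tf.random.uniform((4, 6), maxval=500, dtype=tf.int32)

out = siamese([a, b])
# Pooling halves each length (10 -> 5, 6 -> 3); axis=1 concatenation gives 5 + 3 = 8.
print(out.shape)  # (4, 8, 128)

# Both branches reuse the encoder, so the full model adds no extra weights.
print(len(siamese.trainable_weights) == len(encoder.trainable_weights))  # True
```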

In [11]:
def show_layers(model, layer_prefix):
    """Print each layer of `model`, labeled with `layer_prefix` and its index."""
    print(f"Total layers: {len(model.layers)}\n")
    for i, layer in enumerate(model.layers):
        print("======")
        # Interpolate the prefix argument instead of printing the literal text.
        print(f"{layer_prefix}_{i}: {layer}")

print('Siamese model:\n')
show_layers(Siamese, 'Parallel.sublayers')

print('Detail of LSTM models:\n')
show_layers(LSTM, 'Serial.sublayers')

Siamese model:

Total layers: 4

======
Parallel.sublayers_0: <InputLayer name=input_layer_15, built=True>
======
Parallel.sublayers_1: <InputLayer name=input_layer_16, built=True>
======
Parallel.sublayers_2: <Sequential name=sequential_7, built=True>
======
Parallel.sublayers_3: <Concatenate name=concatenate_5, built=True>
Detail of LSTM models:

Total layers: 4

======
Serial.sublayers_0: <Embedding name=embedding_7, built=True>
======
Serial.sublayers_1: <LSTM name=lstm_7, built=True>
======
Serial.sublayers_2: <AveragePooling1D name=average_pooling1d_5, built=True>
======
Serial.sublayers_3: <Lambda name=lambda_5, built=True>
