# Siamese Network

A Siamese network can find similarities between inputs, for example between a new question and previously asked ones, which helps to answer the new question if a similar old one exists.

Below is an example architecture. Although it shows two networks, only one has to be trained, because both share the same parameters. The only difference is the input (e.g. different word sequences). The two output vectors are compared using cosine similarity, so the result lies in the range -1 <= y_hat <= 1.

![](img/siamese.png)

[Source](https://www.coursera.org/learn/sequence-models-in-nlp/supplement/oUdcN/architecture)
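As a minimal sketch of the comparison step, the snippet below computes the cosine similarity of two vectors using only the Python standard library. The function name `cosine_similarity` and the example vectors are assumptions made for illustration and are not part of the course material.

```python
import math

def cosine_similarity(v1, v2):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    return dot / (norm1 * norm2)

# Hypothetical output vectors of the two sub-networks.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # close to 1
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])                # 0
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])                # close to -1
```

Vectors pointing in the same direction score near 1, unrelated (orthogonal) vectors score 0, and opposite vectors score near -1, regardless of their lengths.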

In [25]:
import trax
from trax import layers as tl
import trax.fastmath.numpy as np  # Trax's NumPy-compatible backend, used inside the model
import numpy  # plain NumPy, used for seeding and random test data

In [26]:
numpy.random.seed(10)
%config Completer.use_jedi = False

In [22]:
def L2_normalize(x):
    """Scale each vector along the last axis to unit L2 norm."""
    return x / np.sqrt(np.sum(x * x, axis=-1, keepdims=True))

In [27]:
tensor = numpy.random.random((2,5))
tensor

array([[0.77132064, 0.02075195, 0.63364823, 0.74880388, 0.49850701],
       [0.22479665, 0.19806286, 0.76053071, 0.16911084, 0.08833981]])

In [28]:
norm_tensor = L2_normalize(tensor)
norm_tensor

DeviceArray([[0.57393795, 0.01544148, 0.4714962 , 0.55718327, 0.37093794],
             [0.26781026, 0.23596111, 0.9060541 , 0.20146926, 0.10524315]], dtype=float32)
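
A quick sanity check, sketched in plain Python so it runs without Trax: after L2 normalization each row should have length 1. The input is the first row of `tensor` as printed earlier in the notebook. The helper name `l2_normalize_row` is an assumption for this illustration.

```python
import math

def l2_normalize_row(row):
    """Divide a vector by its Euclidean (L2) norm."""
    norm = math.sqrt(sum(v * v for v in row))
    return [v / norm for v in row]

# First row of `tensor`, copied from the notebook output.
row = [0.77132064, 0.02075195, 0.63364823, 0.74880388, 0.49850701]
normalized = l2_normalize_row(row)

# The squared entries of a unit vector sum to 1 (up to floating-point error).
length = math.sqrt(sum(v * v for v in normalized))
print(normalized)
print(length)
```

The normalized values should agree with the first row of the `DeviceArray` output up to float32 rounding.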

In [29]:
vocab_size = 500
model_dimension = 128

# Simple LSTM
LSTM = tl.Serial(
        tl.Embedding(vocab_size=vocab_size, d_feature=model_dimension),
        tl.LSTM(model_dimension),
        tl.Mean(axis=1),
        tl.Fn('Normalize', lambda x: L2_normalize(x))  # use the helper defined above
    )

# Turns into a Siamese network
Siamese = tl.Parallel(LSTM, LSTM)

In [30]:
Siamese

Parallel_in2_out2[
  Serial[
    Embedding_500_128
    LSTM_128
    Mean
    Normalize
  ]
  Serial[
    Embedding_500_128
    LSTM_128
    Mean
    Normalize
  ]
]
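
To illustrate why only one set of parameters needs training, here is a toy sketch in plain Python. It is not the Trax model: a single hypothetical `encode` function (a tiny embedding table with mean pooling and L2 normalization, standing in for the LSTM branch) is applied to both inputs. Because both outputs are unit vectors, their dot product equals their cosine similarity. All names and values are made up for illustration.

```python
import math

# Shared parameters: one tiny embedding table used by both branches (hypothetical values).
EMBEDDINGS = {
    0: [1.0, 0.0],
    1: [0.0, 1.0],
    2: [1.0, 1.0],
}

def encode(token_ids):
    """Embed tokens, average over the sequence, then L2-normalize."""
    vectors = [EMBEDDINGS[t] for t in token_ids]
    mean = [sum(col) / len(vectors) for col in zip(*vectors)]
    norm = math.sqrt(sum(v * v for v in mean))
    return [v / norm for v in mean]

def similarity(tokens_a, tokens_b):
    """Run both inputs through the same encoder and compare the outputs."""
    out_a, out_b = encode(tokens_a), encode(tokens_b)
    # Dot product of two unit vectors is their cosine similarity.
    return sum(a * b for a, b in zip(out_a, out_b))

identical = similarity([0, 1], [0, 1])  # same sequence on both branches
different = similarity([0, 0], [1, 1])  # sequences with orthogonal embeddings
```

Since `encode` is the only place parameters live, updating `EMBEDDINGS` changes both branches at once, which mirrors the parameter sharing described at the top of this section.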