# Translation Error Detection for Ubuntu - Pooling method equivalence
This notebook briefly shows that, when the Generalized Pooling layer is initialised as mean pooling, the resulting model produces the same embeddings as the original multilingual sentence embedding model, which uses mean pooling. <br><br>
Original model used: ``sentence-transformers/distiluse-base-multilingual-cased-v2``

## Importing relevant modules

In [4]:
import torch
from sentence_transformers import SentenceTransformer, models
import numpy as np

# Importing personal packages
import sentence_pooling


In [5]:
import importlib

# Reload the local module so that recent edits to it are picked up without restarting the kernel
importlib.reload(sentence_pooling)
from sentence_pooling import GeneralizedSentenceTransformerMaker

## 1. Generalized pooling model

In [6]:
# Load the existing SentenceTransformer model
existing_model = SentenceTransformer("sentence-transformers/distiluse-base-multilingual-cased-v2")

# Build the new model using the reused components
model_maker = GeneralizedSentenceTransformerMaker(existing_model)
model = model_maker.get_model()

# Print the new model architecture
print(model)


SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: DistilBertModel 
  (1): MultiHeadGeneralizedPooling(
    (P): ModuleList(
      (0-7): 8 x Linear(in_features=768, out_features=96, bias=True)
    )
    (W1): ModuleList(
      (0-7): 8 x Linear(in_features=96, out_features=384, bias=True)
    )
    (W2): ModuleList(
      (0-7): 8 x Linear(in_features=384, out_features=96, bias=True)
    )
  )
  (2): Dense({'in_features': 768, 'out_features': 512, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
)
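
To make the idea of a mean pooling initialisation concrete, here is a minimal NumPy sketch of one way such a layer can reduce to masked mean pooling. It is an illustration under stated assumptions, not the code of ``sentence_pooling``: it assumes each head projects the token embeddings with ``P``, scores them with ``W2 · ReLU(W1 · x)``, and applies a softmax over tokens. Under that assumption, setting ``W2`` and its bias to zero makes the softmax uniform, and choosing each ``P`` as a slice of the identity makes the concatenated heads equal to the plain mean. The function names ``generalized_pooling`` and ``mean_pooling_init`` are hypothetical, the bias of ``P`` is omitted (taken as zero), and the dimensions are shrunk for readability.

```python
import numpy as np


def masked_softmax(scores, mask):
    """Softmax over the token axis (axis 0) that ignores padded tokens."""
    scores = np.where(mask[:, None] > 0, scores, -np.inf)
    scores = scores - scores.max(axis=0, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=0, keepdims=True)


def generalized_pooling(H, mask, P, W1, b1, W2, b2):
    """Multi-head pooling of one sentence (assumed form, see the text above).

    H: (tokens, d_model) token embeddings, mask: (tokens,) with 1 for real tokens.
    """
    heads = []
    for P_i, W1_i, b1_i, W2_i, b2_i in zip(P, W1, b1, W2, b2):
        H_i = H @ P_i                                # (tokens, d_head)
        hidden = np.maximum(H_i @ W1_i + b1_i, 0.0)  # ReLU, (tokens, d_hidden)
        A_i = masked_softmax(hidden @ W2_i + b2_i, mask)
        heads.append((A_i * H_i).sum(axis=0))        # weighted sum over tokens
    return np.concatenate(heads)


def mean_pooling_init(d_model, n_heads, d_hidden, rng):
    """Hypothetical initialisation that makes the layer behave as mean pooling."""
    d_head = d_model // n_heads
    eye = np.eye(d_model)
    # Head i copies dimensions [i * d_head, (i + 1) * d_head) of the input.
    P = [eye[:, i * d_head:(i + 1) * d_head] for i in range(n_heads)]
    # W1 can be arbitrary: its output is multiplied by a zero W2.
    W1 = [rng.normal(size=(d_head, d_hidden)) for _ in range(n_heads)]
    b1 = [np.zeros(d_hidden) for _ in range(n_heads)]
    # Zero scores give a uniform softmax, i.e. equal weight for every real token.
    W2 = [np.zeros((d_hidden, d_head)) for _ in range(n_heads)]
    b2 = [np.zeros(d_head) for _ in range(n_heads)]
    return P, W1, b1, W2, b2


rng = np.random.default_rng(0)
H = rng.normal(size=(5, 16))        # 5 tokens, toy model dimension of 16
mask = np.array([1, 1, 1, 0, 0])    # the last two tokens are padding
params = mean_pooling_init(d_model=16, n_heads=4, d_hidden=8, rng=rng)
pooled = generalized_pooling(H, mask, *params)
masked_mean = H[mask > 0].mean(axis=0)
print(np.allclose(pooled, masked_mean))
```

With this kind of initialisation the layer starts from the behaviour of the original model, and training can then move the attention weights away from the uniform distribution.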


In [7]:
# Run the model on an example
sentences = ["First phrase", "Second phrase", "Third phrase darling"]
sentence_embeddings = model.encode(sentences)


In [8]:
print("EMBEDDINGS COMPUTED WITH INITIALIZED GENERALIZED POOLING : \n")
print(sentence_embeddings)

EMBEDDINGS COMPUTED WITH INITIALIZED GENERALIZED POOLING : 

[[ 0.05568264  0.07488462 -0.0569736  ... -0.0319594   0.02177089
  -0.01300419]
 [ 0.02329546  0.04503326 -0.04492065 ... -0.02495108  0.04223553
  -0.0437062 ]
 [ 0.02012968 -0.01120749 -0.06453869 ... -0.03425869 -0.03193577
   0.04811867]]


## Comparing to the original mean pooling model

In [9]:
# Printing the architecture of the original model
print(existing_model)

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: DistilBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 512, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
)


In [10]:
print("EMBEDDINGS COMPUTED WITH MEAN POOLING : \n")

mean_embedding = existing_model.encode(sentences)
print(mean_embedding)

EMBEDDINGS COMPUTED WITH MEAN POOLING : 

[[ 0.05568264  0.07488462 -0.0569736  ... -0.0319594   0.02177089
  -0.01300419]
 [ 0.02329546  0.04503326 -0.04492065 ... -0.02495108  0.04223553
  -0.0437062 ]
 [ 0.02012968 -0.01120749 -0.06453869 ... -0.03425869 -0.03193577
   0.04811867]]


## Checking that the embeddings obtained are equal

In [11]:
# Count the embedding components that differ by more than 1e-7 between the two models
n_mismatches = np.sum(np.abs(sentence_embeddings - mean_embedding) > 1e-7)
print(n_mismatches)

0


No component of the two embedding matrices differs by more than $10^{-7}$, so on these three sentences the Generalized Pooling model initialised as mean pooling reproduces the output of the original model to within that tolerance.
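
The same check can be written with a small reporting helper that also gives the largest absolute difference, which is useful when the mismatch count is not zero. The sketch below is a hedged illustration: ``compare_embeddings`` is a hypothetical helper, and the arrays are synthetic stand-ins for ``sentence_embeddings`` and ``mean_embedding`` so that the cell runs without loading the model.

```python
import numpy as np


def compare_embeddings(a, b, atol=1e-7):
    """Summarise the element-wise agreement between two embedding arrays."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    abs_diff = np.abs(a - b)
    return {
        "max_abs_diff": float(abs_diff.max()),
        "n_mismatches": int((abs_diff > atol).sum()),
        # rtol=0.0 makes this a purely absolute tolerance, like the check above.
        "allclose": bool(np.allclose(a, b, rtol=0.0, atol=atol)),
    }


# Synthetic stand-ins with the shape of three 512-dimensional embeddings.
rng = np.random.default_rng(0)
reference = rng.normal(scale=0.05, size=(3, 512))
candidate = reference + 1e-9  # perturbation far below the tolerance

report = compare_embeddings(candidate, reference)
print(report["n_mismatches"], report["allclose"])
```

In the notebook itself, the call would be ``compare_embeddings(sentence_embeddings, mean_embedding)``.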