[Community]: HuggingFaceCrossEncoder score accounting for <not-relevant score, relevant score> pairs. (#22578)

- **Description:** Some cross-encoder models return scores in pairs, i.e., <not-relevant score, relevant score>, where a higher not-relevant score means the document is less relevant to the query and a higher relevant score means it is more relevant. The `score` method of `HuggingFaceCrossEncoder` does not currently handle this case. This PR modifies the method to keep only the relevant score when scores are returned in pairs, because the compressors select the top-n documents by relevance.
- **Issue:** #22556
- Please also refer to this [comment](UKPLab/sentence-transformers#568 (comment))
langchain-ai · Jun 14, 2024 · d1b7a93 · d1b7a93
1 parent 83643cb
commit d1b7a93
Showing 1 changed file with 4 additions and 0 deletions.
diff --git a/libs/community/langchain_community/cross_encoders/huggingface.py b/libs/community/langchain_community/cross_encoders/huggingface.py
@@ -60,4 +60,8 @@ def score(self, text_pairs: List[Tuple[str, str]]) -> List[float]:
             List of scores, one for each pair.
         """
         scores = self.client.predict(text_pairs)
+        # Some models, e.g. bert-multilingual-passage-reranking-msmarco,
+        # return two scores per pair: not_relevant and relevant to the query.
+        if len(scores.shape) > 1:  # we are going to get the relevant scores
+            scores = [x[1] for x in scores]
         return scores
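
The behavior this change targets can be illustrated outside the library with a minimal, dependency-free sketch. The helper names `relevant_scores` and `top_n` are hypothetical and are not part of `langchain_community`. The sketch assumes, as the description states, that paired output is ordered <not-relevant score, relevant score>, and it uses plain Python lists instead of the array returned by `self.client.predict`.

```python
from typing import List, Sequence, Union

# A raw score is either one float or a (not_relevant, relevant) pair.
Row = Union[float, Sequence[float]]


def relevant_scores(raw: Sequence[Row]) -> List[float]:
    """Reduce raw cross-encoder output to one relevance score per text pair.

    Hypothetical helper: assumes index 1 of a paired row is the relevant score.
    """
    result: List[float] = []
    for row in raw:
        if isinstance(row, (list, tuple)):
            result.append(float(row[1]))  # paired output: keep the relevant score
        else:
            result.append(float(row))  # single-score output: keep as-is
    return result


def top_n(docs: Sequence[str], scores: Sequence[float], n: int) -> List[str]:
    """Return the n documents with the highest relevance scores."""
    ranked = sorted(zip(docs, scores), key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in ranked[:n]]


single = relevant_scores([0.2, 0.9, 0.5])
paired = relevant_scores([[0.8, 0.2], [0.1, 0.9], [0.5, 0.5]])
best = top_n(["doc_a", "doc_b", "doc_c"], paired, 2)
```

Both output shapes reduce to the same flat list, so a downstream top-n selection ranks documents by the relevant score in either case.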