Model: mrp/SimCSE-model-WangchanBERTa-V2
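This model was trained with the SimCSE framework, which uses contrastive learning to fine-tune a pretrained transformer encoder into a sentence-embedding model. The encoder here is WangchanBERTa, a RoBERTa-style model pretrained on Thai text. SimCSE has an unsupervised variant, which builds positive pairs from dropout noise, and a supervised variant, which uses labeled pairs such as NLI data. This description assumes the supervised setup; the model card is the authority on which variant this checkpoint uses.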

Embedding Strategy: SimCSE's contrastive objective pulls the embeddings of similar sentence pairs together and pushes the embeddings of dissimilar pairs apart, so the cosine similarity between two embeddings tracks how close the sentences are in meaning.
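
The pull-together, push-apart idea can be sketched as an InfoNCE-style loss with in-batch negatives. This is a minimal illustration in plain Python, not the training code used for this model: the function names `cosine` and `contrastive_loss`, the toy 2-dimensional vectors, and the temperature value are all assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(anchors, positives, temperature=0.05):
    """Mean InfoNCE loss with in-batch negatives (illustrative sketch).

    anchors[i] and positives[i] form a positive pair; every other
    positives[j] (j != i) is treated as a negative for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with the max subtracted for numerical stability.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(l - max_logit) for l in logits)
        )
        # Negative log-probability of picking the true positive.
        total += log_denominator - logits[i]
    return total / len(anchors)

# Toy vectors standing in for sentence embeddings (hypothetical data).
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each positive is close to its anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each positive is close to the wrong anchor

print(contrastive_loss(anchors, aligned) < contrastive_loss(anchors, swapped))
```

The loss is small when each anchor is most similar to its own positive and large when it is more similar to another item in the batch, which is the behavior the objective rewards during training.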

Strengths:
SimCSE outperforms many earlier sentence-embedding methods on semantic similarity tasks because its contrastive objective optimizes sentence-level representations directly. When it was published, it reported state-of-the-art results on standard semantic textual similarity (STS) benchmarks.

Use Case: The model suits tasks that depend on fine-grained sentence similarity, such as semantic search, QA reranking, and paraphrase detection.
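
For semantic search or reranking, the usual pattern is to embed the query and each candidate, then sort the candidates by cosine similarity. The sketch below shows only that ranking step in plain Python. The helper names `normalize` and `rank_by_similarity` are hypothetical, and the 3-dimensional vectors are toy stand-ins for the embeddings a sentence encoder would produce.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def rank_by_similarity(query_vec, corpus_vecs, top_k=3):
    """Return (corpus_index, cosine_score) pairs, best match first."""
    q = normalize(query_vec)
    scored = []
    for idx, vec in enumerate(corpus_vecs):
        d = normalize(vec)
        scored.append((idx, sum(a * b for a, b in zip(q, d))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy vectors standing in for real sentence embeddings (hypothetical data).
corpus = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.1]]
query = [1.0, 0.0, 0.0]

for idx, score in rank_by_similarity(query, corpus, top_k=2):
    print(idx, round(score, 3))
```

With real data, the corpus embeddings would typically be computed once and cached, so that only the query needs to be encoded at search time.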

In [1]:
! pip install transformers sentence-transformers 



In [4]:
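from sentence_transformers import SentenceTransformer, util

# Load the model once; reloading it inside the function on every call is slow.
model = SentenceTransformer('mrp/SimCSE-model-WangchanBERTa-V2')

def compute_similarity(sent1, sent2):
    """Return the cosine similarity between the embeddings of two sentences."""
    # Encode both sentences in a single batch.
    embeddings = model.encode([sent1, sent2])
    # cos_sim returns a 1x1 tensor; .item() extracts the Python float.
    return util.cos_sim(embeddings[0], embeddings[1]).item()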

In [5]:
sent1 = "Despite growing up in completely different environments, both Alice and her childhood friend eventually found themselves pursuing careers in environmental policy, driven by a shared passion for combating climate change and creating a sustainable future for the next generation."
sent2 = "Although raised in separate parts of the world, Alice and her best friend ended up working in environmental advocacy, motivated by their mutual concern for the planet and a desire to leave a positive legacy for future generations."
score = compute_similarity(sent1, sent2)
print(f"SimCSE Similarity: {score:.4f}")

SimCSE Similarity: 0.8954
