Model: all-MiniLM-L6-v2
all-MiniLM-L6-v2 is a commonly used Sentence-BERT (sentence-transformers) model, designed to be smaller and faster while still providing high-quality sentence embeddings.

Embedding Strategy: Sentence-BERT (SBERT) fine-tunes BERT-style encoders in siamese and triplet network setups that are trained explicitly for sentence similarity, using Natural Language Inference (NLI) and paraphrase datasets. As a result, SBERT embeddings capture sentence-level semantic meaning more effectively than embeddings taken directly from an unmodified BERT.
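To make the triplet setup concrete, here is a minimal sketch of a triplet objective using Euclidean distance and a margin of 1. The 3-dimensional vectors are hypothetical stand-ins for real sentence embeddings, and the function names are illustrative only; this is not the training code of all-MiniLM-L6-v2.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(d(a, p) - d(a, n) + margin, 0).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)

# Hypothetical toy "embeddings" (assumption: real ones have hundreds of dimensions).
anchor = [1.0, 0.0, 0.0]
positive = [1.0, 0.0, 0.0]        # same meaning as the anchor
near_negative = [1.0, 0.5, 0.0]   # unrelated sentence that sits too close
far_negative = [0.0, 3.0, 0.0]    # unrelated sentence that is already far away

loss_near = triplet_loss(anchor, positive, near_negative)
loss_far = triplet_loss(anchor, positive, far_negative)
print(loss_near, loss_far)
```

A negative that sits too close to the anchor yields a positive loss, which pushes the encoder to separate it; a negative that is already far enough away contributes nothing.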

Strengths:
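SBERT is optimized for semantic textual similarity (STS) tasks. It is also much faster than the original BERT for comparing sentences, because each sentence is encoded once into a fixed-size vector and pairs are then compared with cosine similarity, instead of running every pair through the model together.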
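The speed advantage is easiest to see by counting model forward passes when comparing every sentence in a collection with every other one. The sketch below only counts passes under that assumption; it does not measure timings, and the function names are illustrative.

```python
def cross_encoder_passes(n):
    """Forward passes when every unordered sentence pair is fed through the model jointly."""
    return n * (n - 1) // 2

def bi_encoder_passes(n):
    """Forward passes when each sentence is embedded once and pairs are
    compared afterwards with a cheap vector operation such as cosine similarity."""
    return n

n = 10_000
pair_passes = cross_encoder_passes(n)
single_passes = bi_encoder_passes(n)
print(pair_passes)    # 49995000
print(single_passes)  # 10000
```

The pairwise count grows quadratically with the collection size, while the embed-once approach grows linearly.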

Use Case: SBERT is commonly used for real-time applications such as semantic search, deduplication, and similarity matching.
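As one illustration of these use cases, the following sketch performs greedy deduplication by cosine similarity. The vectors are hypothetical toy values standing in for real embeddings (in practice they would come from `model.encode`), and the 0.9 threshold is an assumption that would need tuning on real data.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def deduplicate(texts, vectors, threshold=0.9):
    """Keep a text only if it is below `threshold` similarity to every text already kept."""
    kept = []  # indices of the texts retained so far
    for i, vec in enumerate(vectors):
        if all(cosine(vec, vectors[j]) < threshold for j in kept):
            kept.append(i)
    return [texts[i] for i in kept]

texts = [
    "How do I reset my password?",
    "How can I reset my password?",
    "What are your opening hours?",
]
# Hypothetical 3-dimensional embeddings: the first two point in nearly the same direction.
vectors = [
    [0.90, 0.10, 0.00],
    [0.88, 0.12, 0.01],
    [0.00, 0.20, 0.95],
]

unique = deduplicate(texts, vectors)
print(unique)
```

The greedy pass keeps the first occurrence of each near-duplicate group, so the result depends on input order; that is usually acceptable for simple cleanup but is a design choice rather than the only option.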

In [1]:
! pip install transformers sentence-transformers scikit-learn



In [2]:
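from sentence_transformers import SentenceTransformer, util

# Load the model once; reloading it inside the function on every call is slow.
model = SentenceTransformer('all-MiniLM-L6-v2')

def compute_similarity(sent1, sent2):
    """Return the cosine similarity between the embeddings of two sentences."""
    # Encode both sentences in a single batch.
    embeddings = model.encode([sent1, sent2])
    return util.cos_sim(embeddings[0], embeddings[1]).item()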




In [3]:
sent1 = "Despite growing up in completely different environments, both Alice and her childhood friend eventually found themselves pursuing careers in environmental policy, driven by a shared passion for combating climate change and creating a sustainable future for the next generation."
sent2 = "Although raised in separate parts of the world, Alice and her best friend ended up working in environmental advocacy, motivated by their mutual concern for the planet and a desire to leave a positive legacy for future generations."
score = compute_similarity(sent1, sent2)
print(f"Sentence-BERT Similarity: {score:.4f}")

Sentence-BERT Similarity: 0.8202
