# Sentence Embeddings

In the example provided, the libraries are already installed. To run it yourself, install the `sentence-transformers` library (which builds on Hugging Face Transformers) and download the model you want from the Hugging Face Hub.

Some use cases and services that can be built using these models are:

- Semantic Similarity - Measure how close two sentences are in meaning. This underpins duplicate detection, question answering, and recommendation systems, where accurate responses or recommendations depend on recognizing similar sentences.
- Information Retrieval - Index large collections of documents or text, then search for and retrieve the documents that are semantically closest to a user's query.
- Clustering and Classification - Group similar sentences together or assign sentences to predefined categories. This supports text classification, sentiment analysis, and topic modeling, where sentences are grouped or categorized by meaning.
- Contextual Understanding - Capture the nuances and relationships between sentences so that conversational models can interpret each sentence in context. This helps in dialogue systems, where maintaining context and coherence is essential for natural responses.
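
To make the retrieval use case concrete, here is a minimal sketch of ranking documents against a query by cosine similarity. It uses plain Python and tiny hand-written vectors instead of a real model; the helper names `cosine_similarity` and `rank_documents` and all of the numbers are assumptions made up for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_documents(query_vec, doc_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

# Hypothetical 3-dimensional "embeddings", invented for illustration only.
# A real model such as all-MiniLM-L6-v2 produces 384-dimensional vectors.
doc_vecs = [
    [0.9, 0.1, 0.0],  # document 0
    [0.0, 1.0, 0.2],  # document 1
    [0.7, 0.3, 0.1],  # document 2
]
query_vec = [1.0, 0.0, 0.0]

ranking = rank_documents(query_vec, doc_vecs)
for idx, score in ranking:
    print(f"document {idx}: {score:.4f}")
```

In a real system the vectors would come from `model.encode(...)`, and a vector index would typically replace the linear scan once the collection grows large.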

In [None]:
# !pip install sentence-transformers


### Build the `sentence embedding` pipeline with the `sentence-transformers` library

In [1]:
# Suppress warning messages
from transformers.utils import logging
logging.set_verbosity_error()

from sentence_transformers import SentenceTransformer

Info about the model used: [all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2).

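The model weights must be available locally before the model can be used. If they are not already cached, `SentenceTransformer` downloads them from the Hugging Face Hub the first time the model is loaded.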

In [None]:
# Load the model
model = SentenceTransformer("all-MiniLM-L6-v2")

# First list of sentences to compare
sentences1 = ['The cat sits outside',
              'A man is playing guitar',
              'The movies are awesome']

# Second list of sentences to compare
sentences2 = ['The dog plays in the garden',
              'A woman watches TV',
              'The new movie is so great']

# Compute embeddings for both lists
embeddings1 = model.encode(sentences1, convert_to_tensor=True)
embeddings2 = model.encode(sentences2, convert_to_tensor=True)

# Each tensor holds one embedding per sentence. Cosine similarity between
# embeddings, which ranges from -1 to 1, is used next to compare the sentences.
print(embeddings1)
print(embeddings2)
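
For reference, the cosine similarity of two embedding vectors is their dot product divided by the product of their Euclidean norms. The symbols $u$, $v$, and the dimension $n$ are notation chosen here for illustration:

$$
\cos(u, v) = \frac{u \cdot v}{\lVert u \rVert \, \lVert v \rVert}
           = \frac{\sum_{k=1}^{n} u_k v_k}{\sqrt{\sum_{k=1}^{n} u_k^2} \, \sqrt{\sum_{k=1}^{n} v_k^2}}
$$

The value lies in $[-1, 1]$: it is $1$ when the vectors point in the same direction, $0$ when they are orthogonal, and $-1$ when they point in opposite directions.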

In [None]:
from sentence_transformers import util

# Compute the cosine similarity between every pair of sentences across the
# two lists; entry [i][j] compares sentences1[i] with sentences2[j]
cosine_scores = util.cos_sim(embeddings1, embeddings2)
print(cosine_scores)
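
To illustrate how an all-pairs score matrix is laid out, the following plain-Python sketch builds one from small hand-written vectors. It is not how `util.cos_sim` is implemented; the helper `pairwise_cosine` and the toy vectors are hypothetical and exist only for this example.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pairwise_cosine(rows_a, rows_b):
    """Matrix whose entry [i][j] compares rows_a[i] with rows_b[j]."""
    return [[cosine_similarity(a, b) for b in rows_b] for a in rows_a]

# Hypothetical 2-dimensional vectors, invented for illustration only.
list_a = [[1.0, 0.0], [0.0, 1.0]]
list_b = [[1.0, 0.0], [-1.0, 0.0], [1.0, 1.0]]

scores = pairwise_cosine(list_a, list_b)

# The result has len(list_a) rows and len(list_b) columns.
for row in scores:
    print([round(s, 4) for s in row])
```

With two lists of three sentences each, as in this notebook, the matrix is 3x3, and its diagonal holds the scores for sentences that share the same position in both lists.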

In [None]:
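# Print the score for each pair of sentences at the same position in both
# lists (the diagonal of the similarity matrix)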
for i in range(len(sentences1)):
    print("{} \t\t {} \t\t Score: {:.4f}".format(sentences1[i],
                                                 sentences2[i],
                                                 cosine_scores[i][i]))