### Sentence Transformers

Sentence Transformers is a Python library for creating and using embedding models that represent sentences, paragraphs, or images as dense vectors (embeddings). The models are trained so that semantically similar inputs map to vectors that lie close together in the vector space, which enables applications such as semantic search, text clustering, and paraphrase mining. Compared with older approaches such as bag-of-words representations, these embeddings take word order and context into account and therefore capture more of a text's meaning.

#### How they work

##### Create embeddings: 
They take input text and produce a fixed-size vector representation (embedding). 
##### Capture meaning: 
Unlike models that treat words independently, Sentence Transformers consider the entire context of the sentence to create a more nuanced representation. 
##### Enable similarity search: 
The resulting embeddings can be used to quickly find similar sentences, paragraphs, or even images, making them ideal for tasks like semantic search. 
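To make "close to each other in the vector space" concrete, here is a minimal sketch of cosine similarity, a common way to compare embeddings. The three-dimensional vectors and their names are made up for illustration; real embeddings come from the model and have hundreds of dimensions. Plain Python is used so the sketch runs without extra dependencies.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings" (not produced by any real model).
weather_1 = [0.9, 0.1, 0.0]  # stands in for a sentence about the weather
weather_2 = [0.8, 0.2, 0.1]  # stands in for a second weather sentence
finance = [0.0, 0.2, 0.9]    # stands in for an unrelated sentence

sim_close = cosine_similarity(weather_1, weather_2)
sim_far = cosine_similarity(weather_1, finance)

# Vectors pointing in similar directions score higher.
print(sim_close > sim_far)
```

A score near 1 means the vectors point in nearly the same direction; a score near 0 means they are close to orthogonal, which for embeddings usually indicates unrelated content.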



#### Key applications

Semantic search: Matching user queries to the most relevant documents based on meaning, not just keywords. 

Clustering: Grouping similar documents or sentences together. 

Classification: Using the embeddings as input for other machine learning classifiers. 

Paraphrase mining: Finding sentences that have the same meaning but are worded differently. 

Retrieval-Augmented Generation (RAG): Retrieving relevant information to improve the quality of generated text. 
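As an illustration of the semantic search use case, the sketch below ranks documents by cosine similarity to a query vector and returns the best matches. The function `semantic_search`, the example documents, and the toy vectors are hypothetical and exist only for this example; in a real pipeline the vectors would be the output of an embedding model.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def semantic_search(query_vec, doc_vecs, top_k=2):
    """Return (document index, score) pairs for the top_k most similar documents."""
    q = normalize(query_vec)
    scored = []
    for idx, doc_vec in enumerate(doc_vecs):
        d = normalize(doc_vec)
        score = sum(a * b for a, b in zip(q, d))
        scored.append((score, idx))
    scored.sort(reverse=True)  # highest similarity first
    return [(idx, score) for score, idx in scored[:top_k]]


# Hypothetical documents with made-up 3-dimensional vectors.
docs = [
    "Sunny skies expected all week.",
    "Quarterly earnings beat forecasts.",
    "Rain is likely tomorrow.",
]
doc_vecs = [
    [0.9, 0.1, 0.0],
    [0.0, 0.2, 0.9],
    [0.7, 0.3, 0.1],
]
query_vec = [0.8, 0.2, 0.0]  # stands in for a weather-related query

results = semantic_search(query_vec, doc_vecs, top_k=2)
for idx, score in results:
    print(f"{score:.3f}  {docs[idx]}")
```

With these toy vectors, the two weather-related documents rank above the finance one even though the ranking uses only vector geometry and no keyword matching. The same ranking step is what a RAG pipeline would use to select passages before generation.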


In [1]:
!pip3 install sentence-transformers


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.2[0m[39;49m -> [0m[32;49m25.3[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In [2]:
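from sentence_transformers import SentenceTransformer

# Load a small pretrained model (downloaded on first use).
model = SentenceTransformer('all-MiniLM-L6-v2')
sentences = ["The weather is lovely today.", "It's so sunny outside!"]
# encode() returns one embedding vector per input sentence.
embeddings = model.encode(sentences)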



In [4]:
len(embeddings[0])

384