# Sentence Transformer

In this notebook, we use a SentenceTransformer model (`all-MiniLM-L6-v2`) to encode several sentences into embedding vectors and measure the cosine similarity between them.

<a target="_blank" href="https://colab.research.google.com/github/simonguest/CS-394/blob/main/src/01/notebooks/sentence-transformer.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>
<a target="_blank" href="https://github.com/simonguest/CS-394/raw/refs/heads/main/src/01/notebooks/sentence-transformer.ipynb">
  <img src="https://img.shields.io/badge/Download_.ipynb-blue" alt="Download .ipynb"/>
</a>

In [1]:
# Install required packages
!uv pip install sentence-transformers -q

In [2]:
from sentence_transformers import SentenceTransformer

# Load the pretrained model (downloaded on first use)
model = SentenceTransformer('all-MiniLM-L6-v2')

In [3]:
# Encode one sentence into a 384-dimensional vector and show its first 10 values
embeddings = model.encode("The cat sat on the mat")
embeddings[:10]

array([ 0.13040183, -0.01187013, -0.02811704,  0.05123864, -0.05597447,
        0.03019154,  0.0301613 ,  0.02469837, -0.01837057,  0.05876679],
      dtype=float32)

In [4]:
# A sentence with a similar meaning but different wording
embeddings = model.encode("The dog rested on the rug")
embeddings[:10]

array([ 0.05627272,  0.02632686,  0.05896206,  0.12019245, -0.00399702,
        0.08970873, -0.02332847, -0.01548103,  0.00939427,  0.01598458],
      dtype=float32)

In [5]:
# An unrelated sentence for contrast
embeddings = model.encode("I love pizza!")
embeddings[:10]

array([-0.09438416,  0.02385838,  0.00920313,  0.04992779, -0.09533099,
        0.0061356 ,  0.03513189,  0.00850056,  0.0105693 , -0.0578883 ],
      dtype=float32)

In [6]:
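from sklearn.metrics.pairwise import cosine_similarity

sentences = ['The cat sat on the mat',
             'The dog rested on the rug',
             'I love pizza']
# Encode all sentences in one batch: one embedding row per sentence
embeddings = model.encode(sentences)

# Pairwise similarity matrix: entry [i][j] compares sentence i with sentence j
print(cosine_similarity(embeddings))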

[[ 1.0000002   0.47530937  0.00155361]
 [ 0.47530937  1.         -0.04451237]
 [ 0.00155361 -0.04451237  1.        ]]
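
In the matrix above, the cat and dog sentences score about 0.475 against each other, while each scores close to 0 against the pizza sentence. To make clear what that number measures, here is a minimal sketch of cosine similarity in plain Python: the dot product of two vectors divided by the product of their lengths. The function names `cosine` and `cosine_matrix` and the toy 3-dimensional vectors are illustrative assumptions, not part of the notebook or of scikit-learn, and this is not how `cosine_similarity` is implemented internally.

```python
import math


def cosine(u, v):
    """Cosine similarity: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_matrix(vectors):
    """Pairwise similarities: entry [i][j] compares vectors[i] with vectors[j]."""
    return [[cosine(u, v) for v in vectors] for u in vectors]


# Hypothetical toy vectors standing in for real 384-dimensional embeddings
toy = [
    [1.0, 0.0, 0.0],  # points along the first axis
    [1.0, 1.0, 0.0],  # partly aligned with the first vector
    [0.0, 0.0, 2.0],  # orthogonal to both; its length does not affect the score
]

matrix = cosine_matrix(toy)
for row in matrix:
    print([round(x, 3) for x in row])
```

Because the score depends only on direction, not length, each vector compared with itself should give 1. The `1.0000002` on the diagonal of the notebook's output is float32 rounding error rather than a similarity greater than 1.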
