# Sentence Transformers - Basic Example 🚀

A simple tutorial for getting started with Sentence Transformers in JupyterLab.

## What we'll do:
1. Install and import libraries
2. Load a pre-trained model
3. Encode sentences into embeddings
4. Calculate similarity between sentences
5. See the results!


## 1. Setup and Imports


In [1]:
# Install sentence-transformers if it is not already installed (uncomment to run).
# %pip installs into the environment of the running kernel.
# %pip install sentence-transformers

from sentence_transformers import SentenceTransformer, util
import numpy as np

print("✅ Libraries imported successfully!")


✅ Libraries imported successfully!


## 2. Load Pre-trained Model


In [2]:
# Load a small, fast model
model = SentenceTransformer('all-MiniLM-L6-v2')
print(f"✅ Model loaded! Embedding dimension: {model.get_sentence_embedding_dimension()}")


✅ Model loaded! Embedding dimension: 384


## 3. Basic Example - Encode Sentences


In [3]:
# Example sentences
sentences = [
    "I love programming",
    "I enjoy coding", 
    "The weather is nice today",
    "It's a beautiful sunny day",
    "Python is great for machine learning"
]

# Convert sentences to embeddings (vectors)
embeddings = model.encode(sentences)

print(f"📊 Encoded {len(sentences)} sentences")
print(f"🔢 Each sentence becomes a vector of {len(embeddings[0])} numbers")
print(f"\n📝 Example - First sentence: '{sentences[0]}'")
print(f"🎯 First 5 numbers of its embedding: {embeddings[0][:5]}")


📊 Encoded 5 sentences
🔢 Each sentence becomes a vector of 384 numbers

📝 Example - First sentence: 'I love programming'
🎯 First 5 numbers of its embedding: [-0.03617874 -0.01277374  0.00300631 -0.01690345  0.00948433]


## 4. Calculate Sentence Similarities


In [4]:
# Compute the cosine similarity between every pair of sentences
similarity_matrix = util.cos_sim(embeddings, embeddings)

print("🔍 Cosine similarity (closer to 1.0 = more similar, near 0.0 = unrelated):\n")

# Show each pair once (j starts at i + 1, so self-comparisons and mirrored pairs are skipped)
for i in range(len(sentences)):
    for j in range(i + 1, len(sentences)):
        similarity = similarity_matrix[i][j].item()
        print(f"📊 Similarity: {similarity:.3f}")
        print(f"   📝 '{sentences[i]}'")
        print(f"   📝 '{sentences[j]}'\n")


🔍 Cosine similarity (closer to 1.0 = more similar, near 0.0 = unrelated):

📊 Similarity: 0.817
   📝 'I love programming'
   📝 'I enjoy coding'

📊 Similarity: 0.109
   📝 'I love programming'
   📝 'The weather is nice today'

📊 Similarity: 0.206
   📝 'I love programming'
   📝 'It's a beautiful sunny day'

📊 Similarity: 0.417
   📝 'I love programming'
   📝 'Python is great for machine learning'

📊 Similarity: 0.110
   📝 'I enjoy coding'
   📝 'The weather is nice today'

📊 Similarity: 0.168
   📝 'I enjoy coding'
   📝 'It's a beautiful sunny day'

📊 Similarity: 0.295
   📝 'I enjoy coding'
   📝 'Python is great for machine learning'

📊 Similarity: 0.670
   📝 'The weather is nice today'
   📝 'It's a beautiful sunny day'

📊 Similarity: 0.072
   📝 'The weather is nice today'
   📝 'Python is great for machine learning'

📊 Similarity: 0.079
   📝 'It's a beautiful sunny day'
   📝 'Python is great for machine learning'
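
To make the scores above less of a black box, here is a minimal from-scratch sketch of the cosine similarity formula in plain Python. It illustrates the math only and is not a description of how `util.cos_sim` is implemented internally. The function name `cosine_similarity` and the three-dimensional vectors are hypothetical, chosen purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors (real embeddings from this model have 384 numbers)
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
opposite = cosine_similarity([1.0, 2.0, 3.0], [-1.0, -2.0, -3.0])

print(f"same direction: {same_direction:.3f}")
print(f"orthogonal:     {orthogonal:.3f}")
print(f"opposite:       {opposite:.3f}")
```

Because only the angle between the vectors matters, scaling a vector does not change its score. The score can also be negative when two vectors point in opposite directions, although none of the sentence pairs above came out negative.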

## 5. Try Your Own Example!


In [5]:
# Try comparing your own sentences! Change these:
sentence1 = "I like artificial intelligence"
sentence2 = "Machine learning is fascinating"

# Encode both sentences
emb1 = model.encode([sentence1])
emb2 = model.encode([sentence2])

# Calculate similarity
similarity = util.cos_sim(emb1, emb2)[0][0].item()

print("🎯 Comparing your sentences:")
print(f"📝 Sentence 1: '{sentence1}'")
print(f"📝 Sentence 2: '{sentence2}'")
print(f"📊 Similarity: {similarity:.3f}")

# Interpretation
if similarity > 0.5:
    print("✅ These sentences are quite similar!")
elif similarity > 0.3:
    print("🤔 These sentences have some similarity")
else:
    print("❌ These sentences are quite different")


🎯 Comparing your sentences:
📝 Sentence 1: 'I like artificial intelligence'
📝 Sentence 2: 'Machine learning is fascinating'
📊 Similarity: 0.546
✅ These sentences are quite similar!
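
As a rough sketch of how the same cut-offs behave on more data, the block below applies them to the ten scores printed in section 4 and lists the pairs from most to least similar. The scores are copied from that output rather than recomputed, and the helper name `interpret` is hypothetical. The 0.5 and 0.3 cut-offs are this tutorial's rule of thumb, not a standard, and sensible values may differ between models and tasks.

```python
# Scores copied from the section 4 output (not recomputed here)
pair_scores = {
    ("I love programming", "I enjoy coding"): 0.817,
    ("I love programming", "The weather is nice today"): 0.109,
    ("I love programming", "It's a beautiful sunny day"): 0.206,
    ("I love programming", "Python is great for machine learning"): 0.417,
    ("I enjoy coding", "The weather is nice today"): 0.110,
    ("I enjoy coding", "It's a beautiful sunny day"): 0.168,
    ("I enjoy coding", "Python is great for machine learning"): 0.295,
    ("The weather is nice today", "It's a beautiful sunny day"): 0.670,
    ("The weather is nice today", "Python is great for machine learning"): 0.072,
    ("It's a beautiful sunny day", "Python is great for machine learning"): 0.079,
}


def interpret(score):
    """Bucket a score using the tutorial's rule-of-thumb cut-offs (0.5 and 0.3)."""
    if score > 0.5:
        return "quite similar"
    if score > 0.3:
        return "some similarity"
    return "quite different"


# Sort pairs so the most similar ones come first
ranked = sorted(pair_scores.items(), key=lambda item: item[1], reverse=True)

for (first, second), score in ranked:
    print(f"{score:.3f}  {interpret(score):<16} {first!r} / {second!r}")
```

With these cut-offs, only the two paraphrase pairs (programming/coding and the two weather sentences) land in the "quite similar" bucket.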


## 🎉 Congratulations!

You've successfully:
- ✅ Loaded a sentence transformer model
- ✅ Converted sentences into numerical embeddings
- ✅ Calculated semantic similarity between sentences
- ✅ Seen that the model scores sentences by meaning, not just shared words: "I love programming" and "I enjoy coding" scored 0.817 despite having only "I" in common

**Next steps you could try:**
- Change the sentences in the examples above
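- Try other languages (`all-MiniLM-L6-v2` is trained mainly on English, so a multilingual model such as `paraphrase-multilingual-MiniLM-L12-v2` is likely a better fit)
- Use a larger model like `all-mpnet-base-v2` for higher quality, at the cost of slower encoding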
- Build a simple search system
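
For the last suggestion, the ranking step of a simple search system can be sketched as follows: score every document vector against the query vector, then keep the best matches. This is a minimal illustration under stated assumptions, not a production design. The `search` helper, the document labels, and the three-dimensional vectors are all hypothetical stand-ins; in the notebook the vectors would come from `model.encode(...)` instead.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vector, corpus_vectors, top_k=2):
    """Return (index, score) pairs for the top_k most similar corpus vectors."""
    scored = [
        (index, cosine_similarity(query_vector, vector))
        for index, vector in enumerate(corpus_vectors)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy data standing in for model.encode(documents) and model.encode(query)
documents = ["note about coding", "note about weather", "note about cooking"]
document_vectors = [
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.1],
    [0.0, 0.2, 0.9],
]
query_vector = [0.8, 0.2, 0.1]

results = search(query_vector, document_vectors, top_k=2)
for index, score in results:
    print(f"{score:.3f}  {documents[index]}")
```

Scoring every document on every query is fine for a few thousand entries. The library also ships `util.semantic_search`, which is probably the more convenient option once real embeddings are involved.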
