# Sentence Transformers - Basic Example 🚀

A simple tutorial for getting started with Sentence Transformers in JupyterLab.

## What we'll do:
1. Install and import libraries
2. Load a pre-trained model
3. Encode sentences into embeddings
4. Calculate similarity between sentences
5. Try your own sentences and see the results!


## 1. Setup and Imports


In [None]:
# Install sentence-transformers if not already installed
# !pip install sentence-transformers

from sentence_transformers import SentenceTransformer, util
import numpy as np

print("✅ Libraries imported successfully!")
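
If the import above fails, the package is probably not installed in the kernel's environment. The sketch below is one way to check for it without triggering an `ImportError`. The helper name `is_installed` is hypothetical (not part of any library); it only relies on the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# The pip package is "sentence-transformers"; the importable module is "sentence_transformers".
if is_installed("sentence_transformers"):
    print("sentence_transformers is available")
else:
    print("Not installed yet: run `%pip install sentence-transformers` in a notebook cell")
```

Using the `%pip` magic rather than `!pip` installs into the environment of the running kernel, which avoids the common case where the package ends up in a different Python installation.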


## 2. Load Pre-trained Model


In [None]:
# Load a small, fast model
model = SentenceTransformer('all-MiniLM-L6-v2')
print(f"✅ Model loaded! Embedding dimension: {model.get_sentence_embedding_dimension()}")


## 3. Basic Example - Encode Sentences


In [None]:
# Example sentences
sentences = [
    "I love programming",
    "I enjoy coding", 
    "The weather is nice today",
    "It's a beautiful sunny day",
    "Python is great for machine learning"
]

# Convert sentences to embeddings (vectors)
embeddings = model.encode(sentences)

print(f"📊 Encoded {len(sentences)} sentences")
print(f"🔢 Each sentence becomes a vector of {len(embeddings[0])} numbers")
print(f"\n📝 Example - First sentence: '{sentences[0]}'")
print(f"🎯 First 5 numbers of its embedding: {embeddings[0][:5]}")
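
Each embedding is just a list of numbers, so ordinary vector arithmetic applies to it. As a small illustration, the sketch below scales a vector to unit length using only the standard library. The 3-number `toy_embedding` is a made-up stand-in for a real embedding, and `l2_normalize` is a hypothetical helper name.

```python
import math


def l2_normalize(vector):
    """Scale a vector so its Euclidean length is 1 (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        return list(vector)
    return [x / norm for x in vector]


toy_embedding = [3.0, 4.0, 0.0]  # made-up stand-in, not real model output
unit = l2_normalize(toy_embedding)
print(unit)  # [0.6, 0.8, 0.0]
```

With real embeddings you would not normally do this by hand: `model.encode(sentences, normalize_embeddings=True)` returns unit-length vectors, for which a plain dot product equals the cosine similarity.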


## 4. Calculate Sentence Similarities


In [None]:
# Calculate similarity between all sentences
similarity_matrix = util.cos_sim(embeddings, embeddings)

print("🔍 Similarity Results (cosine similarity: close to 1.0 = very similar, close to 0.0 = unrelated):\n")

# Show similarities between sentence pairs
for i in range(len(sentences)):
    for j in range(i+1, len(sentences)):
        similarity = similarity_matrix[i][j].item()
        print(f"📊 Similarity: {similarity:.3f}")
        print(f"   📝 '{sentences[i]}'")
        print(f"   📝 '{sentences[j]}'\n")
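
To see what `util.cos_sim` measures, here is a minimal from-scratch sketch of cosine similarity: the dot product of two vectors divided by the product of their lengths. The vectors in `toy_vectors` are invented for illustration and do not come from the model. Note that the score can also be negative, down to -1.0 for vectors pointing in opposite directions.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their Euclidean lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors chosen to show the range of possible scores
toy_vectors = {
    "A": [1.0, 2.0, 0.0],
    "A scaled by 2": [2.0, 4.0, 0.0],
    "orthogonal to A": [0.0, 0.0, 5.0],
    "opposite of A": [-1.0, -2.0, 0.0],
}

# Compare every unordered pair once, like the nested loop in the cell above
for (name_a, vec_a), (name_b, vec_b) in combinations(toy_vectors.items(), 2):
    print(f"{cosine_similarity(vec_a, vec_b):+.3f}  {name_a} vs {name_b}")
```

Because only the direction of the vectors matters, scaling a vector does not change its similarity to anything else.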


## 5. Try Your Own Example!


In [None]:
# Try comparing your own sentences! Change these:
sentence1 = "I like artificial intelligence"
sentence2 = "Machine learning is fascinating"

# Encode both sentences
emb1 = model.encode([sentence1])
emb2 = model.encode([sentence2])

# Calculate similarity
similarity = util.cos_sim(emb1, emb2)[0][0].item()

print("🎯 Comparing your sentences:")
print(f"📝 Sentence 1: '{sentence1}'")
print(f"📝 Sentence 2: '{sentence2}'")
print(f"📊 Similarity: {similarity:.3f}")

# Interpretation (rough rule-of-thumb thresholds, not fixed cutoffs)
if similarity > 0.5:
    print("✅ These sentences are quite similar!")
elif similarity > 0.3:
    print("🤔 These sentences have some similarity")
else:
    print("❌ These sentences are quite different")
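
If you want to reuse the interpretation step for many sentence pairs, it can be wrapped in a small function. This is only a sketch: `describe_similarity` is a hypothetical helper, and the 0.5 and 0.3 defaults are simply the thresholds from the cell above, which you may need to adjust for your own data and model.

```python
def describe_similarity(score, high=0.5, low=0.3):
    """Map a cosine similarity score to a rough label using two thresholds."""
    if score > high:
        return "quite similar"
    if score > low:
        return "some similarity"
    return "quite different"


# Example scores are made up to exercise each branch
for score in (0.82, 0.41, 0.05):
    print(f"{score:.2f} -> {describe_similarity(score)}")
```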


## 🎉 Congratulations!

You've successfully:
- ✅ Loaded a sentence transformer model
- ✅ Converted sentences into numerical embeddings
- ✅ Calculated semantic similarity between sentences
- ✅ Seen how the model captures meaning, not just matching words!

**Next steps you could try:**
- Change the sentences in the examples above
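- Try other languages (`all-MiniLM-L6-v2` is trained mainly on English, so use a multilingual model such as `paraphrase-multilingual-MiniLM-L12-v2`)
- Try a larger model such as `all-mpnet-base-v2`, which generally gives higher-quality embeddings but runs more slowly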
- Build a simple search system
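
As a starting point for the search idea, the sketch below ranks a few documents by cosine similarity to a query and returns the best matches. To keep it runnable without downloading a model, the vectors in `toy_corpus` and `toy_query` are invented 3-number stand-ins; in the notebook you would get them from `model.encode(...)` instead. The function names `cosine_similarity` and `search` are hypothetical helpers written for this example.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their Euclidean lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vector, corpus, top_k=2):
    """Return the top_k (score, text) pairs from corpus, a list of (text, vector) tuples."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Made-up vectors; real ones would come from model.encode(...)
toy_corpus = [
    ("I love programming", [0.9, 0.1, 0.0]),
    ("The weather is nice today", [0.0, 0.2, 0.9]),
    ("Python is great for machine learning", [0.8, 0.3, 0.1]),
]
toy_query = [1.0, 0.2, 0.0]  # pretend embedding of a programming-related query

for score, text in search(toy_query, toy_corpus):
    print(f"{score:.3f}  {text}")
```

For larger collections, the library's own `util.semantic_search(query_embeddings, corpus_embeddings, top_k=...)` does the same kind of ranking on real embeddings.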
