# Embeddings & Semantic Search

This notebook shows how to use embeddings for semantic search. It accompanies the SLM Hub [Embeddings Guide](https://slmhub.gitbook.io/slmhub/docs/learn/fundamentals/embeddings).

## 1. Setup
Install `sentence-transformers` for embeddings, `chromadb` for the vector store, and `scikit-learn` for the cosine similarity helper.

In [None]:
!pip install sentence-transformers chromadb scikit-learn

## 2. Basic Embeddings
We use `sentence-transformers` to convert each text into a fixed-length vector (384 dimensions for the model loaded below).

In [None]:
from sentence_transformers import SentenceTransformer

# Load a small, fast embedding model
model = SentenceTransformer("all-MiniLM-L6-v2")

# Create embeddings
texts = [
    "Small language models are efficient",
    "SLMs use less compute than LLMs",
    "The weather is nice today"
]

embeddings = model.encode(texts)
print(f"Embeddings shape: {embeddings.shape}")  # (3, 384)

## 3. Calculate Similarity
We can use cosine similarity to measure how related the texts are. Scores close to 1 indicate similar meaning, and scores close to 0 indicate unrelated texts.

In [None]:
from sklearn.metrics.pairwise import cosine_similarity

similarities = cosine_similarity(embeddings)

print("Similarity Matrix:")
print(similarities)

# Text 0 ("Small language...") vs Text 1 ("SLMs use less...") should be high
print(f"\nSimilarity (0 vs 1): {similarities[0][1]:.4f}")
# Text 0 vs Text 2 ("Weather...") should be low
print(f"Similarity (0 vs 2): {similarities[0][2]:.4f}")
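To make the metric less of a black box, here is a minimal sketch of the calculation behind a single cosine similarity score: the dot product of two vectors divided by the product of their lengths. The helper name `cosine` and the toy 3-dimensional vectors are illustrative assumptions, not output from the model above.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors (hypothetical values, not real embeddings)
v_a = [1.0, 2.0, 0.0]
v_b = [2.0, 4.0, 0.0]   # points in the same direction as v_a
v_c = [0.0, 0.0, 3.0]   # orthogonal to v_a

print(f"Same direction: {cosine(v_a, v_b):.4f}")
print(f"Orthogonal:     {cosine(v_a, v_c):.4f}")
```

Only the direction of the vectors matters, not their length, which is why `v_a` and `v_b` score as essentially identical even though `v_b` is twice as long.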

## 4. RAG Example with ChromaDB
A simple example of the retrieval step in Retrieval-Augmented Generation (RAG): index documents, then search them with a query.

In [None]:
import chromadb

# Initialize Chroma Client
client = chromadb.Client()
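# Create the collection (delete any existing one first so reruns start clean)
try:
    client.delete_collection("docs")
except Exception:
    # The collection does not exist yet, so there is nothing to delete
    pass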
collection = client.create_collection("docs")

# Documents to index
documents = [
    "SLMs are models with fewer parameters, typically under 10B.",
    "Fine-tuning adapts models to specific tasks using data.",
    "Quantization reduces model size by lowering precision (e.g., 4-bit)."
]

# Generate embeddings for docs
doc_embeddings = model.encode(documents)

# Add to Chroma
collection.add(
    embeddings=doc_embeddings.tolist(),
    documents=documents,
    ids=["doc1", "doc2", "doc3"]
)

# Perform a Search
query = "How to make models smaller?"
query_embed = model.encode(query)

results = collection.query(
    query_embeddings=[query_embed.tolist()],
    n_results=2
)

print(f"Query: '{query}'")
print("\nTop Results:")
for doc in results["documents"][0]:
    print(f" - {doc}")
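The cell above stops at retrieval. As a rough sketch of the "augmented" part of RAG, the retrieved passages can be placed into a prompt ahead of the question before it is sent to a language model. The function name `build_rag_prompt` and the prompt wording are assumptions made for illustration, and the hard-coded list stands in for `results["documents"][0]`, whose actual order depends on the embedding model.

```python
def build_rag_prompt(query, retrieved_docs):
    """Combine retrieved passages and the user query into one prompt string."""
    context = "\n".join(f"- {doc}" for doc in retrieved_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

# Stand-in for results["documents"][0] (hypothetical ranking)
retrieved = [
    "Quantization reduces model size by lowering precision (e.g., 4-bit).",
    "SLMs are models with fewer parameters, typically under 10B.",
]

prompt = build_rag_prompt("How to make models smaller?", retrieved)
print(prompt)
```

The resulting string would then be passed to whichever model generates the answer. That step is outside the scope of this notebook.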