### **Semantic Search with Sentence Transformers**
**Goal:** Build a semantic search engine that ranks documents/passages by semantic similarity to a user query using pretrained sentence embeddings.

**For example, if the user asks:**
*  🗨️ “How can I speak to support?”

The system might match it with:
*  🗨️ “How do I contact customer service?”

The match succeeds even though the two sentences share no content keywords, because their meanings are similar.
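To see why plain keyword matching would miss this pair, here is a minimal sketch that compares the content words of the two sentences. The `STOPWORDS` set and the `keywords` helper are hypothetical names introduced only for this illustration; they are not part of the notebook or of any library.

```python
import re

# Hypothetical, deliberately tiny stopword list (illustration only)
STOPWORDS = {"how", "can", "i", "do", "to", "my", "the", "is", "a"}

def keywords(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}

query_kw = keywords("How can I speak to support?")
doc_kw = keywords("How do I contact customer service?")

# Content words that appear in both sentences
shared = query_kw & doc_kw
print(query_kw, doc_kw, shared)
```

Under these assumptions the two sentences have no content words in common, so a keyword-overlap ranker would have nothing to score.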

**Prepare the Document Corpus**

You can use any list of texts (FAQs, product descriptions, articles).

In [1]:
# Sample document corpus
documents = [
"How do I reset my password?",
"What is the refund policy for returned items?",
"How can I track my order?",
"Where are you located?",
"How do I contact customer service?",
"Can I cancel my subscription anytime?"
]

**Load Model**

In [2]:
from sentence_transformers import SentenceTransformer
# Load a lightweight model
model = SentenceTransformer('all-MiniLM-L6-v2')

**Encode Documents**

In [3]:
# Convert all documents into sentence embeddings
doc_embeddings = model.encode(documents, convert_to_tensor=True)

**Encode User Query**

In [4]:
query = "How can I speak to support?"
query_embedding = model.encode(query, convert_to_tensor=True)

**Compute Similarity**

In [5]:
import numpy as np
from sentence_transformers.util import cos_sim

# cos_sim measures cosine similarity between the query and each document vector.
# It returns a 1 x N matrix; [0] selects the single row of per-document scores.
similarities = cos_sim(query_embedding, doc_embeddings)[0]

# Move the scores to the CPU as a NumPy array so the sort also works
# when the model (and therefore the tensors) lives on a GPU.
scores = similarities.cpu().numpy()

# Rank results by similarity (highest score first)
top_k = 3
top_results = np.argsort(-scores)[:top_k]

# Show top matches
print(f"\nQuery: {query}\nTop {top_k} results:")
for idx in top_results:
  print(f"{documents[idx]} (score: {scores[idx]:.4f})")


Query: How can I speak to support?
Top 3 results:
How do I contact customer service? (score: 0.4009)
How can I track my order? (score: 0.1739)
Can I cancel my subscription anytime? (score: 0.1543)
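To make the scores above more concrete, the following is a minimal sketch of what cosine similarity computes, written in plain Python. The three-dimensional vectors are hypothetical toy values, not output from the model, and `cosine_similarity` is an illustrative helper rather than the library's `cos_sim`.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (hypothetical values chosen for illustration)
query_vec = [1.0, 2.0, 0.0]
doc_close = [2.0, 4.0, 0.0]  # points in the same direction as the query
doc_far = [0.0, 0.0, 3.0]    # orthogonal to the query

sim_close = cosine_similarity(query_vec, doc_close)
sim_far = cosine_similarity(query_vec, doc_far)
print(f"{sim_close:.4f} {sim_far:.4f}")
```

Because the measure depends only on direction, `doc_close` scores about 1 despite being twice as long as the query, while the orthogonal `doc_far` scores 0.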


**Wrap in a Function**

Here we generalize everything into a reusable function:

In [6]:
from sentence_transformers import util
import numpy as np

def semantic_search(query, docs, embeddings, model, top_k=3):
    """Return the top_k (document, score) pairs most similar to the query."""
    query_emb = model.encode(query, convert_to_tensor=True)
    # Move scores to the CPU so NumPy can sort them even if the model runs on a GPU
    scores = util.cos_sim(query_emb, embeddings)[0].cpu().numpy()
    top_k_idx = np.argsort(-scores)[:top_k]
    return [(docs[i], float(scores[i])) for i in top_k_idx]

# Example usage:
results = semantic_search("I need help with my account", documents, doc_embeddings, model)

for text, score in results:
    print(f"> {text} (score: {score:.4f})")


> How do I reset my password? (score: 0.5667)
> How do I contact customer service? (score: 0.3736)
> Can I cancel my subscription anytime? (score: 0.3236)


# 📝 **Summary:**

## 🎯 Objective
The goal of this project is to create a semantic search engine that retrieves the most relevant text responses based on the meaning of a user query. Instead of matching exact keywords, it compares sentence embeddings that capture the *context* and *intent* behind the input.

## 🧠 How It Works
- **Sentence Embeddings:**  
  A pretrained model (`all-MiniLM-L6-v2` from SentenceTransformers) converts each document and the user query into a dense vector (a 384-dimensional embedding for this model) that captures semantic meaning.
  
- **Similarity Calculation:**  
  Cosine similarity compares the query vector with every document vector. Each score lies between -1 and 1, and a higher score means the document is closer in meaning to the query.

- **Ranking & Results:**  
  The documents are ranked by similarity scores, and the top-k most relevant results are returned.
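The ranking step described above can be sketched in isolation as follows. This is a minimal illustration using the standard library's `heapq.nlargest`, not the notebook's NumPy-based code; `rank_top_k` and the `toy_scores` values are hypothetical.

```python
import heapq

def rank_top_k(scores, k=3):
    """Return (index, score) pairs for the k highest scores, best first."""
    return heapq.nlargest(k, enumerate(scores), key=lambda pair: pair[1])

# Hypothetical similarity scores, one per document (illustration only)
toy_scores = [0.12, 0.40, 0.17, 0.05, 0.15]

top = rank_top_k(toy_scores, k=3)
for doc_index, score in top:
    print(f"document {doc_index}: {score:.2f}")
```

The returned indices can then be used to look up the matching texts in the document list, as `semantic_search` does with `docs[i]`.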

## 🔧 Use Cases
- FAQ assistants
- Customer support bots
- Knowledge base search
- Any application that must match text by meaning rather than by exact wording

## ✅ Outcome
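This approach returns results that are relevant in meaning even when the user's wording differs from the stored content. In the example above, the query "How can I speak to support?" ranks "How do I contact customer service?" first with a score of 0.4009, well ahead of the next result at 0.1739.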
