| Method                                  | LangChain/Vector Store Function                         | Description                                                                             |
| --------------------------------------- | ------------------------------------------------------- | --------------------------------------------------------------------------------------- |
| **Basic Similarity Search**             | `similarity_search(query, k=3)`                         | Returns top-k most similar documents (no scores).                                       |
| **Similarity Search with Score**        | `similarity_search_with_score(query, k=3)`              | Returns top-k documents **with scores** (often a distance, where lower means closer). Useful for filtering. |
| **MMR (Max Marginal Relevance) Search** | `max_marginal_relevance_search(query, k=5, fetch_k=20)` | Diversifies results to avoid redundancy. Great when documents may be too similar.       |
| **Hybrid Search** (if supported)        | Combines semantic + keyword search                      | Some vector DBs like **Weaviate** or **Pinecone** support this natively. Not in Chroma. |
| **Filtered Search**                     | `similarity_search(..., filter={"type": "policy"})`     | Filter by metadata (e.g. only docs whose `type` is "policy").                           |


What fetch_k=20 means:
fetch_k is the number of initial documents to retrieve from the vector store based purely on similarity to the query.
These 20 documents are then reranked using Max Marginal Relevance (MMR) to select the top k = 5 documents that are:

* Relevant to the query, and
* Diverse from each other (less redundancy).

| Parameter    | Meaning                                                       |
| ------------ | ------------------------------------------------------------- |
| `fetch_k=20` | Retrieve the 20 documents most similar to the query as candidates. |
| `k=5`        | From those candidates, return the 5 that best balance relevance *and diversity*. |
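
To make the two-stage behaviour concrete, here is a minimal pure-Python sketch of MMR selection over toy 2-D vectors. It is an illustration under stated assumptions, not LangChain's implementation: `mmr_select` and `cosine` are hypothetical helpers, and `lambda_mult` is modelled on the LangChain parameter of the same name (values near 0 favour diversity, values near 1 favour relevance).

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(query_vec: Sequence[float], doc_vecs: List[Sequence[float]],
               k: int = 5, fetch_k: int = 20, lambda_mult: float = 0.5) -> List[int]:
    """Return indices of up to k documents chosen by a simple MMR loop."""
    # Stage 1: candidate pool = the fetch_k documents most similar to the query.
    sims = [cosine(query_vec, d) for d in doc_vecs]
    candidates = sorted(range(len(doc_vecs)), key=lambda i: sims[i], reverse=True)[:fetch_k]

    # Stage 2: greedily pick the candidate with the best relevance/redundancy trade-off.
    selected: List[int] = []
    while candidates and len(selected) < k:
        def mmr_score(i: int) -> float:
            redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0)
            return lambda_mult * sims[i] - (1 - lambda_mult) * redundancy

        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


query = [1.0, 0.0]
docs = [[1.0, 0.0], [1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]  # docs 0 and 1 are duplicates

diverse = mmr_select(query, docs, k=2, fetch_k=3, lambda_mult=0.3)
relevance_only = mmr_select(query, docs, k=2, fetch_k=3, lambda_mult=1.0)
print(diverse, relevance_only)
```

With `lambda_mult=1.0` the loop reduces to plain similarity ranking and returns both duplicates, while the lower value swaps the second duplicate for a less similar but non-redundant document. Document 3 is never considered because it falls outside the `fetch_k=3` pool.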


1. Similarity Threshold Filtering (Manual Check)
Instead of relying on the scores returned by the vector store, as in the previous method, you can compute the cosine similarity yourself and apply a stricter threshold.

In [None]:
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
from sklearn.metrics.pairwise import cosine_similarity

# Assumption: `embedding`, `vectorstore`, and `query` were defined in earlier cells.

# Embed the query
query_embedding = embedding.embed_query(query)

# Get the top document and re-embed its text
results = vectorstore.similarity_search_with_score(query, k=1)
if results:
    top_doc, score = results[0]  # `score` is the store's own score; unused here
    doc_embedding = embedding.embed_query(top_doc.page_content)

    # Compute cosine similarity manually.
    # cosine_similarity returns a 2-D array such as [[0.98]]; [0][0] extracts the scalar.
    similarity = cosine_similarity([query_embedding], [doc_embedding])[0][0]

    if similarity > 0.75:  # Custom threshold
        # Call LLM
        ...
    else:
        print("Query is too different. Skipping LLM.")
else:
    print("No documents retrieved. Skipping LLM.")


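How to read the `cosine_similarity` value (range -1 to 1):

Higher similarity score → more similar

Lower similarity score → less similar

This is the opposite of a distance score, such as the one `similarity_search_with_score` returns for many stores, where a lower value means a closer match.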
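
The following sketch shows where those values come from, using a hand-written cosine similarity on toy vectors. `cosine_sim` is a hypothetical helper written for this illustration; in practice `sklearn.metrics.pairwise.cosine_similarity` does the same job on 2-D arrays.

```python
import math
from typing import Sequence


def cosine_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Convention chosen for this sketch: zero vectors match nothing.
    return dot / (norm_a * norm_b)


same_direction = cosine_sim([3.0, 4.0], [3.0, 4.0])    # vectors point the same way
unrelated = cosine_sim([1.0, 0.0], [0.0, 1.0])         # vectors are orthogonal
opposite = cosine_sim([1.0, 0.0], [-1.0, 0.0])         # vectors point opposite ways

for label, value in [("same", same_direction), ("unrelated", unrelated), ("opposite", opposite)]:
    verdict = "call LLM" if value > 0.75 else "skip LLM"
    print(f"{label}: {value:.2f} -> {verdict}")
```

Only the first pair clears the 0.75 threshold used earlier, which is the behaviour the gate relies on.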



2. Use Metadata Filtering Before Similarity Search
If your vector store keeps metadata alongside each document (e.g. topics, tags), you can restrict the search to documents whose metadata matches before any similarity ranking happens.

In [None]:
# Chroma or Pinecone example
results = vectorstore.similarity_search(
    query, 
    k=1, 
    filter={"category": "health"}  # Only search in health-related documents
)

if results:
    # Proceed to LLM
    ...
else:
    print("No documents found for this category.")
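
Conceptually, the filter narrows the candidate set to exact metadata matches before any ranking takes place. The sketch below mimics that on an in-memory list so the effect is easy to see; `matches_filter`, `prefilter`, and the sample documents are hypothetical, and real vector stores apply the filter internally with their own syntax.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]


def matches_filter(metadata: Dict[str, Any], wanted: Dict[str, Any]) -> bool:
    """True when every key/value pair in `wanted` is present in `metadata`."""
    return all(metadata.get(key) == value for key, value in wanted.items())


def prefilter(docs: List[Document], wanted: Dict[str, Any]) -> List[Document]:
    """Keep only the documents whose metadata satisfies the filter."""
    return [doc for doc in docs if matches_filter(doc["metadata"], wanted)]


docs = [
    {"text": "Managing high blood pressure", "metadata": {"category": "health"}},
    {"text": "Quarterly budget review", "metadata": {"category": "finance"}},
    {"text": "Seasonal flu vaccination", "metadata": {"category": "health"}},
]

health_docs = prefilter(docs, {"category": "health"})
legal_docs = prefilter(docs, {"category": "legal"})

print([doc["text"] for doc in health_docs])
print("Skip LLM" if not legal_docs else "Proceed to LLM")
```

An empty result for a category is the signal to skip the LLM call, mirroring the `if results:` check in the cell above.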


3. Use a Keyword Search (Pre-filter) Before Vector Search
Before even embedding the query, perform a keyword search on a document index or summary file.

If keywords match → go to vector search

If not → skip

In [None]:
if "hypertension" in query.lower():
    # Proceed to vector search + LLM
    ...
else:
    print("Query does not match keyword criteria.")
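
A single hard-coded keyword rarely scales, so one possible generalisation is a small gate that checks the query against a set of allowed keywords. This is a sketch under assumptions: `passes_keyword_gate` and the `ALLOWED_KEYWORDS` set are hypothetical, and it matches whole single-word tokens rather than substrings.

```python
import re
from typing import Iterable

# Hypothetical keyword list; in practice this might come from a document index.
ALLOWED_KEYWORDS = {"hypertension", "diabetes", "cholesterol"}


def passes_keyword_gate(query: str, keywords: Iterable[str]) -> bool:
    """Return True if any keyword appears as a whole word in the query."""
    tokens = set(re.findall(r"[a-z0-9]+", query.lower()))
    return any(keyword.lower() in tokens for keyword in keywords)


on_topic = passes_keyword_gate("What causes Hypertension?", ALLOWED_KEYWORDS)
off_topic = passes_keyword_gate("How do I reset my password?", ALLOWED_KEYWORDS)

print(on_topic, off_topic)
```

Whole-word matching is stricter than the substring check in the cell above: it avoids accidental hits inside longer words but also misses plurals and multi-word phrases, so the right choice depends on how the keywords are curated.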
