# Choosing Best-Fit Embeddings

This notebook shows how to choose an embedding model that fits your use case and how to store the resulting embeddings in SingleStore.

In [120]:
!pip install -q -U langchain langchain-community singlestoredb langchain-openai langchain-huggingface

In [147]:
import getpass
import os

# Prompt for the key without echoing it, then expose it to the OpenAI client.
OPENAI_API_KEY = getpass.getpass()

os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

 ········


In [141]:
from langchain.chains import LLMChain
from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate
from langchain_huggingface import HuggingFaceEmbeddings

## Loading a Hugging Face Transformers model from the MTEB Leaderboard

The model used below is taken from the MTEB Leaderboard: https://huggingface.co/spaces/mteb/leaderboard



In [None]:
model_name = "mixedbread-ai/mxbai-embed-large-v1"
hf_embeddings = HuggingFaceEmbeddings(
    model_name=model_name,
)
texts = ["Hello, world!", "How are you?"]
hf_embeddings.embed_documents(texts)
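
`embed_documents` returns one vector (a list of floats) per input text, and vector search ranks documents by a similarity score between such vectors. As a minimal sketch of that idea, the block below ranks toy vectors by cosine similarity. The vectors, the document names, and the choice of cosine similarity are illustrative assumptions; the metric actually used by the vector store is not stated in this notebook.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real embeddings (hypothetical values).
query_vec = [1.0, 0.0, 1.0]
doc_vecs = {
    "doc_a": [1.0, 0.0, 1.0],  # points in the same direction as the query
    "doc_b": [0.0, 1.0, 0.0],  # orthogonal to the query
}

# Rank document names from most to least similar to the query.
ranked = sorted(
    doc_vecs,
    key=lambda name: cosine_similarity(query_vec, doc_vecs[name]),
    reverse=True,
)
print(ranked)
```

With real embeddings, `query_vec` would come from `hf_embeddings.embed_query(...)` and the document vectors from `hf_embeddings.embed_documents(...)`.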

## Benchmarking over a test data set

Next, we build a small test data set of car descriptions and benchmark the chosen model against it.

In [130]:
cars = [
    {
        "name": "Sedan",
        "description": "A classic four-door car with a spacious interior and smooth ride, perfect for daily commutes or family trips."
    },
    {
        "name": "SUV",
        "description": "A versatile vehicle with ample seating and cargo space. Enjoy off-road adventures or city driving with ease."
    },
    {
        "name": "Sports Car",
        "description": "Indulge in high-speed thrills with this sleek, aerodynamic vehicle. Experience powerful performance and dynamic handling."
    },
    {
        "name": "Convertible",
        "description": "Enjoy open-air driving with this stylish car. Whether you prefer sunny days or starry nights, it's a ride for all seasons."
    },
    {
        "name": "Hatchback",
        "description": "Compact and practical, this car offers easy maneuverability and ample storage space for urban living or weekend getaways."
    },
    {
        "name": "Pickup Truck",
        "description": "Robust and reliable, this truck is built for hauling and towing. Perfect for work or outdoor adventures."
    },
    {
        "name": "Minivan",
        "description": "Spacious and family-friendly, this vehicle offers comfortable seating and modern amenities for long road trips."
    },
    {
        "name": "Coupe",
        "description": "A stylish two-door car with a sporty design. Ideal for those who appreciate performance and aesthetics in a compact form."
    }
]

In [132]:
from langchain_core.documents import Document

# Wrap each car description in a Document, keeping the car name as metadata.
docs = [
    Document(page_content=car["description"], metadata={"name": car["name"]})
    for car in cars
]

print(docs)

[Document(page_content='A classic four-door car with a spacious interior and smooth ride, perfect for daily commutes or family trips.', metadata={'name': 'Sedan'}), Document(page_content='A versatile vehicle with ample seating and cargo space. Enjoy off-road adventures or city driving with ease.', metadata={'name': 'SUV'}), Document(page_content='Indulge in high-speed thrills with this sleek, aerodynamic vehicle. Experience powerful performance and dynamic handling.', metadata={'name': 'Sports Car'}), Document(page_content="Enjoy open-air driving with this stylish car. Whether you prefer sunny days or starry nights, it's a ride for all seasons.", metadata={'name': 'Convertible'}), Document(page_content='Compact and practical, this car offers easy maneuverability and ample storage space for urban living or weekend getaways.', metadata={'name': 'Hatchback'}), Document(page_content='Robust and reliable, this truck is built for hauling and towing. Perfect for work or outdoor adventures.', 

In [134]:
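import os

from langchain_community.vectorstores import SingleStoreDB

# The connection_* variables are assumed to be defined already (for example, by the notebook environment).
os.environ["SINGLESTOREDB_URL"] = f'{connection_user}:{connection_password}@{connection_host}:{connection_port}/{connection_default_database}'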

In [152]:
vectorstore = SingleStoreDB.from_documents(
    documents=docs,
    embedding=hf_embeddings,
    table_name="embedding_test",
    use_vector_index=True,
)

In [148]:
from langchain.chains import RetrievalQA

llm = ChatOpenAI()
qa_chain = RetrievalQA.from_chain_type(llm, retriever=vectorstore.as_retriever())
# Calling the chain object directly is deprecated; invoke() returns the same dict.
qa_chain.invoke({"query": "What cars are best suited for families?"})


{'query': 'What cars are best suited for families?',
 'result': 'The classic four-door car with a spacious interior and smooth ride, as well as the spacious and family-friendly vehicle with comfortable seating and modern amenities would be best suited for families.'}

In [170]:
vectorResults = vectorstore.similarity_search(
    "family car",
    k=4,
    search_strategy=SingleStoreDB.SearchStrategy.VECTOR_ONLY,
)
print(vectorResults[0].metadata["name"])
print(vectorResults[0].page_content)

Minivan
Spacious and family-friendly, this vehicle offers comfortable seating and modern amenities for long road trips.


In [171]:
vectorResults = vectorstore.similarity_search(
    "sporty car",
    k=4,
    search_strategy=SingleStoreDB.SearchStrategy.VECTOR_ONLY,
)
print(vectorResults[0].metadata["name"])
print(vectorResults[0].page_content)

Coupe
A stylish two-door car with a sporty design. Ideal for those who appreciate performance and aesthetics in a compact form.


## Query 1: "Family vehicle"

| Rank | mixedbread-ai/mxbai-embed-large-v1  |
|------|-------------------------------------|
| 1    | ✅ SUV                              |
| 2    | ❌ Sedan                            |
| 3    | ✅ Minivan                          |
| 4    | ❌ Pickup Truck                     |

**Precision**: 2/4 = 0.50

**Recall**: 2/3 = 0.67

## Query 2: "Sporty car"

| Rank | mixedbread-ai/mxbai-embed-large-v1  |
|------|-------------------------------------|
| 1    | ✅ Sports Car                       |
| 2    | ❌ Sedan                            |
| 3    | ✅ Convertible                      |
| 4    | ❌ Hatchback                        |

**Precision**: 2/4 = 0.50

**Recall**: 2/3 = 0.67
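
The precision and recall figures above can be reproduced with a few lines of Python. The sketch below is one way to compute them for the top-k results of a single query; the helper name `precision_recall` is hypothetical, and because the notebook does not list the full set of relevant cars, the third relevant item ("Coupe") is an assumption made only so that the recall denominator is 3.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    retrieved: ranked list of result names (the top-k results)
    relevant:  collection of names judged relevant for the query
    """
    relevant = set(relevant)
    hits = [name for name in retrieved if name in relevant]
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Top-4 results for Query 2, copied from the table above.
retrieved = ["Sports Car", "Sedan", "Convertible", "Hatchback"]

# "Sports Car" and "Convertible" are marked relevant in the table;
# "Coupe" is an ASSUMED third relevant item (not stated in the notebook).
relevant = {"Sports Car", "Convertible", "Coupe"}

precision, recall = precision_recall(retrieved, relevant)
print(f"Precision: {precision:.2f}, Recall: {recall:.2f}")
# prints "Precision: 0.50, Recall: 0.67"
```

In the notebook, the `retrieved` list could be built from a search result with `[doc.metadata["name"] for doc in vectorResults]`. Running the same labelled queries against each candidate embedding model then gives directly comparable scores.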