# Hybrid Search

The standard search in LangChain is done by vector similarity. However, a number of vectorstore implementations (Astra DB, ElasticSearch, Neo4J, AzureSearch, Qdrant...) also support more advanced search that combines vector similarity search with other search techniques (full-text, BM25, and so on). This is generally referred to as "Hybrid" search.

**Step 1: Make sure the vectorstore you are using supports hybrid search**

At the moment, there is no unified way to perform hybrid search in LangChain. Each vectorstore may have its own way to do it. This is generally exposed as a keyword argument that is passed in during `similarity_search`.

By reading the documentation or source code, figure out whether the vectorstore you are using supports hybrid search, and, if so, how to use it.
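
One quick way to start that investigation is to list the explicit parameters of a store's `similarity_search` method. The sketch below is only an illustration: `ToyVectorStore` and the helper `search_parameters` are hypothetical names invented here, not LangChain classes or functions. Many real vectorstores accept extra options through `**kwargs`, so an argument missing from the signature does not prove that hybrid search is unsupported; the documentation remains the authority.

```python
import inspect
from typing import Any, List, Optional


class ToyVectorStore:
    """Hypothetical stand-in for a vectorstore class (not a LangChain class)."""

    def similarity_search(
        self,
        query: str,
        k: int = 4,
        body_search: Optional[str] = None,  # hypothetical hybrid-search argument
        **kwargs: Any,
    ) -> List[str]:
        return []


def search_parameters(store_cls: type) -> List[str]:
    """Return the names of the explicit parameters of `similarity_search`."""
    signature = inspect.signature(store_cls.similarity_search)
    return [
        name
        for name, param in signature.parameters.items()
        # Skip `self` and the catch-all `**kwargs`, which carry no information.
        if name != "self" and param.kind is not inspect.Parameter.VAR_KEYWORD
    ]


print(search_parameters(ToyVectorStore))
```

In this toy case, the listing surfaces `body_search` as a candidate hybrid-search argument to look up in the store's documentation.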

**Step 2: Add that parameter as a configurable field for the chain**

This will let you easily call the chain and configure any relevant flags at runtime. See [this documentation](/docs/how_to/configure) for more information on configuration.

**Step 3: Call the chain with that configurable field**

Now, at runtime, you can call this chain with that configurable field set.
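
As a rough mental model of steps 2 and 3, a configurable field can be thought of as a default value that a per-call `config` may replace. The sketch below is a simplified illustration under that assumption, not LangChain's implementation: `ToyRetriever` is a hypothetical class, and it ignores the query text entirely (no vector ranking) so that only the configuration mechanics are visible.

```python
from typing import Any, Dict, List, Optional


class ToyRetriever:
    """Hypothetical retriever used only to illustrate runtime configuration."""

    def __init__(
        self,
        documents: List[str],
        search_kwargs: Optional[Dict[str, Any]] = None,
    ) -> None:
        self.documents = documents
        self.search_kwargs = search_kwargs or {}

    def invoke(
        self, query: str, config: Optional[Dict[str, Any]] = None
    ) -> List[str]:
        # A value under config["configurable"]["search_kwargs"] replaces the
        # default search kwargs for this call only; the default is untouched.
        configurable = (config or {}).get("configurable", {})
        kwargs = configurable.get("search_kwargs", self.search_kwargs)

        term = kwargs.get("body_search")
        if term is None:
            return list(self.documents)
        # Keep only documents that contain the term as a whole word.
        return [
            doc for doc in self.documents if term.lower() in doc.lower().split()
        ]


retriever = ToyRetriever(
    [
        "In 2023, I visited Paris",
        "In 2022, I visited New York",
        "In 2021, I visited New Orleans",
    ]
)

# Default behaviour: no term filter, every document is returned.
print(retriever.invoke("What city did I visit last?"))

# Same retriever, configured at call time to filter on the term "new".
print(
    retriever.invoke(
        "What city did I visit last?",
        config={"configurable": {"search_kwargs": {"body_search": "new"}}},
    )
)
```

The point of the design is that the retriever is built once, and the hybrid-search flag is chosen per call rather than baked into the chain.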

## Code Example

Let's see a concrete example of what this looks like in code. We will use the Cassandra/CQL interface of Astra DB for this example.

Install the following Python package:

In [1]:
!pip install "cassio>=0.1.7"

Get the [connection secrets](https://docs.datastax.com/en/astra/astra-db-vector/get-started/quickstart.html).

Initialize cassio:

In [2]:
import cassio

cassio.init(
    database_id="Your database ID",
    token="Your application token",
    keyspace="Your key space",
)

Create the Cassandra VectorStore with a standard [index analyzer](https://docs.datastax.com/en/astra/astra-db-vector/cql/use-analyzers-with-cql.html). The index analyzer is needed to enable term matching.

In [3]:
from cassio.table.cql import STANDARD_ANALYZER
from langchain_community.vectorstores import Cassandra
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
vectorstore = Cassandra(
    embedding=embeddings,
    table_name="test_hybrid",
    body_index_options=[STANDARD_ANALYZER],
    session=None,
    keyspace=None,
)

vectorstore.add_texts(
    [
        "In 2023, I visited Paris",
        "In 2022, I visited New York",
        "In 2021, I visited New Orleans",
    ]
)

If we do a standard similarity search, we get all the documents:

In [4]:
vectorstore.as_retriever().invoke("What city did I visit last?")

The Astra DB vectorstore `body_search` argument can be used to filter the search on the term `new`.

In [5]:
vectorstore.as_retriever(search_kwargs={"body_search": "new"}).invoke(
    "What city did I visit last?"
)
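
To make the effect of the term filter concrete, here is a toy, in-memory sketch of the idea: restrict the candidates to documents containing the term, then rank what remains by vector similarity. Everything in it is an assumption made for illustration. The two-dimensional vectors are invented rather than real embeddings, `tokenize` only roughly approximates what a standard analyzer does, and `hybrid_search` is a hypothetical function that does not reflect how Astra DB executes the query.

```python
import math
import re
from typing import List, Optional, Sequence, Tuple

# (text, hand-made 2-d vector) pairs; the vectors are invented for illustration.
DOCS: List[Tuple[str, Tuple[float, float]]] = [
    ("In 2023, I visited Paris", (0.9, 0.1)),
    ("In 2022, I visited New York", (0.6, 0.4)),
    ("In 2021, I visited New Orleans", (0.2, 0.8)),
]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def hybrid_search(
    query_vector: Sequence[float],
    body_search: Optional[str] = None,
    k: int = 4,
) -> List[str]:
    """Filter by term (if given), then rank the survivors by similarity."""
    candidates = [
        (text, vector)
        for text, vector in DOCS
        if body_search is None or body_search.lower() in tokenize(text)
    ]
    ranked = sorted(
        candidates,
        key=lambda item: cosine(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


query_vector = (1.0, 0.0)  # invented query embedding
print(hybrid_search(query_vector))                     # all three documents
print(hybrid_search(query_vector, body_search="new"))  # only the "New ..." documents
```

With the filter applied, the Paris document is excluded no matter how similar its vector is to the query, which is the behaviour the `body_search` argument is used for in this guide.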

We can now create the chain that we will use to do question-answering over the documents in the vectorstore.

In [6]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import (
    ConfigurableField,
    RunnablePassthrough,
)
from langchain_openai import ChatOpenAI

This is a basic question-answering chain setup.

In [7]:
template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

model = ChatOpenAI()

retriever = vectorstore.as_retriever()

Here we mark the retriever as having a configurable field. All vectorstore retrievers have `search_kwargs` as a field. This is just a dictionary with vectorstore-specific fields.

In [8]:
configurable_retriever = retriever.configurable_fields(
    search_kwargs=ConfigurableField(
        id="search_kwargs",
        name="Search Kwargs",
        description="The search kwargs to use",
    )
)

We can now create the chain using our configurable retriever:

In [9]:
chain = (
    {"context": configurable_retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)

In [10]:
chain.invoke("What city did I visit last?")

We can now invoke the chain with configurable options. `search_kwargs` is the id of the configurable field. The value is the search kwargs to use for Astra DB.

In [11]:
chain.invoke(
    "What city did I visit last?",
    config={"configurable": {"search_kwargs": {"body_search": "new"}}},
)
