# Create a Hybrid Search Service with Fastembed

This tutorial shows you how to build and deploy your own hybrid search service that searches company descriptions from startups-list.com and returns the ones most similar to your query. The website contains the company names, descriptions, locations, and a picture for each entry.

As we have already written on our blog, there is no single definition of hybrid search. This tutorial covers one common case: a combination of dense and sparse embeddings. Dense embeddings are generated by well-known neural networks such as BERT, while sparse embeddings are closer to the traditional full-text search approach.
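To make the distinction concrete, here is a minimal sketch with made-up numbers. The vectors, token indices, and helper names (`cosine`, `sparse_dot`) are illustrative assumptions, not output from any real model.

```python
import math

# Dense embedding: every dimension holds a value.
# Toy 4-dimensional vectors; real models use hundreds of dimensions.
dense_query = [0.1, 0.7, -0.2, 0.4]
dense_doc = [0.2, 0.6, -0.1, 0.5]


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Sparse embedding: only non-zero dimensions are stored,
# here as {token_index: weight} (indices are invented for the example).
sparse_query = {17: 1.2, 905: 0.4}
sparse_doc = {17: 0.8, 42: 0.3, 2048: 1.1}


def sparse_dot(a, b):
    """Dot product over the indices that both sparse vectors share."""
    return sum(weight * b[index] for index, weight in a.items() if index in b)


print(cosine(dense_query, dense_doc))
print(sparse_dot(sparse_query, sparse_doc))
```

The dense score reflects overall closeness in meaning, while the sparse score is non-zero only because both vectors share index 17, much like a shared keyword in full-text search.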

![](./image/hybrid%20search.png)

## Workflow
To create a hybrid search service, you first transform your raw data into vectors and then build a search function on top of them. You will 1) download and prepare a sample dataset, encoding it with a modified version of the BERT ML model, 2) load the data into Qdrant, 3) create a hybrid search API, and 4) serve it using FastAPI.





## Prerequisites
To complete this tutorial, you will need:

- Docker: the easiest way to use Qdrant is to run a pre-built Docker image.
- Raw parsed data from startups-list.com.
- Python version >=3.8.


## Prepare sample dataset
To run a hybrid search over startup descriptions, you must first encode the description data into vectors. The Fastembed integration in the Qdrant client combines encoding and uploading into a single step.

It also takes care of batching and parallelization, so you don’t have to worry about it.
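For intuition only, batching means splitting the documents into fixed-size chunks that are encoded one chunk at a time. The helper below (`batched`, a hypothetical name) is a conceptual sketch of that idea and is not how the integration is implemented internally.

```python
from typing import Iterator, List


def batched(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Seven placeholder documents split into batches of three.
docs = [f"description {i}" for i in range(7)]
batches = list(batched(docs, batch_size=3))
print([len(batch) for batch in batches])  # [3, 3, 1]
```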

Let’s start by downloading the dataset. The necessary Python packages are installed later, right before the data is uploaded.

```bash
wget https://storage.googleapis.com/generall-shared-data/startups_demo.json
```

## Run Qdrant in Docker
Next, you need to manage all of your data using a vector engine. Qdrant lets you store, update or delete created vectors. Most importantly, it lets you search for the nearest vectors via a convenient API.

Note: Before you begin, create a project directory and a Python virtual environment in it.

Download the Qdrant image from DockerHub.

```bash
docker pull qdrant/qdrant
```

Start Qdrant inside of Docker.

```bash
docker run -p 6333:6333 \
    -v $(pwd)/qdrant_storage:/qdrant/storage \
    qdrant/qdrant
```

Test the service by going to http://localhost:6333/. You should see the Qdrant version info in your browser.

All data uploaded to Qdrant is saved inside the ./qdrant_storage directory and will be persisted even if you recreate the container.
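The browser check can also be scripted. The sketch below assumes the root endpoint returns a JSON object with a `version` field, and `qdrant_version` is a hypothetical helper name. It uses only the standard library and returns `None` when the service is unreachable.

```python
import http.client
import json
import urllib.request
from typing import Optional


def qdrant_version(base_url: str = "http://localhost:6333", timeout: float = 2.0) -> Optional[str]:
    """Return the version reported by the Qdrant root endpoint, or None if it cannot be read."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            info = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-JSON response.
        return None
    if not isinstance(info, dict):
        return None
    return info.get("version")


print(qdrant_version())
```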



## Upload data to Qdrant
Install the official Python client to best interact with Qdrant.


```bash
pip install "qdrant-client[fastembed]>=1.8.2"
```


```python
from qdrant_client import QdrantClient

client = QdrantClient(url="http://localhost:6333")
```


```python
client.set_model("sentence-transformers/all-MiniLM-L6-v2")
# comment this line to use dense vectors only
client.set_sparse_model("prithivida/Splade_PP_en_v1")
```


```python
# Note: recreate_collection deletes any existing collection with this name first.
client.recreate_collection(
    collection_name="startups",
    vectors_config=client.get_fastembed_vector_params(),
    # comment this line to use dense vectors only
    sparse_vectors_config=client.get_fastembed_sparse_vector_params(),
)
```

Qdrant requires vectors to have their own names and configurations.

The `get_fastembed_vector_params` and `get_fastembed_sparse_vector_params` methods return the parameters that match the models you are using, such as vector size and distance function.

Without the Fastembed integration, you would need to specify the vector size and distance function manually. Read more about it here.

Additionally, you can specify extended configuration for your vectors, such as `quantization_config` or `hnsw_config`.
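As a rough illustration of what a manual configuration involves, the sketch below builds the JSON body of a `PUT /collections/startups` REST request as a plain dictionary. The vector names, HNSW values, and quantization settings are assumptions chosen for illustration; check the names your client actually reports before relying on them.

```python
import json

# Assumed vector names (hypothetical; verify against your own collection).
DENSE_NAME = "fast-all-minilm-l6-v2"
SPARSE_NAME = "fast-sparse-splade_pp_en_v1"

collection_config = {
    "vectors": {
        DENSE_NAME: {
            "size": 384,  # output dimension of all-MiniLM-L6-v2
            "distance": "Cosine",
            # Optional extended settings; the values are illustrative only.
            "hnsw_config": {"m": 16, "ef_construct": 100},
            "quantization_config": {"scalar": {"type": "int8"}},
        }
    },
    # Sparse vectors have no fixed size, so an empty config is enough here.
    "sparse_vectors": {SPARSE_NAME: {}},
}

print(json.dumps(collection_config, indent=2))
```

With the integration, none of this has to be written by hand, which also removes the risk of a size or distance mismatch between the model and the collection.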



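```python
import json

payload_path = "startups_demo.json"
metadata = []
documents = []

# The file holds one JSON object per line.
with open(payload_path) as fd:
    for line in fd:
        obj = json.loads(line)
        # The description becomes the document text; the rest is metadata.
        documents.append(obj.pop("description"))
        metadata.append(obj)
```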
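To see what this loop produces without downloading the file, here is the same logic applied to two made-up records held in memory. The company names and descriptions are invented for the example.

```python
import io
import json

# Two invented records in the same one-JSON-object-per-line layout.
sample = io.StringIO(
    '{"name": "Acme", "city": "Chicago", "description": "Rockets for coyotes."}\n'
    '{"name": "Globex", "city": "Boston", "description": "A hypothetical holding company."}\n'
)

sample_documents = []
sample_metadata = []
for line in sample:
    record = json.loads(line)
    # pop() removes the description from the record and returns it.
    sample_documents.append(record.pop("description"))
    sample_metadata.append(record)

print(sample_documents[0])  # Rockets for coyotes.
print(sample_metadata[0])   # {'name': 'Acme', 'city': 'Chicago'}
```

Each description ends up in the list of texts to encode, and the remaining fields travel with it as metadata at the same list index.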



```python
client.add(
    collection_name="startups",
    documents=documents,
    metadata=metadata,
    parallel=4,  # Encode the data with 4 parallel worker processes.
    # In a script, this requires wrapping the code in an
    # `if __name__ == '__main__':` block
)
```

## Build the search API
Now that all the preparations are complete, let’s start building a hybrid search class.

To process incoming requests, the hybrid search class needs three things: 1) models to convert the query into vectors, 2) the Qdrant client to perform search queries, and 3) a fusion function to re-rank dense and sparse search results.

The Fastembed integration encapsulates query encoding, search, and fusion in a single method call. It uses reciprocal rank fusion to combine the results.
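Reciprocal rank fusion scores each result by summing `1 / (k + rank)` over every ranking in which it appears. The following is a minimal standalone sketch of that idea, not the client's own code: the constant `k = 60` comes from the commonly cited formulation, and the integration may use a different constant or tie-breaking.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple


def reciprocal_rank_fusion(
    rankings: List[List[Hashable]], k: int = 60
) -> List[Tuple[Hashable, float]]:
    """Fuse several ranked ID lists into one list sorted by RRF score."""
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


dense_hits = ["a", "b", "c"]   # hypothetical IDs from the dense search
sparse_hits = ["c", "a", "d"]  # hypothetical IDs from the sparse search

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print([doc_id for doc_id, _ in fused])  # ['a', 'c', 'b', 'd']
```

Documents found by both searches ("a" and "c") rise to the top because they collect a contribution from each ranking, and only ranks matter, so the dense and sparse scores never need to be on the same scale.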

1. Create a file named `hybrid_searcher.py` with the following content.


```python
from qdrant_client import QdrantClient


class HybridSearcher:
    DENSE_MODEL = "sentence-transformers/all-MiniLM-L6-v2"
    SPARSE_MODEL = "prithivida/Splade_PP_en_v1"

    def __init__(self, collection_name):
        self.collection_name = collection_name
        # initialize Qdrant client
        self.qdrant_client = QdrantClient("http://localhost:6333")
        self.qdrant_client.set_model(self.DENSE_MODEL)
        # comment this line to use dense vectors only
        self.qdrant_client.set_sparse_model(self.SPARSE_MODEL)

    def search(self, text: str):
        search_result = self.qdrant_client.query(
            collection_name=self.collection_name,
            query_text=text,
            query_filter=None,  # no filters for now
            limit=5,  # return the 5 closest results
        )
        # `search_result` contains found vector ids with similarity scores
        # along with the stored payload

        # Select and return metadata
        metadata = [hit.metadata for hit in search_result]
        return metadata


hybrid_searcher = HybridSearcher(collection_name="startups")
```


```python
q = "tesla"

results = hybrid_searcher.search(text=q)
results
```

[{'alt': 'GrubHub -  business services hospitality restaurants',
  'city': 'Chicago',
  'document': '',
  'images': 'https://d1qb2nb5cznatu.cloudfront.net/startups/i/32963-cef57a264f13521d27f63e77bb4086e0-thumb_jpg.jpg?buster=1406209987',
  'link': 'https://www.grubhub.com/careers/?nl=1&jvi=&jvk=JobListing',
  'name': 'GrubHub'},
 {'alt': 'A.T. Kearney -  consulting',
  'city': 'Chicago',
  'document': "A.T. Kearney is a leading global management consulting firm with offices in more than 40 countries. Since 1926, we have been trusted advisors to the world's foremost organizations. A.T. Kearney is a partner-owned firm, committed to helping clients achieve immediate ...",
  'images': 'https://d1qb2nb5cznatu.cloudfront.net/startups/i/44872-59aa32d71d1e8b79d03ac4638145b0bf-thumb_jpg.jpg?buster=1370253046',
  'link': 'http://www.atkearney.com',
  'name': 'A.T. Kearney'},
 {'alt': 'Obama for America -  politics',
  'city': 'Chicago',
  'document': '',
  'images': 'https://d1qb2nb5cznatu.clou