# Vector Search with Qdrant


## Step 0: Setup

### Docker
Pull the image and start the container using the following commands:

```bash
docker pull qdrant/qdrant

docker run -p 6333:6333 -p 6334:6334 \
   -v "$(pwd)/qdrant_storage:/qdrant/storage:z" \
   qdrant/qdrant

# Windows (path_name is a placeholder for a local directory):
docker run -p 6333:6333 -p 6334:6334 -v "path_name/qdrant_storage:/qdrant/storage" qdrant/qdrant

```

- 6333 – REST API port
- 6334 – gRPC API port

When you're running Qdrant in Docker, the Web UI is available at http://localhost:6333/dashboard

### Installing Required Libraries

We need two packages:

- The `qdrant-client` package. We'll be using the Python client, but Qdrant also offers official clients for JavaScript/TypeScript, Go, and Rust, so you can choose the best fit for your own projects.

- The `fastembed` package, an optimized embedding (data vectorization) solution designed specifically for Qdrant. Install `qdrant-client` version `>= 1.14.2` with the `fastembed` extra, as in the command below, to use **local inference** with Qdrant.

In [1]:
!python -m pip install -q "qdrant-client[fastembed]>=1.14.2"

## Step 1: Import Required Libraries & Connect to Qdrant

Import the necessary modules from the `qdrant-client` package.

The `QdrantClient` class allows us to establish a connection to the Qdrant service,  
while the `models` module provides definitions for various configurations and parameters we’ll use.

In [2]:
from qdrant_client import QdrantClient, models


Initialize the client:

In [3]:
client = QdrantClient("http://localhost:6333")  # connect to the local Qdrant instance

## Step 2: Study the Dataset

To build a working vector search solution (and, more generally, to understand if/when/how it’s needed), it's good to study the dataset and figure out the nature and structure of the data we’re working with, for example:

- modality — is it text, images, videos, a combination?  
- specifics — if it’s text: language used, how big are the text pieces, are there any special characters, etc.  

It will help us define:
- the right data "schema" (what to vectorize, what to store as metadata, etc);  
- the right embedding model (the best fit based on the domain, precision & resource requirements). 
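
One way to answer "how big are the text pieces" is to summarize word counts before choosing a model, since the embedding models listed later truncate their input at a fixed number of tokens. The following is a minimal sketch using only the standard library; `summarize_lengths` and the sample strings are hypothetical and not part of the original notebook.

```python
import statistics

def summarize_lengths(texts):
    """Return basic word-count statistics for a list of strings."""
    counts = [len(t.split()) for t in texts]
    return {
        "n": len(counts),
        "min": min(counts),
        "median": statistics.median(counts),
        "max": max(counts),
    }

# Hypothetical sample; once the data is loaded, a column such as
# df["description"].tolist() could be passed instead.
sample_texts = [
    "Warehouse associate needed",
    "Senior data analyst with strong SQL and Python skills",
    "Nurse",
]
stats = summarize_lengths(sample_texts)
print(stats)
```

Word counts are only a rough proxy for token counts, so treat the result as an indication rather than an exact measure.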


In [4]:
import pandas as pd

postings_df = pd.read_csv('data/postings.csv')
companies_df = pd.read_csv('data/companies.csv')

In [5]:
postings_df.size, companies_df.size  # .size is rows x columns (total cells), not the row count

(3839319, 244730)

In [6]:
companies_df.columns

Index(['company_id', 'name', 'description', 'company_size', 'state', 'country',
       'city', 'zip_code', 'address', 'url'],
      dtype='object')

In [7]:
postings_df.columns

Index(['job_id', 'company_name', 'title', 'description', 'max_salary',
       'pay_period', 'location', 'company_id', 'views', 'med_salary',
       'min_salary', 'formatted_work_type', 'applies', 'original_listed_time',
       'remote_allowed', 'job_posting_url', 'application_url',
       'application_type', 'expiry', 'closed_time',
       'formatted_experience_level', 'skills_desc', 'listed_time',
       'posting_domain', 'sponsored', 'work_type', 'currency',
       'compensation_type', 'normalized_salary', 'zip_code', 'fips'],
      dtype='object')

In [8]:
# Fill in missing company names from the companies table.
# The merge adds a temporary 'name' column, which is dropped afterwards.
df = postings_df.merge(companies_df[['company_id', 'name']], on='company_id', how='left')
df['company_name'] = df['company_name'].fillna(df['name'])
df = df.drop(columns=['name'])  # drop the extra column


In [9]:
df.columns

Index(['job_id', 'company_name', 'title', 'description', 'max_salary',
       'pay_period', 'location', 'company_id', 'views', 'med_salary',
       'min_salary', 'formatted_work_type', 'applies', 'original_listed_time',
       'remote_allowed', 'job_posting_url', 'application_url',
       'application_type', 'expiry', 'closed_time',
       'formatted_experience_level', 'skills_desc', 'listed_time',
       'posting_domain', 'sponsored', 'work_type', 'currency',
       'compensation_type', 'normalized_salary', 'zip_code', 'fips'],
      dtype='object')

In [10]:
df = df.dropna(subset=['description', 'title', 'company_name', 'skills_desc'])  # remove rows missing any of these fields


In [11]:
df.shape

(2436, 31)

In [12]:
df.size, df.columns, df.describe()

(75516,
 Index(['job_id', 'company_name', 'title', 'description', 'max_salary',
        'pay_period', 'location', 'company_id', 'views', 'med_salary',
        'min_salary', 'formatted_work_type', 'applies', 'original_listed_time',
        'remote_allowed', 'job_posting_url', 'application_url',
        'application_type', 'expiry', 'closed_time',
        'formatted_experience_level', 'skills_desc', 'listed_time',
        'posting_domain', 'sponsored', 'work_type', 'currency',
        'compensation_type', 'normalized_salary', 'zip_code', 'fips'],
       dtype='object'),
              job_id     max_salary    company_id        views     med_salary  \
 count  2.436000e+03     201.000000  2.436000e+03  2365.000000     134.000000   
 mean   3.889318e+09   66109.874179  1.991821e+07     7.860042   10278.119328   
 std    2.076658e+08   66546.042596  3.425119e+07    29.748329   49621.202976   
 min    9.217160e+05      15.000000  1.353000e+03     1.000000      12.000000   
 25%    3.895596e+09

In [13]:
# create a text field to embed
df['text'] = df['title'] + ' at ' + df['company_name'] + ' (' + df['location'] + ')' + '\n\n' + df['description']
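
Note that `location` was not part of the `dropna` subset above, and concatenating pandas string columns yields a missing value for the whole row if any part is missing. A more defensive variant could build the text row by row. This is only a sketch: `build_text` and `is_missing` are hypothetical helpers, and the sample values are made up.

```python
import math

def is_missing(value):
    """True for None or a float NaN (how pandas marks missing values)."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def build_text(title, company, location, description):
    """Compose the text to embed, skipping the location if it is missing."""
    header = f"{title} at {company}"
    if not is_missing(location):
        header += f" ({location})"
    return f"{header}\n\n{description}"

with_location = build_text("Data Analyst", "Acme", "Austin, TX", "Analyze data.")
without_location = build_text("Data Analyst", "Acme", float("nan"), "Analyze data.")
print(with_location)
print(without_location)
```

In the notebook, such a helper could be applied with `df.apply(lambda r: build_text(r['title'], r['company_name'], r['location'], r['description']), axis=1)`.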


## Step 3: Find a Suitable Embedding Model


**FastEmbed** is an optimized embedding solution designed specifically for Qdrant. It delivers low-latency, CPU-friendly embedding generation, eliminating the need for heavy frameworks like PyTorch or TensorFlow. It uses quantized model weights and ONNX Runtime, making it significantly faster than traditional Sentence Transformers on CPU while maintaining competitive accuracy.

FastEmbed’s integration with Qdrant allows you to directly pass text or images to the Qdrant client for embedding.


In [14]:
from fastembed import TextEmbedding
TextEmbedding.list_supported_models()


[{'model': 'BAAI/bge-base-en',
  'sources': {'hf': 'Qdrant/fast-bge-base-en',
   'url': 'https://storage.googleapis.com/qdrant-fastembed/fast-bge-base-en.tar.gz',
   '_deprecated_tar_struct': True},
  'model_file': 'model_optimized.onnx',
  'description': 'Text embeddings, Unimodal (text), English, 512 input tokens truncation, Prefixes for queries/documents: necessary, 2023 year.',
  'license': 'mit',
  'size_in_GB': 0.42,
  'additional_files': [],
  'dim': 768,
  'tasks': {}},
 {'model': 'BAAI/bge-base-en-v1.5',
  'sources': {'hf': 'qdrant/bge-base-en-v1.5-onnx-q',
   'url': 'https://storage.googleapis.com/qdrant-fastembed/fast-bge-base-en-v1.5.tar.gz',
   '_deprecated_tar_struct': True},
  'model_file': 'model_optimized.onnx',
  'description': 'Text embeddings, Unimodal (text), English, 512 input tokens truncation, Prefixes for queries/documents: not so necessary, 2023 year.',
  'license': 'mit',
  'size_in_GB': 0.21,
  'additional_files': [],
  'dim': 768,
  'tasks': {}},
 {'model':

In [15]:
# List the supported models that produce 512-dimensional embeddings
import json

EMBEDDING_DIMENSIONALITY = 512

for model in TextEmbedding.list_supported_models():
    if model["dim"] == EMBEDDING_DIMENSIONALITY:
        print(json.dumps(model, indent=2))

{
  "model": "BAAI/bge-small-zh-v1.5",
  "sources": {
    "hf": "Qdrant/bge-small-zh-v1.5",
    "url": "https://storage.googleapis.com/qdrant-fastembed/fast-bge-small-zh-v1.5.tar.gz",
    "_deprecated_tar_struct": true
  },
  "model_file": "model_optimized.onnx",
  "description": "Text embeddings, Unimodal (text), Chinese, 512 input tokens truncation, Prefixes for queries/documents: not so necessary, 2023 year.",
  "license": "mit",
  "size_in_GB": 0.09,
  "additional_files": [],
  "dim": 512,
  "tasks": {}
}
{
  "model": "Qdrant/clip-ViT-B-32-text",
  "sources": {
    "hf": "Qdrant/clip-ViT-B-32-text",
    "url": null,
    "_deprecated_tar_struct": false
  },
  "model_file": "model.onnx",
  "description": "Text embeddings, Multimodal (text&image), English, 77 input tokens truncation, Prefixes for queries/documents: not necessary, 2021 year",
  "license": "mit",
  "size_in_GB": 0.25,
  "additional_files": [],
  "dim": 512,
  "tasks": {}
}
{
  "model": "jinaai/jina-embeddings-v2-small-e

In [16]:
model_handle = "jinaai/jina-embeddings-v2-small-en"

## Step 4: Create a Collection

When creating a [collection](https://qdrant.tech/documentation/concepts/collections/), we need to specify:

*   Name: A unique identifier for the collection.
*   Vector Configuration:
    *   Size: The dimensionality of the vectors.
    *   Distance Metric: The method used to measure similarity between vectors.


There are additional parameters you can explore in the Qdrant [documentation](https://qdrant.tech/documentation/concepts/collections/#create-a-collection). Moreover, you can configure other vector types in Qdrant beyond typical dense embeddings (e.g., for hybrid search). However, for this example, the simplest default configuration is sufficient.
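
To make the distance metric concrete: cosine similarity is the dot product of two vectors divided by the product of their lengths, so it depends on direction rather than magnitude. The sketch below is a plain-Python illustration with toy 2-dimensional vectors, not Qdrant's implementation; it also shows why both vectors must have the same size.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equally sized vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Same direction, different magnitude: similarity close to 1
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
# Perpendicular vectors: similarity of 0
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(same_direction, orthogonal)
```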
    

In [17]:
collection_name = 'jobs-collection'

# Create the collection with the specified vector parameters.
# Skip creation if it already exists; re-running would otherwise fail with 409 (Conflict).
if not client.collection_exists(collection_name):
    client.create_collection(
        collection_name=collection_name,
        vectors_config=models.VectorParams(
            size=EMBEDDING_DIMENSIONALITY,  # Dimensionality of the vectors
            distance=models.Distance.COSINE  # Distance metric for similarity search
        )
    )

![qdrant screenshot](imgs/screenshot-qdrant.png)


## Step 5: Create, Embed & Insert Points into the Collection

[Points](https://qdrant.tech/documentation/concepts/points/#points) are the core data entities in Qdrant. Each point consists of:

1. **ID**. A unique identifier. Qdrant supports both 64-bit unsigned integers and UUIDs.  
2. **Vector**. The embedding that represents the data point in vector space.  
3. **Payload** *(optional)*. Additional metadata as key-value pairs.
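
The `job_id` values in this dataset are integers, so they can be used as point IDs directly. For records keyed by a string, one option is to derive a deterministic UUID from the key. The sketch below assumes a hypothetical helper `make_point_id`; the choice of `uuid.NAMESPACE_URL` and the key format are illustrative, not something the notebook prescribes.

```python
import uuid

MAX_U64 = 2**64 - 1

def make_point_id(key):
    """Return an unsigned 64-bit integer ID or a UUID string for the given key."""
    if isinstance(key, int) and not isinstance(key, bool):
        if not 0 <= key <= MAX_U64:
            raise ValueError("integer IDs must fit in an unsigned 64-bit range")
        return key
    # The same key always maps to the same UUID, so re-running an upsert
    # overwrites the existing point instead of creating a duplicate.
    return str(uuid.uuid5(uuid.NAMESPACE_URL, str(key)))

int_id = make_point_id(3905868647)
str_id = make_point_id("jobs/3905868647")
print(int_id, str_id)
```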


Next, we embed the texts and upsert (upload) the resulting points into our collection.

First, FastEmbed downloads the selected model (the path defaults to `os.path.join(tempfile.gettempdir(), "fastembed_cache")`) and performs inference directly on your machine.  
Then, the generated points are upserted into the collection, and the vector index is built.


In [18]:
from fastembed import TextEmbedding

model = TextEmbedding(model_name=model_handle)


In [19]:
len(df)

2436

In [20]:
# Embed the texts and upsert the points in batches
batch_size = 256

for i in range(0, len(df), batch_size):
    batch_df = df.iloc[i:i+batch_size]

    texts = batch_df['text'].tolist()
    try:
        vectors = list(model.embed(texts, batch_size=8))  # internal sub-batching
    except Exception as e:
        print(f"Batch {i} failed: {e}")
        continue

    points = [
        models.PointStruct(
            id=int(row['job_id']),
            vector=vectors[j],
            payload={
                "job_title": row['title'],
                "company": row['company_name'],
                "location": row['location'],
                "description": row['description'],
                "post_date": row['original_listed_time'],
                "work_type": row.get('formatted_work_type', 'unknown'),
            }
        )
        for j, (_, row) in enumerate(batch_df.iterrows())
    ]

    try:
        client.upsert(
            collection_name=collection_name,
            points=points
        )
        print(f"✅ Batch {i}–{i+batch_size} done.")
    except Exception as e:
        print(f"❌ Upsert failed for batch {i}: {e}")


✅ Batch 0–256 done.
✅ Batch 256–512 done.
✅ Batch 512–768 done.
✅ Batch 768–1024 done.
✅ Batch 1024–1280 done.
✅ Batch 1280–1536 done.
✅ Batch 1536–1792 done.
✅ Batch 1792–2048 done.
✅ Batch 2048–2304 done.
✅ Batch 2304–2560 done.
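
One detail worth knowing about the payload above: `row.get('formatted_work_type', 'unknown')` falls back to `'unknown'` only when the column is absent, not when the value is NaN. A hedged option is to normalize missing values before building the payload. `clean_payload` is a hypothetical helper and the sample values are made up.

```python
import math

def clean_payload(payload, default=None):
    """Replace None/NaN values so the payload holds only JSON-friendly data."""
    cleaned = {}
    for key, value in payload.items():
        if value is None or (isinstance(value, float) and math.isnan(value)):
            cleaned[key] = default
        else:
            cleaned[key] = value
    return cleaned

# Made-up row values; "work_type" is missing (NaN), as pandas would report it
raw = {
    "job_title": "Data Analyst",
    "work_type": float("nan"),
    "post_date": 1713397508000.0,
}
payload = clean_payload(raw, default="unknown")
print(payload)
```

The `isinstance(value, float)` check also covers `numpy.float64`, which subclasses Python's `float`.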


### Study Data Visually

Explore the uploaded data in the Qdrant Web UI at [http://localhost:6333/dashboard](http://localhost:6333/dashboard) to study semantic similarity visually.

Using the `Visualize` tab of the `jobs-collection` collection, we can view up to 948 points (the `limit` in the request below) and see how they group together by meaning, additionally coloured by location.

To do that, run the following command:

```json
{
  "limit": 948,
  "color_by": {
    "payload": "location"
  }
}
```

This 2D representation is the result of dimensionality reduction applied to the `jina-embeddings-v2-small-en` vectors.

![qdrant visual](imgs/visualize-qdrant.png)

![qdrant visual](imgs/screenshot-qdrant-by-work-type.png)

## Step 6: Running a Similarity Search

Now we find the stored `text` vectors most similar to a given query embedding, that is, the job postings most relevant to a given query.

### How Similarity Search Works

1. Qdrant compares the query vector to stored vectors (based on a vector index) using the distance metric defined when creating the collection.

2. The closest matches are returned, ranked by similarity.

> Vector index is built for **approximate** nearest neighbor (ANN) search, making large-scale vector search feasible.
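
For intuition, the two steps above can be sketched as an exact (brute-force) search that scores every stored vector against the query. An ANN index approximates this result without scanning the whole collection. The toy vectors and the `exact_search` helper are hypothetical, and this is not how Qdrant implements search.

```python
import heapq
import math

def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def exact_search(query, stored, limit=3):
    """Score every stored vector against the query and keep the best `limit`."""
    q = normalize(query)
    scored = (
        (sum(a * b for a, b in zip(q, normalize(vec))), point_id)
        for point_id, vec in stored.items()
    )
    # nlargest returns (score, id) pairs ranked from most to least similar
    return heapq.nlargest(limit, scored)

# Toy 2-dimensional "collection": point ID -> vector
stored = {
    1: [1.0, 0.0],
    2: [0.7, 0.7],
    3: [0.0, 1.0],
}
hits = exact_search([1.0, 0.1], stored, limit=2)
for score, point_id in hits:
    print(point_id, round(score, 3))
```

The cost of this approach grows linearly with the number of stored vectors, which is what makes an approximate index attractive at scale.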


In [34]:
# Define a search function

def search(query, limit=1):
    results = client.query_points(
        collection_name=collection_name,
        query=models.Document(  # embed the query text locally with "jinaai/jina-embeddings-v2-small-en"
            text=query,
            model=model_handle
        ),
        limit=limit,  # number of closest matches to return
        with_payload=True  # include metadata in the results
    )

    return results

In [35]:
# Pick a random posting and build a query from its work type and company

sample_query = df.sample(n=1).iloc[0].to_dict()

print(json.dumps(sample_query, indent=2))


{
  "job_id": 3905868647,
  "company_name": "A Hiring Company",
  "title": "LPN or RN Hospice Supportive Care - Night Shift - Sign On Bonus",
  "description": "The Supportive Care Nurse provides one-to-one care at the bedside of terminally ill patients, based on an ongoing nursing assessment of need and in accordance with the established plan of care. Three 12 hour shifts per week including weekend rotation, this position is eligible for $5,000 bonus, see details below. Duties include:\n\u00b7 Provides compassionate end-of-life care in accordance with the individualized plan of care, with demonstrated ability to assess and respond to patient and family needs in a timely manner\n\u00b7 Educate patients and caregivers on end-of-life, services and other resources\n\u00b7 Document complex medical information using clinical tools, data and other information to complete assessment of patient needs\n\u00b7 Establishes professional and collaborative relationships with patients, families and pr

In [36]:
query = f"{sample_query['formatted_work_type']} at {sample_query['company_name']}"
query

'Full-time at A Hiring Company'

In [37]:
query_vector = list(model.embed([query]))[0]


In [38]:
# Perform the search in Qdrant
# (`client.search` is deprecated; `query_points` is used here instead)
search_result = client.query_points(
    collection_name="jobs-collection",
    query=query_vector.tolist(),
    limit=5  # number of similar results to retrieve
).points


In [39]:
# view results
for res in search_result:
    print(f"ID: {res.id}, Score: {res.score}")
    # If you stored payload like job title, company, etc.:
    print(json.dumps(res.payload, indent=2))


ID: 3887495458, Score: 0.86819637
{
  "job_title": "In-Store Shopper - Seasonal Part Time",
  "company": "Whole Foods Market",
  "location": "Massapequa Park, NY",
  "description": "At Whole Foods Market, we\u2019re committed to providing record-setting grocery delivery services to our Prime Now customers. This is a fast-growing program and candidates who are passionate about our quality products and great customer service will be a great fit. We think you\u2019ll agree that it\u2019s a great time to join #TeamWFM.As an In-Store Shopper, you\u2019ll work on the Store Support team supporting Prime Now customer orders, preparing them for delivery and/or pickup. While our offerings will continue to evolve, you\u2019ll shop throughout our store for everyday goods including food, household items, and so much more. Having a flexible schedule is key to meeting our customer\u2019s needs. We especially need Team Members who like to work on Saturday and Sunday - our busiest times of the week! Sh

`score` is the cosine similarity between the query and `text` embeddings.

Now search for something that wasn't in the initial dataset:

In [40]:
queries = [
    "Full-time roles in Carolina",
    "Remote software engineering jobs",
    "Marketing role in New York",
    "Part-time retail assistant near Chicago",
]

for q in queries:
    query_vector = list(model.embed([q]))[0]
    results = client.query_points(collection_name="jobs-collection", query=query_vector.tolist(), limit=3).points
    print(f"\nQuery: {q}")
    for r in results:
        print(f"ID: {r.id}, Score: {r.score}")
        print(json.dumps(r.payload, indent=2))



Query: Full-time roles in Carolina
ID: 3887710362, Score: 0.8444062
{
  "job_title": "Database Administrator I",
  "company": "N.C. Department of Information Technology",
  "location": "Wake County, NC",
  "description": "Description Of Work\n\nLooking to take the next step in your IT career?\n\nWe currently have an opening for a Database Administrator I\n\nThe position is designated Statutory Exempt and is exempt from the State Human Resources Act.\n\nThis is a time-limited position. It is full-time with State Benefits for a limited time. Although the length of time this position will be active cannot be determined, we anticipate that this position will be in place through December 31, 2026. If you have questions concerning the time-limited status of this position, you may inquire at the interview.\n\nThe Database Administrator I is responsible for the design, implementation, backup, and management of geospatial databases, and the data contained in them that becomes part of the NC Di



In [41]:
similar = "Customer Service jobs in NC"
unrelated = "Marine biologist in Iceland"

for q in [similar, unrelated]:
    vector = list(model.embed([q]))[0]
    result = client.query_points(collection_name="jobs-collection", query=vector.tolist(), limit=1).points
    print(f"\nQuery: {q}")
    print(f"Top Match Score: {result[0].score}")
    print(f"Top Match Title: {result[0].payload['job_title']}")



Query: Customer Service jobs in NC
Top Match Score: 0.8813038
Top Match Title: Customer Service Representative

Query: Marine biologist in Iceland
Top Match Score: 0.7740206
Top Match Title: Administrative Assistant
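
The unrelated query still returns a match with a score of about 0.77, because nearest-neighbor search always returns the closest stored point, however far away it is. One hedged way to handle this is a minimum-score cut-off. The sketch below reuses the two scores printed above; the `0.8` threshold is an arbitrary assumption that would need tuning, and `filter_by_score` is a hypothetical helper.

```python
def filter_by_score(hits, min_score):
    """Keep only (title, score) pairs whose score reaches the threshold."""
    return [(title, score) for title, score in hits if score >= min_score]

# Top matches and scores taken from the output above
hits = [
    ("Customer Service Representative", 0.8813038),
    ("Administrative Assistant", 0.7740206),
]
kept = filter_by_score(hits, min_score=0.8)  # 0.8 is an illustrative value
print(kept)
```

`query_points` also accepts a `score_threshold` argument, which applies the same kind of cut-off on the server side.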




In [42]:
sample_query = "Warehouse job in Texas"
vec = list(model.embed([sample_query]))[0]
res = client.query_points(collection_name="jobs-collection", query=vec.tolist(), limit=1).points
payload = res[0].payload
print(f"Job Title: {payload.get('job_title')}")
print(f"Company: {payload.get('company')}")
print(f"Work Type: {payload.get('work_type')}")
print(f"Description snippet: {payload.get('description', '')[:250]}")


Job Title: Warehouse Operations Manager
Company: Clayton Services
Work Type: Full-time
Description snippet: Clayton Services is searching for a  Warehouse Operations Manager to join a thriving company in Northwest Houston. The Warehouse Operations Manager will be responsible for site management, including safety, shipping and receiving, warehousing, invent




In [43]:
def search_jobs(query, top_k=3):
    vec = list(model.embed([query]))[0]
    results = client.query_points(collection_name="jobs-collection", query=vec.tolist(), limit=top_k).points
    for res in results:
        print(f"Score: {res.score}")
        print(f"Job Title: {res.payload['job_title']}")
        print(f"Company: {res.payload['company']}")
        print(f"Location: {res.payload['location']}")
        print('-' * 50)

# Example usage:
search_jobs("Remote data analyst position")


Score: 0.8811095
Job Title: Data Analyst
Company: Insight Global
Location: United States
--------------------------------------------------
Score: 0.85574645
Job Title: SAS SQL Consultant
Company: Motion Recruitment
Location: United States
--------------------------------------------------
Score: 0.85463935
Job Title: Data Specialist
Company: Centene
Location: United States
--------------------------------------------------




