In [None]:
#### 1. Build Qdrant Client 

In [1]:
!pip install -q "qdrant-client[fastembed]>=1.14.2"


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0.1[0m[39;49m -> [0m[32;49m25.1.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpython3 -m pip install --upgrade pip[0m


In [2]:
from qdrant_client import QdrantClient, models



In [3]:
# Initialize the client and connect to a local Qdrant instance
client = QdrantClient("http://localhost:6333")

#### 2. Data Collection
Download the FAQ data that will be indexed.

In [3]:
import requests

docs_url = 'https://github.com/alexeygrigorev/llm-rag-workshop/raw/main/notebooks/documents.json'
docs_response = requests.get(docs_url)
documents_raw = docs_response.json()

#documents_raw

Decide which fields to use for semantic search and which to keep as metadata for filtering.

The text of the Q&A pairs can be used as search content; the course name and section name can be used as metadata.
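
As a minimal sketch of that split, the snippet below separates one FAQ entry into search content and metadata. The `sample` record is hypothetical and only mirrors the structure of `documents.json` (a `course` name plus a list of `documents` with `text`, `section`, and `question`); `split_fields` is an illustrative helper, not part of the notebook.

```python
# Hypothetical record mirroring the structure of documents.json
sample = {
    "course": "example-course",
    "documents": [
        {
            "text": "Example answer text.",
            "section": "Example section",
            "question": "Example question?",
        }
    ],
}


def split_fields(course_name, doc):
    """Return (search_content, metadata) for one FAQ entry."""
    search_content = doc["text"]  # field to embed for semantic search
    metadata = {"course": course_name, "section": doc["section"]}  # fields for filtering
    return search_content, metadata


content, metadata = split_fields(sample["course"], sample["documents"][0])
print(content)
print(metadata)
```

Keeping the metadata separate from the embedded text means it can later be stored in the point payload without influencing the vector.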

#### 3. Collection Creation and Embedding Model Selection

In [10]:
from fastembed import TextEmbedding
import json

# For simplicity and a small memory footprint, use 512-dimensional embeddings
EMBEDDING_DIMENSIONALITY = 512

for model in TextEmbedding.list_supported_models():
    if model["dim"] == EMBEDDING_DIMENSIONALITY:
        print(json.dumps(model, indent=2))

{
  "model": "BAAI/bge-small-zh-v1.5",
  "sources": {
    "hf": "Qdrant/bge-small-zh-v1.5",
    "url": "https://storage.googleapis.com/qdrant-fastembed/fast-bge-small-zh-v1.5.tar.gz",
    "_deprecated_tar_struct": true
  },
  "model_file": "model_optimized.onnx",
  "description": "Text embeddings, Unimodal (text), Chinese, 512 input tokens truncation, Prefixes for queries/documents: not so necessary, 2023 year.",
  "license": "mit",
  "size_in_GB": 0.09,
  "additional_files": [],
  "dim": 512,
  "tasks": {}
}
{
  "model": "Qdrant/clip-ViT-B-32-text",
  "sources": {
    "hf": "Qdrant/clip-ViT-B-32-text",
    "url": null,
    "_deprecated_tar_struct": false
  },
  "model_file": "model.onnx",
  "description": "Text embeddings, Multimodal (text&image), English, 77 input tokens truncation, Prefixes for queries/documents: not necessary, 2021 year",
  "license": "mit",
  "size_in_GB": 0.25,
  "additional_files": [],
  "dim": 512,
  "tasks": {}
}
{
  "model": "jinaai/jina-embeddings-v2-small-e

Points are the central entity Qdrant works with.
A point is a record consisting of an ID, a vector, and an optional payload.

A collection is a named set of points (i.e., vectors with optional payloads) that you can search within.
Think of it as the container for your vector search solution, a single business problem solved.

When creating a collection, we need to specify:

- Name. A unique identifier for the collection.
- Vector configuration:
  - Size. The dimensionality of the vectors.
  - Distance metric. The method used to measure similarity between vectors.
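
To make the distance metric concrete, here is a rough pure-Python sketch of cosine similarity, the metric selected via `models.Distance.COSINE` in the cell that follows. The function and the toy vectors are illustrative assumptions; Qdrant computes the metric internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])  # same direction, different length
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])      # perpendicular vectors
print(same_direction, orthogonal)
```

Cosine similarity depends only on direction, not on vector length. The dimensionality check in the sketch reflects why a collection has one fixed vector size.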

In [11]:
# Select the model and build the collection
model_handle = "jinaai/jina-embeddings-v2-small-en"

# Define the collection name
collection_name = "zoomcamp-rag"

# Create the collection with specified vector parameters
client.create_collection(
    collection_name=collection_name,
    vectors_config=models.VectorParams(
        size=EMBEDDING_DIMENSIONALITY,  # Dimensionality of the vectors
        distance=models.Distance.COSINE  # Distance metric for similarity search
    )
)

True

In [12]:
client 

<qdrant_client.qdrant_client.QdrantClient at 0x7cda53dd73b0>

#### 4. Create, Embed & Insert Points into the Collection

Points are the core data entities in Qdrant. Each point consists of:

- ID. A unique identifier. Qdrant supports both 64-bit unsigned integers and UUIDs.
- Vector. The embedding that represents the data point in vector space.
- Payload (optional). Additional metadata as key-value pairs.
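
Since IDs may be UUIDs as well as integers, one option is to derive a stable UUID from each record's content, so that re-running the ingestion overwrites existing points instead of duplicating them. This is a sketch of an alternative to the integer counter used in the next cell, not the notebook's approach; `make_point_id` and the choice of `uuid.NAMESPACE_URL` are assumptions.

```python
import uuid


def make_point_id(course, question, text):
    """Derive a deterministic UUID string from the record's content."""
    key = f"{course}|{question}|{text}"
    return str(uuid.uuid5(uuid.NAMESPACE_URL, key))


first = make_point_id("example-course", "Example question?", "Example answer text.")
second = make_point_id("example-course", "Example question?", "Example answer text.")
print(first == second)
```

The trade-off is that any edit to a record's text yields a new ID, so the old point would need to be deleted separately.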

In [17]:
# Create the points to be upserted
points = []
point_id = 0  # named point_id to avoid shadowing the built-in id()

for course in documents_raw:
    for doc in course['documents']:
        point = models.PointStruct(
            id=point_id,
            # Embedded locally by FastEmbed with the model in model_handle
            vector=models.Document(text=doc['text'], model=model_handle),
            # Store the metadata fields needed at query time
            payload={
                "text": doc['text'],
                "section": doc['section'],
                "course": course['course']
            }
        )
        points.append(point)
        point_id += 1

In [16]:
points[0]

PointStruct(id=0, vector=Document(text="The purpose of this document is to capture frequently asked technical questions\nThe exact day and hour of the course will be 15th Jan 2024 at 17h00. The course will start with the first  “Office Hours'' live.1\nSubscribe to course public Google Calendar (it works from Desktop only).\nRegister before the course starts using this link.\nJoin the course Telegram channel with announcements.\nDon’t forget to register in DataTalks.Club's Slack and join the channel.", model='jinaai/jina-embeddings-v2-small-en', options=None), payload={'text': "The purpose of this document is to capture frequently asked technical questions\nThe exact day and hour of the course will be 15th Jan 2024 at 17h00. The course will start with the first  “Office Hours'' live.1\nSubscribe to course public Google Calendar (it works from Desktop only).\nRegister before the course starts using this link.\nJoin the course Telegram channel with announcements.\nDon’t forget to register

Now we’re going to embed and upload points to our collection.

First, FastEmbed fetches and downloads the selected model (the cache path defaults to `os.path.join(tempfile.gettempdir(), "fastembed_cache")`) and performs inference directly on your machine.

Then, the generated points will be upserted into the collection, and the vector index will be built.
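
To see where the model files would land on a given machine, the default cache path quoted above can be evaluated directly. This sketch only computes the path and checks whether it exists; it does not download anything.

```python
import os
import tempfile

# Default FastEmbed cache location, as described above
cache_dir = os.path.join(tempfile.gettempdir(), "fastembed_cache")
print(cache_dir)
print("exists:", os.path.isdir(cache_dir))
```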

In [18]:
# Embed the point texts and upsert them into the collection for retrieval
client.upsert(
    collection_name=collection_name,
    points=points
)

Fetching 5 files: 100%|██████████| 5/5 [00:01<00:00,  4.73it/s]


UpdateResult(operation_id=0, status=<UpdateStatus.COMPLETED: 'completed'>)

#### 5. Running a Similarity Search

Retrieval process:
1. Qdrant compares the query vector to the stored vectors (using a vector index) with the distance metric defined when the collection was created. The closest matches are returned, ranked by similarity.
2. The vector index is built for approximate nearest neighbor (ANN) search, which makes large-scale vector search feasible.
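
For intuition, the exact search that an ANN index approximates can be sketched in pure Python: score every stored vector against the query and keep the top matches. This brute-force version is not how Qdrant searches (it relies on an index to avoid scanning everything); the toy 2-dimensional vectors and the `brute_force_search` helper are made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def brute_force_search(query, stored, top_n=1):
    """Score every stored vector and return the top_n (id, score) pairs."""
    scored = [(point_id, cosine(query, vec)) for point_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)  # highest similarity first
    return scored[:top_n]


stored = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
hits = brute_force_search([0.9, 0.1], stored, top_n=2)
print(hits)
```

The cost of this scan grows linearly with the number of stored points, which is the motivation for an approximate index at larger scale.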

In [19]:
def search(query, top_n=1):
    results = client.query_points(
        collection_name=collection_name,
        # Embed the query text locally with the model in model_handle
        query=models.Document(text=query, model=model_handle),
        limit=top_n,        # number of closest matches to return
        with_payload=True,  # include the stored metadata in the results
    )
    return results

In [None]:
import random

course = random.choice(documents_raw)
course_piece = random.choice(course['documents'])
print(json.dumps(course_piece, indent=2))

{
  "text": "Problem: when run docker-compose up \u2013build, you may see this error. To solve, add `command: php -S 0.0.0.0:8080 -t /var/www/html` in adminer block in yml file like:\nadminer:\ncommand: php -S 0.0.0.0:8080 -t /var/www/html\nimage: adminer\n\u2026\nIlnaz Salimov\nsalimovilnaz777@gmail.com",
  "section": "Module 5: Monitoring",
  "question": "Failed to listen on :::8080 (reason: php_network_getaddresses: getaddrinfo failed: Address family for hostname not supported)"
}


In [1]:
result = search(course_piece['question'])
display(result) 


`score` is the cosine similarity between the embedding of the question and the embedding of the stored text.