## Step 0: Setup

Qdrant is fully open-source, so you can run it in several ways depending on your needs.
You can self-host it on your own infrastructure, deploy it on Kubernetes, or use the managed Qdrant Cloud.

We're going to run a Qdrant instance in a Docker container.

### Docker

All you need to do is pull the image and start the container using the following commands:

```bash
docker pull qdrant/qdrant

docker run -p 6333:6333 -p 6334:6334 \
   -v "$(pwd)/qdrant_storage:/qdrant/storage:z" \
   qdrant/qdrant

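# Alternative to the command above: run the container as the current host user,
# so that files written to qdrant_storage are owned by you rather than root
docker run -p 6333:6333 -p 6334:6334 \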
  -v "$(pwd)/qdrant_storage:/qdrant/storage:z" \
  --user $(id -u):$(id -g) \
  qdrant/qdrant
```

The `-v` flag on the second line of the `docker run` command mounts a local directory as Qdrant's storage, which keeps your data persistent.
Even if you restart or delete the container, your data remains in `qdrant_storage` on your machine.

The `-p` flags publish two ports:

- 6333 – REST API port
- 6334 – gRPC API port

To help you explore your data visually, Qdrant provides a built-in **Web UI**, available in both Qdrant Cloud and local instances.
You can use it to inspect collections, check system health, and even run simple queries.

When you're running Qdrant in Docker, the Web UI is available at http://localhost:6333/dashboard
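
Before moving on, it can help to confirm that the container is actually answering on the REST port. The following is a minimal sketch using only the Python standard library. The helper name `qdrant_is_reachable` is hypothetical, and the default address is assumed from the `docker run` command above. It requests the `/collections` REST endpoint and checks the `status` field of the JSON reply.

```python
import json
import urllib.error
import urllib.request


def qdrant_is_reachable(base_url: str = "http://localhost:6333", timeout: float = 2.0) -> bool:
    """Return True if a Qdrant REST API answers at base_url (hypothetical helper)."""
    url = base_url.rstrip("/") + "/collections"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON answer
        return False
    # Qdrant wraps REST replies in an envelope that carries a "status" field
    return isinstance(body, dict) and body.get("status") == "ok"


if __name__ == "__main__":
    print("Qdrant reachable:", qdrant_is_reachable())
```

If this prints `False`, check that the container is still running and that port 6333 is not taken by another process.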

### Installing Required Libraries

In the environment you created specifically for this course, we’ll install:

- The `qdrant-client` package. We'll be using the Python client, but Qdrant also offers official clients for JavaScript/TypeScript, Go, and Rust, so you can choose the best fit for your own projects.

- The `fastembed` package, an optimized embedding (data vectorization) solution designed specifically for Qdrant. Make sure you install `qdrant-client` version `>= 1.14.2` to use **local inference** with Qdrant.

### Q1

In [3]:
pip install jupyter ipywidgets

Collecting jupyter
  Downloading jupyter-1.1.1-py2.py3-none-any.whl.metadata (2.0 kB)
Collecting ipywidgets
  Downloading ipywidgets-8.1.7-py3-none-any.whl.metadata (2.4 kB)
Collecting notebook (from jupyter)
  Downloading notebook-7.4.4-py3-none-any.whl.metadata (10 kB)
Collecting jupyter-console (from jupyter)
  Downloading jupyter_console-6.6.3-py3-none-any.whl.metadata (5.8 kB)
Collecting widgetsnbextension~=4.0.14 (from ipywidgets)
  Downloading widgetsnbextension-4.0.14-py3-none-any.whl.metadata (1.6 kB)
Collecting jupyterlab_widgets~=3.0.15 (from ipywidgets)
  Downloading jupyterlab_widgets-3.0.15-py3-none-any.whl.metadata (20 kB)
Collecting jupyterlab (from jupyter)
  Downloading jupyterlab-4.4.4-py3-none-any.whl.metadata (16 kB)
Downloading jupyter-1.1.1-py2.py3-none-any.whl (2.7 kB)
Downloading ipywidgets-8.1.7-py3-none-any.whl (139 kB)
Downloading jupyterlab_widgets-3.0.15-py3-none-any.whl (216 kB)
Downloading widgetsnbextension-4.0.14-py3-none-any.whl (2.2 MB)

In [6]:
from fastembed import TextEmbedding
import numpy as np

# Initialize embedder
embedder = TextEmbedding("jinaai/jina-embeddings-v2-small-en")

# Query to embed
query = "I just discovered the course. Can I join now?"

# Get embedding (generator → list)
embedding = list(embedder.embed([query]))[0]

# Confirm size is 512
print("Embedding shape:", embedding.shape)

# Find minimal value in embedding
min_value = np.min(embedding)
print("Minimal value:", min_value)

# Check norm to verify normalization
norm = np.linalg.norm(embedding)
print("Vector norm:", norm)

# Cosine similarity with itself (dot product)
cos_sim = embedding.dot(embedding)
print("Cosine similarity with itself:", cos_sim)



Embedding shape: (512,)
Minimal value: -0.11726373885183883
Vector norm: 1.0
Cosine similarity with itself: 1.0000000000000002
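
The norm of 1.0 in the output above is the reason the dot product can stand in for cosine similarity. As a dependency-free sketch of that relationship (the function names are illustrative and not part of `fastembed` or NumPy, and the vectors are made-up toy values), cosine similarity divides the dot product by both norms, so for unit-length vectors the division changes nothing:

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def cosine_similarity(u, v):
    """cos(u, v) = (u . v) / (||u|| * ||v||)."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def normalize(v):
    """Scale v to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


# Toy vectors (made up for illustration, not real embeddings)
u = normalize([3.0, 4.0])
v = normalize([4.0, 3.0])

# For unit vectors the two quantities agree up to floating-point error
print(math.isclose(dot(u, v), cosine_similarity(u, v)))
print(math.isclose(dot(u, u), 1.0))
```

The value `1.0000000000000002` in the notebook output is the same floating-point rounding at work, not a sign that the embedding is unnormalized.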


### Q2

In [7]:
from fastembed import TextEmbedding
import numpy as np

embedder = TextEmbedding("jinaai/jina-embeddings-v2-small-en")

# Embed the query
query = "I just discovered the course. Can I join now?"
query_embedding = list(embedder.embed([query]))[0]

# Embed the document
doc = "Can I still join the course after the start date?"
doc_embedding = list(embedder.embed([doc]))[0]

# Since embeddings are normalized, cosine similarity = dot product
cos_sim = query_embedding.dot(doc_embedding)

print(f"Cosine similarity between query and document: {cos_sim:.3f}")


Cosine similarity between query and document: 0.901


### Q3

In [8]:
from fastembed import TextEmbedding
import numpy as np

# Documents data
documents = [
    {'text': "Yes, even if you don't register, you're still eligible to submit the homeworks.\nBe aware, however, that there will be deadlines for turning in the final projects. So don't leave everything for the last minute."},
    {'text': 'Yes, we will keep all the materials after the course finishes, so you can follow the course at your own pace after it finishes.\nYou can also continue looking at the homeworks and continue preparing for the next cohort. I guess you can also start working on your final capstone project.'},
    {'text': "The purpose of this document is to capture frequently asked technical questions\nThe exact day and hour of the course will be 15th Jan 2024 at 17h00. The course will start with the first  “Office Hours'' live.1\nSubscribe to course public Google Calendar (it works from Desktop only).\nRegister before the course starts using this link.\nJoin the course Telegram channel with announcements.\nDon’t forget to register in DataTalks.Club's Slack and join the channel."},
    {'text': 'You can start by installing and setting up all the dependencies and requirements:\nGoogle cloud account\nGoogle Cloud SDK\nPython 3 (installed with Anaconda)\nTerraform\nGit\nLook over the prerequisites and syllabus to see if you are comfortable with these subjects.'},
    {'text': 'Star the repo! Share it with friends if you find it useful ❣️\nCreate a PR if you see you can improve the text or the structure of the repository.'}
]

# Initialize embedder
embedder = TextEmbedding("jinaai/jina-embeddings-v2-small-en")

# Query
query = "I just discovered the course. Can I join now?"
query_embedding = list(embedder.embed([query]))[0]

# Embed all documents
doc_texts = [doc['text'] for doc in documents]
doc_embeddings = list(embedder.embed(doc_texts))  # list of numpy arrays

# Stack embeddings into matrix V (shape: number_of_docs x 512)
V = np.stack(doc_embeddings)  # shape (5, 512)

# Compute cosine similarity = dot product (because normalized)
cosine_similarities = V.dot(query_embedding)  # shape (5,)

print("Cosine similarities:", cosine_similarities)

# Find the index of the highest similarity
best_doc_index = np.argmax(cosine_similarities)
print("Document with highest similarity:", best_doc_index)


Cosine similarities: [0.76296845 0.81823782 0.80853974 0.71330788 0.73044992]
Document with highest similarity: 1
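
`np.argmax` returns only the single best match. If the whole ordering is of interest, the scores can be sorted instead. The sketch below reuses the similarity values printed above (copied by hand) and is written without NumPy. With NumPy, `np.argsort(-cosine_similarities)` should give the same order.

```python
# Scores copied from the output above, indexed by document position
cosine_similarities = [0.76296845, 0.81823782, 0.80853974, 0.71330788, 0.73044992]

# Sort document indices from most to least similar
ranking = sorted(
    range(len(cosine_similarities)),
    key=lambda i: cosine_similarities[i],
    reverse=True,
)

for rank, idx in enumerate(ranking, start=1):
    print(f"{rank}. document {idx}: {cosine_similarities[idx]:.4f}")
```

Document 1 stays on top, with document 2 close behind it.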


### Q4

In [10]:
from fastembed import TextEmbedding
import numpy as np

# Query
query = "I just discovered the course. Can I join now?"

# Documents
documents = [
    {'text': "Yes, even if you don't register, you're still eligible to submit the homeworks.\nBe aware, however, that there will be deadlines...",
     'question': 'Course - Can I still join the course after the start date?'},
    {'text': 'Yes, we will keep all the materials after the course finishes...',
     'question': 'Course - Can I follow the course after it finishes?'},
    {'text': "The purpose of this document is to capture frequently asked technical questions...",
     'question': 'Course - When will the course start?'},
    {'text': 'You can start by installing and setting up all the dependencies and requirements...',
     'question': 'Course - What can I do before the course starts?'},
    {'text': 'Star the repo! Share it with friends if you find it useful ❣️...',
     'question': 'How can we contribute to the course?'}
]

# Initialize embedder
embedder = TextEmbedding("jinaai/jina-embeddings-v2-small-en")

# Embed the query
query_embedding = list(embedder.embed([query]))[0]

# Concatenate question + text
full_texts = [doc['question'] + ' ' + doc['text'] for doc in documents]

# Embed concatenated texts
full_embeddings = list(embedder.embed(full_texts))
V_full = np.stack(full_embeddings)  # shape (5, 512)

# Compute cosine similarities (dot product, since embeddings are normalized)
cos_sims = V_full.dot(query_embedding)

# Output similarities
for idx, score in enumerate(cos_sims):
    print(f"Document {idx}: Cosine similarity = {score:.4f}")

# Find best document
best_index = int(np.argmax(cos_sims))
print(f"\n🔍 Best matching document index (Q4): {best_index}")

Document 0: Cosine similarity = 0.8592
Document 1: Cosine similarity = 0.8474
Document 2: Cosine similarity = 0.8226
Document 3: Cosine similarity = 0.8021
Document 4: Cosine similarity = 0.8345

🔍 Best matching document index (Q4): 0


### Q5

- https://huggingface.co/BAAI/bge-small-en

In [12]:
from fastembed import TextEmbedding
import numpy as np

# Load the small model
embedder = TextEmbedding("BAAI/bge-small-en")

# Sample sentence
sample = "This is a test sentence."

# Embed and check dimensionality
embedding = list(embedder.embed([sample]))[0]
print("Embedding shape:", embedding.shape)


Fetching 5 files: 100%|██████████| 5/5 [00:00<00:00,  6.81it/s]


Embedding shape: (384,)
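
Embedding a sentence confirms the size of one model. To compare models, `fastembed` provides `TextEmbedding.list_supported_models()`, which returns a list of dictionaries describing each model. The sketch below assumes each entry carries `model` and `dim` keys, which may differ between `fastembed` versions. The helper `smallest_dim` is hypothetical and is shown here on a hand-written sample built from the two models used in this document.

```python
def smallest_dim(model_descriptions):
    """Return (dim, [model names]) for the lowest-dimensional models."""
    min_dim = min(entry["dim"] for entry in model_descriptions)
    names = sorted(e["model"] for e in model_descriptions if e["dim"] == min_dim)
    return min_dim, names


# Hand-written sample using the dimensions observed earlier in this document
sample = [
    {"model": "jinaai/jina-embeddings-v2-small-en", "dim": 512},
    {"model": "BAAI/bge-small-en", "dim": 384},
]
print(smallest_dim(sample))

# With fastembed installed, the real list could be passed instead (assumed keys):
# from fastembed import TextEmbedding
# print(smallest_dim(TextEmbedding.list_supported_models()))
```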


### Q6

In [14]:
!pip install qdrant-client fastembed -q

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


In [15]:
import requests

docs_url = 'https://github.com/alexeygrigorev/llm-rag-workshop/raw/main/notebooks/documents.json'
documents_raw = requests.get(docs_url).json()

documents = []
for course in documents_raw:
    if course['course'] != 'machine-learning-zoomcamp':
        continue
    for doc in course['documents']:
        doc['course'] = course['course']
        documents.append(doc)


#### Prepare embeddings with BAAI/bge-small-en (384-dim):

In [16]:
from fastembed import TextEmbedding
from uuid import uuid4

# Embedder
embedder = TextEmbedding("BAAI/bge-small-en")

# Prepare (id, payload, vector)
records = []
for doc in documents:
    full_text = doc['question'] + ' ' + doc['text']
    embedding = list(embedder.embed([full_text]))[0]
    records.append({
        "id": str(uuid4()),
        "vector": embedding.tolist(),  # plain list of floats for the Qdrant client
        "payload": {
            "question": doc['question'],
            "text": doc['text'],
            "course": doc['course']
        }
    })
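
The loop above calls the embedder once per document. Since `embed` accepts a list of texts (as in Q3), the texts can also be embedded in one call and zipped with their documents. The sketch below takes the embedding function as a parameter so it can be tried without downloading a model. `build_records` and `fake_embed` are hypothetical names, and `fake_embed` only stands in for `embedder.embed`.

```python
from uuid import uuid4


def build_records(documents, embed_fn):
    """Embed all documents in one call and build Qdrant-style point dicts."""
    texts = [doc["question"] + " " + doc["text"] for doc in documents]
    vectors = embed_fn(texts)  # one call for the whole batch
    records = []
    for doc, vector in zip(documents, vectors):
        records.append({
            "id": str(uuid4()),
            # Convert to a plain list of floats (works for lists and NumPy arrays)
            "vector": [float(x) for x in vector],
            "payload": {
                "question": doc["question"],
                "text": doc["text"],
                "course": doc["course"],
            },
        })
    return records


def fake_embed(texts):
    """Stand-in for embedder.embed: yields a 2-number vector per text."""
    for text in texts:
        yield [float(len(text)), float(text.count(" "))]


docs = [
    {"question": "Q?", "text": "An answer.", "course": "machine-learning-zoomcamp"},
]
records = build_records(docs, fake_embed)
print(records[0]["vector"], records[0]["payload"]["course"])
```

With the real model, `build_records(documents, embedder.embed)` would be the equivalent call.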


#### Upload to Qdrant (local instance):

In [18]:
from qdrant_client import QdrantClient
from qdrant_client.http import models

client = QdrantClient("http://localhost:6333")  # or use QdrantCloud

collection_name = "ml-faq"

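# Create the collection from scratch (drop it first if it already exists);
# this replaces the deprecated recreate_collection call
if client.collection_exists(collection_name):
    client.delete_collection(collection_name)
client.create_collection(
    collection_name=collection_name,
    vectors_config=models.VectorParams(size=384, distance=models.Distance.COSINE),
)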

# Upload records
client.upsert(
    collection_name=collection_name,
    points=records
)



UpdateResult(operation_id=0, status=<UpdateStatus.COMPLETED: 'completed'>)

#### Query with the Q1 question:

In [19]:
query = "I just discovered the course. Can I join now?"
query_vector = list(embedder.embed([query]))[0]

# query_points replaces the deprecated search call; .points holds the scored hits
search_result = client.query_points(
    collection_name=collection_name,
    query=query_vector.tolist(),
    limit=1,
).points

# Print highest score
print(f"Highest similarity score: {search_result[0].score:.2f}")


Highest similarity score: 0.87


