# Universal Query API Tutorial

Learn to build hybrid search systems using Qdrant's Universal Query API with real data from Hugging Face. We'll demonstrate 6 different retrieval patterns using FastEmbed's dense, sparse, and ColBERT models.

## What You'll Learn

The Universal Query API lets you express a multi-stage search pipeline (prefetch, fusion, reranking, filtering) in a single `query_points` request. Without it, you would issue several search calls and merge the results client-side. This tutorial builds six such pipelines as declarative queries.


In [None]:
# Install dependencies
%pip install qdrant-client fastembed datasets


In [None]:
from qdrant_client import QdrantClient, models
from fastembed import TextEmbedding, SparseTextEmbedding, LateInteractionTextEmbedding
from datasets import load_dataset

print("Dependencies loaded")


## Setup Qdrant Connection

Configure your Qdrant Cloud credentials.


In [None]:
# Configure Qdrant Cloud
QDRANT_URL = "https://your-cluster-url.cloud.qdrant.io"
QDRANT_API_KEY = "your-api-key"

client = QdrantClient(url=QDRANT_URL, api_key=QDRANT_API_KEY)

# Test connection
try:
    collections = client.get_collections()
    print(f"Connected to Qdrant. Collections: {len(collections.collections)}")
except Exception as e:
    print(f"Connection failed: {e}")


## Load Dataset and Initialize Models

Load the first 100 rows of the SQuAD validation split and initialize the FastEmbed models for dense, sparse, and ColBERT embeddings.


In [None]:
# Load SQuAD dataset (100 examples)
dataset = load_dataset("squad", split="validation[:100]")
print(f"Loaded {len(dataset)} examples")

# Initialize FastEmbed models
dense_model = TextEmbedding(model_name="BAAI/bge-small-en-v1.5")
sparse_model = SparseTextEmbedding(model_name="Qdrant/bm25")
colbert_model = LateInteractionTextEmbedding(model_name="jinaai/jina-colbert-v2")
print("FastEmbed models loaded: BGE + BM25 + Jina-ColBERT-v2")


## Create Collection

Set up a multi-vector collection supporting dense, sparse, and ColBERT vectors.


In [None]:
collection_name = "squad_universal_query"

# Remove any existing collection so the tutorial starts from a clean state
if client.collection_exists(collection_name):
    client.delete_collection(collection_name)
    print("Deleted existing collection")

# Create collection with all vector types
client.create_collection(
    collection_name=collection_name,
    vectors_config={
        "dense": models.VectorParams(size=384, distance=models.Distance.COSINE),
        "colbert": models.VectorParams(
            size=128, 
            distance=models.Distance.COSINE,
            multivector_config=models.MultiVectorConfig(
                comparator=models.MultiVectorComparator.MAX_SIM
            )
        )
    },
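    sparse_vectors_config={
        # Qdrant/bm25 embeddings carry term weights only; the IDF part is applied by Qdrant
        "sparse": models.SparseVectorParams(modifier=models.Modifier.IDF)
    }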
)

print(f"Created collection: {collection_name}")


## Generate Embeddings and Upload Data

Process the dataset and generate all three vector types using FastEmbed.
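
To make the sparse format concrete: a sparse vector stores only its non-zero entries as parallel `indices` and `values` lists, and two sparse vectors are compared by a dot product over the indices they share. The sketch below uses made-up token ids and weights, and `sparse_dot` is a hypothetical helper, not part of FastEmbed or Qdrant.

```python
def sparse_dot(indices_a, values_a, indices_b, values_b):
    """Dot product of two sparse vectors given as parallel index/value lists."""
    weights_b = dict(zip(indices_b, values_b))
    # Only indices present in both vectors contribute to the score
    return sum(v * weights_b[i] for i, v in zip(indices_a, values_a) if i in weights_b)


# Hypothetical token ids and weights, for illustration only
doc_indices, doc_values = [7, 42, 1001], [1.5, 0.5, 2.0]
query_indices, query_values = [42, 1001, 9999], [1.0, 1.0, 1.0]

score = sparse_dot(doc_indices, doc_values, query_indices, query_values)
print(score)  # 2.5 (0.5 * 1.0 + 2.0 * 1.0)
```

Token 7 appears only in the document and token 9999 only in the query, so neither affects the score.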


In [None]:
# Prepare texts for batch processing
texts = []
for item in dataset:
    text = f"{item['title']} {item['context']}"
    texts.append(text)

print("Generating embeddings...")

# Generate all embeddings in batch
dense_embeddings = list(dense_model.embed(texts))
sparse_embeddings = list(sparse_model.embed(texts))
colbert_embeddings = list(colbert_model.embed(texts))

# Prepare points for upload
points = []
for i, item in enumerate(dataset):
    point = models.PointStruct(
        id=i,
        # Dense, multivector, and sparse vectors all go in the same named-vector dict
        vector={
            "dense": dense_embeddings[i].tolist(),
            "colbert": colbert_embeddings[i].tolist(),
            "sparse": models.SparseVector(
                indices=sparse_embeddings[i].indices.tolist(),
                values=sparse_embeddings[i].values.tolist()
            )
        },
        payload={
            "title": item['title'],
            "context": item['context'][:500],
            "question": item['question']
        }
    )
    points.append(point)

# Upload to Qdrant
client.upsert(collection_name=collection_name, points=points)
print(f"Uploaded {len(points)} points with dense, sparse, and ColBERT vectors")


## Helper Function for Query Processing

Create a helper to generate query vectors for all patterns.


In [None]:
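def generate_query_vectors(query_text):
    """Generate dense, sparse, and ColBERT vectors for a query string."""
    # query_embed applies each model's query-side processing (embed is for documents)
    dense = next(iter(dense_model.query_embed(query_text))).tolist()

    sparse_embedding = next(iter(sparse_model.query_embed(query_text)))
    sparse = {
        int(idx): float(value)
        for idx, value in zip(sparse_embedding.indices, sparse_embedding.values)
    }

    # One embedding per query token, as a list of lists
    colbert = next(iter(colbert_model.query_embed(query_text))).tolist()

    return dense, sparse, colbert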

# Test query
query = "What is the capital of France?"
query_dense, query_sparse, query_colbert = generate_query_vectors(query)
print(f"Query: '{query}'")
print(f"Vectors: Dense({len(query_dense)}), Sparse({len(query_sparse)}), ColBERT({len(query_colbert)} tokens)")


## Pattern 1: Hybrid Search with RRF

Combine dense and sparse search using Reciprocal Rank Fusion.
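
As a rough illustration of what the fusion step computes, here is a minimal pure-Python sketch of Reciprocal Rank Fusion. It assumes the textbook formula, where each ranked list contributes `1 / (k + rank)` to a point's score, with `k=60` as in the original RRF paper; the constant and rank offset Qdrant uses server-side may differ. `rrf_fuse` and the point IDs are hypothetical.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to an ID's score (rank starts at 1)."""
    scores = {}
    for ranking in rankings:
        for rank, point_id in enumerate(ranking, start=1):
            scores[point_id] = scores.get(point_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


dense_ranking = [3, 1, 2]   # hypothetical point IDs, best first
sparse_ranking = [1, 4, 3]

fused = rrf_fuse([dense_ranking, sparse_ranking])
print([point_id for point_id, _ in fused])  # [1, 3, 4, 2]
```

Points 1 and 3 appear in both lists, so they outrank points found by only one retriever. RRF uses only ranks, which is why it can combine cosine scores and BM25 scores without rescaling them.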


In [None]:
print("=== Pattern 1: Dense + Sparse Hybrid (RRF) ===")

hybrid_results = client.query_points(
    collection_name=collection_name,
    prefetch=[
        models.Prefetch(query=query_dense, using="dense", limit=20),
        models.Prefetch(
            query=models.SparseVector(
                indices=list(query_sparse.keys()),
                values=list(query_sparse.values())
            ),
            using="sparse", limit=20
        )
    ],
    query=models.FusionQuery(fusion=models.Fusion.RRF),
    limit=3
)

for i, result in enumerate(hybrid_results.points, 1):
    print(f"{i}. {result.payload['title']} (score: {result.score:.3f})")


## Pattern 2: Dense Recall + ColBERT Reranking

Use dense search for broad recall, then ColBERT for precision reranking.
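
The reranking stage scores each candidate with the MaxSim comparator configured on the `colbert` vector. A minimal sketch of that scoring rule, using toy 2-dimensional token embeddings and plain dot products (the tutorial's collection uses cosine distance, and `max_sim` is a hypothetical helper):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def max_sim(query_tokens, doc_tokens):
    """Late-interaction score: each query token takes its best match among doc tokens, then sum."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy token embeddings (hypothetical values)
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
doc_b = [[0.5, 0.5], [0.5, 0.0]]

print(max_sim(query_tokens, doc_a))  # 2.0
print(max_sim(query_tokens, doc_b))  # 1.0
```

Because every query token is compared with every document token, this scoring is more expensive than a single dense comparison, which is why it is applied here only to the 50 prefetched candidates.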


In [None]:
print("\n=== Pattern 2: Dense Recall + ColBERT Reranking ===")

rerank_results = client.query_points(
    collection_name=collection_name,
    prefetch=[
        models.Prefetch(query=query_dense, using="dense", limit=50)
    ],
    query=query_colbert,
    using="colbert",
    limit=3
)

for i, result in enumerate(rerank_results.points, 1):
    print(f"{i}. {result.payload['title']} (score: {result.score:.3f})")


## Pattern 3: Triple Vector Hybrid

Combine all three vector types (dense + sparse + ColBERT) with RRF fusion.


In [None]:
print("\n=== Pattern 3: Triple Vector Hybrid (RRF) ===")

triple_results = client.query_points(
    collection_name=collection_name,
    prefetch=[
        models.Prefetch(query=query_dense, using="dense", limit=15),
        models.Prefetch(
            query=models.SparseVector(
                indices=list(query_sparse.keys()),
                values=list(query_sparse.values())
            ),
            using="sparse", limit=15
        ),
        models.Prefetch(query=query_colbert, using="colbert", limit=15)
    ],
    query=models.FusionQuery(fusion=models.Fusion.RRF),
    limit=3
)

for i, result in enumerate(triple_results.points, 1):
    print(f"{i}. {result.payload['title']} (score: {result.score:.3f})")


## Pattern 4: DBSF Fusion

Use Distribution-Based Score Fusion instead of RRF for different score normalization.
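
Unlike RRF, DBSF works on the scores themselves: each result list is normalized using its mean ± 3 standard deviations as the limits, and the normalized scores of the same point are summed. The sketch below follows that description in pure Python. The choice of population standard deviation, the handling of identical scores, and the names `dbsf_normalize` and `dbsf_fuse` are assumptions made for illustration, not Qdrant's implementation.

```python
from statistics import mean, pstdev


def dbsf_normalize(scores):
    """Rescale scores so that mean - 3*std maps to 0 and mean + 3*std maps to 1."""
    values = list(scores.values())
    mu, sigma = mean(values), pstdev(values)
    low, high = mu - 3 * sigma, mu + 3 * sigma
    if high == low:  # all scores equal: no spread to normalize (assumed fallback)
        return {point_id: 0.5 for point_id in scores}
    return {point_id: (s - low) / (high - low) for point_id, s in scores.items()}


def dbsf_fuse(score_lists):
    """Sum the normalized scores of each point across all result lists."""
    fused = {}
    for scores in score_lists:
        for point_id, s in dbsf_normalize(scores).items():
            fused[point_id] = fused.get(point_id, 0.0) + s
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw scores on very different scales
dense_scores = {1: 0.9, 2: 0.8, 3: 0.7}     # cosine-like
sparse_scores = {1: 12.0, 3: 8.0, 4: 4.0}   # BM25-like

fused = dbsf_fuse([dense_scores, sparse_scores])
print([point_id for point_id, _ in fused])  # [1, 3, 2, 4]
```

Normalizing first keeps the larger BM25-like numbers from dominating the sum.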


In [None]:
print("\n=== Pattern 4: DBSF Fusion ===")

dbsf_results = client.query_points(
    collection_name=collection_name,
    prefetch=[
        models.Prefetch(query=query_dense, using="dense", limit=20),
        models.Prefetch(
            query=models.SparseVector(
                indices=list(query_sparse.keys()),
                values=list(query_sparse.values())
            ),
            using="sparse", limit=20
        )
    ],
    query=models.FusionQuery(fusion=models.Fusion.DBSF),
    limit=3
)

for i, result in enumerate(dbsf_results.points, 1):
    print(f"{i}. {result.payload['title']} (score: {result.score:.3f})")


## Pattern 5: Multi-Stage Filtering

Apply early filters during prefetch and late filters after fusion.


In [None]:
print("\n=== Pattern 5: Multi-Stage Filtering ===")

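# Early filter: applied inside each prefetch, so it limits the candidates that get retrieved.
# Without a full-text index on "title", MatchText behaves as a substring match.
early_filter = models.Filter(
    must=[models.FieldCondition(key="title", match=models.MatchText(text="University"))]
)

# Late filter: passed at the root of the query, alongside the fusion step.
# If a filter matches no titles in your slice, the query returns no points;
# adjust the match text to titles present in your data.
late_filter = models.Filter(
    must=[models.FieldCondition(key="title", match=models.MatchText(text="of"))]
)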

filtered_results = client.query_points(
    collection_name=collection_name,
    prefetch=[
        models.Prefetch(query=query_dense, using="dense", limit=50, filter=early_filter),
        models.Prefetch(
            query=models.SparseVector(
                indices=list(query_sparse.keys()),
                values=list(query_sparse.values())
            ),
            using="sparse", limit=50, filter=early_filter
        )
    ],
    query=models.FusionQuery(fusion=models.Fusion.RRF),
    query_filter=late_filter,  # query_points takes query_filter; only Prefetch uses filter
    limit=3
)

for i, result in enumerate(filtered_results.points, 1):
    print(f"{i}. {result.payload['title']} (score: {result.score:.3f})")


## Pattern 6: Complex Pipeline

Multiple prefetches with different strategies and final ColBERT reranking.
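
Conceptually, when the root query is a vector rather than a fusion query, the candidates from all prefetch stages are pooled and then ordered by the root query's score. The sketch below mimics that flow in pure Python. The point IDs and scores are made up, and `pool_and_rerank` is a hypothetical helper, not a Qdrant function.

```python
def pool_and_rerank(prefetch_results, rerank_score, limit):
    """Pool candidate IDs from several prefetch stages, then order them by a final scoring function."""
    candidates = set()
    for ids in prefetch_results:
        candidates.update(ids)  # a point found by several stages is kept once
    ranked = sorted(candidates, key=rerank_score, reverse=True)
    return ranked[:limit]


# Hypothetical candidate IDs from three prefetch stages
dense_ids = [1, 2, 3]
sparse_ids = [3, 4]
colbert_ids = [2, 5]

# Stand-in for the final ColBERT score of each point (made-up numbers)
final_scores = {1: 0.2, 2: 0.9, 3: 0.4, 4: 0.7, 5: 0.1}

top = pool_and_rerank([dense_ids, sparse_ids, colbert_ids], final_scores.get, limit=3)
print(top)  # [2, 4, 3]
```

The prefetch scores only decide which points become candidates; the final order comes entirely from the reranking score.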


In [None]:
print("\n=== Pattern 6: Complex Pipeline ===")

complex_results = client.query_points(
    collection_name=collection_name,
    prefetch=[
        # High recall dense search
        models.Prefetch(query=query_dense, using="dense", limit=50),
        
        # Filtered sparse search
        models.Prefetch(
            query=models.SparseVector(
                indices=list(query_sparse.keys()),
                values=list(query_sparse.values())
            ),
            using="sparse", 
            limit=30,
            filter=early_filter
        ),
        
        # ColBERT search with filter
        models.Prefetch(query=query_colbert, using="colbert", limit=20, filter=early_filter)
    ],
    # Final ColBERT reranking
    query=query_colbert,
    using="colbert",
    limit=3
)

for i, result in enumerate(complex_results.points, 1):
    print(f"{i}. {result.payload['title']} (score: {result.score:.3f})")


## Compare Patterns

Run several of the patterns above against a second query to compare their rankings side by side.


In [None]:
test_query = "What is machine learning?"
print(f"\nComparing patterns for: '{test_query}'")

q_dense, q_sparse, q_colbert = generate_query_vectors(test_query)

patterns = {
    "Dense + Sparse (RRF)": lambda: client.query_points(
        collection_name=collection_name,
        prefetch=[
            models.Prefetch(query=q_dense, using="dense", limit=20),
            models.Prefetch(query=models.SparseVector(
                indices=list(q_sparse.keys()), values=list(q_sparse.values())
            ), using="sparse", limit=20)
        ],
        query=models.FusionQuery(fusion=models.Fusion.RRF), limit=2
    ),
    
    "Triple Vector (RRF)": lambda: client.query_points(
        collection_name=collection_name,
        prefetch=[
            models.Prefetch(query=q_dense, using="dense", limit=15),
            models.Prefetch(query=models.SparseVector(
                indices=list(q_sparse.keys()), values=list(q_sparse.values())
            ), using="sparse", limit=15),
            models.Prefetch(query=q_colbert, using="colbert", limit=15)
        ],
        query=models.FusionQuery(fusion=models.Fusion.RRF), limit=2
    ),
    
    "Dense → ColBERT": lambda: client.query_points(
        collection_name=collection_name,
        prefetch=[models.Prefetch(query=q_dense, using="dense", limit=50)],
        query=q_colbert, using="colbert", limit=2
    ),
    
    "ColBERT Only": lambda: client.query_points(
        collection_name=collection_name, query=q_colbert, using="colbert", limit=2
    )
}

for pattern_name, pattern_func in patterns.items():
    print(f"\n{pattern_name}:")
    results = pattern_func()
    for i, result in enumerate(results.points, 1):
        print(f"  {i}. {result.payload['title']} ({result.score:.3f})")


## Summary

You've explored 6 Universal Query API patterns using native FastEmbed models:

**Vector Types:**
- **Dense**: BAAI/bge-small-en-v1.5 (384-dim)
- **Sparse**: Qdrant/bm25 (native BM25)
- **ColBERT**: jinaai/jina-colbert-v2 (128-dim late interaction)

**Patterns Demonstrated:**
1. **Hybrid RRF** - Dense + sparse fusion
2. **Dense → ColBERT** - Broad recall + precision reranking
3. **Triple Vector** - All three types with RRF
4. **DBSF Fusion** - Alternative fusion method
5. **Multi-Stage Filtering** - Early + late constraints
6. **Complex Pipeline** - Multiple strategies combined

The Universal Query API handles complex search orchestration server-side, enabling sophisticated retrieval with simple declarative queries.


In [None]:
# Optional cleanup
# client.delete_collection(collection_name)

print("Tutorial complete!")
print("Next: Experiment with your own datasets and query patterns")
print("Docs: https://qdrant.tech/documentation/")
