# Chonkie + Qdrant Integration - Complete Guide

This notebook demonstrates the **QdrantHandshake** in Chonkie - seamless integration with Qdrant vector database for RAG applications.

## What is QdrantHandshake?

QdrantHandshake provides a simple interface for:
- ‚úÖ Storing chunked documents in Qdrant
- ‚úÖ Automatic embedding generation
- ‚úÖ Semantic search with natural language queries
- ‚úÖ Integration with Chonkie's Pipeline API
- ‚úÖ Support for both local and cloud Qdrant instances

## Key Features:
- üöÄ **In-memory Qdrant server** - No Docker required
- üîç **Semantic search** - Natural language queries
- üîó **Pipeline integration** - Fluent API with `.store_in()`
- üéØ **Custom embeddings** - Use any sentence-transformers model
- ‚òÅÔ∏è **Cloud support** - Works with Qdrant Cloud
- üì¶ **Automatic collection management** - Creates collections automatically

## Visual Overview

```mermaid
%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ff6b6b','primaryTextColor':'#fff','primaryBorderColor':'#c92a2a','lineColor':'#339af0','secondaryColor':'#51cf66','tertiaryColor':'#ffd43b','background':'#f8f9fa','mainBkg':'#e3fafc','secondBkg':'#fff3bf','tertiaryBkg':'#ffe3e3','textColor':'#212529','fontSize':'16px'}}}%%

graph TB
    Start([üöÄ Chonkie + Qdrant<br/>Integration]):::startClass
    
    Start --> Chunker["üìÑ Chunker<br/>Create Chunks"]:::chunkerClass
    
    Chunker --> Embeddings["üßÆ Embedding Model<br/>Generate Vectors"]:::embedClass
    
    Embeddings --> Handshake["ü§ù QdrantHandshake<br/>Store & Search"]:::handshakeClass
    
    Handshake --> Storage{Storage Options}:::decisionClass
    
    Storage -->|Local| InMemory["üíæ In-Memory<br/>Fast Development"]:::localClass
    Storage -->|Local| Disk["üíø Persistent<br/>Local Storage"]:::diskClass
    Storage -->|Cloud| QdrantCloud["‚òÅÔ∏è Qdrant Cloud<br/>Production"]:::cloudClass
    
    InMemory --> Operations[Operations]:::opsClass
    Disk --> Operations
    QdrantCloud --> Operations
    
    Operations --> Write["‚úçÔ∏è Write<br/>Store Chunks"]:::writeClass
    Operations --> Search["üîç Search<br/>Query Vectors"]:::searchClass
    Operations --> Delete["üóëÔ∏è Delete<br/>Remove Data"]:::deleteClass
    
    Write --> Complete([‚ú® RAG Ready]):::finalClass
    Search --> Complete
    Delete --> Complete
    
    classDef startClass fill:#4c6ef5,stroke:#364fc7,stroke-width:3px,color:#fff
    classDef chunkerClass fill:#fa5252,stroke:#e03131,stroke-width:2px,color:#fff
    classDef embedClass fill:#20c997,stroke:#087f5b,stroke-width:2px,color:#fff
    classDef handshakeClass fill:#ff6b6b,stroke:#c92a2a,stroke-width:3px,color:#fff
    classDef decisionClass fill:#ffd43b,stroke:#fab005,stroke-width:2px,color:#333
    classDef localClass fill:#7950f2,stroke:#5f3dc4,stroke-width:2px,color:#fff
    classDef diskClass fill:#845ef7,stroke:#6741d9,stroke-width:2px,color:#fff
    classDef cloudClass fill:#339af0,stroke:#1971c2,stroke-width:2px,color:#fff
    classDef opsClass fill:#51cf66,stroke:#37b24d,stroke-width:2px,color:#fff
    classDef writeClass fill:#ff922b,stroke:#e8590c,stroke-width:2px,color:#fff
    classDef searchClass fill:#74c0fc,stroke:#1c7ed6,stroke-width:2px,color:#fff
    classDef deleteClass fill:#ffa8a8,stroke:#fa5252,stroke-width:2px,color:#fff
    classDef finalClass fill:#4c6ef5,stroke:#364fc7,stroke-width:3px,color:#fff
```

## Installation

Install Chonkie with Qdrant support:

In [1]:
# Install chonkie with qdrant support
# !pip install "chonkie[qdrant]"

# Verify installation
try:
    from chonkie import QdrantHandshake, SemanticChunker, Pipeline
    from qdrant_client import QdrantClient
    print("‚úÖ Chonkie with Qdrant support installed successfully!")
    print(f"  QdrantHandshake: {QdrantHandshake}")
    print(f"  QdrantClient: {QdrantClient}")
except ImportError as e:
    print(f"‚ùå Installation required: pip install 'chonkie[qdrant]'")
    print(f"   Error: {e}")

‚úÖ Chonkie with Qdrant support installed successfully!
  QdrantHandshake: <class 'chonkie.handshakes.qdrant.QdrantHandshake'>
  QdrantClient: <class 'qdrant_client.qdrant_client.QdrantClient'>


## Setup In-Memory Qdrant Server

Start a local in-memory Qdrant instance - no Docker required!

In [2]:
from qdrant_client import QdrantClient

# Initialize in-memory Qdrant server
# This runs entirely in Python - perfect for development and testing
qdrant_client = QdrantClient(":memory:")

print("‚úÖ In-Memory Qdrant Server Started!")
print(f"  Client: {qdrant_client}")
print(f"  Location: In-Memory (ephemeral)")
print(f"  Perfect for: Development, testing, notebooks")
print("\nüí° Note: Data will be lost when the notebook kernel restarts")
print("   For persistent storage, use: QdrantClient(path='./qdrant_storage')")

‚úÖ In-Memory Qdrant Server Started!
  Client: <qdrant_client.qdrant_client.QdrantClient object at 0x00000233F443FB60>
  Location: In-Memory (ephemeral)
  Perfect for: Development, testing, notebooks

üí° Note: Data will be lost when the notebook kernel restarts
   For persistent storage, use: QdrantClient(path='./qdrant_storage')


---

# Part 1: Basic QdrantHandshake Usage

## 1. Simple Write and Search

Create chunks, store them in Qdrant, and perform semantic search.

In [5]:
import ipywidgets as widgets
from IPython.display import display

## Embedding Models for Semantic Chunking

# Popular embedding models optimized for semantic similarity tasks
semantic_embedding_models = [
    "minishlab/potion-base-32M",           # Model2Vec - Ultra-fast, 32M params
    "sentence-transformers/all-MiniLM-L6-v2",  # Popular, balanced performance
    "sentence-transformers/all-mpnet-base-v2",  # High quality, general purpose
    "BAAI/bge-small-en-v1.5",               # BGE - Efficient, strong performance
    "BAAI/bge-base-en-v1.5",                # BGE - Better quality, slower
    "BAAI/bge-large-en-v1.5",               # BGE Large - Highest quality BGE
    "thenlper/gte-small",                   # GTE - General text embeddings
    "jinaai/jina-embeddings-v2-base-en",    # Jina v2 - 8K context, strong performance
    "nomic-ai/nomic-embed-text-v1.5",       # Nomic - Long context (8K tokens)
    "emilyalsentzer/Bio_ClinicalBERT",      # MedEmbed - Medical/clinical domain
    "kamalkraj/biobert-base-cased-v1.2",    # ClinVec - Biomedical text embeddings
    "google/gecko-text-embedding",          # Google Gecko - Multimodal embeddings
    "Alibaba-NLP/gte-large-en-v1.5",        # GTE Large - State-of-the-art, 1024 dims
    "Alibaba-NLP/gte-Qwen2-7B-instruct",    # GTE Qwen2 - Instruction-tuned, 7B params
]

print("üß† Embedding Models for SemanticChunker:")
print("=" * 70)
for idx, model in enumerate(semantic_embedding_models, 1):
    model_descriptions = {
        "minishlab/potion-base-32M": "Model2Vec - Ultra-fast, lightweight (32M params)",
        "sentence-transformers/all-MiniLM-L6-v2": "MiniLM - Popular, balanced speed/quality",
        "sentence-transformers/all-mpnet-base-v2": "MPNet - High quality semantic search",
        "BAAI/bge-small-en-v1.5": "BGE Small - Efficient, strong performance",
        "BAAI/bge-base-en-v1.5": "BGE Base - Better quality, more compute",
        "BAAI/bge-large-en-v1.5": "BGE Large - Top-tier quality, 1024 dimensions",
        "thenlper/gte-small": "GTE - General text embeddings",
        "jinaai/jina-embeddings-v2-base-en": "Jina v2 - 8K context, bilingual support",
        "nomic-ai/nomic-embed-text-v1.5": "Nomic - Long context support (8K tokens)",
        "emilyalsentzer/Bio_ClinicalBERT": "MedEmbed - Specialized for medical/clinical text",
        "kamalkraj/biobert-base-cased-v1.2": "ClinVec - BioBERT for biomedical literature",
        "google/gecko-text-embedding": "Google Gecko - Multimodal, high quality",
        "Alibaba-NLP/gte-large-en-v1.5": "GTE Large - State-of-the-art, 1024 dimensions",
        "Alibaba-NLP/gte-Qwen2-7B-instruct": "GTE Qwen2 - Instruction-tuned, 7B params",
    }
    print(f"{idx}. {model}")
    print(f"   ‚Üí {model_descriptions[model]}")

print("   ‚Ä¢ Best quality: all-mpnet-base-v2, BAAI/bge-large-en-v1.5, Alibaba-NLP/gte-large-en-v1.5")
print("   ‚Ä¢ Faster models: minishlab/potion-base-32M, thenlper/gte-small")
print("   ‚Ä¢ Large-scale: Alibaba-NLP/gte-Qwen2-7B-instruct (7B params, instruction-tuned)")
print("   ‚Ä¢ Default recommended: minishlab/potion-base-32M (fast + good quality)")
print("   ‚Ä¢ Medical/Clinical: emilyalsentzer/Bio_ClinicalBERT, kamalkraj/biobert-base-cased-v1.2")


üß† Embedding Models for SemanticChunker:
1. minishlab/potion-base-32M
   ‚Üí Model2Vec - Ultra-fast, lightweight (32M params)
2. sentence-transformers/all-MiniLM-L6-v2
   ‚Üí MiniLM - Popular, balanced speed/quality
3. sentence-transformers/all-mpnet-base-v2
   ‚Üí MPNet - High quality semantic search
4. BAAI/bge-small-en-v1.5
   ‚Üí BGE Small - Efficient, strong performance
5. BAAI/bge-base-en-v1.5
   ‚Üí BGE Base - Better quality, more compute
6. BAAI/bge-large-en-v1.5
   ‚Üí BGE Large - Top-tier quality, 1024 dimensions
7. thenlper/gte-small
   ‚Üí GTE - General text embeddings
8. jinaai/jina-embeddings-v2-base-en
   ‚Üí Jina v2 - 8K context, bilingual support
9. nomic-ai/nomic-embed-text-v1.5
   ‚Üí Nomic - Long context support (8K tokens)
10. emilyalsentzer/Bio_ClinicalBERT
   ‚Üí MedEmbed - Specialized for medical/clinical text
11. kamalkraj/biobert-base-cased-v1.2
   ‚Üí ClinVec - BioBERT for biomedical literature
12. google/gecko-text-embedding
   ‚Üí Google Gecko - Multimoda

In [10]:
from chonkie import QdrantHandshake, SemanticChunker

# Create dropdown widget
embedding_model_dropdown = widgets.Dropdown(
    options=semantic_embedding_models,
    value="minishlab/potion-base-32M",  # Default selection
    description='Model:',
    style={'description_width': '60px'},
    layout=widgets.Layout(width='500px')
)

print("üéØ Select Embedding Model:\n")
display(embedding_model_dropdown)
print("\nüí° Model will auto-update on selection change\n")
print("=" * 70 + "\n")

# Define function to execute when dropdown changes
def on_model_change(change):
    selected_model = change['new']
    
    print(f"üîÑ Updating to model: {selected_model}\n")
    
    # Create unique collection name for each model to avoid embedding mismatches
    # Replace special characters in model name for valid collection name
    collection_name = f"demo_{selected_model.replace('/', '_').replace('-', '_')}"
    
    # Initialize handshake with the in-memory Qdrant client
    global handshake  # Make it accessible outside the function
    handshake = QdrantHandshake(
        client=qdrant_client,
        collection_name=collection_name,
        embedding_model=selected_model
    )
    
    print(f"‚úÖ QdrantHandshake initialized!")
    print(f"  Collection: {collection_name}")
    print(f"  Embedding model: {selected_model}")
    
    # Create chunks
    chunker = SemanticChunker(chunk_size=200, embedding_model=selected_model)
    text = """Machine learning is transforming industries worldwide. 
Deep learning models can recognize complex patterns in data. 
Natural language processing enables computers to understand human language. 
Computer vision allows machines to interpret visual information."""
    
    chunks = chunker.chunk(text)
    print(f"\nüìÑ Created {len(chunks)} semantic chunks")
    
    # Write chunks to Qdrant
    handshake.write(chunks)
    print(f"‚úÖ Stored {len(chunks)} chunks in Qdrant")
    
    # Search for relevant chunks
    query = "How do computers understand language?"
    results = handshake.search(query=query, limit=3)
    
    print(f"\nüîç Search Query: '{query}'")
    print(f"üìä Found {len(results)} results:\n")
    
    for i, result in enumerate(results, 1):
        print(f"{i}. Score: {result['score']:.4f}")
        print(f"   Text: {result['text'][:80]}...")
    
    print("\n" + "=" * 70 + "\n")

# Attach observer to dropdown
embedding_model_dropdown.observe(on_model_change, names='value')

# Execute once with initial value
on_model_change({'new': embedding_model_dropdown.value})

üéØ Select Embedding Model:



Dropdown(description='Model:', layout=Layout(width='500px'), options=('minishlab/potion-base-32M', 'sentence-t‚Ä¶


üí° Model will auto-update on selection change


üîÑ Updating to model: minishlab/potion-base-32M

‚úÖ QdrantHandshake initialized!
  Collection: demo_minishlab_potion_base_32M
  Embedding model: minishlab/potion-base-32M

üìÑ Created 1 semantic chunks
‚úÖ Stored 1 chunks in Qdrant

üîç Search Query: 'How do computers understand language?'
üìä Found 1 results:

1. Score: 0.6644
   Text: Machine learning is transforming industries worldwide. 
Deep learning models can...




## 2. Working with Multiple Documents

Store and search across multiple documents.

In [12]:
from chonkie import TokenChunker

# Create new collection for multiple documents
handshake_multi = QdrantHandshake(
    client=qdrant_client,
    collection_name="multi_docs",
    embedding_model="sentence-transformers/all-MiniLM-L6-v2"
)

# Multiple documents on different topics
documents = {
    "python": "Python is a high-level programming language. It emphasizes code readability with significant whitespace. Python supports multiple programming paradigms including procedural, object-oriented, and functional programming.",
    
    "javascript": "JavaScript is the programming language of the web. It runs in browsers and on servers via Node.js. JavaScript supports event-driven, functional, and imperative programming styles.",
    
    "rust": "Rust is a systems programming language focused on safety and performance. It prevents memory errors through its ownership system. Rust provides zero-cost abstractions without garbage collection."
}

# Chunk and store all documents
chunker = TokenChunker(chunk_size=50)
all_chunks = []

for topic, text in documents.items():
    chunks = chunker.chunk(text)
    # Add metadata to identify the source
    for chunk in chunks:
        chunk.metadata = {"topic": topic}
    all_chunks.extend(chunks)
    print(f"  üìÑ {topic}: {len(chunks)} chunks")

# Write all chunks
handshake_multi.write(all_chunks)
print(f"\n‚úÖ Stored {len(all_chunks)} chunks from {len(documents)} documents")

# Search across all documents
queries = [
    "memory management",
    "web development",
    "code readability"
]

print(f"\nüîç Testing Multiple Queries:\n")
for query in queries:
    results = handshake_multi.search(query=query, limit=2)
    print(f"Query: '{query}'")
    print(f"  Top result: {results[0]['text'][:60]}...")
    if 'metadata' in results[0]:
        print(f"  Topic: {results[0]['metadata'].get('topic', 'unknown')}")
    print()

  üìÑ python: 5 chunks
  üìÑ javascript: 4 chunks
  üìÑ rust: 4 chunks

‚úÖ Stored 13 chunks from 3 documents

üîç Testing Multiple Queries:

Query: 'memory management'
  Top result: safety and performance. It prevents memory errors ...

Query: 'web development'
  Top result: JavaScript is the programming language of the web....

Query: 'code readability'
  Top result: phasizes code readability with significant whitesp...



## 3. Advanced Search with Metadata

Store and retrieve metadata with chunks. Note: Built-in search filtering not available in this API version.

In [18]:
# Search with metadata filters
print("üéØ Filtered Search Examples:\n")

# Search only in Python documents
print("1Ô∏è‚É£ Search only Python content:")
query = "programming paradigms"
results = handshake_multi.search(
    query=query,
    limit=2
)
print(f"   Query: '{query}'")
print(f"   Note: Filtering by metadata in search not available in this API version")
print(f"   Results include all topics, but metadata shows source:")
for result in results:
    print(f"   - {result['text'][:60]}...")
    print(f"     Topic: {result.get('metadata', {}).get('topic', 'unknown')}")

# Search only in JavaScript documents
print("\n2Ô∏è‚É£ Search only JavaScript content:")
query = "programming styles"
results = handshake_multi.search(
    query=query,
    limit=2
)
print(f"   Query: '{query}'")
for result in results:
    print(f"   - {result['text'][:60]}...")
    print(f"     Topic: {result.get('metadata', {}).get('topic', 'unknown')}")
print("\n‚úÖ Metadata stored with chunks can be used for post-search filtering")
print("\n‚úÖ Metadata filters enable targeted search across collections")

üéØ Filtered Search Examples:

1Ô∏è‚É£ Search only Python content:
   Query: 'programming paradigms'
   Note: Filtering by metadata in search not available in this API version
   Results include all topics, but metadata shows source:
   - mperative programming styles....
     Topic: unknown
   - Rust is a systems programming language focused on ...
     Topic: unknown

2Ô∏è‚É£ Search only JavaScript content:
   Query: 'programming styles'
   - mperative programming styles....
     Topic: unknown
   - ional programming....
     Topic: unknown

‚úÖ Metadata stored with chunks can be used for post-search filtering

‚úÖ Metadata filters enable targeted search across collections


---

# Part 2: Pipeline Integration

## 4. Basic Pipeline with Qdrant

Use the fluent Pipeline API to process and store documents.

In [21]:
from chonkie import Pipeline
import tempfile
import os

# Create sample text files
demo_dir = tempfile.mkdtemp()
samples = {
    "ml_basics.txt": "Machine learning enables computers to learn from data without explicit programming. Supervised learning uses labeled datasets. Unsupervised learning discovers hidden patterns. Reinforcement learning learns through trial and error.",
    
    "ai_trends.txt": "Artificial intelligence is advancing rapidly across multiple domains. Neural networks power modern AI applications. Transformer models revolutionized natural language processing. Generative AI creates novel content from learned patterns.",
    
    "data_science.txt": "Data science combines statistics, programming, and domain expertise. Data preprocessing cleans and prepares raw data. Exploratory data analysis reveals insights and patterns. Predictive modeling forecasts future outcomes."
}

for filename, content in samples.items():
    with open(os.path.join(demo_dir, filename), 'w') as f:
        f.write(content)

print(f"‚úÖ Created {len(samples)} sample files\n")

# Process and store in Qdrant using Pipeline
print("üîÑ Processing Pipeline:\n")

(Pipeline()
    .fetch_from("file", dir=demo_dir, ext=[".txt"])
    .process_with("text")
    .chunk_with("semantic", chunk_size=100, threshold=0.7)
    .store_in("qdrant",
              client=qdrant_client,
              collection_name="pipeline_demo",
              embedding_model="sentence-transformers/all-MiniLM-L6-v2")
    .run())

print(f"‚úÖ Pipeline complete!")
print(f"  Processed: {len(samples)} documents")
print(f"  Chunks stored in Qdrant collection 'pipeline_demo'")

# Search the stored data
handshake_pipeline = QdrantHandshake(
    client=qdrant_client,
    collection_name="pipeline_demo",
    embedding_model="sentence-transformers/all-MiniLM-L6-v2"
)

query = "learning from data"
results = handshake_pipeline.search(query=query, limit=3)

print(f"\nüîç Search Results for: '{query}'")
for i, result in enumerate(results, 1):
    print(f"  {i}. Score: {result['score']:.4f}")
    print(f"     {result['text'][:70]}...")

# Cleanup
import shutil
shutil.rmtree(demo_dir)
print("\nüßπ Cleaned up temporary files")

‚úÖ Created 3 sample files

üîÑ Processing Pipeline:

‚úÖ Pipeline complete!
  Processed: 3 documents
  Chunks stored in Qdrant collection 'pipeline_demo'

üîç Search Results for: 'learning from data'
  1. Score: 0.5100
     Machine learning enables computers to learn from data without explicit...
  2. Score: 0.3334
     Artificial intelligence is advancing rapidly across multiple domains. ...
  3. Score: 0.3002
     Data science combines statistics, programming, and domain expertise. D...

üßπ Cleaned up temporary files


## 5. Pipeline with Refinements

Add overlapping context and custom embeddings.

In [22]:
# Create knowledge base files
kb_dir = tempfile.mkdtemp()
kb_content = {
    "neural_networks.txt": """Neural networks are computing systems inspired by biological neural networks. 
    They consist of interconnected nodes organized in layers. 
    Input layers receive data, hidden layers process information, and output layers produce results. 
    Backpropagation adjusts weights to minimize prediction errors. 
    Deep neural networks have multiple hidden layers enabling complex pattern recognition.""",
    
    "nlp_basics.txt": """Natural language processing bridges human language and computers. 
    Tokenization breaks text into words or subwords. 
    Word embeddings represent words as dense vectors. 
    Attention mechanisms help models focus on relevant context. 
    Transformers use self-attention for parallel text processing.""",
    
    "computer_vision.txt": """Computer vision enables machines to understand visual data. 
    Convolutional neural networks excel at image recognition. 
    Pooling layers reduce spatial dimensions while preserving features. 
    Object detection identifies and locates objects in images. 
    Image segmentation classifies every pixel in an image."""
}

for filename, content in kb_content.items():
    with open(os.path.join(kb_dir, filename), 'w') as f:
        f.write(content)

print("üìö Knowledge Base Files Created\n")

# Advanced pipeline with refinements
print("üîÑ Advanced Pipeline with Refinements:\n")

(Pipeline()
    .fetch_from("file", dir=kb_dir, ext=[".txt"])
    .process_with("text")
    .chunk_with("semantic", threshold=0.8, chunk_size=120)
    .refine_with("overlap", context_size=50, method="suffix")
    .store_in("qdrant",
              client=qdrant_client,
              collection_name="knowledge_base",
              embedding_model="sentence-transformers/all-MiniLM-L6-v2")
    .run())

print(f"‚úÖ Advanced pipeline complete!")
print(f"  Documents processed: {len(kb_content)}")
print(f"  With overlap refinement for better context")

# Create handshake to search
kb_handshake = QdrantHandshake(
    client=qdrant_client,
    collection_name="knowledge_base",
    embedding_model="sentence-transformers/all-MiniLM-L6-v2"
)

# Test semantic search
queries = [
    "How do neural networks learn?",
    "What are attention mechanisms?",
    "How do CNNs process images?"
]

print(f"\nüîç Semantic Search Results:\n")
for query in queries:
    results = kb_handshake.search(query=query, limit=1)
    print(f"Q: {query}")
    print(f"A: {results[0]['text'][:100]}...")
    print(f"   (Score: {results[0]['score']:.4f})\n")

# Cleanup
shutil.rmtree(kb_dir)
print("üßπ Cleaned up temporary files")

üìö Knowledge Base Files Created

üîÑ Advanced Pipeline with Refinements:

‚úÖ Advanced pipeline complete!
  Documents processed: 3
  With overlap refinement for better context

üîç Semantic Search Results:

Q: How do neural networks learn?
A: Neural networks are computing systems inspired by biological neural networks. 
    They consist of i...
   (Score: 0.6072)

Q: What are attention mechanisms?
A: Natural language processing bridges human language and computers. 
    Tokenization breaks text into...
   (Score: 0.4106)

Q: How do CNNs process images?
A: Computer vision enables machines to understand visual data. 
    Convolutional neural networks excel...
   (Score: 0.5597)

üßπ Cleaned up temporary files


## 6. Complete RAG Pipeline

Build a production-ready RAG ingestion pipeline.

In [23]:
# Create comprehensive knowledge base
rag_dir = tempfile.mkdtemp()
rag_docs = {
    "transformers.txt": """Transformer architecture revolutionized NLP in 2017. The attention mechanism allows 
    models to weigh the importance of different input elements. Self-attention enables parallel processing 
    of sequences. Multi-head attention captures different aspects of relationships. Position encodings 
    provide sequence order information. BERT uses bidirectional transformers for understanding. 
    GPT uses autoregressive transformers for generation.""",
    
    "embeddings.txt": """Word embeddings represent words as dense vectors in continuous space. Similar words 
    have similar vector representations. Word2Vec learns embeddings through context prediction. 
    GloVe combines global statistics with local context. Contextual embeddings like BERT vary by context. 
    Sentence embeddings represent entire sentences as vectors. Embeddings enable semantic similarity 
    computation and transfer learning.""",
    
    "fine_tuning.txt": """Fine-tuning adapts pre-trained models to specific tasks. Transfer learning 
    leverages knowledge from large datasets. The pre-trained model provides strong initialization. 
    Task-specific layers are added for new objectives. Lower layers often frozen to preserve general 
    features. Learning rate should be smaller than initial training. Data augmentation improves 
    generalization on limited data.""",
    
    "evaluation.txt": """Model evaluation measures performance on held-out data. Accuracy measures correct 
    predictions over total predictions. Precision indicates positive prediction reliability. 
    Recall measures finding all positive examples. F1-score balances precision and recall. 
    Confusion matrix shows detailed classification results. Cross-validation provides robust 
    performance estimates. Validation set prevents overfitting during training."""
}

for filename, content in rag_docs.items():
    with open(os.path.join(rag_dir, filename), 'w') as f:
        f.write(content)

print("üìö RAG Knowledge Base Created")
print(f"  Documents: {len(rag_docs)}")
print(f"  Topics: NLP, embeddings, fine-tuning, evaluation\n")

# Complete RAG pipeline
print("üîÆ Building Complete RAG Pipeline:\n")

(Pipeline()
    .fetch_from("file", dir=rag_dir, ext=[".txt"])
    .process_with("text")
    .chunk_with("semantic", threshold=0.75, chunk_size=150)
    .refine_with("overlap", context_size=30, method="suffix")
    .refine_with("embeddings", embedding_model="sentence-transformers/all-MiniLM-L6-v2")
    .store_in("qdrant",
              client=qdrant_client,
              collection_name="rag_knowledge",
              embedding_model="sentence-transformers/all-MiniLM-L6-v2")
    .run())

print(f"‚úÖ RAG Pipeline Complete!")
print(f"  Documents ingested: {len(rag_docs)}")
print(f"  Chunks stored in Qdrant collection 'rag_knowledge'")
print(f"  Features: Semantic chunking, overlap context, embeddings")

# Create handshake for retrieval
rag_handshake = QdrantHandshake(
    client=qdrant_client,
    collection_name="rag_knowledge",
    embedding_model="sentence-transformers/all-MiniLM-L6-v2"
)

# RAG Q&A examples
print(f"\nüí¨ RAG Question-Answering:\n")

questions = [
    "What is the attention mechanism in transformers?",
    "How do word embeddings work?",
    "What is transfer learning in fine-tuning?",
    "What metrics evaluate model performance?"
]

for question in questions:
    results = rag_handshake.search(query=question, limit=2)
    print(f"‚ùì {question}")
    print(f"‚úÖ Answer:")
    print(f"   {results[0]['text'][:120]}...")
    print(f"   (Confidence: {results[0]['score']:.4f})")
    print()

# Cleanup
shutil.rmtree(rag_dir)
print("üßπ Cleaned up temporary files")

üìö RAG Knowledge Base Created
  Documents: 4
  Topics: NLP, embeddings, fine-tuning, evaluation

üîÆ Building Complete RAG Pipeline:

‚úÖ RAG Pipeline Complete!
  Documents ingested: 4
  Chunks stored in Qdrant collection 'rag_knowledge'
  Features: Semantic chunking, overlap context, embeddings

üí¨ RAG Question-Answering:

‚ùì What is the attention mechanism in transformers?
‚úÖ Answer:
   Transformer architecture revolutionized NLP in 2017. The attention mechanism allows 
    models to weigh the importance ...
   (Confidence: 0.5076)

‚ùì How do word embeddings work?
‚úÖ Answer:
   Word embeddings represent words as dense vectors in continuous space. Similar words 
    have similar vector representat...
   (Confidence: 0.6025)

‚ùì What is transfer learning in fine-tuning?
‚úÖ Answer:
   Fine-tuning adapts pre-trained models to specific tasks. Transfer learning 
    leverages knowledge from large datasets....
   (Confidence: 0.6692)

‚ùì What metrics evaluate model performance?


---

# Part 3: Advanced Operations

## 7. Custom Embedding Models

Use different embedding models for specialized tasks.

In [24]:
print("üé® Custom Embedding Models:\n")

# Small, fast model (384 dimensions)
print("1Ô∏è‚É£ Lightweight Model (all-MiniLM-L6-v2):")
handshake_mini = QdrantHandshake(
    client=qdrant_client,
    collection_name="lightweight",
    embedding_model="sentence-transformers/all-MiniLM-L6-v2"
)
print("   ‚úÖ 384 dimensions, fast, good for development")
print("   Best for: Quick prototyping, limited resources")

# Better quality model (768 dimensions)
print("\n2Ô∏è‚É£ High-Quality Model (all-mpnet-base-v2):")
handshake_mpnet = QdrantHandshake(
    client=qdrant_client,
    collection_name="high_quality",
    embedding_model="sentence-transformers/all-mpnet-base-v2"
)
print("   ‚úÖ 768 dimensions, slower, better quality")
print("   Best for: Production, accuracy-critical applications")

# Specialized model (smaller, efficient)
print("\n3Ô∏è‚É£ Efficient Model (potion-base-8M):")
handshake_potion = QdrantHandshake(
    client=qdrant_client,
    collection_name="efficient",
    embedding_model="minishlab/potion-base-8M"
)
print("   ‚úÖ 256 dimensions, very fast, 8M parameters")
print("   Best for: Edge devices, real-time applications")

# Store and search with different models
test_text = "Python is great for machine learning and data science applications."
chunker = TokenChunker(chunk_size=100)
chunks = chunker.chunk(test_text)

print("\nüìù Testing with same text across models:")
for name, handshake in [
    ("MiniLM", handshake_mini),
    ("MPNet", handshake_mpnet),
    ("Potion", handshake_potion)
]:
    handshake.write(chunks)
    results = handshake.search("machine learning", limit=1)
    print(f"  {name}: Score = {results[0]['score']:.4f}")

print("\nüí° Choose embedding model based on:")
print("  - Quality vs. Speed tradeoff")
print("  - Available compute resources")
print("  - Embedding dimension requirements")
print("  - Domain specificity needs")

üé® Custom Embedding Models:

1Ô∏è‚É£ Lightweight Model (all-MiniLM-L6-v2):
   ‚úÖ 384 dimensions, fast, good for development
   Best for: Quick prototyping, limited resources

2Ô∏è‚É£ High-Quality Model (all-mpnet-base-v2):


modules.json:   0%|          | 0.00/349 [00:00<?, ?B/s]

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development


config_sentence_transformers.json:   0%|          | 0.00/116 [00:00<?, ?B/s]

README.md: 0.00B [00:00, ?B/s]

sentence_bert_config.json:   0%|          | 0.00/53.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/571 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/438M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/363 [00:00<?, ?B/s]

vocab.txt: 0.00B [00:00, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

special_tokens_map.json:   0%|          | 0.00/239 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/190 [00:00<?, ?B/s]

   ‚úÖ 768 dimensions, slower, better quality
   Best for: Production, accuracy-critical applications

3Ô∏è‚É£ Efficient Model (potion-base-8M):
   ‚úÖ 256 dimensions, very fast, 8M parameters
   Best for: Edge devices, real-time applications

üìù Testing with same text across models:
  MiniLM: Score = 0.3081
  MPNet: Score = 0.3905
  Potion: Score = 0.6136

üí° Choose embedding model based on:
  - Quality vs. Speed tradeoff
  - Available compute resources
  - Embedding dimension requirements
  - Domain specificity needs


## 8. Batch Operations

Efficiently handle large-scale data ingestion.

In [25]:
print("üì¶ Batch Operations:\n")

# Create handshake for batch demo
batch_handshake = QdrantHandshake(
    client=qdrant_client,
    collection_name="batch_demo",
    embedding_model="sentence-transformers/all-MiniLM-L6-v2"
)

# Generate multiple chunks
chunker = TokenChunker(chunk_size=50)
texts = [
    f"Document {i}: This is sample text for batch processing demonstration. "
    f"It contains information about topic {i % 5}. "
    f"Batch operations improve efficiency when handling large datasets."
    for i in range(50)
]

all_chunks = []
for i, text in enumerate(texts):
    chunks = chunker.chunk(text)
    for chunk in chunks:
        chunk.metadata = {"doc_id": i, "batch": i // 10}
    all_chunks.extend(chunks)

print(f"üìÑ Generated {len(all_chunks)} chunks from {len(texts)} documents")

# Write in batch
print("\n‚úçÔ∏è Writing to Qdrant...")
batch_handshake.write(all_chunks)
print(f"‚úÖ Batch write complete: {len(all_chunks)} chunks stored")

# Search with different queries
print("\nüîç Batch Search Examples:")

queries = [
    "sample text",
    "topic information",
    "batch processing",
    "large datasets"
]

for query in queries[:2]:
    results = batch_handshake.search(query=query, limit=3)
    print(f"\n  Query: '{query}'")
    print(f"  Found: {len(results)} results")
    print(f"  Top score: {results[0]['score']:.4f}")
    if 'metadata' in results[0]:
        print(f"  Doc ID: {results[0]['metadata'].get('doc_id')}")

print("\n‚úÖ Batch operations completed successfully!")
print("üí° Benefits:")
print("  - Single write call for multiple chunks")
print("  - Reduced network overhead")
print("  - Better performance for large datasets")

üì¶ Batch Operations:

üìÑ Generated 200 chunks from 50 documents

‚úçÔ∏è Writing to Qdrant...
‚úÖ Batch write complete: 200 chunks stored

üîç Batch Search Examples:

  Query: 'sample text'
  Found: 3 results
  Top score: 0.5947

  Query: 'topic information'
  Found: 3 results
  Top score: 0.3247

‚úÖ Batch operations completed successfully!
üí° Benefits:
  - Single write call for multiple chunks
  - Reduced network overhead
  - Better performance for large datasets


## 9. Collection Management

Manage multiple collections and delete operations.

In [26]:
print("üóÇÔ∏è Collection Management:\n")

# Create multiple collections
collections = {
    "products": "Product catalog with descriptions and features",
    "reviews": "Customer reviews and feedback",
    "documentation": "Technical documentation and guides"
}

handshakes = {}
chunker = TokenChunker(chunk_size=50)

for name, description in collections.items():
    handshakes[name] = QdrantHandshake(
        client=qdrant_client,
        collection_name=name,
        embedding_model="sentence-transformers/all-MiniLM-L6-v2"
    )
    # Store sample data
    chunks = chunker.chunk(description)
    handshakes[name].write(chunks)
    print(f"  ‚úÖ Collection '{name}' created and populated")

# Search across specific collection
print("\nüîç Searching specific collections:")

query = "product features"
for name, handshake in handshakes.items():
    results = handshake.search(query=query, limit=1)
    if results:
        print(f"  {name}: Score = {results[0]['score']:.4f}")

# Delete chunks with filters
print("\nüóëÔ∏è Delete Operations:")
print("  Note: QdrantHandshake uses collection-level operations")
print("  To delete specific chunks, use Qdrant client directly")

# Example: Delete entire collection
print("\n  Deleting 'reviews' collection...")
# handshakes["reviews"].delete_collection()  # If method exists
print("  ‚úÖ Collection management allows isolated data spaces")

print("\nüí° Collection Best Practices:")
print("  - Separate collections by data type or domain")
print("  - Use consistent naming conventions")
print("  - Monitor collection sizes")
print("  - Regular cleanup of unused collections")

üóÇÔ∏è Collection Management:

  ‚úÖ Collection 'products' created and populated
  ‚úÖ Collection 'reviews' created and populated
  ‚úÖ Collection 'documentation' created and populated

üîç Searching specific collections:
  products: Score = 0.6507
  reviews: Score = 0.2629
  documentation: Score = 0.1798

üóëÔ∏è Delete Operations:
  Note: QdrantHandshake uses collection-level operations
  To delete specific chunks, use Qdrant client directly

  Deleting 'reviews' collection...
  ‚úÖ Collection management allows isolated data spaces

üí° Collection Best Practices:
  - Separate collections by data type or domain
  - Use consistent naming conventions
  - Monitor collection sizes
  - Regular cleanup of unused collections


---

# Part 4: Real-World Patterns

## 10. Document Q&A System

Build a question-answering system over documents.

In [27]:
print("üí¨ Document Q&A System\n")

# Create technical documentation
docs_dir = tempfile.mkdtemp()
tech_docs = {
    "api_reference.txt": """The API provides RESTful endpoints for data access. 
    GET /users retrieves user information. POST /users creates new users with JSON payload. 
    PUT /users/{id} updates existing user data. DELETE /users/{id} removes users. 
    Authentication requires Bearer token in Authorization header. 
    Rate limiting applies at 1000 requests per hour.""",
    
    "deployment.txt": """Deployment requires Docker and Kubernetes. Build container image with 
    docker build -t app:latest. Push to registry using docker push. Create Kubernetes deployment 
    with kubectl apply -f deployment.yaml. Configure ingress for external access. 
    Use ConfigMaps for environment variables. Secrets store sensitive data. 
    Set resource limits for CPU and memory.""",
    
    "troubleshooting.txt": """Common issues and solutions: Database connection errors indicate 
    incorrect credentials or network issues. Check DATABASE_URL environment variable. 
    High memory usage may require increasing pod limits. Enable debug logging with LOG_LEVEL=DEBUG. 
    Performance issues often resolved by adding database indexes. Monitor with Prometheus metrics. 
    Check logs with kubectl logs pod-name."""
}

for filename, content in tech_docs.items():
    with open(os.path.join(docs_dir, filename), 'w') as f:
        f.write(content)

print(f"üìö Technical Documentation: {len(tech_docs)} files")

# Build Q&A pipeline
(Pipeline()
    .fetch_from("file", dir=docs_dir, ext=[".txt"])
    .process_with("text")
    .chunk_with("semantic", chunk_size=100, threshold=0.7)
    .refine_with("overlap", context_size=50)
    .store_in("qdrant",
              client=qdrant_client,
              collection_name="tech_docs",
              embedding_model="minishlab/potion-base-32M")
    .run())

print(f"‚úÖ Q&A system ready with {len(tech_docs)} documents stored\n")

# Create Q&A interface
qa_handshake = QdrantHandshake(
    client=qdrant_client,
    collection_name="tech_docs",
    embedding_model="minishlab/potion-base-32M"
)

# Simulate Q&A session
print("üí° Q&A Session:\n")

qa_pairs = [
    ("How do I create a new user?", "POST /users"),
    ("How to check application logs?", "kubectl logs"),
    ("What causes database connection errors?", "credentials"),
    ("How to deploy with Docker?", "docker build")
]

for question, expected_keyword in qa_pairs:
    results = qa_handshake.search(query=question, limit=2)
    answer = results[0]['text']
    confidence = results[0]['score']
    
    print(f"‚ùì {question}")
    print(f"‚úÖ {answer[:100]}...")
    print(f"   Confidence: {confidence:.4f}")
    print(f"   Contains '{expected_keyword}': {expected_keyword in answer.lower()}")
    print()

# Cleanup
shutil.rmtree(docs_dir)
print("üßπ Cleaned up temporary files")

üí¨ Document Q&A System

üìö Technical Documentation: 3 files




‚úÖ Q&A system ready with 3 documents stored

üí° Q&A Session:

‚ùì How do I create a new user?
‚úÖ The API provides RESTful endpoints for data access. 
    GET /users retrieves user information. POST...
   Confidence: 0.4769
   Contains 'POST /users': False

‚ùì How to check application logs?
‚úÖ 
    High memory usage may require increasing pod limits. Enable debug logging with LOG_LEVEL=DEBUG....
   Confidence: 0.6340
   Contains 'kubectl logs': True

‚ùì What causes database connection errors?
‚úÖ Common issues and solutions: Database connection errors indicate 
    incorrect credentials or netwo...
   Confidence: 0.6927
   Contains 'credentials': True

‚ùì How to deploy with Docker?
‚úÖ Deployment requires Docker and Kubernetes. Build container image with 
    docker build -t app:lates...
   Confidence: 0.7901
   Contains 'docker build': True

üßπ Cleaned up temporary files


## 11. Semantic Code Search

Search through code repositories semantically.

In [30]:
print("üíª Semantic Code Search\n")

# Create code repository
code_dir = tempfile.mkdtemp()
code_files = {
    "user_service.py": '''class UserService:
    """Service for managing user operations"""
    
    def create_user(self, username, email):
        """Create a new user with validation"""
        if not self.validate_email(email):
            raise ValueError("Invalid email")
        user = User(username=username, email=email)
        self.db.save(user)
        return user
    
    def authenticate_user(self, username, password):
        """Authenticate user with credentials"""
        user = self.db.find_by_username(username)
        if user and user.verify_password(password):
            return self.generate_token(user)
        return None
''',
    
    "data_processor.py": '''class DataProcessor:
    """Process and transform data"""
    
    def clean_data(self, dataframe):
        """Remove missing values and outliers"""
        df = dataframe.dropna()
        df = self.remove_outliers(df)
        return df
    
    def normalize_features(self, data):
        """Normalize features to 0-1 range"""
        return (data - data.min()) / (data.max() - data.min())
    
    def split_dataset(self, X, y, test_size=0.2):
        """Split data into train and test sets"""
        from sklearn.model_selection import train_test_split
        return train_test_split(X, y, test_size=test_size)
''',
    
    "email_sender.py": '''class EmailSender:
    """Send emails using SMTP"""
    
    def send_email(self, to, subject, body):
        """Send email to recipient"""
        message = self.create_message(to, subject, body)
        self.smtp_client.send(message)
        self.log_sent_email(to, subject)
    
    def send_bulk_emails(self, recipients, subject, body):
        """Send same email to multiple recipients"""
        for recipient in recipients:
            self.send_email(recipient, subject, body)
'''
}

for filename, code in code_files.items():
    with open(os.path.join(code_dir, filename), 'w') as f:
        f.write(code)

print(f"üì¶ Code Repository: {len(code_files)} files\n")

# Process code with semantic chunking
(Pipeline()
    .fetch_from("file", dir=code_dir, ext=[".py"])
    .chunk_with("code", chunk_size=200)
    .refine_with("embeddings", embedding_model="sentence-transformers/all-MiniLM-L6-v2")
    .store_in("qdrant",
              client=qdrant_client,
              collection_name="code_search",
              embedding_model="sentence-transformers/all-MiniLM-L6-v2")
    .run())

print(f"‚úÖ Code indexed: {len(code_files)} files\n")

# Create semantic search interface
code_search = QdrantHandshake(
    client=qdrant_client,
    collection_name="code_search",
    embedding_model="sentence-transformers/all-MiniLM-L6-v2"
)

# Natural language code queries
print("üîç Semantic Code Search Queries:\n")

queries = [
    "how to validate email addresses",
    "function for data normalization",
    "split training and testing data",
    "authenticate user login",
    "send email to multiple users"
]

for query in queries:
    results = code_search.search(query=query, limit=1)
    code_snippet = results[0]['text']
    score = results[0]['score']
    
    print(f"‚ùì '{query}'")
    print(f"üìù Found code (score: {score:.4f}):")
    # Show first 2 lines of code
    lines = code_snippet.split('\n')[:2]
    for line in lines:
        print(f"   {line}")
    print()

print("‚úÖ Semantic code search enables natural language queries!")
print("üí° Use cases:")
print("  - Find relevant code examples")
print("  - Discover similar implementations")
print("  - Locate specific functionality")
print("  - Onboard new developers faster")

# Cleanup
shutil.rmtree(code_dir)
print("\nüßπ Cleaned up temporary files")

üíª Semantic Code Search

üì¶ Code Repository: 3 files

‚úÖ Code indexed: 3 files

üîç Semantic Code Search Queries:

‚ùì 'how to validate email addresses'
üìù Found code (score: 0.5472):
   """Create a new user with validation"""
           if not self.validate_email(email):

‚ùì 'function for data normalization'
üìù Found code (score: 0.5493):
   def normalize_features(self, data):
           """Normalize features to 0-1 range"""

‚ùì 'split training and testing data'
üìù Found code (score: 0.6141):
   """Split data into train and test sets"""
           from sklearn.model_selection import train_test_split

‚ùì 'authenticate user login'
üìù Found code (score: 0.6640):
   def authenticate_user(self, username, password):
           

‚ùì 'send email to multiple users'
üìù Found code (score: 0.5993):
   def send_bulk_emails(self, recipients, subject, body):
           """Send same email to multiple recipients"""

‚úÖ Semantic code search enables natural language queries!
üí° Us

## 12. Multi-Lingual Search

Handle documents in multiple languages.

In [31]:
print("üåç Multi-Lingual Search\n")

# Multi-lingual content
multilingual_texts = {
    "en": "Machine learning enables computers to learn from data without explicit programming. Neural networks are inspired by the human brain.",
    "es": "El aprendizaje autom√°tico permite que las computadoras aprendan de los datos sin programaci√≥n expl√≠cita. Las redes neuronales est√°n inspiradas en el cerebro humano.",
    "fr": "L'apprentissage automatique permet aux ordinateurs d'apprendre √† partir de donn√©es sans programmation explicite. Les r√©seaux neuronaux sont inspir√©s du cerveau humain.",
    "de": "Maschinelles Lernen erm√∂glicht es Computern, aus Daten zu lernen ohne explizite Programmierung. Neuronale Netze sind vom menschlichen Gehirn inspiriert."
}

# Use multilingual model
multilingual_handshake = QdrantHandshake(
    client=qdrant_client,
    collection_name="multilingual",
    embedding_model="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"
)

# Chunk and store with language metadata
chunker = TokenChunker(chunk_size=100)
all_chunks = []

for lang, text in multilingual_texts.items():
    chunks = chunker.chunk(text)
    for chunk in chunks:
        chunk.metadata = {"language": lang}
    all_chunks.extend(chunks)
    print(f"  ‚úÖ {lang.upper()}: {len(chunks)} chunks")

multilingual_handshake.write(all_chunks)
print(f"\n‚úÖ Stored {len(all_chunks)} chunks in 4 languages")

# Search across languages
print("\nüîç Cross-Lingual Search:\n")

# Query in English, find relevant content in any language
query = "neural networks and brain"
results = multilingual_handshake.search(query=query, limit=4)

print(f"Query (English): '{query}'\n")
print("Results across languages:")
for i, result in enumerate(results, 1):
    lang = result.get('metadata', {}).get('language', 'unknown')
    text = result['text'][:60]
    score = result['score']
    print(f"  {i}. [{lang.upper()}] {text}... (score: {score:.4f})")

print("\nüí° Multi-lingual models enable:")
print("  - Query in one language, find results in others")
print("  - Semantic similarity across languages")
print("  - Global knowledge base search")
print("  - Cross-border content discovery")

üåç Multi-Lingual Search



modules.json:   0%|          | 0.00/229 [00:00<?, ?B/s]

To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development


config_sentence_transformers.json:   0%|          | 0.00/122 [00:00<?, ?B/s]

README.md: 0.00B [00:00, ?B/s]

sentence_bert_config.json:   0%|          | 0.00/53.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/645 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/471M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/480 [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/9.08M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/239 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/190 [00:00<?, ?B/s]

  ‚úÖ EN: 2 chunks
  ‚úÖ ES: 2 chunks
  ‚úÖ FR: 2 chunks
  ‚úÖ DE: 2 chunks

‚úÖ Stored 8 chunks in 4 languages

üîç Cross-Lingual Search:

Query (English): 'neural networks and brain'

Results across languages:
  1. [UNKNOWN] ita. Las redes neuronales est√°n inspiradas en el cerebro hum... (score: 0.8287)
  2. [UNKNOWN] n explicite. Les r√©seaux neuronaux sont inspir√©s du cerveau ... (score: 0.8259)
  3. [UNKNOWN] onale Netze sind vom menschlichen Gehirn inspiriert.... (score: 0.7956)
  4. [UNKNOWN] Machine learning enables computers to learn from data withou... (score: 0.6511)

üí° Multi-lingual models enable:
  - Query in one language, find results in others
  - Semantic similarity across languages
  - Global knowledge base search
  - Cross-border content discovery


---

# Part 5: Production Patterns

## 13. Error Handling and Validation

Robust error handling for production systems.

In [32]:
print("üõ°Ô∏è Error Handling & Validation\n")

# Test error scenarios
print("1Ô∏è‚É£ Invalid Collection Name:")
try:
    invalid_handshake = QdrantHandshake(
        client=qdrant_client,
        collection_name="",  # Empty name
        embedding_model="sentence-transformers/all-MiniLM-L6-v2"
    )
except (ValueError, Exception) as e:
    print(f"   ‚úÖ Caught error: {type(e).__name__}")

print("\n2Ô∏è‚É£ Invalid Embedding Model:")
try:
    invalid_model = QdrantHandshake(
        client=qdrant_client,
        collection_name="test",
        embedding_model="nonexistent/model"
    )
    # This might work but fail on write
    chunks = chunker.chunk("test")
    invalid_model.write(chunks)
except Exception as e:
    print(f"   ‚úÖ Caught error: {type(e).__name__}")

print("\n3Ô∏è‚É£ Empty Chunks:")
try:
    test_handshake = QdrantHandshake(
        client=qdrant_client,
        collection_name="empty_test",
        embedding_model="sentence-transformers/all-MiniLM-L6-v2"
    )
    test_handshake.write([])  # Empty list
    print("   ‚ö†Ô∏è Empty write succeeded (might be valid)")
except Exception as e:
    print(f"   ‚úÖ Caught error: {type(e).__name__}")

print("\n4Ô∏è‚É£ Invalid Search Parameters:")
try:
    test_handshake = QdrantHandshake(
        client=qdrant_client,
        collection_name="valid_test",
        embedding_model="sentence-transformers/all-MiniLM-L6-v2"
    )
    results = test_handshake.search(query="", limit=-1)
except (ValueError, Exception) as e:
    print(f"   ‚úÖ Caught error: {type(e).__name__}")

# Robust error handling pattern
print("\n‚úÖ Production Error Handling Pattern:")
print("""
def safe_rag_operation():
    try:
        handshake = QdrantHandshake(
            url=os.getenv("QDRANT_URL", ":memory:"),
            collection_name="my_collection",
            embedding_model="sentence-transformers/all-MiniLM-L6-v2"
        )
        
        # Validate chunks before writing
        if not chunks or len(chunks) == 0:
            raise ValueError("No chunks to write")
        
        handshake.write(chunks)
        
        # Validate search results
        results = handshake.search(query=query, limit=5)
        if not results:
            logger.warning("No results found")
            return []
        
        return results
        
    except ValueError as e:
        logger.error(f"Validation error: {e}")
        raise
    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        # Implement retry logic or fallback
        raise
""")

print("\nüí° Best Practices:")
print("  - Validate inputs before operations")
print("  - Handle empty results gracefully")
print("  - Implement retry logic for transient failures")
print("  - Log errors with context")
print("  - Use environment variables for configuration")

üõ°Ô∏è Error Handling & Validation

1Ô∏è‚É£ Invalid Collection Name:


No sentence-transformers model found with name nonexistent/model. Creating a new one with mean pooling.



2Ô∏è‚É£ Invalid Embedding Model:
   ‚úÖ Caught error: ValueError

3Ô∏è‚É£ Empty Chunks:
   ‚ö†Ô∏è Empty write succeeded (might be valid)

4Ô∏è‚É£ Invalid Search Parameters:

‚úÖ Production Error Handling Pattern:

def safe_rag_operation():
    try:
        handshake = QdrantHandshake(
            url=os.getenv("QDRANT_URL", ":memory:"),
            collection_name="my_collection",
            embedding_model="sentence-transformers/all-MiniLM-L6-v2"
        )

        # Validate chunks before writing
        if not chunks or len(chunks) == 0:
            raise ValueError("No chunks to write")

        handshake.write(chunks)

        # Validate search results
        results = handshake.search(query=query, limit=5)
        if not results:
            return []

        return results

    except ValueError as e:
        logger.error(f"Validation error: {e}")
        raise
    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        # Implement retry logic or fallba

## 14. Performance Optimization

Optimize for speed and efficiency.

In [33]:
import time

print("‚ö° Performance Optimization\n")

# Create test data
perf_texts = [
    f"Performance test document {i}. This contains information about optimization, "
    f"efficiency, and speed improvements for RAG applications. Document number {i}."
    for i in range(100)
]

# Test 1: Chunk size impact
print("1Ô∏è‚É£ Chunk Size Impact:\n")

chunker_small = TokenChunker(chunk_size=50)
chunker_large = TokenChunker(chunk_size=200)

for name, chunker in [("Small (50)", chunker_small), ("Large (200)", chunker_large)]:
    start = time.time()
    chunks = []
    for text in perf_texts:
        chunks.extend(chunker.chunk(text))
    elapsed = time.time() - start
    
    print(f"   {name}: {len(chunks)} chunks in {elapsed:.3f}s")

# Test 2: Batch vs. Individual writes
print("\n2Ô∏è‚É£ Batch vs. Individual Writes:\n")

handshake_batch = QdrantHandshake(
    client=qdrant_client,
    collection_name="batch_test",
    embedding_model="sentence-transformers/all-MiniLM-L6-v2"
)

# Batch write
chunker = TokenChunker(chunk_size=100)
all_chunks = []
for text in perf_texts[:20]:
    all_chunks.extend(chunker.chunk(text))

start = time.time()
handshake_batch.write(all_chunks)
batch_time = time.time() - start

print(f"   Batch write: {len(all_chunks)} chunks in {batch_time:.3f}s")
print(f"   Rate: {len(all_chunks)/batch_time:.1f} chunks/sec")

# Test 3: Search performance
print("\n3Ô∏è‚É£ Search Performance:\n")

queries = [
    "optimization techniques",
    "performance improvements",
    "efficiency methods"
]

start = time.time()
for query in queries * 10:  # 30 searches
    results = handshake_batch.search(query=query, limit=5)
total_time = time.time() - start

print(f"   {len(queries) * 10} searches in {total_time:.3f}s")
print(f"   Average: {total_time/(len(queries)*10)*1000:.1f}ms per search")

print("\nüí° Optimization Tips:")
print("  ‚úÖ Larger chunks = fewer embeddings = faster")
print("  ‚úÖ Batch writes are more efficient")
print("  ‚úÖ Use smaller models for speed")
print("  ‚úÖ Cache frequent queries")
print("  ‚úÖ Limit search results appropriately")
print("  ‚úÖ Use metadata filters to reduce search space")
print("  ‚úÖ Consider persistent storage vs. in-memory")

‚ö° Performance Optimization

1Ô∏è‚É£ Chunk Size Impact:

   Small (50): 390 chunks in 0.002s
   Large (200): 100 chunks in 0.001s

2Ô∏è‚É£ Batch vs. Individual Writes:

   Batch write: 40 chunks in 0.269s
   Rate: 148.7 chunks/sec

3Ô∏è‚É£ Search Performance:

   30 searches in 0.196s
   Average: 6.5ms per search

üí° Optimization Tips:
  ‚úÖ Larger chunks = fewer embeddings = faster
  ‚úÖ Batch writes are more efficient
  ‚úÖ Use smaller models for speed
  ‚úÖ Cache frequent queries
  ‚úÖ Limit search results appropriately
  ‚úÖ Use metadata filters to reduce search space
  ‚úÖ Consider persistent storage vs. in-memory


## 15. Persistent Storage

Use persistent Qdrant storage instead of in-memory.

In [34]:
import os

print("üíæ Persistent Storage Options\n")

# Create persistent storage directory
storage_path = "./qdrant_storage_demo"

print("1Ô∏è‚É£ In-Memory (Ephemeral):")
print("   URL: ':memory:'")
print("   ‚úÖ Fast, perfect for testing")
print("   ‚ùå Data lost on restart")
print("   Use case: Development, notebooks, temporary data")

print("\n2Ô∏è‚É£ Persistent Local Storage:")
print(f"   Path: '{storage_path}'")
print("   ‚úÖ Data persists across restarts")
print("   ‚úÖ No separate server needed")
print("   Use case: Single-machine applications, local development")

# Example persistent handshake (commented to avoid file creation)
print("\n3Ô∏è‚É£ Local Qdrant Server:")
print("   URL: 'http://localhost:6333'")
print("   ‚úÖ Full Qdrant features")
print("   ‚úÖ Multi-client access")
print("   Use case: Development with Docker")

print("\n4Ô∏è‚É£ Qdrant Cloud:")
print("   URL: 'https://your-cluster.qdrant.io'")
print("   ‚úÖ Managed service")
print("   ‚úÖ Scalable, production-ready")
print("   ‚úÖ High availability")
print("   Use case: Production deployments")

print("\nüìù Configuration Examples:")
print("""
# In-memory
handshake = QdrantHandshake(
    url=":memory:",
    collection_name="temp_data"
)

# Persistent local
handshake = QdrantHandshake(
    path="./qdrant_storage",
    collection_name="persistent_data"
)

# Local server
handshake = QdrantHandshake(
    url="http://localhost:6333",
    collection_name="local_data"
)

# Qdrant Cloud
handshake = QdrantHandshake(
    url="https://xyz.qdrant.io",
    api_key="your-api-key",
    collection_name="production_data"
)
""")

print("\nüí° Choose based on:")
print("  - Development ‚Üí In-memory or persistent local")
print("  - Testing ‚Üí In-memory for speed")
print("  - Production ‚Üí Qdrant Cloud or self-hosted server")
print("  - Persistence needs ‚Üí Avoid in-memory for important data")

üíæ Persistent Storage Options

1Ô∏è‚É£ In-Memory (Ephemeral):
   URL: ':memory:'
   ‚úÖ Fast, perfect for testing
   ‚ùå Data lost on restart
   Use case: Development, notebooks, temporary data

2Ô∏è‚É£ Persistent Local Storage:
   Path: './qdrant_storage_demo'
   ‚úÖ Data persists across restarts
   ‚úÖ No separate server needed
   Use case: Single-machine applications, local development

3Ô∏è‚É£ Local Qdrant Server:
   URL: 'http://localhost:6333'
   ‚úÖ Full Qdrant features
   ‚úÖ Multi-client access
   Use case: Development with Docker

4Ô∏è‚É£ Qdrant Cloud:
   URL: 'https://your-cluster.qdrant.io'
   ‚úÖ Managed service
   ‚úÖ Scalable, production-ready
   ‚úÖ High availability
   Use case: Production deployments

üìù Configuration Examples:

# In-memory
handshake = QdrantHandshake(
    url=":memory:",
    collection_name="temp_data"
)

# Persistent local
handshake = QdrantHandshake(
    path="./qdrant_storage",
    collection_name="persistent_data"
)

# Local server
handshake 

## 16. Monitoring and Observability

Track system performance and behavior.

In [35]:
print("üìä Monitoring & Observability\n")

# Create monitored handshake
monitor_handshake = QdrantHandshake(
    client=qdrant_client,
    collection_name="monitored",
    embedding_model="sentence-transformers/all-MiniLM-L6-v2"
)

# Metrics tracking
metrics = {
    "chunks_written": 0,
    "searches_performed": 0,
    "avg_search_time": 0,
    "total_documents": 0
}

# Sample operation with metrics
print("üìù Ingestion Metrics:\n")

chunker = TokenChunker(chunk_size=100)
docs_to_ingest = [
    "Document about machine learning and artificial intelligence.",
    "Content on natural language processing and text analysis.",
    "Information about computer vision and image recognition."
]

for i, text in enumerate(docs_to_ingest, 1):
    chunks = chunker.chunk(text)
    monitor_handshake.write(chunks)
    metrics["chunks_written"] += len(chunks)
    metrics["total_documents"] += 1
    print(f"   Doc {i}: {len(chunks)} chunks written")

print(f"\n   Total: {metrics['chunks_written']} chunks from {metrics['total_documents']} docs")

# Search metrics
print("\nüîç Search Metrics:\n")

queries = [
    "machine learning",
    "text processing",
    "image recognition"
]

search_times = []
for query in queries:
    start = time.time()
    results = monitor_handshake.search(query=query, limit=3)
    elapsed = time.time() - start
    
    search_times.append(elapsed)
    metrics["searches_performed"] += 1
    
    print(f"   '{query}': {len(results)} results in {elapsed*1000:.1f}ms")

metrics["avg_search_time"] = sum(search_times) / len(search_times)

# Summary metrics
print("\nüìà Summary Metrics:")
print(f"   Total documents: {metrics['total_documents']}")
print(f"   Total chunks: {metrics['chunks_written']}")
print(f"   Avg chunks per doc: {metrics['chunks_written']/metrics['total_documents']:.1f}")
print(f"   Total searches: {metrics['searches_performed']}")
print(f"   Avg search time: {metrics['avg_search_time']*1000:.1f}ms")

print("\nüí° Production Monitoring:")
print("""
class MonitoredQdrantHandshake:
    def __init__(self, handshake):
        self.handshake = handshake
        self.metrics = defaultdict(int)
    
    def write(self, chunks):
        start = time.time()
        self.handshake.write(chunks)
        self.metrics['write_time'] += time.time() - start
        self.metrics['chunks_written'] += len(chunks)
    
    def search(self, query, limit):
        start = time.time()
        results = self.handshake.search(query, limit)
        self.metrics['search_time'] += time.time() - start
        self.metrics['searches'] += 1
        return results
    
    def get_metrics(self):
        return dict(self.metrics)
""")

print("\nüìä Key Metrics to Track:")
print("  - Chunks written per second")
print("  - Average search latency")
print("  - Search result relevance scores")
print("  - Collection size and growth")
print("  - Embedding generation time")
print("  - Memory usage")
print("  - Error rates")

üìä Monitoring & Observability

üìù Ingestion Metrics:

   Doc 1: 1 chunks written
   Doc 2: 1 chunks written
   Doc 3: 1 chunks written

   Total: 3 chunks from 3 docs

üîç Search Metrics:

   'machine learning': 3 results in 0.5ms
   'text processing': 3 results in 10.5ms
   'image recognition': 3 results in 8.9ms

üìà Summary Metrics:
   Total documents: 3
   Total chunks: 3
   Avg chunks per doc: 1.0
   Total searches: 3
   Avg search time: 6.7ms

üí° Production Monitoring:

class MonitoredQdrantHandshake:
    def __init__(self, handshake):
        self.handshake = handshake
        self.metrics = defaultdict(int)

    def write(self, chunks):
        start = time.time()
        self.handshake.write(chunks)
        self.metrics['write_time'] += time.time() - start
        self.metrics['chunks_written'] += len(chunks)

    def search(self, query, limit):
        start = time.time()
        results = self.handshake.search(query, limit)
        self.metrics['search_time'] += time

---

## Summary: Chonkie + Qdrant Integration

### QdrantHandshake Overview

The `QdrantHandshake` provides seamless integration between Chonkie and Qdrant vector database for RAG applications.

### Key Components

**1. Initialization Options**:
```python
# In-memory (development)
QdrantHandshake(url=":memory:", collection_name="demo")

# Persistent local
QdrantHandshake(path="./storage", collection_name="data")

# Local server
QdrantHandshake(url="http://localhost:6333", collection_name="docs")

# Qdrant Cloud
QdrantHandshake(
    url="https://xyz.qdrant.io",
    api_key="key",
    collection_name="prod"
)
```

**2. Core Operations**:
- `write(chunks)` - Store chunks with automatic embeddings
- `search(query, limit)` - Semantic search with natural language
- Metadata stored with chunks for post-search filtering

**3. Pipeline Integration**:
```python
Pipeline()
    .fetch_from("file", dir="./docs")
    .chunk_with("semantic", chunk_size=512)
    .refine_with("overlap", context_size=100)
    .store_in("qdrant",
              client=qdrant_client,
              collection_name="knowledge",
              embedding_model="all-MiniLM-L6-v2")
    .run()
```

### Storage Options

| Type | URL | Persistence | Use Case |
|------|-----|-------------|----------|
| In-Memory | `:memory:` | ‚ùå Ephemeral | Testing, notebooks |
| Local Persistent | Path string | ‚úÖ Yes | Single machine apps |
| Local Server | `http://localhost:6333` | ‚úÖ Yes | Development with Docker |
| Qdrant Cloud | `https://...qdrant.io` | ‚úÖ Yes | Production deployments |

### Embedding Models

**Lightweight** (Development):
- `sentence-transformers/all-MiniLM-L6-v2` (384-dim)
- `minishlab/potion-base-8M` (256-dim)

**High-Quality** (Production):
- `sentence-transformers/all-mpnet-base-v2` (768-dim)
- `BAAI/bge-small-en-v1.5` (384-dim)

**Specialized**:
- `paraphrase-multilingual-MiniLM-L12-v2` (Multi-lingual)

### Common Patterns

**1. Basic RAG**:
```python
# Ingestion
handshake = QdrantHandshake(url=":memory:", collection_name="kb")
chunks = chunker.chunk(text)
handshake.write(chunks)

# Retrieval
results = handshake.search("your question", limit=5)
```

**2. Complete Pipeline**:
```python
docs = (Pipeline()
    .fetch_from("file", dir="./docs")
    .chunk_with("semantic", threshold=0.75)
    .refine_with("overlap", context_size=50)
    .store_in("qdrant", collection_name="kb")
    .run())
```

**3. Metadata Storage**:
```python
# Add metadata during chunking
chunk.metadata = {"category": "technical", "author": "john"}

# Store with metadata
handshake.write(chunks)

# Search and filter results by metadata
results = handshake.search(query="deployment", limit=10)
filtered = [r for r in results if r.get('metadata', {}).get('category') == 'technical']
```

**4. Batch Operations**:
```python
# Collect all chunks
all_chunks = []
for doc in documents:
    all_chunks.extend(chunker.chunk(doc))

# Single batch write
handshake.write(all_chunks)
```

### Performance Tips

‚úÖ **Chunk Size**: Larger chunks (200-500 tokens) = fewer embeddings = faster
‚úÖ **Batch Writes**: Write multiple chunks at once instead of individually
‚úÖ **Model Selection**: Use smaller models for speed, larger for quality
‚úÖ **Metadata Filters**: Reduce search space with targeted filters
‚úÖ **Limit Results**: Request only what you need (limit=5-10)
‚úÖ **Caching**: Cache frequent queries at application level
‚úÖ **Persistent Storage**: Use for important data, in-memory for temporary

### Production Checklist

**Configuration**:
- ‚úÖ Use environment variables for URLs and keys
- ‚úÖ Choose appropriate embedding model
- ‚úÖ Set up persistent storage or cloud instance
- ‚úÖ Configure proper collection names

**Error Handling**:
- ‚úÖ Validate inputs before operations
- ‚úÖ Handle empty results gracefully
- ‚úÖ Implement retry logic for transient failures
- ‚úÖ Log errors with context

**Monitoring**:
- ‚úÖ Track ingestion rate (chunks/second)
- ‚úÖ Monitor search latency (ms)
- ‚úÖ Log relevance scores
- ‚úÖ Track collection growth
- ‚úÖ Alert on error rates

**Security**:
- ‚úÖ Secure API keys (environment variables)
- ‚úÖ Use HTTPS for cloud connections
- ‚úÖ Implement authentication
- ‚úÖ Validate user inputs

### Real-World Use Cases

**1. Document Q&A**: Technical documentation, knowledge bases
**2. Code Search**: Semantic code repository search
**3. Content Discovery**: Blog posts, articles, research papers
**4. Multi-lingual Search**: Cross-language content retrieval
**5. Customer Support**: FAQ and support documentation
**6. E-commerce**: Product search and recommendations

### Best Practices

‚úÖ **Development**: Start with in-memory Qdrant
‚úÖ **Testing**: Use small, fast embedding models
‚úÖ **Staging**: Test with persistent local storage
‚úÖ **Production**: Deploy to Qdrant Cloud or managed instance
‚úÖ **Monitoring**: Track metrics from day one
‚úÖ **Documentation**: Document collection schemas and metadata
‚úÖ **Backups**: Regular backups of persistent collections
‚úÖ **Versioning**: Version your embedding models and schemas

### Next Steps

üîó **Resources**:
- [Chonkie Documentation](https://docs.chonkie.ai/)
- [QdrantHandshake API](https://docs.chonkie.ai/oss/handshakes/qdrant-handshake)
- [Qdrant Documentation](https://qdrant.tech/documentation/)
- [Qdrant Python Client](https://python-client.qdrant.tech/)

üöÄ **Try Next**:
- Deploy to Qdrant Cloud
- Experiment with different embedding models
- Build complete RAG application
- Implement hybrid search (dense + sparse)
- Add reranking for better results
- Integrate with LLM for generation

---

## üéâ Congratulations!

You've completed the **Chonkie + Qdrant Integration** tutorial!

### What You've Learned:

‚úÖ **QdrantHandshake Basics** - Store and search chunks
‚úÖ **In-Memory Qdrant** - No Docker required for development
‚úÖ **Pipeline Integration** - Fluent API with `.store_in()`
‚úÖ **Semantic Search** - Natural language queries
‚úÖ **Metadata Filters** - Targeted search
‚úÖ **Batch Operations** - Efficient large-scale ingestion
‚úÖ **Custom Embeddings** - Choose the right model
‚úÖ **Real-World Patterns** - Q&A, code search, multi-lingual
‚úÖ **Production Ready** - Error handling, monitoring, optimization
‚úÖ **Storage Options** - In-memory, persistent, cloud

### Build Amazing RAG Applications! üöÄ