# 🚀 Agentic Search Powered by Elastic's Vector Database

## Demo: Intelligent Search Agent for Technical Documentation

This notebook demonstrates the architecture presented in the **"Agentic Search with Elastic Vector Database"** webinar by building a working system that:

1. **Ingests** technical documentation (the webinar slides themselves!) using ColPali for visual embeddings
2. **Creates** an intelligent search agent with specialized tools
3. **Demonstrates** how agents can answer complex questions by combining:
   - Text search (BM25 + Vector/Semantic search)
   - Image analysis (ColPali + Visual LLM)
   - Metadata filtering

### The Challenge
*"As a developer, you are asked to create a new search for a repository of technical documents. How would you do this?"* 

### The Solution
Build an agentic search system using:
- **Elasticsearch** as the vector database and knowledge base
- **semantic_text** fields for automatic embedding generation
- **ColPali** for multi-vector image understanding
- **CrewAI** for agent orchestration

Let's build it! 🎯

# Setup

## 📦 Step 1: Install Dependencies


In [1]:
# Pin NumPy below 2.0 first to avoid binary incompatibility with compiled packages
%pip install -q --upgrade 'numpy>=1.26.0,<2.0.0'

# Then install everything else, with versions compatible with ColPali
%pip install -q 'transformers>=4.46.1,<4.47.0' 'torch>=2.7.0,<2.8.0' torchvision
# Quote the specifier so the shell does not treat '<' and '>' as redirections
%pip install -q 'colpali-engine>=0.3.0,<0.4.0'
%pip install -q elasticsearch crewai crewai-tools langchain python-dotenv pillow requests PyMuPDF
print("✅ All packages installed!")



Note: you may need to restart the kernel to use updated packages.

## 🔧 Step 2: Import Libraries and Setup


In [2]:
import os
from elasticsearch import Elasticsearch
from crewai import Agent, Task, Crew, Process
from crewai.tools import BaseTool
from typing import Type, List, Dict, Any
from pydantic import BaseModel, Field
import json
from PIL import Image
import requests
from io import BytesIO
import torch
import numpy as np
from colpali_engine.models import ColPali, ColPaliProcessor
import fitz  # PyMuPDF
from pathlib import Path
from datetime import datetime

# For environment variables
from dotenv import load_dotenv
load_dotenv()

print("✓ All libraries imported successfully!")


✓ All libraries imported successfully!


## ⚡ Step 3: Connect to Elasticsearch Serverless

**Elasticsearch as Knowledge Base**

The presentation shows Elasticsearch as the central knowledge base that agents use for:
- Storing documents with `semantic_text` fields
- Vector embeddings for similarity search  
- Metadata for filtering and personalization
- Hybrid search (BM25 + Vector)
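
As a rough illustration of how metadata filtering can sit alongside lexical matching, the sketch below builds a `bool` query as a plain dictionary. The field names `slide_text`, `metadata.tags`, and `metadata.page_number` follow the index mapping used in this notebook; the helper name `build_filtered_query` and the example values are hypothetical.

```python
from typing import Any, Dict, List, Optional


def build_filtered_query(
    text: str,
    tags: Optional[List[str]] = None,
    max_page: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a bool query: BM25 match in `must`, metadata constraints in `filter`."""
    filters: List[Dict[str, Any]] = []
    if tags:
        # `terms` matches documents carrying at least one of the given tags
        filters.append({"terms": {"metadata.tags": tags}})
    if max_page is not None:
        # Restrict results to slides up to (and including) the given page
        filters.append({"range": {"metadata.page_number": {"lte": max_page}}})
    return {
        "query": {
            "bool": {
                "must": [{"match": {"slide_text": text}}],
                "filter": filters,  # filters narrow the result set without affecting scores
            }
        }
    }


query = build_filtered_query("hybrid search", tags=["webinar"], max_page=20)
print(query)
```

With a live cluster, the resulting dictionary could be passed to the search API; filter clauses do not contribute to relevance scoring, which makes them a natural fit for metadata constraints.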


In [3]:
# Elasticsearch Configuration
ELASTIC_URL = os.getenv("ELASTIC_URL", "https://your-elasticsearch-url.es.us-central1.gcp.cloud.es.io:443")
ELASTIC_API_KEY = os.getenv("ELASTIC_API_KEY", "your-api-key")
OPENAI_BASE_URL = os.getenv("OPENAI_BASE_URL", "https://api.openai.com/v1")
OPENAI_MODEL_NAME = os.getenv("OPENAI_MODEL_NAME", "gpt-4.1")

INDEX_NAME = "technical_docs_webinar"

# Initialize Elasticsearch client
es_client = Elasticsearch(
    ELASTIC_URL,
    api_key=ELASTIC_API_KEY,
    verify_certs=True
)

# Test connection
try:
    if es_client.ping():
        print("✓ Connected to Elasticsearch successfully!")
        info = es_client.info()
        print(f"  Cluster: {info['cluster_name']}")
        print(f"  Version: {info['version']['number']}")
    else:
        print("✗ Failed to connect to Elasticsearch")
except Exception as e:
    print(f"✗ Connection error: {e}")
    print("  💡 Make sure your .env file has ELASTIC_URL and ELASTIC_API_KEY set")


✓ Connected to Elasticsearch successfully!
  Cluster: cb769ac92f2c4811b5d8b9e04983316b
  Version: 8.11.0


## 🗂️ Step 4: Create Index with semantic_text and ColPali Mapping

As shown in the webinar (Slide 15), we use:
- **`semantic_text`** fields for automatic embedding generation
- **`rank_vectors`** for ColPali's multi-vector late interaction (the field also supports bit quantization; the mapping below keeps the default float vectors)
- **`dense_vector`** (avg_vector) for quick similarity checks
- **Metadata** for geo-tagging, filtering, and personalization (Slide 18)
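
To make the bit quantization idea concrete, here is a minimal NumPy sketch that binarizes float multi-vectors by sign and packs eight dimensions into each byte. It assumes a sign-based scheme and a 128-dimensional embedding; the helper name `to_bit_vectors` is hypothetical, and how the packed bytes are serialized for Elasticsearch is not shown here.

```python
import numpy as np


def to_bit_vectors(multi_vectors: np.ndarray) -> np.ndarray:
    """Binarize each float vector by sign and pack 8 dimensions into one byte."""
    bits = (multi_vectors > 0).astype(np.uint8)  # shape: (n_vectors, dims), values 0 or 1
    return np.packbits(bits, axis=1)             # shape: (n_vectors, dims // 8)


rng = np.random.default_rng(0)
floats = rng.standard_normal((4, 128)).astype(np.float32)  # stand-in for ColPali output
packed = to_bit_vectors(floats)
print(packed.shape)  # (4, 16)
```

Each 128-dimensional float vector (512 bytes as float32) shrinks to 16 bytes, at the cost of some retrieval precision.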


In [None]:
# Index mapping optimized for agentic search
index_mapping = {
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                "fields": {"keyword": {"type": "keyword"}}
            },
            "content": {
                "type": "semantic_text",
                "inference_id": ".elser-2-elastic"
            },
            "slide_text": {
                "type": "text"
            },
            "image_path": {
                "type": "keyword"
            },
            # ColPali multi-vector embeddings using rank_vectors (late interaction)
            "col_pali_vectors": {
                "type": "rank_vectors"
            },
            # Average vector for quick similarity checks
            "avg_vector": {
                "type": "dense_vector",
                "dims": 128,
                "index": True,
                "similarity": "dot_product"
            },
            "metadata": {
                "properties": {
                    "source": {"type": "keyword"},
                    "page_number": {"type": "integer"},
                    "timestamp": {"type": "date"},
                    "tags": {"type": "keyword"},
                    # For geo-based personalization (Slide 18)
                    "location": {"type": "geo_point"}
                }
            }
        }
    }
}

# Delete index if exists (for demo purposes)
try:
    if es_client.indices.exists(index=INDEX_NAME):
        es_client.indices.delete(index=INDEX_NAME)
        print(f"✓ Deleted existing index: {INDEX_NAME}")
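except Exception as e:
    # Report the failure instead of silently swallowing it
    print(f"✗ Could not delete existing index: {e}")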

# Create the index
try:
    es_client.indices.create(index=INDEX_NAME, body=index_mapping)
    print(f"✓ Created index: {INDEX_NAME}")
    print("  - semantic_text field for automatic embeddings")
    print("  - rank_vectors field for ColPali late interaction")
    print("  - avg_vector field for quick similarity checks")
    print("  - metadata for filtering and personalization")
except Exception as e:
    print(f"✗ Error creating index: {e}")
    print("  💡 If semantic_text fails, update the inference_id or change to 'text' type")


## 🖼️ Step 5: Initialize ColPali for Image Understanding

**ColPali - Late Interaction Multi-Vector Approach (Slides 16-17)**

Following the Elastic blog post: https://www.elastic.co/search-labs/blog/late-interaction-model-colpali-scale

The webinar explains two approaches to image search:
1. **Single Vector + Agent**: Caption the image, then embed
2. **ColPali Multi-Vector**: Generate multiple vectors capturing different aspects of the image

We're using **ColPali with late interaction** for richer visual understanding of diagrams, charts, and technical content!

**Implementation Details:**
- **rank_vectors field**: Stores multi-vector embeddings; bit quantization is available to reduce storage, though this demo keeps float vectors
- **avg_vector field**: Dense vector for quick similarity checks
- **maxSimDotProduct**: Late interaction scoring function for high-quality retrieval
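
The late-interaction score can be sketched in a few lines of NumPy: for every query vector, take the best-matching document vector by dot product, then sum those maxima. This is an illustrative re-implementation of the idea behind `maxSimDotProduct`, not the Elasticsearch source; the function name and toy vectors are made up for the example.

```python
import numpy as np


def max_sim_dot_product(query_vectors: np.ndarray, doc_vectors: np.ndarray) -> float:
    """Late-interaction score: sum over query vectors of the max dot product
    against any document vector."""
    similarities = query_vectors @ doc_vectors.T  # shape: (n_query, n_doc)
    return float(similarities.max(axis=1).sum())


query = np.array([[1.0, 0.0], [0.0, 1.0]])             # two query token vectors
doc = np.array([[0.5, 0.5], [1.0, 0.0], [0.0, 0.2]])   # three image patch vectors
print(max_sim_dot_product(query, doc))  # 1.0 (first token) + 0.5 (second token) = 1.5
```

Because every query token is matched independently, a document scores well when each part of the query finds some patch that supports it, which is what makes the approach suited to diagrams and dense slides.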

**Performance Note:** On an M3 MacBook, ColPali can use Metal Performance Shaders (MPS) for hardware acceleration when the model runs on the `mps` device. Processing the 35 slides takes ~2-3 minutes.


In [4]:
# 🔧 ColPali setup
print("Loading ColPali model...")
print("(First time: downloads ~1GB, takes 2-5 min)")

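model_name = "vidore/colpali-v1.2"

# Prefer Apple's MPS backend or CUDA when available; otherwise fall back to CPU
if torch.backends.mps.is_available():
    device = "mps"
elif torch.cuda.is_available():
    device = "cuda"
else:
    device = "cpu"

colpali_model = ColPali.from_pretrained(model_name).to(device).eval()
colpali_processor = ColPaliProcessor.from_pretrained(model_name)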

print("✓ ColPali ready!")

# Function to process images with ColPali
def process_image_with_colpali(image):
    """Get embeddings from an image using ColPali
    Returns both multi-vectors for late interaction and average vector
    
    Following the colpali-engine documentation:
    https://github.com/illuin-tech/colpali
    """
    try:
        if isinstance(image, str):
            image = Image.open(image)
        
        # Use the processor's process_images helper method
        batch_images = colpali_processor.process_images([image]).to(colpali_model.device)
        
        with torch.no_grad():
            # ColPali model returns embeddings directly as a tensor
            image_embeddings = colpali_model(**batch_images)
            
            # image_embeddings shape: (batch_size, num_patches, embedding_dim)
            # Get the first (and only) image's embeddings
            multi_vectors = image_embeddings[0].cpu().numpy()
            
            # Get average vector for quick similarity
            avg_vector = multi_vectors.mean(axis=0)
            
            # Normalize avg_vector to unit length (required for dot_product similarity)
            avg_vector_norm = np.linalg.norm(avg_vector)
            if avg_vector_norm > 0:
                avg_vector = avg_vector / avg_vector_norm
        
        return {
            "multi_vectors": multi_vectors.tolist(),
            "avg_vector": avg_vector.tolist(),
            "success": True
        }
    except Exception as e:
        return {"multi_vectors": None, "avg_vector": None, "success": False, "error": str(e)}

# Function to process text queries with ColPali
def create_col_pali_query_vectors(query):
    """Generate ColPali vectors for a text query
    
    Returns both multi-vectors (for ColPali late interaction rescoring) and 
    normalized avg_vector (for initial kNN retrieval in RRF)
    
    For late interaction retrieval, we encode the text query which will be
    matched against the multi-vector image embeddings using maxSimDotProduct
    
    Following the colpali-engine documentation:
    https://github.com/illuin-tech/colpali
    """
    try:
        # Use the processor's process_queries helper method
        batch_queries = colpali_processor.process_queries([query]).to(colpali_model.device)
        
        with torch.no_grad():
            # ColPali model returns embeddings directly as a tensor
            query_embeddings = colpali_model(**batch_queries)
            
            # query_embeddings shape: (batch_size, num_tokens, embedding_dim)
            # Get the first (and only) query's embeddings
            multi_vectors = query_embeddings[0].cpu().numpy()
            
            # Calculate normalized average vector for kNN retrieval
            avg_vector = multi_vectors.mean(axis=0)
            avg_vector_norm = np.linalg.norm(avg_vector)
            if avg_vector_norm > 0:
                avg_vector = avg_vector / avg_vector_norm
        
        return {
            "multi_vectors": multi_vectors.tolist(),
            "avg_vector": avg_vector.tolist()
        }
    except Exception as e:
        print(f"Error generating query vectors: {e}")
        return None


Loading ColPali model...
(First time: downloads ~1GB, takes 2-5 min)


`config.hidden_act` is ignored, you should use `config.hidden_activation` instead.
Gemma's activation function will be set to `gelu_pytorch_tanh`. Please, use
`config.hidden_activation` if you want to override this behaviour.
See https://github.com/huggingface/transformers/pull/29402 for more details.
Loading checkpoint shards: 100%|██████████| 2/2 [00:02<00:00,  1.25s/it]


✓ ColPali ready!


## 📚 Step 6: Extract Slides from the Webinar PDF

**Data Ingestion Pipeline (Slide 13)**

The webinar shows the ingestion process:
1. **Document** → Text + Diagrams + Metadata
2. **Chunking** → Semantic chunking of content
3. **Caption/ColPali** → Visual embeddings
4. **Enrichment** → Add metadata

We'll extract all 35 slides from `vector_webinar.pdf` as our technical documentation!


In [None]:
class PDFSlideExtractor:
    """Extract slides from PDF as images and text"""
    
    def __init__(self, pdf_path: str, output_dir: str = "slides_output"):
        self.pdf_path = pdf_path
        self.output_dir = Path(output_dir)
        self.output_dir.mkdir(exist_ok=True)
        self.doc = fitz.open(pdf_path)
        print(f"✓ Loaded PDF: {Path(pdf_path).name}")
        print(f"  Total pages: {len(self.doc)}")
    
    def extract_slide(self, page_num: int) -> Dict[str, Any]:
        """Extract a single slide as image and text"""
        page = self.doc[page_num]
        
        # Extract text
        text = page.get_text().strip()
        
        # Render page as high-quality image
        zoom = 2.0  # 2x zoom for better quality
        mat = fitz.Matrix(zoom, zoom)
        pix = page.get_pixmap(matrix=mat)
        
        # Save image
        image_filename = f"slide_{page_num + 1:03d}.png"
        image_path = self.output_dir / image_filename
        pix.save(str(image_path))
        
        # Convert to PIL Image
        img = Image.frombytes("RGB", [pix.width, pix.height], pix.samples)
        
        return {
            "page_number": page_num + 1,
            "text": text,
            "image_path": str(image_path),
            "image": img,
            "width": pix.width,
            "height": pix.height
        }
    
    def extract_all_slides(self) -> List[Dict[str, Any]]:
        """Extract all slides from the PDF"""
        slides = []
        print(f"\n⏳ Extracting slides to {self.output_dir}/")
        
        for page_num in range(len(self.doc)):
            slide_data = self.extract_slide(page_num)
            slides.append(slide_data)
            
            # Show progress
            if (page_num + 1) % 10 == 0:
                print(f"  ✓ Extracted {page_num + 1}/{len(self.doc)} slides...")
        
        print(f"✓ Extracted all {len(slides)} slides!\n")
        return slides
    
    def close(self):
        self.doc.close()

# Extract slides from the webinar PDF
pdf_path = "./vector_webinar.pdf"
extractor = PDFSlideExtractor(pdf_path)
extracted_slides = extractor.extract_all_slides()
extractor.close()

# Show sample
print("="*80)
print("📄 Sample Slide Data:")
print("="*80)
print(f"Slide 1 Title: {extracted_slides[0]['text'].split(chr(10))[0]}")
print(f"Image Path: {extracted_slides[0]['image_path']}")
print(f"Dimensions: {extracted_slides[0]['width']}x{extracted_slides[0]['height']}")


## 🔄 Step 7: Process and Ingest Slides into Elasticsearch

Now we'll process each slide with ColPali and index into Elasticsearch with:
- Text content from the slides
- Visual embeddings from ColPali
- Metadata (page number, source, timestamp)
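
For larger document sets, indexing one request per slide can become slow. A possible alternative, sketched here under the assumption that deterministic IDs are acceptable, is to generate actions for the `elasticsearch.helpers.bulk` helper. The function name `to_bulk_actions` and the `slide-NNN` ID scheme are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator


def to_bulk_actions(
    index_name: str, documents: Iterable[Dict[str, Any]]
) -> Iterator[Dict[str, Any]]:
    """Yield one bulk action per document, keyed by page number."""
    for doc in documents:
        page = doc["metadata"]["page_number"]
        yield {
            "_index": index_name,
            # Deterministic ID, so re-running ingestion overwrites instead of duplicating
            "_id": f"slide-{page:03d}",
            "_source": doc,
        }


docs = [
    {"title": "Intro", "metadata": {"page_number": 1}},
    {"title": "Why vectors?", "metadata": {"page_number": 14}},
]
actions = list(to_bulk_actions("technical_docs_webinar", docs))
print([a["_id"] for a in actions])

# With a live cluster (not executed here):
# from elasticsearch import helpers
# helpers.bulk(es_client, actions)
```

Multi-vector documents are large, so a small bulk batch size is likely a sensible starting point.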


In [None]:
def ingest_slide(slide_data: Dict[str, Any], source_name: str = "agentic_search_webinar") -> str:
    """Ingest a slide into Elasticsearch with ColPali embeddings"""
    
    page_num = slide_data['page_number']
    text = slide_data['text']
    image_path = slide_data['image_path']
    
    # Extract title from first line
    lines = [l.strip() for l in text.split('\n') if l.strip()]
    title = lines[0][:150] if lines else f"Slide {page_num}"
    
    # Process image with ColPali
    image_result = process_image_with_colpali(slide_data['image'])
    
    document = {
        "title": title,
        "content": text,
        "slide_text": text,
        "image_path": image_path,
        "metadata": {
            "source": source_name,
            "page_number": page_num,
            "width": slide_data['width'],
            "height": slide_data['height'],
            "timestamp": datetime.now().isoformat(),
            "tags": ["webinar", "vector-search", "elastic", "agents"]
        }
    }
    
    # Add ColPali embeddings if successful
    if image_result["success"]:
        document["col_pali_vectors"] = image_result["multi_vectors"]
        document["avg_vector"] = image_result["avg_vector"]
    
    # Index document
    response = es_client.index(index=INDEX_NAME, document=document)
    return response['_id']

# Ingest all slides with progress tracking
print(f"⏳ Ingesting {len(extracted_slides)} slides into Elasticsearch...")

ingested_ids = []
for i, slide_data in enumerate(extracted_slides, 1):
    doc_id = ingest_slide(slide_data)
    ingested_ids.append(doc_id)
    
    # Progress updates
    if i % 10 == 0:
        print(f"  ✓ Ingested {i}/{len(extracted_slides)} slides...")

print(f"\n{'='*80}")
print(f"✅ Successfully ingested all {len(ingested_ids)} slides!")
print(f"{'='*80}\n")

# Verify the number of documents in the index
count_result = es_client.count(index=INDEX_NAME)
print(f"📊 Documents in index: {count_result['count']}")


> **_NOTE:_** The data is now ingested and you can present the Demo App. The next steps are code samples that mirror the implementation of the Demo App.

> **_NOTE:_** We recommend going through the notebook to understand the logic, but running the test questions through the app.

# Agentic Workflow with CrewAI


<div style="background-color: #fff3cd; padding: 10px; border: 2px solid #f5c6cb; border-radius: 5px; color: #721c24;">
⚠️ <strong>PLEASE MAKE SURE TO RUN SETUP STEP 5 TO INITIALISE COLPALI</strong>
</div>

## 🔧 Step 1: Create Search Tool

**Search Tool (Slide 24)** 

"Specializes in information retrieval, query processing and content discovery"

The Search Tool uses **Hybrid Search** (Slide 23) combining:
- **BM25** (keyword/lexical search)
- **Vector Search** (semantic similarity)
- **RRF** (Reciprocal Rank Fusion) for combining results
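
Reciprocal Rank Fusion itself is simple enough to sketch in plain Python: each document earns `1 / (rank_constant + rank)` from every ranked list it appears in, and the sums decide the final order. The snippet below is an illustration of the formula rather than Elasticsearch's implementation; the slide IDs are invented, and `rank_constant=60` is assumed as the conventional default.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(
    rankings: List[List[str]], rank_constant: int = 60
) -> Dict[str, float]:
    """Fuse several ranked lists of document IDs into one score per document."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (rank_constant + rank)
    # Highest fused score first
    return dict(sorted(scores.items(), key=lambda item: item[1], reverse=True))


bm25_ranking = ["slide_14", "slide_07", "slide_23"]
vector_ranking = ["slide_23", "slide_14", "slide_02"]
fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(list(fused))  # ['slide_14', 'slide_23', 'slide_07', 'slide_02']
```

Because only ranks are used, RRF needs no score normalization between BM25 and vector retrievers, which is why it works well for combining them.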


In [5]:
class SearchToolInput(BaseModel):
    """Input schema for SearchTool"""
    query: str = Field(..., description="The search query to find relevant information in the knowledge base")

class SearchTool(BaseTool):
    name: str = "search_tool"
    description: str = (
        "Specializes in information retrieval, query processing and content discovery. "
        "Searches the Elasticsearch knowledge base using hybrid search (BM25 + Vector). "
        "Returns chunks with text content and image URLs for further analysis."
    )
    args_schema: Type[BaseModel] = SearchToolInput
    
    def _run(self, query: str) -> str:
        """Execute hybrid search with RRF and ColPali late interaction rescoring"""
        try:
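            # Generate ColPali query vectors (both multi-vectors and avg_vector)
            query_vectors = create_col_pali_query_vectors(query)
            if query_vectors is None:
                return "Error performing search: could not generate ColPali query vectors."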

            search_body = {
                "_source": ["title", "content", "slide_text", "image_path", "metadata"],
                "retriever": {
                    "rescorer": {
                        "retriever": {
                            "rrf": {
                                "retrievers": [
                                    {
                                        "standard": {
                                            "query": {
                                                "multi_match": {
                                                    "query": query,
                                                    "fields": ["title", "slide_text"],
                                                    "type": "best_fields"
                                                }
                                            }
                                        }
                                    },
                                    {
                                        "standard": {
                                            "query": {
                                                "semantic": {
                                                    "query": query,
                                                    "field": "content"
                                                }
                                            }
                                        }
                                    },
                                    {
                                        "knn": {
                                            "field": "avg_vector",
                                            "query_vector": query_vectors["avg_vector"],
                                            "k": 10,
                                            "num_candidates": 100
                                        }
                                    }
                                ],
                                "rank_window_size": 50
                            }
                        },
                        "rescore": {
                            "window_size": 10,
                            "query": {
                                "rescore_query": {
                                    "script_score": {
                                        "query": {"match_all": {}},
                                        "script": {
                                            "source": "maxSimDotProduct(params.query_vector, 'col_pali_vectors')",
                                            "params": {"query_vector": query_vectors["multi_vectors"]}
                                        }
                                    }
                                }
                            }
                        }
                    }
                },
                "size": 3
            }
            
            response = es_client.search(index=INDEX_NAME, body=search_body)
            
            # Format results
            results = []
            for hit in response['hits']['hits']:
                source = hit['_source']
                result = {
                    "title": source.get('title', ''),
                    "content": source.get('content', ''),
                    "image_path": source.get('image_path', ''),
                    "page_number": source.get('metadata', {}).get('page_number', ''),
                    "score": hit['_score']
                }
                results.append(result)
            
            if not results:
                return "No results found for the given query."
            
            # Format output
            output = f"Found {len(results)} relevant slides:\n\n"
            for i, result in enumerate(results, 1):
                output += f"**Result {i} (Slide {result['page_number']}, Score: {result['score']:.2f})**\n"
                output += f"Title: {result['title']}\n"
                output += f"Content Preview: {result['content'][:300]}...\n"
                output += f"Image: {result['image_path']}\n"
                output += "-" * 80 + "\n"
            
            return output
            
        except Exception as e:
            return f"Error performing search: {str(e)}"

# Initialize Search Tool
search_tool = SearchTool()
print("✓ Search Tool initialized (RRF hybrid search + ColPali late interaction rescoring)")


✓ Search Tool initialized (RRF hybrid search + ColPali late interaction rescoring)


## 🖼️ Step 2: Create Image Analysis Tool

This tool returns image paths in Markdown format `![Image](path)`. CrewAI's Vision LLM integration 
should automatically detect and load these image references.

**How it works:**
1. SearchTool returns text + image paths ✅
2. ImageAnalysisTool verifies the image exists and returns `![Image](path)` in Markdown ✅
3. CrewAI passes this to the Vision LLM which loads the image from the path ✅
4. Agent's Vision LLM sees the image and can answer questions about visual content ✅
5. Agent synthesizes answer combining text search results + visual understanding ✅

**Note:** This relies on CrewAI's built-in support for Markdown image syntax. If your Vision LLM 

(Slides 25-26 show example: "What is the temperature of widget XYZ after 2 minutes?")
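
Many vision-capable chat APIs also accept images inlined as base64 data URLs. As a hedged sketch of that alternative (not something this notebook does), the helper below turns raw image bytes into such a URL; the name `to_data_url` is hypothetical, and whether your model or framework accepts data URLs is an assumption to verify.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# In practice the bytes would come from Path(image_path).read_bytes();
# a short PNG signature is used here to keep the example self-contained.
print(to_data_url(b"\x89PNG\r\n\x1a\n"))
```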

In [6]:
class ImageAnalysisToolInput(BaseModel):
    """Input schema for ImageAnalysisTool"""
    image_url: str = Field(..., description="The path or URL of the image to analyze")
    question: str = Field(..., description="The specific question about the image")

class ImageAnalysisTool(BaseTool):
    name: str = "image_analysis_tool"
    description: str = (
        "Loads slide images and returns their file paths for visual analysis. "
        "Use this when you need to examine diagrams, charts, or visual elements in the slides."
    )
    args_schema: Type[BaseModel] = ImageAnalysisToolInput
    
    def _run(self, image_url: str, question: str) -> str:
        """Return image path for Vision LLM to load"""
        try:
            # Verify image exists
            if not image_url.startswith('http'):
                if not Path(image_url).exists():
                    return f"Error: Image not found at {image_url}"
            
            # Return a Markdown image reference so the LLM framework can load the image
            return f"![Image]({image_url})"
            
        except Exception as e:
            return f"Error: {str(e)}"

# Initialize Image Analysis Tool
image_analysis_tool = ImageAnalysisTool()
print("✓ Image Analysis Tool initialized (Vision LLM)")


✓ Image Analysis Tool initialized (Vision LLM)


## 🤖 Step 3: Create the Intelligent Search Agent

**Agent Prompt (Slide 28)** - Using the exact prompt from the webinar!

The agent orchestrates the workflow:
1. **Initial Analysis**: Understand the user's question
2. **Search Execution**: Use search_tool to query the knowledge base
3. **Result Processing**: Examine search results for text and image URLs
4. **Image Analysis**: If images found, use image_analysis_tool
5. **Response Synthesis**: Combine all findings
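
The five steps above can be sketched as a plain function with the tools passed in as callables. In the real system the LLM decides when to call each tool and writes the final answer; this deterministic version (with the hypothetical name `run_workflow` and stub tools) only shows the data flow, and it assumes the search output lists images on lines of the form `Image: <path>`.

```python
import re
from typing import Callable, List


def run_workflow(
    question: str,
    search_fn: Callable[[str], str],
    analyze_image_fn: Callable[[str, str], str],
    max_images: int = 3,
) -> str:
    # Steps 1-2: pass the user's question to the search tool
    search_output = search_fn(question)
    # Step 3: pull image paths out of the search results
    image_paths: List[str] = re.findall(r"^Image: (\S+)$", search_output, flags=re.MULTILINE)
    # Step 4: analyze only the top few images
    visual_notes = [analyze_image_fn(path, question) for path in image_paths[:max_images]]
    # Step 5: combine findings (an LLM performs this synthesis in the real agent)
    return "\n".join([search_output.strip(), *visual_notes])


def fake_search(query: str) -> str:
    return "Title: Why vectors?\nImage: slides_output/slide_014.png\n"


def fake_analyze(path: str, question: str) -> str:
    return f"[visual] {path}"


print(run_workflow("Why vectors?", fake_search, fake_analyze))
```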


In [7]:
# Create the Search Agent - using the exact prompt from Slide 28!
from crewai import LLM

# Configure LLM from environment variables (.env file)
agent_llm = LLM(
    model=OPENAI_MODEL_NAME,
    base_url=OPENAI_BASE_URL,
    api_key=os.getenv("OPENAI_API_KEY")
)

search_agent =  Agent(
        role="Intelligent Search Agent",
        goal="Process user questions by searching knowledge bases and analyzing slide images for visual insights",
        backstory="""
        You are an intelligent search agent that combines textual search with visual analysis. 
        You understand that slides often contain important diagrams, charts, and visual layouts 
        that provide crucial context beyond just the text.
        
        Your workflow:
        1. Use search_tool to find relevant slides
        2. Use image_analysis_tool on the top results to extract visual insights
        3. Combine both text and visual information in your answer
        
        Visual analysis helps you understand diagrams, architecture, workflows, and data 
        visualizations that are critical to answering questions accurately.
        """,
        tools=[search_tool, image_analysis_tool],
        llm=agent_llm,
        verbose=True,
        allow_delegation=False
)

print("✓ Intelligent Search Agent created")
print(f"  Model: {OPENAI_MODEL_NAME} via {OPENAI_BASE_URL}")
print("  Tools: search_tool, image_analysis_tool")
print("  Workflow: Search → Analyze Images → Synthesize Response")


✓ Intelligent Search Agent created
  Model: gpt-4.1 via https://litellm-proxy-service-1059491012611.us-central1.run.app/v1
  Tools: search_tool, image_analysis_tool
  Workflow: Search → Analyze Images → Synthesize Response


## 🎯 Step 4: Helper Function to Ask Questions

Create a helper function that creates tasks and executes the agent workflow.


In [8]:
def ask_agent(question: str):
    """Ask a question to the agent - simplified version"""
    
    search_task = Task(
        description=f"""
        Answer the following user question by searching the knowledge base:
        
        **Question:** {question}
        
        **Instructions:**
        1. Use search_tool to find relevant slides
        2. For the top 2-3 most relevant slides, use image_analysis_tool to load their images 
        3. Synthesize all information (text + visual analysis) into a clear, comprehensive answer
        4. Start with your answer immediately - DO NOT dump raw slide content
        5. Reference slide numbers inline (e.g., "According to Slide 14...")
        6. Keep your answer focused and well-organized with bullet points or paragraphs
        
        **Special Instructions for Architecture/Diagram Questions:**
        - ONLY create a Mermaid diagram if the user's question EXPLICITLY asks for:
          * "architecture" or "system architecture"
          * "workflow" or "process flow"
          * "system design" or "how the system works"
          * A "diagram" or "visualization" of a system/process
        - If creating a diagram:
          * Place the diagram FIRST, before your explanation
          * Use this exact format:
            ```mermaid
            graph TD
                A[Component] --> B[Another Component]
            ```
          * Then provide your textual explanation below the diagram
        - For other questions (facts, definitions, specific values), just provide a text answer WITHOUT diagrams
        
        **Format:**
        - Start with a direct answer to the question (and Mermaid diagram if applicable)
        - Use inline citations like "(Slide 14)" or "As shown in Slide 7..."
        - DO NOT copy-paste large blocks of raw slide text
        - End with a brief "References:" line listing the slides used
        """,
        expected_output="A clear, synthesized answer combining text and visual insights, with inline slide references. Include Mermaid diagrams ONLY if the user explicitly asks for architecture, workflow, or system design visualization.",
        agent=search_agent
    )
    
    crew = Crew(
        agents=[search_agent],
        tasks=[search_task],
        process=Process.sequential,
        multimodal=True,
        verbose=True
    )
    
    result = crew.kickoff()
    return result


print("✓ Question-answering function ready!")


✓ Question-answering function ready!


## 🎬 Step 5: Demo - Ask Questions!

Now let's ask questions about the webinar content using our agentic search system!

These examples are based on topics covered in the presentation.


### Example 1: Understanding ColPali

**Topic covered in Slides 16-17:** ColPali's late-interaction multi-vector approach


In [9]:
question1 = "How does ColPali work and why use it for image search?"

print("\n" + "="*100)
print(f"❓ QUESTION 1: {question1}")
print("="*100 + "\n")

answer1 = ask_agent(question1)

print("\n" + "="*100)
print("💡 ANSWER:")
print("="*100)
print(answer1)
print("="*100)



❓ QUESTION 1: How does ColPali work and why use it for image search?




💡 ANSWER:
ColPali is a multi-vector search method that significantly improves image search accuracy and relevance by leveraging multiple feature representations in parallel, rather than relying on a single embedding or caption. Here’s how it works and why it’s valuable for image search:

- **How ColPali Works:**
  - Instead of representing each image (or document) with a single vector embedding, ColPali creates and stores **multiple vectors** for each item, capturing diverse aspects such as captions, visual features, and semantic chunks (Slide 17, Slide 26).
  - During indexing, ColPali processes various data types (e.g., text, diagrams, metadata) through feature extraction agents to generate these distinct embeddings, which are all stored and associated with the document in the search system (Slide 13).
  - At query time, ColPali compares the input with all of the stored vectors for each image/document, vastly increasing the chances that relevant information is retrieved even if the 

## 🎉 Summary & Key Takeaways

### What We Built

This notebook implemented the **Agentic Search Architecture** from the webinar:

1. **✅ Elasticsearch Knowledge Base**
   - `semantic_text` fields for automatic embeddings
   - `rank_vectors` for ColPali multi-vector image embeddings, plus a `dense_vector` average for fast kNN retrieval
   - Metadata for filtering and personalization
   - 35 technical slides fully indexed

2. **✅ ColPali Integration (Late Interaction)**
   - Multi-vector approach with `rank_vectors` field (bit quantization available as an option)
   - `maxSimDotProduct` for late interaction scoring
   - Average vector for quick similarity checks
   - Captures diagrams, charts, and visual layouts
   - More comprehensive than single-vector captioning

3. **✅ Intelligent Search Agent**
   - Hybrid Search (BM25 + Vector)
   - Search Tool for information retrieval
   - Image Analysis Tool for visual understanding
   - Automated workflow: Search → Analyze → Synthesize

4. **✅ Working Demo**
   - Questions answered using actual webinar content
   - Combines text and visual information
   - Cites specific slides in responses

### Resources

- **Webinar Slides**: The 35 slides in `vector_webinar.pdf`
- **Elastic Blog**: [ColPali Late Interaction at Scale](https://www.elastic.co/search-labs/blog/late-interaction-model-colpali-scale)
- **Elastic Docs**: [Semantic Text Field](https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-text.html)
- **ColPali Paper**: *ColPali: Efficient Document Retrieval with Vision Language Models* (multi-vector visual document retrieval)
- **CrewAI**: [CrewAI Documentation](https://docs.crewai.com/)

---

**🎯 You now have a working agentic search system powered by Elastic's vector database!**
