# Video/Document Continuous Ingestion from Object Storage

## Purpose

This notebook demonstrates an **automated document and video ingestion pipeline** that:

1. Monitors object storage (MinIO) for new uploads via Kafka events
2. Routes files to appropriate AI services based on file type
3. Indexes documents with the NVIDIA RAG Blueprint for semantic search
4. Analyzes videos with NVIDIA VSS (Video Search & Summarization)
5. Enables a RAG agent to run semantic search and contextual Q&A over all ingested content
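
Step 2 can be pictured as a lookup from file extension to target service. The sketch below is illustrative only: the extension sets and the `route_for` name are assumptions, not the consumer's actual code in `examples/rag_event_ingest/kafka_consumer`.

```python
import os

# Hypothetical extension sets; the real consumer may support a different list.
DOCUMENT_EXTS = {".pdf", ".docx", ".pptx", ".txt"}
VIDEO_EXTS = {".mp4", ".mkv", ".mov", ".avi"}

def route_for(object_key: str) -> str:
    """Return 'rag', 'vss', or 'skip' for an object key, based on its extension."""
    ext = os.path.splitext(object_key)[1].lower()
    if ext in DOCUMENT_EXTS:
        return "rag"   # documents go to the RAG ingestor
    if ext in VIDEO_EXTS:
        return "vss"   # videos go to VSS for analysis first
    return "skip"      # anything else is ignored

for key in ["report.PDF", "highlights.mp4", "archive.zip"]:
    print(key, "->", route_for(key))
```

Lower-casing the extension keeps the routing stable when uploads use mixed-case names.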

## What Gets Deployed

1. **RAG Stack** - Document indexing, vector search, and AI-powered Q&A (NIMs, Milvus, Ingestor)
2. **VSS Stack** - Video understanding and summarization (VLM, LLM NIMs, VSS Engine)
3. **AIDP Stack** - Event-driven ingestion pipeline (Kafka, MinIO, Consumer)

> **Note**: This notebook is part of the NVIDIA RAG Blueprint. All source code, sample data,
> and deployment configs are included in `examples/rag_event_ingest/`.


## Prerequisites

### Hardware
- **GPU**: 2+ NVIDIA GPUs with 24GB+ VRAM each
- **RAM**: 64GB+ system memory
- **Disk**: 200GB+ free space

### Software (must be pre-installed)
- Ubuntu 22.04 or later
- Docker 24.0+ with Docker Compose v2
- NVIDIA Driver 570+
- NVIDIA Container Toolkit
- Git and Git LFS

### API Keys

| Key | Purpose | How to Get |
|-----|---------|------------|
| `NGC_API_KEY` | Docker login, NIM deployments | [NGC Portal](https://org.ngc.nvidia.com/setup/api-keys) |
| `HF_TOKEN` | Download VSS models (optional) | [HuggingFace Tokens](https://huggingface.co/settings/tokens) |


## Table of Contents

| Section | Description |
|---------|-------------|
| **Setup** | Clone repo, install deps, set API keys, load helpers |
| **Deploy RAG** | NIMs, Vector DB, Ingestor, RAG Server |
| **Deploy VSS** | Clone VSS, deploy NIMs and VLM |
| **Deploy AIDP** | Kafka, MinIO, Consumer |
| **Testing** | Upload documents & videos, query RAG |
| **Clean Up** | Stop services, clean data |


## References

- **RAG Blueprint**: [NVIDIA RAG Documentation](https://docs.nvidia.com/ai-enterprise/rag-blueprints/latest/index.html)
- **VSS**: [Video Search & Summarization Documentation](https://docs.nvidia.com/vss/latest/index.html)
- **NIM**: [NVIDIA NIM Documentation](https://docs.nvidia.com/nim/index.html)


# Setup

Clone the repository, configure API keys, and load helper functions.



## 1. Clone Repository

Clone the RAG Blueprint repo (includes consumer code and sample data).



In [24]:
# Clone repo and install dependencies
import subprocess, sys, os

RAG_REPO_DIR = os.path.expanduser("~/rag")
RAG_BRANCH = "minh/aidp-notebook"

if os.path.exists(os.path.join(RAG_REPO_DIR, "examples/rag_event_ingest")):
    print(f"[OK] RAG repo already cloned: {RAG_REPO_DIR}")
else:
    print(f"Cloning branch {RAG_BRANCH}...")
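    # check=True stops the cell if the clone or LFS pull fails, instead of printing [OK] anyway
    subprocess.run(
        f"git clone -b {RAG_BRANCH} https://github.com/NVIDIA-AI-Blueprints/rag.git {RAG_REPO_DIR}",
        shell=True, check=True,
    )
    subprocess.run("git lfs pull", shell=True, cwd=RAG_REPO_DIR, check=True)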
    print(f"[OK] Cloned to {RAG_REPO_DIR}")

# Verify structure
for path in ["deploy/compose", "examples/rag_event_ingest/kafka_consumer", "examples/rag_event_ingest/data"]:
    status = "[OK]" if os.path.exists(os.path.join(RAG_REPO_DIR, path)) else "[MISSING]"
    print(f"  {status} {path}")

# Install dependencies
subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", "minio", "aiohttp", "requests", "python-dotenv"])
print("[OK] Dependencies installed")


## 2. Set API Keys

Set the NGC API key (required) and the Hugging Face token (optional) used for NIM deployments and model downloads.



In [None]:
import getpass

def set_api_key(env_var: str, prompt: str, required: bool = True):
    if os.environ.get(env_var):
        print(f"  [OK] {env_var} already set")  # do not echo any part of the secret
        return True
    key = getpass.getpass(prompt)
    if key:
        os.environ[env_var] = key
        print(f"  [OK] {env_var} set")
        return True
    if required:
        print(f"  [ERROR] {env_var} is required")
        return False
    print(f"  [SKIP] {env_var} (optional)")
    return True

set_api_key("NGC_API_KEY", "Enter NGC_API_KEY (starts with 'nvapi-'): ", required=True)
set_api_key("HF_TOKEN", "Enter HF_TOKEN (optional, press Enter to skip): ", required=False)


## 3. Helper Functions

Shared utilities for deployment, file upload, status checks, and RAG queries.



In [None]:
import os, sys, json, re, subprocess, time, socket, asyncio
import aiohttp, requests
from typing import List, Optional, Dict

try:
    from minio import Minio
    from minio.error import S3Error
except ImportError:
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", "minio"])
    from minio import Minio
    from minio.error import S3Error

# =============================================================================
# CONFIGURATION
# =============================================================================

# Paths relative to RAG repo root
RAG_REPO_DIR = os.path.expanduser("~/rag")
EXAMPLE_DIR = os.path.join(RAG_REPO_DIR, "examples/rag_event_ingest")
AIDP_COMPOSE_FILE = os.path.join(EXAMPLE_DIR, "deploy/docker-compose.yaml")
DATA_DIR = os.path.join(EXAMPLE_DIR, "data")
RAG_SERVER_URL = "http://localhost:8081"
INGESTOR_URL = "http://localhost:8082"

VSS_DIR = os.path.expanduser("~/video-search-and-summarization")
VSS_UI_PORT = 9110
VSS_API_PORT = 8110
VSS_LLM_PORT = 8107
VSS_EMBED_PORT = 8106
VSS_RERANK_PORT = 8105
LOCAL_NIM_CACHE = os.path.expanduser("~/.cache/nim")

MINIO_ENDPOINT = "localhost:9201"
MINIO_ACCESS_KEY = "minioadmin"
MINIO_SECRET_KEY = "minioadmin"
MINIO_BUCKET = "aidp-bucket"
MINIO_COLLECTION = "aidp_bucket"
MINIO_CONSOLE_PORT = 9211

# =============================================================================
# SHARED UTILITIES
# =============================================================================

def run_command(cmd: str, capture: bool = False) -> Optional[str]:
    """Execute a shell command and print it."""
    print(f"$ {cmd}")
    result = subprocess.run(cmd, shell=True, capture_output=capture, text=True)
    return result.stdout if capture else None

def get_host_ip() -> str:
    """Get host IP address for external access URLs."""
    try:
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.connect(("8.8.8.8", 80))
        ip = s.getsockname()[0]
        s.close()
        return ip
    except OSError:
        return "localhost"

def get_minio_client() -> Minio:
    """Create MinIO client for AIDP bucket operations."""
    return Minio(MINIO_ENDPOINT, access_key=MINIO_ACCESS_KEY, secret_key=MINIO_SECRET_KEY, secure=False)

def upload_file(local_path: str, object_name: Optional[str] = None) -> bool:
    """Upload a local file to MinIO AIDP bucket."""
    if not os.path.exists(local_path):
        print(f"[ERROR] File not found: {local_path}")
        return False
    obj = object_name or os.path.basename(local_path)
    try:
        client = get_minio_client()
        if not client.bucket_exists(MINIO_BUCKET):
            client.make_bucket(MINIO_BUCKET)
        client.fput_object(MINIO_BUCKET, obj, local_path)
        print(f"[OK] Uploaded: {obj}")
        return True
    except S3Error as e:
        print(f"[ERROR] {e}")
        return False

def get_consumer_logs(lines: int = 30) -> None:
    """Show recent Kafka consumer logs."""
    run_command(f"docker logs kafka-consumer --tail {lines}")

async def _parse_sse_stream(resp) -> str:
    """Parse Server-Sent Events stream into concatenated answer text."""
    chunks, buf = [], ""
    async for raw in resp.content.iter_any():
        buf += raw.decode("utf-8", errors="ignore")
        while "\n" in buf:
            line, buf = buf.split("\n", 1)
            line = line.strip()
            if not line.startswith("data: "):
                continue
            data_str = line[6:]
            if data_str == "[DONE]":
                return "".join(chunks)
            try:
                delta = json.loads(data_str).get("choices", [{}])[0].get("delta", {})
                if delta.get("content"):
                    chunks.append(delta["content"])
            except json.JSONDecodeError:
                pass
    return "".join(chunks)

async def query_rag(question: str, collection: Optional[str] = None) -> Optional[str]:
    """Query RAG system and print the answer."""
    coll = collection or MINIO_COLLECTION
    print(f"Q: {question}\nCollection: {coll}\n" + "-" * 40)

    payload = {
        "messages": [{"role": "user", "content": question}],
        "use_knowledge_base": True,
        "collection_name": coll,
    }
    try:
        async with aiohttp.ClientSession() as session:
            async with session.post(
                f"{RAG_SERVER_URL}/generate", json=payload,
                timeout=aiohttp.ClientTimeout(total=120),
            ) as resp:
                if resp.status != 200:
                    print(f"[ERROR] {resp.status}: {await resp.text()}")
                    return None

                content_type = resp.headers.get("Content-Type", "")
                if "text/event-stream" in content_type:
                    answer = await _parse_sse_stream(resp)
                else:
                    data = await resp.json()
                    answer = data.get("choices", [{}])[0].get("message", {}).get("content", "No answer")

                print(f"Answer: {answer}")
                return answer
    except aiohttp.ClientError as e:
        print(f"[ERROR] {e}")
        return None

print(f"[OK] Helpers loaded | Host IP: {get_host_ip()}")


Helper functions loaded!
Host IP: 10.86.1.245


# Deploy RAG

Deploy the RAG stack: NIMs (LLM, Embedding, Reranker), Milvus vector database, Ingestor server, and RAG server.


In [57]:
ngc_key = os.environ.get("NGC_API_KEY")
if not ngc_key:
    raise RuntimeError("NGC_API_KEY not set! Run the API keys cell first.")

os.chdir(RAG_REPO_DIR)
os.environ["USERID"] = f"{os.getuid()}:{os.getgid()}"

# Login to nvcr.io (the key is passed on stdin so it stays out of the process list)
subprocess.run(["docker", "login", "nvcr.io", "-u", "$oauthtoken", "--password-stdin"],
               input=ngc_key, capture_output=True, text=True)

# Deploy components sequentially
for label, compose_file in [
    ("NIMs",      "deploy/compose/nims.yaml"),
    ("Vector DB", "deploy/compose/vectordb.yaml"),
]:
    print(f"Deploying {label}...")
    run_command(f"docker compose -f {compose_file} up -d")

print("Waiting 30s for Milvus...")
time.sleep(30)

for label, compose_file in [
    ("Ingestor", "deploy/compose/docker-compose-ingestor-server.yaml"),
    ("RAG Server", "deploy/compose/docker-compose-rag-server.yaml"),
]:
    print(f"Deploying {label}...")
    run_command(f"docker compose -f {compose_file} up -d")

ip = get_host_ip()
print(f"\nRAG deployed: http://{ip}:8081 (server) | http://{ip}:8082 (ingestor) | http://{ip}:8090 (UI)")
print("Wait 2-5 minutes for NIMs to load models, then run the status check cell.")


DEPLOYING RAG STACK
USERID: 2833:2833

[1/4] Logging in to nvcr.io...


[OK] Logged in

[2/4] Deploying NIMs...
$ docker compose -f deploy/compose/nims.yaml up -d


 Container nemoretriever-embedding-ms Creating 
 Container nemoretriever-ranking-ms Creating 
 Container compose-page-elements-1 Creating 
 Container compose-table-structure-1 Creating 
 Container nim-llm-ms Creating 
 Container compose-paddle-1 Creating 
 Container compose-graphic-elements-1 Creating 
 Container compose-paddle-1 Created 
 Container compose-page-elements-1 Created 
 Container nemoretriever-embedding-ms Created 
 Container compose-table-structure-1 Created 
 Container compose-graphic-elements-1 Created 
 Container nemoretriever-ranking-ms Created 
 Container nim-llm-ms Created 
 Container compose-graphic-elements-1 Starting 
 Container compose-paddle-1 Starting 
 Container compose-page-elements-1 Starting 
 Container nemoretriever-embedding-ms Starting 
 Container nemoretriever-ranking-ms Starting 
 Container nim-llm-ms Starting 
 Container compose-table-structure-1 Starting 
 Container nemoretriever-ranking-ms Started 
 Container nim-llm-ms Started 
 Container nemoretr


[3/4] Deploying Vector DB...
$ docker compose -f deploy/compose/vectordb.yaml up -d


 Container milvus-minio Creating 
 Container milvus-etcd Creating 
 Container milvus-etcd Created 
 Container milvus-minio Created 
 Container milvus-standalone Creating 
 Container milvus-standalone Created 
 Container milvus-minio Starting 
 Container milvus-etcd Starting 
 Container milvus-etcd Started 
 Container milvus-minio Started 
 Container milvus-standalone Starting 
 Container milvus-standalone Started 


    Waiting for Milvus (30s)...

[4/4] Deploying Ingestor and RAG Server...
$ docker compose -f deploy/compose/docker-compose-ingestor-server.yaml up -d


 Container compose-redis-1 Creating 
 Container ingestor-server Creating 
 Container compose-nv-ingest-ms-runtime-1 Creating 
 Container ingestor-server Created 
 Container compose-redis-1 Created 
 Container compose-nv-ingest-ms-runtime-1 Created 
 Container ingestor-server Starting 
 Container compose-redis-1 Starting 
 Container compose-nv-ingest-ms-runtime-1 Starting 
 Container compose-nv-ingest-ms-runtime-1 Started 
 Container ingestor-server Started 
 Container compose-redis-1 Started 


$ docker compose -f deploy/compose/docker-compose-rag-server.yaml up -d


 Container rag-server Creating 
 Container rag-server Created 
 Container rag-frontend Creating 
 Container rag-frontend Created 
 Container rag-server Starting 



RAG DEPLOYED!
  RAG Server:      http://10.86.1.245:8081
  Ingestor Server: http://10.86.1.245:8082
  RAG Frontend:    http://10.86.1.245:8090


 Container rag-server Started 
 Container rag-frontend Starting 
 Container rag-frontend Started 


Verify RAG services are healthy. Wait 2-5 minutes for NIMs to load models.
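
Instead of waiting a fixed interval and re-running the check by hand, readiness can be polled. The helper below is a minimal sketch, not part of the repository; the name `wait_until` and its default timings are assumptions.

```python
import time
from typing import Callable

def wait_until(probe: Callable[[], bool], timeout_s: float = 300.0, interval_s: float = 5.0) -> bool:
    """Call probe() repeatedly until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            if probe():
                return True
        except Exception:
            pass  # a failing probe just means "not ready yet"
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)

# Demo with a fake probe that succeeds on the third attempt.
attempts = iter([False, False, True])
print(wait_until(lambda: next(attempts), timeout_s=5, interval_s=0.01))
```

In this notebook the probe could be something like `lambda: requests.get("http://localhost:8081/health", timeout=5).status_code == 200`.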


In [59]:
print("Wait 2-5 minutes for services to become healthy.")
print("Run this cell again after waiting.\n")

ip = get_host_ip()
for name, port, path in [
    ("RAG Server", 8081, "/health"), ("Ingestor", 8082, "/health"),
    ("Frontend", 8090, "/"), ("Milvus", 19530, "/v1/vector/collections"),
]:
    try:
        s = "[OK]" if requests.get(f"http://localhost:{port}{path}", timeout=10).status_code == 200 else "[WARN]"
    except requests.ConnectionError:
        s = "[DOWN]"
    except requests.Timeout:
        s = "[TIMEOUT]"
    print(f"  {s} {name}: http://{ip}:{port}")
run_command("docker ps --format 'table {{.Names}}\t{{.Status}}' | grep -E '(rag|milvus|ingestor|nim|NAMES)'")


Wait 2-5 minutes for services to become healthy.
Run this cell again after waiting.

RAG Services Status:
  [OK] RAG Server: http://10.86.1.245:8081
  [OK] Ingestor Server: http://10.86.1.245:8082
  [OK] RAG Frontend: http://10.86.1.245:8090
  [OK] Milvus: http://10.86.1.245:19530

Container Status:
$ docker ps --format 'table {{.Names}}	{{.Status}}' | grep -E '(rag|milvus|ingestor|nim-llm|NAMES)'


NAMES                                             STATUS
rag-frontend                                      Up 50 seconds
rag-server                                        Up 51 seconds
ingestor-server                                   Up 51 seconds
milvus-standalone                                 Up About a minute
milvus-etcd                                       Up About a minute (healthy)
milvus-minio                                      Up About a minute (healthy)
nim-llm-ms                                        Up About a minute (healthy)
local_deployment_single_gpu-milvus-standalone-1   Up 5 minutes (healthy)


# Deploy VSS

Deploy the VSS stack: NIMs (LLM, Embedding, Reranker) and VLM for video analysis.
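
The VSS stack publishes several host ports (9110, 8110, 8107, 8106, and 8105 in this notebook's configuration), so a quick pre-flight check can catch conflicts before any container starts. This is a sketch under assumptions: `port_is_free` is a hypothetical helper, and a successful bind is only a hint, not a guarantee that Docker can publish the port.

```python
import socket

def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # SO_REUSEADDR avoids false negatives from sockets lingering in TIME_WAIT.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            s.bind((host, port))
            return True
        except OSError:
            return False

# Ports taken from the configuration cell of this notebook.
for name, port in [("VSS UI", 9110), ("VSS API", 8110), ("LLM", 8107),
                   ("Embedding", 8106), ("Reranker", 8105)]:
    state = "free" if port_is_free(port) else "IN USE"
    print(f"  {name:<10} {port}: {state}")
```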


In [38]:
# VSS deployment configuration
VSS_REPO_URL = "https://github.com/NVIDIA-AI-Blueprints/video-search-and-summarization.git"
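# GPU indices used on the original test host; change them to indices that exist on your machine (see nvidia-smi)
VSS_GPU_DEVICE = 4       # GPU for NIMs (LLM, Embedding, Reranker)
VSS_VLM_GPU_DEVICE = 5   # GPU for VLM (via-server with Cosmos-Reason2)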

NIM_IMAGES = {
    "vss-llm":       ("nvcr.io/nim/meta/llama-3.1-8b-instruct:1.12.0",      VSS_LLM_PORT),
    "vss-embedding": ("nvcr.io/nim/nvidia/llama-3.2-nv-embedqa-1b-v2:1.9.0", VSS_EMBED_PORT),
    "vss-reranker":  ("nvcr.io/nim/nvidia/llama-3.2-nv-rerankqa-1b-v2:1.7.0", VSS_RERANK_PORT),
}

ngc_key = os.environ.get("NGC_API_KEY", "")
hf_token = os.environ.get("HF_TOKEN", "")
if not ngc_key:
    raise RuntimeError("NGC_API_KEY not set!")

# Docker login
subprocess.run(f"echo {ngc_key} | docker login nvcr.io -u '$oauthtoken' --password-stdin",
               shell=True, capture_output=True, text=True, executable="/bin/bash")

# Clone VSS repo
if not os.path.exists(VSS_DIR):
    print(f"Cloning {VSS_REPO_URL}...")
    subprocess.run(f"git clone {VSS_REPO_URL} {VSS_DIR}", shell=True)
else:
    print(f"[OK] VSS repo exists: {VSS_DIR}")

# Deploy NIM containers (all on same GPU)
os.makedirs(LOCAL_NIM_CACHE, exist_ok=True)
for name, (image, port) in NIM_IMAGES.items():
    subprocess.run(f"docker rm -f {name} 2>/dev/null", shell=True, capture_output=True)
    cmd = f"""docker run -d --name {name} \
        -u $(id -u) --gpus '"device={VSS_GPU_DEVICE}"' --shm-size=16GB \
        --network nvidia-rag -e NGC_API_KEY={ngc_key} \
        -v "{LOCAL_NIM_CACHE}:/opt/nim/.cache" \
        -p {port}:8000 -e NIM_LOW_MEMORY_MODE=1 -e NIM_RELAX_MEM_CONSTRAINTS=1 \
        {image}"""
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True, executable="/bin/bash")
    status = "[OK]" if result.returncode == 0 else "[ERROR]"
    print(f"  {status} {name} -> port {port}")

# Deploy VSS application (VLM on separate GPU)
vss_deploy_dir = f"{VSS_DIR}/deploy/docker/local_deployment_single_gpu"
env_content = f"""NGC_API_KEY={ngc_key}
HF_TOKEN={hf_token}
VIA_IMAGE=nvcr.io/nvidia/blueprint/vss-engine:2.4.1
FRONTEND_PORT={VSS_UI_PORT}
BACKEND_PORT={VSS_API_PORT}
MILVUS_DB_HTTP_PORT=19091
MILVUS_DB_GRPC_PORT=29530
MINIO_PORT=9002
MINIO_WEBUI_PORT=9003
GRAPH_DB_USERNAME=neo4j
GRAPH_DB_PASSWORD=password
ARANGO_DB_USERNAME=arangodb
ARANGO_DB_PASSWORD=password
CA_RAG_CONFIG=./config.yaml
GUARDRAILS_CONFIG=./guardrails
NVIDIA_VISIBLE_DEVICES={VSS_VLM_GPU_DEVICE}
VLM_MODEL_TO_USE=cosmos-reason2
MODEL_PATH=git:https://huggingface.co/nvidia/Cosmos-Reason2-8B
VLLM_GPU_MEMORY_UTILIZATION=0.4
VLM_MAX_MODEL_LEN=20480
DISABLE_GUARDRAILS=true
DISABLE_CV_PIPELINE=true
ENABLE_AUDIO=false
"""
with open(f"{vss_deploy_dir}/.env", "w") as f:
    f.write(env_content)

# Patch config.yaml to use our NIM ports
config_file = f"{vss_deploy_dir}/config.yaml"
if os.path.exists(config_file):
    with open(config_file) as f:
        cfg = f.read()
    cfg = re.sub(r":8007/v1", f":{VSS_LLM_PORT}/v1", cfg)
    cfg = re.sub(r":8006/v1", f":{VSS_EMBED_PORT}/v1", cfg)
    cfg = re.sub(r":8005/v1", f":{VSS_RERANK_PORT}/v1", cfg)
    with open(config_file, "w") as f:
        f.write(cfg)

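cmd = f"cd {vss_deploy_dir} && set -a && source .env && set +a && docker compose up -d"
result = subprocess.run(cmd, shell=True, capture_output=True, text=True, executable="/bin/bash")

ip = get_host_ip()
if result.returncode == 0:
    print(f"\nVSS deployed: http://{ip}:{VSS_UI_PORT} (UI) | http://{ip}:{VSS_API_PORT} (API)")
    print("Wait 2-5 minutes for NIMs to load models, then run the status check cell.")
else:
    # docker compose writes progress to stderr, so judge success by the exit code only
    print(f"[ERROR] docker compose exited with code {result.returncode}:\n{result.stderr[-1000:]}")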


DEPLOYING VSS STACK
NIM GPU: 4  |  VLM GPU: 5

[0/5] Logging in to nvcr.io...


  [OK] Logged in to nvcr.io

[1/5] Cloning VSS repository...
  [OK] VSS repo exists: /localhome/local-minhngu/video-search-and-summarization

[2/5] Deploying LLM NIM...

[VSS-LLM] Deploying Llama 3.1 8B on GPU 4...
  [OK] LLM NIM started on port 8107

[3/5] Deploying Embedding NIM...

[VSS-Embed] Deploying Embedding NIM on GPU 4...
  [OK] Embedding NIM started on port 8106

[4/5] Deploying Reranker NIM...

[VSS-Rerank] Deploying Reranker NIM on GPU 4...
  [OK] Reranker NIM started on port 8105

[5/5] Deploying VSS Application (VLM on GPU 5)...

[VSS-App] Deploying VSS application...
  [OK] Created .env file
  [OK] Patched config.yaml
  $ docker compose up -d
  [ERROR] VSS app failed:  Network local_deployment_single_gpu_default Creating 
 Network local_deployment_single_gpu_default Created 
 Container local_deployment_single_gpu-milvus-standalone-1 Creating 
 Container local_deployment_single_gpu-graph-db-1 Creating 
 Container local_deployment_single_gpu-elasticsearch-1 Creatin

VSS D

True

Verify VSS services are healthy. Wait 2-5 minutes for NIMs to load models.


In [56]:
ip = get_host_ip()
for name, port, path in [
    ("VSS UI", VSS_UI_PORT, "/"), ("VSS API", VSS_API_PORT, "/"),
    ("LLM NIM", VSS_LLM_PORT, "/v1/health/ready"),
    ("Embedding", VSS_EMBED_PORT, "/v1/health/ready"),
    ("Reranker", VSS_RERANK_PORT, "/v1/health/ready"),
]:
    try:
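        code = requests.get(f"http://localhost:{port}{path}", timeout=10).status_code
        # Below 500 means the service is answering; a 5xx usually means it is still starting
        s = "[OK]" if code < 500 else f"[WARN {code}]"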
    except requests.ConnectionError:
        s = "[DOWN]"
    except requests.Timeout:
        s = "[TIMEOUT]"
    print(f"  {s} {name}: http://{ip}:{port}")
run_command("docker ps --format 'table {{.Names}}\t{{.Status}}' | grep -E '(vss|via|local_deployment|NAMES)'")


VSS Services Status:
  [OK] VSS UI: http://10.86.1.245:9110
  [OK] VSS API: http://10.86.1.245:8110
  [OK] VSS LLM NIM: http://10.86.1.245:8107
  [OK] VSS Embedding: http://10.86.1.245:8106
  [OK] VSS Reranker: http://10.86.1.245:8105

Container Status:
$ docker ps --format 'table {{.Names}}	{{.Status}}' | grep -E '(vss|via|local_deployment|NAMES)'
NAMES                                             STATUS
local_deployment_single_gpu-via-server-1          Up 2 minutes
local_deployment_single_gpu-elasticsearch-1       Up 3 minutes
local_deployment_single_gpu-graph-db-1            Up 3 minutes
local_deployment_single_gpu-minio-1               Up 2 minutes
local_deployment_single_gpu-arango-db-1           Up 3 minutes
local_deployment_single_gpu-milvus-standalone-1   Up 3 minutes (healthy)
vss-reranker                                      Up 3 minutes
vss-embedding                                     Up 3 minutes
vss-llm                                           Up 3 minutes


# Deploy AIDP

Deploy the AIDP stack: Kafka message broker, MinIO object storage, and Kafka consumer for automated ingestion.


In [60]:
# Verify prerequisites
net_check = subprocess.run("docker network inspect nvidia-rag", shell=True, capture_output=True)
if net_check.returncode != 0:
    raise RuntimeError("nvidia-rag network not found. Deploy RAG first.")

ngc_key = os.environ.get("NGC_API_KEY", "")
if not ngc_key:
    raise RuntimeError("NGC_API_KEY not set!")

host_ip = get_host_ip()
os.environ["VSS_SERVER_URL"] = f"http://{host_ip}:{VSS_API_PORT}"

# Login + pull + build
subprocess.run(f"echo {ngc_key} | docker login nvcr.io -u '$oauthtoken' --password-stdin",
               shell=True, capture_output=True, text=True, executable="/bin/bash")

compose = f"docker compose -f {AIDP_COMPOSE_FILE}"
subprocess.run(f"{compose} pull --ignore-pull-failures", shell=True, capture_output=True, text=True, executable="/bin/bash")
subprocess.run(f"{compose} up -d --build", shell=True, capture_output=True, text=True, executable="/bin/bash")

print(f"AIDP deployed:")
print(f"  Kafka UI:      http://{host_ip}:8080")
print(f"  MinIO Console: http://{host_ip}:{MINIO_CONSOLE_PORT}")
print(f"  Credentials:   minioadmin / minioadmin")


DEPLOYING AIDP STACK
Using host IP: 10.86.1.245

[1/4] Checking prerequisites...
[OK] nvidia-rag network exists
[OK] RAG services detected

[2/4] Logging in to nvcr.io...


[OK] Logged in to nvcr.io

[3/4] Pulling images...
Running: pull...
[OK] pull complete

[4/4] Starting services...
Running: up...
[OK] up complete

AIDP DEPLOYED!

Services:
  Kafka UI:       http://localhost:8080
  MinIO Console:  http://localhost:9211
  MinIO API:      http://localhost:9201

Credentials: minioadmin / minioadmin


True

Verify AIDP services are running.


In [None]:
ip = get_host_ip()
print(f"  Kafka UI:      http://{ip}:8080")
print(f"  MinIO Console: http://{ip}:{MINIO_CONSOLE_PORT}")
run_command("docker ps --format 'table {{.Names}}\t{{.Status}}' | grep -E '(kafka|minio|NAMES)'")


AIDP Services Status:
  Kafka UI: http://10.86.1.245:8080
  MinIO Console: http://10.86.1.245:9211
  MinIO API: http://10.86.1.245:9201

Credentials: minioadmin / minioadmin

Container Status:
$ docker ps --format 'table {{.Names}}	{{.Status}}' | grep -E '(kafka|minio|NAMES)'
NAMES                                             STATUS
kafka-consumer                                    Up 4 seconds
aidp-minio-mc                                     Up 4 seconds
aidp-kafka-ui                                     Up 4 seconds
kafka                                             Up 36 seconds (healthy)
aidp-minio                                        Up 36 seconds (healthy)
milvus-minio                                      Up 2 minutes (healthy)
local_deployment_single_gpu-minio-1               Up 4 minutes

RAG Services Status:
  [OK] RAG Server: http://10.86.1.245:8081
  [OK] Ingestor Server: http://10.86.1.245:8082
  [OK] RAG Frontend: http://10.86.1.245:8090
  [OK] Milvus: http://10.86.1.245:1

Wait for all services (RAG, VSS, AIDP) to stabilize before testing.
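
The next cell calls `check_health()`, which the helper cell in this notebook does not define. The following is a minimal sketch of what such a helper might look like, not the original implementation. It assumes the `/health` paths used in the RAG status check and MinIO's `/minio/health/live` liveness endpoint, and it uses only the standard library.

```python
import asyncio
import urllib.error
import urllib.request
from typing import Dict, Optional

# URLs follow this notebook's configuration (RAG 8081, Ingestor 8082, MinIO API 9201).
HEALTH_ENDPOINTS = {
    "RAG Server": "http://localhost:8081/health",
    "Ingestor Server": "http://localhost:8082/health",
    "MinIO": "http://localhost:9201/minio/health/live",
}

def _probe(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code, or None if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        return e.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None

async def check_health(endpoints: Optional[Dict[str, str]] = None) -> Dict[str, Optional[int]]:
    """Probe all endpoints concurrently and print one status line per service."""
    endpoints = endpoints or HEALTH_ENDPOINTS
    codes = await asyncio.gather(*(asyncio.to_thread(_probe, u) for u in endpoints.values()))
    results = dict(zip(endpoints, codes))
    print("Service Health Check:")
    for name, code in results.items():
        status = "[DOWN]" if code is None else ("[OK]" if code == 200 else "[WARN]")
        print(f"{name}: {status} ({code})")
    return results
```

`asyncio.to_thread` requires Python 3.9 or later; it keeps the blocking `urlopen` calls off the notebook's event loop.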


In [62]:
# Wait for services to be ready
print("Waiting for services to be ready...")
time.sleep(15)

# Check health
await check_health()

Waiting for services to be ready...


Service Health Check:
RAG Server:      [OK] (200)
Ingestor Server: [OK] (200)
MinIO:           [OK]


# Testing

Test the deployment by uploading documents and videos, then querying via RAG.


## 1. Document Upload

Upload a PDF document to MinIO, which triggers automatic ingestion via Kafka consumer.


### 1.1 Upload to Storage

Upload the document to MinIO object storage.


In [63]:
# Sample documents are included in the repo under examples/rag_event_ingest/data/
pdf_path = os.path.join(DATA_DIR, "documents", "Seahawks-Patriots in Super Bowl LX_ What We Learned from Seattle's 29-13 win.pdf")
upload_file(pdf_path, "Seahawks-Patriots_SuperBowl_LX_Analysis.pdf")


[OK] Uploaded: Seahawks-Patriots_SuperBowl_LX_Analysis.pdf


True

Check consumer logs to verify the document was processed.


In [65]:
# Check consumer logs to see processing
print("Waiting for processing...")
get_consumer_logs(50)

Waiting for processing...
Consumer Logs (last 50 lines):
$ docker logs kafka-consumer --tail 50


        "x-minio-origin-endpoint": "http://172.18.0.21:9000"
      },
      "s3": {
        "s3SchemaVersion": "1.0",
        "configurationId": "Config",
        "bucket": {
          "name": "aidp-bucket",
          "ownerIdentity": {
            "principalId": "minioadmin"
          },
          "arn": "arn:aws:s3:::aidp-bucket"
        },
        "object": {
          "key": "Seahawks-Patriots_SuperBowl_LX_Analysis.pdf",
          "size": 6084589,
          "eTag": "b62aa96fc8551ff52f2a0b06495f751a-2",
          "contentType": "application/octet-stream",
          "userMetadata": {
            "content-type": "application/octet-stream"
          },
          "sequencer": "189329C5DBD3009B"
        }
      },
      "source": {
        "host": "172.18.0.1",
        "port": "",
        "userAgent": "MinIO (Linux; x86_64) minio-py/7.2.0"
      }
    }
  ]
}
2026-02-11 10:16:48,272 - consumer - INFO - Processing: aidp-bucket/Seahawks-Patriots_SuperBowl_LX_Analysis.pdf (6084589 bytes)
20
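
The log above shows the S3-style notification that MinIO emits for the upload. A consumer needs little more than the bucket name, object key, and size from it. The sketch below assumes the event wraps its entries in a top-level `Records` array (the log is truncated before that point) and that keys arrive URL-encoded as in S3 notifications; `extract_objects` is a hypothetical name, not the consumer's actual function.

```python
import json
from typing import List, Tuple
from urllib.parse import unquote_plus

def extract_objects(event_json: str) -> List[Tuple[str, str, int]]:
    """Return (bucket, key, size) for each record in an S3-style notification."""
    event = json.loads(event_json)
    found = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name", "")
        obj = s3.get("object", {})
        # Assumption: keys are URL-encoded in the event, so decode before use.
        key = unquote_plus(obj.get("key", ""))
        found.append((bucket, key, int(obj.get("size", 0))))
    return found

# Trimmed-down event using the values from the log above.
sample = json.dumps({"Records": [{"s3": {
    "bucket": {"name": "aidp-bucket"},
    "object": {"key": "Seahawks-Patriots_SuperBowl_LX_Analysis.pdf", "size": 6084589},
}}]})
print(extract_objects(sample))
```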

### 1.2 Verify Document Ingestion

Verify the document was ingested into the RAG system.


In [66]:
resp = requests.get(f"{INGESTOR_URL}/v1/collections", timeout=10)
collections = resp.json().get("collections", []) if resp.status_code == 200 else []
print("Collections:", collections)


Collections:
  - {'collection_name': 'metadata_schema', 'num_entities': 1, 'metadata_schema': []}
  - {'collection_name': 'aidp_bucket', 'num_entities': 14, 'metadata_schema': [{'name': 'filename', 'type': 'string', 'required': False, 'array_type': None, 'max_length': None, 'description': 'Name of the uploaded file'}]}
  - {'collection_name': 'meta', 'num_entities': 34, 'metadata_schema': []}


[{'collection_name': 'metadata_schema',
  'num_entities': 1,
  'metadata_schema': []},
 {'collection_name': 'aidp_bucket',
  'num_entities': 14,
  'metadata_schema': [{'name': 'filename',
    'type': 'string',
    'required': False,
    'array_type': None,
    'max_length': None,
    'description': 'Name of the uploaded file'}]},
 {'collection_name': 'meta', 'num_entities': 34, 'metadata_schema': []}]
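
To turn this listing into a pass/fail check, it is enough to look up the target collection and read its `num_entities`. A minimal sketch, assuming the response shape shown above; `entity_count` is a hypothetical helper.

```python
from typing import Dict, List, Optional

def entity_count(collections: List[Dict], name: str) -> Optional[int]:
    """Return num_entities for the named collection, or None if it is absent."""
    for c in collections:
        if c.get("collection_name") == name:
            return c.get("num_entities")
    return None

# Abbreviated copy of the response shown above.
sample = [
    {"collection_name": "metadata_schema", "num_entities": 1},
    {"collection_name": "aidp_bucket", "num_entities": 14},
    {"collection_name": "meta", "num_entities": 34},
]
count = entity_count(sample, "aidp_bucket")
print("ingested" if count else "not ingested yet", count)
```

A count of `None` or `0` suggests the consumer has not finished (or has not started) ingesting into that collection.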

List documents ingested into the collection.


In [67]:
resp = requests.get(f"{INGESTOR_URL}/documents", params={"collection_name": MINIO_COLLECTION}, timeout=10)
docs = resp.json().get("documents", []) if resp.status_code == 200 else []
print(f"Documents in '{MINIO_COLLECTION}' ({len(docs)}):")
for d in docs:
    print(f"  - {d.get('document_name', '?')}")


Documents in 'aidp_bucket' (3):
  - Seahawks-Patriots in Super Bowl LX_ What We Learned from Seattle's 29-13 win.pdf
  - Seattle Seahawks vs New England Patriots - Super Bowl LX Game Highlights.mp4_description.json
  - Seahawks-Patriots_SuperBowl_LX_Analysis.pdf


[{'document_name': "Seahawks-Patriots in Super Bowl LX_ What We Learned from Seattle's 29-13 win.pdf",
  'metadata': {'filename': "Seahawks-Patriots in Super Bowl LX_ What We Learned from Seattle's 29-13 win.pdf"}},
 {'document_name': 'Seattle Seahawks vs New England Patriots - Super Bowl LX Game Highlights.mp4_description.json',
  'metadata': {'filename': 'Seattle Seahawks vs New England Patriots - Super Bowl LX Game Highlights.mp4_description.json'}},
 {'document_name': 'Seahawks-Patriots_SuperBowl_LX_Analysis.pdf',
  'metadata': {'filename': 'Seahawks-Patriots_SuperBowl_LX_Analysis.pdf'}}]

Check consumer logs to verify document processing status.


In [None]:
# Check document processing status
print("Waiting for document processing...")
get_consumer_logs(50)

### 1.3 Query Document via RAG

Query the ingested document using natural language.


In [70]:
# Query the document
await query_rag("What was the final score and who won Super Bowl LX?", MINIO_COLLECTION)

Question: What was the final score and who won Super Bowl LX?
Collection: aidp_bucket
----------------------------------------
Answer: The final score was Seattle 29, New England 13. The Seattle Seahawks won Super Bowl LX.


'The final score was Seattle 29, New England 13. The Seattle Seahawks won Super Bowl LX.'

Ask another question about the document.


In [71]:
# Query about key takeaways
await query_rag("What were the key lessons learned from Seattle's victory in Super Bowl LX?", MINIO_COLLECTION)

Question: What were the key lessons learned from Seattle's victory in Super Bowl LX?
Collection: aidp_bucket
----------------------------------------
Answer: Seattle's defense dominated the game, playing to their No. 1 ranking and sacking Drake Maye six times while generating two interceptions. The Seahawks excelled in all three phases of the game, with special teams making a significant impact through consistent field position advantages. Kenneth Walker III demonstrated strong rushing performance, gaining significant yards over expected. The team's ability to capitalize on stalled drives with points and their defensive depth were crucial in securing the victory.


"Seattle's defense dominated the game, playing to their No. 1 ranking and sacking Drake Maye six times while generating two interceptions. The Seahawks excelled in all three phases of the game, with special teams making a significant impact through consistent field position advantages. Kenneth Walker III demonstrated strong rushing performance, gaining significant yards over expected. The team's ability to capitalize on stalled drives with points and their defensive depth were crucial in securing the victory."

## 2. Video Upload

Upload a video to MinIO, which triggers automatic ingestion via Kafka consumer → VSS for video analysis → RAG for indexing.



### 2.1 Upload to Storage

Upload the video to MinIO object storage.


In [72]:
# Sample videos are included in the repo under examples/rag_event_ingest/data/
video_path = os.path.join(DATA_DIR, "videos", "Seattle Seahawks vs New England Patriots - Super Bowl LX Game Highlights.mp4")
upload_file(video_path)

print("\nVideo processing takes longer than documents. Check consumer logs for progress.")


[OK] Uploaded: Seattle Seahawks vs New England Patriots - Super Bowl LX Game Highlights.mp4

Video processing takes longer than documents. Check consumer logs for progress.


### 2.2 Verify Video Ingestion

Check the consumer logs to verify the video processing status. The cell does not block, so re-run it until the logs show that processing has finished.


In [None]:
# Check video processing status
print("Waiting for video processing...")
get_consumer_logs(50)
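
Instead of re-reading the logs by hand, one option is to poll until the uploaded file shows up in the collection. The sketch below is an illustration under assumptions: `wait_for_document` is a hypothetical helper, and it takes a `list_names` callable so it can wrap whatever listing call is available (for example the ingestor's `GET /documents` endpoint used in the cleanup section). Whether a video appears under its original file name depends on how the consumer indexes the VSS output.

```python
import time
from typing import Callable, Iterable


def wait_for_document(
    list_names: Callable[[], Iterable[str]],
    name: str,
    timeout_s: float = 900.0,
    interval_s: float = 15.0,
) -> bool:
    """Poll list_names() until `name` appears or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if name in set(list_names()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Stand-in for a real listing call: the name appears on the third poll.
_polls = iter([[], ["other.pdf"], ["other.pdf", "highlights.mp4"]])
found = wait_for_document(lambda: next(_polls), "highlights.mp4",
                          timeout_s=5, interval_s=0.01)
print("indexed" if found else "timed out")
```

Passing the listing function in as an argument keeps the polling logic independent of the service it checks.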

### 2.3 Query Video via RAG

Query the video content using natural language.


In [75]:
# Query about the video content
await query_rag("Summarize the video content", MINIO_COLLECTION)

Question: Summarize the video content
Collection: aidp_bucket
----------------------------------------
Answer: The video content summarizes the Super Bowl LX game between the Seattle Seahawks and the New England Patriots. The Seahawks dominated the game, showcasing a strong defense that intercepted multiple passes and forced crucial turnovers. The Patriots struggled offensively, with quarterback Drake Maye facing pressure and struggling to find open targets. The Seahawks' offense, led by Kenneth Walker III, drove downfield multiple times, culminating in several touchdowns. The final score was 66-31 in favor of the Seattle Seahawks. Key plays included a deep pass touchdown by Russell Wilson, interceptions by the Seahawks' defense, and consistent rushing yards by Walker. The game highlighted the Seahawks' defensive dominance and the Patriots' offensive struggles.

Query about a specific time range in the video.


In [79]:
# Query about specific time range
await query_rag("What happened between 15:00 and 20:00?", MINIO_COLLECTION)


Question: What happened between 15:00 and 20:00?
Collection: aidp_bucket
----------------------------------------
Answer: Between 15:00 and 20:00, the Seahawks' defense continued to showcase exceptional play, intercepting multiple passes and forcing crucial turnovers. In contrast, the Patriots' offense lacked consistency, struggling to move the ball effectively and converting few first downs. The turning point came when the Patriots finally found the end zone, breaking the shutout and tying the game at 3-3. Despite the late-game heroics, the overall performance highlighted the dominance of the Seahawks' defense and the offensive struggles of the Patriots. The Seahawks' defense continued to shine, intercepting a pass and returning it for a touchdown, putting them ahead 10-3.

Additional query: analyze key defensive plays and turnovers.


In [None]:
# Defensive Analysis
await query_rag("Describe the key defensive plays and turnovers that impacted the game outcome.", MINIO_COLLECTION)

Additional query: identify critical momentum-changing plays in the second half.


In [None]:
# Momentum Shifts
await query_rag("What were the critical momentum-changing plays in the second half of the game?", MINIO_COLLECTION)
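
To run several follow-up questions in one cell, the calls can be looped. A minimal sketch, assuming `query_rag` is an async function that takes a question and a collection name and returns the answer string (which matches how it is called above); `ask_all` and the stand-in `fake_query` are hypothetical names used only for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def ask_all(
    query_fn: Callable[[str, str], Awaitable[str]],
    questions: List[str],
    collection: str,
) -> Dict[str, str]:
    """Ask each question in order and return a question -> answer mapping."""
    answers: Dict[str, str] = {}
    for question in questions:
        # Sequential on purpose: avoids sending concurrent requests to one endpoint.
        answers[question] = await query_fn(question, collection)
    return answers


async def fake_query(question: str, collection: str) -> str:
    """Stand-in for query_rag so the sketch runs without a deployment."""
    return f"[{collection}] answer to: {question}"


questions = [
    "Describe the key defensive plays and turnovers that impacted the game outcome.",
    "What were the critical momentum-changing plays in the second half of the game?",
]
results = asyncio.run(ask_all(fake_query, questions, "aidp_bucket"))
for q, a in results.items():
    print(q, "->", a)
```

Inside a notebook, `await ask_all(query_rag, questions, MINIO_COLLECTION)` would replace the `asyncio.run(...)` call, since the kernel already runs an event loop.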

# Clean Up

Stop all services and clean up ingested data.


## 1. Stop RAG Deployment

Stop all RAG services (NIMs, Milvus, Ingestor, RAG server).


In [None]:
os.chdir(RAG_REPO_DIR)
for f in [
    "deploy/compose/docker-compose-rag-server.yaml",
    "deploy/compose/docker-compose-ingestor-server.yaml",
    "deploy/compose/vectordb.yaml",
    "deploy/compose/nims.yaml",
]:
    run_command(f"docker compose -f {f} down")
print("[OK] RAG stopped")


Stopping RAG stack...
$ docker compose -f deploy/compose/docker-compose-rag-server.yaml down


 Container rag-frontend Stopping 
 Container rag-frontend Stopped 
 Container rag-frontend Removing 
 Container rag-frontend Removed 
 Container rag-server Stopping 
 Container rag-server Stopped 
 Container rag-server Removing 
 Container rag-server Removed 
 Network nvidia-rag Removing 
 Network nvidia-rag Resource is still in use 
 Container compose-redis-1 Stopping 
 Container compose-nv-ingest-ms-runtime-1 Stopping 
 Container ingestor-server Stopping 


$ docker compose -f deploy/compose/docker-compose-ingestor-server.yaml down


 Container compose-redis-1 Stopped 
 Container compose-redis-1 Removing 
 Container compose-redis-1 Removed 
 Container compose-nv-ingest-ms-runtime-1 Stopped 
 Container compose-nv-ingest-ms-runtime-1 Removing 
 Container ingestor-server Stopped 
 Container ingestor-server Removing 
 Container compose-nv-ingest-ms-runtime-1 Removed 
 Container ingestor-server Removed 
 Network nvidia-rag Removing 
 Network nvidia-rag Resource is still in use 
 Container milvus-standalone Stopping 


$ docker compose -f deploy/compose/vectordb.yaml down


 Container milvus-standalone Stopped 
 Container milvus-standalone Removing 
 Container milvus-standalone Removed 
 Container milvus-minio Stopping 
 Container milvus-etcd Stopping 
 Container milvus-etcd Stopped 
 Container milvus-etcd Removing 
 Container milvus-etcd Removed 
 Container milvus-minio Stopped 
 Container milvus-minio Removing 
 Container milvus-minio Removed 
 Network nvidia-rag Removing 
 Network nvidia-rag Resource is still in use 
 Container compose-graphic-elements-1 Stopping 
 Container compose-table-structure-1 Stopping 
 Container nemoretriever-embedding-ms Stopping 
 Container nim-llm-ms Stopping 
 Container compose-paddle-1 Stopping 
 Container nemoretriever-ranking-ms Stopping 
 Container compose-page-elements-1 Stopping 


$ docker compose -f deploy/compose/nims.yaml down


 Container nim-llm-ms Stopped 
 Container nim-llm-ms Removing 
 Container nim-llm-ms Removed 
 Container compose-page-elements-1 Stopped 
 Container compose-page-elements-1 Removing 
 Container nemoretriever-ranking-ms Stopped 
 Container nemoretriever-ranking-ms Removing 
 Container nemoretriever-embedding-ms Stopped 
 Container nemoretriever-embedding-ms Removing 
 Container compose-page-elements-1 Removed 
 Container nemoretriever-ranking-ms Removed 
 Container compose-table-structure-1 Stopped 
 Container compose-table-structure-1 Removing 
 Container nemoretriever-embedding-ms Removed 
 Container compose-table-structure-1 Removed 
 Container compose-paddle-1 Stopped 
 Container compose-paddle-1 Removing 
 Container compose-paddle-1 Removed 
 Container compose-graphic-elements-1 Stopped 
 Container compose-graphic-elements-1 Removing 


[OK] RAG stopped


 Container compose-graphic-elements-1 Removed 
 Network nvidia-rag Removing 
 Network nvidia-rag Resource is still in use 
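
The cell above prints `[OK] RAG stopped` after the loop. If `run_command` does not raise on failure (its behavior is not shown here), a failed `docker compose down` would go unnoticed. The sketch below is one hedged way to record failures; `stop_stack` and the injectable `runner` parameter are hypothetical, and `fake_runner` only simulates Docker so the example runs anywhere.

```python
import subprocess
from typing import Callable, List, Sequence


def stop_stack(
    compose_files: Sequence[str],
    runner: Callable[..., "subprocess.CompletedProcess"] = subprocess.run,
) -> List[str]:
    """Run `docker compose -f <file> down` for each file; return the files that failed."""
    failed = []
    for path in compose_files:
        result = runner(["docker", "compose", "-f", path, "down"],
                        capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(path)
    return failed


def fake_runner(cmd, **kwargs):
    """Stand-in for subprocess.run: pretend vectordb.yaml fails to stop."""
    code = 1 if cmd[3].endswith("vectordb.yaml") else 0
    return subprocess.CompletedProcess(cmd, code, stdout="", stderr="")


files = [
    "deploy/compose/docker-compose-rag-server.yaml",
    "deploy/compose/docker-compose-ingestor-server.yaml",
    "deploy/compose/vectordb.yaml",
    "deploy/compose/nims.yaml",
]
failed = stop_stack(files, runner=fake_runner)
print("[OK] RAG stopped" if not failed else f"[WARN] failed to stop: {failed}")
```

With the default `runner`, the same function would call Docker directly.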


## 2. Stop VSS Deployment

Stop all VSS services (NIMs, VLM, via-server).


In [None]:
vss_deploy_dir = f"{VSS_DIR}/deploy/docker/local_deployment_single_gpu"
if os.path.exists(vss_deploy_dir):
    # Export variables from .env when it exists; still run `docker compose down` when it does not
    subprocess.run("set -a; [ -f .env ] && source .env; set +a; docker compose down",
                   shell=True, executable="/bin/bash", cwd=vss_deploy_dir, capture_output=True)
# Force-remove the standalone NIM containers; errors for missing containers are captured and ignored
for name in ["vss-llm", "vss-embedding", "vss-reranker"]:
    subprocess.run(["docker", "rm", "-f", name], capture_output=True)
print("[OK] VSS stopped")


Stopping VSS stack...
$ docker compose -f deploy/docker/local_deployment_single_gpu/compose.yaml down


 Container local_deployment_single_gpu-elasticsearch-1 Stopping 
 Container local_deployment_single_gpu-via-server-1 Stopping 
 Container local_deployment_single_gpu-via-server-1 Stopped 
 Container local_deployment_single_gpu-via-server-1 Removing 
 Container local_deployment_single_gpu-via-server-1 Removed 
 Container local_deployment_single_gpu-minio-1 Stopping 
 Container local_deployment_single_gpu-arango-db-1 Stopping 
 Container local_deployment_single_gpu-graph-db-1 Stopping 
 Container local_deployment_single_gpu-milvus-standalone-1 Stopping 
 Container local_deployment_single_gpu-minio-1 Stopped 
 Container local_deployment_single_gpu-minio-1 Removing 
 Container local_deployment_single_gpu-milvus-standalone-1 Stopped 
 Container local_deployment_single_gpu-milvus-standalone-1 Removing 
 Container local_deployment_single_gpu-minio-1 Removed 
 Container local_deployment_single_gpu-milvus-standalone-1 Removed 
 Container local_deployment_single_gpu-arango-db-1 Stopped 
 ... (output truncated)

$ docker stop vss-llm vss-embedding vss-reranker 2>/dev/null || true
vss-llm
vss-embedding
vss-reranker
$ docker rm vss-llm vss-embedding vss-reranker 2>/dev/null || true
vss-llm
vss-embedding
vss-reranker
[OK] VSS stopped


## 3. Stop AIDP Deployment

Stop AIDP services (Kafka, MinIO, Consumer).


In [None]:
run_command(f"docker compose -f {AIDP_COMPOSE_FILE} down")
print("[OK] AIDP stopped")


Stopping AIDP stack...
$ docker compose -f /localhome/local-minhngu/aidb-helm/deploy/docker-compose-aidp.yaml down


 Container kafka-consumer Stopping 
 Container aidp-kafka-ui Stopping 
 Container aidp-minio-mc Stopping 
 Container aidp-kafka-ui Stopped 
 Container aidp-kafka-ui Removing 
 Container aidp-kafka-ui Removed 
 Container aidp-minio-mc Stopped 
 Container aidp-minio-mc Removing 
 Container kafka-consumer Stopped 
 Container kafka-consumer Removing 
 Container aidp-minio-mc Removed 
 Container kafka-consumer Removed 
 Container aidp-minio Stopping 
 Container kafka Stopping 
 Container aidp-minio Stopped 
 Container aidp-minio Removing 
 Container aidp-minio Removed 


[OK] AIDP stopped


 Container kafka Stopped 
 Container kafka Removing 
 Container kafka Removed 
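
After the stacks are down, it can be worth confirming that nothing was left running. The sketch below parses name-only output such as that produced by `docker ps --format '{{.Names}}'`. The `leftover_containers` helper and the watch-list (taken from container names in the logs above, not exhaustive) are illustrative assumptions, and the sample string stands in for real command output.

```python
from typing import Iterable, List

# Container names taken from the shutdown logs in this notebook (not exhaustive).
WATCHED = {"rag-server", "ingestor-server", "milvus-standalone",
           "nim-llm-ms", "kafka", "kafka-consumer", "aidp-minio"}


def leftover_containers(ps_output: str, watched: Iterable[str] = WATCHED) -> List[str]:
    """Return watched container names that still appear in `docker ps` name output."""
    running = {line.strip() for line in ps_output.splitlines() if line.strip()}
    return sorted(running & set(watched))


# Sample output; in practice it could come from something like:
#   subprocess.run(["docker", "ps", "--format", "{{.Names}}"],
#                  capture_output=True, text=True).stdout
sample_ps = "kafka\nsome-unrelated-container\nmilvus-standalone\n"
print(leftover_containers(sample_ps))  # ['kafka', 'milvus-standalone']
```

An empty list means none of the watched containers are still running.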


## 4. Clean All Ingested Documents

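Delete all documents from RAG collections and MinIO storage. The cell below calls the ingestor server at `INGESTOR_URL`, so run it while that service is still reachable. If the RAG stack from step 1 is already stopped, the request fails, the cell prints an error, and it reports that there are no documents to clean.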


In [None]:
# List and delete all documents from RAG collection
try:
    resp = requests.get(f"{INGESTOR_URL}/documents", params={"collection_name": MINIO_COLLECTION}, timeout=30)
    resp.raise_for_status()
    documents = resp.json().get("documents", [])
except requests.RequestException as e:
    print(f"[ERROR] {e}")
    documents = []

if not documents:
    print("No documents to clean.")
else:
    print(f"Deleting {len(documents)} documents...")
    for doc in documents:
        name = doc.get("document_name")
        if not name:
            continue
        try:
            requests.delete(
                f"{INGESTOR_URL}/documents",
                json={"document_names": [name], "collection_name": MINIO_COLLECTION},
                timeout=30,
            ).raise_for_status()
            print(f"  [OK] {name}")
        except requests.RequestException as e:
            print(f"  [ERROR] {name}: {e}")
    print("Cleanup complete.")
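
Because the delete payload carries a list under `document_names`, several names could probably be sent per request rather than one at a time. Whether the ingestor accepts more than one name per call is an assumption here, as are the `delete_payloads` helper and the batch size; this sketch only builds the payloads and does not contact any service.

```python
from typing import Dict, Iterator, List


def delete_payloads(names: List[str], collection: str,
                    batch_size: int = 50) -> Iterator[Dict]:
    """Yield JSON payloads that group document names into batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Drop empty names and duplicates while keeping the original order.
    unique = list(dict.fromkeys(n for n in names if n))
    for start in range(0, len(unique), batch_size):
        yield {
            "document_names": unique[start:start + batch_size],
            "collection_name": collection,
        }


payloads = list(delete_payloads(["a.pdf", "b.pdf", "", "a.pdf", "c.mp4"],
                                "aidp_bucket", batch_size=2))
for p in payloads:
    # In the notebook, each payload would be passed as json=p to
    # requests.delete(f"{INGESTOR_URL}/documents", ...), as in the cell above.
    print(p)
```

Per-document requests, as in the original cell, have the advantage of reporting exactly which file failed to delete.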
