# Example NV-INGEST Pipeline

## Architecture Overview

**Services running locally (Docker containers):**
- Redis - Message broker on port 6379
- etcd - Metadata storage on port 2379
- MinIO - Object storage on ports 9000-9001
- Milvus - Vector database on port 19530
- NV-Ingest Runtime - Main orchestration service on ports 7670-7671

**NVIDIA hosted endpoints:**
- PaddleOCR: https://ai.api.nvidia.com/v1/cv/baidu/paddleocr
- Page Elements Detection: https://ai.api.nvidia.com/v1/cv/nvidia/nv-yolox-page-elements-v1
- Graphic Elements Detection: https://ai.api.nvidia.com/v1/cv/nvidia/nemoretriever-graphic-elements-v1
- Table Structure Detection: https://ai.api.nvidia.com/v1/cv/nvidia/nemoretriever-table-structure-v1
- NemoRetriever Parse: https://ai.api.nvidia.com/v1/cv/nvidia/nemoretriever-parse
- Embeddings: https://integrate.api.nvidia.com/v1 (nvidia/llama-3.2-nv-embedqa-1b-v2)
- Vision-Language Model: https://integrate.api.nvidia.com/v1 (meta/llama-3.2-11b-vision-instruct)
- Speech-to-Text: https://ai.api.nvidia.com/v1/audio/nvidia/speechtotext

In [51]:
# Import required libraries for timing, client connections, and result processing
import os
import time
from nv_ingest_client.client import Ingestor, NvIngestClient
from nv_ingest_api.util.message_brokers.simple_message_broker import SimpleClient
from nv_ingest_client.util.process_json_files import ingest_json_results_to_blob
from openai import OpenAI
from nv_ingest_client.util.milvus import nvingest_retrieval

print("🎉 NV-Ingest successfully imported in Jupyter!")
print("✅ All packages loaded successfully!")

🎉 NV-Ingest successfully imported in Jupyter!
✅ All packages loaded successfully!


In [52]:
# Create connection to the NV-Ingest service running in Docker
# This connects to the main orchestration service on port 7671
# This code uses simple message broker and runs the client locally
client = NvIngestClient(
    message_client_allocator=SimpleClient,
    message_client_port=7671,
    message_client_hostname="localhost"
)

print("✅ NV-Ingest client created successfully!")
print("🎯 Ready to process documents!")

✅ NV-Ingest client created successfully!
🎯 Ready to process documents!


In [53]:
# Configure connection to local Milvus vector database
# Milvus runs locally on port 19530
milvus_uri = "http://localhost:19530"
collection_name = "nv_ingest_test"

print(f"✅ Milvus configuration set:")
print(f"   URI: {milvus_uri}")
print(f"   Collection: {collection_name}")

✅ Milvus configuration set:
   URI: http://localhost:19530
   Collection: nv_ingest_test


In [54]:
# Check for sample PDF file
sample_file = "data/pharmacopia-2014.pdf"
if os.path.exists(sample_file):
    print(f"✅ Sample PDF found: {sample_file}")
    print(f"   File size: {os.path.getsize(sample_file):,} bytes")
else:
    print(f"❌ Sample file not found: {sample_file}")
    # List available files
    if os.path.exists("data/"):
        print("Available files in data/:")
        for file in os.listdir("data/"):
            print(f"  - {file}")

✅ Sample PDF found: data/pharmacopia-2014.pdf
   File size: 3,355,718 bytes


In [55]:
# Build the processing pipeline using method chaining
ingestor = (
    Ingestor(client=client)
    .files(sample_file)
    # EXTRACTION PHASE: Extract different types of content types from the PDF
    # Note: For very complex PDFs, use only the text extractor
    # Specialized NVIDIA APIs for tables, charts, graphs impose a rate limit 
    .extract(              
        extract_text=True,
        extract_tables=False,
        extract_charts=False,
        extract_images=False,
        paddle_output_format="markdown",
        extract_infographics=False,
        text_depth="page"
    )
    # EMBEDDING PHASE: Generate vector embeddings for semantic search
    .embed()
    # STORAGE PHASE: Upload to vector database for retrieval
    .vdb_upload(
        collection_name=collection_name,
        milvus_uri=milvus_uri,
        sparse=False,
        dense_dim=2048,
        recreate=True
    )
)

print("✅ Pipeline configured successfully!")
print("📋 Pipeline stages: File → Extract → Embed → Vector DB Upload")

✅ Pipeline configured successfully!
📋 Pipeline stages: File → Extract → Embed → Vector DB Upload


In [56]:
# EXECUTE THE NV-INGEST PIPELINE
# The first line (from Alex's requirements) orchestrates the workflow:
# 1. Sends PDF to NV-Ingest service (localhost:7670)
# 2. NV-Ingest calls NVIDIA endpoints for AI processing:
#    - PaddleOCR for table extraction
#    - Page/Graphic elements detection for layout analysis  
#    - Vision-language model for image understanding
#    - Embedding model for vector generation
# 3. Results are aggregated and returned
# 4. Embeddings are uploaded to local Milvus database
# 5. Progress bar shows real-time status

print("🚀 Testing FULL NV-Ingest pipeline with LOCAL Milvus...")
print("Starting full ingestion with vector database upload...")
#t0 = time.time()

try:
    results = ingestor.ingest(show_progress=True)
    
    t1 = time.time()
    
    # Let user know if processing successfully completes
    if results:
        print(f"\n🎉 SUCCESS!")
        print(f"✅ Document processed and uploaded to vector database")
        print(f"✅ Vector database collection '{collection_name}' created in Milvus")
        
        # Show a quick summary
        print(f"\n📄 Results summary:")
        full_results = ingest_json_results_to_blob(results[0])
        print(f"Processing time: {t1-t0:.2f} seconds")
        print(f"📊 Processed {len(results)} documents successfully!")

        
# QUERY AND RETRIEVAL: This code allows the user to send a natural language query to the system and receive a response based on the ingested content
# Set up configuration to access the Milvus Vector DB which is locally hosted
        sparse = False
        
        # Example query - Search for something in the Pharmacopia manual provided by the customer
        queries = ["List the options for local anesthetics that can be applied without risk of paralysis or cardiac arrest. Use only the information in the pharmacopia manual"] 
        
        print(f"\n🔍 Searching in collection '{collection_name}' for: {queries[0]}")
        
        try:
            # Query the vector database
            retrieved_docs = nvingest_retrieval(
                queries,
                collection_name,
                milvus_uri=milvus_uri,
                hybrid=sparse,
                top_k=3,  # Get top 3 results
            )
            
            print(f"✅ Found {len(retrieved_docs[0])} relevant documents")
            
            # Extract the most relevant content
            extract = retrieved_docs[0][0]["entity"]["text"]
            
            # Create OpenAI client for NVIDIA endpoints
            openai_client = OpenAI(
                base_url="https://integrate.api.nvidia.com/v1",
                api_key=os.environ["NVIDIA_BUILD_API_KEY"]
            )
            
            # Create prompt for the LLM
            prompt = f"Using the following content: {extract}\n\n Answer the user query: {queries[0]}"
            print(f"\n📝 Prompt: {prompt[:200]}...")
            
            # Get response from NVIDIA LLM
            completion = openai_client.chat.completions.create(
                model="nvidia/llama-3.1-nemotron-70b-instruct",
                messages=[{"role": "user", "content": prompt}],
            )
            
            response = completion.choices[0].message.content
            print(f"\n🤖 Answer: {response}")
            
        except Exception as e:
            print(f"❌ Error during retrieval: {e}")
            print(f"\n🔧 Troubleshooting:")
            print(f"1. Make sure you have data in the collection '{collection_name}'")
            print(f"2. Check if Milvus is running: docker ps | grep milvus")
            print(f"3. Verify collection exists in Milvus")
            
            # Check if collection exists
            try:
                from pymilvus import MilvusClient
                milvus_client = MilvusClient(uri=milvus_uri)
                collections = milvus_client.list_collections()
                print(f"📊 Available collections: {collections}")
                if collection_name in collections:
                    print(f"✅ Collection '{collection_name}' exists")
                else:
                    print(f"❌ Collection '{collection_name}' not found")
                    print(f"💡 You need to run the ingestion pipeline with .vdb_upload() first")
            except Exception as e2:
                print(f"❌ Cannot connect to Milvus: {e2}")
        
except Exception as e:
    print(f"❌ FAILED: {str(e)}")
    print("🔧 Check that all Docker services are running and accessible")

🚀 Testing FULL NV-Ingest pipeline with LOCAL Milvus...
Starting full ingestion with vector database upload...



Processing Documents:   0%|                                                                      | 0/1 [00:00<?, ?doc/s][A
Processing Documents: 100%|██████████████████████████████████████████████████████████████| 1/1 [00:14<00:00, 14.26s/doc][A



🎉 SUCCESS!
✅ Document processed and uploaded to vector database
✅ Vector database collection 'nv_ingest_test' created in Milvus

📄 Results summary:
Processing time: 4129.53 seconds
📊 Processed 1 documents successfully!

🔍 Searching in collection 'nv_ingest_test' for: List the options for local anesthetics that can be applied without risk of paralysis or cardiac arrest. Use only the information in the pharmacopia manual
✅ Found 3 relevant documents

📝 Prompt: Using the following content: 24 ANESTHESIA: Neuromuscular Blockade Reversal Agents
BUPIVACAINE LIPOSOME ( EXPAREL ) � L – ♀ C 
� – $$$$$ 
 ADULT – Bunionectomy : Infi ltrate 7 mL of 
EXPAREL into ...



1. **LIDOCAINE—LOCAL ANESTHETIC (Xylocaine)**
   - **Forms:** 0.5, 1, 1.5, 2%. With epi: 0.5, 1, 1.5, 2%.
   - **Notes:** Onset within 2 min, duration 30 to 60 min (longer with epi). Amide group. Potentially toxic dose 3 to 5 mg/kg without epinephrine, and 5 to 7 mg/kg with epinephrine.

2. **MEPIVACAINE (Carbocaine, Polocaine)**
 