Multimodal Agentic RAG with Graph-Enhanced Knowledge Base
for Hardware Datasheet Intelligent Q&A and Verification System
VeriQuery is an end-to-end intelligent system for electronic component datasheet analysis, built on Multimodal Agentic RAG, knowledge graph enhancement, and formal reasoning engines. It transforms static PDF datasheets into queryable knowledge, enabling engineers to retrieve parameters, verify electrical compatibility, compare devices, and locate reference circuits through natural language interaction.
The system addresses a critical gap in electronic design workflows: the manual, error-prone process of extracting and cross-referencing specifications from heterogeneous datasheets. VeriQuery automates this pipeline with a hybrid retrieval architecture, a four-layer Electrical Rule Check (ERC) engine, and a Z-number augmented multi-criteria decision framework.
Unlike general-purpose RAG systems, VeriQuery is purpose-built for the datasheet domain, and every core module incorporates domain-specific design: the ERC engine encodes JEDEC logic-level standards and Arrhenius thermal degradation models; the parameter extractor uses section-anchored regex targeting "Electrical Characteristics" tables; the hybrid retriever adds a structured path (SQLite FTS5) specifically to preserve tabular data integrity; and the scoring engine normalizes test conditions via semiconductor physics (CCM) rather than generic min-max scaling.
| Module | Capability | Technology |
|---|---|---|
| Intelligent Q&A | Natural language queries over datasheet content with citation tracing | Agentic RAG + LangGraph + Hybrid Retrieval |
| Pinout Analysis | Automatic pin definition extraction and SVG diagram rendering | Knowledge Graph + Regex + LLM |
| ERC Check | Four-layer progressive electrical compatibility verification | Interval Arithmetic + JEDEC Standards |
| Parameter Comparison | Multi-device scoring with reliability-aware ranking | Z-number + B-SPOTIS + MEREC |
| Circuit Retrieval | Multi-modal circuit diagram search from PDF pages | CLIP + VLM + MaxSim |
| Document Management | PDF upload, parsing, indexing, and lifecycle management | PyMuPDF + Camelot + pdfplumber |
Cascaded pipeline with decreasing confidence:
- Structured Table Query: Direct lookup from extracted PDF tables (confidence: ~0.93)
- Section-Anchored Regex: Pattern matching within "Electrical Characteristics" sections (confidence: 0.73–0.85)
- Few-Shot LLM Verification: Targeted LLM extraction for remaining parameters (confidence: ~0.80)
Each stage processes only parameters missed by prior stages (cascaded fallback), ensuring high-confidence results are preserved first.
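The cascaded fallback can be sketched roughly as follows. This is a minimal illustration, not the project's `parameter_extractor.py`: the stage functions, the sample table, the regex pattern, and the per-stage confidence constants are all hypothetical stand-ins chosen to mirror the figures quoted above.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stage signature: (parameter name, datasheet text) -> value or None.
Stage = Callable[[str, str], Optional[str]]


def table_lookup(name: str, text: str) -> Optional[str]:
    """Stand-in for the structured table query (a real one reads extracted tables)."""
    sample_table = {"supply_voltage": "+/-3 V to +/-20 V"}  # illustrative data only
    return sample_table.get(name)


def anchored_regex(name: str, text: str) -> Optional[str]:
    """Search only the text that follows the 'Electrical Characteristics' heading."""
    _, anchor, section = text.partition("Electrical Characteristics")
    if not anchor:
        return None
    patterns = {"slew_rate": r"Slew rate\s+([\d.]+\s*V/us)"}  # hypothetical pattern
    pattern = patterns.get(name)
    match = re.search(pattern, section) if pattern else None
    return match.group(1) if match else None


def llm_fallback(name: str, text: str) -> Optional[str]:
    """Placeholder for few-shot LLM extraction; always misses in this sketch."""
    return None


# (stage label, stage function, assumed confidence attached to its results)
STAGES: Tuple[Tuple[str, Stage, float], ...] = (
    ("table", table_lookup, 0.93),
    ("regex", anchored_regex, 0.80),
    ("llm", llm_fallback, 0.80),
)


def extract(names: List[str], text: str):
    """Run each stage only on the parameters that earlier stages missed."""
    results: Dict[str, Tuple[str, str, float]] = {}
    remaining = list(names)
    for label, stage, confidence in STAGES:
        still_missing = []
        for name in remaining:
            value = stage(name, text)
            if value is None:
                still_missing.append(name)
            else:
                results[name] = (value, label, confidence)
        remaining = still_missing
    return results, remaining
```

Because a parameter leaves the `remaining` list as soon as any stage resolves it, a later, lower-confidence stage never overwrites an earlier result.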
Circuit diagram retrieval from datasheet pages using a two-stage filtering pipeline:
| Stage | Model | Function |
|---|---|---|
| L1: CLIP Zero-Shot | ViT-B/32 | Fast pre-filtering: classifies each page as circuit/non-circuit via text-image similarity, eliminating ~70% of irrelevant pages before expensive VLM inference |
| L2: Patch Embedding | Qwen3.5-2B VLM | Deep feature extraction: splits images into patches, encodes each patch into multi-vector embeddings for fine-grained content matching |
Retrieval uses MaxSim (ColPali-style late interaction): each query token embedding $q_i$ is compared against all image patch embeddings $p_j$ via dot product, and the maximum score per query token is summed: $S(q, I) = \sum_{i} \max_{j} \, q_i \cdot p_j$.
This architecture avoids the information loss of a single-vector global embedding: each image retains full patch-level granularity for precise schematic matching.
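The MaxSim scoring described above can be illustrated with a small pure-Python sketch. The function names and the toy two-dimensional embeddings are assumptions for illustration; a real index would hold VLM patch embeddings and would normally use a tensor library rather than nested lists.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_tokens: Sequence[Vector], patches: Sequence[Vector]) -> float:
    """For each query token take its best-matching patch, then sum those maxima."""
    return sum(max(dot(q, p) for p in patches) for q in query_tokens)


def rank_pages(query_tokens: Sequence[Vector],
               pages: Dict[str, Sequence[Vector]]) -> List[str]:
    """Return page ids ordered by descending MaxSim score."""
    scores = {page_id: maxsim(query_tokens, patches) for page_id, patches in pages.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: two query tokens, two candidate pages with different patch counts.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_1": [[1.0, 0.0], [0.0, 0.5]],  # one patch per query token matches well
    "page_2": [[0.5, 0.5]],              # a single averaged patch
}
```

In this toy case `page_1` scores higher than `page_2`, because each query token finds its own strongly matching patch instead of sharing one blended vector.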
Three heterogeneous retrieval paths execute concurrently, with results merged via Reciprocal Rank Fusion (Cormack et al., 2009):
| Path | Method | Weight | Strength |
|---|---|---|---|
| Dense | Sentence-Transformer + ChromaDB (HNSW) | 0.50 | Semantic similarity |
| Sparse | BM25 + jieba tokenization | 0.35 | Exact keyword match |
| Structured | SQLite FTS5 + table store | 0.15 | Preserves table structure |
All three paths execute concurrently via asyncio.gather with return_exceptions=True. Total latency equals max(T1, T2, T3) instead of T1+T2+T3. Individual path failures are tolerated: the remaining paths still return results.
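A minimal sketch of this pattern is shown below. The three search coroutines are hypothetical stubs, and the way the path weights enter the fusion (each path's weight multiplying its `1 / (k + rank)` term, with `k = 60` as in Cormack et al.) is an assumption; the source states only the weights and that RRF is used.

```python
import asyncio
from collections import defaultdict
from typing import Dict, List

RRF_K = 60  # smoothing constant from Cormack et al. (2009); assumed here
WEIGHTS = {"dense": 0.50, "sparse": 0.35, "structured": 0.15}


async def dense_search(query: str) -> List[str]:
    return ["chunk_a", "chunk_b", "chunk_c"]  # stub ranking


async def sparse_search(query: str) -> List[str]:
    return ["chunk_b", "chunk_a"]  # stub ranking


async def structured_search(query: str) -> List[str]:
    raise RuntimeError("table store unavailable")  # simulated path failure


def weighted_rrf(rankings: Dict[str, List[str]]) -> List[str]:
    """Fuse per-path rankings: score(d) = sum over paths of w / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for path, doc_ids in rankings.items():
        for rank, doc_id in enumerate(doc_ids, start=1):
            scores[doc_id] += WEIGHTS[path] / (RRF_K + rank)
    return sorted(scores, key=scores.get, reverse=True)


async def hybrid_search(query: str) -> List[str]:
    paths = {"dense": dense_search, "sparse": sparse_search, "structured": structured_search}
    # return_exceptions=True turns a failing path into an exception object
    # in the result list instead of cancelling the whole gather.
    outcomes = await asyncio.gather(*(fn(query) for fn in paths.values()),
                                    return_exceptions=True)
    rankings = {name: result for name, result in zip(paths, outcomes)
                if not isinstance(result, BaseException)}
    return weighted_rrf(rankings)
```

Calling `asyncio.run(hybrid_search("NE5532 supply voltage range"))` fuses the two surviving rankings even though the structured path raised.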
Progressive detection architecture for electrical compatibility verification:
| Layer | Detection | Method | Reference |
|---|---|---|---|
| L1 | Static Stability | JEDEC logic level + noise margin | JESD8 series |
| L2 | Signal Integrity | Transmission line reflection analysis | Bogatin, E. |
| L3 | Topology Conflict | Interface protocol + port attribute matrix | IEEE 1801 UPF |
| L4 | Environmental Degradation | Interval arithmetic + Arrhenius model | Moore (1966), JESD22-A108D |
Layer 4 uses Interval(lo, hi) primitives for uncertainty propagation through arithmetic operations, modeling temperature drift where electrical parameters expand from crisp values to intervals.
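As a rough illustration of the idea, the sketch below defines an `Interval(lo, hi)` type with the standard interval-arithmetic rules for addition, subtraction, and multiplication. The class body and the noise-margin helper are assumptions written for this example, not the contents of `erc_engine.py`, and the voltage values are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] representing an uncertain parameter."""
    lo: float
    hi: float

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("lo must not exceed hi")

    def __add__(self, other: "Interval") -> "Interval":
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other: "Interval") -> "Interval":
        # Smallest difference pairs our low with their high, and vice versa.
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other: "Interval") -> "Interval":
        products = (self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi)
        return Interval(min(products), max(products))


def worst_case_high_noise_margin(voh: Interval, vih: Interval) -> float:
    """Lower bound of VOH - VIH: the margin that must stay positive."""
    return (voh - vih).lo


# Illustrative values: a driver VOH and receiver VIH widened by temperature drift.
voh = Interval(2.25, 2.5)
vih = Interval(1.75, 2.0)
margin = worst_case_high_noise_margin(voh, vih)
```

Working with the lower bound of the difference means a check passes only if it holds for every combination of values inside the two intervals.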
| Layer | Function | Method |
|---|---|---|
| CCM | Test condition normalization | Semiconductor physics linear equivalent conversion |
| Z-A-FoM | Reliability fusion | Z-number (Zadeh, 2011) + Kang conversion |
| B-SPOTIS | Robust decision | MEREC objective weighting + SPOTIS (Dezert et al., 2020) |
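To give a feel for the decision layer, here is a sketch of plain SPOTIS scoring only: each alternative is scored by its weighted, range-normalized distance to the ideal solution point, and lower scores rank higher. It deliberately does not reproduce the CCM normalization, the Z-number fusion, the B-SPOTIS variant, or MEREC weighting; the weights are supplied by hand and the device values are invented for illustration.

```python
from typing import List, Sequence, Tuple


def spotis(matrix: Sequence[Sequence[float]],
           bounds: Sequence[Tuple[float, float]],
           weights: Sequence[float],
           benefit: Sequence[bool]) -> List[float]:
    """Return one score per alternative; smaller means closer to the ideal point."""
    for lo, hi in bounds:
        if hi <= lo:
            raise ValueError("each criterion needs bounds with lo < hi")
    scores = []
    for row in matrix:
        total = 0.0
        for value, (lo, hi), weight, is_benefit in zip(row, bounds, weights, benefit):
            ideal = hi if is_benefit else lo  # best reachable value for this criterion
            total += weight * abs(value - ideal) / (hi - lo)
        scores.append(total)
    return scores


# Hypothetical comparison: column 0 is a benefit criterion, column 1 a cost criterion.
devices = [
    [10.0, 5.0],   # device_a
    [1.0, 40.0],   # device_b
]
bounds = [(0.0, 20.0), (0.0, 40.0)]
weights = [0.5, 0.5]      # hand-picked; an objective method would derive these
benefit = [True, False]

scores = spotis(devices, bounds, weights, benefit)
ranking = sorted(range(len(scores)), key=scores.__getitem__)  # best first
```

Because the reference point comes from fixed criterion bounds rather than from the set of alternatives, adding or removing a device does not change the scores of the others.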
Intent-driven orchestration layer built on LangGraph's StateGraph, routing user queries through domain-specific processing pipelines:
START → intent_router ─┬─ qa → text_retrieval → response_generation → END
                       └─ pinout → pinout_node → END
Design choices grounded in engineering constraints:
- Regex-based intent routing over LLM classification: no added inference latency, zero cost, fully interpretable; sufficient for the current 2-intent space (qa / pinout). ERC check and device comparison bypass the router entirely since their inputs are structured parameters, not free-form natural language.
- DAG topology guarantee: the workflow is defined as an acyclic graph, so every run terminates; no infinite loops or stuck states.
- Graceful degradation: individual node failures return fallback responses rather than crashing the pipeline.
- Compiled graph reuse: the workflow is compiled once at startup (read-only, thread-safe) and invoked concurrently across sessions.
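The routing and fallback behaviour described in these points can be mimicked in plain Python, as sketched below. This is not LangGraph code: the regex, the node bodies, and the dictionary-based dispatch are hypothetical stand-ins that only mirror the two branches of the graph shown earlier.

```python
import re
from typing import Any, Callable, Dict, List

State = Dict[str, Any]

# Hypothetical pattern; the project's actual routing rules are not shown in this README.
PINOUT_PATTERN = re.compile(
    r"\bpin\s?out\b|\bpin\s+(?:definition|diagram|assignment)s?\b", re.IGNORECASE
)


def intent_router(query: str) -> str:
    """Route to 'pinout' on a keyword hit, otherwise default to 'qa'."""
    return "pinout" if PINOUT_PATTERN.search(query) else "qa"


def text_retrieval(state: State) -> State:
    state["trace"].append("text_retrieval")
    return state


def response_generation(state: State) -> State:
    state["trace"].append("response_generation")
    state["response"] = "answer with citations"  # placeholder output
    return state


def pinout_node(state: State) -> State:
    state["trace"].append("pinout_node")
    state["response"] = "pin table"  # placeholder output
    return state


# Each intent maps to a fixed, finite node sequence, so every run terminates.
PIPELINES: Dict[str, List[Callable[[State], State]]] = {
    "qa": [text_retrieval, response_generation],
    "pinout": [pinout_node],
}


def run(query: str) -> State:
    state: State = {"query": query, "intent": intent_router(query), "trace": ["intent_router"]}
    for node in PIPELINES[state["intent"]]:
        try:
            state = node(state)
        except Exception as exc:  # graceful degradation: answer with a fallback
            state["response"] = f"fallback: {exc}"
            break
    return state
```

For example, `run("LM358 pin diagram")` visits only `pinout_node`, while `run("NE5532 supply voltage range?")` goes through retrieval and generation.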
The quantitative evaluation of VeriQuery is currently underway. We are running benchmark tests across the following dimensions, and the detailed experimental results will be published in our upcoming academic paper:
- Extraction Accuracy: Evaluating the F1-score of our cascaded parameter extraction pipeline against baseline LLMs (e.g., direct prompt-based extraction).
- Retrieval Robustness: Measuring the Recall@K improvements brought by the RRF hybrid retrieval strategy on heterogeneous datasheet PDFs.
- Reasoning Reliability: Validating the accuracy of the 4-layer ERC engine against standard JEDEC/IEEE edge-case scenarios.
Stay tuned for the full technical report and evaluation dataset.
Project Structure
veriquery/
├── api/                          # FastAPI backend
│   ├── main.py                   # Application entry, lifespan, middleware
│   ├── dependencies.py           # Service container, dependency injection
│   ├── error_handlers.py         # Global exception handlers
│   └── routers/
│       ├── chat.py               # Intelligent Q&A endpoints
│       ├── circuit.py            # Circuit retrieval endpoints
│       ├── compare.py            # Device comparison endpoints
│       ├── documents.py          # Document management endpoints
│       ├── erc.py                # ERC check endpoints
│       └── pinout.py             # Pinout analysis endpoints
├── agents/                       # LangGraph agent workflow
│   ├── workflow_graph.py         # DAG topology and intent routing
│   ├── workflow_nodes.py         # Intent routing, retrieval, generation
│   ├── comparison_node.py        # Multi-device comparison orchestration
│   └── erc_node.py               # ERC check orchestration
├── core/                         # Shared infrastructure
│   ├── config.py                 # Pydantic Settings, singleton configuration
│   ├── schema.py                 # Unified data models (AgentState, PinInfo, etc.)
│   ├── llm_client.py             # HuggingFace LLM client with quantization
│   ├── svg_renderer.py           # Pinout SVG diagram renderer
│   ├── memory_manager.py         # GPU memory management
│   ├── model_manager.py          # Model lifecycle management
│   ├── cleanup_manager.py        # Orphan data cleanup
│   ├── sqlite_utils.py           # SQLite health check and repair
│   └── exceptions.py             # Custom exception hierarchy
├── ingestion/                    # Document processing pipeline
│   ├── document_processor.py     # PDF parsing, CLIP filtering, table orchestration
│   └── image_indexer.py          # CLIP + VLM + MaxSim visual indexing
├── extraction/                   # Parameter and table extraction
│   ├── parameter_extractor.py    # Three-stage cascaded parameter extraction
│   └── table_extractor.py        # Three-layer table extraction (Camelot/pdfplumber)
├── retrieval/                    # Hybrid retrieval subsystem
│   ├── hybrid_retriever.py       # RRF fusion orchestrator
│   ├── vector_store.py           # ChromaDB dense retrieval
│   ├── bm25_store.py             # BM25 sparse retrieval
│   ├── table_store.py            # SQLite FTS5 structured retrieval
│   └── embeddings.py             # Sentence-Transformer embedding service
├── reasoning/                    # Formal reasoning engines
│   ├── erc_engine.py             # Four-layer ERC with interval arithmetic
│   └── parameter_scorer.py       # CCM + Z-A-FoM + B-SPOTIS scoring
├── knowledge/                    # Domain knowledge base
│   ├── graph_db.py               # SQLite knowledge graph schema (chips → pins → parameters)
│   ├── graph_query.py            # Knowledge graph query engine (3-level fallback)
│   ├── chip_importer.py          # Chip data import pipeline
│   └── pinout_library.py         # Built-in common chip pinout database
├── ui/                           # Streamlit frontend
│   ├── app.py                    # Main page with navigation cards
│   ├── api_client.py             # Backend API client
│   ├── theme.py                  # Academic-style CSS theme
│   ├── sidebar_nav.py            # Sidebar navigation and document selector
│   └── pages/
│       ├── 1_Documents.py        # Document management page
│       ├── 2_Chat.py             # Intelligent Q&A page
│       ├── 3_Pinout.py           # Pinout analysis page
│       ├── 4_ERC.py              # ERC check page
│       ├── 5_Compare.py          # Parameter comparison page
│       └── 6_Circuit.py          # Circuit retrieval page
├── docs/                         # Screenshots and documentation assets
├── data/                         # Runtime data (gitignored)
├── pyproject.toml                # Project metadata and build config
├── requirements.txt              # Python dependencies
├── .env.example                  # Environment variable template
├── logging.yaml                  # Logging config template (reference)
└── start.ps1                     # Windows startup script
| Category | Technology | Purpose |
|---|---|---|
| Backend Framework | FastAPI + Uvicorn | Async REST API with OpenAPI docs |
| Frontend Framework | Streamlit | Multi-page interactive UI |
| Agent Workflow | LangGraph | DAG-based stateful workflow orchestration |
| LLM | Qwen3.5 (HuggingFace) | Local inference with 4-bit quantization, natively multimodal |
| VLM | Qwen3.5 (HuggingFace) | Natively multimodal model for diagram understanding |
| Embedding | BGE / Qwen-Embedding (HuggingFace) | Dense text vectorization (1024-dim) |
| Vector DB | ChromaDB | HNSW approximate nearest neighbor search |
| Sparse Retrieval | rank_bm25 + jieba | BM25 keyword matching with Chinese tokenization |
| Structured Retrieval | SQLite + FTS5 | Full-text search over table data |
| PDF Processing | PyMuPDF + pdfplumber + Camelot | Text extraction, table extraction, image rendering |
| Vision | CLIP + PIL | Image classification and filtering |
| Knowledge Graph | SQLite | Chip-pin-parameter relational storage |
| Configuration | Pydantic Settings | Type-safe env-based configuration |
| Logging | Python logging + RotatingFileHandler | Structured logging with rotation |
- Python 3.10+
- CUDA-capable GPU (recommended, 4GB+ VRAM for minimum models)
- Git
git clone https://github.com/FinalSunFlower/Veriquery.git
cd Veriquery
python -m venv .venv
# Windows:
.venv\Scripts\activate
# Linux/macOS:
source .venv/bin/activate
pip install -r requirements.txt
All models are automatically downloaded from HuggingFace on first run; no manual download is required. Configure which models to use in .env:
cp .env.example .env
| Model | Config Key | Default | VRAM | Purpose |
|---|---|---|---|---|
| LLM | LLM_MODEL | Qwen/Qwen3.5-0.8B | ~1GB | Text generation (natively multimodal) |
| VLM | VLM_MODEL | Qwen/Qwen3.5-2B | ~2GB | Diagram understanding (natively multimodal) |
| Embedding | EMBEDDING_MODEL | BAAI/bge-large-zh-v1.5 | ~1GB | Text vectorization |
| CLIP | CLIP_MODEL | openai/clip-vit-base-patch32 | ~0.5GB | Image classification |
VRAM recommendations by GPU:
| GPU VRAM | Recommended LLM | Recommended VLM |
|---|---|---|
| 4GB | Qwen/Qwen3.5-0.8B | Qwen/Qwen3.5-0.8B |
| 8GB | Qwen/Qwen3.5-2B | Qwen/Qwen3.5-2B |
| 12GB+ | Qwen/Qwen3.5-4B | Qwen/Qwen3.5-4B |
Tip: If you have limited VRAM, set EMBEDDING_DEVICE=cpu and LLM_QUANTIZE=true in .env.
Windows (automated):
.\start.ps1
Manual start:
# Terminal 1: Start backend API
python -m api.main
# Terminal 2: Start frontend UI
streamlit run ui/app.py --server.port 8501
Access the application:
- Frontend: http://localhost:8501
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/docs
- Health Check: http://localhost:8000/health
- Upload a datasheet: Navigate to the Documents page and upload a PDF datasheet (e.g., NE5532, LM358)
- Ask questions: Go to Chat and type natural language queries like "NE5532 supply voltage range?"
- View pinouts: Use the Pinout page to see automatically generated SVG pin diagrams
- Run ERC: Check electrical compatibility between driver and receiver chips
- Compare devices: Select multiple devices for parameter comparison with scoring
- Find circuits: Search for application circuit diagrams from uploaded datasheets
API Reference
The backend exposes RESTful endpoints under /api/v1:
| Endpoint | Method | Description |
|---|---|---|
| /api/v1/documents/ | GET | List documents |
| /api/v1/documents/upload | POST | Upload document |
| /api/v1/chat/ | POST | Intelligent Q&A with RAG |
| /api/v1/chat/stream | POST | Streaming Q&A response |
| /api/v1/pinout/ | POST | Pin definition extraction |
| /api/v1/erc/check | POST | Four-layer ERC compatibility check |
| /api/v1/compare/devices-enhanced | POST | Multi-device parameter comparison |
| /api/v1/circuit/search | POST | Multi-modal circuit diagram search |
| /health | GET | System health check |
Full interactive documentation is available at /docs (Swagger UI) and /redoc (ReDoc).
All configuration is managed through environment variables (.env file), with sensible defaults in core/config.py. Key configuration groups:
| Group | Key Variables | Default |
|---|---|---|
| LLM | LLM_MODEL, LLM_DEVICE, LLM_QUANTIZE | Qwen/Qwen3.5-0.8B, cuda, true |
| VLM | VLM_MODEL, VLM_QUANTIZE | Qwen/Qwen3.5-2B, true |
| Embedding | EMBEDDING_MODEL, EMBEDDING_DIMENSION | BAAI/bge-large-zh-v1.5, 1024 |
| Retrieval | VECTOR_WEIGHT, BM25_WEIGHT, STRUCTURED_WEIGHT | 0.50, 0.35, 0.15 |
| Chunking | CHUNK_SIZE, CHUNK_OVERLAP | 800, 200 |
| Storage | CHROMA_PERSIST_DIR, DATA_DIR | ./data/chroma, ./data |
See .env.example for the complete list of configurable parameters.
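The idea of env-driven settings with defaults and startup validation can be sketched with the standard library alone, as below. The project itself uses Pydantic Settings; this dataclass is only a dependency-free stand-in, and the two validation rules (weights summing to 1, overlap smaller than chunk size) are assumptions rather than documented behaviour.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RetrievalSettings:
    """Defaults mirror the Retrieval and Chunking rows of the table above."""
    vector_weight: float = 0.50
    bm25_weight: float = 0.35
    structured_weight: float = 0.15
    chunk_size: int = 800
    chunk_overlap: int = 200

    def __post_init__(self):
        total = self.vector_weight + self.bm25_weight + self.structured_weight
        if abs(total - 1.0) > 1e-6:  # assumed rule, not confirmed by the source
            raise ValueError(f"retrieval weights must sum to 1.0, got {total}")
        if not 0 <= self.chunk_overlap < self.chunk_size:  # assumed rule
            raise ValueError("CHUNK_OVERLAP must be in [0, CHUNK_SIZE)")


def load_settings(env: Mapping[str, str] = os.environ) -> RetrievalSettings:
    """Read overrides from environment variables, falling back to the defaults."""
    defaults = RetrievalSettings()
    return RetrievalSettings(
        vector_weight=float(env.get("VECTOR_WEIGHT", defaults.vector_weight)),
        bm25_weight=float(env.get("BM25_WEIGHT", defaults.bm25_weight)),
        structured_weight=float(env.get("STRUCTURED_WEIGHT", defaults.structured_weight)),
        chunk_size=int(env.get("CHUNK_SIZE", defaults.chunk_size)),
        chunk_overlap=int(env.get("CHUNK_OVERLAP", defaults.chunk_overlap)),
    )
```

Passing a plain dictionary, for example `load_settings({"CHUNK_SIZE": "400"})`, makes it easy to try overrides without touching the real environment.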
- Citation Tracing: Every factual statement carries a source citation (file, page, text snippet) for verification
- Graceful Degradation: Each subsystem initializes independently; a single component failure does not crash the system
- Lazy Loading: GPU models load on first use, avoiding startup memory pressure
- Async Concurrency: Retrieval paths execute concurrently via asyncio.gather; total latency = max(T1, T2, T3)
- Singleton Pattern: Thread-safe model instances via double-checked locking
- Configuration Validation: Pydantic validates all settings at startup with clear error messages
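The lazy-loading and singleton principles above combine naturally; a minimal sketch of double-checked locking is shown here. `ModelHolder` and its `_load` method are hypothetical names, and the "model" is a placeholder object rather than a real GPU model.

```python
import threading


class ModelHolder:
    """Lazily creates one shared model instance, safe under concurrent access."""
    _instance = None
    _lock = threading.Lock()
    load_count = 0  # counts how often the expensive load actually ran

    @classmethod
    def get(cls):
        if cls._instance is None:              # first check: skip the lock once loaded
            with cls._lock:
                if cls._instance is None:      # second check: another thread may have won
                    cls._instance = cls._load()
        return cls._instance

    @classmethod
    def _load(cls):
        """Stand-in for an expensive model load that should happen only once."""
        cls.load_count += 1
        return object()


def fetch_concurrently(n_threads: int = 8):
    """Call ModelHolder.get() from several threads and collect what each one saw."""
    seen = []
    threads = [threading.Thread(target=lambda: seen.append(ModelHolder.get()))
               for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return seen
```

Nothing is loaded until the first `get()` call, and the unlocked first check keeps later calls cheap once the instance exists.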
If you use VeriQuery in your research, please cite:
@misc{veriquery2026,
author = {VeriQuery Team},
title = {VeriQuery: Multimodal Agentic RAG with Graph-Enhanced Knowledge Base for Hardware Datasheet Intelligent Q\&A and Verification System},
year = {2026},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/FinalSunFlower/Veriquery}}
}
This project is licensed under the MIT License; see the LICENSE file for details.

