# Verbatim RAG: Structured Templates & VerbatimDoc

This notebook demonstrates the two new features in v0.1.8:

1. **Structured Templates** - Control exactly how your answers are formatted with semantic placeholders like `[CONTRIBUTIONS]`, `[METHODOLOGY]`, `[RESULTS]`

2. **VerbatimDoc** - Generate complete documents from templates with embedded queries like `[!query=what methodology was used]`

Every fact in the output is quoted verbatim from your documents, so each statement can be traced back to its source and hallucinated content is kept out of the answer.
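To make the verbatim property concrete, here is a minimal sketch of how you could check it yourself. It is not part of `verbatim-rag`: the helper names `is_verbatim` and `unverified_spans` are hypothetical, and the sample texts are toy data. The sketch assumes a span counts as verbatim when it appears character for character in at least one source text.

```python
def is_verbatim(span: str, sources: list[str]) -> bool:
    """Return True if `span` occurs character for character in any source text."""
    return any(span in text for text in sources)


def unverified_spans(spans: list[str], sources: list[str]) -> list[str]:
    """Collect the spans that cannot be traced back to a source text."""
    return [span for span in spans if not is_verbatim(span, sources)]


# Toy data, not taken from the indexed paper.
sources = ["The cat sat on the mat. The dog slept by the door."]
spans = ["The cat sat on the mat.", "The cat sat on a rug."]

print(unverified_spans(spans, sources))  # ['The cat sat on a rug.']
```

An exact substring test is deliberately strict: any paraphrase, even a small one, is flagged as unverified.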

For a comprehensive tutorial, see [build_verbatim.ipynb](./build_verbatim.ipynb).

## Setup

In [1]:
!pip install "verbatim-rag>=0.1.8" -q

In [None]:
import os
import logging

# Suppress verbose logs
logging.getLogger("transformers").setLevel(logging.ERROR)
logging.getLogger("milvus").setLevel(logging.WARNING)

# Set your OpenAI API key
os.environ["OPENAI_API_KEY"] = "your-key-here"

In [3]:
from verbatim_rag import VerbatimRAG, VerbatimIndex
from verbatim_rag.ingestion import DocumentProcessor
from verbatim_rag.vector_stores import LocalMilvusStore
from verbatim_rag.embedding_providers import SpladeProvider
from verbatim_rag.llm_client import LLMClient

store = LocalMilvusStore(db_path="./v018_demo.db", enable_sparse=True, enable_dense=False)

llm_client = LLMClient(
    model="gpt-5.1",
    temperature=1.0,
)

sparse_provider = SpladeProvider(
    model_name="opensearch-project/opensearch-neural-sparse-encoding-doc-v2-distill",
    device="cpu"
)

index = VerbatimIndex(vector_store=store, sparse_provider=sparse_provider)

# Add a sample research paper
doc = DocumentProcessor().process_url(
    "https://aclanthology.org/2025.bionlp-share.8.pdf",
    title="KR Labs at ArchEHR-QA 2025"
)
index.add_documents([doc])

# Create RAG instance
rag = VerbatimRAG(index, llm_client=llm_client)
rag.template_manager.citation_mode = "hidden"

2025-12-05 22:26:40,750 - INFO - Created indexes for collection: verbatim_rag
2025-12-05 22:26:40,753 - INFO - Created documents collection: verbatim_rag_documents
2025-12-05 22:26:40,754 - INFO - Connected to Milvus Lite: ./v018_demo.db
2025-12-05 22:26:40,861 - INFO - PyTorch version 2.8.0 available.
2025-12-05 22:26:41,349 - INFO - Load pretrained SparseEncoder: opensearch-project/opensearch-neural-sparse-encoding-doc-v2-distill
2025-12-05 22:26:44,644 - INFO - Loaded SPLADE model: opensearch-project/opensearch-neural-sparse-encoding-doc-v2-distill
2025-12-05 22:26:45,640 - INFO - detected formats: [<InputFormat.PDF: 'pdf'>]
2025-12-05 22:26:45,689 - INFO - Going to convert document batch...
2025-12-05 22:26:45,690 - INFO - Initializing pipeline for StandardPdfPipeline with options hash e647edf348883bed75367b22fbe60347
2025-12-05 22:26:45,698 - INFO - Loading plugin 'docling_defaults'

---

## Feature 1: Structured Templates

Control exactly how your answers are formatted. Define sections with semantic placeholders and get a structured response with each section filled from your sources.

**How it works:**
- You provide a template with placeholders like `[CONTRIBUTIONS]`, `[METHODOLOGY]`, `[RESULTS]`
- You ask a single question
- The LLM extracts the relevant verbatim spans for each placeholder
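The placeholder mechanics can be illustrated without the library. The sketch below is an assumption about the general idea, not the `verbatim-rag` implementation: `list_placeholders` and `fill_template` are hypothetical helpers, and the span dictionary stands in for what the LLM would extract. The values mirror the output of the cell that follows.

```python
import re

# Matches semantic placeholders such as [BASELINE_F1]; the naming rule
# (upper-case letters, digits, underscores) is an assumption of this sketch.
PLACEHOLDER = re.compile(r"\[([A-Z][A-Z0-9_]*)\]")


def list_placeholders(template: str) -> list[str]:
    """Return placeholder names in order of first appearance, without duplicates."""
    seen: list[str] = []
    for name in PLACEHOLDER.findall(template):
        if name not in seen:
            seen.append(name)
    return seen


def fill_template(template: str, spans: dict[str, str], missing: str = "N/A") -> str:
    """Replace each placeholder with its extracted span, or `missing` if none was found."""
    return PLACEHOLDER.sub(lambda m: spans.get(m.group(1), missing), template)


row = "| F1 | [BASELINE_F1] | [OUR_F1] |"
print(list_placeholders(row))  # ['BASELINE_F1', 'OUR_F1']
print(fill_template(row, {"BASELINE_F1": "33.6", "OUR_F1": "52.1"}))  # | F1 | 33.6 | 52.1 |
```

Falling back to a visible marker such as `N/A` makes a placeholder with no supporting span easy to spot, rather than silently leaving a gap.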

In [7]:
# Define your structured template
rag.template_manager.use_structured_mode(template="""
| Metric | Baseline | Ours |
|--------|----------|------|
| F1 | [BASELINE_F1] | [OUR_F1] |
""")

# Single query - template guides what to extract
response = await rag.query_async("What is the F1 score of the baseline and our model?")
print(response.answer)

Batches: 100%|██████████| 1/1 [00:00<00:00, 10.99it/s]
2025-12-05 22:29:13,823 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"


| Metric | Baseline | Ours |
|--------|----------|------|
| F1 | 33.6 | 52.1 |


---

## Feature 2: VerbatimDoc

Generate complete documents from templates with embedded queries. Each `[!query=...]` becomes a separate RAG query, and the results are composed into a final document.

**How it works:**
- You write a document template with embedded queries
- Each query is processed independently
- Results are inserted with global citation numbering
- Useful for research summaries, reports, literature reviews
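The steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not how `VerbatimDOC` is implemented: `compose` and `fake_answer` are hypothetical names, the queries run sequentially rather than asynchronously, and a canned lookup replaces the real RAG call.

```python
import re
from typing import Callable

# Matches embedded queries such as [!query=what methodology was used].
QUERY = re.compile(r"\[!query=([^\]]+)\]")


def compose(template: str, answer: Callable[[str], list[str]]) -> str:
    """Replace every embedded query with its spans, numbering citations globally."""
    counter = 0

    def replace(match: re.Match) -> str:
        nonlocal counter
        parts = []
        for span in answer(match.group(1).strip()):
            counter += 1  # one counter shared across all queries in the document
            parts.append(f"[{counter}] {span}")
        return " ".join(parts)

    return QUERY.sub(replace, template)


def fake_answer(query: str) -> list[str]:
    """Hypothetical stand-in for a RAG call; returns canned spans per query."""
    canned = {"first question": ["span A", "span B"], "second question": ["span C"]}
    return canned.get(query, [])


doc = compose("## One\n[!query=first question]\n\n## Two\n[!query=second question]\n", fake_answer)
print(doc)
```

Because the counter lives outside the per-query replacement, the second section continues at `[3]` instead of restarting at `[1]`.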

In [9]:
from verbatim_rag.verbatim_doc import VerbatimDOC

# Define document template with embedded queries
template = """
# Quick Summary

## What's New?
[!query=what is the main contribution]

## How?
[!query=what methodology was used]

## Results?
[!query=what accuracy was achieved]
"""

# Generate the document
result = await VerbatimDOC(rag).process(template, auto_approve=True)
print(result.answer)

Batches: 100%|██████████| 1/1 [00:00<00:00, 10.46it/s]


Extracting spans (async batch mode)...


Batches: 100%|██████████| 1/1 [00:00<00:00, 25.60it/s]


Extracting spans (async batch mode)...


Batches: 100%|██████████| 1/1 [00:00<00:00, 26.15it/s]


Extracting spans (async batch mode)...


2025-12-05 22:30:02,457 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-12-05 22:30:02,708 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-12-05 22:30:03,708 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"



# Quick Summary

## What's New?
[1] Our contributions include a modular, traceable QA architecture that mitigates hallucinations, a method to generate synthetic EHR question-answer corpus and train custom models. Additionally, we are releasing all the code on GitHub 2 under the MIT License.

## How?
[2] To tackle this problem, we propose a verbatim pipeline that clearly separates extraction and generation to mitigate hallucinations:

- Sentence-level extraction, using either zeroshot LLMs or supervised ModernBERT classifiers.
- Template-constrained generation, dynamically creating answer templates filled exclusively with verbatim sentences selected from the extraction phase.

We participated in the ArchEHR-QA 2025 shared task on grounded question answering (QA) from electronic health records (EHRs). Our approach involved (i) utilizing a zero-shot gemma-3-27b-it 1 LLM (Team et al., 2025) and (ii) generating synthetic data for sentence extraction from EHRs to train a compact extrac

---

## Structured Templates vs VerbatimDoc

| | Structured Templates | VerbatimDoc |
|---|---|---|
| **Queries** | 1 query, template guides extraction | N independent queries |
| **Template** | Semantic placeholders: `[METHODOLOGY]` | Embedded queries: `[!query=...]` |
| **Use case** | Structured extraction from same context | Multi-section documents with different questions |
| **Best for** | Summaries, analysis | Reports, literature reviews |

---

## Learn More

- **Full Tutorial:** [build_verbatim.ipynb](./build_verbatim.ipynb)
- **GitHub:** [github.com/KRLabsOrg/verbatim-rag](https://github.com/KRLabsOrg/verbatim-rag)
- **Blog Post:** [huggingface.co/blog/adaamko/verbatimrag](https://huggingface.co/blog/adaamko/verbatimrag)