A RAG architecture that uses hierarchical semantic chunking and graph-based context exclusion to maximize the amount of relevant information in the retrieved context while minimizing the total volume.
Traditional RAG systems face a critical trade-off: retrieving larger chunks provides more context but includes irrelevant information, while smaller chunks are more focused but may lack necessary context. NestedRAG addresses this by optimizing the balance between relevance and chunk size.
NestedRAG dynamically selects optimally-sized chunks by:
- **Hierarchical Tree Structure**: Documents are recursively split into semantic units, creating a tree where each branch represents nested text segments at different granularities
- **Branch-Level Selection**: During retrieval, only one datapoint per branch is selected, ensuring we get the most relevant segment from each hierarchical chain
- Each query gets optimally-sized chunks based on where relevance lies in the hierarchy
- Higher relevant-to-total information ratio
- No nested or overlapping chunks in results
- Retrieves from diverse document branches automatically
- **Increased number of data points**: The algorithm produces `s^l` distinct data points, where `s` is the number of semantic chunks per level and `l` is the number of levels.
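As a quick sanity check of that count (a standalone sketch; the helper names are illustrative, not part of the library), the `s^l` figure counts the finest-grained level, while summing the geometric series over all levels gives every node in the tree:

```python
def leaf_data_points(s: int, l: int) -> int:
    """Distinct data points at the deepest level: s chunks per level, l levels."""
    return s ** l

def total_nodes(s: int, l: int) -> int:
    """All nodes in the hierarchy, counting the root and every intermediate chunk."""
    return sum(s ** i for i in range(l + 1))

# With the library defaults (s=2 chunks per level, l=6 levels):
print(leaf_data_points(2, 6))  # 64
print(total_nodes(2, 6))       # 127
```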
This architecture was developed for use cases where the source data is long, unstructured text, such as conversation transcripts or long-form articles.
```
Root (Full Document)
├── Level 1 Chunk A
│   ├── Level 2 Chunk A1
│   │   └── Level 3 Chunk A1a
│   └── Level 2 Chunk A2
└── Level 1 Chunk B
    ├── Level 2 Chunk B1
    └── Level 2 Chunk B2
```
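A toy version of this hierarchical splitting can be sketched in a few lines. This is illustrative only: it splits the sentence list into even contiguous groups, whereas NestedRAG chooses semantic boundaries; the sentence regex here matches the library's documented default, but `build_tree` and its output shape are assumptions for the sketch.

```python
import re

SENTENCE_SPLIT = r"(?<=[.!])\s+"  # matches the library's default splitter

def build_tree(text: str, num_chunks: int = 2, depth: int = 2) -> dict:
    """Recursively split text into `num_chunks` contiguous sentence groups per level."""
    sentences = re.split(SENTENCE_SPLIT, text.strip())

    def recurse(sents: list, level: int) -> dict:
        node = {"level": level, "text": " ".join(sents)}
        if level < depth and len(sents) >= num_chunks:
            size = -(-len(sents) // num_chunks)  # ceiling division
            node["children"] = [
                recurse(sents[i:i + size], level + 1)
                for i in range(0, len(sents), size)
            ]
        return node

    return recurse(sentences, 0)

tree = build_tree("First point. Second point. Third point. Fourth point.", depth=1)
print([child["text"] for child in tree["children"]])
# ['First point. Second point.', 'Third point. Fourth point.']
```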
- **Search**: Find the most semantically similar chunk via vector search
- **Identify**: Get all ancestors and descendants of this chunk in the graph
- **Exclude**: Mark these nodes as excluded from future searches
- **Repeat**: Continue until the desired number of chunks is retrieved
- **Result**: Return diverse chunks from different document branches
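The loop above can be sketched with a stand-in similarity score (in the real system, the search step is a vector query against Qdrant; here node ids encode the branch path, so ancestors and descendants reduce to id prefixes and extensions, which is an illustrative shortcut, not the library's mechanism):

```python
def retrieve_with_exclusion(scores: dict, limit: int) -> list:
    """Greedy branch-exclusion retrieval: pick the best-scoring chunk, then
    exclude its ancestors and descendants before the next pick."""
    nodes = list(scores)
    excluded = set()
    results = []
    while len(results) < limit:
        candidates = [n for n in nodes if n not in excluded]
        if not candidates:
            break
        best = max(candidates, key=scores.get)
        results.append(best)
        # Same-branch nodes: ancestors are id prefixes, descendants extend the id.
        excluded |= {n for n in nodes if best.startswith(n) or n.startswith(best)}
    return results

# Toy similarity scores; ids encode the branch path ("A.0" is a child of "A").
scores = {"A": 0.5, "A.0": 0.9, "A.1": 0.4, "B": 0.6, "B.0": 0.3, "B.1": 0.7}
print(retrieve_with_exclusion(scores, limit=2))  # ['A.0', 'B.1']
```

Note that picking `A.0` rules out its parent `A` but not its sibling `A.1`, so results never contain nested or overlapping chunks while still drawing from every branch.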
Option 1: Clone and install (for development)

```shell
git clone https://github.com/darwinapps/nested-rag.git
cd nested-rag
pip install -e .
```

Option 2: Install directly with pip

```shell
pip install git+https://github.com/darwinapps/nested-rag.git
```

NestedRAG requires:
- Python 3.9+
- langchain-core
- langchain-qdrant
- qdrant-client
- networkx
```python
from nested_rag import NestedRAG
from langchain_qdrant import QdrantVectorStore
from langchain_openai import OpenAIEmbeddings
from qdrant_client import QdrantClient

# Initialize embeddings and vector store
embeddings = OpenAIEmbeddings()
client = QdrantClient(":memory:")  # Or use persistent storage
vector_store = QdrantVectorStore(
    client=client,
    collection_name="my_documents",
    embedding=embeddings,
)

# Create NestedRAG instance
rag = NestedRAG(
    vector_store=vector_store,
    embedding=embeddings,
    max_depth=6,  # Maximum hierarchy depth
    num_semantic_chunks=2,  # Chunks per level
)

# Ingest a document
document_text = """
Your long document text here...
This could be a research paper, article, documentation, etc.
"""
rag.ingest_document(
    document_text=document_text,
    document_label="my_doc_1",
    save_graph=True,
)

# Retrieve relevant chunks
query = "What is the main contribution?"
results = rag.retrieve(
    query=query,
    limit=5,  # Number of chunks to retrieve
)

# Use results with your LLM
for i, doc in enumerate(results):
    print(f"Chunk {i+1}:")
    print(doc.page_content)
    print(f"Metadata: {doc.metadata}")
    print("-" * 80)
```

```python
# Retrieve only from specific documents
results = rag.retrieve(
    query="Your query",
    limit=5,
    custom_filters={"label": "my_doc_1"},
)
```

```python
# Load a previously saved document graph
graph = rag.load_graph("my_doc_1")
print(f"Loaded graph with {len(graph.nodes)} nodes")
```

```python
# View graph statistics
stats = rag.get_statistics()
print(f"Total nodes: {stats['total_nodes']}")
print(f"Actual max depth: {stats['actual_max_depth']}")
print(f"Nodes per level: {stats['nodes_per_level']}")
```

| Parameter | Default | Description |
|---|---|---|
| `max_depth` | `6` | Maximum depth of hierarchical chunking |
| `num_semantic_chunks` | `2` | Number of chunks to create at each level |
| `sentence_split_regex` | `r"(?<=[.!])\s+"` | Regex pattern for sentence splitting |
| `graphs_path` | `"./graphs"` | Directory for persisting graph structures |
| `internal_node_id_field` | `"node_id"` | Metadata field for internal node IDs |
| `node_id_for_llm_field` | `"node_id_for_llm"` | Metadata field for LLM-facing IDs |
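For example, the default `sentence_split_regex` breaks on periods and exclamation marks but not question marks; a variant pattern (demonstrated here with Python's `re` directly, not through the library) also splits on `?`:

```python
import re

DEFAULT_SPLIT = r"(?<=[.!])\s+"   # library default from the table above
WITH_QUESTIONS = r"(?<=[.?!])\s+" # variant that also breaks on '?'

text = "Is this split? Yes! It is."
print(re.split(DEFAULT_SPLIT, text))    # ['Is this split? Yes!', 'It is.']
print(re.split(WITH_QUESTIONS, text))   # ['Is this split?', 'Yes!', 'It is.']
```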
See the examples/ directory for:
- `basic_usage.py`: Simple end-to-end example
- `advanced_usage.ipynb`: Jupyter notebook with visualizations
- `comparison.py`: Comparison with traditional RAG approaches
- `multi_document.py`: Handling multiple documents
Initialize the NestedRAG system.
Parameters:
- `vector_store` (QdrantVectorStore): Vector store for document storage
- `embedding` (Embeddings): Embedding model for semantic chunking
- `graph` (Optional[DiGraph]): Existing document hierarchy graph
- `max_depth` (int): Maximum depth for hierarchical chunking
- `num_semantic_chunks` (int): Number of chunks per level
- Additional configuration parameters...
Ingest and process a document.
Parameters:
- `document_text` (str): Full document text
- `document_label` (str): Unique identifier for the document
- `save_graph` (bool): Whether to persist the graph to disk
Returns: DiGraph representing document hierarchy
Retrieve relevant chunks using hierarchical exclusion.
Parameters:
- `query` (str): Search query
- `limit` (int): Maximum number of chunks to return
- `offset` (int): Number of top results to skip
- `custom_filters` (dict): Optional filters for search
Returns: List of Document objects
Load a saved document graph from disk.
Get statistics about the current graph structure.
Contributions are welcome! Please feel free to submit a Pull Request.
```shell
git clone https://github.com/darwinapps/nested-rag.git
cd nested-rag
pip install -e ".[dev]"
```

```shell
# Basic test run
pytest tests/ -v

# With coverage (requires pytest-cov)
pytest tests/ --cov=nested_rag --cov-report=term-missing
```

We use black and ruff for code formatting and linting:

```shell
black nested_rag/
ruff check nested_rag/
```

This project is licensed under the MIT License - see the LICENSE file for details.
Developed by Gregory Potemkin at DarwinApps LLC.
The implementation presented here builds on the following open-source projects:
Star ⭐ this repository if you find it useful!