![pageindex_banner](https://pageindex.ai/static/images/pageindex_banner.jpg)

<p align="center"><i>Reasoning-based RAG&nbsp; ✧ &nbsp;No Vector DB&nbsp; ✧ &nbsp;No Chunking&nbsp; ✧ &nbsp;Human-like Retrieval</i></p>

<p align="center">
  <a href="https://vectify.ai">🏠 Homepage</a>&nbsp; • &nbsp;
  <a href="https://dash.pageindex.ai">🖥️ Dashboard</a>&nbsp; • &nbsp;
  <a href="https://docs.pageindex.ai/quickstart">📚 API Docs</a>&nbsp; • &nbsp;
  <a href="https://github.com/VectifyAI/PageIndex">📦 GitHub</a>&nbsp; • &nbsp;
  <a href="https://discord.com/invite/VuXuf29EUj">💬 Discord</a>&nbsp; • &nbsp;
  <a href="https://ii2abc2jejf.typeform.com/to/tK3AXl8T">✉️ Contact</a>&nbsp;
</p>

<div align="center">

[![Star us on GitHub](https://img.shields.io/github/stars/VectifyAI/PageIndex?style=for-the-badge&logo=github&label=⭐️%20Star%20Us)](https://github.com/VectifyAI/PageIndex) &nbsp;&nbsp; [![Follow us on X](https://img.shields.io/badge/Follow%20Us-000000?style=for-the-badge&logo=x&logoColor=white)](https://twitter.com/VectifyAI)

</div>

---

# Simple Vectorless RAG with PageIndex

## PageIndex Introduction
PageIndex is a new **reasoning-based**, **vectorless RAG** framework that performs retrieval in two steps:  
1. Generate a tree structure index of documents  
2. Perform reasoning-based retrieval through tree search  

<div align="center">
  <img src="https://docs.pageindex.ai/images/cookbook/vectorless-rag.png" width="70%">
</div>
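To make the two steps concrete, here is a toy sketch, not PageIndex's implementation. The `Node` class, `toy_relevance`, and `tree_search` are hypothetical names, and a simple keyword overlap stands in for the LLM reasoning that PageIndex actually uses. The sketch follows one greedy path through the tree, whereas the notebook below shows the whole tree to the LLM in a single prompt.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Node:
    """One section of a document tree (hypothetical structure for illustration)."""
    node_id: str
    title: str
    summary: str = ""
    children: list[Node] = field(default_factory=list)

def toy_relevance(question: str, node: Node) -> int:
    """Stand-in for LLM reasoning: count longer question words found in the node."""
    words = {w.lower().strip("?.,") for w in question.split()}
    text = f"{node.title} {node.summary}".lower()
    return sum(1 for w in words if len(w) > 3 and w in text)

def tree_search(question: str, root: Node) -> list[str]:
    """Descend from the root, following the best-scoring child at each level."""
    path, current = [root.node_id], root
    while current.children:
        current = max(current.children, key=lambda c: toy_relevance(question, c))
        path.append(current.node_id)
    return path

doc = Node("0000", "Paper", children=[
    Node("0001", "Introduction", "motivation and contributions"),
    Node("0002", "Experiment", "evaluation of models on benchmarks", children=[
        Node("0003", "Distilled Model Evaluation", "results for distilled models"),
        Node("0004", "Main Evaluation", "results on reasoning benchmarks"),
    ]),
])

path = tree_search("How were the distilled models evaluated?", doc)
print(path)  # ['0000', '0002', '0003']
```

The point of the sketch is the shape of the process: retrieval is a walk over titled sections, so every decision can be inspected, which is not the case for a nearest-neighbor lookup over chunk embeddings.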

Compared to traditional vector-based RAG, PageIndex features:
- **No Vectors Needed**: Uses document structure and LLM reasoning for retrieval.
- **No Chunking Needed**: Documents are organized into natural sections rather than artificial chunks.
- **Human-like Retrieval**: Simulates how human experts navigate and extract knowledge from complex documents. 
- **Transparent Retrieval Process**: Retrieval is driven by explicit, inspectable reasoning rather than approximate semantic search ("vibe retrieval").

## 📝 Notebook Overview

This notebook demonstrates a simple, minimal example of **vectorless RAG** with PageIndex. You will learn how to:
- [x] Build a PageIndex tree structure of a document
- [x] Perform reasoning-based retrieval with tree search
- [x] Generate answers based on the retrieved context

> ⚡ Note: This is a **minimal example** to illustrate PageIndex's core philosophy and idea, not its full capabilities. More advanced examples are coming soon.

---

## Step 0: Preparation



#### 0.1 Install PageIndex

In [14]:
# Install packages (run once)
# The LLM is reached through LM Studio's OpenAI-compatible server, so the openai client is required
%pip install -q pageindex openai requests nest-asyncio
%pip install -q --upgrade pageindex
# Imports
from pageindex import PageIndexClient
import pageindex.utils as utils
import nest_asyncio
import asyncio
import json
import os
import requests

# Enable nested async (needed for Jupyter)
nest_asyncio.apply()

print("✅ All packages imported successfully!")


Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.
✅ All packages imported successfully!


#### 0.2 Setup PageIndex

In [18]:
# Get your PageIndex API key from https://dash.pageindex.ai/api-keys
# and export it as the PAGEINDEX_API_KEY environment variable; never hard-code keys in a notebook
PAGEINDEX_API_KEY = os.environ["PAGEINDEX_API_KEY"]
pi_client = PageIndexClient(api_key=PAGEINDEX_API_KEY)

print("✅ PageIndex client initialized!")


✅ PageIndex client initialized!


#### 0.3 Setup LLM

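Choose your preferred LLM for reasoning-based retrieval. In this example, we use a local Llama 3.1 8B Instruct model served by LM Studio through its OpenAI-compatible API at `http://127.0.0.1:1234/v1`. First, check that the server is running.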

In [19]:
import requests

try:
    response = requests.get("http://127.0.0.1:1234/v1/models", timeout=5)
    response.raise_for_status()  # treat HTTP error statuses as a failed check
    print("✅ LM Studio server is UP!")
    print(f"Models: {response.json()}")
except Exception as e:
    print("❌ LM Studio server is NOT running!")
    print(f"Error: {e}")
    print("\n👉 Start it in LM Studio app → Local Server → Start Server")


✅ LM Studio server is UP!
Models: {'data': [{'id': 'meta-llama-llama-3.1-8b-instruct', 'object': 'model', 'owned_by': 'organization_owner'}, {'id': 'text-embedding-nomic-embed-text-v1.5', 'object': 'model', 'owned_by': 'organization_owner'}], 'object': 'list'}
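The `/v1/models` response above lists both a chat model and an embedding model. As a small sketch, the chat model ID could be picked from that payload instead of being typed by hand. `pick_chat_model` is a hypothetical helper, and the rule that embedding models contain "embed" in their ID is an assumption based only on the output shown here.

```python
def pick_chat_model(models_payload: dict) -> str:
    """Return the first model id that does not look like an embedding model."""
    for entry in models_payload.get("data", []):
        model_id = entry.get("id", "")
        # Assumption: embedding models carry "embed" in their id, as in the output above.
        if model_id and "embed" not in model_id.lower():
            return model_id
    raise ValueError("No chat model found in the /v1/models response")

# Payload shaped like the response printed above
payload = {
    "data": [
        {"id": "meta-llama-llama-3.1-8b-instruct", "object": "model"},
        {"id": "text-embedding-nomic-embed-text-v1.5", "object": "model"},
    ],
    "object": "list",
}
print(pick_chat_model(payload))  # meta-llama-llama-3.1-8b-instruct
```

In the notebook, the same function could be applied to `response.json()` from the health check.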


In [20]:
# Async helper for calling the local LM Studio server
from openai import OpenAI
import asyncio
import json
import nest_asyncio
import re
import traceback

nest_asyncio.apply()

client = OpenAI(
    base_url="http://127.0.0.1:1234/v1",  # The base URL must end with /v1
    api_key="not-needed"
)

async def call_llm(prompt, temperature=0):
    """
    Send a single-turn prompt to LM Studio and return the reply text.
    If the reply embeds a valid JSON object, return only that object.
    """
    loop = asyncio.get_running_loop()

    def _call():
        try:
            response = client.chat.completions.create(
                model="local-model",
                messages=[{"role": "user", "content": prompt}],
                temperature=temperature,
                max_tokens=4096
            )

            # Safely get the reply text
            if response and response.choices:
                raw_text = response.choices[0].message.content
                if raw_text:
                    raw_text = raw_text.strip()

                    # Take the outermost {...} span (greedy, so nested braces survive)
                    # and return it only if it parses as JSON
                    json_match = re.search(r'\{.*\}', raw_text, re.DOTALL)
                    if json_match:
                        try:
                            json.loads(json_match.group(0))
                            return json_match.group(0)
                        except json.JSONDecodeError:
                            pass

                    return raw_text

            return "No response from model"

        except Exception as e:
            print(f"Full error: {traceback.format_exc()}")
            return f"Error: {str(e)}"

    # Run the blocking HTTP call in a worker thread so the event loop stays free
    return await loop.run_in_executor(None, _call)

print("✅ LM Studio function ready!")

# Test
test_result = await call_llm('Return just: {"thinking": "testing", "node_list": ["0019"]}')
print(f"✅ Test Response: {test_result}")


✅ LM Studio function ready!
✅ Test Response: {
        "thinking": "testing",
        "node_list": ["0019"]
    }
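Small local models often wrap their JSON in extra prose, and a regular expression can cut a reply short when braces appear inside string values or nested objects. A minimal alternative sketch uses the standard library's `json.JSONDecoder.raw_decode`, which parses one JSON value starting at a given index. `extract_first_json_object` is a hypothetical helper name, not part of PageIndex.

```python
import json

def extract_first_json_object(text: str):
    """Return the first parseable JSON object embedded in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses a single JSON value beginning exactly at `start`
            obj, _end = decoder.raw_decode(text, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass  # not valid JSON from this brace; try the next one
        start = text.find("{", start + 1)
    return None

reply = ('Sure! {"thinking": "section {5} looks right", '
         '"node_list": ["0021"], "meta": {"ok": true}} Hope that helps.')
parsed = extract_first_json_object(reply)
print(parsed["node_list"])  # ['0021']
```

Because the parser tracks strings and nesting itself, braces inside the `thinking` text and the nested `meta` object are handled without special cases.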


## Step 1: PageIndex Tree Generation

#### 1.1 Submit a document for generating PageIndex tree

In [21]:
import os, requests

# Download PDF
pdf_url = "https://arxiv.org/pdf/2501.12948.pdf"
pdf_path = os.path.join("./data", pdf_url.split('/')[-1])
os.makedirs(os.path.dirname(pdf_path), exist_ok=True)

response = requests.get(pdf_url, timeout=60)
response.raise_for_status()  # fail early instead of saving an error page as a PDF
with open(pdf_path, "wb") as f:
    f.write(response.content)
print(f"✅ Downloaded {pdf_url}")

# Submit to PageIndex (uses their API for tree building)
doc_id = pi_client.submit_document(pdf_path)["doc_id"]
print(f'✅ Document Submitted: {doc_id}')


✅ Downloaded https://arxiv.org/pdf/2501.12948.pdf
✅ Document Submitted: pi-cmh4z8cfq05bd0cr1zqzi0igk


#### 1.2 Get the generated PageIndex tree structure

In [22]:
# Wait for processing
import time
while not pi_client.is_retrieval_ready(doc_id):
    print("⏳ Processing document...")
    time.sleep(5)

# Get tree structure
tree = pi_client.get_tree(doc_id, node_summary=True)['result']
print('✅ Simplified Tree Structure:')
utils.print_tree(tree)


⏳ Processing document...
⏳ Processing document...
⏳ Processing document...
⏳ Processing document...
✅ Simplified Tree Structure:
[{'title': 'DeepSeek-R1: Incentivizing Reasoning Cap...',
  'node_id': '0000',
  'prefix_summary': '# DeepSeek-R1: Incentivizing Reasoning C...',
  'nodes': [{'title': 'Abstract',
             'node_id': '0001',
             'summary': 'The text introduces DeepSeek-R1-Zero, a ...'},
            {'title': 'Contents',
             'node_id': '0002',
             'summary': 'This document outlines an approach invol...'},
            {'title': '1. Introduction',
             'node_id': '0003',
             'prefix_summary': 'This paper introduces a novel approach t...',
             'nodes': [{'title': '1.1. Contributions',
                        'node_id': '0004',
                        'summary': '### 1.1. Contributions\n'},
                       {'title': 'Post-Training: Large-Scale Reinforcement...',
                        'node_id': '0005',
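The polling loop in this step runs until the document is ready, with no upper bound. A bounded variant could look like the sketch below. `wait_until` is a hypothetical helper, and the default timeout and interval are arbitrary assumptions rather than values recommended by PageIndex.

```python
import time

def wait_until(is_ready, timeout_s=300.0, interval_s=5.0, sleep=time.sleep):
    """Poll is_ready() until it returns True or timeout_s elapses.

    Returns True if the check succeeded, False if the deadline passed first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)

# Demo with a fake readiness check that succeeds on the third poll
calls = {"n": 0}

def fake_ready():
    calls["n"] += 1
    return calls["n"] >= 3

ok = wait_until(fake_ready, timeout_s=5.0, interval_s=0.01)
print(ok, calls["n"])  # True 3
```

In the notebook, the call would be `wait_until(lambda: pi_client.is_retrieval_ready(doc_id))`, followed by a check of the return value before fetching the tree.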
             

## Step 2: Reasoning-Based Retrieval with Tree Search

#### 2.1 Use LLM for tree search and identify nodes that might contain relevant context

In [24]:
import json

# Your query
query = "What are the conclusions in this document?"

# Remove text fields to reduce prompt size
tree_without_text = utils.remove_fields(tree.copy(), fields=['text'])

# Search prompt
search_prompt = f"""You are given a question and a tree structure of a document.
Each node contains a node id, node title, and a corresponding summary.
Your task is to find all nodes that are likely to contain the answer to the question.

Question: {query}

Document tree structure:
{json.dumps(tree_without_text, indent=2)}

Please reply in the following JSON format:
{{
  "thinking": "Your reasoning process",
  "node_list": ["node_id_1", "node_id_2", ...]
}}

Only return valid JSON, nothing else."""

print("🔍 Searching tree with local Llama...")
tree_search_result = await call_llm(search_prompt, temperature=0)
print(f"✅ Search complete!")
print(tree_search_result)


🔍 Searching tree with local Llama...
✅ Search complete!
{
  "thinking": "The question is asking for nodes that are likely to contain the answer to 'What are the conclusions in this document?' which suggests we need to find nodes related to summary or conclusion.",
  "node_list": ["0002", "0017", "0021"]
}
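The reply above is plain text produced by the model, so it may contain node IDs that do not exist in the tree, or repeat an ID. The following sketch validates the reply before it is used. `parse_search_result` and the `dropped` field are hypothetical and not part of `pageindex.utils`.

```python
import json

def parse_search_result(raw: str, valid_ids: set) -> dict:
    """Parse a tree-search reply and keep only node IDs that exist in the tree."""
    empty = {"thinking": "", "node_list": [], "dropped": []}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return empty
    if not isinstance(data, dict):
        return empty

    node_list = data.get("node_list", [])
    if not isinstance(node_list, list):
        node_list = []

    kept, dropped, seen = [], [], set()
    for node_id in node_list:
        # Skip non-string entries and duplicates
        if not isinstance(node_id, str) or node_id in seen:
            continue
        seen.add(node_id)
        (kept if node_id in valid_ids else dropped).append(node_id)

    return {"thinking": str(data.get("thinking", "")),
            "node_list": kept,
            "dropped": dropped}

raw = '{"thinking": "look at the conclusion", "node_list": ["0021", "9999", "0021"]}'
result = parse_search_result(raw, valid_ids={"0002", "0017", "0021"})
print(result["node_list"], result["dropped"])  # ['0021'] ['9999']
```

Reporting the dropped IDs separately makes hallucinated nodes visible instead of silently discarding them.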


#### 2.2 Print retrieved nodes and reasoning process

In [25]:
node_map = utils.create_node_mapping(tree)

# Parse JSON response
try:
    tree_search_result_json = json.loads(tree_search_result)
    
    print('🧠 Reasoning Process:')
    utils.print_wrapped(tree_search_result_json.get('thinking', 'No reasoning provided'))
    
    print('\n📄 Retrieved Nodes:')
    for node_id in tree_search_result_json["node_list"]:
        node = node_map[node_id]
        print(f"Node ID: {node['node_id']}\t Page: {node['page_index']}\t Title: {node['title']}")
        
except json.JSONDecodeError as e:
    print(f"⚠️ JSON parsing error: {e}")
    print("Raw response:", tree_search_result)


🧠 Reasoning Process:
The question is asking for nodes that are likely to contain the answer to 'What are the conclusions
in this document?' which suggests we need to find nodes related to summary or conclusion.

📄 Retrieved Nodes:
Node ID: 0002	 Page: 2	 Title: Contents
Node ID: 0017	 Page: 11	 Title: 3. Experiment
Node ID: 0021	 Page: 16	 Title: 5. Conclusion, Limitations, and Future Work
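The lookup above relies on `utils.create_node_mapping(tree)`, whose internals are not shown in this notebook. As a rough idea of what such a mapping involves, the sketch below flattens a nested tree into a dictionary keyed by `node_id`. `build_node_map` and the `parent_id` field are assumptions added for illustration; only the `node_id`, `title`, and `nodes` keys come from the tree output printed in Step 1.

```python
def build_node_map(nodes, parent_id=None, out=None):
    """Flatten a nested list of node dicts into {node_id: node}, recording each parent."""
    if out is None:
        out = {}
    for node in nodes:
        # Copy every field except the nested children
        entry = {k: v for k, v in node.items() if k != "nodes"}
        entry["parent_id"] = parent_id
        out[node["node_id"]] = entry
        # Recurse into child sections, if any
        build_node_map(node.get("nodes", []), node["node_id"], out)
    return out

sample_tree = [
    {"title": "DeepSeek-R1", "node_id": "0000", "nodes": [
        {"title": "Abstract", "node_id": "0001"},
        {"title": "1. Introduction", "node_id": "0003", "nodes": [
            {"title": "1.1. Contributions", "node_id": "0004"},
        ]},
    ]},
]

sample_map = build_node_map(sample_tree)
print(sorted(sample_map))                 # ['0000', '0001', '0003', '0004']
print(sample_map["0004"]["parent_id"])    # 0003
```

Keeping a parent pointer is optional, but it makes it easy to show a retrieved node together with the section it belongs to.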


## Step 3: Answer Generation

#### 3.1 Extract relevant context from retrieved nodes

In [26]:
node_list = json.loads(tree_search_result)["node_list"]
relevant_content = "\n\n".join(node_map[node_id]["text"] for node_id in node_list)

print('📖 Retrieved Context:\n')
utils.print_wrapped(relevant_content[:1000] + '...')


📖 Retrieved Context:

## Contents

1 Introduction ..... 3
1.1 Contributions ..... 4
1.2 Summary of Evaluation Results ..... 4
2 Approach ..... 5
2.1 Overview ..... 5
2.2 DeepSeek-R1-Zero: Reinforcement Learning on the Base Model ..... 5
2.2.1 Reinforcement Learning Algorithm ..... 5
2.2.2 Reward Modeling ..... 6
2.2.3 Training Template ..... 6
2.2.4 Performance, Self-evolution Process and Aha Moment of DeepSeek-R1-Zero ..... 6
2.3 DeepSeek-R1: Reinforcement Learning with Cold Start ..... 9
2.3.1 Cold Start ..... 9
2.3.2 Reasoning-oriented Reinforcement Learning ..... 10
2.3.3 Rejection Sampling and Supervised Fine-Tuning ..... 10
2.3.4 Reinforcement Learning for all Scenarios ..... 11
2.4 Distillation: Empower Small Models with Reasoning Capability ..... 11
3 Experiment ..... 11
3.1 DeepSeek-R1 Evaluation ..... 13
3.2 Distilled Model Evaluation ..... 14
4 Discussion ..... 14
4.1 Distillation v.s. Reinforcement Learning ..... 14
4.2 Unsuccessful Attempts ..... 15
5 Conclusion, Limitatio
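The cell above joins the full text of every retrieved node, which can exceed the context window of a small local model. One possible refinement, sketched here under assumptions, labels each block with its node ID and title and stops at a character budget. `build_context` and the `max_chars` default are hypothetical, and a character budget is only a crude proxy for a token limit.

```python
def build_context(node_ids, node_map, max_chars=4000):
    """Join node texts under '[id] title' headers, never exceeding max_chars."""
    parts, used = [], 0
    for node_id in node_ids:
        node = node_map.get(node_id)
        if node is None or not node.get("text"):
            continue  # skip unknown IDs and nodes without text
        remaining = max_chars - used
        if remaining <= 0:
            break
        block = f"[{node_id}] {node.get('title', '')}\n{node['text']}"[:remaining]
        parts.append(block)
        used += len(block) + 2  # account for the blank-line separator
    return "\n\n".join(parts)

sample_nodes = {
    "0021": {"title": "5. Conclusion", "text": "We share our journey."},
    "0017": {"title": "3. Experiment", "text": "Benchmarks."},
}

full = build_context(["0021", "missing", "0017"], sample_nodes)
short = build_context(["0021", "0017"], sample_nodes, max_chars=30)
print(full)
print(len(short))  # 30
```

Ordering `node_ids` by relevance before calling the function means that truncation removes the least important material first.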

#### 3.2 Generate answer based on retrieved context

In [27]:
answer_prompt = f"""Answer the question based on the context:

Question: {query}
Context: {relevant_content}

Provide a clear, concise answer based only on the context provided.
"""

print('💡 Generating Answer with Local Llama...\n')
answer = await call_llm(answer_prompt, temperature=0.3)
utils.print_wrapped(answer)


💡 Generating Answer with Local Llama...

The conclusions in this document are:

1. DeepSeek-R1-Zero achieves strong performance across various tasks using a pure reinforcement
learning approach.
2. DeepSeek-R1 is more powerful and achieves performance comparable to OpenAI-o1-1217 on a range of
tasks by leveraging cold-start data alongside iterative RL fine-tuning.
3. Distillation of the reasoning capability from DeepSeek-R1 to small dense models results in
promising outcomes, with one model outperforming GPT-4o and Claude-3.5-Sonnet on math benchmarks.


---

## 🎯 What's Next

This notebook has demonstrated a **basic**, **minimal** example of **reasoning-based**, **vectorless** RAG with PageIndex. The workflow illustrates the core idea:
> *Generating a hierarchical tree structure from a document, reasoning over that tree structure, and extracting relevant context, without relying on a vector database or top-k similarity search*.

While this notebook highlights a minimal workflow, the PageIndex framework is built to support **far more advanced** use cases. In upcoming tutorials, we will introduce:
* **Multi-Node Reasoning with Content Extraction** — Scale tree search to extract and select relevant content from multiple nodes.
* **Multi-Document Search** — Enable reasoning-based navigation across large document collections, extending beyond a single file.
* **Efficient Tree Search** — Improve tree search efficiency for long documents with a large number of nodes.
* **Expert Knowledge Integration and Preference Alignment** — Incorporate user preferences or expert insights by adding knowledge directly into the LLM tree search, without the need for fine-tuning.



## 🔎 Learn More About PageIndex
  <a href="https://vectify.ai">🏠 Homepage</a>&nbsp; • &nbsp;
  <a href="https://dash.pageindex.ai">🖥️ Dashboard</a>&nbsp; • &nbsp;
  <a href="https://docs.pageindex.ai/quickstart">📚 API Docs</a>&nbsp; • &nbsp;
  <a href="https://github.com/VectifyAI/PageIndex">📦 GitHub</a>&nbsp; • &nbsp;
  <a href="https://discord.com/invite/VuXuf29EUj">💬 Discord</a>&nbsp; • &nbsp;
  <a href="https://ii2abc2jejf.typeform.com/to/tK3AXl8T">✉️ Contact</a>

<br>

© 2025 [Vectify AI](https://vectify.ai)