discovery(rag): rag_benchmarks.py uses only mock data — never exercises real KB, provides no real retrieval quality signal #4697

@mrveiss

Discovery

autobot-backend/knowledge/rag_benchmarks.py (Issue #58) benchmarks RAG operations using randomly generated mock embeddings and documents. It never instantiates a real KnowledgeBase, connects to ChromaDB, or runs queries against actual indexed content. All results measure synthetic numpy array operations, not real retrieval quality.

Evidence

@pytest.fixture
def mock_embeddings(self):
    return [[random.random() for _ in range(384)] for _ in range(100)]

@pytest.fixture
def mock_documents(self):
    return [{"id": f"doc_{i}", "content": f"This is test document {i}...",
             "embedding": [random.random() for _ in range(384)]}
            for i in range(1000)]

All benchmark tests operate on these fixtures. No fixture mounts a real KB or ChromaDB collection. The benchmarks measure:

  • Raw cosine similarity on random vectors (not semantic similarity)
  • Top-k selection on random data (not real document relevance)
  • A simulated pipeline with time.sleep() calls for "realism"
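The first bullet can be illustrated with a minimal sketch. It rebuilds vectors the same way the mock_embeddings fixture does (uniform random values, 384 dimensions) and measures how a query's cosine similarity is distributed across them. The helper name random_similarity_stats and the fixed seed are illustrative assumptions, not code from rag_benchmarks.py.

```python
import math
import random
import statistics


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def random_similarity_stats(n_vectors=100, dim=384, seed=0):
    """Hypothetical helper: mean and spread of cosine similarity between
    one random query and n_vectors random 'document' vectors."""
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    vectors = [[rng.random() for _ in range(dim)] for _ in range(n_vectors)]
    query = [rng.random() for _ in range(dim)]
    sims = [cosine(query, v) for v in vectors]
    return statistics.mean(sims), statistics.pstdev(sims)


mean_sim, spread = random_similarity_stats()
print(f"mean cosine similarity: {mean_sim:.3f}, std dev: {spread:.3f}")
```

Because every component is drawn uniformly from [0, 1), the similarities should cluster tightly around roughly 0.75 with very little spread. Any top-k ranking over such scores reflects noise rather than relevance, which is why these benchmarks can only time the arithmetic.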

The file is also not wired into any CI pipeline, scheduler, or feedback loop.

Impact

Fix

Add a RealKBBenchmarks test class alongside the existing mock class that:

  1. Connects to a real (or test-fixture) ChromaDB instance with seeded documents
  2. Runs AdvancedRAGOptimizer.advanced_search() with real queries
  3. Scores results against known-good ground truth (precision@k, MRR)
  4. Can run in CI with a lightweight ChromaDB fixture (in-memory mode)

The mock benchmarks can remain as pure performance microbenchmarks (for example, vector math speed).
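Step 3 could be scored along the following lines. This is a sketch under stated assumptions, not the intended implementation: search is a hypothetical adapter that takes a query string and returns ranked document IDs (for example a thin wrapper around AdvancedRAGOptimizer.advanced_search(), whose signature and return shape are not shown in this issue), and ground_truth is a hand-labelled mapping from each query to its set of relevant document IDs.

```python
from typing import Callable, Dict, List, Sequence, Set


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved IDs that are labelled relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant ID, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def evaluate(
    search: Callable[[str], List[str]],   # hypothetical adapter, see lead-in
    ground_truth: Dict[str, Set[str]],    # query -> relevant document IDs
    k: int = 5,
) -> Dict[str, float]:
    """Average precision@k and MRR over all labelled queries."""
    if not ground_truth:
        raise ValueError("ground_truth must contain at least one query")
    precisions, reciprocal_ranks = [], []
    for query, relevant in ground_truth.items():
        retrieved = search(query)
        precisions.append(precision_at_k(retrieved, relevant, k))
        reciprocal_ranks.append(reciprocal_rank(retrieved, relevant))
    n = len(ground_truth)
    return {"precision@k": sum(precisions) / n, "mrr": sum(reciprocal_ranks) / n}


# Usage with a stub in place of the real search, to show the expected shapes.
stub_results = {"q1": ["a", "b"], "q2": ["b", "c"]}
scores = evaluate(lambda q: stub_results[q], {"q1": {"a"}, "q2": {"c"}}, k=2)
```

Keeping the metrics independent of ChromaDB means the same scoring code can be pointed at an in-memory test collection in CI and at the real KB elsewhere, with only the search adapter changing.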

Affected File

  • autobot-backend/knowledge/rag_benchmarks.py — add real-KB benchmark class

Prerequisite For
