discovery(rag): QueryContext.expanded_queries computed but never passed to retrieval — query expansion is a silent no-op #4685

@mrveiss

Discovery

AdvancedRAGOptimizer._analyze_query_context() builds a QueryContext with expanded_queries: List[str] populated by _expand_query(). However, _retrieve_hybrid_results() only accepts query: str (the original query) and never receives or uses the QueryContext — so the expanded variants are computed and then silently discarded.

Evidence

At advanced_rag_optimizer.py:611, context is built with expanded queries:

context = self._analyze_query_context(query)  # populates context.expanded_queries

advanced_rag_optimizer.py:615 — only the raw query string is passed to retrieval:

hybrid_results = await self._retrieve_hybrid_results(query, metrics)
# context is NOT passed — expanded_queries never used

_retrieve_hybrid_results() signature (line 643):

async def _retrieve_hybrid_results(self, query: str, metrics: RAGMetrics) -> List[SearchResult]:

The context variable is only consumed by:

  • _optimize_result_count() — just slices results, doesn't use expanded_queries
  • _log_search_completion() — logging only
  • _build_context_header() — string formatting only

_expand_query() generates up to 5 variants (e.g. "install → setup, deploy", "error → issue, problem") but none of them are ever searched.
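The call flow described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `OptimizerSketch`, the simplified `QueryContext`, the toy synonym table, and `run_demo` are all hypothetical stand-ins, and the real `_expand_query()` logic is not reproduced. It only shows that when the context is not forwarded, retrieval sees the original query and nothing else.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List


@dataclass
class QueryContext:
    # Simplified stand-in for the real QueryContext; only the relevant field is modeled.
    original_query: str
    expanded_queries: List[str] = field(default_factory=list)


class OptimizerSketch:
    """Hypothetical stand-in that mirrors the call flow described in this issue."""

    def __init__(self) -> None:
        self.searched: List[str] = []  # records every query that reaches retrieval

    def _analyze_query_context(self, query: str) -> QueryContext:
        # Toy expansion table (assumption); stands in for _expand_query().
        synonyms = {"install": ["setup", "deploy"]}
        expanded = [
            query.replace(word, alt)
            for word, alts in synonyms.items()
            if word in query
            for alt in alts
        ]
        return QueryContext(query, expanded)

    async def _retrieve_hybrid_results(self, query: str) -> List[str]:
        self.searched.append(query)
        return [f"result for {query}"]

    async def advanced_search(self, query: str) -> List[str]:
        context = self._analyze_query_context(query)  # expansions are computed here
        # The context is not forwarded, so its expanded_queries are never searched.
        return await self._retrieve_hybrid_results(query)


def run_demo(query: str) -> List[str]:
    """Return the list of queries that actually reached retrieval."""
    optimizer = OptimizerSketch()
    asyncio.run(optimizer.advanced_search(query))
    return optimizer.searched
```

In this sketch, `run_demo("install docker")` records a single retrieval call for the original string, even though two expanded variants were generated.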

Impact

  • Query expansion has zero effect on retrieval quality despite being implemented
  • Users searching for "install" never get results that only mention "setup" or "deploy"
  • Troubleshooting queries like "error X" never retrieve docs mentioning "problem X"
  • The QueryContext.expanded_queries field is dead weight

Fix

Pass context into _retrieve_hybrid_results() and run additional semantic searches for each expanded_query, merging results with the primary search before hybrid scoring. Alternatively, run a single search with the expanded queries concatenated as context (if the KB supports it).
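One possible shape for the first option is sketched below. It is an illustration under stated assumptions rather than a patch: `retrieve_with_expansions`, the `search` callable, the simplified `SearchResult` (only `doc_id` and `score`), and `fake_search` are hypothetical names, and the real knowledge-base search API may differ. The sketch runs one search per query variant concurrently, then merges the results by document ID, keeping the best score for each document.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict, List


@dataclass
class SearchResult:
    # Simplified stand-in; the real SearchResult likely carries more fields.
    doc_id: str
    score: float


SearchFn = Callable[[str], Awaitable[List[SearchResult]]]


async def retrieve_with_expansions(
    search: SearchFn,
    query: str,
    expanded_queries: List[str],
    max_variants: int = 5,
) -> List[SearchResult]:
    """Search the original query plus its expansions and merge the results."""
    # Original query first; dict.fromkeys drops duplicate variants, keeping order.
    queries = list(dict.fromkeys([query, *expanded_queries[:max_variants]]))
    batches = await asyncio.gather(*(search(q) for q in queries))

    # Keep the highest-scoring hit per document across all variants.
    best: Dict[str, SearchResult] = {}
    for batch in batches:
        for result in batch:
            current = best.get(result.doc_id)
            if current is None or result.score > current.score:
                best[result.doc_id] = result
    return sorted(best.values(), key=lambda r: r.score, reverse=True)


async def fake_search(query: str) -> List[SearchResult]:
    # Hypothetical in-memory index, used only to exercise the merge logic.
    index = {
        "install docker": [SearchResult("a", 0.9)],
        "setup docker": [SearchResult("a", 0.5), SearchResult("b", 0.8)],
        "deploy docker": [SearchResult("c", 0.4)],
    }
    return index.get(query, [])
```

With `fake_search`, calling `asyncio.run(retrieve_with_expansions(fake_search, "install docker", ["setup docker", "deploy docker"]))` returns documents that only the expanded variants match, alongside the primary hit. Keeping the maximum score per document is one merge policy among several; a fusion scheme that rewards documents matched by multiple variants may suit hybrid scoring better.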

Affected File

  • autobot-backend/advanced_rag_optimizer.py: _retrieve_hybrid_results() and advanced_search()

Metadata

    Labels

    backend, bug (Something isn't working), not-wired (Implemented but not connected to app), rag
