Added raw_chunks parameter to search methods #1886

akarim23131 · 2025-04-21T12:27:31Z

Description

Added a new --raw-chunks flag that exposes the raw context data retrieved from the vector store before the LLM processes it. This helps with debugging and transparency: users can see the exact information used to generate a response for every search method (local, global, and drift). The flag is opt-in, so adding --raw-chunks to a query command prints the chunks of context passed to the LLM, and omitting it leaves the output unchanged.

Query Command

graphrag query --method local --query "" --root graph_index --raw-chunks
graphrag query --method global --query "" --root graph_index --raw-chunks
graphrag query --method drift --query "" --root graph_index --raw-chunks
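
To illustrate the opt-in behaviour of the flag, here is a minimal sketch of a boolean CLI switch. It uses the standard-library argparse purely for illustration; the parser name `query-sketch` and the functions `build_parser` and `run` are hypothetical and are not taken from this PR or from the graphrag CLI, which may wire the flag differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser that mirrors the flags shown in the commands above."""
    parser = argparse.ArgumentParser(prog="query-sketch")  # hypothetical name
    parser.add_argument("--method", choices=["local", "global", "drift"], required=True)
    parser.add_argument("--query", required=True)
    parser.add_argument("--root", default=".")
    # store_true makes the flag opt-in: it is False unless passed explicitly.
    parser.add_argument("--raw-chunks", action="store_true", dest="raw_chunks")
    return parser


def run(argv: list[str]) -> dict:
    """Parse argv and return the settings a search call would receive."""
    args = build_parser().parse_args(argv)
    return {
        "method": args.method,
        "query": args.query,
        "root": args.root,
        "raw_chunks": args.raw_chunks,
    }
```

Because the switch defaults to False, existing invocations without --raw-chunks behave as before in this sketch.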

Related Issues

Improves debugging capabilities for RAG applications
Enhances transparency in search results
Helps users understand and verify the context selection process
Provides user control over raw context visibility through a CLI flag

Proposed Changes

Added user-controlled raw context display:
- New --raw-chunks CLI flag for optional context viewing
- Users can toggle between normal output and detailed raw context output
- Works with all search methods (local, global, drift)

Added a RawChunksCallback class in query.py that displays raw chunks with structured formatting for:
- Reports with titles and text
- Text units with source information
- Relationships and community data
- Special handling for DRIFT search's three-step process

Modified search implementation files:
- factory.py: Added raw_chunks parameter to search engine factory methods
- main.py: Implemented --raw-chunks CLI flag
- query.py: Added raw chunks handling for all search types
- search.py: Updated the search implementations (local, global, drift) to support raw chunks display

Enhanced DRIFT search to show context at each step:
- Primer search results
- Follow-up question contexts
- Final synthesized context
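
The callback idea described above can be sketched roughly as follows. This is not the RawChunksCallback from this PR: the class name `RawChunksPrinter`, the method `on_context`, the helper `format_raw_chunks`, and the record keys (`title`, `source`, `id`, `text`, `description`) are all assumptions made for illustration.

```python
from typing import Any


def format_raw_chunks(context: dict[str, list[dict[str, Any]]]) -> str:
    """Render each context section (reports, text units, ...) as labelled lines."""
    lines: list[str] = []
    for section, records in context.items():
        lines.append(f"=== {section.upper()} ({len(records)}) ===")
        for i, record in enumerate(records, start=1):
            # Assumed keys: prefer a title, then a source, then an id.
            label = record.get("title") or record.get("source") or record.get("id", "")
            lines.append(f"[{i}] {label}".rstrip())
            text = record.get("text") or record.get("description")
            if text:
                lines.append(f"    {text}")
    return "\n".join(lines)


class RawChunksPrinter:
    """Hypothetical callback that prints raw context only when enabled."""

    def __init__(self, enabled: bool = False) -> None:
        self.enabled = enabled
        self.steps: list[str] = []  # names of the steps that were printed

    def on_context(self, context: dict[str, list[dict[str, Any]]], step: str = "context") -> None:
        if not self.enabled:
            return  # opt-in: do nothing unless raw chunks were requested
        self.steps.append(step)
        print(f"--- {step} ---")
        print(format_raw_chunks(context))
```

For a multi-step search such as DRIFT, a caller could invoke `on_context` once per step, for example with `step="primer"`, `step="follow-up"`, and `step="final"`, so that each stage's context is printed under its own header.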

Checklist

I have tested these changes locally.
I have reviewed the code changes.
I have updated the documentation (if necessary).
I have added appropriate unit tests (if applicable).

Additional Notes

The feature is opt-in via the --raw-chunks flag, which maintains backward compatibility.
Users control when raw context is shown through a single CLI flag.
Raw chunks are displayed in a structured format for readability.
The implementation handles different data types (dictionaries, lists, strings).
DRIFT search's multi-step process receives special handling.
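
One way to handle context that may arrive as a dictionary, a list, or a plain string is to normalize it into a flat list of records before display. The sketch below is an assumption about how that could look, not the code in this PR; the function name `normalize_context` and the choice to treat a dictionary as a single record are hypothetical.

```python
from typing import Any


def normalize_context(data: Any) -> list[dict[str, Any]]:
    """Coerce mixed context data into a flat list of dict records."""
    if data is None:
        return []
    if isinstance(data, str):
        # Wrap non-empty strings so every record has the same shape.
        return [{"text": data}] if data.strip() else []
    if isinstance(data, dict):
        # Assumption: a dict is one record, not a mapping of sections.
        return [data]
    if isinstance(data, (list, tuple)):
        records: list[dict[str, Any]] = []
        for item in data:
            records.extend(normalize_context(item))  # flatten nested lists
        return records
    # Fallback for any other type: keep its string representation.
    return [{"text": str(data)}]
```

With a single record shape, the display code needs only one formatting path regardless of which search method produced the context.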

akarim23131 · 2025-04-21T12:34:09Z

@microsoft-github-policy-service agree

Added raw_chunks parameter to search methods

d18878d

akarim23131 requested review from a team as code owners April 21, 2025 12:27

akarim23131 and others added 6 commits April 23, 2025 10:41

Resolved conflict:kept model_params for consistency

0e42763

Add get_openai_model_parameters_from_config function

7a778fe

Merge branch 'main' into my-contribution

e53c1c9

Merge branch 'main' into my-contribution

8c404ff

Merge branch 'main' into my-contribution

09662b5

Merge branch 'main' into my-contribution

3a5cc6c
