This repository provides an installable demo package for a DeepSieve-derived multi-source RAG system with:
- Multi-source retrieval
- Evidence-level selection
- Adaptive Cap (`preferred_source` + fixed quotas)
- Streamlit UI for workflow, evidence, and trace inspection
- Source repository: https://github.com/Joseph1951210/RealRoute
- Installable package (demo artifact): https://github.com/Joseph1951210/RealRoute/archive/refs/tags/v1.0.1-demo.zip
- Python 3.10+ (recommended: Python 3.10 or 3.11)
- macOS/Linux shell
- OpenAI-compatible API key (`OPENAI_API_KEY`)
```shell
python3 -m pip install -r requirements.txt
export OPENAI_API_KEY=your_api_key
python3 -m streamlit run demo/app.py
```

Open the local Streamlit URL shown in the terminal (typically http://localhost:8501).
The v1.0-demo package includes the tracked datasets required for the original 2-source preset (e.g., the hotpot_qa local/global files).
For 3-source / 4-source presets (multi_source, mixed_4source), make sure the corresponding files exist under data/rag/ before running those presets:
- `{dataset}.json`
- `{dataset}_profiles.json`
- `{dataset}_corpus_*.json`
If these files are missing, those presets will fail at data-loading time.
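To fail fast instead of hitting a data-loading error mid-run, the file check can be done up front. The sketch below follows the `data/rag/` naming convention listed above; the helper name `preset_files_ready` is illustrative, not part of the package.

```python
# Sketch: verify that a preset's data files exist before launching a
# 3-source / 4-source run. Follows the {dataset}.json /
# {dataset}_profiles.json / {dataset}_corpus_*.json convention above.
import glob
import os

def preset_files_ready(dataset: str, data_dir: str = "data/rag") -> bool:
    """True if the query file, profiles file, and at least one corpus
    file are present for the given dataset preset."""
    queries = os.path.join(data_dir, f"{dataset}.json")
    profiles = os.path.join(data_dir, f"{dataset}_profiles.json")
    corpora = glob.glob(os.path.join(data_dir, f"{dataset}_corpus_*.json"))
    return os.path.isfile(queries) and os.path.isfile(profiles) and bool(corpora)

if not preset_files_ready("multi_source"):
    print("Missing data files for preset 'multi_source'; see data/rag/ layout above.")
```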
- Configure dataset preset and mode (`Hard Routing` or `Adaptive Cap`)
- Run the pipeline with configurable retrieval/selection parameters
- Inspect run-level summary (output directory, config snapshot, overall metrics)
- Inspect query-level traces (subqueries, routing, evidence, final answer, metrics)
- Compare baseline vs Adaptive Cap in the same UI
- Query input (preset dataset or uploaded custom queries)
- Optional decomposition into subqueries (`decompose`)
- Ordered subquery execution with variable binding
- Retrieval:
  - Hard Routing: route to one source, then retrieve from it
  - Adaptive Cap mode: retrieve from all sources, then select evidence
- Evidence selection using `selector`, with optional Adaptive Cap
- Subquery answer generation (optional reflection retries)
- Final answer fusion
- Save traces and run summaries under `outputs/`
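The control flow of the stages above can be sketched as a small driver loop. Every helper here (`decompose`, `route`, `retrieve`, `select`, `answer`, `fuse`) is a toy stand-in passed in as a function to show the shape of the pipeline; this is not the package's real API.

```python
# Illustrative sketch of the Hard Routing path: decompose, then for each
# subquery route to one source, retrieve, select evidence, answer, and
# finally fuse the subquery answers into one response.
def run_query(query, sources, decompose, route, retrieve, select, answer, fuse):
    subqueries = decompose(query)
    sub_answers = []
    for sq in subqueries:
        src = route(sq, sources)        # Hard Routing: pick a single source
        candidates = retrieve(sq, src)  # retrieve only from that source
        evidence = select(sq, candidates)
        sub_answers.append(answer(sq, evidence))
    return fuse(query, sub_answers)
```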
- CLI pipeline: `runner/main_rag_only.py`
- Web UI: `demo/app.py`
```shell
python3 runner/main_rag_only.py \
  --dataset hotpot_qa \
  --rag_type naive \
  --sample_size 100 \
  --decompose \
  --use_routing \
  --use_reflection
```

```shell
python3 runner/main_rag_only.py \
  --dataset multi_source \
  --rag_type naive \
  --sample_size 100 \
  --decompose \
  --use_reflection \
  --multi_source \
  --hard_routing_multi
```

```shell
python3 runner/main_rag_only.py \
  --dataset mixed_4source \
  --rag_type naive \
  --sample_size 100 \
  --openai_model gpt-4o \
  --decompose \
  --use_reflection \
  --multi_source \
  --top_k_per_source 8 \
  --keep_k 8 \
  --preferred_cap 5 \
  --other_cap 2 \
  --selector score
```

- `top_k_per_source`: candidates retrieved per source before selection
- `keep_k`: final evidence budget for answer generation
- `preferred_cap` / `other_cap`: source quotas in Adaptive Cap mode
- `selector`: evidence selector (`score`, `norm_score`, `routing_weighted`, `rrf`, `llm`)
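As one example of a selector, an `rrf`-style selector typically implements reciprocal rank fusion across the per-source ranked lists. Below is a generic sketch of that idea; the constant `k = 60` is the common default from the RRF literature, not necessarily what this package uses.

```python
# Generic reciprocal-rank-fusion sketch: each document scores the sum of
# 1 / (k + rank) over all sources that returned it, so documents ranked
# well by multiple sources rise to the top.
from collections import defaultdict

def rrf_select(ranked_lists, keep_k, k=60):
    """ranked_lists: {source_name: [doc_id, ...]} ordered best-first.
    Returns the top keep_k doc_ids by summed reciprocal-rank score."""
    scores = defaultdict(float)
    for docs in ranked_lists.values():
        for rank, doc_id in enumerate(docs, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)[:keep_k]
```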
Important: the current Adaptive Cap is not confidence-calibrated; it uses the routing-preferred source plus fixed per-source quotas.
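A minimal sketch of that quota rule, mirroring the `--preferred_cap` / `--other_cap` / `--keep_k` flags. The `(score, source, doc)` candidate shape is an assumption for illustration, not the package's actual data model.

```python
# Quota rule: keep up to preferred_cap items from the routing-preferred
# source and up to other_cap from each other source, then trim the
# selection to the keep_k evidence budget.
def adaptive_cap(candidates, preferred_source, preferred_cap=5, other_cap=2, keep_k=8):
    """candidates: iterable of (score, source_name, doc) tuples, any order."""
    taken = {}      # per-source counts
    selected = []
    # Greedily take the highest-scored candidates that still fit their quota.
    for score, source, doc in sorted(candidates, key=lambda c: c[0], reverse=True):
        cap = preferred_cap if source == preferred_source else other_cap
        if taken.get(source, 0) < cap:
            taken[source] = taken.get(source, 0) + 1
            selected.append((score, source, doc))
    return selected[:keep_k]
```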
- Dataset presets: 2-source / 3-source / 4-source
- Mode toggle: Hard Routing vs Adaptive Cap
- Pipeline toggles: `decompose`, `use_reflection`, `sample_size`, optional `query_index`
- Compare mode: run baseline and Adaptive Cap on the same query set
- Trace view tabs: primary trace, side-by-side compare, comparison trace
- Trace download: JSONL and JSON
Custom queries override preset query loading but still use preset corpora.
Supported formats:
- JSON:

```json
[
  {"query": "Who wrote ...?", "ground_truth": "..."},
  {"query": "What is ...?"}
]
```

- CSV:
  - required column: `query`
  - optional column: `ground_truth`
If ground_truth is provided, EM/F1 is shown; otherwise answer/trace only.
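A loader for the two query formats above can be sketched as follows; `load_queries` is a hypothetical helper for illustration, not the UI's actual upload code.

```python
# Sketch: parse uploaded queries from either a JSON list of objects with
# "query" and optional "ground_truth", or a CSV with those columns.
import csv
import io
import json

def load_queries(raw: str, fmt: str):
    if fmt == "json":
        rows = json.loads(raw)
    else:  # csv
        rows = list(csv.DictReader(io.StringIO(raw)))
    queries = []
    for row in rows:
        if "query" not in row:
            raise ValueError("each record needs a 'query' field")
        queries.append({"query": row["query"],
                        "ground_truth": row.get("ground_truth")})
    return queries
```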
The UI can add one uploaded source corpus to the selected preset sources.
- Required fields: `source_name`, `source_profile`
- File format: JSON or CSV
JSON examples:
```json
[
  {"title": "Doc 1", "text": "Document content..."},
  {"title": "Doc 2", "text": "Another content..."}
]
```

or

```json
[
  "plain document text 1",
  "plain document text 2"
]
```

CSV:
- required column: `text`
- optional column: `title`
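Both accepted JSON shapes (a list of `{"title", "text"}` objects or a list of plain strings) can be normalized into uniform records; `normalize_corpus` is a hypothetical helper for illustration, not the actual ingestion code.

```python
# Sketch: coerce the accepted corpus entry shapes into {"title", "text"}
# records, generating a placeholder title when none is provided.
def normalize_corpus(entries):
    docs = []
    for i, entry in enumerate(entries):
        if isinstance(entry, str):
            docs.append({"title": f"doc_{i}", "text": entry})
        elif isinstance(entry, dict) and "text" in entry:
            docs.append({"title": entry.get("title", f"doc_{i}"),
                         "text": entry["text"]})
        else:
            raise ValueError(f"unsupported corpus entry at index {i}")
    return docs
```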
Each run writes to a directory in outputs/ (directory name encodes key settings).
Common files:
- `query_{i}_results.jsonl`
- `query_{i}_fusion_prompt.txt`
- `overall_results.json`
- `overall_results.txt`
- `demo_run_meta.json` (UI run metadata)
Typical JSONL record types:
`query_info`, `final_answer`, `evaluation_metrics`, `performance_metrics`, `execution_result`, `fused_answer_step`
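For offline inspection, records in a `query_{i}_results.jsonl` trace can be grouped by record type. This sketch assumes each line is a JSON object with a `"type"` field matching the names above; that field name is an assumption about the trace schema, not a documented contract.

```python
# Sketch: read a JSONL trace and bucket records by their "type" field.
import json
from collections import defaultdict

def group_trace(path):
    groups = defaultdict(list)
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            groups[record.get("type", "unknown")].append(record)
    return groups
```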
- `OPENAI_API_KEY` is required: ensure `export OPENAI_API_KEY=...` is executed in the same shell session before launching Streamlit.
- `pip: command not found`: use `python3 -m pip ...` instead of `pip ...`.
- Push the latest code to GitHub.
- Create a version tag (e.g., `v1.0-demo`).
- Create a GitHub Release from that tag.
- Upload a downloadable source archive (`.zip` or `.tar.gz`) as a release asset.
- Replace `[[TODO: add GitHub Release asset URL]]` above with the release asset link.
This repository keeps DeepSieve-style components (decomposition, routing, reflection, fusion) and extends them with multi-source retrieval, evidence selection, Adaptive Cap, and a trace-oriented demo UI.