This is the repository for the paper: Less is More: Compact Clue Selection for Efficient Retrieval-Augmented Generation Reasoning.
🚀 2026.01: Initial code release.
```
pip install -r requirements.txt
```
For the retriever setup, please refer to Self-RAG. We retrieve the top-60 documents as the Full Content for each query.
We provide three modules for training:
- The Clue Extractor and the Adaptive Truncator are trained via supervised fine-tuning (SFT) with LLaMA-Factory.
Simply prepare instruction-style data and run:
```
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
```
- To train the Reranker, run the training script:
```
python reranker/train_rerank.py \
    --data_path "$DATA_PATH" \
    --model_name "$MODEL_PATH" \
    --output_dir "$OUTPUT_DIR" \
    --train_batch_size $BATCH_SIZE \
    --max_seq_length $MAX_SEQ_LENGTH \
    --pooling $POOLING \
    --epochs $EPOCHS \
    --warmup_steps $WARMUP_STEPS \
    --lr $LR \
    --checkpoint_save_total_limit $CHECKPOINT_LIMIT \
    --eval_steps $EVAL_STEPS \
    --max_train_samples $MAX_TRAIN_SAMPLES
```
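At inference time, the trained reranker scores each retrieved document against the query so that only the most relevant ones move on to clue extraction. The sketch below is illustrative only: it substitutes a toy lexical-overlap score for the learned model, and the names `score`, `rerank`, and the `top_k` default are ours, not this repository's API.

```python
def score(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query tokens that appear in the doc.
    The trained reranker replaces this with a learned relevance function."""
    q = set(query.lower().split())
    d = set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def rerank(query: str, docs: list[str], top_k: int = 3) -> list[str]:
    """Keep the top_k documents, ordered by descending relevance to the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:top_k]
```

With the real model, only `score` changes; the selection logic stays the same.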
We adopt Substring Exact Match (SubEM) and F1 for evaluation. SubEM checks whether the gold answer appears as a substring in the prediction, while F1 measures token-level overlap with the reference.
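Both metrics can be sketched in a few lines. The helper names below (`normalize`, `sub_em`, `f1_score`) are hypothetical, and the repository's evaluation script may normalize answers differently (e.g., article or punctuation handling):

```python
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def sub_em(prediction: str, gold: str) -> float:
    """SubEM: 1.0 iff the normalized gold answer is a substring of the prediction."""
    return float(normalize(gold) in normalize(prediction))

def f1_score(prediction: str, gold: str) -> float:
    """Token-level F1 between the prediction and the reference answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```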
```
python inference_llama.py \
    --input "$INPUT_FILE" \
    --model "$MODEL"
```