Finding What Matters: Anchoring Context Knowledge with Evolving Indices for Iterative Retrieval

Source code for our paper:
Finding What Matters: Anchoring Context Knowledge with Evolving Indices for Iterative Retrieval

Our paper is available on arXiv: https://arxiv.org/abs/2601.16462

If you find this work useful, please cite our paper and give us a shining star 🌟

@article{wu2026findingmattersanchoringcontext,
      title={Finding What Matters: Anchoring Context Knowledge with Evolving Indices for Iterative Retrieval}, 
      author={Mingyan Wu and Zhenghao Liu and Xinze Li and Yuqing Lan and Yukun Yan and Shuo Wang and Cheng Yang and Minghe Yu and Zheni Zeng and Maosong Sun},
      year={2026},
      eprint={2601.16462},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2601.16462}, 
}

Overview

KAIR is a Knowledge Anchoring framework for Iterative Retrieval that anchors key knowledge within the retrieved documents to guide LLMs toward the information that matters. During iterative retrieval, KAIR progressively updates a knowledge index that anchors salient evidence from the retrieved documents. This evolving index acts as a navigational aid that lets the LLM assess whether the collected knowledge is sufficient and formulate subsequent retrieval queries. Finally, KAIR generates the answer by jointly leveraging the retrieved documents and the finalized anchoring index.
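The loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the function names (`iterative_anchoring`, `toy_retrieve`, and the other `toy_*` helpers), the `max_retrievals` parameter, and the two-sentence corpus are all hypothetical stand-ins. In KAIR the index update, sufficiency check, and query formulation are performed by an LLM, whereas here they are simple string rules so the example runs on its own.

```python
from typing import Callable, List

# Hypothetical two-document corpus, used only for this illustration.
CORPUS = {
    "alice": "Alice was born in Paris.",
    "paris": "Paris is the capital of France.",
}


def toy_retrieve(query: str) -> List[str]:
    """Return documents whose key appears in the query (stand-in retriever)."""
    return [text for key, text in CORPUS.items() if key in query.lower()]


def toy_update_index(question: str, index: List[str], docs: List[str]) -> List[str]:
    """Anchor new evidence in the index (an LLM does this in the real system)."""
    return index + [d for d in docs if d not in index]


def toy_is_sufficient(question: str, index: List[str]) -> bool:
    """Decide whether the anchored knowledge is enough to answer."""
    return any("capital of France" in entry for entry in index)


def toy_next_query(question: str, index: List[str]) -> str:
    """Formulate the next retrieval query from the latest anchored evidence."""
    return index[-1] if index else question


def toy_generate(question: str, documents: List[str], index: List[str]) -> str:
    """Produce the final answer from the documents and the final index."""
    return " ".join(index)


def iterative_anchoring(
    question: str,
    retrieve: Callable[[str], List[str]],
    update_index: Callable[[str, List[str], List[str]], List[str]],
    is_sufficient: Callable[[str, List[str]], bool],
    next_query: Callable[[str, List[str]], str],
    generate: Callable[[str, List[str], List[str]], str],
    max_retrievals: int = 4,
) -> str:
    index: List[str] = []      # evolving anchoring index
    documents: List[str] = []  # every document retrieved so far
    query = question
    for _ in range(max_retrievals):
        docs = retrieve(query)
        for doc in docs:
            if doc not in documents:
                documents.append(doc)
        index = update_index(question, index, docs)
        if is_sufficient(question, index):
            break  # stop retrieving once the index covers the question
        query = next_query(question, index)
    return generate(question, documents, index)


answer = iterative_anchoring(
    "Which country is Alice's birthplace the capital of?",
    toy_retrieve, toy_update_index, toy_is_sufficient,
    toy_next_query, toy_generate,
)
print(answer)
```

In this toy run the first retrieval anchors only the fact about Alice, the index is judged insufficient, and the anchored fact becomes the second query, which brings in the fact about Paris and ends the loop.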

Set Up

Use git clone to download this project

git clone https://github.com/NEUIR/KAIR.git
cd KAIR

Create the conda environment from the provided file and activate it:

conda env create -n KAIR -f kair_environment.yml
conda activate KAIR

Prepare Datasets

Our code and data build on DeepNote.

1 Download the data

Follow DeepNote's instructions to prepare the datasets:
All corpus and evaluation files should be placed in the /data directory. You can download the experimental data (MuSiQue, HotpotQA, 2WikiMultihopQA) here.
You can download the Bamboogle data here. For the Bamboogle dataset, we use the same corpus as for the HotpotQA dataset.

2 Build Indices

For HotpotQA, 2WikiMultihopQA, and MuSiQue:

cd src/build_index/emb
python index.py --dataset hotpotqa --model bge-base-en-v1.5 # e.g., for HotpotQA dataset

Configuration

You can configure the model path in the ./config/config.yaml file.

Running KAIR and Evaluation

python KAIR.py --method KAIR --retrieve_top_k 5 --dataset hotpotqa --max_step 3 --model qwen2.5-7b-instruct

❗️Note: max_step should be set to the maximum number of retrieval steps minus one. For example, --max_step 3 allows up to four retrieval steps.
The predicted results and evaluation metrics are saved automatically in the output/{dataset}/ directory. The evaluation results appear at the end of the output file.
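To make the max_step convention in the note concrete, the following small sketch converts a desired number of retrieval steps into the `--max_step` value and assembles the command shown above. The helper names `max_step_for` and `build_command` are hypothetical and not part of this repository; only the flags already shown in the command above are used.

```python
import shlex


def max_step_for(total_retrieval_steps: int) -> int:
    """Map the desired number of retrieval steps to --max_step (steps minus one)."""
    if total_retrieval_steps < 1:
        raise ValueError("at least one retrieval step is required")
    return total_retrieval_steps - 1


def build_command(dataset: str, model: str, total_retrieval_steps: int, top_k: int = 5) -> str:
    """Assemble the KAIR.py command line using the flags from the README."""
    args = [
        "python", "KAIR.py",
        "--method", "KAIR",
        "--retrieve_top_k", str(top_k),
        "--dataset", dataset,
        "--max_step", str(max_step_for(total_retrieval_steps)),
        "--model", model,
    ]
    return shlex.join(args)


# Four retrieval steps correspond to --max_step 3.
command = build_command("hotpotqa", "qwen2.5-7b-instruct", total_retrieval_steps=4)
print(command)
```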

Contact

If you have questions, suggestions, or bug reports, please email:

2401930@stu.neu.edu.cn
