InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning

Yuchen Yan¹,²,*, Liang Jiang², Jin Jiang³, Shuaicheng Li²,
Zujie Wen², Zhiqiang Zhang², Jun Zhou², Jian Shao¹, Yueting Zhuang¹, Yongliang Shen¹,†

¹Zhejiang University, ²Ant Group, ³Peking University
Preprint. Under review.
*Contribution during internship at Ling Team, Ant Group. †Corresponding author.

Arxiv | 📑 WebPage

News 🔥🔥

2026.02.09: We release our paper.

Overview 🦾🦾

Building upon our previous work InftyThink, we introduce InftyThink+, an end-to-end reinforcement learning framework that directly optimizes the complete iterative reasoning trajectory. InftyThink+ keeps InftyThink's paradigm of model-controlled iteration boundaries and explicit summarization, and trains in two stages: a cold-start stage that uses supervised fine-tuning to establish the basic iterative reasoning format, followed by an RL stage that optimizes strategic decisions through trajectory-level learning. The rollout strategy, reward formulation, and policy gradient estimation are each tailored to InftyThink's single-trajectory, multi-inference structure. This design separates format acquisition from strategy optimization, so the model learns not only how to produce iterative reasoning, but also when to summarize, what to preserve, and how to use its own summaries effectively across iterations.
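
The official code is not released yet, so the following is only a minimal sketch of the single-trajectory, multi-inference loop described above, not the authors' implementation. It assumes a hypothetical `step` callable (standing in for one model inference) that receives the question and the previous summary and returns a reasoning segment plus either a new summary or a final answer. The names `iterative_reason`, `toy_step`, and `max_iters` are illustrative assumptions.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical signature for one inference call (an assumption, not a real API):
# (question, previous_summary) -> (reasoning, new_summary, final_answer).
# The model decides the iteration boundary: it returns a summary to continue,
# or a final answer to stop.
StepFn = Callable[[str, Optional[str]], Tuple[str, Optional[str], Optional[str]]]


def iterative_reason(
    question: str, step: StepFn, max_iters: int = 8
) -> Tuple[Optional[str], List[str]]:
    """Run iterations until the model emits an answer or the budget runs out."""
    summary: Optional[str] = None  # the first iteration has no summary
    trace: List[str] = []          # reasoning segments of the whole trajectory
    for _ in range(max_iters):
        reasoning, new_summary, answer = step(question, summary)
        trace.append(reasoning)
        if answer is not None:
            return answer, trace   # the model chose to stop
        summary = new_summary      # only the summary carries over
    return None, trace             # budget exhausted without an answer


def toy_step(question: str, summary: Optional[str]):
    """Stand-in for a model: counts iterations in the summary, answers on the third."""
    done = 0 if summary is None else int(summary)
    if done >= 2:
        return f"iteration {done + 1}: conclude", None, "42"
    return f"iteration {done + 1}: partial progress", str(done + 1), None


answer, trace = iterative_reason("toy question", toy_step)
```

In this sketch only the summary is passed from one inference to the next, which is what keeps each call's context bounded. A trajectory-level reward, as described above, would be computed once over the full `trace` and final `answer` rather than per iteration.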

QuickStart 🎯🎯

Code and documentation are on the way.

Citation

If you find our work helpful, please consider citing it:

@misc{yan2026inftythinkplus,
      title={InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning}, 
      author={Yuchen Yan and Liang Jiang and Jin Jiang and Shuaicheng Li and Zujie Wen and Zhiqiang Zhang and Jun Zhou and Jian Shao and Yueting Zhuang and Yongliang Shen},
      year={2026},
      eprint={2602.06960},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2602.06960}, 
}

Contact Us

If you have any questions, please contact us by email: yanyuchen@zju.edu.cn
