📃 Paper | 🤗 QFFT-7B | 🤗 QFFT-32B | 📚 QFFT Datasets
The complete code is coming soon!
Welcome to the official repository for QFFT (Question-Free Fine-Tuning for Adaptive Reasoning)!
QFFT is a novel, efficient fine-tuning method that equips large language models with adaptive reasoning ability. Instead of training on (Question, Reasoning) pairs as in traditional Supervised Fine-Tuning (SFT), QFFT discards the question input and learns solely from the reasoning response, especially Long CoT outputs.
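To make the data difference concrete, here is a minimal, illustrative sketch of how an SFT example and a QFFT example differ. The field names and `<think>` delimiters are assumptions for illustration, not the repo's exact data format:

```python
# Minimal illustrative sketch (not the repo's exact data format): SFT
# supervises the response conditioned on the question, while QFFT drops
# the question and supervises the reasoning response alone. Field names
# and <think> delimiters are assumptions.
sft_example = {
    "instruction": "What is 17 * 24?",  # the question is part of the input
    "output": "<think>17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408</think> The answer is 408.",
}

qfft_example = {
    "instruction": "",  # the question is removed entirely
    "output": "<think>17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408</think> The answer is 408.",
}
```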
QFFT enables models to:
- Preserve Short CoT for simple tasks (efficiency)
- Trigger Long CoT only when needed (effectiveness)
- Reduce overthinking by minimizing unnecessary reasoning
- Improve robustness in noisy, low-resource, and out-of-domain scenarios
We open-source our models, data, and code in this repository.

To set up the environment:

```bash
cd LLaMA-Factory
pip install -e ".[torch,metrics]" --no-build-isolation
pip install vllm bitsandbytes flashinfer-python==0.2.2.post1
pip install latex2sympy2 word2number
```
| Model Name | Base LLM | Link |
|---|---|---|
| QFFT-S1-7B | Qwen2.5-7B-Instruct | HF Link |
| QFFT-S1-32B | Qwen2.5-32B-Instruct | HF Link |
| QFFT-LIMO-7B | Qwen2.5-7B-Instruct | HF Link |
| QFFT-LIMO-32B | Qwen2.5-32B-Instruct | HF Link |
QFFT trains on responses distilled from strong Long CoT models (e.g., DeepSeek-R1); during QFFT training, the input questions are removed entirely (see the sketch after the table below).
| Dataset | Size | Link |
|---|---|---|
| S1.1 | 1k | HF Link |
| LIMO | 871 | HF Link |
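As a concrete illustration of this construction, the following sketch converts a generic (question, response) dataset into question-free training data. The file name and field names are placeholders, not the released datasets' actual schema:

```python
# Illustrative sketch: building question-free training data from distilled
# Long CoT (question, response) pairs. The file name and field names are
# placeholders; check the released QFFT datasets for the actual schema.
from datasets import load_dataset

ds = load_dataset("json", data_files="distilled_long_cot.json", split="train")

def drop_question(example):
    # Keep only the reasoning response; discard the question entirely.
    return {"instruction": "", "output": example["response"]}

qfft_ds = ds.map(drop_question, remove_columns=ds.column_names)
```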
To train a model with QFFT, use `llamafactory-cli` and the provided YAML configs:

```bash
llamafactory-cli train examples/train_qfft/train_s1_qfft.yaml
llamafactory-cli train examples/train_qfft/train_limo_qfft.yaml
```
This codebase is built on LLaMA-Factory. Our key modification lies in the template system: we implement a new QFFT template in `/src/llamafactory/data/template.py` (see line 1569 for details).
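For orientation, a question-free template might be registered along the following lines. This is a hedged sketch using upstream LLaMA-Factory's `_register_template` and formatter conventions; the actual QFFT template at line 1569 may differ:

```python
# Hedged sketch, not the repo's actual code: a template whose user turn
# renders nothing, so the loss is computed only on the assistant's
# reasoning response. _register_template, EmptyFormatter, and
# StringFormatter follow upstream LLaMA-Factory's template.py conventions.
_register_template(
    name="qfft",
    format_user=EmptyFormatter(),  # the question contributes no tokens
    format_assistant=StringFormatter(slots=["{{content}}"]),
)
```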
You can evaluate QFFT models on benchmarks (e.g., GSM8K, MATH, AIME) with inference engines such as vLLM or SGLang.
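For example, a minimal vLLM generation loop might look like this. The model path is a placeholder and the sampling settings are assumptions, since the official evaluation code is not yet released:

```python
# Minimal vLLM sketch for sampling from a QFFT model. The model path is a
# placeholder and the sampling hyperparameters are assumptions.
from vllm import LLM, SamplingParams

llm = LLM(model="path/to/QFFT-S1-7B")  # placeholder: point to the released checkpoint
params = SamplingParams(temperature=0.0, max_tokens=16384)
prompts = ["Janet has 3 apples and buys 2 more. How many apples does she have?"]
outputs = llm.generate(prompts, params)
print(outputs[0].outputs[0].text)
```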
We also propose a novel metric, RAK (Reasoning Adaptability Kappa), to evaluate reasoning adaptability.
The evaluation code is coming soon!
Here are the main results comparing SFT and QFFT on three mathematical reasoning benchmarks (GSM8K, MATH, AIME25):
**7B models (Qwen2.5-7B-Instruct)**

| Dataset | Method | GSM8K Acc | GSM8K Tokens | MATH Acc | MATH Tokens | AIME25 Acc | AIME25 Tokens | Avg Acc | Avg Tokens |
|---|---|---|---|---|---|---|---|---|---|
| S1.1 | SFT | 90.6 | 1.7K | 80.8 | 5.3K | 18.2 | 17.7K | 63.2 | 8.2K |
| | QFFT | 91.0 | 0.4K | 80.2 | 2.8K | 17.2 | 12.8K | 62.8 | 5.3K |
| | Δ | +0.4 | -76.5% | -0.6 | -47.2% | -1.0 | -27.7% | -0.4 | -50.5% |
| LIMO | SFT | 88.2 | 1.8K | 80.4 | 5.8K | 16.8 | 17.1K | 61.8 | 8.2K |
| | QFFT | 88.0 | 0.7K | 80.6 | 4.1K | 17.2 | 15.6K | 61.9 | 6.8K |
| | Δ | -0.2 | -61.1% | +0.2 | -29.3% | +0.4 | -8.8% | +0.1 | -33.1% |
**32B models (Qwen2.5-32B-Instruct)**

| Dataset | Method | GSM8K Acc | GSM8K Tokens | MATH Acc | MATH Tokens | AIME25 Acc | AIME25 Tokens | Avg Acc | Avg Tokens |
|---|---|---|---|---|---|---|---|---|---|
| S1.1 | SFT | 92.8 | 2.1K | 93.1 | 4.1K | 48.6 | 16.2K | 78.2 | 7.5K |
| | QFFT | 93.6 | 0.6K | 92.2 | 2.4K | 46.8 | 12.9K | 77.5 | 5.3K |
| | Δ | +0.8 | -71.4% | -0.9 | -41.5% | -1.8 | -20.4% | -0.6 | -44.4% |
| LIMO | SFT | 91.2 | 1.9K | 93.0 | 3.9K | 45.8 | 13.2K | 76.6 | 6.3K |
| | QFFT | 92.6 | 0.8K | 92.6 | 2.9K | 45.0 | 12.5K | 76.7 | 5.4K |
| | Δ | +1.4 | -57.9% | -0.4 | -25.6% | -0.8 | -5.3% | +0.1 | -29.6% |
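The Δ token columns report the relative change in average output length versus SFT, i.e. (QFFT − SFT) / SFT. A quick check:

```python
# Sanity check for the Δ token-reduction figures: relative change vs. SFT.
def rel_change(sft_tokens: float, qfft_tokens: float) -> str:
    return f"{(qfft_tokens - sft_tokens) / sft_tokens:+.1%}"

print(rel_change(1.7, 0.4))    # -76.5%  (S1.1, 7B, GSM8K)
print(rel_change(17.1, 15.6))  # -8.8%   (LIMO, 7B, AIME25)
```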
If you find QFFT useful, please cite:

```bibtex
@misc{liu2025qfft,
      title={QFFT, Question-Free Fine-Tuning for Adaptive Reasoning},
      author={Wanlong Liu and Junxiao Xu and Fei Yu and Yukang Lin and Ke Ji and Wenyu Chen and Yan Xu and Yasheng Wang and Lifeng Shang and Benyou Wang},
      year={2025},
      eprint={2506.12860},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2506.12860},
}
```