Rui Shao1,3,†, Ruize Gao2,†, Bin Xie1, Yixing Li1, Kaiwen Zhou4, Shuai Wang4, Weili Guan1,3, Gongwei Chen1,*
1 Harbin Institute of Technology, Shenzhen
2 National University of Singapore, CNRS@CREATE
3 Shenzhen Loop Area Institute
4 Huawei Noah’s Ark Lab
† Equal contribution * Corresponding author
Overview of trajectory synthesis paradigms. Compared with (a) existing methods, (b) HATS integrates hardness-driven exploration and alignment-guided refinement in a closed loop, producing high-quality trajectories with rich semantic coverage and strong instruction-execution alignment. (c) Experiments show HATS outperforms OS-Genesis by 100%↑ on AndroidWorld (22.60 vs. 11.30) and 215%↑ on WebArena (20.60 vs. 6.53).
Current GUI trajectory synthesis pipelines struggle with semantically ambiguous actions: interactions whose functional meaning depends on contextual, sequential, or visual cues. These actions are:
- Under-represented: Over 70% of collected traces collapse into trivial actions like "open menu" or "tap back"
- Poorly processed: Even when captured, they often lead to instruction-execution misalignment, introducing noisy supervision
Examples of semantically ambiguous actions include:
- (a) Identical icons triggering different functions depending on context
- (b) Operations requiring prerequisite steps to succeed
- (c) Visually similar elements leading to distinct outcomes
HATS consists of two cooperative modules unified through Hardness-Driven Monte Carlo Tree Search (HD-MCTS):
Problem with uniform exploration: Random walks oversample trivial actions and miss semantically challenging interactions.
Our solution: Replace random exploration with a hardness-aware policy that:
- Uses UCB-based selection to balance exploration and exploitation
- Prioritizes under-represented, semantically complex UI states
- Concentrates search effort on high-value, ambiguous actions
Problem with one-shot synthesis: Direct instruction generation produces vague descriptions that fail to replay consistently.
Our solution: Multi-round refinement process that:
- Synthesizes initial instruction from exploration trace
- Replays instruction to verify execution consistency
- Measures alignment using action-level reconstruction recall
- Refines instruction by injecting missing contextual cues
- Iterates until semantic alignment is achieved (R ≥ 0.7)
Only verified trajectories passing alignment checks are admitted to the training corpus.
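The refinement steps above can be sketched as a verify-and-refine loop. This is a simplified illustration under assumptions: `synthesize`, `replay`, and `refine` are hypothetical callables standing in for the model-backed steps, and recall is computed by naive set membership rather than the paper's action matcher.

```python
def refine_until_aligned(trace, synthesize, replay, refine,
                         threshold=0.7, max_rounds=5):
    """Keep an instruction only once its replayed actions reconstruct
    the exploration trace with recall >= threshold.

    `synthesize`, `replay`, and `refine` are hypothetical callables
    standing in for the model-backed steps described above.
    """
    instruction = synthesize(trace)
    recall = 0.0
    for _ in range(max_rounds):
        replayed = replay(instruction)
        # Action-level reconstruction recall: fraction of ground-truth
        # actions recovered by the replay.
        hits = sum(1 for action in trace if action in replayed)
        recall = hits / len(trace)
        if recall >= threshold:
            return instruction, recall  # admitted to the corpus
        # Inject missing context and try again.
        instruction = refine(instruction, trace, replayed)
    return None, recall  # rejected: never reached alignment
```

Returning `None` for trajectories that never reach the threshold mirrors the admission rule: only verified, aligned trajectories enter the training corpus.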
The two modules form a feedback cycle:
- Exploration → Refinement: Hardness-driven search supplies challenging trajectories for validation
- Refinement → Exploration: Misalignment signals are converted into hardness rewards that guide future exploration
This closed loop progressively enhances both the diversity (coverage of semantically ambiguous actions) and the fidelity (instruction-execution alignment) of the synthesized data.
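The refinement-to-exploration direction of the cycle can be sketched as converting a misalignment signal into a running hardness estimate per UI state. The exponential smoothing, the `decay` constant, and the `node_stats` mapping are illustrative assumptions, not the paper's stated update rule.

```python
def update_hardness(node_stats, state_id, recall, decay=0.9):
    """Convert a misalignment signal (1 - recall) into a smoothed
    hardness score for a UI state, so future search rounds prioritize
    states whose trajectories failed alignment checks.

    `node_stats` maps state ids to hardness in [0, 1]; all names and
    the smoothing scheme are illustrative.
    """
    signal = 1.0 - recall                 # low recall => hard state
    prev = node_stats.get(state_id, 0.0)
    node_stats[state_id] = decay * prev + (1.0 - decay) * signal
    return node_stats[state_id]
```

States that repeatedly produce misaligned instructions accumulate hardness, which is what steers the hardness-driven search back toward them in the next round.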
If you find HATS useful for your research, please cite our paper:
@inproceedings{shao2026hats,
title={HATS: Hardness-Aware Trajectory Synthesis for GUI Agents},
author={Shao, Rui and Gao, Ruize and Xie, Bin and Li, Yixing and Zhou, Kaiwen and Wang, Shuai and Guan, Weili and Chen, Gongwei},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2026}
}





