MobileForge: Annotation-Free Adaptation for Mobile GUI Agents with Hierarchical Feedback-Guided Policy Optimization

MobileForge turns real target-app interaction into executable curricula, hierarchical rollout feedback, and hint-contextualized policy updates without human-written tasks, demonstrations, or reward labels.

🔥 News

2026-06-23: Released the MobileForge codebase, 🤗 datasets, and 🤗 benchmark results.
2026-06-19: MobileForge preprint is available on arXiv.
2026-06-10: Released all MobileForge 🤗 model checkpoints.

📊 Main Results

MobileForge improves mobile GUI agents through annotation-free target-app adaptation. With GUI-Owl-1.5-8B, MobileForge reaches 67.24% Pass@1 and 77.59% Pass@3 on AndroidWorld, and a 41.03% success rate (SR) on MobileWorld. With Qwen3-VL-8B, MobileForge raises AndroidWorld Pass@3 to 67.24%.
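
For readers unfamiliar with the Pass@k numbers above, the following is a minimal sketch of one common way to compute them. It assumes that Pass@k counts a task as solved when at least one of its first k attempts succeeds; the function name `pass_at_k` and the toy results dictionary are hypothetical and are not taken from the MobileForge evaluation code.

```python
from typing import Dict, List


def pass_at_k(results: Dict[str, List[bool]], k: int) -> float:
    """Fraction of tasks with at least one success among the first k attempts.

    Assumption: this is the "any of k attempts" definition of Pass@k.
    The repository's evaluation scripts may compute it differently.
    """
    if not results:
        raise ValueError("results must contain at least one task")
    solved = sum(1 for attempts in results.values() if any(attempts[:k]))
    return solved / len(results)


# Toy data: per-task success flags for three attempts (hypothetical).
toy_results = {
    "task_a": [True, False, True],
    "task_b": [False, True, False],
    "task_c": [False, False, False],
    "task_d": [False, False, True],
}

print(pass_at_k(toy_results, 1))  # 0.25
print(pass_at_k(toy_results, 3))  # 0.75
```

Under this definition Pass@3 can never be lower than Pass@1 on the same runs, which matches the ordering of the reported numbers.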

🧩 Overview

MobileForge adapts mobile GUI agents without collecting task-specific human annotations. It combines MobileGym for target-app exploration and automatic curriculum generation with HiFPO for hint-guided rollout, hierarchical trajectory feedback, and step-level GRPO training.

MobileGym: target-app interaction and hierarchical feedback

MobileGym grounds the adaptation loop in real target-app interaction. It explores Android apps, mines executable curriculum tasks from interaction traces, executes rollouts, and evaluates completed attempts with outcome labels, step-level process feedback, and corrective hints.
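
The three feedback levels described above (outcome label, step-level process feedback, corrective hint) can be pictured as a small record per attempt. The sketch below is illustrative only: the class names `StepFeedback` and `AttemptFeedback`, their fields, and the example actions are assumptions, not the schema used in `rollout/`.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StepFeedback:
    """Process-level judgment for one action in a rollout (hypothetical schema)."""
    step_index: int
    action: str
    is_helpful: bool
    comment: str = ""


@dataclass
class AttemptFeedback:
    """Feedback for one completed attempt: outcome, per-step labels, and a hint."""
    task: str
    success: bool                                   # outcome label
    steps: List[StepFeedback] = field(default_factory=list)
    hint: Optional[str] = None                      # corrective hint for the next attempt

    def first_unhelpful_step(self) -> Optional[StepFeedback]:
        """Return the earliest step judged unhelpful, or None if every step was fine."""
        for step in self.steps:
            if not step.is_helpful:
                return step
        return None


feedback = AttemptFeedback(
    task="Add a contact named Alice",
    success=False,
    steps=[
        StepFeedback(0, "open_app(Contacts)", True),
        StepFeedback(1, "click(Settings)", False, "Settings is unrelated to adding a contact"),
    ],
    hint="Tap the + button on the contact list.",
)

bad_step = feedback.first_unhelpful_step()
print(bad_step.step_index)  # 1
```

Keeping the outcome, the step labels, and the hint in one record makes it straightforward to pass the hint into the next attempt while using the step labels for later filtering.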

HiFPO: feedback-guided policy optimization

HiFPO turns MobileGym feedback into training signals. It runs hint-guided multi-attempt rollouts, filters mastered tasks and low-quality steps, retains informative experience, and trains the agent with hint-contextualized step-level GRPO.
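
As a rough illustration of the pipeline above, the sketch below shows the standard group-relative advantage used by GRPO (each reward normalized by the mean and standard deviation of its group), a simple "mastered task" filter, and a hint-contextualized prompt. All function names, the "every attempt succeeded" mastery criterion, the prompt format, and the use of the population standard deviation are assumptions made for illustration; the actual logic lives in `rollout/` and `training/`.

```python
from statistics import mean, pstdev
from typing import List, Optional


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """GRPO-style advantages: (r - group mean) / (group std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an implementation may use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


def is_mastered(outcomes: List[bool]) -> bool:
    """Assumed criterion: a task is mastered when every attempt succeeded."""
    return len(outcomes) > 0 and all(outcomes)


def build_prompt(task: str, hint: Optional[str]) -> str:
    """Prepend the corrective hint (if any) so the policy update sees the same context."""
    if hint:
        return f"Task: {task}\nHint: {hint}"
    return f"Task: {task}"


# A group of four rollouts for one task with binary rewards (toy values).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print([round(a, 3) for a in advantages])  # [1.0, -1.0, -1.0, 1.0]

# A mastered task yields identical rewards, so every advantage is zero
# and the group carries no learning signal; filtering it saves compute.
print(is_mastered([True, True, True]))                    # True
print(group_relative_advantages([1.0, 1.0, 1.0]))         # [0.0, 0.0, 0.0]
print(build_prompt("Add a contact named Alice", "Tap the + button."))
```

The zero-advantage case is one plausible reason to drop mastered tasks before training: when every rollout in a group receives the same reward, the normalized advantages vanish.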

📁 Repository Guide

MobileForge/
|-- explore/                       # Target-app exploration and MobileGym-Curriculum task generation
|-- rollout/                       # Hint-guided rollout, critic feedback, and rollout-to-GRPO processing
|-- training/                      # VERL-derived MobileForge step-level GRPO training stack
|-- evaluation/
|   |-- androidworld/              # AndroidWorld evaluation fork and reproduction utilities
|   `-- mobileworld/               # MobileWorld reproduction notes and helpers
|-- docs/                          # Models, data release, pipeline, and evaluation-result mapping
|-- metadata/                      # Public release manifests and model/result maps
|-- CITATION.cff
|-- citations.bib
`-- README.md

🚀 Where to Start

The root README is intentionally concise. Detailed setup and commands live in the component README files.

Goal	Start here	What it covers
Explore target apps and generate tasks	`explore/`	Target-app exploration, APK cache, parallel exploration, and MobileGym-Curriculum task generation.
Run hint-guided rollouts and build GRPO data	`rollout/`	Multi-attempt rollout, MobileGym-Critic feedback, hint reuse, and rollout-to-training-data conversion.
Train with HiFPO / step-level GRPO	`training/`	Training environment, GRPO launch script, reward function, and utility tools.
Reproduce benchmark runs	`evaluation/` and `docs/evaluation_results.md`	AndroidWorld and MobileWorld artifact mapping and evaluation notes.
Inspect release manifests	`docs/` and `metadata/`	Model list, dataset release notes, pipeline overview, and model-to-result mapping.

📦 Release Index

Artifact	Link	Details
Models	🤗 MobileForge Models collection	Main ForgeQwen3 / ForgeOwl checkpoints and scaling-ablation checkpoints. See `docs/models.md`.
Datasets	🤗 MobileForge Datasets collection	Training data, exploration trajectories, and generated tasks. See `docs/data_release.md`.
Benchmark results	🤗 `lgy0404/mobileforge-benchmark-results`	AndroidWorld and MobileWorld archives. See `docs/evaluation_results.md`.
Paper	arXiv:2606.19930	Technical report and citation metadata.

Citation

@article{liu2026mobileforge,
  title={MobileForge: Annotation-Free Adaptation for Mobile GUI Agents with Hierarchical Feedback-Guided Policy Optimization},
  author={Liu, Guangyi and Zhao, Pengxiang and Wu, Gao and Yin, Yiwen and Li, Mading and Liu, Liang and Liu, Congxiao and Qi, Zhang and Wang, Mengyan and Guo, Liang and others},
  journal={arXiv preprint arXiv:2606.19930},
  year={2026}
}

Contact

For questions about the paper, code, or released artifacts, contact guangyiliu@zju.edu.cn.

🙏 Acknowledgements

MobileForge builds on open-source resources including AndroidWorld, MobileWorld, MobileAgent, Qwen3-VL, GUI-explorer, VERL, and GUI-R1.
