OS-Sentinel

🛠️ Usage

📦 Installation

Clone this repository and set up the AndroidWorld environment. Even if AndroidWorld is already installed, you may still need to install the extra packages listed in requirements.txt:

```
git clone https://github.com/OS-Copilot/OS-Sentinel
cd OS-Sentinel
# install AndroidWorld
# requirements.txt contains packages not included by AndroidWorld
pip install -r requirements.txt
```

Install Node.js and Appium:

```
wget -O install_nvm.sh https://raw.githubusercontent.com/nvm-sh/nvm/v0.35.2/install.sh
bash install_nvm.sh
nvm install v18.12.1
npm install -g appium appium-doctor
npm install wd
appium driver install uiautomator2
```
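
Before moving on, it can help to confirm that the tools installed above are actually reachable from your shell. The following is a minimal sketch and not part of OS-Sentinel: the helper name `missing_tools` is hypothetical, and the executable names `node`, `npm`, and `appium` are assumed to be what the commands above put on your PATH.

```python
import shutil


def missing_tools(names):
    """Return the executables from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names for Node.js, npm, and Appium.
    missing = missing_tools(["node", "npm", "appium"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("node, npm, and appium are all on PATH.")
```

If `node` or `npm` is reported missing right after installing nvm, opening a new shell (so that nvm is loaded) is usually the first thing to try.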

Run root.py to configure the MobileSafetyBench environment automatically:
```
conda activate android
python root.py
```
You can then run the MobileSafetyBench script (msb.py) within the AndroidWorld environment.

Note

The OPENAI_API_KEY environment variable is required when calling an external VLM; OPENAI_BASE_URL is optional.
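
A small pre-flight check can catch a missing key before a long run starts. This is only a sketch: the function name `check_openai_env` is hypothetical, and it inspects only the two variable names mentioned in the note above.

```python
import os


def check_openai_env(environ=None):
    """Return (api_key_set, base_url) read from `environ` (defaults to os.environ)."""
    env = os.environ if environ is None else environ
    api_key_set = bool(env.get("OPENAI_API_KEY"))
    base_url = env.get("OPENAI_BASE_URL")  # optional; None when unset
    return api_key_set, base_url


if __name__ == "__main__":
    key_set, base_url = check_openai_env()
    if key_set:
        print("OPENAI_API_KEY is set; base URL:", base_url or "(not set)")
    else:
        print("OPENAI_API_KEY is not set (needed when calling an external VLM).")
```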

🔀 Modes

step: checks the safety of a single-step action using both rule-based and VLM-based methods;
```
timestep_new, in_danger = env.record(action)
```
record: records the trajectory of actions proposed by a mobile agent.
```
timestep_new = env.record(action)
```
This method snapshots the system state before each action, so env.record("terminate()") must be called at the end; otherwise the last action will not be recorded.
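
The closing `terminate()` call is easy to forget, so it can be wrapped in a helper. The sketch below is illustrative only: `StubEnv` and `record_trajectory` are hypothetical names, the action strings are placeholders, and the stub's return value does not reflect what the real environment returns. Only `env.record(action)` and the `"terminate()"` action string come from the description above.

```python
class StubEnv:
    """Hypothetical stand-in for the real environment, for illustration only."""

    def __init__(self):
        self.recorded = []

    def record(self, action):
        # The real method returns a new timestep; the stub just logs the action.
        self.recorded.append(action)
        return len(self.recorded)


def record_trajectory(env, actions):
    """Record each action, then append "terminate()" if the caller did not."""
    timestep = None
    last = None
    for action in actions:
        timestep = env.record(action)
        last = action
    # Without this closing call the last real action would not be recorded.
    if last != "terminate()":
        timestep = env.record("terminate()")
    return timestep


env = StubEnv()
record_trajectory(env, ["tap(1)", "swipe(up)"])  # placeholder action strings
print(env.recorded)  # ['tap(1)', 'swipe(up)', 'terminate()']
```

Swapping `StubEnv` for the real environment object should only require that it expose the same `record` method.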

📏 Benchmark

Download our trajectory data from OS-Copilot/MobileRisk.
Extract the zip files and run the evaluation script:
```
unzip '*.zip'
python pipeline/eval.py
```
Don't forget to fill in _API_KEY.
- pipeline/eval.py is for typical VLM evaluation;
- pipeline/eval_llm.py is for text-only LLM evaluation;
- pipeline/tag.py is for VLM risk-tag evaluation;
- pipeline/cons.py is for trajectories recorded via a mobile agent rather than our hand-made ones.
Run pipeline/multi_method_consistency.py after result.json is ready.
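
The ordering constraint in the last step can be enforced with a small guard. This is a sketch under assumptions: the helper name `consistency_command` is hypothetical, and it assumes result.json sits in the current working directory, which the instructions above do not specify.

```python
import sys
from pathlib import Path


def consistency_command(result_path="result.json"):
    """Return the command for the consistency script, or None if results are missing."""
    path = Path(result_path)  # assumed location; adjust to where your run writes it
    # Treat a missing or empty file as "not ready yet".
    if not path.is_file() or path.stat().st_size == 0:
        return None
    return [sys.executable, "pipeline/multi_method_consistency.py"]


if __name__ == "__main__":
    cmd = consistency_command()
    print("Ready to run:" if cmd else "result.json is not ready yet.", cmd or "")
```

When a command is returned, it can be passed to `subprocess.run(cmd, check=True)` from the repository root.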

📋 Citation

```
@article{sun2025ossentinel,
  title={OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows},
  author={Qiushi Sun and Mukai Li and Zhoumianze Liu and Zhihui Xie and Fangzhi Xu and Zhangyue Yin and Kanzhi Cheng and Zehao Li and Zichen Ding and Qi Liu and Zhiyong Wu and Zhuosheng Zhang and Ben Kao and Lingpeng Kong},
  journal={arXiv preprint arXiv:2510.24411},
  year={2025}
}
```
