# TrustGuard: A Multi-Agent Reinforcement Learning Framework for Autonomous Permission Governance in Mobile Ecosystems

Official implementation of the TrustGuard paper:

> Akarma, A., Jan, S., & Syed, T. A. (2026). TrustGuard: A Multi-Agent Reinforcement Learning Framework for Autonomous Permission Governance in Mobile Ecosystems.
Mobile permission systems rely on static policies and uninformed user prompts that cannot reason about application behaviour at runtime. TrustGuard replaces this with a continuous, learning-based governance loop formalised as a Decentralised Partially Observable Markov Decision Process (Dec-POMDP).
Three cooperative agents (Monitoring, Risk-Analysis, and Enforcement) are trained via Centralised Training / Decentralised Execution (CTDE) using MAPPO with a Lagrangian safety constraint that bounds the false-revocation rate.
| Metric | TrustGuard | Best Baseline |
|---|---|---|
| Permission Risk AUROC | 0.963 | 0.921 (MaMaDroid) |
| Privacy Risk Reduction | 41.3% | 34.9% (Single-Agent RL) |
| False Revocation Rate | 2.1% | 6.8% (Single-Agent RL) |
| Enforcement Latency | 1.9 s | 2.8 s |
| AUROC under Mimicry Attack | 0.891 | 0.739 (MaMaDroid) |
```
App Metadata ──▶ App Semantic Encoder (BERT + GATv2 + CodeBERT) ──▶ φ(fᵢ) ∈ ℝ²⁵⁶
                                        │
      Permission Prediction Model ◀─────┘
      g_θ: ℝ²⁵⁶ → [0,1]^|𝒫|
                  │
Runtime Traces ───┴────────────────▶ Runtime Risk Estimator
                                     ρᵢᵗ (EMA-smoothed)
                                          │
            ┌─────────────────────────────┼─────────────────────────────┐
            ▼                             ▼                             ▼
      Monitoring                    Risk-Analysis                 Enforcement
      Agent (k=1)                    Agent (k=2)                  Agent (k=3)
            └─────────────────────────────┼─────────────────────────────┘
                                          │
                                  Shared Belief bₜ
                                 (GRU Encoder f_φ)
                                          │
                  Enforcement Action ∈ {no_op, alert, rate_limit, revoke}
```
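The Runtime Risk Estimator smooths raw per-step risk scores with an exponential moving average before they reach the agents. A minimal sketch of that smoothing (the decay coefficient `beta` here is an illustrative assumption, not the paper's value):

```python
def ema_risk(raw_risks, beta=0.9):
    """Exponential moving average over per-step raw risk scores.

    rho_t = beta * rho_{t-1} + (1 - beta) * r_t

    beta = 0.9 is an illustrative default; the trained estimator may
    use a different coefficient.
    """
    rho = 0.0
    trace = []
    for r in raw_risks:
        rho = beta * rho + (1 - beta) * r
        trace.append(rho)
    return trace

# A single spike in raw risk decays gradually rather than flipping the
# estimate instantly, which damps one-off noisy observations.
smoothed = ema_risk([0, 1, 0, 0], beta=0.5)  # [0.0, 0.5, 0.25, 0.125]
```

The EMA is what lets a one-frame anomaly raise risk without immediately crossing the enforcement threshold.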
The system is trained end-to-end via Constrained MAPPO:
```
ℒ(θ, μ) = 𝔼[ Σₜ γᵗ rₜ ] − μ · ( 𝔼[false_revocations] − ε_safe )
```
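The multiplier μ in this objective is typically learned by dual gradient ascent on the constraint violation. A minimal sketch of such an update; the learning rate and function name are assumptions, not the repository's actual API:

```python
def dual_update(mu, false_revocation_rate, eps_safe=0.025, lr=0.01):
    """One dual ascent step on the Lagrange multiplier mu for the
    constraint E[false_revocations] <= eps_safe.

    eps_safe matches lagrangian.eps_safe in configs/marl.yaml;
    lr is an illustrative choice. mu is projected back onto [0, inf).
    """
    mu = mu + lr * (false_revocation_rate - eps_safe)
    return max(mu, 0.0)

mu = 0.0
# Constraint violated (FRR 6% > 2.5%): mu grows, so revocations are
# penalised more heavily in the next policy update.
mu = dual_update(mu, 0.06)
# Constraint satisfied (FRR 1% < 2.5%): mu decays back toward zero.
mu = dual_update(mu, 0.01)
```

Intuitively, μ acts as an adaptive price on false revocations: it rises while the policy over-revokes and relaxes once the FRR sits below ε_safe.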
```
trustguard/
├── trustguard/                        # Main package
│   ├── agents/                        # Three Dec-POMDP agents + policy networks
│   │   ├── monitoring_agent.py
│   │   ├── risk_analysis_agent.py
│   │   ├── enforcement_agent.py
│   │   └── policy_networks.py
│   ├── models/                        # Four-layer model stack
│   │   ├── semantic_encoder.py        # Layer 1: BERT + GATv2 + CodeBERT
│   │   ├── permission_predictor.py    # Layer 2: multi-label MLP
│   │   ├── runtime_risk_estimator.py  # Layer 3: EMA risk tracker
│   │   └── belief_encoder.py          # GRU-based shared belief state
│   ├── marl/                          # MAPPO training infrastructure
│   │   ├── mappo.py                   # Constrained MAPPO trainer
│   │   ├── rollout_buffer.py          # On-policy experience buffer
│   │   └── centralized_critic.py
│   ├── environment/                   # Simulation environment
│   │   ├── permission_env.py          # Dec-POMDP environment
│   │   ├── app_simulator.py           # Benign + malicious app behaviour
│   │   └── observation_builder.py
│   ├── dataset/                       # PermissionBench utilities
│   │   ├── permissionbench_loader.py
│   │   ├── dataset_builder.py
│   │   └── preprocessing.py
│   └── utils/
│       ├── metrics.py                 # PRR, FRR, AUROC, F1, ...
│       ├── logging_utils.py           # W&B + TensorBoard
│       └── config_utils.py
├── experiments/                       # Runnable experiment scripts
│   ├── train_trustguard.py            # Main training script
│   ├── evaluate_prediction.py         # Task 1: permission risk prediction
│   ├── evaluate_enforcement.py        # Task 2: autonomous enforcement
│   └── adversarial_evaluation.py      # Task 3: mimicry attack
├── configs/                           # YAML configuration files
│   ├── model.yaml
│   ├── marl.yaml
│   ├── training.yaml
│   └── dataset.yaml
├── scripts/
│   ├── build_dataset.py
│   └── run_full_experiment.sh
├── tests/                             # pytest test suite
├── docs/                              # Extended documentation
└── notebooks/
    └── trustguard_demo.ipynb
```
```bash
git clone https://github.com/aliakarma/trustguard.git
cd trustguard
conda create -n trustguard python=3.10 -y
conda activate trustguard

pip install torch==2.1.2 torchvision==0.16.2 --index-url https://download.pytorch.org/whl/cu121
pip install torch-geometric==2.4.0
pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-2.1.0+cu121.html
pip install -e ".[dev]"

bash scripts/download_permissionbench.sh
```

This downloads the pre-processed dataset (~2 GB) to `data/permissionbench/`.
To rebuild the dataset from scratch:

```bash
# Requires an AndroZoo API key: set the ANDROZOO_API_KEY environment variable
python scripts/build_dataset.py \
    --androzoo-key $ANDROZOO_API_KEY \
    --output-dir data/permissionbench \
    --n-benign 61840 \
    --n-malicious 14512
```

Train TrustGuard:

```bash
python experiments/train_trustguard.py \
    --config-dir configs/ \
    --data-dir data/permissionbench \
    --output-dir outputs/run_001 \
    --seed 42
```

Skip the supervised pre-training stage:

```bash
python experiments/train_trustguard.py \
    --config-dir configs/ \
    --output-dir outputs/run_001 \
    --no-pretrain
```

Log to Weights & Biases:

```bash
python experiments/train_trustguard.py ... --use-wandb
```

Resume from a checkpoint:

```bash
python experiments/train_trustguard.py \
    --resume outputs/run_001/checkpoint_latest.pt ...
```

Evaluate permission risk prediction (Task 1):

```bash
python experiments/evaluate_prediction.py \
    --checkpoint outputs/run_001/checkpoint_best.pt \
    --data-dir data/permissionbench \
    --output-dir outputs/eval_task1
```

Expected output:

```
Accuracy=0.951 | Macro-F1=0.939 | AUROC=0.963 | AP=0.941
```
Evaluate autonomous enforcement (Task 2):

```bash
python experiments/evaluate_enforcement.py \
    --checkpoint outputs/run_001/checkpoint_best.pt \
    --output-dir outputs/eval_task2 \
    --n-episodes 10
```

Expected output:

```
PRR=41.3% | FRR=0.0210 | Latency=1.90s
```
Evaluate robustness under the mimicry attack (Task 3):

```bash
python experiments/adversarial_evaluation.py \
    --checkpoint outputs/run_001/checkpoint_best.pt \
    --data-dir data/permissionbench \
    --output-dir outputs/eval_task3
```

Expected output:

```
AUROC (clean)=0.9630 | AUROC (attack)=0.8910 | Δ=-0.0720
```
Run all three evaluation tasks end to end:

```bash
bash scripts/run_full_experiment.sh outputs/run_001
```

Run the test suite:

```bash
# Fast unit tests (no GPU, no data download required)
pytest tests/ -v

# With coverage report
pytest tests/ --cov=trustguard --cov-report=html
```

All hyperparameters are controlled via YAML files in `configs/`.
Key parameters:
| File | Parameter | Default | Description |
|---|---|---|---|
| `marl.yaml` | `lagrangian.eps_safe` | 0.025 | Maximum false-revocation rate ε_safe |
| `marl.yaml` | `mappo.eps_clip` | 0.2 | PPO clip coefficient |
| `marl.yaml` | `mappo.gae_lambda` | 0.95 | GAE λ |
| `model.yaml` | `semantic_encoder.output_dim` | 256 | Dimension of φ(fᵢ) |
| `model.yaml` | `enforcement_agent.risk_threshold` | 0.5 | Minimum EMA risk for any non-`no_op` action |
| `training.yaml` | `training.marl_iterations` | 500 | Total MARL training iterations |
PermissionBench is the first large-scale benchmark for mobile permission risk analysis with longitudinal runtime traces.
| Split | Benign | Malicious | Total |
|---|---|---|---|
| Train (70%) | 43,288 | 10,158 | 53,446 |
| Val (10%) | 6,184 | 1,451 | 7,635 |
| Test (20%) | 12,368 | 2,903 | 15,271 |
| Total | 61,840 | 14,512 | 76,352 |
Each record contains: app ID, category, description, declared permissions, API call features, binary risk label, per-permission risk labels, and runtime permission traces.
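An illustrative record with that field list, written as a Python dict. The exact key names and value encodings in the released files may differ; everything below (app ID, values, trace layout) is a hypothetical example, not real data:

```python
# Hypothetical PermissionBench record; field names follow the
# description above, but keys and encodings are assumptions.
record = {
    "app_id": "com.example.flashlight",          # illustrative app ID
    "category": "Tools",
    "description": "A simple flashlight app.",
    "declared_permissions": ["CAMERA", "ACCESS_FINE_LOCATION"],
    "api_call_features": [0.0, 1.0, 0.0],        # truncated feature vector
    "risk_label": 1,                             # binary app-level label
    "permission_risk_labels": {                  # per-permission labels
        "CAMERA": 0,
        "ACCESS_FINE_LOCATION": 1,
    },
    "runtime_trace": [                           # longitudinal permission events
        {"t": 0, "permission": "ACCESS_FINE_LOCATION", "granted": True},
        {"t": 5, "permission": "CAMERA", "granted": False},
    ],
}
```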
Download: github.com/aliakarma/PermissionBench
License: CC-BY-4.0
This project is released under the MIT License.
Ali Akarma (443059463@stu.iu.edu.sa)
Islamic University of Madinah, Department of Information Technology