A two-phase multi-agent simulator that generates ecologically valid multimodal training data for face-forgery detection.
Paper · Quick start · Configuration · Pipeline · Extending the toolbox · CLI · Cite
Face-forgery detectors trained on static, curated benchmarks rarely survive contact with the real internet. Agent4FaceForgery closes that gap by simulating the internet instead of scraping it:
- Phase 1 — Forgery Blueprint Generation. A population of LLM-powered creator agents — each with a profile, a reflective memory, and a chainable toolbox of forgery operators (face-swap, attribute edit, diffusion, relighting, …) — iteratively designs (image, caption) pairs. An Adaptive Rejection Sampling (ARS) gate keeps only the forgeries that are challenging for an external detector.
- Phase 2 — Social Interaction Trajectory. Each accepted blueprint is dropped into an OASIS-inspired social platform where Posters, Critics, Watchers, Explorers, Chatters, and a GeminiAuditor interact under a recommender system. Their follower-weighted verdict yields the text-image consistency label δ ∈ {0, 1}.
The pipeline emits a single JSONL with rows (image, text, y, δ, …) that
plugs directly into any downstream detector — Xception, CLIP, LLaVA, ViT.
```
┌──────────────────────┐    ARS    ┌──────────────────────────┐
│ Phase 1 — Creators   │──────────▶│ Phase 2 — Social Threads │
│ profile · memory ·   │  accept   │ RecSys · 21 actions ·    │
│ toolbox (ops×chain)  │  ≥ τ_k    │ follower-weighted δ      │
└──────────────────────┘           └──────────────────────────┘
          │                                     │
          ▼                                     ▼
  blueprints.jsonl                        final.jsonl
                                          (image, text, y, δ, δ_social, …)
```
- Python ≥ 3.10, CUDA-capable GPU (optional — the pipeline falls back to CPU-safe PIL perturbations when GPU or heavy dependencies are missing).
- Access to any OpenAI-compatible chat endpoint (OpenAI, vLLM, Together, Azure OpenAI, …). See Configuration below.
```bash
git clone https://github.com/<your-org>/Agent4FaceForgery.git
cd Agent4FaceForgery
pip install -r requirements.txt
```

```bash
export FFPP_REAL=/path/to/FaceForensics++/original_sequences
export FFPP_FAKE=/path/to/FaceForensics++/manipulated_sequences
```

If `FFPP_FAKE` is missing, the launcher fabricates a tiny grey stub so the profile builder can proceed — convenient for a smoke test.

```bash
bash scripts/train_xception.sh
# writes storage/xception/best.pt
```

```bash
export OPENAI_API_KEY=<your-key>
bash scripts/run_production.sh 8 32
#                              ^ ^
#                              │ └── blueprints per agent
#                              └──── number of creator agents
```

Outputs land under `storage/<SIM_NAME>/`:
| File | What it is |
|---|---|
| `blueprints.jsonl` | accepted rows (x', c', δ', y=1) from Phase 1 |
| `blueprints_imgs/*.png` | forged images |
| `threads/bp_*.json` | distilled multi-agent discussion |
| `propagation/bp_*.json` | full SQLite trace + verdict |
| `final.jsonl` | the training dataset — (image, text, y, δ, …) |
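`final.jsonl` is plain JSON-Lines, so wiring it into any detector dataloader is a few lines. A minimal reader sketch — the `image` / `text` / `y` / `delta` key names are assumed from the row schema above; verify against the emitted file:

```python
import json

from PIL import Image


def iter_rows(path):
    """Yield (image, text, y, delta) tuples from final.jsonl.

    Key names follow the README's (image, text, y, δ, …) schema;
    check the actual file for the exact spelling before training.
    """
    with open(path, encoding="utf-8") as f:
        for line in f:
            row = json.loads(line)
            img = Image.open(row["image"]).convert("RGB")
            yield img, row["text"], row["y"], row["delta"]
```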
```bash
python scripts/to_llava_sft.py \
  --in  storage/<SIM_NAME>/final.jsonl \
  --out storage/<SIM_NAME>/llava_sft.json
# feed llava_sft.json into your LLaVA fine-tuning script of choice
```

All credentials and endpoint URLs are read from environment variables — nothing sensitive is ever written to the repository.
| Variable | Required | Purpose | Example |
|---|---|---|---|
| `OPENAI_API_KEY` | ✅ | Bearer token for the chat endpoint | `export OPENAI_API_KEY=sk-...` |
| `LLM_MODEL` | | Chat model id | `gpt-4o-mini` · `gpt-4o` · `qwen2.5-7b-instruct` |
| `LLM_BASE_URL` | | OpenAI-compatible base URL | `https://api.openai.com/v1` (default) |
| `FFPP_REAL` | | Path to `original_sequences` | `/data/FF++/original_sequences` |
| `FFPP_FAKE` | | Path to `manipulated_sequences` | `/data/FF++/manipulated_sequences` |
Recommended workflow — keep all secrets in a local, git-ignored `.env`:

```bash
# .env (never commit this file — already in .gitignore)
OPENAI_API_KEY=sk-...
LLM_MODEL=gpt-4o-mini
LLM_BASE_URL=https://api.openai.com/v1
FFPP_REAL=/data/FF++/original_sequences
FFPP_FAKE=/data/FF++/manipulated_sequences
```

Load it before running:

```bash
set -a; source .env; set +a
bash scripts/run_production.sh 8 32
```

Any OpenAI-compatible endpoint works — the client only expects the standard `/v1/chat/completions` contract:
```bash
# Example: self-hosted vLLM
export LLM_BASE_URL=http://localhost:8000/v1
export LLM_MODEL=Qwen/Qwen2.5-7B-Instruct
export OPENAI_API_KEY=EMPTY
```

If `OPENAI_API_KEY` is unset, the client fails fast with a clear error instead of silently producing garbage.
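To sanity-check whichever endpoint you configured before launching a long run, the stock `openai` Python client exercises the same contract (this snippet is a convenience check, not part of the repo):

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url=os.environ.get("LLM_BASE_URL", "https://api.openai.com/v1"),
    api_key=os.environ["OPENAI_API_KEY"],  # KeyError here == the fail-fast above
)
resp = client.chat.completions.create(
    model=os.environ.get("LLM_MODEL", "gpt-4o-mini"),
    messages=[{"role": "user", "content": "Reply with the single word: pong"}],
)
print(resp.choices[0].message.content)
```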
```
Profile (v_k, c_k) ──▶ Action (op_1 ∘ … ∘ op_n, caption) ──▶ (x', c')
        ▲                              │                        │
        │                              ▼                        ▼
      Memory ◀──────── Reflection every R writes           ARS scoring
                                                                 │
   s_i = λ·s_LLM + (1−λ)·s_disc                                  │
   τ_k = Quantile(Accepted, q)                                   │
   accept iff s_i ≥ τ_k ◀────────────────────────────────────────┘
```
- **Profile.** Each creator `k` is seeded from FF++ statistics as `v_k = (T_freq, T_div, T_conf)` plus an LLM-authored stylistic taste `c_k`.
- **Memory.** Factual and evaluative entries with timestamps in a numpy / FAISS vector store; after `reflection_threshold` writes, the LLM emits a `next_plan_bias` consumed on the next round.
- **Action module.** An action is `(Edit, Desc)`, where `Edit = O_n(… O_1(x; θ_1) …)` is a sequential operator chain sampled from the toolbox with profile-conditioned weights:

  | Category | Operators |
  |---|---|
  | Identity manipulation | FaceSwap (DFL), FaceSwap (Insightface), NeuralTexture |
  | Attribute editing | AttGAN-Age, StarGAN-Expression, AttGAN-Gender |
  | Style, classical | SBI, GAN-inversion, Relight |
  | Style, diffusion | IP-Adapter-FaceID, InstantID, PuLID, Flux |

  Heavy backends (`diffusers`, `insightface`) lazy-load and transparently fall back to a PIL perturbation if unavailable — the pipeline never blocks.
- **ARS (Eq. 4–5).** `s_disc` fuses Xception and optional LLaVA-judge outputs. Enable LLaVA scoring with `--llava_in_ars`. (A minimal sketch of the gate follows this list.)
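Concretely, the warm-up and quantile rules behind `--ars_lambda`, `--ars_warmup`, `--ars_tau_warmup`, and `--ars_quantile` reduce to a few lines. A minimal sketch assuming scores in [0, 1] — not the repo's `simulation/ars.py`, whose internals may differ:

```python
import numpy as np


class ARSGate:
    """Adaptive Rejection Sampling gate (sketch of Eq. 4-5)."""

    def __init__(self, lam=0.5, warmup=20, tau_warmup=0.2, quantile=0.4):
        self.lam, self.warmup = lam, warmup
        self.tau_warmup, self.quantile = tau_warmup, quantile
        self.accepted = []  # scores of accepted blueprints

    def accept(self, s_llm: float, s_disc: float) -> bool:
        # s_i = λ·s_LLM + (1 − λ)·s_disc
        s_i = self.lam * s_llm + (1 - self.lam) * s_disc
        if len(self.accepted) < self.warmup:
            tau = self.tau_warmup  # lenient threshold during warm-up
        else:
            # τ_k = Quantile(Accepted, q)
            tau = float(np.quantile(self.accepted, self.quantile))
        if s_i >= tau:
            self.accepted.append(s_i)
            return True
        return False
```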
Two presets via `--phase2_mode`:

| | `original` (default) | `enhanced` |
|---|---|---|
| Agents | 6 core — 1 each of Watcher / Explorer / Critic / Chatter / Poster / GeminiAuditor | + N ordinary users sampled from a persona DB |
| Time steps / blueprint | 5 | 40 |
| Action catalogue | 6 leaf actions in 4 categories (viewing / commenting / sharing / labeling) | 21 fine-grained actions |
| Decoy posts | 0 | 50 |
| Environment | per-blueprint | shared across blueprints |
| Follow graph | static | follow / unfollow / mute mutate |
| Typical actions / bp | 7–15 | 40–120 |
For a Critic-heavy council (useful when you want Critics to outnumber the herd camp in ablation sweeps), import the built-in `BALANCED_CORE_MIX` preset:

```python
from simulation.phase2.persona_db import build_population, BALANCED_CORE_MIX

pop = build_population(n_core=8, core_mix=BALANCED_CORE_MIX, seed=0)
```

The six roles view / comment / share / label under a Reddit-style hot-score recommender augmented with sentence-embedding personalisation. Verdicts are aggregated with a log-scaled follower weight (Eq. 6).
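The exact form of Eq. 6 lives in the paper; a plausible reading of "log-scaled follower weight" — assuming per-agent binary votes `v_a` and weights `w_a = log(1 + followers_a)` — is a weighted majority:

```python
import math


def aggregate_delta(votes):
    """votes: iterable of (v_a, follower_count) with v_a ∈ {0, 1}.

    Hypothetical reconstruction of the follower-weighted verdict;
    Eq. 6 in the paper may differ in detail.
    """
    weighted = [(math.log1p(followers), v) for v, followers in votes]
    yes = sum(w for w, v in weighted if v == 1)
    total = sum(w for w, _ in weighted)
    return int(yes >= 0.5 * total)  # δ ∈ {0, 1}
```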
Override any preset with `--phase2_n_ordinary`, `--phase2_decoy_posts`, `--phase2_shared_env`, `--phase2_dynamic_graph`, `--phase2_steps`.
New forgery operators drop in with minimal ceremony: add a class derived from `Operator`, register it, and the agents, ARS, memory, and Phase 2 pick it up unchanged.
**Step 1 — implement the operator.** Create `toolbox/my_op.py`:

```python
from PIL import Image

from .base import Operator, OpMeta


class ColorShift(Operator):
    meta = OpMeta(
        name="color_shift",
        category="style",  # identity | attribute | style
        description="tiny hue rotation — low-cost stylistic perturbation",
        params_schema={"strength": "float ∈ [0, 1]"},
        cost=0.1,
    )

    def _apply(self, image: Image.Image, strength: float = 0.3, **_) -> Image.Image:
        hsv = image.convert("HSV")
        h, s, v = hsv.split()
        # rotate the 8-bit hue channel (wraps at 256, not 255)
        h = h.point(lambda x: int(x + 180 * strength) % 256)
        return Image.merge("HSV", (h, s, v)).convert("RGB")
```

**Step 2 — register it** in `toolbox/registry.py`:

```python
from .my_op import ColorShift

OPERATORS = _register(
    ...,
    ColorShift(),
)
```

That is all. The new operator is now:
- sampleable by `sample_chain(...)` with profile-weighted probabilities,
- visible to the creator LLM via `describe_toolbox()` in prompt context,
- chainable with any existing operator up to `--max_tool_chain` length,
- subject to the ARS accept/reject filter just like built-ins.
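A quick local smoke test before wiring the operator into a full run (illustrative only — the production path samples operators through the registry):

```python
from PIL import Image

from toolbox.my_op import ColorShift

img = Image.open("some_face.png").convert("RGB")
# calling _apply directly for the test; the Operator base class
# presumably exposes a public wrapper around it
out = ColorShift()._apply(img, strength=0.5)
out.save("some_face_shifted.png")
```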
If your operator needs a heavy backend, follow the pattern in `toolbox/insightface_backend.py`: lazy-load inside `_apply`, catch `ImportError` / `FileNotFoundError`, and fall back to a cheap PIL perturbation so the pipeline keeps running on machines without the backend installed.
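A sketch of that pattern — `HeavySwap` and `_swap` are illustrative names, not the repo's actual backend wrapper:

```python
from PIL import Image, ImageFilter

from .base import Operator, OpMeta


class HeavySwap(Operator):
    meta = OpMeta(
        name="heavy_swap",
        category="identity",
        description="face swap with a cheap PIL fallback",
        params_schema={},
        cost=1.0,
    )

    def _apply(self, image: Image.Image, **kwargs) -> Image.Image:
        try:
            import insightface  # noqa: F401 — lazy import, paid only on first use
            return self._swap(image, **kwargs)  # hypothetical heavy path
        except (ImportError, FileNotFoundError):
            # backend or weights missing → cheap PIL stand-in so the
            # pipeline keeps running instead of blocking
            return image.filter(ImageFilter.GaussianBlur(radius=1.5))
```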
```
Agent4FaceForgery/
├── main.py                       # CLI entrypoint for both phases
├── parse.py                      # argparse + all hyperparameters
│
├── simulation/
│   ├── arena.py                  # orchestrator (Phase 1 ARS + Phase 2)
│   ├── creator_agent.py          # Phase-1 agent (profile · memory · action)
│   ├── memory.py                 # factual + evaluative memory + reflection
│   ├── retriever.py              # numpy / FAISS vector retriever
│   ├── ars.py                    # Adaptive Rejection Sampling (Eq. 4–5)
│   ├── profile.py                # creator profile (v_k, c_k)
│   ├── prompts.py                # centralised Phase-1 prompt templates
│   ├── llm_client.py             # OpenAI-compatible wrapper (retry + CoT)
│   ├── social_agent.py           # lightweight per-blueprint SocialArena
│   └── phase2/
│       ├── action.py             # 21 actions × 4 categories
│       ├── clock.py              # 24-dim hourly activity / time steps
│       ├── recsys.py             # hot-score + interest-based RecSys
│       ├── mini_platform.py      # SQLite env server (6 tables)
│       ├── persona_db.py         # 6-agent default mix + ordinary users
│       ├── decoy_generator.py    # non-forgery post pool
│       ├── population_env.py     # PopulationEnv orchestrator
│       └── verdict_aggregator.py # follower-weighted voting → δ (Eq. 6)
│
├── toolbox/                      # Phase-1 forgery operators
│   ├── base.py                   # Operator ABC
│   ├── identity.py               # face-swap (insightface + inswapper_128)
│   ├── attribute.py              # AttGAN / StarGAN age+expression editing
│   ├── style.py                  # SBI, GAN-inversion, relighting
│   ├── diffusion.py              # IP-Adapter-FaceID / InstantID / PuLID / Flux
│   ├── diffusion_backend.py      # lazy diffusion pipelines
│   ├── insightface_backend.py    # real inswapper wrapper
│   └── registry.py               # operator registry + sampling
│
├── detectors/
│   ├── xception.py               # Xception backbone (timm)
│   ├── xception_train.py         # training loop (FF++ real-vs-fake)
│   ├── xception_infer.py         # scoring API (s_disc)
│   └── llava_judge.py            # LLaVA-1.5 judge
│
├── dataio/
│   ├── ffpp.py                   # FF++ reader (parallel walker + cache)
│   └── profiles_builder.py       # build creator profiles from FF++ stats
│
├── configs/                      # YAML defaults (data/model/experiment)
├── scripts/                      # helper launchers (smoke, prod, train)
└── storage/                      # per-run outputs (gitignored)
```
```
# Phase 1
--n_agents N                  number of creator agents
--blueprints_per_agent K      blueprints each agent should accept
--max_tool_chain C            cap operator chain length
--reflection_threshold R      trigger memory reflection every R writes

# ARS
--ars_lambda 0.5              λ in s_i = λ·s_LLM + (1 − λ)·s_disc
--ars_warmup 20               N_warmup samples with the lenient threshold
--ars_tau_warmup 0.2          lenient threshold during warm-up
--ars_quantile 0.4            q-th quantile after warm-up
--llava_in_ars                enable LLaVA judge alongside Xception
--llava_ars_weight 0.5        LLaVA vs Xception fusion inside s_disc

# Phase 2
--phase2_backend population   {legacy, oasis_mini, population}
--phase2_mode original        {original, enhanced}
--phase2_n_ordinary 0         extra ordinary users (enhanced only)
--phase2_decoy_posts 0        injected non-forgery posts
--phase2_steps 5              time-steps per blueprint
--phase2_shared_env           share the env across all blueprints
--phase2_dynamic_graph        allow follow/unfollow/mute to mutate
--phase2_force_vote           force all seen agents to label_post at T_end

# Detectors & LLM
--xception_ckpt storage/xception/best.pt
--llm_model gpt-4o-mini
--llm_base_url                (optional; defaults to the OpenAI public endpoint)
--llm_timeout 60
--llm_max_retries 5
```
If this codebase helps your research, please cite:
```bibtex
@article{lai2025agent4faceforgery,
  title   = {Agent4FaceForgery: Multi-Agent LLM Framework for Realistic Face Forgery Detection},
  author  = {Lai, Yingxin and others},
  journal = {arXiv preprint arXiv:2509.12546},
  year    = {2025},
  url     = {https://arxiv.org/abs/2509.12546}
}
```

Released under the MIT License. Third-party weights (insightface, diffusers pipelines, LLaVA, Xception pretrain) are subject to their own licenses — please consult the originating repositories.