A two-phase multi-agent simulator that generates ecologically valid multimodal training data for face-forgery detection.
Paper · Quick start · Configuration · Pipeline · Extending the toolbox · CLI · Cite
Face-forgery detectors trained on static, curated benchmarks rarely survive contact with the real internet. Agent4FaceForgery closes that gap by simulating the internet instead of scraping it:
- Phase 1 — Forgery Blueprint Generation. A population of LLM-powered creator agents — each with a profile, a reflective memory, and a chainable toolbox of forgery operators (face-swap, attribute edit, diffusion, relighting, …) — iteratively designs (image, caption) pairs. An Adaptive Rejection Sampling (ARS) gate keeps only the forgeries that are challenging for an external detector.
- Phase 2 — Social Interaction Trajectory. Each accepted blueprint is dropped into an OASIS-inspired social platform where Posters, Critics, Watchers, Explorers, Chatters, and a GeminiAuditor interact under a recommender system. Their follower-weighted verdict yields the text-image consistency label δ ∈ {0, 1}.
The pipeline emits a single JSONL with rows (image, text, y, δ, …) that
plugs directly into any downstream detector — Xception, CLIP, LLaVA, ViT.
```
┌──────────────────────┐    ARS    ┌──────────────────────────┐
│ Phase 1 — Creators   │──────────▶│ Phase 2 — Social Threads │
│ profile · memory ·   │  accept   │ RecSys · 21 actions ·    │
│ toolbox (ops×chain)  │  ≥ τ_k    │ follower-weighted δ      │
└──────────────────────┘           └──────────────────────────┘
          │                                     │
          ▼                                     ▼
  blueprints.jsonl                        final.jsonl
                                          (image, text, y, δ, δ_social, …)
```
- Python ≥ 3.10, CUDA-capable GPU (optional — the pipeline falls back to CPU-safe PIL perturbations when GPU or heavy dependencies are missing).
- Access to any OpenAI-compatible chat endpoint (OpenAI, vLLM, Together, Azure OpenAI, …). See Configuration below.
```bash
git clone https://github.com/<your-org>/Agent4FaceForgery.git
cd Agent4FaceForgery
pip install -r requirements.txt
```

```bash
export FFPP_REAL=/path/to/FaceForensics++/original_sequences
export FFPP_FAKE=/path/to/FaceForensics++/manipulated_sequences
```

If `FFPP_FAKE` is missing, the launcher fabricates a tiny grey stub so the profile builder can proceed — convenient for a smoke test.

```bash
bash scripts/train_xception.sh
# writes storage/xception/best.pt
```

```bash
export OPENAI_API_KEY=<your-key>
bash scripts/run_production.sh 8 32
#                              ^ ^
#                              │ └── blueprints per agent
#                              └──── number of creator agents
```

Outputs land under `storage/<SIM_NAME>/`:
| File | What it is |
|---|---|
| `blueprints.jsonl` | accepted rows (x', c', δ', y=1) from Phase 1 |
| `blueprints_imgs/*.png` | forged images |
| `threads/bp_*.json` | distilled multi-agent discussion |
| `propagation/bp_*.json` | full SQLite trace + verdict |
| `final.jsonl` | the training dataset — (image, text, y, δ, …) |
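`final.jsonl` is plain JSON-Lines, so wiring it into any detector dataloader is a few lines. A minimal reader sketch — the `image` / `text` / `y` / `delta` key names are assumed from the row schema above; verify against the emitted file:

```python
import json

from PIL import Image


def iter_rows(path):
    """Yield (image, text, y, delta) tuples from final.jsonl.

    Key names follow the README's (image, text, y, δ, …) schema;
    check the actual file for the exact spelling before training.
    """
    with open(path, encoding="utf-8") as f:
        for line in f:
            row = json.loads(line)
            img = Image.open(row["image"]).convert("RGB")
            yield img, row["text"], row["y"], row["delta"]
```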
```bash
python scripts/to_llava_sft.py \
  --in  storage/<SIM_NAME>/final.jsonl \
  --out storage/<SIM_NAME>/llava_sft.json
# feed llava_sft.json into your LLaVA fine-tuning script of choice
```

All credentials and endpoint URLs are read from environment variables — nothing sensitive is ever written to the repository.
| Variable | Required | Purpose | Example |
|---|---|---|---|
| `OPENAI_API_KEY` | ✅ | Bearer token for the chat endpoint | `export OPENAI_API_KEY=sk-...` |
| `LLM_MODEL` | | Chat model id | `gpt-4o-mini` · `gpt-4o` · `qwen2.5-7b-instruct` |
| `LLM_BASE_URL` | | OpenAI-compatible base URL | `https://api.openai.com/v1` (default) |
| `FFPP_REAL` | | Path to `original_sequences` | `/data/FF++/original_sequences` |
| `FFPP_FAKE` | | Path to `manipulated_sequences` | `/data/FF++/manipulated_sequences` |
Recommended workflow — keep all secrets in a local, git-ignored `.env`:

```bash
# .env (never commit this file — already in .gitignore)
OPENAI_API_KEY=sk-...
LLM_MODEL=gpt-4o-mini
LLM_BASE_URL=https://api.openai.com/v1
FFPP_REAL=/data/FF++/original_sequences
FFPP_FAKE=/data/FF++/manipulated_sequences
```

Load it before running:

```bash
set -a; source .env; set +a
bash scripts/run_production.sh 8 32
```

Any OpenAI-compatible endpoint works — the client only expects the standard `/v1/chat/completions` contract:
```bash
# Example: self-hosted vLLM
export LLM_BASE_URL=http://localhost:8000/v1
export LLM_MODEL=Qwen/Qwen2.5-7B-Instruct
export OPENAI_API_KEY=EMPTY
```

If `OPENAI_API_KEY` is unset, the client fails fast with a clear error instead of silently producing garbage.
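To sanity-check whichever endpoint you configured before launching a long run, the stock `openai` Python client exercises the same contract (this snippet is a convenience check, not part of the repo):

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url=os.environ.get("LLM_BASE_URL", "https://api.openai.com/v1"),
    api_key=os.environ["OPENAI_API_KEY"],  # KeyError here == the fail-fast above
)
resp = client.chat.completions.create(
    model=os.environ.get("LLM_MODEL", "gpt-4o-mini"),
    messages=[{"role": "user", "content": "Reply with the single word: pong"}],
)
print(resp.choices[0].message.content)
```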
```
Profile (v_k, c_k) ──▶ Action (op_1 ∘ … ∘ op_n, caption) ──▶ (x', c')
        ▲                              │                        │
        │                              ▼                        ▼
      Memory ◀──────── Reflection every R writes           ARS scoring
                                                                 │
   s_i = λ·s_LLM + (1−λ)·s_disc                                  │
   τ_k = Quantile(Accepted, q)                                   │
   accept iff s_i ≥ τ_k ◀────────────────────────────────────────┘
```
- **Profile.** Each creator `k` is seeded from FF++ statistics as `v_k = (T_freq, T_div, T_conf)` plus an LLM-authored stylistic taste `c_k`.
- **Memory.** Factual and evaluative entries with timestamps in a numpy / FAISS vector store; after `reflection_threshold` writes, the LLM emits a `next_plan_bias` consumed on the next round.
- **Action module.** An action is `(Edit, Desc)`, where `Edit = O_n(… O_1(x; θ_1) …)` is a sequential operator chain sampled from the toolbox with profile-conditioned weights:

  | Category | Operators |
  |---|---|
  | Identity manipulation | FaceSwap (DFL), FaceSwap (Insightface), NeuralTexture |
  | Attribute editing | AttGAN-Age, StarGAN-Expression, AttGAN-Gender |
  | Style, classical | SBI, GAN-inversion, Relight |
  | Style, diffusion | IP-Adapter-FaceID, InstantID, PuLID, Flux |

  Heavy backends (`diffusers`, `insightface`) lazy-load and transparently fall back to a PIL perturbation if unavailable — the pipeline never blocks.
- **ARS (Eq. 4–5).** `s_disc` fuses Xception and optional LLaVA-judge outputs. Enable LLaVA scoring with `--llava_in_ars`. (A minimal sketch of the gate follows this list.)
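Concretely, the warm-up and quantile rules behind `--ars_lambda`, `--ars_warmup`, `--ars_tau_warmup`, and `--ars_quantile` reduce to a few lines. A minimal sketch assuming scores in [0, 1] — not the repo's `simulation/ars.py`, whose internals may differ:

```python
import numpy as np


class ARSGate:
    """Adaptive Rejection Sampling gate (sketch of Eq. 4-5)."""

    def __init__(self, lam=0.5, warmup=20, tau_warmup=0.2, quantile=0.4):
        self.lam, self.warmup = lam, warmup
        self.tau_warmup, self.quantile = tau_warmup, quantile
        self.accepted = []  # scores of accepted blueprints

    def accept(self, s_llm: float, s_disc: float) -> bool:
        # s_i = λ·s_LLM + (1 − λ)·s_disc
        s_i = self.lam * s_llm + (1 - self.lam) * s_disc
        if len(self.accepted) < self.warmup:
            tau = self.tau_warmup  # lenient threshold during warm-up
        else:
            # τ_k = Quantile(Accepted, q)
            tau = float(np.quantile(self.accepted, self.quantile))
        if s_i >= tau:
            self.accepted.append(s_i)
            return True
        return False
```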
Two presets via `--phase2_mode`:

| | `original` (default) | `enhanced` |
|---|---|---|
| Agents | 6 core — 1 each of Watcher / Explorer / Critic / Chatter / Poster / GeminiAuditor | + N ordinary users sampled from a persona DB |
| Time steps / blueprint | 5 | 40 |
| Action catalogue | 6 leaf actions in 4 categories (viewing / commenting / sharing / labeling) | 21 fine-grained actions |
| Decoy posts | 0 | 50 |
| Environment | per-blueprint | shared across blueprints |
| Follow graph | static | follow / unfollow / mute mutate |
| Typical actions / bp | 7–15 | 40–120 |
For a Critic-heavy council (useful when you want Critics to outnumber the herd camp in ablation sweeps), import the built-in `BALANCED_CORE_MIX` preset:

```python
from simulation.phase2.persona_db import build_population, BALANCED_CORE_MIX

pop = build_population(n_core=8, core_mix=BALANCED_CORE_MIX, seed=0)
```

The six roles view / comment / share / label under a Reddit-style hot-score recommender augmented with sentence-embedding personalisation. Verdicts are aggregated with a log-scaled follower weight (Eq. 6).
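The exact form of Eq. 6 lives in the paper; a plausible reading of "log-scaled follower weight" — assuming per-agent binary votes `v_a` and weights `w_a = log(1 + followers_a)` — is a weighted majority:

```python
import math


def aggregate_delta(votes):
    """votes: iterable of (v_a, follower_count) with v_a ∈ {0, 1}.

    Hypothetical reconstruction of the follower-weighted verdict;
    Eq. 6 in the paper may differ in detail.
    """
    weighted = [(math.log1p(followers), v) for v, followers in votes]
    yes = sum(w for w, v in weighted if v == 1)
    total = sum(w for w, _ in weighted)
    return int(yes >= 0.5 * total)  # δ ∈ {0, 1}
```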
Override any preset with `--phase2_n_ordinary`, `--phase2_decoy_posts`, `--phase2_shared_env`, `--phase2_dynamic_graph`, `--phase2_steps`.
New forgery operators drop in with minimal ceremony: add a class derived from `Operator`, register it, and the agents, ARS, memory, and Phase 2 pick it up unchanged.
**Step 1 — implement the operator.** Create `toolbox/my_op.py`:

```python
from PIL import Image

from .base import Operator, OpMeta


class ColorShift(Operator):
    meta = OpMeta(
        name="color_shift",
        category="style",  # identity | attribute | style
        description="tiny hue rotation — low-cost stylistic perturbation",
        params_schema={"strength": "float ∈ [0, 1]"},
        cost=0.1,
    )

    def _apply(self, image: Image.Image, strength: float = 0.3, **_) -> Image.Image:
        hsv = image.convert("HSV")
        h, s, v = hsv.split()
        # rotate the 8-bit hue channel (wraps at 256, not 255)
        h = h.point(lambda x: int(x + 180 * strength) % 256)
        return Image.merge("HSV", (h, s, v)).convert("RGB")
```

**Step 2 — register it** in `toolbox/registry.py`:

```python
from .my_op import ColorShift

OPERATORS = _register(
    ...,
    ColorShift(),
)
```

That is all. The new operator is now:
- sampleable by `sample_chain(...)` with profile-weighted probabilities,
- visible to the creator LLM via `describe_toolbox()` in prompt context,
- chainable with any existing operator up to `--max_tool_chain` length,
- subject to the ARS accept/reject filter just like built-ins.
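A quick local smoke test before wiring the operator into a full run (illustrative only — the production path samples operators through the registry):

```python
from PIL import Image

from toolbox.my_op import ColorShift

img = Image.open("some_face.png").convert("RGB")
# calling _apply directly for the test; the Operator base class
# presumably exposes a public wrapper around it
out = ColorShift()._apply(img, strength=0.5)
out.save("some_face_shifted.png")
```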
If your operator needs a heavy backend, follow the pattern in `toolbox/insightface_backend.py`: lazy-load inside `_apply`, catch `ImportError` / `FileNotFoundError`, and fall back to a cheap PIL perturbation so the pipeline keeps running on machines without the backend installed.
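A sketch of that pattern — `HeavySwap` and `_swap` are illustrative names, not the repo's actual backend wrapper:

```python
from PIL import Image, ImageFilter

from .base import Operator, OpMeta


class HeavySwap(Operator):
    meta = OpMeta(
        name="heavy_swap",
        category="identity",
        description="face swap with a cheap PIL fallback",
        params_schema={},
        cost=1.0,
    )

    def _apply(self, image: Image.Image, **kwargs) -> Image.Image:
        try:
            import insightface  # noqa: F401 — lazy import, paid only on first use
            return self._swap(image, **kwargs)  # hypothetical heavy path
        except (ImportError, FileNotFoundError):
            # backend or weights missing → cheap PIL stand-in so the
            # pipeline keeps running instead of blocking
            return image.filter(ImageFilter.GaussianBlur(radius=1.5))
```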
```
Agent4FaceForgery/
├── main.py                       # CLI entrypoint for both phases
├── parse.py                      # argparse + all hyperparameters
│
├── simulation/
│   ├── arena.py                  # orchestrator (Phase 1 ARS + Phase 2)
│   ├── creator_agent.py          # Phase-1 agent (profile · memory · action)
│   ├── memory.py                 # factual + evaluative memory + reflection
│   ├── retriever.py              # numpy / FAISS vector retriever
│   ├── ars.py                    # Adaptive Rejection Sampling (Eq. 4–5)
│   ├── profile.py                # creator profile (v_k, c_k)
│   ├── prompts.py                # centralised Phase-1 prompt templates
│   ├── llm_client.py             # OpenAI-compatible wrapper (retry + CoT)
│   ├── social_agent.py           # lightweight per-blueprint SocialArena
│   └── phase2/
│       ├── action.py             # 21 actions × 4 categories
│       ├── clock.py              # 24-dim hourly activity / time steps
│       ├── recsys.py             # hot-score + interest-based RecSys
│       ├── mini_platform.py      # SQLite env server (6 tables)
│       ├── persona_db.py         # 6-agent default mix + ordinary users
│       ├── decoy_generator.py    # non-forgery post pool
│       ├── population_env.py     # PopulationEnv orchestrator
│       └── verdict_aggregator.py # follower-weighted voting → δ (Eq. 6)
│
├── toolbox/                      # Phase-1 forgery operators
│   ├── base.py                   # Operator ABC
│   ├── identity.py               # face-swap (insightface + inswapper_128)
│   ├── attribute.py              # AttGAN / StarGAN age+expression editing
│   ├── style.py                  # SBI, GAN-inversion, relighting
│   ├── diffusion.py              # IP-Adapter-FaceID / InstantID / PuLID / Flux
│   ├── diffusion_backend.py      # lazy diffusion pipelines
│   ├── insightface_backend.py    # real inswapper wrapper
│   └── registry.py               # operator registry + sampling
│
├── detectors/
│   ├── xception.py               # Xception backbone (timm)
│   ├── xception_train.py         # training loop (FF++ real-vs-fake)
│   ├── xception_infer.py         # scoring API (s_disc)
│   └── llava_judge.py            # LLaVA-1.5 judge
│
├── dataio/
│   ├── ffpp.py                   # FF++ reader (parallel walker + cache)
│   └── profiles_builder.py       # build creator profiles from FF++ stats
│
├── configs/                      # YAML defaults (data/model/experiment)
├── scripts/                      # helper launchers (smoke, prod, train)
└── storage/                      # per-run outputs (gitignored)
```
```
# Phase 1
--n_agents N                  number of creator agents
--blueprints_per_agent K      blueprints each agent should accept
--max_tool_chain C            cap operator chain length
--reflection_threshold R      trigger memory reflection every R writes

# ARS
--ars_lambda 0.5              λ in s_i = λ·s_LLM + (1 − λ)·s_disc
--ars_warmup 20               N_warmup samples with the lenient threshold
--ars_tau_warmup 0.2          lenient threshold during warm-up
--ars_quantile 0.4            q-th quantile after warm-up
--llava_in_ars                enable LLaVA judge alongside Xception
--llava_ars_weight 0.5        LLaVA vs Xception fusion inside s_disc

# Phase 2
--phase2_backend population   {legacy, oasis_mini, population}
--phase2_mode original        {original, enhanced}
--phase2_n_ordinary 0         extra ordinary users (enhanced only)
--phase2_decoy_posts 0        injected non-forgery posts
--phase2_steps 5              time-steps per blueprint
--phase2_shared_env           share the env across all blueprints
--phase2_dynamic_graph        allow follow/unfollow/mute to mutate
--phase2_force_vote           force all seen agents to label_post at T_end

# Detectors & LLM
--xception_ckpt storage/xception/best.pt
--llm_model gpt-4o-mini
--llm_base_url                (optional; defaults to the OpenAI public endpoint)
--llm_timeout 60
--llm_max_retries 5
```
If this codebase helps your research, please cite:
```bibtex
@article{lai2025agent4faceforgery,
  title   = {Agent4FaceForgery: Multi-Agent LLM Framework for Realistic Face Forgery Detection},
  author  = {Lai, Yingxin and others},
  journal = {arXiv preprint arXiv:2509.12546},
  year    = {2025},
  url     = {https://arxiv.org/abs/2509.12546}
}
```

Released under the MIT License. Third-party weights (insightface, diffusers pipelines, LLaVA, Xception pretrain) are subject to their own licenses — please consult the originating repositories.