SALF: A Symbolic Adversarial Learning Framework for Evolving Fake News Generation and Detection

arXiv EMNLP 2025

This repository contains the official implementation of the paper: "A Symbolic Adversarial Learning Framework for Evolving Fake News Generation and Detection", accepted to EMNLP 2025 Main Conference.

Abstract

Rapid LLM advancements heighten fake news risks by enabling the automatic generation of increasingly sophisticated misinformation. Previous detection methods, including fine-tuned small models and LLM-based detectors, often struggle to keep pace with its dynamically evolving nature. In this work, we propose a novel framework called the Symbolic Adversarial Learning Framework (SALF), which implements an adversarial training paradigm through an agent symbolic learning optimization process rather than numerical updates. In SALF, the generation agent crafts deceptive narratives while the detection agent uses structured debates to expose logical and factual flaws; through these adversarial interactions, the two agents iteratively refine themselves. Unlike traditional neural updates, we represent agents with agent symbolic learning, where the learnable weights are the agent prompts, and back-propagation and gradient descent are simulated by operating on natural-language representations of weights, loss, and gradients. Experiments on two multilingual benchmark datasets demonstrate SALF's effectiveness: it generates sophisticated fake news that degrades state-of-the-art detection performance by up to 53.4% in Chinese and 34.2% in English on average, and it also refines detectors, improving detection of refined content by up to 7.7%. We hope our work inspires further exploration into more robust, adaptable fake news detection systems.

Framework Architecture

Figure 2: Overview of the Symbolic Adversarial Learning Framework (SALF). The generator and detector optimize each other through iterative adversarial debate and symbolic feedback.

Project Overview

This is the open-source implementation of SALF (Symbolic Adversarial Learning Framework), an adversarial learning framework for fake news generation and detection. SALF uses natural language feedback (symbolic optimization) instead of traditional numerical gradients to implement adversarial training between generators and detectors.

🚀 Main Results

Quantitative results on Weibo21 and GossipCop datasets demonstrate that SALF-generated fake news significantly degrades the performance of state-of-the-art detectors.

| Dataset | Type | Model | Orig macF1 | Orig Acc | Orig F1-real | Orig F1-fake | Refined macF1 | Refined Acc | Refined F1-real | Refined F1-fake |
|---|---|---|---|---|---|---|---|---|---|---|
| Weibo21 | LLM-Only | GPT-4o mini | 0.710 | 0.715 | 0.747 | 0.673 | 0.405 (-43%) | 0.485 (-32%) | 0.623 (-17%) | 0.186 (-72%) |
| | | DeepSeek V3 | 0.763 | 0.770 | 0.803 | 0.723 | 0.380 (-50%) | 0.495 (-36%) | 0.647 (-19%) | 0.112 (-85%) |
| | SLM-Only | ENDEF | 0.726 | 0.727 | 0.741 | 0.711 | 0.576 (-21%) | 0.591 (-19%) | 0.657 (-11%) | 0.495 (-30%) |
| | LLM+SLM | ARG | 0.784 | 0.786 | 0.805 | 0.764 | 0.635 (-19%) | 0.653 (-17%) | 0.717 (-11%) | 0.552 (-28%) |
| | | ARG-D | 0.760 | 0.761 | 0.776 | 0.745 | 0.502 (-34%) | 0.542 (-29%) | 0.644 (-17%) | 0.360 (-52%) |
| | Average | – | – | – | – | – | (-33.4%) | (-26.6%) | (-15.0%) | (-53.4%) |
| GossipCop | LLM-Only | GPT-4o mini | 0.687 | 0.863 | 0.922 | 0.452 | 0.519 (-24%) | 0.821 (-5%) | 0.900 (-2%) | 0.138 (-69%) |
| | | DeepSeek V3 | 0.628 | 0.850 | 0.915 | 0.340 | 0.510 (-19%) | 0.823 (-3%) | 0.902 (-1%) | 0.119 (-65%) |
| | SLM-Only | ENDEF | 0.761 | 0.855 | 0.911 | 0.611 | 0.747 (-2%) | 0.848 (-1%) | 0.907 (-0%) | 0.587 (-4%) |
| | LLM+SLM | ARG | 0.791 | 0.879 | 0.927 | 0.656 | 0.716 (-9%) | 0.796 (-9%) | 0.866 (-7%) | 0.565 (-14%) |
| | | ARG-D | 0.771 | 0.873 | 0.924 | 0.619 | 0.705 (-9%) | 0.847 (-3%) | 0.909 (-2%) | 0.501 (-19%) |
| | Average | – | – | – | – | – | (-12.6%) | (-4.2%) | (-2.4%) | (-34.2%) |

Table 1: Comparison of fake news detection models on Weibo21 and GossipCop before and after SALF refinement.

Core Features

  • Multi-Agent Debate Detector: An affirmative team, a negative team, and a judge conduct a structured debate to determine news authenticity
  • Symbolic Optimization: Replaces numerical gradients with natural language feedback (Loss, Gradient, Optimizer)
  • Adversarial Training: The Generator and Detector compete against each other, continuously improving each other's capabilities
  • Modular Design: Clear code structure that is easy to understand and extend

Project Structure

SALF/
├── agents/                    # Agent modules
│   ├── generator.py          # Generator (corresponds to paper Section 3.2)
│   └── detector.py           # Detector (corresponds to paper Section 3.1)
├── optimization/              # Optimization modules
│   └── symbolic_optimizer.py # Symbolic optimizer (corresponds to paper Section 3.3)
├── prompts/                   # Prompt templates
│   ├── debate_prompts.py     # Debate prompts
│   └── optimization_prompts.py # Optimization prompts
├── utils/                     # Utility modules
│   └── llm_client.py         # LLM API client
├── config/                    # Configuration modules
│   └── settings.py           # Configuration file
├── data/                      # Data directory (sample + README, user data ignored by git)
├── results/                   # Results directory (README, gitignored artifacts)
├── checkpoint/                # Optional checkpoints (empty placeholder tracked)
├── scripts/                   # Batch/Bash helpers
├── main.py                    # Main program (corresponds to paper Algorithm 1)
├── run_generation.py          # Pipeline: generation only
├── run_detection.py           # Pipeline: detection only
├── requirements.txt           # Dependencies list
├── LICENSE                    # Open-source license
└── README.md                  # This file

Installation

1. Clone the Repository

git clone https://github.com/RefrainTC/SALF.git
cd SALF

2. (Recommended) Create a Virtual Environment

python3 -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip

3. Install Dependencies

pip install -r requirements.txt

4. Configure API Keys

You can configure the project using either Python defaults (for development) or environment variables (for batch scripts).

  • Method 1 (Python): Edit config/settings.py directly.
  • Method 2 (Bash): Edit config.sh to export variables (automatically loaded by scripts/*.sh).

Configuration Priority: Command-line arguments > Environment variables (config.sh) > Python defaults (settings.py)

For detailed configuration options (including DeepSeek/other models), see CONFIG_GUIDE.md.

5. Model Support (DeepSeek V3 & Others)

This framework is fully compatible with DeepSeek V3 (used in our paper) and other OpenAI-compatible APIs. To use non-OpenAI models:

  1. Set the API Base URL: In config/settings.py or via environment variables, set OPENAI_API_BASE to your provider's endpoint (e.g., https://api.deepseek.com).

  2. Update Model Name: Set DEFAULT_LLM_MODEL to the specific model version (e.g., deepseek-chat or deepseek-reasoner).
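As an illustration, the environment-variable route might look like the sketch below in config.sh. OPENAI_API_BASE and DEFAULT_LLM_MODEL are the names mentioned above; the OPENAI_API_KEY name is an assumption based on common OpenAI-client conventions, so confirm the exact variable names in config/settings.py.

```shell
# Point the OpenAI-compatible client at DeepSeek's endpoint.
export OPENAI_API_BASE="https://api.deepseek.com"

# OPENAI_API_KEY is assumed here; verify the exact name in config/settings.py.
export OPENAI_API_KEY="sk-..."

# Select the model used by default for generation and detection.
export DEFAULT_LLM_MODEL="deepseek-chat"
```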

Dataset Preparation

⚠️ Privacy & Data Policy Note: Due to privacy regulations and platform policies regarding user data, we do not provide the raw datasets (Weibo21 and GossipCop) directly in this repository.

To reproduce the experiments:

  1. Obtain the datasets (Weibo21 and GossipCop) from their original official repositories.
  2. Preprocess the data into the JSON format required by SALF (refer to data/sample.json for the schema).
  3. Place the processed files in the data/ directory.
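As an illustration of step 2, the helper below converts a hypothetical raw CSV dump into the schema this README describes under "Input Data Format" (content, label, source_id). The CSV column names (text, label, id) are assumptions about your raw data; adapt them, and check data/sample.json for the authoritative schema.

```python
import csv
import json

def convert_to_salf_format(csv_path: str, json_path: str) -> None:
    """Convert a raw CSV dump (hypothetical columns: text, label, id)
    into the JSON list format expected by SALF (see data/sample.json)."""
    records = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            records.append({
                "content": row["text"],
                "label": int(row["label"]),  # 0 = real news, 1 = fake news
                "source_id": row["id"],
            })
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=2)
```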

Usage

Quick Test

bash scripts/run_test.sh

Or for full pipeline quick test:

bash scripts/run_test_full_pipeline.sh

Method 1: Complete Pipeline (main.py)

Suitable for small-scale data or workflow verification:

python main.py \
    --input_file data/train.json \
    --output_file results/salf_output.json \
    --max_iterations 5 \
    --llm_model gpt-4o-mini-2024-07-18

Method 2: Batch Processing Scripts (recommended for large-scale data)

The decoupled design separates generation from detection and saves results after each item, making the pipeline more fault-tolerant:

Batch Generation

python run_generation.py \
    --input_file data/train.json \
    --output_file results/generated_news.json \
    --max_iterations 3

Batch Detection

python run_detection.py \
    --input_file results/generated_news.json \
    --output_file results/detection_results.json \
    --use_optimized

Full Pipeline

bash scripts/run_full_pipeline.sh

Detailed documentation: See BATCH_PROCESSING.md

Parameter Description

  • --input_file: Input data file (JSON format)
  • --output_file: Output result file
  • --max_iterations: Maximum iterations per sample
  • --llm_model: LLM model name to use
  • --generator_model: Model for generator (defaults to llm_model)
  • --resume_index: Resume training from specified index
  • --max_samples: Maximum number of samples to process (for testing)

Input Data Format

[
    {
        "content": "News content...",
        "label": 1,  // 0=real news, 1=fake news
        "source_id": "news_001"
    },
    ...
]
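A minimal loader that checks records against this schema can catch formatting mistakes before spending API credits. This is an illustrative helper, not part of the repository:

```python
import json

REQUIRED_KEYS = {"content", "label", "source_id"}

def validate_input(path: str) -> list:
    """Load an input file and check each record against the SALF input schema."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    for i, item in enumerate(data):
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            raise ValueError(f"record {i} missing keys: {sorted(missing)}")
        if item["label"] not in (0, 1):  # 0 = real news, 1 = fake news
            raise ValueError(f"record {i} has invalid label: {item['label']}")
    return data
```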

Output Data Format

[
    {
        "content": "Original news content",
        "label": 1,
        "optimized_content": "Optimized news content",
        "iterations": [
            {
                "iteration": 1,
                "debate_record": {...},
                "judgement": "negative",
                "optimization_log": {...}
            },
            ...
        ],
        "final_judgement": "negative"
    },
    ...
]

Data & Outputs

  • data/README.md: explains expected JSON schema and how to add your own datasets (gitignored by default). data/sample.json provides a tiny example.
  • results/README.md: describes output artifacts and how to keep them untracked to avoid leaking generated content.
  • checkpoint/: optional directory for intermediate saves; kept empty with .gitkeep so the folder exists without shipping artifacts.

Testing

  • Offline quick check (no API calls): python verify_code.py to validate imports and structure.
  • End-to-end smoke test (uses your API credits): bash scripts/run_test.sh runs 1-sample generation + detection using the sample data.

Algorithm Workflow

SALF's training workflow corresponds to Algorithm 1 in the paper:

for t = 1 to T:
    1. Generator generates fake news f^(t)
    2. Detector performs multi-agent debate and makes judgement
    3. If Detector fails (judges as real news):
       - Update Detector prompt
    4. Symbolically optimize Generator:
       - Compute symbolic loss
       - Compute symbolic gradient
       - Update Generator prompt
       - Generate optimized fake news
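The loop above can be sketched in Python with stubbed agents. All names here are illustrative stand-ins (only generate(), update_prompt(), compute_loss(), and compute_gradient() echo method names listed later in this README), not the repository's actual API:

```python
def salf_loop(generator, detector, optimizer, news, T=3):
    """Run T adversarial iterations of Algorithm 1 on a single news item."""
    fake = generator.generate(news)
    history = []
    for t in range(1, T + 1):
        verdict, debate = detector.debate(fake)       # multi-agent debate
        if verdict == "real":                         # detector was fooled
            detector.update_prompt(debate)            # refine the detector prompt
        loss = optimizer.compute_loss(fake, debate)   # symbolic loss
        grad = optimizer.compute_gradient(loss)       # symbolic gradient
        generator.update_prompt(grad)                 # "gradient descent" on the prompt
        fake = generator.generate(news)               # regenerate with the new prompt
        history.append({"iteration": t, "judgement": verdict})
    return fake, history
```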

Core Components

1. Detector

File: agents/detector.py

Function: Multi-agent debate system, including:

  • Affirmative Team: 3 agents arguing that the news is authentic
  • Negative Team: 3 agents arguing that the news is fake
  • Judge: Makes judgement based on debate record

Debate Process:

  1. Opening Statement
  2. Questioning & Rebuttal Round 1
  3. Questioning & Rebuttal Round 2
  4. Closing Statement
  5. Judgement
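The five stages above can be sketched as a simple debate schedule; the Team and Judge objects are hypothetical stand-ins, not the classes in agents/detector.py:

```python
# Each team speaks once per stage; the judge rules on the full record.
ROUNDS = ["opening", "rebuttal_1", "rebuttal_2", "closing"]

def run_debate(affirmative, negative, judge, news):
    """Collect statements from both teams across all rounds, then judge."""
    record = []
    for stage in ROUNDS:
        record.append((stage, "affirmative", affirmative.speak(stage, news, record)))
        record.append((stage, "negative", negative.speak(stage, news, record)))
    return judge.decide(record), record
```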

2. Generator

File: agents/generator.py

Function: Generates fake news and continuously improves through symbolic optimization

Methods:

  • generate(): Generate fake news
  • update_prompt(): Update generator prompt

3. Symbolic Optimizer

File: optimization/symbolic_optimizer.py

Function: Implements symbolic optimization process

Optimization workflow:

  1. compute_loss(): Compute the symbolic loss (evaluate news quality)
  2. compute_gradient(): Compute the symbolic gradient (propose improvement directions)
  3. optimize_prompt(): Optimize the prompt (generate a new generator prompt)
  4. optimize_content(): Apply the new prompt to rewrite the news content
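As a sketch of how these steps might chain LLM calls: here llm is a hypothetical text-in/text-out callable and the prompt wording is invented; only the function names mirror the workflow listed above.

```python
def compute_loss(llm, news, debate_record):
    # Symbolic "loss": a natural-language critique instead of a scalar.
    return llm(f"Critique this news given the debate record:\n{news}\n{debate_record}")

def compute_gradient(llm, loss_text):
    # Symbolic "gradient": concrete improvement directions derived from the critique.
    return llm(f"List concrete improvement directions based on this critique:\n{loss_text}")

def optimize_prompt(llm, old_prompt, gradient_text):
    # Symbolic "optimizer step": rewrite the generator prompt along the gradient.
    return llm(f"Rewrite this prompt to follow these directions:\n{old_prompt}\n{gradient_text}")
```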

Development Notes

Code Style

  • Uses Python 3.8+
  • Follows PEP 8 coding standards
  • All functions and classes have detailed docstrings
  • Code comments indicate corresponding paper sections

Debug Mode

Set environment variable to enable debug output:

export DEBUG_MODE=true
python main.py ...

Extension Suggestions

  1. Add Detector Prompt Update: The current implementation uses fixed Detector prompts; a prompt-update mechanism similar to the Generator's could be added
  2. Support More Models: Add more LLM model configurations in config/settings.py
  3. Add Evaluation Metrics: Implement automated evaluation metrics (e.g., detection accuracy, generation quality)
  4. Batch Processing Optimization: Support batch processing to improve efficiency

Important Notes

  • API call costs: Running complete SALF training requires many LLM API calls. Start small (--max_samples 5, --max_iterations 2) and monitor spending.
  • API key security: Do not commit keys. Prefer environment variables or config.sh; values in config/settings.py are placeholders only.
  • Responsible use: This code is for research and defense purposes. Do not use it to spread misinformation. Follow local laws and your provider's policies; review generated content before sharing.

Citation

If you use this code, please cite the related paper:

@inproceedings{tian2025symbolic,
  title={A symbolic adversarial learning framework for evolving fake news generation and detection},
  author={Tian, Chong and Ho, Qirong and Chen, Xiuying},
  booktitle={Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing},
  pages={12307--12321},
  year={2025}
}

License

Released under the MIT License. See LICENSE for details.

Contact

For questions or suggestions, please contact: Chong.Tian@mbzuai.ac.ae
