PaperAudit: Academic Paper Error Detection and Review System

A comprehensive Large Language Model (LLM)-based system for academic paper error detection and review, providing end-to-end solutions from data preprocessing, error detection, paper review to model training.

💻 Code • 📄 Paper • 📊 Dataset • 🤖 Models • 🇨🇳 中文

Project Overview

PaperAudit is a comprehensive academic paper quality assessment system with the following main features:

Paper Preprocessing: Download papers from OpenReview, parse PDFs, and generate synthetic error data
Error Detection: Use Multi-Agent System (MAS) to detect various types of errors in papers
Paper Review: Provide multi-stage, multi-perspective automated paper review
Model Training: Support Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) training

System Architecture

PaperAudit/
├── preprocess_data/    # Data preprocessing pipeline
├── detect/            # Multi-agent error detection system
├── review/            # Multi-agent paper review system
└── train/             # Model training (SFT & RL)
    ├── train_data_process/  # Training data processing
    ├── sft_training/        # Supervised fine-tuning
    ├── rl_training/         # Reinforcement learning training
    └── eval/                # Model evaluation

Core Modules

1. preprocess_data - Data Preprocessing

Download papers from OpenReview and generate error detection data for training.

Main Features:

Download paper PDFs, reviews, and metadata from OpenReview API
Parse PDFs into structured JSON using LlamaParse and LLM
Add section labels (Abstract, Introduction, Method, etc.)
Generate 8 categories of synthetic errors (evidence/data manipulation, method logic flaws, experimental design issues, etc.)

Key Scripts:

download_openreview.py - Download papers
parse_paper.py - Parse PDFs
add_section.py - Add section labels
synth_corruptions_for_detector.py - Generate synthetic errors

2. detect - Error Detection System

Multi-Agent System (MAS) based paper error detection that can identify factual errors, logical inconsistencies, citation errors, and more.

Main Features:

Multi-agent collaborative detection (Planner, Retriever, Specialist)
Support for multiple detection modes (basic, enhanced, full)
Detection result evaluation and statistical analysis

Key Scripts:

mas_error_detection.py - Main detection pipeline
eval_detection.py - Detection result evaluation
eval_log_detail.py - Detailed statistical analysis

3. review - Paper Review System

Provides multi-stage, multi-perspective automated paper review, including baseline review, cheating detection, motivation evaluation, etc.

Main Features:

AuditAgent: Multi-stage review (baseline review → cheating detection → motivation evaluation → final assessment)
DeepReviewerAgent: Multi-perspective deep review
Alignment evaluation between review results and human reviews

Key Scripts:

run_audit_agent.py - Batch runner for AuditAgent
run_deepreview_agent.py - Batch runner for DeepReviewerAgent
alignment/eval_alignment.py - Alignment evaluation

4. train - Model Training

Supports Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) training for error detection and review models.

Main Features:

SFT Training: Parameter-efficient fine-tuning using LLaMA-Factory and LoRA
RL Training: Reinforcement learning training using VERL framework and GRPO algorithm
Data Processing: Training data format conversion and processing
Model Evaluation: Evaluation tools for API models and base models

Supported Models:

Qwen3-8B / Qwen3-14B
Llama-3.2-3B-Instruct

Quick Start

Environment Setup

Install Dependencies:

pip install -r requirements.txt

Configure Environment Variables:

cp env.example env.sh
# Edit env.sh to set necessary API keys
source env.sh

Usage Examples

1. Data Preprocessing

cd preprocess_data
# Download papers
python download_openreview.py --conference ICLR.cc --year 2025 --type oral

# Parse PDFs
python parse_paper.py --root-dir ./data/ICLR_2025_oral --model gpt-5-2025-08-07

# Generate synthetic errors
python synth_corruptions_for_detector.py --input-dir ./data --output-dir ./corrupted_data

2. Error Detection

cd detect
# Run detection
python mas_error_detection.py --input-dir ../preprocess_data/output --model gpt-5-2025-08-07

# Evaluate results
python eval_detection.py --detection-dir ./results --ground-truth-dir ../preprocess_data/corrupted_data

3. Paper Review

cd review
# Run AuditAgent
python run_audit_agent.py --input-dir ../data/ICLR_26 --model gpt-5-2025-08-07

# Run DeepReviewerAgent
python run_deepreview_agent.py --input-dir ../data/ICLR_26 --model gpt-5-2025-08-07

4. Model Training

cd train
# SFT training
cd sft_training
bash sft_train.sh

# RL training
cd ../rl_training
bash run_grpo_train.sh

Configuration

Environment Variables

Configure the following environment variables in env.sh:

OPENAI_API_KEY - OpenAI API key
LLAMA_API_KEY - LlamaParse API key
OPENREVIEW_USERNAME - OpenReview username
OPENREVIEW_PASSWORD - OpenReview password

Configuration Files

review/config.yml - Review system configuration (LLM parameters, concurrency settings, etc.)
train/sft_training/ds_z3_config.json - DeepSpeed configuration
train/rl_training/config/grpo_config.yaml - GRPO training configuration

Dependencies

Main dependencies include:

Web Framework: FastAPI, Uvicorn
LLM API: OpenAI, LiteLLM
Deep Learning: PyTorch, Transformers, Accelerate
Data Processing: Pandas, NumPy, PyArrow
PDF Processing: PyPDF2, Pillow

See requirements.txt for the complete dependency list.

Project Structure

ACL/
├── preprocess_data/          # Data preprocessing
│   ├── download_openreview.py
│   ├── parse_paper.py
│   ├── add_section.py
│   └── synth_corruptions_for_detector.py
├── detect/                   # Error detection
│   ├── mas_error_detection.py
│   ├── agents.py
│   ├── eval_detection.py
│   └── eval_log_detail.py
├── review/                   # Paper review
│   ├── agents/
│   │   ├── PaperAudit/      # AuditAgent
│   │   └── deepreviewer.py  # DeepReviewerAgent
│   ├── alignment/           # Alignment evaluation
│   └── run_audit_agent.py
├── train/                    # Model training
│   ├── train_data_process/   # Data processing
│   ├── sft_training/         # SFT training
│   ├── rl_training/          # RL training
│   └── eval/                 # Model evaluation
├── requirements.txt          # Project dependencies
├── env.example               # Environment variables example
└── README.md                 # This document

Workflow

Typical complete workflow:

Data Preparation: Use preprocess_data to download and preprocess papers, generating synthetic error data
Error Detection: Use detect module to detect errors in papers
Paper Review: Use review module for multi-stage review
Model Training: Use train module to train and improve detection/review models
Evaluation & Optimization: Evaluate model performance and iterate improvements

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PaperAudit: Academic Paper Error Detection and Review System

Project Overview

System Architecture

Core Modules

1. preprocess_data - Data Preprocessing

2. detect - Error Detection System

3. review - Paper Review System

4. train - Model Training

Quick Start

Environment Setup

Usage Examples

1. Data Preprocessing

2. Error Detection

3. Paper Review

4. Model Training

Configuration

Environment Variables

Configuration Files

Dependencies

Project Structure

Workflow

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
detect		detect
figs		figs
preprocess_data		preprocess_data
review		review
train		train
README.md		README.md
README_cn.md		README_cn.md
env.example		env.example
requirements.txt		requirements.txt

TU2021/PaperAudit

Folders and files

Latest commit

History

Repository files navigation

PaperAudit: Academic Paper Error Detection and Review System

Project Overview

System Architecture

Core Modules

1. preprocess_data - Data Preprocessing

2. detect - Error Detection System

3. review - Paper Review System

4. train - Model Training

Quick Start

Environment Setup

Usage Examples

1. Data Preprocessing

2. Error Detection

3. Paper Review

4. Model Training

Configuration

Environment Variables

Configuration Files

Dependencies

Project Structure

Workflow

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages