This repository has been archived by the owner on Mar 5, 2024. It is now read-only.

GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content

This repository is no longer actively maintained. Please refer to our latest follow-up work:

Overview

📄 Link to Paper (arXiv) | 💾 Link to Dataset | 📦 Link to Checkpoint

This repository is the codebase for the paper GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content.

  1. We collect and publish OpenGPTText, a high-quality dataset of approximately 30,000 text samples rephrased by gpt-3.5-turbo (ChatGPT).
  2. We construct two detectors with different architectures: RoBERTa-Sentinel and T5-Sentinel.
  3. T5-Sentinel achieves SOTA performance (98% accuracy) on the OpenGPTText dataset.
