AISimplyExplained/watermarking-llms

Watermarking LLMs 101

Welcome to the Watermarking LLMs 101 repository! It collects educational resources, research papers, and community contributions on the study and application of watermarking techniques in Large Language Models (LLMs). Whether you're a student, researcher, or enthusiast, the materials here are meant to deepen your understanding of LLM watermarking.
Explore our interactive Watermarking LLMs 101 Course to engage with the material in a dynamic learning environment.

📄 Research Papers on Watermarking LLMs

This section lists the essential research papers on LLM watermarking. For a more detailed summary and discussion, refer to the respective links.

| Date | Paper Title | Authors | Summary | Link |
| --- | --- | --- | --- | --- |
| April 16, 2024 | Topic-based Watermarks for LLM-Generated Text | Alexander Nemecek et al. | Introduces topic-based watermarking for differentiating between LLM- and human-generated text. | Read more |
| April 5, 2024 | Have You Merged My Model? | Tianshuo Cong et al. | Discusses the robustness of LLM IP protection against model merging. | Read more |
| April 2, 2024 | An Entropy-based Text Watermarking Detection Method | Yijian Lu et al. | Proposes an Entropy-based Watermark Detection (EWD) method that adjusts the detection weights based on token entropy, improving performance in low-entropy scenarios. | Read more |
| April 2, 2024 | A Statistical Framework of Watermarks for LLMs | Xiang Li et al. | Proposes a statistical framework for watermarking LLMs, focusing on detection efficiency and optimal rules. | Read more |
| March 19, 2024 | Bypassing LLM Watermarks with Color-Aware Substitutions | Qilong Wu, Varun Chandrasekaran | Introduces SCTS, a color-aware attack that effectively bypasses state-of-the-art watermarking by substituting tokens based on their "color" information. | Read more |
| March 18, 2024 | Towards Better Statistical Understanding of Watermarking LLMs | Zhongze Cai et al. | Discusses an optimization-based approach for watermarking LLMs, balancing model distortion and detection ability with new insights into the green-red algorithm. | Read more |
| March 15, 2024 | WaterJudge: Quality-Detection Trade-off | Piotr Molenda et al. | Analyzes the trade-offs between watermark detectability and the quality of generated texts. | Read more |
| March 12, 2024 | Duwak: Dual Watermarks in Large Language Models | Chaoyi Zhu et al. | Introduces Duwak, a method that embeds dual watermarks enhancing detection efficiency and text quality, significantly reducing the tokens needed for reliable detection. | Read more |
| March 10, 2024 | PiGW: A Plug-in Generative Watermarking Framework | Rui Ma et al. | Proposes PiGW, a framework that integrates watermarks into generative images with minimal quality loss and high security against noise attacks. | Read more |
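
Several of the papers listed here refer to token "colors" and the green-red algorithm. For readers new to the idea, here is a minimal Python sketch of how green-list detection is commonly described: each token is pseudo-randomly labeled green or red based on the token before it, and a detector counts how many tokens are green and compares that to chance. This is not the implementation from any paper in this list. The function names, the `key` and `gamma` parameters, and the SHA-256 per-token hashing are all assumptions made for illustration; published schemes typically partition the whole vocabulary and bias logits during sampling rather than hashing token pairs.

```python
import hashlib
import math


def is_green(prev_token: int, token: int, gamma: float = 0.5, key: int = 42) -> bool:
    """Hypothetical coloring rule: label `token` green with probability ~gamma,
    seeded by a secret key and the previous token."""
    digest = hashlib.sha256(f"{key}:{prev_token}:{token}".encode()).digest()
    # Map the first 8 bytes to a number in [0, 1) and compare to the green fraction.
    return int.from_bytes(digest[:8], "big") / 2**64 < gamma


def detect_z_score(tokens: list[int], gamma: float = 0.5, key: int = 42) -> float:
    """z-score of the green-token count under the 'no watermark' null hypothesis."""
    scored = len(tokens) - 1  # the first token has no predecessor to seed its color
    if scored <= 0:
        raise ValueError("need at least two tokens")
    green = sum(is_green(p, t, gamma, key) for p, t in zip(tokens, tokens[1:]))
    return (green - gamma * scored) / math.sqrt(scored * gamma * (1 - gamma))


def toy_watermarked_sequence(length: int, vocab_size: int = 1000,
                             gamma: float = 0.5, key: int = 42) -> list[int]:
    """Toy 'generator' that always emits a green token.
    A real sampler would only nudge the model's logits toward green tokens."""
    tokens = [0]
    while len(tokens) < length:
        prev = tokens[-1]
        tokens.append(next(t for t in range(vocab_size)
                           if is_green(prev, t, gamma, key)))
    return tokens


if __name__ == "__main__":
    seq = toy_watermarked_sequence(51)
    print(f"z-score with the correct key: {detect_z_score(seq):.2f}")
```

Because the toy generator emits only green tokens, all 50 scored tokens count as green and the z-score with the correct key is √50 ≈ 7.07. Text produced without the key should score near zero on average, which is the gap a detector thresholds on.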

🙋 Contributing

We encourage contributions from the community! If you have suggestions, research papers, or educational materials related to LLM watermarking, please feel free to contribute.