Official Implementation for: "RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images (Videos) with Provable Guarantees"

RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images (Videos) with Provable Guarantees

RAW aims to offer a robust and agile watermarking framework that adapts to the rapidly evolving landscape of digital media creation. As deepfakes and other AI-generated content become increasingly sophisticated and prevalent, RAW's ability to embed imperceptible yet detectable watermarks directly into image and video content provides a crucial tool for content authentication and intellectual property protection.

By offering provable guarantees on false-positive rates and resilience against adversarial attacks, we hope RAW paves the way for a future where the authenticity of digital content can be verified.

This repository contains the source code for the RAWatermark project, based on the paper below. To cite the work:

@inproceedings{xian2024rawrobustagileplugandplay,
  title={RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable Guarantees},
  author={Xian, Xun and Wang, Ganghua and Bi, Xuan and Srinivasa, Jayanth and Kundu, Ashish and Hong, Mingyi and Ding, Jie},
  booktitle={Advances in Neural Information Processing Systems},
  year={2024},
  url={https://arxiv.org/abs/2403.18774}
}

Overview

This is the official implementation of our paper, "RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images (Videos) with Provable Guarantees" (pdf). The paper introduces a watermarking scheme that is model-agnostic, imperceptible, and zero-bit (it signals only whether a watermark is present, without carrying a payload). It is designed for watermarking images and videos and is suitable for real-time deployment.

Compared to existing encoder-decoder-based watermarking schemes, such as RivaGan, our proposed method offers:

  1. Much faster watermark encoding (e.g., approximately $40\times$ faster when watermarking a 25-frame $512 \times 512$ video generated by Stable Video Diffusion);
  2. Support for (1) videos of arbitrary length and (2) tunable watermark strength, without any extra training;
  3. A provable guarantee on the false-positive rate of watermark detection under distribution-free assumptions (currently for image watermarks only).

Installation

This repository was developed with PyTorch 2.0.1 and should be compatible with newer versions of PyTorch. To set up the environment, first install PyTorch with CUDA manually, then run the following commands to create a separate environment and install the required packages (both Conda and Git are required).

conda create --name Raw python=3.10
conda activate Raw

git clone https://github.com/jeremyxianx/RAWatermark.git
cd RAWatermark
pip install -r requirement.txt
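
Before running the demo, it can help to confirm that the environment matches what the instructions above expect. The following is a minimal sketch and is not part of the repository: `check_environment` is a hypothetical helper name, and it only checks the Python version and whether `torch` can be imported.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 10), packages=("torch",)):
    """Report whether the interpreter and the listed packages look usable.

    Hypothetical helper for illustration; it does not verify CUDA support
    or package versions.
    """
    report = {"python_ok": sys.version_info[:2] >= min_python}
    for name in packages:
        # find_spec returns None when a top-level package cannot be found.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'MISSING'}")
```

If `torch` is reported as missing, install PyTorch first and then re-run the `pip install` step.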

A demo example for video watermarking and detection

In the following, we provide a walk-through example to demonstrate how to use the library to embed and detect watermarks in videos.

  1. First, initialize the RAWatermark instance, which contains a (jointly pre-trained) pair of watermark and classifier.
import torch
from scripts import raw, tools

# Setup device, can be cpu or cuda
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

RAW = raw.RAWatermark(device = device, wm_index = 0)
  • wm_index specifies the index of a jointly trained watermark and its corresponding classifier. We provide several pre-trained watermark-classifier pairs in the assets/pre_trained folder.
  • You can try a different watermark by changing the wm_index parameter. At the moment, we provide 4 pre-trained watermarks, indexed from 0 to 3.
  2. Next, load the video and encode the watermark into each frame of the video. Currently, we only support videos with resolution $512 \times 512$.
demo_videodataset = tools.VideoDataset(
    root_dir='assets/video_examples/',  # replace with your own video folder
    crop_size=False,  # make sure the shape of your video is 512x512
    no_of_frames=25,  # replace with the number of frames you want to watermark
)

demo_video1 = demo_videodataset[0].to(device)

wm_demo_video1 = RAW.encode(demo_video1, injection_every_k_frames=1)
  • Videos should be in either .mp4 or .avi format.
  • Only $512 \times 512$ videos are supported at the moment, but the length of the video, i.e., no_of_frames, can be arbitrary.
  • injection_every_k_frames specifies the frequency of watermark injection. For example, if injection_every_k_frames is set to 1, then the watermark will be injected into every frame of the video.
  3. Then, check for the presence of the watermark given the decision threshold decision_thres.
RAW.detect(wm_demo_video1, decision_thres=0.5)
  • Currently, our detection method is only suitable for videos watermarked with injection_every_k_frames=1.
  • The provable guarantee on the false-positive rate of the watermark detection is only provided for image watermark at this moment.
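
To illustrate what injection_every_k_frames controls in step 2, here is a minimal sketch of which frame indices would receive the watermark. It assumes that injection starts at frame 0 and repeats every k frames; the helper `watermarked_frame_indices` is hypothetical, and the library's actual scheduling may differ.

```python
def watermarked_frame_indices(no_of_frames, injection_every_k_frames=1):
    """Return the frame indices that would be watermarked.

    Assumption (not confirmed by the library): frames 0, k, 2k, ... are
    watermarked, where k = injection_every_k_frames.
    """
    if no_of_frames < 0:
        raise ValueError("no_of_frames must be non-negative")
    if injection_every_k_frames < 1:
        raise ValueError("injection_every_k_frames must be at least 1")
    return list(range(0, no_of_frames, injection_every_k_frames))


if __name__ == "__main__":
    # With k=1 every frame is selected, which is the setting that the
    # detection method currently supports.
    print(watermarked_frame_indices(25, injection_every_k_frames=1))
    print(watermarked_frame_indices(25, injection_every_k_frames=5))
```

Under this assumption, larger values of k leave most frames untouched, which is consistent with detection currently being limited to injection_every_k_frames=1.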

APIs

We provide several more use cases in the API pages, including:

  1. Adjusting the strength of the watermark;
  2. Watermarking images;
  3. Obtaining a provable guarantee on the false-positive rate of the image watermark detection;
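
For intuition on the third use case, one standard way to obtain a distribution-free bound on the false-positive rate is split conformal calibration: pick the detection threshold from classifier scores on held-out unwatermarked images. The sketch below is an illustration under that assumption and is not the repository's API; `conformal_threshold` and `is_watermarked` are hypothetical names, and scores are assumed to be higher for watermarked inputs.

```python
import math


def conformal_threshold(clean_scores, alpha):
    """Threshold such that a new clean score exceeds it with prob. <= alpha.

    clean_scores: classifier scores on unwatermarked calibration images.
    The bound holds when the calibration scores and the new clean score
    are exchangeable (for example, i.i.d.).
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    n = len(clean_scores)
    # Rank of the order statistic used by split conformal prediction.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration samples for this alpha: never flag anything.
        return math.inf
    return sorted(clean_scores)[k - 1]


def is_watermarked(score, threshold):
    """Flag an input as watermarked only if its score exceeds the threshold."""
    return score > threshold


if __name__ == "__main__":
    calibration = [0.7, 0.1, 0.3, 0.2, 0.6, 0.5, 0.4]
    tau = conformal_threshold(calibration, alpha=0.5)
    print(tau, is_watermarked(0.9, tau))
```

A smaller alpha requires more calibration samples: with n clean scores, a finite threshold exists only when alpha is at least 1/(n + 1).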

Test benchmarks for video watermarking and detection

We test the trained watermark and its associated classifier (trained on the MS-COCO dataset) on short videos generated by the stable-video-diffusion-img2vid-xt model, which is an image-to-video model. For the images used to generate the video, we utilize the DiffusionDB dataset. This ensures that the testing videos are not seen by the watermark and the classifier during training.

Visual Example

[Figure: example frames in three columns: Original, Watermarked, and Pixel-wise Difference ($\times 6$)]

Encoding Speed (CPU Only)

| Video Resolution | Number of Frames | Method | Time Elapsed |
|---|---|---|---|
| $512 \times 512$ | 24 | RAW (Ours) | 0.2-0.5 s |
| $512 \times 512$ | 24 | RivaGan | 8-12 s |
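
As a rough sanity check, the timing ranges reported above imply the speedup range computed below. This is plain arithmetic on the reported numbers; `speedup_range` is a hypothetical helper used only for illustration.

```python
def speedup_range(baseline_seconds, ours_seconds):
    """Return (smallest, largest) speedup implied by two (min, max) timings."""
    base_lo, base_hi = baseline_seconds
    ours_lo, ours_hi = ours_seconds
    # Worst case: fastest baseline against our slowest run, and vice versa.
    return base_lo / ours_hi, base_hi / ours_lo


# Timings taken from the table: RivaGan 8-12 s, RAW 0.2-0.5 s.
low, high = speedup_range((8.0, 12.0), (0.2, 0.5))
print(f"speedup between {low:.0f}x and {high:.0f}x")
# prints "speedup between 16x and 60x"
```

The approximately $40\times$ figure quoted in the overview falls inside this range.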

AUROC (over 500 fresh test samples) for Video Watermark Detection

| Method | AUROC |
|---|---|
| RAW (Ours) | 0.96 |
| RivaGan | 0.97 |
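
For readers unfamiliar with the metric, AUROC is the probability that a randomly chosen watermarked sample receives a higher detection score than a randomly chosen unwatermarked one, with ties counted as one half. The sketch below computes it directly from that definition; it is an illustration only, not the evaluation script used for the numbers above, and `auroc` is a hypothetical helper name.

```python
def auroc(positive_scores, negative_scores):
    """AUROC via pairwise comparison (the Mann-Whitney U formulation).

    positive_scores: detector scores for watermarked samples.
    negative_scores: detector scores for unwatermarked samples.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positive_scores) * len(negative_scores))


if __name__ == "__main__":
    # Three of the four (positive, negative) pairs are ranked correctly.
    print(auroc([0.9, 0.3], [0.5, 0.1]))  # prints 0.75
```

The pairwise loop is quadratic in the number of samples, which is fine for a few hundred test videos; a rank-based implementation would be preferable for much larger test sets.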

To-Do list

  • Support for videos/images with arbitrary resolution
  • Support for watermark detection with provable guarantee on the false-positive rate for video watermark
  • Release the entire training pipeline for watermark and classifier

How to Contribute

We welcome contributions from everyone. Please read our CONTRIBUTING.md file for guidelines on how to contribute to this project.

Code of Conduct

To ensure a welcoming and productive environment, all participants are expected to uphold our Code of Conduct.

Contact

If you have any questions, please feel free to contact us at xian0044@umn.edu or submit an issue.
