Official implementation of the watermark injection and detection algorithms presented in the paper:
"Linguistic-Based Watermarking for Text Authentication" by Xi Yang, Kejiang Chen, Weiming Zhang, Chang Liu, Yuang Qi, Jie Zhang, Han Fang, and Nenghai Yu.
- Python 3.9
- Packages listed in requirements.txt
pip install -r requirements.txt
pip install git+https://github.com/JunnYu/WoBERT_pytorch.git # Chinese word-level BERT model
python -m spacy download en_core_web_sm
- For Chinese, please download the pre-trained Chinese word vectors and place them in the root directory.
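Before running the demos, it can help to confirm that the setup steps above took effect. The following is a minimal sketch, not part of the repository: the helper name check_environment is hypothetical, and treating Python 3.9 as a minimum version (rather than an exact requirement) is an assumption. It uses only the standard library and checks whether spaCy and the en_core_web_sm model can be imported.

```python
import importlib.util
import sys


def check_environment(required_modules=("spacy", "en_core_web_sm")):
    """Return a dict mapping each check name to True (found) or False (missing).

    Hypothetical helper: the module names are the ones installed by the
    setup commands above; "python>=3.9" is an assumed lower bound.
    """
    results = {"python>=3.9": sys.version_info >= (3, 9)}
    for name in required_modules:
        # find_spec returns None when a top-level module cannot be located.
        results[name] = importlib.util.find_spec(name) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

Any entry reported as MISSING points to the corresponding install command that should be re-run.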
The watermark injection and detection modules are located in the models directory. watermark_original.py implements the iterative algorithms as described in the paper. watermark_faster.py introduces batch processing to speed up the watermark injection algorithm and the precise detection algorithm.
We provide two demonstrations, demo_CLI.py and demo_gradio.py, which offer a command-line interface and a graphical interface, respectively.
$ python demo_gradio.py --language English --tau_word 0.8 --lamda 0.83
$ python demo_gradio.py --language Chinese --tau_word 0.75 --lamda 0.83
$ python demo_CLI.py --language English --tau_word 0.8 --lamda 0.83
$ python demo_CLI.py --language Chinese --tau_word 0.75 --lamda 0.83
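The four commands above differ only in the demo script and in the language-specific value of --tau_word. As a convenience, they could be generated from a small table of settings. This is a sketch only: the names DEFAULTS and build_demo_command are hypothetical and not part of the repository, and the values are copied from the example commands rather than from the scripts' own defaults.

```python
# Per-language settings copied from the example commands above
# (assumed to be the recommended values; the scripts may define other defaults).
DEFAULTS = {
    "English": {"tau_word": 0.8, "lamda": 0.83},
    "Chinese": {"tau_word": 0.75, "lamda": 0.83},
}


def build_demo_command(script, language):
    """Build the argument list for one of the demo scripts.

    Hypothetical helper: `script` is "demo_CLI.py" or "demo_gradio.py",
    `language` is "English" or "Chinese".
    """
    if language not in DEFAULTS:
        raise ValueError(f"unsupported language: {language!r}")
    params = DEFAULTS[language]
    return [
        "python", script,
        "--language", language,
        "--tau_word", str(params["tau_word"]),
        "--lamda", str(params["lamda"]),  # flag spelling as used by the demos
    ]
```

The returned list can be passed to subprocess.run from the repository root, for example subprocess.run(build_demo_command("demo_CLI.py", "English"), check=True).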