mail-research/SpARK-llm-watermarking

SpARK: An Embarrassingly Simple Sparse Watermarking in LLMs with Enhanced Text Quality

A watermark that preserves both generated text quality and watermark effectiveness by watermarking only a small portion of tokens distributed across the generated text.

Creating the environment.

To create the environment for the experiment, run this command.

conda env create -f environment.yml

Running the experiments.

  • Run the following command to generate watermarked text:
CUDA_VISIBLE_DEVICES=0 python pred.py \
    --mode sparkp \
    --gamma 0.05 \
    --delta 10 \
    --bl_type hard \
    --dataset alpacafarm \
    --model llama2-7b-chat-4k \
    --pos_tag NN NP

Select the model to evaluate via --model, and the watermark mode and hyperparameters via --mode, --bl_type, --gamma, and --delta. The --mode parameter selects which watermark is used in the experiments: sparkp (SpARK-P) or sparkr (SpARK-R). The --bl_type parameter sets whether the watermark is hard or soft. Select the dataset to evaluate via --dataset, and use --pos_tag to configure which POS tags are used. The command above is an example of using SpARK-P on nouns with llama2 to generate answers for the alpacafarm dataset.
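
For intuition about --gamma, --delta, and --bl_type, here is a minimal sketch of the common green-list watermark formulation, in which gamma is the fraction of the vocabulary marked "green" and delta is the bias added to green-token logits. It is an assumption that pred.py follows this formulation. The function names `green_list` and `bias_logits` are hypothetical and not taken from this repository, and the POS-tag-based selection of which positions get watermarked is not shown.

```python
import random

def green_list(prev_token, vocab_size, gamma):
    """Pick a gamma fraction of the vocabulary, seeded by the previous token id.

    Hypothetical helper: the seeding scheme is an assumption for illustration.
    """
    rng = random.Random(prev_token)
    ids = list(range(vocab_size))
    rng.shuffle(ids)
    return set(ids[: int(gamma * vocab_size)])

def bias_logits(logits, green, delta, bl_type):
    """Apply a soft or hard green-list constraint to a list of logits.

    soft: add delta to every green token's logit.
    hard: forbid every non-green token (delta is unused in this sketch).
    """
    if bl_type == "soft":
        return [x + delta if i in green else x for i, x in enumerate(logits)]
    if bl_type == "hard":
        return [x if i in green else float("-inf") for i, x in enumerate(logits)]
    raise ValueError(f"unknown bl_type: {bl_type!r}")

# Toy usage: a 20-token vocabulary with a quarter of it marked green.
green = green_list(prev_token=7, vocab_size=20, gamma=0.25)
soft_logits = bias_logits([0.0] * 20, green, delta=2.0, bl_type="soft")
hard_logits = bias_logits([0.0] * 20, green, delta=2.0, bl_type="hard")
```

In this sketch a smaller gamma makes green tokens rarer, so each green token observed at detection time carries more evidence, while the hard variant removes all non-green tokens from sampling at a watermarked position.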

  • Then, run the detection code in detect.py to obtain z-scores:
CUDA_VISIBLE_DEVICES=0 python detect.py \
    --input_dir ./pred/llama2-7b-chat-4k_sparkp_g0.05_d10.0_hard
  • After that, you can run the code in eval.py to obtain the evaluation results on all datasets in result.json:
CUDA_VISIBLE_DEVICES=0 python eval.py \
    --input_dir ./pred/llama2-7b-chat-4k_sparkp_g0.05_d10.0_hard
  • To get the detection results of the model with watermarks on standard answers, you can run detect_human.py:
CUDA_VISIBLE_DEVICES=0 python detect_human.py \
    --reference_dir llama2-7b-chat-4k_sparkp_g0.05_d10.0_hard \
    --detect_dir human_generation
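
To help interpret the z-scores produced in the detection step, the sketch below computes the one-proportion z-statistic typically used for green-list watermarks: under the no-watermark hypothesis, each scored token is green with probability gamma. It is an assumption that detect.py uses this exact statistic, and the name `z_score` and its parameters are hypothetical. For a sparse watermark, `scored_tokens` would presumably count only the watermarked positions, which is also an assumption.

```python
import math

def z_score(green_count, scored_tokens, gamma):
    """One-proportion z-test for the number of green tokens.

    green_count:   scored tokens that fall in their green list
    scored_tokens: total number of tokens that were scored
    gamma:         expected green fraction without a watermark
    """
    if scored_tokens <= 0:
        raise ValueError("scored_tokens must be positive")
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must be strictly between 0 and 1")
    expected = gamma * scored_tokens
    std = math.sqrt(scored_tokens * gamma * (1.0 - gamma))
    return (green_count - expected) / std

# Toy usage: 20 green tokens out of 100 scored, with gamma = 0.05.
z = z_score(green_count=20, scored_tokens=100, gamma=0.05)
```

A z-score near zero is consistent with unwatermarked text, while a large positive value indicates more green tokens than chance would explain.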
