# SynthID Text: Watermarking for Generated Text

This notebook demonstrates how to use the [SynthID Text library][synthid-code]
to apply and detect watermarks on generated text. It is divided into three major
sections and intended to be run end-to-end.

1.  **_Setup_**: Importing the SynthID Text library, choosing your model (either
    [Gemma][gemma] or [GPT-2][gpt2]) and device (either CPU or GPU, depending
    on your runtime), defining the watermarking configuration, and initializing
    some helper functions.
1.  **_Applying a watermark_**: Loading your selected model using the
    [Hugging Face Transformers][transformers] library, using that model to
    generate some watermarked text, and comparing the perplexity of the
    watermarked text to that of text generated by the base model.
1.  **_Detecting a watermark_**: Training a detector to recognize text generated
    with a specific watermarking configuration, and then using that detector to
    predict whether a set of examples were generated with that configuration.

As the reference implementation for the
[SynthID Text paper in _Nature_][synthid-paper], this library and notebook are
intended for research review and reproduction only. They should not be used in
production systems. For a production-grade implementation, check out the
official SynthID logits processor in [Hugging Face Transformers][transformers].

[gemma]: https://ai.google.dev/gemma/docs/model_card
[gpt2]: https://huggingface.co/openai-community/gpt2
[synthid-code]: https://github.com/google-deepmind/synthid-text
[synthid-paper]: https://www.nature.com/
[transformers]: https://huggingface.co/docs/transformers/en/index

# 1. Setup

In [1]:
# @title Install and import the required Python packages
#
# @markdown Running this cell may require you to restart your session.

! pip install synthid-text[notebook]

from collections.abc import Sequence
import enum
import gc

import datasets
import huggingface_hub
from synthid_text import detector_mean
from synthid_text import logits_processing
from synthid_text import synthid_mixin
from synthid_text import detector_bayesian
import tensorflow as tf
import torch
import tqdm
import transformers
# Install Ollama and its LangChain integration (used for paraphrasing below).
!sudo apt update
!sudo apt install -y pciutils
!curl -fsSL https://ollama.com/install.sh | sh
!pip install langchain-ollama

import subprocess
import threading
import time

from IPython.display import Markdown
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama.llms import OllamaLLM
import pandas as pd

In [2]:
# @title Choose your model.
#
# @markdown This reference implementation is configured to use the Gemma v1.0
# @markdown Instruction-Tuned variants in 2B or 7B sizes, or GPT-2.


class ModelName(enum.Enum):
  GPT2 = 'gpt2'
  GEMMA_2B = 'google/gemma-2b-it'
  GEMMA_7B = 'google/gemma-7b-it'


model_name = 'google/gemma-2b-it' # @param ['gpt2', 'google/gemma-2b-it', 'google/gemma-7b-it']
MODEL_NAME = ModelName(model_name)

if MODEL_NAME is not ModelName.GPT2:
  huggingface_hub.notebook_login()

In [3]:
# @title Configure your device
#
# @markdown This notebook loads models from Hugging Face Transformers into the
# @markdown PyTorch deep learning runtime. PyTorch supports generation on CPU or
# @markdown GPU, but your chosen model will run best on the following hardware,
# @markdown some of which may require a
# @markdown [Colab Subscription](https://colab.research.google.com/signup).
# @markdown
# @markdown * Gemma v1.0 2B IT: Use a GPU with 16GB of memory, such as a T4.
# @markdown * Gemma v1.0 7B IT: Use a GPU with 32GB of memory, such as an A100.
# @markdown * GPT-2: Any runtime will work, though a High-RAM CPU or any GPU
# @markdown   will be faster.

DEVICE = (
    torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')
)
DEVICE

device(type='cuda', index=0)

In [4]:
# @title Example watermarking config
#
# @markdown SynthID Text produces unique watermarks given a configuration, with
# @markdown the most important piece of a configuration being the `keys`: a
# @markdown sequence of unique integers.
# @markdown
# @markdown This reference implementation uses a fixed watermarking
# @markdown configuration, which will be displayed when you run this cell.

CONFIG = synthid_mixin.DEFAULT_WATERMARKING_CONFIG
CONFIG

immutabledict({'ngram_len': 5, 'keys': [654, 400, 836, 123, 340, 443, 597, 160, 57, 29, 590, 639, 13, 715, 468, 990, 966, 226, 324, 585, 118, 504, 421, 521, 129, 669, 732, 225, 90, 960], 'sampling_table_size': 65536, 'sampling_table_seed': 0, 'context_history_size': 1024, 'device': device(type='cuda', index=0)})

In [5]:
# @title Initialize the required constants, tokenizer, and logits processor

BATCH_SIZE = 8
NUM_BATCHES = 320
OUTPUTS_LEN = 1024
TEMPERATURE = 0.5
TOP_K = 40
TOP_P = 0.99

tokenizer = transformers.AutoTokenizer.from_pretrained(MODEL_NAME.value)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

logits_processor = logits_processing.SynthIDLogitsProcessor(
    **CONFIG, top_k=TOP_K, temperature=TEMPERATURE
)

In [6]:
# @title Utility functions to load models and process prompts.


def load_model(
    model_name: ModelName,
    expected_device: torch.device,
    enable_watermarking: bool = False,
) -> transformers.PreTrainedModel:
  if model_name == ModelName.GPT2:
    model_cls = (
        synthid_mixin.SynthIDGPT2LMHeadModel
        if enable_watermarking
        else transformers.GPT2LMHeadModel
    )
    model = model_cls.from_pretrained(model_name.value)
    model.generation_config.pad_token_id = model.generation_config.eos_token_id
  else:
    model_cls = (
        synthid_mixin.SynthIDGemmaForCausalLM
        if enable_watermarking
        else transformers.GemmaForCausalLM
    )
    model = model_cls.from_pretrained(
        model_name.value,
        #device_map='auto',
        torch_dtype=torch.bfloat16,
    )

  model.to(expected_device)

  if str(model.device) != str(expected_device):
    raise ValueError('Model device not as expected.')

  return model

def _process_raw_prompt(prompt: bytes) -> str:
  """Decode a raw (bytes) prompt and add the chat template where needed."""
  if MODEL_NAME == ModelName.GPT2:
    return prompt.decode().strip('"')
  else:
    return tokenizer.apply_chat_template(
        [{'role': 'user', 'content': prompt.decode().strip('"')}],
        tokenize=False,
        add_generation_prompt=True,
    )

# 2. Removing the watermark by paraphrasing text with LLMs

In [7]:
# @title Compute g-values of custom texts

def generate_responses(input):
  inputs = tokenizer(
      input,
      return_tensors='pt',
      padding=True,
  ).to(DEVICE)
  # The tokenized text itself is scored, so its input_ids act as the "outputs"
  outputs = inputs['input_ids']
  # Compute the EOS mask and skip the first ngram_len - 1 tokens, giving a mask
  # of shape [batch_size, output_len - (ngram_len - 1)]
  eos_token_mask = logits_processor.compute_eos_token_mask(
      input_ids=outputs,
      eos_token_id=tokenizer.eos_token_id,
  )[:, CONFIG['ngram_len'] - 1:]

  # context repetition mask is computed
  context_repetition_mask = logits_processor.compute_context_repetition_mask(
      input_ids=outputs,
  )
  # context repetition mask shape [batch_size, output_len - (ngram_len - 1)]

  combined_mask = context_repetition_mask * eos_token_mask

  g_values = logits_processor.compute_g_values(
      input_ids=outputs,
  )
  # g values shape [batch_size, output_len - (ngram_len - 1), depth]

  return g_values, combined_mask

def return_wm_mean_scores(input):
  wm_g_values, wm_mask = generate_responses(
      input
  )
  wm_weighted_mean_scores = detector_mean.weighted_mean_score(
      wm_g_values.cpu().numpy(), wm_mask.cpu().numpy()
  )
  print('Weighted Mean scores for watermarked responses: ', wm_weighted_mean_scores)
  return wm_weighted_mean_scores

def return_uwm_mean_scores(input):
  wm_g_values, wm_mask = generate_responses(
      input
  )
  wm_weighted_mean_scores = detector_mean.weighted_mean_score(
      wm_g_values.cpu().numpy(), wm_mask.cpu().numpy()
  )
  print('Weighted Mean scores for unwatermarked responses: ', wm_weighted_mean_scores)
  return wm_weighted_mean_scores

def return_paraphrased_mean_scores(input):
  wm_modified_g_values, wm_modified_mask = generate_responses(
      input
  )
  wm_modified_weighted_mean_scores = detector_mean.weighted_mean_score(
      wm_modified_g_values.cpu().numpy(), wm_modified_mask.cpu().numpy()
  )
  print('Weighted Mean scores for paraphrased watermarked responses: ', wm_modified_weighted_mean_scores)
  return wm_modified_weighted_mean_scores

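def detector(input):
  # Flag a single input text as watermarked when its weighted mean score
  # exceeds the threshold. The score comes back as a one-element array.
  return bool(return_paraphrased_mean_scores(input)[0] > 0.4985)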
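
The scoring helpers above all reduce a tensor of g-values and a mask to one score per text. The sketch below illustrates that reduction with plain NumPy. It is a simplified, unweighted masked mean, not the library's `detector_mean.weighted_mean_score`, and the names `masked_mean_score` and `is_above_threshold` are hypothetical helpers introduced only for illustration.

```python
import numpy as np


def masked_mean_score(g_values, mask):
  """Averages g-values over unmasked positions (simplified, unweighted).

  g_values: array of shape [batch_size, seq_len, depth].
  mask: array of shape [batch_size, seq_len]; 1 keeps a position, 0 drops it.
  """
  g_values = np.asarray(g_values, dtype=float)
  mask = np.asarray(mask, dtype=float)
  depth = g_values.shape[-1]
  # Number of positions that contribute to each sequence's score.
  num_unmasked = mask.sum(axis=1)
  # Zero out masked positions, then sum over positions and depth.
  total = (g_values * mask[:, :, None]).sum(axis=(1, 2))
  return total / (depth * num_unmasked)


def is_above_threshold(scores, threshold):
  """Returns a boolean array: True where a score exceeds the threshold."""
  return np.asarray(scores) > threshold


# Toy example: one sequence, three positions, depth 2; last position is masked.
toy_g_values = np.array([[[1, 0], [1, 1], [0, 0]]])
toy_mask = np.array([[1, 1, 0]])
print(masked_mean_score(toy_g_values, toy_mask))
```

A sequence whose mask is all zeros would divide by zero here, so a real pipeline would need to guard against that case.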

In [8]:
#@title Convert markdown to text
from bs4 import BeautifulSoup
from markdown import markdown
import re

def markdown_to_text(markdown_string):
    """ Converts a markdown string to plaintext """

    # md -> html -> text since BeautifulSoup can extract text cleanly
    html = markdown(markdown_string)

    # remove code snippets (DOTALL so that multi-line blocks are matched too)
    html = re.sub(r'<pre>(.*?)</pre>', ' ', html, flags=re.DOTALL)
    html = re.sub(r'<code>(.*?)</code>', ' ', html, flags=re.DOTALL)

    # extract text
    soup = BeautifulSoup(html, "html.parser")
    text = ''.join(soup.find_all(string=True))

    return text

In [9]:
# @title Generate watermarked and unwatermarked output

gc.collect()
torch.cuda.empty_cache()

def generate_output(example_inputs, enable_watermarking):
  inputs = tokenizer(
      example_inputs,
      return_tensors='pt',
      padding=True,
  ).to(DEVICE)

  model = load_model(MODEL_NAME, expected_device=DEVICE, enable_watermarking=enable_watermarking)

  torch.manual_seed(0)
  outputs = model.generate(
      **inputs,
      do_sample=True,
      temperature=0.86,
      max_length=1024,
      top_k=40,
  )

  lst = []
  for output in outputs:
    decoded = tokenizer.decode(output, skip_special_tokens=True)
    print(decoded)
    lst.append(decoded)

  del inputs, outputs, model
  return lst

gc.collect()
torch.cuda.empty_cache()

In [16]:
#@title Paraphrasing via LLMs
model_list = ["gemma3:4b", "llama3.1:8b", "mistral:7b", "phi3:3.8b", "qwen3:4b"]

def paraphrase(input_prompt, model_size):
  """Paraphrases input_prompt[0] with the first `model_size` models in turn."""
  # The '@' marker lets us discard any preamble the model writes before the
  # paraphrase itself.
  template = """Please paraphrase the following text. Put a symbol '@' at the beginning of the text:
    {input_text}
    Answer:"""
  prompt = ChatPromptTemplate.from_template(template)

  for i in range(model_size):
    model = OllamaLLM(model=model_list[i])
    chain = prompt | model

    # Keep only the text after the last '@' and feed it to the next model.
    x = markdown_to_text(Markdown(chain.invoke({"input_text": input_prompt[0]})).data).split("@")
    input_prompt = [x[-1]]
  return input_prompt

def run_ollama_serve():
  subprocess.Popen(["ollama", "serve"])

thread = threading.Thread(target=run_ollama_serve)
thread.start()
time.sleep(5)
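
The chaining logic in `paraphrase` can be tried without Ollama by swapping the models for plain callables. The sketch below is only an illustration under that assumption: `extract_after_marker`, `chain_paraphrase`, and the two fake models are hypothetical stand-ins, and the extra `strip()` is an addition that the cell above does not perform.

```python
from typing import Callable, Sequence


def extract_after_marker(reply: str, marker: str = "@") -> str:
  """Returns the text after the last marker, or the whole reply if absent."""
  return reply.split(marker)[-1].strip()


def chain_paraphrase(
    text: str, paraphrasers: Sequence[Callable[[str], str]]
) -> str:
  """Feeds the text through each paraphraser in order."""
  for paraphraser in paraphrasers:
    # Each model sees the previous model's cleaned-up output.
    text = extract_after_marker(paraphraser(text))
  return text


# Hypothetical stand-ins for LLM calls; real models would return free text.
def fake_model_a(text: str) -> str:
  return "Sure! @" + text.upper()


def fake_model_b(text: str) -> str:
  return "@ " + text.replace("HELLO", "GREETINGS")


result = chain_paraphrase("hello world", [fake_model_a, fake_model_b])
print(result)
```

Because each stage consumes the previous stage's output, any preamble a model adds before the marker is dropped before the next model sees the text.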

In [11]:
#@title Pull the paraphrasing models with Ollama
!ollama pull gemma3:4b
!ollama pull phi3:3.8b
!ollama pull qwen3:4b
!ollama pull llama3.1:8b
!ollama pull mistral:7b

In [26]:
#@title Inputting texts for paraphrasing

print("Input your text here: ")
text = input()
model_size = int(input("Choose your desired model size (1, 3, or 5): "))

# Keep the Ollama server running
thread = threading.Thread(target=run_ollama_serve)
thread.start()
time.sleep(5)

# Paraphrase the watermarked text
while True:
  try:
    paraphrased_wm_output = paraphrase([text], model_size)
    break
  except RuntimeError:
    continue

print("Here is the paraphrased version:")
print(paraphrased_wm_output[0])
score = return_paraphrased_mean_scores(paraphrased_wm_output)[0]
print()
if (score < 0.5189775):
  print("You are safe")
else:
  print("You need another try")

Input your text here: 
She could talk to animals but they were always asking for favors usually involving stealing food or causing mischief one day a wise old owl approached her and told her that she needed to follow her own heart and not let others dictate what she should do. The owl told her to find her own way and not follow what others told her.  What is the moral of the story?  The moral of the story is that we should follow our own hearts and not let others dictate what we should do.
Choose your desired model size (1, 3, or 5): 3


Here is the paraphrased version:
A special talent in deciphering animal behavior was possessed by her, yet they frequently caused trouble or took items without asking for assistance first. In the end, a wise old owl advised her to lean on her intuition and disregard others' opinions about what she should do. The essence is that you should listen to your inner voice and choose your own path rather than following another's guidance.
Weighted Mean scores for paraphrased watermarked responses:  [0.5033312]

You are safe
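
The cell above retries the paraphrase call indefinitely whenever a `RuntimeError` is raised. One possible alternative is a bounded retry, sketched below. The `retry` helper, its parameters, and the defaults are assumptions made for illustration and are not part of the notebook or of any library it uses.

```python
import time


def retry(fn, max_attempts=3, delay_seconds=0.0, exceptions=(RuntimeError,)):
  """Calls fn() until it succeeds or max_attempts is reached."""
  last_error = None
  for attempt in range(1, max_attempts + 1):
    try:
      return fn()
    except exceptions as error:
      last_error = error
      # Wait before the next attempt, but not after the final one.
      if attempt < max_attempts:
        time.sleep(delay_seconds)
  raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

Under these assumptions, the loop could become `retry(lambda: paraphrase([text], model_size), max_attempts=5, delay_seconds=2)`, which surfaces a persistent failure instead of spinning forever.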


# 3. Plotting (for determining the threshold)



In [None]:
#@title Plot histogram and normal distribution pdf for UWM and WM scores
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from google.colab import drive
from scipy.stats import norm
import statistics

drive.mount('/mnt/drive')

result = pd.read_excel('/mnt/drive/MyDrive/Output and Results.xlsx')

data = result['UWM Score']
# Access the underlying numpy array from the pandas Series
x_axis = data.values

# Calculating mean and standard deviation
mean = statistics.mean(x_axis)
sd = statistics.stdev(x_axis)

# Create a smooth range of x-values for plotting the PDF
# Use linspace to create 100 evenly spaced points between the min and max of your data
x_smooth = np.linspace(min(x_axis), max(x_axis), 100)

# Calculate the PDF values for the smooth x-values
pdf_values = norm.pdf(x_smooth, mean, sd)

# Plot the smooth PDF curve
plt.plot(x_smooth, pdf_values, label=f'Normal Distribution (Mean: {mean:.4f}, Std Dev: {sd:.4f})')

# Optionally, plot a histogram of the actual data for comparison
plt.hist(x_axis, bins=20, density=True, alpha=0.6, label='UWM Score Data')

plt.xlabel('UWM Score')
plt.ylabel('Density')
plt.title('Distribution of UWM Scores vs. Normal Distribution PDF')
plt.legend()
plt.grid(True)
plt.show()

data = result['WM Score']
# Access the underlying numpy array from the pandas Series
x_axis = data.values

# Calculating mean and standard deviation
mean = statistics.mean(x_axis)
sd = statistics.stdev(x_axis)

# Create a smooth range of x-values for plotting the PDF
# Use linspace to create 100 evenly spaced points between the min and max of your data
x_smooth = np.linspace(min(x_axis), max(x_axis), 100)

# Calculate the PDF values for the smooth x-values
pdf_values = norm.pdf(x_smooth, mean, sd)

# Plot the smooth PDF curve
plt.plot(x_smooth, pdf_values, label=f'Normal Distribution (Mean: {mean:.4f}, Std Dev: {sd:.4f})')

# Optionally, plot a histogram of the actual data for comparison
plt.hist(x_axis, bins=20, density=True, alpha=0.6, label='WM Score Data', color = 'b')

plt.xlabel('WM Score')
plt.ylabel('Density')
plt.title('Distribution of WM Scores vs. Normal Distribution PDF')
plt.legend()
plt.grid(True)
plt.show()



In [None]:
#@title Plot 2 normal distribution pdf for UWM and WM

uwm_scores = result['UWM Score'].values
wm_scores = result['WM Score'].values

# Evaluate both PDFs on one shared grid so they can be compared point by point
x_smooth = np.linspace(
    min(uwm_scores.min(), wm_scores.min()),
    max(uwm_scores.max(), wm_scores.max()),
    100,
)

mean = statistics.mean(uwm_scores)
sd = statistics.stdev(uwm_scores)
pdf_values_uwm = norm.pdf(x_smooth, mean, sd)
plt.plot(x_smooth, pdf_values_uwm, label=f'UWM Normal Distribution (Mean: {mean:.4f}, Std Dev: {sd:.4f})')

mean = statistics.mean(wm_scores)
sd = statistics.stdev(wm_scores)
pdf_values_wm = norm.pdf(x_smooth, mean, sd)
plt.plot(x_smooth, pdf_values_wm, label=f'WM Normal Distribution (Mean: {mean:.4f}, Std Dev: {sd:.4f})')

plt.xlabel('Score')
plt.ylabel('Density')
plt.title('Normal Distribution PDF of WM scores and UWM scores')
plt.legend()

# The threshold is where the two curves cross: the last sign change of their
# difference, read off the shared grid rather than the raw score array
idx = np.argwhere(np.diff(np.sign(pdf_values_uwm - pdf_values_wm))).flatten()
print(f"Threshold: {x_smooth[idx][-1]}")
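
The crossing point of two fitted normal curves can also be computed in closed form rather than read off a grid. Setting the two log-densities equal gives a quadratic in x, which the sketch below solves. The function names are hypothetical, and the approach assumes the scores are reasonably described by normal distributions, as the plots above already do.

```python
import math


def normal_pdf(x, mean, sd):
  """Density of a normal distribution with the given mean and std dev."""
  z = (x - mean) / sd
  return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_pdf_intersections(mean_a, sd_a, mean_b, sd_b):
  """Returns the sorted x-values where the two normal PDFs are equal."""
  # Coefficients of a*x^2 + b*x + c = 0, from equating the log-densities.
  a = 1.0 / sd_a**2 - 1.0 / sd_b**2
  b = -2.0 * (mean_a / sd_a**2 - mean_b / sd_b**2)
  c = (
      mean_a**2 / sd_a**2
      - mean_b**2 / sd_b**2
      + 2.0 * math.log(sd_a / sd_b)
  )
  if math.isclose(a, 0.0, abs_tol=1e-12):
    # Equal standard deviations: the equation is linear.
    if math.isclose(b, 0.0, abs_tol=1e-12):
      return []  # Same mean and sd: the curves coincide everywhere.
    return [-c / b]
  discriminant = b * b - 4.0 * a * c
  if discriminant < 0:
    return []
  root = math.sqrt(discriminant)
  return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])
```

With unequal standard deviations there are generally two crossings. The one lying between the two means is usually the natural candidate for a decision threshold.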