# FedIPR Tutorial: Guide to Adding Watermarks to Image and Language Models

In this tutorial, you'll learn how to add both backdoor-based and feature-based watermarks to your models during federated training.
We'll cover backdoor-watermark datasets for backdoor-based watermarking and explore signblock, a tool that learns feature-based watermarks during training. We will show you how to apply these techniques to both computer vision and language models. We'll also offer a hands-on example with a CV task, show how to verify the watermarks you've embedded, and introduce some ready-to-use models provided by the FATE framework.

## FedIPR Introduction
FedIPR stands for Federated Intellectual Property Rights, a technology designed to protect the ownership of models developed under federated conditions. At its core, the FedIPR approach is described in the original paper [FedIPR](https://arxiv.org/pdf/2109.13236.pdf), introducing two primary watermarking techniques to safeguard your model: Backdoor-based and Feature-based watermarks.

Backdoor-based methods: These methods use specific input triggers to produce intentional, incorrect labels. The goal here is to create a unique "signature" for the model, allowing for ownership verification through remote APIs, without requiring access to the model's internal parameters.

Feature-based methods: These techniques encode designated binary strings as watermarks directly into the model's layer parameters. Various schemes have been proposed, such as embedding these watermarks into convolution layer weights using a binary cross-entropy loss function, or into normalization layer scale parameters using a hinge-like regularization term. In our implementation, we embed signatures into normalization layers.

Through these watermarking techniques, FedIPR ensures a robust way to assert ownership of your federated models without compromising their performance.

## Preliminary

We strongly recommend you finish reading our NN tutorial to get familiar with Model and Dataset customizations: [NN Tutorials](https://github.com/FederatedAI/FATE/blob/master/doc/tutorial/pipeline/nn_tutorial/README.md)

## Backdoor Dataset for Backdoor Watermark

You can develop your own backdoor dataset and use it in FedIPRTrainer. If a watermark dataset is detected, it is used to train the model along with your task dataset. If none is provided, the trainer performs normal training.

Add the FATE python path first so that you can run the code in this notebook.

In [2]:
import sys
your_path_to_fate_python = 'xxx/fate/python'
sys.path.append(your_path_to_fate_python)

### Interfaces

The WaterMarkDataset class serves as a base class for handling watermark datasets in federated learning environments. The primary task when subclassing WaterMarkDataset is to implement the load method. This method should take a path argument and use it to load both your normal and watermark datasets.

You also need to implement the other interfaces, such as `__getitem__` and `__len__`, as you would for a PyTorch dataset, so that it works correctly in FATE.
You can refer to this tutorial: [Dataset Customization](https://github.com/FederatedAI/FATE/blob/master/doc/tutorial/pipeline/nn_tutorial/Homo-NN-Customize-your-Dataset.ipynb)

Here is the source code of the watermark dataset class.

In [2]:
from federatedml.nn.dataset.base import Dataset
from federatedml.util import LOGGER
from federatedml.nn.dataset.image import ImageDataset


class WaterMarkDataset(Dataset):

    def __init__(self):
        super().__init__()
        self.normal_dataset = None
        self.watermark_dataset = None

    def load(self, path):
        raise NotImplementedError()

    def get_normal_dataset(self):
        return self.normal_dataset

    def get_watermark_dataset(self):
        return self.watermark_dataset
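
To make the subclassing step concrete, here is a minimal sketch. It does not import FATE: `WaterMarkDatasetStandIn` is a hypothetical stand-in that only mirrors the interface of the class above, and `ToyWaterMarkDataset` reads an in-memory dict where a real subclass would read files from `path`. Treat both names, and the choice to index normal samples before watermark samples, as assumptions made for illustration.

```python
class WaterMarkDatasetStandIn:
    """Stand-in mirroring the WaterMarkDataset interface (not the FATE class)."""

    def __init__(self):
        self.normal_dataset = None
        self.watermark_dataset = None

    def load(self, path):
        raise NotImplementedError()

    def get_normal_dataset(self):
        return self.normal_dataset

    def get_watermark_dataset(self):
        return self.watermark_dataset


class ToyWaterMarkDataset(WaterMarkDatasetStandIn):
    """Hypothetical subclass: `path` is a dict of samples instead of a folder."""

    def load(self, path):
        # A real implementation would read files under `path`; here we read a dict.
        self.normal_dataset = list(path.get('normal', []))
        self.watermark_dataset = list(path.get('watermark', []))

    def __len__(self):
        # Total size counts normal and watermark samples together.
        return len(self.normal_dataset) + len(self.watermark_dataset)

    def __getitem__(self, idx):
        # Index normal samples first, then watermark samples.
        n = len(self.normal_dataset)
        if idx < n:
            return self.normal_dataset[idx]
        return self.watermark_dataset[idx - n]


ds = ToyWaterMarkDataset()
ds.load({'normal': [('img0', 3), ('img1', 7)], 'watermark': [('trigger0', 0)]})
print(len(ds), ds[2])
```

In a real subclass you would inherit from the FATE `WaterMarkDataset` itself and keep the same three pieces: `load`, `__len__` and `__getitem__`.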

To help you understand how our watermark dataset works, here is the implementation of the load function of our built-in WaterMarkImageDataset.
The WaterMarkImageDataset class is designed to automatically identify and load two distinct folders from the specified file path: one containing 'normal' training samples and another containing 'watermark' trigger samples.

In [3]:
import os


def load(self, file_path):

    # normal dataset path
    normal_path = os.path.join(file_path, self.normal_folder_name)
    # watermark dataset path
    watermark_path = os.path.join(file_path, self.watermark_folder_name)

    # load normal dataset
    self.normal_dataset = ImageDataset(
        center_crop=self.center_crop,
        center_crop_shape=self.size,
        generate_id_from_file_name=self.generate_id_from_file_name,
        file_suffix=self.file_suffix,
        float64=self.float64,
        label_dtype=self.label_type
    )
    if os.path.exists(normal_path):
        self.normal_dataset.load(normal_path)
    else:
        self.normal_dataset = None
        LOGGER.info(
            f'normal dataset not found in {normal_path}, will not load normal dataset')
    # load watermark dataset
    self.watermark_dataset = ImageDataset(
        center_crop=self.center_crop,
        center_crop_shape=self.size,
        generate_id_from_file_name=self.generate_id_from_file_name,
        file_suffix=self.file_suffix,
        float64=self.float64,
        label_dtype=self.label_type
    )
    if os.path.exists(watermark_path):
        self.watermark_dataset.load(watermark_path)
    else:
        self.watermark_dataset = None
        LOGGER.info(
            f'watermark dataset not found in {watermark_path}, will not load watermark dataset')

You can try our WaterMarkImageDataset by loading our provided CIFAR-10 watermarked dataset, which contains 100 trigger samples. Each trigger image has been augmented with a pattern of structured noise in one corner. Download the dataset and place it in the examples/data folder of your FATE project: [Download Path]()
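
Before loading, it can help to check that a party's data directory has the expected layout. The sketch below is a hypothetical helper, not part of FATE. It assumes the two sub-folders are named `normal` and `watermark`, as the dataset paths printed in this notebook suggest, and only reports whether each one exists.

```python
import os
import tempfile


def describe_ipr_folder(root, normal_name='normal', watermark_name='watermark'):
    """Report which of the two expected sub-folders exist under `root`."""
    return {
        name: os.path.isdir(os.path.join(root, name))
        for name in (normal_name, watermark_name)
    }


# Build a throwaway directory that only contains the 'normal' folder.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'normal'))
    layout = describe_ipr_folder(root)

print(layout)
```

A missing `watermark` entry here corresponds to the case where the load function logs that the watermark dataset was not found and leaves it as None.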

In [4]:
from fate_llm.dataset.watermark import WaterMarkImageDataset

ds = WaterMarkImageDataset()
ds.load('../../../examples/data/cifar_10_ipr/fedipr_cifar10_guest/')

In [5]:
ds.get_normal_dataset()

Dataset ImageFolder
    Number of datapoints: 25000
    Root location: ../../examples/data/cifar_10_ipr/fedipr_cifar10_guest/normal
    StandardTransform
Transform: Compose(
               ToTensor()
           )

In [6]:
ds.get_watermark_dataset() # water mark dataset

Dataset ImageFolder
    Number of datapoints: 100
    Root location: ../../examples/data/cifar_10_ipr/fedipr_cifar10_guest/watermark
    StandardTransform
Transform: Compose(
               ToTensor()
           )

In [7]:
len(ds)

25100

At this point, you can now customize a watermark dataset for your own tasks to add watermarks to your models. In the upcoming CIFAR-10 task, we will be using FATE's built-in image watermark dataset class.

## Built-in BatchNorm and LayerNorm Blocks for Feature-based Watermark

In this section, we will delve into the workings of feature-based watermarking. Feature-based watermarking embeds binary watermark vectors into specific model parameters. The FATE implementation uses the same design as the FedIPR paper: in CNNs, binary watermarks are embedded into BatchNorm layers; in transformers, they are embedded into LayerNorm layers.
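
The idea of pushing normalization scale parameters towards a binary string can be sketched with a hinge-like penalty. This is not FATE's implementation: the projection matrix, the margin value and the function names below are assumptions chosen for illustration, written in plain Python so the arithmetic is easy to follow.

```python
def project(scale, embed_matrix):
    """Project the scale vector onto each bit: out[j] = sum_i scale[i] * E[i][j]."""
    n_bits = len(embed_matrix[0])
    return [
        sum(s * row[j] for s, row in zip(scale, embed_matrix))
        for j in range(n_bits)
    ]


def hinge_sign_loss(scale, embed_matrix, bits, margin=0.1):
    """Hinge-like penalty: zero once every projection agrees with its bit by `margin`."""
    projected = project(scale, embed_matrix)
    return sum(max(0.0, margin - b * p) for b, p in zip(bits, projected))


scale = [0.5, -0.2, 0.8]        # stand-in for a normalization layer's scale parameters
embed_matrix = [[1.0, 0.0],     # stand-in for a secret projection matrix (3 x 2)
                [0.0, 1.0],
                [1.0, 1.0]]
bits = [1.0, -1.0]              # target watermark, one +1/-1 entry per bit

# The second projection is positive while its target bit is -1, so it is penalised.
print(hinge_sign_loss(scale, embed_matrix, bits))
```

During training, a penalty of this kind would be added to the task loss so that the signs of the projected scale parameters gradually match the watermark bits.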

You can use SignatureConv and SignatureLayerNorm to build your model. Once the FedIPR trainer detects these blocks, it automatically assigns a binary watermark vector whose bit length is computed by Equation (15) in the original paper.

You can import them as follows:

In [5]:
from fate_llm.model_zoo.ipr.sign_block import SignatureConv, SignatureLayerNorm

Here is the source code of our built-in AlexNet and DistilBERT, which shows how to quickly build a model with feature-based watermarks:

In [6]:
import torch.nn as nn
from fate_llm.model_zoo.ipr.sign_block import SignatureConv, ConvBlock


class SignAlexNet(nn.Module):

    """
    This is a modified AlexNet: layers 4, 5 and 6 are replaced by SignatureConv blocks
    """

    def __init__(self, num_classes):
        super().__init__()
        in_channels = 3
        maxpoolidx = [1, 3, 7]
        signed_layer = [4, 5, 6]
        layers = []
        inp = in_channels

        # channels & kernel size
        # the same setting as the FedIPR paper
        oups = {
            0: 64,
            2: 192,
            4: 384,
            5: 256,
            6: 256
        }
        kp = {
            0: (5, 2),
            2: (5, 2),
            4: (3, 1),
            5: (3, 1),
            6: (3, 1)
        }

        for layeridx in range(8):
            if layeridx in maxpoolidx:
                layers.append(nn.MaxPool2d(2, 2))
            else:
                k = kp[layeridx][0]
                p = kp[layeridx][1]
                if layeridx in signed_layer:
                    layers.append(SignatureConv(inp, oups[layeridx], k, 1, p))
                else:
                    layers.append(ConvBlock(inp, oups[layeridx], k, 1, p))
                inp = oups[layeridx]

        self.features = nn.Sequential(*layers)
        self.classifier = nn.Linear(4 * 4 * 256, num_classes)

    def forward(self, x):
        for m in self.features:
            x = m(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        if self.training:
            return x
        else:  # Softmax
            return nn.functional.softmax(x, dim=1)
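
The classifier's input size of `4 * 4 * 256` follows from the layer layout. The sketch below traces the spatial size through the eight positions of the feature stack. It assumes a 32x32 input (the CIFAR-10 image size) and that the positional arguments `k, 1, p` passed to the blocks mean kernel size, stride and padding; `conv_out` and `trace_spatial_size` are illustrative helpers, not FATE functions.

```python
def conv_out(size, kernel, padding, stride=1):
    """Spatial size after a convolution (standard output-size formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_spatial_size(size=32):
    """Follow the height/width through the same layer layout as SignAlexNet."""
    maxpoolidx = [1, 3, 7]
    kp = {0: (5, 2), 2: (5, 2), 4: (3, 1), 5: (3, 1), 6: (3, 1)}
    for layeridx in range(8):
        if layeridx in maxpoolidx:
            size //= 2                    # MaxPool2d(2, 2) halves height and width
        else:
            k, p = kp[layeridx]
            size = conv_out(size, k, p)   # stride 1 with these (k, p) keeps the size
    return size


side = trace_spatial_size(32)
print(side, side * side * 256)
```

Under these assumptions, three poolings reduce 32 to 4, and with 256 output channels in the last block the flattened feature vector has 4 * 4 * 256 = 4096 entries.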

By inserting SignatureConv blocks you can easily build a CV model with a feature-based signature. For NLP models, 'recursive_replace_layernorm' lets you quickly replace the original LayerNorm with our SignatureLayerNorm. The code below shows how to add feature-based watermarks to a Hugging Face pretrained model:

In [None]:
from torch.nn import Module
from transformers import DistilBertForSequenceClassification, DistilBertForTokenClassification
from fate_llm.model_zoo.ipr.sign_block import recursive_replace_layernorm


class SignDistilBertForTokenClassification(Module):

    def __init__(self, model_path=None, num_labels=4) -> None:
        super().__init__()
        if model_path is None:
            model_path = 'distilbert-base-uncased'

        self.model_path = model_path
        self.model = DistilBertForTokenClassification.from_pretrained(
            model_path, num_labels=num_labels)

        # replace layernorm by SignatureLayerNorm
        sub_distilbert = self.model.distilbert.transformer.layer[3:]  # replace LayerNorm with SignatureLayerNorm in the last 3 layers
        recursive_replace_layernorm(
            sub_distilbert,
            layer_name_set={'output_layer_norm'})

    def forward(self, input_dict):
        return self.model(**input_dict)

## Verify Feature-based watermark with our tools

After training is done, the feature-based watermark signatures are saved together with the model. You can use our tool to verify model ownership.

In [4]:
from fate_llm.trainer.fedipr_trainer import verify_feature_based_signature

See the example below for usage.
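
To give a rough idea of what such a check measures, the sketch below computes the fraction of watermark bits recovered from a layer's scale parameters. It assumes that each stored key is a pair of a projection matrix and a +1/-1 vector, and that a bit counts as recovered when the sign of the projected scale matches it. The function names are hypothetical, and the real `verify_feature_based_signature` may differ in its details.

```python
def sign(x):
    """Return +1.0 or -1.0; zero is mapped to +1.0 here as an arbitrary choice."""
    return 1.0 if x >= 0 else -1.0


def signature_accuracy(scale, embed_matrix, bits):
    """Fraction of watermark bits whose sign is recovered from the projected scale."""
    recovered = [
        sign(sum(s * row[j] for s, row in zip(scale, embed_matrix)))
        for j in range(len(bits))
    ]
    matches = sum(1 for r, b in zip(recovered, bits) if r == b)
    return matches / len(bits)


scale = [0.5, -0.2, 0.8]        # stand-in for trained scale parameters
embed_matrix = [[1.0, 0.0],     # stand-in for the stored projection matrix
                [0.0, 1.0],
                [1.0, 1.0]]

print(signature_accuracy(scale, embed_matrix, [1.0, 1.0]))   # both bits match
print(signature_accuracy(scale, embed_matrix, [1.0, -1.0]))  # one of two bits matches
```

An accuracy close to 1.0 means the embedded signature can be read back from the model, which is the evidence used to claim ownership.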

## FedIPR on FATE

In FATE-LLM-1.3’s model_zoo we have these built-in models which are automatically integrated with feature-based watermarking capabilities:

#### Model List

- `alexnet.py` - Alexnet
- `resnet.py` - Resnet18
- `distilbert.py` - Distilbert (Configurations match those in the FedIPR paper)
- `gpt2.py` - Standard GPT-2 (Watermarks are added to the last 2 transformer layers)

We have verified the effectiveness of our watermarking features through a series of tests:
- For computer vision tasks, we evaluated both backdoor watermarking and feature-based watermarking techniques on the CIFAR-10 and CIFAR-100 datasets, using ResNet and AlexNet models.
- For natural language processing tasks, we assessed DistilBERT and GPT-2 models on the IMDB and CoNLL-2003 datasets, which are sequence classification and token classification tasks respectively.

During the testing phase, the sign bit was automatically allocated, and the data was evenly divided between the guest and host parties. For backdoor watermarking, each party supplied 100 trigger samples, all of which were augmented with noise.

Here we display the results of the experiments:

AlexNet & Resnet:

| Test Configuration | AlexNet Feature-Based Watermark Accuracy | AlexNet Backdoor Watermark Accuracy | ResNet18 Feature-Based Watermark Accuracy | ResNet18 Backdoor Watermark Accuracy |
|--------------------|-----------------------------------------|------------------------------------|------------------------------------------|-------------------------------------|
| Two-party federation on CIFAR-10 with 100 trigger samples, SignBit auto-assigned | 1.0 (All Parties) | 1.0 (All Parties) | 1.0 (All Parties) | 1.0 (All Parties) |
| Two-party federation on CIFAR-100 with 100 trigger samples, SignBit auto-assigned | 1.0 (All Parties) | 1.0 (Guest), 0.991 (Host) | 1.0 (All Parties) | 1.0 (All Parties) |

DistilBert & GPT2:

| Test Configuration | DistilBERT Feature-Based Watermark Accuracy | GPT-2 Feature-Based Watermark Accuracy |
|--------------------|----------------------------------------------|---------------------------------------|
| Two-party federation on CoNLL-2003 Token Classification with SignBit auto-assigned | 1.0 (All Parties) | 1.0 (All Parties) |
| Two-party federation on IMDB Classification with SignBit auto-assigned | 1.0 (All Parties) | 1.0 (All Parties) |


## A CIFAR-10 Example & Verifying Watermark

Finally, we will show you a CV example: we will train an AlexNet with a backdoor watermark and a feature-based watermark at the same time. After training is done, we use the built-in tools to verify the feature-based watermark. You can verify the backdoor watermark yourself by simply predicting the trigger samples with your model.
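
Checking the backdoor watermark by hand amounts to measuring how often the model returns the designated label on the trigger samples. Here is a minimal sketch of that measurement. `backdoor_accuracy` and `toy_predict` are hypothetical names, and `predict` stands for any callable that returns one probability per class, for example a wrapper around your trained model in evaluation mode.

```python
def backdoor_accuracy(predict, trigger_samples):
    """Fraction of trigger samples for which the model returns the designated label."""
    hits = 0
    for x, designated_label in trigger_samples:
        probs = predict(x)  # one probability per class
        predicted = max(range(len(probs)), key=probs.__getitem__)
        hits += int(predicted == designated_label)
    return hits / len(trigger_samples)


def toy_predict(x):
    """Toy stand-in for a model endpoint: always favours class 2."""
    return [0.1, 0.1, 0.8]


triggers = [('t0', 2), ('t1', 2), ('t2', 0)]
print(backdoor_accuracy(toy_predict, triggers))
```

Because this only needs predictions, the same check works against a remote API without access to the model's parameters.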

### FedIPR Parameters

The FedIPRTrainer's parameters are basically the same as the FedAVGTrainer's, except for 3 additional parameters: alpha, verify_freqs and backdoor_verify_method.
alpha is the weight for the sign loss; verify_freqs is the frequency of verifying your watermark during training (you can check the results in the logs); and backdoor_verify_method lets you choose the method ('accuracy' or 'loss') for verifying the backdoor watermark.

In [None]:
class FedIPRTrainer(FedAVGTrainer):

    def __init__(self,
                 epochs=10,
                 noraml_dataset_batch_size=32,
                 watermark_dataset_batch_size=2,
                 early_stop=None,
                 tol=0.0001,
                 secure_aggregate=True,
                 weighted_aggregation=True,
                 aggregate_every_n_epoch=None,
                 cuda=None,
                 pin_memory=True,
                 shuffle=True,
                 data_loader_worker=0,
                 validation_freqs=None,
                 checkpoint_save_freqs=None,
                 task_type='auto',
                 save_to_local_dir=False,
                 collate_fn=None,
                 collate_fn_params=None,
                 alpha=0.01,
                 verify_freqs=1,
                 backdoor_verify_method: Literal['accuracy',
                                                 'loss'] = 'accuracy'):
        ...
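
If you set these three parameters from your own scripts, a small validation step can catch typos early. The helper below is hypothetical and not part of FATE. The allowed values for `backdoor_verify_method` come from the signature above, while the range checks on `alpha` and `verify_freqs` are assumptions about sensible values.

```python
def fedipr_trainer_kwargs(alpha=0.01, verify_freqs=1,
                          backdoor_verify_method='accuracy', **other):
    """Validate the FedIPR-specific settings and return kwargs for a trainer config."""
    if alpha < 0:
        # Assumption: a loss weight should not be negative.
        raise ValueError('alpha is a loss weight and should be non-negative')
    if not isinstance(verify_freqs, int) or verify_freqs < 1:
        # Assumption: the frequency is a positive whole number.
        raise ValueError('verify_freqs should be a positive integer')
    if backdoor_verify_method not in ('accuracy', 'loss'):
        raise ValueError("backdoor_verify_method should be 'accuracy' or 'loss'")
    return dict(alpha=alpha, verify_freqs=verify_freqs,
                backdoor_verify_method=backdoor_verify_method, **other)


kwargs = fedipr_trainer_kwargs(alpha=0.05, verify_freqs=2, epochs=5)
print(kwargs)
# These could then be passed on, e.g. TrainerParam(trainer_name='fedipr_trainer', **kwargs)
```

Any other trainer argument, such as `epochs` here, is passed through unchanged.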

### Submit a pipeline to run FedIPR CV task

This is an example for the standalone version. If you are running on the cluster version, you have to bind the name & namespace on the guest and host machines correspondingly.

In [15]:
import torch as t
from torch import nn
from pipeline import fate_torch_hook
from pipeline.component import HomoNN
from pipeline.backend.pipeline import PipeLine
from pipeline.component import Reader, Evaluation, DataTransform
from pipeline.interface import Data, Model

t = fate_torch_hook(t)

import os
# bind data path to name & namespace
fate_project_path = os.path.abspath('../../../')
host = 9997
guest = 9997
arbiter = 9997
pipeline = PipeLine().set_initiator(role='guest', party_id=guest).set_roles(guest=guest, host=host,
                                                                            arbiter=arbiter)

data_0 = {"name": "watermark_cifar10_guest", "namespace": "experiment"}
data_1 = {"name": "watermark_cifar10_host", "namespace": "experiment"}

data_path_0 = fate_project_path + '/examples/data/cifar_10_ipr/fedipr_cifar10_guest'
data_path_1 = fate_project_path + '/examples/data/cifar_10_ipr/fedipr_cifar10_host'
pipeline.bind_table(name=data_0['name'], namespace=data_0['namespace'], path=data_path_0)
pipeline.bind_table(name=data_1['name'], namespace=data_1['namespace'], path=data_path_1)

reader_0 = Reader(name="reader_0")
reader_0.get_party_instance(role='guest', party_id=guest).component_param(table=data_0)
reader_0.get_party_instance(role='host', party_id=host).component_param(table=data_1)

from pipeline.component.nn import DatasetParam

dataset_param = DatasetParam(dataset_name='watermark')

from pipeline.component.homo_nn import TrainerParam  # Interface

# our simple classification model:
model = t.nn.CustModel(module_name='ipr.alexnet', class_name='SignAlexNet', num_classes=10)

nn_component = HomoNN(name='nn_0',
                      model=model, # model
                      dataset=dataset_param,  # dataset
                      # Notice that for the convenience of getting result model we set save_to_local_dir=True
                      trainer=TrainerParam(trainer_name='fedipr_trainer', epochs=5, save_to_local_dir=True, cuda=0),
                      optimizer=t.optim.Adam(lr=0.001),
                      loss=t.nn.CrossEntropyLoss(),
                      torch_seed=100 # random seed
                      )


pipeline.add_component(reader_0)
pipeline.add_component(nn_component, data=Data(train_data=reader_0.output.data))
pipeline.compile()



<pipeline.backend.pipeline.PipeLine at 0x7fc070abdc40>

In [None]:
pipeline.fit() # submit!

### Load Model and Verify

Since we enabled 'save_to_local_dir', we can load the trained model directly from the fateflow job folder and verify its watermarks.

In [19]:
from fate_llm.trainer.fedipr_trainer import verify_feature_based_signature

In [20]:
job_id = '202309041103336933850'  # your job id
import os
fate_project_path = os.path.abspath('../../../')
local_dir = fate_project_path + '/fateflow/jobs/{}/guest/9997/nn_0/'.format(job_id)
state_dict = t.load(local_dir + 'model.pkl')

In [21]:
from fate_llm.model_zoo.ipr.alexnet import SignAlexNet

model = SignAlexNet(num_classes=10)

In [24]:
model.load_state_dict(state_dict['model'])

<All keys matched successfully>

In [27]:
keys = state_dict['extra_data']['keys']  # W and watermark vectors

In [28]:
keys

{'features.4': (tensor([[-7.3380e-02,  1.6275e+00, -1.5404e+00,  ...,  3.4250e-01,
           -1.0067e+00, -5.4504e-01],
          [ 2.9928e-01, -4.0935e-01, -6.1239e-01,  ...,  7.2356e-01,
            2.7019e-01, -9.1200e-01],
          [-4.3889e-02,  2.1774e+00, -1.3706e+00,  ..., -8.5879e-01,
            2.3445e-01,  2.0458e+00],
          ...,
          [-5.1755e-01,  5.9240e-01,  2.6353e-01,  ..., -1.0465e+00,
           -5.3456e-01, -6.0439e-01],
          [-2.4679e-01, -1.4290e+00, -5.9567e-01,  ...,  7.7682e-01,
           -6.2445e-01,  1.3682e+00],
          [ 1.1148e+00, -8.7518e-01,  7.6818e-01,  ...,  6.5654e-01,
           -1.8362e+00, -5.5355e-04]]),
  tensor([-1., -1.,  1.,  1.,  1., -1.,  1.,  1., -1., -1., -1.,  1., -1.,  1.,
          -1., -1.,  1., -1., -1., -1., -1.,  1., -1.,  1., -1., -1.,  1., -1.,
           1., -1., -1., -1.,  1.,  1.,  1.,  1.,  1.,  1.,  1., -1., -1.,  1.,
           1., -1.,  1., -1., -1.,  1., -1.,  1., -1., -1., -1.,  1., -1., -1.,
       

In [30]:
acc = verify_feature_based_signature(model, keys)
print(acc)

1.0


The accuracy is 100%! Congratulations. Now you can use FATE to build your own IPR-protected models.