# Robbing the Fed: Directly Obtaining Private Data in Federated Learning with Modified Models

This notebook shows an example of the threat model and attack described in "Robbing the Fed: Directly Obtaining Private Data in Federated Learning with Modified Models". Unlike the other examples, which assume an "honest-but-curious" server, this example investigates an actively malicious server that modifies the model. The attack therefore applies to any model architecture, but how conspicuous the modification is depends on the existing architecture onto which the malicious "Imprint" block is grafted.

In this notebook, we place the block in front of a transformer model, as an example. The attack can also be conceptualized as merely a "malicious parameters" attack against a model which contains, for example, only fully connected layers or only convolutions without strides and fully connected layers.
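
To make the "malicious parameters" view concrete, the following is a minimal sketch of the idea behind an imprint-style layer, written in plain Python rather than with the `breaching` implementation. It assumes a fully connected layer whose rows all equal one measurement vector `h`, with biases set to negative cutoffs, followed by a ReLU, and it assumes the upstream gradient for each data point is identical across bins and nonzero. The helper names `imprint_gradients` and `recover` are hypothetical and exist only for this illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def imprint_gradients(batch, upstream, h, cutoffs):
    """Batch-summed gradients of ReLU(W x + b) where every row of W equals h
    and b = -cutoffs. upstream[k] is the (assumed bin-independent) loss
    gradient reaching each active bin for data point k."""
    dim = len(h)
    grad_w = [[0.0] * dim for _ in cutoffs]
    grad_b = [0.0] * len(cutoffs)
    for x, g in zip(batch, upstream):
        measurement = dot(h, x)
        for i, c in enumerate(cutoffs):
            if measurement > c:  # ReLU is active for bin i
                grad_b[i] += g
                for j in range(dim):
                    grad_w[i][j] += g * x[j]
    return grad_w, grad_b

def recover(grad_w, grad_b, tol=1e-12):
    """Divide differences of adjacent weight gradients by differences of
    adjacent bias gradients. A bin holding exactly one data point yields
    that data point; the last bin is compared against zero."""
    recovered = []
    n = len(grad_b)
    for i in range(n):
        next_w = grad_w[i + 1] if i + 1 < n else [0.0] * len(grad_w[i])
        next_b = grad_b[i + 1] if i + 1 < n else 0.0
        diff_b = grad_b[i] - next_b
        if abs(diff_b) > tol:  # skip bins that received no data
            recovered.append([(a - b) / diff_b for a, b in zip(grad_w[i], next_w)])
    return recovered

# Toy batch of three "private" vectors, measured by their mean.
h = [1 / 3, 1 / 3, 1 / 3]
batch = [[0.1, 0.2, 0.3], [0.5, 0.5, 0.5], [0.9, 0.8, 1.0]]
upstream = [0.3, -1.2, 0.7]   # made-up per-example loss gradients
cutoffs = [0.0, 0.4, 0.7]     # each measurement lands in its own bin

grad_w, grad_b = imprint_gradients(batch, upstream, h, cutoffs)
recovered = recover(grad_w, grad_b)
print(recovered)
```

In this toy setting each data point falls into its own bin, so the aggregated gradient alone is enough to recover every input up to floating-point error. When two points share a bin, the same division returns a mixture of them instead.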


This variant recovers text from a malicious language model.


Paper URL: https://openreview.net/forum?id=fwzUgo0FM9v

### Abstract:
Federated learning has quickly gained popularity with its promises of increased user privacy and efficiency. Previous works have shown that federated gradient updates contain information that can be used to approximately recover user data in some situations. These previous attacks on user privacy have been limited in scope and do not scale to gradient updates aggregated over even a handful of data points, leaving some to conclude that data privacy is still intact for realistic training regimes. In this work, we introduce a new threat model based on minimal but malicious modifications of the shared model architecture which enable the server to directly obtain a verbatim copy of user data from gradient updates without solving difficult inverse problems. Even user data aggregated over large batches – where previous methods fail to extract meaningful content – can be reconstructed by these minimally modified models.

### Startup

In [1]:
try:
    import breaching
except ModuleNotFoundError:
    # You only really need this safety net if you want to run these notebooks directly in the examples directory
    # Don't worry about this if you installed the package or moved the notebook to the main directory.
    import os; os.chdir("..")
    import breaching
    
import torch
%load_ext autoreload
%autoreload 2

# Redirects logs directly into the jupyter notebook
import logging, sys
logging.basicConfig(level=logging.INFO, handlers=[logging.StreamHandler(sys.stdout)], format='%(message)s')
logger = logging.getLogger()

### Initialize cfg object and system setup:

This will load the full configuration object. This includes the configuration for the use case and threat model as `cfg.case` and the hyperparameters and implementation of the attack as `cfg.attack`. All parameters can be modified below, or overridden with `overrides=` as if they were command-line arguments.

In [2]:
cfg = breaching.get_config(overrides=["attack=imprint", "case=10_causal_lang_training",
                                      "case/server=malicious-model-rtf"])
          
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
torch.backends.cudnn.benchmark = cfg.case.impl.benchmark
setup = dict(device=device, dtype=getattr(torch, cfg.case.impl.dtype))
setup

Investigating use case causal_lang_training with server type malicious_model.


{'device': device(type='cpu'), 'dtype': torch.float32}

### Modify config options here

You can use `.attribute` access to modify any of these configurations for the attack, or the case:

In [3]:
cfg.case.user.num_data_points = 128 # How many sentences?
cfg.case.user.user_idx = 1 # From which user?
cfg.case.data.shape = [32] # This is the sequence length

cfg.case.server.model_modification.num_bins = 512
cfg.case.server.model_modification.position = None # '4.0.conv'
cfg.case.server.model_modification.linfunc = 'randn'

cfg.case.server.has_external_data = False
cfg.case.data.tokenizer = "gpt2"

### Instantiate all parties

The following lines generate "server", "user" and "attacker" objects and print an overview of their configurations.

In [4]:
user, server, model, loss_fn = breaching.cases.construct_case(cfg.case, setup)
attacker = breaching.attacks.prepare_attack(server.model, server.loss, cfg.attack, setup)
breaching.utils.overview(server, user, attacker)

First layer determined to be pos_encoder
Block inserted at feature shape torch.Size([32, 96]).
Reusing dataset wikitext (/home/jonas/data/wikitext/wikitext-103-v1/1.0.0/aa5e094000ec7afeb74c3be92c88313cd6f132d564c7effd961c10fd47c76f20)
Model architecture transformer3 loaded with 13,949,745 parameters and 0 buffers.
Overall this is a data ratio of    3406:1 for target shape [128, 32] given that num_queries=1.
User (of type UserSingleStep) with settings:
    Number of data points: 128

    Threat model:
    User provides labels: False
    User provides buffers: False
    User provides number of data points: True

    Data:
    Dataset: wikitext
    user: 1
    
        
Server (of type MaliciousModelServer) with settings:
    Threat model: Malicious (Analyst)
    Number of planned queries: 1
    Has external/public data: False

    Model:
        model specification: transformer3
        model state: default
        

    Secrets: {'ImprintBlock': {'weight_idx': 0, 'bias_idx': 1, 'shape':
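
The data ratio reported in the overview above can be checked by hand. The sketch below assumes the ratio is the number of model parameters divided by the number of target tokens (batch size times sequence length). That reading is an assumption, although it does match the logged value.

```python
num_params = 13_949_745               # parameter count from the log above
num_data_points, seq_len = 128, 32    # target shape [128, 32]

num_tokens = num_data_points * seq_len
ratio = num_params / num_tokens       # assumed definition of the "data ratio"
print(f"{num_tokens} tokens, data ratio {ratio:.0f}:1")
```

Under this reading, the shared gradient contains several thousand numbers per private token, which gives an intuition for why the update has room to encode the data verbatim.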

### Simulate an attacked FL protocol

This exchange is a simulation of a single query in a federated learning protocol. The server sends out a `server_payload` and the user computes an update based on their private local data. This user update is `shared_data` and contains, for example, the parameter gradient of the model in the simplest case. `true_user_data` is also returned by `.compute_local_updates`, but of course not forwarded to the server or attacker and only used for (our) analysis.

In [5]:
server_payload = server.distribute_payload()
shared_data, true_user_data = user.compute_local_updates(server_payload)  

Computing user update in model mode: eval.
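
The shape of this exchange can be mimicked with a toy example. The sketch below is not the `breaching` implementation: `ToyServer` and `ToyUser` are hypothetical stand-ins that borrow the method names used above, and the model is assumed to be a plain linear regressor with a squared-error loss.

```python
from dataclasses import dataclass

@dataclass
class ToyServer:
    weights: list  # parameters of a linear model y = w . x

    def distribute_payload(self):
        # The payload holds everything the user needs to compute an update.
        return {"parameters": list(self.weights)}

@dataclass
class ToyUser:
    inputs: list   # private feature vectors
    targets: list  # private targets

    def compute_local_updates(self, payload):
        w = payload["parameters"]
        grad = [0.0] * len(w)
        for x, y in zip(self.inputs, self.targets):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j, xj in enumerate(x):
                # gradient of the mean squared error w.r.t. w[j]
                grad[j] += 2 * err * xj / len(self.inputs)
        shared_data = {"gradients": grad, "num_data_points": len(self.inputs)}
        true_user_data = {"data": self.inputs, "labels": self.targets}
        return shared_data, true_user_data

toy_server = ToyServer(weights=[0.0, 0.0])
toy_user = ToyUser(inputs=[[1.0, 2.0]], targets=[1.0])

toy_payload = toy_server.distribute_payload()
toy_shared, toy_truth = toy_user.compute_local_updates(toy_payload)
print(toy_shared)  # only this dictionary would be sent back to the server
```

With a single data point, the shared gradient of this toy model is already proportional to the private input, which hints at why gradients are sensitive.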


In [6]:
user.print(true_user_data)

 The Tower Building of the Little Rock Arsenal, also known as U.S. Arsenal Building, is a building located in MacArthur Park in downtown Little Rock, Arkansas
. Built in 1840, it was part of Little Rock's first military installation. Since its decommissioning, The Tower Building has housed two museums. It
 was home to the Arkansas Museum of Natural History and Antiquities from 1942 to 1997 and the MacArthur Museum of Arkansas Military History since 2001. It has also been the
 headquarters of the Little Rock Æsthetic Club since 1894. 
 The building receives its name from its distinct octagonal tower. Besides being the last
 remaining structure of the original Little Rock Arsenal and one of the oldest buildings in central Arkansas, it was also the birthplace of General Douglas MacArthur, who became the supreme
 commander of US forces in the South Pacific during World War II. It was also the starting place of the Camden Expedition. In 2011 it was named as one of
 the top 10 attractions in

### Reconstruct user data:

Now we launch the attack, reconstructing user data based on only the `server_payload` and the `shared_data`. 

For this attack, we also share secret information from the malicious server with the attacker (`server.secrets`), which here is the location and structure of the imprint block.

In [7]:
reconstructed_user_data, stats = attacker.reconstruct([server_payload], [shared_data], server.secrets, 
                                                      dryrun=cfg.dryrun)

Initially produced 117 hits.
Recovered tokens tensor([[   11,    12,    13,  ...,   284,   285,   286],
        [  287,   290,   291,  ...,   370,   371,   373],
        [  376,   379,   382,  ...,   513,   517,   530],
        ...,
        [   32,    65,    82,  ..., 23558, 32400, 41075],
        [   11,   278,   351,  ..., 16030, 23707, 24375],
        [   13,    29,    31,  ...,  6553, 12877, 17747]]) through strategy decoder-bias.


Next we'll evaluate metrics, comparing the `reconstructed_user_data` to the `true_user_data`.

In [8]:
metrics = breaching.analysis.report(reconstructed_user_data, true_user_data, [server_payload], 
                                    server.model, order_batch=True, compute_full_iip=False, 
                                    cfg_case=cfg.case, setup=setup)

METRICS: | Accuracy: 0.8633 | S-BLEU: 0.88 | FMSE: 1.1814e-05 | 
 G-BLEU: 0.87 | ROUGE1: 0.87| ROUGE2: 0.86 | ROUGE-L: 0.87| Token Acc: 88.84% | Label Acc: 0.00%
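
As a rough guide to reading the token-level number, the sketch below computes a position-wise token accuracy on made-up token ids. It is only a guess at what such a metric measures: the report in `breaching` may compute it differently, for example after matching sentences across the batch (`order_batch=True`). The function name `token_accuracy` is hypothetical.

```python
def token_accuracy(reconstructed, reference):
    """Fraction of positions whose token id matches, over equally shaped batches."""
    total = matches = 0
    for rec_seq, ref_seq in zip(reconstructed, reference):
        for rec_tok, ref_tok in zip(rec_seq, ref_seq):
            total += 1
            matches += int(rec_tok == ref_tok)
    return matches / total if total else 0.0

# Made-up token ids: the first sentence is recovered exactly,
# the second matches only in its first position.
reference = [[1, 2, 3], [4, 5, 6]]
reconstructed = [[1, 2, 3], [4, 0, 0]]
acc = token_accuracy(reconstructed, reference)
print(f"Token accuracy: {acc:.2%}")
```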


And finally, we also print the reconstructed data:

In [9]:
user.print(reconstructed_user_data)

 The Tower Building of the Little Rock Arsenal, also known as U.S. Arsenal Building, is a building located in MacArthur Park in downtown Little Rock, Arkansas
. Built in 1840, it was part of Little Rock's first military installation. Since its decommissioning, The Tower Building has housed two museums. It
 was home to the Arkansas Museum of Natural History and Antiquities from 1942 to 1997 and the MacArthur Museum of Arkansas Military History since 2001. It has also been the
 headquarters of the Little Rock Æsthetic Club since 1894. 
 The building receives its name from its distinct octagonal tower. Besides being the last
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
 the top were attractions in the state of Little by <unk>. The arsenal was constructed at the request of Governor James landier Conway previously response to the
 perceived dangers of frontier life and fears of the many Native Americans who were passing through the state on their way to the newly establ

### Notes:
* This is still a malicious model. The linear layer that is inserted acts across the entire sequence length.
* An alternative imprint block based on a sparse structure (instead of the cumulative sum in the default block) can be selected by choosing `SparseImprintBlock`.
* The attack works equally well for any sequence length, but longer sequences require more parameters for the linear layer.
* Again, increasing the number of bins allows more data to be recovered.
* With this tokenizer, sentences that are not recovered are shown as a sequence of `,,,,`. This is tokenizer-specific.
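
Two of these notes can be put into rough numbers. The sketch below rests on simplifying assumptions that do not come from the paper's analysis. First, each data point is assumed to land in one of `num_bins` equally likely bins independently, so the estimate need not match the number of hits logged above. Second, the imprint layer is assumed to be a single linear map from the flattened feature shape `[32, 96]` reported in the log to the bins. Both function names are hypothetical.

```python
def expected_isolated(num_points, num_bins):
    """Expected number of points that are alone in their bin, assuming
    independent and uniformly random bin assignment."""
    return num_points * (1 - 1 / num_bins) ** (num_points - 1)

def imprint_weight_count(seq_len, embed_dim, num_bins):
    """Weights plus biases of one linear layer from the flattened sequence to the bins."""
    return seq_len * embed_dim * num_bins + num_bins

for bins in (128, 512, 2048):
    isolated = expected_isolated(128, bins)
    params = imprint_weight_count(32, 96, bins)
    print(f"bins={bins:5d}  expected isolated points={isolated:6.1f}  layer parameters={params:,}")
```

Under these assumptions, more bins leave more data points alone in their bin, while the layer size grows linearly with both the number of bins and the sequence length.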