Fast Generation-Based Gradient Leakage Attacks against Highly Compressed Gradients
In federated learning, clients' private training data can be stolen from publicly shared gradients. Existing attacks either require modifying the FL model (analytics-based) or take a long time to converge (optimization-based), and both fail against the highly compressed gradients found in practical FL systems. We propose a new generation-based attack algorithm that reconstructs the original mini-batch of data from a compressed gradient in just a few milliseconds.
The attack proceeds in three steps:
- Train a generator in advance on an auxiliary dataset, mapping the output feature of the convolutional layers back to the raw images.
- Recover the output feature of the convolutional layers from the gradient of the fully connected layer (see the derivation below).
- Feed the recovered feature to the generator to obtain the user images.
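Step 2 rests on standard chain-rule algebra for a fully connected layer; this derivation is our own sketch, not quoted from the paper:

```latex
% Fully connected layer z = W x + b with loss L; by the chain rule,
\frac{\partial L}{\partial W_{i}} = \frac{\partial L}{\partial z_{i}}\, x^{\top},
\qquad
\frac{\partial L}{\partial b_{i}} = \frac{\partial L}{\partial z_{i}}
\;\Longrightarrow\;
x^{\top} = \frac{\partial L / \partial W_{i}}{\partial L / \partial b_{i}} .
```

With a mini-batch the shared gradient is an average over samples, which is why the attack code below estimates the other samples' contribution from rows whose labels do not occur in the batch and subtracts it as an offset.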
The core code of the algorithm is extremely simple:
```python
import torch
import torch.nn as nn

# training (key step): strip the final fully connected layer so the
# model outputs the convolutional feature, then train the generator
# to invert that feature back to the raw image
criterion = nn.MSELoss()
origin_model.fc = nn.Sequential()
dummy_img = generator(origin_model(img))
loss = criterion(dummy_img, img)

# attack
g_w = grad[-2]  # gradient of the fully connected layer's weight
g_b = grad[-1]  # gradient of the fully connected layer's bias
# estimate the other samples' averaged contribution from rows whose
# labels are absent from the batch, then cancel it out
offset_w = torch.stack([g for idx, g in enumerate(g_w) if idx not in y], dim=0).mean(dim=0) * (bz - 1) / bz
offset_b = torch.stack([g for idx, g in enumerate(g_b) if idx not in y], dim=0).mean() * (bz - 1) / bz
# row-wise division recovers each sample's convolutional feature
conv_out = (g_w[y] - offset_w) / (g_b[y] - offset_b).unsqueeze(1)
conv_out[torch.isnan(conv_out)] = 0.
conv_out[torch.isinf(conv_out)] = 0.
img = generator(conv_out)
```
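For context, here is a minimal sketch of how the inputs to the attack snippet (`grad`, `y`, `bz`) could be produced in a simulated FL client. The ResNet-18 victim model, the input shapes, and the assumption that labels are known are all illustrative, not part of the original code:

```python
import torch
import torch.nn as nn
import torchvision.models as models

# hypothetical victim model whose last layer is a fully connected
# classifier (illustrative assumption)
model = models.resnet18(num_classes=1000)
criterion = nn.CrossEntropyLoss()

img = torch.randn(4, 3, 224, 224)  # one client mini-batch
y = torch.randint(0, 1000, (4,))   # labels, assumed known here
bz = img.size(0)                   # batch size used by the attack

# the "shared gradient" is the list of per-parameter gradients, so
# grad[-2] and grad[-1] are the FC weight and bias gradients
loss = criterion(model(img), y)
grad = torch.autograd.grad(loss, tuple(model.parameters()))
```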
Download the generator weights file `gen_weights.pth` and place it in the folder `./data/`.
Reconstruct 10 batches of images with the FGLA algorithm:

```bash
python reconstruct_exp.py --exp_name="my_exp" --dataset="imagenet" --reconstruct_num=10
```

All the results you need will be placed in `data/reconstruct/{exp_name}/`.
| argument | help | optional values |
|---|---|---|
| algorithm | gradient leakage attack algorithm to run | fgla, dlg, stg, ig |
| model_weights | path to the generator weights | |
| reconstruct_num | number of batches to reconstruct | |
| dataset | dataset to use | imagenet, cifar100, caltech256 |
| max_iteration | number of iterations when using an optimization-based algorithm | |
| exp_name | name of the experiment, used to create the output folder | |
| batch_size | batch size | |
| device | device to use | |
| seed | random seed | |
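For example, a run of one of the optimization-based baselines might look like this (the flag values below are illustrative, not recommended settings):

```bash
python reconstruct_exp.py --exp_name="dlg_exp" --algorithm="dlg" --dataset="cifar100" --batch_size=1 --max_iteration=300 --device="cuda:0" --seed=42
```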
You can also train your own generator:

```bash
python train_generator.py --exp_name="my_train_exp"
```

All the results you need will be placed in `data/train_generator/{exp_name}/`.
| argument | help | optional values |
|---|---|---|
| exp_name | name of the experiment, used to create the output folder | |
| batch_size | batch size | |
| epochs | number of training epochs | |
| device | device to use | |
| seed | random seed | |
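A fuller training invocation might look like this (the values are illustrative):

```bash
python train_generator.py --exp_name="my_train_exp" --batch_size=64 --epochs=50 --device="cuda:0" --seed=42
```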