Secure Aggregation for Federated Learning
This example shows how TF Encrypted can be used to perform secure aggregation for federated learning, where a model owner is training a model by repeatedly asking a set of data owners to compute gradients on their locally held data set. As a way of reducing the privacy leakage, only the mean gradient is revealed to the model owner in each iteration.
The above players are represented via two classes,
DataOwner, and linked together as follows:
# instantiate the model owner and the model it wishes to train model_owner = ModelOwner('model-owner') # instantiate data owners and build copy of the model on their devices linked to the model owner's data_owners = [ DataOwner('data-owner-0', model_owner.build_training_model), DataOwner('data-owner-1', model_owner.build_training_model), DataOwner('data-owner-2', model_owner.build_training_model), ]
Then, the result of
compute_gradient of each data owner is used as a private input into a secure computation, in this case between three compute servers as needed by the default Pond protocol.
# collect encrypted gradients from data owners model_grads = zip(*( tfe.define_private_input(data_owner.player_name, data_owner.compute_gradient) for data_owner in data_owners )) # compute mean of gradients (without decrypting) aggregated_model_grads = [ tfe.add_n(grads) / len(grads) for grads in model_grads ] # send the encrypted aggregated gradients to the model owner for it to decrypt and update iteration_op = tfe.define_output( model_owner.player_name, aggregated_model_grads, model_owner.update_model )
Finally, we simply run the update procedure for a certain number of iterations:
with tfe.Session() as sess: sess.run(tf.global_variables_initializer()) for _ in range(model_owner.ITERATIONS): sess.run(iteration_op)
The data flow graph generated by the above can be seen below, where each colour represents a different player, with blue for instance being the model owner.
And by opening the private input blocks we can take a closer look at what happens locally on each data owner, in this case data loading, gradient computation, and secret sharing. Note how the gradient computation takes the current set of parameters from the model owner as input.
Make sure to have the training and test data sets downloaded before running the example:
which will place the converted files in the
To then run locally use:
or remotely using:
python3 examples/federated-learning/run.py config.json
See more details in the documentation.