Dp scaffold #48
Conversation
I have what is probably a silly question, because I haven't read the DP-SCAFFOLD paper in much detail. Do they also use Opacus for their DP optimization? If so, that's great; I just want to make sure our implementation matches theirs. Computing the control variates using Opacus is a bit cloudy to me: I understand what's stored in the parameter grads after a single batch backward pass, but is the same accumulation stored in them after Opacus computes per-sample gradients for all items in a batch? I would say probably, but I'm not 100% sure.
They do not use Opacus. They implement the DP by hand: https://github.com/maxencenoble/Differential-Privacy-for-Heterogeneous-Federated-Learning/blob/ecad8acb687b974ee917c2cb27515e913ace4d47/flearn/users/user_avg.py#L103. I opted to use Opacus to stay consistent with our existing implementation. My assumption is that Opacus still accumulates across the per-sample gradients; the only difference from a regular optimizer is that per-sample clipping is applied and noise is added. I just read through the Medium article put out by Opacus with some additional insight into what they are doing under the hood. What do you think?
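Here is a minimal sketch of how the Opacus wiring looks on our side (the model, optimizer, and loader are placeholders, and the noise multiplier and clipping norm are illustrative values, not the ones used in this PR):

```python
import torch
from opacus import PrivacyEngine

# Placeholder model, optimizer, and data; real values come from the client setup.
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
train_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))),
    batch_size=8,
)

privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    noise_multiplier=1.0,  # illustrative value
    max_grad_norm=1.0,     # illustrative per-sample clipping bound
)

# After loss.backward(), optimizer.step() clips each per-sample gradient,
# adds calibrated Gaussian noise, and applies the aggregated update, so
# param.grad ends up holding the clipped-and-noised batch aggregate.
```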
If you're confident in it, I'm good with that! Implementing our own would not be fun, and Opacus allows for the enforcement of a lot of things, including replacing layers that do not admit DP, like batch norms. I just wanted to double check that we'd thought it through 🙂
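(For context, the Opacus utility in question is `ModuleValidator`; a quick sketch with a stand-in model:)

```python
import torch
from opacus.validators import ModuleValidator

# Stand-in model containing a BatchNorm layer, which is incompatible with
# per-sample gradient computation.
model = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3), torch.nn.BatchNorm2d(8))

# Report incompatible modules without raising.
print(ModuleValidator.validate(model, strict=False))

# Replace unsupported layers (e.g. BatchNorm -> GroupNorm).
model = ModuleValidator.fix(model)
assert not ModuleValidator.validate(model, strict=False)
```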
Yeah, I am pretty confident! Just to be sure, I added and took on a small ticket in the backlog to explore this further. I thought it would be good to explore outside of this PR, because it will involve a bit of an Opacus deep dive to be absolutely sure.
emersodb
left a comment
All the changes look good to me. The tests were a great addition, along with the improved modularity.
PR Type
Feature
Short Description
Adding DP-SCAFFOLD, a variant of the SCAFFOLD method with instance-level differential privacy guarantees against the server or a third party with access to the final model. As part of this, I also extended SCAFFOLD (client and server) to include the option for warm initialization of the control variates, to stay consistent with the DP-SCAFFOLD paper and official implementation. In both cases (with and without warm initialization), DP-SCAFFOLD offers the same privacy guarantees as DP-FedAvg. For details of the privacy analysis, refer to Section 4 of the paper and Section B of the supplementary materials.
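As a rough sketch of the two initialization modes (the helper name and the warm-start strategy shown here are illustrative, not the exact code in this PR):

```python
import torch

def init_control_variates(model: torch.nn.Module, warm_start: bool) -> list[torch.Tensor]:
    # Illustrative helper: SCAFFOLD control variates mirror the model's
    # parameter shapes. A cold start zero-initializes them; a warm start
    # seeds them from an existing gradient estimate (this assumes .grad has
    # been populated by a prior backward pass on local data).
    if warm_start:
        return [p.grad.detach().clone() for p in model.parameters()]
    return [torch.zeros_like(p) for p in model.parameters()]
```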
I also created an instance-level privacy client where we take care of the Opacus setup under the hood. Right now, computing the privacy loss requires manually specifying the number of samples per client and the total data size. I have added a ticket to take advantage of the functionality we have built out to fetch client sample counts automatically.
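Roughly, the privacy loss computation boils down to something like the following (all numbers are placeholders; the real values come from the client configuration):

```python
from opacus.accountants import RDPAccountant

# Placeholder values; these are what currently must be specified manually.
num_samples = 1_000       # samples held by this client
batch_size = 32
noise_multiplier = 1.0
steps = 500               # total local optimizer steps across rounds

accountant = RDPAccountant()
sample_rate = batch_size / num_samples
for _ in range(steps):
    accountant.step(noise_multiplier=noise_multiplier, sample_rate=sample_rate)

epsilon = accountant.get_epsilon(delta=1 / num_samples)
print(f"(epsilon, delta) = ({epsilon:.2f}, {1 / num_samples})")
```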
Tests Added