
Differential privacy primitives use insecure noise generation #23002

Open · TedTed opened this issue Jun 13, 2024 · 2 comments
TedTed commented Jun 13, 2024

Hi folks,

This method adds noise to a sum for the purpose of enforcing differential privacy (as described in a recent talk at PEPR '24). The noise is generated by naively calling java.util.Random.nextGaussian, which makes it vulnerable to the floating-point attacks described in this 2012 paper, or (since this is Gaussian noise rather than Laplace noise) in this paper or this one.

This could allow an attacker to extract more information from the output data than they should be able to, in potentially catastrophic ways: precision-based attacks, for example, are very simple and allow an attacker to perfectly distinguish between true inputs of 0 and 1 more than 25% of the time. I have not gone through the trouble of actually installing Presto and building a PoC, but this is such a textbook example of a vulnerable implementation that I hope you'll take this seriously even without one.
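To make the problem concrete, here is a minimal sketch (my own illustration, not Presto code; the 2^40 magnitude and the seed are arbitrary) of why the granularity of floating-point noise depends on the secret value it is added to, which is exactly the structure that precision-based attacks exploit:

```java
import java.util.Random;

public class NoiseGranularity {
    /** Returns true iff the noise g survives addition to base bit-for-bit.
     *  The subtraction below is exact for these operands, so this really
     *  tests whether the addition rounded g. */
    static boolean noiseSurvives(double base, double g) {
        double released = base + g;   // what a naive noisy-sum primitive outputs
        return released - base == g;
    }

    public static void main(String[] args) {
        double g = new Random(7).nextGaussian();
        // Near 0, the noise keeps its full 52-bit mantissa precision...
        System.out.println(noiseSurvives(0.0, g));
        // ...but near 2^40, doubles are spaced ~2^-12 apart, so the noise is
        // rounded onto a coarse grid that depends on the true sum's magnitude.
        System.out.println(noiseSurvives((double) (1L << 40), g));
    }
}
```

An attacker who knows which grids the candidate inputs induce can check which grid the released value lies on.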

duykienvp (Contributor) commented
Thanks @TedTed for bringing this up.
We are aware of this attack. I just want to clarify that the intent of these noisy functions was NOT to be fully differentially private.
There is some documentation about this issue: https://github.com/prestodb/presto/pull/22715/files#diff-7461e30f5827d33fc08a54932f0a32b06827971adf505b00dfd531351c824891R185
but it has not been released to prestodb.io yet. Some of the limitations mentioned in that doc are just the nature of the Presto engine, so we ask practitioners who want to build a DP system to consult with suitable technical experts first.
And we also welcome anyone to help us address these limitations.

Thanks again

TedTed commented Jun 26, 2024

Hi Kien,

I'm going to be honest — I find this response disheartening. You gave a talk at PEPR '24 explaining that you built differential privacy support in Presto, that parts of the code were open-source (even though the rewriter isn't), and that this was used for production use cases across Meta platforms (frustratingly, without giving any more details).

Now you're telling me that this isn't actually trying to implement differential privacy, and that using it for a DP system would require consulting with technical experts. Which experts are you talking to, and why are they not giving you advice such as "first off, make your noise addition primitives safer"? The person who wrote the original paper about floating-point attacks works at Meta. Have you asked him for guidance when building this? If the goal of this work is not to be used to implement differential privacy, then what is the purpose of this code, and why was your PEPR talk suggesting otherwise?

You're saying you welcome help to address these limitations. A very very basic first step would be to fix the noise generation logic, for example by using the primitives from GoogleDP, or re-implementing interval refining in Java. You also probably want to fix this bit of code while you're at it — this is almost certainly not the way you want to compute a DP average. But a lot more things can go wrong when implementing DP, and nobody will be able to help you as long as the rewriter logic is not open-source.
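As a rough illustration of what "safer noise addition" can look like, here is a sketch of the grid-rounding idea behind Mironov's snapping mechanism and the granularity-based noise in GoogleDP (my own sketch, not a vetted implementation; the parameters are placeholders, and a real fix needs a secure sampler plus a privacy analysis that accounts for the rounding):

```java
import java.security.SecureRandom;

/** Sketch only: release noisy sums on a fixed power-of-two grid so that the
 *  low-order bits of the output no longer depend on the secret input. */
public class GridNoise {
    private static final SecureRandom RNG = new SecureRandom();

    /** Round x to the nearest multiple of granularity (assumed to be a power
     *  of two, so the division and multiplication below are both exact). */
    static double roundToGrid(double x, double granularity) {
        return Math.rint(x / granularity) * granularity;
    }

    static double noisySum(double clampedSum, double sigma, double granularity) {
        // Placeholder sampler: a real fix would use a securely sampled,
        // correctly discretized distribution rather than nextGaussian.
        double noisy = clampedSum + sigma * RNG.nextGaussian();
        return roundToGrid(noisy, granularity);
    }
}
```

Because every released value lands on the same fixed grid regardless of the input, the attacker can no longer use the output's floating-point granularity as a distinguisher.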
