
fix: make prv accountant robust to larger epsilons #606

Closed
wants to merge 2 commits into from

Conversation

Solosneros
Contributor

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Docs change / refactoring / dependency upgrade

Motivation and Context / Related issue

Hi,

this PR fixes #601 and #604.

It introduces the same fix as microsoft/prv_accountant#38. Lukas (author of the prv accountant, @wulu473) said that, in general, adding additional points is safe and won't negatively affect robustness.

The cause of these errors seems to be the grid used by the mean() function of the PrivacyRandomVariableTruncated class. The grid (the points variable) used to compute the mean is constant apart from its lowest (self.t_min) and highest (self.t_max) points.

This PR instead derives the grid (the points variable) from the lowest and highest points. More details are below.

Best

Observation

I debugged the code and eventually arrived at the mean() function of the PrivacyRandomVariableTruncated class. The grid (the points variable) used to compute the mean is constant apart from the lowest (self.t_min) and highest (self.t_max) points. See the line of code here. It looks like this: [self.t_min, -0.1, -0.01, -0.001, -0.0001, -1e-05, 1e-05, 0.0001, 0.001, 0.01, 0.1, self.t_max].

It seems that t_min and t_max are of the order of [-12, 12] for the examples I posted above, and even up to [-48, 48] for the example that @jeandut posted in issue #604, whereas they are more like [-7, 7] for the README example for DP-SGD.

We suspect that the integration breaks down when the grid spacing between the fixed interior points and t_min / t_max gets too large.
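
To make that suspicion concrete, here is a small, self-contained illustration. It is not the accountant's actual integrand or quadrature, just SciPy's generic quad on a toy density with made-up bounds: when a single integration segment becomes very wide, the adaptive quadrature can miss a narrow feature entirely, and supplying a few break points (analogous to a denser points grid) recovers it.

from scipy import stats
from scipy.integrate import quad

# Toy density: a narrow bump far away from the fixed interior grid points.
bump = stats.norm(loc=30, scale=0.05)

# One very wide segment (e.g. from the last fixed interior point 0.1 out to a
# large t_max): the initial quadrature nodes all miss the bump, so quad
# returns ~0 with a tiny error estimate and never subdivides.
wide, _ = quad(bump.pdf, 0.1, 48)

# Supplying break points near the feature (analogous to a denser points grid)
# lets the quadrature locate and resolve it.
with_breaks, _ = quad(bump.pdf, 0.1, 48, points=[29, 30, 31])

exact = bump.cdf(48) - bump.cdf(0.1)  # analytic reference, essentially 1.0
print(f"one wide segment : {wide:.6f}")
print(f"with break points: {with_breaks:.6f}")
print(f"exact            : {exact:.6f}")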

Proposed solution

Determine the points grid based on t_min and t_max, i.e., derive the start and end of the logspace from t_min and t_max instead of hard-coding them.

Before:

points = np.concatenate(
    [
        [self.t_min],
        -np.logspace(-5, -1, 5)[::-1],
        np.logspace(-5, -1, 5),
        [self.t_max],
    ]
)

After:

# determine points based on t_min and t_max
lower_exponent = int(np.log10(np.abs(self.t_min)))
upper_exponent = int(np.log10(self.t_max))
points = np.concatenate(
    [
        [self.t_min],
        -np.logspace(start=lower_exponent, stop=-5, num=10),
        [0],
        np.logspace(start=-5, stop=upper_exponent, num=10),
        [self.t_max],
    ]
)
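
For illustration only, the construction above can be wrapped as a standalone helper to see what the adapted grid looks like for a wide truncation range; the values below are examples, roughly matching the [-48, 48] case from #604.

import numpy as np

def adaptive_points(t_min, t_max):
    # Same construction as in the diff above, as a standalone function.
    lower_exponent = int(np.log10(np.abs(t_min)))
    upper_exponent = int(np.log10(t_max))
    return np.concatenate(
        [
            [t_min],
            -np.logspace(start=lower_exponent, stop=-5, num=10),
            [0],
            np.logspace(start=-5, stop=upper_exponent, num=10),
            [t_max],
        ]
    )

# With t_min/t_max around +/-48, the interior log-spaced points now extend to
# the decade below the truncation bounds (about +/-10 here) instead of
# stopping at +/-0.1.
points = adaptive_points(-48.0, 48.0)
print(points)
print("largest segment width:", np.max(np.diff(points)))

For these example bounds the resulting array is strictly increasing, running from t_min through the negative and positive log-spaced points (with 0 in between) up to t_max.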

How Has This Been Tested (if it applies)

I ran the examples from issues #601 and #604 and they no longer break.

import opacus
target_delta = 0.001
target_epsilon = 20
steps = 5000
sample_rate=0.19120458891013384

for target_epsilon in [20, 50]:
    noise_multiplier = opacus.privacy_engine.get_noise_multiplier(target_delta=target_delta, target_epsilon=target_epsilon, steps=steps, sample_rate=sample_rate, accountant="prv")
    prv_accountant = opacus.accountants.utils.create_accountant("prv")
    prv_accountant.history = [(noise_multiplier, sample_rate, steps)]
    obtained_epsilon = prv_accountant.get_epsilon(delta=target_delta)
    print(f"target epsilon {target_epsilon}, obtained epsilon {obtained_epsilon}")

target epsilon 20, obtained epsilon 19.999332284974717
target epsilon 50, obtained epsilon 49.99460075990896

target_epsilon = 4
batch_size = 50
epochs = 5
delta = 1e-05  # note: unused; the calls below reuse target_delta (0.001) from the previous snippet
expected_len_dataloader = 500 // batch_size
sample_rate = 1/expected_len_dataloader


noise_multiplier = opacus.privacy_engine.get_noise_multiplier(target_delta=target_delta, target_epsilon=target_epsilon, epochs=epochs, sample_rate=sample_rate, accountant="prv")
prv_accountant = opacus.accountants.utils.create_accountant("prv")
prv_accountant.history = [(noise_multiplier, sample_rate, int(epochs / sample_rate))]
obtained_epsilon = prv_accountant.get_epsilon(delta=target_delta)
print(f"target epsilon {target_epsilon}, obtained epsilon {obtained_epsilon}")

target epsilon 4, obtained epsilon 3.9968389923130356

Checklist

  • The documentation is up-to-date with the changes I made.
  • I have read the CONTRIBUTING document and completed the CLA (see CONTRIBUTING).
  • All tests passed, and additional code has been covered with new tests.

I was not able to run all tests locally and am unsure whether new tests should be added.
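
On the last point, a regression test along the following lines might be a reasonable addition. This is only a sketch: it mirrors the first example above, uses the same module paths as in this description, and assumes a plain unittest-style test; the class and method names are made up, and the 0.1 tolerance is generous relative to the outputs reported above.

import unittest

import opacus


class PRVAccountantLargeEpsilonTest(unittest.TestCase):
    def test_get_epsilon_matches_large_target_epsilon(self):
        # Regression check for #601 / #604: these inputs used to make the
        # PRV accountant fail for larger target epsilons.
        target_delta = 0.001
        sample_rate = 0.19120458891013384
        steps = 5000
        for target_epsilon in [20, 50]:
            noise_multiplier = opacus.privacy_engine.get_noise_multiplier(
                target_delta=target_delta,
                target_epsilon=target_epsilon,
                steps=steps,
                sample_rate=sample_rate,
                accountant="prv",
            )
            accountant = opacus.accountants.utils.create_accountant("prv")
            accountant.history = [(noise_multiplier, sample_rate, steps)]
            obtained = accountant.get_epsilon(delta=target_delta)
            self.assertAlmostEqual(obtained, target_epsilon, delta=0.1)


if __name__ == "__main__":
    unittest.main()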

@facebook-github-bot added the CLA Signed label on Oct 10, 2023.
@facebook-github-bot
Contributor

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@Solosneros has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Contributor

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

This pull request has been merged in ad084da.

@Solosneros deleted the fix_prv_mean_computation branch on November 28, 2023.
Labels
CLA Signed · Merged
Development

Successfully merging this pull request may close these issues.

PRV Accountant fails for specific input values for its args