Standardized Distributions #45115

Open
pepper-jk opened this issue Sep 22, 2020 · 11 comments
Labels
module: distributions Related to torch.distributions triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Comments

@pepper-jk

pepper-jk commented Sep 22, 2020

🚀 Feature

torch.normal, torch.distributions.normal.Normal, and torch.Tensor.normal_ should accept the same parameters and execute the same function/method in the backend.

Motivation

This issue is related to pytorch/opacus/issues/59. The motivation behind it is to allow any distribution to be used for the noise generation of the differential privacy implementation in opacus, formerly known as pytorch-dp.

There are multiple ways to create random samples from the normal (or Gaussian) distribution, at least three that I know of (see above). However, not all of them accept the same parameters and therefore allow the same configuration; the differences are illustrated in the sketch after the following list.

  • torch.distributions.normal.Normal does not accept a custom generator and therefore does not support urandom.
    • the same goes for the other distributions in torch.distributions
  • the in-place samplers on torch.Tensor, however, do accept a generator
  • yet torch.Tensor only supports a subset of the distributions available in torch.distributions
    • e.g. in-place sampling from a Laplace distribution is not possible
  • torch.distributions does not allow the device to be passed explicitly
    • one needs to pass it through one of the other parameters, using a Tensor instead of a float
  • the in-place samplers take the device from the Tensor they sample into
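
To illustrate the current differences, here is a minimal sketch (assuming a recent PyTorch release; only the low-level samplers accept a generator, while torch.distributions accepts neither a generator nor an explicit device):

```python
import torch
from torch.distributions import Laplace, Normal

g = torch.Generator()
g.manual_seed(0)

# The low-level samplers accept a generator:
a = torch.normal(0.0, 1.0, (4,), generator=g)
b = torch.empty(4).normal_(0.0, 1.0, generator=g)

# torch.distributions does not; the device also has to be smuggled in
# through tensor-valued parameters instead of an explicit argument:
c = Normal(torch.tensor(0.0), torch.tensor(1.0)).sample((4,))

# Laplace exists only in torch.distributions; there is no Tensor.laplace_():
d = Laplace(0.0, 1.0).sample((4,))
```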

Pitch

  1. The distributions in torch.distributions should accept a generator parameter and (optionally) an explicit device parameter.
    • this ensures compatibility with the differential privacy implementation, which was based on torch.normal; as stated in the issue linked above, that is no longer considered "best practice" (and it also does not allow other distributions, since only the normal distribution is available)
  2. All other methods and functions for these distributions should accept the same parameters and call the same code in the backend.
    • this would reduce differences in performance and results when multiple researchers (and/or developers) use different methods to produce samples from the same distribution
  3. Ideally the set of in-place samplers on torch.Tensor should be complete, i.e. include all distributions available in torch.distributions.

Alternatives

If 1. is not possible, then at least every distribution in torch.distributions should also be available through torch.Tensor.<distribution_name>_ as stated in 3., since that would allow the in-place samplers to replace torch.normal and would make Laplace available as an alternative noise distribution.

Additional context

I highly recommend reading the other issue, pytorch/opacus/issues/59, for more context.

cc @vincentqb @fritzo @neerajprad @alicanb @vishwakftw @nikitaved

@ailzhang ailzhang added module: distributions Related to torch.distributions triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module labels Sep 22, 2020
@fritzo
Collaborator

fritzo commented Sep 30, 2020

Keep in mind that the design goals of torch.distributions differ quite a bit from the low-level helpers like torch.normal and torch.Tensor.normal_.

For details you could read the original design doc, but I would summarize the design goal of torch.distributions as providing a collection of high-level distribution objects, each of which bundles together a number of probability-related functions compatible with broadcasting. The basic pair of methods is (1) a differentiable .log_prob() method and (2) a .sample() method that, for continuous distributions, is differentiable in the sense of reparametrized gradients. Other methods include .entropy() and the pairwise kl_divergence() function, both of which are differentiable. There are about 50 basic distributions in torch.distributions, and another >50 in pyro.distributions that are slowly being moved upstream. So torch.distributions is very focused on what it means for a general probability library to support automatic differentiation.
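
For example, typical usage of these bundled methods looks roughly like this (a minimal sketch using the existing torch.distributions API):

```python
import torch
from torch.distributions import Normal, kl_divergence

p = Normal(torch.zeros(3), torch.ones(3))
q = Normal(torch.ones(3), 2 * torch.ones(3))

x = p.rsample()           # reparametrized, differentiable sample
logp = p.log_prob(x)      # differentiable log density
h = p.entropy()           # differentiable entropy
kl = kl_divergence(p, q)  # differentiable pairwise KL divergence
```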

Given these different design goals, maybe it would be more helpful to update the Distribution interface, e.g. to add more standard .__init__() kwargs to the distributions so as to get the device type correct. Would there be a way to accomplish your goals without being restricted to in-place operations?

@pepper-jk
Author

Yes, if the distribution interface in torch.distributions allowed a generator and a device to be passed to .__init__(), that would be sufficient to solve pytorch/opacus/issues/59.

The missing generator is the only reason why the low-level functions like torch.normal would still be needed, as they can utilize a urandom-based generator.

And an explicit device parameter would remove the need to initialize a tensor filled with zeros and pass it to the distribution just to generate a noise tensor on the correct device.
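
For reference, the current workaround looks roughly like this (a minimal sketch; the tensor parameters are created on the target device only so that the samples end up there):

```python
import torch
from torch.distributions import Normal

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
sigma = 0.5

# Materialize tensor-valued parameters on the target device so that the
# resulting samples are placed on that device as well.
noise = Normal(
    torch.zeros(16, device=device),
    torch.full((16,), sigma, device=device),
).sample()
```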

As for my other suggestions: I just thought a more uniform implementation, in which the in-place functions used the distributions from torch.distributions, would reduce redundancy and the potential bugs that come with it. They are also alternative solutions to our problem, since I assumed there was a reason behind torch.distributions not having either parameter. Sorry if I missed the point of the design.

I'll take another look at this tomorrow. But if adjusting the distribution interface is the way to go, I could get behind that.

@fritzo
Collaborator

fritzo commented Oct 1, 2020

@pepper-jk great, it sounds like a workable solution might be to pass device into .__init__() and pass generator into .sample(). What I like about the second part of this solution is that it is a little closer to JAX distributions, whose .sample() methods also take an rng_key (analogous to a generator). @fehiepsi @neerajprad do you have an opinion on how to pass generator into distributions, maybe in a way that minimizes differences across distribution libraries?
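
To make the proposal concrete, a purely hypothetical sketch of what this could look like (the device and generator keyword arguments below do not exist in the current API):

```python
import torch
from torch.distributions import Normal

g = torch.Generator()
g.manual_seed(0)

# NOTE: `device=` in __init__() and `generator=` in sample() are the proposed
# additions discussed in this thread, not part of the current PyTorch API.
d = Normal(0.0, 1.0, device="cpu")
noise = d.sample((8,), generator=g)
```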

@neerajprad
Contributor

@fehiepsi @neerajprad do you have an opinion on how to pass generator into distributions, maybe in a way that minimizes differences across distribution libraries?

The generator object that @pepper-jk mentioned is stateful, and I think it would be reasonable to pass it in __init__ and use it internally when we call sample. JAX, in contrast, has a functional random number generator, so the onus of splitting keys and passing in an rng key (which is just an integer) has to be managed by the user, hence necessitating the current design choice of passing this key as the first argument to sample. We could use a similar key to set the generator state, but I believe that will only be useful if we have a key-splitting mechanism like JAX, and the user will need to manage this on their own.
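
For reference, JAX's functional RNG looks roughly like this (a minimal sketch using jax.random; key splitting is managed explicitly by the caller):

```python
import jax

key = jax.random.PRNGKey(0)          # functional RNG state derived from an integer seed
key, subkey = jax.random.split(key)  # the caller is responsible for splitting keys
x = jax.random.normal(subkey, shape=(3,))

# Distribution libraries built on JAX (e.g. numpyro) take such a key as the
# first argument to .sample().
```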

@fritzo
Collaborator

fritzo commented Oct 1, 2020

Re: passing generator to .__init__() vs .sample(). @pepper-jk would a single generator instance be shared among multiple distributions, or would distributions own their generator?

One design concern is that most Distribution methods consume no randomness, so passing generator to .sample() rather than .__init__() might help clarify which methods consume randomness. Another advantage to passing generator to .sample() is that we can avoid storing the generator in a distribution instance, thereby eliminating a bit of boilerplate code in .expand(). What advantages might there be in passing generator to .__init__()?

@fehiepsi
Contributor

fehiepsi commented Oct 1, 2020

I would prefer passing generator to the sample method. I think it will also lead to fewer changes in the interface than passing it to __init__.

@neerajprad
Contributor

I agree about the boilerplate in .expand. My initial thought was that adding this to __init__ makes it clear that the distribution uses its own local source of randomness for sampling rather than relying on some global source, and does not require passing an additional generator argument to sample that will be internally mutated. However, this was based on my impression of how the generator argument would be used (i.e. one for each distribution instance). I can see that if we are going to share the same generator across multiple distribution instances, then passing it to sample would be the better design choice and would also match the API of the primitive torch samplers.

@pepper-jk
Author

pepper-jk commented Oct 1, 2020

Re: passing generator to .__init__() vs .sample(). @pepper-jk would a single generator instance be shared among multiple distributions, or would distributions own their generator?

For me either would work fine. I'm not opposed to the idea of passing the generator to .sample. In the code it would replace, the generator is also passed to the (in-place) sample function and is a global member of the privacy_engine class. So for the sake of fixing pytorch/opacus/issues/59 this would be fine, as far as I understand it.

What does the pytorch/opacus team say, @karthikprasad?

@Darktex

Darktex commented Oct 7, 2020

Also no strong opinions on our side at Opacus. We can certainly just pass the generator.

Adding @pbelevich since he built torchcsprng (https://github.com/pytorch/csprng) and may have a more informed opinion than me here :)

@abhmul

abhmul commented Nov 10, 2022

Was this ever resolved? Adding a generator argument to .sample() would be helpful for writing deterministic and reproducible code.
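
For reference, the low-level samplers already accept a generator, while Distribution.sample() does not, so reproducibility currently means seeding or forking the global RNG around the call; a minimal sketch (assuming a recent PyTorch version):

```python
import torch
from torch.distributions import Normal

g = torch.Generator()
g.manual_seed(1234)

# The low-level sampler takes a generator ...
x = torch.empty(3).normal_(0.0, 1.0, generator=g)

# ... but Distribution.sample() does not, so the global RNG has to be
# seeded (or forked) instead:
with torch.random.fork_rng():
    torch.manual_seed(1234)
    y = Normal(0.0, 1.0).sample((3,))
```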

@pepper-jk
Author

pepper-jk commented Nov 10, 2022

Not to my knowledge.

I ended up creating a small library for our research projects. You can find the library here: https://github.com/tklab-tud/aDPtorch

EDIT: just noticed that our solution is probably not what you are looking for. Sorry if this is the case.
