Automatically join samples whenever they are duplicates #66

Open
HEmile opened this issue May 11, 2020 · 1 comment
Labels
enhancement New feature or request

Comments


HEmile commented May 11, 2020

Monte Carlo methods that sample discrete RVs with replacement should not waste computation on duplicate samples if the downstream computation is deterministic. We could automatically join these samples in this way:

  • On Plate, register which sample outcomes are duplicates
  • At every deterministic call, index each set of duplicates only once during unwrapping
  • After the call, split the results again according to the same indexing
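
The three steps above can be sketched in plain Python, independent of the library's actual Plate and wrapping code. The names `register_duplicates` and `call_deduplicated` are hypothetical and only illustrate the idea; samples are assumed to be hashable values such as category indices.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple, TypeVar

S = TypeVar("S", bound=Hashable)
R = TypeVar("R")


def register_duplicates(samples: Sequence[S]) -> Tuple[List[S], List[int]]:
    """Step 1: find the unique samples and, for every original sample,
    the index of its unique value (hypothetical helper)."""
    unique: List[S] = []
    position: Dict[S, int] = {}  # sample value -> index into `unique`
    inverse: List[int] = []
    for s in samples:
        if s not in position:
            position[s] = len(unique)
            unique.append(s)
        inverse.append(position[s])
    return unique, inverse


def call_deduplicated(fn: Callable[[S], R], samples: Sequence[S]) -> List[R]:
    """Steps 2 and 3: evaluate `fn` once per unique sample, then expand
    the results back to the original sample order."""
    unique, inverse = register_duplicates(samples)
    unique_out = [fn(s) for s in unique]    # one call per unique value
    return [unique_out[i] for i in inverse]  # split again via the indexing


if __name__ == "__main__":
    # Six samples drawn with replacement, only three distinct outcomes.
    print(call_deduplicated(lambda x: x * x, [2, 0, 2, 2, 1, 0]))
```

This is only valid because `fn` is assumed to be deterministic: equal inputs give equal outputs, so sharing one result between duplicates cannot change the estimate.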

As deterministic calls are (or should be) deterministic, we can guarantee this doesn't introduce incorrect behaviour. Downside: if the indexing is slower than the function itself, this could actually be much slower when a lot of deterministic calls happen...

Another option is to have the user select when this joining should happen, e.g. using a context wrapper.

HEmile added the enhancement label May 11, 2020
HEmile added this to To do in Stochastic computation graphs via automation May 11, 2020

HEmile commented May 14, 2020

Other issues: If there are multiple plates, the number of unique samples can vary across the elements of the other plates, meaning we get empty tensors that are still in use. Taken as a single tensor, if there is a single element in the plate for which there are no duplicates, then we still compute the function over the whole sample dimension.

Therefore, to make proper use of it, this would require looping over the function for individual dimensions; otherwise you still do the full computation... Not sure if it's worth it.
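
To make the trade-off concrete, here is a small counting sketch in plain Python. It assumes a second plate is represented as the rows of a nested list and the sample dimension as the columns; the function names are hypothetical. It compares how many function evaluations are needed when all rows must share one rectangular tensor against looping over the rows individually.

```python
from typing import List, Sequence, Tuple


def unique_counts_per_row(samples: Sequence[Sequence[int]]) -> List[int]:
    """Number of distinct sample outcomes in each row (one row per plate element)."""
    return [len(set(row)) for row in samples]


def evaluations(samples: Sequence[Sequence[int]]) -> Tuple[int, int]:
    """Return (batched, looped) evaluation counts.

    batched: every row is padded to the largest unique count, so the
             sample dimension is only as small as the worst row allows.
    looped:  each row is evaluated separately on its own unique samples.
    """
    counts = unique_counts_per_row(samples)
    batched = len(samples) * max(counts)
    looped = sum(counts)
    return batched, looped


if __name__ == "__main__":
    samples = [
        [1, 1, 1, 1],  # all duplicates: 1 unique sample
        [0, 2, 0, 2],  # 2 unique samples
        [0, 1, 2, 3],  # no duplicates: 4 unique samples
    ]
    print(unique_counts_per_row(samples))
    print(evaluations(samples))
```

In this example a single row without duplicates forces the batched version back to the full 3 × 4 evaluations, while the looped version needs only 1 + 2 + 4, at the cost of giving up the single vectorised call.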
