
Refactor Mixture distribution for V4 #5438

Merged: 13 commits into pymc-devs:main on Mar 9, 2022

Conversation

@ricardoV94 (Member) commented Feb 1, 2022

This PR refactors (Marginalized) Mixture distributions for V4.

  • Refactor Mixture with iterable components
  • Refactor Mixture with single component
  • Refactor NormalMixture
  • Rebase from main and handle new meaning of size for multivariate dists
  • Reenable TestMixtureVsLatent
  • Deprecate MixtureSameFamily (keep tests)
  • Add more design decision context to refactor commit
  • Add moment and tests
  • Allow for Nested Mixtures / Other symbolic distributions as components (reminder Reimplement nested Mixtures #5533)

Changes

There are two big changes in how Mixture works compared to V3:

  1. The support dimensionality of the Mixture no longer depends on the size of the weights, but on the support dimensionality of its components. This means one can no longer use a batched scalar distribution (like Normal) as if it were a vector distribution, which the docstrings supposedly allowed. However, while the old random method respected this convention, the logp method did not (the logp was indifferent to mixing of values across these "fake" vector components), so this is not clearly a regression.
```python
import pymc as pm

# This mixture now assumes interchangeability across the 5 values of each
# component, regardless of the dimensionality of the weights
mix = pm.Mixture.dist(w=[0.5, 0.5], comp_dists=pm.Normal.dist([-10, 10], size=(5, 2)))
```
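
For contrast, a minimal sketch (assuming the refactored single-component API in this PR) where the support dimensionality follows from multivariate components:

```python
import numpy as np
import pymc as pm

# Each row of the batched MvNormal is one component (ndim_supp == 1),
# so the mixture mixes whole length-3 vectors, not individual values
components = pm.MvNormal.dist(
    mu=np.stack([np.zeros(3), 10 * np.ones(3)]),  # (2, 3): two components
    cov=np.eye(3),
    shape=(2, 3),
)
mv_mix = pm.Mixture.dist(w=[0.5, 0.5], comp_dists=components)
```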

I am exploring adding a keyword argument to override the support dimensionality in a follow-up PR. I managed to do this for the random method, but haven't had time yet to figure out what needs to be changed in the logp method.

  2. Nested Mixtures or Mixtures of non-pure RandomVariables (such as Censored variables) are not yet possible. This will require creating a dispatch mechanism to retrieve and manipulate these variables (i.e., check ndim_supp, resize), in the same way that is already done with the component RandomVariables; see the sketch below. This should be straightforward, since these had to be implemented as methods of the SymbolicDistribution anyway.
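
A hypothetical sketch of the kind of dispatch meant here (illustrative names, not the eventual implementation):

```python
from functools import singledispatch

from aesara.tensor.random.op import RandomVariable


@singledispatch
def get_ndim_supp(op):
    # Symbolic distributions (nested Mixtures, Censored, ...) would register
    # their own handlers here once they are supported as components
    raise NotImplementedError(f"ndim_supp not known for {type(op).__name__}")


@get_ndim_supp.register(RandomVariable)
def _get_ndim_supp_rv(op):
    # Pure RandomVariables carry their support dimensionality on the Op
    return op.ndim_supp
```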

Closes #4781

@codecov (bot) commented Feb 1, 2022

Codecov Report

Merging #5438 (0e442d1) into main (afe210a) will increase coverage by 0.86%.
The diff coverage is 95.30%.

@@            Coverage Diff             @@
##             main    #5438      +/-   ##
==========================================
+ Coverage   87.29%   88.15%   +0.86%     
==========================================
  Files          81       81              
  Lines       14247    14238       -9     
==========================================
+ Hits        12437    12552     +115     
+ Misses       1810     1686     -124     
| Impacted Files | Coverage Δ |
| --- | --- |
| pymc/distributions/distribution.py | 90.80% <ø> (ø) |
| pymc/distributions/mixture.py | 93.16% <95.27%> (+71.99%) ⬆️ |
| pymc/distributions/__init__.py | 100.00% <100.00%> (ø) |
| pymc/distributions/multivariate.py | 92.30% <0.00%> (+0.11%) ⬆️ |

@ricardoV94 (Member, Author) commented Feb 4, 2022

@Sayam753 could you give some context on what was the idea with MixtureSameFamily?

Now that we have meta information about the components (ndim_supp, ndims_params), it is perhaps no longer needed?

@Sayam753 (Member) commented Feb 5, 2022

> could you give some context on what was the idea with MixtureSameFamily?

The MixtureSameFamily distribution helps create mixture distributions from multivariate component distributions.

The legacy Mixture distribution assumes that the mixture components lie along the last dimension of a distribution. This is a problem for multivariate distributions, because there the last dimension corresponds to events, not mixture components.

MixtureSameFamily takes as input an integer indicating which axis holds the mixture components, and it reduces that axis during logp computations.

@lucianopaz is the magician behind this distribution.
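
For illustration, a sketch of the legacy usage being described, assuming the V3-era `mixture_axis` argument:

```python
import numpy as np
import pymc3 as pm

with pm.Model():
    w = pm.Dirichlet("w", a=np.ones(2))
    # Batch of two trivariate normals; axis -2 indexes the mixture components,
    # while the last axis holds the events
    components = pm.MvNormal.dist(mu=np.zeros((2, 3)), cov=np.eye(3), shape=(2, 3))
    mix = pm.MixtureSameFamily("mix", w=w, comp_dists=components, mixture_axis=-2)
```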

@ricardoV94 (Member, Author):
If I understand correctly, I think we can infer the correct mixture axis from the meta information we have now.

That's my TODO point above. So far, I have just been hacking it by trial and error.

Do you think there is inherent ambiguity still? Even considering ndim_supp, ndims_params, shape_from_params... and whatever else methods we have to reason about shapes of RandomVariables?
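
As a quick illustration of that meta information (assuming a current V4 build), it is readable directly off a RandomVariable's Op:

```python
import numpy as np
import pymc as pm

rv = pm.MvNormal.dist(mu=np.zeros(3), cov=np.eye(3))
print(rv.owner.op.ndim_supp)     # 1 -> multivariate: events live on the last axis
print(rv.owner.op.ndims_params)  # [1, 2] -> mu is a vector, cov is a matrix
```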

@ricardoV94 (Member, Author):
Figured out the weights shape padding/broadcasting, thanks to @lucianopaz!

@ricardoV94 force-pushed the mixtures branch 4 times, most recently from fac451f to 3a39b82 on February 27, 2022
Comment on lines 633 to 639
# Expected to fail if comp_shape is not provided,
# nd is multidim and it does not broadcast with ncomp. If by chance
# it does broadcast, an error is raised if the mixture is given
# observed data.
# Furthermore, the Mixture will also raise errors when the observed
# data is multidimensional but it does not broadcast well with
# comp_dists.
@ricardoV94 (Member, Author):

This no longer seems to be a problem. @lucianopaz can you confirm that if the current tests pass, this is indeed fine?

@ricardoV94 force-pushed the mixtures branch 5 times, most recently from b2d19a9 to 8d665fa on March 3, 2022
@ricardoV94 (Member, Author):
@OriolAbril Any idea why the tensor_like here does not have a link to the glossary? https://pymc--5438.org.readthedocs.build/en/5438/api/distributions/generated/pymc.NormalMixture.html

@OriolAbril (Member):
There should be spaces both before and after the colon that separates the parameter name from its type.
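
That is, the numpydoc convention wants something like the following (illustrative snippet, not the actual docstring):

```python
def NormalMixture(w, mu, sigma):
    """
    Parameters
    ----------
    w : tensor_like of float
        Mixture weights. Note the spaces around the colon above, which
        Sphinx needs in order to cross-link ``tensor_like`` to the glossary.
    """
```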

@ricardoV94 force-pushed the mixtures branch 2 times, most recently from 7e77630 to dfdce06 on March 7, 2022
@ricardoV94 (Member, Author):
Moments are also ready, thanks to @larryshamalama. This is ready for review and merge 🚀

@larryshamalama (Member) left a comment:
Looks great 🙂 Thanks for leading the effort @ricardoV94

ricardoV94 and others added 13 commits on March 8, 2022:
Mixtures now use an `OpFromGraph` that encapsulates the Aesara random method. This is used so that logp can be easily dispatched to the distribution without requiring involved pattern matching. The Mixture random and logp methods now fully respect the support dimensionality of its components, whereas previously only the logp method did, leading to inconsistencies between the two methods.

In the case where the weights (or size) indicate the need for more draws than what is given by the component distributions, the latter are resized to ensure there are no repeated draws.

This refactoring forces Mixture components to be basic RandomVariables, meaning that nested Mixtures or Mixtures of Symbolic distributions (like Censored) are not currently possible.

Co-authored-by: Larry Dong <larry.dong@mail.utoronto.ca>
* Emphasize equivalence between an iterable of components and a single batched component
* Add example with a mixture of two distinct distributions
* Add example with multivariate components
The two tests relied on implicit V3 behavior, where the dimensionality of the weights implied the support dimension of the mixture distribution. This, however, led to inconsistent behavior between the random method and the logp, as the latter did not enforce this assumption and did not distinguish whether values were mixed across the implied support dimension.

In this refactoring, the support dimensionality of the component variables determines the dimensionality of the mixture distribution, regardless of the weights. This leads to consistent behavior between the random and logp methods, as asserted by the new checks.

Future work will explore allowing the user to specify an artificial support dimensionality higher than the one implied by the component distributions, but for now this is not possible.
Behavior is now implemented in Mixture
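
A minimal sketch (hypothetical names, not the PR's literal code) of the `OpFromGraph` pattern the first commit describes: the random graph is wrapped in a dedicated Op type, so logp can be dispatched on that type instead of pattern-matching the graph.

```python
import aesara.tensor as at
from aesara.compile.builders import OpFromGraph


class MarginalMixtureRV(OpFromGraph):
    """Dedicated Op type that logp dispatching can key on."""


# Encapsulate "pick a component, take its draw" as a single Op
idx = at.lscalar("idx")               # component index, e.g. drawn from Categorical(w)
comp_draws = at.matrix("comp_draws")  # stacked component draws, one row each
mix_op = MarginalMixtureRV([idx, comp_draws], [comp_draws[idx]])

# Applying the Op yields the mixture draw as a regular Aesara variable
mix_draw = mix_op(idx, comp_draws)
```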
@fonnesbeck fonnesbeck merged commit 620b11d into pymc-devs:main Mar 9, 2022
@ricardoV94 ricardoV94 deleted the mixtures branch June 6, 2023 03:03