Generalize Product #1391

Merged
merged 31 commits into from
Jul 7, 2022

devmotion
Member

@devmotion devmotion commented Aug 31, 2021

DistributionsAD contains custom distribution types for vectors of multivariate distributions and matrices of univariate distributions. I think it would be good to add support for these more general product distributions to Distributions and remove them from DistributionsAD.

In this PR, I propose to generalize the existing Product so that it can construct (M + N)-dimensional product distributions from an AbstractArray{<:Distribution{ArrayLikeVariate{M}},N} by stacking the M-dimensional distributions. The ArrayLikeVariate variate form (thanks to @oschulz!) lets us define these product distributions in an even more general way than what is currently implemented in DistributionsAD.
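As a sketch of what "stacking" independent factors means for the density (the notation below is illustrative and not taken from the PR): let $p_i$ be the density of the $i$-th $M$-dimensional factor, where $i$ ranges over the index set $I$ of the $N$-dimensional array of distributions, and let $x_i$ denote the $M$-dimensional slice of the $(M + N)$-dimensional array $x$ that belongs to factor $i$. Independence then gives:

```latex
p(x) = \prod_{i \in I} p_i(x_i),
\qquad
\log p(x) = \sum_{i \in I} \log p_i(x_i)
```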

I tried to perform this generalization in a non-breaking way by adding a new type ProductDistribution, making Product a special alias of it, and deprecating the Product constructor. I did not find a way to deprecate Product with Base.@deprecate_binding; it seems it can't deal with type parameters (see JuliaLang/julia#9830). Product still exists, and product_distribution(::AbstractVector{<:UnivariateDistribution}) still creates a Product to avoid breaking changes (otherwise, e.g., the downstream tests fail).

This PR requires:

  • implementation of interface for MatrixOfUnivariateDistribution
  • implementation of interface for VectorOfMultivariateDistribution
  • basic methods for more general cases
  • tests
  • implementations of more efficient dispatches for product distributions of Fill{<:Distribution{<:ArrayLikeVariate}} (if necessary; not all implementations in DistributionsAD might be needed)

cc: @yebai

@oschulz
Contributor

oschulz commented Aug 31, 2021

I like the ability to create M + N products!

Question: Shouldn't we also allow for products of multivariate dists (of same length) that result in matrixvariate dists, and so on for higher orders?

@devmotion
Member Author

devmotion commented Aug 31, 2021

Question: Shouldn't we also allow for products of multivariate dists (of same length) that result in matrixvariate dists, and so on for higher orders?

That's exactly what the PR proposes to do 🙂 And it is done in VectorOfMultivariate in DistributionsAD with vectors of multivariate distributions.
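A minimal sketch of this case, assuming a Distributions release that includes this PR; the variable names and the choice of MvNormal factors are illustrative and do not come from the PR itself:

```julia
using Distributions

# Three independent 2-dimensional distributions (M = 1) stored in a vector (N = 1).
dists = [MvNormal([0.0, 0.0], [1.0 0.0; 0.0 1.0]) for _ in 1:3]

# Stacking them gives a distribution over 2×3 matrices (M + N = 2).
d = product_distribution(dists)

# One draw; column j corresponds to the factor dists[j].
x = rand(d)

# Independence: the joint log-density is the sum of the per-column log-densities.
lp_joint = logpdf(d, x)
lp_cols = sum(logpdf(dists[j], x[:, j]) for j in 1:3)
```

Here `size(d)` and `size(x)` should both be `(2, 3)`, and `lp_joint` should agree with `lp_cols` up to floating-point error.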

@oschulz
Contributor

oschulz commented Aug 31, 2021

Oh, silly me! Sorry for the misunderstanding - now I like it even more! :-)

@codecov-commenter

codecov-commenter commented Aug 31, 2021

Codecov Report

Merging #1391 (55e6c5b) into master (f889f9e) will decrease coverage by 0.94%.
The diff coverage is 98.75%.

❗ Current head 55e6c5b differs from pull request most recent head 1a574fa. Consider uploading reports for the commit 1a574fa to get more accurate results

@@            Coverage Diff             @@
##           master    #1391      +/-   ##
==========================================
- Coverage   85.52%   84.57%   -0.95%     
==========================================
  Files         128      125       -3     
  Lines        7863     7140     -723     
==========================================
- Hits         6725     6039     -686     
+ Misses       1138     1101      -37     
Impacted Files Coverage Δ
src/common.jl 79.12% <0.00%> (-0.68%) ⬇️
src/multivariates.jl 40.90% <ø> (-1.95%) ⬇️
src/multivariate/product.jl 100.00% <100.00%> (ø)
src/product.jl 100.00% <100.00%> (ø)
src/univariate/continuous/studentizedrange.jl 72.72% <0.00%> (-11.89%) ⬇️
src/univariate/continuous/noncentralchisq.jl 78.94% <0.00%> (-7.42%) ⬇️
src/univariate/continuous/chisq.jl 75.00% <0.00%> (-6.82%) ⬇️
src/univariate/continuous/pgeneralizedgaussian.jl 61.76% <0.00%> (-6.81%) ⬇️
src/univariate/continuous/noncentralf.jl 85.00% <0.00%> (-6.67%) ⬇️
src/univariate/continuous/noncentralt.jl 85.00% <0.00%> (-5.91%) ⬇️
... and 106 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f889f9e...1a574fa.

@matbesancon
Member

like the idea as well

@oschulz
Contributor

oschulz commented Nov 7, 2021

Any news on this one?

@devmotion
Member Author

I have quite a few changes in a local branch but haven't had time to clean up and polish them. There were some test errors caused by bugs in upstream packages such as StatsBase, so I had to fix these first. I'll try to finish the PR this week.

@mschauer
Member

How do you treat the value support? Force all factors to have the same value support?

@oschulz
Contributor

oschulz commented Nov 14, 2021

Thanks for doing this, @devmotion , this will come in so handy!

@devmotion
Member Author

Sure, this could be done (and I've thought of it, with a different notation), but I thought such convenience functions could be discussed and possibly added in a separate PR, as they are not needed for this PR. Hence I don't really want to start a discussion here but just mention that I thought \otimes (i.e., tensor product) would be the natural choice.

@cscherrer

Ideally, yes, but in LinearAlgebra, × === cross. See e.g.
https://julialang.zulipchat.com/#narrow/stream/225540-gripes/topic/.5Ctimes
and
JuliaLang/julia#37109

It's unfortunate, because the symbol is very common in other contexts. But at least for now, IMO this is reason enough to avoid it when possible.

@oschulz
Contributor

oschulz commented Nov 28, 2021

Sure, this could be done (and I've thought of it - with a different notation) but I thought such convenience functions could be discussed and possibly added in a separate PR as it is not needed for this PR.

Sure, of course.

Hence I don't really want to start a discussion here but just mention that I thought \otimes (i.e., tensor product) would be the natural choice.

Oh yes, that would be neat too, and avoid the × === cross awkwardness.

@oschulz
Contributor

oschulz commented Nov 28, 2021

@cscherrer, maybe you can start playing with \otimes in MeasureBase to see how that "feels"?

@cscherrer

I think we'll need to step back and look at the notation as a whole, hopefully get something self-consistent that doesn't conflict too badly with Distributions. But let's discuss this elsewhere 🙂

@devmotion devmotion changed the title from "WIP: Generalize Product and MatrixReshaped" to "WIP: Generalize Product" on Nov 28, 2021
@devmotion devmotion changed the title from "WIP: Generalize Product" to "Generalize Product" on Nov 29, 2021
@devmotion
Member Author

In principle, this PR is ready for review, but I would really like to revert the last commit, which fixed the downstream tests. It introduces an annoying inconsistency: product_distribution(::AbstractVector{<:UnivariateDistribution}) falls back to creating a Product, whereas for all other inputs the default definition of product_distribution creates a ProductDistribution. ProductDistribution is clearly superior: it supports everything that Product does and fixes some of its bugs (such as support of FillArrays). However, it seems that Broadcast.broadcasted expressions (used instead of zip to benefit from pairwise summation) are problematic for AD.
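To illustrate the two reduction styles mentioned above, here is a small sketch using only Base Julia. It is not the code from the PR; the arrays and the multiplication are placeholders for the actual per-factor log-density evaluations:

```julia
xs = rand(1_000)
ys = rand(1_000)

# Lazy broadcast object: represents xs .* ys without allocating the result array.
lazy = Broadcast.instantiate(Broadcast.broadcasted(*, xs, ys))
s_broadcast = sum(lazy)

# zip-based alternative: the generator is reduced as a generic iterator.
s_zip = sum(x * y for (x, y) in zip(xs, ys))
```

Both sums agree up to floating-point rounding; the difference discussed in the comment concerns how the reduction is carried out and how well AD packages handle the lazy broadcast object.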

independent `M`-dimensional distributions by stacking them.

The function falls back to constructing a [`ProductDistribution`](@ref) distribution but
specialized methods can be defined.
Member

Should we say here what pdf(d::ProductDistribution, x) does?

@oschulz
Contributor

oschulz commented Jun 3, 2022

This is going to be awesome, thanks so much for this @devmotion !

@devmotion
Member Author

In principle, this PR is ready for review, but I would really like to revert the last commit, which fixed the downstream tests. It introduces an annoying inconsistency: product_distribution(::AbstractVector{<:UnivariateDistribution}) falls back to creating a Product, whereas for all other inputs the default definition of product_distribution creates a ProductDistribution. ProductDistribution is clearly superior: it supports everything that Product does and fixes some of its bugs (such as support of FillArrays). However, it seems that Broadcast.broadcasted expressions (used instead of zip to benefit from pairwise summation) are problematic for AD.

Any comments? In principle the PR seems ready.

@oschulz
Contributor

oschulz commented Jul 7, 2022

Any comments? In principle the PR seems ready.

Just that I can't wait to use it! :-)

@matbesancon matbesancon merged commit c430f39 into master Jul 7, 2022
@matbesancon matbesancon deleted the dw/product branch July 7, 2022 11:10
@matbesancon
Member

@devmotion I'll let you do the corresponding release?

Comment on lines +24 to +27
Base.depwarn(
"`Product(v)` is deprecated, please use `product_distribution(v)`",
:Product,
)
Contributor

Maybe I'm just brainfarting, but it seems like this combined with the @deprecate below means that no matter which constructor I use for a vector of univariate distributions, I'm going to get a deprecation-warning?

Contributor

Yes, at the moment one always gets a depwarn ... #1589

Member Author

Yeah, it's unfortunate and is fixed by #1590. I thought someone would approve the PR and I would be able to make a bugfix release shortly after the issue was discovered, but it seems nobody has approved it in almost two weeks. I'm going to merge it and tag a release now since the issue is quite annoying for downstream packages (and AFAICT even causes timeouts in Turing).

Contributor

Thanks @devmotion !

Contributor

Wonderful stuff @devmotion !


7 participants