Methods on MixtureModel actually target a mixture distribution. #1669

Open
bartvanerp opened this issue Jan 31, 2023 · 3 comments

Comments

@bartvanerp

The current implementation of the mixture model is as follows:

struct MixtureModel{VF<:VariateForm,VS<:ValueSupport,Component<:Distribution} <: AbstractMixtureModel{VF,VS}
    components::Vector{Component}
    prior::Categorical
end

It has a prior and the conditional likelihood models (the components). By specifying both, the struct kind of implies that the mixture model is a joint distribution over cluster assignments and observations. However, in the corresponding code and documentation it seems that the cluster assignment is instead implicitly marginalized out. As an example, consider the pdf function. Because the structure specifies a joint distribution, pdf should in theory need a value for both the observation and the cluster assignment, whereas it only takes the observation. This implicit marginalization might cause some confusion.
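To make the distinction concrete, the two objects can be written out as follows. The symbols are assumptions introduced here for illustration and do not appear in the package: z is the cluster assignment, K the number of components, \pi_k the prior probability of component k, and p_k the density of component k.

```latex
% Joint over observation x and cluster assignment z (hierarchical mixture model)
p(x, z = k) = \pi_k \, p_k(x)

% Marginal over x alone (mixture distribution), obtained by summing out z
p(x) = \sum_{k=1}^{K} p(x, z = k) = \sum_{k=1}^{K} \pi_k \, p_k(x)
```

A pdf that takes only x as its argument corresponds to the second expression.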

Basically, some methods defined on MixtureModel seem to operate on the mixture distribution that follows from implicitly marginalizing out the cluster assignment of the (hierarchical) mixture model. Although these operations all make sense, it might be more suitable for Distributions.jl to instead have the following distribution:

struct MixtureDistribution{VF<:VariateForm,VS<:ValueSupport,Component<:Distribution, T<:Real} <: AbstractMixtureModel{VF,VS}
    components::Vector{Component}
    weights::Vector{T}
end

where the cluster assignment is explicitly marginalized out and therefore results in a set of mixture weights, which are just numbers.
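The marginalization described above can be checked with a small sketch against the existing MixtureModel. The two Normal components and the weights 0.3 and 0.7 are illustrative assumptions, not values from the package or its documentation.

```julia
using Distributions

# Two-component Gaussian mixture; components and weights are illustrative.
m = MixtureModel([Normal(-2.0, 1.0), Normal(2.0, 1.0)], [0.3, 0.7])

x = 0.5

# pdf takes only the observation; no cluster assignment is passed.
marginal = pdf(m, x)

# The same quantity computed by hand as a weighted sum over the components.
w = probs(m)  # mixture weights as a plain vector of numbers
by_hand = sum(w[k] * pdf(component(m, k), x) for k in 1:ncomponents(m))

# Compare the two values (approximate equality for floating point).
agrees = marginal ≈ by_hand
```

Here probs(m) already exposes the prior as a plain weight vector, which is the view of the prior that the proposed weights field would store directly.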

I hope I am not nitpicking too much with this issue, but I think it might be a valuable consideration for the robustness and uniformity of this package.

@devmotion
Member

> By specifying both, the struct kind of implies that the mixture model is a joint distribution over cluster assignments and observations.

I'd say this is debatable. To me, these fields are internal implementation details and the Categorical type is an alternative to Vector{T} that makes implementing rand easier since rand and sampler are defined for Categorical but not for Vector{T}. The support of MixtureModel is the support of the components.
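As a rough illustration of why a Categorical is convenient for sampling, here is a minimal sketch of ancestral sampling from a mixture. The helper name sample_mixture, the components, and the weights are hypothetical, and this is not claimed to be how Distributions.jl implements rand for MixtureModel.

```julia
using Distributions, Random

# Illustrative components and prior; not taken from the package.
comps = [Normal(-2.0, 1.0), Normal(2.0, 1.0)]
prior = Categorical([0.3, 0.7])

# Hypothetical helper: draw a component index from the prior,
# then draw an observation from the selected component.
function sample_mixture(rng::AbstractRNG, comps, prior::Categorical)
    k = rand(rng, prior)        # rand is defined for Categorical
    return rand(rng, comps[k])  # the index k is discarded; only x is returned
end

rng = MersenneTwister(1)
x = sample_mixture(rng, comps, prior)
```

The returned value lives in the support of the components, while the sampled index k stays internal.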

@bartvanerp
Author

Thanks for the quick reply!

For internal computations I understand that it might be more convenient to work with Categorical distributions. However, this still leads to a small mismatch between the name of the struct and what it represents. The term MixtureModel is usually used to denote the joint probability over likelihoods and prior(s) as a hierarchical model, which might be out of scope for this package. MixtureDistribution, on the other hand, denotes the marginalized pdf of the observations. The computations and documentation currently use the latter definition.

@lrnv

lrnv commented Apr 3, 2023

Following the discussions here and there with @dmetivie, and in connection with #1670, I would strongly prefer something like:

struct Mixture{NT<:Tuple{Vararg{Distribution}},T<:Real}
    components::NT
    weights::Vector{T} # one weight per component; might even be statically sized.
end

The names MixtureModel or Mixture are OK with me; MixtureDistribution is not: we use Gamma and Normal, not GammaDistribution and NormalDistribution.

See in particular the argument I made in this reply about parameter field inferability. TL;DR: it would allow using fit(MixtureModel{Tuple{Gamma,Normal,Beta}},data), whereas currently this method is not practical for mixture models, because its first argument must be the type, which does not yet carry enough information to do the inference.


3 participants