
HierarchicalModelLogPDF #1134

Closed · DavAug opened this issue May 12, 2020 · 8 comments
DavAug (Member) commented May 12, 2020

Would it be of interest to have a HierarchicalModelLogPDF class in pints?

So something that produces a joint LogPDF from individual LogPDFs with a specific dependence structure:

p(y, psi, theta) = p(y|psi)p(psi|theta)p(theta)

At the moment it is easy enough to implement the individual conditional PDFs with the existing functionality in pints, but I haven't seen an easy way to create sums of LogPDFs that live on different parameter spaces, or that have this hidden-state structure.
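
For illustration, a minimal sketch of what such a class could look like, assuming the parameters are ordered [psi, theta]. The class layout and the plain callables log_psi_given_theta and log_theta are hypothetical, since pints has no interface for conditional LogPDFs yet:

import pints


class HierarchicalModelLogPDF(pints.LogPDF):

    def __init__(self, log_y_given_psi, log_psi_given_theta, log_theta,
                 n_psi, n_theta):
        # log p(y|psi) with the data y fixed, e.g. a pints log-likelihood
        self._log_y_given_psi = log_y_given_psi
        # Hypothetical plain callables for log p(psi|theta) and log p(theta)
        self._log_psi_given_theta = log_psi_given_theta
        self._log_theta = log_theta
        self._n_psi = n_psi
        self._n_theta = n_theta

    def n_parameters(self):
        return self._n_psi + self._n_theta

    def __call__(self, x):
        # log p(y, psi, theta) = log p(y|psi) + log p(psi|theta) + log p(theta)
        psi, theta = x[:self._n_psi], x[self._n_psi:]
        return (self._log_y_given_psi(psi)
                + self._log_psi_given_theta(psi, theta)
                + self._log_theta(theta))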

@DavAug DavAug changed the title SumOfConditionallyIndependentLogPDFs HierarchicalModelLogPDF May 12, 2020
MichaelClerx (Member) commented
If you can do it, then Yes please!

I think @martinjrobins and @chonlei both felt they couldn't generalise their hierarchical schemes enough?

DavAug (Member, Author) commented May 12, 2020

Sum of PDFs with disjoint parameter sets should be easy enough. Somewhat analogous to 'pints.SumOfIndependentLogPDFs', but conditioning on disjoint parameter sets and introducing some rule for the mapping, say that the parameters have to be in the order [psi, theta].

import numpy as np
import pints


class SumOfDisjointLogPDFs(pints.LogPDF):
    """Sketch: sums LogPDFs over disjoint parameter slices (name hypothetical)."""

    def __init__(self, log_likelihoods):
        self._log_likelihoods = list(log_likelihoods)
        # Get dimensions of the individual log-likelihoods
        self._indiv_n_parameters = [
            e.n_parameters() for e in self._log_likelihoods]
        self._n_parameters = int(np.sum(self._indiv_n_parameters))
        # Prepend a 0 so consecutive entries delimit each slice of x
        self._indiv_n_parameters = np.cumsum([0] + self._indiv_n_parameters)

    def n_parameters(self):
        return self._n_parameters

    def __call__(self, x):
        total = 0
        for i, e in enumerate(self._log_likelihoods):
            total += e(x[self._indiv_n_parameters[i]:
                         self._indiv_n_parameters[i + 1]])
        return total
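
Hypothetical usage, with two one-parameter priors standing in for log-pdfs on disjoint parameter spaces:

import pints

log_pdf = SumOfDisjointLogPDFs([
    pints.GaussianLogPrior(0, 1),   # parameter psi
    pints.GaussianLogPrior(0, 10),  # parameter theta
])

# The joint parameter vector concatenates both sets, in the order [psi, theta]
value = log_pdf([0.5, 2.0])  # log N(0.5; 0, 1) + log N(2.0; 0, 10^2)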

DavAug (Member, Author) commented May 12, 2020

But having varying psi messes up the intended functionality of the likelihoods a bit in p(psi|theta), since there the psi play the role of the "data".

DavAug (Member, Author) commented May 12, 2020

So would it be acceptable to alter the values in the likelihood p(psi|theta), even though the "data" is usually not meant to change?

A quick solution would then be to access the private variable self._values in the pints problem and update the psi values at every iteration.
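
To make that concrete, a rough sketch (illustrative only: it relies on the private _values attribute, and pints likelihoods that copy the data in their constructor would not see the update):

import numpy as np


class TopLevelLogLikelihood:
    """Sketch: evaluates log p(psi | theta) by re-using a pints likelihood
    and swapping the problem's "data" for the current psi on every call."""

    def __init__(self, problem, log_likelihood):
        self._problem = problem
        self._log_likelihood = log_likelihood

    def __call__(self, psi, theta):
        # Overwrite the private data array with the current bottom-level
        # parameters; unsupported, and broken for likelihoods that cache
        # the values at construction time.
        self._problem._values = np.asarray(psi)
        return self._log_likelihood(theta)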

simonmarchant (Contributor) commented
I'm looking at doing a hierarchical prior function, to at least add this functionality for a small subset of cases. I've written a docstring that describes what I'm intending for it to do: does it sound reasonable for a first hierarchical prior function?
[Screenshot of the proposed docstring]
In particular @ben18785, the hierarchical guru.

@simonmarchant simonmarchant self-assigned this Jun 3, 2020
ben18785 (Collaborator) commented Jun 3, 2020 via email

DavAug (Member, Author) commented Jul 1, 2020

@MichaelClerx @ben18785 @martinjrobins @chonlei

In addition to the really cool HierarchicalLogPrior that @simonmarchant is implementing, I've been thinking about how one could compose hierarchical posteriors that are not constrained to Normally distributed bottom-level parameters, and, more importantly, where you have the option to pool some of the bottom-level parameters. In my problems you often want some parameters, such as the noise, to be shared between individuals, while others may vary. I couldn't really work out how pymc3 might be integrated with pints to do this.

So here is my suggestion (sorry for the unrendered TeX, but maybe just look at the example at the bottom to get a feel for the interface):

class HierarchicalLogPosterior(LogPDF):
    r"""
    Represents the log-posterior of a hierarchical model

    .. math::
        \log \mathbb{P}(\theta ,\psi ^{(1)}, \ldots ,\psi ^{(n)} |
        D^{(1)}, \ldots ,D^{(n)}) =
        \sum _{i=1}^n \log \mathbb{P}(y | \psi ^{(i)})
        \Big | _{y = D^{(i)}} +
        \sum _{i=1}^n \log \mathbb{P}(\psi ^{(i)} | \theta) +
        \log \mathbb{P}(\theta ) + \text{constant},

    where :math:`y` is the observable output of the hierarchical model whose
    distribution is parametrised by :math:`\psi`. The top level parameters
    :math:`\theta` determine the distribution of :math:`\psi`. We refer to
    :math:`\sum _{i=1}^n\log\mathbb{P}(y|\psi ^{(i)})\Big |_{y=D^{(i)}}`
    as the bottom-level log-likelihood, and to
    :math:`\sum _{i=1}^n \log \mathbb{P}(\psi ^{(i)} | \theta)` as the top-level
    log-likelihood.

    Hierarchical modelling of :math:`n` data sets :math:`D^{(1)},\ldots ,D^{(n)}`
    is the compromise between a pooled analysis, in which one set of model
    parameters :math:`\psi` is assumed to be suitable for all observations, and
    an unpooled analysis where an independent set of parameters
    :math:`\psi ^{(i)}` is inferred for each data set :math:`D^{(i)}`. In
    hierarchical models the :math:`\psi ^{(i)}` are assumed to
    follow a top level distribution parameterised by parameters :math:`\theta`.

    The :class:`HierarchicalLogPosterior` takes a list of bottom-level
    log-likelihoods of length :math:`n`, a list of top-level likelihoods of
    length :math:`b`, and a list of priors of length :math:`t`. The
    bottom-level log-likelihoods represent the likelihoods for the
    parameters :math:`\psi ^{(i)}` for each data set :math:`D^{(i)}`, where
    each parameter set is of size :math:`b`. The top-level log-likelihoods
    represent the likelihoods of the parameters :math:`\theta` for a given set
    of bottom parameters :math:`\psi ^{(1)}, \ldots , \psi ^{(n)}`. For each of
    the :math:`t` top-level parameters :math:`\theta`, a prior is passed to
    the hierarchical log-posterior class.

    The log-posterior can be evaluated by passing an array-like object with
    parameters (:math:`\theta, \psi ^{(1)}, \ldots , \psi ^{(n)}`) of length
    :math:`t + nb`. Parameters are sorted such that top-level parameters come
    before bottom-level parameters. The parameter order within the
    log-likelihoods is preserved, and parameters from different
    log-likelihoods are appended.

    Optionally, a subset of the parameters :math:`\psi` can be chosen to be
    pooled by passing a boolean array-like object ``non_hierarchical_params``
    of length :math:`b` to the HierarchicalLogPosterior. ``True`` indicates
    pooling. With :math:`p` pooled parameters the length of the parameters
    reduces to :math:`t + p + n(b-p)`. Pooled parameters count as top-level
    parameters, and therefore come before the remaining bottom-level parameters.

    Extends :class:`LogPDF`

    Parameters
    ----------
    bottom_log_likelihood : List of :class:`LogPDF` of length :math:`n`
        A list of log-likelihoods of the parameters :math:`\psi^{(i)}` for each
        data set :math:`D^{(1)}, \dots, D^{(n)}`. All log-likelihoods in the
        list live on the same :math:`b`-dimensional parameter space.
    top_log_likelihood : List of :class:`LogPrior` of length :math:`b`
        A list of log-likelihoods of the parameters :math:`\theta` for a given
        set of bottom-level parameters
        :math:`\psi ^{(1)}, \ldots , \psi ^{(n)}`. If :math:`p` of the
        :math:`b` bottom-level parameters are pooled, only :math:`b-p`
        top-level likelihoods can be passed.
    log_prior : List of :class:`LogPrior` of length :math:`t`
        A list of log-priors for the top-level parameters :math:`\theta`. If
        :math:`p` of the bottom-level parameters are pooled, :math:`t+p`
        log-priors need to be passed to the HierarchicalPosterior.
    non_hierarchical_params : Array-like boolean of length :math:`b`
        Mask which determines which of the :math:`b` parameters :math:`\psi`
        may be pooled.

    Example
    -------
    ::

        # Three schools problem: Assume data for the different schools has
        # already been integrated into three pints.SingleOutputProblem with
        # one parameter each for the mean score of the school.
        #
        # Bottom-level model is Normal distribution, where for illustration we
        # choose to pool the variance parameter. Top-level model is also a
        # Normal distribution.

        # Bottom-level log-likelihoods
        bottom_log_likelihood_school_one = pints.GaussianLogLikelihood(
            problem_school_one)
        bottom_log_likelihood_school_two = pints.GaussianLogLikelihood(
            problem_school_two)
        bottom_log_likelihood_school_three = pints.GaussianLogLikelihood(
            problem_school_three)

        # Top-level log-likelihoods
        top_log_likelihood = pints.GaussianLogPrior(
            mean=0, sd=1)  # Initial values have no effect

        # Log-priors
        bottom_variance_log_prior = pints.HalfCauchyLogPrior(
            location=0, scale=5)
        top_mean_log_prior = pints.GaussianLogPrior(
            mean=0, sd=5)
        top_variance_log_prior = pints.HalfCauchyLogPrior(
            location=0, scale=5)

        # Hierarchical log-posterior
        log_posterior = pints.HierarchicalLogPosterior(
            bottom_log_likelihood=[
                bottom_log_likelihood_school_one,
                bottom_log_likelihood_school_two,
                bottom_log_likelihood_school_three],
            top_log_likelihood=[
                top_log_likelihood],
            log_prior=[
                bottom_variance_log_prior,
                top_mean_log_prior,
                top_variance_log_prior],
            non_hierarchical_params=[
                False, True])  # Bottom-level mean: hierarchical; variance: pooled

        # Evaluation of the hierarchical log-posterior
        mock_top_mean = 2.5
        mock_top_variance = 4
        mock_bottom_variance = 1
        mock_mean_school_one = 1
        mock_mean_school_two = 2
        mock_mean_school_three = 3

        value = log_posterior([
            mock_top_mean,
            mock_top_variance,
            mock_bottom_variance,
            mock_mean_school_one,
            mock_mean_school_two,
            mock_mean_school_three])
    """

I'm grateful for any suggestions.

DavAug (Member, Author) commented Jul 2, 2020

I realised that for gradient-based sampling or optimisation, we would need to be able to compute the partial derivatives of the top-level likelihoods with respect to both of their arguments: the bottom-level parameters \psi and the top-level parameters \theta.

If we used LogPriors for the top-level likelihoods (as in the example above, evaluated at \psi), the priors would need an additional method that returns the partials with respect to their hyperparameters \theta.
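
For example, a sketch of such a method for a Gaussian top-level distribution; evaluateS1 mirrors pints' existing sensitivity interface, while evaluate_hyper_sensitivities is a hypothetical name for the new method:

import numpy as np
import pints


class GaussianTopLevel(pints.LogPrior):
    """Sketch: log N(psi | mean, sd^2) with partials w.r.t. psi and w.r.t.
    the hyperparameters (mean, sd)."""

    def __init__(self, mean, sd):
        self._mean = float(mean)
        self._sd = float(sd)

    def n_parameters(self):
        return 1

    def __call__(self, psi):
        z = (psi[0] - self._mean) / self._sd
        return -0.5 * z ** 2 - np.log(self._sd) - 0.5 * np.log(2 * np.pi)

    def evaluateS1(self, psi):
        # Existing pints interface: value and partial w.r.t. the input psi
        dpsi = -(psi[0] - self._mean) / self._sd ** 2
        return self(psi), np.array([dpsi])

    def evaluate_hyper_sensitivities(self, psi):
        # Hypothetical new method: partials w.r.t. the hyperparameters
        dmean = (psi[0] - self._mean) / self._sd ** 2
        dsd = ((psi[0] - self._mean) ** 2 - self._sd ** 2) / self._sd ** 3
        return np.array([dmean, dsd])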

@DavAug DavAug mentioned this issue Jan 14, 2021
7 tasks
@pints-team pints-team locked and limited conversation to collaborators Jun 23, 2024
@MichaelClerx MichaelClerx converted this issue into discussion #1655 Jun 23, 2024

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →
