# Conditional prior demonstration

Conditional priors enable inference to be performed with priors that correlate different parameters.
In this notebook, we demonstrate two uses of this: maintaining a two-dimensional distribution while changing the parameterization to be more efficient for sampling, and enforcing an ordering between parameters.

Many cases where `Conditional` priors are useful can also be expressed with `Constraint` priors; however, the conditional approach can improve sampling efficiency.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from bilby.core.prior import (
    Prior, PriorDict, ConditionalPriorDict,
    Uniform, ConditionalUniform, Constraint, 
)
from corner import corner
from scipy.stats import semicircular


%matplotlib inline

### Sampling from a disc

Our first example is sampling uniformly from a disc

$$p(x, y) = \frac{1}{\pi}; \quad x^2 + y^2 \leq 1.$$

A naive implementation would define a uniform prior over the square with `x` and `y` each in `[-1, 1]` and then reject points that don't satisfy the radius constraint.
Sampling from this parameterization has an efficiency of $\pi / 4$ (about 79%), the ratio of the area of the disc to the area of the square.
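
The $\pi / 4$ figure can be checked numerically. The following is a minimal sketch in plain Python, independent of `Bilby`; the function name `rejection_efficiency` and its arguments are illustrative assumptions, not part of any library.

```python
import math
import random


def rejection_efficiency(n_draws=100_000, seed=0):
    """Estimate the fraction of uniform draws on the square that land in the unit disc."""
    rng = random.Random(seed)
    accepted = 0
    for _ in range(n_draws):
        x = rng.uniform(-1, 1)
        y = rng.uniform(-1, 1)
        # Keep the point only if it satisfies the radius constraint.
        if x * x + y * y <= 1:
            accepted += 1
    return accepted / n_draws


efficiency = rejection_efficiency()
print(f"estimated: {efficiency:.3f}, expected: {math.pi / 4:.3f}")
```

Roughly one draw in five is discarded, which is the cost the conditional parameterization avoids.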

If we instead consider the marginal distribution $p(x)$ and conditional distribution $p(y | x)$, we can achieve a sampling efficiency of 100%.

$$
p(x) = \int_{-\sqrt{1 - x^2}}^{\sqrt{1 - x^2}} dy \, p(x, y) = \frac{2 \sqrt{1 - x^2}}{\pi} \\
p(y | x) = \frac{1}{2 \sqrt{1 - x^2}}
$$

The marginal distribution for $x$ is the [Wigner semicircle distribution](https://en.wikipedia.org/wiki/Wigner_semicircle_distribution). This distribution is not currently defined in `Bilby`, but we can wrap the `scipy` implementation.
The conditional distribution for $y$ is implemented in `Bilby` as `ConditionalUniform`; we just need to define the condition function.
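
As a sanity check on this factorization, the sketch below (plain Python, no `Bilby` or `scipy` required) confirms that a semicircle marginal $2\sqrt{1 - x^2}/\pi$ multiplied by a uniform conditional of width $2\sqrt{1 - x^2}$ recovers the joint density $1/\pi$, and that the marginal integrates to one. The helper names `marginal_x` and `conditional_y` are hypothetical and used only for illustration.

```python
import math


def marginal_x(x):
    """Semicircle density on [-1, 1] for a unit-radius disc."""
    return 2 * math.sqrt(1 - x * x) / math.pi


def conditional_y(y, x):
    """Uniform density for y on [-sqrt(1 - x^2), sqrt(1 - x^2)]."""
    half_width = math.sqrt(1 - x * x)
    return 1 / (2 * half_width) if abs(y) <= half_width else 0.0


# The product of marginal and conditional should equal the joint density 1 / pi
# at any point inside the disc.
x, y = 0.3, -0.5
joint = marginal_x(x) * conditional_y(y, x)

# Midpoint-rule integral of the marginal over [-1, 1]; should be close to 1.
n = 100_000
dx = 2 / n
norm = sum(marginal_x(-1 + (i + 0.5) * dx) for i in range(n)) * dx

print(joint, 1 / math.pi, norm)
```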

In [None]:
class SemiCircular(Prior):

    def __init__(self, radius=1, center=0, name=None, latex_label=None, unit=None, boundary=None):
        super(SemiCircular, self).__init__(
            minimum=center - radius,
            maximum=center + radius,
            name=name,
            latex_label=latex_label,
            unit=unit,
            boundary=boundary,
        )
        self.radius = radius
        self.center = center
        self._dist = semicircular(loc=center, scale=radius)

    def prob(self, val):
        return self._dist.pdf(val)

    def ln_prob(self, val):
        return self._dist.logpdf(val)

    def cdf(self, val):
        return self._dist.cdf(val)

    def rescale(self, val):
        return self._dist.ppf(val)


def conditional_func_y(reference_parameters, x):
    # Half-width of the allowed y range at this x; "maximum" acts as the disc radius.
    condition = np.sqrt(reference_parameters["maximum"] ** 2 - x**2)
    return dict(minimum=-condition, maximum=condition)

#### Sample from the distribution

To demonstrate that the two methods are equivalent, we draw samples from the distribution with each and verify that they agree.

In [None]:
N = int(2e4)

CORNER_KWARGS = dict(
    plot_contours=False,
    plot_density=False,
    fill_contours=False,
    max_n_ticks=3,
    verbose=False,
    use_math_text=True,
)


def convert_to_radial(parameters):
    p = parameters.copy()
    # Radial distance from the origin; the Constraint on r keeps points inside the unit disc.
    p['r'] = np.sqrt(p['x']**2 + p['y']**2)
    return p

def sample_circle_with_constraint():
    d = PriorDict(
            dictionary=dict(
                x=Uniform(-1, 1),
                y=Uniform(-1, 1),
                r=Constraint(0, 1),
            ),
            conversion_function=convert_to_radial
        )
    return pd.DataFrame(d.sample(N))


def sample_circle_with_conditional():
    d = ConditionalPriorDict(
            dictionary=dict(
                x=SemiCircular(),
                y=ConditionalUniform(
                    condition_func=conditional_func_y, 
                    minimum=-1, maximum=1
                )
            )
        )
    return pd.DataFrame(d.sample(N))


s1 = sample_circle_with_constraint()
s2 = sample_circle_with_conditional()
fig = corner(s1.values, **CORNER_KWARGS, color="tab:blue", labels=["$x$", "$y$"])
corner(s2.values, **CORNER_KWARGS, color="tab:green", fig=fig)
plt.show()
plt.close()

### Sampling from ordered distributions

As our second example, we demonstrate defining a prior distribution over a set of strictly ordered parameters.

We note that in this case we do not require the marginal distributions of the parameters to be independently and identically distributed, although this can be remedied fairly simply.
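
To illustrate the idea before turning to `Bilby`, here is a minimal plain-Python sketch of the same construction: the first parameter is drawn from a uniform distribution on `[0, 1]`, and each later parameter is drawn uniformly between `0` and the previous value. The function name `sample_ordered` is a hypothetical helper, not a `Bilby` API. The sample means show that the marginals differ from one another (roughly 1/2, 1/4, 1/8).

```python
import random


def sample_ordered(n_params=3, rng=None):
    """Draw one ordered set of values, each bounded above by the previous draw."""
    rng = rng or random.Random()
    values = [rng.uniform(0, 1)]
    for _ in range(1, n_params):
        # The upper bound of each draw is the previous sample value.
        values.append(rng.uniform(0, values[-1]))
    return values


rng = random.Random(0)
draws = [sample_ordered(3, rng) for _ in range(50_000)]

# Mean of each parameter across all draws.
means = [sum(d[i] for d in draws) / len(draws) for i in range(3)]
print(means)
```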

In [None]:
class BoundedUniform(ConditionalUniform):
    """Conditional Uniform prior where prior sample < previous prior sample
    
    This is ensured by fixing the maximum bound to be the previous prior sample value.
    """
    def __init__(self, idx: int, minimum, maximum, name=None, latex_label=None,
                 unit=None, boundary=None):
        super(BoundedUniform, self).__init__(
            minimum=minimum, maximum=maximum, name=name, 
            latex_label=latex_label, unit=unit,
            boundary=boundary, condition_func=self.bounds_condition
        )
        self.idx = idx
        self.previous_name = f"{name[:-1]}{self.idx - 1}"
        self._required_variables = [self.previous_name] 
        # this is used in prior.sample(... **required_variables)


    def bounds_condition(self, reference_params, **required_variables):
        previous_sample = required_variables[self.previous_name]
        return dict(maximum=previous_sample)


def make_uniform_conditonal_priordict(n_priors=3):
    priors = ConditionalPriorDict()
    for i in range(n_priors):
        if i==0:
            priors[f"uni{i}"] = Uniform(minimum=0, maximum=1, name=f"uni{i}")
        else:
            priors[f"uni{i}"] = BoundedUniform(idx=i, minimum=0, maximum=1, name=f"uni{i}")
    return priors


In [None]:
samples = pd.DataFrame(make_uniform_conditonal_priordict(3).sample(10000))
fig = corner(samples.values, **CORNER_KWARGS, color="tab:blue", labels=[f"$A_{ii}$" for ii in range(3)])
plt.show()
plt.close()