
Implement Truncated RVs #131

Open
wants to merge 2 commits into base: main

Conversation

ricardoV94
Contributor

@ricardoV94 ricardoV94 commented Apr 6, 2022

Closes #96

@ricardoV94 ricardoV94 force-pushed the truncated_rvs branch 6 times, most recently from d67ee37 to cc14e41 Compare April 6, 2022 16:08
@codecov

codecov bot commented Apr 6, 2022

Codecov Report

Base: 96.12% // Head: 96.23% // Increases project coverage by +0.11% 🎉

Coverage data is based on head (5ef727c) compared to base (7890228).
Patch coverage: 98.23% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #131      +/-   ##
==========================================
+ Coverage   96.12%   96.23%   +0.11%     
==========================================
  Files          12       13       +1     
  Lines        1987     2100     +113     
  Branches      241      253      +12     
==========================================
+ Hits         1910     2021     +111     
  Misses         39       39              
- Partials       38       40       +2     
Impacted Files Coverage Δ
aeppl/truncation.py 98.23% <98.23%> (ø)


☔ View full report at Codecov.

@ricardoV94 ricardoV94 force-pushed the truncated_rvs branch 2 times, most recently from b8f681a to 773dc50 Compare April 6, 2022 17:03
@brandonwillard brandonwillard added the enhancement (New feature or request), important (used to indicate priority), and op-probability (involves the implementation of log-probabilities for Aesara `Op`s) labels Apr 6, 2022
@ricardoV94 ricardoV94 force-pushed the truncated_rvs branch 2 times, most recently from 982e469 to 77343ca Compare April 11, 2022 14:38
@ricardoV94 ricardoV94 marked this pull request as ready for review April 12, 2022 06:22
@ricardoV94
Contributor Author

Okay, this seems to be working reasonably well. Ready for review

@ricardoV94 ricardoV94 force-pushed the truncated_rvs branch 4 times, most recently from ad22781 to 697af2f Compare April 12, 2022 17:43
Member

@brandonwillard brandonwillard left a comment


This looks great; it only needs to be refactored to use RandomStreams and return updates.

@ricardoV94
Contributor Author

Added RandomStream as a separate commit for review.

brandonwillard
brandonwillard previously approved these changes Apr 29, 2022
@ricardoV94 ricardoV94 marked this pull request as draft June 16, 2022 12:24
@rlouf
Member

rlouf commented Nov 2, 2022

AFAIR, truncation means that you cannot observe certain data, and that you know that before observation. Censoring is when we don't know the value of every data point, but only that it's greater (or smaller) than a given value.

With that in mind it seems to me that a natural implementation for truncation would be:

import aesara.tensor as at
from aesara.tensor.random.utils import RandomStream

srng = RandomStream(0)
x_rv = srng.normal(0, 1)
y = at.clip(x_rv, -1, 1)

Here we are specifying that we know that y can only take values between -1 and 1. To see this, it is enough to consider prior predictive sampling: y can only take values between -1 and 1. In the censored case, the prior predictive distribution should also return values that are out of bounds. I think the censored/truncated interface should be revamped as it currently doesn't reflect the mathematical meaning of these quantities.

at.clip returns its endpoints and doesn't reject the samples so that wouldn't work.

@ricardoV94
Contributor Author

ricardoV94 commented Nov 2, 2022

Truncation means you cannot observe a value in a certain interval, and you wouldn't know if it occurred. The generative graph involves rejection sampling (if you want a fixed data size) or subtensoring (if you don't care about a fixed data size).

Censoring means you cannot observe a value in a certain interval, but you know when that occurs. The generative process involves clipping or rounding (or any other encoding that loses resolution but not the number of datapoints).

There is a good pymc example that explains the difference: https://www.pymc.io/projects/examples/en/latest/generalized_linear_models/GLM-truncated-censored-regression.html

The Stan manual is also nice:
https://mc-stan.org/docs/stan-users-guide/truncated-data.html
https://mc-stan.org/docs/stan-users-guide/censored-data.html
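The two generative processes described above can be sketched outside of Aesara with plain NumPy (the helper names are illustrative, not part of this PR): truncation resamples until every value falls inside the bounds, while censoring keeps every draw but records out-of-bounds values as the bound itself.

```python
import numpy as np

def truncated_normal(rng, size, lower=-1.0, upper=1.0):
    # Rejection sampling: draws outside [lower, upper] are discarded and
    # redrawn, so the sample size stays fixed and every value lies inside.
    out = np.empty(size)
    filled = 0
    while filled < size:
        draws = rng.normal(0, 1, size=size)
        keep = draws[(draws >= lower) & (draws <= upper)]
        n = min(keep.size, size - filled)
        out[filled:filled + n] = keep[:n]
        filled += n
    return out

def censored_normal(rng, size, lower=-1.0, upper=1.0):
    # Clipping: every draw is kept, but values beyond a bound are recorded
    # as the bound itself, so probability mass piles up at the endpoints.
    return np.clip(rng.normal(0, 1, size=size), lower, upper)

rng = np.random.default_rng(0)
xt = truncated_normal(rng, 1000)
xc = censored_normal(rng, 1000)
print(np.all((xt >= -1) & (xt <= 1)))  # True: all samples inside the interval
```

Note that both keep the sample size fixed; the difference is that censoring loses resolution at the bounds while truncation removes the out-of-bounds region entirely.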

@rlouf
Member

rlouf commented Nov 2, 2022

Take prior predictive samples from the code example I shared: where do the values lie, and what does this model generate?

Truncation is about the generative model for your data, censoring is about how your data is observed.

@ricardoV94
Contributor Author

ricardoV94 commented Nov 2, 2022

Your example code generates censored data

@rlouf
Member

rlouf commented Nov 2, 2022

To generate censored data you generate from the full distribution (normal(0,1)) and then apply a >= threshold filter to your prior samples to mimic the observation process.

If that's not it, how do you call the process I just described?

@brandonwillard
Member

Are you guys essentially talking about how the endpoints of the clipped intervals are/should be handled/understood?

@rlouf
Member

rlouf commented Nov 2, 2022

Are you guys essentially talking about how the endpoints of the clipped intervals are/should be handled/understood?

Yes.

In this PR, are we talking about truncated random variables or random variables with a truncated distribution? There isn't a 1:1 map between these two concepts. We also call "truncated random variable" a variable $Y$ that is defined such that $Y=X$ if $X < c$ and $Y=c$ if $X \geq c$, where $c$ is a constant. This can be represented by at.clip, assuming in my example x_rv represents a random variable object (and not a distribution).

Regarding censoring, there's an implicit assumption about the meaning of the threshold values when using at.clip that is not clearly stated, and not completely obvious. Saying that the value is equal to the value of the threshold is not the same as saying that it is greater than the threshold, these two correspond to different types. This probably touches on broader design concerns wrt the representation of these objects in the library.

@brandonwillard
Member

brandonwillard commented Nov 2, 2022

There's an implicit assumption about the meaning of the bounds when using at.clip that is not clearly stated, and not completely obvious. Saying that the value is equal to the value of the threshold is not the same as saying that it is greater than the threshold, these two correspond to different types. This probably touches on a deeper problem with the representation of these objects in the library.

Yes, there are some fundamental representation issues involved with using existing Aesara Ops for these types of derived measures. Simply put, if samples from the original model graphs (e.g. that use clip or whatever else) don't match the measures/densities we assign them in AePPL, then we have a representation issue that needs to be fixed, and that might be the case with the representations of truncation and/or censoring used here. If so, the easy fix is to create graphs or Ops that explicitly represent truncation and/or censoring. As I recall, that's what was done here, though.

@ricardoV94
Contributor Author

ricardoV94 commented Nov 2, 2022

There's an implicit assumption about the meaning of the threshold values when using at.clip that is not clearly stated, and not completely obvious. Saying that the value is equal to the value of the threshold is not the same as saying that it is greater than the threshold, these two correspond to different types. This probably touches on broader design concerns wrt the representation of these objects in the library.

The logprob is (hopefully) correct for a data generative process that uses at.clip. This is somewhat analogous to a censoring process, but it's not the only way of obtaining censored data, and it is also perhaps not the most intuitive. In a recent PR I added censoring by rounding and took away the emphasis on clipping as censoring; I just called it a measurable clip or something.

Back to clip: some other packages use two vectors of data, one with the censored values and another with boolean/categorical values that indicate whether each value was censored or not. You could try to handle this in aeppl as well.

Those have slightly different meanings at the edges. In the clip case you assume edge values could only have come from censoring for continuous RVs (ignoring precision issues; it's a pdf vs a pmf), and a mixture for discrete RVs (IIRC?), whereas in the double-vector approach you don't have to make that assumption for continuous RVs, or count a mixture for discrete RVs, because the source of the value is disambiguated by the user. The double vector retains more information about the data generating process.
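A small NumPy sketch of the two encodings (variable names are illustrative): the clip encoding carries one vector in which a value equal to the bound is ambiguous, while the double-vector encoding carries the observed values plus an explicit flag that disambiguates them.

```python
import numpy as np

rng = np.random.default_rng(42)
upper = 1.0
latent = rng.normal(0, 1, size=8)  # the unobservable uncensored draws

# Clip encoding: a single vector; an observation exactly equal to the
# bound is *assumed* to have been censored.
clipped = np.clip(latent, None, upper)

# Double-vector encoding: observed values plus an explicit indicator,
# so a value equal to the bound stays unambiguous.
censored_flag = latent > upper
observed = np.where(censored_flag, upper, latent)

# Both encodings agree on the recorded values...
assert np.allclose(clipped, observed)
# ...but only the second says *which* points were actually censored.
print(censored_flag)
```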

Anyway, this PR was about truncation not censoring.

@rlouf
Member

rlouf commented Nov 2, 2022

Anyway, this PR was about truncation not censoring.

You're right; I will open a separate discussion for my remaining questions about censoring.

This is still relevant and important here:

In this PR, are we talking about truncated random variables or random variables with a truncated distribution? There isn't a 1:1 map between these two concepts. We also call "truncated random variable" a variable $Y$ that is defined such that $Y=X$ if $X < c$ and $Y=c$ if $X \geq c$, where $c$ is a constant. This can be represented by at.clip, assuming in my example x_rv represents a random variable object (and not a distribution).

@ricardoV94
Contributor Author

ricardoV94 commented Nov 3, 2022

Anyway, this PR was about truncation not censoring.
This is still relevant and important here:

In this PR, are we talking about truncated random variables or random variables with a truncated distribution? There isn't a 1:1 map between these two concepts. We also call "truncated random variable" a variable $Y$ that is defined such that $Y=X$ if $X < c$ and $Y=c$ if $X \geq c$, where $c$ is a constant. This can be represented by at.clip, assuming in my example x_rv represents a random variable object (and not a distribution).

I think that still refers to censoring in the literature, not truncation. It's just one-sided censoring instead of two-sided censoring. You can obtain those graphs with y = at.clip(x, -np.inf, c). That's considered in the clip_logprob to generate a more efficient expression. There is also an issue for arbitrary censoring via set_subtensor that you might be interested in: #177

In general, if you can lose information about the values of your data but not the size of your data, you are always talking about censoring.

@rlouf
Member

rlouf commented Nov 3, 2022

Does this mean this PR is about RVs with a truncated probability distribution, then?

@ricardoV94
Contributor Author

Does this mean this PR is about RVs with a truncated probability distribution, then?

I guess so. I am not sure what the difference is between RVs and distributions that you mentioned above. It's definitely about truncation, not censoring.

@rlouf
Member

rlouf commented Nov 3, 2022

Ok, that's all I needed to know, and it's fine as long as it is clearly stated. I believe there is a difference between RVs with a truncated distribution (the sample space is the truncated interval) and a truncated RV, which is the product of another RV (whose sample space can be the real line) with an indicator function. At the very least, if there is an equivalence, it is not obvious.

When the terminology in the literature is so confused, here between measure-theory people and people who only work with densities, it is important to be very specific about the relation between our representation and the different mathematical representations. This is not about splitting hairs for the sake of it, but rather about not ending up with the confusion and ambiguities that PPLs are usually riddled with.
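To make the "RV with a truncated distribution" reading concrete: its density is the base density renormalized over the truncation interval, i.e. $f(x) / (F(b) - F(a))$ on $[a, b]$ and zero outside. A minimal pure-Python sketch for a standard normal (the helper names are illustrative, not AePPL's API):

```python
import math

def norm_logpdf(x):
    # log density of the standard normal
    return -0.5 * x * x - 0.5 * math.log(2 * math.pi)

def norm_cdf(x):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def truncated_norm_logpdf(x, lower, upper):
    # Base log density minus the log of the probability mass inside the
    # interval; -inf outside, since the truncated RV never lands there.
    if x < lower or x > upper:
        return -math.inf
    return norm_logpdf(x) - math.log(norm_cdf(upper) - norm_cdf(lower))
```

The renormalization is what distinguishes this from a clip logprob: clipping leaves point masses at the bounds, while truncation rescales the density between them.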

@brandonwillard brandonwillard force-pushed the truncated_rvs branch 3 times, most recently from da00b21 to cee8573 Compare January 18, 2023 19:08
@rlouf
Member

rlouf commented Jan 20, 2023

As I expressed above, I have strong reservations when it comes to this PR and the meaning of the operator that is proposed. The problem can be seen clearly when comparing:

import aesara.tensor as at
from aesara.tensor.random.utils import RandomStream

srng = RandomStream()
x = at.random.uniform(0, 10, name="x", size=100)
xt, _ = truncate(x, lower=5, upper=15, srng=srng)

to:

import aesara.tensor as at
from aesara.tensor.random.utils import RandomStream

srng = RandomStream(0)
x = srng.uniform(0, 10)
xt = at.clip(x, 5, 15)

In the latter case, if we assume that $x$ represents a random variable the code can be correctly read as defining a new random variable $xt$ such that $xt = \operatorname{clip} \circ \operatorname{UniformRV}$. But if we admit this here, we also have to interpret the first snippet as defining a new random variable $xt$ such that $xt = \operatorname{truncate} \circ \operatorname{UniformRV}$. This is obviously not correct, and that is because $\operatorname{truncate}$ operates on measures (if we follow the standard definition).

I think that the abstractions that are present today in Aesara do not allow us to work on measure transformations since measures are not first-class citizens, but are present only implicitly.

Here is a modest proposal to alleviate these limitations. In this proposal we represent e.g. the measure with a standard normal density wrt the Lebesgue measure as

x_m = normal(0, 1)

the $\operatorname{truncate}$ operator acts on measures, so we can define the truncated measure as

x_t = truncate(normal(0, 1), lower=0)

to bridge the gap with the existing RandomVariable API in Aesara, we define the $\operatorname{sample}$ operator which takes a measure and returns an independent random variable with this measure each time it is called:

x_rv = sample(rng, normal(0, 1))

here we retain the existing ambiguity in Aesara between random variables and elements of $E$ (which does not matter mathematically). The RandomStream interface can be retained with

srng = at.random.RandomStream(0)
x_rv = srng.sample(normal(0, 1))

So the first code snippet above could be rewritten:

import aesara.tensor as at
import aesara.tensor.random as ar

srng = ar.RandomStream(0)
x_m = truncate(ar.uniform(0, 10), lower=5, upper=15)
xt = srng.sample(x_m)

This interface has a few nice incidental properties:

  • The size argument gets a new interpretation. Doing x_rv = srng.sample(normal(0, 1), size=(10,)) means that we are defining 10 independent random variables with the same associated measure normal(0, 1) (we should probably rename it to iid or something similar);
  • Provided we can make the RandomStream interface work this way, it would be much easier for one to add a custom measure and use it with this interface;
  • I have a feeling this would make it easy to go the "explicitly representing rng key advancement" way if we ever choose to.

I can write an extended API proposal if you think this makes sense, and try to work out the potential challenges linked to this representation of probabilistic objects in Aesara and AePPL.
