Refactored continuous distributions to v4 (AsymmetricLaplace, HalfStudentT, ExGaussian, Interpolated) #4746

kc611 · 2021-06-06T19:00:46Z

Linked Issue: #4686

This PR refactors the following distributions:

AsymmetricLaplace
HalfStudentT
ExGaussian
Interpolated

(Random Note: It's strange how git is showing that many file changes in the test file pymc3/tests/test_distributions.py when only difference I made is of few lines ?)

pymc3/distributions/continuous.py

ricardoV94

Thanks for opening the PR! I marked a couple of things than need to be changed. Let me know if you have any questions :)

Also, we will need the test in test_distributions_random.py to be refactored / added, and any tests that may be marked as xfail because of these distributions anywhere else on the test module.

pymc3/tests/test_distributions.py

ricardoV94 · 2021-06-07T09:03:11Z

pymc3/tests/test_distributions.py


    @pytest.mark.xfail(condition=(aesara.config.floatX == "float32"), reason="Fails on float32")
-    @pytest.mark.xfail(reason="Distribution not refactored yet")
+    @pytest.mark.xfail(reason="Test not yet refactored for new version")


This will need to be refactored for this PR, otherwise we have no test that covers the logp method of the Interpolated, or do we?

Yea, I plan on doing exactly that in this PR. It seems some problem with the arguments not being passed correctly.

ricardoV94 · 2021-06-07T09:04:39Z

(Random Note: It's strange how git is showing that many file changes in the test file pymc3/tests/test_distributions.py when only difference I made is of few lines ?)

Could it be some auto-formatting on your end? I can try to check locally to see if it is just a GitHub confusion or if you actually indented those lines.

kc611 · 2021-06-07T09:58:06Z

Could it be some auto-formatting on your end? I can try to check locally to see if it is just a GitHub confusion or if you actually indented those lines.

Please do. The only changes it shows on my side are the few changed x_fail lines.

Edit: Never mind, Looks like VS Code doesn't take into account the indentation changes as "real" changes in git. Fixed it now.

pymc3/distributions/continuous.py

ricardoV94 · 2021-06-07T11:10:55Z

pymc3/distributions/continuous.py

+    def rng_fn(cls, rng, x, pdf, size=None):
+        interp = InterpolatedUnivariateSpline(x, pdf, k=1, ext="zeros")
+
+        Z = interp.integral(x[0], x[-1])


Inside rng_fn there shouldn't be any symbolic operations. I didn't realize you need Z here as well.

You can either use scipy equivalents or, if easier, have everything that's needed for both logp and the rng_fn defined once during the dist classmethod and passed to super as additional inputs (and expected in both the logp and rng_fn. This way, the rng_fn will receive the non-symbolic arguments as inputs during normal sampling, while the logp method will receive the symbolic ones.

Apologies if this sounds very confusing :D

I think I should probably refactor the test first, just so that I'll have an intuition of how this entire Interpolated thing should work.

Yeah, you can also check it out in V3, and see how it works there (and in the test). That may help, since this is a very non-standard one

pymc3/distributions/continuous.py

…dentT, ExGaussian, Interpolated)

michaelosthege

LGTM
@ricardoV94 you make the call

pymc3/tests/test_distributions_random.py

ricardoV94 · 2021-06-10T18:18:41Z

pymc3/tests/test_distributions_random.py

@@ -270,7 +270,6 @@ class TestWald(BaseTestCases.BaseTestCase):
    params = {"mu": 1.0, "lam": 1.0, "alpha": 0.0}


-@pytest.mark.xfail(reason="This distribution has not been refactored for v4")
 class TestAsymmetricLaplace(BaseTestCases.BaseTestCase):


These tests that rely on BaseTestCases.BaseTestCase should be removed and replaced with tests using the new BaseTestDistribution

ricardoV94

Looks great so far. The random tests still need to be refactored to use the BaseTestDistribution

ricardoV94 · 2021-06-10T18:28:15Z

pymc3/tests/test_distributions_random.py

@@ -1497,7 +1494,6 @@ def ref_rand(size, mu, sigma):

        pymc3_random(pm.Moyal, {"mu": R, "sigma": Rplus}, ref_rand=ref_rand)

-    @pytest.mark.xfail(reason="This distribution has not been refactored for v4")
    @pytest.mark.xfail(condition=(aesara.config.floatX == "float32"), reason="Fails on float32")
    def test_interpolated(self):


This one should be moved to it's own class. The reason for this is that after refactoring all distributions, the TestScalarParameterSamples where this one is will disappear, but this distribution is special enough that the test should be kept (i.e., we are not simply drawing from scipy / numpy distributions).

I just moved it to it's own class and named it TestInterpolated

ricardoV94 · 2021-06-10T18:34:01Z

pymc3/tests/test_distributions_random.py

@@ -1257,7 +1256,6 @@ def ref_rand(size, mu, lam, alpha):
            ref_rand=ref_rand,
        )

-    @pytest.mark.xfail(reason="This distribution has not been refactored for v4")


These tests that are using a ref_rand should be removed as well, unless there is a good reason to test them against a reference. If we are just copying the same scipy/numpy expression we use in our random Ops the default tests in BaseTestDistribution should be used instead.

ricardoV94

Looks good. The last thing I would suggest is to add a BaseTestDistribution for the Interpolated just with the check_rv_size test enabled. Just to make sure size is working properly.

Otherwise, I think this PR is pretty much spot on. Specially wth the thorny Interpolated.

kc611 · 2021-06-11T08:43:19Z

Thanks @ricardoV94 for all your suggestions, (and especially for being patient with me in the start of this PR with all those silly mistakes. I guess my coffee didn't work quite right that particular day.)

Edit: Also (just a heads up) I'll be working on refactoring KroneckerNormal from the multivariate distributions next.

ricardoV94 · 2021-06-12T16:47:55Z

I am afraid the Interpolated is still problematic. Right now it works fine if you pass non-symbolic arguments, but it fails if the inputs are symbolic. This snippet raises an enigmatic ValueError from within the scipy.interpolate method:

x_points = np.linspace(0, 10)
pdf_points = st.norm.pdf(x_points, loc=5, scale=0.1)
ae_x_points = at.as_tensor_variable(x_points)
ae_pdf_points = at.as_tensor_variable(pdf_points)
interp = pm.Interpolated.dist(ae_x_points, ae_pdf_points, size=(2, 4))
# ValueError: diff requires input that is at least one dimensional

V3 also had the same limitation.

It seems a proper fix would require integrating the non-symbolic InterpolatedUnivariateSpline inside the symbolic SplineWrapper (or as a separate symbolic Op). The logp method making use of x_points.data is also quite hackish. For instance, it wouldn't work with a generic at.arange input.

If we can't be bothered at the moment, perhaps we should at least mention this limitation explicitly in the docstring and add an informative TypeError in dist if the input is symbolic. What do you think?

Edit: I see this line is in the docstrings:

Both parameters ``x_points`` and values ``pdf_points`` are not variables, but
plain array-like objects, so they are constant and cannot be sampled.

ricardoV94 · 2021-06-12T17:31:44Z

I pushed some changes that hopefully clarify a bit how the Interpolated is expected to work internally, ~~and also placed the contents of the _argcdf method directly inside the rng_fn.~~

I am okay with merging as is, and perhaps open a GH issue to discuss whether we want to make this distribution accept symbolic elements in the future.

ricardoV94

Great work @kc611 and thanks for your patience. Let me know if you also agree with merging it, given my last comments.

kc611 · 2021-06-13T05:25:52Z

Sure, you can go ahead and merge this, and yes, we should probably open an issue for making a proper fix for integrating the non-symbolic InterpolatedUnivariateSpline inside the symbolic SplineWrapper (or as a separate symbolic Op). Otherwise we'll be stuck with this hackish approach for Interpolated which I'm sure we don't want for v4.

ricardoV94 · 2021-06-13T05:29:01Z

Sure, you can go ahead and merge this, and yes, we should probably open an issue for making a proper fix for integrating the non-symbolic InterpolatedUnivariateSpline inside the symbolic SplineWrapper (or as a separate symbolic Op). Otherwise we'll be stuck with this hackish approach for Interpolated which I'm sure we don't want for v4.

Do you mind creating that issue?

I'll wait until Monday to merge the PR in case anyone else wants to have a look.

ricardoV94 reviewed Jun 6, 2021

View reviewed changes

pymc3/distributions/continuous.py Outdated Show resolved Hide resolved

ricardoV94 reviewed Jun 6, 2021

View reviewed changes

pymc3/distributions/continuous.py Outdated Show resolved Hide resolved

ricardoV94 reviewed Jun 6, 2021

View reviewed changes

pymc3/distributions/continuous.py Outdated Show resolved Hide resolved

ricardoV94 reviewed Jun 6, 2021

View reviewed changes

pymc3/distributions/continuous.py Outdated Show resolved Hide resolved

ricardoV94 reviewed Jun 6, 2021

View reviewed changes

pymc3/distributions/continuous.py Outdated Show resolved Hide resolved

ricardoV94 reviewed Jun 6, 2021

View reviewed changes

pymc3/distributions/continuous.py Outdated Show resolved Hide resolved

ricardoV94 requested changes Jun 6, 2021

View reviewed changes

michaelosthege added this to the vNext (4.0.0) milestone Jun 6, 2021

ricardoV94 mentioned this pull request Jun 6, 2021

Port remaining distributions to v4 #4686

Closed

26 tasks

ricardoV94 reviewed Jun 7, 2021

View reviewed changes

pymc3/tests/test_distributions.py Outdated Show resolved Hide resolved

ricardoV94 reviewed Jun 7, 2021

View reviewed changes

kc611 commented Jun 7, 2021

View reviewed changes

pymc3/distributions/continuous.py Outdated Show resolved Hide resolved

ricardoV94 reviewed Jun 7, 2021

View reviewed changes

pymc3/distributions/continuous.py Outdated Show resolved Hide resolved

kc611 marked this pull request as draft June 7, 2021 10:06

ricardoV94 reviewed Jun 7, 2021

View reviewed changes

pymc3/distributions/continuous.py Outdated Show resolved Hide resolved

ricardoV94 reviewed Jun 7, 2021

View reviewed changes

pymc3/distributions/continuous.py Outdated Show resolved Hide resolved

ricardoV94 reviewed Jun 7, 2021

View reviewed changes

pymc3/distributions/continuous.py Outdated Show resolved Hide resolved

ricardoV94 reviewed Jun 7, 2021

View reviewed changes

pymc3/distributions/continuous.py Outdated Show resolved Hide resolved

ricardoV94 reviewed Jun 7, 2021

View reviewed changes

pymc3/distributions/continuous.py Outdated Show resolved Hide resolved

ricardoV94 reviewed Jun 7, 2021

View reviewed changes

pymc3/distributions/continuous.py Show resolved Hide resolved

kc611 force-pushed the add_dist branch 2 times, most recently from 293900a to cf99a91 Compare June 7, 2021 10:51

ricardoV94 reviewed Jun 7, 2021

View reviewed changes

pymc3/distributions/continuous.py Outdated Show resolved Hide resolved

ricardoV94 reviewed Jun 7, 2021

View reviewed changes

pymc3/distributions/continuous.py Outdated Show resolved Hide resolved

ricardoV94 reviewed Jun 7, 2021

View reviewed changes

kc611 commented Jun 8, 2021

View reviewed changes

pymc3/distributions/continuous.py Show resolved Hide resolved

kc611 requested a review from ricardoV94 June 9, 2021 15:20

kc611 marked this pull request as ready for review June 10, 2021 11:12

kc611 changed the title ~~Refactored remaining continuous distributions to v4 (AsymmetricLaplace, HalfStudentT, ExGaussian, Interpolated)~~ Refactored continuous distributions to v4 (AsymmetricLaplace, HalfStudentT, ExGaussian, Interpolated) Jun 10, 2021

kc611 force-pushed the add_dist branch from 74380f1 to 1785959 Compare June 10, 2021 11:45

Refactored continuous distributions to v4 (AsymmetricLaplace, HalfStu…

bfc384b

…dentT, ExGaussian, Interpolated)

kc611 force-pushed the add_dist branch from 1785959 to bfc384b Compare June 10, 2021 12:10

michaelosthege approved these changes Jun 10, 2021

View reviewed changes

ricardoV94 reviewed Jun 10, 2021

View reviewed changes

pymc3/tests/test_distributions_random.py Outdated Show resolved Hide resolved

ricardoV94 reviewed Jun 10, 2021

View reviewed changes

ricardoV94 requested changes Jun 10, 2021

View reviewed changes

ricardoV94 reviewed Jun 10, 2021

View reviewed changes

Refactored tests for new distributions

7e34e86

ricardoV94 reviewed Jun 11, 2021

View reviewed changes

kc611 added 2 commits June 11, 2021 04:37

Added RV size testing for Interpolated

53e4e0e

Removed unused import

c421634

ricardoV94 approved these changes Jun 12, 2021

View reviewed changes

ricardoV94 force-pushed the add_dist branch from 3358b34 to 37f3ef1 Compare June 12, 2021 17:36

Convert to constant and add comments

e684ede

ricardoV94 force-pushed the add_dist branch from 37f3ef1 to e684ede Compare June 12, 2021 17:45

kc611 mentioned this pull request Jun 13, 2021

Enable pm.Interpolated to accept symbolic inputs #4767

Open

ricardoV94 merged commit 02a973b into pymc-devs:main Jun 14, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactored continuous distributions to v4 (AsymmetricLaplace, HalfStudentT, ExGaussian, Interpolated) #4746

Refactored continuous distributions to v4 (AsymmetricLaplace, HalfStudentT, ExGaussian, Interpolated) #4746

kc611 commented Jun 6, 2021 •

edited

Loading

ricardoV94 left a comment •

edited

Loading

ricardoV94 Jun 7, 2021 •

edited

Loading

kc611 Jun 7, 2021

ricardoV94 commented Jun 7, 2021 •

edited

Loading

kc611 commented Jun 7, 2021 •

edited

Loading

ricardoV94 Jun 7, 2021

kc611 Jun 7, 2021

ricardoV94 Jun 7, 2021

michaelosthege left a comment

ricardoV94 Jun 10, 2021 •

edited

Loading

ricardoV94 left a comment •

edited

Loading

ricardoV94 Jun 10, 2021 •

edited

Loading

kc611 Jun 11, 2021

ricardoV94 Jun 10, 2021

ricardoV94 left a comment

kc611 commented Jun 11, 2021 •

edited

Loading

ricardoV94 commented Jun 12, 2021 •

edited

Loading

ricardoV94 commented Jun 12, 2021 •

edited

Loading

ricardoV94 left a comment

kc611 commented Jun 13, 2021

ricardoV94 commented Jun 13, 2021

Refactored continuous distributions to v4 (AsymmetricLaplace, HalfStudentT, ExGaussian, Interpolated) #4746

Refactored continuous distributions to v4 (AsymmetricLaplace, HalfStudentT, ExGaussian, Interpolated) #4746

Conversation

kc611 commented Jun 6, 2021 • edited Loading

ricardoV94 left a comment • edited Loading

Choose a reason for hiding this comment

ricardoV94 Jun 7, 2021 • edited Loading

Choose a reason for hiding this comment

kc611 Jun 7, 2021

Choose a reason for hiding this comment

ricardoV94 commented Jun 7, 2021 • edited Loading

kc611 commented Jun 7, 2021 • edited Loading

ricardoV94 Jun 7, 2021

Choose a reason for hiding this comment

kc611 Jun 7, 2021

Choose a reason for hiding this comment

ricardoV94 Jun 7, 2021

Choose a reason for hiding this comment

michaelosthege left a comment

Choose a reason for hiding this comment

ricardoV94 Jun 10, 2021 • edited Loading

Choose a reason for hiding this comment

ricardoV94 left a comment • edited Loading

Choose a reason for hiding this comment

ricardoV94 Jun 10, 2021 • edited Loading

Choose a reason for hiding this comment

kc611 Jun 11, 2021

Choose a reason for hiding this comment

ricardoV94 Jun 10, 2021

Choose a reason for hiding this comment

ricardoV94 left a comment

Choose a reason for hiding this comment

kc611 commented Jun 11, 2021 • edited Loading

ricardoV94 commented Jun 12, 2021 • edited Loading

ricardoV94 commented Jun 12, 2021 • edited Loading

ricardoV94 left a comment

Choose a reason for hiding this comment

kc611 commented Jun 13, 2021

ricardoV94 commented Jun 13, 2021

kc611 commented Jun 6, 2021 •

edited

Loading

ricardoV94 left a comment •

edited

Loading

ricardoV94 Jun 7, 2021 •

edited

Loading

ricardoV94 commented Jun 7, 2021 •

edited

Loading

kc611 commented Jun 7, 2021 •

edited

Loading

ricardoV94 Jun 10, 2021 •

edited

Loading

ricardoV94 left a comment •

edited

Loading

ricardoV94 Jun 10, 2021 •

edited

Loading

kc611 commented Jun 11, 2021 •

edited

Loading

ricardoV94 commented Jun 12, 2021 •

edited

Loading

ricardoV94 commented Jun 12, 2021 •

edited

Loading