Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tests for adding posterior_predictive without trace #823

Merged
merged 1 commit into from Oct 21, 2019

Conversation

OriolAbril
Copy link
Member

@OriolAbril OriolAbril commented Sep 17, 2019

Fixes #822 and adds a test to check behaviour of from_pymc3 when trace is None.

EDIT: Only adds the test as a follow-up to #833.

@rpgoldman
Copy link
Contributor

rpgoldman commented Sep 18, 2019

I think this may still have issues. I tried using it and then converting a prior trace, and I get an error where arviz tries to have the dimensions of a prior trace be chains and draws, but the only actual dimensions in the trace are the draws and the dimension of the variable in question.

I have looked and the problem seems to arise in the call to numpy_to_data_array(), which is in data/base.py, and automatically adds chains and draws to the dimensions of every variable.

The variable in question is a predictor (a function of an input variable), so is not replicated through the trace.

Here is the line of code that defines the variable whose dimensions cause the error:

ztemp = Deterministic("ztemp", inc_temp - 30)

The variable inc_temp is defined as:

inc_temp = Data("inc_temp", self.df["inc_temp"].to_numpy())

... so you can see that it does not vary over the course of the trace. I suppose I could suppress this variable, and just make it a SharedVariable, but that seems undesirable.

I'm not certain why these variables did not cause me problems when they appeared in the posterior trace.

@OriolAbril
Copy link
Member Author

@rpgoldman I fear I do not know pymc3 well enough to follow you. Am I right to assume that the issue is that either inc_temp or z_temp are present in the prior dict resulting from pm.sample_prior_predictive? And that then ArviZ tries to assign them draw amd chain dimensions they do not have?

@rpgoldman
Copy link
Contributor

@OriolAbril I'm not sure if this is a problem with Arviz, or PyMC3.

The issue is that my prior trace contains within it some sampled values, with shape (500, x) for x the shape of the sampled variable.

The prior trace also contains some un-sampled predictor variables, with shape x, where as above x is the shape of the sampled variable.

I don't know whether this is a bug in PyMC3, or in the way I am using PyMC3, but it causes from_pymc3() to error out.

@ahartikainen
Copy link
Contributor

I think we should somehow "ignore" errors due to edgecases and maybe throw a warning?

@rpgoldman
Copy link
Contributor

rpgoldman commented Sep 18, 2019

@ahartikainen @OriolAbril I just posted something on the pymc3 slack channel about this, because I'm not sure what is going wrong, and if it is my fault. Having a deterministic function of a predictor variable in a pymc3 model seems uncertain in its effect.

Hoping someone there can shed some light on the subject.

@rpgoldman
Copy link
Contributor

Actually, this suggests a question about the general use of Arviz -- if predictors must be kept out of the trace objects, doesn't this hamper plotting with predictors and predictions?

@OriolAbril
Copy link
Member Author

Could this be related to pymc-devs/pymc#3588?

@rpgoldman
Copy link
Contributor

@rpgoldman I fear I do not know pymc3 well enough to follow you. Am I right to assume that the issue is that either inc_temp or z_temp are present in the prior dict resulting from pm.sample_prior_predictive? And that then ArviZ tries to assign them draw amd chain dimensions they do not have?

Yes, that is exactly what is happening.

@rpgoldman
Copy link
Contributor

Could this be related to pymc-devs/pymc3#3588?

Yes: the deterministic that is being included in the trace is not a pm.Data, but it is a simple function of a pm.Data.

pymc-devs/pymc#3588 suggests that it's not sufficient to omit pm.Data: we must also omit all functions of pm.Data alone. I.e., we should look at everything that has only pm.Data in its parents and omit those variables, as well as the pm.Data alone.

@ahartikainen
Copy link
Contributor

That sounds like a good solution.

@rpgoldman
Copy link
Contributor

rpgoldman commented Sep 18, 2019

It also sounds to me like I should have been using from_dict instead of from_pymc3, but that definitely didn't jump out at me as the right option. Seems likefrom_pymc3 should just farm out processing of posterior predictive and prior predictive to from_dict, and from_pymc3 should just merge the functionality of the MultiTrace translator with from_dict.

Does that make sense?

It also suggests that there should be a merge_inference_data() function or a merge() method defined on InferenceData so that one could, for example, load an InferenceData from netcdf and then add a prior trace. Maybe it should be merge_groups, instead.

@rpgoldman
Copy link
Contributor

Actually, this suggests a question about the general use of Arviz -- if predictors must be kept out of the trace objects, doesn't this hamper plotting with predictors and predictions?

Is this related to #313 ? It seems like InferenceData doesn't have any way of incorporating predictors, and that my attempt to do so using pm.Data and pm.Deterministic is what's causing this pain.

@ahartikainen
Copy link
Contributor

ahartikainen commented Sep 18, 2019

It also suggests that there should be a merge_inference_data() function or a merge() method defined on InferenceData so that one could, for example, load an InferenceData from netcdf and then add a prior trace. Maybe it should be merge_groups, instead.

See arviz.concat function that can join independent groups

https://github.com/arviz-devs/arviz/blob/master/arviz/data/inference_data.py

We need to add it to the docs

@rpgoldman
Copy link
Contributor

@ahartikainen Have a look at #825 ?

@OriolAbril
Copy link
Member Author

This does not fix #822 as it was already fixed in #833. I was going to close this PR but I realized that I had included a test to detect the issue in #822 so we can include this PR as a test for #833.

@OriolAbril OriolAbril changed the title Fix #822 Add tests for adding posterior_predictive without trace Oct 21, 2019
@OriolAbril
Copy link
Member Author

Ready to merge

@ahartikainen
Copy link
Contributor

LGTM

@ahartikainen ahartikainen merged commit 4815daa into arviz-devs:master Oct 21, 2019
@OriolAbril OriolAbril deleted the i822 branch October 21, 2019 21:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

from_pymc3 errors out when given no trace
3 participants