Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider changing rcParams default for log_likelihood to False #2155

Open
ricardoV94 opened this issue Nov 8, 2022 · 3 comments
Open

Consider changing rcParams default for log_likelihood to False #2155

ricardoV94 opened this issue Nov 8, 2022 · 3 comments

Comments

@ricardoV94
Copy link

ricardoV94 commented Nov 8, 2022

data.log_likelihood : true # save pointwise log likelihood values, one of "true", "false"

Paraphrasing from the PyMC issue: pymc-devs/pymc#6266

This evaluation can be quite expensive and require too much memory. I think it's safe to say most users don't make use of this information making it pretty wasteful.

Would require coordinating with other PPLs that use this variable to see if they agree and can help users make the transition.

CC @OriolAbril

@ilaiyengar
Copy link

Hi, I am a first time contributor! Would this be a good first issue that I can assist with?

@ahartikainen
Copy link
Contributor

Is pymc still using arviz converter?

@OriolAbril
Copy link
Member

Hi, I am a first time contributor! Would this be a good first issue that I can assist with?

@ilaiyengar no, sorry. This issue still needs discussion and there is no clear fix yet. You should take a look at Beginner or let us know the areas within ArviZ you are interested in on gitter so we can find something that fits your interests.

Is pymc still using arviz converter?

The converter to InferenceData for pymc v4 is now in pymc codebase, but for now it maintains compatibility with ArviZ, uses it's rcParams, uses dict_to_dataset...


I have checked the converters and the data.log_likelihood rcParam is used by pyro, numpyro (and pymc) to compute the pointwise log likelihood values and by stan converters to retrieve log_lik variable into the log likelihood group if present. I generally use the log likelihood values but I have run into memory errors a few times after forgetting to set the value to false.

When that happens I generally need to exit the python interpreter and end up needing to resample. Therefore, given it is an rcParam that can be set again to true on a user or project basis, I think it would make sense to set the default to false, but it will probably confuse users for a while when they try to call az.compare and it doesn't work out of the box anymore.

If set to false (or whatever we choose really) it might also make sense to split the rcParam into two in order to preserve Stan behaviour where true doesn't always compute and store pointwise log likelihood and it does only if the log_lik variable is present.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants