Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reassess MutableData and coords_mutable #6972

Closed
ricardoV94 opened this issue Oct 26, 2023 · 5 comments · Fixed by #7047
Closed

Reassess MutableData and coords_mutable #6972

ricardoV94 opened this issue Oct 26, 2023 · 5 comments · Fixed by #7047

Comments

@ricardoV94
Copy link
Member

ricardoV94 commented Oct 26, 2023

We could make all coords mutable by default and replace mutable lengths by constants before doing any operations on a Model object. Same for MutableData. We could have a helper function freeze_graph that does that.

Note we already do something similar for jax based sampling since shared variables aren't really supported in that backend

@jessegrabowski
Copy link
Member

jessegrabowski commented Oct 26, 2023

+1 to the 2nd idea. It seems like it should be possible to know which coords are mutable/constant based on the model? Consider this model:

import pymc as pm

coords = {'a':[0], 'b':[0]}
with pm.Model() as mod:
    x = pm.MutableData('x', [3], dims='a')
    y = pm.ConstantData('y', [1], dims='b')

Just by looking at mod.named_vars_to_dims, we can immediately infer that 'a' is associated with a SharedVariable while 'b' isn't. From here, it wouldn't be hard to have a helper function that checks all the dims and makes a replacement from a default of all constant -> some mutable or default all mutable -> some constant.

My point in all this is that the mutable/non-mutable distinction on coords is not one the user should have to engage with. If a container is declared as mutable, the coords should be mutable: either because they all are by default, or because we internally make it so. Declaring a pm.MutableData then being unable to set the data because you didn't use the (undocumented!!) coords_mutable is not a good user experience.

@ricardoV94
Copy link
Member Author

ricardoV94 commented Oct 26, 2023

I would go one step further and remove the Mutable/Constant data distinction as well

@ricardoV94 ricardoV94 changed the title Rename coords_mutable to mutable_coords Reassess MutableData and coords_mutable Oct 26, 2023
@jessegrabowski
Copy link
Member

My only objection to that is that if someday we allowed users to recycle compiled logp functions in pm.sample, it wouldn't be possible to change the data between runs. This is an oft-requested feature on the Discourse.

I also don't think an on-going discussion about removing the Mutable/Constant data distinction should be a blocker to a PR that removes the distinction for coords.

@ricardoV94
Copy link
Member Author

ricardoV94 commented Oct 26, 2023

We can allow advanced users to still compile functions with shared variables, but I don't think that has to map directly to the pm.Data API, since it hasn't to this point anyway.

@AlexAndorra
Copy link
Contributor

Completely agree with this too -- making coords mutable by default, inferring which ones are not under the hood, and removing the distinction between Mutable and Constant Data.
All this adds boilerplate to the workflow and tends to make user experience less friendly, especially for beginners

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants