
VarInfo Goals #7

Closed
cpfiffer opened this issue Aug 14, 2019 · 3 comments

@cpfiffer (Member)

This issue contains my thoughts and comments from working with VarInfo a lot more during the course of TuringLang/Turing.jl#793. My experience is that VarInfo is reasonably easy to use once you get over the very steep learning curve, but that learning curve can be a powerful deterrent to outside contributors.

I make some strong statements here to encourage discussion. I'm not trying to bash anyone's superb contributions (particularly @mohamed82008's great work on VarInfo), I just want to see if I can provoke some high-level thinking about VarInfo without thinking too much about what it is right now. Try to keep the context of this discussion about what VarInfo could be and not what it is now.

I don't want to see VarInfo anywhere

I think we see VarInfo too much. If I'm a non-Turing person and I'm building some kind of inference tool, I don't want to learn about our arcane system for managing variables. I just want to manipulate parameters, draw from priors, etc. Many of our functions should probably never have a do_something(vi, spl) signature -- we should find ways to handle everything on the back end without anyone worrying about how to use VarInfo. A better way would be to have the VarInfo stored somewhere in a shared state or tied to a specific model.

I can imagine a case where VarInfo is stored in some environment variable or state variable or something, and the sampler or model might just have a location to go look at where the VarInfo is. Then you could just call logp(model) and by default it would calculate the log probability using whatever the current state is. If you really wanted to, you could pass in a VarInfo and work with a specific one if you're doing a lot of numerical work and such, but I think for almost all cases VarInfo could sit far away and never be thought about.
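Something like the following toy sketch conveys the shape of what I mean; every name here (`ToyModel`, `ToyState`, `logp`) is made up for illustration and is not existing Turing API:

```julia
# Toy sketch only; all names are hypothetical, not existing Turing API.
# The model owns its state, so callers never touch it unless they want to.
struct ToyState
    logp::Float64            # stand-in for VarInfo's real bookkeeping
end

struct ToyModel{F}
    f::F                     # the model function
    state::ToyState          # VarInfo-like state, stored out of sight
end

# Default method: use whatever the current state is.
logp(model::ToyModel) = logp(model, model.state)

# Escape hatch: operate on an explicit state for heavy numerical work.
logp(model::ToyModel, vi::ToyState) = vi.logp

m = ToyModel(identity, ToyState(-3.2))
logp(m)   # -3.2, no VarInfo in sight
```

The point is that the one-argument method is the default and the explicit-state method is the escape hatch, not the other way around.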

An alternative fix would be to have a very small handful of functions that are dead simple to use and understand (a rough sketch follows the list below). See TuringLang/Turing.jl#886 for a better discussion.

- `update!(vi, new_vals)` should update the parameters.
- `parameters(vi)` should get the current parameterization in a `NamedTuple` or `Dict` format.
- `logp(vi, model)` should give you a log probability, no questions asked and no hassle.
- `priors(vi)` should give you a `NamedTuple` or `Dict` of prior distributions to draw from.
- If I want to change my priors, we should have a way to do that too: `priors!(vi, new_priors)` should set my priors to whatever the new distributions are.
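Here is one hedged sketch of that surface, with a toy VarInfo standing in so the example runs on its own; the real signatures would certainly differ:

```julia
# Toy sketch of the five-function surface proposed above. `ToyVarInfo`
# and all method bodies are invented for illustration.
using Distributions

mutable struct ToyVarInfo
    values::Dict{Symbol,Float64}
    priors::Dict{Symbol,Distribution}
end

parameters(vi::ToyVarInfo) = NamedTuple(vi.values)
priors(vi::ToyVarInfo) = NamedTuple(vi.priors)
update!(vi::ToyVarInfo, new_vals) = merge!(vi.values, Dict(pairs(new_vals)))
priors!(vi::ToyVarInfo, new_priors) = merge!(vi.priors, Dict(pairs(new_priors)))

# In reality `logp(vi, model)` would delegate to the model; here the
# "model" is just the prior log density, to keep the sketch self-contained.
logp(vi::ToyVarInfo, model) =
    sum(logpdf(vi.priors[k], v) for (k, v) in vi.values)

vi = ToyVarInfo(Dict(:m => 0.5), Dict{Symbol,Distribution}(:m => Normal(0, 1)))
update!(vi, (m = 1.2,))
parameters(vi)     # (m = 1.2,)
logp(vi, nothing)  # log N(1.2; 0, 1)
```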

VarInfo is ultimately my biggest issue with Turing's internals. I understand why we need it, and it is a masterful piece of engineering, but from a usability standpoint it is a disaster, particularly if our goal is to make life easy for inference designers.

If you asked me right now how to do something with a VarInfo, chances are very good it would take me more than an hour to work out what VarInfo is, what it does, and where in the source code I might find an answer. Add another half hour because whatever I thought VarInfo was will turn out not to be true.

Where should VarInfo live?

I'm not sure where the VarInfo should go. I don't think it should be a free-floating entity like it has been in Turing's past, and I'm also not convinced that its attachment to the sampler state as in TuringLang/Turing.jl#793 is correct either.

Is VarInfo more a function of the model, or of the sampler? If it's more specific to the model, shouldn't we store it there? I don't really know. If it lives in the model, it becomes quite nice to use for non-MCMC methods, since nobody would have to add VarInfo to their method; they could just use the model's copy. Ultimately the VarInfo is constructed from the model, and the samplers just reference it. Right now I'm leaning towards moving VarInfo over to the model, but I'm open to discussion on that.

A downside to putting it on the model side is that it becomes harder to build new modeling tools on top of Turing, but easier to build inference methods. I think it's a trade-off that's worth considering.

Removing the Sampler.info field

Build a `VarInfoFlags` struct that handles all the various switches and gizmos that VarInfo uses. Currently, every `Sampler` carries a dictionary called `info`, which will no longer be used on the inference side after TuringLang/Turing.jl#793. It would be nice to remove that field entirely and separate the VarInfo flags from the `Sampler`, either by storing the flags in the VarInfo itself or, at minimum, by replacing the dictionary with typed flags stored alongside the sampler.
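For concreteness, here is a speculative sketch of what such a struct could look like; the field names are invented for illustration and not taken from the current codebase:

```julia
# Speculative sketch of a `VarInfoFlags` struct; field names are invented.
# A concrete struct gives type-stable, documented access, unlike an
# untyped `info::Dict{Symbol, Any}`.
Base.@kwdef struct VarInfoFlags
    linked::Bool = false   # variables transformed to unconstrained space?
    del::Bool = false      # values marked for resampling?
end

flags = VarInfoFlags(linked = true)
flags.linked   # true; a plain field lookup instead of a Dict lookup
```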

This is really more mechanical than goal-oriented, and it's just something I or someone else might need to apply some elbow grease to.

@cpfiffer (Member, Author)

I would also like to add a flag recording whether a given combination of `vi` and `spl` is linked, so we don't have to run a loop over every variable to check each time.
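A toy illustration of what I mean, with invented names: link status is cached once per sampler id instead of recomputed by scanning variables.

```julia
# Toy sketch only; `ToyVI`, `islinked`, and `link!` are hypothetical names.
mutable struct ToyVI
    linked::Dict{Int,Bool}   # sampler id => linked?
end

islinked(vi::ToyVI, spl_id::Int) = get(vi.linked, spl_id, false)
link!(vi::ToyVI, spl_id::Int) = (vi.linked[spl_id] = true; vi)

vi = ToyVI(Dict{Int,Bool}())
islinked(vi, 1)   # false: an O(1) lookup, not a per-variable loop
link!(vi, 1)
islinked(vi, 1)   # true
```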

@mohamed82008 (Member)

Thanks for this write-up @cpfiffer and sorry for not getting back to you earlier. I will try to address your comments in TuringLang/Turing.jl#965.

@phipsgabler (Member)

Closing in favour of AbstractPPL discussion.
