Add return values aux state to dynamic modeling lang #153
Awesome! @marcoct -- thoughts on supporting |
I've also needed an iterator over (key, value) pairs for a choicemap (#123), and similarly, I've been using a scrappy version of |
@marcoct, could you add a documentation section about the aux state? My understanding is that it's a key-value store with arbitrary keys and values, with the requirement that there are no key collisions with the choicemap. Is there a contract that the aux state should be reconstructible from the arguments and choicemap? I'm suggesting caution here because I think there is some risk of making API/GFI decisions that are based on assumptions specific to the built-in modelling language. |
@bzinberg I added some more documentation about the auxiliary state in one of the above commits. One commitment it makes is that the auxiliary state is a function of the arguments, random choices, and untraced randomness, and I don't know how to be less stringent with that. Another commitment is that it is addressed using the same address hierarchy as random choices. That's a choice, but it seems reasonable. It's not clear otherwise how to come up with a scheme for easily addressing into auxiliary state that is itself produced within an inner function call. Another commitment is that the addresses must be disjoint from the addresses of random choices. That's also a choice, made primarily so that the same nice syntax can be used to read both ( |
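To make the three commitments above concrete, here is a minimal toy sketch. It is written in Python rather than Julia and is not Gen's implementation or API: `ToyTrace`, `add_choice`, and `add_aux` are hypothetical names made up for illustration. It only shows one way that random choices and auxiliary state could share a single address hierarchy, be kept disjoint, and be read with the same indexing syntax.

```python
class ToyTrace:
    """Toy stand-in for a trace: random choices plus auxiliary state.

    Hypothetical illustration only; this is not Gen's trace type.
    Addresses are tuples, so ("bar", "x") is address "x" inside call "bar".
    """

    def __init__(self):
        self.choices = {}  # address -> value of a traced random choice
        self.aux = {}      # address -> auxiliary value (e.g. a call's return value)

    def _check_unused(self, addr):
        # Enforce that choice addresses and aux addresses never collide.
        if addr in self.choices or addr in self.aux:
            raise KeyError(f"address {addr!r} is already in use")

    def add_choice(self, addr, value):
        self._check_unused(addr)
        self.choices[addr] = value

    def add_aux(self, addr, value):
        self._check_unused(addr)
        self.aux[addr] = value

    def __getitem__(self, addr):
        # Because the address sets are disjoint, one lookup syntax serves both.
        if addr in self.choices:
            return self.choices[addr]
        return self.aux[addr]


trace = ToyTrace()
trace.add_choice(("bar", "x"), 0.3)  # a random choice made inside the call at "bar"
trace.add_aux(("bar",), 0.6)         # the return value of that call, stored as aux state
```

Under these assumptions, `trace[("bar", "x")]` reads a random choice and `trace[("bar",)]` reads auxiliary state, and an attempt to register auxiliary state at an address that already holds a choice raises a `KeyError`.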
In a sense, the values of all random choices are auxiliary data (whereas auxiliary data are not all random choices). So another option for documenting it would be to:
|
@alex-lew Yes, I agree. It seems somewhat unsatisfying to have to separately maintain these two different views and duplicate a lot of functionality like this. But we really do need to preserve just the choices separately. Maybe there is some refactoring that can be done to reduce logical duplication. |
@marcoct, I think one key difference between "choices" and other aux data is that scores (which I think have semantics of "approximation to the joint log probability of a portion of the trace") must depend only on the arguments and the choicemap, not on any other state including aux. Is that right? |
Also, small suggestion: I think we may want to tweak the syntax for accessing return values slightly from how it appears above: if we write
@gen function bar() ... end
@gen function foo()
    @trace(bar(), :bar)
end
trace = simulate(foo, ())
I think the API for accessing the return value of the call to |
Yes, the 'choice map' is definitely more related to the scores than the return value or the aux state that we're discussing. The various scores and weights are all functions of just arguments, 'choice maps', and untraced randomness, and not return value or aux state. (But technically everything including the auxiliary state and return value is a deterministic function of just those things -- and no other randomness/state). |
Interesting. I see the wrinkle. How about we use
I prefer that, since it avoids introducing another constant ( |
What's the definition of untraced/auxiliary randomness? Is it the case that:
|
Yes, where the untraced randomness comes from is not defined by the GFI. In the built-in modeling language it is any randomness consumed during either (i) a call to a regular Julia function that is not a generative function or a 'distribution' (so it can't be traced) but is non-deterministic, or (ii) a call to a non-deterministic generative function (not all are non-deterministic, but they usually are) or a non-deterministic 'distribution' (not all are non-deterministic, but they usually are). |
I think that would be nice. I think it is still an abuse of notation, since the ideal notation would be to denote it as |
If that's the case, then what does it mean that something "must be a function only of the arguments, the choicemap, and the untraced randomness"? If, in the built-in modelling language, the untraced randomness can include arbitrary Julia code, then isn't that a vacuous restriction? |
It's other mutable state (besides the ones mentioned in the list) that we can't rely on. For example, it breaks the semantics of Gen if arbitrary Julia code that is used in generative functions mutates some global state. This should probably be explicitly stated in https://probcomp.github.io/Gen/dev/ref/modeling/#Built-in-Modeling-Language-1. I thought it was, but can't find it (it's stated for the static language, but not the dynamic language). |
Doesn't untraced randomness rely on global state by virtue of relying on
the RNG?
|
Yes, but the semantics of Gen treat the RNG as actually random. |
If the Julia code behind the "untraced randomness" in the built-in modeling
language is assumed to have access to ambient true randomness, then what
does it mean for something (the execution of the Julia code) to not depend
on something else (a piece of non-RNG global state)? Does it mean
probabilistic independence? If so, it seems to me like this could depend
on the details of how we formalize this ideal RNG.
|
You can take a generative function and a (Julia) environment. If the generative function refers to pieces of that environment, then those pieces should be constant. If those pieces change, you have a different generative function, so technically the traces of the two functions are not compatible. As a heuristic for checking this, you could imagine hashing the environment at the start of every generative function, and then checking that it hasn't changed (not something I think we should do). |
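As a rough illustration of the hashing heuristic described above (which, as noted, is not something being proposed for Gen), here is a toy Python sketch. The names `env_fingerprint` and `CheckedGenFn` are hypothetical, and the sketch assumes the environment is a plain dict with string keys whose values have a stable `repr`.

```python
import hashlib


def env_fingerprint(env):
    """Hash a dict-based environment; sorting makes key order irrelevant.

    Assumes string keys and values whose repr() is stable (numbers, strings, lists).
    """
    payload = repr(sorted(env.items())).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class CheckedGenFn:
    """Wrap a function together with the environment it reads (toy illustration)."""

    def __init__(self, fn, env):
        self.fn = fn
        self.env = env
        self.expected = env_fingerprint(env)  # snapshot taken at definition time

    def __call__(self, *args):
        # Check at the start of every call that the environment is unchanged.
        if env_fingerprint(self.env) != self.expected:
            raise RuntimeError(
                "environment changed; this is now a different generative function"
            )
        return self.fn(self.env, *args)


env = {"scale": 2.0}
scaled = CheckedGenFn(lambda e, x: e["scale"] * x, env)
first_result = scaled(3.0)  # the environment still matches the snapshot
```

If `env["scale"]` is reassigned after `scaled` is constructed, the next call raises a `RuntimeError`, which mirrors the idea that the changed environment defines a different generative function whose traces are not compatible with the old ones.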
I agree we should not (and cannot) do this. In order to do this precisely, I think we would need to make a theoretical model of the Julia codebase (and possibly the ideal computer on which it is running). For example, we would need to know that no part of Julia relies on the ability to fix a random seed. We would also need that theoretical model in order to determine what notion of "environment snapshot" is sufficient to ensure that a given piece of Julia code runs hermetically. Given that the completely precise version is intractable to describe, is there an informal way to articulate the independence requirement you stated above? As a reader and user of Gen, I need some guidance to understand what it means that "the score depends only on the arguments, the choicemap, and the untraced randomness." I think maybe the statement (which I'm not recommending we actually say) would be something like, "The behavior of the Julia runtime can be approximated as something that uses a single internal RNG for everything, and that RNG cannot be accessed directly by Julia code, and that RNG produces truly random IID samples. Running Gen on this approximate version of Julia, the result of Is that about right? |
@marcoct, I'd like to help us get this merged in (and feel a little guilty for asking so many questions, but I need them for my understanding). Would it help to meet in person next week and see if we can come to a common understanding more quickly that way? |
@bzinberg Do you want to discuss auxiliary randomness more before I merge this commit in? |
Did you see my previous comment asking you to meet and talk about it in person? I have not reviewed the code at all, was waiting until we had more discussion of the design. Yes, I'd still like to talk about this in person. |
In conversation with @bzinberg we decided:
|
Adding to the recap in https://github.com/probcomp/Gen/pull/153#issuecomment-561283305: One key clarification that (I think) completely solved my confusion in https://github.com/probcomp/Gen/pull/153#issuecomment-552986920 was that the value of |
Looking good! Please see these smallish comments and ping me when I should take another look
Co-Authored-By: bzinberg <bzinberg@mit.edu>
… thing, now that traces have aux data in addition to choicemaps
Looks good! I removed the method nested_view(::Trace), which no longer makes sense, and filed #167 which will allow it to make sense again.
👍 to merge.
@marcoct, green light to merge this? |
Addresses https://github.com/probcomp/Gen/issues/150 for the dynamic modeling language only (not the static modeling language).