Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add return values aux state to dynamic modeling lang #153

Merged
merged 14 commits into from
Dec 18, 2019

Conversation

marcoct
Copy link
Collaborator

@marcoct marcoct commented Nov 11, 2019

Addresses https://github.com/probcomp/Gen/issues/150 for the dynamic modeling language only (not the static modeling language).

@alex-lew
Copy link
Contributor

Awesome! @marcoct -- thoughts on supporting haskey for traces? I realize most of the time the user will know what keys exist, but I can imagine scenarios where you'd want to check whether auxiliary state exists at some node before trying to access it (and triggering an exception). E.g., a pretty-printer.

@bzinberg
Copy link
Contributor

bzinberg commented Nov 11, 2019

I've also needed an iterator over (key, value) pairs for a choicemap (#123), and similarly, I've been using a scrappy version of haskey based on that iterator.

@bzinberg
Copy link
Contributor

@marcoct, could you add a documentation section about the aux state? My understanding is that it's a key-value store with arbitrary keys and values, with the requirement that there are no key collisions with the choicemap. Is there a contract that the aux state should be reconstructible from the arguments and choicemap? I'm suggesting caution here because I think there is some risk of making API/GFI decisions that are based on assumptions specific to the built-in modelling language.

@marcoct
Copy link
Collaborator Author

marcoct commented Nov 11, 2019

@bzinberg I added some more documentation about the auxiliary state in one of the above commits.

One commitment it makes is that the auxiliary state is a function of the arguments, random choices, and untraced randomness, and I don't know how to be less stringent with that.

Another commitment is that it is addressed using the same address hierarchy as random choices. That's a choice, but it seems reasonable. It's not clear otherwise how to come up with a scheme for easily addressing into auxiliary state that itself produced within a inner function call.

Another commitment is that the addresses must be disjoint from addressed of random choices. That's also a choice, made primarily so that the same nice syntax can be used to read both (trace[addr]). An alternative would be to separate the two, but then you would either need to use get_aux to return a separate object, like we do for get_choices, or disallow users from using trace[addr] to access random choice values, which seems really useful. I don't see a reason why you would just want to the auxiliary data alone, whereas for choices that is key for when they are used as proposals.

@marcoct
Copy link
Collaborator Author

marcoct commented Nov 11, 2019

In a sense, the values of all random choices are auxiliary data (whereas auxiliary data are not all random choices). So another option for documenting it would be to:

  1. Require that auxiliary data always mirror the values of random choices (which can also be obtained via get_choices). That is a requirement that would be added to the definition of auxiliary data.

  2. Document Base.getindex as returning the auxiliary data.

@marcoct
Copy link
Collaborator Author

marcoct commented Nov 11, 2019

@alex-lew Yes I agree haskey for traces (handling both random choices and auxiliary data) makes sense. Also, we could consider adding support for getting all the auxiliary state under a particular address (like get_submap for choice maps) -- but that would now require a separate accessor and not Base.getindex since the address for a call is now mapped to the return value so it can't be used for the 'state under some address'. Separately, we could also add haskey to choice maps if that's useful to have separately.

It seems somewhat unsatisfying to have to separately maintain these two different views and duplicate a lot of functionality like this. But we really do need to preserve just the choices separately. Maybe there is some refactoring that can be done to reduce logical duplication.

@bzinberg
Copy link
Contributor

@marcoct, I think one key difference between "choices" and other aux data is that scores (which I think have semantics of "approximation to the joint log probability of a portion of the trace") must depend only on the arguments and the choicemap, not on any other state including aux. Is that right?

@bzinberg
Copy link
Contributor

Also, small suggestion: I think we may want to tweak the syntax for accessing return values slightly from how it appears above: if we write

@gen function bar() ... end

@gen function foo()
  @trace(bar(), :bar)
end

trace = simulate(foo, ())

I think the API for accessing the return value of the call to bar() should be something like trace[:bar => Gen.retval], where Gen.retval is a singleton value. (Having trace[:bar] (which I understand to be the current way) as a convenient alternative syntax is fine.) My thinking is that if we only make the return value available at the outer scope, then all aux state except the return value will be expressible as trace[k] for some key k, which seems like a bit of a wrinkle. Wdyt?

@marcoct
Copy link
Collaborator Author

marcoct commented Nov 12, 2019

@bzinberg

I think one key difference between "choices" and other aux data is that scores (which I think have semantics of "approximation to the joint log probability of a portion of the trace") must depend only on the arguments and the choicemap, not on any other state including aux. Is that right?

Yes, the 'choice map' is definitely more related to the scores than the return value or the aux state that we're discussing. The various scores and weights are all functions of just arguments, 'choice maps', and untraced randomness, and not return value or aux state. (But technically everything including the auxiliary state and return value is a deterministic function of just those things -- and no other randomness/state).

@marcoct
Copy link
Collaborator Author

marcoct commented Nov 12, 2019

@bzinberg

I think the API for accessing the return value of the call to bar() should be something like trace[:bar => Gen.retval], where Gen.retval is a singleton value. (Having trace[:bar] (which I understand to be the current way) as a convenient alternative syntax is fine.) My thinking is that if we only make the return value available at the outer scope, then all aux state except the return value will be expressible as trace[k] for some key k, which seems like a bit of a wrinkle. Wdyt?

Interesting. I see the wrinkle. How about we use trace[] to get the return value of the top-level function (that is valid Julia syntax). So in your example with foo and bar above we would have:

  • trace[] is the return value of foo (we should probably remove get_retval in favor of this)
  • trace[:bar] is the return value of bar
  • trace[:bar => :x] is the value of e.g. some random choice made in bar

I prefer that, since it avoids introducing another constant (Gen.retval), which would need to be documented and would also need to be excluded from use in addresses in order to prevent ambiguity (like Pair is currenty).

@bzinberg
Copy link
Contributor

What's the definition of untraced/auxiliary randomness? Is it the case that:

  • The definition of untraced randomness is dependent upon which implementation of the GFI we're talking about, and
  • In the built-in modelling language it is defined as a call to a generative function or distribution that does not appear inside a @trace macro?

@marcoct
Copy link
Collaborator Author

marcoct commented Nov 12, 2019

What's the definition of untraced/auxiliary randomness? Is it the case that:
The definition of untraced randomness is dependent upon which implementation of the GFI we're talking about, and
In the built-in modelling language it is defined as a call to a generative function or distribution that does not appear inside a @trace macro?

Yes, where the untraced randomness comes from is not defined by the GFI. In the built-in modeling language it any randomness consumed during either (i) a call to a regular Julia function that is not a generative function or a 'distribution' (so it can't be traced) but is non-deterministic, or (ii) a call to a non-deterministic generative function (not all are non-deterministic, but they usually are) or a non-deterministic 'distribution' (not all are non-deterministic, but they usually are).

@bzinberg
Copy link
Contributor

bzinberg commented Nov 12, 2019

How about we use trace[] to get the return value of the top-level function (that is valid Julia syntax).

I think that would be nice. I think it is still an abuse of notation, since the ideal notation would be to denote it as trace[e] where e (which doesn't actually exist) is the identity element for pair creation, i.e., (e => x) == x for all x. I think it's fine though, the syntax trace[] is very readable and I don't foresee any issues.

@bzinberg
Copy link
Contributor

Yes, where the untraced randomness comes from is not defined by the GFI. In the built-in modeling language it any randomness consumed during either (i) a call to a regular Julia function that not a generative function or a 'distribution' (so it can't be traced) but is non-deterministic, or (ii) a call to a non-deterministic generative function (not all are non-deterministic, but they usually are) or a non-deterministic 'distribution' (not all are non-deterministic, but they usually are).

If that's the case, then what does it mean that something "must be a function only of the arguments, the choicemap, and the untraced randomness"? If, in the built-in modelling language, the untraced randomness can include arbitrary Julia code, then isn't that a vacuous restriction?

@marcoct
Copy link
Collaborator Author

marcoct commented Nov 12, 2019

If that's the case, then what does it mean that something "must be a function only of the arguments, the choicemap, and the untraced randomness"? If, in the built-in modelling language, the untraced randomness can include arbitrary Julia code, then isn't that a vacuous restriction?

It's other mutable state (besides the ones mentioned in the list) that we can't rely on. For example, it breaks the semantics of Gen if arbitrary Julia code that is used in generative functions mutates some global state. This should probably be explicitly stated in https://probcomp.github.io/Gen/dev/ref/modeling/#Built-in-Modeling-Language-1. I thought it was, but can't find it (it's stated for the static language, but not the dynamic language).

@bzinberg
Copy link
Contributor

bzinberg commented Nov 12, 2019 via email

@marcoct
Copy link
Collaborator Author

marcoct commented Nov 12, 2019

Doesn't untraced randomness rely on global state by virtue of relying on the RNG?

Yes, but the semantics of Gen treat the RNG as actually random.

@bzinberg
Copy link
Contributor

bzinberg commented Nov 12, 2019 via email

@marcoct
Copy link
Collaborator Author

marcoct commented Nov 12, 2019

If the Julia code behind the "untraced randomness" in the built-in modeling language is assumed to have access to ambient true randomness, then what does it mean for something (the execution of the Julia code) to not depend on something else (a piece of non-RNG global state)? Does it mean probabilistic independence? If so, it seems to me like this could depend on the details of how we formalize this ideal RNG.

You can take a generative function and a (Julia) environment. If the generative function refers to pieces of that environment, then those pieces should be constant. If those pieces change, you have a different generative function, so technically the traces of the two functions are not compatible. As a heuristic for checking this, you could imagine hashing the environment at the start of every generative function, and then checking that it hasn't changed (not something I think we should do).

@bzinberg
Copy link
Contributor

As a heuristic for checking this, you could imagine hashing the environment at the start of every generative function, and then checking that it hasn't changed (not something I think we should do).

I agree we should not (and cannot) do this. In order to do this precisely, I think we would need to make a theoretical model of the Julia codebase (and possibly the ideal computer on which it is running). For example, we would need to know that no part of Julia relies on the ability to fix a random seed. We would also need that theoretical model in order to determine what notion of "environment snapshot" is sufficient to ensure that a given piece of Julia code runs hermetically.

Given that the completely precise version is intractable to describe, is there an informal way to articulate the independence requirement you stated above? As a reader and user of Gen, I need some guidance to understand what it means that "the score depends only on the arguments, the choicemap, and the untraced randomness."

I think maybe the statement (which I'm not recommending we actually say) would be something like, "The behavior of the Julia runtime can be approximated as something that uses a single internal RNG for everything, and that RNG cannot be accessed directly by Julia code, and that RNG produces truly random IID samples. Running Gen on this approximate version of Julia, the result of get_score(...) is random, due to the randomness of the RNG. The requirement is that for all traces trace, the distribution of get_score(trace) must depend only on get_args(trace) and get_choices(trace)."

Is that about right?

@bzinberg
Copy link
Contributor

@marcoct, I'd like to help us get this merged in (and feel a little guilty for asking so many questions, but I need them for my understanding). Would it help to meet in person next week and see if we can come to a common understanding more quickly that way?

@marcoct
Copy link
Collaborator Author

marcoct commented Nov 27, 2019

@bzinberg Do you want to discuss auxiliary randomness more before I merge this commit in?

@bzinberg
Copy link
Contributor

Did you see my previous comment asking you to meet and talk about it in person? I have not reviewed the code at all, was waiting until we had more discussion of the design. Yes, I'd still like to talk about this in person.

@marcoct
Copy link
Collaborator Author

marcoct commented Dec 3, 2019

In conversation with @bzinberg we decided:

  • We should change the name from "untraced randomness" everywhere to "non-addressable randomness" and make sure that "auxiliary randomness" does not appear anywhere.

  • We should update the semantics (and documentation) of update to describe how r is actually resampled (it is not completely resampled from scratch each time). This is issue https://github.com/probcomp/Gen/issues/166 (which does not need to be addressed for this PR).

  • We confirmed that auxiliary state should be allowed to depend on the non-addressable randomness (as already documented in this PR).

@bzinberg
Copy link
Contributor

bzinberg commented Dec 3, 2019

Adding to the recap in https://github.com/probcomp/Gen/pull/153#issuecomment-561283305:

One key clarification that (I think) completely solved my confusion in https://github.com/probcomp/Gen/pull/153#issuecomment-552986920 was that the value of r is actually stored within the trace. So get_score(trace) is actually a deterministic function of the trace and the trace contains r. I had previously thought that the value of get_score(trace) was random, with trace and r being independent. The rename from "untraced randomness" to "non-addressable randomness" helps make it clearer that the former is the case. (I wonder if we should still explicitly say somewhere in the docs that a specific trace embodies a specific choice of r?)

Copy link
Contributor

@bzinberg bzinberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Please see these smallish comments and ping me when I should take another look

docs/src/ref/gfi.md Outdated Show resolved Hide resolved
docs/src/ref/gfi.md Outdated Show resolved Hide resolved
docs/src/ref/gfi.md Outdated Show resolved Hide resolved
docs/src/ref/gfi.md Outdated Show resolved Hide resolved
docs/src/ref/gfi.md Outdated Show resolved Hide resolved
src/dynamic/trace.jl Show resolved Hide resolved
src/gen_fn_interface.jl Show resolved Hide resolved
src/static_ir/trace.jl Show resolved Hide resolved
test/dynamic_dsl.jl Show resolved Hide resolved
test/dynamic_dsl.jl Show resolved Hide resolved
marcoct and others added 8 commits December 5, 2019 20:26
Co-Authored-By: bzinberg <bzinberg@mit.edu>
Co-Authored-By: bzinberg <bzinberg@mit.edu>
Co-Authored-By: bzinberg <bzinberg@mit.edu>
Co-Authored-By: bzinberg <bzinberg@mit.edu>
Co-Authored-By: bzinberg <bzinberg@mit.edu>
Co-Authored-By: bzinberg <bzinberg@mit.edu>
… thing, now that traces have aux data in addition to choicemaps
Copy link
Contributor

@bzinberg bzinberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I removed the method nested_view(::Trace), which no longer makes sense, and filed #167 which will allow it to make sense again.

👍 to merge.

docs/src/ref/gfi.md Outdated Show resolved Hide resolved
src/gen_fn_interface.jl Outdated Show resolved Hide resolved
@bzinberg
Copy link
Contributor

@marcoct, green light to merge this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants