Simplify ChoiceMap interface/architecture via ValueChoiceMaps #263
Textdump from my initial benchmarking:
Thanks for this work @georgematheos! I'm wondering if the following slight adjustment to the framing is equivalent to what you're proposing: Julia's indexing syntax is that The current indexing syntax for choicemaps is, essentially, keys are assumed to be linked lists, and Julia indexable things can have zero-dimensional indexes: It seems that the If we can pin that down as the semantics of choicemap indexing, I think that would simplify the API. Again, I think I'm essentially saying the same thing as you. (Ok, that's my third "essentially," better stop now.) WDYT? |
Hi Ben! Yes, that's a great way to think of it! I hadn't considered the I just added a change to the choicemap docs on this PR to try to articulate choicemap lookups along the lines of your comment (1bd705f). I did this mostly via examples, though; @bzinberg does this look about like what you had in mind? Do you think including these sorts of examples are sufficient, or should we try to find a better formal articulation? |
The email pings reminded me that I forgot to reply to this 🙂 Upon further thought, I think I've better figured out what the itch was that I was trying to scratch with my above comment. I was trying to figure out what the "more correct but slightly less convenient to write" version of the indexing formalism is for these choicemaps, along the same lines as I advocated in JuliaLang/julia#28866 (comment). Once we know what that is, we can make a syntax shortcut that is as close as possible to the surface, and not have a nagging sense that we're muddying the internals. The key (hah) invariant that indexing should satisfy is, for any choicemap c[concat(keypath1, keypath2)] == c[keypath1][keypath2] where This axiom then implies that So I guess we have a few choices.
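To make the concatenation invariant above concrete, here is a minimal sketch using toy stand-ins rather than Gen's types. The names `Leaf`, `Node`, `lookup`, and `concat` are assumptions made up for this illustration, and key paths are represented as plain tuples.

```julia
# Toy stand-ins (not Gen's types): a nested map is either a Leaf holding a value
# or a Node mapping address components to sub-maps.
struct Leaf
    value::Any
end
const Node = Dict{Any,Any}

# Look up a key path given as a tuple. The empty path returns the map itself.
lookup(c, ::Tuple{}) = c
lookup(c::Node, path::Tuple{Any,Vararg{Any}}) = lookup(c[first(path)], Base.tail(path))

# Concatenating two key paths is just tuple splatting here.
concat(p1::Tuple, p2::Tuple) = (p1..., p2...)

c = Node(:a => Node(:b => Node(:c => Leaf(1))), :d => Leaf(2))
p1, p2 = (:a,), (:b, :c)

# The invariant: c[concat(keypath1, keypath2)] == c[keypath1][keypath2]
@assert lookup(c, concat(p1, p2)) === lookup(lookup(c, p1), p2)
# With an empty key path, the lookup returns the map itself.
@assert lookup(c, ()) === c
```

In this toy version a lookup always returns a sub-map (possibly a leaf), so extracting the value from a leaf would be a separate step.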
Oh how about this: Define a class

# For any T1, T2, ... that are not subtypes of KeyPathPrefix,
c[a1::T1, a2::T2, ...] === getindex(c, a1, a2, ...) =
    assert_leaf_and_get_value(c[KeyPathPrefix(a1, a2, ...)])
c[A::KeyPathPrefix] === getindex(c, A) = submap_at_keypath(c, A)
# Can also have a vararg overload of submap_at_keypath:
# if T1, T2, ... are not subtypes of KeyPathPrefix,
submap_at_keypath(c, a1::T1, a2::T2, ...) =
    submap_at_keypath(c, KeyPathPrefix(a1, a2, ...))

This might get the best of both worlds. If the user doesn't know about submap_at_prefix(c, a1, a2, ...) or c[KeyPathPrefix(a1, a2, ...)] It might have to be a breaking change because it probably should supersede Gen's special treatment of linked lists formed by WDYT?
@bzinberg thanks for bringing this up, I agree this is an important thing to figure out the right semantics for now. Also, check out the design doc I linked in the probcomp slack; part of my proposal is to merge ChoiceMaps and Selections into a single type called an

Re should

# where get_value throws an error if there is no value at this address
Base.getindex(c::ChoiceMap, addr) = get_value(c, addr)
Base.getindex(c::ChoiceMap) = get_value(c) # support valuechoicemap[] syntax
get_value(c::ChoiceMap, addr) = get_value(get_submap(c, addr))
get_value(v::ValueChoiceMap) = v.value
get_value(c::ChoiceMap) = error("There is not a value at this address!")

(This is pretty much what I've implemented in this PR. The

Re use special type instead of

abstract type Address end
struct EmptyAddress <: Address end
struct BaseAddress{T} <: Address
    addr::T
end
struct HeirarchicalAddress{First, Rest} <: Address
    first::First
    rest::Rest
    # force first and rest to be addresses with this inner constructor
    # (new needs explicit type parameters in a parametric struct):
    HeirarchicalAddress(f::Address, r::Address) = new{typeof(f), typeof(r)}(f, r)
end
# declaration for get_submap:
function get_submap(::ChoiceMap, ::Address) end
# implementations:
get_submap(c::ChoiceMap, ::EmptyAddress) = c
get_submap(c::ChoiceMap, h::HeirarchicalAddress) = get_submap(get_submap(c, h.first), h.rest)
# get_submap(::ChoiceMap, ::BaseAddress) has a custom implementation for each ChoiceMap type
# same declarations for get_value as above
# getindex tries to convert to addresses:
Base.getindex(c::ChoiceMap, a::Address) = get_value(c, a)
Base.getindex(c::ChoiceMap, a) = get_value(c, Address(a))
# default converters:
Address() = EmptyAddress()
Address(a::Address) = a
Address(a) = BaseAddress(a)
# both halves must be converted, since the inner constructor only accepts Addresses:
Address(a::Pair) = HeirarchicalAddress(Address(a[1]), Address(a[2]))

If a user wanted to use a Pair as a
Ah, thanks for the pointer -- I'll take a look at your design doc about Regarding your API sketch, it looks to me like the user-facing API will be similar to the one implied by #263 (comment), and you've suggested a different type layout for the internals (compared to "just use a thing that is conceptually a tuple"). FWIW, I think that the current use of
I'm wondering if you agree it would be good to phase out I also have two questions/comments about the type layout you proposed:
Okay, I think you have a good point @bzinberg that

Tuples Vs Linked-Lists

As you are saying, I'm essentially proposing to make the

Tuples don't seem quite right to me for this use case. For example, it feels like I should be able to change a top-level address without reconstructing the whole hierarchical address. And it feels unnatural to have "indices" for the address nodes at different "depth"s. (I don't see any reason to need constant-time access to the nth node in an address...when do I ever need a node without needing to know the part of the address preceding it?)

While the performance for tuples is highly optimized, I think the JIT compiler should be able to produce equally fast code for linked lists for iteration (though of course not for indexing--but again, I don't think we need this). When I run simple experiments to iterate over linked-lists vs tuples, it looks like for iteration implemented via recursion, linked lists have better performance, vs "for-loop" recursion leads to better performance for tuples. (Pairs are faster with recursive iteration because it's slow to do

Concatenation

(Note that I use

I agree that it's not obvious what the best way to do concatenation is. Ideally, we could concatenate any 2 addresses in constant time. I don't think this is possible if addresses are represented as tuples, since tuples are immutable so to concatenate we have to fill a new tuple with the values from the other ones. If they are linked-lists, we can sort of concatenate in constant time, but we get the wrong "nesting structure" as you are pointing out. (Ie. we'd want

Interestingly, a lot of the time, having improper nesting wouldn't matter. For example,

However, there are cases when we need the first

(first::BaseAddress, rest::Address) = get_first_rest(a::HeirarchicalAddress)

and have this automatically do the (possibly linear-time) traversal needed to extract the base address from

Alternatively, we could just do this iteration at construction-time, so every time we construct a

Another possibility would be to have one type
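As a rough illustration of the first/rest extraction discussed in this comment, here is a sketch that works on plain Julia `Pair`s instead of the proposed address types. The helper name `first_rest` is hypothetical, and the sketch assumes that every `Pair` is part of the address structure (so a `Pair` is never itself a base address).

```julia
# Hypothetical helper: split a Pair-based address into its first component and
# the rest, even when the Pairs are left-nested, e.g. (:a => :b) => :c.
# The recursion walks down the left spine, so the cost is linear in its depth.
function first_rest(addr::Pair)
    head, tail = addr.first, addr.second
    head isa Pair || return (head, tail)
    inner_first, inner_rest = first_rest(head)
    return (inner_first, inner_rest => tail)
end

# Properly (right-)nested and improperly (left-)nested addresses agree:
@assert first_rest(:a => (:b => :c)) == (:a, :b => :c)
@assert first_rest((:a => :b) => :c) == (:a, :b => :c)
```

Note that the returned rest can itself still be left-nested, so repeated calls each pay for their own traversal. That is the trade-off against normalizing once at construction time.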
Aha - now that I've read your docs on The main things I want to advocate are API simplicity and deferred commitments. By deferred commitments I mean that the core conceptual machinery that users have to know should "be as unrestrictive as possible, except when it's not" -- wherever restrictions or complexity are introduced, they should be well justified. So what I'm most interested in understanding better (and didn't do a good job of focusing on in my previous comment) is, what commitments do the above API and Julia type hierarchy make in terms of how the user structures their programs, and are they all necessary? On first glance it looks to me like the type hierarchy itself is catered to inference algorithms that operate by tree traversal on the choicemap -- which might be exactly right but I want to better understand why it's right. |
Btw, this is a minor and tangential thing, but just to clarify, when I said
I was referring to stack memory overhead (which can overflow the stack even if the system has plenty of physical memory), because AFAIK Julia does not do tail call optimization. I doubt that any reasonable Gen program would come close to having choicemaps this deep, I was just reflexively commenting on what appeared to be stack-based recursion in a language that isn't Scheme. (Sadly, even Clojure doesn't have proper tail call optimization, because it needs to follow Java calling conventions for interoperability.) |
Ah shoot, they don't flatten tail recursion?! Okay, that's good to know. I agree most programs won't have choicemaps that deep, but it still might be a performance regression. We could consider if an iterative implementation like

function get_submaps_shallow(c::ChoiceMap, a)
    # walk down the hierarchical address one node at a time instead of recursing
    while !(a isa BaseAddress)
        c = get_submap(c, a.first)
        a = a.rest
    end
    return get_submap(c, a)
end

would improve performance.

Re commitments, I totally agree. I am pretty busy right now but at some point I think I might put together a presentation/writeup to discuss with the group about a way to look at Gen as a sort of "tree traversal" library, which motivates a lot of my suggestions and changes. This will help explain why I think that this type of tree-recursion is arguably fundamental to Gen, and why I think it may be reasonable to commit to it in the explicit design of the library.
Note: this PR implements a breaking change, and is not for merging in the short-term; I am posting it to discuss this possible way of changing the choicemap interface, and possibly for inclusion in a "version number change" release in which we allow for breaking changes.
Big Picture
This PR is based on issue #258, and makes it so that the values stored in ChoiceMaps are wrapped in a subtype ValueChoiceMap <: ChoiceMap. This way, we can view the values as leaf nodes of a ChoiceMap, which simplifies some code and will make it possible in the future to make Distribution a subtype of GenerativeFunction (which is one of the biggest reasons for this change). To be consistent with the semantics that values are choicemaps, I have changed the meaning of get_submaps_shallow and the error-throwing behavior of get_submap, as I will discuss in the next section.
Breaking Changes
- get_submaps_shallow now returns an iterator over tuples (addr, submap), including for submaps which are a ValueChoiceMap. So collect(get_submaps_shallow(choicemap((:a => :b, 1), (:c, 2)))) will now return [(:a, choicemap((:b, 1))), (:c, ValueChoiceMap(2))].
- get_submap(choicemap, addr) where choicemap[addr] contains a value no longer throws a KeyError; now it returns a ValueChoiceMap containing that value.

I have also changed some error-throwing behavior to what seemed more natural to me in this implementation, though these could be reverted to the old behavior without majorly changing the idea of the PR:
- get_submap(choicemap, addr1 => addr2) where addr1 contains a value now returns an EmptyChoiceMap rather than throwing a KeyError.
- get_value(choicemap, addr) where there is no value stored at addr no longer throws a KeyError; instead it throws a custom error called a ChoiceMapGetValueError. This error does not include any information about which address did not possess a value. However, I do not think this makes the error significantly less informative than the way KeyErrors were being thrown, since these KeyErrors always wrap the deepest part of the address, regardless of the first address not to be found. (I.e., if you called get_value(choicemap, :a => :b => :c => :d) and it threw a KeyError, it would always be a KeyError(:d), even if, say, :b was the first key which had an EmptyChoiceMap under it. So the address the KeyError wraps is not very useful.)

Nonbreaking interface changes
- get_nonvalue_submaps_shallow performs the functionality get_submaps_shallow used to.
- get_value(choicemap) and has_value(choicemap) are valid calls. get_value(choicemap) returns the value on the choicemap if choicemap isa ValueChoiceMap and throws a ChoiceMapGetValueError otherwise; has_value(choicemap) returns true if choicemap isa ValueChoiceMap and false otherwise.
- _fill_array!, so custom choicemap types have a built-in implementation for to_array. (Users must still implement _from_array to use from_array.)
- Base.isempty, has_value, get_value, get_values_shallow, and get_nonvalue_submaps_shallow are provided based on get_submap and get_submaps_shallow, so users should only need to implement 2 methods to create a custom choicemap type.

There are also many implementation changes.
Initial Benchmarking
I have run 3 benchmarks and included the files in the PR; they are test/static_choicemap_benchmark.jl, in which I test the performance of lookups from static choicemaps, test/static_inference_benchmark.jl, in which I test the performance of running MH on a simple static model, and test/dynamic_choicemap_benchmark.jl, in which I test the performance of lookups from dynamic choicemaps.

The takeaways from this initial benchmarking are
- :b => :c). However, this improvement may be achievable via some changes to the static IR code without the rest of this PR; it comes from changing the implementation of get_choices for static DSL traces so that it always returns a StaticTraceIRAssmt rather than sometimes returning an EmptyChoiceMap; this type-stability allows for deeper compilation. However, this full PR may be needed for the behavior of isempty to perform correctly if we get rid of this check to sometimes return EmptyChoiceMap; I'm not totally sure.
- :a) and is improved for nested lookups (ie. looking up :b => :c).

I suspect we will want to do more benchmarking; any suggestions for what experiments would be useful to run?
Questions and requests for review
- src/modeling_library/recurse/recurse.jl - I only made minor changes to this file, and all the tests are passing, but I have not taken the time to understand how this combinator works, so I am not 100% confident my implementation here is correct
- is_empty and num_nonempty in the static IR trace.jl file. I am not totally clear why these values are being tracked (is it just so we can return an EmptyChoiceMap if there are no values in the trace's subtraces?), but I removed the check that has get_submap sometimes return an EmptyChoiceMap, since this type instability decreases lookup performance drastically. My new implementation should handle calling isempty on a static IR trace choicemap with no values properly even if it isn't an EmptyChoiceMap; does this mean we should get rid of tracking isempty and num_nonempty? Or are they being used for something else? Or have I actually broken something that just didn't trigger any errors in the test suite?