# Consistent distribution semantics (change `initialstate`? change `action`?) #308
For example:

```julia
abstract type InitialType end
struct InitialState <: InitialType end
struct InitialObs <: InitialType end
initial(m::Union{MDP, POMDP}, ::InitialType) = ...
```

But this is also not really consistent with … For …
Yeah, I think (1) might result in the best final outcome... It just might be a little bumpy to transition to it.
I am more in favor of 1.1, but can you clarify the following: how would one define a problem using the generative interface with (1)? I think it is a bit inconsistent; one would need a call to … For 1.3, I think …

I guess that with the …
Thanks for the thoughts! This is very helpful!
Moving forward in development and documentation with …
Another question is what do we do about …

I guess my vote is for 2. It is only really for the reinforcement learning case, and it is not the observation for a particular step, so it is a different concept. Making it fall back adds some difficulties with throwing errors.
* fix #250
* travis only tests 1.1 and 1
* removed inferred_in_latest
* removed all of the old deprecated generative stuff
* removed ddn code
* before removing programatic deprecation macros
* tests pass
* before switching back to master
* initial steps
* tests pass
* started
* got rid of errors, switched to distribution initialstate (#308)
* DDNOut -> Val
* brought back DDNOut
* tests pass
* working on docs
* working on docs
* cleaned up example
* a bit more cleanup
* finished documentation to fix #280
* added deprecation case for when initialstate_distribution is implemented
* Changed emphasis of explit/generative explanation
* Update README.md
* fixed typo
* Update docs/src/def_solver.md Co-authored-by: Jayesh K. Gupta <mail@rejuvyesh.com>
* Update runtests.jl
* moved available() and add_registry() to deprecated.jl
* Update def_pomdp.md

Co-authored-by: Jayesh K. Gupta <mail@rejuvyesh.com>
I just added `ImplicitDistribution` to POMDPModelTools. This makes it easier to create a distribution object when you can't write down the explicit distribution, but only have a function to sample from the distribution. We may want to change a couple things in light of this:
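For reference, a minimal sketch of how I picture `ImplicitDistribution` being used (the clipped-Gaussian example is my own, and I am assuming the constructor takes a sampling function that receives the `rng`):

```julia
using POMDPModelTools   # provides ImplicitDistribution
using Random

# A distribution we can only sample from, not write down explicitly.
# `rand(rng, d)` calls the sampling function with the rng.
d = ImplicitDistribution(rng -> clamp(randn(rng), -1.0, 1.0))

rng = MersenneTwister(1)
x = rand(rng, d)   # draw a sample in [-1, 1]
```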
## 1. `initialstate`

This makes `initialstate(m, rng)` somewhat unnecessary, because now it is easy enough to do … I see a few options for eliminating one of these redundant functions:
1. Make `initialstate` return a distribution like `transition` and `observation` (`@deprecate initialstate(m, rng) rand(rng, initialstate(m))` and `@deprecate initialstate_distribution(m) initialstate(m)`). … `QuickMDP(initialstate=1, ...)` …
2. Deprecate `initialstate` altogether (keep `initialstate_distribution`). … `initialstate` … `transition` and `observation` …
3. Deprecate `initialstate` and `initialstate_distribution` and create a new function `initial(m::Union{MDP, POMDP})` that returns the initial state distribution. … `initialobs` …
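As a sketch of what option 3 could look like at the call site (`initial` is the name proposed above; `MyMDP` and the deterministic initial state are my own toy assumptions):

```julia
using Random
using POMDPs              # abstract MDP type
using POMDPModelTools     # Deterministic, ImplicitDistribution

struct MyMDP <: MDP{Int, Int} end   # hypothetical toy problem

# Proposed `initial`: always returns a distribution object.
initial(m::MyMDP) = Deterministic(1)

# A generative-style problem could return an ImplicitDistribution instead:
# initial(m::MyGenMDP) = ImplicitDistribution(rng -> rand(rng, 1:10))

# Solvers and simulators would then always sample:
rng = MersenneTwister(1)
s0 = rand(rng, initial(MyMDP()))
```

The appeal of this shape is that explicit and generative problem writers implement the same function; only the returned distribution type differs.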
## 2. `action`

If we are going to make everything distribution-focused, should we also make `action` return a distribution?

Options:
1. Keep `action` as it is. This is inconsistent with `transition` and `observation` (but this does not seem too bad since policies are different from POMDPs).
2. Make `action` return a distribution like `transition` and `observation`. … `rand(action(policy, b))` does not feel that clean. … `FunctionPolicy` …
3. Add `action_distribution(policy, b)` or just `distribution(policy, b)`. … `transition` and `observation` …

What about `action(policy, b, rng)`? What is the default fallback pattern?

Though I'm very undecided at this point, my initial feeling is that we should introduce `initial` (option (3)) and not make any changes to `action` (option (1)).