sparse observation matrix bugfix #30
Another suggestion: the current code is using |
Hi @mattuntergassmair thanks! We should have caught this earlier! Can we add a test that has a different number of observations and states like the TMaze from POMDPModels?
The tests are currently failing because of some issue with Compose 0.6.1. We need to somehow convince it not to use that version. I tried playing with this for a while tonight but could not figure it out.
Hi @zsunberg, I added the TMaze for testing as suggested 👍 I discovered several bugs in the implementation of TMaze and will submit a separate PR for those. TMaze PR: JuliaPOMDP/POMDPModels.jl#58
src/sparse_tabular.jl
Outdated
push!(transmat_col_A[ai], si)
push!(transmat_data_A[ai], 1.0)
else
# if isterminal(mdp, s) # if terminal, there is a probability of 1 of staying in that state
Do we want to check for terminal states explicitly here, or do we expect them to be detected in the transition function?
Same as below. We didn't want the user to have to think about this.
src/sparse_tabular.jl
Outdated
if isterminal(mdp, s)
reward_S_A[stateindex(mdp, s), :] .= 0.0
else
# if isterminal(mdp, s)
Same as for the transition matrix: do we want to check here, or do we expect the check to happen in reward(...)?
IIRC we didn't want to put the extra overhead of thinking about this on the user.
Ok, I guess that works. Should we then document somewhere that the user doesn't need to worry about checking for terminal states in reward and transition? Right now it seems like the code for this is duplicated.
Also, re-introducing these lines I commented out will make some tests fail for the TMaze because it allows "escaping" from the terminal state. I guess that's a bug that needs to be taken care of on the TMaze side of things, though.
This is very important! Since no actions will be taken from a terminal state (https://juliapomdp.github.io/POMDPs.jl/stable/basic_properties/#Terminal-States-1), a complete and valid problem implementation might have a transition function that errors when it is called on a terminal state. Thus, we need this check to avoid such errors.
Should we then document somewhere that the user doesn't need to worry about checking for terminal states in reward and transition?
Yes, this is a good fact to know, but where should we put it? Happy to add it if there is a good place.
I guess that's a bug that needs to be taken care of on the TMaze side of things though
Yes, thanks for finding that!
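To illustrate why the explicit check matters, here is a minimal, self-contained sketch. It does not use POMDPs.jl; `toy_isterminal`, `toy_transition`, and `build_transition_matrix` are hypothetical stand-ins invented for this example. The toy transition function errors on terminal states, as a valid model is allowed to, so the builder has to check for terminal states before calling it.

```julia
using SparseArrays

# Hypothetical toy problem: states 1..3, state 3 is terminal.
const N_STATES = 3
toy_isterminal(s::Int) = s == N_STATES

# Hypothetical transition function that errors on terminal states.
# Returns a vector of (next_state, probability) pairs.
function toy_transition(s::Int)
    toy_isterminal(s) && error("transition called on terminal state $s")
    return [(s + 1, 1.0)]
end

# Build the sparse transition matrix for a single action, checking for
# terminal states before the transition function is ever called.
function build_transition_matrix()
    rows, cols, vals = Int[], Int[], Float64[]
    for s in 1:N_STATES
        if toy_isterminal(s)
            # Terminal state: stay in place with probability 1.
            push!(rows, s); push!(cols, s); push!(vals, 1.0)
        else
            for (sp, p) in toy_transition(s)
                push!(rows, s); push!(cols, sp); push!(vals, p)
            end
        end
    end
    return sparse(rows, cols, vals, N_STATES, N_STATES)
end

T = build_transition_matrix()
```

Without the `toy_isterminal` branch, the loop would call `toy_transition(3)` and throw, even though the toy model itself is valid.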
I guess it could be added here https://juliapomdp.github.io/POMDPs.jl/stable/explicit/ ?
@zsunberg
@zsunberg ok I understand, then maybe |
Sort of - then you have to find a way to map the states returned by |
I understand. How about we use enumeration over
If this sounds good, I can make a separate PR.
Thanks for the hard work on this! A few things need to be fixed:
- POMDPModels and DiscreteValueIteration shouldn't be in the [deps] section because they are only needed for testing
- Need to put a compat requirement for POMDPModels to be 0.4.2 or greater to deal with the TMaze and Compose problems
- Compose shouldn't be in [deps] because it is not a dependency. The bug with Compose should be fixed in POMDPModels as discussed in the comments
- The isterminal checks need to be re-enabled (see comments)
Once POMDPModels 0.4.2 has been successfully registered, I will restart the build.
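For reference, a rough sketch of how test-only dependencies and a compat bound can be expressed in a Project.toml. This is not the actual file from this PR: the POMDPModels UUID is left as a placeholder to be copied from the registry, and the choice of packages in the test target is an assumption.

```toml
# Sketch only; not the Project.toml from this PR.
[compat]
# Caret semantics: allows versions >= 0.4.2 and < 0.5.0.
POMDPModels = "0.4.2"

[extras]
# Placeholder: copy the real UUID from the General registry.
POMDPModels = "<POMDPModels UUID>"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[targets]
# Packages listed here are only available when running the tests.
test = ["POMDPModels", "Test"]
```

Packages listed under [extras] and [targets] stay out of [deps], so users of the package do not have to install them.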
@mattuntergassmair Yep, that is the right way to do it!
Need to update |
I don't even know where DiscreteValueIteration came from; I've removed it now.
Still needs appropriate |
Not sure why the tests are failing on Travis.
One of the test is using |
Ouuuuh, that's why testing is soooo slow and it's downloading packages and stuff. So I guess it's intended to work that way?
Can someone verify that the tests work locally with POMDPModels master? Then it looks like we will have to tag one more version of POMDPModels, and then the tests should pass. I won't be able to work on this until Saturday, but it seems like we are close :) Thanks for working on it!
I can confirm that the tests are passing locally for me.
Ok, thanks for making the requested changes, @mattuntergassmair. I think we're ready as soon as JuliaRegistries/General#7040 gets merged.
Woo! Done, finally!
The dimensions of the constructed sparse observation matrix were wrong (ns x ns). Fixed this (-> ns x no) and a few other things on the way.
PS: I thought I was working on my private fork and pushed the changes, then realized afterwards that I was in the SISL repo. I reverted that commit and am now submitting the same changes as a PR here. Apologies for that.
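The dimension issue described above can be reproduced with a small standalone sketch. The sizes and the toy observation model below are made up for illustration and are not taken from the package; only `sparse` from the SparseArrays standard library is used.

```julia
using SparseArrays

# Hypothetical sizes: 4 states and 2 observations (deliberately different).
ns, no = 4, 2

# Hypothetical observation model: state s emits observation mod1(s, no)
# with probability 1.
rows = collect(1:ns)                 # state indices
cols = [mod1(s, no) for s in 1:ns]   # observation indices
vals = ones(ns)

# Passing the dimensions explicitly gives the intended ns x no matrix.
O = sparse(rows, cols, vals, ns, no)

# The same triplets with (ns, ns) still construct without error here,
# but yield a matrix with the wrong number of columns.
O_wrong = sparse(rows, cols, vals, ns, ns)
```

When a problem has as many observations as states, the two matrices have the same shape, which is why a test problem with differing counts (such as the TMaze suggested in the discussion) is useful for catching this.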