Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sparse observation matrix bugfix #30

Merged
merged 8 commits into from
Dec 22, 2019

Conversation

mattuntergassmair
Copy link

The dimensions of the constructed sparse observation matrix were wrong (ns x ns). Fixed this (-> ns x no) and a few other things on the way.

PS: I thought i was working on my private fork and pushed the changes, realized after I was in the SISL repo. I reverted that commit and am now submitting the same changes as a PR here. Apologies for that

@mattuntergassmair
Copy link
Author

mattuntergassmair commented Dec 12, 2019

Another suggestion: the current code is using si = stateindex(mdp, s) in the for loop, which requires mapping s to its associated index. Couldn't we instead just say for (si, sp) in enumerate(states(pomdp))? This would require that the state iterator conserves the ordering of the states but that seems reasonable. Saving uneccessary calls to stateindex would also improve performance. The same holds for calls to actionindex()

@zsunberg
Copy link
Member

Hi @mattuntergassmair thanks! We should have caught this earlier! Can we add a test that has a different number of observations and states like the TMaze from POMDPModels?

Couldn't we instead just say for (si, sp) in enumerate(states(pomdp))?

No, because states is not required be ordered consistently with state_index. You could use ordered_states and enumerate.

The tests are currently failing because of some issue with Compose 0.6.1. We need to somehow convince it to not use that version. I tried playing with this for a while tonight but could not figure it out.

@mattuntergassmair
Copy link
Author

mattuntergassmair commented Dec 16, 2019

Hi @zsunberg I added the TMaze for testing as suggested 👍 I discovered several bugs in the implementation of TMaze, will submit a separate PR to POMDPModels.jl to fix it. Just consider that you may have to apply those changes to POMDPModels.jl in order for the POMDPModelTools.jl tests to pass on your machine.

TMaze PR: JuliaPOMDP/POMDPModels.jl#58

push!(transmat_col_A[ai], si)
push!(transmat_data_A[ai], 1.0)
else
# if isterminal(mdp, s) # if terminal, there is a probability of 1 of staying in that state
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to check for terminal states explicitly here or do we expect them to be detected in the transition function?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as below. We didn't want the user to have to think about this.

if isterminal(mdp, s)
reward_S_A[stateindex(mdp, s), :] .= 0.0
else
# if isterminal(mdp, s)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as for transition matrix, do we want to check here or do we expect the check to happen in reward(...)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC we didn't want an extra overhead in thinking about this from the user side.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I guess that works. Should we then document somewhere that the user doesn't need to worry about checking for terminal states in reward and transition? Right now it seems like the code for this is duplicated.
Also, re-introducing these lines I commented out will make some tests fail for the TMaze because it allows "escaping" from the terminal state. I guess that's a bug that needs to be taken care of on the TMaze side of things though

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very important! Since no actions will be taken from a terminal state (https://juliapomdp.github.io/POMDPs.jl/stable/basic_properties/#Terminal-States-1), a complete and valid problem implementation might have a transition function that errors when it is called on a terminal state. Thus, we need this check to avoid such errors.

Should we then document somewhere that the user doesn't need to worry about checking for terminal states in reward and transition?

Yes, this is a good fact to know, but where should we put it? Happy to add it if there is a good place.

I guess that's a bug that needs to be taken care of on the TMaze side of things though

Yes, thanks for finding that!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattuntergassmair
Copy link
Author

@zsunberg ok I understand, then maybe ordered_states could be an interesting way to do it (maybe for a separate PR though). What is the rationale behind states being non-ordered? As long as the order is the same between calls it shouldn't even matter at all, right?

Project.toml Outdated Show resolved Hide resolved
@zsunberg
Copy link
Member

What is the rationale behind states being non-ordered?

states might return a set-like object that doesn't even have a well-defined order. In the early days, people ran into a ton of problems because the collection returned by states did not have the same order as their implementation of stateindex, so we decided to not require it to match up and declare that stateindex is the unique way to define state ordering.

As long as the order is the same between calls it shouldn't even matter at all, right?

Sort of - then you have to find a way to map the states returned by transition back into things like a value function vector or alpha vectors. This could be accomplished by creating an ad-hoc dictionary of states or something, but we wanted to make the definition of this ordering explicit so that things like alpha vectors would make sense to everyone. Hence, stateindex

@mattuntergassmair
Copy link
Author

mattuntergassmair commented Dec 16, 2019

What is the rationale behind states being non-ordered?

states might return a set-like object that doesn't even have a well-defined order. In the early days, people ran into a ton of problems because the collection returned by states did not have the same order as their implementation of stateindex, so we decided to not require it to match up and declare that stateindex is the unique way to define state ordering.

As long as the order is the same between calls it shouldn't even matter at all, right?

Sort of - then you have to find a way to map the states returned by transition back into things like a value function vector or alpha vectors. This could be accomplished by creating an ad-hoc dictionary of states or something, but we wanted to make the definition of this ordering explicit so that things like alpha vectors would make sense to everyone. Hence, stateindex

I understand. How about we use enumeration over ordered_states() as you suggested for constructing the SparseTabular(PO)MDP? Then ordered_states() calls stateindex which is equivalent to the current implementation, but a user can just overwrite ordered_states() for a specific type to get better performance, e.g.

ordered_states(pomdp::MyPOMDP) = enumerate(states(pomdp))

If this sounds good I can make a separate PR

Copy link
Member

@zsunberg zsunberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the hard work on this! A few things need to be fixed:

  1. POMDPModels and DiscreteValueIteration shouldn't be in the [deps] section because they are only needed for testing
  2. Need to put a compat requirement for POMDPModels to be 0.4.2 or greater to deal with the TMaze and Compose problems
  3. Compose shouldn't be in [deps] because it is not a dependency. The bug with Compose should be fixed in POMDPModels as discussed in the comments
  4. The isterminal checks need to be re-enabled (see comments)

@zsunberg
Copy link
Member

Once POMDPModels 0.4.2 has been successfully registered, I will restart the build

@zsunberg
Copy link
Member

zsunberg commented Dec 17, 2019

I understand. How about we use enumeration over ordered_states() as you suggested for constructing the SparseTabular(PO)MDP? Then ordered_states() calls stateindex which is equivalent to the current implementation, but a user can just overwrite ordered_states() for a specific type to get better performance, e.g.

@mattuntergassmair Yep, that is the right way to do it!

@rejuvyesh
Copy link
Member

Need to update travis.yml and add JuliaPOMDP registry if we want to use DiscreteValueIteration for testing.

@mattuntergassmair
Copy link
Author

Need to update travis.yml and add JuliaPOMDP registry if we want to use DiscreteValueIteration for testing.

I don't even know where DiscreteValueIteration came from, removed it now

@rejuvyesh
Copy link
Member

Still needs appropriate compat?

@mattuntergassmair
Copy link
Author

Still needs appropriate compat?

compat for what? DiscreteValueIteration should no longer be needed, POMDPModels is already there. Did I miss anything?

@rejuvyesh
Copy link
Member

Not sure why tests are failing on travis.

@MaximeBouton
Copy link
Contributor

One of the test is using DiscreteValueIteration, ideally we would write actual unit tests.
However, within the test script it is adding the registry and installing DiscreteValueIteration so no need to do anything in travis I believe.

@mattuntergassmair
Copy link
Author

One of the test is using DiscreteValueIteration, ideally we would write actual unit tests.
However, within the test script it is adding the registry and installing DiscreteValueIteration so no need to do anything in travis I believe.

Ouuuuh, that's why testing is soooo slow and it's downloading packages and stuff. So I guess it's intended to work that way?

@zsunberg
Copy link
Member

Can someone verify the test work locally with POMDPModels master? Then it looks like we will have to tag one more version of POMDPModels and then the tests should pass. I won't be able to work on this until Saturday, but it seems like we are close :) Thanks for working on it!

@mattuntergassmair
Copy link
Author

Can someone verify the test work locally with POMDPModels master? Then it looks like we will have to tag one more version of POMDPModels and then the tests should pass. I won't be able to work on this until Saturday, but it seems like we are close :) Thanks for working on it!

I can confirm that tests are passing locally for me

Copy link
Member

@zsunberg zsunberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, thanks for making the requested changes @mattuntergassmair I think we're ready as soon as JuliaRegistries/General#7040 gets merged

@zsunberg zsunberg merged commit 9866016 into JuliaPOMDP:master Dec 22, 2019
@zsunberg
Copy link
Member

Woo! done finally!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants