Use different rollout depth than the tree construction depth. #88

BoZenKhaa · 2022-02-03T11:05:21Z

Currently, when using the rollout estimator, the rollouts end at the same depth as the MCTS tree construction. This can be worked around by redefining the rollout, for example like this (for rollouts that run until a terminal state):

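# Override the rollout so that it ignores the tree depth `d` and runs until a terminal state.
function MCTS.rollout(estimator::MCTS.SolvedRolloutEstimator, mdp::MDP, s, d::Int)
    # eps=nothing and max_steps=nothing switch off both early-termination criteria
    sim = RolloutSimulator(;estimator.rng, eps=nothing, max_steps=nothing)
    POMDPs.simulate(sim, mdp, estimator.policy, s)
end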

Providing a custom policy for the RolloutEstimator via the estimate_value parameter does not change the depth, since the depth is handled inside the rollout.

To me, this behavior seems somewhat unintuitive, and the solution is undocumented as far as I can tell.

I can think of a few options to address this:

1. Add this example to the docs and to the domain knowledge notebook.
2. Add a new parameter to the MCTSSolver, such as rollout_depth, that will be passed to estimate_value instead of the depth variable. For backward compatibility, this could be a function, defaulting to depth->depth.
3. Provide new methods in domain_knowledge.jl that handle the different depth.

Overall, option 1 is the easiest. Option 2 seems hacky, since not all estimators are rollouts. Option 3 makes the most sense to me, maybe with a small rewrite so the new methods play well with the rest of domain_knowledge.jl.
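A minimal sketch of what option 2 could look like, using only Base Julia. `SketchSolver` and `estimator_depth` are hypothetical names made up for illustration; they are not part of MCTS.jl, and the real MCTSSolver has many more fields.

```julia
# Hypothetical stand-in for the solver options; NOT the MCTS.jl MCTSSolver type.
struct SketchSolver{F}
    depth::Int          # tree construction depth
    rollout_depth::F    # maps the remaining tree depth to the depth given to the estimator
end

# Backward-compatible default: the estimator sees the same depth as it does today.
SketchSolver(depth::Int) = SketchSolver(depth, identity)

# Depth that would be handed to estimate_value at a leaf with `remaining` tree steps left.
estimator_depth(solver::SketchSolver, remaining::Int) = solver.rollout_depth(remaining)

default_solver = SketchSolver(10)
deep_solver = SketchSolver(10, d -> d + 50)   # rollouts continue 50 steps past the tree depth

println(estimator_depth(default_solver, 3))
println(estimator_depth(deep_solver, 3))
```

With the default, the estimator receives the remaining depth unchanged, so existing code would behave as before; only users who pass their own function would see different rollout lengths.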

I can try to prepare a PR for any of these options, but after seeing #38, I thought there might be an even better way of handling this?


zsunberg · 2022-02-12T03:23:32Z

@BoZenKhaa Thanks a bunch for thinking through this.

I actually think we should change it to function like it does in the alg4dm book: https://algorithmsbook.com/files/dm.pdf#%5B%7B%22num%22%3A153%2C%22gen%22%3A0%7D%2C%7B%22name%22%3A%22XYZ%22%7D%2C59.776%2C238.344%2Cnull%5D

That is, the value estimates are by default infinite-horizon. We can then add an option to the RolloutEstimator for the maximum depth.

This would be a breaking change, but we can increment the version number appropriately. What do you think of that?
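To make the proposal concrete, here is a rough sketch on a toy chain problem, written in plain Julia without POMDPs.jl. All names (`isterminal_toy`, `step_toy`, `toy_rollout`, `max_depth`) are assumptions for illustration only and do not reflect the API that was eventually merged.

```julia
using Random

# Toy chain: states 1..5, state 5 is terminal; every step moves right and earns reward 1.
isterminal_toy(s) = s >= 5
step_toy(s, rng) = (s + 1, 1.0)   # deterministic here; rng kept to mirror a generative model
const TOY_DISCOUNT = 0.9

# max_depth = nothing means "run until a terminal state" (the proposed default);
# an integer caps the number of rollout steps independently of the tree depth.
function toy_rollout(s, rng; max_depth::Union{Nothing,Int}=nothing)
    total, disc, t = 0.0, 1.0, 0
    while !isterminal_toy(s) && (max_depth === nothing || t < max_depth)
        s, r = step_toy(s, rng)
        total += disc * r      # accumulate the discounted reward
        disc *= TOY_DISCOUNT
        t += 1
    end
    return total
end

rng = MersenneTwister(1)
full_value = toy_rollout(1, rng)                 # runs all 4 steps to the terminal state
capped_value = toy_rollout(1, rng; max_depth=2)  # stops after 2 steps
```

The point of the sketch is only that the rollout's stopping rule is owned by the estimator and defaults to "no cap", rather than being inherited from the tree depth.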

BoZenKhaa · 2022-02-14T12:56:17Z

I will have a look at the relevant section in the book, but the way I learned MCTS, rollouts to a terminal state are the "default", so the current behavior came as a surprise.

I think your proposal sounds good. If the infinite rollout is to be the default, then there is no way to do it without a breaking change. Maybe the default rollout could also have some epsilon defined for MDPs with an infinite horizon?
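One common reading of such an epsilon is a cutoff on the accumulated discount: stop the rollout once discount^t drops below eps, because later rewards can no longer change the estimate by much. The sketch below is plain Julia on a toy process with constant reward 1 and no terminal state; `eps_rollout` is a hypothetical name, and whether the merged change behaves this way is not stated in this thread.

```julia
# Toy infinite-horizon process: every step yields reward 1 and there is no terminal state.
function eps_rollout(discount::Float64, eps::Float64)
    total, disc, steps = 0.0, 1.0, 0
    # Stop once the weight of any further reward falls below eps.
    while disc >= eps
        total += disc * 1.0
        disc *= discount
        steps += 1
    end
    return total, steps
end

total, steps = eps_rollout(0.5, 0.01)
println((total, steps))
```

For rewards bounded by R_max, the return that is cut off is less than eps * R_max / (1 - discount), so eps trades rollout length against a known bound on the bias.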

zsunberg · 2022-05-19T17:06:50Z

Closed with #91

jancervenka mentioned this issue Feb 21, 2022

Custom rollout depth #91

Merged

zsunberg closed this as completed May 19, 2022

zsunberg mentioned this issue Jun 12, 2024

depth parameter is not used by DPWSolver #109

Closed
