Best path seen over entire search #49

Open
rcnlee opened this issue Aug 8, 2018 · 1 comment

Comments

@rcnlee
Contributor

rcnlee commented Aug 8, 2018

What's the best way to get the best path seen over the entire search in DPW? By this I mean the sequence of (s, a, r) tuples with the highest total reward encountered over all samples, which probably occurs during a rollout.

If it doesn't currently exist, how can I implement it?

I see that there is a new action_info architecture where extra info can be returned from action. But you don't get the rollout portion of the sequence, because it is hidden in estimate_value, which calls RandomSolver. So is the easiest way to write my own rollout function that wraps the existing one? Or is there a better way?

Thanks!

@zsunberg
Member

zsunberg commented Aug 9, 2018

Hmm... yeah, that seems kind of hard right now :/ I definitely didn't plan for it when writing. Do you even know how to get the portion of the trajectory from the tree search? Since simulate is called recursively, it seems like you would have to pass more arguments into simulate to keep track of the trajectory.
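To make the "pass more arguments into simulate" idea concrete, here is a minimal Python sketch of a recursive simulation that threads the steps taken so far through each call. It is a toy, not MCTS.jl code: `BestPath`, `step`, the `prefix` argument, and the random action choice standing in for the tree policy are all hypothetical names and simplifications, and rewards are summed without discounting.

```python
import random


class BestPath:
    """Holds the highest-total-reward trajectory offered so far."""

    def __init__(self):
        self.total = float("-inf")
        self.steps = []

    def offer(self, steps):
        # Undiscounted sum of rewards along the candidate trajectory.
        total = sum(r for (_, _, r) in steps)
        if total > self.total:
            self.total = total
            self.steps = list(steps)  # copy, because the caller mutates its list


def step(s, a, rng):
    """Hypothetical generative model: integer chain with a slightly noisy reward."""
    sp = s + a
    r = float(a) + 0.1 * rng.random()
    return sp, r


def simulate(s, depth, prefix, best, rng):
    """Recursive simulation; `prefix` carries the (s, a, r) steps taken so far."""
    if depth == 0:
        best.offer(prefix)  # a full trajectory is known only at the leaf
        return 0.0
    a = rng.choice([0, 1])  # stand-in for the tree policy or rollout policy
    sp, r = step(s, a, rng)
    prefix.append((s, a, r))
    q = r + simulate(sp, depth - 1, prefix, best, rng)
    prefix.pop()  # restore the prefix for the caller
    return q


rng = random.Random(0)
best = BestPath()
returns = [simulate(0, 5, [], best, rng) for _ in range(200)]
print(best.total, best.steps)
```

The comparison happens at the leaf because that is the only point where the whole sequence, tree portion plus rollout portion, is available in one place.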

If you just want the rollout portion, yes, you would just need to implement a new type for estimate_value that keeps track of such things. To keep track of the entire trajectory, including the part that traverses the tree, I think you will have to write your own version of MCTS.simulate and maybe a few other functions. You could still use the existing tree structures, etc., though.
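For the rollout-only case, the idea of a value estimator that "keeps track of such things" might look roughly like the following Python sketch. It is an illustration under assumptions, not the MCTS.jl interface: `RecordingRollout`, `toy_step`, and the `(s, depth)` call signature are hypothetical, and the rollout policy is uniformly random with undiscounted rewards.

```python
import random


class RecordingRollout:
    """Value estimator that runs a random rollout and remembers the best one."""

    def __init__(self, step, actions, rng):
        self.step = step          # generative model: (s, a) -> (sp, r)
        self.actions = actions    # actions the rollout policy samples from
        self.rng = rng
        self.best_total = float("-inf")
        self.best_steps = []

    def __call__(self, s, depth):
        steps, total = [], 0.0
        for _ in range(depth):
            a = self.rng.choice(self.actions)
            sp, r = self.step(s, a)
            steps.append((s, a, r))
            total += r
            s = sp
        # Keep this rollout only if it beats every rollout seen so far.
        if total > self.best_total:
            self.best_total, self.best_steps = total, steps
        return total  # the search still receives an ordinary value estimate


def toy_step(s, a):
    """Hypothetical deterministic model: move by `a`, earn `a` as reward."""
    return s + a, float(a)


estimator = RecordingRollout(toy_step, [0, 1], random.Random(1))
values = [estimator(0, 4) for _ in range(100)]
print(estimator.best_total, estimator.best_steps)
```

Because the recording lives inside the estimator object, the search code that calls it does not need to change; the limitation, as noted above, is that the steps taken inside the tree before the rollout starts are not captured.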
