
pomdpsol should parse standard-out info #1

Closed
cboettig opened this issue May 24, 2016 · 9 comments

Comments

@cboettig
Member

Should extract information such as initialization time and convergence details.

@cboettig
Member Author

Done

@miladm12
Collaborator

Hi Carl, were you able to extract the initialization time? I cannot see it in the code in the R directory.

@cboettig
Member Author

Yup, I think so.

You should be able to install the package from GitHub, e.g. with devtools:

devtools::install_github("cboettig/appl")
library("appl")

and then run an example from the documentation, e.g. ?pomdpsol:

model <- system.file("models/example.pomdp", package = "appl")
policy <- tempfile()
pomdpsol(model, output = policy, timeout = 2)

which now gives you:

                load_time                  init_time                   run_time            final_precision              end_condition 
                  "0.12s "                    "0.50s"                     "2.09"                  "6.69175" "  Preset timeout reached" 

(i.e. 0.5 seconds to init this model, which has 24 states)

You can see the R code for doing this here: https://github.com/cboettig/appl/blob/master/R/appl.R#L123. I think I got the parsing right in general. Does it look good?
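
For reference, here is a rough, self-contained sketch of the kind of parsing that file does (the example log lines and field names here are assumptions for illustration only; the real implementation is in the R/appl.R file linked above):

log <- c("  loading time : 0.12s",
         "  initialization time : 0.50s")   # stand-in for captured pomdpsol output
# pull the value after "field :" from the first matching log line
grab <- function(field) {
  line <- grep(field, log, value = TRUE)[1]
  trimws(sub(paste0(".*", field, "\\s*:\\s*"), "", line))
}
c(load_time = grab("loading time"), init_time = grab("initialization time"))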

@miladm12
Collaborator

Oh okay, but I guess one significant improvement would be to make the timeout input cover only the running time, excluding the initialization time. That way we would not need to check the initialization time beforehand to set a reasonable timeout: we could just specify the running time, and no matter how long the initialization takes, sarsop would run the value iteration for exactly the running time the user gives as input. Does that make sense? Do you agree?

@miladm12
Collaborator

miladm12 commented Jun 3, 2016

Did you miss this? I believe it would be a very important piece if we can add it, because depending on the complexity of the problem, loading + initialization can take a lot of time and the user doesn't know how long to set the timeout. If the input timeout represented only the running time (excluding loading and initialization time), it would be a more general and useful feature.

@cboettig
Member Author

cboettig commented Jun 3, 2016

Thanks, yup, I did miss this; thanks for the ping. I'm not totally sure I follow how that would work. The way I've implemented it here, we only get the initialization time after the pomdpsol command has been called, by which point the user has already given the timeout.

In general I'm not sure the user should be running with timeout as the option. It seems better to set a desired precision, and perhaps a max memory, to make sure the program at least exits gracefully (probably with a warning if the desired precision is not met). But I haven't figured out what units the precision is in or how a user would decide an appropriate precision a priori.

@miladm12
Collaborator

miladm12 commented Jun 3, 2016

Yes, the user doesn't know the initialization time, but here is an example of what I mean:

Suppose a user wants to run sarsop for 1000 seconds. If they set the timeout to 1000 seconds, they are actually running sarsop for 1000 seconds minus the loading and initialization time. What if we could find a way to allocate those 1000 seconds only to the running time? That would be a big advantage. Does this make sense now?
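
One possible workaround sketch, outside the package itself: do a short probe run just to measure the loading + initialization overhead, then pad the real timeout by that amount. This assumes pomdpsol() returns the named timing vector shown above, and that the overhead is similar between runs.

# short probe run only to measure overhead (assumes the overhead fits in the probe timeout)
probe <- pomdpsol(model, output = tempfile(), timeout = 1)
overhead <- sum(as.numeric(sub("s", "", trimws(probe[c("load_time", "init_time")]))))
desired_run_time <- 1000
pomdpsol(model, output = policy, timeout = desired_run_time + overhead)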

@cboettig
Member Author

cboettig commented Jun 3, 2016

Right, I do see that, but this still seems a bit strange to me. On the software-engineering side, if that were a common use case, it seems they would have designed the pomdpsol binary to apply the timeout only to the sarsop run and not to the initialization. That they applied it to the whole makes it seem to me that they merely wanted to let the user limit the total computational time, rather than run sarsop for, say, 1000 seconds. My impression is that the timeout is meant mostly as a backup to make sure the program exits, rather than as a primary exit condition.

For comparison, it wouldn't make sense to do an infinite-time-horizon MDP solution based on a computational time, which would give different results on computers of different speeds anyhow -- instead one runs until the policy converges (policy iteration) or the value converges (value iteration), right? I might be misunderstanding this, but if the appl designers didn't intend to support running sarsop (post-initialization) for a fixed computational time as the exit condition, then I'm not sure we should build it in here.

It might make more sense, at least in the context we apply it here, to run to policy-function convergence, perhaps by attempting successively lower precision until the policy no longer changes?
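
Something like this sketch of that idea, assuming pomdpsol() accepts a precision argument, and treating a verbatim comparison of the written policy files as a crude stand-in for a proper policy comparison:

previous <- NULL
for (precision in c(10, 1, 0.1, 0.01)) {
  out <- tempfile()
  pomdpsol(model, output = out, precision = precision)
  current <- readLines(out)   # the policy file written for this precision
  if (!is.null(previous) && identical(current, previous)) {
    message("policy unchanged at precision ", precision, "; stopping")
    break
  }
  previous <- current
}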

@miladm12
Collaborator

miladm12 commented Jun 4, 2016

Yeah, I guess that makes sense for now. Personally it has a lot of use when we want to do learning and don't care that much about the precision of the value, but just want a reasonable policy as soon as possible. But generally, it should not be a factor for stopping sarsop. That's what I meant would be useful.
