pomdpsol should parse standard-out info #1
Comments
Done
Hi Carl, were you able to extract the initialization time? I cannot see it in the code in the R directory.
Yup, I think so. You should be able to install the package from GitHub, e.g. with devtools:

    devtools::install_github("cboettig/appl")
    library("appl")

and then run an example from the documentation, e.g.

    model <- system.file("models/example.pomdp", package = "appl")
    policy <- tempfile()
    pomdpsol(model, output = policy, timeout = 2)

which now gives you:

    load_time init_time run_time final_precision end_condition
    "0.12s " "0.50s" "2.09" "6.69175" " Preset timeout reached"

(i.e. 0.5 seconds to init this model, which has 24 states). You can see the R code for doing this here: https://github.com/cboettig/appl/blob/master/R/appl.R#L123. I think I got the parsing right in general. Look good?
Oh okay, but I guess one significant improvement would be to make the timeout input apply only to the running time, excluding the initialization time. This way, we do not need to check the initialization time beforehand to set a reasonable timeout. We can just specify the running time and, no matter how long the initialization takes, sarsop runs the value iteration for the running time that the user gives as input. Does that make sense? Do you agree?
Did you miss this? I believe it would be a very important piece if we can add it. Depending on the complexity of the problem, loading plus initialization can take a long time, and the user doesn't know how long to set the timeout. If the input timeout represented only the running time (excluding loading and initialization time), it would be a more general and useful feature.
Thanks, yup, I did miss this, thanks for the ping. Not totally sure I follow how that would work. From the way I've implemented it here, we only get the initialization time after In general I'm not sure the user should be running with timeout as the option. It seems better to set a desired precision, and perhaps a max memory to make sure the program at least exits gracefully (but probably with a warning if the desired precision is not met). But I haven't figured out what units the precision is in or how a user would decide the appropriate precision a priori.
Yes, the user doesn't know the initialization time, but I'll give you an example of what I mean: suppose a user wants to run sarsop for 1000 seconds. If he sets timeout to 1000 seconds, he's actually running sarsop for 1000 seconds minus the loading and initialization times. What if we could find a way to allocate these 1000 seconds only to running time? That would be a huge advantage. Does this make sense now?
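The arithmetic in this comment can be sketched in a few lines of R. This is only an illustration, not something the package provides: `parse_seconds()` and `timeout_for_run_time()` are hypothetical helper names, and the load and init timings are simply the strings from the example output earlier in the thread. In practice those timings are only known after a run, so they would have to come from a short probe run and may vary between runs.

```r
# Convert timing strings such as "0.12s " or "2.09" to numeric seconds.
# The returned strings have inconsistent whitespace and an optional "s" suffix.
parse_seconds <- function(x) {
  as.numeric(sub("s$", "", trimws(x)))
}

# Hypothetical helper: the timeout to request so that roughly `run_target`
# seconds remain for value iteration after loading and initialization.
timeout_for_run_time <- function(run_target, load_time, init_time) {
  run_target + parse_seconds(load_time) + parse_seconds(init_time)
}

# Timings copied from the example output earlier in this thread.
timeout <- timeout_for_run_time(1000, load_time = "0.12s ", init_time = "0.50s")
print(timeout)
```

With the example timings the requested timeout would be 1000.62 seconds, so the correction is negligible for a 24-state model and only matters when loading and initialization are slow.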
Right, I do see that, but this still seems a bit strange to me. On the software engineering side, if that's a common use case it seems they should have designed For comparison, it wouldn't make sense to do infinite-time-horizon MDP solution based on a computational time, which would give different results on computers of different speeds anyhow -- instead one runs until the policy converges (policy iteration) or the value converges (value iteration), right? I might be misunderstanding this, but if the appl designers didn't intend to support running sarsop (post-initialization) for a fixed computational time as the exit condition, then I'm not sure we should build it in here. It might make more sense, at least in the context we apply it here, to run to policy-fn convergence, perhaps by attempting successively lower precision until the policy no longer changes?
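One way to picture the "successively lower precision until the policy no longer changes" idea is the loop below. It is a minimal sketch under stated assumptions: `solve_fn` is a hypothetical stand-in for whatever solves the model at a given precision and returns the policy as a vector of actions, and `toy_solver` exists only so the example runs on its own. Nothing here calls the real solver.

```r
# `solve_fn` is a hypothetical stand-in: it takes a precision and returns
# the policy as a vector of actions. `precisions` should go from loose to tight.
solve_until_policy_stable <- function(solve_fn, precisions) {
  stopifnot(length(precisions) > 0)
  previous <- NULL
  for (p in precisions) {
    policy <- solve_fn(p)
    # Stop as soon as two consecutive precisions give the same policy.
    if (!is.null(previous) && identical(policy, previous)) {
      return(list(policy = policy, precision = p, converged = TRUE))
    }
    previous <- policy
  }
  # Ran out of precisions without seeing the policy repeat.
  list(policy = previous, precision = p, converged = FALSE)
}

# Toy solver for illustration only: the policy changes at precisions
# 1 and 0.5, then stays the same for anything tighter.
toy_solver <- function(precision) {
  if (precision >= 1) {
    c(1L, 1L, 1L)
  } else if (precision >= 0.5) {
    c(1L, 1L, 2L)
  } else {
    c(1L, 2L, 2L)
  }
}

result <- solve_until_policy_stable(toy_solver, precisions = c(1, 0.5, 0.1, 0.05))
print(result$precision)
```

A caveat of this stopping rule is that a policy which happens to repeat between two adjacent precisions is treated as converged, so the spacing of `precisions` matters.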
Yeah, I guess that makes sense for now. Personally I find it very useful when we want to do learning and don't care that much about the precision of the value, but just want to get a reasonable policy as soon as possible. But generally, it should not be a factor for stopping sarsop. That's what I meant by useful.
Should extract information such as initialization time and convergence information.