
pomdpsol should parse standard-out info #1

Closed
cboettig opened this issue May 24, 2016 · 9 comments

Comments

@cboettig
Member

Should extract information such as initialization time and convergence details.

@cboettig
Member Author

Done

@miladm12
Collaborator

Hi Carl, were you able to extract the initialization time? I cannot see it in the code in the R directory.

@cboettig
Member Author

Yup, I think so.

You should be able to install the package from GitHub, e.g. with devtools:

devtools::install_github("cboettig/appl")
library("appl")

and then run an example from the documentation, e.g. ?pomdpsol:

model <- system.file("models/example.pomdp", package = "appl")
policy <- tempfile()
pomdpsol(model, output = policy, timeout = 2)

which now gives you:

                load_time                  init_time                   run_time            final_precision              end_condition 
                  "0.12s "                    "0.50s"                     "2.09"                  "6.69175" "  Preset timeout reached" 

(i.e. 0.5 seconds to init this model, which has 24 states)

You can see the R code for doing this here: https://github.com/cboettig/appl/blob/master/R/appl.R#L123. I think I got the parsing right in general. Does it look good?
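
For reference, here is a rough, self-contained sketch of the kind of parsing that file does (the example log lines and field names here are assumptions for illustration only; the real implementation is in the R/appl.R file linked above):

log <- c("  loading time : 0.12s",
         "  initialization time : 0.50s")   # stand-in for captured pomdpsol output
# pull the value after "field :" from the first matching log line
grab <- function(field) {
  line <- grep(field, log, value = TRUE)[1]
  trimws(sub(paste0(".*", field, "\\s*:\\s*"), "", line))
}
c(load_time = grab("loading time"), init_time = grab("initialization time"))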

@miladm12
Collaborator

Oh okay, but I guess one significant improvement would be to make the timeout input cover only the running time, excluding the initialization time. That way we would not need to check the initialization time beforehand to set a reasonable timeout: we could just specify the running time, and no matter how long the initialization takes, sarsop would run the value iteration for exactly the running time the user gives as input. Does that make sense? Do you agree?

@miladm12
Collaborator

miladm12 commented Jun 3, 2016

Did you miss this? I believe it would be a very important piece if we can add it, because depending on the complexity of the problem, loading + initialization can take a lot of time and the user doesn't know how long to set the timeout. If the input timeout represented only the running time (excluding loading and initialization time), it would be a more general and useful feature.

@cboettig
Member Author

cboettig commented Jun 3, 2016

Thanks, yup, I did miss this; thanks for the ping. I'm not totally sure I follow how that would work. The way I've implemented it here, we only get the initialization time after the pomdpsol command has been called, by which point the user has already given the timeout.

In general I'm not sure the user should be running with timeout as the option. It seems better to set a desired precision, and perhaps a max memory, to make sure the program at least exits gracefully (probably with a warning if the desired precision is not met). But I haven't figured out what units the precision is in or how a user would decide an appropriate precision a priori.

@miladm12
Collaborator

miladm12 commented Jun 3, 2016

Yes, the user doesn't know the initialization time, but here is an example of what I mean:

Suppose a user wants to run sarsop for 1000 seconds. If they set the timeout to 1000 seconds, they are actually running sarsop for 1000 seconds minus the loading and initialization time. What if we could find a way to allocate those 1000 seconds only to the running time? That would be a big advantage. Does this make sense now?
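
One possible workaround sketch, outside the package itself: do a short probe run just to measure the loading + initialization overhead, then pad the real timeout by that amount. This assumes pomdpsol() returns the named timing vector shown above, and that the overhead is similar between runs.

# short probe run only to measure overhead (assumes the overhead fits in the probe timeout)
probe <- pomdpsol(model, output = tempfile(), timeout = 1)
overhead <- sum(as.numeric(sub("s", "", trimws(probe[c("load_time", "init_time")]))))
desired_run_time <- 1000
pomdpsol(model, output = policy, timeout = desired_run_time + overhead)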

@cboettig
Member Author

cboettig commented Jun 3, 2016

Right, I do see that, but this still seems a bit strange to me. On the software-engineering side, if that were a common use case, it seems they would have designed the pomdpsol binary to apply the timeout only to the sarsop run and not to the initialization. That they applied it to the whole makes it seem to me that they merely wanted to let the user limit the total computational time, rather than run sarsop for, say, 1000 seconds. My impression is that the timeout is meant mostly as a backup to make sure the program exits, rather than as a primary exit condition.

For comparison, it wouldn't make sense to do an infinite-time-horizon MDP solution based on a computational time, which would give different results on computers of different speeds anyhow -- instead one runs until the policy converges (policy iteration) or the value converges (value iteration), right? I might be misunderstanding this, but if the appl designers didn't intend to support running sarsop (post-initialization) for a fixed computational time as the exit condition, then I'm not sure we should build it in here.

It might make more sense, at least in the context we apply it here, to run to policy-function convergence, perhaps by attempting successively lower precision until the policy no longer changes?
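
Something like this sketch of that idea, assuming pomdpsol() accepts a precision argument, and treating a verbatim comparison of the written policy files as a crude stand-in for a proper policy comparison:

previous <- NULL
for (precision in c(10, 1, 0.1, 0.01)) {
  out <- tempfile()
  pomdpsol(model, output = out, precision = precision)
  current <- readLines(out)   # the policy file written for this precision
  if (!is.null(previous) && identical(current, previous)) {
    message("policy unchanged at precision ", precision, "; stopping")
    break
  }
  previous <- current
}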

@miladm12
Collaborator

miladm12 commented Jun 4, 2016

Yeah, I guess that makes sense for now. Personally it has a lot of use when we want to do learning and don't care that much about the precision of the value, but just want a reasonable policy as soon as possible. But generally, it should not be a factor for stopping sarsop. That's what I meant would be useful.
