
Parametric/exact Rmin: wrong result #15

Closed
kleinj opened this issue May 15, 2017 · 1 comment

kleinj commented May 15, 2017

At https://groups.google.com/d/topic/prismmodelchecker/vQI-1srXSdk/discussion, Shalini Ghosh reported a small MDP with a strange result for an Rmin formula in the parametric engine. The model is attached.

The problematic formula is Rmin=?[ F "changed" ].

prism lanechange_mdp.prism -pf 'Rmin=?[F "changed" ]' --param p=0:1
Result (minimum expected reward): ([0.0,1.0]): { 1  | 2  }

prism lanechange_mdp.prism -pf 'Rmin=?[F "changed" ]' --exact --const p=0.626
Result (minimum expected reward): 1/2

prism lanechange_mdp_simple_parametric.prism -pf 'Rmin=?[F "changed" ]' --const p=0.626
Result: 1.002137421843802 (value in the initial state)

Manually restricting the MDP to the DTMC induced by the minimizing strategy yields the expected results:

prism lanechange_dtmc.prism -pf 'R=?[F "changed" ]' --exact --param p=0:1
Result (expected reward): ([0.0,1.0]): { 2 p - 5  | 10 p - 10  }

prism lanechange_dtmc.prism -pf 'R=?[F "changed" ]' --exact --const p=0.626
Result (expected reward): 937/935

prism lanechange_dtmc.prism -pf 'R=?[F "changed" ]' --const p=0.626
Result: 1.0021390374331551 (value in the initial state)
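As a sanity check (my own, not part of the original report), the parametric solution for the DTMC can be evaluated with exact rational arithmetic; the parametric, exact, and floating-point results above are mutually consistent:

```python
from fractions import Fraction

# Parametric solution reported for the DTMC: (2p - 5) / (10p - 10)
p = Fraction(626, 1000)              # p = 0.626 exactly
r = (2 * p - 5) / (10 * p - 10)

assert r == Fraction(937, 935)       # matches the --exact result
assert abs(float(r) - 1.0021390374331551) < 1e-12  # matches the numeric result
```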

Some preliminary debugging suggests that the precomputation step for Rmin in the parametric/exact engine is the most likely culprit.

lanechange_mdp.prism.txt
lanechange_dtmc.prism.txt

kleinj added a commit to kleinj/prism-svn that referenced this issue Aug 15, 2017
…olicy iteration

To ensure that the policy iteration (performed on an MDP where the parameters have been replaced
by a parameter valuation) converges, and converges to the right result, we initialise the
policy iteration for Rmin[F] with a proper scheduler, i.e., one that reaches the goal
with probability one (see [BertsekasTsitsiklis91]).

Together with the previous commit, this fixes issue
prismmodelchecker/prism#15.

As there is no check against negative rewards, those remain problematic.
kleinj added a commit to kleinj/prism that referenced this issue Aug 28, 2017
…olicy iteration

To ensure that the policy iteration (performed on an MDP where the parameters have been replaced
by a parameter valuation) converges, and converges to the right result in the presence of
zero-reward end components, we initialise the policy iteration for Rmin[F] with a
proper scheduler, i.e., one that reaches the goal with probability one (see [BertsekasTsitsiklis91]).

Together with the previous commit, this fixes prismmodelchecker#4 and prismmodelchecker#15.
davexparker pushed a commit that referenced this issue Aug 31, 2017
…olicy iteration

To ensure that the policy iteration (performed on an MDP where the parameters have been replaced
by a parameter valuation) converges, and converges to the right result in the presence of
zero-reward end components, we initialise the policy iteration for Rmin[F] with a
proper scheduler, i.e., one that reaches the goal with probability one (see [BertsekasTsitsiklis91]).

Together with the previous commit, this fixes #4 and #15.
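The failure mode fixed by this commit can be illustrated with a toy stochastic shortest-path problem (a hypothetical minimal sketch, not PRISM code): a single state with a zero-reward self-loop and a reward-1 transition to the goal. Policy iteration started from the improper self-loop policy evaluates to 0 and never leaves it, whereas starting from a proper scheduler reaches the correct minimum expected reward of 1:

```python
# Toy stochastic shortest path (hypothetical, for illustration only).
# State s has two actions:
#   'loop': stay in s with probability 1, reward 0  -> a zero-reward end component
#   'go'  : move to the goal with probability 1, reward 1
# The only proper scheduler is 'go', so the correct Rmin[F goal] is 1.

def evaluate(policy):
    """Expected accumulated reward until the goal under a deterministic policy."""
    return 1.0 if policy == 'go' else 0.0   # 'loop' never reaches the goal, accumulates 0

def improve(v, current):
    """One-step greedy improvement; ties are broken toward the current policy."""
    q = {'go': 1.0 + 0.0, 'loop': 0.0 + v}  # immediate reward + value of successor
    return current if q[current] == min(q.values()) else min(q, key=q.get)

# Started from the improper policy, the evaluated value (0) looks optimal,
# so policy iteration terminates immediately at the wrong answer:
assert improve(evaluate('loop'), 'loop') == 'loop'

# Started from a proper policy, the correct fixed point is reached:
assert improve(evaluate('go'), 'go') == 'go'
assert evaluate('go') == 1.0
```

Nothing in the improper fixed point signals that the goal is unreachable, which is why a finiteness/reachability check (or a proper initial scheduler, as in the commit above) is needed rather than relying on convergence alone.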
@davexparker

Fixed by the commit referenced above, now merged.
