Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved MaxForks by relying on cached slns instead of SAT calls #203

Closed
wants to merge 2 commits into from

Conversation

holycrap872
Copy link
Contributor

This commit optimizes programs whose execution is halted by the
MaxForks option. Once a particular number of forks is hit, the
execution stops forking. It does not, however, stop exploring the
program. Instead each "dangling" state is pushed to completion to
make sure no assertion violations or errors lie along the path.

Previously, the implementation had used SAT calls to test the validity
of a branch. If both sides were possible, it would randomly pick one
side to push the execution down.

The optimization works by calculating a solution to the dangling state
and following that solution through to completion. This is comparable
to generating a test case for the dangling state in klee, exiting klee,
and running the test case as you would any other input. This means
that any problems that lie further down the path will be uncovered. By
avoiding the sat calls and instead relying on simply plugging a
solution into a state and seeing if it's T/F, the program completes
much more quickly.

@holycrap872 holycrap872 force-pushed the MaxForksFix branch 3 times, most recently from 4842e3d to 6b3b92b Compare February 22, 2015 03:36
@holycrap872 holycrap872 force-pushed the MaxForksFix branch 2 times, most recently from 1790f3a to 489ffa6 Compare March 4, 2015 21:35
@ccadar
Copy link
Contributor

ccadar commented Mar 4, 2015

I agree this is a much nicer behaviour. @paulmar, could you take a look at this patch, since it's related to ZESTI? Thanks!

@paulmar
Copy link
Contributor

paulmar commented Mar 4, 2015

At a high level, I'm wondering whether this can be implemented more elegantly by reusing the seeding mechanism and enabling OnlyReplaySeeds.

A few comments on the code:

I didn't quite get the reasoning in handling Switch instructions. Is getting a successor BasicBlock* such an expensive operation to warrant the code duplication? I would think not.

Also, the new behaviour should only be enabled when forking is permanently disabled until the end of the execution. Otherwise a state with an assignment may be later forked, potentially making its constraints and assignment inconsistent.

Is there a reason for changing the -max-forks default from -1 to 0?

@holycrap872
Copy link
Contributor Author

In response to your questions in order

Using OnlyReplaySeeds:
I don't have a ton of experience with the seeding part of Klee, but it seems like it would be the same basic idea. I would think of it as 1) Processing any seeds the user gives you, 2) Executing the exploration phase, 3) Generating "seeds" based on the dangling stats that in still in the program after hitting the termination condition, 4) Pushing those seeds all the way to the end.
If that's the basic idea, I can look into it and try and reuse the infrastructure that's already there.

The SwitchStatements:
I always just try to avoid creating objects that will be immediately thrown away. I honestly didn't calculate how quickly the Blocks were created/destroyed. If you think the current form is needlessly complicated, I can definitely change it back to the more simplistic form.

Disabled Forking:
I'm not sure I understand your point here.

Changing form -1 to 0:
I saw a lot of options using the 0 behavior, so I figured I'd change it so it was more consistent with the other command line options. Now that I think about it, I guess people may way a 0 fork behavior so I can change it back.

@ccadar
Copy link
Contributor

ccadar commented Mar 14, 2015

Hi @holycrap872 , I believe Paul's point that you found unclear was that you should only do this when the maximum number of forks is reached, and not when the memory limit is reached. When the memory limit is reached, forking might only be disabled temporarily, until more memory becomes available.

I would also rather keep the code simple and not duplicate that code; I also feel that the optimization in question is not worth it, but I guess this could be measured.

And yes, let's stick with -1 as the default option.

Finally, I would remove the comment about "The optimal solution would be..." I don't think that's the optimal solution, because the test case that you generate might not necessarily expose all potential bugs on that path.

This commit optimizes programs whose execution is halted by the
MaxForks option.  Once a particular number of forks is hit, the
execution stops forking.  It does not, however, stop exploring the
program. Instead each "dangling" state is pushed to completion to
make sure no assertion violations or errors lie along the path.

Previously, the implementation had used SAT calls to test the validity
of a branch.  If both sides were possible, it would randomly pick one
side to push the execution down.

The optimization works by calculating a solution to the dangling state
and following that solution through to completion.  This is comparable
to generating a test case for the dangling state in klee, exiting klee,
and running the test case as you would any other input.  This means
that any problems that lie further down the path will be uncovered. By
avoiding the sat calls and instead relying on simply plugging a
solution into a state and seeing if it is T/F, the program completes
much more quickly.
The max-forks-terminate option kills exploration after reaching a
certain number of forks.  Klee then begins to dump states and wrap up
execution.  This metric is similar to the current max-forks commandline
option.  The main difference is that the dangling states are not pushed
through the rest of the pogram.

The anticipated use case for this option is to terminate after a
certain number of forks and then create tests for each of the dangling
states.  Each of these tests will then be executed normally, rather
than inside klee, to make sure there are no assertion violations or
errors.  This is likely MUCH faster than finishing each path within
Klee (like max-forks does).
@holycrap872
Copy link
Contributor Author

@ccadar , I just updated the pull request according to your suggestions.

I think I misstated what exactly was going on with the "duplicate" code (now line 1618). What it is intended to do is avoid calling solver->mayBeTrue() whenever we encounter a switch statement AND we are over the number of MaxForks.

In the old implementation, ALL cases would be tested with a call to solver->mayBeTrue() even if we had exceeded the number of forks. This is obviously a waste since only ONE of the cases would eventually be followed. The new implementation first tests whether we have exceeded the max number of forks. If so, all the cases in the switch statement are created and simply tested against the cached solution (the cached solution will satisfy one, and only one of the cases). This means the creation of the cases results in some wasted work, but we avoid multiple calls to solver->mayBeTrue(). We instead rely on the cached solution to choose one of the SAT cases to follow.

If, instead, we haven't reached the max number of forks, the old behavior is followed where we test each case for satisfiability (solver->mayBeTrue()), and create all the cases that could are SAT.

@paulmar
Copy link
Contributor

paulmar commented Mar 17, 2015

I believe the original comment about generating assignments when forking is temporary disabled still applies. If forking is disabled and an assignment is generated, then enabled and the state is forked, the assignment may not be in sync with the newly added constraint.

@holycrap872
Copy link
Contributor Author

@paulmar , I thought I had changed all of the guarding if statements to only include the case where the number of max forks was exceeded (previously I had included all the other stuff in each if state me by.) Other than that, I'm not sure how the situation you describe would be reached.

@paulmar
Copy link
Contributor

paulmar commented Mar 18, 2015

@holycrap872
Copy link
Contributor Author

I took that code from the original implementation. Did moving it to it's new location open up the possibility for errors or should it just not have been there at all?

@LebronX
Copy link

LebronX commented Jul 1, 2020

Hi, although this PR have already existed for 5 years, I would like to revive and fix it.

@paulmar I think the problem happened in https://github.com/holycrap872/klee/blob/c547d5d7eba08da04e6a28975c2db835f93cea59/lib/Core/Executor.cpp#L778-781
is that in the if condition there should only (MaxForks!=~0u && stats::forks >= MaxForks) but exclude the others. Am I understand correct?

@holycrap872 Could I create a new PR based on your change? Because I don't have the write access to this PR.

@holycrap872
Copy link
Contributor Author

@LebronX, go for it. It's been a long time since I've touched this PR or Klee, but let me know if I can help.

@ccadar
Copy link
Contributor

ccadar commented Jul 1, 2020

Thanks, both. This PR unfortunately fell through the cracks, but I still think it's a nice contribution and since it doesn't impact the core of KLEE, I think we could incorporate it pretty soon if there is renewed interest.

And yes, the code should only impact the case where the max-forks is exceeded (in the current version it also kicks in in other cases too).

@LebronX LebronX mentioned this pull request Jul 2, 2020
@ccadar
Copy link
Contributor

ccadar commented Aug 21, 2020

Closing this, superseded by #1285

@ccadar ccadar closed this Aug 21, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants