Skip to content

Conversation

@martin-cs
Copy link
Collaborator

In the past users have had to modify the program before analysis to get this kind of precision. Doing so is expensive in terms of memory for both the program and the analysis. This gives a more efficient route. The is an equivalent of an unwinding limit that controls the number of histories that can be generated for each location. This is a better measure as it should scale memory linearly and time linearly(ish) while loop unwinding can have seriously non-linear effects when there are nested loops.

This is the enabling technology for a "loop unwinding" analysis that can automatically over-approximate loop bounds. It is also probably the last "core" change to the AI infrastructure before VSD can be merged in full.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

@codecov
Copy link

codecov bot commented Jul 2, 2020

Codecov Report

Merging #5404 into develop will increase coverage by 0.01%.
The diff coverage is 80.43%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #5404      +/-   ##
===========================================
+ Coverage    68.23%   68.25%   +0.01%     
===========================================
  Files         1178     1180       +2     
  Lines        97588    97720     +132     
===========================================
+ Hits         66589    66695     +106     
- Misses       30999    31025      +26     
Flag Coverage Δ
#cproversmt2 42.80% <71.42%> (+<0.01%) ⬆️
#regression 65.42% <80.43%> (+0.02%) ⬆️
#unit 32.23% <71.42%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/analyses/ai.h 57.27% <0.00%> (-0.53%) ⬇️
src/analyses/call_stack_history.h 89.47% <ø> (ø)
src/goto-analyzer/goto_analyzer_parse_options.h 100.00% <ø> (ø)
src/analyses/local_control_flow_history.cpp 70.27% <70.27%> (ø)
src/analyses/local_control_flow_history.h 89.65% <89.65%> (ø)
src/analyses/ai.cpp 73.42% <100.00%> (-0.19%) ⬇️
src/analyses/ai_history.cpp 15.38% <100.00%> (+15.38%) ⬆️
src/analyses/ai_history.h 75.86% <100.00%> (ø)
src/analyses/call_stack_history.cpp 70.37% <100.00%> (+1.13%) ⬆️
src/goto-analyzer/goto_analyzer_parse_options.cpp 69.78% <100.00%> (+2.46%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7d30335...e30da66. Read the comment docs.

@martin-cs martin-cs force-pushed the feature/ai-local-history branch from ff6e64c to 9498bfd Compare July 3, 2020 20:08
@martin-cs
Copy link
Collaborator Author

Please can someone with access to the AWS CodeBuild outputs let me know what the problem is? Last time it was @chrisr-diffblue or @thk123 who could do this...

@chrisr-diffblue
Copy link
Contributor

Looks like all of the failed AWS codebuild jobs failed really early due to infrastructure problems (failed to apt-get install dependencies). I've restarted all those failed jobs.

@martin-cs
Copy link
Collaborator Author

Thanks @chrisr-diffblue they are listed as "failed" rather than "pending" for me at the moment. I'll give them a minute...

@martin-cs
Copy link
Collaborator Author

These still look failed to me, plus the code coverage numbers are kinda wacky.

@chrisr-diffblue
Copy link
Contributor

Unfortunately, it looks like although the restarted jobs seem to have passed, they haven't posted status back to github, sigh. You might have to go for the rebase/dummy commit + force push to make github start new jobs.

@smowton
Copy link
Contributor

smowton commented Jul 7, 2020

@chrisr-diffblue IIRC updating Github status after a rerun is a Codebuild option, @rjmunro knows more

@rjmunro
Copy link
Contributor

rjmunro commented Jul 7, 2020

@smowton @chrisr-diffblue Yes, there is an option "Report build statuses to source provider when your builds start and finish" in the "Edit" -> "Source" menu, hidden in "Additional configuration (Git clone depth, Git submodules)" (even though it is neither of those things). Without this, codebuild will report status to github only where the build was triggered by a github webhook. I can see it's not turned on, but I'd rather not change the setting myself as I'm not involved in this project. I'm happy to do it if someone officially asks me to.

Finding that setting was a huge win for us because it let us trigger codebuild jobs from other codebuild jobs, rather than from webhooks. This meant we could build a CI pipeline.

@thk123
Copy link
Contributor

thk123 commented Jul 9, 2020

@rjmunro I'm happy to do it - any reason to not have this on? Seems like a no-brainer...

@rjmunro
Copy link
Contributor

rjmunro commented Jul 13, 2020

@thk123 The guess the main reason not to have it on would be if you were testing your codebuild config and didn't want failures to block PRs or something, but in general, I would turn it on.

@martin-cs
Copy link
Collaborator Author

Thanks for the help all but ... I still can't see what is actually causing the failures. I'd really like to get this PR moving and I'm keen to fix the issues but without the info...

@thk123
Copy link
Contributor

thk123 commented Jul 16, 2020

@martin-cs as Chris said - the failures seem spurious - you should just be able to rebase this and push to retrigger the builds. I'll try and get the updated status thing working today.

@martin-cs martin-cs force-pushed the feature/ai-local-history branch from 9498bfd to 3407f45 Compare July 16, 2020 09:59
@martin-cs
Copy link
Collaborator Author

Thank you @thk123 that seems to have done the trick.

Do any of the maintainers have time for a review of this PR?

Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly lgtm, some requests for clarity, particularly clarifying either by naming a pass-through function or using a named nullptr when histories are being omitted because they're irrelevant, and when they're being omitted due to a strategic choice to ignore history and merge all states.

p_call,
calling_function_id,
l_return,
nullptr, // Not needed as we are skipping the function call
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and other nullptr args are pretty opaque -- how about #define CALLER_UNSPECIFIED nullptr?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really agree with using a define for this. How about constexpr whatever* UnspecifiedCaller = nullptr;?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to ai_historyt as a member variable as @hannes-steffenhagen-diffblue suggests.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

( Unfortunately constexpr doesn't actually work because ai_history_baset has a virtual destructor but a static const variable does. )

p_call,
callee_function_id,
l_begin,
nullptr,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this one not given?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it is only used on function call return edges, this is a call, so p_call already has the caller info.


auto step_return = p->step(n, *(storage->abstract_traces_before(n)));
auto step_return =
p->step(n, *(storage->abstract_traces_before(n)), nullptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the nullptr here imply we're conflating some states that we could potentially distinguish?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question but in this case no. The step is local so the history is not needed. I have added a comment to this effect.

( There is a separate conversation about whether such an interface is even meaningful w.r.t. function calls and goto instructions. )

auto tmp = std::dynamic_pointer_cast<const local_control_flow_historyt<
track_forward_jumps,
track_backward_jumps>>(in);
PRECONDITION(tmp != nullptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

POSTCONDITION?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is a precondition of down_cast that this conversion works. To quote the comment above caller's responsibility to ensure correct types.

std::shared_ptr<const local_control_flow_historyt<
track_forward_jumps,
track_backward_jumps>>
down_cast(trace_ptrt in) const
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to_local_control_flow_history?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

bool local_control_flow_decisiont::
operator<(const local_control_flow_decisiont &op) const
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you intentionally want more informative orderings to be lower in the sort order? If so say so up top

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ummm... I am not entirely sure I have an idea of what constitutes informative in this context. Where the choice of ordering matters (both in this and in local_control_flow_historyt<track_forward_jumps, track_backward_jumps>:: operator<) it is commented and in one case checked by assertion (if there is a merge history it should sort last). Specifically this includes:

// Lower locations first, generally this means depth first
// Branch not taken takes priority
// This prioritises complete histories over merged ones

return false;
}

// Instantiate the versions we need
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being all possible versions :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That too! (although now I think about it maybe the version that doesn't track either isn't actually used... )

}
else if(cmdline.isset("loop-unwind"))
{
options.set_option("local-control-flow-history", true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considering there's some trickiness here, perhaps print a line describing the options you've inferred from the higher-level options provided

Copy link
Collaborator Author

@martin-cs martin-cs Aug 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The options that are set are internal only and are intended to make the handling of these options simpler. I am reluctant to print anything here because the user can't set any of these options directly.

"local-control-flow-history-limit", cmdline.get_value("loop-unwind"));
options.set_option("history set", true);
}
else if(cmdline.isset("branching"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The else-ifs here imply --loop-unwind --branching won't work as expected, right? If so check for this case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct and in part why the combined option is known as --loop-unwind-and-branching. I am reluctant to address this issue in this PR because it affects almost all options to goto-analyze`, all of the major options have an implicit ordering and check for absence but not duplication.

trace_ptrt p,
const irep_idt &to_function_id,
locationt to_l,
trace_ptrt caller_history,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do have a pretty strong preference for not using nullptr to indicate optional arguments – this isn't obvious from the signature, and frequently makes the logic inside more complicated if you need to disambiguate between pointers you're allowed to dereference and ones you aren't...

Also, are these trace_ptrts all shared_ptr? In that case it'd seem a bit wasteful to me, unless these are stored somewhere (I can't quite see it, maybe somewhere further down in the call stack)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have renamed as you suggested earlier. I hope the invariants in ai_history.h show when the optional argument is present. step kind of has to be a case switch because of the different types of instruction.

Yes, they are all shared_ptrs. They are stored in something that implements the ai_storaget interface, allowing you to store states and histories with varying degrees of sensitivity (per history, per location, soon per function and just once). It's not great but I had a solution with unique_ptr but it involved way too many copies.

@martin-cs martin-cs force-pushed the feature/ai-local-history branch from 3407f45 to afd7980 Compare August 3, 2020 16:09
@martin-cs
Copy link
Collaborator Author

@smowton and @hannes-steffenhagen-diffblue I have responded to and (mostly) implemented your review comments. They are two separate commits at the moment for simplicity. If you are happy I will squash them into the relevant earlier commits.

@martin-cs martin-cs force-pushed the feature/ai-local-history branch from afd7980 to 1f36c23 Compare August 4, 2020 10:04
@martin-cs
Copy link
Collaborator Author

@smowton and @hannes-steffenhagen-diffblue sorry to hassle but please can you have a look at the changes I have made and see if they address the issues you raised?

@smowton
Copy link
Contributor

smowton commented Aug 11, 2020

I stand by my green tick :) (meaning I'm happy with this as-is but have suggestions)

@smowton
Copy link
Contributor

smowton commented Aug 11, 2020

Apologies for slowness btw, was on holiday the past week

@martin-cs
Copy link
Collaborator Author

@smowton Thanks! Hope you had a nice holiday and sorry for hassling.

@hannes-steffenhagen-diffblue @danpoe @peterschrammel @chrisr-diffblue Any chance you could help with a second green tick? This is kind of blocking for me and the next PR is something that might make your lives easier...

Copy link
Contributor

@chrisr-diffblue chrisr-diffblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of extremely minor nitpicks, but nothing blocking - apologies for the massive delay in me getting around to reviewing....

martin added 4 commits August 12, 2020 17:19
This is similar to a three-way merge in that it allows histories to
not only use their state and histories in the next location but also
to access the history at the call site.  Doing so makes it possible to
create "function local" history where all call-sites merge and drop
their previous history but it is picked back up on function return.
This is needed for managing scalability in some of the more involved
histories.
Track every forwards and / or backwards jump within a function.  This
can give the same effect as loop unrolling without modifying the
program.  It can also give lazy merging of paths through if statements.
For performance reasons it is not interprocedural.
These allow the user to track just forwards, just backwards or both
kinds of jump.  This gives per-path analysis (up to the limit of number
of times visiting each location), loop unwinding or both.
@martin-cs martin-cs force-pushed the feature/ai-local-history branch from 1f36c23 to e30da66 Compare August 12, 2020 16:19
@martin-cs
Copy link
Collaborator Author

I have addressed @chrisr-diffblue comments ( thanks for the review and no worries about the lag ) and squashed the changes in. I will merge once CI has passed, @hannes-steffenhagen-diffblue shout now if you aren't happy with how your comments have been resolved.

@martin-cs martin-cs merged commit 2bd790d into diffblue:develop Aug 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants