Many improvements to make CFGFast fast again. #1092
Merged
ltfish added the enhancement (Some subsystem of angr needs tweaking) and refactor (Something needs to be reorganized) labels on Jun 27, 2018
ltfish changed the title from "Many adjustments to make CFGFast fast again." to "[WIP] Many adjustments to make CFGFast fast again." on Jun 27, 2018
ltfish force-pushed the re/faster_cfgfast branch 2 times, most recently from 07c3567 to 481d6e3 on July 4, 2018 06:58
This was referenced Jul 7, 2018
Closed
ltfish force-pushed the re/faster_cfgfast branch from 1601c28 to 3846f92 on July 16, 2018 18:56
ltfish force-pushed the re/faster_cfgfast branch from 982a4cd to 2801bd7 on July 17, 2018 12:28
ltfish changed the title from "[WIP] Many adjustments to make CFGFast fast again." to "Many improvements to make CFGFast fast again." on Jul 17, 2018
ltfish added a commit to angr/pyvex that referenced this pull request on Jul 26, 2018
See angr/angr#1092 for more details.
* The initial commit.
* Redo the lifting logic to avoid redundant IRSB copying.
* Move get_defaultexit_target() from Python to C to speed things up.
* Add a check to make sure LibVEX_Lift() does not return NULL.
* Implement IRSB.instruction_addresses.
* Implement IRSB.has_statements.
* Fix IRSB.addr.
* Fix a NULL-deref in pyvex.c.
* Postprocessor: Do not remove NoOp statements. Otherwise it will cause a mismatch between statement indices and the indices in IRSB.exit_statements (which are generated in PyVEX C).
* FixesPostProcessor: Get the IRSB address correctly.
* Remove NoOp statements in the C world.
* Restore the IRSB size calculation.
* Make sure exit_statements is not None before accessing.
* Enable tests for PyVEX itself.
* Implement ARM call jumpkind fixer in C for a better performance.
* Implement MIPS32 post-processing in C world.
* Add a missing undefine.
* Fix test.py.
* Add stddef.h to pyvex.c.
* Implement data reference collection in pyvex_c.
* Lint the code.
* Expose PyVEXError.
* More linting.
* oops
* Expose get_type_size and get_type_spec_size.
* Expose IRTypeEnv.
* Postprocessor: Support NeedStatementsNotification.
* Reorganize lifting/__init__.py.
* Remove the IRSB.addr property, replace with raw attribute
* Make check against MAX_DATA_REFS more explicit
* Move from data ref tuples to a DataRef class
* Lint block.py: Define data_refs and _instruction_addresses in __init__
* Make pyvex.lift positional arguments match pyvex.IRSB
* The kosher way to do this is .lift()
* Split pyvex.c into several smaller C files (committing as fish to preserve authorship)
This series of development work is inspired by the testing binary used in #1075 -- the binary is provided by @KevOrr, thanks! I realized that angr's CFG recovery (CFGFast, I mean) was obviously too slow to run on a 35-MB blob. I then did some intensive profiling and made quite a few improvements in angr, CLE, and PyVEX.
To give you a sense of what sort of improvement I am referring to, the following is the benchmark I've been using throughout the past three days, and the results of running CFGFast for the first 6% of code.
block.statements are made on-demand: 228.2 sec (over 3 minutes).

Here is an incomplete list of things that have been changed (improved, hopefully):

- Block.vex_nostmt so that we can get a VEX IRSB without its statements. In CFGFast, we default to using statement-free IRSBs, unless the block contains an indirect jump.
- irsb.exit_statements
- .to_snippets() no longer needs to lift/re-lift blocks.
- Function.transition_graph) are no longer added immediately after the source node is traversed. The addition of these edges is delayed until the destination nodes are traversed (CFGJob has a new property func_edges).
- ._changed_functions are renamed to ._updated_nonreturning_functions. If a function is already deemed as returning, it will not be added to this set any more.

Thanks @rhelmot for going through much insanity for committing her code as me ;)
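For readers who want a picture of the delayed edge insertion mentioned in the list above, here is a minimal, self-contained sketch. It is not angr's code: the Job class, the recover_edges function, and the successors mapping are all hypothetical names made up for illustration. The only idea borrowed from the description is that each job carries the edges that point at it (a func_edges list) and those edges are committed when the destination is traversed, not when the source is.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical stand-in for a CFG job; this is not angr's CFGJob."""
    addr: int
    # Edges (src, dst) to commit once this job's block is traversed.
    func_edges: list = field(default_factory=list)


def recover_edges(successors, entry):
    """Traverse from `entry` and return edges in the order they are committed.

    `successors` is an assumed mapping from a block address to the
    addresses it can transfer control to.
    """
    committed = []
    visited = set()
    queue = deque([Job(entry)])
    while queue:
        job = queue.popleft()
        # The destination is being traversed now, so commit its deferred edges.
        committed.extend(job.func_edges)
        if job.addr in visited:
            continue
        visited.add(job.addr)
        for dst in successors.get(job.addr, ()):
            # Do not touch the graph yet; attach the edge to the successor job.
            queue.append(Job(dst, [(job.addr, dst)]))
    return committed


if __name__ == "__main__":
    toy = {0x1000: [0x1010, 0x1020], 0x1010: [0x1020], 0x1020: []}
    for src, dst in recover_edges(toy, 0x1000):
        print(hex(src), "->", hex(dst))
```

In this toy version an edge is only recorded after its destination job is popped from the queue, so a job that is dropped before being processed never contributes an edge.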