Many improvements to make CFGFast fast again. #1092
This series of changes was inspired by the test binary used in #1075 -- the binary was provided by @KevOrr, thanks! I realized that angr's CFG recovery (CFGFast, I mean) was far too slow to run on a 35-MB blob. I then did some intensive profiling and made quite a few improvements in angr, CLE, and PyVEX.
To give you a sense of what sort of improvement I am referring to, the following is the benchmark I've been using throughout the past three days, and the results of running CFGFast for the first 6% of code.
Here is an incomplete list of things that have been changed (improved, hopefully):
Thanks @rhelmot for going through much insanity to commit her code as me ;)
See angr/angr#1092 for more details.

* The initial commit.
* Redo the lifting logic to avoid redundant IRSB copying.
* Move get_defaultexit_target() from Python to C to speed things up.
* Add a check to make sure LibVEX_Lift() does not return NULL.
* Implement IRSB.instruction_addresses.
* Implement IRSB.has_statements.
* Fix IRSB.addr.
* Fix a NULL-deref in pyvex.c.
* Postprocessor: Do not remove NoOp statements. Otherwise it will cause a mismatch between statement indices and the indices in IRSB.exit_statements (which are generated in PyVEX C).
* FixesPostProcessor: Get the IRSB address correctly.
* Remove NoOp statements in the C world.
* Restore the IRSB size calculation.
* Make sure exit_statements is not None before accessing.
* Enable tests for PyVEX itself.
* Implement ARM call jumpkind fixer in C for a better performance.
* Implement MIPS32 post-processing in C world.
* Add a missing undefine.
* Fix test.py.
* Add stddef.h to pyvex.c.
* Implement data reference collection in pyvex_c.
* Lint the code.
* Expose PyVEXError.
* More linting.
* oops
* Expose get_type_size and get_type_spec_size.
* Expose IRTypeEnv.
* Postprocessor: Support NeedStatementsNotification.
* Reorganize lifting/__init__.py.
* Remove the IRSB.addr property, replace with raw attribute
* Make check against MAX_DATA_REFS more explicit
* Move from data ref tuples to a DataRef class
* Lint block.py: Define data_refs and _instruction_addresses in __init__
* Make pyvex.lift positional arguments match pyvex.IRSB
* The kosher way to do this is .lift()
* Split pyvex.c into several smaller C files (committing as fish to preserve authorship)
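
The NoOp item in the commit list describes an index-invalidation problem: exit-statement indices are computed in one place, and a later pass that deletes statements shifts everything after the deleted entries. A minimal sketch of that failure mode follows. It uses plain Python lists and strings as stand-ins for an IRSB's statements; the function names are hypothetical and none of this is PyVEX's actual API.

```python
# Hypothetical stand-ins: plain strings instead of real VEX statement objects.
statements = ["IMark", "NoOp", "Exit", "WrTmp", "NoOp", "Exit"]


def find_exit_indices(stmts):
    """Record the position of every exit statement in the given list."""
    return [i for i, s in enumerate(stmts) if s == "Exit"]


def strip_noops(stmts):
    """Return a copy of the statement list with NoOp entries removed."""
    return [s for s in stmts if s != "NoOp"]


def lookup(stmts, indices):
    """Fetch the statement at each index, or None if the index is out of range."""
    return [stmts[i] if i < len(stmts) else None for i in indices]


# Indices are computed first (in the PR, this step happens on the C side).
exit_indices = find_exit_indices(statements)

# A later pass removes NoOps without updating the recorded indices.
stripped = strip_noops(statements)

# The stale indices now point at the wrong statement or past the end.
stale = lookup(stripped, exit_indices)

# Recomputing after stripping (or stripping before indexing) stays consistent.
fresh = lookup(stripped, find_exit_indices(stripped))

print(exit_indices, stale, fresh)
```

Under these assumptions, either keeping the NoOps in place or removing them in the same layer that computes the indices avoids the mismatch, which is consistent with the "Do not remove NoOp statements" and "Remove NoOp statements in the C world" items above.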