Reachability Analysis #669
Replies: 1 comment 2 replies
-
I'll contribute a little practical information about how the compiler implementation plays into any algorithm we might select. Compilation presently uses the worklist to pass program elements between four phases: Gathering information can happen in three ways:
During It would be beneficial (not only to reachability analysis but also to escape analysis, dead code elimination, inlining, memory model optimizations, etc.) to provide more general facilities for tracking at least the following information:
This could still be done via plugin but it'd be just one plugin that has no dependencies and simply tracks the raw information, so that other plugins could depend on that one including the call analysis plugin(s). The other suggestion I wanted to make is that we can run almost any given analysis after more than one phase, and thereby get a better result. Any analysis that may result in a reduction of the program size is not only beneficial for that reason but also for the practical reason of speeding up the LLVM stage of compilation. Finally I wanted to mention that there's a cleaner version of "Fast Interprocedural Class Analysis" here (it's already on the big reading list). |
Beta Was this translation helpful? Give feedback.
-
Problem Statement
Starting from a main class and perhaps an optional list of additional entry points (for example methods invoked reflectively), we need to determine the reachable program entities (types, methods, fields, etc.) that should be included in the final native executable.
General Approach
The general approach to solving this problem is to optimistically assume nothing is reachable except the entrypoints and then iteratively add new entities as we discover they are reachable. Eventually this process reaches a fix-point (nothing else is added). This worklist-driven approach to discovering the reachable program is builtin to qbicc's top-level compilation driver. A key piece of this reachability analysis is processing reachable dynamic method invocations (
invokeinterface
,invokevirtual
,invokedynamic
), which may invoke multiple potential callee methods and where the possible callees is also dependent on the runtime values that reach the call site, and deciding what new methods the processed callsite makes reachable. There have been hundreds of research papers on this general topic (control flow analysis, call graph construction, class analysis, pointer analysis, etc).Suitable Call Graph Discovery Algorithms for Qbicc
Although there may be scenarios in which the user is willing to spend significant compile time producing a highly optimized native executable, we expect that the majority of compilations will need to use relatively inexpensive call graph construction algorithms. In particular, we are looking for algorithms that have time and space complexities that are roughly linear as a function of program size (and since all Java programs include some part of the JDK, even trivial programs are fairly large).
I've attached a copy of "Fast Interprocedural Class Analysis" from POPL'98. FastInterproceduralClassAnalysis-POPL98.pdf We tried to systematically look at some of the design possibilities for this part of the algorithm space.
Beta Was this translation helpful? Give feedback.
All reactions