Introduce semantic layer to prepare to share range analysis #7986
We'd like to share Java's range analysis with C++ (and Swift, C#, and maybe Go). Range analysis has several other interesting analyses on which it depends, including sign analysis, modulus analysis, and a little bit of constant analysis and nullness analysis. Sign analysis and modulus analysis are already shared with C#, but everything else is still Java-specific.
The first step to sharing all of this with other languages is to get the Java-specific portion separated from the sharable portion. After separate conversations with @aschackmull and @rdmarsh2, it seemed like a good time to consider a language-neutral interface that would allow all of these semantic analyses to work with multiple languages without having to adapt each {language, analysis} pair one at a time.
This PR factors out all of the dependencies that Java's range analysis had on `import java`. In their place, I've introduced a few modules under `semmle.code.java.semantic.*`:
- `SemanticExpr` - Currently just directly wraps Java's `Expr` class, with the minimal set of subclasses and member predicates to make range analysis work. We'll need to think more about the right interface to expose here, such that it can be implemented relatively easily for each language.
- `SemanticSSA` - Wraps a small subset of Java's SSA library, plus the `SsaReadPosition` stuff that was previously internal to range analysis.
- `SemanticCFG` - Wraps a small subset of Java's CFG library.
- `SemanticGuard` - Wraps a small subset of Java's guards library.
- `SemanticType` - Unlike the wrappers above, this is a separate concrete type system, populated from Java's type system. The interface is basically cut and pasted from what we've already been using as the IR's type system. Key differences from the Java type system include:
I'm not claiming that any of the above semantic interfaces are what we should wind up with, but they're a good starting point because they show what we actually use today.
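To picture the layering, here is a rough analogy in plain Java rather than QL. It is only a sketch: every name in it (`SemExpr`, `JavaAdapter`, `CppAdapter`, `SignAnalysis`, and the stand-in AST classes) is hypothetical and is not part of this PR or of any CodeQL library. The idea it illustrates is that each language supplies one thin adapter to a shared interface, and each analysis is written once against that interface, instead of once per {language, analysis} pair.

```java
import java.util.OptionalLong;

/** Hypothetical language-neutral view of an expression (illustrative only). */
interface SemExpr {
    /** The expression's integer value, if it is a known constant. */
    OptionalLong constantValue();
}

/** Stand-in for a Java-specific AST node. */
final class JavaIntLiteral {
    final int value;
    JavaIntLiteral(int value) { this.value = value; }
}

/** Stand-in for a C++-specific AST node whose value may be unknown. */
final class CppExpr {
    final Long foldedValue; // null when the expression is not a constant
    CppExpr(Long foldedValue) { this.foldedValue = foldedValue; }
}

/** One thin adapter per language exposes its nodes through the shared interface. */
final class JavaAdapter implements SemExpr {
    private final JavaIntLiteral lit;
    JavaAdapter(JavaIntLiteral lit) { this.lit = lit; }
    @Override public OptionalLong constantValue() { return OptionalLong.of(lit.value); }
}

final class CppAdapter implements SemExpr {
    private final CppExpr expr;
    CppAdapter(CppExpr expr) { this.expr = expr; }
    @Override public OptionalLong constantValue() {
        return expr.foldedValue == null
            ? OptionalLong.empty()
            : OptionalLong.of(expr.foldedValue);
    }
}

enum Sign { NEGATIVE, ZERO, POSITIVE, UNKNOWN }

/** The analysis is written once and depends only on SemExpr. */
final class SignAnalysis {
    static Sign signOf(SemExpr e) {
        OptionalLong c = e.constantValue();
        if (!c.isPresent()) return Sign.UNKNOWN;
        long v = c.getAsLong();
        return v < 0 ? Sign.NEGATIVE : (v == 0 ? Sign.ZERO : Sign.POSITIVE);
    }
}
```

In this analogy, adding another language means writing one more adapter, and adding another analysis means writing one more class like `SignAnalysis`. Neither change touches the other side.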
In adapting the existing Java analysis code to use the new interfaces, I started by replacing all uses of the Java-specific types with their semantic equivalents, then fixed up the resulting compiler errors. Anything that was truly Java-specific was factored out into a separate file. For any dependencies on already-shared files, like sign analysis and modulus analysis, I added semantic wrappers for those files to avoid modifying any shared file. As a follow-up, we can port the sign and modulus analyses to the semantic interface and remove the need for those wrappers.
I don't expect to actually merge these changes into the Java repo until I've had a chance to try out the now-sharable analysis on a C++ adapter that exposes the semantic interfaces. This PR is mostly so interested parties can take a look. @aschackmull @rdmarsh2 @hvitved.