Introduce semantic layer to prepare to share range analysis #7986

dbartol · 2022-02-11T22:47:18Z

We'd like to share Java's range analysis with C++ (and Swift, C#, and maybe Go). Range analysis has several other interesting analyses on which it depends, including sign analysis, modulus analysis, and a little bit of constant analysis and nullness analysis. Sign analysis and modulus analysis are already shared with C#, but everything else is still Java-specific.

The first step to sharing all of this with other languages is to get the Java-specific portion separated from the sharable portion. After separate conversations with @aschackmull and @rdmarsh2, it seemed like a good time to consider a language-neutral interface that would allow all of these semantic analyses to work with multiple languages without having to adapt each {language, analysis} pair one at a time.

This PR factors out all of the dependencies that Java's range analysis had on import java. In their place, I've introduced a few modules under semmle.code.java.semantic.*:

SemanticExpr - Currently just directly wraps Java's Expr class, with the minimal set of subclasses and member predicates to make range analysis work. We'll need to think more about the right interface to expose here, such that it can be implemented relatively easily for each language.
SemanticSSA - Wraps a small subset of Java's SSA library, plus the SsaReadPosition stuff that was previously internal to range analysis.
SemanticCFG - Wraps a small subset of Java's CFG library.
SemanticGuard - Wraps a small subset of Java's guards library.
SemanticType - Unlike the wrappers above, this is a separate concrete type system, populated from Java's type system. The interface is basically cut and pasted from what we've already been using as the IR's type system. Key differences from the Java type system include:
- Numeric types are just described by their kind (signed, unsigned, FP) and size.
- Character types are just integer types.
- All pointers and references are just a single "address" type.
- Classes (the object layout part, not the reference) are just an "opaque" type with a specific size.
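The type-system differences listed above can be modeled roughly as follows. This is a sketch in Python rather than QL, purely for illustration: every name in it (NumericKind, NumericType, AddressType, OpaqueType, to_semantic_type) is hypothetical and not taken from the PR. The only facts it relies on are the standard sizes and signedness of Java's primitive types.

```python
from dataclasses import dataclass
from enum import Enum


class NumericKind(Enum):
    SIGNED = "signed"
    UNSIGNED = "unsigned"
    FLOATING_POINT = "fp"


@dataclass(frozen=True)
class NumericType:
    """A numeric type is described only by its kind and its size."""
    kind: NumericKind
    byte_size: int


@dataclass(frozen=True)
class AddressType:
    """Single type standing in for every pointer and reference."""


@dataclass(frozen=True)
class OpaqueType:
    """Object layout of a class: no structure, just a size."""
    byte_size: int


ADDRESS = AddressType()

# Java primitives mapped onto (kind, size). Note that char has no
# dedicated character type: it is a 2-byte unsigned integer.
_JAVA_PRIMITIVES = {
    "byte": NumericType(NumericKind.SIGNED, 1),
    "short": NumericType(NumericKind.SIGNED, 2),
    "char": NumericType(NumericKind.UNSIGNED, 2),
    "int": NumericType(NumericKind.SIGNED, 4),
    "long": NumericType(NumericKind.SIGNED, 8),
    "float": NumericType(NumericKind.FLOATING_POINT, 4),
    "double": NumericType(NumericKind.FLOATING_POINT, 8),
}


def to_semantic_type(java_type_name: str):
    """Map a Java type name to its (hypothetical) semantic type."""
    if java_type_name in _JAVA_PRIMITIVES:
        return _JAVA_PRIMITIVES[java_type_name]
    if java_type_name in ("boolean", "void"):
        # Assumption: this sketch does not model these two.
        raise ValueError(f"{java_type_name} is not modeled in this sketch")
    # Any class, interface, or array type is used through a reference.
    return ADDRESS
```

With this shape, a consumer such as range analysis never needs to ask whether a value is a Java char or a C++ wchar_t; it only sees "unsigned, 2 bytes". OpaqueType is included for completeness, since Java code would rarely produce one: Java objects are always reached through references.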

I'm not claiming that any of the above semantic interfaces are what we should wind up with, but they're a good starting point by showing what we actually use today.

In adapting the existing Java analysis code to use the new interfaces, I started by replacing all uses of the Java-specific types with their semantic equivalents, then fixed the resulting compiler errors. Anything that was truly Java-specific was factored out into a separate file. For any dependencies on already-shared files, like sign analysis and modulus analysis, I added semantic wrappers for those files to avoid modifying any shared file. As a follow-up, we can port the sign and modulus analyses to the semantic interface and remove the need for those wrappers.
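The overall shape of that refactoring, an analysis that sees only a language-neutral interface plus a thin per-language adapter, might look like the sketch below. It is written in Python rather than QL, and everything in it is an assumption made for illustration: SemExpr, JavaSemExpr, the toy Java node classes, and the deliberately trivial upper_bound function are not the PR's actual interfaces or its range analysis.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional, Tuple


class SemExpr(ABC):
    """Language-neutral view of an expression (hypothetical interface)."""

    @abstractmethod
    def constant_value(self) -> Optional[int]:
        """The integer value if this is a literal, else None."""

    @abstractmethod
    def add_operands(self) -> Optional[Tuple[SemExpr, SemExpr]]:
        """Both operands if this is an addition, else None."""


def upper_bound(expr: SemExpr) -> Optional[int]:
    """Toy stand-in for a shared analysis: it depends only on SemExpr.

    Overflow is ignored to keep the sketch short.
    """
    value = expr.constant_value()
    if value is not None:
        return value
    operands = expr.add_operands()
    if operands is not None:
        left, right = (upper_bound(op) for op in operands)
        if left is not None and right is not None:
            return left + right
    return None  # nothing known about this expression


# Language-specific side: toy Java AST nodes and the adapter over them.
@dataclass(frozen=True)
class JavaIntLiteral:
    value: int


@dataclass(frozen=True)
class JavaAddExpr:
    left: object
    right: object


@dataclass(frozen=True)
class JavaMethodCall:
    name: str


class JavaSemExpr(SemExpr):
    """Adapter exposing a Java node through the neutral interface."""

    def __init__(self, node: object) -> None:
        self._node = node

    def constant_value(self) -> Optional[int]:
        if isinstance(self._node, JavaIntLiteral):
            return self._node.value
        return None

    def add_operands(self) -> Optional[Tuple[SemExpr, SemExpr]]:
        if isinstance(self._node, JavaAddExpr):
            return JavaSemExpr(self._node.left), JavaSemExpr(self._node.right)
        return None
```

Supporting a second language in this model means writing one more adapter class; upper_bound itself stays untouched, which is the property that avoids adapting each {language, analysis} pair separately.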

I don't expect to actually merge these changes into the Java repo until I've had a chance to try out the now-sharable analysis on a C++ adapter that exposes the semantic interfaces. This PR is mostly so interested parties can take a look. @aschackmull @rdmarsh2 @hvitved.

Dave Bartolomeo added 2 commits February 11, 2022 16:35

Move some Java-specific code into separate file

bbc347b

Introduce semantic layer to make range analysis language-neutral

1082c9a

github-actions bot added the Java label Feb 11, 2022

dbartol added C++ C# labels Feb 11, 2022

Dave Bartolomeo added 7 commits February 14, 2022 18:27

Fix test failures

c1acb7d

Updates to semantic layer based on Sign Analysis usage

de9b544

Fix constant analysis

614bc07

Remove unnecessary changes

86a943f

Fix formatting

540f2ff

Fix formatting

b5e7429

Remove unnecessary imports

915c04a

dbartol mentioned this pull request Feb 18, 2022

Port Java sign analysis to semantic layer #8068

Closed

dbartol closed this by deleting the head repository May 12, 2025
