This is a prototype implementation of the type inference algorithm introduced in the paper
U. Schöpp and C. Xu. Inferring Region Types via an Abstract Notion of Environment Transformation. APLAS'22.
based on the Soot framework. It takes a Java (bytecode) program as input and computes the region type of the given method in the program.
The arXiv version of the above paper is available here.
There are a few examples including those from the paper in the testcases.paperexamples package in the test module. To play with them, one can run the main method in regiontypeinference.Main in the main module.
To display more information about the analysis, one can set the following environment variables to be true
:
SHOW_TABLE
— to print the table of abstract transformations at each iteration of the fix-point procedure,DEBUGGING
— to print the abstract transformation at each node of the control flow graph of the analyzed method.
Supposing the repository has been cloned to the current directory, a Docker container can be built with:
cd AbstractTransformation
docker build . -t abstracttransformation
The artifact can then be run as follows:
docker run abstracttransformation
This will start the analysis for the examples from the paper and output the results.
Environment variables can be set like so:
docker run -e SHOW_TABLE=true -e DEBUGGING=true abstracttransformation
The artifact can also be built using Gradle. A configuration file build.gradle is provided in the repository. For example, in an IDE (e.g. IntelliJ IDEA), one can set up a project for the artifact by opening the build.gradle file as a project.
The following Java code is taken from the example of linked lists given in Appendix B of the paper and available here.
class Node {
Node next;
Node last() {
TaintAPI.emitA();
if (next == null) {
return this;
} else {
return next.last();
}
}
}
Running our tool with it will print the following result:
Analysis result of the method <testcases.paperexamples.Node: testcases.paperexamples.Node last()>
Transformation: [$stack1 := {this.next, this.(next,[(next, next)],next)}, $stack3 := {$stack3, this.next, this.(next,[(next, next)],next)}, $stack2 := {this.next, this.(next,[(next, next)],next), $stack2}, this := {this.next, this.(next,[(next, next)],next), this}]
Transformation ignoring the Jimple variables: [this := {this.next, this.(next,[(next, next)],next), this}]
Type term: {this.next, this.(next,[(next, next)],next), this}
Input environment: ()
Output environment: ()
Output field table: ()
Output type: ⊥
We explain how to interpret the above analysis result:
Transformation
— the abstract transformation of the methodTransformation ignoring the Jimple variables
— The abstract transformation may contain additional variables due to the translation of Java Bytecode to Jimple programs in the Soot framework. We remove these variables in the result for the sake of readability.Type term
— the term to be instantiated with a given environment into a output type for the methodInput environment
— typing environment to execute the method, by default set to be the empty environment()
where all variables and fields have the bottom type⊥
Output environment
— typing environment after execute the method with the input environmentOutput field table
— field table after execute the method with the input environmentOutput type
— output type of the method, obtained by instantiating the type term with the output environment and field table
The abstract transformation of last()
is [this := {this.next, this.(next,[(next, next)],next), this}]
. It performs no update to the empty environment and thus the output type of last()
with the empty environment is ⊥
, i.e. there is no region for the output value.
For the following method
class Test {
Node linear () {
Node x = new Node();
Node y = new Node();
y.next = x;
return y.last();
}
...
}
the analysis result is
Analysis result of the method <testcases.paperexamples.Test: testcases.paperexamples.Node linear()>
Transformation: [$stack4 := {<created at .(Node.java:24)>}, this := {<created at .(Node.java:24)>, <created at .(Node.java:24)>.(next,[(next, next)],next), <created at .(Node.java:24)>.next}, $stack2 := {$stack2, <created at .(Node.java:24)>.(next,[(next, next)],next), <created at .(Node.java:24)>.next}, $stack5 := {<created at .(Node.java:24)>, <created at .(Node.java:24)>.(next,[(next, next)],next), <created at .(Node.java:24)>.next}, $stack3 := {$stack3, <created at .(Node.java:24)>.(next,[(next, next)],next), <created at .(Node.java:24)>.next}, y := {<created at .(Node.java:24)>}, $stack3 := {<created at .(Node.java:23)>}, <created at .(Node.java:24)>.next :> {<created at .(Node.java:23)>}, this := {<created at .(Node.java:24)>}, x := {<created at .(Node.java:23)>}, $stack1 := {<created at .(Node.java:24)>.(next,[(next, next)],next), <created at .(Node.java:24)>.next, $stack1}]
Transformation ignoring the Jimple variables: [this := {<created at .(Node.java:24)>, <created at .(Node.java:24)>.(next,[(next, next)],next), <created at .(Node.java:24)>.next}, y := {<created at .(Node.java:24)>}, <created at .(Node.java:24)>.next :> {<created at .(Node.java:23)>}, this := {<created at .(Node.java:24)>}, x := {<created at .(Node.java:23)>}]
Type term: {<created at .(Node.java:24)>, <created at .(Node.java:24)>.(next,[(next, next)],next), <created at .(Node.java:24)>.next}
Input environment: ()
Output environment: ($stack4: {<created at .(Node.java:24)>}, this: {<created at .(Node.java:24)>, <created at .(Node.java:23)>}, $stack2: {<created at .(Node.java:23)>}, x: {<created at .(Node.java:23)>}, $stack5: {<created at .(Node.java:24)>, <created at .(Node.java:23)>}, $stack3: {<created at .(Node.java:23)>}, y: {<created at .(Node.java:24)>}, $stack3: {<created at .(Node.java:23)>}, $stack1: {<created at .(Node.java:23)>}, this: {<created at .(Node.java:24)>})
Output field table: (<created at .(Node.java:24)>.next: {<created at .(Node.java:23)>}, <created at .(Node.java:23)>.next: {null})
Output type: {<created at .(Node.java:24)>, <created at .(Node.java:23)>}
where the output type {<created at .(Node.java:24)>, <created at .(Node.java:23)>}
means that the returned value of the method linear()
is created either in line 24 or line 23 of Node.java.
The src folder consists of a main module for the development of the tool and a test module containing five test cases. The main module is structured as follows:
- main/java
- mockup: mock code of some library methods (not needed for the included test cases)
- ourlib: our library to support mock code and test cases
- regiontypeinference: implementation of the region type inference algorithm
- interproc: an interprocedural analysis to infer the type of each method of the program
- intraproc: a forward flow analysis to compute an abstract transformation from the control flow graph of the method
- policy: policies specifying the default types of intrinsic methods
- region: various regions representing properties of program values
- transformation: development of abstract transformations
- Main.java: main method to run the tool with the test cases
- MockInfo.java: information about which classes have mock code
- TA.java: wrap-up of the tool, including Soot environment configuration
The algorithm to infer region types for Featherweight Java programs introduced in the paper has been fully implemented. Here is a mapping from the paper to the artifact:
-
Section 2. Background
- Access graphs — regiontypeinference.transformation.FieldGraph
-
Section 3. A Theory of Abstract Transformations
- Running example (Fig. 1) — testcases.paperexamples.RunningExample
One can run e.g. the main method to get the analysis result of the running example. - Term — regiontypeinference.transformation.Term
A term is a set of atoms (e.g. VariableAtom, VariableFieldAtom, RegionAtom and RegionFieldAtom). - Abstract transformation — regiontypeinference.transformation.Transformation
An abstract transformation is a mapping from keys to terms, containing assignments (i.e. elements from VariableAtom) and constraints (i.e. elements from VariableFieldAtom or RegionFieldAtom).
- Running example (Fig. 1) — testcases.paperexamples.RunningExample
-
Section 4. Type Inference via Abstract Transformations
- Region — regiontypeinference.region
In particular, the region$\mathsf{Null}$ is implemented as a SpecialRegion and$\mathsf{CreatedAt}$ as AllocationSiteRegion - Abstract method table — regiontypeinference.interproc.AbstractMethodTable
An abstract method table assigns an abstract transformation and a term (TransAndTerm) to each method. -
$[[-]]$ — regiontypeinference.intraproc.TransformationAnalysis
The core of the type inference algorithm is the$[[-]]$ function that computes an abstract transformation and a type term for the given Featherweight Java expression. It becomes a forward flow analysis for the control flow graphs of the Java program. In particular, the TransformationAnalysis has aflowThrough
method that computes an abstract transformation for each node in the control flow graph and then concatenates it with the one generated from the previous nodes. - Computing the abstract method table
$T$ — regiontypeinference.interproc.InterProcTransAnalysis
The fixed point procedure that computes an abstract transformation for each method in the program is implemented as an interprocedural analysis that uses the above TransformationAnalysis.
- Region — regiontypeinference.region
-
Appendix A. Composition and Join of Abstract Transformations
- Operations on terms and abstract transformations are implemented in the corresponding classes.
-
Appendix B. An Example of Inferring Region Types
- The example of linked lists is implemented in testcases.paperexamples.Node.
When running the tool with this example, one can set the environment variableSHOW_TABLE
totrue
to print the abstract transformations of the methods in each iteration.
- The example of linked lists is implemented in testcases.paperexamples.Node.
Our development of abstract transformations is independent of our selection of region types (as discussed in Section 3 of the paper). Therefore, to adapt the algorithm to infer other types, one only needs to
- Add new (region) types by implementing the Region interface
- Specify the types of intrinsic methods in a new policy by implementing the Policy interface
- Implement mock code for the library with the
@Replaces
annotation in the mockup package if needed
The current implementation covers only the core features of Java. To support others such as arrays, strings and dynamic invocation, we will implement the methods for the corresponding Jimple expressions or values in the TransformationAnalysis class. For example, we need to implement the caseDynamicInvokeExpr
method to support analysis of dynamic invocation of methods. To support exception handling, we also need to extend the flowThrough
method to analyze also the code of the exception handlers (see here for an example).
We plan to extend the inference algorithm to a region-sensitive trace property analysis. As discussed in Section 5 of the paper, we need to implement some form of formal expressions to summarize the behaviour of the program. Then we need to extend TransformationAnalysis to compute also a formal expression for the control flow graph of the given method as well as InterProcTransAnalysis to compute the method table which now contains also a formal expression for each method.
- GuideForceJava, based on our PPDP'21 paper.
- TSA, developed by Zălinescu et al., based on their APLAS'17 paper.
- Ulrich Schöpp
- Chuangjie Xu
- Eugen Zălinescu