EvoSpex (submission for Artifact Evaluation at ICSE 2021)
EvoSpex: An Evolutionary Algorithm for Learning Postconditions
F. Molina, P. Ponzio, N. Aguirre, and M. Frias. University of Río Cuarto and CONICET, Argentina.
EvoSpex is a tool based on a genetic algorithm for learning postcondition assertions of Java methods and it is publicly available at DOI. A postcondition assertion for a Java method is an expression that captures properties of the method's executions.
Given a Java method, to produce its postcondition EvoSpex recieves as input sets of valid and invalid pre/post state pairs (i.e., state pairs that represent, and do not represent, the method’s current behavior, respectively). Valid pre/post state pairs are obtained by generating executions of the method under analysis, while invalid ones are obtained by mutating the valid pairs. Given these inputs, EvoSpex uses a genetic algorithm to explore the state space of candidate postconditions with the goal of obtaining one that is satisfied by the valid pre/post pairs, while leaving out the invalid ones.
Reproducing the experiments in the paper
In order to reproduce the experiments in the paper, you must first follow the instructions described in INSTALL.MD, to be able to run EvoSpex and its dependencies.
The paper contains two sets of experiments. In the first set, the tool is used to generate postconditions, whose quality is compared against those generated by the Daikon tool. In the second set, the tool is used on a number of contract-equipped classes, and the produced assertions are contrasted with the contracts, to assess how many of these can be reproduced by EvoSpex.
Postcondition generation and comparison with Daikon
In this set of experiments, we generate postconditions both with EvoSpex and Daikon, and compare their quality using OASIs. The subjects are projects taken from the SF110 benchmark. Assuming that the target project has been compiled as described in INSTALL.MD, we first generate the postconditions with the two techniques, as follows.
NOTE: the instructions in this section assumes that the project being analyzed is 2_a4j. The classes and methods to analyze for project 2_a4j are listed in file $EVOSPEX/src/test/resources/sf110/2_a4j/target-classes.txt. To analyze other projects, just use any of the names listed when doing
ls $EVOSPEX/src/test/resources/sf110 (again, the classes and methods to analyze for a particular project are listed in the file target-classes.txt of the corresponding folder). Also, the sets of valid and invalid pre/post state pairs for each of the analyzed methods have been previously computed. For more details of this process we refer the reader to this page. It is important to remark that the test suites from which such state pairs were generated and which guide EvoSpex's evolution, is exactly the same test suite used to infer postconditions with Daikon.
Generating postconditions with EvoSpex
cd $EVOSPEX ./experiments/sf110/run-evospex-project.sh 2_a4j 10
The script will run EvoSpex 10 times for each of the analyzed methods of the project 2_a4j. Each execution will be stored in the folder experiments/sf110/2_a4j/evospex-results.
Generating postconditions with Daikon
cd $EVOSPEX ./experiments/sf110/run-daikon-project.sh 2_a4j
This execution will run Daikon performing the three usual steps described in this example. As a result, postconditions for each of the target methods will be generated and saved in the folder experiments/sf110/2_a4j/daikon-results.
Assessing the Quality of the Postconditions
Once the postconditions produced by EvoSpex and Daikon are obtained, we are ready to assess their quality, using OASIs. The generated postconditions have to be placed in the corresponding Java files, that will be later on fed to OASIs, following the steps described here. Since this requires a manual effort, for convenience we already placed the assertions obtained in previous runs of the tools in the corresponding Java files, so that the user can continue with the subsequent instructions. The user is welcome to take the outputs of Daikon and EvoSpex and insert the obtained assertions in the corresponding files, as described in here.
To analyze the quality of the postconditions using OASIs, four arguments need to be provided:
- the current project
- the class to be analyzed
- the technique (EvoSpex or Daikon) from which the oracles were computed
- the mode (FP for false positives - FN for false negatives)
So, for example, to detect False Positives in the oracles produced by EvoSpex on class FullProduct of project 2_a4j, from the folder $EVOSPEX run:
./experiments/sf110/run-oasis-class.sh 2_a4j FullProduct EvoSpex FP
and for postconditions produced by Daikon:
./experiments/sf110/run-oasis-class.sh 2_a4j FullProduct Daikon FP
These scripts will try to find False Positives for the postconditions learned by the corresponding technique. It is important to remark that the assertions in the files corresponding to each technique, namely $SF110SRC/2_a4j/src/main/java/net/kencochrane/a4j/beans/FullProductEvoSpex.java and $SF110SRC/2_a4j/src/main/java/net/kencochrane/a4j/beans/FullProductDaikon.java, are already filtered and do not contain false positives. To perform the analyses from scratch, you should perform the following steps:
- Edit the file $SF110SRC/2_a4j/src/main/java/net/kencochrane/a4j/beans/FullProductEvoSpex.java by including the removed false positives that are described as comments in each of the analyzed methods into the assert statement at the end of method. For instance, the assert statement of method addAccessory(MiniProduct product)
assert( old_thissimilarItems == this.similarItems && ExpressionEvaluator.evaluateSetMembership(product,"this . accessories",this) && this.details != null );
Detect False Positives with command
./experiments/sf110/run-oasis-class.sh 2_a4j FullProduct EvoSpex FP
If False Positives are detected, remove the assertion causing it and perform step 2 until no more False Positives are detected. If no False Positives are detected, continue to step 4.
Detect False Negatives with command
./experiments/sf110/run-oasis-class.sh 2_a4j FullProduct EvoSpex FN
Note: exactly the same process should be performed for the file FullProductDaikon.java, and for every class analyzed for project 2_a4j.
Finally, the way in which we compute the results of Table II in the paper is as follows. As we compute the False Positives and False Negatives, we update the corresponding row in the file $EVOSPEX/experiments/sf110/2_a4j/oasis-res.csv. Such file contains one row that describes for each 3-tuple Class-Method-Technique the amount of produced assertions by the technique, the amount of false positives, and the amount of false negatives. Once the iterative process of computing False Positives and False Negatives, the file oasis-res.csv is in its final state, and the paper numbers can be computed by:
cd $EVOSPEX/experiments/sf110 python3.7 print-paper-results.py 2_a4j/oasis-res.csv
This command will output the rows reported in Table II for project 2_a4j:
Methods,Technique,Assertions,FPs,FPs%,FNs 23,Daikon,257,59,22.96,9 23,EvoSpex,60,5,8.33,5
When performing all these steps for every SF110 analyzed the project, the experiment is finished.
Reproducing manually written Contracts
This experiment analyzes the ability of our tool to reproduce manually written contracts. The subjects of this experiment were:
- a set of classes equipped with contracts for the verification of object oriented programs, originally written in the Eiffel programming language and used for verification using the AutoProof tool:
composite doublylinkedlistnode map ringbuffer
- a set of synthesized implementations using the Cozy tool:
polyupdate structure listcomp minfinder maxbag
For this experiment, EvoSpex was executed for each case study until it reached 30 generations or a 10 minute timeout was fulfilled. This execution was repeated 10 times, and then the postcondition that repeated the most number of times was selected.
For instance, to run this experiment for the doublylinkedlistnode case study, you can run the following commands:
cd $EVOSPEX/experiments ./experiments/doublylinkedlistnode-all.sh 10
For the remaining cases, just run the corresponding script present in the experiments folder in the same way. For every case, the results will be stored in the folder experiments/results. For instance, for the case doublylinkedlistnode the output will be in the folder experiments/results/doublylinkedlistnode.
Finally, collect the most common postcondition as described in here and manually compare the inferred postcondition with the one present in the Java file (equipped with contracts). These Java files are available in the folder $EVOSPEX/src/examples/casestudies/eiffel or $EVOSPEX/src/examples/casestudies/cozy, respectively.