Test Generation Strategies for Building Failure Models and Explaining Spurious Failures

In this paper, we propose to develop failure models for inferring explainable rules on test inputs that cause spurious failures. We implement and compare two approaches for building failure models: surrogate-assisted and ML-guided test generation. Surrogate-assisted test generation uses an ML model as a surrogate for the system under test: the surrogate predicts the labels of test inputs so that not every input has to be executed. More specifically, we propose a new surrogate-assisted algorithm that uses multiple surrogate models simultaneously and dynamically selects the most accurate one for prediction. ML-guided test generation, in contrast, infers the boundary regions that separate passing and failing test inputs and then samples test inputs from those regions. An overview of the approaches is given below:

Preprocessing Phase. Given a system under test, its input search space, and a fitness function, an initial set of test inputs is generated (1). Data imbalance is then handled using SMOTE (2).
Main Loop. ML models are trained iteratively on the test inputs executed by the simulator. The models are used either as surrogates to perform an explorative search or to guide a search that is exploitative in nature (3). The test inputs are iteratively added to the test suite, which is in turn used to update the ML model (4). Finally, the generated test suite is used to train decision rules that characterize the spurious-failure test inputs at the system level (5).
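
The dynamic selection among multiple surrogates can be sketched as follows. This is a minimal, pure-Python illustration and not the repository's implementation: the toy surrogates (`NearestNeighbour`, `MajorityLabel`), the leave-one-out accuracy estimate, the `simulate` stand-in, and the `simulate_every` parameter are all assumptions made for the example.

```python
import random

def simulate(x):
    """Hypothetical stand-in for the expensive simulator: 1 = fail, 0 = pass."""
    return 1 if x > 0.6 else 0

class NearestNeighbour:
    """Toy surrogate: predicts the label of the closest labelled input."""
    def fit(self, xs, ys):
        self.data = list(zip(xs, ys))
        return self
    def predict(self, x):
        return min(self.data, key=lambda p: abs(p[0] - x))[1]

class MajorityLabel:
    """Toy surrogate: always predicts the most frequent training label."""
    def fit(self, xs, ys):
        self.label = 1 if 2 * sum(ys) > len(ys) else 0
        return self
    def predict(self, x):
        return self.label

def loo_accuracy(model_cls, xs, ys):
    """Leave-one-out accuracy of a surrogate class on the simulated inputs."""
    hits = 0
    for i in range(len(xs)):
        model = model_cls().fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        hits += model.predict(xs[i]) == ys[i]
    return hits / len(xs)

def select_surrogate(candidates, xs, ys):
    """Pick the currently most accurate candidate and refit it on all data."""
    best = max(candidates, key=lambda c: loo_accuracy(c, xs, ys))
    return best().fit(xs, ys)

def generate_suite(budget, n_initial=10, simulate_every=5, seed=0):
    """Grow a test suite, labelling most inputs with the selected surrogate."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_initial)]
    ys = [simulate(x) for x in xs]          # initial inputs are truly executed
    suite = list(zip(xs, ys))
    candidates = [NearestNeighbour, MajorityLabel]
    surrogate = select_surrogate(candidates, xs, ys)
    for i in range(1, budget + 1):
        x = rng.random()
        if i % simulate_every == 0:
            # Assumption: occasionally execute the input and re-select the model.
            y = simulate(x)
            xs.append(x)
            ys.append(y)
            surrogate = select_surrogate(candidates, xs, ys)
        else:
            y = surrogate.predict(x)        # label predicted, not simulated
        suite.append((x, y))
    return suite
```

Calling `generate_suite(20)` returns 30 labelled inputs (10 initial plus 20 generated), of which only the initial ones and every fifth generated one are executed on the stand-in simulator.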

License

This software is released under GNU GENERAL PUBLIC LICENSE, Version 2. Please refer to the license.txt

Content description

Folders

Benchmark: contains two folders. Formalization contains the formalization details of the requirements for each Simulink model and the NTSS system. Simulink Models contains the benchmark Simulink models (.slx files).
Code: contains the scripts that implement all the approaches in this paper. It contains two subfolders, namely Simulink and NTSS. As the names suggest, Simulink contains the code for all the approaches implemented for the Simulink models, and NTSS contains the code for the NTSS system. The Simulink folder contains seven subfolders. They are:
- inputGenerator: contains code to perform random search (ars.m) on the input search space.
- Algorithms: contains the implementations of surrogate-assisted test generation (folders Dynamic Surrogate and Individual Surrogate), ML-guided test generation (logisticRegression.m and regressionTree.m), the naive random search baseline (randomSearch.m), and the state-of-the-art baseline (decisiontreeSoTA.m).
- Functions: contains intermediate scripts that are required for all the implementations to work.
- Models: contains files that define the input search space and the requirements, and that call the different algorithms (i.e., surrogate-assisted, ML-guided, and random search) on the requirement that is set.
- Scripts: [IMPORTANT] This folder is the starting point for all the algorithms. The requirement to test and the algorithm to run when generating the dataset are defined in this folder.
- pyScripts: contains code to perform oversampling using SMOTE (SMOTE.py) and code to perform guided search using the logistic regression (logGen.py), regression tree (extractRangesEpsilon.py), and state-of-the-art (SoTA.py) algorithms.
Evaluation Results: contains the results in Excel format along with the diagrams for each research question.
Data: contains raw datasets generated by different algorithms as well as test sets used to assess the performances of failure models.
Evaluation: contains scripts for evaluating the different approaches for each research question.
- RQ1: contains two scripts. RQ1_surrogate_verification.py crawls through all the generated test suite CSV files and extracts key information about each dataset, such as the dataset size, the number of pass instances, the number of fail instances, and the number of wrongly predicted instances. verify.m evaluates the accuracy of the generated test suite by simulating all the predicted test inputs. NOTE: verify.m is used only for the surrogate-assisted technique.
- RQ2: contains the script RQ2_DR_classification.py, which builds failure models using the datasets generated by the different algorithms and logs the results to a CSV file. The script that builds failure models based on the engineered features for NTSS is in a subfolder called NTSS inside the RQ2 folder.
- RQ3: contains the code to build a decision tree for the surrogate-assisted algorithm based on the data from RQ2, as well as the code to draw the boxplots.
- RQ4: contains the script to generate rules for the Autopilot requirements. Please note that the code to generate rules for NTSS is in RQ2\NTSS\, where we do feature engineering for NTSS.
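
As an illustration of the kind of per-dataset summary that RQ1_surrogate_verification.py is described as extracting, the sketch below counts the pass, fail, and wrongly predicted instances in one test-suite CSV. The column names (`label`, `predicted`), the `pass`/`fail` encoding, and the function name `summarise_suite` are assumptions; the actual CSV layout in the Data folder may differ.

```python
import csv
import io

def summarise_suite(csv_file):
    """Summarise one test-suite CSV passed as an open text file.

    Assumed columns: 'label' (verdict from simulation) and
    'predicted' (verdict from the surrogate), each 'pass' or 'fail'.
    """
    rows = list(csv.DictReader(csv_file))
    return {
        "size": len(rows),
        "pass": sum(r["label"] == "pass" for r in rows),
        "fail": sum(r["label"] == "fail" for r in rows),
        "wrong": sum(r["predicted"] != r["label"] for r in rows),
    }

# Small in-memory example instead of a real file from the Data folder.
example = io.StringIO(
    "input,label,predicted\n"
    "0.1,pass,pass\n"
    "0.9,fail,pass\n"
    "0.7,fail,fail\n"
)
summary = summarise_suite(example)
```

For a file on disk, the same function can be called as `summarise_suite(open(path, newline=""))`, where `path` is a placeholder for one of the generated CSV files.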

Prerequisite

Matlab R2021b [For Simulink Models]
Python (3.8.5)
VirtualBox 6.1 [For NTSS]
Ubuntu 20.04 disc image (https://ubuntu.com/download/desktop) [For NTSS]
OpenWrt 19.07.8 (https://downloads.openwrt.org/releases/19.07.8/targets/x86/64/) [For NTSS]
nuttcp 8.1.4 (http://nuttcp.net/nuttcp/nuttcp-8.1.4/nuttcp.c) [For NTSS]
dpinger (https://github.com/dennypage/dpinger) [For NTSS]

Instructions to run the proposed algorithms

Simulink Models

Create an Anaconda Python environment with the packages listed in Code/Simulink/pyScripts/requirements.txt.
Open an Anaconda terminal and navigate to the testGenStrat main folder.
Type the matlab command to launch MATLAB with access to the Python packages.
Add the folder testGenStrat and all of its subfolders to the MATLAB path (right click on the folder > add to path > selected folder and subfolders).
Open the file executeXXnewHCR, where XX is replaced with the value that corresponds to the Simulink model under test. The values that XX can take are AP (Autopilot), TU (Tustin), REG (Regulator), FSM (Finite State Machine), and NL (Non linear Guidance). Once the file is open, enter the appropriate values in the Python lists models and req to set the algorithm (i.e., surrogate-assisted, ML-guided, or random search) and the requirement to be tested.
Run the command executeXXnewHCR on the terminal.
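
The naming scheme in the steps above can be captured in a small lookup. This helper is hypothetical and not part of the repository; it only encodes the abbreviations and the executeXXnewHCR pattern stated in the instructions.

```python
# Abbreviations and model names as listed in the instructions above.
MODELS = {
    "AP": "Autopilot",
    "TU": "Tustin",
    "REG": "Regulator",
    "FSM": "Finite State Machine",
    "NL": "Non linear Guidance",
}

def entry_script(abbrev):
    """Return the entry-script name for a model abbreviation (hypothetical helper)."""
    if abbrev not in MODELS:
        raise ValueError(f"unknown model {abbrev!r}; expected one of {sorted(MODELS)}")
    return f"execute{abbrev}newHCR"
```

For example, `entry_script("AP")` yields the name of the Autopilot entry script, and an unknown abbreviation raises a `ValueError` instead of producing a name that does not exist.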

NTSS

First follow the instructions here for building virtual machines and installing required packages.
Copy the scripts in the code folder to the Documents directory of the VM1 virtual machine.
Open a terminal and navigate to Documents. Then, run the scripts using Python. For example, to run the surrogate-assisted algorithm, type python3 SurrogateAssisted.py.

Repository: anonpaper23/testGenStrat
