This repository is a companion page for the following paper: “DeepGD: A Multi-Objective Black-Box Test Selection Approach for Deep Neural Networks” (submitted). The approach is implemented in Python on Google Colab.
This repository is a replication package of DeepGD, a black-box (BB) test selection approach for deep neural networks (DNNs) that uses a customized multi-objective genetic search to guide the selection of test inputs with high fault-revealing power. It relies on diversity and uncertainty scores and only requires access to the complete test dataset and the prediction probabilities of the DNN model. DeepGD also uses a clustering-based technique to estimate faults in DNN models. An empirical evaluation on five DNN models, four datasets, and nine baselines shows that DeepGD provides better guidance for selecting inputs with high fault-revealing power and improves model performance through retraining.
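Because DeepGD only needs prediction probabilities, its uncertainty objective can be computed in a purely black-box way. As an illustrative sketch (not the package's exact implementation), the snippet below scores uncertainty with a Gini-style impurity, as popularized by the DeepGini baseline; the function name and toy probabilities are hypothetical.

```python
import numpy as np

def gini_uncertainty(probs: np.ndarray) -> np.ndarray:
    """Gini impurity per input: 1 - sum_c p_c^2.

    Higher values mean the model is less certain about the input.
    `probs` has shape (n_inputs, n_classes) with rows summing to 1.
    """
    return 1.0 - np.sum(probs ** 2, axis=1)

# Toy example: a confident prediction vs. a near-uniform one
probs = np.array([
    [0.98, 0.01, 0.01],   # confident -> low Gini
    [0.34, 0.33, 0.33],   # uncertain -> high Gini
])
scores = gini_uncertainty(probs)
```

Inputs with high Gini scores are the ones the model is least sure about, which makes them natural candidates for the uncertainty side of the search.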
Our main contributions are:
1- Proposing a BB test selection approach
2- Customizing the multi-objective search and validating it through an ablation study
3- Validating DeepGD by approximating faults in DNNs
4- Comparing DeepGD with existing test selection metrics in terms of fault detection ability, execution time, diversity, and retraining improvement
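At the core of the customized multi-objective search (contribution 2) is Pareto dominance over the uncertainty and diversity objectives. The package itself relies on pymoo's genetic search; the numpy-only sketch below illustrates only the dominance filtering step, with a hypothetical function name and toy candidate scores (both objectives maximized).

```python
import numpy as np

def pareto_front(objectives: np.ndarray) -> np.ndarray:
    """Return indices of non-dominated rows; every objective is maximized.

    Row i is dominated if some row j is >= on all objectives
    and strictly > on at least one.
    """
    n = objectives.shape[0]
    keep = np.ones(n, dtype=bool)
    for i in range(n):
        for j in range(n):
            if i != j and np.all(objectives[j] >= objectives[i]) \
                      and np.any(objectives[j] > objectives[i]):
                keep[i] = False
                break
    return np.flatnonzero(keep)

# Toy candidates scored on (uncertainty, diversity)
cands = np.array([
    [0.9, 0.1],
    [0.5, 0.5],
    [0.1, 0.9],
    [0.4, 0.4],   # dominated by [0.5, 0.5]
])
front = pareto_front(cands)
```

A genetic search such as NSGA-II repeatedly applies this kind of dominance ranking (plus crowding-distance tie-breaking) while evolving candidate test subsets.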
- Baseline_results folder contains the subsets selected by each method across the different datasets and models.
- Fault_clusters folder contains the saved DNN faults for six combinations of models and datasets.
- Retraining folder contains the models retrained with the subsets selected by each method.
- LSA_DSA folder contains parts of [4] used for computing the LSA and DSA scores.
- ATS-master_final folder contains the code for applying ATS from [3].
You first need to install these libraries:

```
!pip install umap-learn
!pip install tslearn
!pip install hdbscan
!pip install pymoo
```
The code was developed and tested based on the following environment:
- Python 3.8
- Keras 2.7.0
- TensorFlow 2.7.1
- PyTorch 1.10.0
- torchvision 0.11.1
- Matplotlib
- scikit-learn
Here is documentation on how to use this replication package.
To run the code, upload the repository to Google Drive, open it in Google Colab, set your local path in the first line, and run the code.
This is the root directory of the repository. The directory is structured as follows:
Replication-package
.
|
|--- Final DeepGD        Customized multi-objective search-based test selection method.
|
|--- BaseLine methods    All the test selection baselines used in the paper.
|
|--- Retraining          Retraining experiments.
|
|--- Generating data     Our implementation for generating test inputs by applying realistic transformations.
Our experimental evaluation answers the research questions below.
RQ1: Do we find more faults than existing test selection approaches with the same testing budget?
RQ2: Do we guide the retraining of DNN models more effectively with our selected inputs than with baselines?
RQ3: Can DeepGD select more diverse test input sets than other test selection approaches?
RQ4: How do DeepGD and baseline approaches compare in terms of computation time?
We would like to emphasize that using mispredictions as the sole criterion for evaluating test selection approaches can be misleading and may not accurately reflect their fault detection capabilities. This is primarily because redundant mispredicted test inputs may all stem from the same fault in the DNN model. The specific numbers of mispredicted inputs are therefore not included in our paper; instead, they are provided in this replication package, along with other supplementary material.
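To make the distinction between mispredictions and faults concrete: a deliberately simplified stand-in for fault estimation groups mispredicted inputs with similar feature vectors and counts the groups, so many redundant mispredictions may map to one fault. The package's actual pipeline uses feature extraction with UMAP and HDBSCAN clustering; the single-linkage union-find below is only an illustrative toy with a hypothetical function name and threshold.

```python
import math

def count_fault_clusters(features, threshold=1.0):
    """Group mispredicted inputs whose feature vectors lie within
    `threshold` of each other (single linkage via union-find) and
    count the resulting groups as an estimate of distinct faults."""
    n = len(features)
    parent = list(range(n))

    def find(x):
        # Find the root representative, with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(features[i], features[j]) <= threshold:
                union(i, j)

    return len({find(i) for i in range(n)})

# Four mispredicted inputs forming two tight groups far apart:
# they suggest two underlying faults, not four.
feats = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
```

With a tight threshold the four mispredictions collapse into two estimated faults; a very loose threshold merges everything into one, which is why clustering granularity matters for this kind of evaluation.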
- To speed up execution, you can use GPU-based TensorFlow by changing the Colab runtime type.
1- DeepGini
2- Diversity
3- ATS
5- Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction and Clustering