False Alarm Reduce Research

This repository contains the code for the paper: False Alarm Reduction Method for Weakness Static Analysis Using BERT Model

Project Overview

Static analysis tools inspect source code and generate diagnostic messages ("warnings") that indicate the location and contextual characteristics of potential security vulnerabilities. Since each static analysis tool differs in the types of vulnerabilities it can detect and its analysis performance, it is common to use multiple tools during software development. However, this approach often produces a large number of alarms, including many false positives.
In many cases, it is difficult to analyze syntax or semantics accurately across diverse source code written by developers. To address this, we aim to enhance vulnerability detection accuracy using the BERT model, which is based on the Transformer architecture and is capable of capturing sequential and semantic relationships in source code.
In this research, we propose a system that uses BERT to compute a vulnerability score for each line of code and a decision tree model to classify the reliability of alerts generated by multiple static analysis tools, thereby reducing the false positive rate.
This approach combines the strengths of multiple static analysis tools with the advantages of deep learning to accurately detect software vulnerabilities. Ultimately, we propose a method that enables comprehensive vulnerability analysis while significantly reducing false positives, leading to cost and time savings during software development and code review processes.
The architecture of the proposed false positive reduction model using BERT-based static vulnerability analysis is shown below.

Project Structure

BWA (Bug Warning Analyzer): This is a line-level vulnerability analysis model using a BERT-based architecture. It tokenizes and embeds input C/C++ source code and analyzes it by learning vulnerability patterns.
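
As a rough illustration of what line-level input preparation can look like, the sketch below splits C/C++ source into per-line lexical tokens while keeping the original line numbers. It is a simplified stand-in, not the BWA tokenizer: the regular expression `TOKEN_RE` and the function `tokenize_lines` are hypothetical names introduced here, and a BERT-based model would normally apply its own subword tokenizer and embedding layer on top of something like this.

```python
import re

# Hypothetical lexer pattern (an assumption, not the BWA tokenizer):
# identifiers, integer literals, string literals, then any other
# single non-space character such as punctuation or operators.
TOKEN_RE = re.compile(r'[A-Za-z_]\w*|\d+|"(?:\\.|[^"\\])*"|\S')


def tokenize_lines(source):
    """Return (line_number, tokens) pairs for every non-blank source line."""
    result = []
    for number, line in enumerate(source.splitlines(), start=1):
        tokens = TOKEN_RE.findall(line)
        if tokens:
            result.append((number, tokens))
    return result


example = 'char buf[10];\n\nstrcpy(buf, data);\n'
line_tokens = tokenize_lines(example)
for number, tokens in line_tokens:
    print(number, tokens)
```

Keeping the line number next to each token list is what makes it possible to attach a score to an individual line later on.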

Tools configuration: Multiple static analysis tools are used to detect potential vulnerabilities in the source code.
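
When several tools analyze the same code, their alarms need a common shape before they can be compared. The sketch below shows one possible way to do that, assuming each alarm can be reduced to a tool name, a file path, a line number, and a message. The `Alarm` record, the `group_by_location` helper, and the tool names `tool_a` and `tool_b` are placeholders introduced for illustration, not the formats used by this repository.

```python
from collections import defaultdict
from typing import NamedTuple


class Alarm(NamedTuple):
    """Hypothetical common record for an alarm from any analyzer."""
    tool: str      # placeholder analyzer name
    path: str      # source file the alarm refers to
    line: int      # 1-based line number
    message: str   # diagnostic text reported by the tool


def group_by_location(alarms):
    """Map (path, line) to the sorted list of tools that flagged it."""
    grouped = defaultdict(set)
    for alarm in alarms:
        grouped[(alarm.path, alarm.line)].add(alarm.tool)
    return {location: sorted(tools) for location, tools in grouped.items()}


alarms = [
    Alarm("tool_a", "example.c", 12, "possible buffer overflow"),
    Alarm("tool_b", "example.c", 12, "unsafe strcpy"),
    Alarm("tool_a", "example.c", 30, "unchecked return value"),
]
locations = group_by_location(alarms)
print(locations)
```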

ACM (Alarm Classification Model): This model takes the line-level vulnerability scores produced by the BWA and the alarms generated by multiple static analysis tools as input, and classifies the alerts using a decision tree model.
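
The two ACM inputs described above can be pictured as one feature row per alarmed location. The following is a minimal sketch under stated assumptions: the tool list `TOOLS`, the function `build_features`, the example scores, and the choice to use 0.0 when no BWA score is available are all hypothetical. The resulting rows are the kind of input a decision tree learner could be trained on; the actual feature layout of the ACM is documented in its module folder.

```python
# Placeholder analyzer names; the real tool set is an assumption here.
TOOLS = ["tool_a", "tool_b", "tool_c"]


def build_features(bwa_scores, flagged_by):
    """Build one row per alarmed location: [BWA score, one 0/1 flag per tool]."""
    rows = {}
    for location, tools in flagged_by.items():
        # Assumption: a location without a BWA score is treated as 0.0.
        score = bwa_scores.get(location, 0.0)
        rows[location] = [score] + [1 if tool in tools else 0 for tool in TOOLS]
    return rows


# Illustrative values only, not results from the paper.
bwa_scores = {("example.c", 12): 0.91, ("example.c", 30): 0.08}
flagged_by = {
    ("example.c", 12): {"tool_a", "tool_b"},
    ("example.c", 30): {"tool_a"},
}
features = build_features(bwa_scores, flagged_by)
print(features)
```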

Environment & Installation

Dataset

This project uses the Juliet test suite, first released in December 2010 by the Center for Assured Software (CAS) of the U.S. National Security Agency (NSA). The suite consists of relatively short code snippets with distinct control flow, data flow, or data structure characteristics. Version 1.3 is used in this project and includes 118 classes of security weaknesses.
The official C/C++ version includes a total of 64,099 test cases, distributed across the following files:
- C source files: 53,476
- C++ source files: 46,276
- Header files: 4,422
- Total files: 104,174
After downloading, store the dataset in the dataset folder.
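
To check that a download is complete, the file counts listed above can be compared against a count of the files on disk. The helper below is a small sketch; `count_sources` is a name introduced here, and comparing its output to the listed numbers assumes that C, C++, and header files use the `.c`, `.cpp`, and `.h` extensions.

```python
from collections import Counter
from pathlib import Path


def count_sources(root):
    """Count files under root, grouped by lowercase file extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower()] += 1
    return counts
```

Calling `count_sources("dataset")` from the repository root would then return a counter whose `.c`, `.cpp`, and `.h` entries can be checked against the list above.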

BWA

Development environment: Anaconda, Python
For detailed installation instructions, please refer to the README.md file within the module folder.

Tools configuration

Development environment: Python
For detailed installation instructions, please refer to the README.md file within the module folder.

ACM

Development environment: Python
For detailed installation instructions, please refer to the README.md file within the module folder.

Experiments

Since this research involves two trained models (the BERT-based BWA and the decision tree ACM), the Juliet test suite dataset must first be split appropriately. As shown in Table 3-6, 60% or 80% of the dataset is used for training each model, while the remaining 20% is used for validation and testing.
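
A reproducible split can be sketched in a few lines. The example below assumes a 60/20/20 train/validation/test split and a fixed random seed; both are illustrative choices, and `split_dataset` is a hypothetical helper rather than the script used for the experiments.

```python
import random


def split_dataset(items, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle items deterministically and cut them into train/val/test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains after train and val
    return train, val, test


# Stand-in for a list of test case identifiers.
train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Fixing the seed keeps the three subsets stable across runs, so that both models can be evaluated on data neither has seen during training.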

Dataset Split Overview

The following figure summarizes the specifications of the system used to conduct experiments and the primary Python packages utilized.

System Specifications and Major Package Information

Evaluation

To evaluate the performance of the proposed models, we measured Precision, Accuracy, F1-Score, and the ROC Curve. For the BWA model, representative evaluation metrics for deep learning were selected, while for the ACM model, commonly used metrics for classification tasks were employed.
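
For reference, Precision, Accuracy, and F1-Score follow directly from the confusion-matrix counts of a binary classifier. The sketch below uses the standard definitions; the function name `classification_metrics` and the example counts are illustrative and are not results from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute precision, recall, accuracy, and F1 from confusion counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Illustrative counts only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

In the false alarm setting, a false positive (`fp`) is an alarm classified as a true vulnerability that is not one, so precision is the metric most directly tied to the false alarm rate.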

Evaluation metrics for each model

The following figure presents the results of applying the proposed models to detect vulnerabilities across different CWE (Common Weakness Enumeration) categories.

Evaluation results by CWE category

Citation

If you use this code for your research, please cite the following paper.

Title: False Alarm Reduction Method for Weakness Static Analysis Using BERT Model
Journal: Applied Sciences
DOI: 10.3390/app13063502

@Article{nguyen2023FalseAlarmReduction,
  AUTHOR = {Nguyen, Dinh Huong and Seo, Aria and Nnamdi, Nnubia Pascal and Son, Yunsik},
  TITLE = {False Alarm Reduction Method for Weakness Static Analysis Using BERT Model},
  JOURNAL = {Applied Sciences},
  VOLUME = {13},
  YEAR = {2023},
  NUMBER = {6},
  ARTICLE-NUMBER = {3502},
  URL = {https://www.mdpi.com/2076-3417/13/6/3502},
  ISSN = {2076-3417},
  DOI = {10.3390/app13063502}
}
