PLASS-Lab/FalseAlarmReduceResearch

False Alarm Reduce Research

README (Korean)

🪪 Overview | 📐 Structure | 🪄️ Installation | 🔬 Experiments | 📊 Evaluation | 🔗 Citation | 📝 Paper

This repository contains the code for the paper: False Alarm Reduction Method for Weakness Static Analysis Using BERT Model

Project Overview

  • Static analysis tools inspect source code and generate diagnostic messages ("warnings") that indicate the location and contextual characteristics of potential security vulnerabilities. Since each static analysis tool differs in the types of vulnerabilities it can detect and its analysis performance, it is common to use multiple tools during software development. However, this approach often produces a large number of alarms, including many false positives.
  • In many cases, it is difficult to analyze syntax or semantics accurately across diverse source code written by developers. To address this, we aim to enhance vulnerability detection accuracy using the BERT model, which is based on the Transformer architecture and is capable of capturing sequential and semantic relationships in source code.
  • In this research, we propose a system that leverages BERT to compute vulnerability scores for each line of code, and uses a decision tree model to classify the reliability of alerts generated by multiple static analysis tools, thereby reducing the false positive rate.
  • This approach combines the strengths of multiple static analysis tools with the advantages of deep learning to accurately detect software vulnerabilities. Ultimately, we propose a method that enables comprehensive vulnerability analysis while significantly reducing false positives, leading to cost and time savings during software development and code review processes.
  • The architecture of the proposed false positive reduction model using BERT-based static vulnerability analysis is shown below.

Project Structure

BWA (Bug Warning Analyzer): This is a line-level vulnerability analysis model using a BERT-based architecture. It tokenizes and embeds input C/C++ source code and analyzes it by learning vulnerability patterns.
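The actual BWA tokenizer lives in the module's own code; purely as an illustration, the first step it performs — breaking a line of C/C++ source into lexical tokens before any subword tokenization and embedding — can be sketched like this (the regex and function name are assumptions of this sketch, not the project's API):

```python
import re

# Illustrative token pattern: identifiers, numbers, common two-character
# C/C++ operators, then any other single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+\.\d+|\d+|->|==|!=|<=|>=|&&|\|\||\S")

def tokenize_line(line: str) -> list[str]:
    """Return the lexical tokens of a single source line."""
    return TOKEN_RE.findall(line)

print(tokenize_line("strcpy(dest, src);"))
print(tokenize_line("p->len == n"))
```

In the real pipeline, tokens like these would then be mapped to BERT subword IDs and embedded so the model can score each line's vulnerability.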

Tools configuration: Multiple static analysis tools are used to detect potential vulnerabilities in the source code.

ACM (Alarm Classification Model): This model takes the line-level vulnerability scores produced by the BWA and the alarms generated by multiple static analysis tools as input, and classifies the alerts using a decision tree model.
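The real ACM learns its decision tree from data; the following toy sketch only illustrates the shape of the classification step, combining a line's BWA score with the number of tools that flagged it. Every threshold and name below is made up for illustration:

```python
# Hypothetical sketch of the ACM stage. The real model is a learned
# decision tree; these hard-coded splits only mimic its structure.
def classify_alarm(bwa_score: float, tools_flagging: int) -> str:
    # Root split: lines the BERT model scores as highly vulnerable.
    if bwa_score >= 0.7:
        return "true alarm"
    # Otherwise require corroboration from multiple tools.
    if tools_flagging >= 2 and bwa_score >= 0.4:
        return "true alarm"
    return "false alarm"

print(classify_alarm(0.85, 1))  # high BWA score alone suffices
print(classify_alarm(0.45, 3))  # moderate score, multiple tools agree
print(classify_alarm(0.20, 1))  # likely a false positive
```

The point of the design is that neither signal alone decides the outcome: a learned tree can trade off the deep-learning score against cross-tool agreement.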

Environment & Installation

Dataset

  • This project uses the Juliet test suite, first released in December 2010 by the Center for Assured Software (CAS) of the U.S. National Security Agency (NSA). The suite consists of relatively short code snippets with distinct control flow, data flow, or data structure characteristics. Version 1.3 is used in this project and includes 118 classes of security weaknesses.
  • The official C/C++ version includes 64,099 test cases, spread across the following source files:
    • C source files: 53,476
    • C++ source files: 46,276
    • Header files: 4,422
    • Total files: 104,174
  • After downloading, store the dataset in the dataset folder.
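After unpacking the suite, a quick sanity check of the file counts listed above can be done with a small script like this (the `dataset/` location follows the instruction above; the function name is an assumption of this sketch):

```python
from pathlib import Path

def count_juliet_files(root: str) -> dict[str, int]:
    """Count C, C++, and header files under the dataset folder."""
    counts = {".c": 0, ".cpp": 0, ".h": 0}
    for path in Path(root).rglob("*"):
        if path.suffix in counts:
            counts[path.suffix] += 1
    return counts

print(count_juliet_files("dataset"))
```

For Juliet 1.3 the totals should come out near the figures listed above.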

BWA

  • Development environment: Anaconda, Python
  • For detailed installation instructions, please refer to the README.md file within the module folder.

Tools configuration

  • Development environment: Python
  • For detailed installation instructions, please refer to the README.md file within the module folder.

ACM

  • Development environment: Python
  • For detailed installation instructions, please refer to the README.md file within the module folder.

Experiments

  • Since this research involves two deep learning models, the Juliet test suite dataset must first be split appropriately. As shown in Table 3-6 of the paper, 60% or 80% of the dataset is used for training each model, while the remaining portions (20% each) are used for validation and testing.
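A 60/20/20 split of the kind described above can be sketched as follows (the function signature and fixed seed are assumptions of this sketch, not the project's code):

```python
import random

def split_dataset(items, train_frac=0.6, val_frac=0.2, seed=42):
    """Shuffle and partition items into train/validation/test sets."""
    items = list(items)
    random.Random(seed).shuffle(items)  # deterministic shuffle for reproducibility
    n = len(items)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 60 20 20
```

For the 80% training variant, one would pass `train_frac=0.8` and allocate the remainder to validation and testing.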


Dataset Split Overview

  • The following figure summarizes the specifications of the system used to conduct experiments and the primary Python packages utilized.


System Specifications and Major Package Information

Evaluation

  • To evaluate the performance of the proposed models, we measured Precision, Accuracy, F1-Score, and the ROC Curve. For the BWA model, representative evaluation metrics for deep learning were selected, while for the ACM model, commonly used metrics for classification tasks were employed.
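The threshold-based metrics above (the ROC curve aside) are computed from the confusion-matrix counts; a minimal reference implementation:

```python
def metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Precision, recall, accuracy, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}

# Example with made-up counts, purely to show the arithmetic:
print(metrics(tp=80, fp=10, fn=20, tn=90))
```

In the false-alarm-reduction setting, precision is the fraction of alarms classified as true that really are true, so it directly measures how well false positives are filtered out.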


Evaluation metrics for each model

  • The following figure presents the results of applying the proposed models to detect vulnerabilities across different CWE (Common Weakness Enumeration) categories.


Evaluation results by CWE category

Citation

If you use this code in your research, please cite the following paper:

Title: False Alarm Reduction Method for Weakness Static Analysis Using BERT Model
Journal: Applied Sciences
DOI: 10.3390/app13063502

@Article{nguyen2023FalseAlarmReduction,
  AUTHOR = {Nguyen, Dinh Huong and Seo, Aria and Nnamdi, Nnubia Pascal and Son, Yunsik},
  TITLE = {False Alarm Reduction Method for Weakness Static Analysis Using BERT Model},
  JOURNAL = {Applied Sciences},
  VOLUME = {13},
  YEAR = {2023},
  NUMBER = {6},
  ARTICLE-NUMBER = {3502},
  URL = {https://www.mdpi.com/2076-3417/13/6/3502},
  ISSN = {2076-3417},
  DOI = {10.3390/app13063502}
}

PLASS Lab, Dongguk University
