LogRoBERTa: An Innovative Model for Detecting Hidden Anomalies in Long and Complex Log Sequences
This paper introduces LogRoBERTa, a model for log anomaly detection. LogRoBERTa uses a pre-trained RoBERTa model to capture contextual information from software logs and to represent complex log structures, then applies an attention-based Bi-LSTM to detect anomalies.
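
To illustrate the attention step that sits on top of the Bi-LSTM, here is a minimal, dependency-free sketch of attention pooling over a sequence of hidden states. It is an assumption about the general technique, not the code in this repository: the function name `attention_pool`, the `query` vector, and the `tanh` scoring form are all hypothetical, and in a trained model the hidden states would come from the Bi-LSTM and the query would be a learned parameter.

```python
import math
from typing import List, Sequence


def attention_pool(hidden_states: Sequence[Sequence[float]],
                   query: Sequence[float]) -> List[float]:
    """Collapse per-step hidden states into one vector via softmax attention.

    hidden_states: one vector per log event (hypothetical Bi-LSTM outputs).
    query: scoring vector (hypothetical; learned in a real model).
    """
    # Score each time step: dot product of tanh(h_t) with the query vector.
    scores = [sum(math.tanh(h) * q for h, q in zip(step, query))
              for step in hidden_states]

    # Numerically stable softmax over the time steps.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the hidden states gives the sequence representation.
    dim = len(hidden_states[0])
    return [sum(w * step[d] for w, step in zip(weights, hidden_states))
            for d in range(dim)]
```

With an all-zero query every step receives the same weight, so the result is the plain average of the hidden states; a non-zero query shifts weight toward the steps that score highest. The pooled vector would then be passed to a classifier that labels the log sequence as normal or anomalous.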
- LogRoBERTa and other benchmark models are implemented on the HDFS and BGL datasets.
- Due to the large size of the datasets, the original files are not included in the GitHub repository. You can download the corresponding dataset files from the provided link.
- You can refer to the code structure of our LogRoBERTa model in the files `LogRoBERTa_HDFS.py` and `LogRoBERTa_BGL.py`. `LogRoBERTa_HDFS.py` is designed for the HDFS dataset, while `LogRoBERTa_BGL.py` is designed for the BGL dataset.
- The specific configurations and experimental results for the comparative experiments described in the paper's Section V. EXPERIENCES -> C. RQ3: Comparative Enhancement Effects of Each Module in LogRoBERTa are located in the `Module` folder.

| Model | Module 1 | Module 2 | Module 3 |
|---|---|---|---|
| LogRoBERTa | RoBERTa | Bi-LSTM | Attention |
| Comparison 1 | BERT | Bi-LSTM | Attention |
| Comparison 2 | - | Bi-LSTM | Attention |
| Comparison 3 | RoBERTa | LSTM | Attention |
| Comparison 4 | RoBERTa | Bi-LSTM | - |
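
The table above can also be written down as data, which makes it easy to check which module each comparison changes relative to the full model. The following is only an illustrative sketch: `VariantConfig`, `VARIANTS`, and `changed_modules` are hypothetical names and do not correspond to files or identifiers in the `Module` folder.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class VariantConfig:
    encoder: Optional[str]  # Module 1: pre-trained encoder, None if absent
    recurrent: str          # Module 2: "Bi-LSTM" or "LSTM"
    attention: bool         # Module 3: whether the attention layer is used


# One entry per row of the comparison table.
VARIANTS: Dict[str, VariantConfig] = {
    "LogRoBERTa": VariantConfig("RoBERTa", "Bi-LSTM", True),
    "Comparison 1": VariantConfig("BERT", "Bi-LSTM", True),
    "Comparison 2": VariantConfig(None, "Bi-LSTM", True),
    "Comparison 3": VariantConfig("RoBERTa", "LSTM", True),
    "Comparison 4": VariantConfig("RoBERTa", "Bi-LSTM", False),
}


def changed_modules(name: str, baseline: str = "LogRoBERTa") -> List[str]:
    """Return the module fields in which a variant differs from the baseline."""
    variant = asdict(VARIANTS[name])
    reference = asdict(VARIANTS[baseline])
    return [field for field in reference if variant[field] != reference[field]]


if __name__ == "__main__":
    for variant_name in VARIANTS:
        print(variant_name, changed_modules(variant_name))
```

Each comparison differs from LogRoBERTa in exactly one module, so any difference in results between a comparison and the full model can be attributed to that module.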
