SEED: A Transformers-Based Autoencoder Enhanced by Masking and Self-Distillation for Business Process Anomaly Detection

This is the source code of our paper 'SEED: A Transformers-Based Autoencoder Enhanced by Masking and Self-Distillation for Business Process Anomaly Detection'.

Abstract

Detecting anomalies in business processes is integral to ensuring operational success. Unsupervised anomaly detection methods, due to their label-free nature, have gained traction. However, prevailing anomaly detection approaches relying on autoencoders confront the persistent challenge of overfitting. To address this, we propose a transformers-based autoencoder enhanced by masking and self-distillation for business process anomaly detection, named SEED. The transformers-based autoencoder is capable of capturing interrelationships across multiple perspectives. Incorporating masking and self-distillation techniques, our model not only reconstructs masked attribute values but also aligns hidden representations with those generated by a teacher encoder. These techniques enhance the model's generalization, fostering robustness against noise. Moreover, we introduce a novel method for computing anomaly scores, effectively mitigating the impact of varying potential attribute values. We conduct extensive experiments on synthetic and real-life logs, showcasing SEED's superior performance over state-of-the-art methods by a substantial margin. Ablation studies indicate that employing masked autoencoding and self-distillation techniques significantly enhances the model's generalization, ultimately leading to improved anomaly detection performance.

Datasets

Five commonly used real-life datasets:

i) BPIC12: The event log for a loan application process.

ii) BPIC13: This dataset relates to Volvo IT incident and problem management, covering three distinct logs.

iii) BPIC20: Events related to two years of travel expense claims. Events were recorded in 2017 for two departments and extended to cover the entire university in 2018. The dataset comprises five distinct logs.

iv) Receipt: This log records the execution of the receiving phase of the building permit application process in an anonymous municipality.

v) Sepsis: Events in this log correspond to sepsis cases observed in a hospital.

All real-life logs used in the experiments are stored in the 'eventlogs' folder. Each log contains artificial anomalies at a ratio between 5% and 45% and is named according to the following convention: 'log_Name-anomaly_Ratio'.

Eight synthetic datasets: Paper, P2P, Small, Medium, Large, Huge, Gigantic, and Wide.

All synthetic logs used in the experiments are stored in the 'eventlogs' folder. Each log contains artificial anomalies at a ratio between 5% and 45% and is named according to the following convention: 'log_Name-anomaly_Ratio-attribute_Number'.
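
The two naming conventions can be told apart by the number of hyphen-separated fields. The following is a minimal sketch of a parser, not part of the repository: the `LogName` class and `parse_log_name` function are hypothetical helpers, the example names are made up for illustration, and the sketch assumes that log names contain no hyphens and that the file extension has already been removed. The anomaly ratio is kept as text because the exact way it is written in the file names is not specified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LogName:
    """Fields of a log name (hypothetical helper, not part of the repository)."""
    name: str
    anomaly_ratio: str               # kept as text; its exact format is an assumption
    attribute_number: Optional[int]  # None for real-life logs


def parse_log_name(stem: str) -> LogName:
    """Parse a log name given without its file extension.

    Assumes the log name itself contains no hyphens, so that
    'log_Name-anomaly_Ratio' yields two fields and
    'log_Name-anomaly_Ratio-attribute_Number' yields three.
    """
    parts = stem.split("-")
    if len(parts) == 2:
        # Real-life convention: name and anomaly ratio only.
        name, ratio = parts
        return LogName(name, ratio, None)
    if len(parts) == 3:
        # Synthetic convention: name, anomaly ratio, attribute number.
        name, ratio, attributes = parts
        return LogName(name, ratio, int(attributes))
    raise ValueError(f"unexpected log name: {stem!r}")


if __name__ == "__main__":
    # Illustrative names only; check the 'eventlogs' folder for the real ones.
    print(parse_log_name("BPIC12-0.05"))
    print(parse_log_name("Paper-0.05-3"))
```

A helper like this could be used to group the logs in 'eventlogs' by dataset or anomaly ratio before running experiments.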

Requirements

Run

i) Set the running configuration:

Modify 'conf.py' to configure runtime settings.

ii) Get the result for SEED on each dataset:

```
  python main.py
```


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

SEED: A Transformers-Based Autoencoder Enhanced by Masking and Self-Distillation for Business Process Anomaly Detection

Absract

Datasets

Requirements

Run

i) Set the running configuration:

ii) Get the result for SEED on each dataset:

About

Uh oh!

Releases

Packages

Uh oh!

Languages

guanwei49/SEED

Folders and files

Latest commit

History

Repository files navigation

SEED: A Transformers-Based Autoencoder Enhanced by Masking and Self-Distillation for Business Process Anomaly Detection

Absract

Datasets

Requirements

Run

i) Set the running configuration:

ii) Get the result for SEED on each dataset:

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages