ML4ProM

Check out the paper on:

Please follow the notebooks to reproduce the results:

./notebooks/1_EDA.ipynb downloads the datasets and performs Exploratory Data Analysis on each of them,
./notebooks/2_training.ipynb executes the training scripts and presents the results,
./notebooks/3_post-train.ipynb presents feature importances for each dataset and each ML model.

How to train models and output results?

From the project directory (../ml4prom/), run the following to list the available arguments:

python -m src.models.train_model -h

which returns:

  --debug DEBUG         When True, plots ROC-Curve & Confusion Matrix
  --seq_encoding SEQ_ENCODING
                        Possible encodings; 'one-hot' & 'n-gram' where n is an integer
  --unique_traces UNIQUE_TRACES
                        when True, duplicate traces(trace variants) are removed from dataset
  --remove_biased_feats REMOVE_BIASED_FEATS
                        when True, the biased features are removed from dataset, e.g. patient is dead in COVID dataset
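
To make the --seq_encoding and --unique_traces options more concrete, here is a minimal sketch of what such preprocessing could look like. It is not taken from src/: the function names, the example activities, and the reading of 'one-hot' as "activity occurs in the trace" and of 'n-gram' as "counts of contiguous length-n subsequences" are all assumptions made for illustration.

```python
from collections import Counter
from itertools import chain


def drop_duplicate_traces(traces):
    """Keep the first occurrence of each trace variant (hypothetical helper)."""
    seen = set()
    unique = []
    for trace in traces:
        key = tuple(trace)
        if key not in seen:
            seen.add(key)
            unique.append(trace)
    return unique


def one_hot_encode(traces):
    """Mark, per trace, which activities of the vocabulary occur in it."""
    vocabulary = sorted(set(chain.from_iterable(traces)))
    encoded = [[1 if a in trace else 0 for a in vocabulary] for trace in traces]
    return vocabulary, encoded


def n_gram_encode(traces, n):
    """Count, per trace, the contiguous length-n activity subsequences."""
    def grams(trace):
        return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]

    vocabulary = sorted(set(chain.from_iterable(grams(t) for t in traces)))
    encoded = []
    for trace in traces:
        counts = Counter(grams(trace))
        encoded.append([counts[g] for g in vocabulary])
    return vocabulary, encoded


# Made-up traces; the activity names are illustrative only.
traces = [
    ["register", "triage", "discharge"],
    ["register", "icu", "discharge"],
    ["register", "triage", "discharge"],  # duplicate trace variant
]
unique = drop_duplicate_traces(traces)
activities, one_hot = one_hot_encode(unique)
bigrams, bigram_counts = n_gram_encode(unique, n=2)
```

The one-hot variant discards event order, while the n-gram variant keeps local ordering at the cost of a larger feature space as n grows.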

The command below runs the full pipeline:

loads all datasets,
applies preprocessing (e.g. removes biased features and duplicate traces),
encodes traces (sequences of events),
trains ML models with StratifiedKFold cross-validation,
writes a .csv file with the accuracy scores to ./reports/.

python -m src.models.train_model --seq_encoding one-hot --remove_biased_feats --unique_traces
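
The cross-validation and reporting steps listed above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the code in src/models/train_model: the toy data, the choice of LogisticRegression, the helper names, and the report columns are all hypothetical, and the report is written to an in-memory buffer instead of ./reports/.

```python
import csv
import io

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold


def cross_validate(model, X, y, n_splits=5, seed=0):
    """Return the accuracy on each held-out fold of a stratified k-fold split."""
    splitter = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, test_idx in splitter.split(X, y):
        model.fit(X[train_idx], y[train_idx])  # refit from scratch on each fold
        scores.append(model.score(X[test_idx], y[test_idx]))  # mean accuracy
    return scores


def write_report(rows, handle):
    """Write one row per (dataset, model) pair; column names are assumptions."""
    writer = csv.writer(handle)
    writer.writerow(["dataset", "model", "mean_accuracy"])
    writer.writerows(rows)


# Toy stand-in for encoded traces: 40 samples, two balanced classes.
y = np.array([0, 1] * 20)
X = np.column_stack([y, 1 - y]).astype(float)

scores = cross_validate(LogisticRegression(), X, y)

buffer = io.StringIO()
write_report([["toy", "LogisticRegression", f"{np.mean(scores):.3f}"]], buffer)
report = buffer.getvalue()
```

Stratified splitting keeps the class ratio of the full dataset in every fold, which matters when one outcome is much rarer than the other.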

Citation:

@inproceedings{velioglu2022explainable,
  title={Explainable Artificial Intelligence for Improved Modeling of Processes},
  author={Velioglu, Riza and G{\"o}pfert, Jan Philip and Artelt, Andr{\'e} and Hammer, Barbara},
  booktitle={International Conference on Intelligent Data Engineering and Automated Learning},
  pages={313--325},
  year={2022},
  organization={Springer}
}
