## BenchmarkLoader

The cell below shows the code for loading a benchmark from a configuration file. Click [here](benchmark_configs/test_config.yaml) to see the config file.

Loading a benchmark performs all necessary processing steps, including loading the data and generating the benchmark tasks. Data can also be downloaded
from remote sources, which makes it possible to use public datasets that require attribution. When a remote source is used, the proper credits are
displayed to the user.

In [1]:
from ppm_benchmark.core.benchmark_loader import BenchmarkLoader


loader = BenchmarkLoader()

benchmark = loader.load_from_config('ppm_benchmark/benchmark_configs/test_config.yaml')

Downloading Hospital Billing dataset by Felix Mannhardt from https://data.4tu.nl/ndownloader/items/6af6d5f0-f44c-49be-aac8-8eaa5fe4f6fd/versions/1


parsing log, completed traces ::   0%|          | 0/100000 [00:00<?, ?it/s]

## Loading Benchmark & Tasks

When a benchmark is generated from a config file, it is automatically saved to disk. The `.load_from_folder()` method can be used to retrieve the configured benchmark.
Each benchmark revolves around several tasks on which performance can be measured. The `.get_tasks()` method returns all task names, which can be used to retrieve the training and test data.

In [2]:
import random
from ppm_benchmark.core.benchmark_loader import BenchmarkLoader


loader = BenchmarkLoader()
benchmark = loader.load_from_folder('test_benchmark')

tasks = benchmark.get_tasks()
tasks 

['test_task']

## Getting Task Data

The cell below shows the code for retrieving the training and test data. This simple example is a classification task that predicts the last activity. In this example, each 'prediction' is a randomly selected activity.

In [3]:
task = benchmark.load_task('test_task')

preds = dict()
train_df = task.get_train_data()
test_df = task.get_test_data()
targets = train_df['concept:name'].unique().tolist()

# Assign a randomly chosen activity to the case ID of every test row.
for _, row in test_df.iterrows():
    pred = random.choice(targets)
    case_id = row['case:concept:name']
    preds[case_id] = pred
preds

{'YHND': 'FIN',
 'ZHND': 'CODE OK',
 'AIND': 'CODE ERROR',
 'BIND': 'STORNO',
 'CIND': 'STORNO',
 'DIND': 'CHANGE DIAGN',
 'EIND': 'CHANGE END',
 'FIND': 'CODE OK',
 'GIND': 'RELEASE',
 'HIND': 'ZDBC_BEHAN',
 'IIND': 'REOPEN',
 'JIND': 'CODE ERROR',
 'KIND': 'REJECT',
 'LIND': 'REOPEN',
 'MIND': 'BILLED',
 'NIND': 'FIN',
 'OIND': 'CHANGE DIAGN',
 'PIND': 'RELEASE',
 'QIND': 'STORNO',
 'RIND': 'NEW',
 'SIND': 'BILLED',
 'TIND': 'BILLED',
 'UIND': 'CHANGE DIAGN',
 'VIND': 'SET STATUS',
 'WIND': 'CODE OK',
 'XIND': 'CHANGE DIAGN',
 'YIND': 'RELEASE',
 'ZIND': 'SET STATUS',
 'AJND': 'CHANGE DIAGN',
 'BJND': 'JOIN-PAT',
 'CJND': 'NEW',
 'DJND': 'JOIN-PAT',
 'EJND': 'CHANGE DIAGN',
 'FJND': 'STORNO',
 'GJND': 'JOIN-PAT',
 'HJND': 'STORNO',
 'IJND': 'DELETE',
 'JJND': 'RELEASE',
 'KJND': 'CODE NOK',
 'LJND': 'BILLED',
 'MJND': 'DELETE',
 'NJND': 'CHANGE DIAGN',
 'OJND': 'CHANGE DIAGN',
 'PJND': 'DELETE',
 'QJND': 'CODE NOK',
 'RJND': 'MANUAL',
 'SJND': 'STORNO',
 'TJND': 'CODE OK',
 'UJND': '
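
The loop above writes to `preds` once per row, so if the test data holds several rows for the same case, later rows overwrite earlier ones. A minimal sketch of the same random baseline that draws exactly one prediction per case is shown below. It assumes an event-log layout with the `case:concept:name` and `concept:name` columns used above; the toy DataFrame, the label list, and the seed are illustrative and are not taken from the benchmark.

```python
import random

import pandas as pd

# Toy event log (illustrative data, not the Hospital Billing dataset).
test_df = pd.DataFrame({
    "case:concept:name": ["A", "A", "B", "B", "B", "C"],
    "concept:name": ["NEW", "FIN", "NEW", "CHANGE DIAGN", "BILLED", "NEW"],
})
targets = ["NEW", "FIN", "BILLED", "CHANGE DIAGN"]

# A seeded generator makes the random baseline repeatable between runs.
rng = random.Random(42)

# Draw one random activity per unique case ID instead of one per row.
case_ids = test_df["case:concept:name"].unique()
preds = {case_id: rng.choice(targets) for case_id in case_ids}
```

Seeding the generator is a design choice for reproducibility; without it, each run of a random baseline yields a slightly different score.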

## Evaluation

Evaluation is done for each individual task. The evaluation metrics are specified in the config, and their values are displayed when the code below is run.

In [4]:
benchmark.evaluate(task, preds)

Evaluation metrics for task test_task:
	 Accuracy: 0.05695
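
For intuition about the reported number, the sketch below shows one plausible way to compute accuracy from a dictionary of predictions keyed by case ID. It is an assumption about what the metric measures, not the library's implementation; the `accuracy` helper and the toy labels are hypothetical.

```python
def accuracy(preds, truth):
    """Return the fraction of cases in `truth` whose prediction matches.

    Hypothetical helper: cases missing from `preds` count as wrong.
    """
    if not truth:
        raise ValueError("truth must contain at least one case")
    correct = sum(
        1 for case_id, label in truth.items() if preds.get(case_id) == label
    )
    return correct / len(truth)


# Toy ground truth and predictions (illustrative values only).
truth = {"A": "FIN", "B": "BILLED", "C": "NEW", "D": "FIN"}
preds = {"A": "FIN", "B": "NEW", "C": "NEW", "D": "BILLED"}

score = accuracy(preds, truth)  # 2 of 4 cases match
```

A uniform random guess over k candidate labels has an expected accuracy of 1/k when every true label is among the candidates, so a low score is the expected outcome for the random baseline.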


## Experiments

The `Experiment` class helps users track PPM experiments. A run can be initialized to track a task. If the optional parameter `model_type=tensorflow` is given to the `init_run()` method, the returned run_tracker is a TensorFlow callback which supports automatic tracking. For other model types, the generic run_tracker can be used inside the training loop to track metrics. The code below simulates tracking the loss over several epochs.

In [3]:
from ppm_benchmark.core.experiment import Experiment


experiment = Experiment()
experiment.new_experiment('test_experiment')

task = benchmark.load_task('test_task')
run_tracker, run = experiment.init_run(task, 'test_run')

# Simulated training loop: report a constant loss at the end of each epoch.
for _ in range(10):
    run_tracker.epoch_end({'loss': 0})

run_tracker.train_end()
run_tracker.evaluate(preds)


Run test_run for task test_task finished.
Completed in 9 epochs
Training time: 0.0 minutes
Test set metrics: {'accuracy': 0.05305}


## Retrieving Run Data

Calling the `to_dict()` method on a run object returns all tracked run data as a dictionary. This can easily be converted into a DataFrame for further analysis.

In [4]:
run.to_dict()

{'task': 'test_task',
 'run_id': 'test_run',
 'start_time': datetime.datetime(2024, 6, 6, 15, 35, 37, 23857),
 'end_time': datetime.datetime(2024, 6, 6, 15, 35, 37, 23857),
 'training_time': 0.0,
 'test_metrics': {'accuracy': 0.05305},
 'epochs': {1: {'loss': 0},
  2: {'loss': 0},
  3: {'loss': 0},
  4: {'loss': 0},
  5: {'loss': 0},
  6: {'loss': 0},
  7: {'loss': 0},
  8: {'loss': 0},
  9: {'loss': 0}},
 'hyper_params': {}}
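
As a sketch of that conversion, the snippet below turns a dictionary shaped like the output above into two pandas DataFrames: one row per epoch, and a one-row run summary. The dictionary is copied by hand from the output and shortened to three epochs, so no benchmark needs to be loaded; the variable names are illustrative.

```python
import datetime

import pandas as pd

# Hand-copied stand-in for run.to_dict(), shortened to three epochs.
run_data = {
    "task": "test_task",
    "run_id": "test_run",
    "start_time": datetime.datetime(2024, 6, 6, 15, 35, 37, 23857),
    "end_time": datetime.datetime(2024, 6, 6, 15, 35, 37, 23857),
    "training_time": 0.0,
    "test_metrics": {"accuracy": 0.05305},
    "epochs": {1: {"loss": 0}, 2: {"loss": 0}, 3: {"loss": 0}},
    "hyper_params": {},
}

# One row per epoch; the epoch number becomes the index.
epochs_df = pd.DataFrame.from_dict(run_data["epochs"], orient="index")
epochs_df.index.name = "epoch"

# One row per run, with the test metrics flattened into columns.
summary_df = pd.DataFrame([{
    "task": run_data["task"],
    "run_id": run_data["run_id"],
    "training_time": run_data["training_time"],
    **run_data["test_metrics"],
}])
```

Keeping the per-epoch metrics and the run summary in separate frames makes it straightforward to concatenate the summaries of several runs for comparison.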