# Runtime Analysis

This notebook measures the actual execution times of the algorithms we use. For each of the five logs used in our thesis, we first generate three samples containing 200, 500, and 800 traces. We then compare the running time in two settings:

1. Because the logs use different numbers of objects, we compare the running times of the 200-trace logs with respect to the number of objects.

2. Because the logs also differ in aspects that we cannot fully compensate for (e.g., maximum trace length, number of activities), we compare the running time for the same log with respect to the number of traces it contains.
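The two settings can be sketched as a small experiment grid. This is only an illustration: it assumes that "number of objects" refers to the number of object types per log (counted from the object-type lists used further down in this notebook), the helper names `time_call`, `setting_one`, and `setting_two` are hypothetical, and `time.perf_counter` is swapped in for the `time.time` calls used in the cells below because it is a monotonic clock intended for measuring short durations.

```python
import time
from itertools import product

# Object types per log, counted from the object-type lists in this notebook.
OBJECT_TYPE_COUNTS = {
    "order_process": 3,
    "p2p-normal": 5,
    "BPI2017-Final": 2,
    "DS3": 2,
    "DS4": 6,
}
SAMPLE_SIZES = [200, 500, 800]


def time_call(func, *args, **kwargs):
    """Run func once and return (result, elapsed seconds rounded to 4 digits)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, round(time.perf_counter() - start, 4)


def setting_one(counts, size=200):
    """Setting 1: fixed sample size, logs ordered by number of object types."""
    ordered = sorted(counts.items(), key=lambda item: item[1])
    return [(name, size) for name, _ in ordered]


def setting_two(counts, sizes):
    """Setting 2: every log paired with every sample size."""
    return list(product(counts, sizes))


# Stand-in workload; in the notebook this would be one of the measures.
_, elapsed = time_call(sum, range(1000))
print(f"elapsed: {elapsed} seconds")
print(setting_one(OBJECT_TYPE_COUNTS))
print(len(setting_two(OBJECT_TYPE_COUNTS, SAMPLE_SIZES)))
```

Each measurement below is taken from a single run, so small differences between neighbouring values should be read with care.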

### Everything in this notebook that is not fully executed was run on the servers in Karlsruhe.

In [1]:
import warnings
warnings.filterwarnings('ignore')
from ocpa.objects.log.importer.ocel import factory as ocel_import_factory
from ocpa.algo.discovery.ocpn import algorithm as ocpn_discovery_factory
from src.utils import sample_traces, process_log, generate_variant_log
from ocpa.objects.log.importer.csv import factory as ocel_import_factory_csv
from ocpa.objects.log.exporter.ocel import factory as ocel_export_factory
from ocpa.algo.util.filtering.log import case_filtering
from models.baseline_measure import baseline_measure
from models.alignment_measure import alignment_measure_events
from models.negative_events_measure import negative_events_with_weighting
from src.utils import get_happy_path_log, create_flower_model, generate_variant_model
from models.VAE_measure import get_text_data, decode_sequence, create_lstm_vae, VAE_generalization,create_VAE_input
from tqdm import tqdm
import numpy as np
import time

In [3]:
def sample_for_runtime(filename,trace_amount):
    ocel = ocel_import_factory.apply(f"../src/data/jsonocel/{filename}.jsonocel")
    ocpn = ocpn_discovery_factory.apply(ocel, parameters={"debug": False})
    train_log = sample_traces(ocel, ocpn, trace_amount)
    #process the sampled log
    train_log = [[' '.join(activity.lower() for activity in sublist)] for sublist in train_log]
    df_log = process_log(train_log, ocel, ocpn, f'../src/data/runtime/{filename}_{trace_amount}.csv')
    return df_log

In [15]:
def filter_for_runtime(filename,trace_amount):
    ocel = ocel_import_factory.apply(f"../src/data/jsonocel/{filename}.jsonocel")
    # keep only the first `trace_amount` process executions
    ocel = case_filtering.filter_process_executions(ocel, ocel.process_executions[:trace_amount])
    # export the filtered log in OCEL format
    ocel_export_factory.apply(ocel, f'../src/data/runtime/{filename}_{trace_amount}.jsonocel')
    # return the filtered log
    return ocel

# Sample Order Process and P2P Process

The order and P2P processes contain only 48 and 80 traces, respectively. We therefore have to sample the required 200, 500, and 800 traces for these processes, using the `sample_for_runtime` function defined above.

In [4]:
filenames = ["order_process","p2p-normal"]
sample_sizes = [200 ,500, 800]

In [5]:
for filename in filenames:
    for sample_size in sample_sizes:
        df_log = sample_for_runtime(filename,sample_size)

Check the arcs: 100%|██████████| 46/46 [00:00<?, ?it/s]
Generate the traces: 100%|██████████| 200/200 [00:00<00:00, 11561.09it/s]
Check the arcs: 100%|██████████| 46/46 [00:00<00:00, 43670.89it/s]
Generate the traces: 100%|██████████| 500/500 [00:00<00:00, 8723.78it/s]
Check the arcs: 100%|██████████| 46/46 [00:00<00:00, 38688.19it/s]
Generate the traces: 100%|██████████| 800/800 [00:00<00:00, 15322.92it/s]
Check the arcs: 100%|██████████| 40/40 [00:00<00:00, 24325.38it/s]
Generate the traces: 100%|██████████| 200/200 [00:00<00:00, 7534.57it/s]
Check the arcs: 100%|██████████| 40/40 [00:00<?, ?it/s]
Generate the traces: 100%|██████████| 500/500 [00:00<00:00, 17820.65it/s]
Check the arcs: 100%|██████████| 40/40 [00:00<00:00, 40358.95it/s]
Generate the traces: 100%|██████████| 800/800 [00:00<00:00, 11255.00it/s]


Save the files in OCEL format for better usability.

In [20]:
filenames = ["order_process","p2p-normal"]
objects = [["order","item","delivery"],["PURCHORD","INVOICE","PURCHREQ","MATERIAL","GDSRCPT"]]
sample_sizes = [200 ,500, 800]
for filename in filenames:
    if filename == 'order_process':
        object_types = objects[0]
    else:
        object_types = objects[1]
    for sample_size in sample_sizes:
        parameters = {"obj_names": object_types,
                          "val_names": [],
                          "act_name": "event_activity",
                          "time_name": "event_timestamp",
                          "sep": ","}
        ocel_csv = ocel_import_factory_csv.apply(file_path=f'../src/data/runtime/{filename}_{sample_size}.csv', parameters=parameters)
        ocel_export_factory.apply(ocel_csv, f'../src/data/runtime/{filename}_{sample_size}.jsonocel')


# Filter BPI, DS3, and DS4

The BPI, DS3, and DS4 processes contain more than enough traces. To make them comparable, we use the ocpa filtering functionality to reduce each log to the required 200, 500, and 800 traces, using the `filter_for_runtime` function defined above.
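The slice `[:trace_amount]` in `filter_for_runtime` keeps the first executions in stored order rather than a random subset. The following sketch illustrates that behaviour on toy data; it assumes that process executions can be treated as a list of event-id collections, and the helper `first_executions` and the sample values are hypothetical.

```python
def first_executions(process_executions, trace_amount):
    """Keep the first `trace_amount` executions and collect their event ids."""
    kept = process_executions[:trace_amount]
    kept_events = {event for execution in kept for event in execution}
    return kept, kept_events


# Toy data: three executions given as lists of event ids (hypothetical values).
toy_executions = [[0, 1, 2], [3, 4], [5, 6, 7]]

kept, kept_events = first_executions(toy_executions, 2)
print(len(kept), sorted(kept_events))

# Asking for more executions than exist returns all of them without an error.
kept_all, _ = first_executions(toy_executions, 200)
print(len(kept_all))
```

Because a slice silently returns fewer items when the log is too short, this route cannot produce 200 traces from a log that has only 48 or 80; those logs are sampled instead.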

In [16]:
filenames = ["BPI2017-Final","DS3","DS4"]
sample_sizes = [200 ,500, 800]

In [17]:
for filename in filenames:
    for sample_size in sample_sizes:
        df_log = filter_for_runtime(filename,sample_size)

### Generate Variant Log for all samples

In [30]:
filenames = ["order_process","p2p-normal","BPI2017-Final","DS3","DS4"]
sample_sizes = [200 ,500, 800]

In [33]:
for filename in filenames:
    for sample_size in sample_sizes:
        ocel = ocel_import_factory.apply(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        generate_variant_log(ocel,f"../src/data/runtime/variant_logs/{filename}_{sample_size}_variant_log.csv" )

# Runtime Baseline Measure

In [24]:
filenames = ["order_process","p2p-normal","BPI2017-Final","DS3","DS4"]
sample_sizes = [200 ,500, 800]

### OCPN model

In [28]:
for filename in filenames:
    for sample_size in sample_sizes:
        ocel = ocel_import_factory.apply(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        ocpn = ocpn_discovery_factory.apply(ocel, parameters={"debug": False})
        start_time = time.time()
        value = baseline_measure(ocel,ocpn,'event_activity','event_id')
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for the baseline measure approach for {filename} with {sample_size} traces is {execution_time} seconds")

The execution time for the baseline measure approach for order_process with 200 traces is 0.0128 seconds
The execution time for the baseline measure approach for order_process with 500 traces is 0.0111 seconds
The execution time for the baseline measure approach for order_process with 800 traces is 0.0169 seconds
The execution time for the baseline measure approach for p2p-normal with 200 traces is 0.0088 seconds
The execution time for the baseline measure approach for p2p-normal with 500 traces is 0.0134 seconds
The execution time for the baseline measure approach for p2p-normal with 800 traces is 0.0163 seconds
The execution time for the baseline measure approach for BPI2017-Final with 200 traces is 0.0293 seconds
The execution time for the baseline measure approach for BPI2017-Final with 500 traces is 0.0481 seconds
The execution time for the baseline measure approach for BPI2017-Final with 800 traces is 0.0676 seconds
The execution time for the baseline measure approach for DS3 wit

### Happy Path model

In [29]:
for filename in filenames:
    for sample_size in sample_sizes:
        ocel = ocel_import_factory.apply(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        happy_path__ocel = get_happy_path_log(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        happy_path_ocpn = ocpn_discovery_factory.apply(happy_path__ocel, parameters={"debug": False})
        start_time = time.time()
        value = baseline_measure(ocel,happy_path_ocpn,'event_activity','event_id')
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for the baseline measure approach for {filename} and the happy path model with {sample_size} traces is {execution_time} seconds")

The execution time for the baseline measure approach for order_process and the happy path model with 200 traces is 0.0098 seconds
The execution time for the baseline measure approach for order_process and the happy path model with 500 traces is 0.0109 seconds
The execution time for the baseline measure approach for order_process and the happy path model with 800 traces is 0.0148 seconds
The execution time for the baseline measure approach for p2p-normal and the happy path model with 200 traces is 0.007 seconds
The execution time for the baseline measure approach for p2p-normal and the happy path model with 500 traces is 0.0111 seconds
The execution time for the baseline measure approach for p2p-normal and the happy path model with 800 traces is 0.0146 seconds
The execution time for the baseline measure approach for BPI2017-Final and the happy path model with 200 traces is 0.0284 seconds
The execution time for the baseline measure approach for BPI2017-Final and the happy path model with

### Flower model

In [34]:
objects = [["order","item","delivery"],["PURCHORD","INVOICE","PURCHREQ","MATERIAL","GDSRCPT"],["application","offer"],["incident","customer"], ["Payment application","Control summary","Entitlement application","Geo parcel document","Inspection","Reference alignment"]]
counter = 0
for filename in filenames:
    ots = objects[counter]
    counter += 1
    for sample_size in sample_sizes:
        ocel = ocel_import_factory.apply(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        flower_ocpn = create_flower_model(f"../src/data/runtime/{filename}_{sample_size}.jsonocel",ots)
        start_time = time.time()
        value = baseline_measure(ocel,flower_ocpn,'event_activity','event_id')
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for the baseline measure approach for {filename} and the flower model with {sample_size} traces is {execution_time} seconds")

The execution time for the baseline measure approach for order_process and the flower model with 200 traces is 0.0292 seconds
The execution time for the baseline measure approach for order_process and the flower model with 500 traces is 0.0323 seconds
The execution time for the baseline measure approach for order_process and the flower model with 800 traces is 0.0457 seconds
The execution time for the baseline measure approach for p2p-normal and the flower model with 200 traces is 0.0101 seconds
The execution time for the baseline measure approach for p2p-normal and the flower model with 500 traces is 0.0328 seconds
The execution time for the baseline measure approach for p2p-normal and the flower model with 800 traces is 0.0255 seconds
The execution time for the baseline measure approach for BPI2017-Final and the flower model with 200 traces is 0.039 seconds
The execution time for the baseline measure approach for BPI2017-Final and the flower model with 500 traces is 0.1261 seconds
Th

### Variant Model

In [36]:
objects = [["order","item","delivery"],["PURCHORD","INVOICE","PURCHREQ","MATERIAL","GDSRCPT"],["application","offer"],["incident","customer"], ["Payment application","Control summary","Entitlement application","Geo parcel document","Inspection","Reference alignment"]]
counter = 0
for filename in filenames:
    object_types = objects[counter]
    counter += 1
    for sample_size in sample_sizes:
        parameters = {"obj_names": object_types,
                      "val_names": [],
                      "act_name": "event_activity",
                      "time_name": "event_timestamp",
                      "sep": ","}
        ocel_variant = ocel_import_factory_csv.apply(file_path=f"../src/data/runtime/variant_logs/{filename}_{sample_size}_variant_log.csv", parameters=parameters)
        ocel = ocel_import_factory.apply(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        variant_ocpn = generate_variant_model(ocel,save_path_logs=f"../src/data/runtime/variant_logs/{filename}/{filename}_{sample_size}",object_types = object_types, save=True)        
        start_time = time.time()
        value = baseline_measure(ocel_variant,variant_ocpn,'event_activity','event_id')
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for the baseline measure approach for {filename} and the variant model with {sample_size} traces is {execution_time} seconds")

Generating Variant Models: 100%|██████████| 105/105 [00:27<00:00,  3.82it/s]
Processing Variant Nets: 100%|██████████| 105/105 [00:00<00:00, 7465.96it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
The execution time for the baseline measure approach for order_process and the variant model with 200 traces is 1.1585 seconds


Generating Variant Models: 100%|██████████| 174/174 [00:40<00:00,  4.35it/s]
Processing Variant Nets: 100%|██████████| 174/174 [00:00<00:00, 4523.03it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
The execution time for the baseline measure approach for order_process and the variant model with 500 traces is 4.6279 seconds


Generating Variant Models: 100%|██████████| 228/228 [01:14<00:00,  3.07it/s]
Processing Variant Nets: 100%|██████████| 228/228 [00:00<00:00, 5741.00it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
The execution time for the baseline measure approach for order_process and the variant model with 800 traces is 8.808 seconds


Generating Variant Models: 100%|██████████| 2/2 [00:03<00:00,  1.82s/it]
Processing Variant Nets: 100%|██████████| 2/2 [00:00<00:00, 1001.51it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
The execution time for the baseline measure approach for p2p-normal and the variant model with 200 traces is 0.0548 seconds


Generating Variant Models: 100%|██████████| 2/2 [00:08<00:00,  4.06s/it]
Processing Variant Nets: 100%|██████████| 2/2 [00:00<00:00, 2000.14it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
The execution time for the baseline measure approach for p2p-normal and the variant model with 500 traces is 0.0519 seconds


Generating Variant Models: 100%|██████████| 2/2 [00:13<00:00,  6.62s/it]
Processing Variant Nets: 100%|██████████| 2/2 [00:00<00:00, 1977.51it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
The execution time for the baseline measure approach for p2p-normal and the variant model with 800 traces is 0.0831 seconds


Generating Variant Models: 100%|██████████| 199/199 [01:15<00:00,  2.65it/s]
Processing Variant Nets: 100%|██████████| 199/199 [00:00<00:00, 5328.87it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
The execution time for the baseline measure approach for BPI2017-Final and the variant model with 200 traces is 6.6024 seconds


Generating Variant Models: 100%|██████████| 497/497 [03:00<00:00,  2.75it/s]
Processing Variant Nets: 100%|██████████| 497/497 [00:00<00:00, 2693.30it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
The execution time for the baseline measure approach for BPI2017-Final and the variant model with 500 traces is 25.6404 seconds


Generating Variant Models: 100%|██████████| 787/787 [04:57<00:00,  2.65it/s]
Processing Variant Nets: 100%|██████████| 787/787 [00:00<00:00, 5876.25it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
The execution time for the baseline measure approach for BPI2017-Final and the variant model with 800 traces is 66.9893 seconds


Generating Variant Models: 100%|██████████| 200/200 [01:39<00:00,  2.01it/s]
Processing Variant Nets: 100%|██████████| 200/200 [00:00<00:00, 10455.44it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
The execution time for the baseline measure approach for DS3 and the variant model with 200 traces is 0.7253 seconds


Generating Variant Models: 100%|██████████| 500/500 [01:44<00:00,  4.76it/s]
Processing Variant Nets: 100%|██████████| 500/500 [00:00<00:00, 10393.41it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
The execution time for the baseline measure approach for DS3 and the variant model with 500 traces is 3.1678 seconds


Generating Variant Models: 100%|██████████| 800/800 [03:19<00:00,  4.02it/s]
Processing Variant Nets: 100%|██████████| 800/800 [00:00<00:00, 5854.08it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
The execution time for the baseline measure approach for DS3 and the variant model with 800 traces is 7.1009 seconds


Generating Variant Models: 100%|██████████| 200/200 [02:32<00:00,  1.31it/s]
Processing Variant Nets: 100%|██████████| 200/200 [00:00<00:00, 1043.72it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
The execution time for the baseline measure approach for DS4 and the variant model with 200 traces is 25.0321 seconds


Generating Variant Models: 100%|██████████| 500/500 [04:57<00:00,  1.68it/s]
Processing Variant Nets: 100%|██████████| 500/500 [00:00<00:00, 1391.84it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
The execution time for the baseline measure approach for DS4 and the variant model with 500 traces is 48.3128 seconds


Generating Variant Models: 100%|██████████| 799/799 [05:58<00:00,  2.23it/s]
Processing Variant Nets: 100%|██████████| 799/799 [00:00<00:00, 2900.87it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
The execution time for the baseline measure approach for DS4 and the variant model with 800 traces is 125.3297 seconds
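To compare the printed timings across logs and sample sizes, the output lines can be parsed into records. The sketch below is one possible way to do this and is not part of the original analysis: the names `parse_timing_lines` and `growth_factors` are hypothetical, lines without a model name are labelled `"ocpn"` by assumption, and truncated lines are simply skipped.

```python
import re
from collections import defaultdict

# Matches the print format used in the timing cells of this notebook.
LINE_PATTERN = re.compile(
    r"The execution time for the (?P<measure>.+?) measure approach for (?P<log>\S+)"
    r"(?: and the (?P<model>.+?) model)?"
    r" with (?P<traces>\d+) traces is (?P<seconds>[\d.]+) seconds"
)


def parse_timing_lines(lines):
    """Turn printed timing lines into dictionaries; skip lines that do not match."""
    records = []
    for line in lines:
        match = LINE_PATTERN.search(line)
        if match is None:
            continue  # progress bars, banners, or truncated output
        records.append({
            "measure": match["measure"],
            "log": match["log"],
            "model": match["model"] or "ocpn",  # assumption: no model name means OCPN
            "traces": int(match["traces"]),
            "seconds": float(match["seconds"]),
        })
    return records


def growth_factors(records, base=200):
    """Running time of each sample size relative to the `base` sample size."""
    grouped = defaultdict(dict)
    for record in records:
        key = (record["measure"], record["log"], record["model"])
        grouped[key][record["traces"]] = record["seconds"]
    return {
        key: {n: round(t / times[base], 2) for n, t in sorted(times.items())}
        for key, times in grouped.items()
        if base in times
    }


# Lines copied from the outputs in this notebook, including one truncated line.
SAMPLE_LINES = [
    "The execution time for the baseline measure approach for BPI2017-Final and the variant model with 200 traces is 6.6024 seconds",
    "The execution time for the baseline measure approach for BPI2017-Final and the variant model with 500 traces is 25.6404 seconds",
    "The execution time for the baseline measure approach for BPI2017-Final and the variant model with 800 traces is 66.9893 seconds",
    "The execution time for the baseline measure approach for order_process with 200 traces is 0.0128 seconds",
    "The execution time for the baseline measure approach for DS3 wit",
]

for key, factors in growth_factors(parse_timing_lines(SAMPLE_LINES)).items():
    print(key, factors)
```

For the variant model on BPI2017-Final, for instance, the baseline measure takes roughly ten times as long with 800 traces as with 200 traces.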


# Runtime Alignment Measure Events

As a representative of the alignment measure, we analyze the running time of the event-based variant.

In [2]:
filenames = ["order_process","p2p-normal","BPI2017-Final","DS3","DS4"]
sample_sizes = [200 ,500, 800]

### OCPN model

In [3]:
for filename in filenames:
    for sample_size in sample_sizes:
        ocel = ocel_import_factory.apply(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        ocpn = ocpn_discovery_factory.apply(ocel, parameters={"debug": False})
        start_time = time.time()
        value = alignment_measure_events(ocel,ocpn)
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for the alignment measure approach for {filename} with {sample_size} traces is {execution_time} seconds")

Check the arcs: 100%|██████████| 68/68 [00:00<00:00, 25336.47it/s]
Save the transitions: 100%|██████████| 11/11 [00:00<00:00, 195.14it/s]


The execution time for the alignment measure approach for order_process with 200 traces is 0.0966 seconds


Check the arcs: 100%|██████████| 68/68 [00:00<00:00, 67156.27it/s]
Save the transitions: 100%|██████████| 11/11 [00:00<00:00, 207.16it/s]


The execution time for the alignment measure approach for order_process with 500 traces is 0.0769 seconds


Check the arcs: 100%|██████████| 68/68 [00:00<?, ?it/s]
Save the transitions: 100%|██████████| 11/11 [00:00<00:00, 88.09it/s]


The execution time for the alignment measure approach for order_process with 800 traces is 0.1583 seconds


Check the arcs: 100%|██████████| 40/40 [00:00<?, ?it/s]
Save the transitions: 100%|██████████| 9/9 [00:00<00:00, 523.20it/s]


The execution time for the alignment measure approach for p2p-normal with 200 traces is 0.0334 seconds


Check the arcs: 100%|██████████| 40/40 [00:00<?, ?it/s]
Save the transitions: 100%|██████████| 9/9 [00:00<00:00, 134.91it/s]


The execution time for the alignment measure approach for p2p-normal with 500 traces is 0.0971 seconds


Check the arcs: 100%|██████████| 40/40 [00:00<?, ?it/s]
Save the transitions: 100%|██████████| 9/9 [00:00<00:00, 276.20it/s]


The execution time for the alignment measure approach for p2p-normal with 800 traces is 0.049 seconds


Check the arcs: 100%|██████████| 114/114 [00:00<00:00, 10414.73it/s]
Save the transitions: 100%|██████████| 19/19 [00:00<00:00, 328.73it/s]


The execution time for the alignment measure approach for BPI2017-Final with 200 traces is 0.0742 seconds


Check the arcs: 100%|██████████| 130/130 [00:00<00:00, 43412.38it/s]
Save the transitions: 100%|██████████| 20/20 [00:00<00:00, 110.26it/s]


The execution time for the alignment measure approach for BPI2017-Final with 500 traces is 0.2018 seconds


Check the arcs: 100%|██████████| 146/146 [00:00<?, ?it/s]
Save the transitions: 100%|██████████| 20/20 [00:00<00:00, 89.25it/s]


The execution time for the alignment measure approach for BPI2017-Final with 800 traces is 0.2597 seconds


Check the arcs: 100%|██████████| 100/100 [00:00<?, ?it/s]
Save the transitions: 100%|██████████| 8/8 [00:00<00:00, 50.45it/s]


The execution time for the alignment measure approach for DS3 with 200 traces is 0.182 seconds


Check the arcs: 100%|██████████| 102/102 [00:00<?, ?it/s]
Save the transitions: 100%|██████████| 8/8 [00:00<00:00, 43.61it/s]


The execution time for the alignment measure approach for DS3 with 500 traces is 0.1988 seconds


Check the arcs: 100%|██████████| 124/124 [00:00<00:00, 123420.43it/s]
Save the transitions: 100%|██████████| 8/8 [00:00<00:00, 29.39it/s]


The execution time for the alignment measure approach for DS3 with 800 traces is 0.3068 seconds


Check the arcs: 100%|██████████| 342/342 [00:00<00:00, 27170.22it/s]
Save the transitions: 100%|██████████| 56/56 [00:01<00:00, 38.66it/s]


The execution time for the alignment measure approach for DS4 with 200 traces is 1.4776 seconds


Check the arcs: 100%|██████████| 314/314 [00:00<00:00, 27904.56it/s]
Save the transitions: 100%|██████████| 61/61 [00:02<00:00, 21.39it/s]


The execution time for the alignment measure approach for DS4 with 500 traces is 2.8949 seconds


Check the arcs: 100%|██████████| 312/312 [00:00<00:00, 15416.24it/s]
Save the transitions: 100%|██████████| 63/63 [00:03<00:00, 18.06it/s]

The execution time for the alignment measure approach for DS4 with 800 traces is 3.521 seconds





### Happy Path model

In [40]:
for filename in filenames:
    for sample_size in sample_sizes:
        ocel = ocel_import_factory.apply(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        happy_path__ocel = get_happy_path_log(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        happy_path_ocpn = ocpn_discovery_factory.apply(happy_path__ocel, parameters={"debug": False})
        start_time = time.time()
        value = alignment_measure_events(ocel,happy_path_ocpn)
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for the alignment measure approach for {filename} and the happy path model with {sample_size} traces is {execution_time} seconds")

Check the arcs: 100%|██████████| 30/30 [00:00<00:00, 30023.65it/s]
Save the transitions: 100%|██████████| 10/10 [00:00<00:00, 232.50it/s]


The execution time for the alignment measure approach for order_process and the happy path model with 200 traces is 0.068 seconds


Check the arcs: 100%|██████████| 30/30 [00:00<00:00, 34164.84it/s]
Save the transitions: 100%|██████████| 10/10 [00:00<00:00, 156.24it/s]


The execution time for the alignment measure approach for order_process and the happy path model with 500 traces is 0.0931 seconds


Check the arcs: 100%|██████████| 30/30 [00:00<00:00, 15311.40it/s]
Save the transitions: 100%|██████████| 10/10 [00:00<00:00, 81.62it/s]


The execution time for the alignment measure approach for order_process and the happy path model with 800 traces is 0.1515 seconds


Check the arcs: 100%|██████████| 38/38 [00:00<00:00, 38130.04it/s]
Save the transitions: 100%|██████████| 9/9 [00:00<00:00, 290.28it/s]


The execution time for the alignment measure approach for p2p-normal and the happy path model with 200 traces is 0.054 seconds


Check the arcs: 100%|██████████| 38/38 [00:00<00:00, 36149.59it/s]
Save the transitions: 100%|██████████| 9/9 [00:00<00:00, 249.19it/s]


The execution time for the alignment measure approach for p2p-normal and the happy path model with 500 traces is 0.0591 seconds


Check the arcs: 100%|██████████| 38/38 [00:00<00:00, 19014.98it/s]
Save the transitions: 100%|██████████| 9/9 [00:00<00:00, 83.28it/s]


The execution time for the alignment measure approach for p2p-normal and the happy path model with 800 traces is 0.1403 seconds


Check the arcs: 100%|██████████| 38/38 [00:00<00:00, 37957.50it/s]
Save the transitions: 100%|██████████| 9/9 [00:00<00:00, 231.57it/s]


The execution time for the alignment measure approach for BPI2017-Final and the happy path model with 200 traces is 0.0529 seconds


Check the arcs: 100%|██████████| 38/38 [00:00<00:00, 19067.30it/s]
Save the transitions: 100%|██████████| 9/9 [00:00<00:00, 63.77it/s]


The execution time for the alignment measure approach for BPI2017-Final and the happy path model with 500 traces is 0.1721 seconds


Check the arcs: 100%|██████████| 42/42 [00:00<00:00, 20946.58it/s]
Save the transitions: 100%|██████████| 12/12 [00:00<00:00, 40.84it/s]


The execution time for the alignment measure approach for BPI2017-Final and the happy path model with 800 traces is 0.3308 seconds


Check the arcs: 100%|██████████| 78/78 [00:00<00:00, 73900.09it/s]
Save the transitions: 100%|██████████| 6/6 [00:00<00:00, 49.16it/s]


The execution time for the alignment measure approach for DS3 and the happy path model with 200 traces is 0.1393 seconds


Check the arcs: 100%|██████████| 78/78 [00:00<00:00, 39156.88it/s]
Save the transitions: 100%|██████████| 6/6 [00:00<00:00, 17.35it/s]


The execution time for the alignment measure approach for DS3 and the happy path model with 500 traces is 0.3868 seconds


Check the arcs: 100%|██████████| 78/78 [00:00<00:00, 127446.71it/s]
Save the transitions: 100%|██████████| 6/6 [00:00<00:00, 43.48it/s]


The execution time for the alignment measure approach for DS3 and the happy path model with 800 traces is 0.144 seconds


Check the arcs: 100%|██████████| 154/154 [00:00<00:00, 51923.06it/s]
Save the transitions: 100%|██████████| 32/32 [00:00<00:00, 49.29it/s]


The execution time for the alignment measure approach for DS4 and the happy path model with 200 traces is 0.6612 seconds


Check the arcs: 100%|██████████| 154/154 [00:00<00:00, 96738.48it/s]
Save the transitions: 100%|██████████| 32/32 [00:01<00:00, 31.71it/s]


The execution time for the alignment measure approach for DS4 and the happy path model with 500 traces is 1.0162 seconds


Check the arcs: 100%|██████████| 142/142 [00:00<00:00, 71508.12it/s]
Save the transitions: 100%|██████████| 32/32 [00:01<00:00, 25.85it/s]

The execution time for the alignment measure approach for DS4 and the happy path model with 800 traces is 1.2589 seconds





### Flower model

In [41]:
objects = [["order","item","delivery"],["PURCHORD","INVOICE","PURCHREQ","MATERIAL","GDSRCPT"],["application","offer"],["incident","customer"], ["Payment application","Control summary","Entitlement application","Geo parcel document","Inspection","Reference alignment"]]
counter = 0
for filename in filenames:
    ots = objects[counter]
    counter += 1
    for sample_size in sample_sizes:
        ocel = ocel_import_factory.apply(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        flower_ocpn = create_flower_model(f"../src/data/runtime/{filename}_{sample_size}.jsonocel",ots)
        start_time = time.time()
        value = alignment_measure_events(ocel,flower_ocpn)
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for the alignment measure approach for {filename} and the flower model with {sample_size} traces is {execution_time} seconds")

Check the arcs: 100%|██████████| 32/32 [00:00<?, ?it/s]
Save the transitions: 100%|██████████| 11/11 [00:00<00:00, 912.74it/s]


The execution time for the alignment measure approach for order_process and the flower model with 200 traces is 0.019 seconds


Check the arcs: 100%|██████████| 32/32 [00:00<?, ?it/s]
Save the transitions: 100%|██████████| 11/11 [00:00<00:00, 525.54it/s]


The execution time for the alignment measure approach for order_process and the flower model with 500 traces is 0.029 seconds


Check the arcs: 100%|██████████| 32/32 [00:00<?, ?it/s]
Save the transitions: 100%|██████████| 11/11 [00:00<00:00, 405.47it/s]


The execution time for the alignment measure approach for order_process and the flower model with 800 traces is 0.0332 seconds


Check the arcs: 100%|██████████| 38/38 [00:00<?, ?it/s]
Save the transitions: 100%|██████████| 9/9 [00:00<00:00, 1044.25it/s]


The execution time for the alignment measure approach for p2p-normal and the flower model with 200 traces is 0.0156 seconds


Check the arcs: 100%|██████████| 38/38 [00:00<00:00, 37867.32it/s]
Save the transitions: 100%|██████████| 9/9 [00:00<00:00, 562.54it/s]


The execution time for the alignment measure approach for p2p-normal and the flower model with 500 traces is 0.024 seconds


Check the arcs: 100%|██████████| 38/38 [00:00<?, ?it/s]
Save the transitions: 100%|██████████| 9/9 [00:00<00:00, 500.00it/s]


The execution time for the alignment measure approach for p2p-normal and the flower model with 800 traces is 0.0229 seconds


Check the arcs: 100%|██████████| 52/52 [00:00<00:00, 55681.34it/s]
Save the transitions: 100%|██████████| 19/19 [00:00<00:00, 558.24it/s]


The execution time for the alignment measure approach for BPI2017-Final and the flower model with 200 traces is 0.042 seconds


Check the arcs: 100%|██████████| 54/54 [00:00<00:00, 54081.28it/s]
Save the transitions: 100%|██████████| 20/20 [00:00<00:00, 322.16it/s]


The execution time for the alignment measure approach for BPI2017-Final and the flower model with 500 traces is 0.067 seconds


Check the arcs: 100%|██████████| 54/54 [00:00<00:00, 52648.17it/s]
Save the transitions: 100%|██████████| 20/20 [00:00<00:00, 172.42it/s]


The execution time for the alignment measure approach for BPI2017-Final and the flower model with 800 traces is 0.121 seconds


Check the arcs: 100%|██████████| 20/20 [00:00<?, ?it/s]
Save the transitions: 100%|██████████| 8/8 [00:00<00:00, 140.49it/s]


The execution time for the alignment measure approach for DS3 and the flower model with 200 traces is 0.0627 seconds


Check the arcs: 100%|██████████| 20/20 [00:00<00:00, 19949.13it/s]
Save the transitions: 100%|██████████| 8/8 [00:00<00:00, 66.65it/s]


The execution time for the alignment measure approach for DS3 and the flower model with 500 traces is 0.126 seconds


Check the arcs: 100%|██████████| 20/20 [00:00<?, ?it/s]
Save the transitions: 100%|██████████| 8/8 [00:00<00:00, 48.72it/s]


The execution time for the alignment measure approach for DS3 and the flower model with 800 traces is 0.1724 seconds


Check the arcs: 100%|██████████| 132/132 [00:00<00:00, 169570.64it/s]
Save the transitions: 100%|██████████| 56/56 [00:01<00:00, 42.33it/s]


The execution time for the alignment measure approach for DS4 and the flower model with 200 traces is 1.3329 seconds


Check the arcs: 100%|██████████| 142/142 [00:00<00:00, 47333.00it/s]
Save the transitions: 100%|██████████| 61/61 [00:02<00:00, 26.81it/s]


The execution time for the alignment measure approach for DS4 and the flower model with 500 traces is 2.2883 seconds


Check the arcs: 100%|██████████| 146/146 [00:00<00:00, 71689.11it/s]
Save the transitions: 100%|██████████| 63/63 [00:02<00:00, 23.89it/s]

The execution time for the alignment measure approach for DS4 and the flower model with 800 traces is 2.6632 seconds





### Variant Model

In [42]:
objects = [["order","item","delivery"],["PURCHORD","INVOICE","PURCHREQ","MATERIAL","GDSRCPT"],["application","offer"],["incident","customer"], ["Payment application","Control summary","Entitlement application","Geo parcel document","Inspection","Reference alignment"]]
counter = 0
for filename in filenames:
    object_types = objects[counter]
    counter += 1
    for sample_size in sample_sizes:
        parameters = {"obj_names": object_types,
                      "val_names": [],
                      "act_name": "event_activity",
                      "time_name": "event_timestamp",
                      "sep": ","}
        ocel_variant = ocel_import_factory_csv.apply(file_path=f"../src/data/runtime/variant_logs/{filename}_{sample_size}_variant_log.csv", parameters=parameters)
        ocel = ocel_import_factory.apply(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        variant_ocpn = generate_variant_model(ocel,save_path_logs=f"../src/data/runtime/variant_logs/{filename}/{filename}_{sample_size}",object_types = object_types)        
        start_time = time.time()
        value = alignment_measure_events(ocel_variant,variant_ocpn)
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for the alignment measure approach for {filename} and the variant model with {sample_size} traces is {execution_time} seconds")

Generating Variant Models: 100%|██████████| 105/105 [00:11<00:00,  9.40it/s]
Processing Variant Nets: 100%|██████████| 105/105 [00:00<00:00, 13127.52it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########


Check the arcs: 100%|██████████| 3484/3484 [00:01<00:00, 2798.44it/s]
Save the transitions: 100%|██████████| 1097/1097 [00:03<00:00, 354.42it/s]


The execution time for the alignment measure approach for order_process and the variant model with 200 traces is 4.3513 seconds


Generating Variant Models: 100%|██████████| 174/174 [00:25<00:00,  6.71it/s]
Processing Variant Nets: 100%|██████████| 174/174 [00:00<00:00, 12264.87it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########


Check the arcs: 100%|██████████| 5824/5824 [00:04<00:00, 1388.14it/s]
Save the transitions: 100%|██████████| 1816/1816 [00:14<00:00, 129.44it/s]


The execution time for the alignment measure approach for order_process and the variant model with 500 traces is 18.2552 seconds


Generating Variant Models: 100%|██████████| 228/228 [01:03<00:00,  3.60it/s]
Processing Variant Nets: 100%|██████████| 228/228 [00:00<00:00, 3999.94it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########


Check the arcs: 100%|██████████| 7724/7724 [00:13<00:00, 575.04it/s]
Save the transitions: 100%|██████████| 2399/2399 [00:37<00:00, 63.97it/s]


The execution time for the alignment measure approach for order_process and the variant model with 800 traces is 50.9762 seconds


Generating Variant Models: 100%|██████████| 2/2 [00:03<00:00,  1.89s/it]
Processing Variant Nets: 100%|██████████| 2/2 [00:00<00:00, 1940.01it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########


Check the arcs: 100%|██████████| 76/76 [00:00<00:00, 25309.02it/s]
Save the transitions: 100%|██████████| 18/18 [00:00<00:00, 391.29it/s]


The execution time for the alignment measure approach for p2p-normal and the variant model with 200 traces is 0.0723 seconds


Generating Variant Models: 100%|██████████| 2/2 [00:07<00:00,  3.77s/it]
Processing Variant Nets: 100%|██████████| 2/2 [00:00<00:00, 1994.91it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########


Check the arcs: 100%|██████████| 76/76 [00:00<00:00, 37903.34it/s]
Save the transitions: 100%|██████████| 18/18 [00:00<00:00, 321.41it/s]


The execution time for the alignment measure approach for p2p-normal and the variant model with 500 traces is 0.077 seconds


Generating Variant Models: 100%|██████████| 2/2 [00:10<00:00,  5.38s/it]
Processing Variant Nets: 100%|██████████| 2/2 [00:00<00:00, 1997.29it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########


Check the arcs: 100%|██████████| 76/76 [00:00<00:00, 38025.42it/s]
Save the transitions: 100%|██████████| 18/18 [00:00<00:00, 246.29it/s]


The execution time for the alignment measure approach for p2p-normal and the variant model with 800 traces is 0.0931 seconds


Generating Variant Models: 100%|██████████| 199/199 [01:09<00:00,  2.87it/s]
Processing Variant Nets: 100%|██████████| 199/199 [00:00<00:00, 7960.43it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########


Check the arcs: 100%|██████████| 11914/11914 [00:19<00:00, 596.63it/s]
Save the transitions: 100%|██████████| 2632/2632 [00:36<00:00, 72.73it/s] 


The execution time for the alignment measure approach for BPI2017-Final and the variant model with 200 traces is 56.188 seconds


Generating Variant Models: 100%|██████████| 497/497 [02:47<00:00,  2.97it/s]
Processing Variant Nets: 100%|██████████| 497/497 [00:00<00:00, 4477.52it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########


Check the arcs: 100%|██████████| 28408/28408 [02:31<00:00, 187.43it/s]
Save the transitions: 100%|██████████| 6546/6546 [03:45<00:00, 29.06it/s]


The execution time for the alignment measure approach for BPI2017-Final and the variant model with 500 traces is 376.8567 seconds


Generating Variant Models: 100%|██████████| 787/787 [04:09<00:00,  3.16it/s]
Processing Variant Nets: 100%|██████████| 787/787 [00:00<00:00, 4888.24it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########


Check the arcs: 100%|██████████| 43640/43640 [06:17<00:00, 115.53it/s]
Save the transitions: 100%|██████████| 10335/10335 [08:50<00:00, 19.48it/s]


The execution time for the alignment measure approach for BPI2017-Final and the variant model with 800 traces is 908.2175 seconds


Generating Variant Models: 100%|██████████| 200/200 [01:59<00:00,  1.67it/s]
Processing Variant Nets: 100%|██████████| 200/200 [00:00<00:00, 5000.09it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########


Check the arcs: 100%|██████████| 12498/12498 [00:09<00:00, 1322.91it/s]
Save the transitions: 100%|██████████| 1071/1071 [00:15<00:00, 70.17it/s]


The execution time for the alignment measure approach for DS3 and the variant model with 200 traces is 24.7384 seconds


Generating Variant Models: 100%|██████████| 500/500 [04:08<00:00,  2.01it/s]
Processing Variant Nets: 100%|██████████| 500/500 [00:00<00:00, 4386.28it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########


Check the arcs: 100%|██████████| 30084/30084 [01:01<00:00, 493.14it/s]
Save the transitions: 100%|██████████| 2638/2638 [01:17<00:00, 33.95it/s]


The execution time for the alignment measure approach for DS3 and the variant model with 500 traces is 138.7508 seconds


Generating Variant Models: 100%|██████████| 800/800 [03:09<00:00,  4.22it/s]
Processing Variant Nets: 100%|██████████| 800/800 [00:00<00:00, 12049.27it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########


Check the arcs: 100%|██████████| 46494/46494 [00:56<00:00, 824.24it/s]
Save the transitions: 100%|██████████| 4153/4153 [01:11<00:00, 57.88it/s]


The execution time for the alignment measure approach for DS3 and the variant model with 800 traces is 128.1757 seconds


Generating Variant Models: 100%|██████████| 200/200 [01:37<00:00,  2.05it/s]
Processing Variant Nets: 100%|██████████| 200/200 [00:00<00:00, 4196.68it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########


Check the arcs: 100%|██████████| 33208/33208 [01:15<00:00, 438.38it/s]
Save the transitions: 100%|██████████| 6837/6837 [02:20<00:00, 48.56it/s]


The execution time for the alignment measure approach for DS4 and the variant model with 200 traces is 216.5817 seconds


Generating Variant Models: 100%|██████████| 500/500 [03:17<00:00,  2.53it/s]
Processing Variant Nets: 100%|██████████| 500/500 [00:00<00:00, 3879.27it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########


Check the arcs: 100%|██████████| 79746/79746 [07:45<00:00, 171.16it/s]
Save the transitions: 100%|██████████| 16784/16784 [11:49<00:00, 23.66it/s]


The execution time for the alignment measure approach for DS4 and the variant model with 500 traces is 1175.2838 seconds


Generating Variant Models: 100%|██████████| 799/799 [04:59<00:00,  2.67it/s]
Processing Variant Nets: 100%|██████████| 799/799 [00:00<00:00, 4120.74it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########


Check the arcs: 100%|██████████| 120656/120656 [19:22<00:00, 103.80it/s]
Save the transitions: 100%|██████████| 25901/25901 [25:45<00:00, 16.75it/s]

The execution time for the alignment measure approach for DS4 and the variant model with 800 traces is 2708.4326 seconds
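The timing lines printed above follow a fixed pattern, so they can be collected into records instead of being read from the raw output. The following is a minimal sketch, assuming the cell output has been captured as a single string; `parse_timings` and `TIMING_RE` are hypothetical names and not part of the repository.

```python
import re

# Matches lines such as
# "The execution time for the alignment measure approach for DS4 and the
#  variant model with 800 traces is 2708.4326 seconds".
# The "and the ... model" part is optional because some cells omit it.
TIMING_RE = re.compile(
    r"The execution time for the (?P<measure>.+?) approach"
    r" for (?P<log>\S+)"
    r"(?: and the (?P<model>.+?) model)?"
    r" with (?P<traces>\d+) traces is (?P<seconds>[\d.]+) seconds"
)


def parse_timings(output):
    """Return one dict per timing line found in the captured output."""
    records = []
    for match in TIMING_RE.finditer(output):
        records.append({
            "measure": match["measure"],
            "log": match["log"],
            "model": match["model"],  # None if the line names no model
            "traces": int(match["traces"]),
            "seconds": float(match["seconds"]),
        })
    return records
```

Progress-bar lines and the Petri net banner lines do not match the pattern and are skipped, so the whole cell output can be passed in unchanged.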





# Runtime Negative Events Measure with weighting

As a representative of the Negative Events Measure, we analyse the running time of the measure with the weighting approach.

In [3]:
filenames = ["order_process","p2p-normal","BPI2017-Final","DS3","DS4"]
sample_sizes = [200 ,500, 800]

### OCPN model

In [None]:
# Has been run on the server
for filename in filenames:
    for sample_size in sample_sizes:
        ocel = ocel_import_factory.apply(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        ocpn = ocpn_discovery_factory.apply(ocel, parameters={"debug": False})
        start_time = time.time()
        value, AG, DG = negative_events_with_weighting(ocel,ocpn)
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for the negative measure approach for {filename} with {sample_size} traces is {execution_time} seconds")

### Happy Path model

In [None]:
# Has been run on the server
filenames = ["order_process","p2p-normal","BPI2017-Final", "DS3","DS4"]
sample_sizes = [200 ,500, 800]
for filename in filenames:
    for sample_size in sample_sizes:
        ocel = ocel_import_factory.apply(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        happy_path__ocel = get_happy_path_log(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        happy_path_ocpn = ocpn_discovery_factory.apply(happy_path__ocel, parameters={"debug": False})
        start_time = time.time()
        value, AG, DG = negative_events_with_weighting(ocel,happy_path_ocpn)
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for the negative measure approach for {filename} and the happy path model with {sample_size} traces is {execution_time} seconds")

### Flower model

In [4]:
# DS4 has been run on the server
filenames = ["order_process","p2p-normal","BPI2017-Final","DS3","DS4"]
sample_sizes = [200 ,500, 800]
objects = [["order","item","delivery"],["PURCHORD","INVOICE","PURCHREQ","MATERIAL","GDSRCPT"],["application","offer"],["incident","customer"], ["Payment application","Control summary","Entitlement application","Geo parcel document","Inspection","Reference alignment"]]
counter = 0
for filename in filenames:
    ots = objects[counter]
    counter += 1
    for sample_size in sample_sizes:
        ocel = ocel_import_factory.apply(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        flower_ocpn = create_flower_model(f"../src/data/runtime/{filename}_{sample_size}.jsonocel",ots)
        start_time = time.time()
        value, AG, DG = negative_events_with_weighting(ocel,flower_ocpn)
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for the negative measure approach for {filename} and the flower model with {sample_size} traces is {execution_time} seconds")

Check the arcs: 100%|██████████| 32/32 [00:00<?, ?it/s]
Calculate Generalization for all process executions: 100%|██████████| 200/200 [00:48<00:00,  4.16it/s]


The execution time for the negative measure approach for order_process and the flower model with 200 traces is 48.1581 seconds


Check the arcs: 100%|██████████| 32/32 [00:00<?, ?it/s]
Calculate Generalization for all process executions: 100%|██████████| 500/500 [04:21<00:00,  1.91it/s]


The execution time for the negative measure approach for order_process and the flower model with 500 traces is 262.1218 seconds


Check the arcs: 100%|██████████| 32/32 [00:00<?, ?it/s]
Calculate Generalization for all process executions: 100%|██████████| 800/800 [11:30<00:00,  1.16it/s]


The execution time for the negative measure approach for order_process and the flower model with 800 traces is 691.1698 seconds


Check the arcs: 100%|██████████| 38/38 [00:00<?, ?it/s]
Calculate Generalization for all process executions: 100%|██████████| 200/200 [00:19<00:00, 10.51it/s]


The execution time for the negative measure approach for p2p-normal and the flower model with 200 traces is 19.1667 seconds


Check the arcs: 100%|██████████| 38/38 [00:00<00:00, 35937.67it/s]
Calculate Generalization for all process executions: 100%|██████████| 500/500 [02:10<00:00,  3.83it/s]


The execution time for the negative measure approach for p2p-normal and the flower model with 500 traces is 130.6515 seconds


Check the arcs: 100%|██████████| 38/38 [00:00<?, ?it/s]
Calculate Generalization for all process executions: 100%|██████████| 800/800 [05:46<00:00,  2.31it/s]


The execution time for the negative measure approach for p2p-normal and the flower model with 800 traces is 346.4434 seconds


Check the arcs: 100%|██████████| 52/52 [00:00<?, ?it/s]
Calculate Generalization for all process executions: 100%|██████████| 200/200 [08:28<00:00,  2.54s/it]


The execution time for the negative measure approach for BPI2017-Final and the flower model with 200 traces is 509.1066 seconds


Check the arcs: 100%|██████████| 54/54 [00:00<00:00, 7384.34it/s]
Calculate Generalization for all process executions: 100%|██████████| 500/500 [55:23<00:00,  6.65s/it] 


The execution time for the negative measure approach for BPI2017-Final and the flower model with 500 traces is 3323.8055 seconds


Check the arcs: 100%|██████████| 54/54 [00:00<?, ?it/s]
Calculate Generalization for all process executions: 100%|██████████| 800/800 [2:06:25<00:00,  9.48s/it]  


The execution time for the negative measure approach for BPI2017-Final and the flower model with 800 traces is 7586.3597 seconds


Check the arcs: 100%|██████████| 20/20 [00:00<?, ?it/s]
Calculate Generalization for all process executions: 100%|██████████| 200/200 [1:31:16<00:00, 27.38s/it]


The execution time for the negative measure approach for DS3 and the flower model with 200 traces is 5477.4821 seconds


Check the arcs: 100%|██████████| 20/20 [00:00<00:00, 10225.02it/s]
Calculate Generalization for all process executions: 100%|██████████| 500/500 [5:29:15<00:00, 39.51s/it]   


The execution time for the negative measure approach for DS3 and the flower model with 500 traces is 19758.8332 seconds


Check the arcs: 100%|██████████| 20/20 [00:00<?, ?it/s]
Calculate Generalization for all process executions: 100%|██████████| 800/800 [9:12:11<00:00, 41.41s/it]   


The execution time for the negative measure approach for DS3 and the flower model with 800 traces is 33134.3191 seconds
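To compare how the running time grows with the number of traces, the reported times can be turned into an empirical scaling exponent, i.e. the slope of log(time) against log(traces) between two sample sizes. The sketch below uses the order_process and p2p-normal times printed above; `scaling_exponent` and `exponents` are hypothetical helper names. A slope near 1 would indicate roughly linear growth in the number of traces, a slope near 2 roughly quadratic growth. With only three sample sizes per log this is an indication, not a fit.

```python
import math

# Times in seconds, copied from the flower-model output above.
reported = {
    "order_process": {200: 48.1581, 500: 262.1218, 800: 691.1698},
    "p2p-normal": {200: 19.1667, 500: 130.6515, 800: 346.4434},
}


def scaling_exponent(n1, t1, n2, t2):
    """Slope of log(time) against log(traces) between two measurements."""
    return math.log(t2 / t1) / math.log(n2 / n1)


def exponents(times):
    """Exponents between each pair of consecutive sample sizes."""
    sizes = sorted(times)
    return [
        scaling_exponent(a, times[a], b, times[b])
        for a, b in zip(sizes, sizes[1:])
    ]


for name, times in reported.items():
    print(name, [round(e, 2) for e in exponents(times)])
```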


### Variant Model

In [9]:
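# Has been run on the server
filenames = ["order_process","p2p-normal" ,"BPI2017-Final","DS3","DS4"]
sample_sizes = [500, 800]
objects = [["order","item","delivery"],["PURCHORD","INVOICE","PURCHREQ","MATERIAL","GDSRCPT"],["application","offer"],["incident","customer"], ["Payment application","Control summary","Entitlement application","Geo parcel document","Inspection","Reference alignment"]]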
counter = 0
for filename in filenames:
    object_types = objects[counter]
    counter += 1
    for sample_size in sample_sizes:
        parameters = {"obj_names": object_types,
                      "val_names": [],
                      "act_name": "event_activity",
                      "time_name": "event_timestamp",
                      "sep": ","}
        ocel_variant = ocel_import_factory_csv.apply(file_path=f"../src/data/runtime/variant_logs/{filename}_{sample_size}_variant_log.csv", parameters=parameters)
        ocel = ocel_import_factory.apply(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        variant_ocpn = generate_variant_model(ocel,save_path_logs=f"../src/data/runtime/variant_logs/{filename}/{filename}_{sample_size}",object_types = object_types)        
        start_time = time.time()
        value, AG, DG = negative_events_with_weighting(ocel_variant,variant_ocpn)
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for the negative measure approach for {filename} and the variant model with {sample_size} traces is {execution_time} seconds")

Generating Variant Models: 100%|██████████| 174/174 [00:11<00:00, 15.57it/s]
Processing Variant Nets: 100%|██████████| 174/174 [00:00<00:00, 15820.70it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########


Check the arcs: 100%|██████████| 5824/5824 [00:03<00:00, 1528.64it/s]
Calculate Generalization for all process executions:   0%|          | 0/500 [00:15<?, ?it/s]


KeyboardInterrupt: 
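The loops in this notebook pair each log with its object types through a positional counter, which breaks as soon as the two lists differ in length or order. A minimal sketch of a keyed alternative is shown below; the object types are the ones used in the cells above, while `OBJECT_TYPES` and `iter_runs` are hypothetical names introduced for illustration.

```python
# Object types per log, keyed by filename instead of list position.
OBJECT_TYPES = {
    "order_process": ["order", "item", "delivery"],
    "p2p-normal": ["PURCHORD", "INVOICE", "PURCHREQ", "MATERIAL", "GDSRCPT"],
    "BPI2017-Final": ["application", "offer"],
    "DS3": ["incident", "customer"],
    "DS4": ["Payment application", "Control summary",
            "Entitlement application", "Geo parcel document",
            "Inspection", "Reference alignment"],
}


def iter_runs(filenames, sample_sizes):
    """Yield (filename, sample_size, object_types) for every combination."""
    for filename in filenames:
        # Raises KeyError for an unknown log instead of picking a wrong entry.
        object_types = OBJECT_TYPES[filename]
        for sample_size in sample_sizes:
            yield filename, sample_size, object_types
```

A loop such as `for filename, sample_size, object_types in iter_runs(filenames, sample_sizes):` then replaces the nested loops and the counter, and a subset of the logs can be run without touching the object type list.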

# Runtime VAE

As a first step, we train the VAE and store the generated logs so that they can be reused by the VAE measure; the time needed for training and generation is recorded as well.
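The cells in this notebook measure elapsed time with `time.time()` around the call of interest. A minimal sketch of the same idea as a reusable context manager is given below; it uses `time.perf_counter()`, a monotonic clock intended for measuring intervals. The name `timed` and the optional `results` dictionary are assumptions made for illustration, not part of the repository.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results=None):
    """Print and optionally store the wall-clock time of the enclosed block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Rounded to four decimals, as in the other cells of this notebook.
        elapsed = round(time.perf_counter() - start, 4)
        if results is not None:
            results[label] = elapsed
        print(f"The execution time for {label} is {elapsed} seconds")
```

Wrapping a call as `with timed("VAE training for order_process with 200 traces", results):` keeps the measured region explicit and collects all timings in one dictionary.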

In [7]:
filenames = ["order_process","p2p-normal","BPI2017-Final","DS3","DS4"]
sample_sizes = [200 ,500, 800]

In [3]:
def create_VAE_logs(filename, sample_size):
    ocel = ocel_import_factory.apply(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
    ocpn = ocpn_discovery_factory.apply(ocel, parameters={"debug": False})
    train_log = create_VAE_input(ocel,f'../src/data/runtime/VAE/{filename}_{sample_size}.txt')
    start_time = time.time()
    timesteps_max, enc_tokens, characters, char2id, id2char, x, x_decoder = get_text_data(num_samples=sample_size,
                                                                                      data_path=f'../src/data/runtime/VAE/{filename}_{sample_size}.txt')
    input_dim, timesteps = x.shape[-1], x.shape[-2]
    batch_size, latent_dim = 1, 191
    intermediate_dim, epochs = 353, 20

    vae, enc, gen, stepper = create_lstm_vae(input_dim,
                                             batch_size=batch_size,
                                             intermediate_dim=intermediate_dim,
                                             latent_dim=latent_dim,
                                            )
    vae.fit([x, x_decoder], x, epochs=epochs, verbose=1)
    
    # Longest trace (in characters) in the training log; used as the decoding limit
    max_length = max(len(string) for string in train_log)

    def decode(s):
        return decode_sequence(s, gen, stepper, input_dim, char2id, id2char, max_length)

    log = []

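    for _ in tqdm(range(sample_size), desc="Sample Traces"):
        # Pick a random training trace; randint's upper bound is exclusive
        id_from = np.random.randint(0, x.shape[0])
        # Encode the trace to the parameters of its latent distribution
        m_from, std_from = enc.predict([[x[id_from]]])
        # Draw a latent point around the encoded trace
        seq_from = np.random.normal(size=(latent_dim,))
        seq_from = m_from + std_from * seq_from
        # Decode the latent point back into a trace
        log.append([decode(seq_from)])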
    df_log = process_log(log, ocel, ocpn, f'../src/data/runtime/VAE/{filename}_{sample_size}_generated.csv')
    execution_time = np.round(time.time() - start_time,4)
    print(f"The execution time for VAE training for {filename} with {sample_size} traces is {execution_time} seconds")

In [None]:
filenames = ["order_process","p2p-normal","BPI2017-Final","DS3"]
sample_sizes = [200 ,500, 800]
for filename in filenames:
    for sample_size in sample_sizes:
        create_VAE_logs(filename,sample_size)

Number of samples: 200
Number of unique input tokens: 13
Max sequence length for inputs: 21
Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_1 (InputLayer)           [(None, None, 13)]   0           []                               
                                                                                                  
 lstm (LSTM)                    (None, 353)          518204      ['input_1[0][0]']                
                                                                                                  
 dense (Dense)                  (None, 191)          67614       ['lstm[0][0]']                   
                                                                                                  
 dense_1 (Dense)                (None, 191)          67614       ['lstm[0][0]']                   
  

Sample Traces: 100%|██████████| 200/200 [00:13<00:00, 14.97it/s]


The execution time for VAE training for order_process with 200 traces is 41.2974 seconds
Number of samples: 500
Number of unique input tokens: 13
Max sequence length for inputs: 23
Model: "model_4"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_6 (InputLayer)           [(None, None, 13)]   0           []                               
                                                                                                  
 lstm_2 (LSTM)                  (None, 353)          518204      ['input_6[0][0]']                
                                                                                                  
 dense_4 (Dense)                (None, 191)          67614       ['lstm_2[0][0]']                 
                                                                                                  
 dense_5 (

Sample Traces: 100%|██████████| 500/500 [00:28<00:00, 17.48it/s]


The execution time for VAE training for order_process with 500 traces is 112.8398 seconds
Number of samples: 800
Number of unique input tokens: 13
Max sequence length for inputs: 22
Model: "model_8"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_11 (InputLayer)          [(None, None, 13)]   0           []                               
                                                                                                  
 lstm_4 (LSTM)                  (None, 353)          518204      ['input_11[0][0]']               
                                                                                                  
 dense_8 (Dense)                (None, 191)          67614       ['lstm_4[0][0]']                 
                                                                                                  
 dense_9 

Sample Traces: 100%|██████████| 800/800 [00:48<00:00, 16.63it/s]


The execution time for VAE training for order_process with 800 traces is 282.4236 seconds
Number of samples: 200
Number of unique input tokens: 11
Max sequence length for inputs: 11
Model: "model_12"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_16 (InputLayer)          [(None, None, 11)]   0           []                               
                                                                                                  
 lstm_6 (LSTM)                  (None, 353)          515380      ['input_16[0][0]']               
                                                                                                  
 dense_12 (Dense)               (None, 191)          67614       ['lstm_6[0][0]']                 
                                                                                                  
 dense_1

Sample Traces: 100%|██████████| 200/200 [00:10<00:00, 19.36it/s]


The execution time for VAE training for p2p-normal with 200 traces is 29.0265 seconds
Number of samples: 500
Number of unique input tokens: 11
Max sequence length for inputs: 11
Model: "model_16"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_21 (InputLayer)          [(None, None, 11)]   0           []                               
                                                                                                  
 lstm_8 (LSTM)                  (None, 353)          515380      ['input_21[0][0]']               
                                                                                                  
 dense_16 (Dense)               (None, 191)          67614       ['lstm_8[0][0]']                 
                                                                                                  
 dense_17 (D

Sample Traces: 100%|██████████| 500/500 [00:23<00:00, 21.56it/s]


The execution time for VAE training for p2p-normal with 500 traces is 64.4985 seconds
Number of samples: 800
Number of unique input tokens: 11
Max sequence length for inputs: 11
Model: "model_20"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_26 (InputLayer)          [(None, None, 11)]   0           []                               
                                                                                                  
 lstm_10 (LSTM)                 (None, 353)          515380      ['input_26[0][0]']               
                                                                                                  
 dense_20 (Dense)               (None, 191)          67614       ['lstm_10[0][0]']                
                                                                                                  
 dense_21 (D

Sample Traces: 100%|██████████| 800/800 [00:37<00:00, 21.54it/s]


The execution time for VAE training for p2p-normal with 800 traces is 98.547 seconds
Number of samples: 200
Number of unique input tokens: 24
Max sequence length for inputs: 70
Model: "model_24"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_31 (InputLayer)          [(None, None, 24)]   0           []                               
                                                                                                  
 lstm_12 (LSTM)                 (None, 353)          533736      ['input_31[0][0]']               
                                                                                                  
 dense_24 (Dense)               (None, 191)          67614       ['lstm_12[0][0]']                
                                                                                                  
 dense_25 (De

Sample Traces: 100%|██████████| 200/200 [00:50<00:00,  3.93it/s]


The execution time for VAE training for BPI2017-Final with 200 traces is 221.1716 seconds
Number of samples: 500
Number of unique input tokens: 25
Max sequence length for inputs: 70
Model: "model_28"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_36 (InputLayer)          [(None, None, 25)]   0           []                               
                                                                                                  
 lstm_14 (LSTM)                 (None, 353)          535148      ['input_36[0][0]']               
                                                                                                  
 dense_28 (Dense)               (None, 191)          67614       ['lstm_14[0][0]']                
                                                                                                  
 dense_2

Sample Traces: 100%|██████████| 500/500 [01:41<00:00,  4.92it/s]


The execution time for VAE training for BPI2017-Final with 500 traces is 525.8119 seconds
Number of samples: 800
Number of unique input tokens: 25
Max sequence length for inputs: 70
Model: "model_32"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_41 (InputLayer)          [(None, None, 25)]   0           []                               
                                                                                                  
 lstm_16 (LSTM)                 (None, 353)          535148      ['input_41[0][0]']               
                                                                                                  
 dense_32 (Dense)               (None, 191)          67614       ['lstm_16[0][0]']                
                                                                                                  
 dense_3

Sample Traces: 100%|██████████| 800/800 [00:57<00:00, 13.97it/s]


The execution time for VAE training for BPI2017-Final with 800 traces is 775.6343 seconds
Number of samples: 200
Number of unique input tokens: 10
Max sequence length for inputs: 261
Model: "model_36"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_46 (InputLayer)          [(None, None, 10)]   0           []                               
                                                                                                  
 lstm_18 (LSTM)                 (None, 353)          513968      ['input_46[0][0]']               
                                                                                                  
 dense_36 (Dense)               (None, 191)          67614       ['lstm_18[0][0]']                
                                                                                                  
 dense_

Sample Traces: 100%|██████████| 200/200 [01:54<00:00,  1.74it/s]


The execution time for VAE training for DS3 with 200 traces is 881.8236 seconds
Number of samples: 500
Number of unique input tokens: 10
Max sequence length for inputs: 261
Model: "model_40"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_51 (InputLayer)          [(None, None, 10)]   0           []                               
                                                                                                  
 lstm_20 (LSTM)                 (None, 353)          513968      ['input_51[0][0]']               
                                                                                                  
 dense_40 (Dense)               (None, 191)          67614       ['lstm_20[0][0]']                
                                                                                                  
 dense_41 (Dense)

Sample Traces: 100%|██████████| 500/500 [04:19<00:00,  1.92it/s]


The execution time for VAE training for DS3 with 500 traces is 2185.7187 seconds
Number of samples: 800
Number of unique input tokens: 10
Max sequence length for inputs: 261
Model: "model_44"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_56 (InputLayer)          [(None, None, 10)]   0           []                               
                                                                                                  
 lstm_22 (LSTM)                 (None, 353)          513968      ['input_56[0][0]']               
                                                                                                  
 dense_44 (Dense)               (None, 191)          67614       ['lstm_22[0][0]']                
                                                                                                  
 dense_45 (Dense

Sample Traces: 100%|██████████| 800/800 [05:37<00:00,  2.37it/s]


The execution time for VAE training for DS3 with 800 traces is 3698.4216 seconds
Number of samples: 200
Number of unique input tokens: 58
Max sequence length for inputs: 2975
Model: "model_48"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_61 (InputLayer)          [(None, None, 58)]   0           []                               
                                                                                                  
 lstm_24 (LSTM)                 (None, 353)          581744      ['input_61[0][0]']               
                                                                                                  
 dense_48 (Dense)               (None, 191)          67614       ['lstm_24[0][0]']                
                                                                                                  
 dense_49 (Dens
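
The three DS3 timings printed above allow a quick check of how the VAE training time scales with the number of traces. The snippet below is a minimal sketch that only reuses the printed numbers; the variable names are illustrative and not part of the notebook's code.

```python
# VAE training times for DS3, copied from the printed output above (seconds).
ds3_training_seconds = {200: 881.8236, 500: 2185.7187, 800: 3698.4216}

# Seconds of training time per sampled trace for each sample size.
seconds_per_trace = {
    n_traces: seconds / n_traces
    for n_traces, seconds in ds3_training_seconds.items()
}

for n_traces, per_trace in seconds_per_trace.items():
    print(f"{n_traces} traces: {per_trace:.2f} s per trace")
```

For this log the per-trace cost stays between roughly 4.4 and 4.6 seconds, which suggests that the training time grows approximately linearly with the number of traces in this range.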

In [None]:
# This cell was run on the server.
# filenames = ["DS4"]
# sample_sizes = [200 ,500, 800]
# for filename in filenames:
#     for sample_size in sample_sizes:
#         create_VAE_logs(filename,sample_size)

Calculate the VAE generalization for each model and measure its running time

In [None]:
# The variant model for 500 traces and all runs for 800 traces were executed on the server.
filenames = ["order_process"]
sample_sizes = [200, 500, 800]
objects = [["order", "item", "delivery"]]
# Pair each log with its object types instead of tracking a manual counter.
for filename, object_types in zip(filenames, objects):
    for sample_size in sample_sizes:
        ocel = ocel_import_factory.apply(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        ocpn = ocpn_discovery_factory.apply(ocel, parameters={"debug": False})
        parameters = {"obj_names": object_types,
                      "val_names": [],
                      "act_name": "event_activity",
                      "time_name": "event_timestamp",
                      "sep": ","}
        ocel_gen = ocel_import_factory_csv.apply(file_path=f'../src/data/runtime/VAE/{filename}_{sample_size}_generated.csv', parameters=parameters)
        start_time = time.time()
        generalization = VAE_generalization(ocel_gen, ocpn)
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for VAE generalization for {filename} and the ocpn model with {sample_size} traces is {execution_time} seconds")
        happy_path_ocel = get_happy_path_log(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        happy_path_ocpn = ocpn_discovery_factory.apply(happy_path_ocel, parameters={"debug": False})
        start_time = time.time()
        generalization = VAE_generalization(ocel_gen, happy_path_ocpn)
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for VAE generalization for {filename} and the happy path model with {sample_size} traces is {execution_time} seconds")
        flower_ocpn = create_flower_model(f"../src/data/runtime/{filename}_{sample_size}.jsonocel",object_types)
        start_time = time.time()
        generalization = VAE_generalization(ocel_gen, flower_ocpn)
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for VAE generalization for {filename} and the flower model with {sample_size} traces is {execution_time} seconds")
        variant_ocpn = generate_variant_model(ocel, save_path_logs=f"../src/data/runtime/variant_logs/{filename}/{filename}_{sample_size}_gen", object_types=object_types, save=True)
        # Keep only the part of each transition name before the first underscore.
        for transition in variant_ocpn.transitions:
            transition.name = transition.name.split("_")[0]
        start_time = time.time()
        generalization = VAE_generalization(ocel_gen, variant_ocpn)
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for VAE generalization for {filename} and the variant with {sample_size} traces is {execution_time} seconds")

Precision of IM-discovered net:  0.5634
Fitness of IM-discovered net:  0.9571
VAE Generalization= 0.7092
The execution time for VAE generalization for order_process and the ocpn model with 200 traces is 55.4598 seconds
Precision of IM-discovered net:  0.865
Fitness of IM-discovered net:  0.5407
VAE Generalization= 0.6655
The execution time for VAE generalization for order_process and the happy path model with 200 traces is 16.9993 seconds
Precision of IM-discovered net:  0.2893
Fitness of IM-discovered net:  1.0
VAE Generalization= 0.4487
The execution time for VAE generalization for order_process and the flower model with 200 traces is 14.8334 seconds


Generating Variant Models: 100%|██████████| 105/105 [00:18<00:00,  5.81it/s]
Processing Variant Nets: 100%|██████████| 105/105 [00:00<00:00, 6997.39it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
Precision of IM-discovered net:  0.6364
Fitness of IM-discovered net:  0.3377
VAE Generalization= 0.4412
The execution time for VAE generalization for order_process and the variant with 200 traces is 78.0944 seconds
Precision of IM-discovered net:  0.5777
Fitness of IM-discovered net:  0.9454
VAE Generalization= 0.7172
The execution time for VAE generalization for order_process and the ocpn model with 500 traces is 164.6971 seconds
Precision of IM-discovered net:  0.8443
Fitness of IM-discovered net:  0.4449
VAE Generalization= 0.5827
The execution time for VAE generalization for order_process and the happy path model with 500 traces is 72.5822 seconds
Precision of IM-discovered net:  0.2749
Fitness of IM-discovered net:  1.0
VAE Generalization= 0.4312
The execution time for VAE generalization for order_process and the flower model with 500 traces is 69.1959 seconds

Generating Variant Models: 100%|██████████| 174/174 [00:31<00:00,  5.45it/s]
Processing Variant Nets: 100%|██████████| 174/174 [00:00<00:00, 5800.05it/s]


#########Start generating Object-Centric Petri Net#########
#########Finished generating Object-Centric Petri Net#########
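
In every complete triple printed above, the reported VAE generalization value is consistent with the harmonic mean of the printed precision and fitness. This is an observation about the output, not a statement about how `VAE_generalization` is implemented. The sketch below checks it on the printed values; the helper name `harmonic_mean` and the tolerance are assumptions chosen for illustration, with the tolerance set to absorb the rounding of the printed four-digit inputs.

```python
# (precision, fitness, reported VAE generalization) for order_process,
# copied from the printed output above.
reported = [
    (0.5634, 0.9571, 0.7092),  # ocpn model, 200 traces
    (0.8650, 0.5407, 0.6655),  # happy path model, 200 traces
    (0.2893, 1.0000, 0.4487),  # flower model, 200 traces
    (0.6364, 0.3377, 0.4412),  # variant model, 200 traces
    (0.5777, 0.9454, 0.7172),  # ocpn model, 500 traces
    (0.8443, 0.4449, 0.5827),  # happy path model, 500 traces
    (0.2749, 1.0000, 0.4312),  # flower model, 500 traces
]


def harmonic_mean(precision: float, fitness: float) -> float:
    """Harmonic mean of two positive scores."""
    return 2 * precision * fitness / (precision + fitness)


# Assumed tolerance: large enough to cover rounding of the printed inputs.
TOLERANCE = 2e-4
deviations = [abs(harmonic_mean(p, f) - g) for p, f, g in reported]
all_consistent = all(d <= TOLERANCE for d in deviations)
print(f"largest deviation: {max(deviations):.5f}, consistent: {all_consistent}")
```

Read this way, the flower model's perfect fitness cannot compensate for its low precision, which matches its low reported values.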


In [3]:
# This cell was run on the server.
filenames = ["p2p-normal", "BPI2017-Final", "DS3", "DS4"]
sample_sizes = [200, 500, 800]
objects = [
    ["PURCHORD", "INVOICE", "PURCHREQ", "MATERIAL", "GDSRCPT"],
    ["application", "offer"],
    ["incident", "customer"],
    ["Payment application", "Control summary", "Entitlement application", "Geo parcel document", "Inspection", "Reference alignment"],
]
# Pair each log with its object types instead of tracking a manual counter.
for filename, object_types in zip(filenames, objects):
    for sample_size in sample_sizes:
        ocel = ocel_import_factory.apply(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        ocpn = ocpn_discovery_factory.apply(ocel, parameters={"debug": False})
        parameters = {"obj_names": object_types,
                      "val_names": [],
                      "act_name": "event_activity",
                      "time_name": "event_timestamp",
                      "sep": ","}
        ocel_gen = ocel_import_factory_csv.apply(file_path=f'../src/data/runtime/VAE/{filename}_{sample_size}_generated.csv', parameters=parameters)
        start_time = time.time()
        generalization = VAE_generalization(ocel_gen, ocpn)
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for VAE generalization for {filename} and the ocpn model with {sample_size} traces is {execution_time} seconds")
        happy_path_ocel = get_happy_path_log(f"../src/data/runtime/{filename}_{sample_size}.jsonocel")
        happy_path_ocpn = ocpn_discovery_factory.apply(happy_path_ocel, parameters={"debug": False})
        start_time = time.time()
        generalization = VAE_generalization(ocel_gen, happy_path_ocpn)
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for VAE generalization for {filename} and the happy path model with {sample_size} traces is {execution_time} seconds")
        flower_ocpn = create_flower_model(f"../src/data/runtime/{filename}_{sample_size}.jsonocel",object_types)
        start_time = time.time()
        generalization = VAE_generalization(ocel_gen, flower_ocpn)
        execution_time = np.round(time.time() - start_time,4)
        print(f"The execution time for VAE generalization for {filename} and the flower model with {sample_size} traces is {execution_time} seconds")

Precision of IM-discovered net:  0.7471
Fitness of IM-discovered net:  0.7936
VAE Generalization= 0.7697
The execution time for VAE generalization for p2p-normal and the ocpn model with 200 traces is 33.3217 seconds
Precision of IM-discovered net:  0.8333
Fitness of IM-discovered net:  0.6667
VAE Generalization= 0.7407
The execution time for VAE generalization for p2p-normal and the happy path model with 200 traces is 38.2871 seconds
Precision of IM-discovered net:  0.1575
Fitness of IM-discovered net:  1.0
VAE Generalization= 0.2721
The execution time for VAE generalization for p2p-normal and the flower model with 200 traces is 36.8373 seconds
Precision of IM-discovered net:  0.7963
Fitness of IM-discovered net:  1.0
VAE Generalization= 0.8866
The execution time for VAE generalization for p2p-normal and the ocpn model with 500 traces is 128.6482 seconds
Precision of IM-discovered net:  0.8889
Fitness of IM-discovered net:  1.0
VAE Generalization= 0.9412
The execution time for VAE gene

KeyboardInterrupt: 
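
Each measurement above repeats the same three timing lines around the call being measured. One way to reduce that repetition is a small context manager. The sketch below is an assumption for illustration: the name `timed` is hypothetical and not part of the notebook, the workload is a stand-in, and it uses `time.perf_counter`, a monotonic clock intended for interval timing, instead of `time.time`.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, results: dict):
    """Store the elapsed wall-clock seconds of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so partial runs are still recorded.
        results[label] = round(time.perf_counter() - start, 4)


timings = {}
with timed("example workload", timings):
    # Stand-in for a call such as VAE_generalization(ocel_gen, ocpn).
    total = sum(i * i for i in range(100_000))

print(f"The execution time for the example workload is {timings['example workload']} seconds")
```

Because the elapsed time is stored in a `finally` clause, a measurement is kept even when the block is interrupted, as happened in the last run above.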