# Training the Neural Networks

The pretrained neural networks are available as part of the download from notebook [2. Dataset Generation](2.%20Dataset%20Generation.ipynb). If you haven't downloaded the models from the release yet, we suggest you do so now.

If you would rather train your own models, the code below does so.

In [1]:
# Required on some Windows machines
import os
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

import arrow
import tensorflow as tf
from tqdm.notebook import tqdm

from deepalign import Dataset
from deepalign import fs
from deepalign.alignments import ConfNet

We can get all dataset filenames using this helper method.

In [2]:
datasets = sorted([f.name for f in fs.get_event_log_files()])
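
To illustrate what such a helper might do, here is a minimal standard-library sketch. It is a hypothetical stand-in, not deepalign's implementation: the function name `list_log_names`, the directory layout, and the `.json.gz` suffix are all assumptions made for this example.

```python
import tempfile
from pathlib import Path


def list_log_names(log_dir, suffix=".json.gz"):
    """Return the sorted names of all files in log_dir ending in suffix.

    Hypothetical helper: the suffix and the choice to strip it from the
    returned names are assumptions, not taken from deepalign.
    """
    return sorted(
        p.name[: -len(suffix)]
        for p in Path(log_dir).iterdir()
        if p.is_file() and p.name.endswith(suffix)
    )


# Demonstrate on a throwaway directory with two logs and one unrelated file
with tempfile.TemporaryDirectory() as tmp:
    for file_name in ["log_b.json.gz", "log_a.json.gz", "notes.txt"]:
        (Path(tmp) / file_name).touch()
    names = list_log_names(tmp)

print(names)  # ['log_a', 'log_b']
```

Sorting the names keeps the training order reproducible across runs and machines.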

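Now we can train a ConfNet model for each dataset using the following for loop. Each tuple is `(event attributes, case attributes)`, so the loop creates a version of ConfNet with no attributes `(0, 0)`, only case attributes `(0, 1)`, only event attributes `(1, 0)`, and both `(1, 1)`. Configurations that use case attributes are skipped for datasets that have none.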
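
Before running the loop, the set of configurations it visits can be sketched without the library. The function `planned_configurations` is a hypothetical name introduced here. It mirrors the loop's check that a configuration using case attributes is skipped when the dataset reports zero case attributes.

```python
def planned_configurations(num_case_attributes):
    """Return the (event_attrs, case_attrs) flag pairs that would be trained.

    Hypothetical helper for illustration; it only reproduces the skip rule
    of the training loop, not any training itself.
    """
    planned = []
    for ea, ca in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        # A configuration that needs case attributes is pointless
        # when the dataset has none
        if ca and num_case_attributes == 0:
            continue
        planned.append((ea, ca))
    return planned


print(planned_configurations(0))  # [(0, 0), (1, 0)]
print(planned_configurations(2))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

So a dataset without case attributes yields two models instead of four.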

In [3]:
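for dataset_name in datasets:
    # ea / ca switch event and case attributes on (1) or off (0)
    for ea, ca in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        start_time = arrow.now()
        dataset = Dataset(dataset_name, use_case_attributes=ca, use_event_attributes=ea)
        # Skip case-attribute configurations for logs without case attributes
        if ca and dataset.num_case_attributes == 0:
            continue
        confnet = ConfNet(dataset, use_case_attributes=ca, use_event_attributes=ea)
        # Hold out 10% of the data for validation and stop early
        # after 5 epochs without improvement
        confnet.fit(dataset, batch_size=100, epochs=50, validation_split=0.1,
                    callbacks=[tf.keras.callbacks.EarlyStopping(patience=5)])
        # Save as <dataset name>_<model identifier>_<start timestamp>
        confnet.save(
            str(fs.MODEL_DIR / f'{dataset_name}_{confnet.identifier}_{start_time.format(fs.DATE_FORMAT)}'))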

Epoch 1/50
...
Epoch 50/50
Epoch 1/50
...
Epoch 35/5
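
The `EarlyStopping(patience=5)` callback explains why a run can end before epoch 50: by default it monitors the validation loss and stops once that loss has not improved for 5 consecutive epochs. The following is a simplified, library-free sketch of that rule, not Keras's implementation. The function name `stopping_epoch` and the loss values are made up for illustration, and options such as `min_delta` or `restore_best_weights` are ignored.

```python
def stopping_epoch(val_losses, patience=5):
    """Return the 1-based epoch at which training would stop.

    Simplified patience rule: stop after `patience` consecutive epochs
    whose validation loss is not strictly below the best value so far.
    """
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    # Patience never ran out: all epochs were used
    return len(val_losses)


# Made-up validation losses: the best value is reached in epoch 3
losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.70, 0.75, 0.73, 0.9, 0.6]
print(stopping_epoch(losses, patience=5))  # 8
```

In this example epochs 4 through 8 bring no improvement, so training stops at epoch 8 and the better loss in epoch 10 is never reached. A larger `patience` trades longer training for a lower risk of stopping too soon.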

# Creating the Baseline Models

In [4]:
from deepalign.alignments.processmining import OptimalCostAligner
from deepalign.alignments.processmining import HeuristicsMinerAligner
from deepalign.alignments.processmining import InductiveMinerAligner

In [5]:
datasets = sorted([f.name for f in fs.get_event_log_files()])

In [6]:
aligners = [OptimalCostAligner, HeuristicsMinerAligner, InductiveMinerAligner]

In [7]:
for aligner_class in tqdm(aligners):
    for dataset_name in tqdm(datasets):
        dataset = Dataset(dataset_name)
        aligner = aligner_class()
        aligner.fit(dataset)
        file_name = f'{dataset_name}_{aligner.abbreviation}'
        aligner.save(file_name)
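
Note that `aligners` holds classes rather than instances, so each dataset gets a freshly constructed aligner and no fitted state carries over between datasets. The pattern can be sketched with two toy classes. `MajorityAligner`, `FirstTraceAligner`, their abbreviations, and the tiny logs are all hypothetical and are not part of deepalign.

```python
class MajorityAligner:
    """Toy stand-in: the 'model' is the most frequent trace in the log."""
    abbreviation = "maj"

    def __init__(self):
        self.model = None

    def fit(self, traces):
        self.model = max(set(traces), key=traces.count)
        return self


class FirstTraceAligner:
    """Toy stand-in: the 'model' is simply the first trace in the log."""
    abbreviation = "first"

    def __init__(self):
        self.model = None

    def fit(self, traces):
        self.model = traces[0]
        return self


# Hypothetical logs: dataset name -> list of traces
logs = {
    "log_a": [("a", "b"), ("a", "b"), ("a", "c")],
    "log_b": [("x", "y")],
}

fitted = {}
for aligner_class in [MajorityAligner, FirstTraceAligner]:
    for dataset_name, traces in sorted(logs.items()):
        aligner = aligner_class()  # fresh instance per dataset
        aligner.fit(traces)
        # Same naming scheme as above: <dataset name>_<abbreviation>
        fitted[f"{dataset_name}_{aligner.abbreviation}"] = aligner.model

print(sorted(fitted))
# ['log_a_first', 'log_a_maj', 'log_b_first', 'log_b_maj']
```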
