### Insert attribute change
The following attribute behaviors can be combined freely:
1. Types of Drift (Lu et al. 2019)
    1. "*Sudden drift*: A new concept occurs within a short time"
    2. "*Gradual drift*: A new concept replaces an old one over a period of time"
    3. "*Incremental drift*: An old concept incrementally changes to a new concept over a period of time"
    4. "*Reoccurring Concepts*: An old concept may reoccur after some time"
2. Attribute data type (Optional: add Ordinal)
    1. Categorical (nominal)
    2. Continuous
3. Attribute level
    1. Trace
    2. Event
4. Noise level
    1. None (0%)
    2. Low (10%)
    3. Medium (25%)
    4. Strong (50%)
5. Concept change
    1. Missing data
    2. Completely new distribution
    3. For categorical data: Oversampling of one class
    4. For categorical data: Undersampling of one class
    5. For continuous data: Increase of mean
    6. For continuous data: Decrease of mean
6. Data Stationarity
    1. Strong decrease
    2. Weak decrease
    3. Stationary
    4. Weak increase
    5. Strong increase
7. Location of Attribute Change
    1. Drift location is normally distributed before the changepoint
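
To make the combination of options concrete, here is a minimal sketch of one configuration: a categorical trace attribute with a sudden drift, oversampling of one class, a 10% noise level, and a drift location placed shortly before the changepoint. It is not the implementation in `generate_attributes`. The function name, the category labels, the probabilities, the half-normal offset, and the way noise is applied are all assumptions made for illustration.

```python
import numpy as np


def sudden_categorical_drift(n_traces, changepoint, noise_level=0.1,
                             location_std=10.0, seed=0):
    """Return (values, drift_location) for one hypothetical categorical attribute."""
    rng = np.random.default_rng(seed)
    categories = np.array(["A", "B", "C"])
    p_before = [0.7, 0.2, 0.1]  # old concept (assumed probabilities)
    p_after = [0.1, 0.2, 0.7]   # new concept: class "C" is oversampled

    # Assumption: the offset is the absolute value of a normal draw, so the
    # attribute drift never happens after the changepoint it should explain.
    offset = int(round(abs(rng.normal(0.0, location_std))))
    drift_location = max(0, changepoint - offset)

    values = np.empty(n_traces, dtype=categories.dtype)
    values[:drift_location] = rng.choice(categories, size=drift_location, p=p_before)
    values[drift_location:] = rng.choice(
        categories, size=n_traces - drift_location, p=p_after)

    # Assumption: noise replaces a random fraction of the values with
    # uniformly drawn categories, regardless of the current concept.
    noisy = rng.random(n_traces) < noise_level
    values[noisy] = rng.choice(categories, size=int(noisy.sum()))
    return values, drift_location


values, drift_location = sudden_categorical_drift(n_traces=2500, changepoint=250)
share_before = (values[:drift_location] == "C").mean()
share_after = (values[drift_location:] == "C").mean()
```

Comparing `share_before` with `share_after` shows whether the oversampled class is still distinguishable at the chosen noise level.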

In [1]:
import os
import json
import helper
from concept_drift import generate_attributes
import numpy as np

In [2]:
# select a dataset to augment
datasets = helper.get_datasets_by_criteria(is_synthetic=True, size=2500)
dataset = next(iter(datasets))
dataset_info = datasets[dataset]

In [3]:
# read the event log
opyenxes_log = helper.opyenxes_read_xes(dataset_info['file_path'])

In [4]:
ag = generate_attributes.AttributeGenerator(opyenxes_log, dataset_info['changepoints'])

In [5]:
# add attributes whose drift explains one change point each
relevant_attribute_count = 3
for attribute_index in range(relevant_attribute_count):
    attribute_name = f'relevant_attribute_{attribute_index + 1}'
    # change point that this attribute should explain
    explain_changepoint = dataset_info['changepoints'][attribute_index]
    ag.generate_categorical_attribute(attribute_name,
                                      explain_change_points=[explain_changepoint],
                                      noise_level=0.1)

In [6]:
# add 10 more attributes that do not explain the change points
irrelevant_attribute_count = 10
for attribute_index in range(irrelevant_attribute_count):
    attribute_name = f'irrelevant_attribute_{attribute_index + 1}'
    ag.generate_categorical_attribute(attribute_name)

In [7]:
change_point_explanations = ag.change_point_explanations

In [8]:
new_base_path = 'data/synthetic/maardji et al 2013_xes_attributes'
old_base_path = 'data\\synthetic\\maardji et al 2013_xes\\'

In [9]:
new_path = helper.create_and_get_new_path(dataset_info['file_path'], old_base_path, new_base_path, new_extension=None)
new_path

'data\\synthetic\\maardji et al 2013_xes_attributes\\logs\\cb2.5k.xes'
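
The path mapping above can be reproduced with the standard library alone. The sketch below is not the code of `helper.create_and_get_new_path`: `swap_base_path` is a hypothetical name, the input path is an assumed original location of the log, and the sketch does not create any directories. It uses `PureWindowsPath` so that the mixed separators in `old_base_path` and `new_base_path` are normalized the same way on every platform.

```python
from pathlib import PureWindowsPath


def swap_base_path(file_path, old_base_path, new_base_path):
    """Re-root file_path from old_base_path onto new_base_path (hypothetical helper)."""
    # relative_to raises ValueError if file_path is not located under old_base_path
    relative = PureWindowsPath(file_path).relative_to(PureWindowsPath(old_base_path))
    return str(PureWindowsPath(new_base_path) / relative)


# Assumed original location of the log, inferred from old_base_path
original = 'data\\synthetic\\maardji et al 2013_xes\\logs\\cb2.5k.xes'
swapped = swap_base_path(original,
                         'data\\synthetic\\maardji et al 2013_xes\\',
                         'data/synthetic/maardji et al 2013_xes_attributes')
```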

In [10]:
helper.opyenxes_write_xes(ag.opyenxes_log, new_path)

Importance: DEBUG
Message: Start serializing log to XES.XML

Importance: DEBUG
Message: finished serializing log (7636.572998046875 msec.)



In [11]:
# update the data dictionary
data_info_new = dataset_info.copy()
data_info_new['file_path'] = new_path
data_info_new['has_generated_attributes'] = True
data_info_new['change_point_explanations'] = change_point_explanations
data_info_new

{'file_path': 'data\\synthetic\\maardji et al 2013_xes_attributes\\logs\\cb2.5k.xes',
 'file_name': 'cb2.5k',
 'drift_type': 'sudden',
 'dataset': 'maardji et al 2013',
 'size': 2500,
 'changepoints': [250, 500, 750, 1000, 1250, 1500, 1750, 2000, 2250],
 'is_synthetic': True,
 'has_generated_attributes': True,
 'change_point_explanations': {250: [{'attribute_name': 'relevant_attribute_1',
    'drift_type': 'sudden',
    'drift_location': 244}],
  500: [{'attribute_name': 'relevant_attribute_2',
    'drift_type': 'sudden',
    'drift_location': 479}],
  750: [{'attribute_name': 'relevant_attribute_3',
    'drift_type': 'sudden',
    'drift_location': 748}],
  1000: [],
  1250: [],
  1500: [],
  1750: [],
  2000: [],
  2250: []}}
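
The `change_point_explanations` entry can be inspected directly, for example to see how far before each changepoint the attribute drift was placed and which changepoints have no explaining attribute. The following sketch copies the dictionary from the output above; `explanation_offsets` is a hypothetical helper name, not part of `helper` or `generate_attributes`.

```python
# Copied from the notebook output above
change_point_explanations = {
    250: [{'attribute_name': 'relevant_attribute_1',
           'drift_type': 'sudden', 'drift_location': 244}],
    500: [{'attribute_name': 'relevant_attribute_2',
           'drift_type': 'sudden', 'drift_location': 479}],
    750: [{'attribute_name': 'relevant_attribute_3',
           'drift_type': 'sudden', 'drift_location': 748}],
    1000: [], 1250: [], 1500: [], 1750: [], 2000: [], 2250: [],
}


def explanation_offsets(explanations):
    """Map each explained changepoint to {attribute_name: changepoint - drift_location}."""
    return {
        changepoint: {entry['attribute_name']: changepoint - entry['drift_location']
                      for entry in entries}
        for changepoint, entries in explanations.items()
        if entries
    }


offsets = explanation_offsets(change_point_explanations)
# Changepoints without any explaining attribute
unexplained = [cp for cp, entries in change_point_explanations.items() if not entries]
```

For this log, the three relevant attributes drift 6, 21, and 2 traces before their changepoints, and the remaining six changepoints stay unexplained.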

In [12]:
data_dictionary = {new_path: data_info_new}
helper.update_data_dictionary(data_dictionary)