# Synthetic Log Generation from DECLARE Models

DECLARE4Py generates synthetic logs from DECLARE models with a solution based on Answer Set Programming that uses the Clingo solver. More details can be found in the paper by Chiariello, F., Maggi, F. M., & Patrizi, F. (2022, June). ASP-Based Declarative Process Mining. In *Proceedings of the AAAI Conference on Artificial Intelligence* (Vol. 36, No. 5, pp. 5539-5547).

As a first step, import a `.decl` file containing the DECLARE constraints.

In [7]:
import os
from Declare4Py.ProcessModels.DeclareModel import DeclareModel
from Declare4Py.ProcessMiningTasks.LogGenerator.ASP.ASPLogGenerator import AspGenerator

model_name = 'data-model1-simplified'
model: DeclareModel = DeclareModel().parse_from_file(os.path.join("../../../", "tests", "test_models", f"{model_name}.decl"))

Next, some general settings specify the number of cases to generate and the minimum and maximum number of events for each case.

In [8]:
%%time
# Number of cases that have to be generated
num_of_cases = 10

# Minimum and maximum number of events a case can contain
(num_min_events, num_max_events) = (8,15)

# Shows some feedback from the generator (set it to False to suppress all debug messages)
verbose = True

asp_gen: AspGenerator = AspGenerator(model, num_of_cases, num_min_events, num_max_events, verbose=verbose)
asp_gen.run()

DEBUG:ASP generator:Distribution for traces UNIFORM
DEBUG:ASP generator:traces: 10, events can have a trace min(8) max(15)
DEBUG:Distribution:Distribution() UNIFORM min_mu: 8 max_sigma: 15 num_traces: 10 custom_prob: None
DEBUG:Distribution:Calculating Uniform Distribution
DEBUG:Distribution:Generating Uniform Probabilities since either distribution is uniform or custom probabilities are None
DEBUG:Distribution:Probabilities sum is 1
DEBUG:Distribution:Distribution result: [13  9 13 11 14 14 15 14  8 15]
DEBUG:ASP generator:Prepared distribution of 10 positive traces with distribution: Counter({14: 3, 13: 2, 15: 2, 9: 1, 11: 1, 8: 1})
DEBUG:ASP generator:Prepared distribution of 0 negative traces with distribution: Counter()
DEBUG:ASP generator:Generating positive Traces
DEBUG:ASP generator:Translate declare model to ASP
DEBUG:ASP generator:Declare model translated to ASP. Total Facts 13
DEBUG:ASP generator:ASP encoding generated
DEBUG:ASP generator:Total trace to generate and events: 

CPU times: total: 48.6 s
Wall time: 6.1 s


The class `AspGenerator` has to be instantiated with the DECLARE model and the settings above. The `run` method then generates the cases; the `to_xes` method saves them in a `.xes` event log and the `to_csv` method saves them in a `.csv` file.

In [9]:
asp_gen.to_xes(f"{model_name}.xes")
asp_gen.to_csv(f"{model_name}.csv")


Logs can be generated for different **purposes**, depending on the needs of Process Mining algorithms. DECLARE4Py implements four useful purposes that can be set with the following methods of the `AspGenerator` class.


## 1. Setting up the Length Distribution of the Cases

Users can specify a probability distribution over the lengths of the generated traces. The method `set_distribution_type` takes the `distribution_type` as its parameter. Setting it to `uniform` selects a uniform distribution over `[num_min_events, num_max_events]`.

The number of positive traces to generate can also be changed with the method `set_positive_traces`.

In [10]:
%%time
# Default is uniform
asp_gen.set_distribution_type("uniform")

# It was 10 before, let's double that
asp_gen.set_positive_traces(num_of_cases * 2)

asp_gen.run()
asp_gen.to_csv(f'{model_name}_Distribution_Test_1.csv')

DEBUG:Distribution:Distribution() UNIFORM min_mu: 8 max_sigma: 15 num_traces: 20 custom_prob: None
DEBUG:Distribution:Calculating Uniform Distribution
DEBUG:Distribution:Generating Uniform Probabilities since either distribution is uniform or custom probabilities are None
DEBUG:Distribution:Probabilities sum is 1
DEBUG:Distribution:Distribution result: [12 12 11  8 13  9 13 15 10  8 12  8 15 13 11 10 13 10 14 13]
DEBUG:ASP generator:Prepared distribution of 20 positive traces with distribution: Counter({13: 5, 12: 3, 8: 3, 10: 3, 11: 2, 15: 2, 9: 1, 14: 1})
DEBUG:ASP generator:Prepared distribution of 0 negative traces with distribution: Counter()
DEBUG:ASP generator:Generating positive Traces
DEBUG:ASP generator:Translate declare model to ASP
DEBUG:ASP generator:Declare model translated to ASP. Total Facts 13
DEBUG:ASP generator:ASP encoding generated
DEBUG:ASP generator:Total trace to generate and events: Traces:3, Events: 12, RandFrequency: 1.0, Seed:374533593
DEBUG:ASP generator: T

CPU times: total: 1min 8s
Wall time: 8.32 s
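
The `Distribution result` and `Prepared distribution` debug lines above can be read as "one length per trace, then traces grouped by length". The following is a minimal standard-library sketch of that idea, not DECLARE4Py's actual implementation; the helper name `sample_uniform_lengths` is hypothetical.

```python
import random
from collections import Counter


def sample_uniform_lengths(num_traces, min_events, max_events, seed=None):
    """Draw one length per trace, uniformly from [min_events, max_events]."""
    rng = random.Random(seed)
    return [rng.randint(min_events, max_events) for _ in range(num_traces)]


# Hypothetical stand-in for the generator's internal sampling step.
lengths = sample_uniform_lengths(num_traces=20, min_events=8, max_events=15, seed=0)

# Group the traces by length, similar to the "Prepared distribution" log line.
length_counts = Counter(lengths)
print(length_counts)
```

Each entry of the counter says how many traces of a given length are requested.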


A `gaussian` distribution requires a location (the mean, `mu`) and a scale (the standard deviation, `sigma`).

In [14]:
%%time
asp_gen.change_distribution_settings(min_num_events_or_mu=15.5, max_num_events_or_sigma=3.2, dist_type="gaussian")
asp_gen.run()
asp_gen.to_csv(f'{model_name}_Distribution_Test_2.csv')

DEBUG:Abstract Log Generator:The distribution type is set to GAUSSIAN with mu = 15.5 and sigma = 3.2
DEBUG:Distribution:Distribution() GAUSSIAN min_mu: 15.5 max_sigma: 3.2 num_traces: 20 custom_prob: [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.3]
DEBUG:Distribution:Calculating Normal Distribution
DEBUG:ASP generator:Prepared distribution of 20 positive traces with distribution: Counter({15.0: 5, 11.0: 3, 14.0: 3, 12.0: 2, 16.0: 2, 18.0: 2, 19.0: 1, 20.0: 1, 21.0: 1})
DEBUG:ASP generator:Prepared distribution of 0 negative traces with distribution: Counter()
DEBUG:ASP generator:Generating positive Traces
DEBUG:ASP generator:Translate declare model to ASP
DEBUG:ASP generator:Declare model translated to ASP. Total Facts 13
DEBUG:ASP generator:ASP encoding generated
DEBUG:ASP generator:Total trace to generate and events: Traces:2, Events: 12.0, RandFrequency: 1.0, Seed:155413360
DEBUG:ASP generator: Traces generated :[trace(evt_val_12,1), trace(evt_val_6,2), trace(evt_val_7,3), trace(evt_val_0,4

CPU times: total: 1min 53s
Wall time: 13.8 s
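
As the log above shows, Gaussian lengths are not restricted to the original `[num_min_events, num_max_events]` interval. A rough sketch of how such lengths could be drawn is given below. It assumes that samples are rounded to whole numbers and that non-positive values are redrawn; DECLARE4Py may handle these details differently, and `sample_gaussian_lengths` is a hypothetical helper.

```python
import random
from collections import Counter


def sample_gaussian_lengths(num_traces, mu, sigma, seed=None):
    """Draw rounded trace lengths from a normal distribution N(mu, sigma)."""
    rng = random.Random(seed)
    lengths = []
    while len(lengths) < num_traces:
        length = round(rng.gauss(mu, sigma))
        if length >= 1:  # assumption: a trace needs at least one event
            lengths.append(length)
    return lengths


gaussian_lengths = sample_gaussian_lengths(num_traces=20, mu=15.5, sigma=3.2, seed=0)
print(Counter(gaussian_lengths))
```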


A `custom` distribution requires the user to set one probability for each length in `[num_min_events, num_max_events]`. With the bounds 8 and 15 used here, that means eight probabilities, which should sum to 1.

In [12]:
%%time
asp_gen.set_distribution_type("custom")
asp_gen.set_custom_probabilities([0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.3])

asp_gen.run()
asp_gen.to_csv(f'{model_name}_Distribution_Test_3.csv')

DEBUG:Distribution:Distribution() CUSTOM min_mu: 8 max_sigma: 15 num_traces: 20 custom_prob: [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.3]
DEBUG:Distribution:Calculating Custom Distribution
DEBUG:Distribution:Probabilities sum is 1.0
DEBUG:Distribution:Distribution result: [ 9 15  9 15 15 14 14 11 12  9 15 15 15 12 15 13  9  8 10 15]
DEBUG:ASP generator:Prepared distribution of 20 positive traces with distribution: Counter({15: 8, 9: 4, 14: 2, 12: 2, 11: 1, 13: 1, 8: 1, 10: 1})
DEBUG:ASP generator:Prepared distribution of 0 negative traces with distribution: Counter()
DEBUG:ASP generator:Generating positive Traces
DEBUG:ASP generator:Translate declare model to ASP
DEBUG:ASP generator:Declare model translated to ASP. Total Facts 13
DEBUG:ASP generator:ASP encoding generated
DEBUG:ASP generator:Total trace to generate and events: Traces:4, Events: 9, RandFrequency: 1.0, Seed:145440377
DEBUG:ASP generator: Traces generated :[trace(evt_val_6,1), trace(evt_val_7,2), trace(evt_val_9,3), trace(evt

CPU times: total: 56.5 s
Wall time: 7.47 s
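
Before passing a list to `set_custom_probabilities`, it can help to check it against the length interval. The sketch below is an illustration using only the standard library; the function `sample_custom_lengths` and its checks are assumptions, not part of DECLARE4Py.

```python
import math
import random
from collections import Counter


def sample_custom_lengths(num_traces, min_events, max_events, probabilities, seed=None):
    """Validate the probabilities and draw one length per trace."""
    candidate_lengths = list(range(min_events, max_events + 1))
    if len(probabilities) != len(candidate_lengths):
        raise ValueError(
            f"expected {len(candidate_lengths)} probabilities, got {len(probabilities)}"
        )
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("probabilities must sum to 1")
    rng = random.Random(seed)
    return rng.choices(candidate_lengths, weights=probabilities, k=num_traces)


custom_probabilities = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.3]
custom_lengths = sample_custom_lengths(20, 8, 15, custom_probabilities, seed=0)
print(Counter(custom_lengths))
```

With these weights, length 15 is three times as likely as any other length, which matches the tendency visible in the log above.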


## 2. Setting up a Custom Clingo Configuration

### More information

For more information on clingo and its features, see https://potassco.org/

For more information on the available options, consult the Clingo (Potassco) documentation at https://github.com/potassco/guide/releases/ or https://github.com/potassco/asprin/blob/master/asprin/src/main/clingo_help.py

The guide can also be downloaded directly from: https://github.com/potassco/guide/releases/download/v2.2.0/guide.pdf

### Setting up the configuration

Clingo offers various options to tune the solver's search, its use of randomness, and its decision-making.

At the moment the solver can be customized with the method `use_custom_clingo_configuration`, which accepts the following options:
    
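- The **Configuration** of clingo can be: "frumpy", "tweety", "crafty", "jumpy", "trendy" or "handy". (Default is frumpy; note that the example output below reports "tweety" after calling `use_default_clingo_configuration`)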


- The number of **Threads** used by clingo to speed up the process. (Default uses all available cores)


- The **Random Frequency** used by clingo in its decision-making is a float between 0 and 1 inclusive, where 0 means no random decisions and 1 means every decision is random. (Default is 1)


- The **Mode** configures the optimization of the algorithm and can be either "optN" or "ignore". (Default is optN)


- The **Sign** of the operation, which can be "asp", "pos", "neg" or "rnd". (Default is asp)


- The **Strategy** configures the optimization of the strategy and can be "bb" or "usc". (This functionality is not used in the default configuration)


- The **Heuristic** used by clingo configures the decision heuristic and can be "Berkmin", "Vmtf", "Vsids", "Domain", "Unit" or "None". (This functionality is not used in the default configuration)
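
The allowed values listed above can be checked before they are handed to the generator. The following is a small sketch under the assumption that the option names match the keyword arguments used in this tutorial (`config`, `frequency`, `sign_def`, `strategy`, `heuristic`) plus a `mode` option; `check_clingo_options` is a hypothetical helper, not a DECLARE4Py function, and it does not validate the thread count.

```python
# Allowed values copied from the option list above.
ALLOWED_VALUES = {
    "config": {"frumpy", "tweety", "crafty", "jumpy", "trendy", "handy"},
    "mode": {"optN", "ignore"},
    "sign_def": {"asp", "pos", "neg", "rnd"},
    "strategy": {"bb", "usc"},
    "heuristic": {"Berkmin", "Vmtf", "Vsids", "Domain", "Unit", "None"},
}
# Options that may be left unset (None), as in the default configuration.
OPTIONAL_OPTIONS = {"strategy", "heuristic"}


def check_clingo_options(frequency=1.0, **options):
    """Return a list of human-readable problems; an empty list means all is fine."""
    problems = []
    if not 0 <= frequency <= 1:
        problems.append(f"frequency must be in [0, 1], got {frequency}")
    for name, value in options.items():
        if name not in ALLOWED_VALUES:
            problems.append(f"unknown option: {name}")
        elif value is None and name in OPTIONAL_OPTIONS:
            continue
        elif value not in ALLOWED_VALUES[name]:
            problems.append(f"{name}={value!r} is not one of {sorted(ALLOWED_VALUES[name])}")
    return problems


print(check_clingo_options(frequency=1, config="jumpy", sign_def="rnd",
                           strategy="bb", heuristic="Vsids"))
print(check_clingo_options(frequency=1.5, config="speedy"))
```

The first call reports no problems; the second reports both the out-of-range frequency and the unknown configuration name.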


In [20]:
%%time

asp_gen.use_default_clingo_configuration()
# The default configuration can be obtained using the following command
print(asp_gen.get_current_clingo_configuration())

# To enable the custom configuration: 
asp_gen.use_custom_clingo_configuration(config="jumpy", threads=None, frequency=1, sign_def="rnd", strategy="bb", heuristic="Vsids")

# The current configuration then becomes the custom one
print(asp_gen.get_current_clingo_configuration())

# this command tells the generator to use the default configuration again
# asp_gen.use_default_clingo_configuration()
# It does not delete the old custom configuration, in fact the custom configuration can be re-enabled by calling
# asp_gen.use_custom_clingo_configuration()

asp_gen.run()
asp_gen.to_csv(f'{model_name}_Custom_Configuration_Test_1.csv')

DEBUG:Distribution:Distribution() GAUSSIAN min_mu: 15.5 max_sigma: 3.2 num_traces: 20 custom_prob: [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.3]
DEBUG:Distribution:Calculating Normal Distribution
DEBUG:ASP generator:Prepared distribution of 20 positive traces with distribution: Counter({16.0: 3, 18.0: 3, 19.0: 2, 12.0: 2, 15.0: 2, 13.0: 2, 11.0: 1, 20.0: 1, 14.0: 1, 9.0: 1, 23.0: 1, 8.0: 1})
DEBUG:ASP generator:Prepared distribution of 0 negative traces with distribution: Counter()
DEBUG:ASP generator:Generating positive Traces
DEBUG:ASP generator:Translate declare model to ASP
DEBUG:ASP generator:Declare model translated to ASP. Total Facts 13
DEBUG:ASP generator:ASP encoding generated


{'CONFIG': 'tweety', 'THREADS': '16', 'FREQUENCY': '1.0', 'SIGN-DEF': 'asp', 'MODE': 'optN', 'STRATEGY': None, 'HEURISTIC': None}
{'CONFIG': 'jumpy', 'THREADS': '16', 'FREQUENCY': '1', 'SIGN-DEF': 'rnd', 'MODE': 'optN', 'STRATEGY': 'bb', 'HEURISTIC': 'Vsids'}


DEBUG:ASP generator:Total trace to generate and events: Traces:1, Events: 11.0, RandFrequency: 1, Seed:37248818
DEBUG:ASP generator: Traces generated :[trace(evt_val_9,1), trace(evt_val_10,2), trace(evt_val_4,3), trace(evt_val_4,4), trace(evt_val_4,5), trace(evt_val_6,6), trace(evt_val_7,7), trace(evt_val_6,8), trace(evt_val_7,9), trace(evt_val_9,10), trace(evt_val_10,11), assigned_value(attr_name_0,attr_value_0,1), assigned_value(attr_name_0,attr_value_0,2), assigned_value(attr_name_0,attr_value_1,3), assigned_value(attr_name_0,attr_value_0,4), assigned_value(attr_name_0,attr_value_1,5), assigned_value(attr_name_0,attr_value_0,6), assigned_value(attr_name_0,attr_value_0,7), assigned_value(attr_name_0,attr_value_0,8), assigned_value(attr_name_0,attr_value_0,9), assigned_value(attr_name_0,attr_value_0,10), assigned_value(attr_name_0,attr_value_1,11), assigned_value(attr_name_1,attr_value_26,3), assigned_value(attr_name_1,attr_value_30,4), assigned_value(attr_name_1,attr_value_31,5), ass

CPU times: total: 4min 20s
Wall time: 28.3 s



# NOTE: The following code is not tested

## 3. Setting up the Variants

Users can generate variants by setting the number of repetitions of the workflow of each case. This is done with the `set_number_of_repetition_per_trace` method.

In [None]:
asp_gen.set_number_of_repetition_per_trace(3)

asp_gen.run()
asp_gen.to_csv(f'{model_name}.csv')
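
One way to inspect the effect on an exported log is to count its variants, taking a variant in the usual process mining sense of a distinct sequence of activities. The sketch below works on a small in-memory sample rather than a real export. The column names `case:concept:name` and `concept:name` follow the XES naming convention and are an assumption about the CSV layout, as is the ordering of rows within each case.

```python
import csv
import io
from collections import Counter, defaultdict

# Made-up sample data standing in for an exported CSV log.
SAMPLE_CSV = """case:concept:name,concept:name
case_1,ER Registration
case_1,ER Triage
case_2,ER Registration
case_2,ER Triage
case_3,ER Registration
case_3,CRP
"""


def count_variants(csv_file, case_col="case:concept:name", activity_col="concept:name"):
    """Group events by case and count how often each activity sequence occurs."""
    traces = defaultdict(list)
    for row in csv.DictReader(csv_file):
        traces[row[case_col]].append(row[activity_col])
    return Counter(tuple(activities) for activities in traces.values())


variants = count_variants(io.StringIO(SAMPLE_CSV))
for sequence, frequency in variants.most_common():
    print(frequency, " -> ".join(sequence))
```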

## 4. Setting up Positive and Negative Traces

Users can specify some constraints to be violated in the synthetic cases to obtain labelled logs for binary classification, e.g., for deviance mining algorithms. The method `set_constraints_to_violate` takes as input:

1. `tot_negative_trace`: the number of negative cases to generate, i.e., cases that violate constraints;
2. `violate_all`: whether to violate *all* the specified constraints or let Clingo decide which constraints to violate;
3. `constraints_list`: the list containing the subset of DECLARE constraints (specified as strings of text) to be violated.

In [None]:
asp_gen: AspGenerator = AspGenerator(model, num_of_cases, num_min_events, num_max_events)

asp_gen.set_constraints_to_violate(tot_negative_trace=10, violate_all=True, constraints_list=[
    "Init[ER Registration] | |",
    "Chain Response[ER Registration, ER Triage] |A.org:group is J |T.org:group is A |"])
asp_gen.run()
asp_gen.to_csv(f'{model_name}.csv')
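
To make the positive/negative distinction concrete, the sketch below checks the two templates used above on plain activity sequences, following the standard DECLARE semantics: `Init[A]` requires the trace to start with `A`, and `Chain Response[A, B]` requires every `A` to be immediately followed by `B`. The data conditions on `org:group` are deliberately ignored, and the function names are hypothetical; this is an illustration, not the ASP encoding used by the generator.

```python
def satisfies_init(trace, activity):
    """Init[activity]: the first event of the trace is `activity`."""
    return bool(trace) and trace[0] == activity


def satisfies_chain_response(trace, activation, target):
    """Chain Response: every `activation` is immediately followed by `target`."""
    for position, current in enumerate(trace):
        if current == activation:
            if position + 1 >= len(trace) or trace[position + 1] != target:
                return False
    return True


def label_trace(trace):
    """Label a trace 'positive' if it satisfies both constraints, else 'negative'."""
    satisfied = (
        satisfies_init(trace, "ER Registration")
        and satisfies_chain_response(trace, "ER Registration", "ER Triage")
    )
    return "positive" if satisfied else "negative"


print(label_trace(["ER Registration", "ER Triage", "CRP"]))
print(label_trace(["ER Triage", "ER Registration"]))
```

The first trace satisfies both constraints; the second violates both, since it neither starts with `ER Registration` nor follows it with `ER Triage`.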

Alternatively, instead of giving the explicit text of the DECLARE constraint, its index in the model can be passed to the `set_constraints_to_violate_by_template_index` method.

In [None]:
asp_gen: AspGenerator = AspGenerator(model, num_of_cases, num_min_events, num_max_events)

# Print each constraint with its index (avoid shadowing the built-in `id`)
for idx, constr_text in enumerate(model.serialized_constraints):
    print(f"{idx} - {constr_text}")

asp_gen.set_constraints_to_violate_by_template_index(tot_negative_trace=10, violate_all=True, 
                                                 constraints_idx_list=[0, 3])
asp_gen.run()
asp_gen.to_csv(f'{model_name}.csv')
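
When the constraint texts are known but the indices are not, the lookup can be automated. The sketch below uses a plain list as a stand-in for `model.serialized_constraints`; the order of the entries is made up for illustration, and `indices_of` is a hypothetical helper.

```python
def indices_of(constraints, wanted):
    """Map constraint texts to their positions, failing loudly on unknown texts."""
    positions = {text: idx for idx, text in enumerate(constraints)}
    missing = [text for text in wanted if text not in positions]
    if missing:
        raise KeyError(f"constraints not found in the model: {missing}")
    return [positions[text] for text in wanted]


# Stand-in for model.serialized_constraints (hypothetical order).
example_constraints = [
    "Init[ER Registration] | |",
    "Response[CRP, Release B] |A.org:group is J |T.org:group is A |",
    "Chain Response[ER Registration, ER Triage] |A.org:group is J |T.org:group is A |",
]

constraint_indices = indices_of(example_constraints, [
    "Init[ER Registration] | |",
    "Chain Response[ER Registration, ER Triage] |A.org:group is J |T.org:group is A |",
])
print(constraint_indices)
```

The resulting list could then be passed as `constraints_idx_list`.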

## 5. Setting up Rules for the Activation Conditions

Users can specify the number of activations of a DECLARE constraint in the synthetic cases. This is done with the `set_activation_conditions` method by specifying an interval of activations for specific DECLARE constraints in the loaded model.

In [None]:
asp_gen: AspGenerator = AspGenerator(model, num_of_cases, num_min_events, num_max_events)

asp_gen.set_activation_conditions({
    # the activation should occur between 2 and 3 times
    'Response[CRP, Release B] |A.org:group is J |T.org:group is A |': [2, 3],
})

asp_gen.run()
asp_gen.to_csv(f'{model_name}.csv')
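
For the `Response[CRP, Release B]` constraint above, an activation is an occurrence of `CRP` whose `org:group` attribute equals `J`. The sketch below counts such activations in a hand-written trace and checks them against the `[2, 3]` interval. The event representation (a list of dictionaries with an `activity` key) and the helper names are assumptions made for illustration only.

```python
def count_activations(trace, activity, attribute, value):
    """Count events of `activity` whose `attribute` equals `value`."""
    return sum(
        1
        for event in trace
        if event["activity"] == activity and event.get(attribute) == value
    )


def within_bounds(count, bounds):
    """Check that `count` lies in the closed interval [low, high]."""
    low, high = bounds
    return low <= count <= high


# Hand-written example trace (not generator output).
example_trace = [
    {"activity": "CRP", "org:group": "J"},
    {"activity": "Release B", "org:group": "A"},
    {"activity": "CRP", "org:group": "J"},
    {"activity": "CRP", "org:group": "A"},  # wrong group: not an activation
    {"activity": "Release B", "org:group": "A"},
]

activations = count_activations(example_trace, "CRP", "org:group", "J")
print(activations, within_bounds(activations, [2, 3]))
```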

Alternatively, instead of giving the explicit text of the DECLARE constraints, their indices can be passed to the `set_activation_conditions_by_template_index` method.

In [None]:
asp_gen: AspGenerator = AspGenerator(model, num_of_cases, num_min_events, num_max_events)

# Print each constraint with its index (avoid shadowing the built-in `id`)
for idx, constr_text in enumerate(model.serialized_constraints):
    print(f"{idx} - {constr_text}")

asp_gen.set_activation_conditions_by_template_index({3: [2, 3]})
asp_gen.run()
asp_gen.to_csv(f'{model_name}.csv')