# Synthetic Log Generation from Declare Positional Models

## What is a Declare Positional Based Model

A Declare positional based model is essentially a Declare model with a different type of constraints.

<br>

### Defining an activity

Activities are defined with the keyword `activity`. Several activities can be defined on the same line, using the comma as a separator.

When naming an activity, remember that **the colon character `:` followed by a space cannot be used**, since the parser uses it to separate activities from attributes and attributes from values.

Example : `activity activity_name_1, ..., activity_name_n` or simply `activity activity_name_1`

<br>

### Assigning attributes to activities

Activities have attributes, which are assigned with the keyword `bind`.

Example : `bind activity_name_1, ..., activity_name_n: attribute_name_1, ... , attribute_name_n`

A line of this form assigns several attributes to several activities at once: every activity on the line receives all the listed attributes.

A bind can also target a single activity: `bind activity_name_1: attribute_name_1, ... , attribute_name_n`

**Note**: If a previous bind already assigned some attributes to an activity, another bind adds the new attributes on top, so the activity ends up with the attributes of both binds.

**Note**: If an activity has not been defined, the parser of the PositionalBasedModel defines it for you. The `activity` definition can therefore be omitted, and both activities and attributes can be defined in one line:

Example : `bind activity_name_1, ..., activity_name_n: attribute_name_1, ... , attribute_name_n`
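
The way binds accumulate can be pictured with a small stand-alone sketch. It does not use the Declare4Py parser: the helper `apply_bind`, the dictionary layout and the activity names are hypothetical and only mirror the behaviour described in this section (the text before `: ` lists activities, the text after it lists attributes, undefined activities are created, and repeated binds add attributes).

```python
from typing import Dict, List


def apply_bind(line: str, model: Dict[str, List[str]]) -> None:
    """Apply one `bind` line to a toy activity -> attributes mapping."""
    body = line.strip()[len("bind"):]  # drop the keyword
    # ": " separates activities from attributes, so names cannot contain it
    activities_part, attributes_part = body.split(": ", 1)
    activities = [name.strip() for name in activities_part.split(",")]
    attributes = [name.strip() for name in attributes_part.split(",")]
    for activity in activities:
        known = model.setdefault(activity, [])  # undefined activities are created
        for attribute in attributes:
            if attribute not in known:  # a later bind adds on top of earlier ones
                known.append(attribute)


model: Dict[str, List[str]] = {}
apply_bind("bind register, pay: amount, channel", model)
apply_bind("bind pay: currency", model)
print(model)
```

After the two lines, `pay` carries the attributes of both binds while `register` keeps only those of the first.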

<br>

### Assigning values to attributes

Remember to assign values to the attributes created before, otherwise the PositionalBasedModel will raise an error!

No keyword is needed to assign values to attributes. The following line can be used as an example to assign several values to several attributes:

Example: `attribute_name_1, ..., attribute_name_n: attribute_value_1, ... , attribute_value_n`

This kind of definition line creates attributes of type `Enumeration`.

To define a range of integers or floats, use the following definition instead:

Integer Example: `attribute_name_1, ..., attribute_name_n: integer between x and y`

Float Example: `attribute_name_1, ..., attribute_name_n: float between x and y`

This kind of definition line creates attributes of type `integer between` or `float between`.

**Note**: In a `float between` definition the numbers must be written with the point as the decimal separator. The precision of the floating point numbers is defined by how many digits are written after the point:

Float Example: `attribute_name_1: float between 10.0 and 15.0`: 1 decimal digit

Float Example: `attribute_name_1: float between 10.00 and 15.00`: 2 decimal digits

Float Example: `attribute_name_1: float between 10.0 and 15.000`: 3 decimal digits. (The bound with the higher precision determines the digit precision.)
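
The precision rule can be restated as a few lines of plain Python. This is only an illustration of the rule as described here, not the parser's code; the function names are made up for the example.

```python
def decimal_digits(number: str) -> int:
    """Count the digits written after the decimal point."""
    return len(number.split(".", 1)[1]) if "." in number else 0


def float_between_precision(lower: str, upper: str) -> int:
    """Return the higher of the two precisions, as the rule above describes."""
    return max(decimal_digits(lower), decimal_digits(upper))


print(float_between_precision("10.0", "15.0"))    # 1
print(float_between_precision("10.00", "15.00"))  # 2
print(float_between_precision("10.0", "15.000"))  # 3
```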

**Note**: Redefining an attribute with its values will overwrite the current definition!

<br>

### Creating constraints 

The following two tables present the constraints and their arguments, together with the features that each argument supports.

<br>

| Declare Function                                                     | Argument 1  |  Type  | Supports Variables | Supports Conditional Operators | Arg can be empty |  Argument 2   |    Type     | Supports Variables | Supports Conditional Operators | Arg can be empty |
|:---------------------------------------------------------------------|:-----------:|:------:|:------------------:|:------------------------------:|:----------------:|:-------------:|:-----------:|:------------------:|:------------------------------:|:----------------:|
| pos(activity a, position p, time t)                                  | activity a  | encode |        yes         |         yes (only !=)          |        no        |  position p   |     int     |        yes         |              yes               |       yes        |
| payload(attribute a, value v, position p)                            | attribute a | encode |        yes         |         yes (only !=)          |        no        |    value v    |     any     |        yes         |              yes               |        no        |
| payload_range(attribute a, min_value min, max_value max, position p) | attribute a | encode |        yes         |               no               |        no        | min_value min | int / float |         no         |               no               |       yes        |
| absolute_pos(activity a, position p, time t)                         | activity a  | encode |        yes         |               no               |        no        |  position p   |     int     |        yes         |               no               |        no        |
| pos_not_greater_than(activity a, position p, time t)                 | activity a  | encode |        yes         |               no               |        no        |  position p   |     int     |        yes         |               no               |        no        |
| pos_not_lower_than(activity a, position p, time t)                   | activity a  | encode |        yes         |               no               |        no        |  position p   |     int     |        yes         |               no               |        no        |
| absolute_payload(attribute a, value v)                               | attribute a | encode |        yes         |               no               |        no        |    value v    |     any     |        yes         |               no               |        no        |

<br>

| Declare Function                                                     |  Argument 3   |    Type     | Supports Variables | Supports Conditional Operators | Arg can be empty | Argument 4 | Type | Supports Variables | Supports Conditional Operators | Arg can be empty | 
|:---------------------------------------------------------------------|:-------------:|:-----------:|:------------------:|:------------------------------:|:----------------:|:----------:|:----:|:------------------:|:------------------------------:|:----------------:|
| pos(activity a, position p, time t)                                  |    time t     |     int     |        yes         |              yes               |       yes        |            |      |                    |                                |                  |
| payload(attribute a, value v, position p)                            |  position p   |     int     |        yes         |              yes               |       yes        |            |      |                    |                                |                  |
| payload_range(attribute a, min_value min, max_value max, position p) | max_value max | int / float |         no         |               no               |       yes        | position p | int  |        yes         |              yes               |       yes        |
| absolute_pos(activity a, position p, time t)                         |    time t     |     int     |        yes         |               no               |       yes        |            |      |                    |                                |                  |
| pos_not_greater_than(activity a, position p, time t)                 |    time t     |     int     |        yes         |               no               |       yes        |            |      |                    |                                |                  |
| pos_not_lower_than(activity a, position p, time t)                   |    time t     |     int     |        yes         |               no               |       yes        |            |      |                    |                                |                  |

<br>

#### Typing

Arguments can have 4 different types:
- `int` : The argument accepts integer numbers.
- `float` : The argument accepts floating point numbers.
- `encode` : The argument is the name of an `Activity`, the name of an `Attribute` or an `Enumeration` value.
- `any` : The argument can be either `int`, `float` or `encode`.

<br>

#### Variables

Arguments that support this feature can have their value replaced by a character sequence that defines a variable.

To define a variable, use `:` followed by upper case characters or numbers, without spaces. Example: `:VAR1`

Example: `pos(activity a, :P1, :T1)` or `payload(attribute a, :V2, :P2)` 
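
As a rough illustration, the naming rule can be written as a regular expression. The pattern below is an assumption derived from the sentence above (a colon followed by upper case letters or digits); the actual parser may recognise variables differently.

```python
import re

# ":" followed by one or more upper case letters or digits, no spaces
VARIABLE = re.compile(r":[A-Z0-9]+")

line = "pos(activity a, :P1, :T1), payload(attribute a, :V2, :P2)"
print(VARIABLE.findall(line))  # [':P1', ':T1', ':V2', ':P2']
```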

<br>

#### Conditional constraints

**Conditional constraints** can be implemented with the use of variables. They relate the variables used in the Declare functions to each other or to fixed values, which makes it possible to obtain different results from the same functions.

Structure of a conditional constraint: **value or variable (+,-) value or variable (==,!=,<=,<,>,>=) value or variable (+,-) value or variable**

Example of conditional constraints: `:VAR1 != 7`, `:VAR1 == :VAR2`, `:VAR1 == :VAR2 + 10`, `:VAR1 >= :VAR2 - 10`
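
To make the structure concrete, here is a small evaluator for conditions of this shape. It is a sketch under simplifying assumptions (integer values only, tokens separated by spaces) and has nothing to do with how the ASP encoding handles conditions; `holds` and `side_value` are hypothetical names.

```python
import operator
from typing import Dict, List

COMPARE = {"==": operator.eq, "!=": operator.ne, "<=": operator.le,
           "<": operator.lt, ">": operator.gt, ">=": operator.ge}


def side_value(tokens: List[str], bindings: Dict[str, int]) -> int:
    """Evaluate `term` or `term (+|-) term`; a term is a value or a variable."""
    def term(token: str) -> int:
        return bindings[token] if token.startswith(":") else int(token)

    total = term(tokens[0])
    if len(tokens) == 3:
        sign = {"+": 1, "-": -1}[tokens[1]]
        total += sign * term(tokens[2])
    return total


def holds(condition: str, bindings: Dict[str, int]) -> bool:
    """Check a space-separated condition against concrete variable values."""
    tokens = condition.split()
    index = next(i for i, token in enumerate(tokens) if token in COMPARE)
    left, right = tokens[:index], tokens[index + 1:]
    return COMPARE[tokens[index]](side_value(left, bindings), side_value(right, bindings))


bindings = {":P1": 5, ":P2": 3, ":T1": 10, ":T2": 25}
print(holds(":P1 == :P2 + 2", bindings))   # True
print(holds(":T1 + 10 <= :T2", bindings))  # True
print(holds(":P1 != 5", bindings))         # False
```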

<br>

#### Variables and Conditional constraints

By combining variables and conditional constraints, the definition of constraints can be changed to:

Example: `pos(activity a, :P1, :T1), pos(activity b, :P2, :T2), :P1 == :P2 + 2, :T1 + 10 <= :T2`

Example: `pos(:ACT1, position p, time t), :ACT1 != a`

<br>

#### Conditional Operators

Arguments that support conditional operators can be written in the following way:

Example: `pos(a, :P1, >=5), pos(a, >=:P1, 10!=)` which corresponds to `pos(a, :P1, :T1), :T1 >= 5, pos(a, :P2, :T2), :P2 >= :P1, 10 != :T2`

This makes it possible to bound variables or values in different ways.

**Note**: Conditional operators can be placed in front or at the end of a value or variable. The parsing is done completely by the Positional Based model.
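
The correspondence between the short and the expanded form can be sketched as follows. This is not the model's parser: `expand` is a hypothetical helper that takes one argument plus a fresh variable name and returns the variable to put in the function together with the resulting condition.

```python
import re
from typing import Optional, Tuple

OPERATOR = r"==|!=|<=|>=|<|>"
PREFIX = re.compile(rf"^({OPERATOR})(.+)$")   # operator in front, e.g. ">=5"
SUFFIX = re.compile(rf"^(.+?)({OPERATOR})$")  # operator at the end, e.g. "10!="


def expand(argument: str, fresh_variable: str) -> Tuple[str, Optional[str]]:
    """Return (argument to use, extra condition or None)."""
    argument = argument.strip()
    prefix = PREFIX.match(argument)
    if prefix:
        op, operand = prefix.groups()
        return fresh_variable, f"{fresh_variable} {op} {operand}"
    suffix = SUFFIX.match(argument)
    if suffix:
        operand, op = suffix.groups()
        return fresh_variable, f"{operand} {op} {fresh_variable}"
    return argument, None  # no operator: the argument is kept as written


print(expand(">=5", ":T1"))    # (':T1', ':T1 >= 5')
print(expand(">=:P1", ":P2"))  # (':P2', ':P2 >= :P1')
print(expand("10!=", ":T2"))   # (':T2', '10 != :T2')
```

An operator in front puts the new variable on the left of the condition, an operator at the end puts it on the right, which matches the expanded example above.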

<br>

#### Empty Arguments

Arguments can be left empty using `_`, meaning that the value of that argument does not matter. The generator will then choose any value that fits for it.

<br>

#### Some more examples:

| Declare Function Example                                 | Function                                                                                                                                                                                     |
|----------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| pos(activity a, position p, time t)                      | The activity "a" will be placed in position "p" at time "t"                                                                                                                                  |
| pos(activity a, position p, _)                           | The activity "a" will be placed in position "p" at any time "t"                                                                                                                              |
| pos(activity a, _, _)                                    | The activity "a" will be present at least 1 time in the events at any position "p" and at any time "t"                                                                                       |
| payload(attribute a, value v, position p)                | The attribute "a" in position "p" will have a value "v"                                                                                                                                      |
| payload(attribute a, value v, _)                         | One attribute "a" in any position "p" will have a value "v"                                                                                                                                  |
| payload_range(attribute a, int min, int max, position p) | The attribute "a" in position "p" will have a value in between "min" and "max"                                                                                                               |
| payload_range(attribute a, int min, _, position p)       | The attribute "a" in position "p" will have a value in between "min" and the maximum value that the attribute can have                                                                       |
| payload_range(attribute a, _, int max, position p)       | The attribute "a" in position "p" will have a value in between the minimum value that the attribute can have and "max"                                                                       |
| payload_range(attribute a, _, _, position p)             | The attribute "a" in position "p" will have a value in between the minimum value that the attribute can have and the maximum value that the attribute can have                               |
| payload_range(attribute a, int min, int max, _)          | One attribute "a" in any position "p" will have a value in between "min" and "max"                                                                                                           |
| absolute_pos(activity a, position p, time t)             | The activity "a" will be placed in position "p" at time "t" and there will not be any more occurrences of that activity in the trace events                                                  |
| pos_not_greater_than(activity a, position p, time t)     | The activity "a" will be placed in a position not greater than "p" at a time not greater than "t" and there will not be any more occurrences of that activity after position "p" at time "t" |
| pos_not_greater_than(activity a, position p, _)          | The activity "a" will be placed in a position not greater than  "p" at any time "t" and there will not be any more occurrences of that activity after position "p"                           |
| pos_not_lower_than(activity a, position p, time t)       | The activity "a" will be placed in a position not lower than  "p" at a time not lower than "t" and there will not be any more occurrences of that activity before position "p" at time "t"   |
| pos_not_lower_than(activity a, position p, _)            | The activity "a" will be placed in a position not lower than  "p" at any time "t" and there will not be any more occurrences of that activity before position "p"                            |
| absolute_payload(attribute a, value v)                   | The attribute "a" will have value "v" in any activity that contains the attribute "a" in the trace                                                                                           |
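
A few of the rows above can be read as checks on a finished trace. The sketch below ignores time and payloads, represents a trace as a plain list of activity names and assumes that positions are counted from 1; these are simplifications made for the example, not statements about the generator.

```python
from typing import List, Optional


def pos(trace: List[str], activity: str, position: Optional[int] = None) -> bool:
    """pos(a, p, _): `activity` is at `position`; None plays the role of `_`."""
    if position is None:
        return activity in trace
    return 1 <= position <= len(trace) and trace[position - 1] == activity


def absolute_pos(trace: List[str], activity: str, position: int) -> bool:
    """absolute_pos(a, p, _): at `position` and nowhere else in the trace."""
    return pos(trace, activity, position) and trace.count(activity) == 1


def pos_not_greater_than(trace: List[str], activity: str, position: int) -> bool:
    """pos_not_greater_than(a, p, _): occurs, and never after `position`."""
    positions = [i + 1 for i, name in enumerate(trace) if name == activity]
    return bool(positions) and max(positions) <= position


trace = ["a", "b", "a", "c"]
print(pos(trace, "a", 1))                   # True
print(absolute_pos(trace, "a", 1))          # False, "a" occurs twice
print(pos_not_greater_than(trace, "a", 3))  # True
print(pos_not_greater_than(trace, "a", 2))  # False, "a" also occurs at 3
```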


## Functioning of the positional model

Let's see how the model works in practice. First import the model.

In [1]:
from Declare4Py.ProcessMiningTasks.LogGenerator.PositionalBased.PositionalBasedModel import PositionalBasedModel

A PositionalBasedModel can be instantiated with the functions `parse_from_file` or `parse_from_string`. Both functions read a positional based Declare description and create a model accordingly.

For our model, the file `experimental_model.decl` will be used. 

When a `PositionalBasedModel` is created, these parameters can be passed to the constructor:
- `positional_time_start: int` = The starting time unit for the positional based model. Default value: 1
- `positional_time_end: int` = The ending time unit for the positional based model. Default value: 100
- `time_unit_in_seconds_min: int` = The minimum value in seconds that 1 positional time unit can have. Default value: 240
- `time_unit_in_seconds_max: int` = The maximum value in seconds that 1 positional time unit can have. Default value: 300
- `verbose: bool` = Whether the user wants to see debug messages.

Each generated trace has a positional time unit value. One positional time unit corresponds to a number of seconds in the range between `time_unit_in_seconds_min` and `time_unit_in_seconds_max`.
When the sequence of events is generated, the timestamps are calculated from the time unit assigned.
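
One possible reading of this mapping is sketched below. The exact formula used by the generator is not shown in this notebook, so the function `event_timestamps`, the single random draw per trace and the start date are assumptions made purely for illustration.

```python
import random
from datetime import datetime, timedelta
from typing import List, Optional


def event_timestamps(positional_times: List[int], start: datetime,
                     unit_min: int = 240, unit_max: int = 300,
                     rng: Optional[random.Random] = None) -> List[datetime]:
    """Turn positional time units into timestamps for one trace."""
    rng = rng or random.Random()
    # Assumption: one length in seconds is drawn per trace for the time unit
    seconds_per_unit = rng.randint(unit_min, unit_max)
    return [start + timedelta(seconds=t * seconds_per_unit) for t in positional_times]


stamps = event_timestamps([1, 5, 10], datetime(2024, 1, 1, 8, 0), rng=random.Random(0))
for stamp in stamps:
    print(stamp.isoformat())
```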

**NOTE**: `positional_time_end` **cannot** be lower than the maximum number of events, otherwise the problem becomes UNSAT.

**NOTE**: The smaller the range between `positional_time_start` and `positional_time_end`, the less time trace generation takes; the larger the range, the more time it takes.

In [2]:
model_name = "experimental_model"
model_path: str = f"../../../Declare4Py/ProcessMiningTasks/LogGenerator/PositionalBased/DeclareFiles/{model_name}.decl"
model: PositionalBasedModel = PositionalBasedModel().parse_from_file(model_path)



The positional time range and the time unit in seconds can be changed at any time on the model with the functions `set_positional_based_time_range` and `set_time_unit_in_seconds_range`. Calling either function without arguments resets the values.

In [3]:
model.set_positional_based_time_range()
model.set_time_unit_in_seconds_range()

The Positional Based Model can be exported to ASP or Declare strings using:


In [4]:
# Returns the ASP string of the model for positive traces
print(model.to_asp())

In [5]:
# Returns the ASP string of the model for negative traces
print(model.to_asp(generate_negatives=True))

In [6]:
# Returns the Declare string of the model
print(model.to_declare())

In [7]:
# Returns the String of the parsed Declare model
print(model.get_parsed_model())

The Positional Based Model can be exported to ASP or Declare files using:

In [3]:
# Export both the Declare and the ASP file

export_path = "../../../output/" + model_name

# The parameters are self-explanatory: depending on their values, the result is the output of one of the methods in the previous cells, written to a file
# The function exports both the asp and the decl file. It can also export only one file at a time through the attributes asp_file and Decl_file
model.to_file(export_path)

# Otherwise the files can be exported individually
# The parameters are the same
model.to_asp_file(export_path)
model.to_decl_file(export_path)

The ASP string can also be exported in other ways:

In [21]:
# Exports the model without constraints
model.to_asp_file_without_constraints(export_path + "_no_constraints")

# Creates one ASP file per constraint rule
# If generate_also_negatives is True, the negatives of the rules are exported as well
model.to_one_asp_file_per_constraints(export_path, generate_also_negatives=True)

**NOTE**: Remember to manually add the constant p to the ASP program if you want to use the exported file. Use the line `#const p = insert_value.`
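
A minimal sketch of that manual step, assuming the exported ASP text has already been read into a string. The helper `with_const_p`, the toy one-line program and the value 20 are illustrative only and do not come from the exported model.

```python
def with_const_p(asp_program: str, p: int) -> str:
    """Prepend the `#const p = <value>.` line to an ASP program text."""
    return f"#const p = {p}.\n{asp_program}"


toy_program = "position(1..p)."  # stand-in for the content of an exported file
print(with_const_p(toy_program, 20))
```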

# Synthetic Positional Based Log Generation from DECLARE Models

DECLARE4Py generates synthetic logs from DECLARE positional models with a solution based on Answer Set Programming that uses the Clingo solver.

In [10]:
from Declare4Py.ProcessMiningTasks.LogGenerator.PositionalBased.PositionalBasedLogGenerator import PositionalBasedLogGenerator

Using the model initialized above together with some general settings, we can instantiate the generator.

In [11]:
# Number of cases that have to be generated
num_of_cases = 40

# Minimum and maximum number of events a case can contain
(num_min_events, num_max_events) = (20, 20)

# Shows some feedback from the generator (set it to False to suppress all debug messages)
verbose = True

generator: PositionalBasedLogGenerator = PositionalBasedLogGenerator(num_of_cases, num_min_events, num_max_events, model, verbose=verbose)

# To change the number of traces use:
# generator.set_total_traces()
# To change the minimum and maximum number of events use:
# generator.set_min_max_events()
# To change the model use:
# generator.set_positional_based_model()


To run the generator, call the method `run`. The method supports the following parameters:

- `equal_rule_split: bool` = Whether to generate an equal number of traces for each rule. If False, the traces are generated randomly.
- `high_variability: bool` = If True, the traces are generated one at a time; otherwise they are generated together, with low variability.
- `generate_negatives_traces: bool` = Whether to generate the negative traces as well. If True, the number of traces is doubled: half are positive and the other half negative.
- `positive_noise_percentage: int` = The percentage of noise for positive traces: x percent of the positive traces are falsely assigned the negative label.
- `negative_noise_percentage: int` = The percentage of noise for negative traces: x percent of the negative traces are falsely assigned the positive label.
- `append_results: bool` = If True, the results of the current run are appended to the old results; otherwise the old results are deleted and only the new ones are stored.
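
The effect of the two noise percentages can be pictured on a plain list of labels. The sketch below is not the generator's implementation: the function `apply_label_noise`, the label strings and the rounding of the number of flipped traces are assumptions chosen for the example.

```python
import random
from typing import List


def apply_label_noise(labels: List[str], positive_noise_percentage: int,
                      negative_noise_percentage: int, seed: int = 0) -> List[str]:
    """Flip the given percentage of positive and of negative labels."""
    rng = random.Random(seed)
    noisy = list(labels)
    plan = (("positive", "negative", positive_noise_percentage),
            ("negative", "positive", negative_noise_percentage))
    for label, flipped, percentage in plan:
        # Indices come from the original labels, so nothing is flipped twice
        indices = [i for i, value in enumerate(labels) if value == label]
        how_many = round(len(indices) * percentage / 100)
        for i in rng.sample(indices, how_many):
            noisy[i] = flipped
    return noisy


labels = ["positive"] * 40 + ["negative"] * 40
noisy = apply_label_noise(labels, positive_noise_percentage=10, negative_noise_percentage=5)
print(noisy.count("positive"), noisy.count("negative"))  # 38 42
```

With 40 traces per class, 10 percent noise on the positives flips 4 labels and 5 percent on the negatives flips 2, which gives the 38 and 42 printed above.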

In [12]:
%%time
generator.run(generate_negatives_traces=True)

The results of the `PositionalBasedLogGenerator` can then be exported: the `to_xes` method saves them in a `.xes` event log and the `to_csv` method saves them in a `.csv` file.

In [13]:
generator.to_csv(export_path)
generator.to_xes(export_path)

## Generating traces with noise, equal split rule or high variability

Noise can be applied to the traces in order to generate falsely labelled traces. Given a value between 0 and 100, the generator generates "n" traces and then applies the noise percentage.

In [14]:
%%time
generator.run(generate_negatives_traces=True, negative_noise_percentage=5, positive_noise_percentage=7)
generator.to_csv(f'{export_path}_Noise_Test.csv')

With `equal_rule_split` set to False, the generator produces the traces without knowing which rule each trace is correlated to.

In [15]:
%%time
generator.run(equal_rule_split=False, generate_negatives_traces=True, negative_noise_percentage=6, positive_noise_percentage=3)
generator.to_csv(f'{export_path}_No_Equal_Split_Rule_Test.csv')

With `high_variability` set to False, the generator quickly produces all traces together, but the variability of the generated traces is low. Setting it to True, as in the next cell, generates the traces one at a time.

In [None]:
%%time
generator.run(high_variability=True, equal_rule_split=True, generate_negatives_traces=True, negative_noise_percentage=6, positive_noise_percentage=3)
generator.to_csv(f'{export_path}_High_Variability_Test.csv')

## Setting up the Length Distribution of the Cases

Users can specify a probability distribution over the lengths of the generated traces. The method `set_distribution_type` takes as parameter the `distribution_type`. By setting this parameter with the `uniform` value, a uniform distribution in `[num_min_events, num_max_events]` is chosen. 

Also, the number of positive traces can be changed with the method `set_total_traces`.

In [16]:
%%time
# Default is uniform
generator.set_distribution_type("uniform")

generator.run(high_variability=True, generate_negatives_traces=True)
generator.to_csv(f'{export_path}_Distribution_Test_1.csv')

A `gaussian` distribution requires a location (the mean, mu) and a scale (sigma, the standard deviation).

In [17]:
%%time
generator.change_distribution_settings(min_num_events_or_mu=25.5, max_num_events_or_sigma=2.0, dist_type="gaussian")
generator.run(high_variability=True, generate_negatives_traces=True)
generator.to_csv(f'{export_path}_Distribution_Test_2.csv')

A `custom` distribution requires the user to set the probability for each length in `[num_min_events, num_max_events]`

In [18]:
%%time
generator.set_distribution_type("custom")

# Let's change the minimum and maximum number of events
generator.set_min_max_events(19,26)
generator.set_custom_probabilities([0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.3])

generator.run(high_variability=True, generate_negatives_traces=True)
generator.to_csv(f'{export_path}_Distribution_Test_3.csv')
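
To show what the three distribution types mean for the trace length, here is a stand-alone sketch built on Python's `random` module. It is not the generator's code: `sample_length` is a hypothetical function, and rounding the gaussian draw to at least one event is an assumption made for the example.

```python
import random
from typing import Optional, Sequence


def sample_length(dist_type: str, min_events_or_mu: float, max_events_or_sigma: float,
                  custom_probabilities: Optional[Sequence[float]] = None,
                  rng: Optional[random.Random] = None) -> int:
    """Draw one trace length from a uniform, gaussian or custom distribution."""
    rng = rng or random.Random()
    if dist_type == "uniform":
        return rng.randint(int(min_events_or_mu), int(max_events_or_sigma))
    if dist_type == "gaussian":
        # random.gauss takes the mean and the standard deviation
        return max(1, round(rng.gauss(min_events_or_mu, max_events_or_sigma)))
    if dist_type == "custom":
        lengths = range(int(min_events_or_mu), int(max_events_or_sigma) + 1)
        if custom_probabilities is None or len(custom_probabilities) != len(lengths):
            raise ValueError("one probability per possible length is required")
        if abs(sum(custom_probabilities) - 1.0) > 1e-9:
            raise ValueError("the probabilities must sum to 1")
        return rng.choices(lengths, weights=custom_probabilities, k=1)[0]
    raise ValueError(f"unknown distribution type: {dist_type}")


rng = random.Random(1)
probabilities = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.3]
print(sample_length("uniform", 20, 20, rng=rng))  # 20
print(sample_length("gaussian", 25.5, 2.0, rng=rng))
print(sample_length("custom", 19, 26, probabilities, rng=rng))
```

The custom case makes the constraint visible: the range 19 to 26 contains eight lengths, so eight probabilities summing to 1 are needed.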

## Setting up the Personalized Clingo configuration

### More information

For more information on clingo and its functionalities consult:  https://potassco.org/

For more information on the option commands consult the documentation of Clingo (Potassco) at: https://github.com/potassco/guide/releases/ or https://github.com/potassco/asprin/blob/master/asprin/src/main/clingo_help.py

Or download directly the documentation from here: https://github.com/potassco/guide/releases/download/v2.2.0/guide.pdf

### Setting up the configuration

Clingo offers various options to personalize the solver's range of action, probabilistic reasoning and decision-making.

At the moment the solver can be personalized through the method `use_custom_clingo_configuration`, with the following options:
    
- The **Configuration** of clingo can be: "frumpy", "tweety", "crafty", "jumpy", "trendy" or "handy". (Default is trendy)
- The number of **Threads** used by clingo to speed up the process. (Default uses all possible cores)
- The **Time limit** is the maximum time that the solver can use to search for a satisfiable answer.
- The **Random Frequency** used by clingo in the decision-making is a float number between 0 and 1 inclusive, where 0 means no random decisions and 1 means every decision is random. (Default is 0.3)
- The **Mode** configures the optimization of the algorithm and can be either "optN" or "ignore". (Default is optN)
- The **Sign** of the operation, which can be "asp", "pos", "neg" or "rnd". (Default is asp)
- The **Strategy** configures the optimization strategy and can be "bb" or "usc". (This functionality is not used in the default configuration)
- The **Heuristic** used by clingo configures the decision heuristic and can be "Berkmin", "Vmtf", "Vsids", "Domain", "Unit" or "None". (This functionality is not used in the default configuration)
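
For orientation, these options resemble clingo's own command-line flags. The sketch below builds such a flag list in plain Python; how the generator actually hands its options to clingo is not shown in this notebook, so the function `clingo_arguments` and the parameter names `time_limit` and `mode` are assumptions, while the remaining parameter names follow the call used in the cell below.

```python
from typing import List, Optional


def clingo_arguments(config: str = "trendy", threads: Optional[int] = None,
                     time_limit: Optional[int] = None, frequency: float = 0.3,
                     mode: str = "optN", sign_def: str = "asp",
                     strategy: Optional[str] = None,
                     heuristic: Optional[str] = None) -> List[str]:
    """Build a list of clingo command-line flags from the options above."""
    arguments = [
        f"--configuration={config}",
        f"--rand-freq={frequency}",
        f"--opt-mode={mode}",
        f"--sign-def={sign_def}",
    ]
    # Optional settings are only added when a value is given
    if threads is not None:
        arguments.append(f"--parallel-mode={threads}")
    if time_limit is not None:
        arguments.append(f"--time-limit={time_limit}")
    if strategy is not None:
        arguments.append(f"--opt-strategy={strategy}")
    if heuristic is not None:
        arguments.append(f"--heuristic={heuristic}")
    return arguments


print(clingo_arguments(config="frumpy", frequency=0.9, strategy="bb", heuristic="Domain"))
```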


In [20]:
%%time

generator.use_default_clingo_configuration()
# The default configuration can be obtained using the following command
print(generator.get_current_clingo_configuration())

# To enable the custom configuration: 
generator.use_custom_clingo_configuration(config="frumpy", threads=None, frequency=0.9, sign_def="asp", strategy="bb", heuristic="Domain")

# The current configuration then becomes the custom one
print(generator.get_current_clingo_configuration())

# this command tells the generator to use the default configuration again
# generator.use_default_clingo_configuration()
# It does not delete the old custom configuration, in fact the custom configuration can be re-enabled by calling
# generator.use_custom_clingo_configuration()

generator.run(high_variability=True, generate_negatives_traces=True)
generator.to_csv(f'{export_path}_Custom_Configuration_Test.csv')