# Explainer utility in BPMN2CONSTRAINTS

In this notebook, we explore the `Explainer` class, designed to analyze and explain the conformance of traces against predefined constraints. Trace analysis is crucial in domains such as process mining, where understanding the behavior of system executions against expected models can uncover inefficiencies, deviations, or compliance issues.

The constraints are currently expressed as basic regular expressions, because regex closely resembles the declarative constraints used in BPMN2CONSTRAINTS.


## Step 1: Setup

In [1]:
import sys
sys.path.append('../')
from explainer.explainer_util import Trace, EventLog
from explainer.explainer_regex import ExplainerRegex


## Step 2: Basic Usage
Let's start by creating an instance of the `Explainer` and adding a simple constraint that a valid trace should contain the sequence "A" followed by "B" and then "C".


In [2]:
explainer = ExplainerRegex()
explainer.add_constraint('A.*B.*C')

## Step 3: Analyzing Trace Conformance

Now, we'll create a trace and check if it conforms to the constraints we've defined.

In [3]:
trace = Trace(['A', 'X', 'B', 'Y', 'C'])
is_conformant = explainer.conformant(trace)
print(f"Is the trace conformant? {is_conformant}")

Is the trace conformant? True
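
A check of this kind can be pictured as a regex search over the concatenated events of the trace. The sketch below is a stand-in that uses only Python's `re` module and assumes single-character event names. `is_conformant` is a hypothetical helper, not part of the `explainer` package, whose internals may differ.

```python
import re

def is_conformant(nodes, constraints):
    """Return True if the joined trace satisfies every regex constraint."""
    trace_str = "".join(nodes)  # assumption: one character per event
    return all(re.search(pattern, trace_str) for pattern in constraints)

print(is_conformant(['A', 'X', 'B', 'Y', 'C'], ['A.*B.*C']))  # True
print(is_conformant(['A', 'C'], ['A.*B.*C']))                 # False
```

`re.search` is used rather than `re.match` so that a pattern such as `C$` can match at the end of the trace.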


## Step 4: Explaining Non-conformance

If a trace is not conformant, we can use the `minimal_expl` and `counterfactual_expl` methods to understand why and how to adjust the trace.


In [4]:
non_conformant_trace = Trace(['A', 'C'])
print('Constraint: A.*B.*C')
print('Trace:' + str(non_conformant_trace.nodes))
print(explainer.minimal_expl(non_conformant_trace))
print(explainer.counterfactual_expl(non_conformant_trace))

non_conformant_trace = Trace(['C', 'B', 'A'])
print('-----------')
print('Constraint: A.*B.*C')
print('Trace:' + str(non_conformant_trace.nodes))
print(explainer.minimal_expl(non_conformant_trace))
print(explainer.counterfactual_expl(non_conformant_trace))

non_conformant_trace = Trace(['A','A','C'])
print('-----------')
print('Constraint: A.*B.*C')
print('Trace:' + str(non_conformant_trace.nodes))
print(explainer.minimal_expl(non_conformant_trace))
print(explainer.counterfactual_expl(non_conformant_trace))


non_conformant_trace = Trace(['A','A','C','A','TEST','A','C', 'X', 'Y']) 
print('-----------')
print('Constraint: A.*B.*C')
print('Trace:' + str(non_conformant_trace.nodes))
print(explainer.minimal_expl(non_conformant_trace))
print(explainer.counterfactual_expl(non_conformant_trace))


explainer.remove_constraint(0)
explainer.add_constraint('AC')
non_conformant_trace = Trace(['A', 'X', 'C']) # Subtraction
print('-----------')
print('Constraint: AC')
print('Trace:' + str(non_conformant_trace.nodes))
print(explainer.minimal_expl(non_conformant_trace))
print(explainer.counterfactual_expl(non_conformant_trace))
print('-----------')

explainer.add_constraint('B.*A.*B.*C')
explainer.add_constraint('A.*B.*C.*')
explainer.add_constraint('A.*D.*B*')
explainer.add_constraint('A[^D]*B')
explainer.add_constraint('B.*[^X].*')
non_conformant_trace = Trace(['A', 'X', 'C']) # Same trace, now checked against all registered constraints
for con in explainer.constraints:
    print(f'constraint: {con}')
print('Trace:' + str(non_conformant_trace.nodes))
print(explainer.minimal_expl(non_conformant_trace))
print(explainer.counterfactual_expl(non_conformant_trace))




Constraint: A.*B.*C
Trace:['A', 'C']
Non-conformance due to: Constraint (A.*B.*C) is violated by subtrace: ('A', 'C')

Addition (Added B at position 1): A->B->C
-----------
Constraint: A.*B.*C
Trace:['C', 'B', 'A']
Non-conformance due to: Constraint (A.*B.*C) is violated by subtrace: ('C', 'B')

Addition (Added A at position 1): C->A->B->A
Subtraction (Removed C from position 0): A->B->A
Addition (Added C at position 2): A->B->C->A
-----------
Constraint: A.*B.*C
Trace:['A', 'A', 'C']
Non-conformance due to: Constraint (A.*B.*C) is violated by subtrace: ('A', 'A')

Addition (Added B at position 1): A->B->A->C
-----------
Constraint: A.*B.*C
Trace:['A', 'A', 'C', 'A', 'TEST', 'A', 'C', 'X', 'Y']
Non-conformance due to: Constraint (A.*B.*C) is violated by subtrace: ('A', 'A')

Subtraction (Removed TEST from position 4): A->A->C->A->A->C->X->Y
Addition (Added B at position 1): A->B->A->C->A->A->C->X->Y
-----------
Constraint: AC
Trace:['A', 'X', 'C']
Non-conformance due to: Constraint (AC
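
To make the idea of a counterfactual more concrete, here is a minimal sketch that searches for the fewest single-event additions or removals that turn a trace into a conformant one. It is a plain breadth-first search, not the algorithm used by `counterfactual_expl`. The function `shortest_repair`, its `alphabet` and `max_edits` parameters, and the assumption of single-character event names are all illustrative.

```python
import re
from collections import deque

def shortest_repair(nodes, constraints, alphabet, max_edits=4):
    """Breadth-first search for the fewest single-event additions or
    removals that make the trace satisfy every regex constraint."""
    def ok(candidate):
        joined = "".join(candidate)  # assumption: one character per event
        return all(re.search(p, joined) for p in constraints)

    start = tuple(nodes)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        current, edits = queue.popleft()
        if ok(current):
            return list(current), edits
        if len(edits) >= max_edits:
            continue  # do not search deeper than max_edits
        neighbours = []
        for i in range(len(current) + 1):       # additions
            for sym in alphabet:
                neighbours.append((current[:i] + (sym,) + current[i:],
                                   f"Added {sym} at position {i}"))
        for i in range(len(current)):           # removals
            neighbours.append((current[:i] + current[i + 1:],
                               f"Removed {current[i]} from position {i}"))
        for candidate, edit in neighbours:
            if candidate not in seen:
                seen.add(candidate)
                queue.append((candidate, edits + [edit]))
    return None  # no conformant trace within max_edits

result = shortest_repair(['A', 'C'], ['A.*B.*C'], alphabet=['A', 'B', 'C'])
print(result)  # (['A', 'B', 'C'], ['Added B at position 1'])
```

Because the search proceeds level by level, the first conformant trace it finds uses a minimal number of edits. The `max_edits` bound keeps the search finite when no repair exists.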

## Step 5: Generating minimal solutions

In [5]:
exp = ExplainerRegex()
exp.add_constraint("^A")
exp.add_constraint("A.*B.*")
exp.add_constraint("C$")
trace = Trace(['A', 'B','A','C', 'B'])
print("Example without minimal solution")
print("--------------------------------")
print(exp.counterfactual_expl(trace))

print("\nExample with minimal solution")
print("--------------------------------")
exp.set_minimal_solution(True)
print(exp.counterfactual_expl(trace))
exp.set_minimal_solution(False)
trace = Trace(['C','B','A'])
print("\nExample without minimal solution")
print("--------------------------------")
print(exp.counterfactual_expl(trace))

print("\nExample with minimal solution")
print("--------------------------------")
exp.set_minimal_solution(True)
print(exp.counterfactual_expl(trace))

Example without minimal solution
--------------------------------

Subtraction (Removed A from position 2): A->B->C->B
Subtraction (Removed B from position 3): A->B->C

Example with minimal solution
--------------------------------

Addition (Added B at position 1): A->B->B->A->C->B
Subtraction (Removed B from position 5): A->B->B->A->C

Example without minimal solution
--------------------------------

Addition (Added B at position 1): C->B->B->A
Addition (Added B at position 1): C->B->B->B->A
Addition (Added A at position 1): C->A->B->B->B->A
Subtraction (Removed C from position 0): A->B->B->B->A
Addition (Added C at position 4): A->B->B->B->C->A
Subtraction (Removed A from position 5): A->B->B->B->C

Example with minimal solution
--------------------------------

Addition (Added A at position 1): C->A->B->A
Subtraction (Removed C from position 0): A->B->A
Addition (Added C at position 2): A->B->C->A
Subtraction (Removed A from position 3): A->B->C


## Step 6: Contribution functions and Event Logs

For this project, four contribution functions have been developed to determine how much a trace variant or a constraint contributes to a system.

For the sake of efficiency, the contributions returned by each of the functions `variant_ctrb_to_conformance_loss`, `variant_ctrb_to_fitness`, `constraint_ctrb_to_fitness` and `constraint_ctrb_to_conformance` should sum to the corresponding total: the conformance loss or the fitness loss.

There are two methods for determining the conformance rate (and, by extension, the conformance loss) and the fitness rate: `determine_conformance_rate` and `determine_fitness_rate`.

All of these methods operate on an abstraction of an event log. The following cell shows how to initialize an event log and compute the conformance rate and fitness rate.

In [6]:
exp = ExplainerRegex()
# Setup an event log
event_log = EventLog()
traces = [
    Trace(['A', 'B','C']),
    Trace(['A', 'B']),
    Trace(['B']),
    Trace(['B','C'])
]
event_log.add_trace(traces[0], 10) # The second parameter is the number of occurrences of this variant to add; omit it to add 1
event_log.add_trace(traces[1], 10)
event_log.add_trace(traces[2], 10)
event_log.add_trace(traces[3], 20)
# Add the constraints
exp.add_constraint('^A')
exp.add_constraint('C$')

print("Conformance rate: " + str(exp.determine_conformance_rate(event_log) * 100) + "%")
print("Fitness rate:     " + str(exp.determine_fitness_rate(event_log) * 100) + "%")

Conformance rate: 20.0%
Fitness rate:     50.0%
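
The two rates can be reproduced by hand. The sketch below assumes that the conformance rate is the share of traces satisfying every constraint, and that the fitness rate is the share of satisfied (trace, constraint) pairs. Both definitions are inferred from the printed values, not taken from the library source. The list-of-tuples log and the helper names are illustrative.

```python
import re

# (events of the variant, number of occurrences), mirroring the event log above
log = [(['A', 'B', 'C'], 10), (['A', 'B'], 10), (['B'], 10), (['B', 'C'], 20)]
constraints = ['^A', 'C$']

def satisfied(nodes, pattern):
    """True if the joined trace matches the regex constraint."""
    return re.search(pattern, "".join(nodes)) is not None

def conformance_rate(log, constraints):
    """Share of traces that satisfy every constraint."""
    total = sum(count for _, count in log)
    conformant = sum(count for nodes, count in log
                     if all(satisfied(nodes, c) for c in constraints))
    return conformant / total

def fitness_rate(log, constraints):
    """Share of (trace, constraint) pairs that are satisfied."""
    total_pairs = sum(count for _, count in log) * len(constraints)
    hits = sum(count * sum(satisfied(nodes, c) for c in constraints)
               for nodes, count in log)
    return hits / total_pairs

print(conformance_rate(log, constraints))  # 0.2
print(fitness_rate(log, constraints))      # 0.5
```

Only the 10 `A, B, C` traces satisfy both constraints (10 of 50), while 50 of the 100 (trace, constraint) pairs are satisfied.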


`variant_ctrb_to_conformance_loss` determines how much a specific variant contributes to the overall conformance loss.

In [7]:
print("Contribution of variant to conformance loss")
print("Ctrb of variant "+ str(traces[0].nodes) +": "+ str(exp.variant_ctrb_to_conformance_loss(event_log, traces[0])))
print("Ctrb of variant "+ str(traces[1].nodes) +":      "+ str(exp.variant_ctrb_to_conformance_loss(event_log, traces[1])))
print("Ctrb of variant "+ str(traces[2].nodes) +":           "+ str(exp.variant_ctrb_to_conformance_loss(event_log, traces[2])))
print("Ctrb of variant "+ str(traces[3].nodes) +":      "+ str(exp.variant_ctrb_to_conformance_loss(event_log, traces[3])))
# The per-variant contributions should add up to the total conformance loss
total_conformance_loss = sum(exp.variant_ctrb_to_conformance_loss(event_log, t) for t in traces)
print("Total conformance loss:          " + str(total_conformance_loss))

Contribution of variant to conformance loss
Ctrb of variant ['A', 'B', 'C']: 0.0
Ctrb of variant ['A', 'B']:      0.2
Ctrb of variant ['B']:           0.2
Ctrb of variant ['B', 'C']:      0.4
Total conformance loss:          0.8
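
These numbers are consistent with a simple reading: a variant contributes its share of the log if it violates at least one constraint, and nothing otherwise. The sketch below implements that reading with plain Python. The function `variant_conformance_loss` and the tuple-based log are illustrative assumptions, not the library's code.

```python
import re

log = [(['A', 'B', 'C'], 10), (['A', 'B'], 10), (['B'], 10), (['B', 'C'], 20)]
constraints = ['^A', 'C$']

def variant_conformance_loss(log, constraints, variant):
    """Share of all traces that belong to `variant`, if it violates any constraint."""
    total = sum(count for _, count in log)
    count = sum(c for nodes, c in log if nodes == variant)
    violates = any(re.search(p, "".join(variant)) is None for p in constraints)
    return count / total if violates else 0.0

losses = [variant_conformance_loss(log, constraints, nodes) for nodes, _ in log]
print(losses)                 # [0.0, 0.2, 0.2, 0.4]
print(round(sum(losses), 2))  # 0.8
```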


`variant_ctrb_to_fitness` determines how much a specific variant contributes to the overall loss in fitness rate.

In [8]:
print("Contribution of variant to fitness rate")
print("Ctrb of variant " + str(traces[0].nodes) + ": " + str(round(exp.variant_ctrb_to_fitness(event_log, traces[0]), 2)))
print("Ctrb of variant " + str(traces[1].nodes) + ":      " + str(round(exp.variant_ctrb_to_fitness(event_log, traces[1]), 2)))
print("Ctrb of variant " + str(traces[2].nodes) + ":           " + str(round(exp.variant_ctrb_to_fitness(event_log, traces[2]), 2)))
print("Ctrb of variant " + str(traces[3].nodes) + ":      " + str(round(exp.variant_ctrb_to_fitness(event_log, traces[3]), 2)))
total_fitness = (exp.variant_ctrb_to_fitness(event_log, traces[0]) +
                 exp.variant_ctrb_to_fitness(event_log, traces[1]) +
                 exp.variant_ctrb_to_fitness(event_log, traces[2]) +
                 exp.variant_ctrb_to_fitness(event_log, traces[3]))
print("Total fitness:                   " + str(round(total_fitness, 2)))


Contribution of variant to fitness rate
Ctrb of variant ['A', 'B', 'C']: 0.0
Ctrb of variant ['A', 'B']:      0.1
Ctrb of variant ['B']:           0.2
Ctrb of variant ['B', 'C']:      0.2
Total fitness:                   0.5
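
The values above match the following reading, which is an assumption inferred from the output: each variant contributes the number of (trace, constraint) pairs it violates, divided by the number of all such pairs in the log. `variant_fitness_loss` is a hypothetical helper written for illustration.

```python
import re

log = [(['A', 'B', 'C'], 10), (['A', 'B'], 10), (['B'], 10), (['B', 'C'], 20)]
constraints = ['^A', 'C$']

def variant_fitness_loss(log, constraints, variant):
    """Violated (trace, constraint) pairs of `variant` over all pairs in the log."""
    total_pairs = sum(count for _, count in log) * len(constraints)
    count = sum(c for nodes, c in log if nodes == variant)
    violated = sum(re.search(p, "".join(variant)) is None for p in constraints)
    return count * violated / total_pairs

losses = [variant_fitness_loss(log, constraints, nodes) for nodes, _ in log]
print(losses)                 # [0.0, 0.1, 0.2, 0.2]
print(round(sum(losses), 2))  # 0.5
```

The variant `['B']` violates both constraints, so its 10 traces weigh as much as the 20 traces of `['B', 'C']`, which violate only one.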


`constraint_ctrb_to_fitness` determines how much a specific constraint contributes to the overall loss in fitness rate.

In [9]:

# Index 0 is '^A' and index 1 is 'C$', in the order the constraints were added
print("^A ctrb to fitness rate: " + str(exp.constraint_ctrb_to_fitness(event_log, exp.constraints, 0)))
print("C$ ctrb to fitness rate: " + str(exp.constraint_ctrb_to_fitness(event_log, exp.constraints, 1)))
print("Total fitness loss       " + str(exp.constraint_ctrb_to_fitness(event_log, exp.constraints, 0) + exp.constraint_ctrb_to_fitness(event_log, exp.constraints, 1)))

^A ctrb to fitness rate: 0.3
C$ ctrb to fitness rate: 0.2
Total fitness loss       0.5
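
Seen from the constraint side, the same total can be split by counting, for each constraint, the traces that violate it. The sketch below assumes this is how the contribution is defined. It reproduces the printed values, but `constraint_fitness_loss` is an illustrative helper and not the library implementation.

```python
import re

log = [(['A', 'B', 'C'], 10), (['A', 'B'], 10), (['B'], 10), (['B', 'C'], 20)]
constraints = ['^A', 'C$']

def constraint_fitness_loss(log, constraints, index):
    """Traces violating constraints[index] over all (trace, constraint) pairs."""
    total_pairs = sum(count for _, count in log) * len(constraints)
    violating = sum(count for nodes, count in log
                    if re.search(constraints[index], "".join(nodes)) is None)
    return violating / total_pairs

losses = [constraint_fitness_loss(log, constraints, i) for i in range(len(constraints))]
print(losses)                 # [0.3, 0.2]
print(round(sum(losses), 2))  # 0.5
```

Since every violated pair belongs to exactly one constraint, the contributions add up without any overlap correction.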


## Step 7: Shapley values

`constraint_ctrb_to_conformance` determines how much a specific constraint contributes to the overall conformance loss. 

Because the constraints overlap in this case (a single trace can violate several of them), Shapley values are used to determine the contribution. This makes the method more complicated and computationally heavier than the other contribution functions.
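
For reference, the standard Shapley value of a constraint $i$ among $n$ constraints $N$ is given below. As an assumption, $v(S)$ is taken to be the conformance loss when only the constraints in $S$ are enforced.

$$
\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(n - |S| - 1)!}{n!}\,\bigl(v(S \cup \{i\}) - v(S)\bigr)
$$

For the log from Step 6, write $a$ for `^A` and $c$ for `C$`. Counting violating traces gives $v(\emptyset) = 0$, $v(\{a\}) = 30/50 = 0.6$, $v(\{c\}) = 20/50 = 0.4$ and $v(\{a, c\}) = 40/50 = 0.8$, so:

$$
\phi_a = \tfrac{1}{2}(0.6 - 0) + \tfrac{1}{2}(0.8 - 0.4) = 0.5, \qquad
\phi_c = \tfrac{1}{2}(0.4 - 0) + \tfrac{1}{2}(0.8 - 0.6) = 0.3
$$

The two values sum to $v(\{a, c\}) = 0.8$, the total conformance loss.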


In [10]:
print("Contribution of constraint to conformance loss")
print("^A ctrb:                " + str(exp.constraint_ctrb_to_conformance(event_log, exp.constraints, 0)))
print("C$ ctrb:                " + str(exp.constraint_ctrb_to_conformance(event_log, exp.constraints, 1)) + " (adjusted " + str(round(exp.constraint_ctrb_to_conformance(event_log, exp.constraints, 1), 2)) + ")")
print("Total conformance loss: " + str(exp.constraint_ctrb_to_conformance(event_log, exp.constraints, 0) + exp.constraint_ctrb_to_conformance(event_log, exp.constraints, 1)))



Contribution of constraint to conformance loss
^A ctrb:                0.5
C$ ctrb:                0.30000000000000004 (adjusted 0.3)
Total conformance loss: 0.8
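
The Shapley computation itself can be sketched in a few lines. The version below enumerates every subset of the other constraints and uses the conformance loss of a subset as the value function. This is an assumption that reproduces the numbers above, and the library may organize the computation differently. The names `conformance_loss` and `shapley` are illustrative.

```python
import re
from itertools import combinations
from math import factorial

log = [(['A', 'B', 'C'], 10), (['A', 'B'], 10), (['B'], 10), (['B', 'C'], 20)]
constraints = ['^A', 'C$']

def conformance_loss(log, subset):
    """Share of traces violating at least one constraint in `subset`."""
    total = sum(count for _, count in log)
    bad = sum(count for nodes, count in log
              if any(re.search(c, "".join(nodes)) is None for c in subset))
    return bad / total

def shapley(log, constraints, index):
    """Shapley value of constraints[index], with conformance loss as value function."""
    n = len(constraints)
    others = [c for i, c in enumerate(constraints) if i != index]
    value = 0.0
    for size in range(n):
        # Weight of a coalition of this size in the Shapley formula
        weight = factorial(size) * factorial(n - size - 1) / factorial(n)
        for subset in combinations(others, size):
            with_c = conformance_loss(log, list(subset) + [constraints[index]])
            without_c = conformance_loss(log, list(subset))
            value += weight * (with_c - without_c)
    return value

contributions = [shapley(log, constraints, i) for i in range(len(constraints))]
print([round(c, 2) for c in contributions])  # [0.5, 0.3]
print(round(sum(contributions), 2))          # 0.8
```

The number of subsets doubles with every added constraint, which is why this function is the most expensive of the four.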


In [11]:
exp = ExplainerRegex()
event_log = EventLog()
trace1 = Trace(['A', 'B', 'C'])
trace2 = Trace(['B', 'C'])
trace3 = Trace(['A', 'B'])
trace4 = Trace(['B'])
trace5 = Trace(['A', 'C'])


event_log.add_trace(trace1, 5) 
event_log.add_trace(trace2, 10)
event_log.add_trace(trace3, 5)
event_log.add_trace(trace4, 5)
event_log.add_trace(trace5, 10)


# Add the constraints (exp was already created at the top of this cell)
exp.add_constraint("C$")
exp.add_constraint("^A")
exp.add_constraint("B+")
conf_rate = exp.determine_conformance_rate(event_log)
print("Conformance rate: "+ str(round(conf_rate, 2)))
print("Contribution C$: ", round(exp.constraint_ctrb_to_conformance(event_log, exp.constraints, 0), 2)) # Round for easier readability
print("Contribution ^A: ", round(exp.constraint_ctrb_to_conformance(event_log, exp.constraints, 1), 2))
print("Contribution B+: ", round(exp.constraint_ctrb_to_conformance(event_log, exp.constraints, 2), 2))
total_ctrb = exp.constraint_ctrb_to_conformance(event_log, exp.constraints, 0) + exp.constraint_ctrb_to_conformance(event_log, exp.constraints, 1) + exp.constraint_ctrb_to_conformance(event_log, exp.constraints, 2)
conf_rate = round(conf_rate, 2) 
total_ctrb = round(total_ctrb, 2)
print("Conformance loss : " + str(100 - (conf_rate * 100)) + "%, contribution to loss: " + str(total_ctrb * 100) + "%")
print("------------------------------------")
print("Fitness rate loss: "+ str(1 - exp.determine_fitness_rate(event_log)))
print("C$ ctrb to fitness rate loss : " + str(exp.constraint_ctrb_to_fitness(event_log, exp.constraints, 0)))
print("^A ctrb to fitness rate loss : " + str(exp.constraint_ctrb_to_fitness(event_log, exp.constraints, 1)))
print("B+ ctrb to fitness rate loss : " + str(exp.constraint_ctrb_to_fitness(event_log, exp.constraints, 2)))

print("Total fitness rate loss :      " + str(exp.constraint_ctrb_to_fitness(event_log, exp.constraints, 0) + exp.constraint_ctrb_to_fitness(event_log, exp.constraints, 1) + exp.constraint_ctrb_to_fitness(event_log, exp.constraints, 2)))


Conformance rate: 0.14
Contribution C$:  0.21
Contribution ^A:  0.36
Contribution B+:  0.29
Conformance loss : 86.0%, contribution to loss: 86.0%
------------------------------------
Fitness rate loss: 0.33333333333333337
C$ ctrb to fitness rate loss : 0.09523809523809523
^A ctrb to fitness rate loss : 0.14285714285714285
B+ ctrb to fitness rate loss : 0.09523809523809523
Total fitness rate loss :      0.3333333333333333
