# Model Discovery in Declare4Py

This tutorial explains how to discover a DECLARE model from an event log and how to browse the results.

After importing the Declare4Py package, instantiate a `Declare4Py` object and use it to load the log.

In [1]:
import os
from declare4py.declare4py import Declare4Py


log_path = os.path.join("..", "tests", "Sepsis Cases.xes.gz")

d4py = Declare4Py()
d4py.parse_xes_log(log_path)

parsing log, completed traces ::   0%|          | 0/1050 [00:00<?, ?it/s]

Declare4Py discovers a DECLARE model in two consecutive steps:

1. Compute the frequent itemsets of length 2 for a given minimum support with the `compute_frequent_itemsets()` function (see the tutorial on the analysis of the logs).
2. Discover the model with the `discovery()` function. Its boolean parameter `consider_vacuity` controls how vacuously satisfied traces are treated: with `True` they count as satisfied, with `False` they count as violated. The integer parameter `max_declare_cardinality` sets the upper bound on the cardinality of the Exactly, Existence and Absence templates. The function returns a Python dictionary keyed by the discovered constraints. Each value is itself a Python dictionary whose keys are the ids of the traces in the log that satisfy the constraint (trace ids can be retrieved with `get_trace_keys()`). Each value of this inner dictionary is a `CheckerResult` object containing the number of pendings, activations, violations and fulfillments, together with the truth value of the trace for that constraint.
```
discovery_results = {constr_1: {trace_1: CheckerResult object, trace_2: CheckerResult object, ...},
                     constr_2: {trace_1: CheckerResult object, ... },
                     ...
                    }
```
Each `CheckerResult` object exposes these values through the attributes `num_pendings`, `num_activations`, `num_fulfillments`, `num_violations` and `state`.

In [2]:
d4py.compute_frequent_itemsets(min_support=0.9, len_itemset=2)
discovery_results = d4py.discovery(consider_vacuity=True, max_declare_cardinality=2)

Computing discovery ...
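
To make the nested structure described above easier to picture, here is a minimal sketch that walks a dictionary of the same shape. It does not call Declare4Py: the `CheckerResult` dataclass is a stand-in written for this example (only the attribute names are taken from the text above), the constraint names and trace ids are made up, and `state` is assumed to be a plain string.

```python
from dataclasses import dataclass


@dataclass
class CheckerResult:
    """Stand-in for illustration only; not the class shipped with Declare4Py."""
    num_pendings: int
    num_activations: int
    num_fulfillments: int
    num_violations: int
    state: str  # assumption: a plain string; the library may use another type


# Made-up results for a hypothetical log; keys mimic the (index, name) trace ids.
toy_results = {
    "Response[A, B] | | |": {
        (0, "T0"): CheckerResult(0, 1, 1, 0, "Satisfied"),
        (1, "T1"): CheckerResult(0, 2, 2, 0, "Satisfied"),
    },
    "Chain Response[A, B] | | |": {
        (0, "T0"): CheckerResult(0, 1, 1, 0, "Satisfied"),
    },
}


def total_activations(results):
    """Sum the activations recorded for each constraint over its traces."""
    return {
        constraint: sum(res.num_activations for res in per_trace.values())
        for constraint, per_trace in results.items()
    }


activations = total_activations(toy_results)
for constraint, count in activations.items():
    n_traces = len(toy_results[constraint])
    print(f"{constraint}: {n_traces} traces, {count} activations")
```

The same two-level loop (constraint first, then trace id) should apply to the real `discovery_results`, provided the returned objects carry the attributes listed above.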


Let's inspect the results for the constraint `Responded Existence[ER Sepsis Triage, ER Triage] | | |` and the trace `(488, 'VR')`.

In [3]:
decl_constr = 'Responded Existence[ER Sepsis Triage, ER Triage] | | |'
trace_id = (488, 'VR')
# Look up the CheckerResult once instead of indexing the dictionary on every line
result = discovery_results[decl_constr][trace_id]
print(f"Number of pendings: {result.num_pendings}")
print(f"Number of activations: {result.num_activations}")
print(f"Number of fulfilments: {result.num_fulfillments}")
print(f"Number of violation: {result.num_violations}")
print(f"Truth value of: {result.state}")

Number of pendings: 0
Number of activations: 1
Number of fulfilments: 1
Number of violation: 0
Truth value of: Satisfied
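
The trace above has one activation and one fulfilment, so the constraint was actually triggered in it. When discovery is run with `consider_vacuity=True`, the results may also contain traces that satisfy a constraint without ever activating it. The sketch below separates the two cases on made-up data. It assumes that a vacuously satisfied trace reports `num_activations` equal to 0 (or `None`), which should be checked against the library; the `CheckerResult` dataclass and the helper `non_vacuous_trace_ids` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckerResult:
    """Stand-in for illustration only; not the class shipped with Declare4Py."""
    num_pendings: Optional[int]
    num_activations: Optional[int]
    num_fulfillments: Optional[int]
    num_violations: Optional[int]
    state: str  # assumption: a plain string


def non_vacuous_trace_ids(per_trace):
    """Return the ids of traces in which the constraint was activated at least once.

    A count of 0 or None is treated as "never activated" (an assumption).
    """
    return [tid for tid, res in per_trace.items() if res.num_activations]


# Made-up per-trace results for a single hypothetical constraint.
per_trace = {
    (0, "T0"): CheckerResult(0, 1, 1, 0, "Satisfied"),  # activated and fulfilled
    (1, "T1"): CheckerResult(0, 0, 0, 0, "Satisfied"),  # never activated
}

non_vacuous = non_vacuous_trace_ids(per_trace)
print(non_vacuous)
```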


The discovery results can be filtered by a support threshold and, if an output path is given, saved to a DECLARE file. The `filter_discovery` function does both and returns a Python dictionary that maps each discovered DECLARE constraint to its support.
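
As a rough picture of what support-based filtering means, the sketch below applies a threshold to a toy dictionary shaped like the discovery results. It is not the library's implementation. It assumes that the support of a constraint is the fraction of traces in the log that satisfy it and that the threshold is inclusive; `filter_by_support`, the constraint names and the trace ids are all hypothetical.

```python
def filter_by_support(results, num_traces, min_support):
    """Keep the constraints whose support reaches min_support.

    Support is taken to be (satisfying traces) / (traces in the log),
    which is an assumption made for this sketch.
    """
    supports = {
        constraint: len(per_trace) / num_traces
        for constraint, per_trace in results.items()
    }
    return {c: s for c, s in supports.items() if s >= min_support}


# Toy log with 4 traces; inner values are placeholders because only the
# number of satisfying traces matters here.
all_ids = [(i, f"T{i}") for i in range(4)]
toy_results = {
    "X": {tid: None for tid in all_ids},      # satisfied by 4 of 4 traces
    "Y": {tid: None for tid in all_ids[:3]},  # satisfied by 3 of 4 traces
    "Z": {tid: None for tid in all_ids[:1]},  # satisfied by 1 of 4 traces
}

kept = filter_by_support(toy_results, num_traces=len(all_ids), min_support=0.7)
print(kept)
```

The library call that performs the filtering on the real results follows.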

In [4]:
d4py.filter_discovery(min_support=0.7, output_path='sepsis_model_discovered.decl')

{'Existence1[ER Triage] | |': 1.0,
 'Absence2[ER Triage] | |': 0.9971428571428571,
 'Exactly1[ER Triage] | |': 0.9971428571428571,
 'Existence1[ER Registration] | |': 1.0,
 'Absence2[ER Registration] | |': 1.0,
 'Exactly1[ER Registration] | |': 1.0,
 'Init[ER Registration] | |': 0.9476190476190476,
 'Existence1[ER Sepsis Triage] | |': 0.9990476190476191,
 'Absence2[ER Sepsis Triage] | |': 1.0,
 'Exactly1[ER Sepsis Triage] | |': 0.9990476190476191,
 'Existence1[Leucocytes] | |': 0.9638095238095238,
 'Existence1[CRP] | |': 0.959047619047619,
 'Choice[ER Registration, ER Triage] | | |': 1.0,
 'Choice[ER Triage, ER Registration] | | |': 1.0,
 'Responded Existence[ER Registration, ER Triage] | | |': 1.0,
 'Responded Existence[ER Triage, ER Registration] | | |': 1.0,
 'Response[ER Registration, ER Triage] | | |': 0.9942857142857143,
 'Alternate Response[ER Registration, ER Triage] | | |': 0.9942857142857143,
 'Chain Response[ER Registration, ER Triage] | | |': 0.9247619047619048,
 'Precedenc