# DECLARE Constraint Discovery and Analysis

This notebook discovers DECLARE constraints from an event log and analyzes how each trace satisfies them.

In [62]:
from Declare4Py.D4PyEventLog import D4PyEventLog
from Declare4Py.ProcessMiningTasks.ConformanceChecking.MPDeclareAnalyzer import (
    MPDeclareAnalyzer,
)
from Declare4Py.ProcessMiningTasks.Discovery.DeclareMiner import DeclareMiner
from Declare4Py.ProcessModels.DeclareModel import DeclareModel
import pm4py as pm
from pm4py.objects.conversion.log import converter as log_converter
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

Load the clustered event log from CSV, filter by cluster number, and convert to both pm4py and D4Py formats for processing.

In [63]:
log_path = "/home/cwrk/Work/xai-2026-declarative-feature-space/DomesticDeclarations_clustered.csv"
log = pd.read_csv(log_path)

cols = ["case:concept:name", "concept:name", "time:timestamp", "cluster"]
print("Number of clusters: ", log["cluster"].nunique())
cluster_number = -1
log = log.loc[log['cluster'] == cluster_number, cols]

print()

log["time:timestamp"] = pd.to_datetime(log["time:timestamp"])

log = pm.convert.convert_to_event_log(log)
pm.write_xes(log, f"DomesticDeclarations_cluster_{cluster_number}.xes")
log = pm.read_xes(f"DomesticDeclarations_cluster_{cluster_number}.xes")

# convert to a legacy pm4py EventLog object
parameters = {
    log_converter.Variants.TO_EVENT_LOG.value.Parameters.CASE_ID_KEY: "case:concept:name"
}
log = log_converter.apply(
    log,
    variant=log_converter.Variants.TO_EVENT_LOG,
    parameters=parameters,
)

case_ids = [trace.attributes["concept:name"] for trace in log]
# wrap the pm4py log in a Declare4Py event log
log = D4PyEventLog(log=log, case_name="case:concept:name")

Number of clusters:  6



exporting log, completed traces ::   0%|          | 0/780 [00:00<?, ?it/s]

parsing log, completed traces ::   0%|          | 0/780 [00:00<?, ?it/s]
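
The filtering step can be tried in isolation on a few rows. The following is a minimal sketch that uses only pandas; the toy rows and the names `events`, `subset`, and `n_cases` are made up for illustration and are not part of the notebook. Taking an explicit `.copy()` of the filtered frame is one way to keep the later timestamp assignment from touching a view of the original data.

```python
import pandas as pd

# Toy event log; the column names follow the XES-style names used above,
# the values are invented for illustration.
events = pd.DataFrame(
    {
        "case:concept:name": ["c1", "c1", "c2", "c3"],
        "concept:name": ["Submit", "Approve", "Submit", "Submit"],
        "time:timestamp": [
            "2020-01-01 09:00:00",
            "2020-01-02 10:30:00",
            "2020-01-03 08:15:00",
            "2020-01-04 12:00:00",
        ],
        "cluster": [-1, -1, 0, -1],
    }
)

cols = ["case:concept:name", "concept:name", "time:timestamp", "cluster"]

# Keep only the events of one cluster; .copy() makes the result independent
# of `events`, so the column assignment below is unambiguous.
subset = events.loc[events["cluster"] == -1, cols].copy()
subset["time:timestamp"] = pd.to_datetime(subset["time:timestamp"])

# Number of distinct cases left after filtering.
n_cases = subset["case:concept:name"].nunique()
print(len(subset), "events in", n_cases, "cases")
```

Comparing the case count from such a check with the trace count reported by the export step is a quick sanity test that the filter selected the intended cluster.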

## Discover DECLARE Model and Check Conformance

Run DECLARE constraint discovery (minimum support 5%, itemset support 5%, maximum cardinality 2, vacuity not considered), then check conformance to see how each trace satisfies the discovered constraints.

In [64]:
discovery = DeclareMiner(
    log=log,
    consider_vacuity=False,
    min_support=0.05,
    itemsets_support=0.05,
    max_declare_cardinality=2,
)
model = discovery.run()
checker = MPDeclareAnalyzer(
    log=log, declare_model=model, consider_vacuity=False
)
results = checker.run()


Computing discovery ...
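
To make the roles of `min_support` and `consider_vacuity` concrete, here is a minimal pure-Python sketch of the support of a single Response constraint on toy traces. It illustrates the semantics only and is not Declare4Py's implementation. It assumes that with `consider_vacuity=False` a trace in which the constraint is never activated counts as not satisfied; the helper names `response_holds` and `response_support` are hypothetical.

```python
def response_holds(trace, a, b, consider_vacuity=False):
    """Response[a, b]: every occurrence of `a` is eventually followed by `b`."""
    activations = [i for i, activity in enumerate(trace) if activity == a]
    if not activations:
        # Never activated: satisfied only if vacuous satisfaction is accepted.
        return consider_vacuity
    return all(b in trace[i + 1:] for i in activations)


def response_support(traces, a, b, consider_vacuity=False):
    """Fraction of traces in which Response[a, b] holds."""
    satisfied = sum(response_holds(t, a, b, consider_vacuity) for t in traces)
    return satisfied / len(traces)


# Toy traces, invented for illustration.
traces = [
    ["A", "B"],            # activated and fulfilled
    ["A", "C"],            # activated and violated
    ["C"],                 # never activated
    ["A", "B", "A", "B"],  # activated twice, both fulfilled
]

strict = response_support(traces, "A", "B", consider_vacuity=False)
lenient = response_support(traces, "A", "B", consider_vacuity=True)
print(strict, lenient)
```

Under these assumptions, a constraint would be kept by a 5% support threshold only if at least 5% of the traces both activate and satisfy it.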


## Extract Conformance Metrics

Extract the per-trace, per-constraint metrics from the conformance checking results:
- **Violations**: number of activations that violate the constraint in a trace
- **Activations**: number of times the constraint was activated in a trace
- **Fulfillments**: number of activations that satisfy the constraint
- **Pendings**: number of activations still waiting to be fulfilled
- **Satisfaction**: binary state indicating whether the trace satisfies the constraint
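
How these counts relate to each other can be sketched for one Response constraint on one trace. This is a hedged illustration, not the analyzer's code. It assumes `a != b` and that, once a trace is complete, activations still pending are reported as violations. The function `response_counts` and its `completed` flag are hypothetical.

```python
def response_counts(trace, a, b, completed=True):
    """Count activations, fulfillments, violations and pendings of Response[a, b]."""
    activations = 0
    pending = 0
    for activity in trace:
        if activity == a:
            activations += 1
            pending += 1   # this activation now waits for a later `b`
        elif activity == b:
            pending = 0    # every waiting activation is fulfilled
    return {
        "num_activations": activations,
        "num_fulfillments": activations - pending,
        # On a finished trace nothing can still arrive, so pending means violated.
        "num_violations": pending if completed else 0,
        "num_pendings": 0 if completed else pending,
    }


finished = response_counts(["A", "B", "A", "C"], "A", "B", completed=True)
running = response_counts(["A", "B", "A", "C"], "A", "B", completed=False)
print(finished)
print(running)
```

In this sketch the four counts always satisfy `num_activations == num_fulfillments + num_violations + num_pendings`.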

In [65]:
# ["num_activations", "num_violations", "num_fulfillments", "num_pendings", "state"]
violations = results.get_metric("num_violations")
activations = results.get_metric("num_activations")
fullfillments = results.get_metric("num_fulfillments")
pendings = results.get_metric("num_pendings")
statisfaction = results.get_metric("state")

constraints = list(statisfaction.columns)
n_traces = statisfaction.shape[0]

# column-wise totals: sum each metric over all traces, giving one value per constraint
violations_sum = violations.sum()
activations_sum = activations.sum()
fullfillments_sum = fullfillments.sum()
pendings_sum = pendings.sum()
statisfaction_sum = statisfaction.sum()

## Calculate Constraint Support

Compute the support of each constraint, i.e. the fraction of traces that satisfy it, and save the results to CSV.

In [66]:
support = {}
for c in constraints:
    support[c] = statisfaction_sum[c] / n_traces

support = pd.Series(support)
support = support.sort_values(ascending=False)
support.to_csv(f"bpic2020_support_cluster_{cluster_number}.csv", header=True)
support

Absence2[Payment Handled] | |                                                                              1.000000
Absence2[Declaration SAVED by EMPLOYEE] | |                                                                1.000000
Absence2[Request Payment] | |                                                                              1.000000
Absence2[Declaration REJECTED by BUDGET OWNER] | |                                                         0.998718
Absence2[Declaration REJECTED by PRE_APPROVER] | |                                                         0.993590
                                                                                                             ...   
Alternate Response[Declaration REJECTED by BUDGET OWNER, Declaration FINAL_APPROVED by SUPERVISOR] | |     0.052564
Chain Precedence[Declaration REJECTED by BUDGET OWNER, Declaration REJECTED by EMPLOYEE] | |               0.052564
Responded Existence[Declaration APPROVED by ADMINISTRATION, Declaration 
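
When the state matrix holds 0/1 values, as the sum-and-divide computation above assumes, the loop is equivalent to a column-wise mean. A small sketch on a toy matrix follows; the constraint names and the variable names `state`, `support_loop`, `support_vec`, and `ranked` are made up for illustration.

```python
import pandas as pd

# Toy satisfaction matrix: one row per trace, one column per constraint.
state = pd.DataFrame(
    {
        "Existence1[A]": [1, 1, 1, 1],
        "Response[A, B]": [1, 0, 1, 0],
        "Absence2[C]": [1, 1, 1, 0],
    }
)

# Loop form, mirroring the cell above.
support_loop = pd.Series(
    {c: state[c].sum() / state.shape[0] for c in state.columns}
)

# Vectorized form: the mean of a 0/1 column is the share of satisfied traces.
support_vec = state.mean()

ranked = support_vec.sort_values(ascending=False)
print(ranked)
```

The vectorized form avoids the intermediate dictionary and yields a Series indexed by constraint name.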

## Create Feature Spaces with Support Filtering

Build two feature representations:
- **Boolean feature space**: binary satisfaction (True/False) per constraint
- **Quantitative feature space**: number of fulfillments per constraint

Filter to keep only constraints with support ≥ 50% and analyze their value distributions.

In [67]:
bool_feature_space = statisfaction.astype(bool)
qual_feature_space = fullfillments

support_threshhold = 0.5
cols_to_keep = support[support >= support_threshhold].index
bool_feature_space = bool_feature_space[cols_to_keep]
qual_feature_space = qual_feature_space[cols_to_keep]

bool_unique_series = bool_feature_space.apply(lambda col: col.value_counts().to_dict())
for c in bool_unique_series.index:
    print(c, bool_unique_series[c])

qual_unique_series = qual_feature_space.apply(lambda col: col.value_counts().to_dict())
for c in qual_unique_series.index:
    print(c, qual_unique_series[c])

Absence2[Payment Handled] | | {True: 780}
Absence2[Declaration SAVED by EMPLOYEE] | | {True: 780}
Absence2[Request Payment] | | {True: 780}
Absence2[Declaration REJECTED by BUDGET OWNER] | | {True: 779, False: 1}
Absence2[Declaration REJECTED by PRE_APPROVER] | | {True: 775, False: 5}
Absence2[Declaration FINAL_APPROVED by SUPERVISOR] | | {True: 771, False: 9}
Absence2[Declaration APPROVED by PRE_APPROVER] | | {True: 770, False: 10}
Absence2[Declaration REJECTED by SUPERVISOR] | | {True: 768, False: 12}
Absence2[Declaration APPROVED by BUDGET OWNER] | | {True: 767, False: 13}
Absence2[Declaration REJECTED by ADMINISTRATION] | | {True: 761, False: 19}
Absence1[Declaration REJECTED by BUDGET OWNER] | | {True: 724, False: 56}
Absence2[Declaration REJECTED by EMPLOYEE] | | {True: 712, False: 68}
Absence1[Declaration REJECTED by PRE_APPROVER] | | {True: 699, False: 81}
Absence1[Declaration APPROVED by PRE_APPROVER] | | {True: 697, False: 83}
Absence1[Declaration APPROVED by BUDGET OWNER] | 
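
The threshold filter can be wrapped in a small helper so that the same selection is applied to both feature spaces. This is a sketch under stated assumptions: `filter_by_support` is a hypothetical helper, and the toy data below is invented. It keeps the column order of the support ranking and skips constraints that are absent from the feature matrix instead of raising a `KeyError`.

```python
import pandas as pd


def filter_by_support(features, support, threshold=0.5):
    """Keep the columns of `features` whose support is at least `threshold`."""
    keep = [
        name
        for name in support.index
        if support[name] >= threshold and name in features.columns
    ]
    return features[keep]


# Toy data, invented for illustration.
features = pd.DataFrame(
    {
        "Existence1[A]": [True, True, True, True],
        "Response[A, B]": [True, False, False, False],
        "Absence2[C]": [True, True, True, False],
    }
)
toy_support = features.mean().sort_values(ascending=False)

filtered = filter_by_support(features, toy_support, threshold=0.5)
print(list(filtered.columns))
```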

## Inspect Boolean Feature Space


In [68]:
bool_feature_space

Unnamed: 0,Absence2[Payment Handled] | |,Absence2[Declaration SAVED by EMPLOYEE] | |,Absence2[Request Payment] | |,Absence2[Declaration REJECTED by BUDGET OWNER] | |,Absence2[Declaration REJECTED by PRE_APPROVER] | |,Absence2[Declaration FINAL_APPROVED by SUPERVISOR] | |,Absence2[Declaration APPROVED by PRE_APPROVER] | |,Absence2[Declaration REJECTED by SUPERVISOR] | |,Absence2[Declaration APPROVED by BUDGET OWNER] | |,Absence2[Declaration REJECTED by ADMINISTRATION] | |,...,"Choice[Declaration APPROVED by ADMINISTRATION, Declaration FINAL_APPROVED by SUPERVISOR] | |","Not Responded Existence[Declaration SUBMITTED by EMPLOYEE, Declaration REJECTED by ADMINISTRATION] | |","Exclusive Choice[Declaration SUBMITTED by EMPLOYEE, Declaration REJECTED by ADMINISTRATION] | |","Exclusive Choice[Declaration REJECTED by ADMINISTRATION, Declaration SUBMITTED by EMPLOYEE] | |","Not Chain Response[Declaration SUBMITTED by EMPLOYEE, Declaration REJECTED by ADMINISTRATION] | |","Not Response[Declaration SUBMITTED by EMPLOYEE, Declaration REJECTED by ADMINISTRATION] | |","Choice[Declaration REJECTED by SUPERVISOR, Payment Handled] | |","Choice[Payment Handled, Declaration REJECTED by SUPERVISOR] | |","Choice[Request Payment, Declaration REJECTED by SUPERVISOR] | |","Choice[Declaration REJECTED by SUPERVISOR, Request Payment] | |"
0,True,True,True,True,True,False,True,True,True,True,...,True,True,True,True,True,True,True,True,True,True
1,True,True,True,True,True,True,True,True,True,True,...,False,False,False,False,False,False,False,False,False,False
2,True,True,True,True,True,True,True,True,True,True,...,True,True,True,True,True,True,True,True,True,True
3,True,True,True,True,True,True,True,True,True,True,...,False,False,False,False,False,False,False,False,False,False
4,True,True,True,True,True,True,True,True,True,True,...,True,True,True,True,True,True,True,True,True,True
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
775,True,True,True,True,True,True,True,True,True,True,...,True,True,True,True,True,True,True,True,True,True
776,True,True,True,True,True,True,True,True,True,True,...,False,False,False,False,False,False,False,False,False,False
777,True,True,True,True,True,True,True,True,True,True,...,False,False,False,False,False,False,False,False,False,False
778,True,True,True,True,True,True,True,True,True,True,...,True,True,True,True,True,True,True,True,True,True


## Inspect Quantitative Feature Space


In [69]:
qual_feature_space

Unnamed: 0,Absence2[Payment Handled] | |,Absence2[Declaration SAVED by EMPLOYEE] | |,Absence2[Request Payment] | |,Absence2[Declaration REJECTED by BUDGET OWNER] | |,Absence2[Declaration REJECTED by PRE_APPROVER] | |,Absence2[Declaration FINAL_APPROVED by SUPERVISOR] | |,Absence2[Declaration APPROVED by PRE_APPROVER] | |,Absence2[Declaration REJECTED by SUPERVISOR] | |,Absence2[Declaration APPROVED by BUDGET OWNER] | |,Absence2[Declaration REJECTED by ADMINISTRATION] | |,...,"Choice[Declaration APPROVED by ADMINISTRATION, Declaration FINAL_APPROVED by SUPERVISOR] | |","Not Responded Existence[Declaration SUBMITTED by EMPLOYEE, Declaration REJECTED by ADMINISTRATION] | |","Exclusive Choice[Declaration SUBMITTED by EMPLOYEE, Declaration REJECTED by ADMINISTRATION] | |","Exclusive Choice[Declaration REJECTED by ADMINISTRATION, Declaration SUBMITTED by EMPLOYEE] | |","Not Chain Response[Declaration SUBMITTED by EMPLOYEE, Declaration REJECTED by ADMINISTRATION] | |","Not Response[Declaration SUBMITTED by EMPLOYEE, Declaration REJECTED by ADMINISTRATION] | |","Choice[Declaration REJECTED by SUPERVISOR, Payment Handled] | |","Choice[Payment Handled, Declaration REJECTED by SUPERVISOR] | |","Choice[Request Payment, Declaration REJECTED by SUPERVISOR] | |","Choice[Declaration REJECTED by SUPERVISOR, Request Payment] | |"
0,,,,,,,,,,,...,,3,,,3,3,,,,
1,,,,,,,,,,,...,,0,,,0,0,,,,
2,,,,,,,,,,,...,,2,,,2,2,,,,
3,,,,,,,,,,,...,,0,,,0,0,,,,
4,,,,,,,,,,,...,,2,,,2,2,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
775,,,,,,,,,,,...,,1,,,1,1,,,,
776,,,,,,,,,,,...,,0,,,0,0,,,,
777,,,,,,,,,,,...,,0,,,0,0,,,,
778,,,,,,,,,,,...,,1,,,1,1,,,,
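
Many cells in the table above are empty, which suggests missing values for those constraints. `value_counts()` drops missing values by default, so a column that contains only missing values yields an empty dictionary. The sketch below, on invented data, shows how `dropna=False` and `isna()` make such columns visible. Why those values are missing is not examined here, and the column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy fulfillment counts: one column is entirely missing, one is numeric.
counts = pd.DataFrame(
    {
        "Absence2[A]": [np.nan, np.nan, np.nan],
        "Not Response[A, B]": [3, 0, 0],
    }
)

# Default behaviour: missing values are silently dropped.
default_counts = {c: counts[c].value_counts().to_dict() for c in counts.columns}

# dropna=False keeps NaN as its own category.
with_missing = {
    c: counts[c].value_counts(dropna=False).to_dict() for c in counts.columns
}

# Share of missing values per column.
missing_share = counts.isna().mean()

print(default_counts)
print(with_missing)
print(missing_share)
```

Columns with a missing share of 1.0 carry no information in the quantitative feature space and may need separate handling before any downstream use.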


## Analyze Boolean Feature Distribution


In [70]:

bool_unique_series = bool_feature_space.apply(lambda col: col.value_counts().to_dict())
for c in bool_unique_series.index:
    print(c, bool_unique_series[c])

Absence2[Payment Handled] | | {True: 780}
Absence2[Declaration SAVED by EMPLOYEE] | | {True: 780}
Absence2[Request Payment] | | {True: 780}
Absence2[Declaration REJECTED by BUDGET OWNER] | | {True: 779, False: 1}
Absence2[Declaration REJECTED by PRE_APPROVER] | | {True: 775, False: 5}
Absence2[Declaration FINAL_APPROVED by SUPERVISOR] | | {True: 771, False: 9}
Absence2[Declaration APPROVED by PRE_APPROVER] | | {True: 770, False: 10}
Absence2[Declaration REJECTED by SUPERVISOR] | | {True: 768, False: 12}
Absence2[Declaration APPROVED by BUDGET OWNER] | | {True: 767, False: 13}
Absence2[Declaration REJECTED by ADMINISTRATION] | | {True: 761, False: 19}
Absence1[Declaration REJECTED by BUDGET OWNER] | | {True: 724, False: 56}
Absence2[Declaration REJECTED by EMPLOYEE] | | {True: 712, False: 68}
Absence1[Declaration REJECTED by PRE_APPROVER] | | {True: 699, False: 81}
Absence1[Declaration APPROVED by PRE_APPROVER] | | {True: 697, False: 83}
Absence1[Declaration APPROVED by BUDGET OWNER] | 

## Analyze Quantitative Feature Distribution


In [71]:
qual_unique_series = qual_feature_space.apply(lambda col: col.value_counts().to_dict())
for c in qual_unique_series.index:
    print(c, qual_unique_series[c])

Absence2[Payment Handled] | | {}
Absence2[Declaration SAVED by EMPLOYEE] | | {}
Absence2[Request Payment] | | {}
Absence2[Declaration REJECTED by BUDGET OWNER] | | {}
Absence2[Declaration REJECTED by PRE_APPROVER] | | {}
Absence2[Declaration FINAL_APPROVED by SUPERVISOR] | | {}
Absence2[Declaration APPROVED by PRE_APPROVER] | | {}
Absence2[Declaration REJECTED by SUPERVISOR] | | {}
Absence2[Declaration APPROVED by BUDGET OWNER] | | {}
Absence2[Declaration REJECTED by ADMINISTRATION] | | {}
Absence1[Declaration REJECTED by BUDGET OWNER] | | {}
Absence2[Declaration REJECTED by EMPLOYEE] | | {}
Absence1[Declaration REJECTED by PRE_APPROVER] | | {}
Absence1[Declaration APPROVED by PRE_APPROVER] | | {}
Absence1[Declaration APPROVED by BUDGET OWNER] | | {}
Choice[Declaration SUBMITTED by EMPLOYEE, Payment Handled] | | {}
Choice[Payment Handled, Declaration SUBMITTED by EMPLOYEE] | | {}
Choice[Declaration SUBMITTED by EMPLOYEE, Request Payment] | | {}
Choice[Request Payment, Declaration SUBMI