## Qualitative Analysis for Event Log with Nonlinearities

Implemented decision points with guards:
- Request Manager or Standard Approval:
    - Request Manager Approval if total_price >= 600 or supplier is "Dunder Mifflin"
    - Request Standard Approval if total_price <= 1200
- Manager Rejection or Approval:
    - Manager Rejection if total_price >= 500 and item_amount mod 2 == 1
    - No Guard for Manager Approval 
- Standard Rejection or Approval:
    - Standard Rejection if total_price >= 500 and item_amount mod 2 == 1
    - No Guard for Standard Approval 
- Are the goods fine or damaged?
    - Goods Damaged if item_amount^3 > total_price
    - Goods Fine if item_amount^3 <= total_price
- What happens if the goods are fine?
    - Pay Invoice (no Guard, but dependent on previous activities: if Goods Fine and Receive Invoice)
    - Cancel Order (no Guard, but dependent on previous activities: if Goods Fine and either Revocation Costumer or Revocation Vendor)
- What happens after Receive Invoice?
    - Revocation Costumer if supplierMap(supplier) >= item_amount where supplierMap maps "Saturn" to 10, "Dunder Mifflin" to 20, "Staples" to 30 and everything else to 40
    - Revocation Vendor if total_price/item_amount < 100
    - Pay Invoice (no Guard, but dependent on previous activities: if Receive Invoice and Goods Fine)
    - Cancel Payment (no Guard, but dependent on previous activities: if Receive Invoice and Goods Damaged)
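
For reference, the data-dependent guards listed above can be written as plain Python predicates. This is only a sketch for reading the conditions: the function names and the `SUPPLIER_MAP` constant are hypothetical and not part of `exdpn` or the event log, and `item_amount` is assumed to be a positive integer.

```python
# Hypothetical helper names; only the conditions come from the guard list above.
SUPPLIER_MAP = {"Saturn": 10, "Dunder Mifflin": 20, "Staples": 30}


def supplier_map(supplier: str) -> int:
    """Map a supplier name to its threshold; unknown suppliers map to 40."""
    return SUPPLIER_MAP.get(supplier, 40)


def request_manager_approval(total_price: float, supplier: str) -> bool:
    return total_price >= 600 or supplier == "Dunder Mifflin"


def request_standard_approval(total_price: float) -> bool:
    return total_price <= 1200


def rejection(total_price: float, item_amount: int) -> bool:
    """Same condition for Manager Rejection and Standard Rejection."""
    return total_price >= 500 and item_amount % 2 == 1


def goods_damaged(total_price: float, item_amount: int) -> bool:
    """Goods Fine is the negation of this condition."""
    return item_amount ** 3 > total_price


def revocation_customer(supplier: str, item_amount: int) -> bool:
    return supplier_map(supplier) >= item_amount


def revocation_vendor(total_price: float, item_amount: int) -> bool:
    # Assumes item_amount > 0; a zero amount would raise ZeroDivisionError.
    return total_price / item_amount < 100
```

Written this way, the nonlinear parts are easy to spot: the parity test `item_amount % 2`, the cubic term `item_amount ** 3`, and the ratio `total_price / item_amount`.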

In [None]:
import os
from exdpn.util import import_log
from exdpn.data_petri_net import data_petri_net
from exdpn.guards import ML_Technique

#### First, check which machine learning guard performs best at each decision point.

In [None]:
event_log_nonlinearities = import_log(os.path.join(os.getcwd(), "..", "datasets", "p2p_nonlinearities.xes"))
dpn_nonlinearities = data_petri_net.Data_Petri_Net(event_log = event_log_nonlinearities,
                                                   event_level_attributes = ["item_category", "item_id", "item_amount", "supplier", "total_price"],
                                                   verbose = False)
# delay as event attribute
decision_points = list(dpn_nonlinearities.decision_points)
decision_points

In [None]:
print("Decision point: ", decision_points[0])
dpn_nonlinearities.guard_manager_per_place[decision_points[0]].get_comparison_plot()
best_guard = dpn_nonlinearities.get_guard_at_place(decision_points[0])
print("Best guard:", best_guard)

In [None]:
print("Decision point: ", decision_points[1])
dpn_nonlinearities.guard_manager_per_place[decision_points[1]].get_comparison_plot()
best_guard = dpn_nonlinearities.get_guard_at_place(decision_points[1])
print("Best guard:", best_guard)

In [None]:
print("Decision point: ", decision_points[2])
dpn_nonlinearities.guard_manager_per_place[decision_points[2]].get_comparison_plot()
best_guard = dpn_nonlinearities.get_guard_at_place(decision_points[2])
print("Best guard:", best_guard)

In [None]:
print("Decision point: ", decision_points[3])
dpn_nonlinearities.guard_manager_per_place[decision_points[3]].get_comparison_plot()
best_guard = dpn_nonlinearities.get_guard_at_place(decision_points[3])
print("Best guard:", best_guard)

In [None]:
print("Decision point: ", decision_points[4])
dpn_nonlinearities.guard_manager_per_place[decision_points[4]].get_comparison_plot()
best_guard = dpn_nonlinearities.get_guard_at_place(decision_points[4])
print("Best guard:", best_guard)

In [None]:
print("Decision point: ", decision_points[5])
dpn_nonlinearities.guard_manager_per_place[decision_points[5]].get_comparison_plot()
best_guard = dpn_nonlinearities.get_guard_at_place(decision_points[5])
print("Best guard:", best_guard)

### Decision Tree

In [None]:
print("Decision point: ", decision_points[0])
dt_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[0]].guards_list[ML_Technique.DT]
if dt_guard.is_explainable():
    dt_guard.get_explainable_representation()

In [None]:
print("Decision point: ", decision_points[1])
dt_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[1]].guards_list[ML_Technique.DT]
if dt_guard.is_explainable():
    dt_guard.get_explainable_representation()

In [None]:
print("Decision point: ", decision_points[2])
dt_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[2]].guards_list[ML_Technique.DT]
if dt_guard.is_explainable():
    dt_guard.get_explainable_representation()

In [None]:
print("Decision point: ", decision_points[3])
dt_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[3]].guards_list[ML_Technique.DT]
if dt_guard.is_explainable():
    dt_guard.get_explainable_representation()

In [None]:
print("Decision point: ", decision_points[4])
dt_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[4]].guards_list[ML_Technique.DT]
if dt_guard.is_explainable():
    dt_guard.get_explainable_representation()

In [None]:
print("Decision point: ", decision_points[5])
dt_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[5]].guards_list[ML_Technique.DT]
if dt_guard.is_explainable():
    dt_guard.get_explainable_representation()

### Neural Network

In [None]:
print("Decision point: ", decision_points[0])
nn_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[0]].guards_list[ML_Technique.NN]
if nn_guard.is_explainable():
    # use sample of test data to speed up computation of explainable representation
    sampled_test_data = dpn_nonlinearities.guard_manager_per_place[decision_points[0]].X_test.sample(n = min(100, len(dpn_nonlinearities.guard_manager_per_place[decision_points[0]].X_test)))
    nn_guard.get_explainable_representation(sampled_test_data)

In [None]:
print("Decision point: ", decision_points[1])
nn_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[1]].guards_list[ML_Technique.NN]
if nn_guard.is_explainable():
    # use sample of test data to speed up computation of explainable representation
    sampled_test_data = dpn_nonlinearities.guard_manager_per_place[decision_points[1]].X_test.sample(n = min(100, len(dpn_nonlinearities.guard_manager_per_place[decision_points[1]].X_test)))
    nn_guard.get_explainable_representation(sampled_test_data)

In [None]:
print("Decision point: ", decision_points[2])
nn_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[2]].guards_list[ML_Technique.NN]
if nn_guard.is_explainable():
    # use sample of test data to speed up computation of explainable representation
    sampled_test_data = dpn_nonlinearities.guard_manager_per_place[decision_points[2]].X_test.sample(n = min(100, len(dpn_nonlinearities.guard_manager_per_place[decision_points[2]].X_test)))
    nn_guard.get_explainable_representation(sampled_test_data)

In [None]:
print("Decision point: ", decision_points[3])
nn_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[3]].guards_list[ML_Technique.NN]
if nn_guard.is_explainable():
    # use sample of test data to speed up computation of explainable representation
    sampled_test_data = dpn_nonlinearities.guard_manager_per_place[decision_points[3]].X_test.sample(n = min(100, len(dpn_nonlinearities.guard_manager_per_place[decision_points[3]].X_test)))
    nn_guard.get_explainable_representation(sampled_test_data)

In [None]:
print("Decision point: ", decision_points[4])
nn_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[4]].guards_list[ML_Technique.NN]
if nn_guard.is_explainable():
    # use sample of test data to speed up computation of explainable representation
    sampled_test_data = dpn_nonlinearities.guard_manager_per_place[decision_points[4]].X_test.sample(n = min(100, len(dpn_nonlinearities.guard_manager_per_place[decision_points[4]].X_test)))
    nn_guard.get_explainable_representation(sampled_test_data)

In [None]:
print("Decision point: ", decision_points[5])
nn_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[5]].guards_list[ML_Technique.NN]
if nn_guard.is_explainable():
    # use sample of test data to speed up computation of explainable representation
    sampled_test_data = dpn_nonlinearities.guard_manager_per_place[decision_points[5]].X_test.sample(n = min(100, len(dpn_nonlinearities.guard_manager_per_place[decision_points[5]].X_test)))
    nn_guard.get_explainable_representation(sampled_test_data)
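
The Neural Network cells above differ only in the decision point index. A possible way to avoid repeating the index is a small helper that looks up the guard manager once. `explain_nn_guard` and its parameters are hypothetical names for this sketch; it uses only the attributes and methods already called in the cells above and assumes `X_test` is a pandas DataFrame.

```python
def explain_nn_guard(dpn, decision_point, nn_key, max_rows=100):
    """Explain the Neural Network guard at one decision point on a test-data sample.

    dpn:            the data Petri net (e.g. dpn_nonlinearities)
    decision_point: one entry of decision_points
    nn_key:         key of the Neural Network guard, e.g. ML_Technique.NN
    max_rows:       upper bound on the sample size, to speed up the explanation
    """
    # Look up the guard manager once so guard and test data share the same place.
    manager = dpn.guard_manager_per_place[decision_point]
    nn_guard = manager.guards_list[nn_key]
    if not nn_guard.is_explainable():
        return None
    X_test = manager.X_test
    sampled_test_data = X_test.sample(n=min(max_rows, len(X_test)))
    return nn_guard.get_explainable_representation(sampled_test_data)


# Usage sketch (requires the objects created earlier in this notebook):
# for decision_point in decision_points:
#     print("Decision point: ", decision_point)
#     explain_nn_guard(dpn_nonlinearities, decision_point, ML_Technique.NN)
```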

### Support Vector Machine

In [None]:
print("Decision point: ", decision_points[0])
svm_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[0]].guards_list[ML_Technique.SVM]
if svm_guard.is_explainable():
    svm_guard.get_explainable_representation(dpn_nonlinearities.guard_manager_per_place[decision_points[0]].X_test)

In [None]:
print("Decision point: ", decision_points[1])
svm_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[1]].guards_list[ML_Technique.SVM]
if svm_guard.is_explainable():
    svm_guard.get_explainable_representation(dpn_nonlinearities.guard_manager_per_place[decision_points[1]].X_test)

In [None]:
print("Decision point: ", decision_points[2])
svm_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[2]].guards_list[ML_Technique.SVM]
if svm_guard.is_explainable():
    svm_guard.get_explainable_representation(dpn_nonlinearities.guard_manager_per_place[decision_points[2]].X_test)

In [None]:
print("Decision point: ", decision_points[3])
svm_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[3]].guards_list[ML_Technique.SVM]
if svm_guard.is_explainable():
    svm_guard.get_explainable_representation(dpn_nonlinearities.guard_manager_per_place[decision_points[3]].X_test)

In [None]:
print("Decision point: ", decision_points[4])
svm_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[4]].guards_list[ML_Technique.SVM]
if svm_guard.is_explainable():
    svm_guard.get_explainable_representation(dpn_nonlinearities.guard_manager_per_place[decision_points[4]].X_test)

In [None]:
print("Decision point: ", decision_points[5])
svm_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[5]].guards_list[ML_Technique.SVM]
if svm_guard.is_explainable():
    svm_guard.get_explainable_representation(dpn_nonlinearities.guard_manager_per_place[decision_points[5]].X_test)

### Logistic Regression

In [None]:
print("Decision point: ", decision_points[0])
lr_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[0]].guards_list[ML_Technique.LR]
if lr_guard.is_explainable():
    lr_guard.get_explainable_representation(dpn_nonlinearities.guard_manager_per_place[decision_points[0]].X_test)

In [None]:
print("Decision point: ", decision_points[1])
lr_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[1]].guards_list[ML_Technique.LR]
if lr_guard.is_explainable():
    lr_guard.get_explainable_representation(dpn_nonlinearities.guard_manager_per_place[decision_points[1]].X_test)

In [None]:
print("Decision point: ", decision_points[2])
lr_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[2]].guards_list[ML_Technique.LR]
if lr_guard.is_explainable():
    lr_guard.get_explainable_representation(dpn_nonlinearities.guard_manager_per_place[decision_points[2]].X_test)

In [None]:
print("Decision point: ", decision_points[3])
lr_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[3]].guards_list[ML_Technique.LR]
if lr_guard.is_explainable():
    lr_guard.get_explainable_representation(dpn_nonlinearities.guard_manager_per_place[decision_points[3]].X_test)

In [None]:
print("Decision point: ", decision_points[4])
lr_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[4]].guards_list[ML_Technique.LR]
if lr_guard.is_explainable():
    lr_guard.get_explainable_representation(dpn_nonlinearities.guard_manager_per_place[decision_points[4]].X_test)

In [None]:
print("Decision point: ", decision_points[5])
lr_guard = dpn_nonlinearities.guard_manager_per_place[decision_points[5]].guards_list[ML_Technique.LR]
if lr_guard.is_explainable():
    lr_guard.get_explainable_representation(dpn_nonlinearities.guard_manager_per_place[decision_points[5]].X_test)

### Summary Event Log with Nonlinearities (needs to be updated once all implementations are done)

- Request Manager or Standard Approval:  
Only the Decision Tree guard models the true underlying guard. It splits the data samples at total_price <= 1208. For the other three machine learning techniques, other feature attributes such as item_amount have a large impact on the model prediction; these attributes are likely correlated with total_price.

- Manager Rejection or Approval:  
All machine learning techniques have problems modelling the true underlying guard. The Support Vector Machine guard and the Logistic Regression guard seem to pick up noise instead of the true underlying guard, although it is possible that the features with the most impact correlate with total_price. The Decision Tree guard assigns all samples to the same class label, which could be due to one of the stopping criteria the Decision Tree Classifier uses. For the Neural Network guard, all feature attributes have only a very small impact on the model prediction, so the plot is empty.

- Standard Rejection or Approval:  
The Decision Tree guard again assigns all samples to the same class label. While the Support Vector Machine guard seems to pick up noise instead of the true underlying guard, for the Neural Network guard and Logistic Regression guard total_price has a huge impact on the model prediction.

- Are the goods fine or damaged?  
The Decision Tree guard again assigns all samples to the same class label. The Support Vector Machine guard and the Logistic Regression guard are also unable to detect the true underlying guard. Only for the Neural Network guard is item_amount among the top 3 feature attributes with the greatest impact on the model prediction.

- What happens if the goods are fine?  
All four machine learning techniques manage to model the true underlying guard, mainly using the previous activities that determine what happens if the goods are fine.

- What happens after Receive Invoice?  
The Decision Tree guard again assigns all samples to the same class label. The other three machine learning guards perform quite well in modeling the true underlying guard and either use possibly correlated attribute features or features that are directly associated with the true underlying guard.

Overall, the machine learning techniques differ considerably in how well they model the true underlying guard when nonlinearities are at play. The Decision Tree guards fail to model the true underlying guards most of the time, while the Neural Network guards seem to handle the nonlinearities quite well, followed by the Logistic Regression guards and the Support Vector Machine guards. As the comparison plots show, the F1-scores are quite good for "Are the goods fine or damaged?", "Standard Rejection or Approval", "Request Manager or Standard Approval" and "What happens if the goods are fine?", whereas all four techniques achieve only mediocre F1-scores for "Manager Rejection or Approval" and "What happens after Receive Invoice?".
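
One possible contributing factor for the weak Decision Tree guards on the rejection decision points is the parity condition `item_amount mod 2 == 1`, which no single threshold on `item_amount` can separate. The toy sketch below illustrates this on synthetic values only; it does not use the event log, `best_single_split_accuracy` is a hypothetical helper, and a real tree combines many splits and further stopping criteria.

```python
def best_single_split_accuracy(values, labels):
    """Best accuracy of one threshold split that predicts the majority label on each side."""
    n = len(values)
    # Baseline: no split at all, predict the overall majority label.
    best = max(labels.count(True), labels.count(False)) / n
    for threshold in sorted(set(values)):
        left = [label for value, label in zip(values, labels) if value <= threshold]
        right = [label for value, label in zip(values, labels) if value > threshold]
        correct = sum(max(side.count(True), side.count(False)) for side in (left, right))
        best = max(best, correct / n)
    return best


# Synthetic item amounts 1..20 (an assumption for illustration, not log data).
item_amounts = list(range(1, 21))
parity_labels = [amount % 2 == 1 for amount in item_amounts]   # nonlinear guard
threshold_labels = [amount >= 11 for amount in item_amounts]   # threshold guard

parity_accuracy = best_single_split_accuracy(item_amounts, parity_labels)
threshold_accuracy = best_single_split_accuracy(item_amounts, threshold_labels)
print(parity_accuracy, threshold_accuracy)
```

On these synthetic values the threshold guard is recovered perfectly by one split, while the best split for the parity guard is barely better than guessing, so a split criterion sees almost no gain from `item_amount` at the first level.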
