# High-Utility Action Rules Mining Example

### Import Module

In [19]:
from action_rules import ActionRules

### Import Data

The aim of this example is to find actions that increase the probability that employees do not leave the company.

In [20]:
import pandas as pd
# Load the employee attrition dataset
data = pd.read_csv("data/attrition.csv")
data

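| Unnamed: 0 | TID | Department | Salary | Attrition |
|---|---|---|---|---|
| 0 | 1 | Sales | Medium | False |
| 1 | 2 | R&D | Medium | False |
| 2 | 3 | R&D | Medium | True |
| 3 | 4 | R&D | Medium | True |
| 4 | 5 | Sales | Low | False |
| 5 | 6 | R&D | High | False |
| 6 | 7 | R&D | High | False |
| 7 | 8 | R&D | High | True |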


### Utility Tables

In [21]:
intrinsic_table = {
    ('Salary', 'Low'): -300.0,
    ('Salary', 'Medium'): -500.0,
    ('Salary', 'High'): -1000.0,
    ('Attrition', 'False'): 700.0,
    ('Attrition', 'True'): 0.0,
}
# cost per transition
transition_table = {
    ('Salary', 'Low', 'Medium'): -2.0,
    ('Salary', 'Low', 'High'): -4.0,
    ('Salary', 'Medium', 'High'): -2.5,
}

### Explanation of Utility Tables

The **intrinsic_utility_table** and **transition_utility_table** are two key inputs for high-utility action rule mining. They define, respectively, the inherent value (or cost) of having a particular attribute value and the cost (or benefit) of changing an attribute value from one state to another.

#### intrinsic_utility_table

The intrinsic utility table assigns a utility value to each attribute–value pair. This value represents the inherent "worth" or "cost" associated with that specific state of an attribute. Typically:

- **Positive utility values** indicate a benefit or gain.
- **Negative utility values** indicate a cost or penalty.

For example:
- `('Salary', 'Low'): -300.0` means that having a low salary incurs a cost of 300 units.
- `('Salary', 'Medium'): -500.0` means that a medium salary incurs a cost of 500 units.
- `('Salary', 'High'): -1000.0` means that a high salary incurs a cost of 1000 units.
- `('Attrition', 'False'): 700.0` means that the condition "not leaving the company" (Attrition False) has a benefit of 700 units.
- `('Attrition', 'True'): 0.0` means that the condition "leaving the company" (Attrition True) has no benefit (or a cost of 0).

In high-utility action rule mining, these intrinsic utilities are summed to give the base utility of each side of a rule (undesired and desired) before any transition costs are considered.
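As a rough illustration of how these values add up, the sketch below sums the intrinsic utilities of one employee's attribute values. The helper `state_utility` is hypothetical and not part of the `action_rules` package. Treating attribute values with no table entry (such as `Department`) as worth 0 is an assumption, although it is consistent with the utilities printed later in this example.

```python
# Intrinsic utilities copied from the table above.
intrinsic_table = {
    ('Salary', 'Low'): -300.0,
    ('Salary', 'Medium'): -500.0,
    ('Salary', 'High'): -1000.0,
    ('Attrition', 'False'): 700.0,
    ('Attrition', 'True'): 0.0,
}


def state_utility(state, table):
    """Sum the intrinsic utility of every (attribute, value) pair in `state`.

    Hypothetical helper; pairs missing from `table` are assumed to be worth 0.
    """
    return sum(table.get(item, 0.0) for item in state.items())


# An R&D employee on a medium salary who stays with the company.
stays = {'Department': 'R&D', 'Salary': 'Medium', 'Attrition': 'False'}
# The same employee leaving the company.
leaves = {'Department': 'R&D', 'Salary': 'Medium', 'Attrition': 'True'}

print(state_utility(stays, intrinsic_table))   # 200.0
print(state_utility(leaves, intrinsic_table))  # -500.0
```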

#### transition_utility_table

The transition utility table specifies the cost or benefit of changing an attribute’s value from one state to another. These values capture the incremental effect (or cost) associated with a particular change in an attribute's state. For example:

- `('Salary', 'Low', 'Medium'): -2.0` means that the cost of increasing a salary from Low to Medium is 2 units.
- `('Salary', 'Low', 'High'): -4.0` means that the cost of increasing a salary from Low to High is 4 units.
- `('Salary', 'Medium', 'High'): -2.5` means that the cost of increasing a salary from Medium to High is 2.5 units.

These transition utilities are combined with the intrinsic utilities of the attribute’s states before and after the change to determine the overall benefit (or cost) of performing an action that changes an attribute’s value.
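To make the combination concrete, here is a minimal sketch of the net effect of a salary change on its own, ignoring the target attribute. The function `change_utility` is a hypothetical illustration rather than the package's API, and it assumes that transitions missing from the table cost 0.

```python
intrinsic_table = {
    ('Salary', 'Low'): -300.0,
    ('Salary', 'Medium'): -500.0,
    ('Salary', 'High'): -1000.0,
}
transition_table = {
    ('Salary', 'Low', 'Medium'): -2.0,
    ('Salary', 'Low', 'High'): -4.0,
    ('Salary', 'Medium', 'High'): -2.5,
}


def change_utility(attribute, before, after):
    """Net utility of changing `attribute` from `before` to `after`.

    Hypothetical helper: intrinsic difference plus the transition utility
    (assumed to be 0 when the transition is not listed).
    """
    intrinsic_diff = (intrinsic_table[(attribute, after)]
                      - intrinsic_table[(attribute, before)])
    transition = transition_table.get((attribute, before, after), 0.0)
    return intrinsic_diff + transition


print(change_utility('Salary', 'Medium', 'High'))  # -502.5
print(change_utility('Salary', 'Low', 'High'))     # -704.0
```

Taken alone, the raise is a net cost. An action only becomes worthwhile once the 700-unit benefit of retaining the employee is added: -502.5 + 700 = 197.5.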

---

Together, these tables allow the system to compute both:
- **Base (Intrinsic) Utilities:** The inherent worth of a rule based on the states of its attributes.
- **Transition Gains/Costs:** The additional effects of changing an attribute’s state, including any associated cost or benefit for modifying the target attribute.

When combined with support and confidence measures from the mined rules, these utility values help determine which action rules are most profitable or cost-effective.


### Initialize ActionRules Miner with Parameters

The utility tables defined above are passed to the miner through the `intrinsic_utility_table` and `transition_utility_table` parameters.

In [22]:
# Parameters
stable_attributes = ['Department']
flexible_attributes = ['Salary']
target = 'Attrition'
min_stable_attributes = 1
min_flexible_attributes = 1  # min 1
min_undesired_support = 2
min_undesired_confidence = 0.6  # min 0.5
min_desired_support = 2
min_desired_confidence = 0.6  # min 0.5
undesired_state = 'True'
desired_state = 'False'
# Action Rules Mining
action_rules = ActionRules(
    min_stable_attributes=min_stable_attributes,
    min_flexible_attributes=min_flexible_attributes,
    min_undesired_support=min_undesired_support,
    min_undesired_confidence=min_undesired_confidence,
    min_desired_support=min_desired_support,
    min_desired_confidence=min_desired_confidence,
    intrinsic_utility_table=intrinsic_table,
    transition_utility_table=transition_table,
)

### Train the Model

In [23]:
action_rules.fit(
    data=data,
    stable_attributes=stable_attributes,
    flexible_attributes=flexible_attributes,
    target=target,
    target_undesired_state=undesired_state,
    target_desired_state=desired_state,
    use_sparse_matrix=False,
    use_gpu=False,
)

### Print Action Rules with Utility Metrics

In [24]:
len(action_rules.get_rules().get_ar_notation())

1

In [25]:
# Print rules
for action_rule in action_rules.get_rules().get_ar_notation():
    print(action_rule)

[(Department: R&D) ∧ (Salary: Medium → High)] ⇒ [Attrition: True → False], support of undesired part: 2, confidence of undesired part: 0.6666666666666666, support of desired part: 2, confidence of desired part: 0.6666666666666666, uplift: 0.12499999999999997, utility: {undesired_rule_utility: -500.0, desired_rule_utility: -300.0, rule_utility_difference: 200.0, transition_gain: -2.5, rule_utility_gain: 197.5, realistic_undesired_utility: -433.3333333333333, realistic_desired_utility: -366.6666666666667, realistic_rule_difference: 66.66666666666663, transition_gain_dataset: -7.5, realistic_rule_gain_dataset: 192.4999999999999}
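
The support and confidence figures in this rule can be checked by hand against the eight rows of the dataset. The sketch below does this in plain Python. The helper `support_confidence` is hypothetical, and it assumes that support is the number of rows matching both the conditions and the target state, and that confidence is that count divided by the number of rows matching the conditions.

```python
# (Department, Salary, Attrition) for the eight rows shown earlier.
rows = [
    ('Sales', 'Medium', False),
    ('R&D', 'Medium', False),
    ('R&D', 'Medium', True),
    ('R&D', 'Medium', True),
    ('Sales', 'Low', False),
    ('R&D', 'High', False),
    ('R&D', 'High', False),
    ('R&D', 'High', True),
]


def support_confidence(department, salary, attrition):
    """Return (support, confidence) of: department AND salary => attrition."""
    matching = [r for r in rows if r[0] == department and r[1] == salary]
    support = sum(1 for r in matching if r[2] == attrition)
    confidence = support / len(matching) if matching else 0.0
    return support, confidence


print(support_confidence('R&D', 'Medium', True))  # undesired part: (2, 0.6666666666666666)
print(support_confidence('R&D', 'High', False))   # desired part:   (2, 0.6666666666666666)
```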


In [26]:
for action_rule in action_rules.get_rules().get_pretty_ar_notation():
    print(action_rule)

If attribute 'Department' is 'R&D', attribute 'Salary' value 'Medium' is changed to 'High', then 'Attrition' value 'True' is changed to 'False with uplift: 0.12499999999999997, support of undesired part: 2, confidence of undesired part: 0.6666666666666666, support of desired part: 2, confidence of desired part: 0.6666666666666666., base utilities: (undesired: -500.0, desired: -300.0, difference: 200.0), transition gain: -2.5, rule utility gain: 197.5, realistic utilities: (undesired: -433.3333333333333, desired: -366.6666666666667, difference: 66.66666666666663), dataset-level transition gain: -7.5, dataset-level rule gain: 192.4999999999999


In [27]:
export = action_rules.get_rules().get_export_notation()
print(export)

[{"stable": [{"attribute": "Department", "value": "R&D"}], "flexible": [{"attribute": "Salary", "undesired": "Medium", "desired": "High"}], "target": {"attribute": "Attrition", "undesired": "True", "desired": "False"}, "support of undesired part": 2, "confidence of undesired part": 0.6666666666666666, "support of desired part": 2, "confidence of desired part": 0.6666666666666666, "uplift": 0.12499999999999997, "utility": {"undesired_rule_utility": -500.0, "desired_rule_utility": -300.0, "rule_utility_difference": 200.0, "transition_gain": -2.5, "rule_utility_gain": 197.5, "realistic_undesired_utility": -433.3333333333333, "realistic_desired_utility": -366.6666666666667, "realistic_rule_difference": 66.66666666666663, "transition_gain_dataset": -7.5, "realistic_rule_gain_dataset": 192.4999999999999}}]
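
The export notation printed above appears to be a JSON string, so it can be loaded with the standard `json` module for further processing. The sketch below works on a trimmed copy of that output (only a few fields are kept to save space) and sorts the rules by `rule_utility_gain`. The variable names are illustrative.

```python
import json

# Trimmed copy of the exported string above; the real export has more fields.
export = '''[{"stable": [{"attribute": "Department", "value": "R&D"}],
 "flexible": [{"attribute": "Salary", "undesired": "Medium", "desired": "High"}],
 "target": {"attribute": "Attrition", "undesired": "True", "desired": "False"},
 "uplift": 0.12499999999999997,
 "utility": {"rule_utility_gain": 197.5,
             "realistic_rule_gain_dataset": 192.4999999999999}}]'''

rules = json.loads(export)

# Rank rules from the most to the least profitable.
ranked = sorted(rules, key=lambda r: r["utility"]["rule_utility_gain"], reverse=True)

for rule in ranked:
    changes = ", ".join(
        f'{flex["attribute"]}: {flex["undesired"]} -> {flex["desired"]}'
        for flex in rule["flexible"]
    )
    print(changes, "| gain:", rule["utility"]["rule_utility_gain"])
```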


### Explanation of Utility Metrics in High-Utility Action Rule Mining

Below are detailed explanations of the utility metrics used to evaluate the profitability of action rules. These metrics combine the intrinsic utilities of items with the gains or costs incurred by transitions (i.e., changes in attribute values). The formulas below are provided in LaTeX for clarity.

---

#### Base Utility Metrics

Let:
- $I_U$ be the set of item indices in the **undesired** rule’s itemset.
- $I_D$ be the set of item indices in the **desired** rule’s itemset.
- $u(i)$ denote the intrinsic utility of item $i$.
- $u(\text{target}_{\text{undesired}})$ be the intrinsic utility of the undesired target state.
- $u(\text{target}_{\text{desired}})$ be the intrinsic utility of the desired target state.
- $T(i, j)$ denote the transition utility for changing from item $i$ to item $j$ (for flexible attributes).
- $T(\text{target}_{\text{undesired}}, \text{target}_{\text{desired}})$ denote the transition utility for changing the target state.

The base metrics are defined as follows:

1. **Undesired Rule Utility ($U_{\text{undesired}}$)**:
   $$
   U_{\text{undesired}} = \sum_{i \in I_U} u(i) + u(\text{target}_{\text{undesired}})
   $$
   This is the total intrinsic utility of all items in the undesired rule’s itemset, plus the intrinsic utility of the undesired target state.

2. **Desired Rule Utility ($U_{\text{desired}}$)**:
   $$
   U_{\text{desired}} = \sum_{i \in I_D} u(i) + u(\text{target}_{\text{desired}})
   $$
   This is the total intrinsic utility of all items in the desired rule’s itemset, plus the intrinsic utility of the desired target state.

3. **Rule Utility Difference ($\Delta U_{\text{intr}}$)**:
   $$
   \Delta U_{\text{intr}} = U_{\text{desired}} - U_{\text{undesired}}
   $$
   This measures the net change in intrinsic utility when moving from the undesired to the desired rule.

4. **Transition Gain ($G_{\text{trans}}$)**:
   $$
   G_{\text{trans}} = \sum_{(i, j) \in F} T(i, j) + T(\text{target}_{\text{undesired}}, \text{target}_{\text{desired}})
   $$
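   where $F$ is the set of value changes in flexible attributes, that is, pairs of an undesired item and a desired item that belong to the same attribute (written $\text{attr}(\cdot)$) but differ in value:
   $$
   F = \{(i,j) \mid i \in I_U,\, j \in I_D,\, \text{attr}(i) = \text{attr}(j),\, i \neq j\}.
   $$
   This is the additional utility (or cost) from the changes in flexible attributes, plus the gain (or cost) of transitioning the target state.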

5. **Overall Rule Utility Gain ($\Delta U_{\text{rule}}$)**:
   $$
   \Delta U_{\text{rule}} = \Delta U_{\text{intr}} + G_{\text{trans}}
   $$
   This represents the net gain (or loss) when applying the action rule.
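
As a worked check, the following sketch applies the five formulas to the single rule mined in this example and reproduces the reported values. It is an illustration, not the package's implementation: the variable names are made up, and treating missing table entries (the `Department` item and the target transition) as 0 is an assumption that happens to match the printed output.

```python
intrinsic = {
    ('Salary', 'Medium'): -500.0,
    ('Salary', 'High'): -1000.0,
    ('Attrition', 'False'): 700.0,
    ('Attrition', 'True'): 0.0,
}
transition = {('Salary', 'Medium', 'High'): -2.5}

# Itemsets of the undesired and desired rules, listed attribute by attribute.
undesired_items = [('Department', 'R&D'), ('Salary', 'Medium')]
desired_items = [('Department', 'R&D'), ('Salary', 'High')]
target_undesired = ('Attrition', 'True')
target_desired = ('Attrition', 'False')

# 1-3: intrinsic utilities of both rules and their difference.
u_undesired = (sum(intrinsic.get(i, 0.0) for i in undesired_items)
               + intrinsic.get(target_undesired, 0.0))
u_desired = (sum(intrinsic.get(i, 0.0) for i in desired_items)
             + intrinsic.get(target_desired, 0.0))
delta_intr = u_desired - u_undesired

# 4: transition gain from changed flexible items plus the target transition.
g_trans = sum(
    transition.get((attr, old, new), 0.0)
    for (attr, old), (_, new) in zip(undesired_items, desired_items)
    if old != new
)
g_trans += transition.get(('Attrition', 'True', 'False'), 0.0)

# 5: overall rule utility gain.
delta_rule = delta_intr + g_trans

print(u_undesired, u_desired, delta_intr, g_trans, delta_rule)
# -500.0 -300.0 200.0 -2.5 197.5
```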

---

#### Realistic (Confidence-Scaled) Utility Metrics

Realistic metrics adjust the base metrics by incorporating the confidence of the rules. Let:
- $c_u$ be the confidence of the undesired rule.
- $s_u$ be the absolute support of the undesired rule.
- $c_d$ be the confidence of the desired rule.
- $U_{\text{undesired}}$ and $U_{\text{desired}}$ be as defined above.
- $G_{\text{trans}}$ be the base transition gain.

Then, the realistic metrics are defined as follows:

1. **Realistic Undesired Utility ($U_{\text{undesired, realistic}}$)**:
   $$
   U_{\text{undesired, realistic}} = c_u \cdot U_{\text{undesired}} + (1 - c_u) \cdot U_{\text{desired}}
   $$
   This reflects the effective undesired utility when accounting for the possibility that, even without any action, some instances might already exhibit the desired outcome.

2. **Realistic Desired Utility ($U_{\text{desired, realistic}}$)**:
   $$
   U_{\text{desired, realistic}} = (1 - c_d) \cdot U_{\text{undesired}} + c_d \cdot U_{\text{desired}}
   $$
   This reflects the effective desired utility under the assumption that only a fraction $c_d$ of the instances transition as intended.

3. **Realistic Rule Difference ($\Delta U_{\text{realistic}}$)**:
   $$
   \Delta U_{\text{realistic}} = U_{\text{desired, realistic}} - U_{\text{undesired, realistic}}
   $$
   This measures the net gain using the realistic (confidence-scaled) intrinsic utilities.

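4. **Dataset-Level Realistic Rule Gain ($\Delta U_{\text{dataset, realistic}}$)**:

   First, estimate the effective number of transactions, i.e. the number of transactions that match the conditions of the undesired rule (its support divided by its confidence):
   $$
   N_{\text{eff}} = \frac{s_u}{c_u} \quad (\text{if } c_u > 0)
   $$
   In the example above, $N_{\text{eff}} = 2 / (2/3) = 3$.

   Then, the dataset-level realistic rule gain is:
   $$
   \Delta U_{\text{dataset, realistic}} = N_{\text{eff}} \cdot \left(\Delta U_{\text{realistic}} + G_{\text{trans}}\right)
   $$
   This scales the realistic per-transaction gain, including the transition cost, to all transactions the rule could be applied to.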

5. **Dataset-Level Transition Gain ($G_{\text{trans, dataset}}$)**:
   $$
   G_{\text{trans, dataset}} = N_{\text{eff}} \cdot G_{\text{trans}}
   $$
   This scales the transition gain by the effective number of transactions.
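
The realistic metrics for the example rule can be recomputed in the same way. The sketch below starts from the base utilities, confidences and support reported in the output and follows the formulas in this section. The variable names are illustrative, and falling back to `0.0` when $c_u = 0$ is an assumption, since the text only defines $N_{\text{eff}}$ for $c_u > 0$.

```python
# Values reported for the rule in this example.
u_undesired, u_desired = -500.0, -300.0
g_trans = -2.5
c_u = c_d = 2 / 3   # confidence of the undesired and desired parts
s_u = 2             # absolute support of the undesired part

# Confidence-weighted utilities of both sides of the rule.
u_undesired_real = c_u * u_undesired + (1 - c_u) * u_desired
u_desired_real = (1 - c_d) * u_undesired + c_d * u_desired
delta_real = u_desired_real - u_undesired_real

# Effective number of transactions; the 0.0 fallback is an assumption.
n_eff = s_u / c_u if c_u > 0 else 0.0

g_trans_dataset = n_eff * g_trans
delta_dataset_real = n_eff * (delta_real + g_trans)

print(round(u_undesired_real, 2))    # -433.33
print(round(u_desired_real, 2))      # -366.67
print(round(delta_real, 2))          # 66.67
print(round(g_trans_dataset, 2))     # -7.5
print(round(delta_dataset_real, 2))  # 192.5
```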

---

These metrics provide a comprehensive assessment of an action rule's effectiveness by combining the intrinsic value of attribute states with the costs or benefits of making transitions, and then adjusting these measures by the confidence in the rule.