# Co-Occurrence

> Business question: Which events (or product purchases, etc.) go together? Example answer: "If A occurs, then B is likely to occur as well."

A practical, small-scale example of a **co-occurrence / association problem**.
We’ll use a market-basket dataset, the classic example for this kind of analysis, to find items that are often bought together.

## Example: Market Basket Analysis

### Problem Explanation

We want to answer:
* “If a customer buys Bread, what other items are they likely to buy?”
* This is a co-occurrence problem where we search for patterns in transactions.
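
As a rough illustration of what "searching for patterns in transactions" means at its simplest, the sketch below counts how often each pair of items shares a basket. The `baskets` list and the `bought_with` helper are illustrative names written for this example, not part of any library. Raw counts like these ignore how popular each item is on its own.

```python
from collections import Counter
from itertools import combinations

# Illustrative sample baskets (one list of items per transaction)
baskets = [
    ['Milk', 'Bread', 'Butter'],
    ['Beer', 'Bread'],
    ['Milk', 'Bread', 'Beer', 'Eggs'],
    ['Milk', 'Bread', 'Butter'],
    ['Bread', 'Butter'],
    ['Milk', 'Eggs'],
]

# Count every unordered item pair once per basket
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(set(basket)), 2):
        pair_counts[pair] += 1


def bought_with(item):
    """Return {other_item: number of baskets shared with `item`}."""
    partners = {}
    for (a, b), n in pair_counts.items():
        if item == a:
            partners[b] = n
        elif item == b:
            partners[a] = n
    return partners


bread_partners = bought_with('Bread')
# Most frequent partners first, ties broken alphabetically
print(sorted(bread_partners.items(), key=lambda kv: (-kv[1], kv[0])))
# → [('Butter', 3), ('Milk', 3), ('Beer', 2), ('Eggs', 1)]
```

Butter and Milk tie on raw counts here, which is one reason to normalise the counts rather than rank on them directly.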

### Steps to Tackle
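1. Prepare data: collect each transaction as a list of the items it contains.
2. Transform data: one-hot encode the transactions so that each item becomes a Boolean column.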
3. Apply Association Rule Mining
   * Use the **Apriori Algorithm** to find frequent itemsets.
   * Generate association rules:

     $\text{Rule: } A \rightarrow B, \quad \text{measured by support, confidence, lift}$

   * Where:
     * **Support**: the fraction of all transactions in which A and B occur together

       $\text{Support}(A,B) = \frac{\text{Transactions with A and B}}{\text{Total transactions}}$

     * **Confidence**: the fraction of transactions containing A that also contain B

       $\text{Confidence}(A \rightarrow B) = \frac{\text{Support}(A,B)}{\text{Support}(A)}$

     * **Lift**: how much more often B appears with A than its overall frequency would predict; a lift above 1 suggests a positive association, and a lift of 1 suggests independence

       $\text{Lift}(A \rightarrow B) = \frac{\text{Confidence}(A \rightarrow B)}{\text{Support}(B)}$
4. Interpret results:
   * Look for rules such as “If Bread, then Butter” that have both high confidence and a lift above 1.
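
To make the three metrics concrete, here is a minimal sketch that computes them by hand for the rule Bread → Butter on six sample baskets. The helpers `support`, `confidence`, and `lift` are hypothetical names written for this illustration. They follow the formulas above directly and are not the mlxtend implementation.

```python
# Illustrative sample baskets, stored as sets for easy subset checks
baskets = [
    {'Milk', 'Bread', 'Butter'},
    {'Beer', 'Bread'},
    {'Milk', 'Bread', 'Beer', 'Eggs'},
    {'Milk', 'Bread', 'Butter'},
    {'Bread', 'Butter'},
    {'Milk', 'Eggs'},
]


def support(items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent):
    """Support(A, B) / Support(A)."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)


def lift(antecedent, consequent):
    """Confidence(A -> B) / Support(B)."""
    return confidence(antecedent, consequent) / support(consequent)


a, b = {'Bread'}, {'Butter'}
print(f"support    = {support(a | b):.3f}")   # 3 of 6 baskets
print(f"confidence = {confidence(a, b):.3f}")  # 3 of the 5 Bread baskets
print(f"lift       = {lift(a, b):.3f}")        # 0.6 / 0.5
```

With these baskets the rule has support 0.5, confidence 0.6, and lift 1.2: Butter shows up in 60% of Bread baskets versus 50% of all baskets.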

```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules

# -------------------------------
# Step 1: Sample Transactions
# -------------------------------
# Each inner list is one basket (one customer purchase)
transactions = [
    ['Milk', 'Bread', 'Butter'],
    ['Beer', 'Bread'],
    ['Milk', 'Bread', 'Beer', 'Eggs'],
    ['Milk', 'Bread', 'Butter'],
    ['Bread', 'Butter'],
    ['Milk', 'Eggs']
]

# -------------------------------
# Step 2: Transform Data
# -------------------------------
# One-hot encode: one Boolean column per item, one row per transaction
te = TransactionEncoder()
te_array = te.fit(transactions).transform(transactions)
df = pd.DataFrame(te_array, columns=te.columns_)

# -------------------------------
# Step 3: Frequent Itemsets
# -------------------------------
# Keep itemsets that appear in at least 30% of transactions
frequent_itemsets = apriori(df, min_support=0.3, use_colnames=True)

# -------------------------------
# Step 4: Generate Association Rules
# -------------------------------
# Keep rules whose lift is at least 1.0
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1.0)

# Display results
print("Frequent Itemsets:")
print(frequent_itemsets)
print("\nAssociation Rules:")
print(rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']])
```

Frequent Itemsets:
     support               itemsets
0   0.333333                 (Beer)
1   0.833333                (Bread)
2   0.500000               (Butter)
3   0.333333                 (Eggs)
4   0.666667                 (Milk)
5   0.333333          (Beer, Bread)
6   0.500000        (Butter, Bread)
7   0.500000          (Milk, Bread)
8   0.333333         (Milk, Butter)
9   0.333333           (Milk, Eggs)
10  0.333333  (Milk, Butter, Bread)

Association Rules:
        antecedents      consequents   support  confidence      lift
0            (Beer)          (Bread)  0.333333    1.000000  1.200000
1           (Bread)           (Beer)  0.333333    0.400000  1.200000
2          (Butter)          (Bread)  0.500000    1.000000  1.200000
3           (Bread)         (Butter)  0.500000    0.600000  1.200000
4            (Milk)         (Butter)  0.333333    0.500000  1.000000
5          (Butter)           (Milk)  0.333333    0.666667  1.000000
6            (Milk)           (Eggs)  0.333333