## ✅ Step 1: Simulate Transaction Data
🎯 Task:
- Create at least 10 fake transactions.
- Each transaction should have 2–5 items.
- Use a pool of at least 8 unique items (e.g., Bread, Milk, Eggs, etc.).

In [25]:
import random
import pandas as pd

random.seed(123)

# Define a pool of at least 8 unique items
items_pool = ['Bread', 'Milk', 'Eggs', 'Butter', 'Cheese', 'Apples', 'Bananas', 'Cereal']

# Simulate 10 transactions with 2–5 items each
transactions = []

for _ in range(10):
    num_items = random.randint(2, 5)
    transaction = random.sample(items_pool, num_items)
    transactions.append(transaction)

# Display the transactions
for idx, t in enumerate(transactions, 1):
    print(f"Transaction {idx}: {t}")

Transaction 1: ['Cheese', 'Bread']
Transaction 2: ['Cheese', 'Bread', 'Bananas', 'Butter', 'Eggs']
Transaction 3: ['Bread', 'Milk', 'Bananas', 'Eggs']
Transaction 4: ['Butter', 'Milk', 'Bread', 'Cereal']
Transaction 5: ['Bananas', 'Bread']
Transaction 6: ['Apples', 'Cereal']
Transaction 7: ['Milk', 'Bread', 'Bananas', 'Cereal', 'Cheese']
Transaction 8: ['Cheese', 'Butter']
Transaction 9: ['Cheese', 'Butter', 'Bread', 'Eggs', 'Cereal']
Transaction 10: ['Butter', 'Bananas', 'Cheese', 'Apples', 'Eggs']
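
As a quick sanity check, the printed transactions can be tested against the task constraints. The sketch below is only an illustration: `check_transactions` is a hypothetical helper (not part of the original notebook), and the list is copied by hand from the output above. It counts the items that actually appear in the transactions, which is slightly stricter than requiring a pool of 8 items to draw from.

```python
# Transactions copied from the printed output above.
transactions = [
    ['Cheese', 'Bread'],
    ['Cheese', 'Bread', 'Bananas', 'Butter', 'Eggs'],
    ['Bread', 'Milk', 'Bananas', 'Eggs'],
    ['Butter', 'Milk', 'Bread', 'Cereal'],
    ['Bananas', 'Bread'],
    ['Apples', 'Cereal'],
    ['Milk', 'Bread', 'Bananas', 'Cereal', 'Cheese'],
    ['Cheese', 'Butter'],
    ['Cheese', 'Butter', 'Bread', 'Eggs', 'Cereal'],
    ['Butter', 'Bananas', 'Cheese', 'Apples', 'Eggs'],
]


def check_transactions(transactions, min_transactions=10,
                       min_items=2, max_items=5, min_pool=8):
    """Hypothetical helper: return a list of violated constraints (empty if none)."""
    problems = []
    if len(transactions) < min_transactions:
        problems.append(f"only {len(transactions)} transactions")
    for idx, t in enumerate(transactions, 1):
        if not min_items <= len(t) <= max_items:
            problems.append(f"transaction {idx} has {len(t)} items")
        if len(set(t)) != len(t):
            problems.append(f"transaction {idx} repeats an item")
    # Distinct items that actually occur across all transactions.
    observed_items = {item for t in transactions for item in t}
    if len(observed_items) < min_pool:
        problems.append(f"only {len(observed_items)} distinct items appear")
    return problems


problems = check_transactions(transactions)
print(problems or "All constraints satisfied")
```

Because `random.sample` draws without replacement, the repeated-item check should never fire for data generated as in the cell above; it is included in case the transactions come from another source.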


## ✅ Step 2: Analyze with Apriori (4 Marks)
🎯 Tasks:
- Convert the transaction data into a one-hot encoded format.
- Use mlxtend's Apriori algorithm to find frequent itemsets.
- Set minimum support = 0.3 (30%).
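
To make the terms concrete before using the library: one-hot encoding turns each transaction into a row of True/False flags (one per item), and the support of an itemset is the fraction of rows in which all of its items are True. The following is a minimal plain-Python sketch of that idea, not mlxtend's implementation; the `support` helper is illustrative, and the transaction list is copied from the Step 1 output.

```python
# Transactions copied from the Step 1 output.
transactions = [
    ['Cheese', 'Bread'],
    ['Cheese', 'Bread', 'Bananas', 'Butter', 'Eggs'],
    ['Bread', 'Milk', 'Bananas', 'Eggs'],
    ['Butter', 'Milk', 'Bread', 'Cereal'],
    ['Bananas', 'Bread'],
    ['Apples', 'Cereal'],
    ['Milk', 'Bread', 'Bananas', 'Cereal', 'Cheese'],
    ['Cheese', 'Butter'],
    ['Cheese', 'Butter', 'Bread', 'Eggs', 'Cereal'],
    ['Butter', 'Bananas', 'Cheese', 'Apples', 'Eggs'],
]

# One-hot encode: one row per transaction, one boolean flag per item.
items = sorted({item for t in transactions for item in t})
one_hot = [{item: item in t for item in items} for t in transactions]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    hits = sum(all(row[item] for item in itemset) for row in one_hot)
    return hits / len(one_hot)


print(items)
print("support(Bread)          =", support({'Bread'}))
print("support(Bananas, Bread) =", support({'Bananas', 'Bread'}))
```

With 10 transactions, a minimum support of 0.3 means an itemset must appear in at least 3 of them to count as frequent.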

Let's first install the package mlxtend.

In [28]:
!pip install mlxtend



In [29]:
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori

# Use TransactionEncoder to convert the list of transactions to a one-hot encoded DataFrame
te = TransactionEncoder()
te_array = te.fit_transform(transactions)
df = pd.DataFrame(te_array, columns=te.columns_)

# Display one-hot encoded DataFrame
print("One-hot encoded transaction data:")
print(df)

# Apply Apriori algorithm
frequent_itemsets = apriori(df, min_support=0.3, use_colnames=True)

# Display frequent itemsets
print("\nFrequent Itemsets (support >= 30%):")
print(frequent_itemsets)

One-hot encoded transaction data:
   Apples  Bananas  Bread  Butter  Cereal  Cheese   Eggs   Milk
0   False    False   True   False   False    True  False  False
1   False     True   True    True   False    True   True  False
2   False     True   True   False   False   False   True   True
3   False    False   True    True    True   False  False   True
4   False     True   True   False   False   False  False  False
5    True    False  False   False    True   False  False  False
6   False     True   True   False    True    True  False   True
7   False    False  False    True   False    True  False  False
8   False    False   True    True    True    True   True  False
9    True     True  False    True   False    True   True  False

Frequent Itemsets (support >= 30%):
    support                itemsets
0       0.5               (Bananas)
1       0.7                 (Bread)
2       0.5                (Butter)
3       0.4                (Cereal)
4       0.6                (Cheese)
5       0

## ✅ Step 3: Generate Association Rules (3 Marks)
🎯 Tasks:
- Generate association rules from the frequent itemsets, using:
  - Metric: Confidence
  - Minimum threshold: 0.7 (70%)
- Show at least 2 rules.
- Briefly explain one rule in everyday language.

In [33]:
from mlxtend.frequent_patterns import association_rules

# Generate association rules using confidence metric
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.7)

# Display rules
print("Association Rules (confidence ≥ 70%):")
print(rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']])

# Show at least 2 rules
print("\nTop 2 Rules:")
print(rules[['antecedents', 'consequents', 'confidence']].head(2))

Association Rules (confidence ≥ 70%):
         antecedents       consequents  support  confidence      lift
0          (Bananas)           (Bread)      0.4        0.80  1.142857
1             (Eggs)         (Bananas)      0.3        0.75  1.500000
2           (Cereal)           (Bread)      0.3        0.75  1.071429
3             (Eggs)           (Bread)      0.3        0.75  1.071429
4             (Milk)           (Bread)      0.3        1.00  1.428571
5           (Butter)          (Cheese)      0.4        0.80  1.333333
6             (Eggs)          (Butter)      0.3        0.75  1.500000
7             (Eggs)          (Cheese)      0.3        0.75  1.250000
8   (Cheese, Butter)            (Eggs)      0.3        0.75  1.875000
9     (Cheese, Eggs)          (Butter)      0.3        1.00  2.000000
10    (Eggs, Butter)          (Cheese)      0.3        1.00  1.666667
11            (Eggs)  (Cheese, Butter)      0.3        0.75  1.875000

Top 2 Rules:
  antecedents consequents  confidence


🔍 Rule 1:
If a customer buys Bananas, there is an 80% chance that the same basket also contains Bread (confidence 0.80, lift 1.14).

Clear Explanation in Everyday Language:
"Most customers who buy Bananas also pick up Bread: in this data, 4 of the 5 baskets with Bananas also contain Bread. So if someone puts Bananas in their cart, they will probably want Bread too. Bread is popular with everyone, though (it is in 7 of all 10 baskets), so Bananas raise the odds only a little, which is what the lift of 1.14 reflects."

🔍 Rule 2 (optional explanation):
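If a customer buys Eggs, there is a 75% chance that the same basket also contains Bananas (confidence 0.75, lift 1.50).

Simple Explanation:
"People who buy Eggs often also buy Bananas: 3 of the 4 baskets with Eggs contain Bananas, compared with 5 of all 10 baskets. So Eggs and Bananas are bought together more often than chance alone would suggest."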
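
The numbers behind both rules can be recomputed by hand. Confidence is the support of the combined itemset divided by the support of the antecedent, and lift is that confidence divided by the support of the consequent. The sketch below uses these standard definitions in plain Python; `rule_metrics` is an illustrative helper, not an mlxtend function, and the transactions are copied from the Step 1 output.

```python
# Transactions copied from the Step 1 output.
transactions = [
    ['Cheese', 'Bread'],
    ['Cheese', 'Bread', 'Bananas', 'Butter', 'Eggs'],
    ['Bread', 'Milk', 'Bananas', 'Eggs'],
    ['Butter', 'Milk', 'Bread', 'Cereal'],
    ['Bananas', 'Bread'],
    ['Apples', 'Cereal'],
    ['Milk', 'Bread', 'Bananas', 'Cereal', 'Cheese'],
    ['Cheese', 'Butter'],
    ['Cheese', 'Butter', 'Bread', 'Eggs', 'Cereal'],
    ['Butter', 'Bananas', 'Cheese', 'Apples', 'Eggs'],
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    both = support(set(antecedent) | set(consequent))
    confidence = both / support(antecedent)
    lift = confidence / support(consequent)
    return both, confidence, lift


for antecedent, consequent in [({'Bananas'}, {'Bread'}), ({'Eggs'}, {'Bananas'})]:
    s, conf, lift = rule_metrics(antecedent, consequent)
    print(f"{sorted(antecedent)} -> {sorted(consequent)}: "
          f"support={s:.2f}, confidence={conf:.2f}, lift={lift:.2f}")
```

The results should agree with rows 0 and 1 of the rules table. A lift above 1 means the consequent shows up more often alongside the antecedent than it does across all baskets.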