## 1. Importing the libraries

In [28]:
!pip install mlxtend

Defaulting to user installation because normal site-packages is not writeable





In [29]:
import pandas as pd
import random  # used to randomly generate transactions
from mlxtend.frequent_patterns import apriori, association_rules  # find frequent itemsets and generate the association rules


## 2. Simulate Transaction Data
We'll create 10 fake transactions, each drawn at random from a pool of 8 items.

In [30]:
items_pool = ['Bread', 'Milk', 'Eggs', 'Cheese', 'Apples', 'Diapers', 'Beer', 'Coke']  # the supermarket item list

In [31]:
transactions = []
for _ in range(10):  # repeat 10 times to create 10 transactions
    transaction = random.sample(items_pool, random.randint(2, 5))  # pick 2 to 5 distinct items at random from items_pool
    transactions.append(transaction)  # add each basket to the list


In [32]:
# print the transactions
print("Sample Transactions:")
for i, t in enumerate(transactions, 1):
    print(f"Transaction {i}: {t}")


Sample Transactions:
Transaction 1: ['Bread', 'Eggs']
Transaction 2: ['Diapers', 'Beer']
Transaction 3: ['Milk', 'Beer', 'Eggs', 'Bread']
Transaction 4: ['Eggs', 'Beer']
Transaction 5: ['Milk', 'Eggs']
Transaction 6: ['Diapers', 'Apples']
Transaction 7: ['Eggs', 'Cheese', 'Diapers']
Transaction 8: ['Bread', 'Diapers', 'Cheese', 'Apples']
Transaction 9: ['Bread', 'Beer', 'Apples', 'Coke', 'Eggs']
Transaction 10: ['Diapers', 'Beer', 'Coke', 'Cheese', 'Eggs']
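
Because `random` is never seeded, rerunning the cells above produces a different set of baskets each time, so the outputs shown in the later sections will not match a fresh run. Below is a minimal sketch of one way to make the simulation repeatable, assuming a dedicated `random.Random` instance. The helper name `make_transactions` and the seed value 42 are illustrative and not part of the original notebook, and the sketch will not reproduce the exact baskets printed above, since the state that generated them is unknown.

```python
import random

items_pool = ['Bread', 'Milk', 'Eggs', 'Cheese', 'Apples', 'Diapers', 'Beer', 'Coke']

def make_transactions(n, seed):
    """Return n random baskets of 2 to 5 distinct items (hypothetical helper)."""
    rng = random.Random(seed)  # private generator; the global random state is untouched
    return [rng.sample(items_pool, rng.randint(2, 5)) for _ in range(n)]

transactions_a = make_transactions(10, seed=42)
transactions_b = make_transactions(10, seed=42)
print(transactions_a == transactions_b)  # same seed, same baskets
```

Calling `random.seed(...)` once at the top of the notebook would likely work as well; a separate generator just keeps the simulation independent of any other code that uses `random`.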


## 3. One-Hot Encode the Data

In [33]:
all_items = sorted(set(item for t in transactions for item in t))  # alphabetically sorted list of all unique items

In [34]:
encoded_data = []
for t in transactions:
    # convert each transaction into a dict mapping every item to True (present) or False (absent)
    encoded_data.append({item: (item in t) for item in all_items})

In [35]:
df = pd.DataFrame(encoded_data)  # turn the list of dicts into a DataFrame
df

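|   | Apples | Beer | Bread | Cheese | Coke | Diapers | Eggs | Milk |
|---|--------|------|-------|--------|------|---------|------|------|
| 0 | False | False | True | False | False | False | True | False |
| 1 | False | True | False | False | False | True | False | False |
| 2 | False | True | True | False | False | False | True | True |
| 3 | False | True | False | False | False | False | True | False |
| 4 | False | False | False | False | False | False | True | True |
| 5 | True | False | False | False | False | True | False | False |
| 6 | False | False | False | True | False | True | True | False |
| 7 | True | False | True | True | False | True | False | False |
| 8 | True | True | True | False | True | False | True | False |
| 9 | False | True | False | True | True | True | True | False |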


Each cell is True if the item is in the transaction and False otherwise.

For example, transaction 1 contained Bread and Eggs, and in row 0 of the table those are the only two columns set to True.
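
To make the encoding concrete, the sketch below encodes the first three baskets from the sample output with the same dictionary comprehension, then decodes each row back into a set of items. The names `sample`, `encoded` and `decoded` are illustrative.

```python
# First three baskets copied from the sample output in section 2
sample = [['Bread', 'Eggs'], ['Diapers', 'Beer'], ['Milk', 'Beer', 'Eggs', 'Bread']]

# Column order: every distinct item, sorted alphabetically
all_items = sorted({item for t in sample for item in t})

# One row per basket, mapping each item to True/False
encoded = [{item: (item in t) for item in all_items} for t in sample]

# Decoding: keep only the items whose flag is True
decoded = [{item for item, present in row.items() if present} for row in encoded]

print(all_items)
print(encoded[0])
print(decoded == [set(t) for t in sample])
```

The round trip recovers the contents of each basket but not the order of items within it, which is fine here because Apriori treats a basket as a set.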

## 4. Find Frequent Itemsets

In [None]:
# run the Apriori algorithm to find itemsets with a minimum support of 0.3
frequent_itemsets = apriori(df, min_support=0.3, use_colnames=True)
print("\nFrequent Itemsets:")
print(frequent_itemsets)



Frequent Itemsets:
   support           itemsets
0      0.3           (Apples)
1      0.5             (Beer)
2      0.4            (Bread)
3      0.3           (Cheese)
4      0.5          (Diapers)
5      0.7             (Eggs)
6      0.4       (Eggs, Beer)
7      0.3      (Bread, Eggs)
8      0.3  (Diapers, Cheese)


This table shows items or combinations that appear in at least 30% of the 10 transactions.

For example:
- Eggs appears in 7 out of 10 transactions → support = 0.7
- Bread appears in 4 out of 10 → support = 0.4
- Eggs and Beer appear together in 4 out of 10 → support = 0.4

It helps us see what people often buy, alone or together.
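
The support values can be cross-checked by hand. The following sketch counts support directly from the 10 baskets printed in section 2 and lists every itemset that reaches the 0.3 threshold. It is a brute-force check that is only practical because the pool has 8 items; it is not how Apriori itself searches. The function name `support` and the dictionary `frequent` are illustrative.

```python
from itertools import combinations

# The 10 baskets from the sample output in section 2
transactions = [
    ['Bread', 'Eggs'],
    ['Diapers', 'Beer'],
    ['Milk', 'Beer', 'Eggs', 'Bread'],
    ['Eggs', 'Beer'],
    ['Milk', 'Eggs'],
    ['Diapers', 'Apples'],
    ['Eggs', 'Cheese', 'Diapers'],
    ['Bread', 'Diapers', 'Cheese', 'Apples'],
    ['Bread', 'Beer', 'Apples', 'Coke', 'Eggs'],
    ['Diapers', 'Beer', 'Coke', 'Cheese', 'Eggs'],
]
baskets = [set(t) for t in transactions]
items = sorted(set().union(*baskets))

def support(itemset):
    """Fraction of baskets that contain every item in itemset."""
    return sum(set(itemset) <= b for b in baskets) / len(baskets)

# Try every non-empty combination of items and keep those with support >= 0.3
frequent = {
    combo: support(combo)
    for size in range(1, len(items) + 1)
    for combo in combinations(items, size)
    if support(combo) >= 0.3
}
for combo, s in frequent.items():
    print(combo, s)
```

On these baskets the check finds the same nine itemsets as the table above: six single items and three pairs.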

## 5. Generating the Association Rules

In [None]:
rules = association_rules(frequent_itemsets, metric='confidence', min_threshold=0.7)


This creates rules of the form "if someone buys Beer, they also buy Eggs". We keep only rules with a confidence of at least 0.7, that is, rules where the second item appears in at least 70% of the transactions that contain the first.

In [38]:
print("\nAssociation Rules:")
print(rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']])



Association Rules:
  antecedents consequents  support  confidence      lift
0      (Beer)      (Eggs)      0.4        0.80  1.142857
1     (Bread)      (Eggs)      0.3        0.75  1.071429
2    (Cheese)   (Diapers)      0.3        1.00  2.000000


The association rules show which items tend to be bought together in this sample.
- For example, every transaction that contains Cheese also contains Diapers: this rule has a confidence of 1.0, meaning it holds 100% of the time.
- Another rule shows that transactions containing Beer also contain Eggs 80% of the time.

These patterns suggest that certain items are often purchased together.

A store could use this information to place those items near each other or offer bundle deals. 

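All three lift values are above 1, so each pair appears together more often than it would if the two items were bought independently. Cheese and Diapers (lift 2.0) is the clearest case; the lifts of about 1.14 and 1.07 for the two Eggs rules are only slightly above 1. Because the 10 transactions were generated at random, these patterns reflect chance in this small sample rather than real shopping behavior.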
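
The confidence and lift columns follow from the support values alone. Using the standard definitions, confidence(A → C) = support(A and C together) / support(A), and lift(A → C) = confidence(A → C) / support(C). The sketch below recomputes both for the three rules from the supports in the Frequent Itemsets table; the dictionary layout and the helper `rule_metrics` are illustrative.

```python
# Supports copied from the Frequent Itemsets table in section 4
support = {
    frozenset(['Beer']): 0.5,
    frozenset(['Bread']): 0.4,
    frozenset(['Cheese']): 0.3,
    frozenset(['Diapers']): 0.5,
    frozenset(['Eggs']): 0.7,
    frozenset(['Beer', 'Eggs']): 0.4,
    frozenset(['Bread', 'Eggs']): 0.3,
    frozenset(['Cheese', 'Diapers']): 0.3,
}

def rule_metrics(antecedent, consequent):
    """Return (confidence, lift) for the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    confidence = support[a | c] / support[a]  # how often C appears when A does
    lift = confidence / support[c]            # compared with how often C appears overall
    return confidence, lift

for a, c in [(['Beer'], ['Eggs']), (['Bread'], ['Eggs']), (['Cheese'], ['Diapers'])]:
    conf, lift = rule_metrics(a, c)
    print(f"{a} -> {c}: confidence={conf:.2f}, lift={lift:.4f}")
```

The same formula explains why each rule appears in one direction only. For Eggs → Beer, for instance, the confidence is 0.4 / 0.7, roughly 0.57, which falls below the 0.7 threshold.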

## 6. Explaining One Rule

In [39]:
if not rules.empty:
    chosen_rule = rules.iloc[0]
    print("\nChosen Rule Explanation:")
    print(f"If someone buys {list(chosen_rule['antecedents'])}, they are likely to also buy {list(chosen_rule['consequents'])}")
    print(f"Confidence: {chosen_rule['confidence']:.2f} means this happens {chosen_rule['confidence']*100:.0f}% of the time")
else:
    print("\nNo rules generated with the given thresholds.")



Chosen Rule Explanation:
If someone buys ['Beer'], they are likely to also buy ['Eggs']
Confidence: 0.80 means this happens 80% of the time


One of the rules found from the data is that if someone buys beer, they are likely to also buy eggs. 

This rule has a confidence of 0.80, which means that in 80% of the transactions where Beer was bought, Eggs were also bought.

With a lift of about 1.14, the link between the two items is only slightly stronger than chance, but on real sales data a rule like this could help a supermarket decide to place Beer and Eggs closer together on shelves or run a joint promotion.

It’s a helpful way to understand customer shopping patterns.
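
Note that `rules.iloc[0]` simply takes whichever rule is listed first. Below is a minimal sketch of choosing the rule with the highest lift instead. It uses plain dictionaries that mirror the rules table so that it runs without pandas; the name `rule_rows` is illustrative. In the notebook itself, `rules.sort_values('lift', ascending=False).iloc[0]` should select the same rule.

```python
# The three rules copied from the Association Rules output in section 5
rule_rows = [
    {'antecedents': {'Beer'}, 'consequents': {'Eggs'}, 'confidence': 0.80, 'lift': 1.142857},
    {'antecedents': {'Bread'}, 'consequents': {'Eggs'}, 'confidence': 0.75, 'lift': 1.071429},
    {'antecedents': {'Cheese'}, 'consequents': {'Diapers'}, 'confidence': 1.00, 'lift': 2.000000},
]

# Pick the rule whose items co-occur most strongly relative to chance
strongest = max(rule_rows, key=lambda r: r['lift'])

lhs = ', '.join(sorted(strongest['antecedents']))
rhs = ', '.join(sorted(strongest['consequents']))
print(f"If someone buys {lhs}, they are likely to also buy {rhs}")
print(f"Confidence: {strongest['confidence']:.2f}, lift: {strongest['lift']:.2f}")
```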
