# Market Basket Analysis (MBA)

In [None]:
Market Basket Analysis (MBA) is a data mining technique used by retailers to understand customer purchasing patterns by analyzing transaction details.

It identifies items that are frequently bought together, helping businesses make informed decisions about product placement, cross-selling, and marketing strategies.

# For Example

In [None]:
If customers often buy bread and butter together, a store might place these items near each other to encourage additional sales.

MBA typically uses algorithms such as Apriori to uncover these associations and generate rules that predict customer behavior.
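
The level-wise search that Apriori performs can be sketched in plain Python. This is an illustrative sketch, not the implementation used by any library: the function name `apriori_sketch` and the five toy baskets are assumptions chosen for the example. It relies on the property that every subset of a frequent itemset must itself be frequent, so larger candidates are built only from smaller frequent itemsets.

```python
from itertools import combinations

# Toy baskets, assumed for illustration: one set of items per transaction.
transactions = [
    {"Bread", "Butter", "Milk"},
    {"Bread", "Milk"},
    {"Bread", "Butter"},
    {"Butter", "Milk"},
    {"Bread"},
]


def apriori_sketch(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = sorted({item for t in transactions for item in t})
    current = {frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support}
    frequent = {s: support(s) for s in current}

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Prune step: drop candidates with any infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current
                   for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
        frequent.update({c: support(c) for c in current})
        k += 1
    return frequent


frequent = apriori_sketch(transactions, min_support=0.2)
for itemset in sorted(frequent, key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset), frequent[itemset])
```

The prune step is what keeps the search tractable on larger catalogs: a candidate is counted against the transactions only if all of its smaller subsets already passed the support threshold.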

# How It Works

In [1]:
# Install the necessary library
!pip install mlxtend



Collecting mlxtend
  Downloading mlxtend-0.23.2-py3-none-any.whl.metadata (7.3 kB)
Downloading mlxtend-0.23.2-py3-none-any.whl (1.4 MB)
   ---------------------------------------- 1.4/1.4 MB 3.7 MB/s eta 0:00:00
Installing collected packages: mlxtend
Successfully installed mlxtend-0.23.2





In [12]:
# Import libraries
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Sample data: Create a DataFrame with transaction data
data = {'Transaction': [1, 1, 1, 2, 2, 3, 3, 4, 4, 5],
        'Item': ['Bread', 'Butter', 'Milk', 'Bread', 'Milk', 'Bread', 'Butter', 'Butter', 'Milk', 'Bread']}
df = pd.DataFrame(data)

# Convert transaction data to a basket format
basket = df.groupby(['Transaction', 'Item'])['Item'].count().unstack().reset_index().fillna(0).set_index('Transaction')
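# Binarize the counts: True where the item appears in the transaction.
# (DataFrame.applymap is deprecated in recent pandas, so compare directly.)
basket = basket > 0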

# Apply the Apriori algorithm to identify frequent itemsets
frequent_itemsets = apriori(basket, min_support=0.2, use_colnames=True)

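# Manually compute association rules.
# Note: antecedents are built only from prefixes of each itemset, so this loop
# yields one rule per prefix length rather than every possible split, and which
# rules appear depends on the iteration order of the frozenset.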
rules_list = []

for i in range(len(frequent_itemsets)):
    itemset = frequent_itemsets.iloc[i]['itemsets']
    support = frequent_itemsets.iloc[i]['support']
    
    for j in range(1, len(itemset)):
        antecedents = frozenset([list(itemset)[k] for k in range(j)])
        consequents = itemset - antecedents
        if len(consequents) > 0:
            antecedent_support = frequent_itemsets[frequent_itemsets['itemsets'] == antecedents]['support'].values
            if len(antecedent_support) > 0:
                conf = support / antecedent_support[0]
                consequent_support = frequent_itemsets[frequent_itemsets['itemsets'] == consequents]['support'].values
                if len(consequent_support) > 0:
                    lift = conf / consequent_support[0]
                    rules_list.append({'antecedents': antecedents, 'consequents': consequents, 'support': support, 'confidence': conf, 'lift': lift})

rules = pd.DataFrame(rules_list)

# Display the results
print("Frequent Itemsets:")
print(frequent_itemsets)
print("\nAssociation Rules:")
print(rules)


Frequent Itemsets:
   support               itemsets
0      0.8                (Bread)
1      0.6               (Butter)
2      0.6                 (Milk)
3      0.4        (Butter, Bread)
4      0.4          (Milk, Bread)
5      0.4         (Milk, Butter)
6      0.2  (Milk, Butter, Bread)

Association Rules:
      antecedents      consequents  support  confidence      lift
0        (Butter)          (Bread)      0.4    0.666667  0.833333
1          (Milk)          (Bread)      0.4    0.666667  0.833333
2          (Milk)         (Butter)      0.4    0.666667  1.111111
3          (Milk)  (Butter, Bread)      0.2    0.333333  0.833333
4  (Milk, Butter)          (Bread)      0.2    0.500000  0.625000




In [None]:
Here’s an overview of what the code does:

Install the necessary library: Ensure that mlxtend is installed.

Import libraries: Import the required libraries (pandas and mlxtend.frequent_patterns).

Sample data: Create a sample DataFrame with transaction data.

Convert transaction data to a basket format: Transform the data into a format suitable for the Apriori algorithm, 
where each row represents a transaction, and each column represents an item.

Apply the Apriori algorithm: Identify frequent itemsets with a minimum support threshold.

Manually compute association rules: Generate association rules by manually calculating support, 
confidence, and lift for each itemset.

Display the results: Print the frequent itemsets and association rules.
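
The manual loop in the code cell builds antecedents from prefixes of each itemset, so it lists only some of the possible rules: for example, it reports (Butter) -> (Bread) but not (Bread) -> (Butter). A sketch of one way to enumerate every split with `itertools.combinations` follows. The support values are copied from the Frequent Itemsets output; the function name `all_rules` is an assumption made for illustration.

```python
from itertools import combinations

# Supports copied from the "Frequent Itemsets" output of the notebook.
support = {
    frozenset(["Bread"]): 0.8,
    frozenset(["Butter"]): 0.6,
    frozenset(["Milk"]): 0.6,
    frozenset(["Bread", "Butter"]): 0.4,
    frozenset(["Bread", "Milk"]): 0.4,
    frozenset(["Butter", "Milk"]): 0.4,
    frozenset(["Bread", "Butter", "Milk"]): 0.2,
}


def all_rules(support):
    """Build one rule per (antecedent, consequent) split of each itemset."""
    rules = []
    for itemset, sup in support.items():
        items = sorted(itemset)  # sorted so the rule order is reproducible
        # Antecedent sizes 1 .. len-1 leave a non-empty consequent.
        for size in range(1, len(items)):
            for combo in combinations(items, size):
                antecedent = frozenset(combo)
                consequent = itemset - antecedent
                confidence = sup / support[antecedent]
                lift = confidence / support[consequent]
                rules.append({
                    "antecedents": antecedent,
                    "consequents": consequent,
                    "support": sup,
                    "confidence": confidence,
                    "lift": lift,
                })
    return rules


rules = all_rules(support)
for r in rules:
    print(sorted(r["antecedents"]), "->", sorted(r["consequents"]),
          round(r["confidence"], 4), round(r["lift"], 4))
```

Single items produce no rules because they cannot be split; each pair yields two rules and the three-item set yields six.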

In [None]:
Let's break down the output you're seeing:

Frequent Itemsets
The frequent itemsets are combinations of items that appear frequently together in transactions. 
Here’s what each column represents:

Support: The proportion of transactions that contain the itemset.

Itemsets: The combination of items that appear together.

For example:

(Bread) has a support of 0.8, meaning 80% of transactions contain Bread.

(Butter, Bread) has a support of 0.4, meaning 40% of transactions contain both Butter and Bread.
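
These support values can be checked by hand. The sketch below rebuilds the five transactions from the sample DataFrame as plain Python sets and counts how many contain a given itemset; the helper name `support` is an assumption made for illustration.

```python
# The five sample transactions, rebuilt from the Transaction/Item columns.
transactions = [
    {"Bread", "Butter", "Milk"},  # transaction 1
    {"Bread", "Milk"},            # transaction 2
    {"Bread", "Butter"},          # transaction 3
    {"Butter", "Milk"},           # transaction 4
    {"Bread"},                    # transaction 5
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


print(support({"Bread"}, transactions))            # 0.8
print(support({"Butter", "Bread"}, transactions))  # 0.4
```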

Association Rules
The association rules provide insights into the relationships between items. Here’s what each column represents:

Antecedents: The item(s) on the left-hand side of the rule (if-portion).

Consequents: The item(s) on the right-hand side of the rule (then-portion).

Support: The proportion of transactions that contain both the antecedent and consequent.

Confidence: The probability that the consequent is purchased when the antecedent is purchased.

Lift: The ratio of the observed support to the expected support if the antecedent and consequent were independent.
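
Written as formulas, for a rule A -> B over N transactions, the three measures above take the following form. The symbols A, B, N, and t (a single transaction) are notation introduced here for illustration.

```latex
\[
\operatorname{support}(A \rightarrow B)
  = \frac{\left|\{\, t : A \cup B \subseteq t \,\}\right|}{N}
\]
\[
\operatorname{confidence}(A \rightarrow B)
  = \frac{\operatorname{support}(A \cup B)}{\operatorname{support}(A)}
\]
\[
\operatorname{lift}(A \rightarrow B)
  = \frac{\operatorname{confidence}(A \rightarrow B)}{\operatorname{support}(B)}
  = \frac{\operatorname{support}(A \cup B)}
         {\operatorname{support}(A)\,\operatorname{support}(B)}
\]
```

A lift of 1 corresponds to independence, values above 1 to a positive association, and values below 1 to a negative one.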

For example:

Rule: (Butter) -> (Bread)

Support: 0.4 (40% of transactions contain both Butter and Bread)

Confidence: 0.666667 (If a transaction contains Butter, there is a 66.67% chance it also contains Bread)

Lift: 0.833333 (Bread is 0.83 times as likely to appear in a transaction containing Butter as in a transaction overall; a lift below 1 indicates a negative association)

Rule: (Milk) -> (Bread)

Support: 0.4 (40% of transactions contain both Milk and Bread)

Confidence: 0.666667 (If a transaction contains Milk, there is a 66.67% chance it also contains Bread)

Lift: 0.833333 (Bread is 0.83 times as likely to appear in a transaction containing Milk as in a transaction overall)

Rule: (Milk) -> (Butter)

Support: 0.4 (40% of transactions contain both Milk and Butter)

Confidence: 0.666667 (If a transaction contains Milk, there is a 66.67% chance it also contains Butter)

Lift: 1.111111 (Butter is 1.11 times as likely to appear in a transaction containing Milk as in a transaction overall; a lift above 1 indicates a positive association)

Rule: (Milk) -> (Butter, Bread)

Support: 0.2 (20% of transactions contain Milk, Butter, and Bread)

Confidence: 0.333333 (If a transaction contains Milk, there is a 33.33% chance it also contains both Butter and Bread)

Lift: 0.833333 (The Butter and Bread pair is 0.83 times as likely to appear in a transaction containing Milk as in a transaction overall)

Rule: (Milk, Butter) -> (Bread)

Support: 0.2 (20% of transactions contain Milk, Butter, and Bread)

Confidence: 0.5 (If a transaction contains both Milk and Butter, there is a 50% chance it also contains Bread)

Lift: 0.625 (Bread is 0.625 times as likely to appear in a transaction containing both Milk and Butter as in a transaction overall)
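
The figures for these five rules can be recomputed directly from the raw transactions. The following is a small sketch in plain Python; the helper names `support` and `rule_metrics` are assumptions made for illustration, not part of mlxtend.

```python
# The five sample transactions, rebuilt from the Transaction/Item columns.
transactions = [
    {"Bread", "Butter", "Milk"},
    {"Bread", "Milk"},
    {"Bread", "Butter"},
    {"Butter", "Milk"},
    {"Bread"},
]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent):
    """Return (support, confidence, lift) for antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    sup = support(a | c)          # both sides present in the transaction
    conf = sup / support(a)       # P(consequent | antecedent)
    lift = conf / support(c)      # relative to the consequent's base rate
    return sup, conf, lift


# The five rules listed in the Association Rules output.
listed_rules = [
    ({"Butter"}, {"Bread"}),
    ({"Milk"}, {"Bread"}),
    ({"Milk"}, {"Butter"}),
    ({"Milk"}, {"Butter", "Bread"}),
    ({"Milk", "Butter"}, {"Bread"}),
]

for a, c in listed_rules:
    sup, conf, lift = rule_metrics(a, c)
    print(f"{sorted(a)} -> {sorted(c)}: "
          f"support={sup:.2f}, confidence={conf:.4f}, lift={lift:.4f}")
```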

Summary
Support tells us how frequently the itemsets appear in the transactions.

Confidence tells us the likelihood of the consequent item(s) being bought when the antecedent item(s) are bought.

Lift tells us how much more likely (lift above 1) or less likely (lift below 1) the consequent item(s) are to be bought when the antecedent item(s) are bought, compared with what we would expect if the two were independent.

The association rules help businesses understand which items are likely to be bought together, 
aiding in product placement, cross-selling, and marketing strategies.
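
As a rough illustration of how such rules might be screened before acting on them, the sketch below keeps only rules whose lift exceeds 1. The rule values are copied from the Association Rules output; the helper name `positively_associated` and the threshold of 1.0 are assumptions made for illustration, and a real analysis would likely also set minimum support and confidence levels.

```python
# Rules copied from the "Association Rules" output of the notebook.
rules = [
    {"antecedents": ("Butter",), "consequents": ("Bread",),
     "support": 0.4, "confidence": 0.666667, "lift": 0.833333},
    {"antecedents": ("Milk",), "consequents": ("Bread",),
     "support": 0.4, "confidence": 0.666667, "lift": 0.833333},
    {"antecedents": ("Milk",), "consequents": ("Butter",),
     "support": 0.4, "confidence": 0.666667, "lift": 1.111111},
    {"antecedents": ("Milk",), "consequents": ("Butter", "Bread"),
     "support": 0.2, "confidence": 0.333333, "lift": 0.833333},
    {"antecedents": ("Milk", "Butter"), "consequents": ("Bread",),
     "support": 0.2, "confidence": 0.500000, "lift": 0.625000},
]


def positively_associated(rules, min_lift=1.0):
    """Keep rules whose lift is strictly above min_lift."""
    return [r for r in rules if r["lift"] > min_lift]


selected = positively_associated(rules)
for r in selected:
    print(r["antecedents"], "->", r["consequents"], "lift:", r["lift"])
```

On this sample only (Milk) -> (Butter) passes the filter, which matches the lift values in the table: every other listed rule has a lift below 1.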

# Hence we conclude here