<a href="https://colab.research.google.com/github/raviteja-padala/Business_Analytics/blob/main/Association_rules_Practice.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Implementing Association-Based Strategies

## Objective:
The objective of this analysis is to use transaction data to identify product affinities, that is, products that are frequently purchased together. This insight will enable us to develop targeted cross-promotion strategies that leverage these affinities to enhance sales and customer satisfaction. We set the minimum support at 33 percent and the minimum confidence at 50 percent so that only itemsets and rules meeting both thresholds are reported.

## Association rules
Association rules are a data mining technique used to uncover relationships, patterns, and associations within large datasets. These relationships highlight which items or events tend to occur together in transactions or events. Association rule mining is commonly applied to transactional data, such as customer purchase histories, web clickstreams, and more, to reveal hidden insights that can guide decision-making and strategy development.

The fundamental concept behind association rule mining is the discovery of rules of the form "If X, then Y," where X and Y are sets of items. These rules help us understand which items are frequently purchased or accessed together. The strength of an association rule is measured by metrics like support, confidence, and lift.

**Support**: The support of an itemset is the proportion of transactions in which the itemset appears. It indicates how frequently an itemset occurs in the dataset.

**Confidence**: The confidence of a rule "X → Y" measures the likelihood that itemset Y is purchased given that itemset X is purchased. It's calculated as the proportion of transactions containing both X and Y over the transactions containing X.

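**Lift**: The lift of a rule "X → Y" measures how much more likely Y is to be purchased when X is purchased, compared with Y's overall purchase rate. It's calculated as the ratio of the confidence of the rule to the support of Y. A lift above 1 indicates a positive association, a lift of 1 indicates independence, and a lift below 1 indicates a negative association.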
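To make the three definitions concrete, here is a minimal sketch in plain Python. The `baskets` data and the helper names `support`, `confidence`, and `lift` are hypothetical and exist only for illustration; they are not the transactions or functions used later in this notebook.

```python
# Hypothetical toy baskets, used only to illustrate the three metrics.
baskets = [
    {"Tea", "Sugar"},
    {"Tea", "Sugar", "Lemon"},
    {"Tea"},
    {"Sugar"},
    {"Lemon"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(x, y, baskets):
    """support(X and Y together) divided by support(X)."""
    return support(set(x) | set(y), baskets) / support(x, baskets)


def lift(x, y, baskets):
    """confidence(X -> Y) divided by support(Y)."""
    return confidence(x, y, baskets) / support(y, baskets)


print("support(Tea, Sugar):", support({"Tea", "Sugar"}, baskets))
print("confidence(Tea -> Sugar):", confidence({"Tea"}, {"Sugar"}, baskets))
print("lift(Tea -> Sugar):", lift({"Tea"}, {"Sugar"}, baskets))
```

Here "Tea" and "Sugar" appear together in 2 of 5 baskets, and each appears in 3 of 5, so the lift comes out slightly above 1.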

# Steps involved in executing association rule mining.


**1. Data Preparation**


**2. Data Preprocessing**


**3. Itemset Generation**


**4. Rule Generation**


**5. Rule Evaluation**


**6. Interpretation**


**7. Strategy Formulation**



# 1. Data Preparation



In [None]:
import pandas as pd  # to create the DataFrame
import warnings  # to suppress warnings

# Ignore all warnings, including DeprecationWarning, to keep the notebook output readable.
warnings.filterwarnings('ignore')

data = {
    'Transactions_Id': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    'Item-1': ['Milk', 'Milk', 'Bread', 'Milk', 'Bread', 'Milk', 'Milk', 'Milk', 'Bread', 'Milk', 'Milk', 'Milk'],
    'Item-2': ['Egg', 'Butter', 'Butter', 'Bread', 'Butter', 'Bread', 'Cookies', 'Bread', 'Butter', 'Butter', 'Bread', 'Bread'],
    'Item-3': ['Bread', 'Egg', 'Ketchup', 'Butter', 'Cookies', 'Butter', None, 'Butter', 'Egg', 'Bread', 'Butter', 'Cookies'],
    'Item-4': ['Butter', 'Ketchup', None, None, None, 'Cookies', None, None, 'Cookies', None, None, 'Ketchup']
}

df = pd.DataFrame(data)
transactions = df.drop('Transactions_Id', axis=1)

df

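| | Transactions_Id | Item-1 | Item-2 | Item-3 | Item-4 |
|---|---|---|---|---|---|
| 0 | 1 | Milk | Egg | Bread | Butter |
| 1 | 2 | Milk | Butter | Egg | Ketchup |
| 2 | 3 | Bread | Butter | Ketchup | |
| 3 | 4 | Milk | Bread | Butter | |
| 4 | 5 | Bread | Butter | Cookies | |
| 5 | 6 | Milk | Bread | Butter | Cookies |
| 6 | 7 | Milk | Cookies | | |
| 7 | 8 | Milk | Bread | Butter | |
| 8 | 9 | Bread | Butter | Egg | Cookies |
| 9 | 10 | Milk | Butter | Bread | |
| 10 | 11 | Milk | Bread | Butter | |
| 11 | 12 | Milk | Bread | Cookies | Ketchup |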


# 2. Data Preprocessing

In [None]:
# Step 2: Data Preprocessing
# Turn each row into a list of its items, dropping the None placeholders for empty slots.
item_lists = [list(filter(None, row)) for _, row in transactions.iterrows()]
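To make the effect of this step concrete, the sketch below applies the same idea in plain Python to three rows taken from the table above. The names `rows` and `item_lists_sketch` are illustrative and not part of the notebook.

```python
# Three rows copied from the transaction table; None marks an empty item slot.
rows = [
    ("Milk", "Egg", "Bread", "Butter"),
    ("Bread", "Butter", "Ketchup", None),
    ("Milk", "Cookies", None, None),
]

# Keep only the real items in each row, preserving their order.
item_lists_sketch = [[item for item in row if item is not None] for row in rows]

for basket in item_lists_sketch:
    print(basket)
```

Each transaction becomes a variable-length list, which is the shape that transaction encoders typically expect.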

# 3. Itemset Generation



In [None]:
from mlxtend.frequent_patterns import apriori, association_rules

# Convert the preprocessed item_lists into a one-hot encoded DataFrame.
# Note: pd.DataFrame(item_lists) has one column per item position (0-3), so get_dummies creates
# one indicator column per (position, item) pair; the same item name can appear in several columns.
encoded_df = pd.get_dummies(pd.DataFrame(item_lists), prefix='', prefix_sep='')

# Use the Apriori algorithm to find frequent itemsets in the encoded DataFrame.
# The min_support parameter sets the minimum support threshold for an itemset to be considered frequent.
# Here, we use a minimum support of 0.33, which means an itemset must appear in at least 33% of the transactions to be considered frequent.
# The use_colnames parameter specifies that the column names should be used as itemset names.
frequent_itemsets = apriori(encoded_df, min_support=0.33, use_colnames=True)
frequent_itemsets

| | support | itemsets |
|---|---|---|
| 0 | 0.75 | (Milk) |
| 1 | 0.416667 | (Bread) |
| 2 | 0.416667 | (Butter) |
| 3 | 0.333333 | (Butter) |
| 4 | 0.416667 | (Bread, Milk) |
| 5 | 0.333333 | (Milk, Butter) |
| 6 | 0.333333 | (Bread, Butter) |
| 7 | 0.333333 | (Bread, Milk, Butter) |
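One thing worth checking at this step is that each item maps to exactly one column. The two separate `(Butter)` rows in the table above suggest that the encoding produced more than one column with the same name, which can happen when indicators are created per item position rather than per item. As a hedged sketch in plain Python (the names `transactions_sketch`, `encoded_rows`, and `item_support` are illustrative and not part of the notebook), a per-item encoding could look like this. The supports it prints are computed per item and need not match the table above.

```python
# The twelve transactions from the Data Preparation step, with empty slots removed.
transactions_sketch = [
    ["Milk", "Egg", "Bread", "Butter"],
    ["Milk", "Butter", "Egg", "Ketchup"],
    ["Bread", "Butter", "Ketchup"],
    ["Milk", "Bread", "Butter"],
    ["Bread", "Butter", "Cookies"],
    ["Milk", "Bread", "Butter", "Cookies"],
    ["Milk", "Cookies"],
    ["Milk", "Bread", "Butter"],
    ["Bread", "Butter", "Egg", "Cookies"],
    ["Milk", "Butter", "Bread"],
    ["Milk", "Bread", "Butter"],
    ["Milk", "Bread", "Cookies", "Ketchup"],
]

# One column per distinct item, regardless of the position it occupied in the row.
items = sorted({item for basket in transactions_sketch for item in basket})
encoded_rows = [{item: item in basket for item in items} for basket in transactions_sketch]

# Support of a single item = share of transactions whose indicator is True.
item_support = {
    item: sum(row[item] for row in encoded_rows) / len(encoded_rows) for item in items
}

for item, value in item_support.items():
    print(f"{item}: {value:.3f}")
```

mlxtend also ships a `TransactionEncoder` class in `mlxtend.preprocessing` that produces this kind of per-item boolean matrix from a list of lists, which may be a more convenient route inside the notebook.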


# 4. Rule Generation


In [None]:
# 'metric' specifies the evaluation metric for generating rules, which is set to 'confidence' in this case.
# 'min_threshold' sets the minimum confidence threshold for the generated rules, which is set to 50%.
association_rules_df = association_rules(frequent_itemsets, metric='confidence', min_threshold=0.5)
association_rules_df

| | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction | zhangs_metric |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | (Bread) | (Milk) | 0.416667 | 0.75 | 0.416667 | 1.0 | 1.333333 | 0.104167 | inf | 0.428571 |
| 1 | (Milk) | (Bread) | 0.75 | 0.416667 | 0.416667 | 0.555556 | 1.333333 | 0.104167 | 1.3125 | 1.0 |
| 2 | (Butter) | (Milk) | 0.333333 | 0.75 | 0.333333 | 1.0 | 1.333333 | 0.083333 | inf | 0.375 |
| 3 | (Bread) | (Butter) | 0.416667 | 0.333333 | 0.333333 | 0.8 | 2.4 | 0.194444 | 3.333333 | 1.0 |
| 4 | (Butter) | (Bread) | 0.333333 | 0.416667 | 0.333333 | 1.0 | 2.4 | 0.194444 | inf | 0.875 |
| 5 | (Milk, Bread) | (Butter) | 0.416667 | 0.333333 | 0.333333 | 0.8 | 2.4 | 0.194444 | 3.333333 | 1.0 |
| 6 | (Bread, Butter) | (Milk) | 0.333333 | 0.75 | 0.333333 | 1.0 | 1.333333 | 0.083333 | inf | 0.375 |
| 7 | (Milk, Butter) | (Bread) | 0.333333 | 0.416667 | 0.333333 | 1.0 | 2.4 | 0.194444 | inf | 0.875 |
| 8 | (Bread) | (Milk, Butter) | 0.416667 | 0.333333 | 0.333333 | 0.8 | 2.4 | 0.194444 | 3.333333 | 1.0 |
| 9 | (Butter) | (Milk, Bread) | 0.333333 | 0.416667 | 0.333333 | 1.0 | 2.4 | 0.194444 | inf | 0.875 |
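The metric columns in this table can all be derived from three numbers: the support of the antecedent, the support of the consequent, and the support of both together. The sketch below is a plain-Python illustration of the standard formulas, not mlxtend's own code; `rule_metrics` is a hypothetical helper name. It is applied to the supports shown in rows 3 and 0 of the table.

```python
import math


def rule_metrics(support_a, support_c, support_ac):
    """Metrics for a rule A -> C from the supports of A, C, and A together with C."""
    confidence = support_ac / support_a
    lift = confidence / support_c
    leverage = support_ac - support_a * support_c
    # Conviction divides by (1 - confidence); at 100% confidence it is reported as infinity.
    if math.isclose(confidence, 1.0):
        conviction = math.inf
    else:
        conviction = (1 - support_c) / (1 - confidence)
    # Zhang's metric ranges from -1 (negative association) to 1 (positive association).
    denominator = max(support_ac * (1 - support_a), support_a * (support_c - support_ac))
    zhangs_metric = leverage / denominator if denominator else 0.0
    return {
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
        "zhangs_metric": zhangs_metric,
    }


# Row 3: (Bread) -> (Butter), with supports 5/12, 4/12, and 4/12 as listed in the table.
metrics = rule_metrics(5 / 12, 4 / 12, 4 / 12)
# Row 0: (Bread) -> (Milk), with supports 5/12, 9/12, and 5/12 as listed in the table.
metrics_row0 = rule_metrics(5 / 12, 9 / 12, 5 / 12)

for name, value in metrics.items():
    print(f"{name}: {value:.6f}")
```

Working through one or two rows by hand in this way is a quick check that a rule table has been read correctly before it is interpreted.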


# 5. Rule Evaluation



In [None]:
# Sort the association rules by a specific metric, e.g., lift, confidence, support.
sorted_rules = association_rules_df.sort_values(by='lift', ascending=False)

# Display the sorted association rules.
sorted_rules

| | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction | zhangs_metric |
|---|---|---|---|---|---|---|---|---|---|---|
| 3 | (Bread) | (Butter) | 0.416667 | 0.333333 | 0.333333 | 0.8 | 2.4 | 0.194444 | 3.333333 | 1.0 |
| 4 | (Butter) | (Bread) | 0.333333 | 0.416667 | 0.333333 | 1.0 | 2.4 | 0.194444 | inf | 0.875 |
| 5 | (Milk, Bread) | (Butter) | 0.416667 | 0.333333 | 0.333333 | 0.8 | 2.4 | 0.194444 | 3.333333 | 1.0 |
| 7 | (Milk, Butter) | (Bread) | 0.333333 | 0.416667 | 0.333333 | 1.0 | 2.4 | 0.194444 | inf | 0.875 |
| 8 | (Bread) | (Milk, Butter) | 0.416667 | 0.333333 | 0.333333 | 0.8 | 2.4 | 0.194444 | 3.333333 | 1.0 |
| 9 | (Butter) | (Milk, Bread) | 0.333333 | 0.416667 | 0.333333 | 1.0 | 2.4 | 0.194444 | inf | 0.875 |
| 0 | (Bread) | (Milk) | 0.416667 | 0.75 | 0.416667 | 1.0 | 1.333333 | 0.104167 | inf | 0.428571 |
| 1 | (Milk) | (Bread) | 0.75 | 0.416667 | 0.416667 | 0.555556 | 1.333333 | 0.104167 | 1.3125 | 1.0 |
| 2 | (Butter) | (Milk) | 0.333333 | 0.75 | 0.333333 | 1.0 | 1.333333 | 0.083333 | inf | 0.375 |
| 6 | (Bread, Butter) | (Milk) | 0.333333 | 0.75 | 0.333333 | 1.0 | 1.333333 | 0.083333 | inf | 0.375 |
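Several rules in this table tie on lift (six at 2.4, four at 1.333333), so their relative order depends on how ties are broken. One option, sketched here in plain Python on four rows copied from the table, is to sort by lift first and confidence second. The name `rules_sketch` and the tuple layout are illustrative only.

```python
# Four rows copied from the table as (antecedent, consequent, confidence, lift).
rules_sketch = [
    ("Bread", "Milk", 1.0, 1.333333),
    ("Milk", "Bread", 0.555556, 1.333333),
    ("Bread", "Butter", 0.8, 2.4),
    ("Butter", "Bread", 1.0, 2.4),
]

# Sort by lift, then by confidence, both descending, so ties on lift are broken consistently.
ranked = sorted(rules_sketch, key=lambda rule: (rule[3], rule[2]), reverse=True)

for antecedent, consequent, conf, lift_value in ranked:
    print(f"{antecedent} -> {consequent}: lift={lift_value}, confidence={conf}")
```

In pandas, the same ordering can be requested by passing a list of columns, for example `sort_values(by=['lift', 'confidence'], ascending=False)`.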


# 6. Interpretation



Interpretation of the provided association rules:

1. If a customer purchases "Bread," there's an 80% likelihood (confidence) that they will also purchase "Butter." The lift value of 2.4 means the two items are bought together 2.4 times as often as would be expected if they were independent. The conviction value of 3.33 means the rule would fail about 3.33 times as often if the items were unrelated, and the zhangs_metric value of 1, the top of its -1 to 1 range, indicates a strong positive association.

2. If a customer purchases "Butter," there's a 100% likelihood (confidence) that they will also purchase "Bread." The lift value of 2.4 again means the pair occurs 2.4 times as often as expected under independence. The infinite conviction value reflects the 100% confidence (the rule has no counterexamples in this data), and the zhangs_metric value of 0.875 indicates a strong positive association.

3. If a customer purchases "Milk" and "Bread," there's an 80% likelihood (confidence) that they will also purchase "Butter." The lift value of 2.4 indicates a positive association, and the zhangs_metric value of 1 indicates a strong positive association.

4. If a customer purchases "Milk" and "Butter," there's a 100% likelihood (confidence) that they will also purchase "Bread." The lift value of 2.4 indicates a positive association, and the zhangs_metric value of 0.875 indicates a strong positive association.

5. If a customer purchases "Bread," there's an 80% likelihood (confidence) that they will also purchase "Milk" and "Butter." The lift value of 2.4 indicates a positive association, and the zhangs_metric value of 1 indicates a strong positive association.

6. If a customer purchases "Butter," there's a 100% likelihood (confidence) that they will also purchase "Milk" and "Bread." The lift value of 2.4 indicates a positive association, and the zhangs_metric value of 0.875 indicates a strong positive association.

7. If a customer purchases "Bread," there's a 100% likelihood (confidence) that they will also purchase "Milk." The lift value of 1.333 indicates a modest positive association, because "Milk" is already present in 75% of transactions, and the zhangs_metric value of 0.428571 indicates a moderate positive association.

8. If a customer purchases "Milk," there's a 55.6% likelihood (confidence) that they will also purchase "Bread." The lift value of 1.333 indicates a modest positive association. The conviction value of 1.3125 is only slightly above 1, so the rule fails often, while the zhangs_metric value of 1 indicates a strong positive association.

9. If a customer purchases "Butter," there's a 100% likelihood (confidence) that they will also purchase "Milk." The lift value of 1.333 indicates a modest positive association, and the zhangs_metric value of 0.375 indicates a moderate positive association.

10. If a customer purchases "Bread" and "Butter," there's a 100% likelihood (confidence) that they will also purchase "Milk." The lift value of 1.333 indicates a modest positive association, and the zhangs_metric value of 0.375 indicates a moderate positive association.

These interpretations provide insights into the relationships between products and the likelihood of their co-purchases based on the association rules generated from the dataset.

# 7. Strategy Formulation

Strategy formulation based on the provided association rules:

1. **Cross-Promotion Bundle: Bread and Butter**
   - Strategy: Create a cross-promotion bundle of "Bread" and "Butter" to encourage customers to purchase both items together.
   - Rationale: The association rule shows that customers who buy "Bread" are 80% likely to also purchase "Butter." Leveraging this affinity, the bundle can be marketed as a convenient pairing for daily consumption.

2. **Complementary Pair: Milk and Bread**
   - Strategy: Promote the complementary relationship between "Milk" and "Bread" to increase the likelihood of cross-purchases.
   - Rationale: The rule indicates that customers who purchase "Milk" are 55.6% likely to also buy "Bread." Marketing these items together can enhance customer convenience and encourage larger basket sizes.

3. **Combo Offer: Milk, Bread, and Butter**
   - Strategy: Introduce a combo offer for "Milk," "Bread," and "Butter" to cater to customers who tend to buy these items together.
   - Rationale: Multiple rules suggest a strong affinity between these three items, with confidence levels of 80% and above. A bundled offer can incentivize customers to purchase the trio, boosting sales.

4. **Cross-Category Promotion: Milk and Cookies**
   - Strategy: Promote "Milk" alongside "Cookies" to create a cross-category promotion.
   - Rationale: Although not directly linked in the rules, a potential marketing opportunity exists due to the popularity of "Milk" as a beverage choice for consuming cookies. Positioning these items together can enhance customer satisfaction.

5. **Promote Bread as Base Ingredient**
   - Strategy: Position "Bread" as a base ingredient in various recipes that involve "Butter," "Egg," and other ingredients.
   - Rationale: The rules link "Bread" with "Butter" and "Milk"; "Egg" does not appear in any rule, so including it is a merchandising assumption rather than a finding. By showcasing how "Bread" can complement a variety of meals, you can encourage customers to explore different uses for this staple.

6. **Customer Education on Complementary Items**
   - Strategy: Launch an educational campaign to inform customers about complementary food pairings.
   - Rationale: The association rules indicate specific combinations that customers might not naturally consider. Educating them about these pairings can encourage them to try new combinations, increasing basket sizes.

7. **Seasonal Promotions: Bread and Ketchup**
   - Strategy: Introduce seasonal promotions featuring "Bread" and "Ketchup," targeting barbecue and picnic seasons.
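   - Rationale: No rule in the table involves "Ketchup," which appears in only 3 of the 12 transactions (25%) and so falls below the 33% minimum support. This pairing is therefore a hypothesis to test rather than a finding. Aligning the promotion with relevant occasions is one way to test whether customers respond to it during specific periods.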

8. **Personalized Recommendations**
   - Strategy: Implement personalized product recommendations on your online platform.
   - Rationale: Utilize the association rules to suggest relevant products to customers based on their previous purchases. For instance, if a customer adds "Bread" to their cart, recommend complementary items like "Butter" or "Milk."

These strategies leverage the insights gained from the association rules to promote cross-purchases, bundle offerings, and complementary items, enhancing the customer shopping experience and increasing overall sales.
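As an illustration of the personalized-recommendation strategy, the following plain-Python sketch looks up rules whose antecedent is already in the cart and ranks the suggested items by confidence. The function `recommend`, the list `rules_for_recs`, and the `top_n` parameter are hypothetical names; the confidences are copied from the single-consequent rows of the rule table, and a production recommender would likely need more than this.

```python
# Single-consequent rules copied from the rule table: (antecedent items, consequent, confidence).
rules_for_recs = [
    (frozenset({"Bread"}), "Butter", 0.8),
    (frozenset({"Bread"}), "Milk", 1.0),
    (frozenset({"Butter"}), "Bread", 1.0),
    (frozenset({"Milk"}), "Bread", 0.555556),
    (frozenset({"Milk", "Bread"}), "Butter", 0.8),
]


def recommend(cart, rules, top_n=3):
    """Suggest items not yet in the cart, ranked by the best confidence of any matching rule."""
    cart = set(cart)
    best = {}
    for antecedent, consequent, conf in rules:
        # A rule applies when all of its antecedent items are already in the cart.
        if antecedent <= cart and consequent not in cart:
            best[consequent] = max(conf, best.get(consequent, 0.0))
    # Highest confidence first; ties are broken alphabetically for a stable order.
    return sorted(best, key=lambda item: (-best[item], item))[:top_n]


print(recommend({"Bread"}, rules_for_recs))
print(recommend({"Milk", "Bread"}, rules_for_recs))
```

Ranking by confidence is only one choice; ranking by lift would favor items that are not already popular on their own.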

## Conclusion:

 In this analysis, we identified several strong associations between products, such as "Bread" and "Butter," "Milk" and "Bread," and "Milk," "Bread," and "Butter." These findings suggest clear opportunities for cross-selling and bundle promotions. Incorporating product affinity analysis into marketing strategies can yield substantial benefits for businesses. By understanding which items are frequently purchased together, companies can create more effective cross-promotion strategies that capitalize on these affinities.

## Thank you for reading till the end

## **-Raviteja**
https://www.linkedin.com/in/raviteja-padala/