# ASSOCIATION RULES

The objective of this assignment is to introduce students to rule mining techniques, with a focus on market basket analysis, and to provide hands-on experience.

## Dataset:
Use the Online Retail dataset to mine association rules.

## Data Preprocessing:
Pre-process the dataset so that it is suitable for association rule mining. This may include handling missing values, removing duplicates, and converting the data to an appropriate format.


In [45]:
import pandas as pd

In [46]:
df = pd.read_excel('./Online retail.xlsx')
df

| | Basket |
|---|---|
| 0 | shrimp,almonds,avocado,vegetables mix,green gr... |
| 1 | burgers,meatballs,eggs |
| 2 | chutney |
| 3 | turkey,avocado |
| 4 | mineral water,milk,energy bar,whole wheat rice... |
| ... | ... |
| 7496 | butter,light mayo,fresh bread |
| 7497 | burgers,frozen vegetables,eggs,french fries,ma... |
| 7498 | chicken |
| 7499 | escalope,green tea |


In [47]:
# Checking for Missing Values
df.isnull().sum()

Basket    0
dtype: int64

In [48]:
# Checking for Duplicates
df.duplicated().sum()

2325

In [49]:
# Dropping Duplicate Values
df.drop_duplicates(inplace=True)

In [50]:
df.shape

(5176, 1)
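
Whether to drop repeated baskets is a judgment call. The data has a single `Basket` column and no transaction identifier, so identical rows may be separate purchases of the same items rather than data-entry duplicates. The sketch below uses a small hypothetical frame (`df_demo`) and a hypothetical helper (`item_support`) to illustrate how removing repeats can shift an item's support. It is an illustration under those assumptions, not a statement about the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the real data: three identical baskets plus one other.
df_demo = pd.DataFrame({"Basket": ["chicken", "chicken", "chicken", "turkey,avocado"]})

def item_support(frame, item):
    """Fraction of baskets in `frame` that contain `item` (hypothetical helper)."""
    contains = frame["Basket"].apply(lambda basket: item in basket.split(","))
    return contains.mean()

# Support of 'chicken' when every row counts as a transaction
support_all = item_support(df_demo, "chicken")

# Support of 'chicken' after identical baskets are collapsed into one row
support_dedup = item_support(df_demo.drop_duplicates(), "chicken")

print(support_all, support_dedup)
```

If repeated baskets are genuine transactions, keeping them preserves the purchase frequencies that support, confidence, and lift are computed from.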

In [51]:
df.dtypes

Basket    object
dtype: object

In [52]:
# Converting each comma-separated basket string into a list of items
transactions = df['Basket'].str.split(',').tolist()
display(transactions[:5])

[['shrimp',
  'almonds',
  'avocado',
  'vegetables mix',
  'green grapes',
  'whole weat flour',
  'yams',
  'cottage cheese',
  'energy drink',
  'tomato juice',
  'low fat yogurt',
  'green tea',
  'honey',
  'salad',
  'mineral water',
  'salmon',
  'antioxydant juice',
  'frozen smoothie',
  'spinach',
  'olive oil'],
 ['burgers', 'meatballs', 'eggs'],
 ['chutney'],
 ['turkey', 'avocado'],
 ['mineral water', 'milk', 'energy bar', 'whole wheat rice', 'green tea']]
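
A plain `split(',')` keeps any stray whitespace around item names and any item repeated within a basket. The sketch below shows one way to normalise baskets while splitting. The frame `df_demo` and the helper `basket_to_items` are hypothetical names used only for illustration, and whether the real data needs this cleaning is an assumption.

```python
import pandas as pd

# Hypothetical two-row stand-in for the real `df`.
df_demo = pd.DataFrame({"Basket": ["shrimp, almonds ,avocado", "burgers,eggs,eggs"]})

def basket_to_items(basket):
    """Split a basket string, trim whitespace, and drop empty or repeated items."""
    items = (item.strip() for item in basket.split(","))
    # dict.fromkeys removes repeats while keeping first-seen order
    return list(dict.fromkeys(item for item in items if item))

transactions_demo = df_demo["Basket"].apply(basket_to_items).tolist()
print(transactions_demo)
```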

In [53]:
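# Converting the transactions list into a DataFrame (one column per basket position)
data = pd.DataFrame(transactions)

# One-hot encoding the items as boolean columns.
# Note: get_dummies encodes each position column separately, so an item that
# appears at several positions produces several columns with the same name.
encoded_data = pd.get_dummies(data, prefix='', prefix_sep='')
encoded_data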

Unnamed: 0,almonds,antioxydant juice,asparagus,avocado,babies food,bacon,barbecue sauce,black tea,blueberries,body spray,...,antioxydant juice.1,french fries,frozen smoothie,frozen smoothie.1,protein bar,spinach,cereals,mayonnaise,spinach.1,olive oil
0,False,False,False,False,False,False,False,False,False,False,...,True,False,False,True,False,False,False,False,True,True
1,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
2,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
3,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
5171,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
5172,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
5173,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
5174,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
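
The encoded table above contains repeated item columns (for example `antioxydant juice` and `antioxydant juice.1`, `spinach` and `spinach.1`). Apriori treats each column as a distinct item, so repeated columns can distort support counts. A minimal sketch of an encoding that yields exactly one column per item is shown below, using pandas' `Series.str.get_dummies` on a small hypothetical frame (`df_demo`). mlxtend's `TransactionEncoder` is another option for the same step.

```python
import pandas as pd

# Hypothetical stand-in for the real `df`; note 'spinach' repeats in the first basket.
df_demo = pd.DataFrame(
    {"Basket": ["spinach,olive oil,spinach", "burgers,eggs", "olive oil,eggs"]}
)

# Split each basket on commas and build one indicator column per distinct item,
# then cast the 0/1 integers to booleans as expected by apriori.
encoded_demo = df_demo["Basket"].str.get_dummies(sep=",").astype(bool)

print(encoded_demo.columns.tolist())
print(encoded_demo.columns.is_unique)
```

With the real data, the equivalent call would be applied to `df['Basket']`, and the resulting frame passed to `apriori` in place of `encoded_data`.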


## Association Rule Mining:

*	Implement the Apriori algorithm using a tool such as Python, with libraries such as pandas and mlxtend.
*	Apply association rule mining techniques to the pre-processed dataset to discover interesting relationships between products purchased together.
*	Set appropriate thresholds for support, confidence, and lift to extract meaningful rules.
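
To make the three thresholds concrete, the sketch below computes them by hand on a hypothetical four-basket example. Support is the fraction of baskets containing an itemset, confidence of a rule A → C is support(A ∪ C) / support(A), and lift is confidence / support(C). The function names and toy baskets are illustrative assumptions and are not part of mlxtend.

```python
# Hypothetical toy baskets, each stored as a set of items.
transactions_demo = [
    {"ground beef", "spaghetti"},
    {"ground beef", "spaghetti", "mineral water"},
    {"mineral water"},
    {"spaghetti"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)

def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    support_a = support(antecedent, transactions)
    support_c = support(consequent, transactions)
    support_ac = support(set(antecedent) | set(consequent), transactions)
    confidence = support_ac / support_a
    lift = confidence / support_c
    return {"support": support_ac, "confidence": confidence, "lift": lift}

metrics = rule_metrics({"ground beef"}, {"spaghetti"}, transactions_demo)
print(metrics)
```

Because support(A ∪ C) can never exceed support(A), confidence always lies between 0 and 1. A lift above 1 means the consequent appears more often alongside the antecedent than it does in baskets overall.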


In [84]:
# Applying Apriori algorithm to find frequent itemsets
from mlxtend.frequent_patterns import apriori

frequent_items = apriori(encoded_data, min_support=0.01, use_colnames=True)

In [85]:
# Generating Association Rules
from mlxtend.frequent_patterns import association_rules

rules = association_rules(frequent_items, metric='lift', min_threshold=1.2)

# Sorting rules by lift in descending order:
rules = rules.sort_values(by='lift', ascending=False)

| | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction | zhangs_metric |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | (ground beef) | (spaghetti) | 0.011012 | 0.011978 | 0.010819 | 0.982456 | 82.019242 | 0.010687 | 56.317233 | 0.998807 |
| 1 | (spaghetti) | (ground beef) | 0.011978 | 0.011012 | 0.010819 | 0.903226 | 82.019242 | 0.010687 | 10.219539 | 0.999784 |
| 2 | (shrimp) | (frozen vegetables) | 0.02956 | 0.016808 | 0.01449 | 0.490196 | 29.163849 | 0.013993 | 1.928568 | 0.995126 |
| 3 | (frozen vegetables) | (shrimp) | 0.016808 | 0.02956 | 0.01449 | 0.862069 | 29.163849 | 0.013993 | 7.035694 | 0.98222 |
| 4 | (spaghetti) | (mineral water) | 0.011978 | 0.016229 | 0.012172 | 1.016129 | 62.612903 | 0.011977 | inf | 0.995959 |
| 5 | (mineral water) | (spaghetti) | 0.016229 | 0.011978 | 0.012172 | 0.75 | 62.612903 | 0.011977 | 3.952087 | 1.000262 |
| 6 | (turkey) | (burgers) | 0.081144 | 0.01507 | 0.014297 | 0.17619 | 11.691819 | 0.013074 | 1.19558 | 0.995227 |
| 7 | (burgers) | (turkey) | 0.01507 | 0.081144 | 0.014297 | 0.948718 | 11.691819 | 0.013074 | 17.917697 | 0.928462 |
| 8 | (ground beef) | (mineral water) | 0.011012 | 0.016229 | 0.011978 | 1.087719 | 67.024227 | 0.0118 | inf | 0.996049 |
| 9 | (mineral water) | (ground beef) | 0.016229 | 0.011012 | 0.011978 | 0.738095 | 67.024227 | 0.0118 | 3.776135 | 1.00133 |
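
Two of the rules above, (spaghetti) → (mineral water) and (ground beef) → (mineral water), report a confidence above 1 and a joint support larger than the antecedent support. Neither can occur when every item maps to a single column, so the input encoding is worth rechecking before interpreting these rows. The sketch below shows one way to flag such rows and then filter on all three thresholds. The frame `rules_demo` and the cut-off values are hypothetical, chosen only to illustrate the checks.

```python
import pandas as pd

# Hypothetical rules table using the same column names as the mlxtend output.
rules_demo = pd.DataFrame({
    "antecedent support": [0.20, 0.10, 0.30],
    "consequent support": [0.25, 0.40, 0.20],
    "support": [0.10, 0.12, 0.03],
    "confidence": [0.50, 1.20, 0.10],
    "lift": [2.0, 3.0, 0.5],
})

# Joint support cannot exceed the antecedent's support, so confidence must be <= 1.
inconsistent = rules_demo[
    (rules_demo["confidence"] > 1)
    | (rules_demo["support"] > rules_demo["antecedent support"])
]

# Keep only consistent rows, then apply illustrative thresholds on all three metrics.
valid = rules_demo.drop(inconsistent.index)
strong = valid[
    (valid["support"] >= 0.05)
    & (valid["confidence"] >= 0.3)
    & (valid["lift"] > 1.2)
]

print(inconsistent.index.tolist())
print(strong.index.tolist())
```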


## Analysis and Interpretation:
*	Analyse the generated rules to identify interesting patterns and relationships between the products.
*	Interpret the results and provide insights into customer purchasing behaviour based on the discovered rules.


In [94]:
# Interpreting the top five rules by lift
print("Identified Patterns:")

for _, rule in rules.head().iterrows():
    print("\nCustomers who buy {} are {} times more likely to buy {}.".format(
        list(rule['antecedents']),
        round(rule['lift'], 2),
        list(rule['consequents'])
    ))

Identified Patterns:

Customers who buy ['ground beef'] are 82.02 times more likely to buy ['spaghetti'].

Customers who buy ['spaghetti'] are 82.02 times more likely to buy ['ground beef'].

Customers who buy ['ground beef'] are 67.02 times more likely to buy ['mineral water'].

Customers who buy ['mineral water'] are 67.02 times more likely to buy ['ground beef'].

Customers who buy ['spaghetti'] are 62.61 times more likely to buy ['mineral water'].


### Interpretation:

* Meal Pairings:

    The analysis reveals common meal pairings such as ground beef with spaghetti and shrimp with frozen vegetables, suggesting that customers tend to buy complementary items for preparing meals.

* Beverage Choices:

    Items like mineral water are frequently associated with food items like ground beef and spaghetti, indicating that customers often include beverages in their food purchases.