# **Product Recommendation System Using Association Rule Mining**

# **Project Description**

This project develops a product recommendation system that uses association rule mining to identify items frequently purchased together in a transactional dataset. Each record in the dataset contains a unique **Member_number**, the **Date** of purchase, and the **itemDescription** of the purchased product.

The **main objectives** are to:

- Understand customer purchasing behavior by analyzing transactional data.
- Generate product recommendations for a given product using association rules.

The recommendation system ranks candidates by **lift**, which measures how much more often two products are bought together than would be expected if they were purchased independently. A lift above 1 indicates a positive association, and a higher lift indicates a stronger one.
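For reference, the standard definitions behind lift can be written as follows. The symbols $A$, $B$ (itemsets) and $N$ (number of baskets) are notation introduced here for illustration and do not appear elsewhere in the notebook.

```latex
\mathrm{support}(A) = \frac{\lvert \{\text{baskets containing } A\} \rvert}{N}
\qquad
\mathrm{confidence}(A \Rightarrow B) = \frac{\mathrm{support}(A \cup B)}{\mathrm{support}(A)}
\qquad
\mathrm{lift}(A \Rightarrow B) = \frac{\mathrm{confidence}(A \Rightarrow B)}{\mathrm{support}(B)}
= \frac{\mathrm{support}(A \cup B)}{\mathrm{support}(A)\,\mathrm{support}(B)}
```

A lift of exactly 1 corresponds to $A$ and $B$ occurring independently of each other.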



**Association Rule Mining:**

- Use the **Apriori or FP-Growth algorithm** to find frequent itemsets in the transactional data (this notebook uses Apriori from `mlxtend`).
- Generate association rules from these itemsets.
- Calculate support, confidence, and lift to evaluate the strength of each rule.

**Recommendation System:**

Allow the user to input a product and retrieve a list of recommended items sorted by their lift scores.

In [None]:
import numpy as np
import pandas as pd
df=pd.read_csv("/content/Groceries_dataset.csv")
df

index,Member_number,Date,itemDescription
0,1808,21-07-2015,tropical fruit
1,2552,05-01-2015,whole milk
2,2300,19-09-2015,pip fruit
3,1187,12-12-2015,other vegetables
4,3037,01-02-2015,whole milk
...,...,...,...
38760,4471,08-10-2014,sliced cheese
38761,2022,23-02-2014,candy
38762,1097,16-04-2014,cake bar
38763,1510,03-12-2014,fruit/vegetable juice


In [None]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 38765 entries, 0 to 38764
Data columns (total 3 columns):
 #   Column           Non-Null Count  Dtype 
---  ------           --------------  ----- 
 0   Member_number    38765 non-null  int64 
 1   Date             38765 non-null  object
 2   itemDescription  38765 non-null  object
dtypes: int64(1), object(2)
memory usage: 908.7+ KB
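The `Date` column is loaded as `object` (plain text). The analysis in this notebook does not need it, but if date-based work such as seasonal analysis is planned, one possible sketch for parsing it is shown below. The `sample` frame is a hypothetical stand-in for `df`, and the day-month-year format is inferred from the rows displayed above.

```python
import pandas as pd

# Small sample in the same day-month-year text format as the Date column.
sample = pd.DataFrame({"Date": ["21-07-2015", "05-01-2015", "08-10-2014"]})

# An explicit format avoids day/month ambiguity (05-01-2015 is 5 January).
sample["Date"] = pd.to_datetime(sample["Date"], format="%d-%m-%Y")

print(sample["Date"].dt.month.tolist())
```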


In [None]:
df.describe()

statistic,Member_number
count,38765.0
mean,3003.641868
std,1153.611031
min,1000.0
25%,2002.0
50%,3005.0
75%,4007.0
max,5000.0


In [None]:
df.isnull().sum()

Member_number      0
Date               0
itemDescription    0
dtype: int64

In [None]:
df.duplicated().sum()

759

In [None]:
df.drop_duplicates(inplace=True)

In [None]:
df['itemDescription']=df['itemDescription'].str.strip()

In [None]:
data_unstack=df.groupby(['Member_number','itemDescription'])['itemDescription'].count().unstack().fillna(0)
data_unstack

Member_number,Instant food products,UHT-milk,abrasive cleaner,artif. sweetener,baby cosmetics,bags,baking powder,bathroom cleaner,beef,berries,...,turkey,vinegar,waffles,whipped/sour cream,whisky,white bread,white wine,whole milk,yogurt,zwieback
1000,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,1.0,0.0
1001,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,...,0.0,0.0,0.0,1.0,0.0,1.0,0.0,2.0,0.0,0.0
1002,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0
1003,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1004,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4996,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4997,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.0
4998,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4999,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,...,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0
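The pivot above builds one basket per member, pooling everything that member ever bought. Whether that is the right unit depends on the goal. If one basket per shopping visit is wanted instead, a minimal sketch could group by both `Member_number` and `Date`. The `toy` frame and the name `visit_basket` are hypothetical and used only for illustration.

```python
import pandas as pd

# Toy records with the same three columns as the dataset (values are made up).
toy = pd.DataFrame({
    "Member_number": [1000, 1000, 1000, 1001, 1001],
    "Date": ["01-01-2015", "01-01-2015", "05-02-2015", "01-01-2015", "01-01-2015"],
    "itemDescription": ["whole milk", "yogurt", "whole milk", "beef", "whole milk"],
})

# One row per (member, date) visit and one boolean column per item.
visit_basket = (
    toy.groupby(["Member_number", "Date", "itemDescription"])
       .size()                  # purchase count per visit and item
       .unstack(fill_value=0)   # items become columns
       .gt(0)                   # True if the item was bought on that visit
)
print(visit_basket)
```

Per-visit baskets are much smaller than per-member baskets, so a lower `min_support` would likely be needed to find any frequent itemsets.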


In [None]:
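def encode(x):
  # Map a purchase count to a binary flag: 1 if bought at least once, else 0.
  return 1 if x>0 else 0
# Apply the encoding element-wise to get the 0/1 basket matrix passed to apriori.
encoded_basket2=data_unstack.applymap(encode)
print(encoded_basket2)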


itemDescription  Instant food products  UHT-milk  abrasive cleaner  \
Member_number                                                        
1000                                 0         0                 0   
1001                                 0         0                 0   
1002                                 0         0                 0   
1003                                 0         0                 0   
1004                                 0         0                 0   
...                                ...       ...               ...   
4996                                 0         0                 0   
4997                                 0         0                 0   
4998                                 0         0                 0   
4999                                 0         0                 0   
5000                                 0         0                 0   

itemDescription  artif. sweetener  baby cosmetics  bags  baking powder  \
Member_number  
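The same encoding can also be written without a per-cell function by comparing the whole frame at once. The sketch below uses a hypothetical `counts` frame in place of `data_unstack`. The comparison produces a boolean matrix, and to my knowledge recent `mlxtend` versions prefer boolean input, though that is worth checking against the installed version.

```python
import pandas as pd

# Toy count matrix shaped like data_unstack (values are made up).
counts = pd.DataFrame(
    {"whole milk": [2.0, 0.0, 1.0], "yogurt": [1.0, 0.0, 0.0]},
    index=pd.Index([1000, 1003, 1002], name="Member_number"),
)

# Comparing the whole frame at once yields a boolean basket matrix.
basket_bool = counts > 0

# Cast to int only if the 0/1 display is preferred.
basket_int = basket_bool.astype(int)
print(basket_int)
```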

In [None]:
!pip install mlxtend
from mlxtend.frequent_patterns import apriori,association_rules



In [None]:
apr2=apriori(encoded_basket2,min_support=0.07,use_colnames=True)
apr2


index,support,itemsets
0,0.078502,(UHT-milk)
1,0.119548,(beef)
2,0.079785,(berries)
3,0.158799,(bottled beer)
4,0.213699,(bottled water)
...,...,...
78,0.075680,"(tropical fruit, yogurt)"
79,0.079785,"(whole milk, whipped/sour cream)"
80,0.150590,"(whole milk, yogurt)"
81,0.082093,"(whole milk, rolls/buns, other vegetables)"
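To make the idea behind the frequent-itemset search concrete, here is a simplified pure-Python sketch of the level-wise Apriori approach. It is not how `mlxtend` implements `apriori`; the function name `frequent_itemsets` and the toy baskets are hypothetical.

```python
from itertools import combinations

def frequent_itemsets(baskets, min_support):
    """Level-wise Apriori sketch: returns {frozenset(items): support}."""
    baskets = [frozenset(b) for b in baskets]
    n = len(baskets)
    # Level 1 candidates: every single item that occurs anywhere.
    candidates = {frozenset([item]) for b in baskets for item in b}
    result = {}
    k = 1
    while candidates:
        # Count each candidate and keep those meeting the support threshold.
        level = {}
        for c in candidates:
            support = sum(c <= b for b in baskets) / n
            if support >= min_support:
                level[c] = support
        result.update(level)
        # Join frequent k-itemsets into (k+1)-item candidates, pruning any
        # candidate with an infrequent k-item subset (the Apriori property).
        k += 1
        frequent = set(level)
        candidates = {
            a | b
            for a in frequent for b in frequent
            if len(a | b) == k
            and all(frozenset(s) in frequent for s in combinations(a | b, k - 1))
        }
    return result

toy_baskets = [
    {"whole milk", "yogurt"},
    {"whole milk", "yogurt", "rolls/buns"},
    {"whole milk", "rolls/buns"},
    {"yogurt"},
]
itemsets = frequent_itemsets(toy_baskets, min_support=0.5)
for items, support in sorted(itemsets.items(), key=lambda kv: -kv[1]):
    print(sorted(items), support)
```

The pruning step is what keeps the search tractable: an itemset can only be frequent if all of its subsets are frequent, so most larger combinations are never counted.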


In [None]:
rules= association_rules(apr2,metric='lift',min_threshold=1)
rules1=rules.sort_values('confidence',ascending=False)
rules1


index,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction,zhangs_metric
94,"(yogurt, other vegetables)",(whole milk),0.120318,0.458184,0.071832,0.597015,1.303003,0.016704,1.344507,0.264348
88,"(rolls/buns, other vegetables)",(whole milk),0.146742,0.458184,0.082093,0.559441,1.220996,0.014859,1.229837,0.212124
71,(shopping bags),(whole milk),0.168291,0.458184,0.091329,0.542683,1.184422,0.014220,1.184772,0.187213
1,(bottled beer),(whole milk),0.158799,0.458184,0.085428,0.537964,1.174124,0.012669,1.172672,0.176297
85,(yogurt),(whole milk),0.282966,0.458184,0.150590,0.532185,1.161510,0.020940,1.158185,0.193926
...,...,...,...,...,...,...,...,...,...,...
89,(whole milk),"(rolls/buns, other vegetables)",0.458184,0.146742,0.082093,0.179171,1.220996,0.014859,1.039508,0.334055
82,(whole milk),(whipped/sour cream),0.458184,0.154695,0.079785,0.174132,1.125650,0.008906,1.023536,0.206019
20,(whole milk),(newspapers),0.458184,0.139815,0.072345,0.157895,1.129310,0.008284,1.021469,0.211332
95,(whole milk),"(yogurt, other vegetables)",0.458184,0.120318,0.071832,0.156775,1.303003,0.016704,1.043235,0.429190
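As a sanity check on how the columns relate, the metrics for the `(yogurt)` to `(whole milk)` rule can be recomputed by hand from the three support values in its row. The variable names below are illustrative; the numbers are copied from the table, so small rounding differences are expected.

```python
# Supports copied from the (yogurt) -> (whole milk) row of the rules table.
support_yogurt = 0.282966   # antecedent support
support_milk = 0.458184     # consequent support
support_both = 0.150590     # support of {yogurt, whole milk}

# confidence: share of yogurt baskets that also contain whole milk
confidence = support_both / support_yogurt
# lift: confidence relative to how common whole milk is overall
lift = confidence / support_milk
# leverage: observed co-occurrence minus what independence would predict
leverage = support_both - support_yogurt * support_milk

print(f"confidence={confidence:.4f} lift={lift:.4f} leverage={leverage:.4f}")
```

Because whole milk appears in about 46% of baskets, rules with whole milk as the consequent tend to have high confidence even when the lift is only slightly above 1, which is why the table sorted by confidence is dominated by them.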


In [None]:
def get_recommendations(product, rules, num_recommendations=5):
    """Return up to num_recommendations (item, lift) pairs for rules whose antecedents contain product."""
    recommendations = []
    for idx, row in rules.iterrows():
        if product in row['antecedents']:
            # Take one item from the consequent set (for single-item consequents, the item itself).
            recommended_item = next(iter(row['consequents']))
            recommendations.append((recommended_item, row['lift']))
    # Rank by lift, strongest association first.
    recommendations.sort(key=lambda x: x[1], reverse=True)
    return recommendations[0:num_recommendations]

product_to_recommend = 'tropical fruit'
recommendations = get_recommendations(product_to_recommend,rules)
print(f"Recommendations for '{product_to_recommend}':")
for item, lift in recommendations:
    print(f"{item} (lift: {lift:.2f})")

Recommendations for 'tropical fruit':
yogurt (lift: 1.14)
soda (lift: 1.12)
whole milk (lift: 1.09)
rolls/buns (lift: 1.08)
other vegetables (lift: 1.04)
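`get_recommendations` takes a single item from each consequent set and can return the same item more than once when several matching rules point to it. One possible variant, sketched below, scores every item in the consequents and keeps the best lift per item. The function name `recommend_all` and the `toy_rules` frame are hypothetical; the frame only mimics the column names that `association_rules` produces, with made-up values.

```python
import pandas as pd

def recommend_all(product, rules, num_recommendations=5):
    """Hypothetical variant: score every consequent item, keep its best lift."""
    best = {}
    for row in rules.itertuples(index=False):
        if product in row.antecedents:
            for item in row.consequents:
                # Keep the highest lift seen for this item across all rules.
                best[item] = max(best.get(item, 0.0), row.lift)
    ranked = sorted(best.items(), key=lambda pair: pair[1], reverse=True)
    return ranked[:num_recommendations]

# Hand-made rules with mlxtend-style column names (values are made up).
toy_rules = pd.DataFrame({
    "antecedents": [
        frozenset({"yogurt"}),
        frozenset({"yogurt"}),
        frozenset({"yogurt", "soda"}),
        frozenset({"beef"}),
    ],
    "consequents": [
        frozenset({"whole milk"}),
        frozenset({"whole milk", "rolls/buns"}),
        frozenset({"whole milk"}),
        frozenset({"soda"}),
    ],
    "lift": [1.30, 1.25, 1.10, 1.05],
})

toy_recommendations = recommend_all("yogurt", toy_rules)
print(toy_recommendations)
```

Assigning a multi-item rule's lift to each of its consequent items is a simplification, since that lift describes the consequent set as a whole.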



**Conclusion:**

The project built a recommendation system that uses association rule mining to identify items frequently purchased together. Given a product, it returns the associated items ranked by lift. For 'tropical fruit', the five suggestions had lift values between 1.04 and 1.14, which indicates positive but modest associations. Rules of this kind can help businesses improve the customer experience and identify cross-selling opportunities, and the approach can be extended to personalized recommendations and seasonal trend analysis.