# Apriori Modelling

## Objectives
- Apply the Apriori algorithm to find sets of products that are frequently bought together.
- Generate association rules.
- Analyze the rules and draw business insights.
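
The three metrics used throughout this notebook (support, confidence, lift) can be illustrated on a handful of baskets. The following is a minimal sketch in plain Python; the transactions and the helper names `support` and `rule_metrics` are hypothetical and are not part of the notebook's data or of mlxtend.

```python
# Toy transactions (hypothetical data, for illustration only).
transactions = [
    {"tea", "cup"},
    {"tea", "cup", "saucer"},
    {"tea"},
    {"cup", "saucer"},
    {"bread"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def rule_metrics(antecedent, consequent, baskets):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    joint = support(set(antecedent) | set(consequent), baskets)
    confidence = joint / support(antecedent, baskets)
    lift = confidence / support(consequent, baskets)
    return joint, confidence, lift


sup, conf, lift = rule_metrics({"tea"}, {"cup"}, transactions)
print(f"support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

A lift above 1 means the two items appear together more often than they would if they were bought independently.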


In [7]:
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

DATA_PATH = "../data/basket_bool.parquet"

basket_bool = pd.read_parquet(DATA_PATH)

print("Basket loaded:", basket_bool.shape)


Basket loaded: (18019, 4007)
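
Before mining, it can help to confirm that the basket matrix really is boolean and to see how sparse it is, since mlxtend's `apriori` is designed for one-hot (True/False or 0/1) input. The sketch below uses a small hypothetical DataFrame, `toy_basket`, in place of `basket_bool`; the same checks should apply to the real matrix, but that is an assumption.

```python
import pandas as pd

# Hypothetical 3-basket, 3-product matrix standing in for basket_bool.
toy_basket = pd.DataFrame(
    {
        "TEACUP": [True, True, False],
        "SAUCER": [True, False, False],
        "CLOCK": [False, False, True],
    }
)

# Every column should have a boolean dtype.
is_boolean = all(pd.api.types.is_bool_dtype(dtype) for dtype in toy_basket.dtypes)

# Share of True cells: a low value means a sparse basket matrix.
density = float(toy_basket.to_numpy().mean())

# Number of distinct products in each basket.
items_per_basket = toy_basket.sum(axis=1)

print(is_boolean, round(density, 3), items_per_basket.tolist())
```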


In [8]:
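# Keep itemsets that appear in at least 3% of baskets;
# use_colnames=True returns product names instead of column indices.
frequent_itemsets = apriori(
    basket_bool,
    min_support=0.03,
    use_colnames=True
)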

print("Number of frequent itemsets:", frequent_itemsets.shape)
frequent_itemsets.head()


Number of frequent itemsets: (145, 2)


| | support | itemsets |
|---|---|---|
| 0 | 0.04745 | (6 RIBBONS RUSTIC CHARM) |
| 1 | 0.032244 | (60 CAKE CASES VINTAGE CHRISTMAS) |
| 2 | 0.041789 | (60 TEATIME FAIRY CAKE CASES) |
| 3 | 0.030801 | (72 SWEETHEART FAIRY CAKE CASES) |
| 4 | 0.048615 | (ALARM CLOCK BAKELIKE GREEN) |
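
To make the support values more tangible, they can be converted into basket counts. This is a rough sketch using only the figures reported in this notebook (18019 baskets, `min_support=0.03`, and the displayed support of the first itemset); because the displayed supports are rounded, the counts are approximate.

```python
import math

n_baskets = 18019    # rows reported by basket_bool.shape
min_support = 0.03   # threshold passed to apriori

# Smallest number of baskets an itemset must appear in to be kept.
min_count = math.ceil(min_support * n_baskets)

# Displayed support of "6 RIBBONS RUSTIC CHARM" (rounded in the output).
ribbons_support = 0.04745
ribbons_count = round(ribbons_support * n_baskets)

print("Minimum baskets per itemset:", min_count)
print("Approx. baskets with 6 RIBBONS RUSTIC CHARM:", ribbons_count)
```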


In [9]:
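# Generate rules whose confidence is at least 0.5.
rules = association_rules(
    frequent_itemsets,
    metric="confidence",
    min_threshold=0.5
)

# Rank rules by lift, strongest association first.
rules = rules.sort_values(by="lift", ascending=False)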

print("Number of rules generated:", rules.shape)
rules.head()


Number of rules generated: (15, 14)


| | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | representativity | leverage | conviction | zhangs_metric | jaccard | certainty | kulczynski |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | (GREEN REGENCY TEACUP AND SAUCER) | (PINK REGENCY TEACUP AND SAUCER) | 0.051723 | 0.038959 | 0.031966 | 0.618026 | 15.863541 | 1.0 | 0.029951 | 2.515984 | 0.988068 | 0.544423 | 0.602541 | 0.719269 |
| 5 | (PINK REGENCY TEACUP AND SAUCER) | (GREEN REGENCY TEACUP AND SAUCER) | 0.038959 | 0.051723 | 0.031966 | 0.820513 | 15.863541 | 1.0 | 0.029951 | 5.283257 | 0.974945 | 0.544423 | 0.810723 | 0.719269 |
| 13 | (ROSES REGENCY TEACUP AND SAUCER ) | (PINK REGENCY TEACUP AND SAUCER) | 0.053055 | 0.038959 | 0.030246 | 0.570084 | 14.63296 | 1.0 | 0.028179 | 2.235414 | 0.98386 | 0.489668 | 0.552656 | 0.673218 |
| 14 | (PINK REGENCY TEACUP AND SAUCER) | (ROSES REGENCY TEACUP AND SAUCER ) | 0.038959 | 0.053055 | 0.030246 | 0.776353 | 14.63296 | 1.0 | 0.028179 | 4.23411 | 0.969429 | 0.489668 | 0.763823 | 0.673218 |
| 2 | (GARDENERS KNEELING PAD CUP OF TEA ) | (GARDENERS KNEELING PAD KEEP CALM ) | 0.041623 | 0.049836 | 0.030024 | 0.721333 | 14.474059 | 1.0 | 0.02795 | 3.409678 | 0.971341 | 0.488708 | 0.706717 | 0.661892 |
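
As a sanity check, the confidence, lift, and leverage of the top rule (GREEN REGENCY TEACUP AND SAUCER → PINK REGENCY TEACUP AND SAUCER) can be recomputed from its three support values using the standard definitions. The inputs below are the rounded figures shown in the output, so the results are expected to match only to about four decimal places.

```python
# Values copied from the top rule in the output above (rounded to six decimals).
antecedent_support = 0.051723  # GREEN REGENCY TEACUP AND SAUCER
consequent_support = 0.038959  # PINK REGENCY TEACUP AND SAUCER
joint_support = 0.031966       # both items in the same basket

# confidence = P(consequent | antecedent)
confidence = joint_support / antecedent_support

# lift = confidence relative to the consequent's baseline frequency
lift = confidence / consequent_support

# leverage = observed joint support minus the support expected under independence
leverage = joint_support - antecedent_support * consequent_support

print(f"confidence={confidence:.4f} lift={lift:.2f} leverage={leverage:.6f}")
```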


In [10]:
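# Tighten the filter: confidence of at least 0.6 and lift of at least 1.5
# (a lift above 1 indicates a positive association).
strong_rules = rules[
    (rules["confidence"] >= 0.6) &
    (rules["lift"] >= 1.5)
]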

print("Number of strong rules:", strong_rules.shape)
strong_rules.head()


Number of strong rules: (12, 14)


| | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | representativity | leverage | conviction | zhangs_metric | jaccard | certainty | kulczynski |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | (GREEN REGENCY TEACUP AND SAUCER) | (PINK REGENCY TEACUP AND SAUCER) | 0.051723 | 0.038959 | 0.031966 | 0.618026 | 15.863541 | 1.0 | 0.029951 | 2.515984 | 0.988068 | 0.544423 | 0.602541 | 0.719269 |
| 5 | (PINK REGENCY TEACUP AND SAUCER) | (GREEN REGENCY TEACUP AND SAUCER) | 0.038959 | 0.051723 | 0.031966 | 0.820513 | 15.863541 | 1.0 | 0.029951 | 5.283257 | 0.974945 | 0.544423 | 0.810723 | 0.719269 |
| 14 | (PINK REGENCY TEACUP AND SAUCER) | (ROSES REGENCY TEACUP AND SAUCER ) | 0.038959 | 0.053055 | 0.030246 | 0.776353 | 14.63296 | 1.0 | 0.028179 | 4.23411 | 0.969429 | 0.489668 | 0.763823 | 0.673218 |
| 2 | (GARDENERS KNEELING PAD CUP OF TEA ) | (GARDENERS KNEELING PAD KEEP CALM ) | 0.041623 | 0.049836 | 0.030024 | 0.721333 | 14.474059 | 1.0 | 0.02795 | 3.409678 | 0.971341 | 0.488708 | 0.706717 | 0.661892 |
| 3 | (GARDENERS KNEELING PAD KEEP CALM ) | (GARDENERS KNEELING PAD CUP OF TEA ) | 0.049836 | 0.041623 | 0.030024 | 0.60245 | 14.474059 | 1.0 | 0.02795 | 2.410708 | 0.979737 | 0.488708 | 0.585184 | 0.661892 |
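
Several strong rules appear as mirrored pairs (A → B and B → A) that share the same support and lift. If only one direction per product pair is wanted, one possible approach is to keep the direction with the higher confidence. This is a sketch on a hypothetical `toy_rules` frame with shortened product names; the `pair_key` column is an illustrative helper, not something mlxtend produces.

```python
import pandas as pd

# Hypothetical rules with shortened names; confidences mirror the output above.
toy_rules = pd.DataFrame(
    {
        "antecedents": [
            frozenset({"GREEN TEACUP"}),
            frozenset({"PINK TEACUP"}),
            frozenset({"PINK TEACUP"}),
        ],
        "consequents": [
            frozenset({"PINK TEACUP"}),
            frozenset({"GREEN TEACUP"}),
            frozenset({"ROSES TEACUP"}),
        ],
        "confidence": [0.618026, 0.820513, 0.776353],
    }
)

# Order-independent key: A -> B and B -> A map to the same string.
toy_rules["pair_key"] = [
    " | ".join(sorted(antecedent | consequent))
    for antecedent, consequent in zip(toy_rules["antecedents"], toy_rules["consequents"])
]

# Keep the higher-confidence direction of each pair.
best_direction = (
    toy_rules.sort_values("confidence", ascending=False)
    .drop_duplicates(subset="pair_key")
    .reset_index(drop=True)
)

print(best_direction[["antecedents", "consequents", "confidence"]])
```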


In [11]:
OUTPUT_PATH = "../data/apriori_rules.csv"

strong_rules.to_csv(OUTPUT_PATH, index=False)

print("Saved Apriori rules to:", OUTPUT_PATH)


Saved Apriori rules to: ../data/apriori_rules.csv
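
One thing to be aware of: the `antecedents` and `consequents` columns hold `frozenset` objects, so a plain `to_csv` writes them as their string representation (for example `frozenset({'...'})`), which is awkward to read back. A possible workaround, sketched here on a hypothetical one-row `toy_rules` frame and an in-memory buffer instead of the real file, is to join each set into a plain string before exporting.

```python
import io

import pandas as pd

# Hypothetical one-row frame standing in for strong_rules.
toy_rules = pd.DataFrame(
    {
        "antecedents": [frozenset({"PINK REGENCY TEACUP AND SAUCER"})],
        "consequents": [frozenset({"GREEN REGENCY TEACUP AND SAUCER"})],
        "confidence": [0.820513],
        "lift": [15.863541],
    }
)

# Convert each frozenset into a comma-separated, alphabetically sorted string.
export = toy_rules.copy()
for column in ("antecedents", "consequents"):
    export[column] = export[column].apply(lambda items: ", ".join(sorted(items)))

# Write to an in-memory buffer so the sketch does not touch the file system.
buffer = io.StringIO()
export.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```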


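## Conclusions & Business Insights

- The Apriori rules reveal groups of products that are frequently bought together, most visibly the REGENCY TEACUP AND SAUCER variants (green, pink, roses) and the two GARDENERS KNEELING PAD designs.
- The strongest rules combine high confidence with high lift. For example, about 82% of baskets containing PINK REGENCY TEACUP AND SAUCER also contain the green variant (lift ≈ 15.9), which indicates a strong co-purchase relationship.
- These insights can be applied to:
  - Product recommendation
  - Designing sales bundles (combos)
  - Optimizing in-store product placement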
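
As an illustration of the recommendation use case, the strong rules can be turned into a simple lookup: given a product in the cart, return the consequents of matching rules, ordered by confidence. This is a minimal sketch; the `recommend` function is hypothetical, and the rule triples are copied from the strong-rules output with trailing whitespace stripped from product names.

```python
# (antecedent, consequent, confidence) triples copied from the strong-rules output.
strong_rule_rows = [
    ("GREEN REGENCY TEACUP AND SAUCER", "PINK REGENCY TEACUP AND SAUCER", 0.618026),
    ("PINK REGENCY TEACUP AND SAUCER", "GREEN REGENCY TEACUP AND SAUCER", 0.820513),
    ("PINK REGENCY TEACUP AND SAUCER", "ROSES REGENCY TEACUP AND SAUCER", 0.776353),
]


def recommend(cart_item, rules, top_n=3):
    """Return up to `top_n` (product, confidence) pairs for rules triggered by `cart_item`."""
    matches = [
        (consequent, confidence)
        for antecedent, consequent, confidence in rules
        if antecedent == cart_item
    ]
    # Highest-confidence suggestions first.
    matches.sort(key=lambda pair: pair[1], reverse=True)
    return matches[:top_n]


for product, confidence in recommend("PINK REGENCY TEACUP AND SAUCER", strong_rule_rows):
    print(f"{product} (confidence {confidence:.2f})")
```

A production recommender would likely also handle multi-item antecedents and exclude products already in the cart; both are left out here to keep the sketch short.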
