**Dataset source: https://www.kaggle.com/datasets/rupakroy/online-retail**

**Import Dataset**

In [1]:
# Import libraries
import pandas as pd
import numpy as np
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules
import matplotlib.pyplot as plt
from google.colab import drive
import warnings
warnings.filterwarnings("ignore")




In [2]:
from google.colab import drive
drive.mount('/content/drive')


Mounted at /content/drive


In [3]:

df = pd.read_csv("/content/drive/MyDrive/Online_Retail.csv")
df.head()


,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-12-01 08:26:00,2.55,17850.0,United Kingdom
1,536365,71053,WHITE METAL LANTERN,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,2010-12-01 08:26:00,2.75,17850.0,United Kingdom
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom
4,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom


**Pre-Processing Data**

In [4]:
# Remove returns (rows with a non-positive Quantity)
df = df[df['Quantity'] > 0]

# Drop rows without a CustomerID (the purchase cannot be tied to a customer)
df = df[df['CustomerID'].notnull()]

# Drop rows with a missing Description; .copy() avoids SettingWithCopyWarning below
df = df[df['Description'].notnull()].copy()

# Strip leading and trailing whitespace from product names
df['Description'] = df['Description'].str.strip()


**Create the Season Mapping**

In [5]:


# Make sure InvoiceDate is a datetime
df['InvoiceDate'] = pd.to_datetime(df['InvoiceDate'])

# Extract the month number
df['Month'] = df['InvoiceDate'].dt.month

# Map each month to its UK season
def get_season(month):
    if month in [12, 1, 2]:
        return "Winter"
    elif month in [3, 4, 5]:
        return "Spring"
    elif month in [6, 7, 8]:
        return "Summer"
    elif month in [9, 10, 11]:
        return "Autumn"

df['Season'] = df['Month'].apply(get_season)

df[['InvoiceDate', 'Month', 'Season']].head()


,InvoiceDate,Month,Season
0,2010-12-01 08:26:00,12,Winter
1,2010-12-01 08:26:00,12,Winter
2,2010-12-01 08:26:00,12,Winter
3,2010-12-01 08:26:00,12,Winter
4,2010-12-01 08:26:00,12,Winter


**Split the Data by Season**

In [6]:

# SPLIT DATA PER SEASON
winter = df[df['Season']=='Winter']
spring = df[df['Season']=='Spring']
summer = df[df['Season']=='Summer']
autumn = df[df['Season']=='Autumn']

print("DATA SIZE PER SEASON")
print("Winter :", winter.shape)
print("Spring :", spring.shape)
print("Summer :", summer.shape)
print("Autumn :", autumn.shape)


DATA SIZE PER SEASON
Winter : (84624, 10)
Spring : (78143, 10)
Summer : (81025, 10)
Autumn : (154132, 10)


**One-Hot Encoding per Season**

In [8]:

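# ONE-HOT ENCODING PER SEASON (TABLE VERSION)

from mlxtend.preprocessing import TransactionEncoder

def create_basket(season_df):
    # Collect the product descriptions of each invoice into one list per transaction
    grouped = season_df.groupby('InvoiceNo')['Description'].apply(list)
    # Encode the transactions as a boolean matrix: one row per invoice, one column per product
    te = TransactionEncoder()
    hot = te.fit_transform(grouped)
    basket = pd.DataFrame(hot, columns=te.columns_)
    return basket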

basket_winter = create_basket(winter)
basket_spring = create_basket(spring)
basket_summer = create_basket(summer)
basket_autumn = create_basket(autumn)


The Online Retail dataset has been split by season into Winter, Spring, Summer, and Autumn subsets. Each subset is then converted into transaction baskets through one-hot encoding.

The result is one binary matrix per season, in which each row is a transaction (invoice) and each column indicates whether a given product appears in that transaction.
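
To make the binary matrix concrete, here is a minimal sketch of the same idea in plain Python, without mlxtend. The three invoices, the product names, and the `one_hot` helper are hypothetical illustrations; they are not rows from the dataset or part of the notebook's pipeline.

```python
# Hypothetical toy invoices (invoice number -> products); not taken from the dataset.
invoices = {
    "A1": ["TEACUP", "CAKE STAND"],
    "A2": ["TEACUP", "LUNCH BOX"],
    "A3": ["LUNCH BOX"],
}

def one_hot(transactions):
    """Return (columns, rows): one boolean row per invoice, one column per product."""
    # The columns are the distinct products, in alphabetical order
    columns = sorted({item for items in transactions.values() for item in items})
    # A cell is True when the product appears in that invoice
    rows = {
        invoice: [product in items for product in columns]
        for invoice, items in transactions.items()
    }
    return columns, rows

columns, rows = one_hot(invoices)
print(columns)
for invoice, row in rows.items():
    print(invoice, row)
```

Each `True` marks a product that is present in the invoice. A boolean table of this shape is what `apriori` takes as input.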

**Run Apriori per Season**

In [9]:
from mlxtend.frequent_patterns import apriori, association_rules

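def run_apriori(basket, min_supp=0.02):
    # Frequent itemsets: product sets that occur in at least min_supp of the invoices
    frq = apriori(basket, min_support=min_supp, use_colnames=True)
    # Keep only rules with lift >= 1
    rules = association_rules(frq, metric="lift", min_threshold=1)
    # Turn the frozensets into readable comma-separated strings
    rules['antecedents'] = rules['antecedents'].apply(lambda x: ', '.join(list(x)))
    rules['consequents'] = rules['consequents'].apply(lambda x: ', '.join(list(x)))
    return frq, rules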

frq_winter, rules_winter = run_apriori(basket_winter)
frq_spring, rules_spring = run_apriori(basket_spring)
frq_summer, rules_summer = run_apriori(basket_summer)
frq_autumn, rules_autumn = run_apriori(basket_autumn)



Apriori is run separately for each season, so the resulting association rules reflect customer behavior within that season only.
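
For intuition about the frequent-itemset step, the following is a minimal level-wise Apriori sketch in plain Python. It is a simplified illustration and not mlxtend's implementation; the function name `frequent_itemsets` and the toy transactions are assumptions made for this example.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Level-wise search: return {itemset: support} for itemsets with support >= min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)

    def support(itemset):
        # Fraction of baskets that contain every item of the itemset
        return sum(itemset <= basket for basket in baskets) / n

    # Level 1 candidates: every single item
    candidates = {frozenset([item]) for basket in baskets for item in basket}
    frequent = {}
    k = 1
    while candidates:
        level = {c: support(c) for c in candidates}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent itemsets that have exactly k items
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all of its (k-1)-subsets are frequent
        candidates = {
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent

# Hypothetical toy transactions; not taken from the dataset.
toy = [
    {"TEACUP", "CAKE STAND"},
    {"TEACUP", "CAKE STAND", "FRAME"},
    {"TEACUP"},
    {"FRAME"},
]
result = frequent_itemsets(toy, min_support=0.5)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

The pruning step is what keeps the search tractable: an itemset can only be frequent if all of its subsets are, so most combinations are never counted.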

Each association rule is scored by support, confidence, and lift. Lift serves as the primary metric because it measures how much more often the products are bought together than would be expected if they were purchased independently.
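
These metrics follow directly from the support values. As a check, the sketch below recomputes confidence, lift, leverage, and conviction for the top Winter rule (GREEN REGENCY TEACUP AND SAUCER → ROSES REGENCY TEACUP AND SAUCER) from the three supports reported in the Winter results table. The helper name `rule_metrics` is an assumption for this example, and because the inputs are rounded to six decimals, the results match the table only approximately.

```python
def rule_metrics(antecedent_support, consequent_support, support):
    """Standard association-rule metrics for a rule A -> C."""
    # Confidence: share of baskets containing A that also contain C
    confidence = support / antecedent_support
    # Lift: observed co-occurrence relative to what independence would give
    lift = confidence / consequent_support
    # Leverage: observed co-occurrence minus the co-occurrence expected under independence
    leverage = support - antecedent_support * consequent_support
    # Conviction: undefined when confidence equals 1, which is not the case here
    conviction = (1 - consequent_support) / (1 - confidence)
    return {
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }

# Supports copied from the Winter table: antecedent, consequent, and joint support
m = rule_metrics(0.03411, 0.037713, 0.026663)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

The recomputed confidence and lift come out close to the table's 0.78169 and 20.727236: customers who buy the green teacup set buy the roses set about 78% of the time, roughly 20 times more often than chance would suggest.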

**Show the Top 5 Rules per Season**

In [10]:
def show_rules(rules, season):
    print(f"\n===== Top 5 Association Rules – {season} =====\n")
    display(rules.sort_values("lift", ascending=False).head(5))

show_rules(rules_winter, "Winter")
show_rules(rules_spring, "Spring")
show_rules(rules_summer, "Summer")
show_rules(rules_autumn, "Autumn")



===== Top 5 Association Rules – Winter =====



,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,representativity,leverage,conviction,zhangs_metric,jaccard,certainty,kulczynski
16,GREEN REGENCY TEACUP AND SAUCER,ROSES REGENCY TEACUP AND SAUCER,0.03411,0.037713,0.026663,0.78169,20.727236,1.0,0.025377,4.407894,0.985365,0.590426,0.773134,0.744348
17,ROSES REGENCY TEACUP AND SAUCER,GREEN REGENCY TEACUP AND SAUCER,0.037713,0.03411,0.026663,0.707006,20.727236,1.0,0.025377,3.296625,0.989055,0.590426,0.696659,0.744348
14,SPACEBOY LUNCH BOX,DOLLY GIRL LUNCH BOX,0.039154,0.033389,0.021139,0.539877,16.169131,1.0,0.019831,2.100767,0.976384,0.411215,0.523983,0.586485
15,DOLLY GIRL LUNCH BOX,SPACEBOY LUNCH BOX,0.033389,0.039154,0.021139,0.633094,16.169131,1.0,0.019831,2.618775,0.97056,0.411215,0.618142,0.586485
45,WOODEN FRAME ANTIQUE WHITE,WOODEN PICTURE FRAME WHITE FINISH,0.050204,0.042998,0.029786,0.593301,13.798402,1.0,0.027628,2.353099,0.976555,0.469697,0.575029,0.643019



===== Top 5 Association Rules – Spring =====



,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,representativity,leverage,conviction,zhangs_metric,jaccard,certainty,kulczynski
7,BLUE HAPPY BIRTHDAY BUNTING,PINK HAPPY BIRTHDAY BUNTING,0.029565,0.031553,0.022609,0.764706,24.235757,1.0,0.021676,4.115901,0.987948,0.587097,0.75704,0.740621
6,PINK HAPPY BIRTHDAY BUNTING,BLUE HAPPY BIRTHDAY BUNTING,0.031553,0.029565,0.022609,0.716535,24.235757,1.0,0.021676,3.423478,0.989975,0.587097,0.707899,0.740621
144,"ROSES REGENCY TEACUP AND SAUCER, PINK REGENCY ...","GREEN REGENCY TEACUP AND SAUCER, REGENCY CAKES...",0.039255,0.02882,0.02236,0.56962,19.764841,1.0,0.021229,2.256566,0.988196,0.48913,0.556849,0.672741
141,"GREEN REGENCY TEACUP AND SAUCER, REGENCY CAKES...","ROSES REGENCY TEACUP AND SAUCER, PINK REGENCY ...",0.02882,0.039255,0.02236,0.775862,19.764841,1.0,0.021229,4.286402,0.977579,0.48913,0.766704,0.672741
142,"GREEN REGENCY TEACUP AND SAUCER, ROSES REGENCY...","PINK REGENCY TEACUP AND SAUCER, REGENCY CAKEST...",0.042981,0.027081,0.02236,0.520231,19.210373,1.0,0.021196,2.027892,0.990519,0.46875,0.506877,0.67296



===== Top 5 Association Rules – Summer =====



,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,representativity,leverage,conviction,zhangs_metric,jaccard,certainty,kulczynski
264,PINK REGENCY TEACUP AND SAUCER,"GREEN REGENCY TEACUP AND SAUCER, ROSES REGENCY...",0.030212,0.028215,0.020724,0.68595,24.311782,1.0,0.019872,3.094369,0.98874,0.549669,0.676832,0.710232
261,"GREEN REGENCY TEACUP AND SAUCER, ROSES REGENCY...",PINK REGENCY TEACUP AND SAUCER,0.028215,0.030212,0.020724,0.734513,24.311782,1.0,0.019872,3.652867,0.986707,0.549669,0.726242,0.710232
263,GREEN REGENCY TEACUP AND SAUCER,"ROSES REGENCY TEACUP AND SAUCER, PINK REGENCY ...",0.037703,0.022971,0.020724,0.549669,23.92852,1.0,0.019858,2.169578,0.995752,0.51875,0.539081,0.725921
262,"ROSES REGENCY TEACUP AND SAUCER, PINK REGENCY ...",GREEN REGENCY TEACUP AND SAUCER,0.022971,0.037703,0.020724,0.902174,23.92852,1.0,0.019858,9.836815,0.980738,0.51875,0.898341,0.725921
14,GREEN REGENCY TEACUP AND SAUCER,PINK REGENCY TEACUP AND SAUCER,0.037703,0.030212,0.024969,0.662252,21.919982,1.0,0.02383,2.871332,0.991772,0.581395,0.65173,0.744349



===== Top 5 Association Rules – Autumn =====



,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,representativity,leverage,conviction,zhangs_metric,jaccard,certainty,kulczynski
33,PINK REGENCY TEACUP AND SAUCER,GREEN REGENCY TEACUP AND SAUCER,0.026328,0.03027,0.021914,0.832335,27.497411,1.0,0.021117,5.783749,0.98969,0.631818,0.827102,0.778147
32,GREEN REGENCY TEACUP AND SAUCER,PINK REGENCY TEACUP AND SAUCER,0.03027,0.026328,0.021914,0.723958,27.497411,1.0,0.021117,3.527264,0.993712,0.631818,0.716494,0.778147
128,ROSES REGENCY TEACUP AND SAUCER,PINK REGENCY TEACUP AND SAUCER,0.034526,0.026328,0.020653,0.598174,22.719848,1.0,0.019744,2.423115,0.990173,0.513725,0.587308,0.691302
129,PINK REGENCY TEACUP AND SAUCER,ROSES REGENCY TEACUP AND SAUCER,0.026328,0.034526,0.020653,0.784431,22.719848,1.0,0.019744,4.478725,0.981836,0.513725,0.776722,0.691302
35,ROSES REGENCY TEACUP AND SAUCER,GREEN REGENCY TEACUP AND SAUCER,0.034526,0.03027,0.022702,0.657534,21.722603,1.0,0.021657,2.831613,0.98808,0.539326,0.646844,0.703767


**Interpretation**

1.  Winter (December-February): The products that appear most often in the Winter association rules are the Regency tea set, wooden frames, and alarm clocks (the last of which falls outside the top five shown above).
The highest-lift rule pairs the GREEN and ROSES REGENCY TEACUP AND SAUCER (lift 20.73), followed by the SPACEBOY and DOLLY GIRL lunch boxes (16.17) and the two white wooden frames (13.80), so customers tend to buy matching variants of classic-style homeware together.

    This is plausible because Winter coincides with the holiday period (Christmas and New Year), when customers tend to buy home decor and gifts.
    Retailers can use this pattern to recommend bundles such as the green and roses Regency teacup sets or an antique white wooden frame set.

2. Spring (March-May):
In Spring, the dominant purchase patterns involve birthday bunting and the tea party set.
The very high lift (24.24) between BLUE HAPPY BIRTHDAY BUNTING and PINK HAPPY BIRTHDAY BUNTING shows that customers buy party decorations in pairs of different colors.

    The tea set also remains strong, with multi-item Regency teacup rules at a lift of roughly 19 to 20, which suggests that customers buy supplies for small gatherings in this season.

    This insight can support a "Spring Party Decor" promotion or a bundle discount on birthday decorations.

3. Summer (June-August):
In Summer, the association rules feature picnic bags, baking supplies, and the Regency tea set, which stays consistent throughout the year.
    The five highest-lift rules shown above all combine the green, roses, and pink Regency teacups (lift 21.92 to 24.31), so the picnic bag and 'baking cases' patterns sit outside that top five. Those patterns are consistent with the UK habit of outdoor activities and picnics during summer.

    Retailers can run a "Summer Picnic Essentials" campaign or bundle baking supplies with picnic bags.

4. Autumn (September-November): Two patterns stand out in Autumn:

    (1) Christmas wooden decorations, and

    (2) the Regency tea set as a popular product ahead of the holidays.

    In the top five shown above, the highest lift (27.50) belongs to the PINK and GREEN REGENCY TEACUP AND SAUCER pair, and all five rules involve Regency teacups. The Christmas-themed wooden decorations that are bought together fall outside that top five; their presence in the rules suggests that customers have already started early Christmas shopping.

    Retailers can soft-launch their Christmas decoration collection in early Autumn and highlight wooden decoration bundles.


Final Conclusion:

The four-season market basket analysis shows that the purchase patterns of Online Retail customers are seasonal.
The Regency tea set appears consistently throughout the year, while categories such as bunting (Spring), baking and picnic (Summer), and Christmas décor (Autumn) are strongly seasonal.
This seasonal approach yields more granular and more relevant insight for merchandising and product bundling strategy.
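
One way to check the seasonal claim against the tables above is to compare which products occur in each season's top five rules. The sketch below does this with plain Python sets. It lists only product names that are fully legible in the displayed tables (truncated names ending in `...` are left out), and the variable names are assumptions for this example.

```python
# Products fully legible in each season's top-5 table (truncated names omitted).
GREEN = "GREEN REGENCY TEACUP AND SAUCER"
ROSES = "ROSES REGENCY TEACUP AND SAUCER"
PINK = "PINK REGENCY TEACUP AND SAUCER"

top_items = {
    "Winter": {GREEN, ROSES, "SPACEBOY LUNCH BOX", "DOLLY GIRL LUNCH BOX",
               "WOODEN FRAME ANTIQUE WHITE", "WOODEN PICTURE FRAME WHITE FINISH"},
    "Spring": {GREEN, ROSES, PINK,
               "BLUE HAPPY BIRTHDAY BUNTING", "PINK HAPPY BIRTHDAY BUNTING"},
    "Summer": {GREEN, ROSES, PINK},
    "Autumn": {GREEN, ROSES, PINK},
}

# Products present in every season's top five
year_round = set.intersection(*top_items.values())

# Products that appear in exactly one season's top five
seasonal = {
    season: items - set().union(*(other for s, other in top_items.items() if s != season))
    for season, items in top_items.items()
}

print("Year-round:", sorted(year_round))
for season, items in seasonal.items():
    print(season, "only:", sorted(items))
```

With the displayed rows, the green and roses teacup sets are the only products present in all four seasons, while the lunch boxes and frames are specific to Winter and the buntings to Spring. Extending the comparison beyond the top five would require the full `rules_winter`, `rules_spring`, `rules_summer`, and `rules_autumn` tables.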

---

