## Introduction
DQLab.id Fashion is a fashion shop that sells a range of products such as jeans, shirts, and cosmetics. Although the business is fairly well established, the growing number of competitors and the large stock of unsold products worry its managers. One solution is to create innovative product packages, in which items that previously sold poorly but still have a market are bundled and sold together with other products.


### Data
The dataset is in TSV (tab-separated values) format, with 33,669 rows covering 3,450 transaction codes.
The data has been tidied to contain only two variables:

- Kode Transaksi
- Nama Barang

Other variables such as price, date, and purchase quantity are not included because these two variables are sufficient.
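As a minimal sketch of why two columns are enough, the base R snippet below groups a long table of (transaction code, item name) rows into one basket per transaction. The rows and the column names `Kode.Transaksi` and `Nama.Barang` are hypothetical stand-ins, not records from the actual file.

```r
# Hypothetical long-format rows mirroring the two variables described above
transaksi <- data.frame(
  Kode.Transaksi = c("#1", "#1", "#2", "#2", "#2", "#3"),
  Nama.Barang = c("Shampo Biasa", "Serum Vitamin", "Shampo Biasa",
                  "Cover Koper", "Wedges Hitam", "Serum Vitamin"),
  stringsAsFactors = FALSE
)

# One basket (character vector of item names) per transaction code
baskets <- split(transaksi$Nama.Barang, transaksi$Kode.Transaksi)

n_rows <- nrow(transaksi)          # number of rows in the long table
n_transactions <- length(baskets)  # number of distinct transaction codes
print(baskets)
```

The same relationship explains the figures above: many rows collapse into fewer transactions because each transaction spans several rows.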

### Task
This project was completed in several steps. First, the library and dataset are prepared. The tidied data is then read as transactions and mined with the Apriori algorithm from the arules package, which is used throughout this project. The goals are:

- Get insight into the top 10 and bottom 10 products sold.
- Get a list of all product package combinations with strong associations.
- Get a list of all product package combinations that contain specific items.

### Import Library and Read File

In [1]:
library(arules)
transaksi_tabular <- read.transactions(file = "datasets/transaksi_dqlab_retail.tsv", format = "single", sep = "\t", cols = c(1,2), skip = 1)
transaksi_tabular

Loading required package: Matrix

Attaching package: 'arules'

The following objects are masked from 'package:base':

    abbreviate, write



transactions in sparse format with
 3450 transactions (rows) and
 69 items (columns)
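The printed summary describes a matrix with one row per transaction and one column per item. The base R sketch below builds a small dense version of that idea from hypothetical baskets. arules stores the real object in a sparse form, so this only illustrates the shape, not the package's internals.

```r
# Hypothetical baskets; the transaction IDs T1..T3 are made up for illustration
baskets <- list(
  T1 = c("Shampo Biasa", "Serum Vitamin"),
  T2 = c("Shampo Biasa", "Cover Koper", "Wedges Hitam"),
  T3 = c("Serum Vitamin")
)

# All distinct items become the columns
items <- sort(unique(unlist(baskets)))

# TRUE where the transaction (row) contains the item (column)
incidence <- t(sapply(baskets, function(b) items %in% b))
colnames(incidence) <- items

dim(incidence)      # transactions x items
colSums(incidence)  # number of transactions containing each item
```

The column sums here play the same role as the absolute item frequencies used in the next sections.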

### Top 10 Stats

In [2]:
data_top <- itemFrequency(transaksi_tabular, type = "absolute")
data_top <- sort(data_top, decreasing = TRUE)
data_top <- data_top[1:10]

data_top <- data.frame("Nama Produk" = names(data_top), "Jumlah" = data_top, row.names = NULL)

data_top

Nama.Produk,Jumlah
Shampo Biasa,2075
Serum Vitamin,1685
Baju Batik Wanita,1312
Baju Kemeja Putih,1255
Celana Jogger Casual,1136
Cover Koper,1086
Sepatu Sandal Anak,1062
Tali Pinggang Gesper Pria,1003
Sepatu Sport merk Z,888
Wedges Hitam,849
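To put the absolute counts in context, they can be divided by the 3,450 transactions reported earlier. The sketch below does this for the first three rows of the table above; the variable names are illustrative only.

```r
# Counts copied from the top-10 table; 3450 is the transaction count printed earlier
n_transactions <- 3450
top_counts <- c("Shampo Biasa" = 2075,
                "Serum Vitamin" = 1685,
                "Baju Batik Wanita" = 1312)

# Share of transactions that contain each item
relative <- top_counts / n_transactions
print(round(relative, 4))
```

On these figures, Shampo Biasa appears in roughly 60% of all transactions.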


### Bottom 10 Stats

In [3]:
data_bottom <- itemFrequency(transaksi_tabular, type = "absolute")
data_bottom <- sort(data_bottom, decreasing = FALSE)
data_bottom <- data_bottom[1:10]

data_bottom <- data.frame("Nama Produk" = names(data_bottom), "Jumlah" = data_bottom, row.names = NULL)

data_bottom

Nama.Produk,Jumlah
Celana Jeans Sobek Pria,9
Tas Kosmetik,11
Stripe Pants,19
Pelembab,24
Tali Ban Ikat Pinggang,27
Baju Renang Pria Anak-anak,32
Hair Dye,46
Atasan Baju Belang,56
Tas Sekolah Anak Perempuan,71
Dompet Unisex,75


### Get Interesting Product Combinations

In [4]:
combination <- apriori(transaksi_tabular, parameter = list(supp = 10/length(transaksi_tabular), confidence = 0.5, minlen= 2, maxlen = 3))

combination_result <- head(combination, n = 10, by = "lift")

combination_result
inspect(combination_result)

Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport maxtime     support minlen
        0.5    0.1    1 none FALSE            TRUE       5 0.002898551      2
 maxlen target   ext
      3  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 10 

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[69 item(s), 3450 transaction(s)] done [0.00s].
sorting and recoding items ... [68 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3

"Mining stopped (maxlen reached). Only patterns up to a length of 3 returned!"

 done [0.01s].
writing ... [4637 rule(s)] done [0.00s].
creating S4 object  ... done [0.00s].


set of 10 rules 

     lhs                             rhs                              support confidence     lift count
[1]  {Tas Makeup,                                                                                      
      Tas Pinggang Wanita}        => {Baju Renang Anak Perempuan} 0.010434783  0.8780488 24.42958    36
[2]  {Tas Makeup,                                                                                      
      Tas Travel}                 => {Baju Renang Anak Perempuan} 0.010144928  0.8139535 22.64629    35
[3]  {Tas Makeup,                                                                                      
      Tas Ransel Mini}            => {Baju Renang Anak Perempuan} 0.011304348  0.7358491 20.47322    39
[4]  {Sunblock Cream,                                                                                  
      Tas Pinggang Wanita}        => {Kuas Makeup }               0.016231884  0.6913580 20.21343    56
[5]  {Baju Renang Anak Perempuan,                               
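The columns in the output above follow the usual definitions: support is the share of transactions containing every item in the rule, confidence is support(lhs and rhs) / support(lhs), and lift is confidence / support(rhs). The sketch below first reproduces two numbers from the printed output, then computes all three measures on a small set of hypothetical baskets. The toy baskets and the `support` helper are assumptions for illustration, not part of arules.

```r
# Values taken from the printed output
n_transactions <- 3450
min_support <- 10 / n_transactions    # the supp argument passed to apriori()
rule1_support <- 36 / n_transactions  # count of rule [1] divided by transactions

# Hypothetical toy baskets, used only to show the formulas
baskets <- list(
  c("Tas Makeup", "Tas Travel", "Baju Renang Anak Perempuan"),
  c("Tas Makeup", "Tas Travel"),
  c("Baju Renang Anak Perempuan"),
  c("Shampo Biasa"),
  c("Tas Makeup", "Tas Travel", "Baju Renang Anak Perempuan")
)

# Share of baskets that contain every item in `items` (hypothetical helper)
support <- function(items) {
  mean(sapply(baskets, function(b) all(items %in% b)))
}

lhs <- c("Tas Makeup", "Tas Travel")
rhs <- "Baju Renang Anak Perempuan"

rule_support <- support(c(lhs, rhs))
rule_confidence <- rule_support / support(lhs)
rule_lift <- rule_confidence / support(rhs)

print(c(support = rule_support, confidence = rule_confidence, lift = rule_lift))
```

A lift well above 1, as in the rules printed above, indicates that the right-hand item appears with the left-hand items far more often than its overall frequency alone would suggest.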

### Product Packages that can be paired with Slow-Moving Items

In [5]:
slowmove_combination <- apriori(transaksi_tabular, parameter = list(supp = 10/length(transaksi_tabular), confidence = 0.1, minlen= 2, maxlen = 3))

c1 <- subset(slowmove_combination, rhs %in% "Tas Makeup")
result_c1 <- head(sort(c1, by = "lift", decreasing = TRUE), 3)
c2 <- subset(slowmove_combination, rhs %in% "Baju Renang Pria Anak-anak")
result_c2 <- head(sort(c2, by = "lift", decreasing = TRUE), 3)

final_result <- c(result_c1, result_c2)

inspect(final_result)

Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport maxtime     support minlen
        0.1    0.1    1 none FALSE            TRUE       5 0.002898551      2
 maxlen target   ext
      3  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 10 

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[69 item(s), 3450 transaction(s)] done [0.01s].
sorting and recoding items ... [68 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3

"Mining stopped (maxlen reached). Only patterns up to a length of 3 returned!"

 done [0.01s].
writing ... [39832 rule(s)] done [0.01s].
creating S4 object  ... done [0.01s].
    lhs                             rhs                              support confidence     lift count
[1] {Baju Renang Anak Perempuan,                                                                      
     Tas Pinggang Wanita}        => {Tas Makeup}                 0.010434783  0.8000000 19.57447    36
[2] {Baju Renang Anak Perempuan,                                                                      
     Tas Ransel Mini}            => {Tas Makeup}                 0.011304348  0.7959184 19.47460    39
[3] {Baju Renang Anak Perempuan,                                                                      
     Celana Pendek Green/Hijau}  => {Tas Makeup}                 0.010144928  0.7777778 19.03073    35
[4] {Gembok Koper,                                                                                    
     Tas Waist Bag}              => {Baju Renang Pria Anak-anak} 0.004057971  0.2
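The filter-then-rank step can be mirrored on a plain data frame. The sketch below copies a few rules and their lift values from the outputs printed in this notebook, keeps those whose right-hand side is Tas Makeup, and takes the top two by lift. The data frame layout is an assumption made for illustration; arules keeps rules in its own S4 object.

```r
# Rules and lift values copied from the printed inspect() outputs,
# deliberately listed out of order so that the sort matters
rules <- data.frame(
  lhs = c("Tas Makeup, Tas Travel",
          "Baju Renang Anak Perempuan, Celana Pendek Green/Hijau",
          "Tas Makeup, Tas Pinggang Wanita",
          "Baju Renang Anak Perempuan, Tas Ransel Mini",
          "Baju Renang Anak Perempuan, Tas Pinggang Wanita"),
  rhs = c("Baju Renang Anak Perempuan", "Tas Makeup",
          "Baju Renang Anak Perempuan", "Tas Makeup", "Tas Makeup"),
  lift = c(22.64629, 19.03073, 24.42958, 19.47460, 19.57447),
  stringsAsFactors = FALSE
)

# Keep only rules that recommend the slow-moving item
tas_makeup <- rules[rules$rhs == "Tas Makeup", ]

# Rank by lift, strongest first, and keep the top two
tas_makeup <- tas_makeup[order(tas_makeup$lift, decreasing = TRUE), ]
top2 <- head(tas_makeup, 2)
print(top2)
```

Filtering on the right-hand side answers the question "which purchases tend to lead to this item?", which is why the slow-moving product sits on the rhs rather than the lhs.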

### The final result can be seen in: output/kombinasi_retail_slow_moving.txt

In [6]:
write(final_result, file = "output/kombinasi_retail_slow_moving.txt")
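The write call above assumes that the output directory already exists; writing a file in R generally fails if the target directory is missing. A minimal sketch for creating it first is shown below. It uses a temporary directory and placeholder content as stand-ins (both assumptions), so it does not touch the real project folder.

```r
# The notebook writes to "output/"; a temporary directory stands in here
# so the sketch does not modify the working directory (an assumption)
out_dir <- file.path(tempdir(), "output")

# Create the directory only if it is missing
if (!dir.exists(out_dir)) {
  dir.create(out_dir, recursive = TRUE)
}

out_file <- file.path(out_dir, "kombinasi_retail_slow_moving.txt")

# Placeholder content; in the notebook the rules are written at this point
writeLines("placeholder", out_file)
file.exists(out_file)
```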

### Summary
From the results, we obtain the three combinations with the strongest association for each of the two slow-moving items:

Tas Makeup:
- {Baju Renang Anak Perempuan, Tas Pinggang Wanita} => {Tas Makeup}
- {Baju Renang Anak Perempuan, Tas Ransel Mini} => {Tas Makeup}
- {Baju Renang Anak Perempuan, Celana Pendek Green/Hijau} => {Tas Makeup}

Baju Renang Pria Anak-anak:
- {Gembok Koper, Tas Waist Bag} => {Baju Renang Pria Anak-anak}
- {Flat Shoes Ballerina, Gembok Koper} => {Baju Renang Pria Anak-anak}
- {Celana Jeans Sobek Wanita, Jeans Jumbo} => {Baju Renang Pria Anak-anak}
