---
title: "Mall Tenant Synergy Visualization using GraphViz"
author: "Team Algoritma"
date: "Last update: 6th September 2019"
output:
  revealjs::revealjs_presentation:
    theme: serif
    df_print: paged
---
## Background
```{r echo=FALSE, message=FALSE, warning=FALSE}
suppressPackageStartupMessages(library(dplyr))
suppressPackageStartupMessages(library(tidyr))
suppressPackageStartupMessages(library(DT))
suppressPackageStartupMessages(library(arules))
suppressPackageStartupMessages(library(stringr))
suppressPackageStartupMessages(library(visNetwork))
suppressPackageStartupMessages(library(ggplot2))
suppressPackageStartupMessages(library(RColorBrewer))
```
**Case Study: Mall Tenant Analysis**

To manage and rank high-value tenants, we need to identify how much financial value each tenant brings in and how its business affects other tenants (synergy). A high-synergy relationship also indicates that the tenants attract the same group of customers.

## Approach: Customer Shopping Association Rules

By looking at how likely customers are to visit multiple stores in a mall, we can compute each store’s support, which measures its popularity, and the lift between two stores, which measures how strongly visits to the two stores are associated.

The main metrics of association rules in market basket analysis are:

- Support
- Lift
- Confidence
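
For reference, these metrics are commonly defined as follows for a rule $A \Rightarrow B$ over $N$ baskets, where $\mathrm{count}(X)$ is the number of baskets containing every item in $X$. The notation is an assumption made here for illustration, not something taken from the dataset.

$$
\mathrm{supp}(X) = \frac{\mathrm{count}(X)}{N}, \qquad
\mathrm{conf}(A \Rightarrow B) = \frac{\mathrm{supp}(A \cup B)}{\mathrm{supp}(A)}, \qquad
\mathrm{lift}(A \Rightarrow B) = \frac{\mathrm{supp}(A \cup B)}{\mathrm{supp}(A)\,\mathrm{supp}(B)}
$$

A lift above 1 suggests the two stores are visited together more often than independent visits would predict, while a lift close to 1 suggests little association.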
## Dummy Transaction Data
Given the following dummy data, we'll try to implement market basket analysis:
```{r echo=FALSE}
transactions <- read.csv("data_input/transactions.csv")
DT::datatable(transactions, options = list(pageLength=5))
```
## Dummy Tenant and Members Data
The tenants in this example are:
```{r echo=FALSE}
# factor() keeps this working when read.csv() returns character columns (the default since R 4.0)
levels(factor(transactions$Tenant))
```
For now, we use only a small sample of members:
```{r, echo=FALSE}
levels(factor(transactions$CostumerID))
```
## Transaction Type for Market Basket
```{r}
# Keep only the columns needed to build baskets
trans <- transactions %>%
  select(-DiscountVoucher, -SalesSubTotal)

# One basket per customer per day
basket <- split(x = trans$Tenant,                       # values to split into groups
                f = list(trans$CostumerID, trans$Date), # grouping keys
                drop = TRUE)                            # drop groups with no item (Tenant)

# Coerce the list of baskets to an arules "transactions" object
trans <- as(basket, "transactions")
trans
```
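
As a rough illustration of the grouping step, the sketch below builds baskets from a hypothetical four-row table. It only borrows the column names `CostumerID`, `Date`, and `Tenant` from the data above; the values and the `toy_` object names are invented.

```{r}
# Hypothetical toy data: values are invented, column names mirror the slides
toy_visits <- data.frame(
  CostumerID = c("C1", "C1", "C2", "C1"),
  Date       = c("2019-09-01", "2019-09-01", "2019-09-01", "2019-09-02"),
  Tenant     = c("Cafe", "Bookstore", "Cafe", "Cinema"),
  stringsAsFactors = FALSE
)

# One basket per customer per day; drop = TRUE removes empty combinations
toy_basket <- split(x = toy_visits$Tenant,
                    f = list(toy_visits$CostumerID, toy_visits$Date),
                    drop = TRUE)

names(toy_basket)    # basket labels combine customer and date
lengths(toy_basket)  # number of tenant visits in each basket
```

Without `drop = TRUE`, the result would also contain an empty basket for customer C2 on 2019-09-02, a combination that never occurs in the toy data.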
## Popularity Metrics using Support
```{r, echo=FALSE}
# After as.data.frame(), the support values sit in a column named `.`
itemFrequency(trans) %>%
  as.data.frame() %>%
  tibble::rownames_to_column() %>%
  ggplot(aes(x = reorder(rowname, -`.`), y = `.`)) +
  geom_col(fill = "#d63d2d") +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 40, hjust = 1)
  ) +
  labs(title = "Support of Tenants (Popularity)",
       y = "Support", x = "Tenant")
```
## Popularity Metrics using Support (cont.)
```{r echo=FALSE}
# finding Support of Tenants
tenant.support <- data.frame(Support = itemFrequency(trans))
DT::datatable(tenant.support, options = list(pageLength=5))
```
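
Support is the share of baskets that contain a tenant. The following sketch recomputes it by hand on hypothetical baskets, which corresponds to the relative frequency that `itemFrequency()` reports by default. The tenant names and the `toy_` objects are invented for illustration.

```{r}
# Hypothetical baskets: each element is the set of tenants visited in one trip
toy_baskets <- list(
  c("Cafe", "Bookstore"),
  c("Cafe", "Cinema"),
  c("Cafe", "Bookstore", "Cinema"),
  c("Bookstore")
)

toy_tenants <- sort(unique(unlist(toy_baskets)))

# Support of a tenant = baskets containing it / total number of baskets
toy_support <- sapply(toy_tenants, function(tenant) {
  mean(sapply(toy_baskets, function(b) tenant %in% b))
})
toy_support
```

Here the cafe and the bookstore each appear in three of four baskets (support 0.75), and the cinema in two of four (support 0.5).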
## Extract Apriori Rules
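```{r echo=FALSE}
# Mine association rules with at most two items and confidence >= 0.75
# (minimum support is left at the apriori() default of 0.1)
mall_arules <- apriori(data = trans,
                       parameter = list(confidence = 0.75, maxlen = 2),
                       control = list(verbose = FALSE))
mall_arules
```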
```{r echo=FALSE}
# Convert the rules to a data frame and strip the braces from LHS and RHS
mall_arules <- DATAFRAME(mall_arules) %>%
  mutate(LHS = as.character(LHS),
         RHS = as.character(RHS)) %>%
  mutate(LHS = str_replace_all(LHS, "\\{|\\}", ""),
         RHS = str_replace_all(RHS, "\\{|\\}", ""))
DT::datatable(mall_arules, options = list(pageLength = 5))
```
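
To make the thresholds concrete, this sketch computes confidence and lift by hand for every ordered pair of tenants in a set of hypothetical baskets, then keeps the pairs with a confidence of at least 0.75. It loosely mirrors the `confidence = 0.75, maxlen = 2` setting, but it does not call `apriori()` and ignores the minimum support filter. All `toy_` names and values are assumptions.

```{r}
# Hypothetical baskets, invented for illustration
toy_baskets2 <- list(
  c("Cafe", "Bookstore"),
  c("Cafe", "Cinema"),
  c("Cafe", "Bookstore", "Cinema"),
  c("Bookstore")
)

# Share of baskets that contain every item in `items`
toy_supp <- function(items, baskets) {
  mean(sapply(baskets, function(b) all(items %in% b)))
}

# All ordered pairs of distinct tenants
toy_items <- sort(unique(unlist(toy_baskets2)))
toy_pairs <- expand.grid(LHS = toy_items, RHS = toy_items,
                         stringsAsFactors = FALSE)
toy_pairs <- toy_pairs[toy_pairs$LHS != toy_pairs$RHS, ]

# confidence = supp(LHS and RHS) / supp(LHS); lift = confidence / supp(RHS)
toy_pairs$confidence <- mapply(function(l, r) {
  toy_supp(c(l, r), toy_baskets2) / toy_supp(l, toy_baskets2)
}, toy_pairs$LHS, toy_pairs$RHS, USE.NAMES = FALSE)
toy_pairs$lift <- toy_pairs$confidence /
  sapply(toy_pairs$RHS, toy_supp, baskets = toy_baskets2, USE.NAMES = FALSE)

# Keep only the pairs that pass the confidence threshold
toy_rules <- toy_pairs[toy_pairs$confidence >= 0.75, ]
toy_rules
```

In this toy set only the pair Cinema to Cafe passes the threshold: every basket with the cinema also contains the cafe, so its confidence is 1.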
## Graph Data Preparation
```{r warning=FALSE}
# Edges: one arrow per rule, weighted by confidence
edges <- mall_arules %>%
  rename(from = LHS, to = RHS,
         weight = confidence) %>%
  arrange(weight) %>%
  mutate(color = rev(terrain.colors(nrow(mall_arules))),
         width = weight^5 * 10,
         arrows = "to")

# Nodes: one per tenant, sized relative to the most popular tenant
# (the Support column is renamed to value, so the maximum is taken over value)
nodes <- tenant.support %>%
  tibble::rownames_to_column() %>%
  rename(id = rowname, value = Support) %>%
  mutate(size = value / max(value) * 30)
```
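
To see what the `weight^5 * 10` edge width and a node size scaled to the largest support do in practice, the sketch below applies both transformations to a few hypothetical values. The numbers and `toy_` names are invented.

```{r}
# Hypothetical confidences between the 0.75 cut-off and 1
toy_conf <- c(0.75, 0.85, 1.00)

# Raising to the 5th power spreads out values that are close together
toy_width <- toy_conf^5 * 10
round(toy_width, 2)

# Hypothetical supports; the most popular tenant gets size 30
toy_value <- c(0.10, 0.25, 0.50)
toy_size <- toy_value / max(toy_value) * 30
toy_size
```

Confidences that differ by only 0.25 end up with edge widths from roughly 2.4 to 10, which makes the strongest rules stand out in the graph.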
## visNetwork
```{r echo=FALSE}
library(visNetwork)
set.seed(100)
visNetwork(nodes = nodes, edges = edges, width = "100%") %>%
  visLegend() %>%
  visOptions(nodesIdSelection = TRUE, highlightNearest = TRUE)
```
## TODO
- Dynamic user rules extraction using an interactive widget
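
One possible starting point for this item, offered only as a sketch: a plain function that filters a rules data frame by user-chosen thresholds, which an interactive widget could later call. The function name `filter_rules`, its arguments, and the toy data are hypothetical. Only the `LHS`, `RHS`, `confidence`, and `lift` column names follow the rules table shown earlier.

```{r}
# Hypothetical helper: keep rules that meet user-chosen thresholds
filter_rules <- function(rules_df, min_confidence = 0.75, min_lift = 1) {
  keep <- rules_df$confidence >= min_confidence & rules_df$lift >= min_lift
  rules_df[keep, , drop = FALSE]
}

# Invented example rules, for illustration only
toy_rules_df <- data.frame(
  LHS = c("Cinema", "Cafe", "Bookstore"),
  RHS = c("Cafe", "Bookstore", "Cinema"),
  confidence = c(1.00, 0.80, 0.76),
  lift = c(1.33, 1.10, 0.90),
  stringsAsFactors = FALSE
)

filter_rules(toy_rules_df, min_confidence = 0.8, min_lift = 1)
```

Keeping the filter as an ordinary function means it can be checked on its own before any widget is wired to it.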