---
title: "Static Arena"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{arena_static}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  message = FALSE,
  eval = FALSE
)
```
## Setup
```{r}
library(arenar)
apartments <- DALEX::apartments
head(apartments)
```
## Prepare models
Let's compare three models: a GLM and two GBMs with 100 and 500 trees. For each
model, we create an explainer using the DALEX package.
```{r, results = "hide"}
library(gbm)
library(DALEX)
library(dplyr)
model_gbm100 <- gbm(m2.price ~ ., data = apartments, n.trees = 100)
expl_gbm100 <- explain(
  model_gbm100,
  data = apartments,
  y = apartments$m2.price,
  label = "gbm [100 trees]"
)
model_gbm500 <- gbm(m2.price ~ ., data = apartments, n.trees = 500)
expl_gbm500 <- explain(
  model_gbm500,
  data = apartments,
  y = apartments$m2.price,
  label = "gbm [500 trees]"
)
model_glm <- glm(m2.price ~ ., data = apartments)
expl_glm <- explain(model_glm, data = apartments, y = apartments$m2.price)
```
## Prepare observations
Plots for a static Arena are pre-calculated, which costs both computation time
and file size. To keep this example small, we will use only apartments built in
2009 or later. A random sample would work as well.
```{r}
observations <- apartments %>% filter(construction.year >= 2009)
# Observations' names are taken from rownames
rownames(observations) <- paste0(
  observations$district,
  " ",
  observations$surface,
  "m2 "
)
```
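As an alternative to filtering by year, the random sample mentioned above could be drawn roughly as follows. This is only a sketch: `sample_observations()` is a hypothetical helper, not part of arenar or DALEX, and it assumes the data frame has `district` and `surface` columns, as `apartments` does. Because a random sample may contain two apartments with the same district and surface, the helper passes the labels through `make.unique()` so that the row names stay unique.

```{r}
# Hypothetical helper (not part of arenar): draw a reproducible random sample
# of rows and name them following a pattern similar to the one used above.
sample_observations <- function(data, n, seed = 1) {
  # Note: this resets the global random number generator state
  set.seed(seed)
  # Never ask for more rows than the data frame has
  rows <- sample.int(nrow(data), min(n, nrow(data)))
  sampled <- data[rows, , drop = FALSE]
  # Row names must be unique, so duplicated labels get a numeric suffix
  labels <- paste0(sampled$district, " ", sampled$surface, "m2")
  rownames(sampled) <- make.unique(labels)
  sampled
}
```

For instance, `sample_observations(apartments, 20)` would return 20 named rows that can be passed to `push_observations()` in place of the filtered data frame.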
## Create arena
```{r, eval = FALSE}
arena <- create_arena() %>%
  # Push an explainer for each model
  push_model(expl_gbm100) %>%
  push_model(expl_gbm500) %>%
  push_model(expl_glm) %>%
  # Push the data frame of observations
  push_observations(observations) %>%
  # Upload calculated arena files to Gist and open Arena in browser
  upload_arena()
```
## Appending data
There are two ways to add new observations or new models without recalculating
already generated plots. Let's add apartments built in 2008. Adding models works
similarly.
```{r}
observations2 <- apartments %>% filter(construction.year == 2008)
# Observations' names are taken from rownames
rownames(observations2) <- paste0(
  observations2$district,
  " ",
  observations2$surface,
  "m2 "
)
```
### New Arena session
We can add observations to the already existing arena object and call
`upload_arena()`.
```{r, eval = FALSE}
arena %>%
  push_observations(observations2) %>%
  upload_arena()
```
### Append to already existing session
Sometimes we don't want to close the Arena session and just want to add data.
The `upload_arena` function has an argument for that, `append_data`. Remember to
start from a new arena object and to push all models and all observations that
are required for the plots you want to append.
```{r, eval = FALSE}
create_arena() %>%
  # Push the new observations prepared above
  push_observations(observations2) %>%
  push_model(expl_glm) %>%
  push_model(expl_gbm100) %>%
  push_model(expl_gbm500) %>%
  upload_arena(append_data = TRUE)
```