/
12-workflows-solutions.Rmd
98 lines (68 loc) · 2.02 KB
/
12-workflows-solutions.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
---
title: "Workflows - Solutions"
output: html_notebook
editor_options:
chunk_output_type: inline
---
```{r setup, include=FALSE}
library(tidyverse)
library(AmesHousing)
library(tidymodels)
ames <- make_ames() %>%
dplyr::select(-matches("Qu"))
set.seed(100)
ames_split <- initial_split(ames)
ames_train <- training(ames_split)
ames_test <- testing(ames_split)
```
# Your Turn 1
Build a workflow that uses a linear model to predict `Sale_Price` with `Bedrooms_AbvGr`, `Full_Bath` and `Half_Bath` in `ames`. Save it as `bb_wf`.
```{r}
lm_spec <-
linear_reg() %>%
set_engine("lm")
bb_wf <-
workflow() %>%
add_formula(Sale_Price ~ Bedroom_AbvGr + Full_Bath + Half_Bath) %>%
add_model(lm_spec)
```
# Your Turn 2
Test the linear model that predicts `Sale_Price` with everything else in `ames`. Use cross validation to estimate the RMSE.
1. Create a new workflow by updating `bb_wf`.
1. Use `vfold_cv()` to create a 10-fold cross validation of `ames_train`.
1. Fit the workflow
1. Use `collect_metrics()` to estimate the RMSE.
```{r}
all_wf <-
bb_wf %>%
update_formula(Sale_Price ~ .)
ames_folds <- vfold_cv(ames_train, v = 10)
fit_resamples(all_wf, resamples = ames_folds) %>%
collect_metrics()
```
# Your Turn 3
Fill in the blanks to test the regression tree model that predicts `Sale_Price` with _everything else in `ames`_ on `ames_folds`. What RMSE do you get?
*Hint: Create a new workflow by updating `all_wf`.*
```{r}
rt_spec <-
decision_tree() %>%
set_engine(engine = "rpart") %>%
set_mode("regression")
rt_wf <-
all_wf %>%
update_model(rt_spec)
fit_resamples(rt_wf, resamples = ames_folds) %>%
collect_metrics()
```
# Your Turn 4
But what about the predictions of our model? Save the fitted object from your regression tree, and use `collect_predictions()` to see the predictions generated from the test data.
```{r}
all_fitwf <-
fit_resamples(
rt_wf,
resamples = ames_folds,
control = control_resamples(save_pred = TRUE)
)
all_fitwf %>%
collect_predictions()
```