Permalink
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
241 lines (179 sloc) 6.61 KB
---
title: "Variable Selection Methods"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Variable Selection Methods}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, echo=FALSE, message=FALSE}
library(olsrr)
library(dplyr)
library(ggplot2)
library(gridExtra)
library(purrr)
library(tibble)
library(nortest)
library(goftest)
```
<style type="text/css">
#best-subset-regression pre { /* Code block */
font-size: 10px
}
</style>
## Introduction
## All Possible Regression
All subset regression tests all possible subsets of the set of potential independent variables. If there are K potential independent variables (besides the constant), then there are $2^{k}$ distinct subsets of them to be tested. For example, if you have 10 candidate independent variables, the number of subsets to be tested is $2^{10}$, which is 1024, and if you have 20 candidate variables, the number is $2^{20}$, which is more than one million.
```{r allsub}
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_step_all_possible(model)
```
The `plot` method shows the panel of fit criteria for all possible regression methods.
```{r allsubplot, fig.width=10, fig.height=10, fig.align='center'}
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
k <- ols_step_all_possible(model)
plot(k)
```
## Best Subset Regression
Select the subset of predictors that do the best at meeting some well-defined objective criterion,
such as having the largest R2 value or the smallest MSE, Mallow's Cp or AIC.
```{r bestsub, size='tiny'}
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
ols_step_best_subset(model)
```
The `plot` method shows the panel of fit criteria for best subset regression methods.
```{r bestsubplot, fig.width=10, fig.height=10, fig.align='center'}
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars)
k <- ols_step_best_subset(model)
plot(k)
```
## Stepwise Forward Regression
Build regression model from a set of candidate predictor variables by entering predictors based on
p values, in a stepwise manner until there is no variable left to enter any more. The model should include all the candidate predictor variables. If details is set to `TRUE`, each step is displayed.
### Variable Selection
```{r stepf1}
# stepwise forward regression
model <- lm(y ~ ., data = surgical)
ols_step_forward_p(model)
```
### Plot
```{r stepf2, fig.width=10, fig.height=10, fig.align='center'}
model <- lm(y ~ ., data = surgical)
k <- ols_step_forward_p(model)
plot(k)
```
### Detailed Output
```{r stepwisefdetails}
# stepwise forward regression
model <- lm(y ~ ., data = surgical)
ols_step_forward_p(model, details = TRUE)
```
## Stepwise Backward Regression
Build regression model from a set of candidate predictor variables by removing predictors based on
p values, in a stepwise manner until there is no variable left to remove any more. The model should include all the candidate predictor variables. If details is set to `TRUE`, each step is displayed.
### Variable Selection
```{r stepb, fig.width=10, fig.height=10, fig.align='center'}
# stepwise backward regression
model <- lm(y ~ ., data = surgical)
ols_step_backward_p(model)
```
### Plot
```{r stepb2, fig.width=10, fig.height=10, fig.align='center'}
model <- lm(y ~ ., data = surgical)
k <- ols_step_backward_p(model)
plot(k)
```
### Detailed Output
```{r stepwisebdetails}
# stepwise backward regression
model <- lm(y ~ ., data = surgical)
ols_step_backward_p(model, details = TRUE)
```
## Stepwise Regression
Build regression model from a set of candidate predictor variables by entering and removing predictors based on
p values, in a stepwise manner until there is no variable left to enter or remove any more. The model should include all the candidate predictor variables. If details is set to `TRUE`, each step is displayed.
### Variable Selection
```{r stepwise1}
# stepwise regression
model <- lm(y ~ ., data = surgical)
ols_step_both_p(model)
```
### Plot
```{r stepwise2, fig.width=10, fig.height=10, fig.align='center'}
model <- lm(y ~ ., data = surgical)
k <- ols_step_both_p(model)
plot(k)
```
### Detailed Output
```{r stepwisedetails}
# stepwise regression
model <- lm(y ~ ., data = surgical)
ols_step_both_p(model, details = TRUE)
```
## Stepwise AIC Forward Regression
Build regression model from a set of candidate predictor variables by entering predictors based on
Akaike Information Criteria, in a stepwise manner until there is no variable left to enter any more.
The model should include all the candidate predictor variables. If details is set to `TRUE`, each step is displayed.
### Variable Selection
```{r stepaicf1}
# stepwise aic forward regression
model <- lm(y ~ ., data = surgical)
ols_step_forward_aic(model)
```
### Plot
```{r stepaicf2, fig.width=5, fig.height=5, fig.align='center'}
model <- lm(y ~ ., data = surgical)
k <- ols_step_forward_aic(model)
plot(k)
```
### Detailed Output
```{r stepwiseaicfdetails}
# stepwise aic forward regression
model <- lm(y ~ ., data = surgical)
ols_step_forward_aic(model, details = TRUE)
```
## Stepwise AIC Backward Regression
Build regression model from a set of candidate predictor variables by removing predictors based on
Akaike Information Criteria, in a stepwise manner until there is no variable left to remove any more.
The model should include all the candidate predictor variables. If details is set to `TRUE`, each step is displayed.
### Variable Selection
```{r stepaicb1}
# stepwise aic backward regression
model <- lm(y ~ ., data = surgical)
k <- ols_step_backward_aic(model)
k
```
### Plot
```{r stepaicb2, fig.width=5, fig.height=5, fig.align='center'}
model <- lm(y ~ ., data = surgical)
k <- ols_step_backward_aic(model)
plot(k)
```
### Detailed Output
```{r stepwiseaicbdetails}
# stepwise aic backward regression
model <- lm(y ~ ., data = surgical)
ols_step_backward_aic(model, details = TRUE)
```
## Stepwise AIC Regression
Build regression model from a set of candidate predictor variables by entering and removing predictors based on
Akaike Information Criteria, in a stepwise manner until there is no variable left to enter or remove any more.
The model should include all the candidate predictor variables. If details is set to `TRUE`, each step is displayed.
### Variable Selection
```{r stepwiseaic1}
# stepwise aic regression
model <- lm(y ~ ., data = surgical)
ols_step_both_aic(model)
```
### Plot
```{r stepwiseaic2, fig.width=5, fig.height=5, fig.align='center'}
model <- lm(y ~ ., data = surgical)
k <- ols_step_both_aic(model)
plot(k)
```
### Detailed Output
```{r stepwiseaicdetails}
# stepwise aic regression
model <- lm(y ~ ., data = surgical)
ols_step_both_aic(model, details = TRUE)
```