Permalink
Cannot retrieve contributors at this time
--- | |
title: "Variable Selection Methods" | |
output: rmarkdown::html_vignette | |
vignette: > | |
%\VignetteIndexEntry{Variable Selection Methods} | |
%\VignetteEngine{knitr::rmarkdown} | |
%\VignetteEncoding{UTF-8} | |
--- | |
```{r, echo=FALSE, message=FALSE} | |
library(olsrr) | |
library(ggplot2) | |
library(gridExtra) | |
library(nortest) | |
library(goftest) | |
``` | |
<style type="text/css"> | |
#best-subset-regression pre { /* Code block */ | |
font-size: 10px | |
} | |
</style> | |
## Introduction | |
## All Possible Regression | |
All subset regression tests all possible subsets of the set of potential independent variables. If there are K potential independent variables (besides the constant), then there are $2^{k}$ distinct subsets of them to be tested. For example, if you have 10 candidate independent variables, the number of subsets to be tested is $2^{10}$, which is 1024, and if you have 20 candidate variables, the number is $2^{20}$, which is more than one million. | |
```{r allsub} | |
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars) | |
ols_step_all_possible(model) | |
``` | |
The `plot` method shows the panel of fit criteria for all possible regression methods. | |
```{r allsubplot, fig.width=10, fig.height=10, fig.align='center'} | |
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars) | |
k <- ols_step_all_possible(model) | |
plot(k) | |
``` | |
## Best Subset Regression | |
Select the subset of predictors that do the best at meeting some well-defined objective criterion, | |
such as having the largest R2 value or the smallest MSE, Mallow's Cp or AIC. | |
```{r bestsub, size='tiny'} | |
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars) | |
ols_step_best_subset(model) | |
``` | |
The `plot` method shows the panel of fit criteria for best subset regression methods. | |
```{r bestsubplot, fig.width=10, fig.height=10, fig.align='center'} | |
model <- lm(mpg ~ disp + hp + wt + qsec, data = mtcars) | |
k <- ols_step_best_subset(model) | |
plot(k) | |
``` | |
## Stepwise Forward Regression | |
Build regression model from a set of candidate predictor variables by entering predictors based on | |
p values, in a stepwise manner until there is no variable left to enter any more. The model should include all the candidate predictor variables. If details is set to `TRUE`, each step is displayed. | |
### Variable Selection | |
```{r stepf1} | |
# stepwise forward regression | |
model <- lm(y ~ ., data = surgical) | |
ols_step_forward_p(model) | |
``` | |
### Plot | |
```{r stepf2, fig.width=10, fig.height=10, fig.align='center'} | |
model <- lm(y ~ ., data = surgical) | |
k <- ols_step_forward_p(model) | |
plot(k) | |
``` | |
### Detailed Output | |
```{r stepwisefdetails} | |
# stepwise forward regression | |
model <- lm(y ~ ., data = surgical) | |
ols_step_forward_p(model, details = TRUE) | |
``` | |
## Stepwise Backward Regression | |
Build regression model from a set of candidate predictor variables by removing predictors based on | |
p values, in a stepwise manner until there is no variable left to remove any more. The model should include all the candidate predictor variables. If details is set to `TRUE`, each step is displayed. | |
### Variable Selection | |
```{r stepb, fig.width=10, fig.height=10, fig.align='center'} | |
# stepwise backward regression | |
model <- lm(y ~ ., data = surgical) | |
ols_step_backward_p(model) | |
``` | |
### Plot | |
```{r stepb2, fig.width=10, fig.height=10, fig.align='center'} | |
model <- lm(y ~ ., data = surgical) | |
k <- ols_step_backward_p(model) | |
plot(k) | |
``` | |
### Detailed Output | |
```{r stepwisebdetails} | |
# stepwise backward regression | |
model <- lm(y ~ ., data = surgical) | |
ols_step_backward_p(model, details = TRUE) | |
``` | |
## Stepwise Regression | |
Build regression model from a set of candidate predictor variables by entering and removing predictors based on | |
p values, in a stepwise manner until there is no variable left to enter or remove any more. The model should include all the candidate predictor variables. If details is set to `TRUE`, each step is displayed. | |
### Variable Selection | |
```{r stepwise1} | |
# stepwise regression | |
model <- lm(y ~ ., data = surgical) | |
ols_step_both_p(model) | |
``` | |
### Plot | |
```{r stepwise2, fig.width=10, fig.height=10, fig.align='center'} | |
model <- lm(y ~ ., data = surgical) | |
k <- ols_step_both_p(model) | |
plot(k) | |
``` | |
### Detailed Output | |
```{r stepwisedetails} | |
# stepwise regression | |
model <- lm(y ~ ., data = surgical) | |
ols_step_both_p(model, details = TRUE) | |
``` | |
## Stepwise AIC Forward Regression | |
Build regression model from a set of candidate predictor variables by entering predictors based on | |
Akaike Information Criteria, in a stepwise manner until there is no variable left to enter any more. | |
The model should include all the candidate predictor variables. If details is set to `TRUE`, each step is displayed. | |
### Variable Selection | |
```{r stepaicf1} | |
# stepwise aic forward regression | |
model <- lm(y ~ ., data = surgical) | |
ols_step_forward_aic(model) | |
``` | |
### Plot | |
```{r stepaicf2, fig.width=5, fig.height=5, fig.align='center'} | |
model <- lm(y ~ ., data = surgical) | |
k <- ols_step_forward_aic(model) | |
plot(k) | |
``` | |
### Detailed Output | |
```{r stepwiseaicfdetails} | |
# stepwise aic forward regression | |
model <- lm(y ~ ., data = surgical) | |
ols_step_forward_aic(model, details = TRUE) | |
``` | |
## Stepwise AIC Backward Regression | |
Build regression model from a set of candidate predictor variables by removing predictors based on | |
Akaike Information Criteria, in a stepwise manner until there is no variable left to remove any more. | |
The model should include all the candidate predictor variables. If details is set to `TRUE`, each step is displayed. | |
### Variable Selection | |
```{r stepaicb1} | |
# stepwise aic backward regression | |
model <- lm(y ~ ., data = surgical) | |
k <- ols_step_backward_aic(model) | |
k | |
``` | |
### Plot | |
```{r stepaicb2, fig.width=5, fig.height=5, fig.align='center'} | |
model <- lm(y ~ ., data = surgical) | |
k <- ols_step_backward_aic(model) | |
plot(k) | |
``` | |
### Detailed Output | |
```{r stepwiseaicbdetails} | |
# stepwise aic backward regression | |
model <- lm(y ~ ., data = surgical) | |
ols_step_backward_aic(model, details = TRUE) | |
``` | |
## Stepwise AIC Regression | |
Build regression model from a set of candidate predictor variables by entering and removing predictors based on | |
Akaike Information Criteria, in a stepwise manner until there is no variable left to enter or remove any more. | |
The model should include all the candidate predictor variables. If details is set to `TRUE`, each step is displayed. | |
### Variable Selection | |
```{r stepwiseaic1} | |
# stepwise aic regression | |
model <- lm(y ~ ., data = surgical) | |
ols_step_both_aic(model) | |
``` | |
### Plot | |
```{r stepwiseaic2, fig.width=5, fig.height=5, fig.align='center'} | |
model <- lm(y ~ ., data = surgical) | |
k <- ols_step_both_aic(model) | |
plot(k) | |
``` | |
### Detailed Output | |
```{r stepwiseaicdetails} | |
# stepwise aic regression | |
model <- lm(y ~ ., data = surgical) | |
ols_step_both_aic(model, details = TRUE) | |
``` |