Stepwise forward regression fails when model formula contains inline functions #8

Closed
aravindhebbali opened this issue Jun 3, 2017 · 1 comment

aravindhebbali (Member) commented Jun 3, 2017

ols_step_forward() returns an error when the model formula contains inline functions or interaction variables.

> library(olsrr)
> library(caret)
> data("Sacramento")
> lm_fit2 <- lm(price  ~ beds + baths + log(sqft), data = Sacramento)
> ols_step_forward(lm_fit2)
We are selecting variables based on p value...
Error in eval(predvars, data, env) : object 'sqft' not found
Called from: eval(predvars, data, env)

> lm_fit1 <- lm(log(price)  ~ . - city, data = Sacramento)
> ols_step_forward(lm_fit1)
We are selecting variables based on p value...
Error in eval(predvars, data, env) : object 'price' not found
Called from: eval(predvars, data, env)

# interaction variables
> lm_fit3 <- lm(mpg ~ disp + hp + wt + am * disp, data = mtcars)
> ols_step_forward(lm_fit3)
We are selecting variables based on p value...
1 variable(s) added....
1 variable(s) added...
1 variable(s) added...
1 variable(s) added...
Error in ols_mallows_cp(fr$model, model) : 
  model must be a subset of full model
Called from: ols_mallows_cp(fr$model, model)
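
One possible reading of both errors, offered as an assumption rather than a confirmed diagnosis of olsrr's internals: the column names of a fitted model's frame match neither the raw variables nor the term labels. The sketch below uses only base R and `mtcars` (so `caret` is not required); `fit_inline`, `fit_inter`, `mf`, and `refit_msg` are illustrative names that do not appear in the report.

```r
# Inline function: the model frame stores the transformed column under the
# name "log(disp)", so the raw variable `disp` is absent from it.
fit_inline <- lm(mpg ~ log(disp) + wt, data = mtcars)
mf <- model.frame(fit_inline)
names(mf)  # "mpg" "log(disp)" "wt"

# Refitting the same formula against the model frame (instead of the original
# data) needs `disp` again, which the frame no longer carries.
refit_msg <- tryCatch(
  {
    lm(formula(fit_inline), data = mf)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
refit_msg  # an "object ... not found" message, as in the report

# Interaction: `am * disp` expands to am + disp + disp:am, so there are five
# term labels but only four predictor columns in the model frame.
fit_inter <- lm(mpg ~ disp + hp + wt + am * disp, data = mtcars)
attr(terms(fit_inter), "term.labels")  # "disp" "hp" "wt" "am" "disp:am"
names(model.frame(fit_inter))[-1]      # "disp" "hp" "wt" "am"
```

If a stepwise routine builds candidate formulas from one of these name sets and evaluates them against the other, failures like the two above would be expected. Whether that is what happened here is not stated in the issue.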

Session Info

> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_India.1252  LC_CTYPE=English_India.1252   
[3] LC_MONETARY=English_India.1252 LC_NUMERIC=C                  
[5] LC_TIME=English_India.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Rcpp_0.12.7     gridExtra_2.0.0 tidyr_0.6.0     tibble_1.2     
[5] purrr_0.2.2     dplyr_0.5.0     caret_6.0-76    ggplot2_2.2.1  
[9] lattice_0.20-35

loaded via a namespace (and not attached):
 [1] magrittr_1.5       splines_3.4.0      MASS_7.3-47       
 [4] munsell_0.4.3      colorspace_1.2-7   R6_2.2.1          
 [7] foreach_1.4.3      minqa_1.2.4        stringr_1.1.0     
[10] car_2.1-2          plyr_1.8.4         tools_3.4.0       
[13] parallel_3.4.0     nnet_7.3-12        pbkrtest_0.4-6    
[16] grid_3.4.0         gtable_0.2.0       nlme_3.1-125      
[19] mgcv_1.8-17        quantreg_5.19      DBI_0.5-1         
[22] MatrixModels_0.4-1 iterators_1.0.8    lme4_1.1-11       
[25] lazyeval_0.2.0     assertthat_0.2.0   Matrix_1.2-9      
[28] nloptr_1.0.4       reshape2_1.4.2     ModelMetrics_1.1.0
[31] codetools_0.2-15   stringi_1.1.2      compiler_3.4.0    
[34] scales_0.4.1       stats4_3.4.0       SparseM_1.7 

aravindhebbali added the bug label Jun 3, 2017

aravindhebbali self-assigned this Jun 3, 2017

aravindhebbali added this to the v0.2.0 milestone Jun 5, 2017

aravindhebbali added a commit that referenced this issue Jun 5, 2017

aravindhebbali (Member, Author) commented Jun 5, 2017

ols_step_forward() no longer returns an error when the model formula contains inline functions or interaction variables.

> library(olsrr)
> library(caret)
> data("Sacramento")
> lm_fit2 <- lm(price  ~ beds + baths + log(sqft), data = Sacramento)
> ols_step_forward(lm_fit2)
We are selecting variables based on p value...
1 variable(s) added....
1 variable(s) added...
No more variables satisfy the condition of penter: 0.3
Forward Selection Method                                                        

Candidate Terms:                                                                

1 . beds                                                                        
2 . baths                                                                       
3 . log(sqft)                                                                   

-------------------------------------------------------------------------------
                               Selection Summary                                
-------------------------------------------------------------------------------
        Variable                   Adj.                                            
Step     Entered     R-Square    R-Square     C(p)         AIC          RMSE       
-------------------------------------------------------------------------------
   1    log(sqft)       0.568       0.567    52.6943    23833.1040    86242.3553    
   2    beds            0.591       0.590     2.9559    23784.5900    83981.7543    
-------------------------------------------------------------------------------

# interaction variables
> lm_fit3 <- lm(mpg ~ disp + hp + wt + am * disp, data = mtcars)
> ols_step_forward(lm_fit3)
We are selecting variables based on p value...
1 variable(s) added....
1 variable(s) added...
1 variable(s) added...
1 variable(s) added...
1 variable(s) added...
Forward Selection Method                                                  

Candidate Terms:                                                          

1 . disp                                                                  
2 . hp                                                                    
3 . wt                                                                    
4 . am                                                                    
5 . disp:am                                                               

-------------------------------------------------------------------------
                            Selection Summary                             
-------------------------------------------------------------------------
        Variable                  Adj.                                       
Step    Entered     R-Square    R-Square     C(p)        AIC        RMSE     
-------------------------------------------------------------------------
   1    wt             0.753       0.745    15.7814    166.0294    3.0458    
   2    hp             0.827       0.815     4.6820    156.6523    2.5935    
   3    am             0.840       0.823     4.3607    156.1348    2.5375    
   4    disp:am        0.853       0.831     4.0081    155.3638    2.4747    
   5    disp           0.853       0.825     6.0000    157.3538    2.5213    
------------------------------------------------------------------------- 
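
For readers who want to see the idea behind the summary tables, here is a minimal base-R sketch of forward selection by p-value. It is not olsrr's implementation: `forward_by_p` and `fit_terms` are hypothetical helpers, the entry test is a partial F-test from `anova()`, and the default `penter = 0.3` is simply taken from the message in the output above. Candidate terms come from the term labels and every model is fitted against the original data, so inline functions and interactions are handled like any other term.

```r
# Hypothetical helper: forward selection using partial F-test p-values.
forward_by_p <- function(full_fit, data, penter = 0.3) {
  response   <- deparse(formula(full_fit)[[2]])        # e.g. "mpg" or "log(price)"
  candidates <- attr(terms(full_fit), "term.labels")   # includes "log(x)" and "a:b"
  entered    <- character(0)

  # Fit the response on a set of term labels, always using the original data.
  fit_terms <- function(labels) {
    rhs <- if (length(labels) == 0) "1" else paste(labels, collapse = " + ")
    lm(as.formula(paste(response, "~", rhs)), data = data)
  }

  current <- fit_terms(entered)
  while (length(candidates) > 0) {
    # p-value for adding each remaining candidate to the current model
    pvals <- vapply(candidates, function(term) {
      anova(current, fit_terms(c(entered, term)))[["Pr(>F)"]][2]
    }, numeric(1))
    if (all(is.na(pvals))) break
    best <- names(which.min(pvals))
    if (pvals[[best]] > penter) break                  # nothing left that qualifies
    entered    <- c(entered, best)
    candidates <- setdiff(candidates, best)
    current    <- fit_terms(entered)
  }
  entered
}

fit <- lm(mpg ~ disp + hp + wt + am * disp, data = mtcars)
selected <- forward_by_p(fit, mtcars)
selected
```

The order of entry after the first step need not match the table above, since olsrr may use a different entry test or tie-breaking rule.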