Fuzzy methods throw warnings about data range #498

Closed
nebfield opened this issue Sep 29, 2016 · 2 comments

Comments

@nebfield
Contributor

Hello,

Every fuzzy method I've tried throws many instances of the following warning: "There are your newdata which are out of the specified range". By chance I found a day-old Stack Exchange post about this problem. The cause appears to be that range.data is calculated from the data passed to each model fit, so when cross-validation is used, new data from a different fold can fall outside the precalculated range.

When checking the output of $fit from getModelInfo("FRBCS.W") I spotted the following code:

args$range.data <- apply(x, 2, extendrange)

Calling the frbs.learn function directly allows range.data to be passed as a parameter. A possible solution would be for train to accept range.data as a parameter as well. That way a user could calculate range.data from the entire dataset, as suggested in the Stack Exchange post. Below is an MRE.

Thanks!

Minimal, reproducible example:

Minimal dataset:

library(caret)
set.seed(0451)
data(iris)
irisShuffled <- iris[sample(nrow(iris)),]

inTrain <- createDataPartition(y = irisShuffled$Species,
                               p = .75,
                               list = FALSE)

training <- irisShuffled[inTrain,]
testing <- irisShuffled[-inTrain,]

ctrl <- trainControl(method = "repeatedcv",
                     repeats = 3)

Minimal, runnable code:

fuzzyFit <- train(Species ~ .,
                  data = training,
                  method = "FRBCS.W",
                  trControl = ctrl)

> There were 27 warnings (use warnings() to see them)
Warning messages:
1: In validate.params(object, newdata) :
  There are your newdata which are out of the specified range
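
The mechanism behind the warning can be illustrated in base R without caret or frbs. This is only a sketch under stated assumptions: the toy matrix `x`, the single hand-picked resample, and the helper `outside_range` are hypothetical and are not part of either package. It compares a range built from the training rows of one resample (the way the `apply(x, 2, extendrange)` line quoted above does) with a range built from the whole dataset.

```r
# Toy feature matrix: the last row holds the most extreme values.
x <- cbind(a = c(1, 2, 3, 4, 10),
           b = c(5, 6, 7, 8, 20))

train_rows <- 1:4                      # rows used for fitting in this resample
holdout    <- x[5, , drop = FALSE]     # row held out for prediction

# Range derived from the training rows only (2 x p matrix: min row, max row).
fold_range <- apply(x[train_rows, , drop = FALSE], 2, grDevices::extendrange)

# Range derived once from the entire dataset.
full_range <- apply(x, 2, grDevices::extendrange)

# Hypothetical helper: TRUE for every row of newdata that has at least one
# value below the column minimum or above the column maximum in range.data.
outside_range <- function(newdata, range.data) {
  below <- sweep(newdata, 2, range.data[1, ], "<")
  above <- sweep(newdata, 2, range.data[2, ], ">")
  apply(below | above, 1, any)
}

fold_check <- outside_range(holdout, fold_range)  # held-out row vs. fold range
full_check <- outside_range(holdout, full_range)  # held-out row vs. full range
print(fold_check)
print(full_check)
```

In this toy case the held-out row lies beyond the fold-level range but inside the full-data range, which matches the explanation given for the warning during resampling.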

Session Info:

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8       
 [4] LC_COLLATE=en_GB.UTF-8     LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] frbs_3.1-0      caret_6.0-71    ggplot2_2.1.0   lattice_0.20-34

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.7        magrittr_1.5       splines_3.3.1      MASS_7.3-45       
 [5] munsell_0.4.3      colorspace_1.2-6   foreach_1.4.3      minqa_1.2.4       
 [9] stringr_1.1.0      car_2.1-3          plyr_1.8.4         tools_3.3.1       
[13] nnet_7.3-12        pbkrtest_0.4-6     parallel_3.3.1     grid_3.3.1        
[17] gtable_0.2.0       nlme_3.1-128       mgcv_1.8-15        quantreg_5.29     
[21] e1071_1.6-7        class_7.3-14       MatrixModels_0.4-1 iterators_1.0.8   
[25] lme4_1.1-12        Matrix_1.2-7.1     nloptr_1.0.4       reshape2_1.4.1    
[29] codetools_0.2-14   rsconnect_0.4.3    stringi_1.1.1      compiler_3.3.1    
[33] scales_0.4.0       stats4_3.3.1       SparseM_1.72     
@topepo
Owner

topepo commented Sep 30, 2016

I can change the code so that, if you pass range.data to train, it will be used. If not, it will follow the current process.
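
The fallback described here could look roughly like the following sketch. It is not the code from the referenced commit: `resolve_range` is a hypothetical helper name, and the only assumption taken from the thread is that the existing default is `apply(x, 2, extendrange)`.

```r
# Hypothetical helper (not caret's actual implementation): prefer a
# user-supplied range.data, otherwise fall back to the current default.
resolve_range <- function(x, ...) {
  args <- list(...)
  if (is.null(args[["range.data"]])) {
    # Current behaviour: derive the range from the data passed to the fit.
    args[["range.data"]] <- apply(x, 2, grDevices::extendrange)
  }
  args[["range.data"]]
}

x <- cbind(a = c(1, 2, 3, 4, 10),
           b = c(5, 6, 7, 8, 20))

# A range the user computed up front, e.g. from the entire dataset.
user_range <- apply(x, 2, range)

supplied <- resolve_range(x[1:4, ], range.data = user_range)  # user value wins
fallback <- resolve_range(x[1:4, ])                           # default is used
```

With this shape, existing calls that never mention range.data behave as before, and only callers who pass it explicitly get the fixed range.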

This probably affects other models too.

topepo added a commit that referenced this issue Oct 9, 2016
@topepo
Owner

topepo commented Oct 20, 2016

Can you test these on your data?

@topepo topepo closed this as completed Oct 25, 2016