R XGBoost: No longer producing predictions on .RData or .model files when supplied a matrix #5815

Closed
tmbluth opened this issue Jun 19, 2020 · 13 comments


tmbluth commented Jun 19, 2020

A couple of months ago I was able to run the code below to predict test data outcomes after loading .RData or .model XGBoost files. The models predicted on .csv data that was loaded with read.csv() and then converted to a matrix with as.matrix(). After updating to the latest R and XGBoost versions, I can no longer predict on my test data with this code:

predict(object=model1_proto, newdata=as.matrix(test_set[,model1_proto_inputs]))

I was instead greeted by a somewhat confusing error message:

Error in predict.xgb.Booster(object = model1_proto, newdata = as.matrix(test_set[,  : 
  [12:58:12] amalgamation/../src/learner.cc:506: Check failed: mparam_.num_feature != 0 (0 vs. 0) : 0 feature is supplied.  Are you using raw Booster interface?

After double-checking that my inputs and my model were intact, I couldn't tell what the issue was. After searching the web for similar issues, the closest thing I could find was #5599, though I'm not sure it's exactly the same issue.

Here is my sessionInfo():

R version 4.0.1 (2020-06-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server x64 (build 14393)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
 [1] **xgboost_1.1.1.1** forcats_0.5.0   stringr_1.4.0  
 [4] dplyr_1.0.0     purrr_0.3.4     readr_1.3.1    
 [7] tidyr_1.1.0     tibble_3.0.1    ggplot2_3.3.1  
[10] tidyverse_1.3.0 devtools_2.3.0  usethis_1.6.1  

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.4.6      lubridate_1.7.9   lattice_0.20-41  
 [4] prettyunits_1.1.1 ps_1.3.3          assertthat_0.2.1 
 [7] rprojroot_1.3-2   digest_0.6.25     R6_2.4.1         
[10] cellranger_1.1.0  backports_1.1.7   reprex_0.3.0     
[13] httr_1.4.1        pillar_1.4.4      rlang_0.4.6      
[16] readxl_1.3.1      rstudioapi_0.11   data.table_1.12.8
[19] callr_3.4.3       blob_1.2.1        Matrix_1.2-18    
[22] desc_1.2.0        munsell_0.5.0     tinytex_0.23     
[25] broom_0.5.6       compiler_4.0.1    modelr_0.1.8     
[28] xfun_0.14         pkgconfig_2.0.3   pkgbuild_1.0.8   
[31] tidyselect_1.1.0  fansi_0.4.1       crayon_1.3.4     
[34] dbplyr_1.4.4      withr_2.2.0       grid_4.0.1       
[37] nlme_3.1-148      jsonlite_1.6.1    gtable_0.3.0     
[40] lifecycle_0.2.0   DBI_1.1.0         magrittr_1.5     
[43] scales_1.1.1      stringi_1.4.6     cli_2.0.2        
[46] fs_1.4.1          remotes_2.1.1     testthat_2.3.2   
[49] xml2_1.3.2        ellipsis_0.3.1    generics_0.0.2   
[52] vctrs_0.3.1       tools_4.0.1       glue_1.4.1       
[55] hms_0.5.3         processx_3.4.2    pkgload_1.1.0    
[58] yaml_2.2.1        colorspace_1.4-1  sessioninfo_1.1.1
[61] rvest_0.3.5       memoise_1.1.0     knitr_1.28       
[64] haven_2.3.1  

I'm trying to find out whether this is a bug or whether the new version requires a different method of prediction. If it does, that's not obvious from the release notes.

Collaborator

hcho3 commented Jun 19, 2020

Did you use saveRDS() method to produce model1_proto? See the thread #5794. The short answer is that saveRDS() does not produce a file that can be read in newer versions of XGBoost. This is a technical limitation of saveRDS(). You should install an old version of XGBoost, load model1_proto, and save it again using xgb.save(). Then you'll get a model file that can be safely read in the latest version of XGBoost.
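A minimal sketch of the xgb.save() / xgb.load() round trip described above. It assumes a toy model in place of model1_proto, and the data, parameters, and file name are hypothetical. In the real migration the first half would run under the old XGBoost (after loading the object with readRDS() or load()) and the second half under the new one; here both halves run in a single session only to show the calls.

```r
library(xgboost)

# Toy data standing in for the real training set (hypothetical).
set.seed(1)
x <- matrix(rnorm(200 * 4), ncol = 4)
y <- as.numeric(x[, 1] + x[, 2] > 0)

# Stand-in for the model that would be restored under the OLD XGBoost version.
bst <- xgb.train(
  params = list(objective = "binary:logistic", max_depth = 2),
  data = xgb.DMatrix(x, label = y),
  nrounds = 5
)

# Step 1 (old version): write the model with XGBoost's own serializer
# instead of relying on save() / saveRDS().
model_path <- file.path(tempdir(), "model1_proto.model")
xgb.save(bst, model_path)

# Step 2 (new version): read the file back and predict on a plain matrix.
reloaded <- xgb.load(model_path)
p <- predict(reloaded, x)
```

Depending on the installed XGBoost version, saving to a file with an unrecognized extension may emit a warning about the output format; that is an assumption about newer releases, not something stated in this thread.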

Author

tmbluth commented Jun 19, 2020

I did not. I used base R's save() and xgboost's xgb.save(). These were saved under an older version of XGBoost; I don't remember which one, but it wouldn't be much more than a year old. I'll try installing an older version and re-saving the model.

Collaborator

hcho3 commented Jun 19, 2020

Hmm, that's odd, since you say you used xgb.save(). Can you post the snippet you used to load the XGBoost model? Did you use xgb.load()?

Author

tmbluth commented Jun 19, 2020

As an example, the file path looks like 'C:/Users/me/models/', as seen below.
The model was saved with this code while an older version of XGBoost was installed:

xgb.save(xgb_tiered_p1, 'C:/Users/me/models/xgb_tiered_phase1.model')

When I load this file now like this:

model1_proto <- xgb.load('C:/Users/me/models/xgb_tiered_phase1.model')

it yields this error:

Error in xgb.Booster.handle(modelfile = modelfile) : 
  [16:36:26] amalgamation/../src/objective/./regression_loss.h:89: Check failed: base_score > 0.0f && base_score < 1.0f: base_score must be in (0,1) for logistic loss, got: -0

Collaborator

hcho3 commented Jun 19, 2020

@tmbluth This error is different from the one in your original post, isn't it? That one was: mparam_.num_feature != 0 (0 vs. 0) : 0 feature is supplied. Are you using raw Booster interface?

As for the new error message Check failed: base_score > 0.0f && base_score < 1.0f, I believe it is fixed in the latest XGBoost (1.1.1.1). Can you check which version of XGBoost you are using now?

Author

tmbluth commented Jun 20, 2020

The first error message appeared when I tried to load the .RData file (load() after save()). The second appeared when I tried to load the .model file (new xgb.load() after old xgb.save()).

In the sessionInfo() above, under "other attached packages", it shows "xgboost_1.1.1.1".

That's the version that has been throwing these errors.

Collaborator

hcho3 commented Jun 21, 2020

@tmbluth The first error is expected, since the .RData file may become unreadable when XGBoost is upgraded. The save() method has the same problem as saveRDS(). See the discussion in #5794.

As for the second error, it might be related to a past bug #5699. Can you share the .model file so that we can try to diagnose the issue?

Author

tmbluth commented Jun 22, 2020

I can't in this case since the model is company property and proprietary. Is there anything else I can do?

Collaborator

hcho3 commented Jun 22, 2020

@tmbluth Do you have another model file that shows the same issue (Check failed: base_score > 0.0f && base_score < 1.0f) and that you can share?

Author

tmbluth commented Jun 22, 2020

I do not. It would have to be reproduced artificially by training a model with the old XGBoost (I want to say v0.9.0?) and loading it with the latest version.

Collaborator

hcho3 commented Jun 22, 2020

> Is there anything else I can do?

Please understand that it is quite difficult for us developers to diagnose the problem without an example at hand. I’m afraid there’s not much we can do at this moment.

I’ll keep this issue open for now and will update if I happen to run into the same error.

Collaborator

hcho3 commented Jul 26, 2020

#5794 has been addressed by #5940, and in the upcoming XGBoost version you will be able to read old RDS models again.

hcho3 closed this as completed Jul 26, 2020


zqsha commented Aug 9, 2020

Oh, great. I just came across the same issue, so I'm glad to hear this news. But when I checked the latest version of xgboost on CRAN (https://cran.r-project.org/web/packages/xgboost/index.html), it's still v1.1.1.1. Can I use install.packages() to install v1.2.0.1, or do I have to compile it myself? Thanks :)
