[R] Remove stringi dependency #5905
Comments
Sure. But are you compiling everything from source? If so, did you strip out the debug symbols in stringi and xgboost?
I noticed that for a CPU-only build with gcc-9, the XGBoost shared object is only 5.5MB without debug symbols: cmake -DCMAKE_BUILD_TYPE=Release -DUSE_OPENMP=ON
I am installing from the CRAN archive. This is how it looks:
NDEBUG is there; should it be something else? I guess it's so big because of the inclusion of rabit and dmlc, no? In any case, I am completely fine with a 30MB object for xgboost. The real issue is stringi, which is not a hard dependency; it is only used for plots and the tree parser.
It's the -g flag, which generates debug symbols.
Thanks for this. I halved the size of the deployment by compiling with
Awesome! Now do you think it's still worth removing stringi?
Although for XGBoost I highly recommend using CMake to build instead of changing flags yourself.
I would still say it's worth removing stringi. It's used for non-critical stuff and it's low-hanging fruit anyhow. 33MB is not a big deal for sure, but given that the entire R runtime with 23 packages on board is only 49MB, it surely feels disproportionate.
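For anyone wanting to reproduce size comparisons like the ones above, here is a minimal sketch in base R that sums the on-disk size of an installed package. The helper name pkg_size_mb is hypothetical (not part of xgboost or stringi), and the numbers it reports will depend on platform, compiler flags, and whether debug symbols were stripped.

```r
# Hypothetical helper: total size in MB of an installed package's files.
# Returns NA if the package is not installed in the current library paths.
pkg_size_mb <- function(pkg) {
  path <- system.file(package = pkg)
  if (!nzchar(path)) return(NA_real_)
  files <- list.files(path, recursive = TRUE, full.names = TRUE,
                      all.files = TRUE)
  sum(file.info(files)$size, na.rm = TRUE) / 1024^2
}

# Compare the two packages discussed in this thread (NA if not installed).
sizes <- vapply(c("xgboost", "stringi"), pkg_size_mb, numeric(1))
print(round(sizes, 1))
```

Running this before and after a rebuild without -g should show how much of the footprint comes from debug symbols rather than from the packages themselves.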
It's not an option for us. We are using
@vspinu Sure, a PR is welcome. ;-)
@vspinu Not rushing into anything, but is there any update?
Not yet. Busy month at work. Added to my reminders. Will provide a PR within a week or two. Thanks for pinging.
No problem. Just following up. Have fun with it.
I am building a predictor for AWS Lambda, where the hard limit for all unpacked dependencies is 250MB. xgboost + deps is already 92MB, mostly due to the 56MB of stringi.
I have looked at the code base briefly, and all of the uses of stringi seem to be easily replaceable with base R functionality.
Would you be OK with that? I can look into a PR.
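To illustrate the kind of replacement proposed here, the sketch below shows base R counterparts for a few common stringi-style regex operations. This is an assumption-laden example, not the actual xgboost code: the thread does not say which stringi functions the package calls, and the sample dump lines are made up. Base R uses a different regex engine than stringi (TRE/PCRE rather than ICU), so patterns may need adjusting in edge cases.

```r
# Made-up lines resembling a text model dump (illustrative only).
lines <- c("booster[0]",
           "0:[f2<2.5] yes=1,no=2,missing=1,gain=10.5,cover=100",
           "1:leaf=0.25,cover=40")

# Detect a pattern: roughly what stringi::stri_detect_regex() does.
is_booster <- grepl("^booster", lines)

# First match with a capture group: roughly stringi::stri_match_first_regex().
m <- regmatches(lines, regexec("^([0-9]+):", lines))
node_id <- vapply(m, function(g) if (length(g) >= 2) g[2] else NA_character_,
                  character(1))

# Replace every match: roughly stringi::stri_replace_all_regex().
no_brackets <- gsub("\\[|\\]", "", lines)

# Split on a pattern: roughly stringi::stri_split_regex().
fields <- strsplit(lines[2], "[,=]")[[1]]
```

The main difference to watch for is that regmatches() returns an empty vector for non-matching inputs, whereas the stringi match functions return NA rows, hence the explicit NA handling above.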