[R-package] R handles produce segmentation faults when de-serialized #4208

Closed
david-cortes opened this issue Apr 21, 2021 · 1 comment · Fixed by #4586

@david-cortes
Contributor

The R interface uses a "handle" object which, I guess, stores a pointer to a C++ object. The objects these handles point to do not survive serialization in R, so a handle that has been serialized and de-serialized (which happens, for example, when restarting an R session) causes a segmentation fault as soon as it is used.

Example:

library(lightgbm)
data(agaricus.train, package = "lightgbm")
train <- agaricus.train
dtrain <- lgb.Dataset(train$data, label = train$label)
data(agaricus.test, package = "lightgbm")
test <- agaricus.test
dtest <- lgb.Dataset.create.valid(dtrain, test$data, label = test$label)
params <- list(objective = "regression", metric = "l2")
valids <- list(test = dtest)
model <- lgb.train(
    params = params
    , data = dtrain
    , nrounds = 5L
    , valids = valids
    , min_data = 1L
    , learning_rate = 1.0
    , early_stopping_rounds = 3L
)

After running it, restart the R session (assuming it is configured to save and restore the workspace between restarts), then run these two statements again:

library(lightgbm)
model <- lgb.train(
    params = params
    , data = dtrain
    , nrounds = 5L
    , valids = valids
    , min_data = 1L
    , learning_rate = 1.0
    , early_stopping_rounds = 3L
)

At that point it will produce a segmentation fault, crashing the R process.

I suppose the correct solution would be to use R's own external pointer class and leave freeing the C++ object to a finalizer registered on that same external pointer. External pointers are also reset to a null pointer upon de-serialization, so each function can check for that up front and raise an error instead of producing a segmentation fault.
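
The null-check idea above can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions, not LightGBM's code: `NativeDataset`, `Handle`, `RestoreFromDisk`, `CheckedDataset`, `NumRows` and `DemoRestoredHandle` are all hypothetical names, and `RestoreFromDisk` merely imitates R handing back a null external pointer after de-serialization. In a real R package the address would presumably come from R's C API (for example `R_ExternalPtrAddr`) rather than from a plain struct.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a native object owned by the library.
struct NativeDataset {
    int num_rows;
};

// Hypothetical stand-in for the R-side handle: just a raw address.
struct Handle {
    void* address;
};

// Imitates a save/restore cycle. The native object does not exist in the
// new session, so the restored handle carries a null address instead of
// the stale one (assumption: this mirrors how R restores external pointers).
Handle RestoreFromDisk(const Handle& /*saved*/) {
    return Handle{nullptr};
}

// Guard meant to run at the top of every entry point that takes a handle:
// it throws a catchable error instead of dereferencing a null address.
NativeDataset& CheckedDataset(const Handle& handle) {
    if (handle.address == nullptr) {
        throw std::runtime_error(
            "Dataset handle is null; re-create the Dataset in this session");
    }
    return *static_cast<NativeDataset*>(handle.address);
}

// Example entry point that goes through the guard.
int NumRows(const Handle& handle) {
    return CheckedDataset(handle).num_rows;
}

// Returns the error message raised for a restored handle,
// or an empty string if no error was raised.
std::string DemoRestoredHandle() {
    NativeDataset dataset{100};
    Handle live{&dataset};
    assert(NumRows(live) == 100);  // a live handle works as usual

    Handle restored = RestoreFromDisk(live);
    try {
        NumRows(restored);
    } catch (const std::runtime_error& e) {
        return e.what();
    }
    return "";
}
```

Calling `DemoRestoredHandle()` returns the error text rather than crashing, which is the behaviour the proposal is after: the restored handle is detected before any dereference, and the caller gets an error it can report back to the R user.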

@jameslamb jameslamb changed the title R handles produce segmentation faults when de-serialized [R-package] R handles produce segmentation faults when de-serialized May 5, 2021
@jameslamb jameslamb mentioned this issue May 20, 2021
shiyu1994 pushed a commit that referenced this issue Sep 25, 2021
…es (fixes #4208) (#4586)

* [R-package] fix segfaults caused by missing Booster and Dataset handles (fixes #4208)

* fix test errors

* fixes for cpplint

* Update R-package/tests/testthat/test_dataset.R

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* fix tests

* Apply suggestions from code review

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>

* move asserts inside try-catch

Co-authored-by: Nikita Titov <nekit94-08@mail.ru>
@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023