
xgboost x - segmentation fault #4

Open

mglowacki100 opened this issue Jul 19, 2016 · 7 comments
@mglowacki100
Segmentation fault (core dumped) occurs randomly when I run the xgboost x model through R (02-models.build.R)...
I need to double-check, but as far as I remember there was no such problem with model.xgbx.x.stack.py (run in Spyder) with hard-coded options.

@diefimov
Owner

It happened to me before, when the data did not fit in RAM. Try decreasing the dataset and check whether you get the same error.
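One quick way to test whether the crash is memory-related is to load only a prefix of the file before training. A minimal sketch using pandas' nrows argument (synthetic in-memory data stands in for the real train.csv; the column names are hypothetical):

```python
import io
import pandas as pd

# Simulate a large CSV in memory (stands in for the real train.csv).
big_csv = "id,feat,label\n" + "\n".join(f"{i},{i * 2},{i % 3}" for i in range(100))

# Load only the first 10 rows; the same trick with nrows=10000 on the
# real file lets you check whether the segfault goes away on a small set.
small = pd.read_csv(io.StringIO(big_csv), nrows=10)
print(len(small))  # 10
```

If the error disappears at small sizes and reappears as nrows grows, that points at memory pressure rather than a bad row.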

@mglowacki100
Author

I have 64 GB of RAM, and I've additionally set up a 256 GB swapfile, but the problem still occurs.
Now I suspect this line: sys.path.append('/Users/ef/xgboost/wrapper'). I didn't update it.

@mglowacki100
Author

That line doesn't matter.
I've tried running directly in Spyder with hard-coded params; now the script stalls instead of segfaulting. All cores are still in use, but total usage is only about 20%...
This is a similar issue:
dmlc/xgboost#209

@diefimov
Owner

Did you try reducing the dataset (taking only the first 10,000 lines, for example)? My guess is that if it is not a memory problem, then xgboost stalls because of some problem with the data.

@mglowacki100
Author

I've shrunk both test.csv (first 10k rows) and train.csv (random 10k rows, so that all labels are represented), and there is no segmentation fault or stalling.
By the way, how much RAM do you use for the full dataset?
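Sampling "random rows so that all labels are represented" can be done as a stratified sample per class. A minimal pandas sketch (toy frame stands in for train.csv; the 'label' column name is an assumption):

```python
import pandas as pd

# Toy frame standing in for train.csv: 90 rows, three classes of 30 each.
df = pd.DataFrame({"x": range(90), "label": [0, 1, 2] * 30})

# Sample a fixed fraction from each class so every label survives the cut.
small = df.groupby("label", group_keys=False).sample(frac=0.1, random_state=0)
print(sorted(small["label"].unique().tolist()))  # [0, 1, 2]
print(len(small))  # 9
```

A plain random sample could, by chance, drop a rare class entirely, which would change the shape of the prediction matrix downstream; sampling per group avoids that.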

@diefimov
Owner

I work on a Mac with 32 GB of RAM. Try increasing the dataset gradually. I suspect the problem is with some line in the dataset; probably some entries are not appropriate for xgboost (it could be a problem with NA values, for example).

@mglowacki100
Author

I've tried a few combinations, and this time, besides the segmentation fault in the logs, I got a more meaningful message in RStudio:

train - all, test - 10k first
combine.preds("train_raw/model.xgbx", 10)
Reading epoch 0 ...
Reading epoch 1 ...
Reading epoch 2 ...
Reading epoch 3 ...
Reading epoch 4 ...
Reading epoch 5 ...
Reading epoch 6 ...
Reading epoch 7 ...
Reading epoch 8 ...
Reading epoch 9 ...
Error in rowSums(actual * predicted) :
error in evaluating the argument 'x' in selecting a method for function 'rowSums': Error in actual * predicted : non-conformable arrays

train - all random, test - all random
...
Error in rowSums(actual * predicted) :
error in evaluating the argument 'x' in selecting a method for function 'rowSums': Error in actual * predicted : non-conformable arrays

train - 60k random, test - 60k random
...
Error in rowSums(actual * predicted) :
error in evaluating the argument 'x' in selecting a method for function 'rowSums': Error in actual * predicted : non-conformable arrays

train - 30k random, test - 10k first
Error in cbind(data.pred, as.matrix(data.pred.epoch)) :
number of rows of matrices must match (see arg 2)
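The "non-conformable arrays" error suggests that actual and predicted ended up with different row counts, e.g. labels from the full training set multiplied elementwise against predictions for only the truncated test set. A minimal NumPy sketch of the same kind of mismatch (shapes and variable names are hypothetical, chosen to mirror R's actual * predicted):

```python
import numpy as np

# 'actual' carries the full set's rows, 'predicted' only a truncated set:
# elementwise multiplication then fails, analogous to R's
# "non-conformable arrays" inside rowSums(actual * predicted).
actual = np.zeros((20, 3))     # e.g. one-hot labels, 20 rows
predicted = np.zeros((10, 3))  # predictions for a truncated set, 10 rows

try:
    actual * predicted
except ValueError:
    print("shape mismatch")
```

This is consistent with the last case above, where cbind() complains directly that the number of rows of the matrices must match.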
