ranger.unify fails on large models #13

Open
yovizzle opened this issue Apr 9, 2021 · 7 comments

yovizzle commented Apr 9, 2021

Hi,

We are training a large random forest model (the rf object is ~270 MB) on a large dataset (1,670,000 rows x 267 columns, 3.3 GB as an R object) and are hitting errors. The test machine has 96 CPUs and 354 GB of RAM.

Here is a repro.

library(treeshap)
library(ranger)
library(tidyverse)

# Generate a random training tibble of similar size to our data
m <- matrix(runif(n = 800000 * 200), nrow = 800000, ncol = 200)
object.size(m) / 1024^3  # 1.2 GB
trainM <- as_tibble(m)  # unnamed matrix columns become V1..V200
# Fit a small forest, using the last column as the response
srf <- ranger(V200 ~ ., data = trainM, num.trees = 5, verbose = TRUE)
object.size(srf) / 1024^2  # 89.4 MB
# This is the call that crashes
rfu <- treeshap::ranger.unify(srf, trainM)

We then got this error:

# *** caught segfault ***
#   address 0x55e43e173ed0, cause 'memory not mapped'
# 
# Traceback:
# 1: new_covers(x, is_na, roots, yes, no, missing, is_leaf, feature,     split, decision_type)
# 2: set_reference_dataset(ret, as.data.frame(data))
# 3: treeshap::ranger.unify(srf, trainM)
# An irrecoverable exception occurred. R is aborting now ...
# Segmentation fault (core dumped)

# R version 4.0.2 (2020-06-22)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Ubuntu 20.04 LTS

Any ideas as to what may be causing this issue? Is it a limitation of the current implementation of the package, or perhaps an issue related to our R environment?

Thanks.

@yovizzle
Author

@maksymiuks Any ideas how I might address this? Thanks again.

@maksymiuks
Member

@yovizzle I'm working on a solution.

@yovizzle
Author

yovizzle commented Apr 21, 2021 via email

@yovizzle
Author

yovizzle commented Jun 8, 2021

@maksymiuks Any updates on this? We'd love to make use of this package!

@maksymiuks
Member

@yovizzle hi!

I've identified the problem with ranger.unify; however, I'll only have time to rebuild it in the second half of June or early July. I'll keep you posted.

@yovizzle
Author

yovizzle commented Jun 15, 2021 via email

@yovizzle
Author

Hi @maksymiuks, just checking back to see how this is looking.
