ScikitLearn.jl - JLD.jl segmentation fault error #104

Open
CBongiova opened this issue Oct 12, 2021 · 0 comments

Hi,

I am using Julia v1.6.3 and have a problem using JLD to save a `RandomForestClassifier()` model trained with ScikitLearn. When the training set is large (10,000 samples in the example below, versus 100 samples, which works), saving the model fails with a segmentation fault.

Here is a minimal example that reproduces the error:
```julia
using ScikitLearn
using ScikitLearn.Pipelines
using PyCall, JLD, PyCallJLD
using Random
@sk_import ensemble: (RandomForestClassifier)

# Working example: 100 samples, 45 features, 100 binary labels
x_vals = rand(100, 45)
y_vals = vec(rand([0, 1], 100, 1))

clf_model = RandomForestClassifier(n_estimators=500, bootstrap=true, oob_score=true, n_jobs=-1, class_weight="balanced_subsample")
fit!(clf_model, x_vals, y_vals)
oob_score_value = clf_model.oob_score_
println("Oob score: $oob_score_value")

# Saving the small model succeeds
JLD.save("clf_model_100.jld", "clf_model", clf_model)

# NOT working example: 10,000 samples, 45 features, 10,000 binary labels
x_vals = rand(10000, 45)
y_vals = vec(rand([0, 1], 10000, 1))

clf_model = RandomForestClassifier(n_estimators=500, bootstrap=true, oob_score=true, n_jobs=-1, class_weight="balanced_subsample")
fit!(clf_model, x_vals, y_vals)
oob_score_value = clf_model.oob_score_
println("Oob score: $oob_score_value")

# Saving the larger model triggers the segmentation fault
JLD.save("clf_model_10000.jld", "clf_model", clf_model)
```
Here is the error:

signal (11): Segmentation fault: 11
in expression starting at /Users/admin/Desktop/Online2/Train_ML_new.jl:998
jl_exit_thread0_cb at /Applications/Julia-1.6.app/Contents/Resources/julia/lib/julia/libjulia-internal.1.dylib (unknown line)
Allocations: 86747943 (Pool: 86717037; Big: 30906); GC: 83

Could anyone help me understand what is going on?

UPDATE: Downgrading Julia to v1.0.5 resolved the segmentation fault for the example above, although I now get the following warning:

┌ Warning: JLD incorrectly extends FileIO functions (see FileIO documentation)
└ @ FileIO ~/.julia/packages/FileIO/DNKwN/src/loadsave.jl:217
