ScikitLearn declared inside a module causes segmentation error #50

Closed
ppalmes opened this issue Jan 23, 2019 · 6 comments

ppalmes commented Jan 23, 2019

Tested in Julia 1.0.3 and Julia 1.1 and Julia 0.7

To recreate the problem:
create package A
pkg] generate A
bash> cd A
pkg] activate .
pkg] add ScikitLearn
julia> edit("src/A.jl")
-----
module A
using ScikitLearn
@sk_import linear_model: LogisticRegression

function testme()
    model = LogisticRegression()
end

end
-----
julia> using A
julia> A.testme() -> causes a segmentation fault

However, if you use:
julia> include("src/A.jl")
julia> A.testme() -> works

Author

ppalmes commented Jan 23, 2019

The goal is to write a wrapper that exposes a common API over caret (through RCall) and scikit-learn (through PyCall).

Owner

cstjean commented Jan 23, 2019

Thank you for the report. I've hit several segmentation faults myself in Julia 1.1, but this is the first occurrence with ScikitLearn. It might be worth reducing it as much as possible and reporting it to JuliaLang. That said, it's not surprising that @sk_import doesn't work inside a module, although it really should be documented and warned against. The proper way is documented in PyCall. Run @macroexpand @sk_import ... to see what the macro expands to, and it should be clear.

Author

ppalmes commented Jan 23, 2019

Thanks for the reply. I resolved it by adding:
__precompile__(false)
in the main module. It seems that precompiling is the issue and this also happens with PyCall.
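
As a rough sketch, this is the workaround applied to the module from the report. Placing the call at the top of src/A.jl, before the module declaration, is an assumption about where it was added. The trade-off is that A is no longer precompiled.

```julia
__precompile__(false)  # opt this module out of precompilation

module A

using ScikitLearn
@sk_import linear_model: LogisticRegression

function testme()
    # The Python class is looked up in the live session, not read from a cache file.
    model = LogisticRegression()
    return model
end

end # module
```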

Author

ppalmes commented Jan 23, 2019

I think the bug appeared when precompilation became the default, which was changed in this PR: JuliaLang/julia#26282

Owner

cstjean commented Jan 23, 2019

Yeah. When a module is precompiled, the values of all its global variables are stored. @sk_import and @pyimport create a global that holds a pointer to the loaded Python module, and storing that pointer in the precompiled cache file is a very bad idea. When your module is reloaded, it reads back the old pointer address, which now points to arbitrary memory, hence the segfault. That's why you need the __init__ trick if you want precompilation.
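
A minimal sketch of the __init__ trick applied to the module from the original report, assuming the PyNULL()/copy! idiom described in the PyCall README and a Python environment where scikit-learn is installed. The constant name linear_model is illustrative, not something @sk_import generates.

```julia
module A

using PyCall

# Placeholder that is safe to precompile: it wraps a null pointer,
# not a live Python object.
const linear_model = PyNULL()

function __init__()
    # Runs each time the module is loaded, so the Python module is imported
    # in the current process instead of being restored from the cache file.
    copy!(linear_model, pyimport("sklearn.linear_model"))
end

function testme()
    model = linear_model.LogisticRegression()
    return model
end

end # module
```

With this layout, `using A` followed by `A.testme()` should go through the freshly imported Python module rather than a stale pointer.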

@Red-Portal

For anyone who's still wondering, here's a snippet that worked for me.
Note that the Python module is loaded through PyCall, but the rest uses the ScikitLearn.jl API.

using ScikitLearn
using PyCall
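# Null placeholder that is safe to precompile; filled in at load time.
const mixture = PyNULL()
function __init__()
    # Import the Python module each time this Julia module is loaded.
    copy!(mixture, pyimport("sklearn.mixture"))
end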

### Usage
    gmm_config = mixture.BayesianGaussianMixture(n_components=10,
                                                 max_iter=1000,
                                                 weight_concentration_prior=1.0)
    model = fit!(gmm_config, samples[:,:])
    w = model.weights_[:,1,1]
    μ = model.means_[:,1,1]
    σ = sqrt.(model.covariances_[:,1,1])
###
