Fix lgbm lib not found #125

Closed. FatemehTahavori wants to merge 2 commits.

FatemehTahavori

No description provided.

if libpath == ""
throw(LibraryNotFoundError("$(library_name) not found. Please ensure this library is either in system dirs or the dedicated paths: $(custom_paths)"))
if libpath != ""
@info("$(library_names) found in $(custom_paths)")

This is wrong if the lib was found in DL_LOAD_PATH or ENV["PATH"]. You'd either want to have this last if/else as part of the fallback block above, or have a flag indicating whether system dirs were used.

Contributor

I think it's simpler than that: if the libpath was not empty, then it will either have a custom path prepended or will just be the libname if it was found in system dirs.

> I think it's simpler than that: if the libpath was not empty, then it will either have a custom path prepended or will just be the libname if it was found in system dirs.

I'm not sure what your suggestion is here. Are you saying this is a non-issue? That the log message should be something different? Or something else?

For clarity I am suggesting something like:

if libpath != ""
    @info("$(library_names) found in `DL_LOAD_PATH`, or system library paths $(ENV["PATH"])!")
    return libpath
end

# try specified paths
@info("$(library_names) not found in `DL_LOAD_PATH`, or system library paths, trying fallback")
libpath = Libdl.find_library(library_names, custom_paths)

if libpath != ""
    @info("$(library_names) found in $(custom_paths)")
else
    throw(LibraryNotFoundError("$(library_names) not found. Please check this library using " *
    "Libdl.dlopen(l; throw_error=true) where l = joinpath(custom_paths, lib)"))
end

return libpath

@@ -63,11 +63,14 @@ end
cp(settings["ref_lib_lightgbm_path"], settings["custom_fixture_path"]) # fake file copied from lightgbm

# Act
output = LightGBM.find_library(settings["sample_lib"], [src_dir])
push!(Libdl.DL_LOAD_PATH, src_dir)

Might be better to move these push! and deleteat! lines into the setup_env and teardown functions, as this means any additional tests won't need to remember to add these lines.
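
A minimal sketch of that suggestion, assuming `setup_env` and `teardown` can take the directory as an argument. The real helpers in the test suite may have different signatures, so the versions below are hypothetical:

```julia
import Libdl

# Hypothetical helper: register `dir` with the dynamic loader before a test.
function setup_env(dir::AbstractString)
    push!(Libdl.DL_LOAD_PATH, dir)
    return dir
end

# Hypothetical helper: undo setup_env by removing the most recently added
# entry equal to `dir`, leaving any other entries untouched.
function teardown(dir::AbstractString)
    idx = findlast(==(dir), Libdl.DL_LOAD_PATH)
    idx === nothing || deleteat!(Libdl.DL_LOAD_PATH, idx)
    return nothing
end
```

With helpers like these, each test calls `setup_env(src_dir)` at the start and `teardown(src_dir)` at the end, and `Libdl.DL_LOAD_PATH` is left as it was found.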

danielsoutar left a comment

Some small changes needed, but LGTM otherwise

@info("$(library_name) not found in system dirs, trying fallback")
libpath = Libdl.find_library(library_name, custom_paths)
if libpath != ""
@info("$(library_names) found in `DL_LOAD_PATH`, or system library paths $(ENV["PATH"])!")
Contributor

ENV["PATH"] is only the system path for link libraries on Windows. This doesn't apply on Linux or macOS. See the README.
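
To illustrate the point, here is a small sketch of picking the loader search-path variable per platform. The function name `loader_path_var` is hypothetical, and the mapping reflects the conventional variables (`PATH` on Windows, `DYLD_LIBRARY_PATH` on macOS, `LD_LIBRARY_PATH` on Linux), not anything this package defines:

```julia
# Hypothetical helper: name of the environment variable the dynamic loader
# conventionally consults on the current platform.
function loader_path_var()
    if Sys.iswindows()
        return "PATH"
    elseif Sys.isapple()
        return "DYLD_LIBRARY_PATH"
    else
        return "LD_LIBRARY_PATH"
    end
end

varname = loader_path_var()
println("Loader search path variable: ", varname, " = ", get(ENV, varname, "(unset)"))
```

A log message built this way would at least name the right variable on each platform, instead of always printing `ENV["PATH"]`.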

@@ -16,21 +16,25 @@ struct LibraryNotFoundError <: Exception
end


-function find_library(library_name::String, custom_paths::Vector{String})
+function find_library(library_names::Vector{String}, custom_paths::Vector{String})
Contributor

There's no need to make this a Vector{String} (because we're not hunting for multiple libs).

As I understand, the "fix" here is to call Libdl.find_library like this: Libdl.find_library([library_name]) instead of Libdl.find_library(library_name)
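
A quick way to compare the two call forms, as a sketch. To my understanding `Libdl.find_library` accepts either a single name or a vector of names, so both calls below should behave the same. The library name is the deliberately nonexistent one used in the test suite:

```julia
import Libdl

name = "lib_that_simply_doesnt_exist"

# Single-name form versus one-element vector form.
from_string = Libdl.find_library(name)
from_vector = Libdl.find_library([name])

println("string form: ", repr(from_string))
println("vector form: ", repr(from_vector))
println("same result: ", from_string == from_vector)
```

If the two forms agree for a library that does exist on the machine as well, then wrapping the name in a vector is unlikely to be what changes the outcome.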

yaxxie (Contributor) commented Aug 4, 2022

It would be good if you can explain how the fix works, or what was going wrong before that this changes.

yaxxie (Contributor) commented Aug 11, 2022

@FatemehTahavori this PR does not fix anything. I got an M1 Mac to test with.

Let me first start by stating what the issue is.
The linker searches for libraries to load, but there is a constraint: all libraries loaded need to be for the same architecture as the running host program.
[Screenshot 2022-08-11 at 11 24 28]
You'll notice here that it says this Julia binary is Intel architecture. That means the following:

  • the libomp must be Intel
  • the lightgbm library must be Intel

If any part of that chain doesn't match, the linker will fail to load the library.

Next, I note that brew installs packages built for ARM.
Finally, I note that the upcoming Julia 1.8 comes with an (experimental) natively compiled ARM binary.

So now I can test a hypothesis. If what I stated above is true, then when I have ARM libomp and ARM lightgbm, loading will fail with Intel Julia and succeed with ARM Julia.
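
As a side check, the architecture of the running Julia binary can also be read from inside Julia via `Sys.ARCH`. A minimal sketch, where `arch_label` is a hypothetical helper that maps the symbol to the labels used in this thread:

```julia
# Hypothetical helper: translate Julia's architecture symbol into the
# "ARM" / "Intel" wording used above; anything else is passed through.
function arch_label(arch::Symbol)
    if arch === :aarch64
        return "ARM"
    elseif arch === :x86_64
        return "Intel"
    else
        return string(arch)
    end
end

println("This Julia binary is built for: ", arch_label(Sys.ARCH))
```

Every library in the chain (libomp, lightgbm) has to match whatever this reports.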

claire.watson@GBY0JFYDQ6Q6 local % DYLD_LIBRARY_PATH=/opt/homebrew/lib/ /Applications/Julia-1.8\ 2.app/Contents/Resources/julia/bin/julia -e "import Pkg; Pkg.status();import LightGBM" 
Status `~/.julia/environments/v1.8/Project.toml`
  [7acf609c] LightGBM v0.5.2
[ Info: lib_lightgbm found in system dirs!
claire.watson@GBY0JFYDQ6Q6 local % 
claire.watson@GBY0JFYDQ6Q6 local % DYLD_LIBRARY_PATH=/opt/homebrew/lib/ /Applications/Julia-1.8.app/Contents/Resources/julia/bin/julia -e "import Pkg; Pkg.status();import LightGBM" 
Status `~/.julia/environments/v1.8/Project.toml`
  [7acf609c] LightGBM v0.5.2
[ Info: lib_lightgbm not found in system dirs, trying fallback
ERROR: InitError: LightGBM.LibraryNotFoundError("lib_lightgbm not found. Please ensure this library is either in system dirs or the dedicated paths: [\"/Users/claire.watson/.julia/packages/LightGBM/A7zVd/src\"]")
Stacktrace:
 [1] find_library(library_name::String, custom_paths::Vector{String})
   @ LightGBM ~/.julia/packages/LightGBM/A7zVd/src/LightGBM.jl:32
 [2] __init__()
   @ LightGBM ~/.julia/packages/LightGBM/A7zVd/src/LightGBM.jl:41
 [3] _include_from_serialized(pkg::Base.PkgId, path::String, depmods::Vector{Any})
   @ Base ./loading.jl:831
 [4] _require_search_from_serialized(pkg::Base.PkgId, sourcepath::String, build_id::UInt64)
   @ Base ./loading.jl:1039
 [5] _require(pkg::Base.PkgId)
   @ Base ./loading.jl:1315
 [6] _require_prelocked(uuidkey::Base.PkgId)
   @ Base ./loading.jl:1200
 [7] macro expansion
   @ ./loading.jl:1180 [inlined]
 [8] macro expansion
   @ ./lock.jl:223 [inlined]
 [9] require(into::Module, mod::Symbol)
   @ Base ./loading.jl:1144
during initialization of module LightGBM
claire.watson@GBY0JFYDQ6Q6 local % 

The Julia install with the 2 in its path (the first command) is the ARM one, and the Julia install without the 2 in its path (the second command) is the Intel one.
I also printed the LightGBM package information to show that this works correctly with a released version of LightGBM.jl (i.e. without this change).

Now, to explain how I made this work

  • brew install libomp
  • brew install lightgbm
  • set the linker env var, in this case by doing DYLD_LIBRARY_PATH=/opt/homebrew/lib (which is where brew installs the libraries)
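
After those steps, a quick check from the Julia side is to ask the loader directly whether it can open the library. This is only a sketch: `can_load` is a hypothetical helper, and it relies on `Libdl.dlopen` returning `nothing` when called with `throw_error=false` (available in Julia 1.1 and later):

```julia
import Libdl

# Hypothetical helper: true if the dynamic loader can open `name`.
# A failure here covers both "not on the search path" and
# "found, but built for a different architecture".
function can_load(name::AbstractString)
    handle = Libdl.dlopen(name; throw_error=false)
    handle === nothing && return false
    Libdl.dlclose(handle)
    return true
end

println("lib_lightgbm loadable: ", can_load("lib_lightgbm"))
```

Calling `Libdl.dlopen("lib_lightgbm")` with the default `throw_error=true` instead prints the loader's own error message, which is usually the fastest way to see why a load failed.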

In light of this, I have to recommend that this PR be closed, since it doesn't actually do anything or fix the issue (this PR can't possibly fix cross-architecture linking, nor is it the place of this specific project to fix that problem).

yaxxie (Contributor) commented Aug 11, 2022

@danielsoutar please see the above. I recommend the closure of this PR

Also @FatemehTahavori, in light of the above, it seems the advice given in #122 is incorrect; the correct advice is to install the Julia 1.8 ARM build on an M1 Mac and to either compile your own lightgbm binary or use the brew binary, setting DYLD_LIBRARY_PATH to the correct locations.

FatemehTahavori (Author) commented Aug 12, 2022

@yaxxie we could reproduce the issue in docker:

Julia development

docker run -d -it --platform=linux/x86_64 --name lgbm -v `pwd`:/home/julia/ julia:1.6.3

Connect to Container using VSCode

Inside Container

cd ~
wget -O /tmp/lgbm.tar https://github.com/microsoft/LightGBM/archive/v3.2.0.tar.gz
tar -xf /tmp/lgbm.tar -C /tmp/
export LIGHTGBM_EXAMPLES_PATH=/tmp/LightGBM-3.2.0

To add the package to Julia:

import Pkg
Pkg.add("LightGBM")

If you run the tests, they fail:

Pkg.test("LightGBM")

Add this branch to Julia:

(@v1.6) pkg> add LightGBM#Fix_lgbm_libNotFound

With this branch, the tests pass.

yaxxie (Contributor) commented Aug 12, 2022

[ Info: ["libcrypt"] not found in `DL_LOAD_PATH`, or system library paths, trying fallback
find_library finds system lib: Error During Test at /root/.julia/packages/LightGBM/fXI7r/test/basic/test_lightgbm.jl:77
  Got exception outside of a @test
  LightGBM.LibraryNotFoundError("[\"libcrypt\"] not found. Please check this library using Libdl.dlopen(l; throw_error=true) where l = joinpath(custom_paths, lib)")
  Stacktrace:
    [1] find_library(library_names::Vector{String}, custom_paths::Vector{String})
      @ LightGBM ~/.julia/packages/LightGBM/fXI7r/src/LightGBM.jl:36
    [2] macro expansion
      @ ~/.julia/packages/LightGBM/fXI7r/test/basic/test_lightgbm.jl:84 [inlined]
    [3] macro expansion
      @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1151 [inlined]
    [4] macro expansion
      @ ~/.julia/packages/LightGBM/fXI7r/test/basic/test_lightgbm.jl:80 [inlined]
    [5] macro expansion
      @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1151 [inlined]
    [6] top-level scope
      @ ~/.julia/packages/LightGBM/fXI7r/test/basic/test_lightgbm.jl:44
    [7] include(fname::String)
      @ Base.MainInclude ./client.jl:444
    [8] macro expansion
      @ ~/.julia/packages/LightGBM/fXI7r/test/runtests.jl:84 [inlined]
    [9] macro expansion
      @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1151 [inlined]
   [10] macro expansion
      @ ~/.julia/packages/LightGBM/fXI7r/test/runtests.jl:84 [inlined]
   [11] macro expansion
      @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1151 [inlined]
   [12] top-level scope
      @ ~/.julia/packages/LightGBM/fXI7r/test/runtests.jl:59
   [13] include(fname::String)
      @ Base.MainInclude ./client.jl:444
   [14] top-level scope
      @ none:6
   [15] eval
      @ ./boot.jl:360 [inlined]
   [16] exec_options(opts::Base.JLOptions)
      @ Base ./client.jl:261
   [17] _start()
      @ Base ./client.jl:485
[ Info: ["lib_that_simply_doesnt_exist"] not found in `DL_LOAD_PATH`, or system library paths, trying fallback
Test Summary:                                   | Pass  Error  Broken  Total
Basic tests                                     |  111      1       2    114
  Estimator parameters                          |   20              2     22
  Estimator parameters                          |   15                    15
  Utils                                         |    5                     5
  Fit                                           |   52                    52
  CV                                            |    6                     6
  Search CV                                     |   10                    10
  LightGBM                                      |    3      1              4
    find_library                                |    3      1              4
      find_library works with no system lib     |    1                     1
      find_library finds system lib first       |    1                     1
      find_library finds system lib             |           1              1
      find_library returns empty and logs error |    1                     1
ERROR: LoadError: Some tests did not pass: 111 passed, 0 failed, 1 errored, 2 broken.
in expression starting at /root/.julia/packages/LightGBM/fXI7r/test/runtests.jl:57
ERROR: Package LightGBM errored during testing

(@v1.6) pkg> st
      Status `~/.julia/environments/v1.6/Project.toml`
  [7acf609c] LightGBM v0.5.2 `https://github.com/IQVIA-ML/LightGBM.jl.git#Fix_lgbm_libNotFound`

(@v1.6) pkg> 

It still fails that particular test when I check it. Furthermore, I suspect there is something not quite right about that docker image because:

julia> import Libdl

julia> Libdl.find_library("libcrypt")
""

julia> Libdl.find_library("libpcprofile")
"libpcprofile"

julia> 

and when I check the system linker paths (in that docker image) I find this:

root@5e2eaf1c5f7f:~# cat /etc/ld.so.conf.d/* | grep -v '#' | xargs find | grep libcrypt
find: '/usr/local/lib/x86_64-linux-gnu': No such file or directory
/lib/x86_64-linux-gnu/libcrypt-2.28.so
/lib/x86_64-linux-gnu/libcrypt.so.1
/usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
root@5e2eaf1c5f7f:~# cat /etc/ld.so.conf.d/* | grep -v '#' | xargs find | grep libpcprofile
find: '/usr/local/lib/x86_64-linux-gnu': No such file or directory
/lib/x86_64-linux-gnu/libpcprofile.so

You can see that for libpcprofile, which was found, there is a file ending in plain .so. For libcrypt, which was not found, there is no file ending in plain .so, only versioned names such as .so.1. This is normal, as libs can be versioned. But systems usually ship with short-form symlinks to the longer names:

[yaxattax@fedora ~]$ find / 2>/dev/null| grep libcrypt.so$ 
/home/yaxattax/.local/share/containers/storage/overlay/4723f6643c4df3f617b017f7063d23aa50c7562bed4dc3074578ce896a385972/diff/usr/lib/x86_64-linux-gnu/libcrypt.so
/home/yaxattax/.local/share/containers/storage/overlay/1d5b529db9abb046582d40bfac011d80fc021907f3bfc31dda6870a52ee5e3da/diff/usr/lib/x86_64-linux-gnu/libcrypt.so
/usr/lib64/libcrypt.so
[yaxattax@fedora ~]$ ls -l /usr/lib64/libcrypt.so
lrwxrwxrwx. 1 root root 17 Feb  1  2022 /usr/lib64/libcrypt.so -> libcrypt.so.2.0.0
[yaxattax@fedora ~]$ 

and then if you launch julia on this machine and try to load libcrypt:

julia> import Libdl

julia> Libdl.find_library("libcrypt")
"libcrypt"

julia> 

You can see it works.
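
The same comparison can be scripted. The sketch below assumes, based only on the outputs above and not on a reading of Julia's source, that the lookup succeeds when a file named exactly `name` plus the platform extension (`Libdl.dlext`) exists. Both function names are hypothetical:

```julia
import Libdl

# Hypothetical helper: true if `dir` holds `name` with only the platform
# extension (for example libfoo.so on Linux) and no version suffix.
function has_unversioned(dir::AbstractString, name::AbstractString)
    return isfile(joinpath(dir, string(name, ".", Libdl.dlext)))
end

# Hypothetical helper: every file in `dir` whose name starts with `name`,
# versioned or not, so the two cases can be compared side by side.
function candidates(dir::AbstractString, name::AbstractString)
    return sort(filter(f -> startswith(f, name), readdir(dir)))
end
```

Going by the listings above, `has_unversioned("/lib/x86_64-linux-gnu", "libcrypt")` would be expected to be false inside the docker image, while the same call for `libpcprofile` would be expected to be true.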

As a bonus: inside the docker image, I draw your attention to this:

julia> Libdl.find_library("libcrypt-2.28")
"libcrypt-2.28"

julia> 

Recall what we found when looking for libcrypt:

root@5e2eaf1c5f7f:~# cat /etc/ld.so.conf.d/* | grep -v '#' | xargs find | grep libcrypt
find: '/usr/local/lib/x86_64-linux-gnu': No such file or directory
/lib/x86_64-linux-gnu/libcrypt-2.28.so
/lib/x86_64-linux-gnu/libcrypt.so.1
/usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
root@5e2eaf1c5f7f:~#

Notice this one: /lib/x86_64-linux-gnu/libcrypt-2.28.so has a plain .so at the end, and it was found when we asked for libcrypt-2.28.

So once again, I put it to you that this PR fixes nothing.

yaxxie (Contributor) commented Aug 17, 2022

Since

  • we had a user confirm this PR doesn't fix the issue for them
  • I explained why the docker image for the "reproducing case" is faulty
  • alternative installation instructions validate the mixed architecture theory, including validation by a user
  • the above are all backed by a coherent explanation of the issues

I think that this PR needs to be closed. Please close it @FatemehTahavori

yaxxie (Contributor) commented Jan 10, 2023

@FatemehTahavori would you mind closing this PR? Unless I am mistaken, I don't believe we require it.

@FatemehTahavori FatemehTahavori deleted the Fix_lgbm_libNotFound branch January 10, 2023 21:55