Use python binary instead of libpython when it's a PIE #612

Open
tkf opened this issue Nov 7, 2018 · 25 comments · May be fixed by #614

Comments

@tkf
Member

tkf commented Nov 7, 2018

Presumably, this would let us share precompilation cache between PyCall and PyJulia in more situations [1]: conda-forge/python-feedstock#222 (comment)

[1] Not in all situations. For example, a non-PIE, statically linked python still won't work.

@stevengj
Member

stevengj commented Nov 8, 2018

You can't dlopen an executable, can you? Oh, I see that this is indeed possible for PIE executables on some platforms. Not sure if this is safe with python, but I guess we could give it a try.

Does ccall(("PyFoo", "/path/to/python"), ...) work directly, or do we need to explicitly call dlopen and work with the library handle?

@isuruf
Contributor

isuruf commented Nov 8, 2018

Does ccall(("PyFoo", "/path/to/python"), ...) work directly, or do we need to explicitly call dlopen and work with the library handle?

Yes, it works. dlopen can't be used directly because julia appends a .so to the path.

@tkf
Member Author

tkf commented Nov 8, 2018

Hmm... So Py_GetVersion works but Py_InitializeEx fails here https://github.com/python/cpython/blob/v3.7.1/Python/sysmodule.c#L2292

Any guess why?

(@isuruf BTW, it looks like I don't need .so if I pass absolute path to dlopen. Checked with Julia 1.0.1 and master.)

I created a conda environment with create --prefix py defaults::python, which installs python 3.7.1-h0371630_3, and then ran:

julia> using Libdl

julia> const libpath = abspath("py/bin/python")
"/home/takafumi/.julia/dev/_wt/PyCall/pie/py/bin/python"

julia> const wPYTHONHOME = Base.cconvert(Cwstring, string(abspath("py"), ':', abspath("py")));

julia> const wpyprogramname = Base.cconvert(Cwstring, libpath);

julia> h = Libdl.dlopen(libpath, Libdl.RTLD_LAZY|Libdl.RTLD_DEEPBIND|Libdl.RTLD_GLOBAL)
Ptr{Nothing} @0x0000000002b6ab10

julia> unsafe_string(ccall((:Py_GetVersion, libpath), Ptr{UInt8}, ()))
"3.7.1 (default, Oct 23 2018, 19:19:42) \n[GCC 7.3.0]"

julia> ccall((:Py_SetProgramName, libpath), Cvoid, (Ptr{Cwchar_t},), wpyprogramname)

julia> ccall((:Py_SetPythonHome, libpath), Cvoid, (Ptr{Cwchar_t},), wPYTHONHOME)

julia> ccall((:Py_InitializeEx, libpath), Cvoid, (Cint,), 0)

signal (11): Segmentation fault
in expression starting at no file:0
fileno_unlocked at /usr/lib/libc.so.6 (unknown line)
_PySys_BeginInit at /tmp/build/80754af9/python_1540319607830/work/Python/sysmodule.c:2292
_Py_InitializeCore_impl at /tmp/build/80754af9/python_1540319607830/work/Python/pylifecycle.c:753
_Py_InitializeCore at /tmp/build/80754af9/python_1540319607830/work/Python/pylifecycle.c:859
_Py_InitializeFromConfig at /tmp/build/80754af9/python_1540319607830/work/Python/pylifecycle.c:1002
Py_InitializeEx at /tmp/build/80754af9/python_1540319607830/work/Python/pylifecycle.c:1034
top-level scope at ./none:0
jl_fptr_trampoline at /buildworker/worker/package_linux64/build/src/gf.c:1831
jl_toplevel_eval_flex at /buildworker/worker/package_linux64/build/src/toplevel.c:807
jl_toplevel_eval_in at /buildworker/worker/package_linux64/build/src/builtins.c:622
eval at ./boot.jl:319
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2184
eval_user_input at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/REPL/src/REPL.jl:85
macro expansion at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/REPL/src/REPL.jl:117 [inlined]
#28 at ./task.jl:259
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2184
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1537 [inlined]
start_task at /buildworker/worker/package_linux64/build/src/task.c:268
unknown function (ip: 0xffffffffffffffff)
Allocations: 882346 (Pool: 882247; Big: 99); GC: 0

@isuruf
Contributor

isuruf commented Nov 8, 2018

Can you check what Py_IsInitialized gives you?

@tkf
Member Author

tkf commented Nov 8, 2018

OTOH, Py_InitializeEx works with /usr/bin/python on Arch Linux.

@tkf
Member Author

tkf commented Nov 8, 2018

@isuruf Py_IsInitialized gives me 0:

julia> using Libdl

julia> pyhome = abspath("py")
"/home/takafumi/.julia/dev/_wt/PyCall/pie/py"

julia> const libpath = "$pyhome/bin/python"
"/home/takafumi/.julia/dev/_wt/PyCall/pie/py/bin/python"

julia> const wPYTHONHOME = Base.cconvert(Cwstring, string(pyhome, ':', pyhome));

julia> const wpyprogramname = Base.cconvert(Cwstring, libpath);

julia> h = Libdl.dlopen(libpath, Libdl.RTLD_LAZY|Libdl.RTLD_DEEPBIND|Libdl.RTLD_GLOBAL)
Ptr{Nothing} @0x00000000026dabc0

julia> unsafe_string(ccall((:Py_GetVersion, libpath), Ptr{UInt8}, ()))
"3.7.1 (default, Oct 23 2018, 19:19:42) \n[GCC 7.3.0]"

julia> ccall((:Py_IsInitialized, libpath), Cint, ())
0

julia> ccall((:Py_SetProgramName, libpath), Cvoid, (Ptr{Cwchar_t},), wpyprogramname)

julia> ccall((:Py_SetPythonHome, libpath), Cvoid, (Ptr{Cwchar_t},), wPYTHONHOME)

julia> ccall((:Py_InitializeEx, libpath), Cvoid, (Cint,), 0)

signal (11): Segmentation fault
in expression starting at no file:0
fileno_unlocked at /usr/lib/libc.so.6 (unknown line)
_PySys_BeginInit at /tmp/build/80754af9/python_1540319607830/work/Python/sysmodule.c:2292

@tkf
Member Author

tkf commented Nov 8, 2018

With /usr/bin/python:

julia> using Libdl

julia> pyhome = "/usr"
"/usr"

julia> const libpath = "$pyhome/bin/python"
"/usr/bin/python"

julia> const wPYTHONHOME = Base.cconvert(Cwstring, string(pyhome, ':', pyhome));

julia> const wpyprogramname = Base.cconvert(Cwstring, libpath);

julia> h = Libdl.dlopen(libpath, Libdl.RTLD_LAZY|Libdl.RTLD_DEEPBIND|Libdl.RTLD_GLOBAL)
Ptr{Nothing} @0x00000000024d8410

julia> unsafe_string(ccall((:Py_GetVersion, libpath), Ptr{UInt8}, ()))
"3.7.0 (default, Jul 15 2018, 10:44:58) \n[GCC 8.1.1 20180531]"

julia> ccall((:Py_IsInitialized, libpath), Cint, ())
0

julia> ccall((:Py_SetProgramName, libpath), Cvoid, (Ptr{Cwchar_t},), wpyprogramname)

julia> ccall((:Py_SetPythonHome, libpath), Cvoid, (Ptr{Cwchar_t},), wPYTHONHOME)

julia> ccall((:Py_InitializeEx, libpath), Cvoid, (Cint,), 0)

julia> ccall((:Py_IsInitialized, libpath), Cint, ())
1

@isuruf
Contributor

isuruf commented Nov 9, 2018

Okay. Looks like this idea won't work

@isuruf
Contributor

isuruf commented Nov 9, 2018

julia> using Libdl

julia> const libpython = "/usr/bin/python"
"/usr/bin/python"

julia> const libpython_handle = Libdl.dlopen(libpython, Libdl.RTLD_LOCAL)
Ptr{Nothing} @0x000000000118c890

julia> using PyCall

julia> include("/home/isuru/.julia/packages/PyCall/0jMpb/test/runtests.jl")
┌ Info: Python version 2.7.15-rc1 from /usr/bin/python, PYTHONHOME=/usr:/usr
│ ENV[PYTHONPATH]=ENV[PYTHONHOME]=ENV[PYTHONEXECUTABLE]=
Test Summary: | Pass  Total
PyCall        |  432    432
Test Summary: | Pass  Total
pydef         |    6      6
Test Summary: | Pass  Total
callback      |    3      3
Test Summary: | Pass  Total
pycall!       |   16     16
Test.DefaultTestSet("pycall!", Any[DefaultTestSet("basics", Any[], 8, false), DefaultTestSet("kwargs", Any[], 8, false)], 0, false)

julia> using Pkg

julia> Pkg.dir("PyCall")
┌ Warning: `Pkg.dir(pkgname, paths...)` is deprecated; instead, do `import PyCall; joinpath(dirname(pathof(PyCall)), "..", paths...)`.
└ @ Pkg.API /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.0/Pkg/src/API.jl:454
"/home/isuru/.julia/packages/PyCall/0jMpb/"

$ cat ~/.julia/packages/PyCall/0jMpb/deps/deps.jl
const python = "/usr/bin/python"
const pyprogramname = "/usr/bin/python"
const wpyprogramname = Base.cconvert(Cwstring, "/usr/bin/python")
const pyversion_build = v"2.7.15-rc1"
const PYTHONHOME = "/usr:/usr"
const wPYTHONHOME = Base.cconvert(Cwstring, "/usr:/usr")

const libpython = "/usr/bin/python"

"True if we are using the Python distribution in the Conda package."
const conda = false

@tkf
Member Author

tkf commented Nov 9, 2018

Cool! But do you need to pass RTLD_LOCAL? IIUC it would make PyCall incompatible with wheels, right?

There is one situation where extensions that are linked in this way can fail to work: if a host program (e.g., apache2) uses dlopen() to load a module (e.g., mod_wsgi) that embeds the CPython interpreter, and the host program does not pass the RTLD_GLOBAL flag to dlopen(), then the embedded CPython will be unable to load any extension modules that do not themselves link explicitly to libpythonX.Y.so.1. Fortunately, apache2 does set the RTLD_GLOBAL flag, as do all the other programs that embed-CPython-via-a-dlopened-plugin that we could locate, so this does not seem to be a serious problem in practice. The incompatibility with Debian/Ubuntu is more of an issue than the theoretical incompatibility with a rather obscure corner case.
--- https://www.python.org/dev/peps/pep-0513/#libpythonx-y-so-1

@tkf
Member Author

tkf commented Nov 9, 2018

What is the Linux distribution you are using?

@isuruf
Contributor

isuruf commented Nov 9, 2018

Ubuntu 18.04

@isuruf
Contributor

isuruf commented Nov 9, 2018

Libdl.RTLD_LAZY|Libdl.RTLD_GLOBAL works, but Libdl.RTLD_DEEPBIND doesn't

@tkf
Member Author

tkf commented Nov 9, 2018

@isuruf Thanks! Yeah I just figured that out too :)

@tkf
Member Author

tkf commented Nov 9, 2018

Looks like RTLD_DEEPBIND was added due to #189.

@isuruf
Contributor

isuruf commented Nov 9, 2018

That issue needs only RTLD_GLOBAL; RTLD_DEEPBIND is not needed. (I can reproduce it after removing the workaround entirely, but not when only RTLD_DEEPBIND is removed.)

@tkf
Member Author

tkf commented Nov 9, 2018

@isuruf Thanks for checking! Do you know which flags are commonly used when embedding CPython?

(I'll try to find out what apache2 etc. are using.)

@stevengj What do you think about removing RTLD_DEEPBIND?

@tkf
Member Author

tkf commented Nov 9, 2018

So I put things together in PR #614. But the same segmentation fault as above (#612 (comment)) happens even when I put the code that works in the REPL into PyCall.__init__.

A MWE is:

module PIEPyCall

using Libdl

const pyhome = abspath("py")
const libpython = "$pyhome/bin/python3.7"
const wPYTHONHOME = Base.cconvert(Cwstring, string(pyhome, ':', pyhome));
const wpyprogramname = Base.cconvert(Cwstring, libpython);

# function __init__()
    h = Libdl.dlopen(libpython, Libdl.RTLD_LAZY|Libdl.RTLD_GLOBAL)
    @show unsafe_string(ccall((:Py_GetVersion, libpython), Ptr{UInt8}, ()))

    @show ccall((:Py_IsInitialized, libpython), Cint, ())

    ccall((:Py_SetProgramName, libpython), Cvoid, (Ptr{Cwchar_t},), wpyprogramname)
    ccall((:Py_SetPythonHome, libpython), Cvoid, (Ptr{Cwchar_t},), wPYTHONHOME)

    ccall((:Py_InitializeEx, libpython), Cvoid, (Cint,), 0)

    @show ccall((:Py_IsInitialized, libpython), Cint, ())
# end

end

Removing the #s (so that the code runs inside __init__) causes the segfault. I find it strange. Could it be a Julia bug?

@tkf
Member Author

tkf commented Nov 9, 2018

Interestingly, #614 works with conda on macOS: https://travis-ci.org/JuliaPy/PyCall.jl/jobs/452716145#L230

@isuruf
Contributor

isuruf commented Nov 9, 2018

The following works, but I don't know why:

function __init__()
    h = Libdl.dlopen(libpython, Libdl.RTLD_LAZY | Libdl.RTLD_GLOBAL)
    @show unsafe_string(ccall(Libdl.dlsym(h, :Py_GetVersion), Ptr{UInt8}, ()))


    @show ccall(Libdl.dlsym(h,:Py_IsInitialized), Cint, ())

    ccall(Libdl.dlsym(h,:Py_SetProgramName), Cvoid, (Ptr{Cwchar_t},), wpyprogramname)
    ccall(Libdl.dlsym(h,:Py_SetPythonHome), Cvoid, (Ptr{Cwchar_t},), wPYTHONHOME)

    ccall(Libdl.dlsym(h,:Py_InitializeEx), Cvoid, (Cint,), 0)

    @show ccall(Libdl.dlsym(h,:Py_IsInitialized), Cint, ())
end

@tkf
Member Author

tkf commented Nov 9, 2018

Yes, it does! Puzzling...

@isuruf
Contributor

isuruf commented Nov 10, 2018

Looks like ccall((function, library)) calls dlopen with RTLD_DEEPBIND even if the library has been loaded before. Using a library handle directly works.
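
A minimal sketch of the handle-based approach described here, assuming one wants to avoid repeating the dlsym lookup on every call. The names LibHandle, open_lib, and fptr are hypothetical and are not part of PyCall or Libdl. Only Libdl.dlopen, Libdl.dlsym, and the RTLD_* constants are real API.

```julia
using Libdl

# Hypothetical wrapper: a dlopen handle plus a cache of resolved symbol addresses.
struct LibHandle
    handle::Ptr{Cvoid}
    cache::Dict{Symbol,Ptr{Cvoid}}
end

# Open `path` with RTLD_LAZY | RTLD_GLOBAL, i.e. without RTLD_DEEPBIND.
open_lib(path::AbstractString) =
    LibHandle(Libdl.dlopen(path, Libdl.RTLD_LAZY | Libdl.RTLD_GLOBAL),
              Dict{Symbol,Ptr{Cvoid}}())

# Resolve `name` through the explicit handle; dlsym runs only on the first lookup.
function fptr(lib::LibHandle, name::Symbol)
    get!(lib.cache, name) do
        Libdl.dlsym(lib.handle, name)
    end
end

# Usage (the path is an assumption; substitute a real PIE python executable):
#   py = open_lib("/path/to/python")
#   unsafe_string(ccall(fptr(py, :Py_GetVersion), Ptr{UInt8}, ()))
```

Because every call goes through a pointer obtained from the handle, ccall never re-opens the file by name, so the default flags should not come into play.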

@tkf
Member Author

tkf commented Nov 10, 2018

I see! There is https://github.com/JuliaLang/julia/blob/c50aaeacc5c702f3a772c57ddacda0a80c2aafeb/src/julia.h#L1514

#define JL_RTLD_DEFAULT (JL_RTLD_LAZY | JL_RTLD_DEEPBIND)

which is used in jl_get_library in runtime_ccall.cpp. I suppose that's part of the internals of ccall.
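
For illustration, the same flag combination can be inspected from Julia, since the Libdl.RTLD_* constants are bit masks. This is only a sketch: has_deepbind is a hypothetical helper, and default_like merely mirrors the JL_RTLD_DEFAULT definition quoted above rather than reading it from the runtime.

```julia
using Libdl

# Hypothetical helper: does a dlopen flag mask include RTLD_DEEPBIND?
has_deepbind(flags::Integer) = (flags & Libdl.RTLD_DEEPBIND) != 0

# Mirrors JL_RTLD_DEFAULT as quoted above (written out by hand here).
default_like = Libdl.RTLD_LAZY | Libdl.RTLD_DEEPBIND

# The flags that worked for the PIE executable earlier in this thread.
pie_friendly = Libdl.RTLD_LAZY | Libdl.RTLD_GLOBAL

println("default-like flags include DEEPBIND: ", has_deepbind(default_like))
println("PIE-friendly flags include DEEPBIND: ", has_deepbind(pie_friendly))
```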

I wonder if there is a way to avoid Libdl.dlsym. For example, the example above works if I call dlopen outside __init__. But of course, it can't be precompiled that way.

Libdl.dlopen(libpython, Libdl.RTLD_LAZY|Libdl.RTLD_GLOBAL)
function __init__()
    @show unsafe_string(ccall((:Py_GetVersion, libpython), Ptr{UInt8}, ()))

    @show ccall((:Py_IsInitialized, libpython), Cint, ())

    ccall((:Py_SetProgramName, libpython), Cvoid, (Ptr{Cwchar_t},), wpyprogramname)
    ccall((:Py_SetPythonHome, libpython), Cvoid, (Ptr{Cwchar_t},), wPYTHONHOME)

    ccall((:Py_InitializeEx, libpython), Cvoid, (Cint,), 0)

    @show ccall((:Py_IsInitialized, libpython), Cint, ())
end

@tkf
Member Author

tkf commented Nov 10, 2018

Resetting stdio in python namespace like this works (i.e., using PyCall; pybuiltin("print")("hello") prints hello):

diff --git a/src/pyinit.jl b/src/pyinit.jl
index e80af15..c822e52 100644
--- a/src/pyinit.jl
+++ b/src/pyinit.jl
@@ -89,6 +89,17 @@ function __init__()
     # issue #189
     libpy_handle = libpython === nothing ? C_NULL :
         Libdl.dlopen(libpython, Libdl.RTLD_LAZY|Libdl.RTLD_GLOBAL)
+    if is_pie
+        unsafe_store!(
+            cglobal((@pysym :stdin), Ptr{Cvoid}),
+            unsafe_load(cglobal(:stdin, Ptr{Cvoid})))
+        unsafe_store!(
+            cglobal((@pysym :stdout), Ptr{Cvoid}),
+            unsafe_load(cglobal(:stdout, Ptr{Cvoid})))
+        unsafe_store!(
+            cglobal((@pysym :stderr), Ptr{Cvoid}),
+            unsafe_load(cglobal(:stderr, Ptr{Cvoid})))
+    end

     already_inited = 0 != ccall((@pysym :Py_IsInitialized), Cint, ())

Not sure if this is safe, though. For example, would it close stdio twice? According to https://stackoverflow.com/a/24556049 it's undefined behavior. Maybe reopening stdio is safer?

@tkf
Member Author

tkf commented Nov 12, 2018

I had to check whether @pysym(:stdin) etc. exist to make it work on macOS. But it seems that this approach works on all relevant platforms:

We always(?) have dynamic linking on Windows, so it's not relevant there. Also, it looks tricky to load an executable as a library.
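
The existence check mentioned above could look roughly like the following sketch. It is not the code from #614: has_symbol and exported_stdio are hypothetical names, and the sketch assumes a handle obtained from Libdl.dlopen. It relies on Libdl.dlsym_e, which returns C_NULL instead of throwing when a symbol cannot be found.

```julia
using Libdl

# Hypothetical helper: true if `sym` can be resolved through `handle`.
# Libdl.dlsym_e returns C_NULL (rather than throwing) when the lookup fails.
has_symbol(handle::Ptr{Cvoid}, sym::Symbol) =
    Libdl.dlsym_e(handle, sym) != C_NULL

# Which of the C stdio globals are visible through `handle`?
# Only these would be candidates for the unsafe_store! patching shown earlier.
exported_stdio(handle::Ptr{Cvoid}) =
    filter(s -> has_symbol(handle, s), [:stdin, :stdout, :stderr])

# Usage (the path is an assumption):
#   h = Libdl.dlopen("/path/to/python", Libdl.RTLD_LAZY | Libdl.RTLD_GLOBAL)
#   exported_stdio(h)
```

Guarding each store with such a check keeps the patching from failing on platforms where the executable does not expose those names.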
