Using pybind in shared modules #1738

@banasraf

I have a project with the following structure:

operators.cpp

#include "operators.h"
#include <pybind11/pybind11.h>
#include <iostream>
#include <thread>

void do_absolutely_nothing() {
   pybind11::gil_scoped_acquire guard{};
   std::cout << "doing something critical" << std::endl;
}

backend.cpp

#include "backend.h"
#include "../operators/operators.h"

void do_nothing() {
  do_absolutely_nothing();
}

pymodule.cpp

#include <pybind11/pybind11.h>
#include "backend/backend.h"

namespace py = pybind11;

PYBIND11_MODULE(pymodule, m) {
  m.def("do_nothing", &do_nothing, py::call_guard<py::gil_scoped_release>());
}

operators is built with this CMakeLists.txt:

add_library(operators_lib STATIC operators.cpp)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fPIC")
target_include_directories(operators_lib
        PRIVATE ${PYBIND11_INCLUDE_DIR}  # from project CMakeLists.txt
        PRIVATE ${pybind11_INCLUDE_DIR}  # from pybind11Config
        PRIVATE ${PYTHON_INCLUDE_DIRS})

backend

add_library(backend_lib SHARED backend.cpp)
target_link_libraries(backend_lib PRIVATE operators_lib)

pymodule

pybind11_add_module(pymodule MODULE pymodule.cpp)
add_dependencies(pymodule backend_lib)
target_link_libraries(pymodule PRIVATE backend_lib)

When testing

pymodule.do_nothing()

I get a segfault with this backtrace:

#0  0x000000000052afd4 in PyEval_GetFrame () at ../Python/ceval.c:4518
#1  0x000000000052b039 in PyEval_GetBuiltins () at ../Python/ceval.c:4480
#2  0x00000000006100ae in module_dict_for_exec () at ../Python/import.c:811
#3  0x000000000061030c in PyImport_ImportFrozenModuleObject () at ../Python/import.c:1222
#4  0x000000000061039d in PyImport_ImportFrozenModule () at ../Python/import.c:1245
#5  0x000000000060e5aa in import_init.isra () at ../Python/pylifecycle.c:241
#6  0x000000000060e94e in _Py_InitializeEx_Private () at ../Python/pylifecycle.c:411
#7  0x0000000000640076 in Py_Main () at ../Modules/main.c:669
#8  0x00000000004d0001 in main () at ../Programs/python.c:65
#9  0x00007ffff7810830 in __libc_start_main (main=0x4cff20 <main>, argc=2, argv=0x7fffffffde08, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffddf8) at ../csu/libc-start.c:291
#10 0x00000000005d6999 in _start ()

Most interestingly, the crash occurs only if backend_lib or operators_lib is a SHARED library. When both backend_lib and operators_lib are STATIC, it works as expected.

I would like to emphasize that even though the GIL is released and then acquired here, no new threads are created. In the first version of these snippets I acquired the GIL in a new thread, but I found that it crashes even without that.

Honestly, this is the first time I have seen a segfault depend on whether a library is built as static or shared, and I have no idea how to debug it.
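
For anyone trying to reproduce this, here is a minimal sketch of one way to run the failing call in a separate interpreter, so the crash does not take down the calling process and its exit status can be inspected. It assumes the built pymodule is importable from the working directory. The helper names run_isolated and describe are hypothetical and not part of the project. Python's faulthandler is enabled in the child, which may print a Python-level traceback to stderr when the segfault occurs.

```python
import subprocess
import sys


def run_isolated(code: str) -> subprocess.CompletedProcess:
    """Run `code` in a fresh interpreter with faulthandler enabled."""
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )


def describe(result: subprocess.CompletedProcess) -> str:
    """Summarize how the child interpreter ended."""
    if result.returncode < 0:
        # On POSIX, a negative return code means the child was killed by a signal.
        return f"killed by signal {-result.returncode}"
    return f"exited with status {result.returncode}"


if __name__ == "__main__":
    # Assumption: `pymodule` is the extension built above and is on the import path.
    result = run_isolated("import pymodule; pymodule.do_nothing()")
    print(describe(result))
    print(result.stderr)
```

Comparing the output of this script for the STATIC and SHARED builds should at least confirm whether the crash is tied to the call itself rather than to interpreter startup.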
