python backend crash #4857
Comments
Hi @Tsingjie89, thanks for sharing the config and backtrace. Are you able to reproduce this with our newest container? Could you also share the model you are using and the steps to reproduce the issue? That will really help us investigate this further.
Hi @krishung5
Thank you @Tsingjie89 for the reply. Apart from the model config you shared, we would also need the model file,
Closing due to inactivity. Please let us know to reopen the issue if you'd like to follow up.
Description
The Python backend may crash when running multiple model instances in CPU mode.
Triton Information
What version of Triton are you using?
22.04
Are you using the Triton container or did you build it yourself?
Using the Triton container.
To Reproduce
Recognition model config:
name: "rec_ch_cpu"
backend: "paddle"
max_batch_size: 6
input [
{
name:"x",
data_type:TYPE_FP32,
dims:[3, 48, -1]
}
]
output [
{
name:"softmax_5.tmp_0",
data_type:TYPE_FP32,
dims:[-1, 6625]
}
]
instance_group [
{
count: 4
kind: KIND_CPU
}
]
optimization {
execution_accelerators {
cpu_execution_accelerator : [
{
name : "mkldnn"
parameters { key: "cpu_threads" value: "5" }
}
]
}
}
Python backend config:
name: "ocr_lite_rec"
backend: "python"
input [
{
name: "INPUT_0"
data_type: TYPE_STRING
dims: [-1]
}
]
input [
{
name: "INPUT_1"
data_type: TYPE_STRING
dims: [-1]
}
]
output [
{
name: "OUTPUT"
data_type: TYPE_STRING
dims: [-1]
}
]
instance_group [{
count: 2
kind: KIND_CPU
}
]
Test data:
84 images, 50 bboxes per image
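To give a sense of the load this test data puts on the two models, here is a rough sizing sketch. It assumes (the issue does not say this) that every bbox becomes one crop sent to `rec_ch_cpu`, and that crops are batched per image up to the `max_batch_size: 6` from the config above. The helper names `plan_batches` and `chunk` are hypothetical and only for illustration.

```python
import math


def plan_batches(num_images, boxes_per_image, max_batch_size):
    """Return (total_crops, batches_per_image, total_batches).

    Assumption: one crop per bbox, batched per image; the real
    ocr_lite_rec model may batch differently.
    """
    total_crops = num_images * boxes_per_image
    batches_per_image = math.ceil(boxes_per_image / max_batch_size)
    return total_crops, batches_per_image, num_images * batches_per_image


def chunk(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Values taken from the issue: 84 images, 50 bboxes each, max_batch_size 6.
total_crops, per_image, total_batches = plan_batches(84, 50, 6)
print(total_crops, per_image, total_batches)
```

Under these assumptions the run issues several hundred inference requests against 4 `rec_ch_cpu` instances from 2 Python backend instances, which is the kind of sustained concurrent load that may be needed to trigger the crash.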
Expected behavior
Core dump info:
Core was generated by `/opt/tritonserver/backends/python/triton_python_backend_stub /workspace/ocr_lit'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00005594dc325f49 in boost::intrusive::bstree_algorithms_base<boost::intrusive::rbtree_node_traits<boost::interprocess::offset_ptr<void, long, unsigned long, 0ul>, true> >::next_node(boost::interprocess::offset_ptr<boost::intrusive::compact_rbtree_node<boost::interprocess::offset_ptr<void, long, unsigned long, 0ul> >, long, unsigned long, 0ul> const&) ()
[Current thread is 1 (Thread 0x7f98c1cfc000 (LWP 125682))]
(gdb)
(gdb) bt
#0 0x00005594dc325f49 in boost::intrusive::bstree_algorithms_base<boost::intrusive::rbtree_node_traits<boost::interprocess::offset_ptr<void, long, unsigned long, 0ul>, true> >::next_node(boost::interprocess::offset_ptr<boost::intrusive::compact_rbtree_node<boost::interprocess::offset_ptr<void, long, unsigned long, 0ul> >, long, unsigned long, 0ul> const&) ()
#1 0x00005594dc32e8b5 in boost::interprocess::rbtree_best_fit<boost::interprocess::null_mutex_family, boost::interprocess::offset_ptr<void, long, unsigned long, 0ul>, 0ul>::priv_deallocate(void*) ()
#2 0x00005594dc32eea4 in std::_Function_handler<void (char*), triton::backend::python::AllocatedSharedMemory triton::backend::python::SharedMemoryManager::WrapObjectInUniquePtr(char*, triton::backend::python::AllocatedShmOwnership*, long const&)::{lambda(char*)#1}>::_M_invoke(std::_Any_data const&, char*&&) ()
#3 0x00005594dc33f2bd in triton::backend::python::PbTensor::~PbTensor() ()
#4 0x00005594dc343261 in std::_Sp_counted_ptr_inplace<triton::backend::python::PbTensor, std::allocator<triton::backend::python::PbTensor>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() ()
#5 0x00005594dc319e55 in triton::backend::python::InferRequest::~InferRequest() ()
#6 0x00005594dc319f76 in std::_Sp_counted_ptr<triton::backend::python::InferRequest*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() ()
#7 0x00005594dc31b598 in std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() ()
#8 0x00005594dc31b71a in pybind11::class_<triton::backend::python::InferRequest, std::shared_ptr<triton::backend::python::InferRequest> >::dealloc(pybind11::detail::value_and_holder&) ()
#9 0x00005594dc309c27 in pybind11::detail::clear_instance(_object*) ()
#10 0x00005594dc30aba3 in pybind11_object_dealloc ()
#11 0x00007f98c2334dd3 in ?? () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#12 0x00007f98c254c865 in _PyGen_Send () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#13 0x00007f98b8839ef9 in ?? () from /usr/lib/python3.8/lib-dynload/_asyncio.cpython-38-x86_64-linux-gnu.so
#14 0x00007f98b88390ac in ?? () from /usr/lib/python3.8/lib-dynload/_asyncio.cpython-38-x86_64-linux-gnu.so
#15 0x00007f98c255db1b in _PyObject_MakeTpCall () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#16 0x00007f98c246a8a3 in ?? () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#17 0x00007f98c251444f in ?? () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#18 0x00007f98c255d830 in PyVectorcall_Call () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#19 0x00007f98c2332f48 in _PyEval_EvalFrameDefault () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#20 0x00007f98c233506b in ?? () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#21 0x00007f98c2329d6d in ?? () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#22 0x00007f98c232b018 in _PyEval_EvalFrameDefault () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#23 0x00007f98c233506b in ?? () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#24 0x00007f98c2329d6d in ?? () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#25 0x00007f98c232b018 in _PyEval_EvalFrameDefault () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#26 0x00007f98c233506b in ?? () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#27 0x00007f98c2329d6d in ?? () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#28 0x00007f98c232b018 in _PyEval_EvalFrameDefault () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#29 0x00007f98c233506b in ?? () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#30 0x00007f98c2329d6d in ?? () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#31 0x00007f98c232b018 in _PyEval_EvalFrameDefault () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#32 0x00007f98c247fe3b in _PyEval_EvalCodeWithName () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#33 0x00007f98c255d114 in _PyFunction_Vectorcall () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#34 0x00007f98c255d830 in PyVectorcall_Call () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#35 0x00005594dc31f565 in pybind11::object pybind11::detail::object_api<pybind11::detail::accessor<pybind11::detail::accessor_policies::str_attr> >::operator()<(pybind11::return_value_policy)1, pybind11::object&>(pybind11::object&) const ()
#36 0x00005594dc313c95 in triton::backend::python::Stub::Execute(triton::backend::python::RequestBatch*, triton::backend::python::ResponseBatch*, long*) ()
#37 0x00005594dc317bf6 in triton::backend::python::Stub::RunCommand() ()
#38 0x00005594dc2fd160 in main ()
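Read from the bottom up, the backtrace shows `Stub::Execute` calling into the Python model, an asyncio coroutine step (`_PyGen_Send`) dropping the last reference to an `InferRequest`, and the `PbTensor` destructor then faulting inside the shared memory allocator's `priv_deallocate`. To make a long trace like this easier to triage, the stub-side frames can be filtered out with a small script. This is only a sketch: the regular expression assumes the one-line-per-frame `gdb bt` layout shown above, and `parse_frames` and `stub_frames` are hypothetical helper names, not part of Triton or gdb.

```python
import re

# Matches frames such as:
#   #3 0x00005594dc33f2bd in some::symbol(args) ()
#   #12 0x00007f98c254c865 in _PyGen_Send () from /lib/.../libpython3.8.so.1.0
FRAME_RE = re.compile(r"^#(\d+)\s+(0x[0-9a-f]+) in (.*?) \(\)(?: from (\S+))?$")


def parse_frames(backtrace):
    """Parse gdb 'bt' text into a list of frame dicts; unparsable lines are skipped."""
    frames = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append({
                "index": int(match.group(1)),
                "symbol": match.group(3),
                "library": match.group(4),  # None when gdb prints no 'from ...'
            })
    return frames


def stub_frames(frames):
    """Keep only frames whose symbol lives in the Python backend stub namespace."""
    return [f for f in frames if "triton::backend::python" in f["symbol"]]


# A few frames copied from the backtrace above.
SAMPLE = """\
#3 0x00005594dc33f2bd in triton::backend::python::PbTensor::~PbTensor() ()
#5 0x00005594dc319e55 in triton::backend::python::InferRequest::~InferRequest() ()
#12 0x00007f98c254c865 in _PyGen_Send () from /lib/x86_64-linux-gnu/libpython3.8.so.1.0
#36 0x00005594dc313c95 in triton::backend::python::Stub::Execute(triton::backend::python::RequestBatch*, triton::backend::python::ResponseBatch*, long*) ()
"""

for frame in stub_frames(parse_frames(SAMPLE)):
    # Print the frame number and the symbol without its argument list.
    print(frame["index"], frame["symbol"].split("(")[0])
```

The filter only narrows down where the crash surfaces; it does not by itself identify the root cause.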