
SoX effect "rate" crashing or hanging in multiprocessing #1021

Closed
pzelasko opened this issue Nov 12, 2020 · 13 comments
@pzelasko

pzelasko commented Nov 12, 2020

🐛 Bug

This time I'm pretty sure it's a bug :P

When running the torchaudio speed + rate SoX effect chain inside a ProcessPoolExecutor on the CLSP grid, the subprocess hits a segmentation fault inside the apply_effects_tensor function. I managed to make the subprocess wait and attached gdb to it; this is the native stack trace I got:

Program received signal SIGSEGV, Segmentation fault.
0x00007fb3862087d6 in __kmp_acquire_ticket_lock () from /home/pzelasko/miniconda3/envs/lhotse/lib/python3.7/site-packages/mkl/../../../libiomp5.so
(gdb) bt
#0  0x00007fb3862087d6 in __kmp_acquire_ticket_lock () from /home/pzelasko/miniconda3/envs/lhotse/lib/python3.7/site-packages/mkl/../../../libiomp5.so
#1  0x00007fb3861dad4a in __kmpc_set_lock () from /home/pzelasko/miniconda3/envs/lhotse/lib/python3.7/site-packages/mkl/../../../libiomp5.so
#2  0x00007fb30e244746 in update_fft_cache (len=len@entry=2048) at /opt/conda/conda-bld/torchaudio_1603752092839/work/third_party/src/libsox/src/effects_i_dsp.c:190
#3  0x00007fb30e245187 in lsx_safe_rdft (len=2048, type=1, d=0x55f3b1e30700) at /opt/conda/conda-bld/torchaudio_1603752092839/work/third_party/src/libsox/src/effects_i_dsp.c:218
#4  0x00007fb30e254be5 in dft_stage_init (instance=instance@entry=0, Fp=0.91362772738460019, Fs=Fs@entry=1, Fn=2, att=att@entry=132.45319809215172, phase=phase@entry=50, stage=0x55f3b1ed5730, L=L@entry=2, M=1)
    at /opt/conda/conda-bld/torchaudio_1603752092839/work/third_party/src/libsox/src/rate.c:239
#5  0x00007fb30e2568b6 in rate_init (noSmallIntOpt=<optimized out>, max_coefs_size=400, interpolator=-1, use_hi_prec_clock=sox_false, maintain_3dB_pt=<optimized out>, rolloff=rolloff_small, anti_aliasing_pc=<optimized out>,
    bw_pc=<optimized out>, phase=50, bits=<optimized out>, factor=0.98090137934956023, shared=<optimized out>, p=<optimized out>) at /opt/conda/conda-bld/torchaudio_1603752092839/work/third_party/src/libsox/src/rate.c:367
#6  start (effp=effp@entry=0x55f3b1e2edc0) at /opt/conda/conda-bld/torchaudio_1603752092839/work/third_party/src/libsox/src/rate.c:632
#7  0x00007fb30e241f95 in sox_add_effect (chain=0x55f3b1fdec40, effp=effp@entry=0x55f3b1e2edc0, in=in@entry=0x7ffdf6d5acf0, out=out@entry=0x7ffdf6d5acd0)
    at /opt/conda/conda-bld/torchaudio_1603752092839/work/third_party/src/libsox/src/effects.c:157
#8  0x00007fb30e232bcd in torchaudio::sox_effects_chain::SoxEffectsChain::addEffect (this=this@entry=0x7ffdf6d5ac90, effect=...) from /home/pzelasko/miniconda3/envs/lhotse/lib/python3.7/site-packages/torchaudio/_torchaudio.so
#9  0x00007fb30e1f3e1f in torchaudio::sox_effects::apply_effects_tensor (input_signal=..., effects=...) from /home/pzelasko/miniconda3/envs/lhotse/lib/python3.7/site-packages/torchaudio/_torchaudio.so
#10 0x00007fb30e20b593 in c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<c10::intrusive_ptr<torchaudio::sox_utils::TensorSignal, c10::detail::intrusive_target_default_null_type<torchaudio::sox_utils::TensorSignal> > (*)(c10::intrusive_ptr<torchaudio::sox_utils::TensorSignal, c10::detail::intrusive_target_default_null_type<torchaudio::sox_utils::TensorSignal> > const&, std::vector<std::vector<std::string, std::allocator<std::string> >, std::allocator<std::vector<std::string, std::allocator<std::string> > > >), c10::intrusive_ptr<torchaudio::sox_utils::TensorSignal, c10::detail::intrusive_target_default_null_type<torchaudio::sox_utils::TensorSignal> >, c10::guts::typelist::typelist<c10::intrusive_ptr<torchaudio::sox_utils::TensorSignal, c10::detail::intrusive_target_default_null_type<torchaudio::sox_utils::TensorSignal> > const&, std::vector<std::vector<std::string, std::allocator<std::string> >, std::allocator<std::vector<std::string, std::allocator<std::string> > > > > >::operator() (this=<optimized out>, args#1=<error reading variable: access outside bounds of object referenced via synthetic pointer>, args#0=...)
   from /home/pzelasko/miniconda3/envs/lhotse/lib/python3.7/site-packages/torchaudio/_torchaudio.so
#11 c10::impl::call_functor_with_args_from_stack_<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<c10::intrusive_ptr<torchaudio::sox_utils::TensorSignal, c10::detail::intrusive_target_default_null_type<torchaudio::sox_utils::TensorSignal> > (*)(c10::intrusive_ptr<torchaudio::sox_utils::TensorSignal, c10::detail::intrusive_target_default_null_type<torchaudio::sox_utils::TensorSignal> > const&, std::vector<std::vector<std::string, std::allocator<std::string> >, std::allocator<std::vector<std::string, std::allocator<std::string> > > >), c10::intrusive_ptr<torchaudio::sox_utils::TensorSignal, c10::detail::intrusive_target_default_null_type<torchaudio::sox_utils::TensorSignal> >, c10::guts::typelist::typelist<c10::intrusive_ptr<torchaudio::sox_utils::TensorSignal, c10::detail::intrusive_target_default_null_type<torchaudio::sox_utils::TensorSignal> > const&, std::vector<std::vector<std::string, std::allocator<std::string> >, std::allocator<std::vector<std::string, std::allocator<std::string> > > > > >, false, 0ul, 1ul> (stack=0x7ffdf6d5b490, functor=<optimized out>) at /opt/conda/conda-bld/torchaudio_1603752092839/work/torchaudio/csrc/register.cpp:339
#12 c10::impl::call_functor_with_args_from_stack<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<c10::intrusive_ptr<torchaudio::sox_utils::TensorSignal, c10::detail::intrusive_target_default_null_type<torchaudio::sox_utils::TensorSignal> > (*)(c10::intrusive_ptr<torchaudio::sox_utils::TensorSignal, c10::detail::intrusive_target_default_null_type<torchaudio::sox_utils::TensorSignal> > const&, std::vector<std::vector<std::string, std::allocator<std::string> >, std::allocator<std::vector<std::string, std::allocator<std::string> > > >), c10::intrusive_ptr<torchaudio::sox_utils::TensorSignal, c10::detail::intrusive_target_default_null_type<torchaudio::sox_utils::TensorSignal> >, c10::guts::typelist::typelist<c10::intrusive_ptr<torchaudio::sox_utils::TensorSignal, c10::detail::intrusive_target_default_null_type<torchaudio::sox_utils::TensorSignal> > const&, std::vector<std::vector<std::string, std::allocator<std::string> >, std::allocator<std::vector<std::string, std::allocator<std::string> > > > > >, false> (stack=0x7ffdf6d5b490, functor=<optimized out>) at /opt/conda/conda-bld/torchaudio_1603752092839/work/torchaudio/csrc/register.cpp:346
#13 _ZZN3c104impl31make_boxed_from_unboxed_functorINS0_6detail31WrapFunctionIntoRuntimeFunctor_IPFNS_13intrusive_ptrIN10torchaudio9sox_utils12TensorSignalENS_6detail34intrusive_target_default_null_typeIS7_EEEERKSB_St6vectorISE_ISsSaISsEESaISG_EEESB_NS_4guts8typelist8typelistIJSD_SI_EEEEELb0EE4callEPNS_14OperatorKernelERKNS_14OperatorHandleEPSE_INS_6IValueESaISW_EEENKUlT_E_clINSL_6detail9_identityEEEDaS10_ (__closure=<optimized out>, delay_check=...)
    at /opt/conda/conda-bld/torchaudio_1603752092839/work/torchaudio/csrc/register.cpp:392
#14 _ZN3c104guts6detail13_if_constexprILb1EE4callIZNS_4impl31make_boxed_from_unboxed_functorINS5_6detail31WrapFunctionIntoRuntimeFunctor_IPFNS_13intrusive_ptrIN10torchaudio9sox_utils12TensorSignalENS_6detail34intrusive_target_default_null_typeISC_EEEERKSG_St6vectorISJ_ISsSaISsEESaISL_EEESG_NS0_8typelist8typelistIJSI_SN_EEEEELb0EE4callEPNS_14OperatorKernelERKNS_14OperatorHandleEPSJ_INS_6IValueESaIS10_EEEUlT_E_ZNSU_4callESW_SZ_S13_EUlvE0_LPv0EEEDcOS14_OT0_ (thenCallback=<optimized out>)
    at /opt/conda/conda-bld/torchaudio_1603752092839/work/torchaudio/csrc/register.cpp:193
#15 _ZN3c104guts12if_constexprILb1EZNS_4impl31make_boxed_from_unboxed_functorINS2_6detail31WrapFunctionIntoRuntimeFunctor_IPFNS_13intrusive_ptrIN10torchaudio9sox_utils12TensorSignalENS_6detail34intrusive_target_default_null_typeIS9_EEEERKSD_St6vectorISG_ISsSaISsEESaISI_EEESD_NS0_8typelist8typelistIJSF_SK_EEEEELb0EE4callEPNS_14OperatorKernelERKNS_14OperatorHandleEPSG_INS_6IValueESaISX_EEEUlT_E_ZNSR_4callEST_SW_S10_EUlvE0_EEDcOT0_OT1_ (elseCallback=<optimized out>,
    thenCallback=<optimized out>) at /opt/conda/conda-bld/torchaudio_1603752092839/work/torchaudio/csrc/register.cpp:282
#16 c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<c10::intrusive_ptr<torchaudio::sox_utils::TensorSignal, c10::detail::intrusive_target_default_null_type<torchaudio::sox_utils::TensorSignal> > (*)(c10::intrusive_ptr<torchaudio::sox_utils::TensorSignal, c10::detail::intrusive_target_default_null_type<torchaudio::sox_utils::TensorSignal> > const&, std::vector<std::vector<std::string, std::allocator<std::string> >, std::allocator<std::vector<std::string, std::allocator<std::string> > > >), c10::intrusive_ptr<torchaudio::sox_utils::TensorSignal, c10::detail::intrusive_target_default_null_type<torchaudio::sox_utils::TensorSignal> >, c10::guts::typelist::typelist<c10::intrusive_ptr<torchaudio::sox_utils::TensorSignal, c10::detail::intrusive_target_default_null_type<torchaudio::sox_utils::TensorSignal> > const&, std::vector<std::vector<std::string, std::allocator<std::string> >, std::allocator<std::vector<std::string, std::allocator<std::string> > > > > >, false>::call (functor=<optimized out>, stack=0x7ffdf6d5b490) at /opt/conda/conda-bld/torchaudio_1603752092839/work/torchaudio/csrc/register.cpp:388
#17 0x00007fb35fa661f2 in c10::Dispatcher::callBoxed(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const () from /home/pzelasko/miniconda3/envs/lhotse/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so
#18 0x00007fb35fa61829 in torch::jit::(anonymous namespace)::createOperatorFromC10_withTracingHandledHere(c10::OperatorHandle const&)::{lambda(std::vector<c10::IValue, std::allocator<c10::IValue> >*)#1}::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >*) const () from /home/pzelasko/miniconda3/envs/lhotse/lib/python3.7/site-packages/torch/lib/libtorch_cpu.so
#19 0x00007fb36458f42d in torch::jit::invokeOperatorFromPython(std::vector<std::shared_ptr<torch::jit::Operator>, std::allocator<std::shared_ptr<torch::jit::Operator> > > const&, pybind11::args, pybind11::kwargs) ()
   from /home/pzelasko/miniconda3/envs/lhotse/lib/python3.7/site-packages/torch/lib/libtorch_python.so
#20 0x00007fb364567624 in torch::jit::initJITBindings(_object*)::{lambda(std::string const&)#104}::operator()(std::string const&) const::{lambda(pybind11::args, {lambda(std::string const&)#104}::kwargs)#1}::operator()(pybind11, pybind11::args) const () from /home/pzelasko/miniconda3/envs/lhotse/lib/python3.7/site-packages/torch/lib/libtorch_python.so
#21 0x00007fb364567a0c in void pybind11::cpp_function::initialize<torch::jit::initJITBindings(_object*)::{lambda(std::string const&)#104}::operator()(std::string const&) const::{lambda(pybind11::args, pybind11::kwargs)#1}, pybind11::object, {lambda(std::string const&)#104}, pybind11::args, pybind11::name, pybind11::doc>(torch::jit::initJITBindings(_object*)::{lambda(std::string const&)#104}::operator()(std::string const&) const::{lambda(pybind11::args, pybind11::kwargs)#1}&&, pybind11::object (*)({lambda(std::string const&)#104}, pybind11::args), pybind11::name const&, pybind11::doc const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail) ()
   from /home/pzelasko/miniconda3/envs/lhotse/lib/python3.7/site-packages/torch/lib/libtorch_python.so
#22 0x00007fb3641da5ea in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) () from /home/pzelasko/miniconda3/envs/lhotse/lib/python3.7/site-packages/torch/lib/libtorch_python.so
#23 0x000055f3adb53c94 in _PyMethodDef_RawFastCallKeywords () at /tmp/build/80754af9/python_1588882889832/work/Objects/call.c:694
#24 0x000055f3adb53db1 in _PyCFunction_FastCallKeywords (func=0x7fb30cdac960, args=<optimized out>, nargs=<optimized out>, kwnames=<optimized out>) at /tmp/build/80754af9/python_1588882889832/work/Objects/call.c:734
#25 0x000055f3adbbf5be in call_function (kwnames=0x0, oparg=2, pp_stack=<synthetic pointer>) at /tmp/build/80754af9/python_1588882889832/work/Python/ceval.c:4568
#26 _PyEval_EvalFrameDefault () at /tmp/build/80754af9/python_1588882889832/work/Python/ceval.c:3093
#27 0x000055f3adb032b9 in _PyEval_EvalCodeWithName () at /tmp/build/80754af9/python_1588882889832/work/Python/ceval.c:3930
#28 0x000055f3adb53435 in _PyFunction_FastCallKeywords () at /tmp/build/80754af9/python_1588882889832/work/Objects/call.c:433
#29 0x000055f3adbbf229 in call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>) at /tmp/build/80754af9/python_1588882889832/work/Python/ceval.c:4616
#30 _PyEval_EvalFrameDefault () at /tmp/build/80754af9/python_1588882889832/work/Python/ceval.c:3093
#31 0x000055f3adb0431b in function_code_fastcall (globals=<optimized out>, nargs=3, args=<optimized out>, co=0x7fb30e107ae0) at /tmp/build/80754af9/python_1588882889832/work/Objects/call.c:283
#32 _PyFunction_FastCallDict () at /tmp/build/80754af9/python_1588882889832/work/Objects/call.c:322
#33 0x000055f3adb22b93 in _PyObject_Call_Prepend () at /tmp/build/80754af9/python_1588882889832/work/Objects/call.c:908
#34 0x000055f3adb5a16a in slot_tp_call () at /tmp/build/80754af9/python_1588882889832/work/Objects/typeobject.c:6402
#35 0x000055f3adb5b00b in _PyObject_FastCallKeywords () at /tmp/build/80754af9/python_1588882889832/work/Objects/call.c:199
#36 0x000055f3adbbf186 in call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>) at /tmp/build/80754af9/python_1588882889832/work/Python/ceval.c:4619
#37 _PyEval_EvalFrameDefault () at /tmp/build/80754af9/python_1588882889832/work/Python/ceval.c:3124
#38 0x000055f3adb032b9 in _PyEval_EvalCodeWithName () at /tmp/build/80754af9/python_1588882889832/work/Python/ceval.c:3930
#39 0x000055f3adb53497 in _PyFunction_FastCallKeywords () at /tmp/build/80754af9/python_1588882889832/work/Objects/call.c:433
#40 0x000055f3adbbbcba in call_function (kwnames=0x7fb30e106fb0, oparg=<optimized out>, pp_stack=<synthetic pointer>) at /tmp/build/80754af9/python_1588882889832/work/Python/ceval.c:4616
#41 _PyEval_EvalFrameDefault () at /tmp/build/80754af9/python_1588882889832/work/Python/ceval.c:3139
#42 0x000055f3adb032b9 in _PyEval_EvalCodeWithName () at /tmp/build/80754af9/python_1588882889832/work/Python/ceval.c:3930
#43 0x000055f3adb04610 in _PyFunction_FastCallDict () at /tmp/build/80754af9/python_1588882889832/work/Objects/call.c:376
#44 0x000055f3adb22b93 in _PyObject_Call_Prepend () at /tmp/build/80754af9/python_1588882889832/work/Objects/call.c:908
#45 0x000055f3adb1595e in PyObject_Call () at /tmp/build/80754af9/python_1588882889832/work/Objects/call.c:245

The same test seems to be hanging in Lhotse's GitHub Actions CI: https://github.com/lhotse-speech/lhotse/pull/124/checks?check_run_id=1391378614

To Reproduce

Steps to reproduce the behavior:

  1. Install the latest PyTorch with torchaudio (via conda), then install Lhotse from the torchaudio data augmentation branch: git clone https://github.com/lhotse-speech/lhotse && cd lhotse && git checkout feature/augmentation-refactoring && pip install -e '.[dev]'
  2. Run the test: pytest test/known_issues/test_augment_with_executor.py

Expected behavior

No crash

Environment

  • What commands did you use to install torchaudio (conda/pip/build from source)? conda
  • If you are building from source, which commit is it?
  • What does torchaudio.__version__ print? (If applicable)

Collecting environment information...
PyTorch version: 1.7.0
Is debug build: True
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: Debian GNU/Linux 9.13 (stretch) (x86_64)
GCC version: (Debian 6.3.0-18+deb9u1) 6.3.0 20170516
Clang version: 3.8.1-24 (tags/RELEASE_381/final)
CMake version: version 3.7.2

Python version: 3.7 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: GeForce GTX 1080 Ti
GPU 1: GeForce GTX 1080 Ti
GPU 2: GeForce GTX 1080 Ti
GPU 3: GeForce GTX 1080 Ti

Nvidia driver version: 440.33.01
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.18.5
[pip3] torch==1.7.0
[pip3] torchaudio==0.7.0a0+ac17b64
[pip3] torchvision==0.8.1
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.2.89 hfd86e86_1
[conda] mkl 2020.1 217
[conda] mkl-service 2.3.0 py37he904b0f_0
[conda] mkl_fft 1.1.0 py37h23d657b_0
[conda] mkl_random 1.1.1 py37h0573a6f_0
[conda] numpy 1.18.5 py37ha1c710e_0
[conda] numpy-base 1.18.5 py37hde5b4d6_0
[conda] pytorch 1.7.0 py3.7_cuda10.2.89_cudnn7.6.5_0 pytorch
[conda] torchaudio 0.5.1 pypi_0 pypi
[conda] torchvision 0.8.1 py37_cu102 pytorch

Additional context

@mthrok
Collaborator

mthrok commented Nov 12, 2020

Hi @pzelasko

I confirm I could reproduce the issue. I will take a look into it.

error
$ pytest test/known_issues/test_augment_with_executor.py
============================================================================================================ test session starts =============================================================================================================
platform linux -- Python 3.8.3, pytest-5.4.3, py-1.9.0, pluggy-0.13.1
rootdir: /scratch/moto/lhotse
plugins: hypothesis-5.18.0
collected 2 items

test/known_issues/test_augment_with_executor.py Fatal Python error: Segmentation fault

Current thread 0x00007f60f78ea740 (most recent call first):
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/torchaudio/sox_effects/sox_effects.py", line 159 in apply_effects_tensor
  File "/scratch/moto/lhotse/lhotse/augmentation/torchaudio.py", line 58 in __call__
  File "/scratch/moto/lhotse/lhotse/features/base.py", line 136 in extract_from_samples_and_store
  File "/scratch/moto/lhotse/lhotse/cut.py", line 336 in compute_and_store_features
  File "/scratch/moto/lhotse/lhotse/cut.py", line 1509 in _extract_and_store_features_helper_fn
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/concurrent/futures/process.py", line 239 in _process_worker
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/multiprocessing/process.py", line 108 in run
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/multiprocessing/process.py", line 315 in _bootstrap
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/multiprocessing/popen_fork.py", line 75 in _launch
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/multiprocessing/popen_fork.py", line 19 in __init__
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/multiprocessing/context.py", line 276 in _Popen
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/multiprocessing/process.py", line 121 in start
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/concurrent/futures/process.py", line 608 in _adjust_process_count
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/concurrent/futures/process.py", line 584 in _start_queue_management_thread
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/concurrent/futures/process.py", line 645 in submit
  File "/scratch/moto/lhotse/lhotse/cut.py", line 1311 in compute_and_store_features
  File "/scratch/moto/lhotse/test/known_issues/test_augment_with_executor.py", line 22 in test_wav_augment_with_executor
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/_pytest/python.py", line 182 in pytest_pyfunc_call
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/manager.py", line 84 in <lambda>
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/_pytest/python.py", line 1477 in runtest
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/_pytest/runner.py", line 135 in pytest_runtest_call
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/manager.py", line 84 in <lambda>
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/_pytest/runner.py", line 217 in <lambda>
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/_pytest/runner.py", line 244 in from_call
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/_pytest/runner.py", line 216 in call_runtest_hook
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/_pytest/runner.py", line 186 in call_and_report
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/_pytest/runner.py", line 100 in runtestprotocol
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/_pytest/runner.py", line 85 in pytest_runtest_protocol
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/manager.py", line 84 in <lambda>
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/_pytest/main.py", line 272 in pytest_runtestloop
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/manager.py", line 84 in <lambda>
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/_pytest/main.py", line 247 in _main
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/_pytest/main.py", line 191 in wrap_session
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/_pytest/main.py", line 240 in pytest_cmdline_main
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/callers.py", line 187 in _multicall
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/manager.py", line 84 in <lambda>
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/manager.py", line 93 in _hookexec
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/pluggy/hooks.py", line 286 in __call__
  File "/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/_pytest/config/__init__.py", line 124 in main
  File "/home/moto/conda/envs/PY3.8-cuda101/bin/pytest", line 11 in <module>
F.                                                                                                                                                                                     [100%]

================================================================================================================== FAILURES ==================================================================================================================
____________________________________________________________________________________________ test_wav_augment_with_executor[ProcessPoolExecutor] _____________________________________________________________________________________________

exec_type = <class 'concurrent.futures.process.ProcessPoolExecutor'>

    @pytest.mark.parametrize('exec_type', [ProcessPoolExecutor, ThreadPoolExecutor])
    def test_wav_augment_with_executor(exec_type):
        with make_cut(sampling_rate=16000, num_samples=16000) as cut, \
                TemporaryDirectory() as d, \
                LilcomFilesWriter(storage_path=d) as storage, \
                exec_type(1) as ex:
            cut_set = CutSet.from_cuts(
                cut.with_id(str(i)) for i in range(100)
            )
            # Just test that it runs and does not hang.
>           cut_set_feats = cut_set.compute_and_store_features(
                extractor=Fbank(),
                storage=storage,
                augment_fn=SoxEffectTransform(speed(16000)),
                executor=ex
            )

test/known_issues/test_augment_with_executor.py:22:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
lhotse/cut.py:1320: in compute_and_store_features
    cut_set = CutSet.from_cuts(f.result() for f in futures)
lhotse/cut.py:988: in from_cuts
    return CutSet({cut.id: cut for cut in cuts})
lhotse/cut.py:988: in <dictcomp>
    return CutSet({cut.id: cut for cut in cuts})
lhotse/cut.py:1320: in <genexpr>
    cut_set = CutSet.from_cuts(f.result() for f in futures)
/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/concurrent/futures/_base.py:439: in result
    return self.__get_result()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <Future at 0x7f6083584820 state=finished raised BrokenProcessPool>

    def __get_result(self):
        if self._exception:
>           raise self._exception
E           concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.

/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/concurrent/futures/_base.py:388: BrokenProcessPool
============================================================================================================== warnings summary ==============================================================================================================
/home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/torchaudio/backend/utils.py:53
  /home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/torchaudio/backend/utils.py:53: UserWarning: "sox" backend is being deprecated. The default backend will be changed to "sox_io" backend in 0.8.0 and "sox" backend will be removed in 0.9.0. Please migrate to "sox_io" backend. Please refer to https://github.com/pytorch/audio/issues/903 for the detail.
    warnings.warn(

test/known_issues/test_augment_with_executor.py::test_wav_augment_with_executor[ThreadPoolExecutor]
  /home/moto/conda/envs/PY3.8-cuda101/lib/python3.8/site-packages/torchaudio/compliance/kaldi.py:574: UserWarning: The function torch.rfft is deprecated and will be removed in a future PyTorch release. Use the new torch.fft module functions, instead, by importing torch.fft and calling torch.fft.fft or torch.fft.rfft. (Triggered internally at  /opt/conda/conda-bld/pytorch_1603729009598/work/aten/src/ATen/native/SpectralOps.cpp:590.)
    fft = torch.rfft(strided_input, 1, normalized=False, onesided=True)

-- Docs: https://docs.pytest.org/en/latest/warnings.html
========================================================================================================== short test summary info ===========================================================================================================
FAILED test/known_issues/test_augment_with_executor.py::test_wav_augment_with_executor[ProcessPoolExecutor] - concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was runn...
================================================================================================== 1 failed, 1 passed, 2 warnings in 1.48s ===================================================================================================
env
PyTorch version: 1.7.0
Is debug build: True
CUDA used to build PyTorch: 10.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.3 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: version 3.10.2

Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: 10.1.243
GPU models and configuration:
GPU 0: Quadro GP100
GPU 1: Quadro GP100

Nvidia driver version: 418.116.00
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.2
[pip3] pytorch-sphinx-theme==0.0.24
[pip3] torch==1.7.0
[pip3] torchaudio==0.7.0a0+ac17b64
[conda] blas                      1.0                         mkl
[conda] cudatoolkit               10.1.243             h6bb024c_0
[conda] magma-cuda101             2.5.2                         1    pytorch
[conda] mkl                       2020.1                      217
[conda] mkl-include               2020.1                      219    conda-forge
[conda] mkl-service               2.3.0            py38he904b0f_0
[conda] mkl_fft                   1.2.0            py38h23d657b_0
[conda] mkl_random                1.1.1            py38h0573a6f_0
[conda] numpy                     1.19.2           py38h54aff64_0
[conda] numpy-base                1.19.2           py38hfa32c7d_0
[conda] pytorch                   1.7.0           py3.8_cuda10.1.243_cudnn7.6.3_0    pytorch
[conda] pytorch-sphinx-theme      0.0.24                    dev_0    <develop>
[conda] torchaudio                0.7.0                      py38    pytorch

@mthrok
Collaborator

mthrok commented Nov 12, 2020

@pzelasko

Could you check which start method is used to launch the subprocesses? (print(multiprocessing.get_start_method()))
One possible cause is how the subprocess is launched: pytorch/pytorch#46409 (comment)

In my env, multiprocessing.get_start_method() returns fork.
If I modify the test to use exec_type(1, mp_context=multiprocessing.get_context('spawn')), it works.
If this is the case, the root cause might be the same as pytorch/pytorch#46409.
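The suggested workaround can be sketched with the standard library alone. This is a hedged sketch, not Lhotse's actual code; the helper names are my own, and `make_spawn_executor` simply mirrors `exec_type(1, mp_context=multiprocessing.get_context('spawn'))` from the test:

```python
import multiprocessing
from concurrent.futures import ProcessPoolExecutor

def check_start_method():
    # On Linux the default start method is "fork"; the crash was observed under fork.
    return multiprocessing.get_start_method()

def make_spawn_executor(max_workers=1):
    # Build an executor whose workers are started with "spawn" instead of the
    # default. Workers are only launched once tasks are submitted.
    ctx = multiprocessing.get_context("spawn")
    return ProcessPoolExecutor(max_workers=max_workers, mp_context=ctx)
```

With the spawn context each worker starts from a fresh interpreter, so no OpenMP runtime state is inherited from the parent process.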

@pzelasko
Author

Thanks @mthrok, that is a valid workaround that un-blocks Lhotse :)

FYI, you mentioned in that other thread that libsox was not built with OpenMP, but the top three frames of the call stack I reported suggest otherwise. Note that it broke at line 190 in update_fft_cache, which calls a macro that only does anything when HAVE_OPENMP is defined (see here).

@mthrok
Collaborator

mthrok commented Nov 13, 2020

Thanks @mthrok, that is a valid workaround that un-blocks Lhotse :)

Glad it helped.

FYI, you mentioned in that other thread that libsox was not built with OpenMP, but the top three frames of the call stack I reported suggest otherwise. Note that it broke at line 190 in update_fft_cache, which calls a macro that only does anything when HAVE_OPENMP is defined (see here).

In pytorch/pytorch#46409, the issue focused on the macOS environment, and the binary distributions of torchaudio for macOS do not include OpenMP. The binary distributions for Linux do.

I was not sure why MKL showed up in the stack trace you shared, but that makes sense. Thanks for letting me know.
I guess the cause of the issue is fork vs spawn in Python's multiprocessing module, regardless of the OS.
Thanks for reporting the issue. This is a good data point and gave us better insight into it. (Though I am still not sure how we can fix it when the subprocess is created with the fork method.)

@mthrok
Collaborator

mthrok commented Nov 13, 2020

I will add the workaround to the documentation.
I think this is a pitfall that potentially many people will fall into.

@mthrok
Collaborator

mthrok commented Nov 13, 2020

On Ubuntu, disabling OpenMP support for libsox seems to resolve the issue.
#1026

mthrok added a commit to mthrok/audio that referenced this issue Nov 13, 2020
mthrok added a commit to mthrok/audio that referenced this issue Nov 13, 2020
@cpuhrsch
Contributor

cpuhrsch commented Nov 13, 2020

As a first step, you could try OMP_NUM_THREADS=1 and see if the segfault still occurs. This disables OpenMP parallelization, which is especially important in a multiprocessing environment. If the segfault is caused by sox itself, it should also be reproducible outside of a multiprocessing environment.
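One way to apply this suggestion, as a sketch: set the variable before any OpenMP-backed library is imported (equivalently, run `OMP_NUM_THREADS=1 pytest ...` from the shell). The commented imports stand in for whatever libraries your process loads:

```python
import os

# OpenMP reads OMP_NUM_THREADS when its runtime initializes, so the variable
# must be set before torch / torchaudio are first imported in this process.
os.environ["OMP_NUM_THREADS"] = "1"

# import torch        # only import OpenMP-backed libraries after this point
# import torchaudio
```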

@cpuhrsch
Contributor

Another thing we should verify is that torchaudio's sox uses PyTorch's OpenMP (PyTorch statically links OpenMP and ships with it). Or maybe we decide to disable OpenMP for sox entirely, but let's do some perf investigation before we do that.

mthrok added a commit to mthrok/audio that referenced this issue Nov 13, 2020
@mthrok
Collaborator

mthrok commented Nov 13, 2020

As a first step you could try OMP_NUM_THREADS=1 and see if this still causes the segfault.

@cpuhrsch I tested with OMP_NUM_THREADS=1, but it still causes a segmentation fault. What do you think this implies?

mthrok added a commit to mthrok/audio that referenced this issue Nov 13, 2020
mthrok added a commit to mthrok/audio that referenced this issue Nov 13, 2020
mthrok added a commit to mthrok/audio that referenced this issue Nov 13, 2020
mthrok added a commit to mthrok/audio that referenced this issue Nov 13, 2020
mthrok added a commit to mthrok/audio that referenced this issue Nov 13, 2020
@mthrok
Collaborator

mthrok commented Nov 16, 2020

I talked with @malfet, and it is most likely that Intel's OpenMP and GNU OpenMP are conflicting.

@mthrok
Collaborator

mthrok commented Nov 20, 2020

@pzelasko I have disabled OpenMP support for libsox, which is the suspected cause. It will be available in tomorrow's nightly, and we are adding the change to the 0.7.1 minor release, expected in early December. If you have time, can you try the nightly and see if things work without using "spawn"?

@mthrok
Collaborator

mthrok commented Dec 19, 2020

@pzelasko Can we close the issue? I believe it now works fine with the "fork" method too.

@pzelasko
Author

pzelasko commented Dec 19, 2020 via email
