
Segfault on M1 Macbook #243

Closed
1 task done
hmaarrfk opened this issue Jun 23, 2024 · 10 comments
Labels
bug Something isn't working

Comments

@hmaarrfk
Contributor

Solution to issue cannot be found in the documentation.

  • I checked the documentation.

Issue

mamba create --name pytorch python=3.10 pytorch numpy --channel conda-forge --override-channels
mamba activate pytorch
python -c "import numpy; import torch; torch.zeros((1024, 1024), dtype=torch.uint8)"

Installed packages

% mamba list
# packages in environment at /Users/mark/miniforge3/envs/pytorch:
#
# Name                    Version                   Build  Channel
bzip2                     1.0.8                h93a5062_5    conda-forge
ca-certificates           2024.6.2             hf0a4a13_0    conda-forge
filelock                  3.15.4             pyhd8ed1ab_0    conda-forge
fsspec                    2024.6.0           pyhff2d567_0    conda-forge
gmp                       6.3.0                h7bae524_2    conda-forge
gmpy2                     2.1.5           py310h3bc658a_1    conda-forge
jinja2                    3.1.4              pyhd8ed1ab_0    conda-forge
libabseil                 20240116.2      cxx17_hebf3989_0    conda-forge
libblas                   3.9.0           22_osxarm64_openblas    conda-forge
libcblas                  3.9.0           22_osxarm64_openblas    conda-forge
libcxx                    17.0.6               he7857fb_1    conda-forge
libffi                    3.4.2                h3422bc3_5    conda-forge
libgfortran               5.0.0           13_2_0_hd922786_3    conda-forge
libgfortran5              13.2.0               hf226fd6_3    conda-forge
liblapack                 3.9.0           22_osxarm64_openblas    conda-forge
libopenblas               0.3.27          openmp_h6c19121_0    conda-forge
libprotobuf               4.25.3               hbfab5d5_0    conda-forge
libsqlite                 3.46.0               hfb93653_0    conda-forge
libtorch                  2.3.1           cpu_generic_hf1facdc_0    conda-forge
libuv                     1.48.0               h93a5062_0    conda-forge
libzlib                   1.3.1                hfb2fe0b_1    conda-forge
llvm-openmp               18.1.8               hde57baf_0    conda-forge
markupsafe                2.1.5           py310hd125d64_0    conda-forge
mpc                       1.3.1                h91ba8db_0    conda-forge
mpfr                      4.2.1                h41d338b_1    conda-forge
mpmath                    1.3.0              pyhd8ed1ab_0    conda-forge
ncurses                   6.5                  hb89a1cb_0    conda-forge
networkx                  3.3                pyhd8ed1ab_1    conda-forge
nomkl                     1.0                  h5ca1d4c_0    conda-forge
numpy                     2.0.0           py310h52bbd9b_0    conda-forge
openssl                   3.3.1                hfb2fe0b_0    conda-forge
pip                       24.0               pyhd8ed1ab_0    conda-forge
python                    3.10.14         h2469fbe_0_cpython    conda-forge
python_abi                3.10                    4_cp310    conda-forge
pytorch                   2.3.1           cpu_generic_py310hb190f2a_0    conda-forge
readline                  8.2                  h92ec313_1    conda-forge
setuptools                70.1.0             pyhd8ed1ab_0    conda-forge
sleef                     3.5.1                h156473d_2    conda-forge
sympy                     1.12.1          pypyh2585a3b_103    conda-forge
tk                        8.6.13               h5083fa2_1    conda-forge
typing_extensions         4.12.2             pyha770c72_0    conda-forge
tzdata                    2024a                h0c530f3_0    conda-forge
wheel                     0.43.0             pyhd8ed1ab_1    conda-forge
xz                        5.2.6                h57fd34a_0    conda-forge

Environment info

% conda info

     active environment : pytorch
    active env location : /Users/mark/miniforge3/envs/pytorch
            shell level : 2
       user config file : /Users/mark/.condarc
 populated config files : /Users/mark/miniforge3/.condarc
                          /Users/mark/.condarc
          conda version : 24.5.0
    conda-build version : not installed
         python version : 3.10.14.final.0
                 solver : libmamba (default)
       virtual packages : __archspec=1=m1
                          __conda=24.5.0=0
                          __osx=14.5=0
                          __unix=0=0
       base environment : /Users/mark/miniforge3  (writable)
      conda av data dir : /Users/mark/miniforge3/etc/conda
  conda av metadata url : None
           channel URLs : https://conda.anaconda.org/conda-forge/osx-arm64
                          https://conda.anaconda.org/conda-forge/noarch
          package cache : /Users/mark/miniforge3/pkgs
                          /Users/mark/.conda/pkgs
       envs directories : /Users/mark/miniforge3/envs
                          /Users/mark/.conda/envs
               platform : osx-arm64
             user-agent : conda/24.5.0 requests/2.32.3 CPython/3.10.14 Darwin/23.5.0 OSX/14.5 solver/libmamba conda-libmamba-solver/24.1.0 libmambapy/1.5.8
                UID:GID : 503:20
             netrc file : None
           offline mode : False
@hmaarrfk added the bug label Jun 23, 2024
@hmaarrfk
Contributor Author

hmaarrfk commented Jun 23, 2024

Recreatable with:

  • Pytorch 2.3.1 + numpy 1.26.4
  • Pytorch 2.3.0 + numpy 1.26.4

Does not happen with

  • Pytorch 2.1.2 + numpy 1.26.4
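
For anyone bisecting this, a sketch of pinning one of the failing combinations (the environment name is mine; the channel flags mirror the command from the issue body):

mamba create --name pt-231 python=3.10 "pytorch=2.3.1" "numpy=1.26.4" --channel conda-forge --override-channels
mamba activate pt-231
python -c "import numpy; import torch; torch.zeros((1024, 1024), dtype=torch.uint8)"
# segfaults with this combination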

Cannot recreate with:

conda create --name pt python=3.10 
conda activate pt
# numpy 2.0 and pytorch 2.3.1 get installed
pip install torch numpy
python -c "import numpy; import torch; torch.zeros((1024, 1024), dtype=torch.uint8)"
# no segfault

Can recreate with:

conda create --name pt python=3.10 numpy 
conda activate pt
pip install torch
python -c "import numpy; import torch; torch.zeros((1024, 1024), dtype=torch.uint8)"
# Segfault...

In all these cases, importing numpy second (i.e. after torch) does not recreate the issue.
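
For reference, that reversed import order is just the same one-liner with the two imports swapped:

python -c "import torch; import numpy; torch.zeros((1024, 1024), dtype=torch.uint8)"
# no segfault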

@hmaarrfk
Contributor Author

And now with lldb:

% lldb -- python -c "import numpy; import torch; torch.zeros((1024, 1024), dtype=torch.uint8)"
(lldb) target create "python"
Current executable set to '/Users/mark/miniforge3/envs/pt/bin/python' (arm64).
(lldb) settings set -- target.run-args  "-c" "import numpy; import torch; torch.zeros((1024, 1024), dtype=torch.uint8)"
(lldb) run
Process 58336 launched: '/Users/mark/miniforge3/envs/pt/bin/python' (arm64)
Process 58336 stopped
* thread #2, stop reason = EXC_BAD_ACCESS (code=1, address=0x540)
    frame #0: 0x000000010086ef94 libomp.dylib`__kmp_suspend_initialize_thread + 32
libomp.dylib`:
->  0x10086ef94 <+32>: ldr    w8, [x0, #0x540]
    0x10086ef98 <+36>: nop    
    0x10086ef9c <+40>: ldr    w9, 0x1008a1308           ; _MergedGlobals + 8
    0x10086efa0 <+44>: add    w20, w9, #0x1
  thread #3, stop reason = EXC_BAD_ACCESS (code=1, address=0x540)
    frame #0: 0x000000010086ef94 libomp.dylib`__kmp_suspend_initialize_thread + 32
libomp.dylib`:
->  0x10086ef94 <+32>: ldr    w8, [x0, #0x540]
    0x10086ef98 <+36>: nop    
    0x10086ef9c <+40>: ldr    w9, 0x1008a1308           ; _MergedGlobals + 8
    0x10086efa0 <+44>: add    w20, w9, #0x1
  thread #4, stop reason = EXC_BAD_ACCESS (code=1, address=0x540)
    frame #0: 0x000000010086ef94 libomp.dylib`__kmp_suspend_initialize_thread + 32
libomp.dylib`:
->  0x10086ef94 <+32>: ldr    w8, [x0, #0x540]
    0x10086ef98 <+36>: nop    
    0x10086ef9c <+40>: ldr    w9, 0x1008a1308           ; _MergedGlobals + 8
    0x10086efa0 <+44>: add    w20, w9, #0x1
  thread #5, stop reason = EXC_BAD_ACCESS (code=1, address=0x540)
    frame #0: 0x000000010086ef94 libomp.dylib`__kmp_suspend_initialize_thread + 32
libomp.dylib`:
->  0x10086ef94 <+32>: ldr    w8, [x0, #0x540]
    0x10086ef98 <+36>: nop    
    0x10086ef9c <+40>: ldr    w9, 0x1008a1308           ; _MergedGlobals + 8
    0x10086efa0 <+44>: add    w20, w9, #0x1
  thread #8, stop reason = EXC_BAD_ACCESS (code=1, address=0x540)
    frame #0: 0x000000010086ef94 libomp.dylib`__kmp_suspend_initialize_thread + 32
libomp.dylib`:
->  0x10086ef94 <+32>: ldr    w8, [x0, #0x540]
    0x10086ef98 <+36>: nop    
    0x10086ef9c <+40>: ldr    w9, 0x1008a1308           ; _MergedGlobals + 8
    0x10086efa0 <+44>: add    w20, w9, #0x1
Target 0: (python) stopped.
(lldb) 
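
As an aside (standard lldb usage, not something stated in this thread), the per-thread backtraces can be dumped from the same prompt with:

(lldb) bt all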

@hmaarrfk
Contributor Author

hmaarrfk commented Jun 23, 2024

OMP_NUM_THREADS=1 python -c "import numpy; import torch; torch.zeros((1024, 1024), dtype=torch.uint8)"

seems to run fine.

but

OMP_NUM_THREADS=2 lldb -- python -c "import numpy; import torch; torch.zeros((1024, 1024), dtype=torch.uint8)"

recreates the segfault

backtrace:

* thread #4, stop reason = EXC_BAD_ACCESS (code=1, address=0x540)
  * frame #0: 0x000000010086ef94 libomp.dylib`__kmp_suspend_initialize_thread + 32
    frame #1: 0x000000010086faf8 libomp.dylib`void __kmp_suspend_64<false, true>(int, kmp_flag_64<false, true>*) + 72
    frame #2: 0x0000000108339520 libomp.dylib`kmp_flag_64<false, true>::wait(kmp_info*, int, void*) + 1880
    frame #3: 0x0000000108334560 libomp.dylib`__kmp_hyper_barrier_release(barrier_type, kmp_info*, int, int, int, void*) + 184
    frame #4: 0x00000001083380e8 libomp.dylib`__kmp_fork_barrier(int, int) + 628
    frame #5: 0x0000000108314e14 libomp.dylib`__kmp_launch_thread + 340
    frame #6: 0x000000010835300c libomp.dylib`__kmp_launch_worker(void*) + 280
    frame #7: 0x000000019ff6ef94 libsystem_pthread.dylib`_pthread_start + 136

* thread #2
  * frame #0: 0x000000019ff319ec libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x000000019ff6f55c libsystem_pthread.dylib`_pthread_cond_wait + 1228
    frame #2: 0x00000001001ac700 python`PyThread_acquire_lock_timed + 596
    frame #3: 0x000000010020f8ac python`acquire_timed + 312
    frame #4: 0x000000010020fb20 python`lock_PyThread_acquire_lock + 72
    frame #5: 0x0000000100065448 python`method_vectorcall_VARARGS_KEYWORDS + 488
    frame #6: 0x0000000100149540 python`call_function + 524
    frame #7: 0x0000000100144e38 python`_PyEval_EvalFrameDefault + 24900
    frame #8: 0x000000010013e364 python`_PyEval_Vector + 2036
    frame #9: 0x0000000100149540 python`call_function + 524
    frame #10: 0x0000000100144e38 python`_PyEval_EvalFrameDefault + 24900
    frame #11: 0x000000010013e364 python`_PyEval_Vector + 2036
    frame #12: 0x0000000100149540 python`call_function + 524
    frame #13: 0x0000000100144e38 python`_PyEval_EvalFrameDefault + 24900
    frame #14: 0x000000010013e364 python`_PyEval_Vector + 2036
    frame #15: 0x0000000100145658 python`_PyEval_EvalFrameDefault + 26980
    frame #16: 0x000000010013e364 python`_PyEval_Vector + 2036
    frame #17: 0x0000000100145658 python`_PyEval_EvalFrameDefault + 26980
    frame #18: 0x000000010013e364 python`_PyEval_Vector + 2036
    frame #19: 0x0000000100149540 python`call_function + 524
    frame #20: 0x0000000100144e38 python`_PyEval_EvalFrameDefault + 24900
    frame #21: 0x000000010013e364 python`_PyEval_Vector + 2036
    frame #22: 0x0000000100149540 python`call_function + 524
    frame #23: 0x0000000100144e38 python`_PyEval_EvalFrameDefault + 24900
    frame #24: 0x000000010013e364 python`_PyEval_Vector + 2036
    frame #25: 0x000000010005ad10 python`method_vectorcall + 344
    frame #26: 0x0000000100210830 python`thread_run + 180
    frame #27: 0x00000001001ac230 python`pythread_wrapper + 48
    frame #28: 0x000000019ff6ef94 libsystem_pthread.dylib`_pthread_start + 136

* thread #3
  * frame #0: 0x000000019ff319ec libsystem_kernel.dylib`__psynch_cvwait + 8
    frame #1: 0x000000019ff6f55c libsystem_pthread.dylib`_pthread_cond_wait + 1228
    frame #2: 0x00000001001ac700 python`PyThread_acquire_lock_timed + 596
    frame #3: 0x000000010abdf3f0 _queue.cpython-310-darwin.so`_queue_SimpleQueue_get_impl + 496
    frame #4: 0x000000010abdef5c _queue.cpython-310-darwin.so`_queue_SimpleQueue_get + 236
    frame #5: 0x00000001000ab37c python`cfunction_vectorcall_FASTCALL_KEYWORDS_METHOD + 140
    frame #6: 0x0000000100149540 python`call_function + 524
    frame #7: 0x0000000100145430 python`_PyEval_EvalFrameDefault + 26428
    frame #8: 0x000000010013e364 python`_PyEval_Vector + 2036
    frame #9: 0x0000000100145658 python`_PyEval_EvalFrameDefault + 26980
    frame #10: 0x000000010013e364 python`_PyEval_Vector + 2036
    frame #11: 0x0000000100149540 python`call_function + 524
    frame #12: 0x0000000100144e38 python`_PyEval_EvalFrameDefault + 24900
    frame #13: 0x000000010013e364 python`_PyEval_Vector + 2036
    frame #14: 0x0000000100149540 python`call_function + 524
    frame #15: 0x0000000100144e38 python`_PyEval_EvalFrameDefault + 24900
    frame #16: 0x000000010013e364 python`_PyEval_Vector + 2036
    frame #17: 0x000000010005ad10 python`method_vectorcall + 344
    frame #18: 0x0000000100210830 python`thread_run + 180
    frame #19: 0x00000001001ac230 python`pythread_wrapper + 48
    frame #20: 0x000000019ff6ef94 libsystem_pthread.dylib`_pthread_start + 136

* thread #1, queue = 'com.apple.main-thread'
  * frame #0: 0x000000014493ff34 libtorch_cpu.dylib`void c10::function_ref<void (char**, long long const*, long long, long long)>::callback_fn<at::native::DEFAULT::VectorizedLoop2d<at::native::(anonymous namespace)::fill_kernel(at::TensorIterator&, c10::Scalar const&)::$_2::operator()() const::'lambda'()::operator()() const::'lambda'(), at::native::(anonymous namespace)::fill_kernel(at::TensorIterator&, c10::Scalar const&)::$_2::operator()() const::'lambda'()::operator()() const::'lambda0'()>>(long, char**, long long const*, long long, long long) + 632
    frame #1: 0x0000000142351e7c libtorch_cpu.dylib`at::TensorIteratorBase::serial_for_each(c10::function_ref<void (char**, long long const*, long long, long long)>, at::Range) const + 364
    frame #2: 0x0000000142351fe8 libtorch_cpu.dylib`.omp_outlined. + 216
    frame #3: 0x0000000108371c4c libomp.dylib`__kmp_invoke_microtask + 156
    frame #4: 0x0000000108315e40 libomp.dylib`__kmp_invoke_task_func + 348
    frame #5: 0x0000000108311ac0 libomp.dylib`__kmp_fork_call + 7552
    frame #6: 0x0000000108304088 libomp.dylib`__kmpc_fork_call + 196
    frame #7: 0x0000000142351c28 libtorch_cpu.dylib`at::TensorIteratorBase::for_each(c10::function_ref<void (char**, long long const*, long long, long long)>, long long) + 432
    frame #8: 0x000000014493f190 libtorch_cpu.dylib`at::native::(anonymous namespace)::fill_kernel(at::TensorIterator&, c10::Scalar const&) + 252
    frame #9: 0x0000000142746fc0 libtorch_cpu.dylib`at::native::fill_out(at::Tensor&, c10::Scalar const&) + 764
    frame #10: 0x0000000142e56bc8 libtorch_cpu.dylib`at::_ops::fill__Scalar::call(at::Tensor&, c10::Scalar const&) + 272
    frame #11: 0x00000001427485c0 libtorch_cpu.dylib`at::native::zero_(at::Tensor&) + 676
    frame #12: 0x000000014332d958 libtorch_cpu.dylib`at::_ops::zero_::call(at::Tensor&) + 260
    frame #13: 0x0000000142a139bc libtorch_cpu.dylib`at::native::zeros_symint(c10::ArrayRef<c10::SymInt>, std::__1::optional<c10::ScalarType>, std::__1::optional<c10::Layout>, std::__1::optional<c10::Device>, std::__1::optional<bool>) + 676
    frame #14: 0x0000000142eb80d0 libtorch_cpu.dylib`at::_ops::zeros::redispatch(c10::DispatchKeySet, c10::ArrayRef<c10::SymInt>, std::__1::optional<c10::ScalarType>, std::__1::optional<c10::Layout>, std::__1::optional<c10::Device>, std::__1::optional<bool>) + 152
    frame #15: 0x0000000142eb7cb8 libtorch_cpu.dylib`at::_ops::zeros::call(c10::ArrayRef<c10::SymInt>, std::__1::optional<c10::ScalarType>, std::__1::optional<c10::Layout>, std::__1::optional<c10::Device>, std::__1::optional<bool>) + 296
    frame #16: 0x00000001098f0898 libtorch_python.dylib`torch::zeros_symint(c10::ArrayRef<c10::SymInt>, c10::TensorOptions) + 204
    frame #17: 0x00000001098a0ad0 libtorch_python.dylib`torch::autograd::THPVariable_zeros(_object*, _object*, _object*) + 2820
    frame #18: 0x00000001000aab3c python`cfunction_call + 80
    frame #19: 0x000000010005759c python`_PyObject_MakeTpCall + 612
    frame #20: 0x00000001001495d8 python`call_function + 676
    frame #21: 0x0000000100145430 python`_PyEval_EvalFrameDefault + 26428
    frame #22: 0x000000010013e364 python`_PyEval_Vector + 2036
    frame #23: 0x0000000100199398 python`run_mod + 216
    frame #24: 0x0000000100198e38 python`_PyRun_SimpleFileObject + 1260
    frame #25: 0x0000000100197e1c python`_PyRun_AnyFileObject + 240
    frame #26: 0x00000001001bc8f8 python`Py_RunMain + 2340
    frame #27: 0x00000001001bda54 python`pymain_main + 1180
    frame #28: 0x000000010000131c python`main + 56
    frame #29: 0x000000019fbe60e0 dyld`start + 2360
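
As a stopgap on the user side (my own sketch, not a fix proposed in this thread), the same OMP_NUM_THREADS=1 setting can be applied from inside Python before torch is imported, since libomp only reads the variable when the OpenMP runtime initializes:

python -c "import os; os.environ['OMP_NUM_THREADS'] = '1'; import numpy; import torch; torch.zeros((1024, 1024), dtype=torch.uint8)"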

@hmaarrfk
Contributor Author

I'm somewhat afraid of setting the package type to conda, given that their commit says it is needed for torch compile...

But I would rather avoid this disastrous failure mode for others.

I expect that many simply haven't been able to update to 2.3.0 due to ecosystem incompatibilities from the multiple ongoing migrations. I was able to by carefully picking and choosing packages, and some on my team reported the failure last week.

@hmaarrfk
Contributor Author

@conda-forge-admin please rerender

@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-webservice.

I just wanted to let you know that I started rerendering the recipe in #244.

@isuruf
Member

isuruf commented Jun 25, 2024

They seem to bundle libomp.dylib.

The following should fix this issue:

diff --git a/recipe/build_pytorch.sh b/recipe/build_pytorch.sh
index cd27be0..c6cb567 100644
--- a/recipe/build_pytorch.sh
+++ b/recipe/build_pytorch.sh
@@ -1,13 +1,9 @@
 set -x
-if [[ "$megabuild" == true ]]; then
-  source $RECIPE_DIR/build.sh
-  pushd $SP_DIR/torch
-  for f in bin/* lib/* share/* include/*; do
-    if [[ -e "$PREFIX/$f" ]]; then
-      rm -rf $f
-      ln -sf $PREFIX/$f $PWD/$f
-    fi
-  done
-else
-  $PREFIX/bin/python -m pip install torch-*.whl
-fi
+source $RECIPE_DIR/build.sh
+pushd $SP_DIR/torch
+for f in bin/* lib/* share/* include/*; do
+  if [[ -e "$PREFIX/$f" ]]; then
+    rm -rf $f
+    ln -sf $PREFIX/$f $PWD/$f
+  fi
+done
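
For what it's worth, a sketch of how one could confirm the duplicated OpenMP runtime on an affected machine (paths assume the miniforge env from the report and may differ):

ls "$CONDA_PREFIX/lib/python3.10/site-packages/torch/lib" | grep -i omp    # copy bundled with the wheel
ls "$CONDA_PREFIX/lib" | grep -i libomp                                    # conda-forge llvm-openmp copy
otool -L "$CONDA_PREFIX/lib/python3.10/site-packages/torch/lib/libtorch_cpu.dylib" | grep -i omp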

@hmaarrfk
Contributor Author

@conda-forge-admin please rerender

@hmaarrfk
Contributor Author

Thank you Isuru!
