Recursive Termdet is broken #541

Open
abouteiller opened this issue May 11, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@abouteiller
Contributor

Describe the bug

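Termination detection (termdet) crashes recursive runs: tests/testing_dpotrf segfaults in parsec_atomic_fetch_dec_int32, reached from parsec_termdet_local_termination_detected.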

To Reproduce

Steps to reproduce the behavior:

  1. Check out b958ae9f
  2. Check out parsec 9fc74b6
  3. Compile with the following options: ../dplasma/configure --disable-fortran --with-platform=macosx --enable-debug=paranoid\,noisier
  4. Run the test and see the error:
tests/testing_dpotrf -N 10000 -t 200 -z 100 -x -v
W@00000 /!\ DEBUG LEVEL WILL PROBABLY REDUCE THE PERFORMANCE OF THIS RUN /!\.
#+++++ cores detected       : 4
#+++++ nodes x cores + gpu  : 1 x 4 + 0 (4+0)
#+++++ thread mode          : THREAD_SERIALIZED
#+++++ P x Q                : 1 x 1 (1/1)
#+++++ M x N x K|NRHS       : 10000 x 10000 x 1
#+++++ MB x NB              : 200 x 200
#+++++ HMB x HNB            : 100 x 100
[aurelien16:88452] *** Process received signal ***
[aurelien16:88452] Signal: Segmentation fault: 11 (11)
[aurelien16:88452] Signal code: Address not mapped (1)
[aurelien16:88452] Failing at address: 0x300000008
[aurelien16:88452] [ 0] 0   libsystem_platform.dylib            0x00007ff8042f2dfd _sigtramp + 29
[aurelien16:88452] [ 1] 0   ???                                 0x0000600000189280 0x0 + 105553117876864
[aurelien16:88452] [ 2] 0   libparsec.4.0.0.dylib               0x00000001012c737a parsec_atomic_fetch_dec_int32 + 26
[aurelien16:88452] [ 3] 0   libparsec.4.0.0.dylib               0x00000001012c734c parsec_taskpool_termination_detected + 76
[aurelien16:88452] [ 4] 0   libparsec.4.0.0.dylib               0x00000001012ffecf parsec_termdet_local_termination_detected + 591
[aurelien16:88452] [ 5] 0   libparsec.4.0.0.dylib               0x00000001012ff185 parsec_termdet_local_taskpool_addto_nb_tasks + 1013
[aurelien16:88452] [ 6] 0   libparsec.4.0.0.dylib               0x00000001012d8174 parsec_release_task_to_mempool_update_nbtasks + 68
[aurelien16:88452] [ 7] 0   libdplasma.2.0.dylib                0x0000000103f74bae release_task_of_dtrsm_LUT_dtrsm + 126
[aurelien16:88452] [ 8] 0   libparsec.4.0.0.dylib               0x00000001012c7fbc __parsec_complete_execution + 204
[aurelien16:88452] [ 9] 0   libparsec.4.0.0.dylib               0x00000001012c8143 __parsec_task_progress + 323
[aurelien16:88452] [10] 0   libparsec.4.0.0.dylib               0x00000001012c86d4 __parsec_context_wait + 980
[aurelien16:88452] [11] 0   libparsec.4.0.0.dylib               0x00000001012a5c7e __parsec_thread_init + 1230
[aurelien16:88452] [12] 0   libsystem_pthread.dylib             0x00007ff8042dd4e1 _pthread_start + 125
[aurelien16:88452] [13] 0   libsystem_pthread.dylib             0x00007ff8042d8f6b thread_start + 15
[aurelien16:88452] *** End of error message ***
[1]    88452 segmentation fault  tests/testing_dpotrf -N 10000 -t 200 -z 100 -x -v
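
For triage, the trace above can be reduced to its libparsec call chain. The following is a minimal Python sketch, not part of PaRSEC or DPLASMA: the helper name `parse_frames` and the regular expression are hypothetical and assume the frame layout shown in this report (frame index, image index, image name, address, symbol, offset). The sample lines are frames 2 to 5 copied from the trace.

```python
import re

# Hypothetical pattern for frames laid out as in this report, for example:
# [host:pid] [ 2] 0   libparsec.4.0.0.dylib   0x00000001012c737a symbol + 26
FRAME_RE = re.compile(
    r"\[\s*(?P<index>\d+)\]\s+\d+\s+"   # frame index, then the image index column
    r"(?P<image>\S+)\s+"                # library or binary name
    r"0x[0-9a-fA-F]+\s+"                # address of the frame
    r"(?P<symbol>\S+)\s+\+\s+(?P<offset>\d+)\s*$"
)


def parse_frames(trace):
    """Return (index, image, symbol, offset) tuples for every frame line found."""
    frames = []
    for line in trace.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append(
                (int(m["index"]), m["image"], m["symbol"], int(m["offset"]))
            )
    return frames


# Frames 2 to 5, copied verbatim from the report.
TRACE = """\
[aurelien16:88452] [ 2] 0   libparsec.4.0.0.dylib               0x00000001012c737a parsec_atomic_fetch_dec_int32 + 26
[aurelien16:88452] [ 3] 0   libparsec.4.0.0.dylib               0x00000001012c734c parsec_taskpool_termination_detected + 76
[aurelien16:88452] [ 4] 0   libparsec.4.0.0.dylib               0x00000001012ffecf parsec_termdet_local_termination_detected + 591
[aurelien16:88452] [ 5] 0   libparsec.4.0.0.dylib               0x00000001012ff185 parsec_termdet_local_taskpool_addto_nb_tasks + 1013
"""

# Innermost frame first: the faulting function, then its callers.
chain = [
    symbol
    for _, image, symbol, _ in parse_frames(TRACE)
    if image.startswith("libparsec")
]
print(" <- ".join(chain))
```

Read innermost first, the chain shows the fault inside the atomic decrement called by parsec_taskpool_termination_detected, which the local termination detector invokes after a task-count update. The sketch only summarizes the trace; it does not identify the root cause.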
@abouteiller added the bug label on May 11, 2023
@QingleiCao
Contributor

There are some discussions about this recursive issue here.
