CCL_LOG_LEVEL=info CCL_WORKER_COUNT=2 mpirun -np 2 examples/cpu/cpu_allreduce_test
2024:01:16-14:50:40:(576436) |CCL_INFO| process launcher: hydra, local_proc_idx: 1, local_proc_count: 2
2024:01:16-14:50:40:(576435) |CCL_INFO| process launcher: hydra, local_proc_idx: 0, local_proc_count: 2
2024:01:16-14:50:40:(576436) |CCL_INFO| OS info: { Linux machine 5.15.0-79-generic #86-Ubuntu SMP Mon Jul 10 16:07:21 UTC 2023 x86_64 }
2024:01:16-14:50:40:(576435) |CCL_INFO| OS info: { Linux machine 5.15.0-79-generic #86-Ubuntu SMP Mon Jul 10 16:07:21 UTC 2023 x86_64 }
[0] MPI startup(): Intel(R) MPI Library, Version 2021.11 Build 20231005 (id: 74c4a23)
[0] MPI startup(): Copyright (C) 2003-2023 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.18.1-impi
[0] MPI startup(): libfabric provider: psm3
[0] MPI startup(): File "" not found
[0] MPI startup(): Load tuning file: "/localdata/piotrc/oneCCL/build/_install/opt/mpi/etc/tuning_generic_ofi.dat"
[0] MPI startup(): ===== Nic pinning on machine =====
[0] MPI startup(): Rank Pin nic
[0] MPI startup(): 0 rocep161s0f1
[0] MPI startup(): 1 rocep161s0f1
[0] MPI startup(): THREAD_SPLIT mode is switched on, 2 endpoints in use
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 576435 machine {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239}
[0] MPI startup(): 1 576436 machine {48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255}
2024:01:16-14:50:41:(576435) |CCL_INFO| atl-mpi:
{
  is_external_init: 1
  mpi_lib_attr.type: impi
  mpi_lib_attr.hmem: 1
  extra_ep: 0
  mnic_type: none
  progress_mode: 1
  sync_coll: 0
}
2024:01:16-14:50:41:(576435) |CCL_INFO| atl attrs:
{
  in: { shm: 0, hmem: 0, sync_coll: 0, extra_ep: 0, ep_count: 2, mnic_type: none, mnic_count: 2, mnic_offset: none }
  out: { shm: 0, hmem: 0, mnic_type: none, mnic_count: 1, tag_bits: 32, max_tag: 1073741823 }
}
2024:01:16-14:50:41:(576435) |CCL_INFO| start workers for local process [0:2]
2024:01:16-14:50:41:(576436) |CCL_INFO| start workers for local process [1:2]
2024:01:16-14:50:41:(576435) |CCL_INFO| library version: Gold-2021.11.2 2024-01-16T 11:37:55Z (master/8d18c7b)
2024:01:16-14:50:41:(576435) |CCL_INFO| specification version: 1.0
2024:01:16-14:50:41:(576435) |CCL_INFO| build mode: release
2024:01:16-14:50:41:(576435) |CCL_INFO| C compiler: GNU 11.4.0
2024:01:16-14:50:41:(576435) |CCL_INFO| C++ compiler: GNU 11.4.0
2024:01:16-14:50:41:(576436) |CCL_INFO| local process [1:2]: worker: 0, cpu: 254, numa: 7
2024:01:16-14:50:41:(576436) |CCL_INFO| local process [1:2]: worker: 1, cpu: 255, numa: 7
2024:01:16-14:50:41:(576435) |CCL_INFO| hwloc initialized: 1
{
  membind_thread_supported: 1
  numa: {os_idx: 0, memory: 64049 MB, cores: 16, cpus: 32, membind: 1}
  numa: {os_idx: 1, memory: 64503 MB, cores: 16, cpus: 32, membind: 1}
  numa: {os_idx: 2, memory: 64503 MB, cores: 16, cpus: 32, membind: 1}
  numa: {os_idx: 3, memory: 64491 MB, cores: 16, cpus: 32, membind: 1}
  numa: {os_idx: 4, memory: 64455 MB, cores: 16, cpus: 32, membind: 1}
  numa: {os_idx: 5, memory: 64503 MB, cores: 16, cpus: 32, membind: 1}
  numa: {os_idx: 6, memory: 64503 MB, cores: 16, cpus: 32, membind: 1}
  numa: {os_idx: 7, memory: 64491 MB, cores: 16, cpus: 32, membind: 1}
}
2024:01:16-14:50:41:(576435) |CCL_INFO| local process [0:2]: worker: 0, cpu: 238, numa: 6
2024:01:16-14:50:41:(576435) |CCL_INFO| local process [0:2]: worker: 1, cpu: 239, numa: 6
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_WORKER_COUNT: 2
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_WORKER_OFFLOAD: 1
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_WORKER_WAIT: 1
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_LOG_LEVEL: info
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ABORT_ON_THROW: 0
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_QUEUE_DUMP: 0
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_SCHED_DUMP: 0
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_SCHED_PROFILE: 0
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ENTRY_MAX_UPDATE_TIME_SEC:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_FRAMEWORK: none
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ATL_TRANSPORT: mpi
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ATL_SHM: 0
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ATL_RMA: 0
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ATL_HMEM: 0
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ATL_SEND_PROXY: none
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ATL_CACHE: 1
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_MNIC: none
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_MNIC_NAME:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_MNIC_COUNT: 2
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_MNIC_OFFSET: none
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALGO_FALLBACK: 1
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLGATHERV:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLREDUCE:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLTOALL:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLTOALLV:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_BARRIER:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_BCAST:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_RECV:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_REDUCE:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_REDUCE_SCATTER:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_SEND:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLGATHERV:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLTOALL:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLTOALLV:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_REDUCE:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLGATHERV_SCALEOUT:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLREDUCE_SCALEOUT:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLTOALL_SCALEOUT:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLTOALLV_SCALEOUT:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_REDUCE_SCALEOUT:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_UNORDERED_COLL: 0
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_FUSION: 0
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_FUSION_BYTES_THRESHOLD: 16384
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_FUSION_COUNT_THRESHOLD: 256
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_FUSION_CHECK_URGENT: 1
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_FUSION_CYCLE_MS: 0.2
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_PRIORITY: none
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_SPIN_COUNT: 1000
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_YIELD: pause
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_MAX_SHORT_SIZE: 0
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_BCAST_PART_COUNT:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_CACHE_KEY: match_id
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_CACHE_FLUSH: 0
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_BUFFER_CACHE: 1
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_STRICT_ORDER: 0
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_STAGING_BUFFER: regular
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_OP_SYNC: 0
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_CHUNK_COUNT: 1
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_MIN_CHUNK_SIZE: 65536
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_RS_CHUNK_COUNT: 1
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_RS_MIN_CHUNK_SIZE: 65536
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLREDUCE_NREDUCE_BUFFERING: 0
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLREDUCE_NREDUCE_SEGMENT_SIZE:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLREDUCE_2D_CHUNK_COUNT: 1
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLREDUCE_2D_MIN_CHUNK_SIZE: 65536
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLREDUCE_2D_SWITCH_DIMS: 0
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ALLTOALL_SCATTER_MAX_OPS:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_BACKEND: native
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_LOCAL_RANK:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_LOCAL_SIZE:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_PROCESS_LAUNCHER: hydra
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_MPI_LIBRARY_PATH:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_OFI_LIBRARY_PATH:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_PMIX_LIBRARY_PATH:
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ITT_LEVEL: 0
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_BF16: scalar
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_FP16: f16c
2024:01:16-14:50:41:(576435) |CCL_INFO| CCL_ROOT: /localdata/piotrc/oneCCL/build/_install
2024:01:16-14:50:41:(576435) |CCL_INFO| I_MPI_ROOT: /localdata/piotrc/oneCCL/build/_install
2024:01:16-14:50:41:(576435) |CCL_INFO| FI_PROVIDER_PATH: /localdata/piotrc/oneCCL/build/_install/opt/mpi/libfabric/lib/prov:/usr/lib64/libfabric
2024:01:16-14:50:41:(576435) |CCL_INFO| FI_PROVIDER:
[1705416641.904551973] machine:rank1.cpu_allreduce_test: Reading from remote process' memory failed. Disabling CMA support
[1705416641.904557543] machine:rank1.cpu_allreduce_test: Reading from remote process' memory failed. Disabling CMA support
[1705416641.904569025] machine:rank0.cpu_allreduce_test: Reading from remote process' memory failed. Disabling CMA support
[1705416641.904569005] machine:rank0.cpu_allreduce_test: Reading from remote process' memory failed. Disabling CMA support
machine:rank1: Assertion failure at psm3/ptl_am/ptl.c:196: nbytes == req->req_data.recv_msglen
machine:rank1: Assertion failure at psm3/ptl_am/ptl.c:196: nbytes == req->req_data.recv_msglen
machine:rank0: Assertion failure at psm3/ptl_am/ptl.c:196: nbytes == req->req_data.recv_msglen
machine:rank0: Assertion failure at psm3/ptl_am/ptl.c:196: nbytes == req->req_data.recv_msglen

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 576435 RUNNING AT machine
= KILLED BY SIGNAL: 6 (Aborted)
===================================================================================

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 1 PID 576436 RUNNING AT machine
= KILLED BY SIGNAL: 6 (Aborted)
==================================================================================