@jgunthorpe
Member

Remove #ifdef-related code that serves no purpose in this tree.

Note: this is based on the copying branch; I will rebase after collecting acks.

For greater clarity and certainty

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Needed to use the list debug features.

Fixes: c299cfb ("ccan: Add list functionality")
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
This patch is not intended to change the copyright or license
situation of any of the original code.

Documentation is provided that identifies the various licenses we
have in the source tree.

The default license for new items after the 'Initial commit' merge is
corrected to match the majority license already in use. This is the
license we recommend all new code use. This corrects a mistake I made
in the 'Unified CMake build system' patch which selected the wrong
license file.

Also provide specific guidance on how the 18 different COPYING files
and per-file copyright headers are intended to be interpreted
within the merged tree.

Fixes: 8d4ebd8 ("Unified CMake build system")
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Each of these files is in a directory where every file contains
a copyright header. Our interpretation is that file level headers
supersede all other information, thus these COPYING files do not
apply.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
The cmake build system guarantees this header exists, so we do not need
the define or the test.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
The cmake build system guarantees that valgrind/memcheck.h is present.
If the system does not have it, or valgrind is disabled, then it is
replaced with a dummy header full of empty stubs.

Thus all the copy-and-paste boilerplate is consolidated into buildlib.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
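
To illustrate the "dummy header full of empty stubs" idea from the commit message above, here is a minimal sketch of what such a stand-in header might look like. The three macro names match client requests that the real valgrind/memcheck.h provides; everything else (the layout, the `#ifndef` guard, and the `fill_buffer()` caller) is an assumption made for this example and is not the contents of buildlib.

```c
/* Sketch of a stand-in for <valgrind/memcheck.h>; not the real buildlib file. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#ifndef VALGRIND_MAKE_MEM_DEFINED
/* Each stub ignores its arguments and evaluates to 0, so call sites
 * compile unchanged whether or not valgrind is installed. */
#define VALGRIND_MAKE_MEM_DEFINED(addr, len)   ((void)(addr), (void)(len), 0)
#define VALGRIND_MAKE_MEM_UNDEFINED(addr, len) ((void)(addr), (void)(len), 0)
#define VALGRIND_MAKE_MEM_NOACCESS(addr, len)  ((void)(addr), (void)(len), 0)
#endif

/* Hypothetical caller: fills a buffer and annotates it for memcheck.
 * No #ifdef is needed here because the header always exists. */
static size_t fill_buffer(unsigned char *buf, size_t len)
{
	assert(buf != NULL);
	memset(buf, 0xab, len);
	(void)VALGRIND_MAKE_MEM_DEFINED(buf, len);
	return len;
}
```

The point of the design is that the conditional logic lives in one place (the build system choosing which header to install) instead of being repeated at every call site.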
These were used to support building provider plugins against old
versions of libibverbs. Since libibverbs is now included together
with the provider, that is no longer possible or supported.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
These were used to support building provider plugins against old
versions of libibverbs. Since libibverbs is now included together
with the provider, that is no longer possible or supported.

The define is kept as it is in a public header.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
cmake now strictly requires GNU-style symbol version support in the
assembler and linker.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
The canonical place for these is infiniband/arch.h; nothing else
should declare them.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
All supported distros have this now.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
For a long time now endian.h has defined sane fixed-width conversion
macros, so let's just use them instead of rolling our own.

Also, htonll is defined in this source tree under infiniband/arch.h,
so all users of that macro can just use the header.

Someday we should also get rid of all the endless wrappers.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
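
As a small illustration of the fixed-width conversions mentioned in the commit message above, the sketch below uses htobe64() and be64toh() from glibc's <endian.h>. The wrapper names `to_wire64()` and `from_wire64()` are hypothetical, chosen only for this example; they stand in for the kind of hand-rolled helper the commit removes and are not taken from the tree.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* expose htobe64()/be64toh() under strict -std modes */
#endif
#include <assert.h>
#include <endian.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a host-order value as 8 big-endian bytes. */
static void to_wire64(uint64_t host, unsigned char out[8])
{
	uint64_t be = htobe64(host);

	assert(out != NULL);
	memcpy(out, &be, sizeof(be));
}

/* Hypothetical helper: read 8 big-endian bytes back into host order. */
static uint64_t from_wire64(const unsigned char in[8])
{
	uint64_t be;

	assert(in != NULL);
	memcpy(&be, in, sizeof(be));
	return be64toh(be);
}
```

Because the conversion functions are defined relative to host byte order, the byte layout produced by `to_wire64()` is the same on little-endian and big-endian machines.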
Unclear what this was ever for, but we don't support it.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Contributor

@shefty shefty left a comment

Changes look good for ibcm, rdmacm, and ibacm. All acked.

@dledford dledford merged commit 5753f94 into linux-rdma:master Oct 2, 2016
@jgunthorpe jgunthorpe deleted the dead-removal branch October 6, 2016 19:27
Hakon-Bugge added a commit to Hakon-Bugge/rdma-core that referenced this pull request Nov 1, 2018
acm_ep_insert_addr() attempts to zero out the tmp address buffer, but
the subsequent memcpy(), which uses the supplied addr_len as its length
argument, copies the full addr_len bytes, including whatever follows
the real address. As a result, the provider is called with an address
padded with arbitrary data.

This leads to a false mis-compare in the default provider's binary
tree lookup. Here is the stack trace and dump of the address buffer
from gdb (edited for brevity):

(gdb) where
 #0  acmp_compare_dest (dest1=0x18c46a8, dest2=0x18c5d70) at prov/acmp/src/acmp.c:289
 #1  tfind () from /lib64/libc.so.6
 #2  acmp_get_dest () at prov/acmp/src/acmp.c:336
 #3  acmp_acquire_dest () at prov/acmp/src/acmp.c:379
 #4  acmp_add_addr () at prov/acmp/src/acmp.c:2385
 #5  acm_ep_insert_addr (..., addr_len=addr_len@entry=64, ...) at src/acm.c:2044
 #6  acm_ep_insert_addr (..., addr_len=64, ...) at src/acm.c:1325
 #7  acm_add_ep_ip (ip_str=0x7ffeeda298e0 "192.168.200.200", ...) at src/acm.c:1326
 #8  acm_ipnl_handler () at src/acm.c:1453
 #9  acm_server () at src/acm.c:1884
 #10 main () at src/acm.c:3245

(gdb) x/20u dest1
0x18c46a8:  192 168     200     200     155     127     0       0
0x18c46b0:  95  184     77      105     155     127     0       0
0x18c46b8:  0   0       64      49
(gdb) x/20u dest2
0x18c5d70:  192 168     200     200     0       0       0       0
0x18c5d78:  0   0       0       0       0       0       0       0
0x18c5d80:  0   0       0       0

The fix is to use the real length of the address in the memcpy() in
acm_ep_insert_addr().

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Hakon-Bugge added a commit to Hakon-Bugge/rdma-core that referenced this pull request Nov 22, 2018
acm_ep_insert_addr() attempts to zero out the tmp address buffer, but
the subsequent memcpy(), which uses the supplied addr_len as its length
argument, copies the full addr_len bytes, including whatever follows
the real address. As a result, the provider is called with an address
padded with arbitrary data.

This leads to a false mis-compare in the default provider's binary
tree lookup. Here is the stack trace and dump of the address buffer
from gdb (edited for brevity):

(gdb) where
 #0  acmp_compare_dest (dest1=0x18c46a8, dest2=0x18c5d70) at prov/acmp/src/acmp.c:289
 #1  tfind () from /lib64/libc.so.6
 #2  acmp_get_dest () at prov/acmp/src/acmp.c:336
 #3  acmp_acquire_dest () at prov/acmp/src/acmp.c:379
 #4  acmp_add_addr () at prov/acmp/src/acmp.c:2385
 #5  acm_ep_insert_addr (..., addr_len=addr_len@entry=64, ...) at src/acm.c:2044
 #6  acm_ep_insert_addr (..., addr_len=64, ...) at src/acm.c:1325
 #7  acm_add_ep_ip (ip_str=0x7ffeeda298e0 "192.168.200.200", ...) at src/acm.c:1326
 #8  acm_ipnl_handler () at src/acm.c:1453
 #9  acm_server () at src/acm.c:1884
 #10 main () at src/acm.c:3245

(gdb) x/20u dest1
0x18c46a8:  192 168     200     200     155     127     0       0
0x18c46b0:  95  184     77      105     155     127     0       0
0x18c46b8:  0   0       64      49
(gdb) x/20u dest2
0x18c5d70:  192 168     200     200     0       0       0       0
0x18c5d78:  0   0       0       0       0       0       0       0
0x18c5d80:  0   0       0       0

The fix is to use the real length of the address in the memcpy() in
acm_ep_insert_addr(). This is derived from the addr_type. Hence, we
can re-factor and remove the addr_len from the call stack.

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Mark Haywood <mark.haywood@oracle.com>
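
The failure mode and the fix described in the commit message above can be sketched in isolation. The code below is not the ibacm source: `MAX_ADDR`, `enum addr_type`, `real_addr_len()`, and both insert functions are hypothetical names, and the per-type lengths are assumptions chosen to mirror the 4-byte IPv4 address and the 64-byte addr_len visible in the gdb output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ADDR 64 /* assumed size of the temporary address buffer */

enum addr_type { ADDR_IP4, ADDR_IP6 }; /* hypothetical address types */

/* Derive the number of meaningful bytes from the address type. */
static size_t real_addr_len(enum addr_type type)
{
	switch (type) {
	case ADDR_IP4:
		return 4;
	case ADDR_IP6:
		return 16;
	}
	return 0;
}

/* Buggy pattern: the buffer is zeroed first, but copying the
 * caller-supplied addr_len bytes drags in whatever follows the
 * address in src, so the zero padding is lost. */
static void insert_buggy(unsigned char dst[MAX_ADDR],
			 const unsigned char *src, size_t addr_len)
{
	assert(addr_len <= MAX_ADDR);
	memset(dst, 0, MAX_ADDR);
	memcpy(dst, src, addr_len);
}

/* Fixed pattern: copy only the bytes the address type defines, so the
 * zero padding survives and a byte-wise comparison is reliable. */
static void insert_fixed(unsigned char dst[MAX_ADDR],
			 const unsigned char *src, enum addr_type type)
{
	memset(dst, 0, MAX_ADDR);
	memcpy(dst, src, real_addr_len(type));
}
```

Given a source buffer whose first four bytes are 192.168.200.200 followed by stale data, `insert_buggy()` produces a key that does not compare equal to the zero-padded one under memcmp(), while `insert_fixed()` does. Deriving the length from the type also means the length no longer has to be passed down the call chain.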
Hakon-Bugge added a commit to Hakon-Bugge/rdma-core that referenced this pull request Nov 23, 2018
rosenbaumalex pushed a commit to rosenbaumalex/rdma-core that referenced this pull request Jan 7, 2019
jgunthorpe pushed a commit to jgunthorpe/rdma-plumbing that referenced this pull request Feb 19, 2019
aron-silverton pushed a commit to oracle/rdma-core that referenced this pull request Mar 27, 2019
Orabug: 29037270

(cherry picked from commit c73f5d7)
cherry-pick-repo=linux-rdma/rdma-core.git
unmodified-from-upstream: c73f5d7

Signed-off-by: Mark Haywood <mark.haywood@oracle.com>
Signed-off-by: Aron Silverton <aron.silverton@oracle.com>
aron-silverton pushed a commit to oracle/rdma-core that referenced this pull request Mar 27, 2019
aron-silverton pushed a commit to oracle/rdma-core that referenced this pull request Apr 9, 2019
Orabug: 29410510

Rebase from RDMA Core 19.2 -> 20.2.

(cherry picked from commit fc2e7b4b)
cherry-pick-repo=linux-git/RDMA/rdma-core.git
unmodified-from-upstream: fc2e7b4b

Signed-off-by: Mark Haywood <mark.haywood@oracle.com>
aron-silverton pushed a commit to oracle/rdma-core that referenced this pull request Nov 16, 2020
Orabug: 29037270

(cherry picked from commit c73f5d7)
cherry-pick-repo=github.com/linux-rdma/rdma-core.git
unmodified-from-upstream: c73f5d7

Signed-off-by: Mark Haywood <mark.haywood@oracle.com>
Acked-by: Aron Silverton <aron.silverton@oracle.com>

Orabug: 29410510

Rebase from RDMA Core 19.2 -> 20.2.

(cherry picked from commit 303f845)
cherry-pick-repo=linux-git.us.oracle.com/RDMA/rdma-core.git
unmodified-from-upstream: 303f845

Signed-off-by: Mark Haywood <mark.haywood@oracle.com>
Acked-by: Aron Silverton <aron.silverton@oracle.com>
shefty pushed a commit to shefty/rdma-core that referenced this pull request Nov 10, 2025
Subject: [PATCH] librdmacm: Fix rdma_resolve_addrinfo() deadlock in sync mode

Fix a deadlock in rdma_resolve_addrinfo() when it is run in sync
mode:
 (gdb) bt
 #0  futex_wait
 #1  __GI___lll_lock_wait
 #2  0x00007ffff7dae791 in lll_mutex_lock_optimized
 #3  ___pthread_mutex_lock
 #4  0x00007ffff7f9f018 in ucma_process_addrinfo_resolved
 #5  0x00007ffff7fa1447 in rdma_get_cm_event
 #6  0x00007ffff7fa1fef in ucma_complete
 #7  0x00007ffff7fa2f9c in resolve_ai_sa
 #8  0x00007ffff7fa36ab in __rdma_resolve_addrinfo
 #9  rdma_resolve_addrinfo
 #10 0x00000000004017b6 in start_cm_client_sync
 #11 0x00000000004018ee in main

Issue: 4582946
Fixes: 7b1a686 ("librdmacm: Provide interfaces to resolve IB services")
Change-Id: Ia724795a559bab6d965a35b8fd3e0f0096472a44
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
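
The backtrace above shows the calling thread blocked in pthread_mutex_lock() underneath rdma_resolve_addrinfo(). One common cause of that shape is a thread re-acquiring a non-recursive mutex it already holds; this is only an assumption here, since the commit message does not name the lock or the holder. The sketch below demonstrates that general pattern with an error-checking mutex, which reports EDEADLK instead of hanging. `relock_result()` is a hypothetical name and nothing in it is librdmacm code.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* expose PTHREAD_MUTEX_ERRORCHECK under strict -std modes */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Lock an error-checking mutex twice from the same thread and return
 * the result of the second attempt. With a default mutex the second
 * lock could block forever; PTHREAD_MUTEX_ERRORCHECK makes it fail
 * with EDEADLK instead, which is handy when hunting self-deadlocks. */
static int relock_result(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t mut;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	assert(ret == 0);
	ret = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(ret == 0);
	ret = pthread_mutex_init(&mut, &attr);
	assert(ret == 0);
	pthread_mutexattr_destroy(&attr);

	pthread_mutex_lock(&mut);       /* outer caller takes the lock */
	ret = pthread_mutex_lock(&mut); /* nested callee tries to take it again */

	pthread_mutex_unlock(&mut);
	pthread_mutex_destroy(&mut);
	return ret;
}
```

On older toolchains this may need to be built with `-pthread`.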
shefty pushed a commit to shefty/rdma-core that referenced this pull request Nov 11, 2025
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
shefty pushed a commit to shefty/rdma-core that referenced this pull request Nov 11, 2025
Fixes: 7b1a686 ("librdmacm: Provide interfaces to resolve IB services")

Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Sean Hefty <shefty@nvidia.com>
rleon pushed a commit that referenced this pull request Nov 12, 2025
Fix a deadlock in rdma_resolve_addrinfo() when it is run in sync
mode:
 (gdb) bt
 #0  futex_wait
 #1  __GI___lll_lock_wait
 #2  0x00007ffff7dae791 in lll_mutex_lock_optimized
 #3  ___pthread_mutex_lock
 #4  0x00007ffff7f9f018 in ucma_process_addrinfo_resolved
 #5  0x00007ffff7fa1447 in rdma_get_cm_event
 #6  0x00007ffff7fa1fef in ucma_complete
 #7  0x00007ffff7fa2f9c in resolve_ai_sa
 #8  0x00007ffff7fa36ab in __rdma_resolve_addrinfo
 #9  rdma_resolve_addrinfo
 #10 0x00000000004017b6 in start_cm_client_sync
 #11 0x00000000004018ee in main

Fixes: 7b1a686 ("librdmacm: Provide interfaces to resolve IB services")
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Sean Hefty <shefty@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
nmorey pushed a commit that referenced this pull request Nov 21, 2025
[ Upstream commit 7528827 ]

Fixes: 7b1a686 ("librdmacm: Provide interfaces to resolve IB services")
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Sean Hefty <shefty@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Nicolas Morey <nmorey@suse.com>
3 participants