@jgunthorpe

Most of these commits are being posted to the mailing list for the first time, and they change how the package builds and where files are installed.

We set up the two standard CMake build types so that Release includes
NDEBUG and RelWithDebInfo does not (by default CMake sets it in both).

The recommendation is for packagers to use Release (by setting
-DCMAKE_BUILD_TYPE=Release) and for developers to use RelWithDebInfo
(the default).

This also replaces the default flags for Release with the RelWithDebInfo
flags (-O2 -g -DNDEBUG), which is what we consider suitable for packaging.
The CMake default of -O3 is not tested.

Note that all the packaging systems I looked at force NDEBUG into the
CFLAGS.
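
To illustrate what the build type changes for C code: NDEBUG controls whether assert() from <assert.h> is compiled in. A minimal sketch follows; ex_asserts_enabled() and ex_midpoint() are hypothetical helpers for illustration and are not part of this tree.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, for illustration only: reports whether this
 * translation unit was compiled with assertions active. */
static bool ex_asserts_enabled(void)
{
#ifdef NDEBUG
	return false;	/* NDEBUG defined: assert() expands to nothing */
#else
	return true;	/* NDEBUG not defined: assert() is checked at run time */
#endif
}

/* Returns the midpoint of a range. The precondition is only checked
 * when NDEBUG is not defined. */
static int ex_midpoint(int lo, int hi)
{
	assert(lo <= hi);
	return lo + (hi - lo) / 2;
}
```

Under the scheme described above, a Release build would skip the precondition check in ex_midpoint(), while a RelWithDebInfo build would keep it.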

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
We now recommend that this source be built with valgrind memcheck.h
present, so use it automatically if it is available. Users looking
to remove this tiny overhead can build with -DENABLE_VALGRIND=0.

Downstream packagers should ensure the build is done with valgrind
headers available.

NOTE: Fedora/CentOS have shipped with valgrind turned on in their
packaging, so for most users this is no change.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
These were preserved as part of the cmake transition, but no distributor
uses them and we don't need them internally, so time for them to go.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
We no longer recommend distributing static libraries; this
never worked sanely for libibverbs.

Use:
 cmake -DENABLE_STATIC=1

to restore the old behaviour.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
rsocket.7 had an errant text substitution that never worked. It is
a good idea to have the man pages use the correct paths, so let
us have cmake run them through substitution.

Any man page ending in '.in' will be substituted automatically.

Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
config.h is the only place we pass through cmake substitution,
so it is the only place that can define the various filesystem
paths.
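
As a rough sketch of that pattern: the generated config.h would carry the path macros, and C code builds file names from them by string-literal concatenation. EX_RUNDIR, EX_LOCK_FILE and ex_lock_path() are made-up names for illustration, and the fallback #define only stands in for the generated header.

```c
#include <assert.h>

/* Stand-in for a value that cmake would substitute into config.h.
 * EX_RUNDIR is a hypothetical name, not one used by this tree. */
#ifndef EX_RUNDIR
#define EX_RUNDIR "/var/run"
#endif

/* Adjacent string literals are concatenated at compile time, so the
 * directory is defined in exactly one place. */
#define EX_LOCK_FILE EX_RUNDIR "/example.lock"

static const char *ex_lock_path(void)
{
	/* Sanity check: the configured directory must be absolute. */
	assert(EX_LOCK_FILE[0] == '/');
	return EX_LOCK_FILE;
}
```

Because the directory appears only in the config.h stand-in, changing the install layout would not require touching the C code that uses the path.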

This patch handles the C code portions that use paths.

Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
This removes hardwired paths from the documentation and broadly
makes the documentation and scripts match what the C code is now
doing.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
'/var/tmp' is an inappropriate place for lock files of this nature;
they belong in /var/run. /var/lock does not seem suitable because
this lock is not against a basic device node.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
The Debian packaging has always used this path. Provide
official support for this configuration so Debian does not
rely on the absolute path in the .driver file, which breaks biarch.

Since there is no reason for the providers to be in the system
library search path (they export no symbols and have no soname),
make this the default configuration.

The old behaviour can be restored by using:

 cmake -DVERBS_PROVIDER_DIR=''

This continues to support out-of-tree drivers by searching both
the provider path and the system library path if an unqualified
name is given in the .driver file.
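
The lookup order described above can be sketched roughly as follows. This is not the libibverbs loader: ex_driver_candidates(), EX_PROVIDER_DIR, the lib<name>.so naming, and the rule that a name containing '/' counts as qualified are all assumptions for illustration. A real loader would pass each candidate to dlopen() in turn.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical provider directory; the real value comes from the build. */
#define EX_PROVIDER_DIR "/usr/lib/libibverbs"
#define EX_PATH_MAX 256

/*
 * Fill out[] with the file names to try, in order, for a name read from a
 * .driver file. Returns the number of candidates written (1 or 2).
 */
static int ex_driver_candidates(const char *name, char out[2][EX_PATH_MAX])
{
	assert(name != NULL && name[0] != '\0');

	/* Assumption: a name containing '/' is already qualified, use as is. */
	if (strchr(name, '/') != NULL) {
		snprintf(out[0], EX_PATH_MAX, "%s", name);
		return 1;
	}

	/* Unqualified: try the private provider directory first ... */
	snprintf(out[0], EX_PATH_MAX, "%s/lib%s.so", EX_PROVIDER_DIR, name);
	/* ... then a bare file name, which the dynamic loader resolves
	 * through the system library search path. */
	snprintf(out[1], EX_PATH_MAX, "lib%s.so", name);
	return 2;
}
```

Trying the provider directory first keeps in-tree providers out of the system search path, while the bare-name fallback still finds an out-of-tree driver installed there.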

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
This is the FHS defined place for non-user runnable helper
programs.

Debian forbids the use of /usr/libexec/ so we provide
substitution support to let cmake customize this.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
This is particularly important for the three shared libraries, and
we haven't been doing it right historically, perhaps due to libtool
braindamage.

The names of the shlibs are updated to:
  libibcm 1.0.11
  libibumad 3.1.11
  libibverbs 1.3.11
  librdmacm 1.1.11

The SONAME remains the same.

The overall package release is set to 11 due to libibumad having got up
to a .10 release.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Turns out this is not a mailing list.

Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
@dledford dledford merged commit 060b4c8 into linux-rdma:master Sep 29, 2016
Hakon-Bugge added a commit to Hakon-Bugge/rdma-core that referenced this pull request Nov 1, 2018
In acm_addr_lookup(), an address compare is performed. It compares
ACM_MAX_ADDRESS worth of bytes. However, the bytes exceeding the
actual address length, as given by addr_type, may contain arbitrary
data.

For example, acm_svr_select_src() copies only the valid bytes of an
IPv4 or IPv6 address, and acm_nl_to_addr_data() does the same.

Here is an example from debugging with gdb, slightly edited for brevity:

(gdb) where
 #0  acm_addr_lookup () at src/acm.c:419
 #1  acm_get_port_ep_address () at src/acm.c:829
 #2  acm_get_ep_address () at src/acm.c:848
 #3  acm_rm_ep_ip () at src/acm.c:1322
 #4  acm_ipnl_handler () at src/acm.c:1452
 #5  acm_server () at src/acm.c:1867
 #6  main () at src/acm.c:3228

(gdb) x/16u ep->addr_info[i].addr.info.addr
0x1da66a8:      192     168     200     200     0       0       0       0
0x1da66b0:      0       0       0       0       0       0       0       0

(gdb) x/16u addr
0x7ffd165ca9f8: 192     168     200     200     62      127     0       0
0x7ffd165caa00: 95      8       14      129     62      127     0       0

(gdb) p addr_type
$5 = 2 '\002'

Here addr_type is 2, which is ACM_ADDRESS_IP. The IPv4 addresses are
equal, but the compare reports a mismatch because the full
ACM_MAX_ADDRESS bytes are compared.

By introducing a helper function comparing names or addresses, the
actual length is used for addresses, and the functions
acm_mark_addr_invalid() and acm_addr_lookup() are greatly simplified.
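
A minimal sketch of such a length-aware compare, under stated assumptions: the names below (ex_addr_equal(), ex_addr_len(), the EX_* constants) are hypothetical stand-ins and not the helper added by this commit. The 64-byte maximum and the value 2 for the IP type follow the gdb output above; the other values are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the ACM constants. */
enum { EX_MAX_ADDRESS = 64 };
enum ex_addr_type { EX_ADDR_NAME = 1, EX_ADDR_IP = 2, EX_ADDR_IP6 = 3 };

/* Number of meaningful bytes for a binary address type, 0 otherwise. */
static size_t ex_addr_len(enum ex_addr_type type)
{
	switch (type) {
	case EX_ADDR_IP:
		return 4;
	case EX_ADDR_IP6:
		return 16;
	default:
		return 0;
	}
}

/* Returns nonzero when both buffers hold the same name or address. */
static int ex_addr_equal(const uint8_t *a, const uint8_t *b,
			 enum ex_addr_type type)
{
	/* Names are NUL-terminated strings, so compare them as strings. */
	if (type == EX_ADDR_NAME)
		return strncmp((const char *)a, (const char *)b,
			       EX_MAX_ADDRESS) == 0;

	/* Addresses: compare only the bytes the type actually defines. */
	size_t len = ex_addr_len(type);
	assert(len > 0 && len <= EX_MAX_ADDRESS);
	return memcmp(a, b, len) == 0;
}
```

With the two buffers from the gdb dump, a full-width memcmp() over 64 bytes differs at byte 4, while ex_addr_equal() with the IP type looks only at the first four bytes and reports a match.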

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Hakon-Bugge added a commit to Hakon-Bugge/rdma-core that referenced this pull request Nov 1, 2018
acm_ep_insert_addr() zeroes the tmp address buffer, but the subsequent
memcpy(), which uses the supplied addr_len as its length argument,
copies the whole buffer, trailing garbage included. The provider is
therefore called with an address padded with arbitrary data.

This leads to a false mis-compare in the default provider's binary
tree lookup. Here is the stack trace and dump of the address buffer
from gdb (edited for brevity):

(gdb) where
 #0  acmp_compare_dest (dest1=0x18c46a8, dest2=0x18c5d70) at prov/acmp/src/acmp.c:289
 #1  tfind () from /lib64/libc.so.6
 #2  acmp_get_dest () at prov/acmp/src/acmp.c:336
 #3  acmp_acquire_dest () at prov/acmp/src/acmp.c:379
 #4  acmp_add_addr () at prov/acmp/src/acmp.c:2385
 #5  acm_ep_insert_addr (..., addr_len=addr_len@entry=64, ...) at src/acm.c:2044
 #6  acm_ep_insert_addr (..., addr_len=64, ...) at src/acm.c:1325
 #7  acm_add_ep_ip (ip_str=0x7ffeeda298e0 "192.168.200.200", ...) at src/acm.c:1326
 #8  acm_ipnl_handler () at src/acm.c:1453
 #9  acm_server () at src/acm.c:1884
 #10 main () at src/acm.c:3245

(gdb) x/20u dest1
0x18c46a8:      192     168     200     200     155     127     0       0
0x18c46b0:      95      184     77      105     155     127     0       0
0x18c46b8:      0       0       64      49
(gdb) x/20u dest2
0x18c5d70:      192     168     200     200     0       0       0       0
0x18c5d78:      0       0       0       0       0       0       0       0
0x18c5d80:      0       0       0       0

The fix is to use the real length of the address in the memcpy() in
acm_ep_insert_addr().
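
The idea can be sketched as below. This is an illustration only: ex_copy_addr() and EX_TMP_ADDR_SIZE are hypothetical names, not code from acm.c. The 64-byte size follows the addr_len=64 seen in the backtrace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical size of the temporary address buffer. */
enum { EX_TMP_ADDR_SIZE = 64 };

/*
 * Copy exactly real_len bytes of the address into dst and leave the rest
 * of dst zeroed, so that a later full-width compare sees no stale data.
 */
static void ex_copy_addr(uint8_t dst[EX_TMP_ADDR_SIZE], const uint8_t *src,
			 size_t real_len)
{
	assert(real_len <= EX_TMP_ADDR_SIZE);
	memset(dst, 0, EX_TMP_ADDR_SIZE);
	/* Passing the full buffer size here instead of real_len would
	 * overwrite the zero padding with whatever follows the address. */
	memcpy(dst, src, real_len);
}
```

For an IPv4 address, real_len would be 4, which leaves bytes 4 to 63 of dst zero, matching the dest2 dump above.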

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Hakon-Bugge added a commit to Hakon-Bugge/rdma-core that referenced this pull request Nov 22, 2018
Hakon-Bugge added a commit to Hakon-Bugge/rdma-core that referenced this pull request Nov 22, 2018
The fix is to use the real length of the address in the memcpy() in
acm_ep_insert_addr(). This length is derived from addr_type, so we
can refactor and remove addr_len from the call stack.

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Mark Haywood <mark.haywood@oracle.com>
Hakon-Bugge added a commit to Hakon-Bugge/rdma-core that referenced this pull request Nov 23, 2018
v1 -> v2: Fixed Travis issue
Hakon-Bugge added a commit to Hakon-Bugge/rdma-core that referenced this pull request Nov 23, 2018
rosenbaumalex pushed a commit to rosenbaumalex/rdma-core that referenced this pull request Jan 7, 2019
rosenbaumalex pushed a commit to rosenbaumalex/rdma-core that referenced this pull request Jan 7, 2019
aron-silverton pushed a commit to oracle/rdma-core that referenced this pull request Mar 27, 2019
Orabug: 29037253

(cherry picked from commit c562033)
cherry-pick-repo=linux-rdma/rdma-core.git
unmodified-from-upstream: c562033

Signed-off-by: Mark Haywood <mark.haywood@oracle.com>
Signed-off-by: Aron Silverton <aron.silverton@oracle.com>
aron-silverton pushed a commit to oracle/rdma-core that referenced this pull request Mar 27, 2019
Orabug: 29037270

(cherry picked from commit c73f5d7)
cherry-pick-repo=linux-rdma/rdma-core.git
unmodified-from-upstream: c73f5d7

Signed-off-by: Mark Haywood <mark.haywood@oracle.com>
Signed-off-by: Aron Silverton <aron.silverton@oracle.com>
aron-silverton pushed a commit to oracle/rdma-core that referenced this pull request Mar 27, 2019
aron-silverton pushed a commit to oracle/rdma-core that referenced this pull request Mar 27, 2019
aron-silverton pushed a commit to oracle/rdma-core that referenced this pull request Apr 9, 2019
Orabug: 29410510

Rebase from RDMA Core 19.2 -> 20.2.

(cherry picked from commit bbd44792)
cherry-pick-repo=linux-git/RDMA/rdma-core.git
unmodified-from-upstream: bbd44792

Signed-off-by: Mark Haywood <mark.haywood@oracle.com>
aron-silverton pushed a commit to oracle/rdma-core that referenced this pull request Apr 9, 2019
Orabug: 29410510

Rebase from RDMA Core 19.2 -> 20.2.

(cherry picked from commit fc2e7b4b)
cherry-pick-repo=linux-git/RDMA/rdma-core.git
unmodified-from-upstream: fc2e7b4b

Signed-off-by: Mark Haywood <mark.haywood@oracle.com>
gal-pressman pushed a commit to amzn/rdma-core that referenced this pull request Aug 4, 2019
Zero `ibv_query_device_ex` and `ibv_modify_qp` command structs before
use. This eliminates valgrind warnings such as:

==18022== Syscall param write(buf) points to uninitialised byte(s)
==18022==    at 0x513A894: write (in /usr/lib64/libc-2.26.so)
==18022==    by 0x4E440C0: _execute_cmd_write (cmd_fallback.c:250)
==18022==    by 0x4E4126C: ibv_cmd_modify_qp (cmd.c:1191)
==18022==    by 0x7E9D7BD: efa_modify_qp (verbs.c:839)
==18022==    by 0x4E4C0E8: ibv_modify_qp@@IBVERBS_1.1 (verbs.c:620)
==18022==    by 0x401CA1: pp_init_ctx (ud_pingpong.c:402)
==18022==    by 0x401CA1: main (ud_pingpong.c:689)
==18022==  Address 0x1ffeffff08 is on thread 1's stack
==18022==  in frame #3, created by efa_modify_qp (verbs.c:834)

Signed-off-by: Firas Jahjah <firasj@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
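
The zeroing described in the commit message above can be illustrated with a minimal sketch. The struct `sketch_modify_cmd` and the function `sketch_prepare_cmd()` are hypothetical stand-ins, not the real libibverbs command layout or API. The point is only that every byte of the buffer, including reserved fields the caller never sets, is defined before it is handed to write().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command layout; NOT the real libibverbs structure. */
struct sketch_modify_cmd {
	uint32_t qp_handle;
	uint32_t attr_mask;
	uint8_t reserved[8];	/* never set explicitly by the caller */
};

/* Zero the whole command first, then fill in the fields that are used. */
static void sketch_prepare_cmd(struct sketch_modify_cmd *cmd,
			       uint32_t qp_handle, uint32_t attr_mask)
{
	assert(cmd != NULL);
	memset(cmd, 0, sizeof(*cmd));
	cmd->qp_handle = qp_handle;
	cmd->attr_mask = attr_mask;
}
```

Without the memset(), a stack-allocated command would carry whatever bytes were left in `reserved`, which is the kind of access memcheck reports as "points to uninitialised byte(s)".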
yuvalshaia pushed a commit to yuvalshaia/rdma-core that referenced this pull request Sep 3, 2019
Zero `ibv_query_device_ex` and `ibv_modify_qp` command structs before
use. This eliminates valgrind warnings such as:

==18022== Syscall param write(buf) points to uninitialised byte(s)
==18022==    at 0x513A894: write (in /usr/lib64/libc-2.26.so)
==18022==    by 0x4E440C0: _execute_cmd_write (cmd_fallback.c:250)
==18022==    by 0x4E4126C: ibv_cmd_modify_qp (cmd.c:1191)
==18022==    by 0x7E9D7BD: efa_modify_qp (verbs.c:839)
==18022==    by 0x4E4C0E8: ibv_modify_qp@@IBVERBS_1.1 (verbs.c:620)
==18022==    by 0x401CA1: pp_init_ctx (ud_pingpong.c:402)
==18022==    by 0x401CA1: main (ud_pingpong.c:689)
==18022==  Address 0x1ffeffff08 is on thread 1's stack
==18022==  in frame #3, created by efa_modify_qp (verbs.c:834)

Signed-off-by: Firas Jahjah <firasj@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
aron-silverton pushed a commit to oracle/rdma-core that referenced this pull request Nov 16, 2020
In acm_addr_lookup(), an address compare is performed. It compares
ACM_MAX_ADDRESS worth of bytes. However, the bytes exceeding the
actual address length, as given by addr_type, may contain arbitrary
data.

For example, acm_svr_select_src() copies only the valid bytes of an
IPv4 or IPv6 address, and acm_nl_to_addr_data() does the same.

Here is an example from debugging with gdb, slightly edited for brevity:

(gdb) where
 #0  acm_addr_lookup () at src/acm.c:419
 #1  acm_get_port_ep_address () at src/acm.c:829
 #2  acm_get_ep_address () at src/acm.c:848
 #3  acm_rm_ep_ip () at src/acm.c:1322
 #4  acm_ipnl_handler () at src/acm.c:1452
 #5  acm_server () at src/acm.c:1867
 #6  main () at src/acm.c:3228

(gdb) x/16u ep->addr_info[i].addr.info.addr
0x1da66a8:  192 168     200     200     0       0       0       0
0x1da66b0:  0   0       0       0       0       0       0       0

(gdb) x/16u addr
0x7ffd165ca9f8: 192     168     200     200     62      127     0       0
0x7ffd165caa00: 95      8       14      129     62      127     0       0

(gdb) p addr_type
$5 = 2 '\002'

Here addr_type is 2, which is ACM_ADDRESS_IP. The IPv4 addresses are
equal, but the compare reports a mismatch because the full
ACM_MAX_ADDRESS bytes are compared.

By introducing a helper function comparing names or addresses, the
actual length is used for addresses, and the functions
acm_mark_addr_invalid() and acm_addr_lookup() are greatly simplified.

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>

---

v1 -> v2: Fixed Travis issue

Orabug: 29037253

(cherry picked from commit c562033)
cherry-pick-repo=github.com/linux-rdma/rdma-core.git
unmodified-from-upstream: c562033

Signed-off-by: Mark Haywood <mark.haywood@oracle.com>
Acked-by: Aron Silverton <aron.silverton@oracle.com>

Orabug: 29410510

Rebase from RDMA Core 19.2 -> 20.2.

(cherry picked from commit 8763162)
cherry-pick-repo=linux-git.us.oracle.com/RDMA/rdma-core.git
unmodified-from-upstream: 8763162

Signed-off-by: Mark Haywood <mark.haywood@oracle.com>
Acked-by: Aron Silverton <aron.silverton@oracle.com>
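
The length-aware compare described in the acm_addr_lookup() commit message above might look roughly like the sketch below. All `sketch_*` and `SKETCH_*` names are hypothetical stand-ins for the real ACM identifiers. Only the value 2 for the IPv4 address type and the 64-byte buffer size are taken from the gdb output; the IPv6 type value is an assumption. The sketch covers addresses only and leaves out the name case that the commit message also mentions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the ACM constants. */
enum {
	SKETCH_ADDR_IP = 2,		/* value seen in the gdb session */
	SKETCH_ADDR_IP6 = 3,		/* assumed value */
	SKETCH_MAX_ADDRESS = 64,	/* buffer size seen as addr_len=64 */
};

/* Number of meaningful bytes for a given address type. */
static size_t sketch_addr_len(uint8_t addr_type)
{
	switch (addr_type) {
	case SKETCH_ADDR_IP:
		return 4;
	case SKETCH_ADDR_IP6:
		return 16;
	default:
		return SKETCH_MAX_ADDRESS;
	}
}

/* Compare only the bytes that belong to the address; returns 1 if equal. */
static int sketch_addr_equal(const uint8_t *a, const uint8_t *b,
			     uint8_t addr_type)
{
	size_t len = sketch_addr_len(addr_type);

	assert(len <= SKETCH_MAX_ADDRESS);
	return memcmp(a, b, len) == 0;
}
```

With the two buffers from the gdb dump, a 4-byte compare matches on 192.168.200.200, whereas a compare over the full buffer fails on the trailing bytes.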
aron-silverton pushed a commit to oracle/rdma-core that referenced this pull request Nov 16, 2020
In acm_ep_insert_addr(), the tmp address buffer is zeroed out, but the
subsequent memcpy(), which uses the supplied addr_len as its length
argument, copies the full addr_len bytes from the source. As a result,
the provider is called with an address padded with arbitrary data.

This leads to a false mismatch in the default provider's binary
tree lookup. Here is the stack trace and a dump of the address buffers
from gdb (edited for brevity):

(gdb) where
 #0  acmp_compare_dest (dest1=0x18c46a8, dest2=0x18c5d70) at prov/acmp/src/acmp.c:289
 #1  tfind () from /lib64/libc.so.6
 #2  acmp_get_dest () at prov/acmp/src/acmp.c:336
 #3  acmp_acquire_dest () at prov/acmp/src/acmp.c:379
 #4  acmp_add_addr () at prov/acmp/src/acmp.c:2385
 #5  acm_ep_insert_addr (..., addr_len=addr_len@entry=64, ...) at src/acm.c:2044
 #6  acm_ep_insert_addr (..., addr_len=64, ...) at src/acm.c:1325
 #7  acm_add_ep_ip (ip_str=0x7ffeeda298e0 "192.168.200.200", ...) at src/acm.c:1326
 #8  acm_ipnl_handler () at src/acm.c:1453
 #9  acm_server () at src/acm.c:1884
 #10 main () at src/acm.c:3245

(gdb) x/20u dest1
0x18c46a8:  192 168     200     200     155     127     0       0
0x18c46b0:  95  184     77      105     155     127     0       0
0x18c46b8:  0   0       64      49
(gdb) x/20u dest2
0x18c5d70:  192 168     200     200     0       0       0       0
0x18c5d78:  0   0       0       0       0       0       0       0
0x18c5d80:  0   0       0       0

The fix is to use the real length of the address in the memcpy() in
acm_ep_insert_addr(). This is derived from the addr_type. Hence, we
can re-factor and remove the addr_len from the call stack.

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Mark Haywood <mark.haywood@oracle.com>

Orabug: 29037270

(cherry picked from commit c73f5d7)
cherry-pick-repo=github.com/linux-rdma/rdma-core.git
unmodified-from-upstream: c73f5d7

Signed-off-by: Mark Haywood <mark.haywood@oracle.com>
Acked-by: Aron Silverton <aron.silverton@oracle.com>

Orabug: 29410510

Rebase from RDMA Core 19.2 -> 20.2.

(cherry picked from commit 303f845)
cherry-pick-repo=linux-git.us.oracle.com/RDMA/rdma-core.git
unmodified-from-upstream: 303f845

Signed-off-by: Mark Haywood <mark.haywood@oracle.com>
Acked-by: Aron Silverton <aron.silverton@oracle.com>
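
The memcpy() fix described in the acm_ep_insert_addr() commit message above can be sketched as follows. The `sketch_*` and `SKETCH_*` names are hypothetical, the IPv6 type value is an assumption, and this is not the actual rdma-core code. The idea shown is to zero the destination and then copy only the bytes implied by the address type, so the padding handed to the provider is always zero.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the ACM constants. */
enum {
	SKETCH_ADDR_IP = 2,		/* value seen in the gdb session */
	SKETCH_ADDR_IP6 = 3,		/* assumed value */
	SKETCH_MAX_ADDRESS = 64,	/* buffer size seen as addr_len=64 */
};

/* Real length of an address, derived from its type. */
static size_t sketch_addr_len(uint8_t addr_type)
{
	switch (addr_type) {
	case SKETCH_ADDR_IP:
		return 4;
	case SKETCH_ADDR_IP6:
		return 16;
	default:
		return SKETCH_MAX_ADDRESS;
	}
}

/* Zero dst, then copy only the valid address bytes from src. */
static void sketch_copy_addr(uint8_t dst[SKETCH_MAX_ADDRESS],
			     const uint8_t *src, uint8_t addr_type)
{
	size_t len = sketch_addr_len(addr_type);

	assert(len <= SKETCH_MAX_ADDRESS);
	memset(dst, 0, SKETCH_MAX_ADDRESS);
	memcpy(dst, src, len);
}
```

Because the length comes from the address type, a caller of such a helper has no reason to pass addr_len separately, which matches the refactoring the commit message describes.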
nmorey pushed a commit to nmorey/rdma-core that referenced this pull request Apr 28, 2021
Zero `ibv_query_device_ex` and `ibv_modify_qp` command structs before
use. This eliminates valgrind warnings such as:

==18022== Syscall param write(buf) points to uninitialised byte(s)
==18022==    at 0x513A894: write (in /usr/lib64/libc-2.26.so)
==18022==    by 0x4E440C0: _execute_cmd_write (cmd_fallback.c:250)
==18022==    by 0x4E4126C: ibv_cmd_modify_qp (cmd.c:1191)
==18022==    by 0x7E9D7BD: efa_modify_qp (verbs.c:839)
==18022==    by 0x4E4C0E8: ibv_modify_qp@@IBVERBS_1.1 (verbs.c:620)
==18022==    by 0x401CA1: pp_init_ctx (ud_pingpong.c:402)
==18022==    by 0x401CA1: main (ud_pingpong.c:689)
==18022==  Address 0x1ffeffff08 is on thread 1's stack
==18022==  in frame #3, created by efa_modify_qp (verbs.c:834)

Signed-off-by: Firas Jahjah <firasj@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
shefty pushed a commit to shefty/rdma-core that referenced this pull request Nov 10, 2025
Subject: [PATCH] librdmacm: Fix rdma_resolve_addrinfo() deadlock in sync mode

Fix a deadlock in rdma_resolve_addrinfo() when it is run in sync
mode:
 (gdb) bt
 #0  futex_wait
 #1  __GI___lll_lock_wait
 #2  0x00007ffff7dae791 in lll_mutex_lock_optimized
 #3  ___pthread_mutex_lock
 #4  0x00007ffff7f9f018 in ucma_process_addrinfo_resolved
 #5  0x00007ffff7fa1447 in rdma_get_cm_event
 #6  0x00007ffff7fa1fef in ucma_complete
 #7  0x00007ffff7fa2f9c in resolve_ai_sa
 #8  0x00007ffff7fa36ab in __rdma_resolve_addrinfo
 #9  rdma_resolve_addrinfo
 #10 0x00000000004017b6 in start_cm_client_sync
 #11 0x00000000004018ee in main

Issue: 4582946
Fixes: 7b1a686 ("librdmacm: Provide interfaces to resolve IB services")
Change-Id: Ia724795a559bab6d965a35b8fd3e0f0096472a44
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
shefty pushed a commit to shefty/rdma-core that referenced this pull request Nov 11, 2025
Fix a deadlock in rdma_resolve_addrinfo() when it is run in sync
mode:
 (gdb) bt
 #0  futex_wait
 #1  __GI___lll_lock_wait
 #2  0x00007ffff7dae791 in lll_mutex_lock_optimized
 #3  ___pthread_mutex_lock
 #4  0x00007ffff7f9f018 in ucma_process_addrinfo_resolved
 #5  0x00007ffff7fa1447 in rdma_get_cm_event
 #6  0x00007ffff7fa1fef in ucma_complete
 #7  0x00007ffff7fa2f9c in resolve_ai_sa
 #8  0x00007ffff7fa36ab in __rdma_resolve_addrinfo
 #9  rdma_resolve_addrinfo
 #10 0x00000000004017b6 in start_cm_client_sync
 #11 0x00000000004018ee in main

Signed-off-by: Mark Zhang <markzhang@nvidia.com>
shefty pushed a commit to shefty/rdma-core that referenced this pull request Nov 11, 2025
Fix a deadlock in rdma_resolve_addrinfo() when it is run in sync
mode:
 (gdb) bt
 #0  futex_wait
 #1  __GI___lll_lock_wait
 #2  0x00007ffff7dae791 in lll_mutex_lock_optimized
 #3  ___pthread_mutex_lock
 #4  0x00007ffff7f9f018 in ucma_process_addrinfo_resolved
 #5  0x00007ffff7fa1447 in rdma_get_cm_event
 #6  0x00007ffff7fa1fef in ucma_complete
 #7  0x00007ffff7fa2f9c in resolve_ai_sa
 #8  0x00007ffff7fa36ab in __rdma_resolve_addrinfo
 #9  rdma_resolve_addrinfo
 #10 0x00000000004017b6 in start_cm_client_sync
 #11 0x00000000004018ee in main

Fixes: 7b1a686 ("librdmacm: Provide interfaces to resolve IB services")

Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Sean Hefty <shefty@nvidia.com>
rleon pushed a commit that referenced this pull request Nov 12, 2025
Fix a deadlock in rdma_resolve_addrinfo() when it is run in sync
mode:
 (gdb) bt
 #0  futex_wait
 #1  __GI___lll_lock_wait
 #2  0x00007ffff7dae791 in lll_mutex_lock_optimized
 #3  ___pthread_mutex_lock
 #4  0x00007ffff7f9f018 in ucma_process_addrinfo_resolved
 #5  0x00007ffff7fa1447 in rdma_get_cm_event
 #6  0x00007ffff7fa1fef in ucma_complete
 #7  0x00007ffff7fa2f9c in resolve_ai_sa
 #8  0x00007ffff7fa36ab in __rdma_resolve_addrinfo
 #9  rdma_resolve_addrinfo
 #10 0x00000000004017b6 in start_cm_client_sync
 #11 0x00000000004018ee in main

Fixes: 7b1a686 ("librdmacm: Provide interfaces to resolve IB services")
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Sean Hefty <shefty@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
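
The backtrace in the commit message above shows a thread blocked in pthread_mutex_lock() while it is still inside rdma_resolve_addrinfo(). The commit message does not state the cause. One pattern consistent with such a trace, offered here only as an assumption, is a thread trying to take a non-recursive mutex it already holds. The sketch below is generic pthreads code, not librdmacm code, and `sketch_relock_result()` is a hypothetical name. It uses an error-checking mutex so that the second lock attempt returns an error instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Lock an error-checking mutex twice from the same thread and return the
 * result of the second attempt. With a default mutex the second call
 * could block forever, as in the futex_wait frame of a backtrace.
 */
static int sketch_relock_result(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t lock;
	int first, second;

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&lock, &attr);
	pthread_mutexattr_destroy(&attr);

	first = pthread_mutex_lock(&lock);
	assert(first == 0);

	/* Same thread, same mutex: reported as an error, not a hang. */
	second = pthread_mutex_lock(&lock);

	pthread_mutex_unlock(&lock);
	pthread_mutex_destroy(&lock);
	return second;
}
```

Switching a suspect mutex to the error-checking type while debugging is one way to confirm or rule out this pattern, since POSIX specifies EDEADLK for a relock by the owning thread.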
nmorey pushed a commit that referenced this pull request Nov 21, 2025
[ Upstream commit 7528827 ]

Fix a deadlock in rdma_resolve_addrinfo() when it is run in sync
mode:
 (gdb) bt
 #0  futex_wait
 #1  __GI___lll_lock_wait
 #2  0x00007ffff7dae791 in lll_mutex_lock_optimized
 #3  ___pthread_mutex_lock
 #4  0x00007ffff7f9f018 in ucma_process_addrinfo_resolved
 #5  0x00007ffff7fa1447 in rdma_get_cm_event
 #6  0x00007ffff7fa1fef in ucma_complete
 #7  0x00007ffff7fa2f9c in resolve_ai_sa
 #8  0x00007ffff7fa36ab in __rdma_resolve_addrinfo
 #9  rdma_resolve_addrinfo
 #10 0x00000000004017b6 in start_cm_client_sync
 #11 0x00000000004018ee in main

Fixes: 7b1a686 ("librdmacm: Provide interfaces to resolve IB services")
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Sean Hefty <shefty@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Nicolas Morey <nmorey@suse.com>