issue: errno=111 Connection refused #1049

Open
weizhoublue opened this issue Oct 10, 2023 · 0 comments

I ran a sockperf test following the doc https://docs.nvidia.com/networking/display/vmav952/running+vma

I have two hosts, each with a dual-port Mellanox CX5 NIC.

~# ethtool -i ens6f0np0
driver: mlx5_core
version: 23.07-0.5.1
firmware-version: 16.27.6008 (LNV0000000033)
expansion-rom-version:
bus-info: 0000:af:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

# show_gids
DEV	PORT	INDEX	GID					IPv4  		VER	DEV
---	----	-----	---					------------  	---	---
mlx5_0	1	0	fe80:0000:0000:0000:063f:72ff:fed0:cee6			v1	ens6f0np0
mlx5_0	1	1	fe80:0000:0000:0000:063f:72ff:fed0:cee6			v2	ens6f0np0
mlx5_0	1	2	0000:0000:0000:0000:0000:ffff:ac51:000a	172.81.0.10  	v1	ens6f0np0
mlx5_0	1	3	0000:0000:0000:0000:0000:ffff:ac51:000a	172.81.0.10  	v2	ens6f0np0
mlx5_0	1	4	fd00:0081:0000:0000:0172:0081:0000:0010			v1	ens6f0np0
mlx5_0	1	5	fd00:0081:0000:0000:0172:0081:0000:0010			v2	ens6f0np0
mlx5_1	1	0	fe80:0000:0000:0000:063f:72ff:fed0:cee7			v1	ens6f1np1
mlx5_1	1	1	fe80:0000:0000:0000:063f:72ff:fed0:cee7			v2	ens6f1np1
mlx5_1	1	10	fd00:0090:0000:0000:0000:0000:0000:0010			v1	ens6f1np1.90
mlx5_1	1	11	fd00:0090:0000:0000:0000:0000:0000:0010			v2	ens6f1np1.90
mlx5_1	1	2	0000:0000:0000:0000:0000:ffff:ac52:000a	172.82.0.10  	v1	ens6f1np1
mlx5_1	1	3	0000:0000:0000:0000:0000:ffff:ac52:000a	172.82.0.10  	v2	ens6f1np1
mlx5_1	1	4	fd00:0082:0000:0000:0172:0082:0000:0010			v1	ens6f1np1
mlx5_1	1	5	fd00:0082:0000:0000:0172:0082:0000:0010			v2	ens6f1np1
mlx5_1	1	6	fe80:0000:0000:0000:063f:72ff:fed0:cee7			v1	ens6f1np1.90
mlx5_1	1	7	fe80:0000:0000:0000:063f:72ff:fed0:cee7			v2	ens6f1np1.90
mlx5_1	1	8	0000:0000:0000:0000:0000:ffff:ac5a:000a	172.90.0.10  	v1	ens6f1np1.90
mlx5_1	1	9	0000:0000:0000:0000:0000:ffff:ac5a:000a	172.90.0.10  	v2	ens6f1np1.90
mlx5_2	1	0	fe80:0000:0000:0000:9888:49ff:fed9:428f			v1	ens6f0v0
mlx5_2	1	1	fe80:0000:0000:0000:9888:49ff:fed9:428f			v2	ens6f0v0
mlx5_3	1	0	fe80:0000:0000:0000:408d:07ff:feb3:0a9b			v1	ens6f0v1
mlx5_3	1	1	fe80:0000:0000:0000:408d:07ff:feb3:0a9b			v2	ens6f0v1
mlx5_4	1	0	fe80:0000:0000:0000:14ab:adff:fef9:16d7			v1	ens6f0v2
mlx5_4	1	1	fe80:0000:0000:0000:14ab:adff:fef9:16d7			v2	ens6f0v2
mlx5_5	1	0	fe80:0000:0000:0000:0891:f4ff:febc:46e2			v1	ens6f0v3
mlx5_5	1	1	fe80:0000:0000:0000:0891:f4ff:febc:46e2			v2	ens6f0v3
n_gids_found=26
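
For readers unfamiliar with the `show_gids` output: the rows that carry an IPv4 value are GIDs in IPv4-mapped IPv6 form (`::ffff:a.b.c.d`), so the address can be read straight from the last two groups. A minimal Python sketch of that mapping follows; the helper name `gid_to_ipv4` is made up for illustration and is not part of any Mellanox tool.

```python
import ipaddress


def gid_to_ipv4(gid: str):
    """Return the IPv4 address embedded in an IPv4-mapped GID, or None."""
    addr = ipaddress.IPv6Address(gid)
    # ipv4_mapped is None unless the address has the ::ffff:a.b.c.d form
    return addr.ipv4_mapped


# GID index 2 on mlx5_0 from the table above
print(gid_to_ipv4("0000:0000:0000:0000:0000:ffff:ac51:000a"))  # 172.81.0.10
# Link-local GID (index 0 on mlx5_0) carries no IPv4 address
print(gid_to_ipv4("fe80:0000:0000:0000:063f:72ff:fed0:cee6"))  # None
```

Here `ac51:000a` decodes byte by byte to 172.81.0.10, which matches the IPv4 column for `ens6f0np0`.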


# uname -a
Linux 10-20-1-10 5.15.0-86-generic #96-Ubuntu SMP Wed Sep 20 08:23:49 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

# cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04 (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy


Without libvma, sockperf runs successfully between the two hosts every time.

On the server host 172.81.0.20:

# sockperf sr --tcp -i 172.81.0.20 -p 15000
sockperf: == version #3.7-no.git ==
sockperf: [SERVER] listen on:
[ 0] IP = 172.81.0.20     PORT = 15000 # TCP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: [tid 2797496] using recvfrom() to block on socket(s)

On the client host 172.81.0.10:

# sockperf pp --tcp -i 172.81.0.20 -p 15000 -t 1
sockperf: == version #3.10-no.git ==
sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)

[ 0] IP = 172.81.0.20     PORT = 15000 # TCP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=1.000 sec; Warm up time=400 msec; SentMessages=40500; ReceivedMessages=40499
sockperf: ========= Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=0.550 sec; SentMessages=23199; ReceivedMessages=23199
sockperf: ====> avg-latency=11.813 (std-dev=1.441, mean-ad=0.623, median-ad=0.487, siqr=0.337, cv=0.122, std-error=0.009, 99.0% ci=[11.789, 11.837])
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 11.813 usec
sockperf: Total 23199 observations; each percentile contains 231.99 observations
sockperf: ---> <MAX> observation =  107.870
sockperf: ---> percentile 99.999 =  107.870
sockperf: ---> percentile 99.990 =   60.648
sockperf: ---> percentile 99.900 =   21.000
sockperf: ---> percentile 99.000 =   17.315
sockperf: ---> percentile 90.000 =   12.599
sockperf: ---> percentile 75.000 =   11.947
sockperf: ---> percentile 50.000 =   11.564
sockperf: ---> percentile 25.000 =   11.272
sockperf: ---> <MIN> observation =   10.458

With libvma, however, the same test sometimes fails.

On the server host 172.81.0.20:

# LD_PRELOAD=libvma.so sockperf sr --tcp -i 172.81.0.20 -p 15000
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.8.31-0 Development Snapshot built on Oct 10 2023 11:31:55 -*- DEBUG -*-
 VMA INFO: Cmd Line: sockperf sr --tcp -i 172.81.0.20 -p 15000
 VMA INFO: Current Time: Tue Oct 10 12:05:15 2023
 VMA INFO: Pid: 2813781
 VMA INFO: OFED Version: MLNX_OFED_LINUX-23.07-0.5.1.2:
 VMA INFO: Architecture: x86_64
 VMA INFO: Node: 10-20-1-20
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level                      INFO                       [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.81.0.110    netmask: 255.255.255.255 dev: veth9878877221a                      table :500        scope 253 type  1 index 43 scope 253 type  1 index 43
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.81.0.119    netmask: 255.255.255.255 dev: vethd861c2e0cf5                      table :500        scope 253 type  1 index 34 scope 253 type  1 index 34
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.82.0.104    netmask: 255.255.255.255 dev: vethf917a4f52ae                      table :500        scope 253 type  1 index 42 scope 253 type  1 index 42
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.82.0.114    netmask: 255.255.255.255 dev: cali21b37a164ee                      table :500        scope 253 type  1 index 40 scope 253 type  1 index 40
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.90.0.108    netmask: 255.255.255.255 dev: veth9810b9fa995                      table :500        scope 253 type  1 index 44 scope 253 type  1 index 44
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.240.0    netmask: 255.255.255.192 dev:                            table :main       scope   0 type  6 index  0 scope   0 type  6 index  0
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.240.4    netmask: 255.255.255.255 dev: calic5e25250998                      table :main       scope 253 type  1 index 41 scope 253 type  1 index 41
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.240.37   netmask: 255.255.255.255 dev: cali21b37a164ee                      table :main       scope 253 type  1 index 40 scope 253 type  1 index 40
sockperf: == version #3.7-no.git ==
sockperf: [SERVER] listen on:
[ 0] IP = 172.81.0.20     PORT = 15000 # TCP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: [tid 2813781] using recvfrom() to block on socket(s)

On the client host 172.81.0.10:

# LD_PRELOAD=libvma.so sockperf pp --tcp -i 172.81.0.20 -p 15000 -t 1
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.8.31-1 Release built on Jul 10 2023 11:42:20
 VMA INFO: Cmd Line: sockperf pp --tcp -i 172.81.0.20 -p 15000 -t 1
 VMA INFO: OFED Version: MLNX_OFED_LINUX-23.07-0.5.1.2:
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level                      INFO                       [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.81.0.100    netmask: 255.255.255.255 dev: veth29f76130861                      table :500        scope 253 type  1 index 20 scope 253 type  1 index 20
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.81.0.120    netmask: 255.255.255.255 dev: veth35695ee5c1e                      table :500        scope 253 type  1 index 17 scope 253 type  1 index 17
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.82.0.111    netmask: 255.255.255.255 dev: veth47f77b93392                      table :500        scope 253 type  1 index 22 scope 253 type  1 index 22
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.82.0.119    netmask: 255.255.255.255 dev: calic0d6e116972                      table :500        scope 253 type  1 index 18 scope 253 type  1 index 18
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.90.0.100    netmask: 255.255.255.255 dev: veth4dd7b95a373                      table :500        scope 253 type  1 index 21 scope 253 type  1 index 21
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.128   netmask: 255.255.255.192 dev:                            table :main       scope   0 type  6 index  0 scope   0 type  6 index  0
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.131   netmask: 255.255.255.255 dev: cali00a8163a6e5                      table :main       scope 253 type  1 index 32 scope 253 type  1 index 32
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.146   netmask: 255.255.255.255 dev: cali7cef15e86e8                      table :main       scope 253 type  1 index 33 scope 253 type  1 index 33
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.164   netmask: 255.255.255.255 dev: calic0d6e116972                      table :main       scope 253 type  1 index 18 scope 253 type  1 index 18
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.182   netmask: 255.255.255.255 dev: cali2f01bce650e                      table :main       scope 253 type  1 index 31 scope 253 type  1 index 31
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.183   netmask: 255.255.255.255 dev: calid7cf868faf9                      table :main       scope 253 type  1 index 19 scope 253 type  1 index 19
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.184   netmask: 255.255.255.255 dev: cali92710158f24                      table :main       scope 253 type  1 index 35 scope 253 type  1 index 35
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.185   netmask: 255.255.255.255 dev: calie84e04abcc4                      table :main       scope 253 type  1 index 34 scope 253 type  1 index 34
sockperf: == version #3.10-no.git ==
sockperf: ERROR: Can`t connect socket (errno=111 Connection refused)

Other times, the same command succeeds with libvma:

# LD_PRELOAD=libvma.so sockperf pp --tcp -i 172.81.0.20 -p 15000 -t 1
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: VMA_VERSION: 9.8.31-1 Release built on Jul 10 2023 11:42:20
 VMA INFO: Cmd Line: sockperf pp --tcp -i 172.81.0.20 -p 15000 -t 1
 VMA INFO: OFED Version: MLNX_OFED_LINUX-23.07-0.5.1.2:
 VMA INFO: ---------------------------------------------------------------------------
 VMA INFO: Log Level                      INFO                       [VMA_TRACELEVEL]
 VMA INFO: ---------------------------------------------------------------------------
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.81.0.100    netmask: 255.255.255.255 dev: veth29f76130861                      table :500        scope 253 type  1 index 20 scope 253 type  1 index 20
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.81.0.120    netmask: 255.255.255.255 dev: veth35695ee5c1e                      table :500        scope 253 type  1 index 17 scope 253 type  1 index 17
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.82.0.111    netmask: 255.255.255.255 dev: veth47f77b93392                      table :500        scope 253 type  1 index 22 scope 253 type  1 index 22
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.82.0.119    netmask: 255.255.255.255 dev: calic0d6e116972                      table :500        scope 253 type  1 index 18 scope 253 type  1 index 18
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.90.0.100    netmask: 255.255.255.255 dev: veth4dd7b95a373                      table :500        scope 253 type  1 index 21 scope 253 type  1 index 21
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.128   netmask: 255.255.255.192 dev:                            table :main       scope   0 type  6 index  0 scope   0 type  6 index  0
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.131   netmask: 255.255.255.255 dev: cali00a8163a6e5                      table :main       scope 253 type  1 index 32 scope 253 type  1 index 32
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.146   netmask: 255.255.255.255 dev: cali7cef15e86e8                      table :main       scope 253 type  1 index 33 scope 253 type  1 index 33
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.164   netmask: 255.255.255.255 dev: calic0d6e116972                      table :main       scope 253 type  1 index 18 scope 253 type  1 index 18
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.182   netmask: 255.255.255.255 dev: cali2f01bce650e                      table :main       scope 253 type  1 index 31 scope 253 type  1 index 31
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.183   netmask: 255.255.255.255 dev: calid7cf868faf9                      table :main       scope 253 type  1 index 19 scope 253 type  1 index 19
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.184   netmask: 255.255.255.255 dev: cali92710158f24                      table :main       scope 253 type  1 index 35 scope 253 type  1 index 35
 VMA WARNING: rtm:170:rt_mgr_update_source_ip() could not figure out source ip for rtv = dst: 172.21.32.185   netmask: 255.255.255.255 dev: calie84e04abcc4                      table :main       scope 253 type  1 index 34 scope 253 type  1 index 34
sockperf: == version #3.10-no.git ==
sockperf[CLIENT] send on:sockperf: using recvfrom() to block on socket(s)

[ 0] IP = 172.81.0.20     PORT = 15000 # TCP
sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=1.000 sec; Warm up time=400 msec; SentMessages=103730; ReceivedMessages=103729
sockperf: ========= Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=0.550 sec; SentMessages=57262; ReceivedMessages=57262
sockperf: ====> avg-latency=4.778 (std-dev=1.034, mean-ad=0.218, median-ad=0.076, siqr=0.051, cv=0.216, std-error=0.004, 99.0% ci=[4.767, 4.789])
sockperf: # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
sockperf: Summary: Latency is 4.778 usec
sockperf: Total 57262 observations; each percentile contains 572.62 observations
sockperf: ---> <MAX> observation =  132.789
sockperf: ---> percentile 99.999 =  102.016
sockperf: ---> percentile 99.990 =   13.302
sockperf: ---> percentile 99.900 =   12.760
sockperf: ---> percentile 99.000 =    9.170
sockperf: ---> percentile 90.000 =    4.808
sockperf: ---> percentile 75.000 =    4.720
sockperf: ---> percentile 50.000 =    4.673
sockperf: ---> percentile 25.000 =    4.616
sockperf: ---> <MIN> observation =    4.278
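
To make the two successful runs easier to compare, here is a small Python sketch that recomputes the summary statistics from the reported average, standard deviation, and observation count. It uses the usual textbook formulas (cv = std-dev / mean, std-error = std-dev / sqrt(n), 99% CI = mean ± 2.576 × std-error). That sockperf computes them exactly this way is an assumption, although the numbers agree with both outputs above. The function name `summarize` is illustrative only.

```python
import math


def summarize(avg_usec, std_dev, n, z=2.576):
    """Recompute cv, standard error and a z-based confidence interval."""
    std_error = std_dev / math.sqrt(n)
    half_width = z * std_error  # z=2.576 corresponds to a 99% interval
    return {
        "cv": round(std_dev / avg_usec, 3),
        "std_error": round(std_error, 3),
        "ci": (round(avg_usec - half_width, 3), round(avg_usec + half_width, 3)),
    }


# Values copied from the "avg-latency" and "[Valid Duration]" lines above
kernel = summarize(11.813, 1.441, 23199)  # run without libvma
vma = summarize(4.778, 1.034, 57262)      # successful run with libvma

print(kernel)
print(vma)
# Ratio of average latencies (kernel stack vs. libvma)
print(round(11.813 / 4.778, 2))
```

Under these assumptions the average latency with libvma is roughly 2.5 times lower than without it, so the problem is limited to the intermittent connection failure, not to performance when the connection succeeds.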

A detailed log of the failing run on the client host is attached:
fail-client-log.txt

In brief: with libvma, the same command sometimes succeeds and sometimes fails with errno=111 (Connection refused).
Without libvma, it succeeds every time.
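
To quantify how often the failure happens, something like the following Python sketch could run the client command repeatedly and count the failing runs. It assumes that sockperf exits with a non-zero status when the connect fails (not verified here), and the names `count_failures`, `SOCKPERF_CMD`, and `VMA_ENV` are made up for illustration.

```python
import os
import subprocess

# Client command from this report; LD_PRELOAD is set through the environment
SOCKPERF_CMD = ["sockperf", "pp", "--tcp", "-i", "172.81.0.20", "-p", "15000", "-t", "1"]
VMA_ENV = dict(os.environ, LD_PRELOAD="libvma.so")


def count_failures(cmd, runs, env=None):
    """Run cmd `runs` times and return how many runs exited non-zero."""
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            env=env,  # None means: inherit the current environment
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures
```

With the server already running, `count_failures(SOCKPERF_CMD, 20, env=VMA_ENV)` would give the number of failed runs out of 20 with libvma, and `count_failures(SOCKPERF_CMD, 20)` the same without it.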
